## Abstract

We discuss and test the potential usefulness of single-column models (SCMs) for the testing of stochastic physics schemes that have been proposed for use in general circulation models (GCMs). We argue that although single-column tests cannot be definitive in exposing the full behaviour of a stochastic method in the full GCM, and although there are differences between SCM testing of deterministic and stochastic methods, SCM testing remains a useful tool. It is necessary to consider an ensemble of SCM runs produced by the stochastic method. These can be usefully compared with deterministic ensembles describing initial condition uncertainty and also with combinations of these (with structural model changes) into poor man's ensembles. The proposed methodology is demonstrated using an SCM experiment recently developed by the GCSS (GEWEX Cloud System Study) community, simulating transitions between active and suppressed periods of tropical convection.

## 1. Introduction

In recent years, increasing attention has been given to the potential usefulness (Palmer 2001; Wilks 2005) of introducing some stochastic component(s) to the physical parametrizations used in general circulation models (GCMs). For example, many GCMs are known to have insufficient high-frequency, small-scale variability of convective heating rates and precipitation in the tropics, which may damage their ability to represent low-frequency, large-scale climate variability (Ricciardulli & Garcia 2000; Horinouchi *et al*. 2003). A wide variety of plausible stochastic methods continue to be suggested and actively investigated, including perturbing the inputs to a parametrization (e.g. Tompkins & Berner submitted), perturbing the parameters used within it (e.g. Byun & Hong 2007), perturbing its outputs (e.g. Teixeira & Reynolds accepted) and even constructing new parametrizations designed explicitly to be stochastic from the outset (e.g. Plant & Craig 2008). There is a growing acceptance that the use of stochastic elements in GCMs may be desirable for both theoretical and practical reasons (e.g. Penland 2003; Williams 2006). Thus, the time may soon be approaching when the key question changes from *why a stochastic method* to *which stochastic method*. Here, we explore whether single-column modelling might be able to provide some insights that could inform such decision making.

The aim of a stochastic scheme is to introduce variability into the numerical representation of the climate system. In order to determine the variability of some climate phenomenon in the GCM, either multiple or long integrations are likely to be required. Further lengthy explorations would also be required if one wished to assess the impact of that variability on other aspects of the model climate. How then, in practice, should one choose the stochastic method(s) to be used in a GCM? The difficulty is not simply the range of possible schemes available in the literature, but also (at least) two other important considerations.

First, we do not know how well various methods might combine. The motivations behind various schemes, and the uncertainties they attempt to address, may often appear to be very different. At first sight, then, it may be attractive to use several methods. However, there may be a danger in this of some ‘double counting’, particularly if attempting to combine some of the more generic methods to address parametrization uncertainty. For instance, taking a single parametrization and perturbing its inputs, parameters and outputs simultaneously might not be totally unreasonable, but it would be extremely naive to expect good performance by implementing three such methods directly ‘off the shelf’.

Second, one's difficulties are compounded by the fact that many (if not all) of the stochastic schemes in the literature themselves contain free parameters and structural uncertainties. We can use the random parameters approach to offer a simple example. Suppose that one wished to choose random values for the entrainment rate and the CAPE closure time scale in the parametrization of deep convection. Should those choices be correlated, and if so, then how?

It would surely be impractical to conduct full GCM testing of all plausible stochastic physics schemes and all possible variations on their basic themes. However, it should be possible to do better than testing some best guesses. As a first step, we describe in this paper essentially a test of concept for the idea that single-column model (SCM) experiments might have some useful value for comparing stochastic schemes. We are not at this stage attempting an assessment of the relative performance of various stochastic schemes. Rather, our objective is to demonstrate that simple methods to improve one's understanding of the behaviour of various stochastic schemes are both possible and worth pursuing. It does not seem unreasonable to hope that current best guesses could evolve into educated guesses.

The paper is organized as follows. In §2, we introduce some issues in single-column modelling and their implications for testing stochastic methods. Section 3 describes the modelling framework used in this study, including the stochastic methods (§3*a*) and the ensemble approach (§3*b*). Results are shown for the sensitivity to initial condition (IC) perturbations (§4*a*) and for the mean states (§4*b*) and variabilities (§4*c*,*d*) of various stochastic methods. Conclusions are drawn in §5.

## 2. An SCM approach for stochastic schemes

Single-column modelling has a long history as a useful guide towards understanding and testing the behaviour of deterministic parametrizations within a GCM. In the full GCM, a parametrization interacts with model dynamics and with the other parametrizations. Essentially, the SCM is a means to understand the latter, which may or may not dominate in the full GCM. Usually, the dynamical forcing of the SCM is determined beforehand, perhaps based on an observational campaign. The forcing is independent of the current model state, which constrains the possible responses. Thus the SCM may behave differently from the corresponding full GCM if dynamical feedback is an important aspect of the situation modelled. One consequence is that parametrization errors in a GCM which adversely affect model dynamics may not be apparent in the SCM, which is kept on track by the prescribed dynamics.

It is not immediately apparent how an SCM might be used to make meaningful comparisons of stochastic methods. In some cases, the use of a single column may simply not be viable because the stochastic terms cannot easily be applied (e.g. Shutts 2005). Indeed, it has been suggested that an ideal stochastic method would probably be non-local (Palmer 2001; Craig *et al*. 2005; Ghil *et al*. 2005). For the present though, the majority of stochastic methods can be formulated for a single column. Nonetheless, an obvious objection to SCM comparisons remains: feedback from the introduced variability to the dynamics may be a key feature of the behaviour in a full GCM (e.g. Lin & Neelin 2002). This will be missing in a traditional SCM experiment with specified dynamics. SCMs can include appropriate dynamical feedbacks by using a parametrized dynamics formulation, such as a weak temperature gradient approximation for the tropical atmosphere (e.g. Sobel *et al*. 2007), or by coupling vertical advection to the parametrized diabatic heating via a gravity wave model (Bergman & Sardeshmukh 2004). Ultimately, we believe that these and similar frameworks would be particularly well suited to studying stochastic physics schemes, but do not pursue them further here.

It may nonetheless be possible to gain some useful insights into the behaviour of stochastic methods through an SCM comparison. The results for each method must be considered in the form of an ensemble of SCM runs, each run having a different set of random numbers. Our proposal is to compare such ensembles with the SCM results obtained from multiple deterministic parametrizations, the suite of deterministic configurations being treated as a so-called poor man's ensemble (e.g. Mylne *et al*. 2002).

## 3. Experimental set-up

Experiments have been carried out using the single-column form of the UK Met Office Unified Model (UM, Cullen 1993). The model runs are based on GCSS PCCS case 5, the design of which is described by Petch *et al*. (2007). Specifically, we study here the consecutive time periods B and C. Model intercomparison cases have been a major part of the Global Energy and Water cycle EXperiment (GEWEX) Cloud System Study (GCSS), which aims to support the development of physically based parametrizations for cloud processes. An overview of the precipitating convective cloud systems (PCCS) working group can be found in Moncrieff *et al*. (1997).

Case 5 simulates a column of the atmosphere in the tropical West Pacific warm pool region, at 2° S 156° E, and the model runs presented here span the period 9–28 January 1993. The forcing dataset is derived from observations taken in the TOGA-COARE campaign (Webster & Lukas 1992). It contains temperature and moisture increments due to turbulent fluxes from the ocean surface and to large-scale vertical and horizontal advection. Also prescribed are time series of observed winds, towards which the SCM is strongly relaxed, with a time scale of 1 hour. Any changes to the winds in these runs are therefore limited. The forcings and ICs are derived directly from surface and radiosonde measurements averaged over the TOGA-COARE IFA (Intensive Flux Array, see fig. 14 of Webster & Lukas 1992).

The focus of case 5 is the transition of tropical convection from suppressed to active phases, and two such transitions occur during these SCM runs. See figure 1, in which the periods are defined as in Petch *et al*. (2007). Here the periods are labelled ActB (a very active period with heavy rain), SupC (in which convection is suppressed by the large-scale forcing) and the following active phase, ActC. Rain rates are effectively constrained by the large-scale forcing and are similar to those found in other SCMs (Woolnough *et al*. submitted).

The SCM runs use a time step of 30 min and there are 38 levels in the vertical. The performance of the default UM SCM for this case in comparison with other models is discussed by Petch *et al*. (2007) and Woolnough *et al*. (submitted). It is more consistent with the CRM simulations than some of the other SCMs, which were somewhat dry.

### (a) Model variants

Taking the UM SCM as a basis, several model configurations have been tested. These differ through either the convection parametrization or the stochastic method used, and are described below. Most of the stochastic methods have been implemented by introducing a stochastic element to the pre-existing UM parametrizations. We quantify the variability associated with a stochastic method using the spread of an ensemble, with a different set of random numbers drawn for the stochastic component of each ensemble member. Small IC perturbations are also included in the ensembles; these are discussed in §3*b*.

*Default UM*. The default UM configuration contains parametrizations for layer cloud microphysics, radiation, boundary-layer processes and convection. Martin *et al*. (2006) provide an overview of the current set of schemes. Convection is represented by a deterministic bulk mass flux scheme based on Gregory & Rowntree (1990), but which has since been modified (Martin *et al*. 2006). There are prognostic moisture variables for specific humidity, cloud liquid water content and cloud ice water content.

*Kain–Fritsch convection scheme*. An alternative deterministic mass flux scheme for convection is that of Kain & Fritsch (1990, KF). The version described by Kain (2004) has been implemented here. For a discussion of the differences between the schemes of Gregory & Rowntree (1990) and KF in the UM in a forecasting context, see Done (2002).

*Multiplicative noise scheme*. This scheme follows the method of Buizza *et al*. (1999) and is designed to represent model uncertainty. At each time step, the total parametrized tendencies for each model variable are multiplied by a random number *ϵ*_{1} chosen from a uniform distribution between 1−*k* and 1+*k*. The random number is the same for each model variable and at each vertical level. Temporal correlation is enforced by keeping the same random number for multiple time steps. Buizza *et al*. (1999) found that the greatest improvements to the performance of the ECMWF ensemble prediction system occurred for *k*=0.5 and a new random number every 6 hours. The same choices are made here. Total tendencies from the default UM are multiplied by *ϵ*_{1} at the end of each time step, with a check to restore moisture to zero if the stochastic perturbation implies a negative value.

*Random parameters scheme*. GCM parametrizations include parameters for which the appropriate value is not well determined. The random parameters scheme attempts to account for parametrization uncertainty by allowing such parameters to vary within a plausible range. This scheme follows the system used (Arribas 2004) in the Met Office Global and Regional Ensemble Prediction System (MOGREPS, Mylne *et al*. 2005). The relevant parameters and ranges can be found in Arribas (2004), but include the entrainment rate and CAPE closure time scale from the UM convection scheme. Temporal correlations are described by a first-order auto-regression model,

$$P_{n+1}=\mu_{P}+r\,(P_{n}-\mu_{P})+k_{P}\,\epsilon_{2},\qquad(3.1)$$

in which the parameter is labelled *P* and the update number *n*. *μ*_{P} is the default value of *P*, *r* is an auto-correlation coefficient and *k*_{P}*ϵ*_{2} is a stochastic shock term (see §3.2 of Mylne *et al*. 2005) in which *ϵ*_{2} is a random number uniformly distributed between −1 and 1, and *k*_{P} is a parameter-dependent normalization. Each parameter is subject to maximum and minimum acceptable bounds, and the same random number *ϵ*_{2} is used for all parameters at each update, every 3 hours.

*Random but constant parameters*. Another approach to sampling parameter uncertainty is to run an ensemble in which each run has a fixed, but different, parameter set. This approach has been used to make probabilistic predictions of future climate (e.g. Murphy *et al*. 2004). For this study, we simply adapt the random parameters scheme above by choosing initial parameter values randomly within the acceptable range and holding these fixed. Our method does not explore parameter space in an unbiased way, as it is constrained by the correlations between parameters assumed in the random parameters scheme above. Nonetheless, it allows for an interesting test of the temporal correlations in that scheme.

*Plant & Craig stochastic convection scheme*. In the Plant & Craig (2008) parametrization, a finite number of distinct plumes are present in a grid box area at any instant, resulting in a random sampling of the full spectrum for an ensemble of cumulus clouds. The spectrum used is based on an equilibrium exponential distribution of cloud base mass flux (Craig & Cohen 2006) and plumes are produced at random, with the properties of each based on an adaptation of the KF plume model. The smaller the grid box size, the more limited the sampling and the larger the fluctuations from statistical equilibrium. A sounding averaged over nearby grid points and recent time steps provides a smoothed input for the CAPE closure calculation. Of course, spatial averaging is not possible in an SCM. Preliminary tests showed that the scheme behaved sensibly in the SCM when averaging over 20 time steps. This choice represents a compromise between providing a smooth input profile and the need to capture variations in the dynamical forcings.

*Deterministic limit of the Plant & Craig scheme*. The Plant & Craig (2008) scheme can operate as a spectral convective parametrization by running the plume model for every category of cloud and weighting the tendencies according to the probability of that cloud occurring. This corresponds to the deterministic limit of a very large grid box in which the cumulus ensemble is well sampled.
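The two generic perturbation recipes above can be sketched in a few lines. The following is a minimal illustration, not the UM implementation: the function names, the example parameter bounds and the clipping to those bounds are assumptions made for the sketch, while the distributions and the update rule follow Buizza *et al*. (1999) and eq. (3.1).

```python
import numpy as np

rng = np.random.default_rng(0)

def multiplicative_noise(tendencies, k=0.5):
    """Scale total parametrized tendencies by one random factor
    eps1 ~ U(1-k, 1+k), shared across all variables and levels.
    In the scheme described above, eps1 is held fixed for 6 hours
    (12 time steps of 30 min) before being redrawn."""
    eps1 = rng.uniform(1.0 - k, 1.0 + k)
    return {name: eps1 * t for name, t in tendencies.items()}

def update_parameter(P, mu_P, r=0.95, k_P=0.1, lo=None, hi=None):
    """One AR(1) update of a random parameter, eq. (3.1):
    P_{n+1} = mu_P + r (P_n - mu_P) + k_P eps2, eps2 ~ U(-1, 1),
    then clipped to the acceptable bounds [lo, hi] (assumed here)."""
    eps2 = rng.uniform(-1.0, 1.0)
    P_new = mu_P + r * (P - mu_P) + k_P * eps2
    if lo is not None:
        P_new = max(P_new, lo)
    if hi is not None:
        P_new = min(P_new, hi)
    return P_new
```

In a multi-parameter version of `update_parameter`, the same draw of *ϵ*_{2} would be reused for every parameter at a given update, as in the scheme described above.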

### (b) IC ensembles and ensemble size

It should be noted that a stochastic scheme is not required in order for a parametrization embedded in an SCM to exhibit variability: even if the prescribed forcings remain constant, a purely deterministic SCM will vary from one time step to the next. This is particularly associated with switches in parametrizations, the most important of which is the trigger function in the convection scheme. This can often exhibit exaggerated on–off behaviour (e.g. Willett & Milton 2006), and the exact set of time steps on which the convection scheme triggers can be very sensitive to small changes in the model state (as was found in the deterministic SCM ensembles used in this study; not shown). Such unsteady behaviour inherent to convective and other parametrizations provides a source of variability in deterministic and stochastic SCMs alike.

Part of the variability in a stochastic physics SCM ensemble may arise simply because the stochastic (ST) perturbations force each ensemble member to follow a different realization, with the convection being triggered on a different set of time steps. Such realizations can also be explored in a deterministic model by running an ensemble with IC perturbations. We suggest that such IC ensembles should be run in order to make meaningful comparisons of stochastic schemes with their deterministic counterparts. Hack & Pedretti (2000) suggest that an ensemble approach is appropriate for SCM studies as an SCM can be sensitive to small differences in the ICs.

The perturbations for an IC SCM ensemble should be small enough not to introduce significant bias to any of the ensemble members, but large enough to force the ensemble members to diverge into an unbiased sample of probable realizations early in the model run. It should be emphasized that the perturbations are only used in this study to provide a sample of realizations and are not necessarily intended to represent realistic IC uncertainty. Thus, there is no requirement for the perturbations to match instrumental and sampling errors in the observations that provide the ICs. The results of such ensemble tests are discussed in §4*a*.

Another aspect to consider is the ensemble size required. We use 39-member ensembles, which appears to be sufficient to produce usable results. The robustness of results derived from ensembles of this size can be estimated from a brief statistical consideration.

Assuming some model variable to be approximately normally distributed, its ensemble mean has a standard error of *σ*/√*K*, where *σ* is the standard deviation and *K* is the number of ensemble members. For example, in the IC ensemble for the default UM, the ensemble standard deviation for temperature is of the order 0.5 K (figure 5). This gives an error of approximately 0.08 K, and a 95% CI of approximately ±0.16 K, which is less than 10% of the amplitude of typical temperature variations during the model runs (figure 2).

It can be shown that a reasonable approximation to the sampling distribution of the ensemble standard deviation is a normal distribution with standard deviation *σ*/√(2*K*), for ensemble sizes *K*≳25. We have checked explicitly that the normal distribution holds, and in our case it leads to a 95% CI of approximately ±22% of *σ*. An interval this broad suggests that ensemble spreads calculated for single variables should be interpreted with care.
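These sampling errors are simple to reproduce. The short sketch below, with the ensemble size and temperature spread taken from the text, recovers the quoted ≈0.08 K standard error of the mean, the ≈±0.16 K confidence interval, and the ≈22% interval for the ensemble spread itself.

```python
import numpy as np

K = 39        # ensemble size
sigma = 0.5   # ensemble standard deviation of temperature (K)

# Standard error of the ensemble mean and its ~95% confidence half-width
se_mean = sigma / np.sqrt(K)        # approx. 0.08 K
ci_mean = 1.96 * se_mean            # approx. 0.16 K

# Sampling error of the ensemble standard deviation itself,
# using the large-K normal approximation sigma / sqrt(2K)
se_sigma = sigma / np.sqrt(2 * K)
ci_sigma_frac = 1.96 * se_sigma / sigma   # approx. 0.22, i.e. ~22% of sigma
```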

More accurate ensemble spreads occur for an error norm which sums the ensemble spread over the model column. We present below results for the total column root mean square ensemble spread (TCES), given by

$$\mathrm{TCES}=\left[\sum_{l}\frac{\Delta p_{l}}{g}\,\frac{1}{K}\sum_{k=1}^{K}\left(F_{k,l}-\overline{F}_{l}\right)^{2}\right]^{1/2},\qquad(3.2)$$

where *F* is a model variable; *F̄* its ensemble mean; Δ*p*_{l} the pressure thickness of model level *l*; and *k* labels ensemble members. Assuming hydrostatic balance, the TCES is the square root of the mass-weighted vertical integral of estimated model field variance. We have performed some simple tests to estimate a sampling error for the TCES for temperature of approximately 10%, roughly half that for a single level.
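The TCES diagnostic can be sketched directly from its verbal definition: a mass-weighted vertical integral of the ensemble variance, square-rooted. The array layout and exact normalization below are assumptions for illustration rather than the precise UM diagnostic.

```python
import numpy as np

def tces(F, dp, g=9.81):
    """Total column RMS ensemble spread.

    F  : array of shape (K, L), ensemble members x model levels
    dp : array of shape (L,), pressure thickness of each layer (Pa)

    Returns the square root of the mass-weighted (dp/g) vertical
    sum of the ensemble variance at each level.
    """
    var = F.var(axis=0, ddof=0)          # ensemble variance per level
    return np.sqrt(np.sum((dp / g) * var))
```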

## 4. Results

### (a) Ensemble sensitivity to IC perturbations

We have constructed IC ensembles from two sets of IC perturbations. In set 1, random temperature perturbations are applied to the lowest model level, chosen from a uniform distribution between ±0.25 K. Temperature perturbations for set 2 are larger and cover a greater vertical extent. A uniform distribution is again used, with amplitude 0.5 K at the surface, decreasing exponentially above with a height scale of 1 km. For these larger perturbations, it is desirable to ensure that no spurious super-saturation occurs, and so corresponding perturbations are applied to the specific humidity field in order to maintain the relative humidity.
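The two perturbation sets can be sketched as follows. This is an illustration only: the function names and the Tetens-type saturation vapour pressure formula are assumptions made for the sketch, while the amplitudes, distributions and the relative-humidity-preserving humidity adjustment follow the description above.

```python
import numpy as np

rng = np.random.default_rng(1)

def set1_perturbation(nlev):
    """Set 1: one uniform temperature perturbation in [-0.25, 0.25] K,
    applied to the lowest model level only."""
    dT = np.zeros(nlev)
    dT[0] = rng.uniform(-0.25, 0.25)
    return dT

def set2_perturbation(z, amp=0.5, h=1000.0):
    """Set 2: uniform perturbations with amplitude 0.5 K at the surface,
    decaying exponentially with a 1 km height scale (z in metres)."""
    a = amp * np.exp(-z / h)
    return rng.uniform(-1.0, 1.0, size=z.shape) * a

def adjust_q(q, T, dT):
    """Scale specific humidity so relative humidity is preserved when
    T is perturbed by dT; uses an illustrative Tetens-type formula."""
    def esat(TK):
        Tc = TK - 273.15
        return 610.78 * np.exp(17.27 * Tc / (Tc + 237.3))
    return q * esat(T + dT) / esat(T)
```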

Figure 2 shows spaghetti plots of temperature for a single model level in the lower troposphere. Set 1 IC perturbations have been added to the default UM. Figure 3 is equivalent, but for the Kain–Fritsch convection scheme. For the default UM, the perturbations appear to produce a good spread of realizations, but using the KF scheme the ensemble members fail to diverge. Even at the end of the 19-day runs, they remain clustered in six distinct groups, the members of each group triggering convection on the same set of time steps (not shown). Figure 4 shows the temperature plume for the Kain–Fritsch scheme using the set 2 IC perturbations. It is clear that the ensemble members are less tightly clustered than with the set 1 perturbations, producing a more representative sample of realizations. But the spread is still smaller than in the default UM using set 1. It is perhaps slightly surprising that the SCM responds so differently to IC perturbations when different convection schemes are used. This is in contrast to Hume & Jakob (2005).

An important point to note from the simulations of Hack & Pedretti (2000) is their observation of bifurcations in SCM solutions (e.g. their fig. 4), with ensemble members dividing into two or more preferred modes. Clearly, this raises issues with the representativeness of statistics such as the ensemble mean, since the mean state may lie between modes and never actually occur. Little evidence for multi-modal behaviour was found in the present study. Clearly separated modes do occasionally occur, as seen for example, using the Kain–Fritsch convection scheme around the 19th (figure 4). However, these persist for no longer than a day or so. The presence or absence of bifurcations is presumably related to the character of either (or both) the SCM or the large-scale forcing. We do not speculate further here, but rather note that the ensemble mean and standard deviation appear to be genuinely useful diagnostics for the present study.

Figure 5 shows time series of the TCES of temperature, for the default UM ensemble with set 1 and set 2 IC perturbations, and also for an ensemble that includes the multiplicative noise scheme, both with and without set 2 IC perturbations added. The corresponding plots for relative humidity are shown in figure 6.

Looking first at the two default UM TCES ensembles, there is more spread over the first 6 days using the larger set 2 IC perturbations. However, the set 1 and set 2 ensembles look very similar beyond 6 days, suggesting that the ensemble spread has saturated in both. This is reassuring as it suggests that the saturated level of ensemble spread in temperature is independent of the size and nature of the IC perturbations, and instead provides a measure of the inherent variability of the SCM. For the Kain–Fritsch scheme, the larger IC perturbations produce larger ensemble spreads throughout (figures 3 and 4), but this is because the spread did not saturate when the set 1 IC perturbations were used.

In a stochastic physics (ST) SCM ensemble, the stochastic method provides some physically motivated source of variability. One might anticipate that the physics perturbations would allow the ST ensemble to explore at least those realizations accessible to its deterministic analogue. If this is true, then IC perturbations should have little effect on ensemble spread when implemented in an ST ensemble. It is clear from figures 5 and 6 that beyond the first 36 hours or so, the ensembles including multiplicative noise ST perturbations have spreads that are consistently larger than those occurring in the IC-only ensembles, typically by about a third. The inclusion of IC perturbations in addition to multiplicative noise slightly increases the spread during the first day, but has no significant effect thereafter. This is consistent with the idea that the IC perturbations allow one to sample different realizations, but do not affect the underlying distribution of probable realizations which emerges once the spread of the ensemble saturates. Similar conclusions apply for the other stochastic methods used (not shown).

Comparisons of the effects of IC and ST perturbations have been made before in the context of global GCM ensemble prediction systems. Buizza *et al*. (1999) found that IC-only ensembles produced consistently larger spread than ST-only ensembles, and that ensembles with both IC and ST perturbations produced greater spread still. Teixeira & Reynolds (accepted) found similar results over the tropics using a multiplicative noise scheme applied only to the moist convective tendencies (their fig. 7*a*). Although these results differ from ours in placing far greater emphasis on IC perturbations, this is not surprising given the context. In particular, we focus on the saturated level of ensemble spread due to IC perturbations, whereas in the cited studies the runs do not reach saturation. Also, those studies used much larger IC perturbations, designed to sample IC uncertainty.

It is interesting to note in figures 5 and 6 the time variability of the ensemble spreads. The spread clearly has some dependence on the large-scale forcing, with a peak followed by a sudden drop in spread occurring at the start of each convectively active phase. To study this in more detail, we show in figure 7 a time–height plot of the ensemble spread in temperature in the set 2 default UM ensemble. The spread appears to follow different characteristic regimes during suppressed and active phases, while behaving in a more unsteady manner during transition periods between the two. Note that in figure 7, the active and suppressed phases labelled in figure 1 have been redefined in order to separate out the transition periods. This will allow the ensemble variability characteristic of suppressed and active phases to be analysed in §4*c*.

Most notably during transitions from suppressed to active phases, the pattern of ensemble spread appears to be related to the convective cloud top height. For example, peaks in TCES on the 15th (figures 5 and 6) correspond to large spreads in the mid-troposphere where the ensemble produces a broad range of convective cloud tops. During the following day, this range suddenly narrows and the ensemble spread drops throughout the troposphere. Another interesting feature is the sloping layer of high ensemble spread around the 24th and 25th, which increases in height from roughly 7 to 9 km during the transition from SupC to ActC. This layer closely follows the 75th percentile of convective cloud top height, indicating an ascending lid on the convection. The ensemble spread is large here because the ensemble members produce a range of different heights for this lid, which has a sharp temperature gradient across it (not shown).

### (b) Intercomparison of ensemble mean states

An ensemble was produced for each of the SCM configurations described in §3*a*. In the case of the stochastic Plant & Craig scheme, two separate ensembles were produced for columns with horizontal scales Δ*x* of 50 and 100 km. (As explained in §3*a*, the stochastic fluctuations in that scheme depend on the column size.) To ensure a consistent comparison between stochastic physics SCM ensembles and their deterministic analogues, we included the set 2 IC perturbations in all ensembles, although beyond the first day they make very little difference to the stochastic ones.

We show results here for the precipitable water content (PWC), the mass-weighted integral of specific humidity through the column. Figure 8 shows time series of observation-derived PWC and ensemble means for three deterministic SCM configurations. The SCMs exhibit large systematic biases from the observed PWC: this is probably due to discrepancies in the large-scale forcings used to drive the SCMs, as found in other SCM and CRM studies which use advective forcings derived from observations (e.g., Krueger & Lazarus 1999). The Kain–Fritsch and Plant & Craig schemes (which both use the same plume model) produce a drier state than the default UM, although well within the range of values seen when comparing various SCMs (Woolnough *et al*. submitted). The drying is associated with tropospheric cooling in these schemes relative to the default UM (not shown).
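As defined above, the PWC is the mass-weighted column integral of specific humidity. Under hydrostatic balance this reduces to a simple sum over layers; the sketch below is illustrative (the layer layout is assumed), not the UM diagnostic itself.

```python
import numpy as np

def pwc(q, dp, g=9.81):
    """Precipitable water content (kg m^-2):
    PWC = (1/g) * sum_l q_l * dp_l,
    with specific humidity q in kg/kg and layer pressure
    thickness dp in Pa, both arrays over model levels."""
    return np.sum(q * dp) / g
```

Note that 1 kg m⁻² of column water is equivalent to 1 mm of precipitable water, which is how PWC time series such as figure 8 are usually labelled.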

Several of the SCM configurations in this study include ST perturbations but are based on the UM convection scheme. In figure 9, we show the difference in ensemble mean PWC between these configurations and their deterministic analogue, the default UM. Also shown is the difference between the Kain–Fritsch scheme and the default UM. Figure 10 shows similar plots for the Plant & Craig scheme. In terms of ensemble mean PWC, the difference between the two convection parametrizations (default UM and Kain–Fritsch) is several times the difference between any of the stochastic schemes and its deterministic analogue. Similar remarks apply to other variables and suggest that the ensemble mean fields are more sensitive to structural differences in the convection scheme than they are to the introduction of stochastic schemes.

However, it is interesting that stochastic physics schemes designed to represent model uncertainty or departures from statistical equilibrium can change the mean state of the SCM, even if only by a relatively small amount. Statistical tests indicate that the ensemble mean state of the time-varying random parameters ensemble is significantly cooler and drier than the default UM for much of the model run, especially during periods of suppressed convection. However, the constant random parameters scheme did not produce this deviation, despite sampling the same range of values for the model parameters (figure 9). This suggests noise-induced drift: the random noise introduced by the time variation of the model parameters causes the SCM to explore a region of phase space which is asymmetric about the mean state of the deterministic analogue. Note in figure 10 that the stochastic Plant & Craig scheme produces a similar drift relative to its deterministic analogue (most clearly seen around the 22nd), which is also found to be statistically significant during the suppressed phases. This drift is smaller when the larger column size is used.

### (c) Intercomparison of ensemble variability

Figure 11 shows time-mean vertical profiles of ensemble spread in temperature for the active and suppressed periods ActB, SupC and ActC, as labelled in figure 7. There are marked differences between active and suppressed phases. This is most apparent in the mid-troposphere where the spread tends to be higher during active phases, whereas in the lower troposphere most of the SCM configurations exhibit greater spread during the suppressed phase (the Plant & Craig scheme is an exception during ActC). These observations are consistent with the notion that convective variability is a key ingredient in producing spread.

There are distinct differences between the profiles in figure 11*a–c*, which are for configurations using the default UM convection scheme, and those in figure 11*d–f*, which are for configurations based on the Kain–Fritsch plume model. The latter grouping exhibits large peaks in ensemble spread in the upper troposphere and lower stratosphere regions, presumably associated with convective overshoots. Such peaks are absent for the first grouping, which tend to have greater spread in the mid-troposphere. The vertical structure of the ensemble spread profile appears to be primarily dependent on the convective plume model used, with the ST perturbations primarily affecting its amplitude.

In the troposphere, the default UM ensemble produces more spread than the Kain–Fritsch ensemble. These profiles confirm that the convection parametrization is an important source of variability, and also that different deterministic convection parametrizations produce rather different variabilities in the host model. Thus, if the high-frequency variability of a model does have important effects on climate, then one should either introduce some (stochastic) method to control that variability or, at the least, investigate the on–off triggering characteristics of the GCM convection parametrization.

The schemes that represent model uncertainty (multiplicative noise, random parameters and constant random parameters) tend to scale up the profile of ensemble spread produced by their deterministic analogue in the mid and upper troposphere, but have relatively little effect on the lower troposphere. The multiplicative noise scheme also affects the stratosphere, as it directly perturbs the radiative tendencies that dominate there. The stochastic Plant & Craig scheme also tends to scale up the profile of spread produced by its deterministic mode, but differs from the other methods in that during ActB and SupC it creates substantial increases in spread in the lower troposphere.

The deterministic Plant & Craig scheme generally produces small ensemble spreads, often smaller than those in the Kain–Fritsch ensemble. This is consistent with its design, since it uses time-averaged profiles to reduce time-step-to-time-step variability in its closure calculations. The stochastic form of this scheme is not designed to represent model uncertainty, but rather the variability arising from subsampling the cumulus ensemble within a finite area. For an area of side Δ*x*=100 km, the scheme certainly produces more spread than in deterministic mode, but the spread is still comparable with that of the deterministic Kain–Fritsch ensemble and, in the troposphere, much smaller than that of any of the model uncertainty schemes. However, with Δ*x*=50 km the ensemble spread has tropospheric values comparable to those produced by the model uncertainty schemes. These results suggest that local fluctuations about convective equilibrium become as important as model uncertainty at grid lengths of approximately 50 km, and become a key mechanism for variability at smaller grid lengths.
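The scaling behind this interpretation can be sketched with an idealized calculation. If clouds occur independently with a fixed area density, the cloud count in a grid box of side Δ*x* is Poisson distributed with mean proportional to Δ*x*², so the relative fluctuation of the grid-box convective mass flux grows as 1/Δ*x*. The density and per-cloud mass flux below are purely illustrative (the actual Plant & Craig scheme also randomizes the per-cloud mass flux, which is neglected here for simplicity).

```python
import numpy as np

rng = np.random.default_rng(7)

# Idealized cumulus field: clouds occur at random with fixed area density,
# so the count in a box of side dx_km is Poisson with mean ~ dx_km**2.
density = 1.0 / 30.0**2        # illustrative: one cloud per (30 km)^2 on average
flux_per_cloud = 1.0           # illustrative, arbitrary units

def relative_spread(dx_km, ntrials=200000):
    """Std/mean of the grid-box total mass flux for grid length dx_km."""
    n_clouds = rng.poisson(density * dx_km**2, size=ntrials)
    total = n_clouds * flux_per_cloud
    return total.std() / total.mean()

r100 = relative_spread(100.0)  # mean count ~ 11.1 -> relative spread ~ 0.30
r50 = relative_spread(50.0)    # mean count ~ 2.8  -> relative spread ~ 0.60
```

Halving the grid length from 100 to 50 km doubles the relative fluctuation (1/√λ scaling), consistent with subsampling fluctuations mattering more at finer resolution.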

### (d) Comparison of stochastic physics SCM ensemble spread to model uncertainty

Although the stochastic physics schemes used in this study do produce significant ensemble spread, it remains to be determined whether the levels of spread are appropriate. To examine this point, it is useful to compare the ensembles that are designed to represent model uncertainty with the full range of model states produced by different deterministic structural configurations. A poor man's ensemble is produced by combining the 39-member IC ensembles produced by the default UM, the Kain–Fritsch scheme and the deterministic Plant & Craig scheme, each with equal weighting. The spread of this combined 117-member ensemble can be used as a simple measure of the spread of model states associated with model uncertainty. Certainly, the representativeness of such an ensemble is questionable, but we would suggest that a stochastic scheme that aims to represent model uncertainty should produce at least comparable levels of spread. Figure 12 shows time series of several ensemble percentiles of PWC for each of the stochastic schemes and for the constant random parameters scheme, compared with the same percentiles of the combined deterministic ensemble. (The ensemble mean PWC was shown in figure 8.)
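For concreteness, the pooling step can be sketched as follows; the PWC arrays here are synthetic placeholders standing in for the three 39-member SCM ensembles, and the values are invented.

```python
import numpy as np

rng = np.random.default_rng(3)

# Placeholder PWC time series (kg m^-2), shape (member, time), for the three
# 39-member IC ensembles; real values would come from the SCM output.
ntime = 100
pwc_default = 50 + rng.normal(0, 1.0, (39, ntime))
pwc_kf      = 48 + rng.normal(0, 1.5, (39, ntime))
pwc_pc      = 52 + rng.normal(0, 0.8, (39, ntime))

# Equal weighting of three equally sized ensembles: simply pool the members.
poor_mans = np.concatenate([pwc_default, pwc_kf, pwc_pc], axis=0)  # (117, ntime)

# Ensemble percentiles at each time, as plotted in figure 12.
percentiles = np.percentile(poor_mans, [5, 25, 50, 75, 95], axis=0)
```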

It is encouraging to find that the three schemes designed to represent model uncertainty do indeed produce spread comparable to that of the combined deterministic ensemble (figure 12*a–c*). However, these schemes tend simply to broaden the ensemble about the ensemble mean state of their deterministic analogue. Thus, they fail to explore regions of phase space which are accessible to the other deterministic schemes. The model uncertainty scheme that looks most promising in this study is the random parameters scheme. As discussed in §4*b*, it produces some noise-induced drift, and this appears to be favourable, in the sense that the distribution of PWC is nudged towards that of the combined deterministic ensemble. It fails to encompass the full range of model uncertainty only during the last day or two of the runs.

Figure 12*d*,*e* show results for the stochastic Plant & Craig scheme. These confirm the points made in §4*c* and show that fluctuations about convective equilibrium produce ensemble spread comparable to that associated with model uncertainty on scales of approximately 50 km.

## 5. Conclusions

Single-column tests isolate the parametrization schemes of a GCM and allow one to study their interactions under prescribed forcings. The strengths of the approach are also its weaknesses: it can be very helpful to explore the behaviour of parametrizations in a clean arrangement, but the behaviour is not necessarily representative of that in the parent GCM. We have performed single-column tests of tropical convection, comparing various stochastic physics methods. The interactions of stochastic perturbations with model dynamics are likely to be an important aspect of their behaviour in a full GCM, but we wished to consider whether SCM tests may nonetheless have value.

It is necessary to study stochastic methods with ensembles of SCM runs. Here we have used an ensemble size of 39, which is certainly practical and sufficient to make some inferences about the methods, although additional members might have allowed more definitive statements to be made in some cases. The stochastic ensembles can be usefully compared with deterministic ensembles produced by IC uncertainty and also with combinations of these into poor man's ensembles. Such comparisons allow one not merely to judge whether the ensemble mean from some stochastic method is sensible, but also to assess the variability that the method produces through its interactions with the GCM parametrization set. For example, if the spread of a stochastic ensemble designed to represent model uncertainty were much larger (smaller) than that of the poor man's ensemble, then the SCM results would imply that, for the method to perform well in a full GCM, the interactions of the stochastic perturbations with the GCM dynamics would have to strongly damp (amplify) the variability introduced.

In agreement with Hack & Pedretti (2000), deterministic SCM runs were found to be sensitive to small IC perturbations. The variability of IC ensembles depends on the convection parametrization used, through the timing and frequency of convective triggering. The perturbations chosen allowed the runs to diverge into a set of independent realizations within a few days. The same IC perturbations had very little effect on the ensembles produced by stochastic methods, beyond the first day or two.

Three methods designed to represent model uncertainty appeared to perform well, the ensemble spreads being broadly similar to that of the deterministic poor man's ensemble. The ensemble mean states were close to the ensemble means of the deterministic analogue (differences in the convective parametrization produced substantially larger changes to the mean). For the random parameters scheme, however, there was a statistically significant noise-induced drift of ensemble mean PWC. The Plant & Craig scheme produced levels of spread similar to the model uncertainty approaches for Δ*x*=50 km, suggesting that fluctuations about convective equilibrium form an important component of variability at and below this scale.

Although there are some changes in the methodology and philosophy for SCM testing of stochastic methods, we are inclined to view our results as encouraging and to speculate that SCM testing may have a useful role to play in studying stochastic parametrizations. For example, it seems clear from our results that a 2.5° GCM integration would not be a good test of the impact of the convective fluctuations parametrized in the Plant & Craig scheme. Given that comprehensive testing of all details of all plausible stochastic methods will remain impractical, we contend that the indications of potential impact that may be gleaned from SCM tests are far preferable to no indications at all.

## Acknowledgments

We are grateful to A. Arribas, N. Bowler and K. Mylne for discussions of stochastic methods, to S. Woolnough for discussions of GCSS case 5 and for providing the observational data, to R. Wong for providing us with the default UM SCM configuration for that case and to NCAS Computer Modelling Support. M.A.B. is funded by the NERC award NER/S/A/2006/14189 with CASE support from the Met Office.

## Footnotes

One contribution of 12 to a Theme Issue ‘Stochastic physics and climate modelling’.

↵These datasets are available for the whole of the TOGA-COARE observing period, along with further information about their derivation, at http://tornado.atmos.colostate.edu/togadata/data/ifa_data.html.

↵That is, an ensemble in which each member has the same stochastic parametrization scheme but draws a different set of random numbers for it.

↵By which we mean the equivalent configuration with the stochastic component disabled, providing of course that such an equivalent is well defined. For example, for a stochastic method in which model parameters are selected randomly, the deterministic analogue is simply a simulation with the default parameter set.

↵This is shown for forecast days 3, 5 and 7 in their table 1a.

↵See Teixeira & Reynolds (accepted), fig. 7*a*, for example.

© 2008 The Royal Society