## Abstract

Regime predictability in atmospheric low-order models augmented with stochastic forcing is studied. Atmospheric regimes are identified as persistent or metastable states using a hidden Markov model analysis. A somewhat counterintuitive, coherence resonance-like effect is observed: regime predictability increases with increasing noise level up to an intermediate optimal value, before decreasing when further increasing the noise level. The enhanced regime predictability is due to increased persistence of the regimes. The effect is found in the Lorenz '63 model and a low-order model of barotropic flow over topography. The increased predictability is only present in the regime dynamics, that is, in a coarse-grained view of the system; predictability of individual trajectories decreases monotonically with increasing noise level. A possible explanation for the phenomenon is given and implications of the finding for weather and climate modelling and prediction are discussed.

## 1. Introduction

A striking feature of the atmospheric circulation is that despite its turbulent and chaotic nature the same large-scale flow structures tend to recur over and over again. Much of the low-frequency variability of the atmosphere appears to be contained in a few teleconnection patterns such as the North Atlantic Oscillation and the Pacific/North American pattern. Such persistent and recurrent flow patterns are commonly called atmospheric regimes.

Initially, synoptic meteorologists identified atmospheric regimes in weather maps on an intuitive basis [1,2]. Later, a more systematic framework for regime identification based on probability density modelling was adopted [3–6]. The temporal evolution of the regimes was taken into account by studying transition probabilities between the regimes [7–9]. Most recently, a unified spatio-temporal concept of atmospheric regimes was proposed based on hidden Markov model (HMM) analysis [10–12] and finite-element clustering [13].

While atmospheric regimes and regime transitions occur in deterministic models often a stochasticity assumption is invoked when explaining regime behaviour. Stochastic noise, representing the effect of unresolved scales and processes, can trigger chaotic itinerancy between different regimes by kicking the system out of the basin of attraction of one regime and into another. This has been studied early on by adding stochastic terms to atmospheric low-order models [14,15].

A particular aspect of atmospheric regimes is their prediction and predictability [12,13,16]. It appears conceivable that persistent regimes are the most predictable flow structures and that their time scale of predictability is larger than the time scale of predictability of individual flow trajectories. This would be obviously relevant to numerical weather prediction, in particular given the fact that even state-of-the-art weather prediction models occasionally miss regime transitions owing to model error.

This paper investigates the regime behaviour in atmospheric low-order models with stochastic forcing focusing on regime predictability. In §2, the use of HMMs for identification of regimes is briefly summarized. Section 3 introduces some useful diagnostics to interpret the HMMs. In §§4 and 5, the regime behaviour of the Lorenz '63 model and a simple model of barotropic flow over topography with noise are studied. Section 6 gives some general concluding remarks.

## 2. Identification of atmospheric regimes

Atmospheric regimes are here identified as persistent or metastable states using an HMM analysis [10–12,17,18]. An HMM is a nonlinear spatio-temporal statistical technique which determines simultaneously clusters in state space based on probability density modelling and regime transitions in a time series. We here use a discrete Gaussian HMM. The HMM has a finite number *K* of hidden, that is, not directly observable, states or regimes. The integer variable *c* takes values from 1 to *K*, denoting which regime the system is in. Conditional on being in regime *c*=*i*, an output vector **x** is observed according to a Gaussian density with mean *μ*_{i} and covariance matrix *Γ*_{i},
2.1Transitions between the regimes are governed by a Markov chain with time step *δt* described by a time-independent (*K*×*K*) transition matrix **A** with elements
2.2The matrix **A** is a row-stochastic matrix, that is, *A*_{ij}≥0 and . The initial probability distribution over the regimes is *ξ*_{i}=*p*(*c*_{1}=*i*). The parameters of the HMM are the means , the covariance matrices , the transition matrix **A** and the initial distribution ** ξ**. Given a time series of length

*N*with sampling interval

*δt*, {

**x**

_{1},…,

**x**

_{N}}, the parameters are estimated by maximizing the likelihood

*L*(

**x**

_{1},…,

**x**

_{N}) using the Baum–Welch algorithm [17–19], which is a special case of an expectation–maximization algorithm [20]. The algorithm also gives the time series of posterior probabilities 2.3The number of regimes

*K*is a hyper-parameter of the HMM which has to be fixed

*a priori*, determining the overall complexity of the HMM. In keeping with the philosophy of atmospheric regimes as metastable states,

*K*is here determined by trying different values and looking for a gap in the eigenvalue spectra of the transition matrices

**A**[10–12] (see §3

*b*).

## 3. Markovian regime diagnostics

A couple of useful diagnostics are introduced, which facilitate the interpretation of the HMMs.

### (a) Stationary distribution

A stationary or invariant distribution of the Markov chain is given by a stochastic vector ** π**=(

*π*

_{1},…,

*π*

_{K}) (where

*π*

_{i}denotes the probability of being in regime

*i*) satisfying 3.1According to the Perron–Frobenius theorem, the transition matrix

**A**possesses a leading eigenvalue

*λ*

_{1}=1 and the corresponding (left-)eigenvector has non-negative entries. Generically, the eigenvector is unique (up to normalization) and represents the stationary distribution to which all initial distributions converge. A sufficient condition for this is that

**A**is primitive, that is,

**A**

^{k}>0 for some integer

*k*≥1.

### (b) Predictability time scales

The remaining eigenvalues {*λ*_{2},…,*λ*_{K}} of the transition matrix **A** (generically) have modulus smaller than one. Their (left-)eigenvectors describe deviations from the stationary distribution as modes in probability space and the modulus of the eigenvalue determines the rate of decay under time evolution. Each eigenvalue can be associated with a predictability time scale *τ*_{i} given by the *e*-folding time of the decay of the corresponding probability mode
3.2If the regime dynamics are indeed Markovian, then *τ*_{i} should be independent of the time step *δt*. Ideally, this should be checked [11,12] by fitting HMMs with different choices of *δt*. The eigenvalues {*λ*_{2},…,*λ*_{K}} and corresponding time scales {*τ*_{2},…,*τ*_{K}} offer a natural criterion for determining the number of regimes *K*. One would try HMMs with various choices for *K*, look for a gap in the eigenvalue spectra of the transition matrices and keep only the eigenvalues closest to one associated with the longest time scales *τ*_{i} [10–12].

### (c) Residence time

The number of steps *m* the Markov chain spends in regime *i* at any one visit follows a geometric distribution, corresponding to the waiting time for the first success in a Bernoulli trial process with success probability 1−*A*_{ii}. We have and . The mean residence time in regime *i* is given as
3.3An overall mean residence time of the system can be defined as the weighted average of the mean residence times
3.4For Markovian regime dynamics, the residence times are independent of the time step *δt*.

### (d) Recurrence time

A point in any of the regimes will eventually leave that regime and return to it at some later time. The Smoluchowski recurrence time of regime *i* is the average time elapsing between a point leaving regime *i* and then returning to it again. It is given by [21]
3.5An average recurrence time of the whole system is
3.6For Markovian regime dynamics, the recurrence times are independent of the time step *δt*.

### (e) Entropy

A (negatively oriented) measure of the predictability of a point starting initially in regime *i* is given by the entropy
3.7The entropy of the Markov chain as a whole [7,22] is the average with respect to the stationary distribution of the entropies of the individual regimes,
3.8The total entropy *H* is a measure of the uncertainty in the future time evolution of a point randomly drawn from the stationary distribution. The entropies depend on the time step *δt*.

## 4. Lorenz '63 system

The first example system considered is the classical Lorenz model [23] augmented with stochastic noise [8,12,24]. The governing equations are
4.1where *η*_{1}, *η*_{2} and *η*_{3} are pairwise independent white Gaussian noise processes with zero mean and unit variance. The standard parameter values *s*=10, *r*=28 and *b*=8/3 are used. The noise standard deviation *σ* is a parameter to be varied. The model is integrated in time numerically using the Euler–Maruyama scheme with step size 5×10^{−5}. In the deterministic system (*σ*=0), the motion settles onto a chaotic attractor with two butterfly-wing-shaped lobes; the trajectory switches irregularly between the two lobes. On each of the wings, the state vector spirals around the unstable fixed point. The attractor is very robust against noise. It is gradually deformed when increasing the noise level but the pronounced two-wing shape is still intact and very similar to the deterministic case up to noise levels as large as *σ*=10. After discarding transient motion, 10 000 time units worth of data are archived at a sampling interval of *δt*=0.1, resulting in a time series of length *N*=100 000.

We first investigate the predictability of the system in the classical sense, that is, the predictability of individual trajectories as a function of the noise level. This is done using analogue or locally constant prediction based on nearest neighbours [25,26]. The dataset is split into a learning dataset and a verification dataset of length 50 000 each. For each point in the verification dataset, the *M* nearest neighbours in the learning dataset with respect to the Euclidian norm are found. A prediction of the future time evolution is given as the average over the future time evolutions of the points in the neighbourhood. Figure 1 shows the predictive skill measured by the RMSE for some representative noise levels. The skill is optimized with respect to the number of nearest neighbours. An optimal value is found to be approximately *M*=10 for *σ*=0 and *M*=40 for *σ*>0. For large lead times, the RMSE of the analogue predictor asymptotes to the RMSE of the climatology forecast, which is equal to the square root of the total variance of the system. Here, the total variance of the system is virtually independent of the noise level. Expectedly, the predictability decreases monotonically with increasing noise level at all lead times. The deterministic case (*σ*=0) shows some return of skill carrying the period of oscillations on the wings but this is not our concern here.

The HMM analysis is performed only on the variable *x* (**x**=*x*). The time series of *x* alone already contains the regime information. Moreover, the problems associated with the strong non-Gaussianity of the Lorenz attractor are less severe when working only in one dimension. HMMs with numbers of regimes *K*=2, *K*=3, *K*=4 and *K*=5 were fitted. Figure 2 displays the eigenvalue spectra of the transition matrices and the corresponding predictability time scales. The eigenvalues are all real here. There is a distinct gap after the second eigenvalue in both the deterministic and the noise-driven system, corresponding to a clear time-scale separation. We therefore only consider HMMs with *K*=2. The regime behaviour of the system is described by the two-wing structure of the attractor. For *σ*=0, the means of the full state vector (*x*,*y*,*z*) conditional on the regimes, that is, the means with respect to the posterior distribution *ν*_{i,n}, are (−6.41, −6.41, 23.59) and (6.35, 6.35, 23.52), close to the unstable steady state on each of the wings. They change only slowly when increasing the noise level. These results are consistent with earlier findings using the full state vector in the HMMs [12].

The predictability time scale *τ*_{2} (figure 3*a*) substantially increases with increasing noise level up to *σ*≈3.2, before decreasing when further increasing the noise level. The increase in *τ*_{2} is almost 25% relative to the value at *σ*=0. The mean residence times (figure 3*b*) show the same pattern, an increase by almost 25% up to *σ*≈3.2 and then a decrease. The entropies (figure 3*c*) behave correspondingly, indicating increased predictability at intermediate noise levels. We here have a complete symmetry between the two regimes owing to the symmetry in the Lorenz equations. The increased regime predictability is due to enhanced persistence of the regimes. This is also evidenced by the transition matrices which at *σ*=0 and *σ*=3.2 are
4.2Owing to the symmetry of the system, we have an equipartition between the regimes in the stationary distribution (figure 3*d*). All deviations are within the statistical sampling fluctuations.

The results are statistically highly significant. The points on the curves in figure 3 refer to integrations with different initial conditions and noise realizations. The fluctuations between neighbouring points give an indication of the sampling uncertainty, which is fairly small.

The increased persistence of the regimes is also visible by eye from the time series of the system (figure 4). To expand on this in more detail, the whole distribution of regime residence times is considered. A data point is classified to be in the regime which has the larger posterior probability *ν*_{i,n}. Figure 5 gives the cumulative distribution of residence times for both regimes together as they are symmetric. The geometric distributions from the Markov model are also indicated. For *σ*=0, the residence times are actually quantized with the period of the oscillations on the wings (about 0.7 time units) as a regime can only be left where the wings merge, that is, after an integer number of oscillations. This feature is still visible in the noise-driven system. At intermediate noise level, the distribution of regime residence times is shifted to larger values. In particular, long residence times are more likely. For example, at *σ*=0, 5% of residence times are larger than 4.25; at *σ*=3.2, 5% of residence times are larger than 6.2.

## 5. Barotropic atmospheric low-order model

We consider a low-order model of large-scale barotropic flow over topography displaying regime behaviour. The system is very similar to the model proposed in the seminal paper by Charney & DeVore [27]. We here actually use the formulation derived by De Swart [28] and also used by Crommelin *et al.* [29] which has a more general zonal forcing profile and a slightly different scaling than the original model by Charney and DeVore. In the following, the model setting is briefly summarized. For more details on the derivation as well as the bifurcation and regime structure of the model, see [28,29]. Here, the model is augmented with stochastic noise.

The model is based on the non-dimensional quasi-geostrophic barotropic vorticity equation in a *β*-plane channel. The governing equation is
5.1where *ψ* is the streamfunction, *h* is the topography, *ψ** is a streamfunction forcing and *J* is the Jacobian operator. The equation is considered on the rectangular domain [0,2*π*]×[0,*bπ*]. The streamfunction is periodic in the *x*-direction: *ψ*(*x*,*y*,*t*)=*ψ*(*x*+2*π*,*y*,*t*). At *y*=0 and *y*=*bπ*, we have the boundary conditions ∂*ψ*/∂*x*=0 and . The streamfunction is expanded into a double Fourier series and only the very large-scale components up to zonal wavenumber 1 and meridional wavenumber 2 are kept, resulting in six real modes, two zonal components and four wave components. The topography is chosen as a wave (1,1) profile: . The streamfunction forcing profile is purely zonal: *ψ**=*ψ**(*y*). Galerkin projection of the vorticity equation onto the basis functions and adding stochastic forcing yields a system of six stochastic differential equations for the streamfunction expansion coefficients as follows:
5.2Here *η*_{i} are pairwise independent Gaussian white noises with zero mean and unit variance. *σ* is the standard deviation of the stochastic forcing. The variables *x*_{1} and *x*_{4} refer to the zonal components, the other variables to the wave components of the flow. The model coefficients are
5.3In the above equations, the terms with *β*_{m} are due to the Coriolis force, the terms with *γ*_{m} and describe topographic interactions, the terms with *α*_{m}, *δ*_{m} and *ε* are nonlinear advection terms and the terms with *C* represent Newtonian damping towards the forced zonal streamfunction profile. The free parameters of the model are the zonal forcing ( and ), the damping time scale (*C*), the Coriolis parameter (*β*), the topographic scale height (*γ*) and the length–width ratio of the channel (*b*). We here use the parameter setting at which the deterministic model exhibits chaotic behaviour with regime transitions [29]. These parameters are within a physically reasonable range. The regime behaviour in the model is owing to a combination of topographic and barotropic instabilities [29]. At each considered noise level, the system is integrated numerically using the Euler–Maruyama scheme with step size 2×10^{−4}. A post-transient time series of length *N*=100 000 is archived with a sampling interval *δt*=1.

Again, the predictability of individual trajectories is investigated. An analogue predictor using nearest neighbours with respect to the Euclidean norm is constructed as before. The length of the learning and verification datasets is 50 000 each. In figure 6, the RMSE for some representative noise levels is displayed. The number of nearest neighbours is *M*=15 for *σ*=0 and *M*=40 for *σ*>0. At long lead times, the errors converge to the square root of the total variance of the system, which slightly increases with increasing noise level. The predictability of the system decreases monotonically with increasing noise level at all lead times.

The HMM analysis is performed in the space of the two zonal flow components (**x**=(*x*_{1},*x*_{4})). Figure 7 shows the eigenvalue spectra of the transition matrices and corresponding predictability time scales for *K*=2, *K*=3, *K*=4 and *K*=5. Again, all of the eigenvalues are real. There is a gap in the time scale after the second eigenvalue, but the third eigenvalue increases in the noisy system. An HMM with *K*=2 quite strongly underestimates the time scale; this may be due to the strong non-Gaussianity of the probability density of the system. Moreover, visual inspection of the regimes in physical space suggests three regimes (see below). We therefore here choose the HMM with *K*=3.

The streamfunction fields associated with the regimes (calculated as the means of the full state vector of the system conditional on the regimes) are displayed in figure 8 for *σ*=0. Regime 1 is a zonal flow regime, regime 3 is a blocking-like regime. Regime 2 has an intermediate character, corresponding to a weakly blocked flow. An HMM with *K*=2 merges regime 1 and regime 2, which is not satisfactory as they are physically quite distinct. The position of the regimes in state space is fairly robust under noise up to about *σ*=0.01, experiencing only a gradual drift. For even higher noise level, all regimes attain a more and more blocked character while still being distinct.

The regime behaviour extracted by the HMM is also visible in time series of the system (figure 9). In the (*x*_{1},*x*_{4})-plane, the regime centres for *σ*=0 are located at *μ*_{1}=(0.922,−0.622), *μ*_{2}=(0.865,−0.448) and *μ*_{3}=(0.774,−0.372). The long time scale *τ*_{2} is associated with the blocking regime versus the two others. The shorter time scale *τ*_{3} is associated with transitions between the zonal and the weakly blocked regime. The zonal and the blocking regime correspond to maxima in the probability density in the (*x*_{1},*x*_{4})-plane; the weakly blocked regime is not visible in the probability density [29].

The predictability time scales (figure 10*a*) increase with increasing noise level up to an intermediate value, before decreasing when further increasing the noise level. *τ*_{2} increases by about 50% and reaches its maximum at *σ*=0.006; *τ*_{3} more than doubles and attains its maximum at *σ*=0.0085. The mean residence times (figure 10*b*) peak at an intermediate noise level for all three regimes. This is particularly prominent for the blocking regime at *σ*=0.006 with a doubling of the mean residence time compared with the value at *σ*=0. The mean recurrence time (figure 10*c*) of the blocking regime gradually decreases with increasing noise level; for the other two regimes it peaks at an intermediate noise level. The entropies (figure 10*d*) indicate maximum regime predictability at intermediate noise level, which is *σ*=0.0075 for the total entropy. The stationary probability distribution (figure 10*e*) changes under stochastic forcing. The blocking regime gains in population while becoming more persistent, before dropping again in favour of the weakly blocked regime. The population of the zonal regime does not change much. The changes in the regime dynamics are visible in the time series of the system (figure 9*c*,*d*). The noise-driven model shows a preference for long blocking episodes and the time scale of the switches between the zonal and the weakly blocked regime is prolonged.

The transition matrices at *σ*=0 and *σ*=0.006 are
5.4They reflect the increased persistence of all three regimes. At *σ*=0, there is an almost closed cycle between the zonal and the weakly blocked regime. This corresponds to long episodes without blocking visible in the time series around *t*=1000 and *t*=4200 (figure 9*a*,*b*). This cycle generates the long mean recurrence time of the blocking regime. Under stochastic forcing, the cycle gradually opens up.

It can be noted that the increase in the regime predictability time scales for the noise-driven system is robust against the choice of the number of regimes *K* (figure 7).

## 6. Discussion

Predictability in the Lorenz '63 system and a barotropic atmospheric low-order model with stochastic noise was investigated. In both models, predictability in the classical sense, that is, predictability of individual flow trajectories, decreases monotonically with increasing noise level. This goes with the intuition that adding uncertainty to the system results in information loss and reduced predictability. By contrast, in both models, regime predictability increases when adding a moderate stochastic forcing, before decreasing for large amounts of stochastic forcing. The enhanced predictability is due to an increased persistence of the regimes. The effect is loosely similar to a coherence resonance [30] where the output of an excitable system is most coherent at an intermediate optimal noise level.

This study uses a simple additive stochastic forcing with the same standard deviation for all variables and no correlations between the noise terms. It appears to be likely that the effect of increased regime predictability could be optimized with a more sophisticated choice of stochastic noise.

A possible explanation for the findings lies in the mechanism of perturbed heteroclinic connections as the underlying dynamics of regime transitions [8,9,29]. In order to exit a regime the system has to follow a particular trajectory along the unstable direction of that regime. This acts as a bottleneck. A small amount of noise prevents the system from finding the exit path and thus increases the persistence of the regimes. Only at large enough noise levels, the diffusion effect takes over and facilitates the regime transitions. Preferred regime transition paths have been found in the Lorenz '63 system [8], in intermediate complexity atmospheric models [8,9] and in reanalysis data [7].

The present results have been obtained with very simple models and it remains to be seen if they carry over to more realistic atmospheric models. They generally hint at the likely importance of sub-grid-scale variability for setting the dynamical structure of atmospheric regimes. It has recently been shown that quite a high model resolution is necessary to faithfully simulate the nonlinear features of regimes found in the real atmosphere [31]. Atmospheric regimes appear to be a complex phenomenon involving many spatial and temporal scales. It has also been shown that stochastic parametrizations can improve a model's regime behaviour, in particular the occurrence of blocking [32]. Stochastic terms may have a function for the model's regime predictability to match the regime predictability of the real atmosphere but this certainly needs further investigation.

## Footnotes

One contribution of 14 to a Theme Issue ‘Stochastic modelling and energy-efficient computing for weather and climate prediction’.

- © 2014 The Author(s) Published by the Royal Society. All rights reserved.