Long-range dependence (LRD) and non-Gaussianity are ubiquitous in many natural systems such as ecosystems, biological systems and climate. However, it is not always appreciated that the two phenomena may occur together in natural systems and that self-similarity in a system can be a superposition of both phenomena. These features, which are common in complex systems, impact the attribution of trends and the occurrence and clustering of extremes. The risk assessment of systems with these properties will lead to different outcomes (e.g. return periods) than the more common assumption of independence of extremes. Two paradigmatic models are discussed that can simultaneously account for LRD and non-Gaussianity: autoregressive fractional integrated moving average (ARFIMA) and linear fractional stable motion (LFSM). Statistical properties of estimators for LRD and self-similarity are critically assessed. It is found that the most popular estimators can be biased in the presence of important features of many natural systems like trends and multiplicative noise. Also the LRD and non-Gaussianity of two typical natural time series are discussed.
Meteorology has been both a seedbed and a testbed for many advances in the mathematics of complex systems; the former being seen in its contribution of the Lorenz model to nonlinear science , while the latter is exemplified by ensemble forecasting. This parallel evolution of theory and practice has identified important problems, e.g. the need to model the effect of extra, noise-like, degrees of freedom on deterministic low-dimensional dynamics. Other well-established paradigms such as Hasselmann's stochastic model  have highlighted the importance of red noise to mathematical climatology.
However, since the pioneering work of the late Benoit Mandelbrot, increasing attention has been paid to two self-similar (or ‘fractal’) aspects of time series: long-range dependence (LRD) in time, and the spatial counterpart of LRD, heavy-tailed probability distributions in amplitude. First identified in hydrology, and since studied in research areas as diverse as biology, telecommunications, social networks, econometrics and the climate system, LRD is characterized by its low-frequency singular behaviour, the so-called 1/f power spectrum. When present in a signal, the correlations captured by LRD will both hamper the identification and the attribution of deterministic trends , and impede the quantification of their significance. As was the case in the original context of hydrology, the presence of LRD in climate time series has been intensely debated, and it is still sometimes ascribed to transient effects, calibration issues or other forms of non-stationarity [4,5].
LRD is also of practical significance. By making systems subject to longer-lived fluctuations , it changes the information available to make predictions about the state of a complex system. It impacts the occurrence and clustering of extremes [7–9], which are important for risk assessments and mitigation strategies, e.g. insurance pricing and flood defence. The traditional assumption is that extremes are independent events. But there is growing evidence for clustering of extreme extra-tropical storms  and precipitation events (e.g. 2007 United Kingdom floods). This clustering of extremes will lead to higher return periods of extreme events than the more common assumption of independence. Recently, detrended fluctuation analysis (DFA), a tool originally developed to detect LRD, has found application in the prediction of dangerous bifurcations in dynamical systems, such as climate ‘tipping points’ [11–13]. Our contribution in this paper is focused on both (i) the problem of accurately estimating LRD in the presence of other signal elements and (ii) the complicating effects of heavy-tailed amplitude distributions, when they are present.
A process is long-range-dependent when the prediction of its next state depends on the whole of its past. An imprint of this dependence structure is that the covariance r(k)=Cov(X(k),X(0)) decays slowly, as , so that 1.1 This slow decay of the covariances means that the values of the process X are strongly dependent over long periods of time. This contrasts with the more familiar short-range-dependent system, where . In a short-range-dependent process, the next state only depends on the current state and the recent past. The archetype of a short-range-dependent process is a first-order Markov process, where the next state depends only on the present state and is conditionally independent of past states. See Beran  for more details.
LRD of a system is characterized by the parameter d in the statistics literature and sometimes by Mandelbrot's Joseph parameter J, for example, in the physics literature. Temporal LRD has been detected in the water level of the river Nile [14,15]. It was observed that the range of values grows as τJ, where J refers to the Joseph exponent [16, p. 157] and τ to the time period under consideration. The growth of range was anomalously large compared with that in the familiar paradigms of random walks and Brownian motion, which have both played a central role in understanding diffusion. In the random walk, the variance grows as the square root of time τ1/2 . The subsequent theoretical explanation of the difference between the observed anomalous diffusion and the Brownian motion paradigm came first through the use of a self-similar process, in particular via the study of fractional Brownian motion (fBm) . Here, self-similarity means that the statistical properties on all scales are similar. This property is controlled by the self-similarity parameter H. A stochastic process S is self-similar when a rescaling of time by a factor λ leads to a rescaling of the amplitude of the process S by λH. Thus, a process is said to be H-self-similar when the following relation holds : 1.2 where ∼d denotes equality in distribution. It is not always appreciated, and is sometimes confused in the literature, that two different aspects of a time series can contribute to its self-similarity H. The first is LRD, which is synonymous with persistence, referred to by Mandelbrot & Wallis  as the Joseph effect. Persistent systems exhibit longevity by having a tendency to maintain the way they have been recently. Examples are heat waves and drought conditions. The second source of self-similarity identified by Mandelbrot, the so-called Noah effect, is the property that change in a system can be rather large and can occur very abruptly, i.e. time series drawn from systems can exhibit sharp discontinuities, e.g. earthquakes.
It may at first seem odd that the two phenomena occur simultaneously in natural systems, because they pull in opposite directions. On reflection, taken together, however, we see that the two effects capture the facts that coherent structures in nature are real and that they can emerge, change or even vanish very quickly. The archetypal model of the Noah effect is non-Gaussian jumps whose complementary cumulative distribution function decays with a power law in size s, p(s)∼s−α. Here, α denotes the stability exponent (referred to as tail exponent in the statistics literature). Thus, the self-similarity parameter is 1.3 The distinction between H and is important because not all observed time series have Gaussian fluctuations. As such, one may find that popular diagnostics, as shown below, may be insensitive to heavy tails, measuring J and not H. However, in many situations, both heavy tails and the clustering of extremes caused by LRD are very important and both contribute to the value of H. This makes it necessary to be able to measure both H and J independently.
Many estimators for the LRD exponent d have thus been developed, and are widely used in the respective literature. Much of what has been rigorously established about the estimators, however, is for a particular LRD Gaussian model, fBm, which is also H self-similar and has a particularly simple relation between H and J, i.e. (α=2 for Gaussian increments). Most observed time series depart from the assumptions of fBm in some way or another, so it is important to critically evaluate how sensitive the estimators are to deviations from fractional Gaussian noise (fGn). fBm is a random walk with long-range-correlated increments and the value of J in such H self-similar walks is directly connected to the presence of LRD in their increments.
In this study, we critically re-evaluate four popular methods for measuring d and two for measuring H: variable bandwidth (VB) estimator , wavelets (WLs) , DFA  and the rescaled range analysis (R/S) [14,16]; exact Whittle estimator [23,24] and semi-parametric estimators (in this study we will focus on the Geweke–Porter-Hudak (GPH) estimator ; semi-parametric estimators are also described in Bardet et al. , Robinson [27,28] and Hurvich et al. ). This is a not a complete set of estimators for the self-similarity or LRD parameter (other estimators that are well established in the statistics community include the fractional exponential (FEXP) model ). Here, we want to focus on some of the most popular estimators in the physics and geosciences communities (R/S [31–35], DFA [7, 35–37], GPH [3,38,39], VB  and WL [21,41]).
We show scatter plots that compare the imposed parameters for a number of trials with the values detected by the above methods. These have the advantage of visually indicating which methods measure the complete self-similarity exponent H, and which, instead, measure LRD via d. For evaluating the performance of these estimators, we employ two well-established paradigmatic time-series models: a self-similar process (linear fractional stable motion (LFSM) ) and an asymptotically self-similar process (autoregressive fractional integrated moving average (ARFIMA) ).
In most natural systems, trends are common properties, so we systematically examine the performance of the various estimators when a linear trend is superposed on each time series. This is an important issue because in practice it is not always easy to remove trends. That they typically manifest themselves in higher moments presents further challenges still. Trends are the hardest to deal with because long-range-dependent processes can produce apparent trends over rather long periods of time . Therefore, it can be difficult to decide if a trend is due to external forcing or due to finite time-series length.
In §2, we present the two paradigmatic time-series models in more detail and discuss various special cases. In §3, we introduce the estimators of LRD and self-similarity. We test their utility in §4. In §5, we discuss the LRD and self-similarity properties of two exemplary time series from nature, and in §6, we provide conclusions.
2. Paradigmatic models of natural time series
(a) Natural time-series examples
The most prominent and widely used paradigmatic LRD time-series model, especially in the physics literature, is fGn. However, most natural time series are not strictly Gaussian and many are actually highly non-Gaussian. To illustrate this point, we discuss two time series from nature. In figure 1a, we display monthly mean temperature from the Antarctic station Faraday–Vernadsky [3,43]. It is easy to see that the raw time series is highly non-Gaussian. This is confirmed by the probability distribution functions (PDFs) plotted in figure 1b. There is also a striking seasonal signal visible in the skewness of the PDFs. However, it must be noted that polar temperature time series are much more non-Gaussian than mid-latitude ones, which are typically nearly Gaussian. Our second exemplary time series is the auroral electrojet (AE) index (figure 1c [44,45]). The AE index is derived from 1 min resolution time series from 12 high-latitude magnetometers. Reflecting the intermittent nature of the ionospheric and solar wind processes that influence it, the AE index is seen to be spiky and strongly non-Gaussian (figure 1d). Clearly, many natural time series are non-Gaussian.
(b) Paradigmatic models
While fGn and fBm are Gaussian, they are still useful idealized paradigmatic models for understanding many observed phenomena. A paradigmatic model is an idealized framework that captures some properties of observed time series, though not all. Most paradigmatic models allow analytical work, although at the expense of an over-idealization of the physical or biological phenomena. There is a need for better paradigmatic models of observed phenomena that allow simultaneously non-Gaussian statistics and LRD. Such models are needed, for example, for hypothesis testing and time-series simulation. Thus, we briefly discuss two paradigmatic time-series models that exhibit LRD and non-Gaussianity. The widely used fGn and fBm are special cases in the Gaussian limit of these more general models, thereby retaining some of the analytical tractability of these models.
In this study, we use two classes of processes with symmetric α-stable (SαS) distributions. They are the LFSM and the ARFIMA model with SαS innovations. Stoev & Taqqu  present efficient methods for the simulation of these processes using the fast Fourier transform.
(i) Linear fractional stable motion
LFSM is a model that exhibits LRD and heavy tails at the same time. It is an extension to the simpler Brownian random walk model. It links individual heavy-tailed jumps by means of a retarded memory kernel. It can be represented by a stochastic process1 , which is defined by the stochastic integral 2.1 where 0<H<1, α∈(0,2) and where is a standard symmetric α-stable (SαS) Lévy process. The process XH,α is called LFSM and is self-similar with self-similarity parameter H. Thus, it satisfies relation (1.2) and has stationary increments. The parameter α controls the tails of the distribution of an SαS random variable ξ, that is, 2.2 The greater the value of α, the lower the probability of extreme fluctuations of the SαS process XH,α. We recommend the introductions by Taqqu  and Mercik et al.  for detailed expositions.
(ii) Fractional Brownian motion
In the Gaussian case, we have that α=2 and the LFSM process (2.1) reduces to fBm. In this case, the self-similarity parameter H will always equal the Joseph exponent J, thus .
(iii) Ordinary Lévy motion
In the memory-less case, we have d=0 and 0<α<2. In this case, equation (2.2) holds and the tails of the distribution decay according to a power law and consequently Lα has infinite variance and is called ordinary Lévy motion. This encapsulates the Noah effect. In this case, the self-similarity exponent is H=1/α.
(iv) Autoregressive fractional integrated moving average
Another paradigmatic model exhibiting LRD with heavy-tailed fluctuations is ARFIMA, which has the added ability of exhibiting short-range-dependent behaviour. This model, denoted ARFIMA(p,d,q), , extends the usual ARIMA(p,d,q) models, in which d takes integer values . An ARIMA model is written as 2.3 where B denotes the back shift operator defined by BXt=Xt−1, B2Xt=Xt−2,…. The polynomials Φ and Ψ are defined as and , where p and q are integers. The innovations ϵt (t=1,2,…) are usually independent and identically distributed (iid) normal variables with zero expectation and variance , but can also be α-stable distributed. The widely used autoregressive process of first-order (AR(1)) is a special case with p=1, d=0 and q=0. Typically d is an integer. In the case that d is a fractional real number, X(t) is an ARFIMA(p,d,q) process with and exhibits LRD. In the Gaussian case, ARFIMA is stationary. An ARFIMA(0,d,0) is asymptotically equivalent to fBm . An ARFIMA(p,d,q) with p>0 and q>0 is long-range-dependent but not self-similar. However, it is asymptotically self-similar for long time scales.
3. Estimators of long-range dependence and self-similarity
Here, we briefly describe the four estimators of the self-similarity (SS) and LRD parameters used in this study.
(a) Variable bandwidth
The VB method  is a technique for estimating the self-similarity exponent, H, from a time series x(t). The time series of length T is divided into windows or ‘bands’ of width r. The VB method can deploy two different algorithms to estimate H: (i) the standard deviation of the time series, σ(r), is computed in each band; or (ii) the difference between the maximum and minimum value in each band, ϵ(r), is computed. Then σ(r) and ϵ(r) are averaged over all the possible bands by varying the origin at fixed r, 3.1 where Lr is the number of windows of length r. This is repeated over a range of window sizes. Both quantities follow a power-law behaviour for self-similar time series  such that VBw(r)=rH and VBδ(r)=rH. Thus, the self-similarity exponent, H, is obtained from the slope of the corresponding log–log plot.
A WL ψ is a function with zero average and is normalized to one. A family of WLs is generated by scaling ψ by a factor s and translating it by u (). The WL transform allows one to construct a time–frequency representation of a signal, the WL spectrum. One can then infer the self-similarity parameter from the WL spectrum via ordinary least squares at large WL scales [21,50].
(c) Rescaled range
The rescaled range R/S  is a technique for estimating the LRD parameter d from a time series. The R/S estimator is given by 3.2 where 1≤t≤τ, , and . It scales like τJ, where the value of J can be estimated from the slope of R/S from a log–log plot.
(d) Detrended fluctuation analysis
DFA  also estimates the LRD parameter d from a time series. In DFA, a profile is first computed by . The profile is cut into Ns non-overlapping segments of equal length s and then the local trend is subtracted for each segment v by a polynomial least-squares fit of the data. Linear (DFA1), quadratic (DFA2), cubic (DFA3) or higher-order polynomials can be used for detrending. In the nth-order DFA, trends of order n in the profile, and of order n−1 in the original record, are eliminated. Next, the variance for each of the Ns segments is calculated by averaging over all data points i in the vth segment: 3.3 Finally, the average over all segments is computed and the square root is applied to obtain the following fluctuation function: 3.4 For different detrending orders, n, we obtain different fluctuation functions F(s), which are denoted by F(n)(s). The fluctuation function scales according to F(n)(s)∼sζ, with . There are many variants of DFA, but use of standard DFA is recommended by Bashan et al.  if the functional form of a trend is not a priori known.
(e) Exact Whittle estimator
The exact Whittle estimator is a semi-parametric estimator [23,24]. This method assumes that the underlying model of LRD can be represented by (1−B)dXt=εt, where ε is iid noise. See Shimotsu & Phillips [23,24] for more details.
(f) Power spectral method
As a semi-parametric estimator, we use the power spectral method of Geweke & Porter-Hudak  and Hurvich et al. . Spectral methods find d by estimating the spectral slope. The periodogram is used, which is an estimate of the spectral density of a finite-length time series and is given by 3.5 where λj=j/N is the frequency and the square brackets denote rounding towards zero. A series with LRD has a spectral density proportional to |λ|−2d close to the origin. Since is an estimator of the spectral density, d is estimated by a regression of the logarithm of the periodogram versus the logarithm of the frequency λ. Thus having calculated the spectral density estimate , semi-parametric estimators fit a power law of the form f(λ,b,d)=b|λ|d, where b is the scaling factor.
4. Empirical tests
In order to examine the statistical properties of the LRD estimators, like bias, spread and outliers, we generate various test time series from the above paradigmatic models. Specifically, we generate ensembles of 1000 members with randomly selected parameters from uniform distributions: fGn with d∈(−0.5,0.5), ordinary Lévy noise (oLn) with α∈(1,2), ARFIMA with Gaussian increments ARFIMA-G(1,d,1) and with ordinary Lévy increments ARFIMA-L(1,d,1) with the autoregressive coefficient , and the moving average coefficient b1∈(0,1), and α∈(1,2). We also use various first-order autoregressive processes with parameters randomly chosen in (−1,1). The time-series length is always 215−M with M=6000 (see Stoev & Taqqu  for an explanation of M), which is comparable to the length of most observed climatic and other natural time series, thus long enough for our purposes.
(a) Paradigmatic time series
First we examine how well the SS and LRD estimators work for the models of self-similarity. Most of the methods (e.g. DFA and R/S) require a regression fit in order to estimate the SS or LRD parameters. As shown by Chen et al. , non-stationarities and short-range dependences can cause crossovers in the fluctuation curves. Because of this we only regress on the long-range part of the fluctuation curves. Our results are robust to the particular choice of the cut-off.
As can be seen in figure 2, the WL and the VB methods are the only methods that are able to infer the self-similarity of oLn. While the WL method has no large estimation spread, the VB method exhibits large errors. All other estimators estimate d rather than H, with R/S producing the largest estimation variance, while Whittle, DFA1 and GPH have considerably smaller estimation variances, but with the odd outlier (figure 2). All four estimators do a reasonably good job of inferring H from fGn, with VBw(r) having the largest bias for H close to 0 and close to 1, and R/S also having some bias at small H with a relatively large estimation variance. WL, DFA1 and GPH produce very tight estimates, with WL and DFA1 only biased for small values, and Whittle and GPH working well over the entire range. For these paradigmatic time series, the two semi-parametric estimators, Whittle and GPH, give the best estimates. This picture changes once we allow for short-term dependence structures to contaminate the pure self-similar character of fGn and oLn. As figure 3 shows, all estimators work considerably worse for ARFIMA surrogate data, with large estimation spreads and many outliers. Again the Whittle estimator and GPH perform better than the other estimators.
Before we go on to examine how well the estimators work with superimposed trends, we test their accuracy on data generated from three basic short-range dependence models: independent white noise, AR(1) and ARFIMA(1,0,1). As figure 4a–c show, the Whittle estimator, GPH and DFA have the least bias. But it also has to be noted that all estimators have many outliers, suggesting that, given an individual time series, the estimates of d or H can be severely biased. The performance of higher-order DFA is very similar to DFA1 in the above test cases (not shown). These results are consistent with previous empirical studies .
Another important test is to see how the presence of trends affects the various estimators. Here, we consider three cases: (1) a linear trend superimposed by realizations of fGn, oLn, ARFIMA-G and ARFIMA-L; (2) a linear trend only in the second half of the time series superimposed by realizations of fGn, oLn, ARFIMA-G and ARFIMA-L; and (3) a linear trend in the variance. Cases 2 and 3 are motivated by climate change where the time series may (case 2) or may not (case 1) include the pre-industrial era, and where climate change also influences the frequency and the strength of fluctuations (e.g. storms; case 3). Based on the evidence gathered so far, it is reasonable to expect that this can lead to bias any LRD estimate.
For cases 1 and 2, we assume the magnitude of the linear trend to be 1. The empirical tests reveal that the GPH estimator is slightly biased for fGn, ARFIMA-G and ARFIMA-L, and has a relatively large negative bias for oLn (figure 5). DFA is least biased for fGn data, but has considerable bias for oLn, ARFIMA-G and ARFIMA-L (figure 5) generated data. Both ARFIMA-G and ARFIMA-L estimators have a large number of outliers. Both R/S and VB show qualitatively similar behaviour compared with GPH (not shown). Our results are qualitatively consistent with the study by Hu et al. .
Now, we examine how the estimators handle a trend in the variance (case 3). For this purpose, we use an AR(1) model with increasing variance: 4.1 Here, again, we generate 1000 realizations by sampling values for F from a uniform distribution U(0,2), using α∈U(−0.5,0.5) and σ∈U(0,1). For this case, DFA is the least biased, and GPH and R/S show considerable bias, while the VB estimators have a huge bias, although their estimates are very narrow (figure 4e).
(c) Correlated additive and multiplicative noise
Both LFSM and ARFIMA are additive noise models. In previous studies, it has been shown that multiplicative noise is important in natural systems [54,55]. This raises the question of what the effect of multiplicative noise would be on LRD estimators. To investigate, we use data generated from a process that has a so-called correlated additive and multiplicative (CAM) noise term. An example is the normal form for reduced climate models , which is given by the following stochastic differential equation (SDE): 4.2 As shown in Majda et al. , the PDF of equation (4.2) exhibits a power-law decay over a particular range, although its tail ultimately decays exponentially. We generate 1000 realizations by sampling random values for the SDE parameters from a uniform distribution while complying with the parameter relations as described in Majda et al. .
The GPH estimator is slightly biased towards positive values, while DFA and R/S have larger biases. Observe that the WL estimator (as well as both VB estimators (not shown)) estimates self-similar behaviour (figure 4f). While the PDF of equation (4.2) decays in a power-law-like way over a given range, it is not self-similar because its ultimate decay is exponential . This makes the estimates obtained by the WL and VB estimators questionable. Furthermore, R/S, DFA, GPH and the Whittle estimator again show an uncomfortably large number of outliers, again suggesting that the estimators may not be very reliable for this case, or that the signal is not characterized simply by H and d. We note that Kantelhardt et al.  have studied the performance of more general multi-fractal DFA methods, which were found to extract the full range of scaling exponents in a particular multi-fractal test case.
5. Natural time-series examples
Now we return to the two natural time series from §2, and analyse their self-similarity and LRD characteristics. We have shown above that the two time series are non-Gaussian; are they also long-range-dependent or self-similar?
Applying the various estimation methods to the Faraday–Vernadsky temperature time series gives evidence for LRD but with a wide variety of values: Whittle d=0.24, GPH d=0.28, DFA2 d=0.43, R/S d=0.33, WL H=0.53, VBw H=0.92 and VBδ H=0.96. While the three LRD estimators all provide evidence for LRD, they provide a rather large range of values of the LRD parameter, d. However, all of the estimates are between 0 and 0.5, suggesting a long-range-dependent but stationary process. The H estimates are larger than the d estimates; this might suggest self-similarity with α≈1.5.
There is also evidence for LRD in the AE index, again with a wide variety of estimates from the various estimators: Whittle d=0.28, GPH d=0.72, DFA2 d=0.66, R/S d=0.4, WL H=0.94, VBw H=0.97 and VBδ H=0.99. The estimates from GPH and DFA2 are larger than 0.5, suggesting a non-stationary time series, while the Whittle and R/S estimates suggest a stationary long-range-dependent time series. Again, the H estimates are larger than the d estimates, providing some evidence for self-similarity. Other evidence has been presented that AE may not be a simple fractal ; recently work has focused on high-frequency non-stationary and lower-frequency 1/f properties .
While all LRD estimators agree that there is evidence for LRD in these two time series, they provide a rather large range of possible values, illustrating the problem of statistically robust LRD estimation in practice and the need for further investigation of the performance of statistical indicators in the presence of departures from fractality, including possible multi-fractality.
There are two contributions to self-similarity: (i) LRD and (ii) non-Gaussian jumps. This is not always appreciated in the various communities with interests in detecting self-similarity and LRD. We have shown that empirical estimators of LRD are at best biased for fractional Gaussian noise, but at worst not robust for processes that deviate from this idealized model. We have also shown that the empirical estimators are not very robust in the presence of trends and multiplicative noise.
Our results have several important implications for the modelling of natural time series. In our view, the ARFIMA model is a much better paradigmatic model of natural time series than fGn since it allows one to explicitly model short-range and long-range behaviour while also allowing for non-Gaussian increments.
Finally, there is a need for estimation procedures that can deal with multiplicative noise and trends in the variance. Such effects introduce sizeable biases and estimation uncertainty. As such, all estimations of LRD have to be taken with precaution. While it is true that all of the estimators we tested perform reasonably well for fractional Gaussian noise, once a time series is non-Gaussian or is non-stationary (in trend or volatility) the estimators can be problematic.
We thank M. Freeman and two anonymous reviewers for their comments on an earlier version of this manuscript. In particular, we are grateful to one of the reviewers for their emphasis on the Whittle and WL estimators. This study is part of the British Antarctic Survey Polar Science for Planet Earth Programme. It was funded by the Natural Environment Research Council. T.G. acknowledges generous support through an EPSRC studentship. R.B.G. was funded under EPSRC research grant EP/D065704/1.
One contribution of 13 to a Theme Issue ‘Climate predictions: the influence of nonlinearity and randomness’.
- This journal is © 2012 The Royal Society