Decadal climate prediction (project GCEP)

Keith Haines, Leon Hermanson, Chunlei Liu, Debbie Putt, Rowan Sutton, Alan Iwi, Doug Smith


Decadal prediction uses climate models forced by changing greenhouse gases, as in Intergovernmental Panel on Climate Change (IPCC) scenario predictions, but, unlike longer-range predictions, it also requires initialization with observations of the current climate. In particular, the upper-ocean heat content and circulation have a critical influence. Decadal prediction is still in its infancy and there is an urgent need to understand the important processes that determine predictability on these timescales. We have taken the first Hadley Centre Decadal Prediction System (DePreSys) and implemented it on several NERC institute compute clusters in order to study a wider range of initial condition impacts on decadal forecasting, eventually including the state of the land and cryosphere. eScience methods are used to manage submission and output from the many ensemble model runs required to assess predictive skill. Early results suggest that initial condition skill may extend for several years, even over land areas, but this depends sensitively on the definition used to measure skill, and alternatives are presented. The Grid for Coupled Ensemble Prediction (GCEP) system will allow the UK academic community to contribute to international experiments being planned to explore decadal climate predictability.


1. Beyond scenario-based climate prediction

The Intergovernmental Panel on Climate Change (IPCC) monitors and assesses the evidence for climate change and the probable future scenarios of that change. The modelling that is used to assess these scenarios defines, as external drivers of the climate, the concentrations of CO2 and other greenhouse gases, atmospheric aerosols and the ongoing changes in the solar cycle. The climate models themselves consist of representations of the atmosphere, ocean, land surface and cryosphere systems, although not all feedbacks between these systems are normally represented. Multiple runs of these models are made to average over the internally generated variability of the system, giving an estimate of the most probable climate evolution rather than just one possible outcome. Furthermore, multi-model results, or multi-parameter results from the same model, are also used for assessments because, by sampling more uncertainty in climate representation, it should be possible to average out some of the biases of the individual modelling frameworks. Typically, these scenario predictions are run for periods of 100 years into the future, with a range of possible anthropogenically controlled CO2 and other greenhouse gas concentrations specified. On these long timescales, the climate models are assumed to lose all memory of their initial conditions, and thus current observations of the Earth's climate were not used in any direct way in IPCC AR4 for making such scenario predictions.

By contrast, on the shortest timescales of a few days ahead weather forecast models depend critically on the use of current observations of the atmosphere and sea surface temperatures (SSTs) in order to make their forecasts. On these very short timescales, only the atmosphere is assumed to vary and thus weather forecasts assume all other components (ocean, land and cryosphere and also CO2 and aerosol concentrations) to be fixed for the durations of interest.

On intermediate timescales, seasonal forecasts are performed with models that have similarities to the climate models used for IPCC scenario predictions. Critically, seasonal forecast models have an active ocean component into which real ocean observations are assimilated in order to initialize the prediction (Alves et al. 2004; Anderson 2008). The focus of seasonal forecasting (typically 6–12 months ahead) has mainly been on the tropical Pacific region, where El Niño events occur that have a large impact on the atmosphere throughout the Pacific basin and beyond. Seasonal forecast models are therefore tuned to get the tropical Pacific to work well, but in many cases they are less sophisticated than full climate models such as HadCM3 (Stott et al. 2000). One example is that external forcings from aerosols and CO2 are kept fixed over periods out to 12 months ahead.

It is now a major challenge to make climate model forecasts with a range of 1–10 years ahead, intermediate between the seasonal forecasting range and climate scenario predictions. On these longer timescales, it is no longer possible to neglect the variable impact of external forcings, in particular the role of CO2 and other anthropogenic gases, and variations in the solar cycle. However, at the same time, it is essential to include/assimilate direct observations of the current state of the climate system; certainly from the oceans, but perhaps also from the land surface (in particular, snow cover and soil moisture), and the cryosphere (particularly sea ice cover). Such forecasts would carry tremendous value for government, commercial and financial planning.

There are several advances that are emerging as key to enabling better interannual–decadal predictions in future. The first is the rapid advance of observing technologies for the slower components of the climate system, i.e. the oceans and cryosphere, as well as for the land surface state. In the oceans, the Argo profiling float array was completed in 2007, monitoring the top 2 km of ocean heat content and density throughout the globe with over 3000 robotic profilers. Meanwhile, new satellite instruments such as the NASA Aquarius and ESA SMOS missions will soon be launched to monitor soil moisture and ocean salinity conditions using L-band radars. Computer technologies are improving, allowing extensive exploration of climate model uncertainty through parameter sweeps, and the running of climate models at much higher resolutions, e.g. the HiGEM project (Shaffrey et al. in press). Finally, there are now new ways of simultaneously assimilating observations from different components (e.g. atmosphere and oceans) into climate models, in order to initialize them to make interannual–decadal predictions, for example the 4D-Var coupled assimilation approach being developed by Awaji and co-workers in Japan (Mochizuki et al. 2007; Sugiura et al. 2008).

In this paper, §2 reviews the first system developed within the UK Hadley Centre explicitly to make decadal predictions. In §3, we show how this system has been ported using grid computing methods for use by the academic community, who will thus be able to collaborate on further developments of the system, exploring the sensitivities to various data types and assimilation methods. In §§4 and 5, we describe preliminary results obtained with this system in the Grid for Coupled Ensemble Prediction (GCEP) project. Section 4 focuses on predictability based on naturally selected initial conditions and §5 focuses on regional aspects of hindcast predictions made for the 1990s. Section 6 is a summary and future outlook.

2. The Hadley Centre Decadal Prediction System

Predictability experiments using HadCM3 (Gordon et al. 2000), one of the Hadley Centre coupled models used for the IPCC AR4 scenario predictions, were first described by Collins (2002) and Collins & Allen (2002). In these papers the memory of the HadCM3 system for initial conditions generated by the natural variability of the model, in the presence of external forcing changes, was tested. They showed that in a statistically average sense the system has limited local memory over land (at most a few seasons, less outside the tropics) and rather longer memory over the oceans (from a few seasons to several years, but with large geographical variations). However, these experiments did not examine far-field effects associated with particularly large anomalies, especially those in ocean heat content, which may be expected to have the biggest influence on the atmosphere. In addition, these experiments did not attempt to control the initial conditions of the system through data assimilation.

The Hadley Centre Decadal Prediction System (known affectionately as DePreSys) was first described in Smith et al. (2007). It is based on the HadCM3 coupled climate model and all runs are performed with all external forcings, CO2, aerosols and solar variations, applied. The CO2 and solar components of the forcing are largely predictable for a few years ahead, while the aerosol forcing, dominated by volcanic eruptions, cannot be confidently forecast into the future, although the past aerosol conditions are known. Therefore, in all hindcast experiments (in which DePreSys is used to ‘predict’ the past based only on observations available at the start of computations) the effects of volcanic eruptions occurring after the hindcast start date are not included.

The DePreSys is an anomaly assimilation system, in which a fixed seasonal cycle of bias in the HadCM3 climate model is allowed to persist on the assumption that it is known; hence only anomalies from that biased model state are assimilated and predicted (Smith et al. 2007). This contrasts with the approach of most seasonal forecast systems, in which a time-dependent model drift back towards the biased model climate is allowed for in the predictions (Stockdale 1997). The longer timescales of decadal prediction mean that model drift may be more nonlinear and less easily evaluated from past observations.

So far, only atmospheric and oceanic observations are used for assimilation in the DePreSys. ECMWF Re-Analysis (ERA40; Uppala et al. 2005) winds (u, v), potential temperature (θ) and surface atmospheric pressure (p*) are first used to define a seasonal climatology (12 monthly fields) over the period 1958–2001. Anomalies from this climatology are derived and combined with the model's own monthly climatology to define the atmospheric fields to be assimilated as DePreSys initial conditions. A similar procedure is followed for the ocean initial conditions, using three-dimensional ocean temperature and salinity assimilation. There is no widely accepted analysis product for past ocean states equivalent to ERA40, and so several different ocean analysis products based on recent quality controlled ocean observational profiles (Ingleby & Huddleston 2007) have been used in the GCEP project (e.g. Balmaseda et al. 2007; Smith & Murphy 2007). The analysed atmospheric and oceanic conditions are then assimilated directly and concurrently into the coupled climate model using a very simple nudging technique. The resulting ‘climate model analysis’, typically over the period 1979–2001, then contains interesting information on other non-assimilated fields such as sea ice cover, snow cover and soil moisture, which can be compared with independent observations (see the companion paper by Putt et al. (2009)).
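The anomaly-initialization step described above can be sketched in a few lines. This is an illustrative Python sketch, not the DePreSys code: the function names, the Newtonian-relaxation form of the nudging and the relaxation timescale `tau` are our assumptions.

```python
import numpy as np

def build_anomaly_initial_field(obs, obs_clim, model_clim, month):
    """Combine an observed field with two monthly climatologies to form the
    field assimilated under anomaly initialization: the observed anomaly
    (relative to the observational climatology for that calendar month)
    is added to the model's own monthly climatology."""
    return model_clim[month] + (obs - obs_clim[month])

def nudge(model_field, target, dt, tau):
    """One step of simple Newtonian relaxation (nudging) of a model field
    towards the target analysis field, with relaxation timescale tau."""
    return model_field + (dt / tau) * (target - model_field)
```

With this construction, if the observations happen to equal the observational climatology, the assimilated field is exactly the model climatology, so the model is never pushed away from its own biased mean state.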

Throughout this period (1979–2001), many ensemble hindcast experiments are then launched starting from the initial condition fields of the climate model analysis, in order to assess the predictive skill of the DePreSys. A typical hindcast ensemble involves:

  1. selecting the climate analysis on a certain date, e.g. 1 May 1991,

  2. generating an ensemble (typically four members) of slightly different initial conditions for the climate model, e.g. by adding small numerical noise to the SSTs from this climate analysis,

  3. running this ensemble of experiments forward, e.g. from May 1991, with the free coupled model (no further assimilation) for several years (Smith et al. run hindcasts for 10 years), where no future information is used, e.g. the May 1991 ensemble would not include the Pinatubo eruption aerosols, but a similar ensemble beginning in November 1991 would do so,

  4. assessing each hindcast ensemble, both the individual members and the ensemble mean, against the atmosphere, ocean or land surface anomalies which were actually observed over the period of the hindcast, and

  5. assessing the statistics of many similar ensemble hindcasts begun at different times (1979–2001) by combining together the hindcasts with the same lead times.
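Step 2 above, generating perturbed initial conditions by adding small noise to the analysed SSTs, might look as follows. This is a hypothetical numpy sketch: the function name and the use of Gaussian noise are assumptions, although the 0.05°C amplitude matches the hindcasts described in §5.

```python
import numpy as np

def make_ensemble_initial_conditions(sst_analysis, n_members=4,
                                     noise_amplitude=0.05, seed=0):
    """Generate slightly different initial SST fields for each ensemble
    member by adding small random noise (amplitude in degrees C) to the
    analysed SSTs, as in step 2 of the hindcast procedure."""
    rng = np.random.default_rng(seed)
    return [sst_analysis
            + noise_amplitude * rng.standard_normal(sst_analysis.shape)
            for _ in range(n_members)]
```

Each member then evolves freely under the same external forcings, and the spread of the ensemble samples the internal variability consistent with the analysed initial state.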

The results published in Smith et al. (2007) focus on the global mean skill of DePreSys over the period 1982–2001, as well as a prediction of future global mean temperature changes out to 2014. Figure 1 shows two hindcasts (starting in 1985 and 1995) and a forecast from 2005, for global surface air temperatures. All anomalies are rising owing to externally driven global warming, but the DePreSys is closer to observations in the hindcast periods than equivalent uninitialized predictions (NoAssim), with some of the variability in the first few years of each hindcast being successfully reproduced. Furthermore, the DePreSys forecast from 2005 predicted that global warming would be temporarily offset by natural internal variability, and this is supported by the observations to date. An updated forecast, starting from September 2007, predicts that the current La Niña will rapidly decay, followed by continued warming consistent with the 2005 forecast.

Figure 1

Global annual-mean surface temperature anomalies (relative to 1979–2001). The DePreSys predictions (white curves with red confidence limits), starting in 1985, 1995 and 2005, are compared with equivalent predictions without data assimilation (blue curves), and with observations (black curve) from HadCRUT3 (Brohan et al. 2006). An updated DePreSys forecast starting from September 2007 is also shown (solid green curve, ensemble mean only). Updated from Smith et al. (2007).

Many aspects of the predictions in figure 1 require more careful study to understand the origins of the skill. In the following sections, we look at the methods used to implement DePreSys outside the Hadley Centre, and show early results on regional aspects of the early year skill that can be seen in figure 1.

3. Climate prediction on shared compute clusters

The limited size of the HadCM3 climate model (it typically scales well to no more than eight parallel processors), combined with the large number of multi-year ensemble hindcast experiments required by DePreSys, provides a good opportunity to use shared grid computing resources rather than supercomputer resources for this work. This was the eScience goal we set out at the beginning of the GCEP project.

(a) HadCM3 and DePreSys interface for the University of Reading cluster

The Met Office Unified Model (or ‘UM’, v. 4.5), the base code for the HadCM3 model, was first installed on a 64 core cluster at the University of Reading (other larger clusters at British Antarctic Survey (BAS) and Proudman Oceanography Laboratory (POL) were later used; see below). The configuration and job control is performed using a graphical user interface (UMUI), from an analysis server attached to the Reading cluster. This allows a wide variety of user-specified technical and scientific control files to be produced as input to the model. For ensemble runs, we have written a tool that makes many modified copies of these control files differing, for example, in choice of initial conditions.
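The ensemble tool mentioned above can be illustrated with a simple template-substitution sketch. This is hypothetical Python: real UMUI control files are far more complex than a one-placeholder template, and the placeholder syntax and file naming here are invented for illustration.

```python
from pathlib import Path

def make_ensemble_control_files(template_path, out_dir, member_settings):
    """Create one modified copy of a model control file per ensemble member.
    `member_settings` maps a member name to {placeholder: value}
    substitutions, e.g. the path of that member's initial-condition dump.
    Placeholders appear in the template as {name}."""
    template = Path(template_path).read_text()
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for member, settings in member_settings.items():
        text = template
        for key, value in settings.items():
            text = text.replace("{" + key + "}", str(value))
        path = out_dir / f"control_{member}.txt"
        path.write_text(text)
        paths.append(path)
    return paths
```

The point of the design is that the scientist edits one master configuration in the UMUI, and only the fields that differ between ensemble members (initial dumps, run identifiers) are rewritten per copy.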

The DePreSys involves additional modifications to the HadCM3 code, as well as the inclusion of new input files containing the data to be assimilated; these too can be specified via the UMUI. The observational assimilation input files are generated separately by removing the observational climatology and adding the model climatology, after interpolating the observations onto the model grid.

(b) Extending DePreSys for the NERC Cluster Grid

The NERC Cluster Grid (Bretherton et al. 2009) is being developed explicitly to service consortium modelling projects such as GCEP with intermediate compute requirements. It currently consists of a lightweight web based cluster management system covering NERC clusters at Reading, BAS, POL, PML and NOCS, allowing user statistics and standard Sun Grid Engine commands to be managed from a single interface. A lightweight tool G-Rex has also been developed to allow users to manage job submission and data output from within their own computing environment and to prevent build-up of large data output on remote cluster resources.

The complexity of the UM job control means that several compromises have so far been made in accessing NERC Cluster Grid resources. However, UMUI control files can be copied, e.g. to the BAS cluster, and model output files are automatically moved back to the client as runs progress, so that jobs can be controlled and analysed without interactive login to the remote cluster and using very little remote data storage.

4. Predictability studies

This section uses important known patterns of low-frequency ocean temperature/heat content variability to test the timescales over which the HadCM3 climate system can show predictive skill. This is done by comparing two ensembles started from naturally occurring extremes of these important ocean patterns. The two ensembles will eventually converge as they approach the climatological ensemble. The length of time that the ensembles remain distinguishable due to the initial conditions for various aspects of the climate state (not just those used to make the initial extreme selection) can then be studied.
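One way to quantify how long two ensembles remain distinguishable, in the spirit of the t-test of ensemble means used for the predictability estimates quoted in figure 2, is a year-by-year two-sample t-test. A minimal numpy sketch, where the fixed critical value is an assumption (roughly the 5% two-sided level for two four-member ensembles, 6 d.o.f.):

```python
import numpy as np

def two_sample_t(a, b):
    """Pooled-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * np.var(a, ddof=1)
           + (nb - 1) * np.var(b, ddof=1)) / (na + nb - 2)
    return (np.mean(a) - np.mean(b)) / np.sqrt(sp2 * (1 / na + 1 / nb))

def predictability_years(ens_a, ens_b, t_crit=2.45):
    """Count the leading years over which two ensembles remain
    distinguishable: stop at the first year where |t| < t_crit.
    ens_a, ens_b have shape (n_members, n_years)."""
    for year in range(ens_a.shape[1]):
        if abs(two_sample_t(ens_a[:, year], ens_b[:, year])) < t_crit:
            return year
    return ens_a.shape[1]
```

Note that this simple measure stops at the first non-significant year; a measure that scans all years separately would be needed to capture the re-emergence of predictability discussed later for plot (e).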

The extreme initial ocean states were selected from a five member ensemble run with realistic twentieth century forcings, including changing greenhouse gases and volcanic aerosols. A single run started in 1860 was used to initialize the five member ensemble, which was integrated over the period 1950–2000. The pair of extreme states for each experiment was then selected from two different ensemble members at the same point in time. It is important to note that, although the initial conditions were selected from members with large differences in key indices, this is by no means the only difference between the chosen ensemble members. Four experiments were completed, each consisting of two ensembles with different initial states. Details of the experiments are given in table 1.

Table 1

Details of four experiment sets used to test HadCM3 predictability. (The period of each experiment and size of the ensembles used are given. The last four columns are SST differences that define the initial condition extremes. Experiments 1, 2, 3 and 4 were selected based on North Atlantic, Tropical Atlantic Dipole (TAD), Interdecadal Pacific Oscillation (IPO) and Southern Ocean (SO) SST differences, respectively. All regions are defined in the text. Units for these columns are °C.)

Figure 2 shows ensemble plumes from these experiments. Each row is a different experiment, with plumes shown for three different predicted indices. The first column is the annual mean global ocean 500 m heat content anomaly. The second is the annual mean anomaly of an SST index for the Interdecadal Pacific Oscillation (IPO), defined as the average temperature in the box 170° W : 100° W, 15° S : 15° N minus the average temperature in two off-equatorial boxes, 170° E : 140° W, 25° N : 40° N and 150° E : 160° W, 40° S : 25° S. The third column is the annual mean Atlantic Meridional Overturning Circulation (MOC) anomaly in Sv at 30° N. The dark continuous lines show the ensemble means and the shading indicates one s.d. of the ensemble spread. Each plot is marked with the predictability in years, as determined from a t-test of the ensemble means.
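The IPO index defined above can be computed from a gridded SST field as a difference of box averages. A hedged sketch (longitudes converted to degrees east, so 170° W : 100° W becomes 190–260° E; area weighting by latitude is omitted for brevity, and the function names are illustrative):

```python
import numpy as np

def box_mean(sst, lat, lon, lat_range, lon_range):
    """Unweighted mean of SST over a lat/lon box; lon in degrees east."""
    lat_mask = (lat >= lat_range[0]) & (lat <= lat_range[1])
    lon_mask = (lon >= lon_range[0]) & (lon <= lon_range[1])
    return sst[np.ix_(lat_mask, lon_mask)].mean()

def ipo_index(sst, lat, lon):
    """IPO SST index as defined in the text: central equatorial Pacific
    box minus the mean of the two off-equatorial boxes."""
    equatorial = box_mean(sst, lat, lon, (-15, 15), (190, 260))
    north = box_mean(sst, lat, lon, (25, 40), (170, 220))
    south = box_mean(sst, lat, lon, (-40, -25), (150, 200))
    return equatorial - 0.5 * (north + south)
```

A positive index thus corresponds to an anomalously warm central equatorial Pacific relative to the off-equatorial boxes, the warm phase of the IPO.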

Figure 2

Predictability plumes from four experiments, one per row. The predictability in years is indicated in each panel. See the text for information on the variables being predicted.

In general, as the figure shows, ocean heat content is more predictable than SST. This is expected, as SSTs are more strongly influenced by the noisy atmosphere than is heat content. The ocean heat content has a very strong trend in all experiments, arising from the increasing radiative forcing of greenhouse gases. Despite this strong external forcing, however, differences in the initial conditions remain important, often for more than 5 years.

The Pacific is important to climate, with many large teleconnections across the globe. On decadal timescales, the IPO is the main mode of variability in the Pacific. The plots in the second column show that the IPO has predictability beyond 1 year, and sometimes for much longer. These variations show how strongly predictability depends on the initial state, which means that an average predictability, e.g. as estimated by Collins (2002), can be potentially misleading.

The predictability of the MOC varies much less between experiments (plumes in the third column). Interestingly, the two experiments with the longest predictability (1 and 3) also have the largest differences in the North Atlantic initial SSTs (see column 4 of table 1). This may be due to the Atlantic Multidecadal Oscillation (Kerr 2000), which links North Atlantic SSTs and oscillations in the MOC. However, the small sample size here means that this correlation between initial North Atlantic SST differences and the predictability of the MOC could have happened by chance, and further work is needed to test this.

It is interesting to contrast plots (d) and (e). The behaviour in (d) is quite simple: plumes start out narrow, slowly widen and eventually merge. In (e), the plumes are not distinguishable until the last 3 years of the run. Although it is very unlikely that three consecutive years would show significant differences between the ensembles by chance, this cannot be ruled out until a mechanism for this predictability is found. Through interactions in the climate system, predictability may be transferred to the Pacific sea-surface temperatures, even though it was not there initially. This emergence of predictability for a univariate measure, and the sensitivity of this predictability to the initial state, make it clear that much more work is needed to uncover and understand climate predictability.

5. Preliminary climate predictions

This section extends the results from the DePreSys prediction system introduced in §2, to begin to look at the regional nature of skill. To test the ability to predict climate anomalies out to 2 years ahead, a set of hindcast ensembles was started every 1 May and 1 November between 1990 and 1999. There are 20 start dates, each with an ensemble of four climate model runs of 2 years each (160 years of modelling). The initial conditions are taken from the DePreSys assimilation system, with both atmosphere and ocean anomalies corresponding to observations throughout the period. Noise (amplitude 0.05°C) was added to the SSTs at each start date to generate the different initial conditions for each of the ensemble members. The hindcasts use initial conditions but no future information, such as new sources of volcanic eruption aerosols, and therefore retain the realistic uncertainties that would arise in forecasts.

The anomalies from all these hindcasts can be compared with observed anomalies at various lead times. Figure 3 shows the mean errors in surface air temperature from the second year of all these hindcasts. The same figure also shows a similar set of mean errors from an equivalent No Assimilation set of experiments over the same period, run without assimilation of any observations (see Smith et al. (2007) for more details). The importance of the assimilated initial conditions can be seen from the larger mean surface temperature errors in the No Assimilation experiments, which represent the best predictions available without the use of initial observations. The reduced mean surface air temperature errors between 1 and 2 years after the start of the experiments, over large areas including land areas, appear to demonstrate skill of the prediction system. It is this reduction in mean temperature errors that dominates the r.m.s.e. shown in the DePreSys paper of Smith et al. (2007), and which they show can persist for considerably longer than the 2 years presented here. The origin of this skill is still being investigated, as it does depend on the climatological period chosen to define the anomalies.
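The distinction drawn here between the mean error and the spread of errors can be made concrete: with population statistics, the squared r.m.s.e. decomposes exactly into the squared bias plus the variance of the errors, which is why the mean-error maps (figure 3) and the error s.d. maps (figure 4) together account for the total r.m.s.e. An illustrative sketch (function name is ours):

```python
import numpy as np

def error_decomposition(hindcast, observed):
    """Split hindcast errors (for a grid point or index over many start
    dates) into the mean error (bias) and the standard deviation of the
    errors about that mean. With population statistics (ddof=0),
    rmse**2 == bias**2 + sd**2 holds exactly."""
    errors = np.asarray(hindcast, dtype=float) - np.asarray(observed, dtype=float)
    bias = errors.mean()
    sd = errors.std()                      # population s.d.
    rmse = np.sqrt((errors ** 2).mean())
    return bias, sd, rmse
```

Removing the bias term, as done for figure 4, therefore acts as a period-specific bias correction and isolates the part of the skill that does not depend on the choice of climatological baseline.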

Figure 3

Mean surface air temperature errors from the second year of 20 hindcast ensembles run over the period 1990–2001. (a) The errors from ensembles initialized from free transient runs of HadCM3, i.e. with No Assimilation. (b) The errors from the DePreSys hindcast ensembles initialized with observed atmosphere and ocean anomalies. Units are °C.

Another, perhaps more robust, aspect of the prediction skill can be illustrated by removing the mean errors from the hindcasts and looking at the s.d. of the surface temperature errors. This provides a bias correction of the hindcasts, tuned to the particular period of interest. Figure 4 shows the difference in the s.d. of surface temperature errors between the assimilation and No Assimilation experiments in years 1 and 2 of the hindcasts. Positive regions show where the initialized assimilation system has more predictive skill. It is interesting to note that the s.d. of the surface temperature errors forms a rather smaller component of the total r.m.s.e. than the mean temperature errors. For these s.d. temperature errors in the first year of the hindcasts (figure 4a), the El Niño region in the central and eastern Pacific stands out as the region with the largest skill increase, as would perhaps be expected. Other results (not shown) suggest that the DePreSys shows very good skill in hindcasting the large El Niño event that occurred in 1997–1998, in both amplitude and timing (Haines 2007), hence explaining the large skill increase seen in figure 4a. For the second year of the hindcasts (figure 4b), however, it is the northwestern Atlantic Ocean that shows the largest statistically robust increase in skill. Further results (not shown) indicate that the initialized climate model shows useful skill in predicting interesting transitions in the strength and depth of the subpolar gyre of the North Atlantic that occurred in this region in the mid-1990s.

Figure 4

Standard deviations of surface air temperature errors. Each plot shows the difference in errors between the hindcast ensembles with and without data assimilation, so that positive values show where the DePreSys ensemble errors are lower than those with No Assimilation. (a) Results from the first year of all hindcasts (1990–2001). (b) Results from the second year of these hindcasts. Units are °C.

These regional results demonstrating different aspects of the predictive skill of the DePreSys climate prediction system will provide important clues to help understand and further improve our ability to predict global and regional climate out to a few years ahead, in the coming decade. The ability to carry out these experiments within a shared compute cluster environment will widen the access of the scientific community to these climate prediction tools, and lead to more rapid advances in the science.

6. Discussion and future vision

The potential for using climate models for interannual to decadal forecasting, based on initialization with current climate observations, is quickly becoming a major research focus internationally. The coupled model intercomparison project phase 4 (CMIP4) is defining a framework of experiments through which many operational forecasting centres will collaborate to explore initial condition skill in climate models. The sharing of computer resources and the implementation of DePreSys on compute clusters being developed within the GCEP project will enable wider participation of the academic community in this important research agenda. Planned future GCEP studies include:

  1. improved use of ocean synthesis products for initializing coupled predictions,

  2. focus on the MOC as a predictor and predictand of short-term climate change,

  3. investigation of short- and long-range interactions in determining forecast skill, and

  4. studies of the land surface and sea ice distributions and their influence on hindcast skill.

The nature and complexity of climate prediction require diverse experimental approaches. We envision the need for short runs of high-resolution climate models on HPC resources, in combination with many more runs of intermediate-resolution models on a diverse grid of compute resources, to explore complexity and provide a flexible platform for tackling climate prediction.


This project was supported by the NERC eScience programme grant NE/C/515820/1. D.S. was supported by the Joint DECC and MoD Integrated Climate Programme—GA01101, CBC/2B/0417_Annex C5. K.H. would like to thank BMT for their continued sponsorship, support and interest in informatics and eScience.


  • One contribution of 24 to a Discussion Meeting Issue ‘The environmental eScience revolution’.

  • This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

