The ocean interacts with the atmosphere, biosphere and cryosphere in a complex way, modulating climate through the storage and transport of heat, nutrients and carbon. As such, it is important that we understand the ways in which the ocean behaves and the factors that can lead to change. In order to gain this understanding, we need to look back into the past, on time scales from recent decadal-scale change, through the abrupt changes of the Pleistocene and back to times when the Earth's climate was significantly different from that of the Holocene. A key challenge facing the field of palaeoceanography is to combine data and modelling in a common framework. Coupling palaeo-data and models should improve our knowledge of how the Earth works and, perhaps of more direct societal relevance, might enable us to provide better predictive capabilities in climate modelling. In this discussion paper, we examine the motivations, past successes and challenges facing palaeoceanographic studies. We then suggest a number of areas and approaches that we believe will allow palaeoceanography to continue to provide new insights into processes that affect future climate change.
On time scales of decades to millions of years, the ocean plays a key role in modulating and controlling global climate, including heat redistribution [1,2], the carbon cycle  and the hydrologic cycle . If we are to make predictions about the response of the Earth to perturbations, either natural or anthropogenic, then we need a mechanistic understanding of the ocean's behaviour. We have been making instrumental measurements for less than 0.1 millionth of Earth's history; so in order to gain a better perspective on the rates and amplitudes of change, we need to look further back in the past using geologic records. For example, evidence from terrestrial and marine realms points to warming surface temperatures and higher sea level in recent years [5,6]. Without palaeoclimate data from the relatively stable Holocene and beyond, these trends would have no baseline for comparison. The field of palaeoceanography seeks to document and probe the way the ocean and climate systems have behaved beyond the instrumental record. The subject is broad and has implications across a range of disciplines, but for the purpose of this paper, we draw almost exclusively on examples and questions posed by the study of Quaternary climate change.
We have known for more than a century that the Earth experienced colder temperatures in the past, for example, from glacial landforms found across large parts of the now temperate UK . It was not until what we might consider to be the advent of modern palaeoceanography, however, that analysis of the oxygen isotopic composition of foraminifera in marine sediments was used to look at Pleistocene temperatures from a marine perspective . Decades later, Hays et al. showed that long-term cycles in δ18O were linked to changes in solar insolation , and the Climate: Long Range Investigation, Mapping And Prediction (CLIMAP) project compiled a global picture of sea-surface temperatures during the last glacial maximum . This latter study showed that polar regions exhibited larger changes in temperature than the equatorial regions—a finding that appears to be reflected in changes in climate in modern decades . The details of both of these early studies continue to be debated, but the two approaches of studying temporally continuous records at one location, or coeval, spatially distributed datasets remain at the heart of many palaeoceanographic studies.
Even in the early decades of palaeoceanography, there was interplay between co-evolving theoretical (‘modelling’) developments and observational work. The work of Stommel & Arons  sets the basis for our conceptual understanding of deep-ocean circulation . Subsequent water-mass studies in conjunction with simple box models have confirmed this broad picture , though the dynamical forcing and character of deep-ocean circulation remains contentious [13,14]. Deep-ocean circulation has a response time of the order of millennia, an order of magnitude longer than instrumental records; so palaeoceanographic records are uniquely positioned to understand this system and its implications for global climate. For example, palaeo-records of deep Atlantic circulation show that its circulation was significantly different during the last glacial maximum than it is today . Such changes have been linked to the global heat and carbon balance through processes including biological productivity  and ice-shelf dynamics .
Our conceptual understanding of the background state of deep-ocean circulation and of its shorter time-scale variability has continued to develop alongside new observational work [18,19]. Over the last few decades, significant focus has turned to millennial-scale ‘abrupt’ climate events during the last glacial and deglaciation (figure 1), with changes in the transport of the ocean during these events closely linked to changes in the meridional heat distribution along the axis of the Atlantic Ocean. The apparent difference in phase between north and south high-latitude temperature changes has led to this phenomenon being named the seesaw effect (figure 1) [23,24]. The forcing of these changes is still not fully understood, but candidates include the effect of the North Atlantic freshwater budget, sea-ice formation and changes in high-latitude atmospheric circulation, perhaps linked to seasonality. There are indications from the palaeo-record that the deep Atlantic Ocean circulation rates may have been very much reduced during massive ice-rafting events in the northern hemisphere (figure 1). As a possible analogy, predictions that Atlantic overturning circulation may slow in the future under a global warming scenario appear to be consistent with recent observations of a reduction in Atlantic overturning over recent decades. However, the short span of these records means that they cannot be used to determine whether the changes are anomalous compared with natural variability, a criticism that appears to be valid given more recent observations. Likewise, recent observations of the intrusion of upwelled warm water masses below ice shelves in West Antarctica indicate the need to understand the long-term heat cycle in the deep ocean to improve our understanding of shorter-term processes. Using models to test the sensitivity of climate to ocean circulation changes may inform the debate on what may be happening in the climate system today. It is for questions like these that palaeoceanographic data are of most value.
Palaeoceanography has the potential to underpin our understanding of how the Earth system works. Box models and mechanistic process models coupled with data have helped us to give a first-order understanding of the links between the carbon cycle and ocean processes, for example, leading to a better understanding of the links between the ocean and greenhouse-gas concentrations in the atmosphere on longer time scales [3,32]. It is through these basic steps in understanding connections within the climate system that palaeoceanography may improve our ability to predict future climate. Despite developments in sampling, technology and growing interest in climate studies, there are still significant and outstanding gaps in our understanding of how the ocean and climate interact. As such, timely investment in palaeoceanographic research is required to overcome outstanding hurdles and to move palaeoceanographic studies onto a quantitative footing.
In this article, we argue that there are a few relatively straightforward steps that would maximize the return on the efforts made to reconstruct the history of the ocean, helping us to put palaeoceanographic research at the forefront of modern climate studies. In brief, palaeoceanographic data should be based on well-quantified, well-constrained proxies with appropriate uncertainties and should be presented with adequate, quantified age constraints, such that they provide robust datasets for assessing the competency of climate models. In the best case, palaeoceanographic reconstructions would provide data for the past that are directly comparable to modern measurements, preferably even with records that overlap the instrumental period of direct observation, making the transition to modelling palaeoceanographic data straightforward. These records would need to be on a common time frame to allow direct comparisons between datasets, and to allow meaningful syntheses. To address these difficulties, we focus on the steps required to bring palaeoceanographic data up to these standards. We examine evidence for changes in the main parameters linked to ocean circulation and carbon cycling during the Pleistocene, consider the reliability of current dating approaches and consider the most viable approaches to modelling available and future data. The main principles apply equally to the smaller (but no less abrupt) changes in climate observed during the Holocene, and to the massive climatic reorganizations that happened much further back in Earth history, such as the Palaeocene–Eocene thermal maximum (PETM) .
Section 1 outlined the potential of palaeoceanography to inform our understanding of how the Earth system works, and how this knowledge may influence our understanding of modern and future climate processes. We also outlined the practical steps that we believe are essential for allowing palaeoceanographic studies to realize their full potential in this regard. In the discussion, we will describe these steps in greater detail using examples and case studies from the literature, and close each section with ‘suggested ways forward’ that set out how we might improve our approaches. While there are many open questions in palaeoceanography, we have focused on the causes and structure of abrupt climate variability and the processes that transmit these signals globally.
Currently, the palaeoceanographic community has two primary competing hypotheses to account for the cause and transmission of abrupt climate variability involving either changes to the meridional overturning circulation of the ocean or the global reorganization of the atmospheric system [34,35], although of course these two are not mutually exclusive. A major scientific challenge to the palaeoceanographic community over the next few years is to design tests for each competing hypothesis, working towards the goal of producing a sound mechanistic understanding of the processes involved in abrupt climate change. This understanding is arguably an essential step towards more reliable projections of future global climate change. Likewise, the question of ‘climate sensitivity’ or the response of global temperature to carbon dioxide levels is still a contentious issue. Palaeoceanography has a central role to play in this vital debate, both through documentation of past climate scenarios, and through understanding the processes that have driven these changes. It is our opinion that these tests require progress in all of the steps below, namely well-quantified proxies, accurate precise dates, well-placed samples and carefully designed modelling studies.
(a) Proxy development and quantification
We cannot directly measure ocean physics, chemistry or biology in the past; so we must turn to ‘proxies’ that reflect the palaeo-environment. Analysis of proxies in marine sediments has led to discoveries on the nature of ocean behaviour in the past, but these records are imperfect. Sediment accumulates heterogeneously, it can be mixed by biological or physical processes, and it can be altered through diagenetic processes. Although there is an ever-growing collection of proxies, they tend to record several aspects of the ocean environment, and are often controlled in large part by biological processes, complicating their interpretation. There is a wide array of proxies for the past ocean, from traditional stable isotopes of oxygen and carbon [33,37] through to exotic tracers such as zinc or silicon isotopes [38,39], ancient DNA arrays, organic proxies and analysis of grain sizes. This diversity has led to some major steps forward, but can lead to confusion when non-specialists use the data for interpretation or modelling studies. Understanding the details of the controls on each proxy (e.g. thermodynamics, kinetics, biological processes, diagenesis) rather than relying on empirical calibrations is a necessary step towards predicting and quantifying the complexities and uncertainties associated with each approach. These ‘complications’ have often been seen as a weakness in proxy data, but the added richness in these data could also turn out to be a strength if a multi-proxy approach is combined with state-of-the-art data assimilation into multi-proxy models.
In this section, we use three examples to demonstrate the potential advantages and pitfalls of proxy approaches, and highlight ongoing and future approaches to improving the understanding and quantification of these proxies. It should be noted that there are a wide variety of other tracers and archives that are of value, but not all can be discussed within the confines of this paper. Finally, we discuss overarching approaches to improving proxy studies.
(i) Salinity and temperature
Temperature and salinity are central to global heat budgets and the freshwater cycle. They also control the density (buoyancy) of seawater, and thus are directly related to the overturning circulation of the ocean. However, they are both extraordinarily difficult to reconstruct quantitatively, particularly in the deep ocean where the amplitudes of change are smaller than in the surface ocean. Instrumental records indicate that upper ocean temperatures have been warming over the last few decades [6,43]. Further back in time, a variety of temperature proxies, including faunal assemblages, trace metals in foraminifera and organic tracers, indicate that surface waters were cooler during the last glacial maximum. The amplitudes (and timing) of these reconstructed changes, however, do not always agree. Instrumental records show ongoing changes in North Atlantic salinity today, indicative of enhanced freshwater export from the Arctic, but past salinity changes are even harder to reconstruct.
Here, we examine an approach based on δ18O and Mg/Ca in foraminifera that has been used to look at sea-surface salinity changes in the past. The δ18O of foraminiferal calcite is influenced by temperature, by biological manipulation and by the δ18O of seawater. The δ18O of seawater has local (salinity) and global (mean δ18O of seawater) controls. In order to derive salinity from δ18O, the influences of the other parameters must be removed. First, Mg/Ca is used to estimate temperature using empirically derived calibrations. Second, an adjustment for the estimated ice-volume-controlled δ18O mean global shift is made. Hence, the final uncertainty on the salinity reconstructions includes analytical uncertainty on the Mg/Ca and δ18O of the foraminifera, the veracity of the global δ18O estimate, and the robustness and uncertainty of the Mg/Ca to temperature calibration. Schmidt et al. assessed a final uncertainty of 0.2‰ (1 sigma) on the derived salinity parameter. The maximum difference in the salinity-excess parameter from the last glacial maximum to Holocene is approximately 0.6‰, apparently resolvable at the 2 sigma level. However, an independent assessment of this same uncertainty gives 0.6‰ as a realistic limit. In addition to this ‘known’ uncertainty, it has recently been shown that Mg/Ca is also controlled by salinity. Indeed, when Arbuszewski et al. corrected for this salinity effect to recalculate temperature in the same core as the Schmidt et al. record, they found temperatures up to 2°C colder during the last glacial maximum, with a significant effect on the residual δ18O salinity reconstruction.
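The two-step correction and its error budget can be sketched numerically. The following is a minimal illustration only: the calibration constants (an exponential Mg/Ca to temperature relation and a linear δ18O palaeotemperature equation) are assumed, illustrative values of the general form used in published calibrations and are not taken from this paper, and all function names are hypothetical. The last function shows one possible reading of the ‘2 sigma’ statement, assuming that the glacial and Holocene values each carry the quoted 1 sigma uncertainty and that the two are independent.

```python
import math

# Illustrative calibration constants (assumptions, not values from this paper).
MGCA_PRE_EXP = 0.38       # mmol/mol, B in Mg/Ca = B * exp(A * T)
MGCA_EXP_CONST = 0.09     # per deg C, A in Mg/Ca = B * exp(A * T)
PALEO_T_INTERCEPT = 16.5  # deg C, a in T = a - b * (d18Oc - (d18Osw - offset))
PALEO_T_SLOPE = 4.8       # deg C per per mil, b in the same equation
VSMOW_TO_VPDB = 0.27      # per mil, scale offset between seawater and calcite

def mgca_temperature(mgca):
    """Step 1: invert the exponential Mg/Ca calibration for temperature (deg C)."""
    return math.log(mgca / MGCA_PRE_EXP) / MGCA_EXP_CONST

def seawater_d18o(d18o_calcite, temperature_c):
    """Invert the linear palaeotemperature equation for seawater d18O (per mil)."""
    return (d18o_calcite
            + (temperature_c - PALEO_T_INTERCEPT) / PALEO_T_SLOPE
            + VSMOW_TO_VPDB)

def local_d18o_residual(d18o_calcite, mgca, global_ice_volume_d18o):
    """Step 2: remove the global ice-volume shift, leaving the local (salinity) part."""
    temperature_c = mgca_temperature(mgca)
    return seawater_d18o(d18o_calcite, temperature_c) - global_ice_volume_d18o

def quadrature(*sigmas):
    """Combine independent 1 sigma uncertainties in quadrature."""
    return math.sqrt(sum(s * s for s in sigmas))

def sigma_level_of_difference(difference, sigma_each):
    """Size of a difference between two independent values, in units of its own sigma."""
    return abs(difference) / quadrature(sigma_each, sigma_each)

if __name__ == "__main__":
    print(sigma_level_of_difference(0.6, 0.2))  # uncertainty of 0.2 per mil on each value
    print(sigma_level_of_difference(0.6, 0.6))  # uncertainty of 0.6 per mil on each value
```

Under these assumptions, a 0.6 per mil glacial to Holocene difference sits at roughly 2.1 sigma when each value carries a 0.2 per mil uncertainty, but at only about 0.7 sigma when each carries 0.6 per mil, which is why the independent uncertainty assessment matters so much for the interpretation.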
Temperature, salinity and density are vital parameters for linking the ocean and climate systems; so we need to continue to invest in efforts that improve and quantify existing and new proxies. Therefore, an obvious step is to improve empirical temperature calibrations through a combination of laboratory culturing in controlled conditions and through dedicated field studies, including sediment-trap sampling and careful collection of undisturbed modern core-top material (e.g. by multi-coring). We might also consider new proxies, and there are likely to be approaches that will come to the fore with continued understanding of marine chemistry. Alternative marine-based temperature proxies that show promise include the clumped isotopes of oxygen and carbon in carbonates , and organic proxies such as TEX86 . Comparing the similarities and differences of multiple proxies with overlapping controls may allow us to draw out the true temperature signal. Additionally, using a suite of temperature proxies from the same site has the potential to reveal important details such as seasonality or the depth structure in the thermocline . A third route is to use the proxies in their full complexity. It might be argued that a proxy that reflects both temperature and salinity is in fact just as valuable as the two separately because of its direct connection to density. Vertical sections of density are used to calculate fluxes in modern oceanography  and extending this approach to the past (as has been done to examine mean flow in the Florida Straits ) has obvious appeal. In a further step, Lund et al.  used a tracer budget combined with δ13C and δ18O depth profiles in the western Atlantic to infer a doubling in the ratio of advection to diffusion of glacial southern sourced waters compared with today . 
Although there are limitations to this approach, such as the inability to distinguish between an increase in advection or a decrease in mixing, this study demonstrates how new approaches can extract additional value from existing data.
(ii) Water-mass flux
Deep-ocean transport is a fundamental part of the ocean/atmosphere system, responsible for transport of water, heat and nutrients around the Earth. Although there are several tracers of deep-water mass geometry in the past such as Cd/Ca and δ13C measured in foraminifera shells [15,54], we know less about the fluxes of water masses and their properties. Several proxies have been used to look at the rates of ocean processes in the past, including sediment grain-size distribution , 231Pa/230Th activity ratios in marine sediments , density reconstructions of geostrophic flow (see above; ) and radiocarbon . Here, we examine the issues surrounding one of these proxies, sedimentary 231Pa/230Th . The principle behind the proxy is that both isotopes are produced at predictable rates in seawater from the decay of uranium, but that while Th is almost completely removed from the water column near the site of its production by scavenging onto falling marine particles, Pa is less particle reactive, and can be exported away from its formation site and buried elsewhere. Variations in the rate of water advection therefore have the potential to change the burial ratio of 231Pa/230Th. A landmark study in 2004 showed abrupt changes in this burial ratio in the deep North Atlantic, and has been widely cited as evidence for a substantial reduction in Atlantic overturning during the last Heinrich event .
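The principle described above can be illustrated with a deliberately simple single-box sketch. It assumes that each isotope is either scavenged to the local sediment or advected out of the basin, with first-order rates set by a scavenging residence time and a water residence time. The production activity ratio of 0.093 is a commonly quoted value that is not stated in the text, the residence times are assumed round numbers, and the function names are hypothetical; the sketch ignores the particle-flux and particle-composition effects that complicate real records.

```python
import math

# Commonly quoted 231Pa/230Th activity ratio at production (assumption: not given in the text).
PRODUCTION_RATIO = 0.093

def buried_fraction(scavenging_residence_yr, water_residence_yr):
    """Fraction of an isotope scavenged locally rather than exported by advection."""
    k_scavenging = 1.0 / scavenging_residence_yr
    k_advection = 1.0 / water_residence_yr  # 0.0 when water_residence_yr is math.inf
    return k_scavenging / (k_scavenging + k_advection)

def sediment_pa_th(water_residence_yr, tau_pa_yr=150.0, tau_th_yr=30.0):
    """Toy sedimentary 231Pa/230Th for a basin flushed on the given time scale.

    tau_pa_yr and tau_th_yr are assumed scavenging residence times; Pa is the
    less particle-reactive isotope, so it has the longer residence time.
    """
    pa_buried = buried_fraction(tau_pa_yr, water_residence_yr)
    th_buried = buried_fraction(tau_th_yr, water_residence_yr)
    return PRODUCTION_RATIO * pa_buried / th_buried

if __name__ == "__main__":
    for label, residence in [("vigorous", 200.0), ("sluggish", 1000.0), ("none", math.inf)]:
        print(label, round(sediment_pa_th(residence), 4))
```

In this toy model, a rapidly flushed basin exports more Pa than Th and buries a ratio below the production value, whereas the buried ratio approaches the production ratio as advection shuts down, which is the sense of change used to infer reduced overturning.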
Of course, there may not be a simple relationship between 231Pa/230Th burial and overturning rate. For example, sediment-trap and water-column studies indicate that the scavenging rate of Pa is strongly influenced by the particle flux and by the composition of the falling marine particles [57,58]. In particular, Pa has a high affinity for opal, and some down-core records appear to exhibit close relationships between opal accumulation and sedimentary 231Pa/230Th ratio [34,59]. Similarly, high particle fluxes near margins can lead to large gradients in 231Pa/230Th that are not related to advection .
A variety of modelling approaches have been used to try to understand the complexities of these isotopes in the marine setting. Simple one-dimensional isotope scavenging models have played a role in the interpretation of Pa and Th in the ocean [61,62]. Marchal et al. used a simple model of Pa and Th scavenging to model meridional sections of 231Pa/230Th in the ocean basins using the Bern 2.5D ocean model and considered both glacial and interglacial circulation states. Siddall et al. considered the impact of different particle types on Pa/Th distributions in the ocean using the Bern three-dimensional model. Using this same model, Siddall et al. modelled the time-dependent response of 231Pa/230Th to changes in ocean circulation during a numerical experiment where freshwater was imposed on the North Atlantic to cause a collapse in overturning circulation. Other studies have also examined the impact of particle size on the 231Pa/230Th distribution in the ocean. These modelling studies have enhanced data interpretation by helping us to differentiate the multiple effects on 231Pa/230Th distribution in the ocean. The modelling of Luo et al. suggests that understanding 231Pa/230Th changes in the past requires a very dense sampling strategy to unravel the complexities of 231Pa/230Th behaviour. Indeed, recent inverse modelling suggests that the existing 231Pa/230Th sedimentary data reconstructions are not adequate to rigorously characterize changes in ocean circulation. Attempts to model other ‘novel’ geochemical proxies have also had a direct impact on the interpretation of palaeoceanographic records. For example, modelling of neodymium isotopes reveals that ocean circulation, particle scavenging and input from the continents all play significant roles in their distribution [70–72].
If net water-mass transport is indeed a primary control on sedimentary 231Pa/230Th, then it is worth dedicated effort to extract this signal. There are two main ways of addressing the outstanding issues. The first is to continue to collect 231Pa/230Th down-core records covering a range of depths and locations with associated mineralogical and particle flux data . Such an approach would give the mean 231Pa/230Th sedimentary ratio on a basin-wide scale, better suited to assessing the large-scale effect of particles on 231Pa/230Th and maximizing the potential for modelling studies to distinguish unique circulation patterns. Second, a dedicated focus on understanding the modern geochemical cycling of Pa and Th in the marine setting, from their sources and sinks to their interactions with particles, will enable a more sophisticated interpretation of sedimentary 231Pa/230Th in the past. Systematic studies are now in progress within the framework of the international GEOTRACES programme . Basin-wide sections of trace elements and isotopes (see http://www.geotraces.org/ for details of locations of ocean sections) in dissolved and particulate form are being collected and analysed to generate a global coverage similar in scale to the Geochemical Ocean Sections Study (GEOSECS) or the World Ocean Circulation Experiment (WOCE). Even at this early stage, the results are revealing new complexities , highlighting the value of basin-wide studies. Unfortunately, only a few of these field campaigns are collecting core-top sediment samples beneath the water-column profiles, despite the stated overarching aim of the programme to improve proxy calibrations. Wasting this opportunity to collect paired samples severely limits our ability to improve our palaeoceanographic proxies, and we would argue that all such future studies should include collection of core-top sediment samples to examine seawater–sediment interactions. 
Likewise, continuous ocean observing arrays such as OceanSITES (http://www.oceansites.org/) and the Ocean Observatories Initiative (http://www.oceanobservatories.org/) have the potential to provide insights into modern processes that might help develop palaeoceanographic proxies. To date, few of the parameters that are measured in the palaeo-record are included in the measured variables in these programmes. This disconnect indicates an area in which better integration between the modern and palaeoceanographic communities could lead to improved use of resources.
It is only with a deep understanding of the geochemical cycling of these isotopes in the water column and the sediments that we are likely to make significant improvements in our interpretation of palaeoceanographic records. Even with such efforts, it may be that we can never truly interpret a geochemical tracer such as 231Pa/230Th simply in terms of water-mass advection. However, if we are able to understand the complex controls on the export of Pa, then model approaches to comprehensive datasets are likely to be informative as to past advection rates .
(iii) Carbon cycling
Given the 50-fold difference in the carbon content between the ocean and the atmosphere, it is apparent that small shifts in ocean carbon cycling, either through physical processes such as stratification or via biological processes such as changing productivity, could have a big influence on atmospheric pCO2. Since the beginning of the industrial revolution, about a quarter of the CO2 released in the atmosphere by anthropogenic activities has been absorbed by the world's oceans . Without this ocean uptake, atmospheric CO2 levels would be much higher. Beyond the impact of CO2 on the global climate, the effects of elevated CO2 in the ocean are being studied in order to determine the effect of decreasing pH on the marine environment (ocean acidification) .
Ocean general circulation models attest to the instability of the Atlantic overturning strength and its capacity for rapid changes linked to atmospheric CO2. During glacial periods, the Atlantic overturning circulation was reduced in strength, and atmospheric CO2 concentrations were significantly lower compared with pre-industrial levels. Although no single mechanism accounts for the full amplitude of past CO2 variability, there is general agreement that lower CO2 levels during glacial periods require increased isolation of deep-water masses from the atmosphere compared with interglacials. Some studies have inferred a key role for the Southern Ocean because deep-water masses outcrop in the Southern Ocean, allowing gases to exchange with the atmosphere [16,32,34]. Mechanisms include deep-ocean ventilation via upwelling as a result of changes in the southern westerlies or the Atlantic overturning strength. Indeed, modelling studies have linked North Atlantic driven changes to the ocean overturning with changes in outgassing from the Southern Ocean on millennial time scales. The links between ocean circulation and CO2, however, are not straightforward [32,81]. For example, the deep ocean regulates the supply of nutrients to the surface (influencing productivity), it plays a large role in modulating surface temperature (influencing gas solubility and productivity), and modelling studies have indicated that the prevalence of preformed nutrients plays a key role in the efficiency of the ocean's biological pump. Clearly, the inclusion of detailed biogeochemical cycles is required for a realistic evaluation of the role of ocean circulation on atmospheric CO2 levels. Most recently, there have been attempts to model the carbon cycle with earth-system models of varying complexity. Such models have been able to capture elements of the glacial to interglacial variability in CO2 and also longer-term trends in CO2, albeit in heavily simplified and over-parametrized form. The complexity of these systems leaves modelling efforts with many challenges if we are to understand and predict the climate's sensitivity to carbon perturbations.
Determination of the carbon content of water masses in the past is, therefore, a tantalizing goal; it would perhaps allow us to put better estimates on climate sensitivity to rising CO2 and to assess the potential impacts of ocean acidification. However, reconstructing carbon content in the past is difficult to achieve, in part because of the small signals involved. A promising approach to reconstructing the history of specific aspects of oceanic carbon chemistry is based on the chemistry of boron and its isotopes in the skeletons of carbonate-forming organisms . The B content, typically measured as B/Ca, has been linked to the carbonate ion concentration in seawater , while the isotopic composition appears to be controlled by seawater pH . From the isotopic perspective, the best calibrations exist for planktonic  and benthic  foraminifera and shallow-water corals . There are difficulties with this approach, including the challenges in making accurate and precise analyses . Additionally, the biological manipulation of seawater by marine organisms as they form their skeletons (biomineralization) causes different species to incorporate boron at different isotopic compositions and concentrations [89,90], so-called vital effects. Such vital effects may arise from differences in the species of boron incorporated into the carbonate skeleton and from the organism's ability to manipulate seawater pH as it calcifies. Notwithstanding these types of issues, boron continues to offer promise for examining the history of carbon cycling in the ocean , and even for examining recent changes that are difficult to detect with current instrumental arrays. The importance of the carbon cycle on long and short time scales makes continued improvement and application of boron-based and other carbon-cycle proxies worth the effort.
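As a rough illustration of how a boron isotope measurement maps onto pH, the sketch below uses the standard form of the δ11B to pH relation for dissolved borate. The constants (seawater δ11B, the boron isotope fractionation factor and the boric acid dissociation constant) are assumed, typical literature-style values and are not given in this paper, and the function name is hypothetical. The sketch also assumes that the measured carbonate records the δ11B of seawater borate directly, which is exactly the assumption that the vital effects discussed above call into question; in practice, species-specific calibrations are needed.

```python
import math

# Assumed, typical values (not taken from this paper).
D11B_SEAWATER = 39.61  # per mil, boron isotopic composition of modern seawater
ALPHA_B = 1.0272       # fractionation factor between boric acid and borate
PKB = 8.6              # boric acid dissociation constant; varies with T, S and pressure

def ph_from_borate_d11b(d11b_borate, d11b_sw=D11B_SEAWATER, alpha=ALPHA_B, pkb=PKB):
    """Estimate pH from the d11B of borate using the standard two-species relation."""
    epsilon = 1000.0 * (alpha - 1.0)
    numerator = -(d11b_sw - d11b_borate)
    denominator = d11b_sw - alpha * d11b_borate - epsilon
    # Both terms must be negative for a physically meaningful (positive) ratio.
    if numerator >= 0.0 or denominator >= 0.0:
        raise ValueError("d11B of borate is outside the range permitted by the constants")
    return pkb - math.log10(numerator / denominator)

if __name__ == "__main__":
    for d11b in (18.0, 19.0, 20.0):
        print(d11b, round(ph_from_borate_d11b(d11b), 2))
```

With these assumed constants, a shift of 1 per mil in δ11B near modern values corresponds to a pH change of only about 0.1 units, which illustrates why accurate and precise analyses are so important for this proxy.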
(iv) Suggested ways forward
The examples above describe approaches to reconstructing three integral parts of the ocean system: temperature and salinity, ocean circulation rates, and carbon cycling. There have been sufficient successes to warrant continued use of each; however, there are outstanding questions that would, if addressed, improve our knowledge of ocean behaviour during climate events. Each proxy has its own specific issues, but the following are generalities that apply to all proxies.
An incomplete understanding of the controls on the chemistry of the ocean today hampers our attempts to interpret past changes. The ongoing GEOTRACES programme is making significant progress towards understanding modern geochemical cycling of trace elements and isotopes in the ocean, and over the next decade should solidify our understanding of these complex systems. Continued efforts to understand the cycling of trace metals, isotopes and other components of the modern ocean (e.g. formation and cycling of organic molecules) and how they are transmitted into marine archives, together with realistic assessments of the inherent uncertainties in these processes, will be essential over the next decade.
A common problem for making modern calibration curves is access to appropriate samples. There are two ways to produce such samples. The first is to grow samples in the laboratory with either inorganic crystals  or marine organisms . The advantage to this approach is that the system can be carefully controlled and varied over a range of appropriate settings. Drawbacks to this approach include the challenges of constraining growth rates for inorganic precipitates and un-natural growth conditions that may lead to stress or other complications for biological samples. A second approach is to use co-located, bottom water and core-top sediment samples (or other modern representatives of the archive in question). While field studies form the most appropriate approach for observing the full range of variability in the present-day natural system, there are some disadvantages. Sediment core-top material is often not modern owing to bioturbation or winnowing, diagenetic effects may mask the original signature of the proxy and many components of the ocean system vary in parallel, making it difficult to calibrate a single parameter (e.g. temperature) on its own. Using sediment-trap samples can help with the first two of these issues. Continued collection and distribution of appropriate core-top samples from the widest possible array of marine settings, for example, in conjunction with the GEOTRACES programme, will help address this last issue.
At times, it is our analytical uncertainty that causes difficulties in cross-comparisons and global syntheses. While it might be expected that complex isotopic analyses or contamination-prone tracers such as boron isotopes or Cd/Ca ratios will be problematic, there are issues even with the most established proxies. For example, inter-laboratory differences still exist for δ18O that are significant enough to make it challenging to use data from different laboratories in a compilation study. As general good practice, inter-laboratory comparison exercises, such as those undertaken for Mg/Ca in foraminifera, and distribution of well-characterized standards will aid these efforts.
During the process of biomineralization, organisms tend to manipulate the chemistry of seawater as they build their skeletons. Thus, for proxies that are analysed in such skeletons, some understanding, or at least quantification, of the extent of these manipulations is required. Biological effects have long been documented in empirical calibrations, and models for biomineralization are being built upon . Such modelling is particularly advantageous for predicting where and when proxies are likely to fail and succeed. For example, at lower pH values, foraminifera appear to be able to manipulate the fluid from which they form their skeleton so that the δ11B falls further from the expected ratio . Modelling of such effects may also lead to new ways to extract temperature signals from traditional and novel archives . However, models rarely constrain all available observations; so further improvements to constraining biomineralization are likely to require combined observational and modelling efforts. Culturing and laboratory growth experiments of both inorganic crystal and marine organisms coupled with detailed micro- or nano-geochemical studies will allow us to test individual controls on each parameter. A concrete understanding of biomineralization processes will put palaeoceanographic proxy reconstructions on a sounder footing for the future.
A missing link in contemporary palaeoceanography may be the modelling of molecular and particle-scale interactions of isotopic tracers . There are fundamental gaps in our theoretical and modelling understanding of the underlying thermodynamics that control the broad biogeochemical processes used in the palaeoceanographic community. A better understanding of processes such as particle adsorption and desorption will improve our understanding of important transport mechanisms in the ocean such as vertical scavenging [64,98].
Finally, in this list of approaches for improving proxy calibrations, we should emphasize the requirement that the uncertainty in each proxy be carefully assessed, including analytical accuracy and precision, scatter in the calibration and acknowledgement of caveats. Providing clear documentation of these uncertainties will allow transfer of palaeoceanographic datasets into global syntheses, providing the most realistic modelling targets.
(b) Relative and absolute dating of marine archives
Just as important as accurate proxy reconstructions is placing palaeoceanographic records on a common, well-constrained time scale. Without precise and accurate dates, it is not possible to produce ‘snapshot’ compilations of palaeo-data from a particular time in Earth's history, or to examine lead–lag relationships across hemispheres, between different parts of the climate system or even to identify potential forcing mechanisms. A particular strength of ice-core records has been the precise dating from layer counting coupled with the ability to examine relative lead-and-lag relationships between the two polar regions using the atmospherically well-mixed methane as a common temporal framework. Speleothem records provide similarly high-resolution records of the hydrologic cycle at low latitudes, and are particularly valuable because they can be precisely dated using the decay of uranium to thorium. By contrast, the sedimentary record of the past ocean lacks the age control to allow us to take full advantage of these records. Proxies have been analysed in a variety of archives, including components of sediment cores, corals, bivalves and sclerosponges [15,103–105]. Some of these can be dated directly, some through stratigraphic correlations, and some have virtually no age control at all.
It is clear that orbital ‘Milankovitch’ pacing influences major climate transitions. However, attempts to provide independent ages for the timing of the major deglacial events remain controversial. Over the late Pleistocene, dating techniques based on the decay of uranium to thorium hold the most promise. Such dating of carbonate-rich sediment and speleothems has indicated that massive climate changes may, at times, have preceded the presumed major orbital forcing [100,108]. Recent studies based on coral terraces have even indicated that sea level may have experienced rapid and large-amplitude changes during the last interglacial period. When pushing analytical techniques to the limits of their precision, as needed to test climate forcing, we find that the details of the dating begin to break down. Aragonite is prone to diagenesis, causing the geochemical properties of uranium and its daughter products to change over time [110,111]. This open-system behaviour, coupled with potential inter-laboratory discrepancies and a poorly constrained history of uranium in seawater, means that there is still room to improve this approach. For example, a new project (www.useries.org) is conducting an international intercalibration effort, including distributing precisely calibrated standards to the community, and producing a web-based user interface for sharing U-series data in a common format, making it significantly easier to update ages, given a change to, for example, a half-life of one of the isotopes.
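To make the dependence on half-lives and on closed-system behaviour concrete, the widely used closed-system 230Th age equation can be written as follows. This is the standard textbook form rather than anything specified in this paper, it assumes no initial 230Th, and the symbols (t for age, λ for decay constants, the subscripts m and i for measured and initial values) are introduced here purely for illustration.

```latex
\left[\frac{^{230}\mathrm{Th}}{^{238}\mathrm{U}}\right]_{\mathrm{act}}
  = 1 - e^{-\lambda_{230} t}
  + \frac{\delta^{234}\mathrm{U}_{m}}{1000}\,
    \frac{\lambda_{230}}{\lambda_{230}-\lambda_{234}}
    \left(1 - e^{-(\lambda_{230}-\lambda_{234})\,t}\right),
\qquad
\delta^{234}\mathrm{U}_{i} = \delta^{234}\mathrm{U}_{m}\, e^{\lambda_{234} t}
```

Because the age t appears only implicitly and both decay constants enter the expression, any revision to a half-life requires every age to be recomputed from the measured ratios, which is why archiving the raw isotope data in a common format matters. The back-calculated initial value δ234U_i is commonly compared with the seawater value as a screen for open-system behaviour, a check that is only as good as our knowledge of the history of uranium in seawater.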
Age models for sediment cores are typically produced by first identifying age ‘tie points’, and then interpolating between these points. These tie points can be determined by correlations to another record (stratigraphic correlation), biostratigraphy, ash layers, magnetic reversals and radiometric approaches. Stratigraphic correlations make assumptions about connections in the climate system. The most obvious example is that of the Mapping Spectral Variability in Global Climate Project (SPECMAP), which used the pacing of orbital cycles to date a composite marine δ18O record and establish a connection between the two. This approach has been extended to higher resolution studies as the focus of Pleistocene palaeoceanographic studies has turned to millennial-scale events. Commonly, some aspect of a high sedimentation rate sediment core, such as Mg/Ca in foraminifera or the colour index, is compared with another record, and a number of tie points (such as changes in slope, or mid points of extreme changes in the proxy) are selected. Common comparison targets are records from Greenland ice cores, or uranium-series-dated Chinese speleothem records [4,114]. The allocation of tie points based on stratigraphic correlation depends upon a predetermined decision that the record resembles the general features of another record. This a priori decision precludes a subsequent test as to relative timing because the records have been forced onto the same time scale.
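The tie-point procedure described above can be sketched in a few lines. The following is a minimal illustration, assuming simple linear interpolation between tie points (one common choice among several); the tie-point depths and ages are hypothetical values invented for the example, not data from any core, and the function name is ours.

```python
from bisect import bisect_right

# Hypothetical tie points, for illustration only (not from any real core).
TIE_DEPTHS_CM = [0.0, 100.0, 250.0]   # depth in core, strictly increasing
TIE_AGES_KA = [0.0, 10.0, 16.0]       # age assigned to each tie point


def interpolate_age(depth_cm, tie_depths, tie_ages):
    """Return the linearly interpolated age at depth_cm.

    Returns None outside the dated interval, because extrapolating beyond
    the first or last tie point would assume a sedimentation rate that
    nothing constrains.
    """
    if len(tie_depths) != len(tie_ages) or len(tie_depths) < 2:
        raise ValueError("need at least two tie points with matching ages")
    if any(d1 <= d0 for d0, d1 in zip(tie_depths, tie_depths[1:])):
        raise ValueError("tie-point depths must be strictly increasing")
    if not tie_depths[0] <= depth_cm <= tie_depths[-1]:
        return None
    # Index of the tie point just below (deeper than) the sample.
    i = min(bisect_right(tie_depths, depth_cm), len(tie_depths) - 1)
    d0, d1 = tie_depths[i - 1], tie_depths[i]
    a0, a1 = tie_ages[i - 1], tie_ages[i]
    # Constant sedimentation rate is assumed within each segment.
    return a0 + (a1 - a0) * (depth_cm - d0) / (d1 - d0)


sample_depths_cm = [50.0, 175.0, 300.0]
sample_ages_ka = [
    interpolate_age(d, TIE_DEPTHS_CM, TIE_AGES_KA) for d in sample_depths_cm
]
```

Every interpolated age inherits both the tie-point ages and the assumption of constant sedimentation between them, so if the tie points come from correlation to another record, the sample ages cannot later be used to test timing against that record.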
Barring stratigraphic correlations, for cores that span the last approximately 30 000 years, most tie points depend on radiocarbon dates. Radiocarbon is an appealing candidate for dating late Pleistocene material because of its appropriate half-life. However, radiocarbon is an integral part of the carbon cycle and as such, its use as a chronometer is fraught with biases and uncertainty, particularly in the marine setting (figure 2). First, its production rate is highly variable over time such that a 14C/12C ratio must be ‘calibrated’ to account for the initial 14C/12C ratio. This calibration is usually carried out by comparison with a community-based consensus calibration curve (e.g. IntCal). New calibration curves are published every few years, often with substantial differences during times of rapid climate change. For example, at around 17–16 ka, there is a more than 800 year difference between IntCal04 and IntCal09 [115,118]. A recent publication from a U–Th-dated Chinese speleothem is in agreement with the earlier IntCal04 record. Second, the reservoir age (the difference in 14C/12C between the water in which the sample was formed and the atmosphere) is required to get the true initial 14C/12C for marine samples. These reservoir ages are poorly constrained over time, and are based on a few scattered samples and on a carbon-cycle model. As an example of the extent of the problem, two recent publications on cores from the South Atlantic used reservoir ages that differed by more than 1300 years [22,80]. Faunal differences, bioturbation and dissolution can also lead to differences of thousands of years. Detailed dating studies have revealed even greater complexity to the age of marine cores: not all components of a sediment core from the same depth have the same age. For example, Ohkouchi et al. showed that the organic compounds used for the alkenone temperature proxy may have ages up to 7000 years different from the radiocarbon ages of foraminifera in the same sediment sample. Radiocarbon is an exciting tool for tracing carbon-cycle processes, but age models based on its analysis should be closely scrutinized before they are used as the basis for assessing the timing of climate changes. At the very least, all studies should use single species or date multiple species and show that they have concordant ages. Publications should include the complete radiocarbon dataset, with the calibration curve used, the reservoir age and any other relevant metadata so that the ages and associated age model can be readily reassessed when further information as to the history of radiocarbon is acquired. The uncertainty should be fully quantified and include proper documentation of the reservoir ages.
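As a rough sketch of the reporting practice recommended here, the record below keeps the measured radiocarbon age, the assumed reservoir age and the calibration curve together, so that a date can be recomputed when any of them is revised. The class, field names and numerical values are hypothetical and chosen only for illustration. The only fixed convention used is that conventional radiocarbon ages are computed with the Libby mean life of 8033 years, and the two uncertainties are combined in quadrature on the assumption that they are independent.

```python
import math
from dataclasses import dataclass, replace

LIBBY_MEAN_LIFE_YR = 8033.0  # convention for conventional radiocarbon ages


def conventional_14c_age(fraction_modern):
    """Conventional radiocarbon age (14C yr BP) from fraction modern."""
    if fraction_modern <= 0.0:
        raise ValueError("fraction modern must be positive")
    return -LIBBY_MEAN_LIFE_YR * math.log(fraction_modern)


@dataclass(frozen=True)
class MarineRadiocarbonDate:
    """Hypothetical reporting record; all field names are illustrative."""
    sample_id: str
    depth_cm: float
    species: str
    c14_age_yr: float        # measured conventional 14C age
    c14_err_yr: float        # 1-sigma analytical uncertainty
    reservoir_age_yr: float  # assumed surface reservoir age
    reservoir_err_yr: float  # 1-sigma uncertainty on that assumption
    calibration_curve: str   # e.g. the IntCal release used

    def reservoir_corrected(self):
        """Return (age, 1-sigma error) in 14C years, before calibration."""
        age = self.c14_age_yr - self.reservoir_age_yr
        # Independent errors are assumed, so they add in quadrature.
        err = math.hypot(self.c14_err_yr, self.reservoir_err_yr)
        return age, err


# Illustrative values only: one measurement, two published-style assumptions.
date_a = MarineRadiocarbonDate(
    "example-1", 120.0, "single species", 14500.0, 60.0, 400.0, 100.0, "IntCal09"
)
date_b = replace(date_a, reservoir_age_yr=1700.0)

age_a, err_a = date_a.reservoir_corrected()
age_b, err_b = date_b.reservoir_corrected()
offset_yr = age_a - age_b  # difference caused by the reservoir choice alone
```

With these invented numbers the same measurement yields corrected ages 1300 radiocarbon years apart, purely from the reservoir-age choice. The size of the offset after calibration depends on the shape of the calibration curve in that interval, which this sketch deliberately does not attempt to model.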
It is clear that without age control, palaeo-data have little value. Thus, concerted effort into improving chronological constraints should be central to palaeoceanographic research on all time scales. These efforts should avoid stratigraphic correlations that make predetermined judgements about connections between different parts of the climate system and preclude testing conceptual models that rely on lead–lag relationships. All information used to generate age models should be documented so that it can readily be updated should new information come to light (e.g. new calibration curves for radiocarbon dating). This information should include the methodological details related to tie points, and the rationale for the choices made for interpolation between tie points. Efforts should be made to improve our understanding of the geochemistry and cycling of relevant systems. For example, it is clear that we need better constraints on reservoir ages if we are to continue to use radiocarbon as a dating tool. At the very least, we should use consensus values so that cores from similar locations can be compared with one another. Likewise, we need better constraints on the history of uranium in the ocean to determine the best U-series dates. Finally, all of the associated uncertainties on an age model should be adequately quantified and stated so that palaeo-data are not over-interpreted in relation to one another.
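One simple way to state the uncertainty on an interpolated age is to propagate the tie-point errors by Monte Carlo sampling, as sketched below. This is an illustration under stated assumptions, not a recommended standard: tie-point errors are treated as independent and Gaussian, draws that violate stratigraphic order are rejected, and the tie points themselves are hypothetical. It ignores the additional uncertainty from sedimentation-rate changes between tie points, so the intervals it returns are minimum estimates.

```python
import random
from bisect import bisect_right

# Hypothetical tie points, for illustration only.
TIE_DEPTHS_CM = [0.0, 100.0, 250.0]
TIE_AGES_KA = [0.0, 10.0, 16.0]
TIE_SIGMAS_KA = [0.05, 0.2, 0.3]  # 1-sigma age uncertainty at each tie point


def interpolate_age(depth_cm, tie_depths, tie_ages):
    """Linear interpolation between the tie points bracketing depth_cm."""
    if not tie_depths[0] <= depth_cm <= tie_depths[-1]:
        raise ValueError("depth lies outside the dated interval")
    i = min(bisect_right(tie_depths, depth_cm), len(tie_depths) - 1)
    d0, d1 = tie_depths[i - 1], tie_depths[i]
    a0, a1 = tie_ages[i - 1], tie_ages[i]
    return a0 + (a1 - a0) * (depth_cm - d0) / (d1 - d0)


def monte_carlo_age_interval(depth_cm, tie_depths, tie_ages, tie_sigmas,
                             n_draws=2000, seed=0):
    """Return (2.5th percentile, median, 97.5th percentile) of age in ka."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    samples = []
    attempts = 0
    while len(samples) < n_draws:
        attempts += 1
        if attempts > 100 * n_draws:
            raise RuntimeError("tie-point errors overlap too strongly")
        draw = [rng.gauss(a, s) for a, s in zip(tie_ages, tie_sigmas)]
        # Reject draws in which a deeper tie point is younger than a shallower one.
        if any(b < a for a, b in zip(draw, draw[1:])):
            continue
        samples.append(interpolate_age(depth_cm, tie_depths, draw))
    samples.sort()

    def quantile(q):
        return samples[int(q * (n_draws - 1))]

    return quantile(0.025), quantile(0.5), quantile(0.975)


low_ka, median_ka, high_ka = monte_carlo_age_interval(
    175.0, TIE_DEPTHS_CM, TIE_AGES_KA, TIE_SIGMAS_KA
)
```

Reporting the interval alongside the tie points and the random seed lets a reader regenerate the age model when a calibration curve or reservoir age changes, and makes it clear when two records cannot be distinguished in time.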
Eventually, to produce marine records with the same temporal resolution and age control as ice cores and speleothems, it is likely that we will have to use non-sediment-based archives. Sediments continue to provide valuable information, especially through their ability to record continuous records, and through the wealth of approaches that can be used to tackle palaeoceanographic questions. For example, sediment cores may include components from marine and terrestrial sources, allowing cross-comparison of the two realms within one record. With rare exceptions, however, sediment cores are of most use over time scales of thousands to hundreds of thousands of years. Factors such as bioturbation, slow sedimentation rates and diagenetic alteration tend to limit our ability to interpret records over shorter time scales. With improving technology and ability to collect samples, new archives are proving valuable in filling the gaps where sediments are least effective. For example, shallow-water corals provide annually resolved records of the surface ocean , and the skeletons of deep-sea corals provide highly resolved, U-series-dated records of the intermediate and deep ocean [122,123]. Continued efforts to collect and take advantage of new archives and approaches are required to produce adequate records of the rates and amplitudes of oceanic changes in the past.
(c) Focused effort or global coverage?
In an ideal world, we would produce high-temporal resolution, closely spaced (vertical and horizontal) and precisely dated data for all ocean parameters for the global ocean over the last 65 million years. These data would be incorporated into fully coupled, highly resolved transient climate models and lead to a comprehensive understanding of the Earth system and the best possible predictions of future trajectories. It is probably fair to say that this effort would be greater than could be carried out by the research community over many decades, subsume more resources than are currently committed to science and would not even be possible, given that perfect archives of the past do not exist. Obviously, we must be more selective in our efforts, and choose the targets of our research wisely.
There are two opposing views of how data should be distributed: evenly, or focused on specific ‘key’ locations. It was only early in ocean exploration that samples were collected regularly, no matter the location. For example, sampling by the R/V Vema increased recovery of sediment cores from approximately 100 in 1948 to more than 1000 in 1956 because samples were collected every day, wherever the ship happened to be (http://www.ldeo.columbia.edu/research/office-of-marine-operations/history/vema). Since then, palaeoceanographic data have tended to be clustered spatially for historical, practical and scientific reasons. For example, there is a distinct bias towards sampling in the North Atlantic due, in part, to the long scientific history of the adjoining countries. On the practical side, of course, sediments can only be collected where they accumulate and are preserved. Much of the sea floor is unsuitable for coring due to processes such as erosion or turbidite deposition. Selection for sites with undisturbed sedimentary sequences necessarily limits the locations where good palaeoceanographic records can be developed, typically to areas of high productivity, near margins or where sediments are focused by currents. Nowadays palaeoceanographic field campaigns tend to target specific processes or regions such as upwelling off Africa or across frontal zones [125,126].
To match up with global modelling targets, it might be argued that it is desirable to have a ‘sample in every grid box’, but such sampling would be prohibitively expensive and of inconsistent quality. Regional-scale models may be better served by higher resolution process studies. Where processes have a disproportionate impact on the wider scale circulation compared with their own extent, such a targeted strategy may be particularly justified. In regional studies, the locations of palaeoceanographic reconstructions should have a high spatial resolution, both vertical and horizontal, to cover vertical water-mass boundaries and frontal zones. With adequate age control, this type of data would contribute to reconstruction of oceanic sections similar in nature to the data collected in programmes such as GEOTRACES and WOCE, but for the past ocean. An example of this type of approach is the last glacial maximum compilation of the West Atlantic using the nutrient-based tracer δ13C. Similarly, efforts such as CLIMAP and, more recently, the Multiproxy Approach for the Reconstruction of the Glacial Ocean surface (MARGO) have attempted global syntheses of sea-surface temperatures [128,129]. Both of these types of compilations have been widely used by the data and modelling communities, highlighting the benefit of this sort of approach. But even in these few cases where reasonable amounts of data exist, there are still key gaps, such as at intermediate depth or in the Southern Ocean.
As we determine where to put efforts into collecting new samples, there will be the inevitable trade-off between cost and potential scientific output. In addition, we need access to technology that allows us to collect samples that are best suited to our studies. For example, the UK does not have a long coring system, limiting our ability to access sediments for deeper time studies. We do have a sophisticated remotely operated vehicle (ROV Isis), but lack of sufficient funding has severely limited its use, and thus far it has not been used for palaeoclimate studies. The Integrated Ocean Drilling Program (IODP) is costly, but has allowed drilling down to depths that allow investigation of ancient climate regimes, such as the PETM. Recent changes to the funding structure of the IODP (http://www.iodp.org) mean that access to newly drilled samples may be reduced in the future. Certainly, there is a place for enhanced international cooperation in marine technology and ship access if we are to realize maximum benefit from ocean-going expeditions.
Sampling strategies fall into two broad categories: spatial and temporal. There are already thousands of samples that have been collected from the sea floor; so one might question whether we need to collect more. Likewise, we already have a wealth of information on the history of the oceans; so perhaps we should be focusing our resources on synthesizing and modelling those. However, as described in prior sections, our limited data coverage, both spatial and temporal, coupled with, at times, poorly constrained proxies and age models requires that we do continue to produce high-quality data that are suited to testing climate models and conceptual frameworks. Programmes such as the IODP have been particularly valuable in part because of the excellent archiving and availability of samples to the wider scientific community. Some individual institutions have assembled excellent facilities for searching core repositories that allow researchers to search and request samples, but in some cases, these cores may not be preserved sufficiently carefully to allow application of modern techniques. Likewise, efforts at compiling extensive databases of palaeoceanographic data with associated metadata are invaluable for assessing the state-of-the-art in terms of palaeoceanographic reconstructions (e.g. http://www.ncdc.noaa.gov/paleo/paleocean.html) and allowing diverse scientists to access datasets. However, without expert interpretation, and assessment of the age models that are provided, it is easy for these datasets to be misinterpreted. To remedy this situation, the data should be clearly described, with adequate information as to the limitations, and in addition, there should be close collaboration between those that collect data, and those that seek to use it.
Even with careful synthesis, there are still gaps in the data we need to test key climate hypotheses, from the samples needed for rigorous proxy calibration to areas of the ocean that are hard to sample but still vitally important. For example, carbonate dissolution and low accumulation rates in the deep Pacific and Southern Oceans mean that we do not yet know exactly how and when they participated in millennial climate events. Given that the deep Pacific contains a huge volume of water and was presumably the location of excess stored carbon during glacial periods, and that the Southern Ocean is widely thought to be a key player in the uptake and release of CO2, we need to focus effort on these regions. Even in the North Atlantic, where we have considerably more information, we still lack full depth and age coverage for proxies that appear to contain vital information on overturning rates, such as Pa/Th and radiocarbon [73,122]. In many cases, the samples have not been collected, particularly as we realize the advantage of new climate archives beyond sediment cores.
(d) Comparisons between modelling and data
The interplay between modelling, data and theory in palaeoceanography is intricate, being an iterative and complementary process. In §1, we described how early large-scale conceptual models drove the fundamental questions for early palaeoceanographic studies [2,12]. As described in §2a, modelling is playing an important role in developing our interpretations of new palaeoceanographic proxies [62–70], and this type of approach characterizes one of the ways that modelling and observational techniques can be used in tandem. We have hinted at a missing link between palaeo-proxy development and a theoretical/modelling understanding of molecular/particle-scale processes [62,94,95]. We have also discussed the use of modelling to open specific questions regarding the carbon cycle that can only be addressed with new palaeoceanographic data [16,31,56,78]. Here, we will briefly consider the use of state-of-the-art climate models that are similar to those used to make projections of future climate under different future emissions scenarios. This exercise helps us to test contemporary theories of palaeoceanographic change, many of which have been developed using simpler models .
Due in part to the limitations of computing power, there have been few attempts to run transient models of abrupt climate events in the most sophisticated coupled models. In a recent study, however, Liu et al.  did consider the period of millennial variability between the end of the last glacial period and the end of the Younger Dryas cooling event  using a sophisticated atmosphere–ocean general circulation model (AOGCM). By using ice-sheet boundary conditions from a glacio-isostatic adjustment model, greenhouse-gas concentrations from ice cores, insolation forcing calculated from orbital parameters and freshwater forcing in the North Atlantic, these authors were able to reconstruct both millennial variability and the background deglacial warming trend across a large range of different palaeo-proxy data at different sites. This simulation seems to indicate that our overall picture of the physical climate changes during the end of the last glacial period is reasonable. However, the freshwater input to the North Atlantic is the dominant control on the model outcome and this is an imposed variable (i.e. the model does not calculate this freely). The freshwater forcing in this model has been described as unrealistic in comparison with data constraints , so it is evident that we still have a lot to learn regarding the processes driving millennial variability . Further studies using different AOGCMs are needed to address the challenges and questions posed by trying to model such a complex, transient system.
Heavily parametrized ‘simple’ models often capture numerous equilibria, hysteresis and abrupt changes, but for full-complexity models of the type used in the Intergovernmental Panel on Climate Change reports, the results are different. Multiple equilibria and hysteresis have rarely been seen in full-complexity models, with notable exceptions, and therefore may be mere artefacts of simpler models. Realistic, proxy-based estimates of typical freshwater input into the North Atlantic of 0.1 or 0.2 Sverdrups (1 Sverdrup = 10⁶ m³ s⁻¹) are about a tenth of the value used in some simulations. These more realistic freshwater fluxes result in a relatively small reduction in circulation of about 30 per cent for glacial climate states, resulting in changes in Greenland's air temperature of only 2–3°C compared with reconstructed values of 8–15°C. Given the discussion in the paragraph above, comparison with palaeo-data implies that full-complexity models may be considerably less sensitive than palaeo-data suggest [127,132]. While more modelling studies are needed, palaeo-data pose important challenges to the state-of-the-art climate simulations used for forward climate projections.
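To give a sense of scale for these freshwater fluxes, the short calculation below converts a sustained flux in Sverdrups into a water volume and an equivalent global sea-level change. The 100 year duration is a hypothetical choice made only for illustration, and the ocean surface area is an approximate modern value; neither comes from the studies discussed above.

```python
SVERDRUP_M3_PER_S = 1.0e6                 # 1 Sv = 10^6 m^3 s^-1
SECONDS_PER_YEAR = 365.25 * 24 * 3600
OCEAN_AREA_M2 = 3.6e14                    # approximate modern ocean area (assumption)


def freshwater_volume_m3(flux_sv, duration_yr):
    """Total volume delivered by a constant flux over the given duration."""
    return flux_sv * SVERDRUP_M3_PER_S * duration_yr * SECONDS_PER_YEAR


def sea_level_equivalent_m(flux_sv, duration_yr):
    """Volume spread evenly over the modern ocean surface, in metres."""
    return freshwater_volume_m3(flux_sv, duration_yr) / OCEAN_AREA_M2


# Hypothetical 100-year pulses at a proxy-scale flux and at ten times that flux.
rise_small_m = sea_level_equivalent_m(0.1, 100.0)
rise_large_m = sea_level_equivalent_m(1.0, 100.0)
```

Under these assumptions, 0.1 Sv sustained for a century corresponds to a little under 1 m of global sea-level rise, and a flux ten times larger to nearly 9 m, which illustrates why independent sea-level records place useful limits on the forcing that can plausibly be imposed in a model.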
There are several paths for development in palaeoceanographic/palaeoclimate modelling. Here, we have discussed new proxy development, carbon-cycle modelling and model-data comparison for sophisticated climate models.
Progress continues to be made in developing new, productive relationships between ocean modellers and biogeochemists in order to develop fuller understanding of palaeo-proxy data such as Pa/Th or Nd isotopes. This strategy has shown considerable promise [63,64,69,71]. Another avenue for progress is the development of coupled modelling. This approach includes coupled carbon-cycle–climate modelling  and coupled ice-sheet–climate modelling  and even carbon-cycle–ice-sheet–climate modelling . As computing power becomes more available, these simple models increase in resolution and complexity (for example from two- to three-dimensional ocean simulations), but their simplicity still limits their reliability, and their results are not necessarily reproduced by more sophisticated models .
In a recent assessment of the ability of sophisticated climate models to simulate palaeoclimate, Valdes stated:
Overall, the modelling of past abrupt events does not give us confidence in the ability of complex models to simulate critical threshold behaviour that we know has occurred in the past. In response to this deficiency, first we need to challenge the palaeodata, and continue to improve our knowledge of past forcing factors and the ensuing climate response. Second, we need to understand the physics and dynamics of documented abrupt change events better. And third, we need to develop more sophisticated tests of the full complexity models—tests that help to analyse their behaviour during abrupt changes. , p. 415
This set of challenges presents several ways forward for palaeoclimate modelling based around a deterministic understanding of the climate system. However, there are important differences between models for key characteristics of the climate such as the sensitivity of ocean circulation to freshwater input in the North Atlantic . Such model-dependent features indicate an important sensitivity to model boundary conditions for these abrupt changes in climate that is yet to be fully understood. Indeed, some authors go so far as to argue that it may not be appropriate to model these sensitive processes deterministically [13,14]. Instead, such systems may be stochastic in nature , and, in this case, deterministic models are taking on problems posed by the palaeo-data that are intractable, and a fundamental change of strategy may be needed.
3. Concluding remarks
This paper has covered aspects of the motivation, history, successes and challenges of palaeoceanography, including suggested ways forward in areas ranging from conceptual and complex models through to sampling strategies and the finer details of proxy calibrations and dating. Despite this broad overview, we are aware that we have only scratched the surface of the multi-disciplinary, wide-ranging field of palaeoceanography. Rather than suggest an unachievable set of goals for the next 20 years, we have tried to provide practical, logical steps that will improve the links between modelling and data compilations. These efforts include improved and quantified proxy calibrations, appropriate age control on all palaeo-data and new process questions from coupled and high-resolution modelling efforts. Indeed, it is unlikely that we could predict the biggest finding to come in this field. Based on publications from the last century, the only results that we should expect in the future are more surprises.
The preceding discussion highlighted where we might usefully expend effort: understanding and quantifying proxies, putting palaeoceanographic records onto common, well-quantified time scales, and harnessing the combined potential of models and data to understand the Earth's climate system. For example, we still do not understand the causes of abrupt millennial-scale events: we have conceptual models describing the sequences of events, but we are still missing a global picture of these transient events on a common time frame. What was happening during extreme climate events such as Heinrich event 1? Do we even have the age control to do this type of study? Can these events be modelled successfully using freshwater forcing? If not, then are the models at fault, or the data, or are modellers even using the wrong approach? It is obvious that even with targeted studies, it is labour intensive and costly to do this sort of survey on a common time frame with streamlined proxies. For this reason, we would argue that an emphasis on returning to the basics of proxy quantification and age control will eventually lead to significantly more output from our limited resources. Such datasets will make synthesis efforts considerably more straightforward, thus promoting closer collaboration between data and model approaches. Whatever happens, the mutual interactions between theory, modelling and palaeo-proxy data look set to continue to play a key role in future developments.
We acknowledge the helpful discussion of Paul Valdes, Jerry McManus, Kate Hendry and Andrea Burke, and reviews by Bob Anderson and one anonymous reviewer, although this paper does not necessarily reflect their views. We also thank Harry Bryden, Carol Robinson and Challenger Society for this opportunity to air our opinions on the future of palaeoceanography. We acknowledge the support of the Marie-Curie Reintegration programme (L.F.R.), the European Research Council (L.F.R.) and the Research Council UK fellowship (M.S.) and the University of Bristol.
One contribution of 11 to a Theme Issue ‘Prospectus for UK marine science’.
- This journal is © 2012 The Royal Society