This paper discusses some of the major issues that surround estimating regional emissions of trace gases from atmospheric observations through inversion modelling. Inversion methods use modelled knowledge of how emissions dilute in the atmosphere as they travel from their source to an observation point, together with the observations, to calculate a grid of emissions. The problem is one of minimizing the mismatch between a modelled and observed time series of concentration. There are many methods of comparing time series, some involving a priori knowledge others without. The location, terrain and height of the observation station can also be very significant in determining how well a model can represent the dilution from emission source to receptor. The inversion solution (emission map) will assign some of the sources incorrectly for a variety of reasons, e.g. local sources, intermittent releases, errors in the modelled transport or observation, and the choice of the spatial and temporal resolution of the emission map. The reasons for uncertainty in the modelled emissions are discussed along with suggestions as to how some of these can be minimized. Using multiple stations to further constrain the inversion should reduce the uncertainty; however, care is needed if the potential improvements are to be realized.
As part of both the Kyoto and Montreal Protocols countries are obliged to estimate their annual emissions of a range of greenhouse gases (GHGs) and ozone-depleting gases, such as methane (CH4), nitrous oxide (N2O), chlorofluorocarbons (CFCs) and hydrofluorocarbons (HFCs). These emission estimates (inventories) are submitted to the United Nations Framework Convention on Climate Change (UNFCCC) and are required to follow strict guidelines. The inventories are produced using a combination of activity data, for example, for CH4 these would include: number of dairy cows, number of beef cattle, number and type of land-fills, natural gas usage, etc., and empirical emission factors that link activities to emissions. Such inventories are referred to as ‘bottom-up’ estimates of emission.
Inversion modelling using atmospheric observations provides a completely independent method for verifying these inventory estimates as they use totally different input information; such emission estimates are often referred to as ‘top-down’ estimates. Inversion methods require information on how emissions dilute in the atmosphere as they travel to an observation point from their source regions to calculate a best-fitting grid of emissions. The inversion problem involves minimizing the differences in concentration between the modelled and observed time series. There are many studies using different inversion methods, for example, globally [1,2], regionally [3,4] and both globally and regionally [5–8]. However, up until now they are not mandated by either Protocol but are instead considered to be best practice. Both the bottom-up and top-down methods are subject to significant uncertainty which varies for different gases and methods used.
This paper is focused on top-down emission estimates using inversion modelling and will explore some of the challenges that need to be considered when undertaking such an activity on the regional scale. Detailed regional emission estimates are politically important as they resolve at the country scale or finer. This paper will close with an example of the results from one particular method to illustrate how some of the issues can be addressed.
(a) Comparing time series and constraining the solution
The central component of estimating emissions from observations is to model how emissions from a specified horizontal grid dilute via a transport matrix from their source to the receptor, the observation. This transport matrix is over-determined and inconsistent, and therefore cannot be simply inverted in the mathematical sense; hence, different approaches are used to mimic the inverting process. By considering all of the grid points in the domain, the model is thus able to, given a distribution of emissions (hereafter, referred to as an emission map), calculate a time series of concentration at the same point as the observation. By considering the statistical fit between the observed and modelled time series, the emission map can be adjusted to improve the fit. The aim of the best-fit method is to maximize the fit between the two datasets. A key point is thus what statistical function is used to measure the quality of the fit. Some common functions are root mean square error (r.m.s.e.), correlation coefficient (r) and fraction within a factor of two (FAC2) but there are many more. Each statistic has its strengths but also some weaknesses. The question is what is a good statistical fit? Figure 1 shows some very simplistic and extreme examples of how these three statistical measures fare for different datasets. As illustrated one extreme data point can have a very significant impact on both r and r.m.s.e., whereas FAC2 is resilient to individual values. However, FAC2 fails to pick up on an excellent correlation but incorrect magnitude.
It is possible, coupled to the statistical fit algorithm, to use any knowledge of how emissions are thought to be distributed (a priori information) to inform the final solution. This is the approach used by several authors, e.g. [5–8], but is not universal, for example, Manning et al.  use no a priori. The further away the current solution is from the a priori the larger the cost in the function, i.e. the benefit in terms of a better fit needs to outweigh the penalty of a solution more distant from the a priori. The key point is how to price the distance from the a priori solution relative to the statistical fit of the modelled time series to the observations. In theory, this should relate to the perceived quality of the a priori but this is a very subjective decision. Usually, this is defined as a percentage relative to the a priori solution, e.g. 100 per cent uncertainty is often used. This may sound significant but is only a factor of two of the magnitude in a grid cell and, depending on the original estimate in a particular grid box, it can be small. Therefore, if the a priori estimate failed to take account of a significant source, the inversion method is prevented, by the a priori, from increasing the solution sufficiently. If the final inversion solution is near to the limits imposed by the a priori solution, as defined by the specific method, then the a priori uncertainty limits may need to be relaxed to allow more freedom in the solution. However, if the a priori estimate is relatively good, it can significantly improve the robustness of the final solution as it is adding real knowledge to the inversion process, i.e. by ensuring that emissions are land-based, etc. The use of an a priori solution can, however, prevent unexpected sources of a particular gas from being discovered by the inversion method.
(b) Location and height of the observation point
Another key point to consider is how well does the transport model, and its underpinning three-dimensional meteorological flow model, capture the flow of air from the source of the emission to the receptor. It is particularly important that the flow near to the receptor is well represented as this will affect all of the modelled results.
Observations in flat terrain sites, e.g. Cabauw in The Netherlands, are usually ideal as the flow around the measurement site is relatively easily represented by three-dimensional numerical weather prediction (NWP) models. The lack of hills, coasts or any other significant topographic features means that a grid box averaged meteorological value is a good estimate of the entire box. It is even better if the observation point is elevated, i.e. on a tall tower, such as Cabauw, as the ground is the most complex area to be modelled as it is heterogeneous, and therefore always subject to some local flows that are not captured by the NWP models.
Coastal stations have the advantage of having a clean undisturbed wind sector, i.e. over the sea, that is well mixed and, depending on the upwind fetch, can be used to derive baseline concentrations. However, these stations are subject to sea breezes that can significantly disturb the local air flow, features that are not well resolved by NWP models especially those with horizontal resolutions greater than a few kilometres.
Tower measurements have advantages because they are away from the complex meteorology near to the ground but do, depending on their height, have the added complication of sometimes being outside the boundary layer (BL) and in the free troposphere. The BL is the volume of air connected to the ground and which is directly affected by it. It varies throughout the day from a midday peak of up to 3 km on a hot day in summer in the UK to potentially below 50 m on a cold cloudless night. Models find it difficult to predict accurately the height of the BL as it is an imprecise concept with various definitions. Knowing whether a measurement is recorded within the BL or in the free troposphere is vital if one is to understand the dilution of the emissions en route to the observation point. The levels of atmospheric turbulence vary with height but are generally weaker above the BL, and hence the modelled BL strongly affects the amount of modelled dilution. It becomes a matter of defining which point in your three-dimensional model atmosphere is most representative of the air that the observation station is experiencing; this is especially important for stations situated in mountainous areas.
For stations in mountainous or hilly areas defining where the station is within the modelled atmosphere is not a simple matter and will potentially vary from day to night. For example, consider the surface station at Jungfraujoch in the Swiss Alps. This station is positioned at 3580 m above sea level (a.s.l.) and is situated on the saddle between two mountains with steep valleys on the other two sides. In the UK Met Office global model (40 km horizontal resolution), the ground level in the grid square that contains Jungfraujoch has an altitude of just 1760 m; in the higher resolution Met Office North Atlantic European model (12 km) it has an altitude of 2110 m. During the night, the flow reaching the station is probably disconnected from the ground, i.e. is in the free troposphere, and therefore probably best represented by a model point at 3580 m a.s.l., i.e. a point approximately 1.8 km above the global model ground. During the day, the station is probably influenced by upslope surface winds from the valleys, and thus is more likely to be in flow that is within the BL. However, both of these situations will be very meteorologically dependent and will vary from day to day. What happens between these two extremes and at what time the transition from one to the other occurs, if it does, is also very difficult to predict. So how to interpret the observations at Jungfraujoch and other such mountain stations using models? One method, Bergamaschi et al. , uses only night-time observations and assumes disconnected flow, while others, e.g. Reimann et al. , assume the station is on the surface and their high-resolution modelled flow is always correct, so all observations can be used. Figure 2 shows for 1 year, using global meteorology, the difference in the total number of 3 h intervals when air from each grid cell reaches Jungfraujoch if the modelled station is at 3580 m a.s.l. (1820 m above ground level (a.g.l.)) compared with if it is at the surface (0 m a.g.l.). For large areas of Europe this difference is greater than 400 events per year and will have a strong impact on the inversion solution. The same issue, albeit with reduced impact, can affect less mountainous stations, for example, Tall Tower Angus at 535 m a.s.l. (222 m a.g.l.) situated in the hills north of Dundee in Scotland, where the impact is about 10 per cent compared with that seen at Jungfraujoch.
(c) Spatial and temporal resolution of the modelled emission estimates
The inversion models use a pre-defined horizontal grid and a specific time window in which to estimate emissions. During this time window, the emissions in a grid box are assumed invariant although different emission source categories can have different release rates. For example, Bergamaschi et al.  use monthly windows, a 1° horizontal grid and solve for different source categories separately, whereas O’Doherty et al. [9,10] use 3 yearly windows, a variable horizontal grid depending on distance from the measurement point, and solve for just total emissions per grid. Computational cost can be a limiting factor on the achievable resolution of the estimated emission map but more likely it is the lack of sufficient knowledge from different geographical regions that dictates this choice, i.e. how often and how significantly does an emission from a grid box impact on the observation location.
The inversion models have uncertainty and this will lead to emissions being assigned to an incorrect grid box. However, this wrong assignment could have implications, potentially very significant, for the total emission that is estimated for a region. This is because dilution from an emission source is exponential. Therefore, if a source is estimated to be further away than in reality, the total modelled emission will have a positive bias and vice versa for a modelled source that is nearer, and, because dilution is exponential, the differences can be large.
Intermittent or local sources can cause sources to be incorrectly assigned. Imagine a herd of cows to the west of a measurement station but close enough to be in the same grid box. When the wind blows from the west, the station will record elevated CH4 but when it blows from any of the other three directions there will be no elevation in the measurements. The inversion model will then naturally increase the emissions to the west of the actual cell containing the cows, but in doing so, it has to increase the cows’ emissions to compensate for the greater distance, and thus dilution from the cows to the station. So it is preferable that observation stations are remote (more than a couple of grid boxes) from the nearest significant source.
The resolution of the output emission grid needs consideration. It is reasonable to argue that the resolved emission grids should be no smaller than the grid resolution of the driving meteorological model because the necessary local subtleties of the meteorology will not be captured. However, having greater scope for allowing spatial variability in the emissions gives the inversion model greater flexibility to capture actual emission distributions, for example, the emissions from energy generation generally come from point source chimneys. The inversion model can, however, due say to the statistical method used, add detail when it is wrong to do so. For example, if the inversion method allows negative emission solutions (i.e. to represent potential loss processes) it is possible to unrealistically estimate large positive sources adjacent to large negative sources (a dipole), the net impact of these sources on the receptor is small as they cancel each other out.
As discussed above, the impact of distant sources on the receptor is small compared with nearby sources. One method to rectify this imbalance is to group together source areas as the distance from the measurement station increases. This is the approach taken by Manning et al.  and Vermeulen et al. . The result of this grouping goes some way to equalizing out the impact of near and distant sources, therefore not biasing the solution one way or the other. Other authors, e.g. Bergamaschi et al. , rely on the a priori solutions to prevent unrealistic solutions being estimated far from the receptor locations.
(d) Background concentrations or emissions outside the regional domain
For most gases, owing to their long atmospheric lifetimes, e.g. CH4 and N2O, the atmospheric concentration of the air entering the domain of interest from other more distant regions will be significant. In most cases, the regional perturbation in the concentration signal caused by regional emissions or loss processes will be small in comparison to the background concentration. For example, N2O in 2009 had an annual average Northern Hemisphere background level of 322 ppb  but the most significant pollution incident (i.e. air from Europe) at Mace Head, a monitoring station on the west coast of Ireland, in 2009 led to levels of 327 ppb, i.e. regional emissions created elevations of at most 5 ppb above baseline. It is therefore vitally important that the concentration of the air coming into the regional domain (baseline) is well understood. Global models, e.g. Bergamaschi et al. , use global observations to try and ensure that the air entering the regional domain is at the correct concentration. Other models try and estimate the baseline concentration by just considering those wind trajectories that come from assumed background sectors, e.g. from the Atlantic Ocean at Mace Head .
The global approach requires a significant number of well-spaced globally distributed observation stations, the National Oceanic and Atmospheric Adminstration network is ideal for this, along with a good description of the loss processes that the gas undergoes and its global emission distribution. The loss processes will include some of: loss to the stratosphere, deposition to the ground, chemical reaction with other gases, dissolution in the ocean and biological sinks. The benefit of the global approach is that, provided the above is well understood, the baseline concentration will have good spatial and temporal variability.
Background-sector approaches can ignore the loss processes for gases with atmospheric lifetimes of years (e.g. most GHGs of interest) as they are only interested in the travel within the previous fortnight. However, they do require the measurement station to be positioned so that for a significant amount of the time (of the order of at least 20–30%) background concentration air is observed. This approach has the benefit of being directly related to the measurements with no concern for global loss and production terms that can be difficult to model correctly; however, it will suffer from a lack of variability in the baseline concentrations from different directions. For example, most gases have a latitudinal gradient, usually higher in the Northern Hemisphere, because of the disparity of emissions in the two hemispheres. Hence, different concentration baseline air arrives at Mace Head when its recent origin is from equatorial latitudes. Owing to the short-lived nature of such events, of the order of days, they will not be captured in the modelled baseline.
(e) Number of observations and using multiple stations
Increasing the number of observations available to the inversion model will give it more information about the distribution and magnitude of sources. Increasing the number of observations requires either the time window of an inversion solution to be increased, the use of observations from multiple stations or increasing the frequency of the observation, e.g. changing from flask to in situ. The former means extending the time window when the emissions are assumed to be invariant. This is justified for bulk averages but potentially important trends in emissions are masked and may lead to poor results if there have been significant changes in the emission map within the extended time window. The use of multiple stations is ideal as it not only increases the total number of observations but provides detailed knowledge of emissions in a different area and helps the inversion better triangulate the sources of release. However, care needs to be exercised to ensure that the observations are inter-comparable . The observations need to be reported on the same calibration scale but it is also important to ensure that when two instruments analyse the same air they produce the same results, or, if they do not, how can they be corrected so that they do. If the observations are not inter-comparable the inversion model will estimate inaccurate emission maps. For example, consider two stations where one is reading high (station 2) when compared with the other (station 1). The inversion model will need to bolster the emissions that would have been estimated if only station 1 were used and the easiest place for the model to do this is where the station 2 has the greatest influence and station 1 the least, i.e. in close proximity to station 2. Therefore, the impact of the lack of inter-comparability would be enhanced estimated emissions close to the station reading high and reduced emissions close to the station reading low.
It should also be recognized that certain observations will be more difficult to model than others. As discussed above with respect to mountain flows, there may be certain times of the day  or in certain conditions, i.e. when the wind speeds are light or the BL is low , when the meteorology is more difficult to model, or certain times when the observations have greater uncertainty. It may, therefore, be prudent to remove these observations from the dataset used by the inversion method so as to prevent them from distorting the resulting emission map. However, this will always be an arbitrary decision and would need to be tested to ensure that the inversion solution is not strongly dependent on the selection algorithm used.
The uncertainties in inversion model estimates in regional domains arise for a combination of reasons. There are uncertainties in the observations, in the modelled meteorology, in the transport and dilution models and in the global baseline estimates, whether calculated globally or from the regional observations. The inversion models introduce uncertainties when sources are intermittent or if the sources are sub-grid scale and local to the measurement point. The choice of the statistical measure used will introduce some uncertainty as will applying a priori constraints to the inversion solution. Also the uncertainty will vary depending on the gas considered; for example, CO2 has a complex natural source/sink component that varies diurnally, whereas HFC-134a, a purely anthropogenic gas with no surface sinks, does not.
It is difficult to assess the total uncertainty but it is vital to try and do so. One method is to randomly perturb the observations or randomly sub-select observations to use or, when multiple stations are used, removing all the observations from a station. Solving the inversion multiple times in this way tests the robustness of the solution and highlights whether individual observations are having a disproportionate effect on the solution [6,12]. Repeating the inversion with a different set of NWP meteorology would be a very useful exercise but is generally computationally or practically prohibitive as dispersion models are usually linked to a specific meteorological dataset. A good test of the inversion method is to perform what are called pseudo-observation tests . This is where the transport model is used to generate a pseudo-observation time series from a given emission map and then the inversion method is invoked to try and recreate it. Obviously as the same meteorology and transport model are used to derive the observations as are used to create the dilution matrix in the inversion model there is a unique known solution. Performing these same tests but with the addition of random noise gives information on the robustness of the inversion method.
3. Example method and results
This section describes briefly an inversion method that has been developed and how some of the issues discussed have been addressed. The example relates to the long-lived GHG HFC-134a that is now commonly used in mobile air-conditioning units, e.g. in cars. The inversion method does not use the global modelling baseline approach and requires no a priori estimate.
The inversion method, as described in Manning et al.  and O’Doherty et al. [9,10], has been developed at the UK Met Office to estimate, primarily, the UK emissions of a wide range of GHGs and ozone-depleting gases from observations at Mace Head. It uses the Numerical Atmospheric dispersion Modelling Environment (NAME) Lagrangian atmospheric dispersion model  to derive the dilution matrix that links the emissions to the observation point. It is a two stage process: the first stage is to estimate the mid-latitude Northern Hemisphere baseline concentration that enters the regional domain, and the second stage is to use the observed perturbations above this baseline to estimate regional emissions.
The baseline concentration is estimated by selecting only those observations that are taken when the air has come from the mid- to North Atlantic and when the impact of sources local to Mace Head can be assumed negligible, i.e. in moderate to high wind speeds. After statistical analysis to remove outliers, a quadratic function is fitted to the baseline observations in a running 40 day window, resulting in a smoothed baseline time series. Figure 3 shows the annual baseline time series for HFC-134a at Mace Head from 1994 to 2009. The rapid rise in concentration reflects the increased use of this gas in developed countries in mobile air-conditioning systems following the phase-out of CFCs as dictated by the Montreal Protocol.
The time series of baseline concentration is subtracted from the high-frequency observations at Mace Head to derive an observational time series that reflects only the impact of recent (within 12 days) emissions on the regional scale (now referred to simply as the observations). Observations that are below zero are fixed to zero; this is an artefact of defining the baseline in this way and arise, for example, when low concentration air from equatorial latitudes is observed at Mace Head. A global model would not need to do this as the whole atmosphere is modelled; however, the global model baseline may not be any more accurate at a given station as it is not derived from those observations. Observations that could be heavily influenced by emissions very local to the measurement station are excluded from the dataset. The inversion system is set up to solve in 3 year windows that move month by month through the entire observational dataset. For each time window, an emission grid is calculated which tries to ensure that each grid cell significantly contributes to the observational record. It does this by gradually combining grid cells together as the distance from the measurement point increases (figure 4). The statistical score used combines correlation coefficient, normalized mean square error, fraction within a factor of two and the fraction of model values within the noise of the observations. The noise of the observations is defined as the standard deviation of the observations classed as baseline about the smoothed baseline time series (0.4 ppt).
Using a mix of different statistical measures increases the robustness of the solution to different groupings of observations and has proved the best when the inversion method has been tested on pseudo-observations .
Each 3 year period is solved multiple times (26), each time a different random noise time series has been applied to the observations. The noise is derived from a lognormal distribution (mean 1, standard deviation 0.08 ppt) and is applied multiplicatively to the observations. Annual emission estimates for northwest Europe (Ireland, UK, France, Germany, Belgium, The Netherlands, Luxembourg, Denmark) are shown in figure 5 along with the values reported to the UNFCCC (2009 submissions, http://unfccc.int, and UK submissions from personal communication with AEA). Each 3 year period that completely extends across a calendar year is used to estimate the 5th, 25th, median, 75th and 95th percentile estimates for that year. Note that the inversion domain is significantly larger than the actual area of interest, this is deliberate. The grid cells towards the edge of the domain will be more strongly impacted by errors in the baseline concentrations, and therefore excluding them minimizes this error. If the defined baseline is too low, for example, the air entering the eastern edge of the domain is higher than baseline, the grid cells on that border of the domain will be artificially inflated to correct for this.
The inversion model predicts the same increasing trend as the inventory but the magnitude it estimates is consistently lower. This may point to a slower leakage rate from installed systems than that assumed in the inventory emission model and should warrant further investigation.
This paper discusses some of the principal issues that need to be addressed as part of an atmospheric inversion modelling study. The issues range from how the inversion method assesses quality of fit, the transport model used to describe the dilution from source to receptor to the quality and location of the measurement station. The uncertainty in the inversion solution is difficult to quantify but it is important that an attempt is made to understand its magnitude.
Inversion modelling can provide valuable information to enable inventory estimates to be independently assessed. Although the uncertainties in the inversion estimates can be significant, large differences in magnitude or trend compared with the inventory should concern inventory collators and make them re-examine their assumptions, emission factors and activity data.
Simon O’Doherty and Peter Simmonds at the University of Bristol are thanked for the permission to use the Mace Head observations. UK Department of Energy and Climate Change (DECC) is acknowledged for its support for this work (contract GA 0201). AEA is acknowledged for the provision of the UK inventory emission totals for HFC-134a.
One contribution of 17 to a Discussion Meeting Issue ‘Greenhouse gases in the Earth system: setting the agenda to 2030’.
- This journal is © 2011 The Royal Society