## Abstract

Kohn–Sham density functional theory is in principle an exact formulation of quantum mechanical electronic structure theory, but in practice we have to rely on approximate exchange–correlation (xc) functionals. The objective of our work has been to design an xc functional with broad accuracy across as wide an expanse of chemistry and physics as possible, leading—as a long-range goal—to a functional with good accuracy for all problems, i.e. a universal functional. To guide our path towards that goal and to measure our progress, we have developed—building on earlier work of our group—a set of databases of reference data for a variety of energetic and structural properties in chemistry and physics. These databases include energies of molecular processes, such as atomization, complexation, proton addition and ionization; they also include molecular geometries and solid-state lattice constants, chemical reaction barrier heights, and cohesive energies and band gaps of solids. For this paper, we gather many of these databases into four comprehensive databases, two with 384 energetic data for chemistry and solid-state physics and another two with 68 structural data for chemistry and solid-state physics, and we test two wave function methods and 77 density functionals (12 Minnesota meta functionals and 65 others) in a consistent way across this same broad set of data. We especially highlight the Minnesota density functionals, but the results have broader implications in that one may see the successes and failures of many kinds of density functionals when they are all applied to the same data. Therefore, the results provide a status report on the quest for a universal functional.

## 1. Introduction

Density functional theory (DFT) [1] has enabled electronic structure theory to be applied to materials and complex chemical problems with an accuracy unobtainable by any other approach. This makes DFT useful for modelling, prediction, design and understanding. Mainstream DFT applications are based on Kohn–Sham theory [2] and its various generalizations, especially the spin-polarized formulation [3] and the generalization to treat a portion of the exchange energy by Hartree–Fock (HF) theory [4]. Kohn–Sham DFT is formally exact, but it involves a functional called the exchange–correlation (xc) functional—or, for simplicity, the density functional—and the exact form of this functional is unknown and essentially unknowable. The situation is well summarized by the following quotation:
DFT is the method of choice for first principles quantum chemical calculations of the electronic structure and properties of many molecular and solid systems. With the exact exchange–correlation functional, DFT would take into full account all complex many-body effects at a computation cost characteristic of mean field approximations. Unfortunately, the exact exchange–correlation functional is unknown, making it essential to pursue the quest of finding more accurate and reliable functionals.

[5], p. 8495

This article is concerned with practical approximations to the exact xc functional—such density functional approximations are usually just called density functionals, and we use that terminology here. A considerable amount of research, mainly over the last 30 years, has gone into developing better density functionals, and the best available functional for one application is often not the best for another. So it is of great practical importance to learn which density functionals perform well for which applications and to attempt to design density functionals that are as universally successful as possible. This has been a goal of our recent work, and this article provides one way to see how much progress we and others have made. In particular, this article involves a test of 12 Minnesota meta functionals and 65 other xc functionals against a common database of 451 data representing molecular energetic and structural data and solid-state energetic and structural data.

## 2. Approximations to the exchange–correlation functional

This section introduces the most common types of approximations used in modern Kohn–Sham DFT calculations. Kohn–Sham theory involves parametrizing the electron density by a Slater determinant, which enforces the Pauli exclusion principle and allows one to calculate the so-called non-interacting kinetic energy from the orbitals of the Slater determinant as if it were a wave function. The Kohn–Sham orbitals satisfy a set of coupled differential equations similar to the HF equations of wave function theory (WFT) but containing the xc potential instead of the HF exchange potential. The xc potential approximates the exchange and adds electron correlation and the electron kinetic energy beyond the non-interacting part.

Kohn–Sham theory writes the energy as a sum of (i) the non-interacting kinetic energy, (ii) the classical Coulomb energy of the electrons, consisting of their interaction with the nuclei and any external field that may be present (in the rest of this article we assume no such external field is present) and their self-Coulomb energy, and (iii) an unknown functional already mentioned, called the xc functional. We begin by following the usual convention of writing the xc functional *E*_{xc} as the sum of an exchange contribution and a correlation contribution,
2.1Each term in equation (2.1) can be a functional of the total electron density *ρ* or—in a spin-polarized case—of the two spin densities *ρ*_{σ}, namely *ρ*_{α} for up-spin electrons and *ρ*_{β} for down-spin electrons, where the total density is
2.2The density functional can also be a functional of several other quantities; in particular, we consider functionals that depend on one or more of the following additional variables:

— the local dimensionless reduced spin-density gradients;

— the local spin-labelled kinetic energy densities

*τ*_{σ}; and— the HF energy density.

We emphasize that here—and usually when we mention HF exchange—we are referring to the HF method for calculating the exchange energy from the Kohn–Sham Slater determinant, not to the actual HF exchange energy, which is what one would get from the HF Slater determinant.

Note that the dependency on *τ*_{σ} or the HF energy density yields an orbital-dependent functional that depends on the occupied Kohn–Sham orbitals. Such a functional is still a density functional because the Kohn–Sham orbitals are functionals of the density [6]. In this article, we do not consider density functionals that depend on unoccupied orbitals (such functionals, sometimes called doubly hybrid functionals [7,8], are potentially more accurate than the functionals considered here, but they are also more expensive).

It is useful to provide definitions of the dimensionless reduced spin-density gradients and of the local spin-labelled kinetic energy densities. But we note that different authors define these quantities in different ways. In particular, two main conventions are commonly used: one is usually used by Perdew, 2.3 2.4and the other is usually used by Becke, 2.5 2.6These conventions differ only by numerical factors; therefore, they only involve differences in how the formulae are written and how the axes are scaled in plots. However, it is important to be aware of these differences, especially when functionals are analysed or implemented into software. We recommend using different symbols for the two sets of definitions, so it will be clear which is being used.

Before proceeding we comment further on the partition of the xc functional into two terms in equation (2.1). This is very convenient and sometimes useful for understanding the structure of the unknown exact functional, but it is somewhat arbitrary because no physical result depends on *E*_{x} or *E*_{c} separately—only the sum has physical meaning [9,10]. Furthermore, the language is different from the familiar language of WFT. In particular, *E*_{c} includes only what is called dynamical correlation in WFT, and *E*_{x} includes both exchange and left–right correlation [6,11]. Some workers use scaling criteria [12] to distinguish ‘exchange’ from ‘correlation’. However, approximate density functionals include some exchange effects in *E*_{c} and some correlations in *E*_{x} simply because the approximation is imperfect. Having recognized this we will use *E*_{x} and *E*_{c} simply to indicate the motivations for the functional forms without repeating the warning that our partition might not agree with that preferred by others.

It should be noted that essentially all xc functionals have empirical elements. Some have fitted parameters, some have parameters inherited from previous work, some involve an experience-based selection of constraints or functional forms and many have more than one of the various kinds of empiricism; and parameters are fitted to both experimental data and fundamental *ab initio* data. There is no unique way to count empirical elements, so we do not include such counts. Furthermore, we think that the key issue is the choice of functional form, not the number of parameters. (For example, is a functional with a sixth-order polynomial in the density gradient and an empirical amount of HF exchange less empirical than a functional with an eighth-order polynomial in the density gradient and no HF exchange parameter?) In the literature, the success of functionals based on fitting a large number of parameters is sometimes attributed solely to that very fact, as if a large number of parameters is sufficient to obtain good results. This represents a fundamental misunderstanding. For a given functional form that does not contain the needed dependence on the critical values of the density, density gradient or other variables, one cannot fit a broad set of data even if one increases the number of parameters to be arbitrarily large. Therefore, a key issue is the choice of functional form, and that is what we discuss next.

### (a) Local approximations: local spin-density approximation, generalized gradient approximation, non-separable gradient approximation and meta-generalized gradient approximation

Because energy densities that depend only on the spin densities, their gradients and the spin-labelled kinetic energy densities depend only on the local values of these variables at a given point in space they are called local functionals. The oldest density functionals, dating back to the original Kohn–Sham article [2] with roots in the pre-DFT literature [13–16], are local and based on the uniform electron gas (UEG), which is a fictional system with a constant electron density generated by a smeared out positive background charge (not nuclei). Such approximations are sometimes called UEG approximations or free electron gas approximations, but more often they are called the local density approximation (LDA, when one is discussing only closed-shell singlets) or called the local spin-density approximation (LSDA, in the general case) to denote that they depend only on *ρ*_{σ}, not on *s*_{σ}, *τ*_{σ} or the HF exchange density. (The reader should not confuse the ‘local density’ functionals that depend only on the local densities, called LDA or LSDA, with the more general ‘local’ functionals (sometimes called ‘semilocal’ functionals, especially in the physics literature) that also depend on other local variables. The reader should also note that in the early literature functionals that are now called ‘local’ or ‘semilocal’ were sometimes called ‘non-local’; a confusing practice that now seems to have disappeared.)

The exchange energy of the UEG for a spin-polarized system has a simple mathematical expression (see appendix A for a description of the differences between the spin-polarized and spin-unpolarized cases) [2,15], 2.7Equation (2.7) may be called the Gáspár–Kohn–Sham (GKS) approximation for exchange (we warn the reader that ‘GKS’ should not be confused with other uses of the same acronym, such as that stand for generalized Kohn–Sham).

The correlation energy of the UEG is usually based on parametrizations of quantum Monte Carlo calculations. UEG correlation functionals include the work of Vosko, Wilk and Nusair (VWN) [17] with both their formulae 3 and 5 being used in current software. More recent work is that of Perdew and Wang in 1991 and 1992 [18], which is denoted PW92 when one is discussing UEG correlation. The differences in the mathematical forms of these parametrizations are sometimes significant, but the numerical results that they provide are very similar in most cases. A review of the UEG correlation energy is available [19] and further theoretical progress has also been achieved [20].

Early attempts to go beyond the UEG involved gradient expansions (i.e. a power series in a suitable function of the density gradient); however, it was noted early on [21] that ‘Opinion about the usefulness of including the lowest gradient correction in *E*_{xc}[*ρ*] in real condensed matter systems (in which the density gradients are *not* small) is divided’. A brief summary of the unsatisfactory properties of the truncated gradient approximation is available [22]. In the light of this situation, a more general approach was therefore widely adopted in which density functional approximations to the exchange energy (*E*_{x}) beyond LSDA are written as functionals of the spin-density distributions, *ρ*_{σ}, and their reduced gradients, *s*_{σ}, as
2.8but the leading terms in the expansion are not required to be correct. In generalized gradient approximations (GGAs) for exchange [23], *Γ*_{xσ} is factored by a scaling argument into
2.9where *G*_{xσ} is usually taken as the uniform-electron-gas limit of equation (2.7). If we set *F*_{σ=1}, we get back the LSDA; therefore, *F*_{xσ} (which is usually greater than unity) is usually called the GGA enhancement factor, which by definition depends only on the dimensionless reduced gradient. Various approximations have been proposed for the enhancement factor; Langreth & Mehl [24], Perdew & Wang [23] and Becke [25] carried out early work in this area. Resulting popular GGAs include the asymptotically correct exchange functional of Becke (B88) [26], the constraint-selection-based PBE functional of Perdew and co-workers [27], whose exchange portion is very similar to the earlier empirical X*αβγ* functional of Becke, here called B86, and the optimized exchange functional of Handy and Cohen (OptX) [28]. GGAs also involve correlation functionals depending on *ρ*_{σ} and *s*_{σ} (for example, combining B86, B88 or OptX with the LYP GGA [29] for correlation yields, respectively, the B86LYP, BLYP and OLYP functionals), and they often, but not always, tend to the UEG limit when *s*_{σ} goes to zero. Our recent second-order GGA functionals SOGGA [30] and SOGGA11 [31] tend to the UEG limit in the correct way not just at *s*_{σ}=0, but also to the analytically known leading term in an expansion in *s*_{σ} for a nearly UEG. This term is second order in *s*_{σ}.

A major drawback of the GGA functional form is that it cannot satisfy all the theoretical constraints of the exact functional, and it has been found in practice that a single GGA cannot provide good accuracy for all databases of interest. One of the most significant consequences of this fact is that no GGA in the form of equation (2.9) has been able to provide good atomization energies for molecules and also good performance for lattice parameters of solids. An alternative (slightly weaker) statement of this fact is that ‘No single GGA can describe with high accuracy the properties of both solids (surface energies and lattice constants) and molecules (total and atomization energies)’ [32]. Theoretical considerations and results with simple functional forms suggested [25–35] that a GGA that provides good atomization energies needs a second-order coefficient in the exchange functional (i.e. the coefficient of in a Taylor series for *F*_{xσ}) that is about twice as large as the exact coefficient calculated from the gradient expansion in the slowly varying limit. However, using the correct value for this second-order coefficient was found to be one of the key ingredients necessary to obtain correct lattice parameters in solids, and this led to the creation of functionals, for example PBEsol [33], that intentionally sacrifice performance for molecular energies to empirically improve the lattice constants by enforcing the second-order constraint on exchange. One must, however, be cautious about this kind of reasoning because it has often been based on studies with very restricted functional forms for GGAs. If a GGA has a limited functional form with only a few parameters, changing the functional form to change any one property or to fit any one constraint makes a global change in the functional, and one cannot be sure about which aspect of the resulting change in the functional is most responsible for the observed change in predictions. For example, the very flexible SOGGA11 functional [31] recovers the exact second-order coefficients in both exchange [36] and correlation [37] and is parametrized for broad accuracy for chemistry [31]; however, its performance for lattice constants is poor [38], clearly showing the fact that there is more involved than simply changing the second-order coefficient.

Although various arguments can be used (though non-uniquely) to define exact exchange in the context of DFT [10,39], one can also argue that ‘exchange and correlation need not…be separated in DFT’ [40]. Progress beyond the conventional GGA, but with the same ingredients, was achieved recently by the introduction [41] of a new functional that makes use of the same ingredients as GGA functionals (which are the spin densities and their gradients), but uses a more general functional form in which an exchange-type term is not separable as is equation (2.9). This new kind of functional approximation includes both exchange and correlation in a non-separable way by a new functional type that has the form of a non-separable gradient enhancement of UEG exchange; it also includes a more conventional correlation term. Functionals built this way are called non-separable gradient approximations (NGAs), and the first specific realization is called N12 [41].

To go beyond the GGA and the NGA, one needs to add more ingredients to the functional form, and the most popular way to do so has been to add either the second derivatives (Laplacian) of the spin densities or the spin-labelled non-interacting kinetic energy densities. Functionals incorporating either of these ingredients receive the label meta. Numerical instabilities linked to the use of the Laplacian in functional approximations, however, are one reason that made the non-interacting kinetic energy density the preferred ingredient for such approximations, which are usually built on GGAs and called meta-GGA functionals. We have also built a meta functional on an NGA; the resulting meta-NGA is discussed in §2*c*.

### (b) Hybrid functionals and range separation

As mentioned above, the Kohn–Sham equations include the Coulomb interaction of the electron density with the nuclei, and they also include the classical Coulomb self-energy of the electronic charge distribution. The classical self-interaction energy is non-local because the energy density at a given point involves an integration over all space. The exact xc energy must also be non-local in order to correct the classical approximation in the self-energy [42]. Hybrid GGAs replace a percentage of the local exchange by HF exchange. The motivation for this is the nature of the error in the classical self-energy, namely that the classical approximation includes the interaction of the entire electron distribution with itself by Coulomb's law without recognizing that the parts of an electron distribution corresponding to the same electron do not interact with one another. In DFT, this spurious self-interaction must be removed by the xc functional. A local functional cannot completely remove this self-interaction so the unknown exact functional must be non-local. In WFT, this spurious self-interaction is removed by antisymmetrization, and for a single Slater determinant, this results in the HF exchange energy. An energy density that depends on HF exchange can therefore provide a better approximate functional by removing some of the self-interaction.

Hybrid functionals started to become very popular after the broad success of the B3PW91 [26,43] functional and its even more successful close cousin B3LYP [26,29,44]. Following the most recent developments in density functionals, hybrid functionals that have a constant percentage of HF exchange (‘constant’ meaning independent of density, density gradient, interelectronic distance or position in space) are now called global hybrid functionals. Although more advanced and accurate functionals have been developed throughout the years, global hybrid GGAs still remain among the most used functionals for chemical applications, probably because of their broad availability in popular quantum chemistry computer programs and their well-earned user familiarity. As we recently noted [45]: ‘the B3LYP global hybrid GGA functional is still the most popular functional in most areas of quantum chemistry, despite its known shortcomings. In part, this is true because DFT is now widely used by nonspecialists, and the early successes of B3LYP gave it a good reputation and made it hard to displace even with better performing global hybrid GGAs’. Not only are there better performing global hybrid GGAs, but also hybrid meta-GGAs of various types that perform even better. Nevertheless, for reasons of simplicity, one may sometimes prefer a global hybrid GGA, and we have recently optimized such a functional against a broad range of chemical properties; the resulting functional is called SOGGA11-X [45].

A more recent development in the creation of new hybrid functionals is represented by range separation [46,47]; the basic idea of this approach is that the interelectronic Coulomb operator can be split into a short-range (SR) part and a long-range (LR) part. This effort is usually achieved by using an operator such as [46]
2.10where *r*_{12} is the interelectronic distance, and the error function is used because it allows a simple calculation of the integrals. However, in principle, any other separation can be used, and some other functions—for example, the Yukawa potential—although less common—have been employed, leading to similar results [48,49,50].

The most popular kind of range-separated hybrids are called long-range-corrected (LC) functionals [46,47,51,52]; they use 100% HF exchange in the LR limit and a smaller value, usually between 0% and 50%, in the SR limit. In the LR limit, 100% HF exchange is used to compensate part of the self-interaction error of DFT, because HF leads to an effective potential that has the correct asymptotic behaviour. A closely related range separation strategy is that employed by the CAM-B3LYP functional [53], which has 19% non-local exchange in the SR limit and 65% in the LR limit.

The opposite approach to LC is used in the so-called screened-exchange hybrid functionals, such as the HSE06 functional (sometimes called HSE) [54,55] and our recent N12-SX [56] and MN12-SX [56] functionals. This approach uses a finite amount of HF exchange at SR, but none in the LR limit, in order to cut the computational cost of non-local exchange integrals for extended systems, while at the same time—in principle—retaining the good performance features of global hybrid functionals for most chemical properties. The second possible reason to use screened exchange is the argument that dielectric and correlation effects screen LR exchange. We note that the savings in computer time when compared with hybrid functionals that do not screen the exchange depend strongly on the software, but in some cases ‘it reduces significantly’ [57], for example, owing to reducing the number of *k* points needed for calculations with periodic boundary conditions [58]. Nevertheless, ‘Despite extensive efforts towards computationally efficient implementations, HSE is still rather more expensive than’ local functionals [59].

Range separation can also be used to mix two local functionals so the percentage of HF exchange is zero for both SR and LR [60]. Note that a range-separated local functional is not separable in the sense of equation (2.9). Thus, combining two meta-GGAs via range separation yields a meta-NGA.

A word is in order here about HF exchange and multi-reference systems. Multi-reference systems are most easily defined in the language of WFT. Multi-reference systems are those that cannot, even to zero order, be well described by a wave function in the form of a single Slater determinant, which is the HF approximation. Prominent examples include diradicals and systems with stretched bonds. The variational energy lowering (when compared with that obtained with a single Slater determinant) when one uses the smallest number of determinants that provide a good zero-order description is sometimes called static correlation energy or—in some cases—left–right correlation energy (although one should be careful to note that mixing two or more determinants corresponding to different electron configurations in WFT inevitably brings in some dynamical correlation as well). Thus, static correlation energy is a special type of error in the HF approximation. Hybrid functionals, by using HF exchange, build in this error, whereas local functionals need not have this kind of error, and in fact it is an advantage of DFT that local exchange brings in some left–right correlation energy without the expense of a multi-configurational wave function [11]. But, as mentioned above, a local functional cannot completely remove the spurious self-interaction of the classical electron repulsion in the Kohn–Sham equations. Thus, in practice, there is always a trade-off, with zero or low HF exchange leading to a smaller static correlation error, but higher self-interaction error, and high HF exchange having higher static correlation error and lower self-interaction error.

### (c) The Minnesota meta functionals

Several meta functionals developed in Minnesota in 2005 and later have been given names of the form M*yz* or M*yz*-suffix, where *yz* denotes the year 20*yz* and M denotes Minnesota or meta. Such functionals have been called Minnesota functionals, and they are our functionals with the broadest accuracy. When we develop functionals that do not include a dependence on kinetic energy density, which we consider to be an important ingredient in a broadly applicable density functional, it is usually to demonstrate what can be done with a restricted set of ingredients, which is a demonstration that is of interest both for fundamental understanding and for ease of implementation, but nevertheless we cannot expect the highest possible accuracy if we forego dependence on kinetic energy density. The Minnesota functional family consists of one meta-GGA (M06-L), two meta-NGAs (M11-L and MN12-L), seven global hybrid meta-GGAs (M05, M05-2X, M06-HF, M06, M06-2X, M08-HX and M08-SO), one range-separated hybrid meta-GGA (M11) and one screened-exchange hybrid meta-NGA (MN12-SX).

These functionals are all parametrized against a broad range of chemical data. The success (or the lack of success) of a given one of these functionals depends on the design of (or choice of) an appropriate functional form as well as the parametrization strategy (what data to use, how to weight the various items of data, and how to optimize linear and nonlinear parameters). In some cases, the desired behaviour can be designed in, whereas in other cases the ‘results show how a flexible functional form can lead to the “discovery” of a desired behavior of a functional’ [61]. Next, we explain the Minnesota functionals in chronological order.

—

*M05 family [62,63]*. The first meta-GGA functional to be named Minnesota functional dates back to 2005, when we first used a flexible functional form to optimize a meta-GGA functional, called M05 [62], on a large number of databases representing important chemical properties. We found that the M05 functional was able to give better balanced accuracy for chemical reaction barrier heights and bond energies in molecules containing metals than any previously available functionals were. We also optimized a related functional, called M05-2X [63], with a percentage*X*of HF exchange that is twice as high. As far as energetic properties are concerned, M05-2X was found to perform much worse than M05 for many systems containing transition metals, but is better for almost all other systems, the exception being systems with high multi-reference character; however, the increase of*X*often makes geometries and vibrations slightly worse. The conclusions about applicability to multi-reference systems can be understood in part from the discussion above of the static correlation error brought in by HF exchange, and they will be re-examined in this survey using new databases that are more extended and reorganized than those used in the original M05 [62] and M05-2X [63] publications.—

*M06 family [64–66]*. The M06 family of functionals is composed of four functionals that have similar functional forms for the DFT part, but each has parameters optimized to be used with a different percentage of HF exchange. The four functionals are: M06-L [64], a local functional (no HF exchange); M06 [65], a global hybrid meta-GGA with 27% of HF exchange, leading to a well-balanced functional for overall good performance for chemistry; M06-2X [65], a global hybrid meta-GGA with 54% HF exchange, for top-level across-the-board performances in all areas of chemistry including thermochemistry and reaction kinetics, but excluding multi-reference systems, such as many systems containing transition metals; and M06-HF [66], a global hybrid meta-GGA with 100% HF exchange, suitable for the calculation of spectroscopic properties of charge-transfer transitions, where elimination of self-interaction error is of paramount importance. Although it was formerly believed that one could not design a generally useful functional with 100% HF exchange, M06-HF disproved this by achieving overall performances for chemistry better than the popular B3LYP functional did.—

*M08 family [67]*. The success of the M06 family of functionals is largely documented in previous studies and is confirmed by their popularity. Noting that we were very close to the limit that the M05/M06 functional form could provide, in 2008, we made an attempt to improve it [67]. This led to a new global hybrid meta-GGA functional called M08-HX, where HX stands for*H*igh HF e*X*change; it uses 52.23% of non-local exchange. M08-HX is an improved version of the previous high HF exchange Minnesota functionals M05-2X and M06-2X. The M08-HX functional performs similar to the M06-2X functional, but it uses a much cleaner functional form for both the exchange and the correlation. This functional form later became the base for the future Minnesota functionals, such as the M11 family. Following the development of the SOGGA functional, we also introduced a hybrid meta-GGA functional, called M08-SO, that is correct through second order in both the exchange and the correlation by enforcing the second-order constraint on the M08 functional form. M08-SO is a high-exchange functional (with 56.79% of HF exchange) and its performance is very similar for many (but not all) properties to that of M08-HX. The M08-SO functional was the first Minnesota family functional to be correct through second order. (SOGGA is not called a Minnesota functional because it does not depend on the kinetic energy density.) Despite having improved the functional form—especially in the terms that depend on the kinetic energy density—and having eliminated a problematic term from the VS98 exchange to reduce the grid sensitivity, both M08 functionals still require ultrafine quality integration grids, mainly because of the sensitivity coming from the correlation part. The grid sensitivity and the related occasional convergence problems within self-consistent-field (SCF) calculations are present—to different extents—in all meta-GGA functionals, including the M11 family discussed below and including meta-GGAs from other groups. This represents an open problem in DFT development, still to be solved. The grid sensitivity should not, however, be overemphasized. It has not usually been a hindrance to using the functionals for practical work, although the need for a finer grid can slightly raise the cost of a calculation.—

*M11 family [60,68]*. In the recent 2011 generation of Minnesota functionals, we introduced the range-separation feature into our already very successful functional form. The M11 functional [68] is entirely based on the M08 meta-GGA functional form for the density functional part, but it uses the LR correction scheme in the way employed by Chai & Head-Gordon [69] for a GGA, with the percentage of HF exchange in M11 varying from 42.8% at SR to 100% at LR. This means that the percentage of HF exchange in M11 is always greater than or equal to 42.8%, and this puts it in the class of the high-*X*functionals (such as M05-2X, M06-2X, M08-HX and M08-SO), but the fact that it has 100% of HF exchange in the LR provides it with the chief advantage of M06-HF. Range separation in the M11-L functional [60] was introduced at the local level only, by introducing a new strategy called dual-range exchange. The dual-range exchange strategy of M11-L uses two different local functionals, one for SR and one for LR, and each local functional is a flexible meta-GGA. The meta-GGA forms are parametrized using databases of accurate reference data, as has been done for all Minnesota functionals. The main differences between the dual-range approach (as in M11-L), the LC approach (as in M11) and the screened-exchange approach (as in HSE06) are schematically illustrated in figure 1. The performance of the M11-L functional is much better than its predecessor, M06-L, for many classes of interaction, and this often makes M11-L a potential substitute even for the M05 and M06 hybrid functionals, avoiding the computational cost of the expensive HF exchange. In summary, in the M11 family, M11-L in principle replaces M06-L, M11 in principle replaces M06-2X, M08-HX, M08-SO and M06-HF, and M06 is in principle replaced by either M11 or M11-L, thereby bringing us closer to a single universal functional, which is still an unmet challenge for long-term research. This is illustrated in the genealogy tree of figure 2. We note though that the ‘in principle’ qualification should not be taken lightly; at this stage, there is more experience using the older functionals, and one should be cautious when switching to the less extensively vetted newer ones. This caution is particularly true when it comes to transition metals and other metals. Although we have placed more emphasis in recent work on systems containing metals and systems with multi-reference character, our training sets are still not good enough in this regard. This is especially true because there is more than one kind of multi-reference character, and we are still sorting out the issues involved in designing density functionals that can treat metal-containing compounds and systems with multi-reference character reliably.—

*MN12 family [56,70]*. The MN12 family includes MN12-L [70] and MN12-SX [56]. These functionals both build on the N12 non-separable functional form discussed above, with MN12-L adding kinetic energy density and MN12-SX adding both kinetic energy density and screened exchange, which is also explained above. The MN12-L functional is particularly accurate for atomization energies, ionization potentials, barrier heights, non-covalent interactions, isomerization energies of large molecules, and solid-state lattice constants and cohesive energies [70]. MN12-SX is better than MN12-L for main-group atomization energies, electron affinities, proton affinities, alkyl bond dissociation energies, hydrocarbon energetics,*π*-system thermochemistry, barrier heights of all kinds, non-covalent interaction energies, difficult cases, atomic energies, main-group non-hydrogenic bond lengths and semiconductor band gaps, and it has the best overall performance on the CE345 database of any functional we have considered, but we still require experience on a broad range of applications to say how useful it will be in the long run.

In classifying functionals, many workers use the Jacob's ladder scheme of classification, first introduced by Perdew & Schmidt [22]. The functional approximations described in §2*a* are all on rungs 1–3 of Jacob's ladder, hybrid functionals are on rung 4, M06-L, M11-L and MN12-L are on rung 3, and the other Minnesota functionals are on rung 4. The ladder scheme does not distinguish hybrid GGAs from hybrid meta-GGAs, nor conventional GGAs from an NGA, nor does it distinguish range-separated hybrids from global hybrids or dual-range local functionals from single-range ones; however, these differences are very important.

### (d) Functionals with non-local correlation

For exchange, the results presented in this article include both local (LSDA, GGA, NGA, meta-GGA and meta-NGA) approximations and non-local ones (various hybrid mixtures of HF exchange). One may also consider non-local correlation, and this may be done with or without considering unoccupied orbitals. In this article, though, we do not consider such functionals. Certainly they are more complicated, and usually they are much more expensive to compute. Furthermore, they are still being heavily vetted, and no firm conclusion has emerged. Some examples of functionals with non-local correlation are MC3BB [7], MCG3/MPWB [71], vdW-DF [72], TPSS/CCSD(T) [73], vdW-DF2 [74], sc-NEVPT2-srDFT [75], optB88-vdW [76], optB86b-vdW [77], VV10 [78], PWPB95-D3 [79] and LC-VV10 [80]. These functionals deserve a separate review.

### (e) Functionals including molecular mechanics

A large industry has developed for adding post-SCF molecular mechanics terms to various density functionals. The motivation is to add LR dispersion forces, which are missing in functionals with local (LSDA, GGA or meta-GGA) correlation. This work deserves some comments. For a start, we note that polar molecules have significant electrostatic and inductive forces, and the dispersion forces need not dominate the non-covalent attraction. Many density functionals predict reasonably accurate multipole moments and polarizabilities, so they do quite well for electrostatic and induction forces. Attention has therefore centred on dispersion forces. We note though that ‘dispersion’ is an ambiguous term. Its original usage was to refer to LR forces treated by perturbation theory of separated subsystems with negligible overlap; the formalism involved a sum over excited states, and the name invokes the Kronig–Kramers dispersion relations that characterize the spectrum of any molecule because some of the same quantities that appear in the perturbation expressions for LR forces also appear in those dispersion relations. What is not always clear when one discusses dispersion interactions is that the assumption of negligible overlap is by no means true at the minimum energy geometries of van der Waals molecules. At such configurations, as they are energy minima, the gradient of the repulsive forces has the same magnitude as that of the attractive force (they are equal and opposite). But, the repulsive forces between non-polar molecules and atoms are due to exchange repulsion, which results from an overlap of the orbitals on the two subsystems. When the overlap is present, as at van der Waals minima, we prefer to speak of dispersion-like forces or medium-range correlation energy, saving the terms ‘long range’ and ‘dispersion’ for the essentially overlapless region of interaction. Although functionals with local correlation (LSDA, GGA and meta-GGA) cannot predict the LR-induced, dipole-induced dipole dispersion forces, they *can* predict medium-range attractive non-covalent interaction at van der Waals distances where the overlap is not negligible. However, many available functionals underestimate these forces or even predict net repulsion at van der Waals distances. That is the motivation for adding molecular mechanics attraction that decays at LR as *R*^{−6}, where *R* is the distance between subsystems. An example of how molecular mechanics ‘dispersion’ cancels the overly repulsive nature of intermolecular interactions provided by the popular B3LYP functional has been provided by Acosta-Silva *et al*. [81].

One must modify the *R*^{−6} functional form at medium range for two reasons: (i) attractive terms of order of *R*^{−8} and *R*^{−10}, even R^{−7}, become significant at medium range, so *R*^{−6} underestimates the undamped force and (ii) the orbital overlap makes the multipole expansion leading to these powers invalid, and so the expansions must be damped to avoid overestimating the non-covalent attraction.

Molecular mechanics dispersion involves first a choice of functional form, then parameters or approximations to predict the coefficient of *R*^{−n} terms (*n*≥6) and their dependence on molecular geometry and bonding pattern, and finally parameters in the damping function(s). Because there is no unique definition of the dispersion-like contribution once the overlap becomes significant, and because the parametrization is making up for errors in the density functional without added terms, the parametrization cannot be validated independently of the non-dispersion-like interactions, and the parameters depend strongly on which underlying functional is involved. In most cases, the parametrization is made to be independent of charge state, oxidation state, partial atomic charges, hybridization and bonding pattern, which can be a serious approximation, and if *R*^{−n} terms are underdamped they can cause large errors in some cases. Grimme has studied these issues most carefully; he eventually became unsatisfied with his first two rounds of parametrization and all other molecular mechanics approaches to the dispersion problem, so he devised a set of ‘D3’ third-generation functional forms and parametrizations, which are the most complete attempt to minimize these difficulties [82]. Even there though, it was found that a special reparametrization was required to treat ionic surfaces [83], and then the D3 method was further improved by the D3(BJ) method [84]. Similar to D3 and D3(BJ), the Tkatchenko–Scheffler–van der Waals (TS-vdW) molecular mechanics formalism also involves a parametrized damping function which needs to be reparametrized for each functional [85], because the ‘dispersion’ terms include not only dispersion-like contributions to the interaction energy but also molecular mechanics corrections for systematic errors in the underlying xc functional.

Molecular mechanics treatments of dispersion-like interactions also have the disadvantage that, when one adds pairwise atom-centred dispersion terms, there is no accounting for whether these interactions are screened by other parts of the system that may lie between the two interacting atoms.

Despite these concerns, we do include two functionals with post-SCF molecular mechanics ‘dispersion’ terms in the tests reported here, but that is only a small subset of the many available parametrizations. Our own attitude is to prefer density functionals that predict the medium-range attractive non-covalent interactions as part of the density-dependent self-consistently used functional itself, and the Minnesota functionals as a group tend to do a better job of this than other available functionals with local correlation. Karton *et al*. [86] devised a damped dispersion correction and empirically determined a scale factor by which it should be multiplied when used with various underlying functions. For example, for B97-3, the scale factor is 0.90, indicating that only 10% of the dispersion-like interaction terms were included in the functional (on average), and for BLYP the factor was 1.20, indicating not only the absence of dispersion-like interactions but also a systematic overestimation of medium-range repulsive forces that need to be compensated by a factor greater than unity. The scale factors for some other functionals considered in this article were 1.10 for B3PW91 and HCTH407, 1.05 for B3LYP, 1.00 for TPSS, 0.90 for TPSSh and TPSS1KCIS, 0.75 for PBE, 0.70 for PBE0 and 0.765 for BMK. However, the required factor was only 0.50 for PW6B95 (a precursor of M05), 0.25 for M06, 0.20 for M06-L and 0.06 for M06-2X. In other words, M06-2X already includes on average 94% of the empirically needed dispersion-like interactions. Similarly, when Goerigk & Grimme [87] added dispersion-like terms to the M05-2X, M06-L, M06-HF, M06 and M06-2X functionals, their parameter values indicated a need for correction mainly at large *R*, not at medium *R*. The third example similar to this is in the parametrization of the TS-vdW model, where the ‘dispersion’ terms were damped at a larger distance for M06 and M06-L than for six other studied functionals [85] because the Minnesota functionals already included more attractive non-covalent interactions at van der Waals distances. The results presented below show that two local functionals developed in Minnesota (M11-L and MN12-L) are more accurate than M06-L for non-covalent complexation energies, and five non-local functionals developed in Minnesota (M06-2X, M08-HX, M08-SO, M11 and MN12-SX) are more accurate than M06 for non-covalent complexation energies. Undoubtedly, there are situations involving very large molecules or condensed phases where it would be helpful to add an LR correction onto any functional based on local correlation, but the effect is expected to be small in most cases for the Minnesota functionals, and we have not pursued that. An extensive review of WFT and DFT calculations on non-covalent interactions has been provided by Riley *et al*. [88].

## 3. Computational aspects

There are a large number of programs that can perform DFT calculations and the Minnesota functionals are included in several of them. All the results presented in this article were calculated using a locally modified version [89] of *Gaussian 09* [90]. The 2011 generation of Minnesota functionals is also implemented, at present, in the following programs: *Q-Chem* (as of v. 4.0) [91], GAMESS (as of release R1 2012) [92] and *NWChem* (as a patch to v. 6.1, and to be fully included in the next major release) [93]. The older Minnesota functionals are present in even more software, while the implementation of the 2012 generation is currently in progress; more details of the implementation of Minnesota functionals are given at http://comp.chem.umn.edu/info/DFT.htm.

Meta-GGA functionals are in general more sensitive to the integration grid than GGA functionals, and therefore they usually require a finer integration grid than the default of most popular programs. In this paper though, all calculations, whether with or without kinetic energy density, were performed using the ultrafine (‘99,590’) Lebedev grid of the *Gaussian 09* program [90]. In addition, we always allow symmetry breaking in the orbitals of the Slater determinant in order to converge to the stable broken-symmetry solution (through the *STABLE*=OPT *Gaussian* keyword).

Another computational issue is basis set superposition error. We do not include a counterpoise correction for such errors in this work. The reasons for this are that we have found that such corrections do not necessarily improve the accuracy and there is no generally accepted scheme for applying such corrections to all problems of interest, for example ternary interactions, transition state barrier heights or condensed-phase problems. Our goal is to evaluate methods that yield useful results without such corrections.

## 4. Databases

Many databases have been collected and used throughout the years by our group for the applications and development of DFT, and this effort is still an ongoing process with (especially) further work being carried out to develop more comprehensive databases for metal-ligand bond energies. The paper presenting the M06 and M06-2X functionals [65] summarized our most important databases as of 2006, and this article summarizes an important subset of our data as of 2012. In particular, we summarize the current status of many of our most widely used databases, including in a few cases new data added for the first time here.

Each of our databases represents one or another particular class of properties—such as atomization energies, reaction barriers, lattice constants, band gaps, etc. Each database is dubbed with an acronym representing the particular property considered (e.g. MGAEs for main-group atomization energies), followed by the number of data in the subset (e.g. 109 data), and eventually—if there has been more than one version—by the last two digits of the year of latest revision of the database (e.g. 11 for 2011); if the year is not specified, it means that the database is at its first generation and no revision of its data has ever been made (therefore, its year is that of the corresponding reference).

We have combined several of our databases into four comprehensive databases, representing energetic and structural properties for chemistry and physics. The first database is called CE345 (chemistry energetic database with 345 data); the second database is called PE39 (physics energetic database with 39 data); the third database is called CS20 (chemistry structural database with 20 data); finally, the fourth database is called PS47 (physics structural database with 47 data). These comprehensive databases are the main subject of this review, and we have made them available through a webpage called http://comp.chem.umn.edu/db. In the next sections, we use these databases to assess the performance of a large number of density functionals, including the Minnesota functionals.

All energetic data in the databases are Born–Oppenheimer potential energy differences without zero-point energy or thermal vibrational, rotational or translational contributions.

Although the comprehensive databases were put together by combining data from other databases, it is easiest to understand their structure by dissecting them rather than discussing their assembly. The comprehensive databases may be considered level 1 of a hierarchy. Each of these level 1 databases contains non-overlapping level 2 subdatabases, where ‘non-overlapping’ means that each datum appears in one and only one subdatabase. These level 2 databases will be called the primary databases in this article. A list of the primary databases, together with the reference or references for the latest version, can be found in table 1, while a graphical summary is available in figure 3.

Note that all comparisons with the CE345 reference data in table 1 are carried out by means of single-point calculations at specified geometries, which are taken (as indicated in each row of the table) from experiment, HF, B97-D or B3LYP calculations, Møller–Plesset second-order perturbation theory (MP2), quadratic configuration interaction with single and double excitations (QCISD) or multi-coefficient quadratic configuration interaction with single and double excitations, v. 3 (MC-QCISD/3). By contrast, the comparisons with reference data in the PE39 database are carried out with consistently optimized geometries.

We will also consider two other kinds of databases, which we call level 3 databases. Level 3 databases can be either subsets of level 2 databases (and in these cases we call them secondary databases) or databases that have data from more than one primary database but are not simply the union of two or more other databases (and in this case we call them analytical databases). Analytical databases are useful because we reorganized the data in order to allow a different kind of analysis of the performance of the density functionals.

The primary (level 2) databases are explained further in §4*a*, while the secondary and analytical (level 3) databases are explained further in §4*b*. Statistical conventions are explained in §4*c*.

Although our databases are extensive and broad, they do not include all features present in the previous assessments. Although DFT is increasingly being applied to excited-state molecular problems, this article is limited to ground states and band gaps, although we do consider ionization potentials and electron affinities. The accuracy of DFT for molecular electronic excitation energies is assessed elsewhere [105–114]. A 2009 review [115] included 56 DFT assessment and validation articles for transition metals alone; when compared with the sum of the data in those assessments, transition metals are represented here with limited data, although some care was taken to make those data representative. (As mentioned above, our databases in this area are currently undergoing expansion.) A very large database for main-group chemistry, complementary to the present one, is provided by the GMTKN30 database of Grimme, which is a collection of 30 subdatabases containing 841 relative energies and which has been thoroughly studied [87]. We have studied three of its subdatabases for reaction energies in subsequent work [116]. Mardirossian *et al*. [117] have also recently performed extensive systematic tests against multiple kinds of data. The extensive tests of density functionals by Rayne *et al*. [118,119] are also noteworthy.

We note that databases of heats of formation (for example, the G2/97 test set [120]) have been widely used for testing electronic structure theories, including DFT. Using such databases requires one to include vibrational zero-point effects (for the heat of formation at 0 K) and thermal vibrational energies (for the heat of formation at 298 K). This raises issues of vibrational anharmonicity and allows for the possibility that errors in estimated vibrational contributions add to or cancel the errors in electronic energies. We prefer instead to use vibration-exclusive energies (also called zero-point-exclusive energies), which are differences in Born–Oppenheimer potential energy surfaces, and usually we carry out energetic testing at fixed geometries so that all density functionals are evaluated at the same set of geometries. As mentioned in the fourth paragraph of this section, this is the approach adopted in this article. This means that, when the reference data are obtained from experiment, the vibrational and rotational effects are removed at the stage of developing the reference data.

### (a) The primary databases

The chemistry energetic database, CE345, has 15 non-overlapping subsets of different properties; these are the primary chemistry energetic databases. Calculations are performed at fixed geometries so that all methods are compared at the same set of predetermined geometries, which are presented in table 1. The primary subsets that compose the chemistry energetic database, each representing a relevant property, are explained below.

—

*Main-group atomization energies*(*MGAE109/11*). The MGAE is composed of 109 atomization energies. It was introduced as an expansion of Database/3 [121] and Database/4 [122] and was first used for DFT development in the test set of the M05 functional [63]. The database was most recently updated in 2011 [45] by using more accurate reference data from W4 [123], W4.2 [123], W4.3 [123] and W4.4 [124] calculations. Geometries for all molecules are obtained at the QCISD/MG3 level of theory [125], and we used the MG3S basis set for single-point energies [96].—

*Single-reference metal bond energies*(*SRMBE13*). This database and the next one for multi-reference bond energies were recently formulated [31] and expanded [60], based on previous work on systems containing metals [126,127]. The first four data in this database are the bond dissociation energies for diatomic molecules containing metals that have positive MP2 binding energies, in particular: Ag_{2}, CuAg, Cu_{2}and Zr_{2}. The next eight data are extracted from a previous database by choosing those that have B1 diagnostics [127] smaller than 10 kcal mol^{−1}, in order to include only data with single-reference character: AgH, CoH, Cr(CH_{3})^{+}, Cu(H_{2}O)^{+}, FeH, LiCl, LiO and V(CO)^{+}. The final datum was added recently [60] by adding the bond energy for AlCl [128]. We used the def2-TZVP basis set [129,130] for the SRMBE13 database. Bond energies for all molecules in this database are equilibrium ones (*D*_{e}), obtained from the experimental bond energy in the ground vibrational state (*D*_{0}).—

*Multi-reference bond energies*(*MRBE10*). This database contains 10 systems with high multi-reference character. Five data come from a previous database of multi-reference metal bond energies (MRMBE5, [31]): Cr_{2}, V_{2}, Ni(CH_{2})^{+}, Fe(CO)_{5}and VS. Bond energies for these molecules are bond energy at equilibrium (*D*_{e}), obtained from the experimental bond energy in the ground vibrational state (*D*_{0}), as for the previous database. The remaining five data are taken from Karton*et al*. [128] and do not involve metals. All these data have high multi-reference character according to the %[(T)] diagnostic [123] (they all also have B1 diagnostics [127] greater than 10 kcal mol^{−1}), and the detailed dissociation reactions are: B_{2}→2B, O_{3}→O_{2}+O, C_{2}→2C, S_{4}→2S_{2}and Cl_{2}O→Cl_{2}+O. Geometries for these five reactions are obtained at the QCISD/MG3 level of theory [125]. For the MRBE10 database, we used the def2-TZVP basis set [129,130].—

*Isomerization of large organic molecules*(*IsoL6/11*). This database was introduced in order to include larger molecules in the training and performance evaluation of density functionals, and it is based on a larger database from Grimme, called IsoL22 [131]. However, as some of the reference data in the original IsoL22 are questionable, we [94] recalculated the reference energy for six of the smallest molecules in Grimme's database by using the accurate CCSD(T)-F12a/aug-cc-pVDZ method [132–136], and collected the results in the IsoL6/11 set. Geometries for this set are taken from the original work of Grimme and co-workers [131] and are optimized at the B97-D/TZVP level [137,138]. For this database, we used the MG3SXP basis set [67].—

*Ionization potentials*(*IP21*). The ionization potential (IP) database [70] contains data from six main-group atoms (C, S, O, P, Si and Cl), seven transition metal atoms (Cr, Cu, Mo, Pd, Rh, Ru and Zn) and eight molecules (SH, Cl_{2}, OH, O_{2}, PH, PH_{2}, S_{2}and FeC). Calculations on molecules involve separately optimized geometries for neutral molecules and cations. The equilibrium bond length of FeC is obtained [97] from the experimental bond length [139] in the ground vibrational state, while geometries for all other molecules are obtained at the QCISD/MG3 [125] level of theory. We used the MG3S basis set [96] for the main-group atoms and all molecules except FeC, for which we used the SDD+2fg [140] basis for Fe and the def2-QZVPP basis [130] for C; we used the cc-pVTZ-DK basis set [141] for the transition metal atoms. The scalar relativistic effects are included in the calculations of the seven transition metal atomic IPs by using the Douglas–Kroll–Hess (DKH) second-order scalar relativistic Hamiltonian [142–144], while they are included in the calculations of FeC with the SDD relativistic effective core potential [140].—

*Electronic affinities*(*EA13/03*). The electronic affinities (EA) database [62,63,95,96] contains six main-group atoms (C, S, O, P, Si and Cl) and seven molecules (SH, Cl_{2}, OH, O_{2}, PH, PH_{2}and S_{2}). Reference data and geometries are obtained at the QCISD/MG3 level of theory [125]. Calculations on molecules involve separately optimized geometries from neutral molecules and anions. For this database, we used the MG3S basis set [96].—

*Proton affinities*(*PA8/06*). The proton affinities (PA) database [98] contains the proton affinities of the following small molecules: NH_{3}, H_{2}O, C_{2}H_{2}, SiH_{4}, PH_{3}, H_{2}S, HCl and H_{2}. As for the previous two sets, calculations involve separately optimized geometries from neutral and charged (in this case protonated) molecules. Geometries are obtained at the MP2/6-31G(2df,p) level of theory. For this database, we used the MG3S basis set [96].—

*Alkyl bond dissociation energies*(*ABDE12*). This database is a merger of ABDE4/05 and ABDEL8. ABDE4/05 contains four bond dissociation energies of small R–X organic molecules, with R=*methyl*and isopropyl, and X=CH_{3}and OCH_{3}.*D*_{0}values were taken from a paper by Izgorodina*et al*. [99], and we used the B3LYP/6-31G(d) zero-point vibrational energies scaled with a scale factor of 0.9806 to obtain our best estimate of the*D*_{e}values in the database. For this database, we used the MG3S basis set [96]. ABDEL8 is from a 2011 paper [31].—

*Hydrocarbon chemistry*(*HC7/11*). This database consists of seven cases of hydrocarbon data that are sensitive to medium-range correlation energy. HC7 is the combination of the HC5 database [145] with two isodesmic reactions (involving adamantane and bicycle [2.2.2]octane) that were singled out as difficult cases by Grimme. All geometries are obtained at the MP2/6-311+G(d,p) level of theory. The original reference data for this database were published in the original paper [65], and some inconsistencies in the reference data were corrected [31]. For this database, we used the 6-311+G(2df,2p) basis set [146].—

*Thermochemistry of*(*π*systems). This database containing*π*TC13*π*systems [63,64] is composed of three isomeric energy differences between allene and propyne as well as higher homologues (which correspond to cumulenes and polyenes—this subset is called*π*IE3/06), five proton affinities of conjugated polyenes (PA-CP5/06) and five proton affinities of conjugated Schiff bases (PA-SB5/06). Geometries for all the molecules in this database are obtained at the MP2/6-31+G(d,p) level of theory, and we used the MG3S basis set [96].—

*Hydrogen transfer barrier heights*(*HTBH38/08*). This database contains 38 transition state barrier heights for 19 hydrogen transfer (HT) reactions, 18 of which involve radicals as reactant and product. Six reference data in the HTBH38 database were revised in 2008 [101]. All geometries are obtained at the QCISD/MG3 level of theory [125]. For this database, we used the MG3S basis set [96]. All reactions in HTBH38/08 are isodesmic.—

*Non-hydrogen transfer barrier heights*(*NHTBH38/08*). The original version of this database was created in 2004 [71] by joining three older databases containing 38 transition state barrier heights for non-hydrogen transfer (NHT) reactions. NHTBH38/08 contains 12 barrier heights for heavy-atom transfer reactions, 16 barrier heights for nucleophilic substitution (NS) reactions, and 10 barrier heights for non-NS unimolecular and association reactions. As for the previous case, geometries are obtained at the QCISD/MG3 level of theory [125]. Eighteen reference data in the NHTBH38 database were revised in 2008 [101]. For this database, we used the MG3S basis set [96].—

*Non-covalent complexation energies*(*NCCE31/05*). Several databases have been developed in our group for various kinds of non-covalent interactions, and currently we use HB6/04 [102], CT7/04 [102], DI6/04 [102], WI7/05 [95] and PPS5/05 [95]. The geometries for the benzene dimers in the NCCE31/05 database are taken from Sinnokrot & Sherrill [147], while geometries for all other molecules in this database are optimized at the MC-QCISD/3 level [121,148]. For this database, we used the MG3S basis set [96].—

*Difficult cases*(*DC9/12*). In this database of difficult cases for DFT, we used the data from our previous DC10 database and omitted the datum for the atomization energy of ozone. The omission is to avoid a repetition in the database, as ozone is also present in the MRBE10 set. The /12 suffix was added to the name to avoid confusion with a database by Grimme which is called DC9, but contains different data. All geometries are obtained at the MP2/6-311+G(d,p) level of theory [125], while we used the MG3S basis set [96] for the calculations.—

*Atomic energies*(*AE17*). Total atomic energies of the atoms from H to Cl. Reference data are from [103]. We recently updated the basis set for this database to use a more complete basis set that includes core-polarization functions; in particular, we now use the cc-pwCV5Z basis set [149] for H, He and atoms from Be to Ne and from Al to Ar, while we used the cc-pCVQZ basis set [150] for Li, Be, Na and Mg atoms.

The solid-state physics energetic set, PE39, has two non-overlapping primary subsets containing, respectively, cohesive energies and band gaps. For this set, the lattice constants of the solids were reoptimized for each method studied. The primary subsets that compose the physics energetic database are as follows.

—

*Solid-state cohesive energies*(*SSCE8*). This set includes the cohesive energies of eight solids: C, Si, SiC, Ge, NaCl, NaF, LiCl and LiF. We first used this database for the evaluation of the SOGGA functional [30]; reference data are taken from [104], and the geometry of each solid is optimized for each method. For this database, we used the m-6-311G* basis set [151].—

*Semiconductors band gaps*(*SBG31*). This database was recently created for the evaluation of the performance of our M11-L functional [38]. It contains band gaps for four unary semiconductors from group 14 (C, Ge, Si, SiC), six binary semiconductors from groups 2 and 16 (MgS, MgSe, MgTe, BaS, BaSe, BaTe), 14 binary semiconductors from groups 13 and 15 (BP, BAs, AlP, AlAs, AlSb, GaN,*β*-GaN, GaP, GaAs, GaSb, InN, InP, InAs, InSb), and seven binary semiconductors composed of group 12 and an element from group 16 (ZnO, ZnS, ZnSe, ZnTe, CdS, CdSe, CdTe). Reference data are taken from [151–153]. In this database, we follow the usual convention of comparing single-particle gaps from calculations (that is, band energy gaps, which are the crystal analogue of orbital energy gaps in molecules) with experimental optical gaps. Band gaps are calculated at the optimized geometry for each method. For this database, we used the m-6-311G* basis set [151].

The chemistry structural database, CS20, is composed of two non-overlapping primary subsets containing a total of 20 geometrical data. The primary structural databases are as follows.

—

*Main-group hydrogenic bond lengths*(*MGHBL9*). The MGHBL9 database of the structural set contains nine hydrogenic bond lengths [30]. For this database, we used the 6-311+G(2df,2p) basis set [146].—

*Main-group non-hydrogenic bond lengths*(*MGNHBL11*). This set contains 11 non-hydrogenic bond lengths and is composed of nine data from the older MGNHBL10 set [30], with the addition of the bond length of MgS [41]. For this database, we used the 6-311+G(2df,2p) basis set [146].

Finally, the solid-state physics structural database, PS47, is composed of four non-overlapping primary subsets containing a total of 47 structural data from 44 solids, as follows.

—

*Main-group lattice constants*(*MLC4*). The main-group solid-state lattice constants set is composed of four main-group metals: Li, Na, K and Al. Reference data were taken from [104]. For this database, we used the m-6-311G* basis set [151].—

*Ionic lattice constants*(*ILC5*). The ionic solid-state lattice constants set is composed of five ionic solids: NaCl, NaF, LiCl, LiF and MgO. Reference data were taken from [104]. For this database, we used the m-6-311G* basis set [151].—

*Transition metals lattice constants*(*TMLC4*). The transition metals solid-state lattice constants set is composed of four transition metals: Cu, Rh, Pd and Ag. Reference data were taken from [104]. For this database, we used the m-6-311G* basis set [151].—

*Semiconductors lattice constants*(*SLC34*). This database [38] is composed of the lattice constants of the same semiconductors as in SBG31; the difference of three data from the number of data in the band gaps database is explained by the fact that three solids with wurtzite structure (GaN, InN, ZnO) require the specification of two lattice constants. Reference data are equilibrium values [151,154–157] obtained by removing the zero-point anharmonic expansion so that we can directly compare our calculated results with the experimental data.

### (b) Secondary and analytical databases

The secondary (level 3) subsets are described below:

—

*Atomization energies*(*AE6/11*). This is a small but representative subset of MGAE109/11 containing atomization energies of SiH_{4}, SiO, S_{2}, HCOCOH, propyne and cyclobutane.—

*Small-B1 atomization energies*(*SB1AE97*)*and large-B1 atomization energies*(*LB1AE12*). These databases are subsets of the MGAE109/11 database and have been constructed according to the B1 diagnostic, which was originally developed [126] to give an indication of multi-reference [158] character. However, we now recognize that it is a more general diagnostic signalling a ‘difficult case’, perhaps because of multi-reference character but perhaps for other reasons. Nevertheless, the B1 diagnostic as applied to MGAE109/11 probably does mainly differentiate single-reference and multi-reference cases, because the SB1AE97 set is composed of cases that are very likely single reference, while LB1AE12 is composed of cases that can be multi-reference.—

*Ionization potentials*(*IP13/03*). The IP13/03 database [62,63,95,96] is composed of the 13 main-group ionization potentials in IP21.—

*Ionization potentials of metals*(*IPM8*). The IPM8 database [70] is composed of eight ionization potentials of metal atoms and FeC, all from the IP21 database.—

*Alkyl bond dissociation energies*(*ABDE4/05*). This database contains four bond dissociation energies of small R–X organic molecules, with R=*methyl*and isopropyl, and X=CH_{3}and OCH_{3}. It is a subset of ABDE12.—

*Larger set of alkyl bond dissociation energies of molecules*(*ABDEL8*). This set of alkyl bond dissociation energies [31] includes eight R−X bond dissociation energies of molecules, with R=ethyl and tert-butyl, and X=H, CH_{3}, OCH_{3}, OH. It is a subset of ABDE12.—

(*π*-systems interaction energies). The*π*IE3/06*π*IE3/06 database contains three isomeric energy differences between allene and propyne as well as higher homologues (which correspond to cumulenes and polyenes) [98,159].—

*Proton affinities of conjugated polyenes*(*PA-CP5/06*). The PA-CP5/06 database [98] contains proton affinities of five conjugated polyenes.—

*Proton affinities of Schiff bases*(*PA-SB5/06*). The PA-SB5/06 database [98] contains proton affinities of five conjugated Schiff bases.—

*Heavy-atom transfer*(*HATBH12/08*). The heavy-atom transfer database contains 12 reaction barrier heights involving heavy atoms.—

*Nucleophilic substitution*(*NSBH16/08*). The NS database contains 16 barrier heights of NS reactions.—

*Unimolecular and association reactions*(*UABH10/08*). The UAB10/08 database contains 10 barrier heights of unimolecular and association reactions.—

*Hydrogen bonds*(*HB6/04*). The HB6/04 consists of binding energies of six hydrogen-bonded dimers.—

*Charge transfer*(*CT7/04*). The CT7/04 database consists of binding energies of seven charge-transfer complexes.—

*Dipole interactions*(*DI6/04*). The DI6/04 database contains the binding energies of six dipole interaction complexes.—

*Electrostatic-dominated complexation energies*(*EDCE19*). The EDCE19 database is a merger of the three previous secondary subsets of the non-covalent complexation energies database: HB6/04, CT7/04 and DI6/04. This secondary database was introduced in order to discriminate between complexation energies and different magnitudes within the NCCE31/05 database (e.g.*π*–*π*stacking and weak interaction complexation energies usually have magnitudes that are one order of magnitude smaller than the electrostatic-dominated complexes).—

*Weak interactions*(*WI7/05*). The WI7/05 database [95] consists of the binding energies of seven weak interaction complexes, all of which are bound by dispersion-like interactions.—

(*π*−*π*stacking*PPS5/05*). The PPS5/05 database [95] consists of binding energies of five*π*–*π*stacking complexes.

The analytical (level 3) subsets are described below.

—

*Metal bond energies*(*MBE18*). The MBE18 database comprehends all metal bond energies in our energetic broad chemistry set and is composed of all data in SRMBE13 and five data from MRBE10.—

*Transition metal bond energies*(*TMBE15*). The TMBE15 database collects the transition metals bond energies from the SRMBE13 (10 data) and MRBE10 (five data). This database is directly derived from MBE18 by excluding three bond energies of molecules containing main-group metals.—

*Diverse barrier heights*(*DBH24/08*). The DBH24/08 subset [101] comes from a different subdivision within the barrier heights sets, and includes six HT reactions from HTBH38/08 and 18 barrier heights from NHTBH38/08 representing six heavy-atom transfer, six NS and six unimolecular and association reactions.—

*Geometries of diatomic*(*DG6*). The DG6 subset [41] comes from a different subdivision within the chemistry structural set, and includes six diatomic molecules: H_{2}, HF, OH, N_{2}, Cl_{2}and MgS. The first three molecules are from MGHBL9, while the last three molecules are from MGNHBL11. For this subset, we used the 6-311+G(2df,2p) basis set [146].—

*Solid-state lattice constants*(*SSLC18*). The solid-state lattice constants subset comes from a different subdivision within the physics structural set and includes a broad set of solids: four main-group metals (Li, Na, K, Al), five semiconductors (C, Si, SiC, Ge, GaAs), five ionic solids (NaCl, NaF, LiCl, LiF and MgO), and four transition metals (Cu, Rh, Pd and Ag). For this subset, we used the m-6-311G* basis set [151]. Note that SSLC18 and SLC34 are not completely non-overlapping. In particular, five semiconductors occur in both SSLC18 and SLC34.

### (c) Statistical data

In general, we report errors as mean unsigned errors (MUEs), which are mean absolute deviations from the reference data. For five of the databases, in particular MGAE109/11, SB1AE97, LB1AE12, AE6 and DC9/12, we report errors as the MUE per bond (MUE_{PB}). In these cases, we first compute the MUE, and then we divide by the average number of bonds per datum (counting multiple bonds as well as single bonds as one). The average numbers of bonds are 4.71 for MGAE109/11, 5.10 for SB1AE97, 1.33 for LB1AE12, 4.67 for AE6 and 9.22 for DC9/12.

The CE345 database is of special interest because energetic properties have historically been the properties of most interest for density functional applications in chemistry. The MUE for this database is computed as follows: 4.1

Two other average data are also used in the overall evaluation. The first is CExMR335 and is calculated as in equation (4.1) excluding MRBE10. This set is useful for the evaluation of hybrid functionals, many of which should not be used for multi-reference systems, and the large error for these cases will dominate the average of BC345. The second is CExAE328, calculated including all subsets of CE345 except AE17. This set is useful for an alternative evaluation of those functionals that have large errors for the atomic energies (e.g. many GGA functionals, for example PBE). The MUE for CExMR335 is calculated using the same formula and excluding MUE(MRBE10) (with 335 in the denominator), while that for CExAE328 is calculated excluding MUE(AE17) (with 328 in the denominator).

The MUE for PE39 is also calculated from the primary subsets as 4.2and the MUE for CS20 is calculated in a similar way as 4.3The evaluation of functional performance using the physics structural set, PS47, is based on lattice constants, which by definition differ from the calculated nearest neighbour distances by a geometrical factor, and therefore they have different magnitude from the errors for bond lengths in molecules. If an approximate density functional were equally valid for solids and molecules, we would expect the MUEs for the lattice constants to be larger than those for molecular bond lengths by a factor of about 2.15, which is the average of the geometrical factors for the PS47 database. The mean errors for PS47 are calculated from the primary subsets as 4.4

## 5. Validation of density functionals

In this section, we present results for a large number of density functionals, including all of our own second-order functionals, the NGA and the Minnesota functionals. This is primarily done to put the results of our most recent functionals in perspective and to show strong and weak points of each method. It is worthwhile to mention at this point that, although we made use of parametrizations based on some of the databases presented above, none of our functionals has been parametrized using all the data in the four comprehensive databases. The functionals that we considered in this study are presented with the corresponding original reference or references in table 2.

Although this assessment aims for comprehensiveness, it is almost impossible to be truly complete. For example, not all functionals that have been proposed could be included. Functionals omitted include some less successful functionals, some promising functionals that we did not manage to get into our computer program yet with the time and resources available, some functionals omitted just to keep the tables compact enough to be readily comprehensible and the scope limited enough to allow reasonable discussion, and—as already mentioned—all functionals with non-local correlation simply because they raise so many new issues that they should have a separate assessment. Some examples of interesting unincluded functionals with local correlation are B86MGC [182], B97-1 [183], B3LYP* [184], LCgau-BOP [185], GauPBE [186], PW6B95-D3 [87], *ω*M05-D [187] and SXR12 [188]; and some examples of functionals with non-local correlation are given in §2*d*. Despite these limitations, the present comparison does involve a wide variety of functionals that illustrate most of the diverse features one may encounter.

### (a) Results for CE345

Results in terms of MUEs for the primary chemistry databases and their overall statistical data are presented in this section. For the purpose of an easier presentation, we grouped the functionals in eight classes, whose results are presented in different tables. The classes, with the respective tables, are: LSDA and first-generation GGA functionals (table 3), where we define a first-generation GGA as one that is published before 1993 (the year of publication of the B3LYP functional, which is a seam in the timeline of DFT); second-generation GGA functionals and the NGA functional (table 4); first-generation global hybrid GGA functionals (table 5, once again first-generation functionals are functionals published before 1993); second-generation global hybrid GGA functionals (table 6); range-separated hybrid GGA functionals (table 7); meta-GGA functionals (table 8); first-generation hybrid meta-GGA functionals (table 9), where we define a first-generation hybrid meta-GGA as one that is published before the first Minnesota functional (M05); global and range-separated hybrid meta-GGA functionals published since M05 (table 10). Within each table, functionals are ordered according to their year of publication.

Let us first analyse the performances for the most basic databases that are of most general interest to chemists, in particular those for atomization energies, barrier heights and non-covalent interactions. The best functionals for the MGAE109/11 database are, in general, hybrid meta-GGAs with a high percentage of HF exchange; the most successful functionals are M06-2X, PW6B95, *ω*B97X, MN12-SX, *ω*B97X-D, M11 and BMK, all with average errors in the range of 0.50±0.03 kcal mol^{−1} per bond. Local functionals at the LSDA and GGA level are on average not successful for atomization energies, with mean errors higher than 1 kcal mol^{−1} per bond. Some meta-GGA and meta-NGA functionals, however, are capable of providing acceptable performance even without non-local HF exchange, in particular MN12-L, VSXC and M11-L have errors below 0.75 kcal mol^{−1} per bond, which is comparable to many hybrid functionals.

The situation for the barrier heights databases (HTBH38/08 and NHTBH38/08) is similar in some respects to that for atomization energies, but with some key differences, as barrier heights seem to be sensitive not only to the percentage of HF exchange but also to the quality of the density functional [101]. The most successful functionals for this case are again the hybrid meta-GGA functionals with a high percentage of HF exchange. Among local functionals, MN12-L and M11-L stand out with mean errors close to those of hybrid functionals (less than 3 kcal mol^{−1}), while all other local functionals, including M06-L, which at one point in time was the best local functional for barrier heights, are above 4 kcal mol^{−1}.

Next, consider the non-covalent interactions. For the NCCE31/05 database, it is interesting to note how important is the use of the kinetic energy density. Meta-GGA functionals are on average much more successful than GGA and LSDA, clearly showing that the kinetic energy density is a crucial ingredient for this property. M11, PWB6K, M05-2X, M06-2X and MN12-SX all stand out with MUEs of 0.30 kcal mol^{−1} or less.

As far as the other molecular properties are concerned, we note that, as expected, local functionals are much more successful than hybrid functionals for the multi-reference database, for which hybrid functionals with a high percentage of HF exchange fail badly. The rearrangement of our databases shows more clearly than previous analyses which functionals represent progress in treating multi-reference systems. It is clear when considering all databases in CE345 that the most successful functionals considering all databases in CE345 are range-separated hybrid functionals (*ω*B97X), local meta-GGA and meta-NGA functionals (M06-L, M11-L and MN12-L), and hybrid meta-GGA functionals with a moderate percentage of HF exchange (MN12-SX, M05, M06).

Atomic energies, as represented by the AE17 database, deserve special consideration. Some exchange functionals, such as the B88 and OptX functionals, were optimized to fit HF exchange energies. As the exchange energy is much larger in magnitude than the correlation energy, this helps to get better results for the atomic energies but it is not sufficient for getting the best results for at least two reasons: (i) the HF exchange energy is ‘exact’ for HF wave functions but not for correlated ones and (ii) one must also have an accurate correlation potential. Nevertheless, the OptX functional does give relatively accurate atomic energies [5]. The tables show MUEs in atomic energies of 8.67 and 10.13 kcal mol^{−1}, respectively, for BLYP and OLYP, and several other functionals do better than this. Functionals with MUEs below 5.2 kcal mol^{−1} for AE17 are PW91, B3PW91, B98, MPW3LYP, SOGGA11-X, MPWKCIS1K, M06-2X, M08-HX, M08-SO and MN12-SX. Based on the GMTKN30 database, Goerigk & Grimme [87] found ‘no statistical correlation between a functional's accuracy for atomization energies and the performance for chemically more relevant reaction energies’. Nevertheless, if one wants to obtain the right answer for the right reason, without relying on cancellation of errors, obtaining a reasonable value for the atomic energy along with good properties for molecules and solids is a worthwhile goal, and it may be helpful when one considers even broader databases. For those who wish to evaluate the performance of density functionals without considering absolute atomic energies, the tables also provide the MUE for the CExAE328 database, in which atomic energies are excluded.

### (b) Results for PE39

Results for the solid-state physics energetic set and its primary databases are presented in this section. Because of the high expense of LR HF exchange, global hybrid and long-range-corrected hybrid functionals are not included in this evaluation. Results for the physics set are presented for LSDA, GGA, NGA, screened-exchange GGA and meta-GGA functionals in table 11.

Results for the cohesive energies database, SSCE8, show that LSDA has the worst performance for solid cohesive energies, while GGA-type functionals that are especially accurate for lattice constant calculations (such as SOGGA and PBEsol) are not high performers for solid-state cohesive energy. On the other hand, some functionals that work better for chemical energetics (e.g. SOGGA11, PBE, N12-SX and MN12-L) are successful for the cohesive energies database.

Understanding the use of DFT for band gap prediction has a long history [189] and is of great practical importance [38,151,190,191]. The incorrect prediction of band gaps in solids is an especially serious problem for treating defects, where various corrections are introduced, but they lead to inconsistent results for the positions of defect levels relative to band edges [192]. These problems are minimized, even if they do not go away, by using a density functional that gives a smaller error in the band gap. Our results for the band gap database show that almost all local functionals have MUEs in band gaps of 1.0±0.2 eV. Notable exceptions are the two Minnesota meta-GGA and meta-NGA functionals, M06-L and M11-L, which are more precise than the other functionals. The hybrid HSE06 and N12-SX functionals are the best performers for this database, although M11-L provides very good performance for a local functional.

It is interesting that TPSSLYP1W performs better than revTPSS for two of the three databases presented in table 11, especially when one considers that the LYP correlation functional does not satisfy the UEG constraint as the density inhomogeneity vanishes and would not have been expected to be compatible with TPSS exchange. The single parameter in TPSSLYP1W was fitted to water clusters.

The best performers for PE39 are N12-SX, HSE06, MN12-SX, M11-L, M06-L and MN12-L.

### (c) Results for CS20

Results for the chemistry structural set and its primary databases are presented in table 12.

Results for the chemistry structural database show that the most successful functionals are either local or hybrids with a moderate percentage of HF exchange. However, many functionals perform satisfactorily, with 59 functionals having MUEs of 0.010 Å or less.

### (d) Results for PS47

Results for the solid-state physics structural set and its primary databases are presented in this section. As for the PE39 database, global hybrid and LC hybrid functionals are not included in this evaluation, because evaluating the LR HF exchange is expensive and not practical. Results are presented in table 13.

The solid-state physics structural database is very interesting because a great amount of recent attention has been devoted to this problem in the physics community. Results presented in table 13 show that functionals such as SOGGA, PBEsol and HSE06 are among the top performers for this database, ranking higher than they do in other databases. However, the performance of the N12 functional is noteworthy because not only is it the third best performer for this database, but also it is the top performer for CE345 among functionals that depend only on the density and its gradient. N12-SX and MN12-L are also noteworthy for having outstanding performance on both CE345 and PS47. Among GGA functionals SOGGA11 is by far the best functional for CE345; however, its performance for the solid-state structural database is disappointing. This observation proved wrong the earlier hypothesis that functionals that are correct to second order will produce good solid-state lattice constants, and this observation is the main finding that brought us to the development of the N12 functional. This situation indicates that a single GGA functional is too limited to provide good accuracy for both chemistry and solid-state physics, and only a more flexible functional form was able to overcome the problem. At the meta-GGA level, MN12-L provides optimal performance for PS47, followed by revTPSS, but the performance of the latter for the chemistry database is a little disappointing (although not terrible compared with other local functionals). The opposite is true for our early Minnesota functionals: M06-L and M11-L are top performers for the chemistry database, but their accuracies for PS47 are a little disappointing (although not as terrible as many other functionals that perform well for chemistry). MN12-L though is outstanding for both chemistry and physics.

### (e) Secondary and analytical databases

The analysis of the results for secondary and analytical databases shows other interesting patterns. The main characteristics of this analysis are reported below, while detailed results are presented in the electronic supplementary material.

One of the most interesting assessments in the light of the current heavy activity by several groups in adding molecular mechanics to DFT is the area of attractive non-covalent interactions. Results for the secondary subsets of the non-covalent interactions database are presented in table 14 for the 25 functionals that perform best on NCCE31/05 along with—for comparison—the MP2 WFT method and the popular PBE and B3LYP functionals.

From the data we note immediately that the first 19 functionals have either range separation in the exchange or a meta-GGA functional form, or both, clearly demonstrating the desirability of going beyond the incomplete description at the GGA and global hybrid GGA levels, which have been incapable of providing a quantitative treatment of energetic properties with a strong dependence on medium-range correlation energy. The first five functionals (M11, PWB6K, M05-2X, M06-2X and MN12-SX) show excellent performance without molecular mechanics and have MUEs of 0.45 kcal mol^{−1} or less for all six of the secondary non-covalent complexation energy databases. The two best global hybrid GGAs for non-covalent interactions are SOGGA11-X and MPW1K.

A clear trend can be seen in the results for non-covalent interactions for local functionals and functionals with a low percentage of HF exchange, in that they perform worse for the charge-transfer database (CT7/04), as expected by their local (or very small non-local) character. However, the local MN12-L is better for CT7/04 than MP2 or HFLYP, which both have 100% HF exchange, or than B3LYP, which has 20%. Performance of the other secondary non-covalent complexation energy databases seems more influenced by the quality of the density functional form than simply by the percentage of non-local exchange. For the weak interactions (WI7/05) and the *π*–*π* stacking (PPS5/05) databases, it is interesting to note the performance of MP2, which is the best method for weak interactions, but the worst for *π*–*π* stacking. Another interesting feature of table 14, in the light of the frequently heard (but erroneous) comment that one can only obtain good results for non-covalent interactions by heavy parametrization, is the good performance of the lightly parametrized PWB6K, MPW1B95 and PW6B95 functionals.

Another interesting set of databases to compare are the group of two alkyl bond dissociation databases, the weak interaction database and the *π*–*π* stacking database. These databases all depend on having a good treatment of medium-range correlation energy, but in different ways. Performing better than the average on all four databases is a difficult challenge. This challenge is met by only three local functionals (namely SOGGA, M11-L and MN12-L), although two others (PW91 and M06-L) just miss this mark, and by only two global hybrid GGAs (PBE0 and B98). However, this test is passed by all the range-separated functionals, and it is passed by most of the hybrid meta-GGAs.

Another set of related databases are LB1AE12, DC9, TMBE15 and MRBE10. Only a functional that can handle multi-reference systems can do well on these databases, and we ask how many functionals do better than the average on LB1AE12 and DC9 and also have an MUE below 10 kcal mol^{−1} on both TMBE15 and MRBE10. Only eight functionals succeed, seven local functionals (OLYP, RPBE, revPBE, N12, M06-L, M11-L and MN12-L) and one non-local (N12-SX). Note that this set has no overlap with the successful performers in the previous paragraph, which provides another example of the difficulty in finding a universal functional. As mentioned above, the metal databases are still rather small, and further work is needed to draw more definitive conclusions about performance for metal-containing molecules.

Both ionization potentials and proton affinities require good accuracy for cations. We then ask which functionals have an MUE below 5 kcal mol^{−1} for IP21 and for both proton affinity databases. Surprisingly, only three local functionals (SOGGA11, M11-L and MN12-L) and four hybrid functionals (M05-2X, M06-2X, M08-HX and M08-SO) pass this test—with M06-2X getting all three MUEs below 3 kcal mol^{−1}.

The extra databases allow us to ask which density functionals perform well for all four kinds of barrier heights: HT, heavy-atom transfer, NS and the unimolecular/association group. Among local functionals, only MN12-L has an MUE smaller than 4 kcal mol^{−1} on all four of these databases, and among the others MOHLYP2 and M11-L come the closest to meeting the challenge, but three global hybrid GGAs (MPW1K, B97-3 and SOGGA11-X), three long-range-corrected hybrid GGAs (LC-*ω*PBE, *ω*B97 and *ω*B97X), and one screened-exchange hybrid GGA (N12-SX) get all four barrier height groups below 4 kcal mol^{−1}; two of these (MPW1K, LC-*ω*PBE) have all four below 3 kcal mol^{−1} but not below 2 kcal mol^{−1}, and three of them (B97-3, SOGGA11-X and N12-SX) have all four below 2 kcal mol^{−1}. However, the hybrid meta-GGAs excel here; three of them (MPW1B95, M05, M06) have a maximum of the four MUEs between 3.3 and 4.3 kcal mol^{−1}, two (MPWKCIS1K and M05-2X) have the maximum between 2 and 3 kcal mol^{−1}, and nine of them (BB1K, MPWB1K, BMK, PWB6K, M06-2X, M08-HX, M08-SO, M11 and MN12-SX) have the maximum of these four MUEs below 2 kcal mol^{−1}. Averaged over all 76 barrier heights, the following functionals do best: M08-HX with an MUE of 1.0 kcal mol^{−1}, M08-SO with 1.1, M06-2X, BMK and MN12-SX with 1.2, BB1K, PWB6K and M11 with 1.3, MPWB1K with 1.4, and MPW1K and SOGGA11-X with 1.5.

### (f) Overall analysis

Table 15 gives the MUEs for the CE345, CS20, PE39 and PS47 databases, but only for those functionals for which we ran all four databases (which means that any functional that has a non-zero percentage of HF exchange at LR is not included). Functionals are ordered according to increasing MUE for our largest database, CE345. We highlighted in bold the performances that are better than the average (for all functionals in this table) for a given database. The Pearson correlation coefficients between the columns of table 15 are shown in table 16. The CS20 column of table 15 shows the best correlation with other rows, having a Pearson correlation coefficient *r* of 0.31, 0.32 and 0.58 with the CE345, PE39 and PS47 columns, respectively. The CE345 column is the least correlated with the others, and actually has a negative correlation coefficient of *r*=−0.10 with the PS47 column. This illustrates dramatically the difficulty in finding the functional that does well for both chemistry energetics and solid-state physics structural data.

Only seven functionals (MN12-SX, MN12-L, N12-SX, M06-L, N12, HSE06 and TPSS) have four bold entries in table 15. N12 is the only one of these that is restricted to just the density and density gradient; N12-SX and HSE06 also have short-range HF exchange, MN12-L, M06-L and TPSS also involve kinetic energy density, while MN12-SX has both short-range HF exchange and kinetic energy density.

For the next analysis, we sorted the various density functionals by decades and calculated the average MUE on the broad chemistry CE345 database for each decade. These results are shown in figure 4. We can clearly see significant improvement in each decade, and although there are only a few functionals from 2010 and later, they have an average MUE that is well below 3 kcal mol^{−1} and that is 3.7 times smaller than the average MUE from the 1980s functionals.

A completely universal functional is extremely hard to determine because it is very complicated and non-local, and therefore probably it will never be found [193]. Systematic approaches, for example a truncated Taylor expansion, which have guided many efforts in the past, are problematic. Quantum Monte Carlo investigations have been used to try to identify design elements for DFT [194,195], but so far this has not led to improved functionals. Nevertheless, continued investigations of what works and what does not and what ranges of the variables are important for various kinds of properties [196,197] have contributed to progress, despite there being no *a priori* guarantees and no systematic path. This progress gives hope that the continued exploration of functions of local variables and functionals of non-local functions may lead to further success.

## 6. Future developments

Where do we need to improve DFT?

(i) We need to keep improving and expanding the databases, for example on the solid-state physics side, where many important properties, such as surface energies and chemisorption and physisorption energies, can be added if suitable reference data can be identified. The performances of DFT for such properties are either largely unknown or have been investigated to a much lesser extent than the key quantities in small-molecule chemistry. The addition of such databases to the evaluation of DFT functionals can be a starting point to improve them, in the same way that the recent addition of solid-state data to the development of our own Minnesota functionals expanded their reliability and brought us a new generation of improved functionals.

(ii) According to the results presented in the previous section, one property that will need major attention in the future is the treatment of multi-reference systems and this is confirmed by the increase in computational studies on transition metal chemistry [115]. Kohn–Sham DFT with the unknown exact functional is exact even for multi-reference systems, but as the density is represented by the density of a single Slater determinant, it will require greater insight and more complicated functional forms to represent the xc energy if the orbital product in the determinant is representing the density mathematically but not physically. The best method (M11-L) for our multi-reference database still has a mean error of 6 kcal mol

^{−1}, which is too large and will need to be improved in the future. However, the sources of these large errors are not fully understood, and more investigations will be necessary before the emergence of reliable strategies to systematically improve the treatment of multi-reference systems. One start on understanding multi-reference systems is more systematic exploration of open-shell systems [198].

## 7. Perspective

This brings us to the next question: are there prospects for further improvement? It is always hard to make predictions, ‘especially about the future’; however, we believe that there will indeed be further improvement. This may involve breaking of some of the constraints that we are following right now or—less likely as it seems to us—adding new constraints. Even more important may be rethinking some of the functional dependencies of density functionals currently available. As just one example of how to proceed, it is worth re-examining whether the UEG is the best starting point, and we believe that the next generation of functionals should be based on a wider search for productive and more realistic models (e.g. as in the Becke–Roussel functional [199], where the H atom is used).

For the older Minnesota functionals, M05 and M06, the suggestions that we gave in Zhao & Truhlar [200] remain valid, and we report them once again here: ‘We recommend (i) the M06-2X, BMK, and M05-2X functionals for main-group thermochemistry and kinetics; (ii) M06-2X, M05-2X, and M06 for systems where main-group thermochemistry, kinetics, and noncovalent interactions are all important; (iii) the M06-L and M06 functionals for transition metal thermochemistry; (iv) M06 for problems involving rearrangements of both organic and transition metal bonds; (v) M06-2X, M05-2X, M06-HF, M06, and M06-L for the study of non-covalent interactions; (vi) M06-HF when the use of full HF exchange is important, for example, to avoid the error of self-interaction at LR; and (vii) M06-L when a local functional is required, because a local functional has much lower cost for large systems’. However, in view of the results presented in this review, the new M11 family of Minnesota functionals and the N12 functional provide top-level performance that in most respects exceeds that of the previous Minnesota functionals. For this reason, we can add M11 to the suggestion for main-group thermochemistry, kinetics and non-covalent interactions and for problems where the use of full HF exchange at LR is important. M11-L is particularly successful, at least so far, for transition metal thermochemistry, kinetics and non-covalent interactions.

The SOGGA, SOGGA11 and SOGGA11-X functionals have been particularly important in exploring the limit of what a GGA functional form can do. However, they are usually less successful than the corresponding meta-GGA (Minnesota) functionals. Nevertheless, they provided insight on which we could build, and SOGGA (suggested for solid-state physics) and SOGGA11 (suggested for chemistry) can be joined by the N12 functional, which is capable of providing balanced performance for both chemistry and solid-state physics, using the same ingredients as its predecessors and at essentially the same computational cost. Combining the non-separable form with kinetic energy density yields the very successful local functional, MN12-L, and adding screened exchange gives the even better MN12-SX; these functionals have outstanding overall performance in our tests so far.

## 8. Conclusion

One often hears that WFT is systematically improvable, while DFT is not. However, the facts show that, since the 1980s, the MUE of DFT on a broad chemistry energetic database has been improved by a factor of 3.7. Furthermore, we are recently beginning to see similar improvements in solid-state physics databases. DFT has become the method of choice for most physical properties and chemical and materials calculations, but there are still areas where we need significant improvements, and we expect continued improvement in the years ahead.

## Funding statement

This work was supported in part by the Air Force Office of Scientific Research under grant no. FA9550-11-0078 and the National Science Foundation under grant no. CHE09-56776.

## Acknowledgements

The authors are grateful to Yan Zhao and Nate Schultz for their contributions to the Minnesota density functional program and to the many workers in DFT who paved the way for our own work and from whose personal comments and writings we learned many useful things, with special shout-outs to Axel Becke, Nick Handy, Mel Levy and John Perdew.

## A. Appendix A. Formula for the spin-unpolarized exchange density

The formulae for the exchange density functionals in the main text are those for the more general spin-polarized case. In many cases, it can be convenient to present the formula for the exchange in the spin-unpolarized case (*ρ*_{α}=*ρ*_{β}=1/2*ρ*). The derivation of the spin-polarized formulae for exchange from the spin-unpolarized ones is straightforward and uses the spin-scaling relationship
A 1

However, some of the local quantities, in particular the reduced density gradients and the UEG exchange energies, have different numerical prefactors in these cases and the following definitions should be kept in mind when reading the literature: A 2 A 3 A 4

## Footnotes

One contribution of 8 to a Theme Issue ‘Density functional theory across chemistry, physics and biology’.

- © 2014 The Author(s) Published by the Royal Society. All rights reserved.