## Abstract

The most accurate electronic structure calculations are performed using wave function expansions in terms of basis functions explicitly dependent on the inter-electron distances. In our recent work, we use such basis functions to calculate a highly accurate potential energy surface (PES) for the H ion. The functions are explicitly correlated Gaussians, which include inter-electron distances in the exponent. Key to obtaining the high accuracy in the calculations has been the use of the analytical energy gradient determined with respect to the Gaussian exponential parameters in the minimization of the Rayleigh–Ritz variational energy functional. The effective elimination of linear dependences between the basis functions and the automatic adjustment of the positions of the Gaussian centres to the changing molecular geometry of the system are the keys to the success of the computational procedure. After adiabatic and relativistic corrections are added to the PES and with an effective accounting of the non-adiabatic effects in the calculation of the rotational/vibrational states, the experimental H rovibrational spectrum is reproduced at the 0.1 cm^{−1} accuracy level up to 16 600 cm^{−1} above the ground state.

## 1. Introduction

Two-electron molecular systems have always been the popular models for testing high-accuracy methods developed by theoretical chemists. Since the early 1970s, experiments [1] and theoretical works [2] have led to a complete reproduction and assignment of the rovibrational spectrum of H_{2}. In the following years, similar efforts were directed towards the smallest triatomic molecular system, the H ion. Owing to the higher number of nuclear degrees of freedom, this is a significantly more difficult problem to describe in quantum mechanical calculations than the diatomic H_{2}.

H plays an important role in astrophysics and astrochemistry. After the detection of H in the interstellar medium [3], it became clear that this ion has an important role in the chemistry of interstellar space and it is responsible for many important chemical reactions taking place in this medium. The importance of these reactions is amplified by the fact that H is the most abundant polyatomic ion populating the Universe [3]. The formation reaction of H,
1.1
and the reaction of its destruction, i.e. destructive recombination,
1.2
1.3
1.4
are very fast, and the fragile equilibrium between them determines the amount of H in interstellar space. Thus, the thermodynamic and kinetic modelling of the aforementioned reactions is of pivotal importance to astrochemistry. The most important prerequisite for this type of modelling is the availability of a very accurate and complete potential energy surface (PES) of H that describes all the different channels involved in reactions (1.1)–(1.4). The PES has to be calculated with high accuracy asymptotically, i.e. also in the areas corresponding to the reactants and the products, as well as all transition and intermediate states of the reactions. Such calculations present a task that is significantly more daunting than similar calculations for two-electron atomic and diatomic systems such as He and H_{2}, which have been the subject of high-accuracy calculations since the very beginning of quantum chemistry.

## 2. Molecular calculations with explicitly correlated Gaussians

Quantum mechanical calculations of molecular systems with only a few nuclei and a few electrons, such as H, can be performed either by treating all particles on an equal footing or by approximately decoupling the motion of the electrons from the motion of the nuclei, i.e. by assuming the Born–Oppenheimer (BO) approximation. The BO approximation involves calculating the electronic states separately from the states corresponding to the rotational and vibrational motions of the molecular frame.

### (a) Born–Oppenheimer molecular calculations

In the BO approximation, by factoring out the motion of the centre of mass, the electronic Hamiltonian becomes
2.1
In this equation, {**r**^{e}} and {**r**^{n}} are labels of electronic and nuclear coordinates, respectively, is a Laplacian operator that acts only on the three Cartesian coordinates of the *i*th electron, *N*_{e} represents the total number of electrons in the system, and *V* ({**r**^{e}};{**r**^{n}}) is the potential of nuclear–nuclear, nuclear–electron and electron–electron Coulomb interactions. The effective nuclear Hamiltonian becomes
2.2

where *E*({**r**^{n}}) is the electronic energy obtained by solving the time-independent Schrödinger equation for the electrons with the electronic Hamiltonian in equation (2.1) for all the nuclear configurations, *m*_{i} are the nuclear masses and *N*_{n} is the total number of nuclei in the system.

The basis set that we employ in the electronic BO H calculations consists of the following floating explicitly correlated Gaussians (FECGs) [4,5]:
2.3
where ⊗ is the Kronecker product symbol, **r** is a (3×*N*_{e})-dimensional vector collecting the Cartesian coordinates of all the electrons stacked one on top of the other, **I**_{3} is a 3×3 unit matrix, **A**_{k} is an *N*_{e}×*N*_{e} symmetric matrix of exponential parameters and **s**_{k} is a 3*N*_{e}-dimensional vector of Gaussian shifts. The prime symbol denotes matrix/vector transposition. The FECG basis functions are much more flexible than any orbital-based counterpart. By allowing the Gaussian shifts to float, it is possible to describe with these basis functions any non-spherical electron density. There is another important quality of FECGs: the Hamiltonian and overlap integrals are of straightforward implementation and of comparably low computational complexity [6].

The Gaussian shifts (**s**_{k}) have the effect of shifting the peak of the Gaussian functions according to the variational principle, often bringing a function's maximum near a nucleus. For H, these shifts are six-dimensional vectors and can be further decomposed into two three-dimensional ‘Gaussian centres’ as they play a role similar to the Gaussian centres of one-electron Gaussian orbitals. FECGs are also functions of the elements of the *N*_{e}×*N*_{e} symmetric matrix, **A**_{k} collecting the Gaussian exponential parameters.

In sum, for the two-electron ion H, the **r** and **s**_{k} vectors assume the following forms:
2.4
while matrix **A**_{k} takes the form
2.5

In the variational calculation, the electronic energy defined by the following ratio evaluated for a trial wave function *Φ*, namely
2.6
is minimized. When the trial function, *Φ*, is expanded in terms of *M* linearly independent FECGs, i.e.
2.7

then the Rayleigh–Ritz ratio in equation (2.6) becomes [6]
2.8
and the minimization is carried out with respect to (w.r.t.) the {**A**_{k},**s**_{k}} nonlinear parameters, as well as the linear {*c*_{k}} coefficients. The nonlinear parameters are optimized using a quasi-Newton procedure [5–7]. The linear coefficients are optimized by solving the secular equation, i.e. by simultaneous diagonalization of the overlap and Hamiltonian matrices (generalized eigenvalue problem).

What sets our method apart from other FECG implementations [8,9] is the use of the analytic energy gradient determined w.r.t. the FECG nonlinear parameters during the minimization [5,6]. The gradient is defined as the first derivative of the energy (*E*) w.r.t. the nonlinear parameters (**A**_{k} and **s**_{k}). This derivative depends on the derivatives of the Hamiltonian and overlap matrix elements, and 〈*g*_{k}|*g*_{ℓ}〉, respectively. Because integrals of Gaussian functions are analytic, closed forms of the earlier-mentioned matrix elements and derivatives thereof have been derived [6].

### (b) Challenges in performing single-point calculations with FECGs: linear dependences

Achieving high accuracy in electronic structure calculations by using explicitly correlated basis functions is difficult for two reasons. First, such calculations usually require large basis sets that include hundreds or even thousands of functions in order to reach an adequate energy convergence. Second, with such a large number of basis functions, linear dependences between the basis functions frequently appear in the calculation and they need to be continuously eliminated in the course of the optimization of the wave function in order to maintain the numerical stability and limit the numerical noise in the energy [7,10]. In one of our earlier works [7], the occurrence of linear dependences among the basis functions prevented us from reaching a close agreement between the calculated transition energies and the experimental values.

During the variational optimization of the wave function, the optimization routine varies the nonlinear parameters of the basis functions used to expand the trial wave function. Numerical instability may arise if during this optimization two nearly identical Gaussian functions are formed. Linear dependences between Gaussians are a serious problem because they may undermine the numerical stability of the calculation, leading the energy and the energy gradient to contain significant numerical noise. Inaccuracy of the gradient may completely destroy the optimization process. When the size of the basis set becomes large (say, larger than 200 for a two-electron system), linear dependences become unavoidable, unless methods to eliminate them effectively are devised.

Linear dependences most often appear in pairs of Gaussians. There may be a physical reason why two Gaussians want to become linearly dependent. If, for example, the two Gaussians in a pair differ (a little) only in terms of a single *A*_{k} matrix element and this element is the *i*th diagonal element (for simplicity let us assume that **s**_{k} are all zero), then a linear combination of the two Gaussians with linear expansion coefficients in the wave function of approximately the same magnitude, but with opposite signs, mimics a Gaussian, which is approximately equal to one of the Gaussians in the pair, multiplied by . This is a new type of function that is not included in the basis set. In a similar way, functions approximating products of Gaussians by squares of the inter-electron distances, , can be generated. Such functions can help to better describe the inter-electron correlation, because they have minima (not maxima as in the case of the exponentially correlated Gaussians) when *r*_{ij} becomes zero, which happens when two electrons are located at the same point in space. Obviously, such a situation is very unlikely, owing to the inter-electron repulsion. Thus, linear dependences, if they start to appear more frequently during the variational optimization of the Gaussian nonlinear parameters, provide an indication that the basis set needs to be augmented with new types of Gaussians to improve the convergence of the calculation. However, such an augmentation is not always feasible, because it may require the implementation of new types of Hamiltonian and overlap integrals in the code used in the calculation.

We have developed a procedure that efficiently deals with the linear-dependence problem without irreparably slowing down the convergence of the optimization. The first part of the procedure consists of a routine that checks for linear dependence in each pair of functions in the basis set. Next, if a linear dependence is found in a pair, one of the two functions of the pair is replaced by a new Gaussian. This Gaussian is obtained from an optimization where its overlap with the linearly dependent pair, where the Gaussians are multiplied by the linear coefficients with which they enter the wave function, is maximized. Two functions, *ϕ*_{k} and *ϕ*_{l}, are considered to be linearly dependent if the following criterion is met:
2.9
where *t* in our calculations ranges from 10^{−3} to 10^{−4}, depending on the size of the basis set. Next, the new function *ϕ* is found by maximizing the following overlap:
2.10
The new function then replaces *ϕ*_{l}, and the calculation is restarted with the new pair, *ϕ*_{k} and *ϕ*, replacing the linearly dependent pair, *ϕ*_{k} and *ϕ*_{l}.

In a recent work [5], we have implemented the above scheme and carried out the calculation of the ground state of H at its equilibrium geometry (an equilateral triangle with a bond length of 1.65 *a*_{0}). During the calculation, we encountered other issues that undermined the numerical accuracy and the convergence of the calculations. The most notable of those issues involved the appearance during the optimization of linear coefficients of the wave function expansion (the *c*_{k} of equation (2.7)) with very small magnitude. The basis functions associated with small linear coefficients may be important to the calculations, as they could be related to FECGs with high kinetic energy needed to approximate the cusp condition (missing in Gaussian functions), and FECGs with low kinetic energy needed for the proper long-range decay behaviour of the wave function (i.e. an exponential decay that is not properly described by a single Gaussian decay). Therefore, we devised a computational strategy to monitor the magnitude of the coefficients and remove the ones with exceedingly low magnitudes. However, the magnitude of the coefficients becomes less and less of an issue as the basis set approaches completeness. Therefore, the procedure strictly monitors and restricts the magnitude of the linear coefficients at the initial stages of the basis set growing process, but it allows for coefficients with lower magnitude in the wave function expansion as the basis set size increases.

In our pilot H calculations, we used up to 1000 Gaussians in expanding the wave function. The energy convergence w.r.t. the basis set size was remarkably better than previous calculations [11] and the total energy of −1.343 835 625 02 *E*_{h} calculated with 1000 FECGs is the best variational upper bound ever obtained for this system, thus validating the computational procedure we used in the calculations.

## 3. Brief recap of H_{3}^{+} potential energy surface calculations

H becomes very floppy even when excited to vibrational states located in the infrared region of the spectrum. As these states are significantly anharmonic, it is more difficult to describe them in quantum mechanical calculations. This has been the reason for slow progress in obtaining sufficiently accurate results. On the experimental side, the spectroscopy of H has also been advancing slowly because, as for any molecule, there is a strong decrease of all spectral intensities (as low as a factor of a million compared with the fundamental transition [12]) as the excitation energy increases. In addition, past first-principles predictions became increasingly inaccurate and lost their ability to guide experimental line searches and spectral identifications [13]. This roadblock persisted despite recent theoretical advances [8,9,14–17], in which increasingly more accurate and complete H PESs have been determined.

Among the *ab initio* PESs that have been essential in studies of the H vibrations, one should particularly mention the potential generated by Meyer *et al.* [18] (hereafter referred to as MBB), using the full configuration interaction (FCI) method. The MBB PES was based on 69 grid points with a maximum energy of 25 000 cm^{−1} above the bottom of the PES. The accuracy of the MBB PES was subsequently increased by more precise calculations involving the configuration interaction method with *r*_{12} factors multiplying the configuration functions [19]. Higher accuracy (claimed to be as high as 0.02 cm^{−1}) was later achieved by using explicitly correlated Gaussian functions and by including the adiabatic and relativistic corrections [8,9].

Another calculation of the H PES and rovibrational spectrum was more recently reported by Velilla *et al.* [20]. Their PES obtained in FCI calculations, which they called the global PES, contained long-range electrostatic interactions represented in an analytical form. The accuracy they achieved for the vibrational states located in the vicinity of the equilateral triangular configuration of the ion was claimed to be of the order of a few tenths of a wavenumber and of about a wavenumber for higher energy states. It is clear now, and we will show in more detail in the following section, that an orbital-based method such as FCI is not competitive with a method that uses explicitly correlated basis functions.

## 4. Very accurate H_{3}^{+} potential energy surface calculated with FECGs

Even though significant progress has been made in the calculation of the H PES, there is still room for improvement, especially in peripheral regions corresponding to dissociative configurations and in areas near the minima and the barrier to linearity. Such an improved optimization can now be carried out, for example, with FECGs using the procedure outlined in the previous section that uses the analytical energy gradient determined w.r.t. the nonlinear parameters.

In this section, two calculations are described. The first one is a pilot calculation that was presented by Pavanello *et al*. [21] showing the ability to obtain very accurate energies with FECGs with the algorithms described in the previous section. The second calculation constitutes a major undertaking: the energy of H was evaluated with FECG basis functions on a grid of 42 498 geometries of the ion [22,23]. As we explain below, this constitutes simultaneously the most dense, extended and accurate set of electronic energies for this ion presently available.

### (a) Proof-of-principle potential energy surface calculation

In the beginning, we carried out single-point calculations of the H total energy at several nuclear configurations of this ion. These calculations were devised to test whether it would be feasible to embark on recalculating the whole PES with a much larger number of grid points (tens of thousands) and with much higher accuracy than achieved in the previous calculations. For the testing, we chose to use the same 69 PES points as used to generate the MBB PES [9,18,19]. This allowed us to compare our energies with the most accurate energies obtained for those points by Cencek *et al.* [9].

Each of the 69 points used to generate the MBB PES is specified by three integers, *n*_{a}, *n*_{x} and *n*_{y}, which are related to the symmetry adapted deformation coordinates *S*_{a}, *S*_{x} and *S*_{y} as (all values are expressed in atomic units) [19]:
4.1
The are the following functions of the H bond distances *R*_{kl}:
4.2
where *R*_{ref} is the internuclear distance in the unilateral triangular equilibrium geometry of H, *α*=0.15 and *β*=1.30.

Cencek *et al.* [9] generated a set of FECGs for each PES grid point of H independently. That is, for each geometry point, a new set of FECGs was generated from scratch. The rationale was to maintain similar precision in the calculated energy across the whole set of points. From a purely computational standpoint, the procedure adopted in Cencek *et al.* [9] is slow and not very efficient. The efficiency of the calculation can be considerably improved by an approach we implemented [24], where the guess of the FECG basis set for a particular PES point is generated based on the FECG basis functions of a close-by PES point, which has already been calculated. This approach assumes that, if two geometries are close enough, the **A**_{k} matrices of the two sets of basis functions should be very similar. The only parameters that need some adjustment are the shifts, **s**_{k}. For the two-electron H, each shift, **s**_{k}, is a six-dimensional vector comprising two three-dimensional Gaussian centres, , where *i*=1, 2 indicates the centre of the Gaussian of electron 1 or 2, respectively. In our procedure, we let the shifts of the Gaussian functions relax along with the geometry changes of the molecule as if they were attached to every nuclei with springs (figure 1). For example, if the position of nucleus *α* changes from one MBB point to the next from **R**_{α} to **R**_{α}+Δ**R**_{α}, the centres of each Gaussian, , *i*=1,2, follow the nuclear movement and change to , where
4.3

where *d*=(1/*r*_{1}+1/*r*_{2}+1/*r*_{3})^{−1} (see figure 1 for the definition of the quantities used in (4.3)). This procedure takes into account the fact that, if a Gaussian centre is closer to a certain nucleus, say **R**_{1}, then should be most similar to Δ**R**_{1}. This allows us to generate an initial guess of the basis set from a neighbouring grid point in a continuous fashion, avoiding the hassle of growing the basis set from scratch. While using the new routine for generating the basis set from close-by points, we noticed a significant decrease in the computational time needed to complete the calculations for new grid points. This new procedure allowed us to reduce the computational time by two orders of magnitude.

In table 1, the energies produced by our (variational) calculations [21] at selected geometry points belonging to the MBB grid are compared with energies reported by Cencek *et al.* [9]. As one can see, our energies are noticeably better (lower) than the energies of Cencek *et al.* Generally, the differences increase when the H structure becomes more distorted from the equilibrium geometry, indicating that perhaps, for those points, the calculations of Cencek *et al.* were not as tightly converged as for the configurations closer to the equilibrium.

We should mention that we have also examined the H ECG PES generated more recently by Bachorz *et al.* [8]. To our surprise, we noticed that their energies of some of the linear configurations are considerably off from our energies. It seems like there was something wrong with their calculations of those PES points. The inaccuracies also affected the calculations of the vibrational energies reported by Bachorz *et al.* [8].

### (b) The most accurate potential energy surface of H_{3}^{+} to date

For constructing a global PES, it is essential to have a grid spanning the complete configuration space of the nuclei in the system. For H such a grid may be constructed in a systematic manner in hyperspherical coordinates, in particular, in the so-called democratic hyperspherical coordinates [25,26]. We therefore chose to recalculate an energy surface of H in hyperspherical coordinates rather than using the MBB set. In the hyperspherical coordinates, the hyperradius, *ρ*, describes the overall size of the system, whereas the two hyperangles, *θ* and *ϕ*, describe its geometrical shape. Namely
4.4
where *R*_{12}, *R*_{23} and *R*_{31} are the three sides of a triangle uniquely representing the H geometry configuration. In a recent collaborative work [22,23], we have generated a very dense grid using the following ranges of the parameters and the corresponding step sizes:
4.5
This grid of points is the same in terms of angles, but denser in terms of the *ρ* variable, as used by Viegas *et al.* [27]. Expressions (4.5) generated a grid of H geometry configurations consisting of 44 885 points. From that grid, the configurations with one or more internuclear distances smaller than 0.7 *a*_{0} were eliminated, leaving a total of 42 498 points. These points were organized in 235 paths, i.e. sets of points corresponding to fixed values of angles *θ* and *ϕ* and a varying value of hyperspherical radius *ρ*.

We then embarked upon the calculations of the total electronic energy of H for those geometry points. We used 900 FECGs in the electronic wave function expansion to calculate the PES electronic energies. The basis set for each PES point was generated from the basis of a nearby geometry point using the method discussed earlier and depicted in figure 1. It took nine months of continuous calculations on a 210 CPU parallel computer system to complete the work.

Beyond BO, corrections to the electronic energy were introduced to account for the effects not included in the BO approximation. The diagonal adiabatic corrections (which account to first order for the neglected coupling between the motions of the nuclei and the electrons) were included in the PES. Further, we included corrections for non-adiabatic (NA) effects through the use of effective masses in the nuclear calculations [22]. These masses were derived using an *ab initio* procedure by Bunker & Moss [28]. Corrections accounting for relativistic effects were also included in the PES. These corrections were taken from the work of Bachorz *et al.* [8] where they were calculated with the first-order perturbation theory. Quantum electrodynamics corrections and corrections due to the finite size of the proton are of even smaller size, and were not included in the PES.

After fitting the energy points, a smooth PES was generated with an r.m.s. error of only 0.097 cm^{−1}. This PES was used to calculate the energies of rovibrational states, and the experimental rovibrational lines were reproduced with unprecedented high accuracy up to the mid-visible range of the spectrum [22]. For more details concerning the form of the PES fitting function and the procedure used in the fitting, the reader is referred to Pavanello *et al*. [23]. In the calculations of the rovibrational states, the PES provides the potential well in which the nuclei (protons in the case of H) move. As the energy of this motion increases, larger domains of the PES start to become accessible by the wandering-around protons. The information of what happens to the protons in this motion can be obtained from the rovibrational wave functions. Thus, it is interesting to analyse the nuclear wave functions obtained by solving the rovibrational Schrödinger equation with our calculated PES. In figure 2, the vibrational wave functions of two high-lying states of H (excited bending states) are depicted. These states give transitions in the visible and near-visible region of the spectrum. In the plots, the internuclear distance between two of the protons is kept constant at 1.4 *a*_{0} (the equilibrium distance of an H_{2} molecule), as the wave functions of both states peak near this distance. The two states depicted can be thought of as describing a proton orbiting around an H_{2} molecule. The maxima and minima of the wave functions correspond to H geometries occurring with large probabilities during this orbiting. The relatively high amplitudes of both wave functions at linear configurations revealed by the plots are the keys to understanding the inability of past studies to achieve high accuracy in the calculations.

## 5. Conclusions

In this work, we describe in fair detail the computational procedures we have implemented over the past years to calculate with high accuracy the total non-relativistic BO energies of small molecules, using FECG functions. We specifically focus on H and we discuss the procedure involved in our recent variational FECG calculations of the BO PES of this ion [22]. At present, this is the most accurate global ground-state H PES available. Together with a simple model for non-adiabatic effects, the calculated PES has allowed us to predict the rovibrational transitions of H in the infrared and mid-visible regions of the spectrum with unprecedented accuracy. Measurements extending far into the visible region have recently become available, and our calculations have been shown to match those observed spectral lines with an average deviation as low as 0.14 cm^{−1} [22,23].

The accuracy of the global PES presented in Pavanello *et al.* [22,23] and its large spatial extent set the foundation for describing and assigning the transitions in the Carrington spectrum [29,30], which stretches into the dissociative continuum, and constitutes a new golden standard for the smallest triatomic molecule.

## Acknowledgements

M.P. acknowledges funding from a Marie Curie Fellowship of the European Commission (grant no. PIIF-GA-2009-254444).

## Footnotes

One contribution of 21 to a Theo Murphy Meeting Issue ‘Chemistry, astronomy and physics of H

_{3}^{+}’.

- This journal is © 2012 The Royal Society