## Abstract

Although there is a long history behind the idea of chemical structure, this is a key concept that continues to challenge chemists. Chemical structure is fundamental to understanding most of the properties of matter and its knowledge for complex systems requires the use of state-of-the-art techniques, either experimental or theoretical. From the theoretical view point, one needs to establish the interaction potential among the atoms or molecules of the system, which contains all the information regarding the energy landscape, and employ optimization algorithms to discover the relevant stationary points. In particular, global optimization methods are of major importance to search for the low-energy structures of molecular aggregates. We review the application of global optimization techniques to several molecular clusters; some new results are also reported. Emphasis is given to evolutionary algorithms and their application in the study of the microsolvation of alkali-metal and Ca^{2+} ions with various types of solvents.

This article is part of the themed issue ‘Theoretical and computational studies of non-equilibrium and non-statistical dynamics in the gas phase, in the condensed phase and at interfaces’.

## 1. Introduction

The first step towards the representation of a chemical structure relies on knowledge of the connectivity that establishes the order in which atoms are linked to each other. This feature became apparent from the independent work of Wöhler [1] and Liebig [2] on fulminates and cyanates. Some years later, Berzelius introduced the concept of isomerism to characterize such compounds with different properties but the same atomic composition. In turn, Pasteur [3] separated the two enantiomers (i.e. structures that rotate the plane-polarized light in opposite directions) of tartaric acid in 1848, thus helping to establish the concept of chemical chirality. However, one of the first attempts to fix the structure of a molecule was carried out by Hofmann and Kekulé, both in 1865. While Hofmann erroneously suggested a model for the structure of methane where the hydrogens define the corners of a square whose centre is the carbon, Kekulé proposed a hexagonal geometry for benzene, with all atoms in the same plane. This was corroborated by the X-ray crystallography experiments of Lonsdale in 1929 [4]. In fact, the advent of quantum chemistry and X-ray crystallography techniques enabled the discovery of the fantastic world of complexity and diversity of structures arising in Nature or that can be prepared in the laboratory. Nowadays, the concept of structure is in the realm of modern chemistry, because it is intimately related to the fundamental properties of matter; hence, much work has been devoted to the development of techniques to discover low-energy structures.

In the last two decades or so, there has been great interest in the exploration of low-energy landscapes with the aim of discovering stable structures of a great diversity of systems [5], including atomic and molecular clusters, proteins and colloids. However, minimizing the total energy of atomic and molecular clusters is a difficult numerical optimization problem, especially when weak non-covalent intermolecular interactions are involved in the binding. The potential energy surface (PES) associated with the monomer interactions is highly multi-dimensional and defines extremely rugged search landscapes with a large number of local minima and multiple-funnel topography. It must be formulated by adopting suitable functions, whose parameters, depending on the basic features of the involved partners, must have a physical meaning in order to correctly describe, in a consistent way, the structure and energy of the most and other less stable configurations. In this area, discovering the lowest-energy structures of molecular clusters is particularly challenging, as the optimization method must determine both the relative location and the spacial orientation of the particles composing the aggregate. Among the various methodologies proposed in the literature, we highlight the basin-hopping (BH) approach [6], genetic algorithms [7–14] (which, instead, we designate as evolutionary algorithms [15]), and the heuristic method combined with geometrical perturbations (HMGP) carried out through the application of surface, interior and orientation operators [16–19]. In particular, our work relies on evolutionary algorithms (EAs) that are state-of-the-art methods for cluster geometry optimization. They are robust search strategies that are able to successfully sample hard search spaces and that enable the efficient discovery of good quality solutions.

In this paper, we review the work developed by us on the global optimization of molecular clusters. We are particularly concerned with the microsolvation of alkali-metal ions, but novel results involving the Ca^{2+} ion are also discussed. Thus, we present the methodology adopted in the research involving molecular clusters in §2, and the details concerning the global optimization procedure are given in §3. The post-optimization analysis relying on electronic structure calculations and molecular dynamics simulations, which have been applied for some of the studied systems, is described in §4. In §5, we review data previously published on molecular clusters of representative solvents: water, methanol and benzene; putative global minimum (GM) structures obtained for hexafluorobenzene (HFBz) are also shown for the first time. In turn, the main results for the microsolvation of alkali ions are discussed in §6, while novel results for clusters of Ca^{2+}-(H_{2}O)_{n} (*n*=1−27) and Ca^{2+}-(CH_{3}OH)_{n} (*n*=1−10) are presented in §7. Finally, a summary of the main achievements and comments on future work are in §8.

## 2. Methodology

Global optimization methods may be used to obtain a picture of the energy landscape of molecular clusters. Although these optimization algorithms may be coupled with state-of-the-art ab initio or density functional theory (DFT) software to search for minimum-energy structures on the ‘true’ potential surface (e.g. [20–23]), this has a high computational cost and it is only affordable for systems with small dimension. A more general methodology uses a two-step approach: (i) the global optimization is performed for a less expensive semi-empirical potential function, which has been parametrized with information from experiments and/or electronic structure calculations; (ii) the minimum-energy structures so obtained are, then, locally reoptimized at the ab initio (or DFT) level of theory. Probably, the latter is the methodology most often used to determine the energy landscape of molecular clusters and, hence, it is the one we use for some of the aggregates in our work.

The static view of the energy landscape may be complemented by studying the dynamics of the molecular cluster by starting the simulation from a given minimum that was previously obtained in the global optimization process. Non-equilibrium dynamics simulations give relevant information about the difficulty of surmounting the potential barriers for converting one isomer into another, and it is also possible to estimate the dissociation temperature. In addition, the average time it takes to extract a molecule from the cluster is a key parameter to infer information about the kinetics and the mechanism of dissociation.

## 3. Optimization of molecular clusters

### (a) Potential models

Several empirical potentials have been employed to study molecular clusters. Some of the systems in our work employ standard potentials, like the TIP4P [24,25] used to describe water clusters or the OPLS-AA model [26] available for describing the interactions within Li^{+}–water, Li^{+}–methanol, Ca^{2+}–water and Ca^{2+}–methanol (see also [27] and references therein). For the remaining systems, however, we have adopted more sophisticated models.

The semi-empirical potential methodology, developed in collaboration with the Perugia Group, exploits the idea that the PES depends on the critical balance of a limited number of leading interaction components. The latter are considered to be ‘effective’ components because they indirectly include the role of less important terms (such as those due to many-body, perturbative charge transfer and mixing effects) that often provide smaller and opposite contributions which tend to balance. Such effective components are properly formulated by using empirical/semi-empirical functions that contain parameters with a defined physical meaning, such as those related to the basic physical properties of involved partners. Specifically, they must depend on the total polarizabilty, which is related to the shape, dimension and energetics of the outer electronic clouds of each partner, on the polarizability components, which are exploited to correctly identify the interaction centres on the molecular frame, on the permanent charge and on the permanent electric multipole. For clusters involving a closed-shell ion and polyatomic molecules, such as alkali ion–benzene/hexafluorobenzene molecules, the leading components have been identified by the size (Pauli) repulsion combined with the dispersion plus induction attraction (indicated as *V*_{nel}) to which must be added an electrostatic term (*V*_{el}) that describes the interaction between the permanent charge of the ion and the charge distribution on the molecular frame. Recently, an improved Lennard-Jones (ILJ) function [28] was introduced to represent *V*_{nel}. The reliability and generality of such a function has been tested on accurate scattering and spectroscopic data and it has been demonstrated that it removes most of the inadequacies of the traditional LJ model due to an excessive repulsion at short range and a too strong attraction at long range, which can modify the interaction behaviour especially for large clusters. Some examples of applications of this method are given in [29,30] (see also references therein). In addition, the basic potential parameters of *V*_{nel} and *V*_{el} can be properly optimized by comparing the method predictions with the results of the experimental investigation, when available, and/or with high-level *ab initio* calculations, performed at defined structures of the simplest aggregates (see for instance [31]).

### (b) Evolutionary algorithm

EAs are stochastic global optimization methods, loosely inspired by the principles of natural selection and genetics. When solving a problem, EAs generate an initial set of random solutions. Then, they iteratively seek to improve the solution pool until a termination criterion is met.

Defining a genetic representation for the solutions is a crucial step in the application of an EA. In the optimization of a molecular cluster with *n* monomers, the genotype is a linear sequence of *n* 6-tuples real values. Each tuple codifies the location and orientation of a single molecule in the following way: the first three values specify the Cartesian coordinates (*x*,*y*,*z*) of the molecule’s centre of mass, whereas the remaining three encode the Euler angles (*ϕ*,*θ*,*ψ*) describing the orientation of the rigid body in the three-dimensional space [32]. Given minimum and maximum bounds for the coordinate ranges, the EA explores the search space defined by all possible coordinates and orientations, seeking the arrangement that leads to the global minimum.

All candidate solutions generated by the EA are locally optimized by L-BFGS, a quasi-Newton procedure that efficiently converges to the nearest local optimum [33]. The fitness value of each converged solution corresponds to its potential. If a solution is evaluated in a coordinate frame different from the one encoded in the genotype, the EA performs the required transformation and then the potential is calculated (see [32] for details).

Three operators regulate the actions of an EA iteration. Tournament selection is a probabilistic operator that chooses the best solutions from the current pool to be the parents of the next generation. One of the strong points of tournament selection is its ability to adjust the selective pressure by changing the tournament size. Then, crossover and mutation take selected solutions and use the encoded information to create new molecular clusters. Simulated binary crossover is a state-of-the-art arithmetic operator for numerical optimization [34]. It takes pairs of parents and, with a given probability, exchanges information between the two real-valued sequences, leading to the formation of two novel vectors. Sigma mutation is applied to individual values from the genetic sequence and slightly modifies them. If applied to a gene encoding a coordinate, it moves the molecule’s centre of mass, whereas the perturbation of an angle changes its orientation. The perturbation is obtained from a Gaussian distribution with mean 0 and *σ* standard deviation, a parameter that helps to control the magnitude of the change.

The new solutions are locally optimized and evaluated. Afterwards, they replace the old population and a new cycle starts. Given the stochastic nature of the algorithm, in some iterations the quality of the best solution decreases. When that happens, an elitist operator takes the best solution from the last generation and directly inserts it in the new population. Figure 1 highlights the main steps of the EA.

To assess the ability of the EA to discover the global minimum structures of molecular clusters, we have applied it to different types of benchmark systems in previous works (aggregates of water [32], benzene [32] and lithium ion solvated by water molecules [35]). The results collected included the ability of the EA to hit the putative global minimum and the number of successful runs (i.e. the success rate). The EA could hit all the putative global minima for clusters of water (up to *n*=20), benzene (up to *n*=30) and Li^{+}−(H_{2}O)_{n} (*n*=1−20); for the last, the EA was even able to discover [35] a new global minimum structure for Li^{+}−(H_{2}O)_{17}. We further extend the benchmarking to Ca^{2+}-(H_{2}O)_{n} clusters; the results obtained with the EA are discussed in §7.

As expected, the success rates tend to deteriorate with increasing size of the cluster. Nevertheless, values of 100% were reached for most of the studied cases; conversely, success rates below 30% were only obtained for a limited number of cluster sizes: *n*≥17 for water, *n*=27 and 29 for benzene, and *n*=15,17,18 and 20 for Li^{+}−(H_{2}O)_{n} aggregates.

Moreover, we have noticed that systems involving hydrogen-bond networks require longer runs and, hence, a number of potential-energy evaluations of 100 000 or even more (e.g. 250 000 for water clusters) were employed [32,35]. Because benzene aggregates are easier to globally optimize, we have used only 50 000 potential evaluations per run in this case [32]. Nonetheless, the global minima were discovered, in most cases, after a reduced number of generations (or potential evaluations), particularly for the smaller cluster sizes.

## 4. Post-optimization analysis

### (a) Electronic structure calculations

To assess the quality of the potential for the interaction of the potassium ion with clusters of benzene and hexafluorobenzene, we performed electronic structure calculations on the clusters with low *n*. We started the geometry optimization by using the minimum structures obtained from the EA optimization.

We used GAMESS [36] and Gaussian [37] for the MP2 [38] calculations, with Gaussian being used on some of the larger systems. All, except for the most inner electrons, were correlated in the MP2 calculation. The basis set used was the TZV basis [39,40], as named in GAMESS. This basis set is made up by combining the Dunning [40] basis set for hydrogen, carbon and fluoride with the basis set of Wachters [41] for potassium. To this basis set, a *p*-type polarization function was added to hydrogen and a *d*-type polarization function was added to all the other atoms. As this basis set is not available in Gaussian, we used the EMSL Basis Set Exchange [42,43] to obtain the basis set in the apropriate format for Gaussian. We checked the validity of the obtained basis set by performing MP2 calculations on benzene (using the same geometry) and compared the Hartree–Fock and MP2 energies obtained and they matched, within the calculation error. This also ensured that the same orbital space was being correlated in the two programs. We used a convergence criterion for the geometry optimization of 10^{−5} *E*_{h}/*a*_{0} or better, across all the calculations; note that 1*E*_{h}(hartree)=4.359744650×10^{−18} J and 1*a*_{0}(bohr)=5.2917721067×10^{−11} m. Time and computer constraints prevented us from performing the MP2 calculations for the systems with *n*>4 for benzene and with *n*>3 for hexafluorobenzene.

### (b) Molecular dynamics simulations

Molecular dynamics (MD) simulations were performed using the DL_{−}POLY program [44]. We considered a microcanonical ensemble of particles (NVE), and simulations lasting 20 ns were carried out for Na^{+}−(Bz)_{4} and Cs^{+}−(Bz)_{4} at different values of temperature, *T*. Within the NVE ensemble of particles, one can ensure the conservation of the total energy (kinetic+potential) of the cluster system (*E*_{tot}=*E*_{kin}+*E*_{cfg}). A time step of 0.001 ps, which is large enough to keep the fluctuations of *E*_{tot} (when no isomerization occurs) well below 10^{−5} kJ *mol*^{−1}, was used in all the simulations, which were performed without imposing boundary conditions. The initial configurations are those derived in our previous work [29]. As in this work we are concerned with the study of both the isomerization processes and the tendency of the clusters to dissociate; the equilibration stage usually considered in MD is absent from our simulations. In fact, we want to investigate the differences in the dynamic behaviour of the most stable isomers of Na^{+}−(Bz)_{4} and Cs^{+}−(Bz)_{4} and, hence, we needed to avoid the inconvenient interconversion among such isomer structures that may occur during the equilibration stage of the system. This means that in our simulations, independently of the *T* considered, the initial configuration of the simulations is that of the most stable isomer. The analysis of the differences between Na^{+}−(Bz)_{4} and Cs^{+}−(Bz)_{4} was based on the mean values of both the potential energy (configuration energy, *E*_{cfg}) and *T*. Moreover, in order to distinguish between the isomerization and dissociation processes, the time evolution of some relevant distances was also analysed.

## 5. Aggregates of solvent molecules

Water clusters have been much studied for different reasons and, hence, they constitute the benchmark systems. Because of this, water clusters modelled with the TIP4P potential were used as a first test for the performance of our EA [32]. As the water molecule has a large dipole moment and it may establish strong hydrogen bonds, the (H_{2}O)_{n} clusters show a quite organized structure. For instance, the TIP4P potential tends to give regular polyhedral structures (that are composed of fused cubes) for the global minima [6], while the TIP5P model preferentially adopts tetrahedral arrangements where the number of H-bonds is not maximized [45]. In general, MP2 calculations [46–48] point to a better agreement with the energetic trends of TIP4P clusters than those of the TIP5P ones [45].

Methanol clusters have been studied by many authors [35,49–55]. As the methanol molecule combines a hydrophobic part with a hydrophilic OH-group, the structure of (CH_{3}OH)_{n} clusters is less dominated by the H-bond network than in the case of water. For *n*≤20, the (H_{2}O)_{n} clusters increase the formation of H-bonds by one to three per water molecule that is added, while this number is almost always one for the growth of (CH_{3}OH)_{n} [35]. Such structural features may be viewed as indicative trends, because the OPLS-AA model used in the global geometry optimization is not expected to be accurate for small clusters, as shown by the comparison with the dispersion-corrected DFT results [54].

In contrast with water or methanol, benzene clusters are prototype systems that can be used to study non-polar environments, which are relevant to understanding the selectivity of many chemical phenomena occurring in living organisms. In a recent work [31], we proposed a new accurate Bz–Bz interaction potential to obtain low-energy structures (including the global minimum) of benzene clusters up to *n*=25 by employing the EA. In the case of small aggregates, we recalculated the low-lying minima at the MP2C/CBS level of theory [56]: the four low-energy minima of Bz_{3} follow the same order as for the empirical potential, and the Bz_{4} global minimum was also confirmed [31]. As in previous studies [32,57] with less accurate potentials [58,59], structures with special stability compared with their neighbour-sizes (so-called ‘magic numbers’) arise at *n*=13,19 and 23, which appears to be related to the number of Bz molecules surrounded by a maximum of 12 nearest-neighbours [31,57]; nonetheless, these potentials show different structures for several cluster sizes.

In addition, we present putative global minima for the clusters (HFBz)_{n} (*n*=1−20) that were calculated with the EA; the HFBz–HFBz interaction potential is the same as in [30]. Figure 2 displays the lowest-energy structure of these clusters. While Bz_{2} shows a T-shaped structure [31], the lowest-energy minimum for HFBz_{2} is a displaced-stacked graphite-like structure. Nonetheless, there are many HFBz_{n} structures resembling the corresponding Bz_{n} ones. This is particularly true for *n*=3,4 and 8, whose global minimum structures are quite symmetric and belong to the same point group of symmetry in both Bz_{n} and HFBz_{n} clusters. Most of the remaining Bz_{n} and HFBz_{n} structures have no symmetry, with the exceptions of Bz_{2} (*C*_{s}), Bz_{6} (*C*_{2}), Bz_{13} (*S*_{6}) and HFBz_{17} (*C*_{2}).

The binding energy has a similar trend for both the HFBz and Bz clusters (see electronic supplementary material, figure S1); excluding *n*=2, the Bz_{n} energy values are always slightly above the HFBz_{n} ones. Also the second energy difference (bottom panel) shows the most prominent ‘magic number’ at *n*=13, though the peak for HFBz_{13} is smaller than for Bz_{13}. Probably, the symmetric Bz_{13} cluster is more stabilized than the HFBz_{13} *C*_{1} structure, which may explain the difference in the two peaks. By contrast, the *n*=19 ‘magic number’ for Bz clusters is missed for HFBz_{19}.

## 6. Microsolvation of alkali-metal ions

### (a) M^{+}(H_{2}O)_{n} and M^{+}(CH_{3}OH)_{n} clusters

A thorough global optimization study on the clusters M^{+}(H_{2}O)_{n} (with M=Na, K, Cs), *n*≤24, has been carried out by Hartke and co-workers [60–62] with their genetic algorithm [7,11,12]; the TIP4P/OPLS potential [24,63] was employed. Especially interesting is the result for *n*=20: they obtained [60–62] a ‘magic number’ for both K^{+}(H_{2}O)_{20} and Cs^{+}(H_{2}O)_{20} clusters, while it is absent for Na^{+}(H_{2}O)_{20}, which is in agreement with high-resolution mass spectrometry experiments [61]. Instead of an expected dodecahedron cage at *n*=20 (where the water molecules would form five-member rings with the cation in the centre), the global optimization results for clusters with K^{+} and Cs^{+} suggest the formation of cage structures which may contain four-, five- and six-member rings, but all with three-coordinated water molecules. Also, MD simulations on these clusters show that the molecular motion appears to be essentially hindered by the network, which contributes to the stability of such structures [62]. Conversely, the GM of Na^{+}(H_{2}O)_{20} is not a clathrate, as the structure has the cation located off-centre [61], and MD simulation has shown that the GM is easily accessible from the higher-energy dodecahedral geometry [62]. Accordingly, recent direct experimental observations, based on infrared photodissociation spectroscopy and blackbody infrared radiative dissociation, show [64] that clathrates are formed for K^{+}(H_{2}O)_{20} and Cs^{+}(H_{2}O)_{20}, but these types of structures are much less significant for Na^{+}(H_{2}O)_{20}.

Clusters of Li^{+}(H_{2}O)_{n} (*n*=1−20) have been globally optimized by González *et al.* [65] with the BH method [66]. More recently, three of us [35] have reproduced the global minima published in [65] with the EA described in §3b. The reoptimization at the ab initio level for *n*≤10 originates similar structures [35]. As mentioned in §3b, a new lowest-energy structure was obtained for *n*=17, but the energy order is reverted at the ab initio level. In general, the Li^{+} tends to occupy a peripherial position in the cluster, thus showing a weaker ‘structure-breaking’ effect, especially for large clusters. However, the formation of the first solvation shell in the smaller clusters (up to *n*=6) leads to a predominance of the ion–solvent interaction, with a central ion surrounded by water molecules; such behaviour is somehow similar to what happens with Na^{+} [60–62].

The microsolvation of Li^{+} with methanol leads to structures where, as in water, the cation is preferentially off-centre [35]. For the smaller clusters, the first solvation shell is completed with four solvent molecules, but the coordination number may be extended up to six for Li^{+}(CH_{3}OH)_{20}. In contrast with the solvation with water, the H-bond network does not change too much by the addition of one methanol molecule. In turn, the loss of H-bonds due to the presence of the ion never exceeds four and, for *n*≥12, the number of H-bonds becomes the same as in pure methanol clusters [35]. Even though we have obtained [35] global energy structures of Li^{+}(CH_{3}OH)_{n} clusters up to *n*=20, most of the success of the present methodology was mainly verified for *n*=6 and *n*=7. For these sizes, we have reoptimized at the MP2 level of theory the geometries of low-energy aggregates obtained with the EA and discovered new global minima that were missed in a previous work by Wu *et al.* [67]. Moreover, the calculated [35] vibrational spectrum for the Li^{+}(CH_{3}OH)_{6} global minimum is compatible with the experimental one [67].

### (b) M^{+}(Bz)_{n} and M^{+}(HFBz)_{n} clusters

In contrast with some similarity between the structures of Bz_{n} and HFBz_{n} (see above), we have shown [30] that the scenario changes completely for the clusters resulting from the microsolvation of K^{+} with Bz and HFBz. While the ion approaches Bz preferentially along the *C*_{6} axis, the lowest minimum for K^{+}-HFBz is a planar structure with K^{+} equidistant from two fluorine atoms. Such a difference was attributed [30] to the inversion in the sign of the quadrupole moment of Bz and HFBz [68–70], which is well described by the potential model employed in our global optimization studies.

Although we have obtained global minimum structures up to *n*=21 for both solvents [29,30], special attention has been devoted to the formation of the first solvation shell. In the case of K^{+}-(Bz)_{n}, the first solvation shell accommodates four Bz molecules; by changing the ion to Na^{+} or Cs^{+}, we have concluded [29] that the first solvation shell closes at *n*=3 or *n*=4, respectively. It is worth noting that such results are in line with the infrared and mass spectrometry experiments of Lisy and co-workers [71,72], who reported that a closure of the first solvation shell is expected at *n*<4 (*n*=4) for a sodium (potassium) ion. Regarding the K^{+}-(HFBz)_{n} clusters, the closure of the first solvation shell occurs at *n*=9.

As the global optimization of these clusters was based on a semi-empirical potential, we reoptimized the geometry of the smallest structures, as described in §4, so that one can assess the reliability of the interaction model as the size of the system increases. The results were analysed in terms of the energy and the root mean squared deviation calculated with the SAICS program [73] for the best superposition of the structure from the global optimization and the one obtained after reoptimization at the ab initio level. Figure 3 shows the best superposition between different pairs of structures for clusters K^{+}-(Bz)_{n} (*n*=1−4) and K^{+}-(HFBz)_{n} (*n*=1−3); for both K^{+}-(Bz)_{3} and K^{+}-(HFBz)_{3}, we have analysed three isomers. As one can see from figure 3, and table S1 in the electronic supplementary material, the structures are very similar and they do not deviate significantly from the EA optimized structures. Also, in the case of the isomer structures, the energy order is maintained.

In a previous work [74], we have studied the dynamics of first-solvation-shell clusters of a potassium ion with either benzene or hexafluorobenzene; the focus was on different isomers of the clusters: K^{+}−(Bz)_{3}, K^{+}−(Bz)_{4}, K^{+}−(HFBz)_{8} and K^{+}−(HFBz)_{9}. It has been shown [74] that isomerizations from low-energy isomers to form the GM structure may occur at *T*∼15−20 K (or *T*<90 *K*) for K^{+}-(Bz)_{3} (K^{+}-(Bz)_{4}). This is a clear indication that the low-energy isomers are connected through low potential barriers. Similarly for K^{+}−(*HFBz*)_{8}, the isomerization from the GM to higher-energy minima also occurs at low temperatures (e.g. *T*=24 K); large-amplitude motion (which is compatible with isomerization from the GM structure) was also observed for K^{+}−(HFBz)_{9} at *T*=210 K. When departing from higher-energy structures, however, isomerization to form the GM is never obtained (even for high temperatures), which is an indication that the energy landscapes of K^{+}-(HFBz)_{n} clusters are more complex than the K^{+}−(Bz)_{n} ones.

Concerning the dissociation dynamics, we have noticed [74] that dissociation occurs at *T*∼90 K (*T*=400 K) for K^{+}−(Bz)_{3} (K^{+}−(Bz)_{4}). In turn, dissociative events were observed at *T*=260 K (*T*=210 K) for *K*^{+}−(HFBz)_{8} (K^{+}−(HFBz)_{9}). While for K^{+}−(HFBz)_{8} the dissociation occurs at around 20 ns of simulation time, such an event is faster in the case of K^{+}−(HFBz)_{9} (about 12 ns). Thus, it appears that K^{+}−(HFBz)_{9} is kinetically less stable than K^{+}−(HFBz)_{8}.

In the remainder of this section, we further analyse the dissociation dynamics of microsolvation clusters of alkali-metal ions with benzene. We have carried out new dynamics simulations for *Na*^{+}−(Bz)_{4} and Cs^{+}−(Bz)_{4}, which allows for a detailed comparison with the results previously obtained [74] for K^{+}−(Bz)_{4}. Similarly to the previous results, MD simulations for Na^{+}−(Bz)_{4} and Cs^{+}−(Bz)_{4} indicate that low-energy structures isomerize to form the GM at low temperatures (*T*∼20−25 K).

With regard to the dissociation of Na^{+}−(Bz)_{4} to form Na^{+}−(Bz)_{3}+Bz, MD simulations show that dissociation occurs at lower temperatures than for K^{+}−(Bz)_{4}. In fact, taking as the initial configuration that of the GM, we have observed that dissociation occurs at temperatures higher than 275 K. The evolution, along the simulation time at *T*=280 K, of the six distances between benzene molecules and the four from Na^{+} to Bz are represented in the top and middle panels of figure 4*a*, respectively, while the values of *E*_{cfg} are shown in the lower panel.

In contrast, the dissociation of Cs^{+}−(Bz)_{4}, analysed from MD simulations taking as the initial configuration that of the corresponding GM, is observed at temperatures higher than 550 K. In the top and middle panels of figure 4*b*, we analyse the evolution, along a trajectory performed at *T*=560 K, of the relevant Bz–Bz and Bz-Cs^{+} distances, while in the lower panel the evolution of *E*_{cfg} is shown.

MD simulations of Na^{+}−(Bz)_{4} (using the Cs^{+} mass) and of Cs^{+}−(Bz)_{4} (using the Na^{+} mass) have also been performed and the results indicate that the sequence of dissociation temperatures for Na^{+}−(Bz)_{4} (*T*=280 K), K^{+}−(Bz)_{4} (*T*=400 K) and Cs^{+}−(Bz)_{4} (*T*=560 K) is not the consequence of a mass effect. Moreover, the cation–Bz interaction contributions (for details, see [29,74] and references therein) seem to suggest that the differences in the temperature of dissociation cannot be due to energy effects. Accordingly, such differences could be due to the distribution of the Bz molecules in the first solvation shell. An analysis of the radial distribution functions at temperatures lower than the dissociation one shows the different mobilities of the Bz molecules around the cation. Figure 5 shows the results obtained at *T*=250 K for K^{+}−(Bz)_{4} and Cs^{+}−(Bz)_{4}.

As can be seen in figure 5, the fluctuation of the Bz–M^{+} distances is larger for K^{+} than for Cs^{+}, indicating that K^{+}−(Bz)_{4} forms a more flexible structure than Cs^{+}−(Bz)_{4}, thus favouring an earlier dissociation of K^{+}−(Bz)_{4} than of Cs^{+}−(Bz)_{4}. In fact, in our previous paper on M^{+}-(Bz)_{n} systems [29], where the distances between the centre of the benzene rings and the alkali ion were analysed as a function of the cluster size, it was observed that such distances are equal for the GM of Cs^{+}−(Bz)_{4}; however, for K^{+}−(Bz)_{4} only three of them are equivalent, while the fourth one is slightly larger. Moreover, strong ‘magic numbers’ of the M^{+}-(Bz)_{n} clusters appear at *n*=2,3 and 4 for M=Na, K, and Cs, respectively, thus explaining the higher stability of Cs^{+}−(Bz)_{4}. The increase of two molecules of Bz in respect to the magic number *n*=2 for Na^{+} explains the dissociation of the Na^{+}−(Bz)_{4} at low temperatures.

## 7. Microsolvation of calcium ion with water and methanol

We report here for the first time global optimization results obtained with the EA described in §3b for the clusters Ca^{2+}(H_{2}O)_{n} (*n*=1−27) and Ca^{2+}(CH_{3}OH)_{n} (*n*=1−10). Previously, González *et al.* [65] published putative GM for Ca^{2+}(H_{2}O)_{n} (*n*=1−20), which we have used for comparing our structures. We could reproduce all the lowest-energy structures of [65] up to *n*=19; for *n*=20, the structure from [65] and the one obtained with the EA are essentially degenerate, but show a very different structure (see electronic supplementary material, figure S2). Although the two structures have a similar first solvation shell with a coordination number of eight, the new minimum shows several water molecules in the third shell whereas this is absent for the cluster from [65].

The high coordination of Ca^{2+} for most of the clusters with *n*≥8 (exceptions in the studied range are only *n*=9 and 10, with seven water molecules coordinated) is a consequence of the strong ion–solvent interaction, which is highly competitive with the formation of the hydrogen bond network (especially for *n*≥15, at which molecules begin to arise in the third solvation shell); there is, however, some disparity among different experimental and theoretical estimates for the coordination number of these clusters (e.g. [75] and references therein), and this certainly deserves further investigation. Even though the cation prefers to stay inside the cluster (in agreement with MD simulations [27]), thus working as a structure-breaking species, the formation of clathrates with the lowest energy are not observed up to *n*=27. In fact, it was shown [65] that the formation of a clathrate for Ca^{2+}−(*H*_{2}*O*)_{20} is unfavourable, because the water–ion energy contribution rather stabilizes the GM.

Concerning the solvation of Ca^{2+} by methanol, we have studied clusters only for *n*≤10, because the global optimization is more computationally demanding due to the large dimensionality of the system (which complicates the calculation of the potential function), as well as the great number of local minima in a small energy window. The results show (see electronic supplementary material, figure S3) that, up to *n*=6, the GM clusters grow by the addition of one methanol molecule to the first solvation shell. For *n*=7, the coordination number retains a value of six (as for *n*=6), while one methanol molecule goes to the second solvation shell and establishes two hydrogen bonds. The *n*=8 cluster has seven molecules in the first solvation shell and one outside that forms two hydrogen bonds. The coordination of seven is retained for *n*=9 and 10, with two and three molecules in the second shell forming four and six hydrogen bonds, respectively.

## 8. Summary and future work

We have reviewed our recent work on molecular clusters by employing a methodology based on evolutionary algorithms. In addition, we have reported new global optimization results for hexafluorobenzene clusters and microsolvation of Ca^{2+} with either water or methanol. It has been shown that our EA has been quite successful in discovering the GM of the studied molecular clusters and, hence, it is expected to show a similar behaviour for other systems. We should also mention that one expects the EA and BH methods to have equivalent performance for searching the global minimum structure, as has been shown for water clusters [76]. Nonetheless, appropriate procedures (that include several particle-move strategies in the Monte Carlo process) are usually employed in the BH method to help in reaching the global minimum of difficult systems [77].

In the microsolvation of ions, the structure of the clusters results from the balance between the ion–solvent and solvent–solvent interactions. Thus, strong solvent–solvent attraction leads to structures where the ion preferentially occupies a peripheral position in the cluster and shows a low coordination number; examples of this behaviour have been presented for Li^{+} and Na^{+} with solvents that form H-bonds. Despite the importance of the H-bond network, heavier ions, such as K^{+} and Cs^{+}, or with larger charge (e.g. Ca^{2+}) tend, by contrast, to be more central (having high coordination numbers) in the structure of the microsolvation clusters; stable cage structures were observed in the case of K^{+} and Cs^{+} ions. As for the solvation of K^{+} with Bz or HFBz, the ion tends to be located in the interior of the cluster (not necessarily in the centre), with the first solvation shell closing at *n*=4 or *n*=9, respectively. MD simulations of such systems indicate that, although low-energy barriers among the most stable isomers would be expected due to the rapid isomerizations occurring at low temperatures, the energy landscape for K^{+}(HFBz)_{9} appears to be more complex than that for the K^{+}(Bz)_{3} and K^{+}(Bz)_{4} clusters. In turn, the faster dissociation of K^{+}(Bz)_{4} in comparison with Cs^{+}(Bz)_{4} (or Na^{+}(Bz)_{4}) has been attributed to a more flexible structure of the former cluster.

Another important application of global optimization not treated in this work is the use of this methodology to generate a pool of low-energy isomers for a given cluster size that can be tentatively assigned to the experimental spectra. For instance, this powerful strategy has been employed recently for the structure assignment of water clusters [78]. However, there are various isomers that contribute to the spectrum and, for example, in the case of (H_{2}*O*)_{32} seven structures were required to obtain a good match between the experimental and theoretical data [78].

The EA employed in this work is completely unbiased, as it does not incorporate any specific information about the chemical or physical properties of the clusters being optimized. This design option enhances the robustness and generalization ability of the algorithm, without compromising its effectiveness, as confirmed by the results presented in this study. There is, however, a natural decrease in the success rate, as the size of the clusters increases. One possible line of research that might enhance the scalability of the method and foster the ability to optimize large molecular clusters is to develop specific transformation operators to handle molecular clusters, particularly the generalization of the cut-and-splice operators [13,79] to these chemical systems.

## Data accessibility

Electronic supplementary material includes figures S1–S3, table S1 and Cartesian coordinates and energies of all minima.

## Authors' contributions

J.M.C.M. conceived and designed the study, and drafted part of the manuscript. F.B.P. developed the code of the evolutionary algorithm employed in the GM optimization and drafted §3b. J.L.L.-T. developed the interface code that links the semi-empirical potentials and the evolutionary algorithm, and also performed the global optimization of clusters resulting from the microsolvation of Ca^{2+} with either water or methanol. P.E.A. performed ab initio calculations on the K^{+}-benzene clusters. M.A. performed molecular dynamics calculations, helped in the development of the semi-empirical potentials for the alkali-metal ions with benzene and hexafluorobenzene and drafted §4b and §6b. A.A. helped in the development of the semi-empirical potentials for the alkali-metal ions with benzene and hexafluorobenzene and contributed to the discussion on the molecular dynamics simulations. F.P. helped in the development of the semi-empirical potentials for benzene clusters and the alkali-metal ions with benzene and hexafluorobenzene and drafted §3a. M.B. performed ab initio calculations on the *K*^{+}-hexafluorobenzene clusters and helped in the development of the semi-empirical potentials for benzene clusters. All authors read and approved the manuscript.

## Competing interests

The authors declare that they have no competing interests.

## Funding

J.M.C.M. and P.E.A. acknowledge support from the Coimbra Chemistry Centre (CQC), which is financed by the Portuguese ‘Fundação para a Ciência e a Tecnologia’ (FCT) through project no. 007630 UID/QUI/00313/2013, co-funded by COMPETE2020-UE. M.A. and A.A. are grateful for financial support from the Ministerio de Educación y Ciencia (Spain, project no. CTQ2013-41307-P) and from the Generalitat de Catalunya (2009SGR-17). F.P. acknowledges financial support from the Italian Ministry of University and Research (MIUR) for PRIN 2010-2011, grant no. 2010 ERFKXL_002. M.B. acknowledges financial support by the Spanish ‘Ministerio de Ciencia e Innovacion’ for grant no. FIS2013-48275-C2-1-P.

## Acknowledgements

J.M.C.M. is grateful to the COST action ‘Molecules in Motion’ (MOLIM, CM1405) for support. F.P. acknowledges the Dipartimento di Chimica, Biologia e Biotecnologie (Università di Perugia) and Fondazione Cassa di Risparmio di Perugia (progetto 2015.00331.021). We are grateful for the provision of supercomputing time in the resources hosted at Laboratório de Computação Avançada, Universidade de Coimbra (Portugal), Center de Supercomputació de Catalunya CESCA-C4 and Fundació Catalana per a la Recerca, and CESGA (Spain).

## Footnotes

One contribution of 11 to a theme issue ‘Theoretical and computational studies of non-equilibrium and non-statistical dynamics in the gas phase, in the condensed phase and at interfaces’.

Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3685621.

- Accepted September 12, 2016.

- © 2017 The Author(s)

Published by the Royal Society. All rights reserved.