## Abstract

High molecular weight polymer systems show very long relaxation times, of the order of milliseconds or more. This time-scale proves practically inaccessible for atomic-scale dynamical simulation such as molecular dynamics. Even with a Monte Carlo (MC) simulation, the generation of statistically independent configurations is non-trivial. Many moves have been proposed to enhance the efficiency of MC simulation of polymers. Each is described by a proposal density (* x*′;

*): the probability of selecting the trial state*

**x***′ given that the system is in the current state*

**x***. This proposal density must be parametrized for a particular chain length, chemistry and temperature. Choosing the correct set of parameters can greatly increase the rate at which the system explores its configuration space. Computational steering (CS) provides a new methodology for a systematic search to optimize the proposal densities for individual moves, and to combine groups of moves to greatly improve the equilibration of a model polymer system. We show that monitoring the correlation time of the system is an ideal single parameter for characterizing the efficiency of a proposal density function, and that this is best evaluated by a distributed network of replicas of the system, with the operator making decisions based on the averages generated over these replicas. We have developed an MC code for simulating an anisotropic atomistic bead model which implements the CS paradigm. We report simulations of thin film polystyrene.*

**x**## 1. Introduction

Since polyacetylene was found to be a semiconductor in the late 1970s (Chiang *et al*. 1977) there has been a rapid proliferation of polymer based optical and electronic devices, from diodes and transitors to polymer LEDs (PLEDs), photodiodes and photovoltaics, and even polymer based lasers (Braun 2002; Peterson & Metzger 2004). The semiconducting properties of these materials derives from the overlap of delocalized *π* bonds along the length of the chain, with a band gap between the valence and conduction bands. Light emission events occur from the recombination of mobile electron and hole states. The physics of charge transport in semiconducting polymeric materials is determined by the dynamics of polarons and excitons. It has been seen experimentally that the interchain separation of polymers strongly affects the polaron generation and so the photoluminescent efficiency (Lipson *et al*. 2004), and that the charge carrier mobilities are strongly affected by the composition of the device (Pacias *et al*. 2003). This work aims to improve the methodology by which relaxed conjugated polymer structures can be found and so provide an understanding of the local configurations that control inter-chain hopping of polarons.

To develop theoretical models for the performance of such devices, we must consider interfacial regions. The ability to control these interfaces may be among the most important factors in the success of polymeric light emitting devices (Friend *et al*. 1999; Scott 2003 and references therein). The surface of a polymer is different from the bulk, as the surface shows a depressed glass transition temperature (Liem *et al*. 2004). In films thin compared to the radius of gyration of the polymer, the polymer strands are seen to coil into ellipsoids and cigar shapes rather than the spheres seen in the bulk (Mansfield & Theodorou 1991; Baschnagel & Binder 1995; Kraus *et al*. 2000; Kerle *et al*. 2001). There is evidence from sum frequency generation studies that side groups preferentially orientate themselves out of the surface, both the phenyl groups in polystyrene (Gautam *et al*. 2000; Briggman *et al*. 2001) and methyl groups in polypropylene (Zhang *et al*. 1997; Opdahl *et al*. 2002). The glass-transition temperature *T*_{g} affects the lifetime of a device at elevated operating temperatures. *T*_{g} is itself affected by the porosity of the surface, the density of chain ends and the mobility of chains in the surface, as well as the inherent intermolecular interactions which might either lead to closer packing or a more amorphous surface. We would like to model the orientation of side-branches, the porosity and other properties at surfaces. This necessitates modelling a polymer surface in atomistic detail.

Polymers used for modern devices are complex, with rigid units such as phenyl rings and side-branches to aid solubility. The set of Monte Carlo (MC) moves in the literature for atomic-scale modelling (Vacatello *et al*. 1980; Pant & Theodorou 1995; Mavrantas & Theodorou 1999; Mavrantzas *et al*. 1999; Wick & Siepmann 2000; Uhlherr 2000; Doxastakis *et al*. 2001*a*,*b*; Uhlherr *et al*. 2001) has shown that it is feasible to attempt equilibration of chain molecules. However, the current state-of-the-art is a good deal less than the molecular weights of polymers used in technological applications, and most moves have been developed for the simplest polyethylene backbone with no side-branches. New MC moves and modifications to existing algorithms are suggested regularly. We would like to prototype rapidly and test moves for a new polymer system, where the dominant mechanism of motion might be through reptation or by altering the internal degrees of freedom.

Here we formalize a methodology by which a set of MC move parameters for a new polymer can be readily optimized, vastly improving efficiency. This requires a simple means of testing whether one set of parameters is ‘better’ than another. As our performance measure, we track the correlation time of the polymer system, measured as the characteristic time in wall time (seconds) for new polymer conformations to be generated. If adjusting a parameter decreases correlation time, it has made the system more computationally efficient. We have performed tests on the much-studied polystyrene system. This has a rigid phenyl group, and a side-branch on the monomer, so it contains some of the essential complexities found in commercially important polymers.

Manchester Computing has provided us with the computational steering application programming interface (CS API) and steering client as part of the RealityGrid project, which allows us to manipulate directly program parameters and graph monitored variables. We have integrated the CS API into a new polymer simulation code, Polysteer. The CS API provides us with a steering client which can request services, run applications and manipulate directly the data stored by a running MC program, graphing abilities so we can monitor the effects of our changes and checkpointing and logging facilities so results can be reproduced. We show that to monitor the program online means a trivial parallelization strategy that is perfectly suited to the distributed model offered by the Grid is not just desirable, but *required* to explore properly the parametrization of the proposal density (* x*′;

*).*

**x**## 2. Steering the Metropolis method

The Metropolis Monte Carlo method has been widely used in molecular simulation for many years. It is used to generate a Markov chain of configurations, with each new configuration being generated using a set of selection rules from the previous, and the acceptance probability of the configuration change being min (*a*, 1), where(2.1)where is the Boltzmann factor for state * x* and (

*′;*

**x***) is probability of selecting state*

**x***′ given the initial state*

**x***.*

**x**We will steer the simulation by direct manipulation of the selection probability function (* x*′;

*). This is achieved by the user altering the value of a parameter tracked by the steering client.*

**x**As this is not a stochastic process, it will bias the path of the system in an unqualified manner, and so statistical quantities computed from the sequence of configurations from before and after the change may not be combined. We therefore conceptually split the simulation into two sections. In the first section the user may modify parameters looking for an optimal set for simulation. In the second parameter modification is barred, the system is allowed to evolve naturally and statistical quantities are generated.

We will assume that the selection probability is a function of a set of steerable parameters {*s*}, those registered with the steering client and eligible for direct user manipulation; and also a set of unsteerable parameters {*t*}. To determine the optimal values for {*s*} requires that we have some measure by which ({*s*}; {*t*}) can be compared to ({*s*′}; {*t*}), and that by this measure one may be found to be better.

If the system is evolved over long time scales using the Metropolis procedure, the probability of finding the system in state * x* at a given time is proportional to its Boltzmann factor. The ensemble average of a property may therefore be estimated by its average taken over a sufficiently large number of state samples. But it is extremely difficult to determine in a general sense what ‘sufficiently large’ should mean, as the simulation may well have been trapped in one or more local minima, without ever having explored the full configuration space. So ({

*s*}; {

*t*}) can only really be said to be optimal if it explores the global configuration space at the highest rate.

We will now briefly discuss three previously used measures for the efficiency of a MC simulation. Firstly for many simulations a ‘rule-of-thumb’ estimate Schlick (2002) for optimal efficiency is sufficient—that ({*s*}; {*t*}) should be tuned so that the acceptance rate should be close to 50%. This rule is suitable for simple displacement or orientation change moves, but will fail for more complex selection probabilities.1

Other authors (L'Ecuyer 1994; Denwig 2000) have used as a figure of merit an efficiency defined as(2.2)where *T*_{cpu} is the computer time taken for the run and *σ* the error estimate of the average of some computed property. This is a much more reasonable estimator, but is not suitable for generating the rolling estimate of efficiency that we require.

A third class of efficiency estimators take the rate of change in a specific property with cpu time. Frenkel & Smit (1996, p. 107) suggest the rate of squared volume change for their example of an ideal gas in a piston; Karaiskos *et al*. have suggested the displacement of the centre of mass and correlation time of the first Rouse mode in a study of the efficiency of the configurational bias method. These *ad hoc* methods, while certainly useful, do not have sufficient generality to compare different classes of moves, which may operate on different subspaces.

Led by these considerations we choose the correlation time as the measure of the efficiency. The autocorrelation function for the system at time *τ* is defined by:(2.3)where the time *t* is wall time, the dot ⋅ indicates an inner product operator which returns the projection of the system state at time *τ* on the system state at the earlier time *τ*−*t*. Note that we do not expect to see any oscillatory behaviour in *c*(*t*; *τ*), just a fixed rate of movement through configuration space. We there fore expect(2.4)The rate *λ*_{τ} is found by least squares fitting the measures for log *c*(*t*; *τ*) against *t*. A large value for *λ*_{τ} means fast movement through configuration space at time *τ*.

By this means we can say that ({*s*}; {*t*}) is more efficient in sampling the configuration space than ({*s*′}; {*t*}) if *λ*_{τ} is higher in the former case. Provided the operation (⋅) may be defined *λ*_{τ} is available for any MC simulation, and so this test of sampler distributions is general. *λ*_{τ}, or its inverse the correlation time is therefore a good candidate to be the quantity monitored by the operator.

In equation (2.3), 〈 〉 denotes that an ensemble average should be taken. This is clearly not an accessible average, but it is possible and preferable to take the average over a number of members of the ensemble by trivially parallelizing the simulation. A distributed simulation with *p* statistically independent replicas of the system is therefore desirable for the generation of the rate *λ*_{τ}. The contributions of individual nodes to equation (2.3) may be calculated locally and passed periodically to a central master thread. This makes the message passing required minimal, the update period for *λ*_{τ} is only that required by the operator, and as *λ*_{τ} is found by best fitting it is rather robust. This method for measuring the efficiency of a proposed sampler distribution is ideally suited to more widely distributed networks of processor nodes. Indeed the more replicas of the system the operator can muster the better the sampling of the ensemble. The calculation of *λ*_{τ} is discussed in detail below.

The master thread is required therefore to generate ‘ensemble’ averages from the autocorrelation data coming in from the distributed nodes, to communicate with the steering client, and to broadcast changes in the steerable parameters. When a good set of parameters is decided for the sampler distribution, the steering client is closed down, and the system will start to generate the desired statistical quantities.

## 3. A rolling estimate of the correlation time

For a specific MC simulation we have to generate a rolling estimate of the correlation time which can be monitored by the operator. We model the surface structure of polystyrene, as a preliminary step to modelling the surfaces of more complex conjugated polymers. We will use a model which retains some degree of atomic detail of the surface structure. In this model the only degrees of freedom are the torsional angles of carbon–carbon single bonds, which fixes the position of all atoms and bonds to within a rigid-body transformation, plus the position and orientation of one bond in the chain. We use the term *bead* to refer to a single object in the polymer which has a defined potential interaction and connectivity with other beads and no internal degrees of freedom. In the specific case of polystyrene, links of the backbone are therefore modelled as alternate *CH*_{2} and *CH* united atom beads, and the side-branches as phenyl rings attached to the *CH* beads. The ends of the chain are not treated as having additional hydrogen atoms. The TraPPE-UA potentials for *CH*_{x} used are given by Wick & Siepmann (2000). The phenyl rings themselves have no additional degrees of freedom, and are represented as prolate spheroids by the Gay-Berne potential determined by Walsh (2002) and Lorenz *et al*. (2003). The phenyl-*CH*_{x} potential has an intermediate form generated using the standard Lorentz–Berthelot combining rules, and the CCSD(T)-fitted torsional potential for styrene of Sancho-Garcìa & Pèrez-Jimènez (2002).

Equation (2.3) requires a definition for the inner product operator. Each bead in a polymer chain has a position in three-dimensional (optionally periodic) space and an orientation determined by the orientation of its bonds. We will take the inner product as being a scalar constructed from the position and the orientation of the bead. This is sufficient to compare any moves which preserve polydispersity. The position of bead *i* taken at time *τ* is denoted . The rotation of the coordinate frame of bead *i* with respect to the laboratory frame is tracked to determine the direction of the bonds and anisotropic contributions to the potentials. One of the three axes, designated is used to track the angular correlation. For phenyl beads, this axis is the normal to the plane of atoms. For *CH*_{x} beads it is the direction of the bond to the next atom in the chain.2

We can therefore define an inner product for a bead *i* at time *τ* and a bead *j* at time *τ*−*t*(3.1)[ ]_{m} denotes that the minimum image is used in periodic axes and *R*_{g} is the radius of gyration. The Kronecker delta selects that the inner product is non-zero only if the ‘same’ bead (recognized by the same index) is taken at both times. The concept of ‘sameness’ used is discussed in more detail below. This form has the desired property that . It is also bounded to a lower value of . The fitting of the correlation function to an exponential function will fail if an inner product less than or equal to 0 contributes.

We can therefore write the inner product for the system at two times *τ* and *τ*−*t*(3.2)where the sum runs over the *N* beads in the system.

### (a) Selecting the same bead at two times

All system properties must be invariant under the re-indexing of the beads. The actual re-indexing implemented normally will be chosen by a combination of computational efficiency and expedience. For our case we should like to make a specific choice for the re-indexing such that the system–system inner product (equation (3.2)) is maximized, subject to the constraint that only the indices of beads of the same chemical type are exchanged. In practice it is more convenient to allow each MC move to perform some re-indexing, rather than to attempt a global maximization each time an inner product is calculated.

### (b) Fitting the exponential function

The exponential form (equation (2.4)) is found by periodically storing a copy of the system together with the time when it was recorded. At time *τ*, a cache of *M* copies of the system from earlier times {*τ*+*t*_{1}, *τ*+*t*_{2},…,*τ*+*t*_{M}} have been stored. The oldest state in the cache is overwritten each time a new copy of the system is taken. These *M* states give *M* points *S*(*τ*)⋅*S*(*τ*−*t*_{i}). These points are shown in figure 1 as open squares. The value of the inner product decreases with increasing *t*, suggesting that the system is moving away from earlier positions. The decrease will often show plateaux where little or no movement through configuration space has occurred. Consequently the best-fit exponential (dashed line) through these points fluctuates rapidly.

At equilibrium, *S*(*τ*)⋅*S*(*τ*−*t*)=*S*(*τ*′)⋅*S*(*τ*′−*t*), so to construct a better estimate of the exponential fit it is possible to use points taken at (*L* separate) earlier times. Figure 1 shows a simulation of a single chain of 100 atactic polystyrene monomer units. Every 2.5 s an inner product of the current system state with the set of *M*=16 stored earlier copies is taken. The latest *L*=16 sets of such points are used as a larger dataset, and the best-fit exponential through these is shown as the bold line. This process of storing and reusing the projection of the system over recent times makes the correlation time returned a rolling estimate rather than an instantaneous estimate. A large value for *M* gives a better instantaneous estimate at the expense of greater storage requirement and overhead calculating the inner products. A large value for *L* smoothes the variations in the instantaneous estimate of the correlation time, but makes the correlation time less responsive to important issues like failure to equilibrate or the system becoming trapped in a local minimum. The parameters *M* and *L* and the time between samples of the system can be steered during the simulation.

We find that even with the smoothing of the rolling estimate of the correlation time made by choosing large *L* a single processor is insufficiently smooth for steering the simulation. If the system gets trapped in a local energy minimum, then the rate of movement through configuration space drops sharply and so the rolling correlation time estimate increases. To provide a steadier and more reliable estimate of the rolling correlation time we therefore perform the calculation in parallel. This is illustrated in figure 2.

A master thread running on a local terminal is steered using the steering client, and a number of compute nodes. We have used eight identical compute nodes for optimizing the parametrization, with each node periodically returning the set of points {*S*(*τ*)⋅*S*(*τ*−*t*)}—i.e. the complete scatter plot from figure 1. This dataset for the message passing is very small, and as it should be independent of the time *τ*, it is insensitive to the time taken to pass the message. A single exponential can then be fit through the complete set of points. If the computation is performed on a heterogeneous network then each group of identical nodes will return a rate *λ*_{p}≈*λ*_{x}*λ*_{MC}, where *λ*_{x} is proportional to the execution speed of the code, and *λ*_{MC} is dependent on the parametrization. We only need to know whether *λ*_{MC} is increased or decreased by a parameter change, and so on a heterogeneous network of computers it would be reasonable to monitor . We have performed visualization of the system on individual compute nodes on the local terminal, but for larger systems this task can be distributed. The visualization is extremely useful to gain physical insight into how the individual moves are operating.

The process of selection of parameters by CS is illustrated by the screenshot of the Polysteer program in figure 3.

## 4. Monte Carlo simulations

To parametrize MC moves for polystyrene systems we have applied the CS method as described above, using four independent processes to average the correlation time. We have considered two separate sets of MC moves for the parameter fitting applied to the same models of atactic polystyrene with a degree of polymerization of 100 at a simulated temperature of 500 K.

We first consider a set of moves based on individual rotations of the bonds: end rotation, biased reptation and configurational bias (de Pablo *et al*. 1992). These moves are conceptually quite simple and so easy to code, but are expected to be much better suited to the equilibration of oligomers. The second set of moves is based on collective movements by fitting the backbone to a spline curve. It can be shown (Mason, paper in preparation) that a backbone chain can be uniquely fitted to a spline curve, and vice versa, with one knot for every *N*_{k} beads. New conformations of the backbone can be proposed by moving the knots of the spline, and the uniqueness of the solution ensures that detailed balance is obeyed. A spline knot move shifts one knot by a random vector of magnitude up to ±d*x*, refits the spline and then refits the short end of the backbone. A spline reptation move shifts all the knots along the direction of the curve, and then selects a new position for the knot on the end such that the new tangent is at an angle of up to d*θ* to the old tangent. This set of spline-based moves is described in detail elsewhere.

The process of optimizing MC moves through CS gives parameters shown in table 1. It is clear that the set of moves which gives correlated movement of the backbone is much more efficient for this system.

We now consider modelling a polystyrene thin-films.

The films were constructed from 20 chains, each with degree of polymerization 100. The supercell was periodic in the *x*- and *y*-directions, with cell side of 125 Å in these dimensions, and no periodicity was imposed in the *z*-direction. Initial configurations for the thin films were generated in a two-step process of first laying down the backbones in a semi-random position and then improving this configuration as described below. The backbones were represented as splines, with a spline knot every five backbone beads, and these splines were generated incrementally such that each knot was placed at a random displacement d*x* from the previous, where d*x* is of magnitude smaller than *r*_{max}=2.55 Å, the greatest length possible from a sequence of five backbone beads, where we have taken the bond length to be 1.54 Å, and the bond angle to be that of polyethylene, 112°. To generate a film a further condition was imposed that the *z*-position of the knots be in the range 0–22 Å from the *z*=0 plane. The initial position was then improved by searching for positions where the spline curves came together closer than 1A, and moving apart the knots nearest to this point of close approach. The film thus generated has a density of 1 monomer per 171.9 Å^{3}, or 1000 Kg m^{−3}. The specific gravity of polystyrene is actually 1.07; we have reduced the density slightly to improve the equilibration.

During the simulation a hard wall potential of 1000 eV was imposed at the lower (*z*=0) boundary to strongly discourage moves which would place beads below this line. The upper surface at *z*=22*A* was free. The atactic polystyrene film before and after relaxation is shown in figure 4. It can be seen that some of the chains ends are not bound into the surface and have started to diffuse away from the bulk. This makes the surface of the film somewhat diffuse and is a particular problem for modelling: the chain length we use necessarily has to be considerably shorter than that found in real films and so there is a greater proportion of ends at the surface.

Figure 5 shows a histogram of the orientation of the phenyl rings after relaxation in a layer of 5 Å centred on the position of the original upper surface. The orientation of the rings is not randomly distributed, but is instead peaked at angles between 50 and 60°. The expectation value of the normal orientation is found to be 58.1°. This is in reasonable agreement with experimentally found values (55°; Ueno *et al*. 1998; 51°; Briggman *et al*. 2001) given the simplified potentials which are used here, and that the film we have modelled is much thinner than that used in experiment. The density of the surface layer at the end of simulation was found to be 690±30 Kg m^{−3}. This is a reduction of almost a third from the density in the bulk and before equilibration.

## 5. Conclusions

We have described a methodology to approach the difficult task of selecting a good set of MC moves for simulating a new polymer system. As more MC moves are discovered it becomes possible to make great efficiency savings by selecting the correct set for the type of simulations intended. But it is also true that a systematic means of comparing sets of moves must be used: it is not sufficient to assume that a higher acceptance rate will lead to faster simulation as each proposed move type takes a different length of time to generate.

Our approach describes how the technology provided by the Reality Grid project can be employed to optimize the parametrization of these move types. We have described in §2 a single parameter, the correlation time, which can be monitored to determine whether a parameter change was successful or not. We have developed a new computer program, Polysteer, for the MC simulation of polymers, which incorporates both a CS client attached to a running simulation, allowing the operator to adjust parameters online, and an online visualization client. The fluctuations in the correlation time requires that the simulation be deployed as a distributed parallel calculation, with as many slave threads as possible contributing their own estimate of the rolling correlation time.

We have used our model to parametrize MC moves suitable for the relaxation of polystyrene strands, and have used the model to predict the orientation of phenyl rings at the surface of a thin film of polystyrene, finding that the normals have an expected orientation from the normal of the plane of 58.1°, a figure in agreement with experimental values.

The process of optimization could of course always be performed automatically using a multidimensional minimizer. While technically possible, this process could well take a very long time to converge, as the correlation time fluctuates very rapidly. We believe that a human operator is much better at seeing the patterns present in data, and can not only apply physical insight to speed up the process of optimization, but can also gain physical insight by watching in real time the effect of changing simulation parameters.

## Acknowledgments

D.R.M. was supported by EPSRC grant number GR/R67699/02. D.R.M. would like to thank Stephen Pickles and Andrew Porter of Manchester computing for their help implementing computational steering.

## Footnotes

↵† Present address: Condensed Matter Theory Group, Blackett Laboratory, Imperial College, London SW7 2AZ (d.mason@imperial.ac.uk).

One contribution of 27 to a Theme ‘Scientific Grid computing’.

↵As an (extreme) example consider a selection probability which has a 50% chance of producing a very small configuration change and is always accepted, and a 50% chance of producing a large move which is always rejected. This selection probability gives the desired acceptance ratio but actually hardly moves at all through the configuration space.

↵The equivalent direction for the last bead in the backbone is readily found from the direction and torsional angle of the bond to the previous bead.

- © 2005 The Royal Society