## Abstract

Erwin Schrödinger famously and presciently ascribed the vehicle transmitting the hereditary information underlying life to an ‘aperiodic crystal’. We compare and contrast this, only later discovered to be stored in the linear biomolecule DNA, with the information-bearing, layered quasi-one-dimensional materials investigated by the emerging field of *chaotic crystallography*. Despite differences in functionality, the same information measures capture structure and novelty in both, suggesting an intimate coherence between the information character of biotic and abiotic matter—a broadly applicable physics of information. We review layered solids and consider three examples of how information- and computation-theoretic techniques are being applied to understand their structure. In particular, (i) we review recent efforts to apply new kinds of information measures to quantify disordered crystals; (ii) we discuss the structure of ice I in information-theoretic terms; and (iii) we recount recent investigations into the structure of tris(bicyclo[2.1.1]hexeno)benzene, showing how an information-theoretic analysis yields additional insight into its structure. We then illustrate a new Second Law of Thermodynamics that describes information processing in active low-dimensional materials, reviewing Maxwell's Demon and a new class of molecular devices that act as information catalysts. Lastly, we conclude by speculating on how these ideas from informational materials science may impact biology.

## 1. Introduction

To account for the ‘special properties’ of life—e.g. movement, metabolism, reproduction, development—the prevailing wisdom from the time of Aristotle into the nineteenth century was that organic matter differed in some fundamental way from inorganic matter. While this notion, called *vitalism*, may seem quaint to twenty-first century scientists, it held sway until the chemist Friedrich Wöhler showed that, unexpectedly, a known organic compound, urea, could be artificially synthesized from cyanic acid and ammonia [1]. This fabrication process, while different from that used in biological systems, nonetheless served as an important clue that the divide between living and non-living matter was not absolute. Abiotic processes could make substances theretofore only encountered in biologically derived materials. Additionally, we see that—and not for the last time—results obtained from one discipline, chemistry, have had important consequences in another, biology. This confluence of diverse avenues of enquiry coalescing into an ever larger conceptual picture of Nature is, of course, an oft-repeated theme in the sciences. Other famous examples include Newton's discovery that the motion of celestial bodies, such as the moon and planets, and that of terrestrial ones under the influence of gravity, such as the proverbial apple, are both manifestations of a universal law of gravitational attraction; James Clerk Maxwell's unification of electricity and magnetism into his famous equations; and James Prescott Joule's demonstration that the caloric was nothing but energy by another name, now formalized in the First Law of Thermodynamics. Indeed, E. O. Wilson takes the extreme position that *all* human knowledge, from the most concrete of the sciences to the least precise of the liberal arts, is ultimately interlinked [2].

We need not go quite so far as Wilson. It is enough for our purposes to realize that, while ‘abiotic’ sciences such as physics, chemistry, astronomy and geology share obvious strong interconnections, biology has remained relatively aloof. This is not to say that biology has not benefited greatly from knowledge transferred to it from other physical sciences. In addition to the urea example above, note that metabolism is at its core a question of the utilization and transformation of energy—a notion made concrete and operational in physics. So too, biology has benefited tremendously from techniques and discoveries made in other sciences. Indeed, in 1937 Max Delbrück (Nobel Prize in Physiology or Medicine 1969) adapted his training in astrophysics and theoretical physics to probe gene susceptibility to mutations, stimulating physicists’ interest in biology and helping to establish molecular biology. Better known, though, is the X-ray diffraction image ‘photograph 51’ from the laboratory of Rosalind Franklin, which provided a key insight leading geneticist James Watson and physicist Francis Crick (Nobel Prize in Physiology or Medicine 1962) to propose the double helical structure of DNA [3]. Despite the above, biology is clearly the least well integrated into the family of sciences. We can speculate that the sheer complexity of life and the novel phenomena it displays are at least partially responsible for this. Even one of the most basic organisms, *Mycoplasma genitalium*, has a genome of ‘only’ 580 070 base pairs [4]. Biology is complicated.

And it is perhaps due to this complication that the mathematical ‘sciences’^{1} have made their least impact in theoretical biology. By and large, the advanced mathematical techniques that saturate any theoretical physics text find no counterpart in biology texts. There is one area, however, where arguably biology has outpaced her sister sciences: the incorporation of *information theory* [5,6] into the description of physical systems. And, we will suggest that biology has carved a conceptual path that abiotic physical sciences would do well to emulate. Before we move too far ahead, though, let us start at the turn of the twentieth century and visit one of the many revolutionary advances that ushered in the era of ‘modern’ physics and that remains today a key probe of molecular biological structure.

## 2. Structure, aperiodic crystals and information

The immense conceptual advances in physics made in the first third of the twentieth century are legion, but here we focus on the contributions to the structure of matter. While it is Max von Laue (Nobel Prize in Physics 1914) who is credited with the discovery of the diffraction of X-rays by crystals, it is the father and son team, Sir William Henry Bragg and William Lawrence Bragg (Nobel Prize in Physics 1915), who receive much of the credit for exploiting it as a tool to determine crystal structure. For a periodic repetition of some pattern, as one might find in simple crystals such as NaCl, the diffraction pattern is dominated by very strong reflections at particular angles, called *Bragg reflections*. Much weaker diffuse scattering is known to occur between the Bragg reflections and had been observed as early as 1912 by Walter Friedrich. While this diffuse scattering can often be attributed to the thermal motion of the constituent atoms, it can also be a harbinger of deviations from perfect periodic order. But the assumption of periodicity greatly simplifies the analysis of diffraction patterns, and the early years of crystallography were marked by enormous success in solving for the periodic structures that seemed so common. Indeed, it may be argued that this research programme, which was so successful in describing a particular kind of structure—periodic structure, ‘the infinite repetition in space of identical structural units’—came at the cost of developing alternative theoretical tools.

On the biology front, cognizant of Delbrück's results on mutations, the prominent physicist Erwin Schrödinger (Nobel Prize in Physics 1933) was busy considering life from a physics point of view. In his now classic 1944 book, *What is Life?* [7], Schrödinger introduces two concepts that are of interest to us here. The first is *negentropy*, or the entropy that an organism exports to its surroundings to keep its internal entropy low. If one views entropy as a measure of disorder, then the Second Law of Thermodynamics makes it clear that, for an organism to maintain some structure, it must rid itself of the disorder that accompanies life-maintaining processes. The second, and equally important, is the idea that the hereditary mechanism that must exist so that traits of individuals can be passed to offspring could be housed in what he called an *aperiodic crystal*. Although H. J. Muller made a similar proposal over 20 years prior, it was Schrödinger's advocacy that captured the imagination of Crick and Watson and spurred them to investigate this possibility seriously. Schrödinger's aperiodic crystal was some material substrate, perhaps a molecule, that lacked strict periodicity. The reason for this is that exact repetition of a motif, in other words a crystal, is information poor—too poor to carry heredity. Without some unpredictability, or novelty, nothing new is learned and communicated. It is remarkable that Schrödinger made this prediction before a quantitative understanding of information was articulated.

In 1947, three physicists from Bell Telephone Laboratories, John Bardeen, Walter Brattain and William Shockley (Nobel Prize in Physics 1956), invented a small device that revolutionized the design of electrical circuits: the transistor, which ushered in the era of electronics. Its significance was immediately recognized and a press release was duly issued the next year. Yet, arguably [8], this was only the *second* most important announcement to come out of Bell Laboratories in 1948. The first came from a 32-year-old mathematician, engineer and cryptographer, Claude E. Shannon,^{2} in the form of a paper in the *Bell System Technical Journal* with the unassuming title ‘A mathematical theory of communication’ [5].

Shannon's main premise is that information is a degree of surprise. Given an *information source* *X*—a set of messages {*x*} that occur with probabilities Pr(*x*)—an individual message's *self-information* is H(*x*)=−log_{2} Pr(*x*). Thus, predictable events (Pr(*x*)=1) are not informative—H(*x*)=0—as they are not surprising. Wholly unpredictable events, such as the flips of a fair coin, are highly informative: H(*x*)=−log_{2}(1/2)=1. When using logarithms base 2 the information unit is a *bit* or binary digit. Shannon's first major result was to show that the average self-information, what he called the *entropy* paralleling Boltzmann and Gibbs in vocabulary and notation, H[*X*]=−∑_{x} Pr(*x*) log_{2} Pr(*x*), measures how compressible a source's messages are. However, quantifying information was simply preliminary to Shannon's main motivation. Working for the Bell Telephone Company, a communications enterprise, his main goal was to lay out operational constraints for communicating information over noisy, error-prone transmission equipment, which he formalized as a *communication channel*. The result was his most famous and far-reaching result: as long as the source entropy is less than the channel's transmission capacity *C*—that is, H[*X*]<*C*—then, even if errors are introduced, there is a way to encode the source messages such that the receiver observing the noisy channel output can *exactly* reconstruct the original messages. This single result is key to almost all communication technologies that drive today's modern economies.
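These definitions are simple enough to compute directly. The following is a minimal sketch in Python; the example probabilities are illustrative, not drawn from the text.

```python
import math

def self_information(p: float) -> float:
    """Surprise, in bits, of observing an event of probability p."""
    return math.log2(1.0 / p)

def entropy(probs) -> float:
    """Average self-information H[X], in bits, of a discrete source."""
    return sum(p * self_information(p) for p in probs if p > 0)

print(self_information(1.0))   # 0.0 bits: a certain event is unsurprising
print(self_information(0.5))   # 1.0 bit : one flip of a fair coin
print(entropy([0.5, 0.5]))     # 1.0 bit : a fair coin as an information source
print(entropy([0.9, 0.1]))     # ~0.47 bits: a biased coin is more compressible
```

The last line illustrates the compression interpretation: a source emitting, on average, 0.47 bits per symbol can in principle be encoded at just under half a binary digit per symbol.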

Shannon himself was rather careful to distance his quantitative theory on the amount of information in a source from discussions of that information's meaning or semantic content [9]. His goal was the operational result just recounted, which did not require knowing what information was being communicated. However, as we will explain, his measure of information and its semantics turn out to provide a central and quantitative tool for understanding the organization of materials that are more than periodic crystals—materials that are not regular repetitions of identical unit cells. We call this application of Shannon's information theory to material structure ‘chaotic crystallography’, for reasons that will become evident.

What kinds of materials are not crystals? An obvious class is those in which atoms of random kinds are randomly placed in space. The resulting dichotomy—materials are either periodic or random—is too simple a view. There is a spectrum. A first example, one controversial in its time, came in the discovery of quasi-crystals [10]: metals with long-range orientational order, an icosahedral phase, but no translational symmetry. This fell so far outside of the periodic–random dichotomy that it was some years after experimental detection that quasi-crystals were widely accepted (Nobel Prize in Chemistry 2011).^{3}

Given this wide spectrum, one needs tools that readily describe processes that range from periodicity to randomness and capture the intermediate semi-ordered, semi-disordered structures. Information theory is one of those tools. We will describe how it applies to material structure, forming the endeavour of chaotic crystallography. A compelling insight is that, although we start with a focus just on surprise and prediction, we are led to novel notions of structure, partial symmetries and information storage.

## 3. From information measures to structure

Although fundamental to the practice of science, the information obtained from individual measurements has only recently been examined in detail [15,16]. The key issues at hand are easily stated: Given a history of such measurements, how much does one learn from any particular observation? How much of the past is useful for predicting the results of future measurements? To what degree is a measurement simply randomness, and not structure? How much information obtained in the present is transmitted to the future? Perhaps not surprisingly, considering these questions in the light of information theory [6] revealed a number of new computational and informational measures that give important insights into how correlations are manifested in different kinds of structure.

As noted above, the workhorse of information theory is the *Shannon entropy* [5] of a random variable *X*: H[*X*]=−∑_{x} Pr(*x*) log_{2} Pr(*x*), where the *x* are the possible realizations of the discrete variable *X* and Pr(*x*) is the probability of observing *x*. While Shannon entropy has many interpretations, most useful here is that it is the average amount of information an observation reveals when measuring that variable. Real measurements are often sequential in time and one might expect that there are correlations between measurements. The extension of the Shannon entropy to a series of measurements follows naturally by replacing the single random variable *X* with the sequence *X*_{1},*X*_{2},…*X*_{L} of random variables—often written *X*^{L}—and the realization of a single measurement *x* by the series of measurements *x*_{1},*x*_{2},…*x*_{L}, the latter conveniently denoted *x*^{L}. Thus, by considering successively longer pasts—*X*_{−1}, then *X*_{−2}*X*_{−1}, and so on—one can quantify how less uncertain a measurement of *X*_{0} is. Or, stated differently, we can quantify how much knowledge of the past reduces the information learned in the present: H[*X*_{0}]≥H[*X*_{0} | *X*_{−1}]≥H[*X*_{0} | *X*_{−1},*X*_{−2}]≥⋯ , where we introduced the conditional Shannon entropy H[*X* | ⋅]. It is also useful to consider the entropy rate *h*_{μ}, the information learned on average per observation, having seen an infinite past: *h*_{μ}=H[*X*_{0} | …*X*_{−3}*X*_{−2}*X*_{−1}].
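The conditional-entropy hierarchy can be estimated directly from data via empirical block entropies, since H[*X*_{0} | *X*_{−L}…*X*_{−1}]=H(L+1)−H(L), where H(L) denotes the Shannon entropy of length-L blocks. A minimal sketch, assuming an illustrative binary process (no two 1s in a row) in place of any material data:

```python
import math, random
from collections import Counter

def block_entropy(seq: str, L: int) -> float:
    """Empirical Shannon entropy, in bits, of length-L blocks of seq."""
    if L == 0:
        return 0.0
    counts = Counter(seq[i:i + L] for i in range(len(seq) - L + 1))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

random.seed(0)
out, last = [], "0"
for _ in range(100_000):        # illustrative process: no two 1s in a row
    last = "0" if last == "1" else random.choice("01")
    out.append(last)
seq = "".join(out)

# H[X0 | X-L ... X-1] = H(L+1) - H(L): non-increasing, converging to h_mu
for L in range(5):
    print(L, block_entropy(seq, L + 1) - block_entropy(seq, L))
```

For this example the estimates drop from H[*X*_{0}]≈0.92 bits at L=0 to the entropy rate *h*_{μ}≈0.67 bits once a single symbol of history is taken into account.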

Since information theory was originally developed in the context of communication, imagined as a temporal progression of symbols, a natural notion of past, present and future permeated the theory. Operating under this prejudice introduced a preferred arrow of time. As a consequence, the utility of conditioning *current* measurements, or observed symbols, on *future* observations was not obvious. From a mathematical point of view, of course, there is no inherent impediment to doing this. However, replacing a time series by a spatial one lifts the directional prejudice, opening a way to identify other measures of information that treat the past and future on equal footing [15,17–20].

As an example, consider a single measurement of the random variable *X*. The theoretical maximum amount of information that one can possibly learn is just H[*X*] (figure 1*a*). However, if there are correlations or regularities in the data, some of this could have been anticipated from previous observations. Let us call this part the *redundancy rate* *ρ*_{μ}=I[*X*_{0}:…*X*_{−3}*X*_{−2}*X*_{−1}]—the shared or *mutual information* between the present *X*_{0} and the past …*X*_{−3}*X*_{−2}*X*_{−1}. The other part of the information could not be anticipated; it truly is random and is just *h*_{μ}. Thus, the amount of information H[*X*_{0}] available in a single measurement *X*_{0} naturally decomposes into these two parts, as shown in figure 1*b*.

However, further conditioning yields further decomposition of each of these. First, the random portion *h*_{μ} breaks into two parts: the *ephemeral information rate* *r*_{μ} and the *bound information rate* *b*_{μ}. The ephemeral information rate *r*_{μ}=H[*X*_{0} | …*X*_{−3}*X*_{−2}*X*_{−1},*X*_{1}*X*_{2}*X*_{3}…] is the information that exists only in the present. It is not predictable from the past nor is it communicated to the future. Existing only in the present, it is ephemeral. The bound information rate *b*_{μ}=I[*X*_{0}:*X*_{1}*X*_{2}*X*_{3}… | …*X*_{−3}*X*_{−2}*X*_{−1}] is the information shared between the present and future, but is not in the past. As such, it measures the rate at which spontaneously generated information (*h*_{μ}) is actively stored by a system. Second, the redundancy rate also breaks into two parts, the first again being *b*_{μ} and a second part called the *enigmatic information rate* *q*_{μ}. The latter is three-way mutual information I[…*X*_{−3}*X*_{−2}*X*_{−1}:*X*_{0}:*X*_{1}*X*_{2}*X*_{3}…] shared between the past, the present and the future.

The net ‘decomposition’ of the information H[*X*_{0}] in a single measurement is illustrated in figure 1*c*. This is only a sampling of the possible ways that information can be semantically partitioned between the past, present and future. Figure 1*d*, for example, is a decomposition into dissipated *r*_{μ} and useful information *w*_{μ}. Moreover, other additional measures, discussed by James *et al.* [15,16,21], have been defined and explored. Importantly, they can now all be calculated analytically from a process's *ϵ*-machine [22,23], once that is in hand.
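A finite-window approximation makes this decomposition concrete: conditioning the present on k symbols to either side estimates the ephemeral rate *r*_{μ}, and the bound rate follows as *b*_{μ}=*h*_{μ}−*r*_{μ}. A rough sketch, again using an illustrative binary process and an arbitrary half-window k:

```python
import math, random
from collections import Counter

def H(samples) -> float:
    """Empirical Shannon entropy, in bits, over an iterable of outcomes."""
    counts = Counter(samples)
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())

random.seed(0)
out, last = [], "0"
for _ in range(200_000):        # illustrative process: no two 1s in a row
    last = "0" if last == "1" else random.choice("01")
    out.append(last)
seq = "".join(out)

k = 4                           # half-window size: an arbitrary truncation
windows = [seq[i:i + 2 * k + 1] for i in range(len(seq) - 2 * k)]
gapped = [w[:k] + w[k + 1:] for w in windows]   # same windows, centre removed

h_mu = (H(seq[i:i + k + 1] for i in range(len(seq) - k))
        - H(seq[i:i + k] for i in range(len(seq) - k + 1)))
r_mu = H(windows) - H(gapped)   # ephemeral part: H[X0 | past, future]
print(h_mu, r_mu, h_mu - r_mu)  # the last value estimates b_mu
```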

## 4. Chaotic crystallography

Armed with this new arsenal of structural information measures, one obtains a detailed, quantitative picture of how information is shared between the past, present and future. With these in mind, *intrinsic computation* is defined as how systems store, organize and transform historical and spatial information [24,25]. Different processes may have quantitatively and qualitatively different kinds of intrinsic computation, and understanding these differences gives insight into how a system is structured [26].

*Chaotic crystallography* (ChC) [23,27–33]^{4} then is the application of these information- and computation-theoretic methods to discover and characterize structure in materials. It reinterprets the time axis, used above for pedagogical reasons, as a one-dimensional spatial coordinate along some direction in a material. The choice of the name is intended to be evocative: we retain the term ‘crystallography’ to emphasize continuity with past goals of understanding material structure; and we introduce the term ‘chaotic’ to associate this new approach with notions of disorder, complexity and information processing. Using chaotic crystallography, we can describe the ways in which this information decomposition quantitatively captures crystal structure—distinguishing structure that might be expected, i.e. repetitive periodic structure, from that structure not expected, i.e. faulting structure. Materials that are usefully and amenably described under this framework are called *chaotic crystals*.

Obviously, the study of disorder in condensed matter systems is far from new, but too often it has been overshadowed by the crystallography of periodically ordered materials, called *classical crystallography* [35–37]. Mackay has long and persistently advocated that the scope of crystallography be extended to include organization on many length scales as well as to include forms of organization that are not necessarily periodic. In a series of articles spanning over 40 years [12,36–42], Mackay promoted the idea that crystallography is—or at least ought to be—a general science of structure.^{5} We agree. This research programme, dubbed *generalized crystallography*, claims a huge swathe of conceptual space, including hierarchical organization, pentagonal symmetry, cellular automata, aperiodic crystals, incommensurable periodicities, glasses, lyotropic liquid crystals, mesophases and finite crystals—to name just a few [36]. Also included are concepts from information theory and computation. In selecting the name ‘chaotic crystallography’ then, we differentiate our efforts from this more diffuse programme to instead focus tightly on how information and computation elucidate material structure and organization.

We should note that chaotic crystallography is not the only or even the first attempt to incorporate information- and computation-theoretic ideas into crystallography. Some efforts from the 1980s centred on employing information theory as a tool to solve the ‘phase problem’ [43–45]. More recently, Lalik [46] has used information theory as a metric to gauge the degree of distortion in coordination polyhedra in crystals by computing the ‘difference in the Shannon information content between uniform and non-uniform distributions of the bond orders within polyhedra’. Estevez-Rams & González-Férez [47] have applied algorithmic or Kolmogorov–Chaitin complexity to the task of quantifying crystal structure for both ordered and disordered specimens, as well as quasi-crystals. They contributed to the ongoing discussion of the definition of a crystal, suggesting that information- and computation-theoretic notions do provide insight into the structure of matter. Although certainly of conceptual interest, the Kolmogorov–Chaitin complexity is unfortunately uncomputable in general, making this approach of limited practical use. Krivovichev [48,49] has calculated topological complexities of crystal structures as Shannon information-based measures over the related quotient graph. He has additionally surveyed the 2000 most complex structures in the Inorganic Crystal Structure Database to determine which inorganic structure is the most complex [50].^{6} This research programme appears, at least at the moment, to be largely if not exclusively confined to purely crystalline structures.

To our knowledge, the term ‘chaotic crystal’ has been used in two previous contexts. In 1991, Leuschner [51] introduced several models of structure for one-dimensional crystals, capable of producing completely periodic, quasi-periodic and chaotic behaviour. The latter was accomplished using the Logistic Map [24] as a generator of uncertainty in the stacking sequence—in effect using it as a random number generator. Later, Le Berre *et al.* [52], in the context of steady-state pattern formation of two-dimensional systems, defined a chaotic crystal as ‘any structure without long range order, but spatially statistically homogeneous’. Our use of the term is both less restrictive, in accounting for long-range order, and more general, in allowing for a wide range of types of disorder. It should be apparent that the chaotic crystal we describe here is just the kind of crystal that Schrödinger imagined as the carrier of heredity. While he called it an aperiodic crystal, that term has been usurped to describe a very special kind of deviation from periodicity, the kind that is found to preserve sharp peaks in the diffraction pattern [53]. Thus, we use the term chaotic crystal to indicate a broader notion of non-crystallinity, one that encompasses structures with a non-zero entropy density, as is needed for any structure, such as DNA, to house information.^{7}

### (a) Material informatics of faults and defects

As classical crystallography largely concentrates on periodic structures, it encounters difficulty classifying structures that do not fit this paradigm. Most efforts have centred on describing how a crystal, which presumably could have been perfectly ordered, falls short of this ideal. For example, in close-packed structures, Frank [55] distinguished two kinds of layer faults: *intrinsic* and *extrinsic*. For intrinsic faults, each layer in the material may be thought of as belonging to one of two crystal structures: either that to the left of the fault or that to the right. It is as if two perfect, undefected crystals are glued together and the interface between them is the fault. By contrast, it may be that a particular layer cannot be thought of as a natural extension of the crystal structure on either side of the fault. These are extrinsic faults. Another classification scheme has its origins in the mechanism that produced the fault. In close-packed structures, commonly encountered faults include *growth faults*—i.e. those that occur during the crystal growth process; *deformation faults*—which are often associated with some post-formation mechanical stress to the crystal; and *layer-displacement faults*—which can occur by diffusion between adjacent layers. As each is defined in relation to its parent crystal structure, each kind of crystal structure typically has its own distinctive morphology for each kind of fault.

The result is a confusing menagerie of stacking sequences that deviate from the norm. This collection may not be exhaustive, depending on how large a neighbourhood one considers, nor may particular sequences be unambiguously assigned to a particular kind of fault structure. Indeed, in the event that there are multiple kinds of faults, or multiple mechanisms for producing faults, an attempted analysis of the fault structure may be indeterminate [27]. Faulting may also be classified in terms of how faults are spatially related to each other. The absence of correlation between faults implies *random faulting*. Alternatively, the presence of a fault can influence the probability of finding another fault nearby. This latter phenomenon is called *non-random* faulting and is not uncommon in heavily defected specimens. Lastly, in some materials faults appear to be regularly interjected into the specimen, and this is referred to as *periodic faulting*. Screw dislocations are thought to be a common cause of these latter faults [56].

These phenomenological categorizations, while often helpful and sensible, especially for weakly faulted crystals, are not without difficulties. First, it is clear that each is grounded in the assumption that the native, or ideal, state of the specimen must be a periodic structure. This bias, perhaps not intentional, relegates non-periodic stacking to lesser stature, as is evident in the use of the term ‘fault’. It may be rather that disorder is the natural state of the specimen [57], in which case employing a framework that incorporates this feature of matter upfront will prove more satisfactory. In fact, it is not even clear that periodic order should be the ground state for many kinds of materials, even for those with finite-range interactions and in the absence of fine-tuning of energetic coupling parameters between layers [58], as is found in axial next-nearest neighbour Ising (ANNNI) models [59].

Consistency can be brought to this complicated picture of material structure by using information theory [33]. A complementary view may be postulated by asking how information is shared and distributed in a crystal, and a natural candidate for this kind of analysis is to employ the information measures above. Although the previous exposition used a temporal vocabulary of a past, present and future, there is no mathematical change to the theory if instead we adopt the view that the observed sequences are spatial configurations. That is, there are measurements that are to the left of the present measurement, the present measurement itself and those measurements to the right of the current measurement. For quasi-one-dimensional materials, we take each measurement to be the orientation of a layer. This view of a sequence of layer orientations translates to an *information diagram* or *I*-*diagram*, as shown in figure 2. There, we see how information is shared between the different halves of the specimen and the current layer. The information measures given in terms of mutual information can be interpreted as *layer correlations* within the specimen. Importantly, although one typically averages them over the crystal, one can instead forgo that average and examine them layer by layer. As shown in James *et al.* [16], information-theoretic measures can be quite sensitive to changes in system parameters and we expect they will provide a barometer for quantifying important aspects of material structure.

As an example, electronic structure calculations arising from one-dimensional potentials are known to depend on pairwise correlations [60,61], with the transmission probability spectrum of an electron through such potentials often governed by the correlation length. Information-theoretic quantities, with their more nuanced view of correlation lengths in terms of conditional and mutual information, give a more detailed picture of the role of disorder in electronic structure. One of the simpler and more common measures of global correlation is the mutual information between the two halves of a specimen: the *excess entropy* **E**=I[…*X*_{−3}*X*_{−2}*X*_{−1}:*X*_{0}*X*_{1}*X*_{2}…]. Inspection of the information diagram reveals its decomposition into information atoms: **E**=*b*_{μ}+*q*_{μ}+*σ*_{μ}, where the *elusive information rate* *σ*_{μ} is the information shared between past and future that does not appear in the present observation.
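The excess entropy can likewise be estimated from data by truncating the two halves to k symbols each and computing their mutual information. A minimal sketch on an illustrative first-order Markov process, for which the estimate should approach roughly 0.25 bits:

```python
import math, random
from collections import Counter

def H(samples) -> float:
    """Empirical Shannon entropy, in bits, over an iterable of outcomes."""
    counts = Counter(samples)
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())

random.seed(1)
out, last = [], "0"
for _ in range(200_000):        # illustrative process: no two 1s in a row
    last = "0" if last == "1" else random.choice("01")
    out.append(last)
seq = "".join(out)

k = 6                           # truncation length: an arbitrary choice
pairs = [(seq[i:i + k], seq[i + k:i + 2 * k])
         for i in range(len(seq) - 2 * k)]
# E_k = I[left : right] = H[left] + H[right] - H[left, right]
E_k = H(p[0] for p in pairs) + H(p[1] for p in pairs) - H(pairs)
print(E_k)                      # ~0.25 bits for this example process
```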

Additionally, not only is the global structure important, but also local defects can introduce local deviations from average structure, as seen in Anderson localization [61]. This is a current area of research interest [62]. Similarly, regions of charge surplus or depletion can affect other properties, such as the transmission of light. The area of disordered photonics attempts to understand and exploit such structures for new technologies [63].

Thus, a number of questions can be asked concerning the distribution of information in the crystal as revealed in its structure. For example, how much information is obtained from the current measurement? Is this shared with its neighbours or is it localized? Considering questions such as these leads to a new categorization of disordered structure in crystals.

### (b) Chaotic crystals: structure in disorder

The net result is a consistent, quantitative and predictive theory of structure in disordered materials that extends beyond faulting and weak disorder and that applies to the full spectrum of material structure from ideal periodic crystal to amorphous materials and complex long-ranged mixtures in between. As Ball [64] notes, in short, we have a new view of what crystals are and can be. Varn & Crutchfield [33] review how this works in detail.

Let us illustrate how chaotic crystallography applies to real-world materials—the close-packed structures of ice and a complex molecule used to probe the chemistry of benzene's aromaticity. Then, combining these results with previous chaotic crystallographic analyses of zinc sulfide (ZnS), we demonstrate how a unified vision of organization in materials is emerging.

#### (i) Layer disorder in ice I

Although often thought of as merely the medium of life—albeit an essential one^{8}—there has been growing appreciation of the active role that water plays in biological processes. As an example, Ball [65,66] cites the generic interaction of two proteins. If both are dissolved in the cellular medium, the intervening water molecules must be removed for an interaction to occur. Water is, of course, polar, and displacing the last few layers of water may be non-trivial, depending, for instance, on the degree to which the protein activation sites are hydrophilic or hydrophobic. Additionally, one should expect properties of thin water films, such as viscosity, to deviate significantly from their bulk properties. Even the simulation of complex polypeptides is incomplete absent consideration of the influence of water [65]. As another example, there is evidence that life engineers and precipitates the formation of ice. Without the influence of impurities to act as centres of inhomogeneous ice nucleation, water in clouds can be expected to freeze at 235 K or even as high as 243 K via homogeneous ice nucleation [67]. Impurities such as soot, metallic particles and biological agents can raise this temperature. Indeed, Murray *et al.* [68] report that above 258 K the only materials known to promote ice nucleation are biological ones. A particularly effective biological agent is the bacterium *Pseudomonas syringae* that, due to protein complexes on its cell surface, can initiate freezing at temperatures as high as 271 K [69]. Although its particular role may be highly varied depending on circumstances, as Ball points out [65], regarding ‘it [water] essentially as the backdrop on which life's molecular components are arrayed’ is simplistic and naive.

Given the structural simplicity of a water molecule—H_{2}O—and its importance to biological as well as other natural systems, it is perhaps surprising that, in both its liquid and solid forms, H_{2}O remains somewhat mysterious. In the liquid state, water molecules form ‘networks’, where the connections are made from hydrogen bonds, giving the substance considerable structure. So too, ice shows considerable and variable structure. There are no less than 15 known distinct polymorphs of ice (usually specified by Roman numerals) [70], although some of them only exist under conditions too extreme to be commonly observed terrestrially [71] and some are also metastable. Additionally, as thermodynamic conditions change, these different polymorphs can undergo solid-state transformations from one form to another. The common polymorph usually encountered in everyday life is called *hexagonal ice* (ice I_{h}). For some time, an alternative form of ice, *cubic ice* (ice I_{c}), was thought to be a low-temperature modification coexisting with ice I_{h} at temperatures as high as 240 K [72]. However, recent studies [73] have concluded that previous observations are probably not of this purely cubic form, but rather of a mixture of cubic and hexagonal stackings, called *stacking-disordered ice*, abbreviated ice I_{sd} by Malkin *et al.* [74] and ice I_{ch} by Hansen *et al.* [75]. Whether or not this is confirmed by additional studies, ice I_{c} gives a convenient boundary condition on the possible structures that could exist and we will proceed as though ice I_{c} is a possible metastable phase. Above 170 K, ice I_{sd} transforms irreversibly to ice I_{h}.

Structurally, ice I (ice I_{h}, ice I_{c}, ice I_{sd}) can be thought of as a layered material. The oxygens in the water molecules organize into layers consisting of six-member puckered rings [74].^{9} These layers can further assume only three possible stacking orientations, called *A*, *B* or *C*, just as in close-packed structures [76]. The layers are organized so that, upon scanning the material, they form *double layers*, where both individual layers in a double layer have the same orientation. Additionally, just as in the close-packed case, adjacent double layers cannot have the same orientation. Since stacking faults are confined to interruptions *between* the double layers, one usually takes a double layer as a *modular layer* (ML) [77], and labels it by *A*, *B* or *C*. Thus, ice I_{h} is given by …*ABAB*… (or equivalently …*BCBC*… or …*CACA*…), and ice I_{c} by …*ABCABC*… (or equivalently …*CBACBA*…). It is sometimes more convenient to work with an alternative labelling, called the Wyckoff–Jagodzinski notation [76]. One considers triplets of MLs, and labels the centre ML as either *h* or *c*, depending on whether it is hexagonally (*h*) or cubically (*c*) related to its neighbours. For example, the inner-most four MLs of the stacking sequence *ABCBCA* would be written as *chhc*. It should be apparent that any stacking structure, whether ordered or disordered, can be expressed as some *hc* sequence. The ice I_{h} stacking structure is displayed in figure 3*a* and ice I_{c} is in figure 3*b*. A possible disordered stacking sequence is shown in figure 3*c*.
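The recoding from the *ABC* to the Wyckoff–Jagodzinski notation is mechanical, as the short sketch below shows; it reproduces the *chhc* worked example just given.

```python
def to_hc(stacking: str) -> str:
    """Recode an ABC stacking sequence into Wyckoff-Jagodzinski notation."""
    return "".join(
        "h" if left == right else "c"   # h: the two neighbours match
        for left, _centre, right in zip(stacking, stacking[1:], stacking[2:])
    )

print(to_hc("ABABAB"))   # 'hhhh' : ice Ih
print(to_hc("ABCABC"))   # 'cccc' : ice Ic
print(to_hc("ABCBCA"))   # 'chhc' : the worked example in the text
```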

However, despite a recent flurry of theoretical, simulation and experimental studies [69,72–75,77–81], there is still much that is not understood about the formation of ice or the transformations between the various polymorphs [70]. In an effort to understand the coexistence of ice I_{h} and ice I_{c} at low temperatures, Thürmer & Nie [72] examined their formation on Pt via scanning tunnelling microscopy and atomic force microscopy. They found a complex interplay between the initial formation of ice I_{h} clusters that grow by layer nucleation and eventually coalesce. The details of the coalescence and the nature of domain boundaries between nucleation centres strongly influence whether subsequent growth is ice I_{h} or ice I_{c}. Importantly, they demonstrate that ice films of arbitrary thickness can be imaged at molecular layer resolution. Several groups [73–75,78] have applied the *disorder model* of Jagodzinski [82,83] to simulated or experimental X-ray diffraction patterns, using a range of influence between layers, called the *Reichweite*, of *s*=2,3,4. They found that it is necessary to use *s*=4 to describe some samples. Molecular dynamics simulations [80] showed that ice crystallizing at 180 K contains both ice I_{c} and ice I_{h} in a ratio of 2:1, while other molecular dynamics simulation studies [84] found that pairs of point defects can play an important role in shifting layers in ice I. Yet other molecular simulations [85] suggested that a new phase of ice, called *ice 0*, may provide a thermodynamic explanation for some features of ice growth.

Chaotic crystallography yields important insights into the kinds of appropriate models and the nature of stacking processes observed, as well as aids in comparing experimental, simulation and theoretical studies. In this way, chaotic crystallography provides a common platform to relate these diverse observations and calculations.

Let us begin with the models used. The *ϵ*-machines that describe ice I_{h} and ice I_{c} are shown in figure 4*a*,*b*. They are quite similar, both having but one state and one transition each. Computationally, they are quite simple. Also simple is the *ϵ*-machine shown in figure 4*c*. There are two transitions from a single state, with the probability of a *c* being *α*_{0} and that of an *h* being 1−*α*_{0}.^{10} It is apparent that the previous two models are just special cases of this latter one. We recognize that these three models describe *independent and identically distributed* (IID) stacking processes. They imply no correlations between the symbols. However, the coding scheme used here, the transformation of the *ABC* notation to the Wyckoff–Jagodzinski notation, builds in stacking constraints and effectively gives a two-ML influence distance. We identify this range of influence as the *Reichweite* *s*.

The next model commonly used is Jagodzinski's *s*=3 disorder model in figure 4*d*. Here, the next symbol in the sequence depends only on the previous symbol (either *h* or *c*), making this a first-order Markov model. The last model explored in the literature is Jagodzinski's *s*=4 disorder model, and this is depicted in figure 4*e*. Since the probability of observing the next symbol depends on the previous two symbols, we recognize this as a second-order Markov model. Again, the mapping of the *ABC* notation to the Wyckoff–Jagodzinski notation folds in an extra two-ML range of influence in terms of the physical stacking of MLs. It is apparent that one could continue this process, considering ever larger *Reichweite*, i.e. higher-order Markov models, indefinitely. However, finite-range Markov processes are only a small fraction of the possible finite-state processes that one could consider. By finite-state, we mean that there are a finite number of states; but this does not mean that the range of influence need be finite. Simulations of simple solid-state transformations in ZnS (also a close-packed structure) from the hexagonal stacking structure to the disordered cubic one produced stacking processes with an infinite range of influence [28]. Thus, we are led to suspect that, despite the excellent agreement between experimental and theoretical diffraction patterns reported by some researchers for ice I, the real process may belong to a computationally more sophisticated class. Chaotic crystallography, with its emphasis on information- and computation-theoretic measures, allows one to recognize the possibility and indeed to ask the relevant questions.
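To make the model hierarchy concrete, the sketch below implements Jagodzinski's *s*=3 disorder model as a first-order Markov chain over {*h*,*c*}; the parameter names and values are illustrative, not fitted to any specimen.

```python
import random

def jagodzinski_s3(n: int, p_c_after_c: float, p_c_after_h: float,
                   seed: int = 0) -> str:
    """Generate n symbols from a first-order Markov chain over {h, c}."""
    rng = random.Random(seed)
    sym, out = "h", []                 # arbitrary initial symbol
    for _ in range(n):
        p_c = p_c_after_c if sym == "c" else p_c_after_h
        sym = "c" if rng.random() < p_c else "h"
        out.append(sym)
    return "".join(out)

print(jagodzinski_s3(40, p_c_after_c=0.9, p_c_after_h=0.4))
# e.g. long cubic (c) runs interrupted by hexagonal (h) layers
```

The *s*=4 model extends this in the obvious way, conditioning on the previous two symbols rather than one.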

How can we observe or deduce the presence of such sophisticated stacking processes? One way is improved inference techniques. While chaotic crystallography has an inference algorithm, *ϵ*-*machine spectral reconstruction theory* [27,30], that detects finite-range processes from diffraction patterns, there is the possibility of extending it to include infinite-order processes. Also, the simulation studies discussed earlier can result in disordered stacking sequences and there are techniques, such as the subtree merging [24] and Bayesian structure inference [86] algorithms, that can discover these finite-state but infinite-range processes from sequential data. This suggests that the appropriate level of comparison between theory, simulation and experiment is not some signal (the diffraction pattern), but rather the stacking process itself, as specified by the *ϵ*-machine. Chaotic crystallography is a platform for such comparison.

Also, by studying the *ϵ*-machine's causal architecture, i.e. the arrangement of causal states and the transitions connecting them, it is possible to discover the kinds of faults present. Indeed, this was done for ZnS polytypes [27,29]. Recently, several different kinds of faults were proposed for ice I [77], and a proper analysis of the associated *ϵ*-machine, combined with theoretical and experimental studies, can elucidate which faults are important in a particular specimen. This could be quite valuable, as there are many possible routes of formation for disordered ice specimens, and different mechanisms, such as solid-state transformations versus growth, likely leave a discernible fingerprint in the causal architecture.

#### (ii) Organization of aromaticity

Benzene is famous for its curious ‘aromatic’ character that stems directly from the six *π* electrons shared between its six carbon atoms and hovering above and below the plane of its carbon-atom ring. To understand this character, chemists are trying to localize the delocalized *π* electrons, partly to understand benzene's physical character and partly to find new ways to control chemical reactivity and discover new synthetic paths. One goal is to engineer benzene's novel electronic motif to act as a controllable reaction catalyst. There is an active research programme to modify benzene's aromatic properties by adding on ‘bicyclic’ rings outside the main ring. This led to the creation of tris(bicyclo[2.1.1]hexeno)benzene (TBHB). TBHB's structure is critical to understanding how to localize benzene's *π* electrons [87].

We recount recent experimental probes of TBHB's structure, demonstrating how an information-theoretic analysis yields additional insight. TBHB is a largely planar molecule that has attracted attention as one of the first confirmed mononuclear benzenoid hydrocarbons with a cyclohexatriene-like geometry [88]. Figure 5*a* shows the molecular structure of TBHB, and figure 5*b* gives a schematic of its formula. Of particular interest is the central benzene ring, where the internal angles of the carbon–carbon bonds are all 120°, but there is remarkable alternation of the two inequivalent bond lengths between the carbons (1.438(5) vs. 1.349(6) Å) [88]. Of additional interest is the crystallographic structure of TBHB. Here, two crystal morphologies are observed, monoclinic and hexagonal [90]. For this latter structure, X-ray diffraction studies reveal significant diffuse scattering along rods in reciprocal space, a hallmark of planar disorder. Figure 6*a* shows the positions of the diffuse rods in reciprocal space, and figure 6*b* gives an illustration of the average layer structure of TBHB. We will call the extension of this configuration into a two-dimensional periodic array a ML for TBHB.

Of more recent interest [89,91], and the problem that concerns us here, is quantifying and describing the disordered stacking structures observed in TBHB. In order to do this, we must specify the possible ML–ML stacking arrangements and establish a convenient nomenclature to express extended stacking structures. The stacking rules and conventions for layers of TBHB can be summarized as follows.^{11} (i) While there are three ways that two MLs can be stacked, they are geometrically equivalent and are related by a rotation of 120° about the stacking direction. Thus, there is only a single kind of ML–ML relationship. (ii) For triplets of MLs, there are two geometrically inequivalent stacking arrangements. For the case where a molecule in the (*i*+2)th ML is directly above one in the *i*th ML, this arrangement is called *eclipsed*. The other distinct possibility is that the (*i*+2)th ML occupies one of the other two positions. These are geometrically equivalent, being related by a mirror operation, and are called *bent*. However, as one advances along the stacking direction, these latter two can be differentiated as rotating in either a clockwise or an anticlockwise fashion. Together, then, we need to distinguish between three different triplets of stacking sequences: an eclipsed triplet, which we symbolize by *e*, a clockwise bent triplet, which we will symbolize by *l*, and an anticlockwise bent triplet, symbolized by *r*.^{12} We collect these possibilities into the set 𝒜={*e*,*l*,*r*}.

Let us imagine a sliding window that permits observation of but three MLs at a time. That three-ML sequence is then assigned a symbol from 𝒜. The window then increments along the stacking direction by one ML, so that the trailing ML in the sequence becomes hidden, and a new ML is revealed. This new three-ML sequence can again be specified by one of the symbols in 𝒜, such that the four-ML sequence is given by a two-letter sequence over 𝒜. Thus, a physical stacking sequence can be written as a sequence over the set of these triplets, 𝒜.
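This recoding, too, is mechanical. In the sketch below, layer orientations are encoded as integers mod 3 with adjacent layers always displaced; which rotation sense is labelled *l* and which *r* is a convention assumed here for illustration, not fixed by the text.

```python
def to_rle(positions: list[int]) -> str:
    """Recode a sequence of layer orientations (0, 1, 2) into rle notation."""
    out = []
    for a, b, c in zip(positions, positions[1:], positions[2:]):
        if c == a:
            out.append("e")        # eclipsed: layer i+2 sits above layer i
        elif (b - a) % 3 == 1:
            out.append("r")        # bent, rotating in one sense (assumed r)
        else:
            out.append("l")        # bent, rotating in the other sense
    return "".join(out)

print(to_rle([0, 1, 2, 0, 1]))   # 'rrr' : a steadily rotating sequence
print(to_rle([0, 1, 0, 1, 0]))   # 'eee' : every layer eclipses the layer two below
print(to_rle([0, 1, 2, 1, 0]))   # 'rel' : an e between two unlike symbols
```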

Recently, Michels-Clark *et al.* [89] compared three different methods of determining stacking structure for disordered TBHB from diffraction patterns: differential evolution, particle swarm optimization and a genetic algorithm. Although computationally intensive, they found excellent agreement between calculated and reference diffraction patterns, the best *R*-factor fitness being obtained with the differential evolution algorithm. We analyse that case in detail now.

Michels-Clark *et al.* [89] assume a second-order Markov process in the *rle* notation,^{13} so that the probabilities of successive symbols depend on only the two previous symbols seen, i.e. Pr(*X*_{0} | *X*_{−2}*X*_{−1}), where the length-two pasts are what they call *structural motifs*. Michels-Clark *et al.* [89] directly report a probability of *e* following *e* that is only two standard deviations above 0. In addition, the probability of the *ee* sequence itself is only 0.00033. Thus, we neglect the *ee* sequence when we construct the hidden Markov model. Michels-Clark *et al.* [89,92] relate transition probabilities between structural motifs to model parameters, so that we can directly calculate transition probabilities from any solution of model parameters.^{14} Taking the values for the best case differential evolution solution given in table 2 of [89], we calculate these probabilities. In doing so, we find that Pr(*X*_{0}*X*_{1} | *ll*)=Pr(*X*_{0}*X*_{1} | *rl*) and Pr(*X*_{0}*X*_{1} | *rr*)=Pr(*X*_{0}*X*_{1} | *lr*). This equivalence of length-two futures between the length-two pasts *ll* and *rl* suggests that these two pasts should be combined into a single causal state.^{15} A similar argument holds for *rr* and *lr*. Thus, we collect the pasts *ll* and *rl* into a single causal state and likewise group *rr* and *lr* into a second. The remaining four possible length-two pasts—*re*, *le*, *er*, *el*—each have unique futures, and so each forms its own causal state. The resulting six-state *ϵ*-machine is shown in figure 7. We can now give a chaotic crystallographic interpretation of the stacking process.
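The grouping step just described can be phrased compactly in code. In the sketch below, the conditional probabilities are placeholders standing in for those calculated from the fitted model, not the values reported in [89]; only the pattern of equalities matters.

```python
from itertools import groupby

# Placeholder Pr(next symbol | length-two past), with 'ee' excluded
cond = {
    "ll": {"l": 0.47, "r": 0.50, "e": 0.03},
    "rl": {"l": 0.47, "r": 0.50, "e": 0.03},   # same futures as 'll'
    "rr": {"l": 0.50, "r": 0.47, "e": 0.03},
    "lr": {"l": 0.50, "r": 0.47, "e": 0.03},   # same futures as 'rr'
    "re": {"l": 0.95, "r": 0.02, "e": 0.03},   # after e, flip the handedness
    "le": {"l": 0.02, "r": 0.95, "e": 0.03},
    "er": {"l": 0.60, "r": 0.35, "e": 0.05},
    "el": {"l": 0.35, "r": 0.60, "e": 0.05},
}

def signature(past: str, tol: float = 1e-6) -> tuple:
    """Discretized future distribution; equal signatures merge into one state."""
    return tuple(round(cond[past][s] / tol) for s in "lre")

pasts = sorted(cond, key=signature)
states = [list(group) for _, group in groupby(pasts, key=signature)]
print(states)   # six causal states: {ll, rl}, {rr, lr} and four singletons
```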

The most curious feature of this structure is its highly disordered pair of central states, each with an asymptotic state probability of 0.456. If we neglect for the moment the symbol *e* (which occurs only with small probability), this two-state core represents a nearly random process between the two symbols *l* and *r*. With just a probability of 0.024, strings composed of random *lr* sequences are interrupted by *e*. Now, from the states entered upon observing *le* and *re* come the most lopsided transition probabilities on the *ϵ*-machine: with probability 0.978, the symbol following an *e* is *opposite* the one that preceded it. In other words, an *e* is almost always sandwiched between two unlike symbols drawn from *l* and *r*, as in …*ler*… or …*rel*…. There is then a 0.494 probability that the process will return to the central states on emission of either an *l* or an *r*. If not, then another *e* is observed. Thus, even though pairs of *e*'s (*ee*) are unlikely (and, in fact, prohibited in this *ϵ*-machine), the probability of observing two *e*'s separated by a single *r* or *l* is surprisingly large. It appears that they ‘clump’ into small regions.

Often one examines the causal architecture for hints of what kinds of faults might be present. But since the most visited states (the two central ones) do not represent a cycle, there is no parent crystal here. Instead, one might take the view that the underlying process is random between *l* and *r* and only occasionally interrupted by *e*, such that *e* plays the role of a fault. If so, we think of this as an extrinsic fault, as *e* is not a part of the structure on either side. In any event, the causal architecture of the stacking of TBHB presents an interesting inversion of the usual crystal–fault dichotomy. It is the central core states that give rise to the disordered stacking, and the less visited peripheral states, especially those entered upon observing an *e*, that serve to constrain the probabilities of stacking sequences and, hence, impose a sort of ‘structure’. In this view, it is perhaps not too extreme to think of TBHB as a kind of ‘anti-crystal’.

#### (iii) Towards a unified view of material structure

How does all of this fit together? Let us contrast the chore of the crystallographer tasked with determining the structure of a periodic material and a non-periodic one. For the full three-dimensional periodic case, there are seven possible crystal systems: *triclinic*, *monoclinic*, *orthorhombic*, *tetragonal*, *cubic*, *trigonal* and *hexagonal*. One, of course, can be more specific and note that there are 230 crystallographic space groups. A periodic crystal must belong to one and only one of them. Thus, crystallography is equipped with tools that partition the space of all possible crystal structures into a finite number of non-overlapping sets. Of all the bewildering number of ways one might imagine putting atoms together in a periodic three-dimensional array, this limited classification system exhausts the possibilities. One can discuss the similarities between the different systems [93] and otherwise approach a genuine understanding of varieties of possible structures. But can the same be said for non-periodic materials?

To simplify the discussion, let us confine our attention to the one-dimensional case of stacking 1000 MLs. Let us suppose that this is over an alphabet of cardinality two. How many possible stacking sequences are there?^{16} Well, there are 2^{1000}≈10^{301}. Given that there are about 10^{80} protons in the observable Universe, it is clear that a comprehensive listing is simply not possible. And, if it were, it is questionable how helpful it would be. For these disordered materials, then, we are forced to appeal to statistical methods. Instead of a classification scheme fine-grained at the level of individual sequences, we instead collect all sequences that have the same statistical properties into a set. Colloquially, each set represents a *stacking process*. Operationally, we attempt to identify to which process a particular sequence belongs, and then we analyse the process in lieu of the particular sequence.
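The counting arithmetic is quickly verified:

```python
import math

# Number of binary stacking sequences of 1000 modular layers,
# expressed in decimal orders of magnitude
print(1000 * math.log10(2))   # ~301.03, i.e. 2**1000 ~ 10**301
```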

Each of the graphs in figures 4 and 7 specifies a particular process and defines a hidden Markov model. While there is still an infinite number of possible processes in the limit of indefinitely long sequences, a kind of order has been imposed. We can, for example, enumerate all the processes over a two-symbol alphabet with just one state. There is but one, and it is shown in figure 4*c*. (Figure 4*a*,*b* are just special cases of figure 4*c*.) For two-state binary processes, there are 13 [86].^{17} For binary processes, the number of distinct processes up to six states has been tabulated [95]. Thus, chaotic crystallography does for disordered materials much the same service that classical crystallography does for perfectly ordered ones: it organizes and structures the space of possible atomic arrangements. Furthermore, it allows comparison of the hidden Markov models between different materials in much the same way that crystal structures of different materials are compared according to which, for example, crystal system they belong.

We contend then that the hidden Markov models describing not only different specimens of the same material but different materials altogether can be compared, either by direct examination of the graphical model of the process or by information measures that characterize various computational requirements. As an example, we can compare measures of intrinsic computation between the two materials considered in the previous subsections as well as that of a third layered material, ZnS. Of the many measures one can select, we choose to examine these materials’ informational organization via a *complexity–entropy diagram* [25]. A complexity–entropy diagram plots, for each stacking process, the entropy rate *h*_{μ} of a symbol sequence discussed in §3 and the mutual information between the two halves of the specimen, the excess entropy **E**, introduced in §4. These measures can be calculated directly from the hidden Markov model for the stacking processes.
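For instance, when a stacking process is first-order Markov in the observed symbols, both coordinates follow directly from the transition matrix, since in that case **E**=H[*X*_{0}]−*h*_{μ}. A minimal sketch with an illustrative two-symbol chain:

```python
import math

def H(p) -> float:
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

# Pr(next | current) for an illustrative two-symbol chain
p01, p10 = 0.1, 0.4                            # Pr(1|0) and Pr(0|1)
T = [[1 - p01, p01], [p10, 1 - p10]]
pi = [p10 / (p01 + p10), p01 / (p01 + p10)]    # stationary distribution

h_mu = sum(pi[i] * H(T[i]) for i in range(2))  # entropy rate, bits per symbol
E = H(pi) - h_mu        # exact only for a first-order Markov process
print(h_mu, E)          # one point (h_mu, E) on the complexity-entropy diagram
```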

We begin with ice. Note that ice I_{c} and ice I_{h} are both described by single-state machines and, thus, each half of the crystal shares no information with the other half, giving **E**(I_{c})=**E**(I_{h})=0 bits. Similarly, being perfectly ordered, we find *h*_{μ}(I_{c})=*h*_{μ}(I_{h})=0 bits ML^{−1}. For ice I_{sd}, we calculate this quantity for a number of experimental specimens reported in the literature. Malkin *et al.* [73] performed X-ray diffraction studies of several samples of ice I that had been recrystallized from ice II and heated at rates between 0.1 and 30 K per minute over temperature ranges of 148–168 K. They used the *s*=4 Jagodzinski disorder model to analyse their results, and we find by direct calculation from the data given in their table 4 that these information measures cluster in the range of **E**(I_{sd})≈0.10−0.15 bits and *h*_{μ}(I_{sd})≈0.75−0.90 bits ML^{−1}. Murray *et al.* [78] carried out similar studies on ice I deposited as amorphous ice from the vapour phase onto a glass substrate at 110 K. The sample was subsequently warmed at a rate of 1 K per minute, and they reported diffraction patterns recorded at selected temperatures in the range of 125–160 K. They too analysed the diffraction patterns using the *s*=4 Jagodzinski disorder model, although they found that memory effects were negligible. We find by direct calculation from the data given in their table 1 that these information measures cluster near **E**(I_{sd})≈0 bits and *h*_{μ}(I_{sd})≈0.95−1.00 bits ML^{−1}. We can do the same for TBHB. We find **E**(TBHB)=0.21 bits and *h*_{μ}(TBHB)=1.1 bits ML^{−1}. For comparison, we also consider these quantities for several specimens of ZnS analysed elsewhere [29]. Lastly, to contrast with these disordered samples, we consider a one-dimensional process that has characteristics similar to those of a quasi-crystal, the *Thue–Morse Process* (TM process) [96]. Like a quasi-crystal, it is completely ‘ordered’, but non-periodic. We have **E**(TM)≈log_{2}*N* bits, where *N* is the number of layers in the specimen, and *h*_{μ}(TM)=0 bits ML^{−1}.

Since the maximum possible stacking disorder is 1 bit ML^{−1} for ice I, we see that disordered ice I really is, well, disordered. Additionally, very little information (**E**) is shared between the two halves: there is little one can predict about one half of the specimen knowing the other half. The clustering of these information measures does lend credibility to the notion that ice I_{sd} is a ‘new’ form of ice. We would, however, exercise caution in referring to this as a distinct thermodynamic phase of ice. Observe that not only is it ill defined in stacking-sequence space, i.e. there are many sequences that correspond to ice I_{sd}, but, as the spread of information measures on the complexity–entropy diagram shows, it is ill defined in process space as well. We prefer the interpretation that these specimens are chaotic crystals, each being described by a different hidden Markov model and each exhibiting different measures of information processing. Thus, they do not really constitute a separate phase in the same sense that ice I_{c} and ice I_{h} do. Ice I_{sd} is, at least at the moment, an umbrella term for ice I with a largely random stacking of hexagonal and cubic layers. We do note that information-theoretic measures can distinguish between ice I_{sd} samples having different histories under different thermodynamic conditions.

TBHB similarly appears to be highly disorganized. We note that, since there are three possible stacking orientations for each ML, the maximum possible stacking entropy is log_{2} 3≈1.58 bits ML^{−1}. Indeed, as noted before, there is no central periodic cycle on the graph of figure 7, as is commonly seen in faulted materials. Similarly, not much information is shared between the two halves. ZnS is the least entropic of the disordered specimens we consider. Like the ice I specimens considered, the ZnS ones are either grown disordered or caught in the transformation between crystalline phases: a hexagonal phase and a cubic one. Generically, however, ZnS appears to have more structured intermediate states, suggesting a more structured transformation, probably as a result of significant constraints on the types of disordering mechanisms in play. We can speculate that, although ice I and ZnS can both be described as close-packed structures, the disordering and transformation mechanisms are at least quantitatively, if not qualitatively, different for each.

Examining figure 8, we see that the complexity–entropy diagram also provides a partitioning of the kinds of structures that can exist. For example, any periodic process has zero entropy rate; thus, on a complexity–entropy diagram, all perfect crystals are confined to the vertical axis. This, then, makes concrete just how special crystallinity is. Similarly, quasi-crystals inhabit the upper left corner of the diagram, also confined to the vertical axis. Thus, while quite interesting, quasi-crystals are informationally rather special organizations. All the space to the right of the vertical axis is occupied by entropic crystals—just the kinds of specimens that chaotic crystallography is ideally suited to describe. Thus, chaotic crystallography introduces tools to quantify these structures and represents a significant expansion over the domain of classical crystallography.

Although we maintain that understanding structure is in itself a worthy enough goal, we are mindful that one of the fruits to be harvested from this enquiry is the possible exploitation of the connection between *structure* and *function*.^{18} The interrelationship between structure and material properties is quite well known. Carbon can exist as a face-centred cubic crystal and, when a specimen is so ordered, we call it diamond. More commonly, carbon is found in hexagonal sheets and is known as graphite. Carbon can also be arranged as nanotubes and spherical shells informally called *Bucky balls*. And, though each of these is identical in composition, their material properties are vastly different. Structure matters. Less drastically, different kinds of stacking structures change material properties in more subtle ways. Brafman & Steinberger [98] noted that, by changing from one kind of periodic stacking structure in ZnS to another, the degree of birefringence changes. Indeed, this change appeared to depend on only a single parameter, the hexagonality, which is the fraction of layers hexagonally related to their neighbours. And, perhaps consequentially, it did so in a very smooth and predictable way. We know that stacking structure affects other material properties, such as the diffraction pattern and, clearly, the correlation functions. It requires little imagination to speculate that other properties may be similarly affected.
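To make the hexagonality statistic concrete, here is a small illustrative helper (the function name and example sequences are ours): in the usual ABC notation for close-packed layers, an interior ML is hexagonally related to its neighbours when the layers above and below it are of the same type (…ABA…) and cubically related when all three differ (…ABC…).

```python
def hexagonality(stack: str) -> float:
    """Fraction of interior MLs that are hexagonally stacked, i.e. whose
    neighbouring layers are of the same type (e.g. the B in ...ABA...)."""
    if len(stack) < 3:
        raise ValueError("need at least three layers")
    h = sum(stack[i - 1] == stack[i + 1] for i in range(1, len(stack) - 1))
    return h / (len(stack) - 2)

print(hexagonality("ABABABAB"))   # 2H polytype: 1.0, purely hexagonal
print(hexagonality("ABCABCAB"))   # 3C polytype: 0.0, purely cubic
```

On this view, the 2H and 3C structures sit at the two extremes of the parameter, with the other periodic polytypes falling in between.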

Let us then return to the case of stacking 1000 MLs. Suppose we task a materials scientist to investigate the possible material properties obtainable from different stacking sequences. Even in the simple binary case, as we saw above, there are approximately 10^{301} such sequences. Thus, a detailed sequence-by-sequence analysis (the admittedly naive approach) is infeasible—whether experimentally, theoretically or via simulation. Yet, in the absence of any theory of disorder in materials, such a brute-force investigative approach might be thought necessary. A chaotic crystallography perspective immediately equips the materials scientist with tools to approach the problem. She knows, for instance, that many material properties are dictated by and calculable from knowledge of the stacking process alone. Thus, instead of trying to tackle the problem sequence by sequence, it is profitable instead to approach it process by process, as sketched below. Although the space is still enormous, it is considerably smaller and, importantly, is now systematized. Starting with simple processes and proceeding to more complex ones might, for example, be an effective strategy.^{19} Furthermore, properties may not even depend on the details of the stacking process, but may instead correlate with overall statistical properties or information-theoretic measures. The case of the birefringence of ZnS hints at this: a single statistical parameter correlates with the observed birefringence, at least for periodic stacking sequences. Also, the diffraction pattern is known to depend only on the pairwise correlations between MLs. It is well known that different stacking processes can have the same correlation functions, suggesting that an even less fine-grained approach may be profitable. To the extent that transmission properties through disordered potentials depend only on correlation functions [61], here too a less fine-grained approach may be useful.
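The following sketch illustrates what the process-by-process strategy looks like in practice. Rather than enumerating the roughly 10^{301} individual sequences, one posits a stacking process, samples a long specimen from it and estimates ensemble properties, here a pairwise layer-correlation statistic, directly. The two-state transition matrix is invented for illustration and does not model any particular material.

```python
import numpy as np

def sample_sequence(T, n, rng):
    """Sample an n-ML stacking sequence (binary stacking symbols 0/1)
    from a first-order Markov stacking process with transition matrix T."""
    s = np.empty(n, dtype=int)
    s[0] = rng.integers(2)
    for i in range(1, n):
        s[i] = rng.choice(2, p=T[s[i - 1]])
    return s

def agreement_correlation(s, max_lag=6):
    """Estimated probability that two symbols separated by each lag agree."""
    return [float(np.mean(s[:-k] == s[k:])) for k in range(1, max_lag + 1)]

T = np.array([[0.7, 0.3],     # a weakly faulted process: symbols tend
              [0.3, 0.7]])    # to repeat, with occasional flips
rng = np.random.default_rng(0)
seq = sample_sequence(T, 100_000, rng)
print(agreement_correlation(seq))   # decays towards 0.5 as memory is lost
```

Because quantities such as the diffraction pattern depend only on such pairwise correlations, two processes with matching correlation functions are, for those properties, interchangeable—the coarse-graining invoked above.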

One may object, questioning whether we are guaranteed that all material properties are the same for all realizations of a process. We are not. However, theoretical results suggesting the important parameters to consider, coupled with experimental observations and the outcomes of simulations, can give confidence that a particular property under study is an ensemble property. Unquestionably, much of the connection between information-theoretic properties and material properties remains unexplored. Along the lines presented here, and paralleling Schrödinger's principled but speculative thoughts about life's organization, the abundant hints of intimate connections are too promising, and the possible rewards of finding and exploiting such connections too rich, not to explore.

We note too that the exercise of predicting material properties from structure is by no means academic: the Materials Genome Initiative [99] is a coordinated and dedicated effort spanning theoretical, experimental and simulation studies attempting to do just this. Given the sheer variety of possible arrangements of atoms, an organizational scheme that structures the space of possibilities is an absolute necessity. Otherwise, researchers will find themselves relying on intuition—formidable certainly, but all too often unreliable—alone to propose and assemble possible configurations with novel material properties. Without too much exaggeration, it is akin to banging on a keyboard hoping to finger out one of Shakespeare's sonnets: possible yes, but ever so much more likely if one knows the rules of English grammar.

## 5. Thermodynamics of material computation

Up to this point, we have focused exclusively on informational properties embedded in the static structure of ‘chaotic’ materials, ignoring temporal dynamics: their growth, their functional behaviour in the ‘wild’ and the like. A full story, though, requires a thermodynamic accounting of the informational aspects of such materials—the energetics of their equilibrium and non-equilibrium configurations, the energetics of how they come to be, how they are transformed and what functions they support. Here, to illustrate the connections between intrinsic information and energetic costs, we briefly review recent explorations of Maxwell's Demon and a ratchet model that describes how molecular ‘engines’ can store and process information as they traverse a control sequence.

### (a) Szilard's single-molecule engine

Biological macromolecules [100–102] perform tasks that involve the simultaneous manipulation of energy, information and matter. Although we can sometimes identify such functioning—in the current-gating of a membrane ion channel [103,104] that supports propagating spike trains along a neuronal axon or in a motor protein hauling nutrients across a cell's microtubule highways [100]—it is not well understood. Understanding calls for a thermodynamics of nanoscale systems that operate far out of equilibrium and for a physics of information that quantitatively identifies organization and function. At root, we must reconcile this functioning with the entropy generation dictated by the Second Law of Thermodynamics. James Clerk Maxwell introduced the Demon that now bears his name to highlight the essential paradox. If a Demon can measure the state of a molecular system and take actions based on that knowledge, the Second Law can be violated: sorting slow and fast molecules onto separate sides of a partition creates a temperature gradient that a heat engine can convert to useful work. In this way, Demon ‘intelligence’—or, in our vocabulary, information processing—can convert thermal fluctuations (disorganized energy) into work (organized energy).

In 1929, Leo Szilard introduced an ideal Maxwellian Demon for examining the role of information processing in the Second Law [105], a thought experiment that, a decade or so later, provided an impetus for Shannon's communication theory [106]. Szilard's Engine consists of three components: a controller (the Demon), a thermodynamic system (a molecule in a box) and a heat reservoir that keeps both thermalized to a temperature *T*. It operates via a simple repeating three-step cycle of measurement, control and erasure. During measurement, a barrier is inserted midway in the box, constraining the molecule to either the box's left or right half, and the Demon's memory changes to reflect on which side the molecule is. In the thermodynamic control step, the Demon uses that knowledge to allow the molecule to push the barrier to the side opposite the molecule, extracting work from the thermal reservoir. In the erasure step, the Demon resets its finite memory to a default state, so that it can perform measurement again. The periodic *protocol* cycle of measurement, control and erasure repeats endlessly and deterministically. The net result is the extraction of work from the reservoir, balanced by entropy created by changes in the Demon's memory. The Second Law is respected and the Demon exorcised, since dumping that entropy to the heat bath requires a work flow that exactly compensates the energy gained during the control step.
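A short bookkeeping sketch (our own, assuming quasi-static operation and ideal-gas work terms) makes this compensation explicit: for any barrier position δ, the average work extracted in the control step equals the minimal Landauer cost of resetting a memory whose ‘left/right’ record has Shannon entropy H(δ), so the net work per cycle vanishes.

```python
import numpy as np

kT = 1.0       # work measured in units of k_B * T
delta = 0.3    # fraction of the box to the left of the barrier

def shannon_nats(p):
    """Shannon entropy of a binary outcome, in nats."""
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

# Control: isothermal expansion from the occupied side to the full box
# extracts kT*ln(1/delta) with probability delta and kT*ln(1/(1-delta))
# with probability 1-delta; the average is kT times the record's entropy.
w_control = kT * shannon_nats(delta)

# Erasure: quasi-statically resetting the one-bit memory dissipates at
# least kT times the entropy of the stored record.
w_erasure = kT * shannon_nats(delta)

print(w_control - w_erasure)   # 0.0: extraction and erasure exactly balance
```

As discussed below, [107] shows that how this total splits between measurement and erasure is not fixed: once the memory-cell sizes are also varied, either step can be free or costly; only the sum is constrained.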

Connecting nonlinear dynamics to the thermodynamics of Szilard's Engine, we recently showed that its measurement–control–erasure barrier–sliding protocol is equivalent to a discrete-time two-dimensional map from the unit square to itself [107]. This explicit construction establishes that Szilard's Engine is a chaotic system whose component maps are thermodynamic transformations—what we now call a *piecewise thermodynamical system*. An animation of the Szilard Engine, recast as this chaotic dynamical system, can be viewed at http://csc.ucdavis.edu/~cmg/compmech/pubs/dds.htm.

What does chaos in the Szilard Engine mean? The joint system generates information—information that the Demon must keep repeatedly measuring to stay synchronized to the molecule's position. On the one hand, information is generated by the heat reservoir through state-space expansion during control. This is the chaotic instability in the Engine when viewed as a dynamical system. And, on the other, information is stored by the Demon (temporarily) so that it can extract energy from the reservoir by allowing the partition to move in the appropriate direction. To return the Engine to the same initial state, that stored information must be erased. This dynamically contracts state space and so is locally dissipative, giving up energy to the reservoir.

The overall information production rate is given by the Engine's Kolmogorov–Sinai entropy *h*_{μ} [108]. This measures the flow of information from the molecular subsystem into the Demon: information harvested from the reservoir and used by the Demon to convert thermal energy into work. Simply stated, the degree of chaos determines the rate of energy extraction from the reservoir. Moreover, in its basic configuration, with the barrier placed at the box's middle and the memory states of equal size, the Demon's molecule-position measurements are optimal. It uses all of the information *h*_{μ} generated by the thermodynamic system: recalling §3, all of the generated information *h*_{μ} is bound information *b*_{μ}; none of the generated information is lost (*r*_{μ} vanishes).

Critically, the dynamical Szilard Engine shows that a widely held belief about the thermodynamic costs of information processing—the so-called Landauer principle [109–113]: each erased bit costs at least *k*_{B}*T* ln 2 of dissipated energy and the act of measurement comes at no thermodynamic cost—is at best a special case [107,114–116].^{20} As the partition location varies and the Demon memory cells change size, both measurement and erasure can dissipate any positive or negative amount of heat. Specifically, there are Szilard Engine configurations that directly violate Landauer's principle: erasure is thermodynamically free and measurement is costly—an anti-Landauer principle. The result is that the Szilard Engine achieves a lower bound on energy dissipation expressed as the sum of measurement and erasure thermodynamic costs. In this, the Szilard Engine captures an optimality in the conversion of information into work that is analogous to a Carnot Engine's optimal efficiency when converting a difference in thermal energies into work.

### (b) Information catalysts

Szilard's Engine is one of the simplest controlled thermodynamic devices, and it lays bare the tension between the Second Law and the functionality of an information-gathering entity or subsystem (the Demon). The net work extracted exactly balances the thermodynamic (entropic) costs. This was Szilard's main point, though we see that his Engine was not very functional, merely consistent with the Second Law. The major contribution was that, long before Shannon's information theory, Szilard recognized the importance of the Demon's information acquisition and storage in resolving Maxwell's paradox.

This allows us to move to a more sophisticated device that uses a reservoir of information (a string of random bits) to extract net positive work from a heat reservoir. To set the stage for the thermodynamics we are interested in, while staying in the spirit of complex materials, let us re-imagine the Szilard Engine implemented as an enzyme macromolecule whose conformational states implement the measurement–control–erasure protocol. Moreover, let this enzyme traverse a one-dimensional periodic crystal—say, a strand of DNA—reading its successive base pairs to obtain individual measurement, control and erasure protocol commands. The preceding thermodynamic and informational analysis thus applies to such a molecular engine—an actively controlled system that can rectify fluctuations, being only temporarily and locally inconsistent with the Second Law.

Let us go one step further, though, and imagine a functional enzyme that, over a thermodynamic cycle, extracts net positive work from an information reservoir to store or release energy as it assembles or disassembles a chain of small molecular components. In this, we replace the one-dimensional ‘control’ molecule with a set of random bits that come into local equilibrium with the enzyme. As they do, the enzyme's dynamics shift to catalyse assembling the components. The shift allows the enzyme to selectively use energy from a reservoir, say an ATP-rich environment whose molecules the machine accesses when energy is needed (ATP → ADP) or given up (ADP → ATP). Figure 9 illustrates the new, functional molecular machine.

In this way, the imagined enzyme acts as an *information catalyst* that facilitates, via what are otherwise thermodynamically unfavourable reactions, the assembly of the chain of molecular components. In the 1940s, Leon Brillouin [118] and Norbert Wiener [119], early pioneers in the physics of information, viewed enzymes as just these kinds of catalysts. In particular, Brillouin proposed a rather similar ‘negative catalysis’ as the molecular substrate that generated negentropy—the ordering principle Schrödinger identified as necessary to sustain life processes consistent with the Second Law. Only much later would such ‘information molecules’ be championed by the evolutionary biologists John Maynard Smith and Eörs Szathmáry [120].

We recently analysed the thermodynamics of a class of memoryful information catalysts [117] for which all correlations among system components could be explicitly accounted for. This gave an exact, analytical treatment of the thermodynamically relevant Shannon information change from the input information reservoir (bit string with entropy rate *h*_{μ}) to an exhaust reservoir (bit string with entropy rate *h*_{μ}′). The result was a refined and broadly applicable Second Law that properly accounts for the intrinsic information processing reflected in the accumulation of temporal correlations. On the one hand, the result gives an informational upper bound on the maximum average work 〈*W*〉 extracted per cycle:

〈*W*〉 ≤ *k*_{B}*T* ln 2 (*h*_{μ}′ − *h*_{μ}),

where *k*_{B} is Boltzmann's constant and *T* is the environment's temperature. On the other hand, this new Second Law bounds the energy needed to materially drive transforming the input information to the output information. That is, it upper bounds the amount −〈*W*〉 of input work to a physical system to support a given rate of intrinsic computation, interpreted as producing a more ordered output—a reduction in entropy rate.

This Second Law allows us to identify the Demon's thermodynamic functions. Depending on model parameters, it acts as an *engine*, extracting energy from a single reservoir and converting it into work (〈*W*〉>0) by randomizing the input information (*h*_{μ}′−*h*_{μ}>0), or as an *information eraser*, erasing information (*h*_{μ}′−*h*_{μ}<0) in the input at the cost of the external input of work (〈*W*〉<0). Moreover, the Demon supports a counterintuitive functionality. In contrast with previous erasers that only decreased single-bit uncertainty H[*X*_{0}], it sports a new kind of eraser that removes multiple-bit uncertainties by adding correlation (temporal order), while single-bit uncertainties are actually increased (H′[*X*_{0}]−H[*X*_{0}]>0). This modality leads to a provocative interpretation of life processes: the existence of natural Demons with memory (internal states) is a sign that they have been adapted to leverage temporally correlated fluctuations in their environment.
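Taking the bound above at face value, a few lines of Python (the function name and example rates are ours) suffice to classify a Demon's modality from the input and output entropy rates alone:

```python
import numpy as np

kB = 1.380649e-23   # Boltzmann's constant, J/K

def demon_modality(h_in, h_out, T=300.0):
    """Classify a Demon via the refined Second Law
    <W> <= kB * T * ln(2) * (h_out - h_in), rates in bits per symbol."""
    w_bound = kB * T * np.log(2) * (h_out - h_in)   # joules per cycle
    if w_bound > 0:
        return w_bound, "engine: randomizes input, can extract work"
    if w_bound < 0:
        return w_bound, "eraser: orders input, requires work input"
    return w_bound, "neutral"

print(demon_modality(h_in=0.5, h_out=1.0))   # h' - h > 0: engine
print(demon_modality(h_in=1.0, h_out=0.4))   # h' - h < 0: eraser
```

Note that this sketch sees only the single-symbol entropy rates; the multiple-bit eraser described above, which adds temporal correlation while increasing single-bit uncertainty, is distinguished precisely by information the rates alone do not capture.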

## 6. Conclusion

We have come a long way from Schrödinger's prescient insight on aperiodic crystals. We argued, across several rather different scales of space and time and several distinct application domains, that there is an intimate link between the physics of life and understanding the informational basis of biological processes when viewed in terms of life's constituent complex materials. We noted, along the way, the close connection between new experimental techniques and novel theoretical foundations—a connection necessary for advancing our understanding of biological organization and processes. We argued for the importance of structure and strove to show that we can now directly and quantitatively talk about organization in disordered materials, a consequence of breaking away from viewing crystals as only periodic [42,64]. These structured-disordered materials, in their ability to store and process information, presumably played a role in the transition from mere molecules to material organizations that became substrates supporting biology [121]. For biology, of course, its non-crystalline ‘disorder’ is much, much more: it encodes the information necessary for life. Thus, biological matter is more than wet, squishy ‘soft matter’; it is informational matter. DNA, RNA and proteins are molecules of information [118–120], so much so that DNA, for example, can be programmed [122–124]. And, in a complementary way, the parallels driving our development here perhaps give an alternative view of ‘material genomics’ [99].

What distinguishes biological matter from mere physical matter is that the information in the former encodes organization and that organization takes on a catalytic function through interactions in a structurally diverse environment. Moreover and critically, these characters are expressed in a way that leads to increasingly novel, complex chemical structures—structures that form into entities with differential replication rates [125]. And the high-replication entities, in turn, modify the environment, building ‘niches’ that enhance replication; completing a thermodynamic cycle whose long-term evolutionary dynamics are thought to be creatively open-ended.

We saw that pondering Schrödinger's view of the physical basis of life raised questions of order, disorder and structure in one-dimensional materials. Chaotic crystallography—free of ‘the benevolent tyranny of the crystal’ [126]—emerged as an overarching theory for the organization of close-packed materials. It gave a consistent way to describe, at one and the same time, order and disorder in layer stacking in ice and aromatic compounds and, generally, in one-dimensional chaotic crystals. And, in this, it hints at a role that local (dis)ordering can play in enhancing how biomolecules function synergistically in solution. The issue of biological function forced us to probe more deeply into its consistency with the Second Law of Thermodynamics. We then turned to consider two simple cases of Maxwellian molecular Demons to illustrate that the Second Law of Thermodynamics is perfectly consistent with the informational character and functionality of smart molecules—that thermodynamics can begin to describe the energetics of such information catalysts.

Admittedly, we addressed only in a cursory way several of the major challenges that an informational view of matter poses. Shannon introduced information as surprise and we showed that this readily led to seeing how information is created, stored and transmitted. We only just broached the abiding question, however, of how these kinds of information contribute to material functionality. Szilard's Engine and related information catalysts hinted at how we will come to analyse functional information in complex materials and biomolecules. Hopefully, the informational perspective will be sufficiently fruitful to extend to analysing how such structured objects operate in their environments—how, for example, water plays a critical role in biomolecular interactions and function.

## Note added in proof

It has been brought to our attention that Cartwright & Mackay [42] recently speculated that cubic ice might profitably be viewed as a one-dimensional chaotic crystal due to its disordered and aperiodic character. We confirm their suspicions and do exactly that in §4b(i). We thank Julyan Cartwright for pointing this out. Predating the notion of a chaotic crystal, of course, is Ruelle's famous ‘turbulent crystal’ [127], a crystal that breaks translational symmetry and has ‘a structure modulated in a non-quasiperiodic manner’. Such crystals would be expected to show some ‘fuzziness’ in the sharp reflections of the diffraction pattern, as well as diffuse scattering, and would be similar to the notion of a chaotic crystal discussed here. We differentiate our efforts from these previous descriptions by insisting that information- and computation-theoretic measures and methods are the proper platform for quantifying disordered crystal structures.

## Authors' contributions

J.P.C. and D.P.V. wrote the manuscript jointly.

## Competing interests

We have no competing interests.

## Funding

This material is based upon work supported by, or in part by, the US Army Research Laboratory and the US Army Research Office under contract W911NF-13-1-0390.

## Acknowledgements

The authors thank Alec Boyd, Hans-Beat Bürgi, Korana Burke, Julyan Cartwright, Ryan James, Xincheng Lei, John Mahoney, Dibyendu Mandal, Tara Michels-Clark and Benjamin Murray for helpful conversations. We thank the Santa Fe Institute for its hospitality during visits. J.P.C. is an SFI External Faculty member.

## Footnotes

One contribution of 21 to a theme issue ‘DNA as information’.

↵1 We take the view that science is fundamentally an experimental endeavour, ultimately dependent on empirical observation. Pure mathematics, while of enormous interest for both its intellectual vigour and beauty, as well as its practical applications, need not make appeal to experiment for validation or refutation of its claims and, thus, does not constitute a scientific discipline.

↵2 In a perhaps unexpected overlap with our narrative, Shannon's PhD thesis from the Massachusetts Institute of Technology (1940) was titled *An Algebra for Theoretical Genetics* and explored the mathematics of genetic recombination.

↵3 It is not always appreciated that Alan Mackay anticipated the discovery of quasi-crystals. As early as 1962 [11] he calculated the packing density of identical spheres in an icosahedral shell packing, and in 1981 [12] he considered five-fold symmetric structures in the context of crystallography. For his efforts, he was awarded the Oliver E. Buckley Condensed Matter Prize in 2010 ‘for pioneering contributions to the theory of quasi-crystals, including the prediction of their diffraction pattern’ [13]. A short history of the discovery of quasi-crystals, and the scientific environment at the time, is given by Hargittai [14].

↵4 How one should properly define a ‘crystal’—and by extension the field of crystallography—has been a matter of some controversy. (See, for example, an extended discussion in *Zeitschrift für Kristallographie* **222**(6) 2007.) In the early days of crystallography, it was natural to seek a definition to delineate the boundaries of the field. Quite properly, however, as knowledge increased, the adherence to the notion of a crystal as a periodic repetition of atoms in space was found to be too confining—hence the doubt over whether quasi-crystals were really crystals. With the redefinition of the term ‘crystal’ by the International Union of Crystallography (IUCr) in 1992 to include anything with ‘an essentially sharp diffraction pattern’ [34], the scope of the field was expanded. However, intellectual explorations have never been hemmed in by rigid definitions. A view, and one with merit in our opinion, is that the purview of any field is defined by what its practitioners do. Even a cursory glance at articles in IUCr journals shows a long and tenacious interest in disordered structures. Thus, if crystallography is what crystallographers do, and what crystallographers do is study crystals, then the association of crystals—and hence crystallography itself—with disorder is and always has been part of the field. We hope that purists will not judge too harshly as we make this connection explicit by attaching notions of disorder, entropy and information to the terms ‘crystal’ and ‘crystallography.’

↵5 In his own words, ‘… we contend that the subject matter of crystallography is really structure (at various levels) and that the concern with crystals is only incidental’ [39] and ‘… crystallography is no longer the science dealing only with the symmetry of the external forms of mineral crystals but has exfoliated to become a much more general science dealing with structure at the atomic level—the way in which the aggregations of atoms combine to give the manifold properties of matter, living as well as inanimate’ [36]. Notwithstanding our great admiration for Mackay's efforts and enthusiasm, we sense that the struggle for the soul of crystallography remains an ongoing battle.

↵6 It is the intermetallic compound Al_{55.4}Cu_{5.4}Ta_{39.1}, with a total structural information content of 48 538.637 bits per unit cell.

↵7 Our usage is similar to the recently introduced ‘crystallographically challenged materials’ of Billinge & Kanatzidis [54]. There appears to be some difference in emphasis, as they define crystallographically challenged materials as those materials where ‘deviations from perfect crystallinity are rather severe but a remnant of the crystallinity is apparent’. It is still possible to have considerable structure without retaining a ‘remnant of the crystallinity’; thus, a chaotic crystal might be more broadly defined. However, these two terms largely occupy the same conceptual space.

↵8 The necessity of water for life has come under significant scrutiny [65].

↵9 Note that the position of the oxygens does not uniquely fix the positions of the hydrogens. In ice I, of the four possible positions that may be occupied by a hydrogen, only two are, and these are usually taken to be random. Thus, ice I is referred to as *proton disordered*. We do not consider proton disordering in our analysis.

↵10 Here and elsewhere, we adopt the convention that a bar over a variable means 1 minus that variable, i.e. x̄≡1−*x*.

↵11 The stacking rules given here may seem non-intuitive. Both Bürgi *et al.* [91] and Michels-Clark *et al.* [89] give excellent and extended discussions of the stacking possibilities and the geometrical and chemical constraints that cause them. We only synopsize those results, and the interested reader is urged to consult these references for a detailed explanation.

↵12 To ease the burden of the nomenclature we will introduce shortly, we are using {*l*,*r*} for these triplet sequences, instead of the {*b*_{L},*b*_{R}} used in Michels-Clark *et al.* [89]. However, there is no change in meaning between these two sets.

↵14 We use the corrected table given in the latter reference, [92], in this present analysis.

↵15 The collapsing of pasts with equivalent futures in the context of crystallography is discussed extensively in Example C of [31]. Note that in the present case one need only consider length-two sequences, as this model is second-order Markovian in the *rle* notation.

↵16 In our enumeration, we neglect to consider possible physical and symmetry constraints that could reduce this number significantly. (See [94] and references therein for a discussion.) While it is possible to economize the enumeration task, the growth of the number of sequences is still exponential in sequence length and our main point holds.

↵17 We brush aside some technicalities here. In this enumeration, we require that each state transitions to only one successor state on the emission of a particular symbol, a property called *unifilarity*. These details, while important, do not detract from our main point here: that process space over a finite alphabet can be systematically ordered.

↵18 We are reminded here of the well-known epigram by Louis Sullivan, ‘Form ever follows function’ [97]. Although uttered in the context of building architecture, it applies equally well to material architecture.

↵19 Chaotic crystallography has tools for quantifying the complexity of the stacking process [30,33], so this notion of treating simple processes first can be made operational.

↵20 Early on, von Neumann [114, lecture 4] discussed the general costs of information processing and transmission without falling into the trap of assigning costs only to information erasure.

Accepted December 8, 2015.

© 2016 The Author(s). Published by the Royal Society. All rights reserved.