## Abstract

A method of defining non-equilibrium entropy for a chaotic dynamical system is proposed which, unlike the usual method based on Boltzmann’s principle, does not involve the concept of a macroscopic state. The idea is illustrated using an example based on Arnold’s ‘cat’ map. The example also demonstrates that it is possible to have irreversible behaviour, involving a large increase of entropy, in a chaotic system with only two degrees of freedom.

## 1. Introduction

It is a part of everyday experience that matter behaves irreversibly: heat flows from hot to cold, never from cold to hot; plates break when we drop them, but never reconstitute themselves, and so on. A way of quantifying this irreversibility is provided by the second law of thermodynamics, which can be encapsulated in the statement that the thermodynamic entropy of a thermally isolated system cannot decrease.

Irreversibility is displayed by most of the partial differential equations (PDEs) we use to model the macroscopic behaviour of matter—the heat equation, the Navier–Stokes equations and so on (though not the Euler equations). The irreversibility comes from the lack of symmetry of these PDEs under time reversal—the fact that they are not invariant under the transformation *t*→−*t*,*v*→−*v* (reversal of the sign of the time variable and all velocity variables). From such PDEs, it is usually possible to derive a result corresponding to the second law of thermodynamics. If the boundary conditions permit no heat to cross the boundary, then this result takes the form that some quantity that can be interpreted as the entropy increases with time or stays constant. Likewise, Boltzmann’s integro-differential equation for the time evolution of the velocity distribution in a gas is not symmetrical under time reversal, and Boltzmann showed, in his celebrated *H* theorem, that the entropy-like quantity −*H* increases with time or stays constant [1] (English translation in [2]).

For the ‘microscopic’ descriptions used in statistical mechanics to describe the motion of individual molecules in detail, the situation is different. The differential equations of particle dynamics (Hamilton’s equations of motion in classical mechanics, Schrödinger’s time-dependent equation in quantum mechanics) are symmetric under time reversal. From this symmetry, it follows that in particle dynamics, there cannot be a dynamical variable that increases with time for every solution of the equations of motion. The reason is that, for every motion on which a dynamical variable increases with time, there must be another motion, obtained from the first one by time reversal, on which the same dynamical variable decreases with time.

What this paradoxical situation reveals is that the microscopic models, even though they contain so much more detail than the PDE models, are incomplete in the sense that their differential equations do not capture the difference between plausible motions (such as heat moving from hot to cold) and implausible ones (such as heat moving from cold to hot). One way of tackling this incompleteness might be to append to the differential equations a criterion for ruling out the implausible motions. This could take the form of a dynamical variable that is equal to the entropy; then motions for which the entropy decreased with time would be recognized as implausible.

An important method for defining entropy in a non-equilibrium system is Boltzmann’s principle [3–5] (see also [6]). To formulate this principle, let us denote by *Γ* the phase space, consisting of all possible dynamical states of the system, and by *Γ*_{M} the set of all dynamical states compatible with a given macroscopic state *M*. Then, the entropy of the system, when it is in the macroscopic state *M*, can be defined as

$$S(M) := k\log\bigl(c\,\mu(\Gamma_M)\bigr), \tag{1.1}$$

where *k* is Boltzmann’s constant, *μ*(*Γ*_{M}) is the measure of the set *Γ*_{M} and *c* is a constant depending on the number and type of particles in the system, which is necessary, in general, for consistency with the thermodynamic entropy, but can be taken equal to 1 for the very simple systems considered in this paper.

While the definition (1.1) of entropy has the virtue of being simple in principle, there are some difficulties. One of them is that the definition of a macrostate is quite vague. If one defines macrostates in terms of how many particles are in a particular region of space, for example, then it is not clear what is meant by saying that two values for this particle number are macroscopically identical. Suppose there are a million particles in that region. Would we say that this is macroscopically identical to 999 999? to 999 000?

A second difficulty is that the very notion of ‘macroscopic’ seems to presuppose that the system consists of a very large number of particles. But later in this paper an example is given showing that irreversible behaviour is possible in a system with only two degrees of freedom. One would like to have a definition of entropy that can be used for systems of any size, or ones that do not consist of particles at all.

The purpose of the present paper is to propose an alternative to the definition (1.1) of entropy which can be used for any dynamical system at all, regardless of the number of degrees of freedom and of whether the concept of macroscopic state applies.

## 2. Dynamical self-correlation and non-equilibrium entropy

Here, a purely mechanical definition is proposed for the ‘entropy’ of a chaotic dynamical system. It defines an entropy for a segment of a trajectory. Very roughly, this entropy is the logarithm of the measure of the part of phase space that can be reached from phase points near the ends of the trajectory segment during the time while it is being traversed. The definition will be formulated for dynamical systems where the time changes continuously; the modifications necessary when the time changes in steps should be obvious.

Let *Γ* denote the phase space and *γ* a general point in *Γ*, and let *ϕ*_{t} be the flow on *Γ* which defines the time evolution. It will be assumed that *ϕ*_{t} has time-reversal invariance. This means that there is an involution operator *T* with the property that if {*γ*_{t}} is a trajectory of the dynamical system (i.e. a solution of its equations of motion), then the set {*Tγ*_{−t}} is also a trajectory. For example, in Hamiltonian mechanics, *T* is the operator that reverses all velocities and/or momenta, but leaves the positions invariant. The assumed time-reversal symmetry of the dynamical system can be written

$$\phi_t\,T\,\phi_t = T \quad\text{for all } t. \tag{2.1}$$

We assume further that there is a measure *μ* on *Γ* which is preserved by *ϕ*_{t}, i.e. that *μ*(*ϕ*_{t}(*A*))=*μ*(*A*) for every measurable set *A* in phase space. For the time being, we assume also that this measure is ergodic, i.e. that every measurable invariant set has measure either 0 or *μ*(*Γ*). In classical Hamiltonian mechanics, the Lebesgue measure on phase space is preserved by the flow, but is not ergodic.

Consider now a trajectory segment whose ends are *γ*_{1} and *γ*_{2}, where *γ*_{1} and *t*_{12} are arbitrary (*t*_{12} may have either sign) and *γ*_{2}=*ϕ*_{t12}(*γ*_{1}). Choose any small positive number *ϵ* and define the *dynamical self-correlation* of the two endpoints of the trajectory segment (for the ergodic case) to be

$$C_\epsilon(\gamma_1,\gamma_2) := \frac{\mu\bigl(\phi_{t_{12}}B_\epsilon(\gamma_1)\cap B_\epsilon(\gamma_2)\bigr)}{\mu(B_\epsilon)}, \tag{2.2}$$

where *B*_{ϵ}(*γ*) denotes the ball of radius *ϵ* centred at the phase point *γ*, and *μ*(*B*_{ϵ}):=*μ*(*B*_{ϵ}(*γ*)), the measure of a ball of radius *ϵ*, which is the same for all *γ*.

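The dynamical self-correlation has the obvious properties^{1}

$$0 \le C_\epsilon(\gamma_1,\gamma_2) \le 1. \tag{2.3}$$

Moreover, the function *C*_{ϵ}(⋅,⋅) is symmetric in its two arguments:

$$C_\epsilon(\gamma_1,\gamma_2) = C_\epsilon(\gamma_2,\gamma_1). \tag{2.4}$$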
This property follows from the fact that the mapping *ϕ*_{t} preserves measure, so that the sets *ϕ*_{t12}*B*_{ϵ}(*γ*_{1})∩*B*_{ϵ}(*γ*_{2}) and *ϕ*_{t21}(*ϕ*_{t12}*B*_{ϵ}(*γ*_{1})∩*B*_{ϵ}(*γ*_{2}))=*B*_{ϵ}(*γ*_{1})∩*ϕ*_{t21}*B*_{ϵ}(*γ*_{2}) (where *t*_{21}:=−*t*_{12}) have the same measure and hence the numerators in the definitions of *C*_{ϵ}(*γ*_{1},*γ*_{2}) and *C*_{ϵ}(*γ*_{2},*γ*_{1}) are equal.

It can also be shown, using the symmetry properties of *ϕ*_{t} and *B*_{ϵ} with respect to the involution *T*, that the function *C*_{ϵ}(⋅,⋅) is invariant under the time reversal involution:

$$C_\epsilon(T\gamma_2,T\gamma_1) = C_\epsilon(\gamma_1,\gamma_2). \tag{2.5}$$

Informally, *C*_{ϵ}(*γ*_{1},*γ*_{2}) as defined in (2.2) can (for positive *t*_{12}) be interpreted as the conditional probability that, if the system is started at time *t*_{1} from a phase point chosen at random from the neighbourhood *B*_{ϵ}(*γ*_{1}), then its phase point at the later time *t*_{2} will lie in the neighbourhood *B*_{ϵ}(*γ*_{2}). The larger the region of phase space the phase point can stray to within the time interval *t*_{2}−*t*_{1}, the less likely it is to find its way to *B*_{ϵ}(*γ*_{2}) at the appointed time, and so we might expect *C*_{ϵ}(*γ*_{1},*γ*_{2}) to be inversely proportional to the volume of phase space the system can reach from phase points near *γ*_{1} during the available time. According to Boltzmann’s principle, the entropy associated with this amount of phase space is a constant plus *k* times the logarithm of its volume, so we may expect to interpret *k* times the logarithm of 1/*C*_{ϵ}(*γ*_{1},*γ*_{2}) as an entropy.

To formulate a quantitative relation between dynamical self-correlation and entropy, consider the behaviour of the dynamical self-correlation when the length of the trajectory segment is very large. According to the informal interpretation just mentioned, *C*_{ϵ}(*γ*_{1},*γ*_{2}) is the probability that, at the time *t*_{2}, the phase point will be found in the region *B*_{ϵ}(*γ*_{2}). For very large *t*_{2}, assuming the dynamical system to be chaotic, one may plausibly equate this probability with the equilibrium probability of finding the phase point in *B*_{ϵ}(*γ*_{2}). Under our assumption that the dynamical system is ergodic, this equilibrium probability is

$$\frac{\mu\bigl(B_\epsilon(\gamma_2)\bigr)}{\mu(\Gamma)}. \tag{2.6}$$

Using the fact that *μ*(*B*_{ϵ}(*γ*)) is independent of *γ*, we arrive at the following conjecture.

### Conjecture 2.1

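If the dynamical system is ergodic, then

$$\lim_{t_{12}\to\infty} C_\epsilon(\gamma_1,\gamma_2) = \frac{\mu(B_\epsilon)}{\mu(\Gamma)}. \tag{2.7}$$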

Some support for this conjecture is given by the result in §3 where equation (2.7) is shown to hold (albeit for square, not circular, neighbourhoods) in the case of the Arnold ‘cat’ map.

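Using Boltzmann’s principle (1.1) with *c*=1 and the whole phase space *Γ* in place of *Γ*_{M}, we can use the conjecture (2.7) to express the equilibrium entropy in terms of dynamical self-correlation:

$$S_{\mathrm{eq}} = k\log\mu(\Gamma) = \lim_{t_{12}\to\infty} k\log\frac{\mu(B_\epsilon)}{C_\epsilon(\gamma_1,\gamma_2)} \tag{2.8}$$

for the case where the phase space is ergodic.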

By analogy, let us define
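
$$S_\epsilon(\gamma_1,\gamma_2) := k\log\frac{\mu(B_\epsilon)}{C_\epsilon(\gamma_1,\gamma_2)}. \tag{2.9}$$
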
As explained earlier, one might expect *C*_{ϵ}(*γ*_{1},*γ*_{2}) to be inversely proportional to the volume of phase space the system can reach from phase points near *γ*_{1} during the available time, and so *S*_{ϵ}(*γ*_{1},*γ*_{2}) can be interpreted as *k* times the logarithm of this phase-space volume, i.e. (by Boltzmann’s principle (1.1)) as a non-equilibrium entropy associated with a system which is started at time *t*_{1} near state *γ*_{1} and is then allowed to evolve for a limited time, not long enough to reach equilibrium.

In the alternative case where the system is not ergodic, the formulae are a little more complicated. Consider, for example, the case where there is just one invariant of the motion, the energy. To each value of *E* there corresponds an ‘energy surface’, the set *Γ*_{E}:={*γ*:*H*(*γ*)=*E*} where *H*(⋅) is the Hamiltonian. The invariant measure obtained by restricting the measure *μ* to *Γ*_{E} (the microcanonical measure at energy *E*) will be denoted by *μ*^{(E)}.

The extra complication for this case arises from the fact that, although *μ*(*B*_{ϵ}(*γ*)) is independent of *γ*, the microcanonical measure of the set *B*_{ϵ}(*γ*)∩*Γ*_{E} does depend on *γ*. A convenient way of allowing for this is to make *ϵ* depend on *γ*, in such a way that the microcanonical measure of *B*_{ϵ(γ)}(*γ*)∩*Γ*_{E} remains constant as *γ* moves along the trajectory. Using the notation *ϵ*_{1}:=*ϵ*(*γ*_{1}), *ϵ*_{2}:=*ϵ*(*γ*_{2}), this requirement reads *μ*^{(E)}(*B*_{ϵ1}(*γ*_{1})∩*Γ*_{E})=*μ*^{(E)}(*B*_{ϵ2}(*γ*_{2})∩*Γ*_{E}). The proposed ‘microcanonical’ analogue of equation (2.2) is thus

$$C_\epsilon(\gamma_1,\gamma_2) := \frac{\mu^{(E)}\bigl(\phi_{t_{12}}(B_{\epsilon_1}(\gamma_1)\cap\Gamma_E)\cap B_{\epsilon_2}(\gamma_2)\bigr)}{\mu^{(E)}\bigl(B_{\epsilon_j}(\gamma_j)\cap\Gamma_E\bigr)}, \tag{2.10}$$

where *j* can be either 1 or 2; it is immaterial which. Analogues of conjecture 2.1 and of the formulae (2.8) and (2.9) can also be written down, but they will not be given here because they are not needed later in the paper.

## 3. Example 1: Arnold’s ‘cat’ map

To illustrate some of the properties of the entropy definition (2.9), we apply it to two dynamical systems which are simple enough to permit some of the relevant quantities to be calculated exactly.

Both examples use discrete dynamics, so that the time variable *t* takes only integer values. In the first example, the phase space is a square *Q*_{L}:=[0,*L*)⊗[0,*L*) of arbitrary side length *L*, with opposite edges identified so as to make it a two-dimensional torus. The dynamical rule is the Arnold ‘cat’ map [7] obtained by multiplying the column vector [*p*,*q*]^{T} representing the phase point *γ* by the matrix

$$\mathbf{A} := \begin{pmatrix} 1 & 1\\ 1 & 2 \end{pmatrix} \tag{3.1}$$

and then projecting into the square *Q*_{L} using the projection **P**_{L} defined by **P**_{L}(*p*,*q*):=(*p* mod *L*, *q* mod *L*) in which, for example, *p* mod *L*:=*p*−*L*[*p*/*L*], where [*p*/*L*] denotes the largest integer ≤*p*/*L*. The formula for a single step of the evolution can be written as

$$\phi(\gamma) := \mathbf{P}_L(\mathbf{A}\gamma). \tag{3.2}$$

This dynamical system is reversible in the following sense: let the sequence (…*γ*_{0},*γ*_{1},*γ*_{2},…) be a trajectory, meaning that it satisfies *γ*_{t+1}=*ϕ*(*γ*_{t}) for all *t*∈**Z**; then the sequence (…*Tγ*_{2},*Tγ*_{1},*Tγ*_{0},…), where *T*(*p*,*q*):=(*q*,−*p*), is also a trajectory. (Proof: because the matrix

$$\mathbf{T} := \begin{pmatrix} 0 & 1\\ -1 & 0 \end{pmatrix} \tag{3.3}$$

representing the involution *T* satisfies **ATA**=**T**, it follows that if *γ*_{n+1}=*Aγ*_{n}, then *Tγ*_{n}=*ATγ*_{n+1}.)

This dynamical system has just one positive Lyapunov exponent, which is the logarithm of the larger of the two eigenvalues of the matrix **A**. This eigenvalue is *G*^{2}, where *G* denotes the golden ratio (1+√5)/2=1.618…. The normalized right eigenvectors of the matrix **A** are
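
$$\mathbf{u} := \frac{1}{\sqrt{1+G^2}}\begin{pmatrix} 1\\ G \end{pmatrix},\qquad \mathbf{v} := \frac{1}{\sqrt{1+G^2}}\begin{pmatrix} G\\ -1 \end{pmatrix}, \tag{3.4}$$

with eigenvalues *G*^{2} and *G*^{−2}, respectively.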
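The algebraic claims of this section can be checked numerically. The sketch below assumes that **A** is the symmetric matrix with rows (1, 1) and (1, 2), a choice made here because it has eigenvalue *G*^{2} along the direction (1, *G*), which matches the relation *p*_{0}=*p*_{2}−*q*_{2}/*G* used later in the text; the helper names `matmul` and `matvec` are illustrative only.

```python
import math

G = (1 + math.sqrt(5)) / 2            # golden ratio

# Assumed cat-map matrix (see the lead-in) and the matrix of T(p, q) = (q, -p).
A = [[1, 1], [1, 2]]
T = [[0, 1], [-1, 0]]


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matvec(X, w):
    """Product of a 2x2 matrix with a 2-vector."""
    return [X[i][0] * w[0] + X[i][1] * w[1] for i in range(2)]


norm = math.sqrt(1 + G * G)
u = [1 / norm, G / norm]              # expanding direction, eigenvalue G**2
v = [G / norm, -1 / norm]             # contracting direction, eigenvalue G**-2

ATA = matmul(A, matmul(T, A))         # should reproduce T
Au = matvec(A, u)
Av = matvec(A, v)
lyapunov = math.log(G ** 2)           # positive Lyapunov exponent, 2 log G
```

Under this assumption the product **ATA** reproduces **T** exactly in integer arithmetic, and `Au`, `Av` agree with *G*^{2}**u** and *G*^{−2}**v** to rounding error.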

To simplify the calculations, we replace the ball *B*_{ϵ}(*γ*) used in the preceding section by a square neighbourhood *N*_{ϵ}(*γ*). The edges of *N*_{ϵ}(*γ*) are taken to be parallel to the eigenvectors and their length is 2*ϵ*, so that *N*_{ϵ}(*γ*) is just big enough to include the ball *B*_{ϵ}(*γ*), but small enough to be included in *B*_{√2ϵ}(*γ*). Its corners are the four points *γ*+*ϵ*(±**u**±**v**). The matrix **A**^{t} converts *N*_{ϵ}(*γ*) into a rectangle with corners

$$G^{2t}(x\pm\epsilon)\,\mathbf{u} + G^{-2t}(y\pm\epsilon)\,\mathbf{v}, \tag{3.5}$$

where *x*,*y* are defined by *γ*=*x***u**+*y***v**. This rectangle will be called the ‘long rectangle’. The lengths of its sides are 2*ϵG*^{2t} in the **u** direction and 2*ϵG*^{−2t} in the **v** direction.

To apply the definition of dynamical self-correlation, equation (2.2), we need the overlap area of the square *N*_{ϵ}(*ϕ*^{t}*γ*) with the figure *ϕ*^{t}(*N*_{ϵ}(*γ*)) obtained by applying the projection **P**_{L} to the long rectangle. We shall take *t*:=*t*_{12} to be positive; the results for negative *t* can be obtained using the symmetry rule (2.4) if they should be required. A simple case arises when *ϕ*^{t}*γ* is not too close to the edges of *Q*_{L}, whereas *ϵ* and *t* are small enough for the long rectangle, whose length is 2*ϵG*^{2t}, to lie entirely inside *Q*_{L}. Then, the projection **P**_{L} leaves the long rectangle unaltered, and that rectangle’s intersection with *N*_{ϵ}(*ϕ*^{t}*γ*) is simply a smaller rectangle; the width of the intersection rectangle is the same as that of the long rectangle, namely 2*ϵG*^{−2t}, and the length of the intersection rectangle is 2*ϵ*, making its area 4*ϵ*^{2}*G*^{−2t}. Dividing by the area 4*ϵ*^{2} of the square neighbourhood, as in the definition (2.2), we find that, for this dynamical system,

$$C_\epsilon(\gamma_1,\gamma_2) = G^{-2t}. \tag{3.6}$$

This formula can also be written in terms of the positive Lyapunov exponent *λ*:=log *G*^{2}:

$$C_\epsilon(\gamma_1,\gamma_2) = e^{-\lambda t}. \tag{3.7}$$

This formula is also true, approximately, when the neighbourhoods are taken to be balls rather than squares.

The calculation of *C*_{ϵ}(*γ*_{1},*γ*_{2}) for the opposite case, in which *ϵG*^{2t}≫*L*, is more complicated, and will be given as a theorem.

### Theorem 3.1

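*For the discrete dynamical system defined by the mapping (3.2), the dynamical self-correlation is given by*

$$C_\epsilon(\gamma_1,\gamma_2) = \frac{4\epsilon^2}{L^2} + O\!\left(G^{-2t}\log\frac{\epsilon G^{2t}}{L}\right). \tag{3.8}$$

*The proof of this theorem depends on lemma 3.2.*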

### Lemma 3.2

Let *p*_{0} satisfy 0≤*p*_{0}<*L*, let *n*_{1},*n*_{2} be positive integers and consider the set of points

$$\Sigma := \mathbf{P}_L\{p_0 + jL/G\}_{j=-n_1,\dots,n_2}, \tag{3.9}$$

where **P**_{L}(*x*):=*x* mod *L*. Let *a*,*b* be numbers satisfying 0≤*a*<*b*<*L*. Then, the number of points from the set *Σ* that lie in the semi-open interval [*a*,*b*), which we denote by ♯{*Σ*∩[*a*,*b*)} satisfies

$$\sharp\{\Sigma\cap[a,b)\} = \frac{n(b-a)}{L} + O(\log n), \tag{3.10}$$

where *n*:=*n*_{1}+*n*_{2}+1.
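The content of the lemma can be explored numerically. The sketch below counts the points of *Σ* in [*a*,*b*) directly and subtracts the leading term *n*(*b*−*a*)/*L*; it assumes that the lemma’s estimate has the form *n*(*b*−*a*)/*L* plus a logarithmically bounded error, and the function names are hypothetical.

```python
import math

G = (1 + math.sqrt(5)) / 2            # golden ratio


def count_in_interval(p0, n1, n2, a, b, L=1.0):
    """Number of j in [-n1, n2] with (p0 + j*L/G) mod L inside [a, b)."""
    return sum(1 for j in range(-n1, n2 + 1)
               if a <= (p0 + j * L / G) % L < b)


def discrepancy(p0, n1, n2, a, b, L=1.0):
    """Actual count minus the leading-order estimate n*(b - a)/L."""
    n = n1 + n2 + 1
    return count_in_interval(p0, n1, n2, a, b, L) - n * (b - a) / L
```

For example, `discrepancy(0.3, 100, 132, 0.2, 0.7)` uses *n*=233, a Fibonacci number, while `discrepancy(0.3, 400, 599, 0.2, 0.7)` uses *n*=1000, which is not; in both cases the result should stay small compared with *n*.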

### Proof of lemma.

(This proof ends just after equation (3.23).) We consider first the case where *n*=*F*_{k} for some *k*, where *F*_{0}, *F*_{1}, *F*_{2}, … are the Fibonacci numbers, defined by the rule *F*_{k}:=*F*_{k−1}+*F*_{k−2} with *F*_{0}:=0, *F*_{1}:=1. In this case, the error term turns out to be *O*(1), a stronger result than the *O*(log *n*) in equation (3.10).

The successive continued-fraction approximants to 1/*G* are *F*_{k−1}/*F*_{k}. According to the theory of continued fractions, the error in using one of these approximants in place of 1/*G* has the upper bound

$$\left|\frac{1}{G} - \frac{F_{k-1}}{F_k}\right| \le \frac{1}{F_kF_{k+1}}. \tag{3.11}$$

Under this approximation, each point of *Σ*:=**P**_{L}{*p*_{0}+*jL*/*G*}_{j=−n1,…,n2} is approximated by a point from the set

$$\Sigma^{(k)} := \mathbf{P}_L\{p_0 + jLF_{k-1}/F_k\}_{j=-n_1,\dots,n_2}. \tag{3.12}$$

The set *Σ*^{(k)} comprises *n*=*F*_{k} points. As *F*_{k−1} and *F*_{k} have no common factor, the *F*_{k} different numbers −*n*_{1}*F*_{k−1}…*n*_{2}*F*_{k−1} all give different remainders on division by *F*_{k} (for if two were to give the same remainder, their difference would at the same time be a multiple of *F*_{k} and the product of *F*_{k−1} by a number less than *F*_{k}, which cannot be). Therefore, each of the *F*_{k} possible remainders occurs just once. Consequently, denoting the remainders by *r*, the set *Σ*^{(k)} can be written as

$$\Sigma^{(k)} = \mathbf{P}_L\{p_0 + rL/F_k\}_{r=0,\dots,F_k-1}. \tag{3.13}$$

Thus, the set *Σ*^{(k)} comprises *n*=*F*_{k} points, which are equally spaced along the interval (0,*L*), their separation being *L*/*n*.

By (3.11), the error in approximating the number *jL*/*G* by *jLF*_{k−1}/*F*_{k} is at most |*j*|*L*/(*F*_{k}*F*_{k+1}), which is bounded above by *L*/*F*_{k+1} because |*j*|≤*n*=*F*_{k}.

Suppose *k* is large enough to make 2*L*/*F*_{k+1}<*b*−*a* and consider the semi-open interval [*a*+*L*/*F*_{k+1},*b*−*L*/*F*_{k+1}), whose length is the positive number *b*−*a*−2*L*/*F*_{k+1}. Denote the number of points of *Σ*^{(k)} that lie in this interval by *m*_{k}. Each of these points lies within a distance *L*/*F*_{k+1} of the corresponding point of *Σ*; therefore, the corresponding points of *Σ* all lie within the larger interval [*a*,*b*) and it follows that ♯{*Σ*∩[*a*,*b*)}≥*m*_{k}. Moreover, because the separation of the points of *Σ*^{(k)} is *L*/*n*, the length of an interval occupied by *m*_{k} of them is at most (*m*_{k}+1)*L*/*n*, and so *m*_{k} satisfies (*m*_{k}+1)*L*/*n*≥*b*−*a*−2*L*/*F*_{k+1}. Putting together these two inequalities, and using *n*=*F*_{k}≤*F*_{k+1}, we obtain

$$\sharp\{\Sigma\cap[a,b)\} \ge \frac{n(b-a)}{L} - \frac{2n}{F_{k+1}} - 1 \ge \frac{n(b-a)}{L} - 3. \tag{3.14}$$

In a similar way, it can be shown that ♯{*Σ*∩[*a*,*b*)} is bounded above by the number of points of *Σ*^{(k)} in the interval [*a*−*L*/*F*_{k},*b*+*L*/*F*_{k}), which in turn is bounded above by (*b*−*a*+2*L*/*F*_{k})*n*/*L*+1, so that

$$\sharp\{\Sigma\cap[a,b)\} \le \frac{n(b-a)}{L} + 3. \tag{3.15}$$

The upper and lower bounds (3.15) and (3.14) taken together imply (3.10), and the proof of the lemma for the case where *n* is a Fibonacci number is complete.

For the case where *n* is not a Fibonacci number, choose *k* to be the largest integer for which

$$F_k < n. \tag{3.16}$$

Let *k*′ be the largest integer for which *n*−*F*_{k}≥*F*_{k′}; if *n*−*F*_{k}≠*F*_{k′}, then let *k*′′ be the largest integer for which *n*−*F*_{k}−*F*_{k′}≥*F*_{k′′} and so on. In this way, we can express *n* as a finite sum of decreasing Fibonacci numbers:

$$n = F_k + F_{k'} + F_{k''} + \cdots. \tag{3.17}$$
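The greedy procedure just described (take the largest Fibonacci number not exceeding what is left, subtract it, and repeat) can be sketched as follows. The function name `fibonacci_decomposition` is an illustrative choice, not notation from the paper.

```python
def fibonacci_decomposition(n):
    """Greedily write a positive integer n as a sum of decreasing Fibonacci numbers."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    fibs = [1, 2]                      # F_2 and F_3; the duplicate F_1 = 1 is skipped
    while fibs[-1] + fibs[-2] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    parts = []
    remainder = n
    for f in reversed(fibs):           # largest Fibonacci number first
        if f <= remainder:
            parts.append(f)
            remainder -= f
    return parts
```

For instance, `fibonacci_decomposition(100)` returns `[89, 8, 3]`. The number of terms grows only logarithmically with *n*, which is the property the proof relies on.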

Corresponding to the decomposition (3.17) of the number *n*, we can decompose the set *Σ* defined in (3.9) into a finite number of subsets, the first comprising *F*_{k} points, the second *F*_{k′} points and so on:
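
$$\Sigma = \Sigma_k \cup \Sigma'_k \cup \Sigma''_k \cup \cdots, \tag{3.18}$$

where

$$\Sigma_k := \mathbf{P}_L\{p_0 + jL/G\}_{j=-n_1,\dots,-n_1+F_k-1},\qquad \Sigma'_k := \mathbf{P}_L\{p_0 + jL/G\}_{j=-n_1+F_k,\dots,-n_1+F_k+F_{k'}-1}, \tag{3.19}$$

and so on. Because of the irrationality of *G*, these sets are disjoint.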

In the set *Σ*_{k}, we follow the procedure used in the first part of this proof, approximating 1/*G* by *F*_{k−1}/*F*_{k} and arriving at the result

$$\sharp\{\Sigma_k\cap[a,b)\} = \frac{F_k(b-a)}{L} + O(1). \tag{3.20}$$

For the set *Σ*′_{k}, we follow an analogous procedure, but approximating 1/*G* by *F*_{k′−1}/*F*_{k′} this time. This gives the result

$$\sharp\{\Sigma'_k\cap[a,b)\} = \frac{F_{k'}(b-a)}{L} + O(1). \tag{3.21}$$

Similarly, approximating 1/*G* by *F*_{k′′−1}/*F*_{k′′} in *Σ*′′_{k}, we obtain

$$\sharp\{\Sigma''_k\cap[a,b)\} = \frac{F_{k''}(b-a)}{L} + O(1), \tag{3.22}$$

and so on. Adding up the equations (3.20)–(3.22), etc., and using the formulae (3.17) and (3.18) together with the fact that the sets *Σ*_{k},*Σ*′_{k},*Σ*′′_{k},… are disjoint, we obtain
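
$$\sharp\{\Sigma\cap[a,b)\} = \frac{n(b-a)}{L} + O(k) = \frac{n(b-a)}{L} + O(\log n), \tag{3.23}$$

because the number of terms in the finite sum (3.17) is bounded above by *k*, which is *O*(log *n*) by Binet’s formula *F*_{k}=(*G*^{k}−(−*G*)^{−k})/√5. This completes the proof of the lemma.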

### Proof of theorem.

We want to evaluate

$$C_\epsilon(\gamma_1,\gamma_2) = \frac{\mu\bigl(\phi_t N_\epsilon(\gamma_1)\cap N_\epsilon(\gamma_2)\bigr)}{\mu(N_\epsilon)}, \tag{3.24}$$

as defined in equation (2.2), using phase points *γ*_{1}=*x***u**+*y***v** and *γ*_{2}=**P**_{L}(*xG*^{2t}**u**+*yG*^{−2t}**v**), where *t*:=*t*_{2}−*t*_{1}, and the neighbourhoods taken to be squares with sides of length 2*ϵ* parallel to the eigenvectors **u**,**v**. For the sake of simplicity, we assume that the distances from *γ*_{1} and *γ*_{2} to the edges of *Q*_{L} are greater than *ϵ*, so that both their neighbourhoods are inside *Q*_{L}. Then, the corners of *N*_{ϵ}(*γ*_{1}) are the phase points (*x*±*ϵ*)**u**+(*y*±*ϵ*)**v** and those of *N*_{ϵ}(*γ*_{2}) are **P**_{L}((*xG*^{2t}±*ϵ*)**u**+(*yG*^{−2t}±*ϵ*)**v**). The region *ϕ*_{t}*N*_{ϵ}(*γ*_{1}) is obtained by applying the projection operator **P**_{L} to the ‘long rectangle’ with corners *G*^{2t}(*x*±*ϵ*)**u**+*G*^{−2t}(*y*±*ϵ*)**v**. The centre of the long rectangle is the point *xG*^{2t}**u**+*yG*^{−2t}**v**, whose projection is *γ*_{2}, and (for positive *t*) its length is 2*G*^{2t}*ϵ* in the **u** direction and its width is 2*G*^{−2t}*ϵ* in the **v** direction.

The midline of the long rectangle is the line in the **u** direction joining the points *G*^{2t}(*x*±*ϵ*)**u**+*G*^{−2t}*y***v**. The length of this line is 2*G*^{2t}*ϵ* in the **u** direction. The line, projected if necessary, meets the (horizontal) *p*-axis at a point (*p*_{0},0), where *p*_{0}=*p*_{2}−*q*_{2}/*G* and (*p*_{2},*q*_{2}) are the Cartesian coordinates of *γ*_{2}. Each intersection of this line with a horizontal line *q*=integer×*L* will map, under the projection operator **P**_{L}, to an intersection with the line *q*=0, i.e. the *p*-axis. The number of such intersections (the *n* of lemma 3.2) is, with accuracy *O*(1), equal to 1/*L* times the length of the projection of the midline of the long rectangle onto the *q* axis, i.e. (1/*L*)2*G*^{2t}*ϵG*(1+*G*^{2})^{−1/2}+*O*(1) (see equation (3.4)). Using this expression in place of the number *n* in the lemma (equation (3.10)), we find that the number of intersections of the image of the midline of the long rectangle with an arbitrary interval of length *b*−*a* on the *p*-axis is
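
$$\frac{b-a}{L}\cdot\frac{2\epsilon G^{2t+1}}{L\sqrt{1+G^2}} + O\!\left(\log\frac{\epsilon G^{2t}}{L}\right). \tag{3.25}$$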

For the interval [*a*,*b*), we choose the part of the *p*-axis through which all lines in the **u** direction through *N*_{ϵ}(*γ*_{2}) (extended if necessary) pass. The length of this interval is (*b*−*a*)=2*ϵ*(1+*G*^{2})^{1/2}/*G*; so the formula (3.25) tells us that the number of intersections of *N*_{ϵ}(*γ*_{2}) with the image (under **P**_{L}) of the midline of the long rectangle is

$$\frac{4\epsilon^2G^{2t}}{L^2} + O\!\left(\log\frac{\epsilon G^{2t}}{L}\right). \tag{3.26}$$

Each of these intersections is a line segment, nearly all having length 2*ϵ*. Each of these line segments is the midline of the intersection of *N*_{ϵ}(*γ*_{2}) with part of the image of the long rectangle; and each of these intersection rectangles, with at most two exceptions, has length 2*ϵ*, width 2*ϵG*^{−2t} and hence area 2*ϵ*×2*ϵG*^{−2t}=4*ϵ*^{2}*G*^{−2t}. Multiplying by the number of such intersection rectangles, given in equation (3.26), we find the total intersection area of *N*_{ϵ}(*γ*_{2}) with the projection of the long rectangle is
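
$$\frac{16\epsilon^4}{L^2} + O\!\left(\epsilon^2G^{-2t}\log\frac{\epsilon G^{2t}}{L}\right). \tag{3.27}$$
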
Using this result to evaluate the numerator of equation (3.24), together with the obvious formula 4*ϵ*^{2} for the denominator, we conclude that
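
$$C_\epsilon(\gamma_1,\gamma_2) = \frac{4\epsilon^2}{L^2} + O\!\left(G^{-2t}\log\frac{\epsilon G^{2t}}{L}\right). \tag{3.28}$$
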
This completes the proof of the theorem.
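Both regimes of the dynamical self-correlation can be illustrated by a Monte Carlo experiment. Because the map preserves area, *C*_{ϵ}(*γ*_{1},*γ*_{2}) equals the fraction of points of *N*_{ϵ}(*γ*_{1}) that land in *N*_{ϵ}(*γ*_{2}) after *t* steps. The sketch below is a minimal illustration, not the paper’s calculation: it assumes the cat-map matrix with rows (1, 1) and (1, 2), eigenvectors proportional to (1, *G*) and (*G*, −1), and floating-point iteration; all function names are hypothetical.

```python
import math
import random

G = (1 + math.sqrt(5)) / 2
NORM = math.sqrt(1 + G * G)
U = (1 / NORM, G / NORM)       # expanding eigenvector (assumed orientation)
V = (G / NORM, -1 / NORM)      # contracting eigenvector


def cat_map(p, q, L):
    """One step of the cat map on the torus [0, L) x [0, L), assuming A = [[1, 1], [1, 2]]."""
    return (p + q) % L, (p + 2 * q) % L


def iterate(p, q, t, L):
    """Apply the cat map t times."""
    for _ in range(t):
        p, q = cat_map(p, q, L)
    return p, q


def wrap(d, L):
    """Shift a coordinate difference into [-L/2, L/2)."""
    return (d + L / 2) % L - L / 2


def self_correlation(gamma, t, eps, L=1.0, samples=100_000, seed=0):
    """Monte Carlo estimate of C_eps(gamma, phi^t gamma) with eigenvector-aligned squares."""
    rng = random.Random(seed)
    p2, q2 = iterate(gamma[0], gamma[1], t, L)
    hits = 0
    for _ in range(samples):
        # Uniform sample from the square N_eps(gamma), in (u, v) coordinates.
        a = rng.uniform(-eps, eps)
        b = rng.uniform(-eps, eps)
        p = (gamma[0] + a * U[0] + b * V[0]) % L
        q = (gamma[1] + a * U[1] + b * V[1]) % L
        p, q = iterate(p, q, t, L)
        # Displacement from the image of the centre, measured on the torus.
        dp, dq = wrap(p - p2, L), wrap(q - q2, L)
        along_u = dp * U[0] + dq * U[1]
        along_v = dp * V[0] + dq * V[1]
        if abs(along_u) < eps and abs(along_v) < eps:
            hits += 1
    return hits / samples
```

With *L*=1, a call such as `self_correlation((0.3, 0.4), 2, 0.01)` should be close to *G*^{−4}≈0.146, the short-time value, while `self_correlation((0.3, 0.4), 12, 0.05)` should be close to 4*ϵ*^{2}/*L*^{2}=0.01, the long-time plateau; both come with statistical scatter that shrinks as `samples` grows.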

## 4. Example 2: a ‘cat and kitten’ map

Our second example is a dynamical system whose phase space comprises two squares on two different copies of **R**^{2}. One of the squares is *Q*:=[0,*L*)⊗[0,*L*); the other is *Q*′:=[0,*L*′)⊗[0,*L*′), where *L* and *L*′ are positive integers. In the interesting case, *L*′ is chosen much larger than *L*. For each square, the opposite edges are identified so as to make it a two-dimensional torus.

The rule specifying the motion of the phase point is that, provided the phase point is outside a particular small region in *Q*∩*Q*′, which will be called the ‘window’, it follows the Arnold dynamical rule (3.2) appropriate to the torus it is in; but if the phase point lands on the window, then it jumps to the other torus before continuing. We shall take the window to be a square neighbourhood of some point *ω*∈*Q*∩*Q*′, namely *N*_{δ}(*ω*), where *δ* is a small constant. The point *ω* should be chosen so that *N*_{δ}(*ω*)∩*ϕ*(*N*_{δ}(*ω*))=∅, i.e. the phase point does not immediately jump back again at the next step. The choice would be appropriate, for example.

For this dynamical system, the phase point is labelled by three variables: two real variables *p* and *q*, plus an extra variable *Z* which takes only two values: *L* if the phase point is in the torus *Q*, and *L*′ if it is in *Q*′. The analogue of the formula (3.2) is now
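
$$(\gamma_{t+1}, Z_{t+1}) = \begin{cases} \bigl(\mathbf{P}_{Z_t}(\mathbf{A}\gamma_t),\; Z_t\bigr) & \text{if } \gamma_t\notin N_\delta(\omega),\\[4pt] \bigl(\mathbf{P}_{\bar Z_t}(\mathbf{A}\gamma_t),\; \bar Z_t\bigr) & \text{if } \gamma_t\in N_\delta(\omega), \end{cases} \qquad \bar Z_t := L+L'-Z_t. \tag{4.1}$$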

Without going into any rigorous analysis, it is plausible that, over a very long time interval, all or almost all trajectories will spend most of their time in the larger torus, but will make occasional excursions into the smaller torus. It is a reasonable conjecture that the probability per timestep of hitting the window and thereby moving to the other torus is the small number *δ*^{2}/*L*^{2} when the phase point is in *Q*, and the even smaller number *δ*^{2}/*L*′^{2} when the phase point is in *Q*′. Thus, we can estimate the duration of each sojourn in *Q* to be of order *L*^{2}/*δ*^{2}, while the duration of each sojourn in *Q*′ is of order *L*^{′2}/*δ*^{2}, a much larger number.
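The sojourn-time estimate can be probed with a small simulation. The sketch below is only a rough numerical experiment under stated assumptions: each torus is treated separately, the cat-map matrix is taken to have rows (1, 1) and (1, 2), the window is an axis-aligned square of half-width *δ* centred at the hypothetical point *ω*=(0.5, 0.5), and the iteration uses floating-point arithmetic. Since such a window has area 4*δ*^{2}, the expected number of steps before the window is hit is roughly *Z*^{2}/(4*δ*^{2}) for a torus of side *Z*, which is of the same order as the estimate in the text.

```python
import random


def cat_step(p, q, side):
    """One cat-map step on a torus of the given side, assuming A = [[1, 1], [1, 2]]."""
    return (p + q) % side, (p + 2 * q) % side


def in_window(p, q, omega, delta):
    """Axis-aligned square window of half-width delta centred at omega (an assumed shape)."""
    return abs(p - omega[0]) < delta and abs(q - omega[1]) < delta


def mean_sojourn(side, omega=(0.5, 0.5), delta=0.1, trials=400, seed=1,
                 max_steps=100_000):
    """Average number of steps before a randomly started point is found in the window."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        p, q = rng.uniform(0, side), rng.uniform(0, side)
        steps = 0
        # max_steps guards against a start that never reaches the window.
        while not in_window(p, q, omega, delta) and steps < max_steps:
            p, q = cat_step(p, q, side)
            steps += 1
        total += steps
    return total / trials
```

Comparing `mean_sojourn(1.0)` with `mean_sojourn(3.0)` gives an impression of how the waiting time scales with the area of the torus: the second should be several times larger than the first.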

Carrying this type of reasoning a bit further, we can obtain conjectural information about the dynamical self-correlations. Consider first the case where both *γ*_{1} and *γ*_{2} are in the larger torus *Q*′. Then, because most trajectories spend only a tiny fraction of their time in the smaller torus *Q*, the dynamical self-correlations will be very close to what they would be if the smaller torus did not exist at all; so, from (3.6) and (3.8), we may expect that (with *t*>0 as usual)
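
$$C_\epsilon(\gamma_1,\gamma_2) \approx \begin{cases} G^{-2t} & \text{if } G^{2t}\ll L'/\epsilon,\\[4pt] 4\epsilon^2/L'^2 & \text{if } G^{2t}\gg L'/\epsilon. \end{cases} \tag{4.2}$$
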
If, on the other hand, *γ*_{1} and *γ*_{2} are both in the smaller torus, then things are more complicated, because the fraction of the relevant trajectories that visit the other torus is significantly larger. Estimating the probability per timestep of escaping from the smaller torus as *δ*^{2}/*L*^{2}, the condition for the analogue of (4.2) to hold is *tδ*^{2}/*L*^{2}≪1, so that we may expect
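
$$C_\epsilon(\gamma_1,\gamma_2) \approx \begin{cases} G^{-2t} & \text{if } G^{2t}\ll L/\epsilon,\\[4pt] 4\epsilon^2/L^2 & \text{if } G^{2t}\gg L/\epsilon \text{ and } t\ll L^2/\delta^2. \end{cases} \tag{4.3}$$

In place of 4*ϵ*^{2}/*L*^{2} in the last line, the formula (4*ϵ*^{2}/*L*^{2})*e*^{−*tδ*^{2}/*L*^{2}} would probably hold over a larger time range.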

The reader may find it interesting to work out what happens in the third case, for which the two ends of the trajectory segment are in different toruses.

Using (4.2) and (4.3), we can now evaluate the entropies of various trajectory segments as defined by the formula (2.9) with *G*^{2t}≫*L*/*ϵ*. The results are
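
$$S_\epsilon(\gamma_1,\gamma_2) = k\log\frac{4\epsilon^2}{C_\epsilon(\gamma_1,\gamma_2)} \approx \begin{cases} k\log L'^2 & \text{if } \gamma_1,\gamma_2\in Q',\\[4pt] k\log L^2 & \text{if } \gamma_1,\gamma_2\in Q \text{ and } t\ll L^2/\delta^2. \end{cases} \tag{4.4}$$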

From the first of these formulae, equation (4.2), the entropy when *γ*_{1} and *γ*_{2} are both in the large torus is

$$S_\epsilon(\gamma_1,\gamma_2) \approx 2k\log L'. \tag{4.5}$$

According to equation (4.2), this expression should be equal to the equilibrium entropy, and (as *L*′≫*L*), it is indeed very close to the exact equilibrium entropy, which according to Boltzmann’s formula (1.1) is *k* log(*L*^{2}+*L*′^{2}).

From the second formula, equation (4.3), the (non-equilibrium) entropy for phase points in the smaller torus, appropriate for the time scale *t*≪*L*^{2}/*δ*^{2} during which the non-equilibrium state lasts, is

$$S_\epsilon(\gamma_1,\gamma_2) \approx 2k\log L. \tag{4.6}$$
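The arithmetic behind these two entropy values can be spelled out. The lines below are a sketch under two assumptions: that definition (2.9) reads *S*_{ϵ}=*k* log(*μ*(*N*_{ϵ})/*C*_{ϵ}) with *μ*(*N*_{ϵ})=4*ϵ*^{2} for square neighbourhoods, and that the long-time plateau values of *C*_{ϵ} are 4*ϵ*^{2}/*L*′^{2} in the larger torus and 4*ϵ*^{2}/*L*^{2} in the smaller one.

```latex
\begin{align*}
S_\epsilon\big|_{Q'} &\approx k\log\frac{4\epsilon^2}{4\epsilon^2/L'^2} = k\log L'^2 = 2k\log L',\\
S_\epsilon\big|_{Q}  &\approx k\log\frac{4\epsilon^2}{4\epsilon^2/L^2}  = k\log L^2  = 2k\log L,\\
\Delta S &:= S_\epsilon\big|_{Q'} - S_\epsilon\big|_{Q} \approx 2k\log\frac{L'}{L}.
\end{align*}
```

As a numerical illustration, a ratio *L*′/*L*=1000 would give an entropy difference of about 13.8*k*; note that *ϵ* cancels in all three expressions.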

There is nothing surprising about the entropy formula (4.6). The expression could have been obtained much more quickly by the ad hoc procedure of making a small change in the dynamics, namely closing the window completely so that the phase space splits into two mutually inaccessible parts, and then applying Boltzmann’s principle to whichever part the phase point happens to be in. What the calculations illustrate, however, is that the formula (2.9) provides a self-contained definition of non-equilibrium entropy which requires no ad hoc procedures and no additional input about macroscopic descriptions. All that is necessary is to get the right time scale: the duration of a trajectory which is long enough for the initial ‘transient’ exponential decay to have died out (*G*^{2t}≫*L*/*ϵ*), but not so long that the slow approach to equilibrium can have a significant effect (*t*≪*L*^{2}/*δ*^{2}).

An interesting feature of this dynamical system is that it can behave irreversibly, even though it does not satisfy the usual criteria that an irreversible system is supposed to obey. Irreversibility is generally held to be a property of systems that are large in the sense of comprising a large number of particles (and therefore having many degrees of freedom), and to reveal itself in the time evolution of macroscopic variables, such as the local density, which are defined in terms of averages over a large number of particles. But the two-torus system considered here has only two degrees of freedom, and there are no particles over which to define macroscopic variables as averages. Nevertheless, this dynamical system has the ability to behave irreversibly. If it is started at a randomly chosen point in the smaller torus *Q*, then the chances are that, after a time of order *L*^{2}/*δ*^{2}, it will emerge from that torus and that it will then remain in the larger torus for a much longer time, of order *L*^{′2}/*δ*^{2}, before its next visit to the smaller torus. By choosing *L*′ large enough, we can make the return time *L*^{′2}/*δ*^{2} as large as we like—larger than the age of the Universe, if desired—making the original transition from torus *Q* to torus *Q*′ almost literally irreversible.

When the ‘irreversible’ transition from the smaller to the larger torus takes place, the entropy increases by the amount 2*k* log(*L*′/*L*), which can be arbitrarily large (in comparison with *k*) depending on how large we choose to make the ratio *L*′/*L*. Just as in thermodynamics, the irreversible process is accompanied by a large entropy increase.

## 5. Conclusion

This paper gives a method for defining an entropy associated with a segment of trajectory in a chaotic dynamical system. The definition is purely dynamical (‘microscopic’); it does not depend on any macroscopic or observational description. The definition depends on two parameters: *ϵ*, the size of the neighbourhood; and *t*_{12}, the length of the trajectory segment. In order to get a useful result, both have to be chosen sensibly—see the conditions in equation (4.3).

One way to carry this work forward would be to look for a connection between the entropy defined here and the one in Boltzmann’s *H* theorem. It might also be possible to prove entropy increase results: for example, it may be that if one trajectory segment is a subset of another, then the larger segment must have the larger entropy. Another topic that could be investigated is the general connection between the dynamical self-correlation at small times and the Lyapunov exponents, illustrated by the formula (3.7) for the Arnold map.

## Acknowledgements

I am grateful to John Ball for stimulating discussions which initiated this work, and to the organizers and sponsors of the international scientific seminar ‘Entropy and Convexity for Nonlinear Partial Differential Equations’ co-hosted by the Oxford Centre for Nonlinear PDE and the Royal Society of London, held on 16 and 17 June 2011 at the Kavli Royal Society International Centre in Buckinghamshire, at which these discussions took place. I am also grateful to Joel Lebowitz for discussions about irreversibility over many years, and to the referees for their suggestions for improving this paper.

## Footnotes

One contribution of 11 to a Theme Issue ‘Entropy and convexity for nonlinear partial differential equations’.

^{1} Normally, the lower bound can be strengthened slightly to 0<*C*_{ϵ}(*γ*_{1},*γ*_{2}), because the interior of *ϕ*_{t12}*B*_{ϵ}(*γ*_{1})∩*B*_{ϵ}(*γ*_{2}), being the intersection of two open sets, is itself open and therefore has positive measure.

- © 2013 The Author(s) Published by the Royal Society. All rights reserved.