## Abstract

Is the characterization of biological systems as *complex systems* in the mathematical sense a fruitful assertion? In this paper we argue in the affirmative, although we obviously do not attempt to confront all the issues raised by this question. We use the fly's visual system as an example and analyse, from this point of view, our experimental recordings from one of its neurons. We find that the motion-sensitive ‘H1’ neuron, which converts incoming signals into a sequence of identical pulses or ‘spikes’, encodes the information contained in the stimulus into an alphabet composed of a few letters. This encoding occurs on multilayered sets, one of the features attributed to *complex systems*. The conversion of intervals between consecutive occurrences of spikes into an alphabet requires us to construct a *generating partition*. This entails a one-to-one correspondence between sequences of spike intervals and words written in the alphabet. The *alphabet dynamics* is multifractal both with and without stimulus, though the multifractality increases with the stimulus entropy. This is in sharp contrast to models generating independent spike intervals, such as models using Poisson statistics, whose dynamics is monofractal. We embed the support of the probability measure, which describes the distribution of words written in this alphabet, in a two-dimensional space, whose topology can be reproduced by an M-shaped map. This map has positive Lyapunov exponents, indicating a chaotic-like encoding.

## 1. The fly's optical tract as a complex system

The neural system of any living creature, the fly in particular, is certainly a complex system. In fact, it consists of individual neurons^{1} joined into modules, which interact with each other and with the external world. This multiple-component system evolves and adapts as a consequence of these internal and external dynamical interactions. The interacting components lead to hierarchical structures with different causations at different levels. In an experiment we usually observe causation at only a few levels, in our case at the ‘last’ neuron. There are various mathematical characterizations of a *complex system* (Grassberger 1986; Poon & Grebogi 1995; Bialek *et al.* 2001). We do not attempt to satisfy all the criteria spelt out there, but suggest that the encoding of external stimuli in a part of the fly's optical tract takes place in *multilayered sets* (Baptista *et al.* 2006).

Roughly, in the optical system of the fly, the incoming information is processed first by the photoreceptors and then by various modules: the lamina, medulla, lobula and lobula plate. From here the information goes to the sensory system. We record from one neuron, the ‘H1’, located in the lobula plate. H1 encodes the incoming signals into a sequence of identical pulses, termed action potentials or spikes. If such a spiking neuron is to encode *relevant* features of the stimulus, it has to fire precisely timed spikes depending on the presynaptic input generated by the instantaneous stimuli the organism receives. Our experimental results strongly suggest that this encoding takes place on multilayered sets, a characteristic of the complex nature of this system. These sets are defined in terms of symbolic sequences of letters, selected from an alphabet according to the size of the spike time intervals. Furthermore, we present strong evidence that the underlying dynamics (UD) on each layer is multifractal, thus allowing for a chaotic-like type of encoding. Already in the no-stimulus regime, owing to the multifractal dynamics, the UD is highly flexible, ready to adapt to the statistical properties of the stimulus (Fairhall *et al.* 2001) to be encoded. Then, in the presence of the stimulus, finely tuned spike times ride on a set whose UD now has an increased multifractality, shaped by the properties of the stimulus. We also show that a two-letter alphabet suffices for the UD to represent the no-stimulus dynamics of spontaneous firing, while a four-letter alphabet is necessary for the UD to encode the information contained in the stimulus.

## 2. Multifractality

We study this underlying dynamical regime in a prominent example of a spiking neuron: the H1 neuron, located in the lobula plate of the fly, *Chrysomya megacephala*. This area is several layers back from each compound eye of the fly. It includes large motion-detector neurons with wide receptive fields and pronounced direction selectivity. The two H1 neurons, one for each lobula plate, are sensitive to image motion around a vertical axis (Hausen 1981). Their response is excited by horizontal back-to-front motions, generated by rotations around a vertical axis, and suppressed by motions in the opposite direction. One of these neurons was stimulated by a computer-controlled random vertical-bar pattern moving with horizontal velocity *v*(*t*). Here we discretize time in bins of 2 ms, which is roughly the refractory period of the H1 neuron. The fly therefore saw a new frame on the monitor every δ*t*=2 ms,^{2} with the pattern's change in position given by δ*x*(*t*)=*v*(*t*)δ*t*.

In order to be minimally representative, we selected four types of stimuli *v*(*t*): *S*_{0}, no stimulus; *S*_{1}, constant velocity; *S*_{2}, a stimulus generated by an Ornstein–Uhlenbeck (Uhlenbeck & Ornstein 1930) process with correlation time *τ*_{c}=20 ms;^{3} *S*_{3}, an uncorrelated (*τ*_{c}=0) Gaussian stimulus.

In each experiment, such a spiking neuron generates a sequence of spike times *t*_{i}, *i*=1, 2, 3, …, *N*_{s}, where *N*_{s}∼10^{5} in our experiments. All the information received is compressed into the sequence of intervals Δ*t*_{i}=*t*_{i+1}−*t*_{i}. In order to extract the UD, we classify all the possible intervals into a discrete set of cardinality *N*, depending on their size: letter ‘0’ if Δ*t*_{i}<*d*_{1}, letter ‘1’ if *d*_{1}≤Δ*t*_{i}<*d*_{2}, etc., where the *d*_{j}, *j*=1, 2, …, *N*−1, are a set of *dividers*. Evidently, if we make this set large enough, so that *d*_{j}−*d*_{j−1} approaches 2 ms, we recover the original intervals. The question is: can we choose a reasonably small set of dividers without compromising the UD of the original intervals Δ*t*_{i}? In other words, *what is the size of the alphabet the fly's H1 neuron uses to speak postsynaptically?*
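For illustration, this conversion of intervals into letters can be sketched as follows (a minimal sketch in Python; the sample intervals are invented, and only the divider 23 is taken from table 1):

```python
import numpy as np

def intervals_to_letters(dt, dividers):
    """Map each interspike interval (in 2 ms bins) to a letter 0..N-1.

    An interval below dividers[0] becomes letter 0, one between
    dividers[0] and dividers[1] becomes letter 1, and so on.
    """
    return np.digitize(dt, dividers)

# Toy intervals (in bins) and the S_0 binary divider d_1 = 23 of table 1.
dt = np.array([5, 30, 22, 23, 80])
letters = intervals_to_letters(dt, [23])
```

With one divider this yields a two-letter alphabet; passing *N*−1 dividers yields *N* letters.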

We choose our dividers so as to maximize Shannon's entropy (Strong *et al.* 1998). Equivalent results are obtained, although with larger errors, if we instead maximize the information the spike train provides about the stimulus. For a given set of *N*−1 dividers, we study the dynamics of our sequence of spike intervals Δ*t*_{i} converted into a sequence of words of length *L* containing *N* letters. Note that an *L*-letter word may comprise a very long time interval. Now count up all the different words showing up in an experiment, get their probabilities *P*_{k} and compute the entropy per letter of a word of length *L* with *N* possible letters of the experimental sequence,

*H* = −(1/*L*)∑_{k}*P*_{k} log_{2} *P*_{k}. (2.1)
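A minimal sketch of the entropy-per-letter estimate of equation (2.1), assuming the intervals have already been converted into a letter sequence (word probabilities are estimated by simple counting, as in the text):

```python
import numpy as np
from collections import Counter

def entropy_per_letter(letters, L):
    """Shannon entropy per letter (bits) over words of length L,
    equation (2.1): H = -(1/L) * sum_k P_k * log2(P_k)."""
    words = [tuple(letters[i:i + L]) for i in range(len(letters) - L + 1)]
    p = np.array(list(Counter(words).values()), dtype=float)
    p /= p.sum()
    return -(p * np.log2(p)).sum() / L

# A long fair binary sequence should approach 1 bit per letter.
rng = np.random.default_rng(0)
h = entropy_per_letter(rng.integers(0, 2, 20000), L=3)
```

Maximizing this quantity over the divider positions, at fixed *N* and *L*, selects the dividers of figure 1*a*.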

Figure 1*a* shows the entropy for *L*=10 with only one divider for the four different datasets. The entropy shows a clear maximum as a function of the divider position. We now construct a uniquely defined *generating* (Plumecoq & Lefranc 2000*a*,*b*) partition of our time intervals. *Generating* means that our alphabet has to allow a one-to-one encoding of all possible sequences of intervals. It will turn out to be the partition belonging to the sequence starting with *N*=2. This is a highly non-trivial endeavour: even for such well-studied systems as the Hénon map (Grassberger & Kantz 1985), this question has not been settled. In order to search for such a generating partition, we require the following consistency condition to be satisfied: *in order to be generating, all the dividers of a particular layer J−1 must already be contained among those of the next layer J*. If this were not true, conflicting information would result as we go from layer *J*−1 to layer *J*. Dividers are extracted from the data always using *L*=1. We will see that, within errors, only one partition satisfies this requirement. We now construct table 1 in two steps. Firstly, maximizing the entropy without constraints for the first members of the sequence,^{4} we get the respective dividers in table 1. Thus, for experiment *S*_{0}, we found the dividers: 23; 9, 23 and 49; and 6, 14, 23, 34, 49 and 71. Secondly, we turn to the remaining values of *N*, searching for dividers satisfying our consistency condition. This is indeed possible if we fix, for example, the bold-faced dividers and perform a constrained maximization to obtain the remaining ones. We obtain the new *S*_{0} dividers: 9, 71, 34 and 14. Note that the divider 9 arose in the sequence starting with *N*=2 only when we used four letters. The corresponding entropy per letter *H*_{2}(*N*=3, 5, 6, 7, …) equals to within 5% the entropy obtained with an unconstrained maximization, which yields completely different dividers. We conclude that the sequence *H*_{2}(*N*) does indeed satisfy our consistency condition to be a generating partition.
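The consistency condition amounts to a nesting requirement on the divider sets. A sketch with hypothetical divider sets (the numbers below are illustrative, not those of table 1):

```python
def consistent_layers(layers):
    """Generating-partition consistency check: every divider of a coarser
    layer must reappear in each finer layer (layers ordered coarse to fine)."""
    return all(set(coarse) <= set(fine) for coarse, fine in zip(layers, layers[1:]))

# Hypothetical divider sets (in 2 ms bins) for increasing alphabet sizes.
nested = [[23], [9, 23, 49], [6, 9, 14, 23, 34, 49, 71]]
broken = [[23], [9, 23, 49], [6, 14, 22, 34, 49, 71]]  # 9 and 23 lost
```

In the constrained maximization of the text, this nesting is enforced by fixing the coarser layer's dividers and optimizing only the new ones.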
Is it unique? If we perform the same procedure for other partitions, e.g. the sequence starting with *N*=3, the first step yields the same values for the unconstrained entropy maxima. In the second step, to obtain the remaining dividers, the constrained maximizations yield entropies smaller by as much as 10% compared with the unconstrained ones (as shown in figure 2). Here, *H*_{3}(*N*=2) means: (i) maximize *H*_{3}(*N*=3) to get [*d*_{0}, *d*_{1}] and (ii) maximize *H*_{3}(*N*=2), using either *d*_{0} or *d*_{1}, whichever gives the larger entropy. The only exception here is one point corresponding to *N*=3. The same is also true for other partitions starting with the prime numbers 5 and 7, thus indicating that, within errors, the binary partition is the only generating one.

Figure 1*c* shows that the number of letters needed to maximize the entropy decreases with increasing word length *L*, owing to correlations manifesting themselves and to undersampling effects. Since these effects limit the size of the usable alphabet, we address the UD using just a few letters. We note, however, that even for *N* smaller than *N*_{max}, e.g. seven dividers for experiments *S*_{2} and *S*_{3}, we obtain essentially the information the original spike trains convey about the stimulus (Strong *et al.* 1998). Figure 3 shows the information per letter *I*_{A} for *S*_{2}, divided by the information the original spike trains convey about the stimulus, using the dividers of table 1. In other words, for *N*=2, 3, 4, 5, …, we use the dividers [5], [3, 5], [3, 5, 11], [3, 4, 5, 11], …. The error bars shown are due to systematic errors in the computation of the *noise entropy*, i.e. the entropy of the spike train, given the stimulus. In the conversion of spike times into an alphabet, the precise timing relation of the stimulus to a particular letter is partially lost, and this fact is responsible for the error bars.

Since we are going to extrapolate to *L* large, we focus on sequences of words of length *L* containing four letters, using the best dividers of table 1. Only fine details of our results depend on this choice.

We now encode each of these *L*-letter words in a one-to-one map into a real number *W*_{i}, 0≤*W*_{i}≤1,^{5} whose frequencies we count in order to compute their probabilities *p*_{i}. The structure of the space of sequences for a given *N* may now be uncovered by computing the generalized dimensions^{6} *D*_{q} (Renyi 1971; Grassberger & Procaccia 1983*b*; Hentschel & Procaccia 1983). These are logarithmic ratios between the probabilities *p*_{i} and their physical occupation *ϵ*, which in our space is given by *ϵ*=*N*^{−L}. The index *q* can be thought of as a filter: a larger *q* enhances this ratio for large probabilities, whereas a negative one emphasizes the smaller probabilities,

*D*_{q} = [1/(*q*−1)] lim_{L→∞} log(∑_{i}*p*_{i}^{q})/log *ϵ*. (2.2)

An equivalent quantity is *f*_{α}, the Legendre transform of (1−*q*)*D*_{q}. The index *α* measures the possible local fractal dimensions (Grassberger & Procaccia 1983*a*; Halsey *et al.* 1986) in our space of words, occurring with density *f*_{α}. This is the global dimension of the set of points which locally scale with singularity strength *α*. They are given by *α* = d[(*q*−1)*D*_{q}]/d*q* and *f*_{α} = *qα* + (1−*q*)*D*_{q}.^{7}

In figure 4*a*,*b*, we show the *f*_{α} spectra of the symbolic sequences with their errors,^{7} which allow us to draw the following conclusions. The H1 neuron has a *multifractal character*, exhibiting an infinite number of dimensions *α* with densities *f*_{α}. This is in sharp contrast to a memoryless, uncorrelated spike train with a Poissonian or similar probability distribution. Any distribution with independent increments yields a non-trivial *f*_{α} for suboptimal dividers, but is really monofractal for the optimal ones, with *D*_{q}=1 independent of *q*. In fact, for *N*=2, let *p*_{0}/*p*_{1} be the probabilities for an interval being smaller/greater than *d*_{1}. Then *p*_{1}=e^{−λd_{1}} and *p*_{0}=1−e^{−λd_{1}}, where *λ* is the spike rate. Then *D*_{q} = log_{2}(*p*_{0}^{q}+*p*_{1}^{q})/(1−*q*). Maximizing the entropy gives *d*_{1}=ln 2/*λ* and *p*_{0}=*p*_{1}=1/2, yielding *D*_{q}=1 for all *q*.
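This monofractality argument is easy to check numerically: for independent binary letters, *D*_{q} = log_{2}(*p*_{0}^{q}+*p*_{1}^{q})/(1−*q*) is flat in *q* exactly when *p*_{0}=*p*_{1}=1/2 (a sketch; the suboptimal probability 0.8 below is illustrative):

```python
import numpy as np

def dq_binary_iid(p0, q):
    """D_q for words built from i.i.d. binary letters with probabilities
    p0 and p1 = 1 - p0: D_q = log2(p0**q + p1**q) / (1 - q)."""
    p1 = 1.0 - p0
    return np.log2(p0 ** q + p1 ** q) / (1.0 - q)

# Suboptimal divider (p0 != 1/2): D_q depends on q (spuriously multifractal).
spread_subopt = dq_binary_iid(0.8, -2.0) - dq_binary_iid(0.8, 2.0)
# Optimal divider d1 = ln(2)/lambda gives p0 = p1 = 1/2: D_q = 1 for every q.
spread_opt = dq_binary_iid(0.5, -2.0) - dq_binary_iid(0.5, 2.0)
```

A non-zero spread *D*_{−2}−*D*_{2} thus signals genuine multifractality only when the dividers are optimal, which is why the real spike trains below are tested against the best dividers.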

Real spike trains, by contrast, are multifractal even for the best dividers, for all types of stimuli. The fractal dimensions are roughly the same for all the datasets, with *D*_{0}≈1. This means that the neuron's dynamics has continuous support, the probability measure being distributed without ‘holes’. The spectra's shape (except for *S*_{3}) indicates that the measure has at least two scales and two main components: high-probability sets, hot spots localized in small portions of the symbolic space; and a low-probability set, cold spots spread out all over the space. In the set associated with *S*_{0}, the number of cold spots very much dominates the hot ones, implying one dominant scale. For the other datasets, the scales are comparable. Therefore, as the dynamics of the stimulus becomes faster as we go from *τ*_{c}=∞ to *τ*_{c}=0, the suppressed scale emerges.

Given the *f*_{α} spectra of our data, what is the simplest set with this spectrum? To address this question, we construct a probabilistic two-scale set with the following rule: an interval of length unity is divided into *b*+1 equal pieces, such that one piece receives *p*_{0} of the original measure and the remaining *b* receive *p*_{1} each, with *p*_{0}+*bp*_{1}=1. Iterating this process self-similarly yields a set of dimension

*D*_{q} = [1/(1−*q*)] log(*p*_{0}^{q}+*bp*_{1}^{q})/log(*b*+1), (2.3)

with *α*_{min} = −log *p*_{0}/log(*b*+1) and *α*_{max} = −log *p*_{1}/log(*b*+1), where *b* is adjusted to produce the correct value for *α*_{max} in figure 4*b*. The resulting *f*_{α} spectra are shown in figure 4*c*,*d*.
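Equation (2.3) is straightforward to evaluate; the sketch below uses illustrative values *p*_{0}=0.7 and *b*=30 (not the fitted values of figure 4) and checks the limits *D*_{0}=1 and *D*_{q→∞}→*α*_{min}:

```python
import numpy as np

def dq_two_scale(p0, b, q):
    """Equation (2.3): D_q of the self-similar set in which one of b+1
    equal pieces carries measure p0 and each remaining piece carries
    p1 = (1 - p0)/b."""
    p1 = (1.0 - p0) / b
    return np.log(p0 ** q + b * p1 ** q) / ((1.0 - q) * np.log(b + 1.0))

p0, b = 0.7, 30                        # illustrative values, not the fitted ones
d0 = dq_two_scale(p0, b, 0.0)          # capacity dimension: exactly 1
d_large_q = dq_two_scale(p0, b, 60.0)  # approaches alpha_min = -log(p0)/log(b+1)
```

The *D*_{0}=1 limit reproduces the continuous support of the data; the large-|*q*| limits give the two endpoints of the *f*_{α} curve used in the fit.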

Note that *b* jumps to lower values once a stimulus is turned on. This means that, for no stimulus, the dynamics distributes the measure rather uniformly into 31 equal intervals, where 30 of them receive *p*_{1} of the measure each and one receives *p*_{0}, with *p*_{0}>*p*_{1}. At the *r*th iteration of the set, 30^{r} intervals will contain *p*_{1}^{r} of the measure each (cold spots), one interval will contain *p*_{0}^{r} of the measure (the hot spot carrying the largest single share), and the rest will receive combinations proportional to *p*_{0}^{r−k}*p*_{1}^{k}, 0≤*k*≤*r*. When stimulated, the number of pieces *b* drops dramatically to approximately one, the hot spots approximately balancing the cold ones. Stimuli thus reshape the probability landscape, which becomes more and more structured as the stimulus entropy increases.

## 3. The M map and a two-dimensional symbolic space

The uncovering of a neural code, the rule by which a stimulus is encoded into a sequence of spike times, is one of the central issues in neuroscience. Here we address a much simpler, though preliminary, issue: finding a *pseudo-code*, which generates sequences with four letters, having the same probability distributions and the same *f*_{α} spectra as the data.

We propose a piecewise almost linear map of the form shown in figure 5.^{8} It was selected out of several dozen multiparameter maps that did not work. The parameters (*c*_{1}, *c*_{2} and *c*_{3}) of the M map were adjusted, starting from initial divider intervals set proportional to the respective letter probabilities. It models the most probable words, meaning that, for stimuli [*S*_{0}, *S*_{1}, *S*_{2}, *S*_{3}], we only consider words appearing more than [4, 20, 20, 25] times, discarding words with frequency below approximately 0.012%. Note that the cold spots are sensitive to these low-probability words. The nonlinearity for 0≤*x*_{n}≤*c*_{1} models the prominent singularity due to words mostly composed of ‘0’s in dataset *S*_{0} and is given by equation (3.1), with *g*=0.9. The jump in *b* (see figure 5) is reflected in this nonlinearity. If *g*=0, the map is piecewise linear.

We create a symbolic sequence from a trajectory of the M map, associating the letters *β*_{1}, *β*_{2}, *β*_{3} and *β*_{4} with *x*_{n} falling in [0, *c*_{1}), [*c*_{1}, *c*_{2}), [*c*_{2}, *c*_{3}) and [*c*_{3}, 1], respectively.
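Since the full M map (equation (3.1) with its fitted parameters) is not reproduced here, the following sketch iterates a hypothetical piecewise-linear, M-shaped stand-in (the *g*=0 case; the break points below are invented, not the fitted *c*_{1}, *c*_{2}, *c*_{3} of table 2) and reads off the four letters exactly as described above:

```python
# Break points of a hypothetical, piecewise-linear M-shaped map on [0, 1].
C1, C2, C3 = 0.25, 0.5, 0.75

def m_map(x):
    if x < C1:
        return 4.0 * x                  # rising branch: [0, C1] -> [0, 1]
    if x < C2:
        return 1.0 - 2.0 * (x - C1)     # falling branch: [C1, C2] -> [0.5, 1]
    if x < C3:
        return 0.5 + 2.0 * (x - C2)     # rising branch: [C2, C3] -> [0.5, 1]
    return 4.0 * (1.0 - x)              # falling branch: [C3, 1] -> [0, 1]

def symbols(x0, n):
    """Iterate the map, emitting one of the four letters (1..4) per step."""
    out, x = [], x0
    for _ in range(n):
        out.append(1 + sum(x >= c for c in (C1, C2, C3)))
        x = m_map(x)
    return out

seq = symbols(0.1234, 1000)
```

All branch slopes have magnitude at least 2, so a typical orbit wanders chaotically over [0, 1] and produces all four letters.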

The extraction of the second layer's equivalent dynamics is one of the basic steps in analysing this system within the framework of *dynamical systems theory*. We therefore convert the dynamics of our sequence of spike intervals Δ*t*_{i} into a sequence of words of length *L*=6 containing *N*=4 letters and embed our sequences into a Cartesian two-dimensional space^{9} for visualization. The symbolic spaces for the sequences generated by the M model with the parameters of table 2 are shown in figure 6*e*–*h*. The symbolic sequences extracted from the original data using four letters, embedded in the same space^{10} and again restricted to the most probable words, are shown in figure 6*a*–*d*. They exhibit the same qualitative features as the M model-generated ones.
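One simple one-to-one embedding of this kind (hypothetical; the paper's actual coding of words into coordinates may differ, and by footnote 10 the invariant quantities do not depend on the choice) reads the two halves of each word as base-*N* fractions:

```python
def embed_word(word, N):
    """Hypothetical 2-D embedding: the first half of an L-letter word fixes
    x and the second half fixes y, each half read as a base-N fraction."""
    h = len(word) // 2
    frac = lambda w: sum(c / N ** (i + 1) for i, c in enumerate(w))
    return frac(word[:h]), frac(word[h:])

# A 6-letter word over the 4-letter alphabet -> one point in the unit square.
x, y = embed_word((0, 3, 1, 2, 2, 0), 4)
```

Words sharing a prefix in either half land close together, so the cluster and stripe structure of figure 6 reflects the word statistics rather than the coding.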

The H1 dynamics is rather robust as the fly is subjected to different stimuli (as can be gleaned from table 2, which shows only small changes in parameter values).

We can also infer the *chaoticity* of the experimental data by measuring the Lyapunov exponents of the corresponding M map for the four datasets (see table 3).

As expected, the exponents increase as the stimuli's entropy increases. Note though that, even for no or constant stimuli, the exponents are positive, indicating a genuine chaotic UD.
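The Lyapunov exponent of a one-dimensional map is the orbit average of log|*f*′(*x*)|; the same estimator applied to the fitted M map yields the entries of table 3. A sketch on the logistic map at *r*=4, whose exponent is known to be ln 2:

```python
import math

def lyapunov(f, dfdx, x0, n, burn=100):
    """Lyapunov exponent of a 1-D map: orbit average of log|f'(x)|."""
    x = x0
    for _ in range(burn):  # discard a transient before averaging
        x = f(x)
    s = 0.0
    for _ in range(n):
        s += math.log(abs(dfdx(x)))
        x = f(x)
    return s / n

# Logistic map at r = 4: the exact exponent is ln 2 > 0 (chaotic).
f = lambda x: 4.0 * x * (1.0 - x)
dfdx = lambda x: 4.0 - 8.0 * x
lam = lyapunov(f, dfdx, 0.1234, 100000)
```

A positive average, as found for all four datasets, signals exponential sensitivity to initial conditions.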

## 4. Conclusion

We show that methods from *dynamical systems*, applied from the point of view of *complex systems* to a biological system, are able to extract new and relevant features. In our case, we extract the UD of the H1 neuron by means of a finite alphabet with about four letters. Analysing sequences written in this alphabet allows us to exhibit the multifractal character of the sequence space. We modelled our data sequences by a piecewise almost linear map, whose dynamics compares well with the original data in a two-dimensional state space. The stimulus shapes the landscape of the sequence space from mainly uniform to highly structured, as the stimuli become increasingly dynamically variable. But this reshaping is played out on different chaotically unstable layers. These are dynamically linked, a behaviour typical of complex systems.

The existence of multifractal sets strongly suggests a chaotic dynamics—also supported by the positive Lyapunov exponents that we compute using the M map. This leads us to think about strange attractors, which exhibit a natural way to generate hot spots: the existence of folds of an attractor with bounded measure (Grebogi *et al.* 1988). There the system loses its hyperbolicity, since the stable and unstable manifolds are tangent. Owing to ergodicity or recurrence, these folds map densely into the attractor, generating high-probability spots.

The tools developed here for the analysis of our data are of general applicability. The fly's optical processing system reveals its fascinating complexity, even in the basic aspects of the spike generating dynamics. Aspects of this spike generating dynamics provide a means for its manipulation and control. It remains to be seen whether the layered structure uncovered here has a counterpart in the fly's optical information processing system.

## Acknowledgments

Thanks go to the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP; M.S.B., R.K., C.G.) and the Max-Planck-Institute (M.S.B.) for financial support. It is R.K.'s pleasure to thank W. Bialek and R. R. van Steveninck for their infinite patience when sharing their insights. We also thank I. Zuccoloto for her help in fly preparation.

## Footnotes

† Present address: Max-Planck-Institut für Physik Komplexer Systeme, Nöthnitzerstrasse 38, 01187 Dresden, Germany.

One contribution of 15 to a Theme Issue ‘Experimental chaos I’.

1. Even a neuron itself may be considered as such a system.

2. The experiments were performed in the *DipteraLab*, Instituto de Física, São Carlos, Brazil. Flies were immobilized with wax and viewed an image displayed on a Tektronix 608 monitor from a distance of 12 cm, the setup being similar to that of Fairhall *et al.* (2001). The light intensity corresponds roughly to that seen by a fly at dusk.

3. The fly's reaction time is of the order of 20 ms, and this is thus a behaviourally relevant time scale.

4. The subscript ‘2’ in *H*_{2}(*N*) indicates that we start our search for dividers using a sequence beginning with this prime number.

5. Our results are independent of the particular encoding used, as long as it is one-to-one.

6. Note that, since no metric characterization of the space (*W*) is performed, the way we encode words into real numbers is irrelevant to the calculation of the generalized dimensions.

7. *D*_{q} can be reliably computed if there are linear regimes when plotting the numerator versus the denominator of equation (2.2). The *q* values were selected so as to produce *D*_{q} values with a variance of no larger than 5% relative to the absolute value (obtained from the residual of the linear fitting) and which produce an error in *f*_{α} of no more than 10% relative to the absolute value, except for some low-probability sets, which have larger errors.

8. Notice that *h*_{3} may be greater than 1.

9. The false nearest neighbour technique (Kantz & Schreiber 1997) indicates an embedding dimension *d*_{e}∼3–6.

10. The position of the stripes depends on how we code for *W*_{i}, but quantities like *D*_{q}, *f*_{α} and Lyapunov exponents are coordinate independent.

© 2007 The Royal Society