## Abstract

This paper briefly outlines our growing understanding of the relationships between the network structure of ecological networks—both in mathematical models and in the real world—and their consequent dynamical properties. These are interesting, inter alia, because they affect the system’s ability to withstand disturbance, whether natural or human-created. The paper also sketches recent interest in the potential relevance of this work to ‘systemic risk’ and regulatory measures in banking systems, emphasizing the similarities and differences. I conclude with some cautions against drawing excessively general conclusions from any such models.

## 1. Introduction

The primary focus of this paper is on the interplay between the structure of a network of interacting entities and the way such structure may influence the system’s response to perturbations of various kinds. One of the earliest areas of exploration of such interplay between a web’s structure and its dynamics was for food-webs (or more generally ‘ecosystems’), and I will begin with this. Subsequent sections of the paper similarly pursue the theme in the contexts of infectious diseases, IT networks (very briefly, given the excellent survey by Barabási elsewhere in this issue), and most recently financial networks. As Barabási and others emphasized in the 1990s (and in this issue) for IT and other related networks, the interconnections we observe in real networks very rarely correspond to random, Erdős–Rényi ones. This theme, and its important consequences, was separately established for ecosystems in the early 1970s and for infectious diseases in the mid-1980s. Exploration of its implications for financial systems is very recent.

A significant fraction of what I have to say here has previously been published in a review in *Trends in Ecology and Evolution* [1]. But this journal is rarely read by members of the Web-science community, so I hope some repetition is forgivable. The present paper concludes with brief caveats about some recent excesses of enthusiasm.

## 2. Stability and complexity in ecosystems: 1960s to 2000s

Ecology is a young subject. The name itself was coined just over a century ago, and the oldest professional society is less than 100 years old. The early decades were, as in most subjects, confined mainly to descriptive work, laying the foundations. As a conceptual base began to emerge around the middle of the previous century, one central concept was that food-web complexity (more species and a richer array of interactions among them) may of itself tend to confer ‘stability’ in the sense of robustness against environmental or other disturbances. The influential Hutchinson [2], following Elton [3], suggested that ‘… oscillations observed in arctic and boreal fauna may be due in part to the communities not being sufficiently complex to damp out oscillations’. This was based mainly on a misunderstanding of a paper by MacArthur [4], which Hutchinson saw as a ‘… formal proof of the increase in stability of a community as the number of links in its food-web increases’.

This ‘conventional wisdom’ of the 1960s was undercut by investigation of a simple mathematical metaphor relating to it [5]. Consider a community of *N* species, each possessing intraspecific mechanisms which, were the species in isolation, would stabilize perturbations (with a characteristic damping time normalized to be *T*_{d}=1). Now let there be a randomly constructed network of interactions among these *N* species, with a mean number, *m*, of links per species, and each interaction, independently randomly, being + or − and with average magnitude *α* when scaled against the intraspecific effects.

The overall stability of such a ‘randomly constructed’ assembly *explicitly depends both* on the network’s connectance (i.e. the number of actual links as a ratio to the number of possible links per species; *C*=*m*/*N*), and on the average interaction strength *α*. More precisely, for large values of *N*, the system is stable if, but only if,

$$m\alpha^{2} < 1. \tag{2.1}$$

As emphasized at the time [6], pp. 76, 174, real food-webs are, of course, not ‘randomly assembled’ networks, but are the winnowed products of evolutionary processes. So the reshaped agenda has been to seek, both in nature and in mathematical models, the special kinds of food-web/network structures that may help reconcile ‘complexity’ with ‘systemic stability’. Two tentative suggestions were that ‘predator–prey interactions (+−) are consistent with qualitative stability [dependent only on the signs, independent of magnitude, of interactions], whereas mutualistic (++) and competitive (−−) interactions are not’ and ‘for a given average interaction strength and web connectivity [food-webs may be more robust] if the interactions tend to be concentrated in small blocks’ [6].
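
The stability criterion for randomly constructed communities can be explored numerically. The sketch below is illustrative only and is not the original calculation: it assumes that non-zero interactions are drawn from a zero-mean Gaussian with standard deviation *α*, that each possible link is present with probability *C*, and it judges stability by integrating the linearized dynamics with a simple forward Euler scheme. The function names and parameter values are hypothetical.

```python
import math
import random


def random_community(n, connectance, alpha, rng):
    """Build an n x n interaction matrix with self-damping -1 on the diagonal.

    Each off-diagonal entry is non-zero with probability `connectance`;
    non-zero entries come from a zero-mean Gaussian with standard deviation
    `alpha` (an assumed distribution, so signs are + or - at random).
    """
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                matrix[i][j] = -1.0
            elif rng.random() < connectance:
                matrix[i][j] = rng.gauss(0.0, alpha)
    return matrix


def perturbation_decays(matrix, rng, dt=0.1, t_end=30.0):
    """Integrate dx/dt = A x by forward Euler; report whether |x| shrank."""
    x = [rng.gauss(0.0, 1.0) for _ in range(len(matrix))]
    start = math.sqrt(sum(v * v for v in x))
    for _ in range(int(t_end / dt)):
        dx = [sum(a * v for a, v in zip(row, x)) for row in matrix]
        x = [v + dt * d for v, d in zip(x, dx)]
    return math.sqrt(sum(v * v for v in x)) < start


def stable_fraction(n, connectance, alpha, trials=5, seed=1):
    """Fraction of randomly assembled communities whose perturbations decay."""
    rng = random.Random(seed)
    hits = sum(
        perturbation_decays(random_community(n, connectance, alpha, rng), rng)
        for _ in range(trials)
    )
    return hits / trials
```

With *N*=30 and *C*=0.3 (so *m*≈9), calling `stable_fraction(30, 0.3, 0.1)` probes a case well inside the stable region (*mα*²≈0.09), whereas `stable_fraction(30, 0.3, 1.0)` probes one well outside it (*mα*²≈9). For finite *N* the transition is blurred, so values of *α* close to the threshold give mixed outcomes.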

More recent work, both theoretical and experimental/observational, testifies to the fact that real food-webs are indeed not randomly connected, Erdős–Rényi ones. Thus Kuris *et al.* [7], in a study of three estuarine ecosystems, which—seminally—included parasites (whose biomass exceeded that of top predators), found a preponderance of predator–prey interactions in these networks. This coincides with a recent computer study by Allesina & Pascual [8] of some 10 000 randomly assembled food-webs, which found that those with a disproportionate fraction of predator–prey interactions were significantly more likely to be stable (all eigenvalues of the system in the left half-plane). Analytic results on greatly oversimplified toy models give some intuitive insights into these results.

Other studies find evidence for modularity in networks, particularly plant–pollinator, where linkages tend to be ‘overdispersed’ and ‘disassociative’ [9]. Sugihara & Ye [10] extend this, in a review of nested hierarchies in food-webs. More generally, and importantly, many and perhaps most of the interesting, and practically relevant, questions about a network’s response to perturbation depend not only on its topology, but also on the individual interaction strengths. For an excellent review of these issues, see the Theme Issue of the *Philosophical Transactions of The Royal Society B* edited by Dobson *et al.* [11].

In all this, it is encouraging to note that food-webs tentatively reconstructed from palaeo-data (Burgess Shale) by Dunne *et al*. [12] seem similar to present-day ones—especially in predator–prey ratios—which implies that there really is something to be explained.

## 3. Infectious diseases and contact networks

A central concept in discussing the transmission dynamics of infectious diseases is the basic reproductive number, *R*_{0}, which measures the average number of secondary infections produced when one infected individual is introduced into a wholly susceptible population [13]. Essentially all earlier work on directly transmitted infection agents assumed that the network of contacts was a random, Erdős–Rényi one. On this basis, Kermack & McKendrick [14] calculated that the expected fraction of the population to experience infection, *I*, would be given by the equation

$$I = 1 - \exp(-R_{0} I). \tag{3.1}$$

For *R*_{0} less than unity, *I*=0 is the only solution; for large *R*_{0}, *I*≈1−exp(−*R*_{0}).
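
The final-size relation has no closed-form solution, but it is easily solved by iteration. The following is a minimal sketch, assuming the standard form *I*=1−exp(−*R*_{0}*I*); the function name and tolerances are illustrative choices.

```python
import math


def final_size(r0, tol=1e-12, max_iter=100_000):
    """Solve I = 1 - exp(-r0 * I) for the largest root by fixed-point iteration."""
    if r0 <= 1.0:
        return 0.0  # below the threshold, I = 0 is the only solution
    fraction = 1.0  # start from the top so the iterates descend to the positive root
    for _ in range(max_iter):
        updated = 1.0 - math.exp(-r0 * fraction)
        if abs(updated - fraction) < tol:
            return updated
        fraction = updated
    return fraction
```

For example, `final_size(2.0)` gives roughly 0.80, and for larger *R*_{0} the result approaches 1−exp(−*R*_{0}).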

The quantity *R*_{0} essentially depends on the number of contacts per unit time (*c*), the duration of infectiousness (*D*), and the transmission probability (i.e. the chance that a susceptible individual will become infected upon contact, *β*). Although this quantity can be estimated from appropriate serological data, it is usually impossible to calculate from first principles; accurate estimation of the transmission probability is the main problem. For sexually transmitted diseases, however, rough estimates of all three components of *R*_{0} are possible. When Yorke *et al.* [15] did this for the gonorrhoea epidemic in the western states of the USA, they found *R*_{0} significantly less than unity. This was, of course, inconsistent with the observed epidemic. Yorke *et al.* resolved this dilemma by assuming a small sub-population of ‘superspreaders’, for whom *R*_{0}≫1. Moreover, one cannot simply substitute the average contact number, *c*, into the formula *R*_{0}=*cDβ*; the superspreaders are both more likely to acquire infection and more likely to transmit it, by virtue of their hyperactivity.

As seen above for ecosystems, and as emphasized also in Barabási’s paper in this issue, the essential issue is that the contact network is *NOT* a random one. More formally, for a contact network with degree distribution {*P*(*i*)}, May & Anderson [16] have shown that the ‘epidemiologically appropriate’ expression for the average number of partners is not the simple average, but rather the mean-square number of partners divided by the mean. This accords with the intuitive explanation given above. *R*_{0} can then be written in its appropriately generalized form as

$$R_{0} = cD\beta\,[1 + (\mathrm{CV})^{2}]. \tag{3.2}$$

Here CV denotes the coefficient of variation of the degree distribution.
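
A short numerical illustration may help. It assumes that the generalized *R*_{0} equals *Dβ* times the mean-square number of partners divided by the mean, which is the same as *cDβ*(1+CV²) with *c* the simple mean. The degree distribution and parameter values below are hypothetical.

```python
def naive_and_corrected_r0(degree_dist, duration, beta):
    """Return (naive, corrected) R0 for a degree distribution {i: P(i)}.

    naive     = c * D * beta, with c the plain mean number of partners
    corrected = (mean-square / mean) * D * beta = naive * (1 + CV**2)
    """
    mean = sum(i * p for i, p in degree_dist.items())
    mean_square = sum(i * i * p for i, p in degree_dist.items())
    cv_squared = mean_square / mean**2 - 1.0
    naive = mean * duration * beta
    return naive, naive * (1.0 + cv_squared)


# Hypothetical population: 98% have 1 partner per unit time, 2% have 40.
naive, corrected = naive_and_corrected_r0({1: 0.98, 40: 0.02}, duration=1.0, beta=0.4)
```

In this made-up example the naive estimate falls below unity while the corrected value lies well above it, which is the kind of discrepancy that a small group of superspreaders produces.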

The generalized version of the Kermack–McKendrick equation (3.1) is as follows [13]. For a degree distribution {*P*(*i*)}, the infected fraction of individuals in the *i*th class—the class with *i* partners per unit time—is now given by
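
$$I_{i} = 1 - \exp(-i\alpha). \tag{3.3}$$

Here *α* is given by

$$\alpha = \beta D\,\frac{\sum_{i} i\,P(i)\,I_{i}}{\sum_{i} i\,P(i)}. \tag{3.4}$$

The total fraction infected is then

$$I = \sum_{i} P(i)\,I_{i}. \tag{3.5}$$
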
Figure 1 shows the dramatic range of fractions infected, for a given value of *R*_{0} (correctly calculated), as the degree distribution becomes ever more ‘long-tailed’, i.e. as CV increases.
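
The dependence of the fraction infected on the shape of the degree distribution can be sketched numerically. The code below assumes a proportionate-mixing form of the generalized equations, namely *I*_{i}=1−exp(−*iα*) with *α*=*βD* Σ *iP*(*i*)*I*_{i} / Σ *iP*(*i*), and solves for *α* by fixed-point iteration. The distributions and parameters are hypothetical and are not those used for figure 1.

```python
import math


def epidemic_final_size(degree_dist, duration, beta, tol=1e-12, max_iter=200_000):
    """Total fraction infected, I = sum_i P(i) * (1 - exp(-i * alpha)).

    alpha solves alpha = beta * D * sum_i i P(i) I_i / sum_i i P(i),
    an assumed proportionate-mixing closure.
    """
    mean = sum(i * p for i, p in degree_dist.items())
    alpha = beta * duration  # upper bound on alpha; iterate downwards from it
    for _ in range(max_iter):
        weighted = sum(
            i * p * (1.0 - math.exp(-i * alpha)) for i, p in degree_dist.items()
        )
        updated = beta * duration * weighted / mean
        converged = abs(updated - alpha) < tol
        alpha = updated
        if converged:
            break
    return sum(p * (1.0 - math.exp(-i * alpha)) for i, p in degree_dist.items())


# Two hypothetical populations tuned to the same corrected R0 = 2.
homogeneous = epidemic_final_size({4: 1.0}, duration=1.0, beta=0.5)
long_tailed = epidemic_final_size(
    {1: 0.98, 40: 0.02}, duration=1.0, beta=2.0 * 1.78 / 32.98
)
```

Although both populations share the same *R*_{0}, the homogeneous one ends with most individuals infected, whereas in the long-tailed one the epidemic is largely confined to the highly connected minority.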

Equations (3.1)–(3.5) were derived from the standard S–I–R differential equations of conventional epidemiological theory. It is interesting to note, in passing, that they can alternatively be derived, in a seemingly very different way, directly from network theory. The fraction infected is equivalent to graph theory’s ‘giant cluster’: *I*=1−*u*, where *u* is the fraction of the network’s nodes *NOT* in the giant cluster. But the probability, *u*, that any one node is not connected to the giant cluster is equal to the probability that all its contacts are themselves nodes that also are unconnected. That is, *u*=Σ_{*i*}*P*(*i*)*u*^{*i*}. But the right-hand side of this equation is simply the generating function of the degree distribution, *G*(*u*), whence *u*=*G*(*u*). In particular, for a random, Erdős–Rényi network, the classic Kermack–McKendrick equation (3.1) can now be found directly from the generating function *G*(*x*)=exp[−*R*_{0}(1−*x*)].
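
The steps of this argument can be written out explicitly. The lines below assume that the degree distribution of the random network is Poisson with mean *R*_{0}, so that its generating function takes the exponential form shown.

$$
\begin{aligned}
G(x) &= \sum_{i} P(i)\,x^{i} = \exp[-R_{0}(1-x)],\\
u &= G(u) = \exp[-R_{0}(1-u)],\\
I &= 1-u = 1-\exp(-R_{0} I).
\end{aligned}
$$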

Pursuing the theme of non-random networks, an important theorem has been established by Anderson & May [13]. Define *p*_{c} to be the critical fraction of a population that must be immunized in order to eradicate infection, assuming that the contact network is a random one. And suppose we now implement our programme by vaccinating this fraction, at random. If in fact the network’s degree distribution is not random, but rather has significant variance in contact numbers, we will find we need to immunize a larger fraction than estimated. Conversely, if we take advantage of the non-randomness to focus immunization on the superspreaders, eradication of infection will be achieved more easily than estimated. This key theme will be re-echoed in the next two sections.
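
The contrast between random and targeted immunization can be illustrated with a small calculation. This is a sketch under stated assumptions rather than the theorem itself: it takes the familiar threshold *p*_{c}=1−1/*R*_{0} for random vaccination, assumes proportionate mixing (so that vaccinating a fraction of the class with *i* partners removes the corresponding share of its *i*²*P*(*i*) contribution to the mean-square term), and uses a hypothetical two-class population.

```python
def corrected_r0(degree_dist, duration, beta):
    """R0 = beta * D * (mean-square partners / mean partners)."""
    mean = sum(i * p for i, p in degree_dist.items())
    mean_square = sum(i * i * p for i, p in degree_dist.items())
    return beta * duration * mean_square / mean


def random_coverage(r0):
    """Fraction vaccinated at random that brings (1 - p) * r0 down to 1."""
    return max(0.0, 1.0 - 1.0 / r0)


def targeted_coverage(degree_dist, duration, beta):
    """Vaccinate the most-connected classes first until R0 falls to 1."""
    mean = sum(i * p for i, p in degree_dist.items())
    threshold = mean / (beta * duration)  # mean-square term at which R0 = 1
    remaining = sum(i * i * p for i, p in degree_dist.items())
    covered = 0.0
    for i in sorted(degree_dist, reverse=True):
        excess = remaining - threshold
        if excess <= 1e-12:
            break
        class_term = i * i * degree_dist[i]
        if class_term == 0:
            continue  # a class with no partners contributes nothing
        share = min(1.0, excess / class_term)  # part of this class to vaccinate
        covered += share * degree_dist[i]
        remaining -= share * class_term
    return covered


# Hypothetical population: 98% with 1 partner per unit time, 2% with 40.
population = {1: 0.98, 40: 0.02}
mean_partners = sum(i * p for i, p in population.items())
naive_estimate = random_coverage(mean_partners * 1.0 * 1.0)  # from c * D * beta
random_needed = random_coverage(corrected_r0(population, 1.0, 1.0))
targeted_needed = targeted_coverage(population, 1.0, 1.0)
```

In this example the coverage actually needed under random vaccination exceeds the estimate based on the mean contact rate, while targeting the highly connected class requires immunizing only a small fraction of the population.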

Before leaving this section, I deal with one persistent misunderstanding. If the degree distribution is taken to obey a power law or ‘scale-free’ distribution, *P*(*i*)∼*i*^{−δ}, then for 3>*δ*>2 (as often found) the distribution’s variance tends to infinity as the number of nodes or people becomes large. Correspondingly, from equation (3.2), *R*_{0} if correctly calculated will tend to infinity. This has led to the misapprehension that the apparent lack of an ‘epidemiological threshold’ means an affliction like HIV/AIDS cannot be eradicated, no matter how small we can make the effective value of the transmission probability, *β*. But if the fraction infected is calculated correctly, we find that the fraction infected in the limit where *β*≪1 is […]; here, as above, *c* is simply the average partner number (never mind the infinite variance) and *D* the duration of infectiousness. For very small *β*, *I*→0, regardless of the scale-free distribution. For details, see May [1].

## 4. IT networks and their vulnerability

Here Barabási’s paper in this Theme Issue gives an excellent survey of interesting and important work which owes much to him and his colleagues. Both his and my papers highlight the importance of recognizing the ubiquity of situations where we deal with networks possessing long-tailed, far-from-random degree distributions. In particular, Barabási’s beautiful paper [17] demonstrates that important IT networks exhibit these properties, leading to his entirely independent discovery of the 1991 theorem about such networks’ robustness to random attacks and fragility to targeted attacks.

## 5. Network dynamics of financial systems

In 2006, well ahead of the banking crisis that followed, the US National Academies/National Research Council and Federal Reserve Bank of New York collaborated on a study aimed to ‘stimulate fresh thinking on systemic risk’ [18]. The basic observation motivating this study was that financial institutions—here generically called ‘banks’—were creating ever more complex financial instruments aimed at minimizing risk to individual entities, but that no one was paying significant attention to possible consequences for the stability of the entire system. Recognizing the possible ‘read-across’ from other areas explicitly concerned with systemic risk, this NAS/FRBNY study drew banking people together with researchers from ecology, infectious disease transmission and energy supply grids [19].

Recognizing the potential relationship between the network of connections among individual banks and the dynamics of the system as a whole, the study also commissioned an analysis of the topology of interbank payment flows within the US Fedwire service. This is a real-time settlement system, operated by the Federal Reserve System, within which some 9500 participating banks transfer funds. Interestingly, this network is highly disassociative; big banks were disproportionately connected to small ones, and vice versa; the average bank was connected to 15 others, but this does not accurately convey the reality in which most banks have only a few connections while a small number of ‘hubs’ have thousands. Such strongly non-random and disassociative characteristics of the bank-transfer network are, as we have seen above, shared by many other complex networks, and particularly by many ecological systems.

The subsequent failure of some major financial institutions, and the consequent propagation of shocks throughout the banking system, have prompted a rapid rise in studies of the dynamics of ‘toy models’ of banking networks [20–25], along with exploration of possible implications for better regulation of the system [26–29].

There are, of course, many differences between ecosystems and banking systems. Not least is that nodes and links in food-webs are relatively well defined; a node is a particular species (or sometimes group thereof, e.g. ‘spiders’), linked to others as prey, predator, competitor or mutualist. A minimally complex representation of a node in the interbank (henceforth IB) network is given by figure 2. It has two kinds of incoming links (deposits, which are liabilities external to the network, and IB borrowing) and two kinds of outgoing links (external assets and IB loans). The borrowing/lending links are basically conventional network ones, although even here we can have several separate loans/links between any two banks. By contrast, the external assets can, and usually do, constitute a ‘network of networks’; minimally complex toy models may give each bank assets of *n* different kinds, *c* (≤*n*) of which are shared with varying numbers of other banks.

The sum of a bank’s assets must exceed its liabilities. The difference represents the bank’s capital reserve or ‘net worth’, denoted by *γ* in figure 2. If *γ* becomes negative, the bank fails.

The toy models referred to above explore the systemic consequences of failure of any one bank, which—as shown in figure 2—is assumed to be caused by one (or more) of its external assets losing value in excess of the capital reserve, *γ*. Such a loss has three distinct consequences for other banks in the system.

First, following the initial bank’s failure, its net loss is distributed equally among its *z* creditor banks. These *z* banks will also fail if the loss thus inflicted on them exceeds *γ* and so on. But note that in each phase of this process, the shock is diluted by a factor *z*. That is, loan default shocks attenuate as they propagate. Note also that the attenuation is more marked for larger *z*. On the other hand, large *z* means more banks are potentially at risk. This highlights the complex interplay between fragility and robustness in such networks [21].
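
The trade-off between dilution and exposure can be made concrete with a deliberately crude sketch. It assumes a tree-like creditor network (no two failed banks share a creditor), identical banks, and that a failed creditor passes on only the part of the loss that its reserve *γ* did not absorb. These assumptions, and the numbers used, are illustrative and are not taken from the cited models.

```python
def loan_shock_cascade(initial_loss, z, gamma, max_phases=50):
    """Follow a default shock outward through a tree-like creditor network.

    Every failed bank passes its net loss, split equally, to z creditors.
    A creditor fails when its share exceeds its capital reserve gamma and
    then passes on the part of the loss that its reserve did not absorb.
    Returns the number of banks failing in each phase after the initial one.
    """
    failures = []
    loss, failed_banks = initial_loss, 1
    for _ in range(max_phases):
        share = loss / z
        if share <= gamma:
            break  # the diluted shock no longer exceeds any reserve
        failed_banks *= z
        failures.append(failed_banks)
        loss = share - gamma
    return failures


# Hypothetical units: initial net loss 1.0, capital reserve 0.01 per bank.
few_creditors = loan_shock_cascade(1.0, z=2, gamma=0.01)
many_creditors = loan_shock_cascade(1.0, z=10, gamma=0.01)
```

With few creditors the shock survives for more phases, while with many creditors it dies out sooner but reaches more banks in its first phase.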

Second, the initial bank’s loss of value for one asset in figure 2 is likely to cause depreciation of that specific asset class, as a result of ‘fire sales’ or more general reappraisal. This will transmit a ‘liquidity shock’ to all other banks holding that asset. In earlier studies, this is usually handled by discounting *all* external asset classes by a factor exp(−*αx*), where *x* represents the fraction of all banks that have failed at any given stage of the shock propagation process. May & Arinaminpathy [23] have introduced the refinement of distinguishing strong liquidity shocks (SLS), where there is a relatively large discount factor *α*, for banks actually holding a failing asset class and thereby directly affected by fire sales or the like, and weak liquidity shocks (WLS), with a relatively smaller discount factor *β*, to account for less direct, but nonetheless real, confidence effects upon all asset classes held by banks that have failed (even though these classes were not directly implicated in the failure). It is clear that, in contrast with loan shocks, all such liquidity shocks are amplified in each successive phase of propagation of the initial shock (*x* increases, and so both SLS and WLS become larger). The introduction of such less tangible ‘confidence effects’ via WLS, however, can make a significant difference to the signature of the potential cascade produced by the initial failure. Without WLS, the fraction of all banks failing rises gradually as the value of *γ* (which is usually specified by regulators) falls to lower values. But once WLS effects are included, the ‘shock wave’ steepens, with a single initial failure tending to bring the entire system down.

Third, the failure of any one bank can result in other banks taking the precaution of increasing their capital reserves by calling in loans, or lending for shorter terms (which has many of the same effects). Such ‘liquidity hoarding’ (LH) is another form of confidence effect, and as such shares some of the character of a WLS. Like liquidity shocks to external assets, the shocks produced by LH tend to amplify as more banks fail.

Most of these models assume that all banks are of roughly the same size, and that both the network of IB loans and the networks of shared assets are connected at random. More recent work [30] makes more realistic assumptions, treating a toy system with ‘big banks and little banks’ in which the (relatively few) big bank nodes have many more links, and where the network connections are made either proportionately or disassociatively. These models also explore the effects of taking the ratio of capital reserves to total assets to be the same for all banks, or alternatively either larger or smaller for big banks. These model systems are significantly more robust when big banks hold relatively larger capital reserves. In practice, in the recent boom years, the opposite was the case.

Some of the messages from these models are surveyed by Haldane [26], Haldane & May [29] and Jones [31].

## 6. Some caveats

I end with three cautionary notes, each of them relevant both to the foregoing and more generally.

Many of the papers in this Theme Issue—including this one—have emphasized the fact that many, indeed most, degree distributions seen in the wide variety of contexts within ‘Web science’ have assembled themselves by mechanisms which produce long-tailed distributions. This contrasts with many simple studies, including the ‘first generation’ of toy models for financial systems discussed above, which assume random, Erdős–Rényi models. More generally, however, it is too often implicitly assumed that a sample from a network is representative of the full network’s degree distribution. This is not necessarily true. Stumpf *et al.* [32] have shown that such a sample accurately characterizes the full network if, and only if, the degree distribution is binomial (i.e. has a generating function *G*(*x*)=[1+*m*(1−*x*)/*k*]^{−k}, where *k* can be negative or positive). This category does not include scale-free distributions, although here a sample can be reasonably reliable if it is large enough. It is an interesting sidelight on the sociology of science that Stumpf’s paper had some difficulty being published.
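
It may help to see which familiar distributions belong to the family with this generating function. The following special cases are standard results, with *m* the mean degree:

$$
G(x)=\left[1+\frac{m(1-x)}{k}\right]^{-k}
\;\longrightarrow\;
\begin{cases}
\exp[-m(1-x)] & k\to\infty \quad \text{(Poisson)},\\[4pt]
(1-p+px)^{n},\; p=m/n & k=-n,\ n \text{ a positive integer} \quad \text{(binomial)},\\[4pt]
\left[1+m(1-x)/k\right]^{-k} & k>0 \quad \text{(negative binomial)}.
\end{cases}
$$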

Even if the degree distribution is accurately known, it does not fully characterize the network. For a given degree distribution, the links can be made proportionately (the probability of connecting to any other node being proportional to that node’s number of links), or associatively (high-linkage nodes preferentially connected to other high-linkage ones), or conversely disassociatively, or elsewhere along a continuum of possibilities. Furthermore, the dynamical behaviour of the network can depend in important ways on such linkage details. For instance, if the network in question is describing partner acquisition relevant to transmission of a sexually transmitted disease, associative connections will see an epidemic arise faster but infect fewer people. Conversely, a disassociative pattern makes for slower initial rise in incidence, but with maximum numbers infected in the long run [33]. Proportional connections are, of course, intermediate.

Finally, in many contexts, a network’s dynamical response to disturbance will depend not only on its topology, but also on the strengths of, or flows along, individual links (and on how these correlate with a node’s degree distribution).

## Footnotes

One contribution of 15 to a Discussion Meeting Issue ‘Web science: a new frontier’.

- © 2013 The Author(s) Published by the Royal Society. All rights reserved.