## Abstract

Given a suitably parametrized family of equilibrium models and a higher level criterion by which to measure an equilibrium state, mathematical programs with equilibrium constraints (MPECs) provide a framework for improving or optimizing the equilibrium state. An example is toll design in traffic networks, which attempts to reduce total travel time by choosing which arcs to toll and what toll levels to impose. Here, a Wardrop equilibrium describes the traffic response to each toll design. Communication networks also have a deep literature on equilibrium flows that suggest some MPECs. We focus on mathematical programs with complementarity constraints (MPCCs), a subclass of MPECs for which the lower level equilibrium system can be formulated as a complementarity problem and therefore, importantly, as a nonlinear program (NLP). Although MPECs and MPCCs are typically non-convex, which is a consequence of the upper level objective clashing with the users' objectives in the lower level equilibrium program, the last decade of research has paved the way for finding local solutions of MPCCs via standard NLP techniques.

## 1. Introduction

Mathematical programs with equilibrium constraints (MPECs) are problems of the form(1.1)where may variously be called the upper level or leader or design or control vector, and the lower level or follower or response or state vector, respectively; the upper level objective function is smooth; and the lower level optimization or equilibrium problem is formulated with smooth functions.

Equilibrium constraints may come in the form of a game, a variational inequality (VI) or stationary conditions of one of these problems or of an optimization problem. In this paper we restrict our attention to equilibrium constraints that can be formulated via finitely many smooth equations, inequalities and complementarity relations, namely in the framework of an underdetermined mixed complementarity problem (MCP). That is, the MPEC will be a mathematical program with complementarity constraints (MPCC) as follows:(1.2)where and are smooth, and ⊥ denotes orthogonality, so that can be written as a nonlinear equality constraint . Thus, significantly, MPCCs are open to the application of many nonlinear programming techniques. Note in (1.2) above that we do not distinguish between upper and lower level variables, but aggregate them into the vector *z*, and only omit general nonlinear equality constraints to keep notation simple.

### (a) Where do MPECs and MPCCs come from?

Consider an equilibrium system, or stationary conditions of an optimization problem, that we call the ‘forward model’. The data of such a model implicitly describe a ‘forward output’, namely an equilibrium solution. There are two paradigms fundamental to scientific modelling that give rise to MPECs when the forward model is an equilibrium problem.

The first paradigm is the design problem, in which parameters that affect the equilibrium model are adjusted to explore the space of achievable equilibria, and thereby improve the performance of an existing or planned system. The second paradigm is the inverse process of parameter estimation, where we try to identify an existing equilibrium system with the one that appears in a parametrized family. An inverse MPEC could be formed by minimizing the least-squares distance between the given observations of the existing equilibrium state and the corresponding values of the forward equilibrium, subject to changing the model parameters within the acceptable limits. This can also be regarded as calibration.

Note that stationary conditions of nonlinear programs (NLPs) are formulated as Karush–Kuhn–Tucker (KKT) systems (Nocedal & Wright 1999), which are themselves MCPs. For example, a bi-level optimization problem may be tackled as an MPCC after replacing the lower level problem by its KKT conditions.

General references on MPECs include the monographs by Luo *et al*. (1996) and Outrata *et al*. (1998), and the more recent bibliography by Dempe (2003) on the bi-level programming and MPEC literature.

### (b) Goals of this paper

This paper reviews equilibrium problems and associated MPECs and MPCCs in traffic (§2) and communication (§3) networks. The interest in MPCCs is that these can readily be written as NLPs, e.g. (1.3) to follow, and NLP methods applied to find local solutions. Solving MPCCs using NLP techniques is the subject of the brief §4. Concluding remarks on research in traffic and telecommunications networks are given in §5. An extended version of this paper, Ralph (2008), reviews the Lagrangian stationary conditions for MPCCs that underpin recent advances in solving MPCCs by NLP techniques.

For concreteness, consider the NLP formed by replacing orthogonality between and in (1.2) by a formally weaker statement, (1.3)This NLP is equivalent to the MPCC (1.2) in that a local or global minimum of the former is the same for the latter, and vice versa. This suggests that we may use standard NLP methods to solve MPCCs, as will be explored later.

A cautionary note is that MPCCs are non-convex optimization problems, while NLP methods largely aim to find stationary points or, at best, local optima, rather than global optima. Non-convexity appears in the simplest complementarity constraint , over scalars *x* and *y*. This gives the non-convex feasible set , which is the union of two polyhedral convex ‘pieces’. There is a dearth of research on global optimization of MPECs though we can point to Hu *et al*. (2007) for exact methods and Sumalee (2004*a*,*b*) for heuristic search methods, and references therein.

## 2. Traffic equilibrium and associated MPCCs

We briefly describe an important forward equilibrium model in road traffic networks, due initially to Wardrop (1952), and discuss associated design and inverse MPECs. It will be important for us to show the transformation of an equilibrium model into an MCP, so that subsequent MPECs can be written as MPCCs.

### (a) Forward model of Wardrop traffic equilibria

Notation: consider a network with *N* nodes, a set of directed arcs or links , where links are denoted either *ij* for flow from node *i* to *j* or more generally *a*, a set of origin–destination (OD) pairs, denoted *kl* or *κ*, and corresponding list of OD demands or quantities , . Each OD pair *kl* defines a commodity in a multi-commodity network flow problem, whose feasible set is given by those non-negative vectors that achieve a transfer of *q*_{kl} units from *k* to *l*. We assume that the network is connected, in particular, for each OD pair *kl*∈, it is possible to ship one unit from *k* to *l*. It is easy to extend this model to include elastic demand that decreases as congestion increases. See the monograph of Patriksson (1994) for further details on traffic modelling.

Wardrop traffic equilibria: we wish to find the equilibrium pattern of traffic flow over feasible OD flows *x*^{κ}, bearing in mind that travel durations on links depend on link flows. Additional constraints, such as upper bounds on any link flow *f*_{a}, can also be imposed to define the set of feasible flows *f*. Wardrop's equilibrium principle (Wardrop 1952) states that, at equilibrium, OD traffic will flow only on the shortest time paths. The shortest path idea generalizes directly to a cost-of-travel function that may combine any number of factors, such as distance, time, fuel cost, etc.

VI formulation of traffic equilibrium: Smith (1979), followed by Dafermos (1980), characterizes Wardrop's traffic equilibrium as a variational inequality, as follows. Let denote the unit or marginal travel time, or duration for short, of flow on link *a* given a feasible flow vector *f*. An equilibrium is a feasible flow *f*^{*} such that for any other feasible flow *f*,(2.1)We denote this problem by VI, where *F* is the set of feasible flows and *d* is the vector over links of travel durations, . Formally, VI is the problem of finding such that for all .

MCP formulation of traffic equilibrium: we express *F* as a polyhedral set of commodity link flows *f* that can be formulated as(2.2)where denotes the cardinality of ; is the node–arc incidence matrix, whose entry in row *k* and column is 1 if *j*=*k*, −1 if *i*=*k* and 0 otherwise; and consists of net inflows at each node, i.e. its *i*th component is if *i*=*k*, if *i*=*l* and zero otherwise. We introduce a KKT or dual multiplier where each corresponds to the OD flow balance equation . The dual condition that characterizes *f*^{*} as a solution of VI is the existence of a multicommodity flow vector *x*^{*} and dual multiplier *μ* such that (2.2) holds when , , and1(2.3)That is, the MCP formed by (2.2) and (2.3) characterizes a Wardrop equilibrium. Facchinei & Pang (2003) give details on the relationship between VIs and MCPs.

Existence of traffic equilibria: provided *F* is convex, compact and non-empty, and *d* is continuous on *F*, it follows from Brouwer's fixed point theorem that VI has a solution (Facchinei & Pang 2003), hence that the associated KKT condition also has a solution. It is clear that *F* is closed and convex. Note that *F* is bounded since, for each , the flow on any path from *k* to *l* is non-negative and bounded above by ; thus each link flow *f*_{a} lies between 0 and . *F* is also non-empty if there are no additional constraints beyond the feasibility of the subordinate OD flows, since the network is connected. These properties and hence existence of a traffic equilibrium are preserved by many additional constraints such as sufficiently large upper bounds *u*_{a} on total flow on link *a*. Uniqueness of the equilibrium *f*^{*} can also be shown in many circumstances (Patriksson 1994; Facchinei & Pang 2003). The first existence and uniqueness proof of Wardrop equilibria are via an entirely different route, however, namely the equivalent optimization reformulation of Beckmann *et al*. (1956). This equivalence relies on a symmetry condition on the link cost functions that holds trivially if the cost of flow on link *a* depends only on flow on that link.

### (b) Toll design in traffic networks

Think of decreasing total travel time in a road traffic network by choosing an appropriate cordon charging scheme around a city, of which London's cordon charging scheme is an example. The currency and challenges of road tolling in practice are well described in Kelly (2006).

Some motivation for tolling schemes comes from the observation that the maximum welfare attainable under a centralized planner (benevolent dictator), who prescribes OD routes to all road users, can be achieved by a traffic equilibrium under a tolling scheme where there is a possibly different marginal toll cost on every link. This result assumes that all users view the trade-off between time and toll cost in the same way, i.e. share the same time value of money, an assumption that will be preserved below in order to conserve notation. Accommodating user classes each with a distinct time value of money is possible by introducing further commodities into the network flow model.

Of course tolling every link would be difficult, more so for link-dependent toll levels. Cordons reduce the complexity of the toll scheme by setting a single toll price throughout the cordon and a zero toll elsewhere. A cordon is a list of nodes that form a ring, around what is deemed to be the congested area, via their connecting links. For simplicity, we will not model this here—see Sumalee (2004*a*,*b*) for a graph-theoretic description that also converts to a computational description—but consider the related problem Sumalee (2004*a*) of choosing a subset of all links and a single toll charge *τ* applied to each unit of traffic on those links that constitute the network design variables, in order to maximize a system utility function.

Suppose travel along link *a* is tolled at £*τ* per unit flow. We model the user cost function of using link *a* as , where is link duration, as above, and scales time to money. Given the arc flows *f*, list of toll arcs and toll level *τ*, defineand . The system utility associated with any flow *f* is the total benefit, , less total cost,2 . The value that road users accrue from travel has been omitted from the system utility since, in this simple setting, demand is assumed to be a fixed vector *q*, thus is unaffected by the choice of or *τ*. Since toll revenues and toll costs net to zero, maximizing system utility is equivalent to minimizing total user travel time

The combinatorial constraint makes this problem even more difficult than a standard MPEC whose variables are continuous. Nevertheless, genetic algorithms appear to be an effective heuristic for finding good toll designs (Sumalee 2004*a*,*b*).

To pose the toll design problem as an MPCC, observe that the MCP corresponding to the equilibrium constraint is (2.2) and (2.3) with *d*_{a} replaced by in the latter.

### (c) OD demand estimation, an inverse traffic equilibrium problem

Consider the forward traffic equilibrium model given above where the OD demand vector is unknown, but we believe it lies in a closed convex set *Q* that is based on values for OD demands that are known from a previous period. Suppose further that we have measured arc flows in the current period, for *a* in a subset of , where traffic flow is assumed to be at equilibrium. We now wish to calibrate the network model by estimating OD demands *q* for the current period. This is a well-studied problem, e.g. Chen & Florian (1998) and references therein.

A least-squares version of the calibration problem isThis MPEC can be written as an MPCC, assuming *Q* is specified by finitely many smooth equalities and inequalities, by using (2.2) and (2.3) to represent the equilibrium constraint.

## 3. Congestion control in telecommunication networks and related MPCCs

We begin with a standard equilibrium model of the transmission control protocol (TCP) in a communication network like the Internet, by Kelly *et al*. (1998), and then consider some related design and inverse MPECs and MPCCs.

### (a) Rate flow equilibria for TCP

The following TCP model is reworked from Kelly (2003, 2008). Again we have a connected network consisting of *N* nodes, a set of arcs and a set of OD pairs . Path flows are intrinsic because, for a given OD pair , the set of admissible routes from *k* to *l*, or routes that serve *κ*, is given by a routing protocol that is exogenous to TCP. In particular, neither the shortest path aspect of Wardrop equilibria nor the remodelling path flows by link flows in its VI and MCP formulations can be readily applied to TCP.

Forward model of TCP equilibria: TCP is used at each node to determine the rate at which packets should be sent along outgoing routes. It responds to the level of unreliability of the network, due to congestion, which is manifested through loss of data packets. Let denote the set of all routes used in the network. For each route *r*, TCP maintains a variable `cwnd`_{r}<0 that is the size of the so-called congestion window. For every positive acknowledgement of a packet sent on *r*, `cwnd`_{r} is increased by *a*`cwnd`_{r}^{α} and, for every lost packet, `cwnd`_{r} is decreased by *b*`cwnd`_{r}^{β}. The parameters *a*, *b*, *α* and *β* are independent of the route, with and . (In Kelly (2008), *a*=1, , and .) Let be the probability that a packet is lost on *r*, hence the average change in `cwnd`_{r} per packet sent is . When network flows are in equilibrium, the average change is zero and we derive(3.1)

To understand the intrinsic meaning of *p*_{r} rather than the equilibrium identity (3.1), we look to the probability of packet loss on link *a* which, in turn, is related to the link capacity. The capacity of link *a* is on the rate of flow, rather than the number of packets. Suppose the round trip time of sending out a packet on a route *r* and receiving a confirmation is roughly constant, denoted by , that is, congestion causes packet loss rather than delay. Then the flow rate on route *r* is roughly(3.2)and the total flow rate on link *a* is the sum of over routes *r* containing *a*. The latter can be written *Bz* where and is the link-route incidence matrix: equals 1 if link *a* lies in *r* and 0 otherwise. Thus, link capacity bounds are given by .

Returning to the likelihood of lost packets on link *a*, if the associated flow rate is strictly below then is deemed to be zero, while if the flow rate hits then may be positive. This is a complementarity relationship(3.3)Moreover, is the product of probabilities over all links *a* in *r*, which can be written as . We approximate this product by , assuming each is small, thus We can express this more succinctly as(3.4)

MCP formulation of TCP equilibria: let us replace `cwnd`_{r} in (3.1) by , from (3.2), and explicitly state the requirement Assuming that each at equilibrium, there is no loss of generality in rewriting these conditions as a complementarity relationwhere is the function(3.5)Now eliminate *p* using (3.4), to form the MCP in which (3.3) is repeated(3.6)

VI formulation and existence of TCP equilibria: writing , then (Facchinei & Pang 2003) the MCP (3.6) is equivalent to VI. Note that the link-loss probability vector *π* is now implicit in the VI: it is the KKT multiplier or shadow cost of the link-capacity constraint . We have assumed above, which implies *ϕ* is continuous on *Z*. Also, given that *B* is the link-arc incidence matrix of a connected network, *Z* is non-empty, compact and convex. Thus, there exists a solution *z* of VI, i.e. flow rate equilibrium for TCP. We have also assumed ; hence, from (3.5), each component function depends only on and is strictly decreasing in . Thus, trivially, is strictly monotone3 and uniqueness of the equilibrium follows Facchinei & Pang (2003).

In fact, existence is usually shown (Kelly *et al*. 1998; Kelly 2003, 2008) via an optimization formulation of the equilibrium model, whose KKT conditions are (3.6). This optimization formulation is identified as the direct analogue for TCP of the formulation of Wardrop traffic equilibrium by Beckmann *et al*. (1956). We make further comments on VI versus optimization models in §5.

### (b) A design question for congestion control

Here and in §3*c* we speculate on the application of MPECs and MPCCs to congestion control in communication networks.

Suppose we want to improve performance of the network. One performance metric is throughput . Another is the average reliability that can be modelled as 1 less than the probability of losing a packet anywhere in the network, the latter probability being . There are several others studied in the literature, including fairness, stability, avoidance of trip time bias and so on (Kelly *et al*. 1998; Kelly 2003, 2008). These can be accounted for in many ways, e.g. by maximizing a weighted sum of different performance objectives or including, in the constraints, bounds on performance metrics.

As a concrete example, suppose we want to improve throughput without sacrificing reliability of any link. This question is studied in the recent work of Gu *et al*. (2007) for small buffer networks in which throughput and link speed is increasing, where an ‘evolutionary’ version of TCP is proposed to prevent losses going to 100% as sending rates increase. Here, however, we model this as an MPCC. Let us set an upper bound on probability of packet loss and insist that(3.7)An optimal design MPCC would be to maximize over upper level variables and lower level variables subject to linear constraints on the former and the following requirements on the latter: *z* and satisfy the complementarity constraints (3.6), *p* is given by (3.4) and satisfies the reliability constraints (3.7).

Looking beyond TCP, there are many layers in a communication network (Lin *et al*. 2006; Chiang *et al*. 2007), e.g. route selection is a topic of its own. The design problem of optimizing route selection and TCP parameters simultaneously, both of which affect the TCP equilibrium, is an MPEC that mixes combinatorial variables with continuous ones. This approach is unconvincing, however, because no single entity controls route selection and TCP parameters for all OD pairs. A decomposition approach that might have more realism is a game between OD pairs, where each agent chooses its own routes and TCP settings to maximize a measure of its performance. This falls into the category of ‘equilibrium program with equilibrium constraints’.

### (c) Inverse problems in transmission control

We very briefly mention two possible inverse or calibration MPECs, both of which are attempts to identify user behaviour from limited datasets.

The first is the direct analogue of the OD demand problem in traffic equilibrium problems in which we estimate some unknown OD rate flows in the knowledge of link flow rates and perhaps some particular route flow rates.

The second is to identify how aggressively users push packets through the network where users are represented as OD pairs. An aggressive (passive) user sets large (small) relative to . Suppose OD route flows are known, then we can estimate user aggression via an MPEC in which the TCP parameters for each OD pair, or a subset, are the upper level variables. These can be chosen differently for each OD pair subject to specified parameter constraints for that OD pair. The lower level variables would be *z* and *π* and would have to satisfy the MCP formulation (3.6) of TCP equilibrium. (It should be noted that unique existence of a TCP equilibrium does not rely on the TCP parameters being identical for all users.) The objective function would be the least-squares difference between the OD route flows that are given as data and those which solve the lower level MCP.

## 4. Nonlinear programming approaches to solving MPCCs

Perhaps the main reason for the popularity of the MPCC format in representing MPECs and bi-level programs is that it can, at least formally, be tackled by nonlinear programming techniques. Here, we briefly review four NLP approaches to finding local solutions, more precisely stationary points, of MPCCs.

A word of warning is that the naive application of NLP ideas to MPCCs is often problematic. For example, the NLP formulation (1.3) is unstable in that perturbing its last constraint to makes the problem infeasible for arbitrarily small . As seen in Jiang & Ralph (1999) and Scholtes (2001) and elsewhere, solving MPCCs with general NLP software often results in slow or premature convergence. Even when an NLP method seems to work well, it may only be able to identify a stationary point that may be a local minimum in many circumstances but need not be a global solution of the problem.

In §4*a*,*b* we present two closely related methods, the regularization (Scholtes 2001) and complementarity–penalty (Ralph & Wright 2004) methods, which use the sequential NLP framework. Here, an MPCC is approximated by a family of NLPs, each of which depends on a single parameter and has better posed (more numerically stable) constraints than (1.3). The idea is to solve a sequence of NLPs as in the hope of identifying a stationary point of the MPCC. In §4*c* we sketch a penalty interior-point method (Leyffer *et al*. 2007). This embeds traditional interior-point ideas that execute only one linear system solve per iteration, into the complementarity–penalty framework. Finally, §4*d* looks briefly at the classical sequential quadratic programming (SQP) method. This has been applied directly to NLP formulations of MPCCs similar to (1.3) with surprising success, see Fletcher & Leyffer (2004) and Fletcher *et al*. (2006) as well as Anitescu (2005).

The ordering of this material shows the historical progression in the application of NLP ideas to solving MPCCs. Sequential NLP methods rely on the repeated application of a black-box NLP solver, whereas the interior-point approach adapts an NLP algorithm to handle MPCCs. Most conveniently, however, SQP is a standard NLP method that is applied, again as a black box but now directly, to an equivalent NLP formulation. See Ralph (2008) for further details and discussion.

Note that, for computation, Dirkse & Ferris (1999) give a modelling framework for the general algebraic modelling system and Matlab environments, which allows easy testing of many reformulations of an MPCC, including regularization, complementarity–penalty, smoothing and direct NLP methods.

### (a) Regularization

Regularization (Scholtes 2001), also called relaxation, is based on the following NLP in which is a parameter:Reg(ϵ)where * is the Hadamard or componentwise product, so is the vector with *i*th component ; and . When , the constraints of this problem are usually much better behaved numerically than those of the NLP formulation (1.3), e.g. feasibility may persist under small data perturbations. When , the regularized model is, clearly, equivalent to the MPCC (1.2).

The idea is to find a local minimum of for each *k*, where . Now suppose is a limit point of the sequence . Then is feasible for and hence for the MPCC. Scholtes (2001) provides technical conditions that are sufficient for to be stationary for . More precisely, theorem 3.3 of Scholtes (2001) gives sufficient conditions for to be a so-called strongly stationary point of the MPCC. The equivalence between strong stationarity of (1.2) and usual stationarity of the NLP —via Lagrange or KKT multipliers, see Nocedal & Wright (1999)—is given by Anitescu (2005).

Computational results in Scholtes (2001) indicate that regularization can be more effective than direct application of a solver to the naive NLP formulation (1.3).

### (b) The complementarity–penalty method

We make a quick comparison of regularization with the complementarity–penalty approach, in which the troublesome constraint in the NLP (1.3) is omitted at the cost of adding a penalty term to the objective,Pen(ϵ)

As in regularization, the idea is to find a local minimum or stationary point of , where . Suppose is a limit point of the sequence . Then satisfies the constraints of the MPCC with the possible exception of orthogonality between and . If orthogonality is achieved in the limit, then the convergence results of the complementarity–penalty method (Hu & Ralph 2004) mirror those of the regularization method given above.

A surprise, however, is that the complementarity–penalty method exhibits a kind of ‘exactness’. That is, it may be possible to solve an MPCC, exactly, by solving the complementarity–penalty problem for any small enough (Ralph & Wright 2004, proposition 5.3). What makes this surprising is that when NLPs that are reformulated via smooth penalty terms, a solution of the penalty problem is typically not feasible (let alone stationary) for the original problem except as the penalty parameter goes to its limit.

### (c) Penalty–interior method

The penalty–interior method of Leyffer *et al*. (2007) is an interior-point method based on the complementarity–penalty version of sequential NLP. However it solves only one linear system per iteration, as do standard interior-point methods, rather than solving an NLP to optimality. The standard primal–dual interior-point framework (Wright 1997) is adapted for this purpose. We simplify the setting of Leyffer *et al*. (2007) by assuming that each constraint function *g*, *r* and *s* is linear (or affine).

The method uses two penalty parameters: to penalize deviation of and from orthogonality, and to enforce non-negativity of all inequality constraints via a log-barrier penalty function. The penalty–interior formulation is

Bar(ϵ,μ)This is an unconstrained problem, although the domain of the objective function requires strict positivity of the vectors *r*(*x*) and *s*(*x*).

Sequential NLP provides motivation, namely to find a solution of for positive where . A solution of the barrier problem is taken to be a stationary point, i.e. a point at which the gradient of the barrier function is zero. This stationary condition defines a nonlinear system of equations that can be solved by Newton's method, with a safeguard on the step length to prevent iterates from violating positivity of constraint values. The penalty–interior method reduces the computational demand of sequential NLP by only approximately solving , while increasing the solution accuracy as *k* increases.

The most basic requirement for convergence is that all parameters (penalty parameters and solution accuracy tolerances) approach zero as . Global convergence analysis, i.e. properties of limit points of , is similar to that of the regularization method. (See Leyffer *et al*. 2007 for details.)

More interestingly, in many circumstances (Leyffer *et al*. 2007, theorem 3.4), it is possible to arrange the updates of the parameters such that only one step of Newton's method (one linear system solve) is required per iteration; becomes fixed after a finite number of iterations, i.e. ; and the sequence of iterates converges *superlinearly* to a solution of the MPCC. Superlinear convergence means . This is very useful in practice because the last few iterations dramatically improve the accuracy of iterates, which confirms that the method is truly converging to a solution.

The algorithm's commendable numerical behaviour, of one linear solve per iteration resulting in superlinear convergence, is often seen in computation according to the experience of Leyffer *et al*. (2007). While conditions for superlinear convergence of interior-point methods in solving convex programs are well understood, the fact that this is obtained for MPCCs can be credited to the exactness property of the penalty–complementarity formulation (which is apparent in the fact that need not be driven to zero).

### (d) Sequential quadratic programming

SQP is a variant of Newton's method designed for constrained NLPs (Nocedal & Wright 1999). At each iteration, it solves a quadratic program (rather than solving a linear system of equations as Newton's method would do). It is considered an efficient general method for solving constrained optimization problems and its iterates often converge superlinearly. The usual convergence analysis does not apply to MPCCs, however, due to their numerical instability. Given this and the numerical difficulties associated with the NLP formulation (1.3), it was thus quite a surprise when Fletcher & Leyffer (2004) applied their SQP code SQP-Filter to a suite of test MPCCs and found that the method usually converged superlinearly to a local solution. This was perhaps the first time superlinear convergence was observed for a standard NLP method applied directly to an equivalent NLP reformulation of an MPCC.

Fletcher *et al*. (2006) explain the local superlinear convergence of SQP applied to NLP reformulations of MPCCs, as does Anitescu (2005) for the SQP variant used in the `SNOPT` code of Gill *et al*. (2002). The practical application of NLP codes to MPCCs has progressed from small- or medium-size problems to more demanding applications such as a large-scale nonlinear MPCC model of an electricity market network with 20 000 variables and 10 000 constraints (Chen *et al*. 2006).

## 5. Concluding remarks

We conclude with brief remarks on the different philosophies that underlie research in traffic and communication network equilibria.

Though VI and MCP models of equilibria are emphasized above, convex optimization models are among the foundation stones of research on network equilibria. The seminal optimization approaches of Beckmann *et al*. (1956) for Wardrop equilibria and Kelly *et al*. (1998) for TCP equilibria, which are brought together in Kelly & Voice (2005), underline this point. This view extends readily to the much more general setting of multi-layered communication networks as surveyed by Lin *et al*. (2006) and Chiang *et al*. (2007).

Theory for convex optimization is deep and elegant, including duality which has been useful in understanding and designing decomposition schemes that account for the heterogeneous nature of network users and processes. MPECs by contrast are unavoidably non-convex optimization problems.

This disjunction is also related to differences in research directions taken in traffic networks versus communication networks. In particular, the computational and algorithmic literature on bi-level optimization (including MPECs) over traffic networks (e.g. Marcotte 1986; Chen & Florian 1998; Smith 2004; Sumalee 2004*a*,*b* and references therein) appears not to be paralleled in communication networks. It seems that MPECs have become important in applications for which static equilibria capture most of the system information needed for design. By contrast, dynamics of network flows are central to communications networks where protocols like TCP are control mechanisms, so that issues like stability (Kelly *et al*. 1998; Kelly 2003, 2008; Kelly & Voice 2005) are critical.

## Acknowledgments

The author gratefully acknowledges the input of Frank Kelly, Gaurav Raina, Mike Smith and Agachai Sumalee on the literature of network equilibria, and Michael Ferris, Jong-Shi Pang and Stefan Scholtes on technical developments for MPECs.

## Footnotes

One contribution of 16 to a Discussion Meeting Issue ‘Networks: modelling and control’.

↵More traditionally, non-negative multipliers associated with the inequalities would also appear in the complementarity formulation. These, however, can be eliminated to give (2.3).

↵This is simply an economic definition. There is no assumption that the toll revenues are somehow injected back into the traffic system.

↵−

*ϕ*(*z*) is strictly monotone on*Z*if for any distinct .- © 2008 The Royal Society