## Abstract

We defend pluralism in mathematics, and in particular Errett Bishop's constructive approach to mathematics, on pragmatic grounds, avoiding the philosophical issues which have dissuaded many mathematicians from taking it seriously. We also explain the computational value of interval arithmetic.

## 1. Introduction

Errett Bishop's book ‘Foundations of Constructive Analysis’ appeared in 1967 and started a new era in the development of constructive mathematics. His account of the subject was entirely different from, and far more systematic than, Brouwer's programme of intuitionistic mathematics. The latter attracted a few adherents in the 1920s and 1930s, but was widely rejected because of its conflicts with the dominant classical view of the subject.

Unfortunately, Bishop's book was ignored by most mathematicians, who assumed that the issues involved had all been settled, and that he could not have anything interesting to say. My task in this meeting is to try to persuade you that his programme provides valuable insights into matters which should be of concern to anyone who has even indirect involvement in computation.

In this paper I will *not* discuss the philosophical issues relating to Bishop's work, which are treated at some length in Billinge (2003) and Davies (2004), beyond saying that one can admire his mathematical contributions without adopting his philosophical position. Briefly, I defend what I call pluralism in mathematics—the view that classical mathematics, constructive mathematics, computer assisted mathematics and various forms of finitistic mathematics can coexist. I revive Carnap's dictum that one must decide the framework of discourse *before* questions about existence and truth make sense; see Carnap (1950). In different frameworks the answer to a question may be different, but this in no way implies that one or the other is ‘right’. This position is anti-Platonistic.

From chapter 2 onwards Bishop (1967) is completely recognizable as rigorous pure mathematics. Many well-known theorems appear, sometimes in forms which are not the usual ones, although trivially equivalent to them from a classical point of view. A few theorems are simply absent. The value of Bishop's efforts may not be immediately clear to everyone, in spite of what he writes in his first chapter. I will show that the differences between classical and constructive mathematics always correspond to situations in which real difficulties arise in numerical computation for suitable examples. Classical mathematics may provide a more flexible context for proving the existence of entities, but constructive mathematics provides a systematic approach to understanding why computational solutions of problems are sometimes not easy to obtain.

This is not to say that constructive mathematics is simply a part of numerical analysis. Numerical analysts have more problems than do constructivists. Bishop (1967, p. 140) gives a straightforward proof of the fundamental theorem of algebra, with one small but interesting proviso, even though it is known that the zeros of a polynomial of moderate degree (say at least 50) are highly unstable with respect to very small changes in the coefficients. Nevertheless, one can gain many insights into the differences between classical and constructive mathematics by considering numerical examples.

## 2. What is constructive mathematics?

It has often been said that Bishop rejected the law of the excluded middle, but a more useful description of the situation is that he gave the symbol ∃ a different meaning from the usual one. In classical mathematics ∃ refers to Platonic existence, but Bishop used it to refer to the production of an algorithm for constructing the relevant quantity. In classical mathematics ∃ may be defined in terms of ∀: the expression ∃*x* *A* is logically equivalent to ¬(∀*x*¬*A*). In constructive mathematics, ∃ is a new quantifier with stricter conditions for its application. All of the differences between classical and constructive mathematics follow from the new meaning assigned to the symbol. We wish to emphasize that every theorem in Bishop's constructive mathematics is also a theorem in classical mathematics. Constructive mathematicians have to work harder to prove theorems, because their criteria for existence are stricter; the pay-off is that the statements of the theorems contain more information.

Starting from the natural numbers, Bishop constructed the real number system and established many of its familiar properties. However, in his approach, one cannot assert for every real *x* that either *x*=0 or *x*≠0. Nor is it the case that every bounded set of real numbers has a least upper bound. The reason for this is best illustrated with an example. For each positive integer *n*, we define the number *a*_{n} to be 1 if the *n*th digit in the decimal expansion of *π* is the start of a sequence of a thousand consecutive sevens; otherwise we put *a*_{n}=0. For each *n*, the value of *a*_{n} can be determined within a finite length of time. However, the least upper bound *A* of the sequence is currently unknown. Platonically speaking either *A*=0 or *A*=1, even if we do not know which is the case, but Bishop would say that such a statement has no content: it simply reformulates the question in a trivial manner.
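The sequence *a*_{n} is genuinely algorithmic: each term is decided by inspecting finitely many digits of *π*. The following Python sketch is illustrative and not from the paper: the run length is cut from a thousand sevens to two so that a short hard-coded prefix of *π* suffices, and the names `PI_DIGITS`, `RUN` and `a` are our own, as is the convention of counting the leading 3 as the first digit.

```python
# Illustrative sketch of the sequence a_n from the text, with the run of a
# thousand consecutive sevens replaced by RUN = 2 so that a hard-coded
# prefix of pi suffices.  All names and conventions here are ours.
PI_DIGITS = (
    "314159265358979323846"
    "264338327950288419716939937510"
    "58209749445923078164062862089986280348253421170679"
)  # first 101 digits of pi, including the leading 3

RUN = 2  # stand-in for 1000

def a(n):
    """Return 1 if the n-th digit starts a run of RUN consecutive 7s, else 0."""
    return 1 if PI_DIGITS[n - 1:n - 1 + RUN] == "7" * RUN else 0

# Each a(n) is decidable in a finite length of time, even though the least
# upper bound of the sequence is not constructively known for the
# original run length of a thousand.
values = [a(n) for n in range(1, len(PI_DIGITS) - RUN + 2)]
print(values.count(1))
```

For this short prefix no run of two sevens occurs, so the count printed is zero; the point is only that each individual term is computed in finite time.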

If we put *s*=∑_{n=1}^{∞}(−1)^{n}*a*_{n}2^{−n}, then *s* exists constructively, because it can be calculated with any assigned accuracy in a finite length of time. However, we do not know whether or not *s*=0. Even if it were eventually proved that every sequence of digits occurs somewhere in the decimal expansion of *π*, so that *s*≠0, whether *s* is positive or negative seems to be far beyond reach. The development of constructive mathematics has to reflect the existence of an infinite number of questions of the same type.

It is very surprising that Bishop could develop so much of analysis without using the least upper bound principle or the law of the excluded middle. His book contains fully worked out accounts of the theory of Banach spaces, the spectral theorem, integration theory and Haar measure, among other topics. Set theory is substantially different from the classical version, particularly when considering the complement of a subset. Point set topology is not easy to develop (see Bridges & Vîţă 2003 for one approach to this), but Bishop's book contains a rich constructive version of the theory of metric spaces. Compactness is defined in Bishop (1967, p. 88) using the notion of total boundedness, which is the way compactness is often proved in applications. This is not a coincidence.

## 3. What significance do the differences have?

My goal in this section is to explain why a classical mathematician (and I emphasize that I am one) might be interested in Bishop's programme. My thesis is that by adopting the constructive framework one can handle certain numerical problems on a systematic basis, whereas classical mathematicians have to deal with them piecemeal, and try to remember whether or when they are using results which are not computationally feasible. I make no claim that constructive mathematics is superior to classical mathematics in all contexts, but only that it sometimes provides illuminating insights.

Producing examples to demonstrate the differences between classical and constructive mathematics often exploits the difference between recursive and recursively enumerable subsets of **N**. Another method is to define a sequence whose behaviour as *n*→∞ depends upon whether some famous conjecture is true or false.¹ We adopt a third strategy, showing that the impossibility of proving something in constructive mathematics is regularly *associated* with the extreme difficulty of showing it numerically for quite ordinary functions. We emphasize that the functions considered below do not, strictly speaking, provide examples of the constructive phenomenon, but we feel that in spite of this they explain why the constructive phenomenon exists.

Let us start with the intermediate value theorem for continuous functions of a single real variable. Bishop (1967, p. 5) explains why this theorem cannot be proved in a constructive framework. In the context of constructive mathematics one cannot find a value of the variable *x* for which *f*(*x*)=*c* by the method of bisection, because being able to evaluate *f*(*x*)−*c* with arbitrary accuracy does not imply that one can determine whether it is positive or negative.

Slight modifications of the intermediate value theorem are, however, valid constructively. If *f* : [*a*, *b*] → **R** is continuous and *f*(*a*)<*c*<*f*(*b*) then, given *ϵ*>0, one can constructively find *x*∈(*a*, *b*) such that |*f*(*x*)−*c*|<*ϵ*. In addition the intermediate value theorem itself may be proved constructively under a mild extra condition on *f* (being locally non-constant), which is almost always satisfied. See Bishop & Bridges (1985) and Bridges (1998).
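The approximate theorem has a direct computational reading: one searches a sufficiently fine grid for a point where *f* comes within *ϵ* of *c*. The Python fragment below is our illustrative sketch, not from the text; the function *f* and the tolerance are our own choices, and termination of the search rests on the uniform continuity of *f* on [*a*, *b*].

```python
# A sketch of the constructive, approximate intermediate value theorem:
# instead of an exact solution of f(x) = c we find x with |f(x) - c| < eps.
# The function f and the tolerance are illustrative choices of ours.

def approx_ivt(f, a, b, c, eps):
    """Return x in [a, b] with |f(x) - c| < eps, assuming f is continuous
    on [a, b] and f(a) < c < f(b)."""
    n = 2
    while True:
        for i in range(n + 1):
            x = a + (b - a) * i / n
            if abs(f(x) - c) < eps:
                return x
        # No grid point is close enough yet: refine the grid.  Uniform
        # continuity of f on [a, b] guarantees that the loop terminates.
        n *= 2

f = lambda t: t**3 - t
x = approx_ivt(f, -2.0, 2.0, 0.5, 1e-3)
print(x, f(x))
```

Note that the search never has to decide the sign of *f*(*x*)−*c*, only whether its absolute value is small; this is exactly the point at which the constructive and classical statements diverge.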

We explain the differences between the classical and constructive versions of the intermediate value theorem by means of two examples, one ersatz and one genuine. Let *f* be the function defined by equation (3.1) on (−1, 1) (figure 1). *Given the formula*, it is evident that the only solution of *f*(*x*)=0 is *x*=0, but one would need to calculate to very high accuracy to determine from its numerical values that the function is not identically zero throughout the interval (−1/4, 1/4). However many digits one uses in the numerical computation, a similar difficulty arises if one replaces 45 by a larger number. In applied situations a closed formula is frequently not available, and the above problem may be a pressing one.

A genuine example presenting exactly the same difficulties is obtained as follows. Let *g* be a continuous function of the same general shape as *f*. Then one cannot constructively solve *g*(*x*)=*c* if *c* is an extremely small number for which one does not know whether *c*=0, *c*<0 or *c*>0.

In constructive mathematics every non-negative continuous function on a closed bounded interval has a non-negative infimum; Bishop (1967, p. 35) provides a procedure for computing this with arbitrary accuracy. This does not imply that one can always determine whether the minimum is zero or positive, nor does it imply that one can find a point at which the infimum is achieved. Both can be serious problems in numerical analysis as well. If *ϵ*>0 is sufficiently small, it is difficult to show *purely numerically* that the polynomial defined in equation (3.2) never vanishes, and also difficult to determine whether its minimum value occurs near *x*=*π* or *x*=−*π*. For functions arising in applied mathematics that are not given by explicit formulae this can again be a serious problem (figure 2).
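Bishop's procedure for the infimum can be imitated numerically. The sketch below is ours, not Bishop's: it assumes an explicit Lipschitz bound *L* for *f* (a crude stand-in for Bishop's modulus of continuity), and the function and tolerance are illustrative choices.

```python
# Computing the infimum of a continuous function on [a, b] to within eps,
# in the spirit of Bishop's procedure: sample on a grid fine enough that
# an assumed Lipschitz bound L controls the error.  The bound L, the
# function and the tolerance are illustrative choices of ours.

def approx_inf(f, a, b, L, eps):
    """Return m with |m - inf f| <= eps, assuming |f(x)-f(y)| <= L|x-y|."""
    n = max(1, int(L * (b - a) / eps) + 1)   # grid spacing <= eps / L
    return min(f(a + (b - a) * i / n) for i in range(n + 1))

m = approx_inf(lambda x: (x - 1.0)**2 + 0.25, 0.0, 2.0, L=2.0, eps=1e-4)
print(m)
```

The returned value approximates the true infimum 0.25 to within the stated tolerance; note that the procedure yields the *value* of the infimum, not a point at which it is attained, which matches the constructive statement.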

The suggestion that classical mathematics takes priority over constructive mathematics because of the Putnam–Quine argument about the indispensability of the former in the physical sciences is not convincing, for reasons spelled out in Davies (2003*a*,*b*). The differences between the two systems are unimportant from the point of view of physicists; this is why Newton, Laplace, Maxwell and other scientists were able to develop very successful mathematically based theories long before the real number system was formalized at the end of the nineteenth century.

Hellman has tried to put some flesh on the Putnam–Quine argument in several recent papers, in which he claims that there are no constructive versions of some key results in mathematical physics. We start with Hellman (1993*a*), which deals with Gleason's theorem, considered by some (but not the author of this paper) to be of considerable importance in the foundations of quantum theory. This concerns the (non-distributive) lattice 𝓛 of closed subspaces of a Hilbert space 𝓗 of dimension greater than two. In this lattice the analogues of set-theoretic complements are orthogonal complements. Gleason's theorem states that if *μ* is a normalized, countably additive measure on 𝓛 in a suitable sense, then there exists a non-negative, self-adjoint operator *S* on 𝓗 with trace 1 such that *μ*(*L*)=trace(*SP*_{L}) for all *L*∈𝓛, where *P*_{L} is the orthogonal projection with range *L*. Hellman showed that a different version of Gleason's theorem cannot be proved in constructive mathematics. Nevertheless, Gleason's original version of the theorem, stated above, is constructively valid; see Richman & Bridges (1999) and Richman (2000). The difference between the two versions relates to the validity of the principal axes theorem, discussed below.

In Hellman (1993*b*), the author showed that one version of the spectral theorem for unbounded self-adjoint operators is not constructively valid. However, in his original book Bishop (1967, p. 275) had already proved a different version, for a commuting sequence of bounded self-adjoint operators, which is completely acceptable even to classical mathematicians. After Hellman's paper appeared, Ye (1998) published a constructive version of the spectral theorem for an unbounded self-adjoint operator. Hellman's focus on the issue of domains and unboundedness is misguided, because an unbounded operator *A* becomes bounded as soon as one makes its domain into a Banach space by endowing it with the graph norm ‖*x*‖_{A}=‖*x*‖+‖*Ax*‖. The difficulty of determining whether a vector lies in the domain of an unbounded self-adjoint operator is not just a problem for constructivists. The classical theory progressed much more rapidly after it was realized that it was sufficient to specify an explicit domain of essential self-adjointness (a so-called ‘core’ of the operator) or even a core for the associated quadratic form; see Davies (1980). A considerable amount is now known about the spectral theory of whole classes of differential operators whose domains cannot be identified as standard function spaces; see, for example, Davies (1989).

The principal axes theorem is the classical result that every self-adjoint matrix has a complete orthonormal set of eigenvectors. The theorem is not constructively valid, for good reasons: the eigenvectors may change extremely rapidly as a parameter passes smoothly through a critical value. We invite the reader to check this for an elementary example: a 2×2 self-adjoint matrix whose off-diagonal entries equal *ϵ*=10^{−100}, as a parameter *s* in the diagonal entries passes through the value 0. The classical version of the spectral theorem provides no insight into the existence of these computational problems, and does not suggest how they might be overcome. One *can* understand such problems in classical terms, but the constructive approach provides a systematic framework for doing so.
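The instability is easy to exhibit numerically. The matrix below is our illustrative choice of such an example (with *ϵ* reduced from 10^{−100} to 10^{−8} so that double-precision arithmetic can still resolve it); numpy is assumed available.

```python
# Eigenvectors of a nearly degenerate self-adjoint matrix change violently
# as a parameter s passes through 0.  This 2x2 matrix is our illustrative
# choice of such an example, with eps = 1e-8 rather than 1e-100 so that
# double-precision arithmetic can resolve it.
import numpy as np

eps = 1e-8

def top_eigenvector(s):
    A = np.array([[s, eps], [eps, -s]])
    w, v = np.linalg.eigh(A)   # eigenvalues ascending; columns are vectors
    return v[:, 1]             # eigenvector of the larger eigenvalue

v_minus = top_eigenvector(-1e-6)
v_plus = top_eigenvector(+1e-6)
# The two unit eigenvectors are nearly orthogonal, although s has moved
# only from -1e-6 to +1e-6 and the matrix entries have barely changed.
print(abs(np.dot(v_minus, v_plus)))
```

The printed overlap is of order 10^{−2}: a change of 2×10^{−6} in the parameter has rotated the eigenvector by almost a right angle, which is the phenomenon blocking a constructive proof of the theorem.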

Finally in Hellman (1998) the author showed that the Hawking–Penrose singularity theorem is not constructively valid. This is an interesting observation, since the theorem has been extremely influential in the subject. It is embarrassing for a radical constructivist but not for a pluralist. It remains extremely hard to say much about the nature of the singularities: one certainly cannot identify them as black holes without the benefit of a ‘cosmic censorship hypothesis’. It is very likely that any detailed classification of the singularities will also be constructively valid.

The constructive version of the Hahn–Banach theorem in Bishop (1967, p. 263) applies to normable linear functionals, and Bishop needs to distinguish these from the more general bounded linear functionals. The following example explains why. Take the sequence *a* defined above using the digits of *π* and put *b*_{n}=1 if *a*_{n}=1 and also *a*_{m}=0 for all *m*<*n*; otherwise put *b*_{n}=0. The sequence *b* identifies the first occurrence of a thousand consecutive sevens in the decimal expansion of *π*, if such a sequence exists. Classically *b*∈*l*^{2}(**N**), but constructively we cannot assert this (in 2004), because we are not able to evaluate ‖*b*‖_{2} with arbitrary accuracy. Even if we were assured that a sequence of a thousand consecutive sevens existed, we would still not be able to place *b* in *l*^{2}(**N**) constructively unless we were given some information about how large *n* had to be for *b*_{n}=1.² Nevertheless the formula *ϕ*(*c*)=∑_{n=1}^{∞}*b*_{n}*c*_{n} is constructively well defined for all *c*∈*l*^{2}(**N**) and defines a bounded linear functional *ϕ* on *l*^{2}(**N**).

A linear operator *A* on a Banach space 𝓑 is said to be bounded if there exists a constant *c* such that ‖*Ax*‖≤*c*‖*x*‖ for all *x*∈𝓑. Its norm is the smallest such constant, if that exists. The constructive distinction between bounded and normable operators is related to the fact that there is no effective classical algorithm for determining the norms of bounded operators on most Banach spaces. This is why finding the best constants for various Sobolev embeddings in *L*^{p} spaces and other operators of importance in Fourier analysis has occupied mathematicians for decades. The same problem occurs for large finite matrices—standard software packages only provide routines for computing the norm of a sufficiently large *n*×*n* matrix with respect to the *l*^{p} norm on **C**^{n} for *p*=1, 2, ∞.
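The restriction to *p*=1, 2, ∞ is visible in standard software. Here is a small illustration using numpy, with an example matrix of our own choosing; for other values of *p* no comparable routine is provided.

```python
# Standard packages compute the norm of a matrix acting on l^p only for
# p = 1, 2, infinity; numpy illustrates the three tractable cases on an
# example matrix of our own choosing.
import numpy as np

A = np.array([[1.0, 2.0], [-3.0, 0.5]])

norm_1 = np.linalg.norm(A, 1)         # max column sum of absolute entries
norm_inf = np.linalg.norm(A, np.inf)  # max row sum of absolute entries
norm_2 = np.linalg.norm(A, 2)         # largest singular value

print(norm_1, norm_inf, norm_2)
```

The *p*=1 and *p*=∞ cases reduce to simple maxima over columns and rows, and *p*=2 to a singular value computation; it is the absence of any such formula for general *p* that reflects the constructive distinction in the text.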

## 4. Computer-assisted mathematics

Over the last 30 years a number of famous problems in pure mathematics have been solved by methods that conform entirely to the standards of classical mathematics, except for the fact that they involve a large amount of computation.

Because it may not be well known to this audience, I will concentrate on the increasing use of controlled numerical methods to *prove completely rigorously* the existence of, and describe, the solutions of various problems in analysis. Interval arithmetic provides a new version of finite arithmetic. It has an ancient history, but is slowly becoming more important in connection with computer-assisted proofs of theorems in analysis. See Moore (1979), Markov & Okumura (1999), Plum (2001), Plum & Wieners (2002), Breuer *et al*. (2003) and Interval Computations Web-Site (2004) for some of the many applications. Its basic entities are intervals *u*=[*u*^{−}, *u*^{+}], which may be written in the form *x*±*e*, where one imposes some upper bound on the number of significant digits allowed. The interpretation of such an expression is as the interval [*x*−*e*, *x*+*e*], where, for example, *x*=1247.062 96 and *e* is the stated error bound, but the definitions of the basic operations of arithmetic on the entities do not depend logically upon this intuition, nor upon any commitment to the existence of the real number system. To add two entities one adds the two lower bounds and then rounds *down* to the prescribed number of digits, and also adds together the two upper bounds and rounds *up* to the prescribed number of digits. If one ignores the rounding procedure then *u*+*v*=*w*, where *w*^{−}=*u*^{−}+*v*^{−} and *w*^{+}=*u*^{+}+*v*^{+}.

The definition of multiplication is similar, but more complicated. One puts *uv*=*w*, where *w*^{−}=min{*u*^{−}*v*^{−}, *u*^{−}*v*^{+}, *u*^{+}*v*^{−}, *u*^{+}*v*^{+}} and *w*^{+}=max{*u*^{−}*v*^{−}, *u*^{−}*v*^{+}, *u*^{+}*v*^{−}, *u*^{+}*v*^{+}}.

One identifies an integer *n* with the interval [*n*, *n*] and writes *u*∼*v* if the two intervals overlap, i.e. if *u*^{−}≤*v*^{+} and *v*^{−}≤*u*^{+}. One puts *x*>0 if *x*^{−}>0, and *x*<0 if *x*^{+}<0; if neither of these holds then *x*∼0. One might define *π* in the system by 3.141 592 65±0.000 000 01, without commitment to the existence of an ‘exact’ value. In interval arithmetic *x*(*y*+*z*)⊆*xy*+*xz*, but the two are not in general equal. One needs to choose the right way of evaluating an expression to minimize the width of the interval produced.

In interval arithmetic the standard functions such as sin, log, etc. take intervals to intervals. The programming languages have to be fairly subtle to achieve this. For example, when obtaining sin([1.57, 1.58]) the programming language must take into account the fact that 1.57<*π*/2<1.58, so that the upper end of the resulting interval is 1. There is no requirement for the initial interval to be small. Thus sin([0, 10])=[−1, 1]. A systematic description of interval arithmetic has been completed, and programming languages using it are readily available. It allows a rigorous approach to global optimization and makes a small but vital contribution to the solution of certain well-known, nonlinear, elliptic, partial differential equations. See Plum (2001), Plum & Wieners (2002), Breuer *et al*. (2003) and references there. In these papers the authors start by searching for approximate solutions on an experimental basis. They then prove various analytical results which establish that one can use a contraction mapping principle to prove the existence of true solutions close to the approximate solutions, *provided* certain inequalities hold. Finally, they verify the inequalities rigorously using interval arithmetic.
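A toy model of these rules can be written in a few lines. The Python sketch below is ours: it omits the outward-rounding step that a real implementation must perform, and the class and variable names are illustrative. It exhibits the subdistributivity just mentioned, i.e. that evaluating *x*(*y*+*z*) can give a narrower interval than *xy*+*xz*.

```python
# A minimal interval-arithmetic sketch of our own (the outward-rounding
# steps described in the text are omitted).  It implements the addition
# and multiplication rules and shows that x*(y+z) can be a narrower
# interval than x*y + x*z, so the order of evaluation matters.
class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        # Add lower bounds and upper bounds separately.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # Take the min and max over the four endpoint products.
        ps = [self.lo * other.lo, self.lo * other.hi,
              self.hi * other.lo, self.hi * other.hi]
        return Interval(min(ps), max(ps))

    def width(self):
        return self.hi - self.lo

x = Interval(-1.0, 2.0)
y = Interval(1.0, 3.0)
z = Interval(-2.0, 1.0)

factored = x * (y + z)    # evaluate as x*(y+z)
expanded = x * y + x * z  # evaluate as x*y + x*z
print(factored.lo, factored.hi, expanded.lo, expanded.hi)
```

Here the factored evaluation yields [−4, 8] while the expanded one yields [−7, 8]: both enclose every true product, but the first is strictly narrower, which is why one must choose the form of an expression with care.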

It seems clear that this approach to nonlinear PDEs will expand rapidly over the coming decades. Those of us who feel uneasy about computer-assisted proofs will either have to come to terms with them, or assign an ever-increasing fraction of our subject to a new category. Computers are fallible in different ways from mathematicians, but both are subject to verification by a variety of methods. In the case of computer-assisted proofs these range from formal proofs of correctness to the writing of independent programs, which are then run on machines with different operating systems. Absolute certainty is a chimera in both contexts, as the classification of the finite simple groups and the solution of Kepler's sphere-packing problem have shown.

Journal editors are struggling to come to terms with this situation. In the author's opinion they should state in each problematical case exactly what degree of confidence they have in each part of a very complex proof. This will provide the best information for future generations to assess whether they wish to rely upon the result and if not, which parts need further attention. Programs should probably be archived for reference, if this is practically possible, but the checking of badly written programs will never be as convincing as the production of better ones.

## Discussion

D. B. A. Epstein (*Department of Mathematics, University of Warwick, UK*). We could use a ‘hollow exists’ symbol (a backwards capital E drawn in outline) for the classical ∃, and the ordinary ∃ for the constructive one.

E. B. Davies. Yes, suggestions of this type have been made. Your proposal, with its judgemental overtones, is more likely to be received well by a constructivist than by a classical mathematician! The problem with trying to amalgamate the two frameworks in this way is that once one gets more deeply into the two approaches, one finds them proceeding on divergent paths, particularly in subjects using set theory heavily, such as topology. My guess is that one would eventually have to use distinguishing fonts for so many concepts that the proposal would be counterproductive. Even if this were not the case, may I make an analogy with English and French? One could regard them as a single language in which words that we now regard as translations of each other are considered instead to be synonyms, possibly with different shades of meaning. Would this actually help anything?

D. B. A. Epstein. Moe Hirsch suggests a half-life for the truth of theorems, or rather for one's degree of belief in them. If a theorem has not been reproved or re-used, then one's degree of belief decreases and eventually vanishes.

E. B. Davies. This is a different issue, but nevertheless an important one. Mathematicians value theorems more highly if they have connections with other results than if they are totally isolated. In spite of their commitment to proofs being true or false as they stand, mathematicians appear to like theories that have highly redundant logical structures. It seems that they do not wholly trust their ability to avoid error when following a long logical argument—and there is quite a lot of historical evidence that they should not trust proofs that have not been confirmed by some independent evidence. Such a statement would not be regarded as controversial in any experimental subject, but many mathematicians do not like to admit that it also applies to their own subject.

J. M. Needham (*Department of Computer Science, University of Bath, UK*). How do you propose to run the classical world and the constructive world side by side?

E. B. Davies. I do not personally think this is a fundamental problem. When I play chess I manage to remember that its rules are different from those of checkers, and when I study vector spaces I remember which theorems only work in finite dimensions and which generalize to infinite dimensions. I manage to remember that the standard form of remainder in Taylor's theorem does not work for complex-valued functions. If one wants to remember another distinction then one can. Of course it is easier if one does this from the beginning rather than when one is older. A valuable guideline is that if one has a classical proof that provides a stable procedure for computing the answer, then it almost surely has a constructive analogue.

I would stress that the classical and constructive worlds are indeed different—one is not discussing ‘the real numbers’ and arguing about whether some proof about them is acceptable. One is studying two different entities that have similar but different properties. One might compare the integers as a classical mathematician thinks about them and the integers as treated in a typical computer program. In both cases addition leads to a definite result, but for very large integers the computer's output might be an infinity or error symbol. We consider that one system is right and the other is wrong, but that is because we consider that the computer is trying to implement our ideas and not fully succeeding in doing so. In the case of classical versus constructive mathematics no such judgement is possible.

E. B. Davies (*additional comment*). Those working in and promoting constructive mathematics are well accustomed to hearing comments to the effect that there is no evidence that it has any contributions to make to ‘real mathematics’. A few remarks are worth making about this. In some fields, such as algebraic geometry, this may well be true, but that does not mean that it is bound to be equally true in others. Nobody would (or should) claim that constructive analysis leads to the routine solution of difficult problems. Highly original ideas often enable one to solve problems that were previously intractable, and this will remain the case whether one uses classical or constructive methods. One of my goals was to persuade people to move beyond the commonplace view that classical mathematics is somehow ‘right’ and other approaches thereby ‘wrong’.

The areas in which constructive mathematics does provide valuable insights are those close to numerical analysis and other fields in which the existence of explicit estimates is of the essence. In these subjects a number of ordinary working mathematicians have found that an awareness of constructive mathematics helps them to understand better the nature of the problems that they are facing.

## Footnotes

One contribution of 13 to a Discussion Meeting Issue ‘The nature of mathematical proof’.

↵ If one uses the Goldbach conjecture, for example, then one puts *a*_{n}=0 if 2*n* may be written as the sum of two primes, and *a*_{n}=1 if it cannot.

↵ Taken literally this sentence is nonsensical. Classically, there is no problem in asserting that *b*∈*l*^{2}(**N**). Constructively, one could not be given an assurance about the existence of *n* for which *b*_{n}=1 without also being given the relevant information about how large *n* had to be. This is a familiar problem when one tries to compare two incommensurate frameworks.

- © 2005 The Royal Society