## Abstract

Wikipedia has claimed for over 3 years now that John von Neumann was the ‘first quantum Bayesian’. In context, this reads as stating that von Neumann inaugurated QBism, the approach to quantum theory promoted by Fuchs, Mermin and Schack. This essay explores how such a claim is, historically speaking, unsupported.

## 1. Introduction

The Wikipedia article on Quantum Bayesianism has claimed since April 2012 that John von Neumann was the ‘first quantum Bayesian’ [1]. To a reader acquainted with quantum foundations and the history of quantum theory, this is a strikingly odd assertion. This note explains why the claim is incorrect and explores how it came to be made.

A ‘Quantum Bayesian’ is one who interprets the probabilities arising in quantum physics according to some variety of the Bayesian view of probabilities. Given the profusion of schools of thought under the Bayesian umbrella, it should not be surprising that a variety of ways to be some kind of Bayesian about quantum theory has also arisen [2–15]. The most radical approach is *QBism*, which maintains that all quantum states are expressions of personalist Bayesian probabilities about potential future experiences [3,16–30]. For this essay, I will take the writings of Fuchs and co-workers [16–26] as the defining statements of what a QBist *is* and *is not*. Furthermore, their union and intersection are indicative of what a QBist might be, might not be and is not obligated to entertain.

QBism is the primary focus of the Wikipedia article on Quantum Bayesianism, and the take-away impression is that John von Neumann was not just a partisan of the Bayesian lifestyle but also the first QBist. This is an untenable claim.

In this essay, we will not be strongly concerned with which interpretation of quantum mechanics is ‘correct’, or with what it might *mean* for an interpretation of quantum mechanics to *be* ‘correct’. Our focus will instead be on who said what and when. However, to evaluate the ‘von Neumann was the first Quantum Bayesian’ claim properly, we need to clarify what a ‘Quantum Bayesian’ world view might be, and QBism, in many ways an extreme among such views, provides a convenient vantage point. Therefore, we will establish the basic notions of QBism, and then, in following sections, we will turn to the writings of von Neumann.

## 2. QBism

QBism is an interpretation of quantum mechanics which takes as fundamental the ideas of *agent* and *experience.* A ‘quantum measurement’ is, in QBism, an act which an agent performs on the external world. A quantum state is an agent's encoding of her own personal expectations for what she might experience as a result of carrying out an action. This holds true for all quantum states, pure or mixed; a state without an agent is a contradiction in terms. Furthermore, each experience is a personal event specific to the agent who evokes it. Cabello [31] classifies QBism as a kind of ‘participatory realism’, similar to the thinking of John Wheeler.

Different authors have emphasized different aspects of QBism. The discussions by Barnum [3] and Mermin [22–26], and the briefer remarks by Schlosshauer & Claringbold [32] and Żukowski [33], place their focus on how QBism gives meaning to the current mathematical formalism of quantum theory. Fuchs & Schack [18] have also addressed this aspect, while in addition pushing forward technical work which aims to reformulate quantum theory and build it up anew from explicitly QBist postulates [19]. Owing to the historical subject matter of this essay, the former will be more relevant here.^{1}

Individual statements and arguments drawn from the writings of other scientists can sometimes fit neatly within the QBist programme. Examples come to mind in the works of Aaronson [36], pp. xii–xiii, 110, Bacon [37], Baez [2], Bell [25,26], Einstein [21], Feynman [38], p. 6-7, Nielsen [39], Peierls [26], Schrödinger [23,25,26] and others. This is not to claim that any of these authors are QBist or proto-QBist (the latter term being also unpleasantly teleological). Indeed, one can find self-identified non-QBists and critics of QBism who agree with QBists on non-trivial points [40,41]. Physicists and their opinions are sufficiently complicated that we cannot pigeonhole them based on isolated snippets of text. Placement and classification require more systematic study than that, if they are to have any meaning. With this concern in mind, we turn to surveying the writings of von Neumann. First, we shall see that von Neumann's interpretation of probability, though it displayed varying nuances over time, never aligned with that advocated by Fuchs, Mermin and Schack. Then, we will study the evidence indicating that von Neumann made a category distinction between different kinds of quantum states that QBists (and some other varieties of Quantum Bayesians) do not. Having established that ‘Quantum Bayesian’ is not a good description for von Neumann's thought, we will turn to the argument that underlies the claim in Wikipedia. Digging into the material that ostensibly supports that claim will reveal that support to be rather insubstantial.

## 3. Von Neumann on probability

### (a) Frequentism (1932)

To begin with, we examine von Neumann's *Mathematical Foundations of Quantum Mechanics*, hereinafter *MFQM* [42]. This book is indicative of von Neumann's thinking in 1932, the time of the publication of the original German edition. Quoting from *MFQM*, p. 298:
However, the investigation of the physical quantities related to a single object *S* is not the only thing which can be done – especially if doubts exist relative to the simultaneous measurability of several quantities. In such cases it is also possible to observe great statistical ensembles which consist of many systems *S*_{1},…,*S*_{N} (i.e. *N* models of *S*, *N* large).^{156}

Note 156 reads as follows:
Such ensembles, called collectives, are in general necessary for establishing probability theory as the theory of frequencies. They were introduced by R. v. Mises, who discovered their meaning for probability theory, and who built up a complete theory on this foundation (cf., for example, his book, ‘Wahrscheinlichkeit, Statistik and ihre Wahrenheit’, Berlin, 1928).

[Solche Gesamtheiten, Kollektive genannt, sind überhaupt notwendig um die Wahrscheinlichkeitsrechnung als Lehre von den Häufigkeiten begründen zu können. Sie wurden von R. v. Mises eingeführt, der ihre Bedeutung für die Wahrscheinlichkeitsrechnung erkannte, und einen entsprechenden Aufbau derselben durchführte (vgl. z. B. sein Buch Wahrscheinlichkeit, Statistik und ihre Wahrheit, Berlin 1928).]

Note the discrepancy in the titles given for von Mises’ book; apparently, the translator made an error here. In fact, von Neumann also errs in this passage, as the ‘ihre’ is an interpolation. Nevertheless, the meaning of the passage is clear: in 1932, von Neumann interpreted probability in a frequentist manner.

Evidence of this occurs throughout *MFQM*, in fact. For example, ‘expectation value’ is defined as ‘the arithmetic mean of all results of measurement in a sufficiently large statistical ensemble’ (p. 308). Lüders [43], who improved upon von Neumann's theory of measurement, also thought in terms of ‘an ensemble of identical and independent systems’ [einer Gesamtheit gleichartiger und unabhängiger Systeme]. This is one example of later researchers not finding a Bayesian message in von Neumann.

### (b) Probability as extended logic (*ca* 1937)

To see how von Neumann's thinking on the foundations of probability changed, we turn next to an unfinished manuscript from about 1937, which is included in his *Collected Works* [44]. Von Neumann imagines a collection of a large number of ‘specimens’ of a physical system *S*_{1} and considers interpreting the transition probability in terms of a relative frequency:
[I]f we measure on each *S*_{1},…,*S*_{N} first *a*, and then in immediate succession *b*, and if then the number of those among *S*_{1},…,*S*_{N} where *a* is found to be true is *M*, and the number of those where *a*, *b* are both found to be true is *M*′, then: (H) *P*(*a*,*b*)=*θ* means that *M*′/*M*→*θ* for *N*→∞. This view, the so-called ‘*frequency theory of probability*’ has been very brilliantly upheld and expounded by R. V. Mises. This view, however, is not acceptable to us, at least not in the present ‘logical’ context. We do not think that (H) really expresses a convergence-statement in the strict mathematical sense of the word – at least not without extending the physical terminology and ideology to infinite systems (namely, to the entirety of an infinite sequence *S*_{1},*S*_{2},…) – and we are not prepared to carry out such an extension at this stage. The approximative forms of (H), on the other hand, are mere probability-statements, e.g. ‘Bernoulli's law of great numbers’ […] And such probability-statements are again of the same nature as the relation *P*(*a*,*b*)=*θ*, which they should interpret.

Von Neumann then makes the following declaration:
We prefer, therefore, to disclaim any intention to interpret the relations *P*(*a*,*b*)=*θ* (0<*θ*<1) in terms of strict logics. In other words, we admit: *Probability logics cannot be reduced to strict logics, but constitute an essentially wider system than the latter, and statements of the form* *P*(*a*,*b*)=*θ* (0<*θ*<1) *are perfectly new and* sui generis *aspects of physical reality*. So probability logics appear as an essential extension of strict logics. This view, the so-called ‘logical theory of probability’ is the foundation of J. N. [sic] Keynes's work on the subject.

In short, the later von Neumann interprets quantum probabilities as *logical probabilities*. Moreover, he explicitly identifies this view with that worked out by Keynes.

At this point, it is a good idea to compare Keynes's ‘logical probability’ with the thinking of F. P. Ramsey, whose interpretation is closer to that invoked in QBism [20], pp. ix, 1225–1229, 1374. Fortunately, we have a statement by Keynes himself on this subject. In October 1931—after Ramsey's death at the age of 26—Keynes wrote the following [45]:
Formal logic is concerned with nothing but the rules of *consistent* thought. But in addition to this we have certain ‘useful mental habits’ for handling the material with which we are supplied by our perceptions and by our memory and perhaps in other ways, and so arriving at or towards truth; and the analysis of such habits is also a sort of logic. The application of these ideas to the logic of probability is very fruitful. Ramsey argues, as against the view which I had put forward, that probability is concerned not with objective relations between propositions but (in some sense) with degrees of belief, and he succeeds in showing that the calculus of probabilities simply amounts to a set of rules for ensuring that the system of degrees of belief which we hold shall be a *consistent* system. Thus the calculus of probabilities belongs to formal logic. But the basis of our degrees of belief – or the *a priori*, as they used to be called – is part of our human outfit, perhaps given us merely by natural selection, analogous to our perceptions and our memories rather than to formal logic.

And, having made this comparison, Keynes goes on to say,
So far I yield to Ramsey—I think he is right. But in attempting to distinguish ‘rational’ degrees of belief from belief in general he was not yet, I think, quite successful. It is not getting to the bottom of the principle of induction merely to say that it is a useful mental habit. Yet in attempting to distinguish a ‘human’ logic from formal logic on the one hand and descriptive psychology on the other, Ramsey may have been pointing the way to the next field of study when formal logic has been put into good order and its highly limited scope properly defined.

### (c) Debating Bohr in Warsaw (1938)

In 1938, von Neumann attended a conference in Warsaw on ‘New theories in physics’. The meeting, which ran from 30 May to 3 June, was attended by Bohr, Brillouin, de Broglie, C. G. Darwin, Eddington, Gamow, Kramers, Langevin, Wigner and others. Bohr presented a report on ‘The causality problem in atomic physics’, to which von Neumann replied in the discussion afterward [46]. Von Neumann's remarks begin by interpreting probabilities in terms of ensembles:
If we wish to analyse the meaning of the statistical statements of quantum mechanics, we must necessarily deal with « ensembles » of a great number of identical systems, and not with individual systems. [46], p. 30

He segues, however, into a discussion of quantum logic, arguing that the central point is the failure of the distributive law. This leads to the following:
A complete derivation of quantum mechanics is only possible if the propositional calculus of logics is so extended, as to include probabilities, in harmony with the ideas of J. M. Keynes. In the quantum mechanical terminology : the notion of a « transition probability » from *a* to *b*, to be denoted by *P*(*a*,*b*) must be introduced. (*P*(*a*,*b*) is the probability of *b*, if *a* is known to be true. *P*(*a*,*b*) can be used to define ≤ and −*a* : *P*(*a*,*b*)=1 means *a*≤*b*, *P*(*a*,*b*)=0 means *a*≤−*b*. But *P*(*a*,*b*)=*ϕ*, with a *ϕ*>0, <1 is a new « sui generis » statement, only understandable in terms of probabilities.) [46], p. 38

It is interesting that von Neumann does not attempt to use Keynesian logical-probability theory to define *single-shot* probabilities. Instead, he still treats statistical statements as having meaning only for ensembles.

### (d) Game theory (1944)

Von Neumann coauthored the textbook *Theory of Games and Economic Behavior* with Oskar Morgenstern [47]. The book, first published in 1944, is frequentist in orientation, though the authors express this as a matter of convenience rather than necessity. Von Neumann and Morgenstern call the ‘interpretation of probability as frequency in long runs’ a ‘perfectly well founded’ notion, but they leave the door open to alternative conceptions of probability [47], p. 19. Morgenstern later explained [48], pp. 809–810,
We were, of course, aware of the difficulty with the logical foundations of probability theory. We decided we would base our arguments on the classical frequency definition of probability, but we included a footnote saying that one could axiomatize utility and probability together and introduce a subjective notion of probability. This was done later by others.

### (e) Generating random numbers (1951)

One of von Neumann's memorable remarks has gained a certain infamy: ‘Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin’. This quotation occurs in an item in a 1951 volume of conference proceedings, where von Neumann also discusses physical phenomena which can be used to generate random numerical sequences [52]. He proposes ‘nuclear accidents’ as the ideal source, which in the era after Chernobyl and Fukushima comes across as slightly ominous. However, in context it is plain enough that the ‘accidents’ in question are events like individual clicks from a Geiger counter.
There are nuclear accidents, for example, which are the ideal of randomness, and up to a certain accuracy you can count them. One difficulty is that one is never quite sure what is the probability of occurrence of the nuclear accident. This difficulty has been overcome by taking larger counts than one [does] in testing for either even or odd. To cite a human example, for simplicity, in tossing a coin it is probably easier to make two consecutive tosses independent than to toss heads with probability exactly one-half. If independence of successive tosses is assumed, we can reconstruct a 50–50 chance out of even a badly biased coin by tossing twice. If we get heads-heads or tails-tails, we reject the tosses and try again. If we get heads-tails (or tails-heads), we accept the result as heads (or tails). The resulting process is rigorously unbiased, although the amended process is at most 25 percent as efficient as ordinary coin-tossing. [52], p. 768

The language here is *prima facie* frequentist or propensity-inclined, treating probabilities as unknown quantities to be measured (‘one is never quite sure what is the probability of occurrence of the nuclear accident’). A Bayesian can give meaning to statements about ‘unknown probabilities’ – this is the territory of the de Finetti theorem [53] – but von Neumann's phrasing does not sound like a stringent Bayesian's first choice of words.
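
The coin-tossing procedure in von Neumann's remark can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`biased_coin`, `von_neumann_toss`, `fair_bits_per_toss`) and the use of Python's `random` module as a stand-in for a biased physical source are choices made here, not anything found in [52]. Tosses are assumed independent, with a fixed but arbitrary probability `p_heads` of heads.

```python
import random


def biased_coin(p_heads, rng):
    """One toss of a coin that lands heads with probability p_heads."""
    return "H" if rng.random() < p_heads else "T"


def von_neumann_toss(p_heads, rng):
    """Toss in pairs until a pair differs; report the first toss of that pair.

    Returns (result, raw_tosses_used). Heads-tails counts as heads and
    tails-heads as tails; equal pairs are rejected and a new pair is tossed.
    """
    if not 0.0 < p_heads < 1.0:
        # A coin that always lands the same way would never produce a mixed pair.
        raise ValueError("p_heads must lie strictly between 0 and 1")
    raw_tosses = 0
    while True:
        first = biased_coin(p_heads, rng)
        second = biased_coin(p_heads, rng)
        raw_tosses += 2
        if first != second:
            return first, raw_tosses


def fair_bits_per_toss(p_heads):
    """Expected accepted results per raw toss.

    A pair is accepted with probability 2p(1-p) and costs two tosses,
    so the yield per toss is p(1-p).
    """
    return p_heads * (1.0 - p_heads)


if __name__ == "__main__":
    rng = random.Random(1951)  # arbitrary seed, for repeatability only
    results = [von_neumann_toss(0.9, rng)[0] for _ in range(20000)]
    print("fraction of heads:", results.count("H") / len(results))
    print("expected yield per raw toss at p = 0.9:", fair_bits_per_toss(0.9))
```

Heads-tails and tails-heads each occur with probability *p*(1−*p*), which is why the accepted result is unbiased whatever the bias of the coin. The yield *p*(1−*p*) per raw toss peaks at one quarter for a fair coin, which matches the ‘at most 25 percent’ figure in the quotation.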

In summary, von Neumann's interpretation of probability moved from a kollectiv-based strict frequentism to a Keynesian view. Nowhere do we find an outright endorsement of personalist Bayesianism; the closest approach is much later than *MFQM*, is not in the context of quantum theory, and is itself mixed in with a claim that thinking of probability as long-run frequency is good enough for practical purposes.

## 4. Pure and mixed states

Von Neumann's philosophy of probability, and the way his thinking changed over time, has been discussed by others—for example, by Bub [54], Rédei [55], Stairs [56] and Valente [57]. Less remarked upon, but also important to this comment, are von Neumann's statements concerning the distinction between *pure* and *mixed* quantum states.

Returning to *MFQM* [42], on p. 295 we find the following:
(In the state *ϕ* the quantity ℜ has the expectation value *ρ*=(*Rϕ*,*ϕ*) and has as its dispersion *ϵ*^{2} the expectation value of the quantity (ℜ−*ρ*)^{2}, i.e. ((*R*−*ρ*⋅1)^{2}*ϕ*,*ϕ*)=||*Rϕ*||^{2}−(*Rϕ*,*ϕ*)^{2} (cf. Note 130; all these are calculated with the aid of **E**_{2}.) which is in general >0 (and =0 only for *Rϕ*=*ρ*⋅*ϕ*, cf. III.3.) – therefore there exists a statistical distribution of ℜ, even though *ϕ* is one individual state – as we have repeatedly noted.) But the statistical character may become even more prominent, if we do not even know what state is actually present – for example, when several states *ϕ*_{1},*ϕ*_{2},… with the respective probabilities *w*_{1},*w*_{2},… (*w*_{1}≥0,*w*_{2}≥0,…,*w*_{1}+*w*_{2}+⋯=1) constitute the description. Then the expectation value of the quantity ℜ, in the sense of the generally valid rules of the calculus of probabilities is *ρ*′=∑_{*n*}*w*_{*n*}(*Rϕ*_{*n*},*ϕ*_{*n*}).

The language here indicates that, for von Neumann, a pure state is something an individual system has, and a more general density matrix stands for an ensemble in which different pure states are physically present with different frequencies.

Von Neumann writes freely of properties possessed by quantum systems (p. 338):
Instead of saying that several results of measurement (on *S*) are known, we can also say that *S* was examined in relation to a certain property and its presence was ascertained. […] The information about *S* therefore always amounts to the presence of a certain property which is formally characterized by stating the projection *E*.

And, shortly thereafter, bluntly:
That is, if 𝔈 is present, the state is *ϕ*.

An exhaustive measurement fixes the value of a physical property, and the presence of a physical property can mandate the correctness of a choice of quantum state. Unsharp measurements, in von Neumann's development, ‘are incomplete and do not succeed in determining a unique state’ (p. 340). Again, we see the language making a category distinction between quantities which are ‘states’ and more general entities which are not. Later, von Neumann writes that a density operator which is a ‘mixture of several states’ is ‘not a state’ itself (p. 350). This distinction is maintained throughout his discussion of what we now call the von Neumann entropy.

This idea—that pure and mixed states are qualitatively different kinds of entity—went unquestioned by Bohm [58], but was soon challenged by Jaynes [59]. One can conceive of uncertainty intrinsic to a pure state and uncertainty about which pure state might be present. However, as Jaynes writes,
If the former probabilities are interpreted in the objective sense, while the latter are clearly subjective, we have a very puzzling situation. Many different arrays, representing different combinations of subjective and objective aspects, all lead to the same density matrix, and thus to the same predictions.

This argument was later made by Ochs [60], and by Caves *et al.* [53]. The latter authors write, ‘a mixed state has infinitely many ensemble decompositions into pure states’—even into *different numbers of* pure states—‘so the distinction between subjective and objective becomes hopelessly blurred’.^{2}
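
The non-uniqueness that Jaynes and Caves *et al.* describe can be checked numerically in the simplest case. The following Python sketch builds the density matrix of a single qubit from three different ensembles of pure states, one of them containing three states rather than two, and confirms that all three coincide. The particular ensembles, the restriction to real amplitudes and the helper names (`projector`, `mixture`, `matrices_close`) are illustrative assumptions made here, not taken from the cited papers.

```python
import math


def projector(state):
    """Return |psi><psi| as a 2x2 nested list for a real, normalized state (a, b)."""
    a, b = state
    return [[a * a, a * b], [b * a, b * b]]


def mixture(weights, states):
    """Density matrix sum_n w_n |phi_n><phi_n| for an ensemble of pure states."""
    rho = [[0.0, 0.0], [0.0, 0.0]]
    for w, state in zip(weights, states):
        p = projector(state)
        for i in range(2):
            for j in range(2):
                rho[i][j] += w * p[i][j]
    return rho


def matrices_close(m1, m2, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to rounding error."""
    return all(abs(m1[i][j] - m2[i][j]) < tol for i in range(2) for j in range(2))


s = 1.0 / math.sqrt(2.0)

# Ensemble A: the two basis states, equally weighted.
rho_a = mixture([0.5, 0.5], [(1.0, 0.0), (0.0, 1.0)])

# Ensemble B: the equal superpositions (|0> + |1>)/sqrt(2) and (|0> - |1>)/sqrt(2).
rho_b = mixture([0.5, 0.5], [(s, s), (s, -s)])

# Ensemble C: three 'trine' states at angles 0, pi/3 and 2*pi/3, each with weight 1/3.
trine = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)) for k in range(3)]
rho_c = mixture([1 / 3, 1 / 3, 1 / 3], trine)

if __name__ == "__main__":
    print(matrices_close(rho_a, rho_b), matrices_close(rho_a, rho_c))
```

All three ensembles give the same matrix, one half of the identity, so no measurement statistics can distinguish them. Anyone who reads the weights as ignorance about which pure state is ‘really’ present has to say which of these decompositions is the real one.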

Arguably, it would be more consistent for a strict kollectivist to treat all quantum states, pure and mixed, in ensemble terms, where pure states correspond to the most purified possible ensembles. However, von Neumann's statements about the presence of physical properties determining unique states clash with this position.

## 5. Measurement and subjectivity

As we noted in §2, it is not so difficult to find in physicists’ writings statements which, taken in isolation, are compatible with QBism. The more important question is whether a verbal corpus yields up enough of these cherries to fill a bowl.

The last chapter of *MFQM* concerns ‘The measuring process’. Here, we find mentions of ‘subjective perception’ and ‘the intellectual inner life of the individual’ (p. 418). Surely, this is where we should look for evidence of von Neumann's Quantum-Bayesian sympathies.

He writes (p. 420),
Indeed experience only makes statements of this type: an observer has made a certain (subjective) observation; and never any like this: a physical quantity has a certain value.

But the idea that we ultimately rely on sense impressions to adjudicate between scientific models is hardly original to quantum mechanics, or to Bayesian interpretations thereof. A form of the idea is attributed to Democritus,^{3} and Lucretius discussed it in verse.^{4} Schrödinger commented, ‘Quantum mechanics forbids statements about what really exists—statements about the object. Its statements deal only with the object-subject relation’. However, he continued, ‘this holds, after all, for any description of nature’; the crucial point is that, in quantum physics, it ‘holds in a much more radical and far reaching sense’ [17].

In classical mechanics, the mass of an object is a basic property which that object has whether or not a physicist is nearby to be interested in it. Let us imagine a physicist working in the days before quantum theory. He drops a rock on his foot and sees red—a subjective perception. He then uses this perception, which is part of his ‘inner life’, to rule out the hypothesis that the rock is of negligible mass. The variable *m* in his equations refers to an intrinsic property of the rock, not to any of his sensations, even though his sensory perceptions are what he uses to assign a value (or a spread of reasonable values) to the variable *m*. For the Newtonian, the rock *has a mass,* although the Newtonian might admit when pressed that any *particular value used in a calculation* is chosen because it adequately summarizes past experiences and helps to predict future ones.

The critical question is whether von Neumann reads the mathematical entities appearing in the quantum formalism as physical quantities akin to Newtonian masses, about which we can use subjective perceptions to make estimations, or if he reads those mathematical entities as standing for perceptions themselves. The blanket statement quoted above that ‘experience only makes statements of this type’ certainly suggests that he views the point about ‘(subjective) observation’ as applicable to classical physics too. The evidence we saw in the previous section indicates that von Neumann treats quantum states as physical properties held by objects themselves, that is, as more analogous to the mass of a Newtonian rock than to experiences in the flow of ‘intellectual inner life’.

At one point in the Warsaw proceedings, von Neumann does approach a statement which might not sound completely out of place coming from a QBist. In a discussion later than the exchange we examined above, the proceedings record the following [46], p. 44:
Professor von Neumann thought that there must always be an observer somewhere in a system : it was therefore necessary to establish a limit between the observed and the observer. But it was by no means necessary that this limit should coincide with the geometrical limits of the physical body of the individual who observes. We could quite well « contract » the observer or « expand » him : we could include all that passed within the eye of the observer in the « observed » part of the system — which is described in a quantum manner. Then the « observer » would begin behind the retina. Or we could include part of the apparatus which we used in the physical observation — a microscope for instance — in the « observer». The principle of « psycho-physical parallelism » expresses this exactly : that this limit may be displaced, in principle at least, as much as we wish inside the physical body of the individual who observes. There is thus no part of the system which is essentially the observer, but in order to formulate quantum theory, an observer must always be placed somewhere.

Terms such as ‘observer’ and ‘measurement’ imply an essential passivity, Fuchs and Schack [19] have argued; such words suggest a casual, uninvolved reading-off, rather than a participatory act. But if we replace ‘observer’ with ‘agent’ in von Neumann's concluding line, we would have the statement, ‘In order to formulate quantum theory, an agent must always be placed somewhere’; *this* claim—that *agent* is a fundamental concept which quantum theory is built upon—would fit within QBism.

To a QBist, this is the fatal flaw in von Neumann's interpretation of quantum mechanics. On the one hand, von Neumann affirms that one cannot formulate the theory without an observer, but, on the other, quantum states are physical properties of systems outside the observer, and probabilities are frequencies in kollectivs or Keynesian logical valuations – conceptions of probability which try to delete the agent at all costs.^{5}

In broad overview, von Neumann's approach to quantum measurement begins with a physical system, which then interacts with some kind of measuring apparatus, which is then studied by an observer. The system–apparatus and apparatus–observer interactions are treated as physically distinct kinds of time evolution. Von Neumann calls the difference between these processes ‘very fundamental’ (§VI.1, p. 418).

If one takes quantum states to be intellectual tools held by an individual agent, then the procedure of inserting an intermediate apparatus adds nothing to the basic philosophical understanding of what quantum theory is about. It could well be a beneficial mathematical exercise, part of working out what to do when making use of multipartite systems. (There are good practical reasons to understand how the quantum formalism applies to a system one of whose parts is a probe or an ancilla for the other, or is a communication channel whose function is limited in some way.) Regardless, if quantum states are personalist Bayesian quantities, then introducing probes and ancillas brings nothing intrinsically new. And if *von Neumann had seen quantum states in anything like the QBist fashion,* it is difficult to find a rationale for why *MFQM*'s entire chapter on ‘the measuring process’ assumes the shape it does.

Fuchs has written, ‘von Neumann's setting the issue of measurement in these terms was the great original sin of the quantum foundational debate’ [20], p. 2035.

## 6. Measurement redux: the ‘Quantum Bayes rule’

Wikipedia attributes the statement ‘The first quantum Bayesian was von Neumann’ to R. F. Streater [67]. As mentioned earlier, the effect of saying this in an article which primarily concerns QBism is to claim that von Neumann was himself either QBist or something much like it. Looking up this source, we find that what Streater calls ‘Quantum Bayesianism’ could indeed reasonably include QBism. For example, he states that Bayesians ‘attribute *all* the entropy in a state to the lack of information in the observer’ (p. 71). And in discussing density matrices, he writes, ‘the Bayesian's *ρ* is entirely about his knowledge’ (p. 72). So, the meaning of the statement created by placing it within the Wikipedia article is not too much of a stretch.

Streater bases his claim that von Neumann was the ‘first quantum Bayesian’ on *MFQM*, never addressing the plainly frequentist orientation of that book. Nor does Streater refer to the passages about ‘subjective perception’ and ‘experience’.

The root of the confusion appears to be that, towards the end of *MFQM*, von Neumann derives a formula which turns out in retrospect to be a specialized case of a quantum analogue of the Bayes conditioning rule. Von Neumann motivates his argument with the following (p. 337):
If anterior measurements do not suffice to determine the present state uniquely, then we may still be able to infer from those measurements, under certain circumstances, with what probabilities particular states are present. (This holds in causal theories, for example, in classical mechanics, as well as in quantum mechanics.) The proper problem is then this: Given certain results of measurements, find a mixture whose statistics are the same as those which we shall expect for a system **S** of which we know only that these measurements were carried out on it and that they had the results mentioned.

Here, von Neumann treats quantum states as analogous to the *physical* states of classical mechanics, i.e. to points in phase space. In classical mechanics, if a system could be at one of multiple points in its phase space, we write a Liouville probability density over that space; we can infer that, for von Neumann, it is *mixtures* which are analogous to Liouville densities. This is in sharp contrast with QBist and much other Quantum-Bayesian thinking, in which *all* quantum states, however pure, are expressions of an agent's probability assignments.

As always in *MFQM*, a system has a state, even if we do not know what that state is. And, as always in *MFQM*, statistics means ensembles of identically prepared systems:
If, for many systems **S**′_{1},…,**S**′_{M} (replicas of **S**), these measurements give the results mentioned, then this ensemble [**S**′_{1},…,**S**′_{M}] coincides in all its statistical properties with the mixture that corresponds to the results of the measurements.

Changes in statistical properties mean the creation of new ensembles with different population demographics:
That the results of the measurements are the same for all **S**′_{1},…,**S**′_{M} can be attributed, by **M.**, to the fact that originally a large ensemble [**S**_{1},…,**S**_{N}] was given in which the measurements were carried out, and then those elements for which the desired results occurred were collected into a new ensemble. This is then [**S**′_{1},…,**S**′_{M}].

Here, **M.** refers to the measurement postulate, ‘If the physical quantity is measured twice in succession in a system **S**, then we get the same value each time’ (p. 335).

So, we have *something like* the updating of probabilities by the Bayes rule. However, von Neumann phrases the scenario in completely kollectivist language. Merely invoking Bayes’ theorem does not make one a Bayesian. For example, von Mises makes use of Bayes’ theorem, calling it ‘a proposition applying to an infinite number of experiments’, or, in other words, to a kollectiv [68], p. 123. One could be wholly agnostic about the interpretation of probability, setting up the theory from measure-theoretic or abstract-algebraic axioms [69]; multiplication and division of probabilities would then be legitimate operations having meaning only with respect to those axioms.

It is in this context that von Neumann mentions *a priori* and *a posteriori* probabilities. These terms could be glossed in a Bayesian way, but only at the cost of ignoring everything else *MFQM* says about the interpretation of probability, including the discussion of ‘ensembles’ in the same paragraph. And even if one were to do so, the way in which von Neumann allows pre-existing physical properties to determine quantum states would imply a view in which mixed states are *Bayesian probability distributions over pure states.* (And though it could potentially be called ‘Quantum Bayesian’, it is definitely not QBist.) This is a difficult position to maintain, per the Jaynesian argument given above. Furthermore, given the professed kollectivism of *MFQM*, we should recall what von Mises said about these terms [68], p. 46:
It is useful to introduce distinct names for the two probabilities of the same attribute, the given probability in the initial collective and the calculated one in the new collective formed by partition. The current expressions for these two probabilities are not very satisfactory, although I cannot deny that they are impressive enough. The usual way is to call the probability in the initial collective the *a priori*, and that in the derived collective the *a posteriori* probability.

This usage exactly parallels von Neumann's. To continue:
The fact that these expressions suggest a connexion with a well-known philosophical terminology is their first deficiency in my eyes. Another one is that these same expressions, *a priori* and *a posteriori,* are used in the classical theory of probability in a different sense as well, namely, to distinguish between probabilities derived from empirical data and those assumed on the basis of some hypothesis; such a distinction is not pertinent in our theory. I prefer, therefore, to give to the two probabilities less pretentious names, which have less far-reaching and general associations. I will speak of *initial probability* and *final probability,* meaning by the first term the probability in the original collective, and by the second one, the probability (of the same attribute) in the collective derived by partition.

Von Neumann uses the more common terminology, but the meaning which *MFQM* vests in the words is, by all evidence, the same as that which von Mises does. The result is not an argument that probabilities should be seen as quantified fervencies of belief, but rather that a certain problem involving ensemble frequencies admits non-unique solutions.

Von Neumann derives his basic relation between initial and final ensembles by considering the following procedure (p. 340). We measure some binary physical property on each element of the initial ensemble, which is described by the statistical operator *U*_{0}. The elements for which this measurement yields the outcome 1 (instead of 0) are collected to form a new ensemble, whose statistical operator is *U*. The two ensembles are related by
$$U = \sum_{n} P_{[\phi_n]}\, U_0\, P_{[\phi_n]}. \tag{6.1}$$

Here, *P*_{[*ϕ*_{n}]} is the projector onto the state *ϕ*_{n}, and the set {*ϕ*_{1},*ϕ*_{2},…,*ϕ*_{n}} is an orthonormal basis which spans the subspace in which the measurement yields the value 1.

Lüders [43] criticized von Neumann's result and proposed a correction. A more modern way to represent state-change upon measurement is to write the Lüders rule using the mathematics of *effects and operations.* A measurement is a positive operator-valued measure (POVM) which furnishes a resolution of the identity:
$$\sum_{k} E_k = I, \tag{6.2}$$

and each of the {*E*_{k}} can be written

$$E_k = A_k^{\dagger} A_k. \tag{6.3}$$

Here, the index *k* labels the possible outcomes of the measurement. If the initial density operator is *U*_{0}, then, upon obtaining the outcome *k*, we update the density operator to

$$U = \frac{A_k U_0 A_k^{\dagger}}{\operatorname{tr}(U_0 E_k)}. \tag{6.4}$$

For analyses of how this update rule is analogous to, or a variant of, Bayesian conditioning, see Schack *et al.* [70] and also Fuchs [71]. Illustrative examples are developed in Fuchs & Schack [72].
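The difference between the two rules can be checked numerically. The following is a minimal sketch, not taken from *MFQM* or from Lüders: it assumes that von Neumann's relation has the form *U* = Σ_{n} *P*_{[ϕn]} *U*_{0} *P*_{[ϕn]} and that the Lüders update has the form *A*_{k} *U*_{0} *A*_{k}^{†} with *A*_{k} the projector onto the outcome-1 subspace. The three-level system, the function names and the trace normalization applied to both rules are illustrative choices of mine.

```python
import numpy as np

def von_neumann_update(u0, basis):
    """Assumed form of von Neumann's relation: U = sum_n P_n U0 P_n, where
    P_n projects onto the n-th column of `basis`. The trace normalization
    is added here only so that the results can be compared."""
    u = np.zeros_like(u0, dtype=complex)
    for phi in basis.T:                      # iterate over the basis vectors
        p = np.outer(phi, phi.conj())        # rank-one projector onto phi
        u += p @ u0 @ p
    return u / np.trace(u).real

def luders_update(u0, a_k):
    """Assumed form of the Lueders rule: U = A_k U0 A_k^dagger, normalized."""
    u = a_k @ u0 @ a_k.conj().T
    return u / np.trace(u).real

# Hypothetical example: a three-level system prepared in an equal superposition.
psi = np.ones(3, dtype=complex) / np.sqrt(3)
u0 = np.outer(psi, psi.conj())

# Two orthonormal bases for the same two-dimensional outcome-1 subspace.
basis_a = np.array([[1, 0], [0, 1], [0, 0]], dtype=complex)
basis_b = np.array([[1, 1], [1, -1], [0, 0]], dtype=complex) / np.sqrt(2)

# The projector onto the subspace is the same whichever basis builds it.
projector = basis_a @ basis_a.conj().T

vn_a = von_neumann_update(u0, basis_a)
vn_b = von_neumann_update(u0, basis_b)
lud = luders_update(u0, projector)

print("von Neumann results agree across bases:", np.allclose(vn_a, vn_b))
print("Lueders result matches basis_b result: ", np.allclose(lud, vn_b))
```

Under these assumptions, the von Neumann-style update depends on which orthonormal basis is chosen for a degenerate outcome, whereas the Lüders-style update depends only on the projector onto the subspace. In this particular example the Lüders result happens to coincide with the second basis choice because one of that basis's vectors is orthogonal to the initial state.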

Streater bases his criticism of von Neumann on that found in the textbook of Krylov [73], writing, ‘Krylov did not believe that all the characteristics of the state reflect only the lack of knowledge of the observer, but that there was a physical state, *ρ*_{0} out there to be found’. But this is exactly what von Neumann stated: recall that, for him, a ‘mixture of several states’ is ‘not a state’ itself (*MFQM*, p. 350).

Krylov's primary complaint with von Neumann (pp. 184–85) is the multiplicity of valid decompositions of mixed-state density operators.
Firstly, in using the statistical operator, we assume the selection of a certain orthogonal system of coordinates in the subspace delimited by an inexhaustively complete experiment and we also assume a certain choice of weights *w*_{i}. […] A change in the orthogonal system means, generally speaking, a transition to another physical state described by the statistical operator, to what is said to be another statistical aggregate. (‘A state’ is understood here in a more general sense than a state exhaustively completely determined, using a *Ψ*-function.) […] Therefore, the selection of a certain orthogonal system of functions and the fixing of certain weights *w*_{i}, which the von Neumann operator presupposes, amounts to the introduction of some *physically fictitious* properties of the reality being described.

That is, Krylov finds the conjunction of the following two statements unacceptable:

— A pure quantum state is a physical property of a system.

— The quantum formalism implies that a mixed-state statistical operator has multiple decompositions into linear combinations of pure states.

Krylov insists upon the former and therefore finds the latter unsatisfactory. As we discussed earlier, QBists (and some other varieties of Quantum Bayesians) agree that these two statements clash with each other, but discard the first instead. Von Neumann holds onto both (*MFQM*, §IV.3).
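The second statement can be made concrete with a standard single-qubit illustration, which is my own choice of example rather than one drawn from Krylov or *MFQM*. Writing |±⟩ for the equal-weight superpositions of the basis states |0⟩ and |1⟩, the maximally mixed state admits at least two decompositions:

$$\tfrac{1}{2} I \;=\; \tfrac{1}{2}\,|0\rangle\langle 0| + \tfrac{1}{2}\,|1\rangle\langle 1| \;=\; \tfrac{1}{2}\,|{+}\rangle\langle{+}| + \tfrac{1}{2}\,|{-}\rangle\langle{-}|, \qquad |{\pm}\rangle = \tfrac{1}{\sqrt{2}}\bigl(|0\rangle \pm |1\rangle\bigr).$$

If a pure state is a physical property of a system, the two sides describe physically different ensembles, yet the statistical operator assigns them identical statistics for every measurement. This is the clash between the two statements above.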

## 7. Discussion

We have seen that ‘Quantum Bayesian’ is not at all a good description of von Neumann. Along the way, we have encountered some of the practical issues that make classifying scientists a difficult problem. It is worth considering these issues more generally. Having done so, we will conclude with some contemplations about the platform that made the claim ‘the first Quantum Bayesian was von Neumann’ visible enough to be noticed in the first place.

A good classification summarizes, as succinctly as possible, the known statements and actions of an individual, and should have predictive power for statements yet unmade or undiscovered. Furthermore, the descriptive terms in commonest circulation should provide the most useful and meaningful understanding of relevant distinctions that painting with a broad brush can convey [74]. It is, to say the least, debatable that the jargon we have today on the philosophical side of quantum theory fares at all well in this regard. Peres [75] quips, ‘There seems to be at least as many Copenhagen interpretations as people who use that term, perhaps even more’. For Żukowski [33], Copenhagen is ‘different for every apostle’. Indeed, a good case can be made that the idea of a unified ‘Copenhagen Interpretation’ was a myth of the 1950s, which now elides in our perception the differences among views held by physicists 30 years earlier [76–78]. Similarly, Kent [79] tabulates at least 21 varieties of Everettian interpretation. These can differ in underacknowledged ways from Everett's original [80], and many are ‘generally incompatible’ with one another [81].

The terms we employ are signifiers that we use to determine what we bother to read. They establish the distinctions which we allow to exist between the players in the histories we retell. Then we pose those players according to our sentiments for the person and for the philosophy.^{6}

The story of von Neumann also exemplifies other challenges. We scientists change our views over time. We leave fragmentary records of our thoughts, not atypically muddled by the compromises of coauthorship and journal publication. (The rise of an electronic preprint culture has, among other things, provided a way to track what squeezing our works into journals can do to them; e.g. [83].) And because our attitudes can be moving targets, combining a physicist's statement from year *N* with another statement they made in year *N*+20 to deduce what they ‘must logically have believed’ is an exercise fraught with a scholarly kind of peril.

The scholarly literature is quite capable of spreading urban legends of its own [84]. What happens when we bring Wikipedia into the mix?

The Wikipedia project bills itself as ‘the free encyclopedia that anyone can edit’; even if we gloss ‘anyone’ as ‘anyone with an Internet connection’, this is only true to a first approximation. Individual users, or the machines they edit from, can be blocked from editing for various lengths of time for offences such as persistent hate speech [85]. Also, individual pages can be protected from editing to different extents. Wikipedia has its own policies and community institutions [86], often referred to by acronyms (NPOV, NOR, FAC, FARC and so forth). These, in addition to sheer size and visibility, distinguish Wikipedia from other applications of the wiki concept, such as the nLab [87]. For example, Wikipedia has a ‘No Original Research’ policy [88], but the nLab has as one of its primary goals the facilitation of original mathematical work.

On Wikipedia, the ‘right to figure forever in the history of the subject like a fly in amber’ [89] can be supported even by a mention in nothing more substantial than the popular press, such as by *New Scientist* magazine, despite the well-known failure modes of that industry [20], p. 2221.

Thanks to the No Original Research policy mentioned earlier, correcting misconceptions propagated by popular-science magazines and the like cannot begin with Wikipedia itself. The NOR policy makes sense for what Wikipedia is and what it tries to be: an encyclopedia is a tertiary source, rather than a primary or secondary one. Moreover, Wikipedia lacks the infrastructure to evaluate original scholarship, and identifying who wrote what in its articles is an arduous task, making it a poor place to advance new claims in a forthright way. Academic life depends on receiving credit for one's own work, and the way Wikipedia articles are made flattens all contributions together, obscuring authorship. This essay is an attempt to do in a different venue what cannot feasibly be done within Wikipedia alone: set the record straight.

## Competing interests

The author declares that they have no competing interests.

## Funding

We received no funding for this study.

## Acknowledgements

I thank John DeBrota and Chris Fuchs for discussions during writing and editing.

## Footnotes

One contribution of 14 to a theme issue ‘Quantum foundations: information approach’.

^{1} This is not to say that the more technical side of QBism is without historical and philosophical interest. The roots of the mathematics involved go back to Schwinger [34] and Weyl [35], §IV.D.14, indeed to the very transition from the ‘old quantum theory’ to the new [20], pp. 2055–2056, 2257–2258, 2280. And, in the symmetric informationally complete representation of quantum states and channels, the Born rule, usually written as *p*(*i*)=tr(*ρE*_{i}), and unitary evolution, typically written as *ρ*′=*UρU*^{†}, take *the same form.* Both are simple affine deformations of the law of total probability [19]. This clarifies that both are *synchronic* relations between probability ascriptions [18]. Alice carries a probability distribution for an informationally complete measurement, which she uses to summarize her expectations. Alice can calculate other probability distributions from it, synchronically, including distributions for other informationally complete measurements which she might carry out in the distant future.

^{2} The mathematical point of the multiplicity of ensemble decompositions was made most famously by Hughston *et al.* [61] in 1993. It was also demonstrated almost 60 years earlier by Schrödinger [62], who disclaimed priority for the result, suggesting that some form of the idea was folk knowledge or shared conversationally at the time. The fact that a mixed state can be decomposed in multiple ways is one of the phenomena reproduced in the ‘epistricted’ models of Spekkens [63,64].

^{3} Democritus, fragment DK 68B9, which comes to us via Sextus Empiricus, *Against the Mathematicians* (*ca* 200), 7.136. The translation in K. Freeman's *Ancilla to the Pre-Socratic Philosophers* [65] is, ‘We know nothing accurately in reality, but [only] as it changes according to the bodily condition, and the constitution of those things that flow upon [the body] and impinge upon it’.

^{4} Lucretius, *De Rerum Natura* (*ca* 60–50 BCE), Book IV. The relevant passage begins with, ‘Invenies primis ab sensibus esse creatam / notitiem veri neque sensus posse refelli’ [You will find the knowledge of truth is created first by the senses, and the senses cannot be denied].

^{5} Arguably, von Neumann is not always consistent in his treatment of ‘psycho-physical parallelism’. A careful reading of *MFQM* §VI.3 suggests that it elides the ‘limit between the observed and the observer’ which he deems essential, both in *MFQM* (§VI.1, p. 420) and at the Warsaw conference. But teasing out the meaning of ‘psycho-physical parallelism’ is no simple task [66], and, for the purposes of this essay, pursuing it in greater depth is not essential.

^{6} I myself have heard Feynman claimed as a Copenhagener, a Gell-Mannian, an Everettic and an enthusiast for non-local hidden variables. The implication is that the correct interpretation of the quantum will be decided, not even by opinion poll [21,82], but by who occupies the most shelf space in the Caltech bookstore.

- Accepted January 10, 2016.

- © 2016 The Author(s)