## Abstract

We review recent work that employs the framework of logical inference to establish a bridge between data gathered through experiments and their objective description in terms of human-made concepts. It is shown that logical inference, applied to experiments for which the observed events are independent and for which the frequency distribution of these events is robust with respect to small changes of the conditions under which the experiments are carried out, yields the quantum theoretical description of these experiments without introducing any concept of quantum theory: the Schrödinger and Pauli equations, and the descriptions of the Stern–Gerlach and Einstein–Podolsky–Rosen–Bohm experiments. The extraordinary descriptive power of quantum theory then follows from the fact that it is plausible reasoning, that is, common sense, applied to reproducible and robust experimental data.

## 1. Introduction

Quantum theory is unsurpassed as a description of the data produced by many very different experiments in (sub)-atomic, molecular and condensed matter physics, quantum optics, etc. A large body of work focuses on different interpretations of the quantum formalism [1–8] and its derivations from different sets of axioms [9–20] but offers no explanation of the success of quantum theory that goes beyond ‘that is because of the way it is’. In this paper, we review recent work that de-mystifies the extraordinary power of quantum theory [21–23] by formalizing the human thought processes by which relationships between data gathered through experiments and objective descriptions in terms of human-made concepts may be established. A basic premise of this approach is that scientific theories are the result of cognitive processing of the discrete events which are registered by our sensory system, of the logical and/or cause-and-effect relationships between those events, and the use of metaphors to make abstractions and construct concepts. This form of cognitive processing may be expressed in terms of the algebra of logical inference (LI), a mathematical framework that facilitates rational reasoning when there is uncertainty [24–28]. Statistical mechanics can be given an information-theoretical justification by viewing the former as a problem of LI, thereby establishing a relationship between the information-theoretic entropy [29] and the thermodynamic entropy [30,31]. The resulting maximum-entropy principle [26,28,30,31] has recently been generalized to a principle of entropic dynamics, a framework in which dynamical laws can be derived [32,33].

The LI approach, being extended logic, is not bound by ‘laws of physics’ and does not require assumptions such as ‘the observed events are signatures of an underlying objective reality—which may or may not be mathematical in nature’, ‘all things physical are information-theoretic in origin’, ‘the universe is participatory’, etc. It yields results that are unambiguous and independent of the individual subjective judgment, providing a rational explanation for the extraordinary descriptive power of quantum theory, and it also provides strong support for Bohr's statement [34] that ‘The physical content of quantum mechanics is exhausted by its power to formulate statistical laws governing observations under conditions specified in plain language’.

LI applies to situations where there may or may not be causal relationships between the events [26,28]. Extracting cause-and-effect relationships from empirical evidence is a highly non-trivial problem. In general, LI does not establish cause-and-effect relationships [28,35], although rational reasoning about these relations should comply with the rules of LI. Furthermore, a derivation of a quantum theoretical description from LI principles does not prohibit the construction of cause-and-effect mechanisms that create the *impression* that these mechanisms produce data that can be described by quantum theory [36,37]. In fact, there is a substantial body of work demonstrating that it is indeed possible to construct simulation models which reproduce, on an event-by-event basis, the results of interference/entanglement/uncertainty experiments with single photons/neutrons [38–42]. This demonstration does not imply that hidden variables, or anything of that kind, are real.

The LI approach which we review here leads to the view that quantum theory is a phenomenological theory which can be derived from a set of simple general principles, not axioms, in a way that is independent of any (strictly speaking, unknown) ‘more microscopic’ level of description. Therefore, its power does not depend on whether there exists an underlying classical world with some hidden variables, or not. In this sense, there is a clear parallel with Einstein's view on thermodynamics. Einstein did not regard thermodynamics as a constructive theory, an attempt to build a picture of complex phenomena out of some relatively simple propositions, but rather as a theory of principles based on empirically observed properties of phenomena, independent of a particular underlying model [43].

The paper is structured as follows. Section 2 briefly recapitulates the basic elements of LI and reviews applications to the Stern–Gerlach (SG) and Einstein–Podolsky–Rosen–Bohm (EPRB) experiments and to a particle in a potential. It shows how the LI approach directly leads to the probabilities for observing the events without invoking any concept of quantum theory. In §3, we discuss two methods for transforming the solutions obtained through LI into the equations that we know from quantum theory. A summary and discussion of more general aspects of the work presented in this paper are given in §4.

## 2. Logical inference

The key concept of the LI approach is the plausibility, denoted by *P*(*A* | *B*), which, in general, expresses the degree of belief of an individual that proposition *A* is true, given that proposition *B* is true [25,26,28,44]. The plausibility *P*(*A* | *B*) is an intermediate mental construct that serves to carry out inductive logic, that is rational reasoning, in a mathematically well-defined manner [26,28].

The algebra of LI can be derived from three so-called ‘desiderata’, namely (i) *plausibilities are represented by real numbers*, (ii) *plausibilities must exhibit agreement with rationality* and (iii) *all rules relating plausibilities must be consistent* [25–28]. These three desiderata only describe the essential features of the plausibilities and are not a set of axioms that plausibilities have to satisfy. It is a most remarkable fact that these three desiderata suffice to uniquely determine the set of rules by which plausibilities may be manipulated [25–28]. It can be shown [25–28] that plausibilities may be chosen to take numerical values in the range [0,1] and these values are related by three rules, namely: (1) *P*(*A* | *Z*)+*P*(*Ā* | *Z*)=1, where *Ā* denotes the negation of proposition *A* and *Z* is a proposition assumed to be true; (2) the ‘product rule’ *P*(*AB* | *Z*)=*P*(*A* | *BZ*)*P*(*B* | *Z*)=*P*(*B* | *AZ*)*P*(*A* | *Z*), where the ‘product’ *BZ* denotes the logical product (conjunction) of the propositions *B* and *Z*; and (3) *P*(*AĀ* | *Z*)=0 and *P*(*A*+*Ā* | *Z*)=1, where the ‘sum’ *A*+*B* denotes the logical sum (inclusive disjunction) of the propositions *A* and *B* [25–28]. The algebra of LI, as defined by the rules (1)–(3), contains Boolean algebra as a special case and is the foundation for powerful tools such as the maximum entropy method and Bayesian analysis [26,28]. The rules (1)–(3) are unique [26–28]: any other rule which applies to plausibilities represented by real numbers and is in conflict with rules (1)–(3) will be at odds with rational reasoning and consistency, as embodied by the desiderata (i)–(iii). It should be mentioned here that it is not allowed to define a plausibility for a proposition conditional on the conjunction of mutually exclusive propositions: reasoning on the basis of two or more contradictory premises is out of the scope of LI.
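As a small numerical illustration of how these rules are used, the sketch below assigns plausibilities to the four conjunctions of two propositions *A* and *B* and checks the negation rule, the two factorizations of the product rule (Bayes' theorem), and the sum rule for the inclusive disjunction that follows from them. The numerical values and all function names are assumptions chosen for illustration; they are not taken from the reviewed work.

```python
# Hypothetical plausibilities of the four conjunctions of A and B,
# all conditional on a background proposition Z (illustrative values).
joint = {
    (True, True): 0.30,    # P(A B | Z)
    (True, False): 0.20,   # P(A not-B | Z)
    (False, True): 0.10,   # P(not-A B | Z)
    (False, False): 0.40,  # P(not-A not-B | Z)
}


def marginal_a(a):
    """P(A=a | Z), summing over the truth values of B."""
    return sum(p for (a_val, _), p in joint.items() if a_val == a)


def marginal_b(b):
    """P(B=b | Z), summing over the truth values of A."""
    return sum(p for (_, b_val), p in joint.items() if b_val == b)


def a_given_b(a, b):
    """P(A=a | B=b, Z) from the product rule P(AB|Z) = P(A|BZ) P(B|Z)."""
    return joint[(a, b)] / marginal_b(b)


def b_given_a(b, a):
    """P(B=b | A=a, Z) from the product rule P(AB|Z) = P(B|AZ) P(A|Z)."""
    return joint[(a, b)] / marginal_a(a)


# Negation rule: P(A | Z) + P(not-A | Z) = 1.
negation_sum = marginal_a(True) + marginal_a(False)

# Equating the two factorizations of the product rule gives Bayes' theorem.
bayes_lhs = a_given_b(True, True)
bayes_rhs = b_given_a(True, True) * marginal_a(True) / marginal_b(True)

# Derived sum rule: P(A+B | Z) = P(A | Z) + P(B | Z) - P(AB | Z),
# compared with 1 - P(not-A not-B | Z).
disjunction = marginal_a(True) + marginal_b(True) - joint[(True, True)]
disjunction_direct = 1.0 - joint[(False, False)]

print(negation_sum, bayes_lhs, bayes_rhs, disjunction, disjunction_direct)
```

The checks only confirm internal consistency of one numerical assignment; they do not replace the uniqueness argument cited in the text.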

The applications of LI which we review in the present paper describe phenomena in a manner which is independent of individual subjective judgment. Therefore, to differentiate between the ‘objective’ and ‘subjective’ mode of application of LI, we will refer to the plausibility as ‘inference-probability’ or ‘i-prob’ for short. A more extensive discussion and arguments for distinguishing between plausibility, inference probability and Kolmogorov probability can be found in [22]. For the purpose of the present review, it is sufficient if the reader does not think of the i-prob as a frequency or probability in the traditional mathematical sense but merely as a numerical measure for the fact that proposition *A* is true, given that proposition *B* is true.

In real experiments, there is always uncertainty about some factors which may or may not influence the outcome of the measurements: it is presumptuous to assume that we know ‘everything’ about these factors. In particular, if experiment shows that (a) there is uncertainty about each individual event and (b) the conditions under which the experiment is carried out are also subject to uncertainties, then the data collected in such an experiment cannot be described by the traditional theories of classical physics, the reason being that the theoretical description of ‘classical physics’ assumes that there is absolute certainty about the outcome of each individual experiment on each individual object. By contrast, the LI approach is well suited to deal with uncertainties but, as will be explicitly shown later, to render the resulting description free of individual subjective judgement, it is necessary to assume that (c) the frequencies with which events are observed are reproducible and robust (to be discussed later) against small changes in the conditions. Furthermore, the LI approach only yields a quantum theoretical description if in addition we assume that (d) individual events are independent, meaning that knowing any (necessarily finite) set of events does not help to increase the certainty by which we can predict another (past or future) event that does not belong to the set.

If the experimental data comply with requirements (a)–(d), application of LI rather straightforwardly yields basic equations of quantum theory. The LI derivation of these equations has a generic structure. The first step is to list the features of the experiment that are deemed to be relevant and to introduce the i-probs of the individual events. The second step is to impose the condition that the experiment yields robust, reproducible results, not on the level of individual events, but on the level of the frequencies of observing many events and, depending on the problem, to impose other constraints about, for example, the fact that the particle moves, etc. As an example of such a constraint, we will use a natural requirement that the equations of classical mechanics should be as accurate as possible for the *average* results of many quantum experiments. The result of the second step is a functional of the i-prob. The third step is to solve the robust optimization problem defined in terms of this functional and, optionally, to transform the formal solution in terms of i-probs into linear equations which we recognize as basic equations of quantum theory.

### (a) Application: Stern–Gerlach experiment

The application of LI to event-based phenomena follows a particular pattern which is best illustrated by considering the simplest case, namely the SG experiment of which the schematic is shown in figure 1. In the SG experiment, there are two different outcomes which we label by the variable *x* taking the values *x*=+1 or *x*=−1. In an SG experiment, the source is activated at discrete times labelled by *i*=1,…,*N*, resulting in time series of detection events *x*_{i}=±1. The first step in the LI treatment is to assign to an individual event *x*=±1, an i-prob *P*(*x* | **a**,**M**,*Z*) to observe that event. Here, **a** and **M** are shorthands for the proposition that (within a small range) the directions of the magnet and of the magnetic moment of the particle are indeed **a** and **M**, respectively, and that the proposition *Z*, representing all the other conditions under which the experiment is performed, is true. It is assumed that the conditions represented by *Z* are fixed and identical for all experiments.

Assuming that the observed counts do not depend on the orientation of the chosen reference frame, *P*(*x* | **a**,**M**,*Z*) can only depend on **a**⋅**M** (by construction |**a**|=1 and |**M**|=1). Hence, we must have *P*(*x* | **a**,**M**,*Z*)=*P*(*x* | **a**⋅**M**,*Z*)=*P*(*x* | *θ*,*Z*), where cos *θ*=**a**⋅**M**. This assumption is necessary to consider **M** as the direction of the magnetic moment of the particle, whereas only **a** is known from experiment. Such symmetry requirements are very important for our construction, as they establish relationships between what is measured (position of detector) and what is supposed to be measured (characteristics of a particle). It is expedient to write *P*(*x* | *θ*,*Z*) as

$$P(x \mid \theta, Z) = \frac{1 + x E(\theta)}{2}, \qquad E(\theta) = \sum_{x=\pm1} x\, P(x \mid \theta, Z). \tag{2.1}$$

According to assumption (d), there is no relationship between the actual values of *x*_{n} and *x*_{n′} if *n*≠*n*′. With this assumption, repeated application of the product rule yields

$$P(x_1,\ldots,x_N \mid \mathbf{a},\mathbf{M},Z) = \prod_{i=1}^{N} P(x_i \mid \theta, Z). \tag{2.2}$$

Repeating the experiment *N* times yields *n*_{x} events of the type {*x*} (*n*_{+1}+*n*_{−1}=*N*) and the i-prob to observe the compound event {*n*_{+1}, *n*_{−1}} is given by [22]

$$P(n_{+1}, n_{-1} \mid \theta, N, Z) = N! \prod_{x=\pm1} \frac{P(x \mid \theta, Z)^{n_x}}{n_x!}. \tag{2.3}$$

Although the individual events may be expected to change from run to run, for sufficiently large *N* the numbers {*n*_{+1},*n*_{−1}} should exhibit some kind of robustness with respect to small changes of *θ*. Otherwise, the {*n*_{+1},*n*_{−1}} would vary erratically with *θ* and these ‘irreproducible’ experiments would be discarded. Obviously, the expected robustness with respect to small variations should be reflected in the expression of the i-prob to observe the data (within the usual statistical fluctuations).

If the outcome of the experiment is indeed described by the i-prob equation (2.3) and the experiment is supposed to yield reproducible, robust results, small changes of *θ* should not have a drastic effect on the outcome. Let *H*_{0} and *H*_{1} be the hypotheses that the data {*n*_{+1},*n*_{−1}} are observed if the angle between the unit vectors **a** and **M** is *θ* and *θ*+*ε*, respectively. The evidence Ev of hypothesis *H*_{1}, relative to hypothesis *H*_{0}, is defined by [26,28]

$$\mathrm{Ev} = \ln\frac{P(n_{+1},n_{-1} \mid \theta+\varepsilon,N,Z)}{P(n_{+1},n_{-1} \mid \theta,N,Z)} = \sum_{x=\pm1} n_x \ln\frac{P(x \mid \theta+\varepsilon,Z)}{P(x \mid \theta,Z)}, \tag{2.4}$$

where the logarithm serves to facilitate the algebraic manipulations. If *H*_{1} is more (less) plausible than *H*_{0} then Ev>0 (Ev<0). The absolute value of the evidence, |Ev|, is a measure for the robustness of the description (the sign of Ev is arbitrary, hence irrelevant): the smaller |Ev|, the more robust the experiment is for small changes of *θ*.
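The evidence can be evaluated directly for given counts. The sketch below assumes that the counts follow the binomial i-prob implied by independent events and uses the trial form *P*(*x* | *θ*,*Z*)=(1+*x* cos *θ*)/2, which at this point of the argument is only an assumption made for illustration; the function names and the numerical values are hypothetical.

```python
import math


def single_event_iprob(x, theta):
    """Trial i-prob P(x | theta, Z) = (1 + x*cos(theta))/2 for x = +1 or -1 (assumed form)."""
    return 0.5 * (1.0 + x * math.cos(theta))


def log_counts_iprob(n_plus, n_minus, theta):
    """Logarithm of the binomial i-prob to observe the counts {n_plus, n_minus}."""
    n = n_plus + n_minus
    log_binomial = math.lgamma(n + 1) - math.lgamma(n_plus + 1) - math.lgamma(n_minus + 1)
    return (log_binomial
            + n_plus * math.log(single_event_iprob(+1, theta))
            + n_minus * math.log(single_event_iprob(-1, theta)))


def evidence(n_plus, n_minus, theta, eps):
    """Ev of H1 (angle theta + eps) relative to H0 (angle theta)."""
    return log_counts_iprob(n_plus, n_minus, theta + eps) - log_counts_iprob(n_plus, n_minus, theta)


theta = 1.0
n_total = 10_000
# Counts set close to their expected values, purely for illustration.
n_plus = round(n_total * single_event_iprob(+1, theta))
n_minus = n_total - n_plus

for eps in (0.01, 0.02):
    print(eps, evidence(n_plus, n_minus, theta, eps))
```

With counts near their expected values the term linear in *ε* nearly cancels, so that Ev is dominated by the term quadratic in *ε*.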

The problem of determining the most robust description of the experimental data may now be formulated as follows: search for the i-probs *P*(*n*_{+1},*n*_{−1} | *θ*,*N*,*Z*) which minimize |Ev| for all possible *ε* (*ε* small) and for all possible *θ*. The clauses ‘for all possible *ε* and *θ*’ render the minimization problem an instance of a robust optimization problem. The robust optimization problem has a trivial solution, namely *P*(*n*_{+1},*n*_{−1} | *θ*,*N*,*Z*)=*P*(*n*_{+1},*n*_{−1} | *N*,*Z*), which can only describe experiments for which {*n*_{+1},*n*_{−1}} show no dependence on *θ*. Experiments which produce results that do not depend on the conditions seem fairly pointless and therefore we explicitly exclude i-probs that are constant with respect to changes of the conditions. It is not difficult to show [22] that our concept of a robust experiment implies that the i-probs which describe such an experiment can be found by minimizing |Ev|, subject to the constraints that (C1) *ε* is small but arbitrary, (C2) not all i-probs are independent of *θ* and (C3) |Ev| is independent of *θ* [21–23]. Omitting terms of *O*(*ε*^{3}), minimizing |Ev| while taking into account the constraints (C2) and (C3) amounts to finding the i-probs *P*(*x* | *θ*,*Z*) which minimize [22],

$$I_F = \sum_{x=\pm1}\frac{1}{P(x \mid \theta,Z)}\left(\frac{\partial P(x \mid \theta,Z)}{\partial\theta}\right)^{2}, \tag{2.5}$$

subject to the constraint that ∂*P*(*x* | *θ*,*Z*)/∂*θ*≠0.^{1} The r.h.s. of equation (2.5) is the Fisher information for the problem at hand and, because of the constraint (C3), should not depend on *θ*.

Using equation (2.1), we can rewrite equation (2.5) as *I*_{F}=(∂*E*(*θ*)/∂*θ*)^{2}/(1−*E*^{2}(*θ*)), yielding *E*(*θ*)=cos(*θ*√*I*_{F}+*ϕ*), where *ϕ* is an integration constant. As *E*(*θ*) is a periodic function of *θ* we must have √*I*_{F}=*K*, where *K* is an integer and hence *E*(*θ*)=cos(*Kθ*+*ϕ*). The solution *K*=*I*_{F}=0 is excluded from further consideration because it describes an experiment in which the frequency distribution of the observed data does not depend on *θ* (see constraint (C2)). Therefore, the physically relevant, non-trivial solution with minimum Fisher information corresponds to *K*=1. Furthermore, as *E*(*θ*) is a function of **a**⋅**M**=cos *θ* only, we must have *ϕ*=0, *π*. Therefore, for the SG experiment, the solution of the robust optimization problem reads

$$P(x \mid \mathbf{a}\cdot\mathbf{M},Z) = P(x \mid \theta,Z) = \frac{1 \pm x\,\mathbf{a}\cdot\mathbf{M}}{2}. \tag{2.6}$$

The ± sign in equation (2.6) reflects the fact that the mapping between *x*=±1 and the two different directions is only determined up to a sign.
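The statement that the Fisher information of the periodic solutions does not depend on *θ* can be checked numerically. The sketch below assumes i-probs of the form *P*(*x* | *θ*,*Z*)=(1+*x* cos(*Kθ*+*ϕ*))/2 and estimates the sum over *x* of (∂*P*/∂*θ*)^{2}/*P* by central finite differences; the function names, the step size and the sample angles are choices made for this illustration.

```python
import math


def expectation(theta, k=1, phi=0.0):
    """Candidate E(theta) = cos(k*theta + phi)."""
    return math.cos(k * theta + phi)


def iprob(x, theta, k=1, phi=0.0):
    """Assumed i-prob P(x | theta, Z) = (1 + x*E(theta))/2 for x = +1 or -1."""
    return 0.5 * (1.0 + x * expectation(theta, k, phi))


def fisher_information(theta, k=1, phi=0.0, h=1e-5):
    """Sum over x of (dP/dtheta)^2 / P, with dP/dtheta from a central difference."""
    total = 0.0
    for x in (+1, -1):
        dp = (iprob(x, theta + h, k, phi) - iprob(x, theta - h, k, phi)) / (2.0 * h)
        total += dp * dp / iprob(x, theta, k, phi)
    return total


# Sample angles chosen away from points where one of the i-probs vanishes.
angles = [0.3, 1.0, 2.0, 2.8]
for k in (1, 2, 3):
    print(k, [round(fisher_information(t, k), 6) for t in angles])
```

For each integer *K* the estimate is the same at all sampled angles and equals *K*^{2} up to the finite-difference error, so that among the non-trivial periodic candidates *K*=1 gives the smallest value.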

Comparing equation (2.6) with the quantum theoretical expression (which is exactly the same) demonstrates that Born's rule, one of the postulates of quantum theory, appears as a consequence of LI applied to robust experiments [22,23]. In the LI approach, equation (2.6) is not postulated but follows from the assumption that the (thought) experiment that is being performed yields the most reproducible results, revealing the conditions for an experiment to produce data which are described by quantum theory.

### (b) Application: Einstein–Podolsky–Rosen–Bohm experiment

The LI treatment of the EPRB experiment is, in essence, the same as that of the SG experiment. Therefore, we only discuss the main assumptions and present the results. Technical details can be found elsewhere [22].

Referring to the schematic shown in figure 2, the i-prob to observe a pair {*x*,*y*} is denoted by *P*(*x*,*y* | **a**_{1},**a**_{2},*Z*), where *Z* represents all the conditions under which the experiment is performed, with the exception of the directions **a**_{1} and **a**_{2} of the SG magnets *B*_{1} and *B*_{2}, respectively. It is important to note that *P*(*x*,*y* | **a**_{1},**a**_{2},*Z*) does not depend on **M**_{1} and **M**_{2}. In concert with the general assumption (d), it is assumed that there is no relationship between the actual values of the pairs {*x*_{i},*y*_{i}} and {*x*_{i′},*y*_{i′}} if *i*≠*i*′, meaning that each repetition of the experiment represents an identical event of which the outcome is logically independent of any other such event. Invoking the product rule, the logical consequence of this assumption is that

$$P(x_1,y_1,\ldots,x_N,y_N \mid \mathbf{a}_1,\mathbf{a}_2,Z) = \prod_{i=1}^{N} P(x_i,y_i \mid \mathbf{a}_1,\mathbf{a}_2,Z),$$

meaning that the i-prob *P*(*x*_{1},*y*_{1},…,*x*_{N},*y*_{N} | **a**_{1},**a**_{2},*Z*) to observe the compound event {{*x*_{1},*y*_{1}},…,{*x*_{N},*y*_{N}}} is completely determined by the i-prob *P*(*x*,*y* | **a**_{1},**a**_{2},*Z*) to observe the pair {*x*,*y*}.

We also assume that the i-prob *P*(*x*,*y* | **a**_{1},**a**_{2},*Z*) to observe a pair {*x*,*y*} does not change if we apply the same rotation to both magnets *B*_{1} and *B*_{2}. Expressing this invariance with respect to rotations of the coordinate system (Euclidean space and Cartesian coordinates are used throughout this paper) in terms of i-probs yields *P*(*x*,*y* | **a**_{1},**a**_{2},*Z*)=*P*(*x*,*y* | *R***a**_{1},*R***a**_{2},*Z*), where *R* denotes an arbitrary rotation in three-dimensional space which is applied to both magnets *B*_{1} and *B*_{2}, implying that *P*(*x*,*y* | **a**_{1},**a**_{2},*Z*) is a function of the inner product **a**_{1}⋅**a**_{2} only. Therefore, we must have *P*(*x*,*y* | **a**_{1},**a**_{2},*Z*)=*P*(*x*,*y* | **a**_{1}⋅**a**_{2},*Z*)=*P*(*x*,*y* | *θ*,*Z*), where *θ*=arccos(**a**_{1}⋅**a**_{2}) denotes the angle between the unit vectors **a**_{1} and **a**_{2}. Note that for any integer value of *K*, *θ*+2*πK* represents the same physical arrangement of the magnets *B*_{1} and *B*_{2}. From the algebra of LI, it follows that the i-prob to observe *x*, irrespective of the observed value of *y*, is given by *P*(*x* | **a**_{1},**a**_{2},*Z*)=∑_{y=±1}*P*(*x*,*y* | **a**_{1},**a**_{2},*Z*). In the context of the EPRB experiment, it is assumed that observing *x*=+1 is as likely as observing *x*=−1, independent of the observed value of *y*. This implies that we must have *P*(*x*=+1 | **a**_{1},**a**_{2},*Z*)=*P*(*x*=−1 | **a**_{1},**a**_{2},*Z*) which, in view of the fact that *P*(*x*=+1 | **a**_{1},**a**_{2},*Z*)+*P*(*x*=−1 | **a**_{1},**a**_{2},*Z*)=1, implies that *P*(*x*=+1 | **a**_{1},**a**_{2},*Z*)=*P*(*x*=−1 | **a**_{1},**a**_{2},*Z*)=1/2. Applying the same reasoning to the assumption that, independent of the observed values of *x*, observing *y*=+1 is as likely as observing *y*=−1 yields *P*(*y*=+1 | **a**_{1},**a**_{2},*Z*)=*P*(*y*=−1 | **a**_{1},**a**_{2},*Z*)=1/2. Note that we did not assign any prior i-prob nor did we make any reference to concepts such as the singlet state. Although the symmetry properties which have been assumed are reminiscent of those of the singlet state, this is deceptive: without altering the assumptions, the LI approach also yields the correlations for the triplet states [22].

Adopting the same reasoning as in §2a, it follows directly from assumptions (a)–(d) that the i-prob to observe a pair {*x*,*y*} takes the form [22] *P*(*x*,*y* | *θ*,*Z*)=[1+*xyE*_{12}(*θ*)]/4, where *E*_{12}(*θ*)=*E*_{12}(**a**_{1},**a**_{2},*Z*) is a periodic function of *θ*. Minimization of the corresponding expression of |Ev| while taking into account the constraints (C2) and (C3) (see §2a) is tantamount to finding the i-probs *P*(*x*,*y* | *θ*,*Z*) which minimize [22]

$$I_F = \sum_{x,y=\pm1}\frac{1}{P(x,y \mid \theta,Z)}\left(\frac{\partial P(x,y \mid \theta,Z)}{\partial\theta}\right)^{2} = \frac{1}{1-E_{12}^{2}(\theta)}\left(\frac{\partial E_{12}(\theta)}{\partial\theta}\right)^{2}, \tag{2.7}$$

subject to the constraint that ∂*P*(*x*,*y* | *θ*,*Z*)/∂*θ*≠0 for some pairs (*x*,*y*). Equation (2.7) is readily integrated to yield *E*_{12}(*θ*)=cos(*θ*√*I*_{F}+*ϕ*), where *ϕ* is an integration constant. As *E*_{12}(*θ*) is a periodic function of *θ*, we must have √*I*_{F}=*K*, where *K* is an integer and hence *E*_{12}(*θ*)=cos(*Kθ*+*ϕ*). Because of constraint (C2), we exclude the case *K*=*I*_{F}=0 from further consideration. Hence the physically relevant, non-trivial solution corresponds to *K*=1. Furthermore, as *E*_{12}(*θ*) is a function of **a**_{1}⋅**a**_{2}=cos *θ* only, we must have *ϕ*=0,*π*, reflecting an ambiguity in the definition of the direction of *B*_{1} relative to the direction of *B*_{2}. Choosing the solution with *ϕ*=*π*, we find

$$P(x,y \mid \mathbf{a}_1,\mathbf{a}_2,Z) = \frac{1 - xy\,\mathbf{a}_1\cdot\mathbf{a}_2}{4}, \tag{2.8}$$

and *E*_{12}(*θ*)=−**a**_{1}⋅**a**_{2}, in complete agreement with the quantum theoretical description of two particles in the singlet state [45,46]. As the LI treatment of a robust EPRB experiment directly yields the probabilistic description that we know from quantum theory without invoking the notions of the latter, it follows that the concept of quantum entanglement cannot be essential for describing the data produced by EPRB experiments.
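The properties of the pair i-prob can be verified in a few lines. The sketch below assumes the *ϕ*=*π* solution in the form *P*(*x*,*y* | *θ*,*Z*)=(1−*xy* cos *θ*)/4 and checks normalization, the marginals and the correlation; the function names and the sample angle are illustrative assumptions.

```python
import math

OUTCOMES = (+1, -1)


def pair_iprob(x, y, theta):
    """Assumed i-prob P(x, y | theta, Z) = (1 - x*y*cos(theta))/4."""
    return 0.25 * (1.0 - x * y * math.cos(theta))


def marginal_x(x, theta):
    """P(x | theta, Z), obtained by summing over y."""
    return sum(pair_iprob(x, y, theta) for y in OUTCOMES)


def marginal_y(y, theta):
    """P(y | theta, Z), obtained by summing over x."""
    return sum(pair_iprob(x, y, theta) for x in OUTCOMES)


def correlation(theta):
    """E12(theta): the average of x*y over the four outcomes."""
    return sum(x * y * pair_iprob(x, y, theta) for x in OUTCOMES for y in OUTCOMES)


theta = 0.7  # hypothetical angle between the two magnet directions
total = sum(pair_iprob(x, y, theta) for x in OUTCOMES for y in OUTCOMES)
print(total, marginal_x(+1, theta), marginal_y(-1, theta))
print(correlation(theta), -math.cos(theta))
```

The marginals equal 1/2 for every angle, while the correlation varies as −cos *θ*: all dependence on the magnet settings sits in the joint i-prob.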

It may be of interest to mention here that, in spite of the widely spread claims that real EPRB experiments have proved quantum theory correct, none of the three different experiments for which data have been made available [47–49] survives the confrontation with the five-standard-deviation-criterion hypothesis test that the data comply with the quantum theoretical description given by equation (2.8) [50,51]. It seems that, for the time being, only computer experiments are able to generate data that are not in conflict with the quantum theoretical description of the EPRB thought experiment [39,50].

### (c) Application: particle in a potential

For simplicity of notation and presentation, in the present review, we only discuss the problem of inferring the plausibility that the particle is at a certain position *X* on a line and produces a click on a detector at position *x*, also on a line. The fully fledged three-dimensional derivation including electromagnetic potentials and/or spin can be found elsewhere [22,23].

The measurement scenario is as follows. We imagine *N* repetitions (*n*=1,…,*N*) of an experiment performed on a particle moving on a line of linear extent [−*L*,*L*]. Nothing is known about the direction of motion of the particle. In each such experiment, a source emits a signal at discrete times labelled by the integer *τ*=1,…,*M*. It is assumed that for each repetition, the particle is at the unknown position −*L*≤*X*_{τ}≤*L*. The signal solicits a response of the particle, generating a click of the detector at discrete position *j*_{n,τ} with −*K*≤*j*_{n,τ}≤*K*. The detectors −*K*,…,*K* have spatial extent Δ=*L*/*K* and are placed next to each other, completely covering the line segment [−*L*,*L*]. It is assumed that for each signal emitted by the source, one and only one of the 2*K*+1 detectors fires.

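The result of *N* repetitions of the experiment yields the dataset

$$\{\, j_{n,\tau} \mid -K \le j_{n,\tau} \le K;\ n=1,\ldots,N;\ \tau=1,\ldots,M \,\}, \tag{2.9}$$

or, denoting the total count of clicks of detector *j* at time *τ* by 0≤*k*_{j,τ}≤*N*, we have

$$\Bigl\{\, k_{j,\tau} \Bigm| \tau=1,\ldots,M;\ \sum_{j=-K}^{K} k_{j,\tau} = N \,\Bigr\}. \tag{2.10}$$

Following the general procedure, the next step is to introduce the i-prob *P*(*j* | *X*_{τ},*τ*,*Z*), expressing the relation between a particle at unknown location *X*_{τ} at discrete time *τ* and the click of the detector at position *j*. The conditions represented by *Z* are fixed and identical for all experiments. Note that we will not try to estimate the unknown position *X* but rather determine the i-prob *P*(*j* | *X*_{τ},*τ*,*Z*) which yields the most robust set of data, robust with respect to small changes of *X*_{τ} for all *τ*. According to basic assumption (d), there is no relation between the actual values of *j*_{n,τ} and *j*_{n′,τ′} if *n*≠*n*′ or *τ*≠*τ*′. Hence, for fixed positions *X*_{τ}, the i-prob to observe all the data is given by

$$P(\{k_{j,\tau}\} \mid X_1,\ldots,X_M,N,Z) = \prod_{\tau=1}^{M}\left[ N! \prod_{j=-K}^{K} \frac{P(j \mid X_\tau,\tau,Z)^{k_{j,\tau}}}{k_{j,\tau}!}\right]. \tag{2.11}$$

It is now straightforward to repeat the steps that led to equation (2.4) to find that the measure of a robust experiment is given by

$$\mathrm{Ev} = \sum_{\tau=1}^{M}\sum_{j=-K}^{K} k_{j,\tau}\ln\frac{P(j \mid X_\tau+\varepsilon_\tau,\tau,Z)}{P(j \mid X_\tau,\tau,Z)}, \tag{2.12}$$

where *ε*_{τ} denotes a small change of *X*_{τ}, and that the most robust non-trivial experiment is described by the i-prob *P*(*j* | *X*_{τ},*τ*,*Z*) which minimizes the Fisher information

$$I_F = \sum_{\tau=1}^{M}\sum_{j=-K}^{K}\frac{1}{P(j \mid X_\tau,\tau,Z)}\left(\frac{\partial P(j \mid X_\tau,\tau,Z)}{\partial X_\tau}\right)^{2}, \tag{2.13}$$

subject to the constraint that not all ∂*P*(*j* | *X*_{τ},*τ*,*Z*)/∂*X*_{τ} are zero and the additional constraints to be discussed below.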

As the Schrödinger equation is formulated in continuum space, it is necessary to replace equation (2.13) by its continuum limit,

$$I_F = \int dx \int dt\, \frac{1}{P(x \mid X(t),t,Z)}\left(\frac{\partial P(x \mid X(t),t,Z)}{\partial x}\right)^{2}, \tag{2.14}$$

where we assumed that it does not matter where in space we perform the experiment (homogeneity of space), implying that *P*(*x* | *X*(*t*),*t*,*Z*)=*P*(*x*+*ζ* | *X*(*t*)+*ζ*,*t*,*Z*), where *ζ* is an arbitrary real number, so that the derivative with respect to *X* may be replaced by minus the derivative with respect to *x*. As before, it is a symmetry requirement which allows us to regard the unknown quantity *X* as the ‘coordinate of a particle’ based on measurement of the coordinate of the detector that clicks. Technically speaking, after passing to the continuum limit, *P*(*x* | *X*(*t*),*t*,*Z*) denotes the probability density, not the probability itself, but as there can be no confusion about which case, discrete or continuum, we are considering, we use the same symbol for the probability density and the probability.
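The role of homogeneity can be illustrated with a density that depends on *x*−*X* only. The sketch below uses a Gaussian of width `sigma` centred at *X*, which is purely an assumption for illustration and not the solution obtained in the text, and evaluates the spatial part of the Fisher information, the integral over *x* of (∂*P*/∂*x*)^{2}/*P*, at a single time by the midpoint rule.

```python
import math


def gaussian_density(x, center, sigma):
    """Illustrative density P(x | X, Z) that depends on x - X only."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((x - center) / sigma) ** 2) / norm


def spatial_fisher_information(center, sigma, half_width=10.0, steps=20_000):
    """Midpoint-rule estimate of the integral over x of (dP/dx)^2 / P."""
    lower = center - half_width * sigma
    dx = 2.0 * half_width * sigma / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * dx
        p = gaussian_density(x, center, sigma)
        dp_dx = -(x - center) / sigma ** 2 * p  # analytic derivative of the Gaussian
        total += dp_dx * dp_dx / p * dx
    return total


sigma = 0.5
for center in (0.0, 3.0):
    print(center, spatial_fisher_information(center, sigma))
```

The estimate is the same for both positions and equals 1/`sigma`^{2} up to the quadrature error, as expected for a density that is merely shifted when *X* changes.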

In general, if there is no uncertainty about individual events, we expect the description to agree with classical theoretical mechanics. We use this ‘correspondence principle’ to incorporate classical theoretical mechanics into the LI approach [22,23]. In the absence of uncertainty and in line with the basic ideas of classical mechanics, the observed detector clicks form smooth trajectories. One such trajectory can always be represented by d*x*(*t*)/d*t*=*U*(*x*(*t*),*t*) but the function *U*(.,.) may not be ‘universal’ in the sense that it may change from experiment to experiment, e.g. with the initial conditions. However, if *U*(.,.) is universal and sufficiently ‘nice’ (we ignore technical details related to differentiability, etc.) we may write [52]

$$U(x,t) = \frac{\partial S(x,t)}{\partial x}, \tag{2.15}$$

where *S*(*x*,*t*)≡*S*(*x*(*t*),*t*) and it follows that

$$\frac{d^{2}x}{dt^{2}} = \frac{dU}{dt} = \frac{\partial U}{\partial t} + U\frac{\partial U}{\partial x} = \frac{\partial}{\partial x}\left[\frac{\partial S}{\partial t} + \frac{1}{2}\left(\frac{\partial S}{\partial x}\right)^{2}\right] \equiv -\frac{\partial V(x,t)}{\partial x}, \tag{2.16}$$

showing that if there exists a universal function *U*(*x*(*t*),*t*) which describes the data according to equation (2.15), then there exists a potential *V*(*x*,*t*) such that the Hamilton–Jacobi equation,

$$\frac{\partial S(x,t)}{\partial t} + \frac{1}{2}\left(\frac{\partial S(x,t)}{\partial x}\right)^{2} + V(x,t) = 0, \tag{2.17}$$

holds [52]. Thus, the assumption that in the absence of uncertainty, all the possible trajectories *x*(*t*) can be described by one function *U*(.,.) quite straightforwardly yields equation (2.17), that is one of the formulations of classical theoretical mechanics.
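A concrete case may help to see how one function *U* generates a whole family of trajectories. The sketch below assumes the relations *U*=∂*S*/∂*x* and ∂*S*/∂*t*+(∂*S*/∂*x*)^{2}/2+*V*=0 (mass absorbed into *S* and *V*) and takes the trial function *S*(*x*,*t*)=*x*^{2}/(2*t*) with *V*=0, a textbook free-motion example chosen here for illustration. It checks the Hamilton–Jacobi residual by finite differences and integrates d*x*/d*t*=*U*(*x*,*t*) with a fourth-order Runge–Kutta scheme; all function names are hypothetical.

```python
def action(x, t):
    """Trial function S(x, t) = x^2 / (2 t), describing free motion (V = 0)."""
    return x * x / (2.0 * t)


def velocity_field(x, t, h=1e-5):
    """U(x, t) = dS/dx, estimated by a central difference."""
    return (action(x + h, t) - action(x - h, t)) / (2.0 * h)


def hamilton_jacobi_residual(x, t, potential=0.0, h=1e-5):
    """dS/dt + (dS/dx)^2 / 2 + V, which should vanish for a valid S."""
    ds_dt = (action(x, t + h) - action(x, t - h)) / (2.0 * h)
    return ds_dt + 0.5 * velocity_field(x, t, h) ** 2 + potential


def integrate_trajectory(x0, t0, t1, steps=1000):
    """Integrate dx/dt = U(x, t) from t0 to t1 with fourth-order Runge-Kutta."""
    dt = (t1 - t0) / steps
    x, t = x0, t0
    for _ in range(steps):
        k1 = velocity_field(x, t)
        k2 = velocity_field(x + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = velocity_field(x + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = velocity_field(x + dt * k3, t + dt)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return x


print(hamilton_jacobi_residual(1.5, 2.0))
# Different initial positions give different straight-line trajectories x(t) = x0 * t / t0.
for x0 in (1.0, 2.0):
    print(x0, integrate_trajectory(x0, 1.0, 3.0))
```

Here the same *U*(*x*,*t*)=*x*/*t* describes every trajectory, whatever the initial position, which is the sense in which *U* is ‘universal’ in this example.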

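In the presence of uncertainty about individual events, we can combine the notion of a robust experiment and the desire to recover equations of classical mechanics as a limiting case by searching for the robust (i.e. for all *X*(*t*)) minima of the functional [22,23]

$$F = \int dx \int dt \left\{ \frac{1}{P(x \mid X(t),t,Z)}\left(\frac{\partial P(x \mid X(t),t,Z)}{\partial x}\right)^{2} + \lambda\left[\frac{\partial S(x,t)}{\partial t} + \frac{1}{2m}\left(\frac{\partial S(x,t)}{\partial x}\right)^{2} + V(x,t)\right] P(x \mid X(t),t,Z) \right\}, \tag{2.18}$$

where *λ* is a parameter having dimension s^{2}/(kg m^{4}) and, for convenience of comparing with quantum theory, we introduced the mass *m* of the particle by substituting *S*→*S*/*m* and *V*→*V*/*m*.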

Standard variational calculus yields the extrema of equation (2.18) in terms of two coupled nonlinear first-order differential equations of the functions *P*(*x* | *X*(*t*),*t*,*Z*) and *S*(*x*,*t*) which are identical to the (one-dimensional version of the) equations that appear in Madelung's hydrodynamical form [53] or Bohm's interpretation [1] of quantum theory. However, equation (2.18) was not derived from quantum theory but was obtained through logical inference from data produced by robust experiments and a correspondence principle, without invoking concepts of quantum theory. Therefore, in principle we do not need the latter to describe these experiments but we can use the equivalence of equation (2.18) and the mathematical framework of quantum theory to great advantage for turning the nonlinear equations into linear ones which can be solved by the powerful machinery of linear algebra. Technical details of the derivation of the functionals analogous to *F* for the multidimensional Schrödinger equation and Pauli equation for a particle with spin can be found in the original papers [22] and [23], respectively.
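The variational step can be sketched explicitly. The worked equations below assume that the functional has the form written on the first line, a reconstruction that is consistent with the stated dimension of *λ* rather than a quotation of equation (2.18), and that boundary terms vanish; *P* abbreviates *P*(*x* | *X*(*t*),*t*,*Z*) and *S* abbreviates *S*(*x*,*t*).

```latex
% Assumed form of the functional
F = \int dx \int dt \left\{ \frac{1}{P}\left(\frac{\partial P}{\partial x}\right)^{2}
    + \lambda\left[\frac{\partial S}{\partial t}
    + \frac{1}{2m}\left(\frac{\partial S}{\partial x}\right)^{2} + V\right] P \right\}

% Variation with respect to S: a continuity equation
\frac{\delta F}{\delta S} = 0
\;\Longrightarrow\;
\frac{\partial P}{\partial t}
+ \frac{\partial}{\partial x}\left(\frac{P}{m}\frac{\partial S}{\partial x}\right) = 0

% Variation with respect to P: a Hamilton-Jacobi equation with an extra term
\frac{\delta F}{\delta P} = 0
\;\Longrightarrow\;
\frac{\partial S}{\partial t}
+ \frac{1}{2m}\left(\frac{\partial S}{\partial x}\right)^{2} + V
+ \frac{1}{\lambda}\left[\frac{1}{P^{2}}\left(\frac{\partial P}{\partial x}\right)^{2}
- \frac{2}{P}\frac{\partial^{2} P}{\partial x^{2}}\right] = 0

% The extra term depends on P only and can be written in terms of \sqrt{P}
\frac{1}{\lambda}\left[\frac{1}{P^{2}}\left(\frac{\partial P}{\partial x}\right)^{2}
- \frac{2}{P}\frac{\partial^{2} P}{\partial x^{2}}\right]
= -\frac{4}{\lambda}\,\frac{1}{\sqrt{P}}\frac{\partial^{2}\sqrt{P}}{\partial x^{2}}
```

Under these assumptions the first condition expresses conservation of *P* and the second is the Hamilton–Jacobi equation supplemented by a term that depends on *P* only, which is the structure of the hydrodynamical form referred to in the text.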

## 3. Connecting with quantum theory

The LI approach yields descriptions of robust experiments in terms of i-probs. In this section, we discuss two different methods of transforming these i-probs into the wavefunction formalism of quantum theory.

The first method is based on the general observation that in scientific reasoning it is good practice to reduce the complexity of the description of the whole by separating the description of data into several parts. We consider different ways of organizing the observed data and scrutinize the conditions under which a description of the various parts of the experiment can be separated (as much as possible). Then we show, in the case of the SG and EPRB experiments, how the wavefunction description naturally emerges as a result of this separation procedure. It automatically follows that the wavefunction (or density matrix) description is less general than the one in terms of conditional probabilities in the sense that the former can only describe situations in which the separation procedure can actually be carried out.

The second method employs a polar representation of the i-prob to bring the nonlinear robust optimization problem into a linear form and is most useful for cases that involve dynamics, yielding Schrödinger-like equations.

### (a) Separation procedure [54]

Consider again the SG experiment (figure 1) yielding the dataset

$$\{\, x_i \mid x_i = \pm1;\ i=1,\ldots,N \,\}, \tag{3.1}$$

where *N* is the total number of recorded events. Suppose that the analysis of correlations among the observed *x*_{i} indicates that the *x*_{i} are independent events, in line with assumption (d) (see §2). Then the counts *N*(±1 | **a**,**M**,*Z*) of outcomes with *x*=±1 (*N*=*N*(+1 | **a**,**M**,*Z*)+*N*(−1 | **a**,**M**,*Z*)) give a complete characterization of the data. In essence, all datasets having the same average

$$\langle x\rangle = \frac{1}{N}\sum_{i=1}^{N} x_i = \sum_{x=\pm1} x\, f(x \mid \mathbf{a},\mathbf{M},Z), \qquad f(x \mid \mathbf{a},\mathbf{M},Z) = \frac{N(x \mid \mathbf{a},\mathbf{M},Z)}{N}, \tag{3.2}$$

are equivalent. Assuming (as in §2a) that the observed counts do not depend on the orientation of the chosen reference frame, *f*(*x* | **a**,**M**,*Z*) can only depend on **a**⋅**M** (by construction |**a**|=1 and |**M**|=1). Hence, we must have *f*(*x* | **a**,**M**,*Z*)=*f*(*x* | **a**⋅**M**,*Z*).

Equation (3.2) is a holistic description of the data in terms of **a**⋅**M** and it is by no means obvious how to construct, if possible at all, a description in terms of a part that refers to the object (represented by **M**) and another part that refers to the magnet (represented by **a**). To explore the possibilities of separating in parts, it is expedient to consider alternative ways of writing equation (3.2). Let us first organize the data and frequencies in vectors **x**=(+1,−1)^{T} and **f**=( *f*(+1 | **a**,**M**,*Z*),*f*(−1 | **a**,**M**,*Z*))^{T}, respectively. Then, we trivially have

$$\langle x\rangle=\mathbf x^{\mathrm T}\mathbf f=\mathrm{Tr}\,\mathbf f\mathbf x^{\mathrm T}, \tag{3.3}$$

where **fx**^{T} is a 2×2 matrix and Tr *A* denotes the trace of the matrix *A*. Now note that *any* rewriting of **x** and **f** in terms of vectors, matrices, …, that does not change 〈*x*〉 yields the same complete description of the data. Therefore, with this in mind, we consider the rearrangement of the data into 2×2 (diagonal, hermitian) matrices *X* and *F* with elements *X*(*x*,*x*′)=*xδ*_{x,x′} and *F*(*x*,*x*′)=*f*(*x* | **a**,**M**,*Z*)*δ*_{x,x′}, respectively, and rewrite equation (3.3) as

$$\langle x\rangle=\mathrm{Tr}\,FX=\mathrm{Tr}\,\tilde F\tilde X, \tag{3.4}$$

where $\tilde F$ and $\tilde X$ can be any pair of 2×2 matrices that satisfies equation (3.4). Clearly, a formal rewriting of equation (3.3) such as equation (3.4) cannot, by itself, bring anything new, but representation (3.4) offers the flexibility that allows us to perform the separation by using some elementary linear algebra, as we now show.
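As an illustrative sketch (not part of the original derivation), the equivalence of the different ways of writing 〈*x*〉 can be checked numerically. The event record below is made up for illustration, and the helper name `sg_average_forms` is hypothetical.

```python
import numpy as np

def sg_average_forms(events):
    """Return <x> computed in four equivalent ways from a record of +1/-1 events."""
    events = np.asarray(events)
    n_total = events.size
    # Relative frequencies f(+1) and f(-1) of the two outcomes.
    f = np.array([np.sum(events == +1), np.sum(events == -1)]) / n_total
    x = np.array([+1.0, -1.0])
    direct = events.mean()                      # (1/N) sum_i x_i
    dot = x @ f                                 # x^T f
    outer = np.trace(np.outer(f, x))            # Tr f x^T
    diag = np.trace(np.diag(f) @ np.diag(x))    # Tr F X with diagonal F and X
    return direct, dot, outer, diag

# Hypothetical record of N = 8 events (six +1 outcomes, two -1 outcomes).
sample_events = [+1, -1, +1, +1, -1, +1, +1, +1]
averages = sg_average_forms(sample_events)
print(averages)
```

All four numbers coincide, which is the sense in which the rewritings carry the same information about the data.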

We know from linear algebra that any hermitian 2×2 matrix can be written as a linear combination of four hermitian 2×2 matrices. Without loss of generality, we may choose the Pauli-spin matrices ***σ***=(*σ*^{x},*σ*^{y},*σ*^{z}) and the unit matrix as the orthonormal basis set for the vector space of 2×2 matrices with an inner product defined by (*A*,*B*)=Tr *A*^{†}*B*. We may therefore write the two matrices of equation (3.4) as

$$\tilde F=\frac{\mathbb 1+\boldsymbol\rho\cdot\boldsymbol\sigma}{2},\qquad \tilde X=u_0\mathbb 1+\mathbf u\cdot\boldsymbol\sigma, \tag{3.5}$$

where ***ρ***=(*ρ*_{x},*ρ*_{y},*ρ*_{z}), *u*_{0} and **u**=(*u*_{x},*u*_{y},*u*_{z}) are all real-valued. It is now straightforward to show [23] that the desired separation can be realized by requiring that *u*_{0}=*u*_{0}(**a**,*Z*), *u*_{x}=*u*_{x}(**a**,*Z*), *u*_{y}=*u*_{y}(**a**,*Z*), *u*_{z}=*u*_{z}(**a**,*Z*), *ρ*_{x}=*ρ*_{x}(**M**,*Z*), *ρ*_{y}=*ρ*_{y}(**M**,*Z*) and *ρ*_{z}=*ρ*_{z}(**M**,*Z*) (recall that *Z* is considered to represent all fixed conditions which are important to the actual experiment but are not of immediate interest). Assuming that the observed counts do not depend on the orientation of the reference frame (see earlier), 〈*x*〉 is a function of **a**⋅**M** only. This requirement enforces ***ρ***=**M** and **u**=**a**. Hence, we have 〈*x*〉=*u*_{0}+**M**⋅**a** and, as |〈*x*〉|≤1, it follows that −1≤*u*_{0}+**M**⋅**a**≤1; for **a**=**M** and **a**=−**M** we have *u*_{0}≤0 and *u*_{0}≥0, respectively, hence *u*_{0}=0. Note that we could equally well have made the choice ***ρ***=**a** and **u**=**M** instead of ***ρ***=**M** and **u**=**a**. However, the former choice leads to inconsistencies, for instance when we consider an experiment in which we place several SG magnets in succession or consider the EPRB experiment.
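The trace algebra behind this step can be verified with a few lines of NumPy. The sketch below assumes the parametrization $\tilde F=(\mathbb 1+\boldsymbol\rho\cdot\boldsymbol\sigma)/2$ and $\tilde X=u_0\mathbb 1+\mathbf u\cdot\boldsymbol\sigma$, for which the trace of the product should equal *u*_{0}+***ρ***⋅**u**; all function and variable names are hypothetical.

```python
import numpy as np

# Pauli matrices and the 2x2 unit matrix.
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
ID2 = np.eye(2, dtype=complex)

def dot_sigma(v):
    """Return v . sigma for a real 3-vector v."""
    return v[0] * SX + v[1] * SY + v[2] * SZ

def separated_average(rho, u0, u):
    """Tr[(1 + rho.sigma)/2 (u0 + u.sigma)], expected to equal u0 + rho.u."""
    f_tilde = (ID2 + dot_sigma(rho)) / 2
    x_tilde = u0 * ID2 + dot_sigma(u)
    return np.trace(f_tilde @ x_tilde).real

def random_unit_vector(rng):
    """Draw a random direction on the unit sphere."""
    v = rng.normal(size=3)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
m_vec = random_unit_vector(rng)   # plays the role of M (source)
a_vec = random_unit_vector(rng)   # plays the role of a (magnet)

# With rho = M, u = a and u0 = 0 the average reduces to a . M.
avg = separated_average(rho=m_vec, u0=0.0, u=a_vec)
print(avg, np.dot(a_vec, m_vec))
```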

Thus, we have shown that the desire to represent the data of equation (3.1) such that the description of the whole experiment separates into a description of the ‘source’ (**M**) and a description of the ‘measurement device’ (**a**), together with some elementary linear algebra, is all that is needed to arrive at the unique description of the SG experiment in terms of 2×2 matrices,

$$\tilde F=\frac{\mathbb 1+\mathbf M\cdot\boldsymbol\sigma}{2},\qquad \tilde X=\mathbf a\cdot\boldsymbol\sigma, \tag{3.6}$$

conditional on the assumptions that the individual outcomes of an SG experiment are independent and that the frequency distribution of these outcomes does not depend on the orientation of the reference frame. From equation (3.6), it follows immediately that $\tilde F^{2}=\tilde F$, that is, $\tilde F$ is a projection. This implies that we can write [46]

$$\tilde F=|\Psi\rangle\langle\Psi|, \tag{3.7}$$

where the vector |*Ψ*〉 is expressed in the basis of the eigenstates (|↑〉,|↓〉) of the *σ*^{z} matrix.
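A short numerical check of the projection property is sketched below, assuming the source matrix has the form (1+**M**⋅***σ***)/2 with |**M**|=1. The direction chosen for **M** and all names are illustrative assumptions; the vector `psi` is obtained as the eigenvector with eigenvalue 1 and is fixed only up to a phase.

```python
import numpy as np

# Pauli matrices and the 2x2 unit matrix.
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
ID2 = np.eye(2, dtype=complex)

def source_matrix(m):
    """Return (1 + M . sigma)/2 for a real unit vector M."""
    return (ID2 + m[0] * SX + m[1] * SY + m[2] * SZ) / 2

m_vec = np.array([1.0, 2.0, 2.0]) / 3.0    # hypothetical unit vector
f_tilde = source_matrix(m_vec)

# eigh returns the eigenvalues of a hermitian matrix in ascending order.
eigvals, eigvecs = np.linalg.eigh(f_tilde)
psi = eigvecs[:, -1]                        # eigenvector with eigenvalue 1
projector = np.outer(psi, psi.conj())       # |Psi><Psi| in the (up, down) basis

print(np.round(eigvals, 12))
print(np.allclose(f_tilde @ f_tilde, f_tilde), np.allclose(projector, f_tilde))
```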

Summarizing: changing the representation of the data in combination with the desire to separate as much as possible the description of the source and measurement devices automatically enforces the Hilbert space structure that is a characteristic signature of quantum theory [23]. No postulates of quantum theory are required to derive (or postulate) equations (3.6) or (3.7). Furthermore, it is straightforward to extend the description to include mixed states [23].

There is nothing that forbids an experiment to yield, for instance, *f*(*x* | **a**⋅**M**,*Z*)=(1+*x*(**a**⋅**M**)^{k})/2 with *k*=2 (we certainly can generate such data using a digital computer, a metaphor of a physical device on which we carry out experiments). However, the data produced by such an experiment cannot be represented by equation (3.6). In other words, the class of conceivable SG experiments is significantly larger than the class of experiments that allow for the separation: the class of realizable SG experiments is (much) larger than the class of SG experiments describable by quantum theory.
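One way to make this statement concrete is a rank test, offered here as an illustration rather than as the argument used in the paper. Any separated description of the form Tr $\tilde F(\mathbf M)\tilde X(\mathbf a)$ with 2×2 matrices is a sum of at most four products of a function of **M** and a function of **a**, so a table of averages evaluated on sampled directions has rank at most 4. The sketch below computes this rank for 〈*x*〉=(**a**⋅**M**)^{k}; the sample size, seed and function names are assumptions.

```python
import numpy as np

def random_unit_vectors(n, rng):
    """Draw n random directions on the unit sphere."""
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def average_table_rank(k, n_samples=30, seed=1):
    """Rank of the table with entries (a_i . M_j)^k for sampled a_i and M_j."""
    rng = np.random.default_rng(seed)
    a = random_unit_vectors(n_samples, rng)
    m = random_unit_vectors(n_samples, rng)
    table = (a @ m.T) ** k
    return np.linalg.matrix_rank(table, tol=1e-9)

rank_k1 = average_table_rank(1)   # <x> = a . M, the separable case
rank_k2 = average_table_rank(2)   # <x> = (a . M)^2, the k = 2 example
print(rank_k1, rank_k2)
```

For *k*=1 the rank is 3, within the bound of 4, whereas for *k*=2 it is 6 (the dimension of the space of symmetric 3×3 tensors), which exceeds what a pair of 2×2 matrices can accommodate.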

The fact that the separation procedure leads, in such a simple manner, to the quantum theoretical description (3.6) of the SG experiment provokes the question: ‘What is so special about the case in which the separation procedure can be carried out?’ The answer is given in §2. Using equation (3.6), according to the postulate of quantum theory [46], the probability to observe an event *x* is given by $\mathrm{Tr}\,[(\mathbb 1+\mathbf M\cdot\boldsymbol\sigma)(\mathbb 1+x\,\mathbf a\cdot\boldsymbol\sigma)]/4=(1+x\,\mathbf a\cdot\mathbf M)/2$, which is exactly the same expression as the one obtained by the LI treatment of a robust SG experiment. In other words, if the SG experiment is robust, it may be equally well described by quantum theory.
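The agreement between the two expressions can be checked numerically. The sketch below assumes the source matrix (1+**M**⋅***σ***)/2 and the outcome projector (1+*x* **a**⋅***σ***)/2, and compares the resulting trace with (1+*x* **a**⋅**M**)/2, which follows from 〈*x*〉=**a**⋅**M** and *f*(+1)+*f*(−1)=1. It also generates a computer-simulated record of events; the directions, sample size, seed and function names are all illustrative choices.

```python
import numpy as np

# Pauli matrices and the 2x2 unit matrix.
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
ID2 = np.eye(2, dtype=complex)

def dot_sigma(v):
    """Return v . sigma for a real 3-vector v."""
    return v[0] * SX + v[1] * SY + v[2] * SZ

def trace_probability(x, a, m):
    """Tr[(1 + M.sigma)/2 (1 + x a.sigma)/2] for outcome x = +1 or -1."""
    f_tilde = (ID2 + dot_sigma(m)) / 2
    projector = (ID2 + x * dot_sigma(a)) / 2
    return np.trace(f_tilde @ projector).real

def frequency_expression(x, a, m):
    """(1 + x a.M)/2, the frequency implied by <x> = a.M."""
    return (1 + x * np.dot(a, m)) / 2

a_vec = np.array([0.0, 0.0, 1.0])                    # hypothetical magnet direction
m_vec = np.array([np.sin(0.7), 0.0, np.cos(0.7)])    # hypothetical source direction
p_up = trace_probability(+1, a_vec, m_vec)

# Simulated record of independent events drawn with probability p_up for x = +1.
rng = np.random.default_rng(42)
events = np.where(rng.random(100_000) < p_up, +1, -1)
freq_up = np.mean(events == +1)
print(p_up, frequency_expression(+1, a_vec, m_vec), freq_up)
```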

The application of the separation procedure to the EPRB experiment is an almost trivial extension of the application to the SG experiment. We start by writing the observations **x**=(+1,−1,+1,−1), **y**=(+1,+1,−1,−1) and frequencies ( *f*(+1,+1 | *θ*,*Z*), *f*(−1,+1 | *θ*,*Z*), *f*(+1,−1 | *θ*,*Z*), *f*(−1,−1 | *θ*,*Z*)) as 4×4 matrices *X*, *Y* and *F* with elements *X*([*x*,*y*],[*x*′,*y*′])=*xδ*_{x,x′}*δ*_{y,y′}, *Y* ([*x*,*y*],[*x*′,*y*′])=*yδ*_{x,x′}*δ*_{y,y′} and *F*([*x*,*y*],[*x*′,*y*′])=*f*(*x*,*y* | *θ*,*Z*)*δ*_{x,x′}*δ*_{y,y′}, respectively. Here we use the notation [*x*,*y*]=(1−*x*)/2+(1−*y*) to indicate that the pairs (*x*,*y*) and (*x*′,*y*′) specify the row and the column index, respectively (running from 0 to 3), of the matrices *X*, *Y* and *F*. We search for 4×4 matrices $\tilde X$, $\tilde Y$ and $\tilde F$ which satisfy

$$\langle x\rangle=\mathrm{Tr}\,FX=\mathrm{Tr}\,\tilde F\tilde X,\qquad \langle y\rangle=\mathrm{Tr}\,FY=\mathrm{Tr}\,\tilde F\tilde Y,\qquad \langle xy\rangle=\mathrm{Tr}\,FXY=\mathrm{Tr}\,\tilde F\tilde X\tilde Y, \tag{3.8}$$

and allow for the desired separation. Using the direct product of the Pauli-spin matrices $\boldsymbol\sigma_j=(\sigma_j^{x},\sigma_j^{y},\sigma_j^{z})$ for *j*=1,2 and the unit matrix as the orthonormal basis set for the vector space of 4×4 matrices, we may write (without loss of generality) $\tilde F=(\rho_0\mathbb 1+\boldsymbol\rho_1\cdot\boldsymbol\sigma_1+\boldsymbol\rho_2\cdot\boldsymbol\sigma_2+\boldsymbol\sigma_1\cdot\rho_{12}\cdot\boldsymbol\sigma_2)/4$, where the number *ρ*_{0}, the vectors ***ρ***_{j} and the matrix *ρ*_{12} are all real-valued. As each of the two sides of the EPRB experiment contains an SG magnet, consistency with the separated description of the SG experiment demands that we choose $\tilde X=\mathbf a_1\cdot\boldsymbol\sigma_1$ and $\tilde Y=\mathbf a_2\cdot\boldsymbol\sigma_2$. We find the explicit expression of $\tilde F$ by requiring that equation (3.8) holds. Focusing on the case of the EPRB experiment for which 〈*x*〉=〈*y*〉=0 and 〈*xy*〉=−**a**_{1}⋅**a**_{2}, it follows that *ρ*_{0}=1, ***ρ***_{1}=***ρ***_{2}=0 and that $\tilde F$ takes the form [23]

$$\tilde F=\frac{\mathbb 1-\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2}{4}. \tag{3.9}$$

It is not difficult to verify that $\tilde F^{2}=\tilde F$, hence equation (3.9) is the density matrix of a pure quantum state [46]. Computing the matrix elements of $\tilde F$ in the spin-up, spin-down basis of both spins, we find

$$\tilde F=|\Psi\rangle\langle\Psi| \tag{3.10}$$

and

$$|\Psi\rangle=\frac{|{\uparrow\downarrow}\rangle-|{\downarrow\uparrow}\rangle}{\sqrt 2}, \tag{3.11}$$

which we recognize as the quantum theoretical description of two spin-1/2 objects in the singlet state. Therefore, we have shown that rewriting the data gathered in an ideal EPRB thought experiment in a manner that allows for the envisaged separation naturally leads, without invoking postulates of quantum theory and/or probability theory, to the quantum theoretical description of two spins in the singlet state.
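The properties quoted for the EPRB case can be checked numerically. The sketch below assumes the matrix (1−***σ***_{1}⋅***σ***_{2})/4 and the Kronecker-product ordering |↑↑〉, |↑↓〉, |↓↑〉, |↓↓〉 of the two-spin basis (a choice made here for the code, not the index convention [*x*,*y*] of the text); the analyser directions and all names are hypothetical.

```python
import numpy as np

# Pauli matrices and the 2x2 unit matrix.
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
ID2 = np.eye(2, dtype=complex)
PAULI = (SX, SY, SZ)

def dot_sigma(v):
    """Return v . sigma for a real 3-vector v."""
    return v[0] * SX + v[1] * SY + v[2] * SZ

def spin1(v):
    """v . sigma_1 acting on the first spin only."""
    return np.kron(dot_sigma(v), ID2)

def spin2(v):
    """v . sigma_2 acting on the second spin only."""
    return np.kron(ID2, dot_sigma(v))

# sigma_1 . sigma_2 as a 4x4 matrix, and the candidate matrix (1 - sigma_1.sigma_2)/4.
s1_dot_s2 = sum(np.kron(s, s) for s in PAULI)
f_tilde = (np.eye(4) - s1_dot_s2) / 4

a1 = np.array([0.0, 0.0, 1.0])                      # hypothetical direction a_1
a2 = np.array([np.sin(0.4), 0.0, np.cos(0.4)])      # hypothetical direction a_2
mean_x = np.trace(f_tilde @ spin1(a1)).real
mean_y = np.trace(f_tilde @ spin2(a2)).real
mean_xy = np.trace(f_tilde @ spin1(a1) @ spin2(a2)).real

# Singlet state (|up,down> - |down,up>)/sqrt(2) in the Kronecker basis.
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
print(mean_x, mean_y, mean_xy, -np.dot(a1, a2))
```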

As in the case of the ideal SG experiment, the representation in parts puts a severe restriction on the kind of data that we can describe, again provoking the question: ‘What is so special about the case in which the separation procedure can be carried out?’ The answer is the same as in the case of the SG experiment: it is precisely for the special case of the robust EPRB experiment.

### (b) Equivalence with quadratic forms

In the case of the SG or EPRB experiment, the LI approach yields equations for the i-probs which are easy to solve directly. As the i-probs describe the data produced by robust experiments, the connection to the quantum formalism is mainly of pedagogical interest. However, if the equations for the i-probs are nonlinear, as in the case of a particle in a potential discussed in §2c, and not easy to solve, it is expedient to search for alternative equations that are much easier to solve. Fortunately, in the case at hand, we can make good use of the large body of work that explores mathematically equivalent forms of quantum theory.

Consider the quadratic functional of equation (3.12), with the shorthand notation *ψ*≡*ψ*(*x* | *X*(*t*),*t*,*Z*). Substituting the polar representation of *ψ* into equation (3.12) yields equation (2.18), demonstrating that equation (2.18) and equation (3.12) are equivalent (the ambiguity in the phase of *ψ* can be shown to be irrelevant [22]). On the other hand, the extrema of equation (3.12) are given by the solution of the linear partial differential equation (3.13), which turns into the time-dependent Schrödinger equation for a suitable value of the parameter λ.

From our derivation of equation (3.13) from LI principles, it is clear that (i) the actual value of λ can only be determined by comparing the outcome of calculations based on equation (2.18) or equation (3.13) with experimental data and that (ii) the wavefunction *ψ*(*x* | *X*(*t*),*t*,*Z*) is just a mathematical concept, a vehicle to solve a class of complicated nonlinear minimization problems through the minimization of quadratic forms. As a product of human imagination, this concept is an extraordinarily useful tool that serves no purpose other than transforming nonlinear equations into linear ones.
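The mechanism by which a polar representation turns a nonlinear expression into a quadratic one can be illustrated with a generic identity; this is a sketch and does not reproduce equations (2.18) or (3.12). Assuming *ψ*=√*P* e^{i*θ*} with a real phase *θ* (a hypothetical stand-in for the suitably rescaled action), one has |∂*ψ*/∂*x*|²=(∂*P*/∂*x*)²/(4*P*)+*P*(∂*θ*/∂*x*)², so a term that is nonlinear in *P* becomes quadratic in *ψ*. The snippet checks this identity on a grid with finite differences; the choices of *P*, *θ* and the grid are arbitrary.

```python
import numpy as np

x = np.linspace(-3.0, 3.0, 6001)            # grid with spacing 1e-3
prob = np.exp(-x**2) / np.sqrt(np.pi)       # hypothetical P(x) > 0
phase = 0.5 * x + 0.3 * x**2                # hypothetical real phase theta(x)
psi = np.sqrt(prob) * np.exp(1j * phase)    # polar representation of psi

def derivative(values):
    """Finite-difference derivative of real-valued samples on the grid x."""
    return np.gradient(values, x)

# Differentiate real and imaginary parts separately.
dpsi = derivative(psi.real) + 1j * derivative(psi.imag)

quadratic = np.abs(dpsi) ** 2                                          # quadratic in psi
nonlinear = derivative(prob) ** 2 / (4 * prob) + prob * derivative(phase) ** 2

# Compare away from the two end points, where np.gradient is less accurate.
max_mismatch = np.max(np.abs(quadratic - nonlinear)[1:-1])
print(max_mismatch)
```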

## 4. Conclusion

Using the simplest, non-trivial examples, it was shown how the application of LI to experiments for which the observed events are independent and for which the frequency distribution of these events is robust with respect to small changes of the conditions under which the experiments are carried out yields, without introducing any concept of quantum theory, some of the most basic equations of quantum theory. More extensive discussions of applications to the time-(in)dependent phenomena with or without spin can be found elsewhere [21,22,41]. Work to include relativistic effects is in progress [55].

The key point of an LI application to quantum physics experiments is to express precisely and unambiguously, using the mathematical framework of plausible reasoning [24–28], the conditions of a robust experiment; see §2. This translates into a global optimization problem for the i-prob, the solution of which may be very simple, as in the case of the SG and EPRB experiments, or may yield a fairly complicated nonlinear set of equations. The mathematical machinery of quantum theory appears as a result of transforming a set of nonlinear equations into a set of linear ones or emerges from the desire to separate the description into various parts.

It will not have escaped the reader that the LI approach reviewed in the present paper is void of postulates regarding ‘wavefunctions’, ‘observables’, ‘quantization rules’, ‘quantum measurements’ [56], ‘Born's rule’, etc., and that it raises no ‘interpretational’ issues. This is a direct consequence of the basic premise of the LI approach, namely that current scientific knowledge derives, through cognitive processes in the human brain, from the discrete events which are observed in laboratory experiments and from relationships between those events that we, humans, discover. These discrete events are not ‘generated’ according to certain quantum laws: instead these laws appear as the result of (the best) LI from the data.

This viewpoint seems completely in line with Bohr's view [34]: ‘Physics is to be regarded not so much as the study of something a priori given, but rather as the development of methods of ordering and surveying human experience. In this respect, our task must be to account for such experience in a manner independent of individual subjective judgment and therefore objective in the sense that it can be unambiguously communicated in ordinary human language.’ This, in our opinion, is exactly what the LI approach allows us to do. The extraordinary descriptive power of quantum theory then follows from the fact that it is plausible reasoning, that is common sense, applied to robust experiments.

From our LI derivations of some of the most basic equations of quantum theory, it follows that the latter describes only robust experiments. This is best illustrated by comparing the (high) accuracy by which quantum theory predicts, say, the ratios of the wavelengths of the Balmer absorption/emission lines of hydrogen [57] with the comparatively low accuracy of, say, EPRB experiments that purport to provide evidence for the singlet state of two spin-1/2 particles [39,50]. In the former case, the high accuracy originates from performing a very large number of experiments on a very large collection of identical atoms and, as in any statistical experiment, what we observe most of the time is the most robust response. Thus, the solution of the LI problem (e.g. the Schrödinger equation in the case at hand) is the one that is ‘observed’ most frequently. By contrast, in experiments that provide data on an event-by-event basis, the statistical samples are much smaller and the external conditions may vary significantly from one experiment to the next. In other words, these experiments are not as robust as the spectroscopic experiments. From the LI viewpoint, it is therefore natural that these experiments produce data that show (much) larger deviations from the quantum-theoretical prediction than spectroscopic data. By the same argument, the LI approach offers a rational explanation for the observation that it seems to take considerable effort to engineer nanoscale devices that operate in a regime such that the experimental data comply with quantum theory.

## Authors' contributions

All authors contributed to the material presented in and writing of this manuscript.

## Funding

M.I.K. acknowledges financial support by the European Research Council, project 338957 FEMTO/NANO.

## Competing interests

The authors have no competing interests.

## Footnotes

One contribution of 14 to a theme issue ‘Quantum foundations: information approach’.

1 In the course of deriving equation (2.5), our criterion of robustness enforces the intuitively obvious assignment *P*(*x*|*θ*,*Z*)=*n*_{x}/*N*, establishing the relationship between the epistemological concept (i-prob) and the physically measurable quantity (frequency of outcomes). It is at this point that the possibility to view the i-prob as a ‘subjective’ assignment is eliminated [22].

- Accepted November 4, 2015.

- © 2016 The Author(s)