A statistical theory fulfills the Bell criterion of realism if, together with its prediction of the probability distributions ρ(y,a) dy (where y ∈ Y are the measurement results and a ∈ A the control parameters of the measurement device), it also gives realistic explanations for them. Such an explanation consists of a probability distribution ρ(λ) dλ on a space Λ of "states of the real object" λ ∈ Λ and a function y(λ,a): Λ×A→Y, which describes the measurement result y if the object is in state λ, so that for every test function f(y) on Y we have:
∫ f(y) ρ(y,a) dy = ∫ f(y(λ,a)) ρ(λ) dλ.
It seems useful to distinguish two classes of theories: statistical theories and realistic theories.
The aim of a statistical theory is only to reproduce the observable statistical effects. Thus, a statistical theory is completely defined if, for every experiment, the resulting probability distribution ρ(y,a) dy is predicted by the theory. In other words, for every (continuous bounded) function of the measurement results f: Y → R the statistical theory has to define the expectation value
E(f|a) = ∫ f(y) ρ(y,a) dy.
The archetypical example of such a statistical theory is quantum theory.
Realistic theories have to define more than the probability distributions ρ(y,a) dy. They are obliged to postulate some realistic explanation, which is given in terms of a real state λ ∈ Λ of the measured object. We do not know the real state λ, and a realistic explanation does not need to fix it uniquely. It is sufficient as an explanation if we have a probability distribution ρ(λ) dλ on the space Λ of all possible states of reality.
The important point is that this real state cannot depend on the experimenters' free choice a of what to measure. Thus, unlike ρ(y,a) dy, the probability distribution ρ(λ) dλ is not allowed to depend on a.
The real, complete state λ of the measured object, together with the control parameters a chosen by the experimenter, already determines the result y of the experiment. Thus, we have some function y(λ,a): Λ × A → Y. This seems to introduce some element of determinism into our definition of realism. But a classical stochastic function, with an arbitrary probability space Ω, can be used as well and gives nothing new: We would obtain a function y(λ,a,ω): Λ × A × Ω → Y. But since we are free, in the realistic explanation, to define the space Λ, we can use the space Λ' ≅ Λ × Ω with the probability distribution ρ(λ') dλ' = ρ(λ) ρ(ω) dλ dω, so that y(λ,a,ω) = y(λ',a). This already defines a realistic interpretation with Λ' as the set of states of reality. Thus, replacing y(λ,a): Λ × A → Y by a stochastic function does not lead to any nontrivial generalization.
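The absorption of the stochastic variable ω into an enlarged state space can be illustrated with a small finite sketch. Everything concrete here is a hypothetical toy choice made only for illustration: the two-element spaces Λ and Ω, the numerical probabilities, and the outcome rule y_stochastic. The sketch builds Λ' = Λ × Ω with the product distribution and checks that the deterministic function on Λ' gives the same expectation values as the stochastic function on Λ.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy distributions (assumptions, not taken from the text).
RHO_LAM = {0: Fraction(1, 3), 1: Fraction(2, 3)}    # rho(lambda) on Lambda
RHO_OMEGA = {0: Fraction(1, 4), 1: Fraction(3, 4)}  # rho(omega) on Omega

def y_stochastic(lam, a, omega):
    """Stochastic outcome y(lambda, a, omega); the rule itself is illustrative."""
    return (lam + a + omega) % 2

# Enlarged state space Lambda' = Lambda x Omega with the product distribution.
RHO_LAM_PRIME = {
    (lam, omega): RHO_LAM[lam] * RHO_OMEGA[omega]
    for lam, omega in product(RHO_LAM, RHO_OMEGA)
}

def y_deterministic(lam_prime, a):
    """Deterministic outcome y(lambda', a) on the enlarged state space."""
    lam, omega = lam_prime
    return y_stochastic(lam, a, omega)

def expect_stochastic(f, a):
    """E(f|a): average over omega first, then over lambda."""
    return sum(
        RHO_LAM[lam]
        * sum(RHO_OMEGA[om] * f(y_stochastic(lam, a, om)) for om in RHO_OMEGA)
        for lam in RHO_LAM
    )

def expect_enlarged(f, a):
    """E(f|a): a single average over lambda' with the product distribution."""
    return sum(p * f(y_deterministic(lp, a)) for lp, p in RHO_LAM_PRIME.items())

if __name__ == "__main__":
    for a in (0, 1):
        print(a, expect_stochastic(lambda v: v, a), expect_enlarged(lambda v: v, a))
```

Both computations agree for every setting a, which is the sense in which the stochastic version adds nothing new in this toy case.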
Given the function y(λ,a): Λ × A → Y and the probability distribution ρ(λ) dλ, we can predict all observable expectation values E(f|a) by the formula:
E(f|a) = ∫ f(y) ρ(y,a) dy = ∫ f(y(λ,a)) ρ(λ) dλ.
Thus, the realistic theory can make the same empirical predictions as the statistical theory which it explains. But it may be that for some experimental results we cannot find an appropriate realistic interpretation. In this case, observing such a probability distribution would falsify the particular realistic theory.
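The relation between the two kinds of theory can be made concrete with a minimal finite sketch. The four-element state space, its probabilities, and the outcome rule y below are hypothetical choices made only for illustration. The sketch computes E(f|a) once from the observable distribution ρ(y,a), as a statistical theory would, and once from the a-independent distribution ρ(λ) together with y(λ,a), as the realistic explanation would.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical toy model: four "real states" with a distribution
# that does not depend on the setting a (an assumption for illustration).
RHO_LAMBDA = {
    0: Fraction(1, 2),
    1: Fraction(1, 4),
    2: Fraction(1, 8),
    3: Fraction(1, 8),
}

def y(lam, a):
    """Outcome in {-1, +1} for state lam and setting a (illustrative rule)."""
    return 1 if (lam + a) % 2 == 0 else -1

def rho_y(a):
    """Observable distribution rho(y, a): the image of rho(lambda) under y(., a)."""
    dist = defaultdict(Fraction)
    for lam, p in RHO_LAMBDA.items():
        dist[y(lam, a)] += p
    return dict(dist)

def expect_statistical(f, a):
    """E(f|a) computed from the observable distribution rho(y, a) alone."""
    return sum(f(out) * p for out, p in rho_y(a).items())

def expect_realistic(f, a):
    """E(f|a) computed from rho(lambda) and the function y(lambda, a)."""
    return sum(f(y(lam, a)) * p for lam, p in RHO_LAMBDA.items())

if __name__ == "__main__":
    for a in (0, 1):
        print(a, rho_y(a), expect_statistical(lambda v: v, a),
              expect_realistic(lambda v: v, a))
```

In this toy case a realistic explanation exists by construction. The nontrivial question in the text is the converse one: whether a given family ρ(y,a) admits any such pair ρ(λ), y(λ,a) at all.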
The motivation for this criterion of realism is the following alternative definition: The EPR-Bell criterion of realism is all we need, besides Einstein causality, to prove Bell's inequality.
The consequence is that this criterion of realism is rejected by the majority of modern physicists, a decision which is, in my opinion, wrong. The aim of these pages is to justify this opinion.
The EPR paper is famous for its EPR criterion of reality. We, instead, call our criterion the Bell criterion of realism. The reason for using Bell instead of EPR is clear: We focus our interest on the proof of Bell's inequality and thus want to fix what we need to prove this theorem.
But why realism instead of reality? This is, indeed, an interesting point worth explaining. Reality is what exists. Realism is, instead, a principle. It is a property of theories. There are realistic theories, and there are other types of theories, like mystic or solipsistic theories.
Our criterion may be applied to theories. It allows us to distinguish "realistic" theories (which fulfill this criterion) from other theories. Therefore, "criterion of realism" seems more appropriate.
The common sense notion of realism contains more than the EPR-Bell criterion of realism. This is a consequence of our alternative definition: All things which are part of common sense realism but are not necessary for the proof of Bell's inequality have been omitted as unnecessary. One very important omitted part is the requirement of consistency between the different partial explanations of different experiments.
Imagine A observes an object B, C observes A together with B, and D observes A, B, and C. All these observations lead to statistical predictions. Following the definition, all these statistical predictions need explanations. What is missing in our definition is that these explanations have to be compatible with each other. As a consequence of such compatibility conditions, realistic theories usually describe not only some particular objects, but the whole universe including observers.
This consistency requirement is what usually gives realistic theories additional predictive power. Imagine a large set of observations. Imagine that, for each particular observation, we have some explanation. But these explanations may be incompatible with each other. In a consistent realistic theory, this would be forbidden. Therefore, this set of observations would falsify the realistic theory. If we instead restrict ourselves to the statistical predictions, we have no such contradiction. Thus, we have fewer possibilities to falsify the theory and, therefore, less predictive power in the sense of Popper's criterion.
It does not follow that predefined hidden variables exist for all observables of quantum theory. This can be seen in the standard example of a realistic interpretation of quantum theory, the pilot wave theory. In this theory, only the configuration space observables have definite values. There exists a trajectory in configuration space and thus also a corresponding velocity, but the quantum momentum measurement p = -i ∂/∂q does not measure this velocity. Instead, it "measures" the result of a complex interaction between the "measured object" and the "measurement device", a result which, in particular, depends on the initial configuration of the "measurement device".
The criterion of realism also has nothing to do with determinism. Instead, every classical stochastic process is realistic according to our criterion.
Given the probability distribution ρ(λ) dλ and the function y(λ,a), it is possible to construct a probability distribution ρ(y,a) dy. To explain how, we have to define probability distributions. How can this be done in a way that laymen can follow? It seems that the dual definition is useful here: A probability distribution ρ(λ) dλ is defined if we have defined, for all continuous bounded functions f(λ), their expectation value E(f), which is usually written as
E(f) = ∫ f(λ) ρ(λ) dλ.
This map f(.) → E(f) has to fulfill some axioms (f ≥ 0 implies E(f) ≥ 0, E(1) = 1, E(f+g) = E(f) + E(g), E(cf) = cE(f) for constants c). If it does, the map defines a probability distribution. This seems to be a quite natural definition, essentially because we want to be able to compute such integrals anyway.
And with this definition, it is quite simple to construct the image ρ(y) dy of ρ(λ) dλ for a given function y(λ). Indeed, all we need is the formula
∫ f(y) ρ(y) dy = ∫ f(y(λ)) ρ(λ) dλ.
If ρ(λ) dλ is a probability distribution, then the term on the right hand side is defined for every function f(.) on Y, and has all the nice properties we need to define a probability distribution on Y. This can be easily checked: all that we need to prove the properties for the left hand side follows from the corresponding properties of the probability distribution ρ(λ) dλ.
In comparison with other ways to define probability distributions, this way seems to be quite simple and nice. Moreover, we would have to introduce the expression for expectation values anyway, because we use it in the proof of Bell's inequality.
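The check can be written out explicitly. The symbols E_Λ and E_Y below are labels introduced here only for this sketch, and it is assumed that f(y(λ)) is again a continuous bounded function of λ (for example, because y is continuous), so that E_Λ is defined on it.

```latex
% Notation introduced for this sketch:
%   E_\Lambda(g) = \int g(\lambda)\,\rho(\lambda)\,d\lambda ,
%   E_Y(f) := E_\Lambda(f \circ y) = \int f(y(\lambda))\,\rho(\lambda)\,d\lambda .
\begin{align*}
  f \ge 0 \;&\Rightarrow\; f \circ y \ge 0
      \;\Rightarrow\; E_Y(f) = E_\Lambda(f \circ y) \ge 0, \\
  E_Y(1) &= E_\Lambda(1 \circ y) = E_\Lambda(1) = 1, \\
  E_Y(f + g) &= E_\Lambda(f \circ y + g \circ y)
      = E_\Lambda(f \circ y) + E_\Lambda(g \circ y) = E_Y(f) + E_Y(g), \\
  E_Y(c f) &= E_\Lambda(c\,(f \circ y)) = c\,E_\Lambda(f \circ y) = c\,E_Y(f).
\end{align*}
```

Each line uses only the corresponding axiom for ρ(λ) dλ, applied to the composed function f(y(λ)).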