A possible way to get the Born Rule in Many Worlds

[December 22, 2019]

The Many-Worlds Interpretation (MWI) of quantum mechanics is probably roughly correct. There is no reason to think that the rules of atomic phenomena would stop applying at larger scales when an experimenter becomes entangled with their experiment (kooky interjections about consciousness notwithstanding…).

However, MWI has the problem that it does not easily explain why quantum randomness leads to the probabilities that we observe. The Born Rule says that if a system is in a normalized state $\alpha|0\rangle + \beta|1\rangle$, then upon ‘measurement’ (in which we entangle with one or the other outcome), we measure the eigenvalue associated with the state $|0\rangle$ with probability

$$P[0] = |\alpha|^2.$$

The Born Rule is normally included as an additional postulate in MWI, and this is somewhat unsatisfying. Or at least, it is apparently difficult to justify, given that I’ve read a bunch of attempts, each of which talks about how there haven’t been any other satisfactory attempts. I think it would be unobjectionable to say that there is not a consensus on how to motivate the Born Rule from MWI without any other assumptions.

Anyway here’s an argument that I find somewhat compelling? See what you think.


1. A classical coin

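First let’s think about classical probability, but write it in a notation suggestive of quantum mechanics. Suppose we’re flipping a biased coin that gets heads with probability $p$ and tails with probability $q = 1 - p$. Let’s call its states $\mathbf{H}$ and $\mathbf{T}$, so the results of a coin flip are written as $p\,\mathbf{H} + q\,\mathbf{T}$ with $p + q = 1$. Upon $n$ iterations of classical coin-flipping we end up in state

$$(p\,\mathbf{H} + q\,\mathbf{T})^n = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k}\, \mathbf{H}^k \mathbf{T}^{n-k}$$

Where $\mathbf{H}^k \mathbf{T}^{n-k}$ means a state in which we have observed $k$ heads and $n-k$ tails (in any order).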

Now suppose this whole experiment is being performed by a poor experimenter who’s, like, locked in a box or something. The experimenter does the experiment, writes down what they think the probability of heads is, and then transmits that to us, outside of the box. So the only value we end up seeing is the value of their measurement of $p$, which we’ll call $\hat{p}$. The best estimate that the experimenter can give, of course, is their observed frequency $k/n$, so we might say that the resulting system’s states are identified by the probability perceived by the experimenter:

$$(p\,\mathbf{H} + q\,\mathbf{T})^n = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} \left[\hat{p} = \tfrac{k}{n}\right]$$

If you let $n$ get very large, the system with $\hat{p} = p$ will end up having the highest-magnitude amplitude, and so we expect to end up in a ‘universe’ where the measurement of the probability converges on the true value of $p$. This is easily seen, because for large $n$ the binomial distribution over $k$ converges to a normal distribution with mean $np$, which is to say $k/n = p$. So, asymptotically, the state $\hat{p} = p$ becomes increasingly high-amplitude relative to all of the others. This is a way of phrasing the law of large numbers.
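Here is a quick numerical sketch of that claim, not part of the original argument. It assumes a coin with heads probability `p` flipped `n` times, computes the binomial weight of each count `k` in log space (to avoid overflow), and reports the frequency `k/n` of the heaviest term. The function names are made up for illustration.

```python
from math import lgamma, log

def log_classical_weight(n: int, k: int, p: float) -> float:
    """log of C(n, k) * p^k * (1 - p)^(n - k); assumes 0 < p < 1."""
    log_binom = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_binom + k * log(p) + (n - k) * log(1.0 - p)

def most_likely_estimate(n: int, p: float) -> float:
    """The estimate k/n attached to the largest-weight term of the n-flip state."""
    k_peak = max(range(n + 1), key=lambda k: log_classical_weight(n, k, p))
    return k_peak / n

# The boxed experimenter's most heavily weighted report, for growing n.
for n in (10, 100, 1000):
    print(n, most_likely_estimate(n, p=0.37))
```

As `n` grows, the reported frequency can only differ from `p` by about `1/n`, which is the law of large numbers in this notation.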

I think this is as good an explanation as any as to what probability ‘is’. Instead of trying to figure out what it means for us to experience an infinite number of events and observe a probability, let’s just let an experimenter who’s locked in a box figure it out for us, and then just have them send us their results! Unsurprisingly, the experimenter does a good job of recovering classical probability.


2. A quantum coin

Now let’s try it for a quantum coin (okay, a qubit). The individual experiment runs are now given by $\alpha|0\rangle + \beta|1\rangle$, where $\alpha, \beta$ are probability amplitudes with $|\alpha|^2 + |\beta|^2 = 1$. Note that normalizing the squared magnitudes to sum to 1 doesn’t predetermine what the experienced probabilities are, and as we will see the normalization isn’t necessary.

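As before we generate a state that’s something like:

$$(\alpha|0\rangle + \beta|1\rangle)^{\otimes n} = \sum_{k=0}^{n} \binom{n}{k} \alpha^{n-k} \beta^{k}\, |0^{n-k} 1^{k}\rangle$$

Here $|0^{n-k} 1^{k}\rangle$ is the macrostate in which we have observed $n-k$ zeros and $k$ ones (in any order).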

Where are things going to go differently? A potential problem is that each of the measurement results that comprise a macrostate could have different phases, and there is no reason to think that they will add up neatly – there could be interference between different ways of getting the same result. I’m not totally sure this is reasonable, but it leads to an interesting result, so let’s assume it is.

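Consider running the experiment twice, but letting the $|1\rangle$ state in each run have a different phase $\theta_j$. (We can ignore the phase of $|0\rangle$ without loss of generality by treating it as an overall coefficient to the entire wave function.)

The state we generate will be:

$$\left(\alpha|0\rangle + \beta e^{i\theta_1}|1\rangle\right) \otimes \left(\alpha|0\rangle + \beta e^{i\theta_2}|1\rangle\right)$$

This is no longer a clean binomial distribution. Writing $|0^2\rangle = |00\rangle$ and $|1^2\rangle = |11\rangle$ for clarity, and lumping $|01\rangle$ and $|10\rangle$ into the single macrostate $|0^1 1^1\rangle$, the two-iteration wave function is:

$$\alpha^2\, |0^2\rangle + \alpha\beta\left(e^{i\theta_1} + e^{i\theta_2}\right) |0^1 1^1\rangle + \beta^2 e^{i(\theta_1 + \theta_2)}\, |1^2\rangle$$

And the middle coefficient $\alpha\beta\left(e^{i\theta_1} + e^{i\theta_2}\right)$ only has the same magnitude as the binomial value $2\alpha\beta$ when $\theta_1 = \theta_2$.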
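To make the two-run interference explicit, here is a short worked identity. It writes $\theta_1$ and $\theta_2$ for the phases attached to $|1\rangle$ in the first and second run, which is the labeling assumed throughout this sketch.

```latex
\left| e^{i\theta_1} + e^{i\theta_2} \right|
  = \left| e^{i(\theta_1+\theta_2)/2} \right|
    \left| e^{i(\theta_1-\theta_2)/2} + e^{-i(\theta_1-\theta_2)/2} \right|
  = 2 \left| \cos\!\left( \frac{\theta_1 - \theta_2}{2} \right) \right|
```

So the one-heads-one-tails coefficient has magnitude $2|\alpha\beta|\,|\cos((\theta_1 - \theta_2)/2)|$: it reaches the binomial value $2|\alpha\beta|$ only when the phases agree (mod $2\pi$), and it vanishes entirely when they differ by $\pi$.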


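Now let’s consider what this looks like as $n \to \infty$.

For a state with $k$ $|1\rangle$ terms, we end up with a sum of exponentials with $k$ phases in them:

$$E_{k,n} = \sum_{J \in S_{k,n}} e^{i \sum_{j \in J} \theta_j}$$

Here $S_{k,n}$ is the set of $k$-element subsets of $n$ elements. For instance if $k = 2$ and $n = 3$:

$$E_{2,3} = e^{i(\theta_1 + \theta_2)} + e^{i(\theta_1 + \theta_3)} + e^{i(\theta_2 + \theta_3)}$$

Our wave function for $n$ iterations of the experiment is given by

$$\sum_{k=0}^{n} \alpha^{n-k} \beta^{k} E_{k,n}\, |0^{n-k} 1^{k}\rangle$$

The classical version of this is a binomial distribution because $E_{k,n}$ is replaced with $\binom{n}{k}$. The quantum version observes some cancellation. We want to know: as $n \to \infty$, what value of $k$ dominates?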
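The phase sum is easy to compute directly for small cases, which may help make it concrete. The sketch below assumes one phase per run, stored in a list `thetas`, and sums over all `k`-element subsets of runs. The name `interference_sum` is invented here, not from any library.

```python
import cmath
from itertools import combinations
from math import comb, pi

def interference_sum(thetas, k):
    """Sum over all k-element subsets J of the runs of exp(i * sum of theta_j, j in J)."""
    return sum(cmath.exp(1j * sum(subset)) for subset in combinations(thetas, k))

n, k = 5, 2

# All phases equal: every term points the same way, so the magnitude
# matches the classical count of orderings, C(n, k).
aligned = interference_sum([0.0] * n, k)
print(abs(aligned), comb(n, k))

# Two runs with opposite phases: the two ways of seeing a single |1> cancel.
cancelled = interference_sum([0.0, pi], 1)
print(abs(cancelled))
```

The brute-force subset sum has $\binom{n}{k}$ terms, so this is only practical for small `n`; it is meant as a check on the definition, not as an efficient method.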

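We don’t know anything about the phases themselves, so we’ll treat them as classical independent random variables (uniformly distributed, since no phase is preferred). This means that $\mathbb{E}[e^{i\theta_j}] = 0$ and therefore $\mathbb{E}[E_{k,n}] = 0$ for all $k \geq 1$. But the expected magnitude is not 0. The sum of all of these random vectors forms a random walk in the complex plane, and the expected amplitude of a random walk is given by $\mathbb{E}[|E_{1,n}|] \propto \sqrt{n}$.

Briefly: this comes from the fact that

$$\mathbb{E}\left[|E_{1,n}|^2\right] = \mathbb{E}\Big[\sum_{j,l} e^{i(\theta_j - \theta_l)}\Big] = n + \sum_{j \neq l} \mathbb{E}\left[e^{i(\theta_j - \theta_l)}\right] = n$$

This means that the magnitude of the $k = 1$ term for our quantum coin is proportional to $\sqrt{n}$, rather than the classical value of $n$.

For $k > 1$, the same argument applies (it’s still basically a random walk), except that there are $\binom{n}{k}$ terms in the sum, so in every case we get an expected amplitude $\mathbb{E}[|E_{k,n}|] \propto \sqrt{\binom{n}{k}}$.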
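The random-walk scaling can be sanity-checked with a small Monte Carlo run. This sketch assumes the phases are independent and uniform on $[0, 2\pi)$ (my reading of "classical independent random variables"), and compares the average squared magnitude of the phase sum with the number of terms, $\binom{n}{k}$. The function names and the seed are arbitrary choices for illustration.

```python
import cmath
import random
from itertools import combinations
from math import comb, pi

def interference_sum(thetas, k):
    """Sum over all k-element subsets J of the runs of exp(i * sum of theta_j, j in J)."""
    return sum(cmath.exp(1j * sum(subset)) for subset in combinations(thetas, k))

def mean_square_magnitude(n, k, trials=20000, seed=0):
    """Monte Carlo estimate of E[|sum|^2] with phases drawn uniformly from [0, 2*pi)."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    total = 0.0
    for _ in range(trials):
        thetas = [rng.uniform(0.0, 2.0 * pi) for _ in range(n)]
        total += abs(interference_sum(thetas, k)) ** 2
    return total / trials

n = 6
for k in range(n + 1):
    # Columns: k, number of terms C(n, k), estimated mean squared magnitude.
    print(k, comb(n, k), round(mean_square_magnitude(n, k, trials=5000), 2))
```

If the argument holds, the last two columns should roughly agree, since every cross term between two different subsets averages to zero. The square root of that mean square is the $\sqrt{\binom{n}{k}}$ scaling, up to the constant of proportionality.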


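These don’t tell us the constant of proportionality, since $\mathbb{E}[|x|] \neq \sqrt{\mathbb{E}[|x|^2]}$, but fortunately we only need to compute the value of $k$ at the peak, and we can find that using the expected squared magnitude, which is easy to work with:

$$\mathbb{E}\left[\left|\alpha^{n-k} \beta^{k} E_{k,n}\right|^2\right] = \binom{n}{k} \left(|\alpha|^2\right)^{n-k} \left(|\beta|^2\right)^{k}$$

This is a binomial distribution $B(n, |\beta|^2)$, which asymptotically looks like a normal distribution with maximum at $k = n|\beta|^2$, which means that the estimate $\hat{p}$ measured in the highest-amplitude state is:

$$\hat{p} = \frac{k}{n} = |\beta|^2$$

Thus we conclude that the observed probability of measuring $|1\rangle$ when interacting with a system in state $\alpha|0\rangle + \beta|1\rangle$ is centered around $|\beta|^2$, as reported by an experimenter in a box who runs the measurement many times, which is what we probably are anyway. And that’s the Born Rule.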
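As a rough check of where the peak lands, the sketch below takes the expected squared magnitude of the term with `k` ones to be $\binom{n}{k}(|\alpha|^2)^{n-k}(|\beta|^2)^k$, works in log space, and finds the maximizing `k`. The amplitudes `0.8` and `0.6j` are arbitrary example values, and the function names are invented for illustration.

```python
from math import lgamma, log

def log_expected_square(n, k, alpha, beta):
    """log of C(n, k) * |alpha|^(2(n-k)) * |beta|^(2k); assumes alpha, beta nonzero."""
    log_binom = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_binom + 2 * (n - k) * log(abs(alpha)) + 2 * k * log(abs(beta))

def peak_frequency(n, alpha, beta):
    """Frequency k/n of ones in the term with the largest expected squared magnitude."""
    k_peak = max(range(n + 1), key=lambda k: log_expected_square(n, k, alpha, beta))
    return k_peak / n

alpha, beta = 0.8, 0.6j  # example amplitudes: |alpha|^2 = 0.64, |beta|^2 = 0.36
for n in (10, 100, 1000):
    print(n, peak_frequency(n, alpha, beta))
```

The complex phase of `beta` drops out because only `abs(beta)` enters, so the peak frequency depends on the squared magnitudes alone.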

Ultimately this seems to be because different ways of seeing the same result interfere with each other, suppressing the amplitudes of seeing less uniform results by a factor of the square root of their multiplicity.

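(Note that this argument should still work if $|\alpha|^2 + |\beta|^2 \neq 1$; the resulting asymptotic normal distribution will end up having mean $k = n\,\frac{|\beta|^2}{|\alpha|^2 + |\beta|^2}$, that is, $\hat{p} = \frac{|\beta|^2}{|\alpha|^2 + |\beta|^2}$.)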
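One way to see the unnormalized case, sketched under the assumption that the expected squared magnitude of the term with $k$ ones is $\binom{n}{k}(|\alpha|^2)^{n-k}(|\beta|^2)^k$: pull out the total weight $N = |\alpha|^2 + |\beta|^2$ and define $p = |\beta|^2 / N$ (both symbols are introduced here just for this step).

```latex
\binom{n}{k} \left(|\alpha|^2\right)^{n-k} \left(|\beta|^2\right)^{k}
  = N^{n} \binom{n}{k} (1 - p)^{n-k} p^{k},
\qquad N = |\alpha|^2 + |\beta|^2,
\quad p = \frac{|\beta|^2}{N}
```

The prefactor $N^n$ does not depend on $k$, so it rescales every term equally and leaves the location of the peak at $k \approx np$ unchanged.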

So that’s interesting.


It’s unclear to me how carefully isolated an experiment would have to be for different orderings of its results to interfere with each other. Presumably the answer is “a lot”, but what if it isn’t? I’m intrigued, either way, by the fact that this type of calculation produces the right answer through a relatively elementary manipulation. I’m especially intrigued because I suspected it would work before calculating, and it did, which never happens…

Suffice to say I would love to know a) what’s wrong with this argument (I feel like it could be circular, but I haven’t figured out how), or b) if it exists in the literature somewhere, ’cause I haven’t found anything, although admittedly I didn’t look very hard.

I can think of some strange implications of this argument but I don’t want to get ahead of myself.

I would also kinda like to go to graduate school.