The Essence of Quantum Mechanics

July 24, 2020

QM is harder to understand that it needs to be because people are reluctant to write down explicit examples of all of the objects that you talk about. They’ll write a whole textbook about operators and observables and their eigenvalues when they act on the wave function, but they won’t just write down an example of a wave function that has those properties.

This is my attempt at fixing that. I’m trying to learn QFT and it helps to have the prerequisites compressed into the simplest possible representation. It also helps me to write everything down in a compressed form so I can reference it more easily.

This will make no sense if you don’t already have a good understanding of quantum mechanics. Also it might be kinda wrong-ish, but I bet it will help.

Conventions: \(c = 1\), \(g_{\mu \nu} = \text{diag}(+, -, -, -)\). I like to write \(S_{\b{x}}\) for \(\nabla S\).

1. Wavefunction solutions

QM makes a lot more sense to me if you (a) handle everything relativistically from the start and (b) just assume the forms of the wave function solutions instead of deriving them.

The basic point is that a single Fourier mode of a wave function with definite four-momentum \(p^\mu = (E, \b{p})\) looks like this:

\[\psi(\b{x}, t) = e^{i/\hbar (\b{p} \cdot \b{x} - E t)}\]

And all of the typical operators acting on this object clearly extract their classical values as eigenvalues:

\[\begin{aligned} \hat{P} \psi &= - i \hbar \p_\b{x} \psi = \b{p} \psi \\ \hat{E} \psi &= i \hat \p_t \psi = E \psi \end{aligned}\]

Then when we contract with a ket, that is, a complex-conjugated wave function, we’re just cancelling out the exponential:

\[\< \psi \| \hat{P} \| \psi \> = e^{-\text{stuff}} \b{p} e^{\text{stuff}} = \b{p}\]

This is much easier to understand than the usual treatment of operators as mystical new objects with mystical new properties.

Of course in practice wave functions aren’t simply Fourier modes. But they can be decomposed into Fourier modes, because that’s what all of QFT spends its time doing. Anyway all that means is that we write them as energy-momentum eigenstates, and then all the calculations are in terms of the now-diagonal energy and momentum operators.

So, if I had my way I’d start a quantum mechanics course with special relativity, followed by introducing the scalar wave function, like this.

A generic wave function on spacetime has the form \(\psi(t, \b{x}) = e^{ i S(t, \b{x})/\hbar}\) which assigns a complex phase to every point. It is fully determined by the action \(S(\b{x}, t)\), and in particular given an initial state \(\psi_0\) it’s determined by the action gradient \(dS = S_{\mu} dx^\mu\). This lets us compare quantum states by integrating over some path \(\Gamma\):

\[\psi(t, \b{x}) = e^{i/\hbar \int_{\Gamma} dS} \psi_0\]

Later on when potentials are involved we will need to be specific about the path of integration, but for now we can think of \(S\) as a scalar function that determines \(\psi\) everywhere.

Relativistic invariance insists that \(\psi\) have the same value in any reference frame, and \(- i \hbar \p \psi = - i \hbar (\p S) \psi = (S_t, S_{\b{x}}) \psi\) must be a covariant 4-vector. Contraction with \(\bar{\psi}\) extracts the vector components: \(\< \psi \| {- i \hbar \p} \| \psi \> = \bar{\psi} (S_t, S_{\b{x}}) \psi = (S_t, S_{\b{x}})\). Finally, \(\| (S_t, S_{\b{x}}) \| = \sqrt{S_{t}^2 - S_{\b{x}}^2}\) must be a Lorentz-invariant scalar.

We call \(i \hbar \p_t = \hat{E}\) and \(- i \hbar \p_x = \hat{P}\) the energy and momentum operators. The quantum mechanical operators apparently extract properties of \(S\), but because \(S\) is packed inside an exponential, they extract them as eigenvalues: \(i \hbar \p_t \psi = - S_t \psi\). Our quantum-mechanical inner product and our operators are just tools for extracting properties of \(S\), since \(\psi\) is the only thing we can directly operate on. When an equation like the Schrödinger equation contains a \(\hat{P} = - i \hbar \p_x\) operator, it’s just a skew way of writing the \(p_x\) value directly.

Since quantum mechanical measurements only happen through operators like these, the exact values of \(\psi\) are only determined up to a phase, and therefore \(S\) is only determined up to a constant. The actual values are not physically observable.

For a free massive spinless particle the action is \(S = - p_{\mu} x^{\mu} = \int -p_\mu dx^{\mu}\), where \(p\) is the four-momentum and \(\| p \| = m\), the rest mass. In the rest frame this is simply \(S = -m \tau = - \int m d\tau\). In the absence of a potential this gives the wave function:

\[\psi(x) = e^{- i/\hbar \int_{0}^{x} p_\mu dx^\mu} \psi(0) = e^{- i/\hbar p_\mu x^\mu} \psi(0) = e^{i/\hbar ( \b{p} \cdot \b{x} - E t)} \psi(0)\]

which is a Fourier component with momentum \(p_\mu\). Time evolution via exponentiation of the Hamiltonian amounts to translating in \(t\):

\[\psi(t + \Delta t) = e^{i/\hbar \hat{H} \Delta t} \psi(t) = e^{\Delta t \p_t} \psi(t) = e^{i/\hbar (\b{p} \cdot \b{x} - E(t + \Delta t))} \psi_0\]

(This uses the idea that exponentiating a differential operator translates in that coordinate: \(e^{a \p_x} f(x) = f(x+a)\).)

When an initial state is not a pure Fourier mode with a definite momentum, we expand it as a sum of modes. For instance, if at \(t=0\) we measure an electron at \(\b{x} = 0\), then the initial state is

\[\psi(0, \b{x}) = \delta(\b{x}) = \int e^{i \b{p} \cdot \b{x}} d \b{p}\]

When potentials are involved, \(dS\) is modified. The electromagnetic field, for instance, enters as \(p \mapsto p - i qA\), so \(dS = (p_{\mu} - i q A_{\mu}) dx^{\mu}\). Depending on the field configuration we may no longer be able to easily integrate \(\int dS\): if \(A\) includes a current, then it contains a ‘line’ of divergence, and the path integral’s result will depend on how many times \(\Gamma\) circles this divergence. This causes the path integral to give different values based on the choice of path. Summing over these paths, with appropriate weighting, corresponds in QFT to summing over the number of photons that are exchanged (I think. Will work it out in detail when I learn more of QFT).

2. Correspondences

Many concepts in quantum mechanics follow more naturally from this foundation:

Mass: For a free particle \(S_t = E\) and \(S_{\b{x}} = \b{p}\), and \(m = \sqrt{E^2 - p^2}\) is the relativistic rest energy/momentum relation. The wave function looks like \(\psi = e^{i/\hbar \int \b{p} \cdot d\b{x} - E dt} \psi_0\). A high energy/momentum corresponds to a rapidly changing action, and thus to a wave function that is quickly rotating as you translate in time or space. Ultimately, the mass \(m\) corresponds to the speed of phase rotation in a particle’s rest frame, and its energy and momentum are the results of Lorentz-transforming \(dS = - m \d\tau\) into other frames.

Path integration: Relative changes in \(S\) can be found by integrating: \(S(f) - S(i) = \int_{\Gamma} dS\) along any curve \(\Gamma\) from \(i\) to \(f\), and \(\psi(f) = e^{i/\hbar (S(f) - S(i))} \psi(i)\). Thus \(e^{i/\hbar (S(f) - S(i))}\) is the ‘transition matrix’ between any two states, along a given path. The total transition amplitude is a sum over all possible paths between two states. This extends handily to QFT’s path integrals when creation/annihilation of particles is included.

The roles of \(\hbar\) and \(i\): \(S \mapsto e^{iS / \hbar }\) is the conversion from ‘action’ space to ‘phase’ space. \(\hbar\) changes units from action (energy \(\times\) time) to radians; if we set \(\hbar = 1\) we are declaring that we measure action in radians. The resulting space after mapping by \(e^{iS}\) is physically meaningful, because in some cases we’ll end up summing these phase factors from multiple starting states and seeing interference patterns. I suspect that the output space is the \(U(1)\) that is identified with the electromagnetic gauge field but am not sure. If so, I think it would be good to write \(R_{EM}\) instead of \(i\), in order to avoid accidentally conflating the \(i\) factors from rotations in different spaces.

Angular momentum: The orbital angular momentum operator, \(\hat{L}_z = -i \hbar \p_{\phi}\), does the same thing as \(\hat{P} = - i \hbar \p_{\b{x}}\) but for a wave function in spherical coordinates. The azimuthal angle term looks like \(\psi \sim e^{i/\hbar (l_z \phi - E t)}\), and \(\hat{L}_z \psi = l_z \psi\). The azimuthal quantum number \(l_z\) (often written \(m\)) measures how many times \(\psi\) oscillates in a rotation of the polar angle \(\phi\); it is quantized precisely because the \(\phi\) coordinate has a built-in periodicity. A \(z\)-angular momentum value of \(l_z\) labels the number of periods the wave makes as you rotate \(\phi\) in the \(z\)-plane.

Spin-\(\frac{1}{2}\): If \(l_z = 1/2\), then \(\psi_{\pm} \sim e^{i/\hbar (\pm \frac{1}{2} \phi - E t)}\) acts like a spinor (by modeling the spin as orbital angular momentum, and omitting the \(r\) and \(\theta\) components). This function appears trivially unphysical, since it has different values at \(\phi = 0\) vs \(\phi = 2 \pi\). The resolution is the fact that it’s only meaningful to use the wave function to compare states that are connected by a path – and for a spinor it’s correct that \(\< \psi(\phi = 2 \pi) \| \psi(\phi = 0) \> = -1\). (This is a useful mental model but isn’t the full story. My next post will be a truly exhaustive exploration of spinors.) (Much later edit: lol, no, spinors are incredibly confusing. Hopefully I’ll finish that next post up someday.)

Spin-\(1\): A vector-valued wave function \(\vec{\psi} = (\psi_x, \psi_y, \psi_z)\), where the terms transform according to physical rotations, is a spin-1 object. To consider its \(\b{z}\)-angular momentum we change bases to a spherical tensor basis (not to be confused with spherical coordinates):

\[(\hat{x}, \hat{y}, \hat{z}) \ra (\frac{\hat{x} + i \hat{y}}{\sqrt{2}}, \hat{z}, \frac{\hat{x} - i \hat{y}}{\sqrt{2}})\]

In this basis, wave functions are diagonalized in terms of their \(\phi\) dependence, so that we can see what their \(\b{z}\)-angular momentum is.

Or in cylindrical coordinates, using \(\hat{x} = (\cos \phi )\hat{\rho} - (\rho \sin \phi )\hat{\phi}\) and \(\hat{y} = (\sin \phi) \hat{\rho} + (\rho \cos \phi) \hat{\phi}\):

\[= (\frac{e^{i \phi} (\hat{\rho}+ i \rho \hat{\phi})}{\sqrt{2}}, \hat{z}, \frac{e^{- i \phi} (\hat{\rho} - i \rho \hat{\phi})}{\sqrt{2}})\]

The coordinates of \(\vec{\psi}\) in this basis are:

\[(\psi_{+1}, \psi_0, \psi_{-1}) = (\frac{\psi_x - i \psi_y}{\sqrt{2}}, \psi_z, \frac{\psi_x + i \psi_y}{\sqrt{2}})\]

In the new basis, the coordinate vectors have an explicit \(\phi\)-dependence, which captures the idea that any vector-valued function has an intrinisic \(\phi\)-derivative, independent of reference frame, just by virtue of being a vector. This is kinda obvious in hindsight, but it took me forever to understand. A vector is spin-1 because when you write it in a spherical tensor basis it factors into components with angular momentum \((1, 0, -1)\).

So the components of a vector wave function \(\vec{\psi}\) locally looks like \(\psi_{s_z}(\phi, \rho, z, t) \sim e^{i \hbar (s_z + l_z) \phi } \psi_{s_z}(\rho, z, t)\), where \(l_z\) is its orbital angular momentum and \(s_z \in (+1, 0, -1)\) is a frame-independent contribution just from its vectorial nature. The \(0\) component of \(L_z\) spin corresponds to a vector-valued wave function pointing only in the \(z\) direction. \(\pm 1\) components correspond to having \(x\) or \(y\) components, with the sign determined by their relative phase.

Note what it means to have spin \(1\): it’s not that it fixes the value of the angular momentum; rather, it specifies the different ways that the angular momentum can transform under rotation. The three choices determine whether \(\vec{\psi}\) is in the \(z\) direction \((s_z = 0)\) or whether it has a positive or negative ‘rotational’ components in the \(xy\) plane (\(s_z = \pm 1\)). Particularly, having angular momentum \(s_z = +1\) means that the \(y\) component is advanced in phase compared to the \(x\) component. This is why the ‘ladder’ operator \(\hat{S}_+ = \hat{S}_x + i \hat{S}_y\) serves to increase the angular momentum: it simply includes a factor of \(e^{i \phi}\):

\[\hat{L}_+ = (\hat{L}_x + i \hat{L}_y) \sim e^{i \phi}\]

(Or rather, it modifies the state such that it picks up an additional factor of \(e^{i \phi}\), moving one unit of angular momentum out of an \(\b{x}\) or \(\b{y}\) mode (which are superpositions of \(\b{z}_{+1}\) and \(\b{z}_{-1}\) modes) into a \(\b{z}\) mode.)

By the way, photons are spin-1 particles, but cannot have the \(s_z = 0\) state for what I currently understand as ‘complicated technical reasons’. Roughly, it goes: because photons have no rest frame, the \(s_z = 0\) value is forbidden, as that would imply that there is a choice of \(z\) around which a photon wave function is symmetric. The remaining \(s_z = \pm 1\) states correspond to the two linear photon polarizations.

The Schrödinger Equation: We can write \(S_t^2 - S_{\b{x}}^2 = m^2\) as \(S_t = \sqrt{m^2 + S_x^2} = m \sqrt{1 + \frac{S_x^2}{m^2}}\) and expand as a Taylor series (note that \(\| S_x/m \| = \| p / m \| \ll 1\)) to get:

\[S_t = m (1 + \frac{1}{2} \frac{S_x^2}{m^2} + O((\frac{S_x^2}{m^2})^2) \approx m + \frac{S_x^2}{2m}\]

Using our operator forms we get the free-particle Schrödinger equation:

\[\hat{E} \psi \approx (m + \hat{P}^2/2m) \psi\]

Interpreting, this says that the time-derivative of the action is a constant (the mass) plus a term proportional to the kinetic energy, plus higher-order terms that vanish at low momentums.

The initial \(m\) term is normally ignored in non-relativistic QM. It corresponds to a constant change in phase along any path (and adds a constant term to the Lagrangian), but it drops out of any calculation if you (a) only integrate over time and (b) never create/annihilate particles.

Schrödinger with potential: The \(V\) term in the non-relativistic Schrödinger ends up next to the kinetic energy term: \(\hat{E} \psi \approx (m + \hat{P}^2/2m + V) \psi\). Working backwards through the derivation, we figure that the constraint on \(S\) must be: \(S_t - V = \sqrt{m^2 + S_x^2}\). But there is no particular reason this would have a clean relativistic form, since we treat our potential non-relativistically anyway.

Nevetheless we can add to our interpretation: the role of a classical scalar potential \(V\) is to modify the phase change as a wave function translates in time, such that the particle acts like it has energy \(E - V\) instead of \(E\). The role of a vector potential is to modify the momentum, \(\b{p} \mapsto \b{p} - \vec{A}\).

The electromagnetic field uses the 4-potential \(q A = q (\phi, \vec{A})\). The electromagnetic wave function is something like \(\psi = e^{i/\hbar \int (\b{p} - q \vec{A}) \cdot d\b{x} - (E - q \phi)]dt } = e^{i/\hbar \int (p_{\mu} - q A_{\mu}) dx^{\mu}}\).

Covariant Derivatives:

Given the electromagnetic wave function of the form \(\psi = e^{i/\hbar \int (p_{\mu} - q A_{\mu}) dx^{\mu}}\), we can extract the \(p_{\mu}\) term with a more involved derivative operator, the ‘covariant derivative’ \(D_{\mu} = \p_{\mu} + i q A_{\mu}\), or equivalently, modifying the moment operator to be \(\hat{P}_{\mu} = \hat{p}_{\mu} + \hbar q A_{\mu}\):

\[- i \hbar D_{\mu} \psi = - i \hbar (\p_{\mu} + i q A_{\mu}) \psi = p_{\mu} \psi\]

This derivative manages to extract the \(p_{\mu}\) term by itself by subtracting off the \(qA\) contribution.

Gauge transformations:

Since physics is determined by an action integral like \(\int( p_\mu - q A_\mu )dx^\mu\), any integrable (exact) form (some \(\Lambda\) with \(d \Lambda = 0\)) can be added to \(A\) and will only affect the action in a path-independent way:

\[\int_i^f (p_\mu - q A_\mu + \Lambda_\mu) dx^\mu= P_i^f - \Lambda_i^f - q \int_i^f A_\mu dx^\mu\]

The covariant derivative is so called because it produces a derivative operator, and thus a momentum operator, which respects this gauge-freedom by removing any explicit dependence on the value of \(A\). In my opinion, though, this is a very roundabout way to reach the conclusion: the explicit purpose of \(\hat{P}\) is to extract the value of \(p\), which is ultimately the thing that must obey \(p_{\mu} p^{\mu} = m^2\); the specific method of removing the gauge freedom is an implementation detail.

This performs a gauge transform that doesn’t affect the relative amplitudes of different paths between \(i \ra f\) – only the resulting phase. As such there is no way to observe this effect in a closed system, so the addition of \(d \Lambda\) is a free variable in the theory. However, it turns out to be important when considering interacting systems, in ways that I haven’t learned yet but will be essential in QFT.

The Lagrangian: The integral \(\Delta S = \int dS\) can be parameterized by time as

\[\Delta S = \int (S_{\b{x}} \cdot d\b{x}/dt - S_t) dt = \int L \, dt\]

\(L = dS / dt\) is the source of the (single-particle) Lagrangian, and is where the elementary form \(L = T - V\) comes from. For a free particle, \(L dt = -m d\tau\), and \(\Delta S = - \int m d \tau\). In a classical scalar potential with \(S_t = E = T + V\):

\[L = S_{\b{x}} \cdot d\b{x}/dt - S_t = \b{p} \cdot \vec{v} - E\]

In classical mechanics often \(E = T + V\) and \(\b{p} \cdot \vec{v} = 2 T\), giving

\[L = 2 T - (T + V) = T - V\]

Regardless of how we parameterize \(S = \int dS\), applying stationary-action will give the classical trajectory. Feynman’s classic explanation of this is that all but the ‘stationary’ path – the choice of \(\Gamma\) such that \(\delta S / \delta \Gamma \vert_{\Gamma} = 0\) – will exhibit destructive interference in the macroscopic limit, resulting in the laws of classical physics. Quantitatively, this means that in the classical limit as \(\hbar \ra 0\), the path integral is dominated by the minimal path:

\[\begin{aligned} \lim_{\hbar \ra 0} \int d\Gamma e^{i S[\Gamma] /\hbar} &= \lim_{\hbar \ra 0} \int d (\Delta \Gamma) e^{i S[\Gamma_{\text{min}} + \Delta \Gamma] /\hbar} \\ &\sim \lim_{\hbar \ra 0} e^{i S[\Gamma_{\text{min}}] /\hbar } \end{aligned}\]

I don’t exactly know how to make that rigorous but it makes heuristic sense: as \(\hbar \ra 0\) the function oscillates infinitely rapidly, cancelling itself out in the integral over \(\Delta \Gamma\), but at least the minimal path, where \(\delta S / \delta \Gamma = 0\), oscillates less than the rest do. We could imagine expanding \(S\) as a Taylor series \(S = S[\Gamma_{\text{min}}] + (\delta S / \delta \Gamma) \delta \Gamma + \ldots\), but I really don’t know if that’s allowed.

Noether’s Theorem: Suppose there is some dynamical variable \(q\) that \(S\) depends on. Then we can locally approximate \(\Delta S = S(q, \ldots) + S_q \Delta q\), adding a phase to the wave function \(\psi \ra e^{i S_q \Delta q /\hbar} \psi\). This leaves physics unchanged if and only if \(S_q\) is a constant, such that this is a uniform global phase transformation.

But if \(q\) is a physical symmetry of the system, then it must lead to the same physics; therefore \(S_q\) is a constant throughout the system’s evolution (gauge fields notwithstanding). \(S_q\) is called the ‘Noether charge’ corresponding to the \(q\) symmetry. \(E\) is the charge associated with \(t\); \(\b{p}\) for \(\b{x}\), \(\vec{L}\) for \(\vec{\theta}\), etc.


  1. QM is easier to follow if you start from the fact that the wave function has the form \(\psi = e^{i S/\hbar}\).
  2. Operators and inner products are ways to extract properties of \(S\).
  3. The Schrödinger equation for a free particle is a low-energy approximation of the statement that \(\| \p S \| = m\).
  4. The only free physical quantity in a wave function is the 4-vector \(\p S\), which measures which part of the variation in \(S\) is in the spatial vs timelike direction.
  5. Potentials enter by modifying \(\p S\), eg \(\p S \mapsto \p S - i q A\). \(\int_i^f dS = S(f) - S(i)\) may no longer hold depending on the properties of \(A\).
  6. Intrinsic angular momentum is a property of what kind of object the wave function’s value is (scalar, vector, spinor, etc).

Normally you have to unlearn QM to learn relativistic QM, but the relativistic version makes much more sense in the first place so why not start there?

Next up: spinors? Oh god.