Breaking the rules: The physicist's flexible mathematics

If you look at the curriculum of any respectable Physics program, you’ll find mathematics courses like calculus, linear algebra, differential equations, maybe some complex analysis and differential geometry. At first glance, it seems that physicists take the results of mathematical theories (e.g. theorems from calculus, linear algebra, functional analysis, etc.) and apply them under certain “physical” interpretations of the objects involved (e.g. forces are vector fields, trajectories are curves, quantum states are elements of a Hilbert space), to arrive at meaningful expressions concerning physical entities. That is, there is some kind of dictionary that tells you: this mathematical object represents this physical entity.

This is especially the case in classical physics, where, for example, it can be seen that the generalized coordinates that are used in (holonomic) systems correspond to coordinates on a specific manifold, and time-evolution corresponds to moving on a curve through that manifold. There is a clear dictionary. Curves to particles, vector fields to forces. Even in non-relativistic quantum mechanics one can find these kinds of dictionaries. It gets to the point where one can state this dictionary quite explicitly: Many a book on “quantum mechanics for mathematicians” coldly states that quantum mechanics is simply a representation of the Heisenberg algebra on a projective separable complex Hilbert space (uh… sure.).

But physicists don’t give a damn about what the Hilbert space is (or whether it even is a Hilbert space), and they don’t give a damn about the flows on manifolds that give rise to Hamilton’s equations or the probability measures that describe the canonical ensemble. They don’t care about the dictionary: only mathematicians do, because it gives them confidence that the “physics” they are doing is well-founded.

Now, it is clear at the outset that physicists think of and use mathematics very differently from mathematicians (otherwise physicists would simply be called “mathematicians”). The physicist’s goal is not necessarily to prove abstract theorems about the objects she is dealing with, but rather to say something meaningful about the physical entities that her mathematical objects represent. So it is not necessary to drown in details of the mathematics involved. Physicists use mathematics.

Kind of. It seems that they do not use the same “flavor” (if you want to call it that, and I do) of mathematics that mathematicians do. In physics, one does not care about the nuances of the axiomatic system one works with. As Folland put it in his QFT book (second ed., pp. 47-48):

Mathematicians are trained to think that all mathematical objects should be realized in some specific way as sets. When we do calculus on the real line we may not wish to think of real numbers as Dedekind cuts or equivalence classes of Cauchy sequences of rationals, but it reassures us to know that we can do so. Physicists’ thought processes, on the other hand, are anchored in the physical world rather than in set theory, and they see no need to tie themselves down to specific set-theoretic models.

He then goes on to explain that physicists work much more symbolically. It is as if they are working with a logical system (first order, second order, I don’t know) where there is only syntax but no semantics. That is, we only consider the relationships between the symbols, regardless of whether symbols with those specific relations can be realized by a specific set. We introduce the notion of functions, vectors, tensors, whatevers, and the notion of addition, multiplication, derivation, integration, etc., purely symbolically: if you have these symbols, you can arrange them in this order, and if you put this integration symbol there, you can exchange the whole lot for this other set of symbols.

So, physicists use symbols with syntax but not semantics? Not exactly, because the mathematical objects they are dealing with often correspond to physical (“““real”””, in many quotation marks) entities. This is why we reject certain solutions to equations on the grounds of “physicality” (oh, for certain values of mass the height of this parabola is imaginary? Cross it out). These physicality assumptions end up being mathematical conditions on the objects we deal with. For example, the “measurable” observables should all be real numbers, and this implies that observables in quantum mechanics are represented by Hermitian operators. One may then think that these “physicality conditions” can simply be added as axioms of the theory (e.g. add a “for all x, x is observable implies x is real”).

However, this implicitly assumes that (at least within the realm of a particular physical theory) there is an axiomatic system under which a physicist works. Upon further inspection, this does not seem to hold very well. First, the different areas of physics are not as well-delineated as those in mathematics, so it is unclear at the outset what this axiomatic system would be. But even if the physicist assumes she is working in an explicit framework (say, for example, quantum mechanics), she might use results and ideas and objects from another framework that might not a priori fit well within the first. This often requires reinterpretation (oh, yeah X is discrete and I want to interpret it as Y, which is not, so let’s say that X is a sampling or approximation of Y), but in some cases even the reinterpretation is inconsistent with the original framework. Furthermore, the physicist is willing to break the mathematical rules under which she is working, in order to press onward with her theory.

Physicists are willing to break the rules of the game in order to keep the game going.

My favorite example of this is Dirac’s development of an equation for relativistic quantum mechanics. To be brief, Dirac’s idea was to find the square root of the Klein-Gordon operator. His motivations are not important. What is important is that the Klein-Gordon operator is a second-order differential operator that acts on real (or complex) functions on R^4, what we call scalar functions, and it turns out that it is impossible to find such a square root.

That is, if you work constrained to your axiomatic system of differential operators acting on scalar functions. Dirac wasn’t feeling constrained. So he did what physicists do: Who gives a damn about rules? He wrote down an object with some free parameters whose square would be the Klein-Gordon operator, and tried to figure out what those parameters needed to be. The problem is that those free parameters were apparently inconsistent: they needed to be anti-commuting: ab=-ba.
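To make the inconsistency concrete, here is a sketch of the computation in modern notation. The symbols are assumptions of this sketch, not Dirac's original ones: natural units, metric signature (+,-,-,-), and free parameters written as γ^μ, one for each coordinate of R^4.

```latex
% Assumptions: natural units (\hbar = c = 1), metric \eta = diag(+1,-1,-1,-1),
% and the modern \gamma^\mu notation for the free parameters.
\begin{align*}
\Box + m^2 &= \eta^{\mu\nu}\partial_\mu\partial_\nu + m^2
  && \text{(Klein-Gordon operator)} \\
(i\gamma^\mu\partial_\mu + m)(i\gamma^\nu\partial_\nu - m)
  &= -\gamma^\mu\gamma^\nu\,\partial_\mu\partial_\nu - m^2
  && \text{(cross terms in $m$ cancel)} \\
  &= -\tfrac{1}{2}\{\gamma^\mu,\gamma^\nu\}\,\partial_\mu\partial_\nu - m^2
  && \text{(since $\partial_\mu\partial_\nu$ is symmetric)} \\
  &= -(\Box + m^2)
  \quad\Longleftrightarrow\quad
  \{\gamma^\mu,\gamma^\nu\} = 2\eta^{\mu\nu}.
\end{align*}
```

For μ ≠ ν the right-hand side vanishes, so the condition reads γ^μ γ^ν = -γ^ν γ^μ, which is exactly the ab = -ba above. For μ = ν it says that each parameter squares to +1 or -1.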

That doesn’t happen for complex numbers. Then Dirac realized that he could still find his square root if he allowed the parameters to be matrices, acting on multi-component fields instead of scalar-valued fields, so that the anti-commutativity of the parameters could make sense. So Dirac broke the rules, but in a good way. If he had stopped at “welp, complex numbers are not anticommutative”, it would’ve been the end of it. No relativistic quantum mechanics (at least not as we know it, possibly).
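One can check numerically that matrices do the job where numbers cannot. The snippet below is a minimal sketch, assuming NumPy, the standard Dirac representation of the 4x4 matrices, and the metric signature (+,-,-,-). The helper names `anticommutator` and `satisfies_clifford` are made up for this illustration.

```python
import numpy as np

# 2x2 building blocks: identity, zero, and the three Pauli matrices.
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Dirac representation (an assumed, conventional choice; others exist).
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

# Minkowski metric with signature (+, -, -, -).
eta = np.diag([1.0, -1.0, -1.0, -1.0])


def anticommutator(a, b):
    """Return ab + ba."""
    return a @ b + b @ a


def satisfies_clifford(gammas, metric):
    """Check {g_mu, g_nu} = 2 * metric[mu, nu] * identity for all pairs."""
    identity = np.eye(gammas[0].shape[0])
    return all(
        np.allclose(anticommutator(g_mu, g_nu), 2 * metric[mu, nu] * identity)
        for mu, g_mu in enumerate(gammas)
        for nu, g_nu in enumerate(gammas)
    )


# Plain complex numbers (as 1x1 matrices) get the squares right
# but commute, so the mixed conditions ab = -ba fail.
scalars = [np.array([[1 + 0j]])] + [np.array([[1j]])] * 3

print(satisfies_clifford(gamma, eta))    # the 4x4 matrices
print(satisfies_clifford(scalars, eta))  # the complex numbers
```

The scalar attempt picks numbers with the correct squares (1 and -1), and it still fails, because any two complex numbers commute. The obstruction is the anticommutation itself, not a poor choice of values.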

Physicists are willing to break the rules if it suits them, but they otherwise play by the rules of the game, the game being calculus, algebra, etc. Those rules can be, and will be, overruled if necessary. But not all of them, only… most of them. There are some absolutely unbreakable rules in physics, mostly regarding causality and thermodynamics. It’s important to note that mathematicians also break the rules, but they are much more careful and systematic about it. A theorem is proven under certain assumptions, but sometimes it is interesting to see what happens if the assumptions are relaxed. Similarly, it is interesting to see what happens when the axioms and fundamental definitions of a theory are relaxed. That is how we get complex numbers, and… well… everything else that is weird in mathematics.

Physicists, in the end, are willing to do almost anything (and given some fun stuff in high energy physics I’m tempted to remove the “almost”), as long as they obtain a meaningful statement about physical reality. They use a Machiavellian flavor of mathematics, where the ends justify the means. How valid is this? Well, it has mostly worked. We have computers, and the internet, and smartphones, and robots, and nuclear weapons, and nuclear-powered robots on Mars. There is something that is right about this approach to physics, about this flexible use of mathematics. It seems that mathematics, and I mean precise, rigorous mathematics, is more of a guide than a fixed toolset with fixed rules. And that is entirely okay, as long as we’re honest about it.