How a Fields Medal Led to a Mathematical Roller-coaster Journey
By Keith Devlin @profkeithdevlin
First, congratulations to Caucher Birkar, Alessio Figalli, Peter Scholze, and Akshay Venkatesh on being awarded the Fields Medal, an award that for regular “Devlin’s Angle” readers needs neither introduction nor description. (If it does, use Google.)
With Fields Medals awarded only (at most) once every four years to mathematicians who produce truly exceptional mathematics before they turn forty, few of us who enter the field come close to getting one. (Indeed, in some ways they are more akin to Olympic Gold Medals than the Nobel Prizes with which they are usually compared. Few club athletes will get one of those either.) On the other hand, many of us earn our doctorates, or build our careers, by understanding a new approach or mastering a new technique that led to a Fields Medal. Just as Bill Gates copied the groundbreaking Macintosh interface to create Windows, so too it can pay off handsomely, and often quickly, for a young mathematician to “reverse-engineer” a Fields-Medal-winning new result and try to use it to solve a different – though often related – problem.
In fact, sometimes, the medal-winning breakthrough has such broad applicability that it initiates an entire new subfield of mathematics. That was exactly how I began my mathematical career almost half a century ago. An interest in computing, initiated by a high-school summer internship writing software for British Petroleum (using the very first digital computer delivered to the city I grew up in), stayed with me throughout my undergraduate years, culminating with me interviewing for a job at IBM on graduation. But I was put off by the strong corporate culture and the lack of intellectual freedom I feared would come with joining Big Blue. Instead, I decided to go for a PhD in the general area of computing. Unfortunately, this was before Computer Science was a recognized discipline, and though it was possible to pursue graduate research related to computing, mathematically speaking there was not much of a “there” there back then.
The one mathematically-intriguing little “there” was a relatively new subject called Automata Theory that I had come across references to. Moreover, one of the pioneers of that field, John Shepherdson, was a professor of mathematics at the University of Bristol, just over a hundred miles from London, where I had just graduated. As chair of department, Shepherdson had built up a strong research team of experts in different branches of Mathematical Logic, the subfield of mathematics that provided the mathematical tools for Automata Theory. And so it was then that I applied to do a PhD at Bristol. That was in the fall of 1968.
Once I arrived in Bristol, everything changed. Among the mathematics graduate-student community at Bristol University, all the buzz – and there was a lot of it – was about an emerging new field called Axiomatic Set Theory. Actually, the field itself was not new. But as a result of a Fields-Medal-winning new result, it had recently blossomed into an exciting new area of research.
Not long after Georg Cantor’s introduction of abstract Set Theory in the late 19th century, Bertrand Russell came up with his famous paradox, concerning the set of all sets that are not members of themselves. To escape from the paradox – more accurately, to rescue the appealing, natural notion of using abstract sets as the basic building block out of which to construct all mathematical objects – Ernst Zermelo formulated a seemingly simple set of axioms to legislate the formation of sets. With an important addition from Abraham Fraenkel in 1925, that axiom system seemed to provide an adequate basis for the construction of all the objects of mathematics, while avoiding Russell’s Paradox. Zermelo-Fraenkel Set Theory, as it became known, rapidly came to be regarded as the "Grand Unified Theory" of mathematics, the basic system on which everything else is built. It was generally referred to as ZFC, the “C” denoting the Axiom of Choice, a basic principle Zermelo included but which was sufficiently controversial that its use was often acknowledged explicitly.
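For readers who like to see the paradox written out, here is a short sketch in standard notation. The symbol R for the problematic set is my own label for this illustration, not something taken from Russell or Zermelo.

```latex
% Russell's paradox, assuming any property may be used to form a set.
% R is a label chosen for this sketch.
\[
  R = \{\, x : x \notin x \,\}
\]
% Asking whether R is a member of itself gives a contradiction either way:
\[
  R \in R \iff R \notin R .
\]
```

Zermelo's axioms avoid this by allowing a property to carve out a subset only from a set that already exists, so a collection like R is never formed in the first place.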
While the ZFC axiom system was indeed sufficient to ground all of mathematics, there were a small number of seemingly-simple questions about sets that no one could answer using just those axioms. The most notorious by far went back to Cantor himself: Cantor’s Continuum Problem asks how many real numbers there are. Of course, one answer is that there are infinitely many of them. But the ZFC axioms allow the construction of an entire system of infinite numbers of increasing size, together with an arithmetic, that can be used to provide a “count” of any set whatsoever. The smallest such infinite number, aleph-0, is the number of natural numbers. After aleph-0, the next infinite number is aleph-1. Then aleph-2, and so on. (It’s actually a lot more complicated than that, but let’s leave that to one side for now.)
Cantor showed that the real continuum, the set of all real numbers, has an infinite size strictly larger than aleph-0, so it must be at least aleph-1. But which aleph exactly was it? It’s tempting (on the grounds of pure laziness) to assume it’s aleph-1, an assumption known as the Continuum Hypothesis (CH). But there is no known evidence to support such an assumption.
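In symbols, the situation can be sketched as follows. The notation 2^{\aleph_0} for the size of the continuum is the standard one but does not appear in the text above, so treat it as an added convention for this illustration.

```latex
% Cantor's theorem: there are strictly more real numbers than natural numbers.
\[
  \aleph_0 = |\mathbb{N}| < |\mathbb{R}| = 2^{\aleph_0}
\]
% In ZFC, the size of the continuum is therefore some aleph beyond aleph-0:
\[
  2^{\aleph_0} \geq \aleph_1
\]
% The Continuum Hypothesis (CH) asserts that it is the very next one:
\[
  \text{CH}: \quad 2^{\aleph_0} = \aleph_1
\]
```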
In 1940, Kurt Goedel constructed a set-theoretic model of ZFC in which CH is true, thereby demonstrating that CH could never be proved false from the ZFC axioms. But that does not imply that it is true. Maybe it was possible to construct another model in which CH was false. If so, then CH would be completely undecidable, based on the ZFC axioms. This would mean that the ZFC axioms are not sufficient to answer all reasonable questions about sets.
In 1963, Paul Cohen, a young mathematician at Stanford University, found such a model. Using an ingenious new method for constructing models of set theory that he called forcing, Cohen was able to create a model of ZFC in which CH is false. That result earned him the Fields Medal in 1966.
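Put together, the two results can be summarized in the following sketch. The abbreviation Con(T), meaning "the theory T is consistent," and the provability symbol are standard logical notation that I am adding here for illustration.

```latex
% Goedel: if ZFC is consistent, it remains consistent with CH added.
\[
  \mathrm{Con}(\mathrm{ZFC}) \implies \mathrm{Con}(\mathrm{ZFC} + \mathrm{CH})
\]
% Cohen: if ZFC is consistent, it remains consistent with the negation of CH added.
\[
  \mathrm{Con}(\mathrm{ZFC}) \implies \mathrm{Con}(\mathrm{ZFC} + \neg\mathrm{CH})
\]
% Hence, assuming ZFC is consistent, it neither proves nor refutes CH:
\[
  \mathrm{ZFC} \nvdash \mathrm{CH}
  \qquad \text{and} \qquad
  \mathrm{ZFC} \nvdash \neg\mathrm{CH} .
\]
```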
By 1968, when I went to the University of Bristol to commence my doctoral work, Cohen’s new method of forcing had been shown to have wide applicability, making it possible to prove that a number of long-standing, unanswered mathematical questions were in fact undecidable in the ZFC system. This opened up an exciting new pathway to getting a PhD: learn how to use the method of forcing and then start applying it to unsolved mathematical problems, of which there was no shortage. Large numbers of beginning graduate students did just that, and by the time I joined a group of them, a few months after arriving at Bristol, the field was red hot. My interest in computation did not go away, but it would be over two decades before I would pick it up again. At 21 years of age, with a newly minted bachelor’s degree in mathematics under my belt, I had a mathematical research career to build, and axiomatic set theory was by far the most exciting field to do it in. I jumped onto the roller coaster and joined in the fun.
Working in my newly chosen field was just like working in any other branch of mathematics. Each day, you woke up and attempted to prove various mathematical statements using logically rigorous reasoning. To an observer looking over your shoulder, doing that involved scribbling formulas on paper and manipulating them in an attempt to construct a proof, just like any other branch of mathematics. The “rules of the game” were exactly the same as in any other branch of mathematics as well. The only difference was the nature of the answers you obtained – on the rare occasion when you did so. (Mathematics research is 95% failure. Actually, the failure rate may be higher than that; we have a far worse batting average than any professional baseball hitter.) In what those of us in this new field called “classical mathematics,” the goal was to prove that statements about mathematical objects were true or false. In the new mathematics of undecidability proofs, the goal was to prove that statements about mathematical objects were undecidable (in the ZFC system). In both cases, the result was a rigorous mathematical statement (a theorem) justified by a rigorous mathematical argument (a proof).
From the perspective of mathematics as a whole, this meant that, thanks to Cohen, mathematicians had a new way to answer a mathematical question. Classically, there had been just two possibilities: true and false. If you could prove neither, you had failed to find an answer. Now, there was a third possibility: (provably) undecidable. What had previously been failure could now become success. Absence of a definite answer could be replaced by getting a definitive answer. Lack of knowledge could be replaced by knowledge.
The two decades following Cohen saw a whole range of unsolved mathematical problems proved undecidable, as a whole army of us jumped into the fray. Some results were easy. Success came quickly to those smart enough or lucky enough (or both) to find an unsolved problem that yielded relatively easily to the forcing technique. Others took much longer to resolve, and a few resisted all attempts (and have done so to this day). But by the start of the 1980s, the probability of success had dropped to that in other areas of mathematics. From then on, for most young mathematicians, getting a PhD by solving an undecidability problem meant finding some relatively minor variant of a result someone else had already obtained. The party was over.
Looking back, I realize that I was simply very lucky to be starting my mathematical career when a productive new subfield was just starting up. By going to the University of Bristol to do a PhD in Automata Theory, I found myself in the right place at the right time to jump ship and have the time of my life. When the field started to settle down and slow down in the 1980s, I started to lose interest. Not in mathematics, just in that particular area as my main research focus. My main interest shifted elsewhere as my attention was caught by some new mathematics being developed at Stanford (as with forcing, Stanford turned out to be the place that generated the new ideas that caught my attention). That new mathematics was closely intertwined with my earlier high-school interest in computing. But that’s another story.