The Graph that Stopped the World

May 1, 2020 Keith Devlin

Figure from the March 16 paper authored by Imperial College London’s COVID-19 Response Team, led by Neil M. Ferguson, “Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand”

The above graph stopped our world when it appeared on March 16. It was one of the key graphs in the paper by a large team at Imperial College London (think the UK equivalent of MIT) that examined the threat posed by the COVID-19 pandemic that was about to explode in western Europe and the US. If ever anyone doubted the degree to which our lives today are dominated by mathematics, algorithms, and data, the instant and dramatic (in some ways traumatic) consequences for our daily lives that resulted from western governments seeing that graph, and others like it from other sources, must surely have convinced them.

Variants of this graph appeared all over the media from late March onwards, showing the extent of the catastrophe that would result if nations did not take rapid and extensive social measures to curb the spread of the novel coronavirus SARS-CoV-2, the official name for the virus that causes the COVID-19 sickness.

Unfortunately, the clean simplicity of a nicely drawn graph, such as the second image shown here, from a different source, or sets of superimposed graphs as in the first image, thrust the general population, and most political leaders, into a world they had little or no experience of. And it soon began to show, as the news media and social media tried to cover what was a rapidly changing story.

With years of familiarity with graphs going back to their school days, most interpreted the graphs as a representation of what is. That’s what everyone was familiar with: that math class experience of being given a formula or a set of data and being asked to “plot the graph,” or to take a graph and use it to “read off” some specific values. Historically, graphs were, and clearly in our national psyche still largely are, a way to represent visually, but precisely, a mathematical formula or procedure, or a set of data.

But the graphs from Imperial College, and those in many other publications we have been inundated with in the past weeks, are mostly not about representing (factual) data. Rather, they are being used as real-time predictive tools to help guide decision making; they present at-a-glance, quickly digestible ranges of outcomes that will likely follow if certain courses of action are followed (or not).

The better ones are provided in interactive form so decision makers can explore various options enabling them to balance the mathematics of viral growth (mostly well-established mathematics) and our knowledge of epidemiology (which the mathematical models behind the graphs instantiate) with the various other constraints, pressures, and opportunities that go into making public or commercial policy decisions.

To most people, this is all very unfamiliar territory.

Predictive models are made possible by powerful computers that draw on masses of big data. The results they produce, even when presented in the form of a nice, clean-looking graph or table, are rarely presentations of facts. Indeed, the results, and the associated graphs, change whenever new data comes in, which in the case of the COVID-19 pandemic means updates can occur at least daily. (Commentators sometimes talk about the teams who work on this “changing the model.” While that does happen – meaning that the team alters some of the computer code to incorporate changes to the algorithm – usually the only change is in the data that is fed into the model. Same model, different data leading to different output.)
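The “same model, different data” point can be illustrated with a toy example. The sketch below is a minimal SIR (susceptible-infected-recovered) model, a standard textbook epidemic model and not the one used by Imperial College or IHME. The function names, the population, and the two transmission-rate estimates are all hypothetical, chosen only to show how an unchanged piece of code produces a different curve when the estimate fed into it changes.

```python
def run_sir(beta, gamma, population, initial_infected, days):
    """Daily-step SIR model; returns the number currently infected on each day.

    beta  = transmission rate per day (hypothetical estimate from data)
    gamma = recovery rate per day (hypothetical estimate from data)
    """
    s = population - initial_infected  # susceptible
    i = float(initial_infected)        # currently infected
    r = 0.0                            # recovered
    infected = [i]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        infected.append(i)
    return infected


def peak(curve):
    """Return (day of peak, height of peak) for a curve."""
    height = max(curve)
    return curve.index(height), round(height)


# Same model code, two made-up estimates of the transmission rate,
# as might result from one day's worth of new data.
monday = run_sir(beta=0.30, gamma=0.10, population=1_000_000,
                 initial_infected=100, days=200)
tuesday = run_sir(beta=0.25, gamma=0.10, population=1_000_000,
                  initial_infected=100, days=200)

print("Monday's run:  peak on day %d with %d infected" % peak(monday))
print("Tuesday's run: peak on day %d with %d infected" % peak(tuesday))
```

Nothing in `run_sir` changes between the two runs; only the estimate of `beta` does, yet the peak moves later and lower. That is the sense in which a graph can change daily without anyone “changing the model.”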

The situation is made all the more fluid by the nature of this particular virus, which is highly contagious, can spread rapidly through asymptomatic carriers, and can infect someone for up to 14 days before any symptoms appear, during which time that person can infect others. This means the model is always working on data that does not represent the current reality, and, depending on the prevalent data collection activities (to date, very little in the US and the UK), data that is incomplete, perhaps woefully so.

Yet, for all the uncertainties and lack of good data, there is no doubt the kinds of graphs these models produce (like the ones shown above) are extremely useful. In fact, they are the best tool we have available to enable us to see what would otherwise be invisible: the growth and spread of the virus in the society we are part of. Indeed, they are the only tool we have to do that. They are society’s one pair of eyes as we manage this crisis.

This use of mathematics is unfamiliar to most people, which is why they tend to interpret all those graphs and charts as statements of fact. But that incorrect assumption can become problematic, and potentially dangerous, when they note that the reality they are experiencing, and the actual facts that are emerging, do not match what they remember from the last graph they saw. What makes that dangerous is that they may then dismiss the model and its graphs altogether. But if we do that, we are flying blind into a raging storm. Far better then to stick with the fogged and blurred-focus glasses the models provide to see the terrain ahead.

To help readers, some of the models do provide a representation of that “blurred focus” in the way they present the graphs. The IHME modeling tool, from the University of Washington, which the US administration consults regularly, does that. See the image below.

The IHME predictions for New York and California on April 13, which I captured to compare the two states. The shaded areas show the uncertainties, but the significant difference between the two main trend-lines was enough to indicate that California could safely send critical medical equipment to New York, which is what happened.

But many laypersons seem to focus only on the most-likely-trend line and ignore the surrounding shaded region that provides a graphical indication of the possible ranges.
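One way to see where such a shaded region can come from is to run a model many times, each time with a different plausible value for a quantity that is not known precisely, and then report the spread of the results. The sketch below does that with a deliberately crude growth projection. It is not how IHME computes its uncertainty intervals; the function names, the starting count, and the assumed range of daily growth rates are hypothetical and for illustration only.

```python
import random
import statistics


def project(initial_cases, daily_growth, days):
    """Crude projection: cases grow by a fixed fraction each day."""
    return [initial_cases * (1 + daily_growth) ** d for d in range(days + 1)]


def band(runs, day):
    """Return (low, median, high) across all runs on a given day.

    low and high bracket roughly the middle 95% of the runs.
    """
    values = sorted(run[day] for run in runs)
    low = values[int(0.025 * len(values))]
    high = values[int(0.975 * len(values)) - 1]
    return low, statistics.median(values), high


random.seed(0)  # fixed seed so the illustration is repeatable

# Assumption: the true daily growth rate lies somewhere between 3% and 7%.
runs = [project(1000, random.uniform(0.03, 0.07), 30) for _ in range(500)]

for day in (7, 14, 30):
    low, mid, high = band(runs, day)
    print(f"day {day:2d}: {low:,.0f} to {high:,.0f} (median {mid:,.0f})")
```

The median plays the role of the single trend line, and the low and high values trace the edges of the shaded region. The gap between them widens the further ahead the projection looks, which is exactly the information lost when a reader attends only to the line.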

This April 28 tweet (identities redacted) by a US Congressperson shows how misunderstanding how a predictive model works can lead to a poor decision that could have dangerous ramifications. In fact, Twitter responses supplied corrective information.

The problem with doing that is that an elegant, computer-drawn curve will of necessity have one or more maximum points at specific locations, with specific heights, and when those specific values (say, a particular mortality total on a specific date) don’t match the eventual reality, people can, and do, say, “The model is wrong; it’s useless.” See the tweet from a US Congressperson – someone who is in a position to influence US policy. In a subsequent tweet, the politician said their objection was that the predicted date was “wrong.”

Yet, what is important about the graphs the Congressperson referred to (not the ones shown here), and indeed all such predictive graphs, is not the actual numbers, but the overall shapes that represent general magnitudes (tall and thin, shorter and more spread out, etc.) and, of particular import, the nature of their slopes: almost-flat, positive, negative, linear, quadratic, polynomial, exponential, etc. When a decision maker is looking for guidance, it’s the overall shapes of the graphs that correspond to different courses of action they need to compare and consider.

This is not to say the numbers and the dates can be ignored. They are clearly relevant. But they have to be viewed as, at best, estimates, and possibly very poor ones. Exponential growth can rapidly turn today’s encouraging good estimate into tomorrow’s nightmare scenario. Once a virus gets to the epidemic stage, the best you can hope for is a range of numbers – possibly a wide range. That number the algorithm puts out is just a marker around which a range of possible values centers. Take it with a jar of salt – a big jar once the exponential growth has really kicked in.
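A little arithmetic shows why the jar of salt needs to be big. Under pure exponential growth, a small error in the estimated doubling time compounds into a large error in the projected total. The sketch below uses made-up figures (100 initial cases, a 30-day horizon, and doubling times of 3, 4, and 5 days); the function name is hypothetical, and real epidemics do not double indefinitely.

```python
def cases_after(initial_cases, doubling_time_days, days):
    """Cases after `days` if the count doubles every `doubling_time_days`."""
    return initial_cases * 2 ** (days / doubling_time_days)


# Hypothetical: 100 cases today, projected 30 days ahead.
for doubling_time in (3, 4, 5):
    projected = cases_after(100, doubling_time, 30)
    print(f"doubling every {doubling_time} days -> {projected:,.0f} cases")

# doubling every 3 days -> 102,400 cases
# doubling every 4 days -> 18,102 cases
# doubling every 5 days -> 6,400 cases
```

Misjudging the doubling time by a single day changes the 30-day figure by a factor of between roughly three and six. The specific number is fragile; the fact that all three curves are climbing steeply is not.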

Clearly then, given that we are going to have to live with, and as a society manage, this pandemic for many months, possibly years, those of us in the mathematics business need to help people understand how these models work and how to interpret them. This is all relatively new stuff.

Yet although the use of mathematics – and graphs in particular – as an explorative, predictive tool to guide decision making is unfamiliar to most people, it is not a new side of mathematics to those of us inside the discipline. At least since the 1980s, mathematics educators have been trying to get the word out that there is far more to the subject than providing a toolbox for calculating correct, exact answers to precisely formulated questions – which is the impression most people have of math.

Early in my side career as a (self-anointed) mathematical evangelist, I wrote books designed to promote two memes: mathematics as the science of patterns, and mathematics as a language that makes the invisible visible. Neither idea originated with me, though at the time I thought the second phrase was my terminology, only to come across exactly the same phrase soon afterwards used by an artist, almost certainly independently. But both phrases resonated strongly with me, and provided a vehicle to convey to non-mathematicians that our subject is far more than a toolbox for performing various kinds of calculation and solving certain kinds of problems by executing formal procedures. Inspired in part by Richard Dawkins’ success in making people aware of the nature of evolution through the creation of the meme “the selfish gene” – by publishing a book with that title – I deliberately set out to try to do the same with two general audience mathematics books.

The toolbox part of mathematics is extensive and has dominated school mathematics education for many centuries, but it is not what attracts most mathematicians to the subject. Rather we professional (pure) mathematicians are drawn in by the elegance, the beauty, and the intellectual challenge of one of humankind’s most impressive cultural achievements. To us, it is an art form, pursued as such. Fortunately for us, it is an art form that continually supplies new mathematical tools that can be used to solve real-world problems, so society has always been willing to support us in our “artistic” pursuit. In the process, we teach generations of students techniques in the mathematical toolbox they can use to solve problems in other disciplines and application domains.

But the arrival of the digital world all but obliterated those seemingly neat classification boundaries. (Truth be told, they were never that neat, admitting many overlaps, but to the casual observer with only a minimal experience of mathematics they could seem very different.) In recent years, mathematics as a science of patterns and mathematics as a language for making the invisible visible have become front-and-center in many toolbox applications of mathematics. The COVID-19 pandemic is just the latest, and one of the most dramatic, examples of that shift.

Our national and local pandemic responses are (or should be) guided by using models to explore the consequences of our decisions before we make them. With a contagious virus, you are dealing with exponential growth, so even approximations to specific numerical values and dates are largely irrelevant. Exponential growth will eventually overwhelm you, so your only recourse is to kill that growth, and as quickly as possible since time is on the side of the virus. It’s the mathematical shapes of the curves that really matter.

In other words, with predictive modeling of a pandemic, we are ultimately looking not at numbers and arithmetic but at patterns and shapes, and we are doing so in order to “see” (to an actionable degree) the enemy’s otherwise invisible patterns of growth and infection.

The digital tools we are using to do this absolutely make extensive use of the familiar mathematical toolbox that most people think is all of mathematics. That is still there. But in today’s world, that part is all under the hood, being executed by computers. The face of mathematics being presented to the world now is very much one of mathematics as a science of patterns and as a representation system to make the invisible visible.

As a quick glance at my Twitter feed just now indicated to me once again, we have a long way to go in helping our fellow citizens come to terms with these unfamiliar-to-them, yet hugely important, faces of mathematics. (A great many tweets agreeing with the Congressperson are quibbling about “Aug. 4th”, when the real issue is we are still in exponential growth.) That is all the more reason for us mathematicians to do all we can to help people make good use of the powerful tools that, for the most part, are available to us all. It’s not about numbers and accuracy; it’s about understanding and being able to compare (often very) different risks. The stakes are high. Our lives, and those of the people close to us, can depend on interpreting the data well enough to make a wise decision.

ADDED May 5, 2020: This excellent video (24min) from the American Mathematical Society features an interview with Tulane University Mathematics Professor Mac Hyman on the way models are used in managing an epidemic. Hyman focuses on “agent-based” models designed (over many years with support from CDC and NIH) to support long-term planning in fighting an epidemic. This is to some extent different from models like the IHME model I used as an example above, which was created to predict likely medical support needs in the first wave of the viral attack. Though on May 4, with the first peak over, the IHME team modified their model to make it more suited to the changed priority of long-range planning, incorporating some of the methods Hyman describes. See the May 4 updates on the IHME website.

VIDEO: This month’s post was based on an introductory keynote I gave at the unConference for math teachers organized by MidSchool Math and the US Department of Education’s Institute of Education Sciences, held on Zoom on April 23. The video of my talk (41min inc. Q&A) is available here. The discussion of COVID-19 modeling starts at 18:12.