How relevant is Cognitive Load Theory to learning?

A tweet from educator (and original MOOC co-creator) Stephen Downes (@oldaily) caught my eye as I was about to get down to writing this month’s post. In standard Downes fashion, the tweet took me to a short commentary on his blog with a link to the longer post he was commenting on, written by Nick Shackleton-Jones (formerly Head of Online and Informal Learning for the BBC in the UK, who as of today I follow on Twitter).

The observation in Downes’ summary that jumped out at me was this:

“Cognitive load will make some difference to problem solving but what really matters is whether or not someone cares about solving the problem.”

I sensed that what I would find in the article was a discussion having relevance to some mathematics-education ideas I have been working on for a couple of decades now. As it turned out, however, it was more than just relevant. It aligned well with an entire train of research into the notion of information that I and others have been pursuing since the early 1980s. That research (centered at, but not exclusive to, Stanford University in the heart of Silicon Valley) was originally motivated not by education but by the design of effective information processing technologies, and it was only later, after I served a term on the US Mathematical Sciences Education Board (MSEB), that I found myself following that research thread into the world of math ed. More on that later. Meanwhile, with my originally intended August post now pushed to the back burner, let me pursue the issue Downes’ commentary pointed me to.

Cognitive Load Theory (CLT) is an educational theory developed in the 1980s in the course of studying problem solving. It is based on the notion that people have limited mental capacity for processing information, and argues that learning experiences and materials should be designed in a way that takes account of those limitations. (In particular, it views problem solving as a form of information processing.)

The article by Shackleton-Jones provides a brief summary and history of CLT, before moving on to the thrust of the point he wants to make. He says, in a provocative way guaranteed to get my attention: “Though there is nothing fundamentally wrong with this idea, it risks distracting us from the things that really matter, when people learn.”

Ideally, I suggest you check out the S-J article (the “longer post” link above) before progressing here, but for completeness let me just remark that CLT views the brain as an organ for processing information. (That was the trigger to my earlier IT-related research; more on that later.) In particular, it assumes that reasoning and problem solving involve retrieving information from the brain’s long term memory and storing it temporarily in (short term) working memory, where the actual problem solving is assumed to take place. While long term memory appears to have effectively limitless capacity, working memory is highly limited. (Hence the significance of the “cognitive load.”)

As S-J notes, there is evidence to support this (theoretical) information processing structure and the limitations on cognitive load, and it has proved to be a useful perspective from which to describe and understand purposeful mental activity, leading to concrete advice for educators.

[Whether it tells us how our brain actually solves problems is a different matter. None of us have access to what goes on inside our heads, and there is no reason to assume we have concepts and a language capable of describing its behavior “accurately.” All talk of information and information processing is just that: talk. What makes adoption of the information stance (as I, and others, call it) towards the brain useful is that it is, indeed, useful: it provides a way of understanding mental activity, with a language to communicate about it, and provides a framework for creating systems and technologies that aid us in our mental work.]

For instance, CLT tells us that if an instructor displays a passage of text on a screen, it is wise to give the audience time to read it before moving on. The advantages of reading it aloud, with emphases and comments, are likely to be lost if the audience try to read the text (as they surely will) at the same time as they attempt to follow an audio narration. Working memory simply cannot give equal attention to incoming audio and visual information streams at the same time. The cognitive load is too great. One channel has to be de-emphasized. We can do that quite easily if, say, the audio channel is music, when we can generally read the text just fine while retaining just an overall awareness of the music. But trying to read and hear the same text at the same time causes overload, and in general the consequence is that neither succeeds.

See the S-J article for more. The author also provides pointers to more substantial treatments of CLT, some with a focus on mathematics instruction, in particular the relative merits of worked examples versus practice problem-solving.

In fact, let me suggest that at this stage you really should read the S-J article. It is clearly written and easy to follow, and hence, in an age of instant access to original sources, it would serve no purpose for me to summarize here what is already an excellent summary.

So, moving on to what for me was the punchline of S-J’s essay, he writes:

“Although Cognitive Load Theory has some worthwhile applications, it risks distracting us from more important variables affecting learning, some key concepts are poorly or completely undefined, and it is narrow in application. Overall it may lead us to focus on the presentation of the material we are teaching, rather than the learning process and the learner.”

After some discussion in which he elaborates that observation, S-J narrows in on the specific point he wants to make:

“The major objection to Cognitive Load Theory is not, therefore, that it is wrong – but that is it a distraction from more important aspects of the learning process. The cognitive effects that it describes are most applicable in a relatively narrow range of contexts created by the education system and therefore have limited applicability to real-world learning. Even worse, they risk distracting us from what is really happening when we learn.”

This is the part that in particular leapt out to me: “applicable in a relatively narrow range of contexts created by the education system and therefore have limited applicability to real-world learning.” [My emphasis.] Replace “real-world learning” by “real-world problem solving” in that quotation and we are squarely back in the realm I have been pursuing in some of my more recent Devlin’s Angle posts, where I have been advocating making the primary goal of systemic mathematics education the development of the capacity to solve real-world problems (using the full plethora of technological tools available today) — as opposed to mastery of a range of mathematical procedures mandated by some committee or other.

Far more important to achieving good learning, S-J argues, is to design learning experiences based on what he refers to as the Affective Context Model. (That link takes you to a very short introduction.) The name is, I believe, due to S-J, who has been advocating its use for some years. Though the ACM was new to me, and I suspect to many readers, the notion of affective learning has been knocking around the educational world for decades, going back to the introduction of Bloom’s Taxonomy in the 1950s.

Actually, what I just said is not entirely true. The name “affective context model” was new to me, but the concept was very familiar from the work I and others did at Stanford’s Center for the Study of Language and Information, starting in the early 1980s. [I started collaborating with CSLI in the mid-1980s, just after the work got underway, spending two years at the center from 1987 to 1989, and returning as the Executive Director in 2001.]

As I noted earlier, the motivation I and many others had at the time was not education (though a few of the researchers associated with CSLI were focused on education from the get-go), but rather understanding the notion of information in a way that could lead to the productive design of effective information technologies that people would find easy and natural to use.

We began by trying to reach an agreement as to what exactly “information” is (more precisely, what we should take it to be), with a view to crafting a definition that could support a scientific theory. The highly multidisciplinary structure of CSLI was intended to provide a research community that had a reasonable chance of succeeding in that enterprise (among other goals).

[The funding for CSLI — the founding award in 1983 was $23M, equivalent to around $60M today — came from the System Development Foundation, a non-profit spinoff from the RAND Corporation that designed and built much of the US IT infrastructure in the 1950s and 60s. Having helped build a national IT infrastructure, the folks at RAND thought it would be helpful to retrofit a scientific theory of information, in order to understand just what it was that IT was “processing,” and guide future IT developments.]

In 1991, I wrote a book, Logic and Information, that described much of that early work. It developed a mathematical model of what information is, how it arises, how it can be encoded, and how it can be transmitted. Recognizing that when people talk of information, it is almost always highly contextual, the theory was built on a grounding theory of contexts, called Situation Theory by the two researchers who initially developed it, the mathematician Jon Barwise (deceased) and the philosopher John Perry.

Though modern society tends to view information as a commodity (indeed, the theory has theoretical entities called infons — items of information) that can be created, transported, bought and sold, and consumed, the framework we eventually developed presents a much more complex picture of multiple, interacting contexts. Over the years following the initial development of the theory, the rubber hit the road as it started to be applied to a variety of real-world situations, among them human communication, education, manufacturing, production-line design (including automobiles), silicon-chip design, Space Station planning, and intelligence analysis. What became clear from that applied work was that the theoretical (hence naive) perspective described in Logic and Information, whereby contexts were treated as supporting characters in the creation and transmission of information, obscured the overwhelming fact that contexts were the main drivers. The information transmitted by any given signal depends fundamentally on the originating context at the time of issuance and the receiving context at the time of receipt. The same signal (word, sentence, email, etc.) can convey very different information under different circumstances. That massive context dependency is where the theory’s main focus has to be, both for study and for application. The mathematical notion of information we developed within Situation Theory is used only as an artifactual prop to facilitate the study and discussion of the way the pertinent contexts interact. (I wrote a subsequent book, InfoSense, in 1999 that was all about contexts.) 

The idea of information as a commodity to be created, shipped around, and consumed, which arose in the nineteenth century with the growth of mass-produced daily newspapers (it’s not hard to see why that could have been the cause), is hopelessly inadequate in a world having today’s information technologies and the high degree of instant global connectivity, where many contexts can be in play. In the final analysis, talk about “information flow” (the classic, commodity view) is just one (albeit very useful and productive) way to view the way people (or societies) act and interact.

No surprise then, given that background, that the moment I read S-J’s account of the significance for learning of affective context, I knew at once he was onto something. In fact, I am sure he is onto THE thing. (Actually, it’s just one “the thing.” While learning is a natural ability Homo sapiens acquired through evolution, education is a complicated human-created activity with many facets. Despite coming from different backgrounds, as we do, I am sure that S-J and I agree on that. We have yet to interact, by the way.)

So, if you have so far resisted looking at S-J’s article, let me make one last attempt to persuade you to do so. That, in fact, is the main goal of this month’s post.

Devlin's Angle, Keith Devlin, August 1, 2019
Tags: cognitive load theory, affective learning, affective learning model, Bloom's Taxonomy, CSLI, Stanford University, Shackleton-Jones, situation theory, Keith Devlin