Working Memory

About four chunks at a time — the cognitive scratchpad that bounds nearly every act of reasoning.

In 1956, the cognitive psychologist George A. Miller, then at Harvard, published a paper in Psychological Review with the deliberately conversational title "The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information." Miller had noticed something strange about human cognition: across many tasks (recalling digits, distinguishing tones, identifying tastes), performance broke down at around seven items. The number seemed to be a fundamental feature of human information-processing capacity. Subsequent work, especially Nelson Cowan's 2001 review, revised the number downward: closer to four, when chunking is controlled for. Working memory, the cognitive scratchpad where information is held actively for ongoing tasks, has a hard, small, surprisingly stable capacity, and that capacity bounds nearly every act of complex reasoning a human being can perform.

Working memory is the active maintenance and manipulation of information over a span of seconds to a minute, distinct from passive short-term storage on one side and durable long-term memory on the other. Alan Baddeley and Graham Hitch's 1974 architecture, refined since, describes it as a central executive (attentional control, conflict resolution, task switching) coordinating a small set of specialized buffers: a phonological loop with a sub-vocal rehearsal mechanism for verbal material, a visuospatial sketchpad for visual and spatial material, and an episodic buffer, added later, to integrate across modalities and link to long-term memory. The capacity, at any given moment, is around four learned chunks. The expert's apparent expansion of capacity is mostly chunking in disguise: where a novice sees a mid-game chess position as some twenty-five individual pieces, a master sees it as a handful of strategic chunks, each compressing a familiar pattern, a result Chase and Simon nailed down in the 1970s after de Groot's earlier work. The cellular substrate is persistent firing in prefrontal and parietal cortex: Goldman-Rakic's monkey recordings in the 1980s showed individual PFC neurons holding stimulus-specific information across delay periods without the stimulus being present.
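The chunking argument can be made concrete with a toy sketch. The code below is an illustration only, not a cognitive model: the function name `chunk`, the `CAPACITY` constant, and the digit string are assumptions introduced here. It greedily splits a sequence into the longest patterns already held in "long-term memory" and counts how many items are left to hold.

```python
def chunk(sequence, known_patterns):
    """Greedily split `sequence` into the longest known patterns.

    Anything that matches no known pattern falls back to a
    single-character chunk, which is how a novice would hold it.
    """
    chunks = []
    i = 0
    # Longest pattern worth trying; 1 if nothing is known at all.
    max_len = max((len(p) for p in known_patterns), default=1)
    while i < len(sequence):
        # Try the longest candidate first, shrinking down to one item.
        for size in range(min(max_len, len(sequence) - i), 0, -1):
            piece = sequence[i:i + size]
            if size == 1 or piece in known_patterns:
                chunks.append(piece)
                i += size
                break
    return chunks


CAPACITY = 4  # illustrative stand-in for the roughly-four-chunk limit

digits = "149217761945"

# A novice has no stored patterns: every digit is its own item.
novice = chunk(digits, set())

# An "expert" (hypothetically, someone who knows these dates) stores
# each four-digit year as a single pattern in long-term memory.
expert = chunk(digits, {"1492", "1776", "1945"})

print(len(novice), "items for the novice; fits:", len(novice) <= CAPACITY)
print(len(expert), "items for the expert; fits:", len(expert) <= CAPACITY)
```

The same twelve digits overflow the limit as isolated items and fit comfortably as three familiar patterns. The buffer has not grown; the stored patterns are doing the work.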

The narrow capacity has cascading consequences across cognition. Working-memory measures correlate with fluid intelligence at around 0.5–0.7, one of the strongest relationships in cognitive psychology, and individual differences in capacity predict reading comprehension, mathematical performance, and academic achievement. Reasoning requires holding premises while operating on them, so deeply nested arguments rapidly exhaust the buffer; complex decisions are made under partial information because the rest doesn't fit; arithmetic performance, as Dehaene's work has shown, is sharply bounded by how many partial results can be juggled mid-calculation. Expertise substitutes long-term-memory chunks for working-memory load, which is why a domain expert can reason about much more elaborate situations than a novice in the same domain — they have offloaded the storage. Working-memory training is the part that has held up worst: Cogmed and n-back training produce small task-specific gains but transfer poorly to general cognition, and the strong transfer claims of the early 2010s have largely failed to replicate. The capacity Miller noticed in 1956 is, in retrospect, the bottleneck through which most of conscious cognition has to pass, and it is mostly fixed.

Why it matters now

The concept now frames how we think about artificial minds: a large language model's context window plays the role of its working memory, a fixed, finite span of active tokens that bounds what it can reason over in a single pass, and the scramble to extend that window echoes the human bottleneck Miller named. In education, cognitive-load theory turns the four-chunk limit into a design principle: worked examples, chunked sequences, and offloaded reference material all exist to keep learners under the ceiling. Clinically, working-memory deficits accompany ADHD and schizophrenia, and capacity also declines in normal aging. And the commercial promise of brain-training to raise the ceiling has largely collapsed under failed replication. The capacity stays stubbornly fixed, which is exactly why expertise, not training, remains the only reliable way around it.

Polymathic — a curated catalogue of the ideas worth keeping across twelve disciplines. polymathic.app