In 1965, Edsger Dijkstra's paper Cooperating Sequential Processes formalized what would become the central problem of concurrent computing: what happens when two or more sequential processes share a resource. His thought experiment — the dining philosophers, five at a round table each needing both adjacent forks to eat — is, sixty years later, still the canonical illustration of the family of bugs concurrency creates: deadlock, starvation, livelock, race conditions. The deeper claim, developed by Dijkstra, Tony Hoare, and especially Leslie Lamport (whose 1978 Time, Clocks, and the Ordering of Events in a Distributed System is among the most-cited papers in the discipline), was that concurrent programming is qualitatively harder than sequential programming because its bugs do not reproduce, depend on timing, and appear in production after surviving every test.
The fundamental difficulty is non-determinism: the same code run twice may produce different results depending on thread interleaving, and the space of possible interleavings grows exponentially in the number of operations, so testing cannot in general cover it. A race condition — two threads accessing shared state where the result depends on who got there first — is the canonical bug: incrementing a counter is not atomic, since read, add 1, write can interleave so two concurrent increments raise the counter by one instead of two. Modern CPUs make this worse by reordering memory operations for performance, so what one thread writes may not be visible to another in the order written; the Java Memory Model (2004) and C++11 memory model finally formalized what programmers can rely on, but before then the situation was a swamp. The conceptual machinery for taming concurrency runs through several traditions: locks protect critical sections but compose badly (two correct components composed may deadlock), atomic operations like compare-and-swap underpin lock-free algorithms, and message-passing models — Erlang's actor model, Go's goroutines and channels — replace shared memory with communication under the slogan don't communicate by sharing memory; share memory by communicating. Two more recent attacks have changed the landscape: CRDTs (conflict-free replicated data types) are data structures designed so concurrent updates merge without conflict and are the foundation of modern collaborative editing in Google Docs, Notion, and Figma; Rust's borrow checker enforces at compile time that at most one mutable reference exists to any piece of data, eliminating data races by construction. Distributed systems extend concurrency to multiple machines and add failure — messages lost, nodes crashed, networks partitioned — and the CAP theorem (Brewer 2000, proven by Gilbert and Lynch 2002) shows you can have at most two of Consistency, Availability, Partition tolerance, while the FLP impossibility (Fischer-Lynch-Paterson 1985) shows that in an asynchronous system with even one faulty process, consensus is impossible. These results, sharp and depressing, define the boundaries of what distributed systems can do.
Modern systems are concurrent by default — laptops have 8–24 cores, web servers handle thousands of concurrent connections, and every major language now offers async/await sugar over cooperatively-scheduled tasks. Distributed-systems frameworks (Kubernetes, Kafka, Spark) and consensus protocols (Paxos 1989, Raft 2013) are how production-scale systems are built; Bitcoin's proof-of-work is a Byzantine-fault-tolerant consensus protocol on an open network, solving a problem long thought theoretically impossible. The same patterns reach AI: training a frontier LLM requires thousands of GPUs operating in concert under data, tensor, pipeline, and sequence parallelism, and the bottleneck is often communication and synchronization rather than compute. The conceptual frameworks (Dijkstra, Hoare, Lamport, CAP) are durable; the engineering practice changes faster than any other area of computer science.