For most of its history, a function meant a formula. Newton called them fluents, Leibniz called them quantities depending on a variable, and either way the assumption was that you could write the dependence down with arithmetic and roots and the occasional sine. Then, in 1837, the German mathematician Peter Gustav Lejeune Dirichlet gave a definition that quietly broke that assumption: a function, in the modern paraphrase of his proposal, is a rule that assigns to each element of one set exactly one element of another, with no formula required. The example that bears his name, which he had first described in an 1829 paper on Fourier series, was deliberately monstrous: in its usual modern form, a function that returns 1 if the input is rational and 0 if it is irrational. It has no elementary expression and no graph anyone could draw, but it is a function nonetheless. The move opened the door to most of modern mathematics.
A function ƒ from a set X (the domain) to a set Y (the codomain) is, formally, a set of ordered pairs (x, y), with x ∈ X and y ∈ Y, such that each x ∈ X appears as a first component exactly once. The unique-output requirement is the entire content of the definition: every x has exactly one y. From this minimum the rest follows: composition (ƒ ∘ g applied to x means ƒ(g(x)), which requires the outputs of g to lie in the domain of ƒ), inverse (a function that undoes ƒ, when one exists), injectivity (different x's go to different y's), surjectivity (every y in the codomain is hit), bijectivity (both at once, which is exactly the case in which an inverse exists). Functions ate mathematics for a simple reason: almost any dependency in which one quantity determines another can be modeled as a function. The position of a planet as a function of time. The probability of a sample as a function of the parameter. The output of a neural network as a function of the inputs. The shift from formulas to the abstract dependency itself freed mathematics from caring whether the rule could be written down. Higher-order functions, which take functions as inputs or return functions as outputs, became the foundation of lambda calculus (Church, 1930s) and, through it, of every functional programming language and most of theoretical computer science. The deep observation behind all of this is that once you have the function, the input is data and the output is consequence, a separation that lets one write an algorithm once and apply it to anything that fits.
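For finite sets, the definitions in this paragraph can be checked mechanically. The Python sketch below is only an illustration: the helper names (is_function, compose, is_injective, is_surjective, inverse) and the small example sets are assumptions made up for this example, not part of any standard library. It treats a function as a set of ordered pairs, then stores it as a dict for convenience.

```python
def is_function(pairs, domain, codomain):
    """True if `pairs` is a function from `domain` to `codomain`."""
    firsts = [x for x, _ in pairs]
    return (
        set(firsts) == set(domain)              # every x appears at least once
        and len(firsts) == len(set(firsts))     # and at most once
        and all(y in codomain for _, y in pairs)
    )

def compose(f, g):
    """Return f ∘ g as a dict mapping x to f(g(x)).
    Assumes every output of g lies in the domain of f."""
    return {x: f[g[x]] for x in g}

def is_injective(f):
    """Different inputs go to different outputs."""
    return len(set(f.values())) == len(f)

def is_surjective(f, codomain):
    """Every element of the codomain is hit."""
    return set(f.values()) == set(codomain)

def inverse(f, codomain):
    """Return the inverse of f; it exists only when f is a bijection."""
    if not (is_injective(f) and is_surjective(f, codomain)):
        raise ValueError("only a bijection has an inverse")
    return {y: x for x, y in f.items()}

# Hypothetical example sets, chosen only for illustration.
X = {1, 2, 3}
Y = {"a", "b", "c"}
pairs = {(1, "a"), (2, "b"), (3, "c")}
not_a_function = {(1, "a"), (1, "b"), (2, "c")}   # 1 has two outputs, 3 has none

f = dict(pairs)                     # f : X -> Y
h = {"a": 10, "b": 20, "c": 30}     # h : Y -> {10, 20, 30}
h_after_f = compose(h, f)           # (h ∘ f)(x) = h(f(x))

print(is_function(pairs, X, Y))           # True
print(is_function(not_a_function, X, Y))  # False
print(h_after_f[2])                       # 20
print(inverse(f, Y)["b"])                 # 2
```

Nothing in the checks mentions a formula: the pairs could have been listed in any order and chosen by any rule, which is the point of the definition.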
Modern programming languages are organized around functions: pure functions in Haskell, first-class functions in JavaScript, methods on objects, lambda expressions in Python. Spreadsheet formulas are functions. A REST API call behaves, to a first approximation, like a function applied across a network. Machine-learning models are functions: a neural network with billions of parameters is, formally, a single function from inputs to outputs that has been learned rather than written. The shift Dirichlet started, from the formula to the rule, turns out to be the right level of abstraction for almost every computational thing humans now build.
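The idea of a function that is learned rather than written can be shown at toy scale. The sketch below is a hedged illustration, not how a neural network is trained: fit_line is a hypothetical helper that fits a straight line by ordinary least squares and returns the result as a Python function, and the data points are invented for the example. It is also a higher-order function, since its output is itself a function.

```python
def fit_line(xs, ys):
    """Least-squares fit of y ≈ a*x + b.
    Returns the fitted rule as a function, not as a pair of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx            # slope estimated from the data
    b = mean_y - a * mean_x  # intercept estimated from the data
    return lambda x: a * x + b

# Invented training data lying exactly on y = 2x + 1.
model = fit_line([0, 1, 2, 3], [1, 3, 5, 7])

print(model(10))                  # 21.0
print(list(map(model, [4, 5])))   # [9.0, 11.0]
```

The caller never sees the slope or the intercept, only a rule that maps inputs to outputs. A large model differs in scale and in how its parameters are found, but it is used in the same way.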