From Russian Math Feuds to Google Search: The Surprising Power of Markov Chains
Have you ever wondered how many times you need to shuffle a deck of cards to truly randomize it? Or how scientists figured out the amount of uranium needed to build a nuclear bomb? Or even how Google knows exactly which page you’re searching for? The answers to these seemingly unrelated questions all trace back to a fascinating mathematical breakthrough that began over a century ago in Russia—a story of rivalry, insight, and innovation centered around something called Markov chains.
The Great Russian Math Feud: Free Will vs. Probability
In 1905, Russia was deeply divided politically between the Tsarists, who wanted to maintain the status quo, and socialists pushing for reform. This rift extended even into the world of mathematics, where two prominent mathematicians took opposing sides:
- Pavel Nekrasov (“the Tsar of Probability”), a religious man who believed mathematics could explain free will and divine will.
- Andrey Markov (“Andrey the Furious”), an atheist who argued that math had nothing to do with free will and criticized Nekrasov's approach as unrigorous.
Their debate centered on the law of large numbers, a principle stating that as you repeat independent trials (like flipping a fair coin many times), the average outcome converges to the expected value; for a fair coin, the fraction of heads approaches 1/2. Nekrasov believed that this convergence implied the events must be independent, and therefore, in human affairs, acts of free will.
Markov challenged this by showing that even dependent events—where one outcome influences the next—can still exhibit this convergence. To prove it, he analyzed the sequence of letters in a famous Russian poem, Eugene Onegin. Letters in text are clearly dependent on preceding letters (e.g., vowels and consonants tend to follow certain patterns), yet Markov demonstrated that these dependent sequences still obeyed the law of large numbers.
This led to the birth of Markov chains, mathematical models that describe systems where the next state depends only on the current state, not the full history. Markov’s conclusion was clear and powerful: free will is not necessary to do probability.
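Markov's argument can be illustrated with a small simulation. The sketch below uses a two-state chain over vowels and consonants with made-up transition probabilities (not Markov's actual counts from Eugene Onegin): each letter depends on the previous one, yet the running vowel frequency still settles to a stable long-run value, just as the law of large numbers predicts.

```python
import random

# Hypothetical transition probabilities (illustrative, not Markov's real
# figures): the chance that the NEXT letter is a vowel, given the current one.
P_VOWEL_AFTER_VOWEL = 0.2
P_VOWEL_AFTER_CONSONANT = 0.6

def vowel_frequency(n_steps, seed=0):
    """Simulate the chain for n_steps letters and return the vowel fraction."""
    rng = random.Random(seed)
    state = "consonant"
    vowels = 0
    for _ in range(n_steps):
        p = P_VOWEL_AFTER_VOWEL if state == "vowel" else P_VOWEL_AFTER_CONSONANT
        state = "vowel" if rng.random() < p else "consonant"
        vowels += state == "vowel"
    return vowels / n_steps

# Despite the dependence between consecutive letters, the average converges
# to the chain's stationary vowel frequency (here 0.6 / 1.4 = 3/7 ≈ 0.429).
for n in (100, 10_000, 200_000):
    print(n, round(vowel_frequency(n), 3))
```

The stationary value 3/7 comes from solving the balance equation for the two states; no assumption of independence is needed for the average to converge.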
From Poetry to Nuclear Physics: The Monte Carlo Method
Fast forward to World War II, when scientists faced a daunting problem: understanding how neutrons behave inside a nuclear bomb. The complex interactions of trillions of neutrons were impossible to calculate exactly.
Enter Stanislaw Ulam, a mathematician who, while recovering from illness, pondered a simpler question: What are the chances of winning a random game of Solitaire? Realizing an exact analytical solution was intractable, he proposed playing many games and counting wins to approximate the probability, a method of statistical sampling.
Inspired by this idea, Ulam and John von Neumann applied the concept to simulate neutron behavior by generating many random chains of neutron interactions, using Markov chains to handle the dependencies between steps. Running these simulations on early computers like the ENIAC allowed them to estimate whether a chain reaction would sustain or explode—a critical insight for nuclear weapon design.
They named this approach the Monte Carlo method, after the famous casino, highlighting the role of randomness and chance in their computations. This method revolutionized physics and engineering, enabling scientists to tackle otherwise intractable problems through probabilistic simulation.
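Ulam's sampling idea can be sketched in a few lines. To keep the example checkable, the sketch below swaps Solitaire (whose rules are exactly what made an analytical answer hopeless) for a stand-in card question with a known exact answer, 5/52: does the ace of spades land in the top five cards of a shuffled deck?

```python
import random

def estimate_probability(event, n_trials, seed=0):
    """Ulam-style statistical sampling: run many random trials and
    report the fraction in which the event occurred."""
    rng = random.Random(seed)
    hits = sum(event(rng) for _ in range(n_trials))
    return hits / n_trials

def ace_in_top_five(rng):
    # Card 0 represents the ace of spades.
    deck = list(range(52))
    rng.shuffle(deck)
    return 0 in deck[:5]

# The Monte Carlo estimate approaches the exact value 5/52 ≈ 0.096.
print(estimate_probability(ace_in_top_five, 100_000))
```

The same recipe, with the trial replaced by a simulated chain of neutron collisions, is what Ulam and von Neumann ran on the ENIAC.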
Markov Chains and the Rise of Google
Jumping ahead to the explosion of the internet in the 1990s, the challenge shifted to finding relevant information amid millions of webpages. Early search engines like Yahoo ranked pages by how often keywords appeared, but this was easily manipulated and produced poor-quality results.
Two Stanford PhD students, Sergey Brin and Larry Page, had a breakthrough: they realized that links between webpages act like endorsements. Just as a popular library book accumulates many checkout stamps, a webpage linked to by many others is likely more important.
They modeled the web as a Markov chain, where each webpage is a state and links represent transitions. By simulating a “random surfer” clicking links, they calculated the steady-state probability of being on each page, giving a ranking called PageRank. Crucially, a link only counts for as much as the page it comes from, so farms of unimportant pages linking to each other could not easily game the ranking.
To keep the surfer from getting trapped in loops or dead ends, they introduced a damping factor: with some probability the surfer ignores the links and jumps to a random page, ensuring the model explores the entire web. This innovation powered Google, launched in 1998, which rapidly surpassed competitors by delivering far better search results.
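A minimal sketch of the idea, on a hypothetical four-page web (the link structure here is invented for illustration). Each page's score is the long-run probability that the random surfer is on it; the damping factor `d` is the chance of following a link rather than jumping to a random page.

```python
# Hypothetical toy web: page -> pages it links to.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

def pagerank(links, d=0.85, n_iter=100):
    """Power iteration on the random-surfer chain."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1 / n for p in pages}
    for _ in range(n_iter):
        # Every page gets the "random jump" share...
        new = {p: (1 - d) / n for p in pages}
        # ...plus a share of each page that links to it.
        for p, outs in links.items():
            for q in outs:
                new[q] += d * rank[p] / len(outs)
        rank = new
    return rank

ranks = pagerank(links)
# Page C is linked to by three pages, so it earns the highest rank.
print(max(ranks, key=ranks.get))
```

Note how page D, which nothing links to, still gets a small score from the damping term: the random jumps are what keep the surfer from being permanently trapped elsewhere.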
Beyond Search: Markov Chains in Language and AI
Markov’s original model, in which each letter depends only on the one before it, evolved over time. Claude Shannon, the father of information theory, extended the idea to sequences of words, showing that conditioning on longer histories improved prediction quality.
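A Shannon-style word chain can be built in a few lines: record which words follow each word in a training text, then generate by repeatedly sampling a successor of the current word. The tiny corpus below is made up for illustration.

```python
import random
from collections import defaultdict

# Toy training text (illustrative only).
corpus = "the cat sat on the mat and the dog sat on the rug".split()

def build_chain(words):
    """Map each word to the list of words observed to follow it."""
    chain = defaultdict(list)
    for cur, nxt in zip(words, words[1:]):
        chain[cur].append(nxt)
    return chain

def generate(chain, start, length, seed=0):
    """Random walk on the word chain: each word depends only on the last."""
    rng = random.Random(seed)
    word, out = start, [start]
    for _ in range(length - 1):
        if word not in chain:
            break  # no observed successor; stop early
        word = rng.choice(chain[word])
        out.append(word)
    return " ".join(out)

print(generate(build_chain(corpus), "the", 8))
```

Conditioning on pairs or triples of preceding words (Shannon's longer histories) follows the same pattern, with tuples as the chain's states.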
Modern language models, like those behind Gmail’s autocomplete or advanced AI chatbots, build on these principles but add mechanisms like attention to weigh the importance of different parts of the text context, enabling nuanced understanding and generation of human language.
The Memoryless Magic of Markov Chains
What makes Markov chains so powerful is their memoryless property: the future depends only on the present state, not the entire past. This simplification allows incredibly complex systems—from neutron physics to web surfing to language processing—to be modeled and understood.
While not all systems fit this model perfectly (feedback loops in climate change, for example, complicate predictions), Markov chains have become a cornerstone of probability and applied mathematics.
How Many Shuffles to Randomize a Deck?
Returning to the original question: how many shuffles does it take to randomize a deck of 52 cards? Surprisingly, the answer is just seven riffle shuffles to achieve near-perfect randomness.
This can be understood by modeling each shuffle as a step in a Markov chain whose states are the 52! possible arrangements of the deck. After about seven riffle shuffles, the distribution over arrangements is close to uniform.
Conclusion
From a century-old feud between Russian mathematicians to the heart of nuclear physics, internet search, and AI language models, Markov chains have quietly transformed our world. They reveal how dependent, complex systems can still be understood through probability, and how simple mathematical insights can lead to revolutionary technologies.
So next time you shuffle cards, search the web, or use predictive text, remember the fascinating chain of ideas—pun intended—that made it all possible.
Want to dive deeper into these concepts?
Check out Brilliant.org for interactive lessons on math, probability, AI, and more. From understanding Markov chains to exploring the math behind shuffling cards, Brilliant makes learning engaging and fun. Visit brilliant.org/veritasium for a free 30-day trial and 20% off annual premium subscriptions.
References: Inspired by the work of Andrey Markov, Stanislaw Ulam, John von Neumann, Sergey Brin, Larry Page, Claude Shannon, and many others who have shaped the mathematical tools we rely on every day.