Imagine settling in with a page-turner or locking in during a tight chess match—your brain is always adjusting, storing details, and making predictions about what comes next. You might not notice it, but this internal notepad is always at work, quietly helping you navigate the twists and turns.
Advanced AI language models like ChatGPT do something a bit similar, though by very different means. Their “thought process” is powered by transformer architecture, a clever structure that helps them predict what should come next in a sequence—whether it’s a word in a sentence or the next logical step in a math problem. Yet, are these models actually thinking step-by-step, the way we do? Or are they cutting corners in ways we might not expect?
To explore that, MIT researchers cooked up a digital riff on the classic shell game—the one with the cups and hidden object. The AI was handed a line-up of numbers and told the “rules” for shuffling them, but wasn’t allowed to peek after the shuffling began. The challenge: predict the final arrangement purely from the starting point and the instructions.
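In code, the task looks something like the following toy sketch (the sequence, swap format, and function names here are illustrative, not the study's actual setup):

```python
# A minimal "shell game" simulation: apply a list of swap instructions,
# one at a time, to a starting sequence and report the final arrangement.
def apply_swaps(sequence, swaps):
    """Play out each (i, j) swap instruction in order."""
    state = list(sequence)
    for i, j in swaps:
        state[i], state[j] = state[j], state[i]
    return state

start = [0, 1, 2, 3]
instructions = [(0, 2), (1, 3), (2, 3)]
print(apply_swaps(start, instructions))  # → [2, 3, 1, 0]
```

The model never sees the intermediate states; it must produce the final line from the starting sequence and the instructions alone.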
Did the models painstakingly follow every swap, the way an obsessive human player might? Not quite. Instead, they found their own mathematical shortcuts. Two main strategies emerged. One, called the “Associative Algorithm,” grouped nearby steps, solved each group as a partial problem, and then pieced the partial answers together, much like branches of a tree joining at the trunk. This made the models much faster and, in some cases, even more accurate than methodically playing out every move.
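The grouping trick works because each swap is a permutation, and permutation composition is associative: nearby steps can be merged in pairs and the partial results combined tree-style, rather than replayed strictly left to right. A rough sketch of that idea (my own illustrative code, not the paper's implementation):

```python
def swap_perm(n, i, j):
    """The permutation of n positions that exchanges positions i and j."""
    p = list(range(n))
    p[i], p[j] = p[j], p[i]
    return p

def compose(p, q):
    """Composition: apply p first, then q."""
    return [p[q[i]] for i in range(len(p))]

def tree_reduce(perms):
    """Merge neighbouring pairs level by level; associativity makes
    this regrouping safe, and it halves the chain at every level."""
    while len(perms) > 1:
        nxt = [compose(perms[k], perms[k + 1])
               for k in range(0, len(perms) - 1, 2)]
        if len(perms) % 2:       # odd one out carries over unchanged
            nxt.append(perms[-1])
        perms = nxt
    return perms[0]

def apply_perm(p, seq):
    return [seq[p[i]] for i in range(len(p))]

swaps = [(0, 2), (1, 3), (2, 3)]
perms = [swap_perm(4, i, j) for i, j in swaps]
print(apply_perm(tree_reduce(perms), [0, 1, 2, 3]))  # → [2, 3, 1, 0]
```

The tree reduction gives the same final arrangement as the step-by-step replay, but in logarithmically many combining rounds instead of one round per swap.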
There was also the “Parity-Associative Algorithm,” which first noted whether the total number of swaps was odd or even, then applied similar grouping shortcuts. This worked well for simple puzzles but started to stumble on longer, more complex ones.
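The parity idea rests on a standard fact from mathematics: every swap flips a permutation's parity, so the final arrangement's parity is simply the number of swaps mod 2. A hypothetical sketch of that cheap global check (again illustrative code, not the study's):

```python
def perm_parity(p):
    """Count inversions mod 2: 0 for an even permutation, 1 for odd."""
    inversions = sum(p[a] > p[b]
                     for a in range(len(p))
                     for b in range(a + 1, len(p)))
    return inversions % 2

def net_permutation(n, swaps):
    """The overall rearrangement produced by applying the swaps in order."""
    state = list(range(n))
    for i, j in swaps:
        state[i], state[j] = state[j], state[i]
    return state

swaps = [(0, 2), (1, 3), (2, 3)]
net = net_permutation(4, swaps)
# Each swap flips parity, so the answer's parity costs almost nothing
# to predict, without replaying a single move.
assert perm_parity(net) == len(swaps) % 2
```

Parity narrows down the answer but never pins it: many different arrangements share a parity, which is one reason a heuristic built on it can falter as puzzles grow.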
How did the researchers figure all this out? They used specialized tools—imagine reading the AI’s private notes mid-process—that let them see when the model guessed correctly and when it tripped up. What they found was clear: the Associative Algorithm allowed the model to learn faster and keep its cool even as challenges grew, while the parity-based strategy, dependent on quick heuristics, hit its limits with complexity.
So, what does all this mean for how we build and train AI? Lead author Belinda Li suggests that, instead of forcing AI to think strictly like us, maybe we should help them refine the unique strategies they develop. For instance, stacking more “layers” into a model may build deeper, more reliable chains of reasoning than simply asking it to take more steps in sequence.
Although the study worked with relatively small models and synthetic data, the team believes these insights could carry over to powerful tools like GPT-4.1. Next steps include testing these ideas on models tackling real-world tasks, like following a plot in a novel or tracking variables in software code.
This kind of research could reshape how AI handles all sorts of tricky state-tracking tasks, from writing recipes to keeping conversations straight. As the field moves forward, the MIT team—Belinda Li, Zifan “Carl” Guo, and Jacob Andreas—hopes their findings will lead to smarter, more trustworthy AI by helping us steer clear of problematic “shortcuts” while encouraging sound reasoning.
Their work debuted at the International Conference on Machine Learning (ICML) and was made possible thanks to support from organizations including Open Philanthropy, the MIT Quest for Intelligence, the National Science Foundation, the Clare Boothe Luce Program for Women in STEM, and the Sloan Research Fellowship.
Source: https://news.mit.edu/2025/unique-mathematical-shortcuts-language-models-use-to-predict-dynamic-scenarios-0721