LLMs: Beyond Correlation to AGI

AI + a16z · Mar 17, 2026 · English · 5 min read

Explore the fundamental workings and limitations of LLMs, distinguishing pattern matching from true intelligence and outlining the path to AGI.

Key Insights

  • Insight

Transformers, the core architecture of LLMs, perform mathematically precise Bayesian updating, learning in real time by revising posterior probabilities as new evidence arrives.

    Impact

    This fundamental understanding validates LLMs' learning capabilities and can guide the development of more robust and predictable AI systems.

  • Insight

    Current LLMs excel at correlation and pattern matching (Shannon entropy) but are fundamentally limited in building causal models, performing simulations, or understanding intervention (Kolmogorov complexity).

    Impact

    Businesses deploying LLMs must be aware of this limitation; current AI is not suitable for tasks requiring true causal reasoning or novel scientific discovery without significant human intervention.

  • Insight

    Achieving Artificial General Intelligence (AGI) requires two major advancements: true plasticity for continual learning without catastrophic forgetting, and a shift from correlation to causation.

    Impact

    This defines clear research and development pathways for AI companies, shifting focus from mere scale to architectural innovations for genuine intelligence.

  • Insight

    The 'Einstein test' – an LLM generating the theory of relativity from pre-1916 physics – serves as a high bar for AGI, emphasizing the need for new representational frameworks rather than just processing existing data.

    Impact

    Provides a conceptual framework for evaluating AGI progress, focusing on true creative and explanatory power over task-specific performance.

  • Insight

    Scale alone will not solve the challenges of plasticity and causation; different architectural approaches and mechanisms are required to move beyond current LLM limitations.

    Impact

    Directs investment and talent towards novel architectural research and away from simply increasing model size, potentially leading to breakthroughs in AI capabilities.

Key Quotes

"But pattern matching is not intelligence. LLMs learn correlation. They don't build models of cause and effect."
"I think deep learning is still in the Shannon entropy world. It has not crossed over to the Kolmogorov complexity and the causal world."
"So to me, AGI will happen when these two problems get solved: plasticity, continual learning properly, and building a causal model in a more data-efficient manner."

Summary

Decoding LLMs: From Bayesian Learning to the Quest for AGI

The landscape of Artificial Intelligence is evolving at an unprecedented pace, with Large Language Models (LLMs) demonstrating remarkable capabilities. However, beneath the impressive surface of their "intelligence" lies a profound debate about their fundamental mechanics and true potential. Recent research sheds light on how these models operate and, crucially, what still separates them from Artificial General Intelligence (AGI).

The Mathematical Core: LLMs as Bayesian Processors

Contrary to early skepticism, rigorous mathematical modeling and empirical testing have confirmed that Transformer-based LLMs perform a form of Bayesian updating. Through "Bayesian wind tunnels" – controlled environments where models learn tasks with analytically computable Bayesian posteriors – it's been shown that these architectures can update their predictions with astonishing precision, matching theoretical distributions almost perfectly. This means LLMs are incredibly adept at adjusting their probabilistic beliefs based on new evidence, a process that underpins their in-context learning capabilities.
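The idea of a task with an analytically computable posterior can be made concrete with a toy example. The sketch below uses coin-bias estimation under a Beta prior as an illustrative stand-in (an assumption for clarity; the actual wind-tunnel tasks may differ), showing the exact Bayesian prediction an LLM's next-token probabilities would be compared against:

```python
# Toy "Bayesian wind tunnel": a task whose exact posterior is computable in
# closed form, so a model's predictions can be scored against it.
# Illustrative assumption: coin-bias estimation with a Beta(alpha, beta) prior.

def beta_binomial_posterior_mean(heads: int, tails: int,
                                 alpha: float = 1.0, beta: float = 1.0) -> float:
    """Exact posterior mean of the coin's bias after observing the evidence."""
    return (alpha + heads) / (alpha + beta + heads + tails)

# As evidence accumulates in context, the exact Bayesian prediction for the
# next symbol ("H" or "T") shifts; a model trained on such sequences can be
# checked against this analytic value token by token.
evidence = "HHTHHHTH"
heads, tails = evidence.count("H"), evidence.count("T")
p_next_head = beta_binomial_posterior_mean(heads, tails)
print(f"P(next = H | {evidence}) = {p_next_head:.3f}")  # → 0.700
```

The point of the comparison is not the coin task itself but the methodology: because the posterior is known exactly, any deviation in the model's predicted distribution is directly measurable.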

For instance, an early implementation of Retrieval Augmented Generation (RAG) with GPT-3 demonstrated how an LLM, previously unaware of a custom domain-specific language (DSL), could dynamically learn to translate natural language queries into that DSL with just a few examples. This real-time learning within a conversation mirrors Bayesian inference, updating probabilities of next tokens as more contextual evidence is presented.
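The few-shot pattern described above can be sketched as prompt assembly. The DSL syntax and example queries below are hypothetical (the source does not specify the actual DSL); what matters is the shape: a handful of natural-language-to-DSL pairs in context, then a new query for the model to translate:

```python
# Hedged sketch of teaching an LLM a custom DSL in-context via few-shot
# examples. The DSL and the cricket-flavored queries are hypothetical;
# each (NL -> DSL) pair acts as evidence that updates the model's in-context
# "posterior" over what valid DSL output looks like.

FEW_SHOT_EXAMPLES = [
    ("runs scored by Kohli in 2019",
     'FILTER(player="Kohli", year=2019) | SUM(runs)'),
    ("matches won by India at home",
     'FILTER(team="India", venue="home") | COUNT(wins)'),
]

def build_prompt(query: str) -> str:
    """Assemble a few-shot prompt for NL-to-DSL translation."""
    lines = ["Translate natural language into the query DSL.", ""]
    for nl, dsl in FEW_SHOT_EXAMPLES:
        lines += [f"Q: {nl}", f"A: {dsl}", ""]
    lines += [f"Q: {query}", "A:"]
    return "\n".join(lines)

print(build_prompt("wickets taken by Bumrah in 2020"))
```

The assembled string would then be sent to the model's completion endpoint; the model's continuation after the final "A:" is the translated DSL query.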

The Chasm: Correlation vs. Causation

Despite their sophisticated Bayesian capabilities, a critical limitation of current LLMs is their reliance on correlation rather than causation. LLMs excel at pattern matching – identifying associations within vast datasets (Shannon entropy). However, they do not build internal causal models of the world. This distinction is crucial: predicting what typically follows an event is different from understanding why it follows and being able to simulate interventions or counterfactuals. This inability to move from association to intervention and counterfactual reasoning – as described in Judea Pearl's causal hierarchy – is a significant barrier.
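The observation/intervention gap can be demonstrated with a minimal simulation (illustrative, not from the source). A hidden confounder drives both X and Y, so the observed conditional P(Y=1 | X=1) overstates the causal effect P(Y=1 | do(X=1)), which is what Pearl's second rung measures:

```python
# Minimal confounding demo: Z raises both X and Y, while Y does NOT depend
# on X at all. A purely correlational learner sees a strong X-Y association;
# intervening on X (Pearl's do-operator) reveals there is no causal effect.

import random
random.seed(0)

def sample(intervene_x=None):
    z = random.random() < 0.5                  # hidden confounder
    if intervene_x is not None:
        x = intervene_x                        # do(X = x): cut the Z -> X arrow
    else:
        x = random.random() < (0.9 if z else 0.1)
    y = random.random() < (0.7 if z else 0.2)  # Y depends only on Z
    return x, y

N = 100_000
obs = [sample() for _ in range(N)]
p_y_given_x1 = sum(y for x, y in obs if x) / sum(x for x, _ in obs)

do = [sample(intervene_x=True) for _ in range(N)]
p_y_do_x1 = sum(y for _, y in do) / N

print(f"P(Y=1 | X=1)     ~ {p_y_given_x1:.2f}")  # inflated by Z (~0.65)
print(f"P(Y=1 | do(X=1)) ~ {p_y_do_x1:.2f}")     # true causal effect (~0.45)
```

A model trained only on observational samples can perfectly learn the first quantity yet remain unable to answer the second; answering it requires knowing the causal graph, which is exactly the capability the article argues current LLMs lack.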

Human intelligence, by contrast, constantly constructs and refines causal models, allowing for simulation, planning, and genuine generalization. The "Einstein test" for AGI, for example, posits that an AGI should be able to generate new fundamental theories (like relativity from pre-1916 physics data) rather than just interpreting existing information. This requires creating new representations or "manifolds," a leap current LLMs cannot make, as they are "bound" to the manifolds of their training data.

The Road to AGI: Plasticity and Causal Modeling

Achieving AGI, therefore, necessitates two fundamental shifts:

1. Plasticity/Continual Learning: Unlike human brains, which remain plastic and learn throughout a lifetime, current LLMs freeze their weights post-training. While they perform Bayesian inference during a conversation, this learning is not retained across sessions. Developing robust continual learning mechanisms that avoid "catastrophic forgetting" is paramount.

2. Causal Reasoning: Moving beyond correlation to building causal models, enabling simulation and intervention, is essential. This aligns with the concept of Kolmogorov complexity – finding the shortest program that describes a phenomenon – rather than merely processing vast amounts of associative data.
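The Shannon-versus-Kolmogorov distinction drawn above can be illustrated with a toy contrast (my example, not from the source): a few lines of deterministic code can emit a bit string whose empirical Shannon entropy is near-maximal, yet whose Kolmogorov complexity is tiny, since the string is fully described by the short program itself. Frequency statistics alone cannot detect that structure:

```python
# A short deterministic program whose output "looks random" to entropy
# statistics. Its Kolmogorov complexity is roughly the length of this code,
# but its empirical Shannon entropy is close to the 1 bit/symbol maximum.

from math import log2

def short_program(n: int) -> str:
    """Emit n bits from a linear congruential generator (glibc constants)."""
    state, bits = 42, []
    for _ in range(n):
        state = (1103515245 * state + 12345) % (2 ** 31)
        bits.append(str((state >> 30) & 1))   # take the high bit
    return "".join(bits)

def empirical_entropy(s: str) -> float:
    """Per-symbol Shannon entropy of the string's symbol frequencies."""
    return -sum((p := s.count(ch) / len(s)) * log2(p) for ch in set(s))

s = short_program(100_000)
print(f"empirical entropy ~ {empirical_entropy(s):.3f} bits/symbol (max 1.0)")
```

A purely statistical learner models the frequencies; finding the short generating program, which is the causal explanation of the data, is the qualitatively different task the article associates with AGI.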

Recent work, like Donald Knuth's experiments with LLMs for Hamiltonian cycles, highlights this gap. While LLMs could explore various solutions, human ingenuity was ultimately required to synthesize a new, concise mathematical proof – effectively generating a new causal manifold. This underscores that while LLMs are powerful tools for exploration and pattern identification, human-like generalization and true understanding remain a frontier.

Conclusion

The current generation of LLMs represents an extraordinary technological achievement, adept at Bayesian inference and pattern recognition. However, the path to AGI is not merely about scaling these models. It demands fundamental architectural innovations that imbue AI with lifelong plasticity and the capacity for causal reasoning. For investors, entrepreneurs, and leaders in technology, understanding these core distinctions is crucial for identifying genuine advancements and directing resources toward the next generation of truly intelligent systems.

Action Items

Invest in research and development focused on AI architectures that enable continual learning (plasticity) and move from correlation-based pattern recognition to causal modeling.

Impact: This strategic shift is critical for building next-generation AI systems capable of AGI, expanding their utility beyond current associative tasks to complex problem-solving and scientific discovery.

Enterprises should design AI deployment strategies that acknowledge current LLM limitations, complementing them with human oversight for tasks requiring causal reasoning, novel problem-solving, or critical decision-making.

Impact: Minimizes risks associated with over-reliance on correlational AI, ensuring responsible and effective integration of LLMs into business processes.

Explore domain-specific language (DSL) creation and few-shot learning techniques for bespoke business applications, leveraging LLMs' proven Bayesian updating for in-context learning.

Impact: Enables companies to rapidly prototype and deploy AI solutions for specialized data queries and tasks, even with models not pre-trained on the specific domain.

Mentioned Companies

OpenAI

GPT-3 was central to the early development of RAG and to understanding LLM mechanics. The company also provided an interface for probability display, aiding early research.

ESPN

Successfully deployed an early production implementation of RAG using GPT-3 for a cricket database front-end, demonstrating practical business application of LLMs.

Anthropic

Mentioned as a creator of 'great products' (Claude, Co-work) in the context of discussing the nature of AI consciousness, indicating a positive perception of their work.

DeepMind

A former Columbia colleague involved in the research has joined DeepMind, indicating its role in advanced AI research and talent acquisition.

Google

Google Research published a paper on teaching LLMs Bayesian learning, which aligns with and validates the discussed research directions.

Keywords

Artificial General Intelligence, Large Language Models, AI research directions, Bayesian learning, causal models, AI continual learning, AI Shannon entropy, Kolmogorov complexity, AI limitations, future of AI