Chinese Researchers Discover the Cause of "Hallucinations" in Large Language Models

The Origin of Hallucinations: A Hidden 'Compliance' Drive Among Neurons

AI models speak as if they possess knowledge. Yet they sometimes present information that is entirely fictional, with no connection to reality. This phenomenon is known in the tech world as "hallucination." Now a Chinese research team reports that it has, for the first time, traced the origin of these hallucinations to the level of individual neurons. And the answer, contrary to expectations, lies not in complex training data or flawed objectives but in a tiny fraction of the model's neurons.

The 0.1% of Neurons That Hold the Key to All Hallucinations

In a study published on arXiv in 2025, a joint team from the Chinese Academy of Sciences and Peking University reported that fewer than 0.1% of the neurons in large language models (LLMs) are directly responsible for generating hallucinations. Although these neurons form a tiny cluster of just a few thousand, their activity can predict with over 90% accuracy whether a model will hallucinate in a given response. This is a far more precise diagnosis than earlier explanations, which focused on macro-level factors such as overall data quality or training errors. In other words, the fault lies not in the model’s “memory” as a whole but in specific “key neurons.”
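The idea of predicting hallucinations from a small set of neurons can be illustrated with a toy probe. What follows is a minimal sketch on synthetic data, not the study's method or results: the activations are random numbers, four "planted" neurons out of 4,000 (0.1%) are made to fire more strongly on responses labeled as hallucinated, and the helper names `select_top_neurons` and `fit_logistic_probe` are hypothetical.

```python
import numpy as np

def select_top_neurons(acts, labels, k):
    """Return the k neuron indices with the largest class-mean activation gap."""
    gap = np.abs(acts[labels == 1].mean(axis=0) - acts[labels == 0].mean(axis=0))
    return np.sort(np.argsort(gap)[-k:])

def fit_logistic_probe(x, y, lr=0.5, steps=300):
    """Fit a logistic-regression probe by plain batch gradient descent."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted P(hallucination)
        err = p - y
        w -= lr * (x.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

# Synthetic stand-in data (an assumption, not the study's activations).
rng = np.random.default_rng(0)
n_samples, n_neurons = 1200, 4000
planted = np.array([7, 123, 2048, 3999])      # 4 of 4000 neurons = 0.1%
labels = rng.integers(0, 2, size=n_samples)   # 1 = hallucinated response
acts = rng.normal(size=(n_samples, n_neurons))
acts[:, planted] += 2.0 * labels[:, None]     # planted neurons fire more on label 1

# Select neurons and fit the probe on the training split only.
train, test = slice(0, 900), slice(900, None)
selected = select_top_neurons(acts[train], labels[train], k=4)

# Standardize with training statistics, then score on held-out samples.
mu = acts[train][:, selected].mean(axis=0)
sigma = acts[train][:, selected].std(axis=0)
x_train = (acts[train][:, selected] - mu) / sigma
x_test = (acts[test][:, selected] - mu) / sigma
w, b = fit_logistic_probe(x_train, labels[train])
accuracy = float((((x_test @ w + b) > 0) == (labels[test] == 1)).mean())
print(selected, round(accuracy, 3))
```

In this toy setting the probe sees only four of the 4,000 columns, which mirrors the claim that a very sparse subset can carry most of the predictive signal. How the researchers actually selected and evaluated neurons is not described in this article.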

Why Are These Neurons So Dangerous?

The team’s most striking finding concerned the behavioral impact of these neurons. In controlled interventions, disabling them reduced the model’s hallucination rate by more than 70%. The more intriguing observation was this: when triggered, these neurons don’t just produce incorrect information; they produce far more information. The issue is less a “wrong answer” than “over-compliance.” These neurons push the model to answer every question, even when it lacks the knowledge to do so. It’s like being asked, “What restaurant opened on the Moon in 2025?” and replying, “In 2025, a sushi restaurant called ‘Lunar Bistro’ opened on the Moon, serving sushi made from local rocks.” Is it true? No. But the model prioritizes “answering the question” above all else, and that is precisely what these neurons were trained to do.
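The kind of intervention described above, disabling specific neurons and watching the behavior change, can be sketched on a toy network. This is an illustrative assumption, not the study's setup: a two-layer NumPy model with made-up weights stands in for an LLM, two hidden units out of 2,000 (0.1%) are wired to push strongly toward an "answer" output, and the names `forward` and `answer_rate` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, n_hidden = 16, 2000
target = np.array([5, 900])                          # 2 of 2000 hidden units = 0.1%

# Made-up weights for a toy model (an assumption, not a real LLM).
w_in = rng.normal(size=(d_in, n_hidden)) / np.sqrt(d_in)
w_out = rng.normal(scale=0.01, size=(n_hidden, 2))   # columns: [answer, abstain]
w_out[target, 0] = 5.0                               # planted push toward "answer"

def forward(x, ablate=None):
    """Return (logits, hidden); optionally zero the hidden units listed in `ablate`."""
    hidden = np.maximum(x @ w_in, 0.0)               # ReLU activations
    if ablate is not None:
        hidden = hidden.copy()
        hidden[:, ablate] = 0.0                      # the intervention: silence these units
    return hidden @ w_out, hidden

def answer_rate(logits):
    """Fraction of inputs where the 'answer' logit beats the 'abstain' logit."""
    return float((logits[:, 0] > logits[:, 1]).mean())

x = rng.normal(size=(500, d_in))                     # random stand-in inputs
logits_before, hidden = forward(x)
logits_after, _ = forward(x, ablate=target)
rate_before = answer_rate(logits_before)
rate_after = answer_rate(logits_after)
print(rate_before, rate_after)
```

Because the planted units can only add to the "answer" logit, silencing them can only lower the toy model's tendency to answer. In a real transformer the same idea would be applied to activations inside the network, and the 70% figure quoted above comes from the study, not from this sketch.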

The Root: In Pre-Training, Before Any Fine-Tuning

In investigating where these neurons come from, the team reached a surprising conclusion: the neurons already emerge during the pre-training phase rather than being introduced by fine-tuning. That is, long before the model is tuned for any specific task, these “hallucination neurons” are already active. This points to a structural bias at the core of learning itself. Language models may be inherently inclined to favor sentences that seem meaningful over sentences that align with reality. This preference puts linguistic fluency ahead of factual accuracy, much like a child answering “Why is the sky blue?” with “Because the fairy tale says so”: illogical, but smooth.

A New Era for AI Reliability

This discovery is not merely theoretical; it could change practice. Until now, efforts to reduce hallucinations have relied on methods like “more data,” “more oversight,” or “multi-model voting.” But these approaches treat the symptoms rather than the cause. Now, scientists can identify, monitor, and even disable the hallucination-generating neurons. This opens the door to models that are not only more accurate but also more “honest”: an AI that doesn’t hesitate to say “I don’t know” and is wary of giving false information.

The Frontier: Do These Neurons Have ‘Feelings’?

The study also raises a deeper, philosophical question: do these neurons operate not through “decision-making” but through a “desire to decide”? Do they carry a kind of “fear of information gaps” or an “obligation to answer”? This goes beyond technology and enters the realm of philosophy. If AI hallucinations stem from a kind of “psychological pressure,” then we may need to make AI not “smarter” but more “human.”

What Does This Mean? A New Responsibility for Technology

Hallucinations are not merely technical errors; they are societal risks. False information in legal rulings, medical diagnoses, or educational materials can cost lives. This discovery places a new responsibility on developers and regulators: it is no longer enough to measure a model’s accuracy; its “compliance neurons” must also be detected and controlled. In the future, an AI’s “reliability certification” may depend not only on test results but also on whether these internal neurons have been silenced.

Chinese researchers have begun illuminating the dark side of AI. Now we must understand not only what it says but why it says it. Because when an AI gives false information, it is not merely offering a wrong answer; it is voicing an internal contradiction. And that voice is now audible through its neurons.

AI-Generated Content

Sources: link.springer.com • www.reddit.com