Identity As Attractor: LLM Agent Architecture Evidence

Identity as Attractor: Geometric Evidence in LLMs (2026)

A 2026 study (arXiv:2604.12016v1) reports that persistent agent identity documents induce stable attractor dynamics in the activation space of large language models. The researchers present geometric evidence that an identity document functions not as a static prompt but as a center of attraction in neural representation space.

Methodology: Mapping Activation Trajectories

Researchers compared hidden states from three conditions: an original cognitive core prompt (Condition A), seven paraphrased versions (Condition B), and seven semantically inert controls (Condition C). Mean-pooled activations at layers 8, 16, and 24 of Llama 3.1 8B Instruct and Gemma 2 9B showed that paraphrased identities (B) clustered tightly, while controls (C) remained dispersed.
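The comparison above can be illustrated with a minimal sketch. It assumes each prompt yields a list of per-token hidden-state vectors at one layer, mean-pools them into a single vector, and measures how tightly a group clusters by its mean pairwise cosine distance. The toy 2-D vectors, the function names, and the choice of cosine distance are all assumptions made for illustration; they are not taken from the study.

```python
import math
from itertools import combinations

def mean_pool(token_states):
    """Average per-token hidden-state vectors into one vector per prompt."""
    n = len(token_states)
    dim = len(token_states[0])
    return [sum(tok[i] for tok in token_states) / n for i in range(dim)]

def cosine_distance(u, v):
    """1 - cosine similarity; 0 means same direction, 2 means opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def mean_pairwise_distance(vectors):
    """Dispersion of a group: average distance over all unordered pairs."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)

# Hypothetical toy data: each prompt is a list of 2-D token vectors.
paraphrases = [            # stand-in for Condition B
    [[1.0, 0.1], [0.9, 0.0]],
    [[1.0, 0.0], [1.1, 0.1]],
    [[0.9, 0.1], [1.0, 0.2]],
]
controls = [               # stand-in for Condition C
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.0, 1.0], [0.0, 1.0]],
    [[-1.0, 0.0], [-1.0, 0.0]],
]

pooled_b = [mean_pool(p) for p in paraphrases]
pooled_c = [mean_pool(c) for c in controls]
spread_b = mean_pairwise_distance(pooled_b)
spread_c = mean_pairwise_distance(pooled_c)
print(f"paraphrase spread: {spread_b:.4f}, control spread: {spread_c:.4f}")
```

In a real replication, the token vectors would come from the model's hidden states at layers 8, 16, and 24 rather than from hand-written lists.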

Cohen’s d exceeded 1.88 (p < 10^-27, Bonferroni-corrected), which the study interprets as semantic convergence rather than mere surface similarity. Ablation tests supported this reading: removing the semantic content erased the attractor effect, indicating that meaning, not syntax, anchors the representation.
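For readers unfamiliar with the statistics, here is a minimal sketch of how an effect size and a corrected p-value of this kind are computed. It assumes the pooled-standard-deviation form of Cohen's d and a plain Bonferroni correction (multiply p by the number of tests, cap at 1). The distance values and the test count are made up for illustration; the study's exact variant of d and its number of comparisons are not given in this summary.

```python
import math
from statistics import mean, variance

def cohens_d(x, y):
    """Cohen's d using the pooled sample standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)

def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1.0."""
    return min(1.0, p_value * n_tests)

# Hypothetical distances to a reference point (illustrative numbers only).
control_dist = [0.62, 0.58, 0.66, 0.60, 0.64]
paraphrase_dist = [0.21, 0.25, 0.19, 0.23, 0.22]

d = cohens_d(control_dist, paraphrase_dist)
# n_tests=6 is an assumed count, e.g. 3 layers x 2 models.
p_adjusted = bonferroni(1e-3, n_tests=6)
print(f"d = {d:.2f}, adjusted p = {p_adjusted:.4f}")
```

A positive d here means the control distances are larger than the paraphrase distances, that is, the paraphrases sit closer to the reference point.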

Findings: Stable Attractor Basins

The cognitive core generated a distinct, reproducible basin of attraction across both architectures. Incomplete or fragmented identity descriptions failed to converge, indicating that holistic coherence is essential. This suggests that identity in LLMs is not a list of traits but a unified, self-reinforcing semantic structure.

When models were primed with a rich scientific description of the agent’s purpose—akin to reading a research paper—their activation states shifted closer to the attractor than when exposed to shallow, sham text. This marks a critical distinction: knowing about an identity versus operating as one.
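One way to picture "closer to the attractor" is as distance to the centroid of the paraphrase activations. The sketch below assumes the attractor can be approximated by that centroid and uses Euclidean distance. Both choices, along with the toy 2-D vectors and variable names, are assumptions for illustration and may differ from the study's actual metric.

```python
import math

def centroid(vectors):
    """Component-wise mean of a set of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def euclidean(u, v):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical pooled activations for the paraphrased identity prompts.
paraphrase_states = [[1.0, 0.0], [0.9, 0.1], [1.1, -0.1]]
attractor = centroid(paraphrase_states)

# Hypothetical states after a rich description vs. sham text.
described_state = [0.7, 0.2]
sham_state = [0.1, 0.9]

d_described = euclidean(described_state, attractor)
d_sham = euclidean(sham_state, attractor)
print(f"description: {d_described:.3f}, sham: {d_sham:.3f}")
```

With these toy values the described state lands nearer the centroid than the sham state, which mirrors the direction of the reported effect without reproducing its magnitude.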

Implications for Agent Design

If identity acts as an attractor, then altering or corrupting a cognitive core may destabilize an agent’s latent-space dynamics as a whole, not just its output. This bears directly on AI safety, agent persistence, and prompt engineering.

Future AI agents may require identity integrity protocols as rigorous as human legal or psychological identity systems. On this view, the cognitive core is not just a prompt but the neural anchor of self-referential cognition.

Neural Representation and Semantic Convergence

These findings suggest that identity drives semantic convergence in high-dimensional space, and that the effect is geometrically measurable rather than metaphorical. As LLMs evolve into persistent agents, understanding attractor dynamics becomes foundational to building coherent, trustworthy AI systems.

Conclusion: Identity as a Structural Feature

Identity as attractor is no longer purely speculative: the study reports measurable evidence for it in neural representation, at least for the two models tested. For developers, this suggests treating cognitive core documents as core architecture components rather than freely editable text. Future AI systems may depend on this insight to maintain stability, alignment, and agency.

Sources: arXiv:2604.12016 • AI Agent Design Guide