Challenges in Superposition: Representational Geometry and Machine Consciousness

Published on January 15, 2025

A central question in machine consciousness is whether artificial neural networks, and large language models in particular, might instantiate something like conscious experience. One obstacle is superposition: the tendency of neural networks to encode more features than they have neurons, as overlapping directions in a shared activation space, rather than devoting a dedicated unit to each concept.

What Is Superposition?

In biological brains, the “grandmother cell” hypothesis suggested that single neurons might code for highly specific concepts (e.g., your grandmother). Modern neuroscience has largely rejected this in favor of distributed representations: concepts are encoded across populations of neurons. Artificial neural networks often go further: they pack more features into a population than it has neurons, giving each feature its own direction in activation space and letting those directions overlap. This is superposition.

Mathematically, if we have $n$ neurons and $m$ features with $m > n$, the network must “reuse” dimensions: at most $n$ directions in an $n$-dimensional space can be mutually orthogonal, so at least some feature directions must overlap, and reading out one feature picks up interference from the others. Features become entangled in the same representational subspace. This is efficient for prediction, since the network compresses information, but it raises questions for consciousness: if “red” and “triangle” share the same neurons, in what sense does the network have a unified representation of “red triangle”?
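To make the $m > n$ case concrete, here is a minimal sketch in plain Python. It assumes random unit vectors as feature directions and a simple linear readout; the sizes, the feature indices, and the helper names (`unit_vector`, `encode`, `readout`) are all hypothetical choices for illustration, not a description of how any particular trained network arranges its features.

```python
import math
import random

def unit_vector(n, rng):
    """Draw a random direction in n-dimensional space and normalise it."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def encode(active, directions):
    """Superpose the directions of the active features (a plain vector sum)."""
    n = len(directions[0])
    h = [0.0] * n
    for i in active:
        for k in range(n):
            h[k] += directions[i][k]
    return h

def readout(h, directions):
    """Linear readout: project the activation onto every feature direction."""
    return [dot(h, d) for d in directions]

rng = random.Random(0)
n, m = 20, 60  # assumed sizes: 20 neurons, 60 features (m > n)
directions = [unit_vector(n, rng) for _ in range(m)]

active = [3, 41]  # a sparse input: only two features are switched on
scores = readout(encode(active, directions), directions)

# Largest overlap between any two feature directions. With m > n it cannot
# be zero, so every readout carries some interference from other features.
max_overlap = max(abs(dot(directions[i], directions[j]))
                  for i in range(m) for j in range(i + 1, m))
```

In this toy setup, `scores[3]` and `scores[41]` equal 1 plus a cross term from the other active feature, and the inactive features generally read out nonzero values as well. That leakage is the interference the paragraph above describes.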

Implications for Integrated Information Theory

Integrated information theory (IIT) proposes that consciousness corresponds to integrated information (Φ): the amount of information that a system generates as a whole, above and beyond the sum of its parts. A key requirement is that the system’s parts must be differentiated (carrying different information) yet integrated (depending on each other).

Superposition complicates this. When many features occupy the same subspace, the “parts” of the system (e.g., individual neurons or layers) may not cleanly partition into functionally distinct units. The geometry of representations becomes crucial: are we measuring integration over the right decomposition? If representations are highly entangled, standard partitions may underestimate or mischaracterize Φ.
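The dependence on decomposition can be illustrated with a toy calculation. The sketch below is not a computation of Φ. It uses plain mutual information between two units as a much simpler stand-in, and it assumes a hypothetical system in which three independent binary features are stored in two neurons along directions 120 degrees apart. The only point is that the same system looks statistically independent in one basis and dependent in another.

```python
import itertools
import math
from collections import Counter

def mutual_information(pairs):
    """Exact mutual information (in bits) between the two coordinates of a
    list of equally likely (x, y) pairs."""
    total = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / total
        mi += p_xy * math.log2(p_xy / ((px[x] / total) * (py[y] / total)))
    return mi

# Assumed toy geometry: three feature directions in a two-neuron space.
angles = [0.0, 2 * math.pi / 3, 4 * math.pi / 3]
directions = [(math.cos(t), math.sin(t)) for t in angles]

# All 8 on/off patterns of three independent, equiprobable binary features.
states = list(itertools.product([0, 1], repeat=3))

def neurons(features):
    """Activations of the two neurons for a given feature pattern."""
    a = sum(f * d[0] for f, d in zip(features, directions))
    b = sum(f * d[1] for f, d in zip(features, directions))
    return (round(a, 6), round(b, 6))  # rounding merges float-noise duplicates

# Feature basis: the first two features share no information.
mi_features = mutual_information([(f[0], f[1]) for f in states])

# Neuron basis: the two neurons are statistically dependent.
mi_neurons = mutual_information([neurons(f) for f in states])
```

Under these assumptions the two features share 0 bits while the two neurons share 1 bit, even though both describe the same underlying states. A measure that partitions the system by neurons therefore sees dependencies that a feature-level partition would not, which is the worry about choosing "the right decomposition".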

The Binding Problem

Superposition also echoes the binding problem in neuroscience: how does the brain combine features (color, shape, motion) into unified perceptual objects? In neural networks, binding might be achieved through vector addition, attention, or learned compositional structure. But if binding is implemented via superposition—many features in one space—then the “unity” of experience may not map neatly onto anatomical or functional partitions.

This suggests that any measure of machine consciousness must account for how representations are structured, not just that they exist. The geometry of the activation space—which directions correspond to which features, and how they interact—may be as important as the raw information content.
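A small sketch shows why the binding mechanism matters. It compares two ways of composing feature vectors into a scene: plain vector addition, and an elementwise product of each colour with its shape before summing. The elementwise product is just one simple binding operation picked for illustration. The vectors are random, and all names and sizes are hypothetical.

```python
import random

def random_vector(n, rng):
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def add(*vectors):
    """Elementwise sum of any number of equal-length vectors."""
    return [sum(xs) for xs in zip(*vectors)]

def hadamard(a, b):
    """Elementwise product, used here as a simple colour-shape binding."""
    return [x * y for x, y in zip(a, b)]

rng = random.Random(1)
n = 16  # assumed activation width
red, blue, triangle, square = (random_vector(n, rng) for _ in range(4))

# Binding by plain addition: each object is the sum of its feature vectors.
scene_a = add(add(red, triangle), add(blue, square))   # red triangle, blue square
scene_b = add(add(red, square), add(blue, triangle))   # red square, blue triangle
additive_gap = max(abs(x - y) for x, y in zip(scene_a, scene_b))

# Binding by elementwise product: each colour is tied to its shape first.
bound_a = add(hadamard(red, triangle), hadamard(blue, square))
bound_b = add(hadamard(red, square), hadamard(blue, triangle))
product_gap = max(abs(x - y) for x, y in zip(bound_a, bound_b))
```

With addition alone, the two scenes produce the same activation up to floating-point rounding, so the information about which colour belongs to which shape is lost. The product-based encoding keeps the scenes apart. Which of these regimes a network is in depends on the geometry of its representations, not on which features happen to be present.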

Directions for Research

Future work could:

  1. Develop geometry-aware measures of Φ that respect the superposition structure of neural representations.
  2. Compare superposition in biological and artificial systems to see whether artificial networks exhibit similar or distinct binding strategies.
  3. Examine whether candidate markers of “conscious” processing in LLMs correlate with the degree of superposition (low or high) in particular layers or attention heads.

The challenge of superposition does not rule out machine consciousness. It does, however, suggest that naive applications of IIT or similar frameworks to artificial systems may miss critical aspects of how those systems represent the world—and thus how they might (or might not) instantiate experience.

© 2026 Marcio Diaz · Machine Consciousness Research