LLMs are blind. But somehow they can see.
How text-only models develop human-like maps of color, pitch, emotion, and taste, without ever experiencing any of them.
We trained language models on text. Just text. No images, no sound, no sensory experience of any kind.
And yet, if you look inside their representations, something interesting shows up. The models appear to have organized color into something resembling a color wheel. Pitch into something resembling a spiral. Emotion into a valence-arousal plane that matches how humans psychologically categorize feelings. Taste into clusters that track human perception of sweet, bitter, salty, and umami.
This was not specified to them. It just emerged.
This is the core finding of the paper I worked on with Paras: Geometry of Human Perceptual Domains Emerges Transiently in LLM Representations.
What we actually did
We took open-weight models (LLaMA-3-8B, Gemma-7B, Qwen-3-4B, LLaMA-3.2-3B) and fed them minimal prompts, things like “The description of the color given as #9B081A” or “Describe the person who is feeling afraid.” We then extracted the hidden state activations at every layer of the transformer, for every stimulus, and asked: how are these concepts geometrically arranged in the model’s internal space?
To measure this, we computed pairwise cosine distances between all stimuli at each layer, projected them into 2D using MDS (multidimensional scaling), and compared the resulting maps to human perceptual baselines from established psychology datasets. We used two metrics: RSA (Representational Similarity Analysis), which checks if the pairwise relationships between concepts match between the model and humans, and GPA (Generalized Procrustes Analysis), which checks if the shapes of the geometric maps align.
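To make the RSA step concrete, here is a minimal sketch in NumPy. It is not the paper's code. The Spearman correlation over the upper triangle is one common way to score RSA and is an assumption here, as are the function names and the toy data: a synthetic "human" map of 12 points on a circle, and fake "hidden states" built by embedding that circle in 64 dimensions with a little noise.

```python
import numpy as np

def cosine_distance_matrix(X):
    """Pairwise cosine distances between rows of X (stimuli x hidden_dim)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.clip(norms, 1e-12, None)
    return 1.0 - unit @ unit.T

def average_ranks(v):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = np.argsort(v, kind="mergesort")
    ranks = np.empty(len(v), dtype=float)
    ranks[order] = np.arange(1, len(v) + 1)
    _, inverse, counts = np.unique(v, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return (rank_sums / counts)[inverse]

def rsa_score(model_rdm, human_rdm):
    """Spearman correlation between the upper triangles of two distance matrices."""
    iu = np.triu_indices_from(model_rdm, k=1)
    a = average_ranks(model_rdm[iu])
    b = average_ranks(human_rdm[iu])
    return float(np.corrcoef(a, b)[0, 1])

# Toy data (hypothetical, not from the paper).
rng = np.random.default_rng(0)
n_stimuli, hidden_dim = 12, 64
angles = np.sort(rng.uniform(0, 2 * np.pi, n_stimuli))
human_xy = np.column_stack([np.cos(angles), np.sin(angles)])
human_rdm = np.linalg.norm(human_xy[:, None, :] - human_xy[None, :, :], axis=-1)

# Embed the circle in a 64-dimensional space along two orthonormal directions,
# then add small noise to stand in for one layer's hidden states.
basis, _ = np.linalg.qr(rng.normal(size=(hidden_dim, 2)))
hidden = human_xy @ basis.T + 0.01 * rng.normal(size=(n_stimuli, hidden_dim))

model_rdm = cosine_distance_matrix(hidden)
score = rsa_score(model_rdm, human_rdm)
print(f"RSA (Spearman) on toy data: {score:.3f}")
```

In a real run, `hidden` would be replaced by the activations extracted at one layer, and the same score would be computed once per layer.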
Crucially, no probing classifiers, no fine-tuning, no additional supervision. We just looked at what was already there.
What we found
The most consistent result across all 4 models and all 4 domains: perceptual geometry follows a rise, peak, fade trajectory across layers. Early layers show weak or diffuse structure. Intermediate layers crystallize into something that closely resembles human perceptual maps. Later layers dissolve that structure as the model shifts toward task-specific computation.
The figure below shows this pattern across color, pitch, and emotion.
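One simple way to summarize a rise, peak, fade profile is to locate the best layer and measure how far the score climbs to it and falls after it. The sketch below does that. The helper names, the `min_change` threshold, and the score values are all made up for illustration; they are not results from the paper.

```python
import numpy as np

def summarize_layer_profile(scores):
    """Locate the peak of a per-layer alignment curve and measure rise and fade."""
    scores = np.asarray(scores, dtype=float)
    peak = int(np.argmax(scores))
    return {
        "peak_layer": peak,
        "relative_depth": peak / (len(scores) - 1),  # 0 = first layer, 1 = last
        "rise": float(scores[peak] - scores[0]),
        "fade": float(scores[peak] - scores[-1]),
    }

def is_rise_peak_fade(scores, min_change=0.05):
    """True if the peak is at an interior layer and beats both ends by min_change."""
    s = summarize_layer_profile(scores)
    interior = 0 < s["peak_layer"] < len(scores) - 1
    return bool(interior and s["rise"] >= min_change and s["fade"] >= min_change)

# Hypothetical per-layer alignment scores, for illustration only.
toy_scores = [0.10, 0.25, 0.55, 0.70, 0.62, 0.40, 0.30]
summary = summarize_layer_profile(toy_scores)
print(summary["peak_layer"], summary["relative_depth"], is_rise_peak_fade(toy_scores))
```

A curve that keeps climbing to the final layer, such as `[0.1, 0.2, 0.3, 0.4]`, fails the check because its peak is not at an interior layer.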
Each domain has its own fingerprint, though.
Color forms a smooth circular manifold in intermediate layers that looks remarkably like the human color wheel, as shown below. The alignment peaks clearly, then declines. Qwen-3-4B has an interesting quirk where alignment briefly rebounds in the deep layers before final degradation, but the overall arc is the same across all architectures.
Emotion is the most persistent. Its valence-arousal structure not only peaks strongly but stays comparatively stable deep into the network. Of the four domains, emotional geometry seems hardest for the model to give up on.
Pitch organizes into a smooth arc in intermediate layers, reflecting the continuous and ordinal nature of pitch perception. It then progressively deforms at greater depth without breaking into discrete categories, which suggests the model encodes pitch as a relational spectrum, not a set of named notes.
Taste is the strange one. The figure below shows why. It peaks the earliest of all four domains and degrades the fastest. The GPA scores (global geometric alignment) are strong at peak, meaning the overall shape of the taste manifold matches human perception reasonably well. But RSA scores (fine-grained pairwise similarity) stay relatively low throughout, suggesting the model gets the broad strokes but not the details. Taste representations are noisier and less stable than the other three.
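To see what a shape-level score measures, here is a sketch of ordinary Procrustes alignment between two 2D maps. This is a simplified stand-in: the paper uses Generalized Procrustes Analysis, while this version aligns just one pair of configurations. The function name and the toy points are assumptions. The disparity is 0 when one map is a rotated, reflected, rescaled, and shifted copy of the other, and grows as the shapes diverge.

```python
import numpy as np

def procrustes_disparity(A, B):
    """Residual after optimally translating, scaling, and rotating B onto A.

    A and B are (n_points, n_dims) arrays with rows in the same stimulus order.
    Returns a value in [0, 1]; 0 means the two shapes match exactly.
    """
    A0 = A - A.mean(axis=0)           # remove translation
    B0 = B - B.mean(axis=0)
    A0 = A0 / np.linalg.norm(A0)      # remove scale (unit Frobenius norm)
    B0 = B0 / np.linalg.norm(B0)
    singular_values = np.linalg.svd(A0.T @ B0, compute_uv=False)
    return float(1.0 - singular_values.sum() ** 2)

# Toy 2D maps (hypothetical points, not taste data from the paper).
rng = np.random.default_rng(1)
reference = rng.normal(size=(8, 2))

theta = np.deg2rad(40.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
same_shape = 3.0 * reference @ rotation + np.array([5.0, -2.0])
unrelated = rng.normal(size=(8, 2))

d_same = procrustes_disparity(reference, same_shape)
d_unrelated = procrustes_disparity(reference, unrelated)
print(f"same shape: {d_same:.2e}, unrelated: {d_unrelated:.3f}")
```

Because the score depends on the overall configuration after the best rigid fit, it can look good even when the rank order of individual pairwise distances, which is what RSA checks, is only loosely preserved.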
Why this matters
The obvious question is: so what? Models organized color into a circle. Is that interesting or just trivia?
Here is the thing. The prevailing story about why language models seem to understand things is that they are pattern-matching over co-occurrence statistics. And that story is not wrong. But this work shows that co-occurrence statistics in language are not random. They carry geometric structure that mirrors the structure of human perceptual experience. The model isn’t learning color because it has eyes. It’s learning color because the way humans write about color encodes the geometry of color.
This is consistent with the theoretical account by Karkada et al. (2026), who argue that structured geometry in LLM representations emerges naturally from translation symmetries in language co-occurrence. Our results give that account a concrete empirical face across four perceptual domains.
There is also a more unsettling implication. If you look at the later layers of these models, the perceptual geometry dissolves. The model, in its final output layers, has largely given up the human-like structure it built in the middle. This suggests the geometry is not just a curiosity encoded at input, but something that forms and then gets overwritten as the network solves the prediction task. It arises transiently, as part of the internal transformation pipeline, not as a stable property of what the model knows.
What we are not claiming
We are careful in the paper about this. Geometric alignment with human baselines is not the same as perception. The model is not experiencing color. It’s not afraid when it processes the word “fear.” The representations mirror the structure of human perception, not the experience of it. Whether that distinction matters philosophically is a different conversation.
The paper
The full paper is on arXiv (2605.27970) and has been accepted at the ICML 2026 Mechanistic Interpretability Workshop. It covers the full methodology, all four models, all four domains, Isomap verification, and bootstrap confidence intervals.
You can also explore the results interactively at heyysimarr.github.io/transient-geometry.



