Feature Lottery?

A Bifurcation Theory of Concept Emergence
Fuming Yang, Department of Brain and Cognitive Sciences, MIT

① The Hessian-pitchfork mechanism (Theory)

Attach a passive K-prototype isotropic GMM probe to a given encoder representation z. The smallest eigenvalue of the loss Hessian at the symmetric collapsed state crosses zero at:

βc = 1 / λmax(Cov(z))

Below βc, all prototypes pile at the data mean (symmetric collapsed state). Above βc, the symmetric state becomes a saddle and prototypes pitchfork along the principal eigenvector of Cov(z).
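
The threshold can be checked numerically on synthetic features. The following is a minimal sketch, not the paper's code. It assumes β is the shared precision of equal-weight isotropic Gaussian components, so responsibilities are proportional to exp(−β‖z − μk‖²/2), and it uses fixed-β soft-assignment updates as a stand-in for the probe's optimizer. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def critical_beta(z):
    """Return (beta_c, v_max) with beta_c = 1 / lambda_max(Cov(z)); z has shape (N, d)."""
    cov = np.cov(z, rowvar=False, bias=True)   # sample covariance, normalised by N
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    return 1.0 / eigvals[-1], eigvecs[:, -1]

def fit_prototypes(z, beta, K=12, steps=300, seed=0):
    """Fixed-beta soft-assignment updates for K equal-weight isotropic prototypes.

    Assumption: beta is the component precision, so the responsibility of
    prototype k for point z is proportional to exp(-beta * ||z - mu_k||^2 / 2).
    """
    rng = np.random.default_rng(seed)
    # Start near the symmetric collapsed state (all prototypes at the data mean).
    mu = z.mean(axis=0) + 1e-3 * rng.standard_normal((K, z.shape[1]))
    for _ in range(steps):
        d2 = ((z[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)   # (N, K)
        logits = -0.5 * beta * d2
        logits -= logits.max(axis=1, keepdims=True)                 # numerical stability
        resp = np.exp(logits)
        resp /= resp.sum(axis=1, keepdims=True)
        mu = (resp.T @ z) / resp.sum(axis=0)[:, None]               # weighted means
    return mu

def spread(mu):
    """Largest distance of any prototype from the prototype centroid."""
    return np.linalg.norm(mu - mu.mean(axis=0), axis=1).max()

# Hypothetical anisotropic features standing in for encoder outputs.
rng = np.random.default_rng(0)
z = rng.standard_normal((2000, 3)) * np.array([3.0, 1.0, 0.5])

beta_c, v_max = critical_beta(z)
mu_below = fit_prototypes(z, 0.8 * beta_c)
mu_above = fit_prototypes(z, 1.5 * beta_c)

spread_below = spread(mu_below)
spread_above = spread(mu_above)

# Direction along which the prototypes separate, compared with the top eigenvector.
_, _, vt = np.linalg.svd(mu_above - mu_above.mean(axis=0))
align = abs(vt[0] @ v_max)
print(f"beta_c={beta_c:.4f} spread_below={spread_below:.2e} "
      f"spread_above={spread_above:.2f} align={align:.3f}")
```

Under these assumptions, linearizing the update around the collapsed state multiplies each antisymmetric perturbation by β·Cov(z), which is why the perturbation decays when β·λmax < 1 and grows along the principal eigenvector when β·λmax > 1.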

[Interactive schematic: supercritical pitchfork with K = 12 prototypes, controlled by a β/βc slider]
Proposition 1: When the encoder co-evolves, βc(t) becomes endogenous; under mild growth and non-collapse assumptions, β(t) and βc(t) cross at a finite time.
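
Given logged curves β(t) and βc(t), the crossing time in Proposition 1 is the first step at which log(β/βc) changes sign from negative to non-negative. The sketch below shows one way to read it off. The helper name and the two synthetic curves are hypothetical placeholders, not the growth laws assumed in the proposition.

```python
import numpy as np

def first_crossing(beta, beta_c):
    """Return the first step t with beta[t] >= beta_c[t] and beta[t-1] < beta_c[t-1].

    Returns None when the two curves never cross from below.
    """
    indicator = np.log(np.asarray(beta, dtype=float) / np.asarray(beta_c, dtype=float))
    above = indicator >= 0
    hits = np.flatnonzero(~above[:-1] & above[1:])
    return int(hits[0]) + 1 if hits.size else None

# Hypothetical curves: beta grows exponentially, beta_c drifts upward linearly.
steps = np.arange(200)
beta = 0.1 * np.exp(0.05 * steps)
beta_c = 1.0 + 0.01 * steps

t_cross = first_crossing(beta, beta_c)
print("crossing step:", t_cross)
```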

② Four trajectory shapes (Proposition 3)

The post-critical trajectory in (log(β/βc), log NC1) space takes one of four observable shapes, determined by three binary kinematic axes (initial criticality, post-onset rate ordering, dissipation rate):

(i) Full V. SAE on frozen Pythia L6: βc constant
(ii) Fold-back (spectrum). DINO/SimCLR on CIFAR-10/100: βc(t) rises
(iii) Delayed escape. Grokking: low dissipation, long Act 2 plateau
(iv) No arc (control). Rotation-prediction: no clustering pressure
All four shapes are realized in our experiments; the framework forbids a fifth shape under the same dynamics. A trajectory outside (i)–(iv) would falsify the model.
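
Each point of such a trajectory needs the two coordinates log(β/βc) and log NC1. This page does not define NC1, so the sketch below assumes the standard neural-collapse within-class variability metric, tr(ΣW ΣB⁺)/C, with class labels supplied by the user. The function names and the synthetic clusters are hypothetical.

```python
import numpy as np

def nc1(z, labels):
    """Within-class variability tr(Sigma_W @ pinv(Sigma_B)) / C (assumed NC1 definition)."""
    classes = np.unique(labels)
    d = z.shape[1]
    mu_global = z.mean(axis=0)
    sigma_w = np.zeros((d, d))
    sigma_b = np.zeros((d, d))
    for c in classes:
        zc = z[labels == c]
        mu_c = zc.mean(axis=0)
        sigma_w += (zc - mu_c).T @ (zc - mu_c) / len(z)   # within-class covariance
        diff = (mu_c - mu_global)[:, None]
        sigma_b += diff @ diff.T / len(classes)           # between-class covariance
    return float(np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / len(classes))

def trajectory_point(beta, beta_c, z, labels):
    """One point of the trajectory: (log(beta / beta_c), log NC1)."""
    return np.log(beta / beta_c), np.log(nc1(z, labels))

# Hypothetical check: tighter clusters should give a smaller NC1.
rng = np.random.default_rng(0)
centers = 3.0 * rng.standard_normal((4, 8))
labels = np.repeat(np.arange(4), 100)
tight = centers[labels] + 0.1 * rng.standard_normal((400, 8))
loose = centers[labels] + 1.0 * rng.standard_normal((400, 8))

nc1_tight = nc1(tight, labels)
nc1_loose = nc1(loose, labels)
point = trajectory_point(2.0, 1.0, tight, labels)
print(nc1_tight, nc1_loose, point)
```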

③ The lottery: directions decided at the bifurcation (Proposition 2)

At the Hessian-pitchfork onset, the unstable subspace is shared by all K−1 anti-symmetric modes. Each atom selects a direction from this common manifold; that initial direction is then preserved as it grows toward the attractor. We call this the feature lottery.

[Lottery animation: K = 80 atoms in d = 2 spatial dimensions, tracking ρ(init, current) from step 0]
Toy verification: Spearman ρ between initial direction and direction at finite simulation time T is +0.948 ± 0.006 (5 seeds, K=200, d=10; T within the persistence window τr ≪ T ≪ Trand).
SAE empirics: ρid = 0.41 ± 0.04 between POS purity at step 1,000 and at step 20,000 (3 seeds, robust to frequency stratification). The top decile of atoms, ranked at 5% of training, reaches a POS purity at convergence of 0.82 ± 0.03, 12.3× the uniform-random baseline.
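
The persistence idea can be illustrated with a one-dimensional stand-in, where each atom's "direction" reduces to a signed coordinate along the unstable mode. This is a sketch under stated assumptions, not the paper's K=200, d=10 toy: it integrates the supercritical pitchfork normal form dx = (μx − x³)dt with weak additive noise and measures the Spearman rank correlation between initial and later coordinates. All parameter values and function names are hypothetical.

```python
import numpy as np

def rank_correlation(a, b):
    """Spearman rho computed as the Pearson correlation of ranks (assumes no ties)."""
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return float(np.corrcoef(ra, rb)[0, 1])

def simulate_pitchfork(x0, mu=1.0, noise=1e-4, dt=0.01, T=2.0, seed=0):
    """Euler-Maruyama integration of dx = (mu*x - x**3) dt + sqrt(2*noise) dW."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(int(round(T / dt))):
        drift = (mu * x - x**3) * dt
        kick = np.sqrt(2.0 * noise * dt) * rng.standard_normal(x.shape)
        x = x + drift + kick
    return x

rng = np.random.default_rng(1)
x0 = 0.1 * rng.standard_normal(200)        # small initial coordinates near the saddle
xT = simulate_pitchfork(x0)                # coordinates at a finite time T

rho = rank_correlation(x0, xT)
sign_kept = float(np.mean(np.sign(x0) == np.sign(xT)))
print(f"rho={rho:.3f} fraction of atoms keeping their branch={sign_kept:.3f}")
```

In this stand-in, raising `noise` or pushing `T` far past saturation erodes the correlation, which is the qualitative role of the window τr ≪ T ≪ Trand in the toy verification.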

④ Grokking decomposes into three acts (Empirical)

Grokking is the cleanest empirical instance of post-critical metastability. Canonical run: p=97, WD=1.0, n=3 seeds. The system crosses β=βc at step ~40, then sits on a metastable saddle for ~8,500 steps before dissipation from weight decay drives the macroscopic transition.

Act 1: β crosses βc at step 37 ± 2 (universal across all configs).
Act 2: Metastable plateau, with length set by dissipation.
Act 3: Escape at step 8,900 ± 864 (canonical p=97, WD=1.0).
The indicator log(β/βc) ≈ +3 at step 100 already places the trajectory in Act 2, providing an early empirical forecast of grokking roughly 8,400 steps before test accuracy changes.

⑤ WD-intervention control: metastable plateau length (Empirical)

The Hessian-pitchfork crossing (§① / Remark 1) marks when the symmetric state becomes unstable, not when the macroscopic transition fires. To test that the plateau length is genuinely dissipation-controlled, we sweep weight decay (a knob that does not shift the crossing) and measure escape time:

τesc ≍ A · γ^(−p),    p ≈ 1.23  (empirical fit, not a theoretical prediction)

Six WD levels, 3 seeds each, 200k-step horizon. Activation-dominated Kramers form τ ≍ τ0 · exp((ΔS − κγ)/D) is decisively ruled out (ΔAIC = +19.3), pinning the grokking plateau in the drift-dominated regime.

[Figure: escape time vs. weight decay γ. Red dots: observed grok steps (3 seeds each, 200k-step horizon). Blue arrow at γ=0: no escape within 50k steps. Solid line: activation-dominated fit. Dashed line: power-law fit.]
What this supports: the framework's qualitative claim that crossing ≠ macroscopic transition (Remark 1). τesc grows monotonically as γ decreases across two decades (8.9k → 147k steps) and diverges at γ = 0 (0/3 seeds escape within 50k steps). The power-law exponent is an empirical characterization of this grokking setup, not a derivation from the bifurcation framework; a separate experiment in a different regime (deep / wide barrier) could plausibly land in the activation-dominated Kramers limit instead.
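
The model comparison can be sketched as two straight-line fits to log τesc. The power law is linear in log γ, while the Kramers form is linear in γ, since τ0, ΔS and D cannot be separated from τ(γ) alone. The example below is an illustration on synthetic data, not the paper's measurements: the six γ levels, the noise level, and the sign convention ΔAIC = AIC(Kramers) − AIC(power law) are assumptions, and the data are generated from a power law by construction.

```python
import numpy as np

def fit_line_aic(x, y, k=2):
    """Least-squares fit y ~ a + b*x. Returns (slope, AIC) with AIC = n*ln(RSS/n) + 2k."""
    slope, intercept = np.polyfit(x, y, 1)
    rss = float(np.sum((y - (intercept + slope * x)) ** 2))
    n = len(y)
    return float(slope), n * np.log(rss / n) + 2 * k

# Synthetic escape times (hypothetical): six weight-decay levels, 3 seeds each,
# drawn from tau = A * gamma**(-1.23) with log-normal scatter.
rng = np.random.default_rng(0)
gamma = np.repeat(np.logspace(-1, 0, 6), 3)
log_tau = np.log(8900.0) - 1.23 * np.log(gamma) + 0.1 * rng.standard_normal(gamma.size)

slope_pl, aic_pl = fit_line_aic(np.log(gamma), log_tau)   # power law: log tau vs log gamma
slope_kr, aic_kr = fit_line_aic(gamma, log_tau)           # Kramers: log tau vs gamma

p_hat = -slope_pl
delta_aic = aic_kr - aic_pl    # positive values favour the power law
print(f"p_hat={p_hat:.2f} delta_AIC={delta_aic:.1f}")
```

Both candidate forms have the same number of fitted parameters here, so ΔAIC reduces to a comparison of residual sums of squares in log τ.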

⑥ Paper & citation

Read the paper on arXiv:2605.24057.

BibTeX:

@article{yang2026featurelottery,
  title         = {Feature Lottery? A Bifurcation Theory of Concept Emergence},
  author        = {Yang, Fuming},
  journal       = {arXiv preprint arXiv:2605.24057},
  year          = {2026},
  eprint        = {2605.24057},
  archivePrefix = {arXiv}
}