Feature Lottery?

A Bifurcation Theory of Concept Emergence

Fuming Yang, Department of Brain and Cognitive Sciences, MIT


① The Hessian-pitchfork mechanism Theory

Attach a passive K-prototype isotropic GMM probe to a given encoder representation. The loss Hessian at the symmetric collapsed state has a zero crossing at:

β_c = 1 / λ_max(Cov(z))

Below β_c, all prototypes pile at the data mean (symmetric collapsed state). Above β_c, the symmetric state becomes a saddle and prototypes pitchfork along the principal eigenvector of Cov(z).
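The threshold can be checked numerically. The sketch below is not the paper's code: it assumes that β is the shared precision of the probe (responsibilities proportional to exp(−β‖z − μ_k‖²/2)), that the prototypes carry equal weights, and that the probe is fit by EM-style soft-assignment updates. The names `critical_beta` and `fit_probe` are hypothetical, and a synthetic anisotropic Gaussian stands in for encoder representations.

```python
import numpy as np

def critical_beta(z):
    """Return beta_c = 1 / lambda_max(Cov(z)) and the principal eigenvector."""
    cov = np.cov(z, rowvar=False, bias=True)   # (d, d) covariance of the sample
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    return 1.0 / eigvals[-1], eigvecs[:, -1]

def fit_probe(z, beta, K=4, steps=200, seed=0):
    """EM-style updates for K equal-weight isotropic prototypes (assumed probe)."""
    rng = np.random.default_rng(seed)
    # Start almost collapsed: every prototype sits near the data mean.
    mu = z.mean(axis=0) + 1e-3 * rng.normal(size=(K, z.shape[1]))
    for _ in range(steps):
        d2 = ((z[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)  # (n, K)
        logits = -0.5 * beta * d2
        logits -= logits.max(axis=1, keepdims=True)                # stability
        r = np.exp(logits)
        r /= r.sum(axis=1, keepdims=True)                          # responsibilities
        mu = (r.T @ z) / r.sum(axis=0)[:, None]                    # weighted means
    return mu

def spread(mu, z):
    """Largest distance of any prototype from the data mean."""
    return np.linalg.norm(mu - z.mean(axis=0), axis=1).max()

rng = np.random.default_rng(0)
# Hypothetical stand-in for encoder outputs: variances close to 4 and 1.
z = rng.normal(size=(2000, 2)) * np.array([2.0, 1.0])

beta_c, principal = critical_beta(z)
spread_below = spread(fit_probe(z, 0.8 * beta_c), z)   # stays collapsed
spread_above = spread(fit_probe(z, 1.5 * beta_c), z)   # prototypes split
print(f"beta_c = {beta_c:.3f}")
print(f"spread at 0.8 beta_c: {spread_below:.2e}")
print(f"spread at 1.5 beta_c: {spread_above:.2e}")
```

Under these assumptions, linearizing one update around the collapsed state multiplies each anti-symmetric perturbation by β·Cov(z), so the perturbation shrinks when β·λ_max < 1 and grows along the principal eigenvector otherwise. That is the same threshold as β_c above.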

[Figure: schematic of the supercritical pitchfork with K = 12 prototypes, shown at β / β_c = 0.80]

Proposition 1: When the encoder co-evolves, β_c(t) becomes endogenous; under mild growth and non-collapse assumptions, β(t) and β_c(t) cross at a finite time.
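In practice the crossing can be read off logged trajectories as the first step where log(β/β_c) becomes non-negative. The following is a minimal sketch with made-up trajectories; the function names and the growth laws for β(t) and β_c(t) are illustrative assumptions, not quantities from the paper.

```python
import math

def log_ratio(beta, beta_c):
    """Indicator log(beta / beta_c) at every logged step."""
    return [math.log(b / bc) for b, bc in zip(beta, beta_c)]

def first_crossing(beta, beta_c):
    """First index t with beta[t] >= beta_c[t], or None if they never cross."""
    for t, value in enumerate(log_ratio(beta, beta_c)):
        if value >= 0.0:
            return t
    return None

# Hypothetical trajectories: beta grows geometrically while beta_c rises slowly.
steps = range(60)
beta = [0.5 * 1.05 ** t for t in steps]
beta_c = [1.0 + 0.01 * t for t in steps]

t_cross = first_crossing(beta, beta_c)
print("crossing step:", t_cross)
```

Because β_c(t) moves as the encoder trains, both curves have to be logged at the same steps; comparing β(t) against a β_c measured once at initialization would miss the endogenous shift.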

② Four trajectory shapes Prop 3

The post-critical trajectory in (log(β/β_c), log NC1) space takes one of four observable shapes, determined by three binary kinematic axes (initial criticality, post-onset rate ordering, dissipation rate):

(i) Full V. SAE on frozen Pythia L6: β_c constant.
(ii) Fold-back (spectrum). DINO/SimCLR on CIFAR-10/100: β_c(t) rises.
(iii) Delayed escape. Grokking: low dissipation, long Act 2 plateau.
(iv) No arc (control). Rotation-prediction: no clustering pressure.

All four shapes are realized in our experiments; the framework forbids a fifth shape under the same dynamics. A trajectory outside (i)–(iv) would falsify the model.

③ The lottery: directions decided at the bifurcation Prop 2

At the Hessian-pitchfork onset, the unstable subspace is shared by all K−1 anti-symmetric modes. Each atom selects a direction from this common manifold; that initial direction is then preserved as it grows toward the attractor. We call this the feature lottery.

[Animation: lottery toy with K = 80 atoms in d = 2 spatial dimensions, tracking ρ(init, current) over training steps]

Toy verification: Spearman ρ between initial direction and direction at finite simulation time T is +0.948 ± 0.006 (5 seeds, K=200, d=10; T within the persistence window τ_r ≪ T ≪ T_rand).
SAE empirics: ρ_id = 0.41 ± 0.04 between POS purity at step 1,000 and at step 20,000 (3 seeds, robust to frequency stratification). The top decile of atoms, ranked at 5% of training, reaches a convergence POS purity of 0.82 ± 0.03, 12.3× the uniform-random baseline.
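Both statistics are rank-based and need only one score per atom at an early and a late checkpoint. The sketch below computes a Spearman ρ with average ranks for ties, plus the mean late score of the atoms in the early top decile. The helper names and the ten purity values are hypothetical placeholders, not measurements from the paper.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

def top_decile_mean(early, final):
    """Mean final score of the atoms ranked in the top 10% by early score."""
    k = max(1, len(early) // 10)
    top = sorted(range(len(early)), key=lambda i: early[i], reverse=True)[:k]
    return sum(final[i] for i in top) / k

# Hypothetical per-atom purity at an early and a late checkpoint.
early = [0.10, 0.40, 0.35, 0.80, 0.20, 0.55, 0.05, 0.60, 0.30, 0.45]
final = [0.20, 0.35, 0.50, 0.90, 0.15, 0.60, 0.10, 0.70, 0.25, 0.40]

rho = spearman(early, final)
lift_numerator = top_decile_mean(early, final)
print(f"rho = {rho:.3f}, top-decile mean final purity = {lift_numerator:.2f}")
```

Dividing the top-decile mean by a baseline purity would give a lift figure comparable to the 12.3× above; how that baseline is defined is left to the paper.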

④ Grokking decomposes into three acts Empirical

Grokking is the cleanest empirical instance of post-critical metastability. Canonical run: p=97, WD=1.0, n=3 seeds. The system crosses β=β_c at step ~40, then sits on a metastable saddle for ~8,500 steps before dissipation from weight decay drives the macroscopic transition.


Act 1: β crosses β_c at step 37 ± 2 (universal across all configs).
Act 2: Metastable plateau, length set by dissipation.
Act 3: Escape at step 8,900 ± 864 (canonical p=97, WD=1.0).

The indicator log(β/β_c) ≈ +3 at step 100 already places the trajectory in Act 2, providing an early empirical forecast of grokking roughly 8,400 steps before test accuracy changes.

⑤ WD-intervention control: metastable plateau length Empirical

The Hessian-pitchfork crossing (§① / Remark 1) marks when the symmetric state becomes unstable, not when the macroscopic transition fires. To test that the plateau length is genuinely dissipation-controlled, we sweep weight decay (a knob that does not shift the crossing) and measure escape time:

τ_esc ≍ A · γ^(−p), p ≈ 1.23, where γ is the weight-decay coefficient (empirical fit, not a theoretical prediction)

Six WD levels, 3 seeds each, 200k-step horizon. The activation-dominated Kramers form τ ≍ τ₀ · exp((ΔS − κγ)/D) is decisively ruled out (ΔAIC = +19.3), which pins the grokking plateau in the drift-dominated regime.
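One way to set up such a comparison is to fit both forms by least squares in log space: the power law is linear in ln γ, while the activation-dominated form is linear in γ. The sketch below does this on synthetic escape times and reports the AIC difference. The data, the log-space fitting choice, and the sign convention ΔAIC = AIC(activation) − AIC(power law) are all assumptions made for illustration; the paper's fitting procedure may differ.

```python
import math

def linear_fit(x, y):
    """Ordinary least squares y ~ a + b*x; returns (a, b, residual sum of squares)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, rss

def aic(rss, n, k):
    """Gaussian-error AIC up to an additive constant: n*ln(RSS/n) + 2k."""
    return n * math.log(rss / n) + 2 * k

# Hypothetical escape times: a power law with small fixed multiplicative jitter.
gammas = [0.1, 0.2, 0.3, 0.5, 1.0, 2.0]
jitter = [0.02, -0.03, 0.01, -0.01, 0.03, -0.02]
taus = [9000.0 * g ** -1.23 * (1 + e) for g, e in zip(gammas, jitter)]
log_tau = [math.log(t) for t in taus]

# Power law: ln tau = ln A - p * ln gamma
_, slope_pow, rss_pow = linear_fit([math.log(g) for g in gammas], log_tau)
p_hat = -slope_pow

# Activation-dominated form: ln tau = (ln tau0 + dS/D) - (kappa/D) * gamma
_, slope_act, rss_act = linear_fit(gammas, log_tau)

n = len(gammas)
delta_aic = aic(rss_act, n, 2) - aic(rss_pow, n, 2)
print(f"p_hat = {p_hat:.3f}, delta AIC = {delta_aic:+.1f}")
```

Both models have two free parameters in log space, so the AIC difference here reduces to a comparison of residuals. A positive value favors the power law on this synthetic data by construction and says nothing about the real runs.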

Red dots: observed grok steps (3 seeds each, 200k horizon) • Blue arrow at γ=0: no escape in 50k • Solid: activation-dominated fit • Dashed: power-law fit

What this supports: the framework's qualitative claim that crossing ≠ macroscopic transition (Remark 1). τ_esc grows monotonically as γ decreases across two decades (8.9k → 147k steps) and diverges at γ = 0 (0/3 seeds escape within 50k steps). The power-law exponent is an empirical characterization of this grokking setup, not a derivation from the bifurcation framework; a separate experiment in a different regime (deep / wide barrier) could plausibly land in the activation-dominated Kramers limit instead.

⑥ Paper & citation Info

Read the paper on arXiv:2605.24057.

BibTeX:

@article{yang2026featurelottery,
  title         = {Feature Lottery? A Bifurcation Theory of Concept Emergence},
  author        = {Yang, Fuming},
  journal       = {arXiv preprint arXiv:2605.24057},
  year          = {2026},
  eprint        = {2605.24057},
  archivePrefix = {arXiv}
}