Discussion about this post

Paul
Feb 3 (edited)

Hi David, lovely post. It resonates deeply with something I've been working through - not just understanding how metacognitive habits differ, but actually changing them in practice.

Some months ago, I started thinking about the brain as fundamentally conservative, settling into stable attractor states. The brain can't run on nouns - it needs trigger → action → feedback. A stimulus comes in, the brain responds with a practiced move, and keeps going until it gets relief (prediction error drops). Most of our daily actions - maybe 95%+ - are repetitions of moves we've done before given the same triggers. You can see this vividly in something like football. Watch players dribbling and their gaits are nearly identical match to match. The brain finds movements that work and defaults to them because it's metabolically cheap.

The same thing happens with thinking. When students hit confusion, they cycle through remarkably predictable policies: re-read, ask "why don't I get this?", check the answer key, or skip it. The feedback loop closes - uncertainty gets temporarily resolved - and the brain marks this as successful even though nothing was actually learned.

Looking back at my own learning, I realized every time I got unstuck after prolonged struggle, it was after some pivot or action - trying a diagram, computing a case, explaining to myself. Never did wheel-spinning in the same mode resolve confusion.

What I also noticed was that often it took me extended periods of time to get unstuck - not because the problem was that hard, but because that's how long it took for desperation to build up enough to lower my threshold for seemingly absurd action. Action that then, to my surprise, actually resolved the confusion. But I'd never systematized this or looked at what preceded breakthroughs.

What I settled on was: detect confusion early and immediately force a representational shift. Switch from algebra to geometry, symbolic to numerical, abstract to physical - whatever gets you out of the current stuck state.

This maps directly onto your point about secondary stimuli. The difference between someone stuck and someone learning isn't the problem itself but what they do internally when stuck. Re-reading the same passage generates almost no new neural activity. Switching representational modes - visualizing geometrically, computing specific examples, explaining out loud - activates completely different pathways.

But here's the implementation problem: at peak uncertainty, when you're most confused, your capacity for novel action is lowest. This is exactly when the brain defaults to familiar ineffective responses. Knowing you should shift representations changes nothing.

What actually worked was externalizing the choice. I made cards with specific moves: "Compute Something," "Extreme Cases," "Remove One Part," "Reverse/Rotate/Swap," "Sit With Confusion," "Make Prediction." When stuck, I draw a card. This removes the hardest part - committing to action under uncertainty. Any shift away from the stuck state generates information.
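For anyone who would rather not carry index cards, the draw can be sketched in a few lines of Python. This is only a rough sketch of the idea, not the exact procedure I follow: the card labels are the ones listed above, while the `draw_card` function, the uniform random pick, and the option to skip the move you just tried are assumptions added for illustration.

```python
import random

# Card labels come from the comment above; everything else is hypothetical.
CARDS = [
    "Compute Something",
    "Extreme Cases",
    "Remove One Part",
    "Reverse/Rotate/Swap",
    "Sit With Confusion",
    "Make Prediction",
]


def draw_card(rng=None, exclude=None):
    """Pick one move uniformly at random.

    rng:     optional random.Random instance (useful for reproducible draws).
    exclude: optional card label to skip, e.g. the move that was just tried.
             Skipping the previous move is an assumption, not part of the
             original card protocol.
    """
    rng = rng if rng is not None else random
    choices = [card for card in CARDS if card != exclude]
    return rng.choice(choices)


if __name__ == "__main__":
    first = draw_card()
    print("Stuck? Try:", first)
    # If that move did not help, draw again without repeating it.
    print("Still stuck? Try:", draw_card(exclude=first))
```

The point of the sketch is the same as the point of the cards: the choice is made outside your head, so the only thing left to do is execute the move.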

The second thing: permission to execute wrong. After some initial tries, I came to find there's usually this hesitation where you're evaluating "how exactly should I do this?" That evaluation overhead often kills the attempt. Slows you down. So instead: execute wrong on purpose! Don't know which numbers to compute? Use obviously wrong ones. Try the extreme case even if it seems absurd. I found wrong execution generates information fast - you learn why it's wrong in seconds, which reveals constraints and hidden structure.

I realized that confusion as a state is actually rather similar across different domains. The brain being what it is - a trigger-action-feedback system - if I could install a different state-action mapping at this inflection point, it should have outsized effects. If the new policy reduces prediction error better than the old one, it should outcompete it naturally over repetitions.

And it did. After some weeks of forced practice with the cards: detection latency dropped from 90 seconds to maybe 20-30, execution became nearly immediate, and subjectively I started feeling 'restless' when stuck and not shifting. The new policy's prediction-error-reduction is so much better that staying in one representation now feels 'wrong'. That restlessness is the experiential signature of attractor replacement.

What strikes me is the magnitude of improvement relative to effort. Problems I previously thought required checking textbooks or were just beyond me now resolve through sustained exploration. It takes longer than looking up answers, but I build actual understanding and the capability to do it again.

The old policy feels like doomscrolling - minimal cognitive load, no new information, just anxiety relief. The new policy feels like exercise - more effortful initially, but generating new neural activation with each attempt.

Your secondary stimuli insight is particularly apt here. Each representational shift isn't just "thinking harder" - it's generating different internal experience. Algebraic manipulation, geometric visualization, numerical calculation activate distinct neural substrates. High-frequency switching means high-diversity neural exploration, which should drive connectome reorganization much faster than passive re-reading.

Before this I was cognitively sedentary when stuck - burning mental energy on anxiety while generating minimal neural activity. Now there's constant motion: try this angle, doesn't work, try that, learn something, try another. The volume of distinct cognitive states explored per unit time has increased dramatically.

But I think the compounding goes deeper than just executing pivots more reliably. The brain being a pattern-matching machine, over time it comes to map certain kinds of cues - internal or external - to certain moves. As it gathers experience, it learns that some moves resolve prediction error better than others. This is essentially what intuition is.

The increased information density per unit time, from this active learning strategy, means the brain gets vastly more data to identify underlying patterns. Each night when you sleep, the brain compresses this enormous amount of information, replays it, consolidates it. Over months and years, this might explain how people who naturally default to this stance appear to have better "smell" for what to do and when. It compounds exponentially.

It's really akin to what you described as going into "math mode" - and what you said elsewhere: "This bizarre, almost childish attitude is extremely hard to communicate to outsiders." That's exactly it. It's like a kid picking up a toy and figuring out all the funny ways they can play with it, learning about its properties in the process. The little moves are cheap, easy. If they don't work, it doesn't mean much - you just do something else.

I don't think any of this is particularly revolutionary from a pedagogy or neuroscience standpoint - I suspect it's implicit to how a lot of mathematicians and physicists actually work. But it was revolutionary to me personally, this shift in cognitive stance.

If intelligence lives in the connectome and connectomes reorganize in response to activity patterns, this high-frequency representational shifting should accelerate development. Not immediately, but compounded over months and years the divergence could be substantial.

I think what made this work wasn't just understanding the principle (intellectually) but the operationalization. The cards externalize the decision at exactly the moment when your brain is least capable of making it. Wrong execution removes the evaluative layer that causes hesitation. Together they bypass the exact bottlenecks that prevent people from using techniques they intellectually know about.

This seems to instantiate your conjectures directly. The cards operationalize a specific trainable habit at the exact inflection point that determines learning trajectories. Within weeks of deliberate practice, a completely different response pattern installed itself.

Motion precedes clarity. That's the stance you start to embody.

Being an optimist, I think you might be understating the potential magnitude. If the constraint on cognitive development is primarily metacognitive habits rather than genetic ceiling, and if simple protocols can shift those habits within weeks, the accessible improvement might be larger than the "20% full glass" suggests.

The protocol costs essentially nothing - some index cards and permission to execute badly - but it forces precisely the kind of "peculiar rumination techniques" you describe elite mathematicians practicing. It's a way to operationalize what you call quality of attention, but as concrete actions anyone can systematically train.

Most importantly: it's trainable in the sense of actually installable, not just intellectually understandable. You need a detectable trigger, an externalized action protocol, permission to execute imperfectly, and volume of practice. The policy that better reduces prediction error wins naturally. No willpower required once the initial pattern starts to dominate.

Thanks for articulating this framework so clearly. It's nice to see my experience map onto your educated guesses about secondary stimuli, metacognitive habits, and compounding neural differences.

Jameson Graber

I think you end up with an extremely reasonable position. Sometimes when you push back against hereditarianism, you seem a bit starry-eyed about what we can accomplish by sheer force of will. In the end, though, you seem to admit that a lot of things are out of our control.

Personally, I still think all of those quotes--Newton, Einstein, Feynman, Grothendieck--are either disingenuous or incredibly naive. My own experience as a mathematician has not made me any less frustrated with these quotes. Quite the contrary, really. It feels as if geniuses feel obliged to remind everyone that actually they work very hard, as if we didn't already know. But I thought everybody knew that, just as they do for any other kind of excellence. Michael Phelps also had to work extremely hard to win all of those gold medals, but that doesn't mean that the rest of us can do it too if we just follow the same physical fitness regime that he did. Just because you have to work very, very hard to develop a gift doesn't mean that it isn't a gift.

Sometimes I think you exaggerate the difference between intellectual and physical prowess. You use a 100 meter dash as an example, because we all agree that all of us could at least finish, even if we're pathetically slow. But there are other activities with thresholds. Lifting weights, for example. The vast majority of us will just never be able to lift 500 pounds, even if we are given a week to do it.

Despite my criticisms, I will say that the extremely valuable part of your book and this post is a sort of research program to try to understand how we might be better trainers of cognitive ability. You're right to point out that physical activities are much more straightforward to model. Cognitive patterns are much more hidden. As a professor I've tried to explain to my students, as far as I'm able, my actual stream of consciousness that occurs to me when I approach a problem. Sometimes this can be a bit frightening to students, probably because, in addition to being hidden, cognitive behavior can be highly idiosyncratic. Still, there's probably a lot to be gained from trying to figure out the common patterns in the cognition of highly effective intellectuals. I imagine neuroscience will play a big role in this.
