Hi David, lovely post. It resonates deeply with something I've been working through - not just understanding how metacognitive habits differ, but actually changing them in practice.
Some months ago, I started thinking about the brain as fundamentally conservative, settling into stable attractor states. The brain can't run on nouns - it needs trigger → action → feedback. A stimulus comes in, the brain responds with a practiced move, and keeps going until it gets relief (prediction error drops). Most of our daily actions - maybe 95%+ - are repetitions of moves we've done before given the same triggers. You can see this vividly in something like football. Watch players dribbling and their gaits are nearly identical match to match. The brain finds movements that work and defaults to them because it's metabolically cheap.
The same thing happens with thinking. When students hit confusion, they cycle through remarkably predictable policies: re-read, ask "why don't I get this?", check the answer key, or skip it. The feedback loop closes - uncertainty gets temporarily resolved - and the brain marks this as successful even though nothing was actually learned.
Looking back at my own learning, I realized every time I got unstuck after prolonged struggle, it was after some pivot or action - trying a diagram, computing a case, explaining to myself. Never did wheel-spinning in the same mode resolve confusion.
What I also noticed was that often it took me extended periods of time to get unstuck - not because the problem was that hard, but because that's how long it took for desperation to build up enough to lower my threshold for seemingly absurd action. Action that then, to my surprise, actually resolved the confusion. But I'd never systematized this or looked at what preceded breakthroughs.
What I settled on was: detect confusion early and immediately force a representational shift. Switch from algebra to geometry, symbolic to numerical, abstract to physical - whatever gets you out of the current stuck state.
This maps directly onto your point about secondary stimuli. The difference between someone stuck and someone learning isn't the problem itself but what they do internally when stuck. Re-reading the same passage generates almost no new neural activity. Switching representational modes - visualizing geometrically, computing specific examples, explaining out loud - activates completely different pathways.
But here's the implementation problem: at peak uncertainty, when you're most confused, your capacity for novel action is lowest. This is exactly when the brain defaults to familiar ineffective responses. Knowing you should shift representations changes nothing.
What actually worked was externalizing the choice. I made cards with specific moves: "Compute Something," "Extreme Cases," "Remove One Part," "Reverse/Rotate/Swap," "Sit With Confusion," "Make Prediction." When stuck, I draw a card. This removes the hardest part - committing to action under uncertainty. Any shift away from the stuck state generates information.
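For anyone who would rather not carry index cards, the draw itself is easy to sketch in code. This is only a minimal illustration, assuming a uniformly random pick; the six move names come from the paragraph above, while `MOVES` and `draw_card` are hypothetical names invented for the example.

```python
import random

# The six moves are quoted from the comment above; the list name and
# the helper below are illustrative assumptions, not part of the protocol.
MOVES = [
    "Compute Something",
    "Extreme Cases",
    "Remove One Part",
    "Reverse/Rotate/Swap",
    "Sit With Confusion",
    "Make Prediction",
]


def draw_card(rng=None):
    """Return one move picked uniformly at random (hypothetical helper)."""
    # Accept an optional random.Random instance so a draw can be reproduced.
    source = random if rng is None else rng
    return source.choice(MOVES)


if __name__ == "__main__":
    # When stuck: draw, then execute the move immediately, even badly.
    print(draw_card())
```

The point of the random pick is the same as with the physical cards: the choice is made outside your head at the moment you are least able to make it.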
The second thing: permission to execute wrong. After some initial tries, I came to find there's usually this hesitation where you're evaluating "how exactly should I do this?" That evaluation overhead often kills the attempt. Slows you down. So instead: execute wrong on purpose! Don't know which numbers to compute? Use obviously wrong ones. Try the extreme case even if it seems absurd. I found wrong execution generates information fast - you learn why it's wrong in seconds, which reveals constraints and hidden structure.
I realized that confusion as a state is actually rather similar across different domains. The brain being what it is - a trigger-action-feedback system - if I could install a different state-action mapping at this inflection point, it should have outsized effects. If the new policy reduces prediction error better than the old one, it should outcompete it naturally over repetitions.
And it did. After some weeks of forced practice with the cards: detection latency dropped from 90 seconds to maybe 20-30, execution became nearly immediate, and subjectively I started feeling 'restless' when stuck and not shifting. The new policy's prediction-error-reduction is so much better that staying in one representation now feels 'wrong'. That restlessness is the experiential signature of attractor replacement.
What strikes me is the magnitude of improvement relative to effort. Problems I previously thought required checking textbooks or were just beyond me now resolve through sustained exploration. It takes longer than looking up answers, but I build actual understanding and the capability to do it again.
The old policy feels like doomscrolling - minimal cognitive load, no new information, just anxiety relief. The new policy feels like exercise - more effortful initially, but generating new neural activation with each attempt.
Your secondary stimuli insight is particularly apt here. Each representational shift isn't just "thinking harder" - it's generating different internal experience. Algebraic manipulation, geometric visualization, numerical calculation activate distinct neural substrates. High-frequency switching means high-diversity neural exploration, which should drive connectome reorganization much faster than passive re-reading.
Before this I was cognitively sedentary when stuck - burning mental energy on anxiety while generating minimal neural activity. Now there's constant motion: try this angle, doesn't work, try that, learn something, try another. The volume of distinct cognitive states explored per unit time has increased dramatically.
But I think the compounding goes deeper than just executing pivots more reliably. The brain being a pattern-matching machine, over time it comes to map certain kinds of cues - internal or external - with certain moves. It realizes some moves work better to resolve prediction error than others as it gathers experience. This is essentially what intuition is.
The increased information density per unit time, from this active learning strategy, means the brain gets vastly more data to identify underlying patterns. Each night when you sleep, the brain compresses this enormous amount of information, replays it, consolidates it. Over months and years, this might explain how people who naturally default to this stance appear to have better "smell" for what to do and when. It compounds exponentially.
It's really akin to what you described as going into "math mode" - and what you said elsewhere: "This bizarre, almost childish attitude is extremely hard to communicate to outsiders." That's exactly it. It's like a kid picking up a toy and figuring out all the funny ways they can play with it, learning about its properties in the process. The little moves are cheap, easy. If they don't work, it doesn't mean much - you just do something else.
I don't think any of this is particularly revolutionary from a pedagogy or neuroscience standpoint - I suspect it's implicit to how a lot of mathematicians and physicists actually work. But it was revolutionary to me personally, this shift in cognitive stance.
If intelligence lives in the connectome and connectomes reorganize in response to activity patterns, this high-frequency representational shifting should accelerate development. Not immediately, but compounded over months and years the divergence could be substantial.
I think what made this work wasn't just understanding the principle (intellectually) but the operationalization. The cards externalize the decision at exactly the moment when your brain is least capable of making it. Wrong execution removes the evaluative layer that causes hesitation. Together they bypass the exact bottlenecks that prevent people from using techniques they intellectually know about.
This seems to instantiate your conjectures directly. The cards operationalize a specific trainable habit at the exact inflection point that determines learning trajectories. Within weeks of deliberate practice, a completely different response pattern installed itself.
Motion precedes clarity. That's the stance you start to embody.
Being an optimist, I think you might be understating the potential magnitude. If the constraint on cognitive development is primarily metacognitive habits rather than genetic ceiling, and if simple protocols can shift those habits within weeks, the accessible improvement might be larger than the "20% full glass" suggests.
The protocol costs essentially nothing - some index cards and permission to execute badly - but it forces precisely the kind of "peculiar rumination techniques" you describe elite mathematicians practicing. It's a way to operationalize what you call quality of attention, but as concrete actions anyone can systematically train.
Most importantly: it's trainable in the sense of actually installable, not just intellectually understandable. You need a detectable trigger, an externalized action protocol, permission to execute imperfectly, and volume of practice. The policy that better reduces prediction error wins naturally. No willpower required once the initial pattern starts to dominate.
Thanks for articulating this framework so clearly. It's nice to see my experience map onto your educated guesses about secondary stimuli, metacognitive habits, and compounding neural differences.
This is so intriguing. I'm wondering about this passage: 'This is the fundamental reason why educational interventions so often fail to move the needle. While they deterministically alter the primary stimuli, their impact on the secondary stimuli is always indirect and contingent to uncontrolled factors.' Could you give an example of a 'secondary stimulus' to clarify it a bit? And of uncontrolled factors?
Thank you James for your feedback. I will edit this passage as it definitely deserves an example. Here is one:
When you read a book, the primary stimulus is the ink on the page; the secondary stimuli are the mental imagery and the train of thought that the primary stimulus prompts, and which may linger on for minutes, hours, days, or years.
I think you end up with an extremely reasonable position. Sometimes when you push back against hereditarianism, you seem a bit starry-eyed about what we can accomplish by sheer force of will. In the end, though, you seem to admit that a lot of things are out of our control.
Personally, I still think all of those quotes--Newton, Einstein, Feynman, Grothendieck--are either disingenuous or incredibly naive. My own experience as a mathematician has not made me any less frustrated with these quotes. Quite the contrary, really. It feels as if geniuses feel obliged to remind everyone that actually they work very hard, as if we didn't already know. But I thought everybody knew that, just as they do for any other kind of excellence. Michael Phelps also had to work extremely hard to win all of those gold medals, but that doesn't mean that the rest of us can do it too if we just follow the same physical fitness regime that he did. Just because you have to work very, very hard to develop a gift doesn't mean that it isn't a gift.
Sometimes I think you exaggerate the difference between intellectual and physical prowess. You use a 100 meter dash as an example, because we all agree that all of us could at least finish, even if we're pathetically slow. But there are other activities with thresholds. Lifting weights, for example. The vast majority of us will just never be able to lift 500 pounds, even if we are given a week to do it.
Despite my criticisms, I will say that the extremely valuable part of your book and this post is a sort of research program to try to understand how we might be better trainers of cognitive ability. You're right to point out that physical activities are much more straightforward to model. Cognitive patterns are much more hidden. As a professor I've tried to explain to my students, as far as I'm able, my actual stream of consciousness that occurs to me when I approach a problem. Sometimes this can be a bit frightening to students, probably because, in addition to being hidden, cognitive behavior can be highly idiosyncratic. Still, there's probably a lot to be gained from trying to figure out the common patterns in the cognition of highly effective intellectuals. I imagine neuroscience will play a big role in this.
Glad I'm ending up with an extremely reasonable position—I do think I'm an extremely reasonable person :) !
On the 500 pounds example: the success metric for weightlifting is how much you can lift. I'm not an expert on the matter, but I do suspect most people could, with adequate training, lift 100 pounds or even 200 and possibly more. This feels nowhere near the common perception of the math talent gap.
On this topic, I think you're missing the essential notion of conceptual compression: how things that initially seem unfathomably hard often become trivial once you develop a familiarity with the right conceptual framework. One of my favorite examples is Hindu-Arabic numerals, through which you instantly "see" that 1,000,000,000 - 1 = 999,999,999, a computation that feels superhuman to someone who only knows Roman numerals (see this post based on a chapter from my book: https://davidbessis.substack.com/p/the-magic-of-mathematical-intuition)
There is no equivalent of conceptual compression for lifting 500 pounds and this is where, in my view, the analogy breaks down. Cognition isn't running or weightlifting, and this explains why insane inequalities can develop.
About what we can accomplish by sheer force of will — I am acutely aware of everyone's limits, yet I do think we should absolutely insist that people have a huge progression margin, because they do have one and often think they don't.
Maybe we had a different experience with mathematics. I do think most people are primarily blocked by their fears (and also their misconceptions of what is actually at stake). This certainly applied to me, which may explain why I'm particularly adamant on the topic.
We certainly had a different experience with mathematics, which I think shapes a lot of the discussion (which is delightful, by the way).
My experience seems to be almost the opposite of yours in every way. I've known pretty much as long as I can remember that I loved mathematics, and I stood out in all of my classes from kindergarten onward. Far from having a fear of it, I almost found a sort of refuge in it. Perhaps I enjoyed it so much because I could see why things were true, without having to take my teachers' word for it. Other kids seemed to have the opposite experience. They couldn't see why any of it was true, they just learned rules. So I tried to explain it to them, but spent much of my childhood getting blank stares.
The difference between French and US education is a bit paradoxical. You would think that the French would have a much more egalitarian ethos, but when it comes to public schools, almost the opposite is true. We have nothing like "Classe Préparatoire," and on the whole I would say US public schools tend to focus on the median student, doing very little to foster exceptional talent. That was certainly my experience. So while I saw from a distance how much brilliant math students could do with exceptional training, I was left to teach myself, as you say many mathematicians do. I ended up in a pretty good university, but certainly had nothing like the boot camp that allows France to produce so many Fields Medalists.
Speaking of Fields Medalists, allow me to give a comparison from my own experience that will help explain why I hate those quotes by Newton and Einstein. My own research is in the same domain as P.-L. Lions (and maybe Cedric Villani and Alessio Figalli, to name two other Fields Medalists). Now, when I read a paper by Lions or hear his lecture, I can follow along--I know what he is doing. To use your metaphor, I can at least get to the same finish line as he does. But by the time I do, he has gone on to 10 other projects, which he will finish by the time I even start my own. I think I'm doing the same thing as he is, and I can grasp the same ideas on a similar intuitive level, but there is simply no way I can keep up with his speed. Saying I just need to work harder and I can be just like Lions or Villani or Figalli would not be encouraging, but rather soul-crushing. I bet many mathematicians just like me feel the same way.
I understand perfectly well that cognitive work is not like lifting weights, and that intellectual progress is a lot like capital accumulation--it can increase exponentially. But that's exactly my point with respect to my own experience. Even if Lions is actually only 2 or 3 times faster than I am, clearly over a lifetime this is going to yield an overwhelming difference in productivity. I cannot double my speed. There is no point in comparing myself with such giants. At this point, whether or not such capacities are "hereditary" becomes a mere technicality. That meme with the photo of von Neumann? Even if it is technically wrong because there is a lot more than genetics going on, it's still a very real pill that many of us have to swallow.
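To put rough numbers on the capital-accumulation comparison, here is a small sketch. It assumes that "twice as fast" means twice the yearly growth rate under continuous exponential growth, which is only one possible reading; the rate, the career length, and the name `growth_factor` are all hypothetical and chosen purely for illustration.

```python
import math


def growth_factor(rate_per_year, years):
    """Growth factor exp(rate * years) under continuous exponential growth."""
    return math.exp(rate_per_year * years)


# Hypothetical numbers, chosen only for illustration.
BASE_RATE = 0.05  # assumed yearly growth rate for the slower researcher
YEARS = 40        # assumed career length in years

slow = growth_factor(BASE_RATE, YEARS)
fast = growth_factor(2 * BASE_RATE, YEARS)

# Doubling the rate squares the growth factor, so the ratio between the
# two researchers keeps widening for as long as both keep working.
print(f"slow: {slow:.1f}x, fast: {fast:.1f}x, ratio: {fast / slow:.1f}x")
```

Under these assumptions the gap is not a constant factor of two: it is itself exponential in time, which is the sense in which a modest speed difference becomes overwhelming over a lifetime.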
Now, I understand that in your book, you're more interested in getting people who currently have little to no grasp of mathematics to get some idea of what it's really about. I think that is an admirable goal. But, all I can say is, good luck with that. As I said, I've been trying all my life to explain to other people what goes on in my head, and I really get a lot of blank stares. And this was the point of my 500 pound weight example. There are some things you simply cannot do until you have passed a certain threshold. Maybe the situation is not hopeless, but I do think it's quite a steep uphill battle.
I think there's a contradiction in your position worth examining.
You write: "Just because you have to work very, very hard to develop a gift doesn't mean that it isn't a gift."
But if something requires development, in what sense is it a gift? The word "gift" implies you receive it without earning it. "Develop" implies you build it through practice. These seem mutually exclusive.
Your weightlifting analogy assumes there's some cognitive equivalent to "lift 500 pounds" - some specific mental operation the rest of us physiologically cannot perform. But what is it? Can you name the actual cognitive move that greater minds employ that lesser minds simply cannot execute?
The 100m dash seems more apt precisely because everyone can run - just at different speeds and efficiency. We can all put one foot in front of the other and cross the finish line. The question becomes: why are some so much faster?
I think you're conflating energy spent thinking with effective thought. Not all cognitive effort is equal. Using your weightlifting analogy: it's the difference between lifting with impeccable form (force efficiently transferred along the vertical axis) versus sloppy form where most effort dissipates without moving the weight.
Same total energy expenditure, vastly different outcomes.
The critical variable isn't effort quantity but the specific policies employed when stuck. When Einstein or Grothendieck hit confusion, what did they do? Not in vague terms like "work hard" or "be curious," but as concrete cognitive actions: Did they switch representations? Generate examples? Draw diagrams? Test limits?
You mention trying to explain your stream of consciousness to students. That's valuable, but I suspect the crucial difference isn't what thoughts occur to you, but what you do when your initial approach fails. Do you persist in the same representation or immediately pivot? That's a trainable habit, not a genetic gift.
If Phelps's advantage were purely genetic, we'd expect his training methods to be useless for others. But swimmers who adopt elite training techniques do improve substantially - they just don't reach Phelps's level.
The question is whether the gap is from physiological limits (like bone structure or muscle fiber composition) or from uncopyable aspects of practice patterns that compound over decades.
For cognition, what's the equivalent of bone structure? What's the hard ceiling? I'm genuinely asking, because I don't see it clearly articulated in hereditarian arguments beyond vague appeals to "processing speed" or "working memory" - terms that aren't well-grounded in neuroscience and often just redescribe the performance gap rather than explaining it.
If someone gives you $10,000 as a gift, and then you turn it into a small business through hard work and shrewd investments, it was still a gift. This is a very common sense idea, absolutely no contradiction.
Your analogy assumes the existence of a measurable "$10,000 head start" - but what is it, specifically?
If cognitive advantage were primarily biological like your gift analogy suggests, we should be able to identify and measure it. The history of trying to find physical correlates of genius has been remarkably unsuccessful. Gauss's brain sat mislabeled in a jar for over a century because it was so unremarkable. Einstein's dissected brain showed no convincing peculiarities.
More tellingly: genius is almost always domain-specific. Von Neumann was transcendent in mathematics but ordinary at music. Feynman was brilliant at physics but struggled with homotopy groups. If the advantage were a general biological gift - faster processing, better working memory, superior neural hardware - why wouldn't it transfer across domains?
Yet when exceptional minds apply themselves outside their domain of expertise, they're often no better than average. How does your $10,000 gift explain that? Shouldn't superior hardware help everywhere?
The business analogy actually undermines your point. Yes, $10,000 helps - but there are countless people who start with that amount or more who never build successful businesses. Meanwhile, some build empires from $100. At what point does the initial capital become less important than the strategies employed?
If someone turns $10,000 into millions through "hard work and shrewd investments," those two factors - the specific practices, the decision-making patterns, the strategies that compound over time - seem far more explanatory than the initial gift. Remove the shrewd investments and hard work, and the $10,000 likely dwindles. Remove the initial $10,000 but keep the shrewd strategies, and success still seems probable, just delayed.
The question isn't whether some people have advantages. It's whether those advantages are the primary variable or whether the compounding effects of different practices over decades are doing most of the work. Your analogy seems to assume the former without establishing it.
Curiosity and a questioning attitude provide the fuel needed to relentlessly ponder a confusing topic. I can understand things in my 50s that put me to sleep in my 20s, even though I had an "interest" in learning them in my 20s. Back then, I didn't have the sufficient quality of curiosity that lights one's brain on fire. Also, not giving up and not expecting to understand in 5 minutes, an hour, a day, a week, a month or a year is also key. I wouldn't know how to teach the curiosity that I feel now.
There is no double standard here. The processes that make biceps grow in bodybuilders and the liver grow in alcoholics are well understood and have been demonstrated in mice, and you can give an average person larger biceps with proper medication and training, whereas finding people with high math ability depends on self-selection.
I don't think these processes are well-understood by the laypeople who make this inference... and the processes that drive neuroplasticity are reasonably well-understood by experts.
But of course in the latter case the processes are invisible to everyone—so in that sense you are correct.
With bodybuilding, there are *results*. Sure, cell biologists know something about neuroplasticity, but they can't turn a person with an IQ of 70 into a person with an IQ of 130. They can help in some edge cases, such as reducing the impact of a disease or trauma, while the bodybuilding industry improves the median case. Because we really do know some of the environmental factors, athletes are tested for doping. If we knew the factors behind math ability similarly well, we could expect brain competitions to have similarly scandalous stories. Just throwing in the word "neuroplasticity" doesn't prove anti-hereditarianism.
"Why do we constantly imagine things? Why do we experience this bizarre thing that people call a stream of consciousness? These capacities are metabolically expensive and, on the face of it, entirely reckless—we are constantly retraining our brains on our own hallucinatory slop."
As a person who has very recently got into Eastern/Buddhist meditation practices in my 50s, this really struck me! During undergrad, and my short stint in grad school, I wasted so much time on idle fantasy and I'm sure it got in my way. (I ended up becoming an actuary, but I'm rediscovering my interest in math now)
Thanks for the feedback! Your question is way too difficult as I don't think there is a generic answer that applies to most people. My sad hunch is that a large share of the population suffers from entrenched attention issues that result from years of not confronting the situation, which is usually a messy combination of:
- interiorized fear
- bad habits / doomscrolling [judging by my own difficulties & what I see around me, this is a fast-deteriorating major public health issue]
- latent depression
- lack of a strong emotional investment in a valuable intellectual endeavour
- lack of competent sparring partners
- lack of ambition / inability to see a credible professional growth path
- ...
Net net, I don't think people can improve their situation if it's not associated with concrete and credible personal goals, whether relational, professional, or intellectual.
People can get stuck in the idea that one-in-a-billion genius is genetically determined, so there's no point in trying to hack it. You have some pre-determined ceiling of capability. But most people are nowhere near approaching that ceiling. You won't become Einstein, but your own potential is its own unexplored territory.
Certainly. But look at what "just study harder" produces in South Korea: teenagers dying by suicide because they weren't admitted to prestigious universities, little free time, collapsing fertility... And there doesn't seem to be any significant increase in scientific output that would make it worth it.
I am reading you with extreme interest. I am coming to similar conclusions about [some forms of] mental illness. I am starting to believe (though, like you, I am a million miles away from being able to prove) that the question of exactly who becomes mentally ill, and how bad it gets, is going to turn out to depend in very small part on genetics, and in very small part on environment, and in very large part on a third thing which is habits (in particular, habits around paying attention). The driving force is good habits in the case of mathematics, but harmful habits in the case of [some forms of] mental illness. Changing bad habits to good habits can drag you out of [some forms of] mental illness.
I think that one important thing to notice about habits is that their effect is recursive and hence can grow out of control and result in all sorts of surprising things. Like cancer cells reproducing all out of proportion to other cell types -- or like the growth of twigs on a tree and other fractals found in nature -- recursion is an incredibly powerful idea. I'm not a math head at all, but I do take some inspiration from a couple of mathy sources. Every geek's favorite book "Gödel, Escher, Bach" stresses the importance of recursion, and so does another book I've been learning from recently, which is titled "The Computational Beauty of Nature" (Gary William Flake). I think this post of yours, like other things you have written, touches on this idea, but you don't seem to make much use of the words "recursion" or "recursive."
Recursion is probably important enough that it deserves to be understood as foundational to a lot of developmental processes. Instead of "nature or nurture?" perhaps someday people will ask "nature, nurture, or re-cur?"
Thank you Kent. Yes, I agree that the model’s natural scope includes most behavioral traits, including psychopathology. But it should probably be extended to account for neurotransmitter mix and the activity of non-neural brain cells: brain “habits” are likely to also be materialized in non-cognitive manners. This also matters in the cognitive aspects I’m discussing, but becomes even more relevant for psychopathology.
Conjecture 1 is undermined by conjecture 5. (1) posits that "genetic variability...cannot account for the magnitude of the observed cognitive inequality" but (5) provides a mechanism for how a small cognitive difference can account for a large inequality through compounding.
>we will never find a within-family polygenic score S such that the median value of S on the population P lies in the top centile for S with respect to the general population.
It could be simultaneously true that we don't have a good PGS and also that cloning a genius would produce a genius with some major probability (e.g. >50%, but even 5% would be a lot more than 1e-6).
GWAS has more parameters than samples. There are 3 billion base pairs in the human genome, and each base can be in one of 4 states. So even if you sampled the entire human population, you would not get enough samples for a simple RMS fit. And that's before even considering non-linearities. Yes, I am aware that some people claim that these effects don't exist.
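The counting argument above can be checked with back-of-the-envelope arithmetic. The sketch below takes the comment's figures at face value (3 billion base pairs, 4 states per base) and assumes one free parameter per non-reference state; the world population of roughly 8 billion is an outside assumption, and `free_parameters` is a hypothetical helper. It illustrates the counting only, not how actual studies choose which variants to model.

```python
# Figures from the comment above.
GENOME_BASE_PAIRS = 3_000_000_000
STATES_PER_BASE = 4

# Assumption not taken from the comment: roughly 8 billion people alive.
WORLD_POPULATION = 8_000_000_000


def free_parameters(sites, states):
    """Parameters for a purely additive model with one weight per
    non-reference state at each site (an assumed encoding)."""
    return sites * (states - 1)


params = free_parameters(GENOME_BASE_PAIRS, STATES_PER_BASE)

# A linear fit needs at least as many samples as free parameters.
underdetermined = params > WORLD_POPULATION
print(f"{params:,} parameters vs {WORLD_POPULATION:,} samples: "
      f"underdetermined = {underdetermined}")
```

Under this encoding even a purely additive model has more unknowns than there are people, before any interaction terms are added.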
Also, again, non-genetic doesn't mean non-biological. It could be some biological developmental randomness that we can't control rather than education methods.
Looks like you don't like the normalization of IQ scores. If someone is going to look for genetic or environmental factors that affect the trait, they are going to transform the distribution into one that resembles a bell curve. What is the problem here?
Re: sprinting analogy
Consider a species that is evolving powered flight (or de-evolving it). Suppose the trait is "how many meters one can fly from a tree of a given height". At some point there will be a state where most specimens glide for some distance and a few can travel very far: a non-Gaussian distribution of results, even while most of the organism's internals have Gaussian-like distributions.
It's definitely possible for individuals within the same species to differ more in some trait than some species differ from one another; after all, that is how evolution works.
Most New World primates are stuck in a state where only heterozygous females have trichromatic vision, while other females and all males are dichromatic. This is a very real-life-relevant, species-scale distinction, seen within a species.
Brilliant essay. I agree with your conclusion that "you cannot train your kid to become the next Einstein." However, there might be a subset of parents who can raise their children to at least a near-genius level if they have a genius level understanding of child development and the right conditions. I am thinking of people like Laszlo Polgar, perhaps Boris Sidis. I'm not saying their methods are free of controversy, but the impossibility of nurturing genius might be practical rather than fundamental.
It was suggested that the Polgars adopt a baby from a "disadvantaged" demographic, to show that the method works. They refused. There are no reproducible results.
You could make the same conjectures with musicians and non-musicians (and they would be more easily testable). To me, genetics and brain structure should only explain a small amount of the variance in cognitive inequality (for both musicians and mathematicians). Practice makes the master. And it’s easier to practice music than it is math =P
So if we want more math wizards, we need to learn how to practice it with the same accessibility as music.
Hi David, lovely post. It resonates deeply with something I've been working through - not just understanding how metacognitive habits differ, but actually changing them in practice.
Some months ago, I started thinking about the brain as fundamentally conservative, settling into stable attractor states. The brain can't run on nouns - it needs trigger → action → feedback. A stimulus comes in, the brain responds with a practiced move, and keeps going until it gets relief (prediction error drops). Most of our daily actions - maybe 95%+ - are repetitions of moves we've done before given the same triggers. You can see this vividly in something like football. Watch players dribbling and their gaits are nearly identical match to match. The brain finds movements that work and defaults to them because it's metabolically cheap.
The same thing happens with thinking. When students hit confusion, they cycle through remarkably predictable policies: re-read, ask "why don't I get this?", check the answer key, or skip it. The feedback loop closes - uncertainty gets temporarily resolved - and the brain marks this as successful even though nothing was actually learned.
Looking back at my own learning, I realized every time I got unstuck after prolonged struggle, it was after some pivot or action - trying a diagram, computing a case, explaining to myself. Never did wheel-spinning in the same mode resolve confusion.
What I also noticed was that often it took me extended periods of time to get unstuck - not because the problem was that hard, but because that's how long it took for desperation to build up enough to lower my threshold for seemingly absurd action. Action that then, to my surprise, actually resolved the confusion. But I'd never systematized this or looked at what preceded breakthroughs.
What I settled on was: detect confusion early and immediately force a representational shift. Switch from algebra to geometry, symbolic to numerical, abstract to physical - whatever gets you out of the current stuck state.
This maps directly onto your point about secondary stimuli. The difference between someone stuck and someone learning isn't the problem itself but what they do internally when stuck. Re-reading the same passage generates almost no new neural activity. Switching representational modes - visualizing geometrically, computing specific examples, explaining out loud - activates completely different pathways.
But here's the implementation problem: at peak uncertainty, when you're most confused, your capacity for novel action is lowest. This is exactly when the brain defaults to familiar ineffective responses. Knowing you should shift representations changes nothing.
What actually worked was externalizing the choice. I made cards with specific moves: "Compute Something," "Extreme Cases," "Remove One Part," "Reverse/Rotate/Swap," "Sit With Confusion," "Make Prediction." When stuck, I draw a card. This removes the hardest part - committing to action under uncertainty. Any shift away from the stuck state generates information.
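The card draw described above can be sketched in a few lines. This is only an illustration: the card names come from the comment, while the helper name `draw_card` and the choice of a uniform random pick are assumptions, not part of the original protocol.

```python
import random

# Card labels as listed in the comment above.
CARDS = [
    "Compute Something",
    "Extreme Cases",
    "Remove One Part",
    "Reverse/Rotate/Swap",
    "Sit With Confusion",
    "Make Prediction",
]


def draw_card(rng=None):
    """Return one card uniformly at random (hypothetical helper).

    Passing a seeded random.Random instance makes the draw reproducible.
    """
    rng = rng or random
    return rng.choice(CARDS)


if __name__ == "__main__":
    # When stuck: draw a card, then act on it without evaluating it first.
    print(draw_card())
```

The point of the sketch is that the choice is made outside your head: the draw does not try to pick the "best" move, it only guarantees some shift away from the stuck state.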
The second thing: permission to execute wrong. After some initial tries, I came to find there's usually this hesitation where you're evaluating "how exactly should I do this?" That evaluation overhead often kills the attempt. Slows you down. So instead: execute wrong on purpose! Don't know which numbers to compute? Use obviously wrong ones. Try the extreme case even if it seems absurd. I found wrong execution generates information fast - you learn why it's wrong in seconds, which reveals constraints and hidden structure.
I realized that confusion as a state is actually rather similar across different domains. The brain being what it is - a trigger-action-feedback system - if I could install a different state-action mapping at this inflection point, it should have outsized effects. If the new policy reduces prediction error better than the old one, it should outcompete it naturally over repetitions.
And it did. After some weeks of forced practice with the cards: detection latency dropped from 90 seconds to maybe 20-30, execution became nearly immediate, and subjectively I started feeling 'restless' when stuck and not shifting. The new policy's prediction-error-reduction is so much better that staying in one representation now feels 'wrong'. That restlessness is the experiential signature of attractor replacement.
What strikes me is the magnitude of improvement relative to effort. Problems I previously thought required checking textbooks or were just beyond me now resolve through sustained exploration. It takes longer than looking up answers, but I build actual understanding and the capability to do it again.
The old policy feels like doomscrolling - minimal cognitive load, no new information, just anxiety relief. The new policy feels like exercise - more effortful initially, but generating new neural activation with each attempt.
Your secondary stimuli insight is particularly apt here. Each representational shift isn't just "thinking harder" - it's generating different internal experience. Algebraic manipulation, geometric visualization, numerical calculation activate distinct neural substrates. High-frequency switching means high-diversity neural exploration, which should drive connectome reorganization much faster than passive re-reading.
Before this I was cognitively sedentary when stuck - burning mental energy on anxiety while generating minimal neural activity. Now there's constant motion: try this angle, doesn't work, try that, learn something, try another. The volume of distinct cognitive states explored per unit time has increased dramatically.
But I think the compounding goes deeper than just executing pivots more reliably. The brain being a pattern-matching machine, over time it comes to map certain kinds of cues - internal or external - with certain moves. It realizes some moves work better to resolve prediction error than others as it gathers experience. This is essentially what intuition is.
The increased information density per unit time, from this active learning strategy, means the brain gets vastly more data to identify underlying patterns. Each night when you sleep, the brain compresses this enormous amount of information, replays it, consolidates it. Over months and years, this might explain how people who naturally default to this stance appear to have better "smell" for what to do and when. It compounds exponentially.
It's really akin to what you described as going into "math mode" - and what you said elsewhere: "This bizarre, almost childish attitude is extremely hard to communicate to outsiders." That's exactly it. It's like a kid picking up a toy and figuring out all the funny ways they can play with it, learning about its properties in the process. The little moves are cheap, easy. If they don't work, it doesn't mean much - you just do something else.
I don't think any of this is particularly revolutionary from a pedagogy or neuroscience standpoint - I suspect it's implicit to how a lot of mathematicians and physicists actually work. But it was revolutionary to me personally, this shift in cognitive stance.
If intelligence lives in the connectome and connectomes reorganize in response to activity patterns, this high-frequency representational shifting should accelerate development. Not immediately, but compounded over months and years the divergence could be substantial.
I think what made this work wasn't just understanding the principle (intellectually) but the operationalization. The cards externalize the decision at exactly the moment when your brain is least capable of making it. Wrong execution removes the evaluative layer that causes hesitation. Together they bypass the exact bottlenecks that prevent people from using techniques they intellectually know about.
This seems to instantiate your conjectures directly. The cards operationalize a specific trainable habit at the exact inflection point that determines learning trajectories. Within weeks of deliberate practice, a completely different response pattern installed itself.
Motion precedes clarity. That's the stance you start to embody.
Being an optimist, I think you might be understating the potential magnitude. If the constraint on cognitive development is primarily metacognitive habits rather than genetic ceiling, and if simple protocols can shift those habits within weeks, the accessible improvement might be larger than the "20% full glass" suggests.
The protocol costs essentially nothing - some index cards and permission to execute badly - but it forces precisely the kind of "peculiar rumination techniques" you describe elite mathematicians practicing. It's a way to operationalize what you call quality of attention, but as concrete actions anyone can systematically train.
Most importantly: it's trainable in the sense of actually installable, not just intellectually understandable. You need a detectable trigger, an externalized action protocol, permission to execute imperfectly, and volume of practice. The policy that better reduces prediction error wins naturally. No willpower required once the initial pattern starts to dominate.
Thanks for articulating this framework so clearly. It's nice to see my experience map onto your educated guesses about secondary stimuli, metacognitive habits, and compounding neural differences.
This is so intriguing. I'm wondering about this passage: 'This is the fundamental reason why educational interventions so often fail to move the needle. While they deterministically alter the primary stimuli, their impact on the secondary stimuli is always indirect and contingent to uncontrolled factors.' Could you give an example of a 'secondary stimulus' to clarify it a bit? And of uncontrolled factors?
Thank you James for your feedback. I will edit this passage as it definitely deserves an example. Here is one:
When you read a book, the primary stimulus is the ink on the page; the secondary stimuli are the mental imagery and the train of thought that the primary stimulus prompts, which may linger for minutes, hours, days, or years.
I think you end up with an extremely reasonable position. Sometimes when you push back against hereditarianism, you seem a bit starry-eyed about what we can accomplish by sheer force of will. In the end, though, you seem to admit that a lot of things are out of our control.
Personally, I still think all of those quotes--Newton, Einstein, Feynman, Grothendieck--are either disingenuous or incredibly naive. My own experience as a mathematician has not made me any less frustrated with these quotes. Quite the contrary, really. It feels as if geniuses feel obliged to remind everyone that actually they work very hard, as if we didn't already know. But I thought everybody knew that, just as they do for any other kind of excellence. Michael Phelps also had to work extremely hard to win all of those gold medals, but that doesn't mean that the rest of us can do it too if we just follow the same physical fitness regime that he did. Just because you have to work very, very hard to develop a gift doesn't mean that it isn't a gift.
Sometimes I think you exaggerate the difference between intellectual and physical prowess. You use a 100 meter dash as an example, because we all agree that all of us could at least finish, even if we're pathetically slow. But there are other activities with thresholds. Lifting weights, for example. The vast majority of us will just never be able to lift 500 pounds, even if we are given a week to do it.
Despite my criticisms, I will say that the extremely valuable part of your book and this post is a sort of research program to try to understand how we might be better trainers of cognitive ability. You're right to point out that physical activities are much more straightforward to model. Cognitive patterns are much more hidden. As a professor I've tried to explain to my students, as far as I'm able, my actual stream of consciousness that occurs to me when I approach a problem. Sometimes this can be a bit frightening to students, probably because, in addition to being hidden, cognitive behavior can be highly idiosyncratic. Still, there's probably a lot to be gained from trying to figure out the common patterns in the cognition of highly effective intellectuals. I imagine neuroscience will play a big role in this.
Dear Jameson,
Glad I'm ending up with an extremely reasonable position—I do think I'm an extremely reasonable person :) !
On the 500 pounds example: the success metric for weightlifting is how much you can lift. I'm not an expert on the matter, but I do suspect most people could, with adequate training, lift 100 pounds or even 200 and possibly more. This feels nowhere near the common perception of the math talent gap.
On this topic, I think you're missing the essential notion of conceptual compression: how things that initially seem unfathomably hard often become trivial once you develop a familiarity with the right conceptual framework. One of my favorite examples is Hindu-Arabic numerals, through which you instantly "see" that 1,000,000,000 - 1 = 999,999,999, a computation that feels superhuman to someone who only knows Roman numerals (see this post based on a chapter from my book: https://davidbessis.substack.com/p/the-magic-of-mathematical-intuition).
There is no equivalent of conceptual compression for lifting 500 pounds and this is where, in my view, the analogy breaks down. Cognition isn't running or weightlifting, and this explains why insane inequalities can develop.
About what we can accomplish by sheer force of will — I am acutely aware of everyone's limits, yet I do think we should absolutely insist that people have a huge progression margin, because they do have one and often think they don't.
Maybe we had a different experience with mathematics. I do think most people are primarily blocked by their fears (and also by their misconceptions of what is actually at stake). This certainly applied to me, which may explain why I'm particularly adamant on the topic.
We certainly had a different experience with mathematics, which I think shapes a lot of the discussion (which is delightful, by the way).
My experience seems to be almost the opposite of yours in every way. I've known pretty much as long as I can remember that I loved mathematics, and I stood out in all of my classes from kindergarten onward. Far from having a fear of it, I almost found a sort of refuge in it. Perhaps I enjoyed it so much because I could see why things were true, without having to take my teachers' word for it. Other kids seemed to have the opposite experience. They couldn't see why any of it was true, they just learned rules. So I tried to explain it to them, but spent much of my childhood getting blank stares.
The difference between French and US education is a bit paradoxical. You would think that the French would have a much more egalitarian ethos, but when it comes to public schools, almost the opposite is true. We have nothing like "Classe Préparatoire," and on the whole I would say US public schools tend to focus on the median student, doing very little to foster exceptional talent. That was certainly my experience. So while I saw from a distance how much brilliant math students could do with exceptional training, I was left to teach myself, as you say many mathematicians do. I ended up in a pretty good university, but certainly had nothing like the boot camp that allows France to produce so many Fields Medalists.
Speaking of Fields Medalists, allow me to give a comparison from my own experience that will help explain why I hate those quotes by Newton and Einstein. My own research is in the same domain as P.-L. Lions (and maybe Cedric Villani and Alessio Figalli, to name two other Fields Medalists). Now, when I read a paper by Lions or hear his lecture, I can follow along--I know what he is doing. To use your metaphor, I can at least get to the same finish line as he does. But by the time I do, he has gone on to 10 other projects, which he will finish by the time I even start my own. I think I'm doing the same thing as he is, and I can grasp the same ideas on a similar intuitive level, but there is simply no way I can keep up with his speed. Saying I just need to work harder and I can be just like Lions or Villani or Figalli would not be encouraging, but rather soul-crushing. I bet many mathematicians just like me feel the same way.
I understand perfectly well that cognitive work is not like lifting weights, and that intellectual progress is a lot like capital accumulation--it can increase exponentially. But that's exactly my point with respect to my own experience. Even if Lions is actually only 2 or 3 times faster than I am, clearly over a lifetime this is going to yield an overwhelming difference in productivity. I cannot double my speed. There is no point in comparing myself with such giants. At this point, whether or not such capacities are "hereditary" becomes a mere technicality. That meme with the photo of von Neumann? Even if it is technically wrong because there is a lot more than genetics going on, it's still a very real pill that many of us have to swallow.
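To make the compounding point concrete, here is a rough sketch. It assumes, purely for illustration, that accumulated output grows exponentially at a rate proportional to working speed. The function names, the 5% base rate, and the 40-year career are hypothetical choices, not figures from this discussion.

```python
import math


def accumulated_output(speed, years, base_rate=0.05):
    """Toy model: output compounds continuously at rate base_rate * speed."""
    return math.exp(base_rate * speed * years)


def advantage(speed, years=40, base_rate=0.05):
    """Ratio of a faster worker's output to that of a speed-1 worker."""
    faster = accumulated_output(speed, years, base_rate)
    baseline = accumulated_output(1, years, base_rate)
    return faster / baseline


if __name__ == "__main__":
    for speed in (2, 3):
        print(f"{speed}x speed -> {advantage(speed):.1f}x accumulated output")
```

Under these assumed numbers, a 2x speed advantage compounds to roughly a 7x gap in accumulated output and a 3x advantage to roughly 55x, whereas a purely linear model would leave the gaps at 2x and 3x. The exact figures depend entirely on the assumed rate and time span.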
Now, I understand that in your book, you're more interested in getting people who currently have little to no grasp of mathematics to get some idea of what it's really about. I think that is an admirable goal. But, all I can say is, good luck with that. As I said, I've been trying all my life to explain to other people what goes on in my head, and I really get a lot of blank stares. And this was the point of my 500 pound weight example. There are some things you simply cannot do until you have passed a certain threshold. Maybe the situation is not hopeless, but I do think it's quite a steep uphill battle.
I think there's a contradiction in your position worth examining.
You write: "Just because you have to work very, very hard to develop a gift doesn't mean that it isn't a gift."
But if something requires development, in what sense is it a gift? The word "gift" implies you receive it without earning it. "Develop" implies you build it through practice. These seem mutually exclusive.
Your weightlifting analogy assumes there's some cognitive equivalent to "lift 500 pounds" - some specific mental operation the rest of us physiologically cannot perform. But what is it? Can you name the actual cognitive move that greater minds employ that lesser minds simply cannot execute?
The 100m dash seems more apt precisely because everyone can run - just at different speeds and efficiency. We can all put one foot in front of the other and cross the finish line. The question becomes: why are some so much faster?
I think you're conflating energy spent thinking with effective thought. Not all cognitive effort is equal. Using your weightlifting analogy: it's the difference between lifting with impeccable form (force efficiently transferred along the vertical axis) versus sloppy form where most effort dissipates without moving the weight.
Same total energy expenditure, vastly different outcomes.
The critical variable isn't effort quantity but the specific policies employed when stuck. When Einstein or Grothendieck hit confusion, what did they do? Not in vague terms like "work hard" or "be curious," but as concrete cognitive actions: Did they switch representations? Generate examples? Draw diagrams? Test limits?
You mention trying to explain your stream of consciousness to students. That's valuable, but I suspect the crucial difference isn't what thoughts occur to you, but what you do when your initial approach fails. Do you persist in the same representation or immediately pivot? That's a trainable habit, not a genetic gift.
If Phelps's advantage were purely genetic, we'd expect his training methods to be useless for others. But swimmers who adopt elite training techniques do improve substantially - they just don't reach Phelps's level.
The question is whether the gap is from physiological limits (like bone structure or muscle fiber composition) or from uncopyable aspects of practice patterns that compound over decades.
For cognition, what's the equivalent of bone structure? What's the hard ceiling? I'm genuinely asking, because I don't see it clearly articulated in hereditarian arguments beyond vague appeals to "processing speed" or "working memory" - terms that aren't well-grounded in neuroscience and often just redescribe the performance gap rather than explaining it.
If someone gives you $10,000 as a gift, and then you turn it into a small business through hard work and shrewd investments, it was still a gift. This is a very common sense idea, absolutely no contradiction.
Your analogy assumes the existence of a measurable "$10,000 head start" - but what is it, specifically?
If cognitive advantage were primarily biological like your gift analogy suggests, we should be able to identify and measure it. The history of trying to find physical correlates of genius has been remarkably unsuccessful. Gauss's brain sat mislabeled in a jar for over a century because it was so unremarkable. Einstein's dissected brain showed no convincing peculiarities.
More tellingly: genius is almost always domain-specific. Von Neumann was transcendent in mathematics but ordinary at music. Feynman was brilliant at physics but struggled with homotopy groups. If the advantage were a general biological gift - faster processing, better working memory, superior neural hardware - why wouldn't it transfer across domains?
Yet when exceptional minds apply themselves outside their domain of expertise, they're often no better than average. How does your $10,000 gift explain that? Shouldn't superior hardware help everywhere?
The business analogy actually undermines your point. Yes, $10,000 helps - but there are countless people who start with that amount or more who never build successful businesses. Meanwhile, some build empires from $100. At what point does the initial capital become less important than the strategies employed?
If someone turns $10,000 into millions through "hard work and shrewd investments," those two factors - the specific practices, the decision-making patterns, the strategies that compound over time - seem far more explanatory than the initial gift. Remove the shrewd investments and hard work, and the $10,000 likely dwindles. Remove the initial $10,000 but keep the shrewd strategies, and success still seems probable, just delayed.
The question isn't whether some people have advantages. It's whether those advantages are the primary variable or whether the compounding effects of different practices over decades are doing most of the work. Your analogy seems to assume the former without establishing it.
Curiosity and a questioning attitude provide the fuel needed to relentlessly ponder a confusing topic. I can understand things in my 50s that put me to sleep in my 20s, even though I had an "interest" in learning them in my 20s. Back then, I didn't have the sufficient quality of curiosity that lights one's brain on fire. Also, not giving up and not expecting to understand in 5 minutes, an hour, a day, a week, a month or a year is also key. I wouldn't know how to teach the curiosity that I feel now.
I could have written every word of this! (including the ages)
Excellent!
Metacognitive emotional attitudes influencing development over time is the driver.
It’s great that you made this intuition a bit more concrete.
There is no double standard here. The processes that make biceps grow in bodybuilders and livers grow in alcoholics are well understood and have been demonstrated in mice, and you can give an average person larger biceps with proper medication and training, whereas getting people with high math ability still depends on self-selection.
I don't think these processes are well-understood by the laypeople who make this inference... and the processes that drive neuroplasticity are reasonably well-understood by experts.
But of course in the latter case the processes are invisible to everyone—so in that sense you are correct.
With bodybuilding, there are *results*. Sure, cell biologists know something about neuroplasticity, but they can't turn a person with an IQ of 70 into a person with an IQ of 130. They can help in some edge cases, such as reducing the impact of a disease or trauma, while the bodybuilding industry improves the median case. Because we really do know some of the environmental factors, athletes are tested for doping. If we knew the factors behind math ability similarly well, we could expect brain competitions to have similarly scandalous stories. Just throwing in the word "neuroplasticity" doesn't prove anti-hereditarianism.
"Why do we constantly imagine things? Why do we experience this bizarre thing that people call a stream of consciousness? These capacities are metabolically expensive and, on the face of it, entirely reckless—we are constantly retraining our brains on our own hallucinatory slop."
As a person who has very recently got into Eastern/Buddhist meditation practices in my 50s, this really struck me! During undergrad, and my short stint in grad school, I wasted so much time on idle fantasy and I'm sure it got in my way. (I ended up becoming an actuary, but I'm rediscovering my interest in math now)
Thank you for the great post.
„our attention, the focus of our curiosity, how we navigate our stream of consciousness—may matter more than we ever dreamed.“
What do you think are the most effective ways to train our attention and make best use of it or does it vary a lot person by person?
Thanks for the feedback! Your question is way too difficult as I don't think there is a generic answer that applies to most people. My sad hunch is that a large share of the population suffers from entrenched attention issues that result from years of not confronting the situation, which is usually a messy combination of:
- interiorized fear
- bad habits / doomscrolling [judging by my own difficulties & what I see around me, this is a fast-deteriorating major public health issue]
- latent depression
- lack of a strong emotional investment in a valuable intellectual endeavour
- lack of competent sparring partners
- lack of ambition / inability to see a credible professional growth path
- ...
Net net, I don't think people can improve their situation if it's not associated with concrete and credible personal goals, whether relational, professional, or intellectual.
People can get stuck in the idea that one-in-a-billion genius is genetically determined, so there's no point in trying to hack it. You have some pre-determined ceiling of capability. But most people are nowhere near approaching that ceiling. You won't become Einstein, but your own potential is its own unexplored territory.
Certainly. But look at what "just study harder" produces in South Korea: teenagers committing suicide because they weren't admitted to prestigious universities, little free time, collapsing fertility... And there doesn't seem to be any significant increase in scientific output that would make it worth it.
Hi David,
I am reading you with extreme interest. I am coming to similar conclusions about [some forms of] mental illness. I am starting to believe (though, like you, I am a million miles away from being able to prove) that the question of exactly who becomes mentally ill, and how bad it gets, is going to turn out to depend in very small part on genetics, and in very small part on environment, and in very large part on a third thing which is habits (in particular, habits around paying attention). The driving force is good habits in the case of mathematics, but harmful habits in the case of [some forms of] mental illness. Changing bad habits to good habits can drag you out of [some forms of] mental illness.
I think that one important thing to notice about habits is that their effect is recursive and hence can grow out of control and result in all sorts of surprising things. Like cancer cells reproducing all out of proportion to other cell types -- or like the growth of twigs on a tree and other fractals found in nature -- recursion is an incredibly powerful idea. I'm not a math head at all, but I do take some inspiration from a couple of mathy sources. Every geek's favorite book "Godel Escher Bach" stresses the importance of recursion, and so does another book I've been learning from recently, which is titled "The Computational Beauty of Nature" (Gary William Flake). I think this post of yours, like other things you have written, touches on this idea, but you don't seem to make much use of the words "recursion" or "recursive."
Recursion is probably important enough that it deserves to be understood as foundational to a lot of developmental processes. Instead of "nature or nurture?" perhaps someday people will ask "nature, nurture, or re-cur?"
Thanks for continuing to write on this stuff.
--Kent
Thank you Kent. Yes, I agree that the model’s natural scope includes most behavioral traits, including psychopathology. But it should probably be extended to account for the neurotransmitter mix and the activity of non-neural brain cells: brain “habits” are likely to also be materialized in non-cognitive ways. This also matters in the cognitive aspects I’m discussing, but becomes even more relevant for psychopathology.
Conjecture 1 is undermined by conjecture 5. (1) posits that "genetic variability...cannot account for the magnitude of the observed cognitive inequality" but (5) provides a mechanism for how a small cognitive difference can account for a large inequality through compounding.
Ummm, nope.
https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2011.00246/full
and
https://psycnet.apa.org/record/2003-04002-000
Yup.
>we will never find a within-family polygenic score S such that the median value of S on the population P lies in the top centile for S with respect to the general population.
It could simultaneously be true that we don't have a good PGS and also that cloning a genius would produce a genius with some substantial probability (e.g. >50%, but even 5% would be a lot more than 1e-6).
GWAS has more parameters than samples. There are 3 billion base pairs in the human genome, and each base can be in one of 4 states. So even if you sampled the entire human population, you would not get enough samples for a simple least-squares (RMS) fit. And that's before counting non-linearities. Yes, I am aware that some people claim that these effects don't exist.
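The counting argument can be checked with back-of-the-envelope arithmetic. The sketch below assumes a world population of roughly 8 billion and a simple additive model. The population figure, the helper names, and the two ways of counting parameters are illustrative assumptions, not something stated in the comment.

```python
GENOME_BASE_PAIRS = 3_000_000_000  # figure quoted in the comment
STATES_PER_BASE = 4                # A, C, G, T
WORLD_POPULATION = 8_000_000_000   # rough assumption, not from the comment


def parameter_count(per_state):
    """Count coefficients of a hypothetical additive model.

    per_state=True:  one coefficient per non-reference state at each position.
    per_state=False: a single coefficient per position.
    """
    if per_state:
        return GENOME_BASE_PAIRS * (STATES_PER_BASE - 1)
    return GENOME_BASE_PAIRS


def is_underdetermined(n_params, n_samples):
    """A linear fit has no unique solution when parameters outnumber samples."""
    return n_params > n_samples


if __name__ == "__main__":
    for per_state in (True, False):
        n = parameter_count(per_state)
        print(per_state, n, is_underdetermined(n, WORLD_POPULATION))
```

Under these assumptions, the per-state count (9 billion coefficients) exceeds the number of people who could be sampled, while one coefficient per position (3 billion) does not. So the conclusion depends on how parameters are counted, even before any non-linear terms are added.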
Also, again, non-genetic doesn't mean non-biological. It could be some biological developmental randomness that we can't control rather than education methods.
Looks like you don't like the normalization of IQ scores. If someone is going to look for genetic or environmental factors that affect the trait, they are going to transform the distribution into one that resembles a bell curve. What is the problem here?
Re: sprinting analogy
Consider a species that is evolving powered flight (or de-evolving it). Suppose the trait is "how many meters one can fly from a tree of a given height". At some point there will be a state where most specimens glide for some distance and a few can travel very far: a non-Gaussian distribution of results, even while most of the organism's innards have Gaussian-like distributions.
It's definitely possible for some individuals of the same species to differ more in some trait than some species do; after all, that is how evolution works.
Most New World primates are stuck in a state where only heterozygous females have trichromatic vision, while other females and all males are dichromatic. This is a very real-life-relevant, species-scale distinction, seen within a species.
Brilliant essay. I agree with your conclusion that "you cannot train your kid to become the next Einstein." However, there might be a subset of parents who can raise their children to at least a near-genius level if they have a genius level understanding of child development and the right conditions. I am thinking of people like Laszlo Polgar, perhaps Boris Sidis. I'm not saying their methods are free of controversy, but the impossibility of nurturing genius might be practical rather than fundamental.
It was suggested that the Polgars adopt a baby from a "disadvantaged" demographic, to show that the method works. They refused. There are no reproducible results.
Neuroscientist here:
You could make the same conjectures with musicians and non-musicians (and they would be more easily testable). To me, genetics and brain structure should only explain a small amount of the variance in cognitive inequality (for both musicians and mathematicians). Practice makes the master. And it’s easier to practice music than it is math =P
So if we want more math wizards, we need to learn how to practice it with the same accessibility as music.