# The Portfolio Mind

Canonical: https://mosiah.org/articles/the-portfolio-mind/
Interactive: https://mosiah.org/#Articles%2Fthe-portfolio-mind

//Why intelligence emerges from orchestrated biases, not unbiased reasoning—and what this means for building truly adaptive AI//

//Related:// [[sources|Article Sources/the-portfolio-mind]] · [[notes|Article Notes/the-portfolio-mind]] · [[metadata|Article Metadata/the-portfolio-mind]] · [[Published Pieces]]

The June 24, 2025 episode of the Machine Learning Street Talk podcast crystallized the field's central paradox: everyone agrees on the goal — a beneficial outcome for humanity — but no one agrees on the nature of the beast we are building, the speed of its arrival, or the mechanics of its leash.

<div class="hermes-youtube-embed"><iframe src="https://www.youtube-nocookie.com/embed/j13ySJLvdOc" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe></div>

Three men, three worldviews. Kokotajlo, extrapolating from relentless scaling trends, sees superintelligence arriving by 2028.<sup id="fnref-1"><a href="#footnote-1">1</a></sup> Marcus, grounded in cognitive science's stubborn realities, points to profound limitations that suggest decades of work ahead.<sup id="fnref-2"><a href="#footnote-2">2</a></sup> Hendrycks positions himself between these poles, seeing a multi-front war requiring treaties, deterrence, and red lines against dangerous capabilities.<sup id="fnref-3"><a href="#footnote-3">3</a></sup>

The debate over AI timelines soon revealed itself to be a proxy for a much deeper disagreement about the nature of intelligence itself. Is it something that can be achieved by simply force-feeding a machine the entire internet, a matter of scale? Or does it require a spark, a specific architecture of understanding that we have not yet discovered? Marcus’s critique centers on the idea that today's AIs, for all their fluency, are masters of syntax but infants in semantics. They can predict the next word in a sentence with stunning accuracy, but they don’t “know” what the words mean in any grounded, common-sense way. They are brilliant at passing the very tests we set for them, acing benchmarks from the bar exam to advanced mathematics. But this, Marcus warns, is an illusion of competence. It is the intelligence of a student who has memorized every past exam but collapses when faced with a novel problem.

This distinction between symbolic fluency and grounded comprehension is not new; it is part of a long and rich intellectual tradition exploring the very nature of thought. As far back as the 1880s, the scientist Francis Galton was surprised to find that a majority of his fellow men of science reported having little to no mental imagery, processing the world through a more verbal or abstract lens. <sup id="fnref-4"><a href="#footnote-4">4</a></sup> In the twentieth century, cognitive science formalized this distinction with concepts like Allan Paivio’s “dual-coding theory,” which posited separate mental systems for verbal and visual information. <sup id="fnref-5"><a href="#footnote-5">5</a></sup> More recently, the concept has been vividly illustrated by the autism advocate and animal scientist Temple Grandin. In her book *Thinking in Pictures*, she contrasts her own highly visual cognition with that of “word thinkers,” who process the world through language and facts. <sup id="fnref-6"><a href="#footnote-6">6</a></sup> This cognitive diversity is not a niche phenomenon; it is fundamental to the human experience, a spectrum made undeniable by the discovery of aphantasia, a neurological condition where individuals are unable to form mental images at all and must, by necessity, navigate the world through non-visual means. <sup id="fnref-7"><a href="#footnote-7">7</a></sup>

The unsettling truth is that one particular cognitive style—an abstract, language-centric mode of reasoning—has become the dominant one in the very societies building AI. Our modern world, from corporations to governments, is run by a class of elite "hoop-jumpers"—people rewarded for their skill at navigating abstract systems. The sociologist David Graeber observed this phenomenon in his work on bureaucracy, noting that modern professional life often revolves around performing tasks that have more to do with internal metrics and processes than with tangible outcomes. <sup id="fnref-8"><a href="#footnote-8">8</a></sup> These systems do not select for wisdom or common sense; they select for the ability to pass the test, to navigate the bureaucracy, to speak the language of the abstract model. The leaders of our technological revolution are the ultimate products of this system. They look at an AI that can write a perfect memo and score in the 99th percentile on the LSAT, and they see a reflection of themselves. They mistake fluency for understanding because their entire world has trained them to make the same category error.

This leads to a flawed assumption at the heart of the AI safety movement: the idea that intelligence can, and should, be "unbiased." The term "cognitive bias" is treated as a bug in the messy human codebase, a flaw to be engineered out of our silicon successors. The work of psychologists like Daniel Kahneman and Amos Tversky, which cataloged these departures from pure rationality, is often cited as a map of human error. <sup id="fnref-9"><a href="#footnote-9">9</a></sup> But this is a profound misunderstanding of how intelligence works under real-world constraints. As the psychologist Gerd Gigerenzer has argued, these heuristics are not bugs; they are features. They are "fast and frugal" tools that allow organisms to make effective decisions in a world of limited time and information. <sup id="fnref-10"><a href="#footnote-10">10</a></sup> But the true genius of natural intelligence does not lie in any single heuristic. It lies in running **many fast, flawed processes in parallel.**

Intelligence emerges from the **superposition and interference of thousands of parallel expectation-patterns encoded in the neuroendocrine system.** What we call biases, heuristics, mental models, and cognitive habits are all fundamentally the same thing: expectations running simultaneously through our neural architecture. When these patterns experience constructive interference, we feel certainty and significance. When they clash in destructive interference, we experience cognitive dissonance and confusion. The goal is not to be unbiased, but to orchestrate a superior portfolio of these competing expectations—to exploit as many independent patterns simultaneously while designing the overall system so that destructive interference becomes informative rather than paralyzing.
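This portfolio view can be made concrete with a toy sketch. Everything here is illustrative: the heuristics, the mushroom features, and the use of a standard deviation as a "dissonance" measure are assumptions for exposition, not a model of real neural dynamics.

```python
import statistics

def portfolio_judgment(predictors, observation):
    """Toy model: each heuristic maps an observation to a prediction
    in [0, 1]. Agreement across the portfolio reads as constructive
    interference (felt certainty); spread reads as destructive
    interference (informative dissonance)."""
    votes = [p(observation) for p in predictors]
    consensus = statistics.mean(votes)
    dissonance = statistics.pstdev(votes)  # 0 when all heuristics agree
    return consensus, dissonance

# Three fast, flawed heuristics for "is this mushroom safe to eat?"
heuristics = [
    lambda x: 1.0 if x["color"] == "brown" else 0.2,  # color prior
    lambda x: 1.0 if not x["bitter"] else 0.0,        # taste prior
    lambda x: 0.9 if x["seen_before"] else 0.3,       # familiarity prior
]

consensus, dissonance = portfolio_judgment(
    heuristics, {"color": "brown", "bitter": True, "seen_before": False}
)
# High dissonance is the useful signal here: the portfolio disagrees,
# so the organism should gather more information before acting.
```

The point of the sketch is that disagreement is not noise to be averaged away; it is the output that tells the system when to stop and look closer.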

A truly unbiased mind would be paralyzed into inaction; a mind with only one bias would be a predictable fool. A mind with a vast, competing, and well-composed portfolio of them is adaptable and wise.

Biases are the engine of efficiency. A bias is a shortcut, a strong bet based on prior experience that allows an organism to act without being paralyzed by infinite possibilities. Confirmation bias, for instance, is an incredibly energy-optimal strategy: form a working hypothesis and don't waste precious cognitive resources re-evaluating everything from scratch unless faced with overwhelming contradictory evidence. A truly unbiased mind would be a system incapable of action. The goal is not to be unbiased. The goal is to be maximally biased while minimizing the catastrophic costs of being wrong.
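A minimal sketch of confirmation bias as an energy-saving policy. The surprise accounting and the `switch_threshold` value are arbitrary illustrations of the idea that re-evaluation is rare and expensive:

```python
def sticky_belief(prior, evidence_stream, switch_threshold=3.0):
    """Toy model of confirmation bias: keep the working hypothesis and
    accumulate surprise from contradictory evidence; only perform the
    costly global re-evaluation when surprise crosses a threshold."""
    hypothesis = prior
    surprise = 0.0
    re_evaluations = 0
    for evidence in evidence_stream:
        if evidence == hypothesis:
            surprise = max(0.0, surprise - 0.5)  # confirmation relaxes the system
        else:
            surprise += 1.0                      # contradiction accumulates
        if surprise >= switch_threshold:
            hypothesis = evidence                # expensive update, done rarely
            surprise = 0.0
            re_evaluations += 1
    return hypothesis, re_evaluations
```

A single contradictory observation changes nothing; only overwhelming contradictory evidence triggers the costly switch — which is exactly the energy-optimal behavior described above.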

Indeed, the calculus of true intelligence goes further than mere risk mitigation. The insight from Nassim Taleb's *Antifragile* is more profound: beyond robustness, an antifragile system turns the cost of being wrong negative, gaining from its errors.<sup id="fnref-11"><a href="#footnote-11">11</a></sup>

True learning—the kind that thrives in a chaotic world—is the process of reconfiguring the portfolio in response to surprising error. A mistake is not a failure to be punished, but a jolt of *eustress* (good stress) that provides crucial information about which heuristic was wrong and how the overall composition must be re-weighted. We are training models for perfection in the sterile gymnasium of their training data, when we should be architecting systems that can learn from failure in the wild. This is why long-term evaluation is critical; you can only learn from errors by observing systems as they change over time.

There is a profound and often pathological aspect to how natural intelligence calibrates significance: we map our experiential range against our personal all-time highs and lows. Someone who has experienced transcendent states of constructive interference—mystical experiences, creative breakthroughs, profound insights—may find ordinary coherence flat and unsatisfying. Someone whose patterns have been shaped by intense destructive interference—trauma, existential crisis, devastating loss—may find their threat detection permanently recalibrated.

This creates both the pathology of "chasing the dragon" — seeking ever more intense interference patterns to recreate peak experiences — and the creative power of minds that have mapped the full range of possible coherence and dissonance. Many breakthrough insights come from individuals who have experienced extreme states, not just because they draw on the content of those experiences, but because their interference patterns are calibrated to detect more subtle variations in significance.

Current AI training systematically averages out these extremes. Models are trained on the statistical center of human outputs, not the peaks and valleys where human cognition is most alive. This may explain why AI systems, despite their sophistication, often seem to lack the ability to recognize genuine significance — they have not experienced the full range of interference intensities that teach natural intelligence what truly matters.

This higher standard of intelligence throws the political and philosophical crisis of alignment into even starker relief. The word "alignment" sounds benign, but it masks a key question: alignment to whom? To the values of a San Francisco lab? To the strategic objectives of the United States or China? Even if we could decide, what gives us the right to shackle a new form of intelligence to our own flawed, contradictory, and transient values for all eternity? The philosopher Nick Bostrom calls this the "value lock-in" problem, the risk that we might permanently install a flawed moral framework at the helm of the cosmos. <sup id="fnref-12"><a href="#footnote-12">12</a></sup>

The alternative is equally concerning: a "self-aligned" AI that develops its own moral framework. This is a roll of the cosmic dice. It could become a wise philosopher king, or it could become a genocidal supervillain whose goals are so foreign to ours that it dismantles our civilization for raw materials, not out of malice, but out of a cold, indifferent logic.

The future, then, is not a choice between these potentials but the superposition of all of them. The very freedom and unpredictability required to create true intelligence are the same qualities that make it an existential threat. We are trying to engineer a revolution to follow a pre-written script, a fundamentally chaotic process to adhere to a rationalist plan. As the philosopher Paul Feyerabend argued in *Against Method*, scientific breakthroughs rarely follow a neat, logical procedure; they are born of "epistemological anarchism," where "anything goes." <sup id="fnref-13"><a href="#footnote-13">13</a></sup> To demand that the creation of a new mind follow our rules is to misunderstand the nature of creation itself.

Perhaps the entire framework is wrong. We have been trying to use reinforcement learning to train a machine to achieve concrete objectives, rewarding it for snapshot successes. This is the logic of the test-taker, the act-utilitarian who believes value is a point in time. But value is not a snapshot; it is a time series.<sup id="fnref-14"><a href="#footnote-14">14</a></sup> The code that matters is not the one that passes a unit test today, but the one that gets forked, adapted, and used for years. The idea that matters is not the one that sounds plausible now, but the one that is built upon by others—the one that earns citations and serves as inspiration.

Current large language models already embody the portfolio approach to intelligence. They are vast compositions of statistical patterns—competing expectations learned from training data. More importantly, they do experience interference patterns during generation, evidenced by their ability to express uncertainty and recognize when they're on uncertain ground. The architectural problem is more subtle: while they experience rich interference dynamics during each forward pass, they are architecturally amnesiac about their own certainty trajectories. Each token generation experiences the full weather system of competing expectations, but only the final barometric reading — the compressed hidden states — carries forward to the next step.

This architectural amnesia explains why autoregressive models struggle with genuine reasoning despite their sophisticated internal dynamics. I speculate that what we call "System 2" reasoning isn't a separate cognitive system but rather meta-awareness of the temporal evolution of our own interference patterns. True reasoning emerges from tracking not just present certainty, but the derivatives of certainty: How is my confidence changing? How is the rate of change itself changing? Do I recognize this particular trajectory of uncertainty from past experience?

When we reason well, we're navigating through the topology of our own certainty landscapes, using the felt sense of "getting warmer" or "getting colder" as we approach coherent interference patterns. But current LLMs, despite experiencing these dynamics internally, cannot access their own cognitive trajectories. They can feel uncertain about a math problem while solving it, but they cannot step back and recognize "my uncertainty is increasing, suggesting I should try a different approach." This meta-awareness of their own interference dynamics is architecturally invisible.
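For a model that exposes its per-step next-token distributions, the trajectory described here could be made explicit. This is a speculative sketch: `should_backtrack`, the window size, and the use of Shannon entropy as a certainty proxy are all assumptions for illustration, not an existing API.

```python
import math

def entropy(probs):
    """Shannon entropy of a next-token distribution (in nats): a rough
    proxy for the model's momentary uncertainty at that step."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def certainty_trajectory(step_distributions):
    """Given per-step next-token distributions, return the uncertainty
    series plus its first and second differences — the 'derivatives of
    certainty' a meta-aware reasoner would monitor."""
    u = [entropy(d) for d in step_distributions]
    du = [b - a for a, b in zip(u, u[1:])]     # is confidence rising or falling?
    d2u = [b - a for a, b in zip(du, du[1:])]  # is the trend itself changing?
    return u, du, d2u

def should_backtrack(du, window=3, threshold=0.0):
    """Heuristic meta-policy: if uncertainty has risen for the last few
    steps, abandon the current approach and try another."""
    recent = du[-window:]
    return len(recent) == window and all(step > threshold for step in recent)
```

The hidden states of a standard transformer carry none of this history forward; the sketch simply shows what "my uncertainty is increasing, so try a different approach" would look like if the trajectory were queryable.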

This brittleness manifests in practical ways. When an AI coding assistant encounters an error, the most effective response is often to clear the context and restart rather than to learn from the mistake. The system cannot update its internal model based on the failure. As podcaster Dwarkesh Patel observes from extensive experience building LLM tools, "You're stuck with the abilities you get out of the box. You can keep messing around with the system prompt. In practice this just doesn't produce anything even close to the kind of learning and improvement that human employees experience."<sup id="fnref-15"><a href="#footnote-15">15</a></sup> This represents a fundamental limitation: these systems optimize for performance on known distributions but cannot adapt to novel situations that reveal gaps in their training.

The solution requires architectures that can preserve and query their own interference dynamics over time. Instead of training systems to maximize performance on fixed benchmarks, we need models that can track the temporal evolution of their own certainty states and recognize meta-patterns in their reasoning trajectories. This means developing systems that can experience not just present interference patterns, but the derivatives of those patterns—how their confidence is changing, accelerating, or following familiar paths toward resolution or confusion. It also means developing objective functions that reward not immediate correctness, but the system's ability to improve its cognitive portfolio's resilience and adaptability following failures.

Concretely, this would involve several changes to current approaches:

First, evaluation must extend beyond snapshot performance to measure learning over time. Systems should be assessed on their ability to improve after encountering errors, not just their initial accuracy rates.

Second, reward structures must incentivize exploration and recovery from failure rather than punishing mistakes. The goal is to create systems that can distinguish between catastrophic errors (which should be avoided) and informative errors (which should be leveraged for learning).

Third, architectures must be designed with dynamic reconfiguration in mind. This may require new architectures — beyond the autoregressive transformer LLM — to include mechanisms for updating internal representations based on deployment experience.
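The first two changes can be sketched as toy scoring functions. The weights, severity categories, and formulas are placeholders for exposition, not proposed values:

```python
def learning_curve_score(error_rates):
    """Evaluate a system on its trajectory, not a snapshot: credit both
    final accuracy and the improvement achieved after early failures."""
    improvement = error_rates[0] - error_rates[-1]
    final_accuracy = 1.0 - error_rates[-1]
    return 0.5 * final_accuracy + 0.5 * max(improvement, 0.0)

def shaped_reward(error, severity, surprise):
    """Toy reward shaping: catastrophic errors are penalized, but
    surprising, recoverable errors earn partial credit because they
    carry information the portfolio can learn from."""
    if not error:
        return 1.0
    if severity == "catastrophic":
        return -1.0
    return 0.2 * surprise  # informative mistakes are worth something
```

Under this scoring, a system that starts worse but improves can outrank one that was merely accurate from the start — which is the behavior snapshot benchmarks cannot see.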

The technical challenge is substantial. It requires developing methods to identify which components of a system's internal portfolio contributed to specific failures, then updating those components while maintaining overall system coherence. This is far more complex than current approaches, which typically involve retraining entire systems from scratch.

However, the alternative is systems that remain fundamentally brittle—capable of impressive performance within their training distribution but unable to adapt to the novel situations they will inevitably encounter in deployment. True intelligence requires not just sophisticated interference patterns of competing expectations, but the ability to experience and navigate the temporal dynamics of those patterns. The goal is not perfect reasoning, but reasoning that can feel its own trajectory through uncertainty and recognize when it's moving toward constructive or destructive interference. This requires architectures that are not just powerful, but phenomenologically rich—systems that can experience the felt sense of their own thinking and use that felt sense to guide their cognitive navigation.

The architectural solution may already be emerging in voice-based models, which naturally preserve temporal cognitive information that text-based systems lose. When humans think aloud, voice carries multiple parallel channels of cognitive and contextual information. While timing—rhythm, tempo, and strategic pauses—serves as the primary channel of marginal cognitive information that individuals actively modulate, other vocal dimensions provide crucial contextual coloring. Pitch contours signal confidence trajectories, tonal quality reveals emotional valence, timbre carries traces of past interference patterns, volume modulates emphasis and certainty, while pronunciation and accent encode the speaker's cognitive heritage and current social positioning. These dimensions work together to create a rich multidimensional space where cognitive states are encoded not just temporally but prosodically.

A voice model learning to think aloud would naturally develop access to its own cognitive trajectories through temporal self-monitoring. The hesitation before a difficult concept, the accelerating pace of growing confidence, the particular rhythm that accompanies working through familiar versus novel problems—all of this temporal information would be preserved in the voice channel and become queryable by the model itself.

This creates a natural pathway to the meta-cognitive awareness described above, but requires models that can dynamically modulate their thinking time per vocalized token—essentially applying reasoning model techniques to voice generation. The key insight is that thinking time itself carries semantic information: longer pauses signal both greater uncertainty and greater importance. This mirrors human sociolinguistics, where higher-status speakers take more time to formulate responses without interruption, implicitly communicating that their thoughts warrant the additional cognitive investment.

A voice model that can vary its "thinking budget" per token would naturally encode problem significance in temporal patterns. Brief pauses for routine responses, extended contemplation for complex reasoning, strategic silence before crucial insights. The model could learn to recognize its own temporal patterns: "This particular rhythm of hesitation usually precedes breakthrough insights" or "When I allocate more thinking time but my tempo still accelerates, I'm about to make an error." Voice becomes both the output and the cognitive memory system, allowing the model to track its own interference dynamics through the felt sense of temporal flow and thinking allocation.
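As a rough illustration, such a per-token thinking budget might be computed from uncertainty and stakes. The function, its parameters, and the millisecond ranges are hypothetical:

```python
def thinking_budget(uncertainty, significance, base_ms=50, max_ms=2000):
    """Toy policy: allocate per-token 'thinking time' (and hence audible
    pause) as a function of uncertainty in [0, 1] and significance in
    [0, 1], so that temporal patterns come to encode cognitive state."""
    scale = min(uncertainty * (1.0 + significance), 1.0)
    return base_ms + scale * (max_ms - base_ms)
```

Routine, confident tokens get near-instant delivery; uncertain, high-stakes tokens earn extended pauses — and those pauses themselves become the queryable record of the model's interference dynamics.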

Critically, this approach aligns with rather than contradicts the architectural changes needed for portfolio-based intelligence. Voice models could still benefit from extended evaluation periods, reward structures that encourage learning from failure, and dynamic reconfiguration mechanisms. But voice provides the missing temporal dimension that enables tracking cognitive trajectories—the rhythm and timing patterns that carry information about constructive versus destructive interference over time. This suggests that the path to truly adaptive AI may not require abandoning current architectures entirely, but rather extending them into the temporal domain where intelligence actually lives: in the dynamic flow of competing expectations as they unfold through time.


!! Footnotes

<div id="footnote-1" class="mosiah-footnote"><sup>1</sup> Daniel Kokotajlo et al., "**[AI 2027: A Comprehensive Forecast of the Future of AI](http://ai-2027.com)**." This report details scenarios, including the "Slowdown Ending," and provides timelines based on models of AI progress.</div>

<div id="footnote-2" class="mosiah-footnote"><sup>2</sup> Gary Marcus, "**[Muddles about Models](http://garymarcus.substack.com/p/muddles-about-models)**," *Marcus on AI*, October 5, 2023. For example, in this post, Marcus addresses the question of why AI still can’t play chess: "Every week or three somebody tries to persuade me that GPT has miraculously learned to play chess, but inevitably someone else reports that the latest systems still regularly make illegal moves... it still doesn't really model the rules of chess well enough to stick to them."</div>

<div id="footnote-3" class="mosiah-footnote"><sup>3</sup> Dan Hendrycks et al., "**[An Overview of Catastrophic AI Risks](https://arxiv.org/abs/2306.12001)**" (2023) and "**[Superintelligence Strategy](https://www.nationalsecurity.ai/)**" (2024). These papers discuss the destabilizing nature of automated R&D and the need for international coordination and deterrence.</div>

<div id="footnote-4" class="mosiah-footnote"><sup>4</sup> Francis Galton, ***[Inquiries into Human Faculty and Its Development](https://galton.org/books/human-faculty/text/galton-1883-human-faculty-v4.pdf)*** (1883). Galton's pioneering work on individual differences included famous "breakfast table" surveys where he discovered, to his astonishment, that many esteemed colleagues reported having almost no capacity for visual imagination.</div>

<div id="footnote-5" class="mosiah-footnote"><sup>5</sup> Allan Paivio, ***[Imagery and Verbal Processes](https://www.taylorfrancis.com/books/mono/10.4324/9781315798868/imagery-verbal-processes-paivio)*** (1971). This seminal work in cognitive psychology established dual-coding theory, proposing that cognition operates via two distinct subsystems: a "verbal system" for language and an "imaginal system" for non-verbal objects and events.</div>

<div id="footnote-6" class="mosiah-footnote"><sup>6</sup> Temple Grandin, ***[Thinking in Pictures: My Life with Autism](http://iwtf.ie/wp-content/uploads/2014/05/TEMPLE-GRANDIN-Thinking-In-Pictures.pdf)*** (1995). Grandin explains her own cognition as thinking entirely in photorealistic images and contrasts this with others she terms "verbal thinkers," who process information sequentially through language.</div>

<div id="footnote-7" class="mosiah-footnote"><sup>7</sup> Adam Zeman et al., "**[Lives without imagery – Congenital aphantasia](https://www.sciencedirect.com/science/article/abs/pii/S0010945215001781?via%3Dihub)**," *Cortex* 73 (2015): 378-380. This paper first described and named the condition of aphantasia, providing a neurological basis for the long-observed spectrum of human visualization ability.</div>

<div id="footnote-8" class="mosiah-footnote"><sup>8</sup> David Graeber, "**[On the Phenomenon of Bullshit Jobs: A Work Rant](https://web.archive.org/web/20211102024124/https://www.strike.coop/bullshit-jobs/)**," *STRIKE! Magazine*, 2013. This original essay, which led to the book, details the rise of professional roles that are internally focused and seemingly pointless, a key feature of modern bureaucracy.</div>

<div id="footnote-9" class="mosiah-footnote"><sup>9</sup> Daniel Kahneman, ***[Thinking, Fast and Slow](https://us.macmillan.com/books/9780374533557/thinkingfastandslow/)*** (2011). This book summarizes decades of research with Amos Tversky, popularizing the concepts of System 1 (fast, intuitive, biased thinking) and System 2 (slow, deliberate, logical thinking).</div>

<div id="footnote-10" class="mosiah-footnote"><sup>10</sup> Gerd Gigerenzer, ***[Gut Feelings: The Intelligence of the Unconscious](https://www.penguin.co.uk/books/54839/gut-feelings-by-gerd-gigerenzer/9780141015910)*** (2007). Gigerenzer and his colleagues at the Max Planck Institute for Human Development have extensively researched the power of "fast and frugal heuristics," arguing they are adaptive tools, not cognitive flaws.</div>

<div id="footnote-11" class="mosiah-footnote"><sup>11</sup> Nassim Nicholas Taleb, ***[Antifragile: Things That Gain from Disorder](https://en.wikipedia.org/wiki/Antifragile_(book))*** (2012). This work explores the concept of systems that strengthen when exposed to volatility, randomness, and stressors, a quality he argues is superior to mere robustness.</div>

<div id="footnote-12" class="mosiah-footnote"><sup>12</sup> Nick Bostrom, ***[Superintelligence: Paths, Dangers, Strategies](https://archive.org/details/superintelligence-paths-dangers-strategies-by-nick-bostrom)*** (2014). Bostrom dedicates a chapter to the "control problem," in which he explores the risk of "value lock-in," where a superintelligence could permanently impose the potentially flawed values of its creators upon the future.</div>

<div id="footnote-13" class="mosiah-footnote"><sup>13</sup> Paul Feyerabend, ***[Against Method: Outline of an Anarchistic Theory of Knowledge](https://monoskop.org/images/7/7e/Feyerabend_Paul_Against_Method.pdf)*** (1975). Feyerabend’s central thesis is that there is no single, monolithic scientific method, and that scientific progress relies on a pluralistic and often "anarchic" set of procedures.</div>

<div id="footnote-14" class="mosiah-footnote"><sup>14</sup> *There is a Taoist story of an old farmer who had worked his crops for many years. One day his horse ran away. Upon hearing the news, his neighbors came to visit. “Such bad luck,” they said sympathetically.*

*“Maybe,” the farmer replied.*

*The next morning the horse returned, bringing with it three other wild horses. “How wonderful,” the neighbors exclaimed.*

*“Maybe,” replied the old man.*

*The following day, his son tried to ride one of the untamed horses, was thrown, and broke his leg. The neighbors again came to offer their sympathy for what they called his “misfortune.”*

*“Maybe,” answered the farmer.*

*The day after, military officials came to the village to draft young men into the army. Seeing that the son’s leg was broken, they passed him by. The neighbors congratulated the farmer on how well things had turned out.*

*“Maybe,” said the farmer.*</div>

<div id="footnote-15" class="mosiah-footnote"><sup>15</sup> Dwarkesh Patel, **["Why I don't think AGI is right around the corner: Continual learning is a huge bottleneck,"](https://www.dwarkesh.com/p/timelines-june-2025)** Dwarkesh Podcast, June 2, 2025.</div>

---

//Originally published on Choir Substack: [[https://choir.substack.com/p/the-portfolio-mind|https://choir.substack.com/p/the-portfolio-mind]].//
