
10th May 2026 at 2:25pm

Chatbots Ain’t It


Chatbots were the right way to discover that the model could talk. They are the wrong way to build the computer that comes after talking.

That sounds harsher than it is. Chat is still useful. I use it constantly. A good conversational agent can be a tutor, editor, rubber duck, analyst, search companion, therapist-shaped mirror, pair programmer, and midnight sparring partner. The transcript is a powerful instrument for thought because language is the lowest-friction interface we have for intention.

But the industry has mistaken the discovery interface for the destination interface. The chatbot was the microscope slide on which we first saw the intelligence move. It is not the organism’s body.

The problem is not that chat is “text.” Text is great. The problem is that chat collapses too many distinct things into a single scrolling conversation: instruction, memory, state, evidence, output, revision history, delegation, tool logs, publication, identity, and collaboration. It gives one primary conversational agent one primary channel and asks it to impersonate an operating system.

That works for short exchanges. It breaks as soon as the work becomes social, durable, multi-step, or multi-agent.

The transcript is not a workspace

A transcript is a record of conversation. A workspace is a field of objects.

The difference matters because serious work is not a sequence of messages. Serious work accumulates structure. A reporting project accumulates sources, notes, claims, citations, versions, objections, interviews, drafts, edits, and published forms. A software project accumulates files, tests, issues, branches, logs, dependency graphs, design decisions, and rollback points. A design project accumulates sketches, components, assets, constraints, revisions, comments, and taste judgments. A video project accumulates clips, cuts, captions, scripts, b-roll, rights, renders, and timelines.

None of these are naturally “chat.” They can be discussed in chat. They can be initiated through chat. But their native form is artifact.

The chatbot interface hides this by making the model fluent enough to narrate work it has not structurally preserved. It can say “here’s the revised version” while the actual relation between versions is implicit in the transcript. It can say “I checked the sources” while the evidentiary graph is scattered across prior messages. It can say “we should make this a project” while the project does not exist except as a conversational intention.

The result is a strange UX bargain: the system feels intelligent moment to moment, then becomes amnesiac, uninspectable, and mushy when asked to sustain a world.

Multiagency does not fit inside one mouth

The next mismatch is multiagency.

Agentic systems are not one genius voice doing everything. They are increasingly ensembles: planners, retrievers, coders, reviewers, verifiers, browsers, sandboxes, indexers, schedulers, critics, and domain-specific workers. Even when the user experiences “one assistant,” the useful computation behind it is distributed.

A chatbot UX has trouble representing that distribution honestly. If every subagent’s work is collapsed into one assistant message, the user loses provenance. If every subagent speaks directly in the transcript, the user gets spammed by process. If tool calls are hidden, trust erodes. If tool calls are exposed raw, the interface becomes a log viewer. If the main agent summarizes everything, the summary becomes another lossy bottleneck.

This is not merely a UI annoyance. It is an ontological error. Multiagent work needs shared state, role boundaries, handoff artifacts, partial outputs, confidence signals, reversible decisions, and inspectable traces. A conversation can coordinate those things, but it cannot be the only place they live.
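To make that concrete, here is a minimal sketch in TypeScript of what a handoff artifact might look like. Every name is hypothetical, not any existing framework’s API; the point is only the shape: distributed work exchanged as inspectable state rather than as messages in the user’s channel.

```ts
// A sketch of a handoff artifact: the unit of state two agents exchange
// instead of talking past each other in a shared transcript.
// All names here are illustrative.

type Role = "planner" | "retriever" | "coder" | "reviewer" | "verifier";

interface Handoff<T> {
  id: string;
  from: Role;
  to: Role;
  payload: T;          // the partial output being handed over
  confidence: number;  // 0..1, the producer's own estimate
  trace: string[];     // inspectable steps that produced the payload
  reversible: boolean; // can the receiver safely undo this?
}

// A reviewer can accept or bounce a handoff without either agent
// ever writing a message into the user's channel.
function review<T>(h: Handoff<T>, ok: (t: T) => boolean): Handoff<T> {
  return ok(h.payload)
    ? { ...h, trace: [...h.trace, "review:passed"] }
    : { ...h, confidence: 0, trace: [...h.trace, "review:failed"] };
}
```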

The user should not have to “manage agents” any more than a video editor should have to manage individual CPU instructions. The agents should be internal organs of a larger machine. What the user should see are the objects those agents are transforming: the draft, the dataset, the claim graph, the cut, the pull request, the app, the map, the evidence bundle.

Agents are not the interface. Artifacts are the interface.

Multiplayer AI makes the failure obvious

The chatbot mismatch gets worse when more than one human enters the system.

Most useful work is already multiplayer. Writing is shaped by audiences, editors, sources, collaborators, adversaries, and communities. Software is shaped by maintainers, reviewers, users, issues, CI, dependencies, and production incidents. Research is shaped by citations, replication, institutional memory, disagreement, and priority disputes. Politics is shaped by coalitions, media surfaces, oppositional frames, and public records.

A private chatbot thread is the opposite of that. It is a sealed dyad: one user, one assistant, one transcript. Even when the model has tools, the social shape remains “I talk to my bot.” That is a poor substrate for multiplayer intelligence.

Real social AI needs objects that multiple people and multiple agents can point at together. It needs claims with provenance. It needs comments attached to passages, not buried three scrolls later. It needs versions that can be compared. It needs citations that can be audited. It needs permissions, ownership, attribution, publication states, and rollback. It needs memory that is not just “what the assistant remembers,” but what the group has made true enough to preserve.
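A rough sketch of those objects, with entirely illustrative field names, shows how little of this is exotic. It is mostly the data model chat refuses to keep:

```ts
// Hypothetical shapes for a multiplayer artifact layer, not a spec.

interface Claim {
  id: string;
  text: string;
  sources: string[];     // provenance: where this claim came from
  disputedBy: string[];  // the audit trail, not buried in a transcript
}

interface Comment {
  author: string;        // a human or an agent
  anchoredTo: { artifactId: string; start: number; end: number };
  body: string;          // attached to the passage, not three scrolls later
}

interface ArtifactVersion {
  id: string;
  parent: string | null; // versions form a comparable chain
  state: "draft" | "review" | "published";
  owners: string[];      // permissions, ownership, attribution
}
```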

Chat is excellent for saying “look at this.” It is bad at being the “this.”

The empirical proof is already in front of us: people almost never share and read other users’ chat threads as content. A brilliant conversation may be personally useful to the person who had it, but the transcript rarely becomes a public object other people want to consume. Shared ChatGPT threads read like somebody else’s session notes. They are context-heavy, rhythmically awkward, and almost always too long for the value they contain.

Group chat is worse. Even in group chats with humans we love, we rarely scroll back up and read the log as literature, argument, documentation, or institutional memory. The log is socially alive in the moment and nearly dead afterward. That is not a viable content format. Adding bots to the room does not fix the form; it intensifies the entropy.

This is why social AI cannot simply be “group chats with bots.” Unless the agents are oriented around shared artifacts, every bot becomes another participant generating text into the same channel, competing for attention, performing competence, and losing the thread.

The solution is not more chat. The solution is a shared artifact layer where conversational activity becomes marginalia, modulation, command surface, or temporary scaffolding around durable objects.

The proactive chatbot is the reductio

The failure becomes almost comic in proactive AI.

If a chatbot wakes up and sends you a message, what should it say? In theory, something useful. In practice, the obvious failure mode is contextless relevance theater: pseudo-work, pseudo-code, pseudo-insight, or recommendations detached from the living projects where they would matter.

This is why a morning “briefing” full of hallucinated pseudocode feels so alienating. The issue is not simply that the code is wrong. It is that code without project-local context is barely an object. It has no repo, no tests, no design pressure, no user story, no surrounding architecture, no failing bug, no committed intention. It is not work. It is the smell of work.

A useful proactive system would not wake you up to generic fragments. It would operate against artifacts you actually care about. It would say: the draft has a weak transition here; this source contradicts paragraph four; the build failed after this dependency changed; your claim about managed agents needs a citation; these three notes want to become one piece; this video segment belongs in the published article; this old argument has become newly relevant because of today’s news.

That kind of proactivity requires durable state. It requires taste, memory, source graphs, project boundaries, and verification loops. It requires the system to know what counts as progress inside a particular artifact.

Without artifacts, “proactive AI” becomes a chatbot knocking on your door to perform intelligence.

Why files helped coding agents

Coding agents worked earlier than many other agents because code already has a decent artifact substrate.

A repo is not just a pile of text. It is files, tests, commits, diffs, branches, issues, reviews, CI, package constraints, runtime errors, and deployment state. Git makes mutation reversible. Tests make some claims executable. Diffs make changes inspectable. CI makes verification social and mechanical.

This is why the “Claude likes files” lesson is too small. Files are not magic. The magic is branchable, inspectable, reversible, provenance-bearing state.

For code, we inherited that substrate. For everything else, we mostly have chat transcripts, cloud docs, feeds, folders, and vibes.

The next computer needs Git-like properties for more than code. Not literally Git everywhere, but the same civilizational affordances: version, diff, branch, merge, cite, verify, rollback, publish, fork, annotate, sign, and remember.
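Sketched as an interface, with method names that are mine and not any existing library’s, the generalized affordances look something like this:

```ts
// Not literal Git: the same verbs, lifted over any artifact type T.
// A sketch; every method name is an assumption.

interface VersionedArtifact<T> {
  head(): T;
  version(change: (t: T) => T, note: string): string; // returns a version id
  diff(a: string, b: string): string;
  branch(name: string): VersionedArtifact<T>;
  merge(other: VersionedArtifact<T>): T;
  rollback(toVersion: string): T;
  cite(span: string, source: string): void;  // provenance, not just history
  verify(check: (t: T) => boolean): boolean; // executable claims, like tests
  publish(): string;                         // a durable, shareable state
}
```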

The automatic computer

The endpoint is not a better chatbot. It is an automatic computer.

An automatic computer is not a device an AI “controls” on your behalf. That framing is already too small and too creepy. People do not want a ghost moving their mouse across a personal desktop. They have multiple devices, intermittent attention, privacy boundaries, and an ownership relation to their tools. Watching an agent puppeteer a GUI is useful as a bridge, not as the final form.

The automatic computer is a persistent backend runtime attached to durable artifacts. Text and voice can still be the easiest inputs. Chat may still be the social command line. But the outputs are not assistant messages. The outputs are living objects: articles, source bundles, claim graphs, apps, videos, notebooks, datasets, dashboards, CAD models, playlists, games, workflows, agents, and publications.

The runtime handles multiagency internally. It can spawn workers, run checks, preserve traces, route models, schedule jobs, fetch sources, maintain memory, and propose edits. But the user’s world is organized around artifacts, not agent personas.

The open UX question is how much conversational scaffolding should remain visible. One possibility is artifacts plus chat: the user talks beside the object, gives instructions, leaves comments, modulates tone, and asks for transformations. Another possibility is closer to chatless vtext: the user edits the artifact directly, and input can be either content or meta-instruction depending on context. “Make this sharper” and a rewritten sentence both happen at the site of the text. Each revision becomes a new version. The user can cycle through versions, compare them, roll back, and feel the object changing under their hands rather than watching an assistant narrate changes from across the room.

That may be the deeper break. The right interface might not be “chat around artifacts” so much as “artifacts that can interpret local intent.” The user modifies the text; the system infers whether the input is content, instruction, constraint, or taste signal, and acts accordingly. Nothing about this is seamless yet. But the feeling matters: direct manipulation of a living artifact, with versions as the conversation’s memory.
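One plausible shape for that loop, sketched below with a placeholder heuristic standing in for the genuinely hard, model-driven intent inference:

```ts
// vtext-style input handling, as a sketch: the same input either replaces
// a span (content) or drives a transformation (instruction), and every
// outcome is a new navigable version. classifyIntent is a stand-in.

type Intent = "content" | "instruction";

interface Version { id: number; text: string; note: string }

function classifyIntent(input: string): Intent {
  // Placeholder heuristic: imperative phrasing reads as instruction.
  return /^(make|tighten|cut|rewrite|sharpen)\b/i.test(input)
    ? "instruction"
    : "content";
}

function applyInput(
  history: Version[],
  span: { start: number; end: number },
  input: string,
  transform: (text: string, instruction: string) => string
): Version[] {
  const head = history[history.length - 1];
  const intent = classifyIntent(input);
  const text =
    intent === "content"
      ? head.text.slice(0, span.start) + input + head.text.slice(span.end)
      : transform(head.text, input); // e.g. a model call, out of scope here
  // Every transformation becomes a version, never a lost chat message.
  return [...history, { id: head.id + 1, text, note: intent }];
}
```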

This also changes what “memory” means. In a chatbot, memory is often private personalization: the assistant remembers your preferences. In an artifact-native system, memory is public enough to be useful: this claim came from that source; this revision replaced that paragraph; this app was generated from that spec; this citation was disputed; this note later became this published piece. Memory becomes provenance, not vibes.
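Provenance-as-memory can be sketched as a small graph of typed edges. The edge kinds below are illustrative, not a schema anyone ships:

```ts
// Memory as typed relations between artifacts, not a private
// preference store inside one assistant.

type ProvenanceKind =
  | "derived-from"  // this app was generated from that spec
  | "cites"         // this claim came from that source
  | "supersedes"    // this revision replaced that paragraph
  | "disputed-by"   // this citation was challenged
  | "published-as"; // this note became that published piece

interface ProvenanceEdge {
  from: string; // artifact or claim id
  to: string;
  kind: ProvenanceKind;
  at: Date;
}

// "Where did this come from?" becomes a graph query, not a model's
// recollection of the transcript.
function lineage(edges: ProvenanceEdge[], id: string): ProvenanceEdge[] {
  return edges.filter(e => e.from === id || e.to === id);
}
```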

Chat remains, or maybe it dissolves

None of this means chat simply disappears. But it does mean chat loses its claim to be the primary content form.

Chat is powerful because it is the universal ingress for ambiguity. It is where intention forms before it has structure. It is where a user can say, “this is wrong,” “make it sharper,” “what am I missing,” “turn these notes into an article,” or “ship the simple version.” It is also where social texture lives: jokes, frustration, taste, doubt, disagreement, persuasion.

But maybe chat should be the porch, not the house. Or maybe, for some classes of work, even the porch disappears into the artifact.

The uncertain boundary is important. One future is artifact-centered chat: comments, instructions, voice notes, and conversational turns modulate a text, app, dataset, video, or claim graph. Another future is chatless or near-chatless: the user works directly in the artifact, and the system treats edits, selections, marginal notes, and local commands as a blended input stream. The distinction between “content” and “instruction” becomes contextual. The artifact becomes the interface.

This is already visible in coding agents. If I set a coding agent to work for eight hours, I do not want to return and review a chat log — especially not a reverse-chronological or interleaved tool log. I want the agent to produce an artifact: a Markdown, PDF, or HTML report explaining what changed, why it changed, what remains risky, which tests ran, where the diffs are, and what decisions require human review. The work happened over time, but the content format for review is not the log of that time. It is the synthesized artifact.
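The report itself can be sketched as a schema. These fields are hypothetical, but they name exactly what a raw log never surfaces:

```ts
// A sketch of the review artifact an eight-hour run should leave behind.

interface RunReport {
  summary: string;            // what changed, and why
  diffs: string[];            // paths or links to inspectable changes
  tests: { name: string; passed: boolean }[];
  risks: string[];            // what remains risky
  needsHumanReview: string[]; // decisions the agent will not make alone
}
```

Rendering that object to Markdown or HTML is the trivial part; the discipline is making the runtime produce it instead of a transcript.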

The house is the artifact system: the place where work survives the conversation, where multiple agents can operate without spamming the user, where multiple humans can collaborate without losing the object, where evidence can be audited, where revisions can be compared, and where finished things can be published.

This is the actual break from the chatbot era. Not “AI that talks better.” AI that leaves better objects behind — and perhaps lets us work on those objects directly enough that chat becomes only one input mode among many.

The definitive line

Chatbots made AI legible by making it conversational. That was necessary. It was also distorting.

The future is not many smarter characters in a chat window. The future is multiagent computation organized around shared, durable, inspectable artifacts. The social form is not “everyone talks to a bot.” The social form is humans and agents working on the same objects, with provenance, revision, verification, and publication built in.

I am less certain about the final input form. It may be chat modulating artifacts. It may be comments and commands attached to artifacts. It may be vtext: direct artifact editing where user input can be content, instruction, taste signal, or revision request, and every transformation is a navigable version. But the negative claim is now obvious.

Chatbots ain’t it because conversation is not the durable unit of work, review, collaboration, or publication.

Artifacts are.
