I published an in-depth report this week about AI, materials science, and automated labs.
Introduction
There’s a fundamental problem in AI governance: we don’t know what we are attempting to govern. We know what AI is, at an object level. It’s statistics. It’s “just math.” Of course, an awful lot of things can be reduced to “just math,” so that assertion is not as useful as it may seem.
What kind of technology will advanced AI be? What will it feel like to live in a society that has it?
You cannot properly formulate AI policy without at least having some intuitions about these questions. Without explaining those intuitions, AI policy becomes a kind of disembodied technocratic sub-field, devoid of any justification or motivation for the measures it proposes.
I want you to understand my motivations. To do that, I am going to explain my beliefs about what the near and medium-term future will look like. They are not as “science fiction” as some other people’s visions of the future, but if you play the tape forward a decade or two on the things I say here, I think you’ll see that we will get to some bizarre, and indeed science fiction, places.
This is not a prediction post in the style of Daniel Kokotajlo’s 2021 blog post “What 2026 Looks Like.” It will not be as specific, and it has a different purpose. I will occasionally tie various predictions back to policy proposals I have articulated, so that you understand exactly how my intuitions about the future motivate my policy prescriptions in the here and now. This will be a two-part essay. Next week’s essay will deal much more with policy. Today’s is focused on the economic and organizational effects of AGI.
The Coming of Agents
First things first: eject the concept of a chatbot from your mind. Eject image generators, deepfakes, and the like. Eject social media algorithms. Eject the algorithm your insurance company uses to assess claims for fraud potential. I am not talking, especially, about any of those things.
Instead, I’m talking about agents. Simply put, and at least in the near term, agents will be LLMs configured in such a way that they can plan, reason, and execute intellectual labor. They will be able to use, modify, and build software tools, obtain information from the internet, and communicate with both humans (using email, messaging apps, and chatbot interfaces) and with other agents. These abstract tasks do not constitute everything a knowledge worker does, but they constitute a very large fraction of what the average knowledge worker spends their day doing.
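To make the shape of this concrete, here is a minimal sketch of the loop such an agent runs. It is illustrative only: the `call_llm` and `run_tool` helpers are hypothetical stand-ins for a model API and a tool runtime, not any particular vendor’s interface.

```python
# A minimal, illustrative agent loop. The helpers below are hypothetical
# stand-ins for a model API and a tool runtime, not a specific product's API.

def call_llm(messages):
    """Hypothetical: send the conversation to a language model, return its reply."""
    raise NotImplementedError

def run_tool(name, args):
    """Hypothetical: execute a tool (web search, code, email) and return its output."""
    raise NotImplementedError

def run_agent(task, tools, max_steps=20):
    messages = [
        {"role": "system", "content": f"You may use these tools: {sorted(tools)}"},
        {"role": "user", "content": task},
    ]
    for _ in range(max_steps):
        reply = call_llm(messages)               # the model plans its next move
        messages.append({"role": "assistant", "content": reply["content"]})
        tool_call = reply.get("tool_call")
        if tool_call:                            # the model wants to act in the world
            observation = run_tool(tool_call["name"], tool_call["args"])
            messages.append({"role": "tool", "content": observation})
        else:                                    # no further actions: the task is done
            return reply["content"]
    return "Stopped: step budget exhausted."
```

The point is not the code but the structure: plan, act through tools, observe, repeat, with humans and other agents reachable through the same channels.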
Agents are starting to work. They’re going to get much better. There are many reasons this is true, but the biggest one is the reinforcement learning-based approach OpenAI pioneered with their o1 models, and which every other player in the industry either has or is building. The most informative paper to read about how this broad approach works is DeepSeek’s r1 technical report.
What that approach showed us goes beyond reasoning. A base language model, if it is good enough to sometimes successfully perform a task (doing arithmetic, coding, writing a legal brief), can be put into a reinforcement learning environment and learn to get much better at that task. This was first used to make models smarter: more proficient at coding, math, science, and other domains. But it was also used to make OpenAI’s Deep Research, an early agent, better at deciding what actions to take as it executes its research plan. It also seems to have been used to make an unreleased OpenAI model write vaguely like Haruki Murakami.
This approach is most efficient in areas where verification of the answer is easy. You can verify whether a math problem was solved successfully very cheaply; often, the check can be wholly automated. It is usually harder to automatically know whether a legal brief was right, but there is a trick: other models can provide pretty good assessments. And as you use the reasoning approach (along with pretraining scale, algorithmic efficiency gains, and many other tricks besides) to make models smarter, those smarter models become better graders for the models you are training. The performance gains in tough-to-verify fields may be less rapid than the ones in easy-to-verify fields, but they will be real nonetheless.
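A hedged sketch of that distinction, in Python: the exact-match check stands in for an easy-to-verify domain, and the `judge_model` callable stands in for a model-graded reward. Both are illustrative placeholders, not a description of any lab’s actual training pipeline.

```python
# Two illustrative reward functions for RL on model outputs. Simplified
# placeholders, not any lab's real setup.

import re

def math_reward(model_answer: str, correct_answer: str) -> float:
    """Easy to verify: extract the final number and compare it exactly."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", model_answer)
    return 1.0 if numbers and numbers[-1] == correct_answer else 0.0

def brief_reward(model_brief: str, judge_model) -> float:
    """Hard to verify: ask another (assumed) model to grade the output.
    judge_model is a callable returning a score from 0 to 10."""
    prompt = (
        "Grade the following legal brief from 0 to 10 for accuracy of citations, "
        "soundness of argument, and clarity. Reply with a single number.\n\n"
        + model_brief
    )
    return float(judge_model(prompt)) / 10.0
```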
As these performance enhancements continue apace, the cost of achieving a given level of performance will drop rapidly over time. DeepSeek’s v3 model (which undergirds r1) was about an order of magnitude cheaper to train than equivalent frontier models a year earlier. That’s about on trend, and it is a trend I suspect will continue for the foreseeable future. Imagine you hired a bright junior employee who was willing to work for you for, say, $10,000 per month. But next year, he’ll do it for $1,000, and the year after that, $100. These are the economics of this industry.
Think about the mechanics of knowledge work. At the most basic level, you’re operating a keyboard and a mouse. Did you click the right button on the user interface? That’s pretty easy to verify. Did you successfully retrieve the right piece of information, or pick the right tool? That’s more subjective, but in a constrained setting, not that hard to verify. Did you write a good memo? Trickier still, but with automated grading, achievable. Did you fill out this form correctly? This one ranges from “trivially easy to verify” to “incredibly difficult,” but for many forms, it will be doable.
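As a toy illustration of how cheap some of those checks can be, here is a sketch of a verifier for a filled-out form. The field names and rules are invented for the example; the point is that the grading can be instant and fully automatic.

```python
# A toy verifier for a filled-out expense form. Field names and rules are
# invented for illustration; many real forms admit checks this cheap.

import re

def verify_expense_form(form: dict) -> list[str]:
    """Return a list of problems; an empty list means the form passes."""
    problems = []
    if not form.get("employee_name", "").strip():
        problems.append("employee_name is missing")
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", form.get("date", "")):
        problems.append("date must be YYYY-MM-DD")
    try:
        if float(form.get("amount", "")) <= 0:
            problems.append("amount must be positive")
    except ValueError:
        problems.append("amount must be a number")
    return problems

# An agent's output can be graded instantly, with no human in the loop.
print(verify_expense_form({"employee_name": "A. Smith", "date": "2025-03-14", "amount": "82.50"}))
```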
As you go about your day, occasionally stop and think to yourself, “would it be easy to cheaply verify that I am doing this task correctly?” The answers vary, but I suspect you’ll find that the answer is often “yes.” This has implications for what the near-term economic consequences of agents are likely to be.
The Problem with AI in Science
Most of the AI benefits frontier lab executives like to talk about relate to scientific innovations. We’ll have models so smart, they imagine, that they’ll invent new scientific theories and make new discoveries. But how will we verify them?
In general, we will need to run experiments and interpret data. The experiments will cost money and time to perform. If our ability to conduct experiments does not keep pace with the number of experiments we want to run, either because of human labor bottlenecks or simple lack of sufficient laboratory capacity, the price of conducting experiments will rise, as will the time it takes to perform those experiments (this is why I support building large, automated labs for industrial-scale scientific experimentation).
Even absent government investment, this capacity problem will eventually be solved by market forces (assuming we do not create undue policy burdens). But even when that problem is solved, there will be others. What if the experimental results diverge from the AI’s theories? That would be quite common in science. The AI will then have to interpret the results and run new experiments (perhaps with humans as genuinely helpful thought partners, perhaps with humans as glorified lab assistants). This will take even more time.
Then, after a new discovery is made and experimentally verified, there is the wholly separate, exceptionally nontrivial, and capital intensive work of learning how to mass produce the invention, productize it, and market it. This is why I have written a great deal about American manufacturing; in the long run, the country that is able to both invent and scale new technologies created by and with AI will be the winner of the “AI race.” America is not even remotely on track to do this right now, China is miles ahead of us, and no one, including me, is doing enough to solve this problem.
AI absolutely will accelerate these processes in an innumerable variety of ways, but it will not automate them. Indeed, to the extent it provides a boost, much of it will be automating the knowledge work of the scientist—their research, their writing, their communicating, their grant paperwork, their compliance paperwork, their code.
An important exception to this analysis is areas of R&D that can be done entirely on computers. Importantly, one of these is AI research and engineering itself. Frontier AI labs also have ample unique training data that could be used to speed along performance improvements here. My current expectation is that AI R&D will be among the earliest examples of widespread agentic automation of industrial activity. It could start to happen by the end of this year, though it could also take longer. It would be valuable for frontier AI firms to study the organizational economics of this transition, since it could contain valuable insights for other industries.
While I do expect AI to deliver remarkable new science and technology (cancer cures, nuclear fusion-based propulsion, room-temperature superconductors, etc.) in the long run (probably the mid to late 2030s, but maybe sooner), I don’t think these benefits are a good way of imagining what your life is going to be like over the next five to ten years.
The Firm, Reborn
This, in turn, means thinking about near-term commercial adoption of AI throughout the economy. Rather than genius AI scientists conceptualizing cancer cures on the fly, the transformations of the near-term AI future will likely come from, basically, B2B software-as-a-service—but with titanic implications.
Corporations that adopt agents will have far greater capacity for intellectual labor. Teams of thousands of sales agents or coders will accomplish (some of) the firm’s work in far less time than was ever conceivable before; they will be able to do much more work as well. They can be scaled up and down dynamically depending on the firm’s needs. Information will flow rapidly and with high fidelity from teams of agents to employees, managers, and company leadership.
With time, the structure of firms themselves will evolve to maximize the utility of agents. But even before that evolution, agents will enable the people who lead firms to exercise far greater cybernetic control over the teams they lead. Today, when the CEO of a company wants to make some change to a business process, they relay that command through chains of leadership, and each time it loses some fidelity. Maybe it will be misinterpreted. Maybe someone in some layer of the company does not want to do it, and so ignores it, implements the order half-heartedly, or engages in malicious compliance. For a wide variety of business processes, this problem will disappear entirely, and for many others, it will be significantly lessened. CEOs and managers will be able to say “jump,” and in unison, tens, hundreds, thousands, or millions, of agents will say “how high?”
I note, with interest, the fact that this technology is being built at the very time that the Republican Party, and Donald Trump in particular, seek to advance theories of a “unitary executive”—the notion that the President exercises the powers granted to him by the Constitution and by Congress absolutely. By the end of President Trump’s term, that may be more possible than anyone ever imagined. I can hear Hegel laughing from his grave.
This will make firms (and other organizations) strange. I am not sure that it will straightforwardly make them better, but it will almost certainly make them more efficient and profitable. They will probably be heavier at the top than they are today, and so conceivably far more variable. We describe firms as persons for legal purposes, but they really might start to feel alive, almost biological in their ability to adapt quickly to changing circumstances. Over the next decade, a new kind of life form will emerge on the world stage: the AI-enabled firm.
It is quite possible, but not obvious, that agents will result in substantial labor dislocation—a nice word for “layoffs.” I strongly doubt that fully automated jobs are coming in the next few years. The agents are unlikely to be perfectly reliable. 80%, 95%, and even 99% reliability are in a different galaxy from literal 100% success. Waymo vehicles operating today in several American cities appear to be an order of magnitude safer than human drivers, but the company must still employ teams of remote operators who can step in during the edge cases where the cars get stuck.
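To see why those percentages sit so far from literal 100 percent, consider how per-step reliability compounds across a multi-step task. The step counts and reliability levels below are arbitrary illustrations, not measurements of any real system.

```python
# How per-step reliability compounds over a multi-step task.
# Step counts and reliability levels are arbitrary illustrations.

for per_step in (0.80, 0.95, 0.99, 1.00):
    for steps in (10, 50):
        whole_task = per_step ** steps
        print(f"{per_step:.0%} per step over {steps:>2} steps -> "
              f"{whole_task:6.1%} chance the whole task succeeds")
```

Even at 99 percent per step, a fifty-step workflow fails roughly two times in five, which is why human oversight roles persist.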
There will be analogous roles for humans in knowledge work. For some industries or professions, they may become practically the only roles for humans. It is unclear to me how many jobs like this there will be, or how durable they will be. I tentatively doubt that there will be mass layoffs in many industries in the near term. However, the next few years would be a particularly bad time for a recession, since a downturn would give firms an independent incentive to cut staff. If economic conditions force mass layoffs, many of those jobs may be gone for good.
Rather than near-term mass layoffs, I worry somewhat more that young people and others seeking junior employment will struggle to find opportunities. Once the pace and direction of automation becomes clear to firm managers, they are likely to employ junior-level humans only in areas where:
1. They legally have to. For example, the widespread state-level algorithmic bias bills also usually have provisions requiring businesses to offer “human alternatives” to “automated processes” like customer service. In this case, I suspect US states that adopt these policies are mostly creating a jobs program for call center workers in developing countries, rather than Americans, but I digress.
2. There is a component to the role that requires labor in the physical world (this does not just mean blue-collar work; for several years of my early career in policy, an important part of my job involved overseeing in-person events).
3. The candidate is extraordinary, of such obvious quality that firm managers wish to cultivate them for higher ranks in the organization. For those taking this path, it will almost feel as if average is over.
I can imagine plausible scenarios where this does not become a problem, but the blasé attitude of many libertarians strikes me as wide of the mark. Millions of young people who planned to enter high-paying knowledge work sectors could very well be starved of the opportunities for which they trained in school. Elite overproduction is often a bellwether of political instability, and America probably already overproduces elites. Given the other political, social, and economic turmoil America is likely to face over the coming decade (AI-related and otherwise), this is a substantial risk.
I am aware of very few good policy proposals that have been put forth to deal with labor market turmoil. There are many reasons for this, but one, I suspect, is that it is a tender issue. It is disheartening to imagine an infinitely scalable machine outperforming you at something you care about, at a skill you honed, at a talent in which you took pride. There is a sort of grieving process I expect many to go through, including the people who would normally author such policy proposals.
These predictions may sound less “exciting” than those of others. I admit that they are likely to describe only a brief period before we enter yet another step-change transition (after we get better at automating the invention of physical things, for example). However, I also expect the adoption of AI by existing firms to take time. Diffusion is always nontrivial.
It may well be the case that existing firms in some industries simply will not be able to pull the transition off successfully. If those firms also happen to have regulatory moats that make competition difficult, it may be difficult or impossible for an AI-enabled firm to take their place. Policymakers could struggle to eliminate those barriers if doing so would be perceived as “killing jobs.” Keeping those jobs would come, of course, at the expense of economic competitiveness.
Conclusion
Today, we tend to think of frontier AI through the lens of consumer product experiences. When we benchmark models, we ask questions like, “does it have a sense of humor?,” “how well can it mimic my favorite poet?,” and “how well can it answer this obscure question I have?” In policy, AI safety researchers probe the models for catastrophic risk potential and report their findings.
To be clear, I expect that outstanding consumer products of all kinds will be built using AI. And I think that assessing models for their catastrophic risk potential is a vitally important enterprise.
But thinking of AI in these terms—either as a personal consumer technology or as a budding scientific genius—ignores the broader economic effects it will have. Ironically, those effects may matter to you, personally, just as much or more than the other implications of AI.
Your daily life will feel more controllable and legible than it does now. Nearly everything will feel more personalized to you, ready for you whenever you need it, in just the way you like it. This won’t be because of one big thing, but because of unfathomable numbers of intelligent actions taken by computers that have learned how to use computers. Every product you buy, every device you use, every service you employ, will be brought to you by trillions of computers talking to themselves and to one another, making decisions, exercising judgments, pursuing goals.
At the same time, the world at large may feel more disordered and less legible. It is hard enough to predict how agents will transform individual firms. But when you start to think about what happens when every person, firm, and government has access to this technology, the possibilities boggle the mind.
You may feel as though you personally, and “society” in general, have less control over events than before. You may feel dwarfed by forces new and colossal. I suspect we have little choice but to embrace them. Americans’ sense that they have lost control will only be worsened if other countries embrace the transformation and we lag behind.
There will be emergent consequences—good and bad—of the interaction of all these AI-enabled organizations and persons with one another. Like all emergent phenomena, they are impossible to predict in detail.
Even if it goes as well as possible, make no mistake: AI agents will involve human beings taking their hands off the wheel of the economy to at least some extent. Most of the thinking and doing in America will soon be done by machines, not people. I cannot tell you whether the dynamism that results will be fundamentally good or fundamentally bad, though history tells me it will be good.
But for those of us who are living amid the transition, there will be a more personal truth with which to contend: dynamism does not just mean saying hello. Dynamism also means saying goodbye.
I began assembling notes for this essay a few weeks ago. In the meantime, Epoch’s Ege Erdil and Matthew Barnett published a piece with a somewhat similar thesis. If you haven’t read that piece, I recommend it.