The central challenge of AI policy today is imagining a world transformed before that transformation has taken hold. Analogies and metaphors only go so far, and often, it seems, obscure more than they reveal. The precise contours of the transformation are almost certainly out of the hands of any one group of people, including our government. Recognizing this to varying degrees, US and EU policymakers have sought order by legislating broad outcomes or banning specific malicious uses of AI. Perhaps their efforts are wise and will be successful; perhaps they are not and will fail. Some of this normativity is undoubtedly healthy; no one wants AI-generated child pornography and other extreme yet easily foreseeable ills to proliferate.
In the United States, though, AI policy has focused far more on these conjectures of order than on concrete steps to prepare our society for the transformation that is coming. Even the techno-optimists and classical liberals who contribute to these discussions (yours truly included) often spend their time pushing back against the normativity, worrying about the unintended consequences of regulating too much and too early. This work is critical, no doubt, but it can leave one with the impression that there is nothing to be done other than to let the market rip. I believe government has much work to do; it just doesn’t involve yelling at executives or telling developers what to do as much as it does building.
Some of the things we must build are already underway, while others are barely even blueprints; some are long overdue, others more novel. It is this building that interests me most.
Some examples include:
Digital public infrastructure for payments, identity verification, and data privacy;
A revamped approach to government procurement, especially for the military, that will allow government to take maximum advantage of new technology;
Publicly accessible AI research infrastructure.
Government can be a productive partner in the AI transformation that is coming. It can create value and provide resources that neither the private sector nor academia on their own can. In order to do that most effectively, however, policymakers must transition from an inherently defensive posture to a proactive and positive one. The challenge is not so much to legislate outcomes (almost always folly, especially with the diffusion of a general-purpose technology throughout society), but instead to set about building that which is required to create a better future. I suspect this is true not just for government, but for businesses, and for each of us as individuals. AI is not something that will happen to us; we have agency, individually and collectively, to affect the contours of this transformation.
Anxiety and agency, however, do not blend well. We must be imaginative in our efforts to preserve the things we care about preserving, yet equally bold in our willingness to reinvent that which no longer serves us well. We must shun both fear and naïveté. We need urgency and ambition, optimism and joy. We do not and cannot know what the ideal outcome is; as ever, we will discover it imperfectly, with time, diligence, and effort.
I plan to devote much of my time to exploring the things government should build in more depth. Today, I’d like to focus on the third item, the most modest of the three and therefore, I think, an appropriate starting point: creating a publicly accessible AI research infrastructure.
The Case for Public AI Research Infrastructure
Why is this necessary? Isn’t a ton of AI research happening? Don’t we see new papers and new models each day? The breakneck pace of AI research belies a more troubling reality: the largest AI labs are leaving academic research behind. As AI has become more and more capital intensive, universities have found themselves unable to keep up. Stanford University, a leading center of AI research, has only several hundred H100s, Nvidia’s latest AI chip. Meta’s Mark Zuckerberg, by contrast, recently announced that his firm expects to have 350,000 such chips by the end of this year.
Meta happens to be an industry leader in publishing open research and open-source AI models. Many other AI labs, however, have become increasingly closed as competition rises and the market potential becomes apparent (for the record, these labs often claim that they keep their research secret for safety reasons). Firms working on frontier AI should be free to make independent decisions about what research they choose to publicize, just as Apple is free not to share the details of how it achieved the breakthroughs that led to its Vision Pro headset.
But with a technology as impactful, and as potentially dangerous, as AI, public research is especially crucial. It is well-known in the AI industry that the top labs are far ahead of the academic community; most of the research ideas academics try have been tried already within the top labs, often at grander scale. Dwarkesh Patel, a podcaster and well-connected industry observer, recently wrote about this dynamic and timelines to the creation of AGI (artificial general intelligence):
“I’m probably missing some crucial evidence - the AI labs are simply not releasing that much research, since any insights about the “science of AI” would leak ideas relevant to building the AGI. A friend who is a researcher at one of these labs told me that he misses his undergrad habit of winding down with a bunch of papers - nowadays, nothing worth reading is published. For this reason, I assume that the things I don’t know would shorten my timelines.”
It may well be the case that AI can help reveal insights about the nature of ‘intelligence’ itself. For example, we know that large language models encode meaning within their weights in complex and non-obvious ways. AI models have neurons (for these purposes, think of AI neurons as small sub-units of a model that perform mathematical functions on data), but their neurons don’t map to concepts that we find intuitive. There is likely not a neuron inside GPT-4, for example, that represents the concept of “red” or of “statistical mechanics.” Instead, these concepts are represented partially in many different neurons, and the entire concept exists only when all the relevant neurons are activated in a particular pattern. This is called “polysemanticity.” It seems quite likely that human brains must do something at least vaguely similar to encode the vast range of ideas, sense perceptions, and memories that they store.
Interestingly, we observe a similar phenomenon in genetics, where it is rare that a particular characteristic of a living thing is represented in one gene. Instead, complex characteristics are broken down and represented in multiple (sometimes many) different genes. It is not difficult to imagine that the human brain might be doing something similar. It may well be that this pattern-based way of encoding meaning is a fundamental property of some sort: that it is how nature confronts the challenge of representing an infinitely complex reality in a finite structure. AI models are the first dynamic systems in which we can study this phenomenon in perfect detail, with no ethical or technical constraints on experimentation. In this sense, AI research may help us to learn about far more than just AI.
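To make the idea of pattern-based encoding concrete, here is a minimal toy sketch in Python (using NumPy). It is purely illustrative and assumes nothing about how GPT-4 or any real model is actually built: it simply packs more “concepts” than “neurons” into a small vector space, so that no single neuron corresponds to any one concept, yet each concept can still be read out from the pattern of activations as a whole.

```python
import numpy as np

# Toy illustration only: 8 "neurons" must jointly encode 20 "concepts," so each
# concept lives in a pattern spread across many neurons, and each neuron takes
# part in many concepts. All names and numbers here are hypothetical.
rng = np.random.default_rng(0)
n_neurons, n_concepts = 8, 20

# Each concept is a direction in activation space, normalized to unit length.
concept_directions = rng.normal(size=(n_concepts, n_neurons))
concept_directions /= np.linalg.norm(concept_directions, axis=1, keepdims=True)

def activate(concept_id: int) -> np.ndarray:
    """The activation pattern that encodes a single concept."""
    return concept_directions[concept_id]

def read_out(activations: np.ndarray, concept_id: int) -> float:
    """How strongly a concept is present in a given activation pattern."""
    return float(activations @ concept_directions[concept_id])

acts = activate(3)
print("individual neuron values:", np.round(acts, 2))         # no single neuron "is" concept 3
print("read-out of concept 3:", round(read_out(acts, 3), 2))  # ~1.0: the pattern as a whole encodes it
print("read-out of concept 7:", round(read_out(acts, 7), 2))  # small interference from shared neurons
```

Nothing about this toy is faithful to a real transformer, but it captures the basic point: the meaningful unit is the pattern of activation, not the individual neuron.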
What insights about the mechanisms of AI and ways to control AI systems more precisely might the public not know? Might those insights be used by AI labs in ways we might not prefer if we did know? As AI models become a more intimate part of our day-to-day lives, might these insights be weaponized against us? With questions as weighty as these, it is crucial that public research at least keep pace, if not necessarily lead the way.
Even if these specific worries do not pan out, it is still the case that the questions of AI alignment (how consistently a model obeys human values, setting aside the political question of whose values it is aligned with), steerability (user control), and interpretability (understanding of how a model works) are large unsolved scientific problems. We need as many minds addressing these problems as possible, and the current state of academic research makes this challenging. It is not that academia lacks bright minds but that academic institutions lack the computational resources to allow those people to do their best work.
As it happens, California State Senator Scott Wiener’s SB 1047, a bill I criticized heavily last week, includes a proposal to create just such a research infrastructure, called CalCompute. While the objective is laudable, this infrastructure would create more value as a federal government effort than it would as a state government project. Let’s set aside the fact that SB 1047, in its current form, would seriously call into question the viability of the US AI industry, making an accompanying investment into AI research a questionable use of taxpayer dollars. Let’s instead explore the idea of CalCompute on its own, independent merits.
Even if SB 1047 were amended to be less harmful to the field, state governments are not well suited to this kind of effort. First, a state-level project will naturally favor California-based institutions and researchers; while California undoubtedly leads the nation in AI research, it is not the only cluster of AI talent. California creating an AI research center might lead states like New York and Massachusetts to create their own such centers, leading to inefficiencies and competition among state governments for the costly and supply-constrained hardware that powers AI. These are exactly the sort of inefficiencies the federal government exists to solve. Furthermore, with the State of California facing a $68 billion budget deficit (itself larger than the entire budget of many state governments), it is not clear that the state is in an ideal fiscal position to fund such an endeavor. Finally, California (along with most state governments) has little experience operating the complex high-performance computing facilities required for AI, and simply staffing such an initiative with the necessary experts would be time-consuming and challenging.
Fortunately, this is a problem that is thoroughly ‘in-scope’ for the federal government, which already runs some of the world’s largest supercomputers, such as the Frontier exascale computer at Oak Ridge National Laboratory in Tennessee. These facilities, however, are not optimized for AI. Frontier is used to run simulations of dynamic systems such as the climate, but lacks the hardware necessary for efficiently training the largest AI models, which are particularly dependent on extremely high bandwidth to send data between different parts of the system. Still, though, there are existing institutions within the federal government that have experience managing such complex systems.
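A rough back-of-envelope sketch shows why that bandwidth requirement is so punishing. Every number in the Python snippet below is an assumption chosen for illustration, not a measurement of Frontier or of any lab’s cluster: the point is simply that in data-parallel training, each step must synchronize the full set of gradients, and how long that takes is set largely by the interconnect.

```python
# Back-of-envelope sketch. Every number below is an assumption for illustration,
# not a measurement of Frontier or of any AI lab's actual systems.
params = 70e9            # assume a 70-billion-parameter model
bytes_per_gradient = 2   # assume 16-bit gradients
gradient_bytes = params * bytes_per_gradient   # ~140 GB of gradients per training step

# In plain data-parallel training, each step all-reduces the full gradient;
# a ring all-reduce moves roughly 2x the gradient size per GPU.
traffic_per_gpu = 2 * gradient_bytes

for bandwidth_gb_per_s in (25, 100, 400):      # assumed per-GPU interconnect speeds
    seconds = traffic_per_gpu / (bandwidth_gb_per_s * 1e9)
    print(f"{bandwidth_gb_per_s:>4} GB/s interconnect -> ~{seconds:4.1f} s of communication per step")
```

Under these toy assumptions, a modest interconnect spends seconds on communication alone at every step, which is why machines not designed around high-bandwidth links struggle with frontier-scale training.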
President Biden’s Executive Order pushed forward the National Science Foundation’s National AI Research Resource program, an excellent step in the right direction. This program, if fully realized, would create the exact infrastructure we require for open AI science to thrive. The pilot program, launched in January 2024, smartly subsidizes researcher access to the robust cloud computing infrastructure American firms have built. But a truly public cloud is also necessary: in the long run, renting compute from firms like Amazon and Microsoft is significantly more expensive than owning it outright.
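To see why, consider a simple comparison. The prices in the sketch below are hypothetical placeholders, not quotes from Amazon, Microsoft, or anyone else; the point is only that at sustained, multi-year utilization, hourly rental fees compound quickly relative to a one-time purchase.

```python
# Back-of-envelope sketch. All prices are hypothetical placeholders for illustration.
rental_per_hour = 4.00          # assumed cloud rental price per GPU-hour, USD
purchase_price = 30_000         # assumed one-time purchase price per GPU, USD
power_and_ops_per_hour = 0.50   # assumed electricity and operations cost per hour, USD
hours_per_year = 24 * 365
years = 4                       # assumed useful life of the hardware

rent_total = rental_per_hour * hours_per_year * years
own_total = purchase_price + power_and_ops_per_hour * hours_per_year * years
print(f"Renting one GPU for {years} years: ${rent_total:,.0f}")
print(f"Owning one GPU for {years} years:  ${own_total:,.0f}")
```

The exact crossover point depends entirely on the real prices and utilization rates, but for infrastructure that will run near capacity for years, ownership tends to win.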
Getting the National AI Research Resource past the pilot stage will require Congressional commitment to funding. The Task Force charged with planning the Research Resource estimates the full cost to be $2.6 billion over a six-year period—not cheap, but far from the enormous outlays required for many other federal programs.