Introduction
There’s always been a disconnect between my writing about AI and my own day-to-day use of AI models. I write often about the importance of open-source AI for the American AI ecosystem. I have criticized, some might even say vigorously, legislation that I believe would negatively impact open models. Yet I myself didn’t use open models on a regular basis. I’m not a developer; I don’t use models at scale. I have consumer-grade AI needs—a Python script here, some data analysis there, and a ton of PDF reading. While I’ve used many open models for testing purposes, my “daily driver” LLM has been Claude 3 since March, and before that it was ChatGPT.
It always struck me as worrisome, a sign of the open AI ecosystem’s immaturity, that there was no open model I was happy to use as my day-to-day LLM.
That changed on Tuesday. Meta released Llama 3.1, an update to the Llama family (“herd,” as Meta says) of models. Most intriguingly, they released a 405 billion parameter variant of the model whose performance rivals, and in some cases exceeds, OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet.
Llama 3.1: The Details
It is a very good model. While I, personally, remain smitten by Claude 3’s personality, I find Llama to be exceptionally competent. I selected some recent prompts I’d given other language models and found Llama’s responses to be just as good as those I got from the other frontier models. I also tested it with a series of prompts I keep in a file to see how a model deals with tough moral issues and political incorrectness. It dealt with most of them with aplomb—a bit better, in my view, than GPT-4o, and a bit worse than Claude 3 (Anthropic is a category leader in this regard).
We have, finally, the real deal: a frontier, open-weight model. Developers will be able to use its outputs to train smaller, more specialized models (thanks to a newly updated license from Meta that permits this). They’ll be able to build frontier AI capabilities into their applications. And they’ll benefit from falling costs over time: unlike a closed model, whose price is set by the model’s developer, the price of running Llama on cloud computing infrastructure will be subject to vigorous competition among all cloud providers. Businesses will be able to integrate LLMs into their internal processes without worrying about being dependent on a single vendor. For the first time, researchers will be able to do safety, interpretability, and capabilities research on a frontier LLM with full access to its internals.
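To make the vendor-portability point concrete, here is a minimal sketch of what calling Llama 3.1 through a cloud host might look like. Many providers expose open models behind OpenAI-compatible endpoints, so switching hosts is largely a matter of changing a base URL and a model name; the endpoint URL, API key, and model identifier below are placeholders rather than any specific provider’s real values.

```python
# Minimal sketch: querying a hosted Llama 3.1 model via an
# OpenAI-compatible chat completions endpoint.
# The base_url, api_key, and model name are placeholders (assumptions);
# substitute your provider's documented values.
from openai import OpenAI

client = OpenAI(
    base_url="https://inference.example-cloud.com/v1",  # hypothetical host
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="llama-3.1-405b-instruct",  # naming varies by provider
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the key findings of this report in three bullets."},
    ],
    temperature=0.2,
)

print(response.choices[0].message.content)
```

Because the weights are open, the same request could instead be served from infrastructure a business runs itself, which is precisely what subjects the price to competition rather than to a single developer’s pricing decisions.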
Along with the models, Meta released an extensive paper detailing the model training process. They described, in great detail and among many other things, how they curated their training data, the practical hiccups one encounters when running 16,000 Nvidia H100 GPUs in parallel (something very few people ever talk about), and how they conducted their safety testing. Nathan Lambert, author of Interconnects, tweeted that the paper alone may have advanced the open AI ecosystem by 3-9 months.
While I am by no means an open-source maximalist—there is absolutely a role for closed-source AI—I hope that Meta’s paper serves as an example for others in the industry to follow. While DeepMind continues to publish illuminating research of Renaissance-esque diversity, in my view the field more broadly has become too secretive at the frontier.
In addition to their paper, Meta CEO Mark Zuckerberg penned an essay about why open-source AI is beneficial to Meta in particular and society more generally. He begins with an historical analogy that I like very much:
In the early days of high-performance computing, the major tech companies of the day each invested heavily in developing their own closed source versions of Unix. It was hard to imagine at the time that any other approach could develop such advanced software. Eventually though, open source Linux gained popularity – initially because it allowed developers to modify its code however they wanted and was more affordable, and over time because it became more advanced, more secure, and had a broader ecosystem supporting more capabilities than any closed Unix. Today, Linux is the industry standard foundation for both cloud computing and the operating systems that run most mobile devices – and we all benefit from superior products because of it.
I believe that AI will develop in a similar way.
Zuckerberg isn’t alone. Apple seems to see things similarly. In “Apple Intelligence and the Shape of Things to Come,” I wrote:
Apple Intelligence is not so much a set of features as it is the creation of a new AI layer across the entire operating system. Today, every operating system is permeated by layers for security, internet networking, power management, rendering graphics, etc. Apple is suggesting that AI is another such layer. Even as that layer becomes more powerful and capable, it remains just that: a layer in a larger system, instead of the system itself. This is why Apple Intelligence is about more than just “a model”—it’s about multiple models running locally and in the cloud, APIs and other frameworks, and software to integrate all of this with the user’s data. Like any layer, all of this interacts with many different pre-existing parts of the operating system. In Apple’s view, AI neither replaces the OS nor gets bolted on top; it is diffused throughout.
It should not come as a surprise, then, that Apple has also emerged as a player in open-source AI. Of course, one could reasonably argue that both companies—so often diametrically opposed in their visions for technology—see AI this way because it is in their economic interest to do so. Both Apple and Meta make their money by doing something other than selling people AI models, whereas OpenAI and Anthropic have a business model predicated on doing just that. It makes sense, for Apple and Meta, to use their substantial resources and market reach to make AI models a commodity—a strategy known as “commoditizing your complements.”
Some may ask, then: is the open versus closed debate really about different views of how AI should develop? Or is it just about competing economic interests? Though this is a tempting dichotomy to draw, there is no reason it can’t be both. And indeed, this is what Zuckerberg is articulating in his letter, arguing why open-source is both good for Meta and good for the world.
Zuckerberg, and to some extent Apple, are proposing a world in which AI diffuses as a new layer of computing, distributed throughout countless devices and services we encounter in our day-to-day lives. Some of those AI models will be closed source, and others will be open, depending on the particular choices made by the relevant economic actors. Open models will have their limitations, just as open-source software does today (especially for consumer-facing applications). And so, too, will closed models. They will compete with and complement one another, and together they will form a new substrate of computing that will, over time, transform education, scientific research, the organization of businesses, and much else.
This is radically distinct from the view of many in the AI safety community, who instead often see the technology developing toward a singular, almost God-like “superintelligence” that, on its own, commands much of the productive activity in the world. I believe that the debate about open versus closed AI, and about much else in AI policy, often comes down to where one stands on this question.
Which of those two visions sounds more appealing to you? Which sounds more realistic, based on the technologies you have seen rise and fall during your lifetime? Which sounds more competitive? Which sounds more human-enriching? Which sounds more dangerous?
The path we take depends on the policies we adopt. Regulation tends to lead to path dependency, and many of the regulations we are considering today absolutely will affect the long-term viability of open-source AI.
What Llama 3.1 Means for AI Policy
With Llama 3.1, and especially its 405 billion parameter variant, we are now thoroughly in the territory of models that the AI safety community assured us, in the recent past, would present a major danger to society. The Center for AI Safety, whose lobbying arm largely authored California’s SB 1047, proposed a de facto ban on open models significantly less capable than Llama 3.1. Gladstone AI, an AI safety company, wrote a report commissioned by the US Department of State proposing that open-sourcing a model like Llama 3.1 be made a felony. And of course, it’s not just proposals: Llama 3.1 is among the first models to be classified, based on the computing power used in training, as a “systemic risk” by the European Union.
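For context on the compute-based classification: the EU AI Act presumes “systemic risk” for general-purpose models trained with more than 10^25 floating-point operations. A rough back-of-the-envelope using the standard ~6ND approximation (about six FLOPs per parameter per training token) and Meta’s publicly reported figures of roughly 405 billion parameters and roughly 15 trillion training tokens puts Llama 3.1 405B comfortably above that line; the numbers below are rounded, so treat the result as an order-of-magnitude estimate rather than an official figure.

```python
# Back-of-the-envelope training-compute estimate for Llama 3.1 405B,
# using the common ~6 * N * D approximation (6 FLOPs per parameter per token).
# Inputs are rounded from public reporting; this is an order-of-magnitude check.
params = 405e9          # ~405 billion parameters
tokens = 15e12          # ~15 trillion training tokens
train_flops = 6 * params * tokens

eu_threshold = 1e25     # EU AI Act presumption of "systemic risk"

print(f"Estimated training compute: {train_flops:.1e} FLOPs")   # roughly 3.6e+25
print(f"Exceeds the 1e25 EU threshold: {train_flops > eu_threshold}")
```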
Meta’s Llama 3.1 paper has a long section on how they tested the models for catastrophic risks. Indeed, in my reading, they provided more detail about this testing than other frontier labs. While any company’s self-reported testing should be taken with a grain of salt, they found that none of the models in the Llama 3.1 “herd” enable meaningful catastrophic capabilities such as executing cyberattacks, developing bioweapons, etc. This is in line with similar research done by OpenAI using GPT-4.
AI is a new field of policy analysis. There are few randomized controlled trials for us to run, and there is little basis for “evidence-based policymaking.” Thus, I suspect that which arguments carry the day in AI policy will be determined the old-fashioned way: who is more credible? Who has anticipated the trajectory of things more accurately? Who, as best as we can tell, is more persuasive?
The dice are undoubtedly still in the air. But thus far, those who argued for strict regulation of AI models based on compute thresholds, and who supported policies to ban (or de facto ban) open-source AI, are looking, frankly, a bit silly. Does anyone think that the world will be a more dangerous place because a 405 billion parameter version of Llama 3 is on Hugging Face? “But the next generation of models will be the ones with the truly catastrophic capabilities,” they will say, undoubtedly. Perhaps they are right, and perhaps they will look less silly then.
In the meantime, though, making policy based on this community’s reckons—which have been proven objectively incorrect—would strike me as an irresponsible decision. Perhaps even a “catastrophic” one, to use their favorite term (after “existential” proved off-putting, I suppose).
Consistently, the AI safety community has made arguments and proposed policies premised on the current dynamics of the AI industry remaining in place. We can afford to regulate open-source AI, they argued, because open models lag the frontier anyway. As a result, banning open models won’t damage our competition with China.
Yet today, the entire AI community (yes, including the AI community in China) has access to a world-class frontier model. As a result, the pace of progress may even quicken in the coming year. Perhaps open-source AI will in fact lead, in at least some important ways. Mark Zuckerberg seems to think so. From his letter (my emphasis added):
Today, several tech companies are developing leading closed models. But open source is quickly closing the gap. Last year, Llama 2 was only comparable to an older generation of models behind the frontier. This year, Llama 3 is competitive with the most advanced models and leading in some areas. Starting next year, we expect future Llama models to become the most advanced in the industry.
We do not know if Zuckerberg is correct. Perhaps the AI safety community’s vision of singular, atom bomb-like models—far too dangerous to make available widely—is right after all. I wonder about this every day.
But I also wonder: will our policymakers allow us to find out? Or will they decide for us?
> This is radically distinct from the view of many in the AI safety community, who instead often see the technology developing toward a singular, almost God-like “superintelligence” that, on its own, commands much of the productive activity in the world. I believe that the debate about open versus closed AI, and about much else in AI policy, often comes down to where one stands on this question.
I agree that many debates in AI policy hinge on differing views on this question – this is an important point and doesn't get enough attention. (Often neither party explicitly states their view, treating it as an obvious fact; as a result, we often see debates between people working from differing-but-unstated worldviews, which I think explains some of the dysfunction in the conversation.)
> Which of those two visions sounds more appealing to you? Which sounds more realistic, based on the technologies you have seen rise and fall during your lifetime? Which sounds more competitive? Which sounds more human-enriching? Which sounds more *dangerous*?
Are you saying that "many in the AI safety community" actively *hope for* a takeover by a singular superintelligence, and advocate for policies which they believe will help to bring this about? My understanding is that most folks who are concerned about x risk fear that a takeover scenario is the default path (unless some other catastrophe intervenes first); they *predict* this outcome, but are striving to *avoid* it.
I'm aware that there are some people who do in fact advocate that the way to avoid disaster is to ensure takeover by a singular, carefully aligned entity. But I don't think this is a mainstream view among the safety community? I have yet to personally interact with anyone who espouses this view.
Are there pointers you could share? This could be fodder for one of the panel discussions I've been putting together.
> With Llama 3.1, and especially its 405 billion parameter variant, we are now thoroughly in the territory of models that the AI safety community assured us, in the recent past, would present a major danger to society.
Similarly here: I'm sure there are folks who have said such things, but I believe it was a minority view? Certainly most people I follow have explicitly stated that they don't see GPT-4 class models as posing catastrophic risks.
This is a really good piece.
My speculation is that Meta's embrace of open source is not about building a moat, but about dumping earth to fill in the frontier labs' moats. The competition then shifts to delivering services built on top of the models - a place where Meta arguably has a lead with its UX and UI experience - rather than to owning the model itself. This also keeps Meta from having to license the tech from others.
Not sure whether it will work, but it's a viable strategy.