I was on a plane yesterday and got a little stir crazy. The result is that I have a bonus post for you. I hope you enjoy.
I.
Make no little plans. They have no magic to stir men's blood and probably will not themselves be realized. Make big plans, aim high in hope and work, remembering that a noble, logical diagram once recorded will never die, but long after we are gone will be a living thing, asserting itself with ever growing insistency.
Daniel H. Burnham
In 2023, a group of Chinese military research institutions used a version of Meta’s Llama language model to answer some basic questions about warfare and military organizations, such as “describe the US Army Research Laboratory.” Specifically, they fine-tuned the model on publicly available information about military affairs, and then used it to answer questions about that material. Unsurprisingly, the researchers found that their fine-tuned model was better at answering basic questions about military affairs than comparable models that hadn’t been trained on the same data.
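If you would like a concrete sense of just how unremarkable this technique is, here is a minimal sketch of document fine-tuning using Hugging Face’s transformers library. To be clear, this is my own illustration, not the researchers’ code: the corpus file and hyperparameters are stand-ins, and the checkpoint is a community mirror of the original Llama 1 13B weights.

```python
# A minimal sketch of document fine-tuning, in the spirit of the study
# described above. The corpus file and hyperparameters are stand-ins;
# this is not the researchers' actual code or configuration.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "huggyllama/llama-13b"  # swap in a smaller model to experiment
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama 1 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Stand-in corpus: any pile of publicly available text documents.
corpus = load_dataset("text", data_files={"train": "public_documents.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train = corpus["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama-finetuned",
        num_train_epochs=1,
        per_device_train_batch_size=1,
    ),
    train_dataset=train,
    # Causal-LM collator: the labels are just the inputs, shifted by one.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

That is more or less the whole trick: you continue the model’s ordinary next-token training on a new pile of text, and it gets better at talking about that text.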
The 13-billion-parameter variant of Llama—Llama 1, folks—used by these researchers is 18 months old and is ranked dead last (155th place) on the LMSYS Chatbot Arena Leaderboard. There are currently Chinese models ranked as high as 6th on Chatbot Arena—ahead of Meta’s Llama 3.1, for what it’s worth. Open-source models from Chinese firms Alibaba and DeepSeek are especially well regarded—so well, in fact, that I hear of more than a few US startups opting for such models over American open models. For now, at least, the Chinese have their own perfectly good models they can use for military purposes—and I am sure their government prefers that, for the same reason that our military would prefer to use American models.
America’s lead in language modeling is real, but it is not so overwhelming that the Chinese have no choice but to copy our models. And by the way, some prominent industry observers believe the Chinese are pulling ahead in modalities like video and images because their firms do not have the same intellectual property litigation risks that ours do.
Setting the AI race aside for a moment, this use of Llama by a Chinese military institution is easily the least impactful use case of AI I have ever written about in these digital pages.
But that’s not how some people see things. Reuters breathlessly reported that Meta’s model had been used by the Chinese “to construct a military-focused AI tool to gather and process intelligence, and offer accurate and reliable information for operational decision-making.”
The Jamestown Foundation, a foreign policy think tank, turned up the volume even further. Associate Fellow Sunny Cheung chose to describe this deeply pedestrian use case of AI as “optimiz[ing] Meta’s Llama for specialized military and security purposes,” and “enhanc[ing] the capabilities of foreign militaries.” Jamestown, of course, believes that this anecdote “highlight[s] gaps in enforcement for open-source usage restrictions,” which is think tank scholar speak for “Congress should pass the ENFORCE Act to give the Commerce Department the authority to ban frontier open-source AI.”
This hyperventilating by Reuters and Cheung is profoundly misleading. But it occurred to me that I haven’t written much about open-source per se since one of my first posts on Hyperdimensional back in January. While I still believe the core thesis of that piece, some of my thinking has changed. Now seems like a good time, then, to offer some additional thoughts on open-source AI.
II.
Consider the fact that almost all software in the world is written in English. What I mean is that the built-in functions of nearly every programming language (“print,” “if,” “else,” “for,” “class,” “return,” “continue”) are English words, or abbreviations of them. Every coder in the world, whether Chinese or French, Israeli or Bangladeshi, learns to speak at least some English because of this. English speakers are far from the most populous group of people on Earth, and yet all coders everywhere learn to speak our language. This is true because America won. To code is to submit to a staggering American civilizational victory, to acknowledge, just a little bit, the indisputable dominance of America—our people, our technology, our ideas, and, you better believe it, our words.
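To make the point concrete, here is a small piece of valid Python of my own devising. The language happily accepts identifiers written in Chinese; the keywords and built-ins admit no translation.

```python
# Valid Python 3: identifiers may be written in Chinese (or Hebrew, or
# Bengali), but the keywords and built-ins are English, always.
def 问候(名字):              # "def" cannot be translated
    if not 名字:             # neither can "if" or "not"
        return "你好，世界"   # nor "return"
    return f"你好，{名字}"

print(问候("李华"))          # nor the built-in "print"
```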
One byproduct of this staggering civilizational victory is that the Chinese military almost certainly uses open-source software that is primarily maintained by American and Western programmers. And I would be willing to bet that programming languages (themselves very often a form of open-source software) invented by Americans are a fundamental enabler of all sorts of Chinese weapons.
And it’s about more than weapons, too. As I write this, I am sitting on an airplane drinking a ginger ale given to me by a flight attendant. Software implicitly and explicitly enables all three of those things. I am writing this, of course, using software—that much is obvious. But the plane is almost certainly on autopilot (also known as “software”), and hundreds of its systems are similarly enabled by software. The ginger ale I’m drinking was manufactured and delivered to my tray table through miraculously well-orchestrated global supply chains and corporate operations, all of it undergirded by software in countless millions of ways. And as I write this, a Chinese think tank researcher is flying over his country, using just as much software—some of it probably the same exact software, or very similar.
It is likely that all, or nearly all, of this Chinese software was made using our programming languages and open-source software development tools made by Americans. Much of it was given away for free, by us, to them. To describe any of this as a mistake is to misunderstand how technologies, ideas, and economies grow and develop over time. Far from a mistake: this state of affairs is downstream of one of the greatest civilizational victories in human history.
It’s not just a linguistic or cultural victory, either. Large portions of that open-source software are licensed (even if it is free) using terms set by Americans. That is because it is Americans who led the creation of the most popular open-source software licenses, too. And those licenses reflect American customs, preferences, and laws. They get adjudicated, more often than not, in American courts.
America derives unbelievable power from these facts. And yet we do not appreciate it. To us, it is like white paint. It’s just there. We wield it blindly, without even realizing that we are doing so.
The Chinese, lacking this power, absolutely do appreciate it. They want what we have. They have their own open-source software licenses, written on their terms, pointing back to their courts, to name just one of many examples. They want to set the global standard for the next century.
I would like for AI to result in a similar smashing victory for America. To do that, we will need to set the global standard yet again. And to do that, we will almost certainly need to lead in open-source AI, because it is open protocols and open software that tend to define global standards in information technologies.
That probably means that if America “wins,” many people beyond our borders—even people in China—will use American AI for many things, including things we do not like.
III.
It’s not so simple, of course. There are plenty of things we would never sell a foreign adversary. We would never sell them our fighter jets or our missiles. We’d never sell them a nuclear bomb. That would be senseless. Yet we might well sell them (or even give them) the software tools that they, in turn, use to help build such weapons for themselves. Indeed, we already do. How many computers owned by the People’s Liberation Army run Windows? How many of them run Linux? How many of the AI models the PLA trains are built with PyTorch, the open-source machine learning library made by—you guessed it—Meta?
So which kind of thing is AI? Is it a weapon, or is it more like the information technologies I’ve described? Well, AI is an information technology, but that does not necessarily mean that its geopolitical dynamics will be quite the same as they have been with operating systems, programming languages, and other basic software infrastructure.
In all likelihood, AI will resemble neither bombs nor programming languages in the precise geopolitical, economic, and technological dynamics that govern its use. AI is a different technology, and the world is a different place than it was in the 20th century, when America laid the foundation for its present-day digital dominance.
Will AI be more like weapons, though, than like programming languages? I don’t know. My instinct is that it will behave more like other software-based information technologies, since it is, in fact, a software-based information technology. Most of the facts currently in evidence suggest as much. Could I be wrong? Sure.
But the people who are convinced that weapons are the better analogy with which to reason about AI are only guessing, too. Anyone who expresses certainty about these issues is either wildly overconfident or is trying to manipulate their audience. And those who wish to hoard our software technologies may well be foreclosing on—or perhaps do not even understand—the staggering civilizational victory that we earned through openness.
Might there come a day when we no longer want to open-source models at the frontier of AI, when the models are simply too much like weapons for us to be comfortable selling, or giving, to the Chinese? There might, and Mark Zuckerberg himself has said as much. Might that decision entail hard tradeoffs between the preferences of our national security or AI safety communities and the broader goal of maintaining America’s technological leadership? Yes. Might it be in the economic interest of the closed-source frontier labs to convince us to ban frontier open-source AI? You bet.
The central fact is that hoarding our technology within our borders is in fundamental tension with the strategy that earned us the civilizational victory whose fruits we enjoy each day. That does not mean the pro-hoarding crowd is wrong—maybe this time really is different!—but it is a tension worth pondering. Certainly I will not tolerate being told that those of us who support open-source AI are the ones endangering America’s technological leadership. If history is any guide, the opposite is true.
And might it be, in practice, impossible to stop open-source AI? Might the cat be out of the bag in many important ways? Oh my goodness, yes. While the federal government probably can bully Meta into changing its corporate strategy,1 its broader ability to project control over extremely capable neural networks is very much in doubt. OpenAI’s o1-mini model demonstrates that it is possible for shockingly small models—small enough to run on a laptop, maybe—to achieve state-of-the-art performance.
The o1 approach is fundamentally a reinforcement learning trick—ultimately, it is simply knowledge. It does not require a massive data center to discover this knowledge; it requires ingenuity. Others will discover it (perhaps even the Chinese!). Some of them will open-source their discovery—indeed, I would not be shocked if we see this happen before the end of the year. And then the cat will be, yet again, out of the bag—unless the government chooses to restrict the speech of AI researchers, which is probably not the recipe for maintaining our leadership in AI.
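To be clear about what I mean by “trick”: OpenAI has not published o1’s training method, so what follows is a toy sketch of the recipe many researchers guess is involved. Sample a chain of thought, verify the final answer, and reinforce the traces that end correctly. GPT-2 and the arithmetic problem are stand-ins of my own choosing; none of this is OpenAI’s actual code.

```python
# A toy sketch of the guessed recipe behind o1-style training: sample a
# chain of thought, check the final answer with a verifier, reinforce
# traces that end correctly. GPT-2 and the arithmetic problem are
# stand-ins; OpenAI has not published o1's actual method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-6)

prompt = "Q: What is 7 * 8? Think step by step.\nA:"
inputs = tokenizer(prompt, return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

for step in range(100):
    # 1. Sample a reasoning trace from the current policy.
    out = model.generate(
        **inputs,
        do_sample=True,
        max_new_tokens=64,
        pad_token_id=tokenizer.eos_token_id,
    )
    trace = out[0, prompt_len:]

    # 2. Verify the final answer (here, a trivial programmatic check).
    reward = 1.0 if "56" in tokenizer.decode(trace) else 0.0

    # 3. REINFORCE: raise the log-probability of rewarded traces.
    logits = model(out).logits[0, prompt_len - 1 : -1]
    log_probs = torch.log_softmax(logits, dim=-1)
    trace_log_prob = log_probs.gather(1, trace.unsqueeze(1)).sum()
    loss = -(reward - 0.5) * trace_log_prob  # 0.5 is a crude baseline
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The point of the sketch is the shape of the loop, not the specifics: a sampler, a verifier, and a gradient step. That shape is knowledge, and knowledge travels.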
Depending on the development trajectory of AI, there may well be many very useful (and dangerous) AI models that do not need to be big, and thus are easy for a wide range of actors to create and deploy. I know many very accomplished AI researchers who believe this is possible. And if that ends up being the case, the ability of governments to control the “export” of advanced AI to China will be essentially nil.
It is shockingly possible that in the next year, we will ban open-source at the frontier and cede leadership in open-source to China. Their models would dominate in poorer countries all over the world. And we would be a bit like BMW or Hermès, making exquisite luxury AI for the well-heeled. Until, of course, the Chinese get better at that too. And if highly capable open models are indeed unstoppable, we might hobble ourselves for nothing.
And this is to say nothing of the fact that if we eliminate frontier open-source AI—setting aside what exactly that means or whether it is even possible—we will be guaranteeing that what I believe will be the most powerful technology ever developed is controlled exclusively by the largest corporations on Earth.
IV.
The day may come when frontier AI really is too dangerous to open source. If so, that will be a sad day. But we’re not there yet. Today’s models are not sufficiently useful—or dangerous—to justify such a drastic shift in public policy. And while the future of AI is coming into view, we cannot quite yet grasp its precise contours. Don’t be so sure you know what the future holds. Our path to AI has been consistently surprising. There is no reason to think the surprises end here.
It’s an exceptionally difficult call to make—one among many facing us. We’ll have imperfect information when we make it, and limited time. So we will need, collectively, to exercise judgment. Such is life at the frontier in a self-governing commercial republic.
We need to be empirical. We need to be honest about what we see in front of us, and not hyperbolic in service of some agenda or another. We need to avoid hyperventilating about a foreign adversary using American software. We need to be level-headed. We need to understand that everything entails tradeoffs. We need to resist giving in to anxiety and paranoia.
And we must keep in mind that staggering civilizational victories do not come easy.
1. Different parts of the federal government are pressuring Meta in opposite directions on this issue. Many members of Congress and constituents of America’s national security apparatus are trying to get the company to stop open-sourcing models, while FTC Chair Lina Khan has suggested she’d sue it if it stops open-sourcing Llama.