What GPT-4o illustrates about AI Regulation
The important difference between regulating technology use and regulating conduct
We are in hot take territory, so forgive any inaccuracies.
Sam Hammond of the Foundation for American Innovation published his 95 Theses on AI last week. I believe that this post, like some of Hammond’s other writing, suffers from misplaced negativity and overconfidence in some assertions (biology, for example, is always more complicated than you think). Yet his theses can be summarized into three main ideas, each of which I concur with:
The Biden Executive Order’s reporting requirements for frontier models (10^26 FLOPs or higher) are basically fine, because there is not widespread consensus about the future capabilities of such models (agreed; a rough sketch of where that compute threshold sits follows this list);
The diffusion of AI will require broad deregulation across many economic sectors (agreed, in fact I believe that AI is already overregulated in some important respects);
Potential second and third-order consequences of AGI, a hypothetical and nebulous future AI system, could be politically destabilizing (agreed).
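As promised above, here is a back-of-envelope sketch of where the 10^26-FLOP reporting line sits, using the common approximation that dense-transformer training compute is roughly 6 × parameters × training tokens; the parameter and token counts below are purely illustrative, not any lab’s actual figures.

```python
# Back-of-envelope: which (illustrative) training runs would cross the
# Executive Order's 10^26-FLOP reporting threshold? Uses the rough
# ~6 * parameters * tokens estimate for dense-transformer training compute.
REPORTING_THRESHOLD_FLOPS = 1e26

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute for a dense transformer."""
    return 6 * n_params * n_tokens

illustrative_runs = [
    (7e9, 2e12),     # small open-weight-scale run
    (7e10, 1.5e13),  # mid-size run
    (1e12, 3e13),    # very large hypothetical frontier run
]

for params, tokens in illustrative_runs:
    flops = training_flops(params, tokens)
    status = "reportable" if flops >= REPORTING_THRESHOLD_FLOPS else "below threshold"
    print(f"{params:.0e} params x {tokens:.0e} tokens -> {flops:.1e} FLOPs ({status})")
```

On that rough math, only the very largest training runs come anywhere near the line, which is consistent with reading the reporting requirement as a narrow obligation on frontier labs rather than a broad one.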
Apart from my broad agreement on these points, one thesis in particular, on regulatory approaches to AI, deserves greater attention:
The dogma that we should only regulate technologies based on “use” or “risk” may sound more market-friendly, but often results in a far broader regulatory scope than technology-specific approaches (see: the EU AI Act).
Zvi Mowshowitz picked up on this too:
This as well. When you regulate ‘use’ or ‘risk’ you need to check on everyone’s ‘use’ of everything, and you make a lot of detailed micro interventions, and everyone has to file lots of paperwork and do lots of dumb things, and the natural end result is universal surveillance and a full ‘that which is not compulsory is forbidden’ regime across much of existence. Whereas a technology-focused approach can be entirely handled by the lab or manufacturer, then you are free.
This is a serious misunderstanding. Here are three broad ways you might approach AI regulation:
Model-level regulation: We create formal oversight and regulatory approval for frontier AI models, akin to SB 1047 and several federal proposals. This is the approach favored by AI pessimists such as Zvi and Hammond.
Use-level regulation: We create regulations for each anticipated downstream use of AI—we regulate the use of AI in classrooms, in police departments, in insurance companies, in pharmaceutical labs, in household appliances, etc. This is the direction the European Union has chosen.
Conduct-level regulation: We take a broadly technology-neutral approach, realizing that our existing laws already codify the conduct and standards we wish to see in the world, albeit imperfectly. To the extent existing law is overly burdensome, or does not anticipate certain new crimes enabled by AI, we update the law. Broadly speaking, though, we recognize that murder is murder, theft is theft, and fraud is fraud, regardless of the technologies used in commission. This is what I favor.
It is easy to confuse approaches 2 and 3, but the distinction is not difficult for a reasonable person to grasp. In fact, today’s release of GPT-4o demonstrates it perfectly. The new model enables real-time voice conversations with ChatGPT, making it feel much more like talking to… well, a person (or like talking to Samantha from Her, the obvious inspiration for this new model). At one point during the live demo earlier today, an OpenAI researcher turned on his phone’s camera and asked the model to guess his emotional state (which it did).
As it happens, depending on the location and context, this capability is illegal under the European Union’s just-passed AI Act. From the Act:
The following AI practices shall be prohibited:
…
(f) the placing on the market, the putting into service for this specific purpose, or the use of AI systems to infer emotions of a natural person in the areas of workplace and education institutions, except where the use of the AI system is intended to be put in place or into the market for medical or safety reasons;
Any user could, in theory, ask GPT-4o to infer the emotion of a person (either themselves or someone else) using their smartphone’s camera (and possibly also their computer’s webcam). Is ChatGPT now unlawful in European schools and workplaces for this reason? If so, does that make sense to anyone?
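To make concrete how little stands between a user and this prohibited practice, here is a minimal sketch of the request involved, assuming the OpenAI Python SDK’s chat-completions interface with image inputs; the file name, prompt, and model string are illustrative, and the real-time voice-and-video pipeline shown in the demo is more involved than a single still-frame call.

```python
# A rough sketch: asking GPT-4o what emotion a person in a camera frame
# appears to show. Assumes the OpenAI Python SDK (v1.x) and a frame saved
# locally as "frame.jpg"; names and prompt are illustrative.
import base64

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("frame.jpg", "rb") as f:
    frame_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "What emotion does the person in this image appear to be feeling?"},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{frame_b64}"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```

The point is not the code itself but that the capability is a generic property of the model: a regulator targeting this particular “use” must, in effect, police every prompt anyone might type.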
This provision of the AI Act perfectly illustrates the distinction between a use-based and a conduct-based approach to AI regulation. It is, of course, entirely lawful for me to infer someone’s emotions by looking at their facial expression, so there is no obvious reason for it to be illegal for ChatGPT to do so.
It can be easy to confuse use- and conduct-based approaches to regulation, yet the two could not be further apart in practice: in my preferred conduct-based policy regime, there is no problem with GPT-4o’s ability to infer emotions in real time; in the European Union’s use-based regime, it is illegal. In one, policymakers can concentrate on what they want the outcomes of their laws to be for society; in the other, policymakers have to fret about every potential use case of a general-purpose technology—an obvious epistemic boondoggle. One involves focusing on our preferred standards for personal and commercial conduct—things that don’t tend to change very quickly. The other requires regulators to police the use of a technology that is changing rapidly, creating uncertainty and displeasure for all.
The notion, however, that folks like me or, say, Yann LeCun, are advocating for the latter, use-based approach is wide of the mark. There is a clear difference between the two.
I hope to have more to say about GPT-4o in the coming days or weeks—I do not currently have access to the new voice features. I suspect human-machine interaction is about to take a significant leap forward.
You might appreciate this podcast on the regulation deception:
https://spotifyanchor-web.app.link/e/ASYUrdDKEJb
You've correctly distinguished the difference between use and conduct and the problem with the EU approach (which I've also criticized), but given how you opened the piece, I was expecting you to make the case against model- or input-based ways of triaging oversight. Instead you illustrated my exact point, i.e. that the EU's use-based approach is ridiculously over-broad! I agree a conduct-based approach would be better, but that's still broader in scope than, and tangential to, the case for using compute thresholds to pick out frontier labs for oversight. So how does this represent a misunderstanding on my part?