Impact Assessments are the Wrong Way to Regulate Frontier AI
A simple case study--and a better way forward
Introduction
There is a large and growing probability that America will stumble into one of the worst AI regulatory frameworks among the options that have been advanced. I’m not talking about something resembling SB 1047, the vetoed California bill about which I and others spilled countless gallons of ink in recent months. Instead, I am referring to preemptive, civil rights-based laws that will be debated in a number of states throughout the country in the next legislative session.
Under this framework, an enormous range of both AI developers and non-AI businesses that wish to develop or use AI models will be required to write algorithmic impact assessments and risk management plans before they can deploy any AI system, regardless of its size. These plans will require all parties to think through the potential ways in which their use of AI could cause harm to various civil rights-protected groups, which include:
Age
Sex
Race
Color
Religion
National Origin
Immigration Status or Citizenship
Disability
Veteran Status
Genetic Information
Under the regulatory framework being created, model developers and “deployers” (a term that often covers both developers who fine-tune models or combine them with other software and businesses that simply use them) will need to think not just about overt discrimination against the above groups, but also about “disparate impact” theories of discrimination. Developers and deployers will need to preemptively consider the ways in which their use of AI could negatively affect any of these groups, regardless of whether they intend that effect. Many state and even city governments designate additional protected classes on top of this federal framework.
This framework is more like the European Union’s AI Act than it is different, and if it becomes the US AI regulatory approach, it will do so regardless of the fact that Donald Trump was elected president: these policies will be driven by bipartisan state-level lawmaking.
This framework for AI regulation has already passed in Colorado (though it is not yet in effect), is currently advancing in the Virginia legislature, and has been additionally proposed in Texas, Connecticut, and California (the latter as draft agency regulations rather than as a proposed law; see articles 11 and 12 of the link). All these laws vary somewhat from one another, but they are more alike than not. The description I’ve provided above applies to all of them.
Because the federal government has not taken action on AI and has not moved to preempt states’ authority to regulate AI however they wish, this will be, without course correction, America’s de facto framework for the regulation of AI. It is, in my view, among the worst possible options on the table. I’ve explained why I think that before; last June, for example, I wrote that algorithmic impact assessments are “trojan horses,” permitting a great deal of government intrusion into private business activity.
I stand by that today, but I also want to lay out something more specific: why I think this framework is likely to deter adoption of advanced AI by businesses, and thereby yield fewer productivity benefits and, potentially, weaken America’s frontier AI sector for years to come.
The impact assessment framework was created during the era of “narrow” machine learning systems trained to make one or a small number of decisions. It was not at all designed with more recent AI systems like ChatGPT in mind, which can of course be used for a nearly infinite range of tasks.
To show you what I mean, I’ll first describe how I think these laws would apply to the narrow ML systems that algorithmic impact assessments (AIAs) were originally designed to regulate, and then apply the exact same regulatory tools to generalist models like ChatGPT or Claude. I hope you will quickly see that the “impact assessment” is plausibly a sensible framework for the older systems, but that it makes almost no sense at all for today’s frontier AI. Fortunately, the analysis will conclude with some recommendations for how to improve these bills.
(For the detail-oriented: when I take quotes from legislation, the bill I am using as an example is Colorado’s SB 205, which is the only version of this bill that has passed—though it is worth noting that it does not go into effect until 2026).
The Steelman for AIAs
Before I present the case against AIAs, I’ll briefly lay out the case for them.
Imagine you are an executive at a bank, and you would like to create a machine learning-based system that predicts the likelihood that a given loan applicant will pay back their loan on time. One logical thing to do would be to train a machine learning model of some kind (probably, one that is quite simple compared to a large language model) on your bank’s historical data.
An obvious problem with doing just this is that there is a good chance that you will have skewed data due to the well-documented legacy of discriminatory attitudes and policies in America (women, for example, could be denied a loan without a male co-signer until 1974).
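(For the technically inclined: here is a rough sketch of what such a system, and the kind of fairness check an impact assessment asks for, might look like in practice. Everything in it is my own illustration: the file name, the column names, and the ratio check at the end are assumptions, not anything drawn from a real bank or from any statute.)

```python
# A minimal sketch of the bank scenario: train a simple repayment model on
# historical data, then compare predicted approval rates across a protected
# attribute that was deliberately excluded from the features.
# The dataset, column names, and check below are all hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

loans = pd.read_csv("historical_loans.csv")  # hypothetical historical data

features = ["income", "debt_to_income", "credit_history_years"]
X = loans[features]
y = loans["repaid_on_time"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Disparate-impact style check: even though "sex" is not a model feature,
# skewed historical data can still produce skewed predictions.
results = X_test.copy()
results["approved"] = model.predict(X_test)
results["sex"] = loans.loc[X_test.index, "sex"]
approval_rates = results.groupby("sex")["approved"].mean()
print(approval_rates)
print("Lowest-to-highest approval ratio:", approval_rates.min() / approval_rates.max())
```

This is the setting in which an impact assessment is at least coherent: one model, one decision, a well-understood set of inputs, and a check you can actually write down.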
So perhaps it makes sense to have a policy that encourages the bank to slow down for a moment and double check its approach. Adopting a machine learning system like this throughout even a modestly sized bank is a significant undertaking in terms of both money and time, so a little extra bureaucracy can’t do too much harm. Besides, the bank is already a heavily regulated entity. This logic is the seed of the algorithmic impact assessment as a tool for policy, and while I can debate parts of it, it is by no means wholly unsound.
Unfortunately, the dynamics for language models are completely different.
Why AIAs are Incompatible with Frontier AI: A Case Study
To illustrate this, imagine that it is 2026, and that you live and work in one of the states that has passed the civil rights-based AI policy frameworks I am describing. Let’s say you’re a dentist—in fact, we’ll say you own a small dental practice. These laws affect providers of essential services, which inevitably include healthcare—so more likely than not, you are covered by the law in question.
You decide it’s time to experiment with integrating AI into your business. Specifically, you decide to try out a fancy new model from OpenAI called “o3” (a plausible, but to be clear, fictional, next-generation version of the current o1 model).
You tell the model about your dentistry practice and your goal of using AI to create invoices and automate routine paperwork. You give the AI some examples of routine paperwork as well as a few toy examples of patient records. And you give it some incomplete paperwork too, to test whether the model really can lighten your administrative load. These particular forms will be especially challenging, you think, because they have some questions that always trip up new human employees.
In the first few seconds of its reply, the AI successfully completes the paperwork.
Within about 15 seconds, the AI devises a comprehensive new paperwork automation system for your business that could reduce your office’s time spent filling out paperwork by 80%. It writes some custom Python scripts that will allow you to easily process paperwork using AI. It creates a user-friendly guide explaining how to implement the scripts. And it creates a plan for how to reconfigure the internal operations of your business to ideally accommodate this new degree of automation. It even writes a set of talking points for how to explain these new changes to employees.
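(As an aside, the “custom Python scripts” here would likely be mundane. A sketch of what one might look like appears below; the form fields and the helper function are hypothetical, the “o3” model name is the story’s fictional example, and the API calls follow the current OpenAI Python SDK only as an illustration, not as a claim about what the model would actually write.)

```python
# A hypothetical paperwork-filling script of the kind described above.
# Assumes OPENAI_API_KEY is set in the environment; "o3" is the article's
# fictional model name, not a claim about a real product.
import json
from openai import OpenAI

client = OpenAI()

def fill_form(blank_form: str, patient_record: str, model: str = "o3") -> dict:
    """Ask the model to complete a blank intake form from a patient record."""
    prompt = (
        "Complete the following dental intake form using the patient record. "
        "Return the completed fields as a JSON object and nothing else.\n\n"
        f"FORM:\n{blank_form}\n\nPATIENT RECORD:\n{patient_record}"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    # In practice you would want stricter output validation and error handling.
    return json.loads(response.choices[0].message.content)
```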
Then, over the next 45 seconds or so, the AI proceeds to what it seems to think is the more interesting problem: finding a clever way to infer each customer’s future dental needs. To do this, the AI model builds a machine learning model of its own—a statistical model. It looks at the data you collect in your patient records. It consults several dozen academic papers, analyzing what patient characteristics correlate with higher dental spending. It incorporates those lessons into a multi-agent simulation to gain a better understanding of the microeconomics of dental care. It builds a predictive pricing model that, the AI tells you, could increase your revenue by 11-14%. It cross-references relevant state and federal healthcare regulations, makes sure it complies with them all, and writes a comprehensive legal brief, just in case it’s helpful.
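(Again for the curious, the “statistical model” in this story could be as simple as the sketch below. The patient-records file, the column names, and the modeling choices are all assumptions I am making for illustration; nothing here comes from the story’s hypothetical AI.)

```python
# A hypothetical version of the predictive spending model the AI describes.
# The data file and columns are invented for illustration.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

records = pd.read_csv("patient_records.csv")  # hypothetical export

features = ["age", "visits_per_year", "months_since_last_cleaning", "has_insurance"]
X = records[features]
y = records["annual_dental_spend"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingRegressor(random_state=0)
model.fit(X_train, y_train)

print("Mean absolute error:", mean_absolute_error(y_test, model.predict(X_test)))
# Note that "age" is itself a protected class under these laws, which is part
# of why a pricing model like this plausibly triggers the compliance machinery.
```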
In under 60 seconds, the AI has written several thousand lines of code and tens of thousands of words, drawn on expert-level understanding of economics, statistics, biology, public policy, software engineering, and law, and proposed major revisions to the way your business works.
You are astounded. And you have many questions. How does this machine work? Is it correct? Did it make any mistakes? Can you trust it? What else can this machine do, and what else can you do with it? What can your competitors do with this mysterious machine, and how quickly will they do it?
And then another question enters your mind. Could the jaw-dropping work you have just seen this AI model do, if implemented, have “a material legal or similarly significant effect on the provision or denial to any consumer of, or the cost or terms of” your dental services for any protected demographic group, even if you intend no discrimination? The AI has just re-imagined fundamental aspects of your business—it would be hard to argue that the answer to this question is “no.”
Since the answer is probably “yes,” you will need to write an algorithmic impact assessment and a risk management plan.
One obvious option would be for you to ask the model itself to write your compliance documents. Of course, you will then have to consider how much you trust the model to write legal documents for you. You notice that the bottom of the model’s response has a disclaimer saying “ChatGPT can make mistakes. Please check important info.”
You also wonder whether, say, an AI-written risk management plan itself could have “a material legal or similarly significant effect on the provision or denial to any consumer of, or the cost or terms of” your dental services. Because if it could, using AI to write your compliance documents could itself require an algorithmic impact assessment.
Regardless of whether you choose to use AI for your compliance documents, the first question (among many) you’ll have to consider is “the purpose, intended use cases, and deployment context of, and benefits afforded by” this model. But there’s an obvious problem: you have very little idea what this system can do! You don’t know how much you trust it. You don’t know whether you’ll want to modify the ambitious and sweeping plan it has just proposed, or how much you’ll want to modify it. The system certainly seems like a multifaceted expert, but this itself poses a problem: how, precisely, does a non-expert assess the work of an expert?
It would be nice if you could get some real-world experience using this model in your business, but you face a Catch-22: to gain real-world experience with the model, you have to write a report saying how you plan to use the model. Yes, you can update your report to the government at any time, but in addition to requiring more paperwork, you will always have to have a fixed plan for how you wish to use AI. Whereas that makes good sense in the example with the bank from earlier, it makes approximately zero sense with something as flexible (and rapidly changing) as a frontier language model.
How AIAs for Frontier AI Could Work in Practice—and a Better Way Forward
Now, extrapolate this dynamic to hundreds of thousands of covered businesses across different industries and sizes. What are some reasonably foreseeable consequences of something like this applied to covered businesses with hundreds or even thousands of employees? A few seem apparent:
Massively centralized AI adoption plans. Given the risks of noncompliance with these laws’ broad definitions of “algorithmic discrimination,” especially for larger firms, adoption of generalist AI will need to be centralized. Management will be able to tolerate little to no employee- or team-level experimentation with AI.
Overly risk-averse culture. Obviously, implementing an AI adoption plan under a regulatory regime that forces businesses to focus on all conceivable future harms of their AI use will encourage risk aversion. In addition, though, creating a centralized process within firms will also make it likelier that AI use cases will be reviewed by broad committees of “internal stakeholders,” all of whom will have their own boutique complaints, job loss fears, biases, and anti-AI sentiments. Committees are, by their nature, likely to add an additional degree of risk aversion on top of the effect imposed by the law itself.
Inflexible AI use policies. Once an adoption plan is in place, businesses will be reluctant to change their strategies incrementally. Changes could require new compliance paperwork, and rather than go through the trouble, businesses are likely to err on the side of caution and stick with one “AI strategy”—quite possibly for longer than they should.
Employee surveillance. To ensure that employees do not use AI in ways that could run afoul of the laws (or of the firms’ own legally mandated compliance plans), employee use of AI, and perhaps even of their computers more broadly, could be surveilled by management more thoroughly.
Decreased AI diffusion. As technology diffusion expert Jeffrey Ding has pointed out, centralized plans for adoption of general-purpose technologies tend to hinder the diffusion of those technologies. Diffusion is how societies get the benefits of novel technologies, such as enhanced worker productivity, new downstream inventions, novel business practices, and the like. There is little point to innovative new general-purpose technologies unless they can be thoroughly diffused, and Ding has also argued that the economies that diffuse technologies the most successfully outperform those that merely innovate.
Note that all of these consequences have little to do with the fact that these policies all rely on “woke” theories of civil rights-based disparate impact. Perhaps you think those theories are good, or perhaps you think they are problematic.
Perhaps you disagree with the woke policies, but argue, as the right-of-center and usually anti-woke Texas Public Policy Foundation recently did (confusingly and unpersuasively, in my view), that laws like this will ensure “human dignity.” It is unclear to me what human is dignified by the existence of algorithmic impact assessments for frontier AI, especially given that disparate impact-based theories of civil rights litigation are still permissible regardless of whether this law passes.
No matter what position you hold on such issues, this analysis remains relevant. I do not view the current AI policy debate as an extension of culture war or internet policy debates of the last decade. I view it as something distinct. I view it as laying the foundation for a radically different, and potentially much better, future. And if we want that future to be better, this regulatory approach has, at best, a minor role to play.
But this analysis also points the way toward the easiest way to improve these bills. Specifically, the low-hanging fruit for improvement is:
Completely exempt all generative AI foundation models (like ChatGPT) from the law. Some versions of this law already gesture at this exemption, but none I have seen achieves the full exemption that is appropriate.
Change the definition of “algorithmic discrimination” to be focused on discriminatory intent rather than disparate impact. This will narrow the focus of the law.
This list is not exhaustive for every state. Some versions of the bill, such as the one in Texas I wrote about recently, have entirely distinct problems, such as creating an AI regulator with sweeping powers (an approach whose flaws I have examined before), or inadvertently banning large language models through sloppy drafting.
Conclusion
The laws I am describing currently seem to me to be the likeliest framework to gain widespread adoption across America. If enough large states adopt this framework within the next 6-12 months, as seems increasingly possible, it could easily become the de facto standard for any large business that uses AI—not to mention nearly all AI model developers.
It is not obvious to me what urgent problems these laws solve, at least with respect to frontier AI. It is not clear to me that this is the kind of AI regulation the public demands. Indeed, given how closely these laws resemble the EU’s AI Act, it is surprising to me that they are gaining traction at all in America, where the EU has become a cautionary tale of overregulation. As I have written before, the only explanation I can offer is that these laws are primarily being driven by the worldwide industry of technology law compliance consultants, lawyers, and lobbyists—and their audience of state lawmakers eager to “do something” about AI, regardless of what that “something” is.
All of this should be troubling for anyone who wants the AI transformation to go well. America is sleepwalking into this deeply flawed regime, and it is not clear if, or when, we will wake up.
Great analysis. The focus on discriminatory intent makes sense to me, though I am presuming there is enough legal clarity around protected classes. It gets especially confusing if you think about using AI to optimize a service business oriented toward a protected class.
Regardless, if the AI is that powerful, could the concerns be addressed "in situ" by having the AI monitor impact across protected classes and propose remedies, maybe annually, that better meet the intent of the impact assessment? There would have to be a presumption of goodwill for anyone doing that. It doesn't seem like this needs to be a capability of the foundation model; maybe it could be a service bundled in at the application layer for businesses.
Impact assessments are designed for the slow-moving, pre-LLM era of AI, to be honest. They can be revisited when we have a better understanding of the future.