The following is a comment co-authored by Miles Brundage (former Head of Policy Research and Senior Advisor for AGI Readiness at OpenAI) and Dean Ball (Research Fellow at the Mercatus Center) on the first draft of the European Union’s General-Purpose AI Code of Practice, a voluntary template for compliance with the EU AI Act’s provisions regarding “general-purpose AI.”
Happy Thanksgiving to American readers.
How we got here
When the European Commission began drafting the AI Act in 2021, society’s understanding of AI was fundamentally different than it is today. Before 2022, “AI” meant, largely, narrow machine learning systems: computer vision models used for quality assurance in factories and farms, predictive models intended to help banks process loan applications, systems intended to help enhance electrical grid efficiency, and the like. These systems were used for discrete functions, often aiding firms and individuals in making specific kinds of decisions (Is this part defective? Is this loan candidate likely to repay their loan?).
At the time, few outside the still-relatively-small AI industry imagined the “generalist” AI systems like ChatGPT that are a common part of life today. These new systems came out before the AI Act had officially passed the European Parliament, but after much of the law had been negotiated. Rather than starting from scratch, European regulators chose to add in a placeholder about “general-purpose AI models,” and to flesh out the details later.
Two weeks ago, the first product of this process was released: the draft General-Purpose AI Code of Practice. The first round of comments is due on November 28. This essay constitutes our comment.
Summary of our reaction to the first draft
The Code of Practice thoughtfully and comprehensively raises many of the most important questions in the fields of AI governance and safety. There are several aspects of the Code of Practice which we broadly support, and we suggest only minor modifications to them. At the same time, there are many places where the draft merely gestures at answers to the questions it thoughtfully raises—or provides none at all. Notably (and admirably), the authors acknowledge that the document is very early and that many open questions remain. We hope that our and others’ comments on this first version prove useful in drafting the next one.
Transparency requirements
The Code begins with a set of rigorous transparency measures: it specifies what information should be shared with the AI Office and national competent authorities, as well as with downstream model providers, and suggests that companies consider which of that information should also be made public. We believe this is a critical section, since information is the foundation of good policymaking by governments, responsible deployment decisions by downstream providers, and justified trust (or lack thereof) by the general public.
Many of the required disclosures relate to standard model characteristics and technical information that one would find in conventional model release documentation such as papers, API documents, or system cards: parameter count, architecture, training process, license, and similar. Other requirements may present a substantial compliance burden on small firms while also compelling them to publicly release proprietary information. For example, the Code specifies the following with regard to training data transparency (emphasis added):
Signatories should detail the data acquisition methods, specific information for each data acquisition method (e.g. web crawling, data licencing, data annotation, synthetically generated data, user data, etc...), details about the data processing (e.g. if and how harmful or private data are filtered), and specific information about the data used to train/test/validate the model, such as the fraction of the data that comes from different data sources, and the main characteristics of the training, testing and validation data.
It is unclear what level of detail the drafters intend; as a result, complying with this requirement could entail anything from a few paragraphs to, conceivably, hundreds of pages of transparency disclosures. Adding more detail about what precisely is meant by this passage is an essential next step. To do this, we suggest the drafters consider why they wish to impose this requirement—what benefit do they expect the European Union’s government, the AI community, or the public at large to gain from having this information?
We also note that asking signatories to “consider if the listed information can be disclosed” may not go far enough: public sharing of much of the information in this section poses limited, if any, risk to developers or to the public interest. Indeed, we believe a reasonable default assumption is that if information should be shared with downstream providers, it should typically also be shared with the public, since the same models often both serve downstream providers and directly serve consumers in first-party products.
In some cases, the requested information has multiple sub-components, some of which are not obviously relevant to the public and could create competitive risks for companies compelled to share them publicly; in other cases, we believe it is critical for the public to have the information. For example, information about the “objectives being optimised” could, on some interpretations, include technical details of reinforcement learning that are commercially sensitive. Another interpretation of this request is more straightforward: telling the public what goal an AI has and what values the company has encoded in it. This, we believe, is worth making public. In addition to clarifying this distinction and specifying when there is a strong presumption in favor of public disclosure, we recommend a clearer articulation of the overall logic underlying the designations made in the table of required disclosures (Measure 2).
We also note the potential for abuse of redaction when sharing information. Currently, the draft allows for redaction in cases where information sharing is expected to be harmful, but we believe there should be further specification of the process by which such designations are made, as well as public “meta-transparency” by the AI Office about the frequency with which, and the reasons for which, such redactions are allowed.
Lastly, we note the lack of public transparency obligations attached to the security requirements for models with systemic risk (we discuss this category in more detail below). If companies were required to disclose the security standards they apply, this could help create a “race to the top” among companies building and protecting models with increasingly dangerous capabilities.
These specific details notwithstanding, we strongly favor sensible transparency requirements that clearly specify what information should be provided to whom, especially when they are tied to a directly foreseeable benefit to the information environment, as opposed to transparency for its own sake.
Systemic risk definitions
The AI Act makes a distinction between “general-purpose AI models” and “general-purpose AI models with systemic risk.” General-purpose AI models include effectively all generative AI models; the transparency provisions discussed above, as well as others specified in the Code, apply to them. Models with systemic risk, however, are a more narrowly defined category and face more stringent regulations. We now turn to these.
The Code provides a taxonomy of systemic risks, including:
Cyber offence: Risks related to offensive cyber capabilities such as vulnerability discovery or exploitation.
Chemical, biological, radiological, and nuclear risks: Dual-use science risks enabling chemical, biological, radiological, and nuclear weapons attacks via, among other things, weapons development, design, acquisition, and use.
Loss of Control: Issues related to the inability to control powerful autonomous general-purpose AI models.
Automated use of models for AI Research and Development: This could greatly increase the pace of AI development, potentially leading to unpredictable developments of general-purpose AI models with systemic risk.
Persuasion and manipulation: The facilitation of large-scale persuasion and manipulation, as well as large-scale disinformation or misinformation with risks to democratic values and human rights, such as election interference, loss of trust in the media, and homogenisation or oversimplification of knowledge.
Large-scale discrimination: Large-scale illegal discrimination of individuals, communities, or societies.
In addition, the Code suggests that additional risks may be considered systemic as well:
Signatories may identify further systemic risks beyond those listed above, considering, for example, major accidents, large-scale privacy infringements and surveillance, as well as other ways in which general-purpose AI models may cause large-scale negative effects on public health, safety, democratic processes, public and economic security, critical infrastructure, fundamental rights, environmental resources, economic stability, human agency, or society as a whole.
As the draft notes, a current “open question” is “what are relevant considerations or criteria to take into account when defining whether a risk is systemic?”
Our response, in short, is that “less is more” when defining systemic risks. As currently drafted, the list of “systemic risks” is overbroad and risks spreading company and regulator attention too thin across a number of quite different areas. Moreover, the Code provides no detail on how these systemic risks are to be measured. Some are straightforward: all major Western AI developers, in line with their commitments to the White House and at Bletchley and Seoul, already conduct evaluations to assess model capabilities in domains such as cyberoffense and CBRN weapon development. It is much less clear how a company (or a regulator) is supposed to assess the “negative effects” of a deployed model on “society as a whole,” or where it should even begin on gnarly topics like “human agency.”
Furthermore, as currently defined, an unreleased AI model exists in a kind of systemic risk superposition. It is possible for a model to be deemed to pose a systemic risk before release, and then to lose that designation if no systemic risk materializes after release (though again—how this is to be measured, over what period of time, and by whom are all not clearly addressed in this draft). It is also possible for a model not to pose a systemic risk before deployment, and then be deemed to pose one after it is deployed. In other words, as currently written, it is possible for a model to be “a general-purpose AI model with systemic risks that does not pose systemic risks.” To quote from the Code:
Signatories recognise that detailed risk assessment, mitigations, and documentation are particularly important where the general-purpose AI model with systemic risk is more likely to (i) present substantial systemic risk, (ii) has uncertain capabilities and impacts, or (iii) where the provider lacks relevant expertise. Conversely, there is less need for more comprehensive measures where there is good reason to believe that a new general-purpose AI model will exhibit the same high-impact capabilities as exhibited by general-purpose AI models with systemic risk that have already been safely deployed, without significant systemic risks materialising and where the implementation of appropriate mitigations has been sufficient.
This creates a troubling dynamic for developers of frontier models. Because some of the “systemic risks” (such as those to “society as a whole”) can only be measured after a model is deployed, developers must treat in-development models as posing systemic risks. Yet other developers, releasing similar models later, need not take the same safety precautions as the first mover. Why? What if the safety precautions taken by the first mover are precisely why its model did not pose systemic risks?
We believe these issues—while severe in this first draft—are resolvable. The optimal path forward is to significantly narrow the definition of a systemic risk, and to base it on model characteristics that developers can reasonably observe before deployment. This would entail removing all definitions of systemic risk that would require a model to be deployed for a long period of time before the risk could be measured. In general, we believe that the following systemic risks can be measured ahead of deployment: cyberoffense, CBRN risk, loss of control, autonomous AI R&D, and persuasion and manipulation.
These risks should be measured against objective criteria (formal evaluations), and models should be assessed for systemic risk based on the marginal risk they add over the baseline established by currently available models. Further specification of what constitutes a “currently available model,” and of when capabilities are sufficiently similar, would need to be developed, but we believe this framework is a good start and resolves some of the tensions implicit in the current text.
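To make this concrete, the following is a minimal, purely illustrative sketch (in Python) of how a pre-deployment marginal-risk check against the existing capability frontier might be structured. The evaluation categories, scores, and tolerance margin are hypothetical assumptions for illustration only; nothing here is specified by the Code, and we do not propose it as a standard.

# Illustrative sketch only: evaluation names, scores, and the margin are hypothetical.
# A model is flagged for a systemic-risk category only when its evaluation score
# exceeds the best score among currently available (baseline) models by more than
# an agreed margin, i.e. based on the marginal capability it adds.
from dataclasses import dataclass

@dataclass
class EvalResult:
    category: str  # e.g. "cyberoffense", "cbrn", "autonomous_ai_rnd"
    score: float   # score on a pre-registered capability evaluation

def marginal_risk_flags(candidate, baseline_models, margin=0.05):
    """Per risk category, does the candidate meaningfully exceed the baseline frontier?"""
    frontier = {}
    for model in baseline_models:
        for r in model:
            frontier[r.category] = max(frontier.get(r.category, 0.0), r.score)
    return {r.category: r.score > frontier.get(r.category, 0.0) + margin for r in candidate}

baseline = [
    [EvalResult("cyberoffense", 0.41), EvalResult("cbrn", 0.22)],
    [EvalResult("cyberoffense", 0.44), EvalResult("cbrn", 0.25)],
]
new_model = [EvalResult("cyberoffense", 0.46), EvalResult("cbrn", 0.40)]
print(marginal_risk_flags(new_model, baseline))  # {'cyberoffense': False, 'cbrn': True}

In practice, of course, the hard work lies in defining the evaluations and thresholds themselves; the point of the sketch is only that such a check can be run before deployment, using pre-registered evaluations, rather than by observing societal effects after the fact.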
For example, current models can be (and are) used to automate aspects of AI R&D, in the sense that they generate synthetic data used to train future models, write code that improves AI developers’ efficiency, and help process and refine training data, among other use cases. These use cases are valuable and would be covered by the current definition of “automated AI R&D” in the Code—but they fall far short of “fully autonomous AI R&D.” Because the path to fully autonomous R&D may be gradual rather than a discontinuous leap, we suggest keeping the definition close to its current state, but emphasizing the marginal capability added by new models.
Any future additions to the list of systemic risks should be made only if they meet the criteria above and are truly extreme in nature (e.g., constituting the equivalent of over $10 billion in damage). Note, further, that in addition to being more conceptually coherent, focusing on just these risks, all of which directly or indirectly feature in existing voluntary commitments and are being actively studied by a range of public and private actors (leading to a growing body of practice), would keep the compliance burden within reason.
Systemic risk documentation
Developers who produce models with systemic risk must comply with a wide range of requirements, centered on two documents: a Safety and Security Framework (SSF), roughly equivalent to a Responsible Scaling Policy, to be written prior to deployment and to govern overall development practices; and various Safety and Security Reports (SSRs), to be written and published at unspecified times during the development and deployment of models. The purpose of the SSR is, in essence, to compare real-world experience with development and deployment to the safety framework laid out in the SSF.
We believe that releasing SSFs publicly is an important step for all frontier labs. The intended purpose of the SSR, too, we find sensible. Currently, however, it is unclear when and why these SSRs are supposed to be published, and having an additional set of documentation beyond SSFs could lead to confusion and duplication. It seems more efficient, all things considered, to drop the SSR requirement and simply require companies to periodically update their SSFs with insights gained from real-world experience rather than creating a new category of document.
Whistleblower Protections
Finally, we wish to comment on the whistleblower protections guaranteed by the Code. These are intended to allow employees of AI developers to report safety problems, violations of the law, and other major concerns to relevant government authorities. As currently drafted, these are quite vague:
Signatories commit to implement whistleblowing channels and afford appropriate whistleblowing protections to covered persons and activities.
We suggest that this provision be expanded to include the following:
Whistleblower protections will apply to all full-time employees of frontier AI developers.
Whistleblower protections will protect employees from retaliation by their employers.
Whistleblower protections will cover the reporting of information to a relevant government authority, rather than public disclosure of proprietary corporate information.
Whistleblower protections will cover cases where an employee has a good faith concern based on direct evidence that their employer is in violation of the AI Act.
The EU AI Office will create a process by which whistleblower reports will be received and evaluated by government, and determine which specific agency or agencies are appropriate recipients of those reports.
The EU AI Office will work with appropriate information security experts to ensure that information related to whistleblowing reports is protected from theft or unauthorized disclosure.
Conclusion
The draft Code of Practice is a key next step toward making the AI Act effective at achieving its stated goals while avoiding excessive compliance burdens. At the same time, this snapshot in time of the EU’s work reveals just how many major unanswered questions there are in AI governance. We applaud the drafters for identifying important concerns, and for their epistemic honesty, yet we also worry that many of these questions—such as how to determine what constitutes a good model evaluation, or how to operationalize “high scientific rigor”—are unlikely to be answered by the April 2025 deadline.
This demonstrates the need for flexibility in regulation and other governance practices—something that governments are not traditionally associated with, and which the authors note is a key open question. We suggest that the Code be updated periodically, perhaps annually, after this initial version is finalized in April, and that the first version give a clearer indication of the general considerations which will bear on such updates (e.g., trends in capabilities, evolving practices around system cards, etc.).
Thanks to Adrian Weller and Larissa Schiavo for helpful feedback on earlier versions of this comment. The views expressed here, and all remaining errors, are our own.