Apple Intelligence and the Shape of Things to Come
Two conflicting visions about the future of AI
Last week, Hyperdimensional reached 1000 subscribers. I’m touched and honored that anyone would take time out of their week to read my work. Thank you all. I want to especially thank the people who helped me build this audience when I had fewer than 100 readers: Nathan Lambert, Nathan Labenz, Tyler Cowen, and Kevin Xu. The best way to support this work remains sharing it with others.
Yesterday, Apple unveiled its suite of AI features—dubbed Apple Intelligence—at its Worldwide Developers Conference. There has been a great deal of confusion online about how these features work, to what extent Apple is outsourcing AI to OpenAI, and the privacy implications of it all. I thought it might be helpful, then, to write an explainer. I also think there are some interesting policy implications of the direction Apple is headed, which I’ll lay out as well.
What Apple Intelligence Is
As with many things, Apple has approached AI in its own way, playing to its own unique strengths. In that light, it’s helpful to start with what Apple Intelligence is not. Perhaps most importantly, it is not one AI model. Instead, it refers to a family of AI models, the orchestration of those models to accomplish things for users, and the software that integrates those models into the operating system (including with third-party apps). Some of those models run locally on Apple devices (for language, a 3-billion-parameter, 4-bit-quantized model with adapters for specific tasks; I couldn’t find many technical details on the image model, other than that it is a diffusion model), while others run in the cloud. There is also an AI-based classifier that looks at each task and routes it to the most appropriate model. This graphic, from the company’s Platforms State of the Union, captures it well:
The models that run in the cloud are built by Apple and use a cloud infrastructure that Apple dubs “private cloud compute.” They appear to have put serious work into making this more private than other AI cloud compute services. They’ve built their own servers using Apple-designed chips and a custom-made operating system based on iOS and macOS (both well-regarded for their security properties). Requests to the cloud are end-to-end encrypted—they can only be read by a specific node within Apple’s cloud. Apple has committed to make the system auditable by independent security experts. The whole thing is pretty impressive, though of course, the proof will be in the pudding (there is a lot more detail in the page linked above for those interested).
Some people have claimed that Apple’s commitment to privacy is “just marketing,” a charge that has been leveled at many aspects of Apple’s business for the past quarter-century. That notion is consistently belied by the serious engineering investments Apple makes in things like this.
Eventually, third-party apps will be able to plug into Apple Intelligence (unusually for Apple, this entire section of WWDC was a little light on shipping details: some features will come “later this year” and others “over the next year”). This is accomplished using Apple’s App Intents system, which the company first announced in 2022. The basic idea behind App Intents is to get third-party developers to spell out the core functions and data types of their apps as modules that can be used in other parts of the OS: in search, in home screen and lock screen widgets, in automations built with Apple’s Shortcuts app, and much else.
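App Intents is an existing, shipping framework, so it is possible to make this concrete. Below is a minimal sketch, in Swift, of what a messaging app’s “send message” intent might look like; the SendMessageIntent and MessageService names are my own inventions for illustration, and a real intent would typically declare more metadata than this.

```swift
import AppIntents

// A stand-in for the app's own messaging code; invented for this sketch.
final class MessageService {
    static let shared = MessageService()
    func send(_ text: String, to recipient: String) async throws {
        print("Sending \"\(text)\" to \(recipient)")
    }
}

// A minimal, hypothetical App Intent for a messaging app. Declaring core
// functionality this way is what exposes "send a message to Dean" to Siri,
// Shortcuts, widgets, and (eventually) Apple Intelligence.
struct SendMessageIntent: AppIntent {
    static var title: LocalizedStringResource = "Send Message"

    @Parameter(title: "Recipient")
    var recipient: String

    @Parameter(title: "Message")
    var message: String

    func perform() async throws -> some IntentResult {
        try await MessageService.shared.send(message, to: recipient)
        return .result()
    }
}
```

The important point is that the app, not Apple, declares what it can do and what parameters it needs; anything that can invoke the intent (Siri, Shortcuts, or an on-device model) gets that declaration for free.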
Apple has trained a specific version of its on-device language model to recognize App Intents and match a given user request to the correct Intent. So, if you ask Siri to “send a message to Dean in WhatsApp” or “play the new Vampire Weekend album in Spotify,” it will know how to do that because of App Intents. Apple’s long-term ambition seems to be to use this system to enable simple AI agents running locally on devices. One example they gave involved a user asking, “what time should I pick up my mom from the airport?” To answer it, Apple Intelligence orchestrated a few tasks: it found the mother’s flight information in the user’s email inbox, cross-referenced that with the flight’s actual ETA from the web, and used Apple Maps to make a real-time prediction about traffic.
This sort of agentic behavior has long been seen as the next major step in AI development. Apple is accomplishing it on “easy mode” because of its App Intents framework. They’ve been pushing developers to modularize their core app functionality for years. So instead of having to figure out how to use every feature of every app from scratch, Apple’s models have a “menu” of functionality for any app that plays along with the App Intents system.
We’ll see how well this works in practice; few of these features are testable by the public today. But if it does work, it’s a pretty good tradeoff. Apple’s agent features will not be general-purpose (they can’t figure out arbitrary software and work with it the way a human would), but they can run locally and accomplish things for users in the near term. General-purpose, training-wheel-free agents built on frontier models are only just beginning to work, and it is anyone’s guess when they will be broadly usable: it could be a matter of months, or it could take considerably longer.
Speaking of frontier models, you’ll notice I haven’t mentioned the name “OpenAI.” Indeed, nothing—nothing—I described above involves AI models made by OpenAI. What Apple did announce was that, for a subset of complex requests, users will have the option to send their request to ChatGPT. It seems users will have to proactively assert that they want to send their prompt to OpenAI each time. My strong suspicion is that this ChatGPT integration is mostly intended for requests like “Hey Siri, can you summarize the most important court cases relating to NEPA interpretation over the past three decades?” and not “Hey Siri, can you find that podcast my wife recommended I listen to last week?” The former requires detailed world knowledge that cannot be crammed into the 3 billion parameters of Apple’s on-device language models. If I asked a question like that, the classifier would recognize that this is a prompt ChatGPT could help with, and ask me if I want to send it to OpenAI. If I say yes, Siri handles sending the request to ChatGPT, presumably via the OpenAI API.
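To make the routing concrete, here is an illustrative sketch, in Swift, of the decision the system appears to be making. Nothing below is an Apple API: every type is invented to express the logic described above, and the keyword-based classifier stub is a stand-in for what is, in reality, a learned model.

```swift
import Foundation

// Everything below is invented for illustration; it is not Apple's API.
struct UserRequest { let prompt: String }

enum TaskAssessment { case simple, complex, needsBroadWorldKnowledge }

struct TaskClassifier {
    // Stand-in for the on-device classifier, which is a model, not a keyword check.
    func assess(_ request: UserRequest) -> TaskAssessment {
        if request.prompt.count < 80 { return .simple }
        if request.prompt.lowercased().contains("summarize") { return .needsBroadWorldKnowledge }
        return .complex
    }
}

enum ModelRoute {
    case onDevice      // ~3B-parameter local model
    case privateCloud  // Apple's server models on Private Cloud Compute
    case chatGPT       // third-party frontier model, only with explicit consent
}

func route(_ request: UserRequest,
           classifier: TaskClassifier,
           userConsents: (UserRequest) async -> Bool) async -> ModelRoute {
    switch classifier.assess(request) {
    case .simple:
        return .onDevice
    case .complex:
        return .privateCloud
    case .needsBroadWorldKnowledge:
        // Per Apple's description, the user is asked each time before a prompt
        // is sent to ChatGPT; declining keeps the request inside Apple's stack.
        let consented = await userConsents(request)
        return consented ? .chatGPT : .privateCloud
    }
}
```

The shape of the decision is the point: the default is to keep requests on-device or within Apple’s Private Cloud Compute, and a third-party model is an explicit, opt-in last resort.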
Apple says these requests are not tied to an OpenAI account (no account is required to use the feature on the iPhone), and that OpenAI does not retain prompts sent this way. I would also guess that the classifier is designed to avoid sending sensitive user data to ChatGPT. Apple has also said it plans to let users swap in other frontier models to fill this role (Claude, Google Gemini, Llama 3, etc.), much as users can change the default search engine in Safari. As Ben Thompson has argued (paywall), it’s a move that makes sense. Apple does not have a comparative advantage in frontier AI, just as it does not have one in search engines. Its best option, therefore, is likely to leverage its userbase to commoditize these frontier AI models: force them to compete with one another for access to Apple’s numerous and often-upscale customers.
This, I think, is Apple’s theory of the case, anyway. But could they be wrong?
What it Means: Two Conflicting Visions of AI’s Future
Recently, the former OpenAI employee Leopold Aschenbrenner released a manifesto entitled “Situational Awareness,” a guess about what the next few years in AI will look like. He believes that AGI will arrive within a few years, with superintelligent AI to follow shortly thereafter (because once AGI exists, we’ll be able to put the equivalent of millions of OpenAI-caliber automated AI researchers to work on superintelligence). The systems Aschenbrenner describes would all but obviate the need for an operating system that is exposed to the user. You would just ask your magic AI black box to do what you want (my interpretation of Aschenbrenner’s thesis is that he thinks you’ll be able to ask these systems to book a flight in 2024/25, to research and write a high-quality think tank report by 2026, to write an entire operating system from scratch in 2027, and to organize a coup against a Global South government by 2028 or so). In that world, it’s not clear why you would even need apps, or an operating system with many user-exposed features: you’d be better off just letting the computer use itself.
Apple (and, it’s worth noting, Microsoft, with its vaguely similar Copilot+ PC system unveiled recently) seems to see things a bit differently. Apple Intelligence is not so much a set of features as it is the creation of a new AI layer across the entire operating system. Today, every operating system is permeated by layers for security, internet networking, power management, graphics rendering, and so on. Apple is suggesting that AI is another such layer. Even as that layer becomes more powerful and capable, it remains just that: a layer in a larger system, rather than the system itself. This is why Apple Intelligence is about more than just “a model”: it’s about multiple models running locally and in the cloud, APIs and other frameworks, and software to integrate all of this with the user’s data. Like any layer, all of this interacts with many different pre-existing parts of the operating system. In Apple’s view, AI neither replaces the OS nor gets bolted on top; it is diffused throughout.
These two visions suggest distinct approaches to policy. If Aschenbrenner is correct, model-based regulation (and eventually, monopolization by government) seems likely, if not palatable. If Apple’s vision (or my interpretation of their vision) is correct, model-based regulation seems, frankly, kind of stupid and arbitrary. Should we also regulate operating systems’ graphics rendering pipelines to ensure that illegal content cannot be displayed on users’ screens?
Aschenbrenner’s view implicitly dominates much of the policy conversation related to AI. This is revealed by reflecting on the obsession policymakers in Washington, Brussels, and Sacramento have with “the model” and, most especially, “the weights”—the precious weights, simultaneously cherished and dreaded. They want to regulate “the model” because to them it is the sexy, new, dangerous thing. The AI community, whether e/acc or doomer, is susceptible to this style of thinking because they are also inclined to obsess over models—that’s why they are interested in AI!
In Apple’s view, the underlying models certainly matter, but only as part of a holistic system that enables specific capabilities. In this model of the future, it would be a little jejune for policymakers to obsess for too much longer over models. It would be a little bit like obsessing over operating system kernels rather than policing what customers do with computers. Rather than focusing on what models can do, the focus should be on what users “do” do.
Extrapolating from Apple’s view, I would add that even very large AI models running on multi-gigawatt data centers are only one tool in a user’s toolkit—just as they are but one way to use AI to accomplish one’s goals in Apple Intelligence.
This mentality is reflected in Apple’s own communication about their AI efforts. The first goal Apple outlines in their document introducing the models undergirding Apple Intelligence reads (my emphasis added):
Empower users with intelligent tools: We identify areas where AI can be used responsibly to create tools for addressing specific user needs. We respect how our users choose to use these tools to accomplish their goals.
Perhaps Apple’s model of the world is wrong, and something more like Aschenbrenner’s is what we should be prepared for. We do not know, and I personally struggle with these questions. All I can say is that Apple’s appraisal of the matter is more consistent with the history of technology adoption and diffusion: new technologies tend to change the world slowly, as they are integrated into individuals’ lives. Often, the most transformative changes, especially from new general-purpose technologies, come not from One Big Thing but from every person and business using the new technology in hundreds or thousands of different ways. Each specific use has a small effect, but the sum remakes the face of the world.
Thus this conflict of visions does not boil down to whether you think AI will transform human affairs. Instead, it is a more specific difference in how one models historical and technological change, and in one’s philosophical conception of “intelligence”: Is superintelligence a thing we will invent in a lab, or will it be an emergent result of everyone on Earth getting a bit smarter and faster with each passing year? Will humans transform the world with AI, or will AI transform the world on its own?