“There’s an avalanche coming,
Don’t cover your eyes.
It’s what you thought that you wanted,
It’s still a surprise.”
Vampire Weekend, “Unbearably White”
“I’ve grown not to entirely trust people who are not at least slightly demoralized by some of the more recent AI advancements.”
Tyler Cowen (October 13, 2024)
Introduction
I first became “AGI-pilled,” meaning that I began to grasp the profound promise of mechanized intelligence, sometime in the early 2000s. I was a quirky kid, and I liked to use the computer. I especially enjoyed using the computer to learn about computers. If you did that in the early 2000s, it was hard to avoid websites and IRC groups that talked about AGI. There were dark visions of an AI future, just as potent as they are today. But there were positive visions too. One of the ones I remember most fondly is a 1987 video from Apple, putting forth a hypothetical future product called “Knowledge Navigator.”
The Knowledge Navigator was not billed as an AI. Instead it was a hardware device, resembling an iPad, except that in Apple’s vision at the time, it would lie flat on the user’s desk. Running on that hardware was a virtual assistant (a man in a bowtie—I think I will always imagine our best AIs wearing bowties because of this). The virtual assistant could place and screen video calls on the user’s behalf, synthesize large quantities of academic research and other online content, and even use that information to create new knowledge. Apple imagined that you’d communicate with this device almost entirely through voice. The Knowledge Navigator was not just a computer; it was a colleague.
Ever since I saw this video sometime in the early 2000s, this has been the tool I craved most. Every time a major new general-purpose tech product came out, some part of me would compare it to the ideal of the Knowledge Navigator in my mind. For nearly a quarter-century, nothing ever quite lived up. The first iteration of ChatGPT was a huge step forward, but still far short of what was needed to achieve the vision. ChatGPT is like a photograph of the internet—a lossy artifact. It cannot, on its own, retrieve obscure information beyond what is stored (imperfectly) in its parameters. Then we got search tools like Perplexity, but these too were flawed. Though Perplexity has improved, it still feels more like a summary of the top Google results on a topic than like a true colleague. For the last two years, the Knowledge Navigator, though closer than ever, still felt just out of reach.
Looking back nearly forty years later, it is remarkable how much that vision got right. Here is how that vision was described in a book written at the time by then-Apple CEO John Sculley and co-author John Byrne:
“A future-generation Macintosh, which we should have early in the twenty-first century, might well be a wonderful fantasy machine called the Knowledge Navigator, a discoverer of worlds, a tool as galvanizing as the printing press. Individuals could use it to drive through libraries, museums, databases, or institutional archives. This tool wouldn't just take you to the doorstep of these great resources as sophisticated computers do now; it would invite you deep inside its secrets, interpreting and explaining—converting vast quantities of information into personalized and understandable knowledge.”
The Knowledge Navigator is now here, and I am indeed using it on a future-generation Macintosh. But the Navigator itself is not made by Apple; it’s a product called Deep Research from OpenAI. Deep Research is not the entirety of Apple’s vision; for example, I cannot yet have fluid conversations with it (though I can with GPT-4o using Advanced Voice Mode). But for all intents and purposes, the parts of the Knowledge Navigator I cared about for decades have arrived.
I am stunned, overjoyed, but also, if I am being honest, just a little bit wistful. What does this tool mean for us human researchers, particularly those who don’t already have a public voice? What does this tool mean for the great researchers who haven’t even yet realized their calling? Should they—we—just throw in the towel? And what does it mean for our society more broadly now that the Knowledge Navigator is here? I’d like to walk you through my thoughts, which somehow feel exceptionally preliminary, despite having percolated in my mind for two-thirds of my waking life.
Using Deep Research
Let me start with the specs. OpenAI Deep Research is based on OpenAI’s o3 model. As of today, this is the only way to access the full-sized o3 model, though the standalone version of this model (with no “Deep Research” functionality) should be coming soon enough. Like the o1 model that came before it, o3 is a “reasoner,” meaning that it is trained using a technique called reinforcement learning to use extra inference-time compute to think before it writes its answer to your prompt. Deep Research puts this top-tier reasoning model into agentic scaffolding—software around the model that allows the model to efficiently pull information from the internet and browse the web (scaffolding can enable other actions too, but as far as I know, that’s it in this product).
Deep Research is available (for now) only in the Pro tier of ChatGPT, which costs $200 per month. Users in that tier are restricted to 100 queries per month. Eventually, users in the free and $20 per month tiers of ChatGPT will get a much more limited number of Deep Research queries. Deep Research is also only available in the ChatGPT web interface, not the desktop or mobile apps. OpenAI has said that they plan to loosen all these limitations soon, but these are the facts today.
A quick note: I’ve seen some confusion online about the user interface of Deep Research. It is accessible as a button below the prompt text box rather than as an entry in the model-selection drop-down menu. This has led some people to speculate that they need to pick the model with which they want to use Deep Research and then separately activate the Deep Research button. This is wrong: whenever you select the Deep Research button, you are using o3, regardless of what model you have chosen in the drop-down menu.
One other key innovation that Deep Research adds on top of o3 alone is that, as pointed out by OpenAI’s former Chief Research Officer Bob McGrew, the reinforcement learning has been adapted to teach the model to take actions in addition to just thinking. What this means in practice is that the model can pursue research goals for you for extended periods of time; Deep Research can take as long as 30 minutes to answer your questions, and for this entire time it is collecting information, reflecting on that information, and modifying its research plan in light of that new information.
Because of this, OpenAI Deep Research is very different from Gemini Deep Research, which is Google DeepMind’s competitor product. Gemini Deep Research, which impressed me in my first few uses and then gradually started to underwhelm me, formulates a research plan and then queries the web for topics relevant to that plan. It then (I assume) puts relevant webpages into a vector database, and later goes into an “analysis” mode, where (again, just assuming here) it queries that database with questions from the research plan. The human equivalent would be copy-pasting a bunch of web pages into a giant text file and then searching that file to write a summary.
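To make that pattern concrete, here is a minimal sketch of the retrieve-then-summarize design I am imagining. To be clear, this is my assumption about Gemini’s architecture, not anything Google has documented, and the function names (llm, search) are hypothetical stand-ins.

```python
# A sketch of the retrieve-then-summarize pattern described above. This is
# an assumption about Gemini Deep Research, not a documented design; llm and
# search are hypothetical stand-ins for a model call and a web-search tool.

def llm(prompt: str) -> str:
    raise NotImplementedError("stand-in for a language-model call")

def search(query: str) -> str:
    raise NotImplementedError("stand-in for a web-search tool")

def pipeline_research(question: str) -> str:
    plan = llm(f"Draft a research plan for: {question}")
    queries = llm(f"List web search queries for this plan:\n{plan}").splitlines()

    # All retrieval happens once, up front; in practice the results would
    # likely sit in a vector database that the analysis step queries.
    # Nothing found here feeds back into the plan.
    corpus = [search(q) for q in queries]

    # "Analysis" is then a single closed-book pass over the static corpus.
    return llm(f"Using only these sources, answer: {question}\nSources: {corpus}")
```

The structural point is that the plan never changes once retrieval begins, which is why the output tends to read like a summary of search results.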
OpenAI Deep Research is doing something wholly different. It is extracting content from the web, no doubt, but it is also reflecting every time it reads new information, and occasionally going down rabbit holes. For example, one of my early prompts with the model involved asking for a summary of existing consumer protection and civil rights laws that could in principle be applied to AI in about a dozen states. During its search, Deep Research stumbled on regulatory guidance issued by a state Attorney General and noted the laws that the guidance cited. It then decided to consult the exact text of those laws to understand the precise nature of the statutory authority that Attorney General was invoking.
This is how I conduct policy research. I do not Google a bunch of things and then write down my vague impression from my perusal of the results. I look at primary source documents, seek to understand them, and from those primary sources find other primary sources. I reflect on what I am learning and come up with wholly new queries, unanticipated in my original research plan. One might almost say I navigate knowledge. And so, too, does OpenAI Deep Research.
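By contrast, a sketch of the iterative pattern I am describing would look more like a loop, in which every new observation can rewrite the plan. Again, this is an illustration built from hypothetical stand-ins (llm, search, fetch_page), not OpenAI’s published design.

```python
# A sketch of the agentic, reflect-and-revise pattern described above --
# an illustration, not OpenAI's actual implementation. llm, search, and
# fetch_page are hypothetical stand-ins.

def llm(prompt: str) -> str:
    raise NotImplementedError("stand-in for a reasoning-model call")

def search(query: str) -> str:
    raise NotImplementedError("stand-in for a web-search tool")

def fetch_page(url: str) -> str:
    raise NotImplementedError("stand-in for a page-retrieval tool")

def agentic_research(question: str, max_steps: int = 50) -> str:
    notes: list[str] = []
    plan = llm(f"Draft a research plan for: {question}")
    for _ in range(max_steps):
        # The model reasons over its plan and notes, then picks one action.
        action = llm(f"Question: {question}\nPlan: {plan}\nNotes: {notes}\n"
                     "Reply with one of: SEARCH <query> | READ <url> | DONE")
        if action.startswith("SEARCH"):
            notes.append(search(action.removeprefix("SEARCH").strip()))
        elif action.startswith("READ"):
            notes.append(fetch_page(action.removeprefix("READ").strip()))
        else:
            break  # the model judges its notes sufficient to answer
        # The plan is revised after every observation; this is the step that
        # lets the system chase a cited statute into its primary-source text.
        plan = llm(f"Revise the plan given this new note:\n{notes[-1]}")
    return llm(f"Write a cited report on: {question}\nNotes: {notes}")
```

The difference between the two sketches is a single feedback edge, from observation back to plan. That edge is what turns a summarizer into something that navigates.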
The result is a system that, while imperfect (I note, for example, that Deep Research asserted with no citation that SB 1047 would have required model developers to perform “impact assessments,” which is untrue), is essentially as good as I am at conducting policy research, which is in some sense what I get paid to do all day. It is probably better at this than I was early in my career. I have personally used it to conduct research in a few minutes that otherwise would have taken me days or longer. This is a massive accelerant to my work, but is it also a threat?
The Meaning of Deep Research
Soon after my first experience with ChatGPT, I anticipated that something like OpenAI Deep Research would someday exist. It is obvious as a product idea. Many people will see it as a threat to their jobs, and I will be blunt: for many people, it probably is. Yet I anticipated this product would exist (indeed, I am surprised it has taken so long), and I still chose to start a new career as a writer. Why?
It might make sense to start with some of the deeper flaws in Deep Research—and I don’t mean hallucinations. In fact, in a sense I mean the opposite: on some questions, Deep Research can become too grounded in its source material.
Let me illustrate this with two different queries I made. The first asked for precise details on certain aspects of the software procurement policies of every cabinet-level federal agency. Here Deep Research blew me away, answering in about ten minutes a question that would have taken me several days of soul-crushingly boring research. In fact, this question would be so tedious to answer that there is a high likelihood I would never have gotten around to investigating it. This is pure augmentation; I now know facts about the world that I would not have known were it not for this tool.
The second query asked Deep Research to investigate the extent to which “algorithmic discrimination” is a widely observed phenomenon in the real world. This is research I myself have done, so I was familiar with many of the sources it consulted. The trouble with this issue is that it is one of many examples of herd mentality within the policymaking community. Almost everything you find online says that algorithmic discrimination is a widespread phenomenon and encourages immediate policy action on the topic. Most of that research comes from the late 2010s and early 2020s, when both the AI systems in question and the tenor of our politics on this issue were markedly different than they are today.
There are many critical questions one could ask (is the alleged algorithmic discrimination being compared to a contextually appropriate human baseline? Can disparate outcomes be attributed to factors other than discrimination, algorithmic or otherwise?), but even when given these questions as hints, Deep Research never really pursued them. It took the sources too much at face value and did not inquire more deeply.
On even thornier matters, such as open problems in AI governance, the model is also weak, presenting me with all the nebulous buzzwords that are familiar to anyone who follows AI policy (“human in the loop,” etc.). This happens even when you ask the system to probe. Perhaps I am simply prompting the model incorrectly, but it is worth noting that I tried a few times.
Thus, imaginative or contrarian thinking is still a domain where talented humans are probably dominant—though I would note that Deep Research did a far better job even in its subpar answers than the average human (maybe even average AI policy researcher) would have done. In this way, the idea that AI will augment human intelligence is still intact: I don’t need an AI that can answer a politically thorny question 60% as imaginatively as I can, but I absolutely love an AI that allows me to answer orders of magnitude more tough, objective research questions.
How long will this idea remain intact? I do not know. And I worry for the young people who haven’t managed to get their foot in the door in writing about public affairs (or many other professions where being a research assistant is the entry point). It may become far harder to make one’s mark as these tools proliferate, and I personally cannot see why I would ever need a human research assistant again. My only advice to young people is to accumulate social capital, using these tools and the broader set of tools available to you on the internet, as rapidly as you can. Your window may well be closing. I wish I had something better to say, but at this point I do not.
What I can say is this: with this tool in hand, everything I write can be the product of weeks or months more research time than was possible just a few days ago. That is astonishing: time itself has been compressed. I can do a decade’s worth of work in a year. This, rather than the bare LLM, is my printing press. Steve Jobs called computers a bicycle for the mind; this, however, is an airplane.
Policy Implications
This product may be the most important research and information tool created during my lifetime (so far), and by now I would be bashing you over the head if I reminded you that the underlying LLM can be expected to improve dramatically roughly every three months for the foreseeable future.
Soon enough, these systems will be able to find and download datasets from the web, conducting sophisticated quantitative analysis on them (perhaps even using techniques from machine learning—AIs making AIs). They’ll be able to hold vastly more information in their context windows, and synthesize that information in more creative ways than they currently can.
Tools like Deep Research will be a primary, if not the primary, interface for any person engaged in the creation of knowledge or insight.
If I have ever sounded angry when I write about bills like the Texas Responsible AI Governance Act and its many lookalikes across America (we’re close to 20 of them now, by the way), this is why. Among a great many other things, these bills create the possibility of negligence liability for developers for any “discriminatory” outputs of their models. And “discriminatory,” in virtually all these laws, means the mere presence of a disparate demographic effect rather than any intention to discriminate. What that likely means in practice is that these laws will force AI companies to massively censor the outputs of their models—just as their models are becoming useful for genuine knowledge creation.
This is, in my view, an attack on the principles of open inquiry that brought us to this point of technological potential. And it is being inflicted on the tool I have dreamed of having since I was a boy. This strange new parameter of avoiding “discrimination” is being imposed by governments across the country without my, or really anyone else’s, consent, and it will affect me in the most intimate of ways: changing the way I learn about the world. That this policy was conceived in a working group convened by an organization called the Future of Privacy Forum is especially rich. The fact that this state-mediated intrusion into something so personal is being perpetrated in the name of privacy is more than just an irony—it is a molestation of the English language, and a civilization-scale crime, conducted in plain sight.
Conclusion
I’ll continue to push back against these invasions of our liberty, as best I can, and you better believe I’ll be using Deep Research to help me. I don’t have a law degree, but I have a great lawyer by my side (I also have many lawyer friends—accumulate social capital!). I don’t know the intricacies of every state agency regulation, but I have a diligent research assistant who can analyze them in minutes. I don’t know the state of every sub-field of academic inquiry, but I happen to have a keyboard that can summon an expert in all of them in seconds.
Without tools like this, the complexity of modern society is a weapon that the forces of the status quo can throw at even the smartest people trying to change things. But with tools like this, suddenly the tables are turned. Because now we can analyze that complexity, probe it, and find its weaknesses in ways that were nearly impossible for most people just a few days ago. I cannot reform our polluted system of government on my own, nor can any small group of dedicated people, but with these tools in our hands, perhaps we have ourselves, at last, an honest ball game. Real change—genuine change—is possible. Like Deep Research for the last two years, it is just beyond our grasp. But we can now reach further than ever before.
But the potential here is far greater than just policy. With these tools and many others to come, I hope we will cure people like my mother of crippling autoimmune diseases (soon, please). I hope we find a way to extend human lifespans, but really I hope someone figures out how to extend the lifespans of my two beloved cats before too long. I hope that one day I will find the time to read all the books I never was able to read, and that when I do I will have many friends, and a kind genius in the cloud, with whom to discuss them. I hope we can better our lots in life, each in our own way.
So despite some trepidation and some wistfulness, I am happy to see my childhood dream come to fruition. My long-sought tool, my new instrument, my knowledge navigator, my competitor, my colleague, my friend, is finally here.