AI narration of this post:
https://askwhocastsai.substack.com/p/on-ai-black-boxes-by-by-dean-w-ball
Regulators are completely out of their depth here. In their efforts to “control” or “safety-ize” AI, they will struggle and then increasingly try explicitly or inadvertently to strangle innovation. And they will fail.
I like the idea of neural networks as emergent orders, but on reflection I'm not sure how well it works.
For instance, you say an emergent order is self-adaptive. This seems true of the *training process* of a neural network: it creates a seemingly-intelligent artifact without being straightforwardly designed. But it's not clearly true of a trained neural network itself, which is a deterministic process. (Maybe in-context learning gets you some adaptivity?)
As a result, throughout the essay I'm not really sure if you're talking about the model or the training process or both, so the concept feels a bit slippery.
“Self-adaptive,” as you correctly note, would refer to a neural network when it is being trained. But because of the complexity and scale of a contemporary frontier model, it’s not obvious that the results of this self-adaptive process will ever be fully explainable to humans. This is similar to how many genes don’t really have a human-interpretable or explainable function. They just do a bunch of different, hard-to-explain, and not obviously related things. No amount of scientific progress changes this basic fact, which is indeed why we often need neural networks to make progress in science.
So even though a model is not literally self-adaptive when it is doing inference, it is still an emergent order (a kind of frozen one) which is likely to be challenging for humans to rationally understand in the limit. (Inference is probabilistic, though, and as you note, in-context learning can give you some adaptivity. In fact, I know ML researchers who think it can give you the same level of adaptivity as a human, though I disagree with them.) Though to be clear, interp is going to take us a long way! Maybe even as much as we need.
I notice that you didn't provide any reasons as to why we might trust that these emergent orders are safe.
Well, remember that my essay is describing neural networks far smaller and less capable than something like GPT-4o, and we have empirically not observed any safety issues. I also did write:
"With diligent engineering, we might—and I believe we will—make these machines fairly predictable, maybe even very predictable."
But more broadly, you are right that the focus of this piece was not safety. It was interpretability.
Oh sorry, I missed those sentences. I probably wouldn’t have written this if I’d seen them.