7 Comments
Aug 29 · Liked by Dean W. Ball

Regulators are completely out of their depth here. In their efforts to “control” or “safety-ize” AI, they will struggle, and then increasingly try, explicitly or inadvertently, to strangle innovation. And they will fail.

I like the idea of neural networks as emergent orders, but on reflection I'm not sure how well it works.

For instance, you say an emergent order is self-adaptive. This seems true of the *training process* of a neural network: it creates a seemingly intelligent artifact without being straightforwardly designed. But it's not clearly true of a trained neural network itself, which is a deterministic process. (Maybe in-context learning gets you some adaptivity?)
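To make the distinction concrete, here is a minimal sketch in PyTorch (a toy model; the names are illustrative, not from the essay). The parameters adapt during the training loop, but the artifact that loop leaves behind is a fixed, deterministic mapping:

```python
# Minimal sketch, assuming PyTorch; the toy linear model is illustrative only.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)                      # stand-in for a neural network
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Training: the weights adapt to the data without being hand-designed.
x, y = torch.randn(32, 4), torch.randn(32, 1)
for _ in range(100):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()                               # parameters change here

# Inference: the weights are frozen; the same input gives the same output.
with torch.no_grad():
    probe = torch.randn(1, 4)
    assert torch.equal(model(probe), model(probe))  # deterministic mapping
```

The "emergent order" language seems to fit the loop in the middle, not the frozen function it produces.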

As a result, throughout the essay I'm not really sure if you're talking about the model or the training process or both, so the concept feels a bit slippery.

author

“Self-adaptive,” as you correctly note, would refer to a neural network while it is being trained. But because of the complexity and scale of a contemporary frontier model, it’s not obvious that the results of this self-adaptive process will ever be fully explainable to humans. This is similar to how many genes don’t really have a human-interpretable or explainable function. They just do a bunch of different, hard-to-explain, and not obviously related things. No amount of scientific progress changes this basic fact, which is indeed why we often need neural networks to make progress in science.

So even though a model is not literally self-adaptive when it is doing inference (though it is probabilistic, and as you note, in-context learning can give you some adaptivity; in fact, I know ML researchers who think it can give you the same level of adaptivity as a human, though I disagree with them), it is still an emergent order, a kind of frozen one, which is likely to be challenging for humans to rationally understand in the limit. Though to be clear, interpretability is going to take us a long way! Maybe even as much as we need.
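To illustrate the in-context point, here is a hedged sketch using the Hugging Face transformers pipeline with GPT-2 as a stand-in (purely illustrative, not from the essay). The weights never change between the two calls, and greedy decoding keeps everything deterministic, yet the few-shot prompt shifts the behavior:

```python
# Hedged sketch: frozen weights, behavior steered only by the context.
# GPT-2 via the transformers text-generation pipeline stands in for a
# frontier model; the prompts are illustrative.
from transformers import pipeline

generate = pipeline("text-generation", model="gpt2")

# The same frozen model in both calls; only the prompt differs.
bare = generate("happy ->", max_new_tokens=3, do_sample=False)
primed = generate("hot -> cold\nbig -> small\nhappy ->",
                  max_new_tokens=3, do_sample=False)

print(bare[0]["generated_text"])    # no pattern to pick up on
print(primed[0]["generated_text"])  # few-shot examples steer the frozen model
```

Everything the model "adapts" to here lives in the prompt; the frozen order underneath stays exactly as training left it.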

I notice that you didn't provide any reasons as to why we might trust that these emergent orders are safe.

author

Well, remember that my essay is describing neural networks far smaller and less capable than something like GPT-4o, and we have not empirically observed any safety issues with them. I also did write:

"With diligent engineering, we might—and I believe we will—make these machines fairly predictable, maybe even very predictable.With diligent engineering, we might—and I believe we will—make these machines fairly predictable, maybe even very predictable."

But more broadly, you are right that the focus of this piece was not safety. It was interpretability.

Oh sorry, I missed those sentences. I probably wouldn’t have written this if I’d seen them.
