9 Comments

Here is another analogue of the interpretability problem: the constant shibboleths between different ideological factions (democracy vs populism, entrepreneurship vs co-ops, capitalism vs cronyism) and the issue of translation between them.

I very much agree with the gist of this post, and think it's a good intro to the topic too!

> Incidentally, I wonder if the machine learning interpretability problem suggests a skeptical possibility about human communication. Maybe we make our decisions on the basis of vastly complex processes that bear very little resemblance to the explanations we give for our decisions. Maybe all or nearly all explanations are just post-hoc rationalisations.

This is very much the conclusion of Robin Hanson, e.g. in the book he co-wrote: https://en.wikipedia.org/wiki/The_Elephant_in_the_Brain

In terms of noticing what you describe as "conceptual richness problems", I often think or explicitly write/talk about them as "philosophy", e.g. 'the philosophy of accounting'.

I suspect there's a 'natural intelligence control problem' that we ourselves (humanity) face when implementing any of our ideas into actions. (We are our own thankfully limited genies.)

I’d raise an objection to the hypothesis that most/many of our articulated reasons for doing things are post-hoc rationalizations unrelated to the “real” reasons mainly because we communicate these reasons to other humans, and, if they generally accept those as [plausible] reasons, or go further and can imagine themselves doing the same thing if they were in the same position, then these surface reasons are at least similar in some way to genuine reasons, even if not exhaustive or completely identical. It’s overcomplicating things to suppose that a hypothetically sympathetic listener runs through a thought-experiment where they do the same things as the speaker but for ineffable reasons, generate a post-hoc explanation, and then check to see if that fictional explanation lines up with the speaker’s.

Looks like you rediscovered computational complexity theory and the P vs. NP problem. A good read would be Scott Aaronson, who has some papers intended for philosophers, and this brief introduction: https://cs.stackexchange.com/a/9566/65339
