Here is also one analogue of the interpretability problem: the constant shibboleths between different ideological factions (democracy vs. populism, entrepreneurship vs. co-ops, capitalism vs. cronyism) and the issue of translating between them.
I very much agree with the gist of this post, and think it's a good intro to the topic too!
> Incidentally, I wonder if the machine learning interpretability problem suggests a skeptical possibility about human communication. Maybe we make our decisions on the basis of vastly complex processes that bear very little resemblance to the explanations we give for our decisions. Maybe all or nearly all explanations are just post-hoc rationalisations.
This is very much the conclusion of Robin Hanson, e.g. in the book he co-wrote: https://en.wikipedia.org/wiki/The_Elephant_in_the_Brain
In terms of noticing what you describe as "conceptual richness problems", I often think of them, or explicitly write/talk about them, as "philosophy", e.g. 'the philosophy of accounting'.
I suspect there's a 'natural intelligence control problem' that we ourselves (humanity) face when implementing any of our ideas into actions. (We are our own thankfully limited genies.)
I’d raise an objection to the hypothesis that most/many of our articulated reasons for doing things are post-hoc rationalizations unrelated to the “real” reasons, mainly because we communicate these reasons to other humans. If they generally accept those as [plausible] reasons, or go further and can imagine themselves doing the same thing in the same position, then these surface reasons are at least similar in some way to the genuine ones, even if not exhaustive or completely identical. It’s overcomplicating things to suppose that a hypothetically sympathetic listener runs through a thought-experiment in which they do the same things as the speaker but for ineffable reasons, generates a post-hoc explanation of their own, and then checks whether that fictional explanation lines up with the speaker’s.
To be clear, I don't believe this hypothesis; it's just a natural thought raised by the material that needs to be considered. I quite like your answer.
Looks like you rediscovered computational complexity theory and the P vs. NP problem. Scott Aaronson would be a good read; he has some papers intended for philosophers. There is also this brief introduction: https://cs.stackexchange.com/a/9566/65339
My concern with the idea that there is a close connection is that if the problems of thick concepts were like problems in NP, we would be able to write an algorithm that verifies whether a particular "solution" (an application of a thick concept) is correct, even if finding one were hard; but we can't do this.
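(To make the "verify vs. find" asymmetry concrete, here is a minimal sketch in Python using Boolean satisfiability as the stock NP example; the encoding and function names are purely illustrative, not anything from the post.)

```python
from itertools import product

# A formula is a list of clauses; each clause is a list of signed variable
# indices, e.g. 3 means x3 and -3 means NOT x3.

def verify(formula, assignment):
    """Checking a proposed solution takes time linear in the formula size."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

def solve_by_search(formula, num_vars):
    """Finding a solution is the hard part: up to 2^n candidates to try."""
    for bits in product([False, True], repeat=num_vars):
        assignment = {i + 1: b for i, b in enumerate(bits)}
        if verify(formula, assignment):
            return assignment
    return None

# (x1 OR NOT x2) AND (x2 OR x3)
print(solve_by_search([[1, -2], [2, 3]], num_vars=3))
```

The point of the sketch is only that `verify` is fast while `solve_by_search` blows up; that asymmetry is what the NP analogy relies on, and it is exactly what seems to be missing for thick concepts.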
In that case, it simply belongs to a harder class, such as EXP (problems solvable in at most exponential time), or even lies outside the computable problems altogether, i.e. those that cannot be solved in full generality by an ordinary Turing machine. However, we can use oracle machines and oracle queries to reason about the counterfactual in which we have access to machines of such power. The (physical) Church-Turing thesis conjectures that no computer capable of solving uncomputable problems exists in nature. There is also a variant, the extended Church-Turing thesis, which claims that no physically realizable computer can compute fundamentally faster than a (probabilistic) Turing machine. That variant is widely expected to be false, since quantum computers appear to offer super-polynomial speedups on some problems, although existing devices are still very weak, and making them scalable is a very hard physics and engineering challenge.
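(A hypothetical sketch of the oracle-machine idea, just to show the shape of the formalism: an ordinary program that may call an assumed black-box `oracle` as a single step. The names and the toy oracle here are mine, not from the comment.)

```python
from typing import Callable, Iterable, Optional

def find_witness(candidates: Iterable[str],
                 oracle: Callable[[str], bool]) -> Optional[str]:
    """An ordinary search routine, except that each oracle call counts as
    a single step, however hard the question it answers would really be."""
    for c in candidates:
        if oracle(c):  # one oracle query
            return c
    return None

# With a trivial, computable stand-in oracle the machine behaves normally;
# the formalism only gets interesting when the oracle is *assumed* to decide
# something we cannot compute, e.g. the halting problem.
print(find_witness(["a", "ab", "abc"], oracle=lambda s: len(s) == 2))
```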
See also https://complexityzoo.net/Petting_Zoo for more complexity classes.
I think my favorite Aaronson paper ["Why Philosophers Should Care About Computational Complexity"](https://www.scottaaronson.com/papers/philos.pdf) is _somewhat_ relevant to this, but that this post is about something much more general.
This seems to be very much about what David Chapman describes as 'nebulosity': https://meaningness.com/nebulosity