I’ve seen many claims lately that one-boxing is obviously correct, in a way that impugns the skill of any philosopher who would doubt it. Although I am a one-boxer, I think treating two-boxers as defective is unfair. Much better philosophers than I have been two-boxers. Because this discourse has irritated me, I want to briefly review the case for two-boxing and show that it is actually pretty strong.
Here's a quick refresher on the problem: In front of you are two boxes, a red box and a blue box. Either the red box contains ten thousand dollars and the blue box contains one million dollars, or the red box contains ten thousand dollars and the blue box contains nothing. You have to choose whether to take both the blue box and the red box or just the blue box.
An oracle who can see the future has already placed the money in the boxes. If the oracle sees that you will take both boxes, it puts ten thousand dollars in the red box and nothing in the blue box. If it sees that you will take only the blue box, it puts ten thousand dollars in the red box and one million dollars in the blue box.
Should you take both boxes or just the blue box? Personally, I would take just the blue box. I would then open it to find it contains a million dollars. Great! So surely anyone who takes both boxes is being stupid?
No, I don’t think so, though they will end up less rich. Here’s why it’s not stupid to take both boxes. Either the money is already in both the blue box and the red box, or it is only in the red box. Your choice of which box(es) to take has no effect on this; the placement is already done.
And so comes the fundamental argument for taking both boxes. Either the money is in both the blue box and the red box, or it is only in the red box. If it is in both boxes, taking both boxes gets you more money. If it is only in the red box, the only way you’ll get any money at all is to take both boxes. So, whatever the state of the world, taking both boxes results in a better outcome. The principle that a choice is correct if, however things turn out to be, it yields the best outcome is called choice dominance. It is intuitively a very persuasive principle, and not taking both boxes violates it.
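To make the dominance point concrete, here is a rough sketch in Python, just plugging in the dollar amounts above (the helper and variable names are my own illustration, not part of the problem): whichever state the oracle has already created, taking both boxes pays exactly ten thousand dollars more.

```python
# A rough illustration of choice dominance, using the dollar amounts above.
# The state of the world (whether the blue box is full) is fixed before you
# choose; in either state, taking both boxes pays 10,000 dollars more.

RED = 10_000           # the red box always contains 10,000 dollars
BLUE_FULL = 1_000_000  # what the blue box contains if the oracle filled it

def payoff(blue_is_full: bool, take_both: bool) -> int:
    """Total money received, given the state and your choice."""
    blue = BLUE_FULL if blue_is_full else 0
    red = RED if take_both else 0
    return blue + red

for blue_is_full in (True, False):
    both = payoff(blue_is_full, take_both=True)
    blue_only = payoff(blue_is_full, take_both=False)
    print(blue_is_full, both, blue_only, both - blue_only)
# True  1010000 1000000 10000
# False   10000       0 10000
```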
Consider a common variant. Suppose the boxes were transparent, so you could already see whether each contained money. It would be evident that no magic was happening at the moment you picked: you could literally see that the money was already in the red box alone or in both boxes. Would you then not choose to take the red box as well? Why deny yourself ten thousand dollars when you can see the matter is already settled? This, I think, is where much of the “easy” case for one-boxing comes from: the false sense that because the boxes are opaque, things “haven’t happened” until the boxes are opened.
Consider another variant. Suppose that instead of a supernatural oracle, the money-placer is just a really good psychologist who gets it right 99% of the time; nothing supernatural is involved. Here again, now that certainty and the supernatural element have been removed, many people are more inclined to take both boxes.
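To see how the arithmetic shakes out in this variant, here is another rough sketch (again just my own illustration with the numbers above): the evidential calculation, which conditions the blue box’s contents on your choice, favors one-boxing, while the causal calculation, which treats the contents as already fixed, favors taking both boxes whatever you believe is in the blue box.

```python
# A rough illustration of how the two calculations come apart in the
# 99%-accurate psychologist variant, using the numbers from the setup above.

ACCURACY = 0.99
RED = 10_000
BLUE_FULL = 1_000_000

# Evidential expected value: your choice is strong evidence about what the
# psychologist predicted, so condition the blue box's contents on the choice.
edt_one_box = ACCURACY * BLUE_FULL + (1 - ACCURACY) * 0            # 990,000
edt_two_box = ACCURACY * RED + (1 - ACCURACY) * (BLUE_FULL + RED)  # 20,000

# Causal expected value: the contents are already fixed, whatever they are.
# For any credence p that the blue box is full, two-boxing adds exactly RED.
def cdt_value(p: float, take_both: bool) -> float:
    return p * BLUE_FULL + (RED if take_both else 0)

print(edt_one_box, edt_two_box)                      # 990000.0 20000.0
print(cdt_value(0.5, True) - cdt_value(0.5, False))  # 10000.0, for any p
```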
Now, personally, I am a one-boxer, but I don’t think it’s an easy matter. In fact, my own tentative guess is that some kind of Humean relativism about decision theory is right. Reason only tells you what will happen as a result of various actions, what causal effects your actions will have, and so on. Whether you ‘should’ follow CDT, FDT, EDT, etc. is set by what strategy appeals to you, perhaps by the precise formulation of your desires, and so on, because reason really only gives us the positive facts and how our actions will alter them, not what we should do.
Deriving categorical imperatives from reason is impossible, but even a practical reason of hypothetical imperatives arising from the facts and our desires is ultimately just a way of talking. Most of the time the idea of practical reason seems to make sense because what it means to desire X constrains what a thinking agent who desires X must do in certain circumstances: if (1) you desire X, (2) there is a strategy that will get you X with no downsides, and (3) you are aware of this, but you do not employ the strategy, then you do not really desire X.
However, these definitional constraints on what it means to desire something are not enough to decide between one-boxing and two-boxing. Reason can only tell us what is, what will be, and the causal relations between them, and in a case like Newcomb’s problem this, combined with a desire for money, does not constrain us to act in a certain way. Seeking X, all else being equal, is written into the concept of wanting X, but the exact resolution of what “seeking” means in a context like this, where evidence, causation, and so on come apart, is not part of our concept of wanting. What matters is whether we want the money in a one-boxer or a two-boxer way, and all reason can do is tell us what will happen, and why, if we take either option.
I agree with previous comments. If I lived in a world where this kind of thing happened, I would train myself to be a one-boxer, join the local one-boxer society, shame two-boxers as greedy and so on. Then the oracle would correctly predict that I was going to pick one box and reward me accordingly.
Since I don't live in such a world, I will act rationally, which means not leaving money on the table for no good reason. That will serve me well nearly all the time.
If some rich person with good predictive skills were to create the situation in the example, I would be out of luck. But since it's highly improbable that anyone would do this, and also highly improbable that I would be the person chosen, my expected loss is negligible.
I'm fairly certain that polarization on this issue is not actually related to disagreements about decision theory, but rather to different intuitions about whether the idea of an "oracle who can see the future" is coherent and about how one should engage with thought experiments that seem unrealistic. The hypothetical situation is usually formulated in such a way that one-boxing is "correct" by definition. But it's extremely difficult for me to engage with the problem in that way, because it runs counter to the ways I think causality and human psychology work.
If I find myself in a situation where I feel like I have a psychologically "live" choice as to whether to one-box or two-box, my sense of human psychology is that the factors influencing the choice are probably too chaotic for anyone to predict what I will choose. So if I am asked to imagine myself in that situation, how do I do that? Do I imagine a scenario where someone has successfully convinced me that the oracle is infallible, which also means they have convinced me to abandon my basic intuitions about causality and psychology? It seems like I have to do that, but how could I possibly imagine what I would do in such an alien scenario?
I realize that, the way the scenario is typically described, one-boxing is the "correct" answer. But I feel more emotional affinity for the two-box answer, because my temptation is to respond to the scenario itself by thinking "screw the assumptions of this stupid rigged scenario."
edit: I just noticed Kaiser Basileus's comment below. Yeah, that.