I agree with previous comments. If I lived in a world where this kind of thing happened, I would train myself to be a one-boxer, join the local one-boxer society, shame two-boxers as greedy and so on. Then the oracle would correctly predict that I was going to pick one box and reward me accordingly.
Since I don't live in such a world, I will act rationally which means not leaving money on the table for no good reason. That will serve me well nearly all the time.
If it happens that some rich person with good predictive skills should create the situation in the example, I would be out of luck. But since it's highly improbable that anyone would do this, and also highly improbable that I would be the person chosen, my expected loss is negligible.
False dichotomy. Nothing is stopping you from one-boxing in Newcomb's problem and also "not leaving money on the table" in whatever real-world situations you're thinking of.
LOLed at "join the local one-boxer society"
In general though, I agree with this sentiment. In the problem as strictly presented, I'm a dyed-in-the-wool two-boxer, but committed one-boxers obviously have the higher expected yield.
I guess another way to think of it is that two-boxing is the dominant strategy in a one-shot game, but in an iterated game where the oracle uses your past choices to predict future choices, I think one-boxing (with perhaps the occasional defection to two-boxing) is the dominant strategy.
A broader meta-point is that in lots of real-life scenarios, people appear to act "irrationally," but only if one considers the decision they make as a one-shot game when real life is a collection of various iterated games. People tend to punish the other player for giving them scraps in the ultimatum game because they value fairness - this is only irrational in a one-shot game.
It sounds like you're not a real two-boxer. The premise of the problem is that you're trying to maximize your earnings, so if you agree that one-boxing does that, you're just a one-boxer who doesn't care about money and thus is rejecting the premise of the problem. If you prefer, you could replace the money with something you care more about, like maybe you or your loved one is dying of cancer, and one box contains a cure with 10% probability, and the other box contains either a guaranteed cure or nothing, depending on Omega's choice.
I'm fairly certain that polarization on this issue is not actually related to disagreements about decision theory, but rather to different intuitions about whether the idea of an "oracle who can see the future" is coherent and how one should engage with thought experiments that seem unrealistic. The hypothetical situation is usually formulated in such a way that one-boxing is "correct" by definition. But it's extremely difficult for me to engage with the problem in that way, because it runs counter to the ways I think causality and human psychology work.
If I find myself in a situation where I feel like I have a psychologically "live" choice as to whether to one-box or two-box, my sense of human psychology is that it probably means the factors influencing the choice are too chaotic for anyone to predict what I will choose. So if I am asked to imagine myself in that situation, how do I do that? Do I imagine a scenario where someone has successfully convinced me that the oracle is infallible, which also means they have convinced me to abandon my basic intuitions about causality and psychology? It seems like I have to do that, but how could I possibly imagine what I would do in such an alien scenario?
I realize that, the way the scenario is typically described, one-boxing is the "correct" answer. But I feel more emotional affinity for the two-box answer, because my temptation is to respond to the scenario itself by thinking "screw the assumptions of this stupid rigged scenario."
edit: I just noticed Kaiser Basileus's comment below. Yeah, that.
I think the "perfect predictor" formulation is actively unhelpful to understanding the problem and trips people up in exactly the way you're describing.
My preferred version is the one where Omega is able to use a scan of your brain to make a 90% accurate prediction (and you are unable to predict the direction of its errors). This feels like a plausible future technology, consistent with causality and with human psychology. (I'm not sure brain scanning will ever advance to this degree -- maybe self-referential situations like this will be especially hard to predict -- but I don't think you should be confident that it's physically impossible.)
The key fact is then: you're forced to use the very same brain Omega just scanned to make your choice. You can carefully think through the situation for as long as you want, but you know that the brain scanner has, by some means, captured a 90%-reliable indication of where that reasoning process will end.
My perspective is then: you can choose to ignore the details of your situation, pretend you're making a decision completely independent of any prior prediction, and take both boxes. Or, you can correctly reason that you're in a very strange situation where your own brain has been, in effect, adversarially used against you, and take one box.
The self-referential nature of the situation is what makes me skeptical of the technology, and I think there are strong logical grounds for thinking there's a low ceiling on how accurate a self-referential prediction can be. Note that the accuracy of the technology can only be defined with reference to a population – among people who have been thoroughly socialized to believe the correct way to handle this situation is to flip a coin, the accuracy will be 50%.
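To make the coin-flip point concrete, here's a minimal sketch (assuming the predictor's only move is a guess fixed before the flip; any such guess is independent of a fair coin, so accuracy converges to 50%):

```python
import random

def choice_of_coin_flipper() -> str:
    # Someone socialized to decide by a fair coin flip at the last moment.
    return "one-box" if random.random() < 0.5 else "two-box"

trials = 100_000
best_guess = "one-box"  # any fixed guess does equally well against a fair coin
hits = sum(choice_of_coin_flipper() == best_guess for _ in range(trials))
print(f"accuracy against coin-flippers: {hits / trials:.3f}")  # ~0.500
```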
But let’s say I’m convinced that the accuracy is 90% on a population of people roughly similar to me. I’ve heard that most people have strong and stubborn intuitions about the problem, so I can guess that Omega’s accuracy is buoyed by people who don’t second-guess their choice. If I find myself second-guessing my choice, then I start to suspect I’m from the part of the distribution that Omega has a harder time predicting – which would probably be a substantial minority, given your numbers.
You can make that hypothesis seem less likely by tweaking the probabilities – from 90% to 99%, for example – but then you’re making it harder for me to accept the scenario on its own terms.
I think even if you accept this, you'd still want to one-box. You have two options: either solemnly vow to always run FDT no matter what, or try to jinx Omega's prediction. If you solemnly vow to always run FDT, Omega predicts your decision accurately; if you try to jinx Omega's prediction, Omega has 50/50 odds of being correct, which means the opaque box has money only half the time. That seems worse.
In any case, you'd probably not want to be the kind of person who always runs CDT, since that makes the fact that you'd always two-box predictable.
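As a rough sketch of the expected values (assuming the usual $1,000,000 opaque-box and $1,000 transparent-box payoffs, and that "jinxing" really does drop Omega to a coin flip):

```python
# Expected winnings as a function of Omega's accuracy (standard payoffs assumed).
BIG, SMALL = 1_000_000, 1_000

def ev_one_box(p_correct: float) -> float:
    # Omega predicted one-boxing (and filled the opaque box) with probability p_correct.
    return p_correct * BIG

def ev_two_box(p_correct: float) -> float:
    # Omega predicted two-boxing (and left the opaque box empty) with probability p_correct.
    return p_correct * SMALL + (1 - p_correct) * (BIG + SMALL)

print(ev_one_box(0.9))                        # committed one-boxer vs 90% Omega: 900000.0
print(max(ev_one_box(0.5), ev_two_box(0.5)))  # best case while jinxing: 501000.0
```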
This might be a frequentist vs Bayesian thing, but I don't think you need to stipulate that the accuracy can only be defined wrt a population.
We could rephrase this as: you are standing in front of the two boxes, and after the situation is fully explained to you, your credence that Omega made an accurate prediction is 90%. The actual Newcomb question is: what do you do in *this* circumstance? Everything else is fighting the hypo.
Now in practice, I think fighting the hypo is good! You're right to ask questions like "wait, Omega was right 90% of the time so far? is there any pattern to this? can I guess whether I'm more likely to be in the correct or incorrect camp?" If you're facing a real-life Newcomb Problem, you shouldn't accept that Omega is well-calibrated just because someone told you it is. But this is the same sort of reasoning as "can't I just take my shoes off before saving the drowning child?" or "is there some way to stop the trolley from hitting anyone?" Each of these would be valuable to do in real life, but they're all failing to engage with the relevant thought experiment.
I think if your strategy consistently leaves you $1 million poorer, that’s a bad strategy. The decision here isn’t whether to take both boxes. It’s whether to be the sort of person who would take both boxes in this situation.
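A quick simulation sketch of that claim (assuming a 90%-accurate predictor and the usual $1,000,000 / $1,000 payoffs; with those numbers the gap is roughly $800k rather than the full million):

```python
import random

BIG, SMALL = 1_000_000, 1_000
ACCURACY = 0.9  # assumed predictor accuracy

def play(one_boxer: bool) -> int:
    # The predictor reads your disposition correctly with probability ACCURACY.
    predicted_one_box = one_boxer if random.random() < ACCURACY else not one_boxer
    opaque = BIG if predicted_one_box else 0
    return opaque if one_boxer else opaque + SMALL

trials = 100_000
avg_one = sum(play(True) for _ in range(trials)) / trials
avg_two = sum(play(False) for _ in range(trials)) / trials
print(f"one-boxers average ~${avg_one:,.0f}, two-boxers ~${avg_two:,.0f}")
# Typically ~$900,000 vs ~$101,000.
```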
The problem with choice dominance is that it implicitly assumes that the CDT method of assessing counterfactuals is correct. If you instead use the FDT method of assessing counterfactuals, for example, then choice dominance would favor one-boxing in both Newcomb's problem and the Transparent Newcomb Problem. So really, a better name for choice dominance would be CDT choice dominance, or causal choice dominance. Since FDT/functional choice dominance also exists, you do not need to give up on choice dominance in order to support one-boxing; instead, you can simply change which version of choice dominance you use. Thus, this post's defense of two-boxers is much weaker than it first appears.
Note that the above is essentially just a restatement/summary of an argument from the paper Functional Decision Theory: A New Theory of Instrumental Rationality rather than something original to me.
https://arxiv.org/pdf/1710.05060#page=19
Dare I say, two-boxers here are not beating the "not being able to take the thought experiment seriously" accusation.
I can steelman the 99% case: if someone in real life told me that he created an AI that can 99% predict whether I one-box or two-box, I would expect this to be based on my known post history, personality, and rat status, and anything I do after that stops being even probabilistically correlated to the prediction. At most it's Smoker's Lesion again. And TDT/FDT/LDT/etc., and even some sophisticated versions of EDT ultimately equivalent to those, will agree with CDT on Smoker's Lesion.
But if you interpret the thought experiment literally (as in, there's still an infallible Omega predictor, but it sends the prediction to the person in charge of putting the money inside the boxes with a little noise), then I find no appeal in the idea of reverting back to CDT here.
The transparent version (which is ultimately isomorphic to counterfactual mugging/Parfit's hitchhiker) is a more interesting argument. UDT will bite the bullet on it, but EDT and TDT will agree with CDT, thus violating an intuitive principle: something like conservation of expected evidence, which holds when evaluating beliefs, should also hold when evaluating actions.
But ultimately it all boils down to this: unlike Substack commenters and mainstream analytic philosophers, rats look at all of this as an AI engineering problem, and if you're writing an AI, 1/ the possibility of Omega-prediction becomes immediately obvious and in fact trivial; 2/ even if you yourself are a CDT agent, you have no reason to implement CDT when writing an AI agent; consequently, 3/ even if you for whatever reason wrote an AI agent using CDT, it would immediately seek to self-modify to a better decision theory.
Maybe I'm honest, or maybe I'm simple, but I think people *correctly* predict each other all the time in everyday life. Label the boxes "excellent employment" and "*not* embezzling everything in front of you" and I think the common-sense case for one-boxing in *transparent* Newcomb's becomes much clearer.
The big prize box is visibly full because you were predicted to be the kind of person who could consistently leave the small prize alone. The big prize is visibly *empty* because -- well, as one of the MIRI folks put it (Soares, I think?), decisions are for making bad outcomes inconsistent. It *isn't* empty, not for me. And when it is, despite that? I'll take the 1% failure rate and go home empty-handed, over turning around and becoming a two-boxer *in every situation like this*.
Evidence and causation are right there; you described the entire setup in a single paragraph. It just is not that complicated. You do need something that can look spooky to sophists -- a *logical* (not causal) connection between your actions and other people's models of your actions -- but it's not hard to define. If CDT ignores that kind of relation and therefore predictably loses money, so much the worse for CDT. I think this is only complicated to philosophers who've tied themselves up in knots over it. Given the setup there is an action that reliably and predictably wins, and if you can't philosophically justify taking it, that is a failure in your philosophy. *Justifying* the tendency to lose, or waffling between the options -- that's just sad. You can do better. Your is/ought intervention just denies the setup to sidestep the problem, and the setup can be patched until it can't be denied (assume you want money, assume linear utility in dollars, assume prizes other than money, etc. etc. etc., until you *have to* engage with the question, Least Convenient Possible World style).
I dunno, you're right that the dominance principle is not to be discarded lightly. But I think I can safely do so, given that maximizing the outcomes for the situation I'm in *affects the situations I end up in* when other people can *see me doing it* and contribute to my situation. I don't think it has to be complex.
("But there's no way people could" go play smash or street fighter or competitive pokémon with your local ranked champions, you will get your *soul read* like your innermost plans are a cheap billboard ad. Everyone who believes they have a libertarian / unpredictable free will should try this some time. Mixing it up is a skill that takes time and practice and can't be perfect.)
("But I don't *want* to embezzle/steal/exploit" yes evolution and culture have had you internalize this, in everyday situations far away from abstract philosophy.)
Put aside boxes entirely. The oracle says "I'm going to ask you to choose a whole number. I have predicted with 100% confidence that the number you choose will be odd. If you pick an odd number - which you of course will - I will give you one dollar. If, in theory, you were to choose an even number, I would instead give you two dollars, but of course I know you're not going to do that." Do you choose an even or an odd number?
I don't believe I can engage meaningfully with that hypothetical. My description of the situation strings words and sentences together in a way that superficially appears coherent, but even though I believe in a deterministic universe, the scenario is inconceivable - or at the very least, imagining it would require either (1) bringing in weird outside factors that have nothing to do with the situation as described, like someone showing up with a gun and threatening to kill you unless you choose an odd number, or (2) overhauling my assumptions about how human psychology works so thoroughly that I am too overwhelmed to engage with the scenario.
I'm sure you and I disagree about whether Newcomb's Paradox presents the same kinds of problems as my hypothetical does, but let's not go there just yet - do you and I at least agree it's impossible to meaningfully engage with the hypothetical scenario I proposed?
But why do you need to imagine a mechanism in order to engage with the hypothetical? Even if there's no possible process that could bring those outcomes about, surely the outcomes themselves are still conceivable?
Is the "choose a whole number" scenario I described conceivable to you?
It seems underspecified. What does "were going to choose" mean here? Would have, were you not informed of the prediction? Are going to despite the prediction? Something else?
Take them both bc oracles and accurate psychology don't exist.
To reject the possibility of such an oracle logically requires you to reject well-established physics.
https://outsidetheasylum.blog/zombie-philosophy/
As you say, what is in the boxes is already in the boxes. I don't believe in fortune telling, and I don't have much faith in psychology being that good. So I take both boxes. This seems to be a 'd'uh' kind of thing for most of us because we aren't going to be overthinking it, like a philosopher would. ;-)
Rejecting the premise of the problem and substituting in a totally different problem where two-boxing is indisputably correct is neither particularly clever nor insightful.
Respectfully, you seem a little out of touch with the field here. Choice dominance is a very well-known thing; nearly all one-boxing decision theorists are well aware of it and choose to one-box despite it. In particular, it's already been shown that it's invalid when the probability of the outcome depends on the choice, so no knowledgeable two-boxer would cite it in support of their position. See the wikipedia page: https://en.wikipedia.org/wiki/Sure-thing_principle
Your variants are also commonly discussed. The probabilistic one doesn't affect the outcome; any sophisticated one-boxer would still choose the option with a higher expected value. (Provided it's formulated correctly; there's a subtle ambiguity in the problem statement that can sometimes make it rational to two-box even if you're normally a one-boxer.) The transparent boxes one is more interesting as it highlights a discrepancy between evidential decision theory (which one-boxes in the opaque case but two-boxes in the transparent case unless you allow binding precommitments) and timeless decision theory (which one-boxes in both cases).
The good arguments for two-boxing mostly rely on the No Free Lunch theorem and the infinite regress involved in "choosing a decision theory". (It's rational to two-box if you expect to encounter oracles that will reward you for doing so.)
The section on the is-ought distinction also makes little sense to me. Of course decision theory can't tell you what you ought to desire, but it's inherent to the problem that the agent desires to maximize their earnings. There are definitely nuances to what this means exactly, often getting into anthropic considerations like "should I be acting in a way to increase the probability of this scenario happening", "what happens if my decision theory says to act in a way that causes a contradiction in the problem statement", etc. But it's fundamentally a factual question, not a values one. People don't have some fundamental drive towards taking some number of boxes; they just want money for instrumental reasons, and will take the action that they believe helps them towards that goal. Changing people's minds here is a matter of presenting facts and logic, not one of modifying their emotional reaction.
> because reason really only gives us the positive facts and how our actions will alter them and not what we should do.
In isolation, sure, this seems obviously true. But then it also seems obvious that Parfit's example of someone who prefers arbitrarily large amounts of suffering on Tuesday to arbitrarily small amounts of suffering on other days is not just weird but somehow horribly mistaken - "it's Tuesday" is a fact *within a particular social context* but in an absolute sense not even wrong, and "no reason at all", as Parfit puts it, to hold an absolute preference like that.
Two-boxing feels intuitively similar to me, somehow. Yes, holding the oracle's actions fixed, picking both results in a better outcome. But we're not holding the oracle's actions fixed! Your preferred theory of causation may claim that you don't "cause" the oracle to not put money in the box by choosing both, but causation, in the standard folk-physics sense, is part of the map, not the territory. It is a useful macroscopic concept like "chairs" or "substances" but it does not actually exist. And in the context of a thought experiment that tells you, in no uncertain terms, that if you do this, the oracle will have done that - it's simply not a good enough reason.
Newcomb's paradox is only a paradox because the question is idiotic.
An oracle predicts with perfect accuracy which box you will pick. How does this oracle work? When did it make the prediction? People who write proofs determining what the best box must be are basing this on assumptions about what this magic, ill-defined oracle is. At heart it is just bad science fiction.
To reject the possibility of such an oracle is to reject basic physics.
https://outsidetheasylum.blog/zombie-philosophy/
Where in that link is the magic oracle described?
If brains are computable then they can be computed.
Where is the magic oracle that has all the required knowledge to perfectly compute a brain?
I've been really trying to grok two-boxing in this round of Newcomb Discourse, but I still haven't been able to get there. The "choice dominance" argument for two-boxing feels (uncharitably) like the 50/50 argument in Monty Hall: the "obvious" first thought you'd have before fully understanding the situation. This is how most two-box arguments feel to me: that they're simply ignoring important facts about the situation, rather than pointing out considerations one-boxers overlook.
The fundamental fact about Newcomb is that it's been rigged specifically to prevent choice dominance from working. By stipulation, your choice is *not* independent of the outcomes! Depending on how it's presented, Omega has either seen the future, or has scanned your brain, or is just very good at predicting for some reason you don't understand.
This argument for two-boxing seems to me to amount to stubbornly insisting that your choice *is* completely independent of the contents of the boxes. That you can simply step outside the chain of causality and transcendentally Make A Choice. If I could do that, I'd be a two-boxer. But shackled as I am to the laws of physics, performing my mental computations on substrate that can be examined and understood the same way a falling rock can, I understand that taking home two boxes full of money is not a choice that is available to me.
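For what it's worth, here's a small sketch of the two calculations being contrasted (payoffs and accuracy assumed): holding the contents fixed, two-boxing always gains $1,000; conditioning on the prediction the problem stipulates, it costs roughly $800,000 in expectation.

```python
BIG, SMALL = 1_000_000, 1_000
ACCURACY = 0.9  # assumed predictor accuracy

# Dominance/CDT framing: hold the (already fixed) contents constant.
for opaque in (BIG, 0):
    one_box, two_box = opaque, opaque + SMALL
    print(f"contents fixed at {opaque:>9,}: two-boxing gains {two_box - one_box}")

# The framing the problem stipulates: the contents correlate with your choice.
ev_one = ACCURACY * BIG
ev_two = ACCURACY * SMALL + (1 - ACCURACY) * (BIG + SMALL)
print(f"conditional EV: one-box {ev_one:,.0f} vs two-box {ev_two:,.0f}")
```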
edit: agree with Matrice Jacobine, seems like all the two-boxers in the comments here are simply rejecting the premise rather than arguing that two-boxing makes sense on the merits.
The problem with this scenario for me has always been with “sees the future”. If I know that the oracle can see my choice, then I can change my choice based on what I think they will predict. But the oracle will predict that and change their behaviour in response to my change of choice. But based on my knowledge of the oracle’s behaviour I will know they will do that and change my choice again … resulting in an infinite regress in which neither of us can make a choice. Also, this involves backwards causation through time, which is impossible, so the whole problem is specious.
I think the issue here is that when you say “oracle” you really mean something like a sports forecaster with a 100% success rate. So there’s a window of opportunity in which you can still change the choice they supposedly successfully predicted. But if you mean an oracle who can really see the future, you get into the infinite regress problem. But then maybe the forecaster is aware of my predilection for ruining people’s predictions.