# Decision Theory FAQ

Co-authored with crazy88. Please let us know when you find mistakes, and we’ll fix them. Last updated 03-27-2013.

Contents:

1. What is decision theory?
2. Is the rational decision always the right decision?
3. How can I better understand a decision problem?
4. How can I measure an agent’s preferences?
    4.1. The concept of utility
    4.2. Types of utility
5. What do decision theorists mean by “risk,” “ignorance,” and “uncertainty”?
6. How should I make decisions under ignorance?
    6.1. The dominance principle
    6.2. Maximin and leximin
    6.3. Maximax and optimism-pessimism
    6.4. Other decision principles
7. Can decisions under ignorance be transformed into decisions under uncertainty?
8. How should I make decisions under uncertainty?
    8.1. The law of large numbers
    8.2. The axiomatic approach
    8.3. The Von Neumann-Morgenstern utility theorem
    8.4. VNM utility theory and rationality
    8.5. Objections to VNM-rationality
    8.6. Should we accept the VNM axioms?
9. Does axiomatic decision theory offer any action guidance?
10. How does probability theory play a role in decision theory?
    10.1. The basics of probability theory
    10.2. Bayes’ theorem for updating probabilities
    10.3. How should probabilities be interpreted?
11. What about “Newcomb’s problem” and alternative decision algorithms?
    11.1. Newcomblike problems and two decision algorithms
    11.2. Benchmark theory (BT)
    11.3. Timeless decision theory (TDT)
    11.4. Decision theory and “winning”

## 1. What is decision theory?

Decision theory, also known as rational choice theory, concerns the study of preferences, uncertainties, and other issues related to making “optimal” or “rational” choices. It has been discussed by economists, psychologists, philosophers, mathematicians, statisticians, and computer scientists.

We can divide decision theory into three parts (Grant & Baron 2008). Normative decision theory studies what an ideal agent (a perfectly rational agent, with infinite computing power, etc.)
would choose. Descriptive decision theory studies how non-ideal agents (e.g. humans) actually choose. Prescriptive decision theory studies how non-ideal agents can improve their decision-making (relative to the normative model) despite their imperfections.

For example, one’s normative model might be expected utility theory, which says that a rational agent chooses the action with the highest expected utility. Replicated results in psychology describe humans repeatedly failing to maximize expected utility in particular, predictable ways: for example, they make some choices based not on potential future benefits but on irrelevant past efforts (the “sunk cost fallacy”). To help people avoid this error, some theorists prescribe basic training in microeconomics, which has been shown to reduce the likelihood that humans will commit the sunk cost fallacy (Larrick et al. 1990). Thus, through a coordination of normative, descriptive, and prescriptive research we can help agents to succeed in life by acting more in accordance with the normative model than they otherwise would.

This FAQ focuses on normative decision theory. Good sources on descriptive and prescriptive decision theory include Stanovich (2010) and Hastie.

|                          | s1 = fire            | s2 = no fire    |
|--------------------------|----------------------|-----------------|
| a1 (buy insurance)       | No house and $99,900 | House and -$100 |
| a2 (don’t buy insurance) | No house and $0      | House and $0    |

For more details on formalizing and visualizing decision problems, see Skinner (1993).

## 4. How can I measure an agent’s preferences?

### 4.1. The concept of utility

It is important not to measure an agent’s preferences in terms of objective value, e.g. monetary value. To see why, consider the absurdities that can result when we try to measure an agent’s preference with money alone. Suppose you may choose between (A) receiving a million dollars for sure, and (B) a 50% chance of winning either $3 million or nothing.
The expected monetary value (EMV) of your act is computed by multiplying the monetary value of each possible outcome by its probability, and summing the results. So, the EMV of choice A is (1)($1 million) = $1 million. The EMV of choice B is (0.5)($3 million) + (0.5)($0) = $1.5 million. Choice B has the higher expected monetary value, and yet many people would prefer the guaranteed million. Why? For many people, the difference between having $0 and $1 million is subjectively much larger than the difference between having $1 million and $3 million, even though the latter difference is larger in dollars.

To capture an agent’s subjective preferences, we use the concept of utility. A utility function assigns numbers to outcomes such that outcomes with higher numbers are preferred to outcomes with lower numbers. For example, for a particular decision maker — say, one who has no money — the utility of $0 might be 0, the utility of $1 million might be 1000, and the utility of $3 million might be 1500. Thus, for this decision maker, the expected utility (EU) of choice A is (1)(1000) = 1000, while the EU of choice B is (0.5)(1500) + (0.5)(0) = 750. In this case, the expected utility of choice A is greater than that of choice B, even though choice B has the greater expected monetary value.

Note that those from the field of statistics who work on decision theory tend to talk about a “loss function,” which is simply an inverse utility function. For an overview of decision theory from this perspective, see Berger (1985) and Robert (2001). For a critique of some standard results in statistical decision theory, see Jaynes (2003, ch. 13).

### 4.2. Types of utility

An agent’s utility function can’t be directly observed, so it must be constructed — e.g. by asking the agent which option they prefer for a large set of pairs of alternatives (as on WhoIsHotter.com).
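To make the arithmetic above concrete, here is a minimal Python sketch of the EMV and EU calculations for choices A and B, using the hypothetical utility numbers (0, 1000, 1500) from the example:

```python
# Expected value of a lottery: sum over outcomes of probability * value.
def expected_value(lottery):
    return sum(p * v for p, v in lottery)

# Choice A: $1 million for sure; choice B: 50% chance of $3 million, else $0.
emv_a = expected_value([(1.0, 1_000_000)])             # $1,000,000
emv_b = expected_value([(0.5, 3_000_000), (0.5, 0)])   # $1,500,000

# The same lotteries measured in this agent's utilities:
# u($0) = 0, u($1M) = 1000, u($3M) = 1500.
eu_a = expected_value([(1.0, 1000)])                   # 1000
eu_b = expected_value([(0.5, 1500), (0.5, 0)])         # 750

print(emv_a, emv_b, eu_a, eu_b)
```

B wins on expected monetary value while A wins on expected utility, matching the discussion above.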
The number that corresponds to an outcome’s utility can convey different information depending on the utility scale in use, and the utility scale in use depends on how the utility function is constructed. Decision theorists distinguish three kinds of utility scales:

1. Ordinal scales (“12 is better than 6”). In an ordinal scale, preferred outcomes are assigned higher numbers, but the numbers don’t tell us anything about the differences or ratios between the utilities of different outcomes.

2. Interval scales (“the difference between 12 and 6 equals that between 6 and 0”). An interval scale gives us more information than an ordinal scale. Not only are preferred outcomes assigned higher numbers, but the numbers also accurately reflect the difference between the utilities of different outcomes. They do not, however, necessarily reflect the ratios of utility between different outcomes. If outcome A has utility 0, outcome B has utility 6, and outcome C has utility 12 on an interval scale, then we know that the difference in utility between outcomes A and B and between outcomes B and C is the same, but we can’t know whether outcome B is “twice as good” as outcome A.

3. Ratio scales (“12 is exactly twice as valuable as 6”). Numerical utility assignments on a ratio scale give us the most information of all. They accurately reflect preference rankings, differences, and ratios. Thus, we can say that an outcome with utility 12 is exactly twice as valuable to the agent in question as an outcome with utility 6.

Note that neither experienced utility (happiness) nor the notions of “average utility” and “total utility” discussed by utilitarian moral philosophers are the same thing as the decision utility we are discussing here, which describes decision preferences. As the situation merits, we can be even more specific.
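One standard way to see the interval/ratio distinction: a positive affine transformation (u' = a·u + b with a > 0, a hypothetical example transformation below) preserves everything an interval scale conveys, i.e. rankings and comparisons of differences, but not ratios. A small sketch using the utilities 0, 6, and 12 from the example above:

```python
def affine(u, a=3.0, b=5.0):
    # Positive affine transformation (a > 0): preserves interval-scale info.
    return a * u + b

u = {"A": 0.0, "B": 6.0, "C": 12.0}
v = {k: affine(x) for k, x in u.items()}   # A: 5, B: 23, C: 41

# Ranking is preserved on the transformed scale:
assert (v["A"] < v["B"] < v["C"]) == (u["A"] < u["B"] < u["C"])

# Equality of differences is preserved: (B - A) == (C - B) on both scales.
assert (u["B"] - u["A"]) == (u["C"] - u["B"])
assert (v["B"] - v["A"]) == (v["C"] - v["B"])

# But ratios are not: C is "twice B" on the original scale only.
print(u["C"] / u["B"], v["C"] / v["B"])    # 2.0 vs roughly 1.78
```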
For example, when discussing the type of decision utility used in an interval scale utility function constructed using Von Neumann

According to one description of the states, she could be either in love or not in love with her husband; then the probability of both states would be 1/2. According to another, equally plausible description, she could be deeply in love, a little bit in love, or not at all in love with her husband; then the probability of each state would be 1/3.

## 8. How should I make decisions under uncertainty?

A decision maker faces a “decision under uncertainty” when she (1) knows which acts she could choose and which outcomes they may result in, and (2) assigns probabilities to those outcomes. Decision theorists generally agree that when facing a decision under uncertainty, it is rational to choose the act with the highest expected utility. This is the principle of expected utility maximization (EUM). Decision theorists offer two kinds of justifications for EUM. The first has to do with the law of large numbers (see section 8.1); the second has to do with the axiomatic approach (see sections 8.2 through 8.6).

### 8.1. The law of large numbers

The “law of large numbers” states that, in the long run, if you face the same decision problem again and again and again, and you always choose the act with the highest expected utility, then you will almost certainly be better off than if you choose any other acts. There are two problems with using the law of large numbers to justify EUM.

The first problem is that the world is ever-changing, so we rarely if ever face the same decision problem “again and again and again.” The law of large numbers says that if you face the same decision problem infinitely many times, then the probability that you could do better by not maximizing expected utility approaches zero. But you won’t ever face the same decision problem infinitely many times! Why should you care what would happen if a certain condition held, if you know that condition will never hold?
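The long-run claim can be illustrated with a minimal Monte Carlo sketch, reusing the earlier gamble ($1 million for sure vs. a 50% chance of $3 million) and scoring outcomes in dollars: over many repetitions, the average payoff of the risky option converges to its expected monetary value.

```python
import random

random.seed(0)  # fixed seed so the simulation is reproducible

def play_b():
    # Choice B from earlier: a 50% chance of $3 million, else nothing.
    return 3_000_000 if random.random() < 0.5 else 0

n = 100_000
avg_b = sum(play_b() for _ in range(n)) / n   # converges toward $1.5 million
avg_a = 1_000_000                             # choice A pays $1 million every time

print(avg_b > avg_a)
```

With enough repetitions the risky option's average payoff settles near its expectation of $1.5 million, overtaking the sure $1 million, which is exactly the sense in which the long-run argument favors expectation maximization.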
The second problem with using the law of large numbers to justify EUM has to do with a mathematical theorem known as gambler’s ruin. Imagine that you and I flip a fair coin, and I pay you $1 every time it comes up heads and you pay me $1 every time it comes up tails. We both start with $100. If we flip the coin enough times, one of us will face a sequence of heads or tails that is longer than we can afford. If a long enough sequence of heads comes up, I’ll run out of $1 bills with which to pay you. If a long enough sequence of tails comes up, you won’t be able to pay me. So in this situation, the law of large numbers guarantees that you will be better off in the long run by maximizing expected utility only if you start the game with an infinite amount of money (so that you never go broke), which is an unrealistic assumption. (For technical convenience, assume utility increases linearly with money. The basic point holds without this assumption.)

### 8.2. The axiomatic approach

The other method for justifying EUM seeks to show that EUM can be derived from axioms that hold regardless of what happens in the long run. In this section we will review perhaps the most famous axiomatic approach, from Von Neumann and Morgenstern (1947). Other axiomatic approaches include Savage (1954), Jeffrey (1983), and Anscombe and

(1B) a 33/34 chance of $27,000 and a 1/34 chance of nothing. The second involves the choice between: (2A) a 34% chance of $24,000 and a 66% chance of nothing; and (2B) a 33% chance of $27,000 and a 67% chance of nothing.

Experiments have shown that many people prefer (1A) to (1B) and (2B) to (2A). However, these preferences contradict independence: option (2A) is the same as [a 34% chance of option (1A) and a 66% chance of nothing], while (2B) is the same as [a 34% chance of option (1B) and a 66% chance of nothing]. So independence implies that anyone who prefers (1A) to (1B) must also prefer (2A) to (2B).
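The independence argument can be checked numerically. Because (2A) is a 34% chance of (1A) and (2B) is a 34% chance of (1B), mixed with the same 66% chance of nothing, any utility function with u(nothing) = 0 gives EU(2A) − EU(2B) = 0.34 · (EU(1A) − EU(1B)), so EUM must rank both pairs the same way. A sketch, taking (1A) to be $24,000 for certain (as the stated equivalence with (2A) implies) and using purely illustrative utilities u($24,000) = 1.00 and u($27,000) = 1.02:

```python
# Hypothetical utilities for illustration (not from the text); u(nothing) = 0.
u24, u27 = 1.00, 1.02

eu_1a = 1.0 * u24            # (1A): $24,000 for certain
eu_1b = (33 / 34) * u27      # (1B): 33/34 chance of $27,000
eu_2a = 0.34 * u24           # (2A): 34% chance of $24,000
eu_2b = 0.33 * u27           # (2B): 33% chance of $27,000

# The gap between the second pair is exactly 0.34 times the gap between
# the first pair, so the two preferences must agree under EUM.
gap_1 = eu_1a - eu_1b
gap_2 = eu_2a - eu_2b
print(gap_1 > 0, gap_2 > 0)  # both positive for this utility function
```

For these utilities the agent prefers (1A) to (1B) and, as independence requires, also (2A) to (2B); no choice of utilities can produce the common (1A)-and-(2B) pattern.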
When this result was first uncovered, it was presented as evidence against the independence axiom. However, while the Allais paradox clearly reveals that independence fails as a descriptive account of choice, it’s less clear what it implies about the normative account of rational choice that we are discussing in this document. As Peterson (2009, ch. 4) notes:

> [S]ince many people who have thought very hard about this example still feel that it would be rational to stick to the problematic preference pattern described above, there seems to be something wrong with the expected utility principle.

However, Peterson then goes on to note that many people, like the statistician Leonard Savage, argue that it is people’s preferences in the Allais paradox that are in error, rather than the independence axiom. If so, then the paradox reveals the danger of relying too strongly on intuition to determine the form that normative theories of rationality should take.

### 8.6.4. The Ellsberg paradox

The Allais paradox is far from the only case where people fail to act in accordance with EUM. Another well-known case is the Ellsberg paradox (the following is taken from Resnik (1987)):

> An urn contains ninety uniformly sized balls, which are randomly distributed. Thirty of the balls are yellow, the remaining sixty are red or blue. We are not told how many red (blue) balls are in the urn – except that they number anywhere from zero to sixty. Now consider the following pair of situations. In each situation a ball will be drawn and we will be offered a bet on its color. In situation A we will choose between betting that it is yellow or that it is red. In situation B we will choose between betting that it is red or blue or that it is yellow or blue. If we guess the correct color, we will receive a payout of $100.

In the Ellsberg paradox, many people bet yellow in situation A and red or blue in situation B.
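Since both bets pay the same $100, an EUM agent simply prefers whichever bet has the higher probability of winning. A brute-force sketch over all sixty-one possible red/blue compositions of the urn shows that no single probability assignment supports both strict preferences at once:

```python
# 30 yellow balls; red + blue = 60, with the exact split unknown.
consistent = []
for blue in range(61):
    red = 60 - blue
    p_yellow, p_red, p_blue = 30 / 90, red / 90, blue / 90
    # Strictly prefer betting yellow over red in situation A:
    prefers_yellow_in_a = p_yellow > p_red
    # Strictly prefer betting red-or-blue over yellow-or-blue in situation B:
    prefers_red_or_blue_in_b = p_red + p_blue > p_yellow + p_blue
    if prefers_yellow_in_a and prefers_red_or_blue_in_b:
        consistent.append(blue)

print(consistent)  # prints [] -- no composition supports both preferences
```

The first preference requires fewer than 30 red balls (so more than 30 blue), while the second requires more than 30 red balls (so fewer than 30 blue), which is the contradiction described below.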
Further, many people make these decisions not because they are indifferent in both situations, and so happy to choose either way, but because they have a strict preference to choose in this manner.

Such behavior cannot be in accordance with EUM. In order for EUM to endorse a strict preference for betting yellow in situation A, the agent would have to assign a probability of more than 1/3 to the selected ball being blue. On the other hand, in order for EUM to endorse a strict preference for betting red or blue in situation B, the agent would have to assign a probability of less than 1/3 to the selected ball being blue. As such, these decisions can’t be jointly endorsed by an agent following EUM.

Those who deny that decision making under ignorance can be transformed into decision making under uncertainty have an easy response to the Ellsberg paradox: since this case involves a decision under ignorance, it is irrelevant whether people’s decisions violate EUM, as EUM is not applicable to such situations. Those who believe that EUM provides a suitable standard for choice in such situations, however, need to find some other way of responding to the paradox. As with the Allais paradox, there is some disagreement about how best to do so. Once again, however, many people, including Leonard Savage, argue that EUM reaches the right decision in this case, and that it is our intuitions that are flawed (see again Resnik (1987) for a nice summary of Savage’s argument to this conclusion).

### 8.6.5. The St Petersburg paradox

Another objection to the VNM appr