These are lecture notes for a video lecture by Steven Pinker

Related: Decision Theory FAQ, Sections 1 to 5 cover the basics.


I. Introduction & Recap

  • Context: Building upon previous discussions of rationality domains.

  • Rationality Domains Recap:

    • Deductive Reasoning: Logic, deriving conclusions from premises.
    • Inductive Reasoning: Probability, updating beliefs based on evidence (Bayesianism).
    • Practical Reason: Focus of today’s lecture – making choices to achieve goals. This differs from reasoning about beliefs.
  • Key Distinction Recap: Normative vs. Descriptive Theories

    • Normative Theory: How one ought to reason or behave to be considered rational. Prescribes ideal standards. (e.g., logic, probability theory, today: Expected Utility Theory).
    • Descriptive Theory: How humans actually reason and behave, including biases and heuristics. Psychological reality. (e.g., prospect theory, heuristics & biases research).

II. Expected Utility Theory (EUT) - A Normative Model of Choice

Goal of EUT

EUT provides a formal framework for how a rational agent should make decisions under uncertainty to maximize their desired outcomes.

A. Utility: The Foundation of Choice

Utility

A technical term representing the subjective value or desirability an individual assigns to an outcome. It represents what people want.

  • Clarifications on Utility:

    • Not necessarily Happiness: People pursue goals that don’t maximize short-term happiness (e.g., having children, writing a book, fighting for a cause). These might lead to long-term fulfillment or align with other values. Utility captures the preference for these states, not just hedonic state.
    • Not necessarily Usefulness: The term derives from Utilitarianism, but in decision theory it simply means subjective desirability.
    • Not necessarily Selfishness: An agent’s utility can incorporate the well-being of others (e.g., due to empathy). Altruistic goals are captured within one’s utility function.
    • Not necessarily Money: People make monetary sacrifices for other goals (e.g., charity - “warm glow”, choosing fulfilling but less lucrative careers).
  • Measuring Utility: Revealed Preference

    • Concept: Infer utility from the choices people actually make (“voting with their feet and wallets”). Assumes choices reveal underlying preferences.
    • Potential Circularity: “People maximize utility.” “How do we know their utility?” “Look at what they choose (maximize).” This seems tautological.
    • Resolution: The normative power of EUT comes not from defining utility a priori, but from evaluating the consistency of choices. Given whatever preferences someone has (de gustibus non est disputandum - there’s no disputing taste), are their choices consistent across situations according to rational axioms?

B. EUT Framework

  • Originators: John von Neumann and Oskar Morgenstern (Theory of Games and Economic Behavior, 1944).
    • Von Neumann: Polymath (Manhattan Project, computer architecture, game theory).
  • Core Idea: Life involves choices between uncertain outcomes (gambles). Rational choice involves selecting the gamble with the highest expected utility.

Expected Utility Calculation

The expected utility of a gamble (choice option) is the sum of the utilities of each possible outcome, weighted by their respective probabilities. E(U) = Σ [P(outcome_i) * U(outcome_i)]

  • Example: Pass/Fail vs. Letter Grade Decision

    • Scenario: Choosing grading basis for a course.
    • Assumptions (Illustrative):
      • Utilities (Utils): A=10, B=7, C=5, Pass=6, Fail=0
      • Probabilities (Subjective Estimates):
        • Pass/Fail Option: P(Pass)=0.9, P(Fail)=0.1
        • Letter Grade Option: P(A)=0.3, P(B)=0.5, P(C)=0.1, P(Fail)=0.1
    • Calculations:
      • E(U)_Pass/Fail = (0.9 * 6) + (0.1 * 0) = 5.4 utils
      • E(U)_Letter Grade = (0.3 * 10) + (0.5 * 7) + (0.1 * 5) + (0.1 * 0) = 3 + 3.5 + 0.5 + 0 = 7.0 utils
    • Normative Conclusion: Based on these utilities and probabilities, the rational choice (maximizing E(U)) is to take the Letter Grade.
  • “Expected” meaning: Refers to the mathematical expectation, the long-run average if the choice were repeated many times (though applicable to single-shot decisions).

C. Axioms of Rational Choice (under EUT)

Core Claim of vNM EUT

Maximizing expected utility is mathematically equivalent to satisfying a set of fundamental axioms of rational preference. If you accept the axioms as defining rationality, you must maximize expected utility.

  • Key Axioms Mentioned:
    1. Comparability (Completeness): For any two alternatives A and B, a rational agent must either prefer A to B, prefer B to A, or be indifferent between them. You can’t refuse to compare.
    2. Transitivity: If A is preferred to B, and B is preferred to C, then A must be preferred to C.
      • Violation Consequence: Intransitive preferences make one vulnerable to being a money pump. (Example: Prefer iPhone > Pixel, Pixel > Galaxy, Galaxy > iPhone. I can sell you a Pixel for your Galaxy + $10, then an iPhone for your Pixel + $10, then a Galaxy for your iPhone + $10… indefinitely draining your money).
    3. Independence of Irrelevant Alternatives (IIA): If A is preferred to B, then a probabilistic choice between A and some other outcome C should be preferred to the same probabilistic choice between B and C (provided the probability mixture is the same). Adding or changing an irrelevant third option shouldn’t reverse your preference between the original two.
      • Example (Restaurant Pie): Prefer Apple (A) > Blueberry (B). Waitress offers Cherry (C) as well. IIA implies you should still prefer A over B. Switching preference to B because C is now available violates IIA. (This relates to framing effects).
      • Framing: The choice should not depend on how alternatives are described or bundled, only on their fundamental outcomes and probabilities.
    4. Other Axioms (mentioned briefly):
      • Distribution: Consistency with probability theory.
      • Solvability (Continuity): Implies preferences can be represented on a continuous scale, allowing for indifference points between gambles.
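The money-pump vulnerability from the transitivity axiom above can be checked mechanically: treat "A is preferred to B" as a directed edge and look for a cycle. A minimal sketch (the phone preferences are the illustrative example from the notes):

```python
# A strict preference relation viewed as a directed graph: each pair
# reads "left is strictly preferred to right". Transitivity fails,
# and a money pump becomes possible, exactly when the graph contains
# a cycle.

def has_cycle(prefs):
    graph = {}
    for better, worse in prefs:
        graph.setdefault(better, []).append(worse)
        graph.setdefault(worse, [])
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on current path / done
    color = {node: WHITE for node in graph}

    def dfs(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY or (color[nxt] == WHITE and dfs(nxt)):
                return True  # hit a node already on the current path
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and dfs(n) for n in graph)

cyclic = [("iPhone", "Pixel"), ("Pixel", "Galaxy"), ("Galaxy", "iPhone")]
acyclic = [("iPhone", "Pixel"), ("Pixel", "Galaxy"), ("iPhone", "Galaxy")]
print(has_cycle(cyclic))   # True  -> vulnerable to a money pump
print(has_cycle(acyclic))  # False -> transitive, safe
```

A preference relation that passes this check can always be lined up into a consistent ranking, which is what the transitivity axiom demands.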

D. Implications and Applications of EUT

  • Diminishing Marginal Utility of Money:
    • The utility function for money is typically concave. Each additional dollar provides less additional utility than the previous one. $10 means more to a poor person than to a rich person.
    • Result: Risk Aversion: Because the utility function is concave, a guaranteed amount $X yields more utility than a gamble with the same expected monetary value (e.g., a 50% chance of $2X and a 50% chance of $0): U(X) > 0.5·U(2X) + 0.5·U(0). The utility lost on the downside outweighs the utility gained on the upside.
  • Insurance:
    • Buying insurance is rational under EUT due to risk aversion (concave utility). You pay a small certain premium (a small utility loss) to avoid a small probability of a catastrophic loss (a huge utility loss).
    • This is rational even beyond the psychological “peace of mind” factor. The math works out due to the utility curve’s shape.
  • Lotteries:
    • Buying lottery tickets is generally not rational under standard EUT assumptions (the “stupidity tax”). The expected monetary value of a ticket is typically much lower than its price.
    • Explaining lottery play requires invoking other factors: the utility of fantasy/hope, misunderstanding of tiny probabilities, entertainment value, or possibly a convex utility curve over small amounts.

III. Challenges to EUT as a Descriptive Model (How Humans Actually Choose)

Question

Are people rational deciders according to EUT? Does Homo sapiens obey these principles? (Spoiler: Often not).

A. Bounded Rationality (Herbert Simon)

Bounded Rationality

Real-world decision-making is constrained by limited information, limited cognitive processing capacity, and limited time. Humans cannot be the “angels” or “Laplacean demons” assumed by perfectly rational models.

  • Implication: Humans don’t optimize (find the absolute best option according to EUT), they satisfice.
  • Satisficing: Setting a minimum aspiration level or acceptability threshold and choosing the first option that meets it, rather than exhaustively searching for the best.
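Satisficing as described above can be sketched as a stopping rule: scan options in the order they arrive and take the first one that clears the aspiration level. A minimal sketch with simulated option values (the aspiration threshold of 0.9 is an arbitrary assumption):

```python
import random

# Satisficing: take the first option meeting an aspiration level,
# instead of examining every option to find the absolute best.

def satisfice(options, aspiration):
    """Return the first acceptable value and how many options were examined."""
    for examined, value in enumerate(options, start=1):
        if value >= aspiration:
            return value, examined
    return None, len(options)

random.seed(0)
options = [random.random() for _ in range(1000)]

best = max(options)  # optimizing: must look at all 1000 options
good_enough, examined = satisfice(options, aspiration=0.9)
print(examined)  # typically far fewer than 1000 options examined
```

The satisficer usually gives up a little value (good_enough ≤ best) in exchange for a large saving in search effort, which is exactly Simon's point about bounded agents.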

B. Systematic Violations of EUT Axioms

  • 1. Taboo Tradeoffs / Sacred Values:

    • People resist comparing or making choices involving certain protected values (life, liberty, loyalty, environment). They refuse to put a “price” on them.
    • Examples:
      • Sophie’s Choice (choosing which child lives).
      • Indecent Proposal (trading sex for money).
      • Valuing human lives vs. cost (e.g., safety regulations like overpasses).
      • Environmental protection vs. economic cost (e.g., saving an endangered salamander).
    • Political Relevance: Politicians often avoid explicitly stating tradeoffs involving sacred values (e.g., cost of saving lives, civilian casualties in war) because acknowledging the tradeoff itself is seen as offensive or immoral, even if unavoidable in practice. Michael Kinsley’s definition of a “gaffe”: a politician accidentally telling the truth.
    • Tetlock’s Perspective: While violating EUT axioms, refusing certain tradeoffs can serve a rational purpose in signaling commitment within social relationships. Constantly calculating the “price” of loyalty erodes the relationship itself.
  • 2. Intransitivity:

    • Mechanism: Preferences can shift depending on the comparison set due to salience. When comparing A and B, certain features stand out. When comparing B and C, different features might stand out, leading to seemingly circular preferences.
    • Example (Political Candidates):
      • Biden vs. Klobuchar: Prefer Biden (more electable).
      • Buttigieg vs. Biden: Prefer Buttigieg (new generation).
      • Klobuchar vs. Buttigieg: Prefer Klobuchar (time for a woman).
      • Leads to Biden > Klobuchar > Buttigieg > Biden cycle.
    • Money Pump Reality Check: While theoretically possible, turning people into actual money pumps via intransitivity is hard in practice, as people often recognize the pattern or the act of choosing changes future weightings.
  • 3. Violations of Independence of Irrelevant Alternatives (IIA):

    • Allais Paradox (and simpler examples like Pie): Adding a third option does systematically change people’s choices between the original two, violating IIA.
    • Reason: The context provided by the third option changes how the original two are perceived or evaluated (framing effects, comparison effects).
  • 4. Certainty/Possibility Effects (Prospect Theory - Preview for next week):

    • People overweight certainty (P=1.0) and possibility (P slightly > 0) compared to intermediate probabilities. Getting rid of the last bit of risk feels disproportionately good; introducing the first bit of possibility feels disproportionately enticing.
    • Certainty (P=1) and impossibility (P=0) seem psychologically distinct from very high/low probabilities.
  • 5. Subjective vs. Mathematical Probability: Human probability estimates are often biased.

  • 6. Death as a Special Case: Utility theory struggles with outcomes like death, which isn’t just a low utility value but potentially removes the agent from the game entirely.

  • 7. Anticipated Emotions (Regret Aversion): Choices are influenced by how we expect to feel about the outcome, particularly potential regret.

    • Example: Ending up with $0 from a gamble when you could have had $1M for sure would lead to intense regret; anticipating that regret pushes people toward the sure thing.
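The Allais paradox mentioned above can be made precise. In the classic version (amounts in millions), the two pairs of gambles differ only by a shared 89% component, so the independence axiom forces the same preference in both pairs, whatever your utility function. A minimal sketch verifying this with a few arbitrary utility curves:

```python
# Classic Allais gambles, amounts in millions of dollars.
# Under EUT, preferring g1a over g1b forces preferring g2a over g2b
# (and vice versa), for ANY utility function: the pairs differ only by
# a common 89% chunk. Humans routinely pick g1a and g2b, violating this.

def eu(lottery, u):
    return sum(p * u(x) for p, x in lottery)

g1a = [(1.00, 1)]                           # $1M for sure
g1b = [(0.89, 1), (0.10, 5), (0.01, 0)]     # small risk of $0
g2a = [(0.11, 1), (0.89, 0)]
g2b = [(0.10, 5), (0.90, 0)]

# Try a linear, a mildly concave, and a sharply concave utility curve.
for u in (lambda x: x, lambda x: x ** 0.5, lambda x: x ** 0.01):
    prefers_1a = eu(g1a, u) > eu(g1b, u)
    prefers_2a = eu(g2a, u) > eu(g2b, u)
    print(prefers_1a == prefers_2a)  # True every time: EUT forbids the human pattern
```

Algebraically, both comparisons reduce to the same inequality, 0.11·u(1) > 0.10·u(5) + 0.01·u(0), which is why the typical human pattern (sure thing in pair 1, long shot in pair 2) cannot be rationalized by any utility function.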

IV. Introduction to Game Theory

Game Theory

The study of strategic decision-making where the optimal choice for one agent depends on the choices made by other rational agents. Payoffs are interdependent.

A. Basic Concepts & Terminology

  • Zero-Sum Game: One player’s gain is exactly another player’s loss. The payoffs in any outcome cell sum to zero (or a constant). Pure conflict, fixed pie.
    • Example: Rock-Paper-Scissors.
  • Non-Zero-Sum Game (includes Positive-Sum / Win-Win): Payoffs don’t necessarily sum to zero. Outcomes can exist where both players gain (or lose) compared to other outcomes. Allows for cooperation, mutual benefit.
    • Example: Coordination games, Prisoner’s Dilemma (potentially).
  • Dominant Strategy: A strategy that yields the best payoff for a player regardless of what the other player(s) do. If you have a dominant strategy, you should play it.
  • Nash Equilibrium (John Nash - “A Beautiful Mind”): A set of strategies (one for each player) such that no player can improve their payoff by unilaterally changing their own strategy, assuming the other players’ strategies remain the same. It’s a stable state where everyone is doing the best they can given what everyone else is doing.
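The Nash equilibrium definition above can be checked by brute force in small games: a cell is an equilibrium if neither player gains by unilaterally deviating. A minimal sketch, using illustrative Chicken payoffs (the specific numbers are an assumption, only their ordering matters):

```python
# Brute-force pure-strategy Nash equilibria of a two-player game.
# payoffs[r][c] = (row player's payoff, column player's payoff).

def nash_equilibria(payoffs):
    rows, cols = len(payoffs), len(payoffs[0])
    eqs = []
    for r in range(rows):
        for c in range(cols):
            # Row player can't do better by switching rows...
            row_best = all(payoffs[r][c][0] >= payoffs[r2][c][0] for r2 in range(rows))
            # ...and column player can't do better by switching columns.
            col_best = all(payoffs[r][c][1] >= payoffs[r][c2][1] for c2 in range(cols))
            if row_best and col_best:
                eqs.append((r, c))
    return eqs

# Game of Chicken: strategy 0 = swerve, 1 = go straight.
chicken = [[(3, 3), (2, 4)],
           [(4, 2), (0, 0)]]
print(nash_equilibria(chicken))  # [(0, 1), (1, 0)]: one swerves, one goes straight
```

Note that the outguessing games below (like Rock-Paper-Scissors) would return an empty list here: they have no pure-strategy equilibrium, which is why randomization is needed.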

B. Types of Games & Examples

  • 1. Outguessing Standoffs (e.g., Rock-Paper-Scissors, Penalty Kicks, Poker Bluffs):

    • Characteristics: Zero-sum, no dominant strategy, no (pure strategy) Nash Equilibrium.
    • Optimal Strategy: Randomization (Mixed Strategy). Being predictable allows the opponent to exploit you. The best you can do is be unpredictable according to optimal probabilities (e.g., 1/3, 1/3, 1/3 for RPS).
    • Real-world examples: Penalty kicks in soccer, pitching in baseball, military feints.
  • 2. Coordination Games (e.g., Rendezvous Dilemma, Stag Hunt, Driving Conventions, Tech Standards):

    • Characteristics: Non-zero-sum, multiple Nash Equilibria, players want to coordinate on the same equilibrium.
    • Challenge: How to ensure coordination without explicit communication?
    • Common Knowledge: Requires players to know the situation, know that others know, know that others know that they know, etc. (infinite recursion). Lack of common knowledge can prevent coordination. (Example: Texting rendezvous location when unsure if message received).
    • Focal Points (Schelling Points): Solutions that “stand out” due to salience, convention, or obviousness, allowing players to coordinate implicitly even without communication. (Example: Meeting at the most famous landmark, choosing the round number in bargaining - “split the difference” at $75).
    • Real-world examples: Choosing which side of the road to drive on, adopting technological standards (QWERTY, VHS vs Beta), currency adoption.
  • 3. Game of Chicken (e.g., Teenagers Driving, Cuban Missile Crisis, Brinkmanship, Debt Standoffs):

    • Characteristics: Anti-coordination game. Best outcome is if you go straight and the other swerves. Worst outcome is if both go straight (crash). Intermediate outcome if both swerve. Nash equilibria are the states where one swerves and one goes straight.
    • Paradoxical Tactics (Rational Irrationality / Madman Theory): Reducing your own options or appearing irrational/reckless can win the game by forcing the other player to swerve. (Examples: Tearing off steering wheel, Nixon’s alleged strategy, Trump/Kim Jong Un rhetoric).
    • High risk of mutual disaster if both players try to pre-commit or appear irrational.
  • 4. Escalation Games / Dollar Auction:

    • Characteristics: Sequential game where players incrementally increase bids/investment. Often involves a “loser also pays” rule (or sunk costs).
    • Perverse Logic: Leads to bidding far beyond the actual value of the prize because at each step, paying a little more seems better than losing the entire amount already bid (sunk cost fallacy logic).
    • Rational Strategy: Don’t play in the first place. If stuck, cut losses (potentially probabilistically).
    • Real-world examples: Wars of attrition (WWI), prolonged strikes, expensive lawsuits, potentially addictive behaviors.
  • 5. Prisoner’s Dilemma (PD) / Tragedy of the Commons / Public Goods Game:

    • Structure: Two players, each chooses to Cooperate (with partner) or Defect (rat out partner). Payoffs typically ranked: Temptation (Defect while other Cooperates) > Reward (Both Cooperate) > Punishment (Both Defect) > Sucker (Cooperate while other Defects). T > R > P > S.
    • Key Feature: Defection is the dominant strategy for each individual player, regardless of what the other does.
    • Outcome: The Nash Equilibrium is mutual defection, even though mutual cooperation would yield a better outcome for both players. Individual rationality leads to collective irrationality.
    • Tragedy of the Commons: Multi-player version. Each individual has incentive to overuse a shared resource (e.g., grazing land, fisheries, atmosphere for pollution) even though collective overuse destroys the resource for everyone. Each person’s contribution to the damage seems negligible, but the sum is catastrophic.
    • Public Goods: Similar dynamic. Each person benefits from a public good (e.g., lighthouse, clean air, national defense) but has an incentive to free-ride and let others pay for it. If everyone free-rides, the public good isn’t provided.
    • Solving PD / Tragedy of the Commons: Requires changing the payoff structure, often through external mechanisms:
      • Enforcement & Punishment: Laws, regulations, taxes (e.g., carbon tax) that penalize defection (polluting, overfishing, breaking contracts). This makes cooperation individually rational.
      • Trust & Reputation: In repeated interactions, reputation can build trust and incentivize cooperation.
      • Binding Agreements: Contracts.
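The Prisoner's Dilemma logic above (T > R > P > S, defection dominant, mutual defection collectively worse) can be verified directly. A minimal sketch with illustrative payoff numbers (any values respecting the ranking would do):

```python
# Prisoner's Dilemma with the standard ranking T > R > P > S.
# The specific numbers are an assumption; only the ordering matters.
T, R, P, S = 5, 3, 1, 0

payoff = {  # (my move, their move) -> my payoff; "C" cooperate, "D" defect
    ("C", "C"): R, ("C", "D"): S,
    ("D", "C"): T, ("D", "D"): P,
}

# Defection is a dominant strategy: it beats cooperation against
# EITHER move by the other player.
for their_move in ("C", "D"):
    assert payoff[("D", their_move)] > payoff[("C", their_move)]

# Yet the resulting equilibrium (both defect) is worse for both
# players than mutual cooperation: individual rationality,
# collective irrationality.
print(payoff[("D", "D")] < payoff[("C", "C")])  # True: P < R
```

A carbon tax or anti-defection law works by changing these numbers: subtract a large enough penalty from the two defection rows and cooperation becomes the dominant strategy instead.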

V. Conclusion & Synthesis

  • Normative models like EUT provide a benchmark for rationality but often fail as descriptive models of human behavior.
  • Humans are subject to bounded rationality (satisficing) and systematic biases/heuristic shortcuts that lead to violations of EUT axioms (taboo tradeoffs, intransitivity, framing effects, etc.).
  • Game theory analyzes strategic interactions where outcomes are interdependent.
  • Different game structures (Zero-Sum, Coordination, Chicken, PD, Escalation) lead to different optimal strategies and potential pitfalls (randomization, focal points, brinkmanship, suboptimal equilibria, tragedy of the commons).
  • Understanding these models helps identify sources of both individual and collective irrationality.
  • External structures (laws, taxes, contracts, social norms) are often necessary to align individual incentives with collectively desirable outcomes, especially in Prisoner’s Dilemma / Tragedy of the Commons scenarios.
  • The tension between individual rationality and collective outcomes is a central theme.

Next Steps / Open Questions:

  • Explore Prospect Theory as a descriptive model of choice under risk (next lecture).
  • How can knowledge of these models improve real-world individual and collective decision-making?
  • What are the limits of applying game theory to complex human interactions involving trust, emotion, and long-term relationships?
  • How do evolutionary pressures shape our decision-making tendencies, both rational and irrational?