@ReformedTrader: 1/ Expert Political Judgment (...

1/ Expert Political Judgment (Philip Tetlock)

"What experts think matters far less than how they think. We are better off with experts who draw from an eclectic array of traditions and accept ambiguity/contradiction as inevitable features of life." (p. 2)
amazon.com/Expert-Politic…

2/ "Forecasting exercises' winners and losers are not clustered along left/right partisan lines.

"There is an inverse relationship between indicators of good judgment and the qualities the media prizes in pundits—the tenacity required to prevail in ideological combat." (p. 2)

3/ "Disagreements hinge on hard-to-refute counterfactual claims about what would have happened if we had taken different policy paths and on moral claims about the types of people we should be—all claims partisans can use to fortify their positions against falsification." (p. 4)

4/ "When we pit experts against minimalist performance benchmarks—dilettantes, dart-throwing chimps, and assorted extrapolation algorithms—we find few signs that expertise translates into greater ability to make either “well-calibrated” or “discriminating” forecasts.

5/ "Who experts were (professional background, status) made scarcely an iota of difference to accuracy. Nor did what experts thought (liberals or conservatives, realists or institutionalists, optimists or pessimists). But how experts thought—their style of reasoning—did matter.

6/ "Foxes know many little things, draw from an eclectic array of traditions, and accept ambiguity/contradiction as inevitable features of life.

"Hedgehogs know one big thing, toil devotedly within one tradition, and reach for formulaic solutions to ill-defined problems."

7/ "Foxes consistently edge out hedgehogs, esp. for long-term predictions in their domains of expertise.

"Their self-critical, point-counterpoint thinking prevented them from building up the excessive enthusiasm that hedgehogs, esp. well-informed ones, had for their predictions.

8/ "Foxes were more sensitive to how contradictory forces can yield stable equilibria and, as a result, “overpredicted” fewer departures (good or bad) from the status quo. They also recognized the precariousness of equilibria and rarely ruled out anything as impossible." (p. 21)

9/ "Radical skeptics argue that history is a succession of chaotic shocks reverberating through incomprehensibly intricate networks.

"When well-established nonlinear relationships are linked into positive feedback loops in simulations, tiny input variations have large effects.

10/ "More deer means more reproduction but also exhausts food and attracts wolves. A tiny shift in beta (3.94 to 3.935) alters history. The populations remain almost identical but then, for mysterious tipping-point reasons, decisively part ways 25 years into the simulation.
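
The sensitivity described in tweet 10 can be sketched with the logistic map, a standard one-variable population model. This is an assumption made for illustration: the thread does not give the book's exact simulation, so the function `simulate`, the starting value 0.3, and the 100-step horizon are hypothetical choices. Only the two beta values come from the quote, and the step at which the paths separate here need not match the 25 years mentioned there.

```python
def simulate(beta, x0=0.3, steps=100):
    """Iterate the logistic map x -> beta * x * (1 - x) (assumed model)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(beta * xs[-1] * (1.0 - xs[-1]))
    return xs


a = simulate(3.94)
b = simulate(3.935)

# Absolute gap between the two trajectories at each step.
gaps = [abs(x - y) for x, y in zip(a, b)]

# First step at which the gap exceeds 0.1 (None if it never does).
first_big_gap = next((t for t, g in enumerate(gaps) if g > 0.1), None)

print("gap after one step:", gaps[1])
print("largest gap:", max(gaps))
print("first step with gap > 0.1:", first_big_gap)
```

The two runs share a starting point and differ only in beta, so any later separation comes from that small parameter shift being amplified by the nonlinear update.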

11/ "Who among us cannot imagine our lives unfolding differently but for tiny accidents of fate that shaped the jobs we hold, the people we marry, and so on?

"Accounts of military campaigns abound with tales of how horseshoe-nail-sized causes determined the outcomes of battles.

12/ "Mokyr compares searching for the seeds of the Industrial Revolution to studying the history of Jewish dissenters from 50 B.C. to 50 A.D. We are looking for something that at its inception was insignificant, even bizarre, but destined to change the life of everyone in the West.

13/ "Butterfly effects undercut pet theories. Wars break out not due to grand causes—primordial hatreds or power imbalances—but to petty ones—royal carriage drivers making wrong turns, giving astonished assassins who had botched their jobs earlier that day second chances." (p. 31)

14/ "In a prospective study of how well retrospectively identified causes, either singly or in combination, predict events, a squared multiple correlation coefficient would reveal gross unpredictability. And retrodiction is enormously easier than prediction." (p. 35)

15/ "Intelligent people have enormous difficulty tracking complex patterns of covariation, such as, “effect y1 rises in likelihood when x1 is falling, x2 is rising, and x3 takes on an intermediate set of values.”

16/ "Policy makers rarely invoke nuanced reasoning: “Saddam resembles Hitler in risk-taking, but he also has the shrewd street smarts of Stalin, the vaingloriousness of Mussolini, and the demagoguery of Nasser; the usefulness of each analogy depends on context.” " (p. 38)

17/ "History heaps ambiguity on us. It requires us to keep track of many things and offers few clues as to which things made critical differences.

"We know from experimental work that people fill in the missing data points with ideologically scripted event sequences.

18/ "Observers of world politics are overconfident, declaring with eerie certainty exactly what would have happened in counterfactual worlds that no one can visit.

"The world is messy: policies that one is predisposed to detest (embrace) sometimes have positive (noxious) effects.

19/ "The core function of political beliefs is to promote the comforting illusion of predictability.

"Prediction suffers b/c we are deterministic thinkers with an aversion to probabilistic strategies that accept the inevitability of error. We look for order in random sequences.

20/ "When we know the base rates—say, the incumbent wins 80% of the time—and not much else, we should simply predict the more common outcome. But work on base rate neglect suggests that people insist on attaching high probabilities to low-frequency events.
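
The arithmetic behind "simply predict the more common outcome" can be checked directly. The sketch below compares two strategies at the 80% base rate from the quote: always calling the incumbent, versus "probability matching" (calling the incumbent 80% of the time at random). The function names are hypothetical, and the guesses in the second strategy are assumed to be independent of the actual outcomes.

```python
def accuracy_always_majority(base_rate):
    """Expected hit rate when always predicting the more common outcome."""
    return max(base_rate, 1.0 - base_rate)


def accuracy_probability_matching(base_rate):
    """Expected hit rate when guessing each outcome as often as it occurs,
    with guesses independent of the true outcome (an assumption)."""
    return base_rate ** 2 + (1.0 - base_rate) ** 2


p = 0.8  # incumbent wins 80% of the time, as in the quote
always = accuracy_always_majority(p)
matching = accuracy_probability_matching(p)

print("always predict incumbent:", round(always, 2))
print("probability matching:    ", round(matching, 2))
```

Under these assumptions the flat strategy is right about 80% of the time, while matching is right about 68% of the time (0.8 × 0.8 + 0.2 × 0.2), which is why accepting a fixed error rate beats trying to call the rare event.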

21/ "These probabilities are rooted not in observations of relative frequency in relevant reference populations of cases, but rather in case-specific hunches about causality that make some scenarios more “imaginable” than others.

22/ "A plausible story of how a government might suddenly collapse counts for far more than how often similar outcomes have occurred in the past. Forecasting accuracy suffers when intuitive causal reasoning trumps extensional probabilistic reasoning.

23/ "Skeptics remind us of the perils of drawing inductive inferences from small samples of unknown origin and, if in a patient mood, append a lecture on the logical fallacies: “Beware of people who argue, “If A, then B,” observe B, and then declare, “A is true.” " (p. 41)

24/ "Participants in our studies were highly educated (the majority had doctorates and almost all had postgraduate training in fields such as political science, economics, international law and diplomacy, business administration, public policy, and journalism).

25/ "They had, on average, 12 years of relevant work experience, came from many walks of professional life (academia, think tanks, government service, international institutions) and showed themselves in conversation to be thoughtful observers of the world scene." (p. 44)

26/ "Whereas the best human forecasters were hard-pressed to predict >20% of the total variability in outcomes, the crude case-specific algorithms could predict 25%-30% of the variance, and generalized autoregressive distributed lag models explained 47% of the variance." (p. 53)

27/ "Humans' scores do not vary much across a broad swath of short- and long-term forecasts in policy domains and states.

"It is impossible to find any domain in which humans clearly outperformed crude extrapolation algorithms, less still sophisticated statistical ones." (p. 53)

28/ "Experts on their home turf made neither better calibrated nor more discriminating forecasts than did dilettantes: those who'd devoted years of arduous study were as hard-pressed as colleagues dropping in from other fields to affix realistic probabilities to possible futures.

29/ "Comparisons failed to yield any effects on calibration or discrimination that could be traced to experience (seniority) or expertise (academic, government or private sector background, access to classified information, doctoral degree/not, status of university affiliation).

30/ "There was also little sign that degrees of expertise improved performance when we broke forecasting questions into subtypes: short-term versus long-term, zone of stability versus turbulence, and domestic political, economic policy/performance, and national security issues.

31/ "Although subject matter knowledge does not give a big boost to performance, it is not irrelevant. If one insists on thinking like a human rather than a statistical algorithm, it is especially dangerous doing so equipped only with the thin knowledge base of undergraduates.

32/ "Both experts and dilettantes exaggerate the likelihood of change for the worse (better). Such outcomes occur 23% (28%) of the time, but experts assign average probabilities of .35 (.34); dilettantes, .29 (.31)." (p. 53)

33/ "We asked participants how often they advised policy makers, consulted with government or business, and were solicited by the media for interviews. Consistent with the seduction-by-fame hypothesis, experts in demand were more overconfident, r (136) = .33, p < .05.

34/ "A similar correlation links overconfidence and the number of media mentions that participants received, according to Google search count (r = .26). Both relationships fell to nonsignificance after controlling for a cognitive-style measure derived from the thought protocols.

35/ "More “balanced” thinkers (“on the one hand”/“on the other”) were less overconfident (r = .37) and less in the limelight (r = .28).

"Overconfident experts may be more quotable and attract more media attention. But they may also be more likely to seek out attention." (p. 62)

36/ Agency problems: "Given prevailing accountability norms and practices, the anticipated blame from a policy fiasco in which decision-makers bypassed the relevant experts substantially exceeds that from a fiasco in which they ritualistically consulted those experts." (p. 64)

37/ "The test of a first-rate intelligence is the ability to hold two opposing ideas in the mind at the same time, and still retain the ability to function." F. Scott Fitzgerald

38/ "It made virtually no difference whether participants had doctorates, had policy experience, or had access to classified information.

"But better-known forecasters—those more likely to be fêted by the media—were less well-calibrated than lower-profile colleagues." (p. 68)

39/ "Moderates consistently bested extremists on calibration—an advantage that they did not purchase by sacrificing discrimination.

"But it was easy to identify specific times/places at which one or another faction could crow over its successes ('15 minutes of fame')." (p. 72)

40/ "The search for correlates of good judgment across time and topics became more successful when the spotlight shifted from *what* experts thought to *how* they thought. The variable loadings on the first factor resemble Berlin’s distinction between hedgehogs and foxes." (p. 72)

41/ "Figure 3.2 brackets human performance. The best forecasters approach 20% of the epistemic ideal (extrapolation algorithms, 30%; formal models, 50%). It reveals how far performance can fall—to the point where highly-educated specialists explain <7% of the variance." (p. 75)

42/ "The worst (best) performers were hedgehog extremists (foxes) making long-term (short-term) predictions in their domains of expertise.

"Foxes derive modest benefit from expertise whereas hedgehogs are—strange to say—harmed." (p. 78)

43/ "Hedgehogs resemble high scorers on personality scales designed to measure needs for closure and structure—the types of people who have been shown in experimental research to be more likely to trivialize (embrace) evidence that undercuts (reinforces) their preconceptions.

44/ "The more knowledge hedgehogs possess, the more ammunition they have to perform these belief-defense tasks. By contrast, foxes—who resemble low scorers on those personality scales—should be predisposed to allocate their cognitive resources in a more balanced way." (p. 81)

45/ "Although both hedgehogs and foxes overpredict change (the lower base-rate outcome) and thus—by necessity—under-predict the status quo, hedgehogs make this pattern of mistakes to a greater degree than foxes." (p. 83)

46/ "Hedgehogs make braver forecasts. This is consistent with the notion that the greater caution among foxes is rooted in balanced cognitive appraisals of situations, not a mindless clinging to the midpoints of the scales.

47/ "The hedgehog-fox differential on extremity of predictions, once highly significant (r = .35), shrinks significantly when we control for the fact that foxes engage in more integratively complex thinking about problems than do hedgehogs (partial r = .14).

48/ "It is hard to build momentum for extreme predictions if one is slowed down by buts and howevers.

"Good judges tend to be moderate foxes: eclectic thinkers tolerant of counterarguments, prone to hedge their bets and not stray too far from base-rate probabilities." (p. 84)

49/ 1999: "Forecasts were euphoric: Dow 36,000, telecommuting eliminating rush-hour traffic jams, Web retailers driving brick-and-mortar stores out of business, interactive televisions, online universities, and near-instantaneous electronic flows of capital across borders.

50/ "If a proposal passed the low-hurdle “can I believe this” test, technophiles did not pause to ponder potential resistance from flesh-and-blood humans who have deep-rooted social needs and work within remarkably durable social systems." (p. 103)

51/ "Foxes saw merit in the accusations leveled by each side but did not mindlessly split the differences. Most foxes could identify academic fields that they felt had become so suffused with bias that only a vocal minority dared to state obvious but unpleasant truths." (p. 104)

52/ "Foxes felt most leaders' decisions had considerable wiggle room.

"Preferring explanatory closure, hedgehogs found this insistence cloying. It allows for butterfly effects—cancerous tumors, love affairs, and assassins’ bullets—tiny causes altering major decisions." (p. 108)

53/ "Hedgehogs are at continual risk of becoming prisoners of their preconceptions, trapped in self-reinforcing cycles in which their initial ideological disposition stimulates thoughts that further justify that inclination which, in turn, stimulates further supportive thoughts.

54/ "The average predictions of forecasters are generally more accurate than the majority of forecasters from whom the averages were computed.

"Trimming outliers (extremists) further enhances accuracy." (p. 117)

This works for trading strategies as well.

55/ "But simple, decisive statements are easier to package in sound bites. The same style of reasoning that impairs experts’ performance on scientific indicators of good judgment boosts experts’ attractiveness to the mass market–driven media." (p. 119)

56/ "These results dovetail with the cognitive interpretation of the fox-hedgehog performance gaps: foxes do better because they are moderates who factor conflicting considerations—in a flexible, weighted-averaging fashion—into their final judgments." (p. 118)

57/ "Foxes update their beliefs more in the Bayesian direction than do hybrids and hedgehogs. This greater movement is all the more impressive in light of the fact that the Bayesian updating formula demanded less movement from the foxes than from the other groups.

58/ "In two exercises, hedgehogs moved in the opposite direction prescribed by Bayes and raised confidence in their prior view after the unexpected happened. This latter pattern is contra-Bayesian and also incompatible with all normative theories of belief adjustment." (p. 127)
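
For readers who want the "Bayesian direction" made concrete, here is a minimal sketch of Bayes' rule for a single hypothesis. The function name and all three input probabilities are hypothetical and chosen only for illustration; they are not figures from the book.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of a hypothesis after observing the evidence."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    return numerator / denominator


# Hypothetical forecaster: 70% confident in a pet theory, then observes an
# outcome the theory made unlikely (20%) but its rival made likely (60%).
prior = 0.7
posterior = bayes_update(prior, p_evidence_if_true=0.2, p_evidence_if_false=0.6)

print("prior:", prior)
print("posterior:", posterior)
```

With these made-up inputs the rule prescribes lowering confidence from 0.7 to 0.4375. Raising confidence after such an outcome, as the quote says hedgehogs did in two exercises, moves in the opposite direction.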

59/ "Experts displayed both the classic hindsight effect (claiming more credit for predicting the future than they deserved) and the mirror-image hindsight effect (giving less credit to their opponents for anticipating the future than they deserved)." (p. 138)

60/ "Foxes have repeatedly traced the psychological roots of intelligence failures to an unwillingness to be sufficiently self-critical, to reexamine assumptions, to question dogma, and to muster the imagination to think daringly about options others might ridicule." (p. 143)

61/ "Learning from the past is hard, in part, because history is a terrible teacher. She never gives us the exact comparisons we need to determine causality (those are cordoned off in the what-iffy realm of counterfactuals). The control groups “exist” only in our imaginations.

62/ "Observers feel impelled to explain puzzling events because the policy stakes are high. However, just because we want an explanation does not mean one is within reach.

"We fill in missing counterfactual scenarios with elaborate stories grounded in our deepest assumptions.

63/ "Counterfactual beliefs 'feel' factual, as though experts were telling us: “I know what would have happened. I got back from a trip in my alternative-universe teleportation device and can assure you that events there dovetailed perfectly with my preconceptions.” " (p. 146)

64/ "Hedgehogs increased their confidence in their prior position instead of changing their minds in response to dissonant evidence. (Foxes made small adjustments.)

"Hedgehogs were defiant defenders of double standards, searching for flaws only in disagreeable results." (p. 160)

65/ "Whereas the probability score for the average of all fox forecasts is only slightly superior to that of the average fox (.181 vs .186), the probability score for the average for all hedgehog forecasts is massively superior to that of the average hedgehog (.184 vs .218).

66/ "The average fox forecast beats 70% of foxes; the average hedgehog forecast beats 95% of hedgehogs.

"Hedgehogs make more extreme mistakes in all possible directions and thus derive disproportionate benefit when we let their errors offset each other in composite forecasts.

67/ "Foxes do intuitively what averaging does statistically, and what hedgehogs individually largely fail to do: blend perspectives with nonredundant predictive information." (p. 179)
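
The averaging effect in tweets 65-67 can be illustrated with a mean squared error (Brier-style) score, where lower is better. This is a sketch under stated assumptions: the book's probability score may be defined differently, and the three forecasters and five outcomes below are invented for the example.

```python
from statistics import mean


def brier(forecasts, outcomes):
    """Mean squared gap between probability forecasts and 0/1 outcomes."""
    return mean([(f - o) ** 2 for f, o in zip(forecasts, outcomes)])


# Invented data: 1 means the event happened, 0 means it did not.
outcomes = [1, 0, 0, 1, 0]
forecasters = {
    "A": [0.9, 0.6, 0.1, 0.3, 0.2],
    "B": [0.4, 0.1, 0.5, 0.9, 0.7],
    "C": [0.7, 0.3, 0.3, 0.6, 0.05],
}

# Consensus forecast: the simple average across forecasters, question by question.
consensus = [mean(column) for column in zip(*forecasters.values())]

individual_scores = {name: brier(f, outcomes) for name, f in forecasters.items()}
consensus_score = brier(consensus, outcomes)
average_individual = mean(list(individual_scores.values()))

print("individual scores:", individual_scores)
print("average individual score:", average_individual)
print("consensus score:", consensus_score)
```

Because squared error is convex, the consensus score can never be worse than the average of the individual scores. How many individuals the consensus beats depends on the data, and the gain is larger when individual errors are extreme and point in different directions, which matches the hedgehog pattern described in tweet 66.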

68/ "Although foxes and hedgehogs were both more inclined to accept discoveries that meshed with their ideological preconceptions, foxes were at least moderately responsive to the quality of the underlying research, whereas hedgehogs were seemingly oblivious." (p. 182)

69/ "Experiments give us grounds for fearing that scenario exercises often fail to open the minds of the inclined-to-be-closed-minded hedgehogs but confuse the already-inclined-to-be-open-minded foxes—confusing foxes so much that their open-mindedness resembles credulousness.

70/ "Scenarios impair accuracy: forecasters attach excessive probabilities to too many possibilities. This is especially true of foxes judging dramatic departures from the status quo. Hedgehogs who refuse scenario exercises perform as well as foxes who embrace them." (p. 201)

71/ "Scenario manipulations may check hindsight by resurrecting counterfactual possibilities and infusing them with “narrative life.” Ironically, this is possible only if people already attach higher probabilities to more richly embellished and thus more imaginable scenarios...

72/ "the exact opposite of what we should do if we appreciated the basic principle that scenarios can only fall in likelihood when we add contingent details to the narrative.

"There is also not as much money in correcting judgments of possible pasts vs possible futures." (p. 205)

73/ "The same self-assured hedgehog style of reasoning that suppresses forecasting accuracy and slows belief updating translates into compelling media performances: attention-grabbing bold predictions that are rarely checked for accuracy.

74/ "When found to be wrong, forecasters steadfastly defend their predictions as “soon to be right,” or “almost right” or as the “right mistakes” to have made given the available information and choices." (p. 217)

75/ Related reading:

The Undoing Project (Michael Lewis)

Thinking in Bets: Making Smarter Decisions When You Don't Have All the Facts (Annie Duke)

Superforecasting (Tetlock, Gardner)

78/ "Modern pluralistic systems, Greenspan was saying, were messy and willful; after witnessing government up close, he knew this conclusively. It was idle to expect such systems to submit to rules—political pressures would destroy them."

80/ "The goal of forecasting is not accuracy. It is to advance the interests of the forecaster and his tribe. Accurate forecasts may do that sometimes but are pushed aside if that’s what the pursuit of power requires."

82/ Media Slant is Contagious

83/ Adam Butler on the Phil Bak podcast

* How Adam's career got started
* What led him to focus on risk management, diversification, & systematic thinking
* Expert Political Judgment (Phil Tetlock)
* "Alpha lives where no one else wants to go."

philbak.com/epsode/adam-bu…

85/ Polarizing impact of science literacy and numeracy on perceived climate change risks

86/ 50 Years of Successful Predictive Modeling Should Be Enough: Lessons for Philosophy of Science (Bishop, Trout)

"The success of statistical prediction rules forces us to reconsider what is involved in understanding, explanation, and good reasoning."

semanticscholar.org/paper/50-Years…

87/ Robust Beauty of Improper Linear Models in Decision-Making

88/ "If you cannot be precise, don't say anything.

"If central bank guidance is not grounded in a theory with variables that can tell us something about the future of prices, it is hard to understand why investors should hold risky debt instruments."

mrzepczynski.blogspot.com/2021/10/transi…

90/ Does Living in California Make People Happy? A Focusing Illusion in Judgments of Life Satisfaction

91/ Three Myths about Federal Regulation

92/ Superstar Investors

Buffett's Alpha

94/ "With few exceptions, you can ignore anyone who shares 'trade ideas.' Much of FinTwit is ppl congratulating or berating one another over the outcome of coin flips. Refuse to play that game. Focus on repeatable, economically-sensible sources of edge."

95/ "Accuracy remains stable as information increases; however, confidence increases. My forecast does not become better, but I feel more confident that it is true. This has investment implications because sizing of positions is driven by confidence."

mrzepczynski.blogspot.com/2022/03/more-i…

96/ "It is not whether what you say is correct. It is whether it is important.

"This is like our discussion on confidence and accuracy. More information improves confidence but not accuracy. Knowing what is important is more important than just knowing."

mrzepczynski.blogspot.com/2022/03/the-fa…

98/ "Philip Tetlock found that alleged experts were right less than half of the time and that they were worse than dart-throwing monkeys in forecasting outcomes when multiple probabilities were involved.... When experts are wrong, they seldom admit it."

100/ "A lot of you seem to have this attraction to modelling and 'quant' work because it will somehow bring you certainty over discretionary methods."

102/ The world is complicated; the variables below are not the only ones that will matter, and which of the variables end up being important (and which ones don't) also matters. In the real world, the variables are also not binary.

Forecasting is hard!

mrzepczynski.blogspot.com/2022/07/global…

103/ "A related error is assuming that experts are always wrong because they tend to be untrustworthy.

"In the example above, the expert was right 60% of the time - better than random but worse than he claimed his ability to be."

104/ "Accuracy remains stable as the amount of information increases.

"My forecast does not become better, but I feel more confident that it is true.

"This has investment implications because our sizing of positions is often driven by confidence."

mrzepczynski.blogspot.com/2022/03/more-i…

107/ "Qualitative criteria people lean on for decision-making are counterproductive.

"Pedigree is useless unless alpha relies on *access*.

"Intelligence, charisma, confidence, social status, attention to detail, rhetorical skills are negative screens."

109/ How Accurate are Capital Market Assumptions, and How Should We Use Them?

111/ Political endorsement by Nature and trust in scientific expertise during COVID-19

112/ Professional Forecasters’ Outlook for 2023 and Caveats Based on Past Performance

113/ "A carefully curated social media feed of people who share your worldview peppered with righteous takedowns of the most extreme takes from the other side is not a good source of information. It's a good way to be angry, depressed, and misinformed."

115/ "Julia Galef introduces a metaphor of soldiers and scouts to describe two very different approaches to thinking. She illustrates this via the Dreyfus Affair: how one man was wrongfully convicted of treason and how another fought to exonerate him."

youtube.com/watch?v=3MYEtQ…

117/ Economic Records of the Presidents: Party Differences and Inherited Economic Conditions

120/ Expert Political Judgment: How Good Is It? How Can We Know? | Philip Tetlock
Talk given on August 24, 2006

youtube.com/watch?v=f73A-H…
