Carlos E. Perez
@IntuitMachine
A new paper by Kenneth Payne investigates whether Large Language Models (LLMs) exhibit the same risk-taking biases that humans do, as described by Kahneman and Tversky's prospect theory.

Key Findings:

Main Discovery: LLMs show human-like prospect-theory behavior. They tend to be more risk-seeking when facing losses and more risk-averse when facing gains, as humans are. However, the effect depends heavily on the semantic context of the scenario.
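
To make the gain/loss asymmetry concrete, here is a minimal sketch of the prospect-theory value function. It is an illustration, not the paper's method: the parameter values (0.88 and 2.25) are the estimates from Tversky and Kahneman's 1992 work, probability weighting is left out, and the function names and the 50/100 payoffs are hypothetical.

```python
# Illustrative sketch only: parameters are the Tversky & Kahneman (1992)
# estimates, and probability weighting is deliberately omitted.

def prospect_value(x, alpha=0.88, loss_aversion=2.25):
    """Subjective value of an outcome x relative to a reference point of 0."""
    if x >= 0:
        return x ** alpha                      # concave for gains
    return -loss_aversion * (-x) ** alpha      # convex and steeper for losses


def gamble_value(outcomes):
    """Probability-weighted sum of subjective values for (probability, payoff) pairs."""
    return sum(p * prospect_value(x) for p, x in outcomes)


# Hypothetical options with the same expected value within each frame.
sure_gain = [(1.0, 50)]
risky_gain = [(0.5, 100), (0.5, 0)]
sure_loss = [(1.0, -50)]
risky_loss = [(0.5, -100), (0.5, 0)]

# Risk-averse for gains: the sure thing scores higher than the gamble.
prefers_sure_gain = gamble_value(sure_gain) > gamble_value(risky_gain)
# Risk-seeking for losses: the gamble scores higher than the sure loss.
prefers_risky_loss = gamble_value(risky_loss) > gamble_value(sure_loss)

print(prefers_sure_gain, prefers_risky_loss)
```

Under these assumed parameters the curvature alone produces both preferences, which is the pattern the thread says the models reproduce in narrative scenarios.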

Core Experiments:
Tested 5 state-of-the-art LLMs (including GPT-4o, Claude, and Gemini models)
Used scenarios across civilian contexts (business mergers, career transitions, sports) and geopolitical contexts (military crises, border disputes)
Each scenario had mathematically equivalent options (same expected value) but different risk profiles
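
As a rough illustration of "same expected value, different risk profile", the sketch below compares a sure option with a 50/50 gamble. The payoffs and names are hypothetical and are not taken from the paper's scenarios.

```python
# Hypothetical payoffs, chosen only to illustrate the idea.

def expected_value(outcomes):
    """Mean payoff of a list of (probability, payoff) pairs."""
    return sum(p * x for p, x in outcomes)


def variance(outcomes):
    """Spread of payoffs around the mean, used here as a simple risk measure."""
    mean = expected_value(outcomes)
    return sum(p * (x - mean) ** 2 for p, x in outcomes)


safe_option = [(1.0, 50.0)]                  # certain payoff
risky_option = [(0.5, 100.0), (0.5, 0.0)]    # coin flip between 100 and 0

for name, option in [("safe", safe_option), ("risky", risky_option)]:
    print(name, expected_value(option), variance(option))
```

A decision-maker who looked only at expected value would be indifferent between the two options, so any systematic preference between them reflects an attitude toward risk.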

Critical Insights:

Context Matters More Than Math: The framing effects were strongest in military and sporting scenarios, but sometimes reversed in career scenarios. This suggests the bias isn't just about "gains vs losses" but about the specific semantic context.

Language Encodes the Bias: When presented with pure mathematical problems (no narrative context), the models behaved rationally and showed no framing effects at all. This suggests the biases are embedded in language, not in the models' mathematical reasoning.

Models Have "Cognitive Personalities": Different models showed distinct patterns. For example, Claude was consistently "hawkish" in military scenarios, while GPT-4o was highly sensitive to semantic framing across all contexts.

Theoretical Contribution: The paper argues this supports a Wittgensteinian view that "language models the world": LLMs have acquired human decision-making heuristics through language itself, and these manifest as context-specific "language games" rather than universal cognitive biases.

The research has implications for understanding both AI behavior in high-stakes decisions and the nature of reasoning versus memorization in LLMs.
Carlos E. Perez
@IntuitMachine
The fact is that OpenAI wasn't innovative enough to prove Gary Marcus wrong about his expectations of GPT-5. GPT-5 is an improvement over o3, but not enough of one to meet Marcus's expectations. That's the nature of this field: it is unpredictable enough that you can't cover all bases.