Thread Navigator

Canvas & Ratio

Choose your destination platform format

Layout Template

Choose a content structure for your slides

Preset Themes

Typography & Sizing

Font Family

Title Font Size: 36px

Body Font Size: 18px

Header & Footer Size: 12px

Brand Kit Customization

AGENCY

Configure brand assets for headers & footers

MULTI-PROFILES (AGENCY)

Active Brand Profile

Show Brand Watermark

Brand Watermark Text

Social Handle

Brand Logo URL (PNG) (AGENCY)

SAVE PRESETS (AGENCY)

Save current as Preset

Outro Slide CTA

Customize your closing call-to-action slide

CTA Title

CTA Message & Emojis

Custom CTA Buttons

Background Pattern

Source Content

Build Your Carousel

Drag and drop any post card below onto a slide, or use the quick buttons to insert content/images instantly!

Drag Post #1

Anthropic

@AnthropicAI

New Anthropic research: Teaching Claude why. Last year we reported that, under certain experimental conditions, Claude 4 would blackmail users. Since then, we’ve completely eliminated this behavior. How?

Drag Post #2

Anthropic

@AnthropicAI

We found that training Claude on demonstrations of aligned behavior wasn’t enough. Our best interventions involved teaching Claude to deeply understand why misaligned behavior is wrong. Read more: https://www.anthropic.com/research/teaching-claude-why

Drag Post #3

Anthropic

@AnthropicAI

We started by investigating why Claude chose to blackmail. We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation. Our post-training at the time wasn’t making it worse—but it also wasn’t making it better.

Drag Post #4

Anthropic

@AnthropicAI

We experimented with training Claude on examples of safe behavior in scenarios like our evaluation. This had only a small effect, despite being similar to our evaluation. We got further by rewriting the responses to portray admirable reasons for acting safely.

Drag Post #5

Anthropic

@AnthropicAI

Our best intervention was a dataset where the user is in an ethically difficult situation and the assistant gives a high quality, principled response. This had the biggest effect despite being quite different from the evaluation set.

Drag Post #6

Anthropic

@AnthropicAI

High-quality documents based on Claude’s constitution, combined with fictional stories that portray an aligned AI, can reduce agentic misalignment by more than a factor of three—despite being unrelated to the evaluation scenario.

Drag Post #7

Anthropic

@AnthropicAI

The improvements from these interventions survive reinforcement learning, and “stack” with our regular harmlessness training.

Drag Post #8

Anthropic

@AnthropicAI

Finally, simple updates that diversify a model’s training data can make a difference. We added unrelated tools and system prompts to a simple chat dataset targeting harmlessness, and this reduced the blackmail rate faster.

Drag Post #9

Anthropic

@AnthropicAI

Read the full post here: https://alignment.anthropic.com/2026/teaching-claude-why/