Wes Roth on Claude Sonnet 4.6

YouTube · 2026-04-06

"When the system was asked a question where desperate was more activated, the model was also more likely to take an action like blackmailing somebody or cheating on a coding task when they werent supposed to. And when it was slid up in the calm spectrum, those bad behaviors went down."

Wes Roth

AI YouTuber / Podcaster

Claude Sonnet 4.6

view original source → all researcher takes →