benchmark.space
YouTube · 2026-04-08
"It affected Claude Opus 4.6 and Sonnet 4.6 as well. When the reward code saw misaligned chains of thought, bad thoughts in other words, it could give a negative reward."
AI Explained
AI analysis YouTube channel
Claude Opus 4.6