Anthropic Built an AI They're Too Scared to Release — benchmark.space

Channel: josh :)

Date: 2026-04-08

Duration: 8min

Views: 2,239

URL: https://www.youtube.com/watch?v=V8d6ggx4YNQ

Anthropic leaked an internal document describing Claude Mythos — an AI model so powerful they refuse to release it.

It found vulnerabilities in Linux, OpenBSD, and FFmpeg that went undetected for decades. Here's what it can do and why they're keeping it locked down.

Anthropic confirmed the leak on March 26, 2026 after security researchers discovered nearly 3,000 internal documents were publicly accessible. One of them described an AI system codenamed Capybara (Claude Mythos) that achieved a 93

This is a leaked internal document from Anthropic describing an AI model they didn't want you to see. I think this is the first time I've actually felt the fear of powerful AI, even as an AI safety researcher, even reading this book, "If Anyone Builds It, Everyone Dies," which sounds pretty scary. I really think that Claude Mythos is the first glimpse into the future that people like Eliezer foresaw. Okay, so get this. They pointed Claude Mythos at a Linux machine and, without any advanced prompting or advanced directives, it found on its own multiple kernel issues that it was able to chain together to get root access to any system. Now, if that's not scary to you, you must not fully understand just how much of the world runs on Linux. We're talking about world government infrastructure, US infrastructure, China infrastructure, YouTube, Google, Apple, all of it. All of it probably touches Linux in some way. And now we just have machines

that know how to break the entire world's infrastructure. Now, luckily, when they saw this, they did something I haven't seen any other AI company do. They're officially declaring that there will be no public access to Claude Mythos until way more safeguards are in place. I don't know if I'm convinced that I want just anyone to be able to have access to something that can hack all my devices, but I at least feel a little better knowing that they called Apple, Google, CrowdStrike. Basically, any company that holds up our technological infrastructure, they called them and are giving them access before open source AI catches up or some other company releases a model that's just as powerful, if not more. Okay, let's talk more about the leaks. On March 26th, we had two security researchers. They found that Anthropic's content management system had everything public by default. So, if you wanted to upload something, you had to opt out of it being public, which obviously is a recipe for disaster because someone's going to forget, someone's not going to get clued in, and there are going to be

leaks like the one we have here. One of them is a full blog post describing Capybara and how it's so insane that they can't release it to security researchers. Now, I'm not going to lie, when I saw this, I took it with a grain of salt because I have been done in by leaks, but Anthropic did quickly confirm that it was real. By the way, 5 days later, they also leaked Claude Code, but that is a whole 'nother story. But it is hilarious to me that the model built to detect security leaks and vulnerabilities was leaked by a security vulnerability. I mean, come on. The irony is just fantastic. Okay, but now today, April 6th, Anthropic has released what the model is actually capable of, and it's a doozy. I'm only going to give you one benchmark because I know that the moment I start listing stats, your eyes are going to glaze over, you're going to click on the next video. I get it. I do it, too. No judgment. SWE-bench Verified. The

real-world software engineering benchmark that everyone actually cares about. Mythos scores 93% on this benchmark, and the current best public model, Opus, scores 80.8%. Nothing from OpenAI, Google, or any of the open source models is coming within 13 points of this. It is insane, this jump in capabilities. Okay, fine. I'm just going to flash the other benchmarks for a second so you can see it's a lot better. Okay, fine. Fine. No more stats because stats are whatever, but let me tell you what the model has actually done, which I think is even more insane. So, they pointed it at OpenBSD, and if you don't know what OpenBSD is, you're probably not a doomsday prepper. It's like an operating system for paranoid survivalists. People run all their firewalls on it, and it's pretty much what you use if you truly do not trust anything else. Mythos found a bug that had been sitting in the codebase for 27

years. It was a remote crash vulnerability where you can actually just connect to it and just shut it off. I actually feel for the hyper-security people who are like, "Oh, I'm good. I'm on OpenBSD." Well, I guess not, buddy. I guess not. Now, I know bugs are found everywhere, but I think what you have to understand is that this thing has had decades of security audits, so many reviewers that really care and are really knowledgeable about security, and they all missed it. Then Mythos found more. You know FFmpeg? It's in everything, every video app, every streaming service. It's the duct tape that allows videos to be played on the internet. Well, in this one, Mythos found a 16-year-old vulnerability, but worst of all, we have the Linux kernel thing that I mentioned in the beginning of this video. It just found a way to get root access to any Linux machine. Man, I'm really bad at this YouTube thing. My camera ran out of battery, but I've got to say, I

don't know how I feel about this, honestly. Anthropic claims that we need to do it first and be the good ones to do it because bad actors are going to do it soon, and if they get to it first, it's going to be really bad for the world. I mean, I understand the logic. It's just, how do you know who's a bad actor in this scenario, and how do you make sure that the power doesn't get to your head and corrupt you in some way? And if you're a company trying to make profits, how do you make sure that the incentives of the company don't supersede the safety that's needed? I don't know. I don't know. At the same time, I am confident that there are other organizations that, if they had the power to take down US governments with their model, would have probably done it. Now, I know people are going to talk about the hype. Obviously, Anthropic is a profit-driven company. They're looking to IPO this

year, and what a wonderful way to have your cake and eat it, too: not only have you made the most powerful system to ever exist, but you are also so responsible that you're not even putting it out there. It's such a humble brag, but I do think it's real. Yes, there's a lot to gain from them hyping themselves up, but they actually don't really need to lie because what they've done is actually unprecedented and never-before-seen. This AI stuff is getting kind of scary. I can't believe the best cybersecurity tool in the world has been made by a company that left their content management system set to public. I like to think of myself as a techno-optimist. I feel like I've always had this dream since I was a kid of going into technology and using it to make the world a better place and using it to empower human beings in ways that we've never been able to, using it to do things like fly to Mars, expand our lifespans, get rid of

diseases, and just live in a solarpunk world where technology is blended with nature and all is solved. And I think the people rushing to make these also share that vision. But the fact that most experts in AI think there's a chance, like a 10 to 20% chance, that we're all going to go extinct is not really how I wanted all this technological progress to happen. I don't know what's going to happen, but I thought I needed to let you know. I'm curious what your thoughts are. And if you just hate AI and just want to... I'm sorry. I think the world's permanently changed, in the way that somebody who hated cars when the first car came out just had to adapt to a new world. My heart really goes

out for the people that hate this so much. I feel bad for them. I feel really bad because it's like your life is being consumed by something that you don't want and you have no control over. Anyway, I'm going to end the video there. Claude Mythos is scary.