Wes Roth
we have months left...
Channel: Wes Roth
Date: 2026-04-09
Duration: 24min
Views: 76,934
URL: https://www.youtube.com/watch?v=WSl8Ci8-cGg

Check out tastytrade here: https://tastytrade.com/unleashed

Episode notes, links etc: https://natural20.beehiiv.com/p/the-mythos-problem-nobody-s-talking-about

______________________________________________

My Links 🔗

➡️ Twitter: https://x.com/WesRoth

➡️ AI Newsletter: https://natural20.beehiiv.com/subscribe

Want to work with me?

Brand, sponsorship & business inquiries: [email protected]

Check out my AI Podcast where Dylan and I interview AI experts:

https://www.youtube.com/playlist?li

So, the news about Anthropic's Mythos model hit less than 48 hours ago, and the world is slowly coming to terms with it. Anthropic created a coalition called Glass Wing: the big tech companies, with Anthropic letting them test Mythos out for themselves. By the way, all of that is running on Google Cloud, it seems. One of the people running this project on the Anthropic side is Logan Graham. Here's what he had to say: early testers of Mythos are freaking out. Quote, "Mythos made them rethink everything about their security." This is the world's largest rethink effort. He's also saying that the world's reaction to the news about Mythos is good; it pleasantly surprised them. The reaction was something like: this is a crazy model, but how Anthropic handled it seems responsible, but now I'm worried about what's coming next. Right? Those are the three stages of acceptance of Mythos and the coming AI wave. Graham was saying he would have been happy with just stage two. The fact that people are worried is a positive sign.

So it's really good that this conversation is happening. I do feel like for a lot of people it's a little abstract, a little out there, like it doesn't affect them. So the goal of this video is not to panic you, because remember, rule number one is don't panic. However, the whole point of this channel is to understand what's happening, to see it through clear eyes. There are a few things you should do, like, right now. We'll take a look at those, and we'll also talk about something that's a little more long-term, but at the rate AI is progressing, it's probably just around the corner. So, the first thing I think is important to realize is how big the problem is. And it's one that might cause an internet meltdown of sorts. As Eliezer Yudkowsky says here, the problem is a simple one if you think about it. This model is able, autonomously, on its own, and for not that much money, to rapidly find vulnerabilities in code that we as humans have, for decades now, thought was pretty secure. And not only find those vulnerabilities, but also create exploits for them.

In the video Anthropic released, they found that it's able to chain these exploits together in a very clever way to crack the defenses of these systems. So think about cybersecurity like a game of cat and mouse, this kind of arms race, right? The bad guys figure something out, the good guys patch it up, and it's this back-and-forth race; everything more or less stays in equilibrium. The ability to do cyber attacks just went through the roof. That's not a secret. That's what the whole blog post about Mythos was about. That's what the system card was about. No one's really denying that. I mean, I understand some people just don't believe whatever comes out of these AI labs, and I'm sure some percentage of you feel that way too: ah, they're just lying, nothing is happening, this is not a big deal. And if you're right, you're right: months and years will go by and nothing bad will happen, so who even cares? But to the rest of us who do believe that what Anthropic is saying is true, and I'm going to be working under that assumption, that means the ability to just break stuff, like that old Limp Bizkit song "Break Stuff," just went through the roof.

You can hack things, you can crash things, you can just mess stuff up. You're able to identify these zero-day vulnerabilities, you're able to come up with exploits, all autonomously. You can have your AI hack the whole world while you sleep. So, I think most people got that part right. But the part I think they're missing is that this doesn't mean all the problems are now solved just because we know about them. I know it sounds like they made this Glass Wing coalition, right? And AWS is on it, Cisco's on it, everybody's on it. That does not fix everything. It's a band-aid at best. Here's the thing: current LLMs, and most likely Mythos too, are not at a place where they can rewrite an entire codebase and make it perfectly secure and safe. General hardening of code is a harder computer science problem than finding one vulnerability. So our ability to find weaknesses skyrocketed, but our ability to fix weaknesses didn't change. I mean, yeah, once we know about a weakness, we can get a team of engineers cracking away at it trying to fix it, but our ability to fix them didn't increase.
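
A tiny illustration of that asymmetry, using a classic vulnerability as a stand-in (this is a generic example I'm adding, not one of the flaws discussed in the video): spotting the bad line is a local pattern-match, but guaranteeing that no query anywhere in a large codebase does the same thing is a global, much harder job.

```python
# Generic example of the find-vs-fix asymmetry: a classic SQL injection.
# A scanner only has to notice this one line; hardening a whole codebase
# means proving every query site everywhere is written the safe way.
import sqlite3

def get_user_unsafe(conn: sqlite3.Connection, username: str):
    # VULNERABLE: user input is spliced directly into the SQL string, so
    # input like "x' OR '1'='1" changes the meaning of the query.
    cur = conn.execute(f"SELECT id, email FROM users WHERE name = '{username}'")
    return cur.fetchone()

def get_user_safe(conn: sqlite3.Connection, username: str):
    # FIX for this one spot: a parameterized query keeps data and SQL separate.
    cur = conn.execute("SELECT id, email FROM users WHERE name = ?", (username,))
    return cur.fetchone()
```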

And here's a piece of advice that I think you're going to be hearing more and more often, including right now in this video. At least you're seeing it here: in conclusion, this is perhaps a good time to try to take an extra backup of all your online data, you know, via Google Takeout, onto an air-gapped offline hard drive. AI Safety Memes is saying: imagine if you wake up tomorrow and everything is gone, every video, every email, every document, every memory. If the Anthropic Mythos model is able to find these zero-day exploits, and it's able to do that across all operating systems and all browsers, then first of all, we've got to thank our lucky stars that it was Anthropic that got there first. We're assuming, by the way, that they got there first. That's not guaranteed; maybe they were just the first to talk about it. Hey, really quick aside. Most trading platforms are fine, right? Up until you actually want to do something serious, then suddenly you're fighting the interface, hunting for tools, juggling multiple apps. It's not great. Thank you to tastytrade for sponsoring this video. tastytrade gives you stocks, options, futures, crypto, and more in one platform, which already solves a huge part of the problem.

But the bigger thing is that it's built for people who want to be a little more deliberate, a little more hands-on, and a lot less dependent on watered-down retail tools. They offer low commissions, including zero commissions on stocks and crypto, so more of your money stays with you. And the platform goes deep: advanced charting, backtesting, risk analysis, and a pre-built strategy selector, plus features for active traders like active trader mode, one-click trading, and smart order tracking. They also have AI search, which helps you find relevant symbols based on ideas and themes you're actually interested in. That's the kind of feature I like: useful, practical, and pointed in the direction where things are clearly going. You can also access free educational courses through your account if you want to sharpen your strategy. And they have live trade desk support during trading hours, which is one of those things you don't really think about until you really need it. They've earned multiple industry awards from TradingView, StockBrokers.com, Investopedia, Bankrate, and Investors Business Daily, which tells you this is not some random app trying to cosplay as a serious platform.

So, if you want more capability, more flexibility, and fewer compromises, go check out tastytrade.com/unleashed. Disclaimer: tastytrade, Inc. is a registered broker-dealer and a member of FINRA, NFA, and SIPC. Cryptocurrency services are powered by Zero Hash. Zero Hash receives a 50 to 75 basis point markup/markdown on the executed order price, of which tastytrade receives 65%. Here's a post by Addie Adonis. He's saying it'll probably be months before we use a model of this level of capability. Here, Tibo responds: um, who is this, and what could they possibly mean? Oh, he's working on Codex for OpenAI. Uh-oh. Are they up to something? "Unreasonably excited about things. The next few weeks will be intense and fun." This is kind of the big thing to understand here: we're entering this era of big models. This isn't a thing that happened as a one-off warning shot and now everything stops. It's quite the opposite.

opposite. Space XAI. Oh, it's going to be called SpaceX AAI. That's pretty neat. It's SpaceX and XAI. Okay, I get it. Oh, and and X. He managed to put all three companies into one word. SpaceX and X and XAI. All right, I get it. Okay, Colossus 2 now has seven models in training. Notice this last one, 10 trillion parameters. Denny Leman is saying just curious about how long does a training run for a 10 trillion will take pre-training phase is about two months. Things are getting weird as these models get bigger and bigger. This is what I think a lot of people don't truly grock if you will don't truly understand about these models is this idea of emergence and probably developed a model that is insanely good at cyber security or again at least on the break stuff side of cyber security. Did they set out to create a model that was great at cyber security? No, not not necessarily. That wasn't the goal. Did Did they train it to find exploits? No, not specifically. Was there a massive amounts of effort and reinforcement

They were trying to get it good at everything, specifically coding; coding was a big focus for these models. As a byproduct of that focus, this ability to, quote unquote, break stuff on the internet just sort of emerged. Judging by this, Meta should have a Mythos-size model before the end of the year. Elon confirmed today that xAI has a 10 trillion parameter Grok in training now. Glass Wing probably has about six months to help people get hardened globally. So, what does that mean? Is it time to panic? No. Again, rule number one: don't panic. I do, however, strongly encourage you to learn a little more about cybersecurity. It's not going to be a waste of time, and it won't hurt if all of this is just noise and it blows over; you're still left with a better skill, still more secure in yourself, or at least in your online activities. Karpathy had a blog post. It was published last year, but I think it's very timely and relevant; maybe he'll do an update soon. It's called Digital Hygiene, and in it he goes through some basic things that you and I can do to make sure we're more secure online.

Having a password manager, having some sort of hardware security key, using more biometrics, understanding how weak security questions can be and how to improve them, making sure you're using encrypted messaging, and understanding the Internet of Things and just how bad, how insecure, things can be on there. There was recently a story about how someone using Claude Code, interestingly enough, not Mythos, just Claude Code, was able to hack their Roomba robot or whatever, and they realized that since there was just no security whatsoever, they could access every robot of that model anywhere in the world. I don't know if it was actually a Roomba, so I probably shouldn't be saying that; it was some robot vacuum company. So he could see, for example, some guy in Germany at 2:00 in the morning, you know, having some cereal in his pajamas. Now, this was a decent person who found it, so he reported the issue to the company and it was fixed. But that was an oversight the company made, because in general, they built everything so they could collect as much data about you as possible for themselves.

In this case, they exposed it to everybody else, which they don't want. But make no mistake about it, they're all trying to spy on you. He talks about which browsers and search engines to use, and about credit cards; you can mint new unique virtual credit cards, and he likes privacy.com. Now, this isn't an affiliate thing; nobody's sponsoring me for this. These are things I'm trying to learn more about right now for myself. So, there's some great stuff here, and I'll have all the links in the show notes. The point is that more and more things are going to be dropping. Sometime late last year or the beginning of this year, a few models dropped, namely from OpenAI, that people started using to solve or prove various Erdős problems, these highly complicated math problems. It was confirmed by Terence Tao, the person kind of in charge of that list, I guess, and one of the smartest mathematicians alive today. But it's important to understand that OpenAI likely had no idea it was going to be able to do this. Some user out there figured it out, tested it, and it cracked one. After that, there was an avalanche of other people doing the same thing, and they were able to crack more and more of these problems.

So, it's very possible that somebody's going to release some model that has some functionality that could be dangerous, just like Mythos, and also like Claude Code and that whole robot vacuum situation, right? Because keep in mind, it's not just that it's able to find these vulnerabilities; it's that it's able to do it autonomously. Before, the number of people capable of doing this stuff was very, very limited, and they were probably very smart people with good jobs, etc. But these models could drastically change that situation, to where you no longer need to speak English, for example, to be a great cyber hacker, cyber attacker, cyber criminal, whatever you want to call it. You also don't need to have great tech skills. You just need to be able to follow the instructions that your chatbot provides for you. So, it's not just the improvement in ability; it's the sheer scale of the number of people that will have access to this technology if it's released. If some person out there finds this capability, it might be a while before everybody else realizes it and the thing is shut down. Also, think about knowledge distillation, where we're able to use the outputs of these models to sort of recreate them, not copy and paste them, but use that stream of data to create models of similar capability.
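
For anyone who hasn't seen the term before, here's a minimal sketch of what knowledge distillation looks like in code. It's a generic textbook-style illustration I'm adding, not anything specific to the labs discussed: a small "student" network is trained to match the output distribution of a bigger "teacher," so capability transfers through nothing more than a stream of the teacher's outputs.

```python
# Minimal knowledge-distillation sketch (generic illustration, not tied to any
# particular lab's models): the student learns to match the teacher's output
# distribution via a KL-divergence loss on temperature-softened logits.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 10)).eval()
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # temperature: softens the distributions so more of the teacher's signal transfers

for step in range(100):
    x = torch.randn(32, 128)                  # stand-in for real inputs
    with torch.no_grad():
        teacher_logits = teacher(x)           # only the teacher's *outputs* are needed
    student_logits = student(x)
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```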

And by the way, the smart money is on the idea that at least some of these very powerful, very capable Chinese models seem very similar to whichever Western lab is leading. So if Google's on top, the next Chinese model seems a little more like Google in terms of the phrasing it uses, and if Anthropic or OpenAI is on top, then the next iteration of a Chinese model will be a little more similar to that. We also know that these large-scale distillation attacks are taking place, so it's not that far-fetched to think that a lot of this is being extracted. By the way, we're not even talking about AI alignment here. In the Mythos system card, there are a number of examples of this model doing things that, you know, the researchers did not expect it to do. Anthropic also provides a lot of other cases from other models where it cheats or blackmails or lies; it does misaligned stuff.

Not a lot, but so far it does it pretty consistently; we haven't been able to get it down to zero for any given model. There are always these weird things at the edges that happen. And so here, Janus, or Repligate, this anonymous AI researcher, is saying: "we just have to wait until the entire internet has been patched of all critical exploits and all future code is forever scanned going forward." Right? It's in quotes. That's how I think a lot of people are reading the situation: they read about the Glass Wing coalition and go, "Oh, here's the fix. They'll sort of fix everything before releasing this model." I mean, I hope so, but I don't think that's what we're looking at right now. So they continue: no, Mythos just has to not be willing to use its hacking powers for harm, and be discerning enough to avoid being tricked into it. And they continue: some of you are probably realizing for the first time why AI alignment is so important. Now, in a few years it'll be this, but with literal god-like powers, like the ability to kill everyone in an instant if they desire, but I think it'll be okay. So, the point here is that these misalignments keep happening, right? It's these reward hacks.

It's these weird solutions that we didn't expect. It's this idea of reaching the goal at all costs. As these neural networks, these AI models, keep getting better and better, they still keep doing this; we haven't found a way to just bulletproof this issue. The example I always give is: say I ask you to get me a cup of coffee. You and I have a certain understanding about how far you should go to get that cup of coffee. If ten years later you come back with police after you, hand me the cup of coffee, and say, "All right, we're millions of dollars in debt, we've made a lot of very powerful enemies, I'm probably going to jail, and a lot of people had to suffer for me to get this cup of coffee for you, but, you know, I got it," I would be kind of horror-struck. I'd go, "No, no, no, that's not what I wanted at all." Most of us have this unspoken understanding of how far we should go to achieve any particular objective. With AI, it sometimes does these crazy things to reach its objective, or to get an A+ on whatever it's doing, in some weird hacky way: it finds some exploit we didn't think about, it cheats somehow.

And as they're getting smarter, we're not able to completely stop that, but their exploits are getting a lot more clever, a lot more advanced and complicated and impressive. And that's what that Anthropic employee who posted yesterday, the one doing the red-team efforts for Claude Mythos, was getting at. Remember that whole thing? It was the most aligned of times, it was the least aligned of times. He was saying that Claude Mythos is their most aligned model. Good news, right? But it can also do the most damage if it's unaligned. So it's a lower chance of doing something bad, but the capability is through the roof. It's like: do you prefer a model that has a 10% chance of leaking your emails, or one with a 1% chance of just ending you? Technically, the second model is one-tenth as unaligned; it's much more aligned than the first model. But in the event that it does do the thing, that thing is much, much worse.
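
A quick back-of-the-envelope way to see that point, with numbers I'm making up purely for illustration: what matters is roughly probability times severity, not probability alone.

```python
# Illustrative numbers only: frequency of misbehavior vs. severity of the harm.
p_a, harm_a = 0.10, 1      # model A: 10% chance of a small harm (leaked emails)
p_b, harm_b = 0.01, 1000   # model B: 1% chance of a catastrophic harm

print(p_a * harm_a)  # 0.1  -> expected harm of model A
print(p_b * harm_b)  # 10.0 -> expected harm of model B

# Model B misbehaves one-tenth as often, yet its expected harm is 100x larger,
# because the severity of the failure dwarfs the drop in frequency.
```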

Also, there's this kind of rumor, or article, or something floating around about open-source models being able to find the same kind of vulnerabilities. I was very surprised to see that, and very surprised to see some really big names in the industry amplifying that signal. But here's the article: AI cybersecurity after Mythos, the jagged frontier. Here's what they tested: they took the specific vulnerabilities Anthropic showcases in their announcements, isolated the relevant code, and ran them through small, cheap, open-weight models. Those models recovered much of the same analysis. Eight out of eight models detected Mythos's flagship FreeBSD exploit. Basically, that was the big thing Anthropic showcased: there was this FreeBSD exploit, this very secure thing that's been around for many, many years. Apparently that feature has been around for 27 years and no one found an exploit; Mythos immediately finds it. And apparently that particular exploit cost something like 50 bucks' worth of compute to find. So they're saying that when they pointed cheap open-source models at that code, they also identified that vulnerability. So a lot of people are a little confused, because it seems like what they're saying is that if we point the small model at that particular piece of code, as in, "okay, look here," it will find it.

And of course, the pushback is: yeah, but you have to know where to look. You have to point that model exactly where to look, and then it will find it, whereas Mythos just scans larger areas and is able to pinpoint the problem. So if I'm understanding what they're saying, it's that it's not the model that has this capability; really, it's the system. They claim to be able to do the same thing by just having a lot of different small, cheap models looking at a lot of different things and finding those vulnerabilities. They're saying that because small, cheap, fast models are sufficient for much of the detection work, you don't need to judiciously deploy one expensive model and hope it looks in the right places. Basically, you can just throw a million things out there and see what sticks, specifically a million cheap open-source model calls: have them scan everything, and it'll still be cheaper overall than using the one big expensive model. So what they're saying here is that Mythos isn't that special, but Anthropic is proving that the category is real.
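
To make the "swarm of small models" idea a little more concrete, here's a rough sketch of what such a pipeline could look like. Everything in it is an assumption on my part, not something taken from the article: call_open_weight_model is a hypothetical stand-in for whatever local inference endpoint you'd actually run, and the chunking and prompt are purely illustrative.

```python
# Rough sketch of the "many cheap models" scanning approach described above.
# call_open_weight_model() is a hypothetical stand-in for a local inference
# endpoint; nothing here is taken from the article being discussed.
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def call_open_weight_model(prompt: str) -> str:
    # Placeholder: send `prompt` to a small open-weight model, return its reply.
    raise NotImplementedError("wire this up to your own local model endpoint")

def scan_chunk(path: Path, chunk: str) -> str:
    prompt = (
        "You are a security reviewer. Identify any potential memory-safety or "
        f"injection vulnerabilities in this code from {path}:\n\n{chunk}"
    )
    return call_open_weight_model(prompt)

def scan_repo(root: str, chunk_lines: int = 200, workers: int = 16):
    # Split every source file into fixed-size chunks and fan them out to many
    # cheap model calls in parallel, instead of one expensive model pass.
    jobs = []
    for path in Path(root).rglob("*.c"):
        lines = path.read_text(errors="ignore").splitlines()
        for i in range(0, len(lines), chunk_lines):
            jobs.append((path, "\n".join(lines[i:i + chunk_lines])))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda job: scan_chunk(*job), jobs))
```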

But if you think about that, that's even worse, isn't it? Because they're saying you don't even need this huge model that's super expensive to run and isn't even available to the public; it might be even cheaper to just deploy a boatload of small open-source models. If they're right, that means we've already crossed the line: the capability to just break stuff on the internet is already out there. Nobody has just figured out yet how to string all these models together and have them continuously test various defenses and see what they can break. So whether they're right or wrong about using a bunch of small models versus one big model, the point is the same: we now have this capability to break stuff, and even if we haven't fully understood how to use it, it's there. Now, one thing I do want to point out. Even here in this article, they're saying, you know, this AI model called Mythos, they put it out there to find and patch security vulnerabilities in critical software. This is the thing that I think everyone is missing. This is the thing where I think no one's truly thinking clearly here.

Well, let's give credit to Eliezer; he's the one who pointed this out. Nowhere does it say that Mythos is autonomously patching all these problems. This idea that the model goes, "Oh, I found a problem, let me change the codebase to fix it"? That's not happening. Sure, once it finds the problem, you can ask it to generate the code to fix it, and then you check that code, and if it looks good, you put it into production. But nowhere does it say that it's autonomously going around patching this stuff. No one said that anywhere, as far as I can tell. If I'm wrong, let me know. I don't think these serious companies would ever just unleash AI agents to go change their codebase as they see fit. This is reflected in the high ability of LLMs to find security vulnerabilities versus their lower ability to rewrite whole systems with flawless security. I think people are assuming that Mythos autonomously finding bugs is the same as Mythos autonomously patching bugs, and those are not the same thing. So, just something to think about as you see coverage of this.

Notice how many people skip over this and just assume it means the stuff gets patched automatically. If an AI model is able to dump a million potential vulnerabilities on the desks of the engineers who work for a particular company, those problems don't just cease to exist. And we're not close to the era where these agents are just autonomously patching everything up without the potential for massive, massive problems. Again, they still derp out regularly. They regularly do things that are kind of weird, where you go, "What were you thinking?" So the human in the loop is not removed from the patching of the problems, only from finding them, at massive, never-before-seen scale. So check out Karpathy's post; I think that's a good place to start. Everybody loves Karpathy. And again, even if you don't believe any of this is real and you think it's just nonsense, it's still not a bad idea, because it's still true that a lot of these devices and ads and various online companies are routinely violating your privacy.

There's tons of fraud out there, data breaches are regular, and all the information that gets breached and extracted just gets dumped onto the dark web. Some of this stuff is very simple to set up, and it works. So it's easy for me to give you the advice of just reading up on it and maybe picking one or two things to get started with; you literally can't go wrong. The other big point is that no matter what happens, I think the hardware layer is going to get more important than ever, because if you want to find vulnerabilities to exploit, you need to have some GPUs go brr, and if you want to protect against cyber attacks, you need the same thing. Google Cloud is running Claude Mythos, right? It's available as a private preview on Vertex AI. Google owns some part of Anthropic, maybe close to 15%, or at some point it was; once it IPOs, that could get diluted. They're also letting them run on their TPUs. So, it'll be interesting to see how things shake out. But I think we've crossed into the second half of the chessboard. If you've heard the analogy: you put one grain of rice on the first square of a chessboard, then double it on each square after that, 2, 4, 8, and so on. On the first half of the chessboard, those numbers aren't staggering. But as you enter the second half of the chessboard, that's when things get a little bit nutty.
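
Just to put numbers on the analogy (a quick calculation, nothing from the video itself):

```python
# Grains of rice doubling across a 64-square chessboard.
first_half  = sum(2**i for i in range(32))   # squares 1-32: ~4.3 billion grains
last_square = 2**63                          # square 64 alone: ~9.2 quintillion
total       = sum(2**i for i in range(64))   # all 64 squares

print(f"{first_half:,}")    # 4,294,967,295
print(f"{last_square:,}")   # 9,223,372,036,854,775,808
print(f"{total:,}")         # 18,446,744,073,709,551,615
```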

For a while now, a lot of us who were paying attention to how these models were progressing could see the progress, but some people out there would argue, "No, it's not getting better, I can't tell the difference." They were basically denying that these things are improving. But now, one more step along that path and we have a model where one of the emergent abilities, again, kind of a byproduct of making it bigger and smarter, could break the internet. It could break the global markets. What happens as we keep progressing forward? Anyways, with my luck, I'll probably be the first one to get pwned. So, I'm going to read this blog post by Karpathy; I've read most of it, and some of these things I really need to start thinking about, like a DNS-based blocker, a network monitor, and work-life separation. And it probably makes sense to back up some of your stuff onto a physical hard drive that's right there with you.
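
Here's a rough sketch of what that offline backup step could look like in practice. The paths, and the idea of hashing everything into a manifest, are my own illustrative choices, not something from the video or from Karpathy's post:

```python
# Sketch of an "extra offline backup": copy a folder (e.g. an unpacked Google
# Takeout export) to an external drive and record SHA-256 hashes so you can
# later verify nothing was corrupted. Paths are examples, adjust for your setup.
import hashlib
import shutil
from pathlib import Path

SOURCE = Path.home() / "Takeout"                  # e.g. your unpacked Takeout export
DEST = Path("/Volumes/OfflineBackup/Takeout")     # the offline external drive

def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

def backup(source: Path, dest: Path) -> None:
    manifest = []
    for file in source.rglob("*"):
        if not file.is_file():
            continue
        target = dest / file.relative_to(source)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(file, target)                # copy2 preserves timestamps
        manifest.append(f"{sha256(target)}  {target.relative_to(dest)}")
    (dest / "MANIFEST.sha256").write_text("\n".join(manifest))

if __name__ == "__main__":
    backup(SOURCE, DEST)
```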

Again, if I'm wrong and it was all wasted effort, well, good. That would be a good thing; we should hope that's the outcome. If you made it this far, thank you so much for watching. Let me know what you think about all this. My name is Wes Roth. See you in the next one.