Cognitive Revolution: "How AI Changes Everything"
Success without Dignity? Nathan finds Hope Amidst Chaos, from The Intelligence Horizon Podcast
Channel: Cognitive Revolution "How AI Changes Everything"
Date: 2026-04-01
Duration: 104min
Views: 37,377
URL: https://www.youtube.com/watch?v=CQOQvHWsENs

This special cross-post from The Intelligence Horizon features Nathan Labenz in a wide-ranging conversation on compressed AI timelines, expert disagreement, and why he believes the singularity is near. They discuss interpretability, RL scaling, and the balance between extraordinary upside, like curing major diseases, and serious existential risks. Nathan explains his evolving p(doom), why he’s slightly more optimistic about robustly good AI, and how defense-in-depth strategies might keep society on the rails.

Hello and welcome back to the Cognitive Revolution. Today I'm sharing a special cross-post from my recent appearance on the Intelligence Horizon podcast with hosts Owen Jang and Will Sanuk Dufalo. Owen and Will will soon be graduating from Yale College, and as you'll hear, they've clearly spent much of their senior year thinking deeply about the current state of AI, where we're headed, and what it means for all of us. I was really impressed not only with the quality of their questions, but with their ability to challenge me with follow-ups that effectively steelman the most relevant counterarguments. We start with the fact that while AI timelines have compressed dramatically over the last 5 years, genuine experts still disagree radically on critical questions. Having established what I hope is appropriate epistemic humility, I then go on to call it as I see it. In short, the singularity is near. Interpretability science shows that AIs are developing increasingly sophisticated world models. And with reinforcement learning scaling now clearly working, AIs are no longer

simply imitating humans and likely won't be limited by what we know for much longer. The potential upside of this is, of course, incredible. The value I've gotten from using AI to navigate what humans have discovered about how cancer works and how to treat it has been immense. And the prospect that we might cure the majority of human diseases in just the next decade or so is obviously extremely exciting. That said, the risks are also very real, and they will remain serious for as long as we lack a solid understanding of how AIs work and why they do what they do. My p(doom) remains somewhere in the 10 to 90% range. And yet, at the same time, I've become at least a little bit more optimistic that we might actually build robustly good AIs, because scaling laws at least seem to imply that powerful AIs can only be created with massive resources, the three companies competing at the frontier today are at least reasonably responsible actors, and our best alignment techniques are working better than I had expected.

Given these fundamentals, it seems at least plausible that a defense-in-depth strategy, which combines techniques like Goodfire's intentional design, Redwood's AI control, improved cybersecurity through formal verification of software, and various forms of pandemic preparedness, could collectively be enough to keep society on the rails. We touch on a number of other topics as well, including the US-China rivalry and why, especially in the context of the Department of War's recent attack on Anthropic (which I'm sad to say has us looking more and more like China all the time), I would rather bet on figuring out a way to cooperate with our fellow humans than bet everything on AI researchers' ability to steer AI advances in a way that will ultimately work for us humans. I appreciate Owen and Will for allowing me to cross-post this conversation, and I definitely encourage you to subscribe to The Intelligence Horizon. Their recent conversation with former OpenAI researcher Zoe Hitig covered the evolving ways that people are using ChatGPT, variations on universal basic income, and AI governance models that

emphasize a decision-making process over specific principles, and why she believes that these kinds of structures will probably have to come from outside the frontier companies. For now, I hope you enjoy my conversation with Owen Jang and Will Sanuk Dufalo from The Intelligence Horizon. The Cognitive Revolution is brought to you in part by Google, makers of the Gemini family of models and much more. One of my big AI goals for 2026 is to find ways to spend less time at my desk and more time exercising and outside. The challenge is that I'm genuinely so obsessed with keeping up with everything that's happening in AI that it's hard to pull myself away from the screen. Google's NotebookLM gives me the best of both worlds. By creating podcast-style deep dives about whatever I'm curious about on any given day, it helps me keep learning even on the go. If you haven't tried NotebookLM for a while, you should know that it's become a much more steerable research and thinking partner.

These days, you can select short, medium, or long for audio length, and you also get a free text field to steer the direction of the conversation. For AI research, I always ask for rigorous, literal, technical explanations with no analogies. Recently, I used NotebookLM to study a paper by Google DeepMind researcher Rohin Shah, which attempts to set upper bounds on how much reasoning different kinds of models can do without needing to externalize their thinking in a chain of thought. The hope is that by establishing these limits, we can better calibrate how much confidence we should have in chain-of-thought monitoring. And standard transformers, in fact, are much more limited than alternatives like state space models, which is a notable upside to transformers that we shouldn't take for granted. If you're trying to keep up with the pace of AI research while staying physically fit, or you're just a natural audio learner, give NotebookLM a try at notebooklm.google.com. Thank you to Google for supporting The Cognitive Revolution. And now on with

the show. When it comes down to it, the kind of core question I think a lot of people are getting at is, like, is this AI thing going to kind of fizzle out before it really becomes a big deal, or is it going to be a huge, world-altering deal? I'm very much confidently, clearly in the camp of it's going to be a huge, huge deal. And the details are, you know, where I think the discussion or the debate remains now. Not, for me at least, you know, whether or not the overall trajectory of AI is going to take us to something that is powerful enough to be transformative. One of the strangest things in the world today, full stop, is the fact that the disagreement among very plugged-in, very informed, very smart people has not really been reduced much at all, even as we've gained a ton of information over the last couple years about the trajectory of AI. I think that is super strange, and I'm honestly pretty confused by it. The one thing everybody seems to agree on is, like, the timeline on which we should expect this has come in. In today's

world, if you say you don't think you're going to see AGI till 2035, you're, like, an AI bear, but only 5 years ago that was considered to be quite aggressive, and, you know, most people were more like, I don't know, 2050, maybe not in my lifetime. So there's been this massive compression of timelines. There's obviously been this huge jump in capability. And yet on these fundamental questions of, like, what's going to happen, you know, there's still total disagreement. Our guest today is Nathan Labenz, host of the Cognitive Revolution podcast. Before switching to full-time podcasting, Nathan founded Waymark, an automated marketing platform for local businesses that pioneered the use of generative AI to produce video ad campaigns. After leading the company for six years, he stepped back to focus full-time on understanding and communicating the trajectory of AI. As host of the Cognitive Revolution, he has conducted hundreds of in-depth interviews with AI researchers, founders, policy makers,

and investors, and has become a go-to source for people trying to keep up with what's happening at the frontier, including us. Nathan was also a member of OpenAI's red team, where he was among the first outside users to interact with GPT-4 before its public release. Welcome, Nathan. Great to have you on today. >> Thanks for having me. I'm excited for this conversation, guys. >> So, first question: are we on the cusp of AGI? Starting with the easy ones. [laughter] >> Well, okay. First of all, as I'm sure you guys are very well aware, what exactly we mean by AGI is a slippery question, at least one that has people talking past each other quite a bit. So, I don't think there's any, you know, like, super privileged definition. What I do think we're pretty clearly on the cusp of is powerful AI that is better than the vast majority of people, although perhaps not the, you know, very few top, top experts in a given domain, at

pretty much all cognitive work. That seems pretty clearly on the horizon. And I think it is absolutely going to be enough to be transformative to the economy, to daily life, and potentially even, you know, bigger things than that, like the very nature and status of the human species. And I think that we will have that kind of regardless of whether or not there are some niche areas where humans retain an advantage, which my guess is probably will be the case. Certainly, one of the big things that has become clearer over time as AI systems have gotten better is that they are jagged in this weird way, where they have certain things they just do amazingly well at, and other things they kind of weirdly struggle at. They're not very adversarially robust, for example. You know, they're easier than humans, I would say, still, to trick. So, you know, there's going to be weirdness. And I think, you know, throughout the conversation today, probably a big theme will be to expect weird things to happen. But, you know, when it comes down to it, the kind of core question I think a lot of people are getting at is, like, is this AI thing

going to kind of fizzle out before it really becomes a big deal, or is it going to be a huge, world-altering deal? I'm very much confidently, clearly in the camp of it's going to be a huge, huge deal, and the details are, you know, where I think the discussion or the debate remains now. Not, for me at least, whether or not the overall trajectory of AI is going to take us to something that is powerful enough to be transformative. >> Okay. So, it sounds like what you're saying is that there's no doubt that we're going to get massively powerful and transformative AI across domains. Maybe, like, there'll be some, you know, tiny little bit where human experts still have an advantage in specific domains. Maybe there will also be, like, comparative advantage. We'll get more in depth on what specifically those concerns are. But I guess just to hone in: are these more so uncertainties as to whether AI can fully generalize, or are you, like, pretty confident that for, like, the foreseeable future there will be these gaps between, like, human experts in specific domains and, like, problems like adversarial robustness? So,

just to clarify what your position is: is this, like, a prediction, or is this where your uncertainty lies? Your uncertainty doesn't lie in AI being transformative. >> Yeah, I think the latter. I wouldn't be shocked if there are additional unlocks that allow AI systems to truly, undeniably surpass what humans are capable of. I think you look at the natural world, and clearly humans did that to every other animal species that existed before us, right? So there is a historical precedent for some new kind of mind showing up on the scene and blowing away all the other minds that came before it. I don't think we have any reason to believe that that couldn't happen to us, in some, you know, law-of-physics-guarantee sort of way. So, I definitely think that's possible. But in terms of what I can confidently foresee, I don't think it is clear that that's going to happen in the next few years. But what I do think is, again, clear is that we are going to have systems that are powerful enough to be transformative across almost all the

questions that we care about, in terms of: what is society going to look like? How are we going to organize? Are we going to need a new social contract? Like, all those things seem to me pretty clear. And then, you know, people quibble often around the edges. I mean, there are some pretty esoteric jobs out there, right? So, like, could an AI system ever be as good of a sommelier as the best human sommeliers? Well, I don't know. You know, is anybody going to be motivated to train one to try to do that? You know, we don't really have a lot of taste AI at this point in time. And, you know, maybe tasting will be forever the domain of humans. But trying to identify these little niches where we may have some really long-standing, durable advantage, I think, too often distracts from the big question of, like, is this going to change just about everything that matters to us? And there I clearly come down that the answer will be yes. >> Right. So let's talk a little bit about how maybe we get to this transformative or extremely powerful AI. Do you think current paradigms (I think the one that comes to mind the most, and which is discussed the most, is scaling RL) are

going to be sufficient to get to that transformative or powerful AI that we discussed? >> Yes. I mean, I think it probably is, and I also think if it's not, we may never quite answer that question, in the sense that I do expect that we will continue to see new conceptual advances in AI research. You know, the field is growing. One of my sayings is everything is going exponential. That is the number of people that are working in AI research. It's the number of papers. It's the number of experiments being run. It's the amount of compute that people have to run those experiments on. It's the datasets that have been collected. It's the RL environments that are being built out over time. And I do think all those things are probably going to give us some new conceptual unlocks, such that we will probably never answer the question. And this already kind of happened with pre-training, right? If you rewind two to three years, there was a time when people were like, well, we're kind of running out of data, you know, can we really scale this all the way to AGI? And nobody's talking now about can we scale pre-training all the

way to AGI anymore, because there's been a new thing, and so now the new thing is on top, and it's like, well, clearly that's going to be part of the mix. Will this, you know, exact pattern or this sort of shape be enough to get all the way to AGI? My prediction is (again, like, what does AGI mean?) that it probably is enough to get us to systems that are transformative. But if we fast forward to 2028 and, you know, look back at this conversation, we'll probably say, well, nobody's really asking that anymore, because we do have a couple new things that have come online. And so now we have a little richer sense of, like, what the shape of it is going to be. And, you know, we're either getting there or we're not. We maybe are still missing a little something at that point, but I would guess that we would look back at the current thing and say, yeah, clearly a couple things have been added and they were a big deal. And so the question of, like, is exactly what we have in February 2026 enough kind of ends up being beside the point in the final analysis. >> Interesting. So what you're trying to say here is that in the same way that there were questions about whether pre-training would take us to this, like,

extremely transformative AI, now that we have these new unlocks, such as RL, for example, like we discussed, who knows, in maybe 2 to 3 years or, you know, whatever time scales we're talking about to reach transformative AI, there might be other conceptual progress that is also made that builds upon things like pre-training to get there. So I guess my question specifically was: do you think there will be more of this conceptual progress that needs to be made to get to this transformative AI that we're talking about, or do you think the RL paradigm is going to get us there? >> I think RL probably would be enough to get to AI systems that can, like, do most of the cognitive work in the economy, for example. I mean, it seems like, honestly, we're already reasonably close to that, and it doesn't seem like we're anywhere close to done. And by the way, if you listen to the lab leaders, the frontier model developer leaders, they are still saying that pre-training is still working. You know, it was never really the case that pre-training stopped working. As far as I know, those scaling laws have basically held. I think what happened at one

point in time was that the next step on the pre-training frontier was becoming really expensive, and then they found another, maybe not fully orthogonal but, you know, pretty different direction to go: scaling post-training with the RL paradigm that we have now. And that was just much bigger bang for their buck, at least at that moment in time, right? Like, they had already gone pretty far up the pre-training curve; they hadn't gone very far up the RL curve at all. Now, presumably, those things were going to kind of even out. You know, a general kind of economic theory would be, like, people should be investing in one until the marginal return decreases to the level of the other, and then they could maybe invest in both. So, if you find some new path that's like, oh, this is really giving us huge ROI, you go hard at that path for a while, but then that kind of hits some diminishing returns. Now, you're kind of back to, okay, well, maybe we need to do all these things more at the same time. So, we'll do a little more pre-training, we'll do a little more RL, and we'll do, you know, maybe more of that mystery third thing, all advancing in tandem. Right now, I

think we're advancing. I don't have insider information on, you know, exactly what these ratios look like. But I'd say it's probably roughly the case that we're close to, if not at, the point where additional compute going into RL is giving roughly the same kind of returns as additional compute going into pre-training. And so both are going to be, you know, places where frontier companies can invest for the time being. You know, I guess another way to think about this is, like, people also ask about generalization in RL. And I think, again, it's kind of worth unpacking: what does it mean to generalize? Like, what are we talking about generalizing? One way to think about the question is, if we do a bunch of RL on a model, does the model generalize to all sorts of new things? And there I would say, again, there's probably another way to break that down, which is, like, there are domain-specific skills, and then there are more kind of cognitive or even, like, metacognitive skills that work across

domains. So one of the biggest, like, I guess we'll call it the aha moment, for me and for the DeepSeek researchers and for the R1 model that they were training, was from the R1 paper. This is January 2025. They reported this, what they call the "aha moment," which is, in their process of doing RL on, you know, an already pretty capable base model, of course, right, they found that these previously unobserved higher-order cognitive behaviors started to come online. The aha moment in particular was in the reasoning trace: the R1 model gets to a point and it says, oh wait, this is an aha moment, like, I can come at this from a totally different direction. And this is something that hadn't been observed much. I mean, it's out there in the pre-training data. There are at least some examples, right, of people kind of documenting their own chain of thought and getting to these aha moments and realizing, "Oh, I was coming at it the wrong way. Now I can come at it this other way." But that hadn't been observed too much in AI

systems. Reinforcement learning seems to be bringing, you know, clearly has brought that sort of thing out. And now we have these long reasoning traces where the model will kind of come at the same problem from a bunch of different directions. And so, what generalizes, what doesn't generalize? I think you probably can't take a model that has never been trained on a particular domain and expect it to go into that domain and be successful. But you can expect some of these, like, meta traits to generalize from one area to another. And then if you zoom out the farthest and just say, like, does RL as a process that companies or, you know, organizations apply, does that generalize? And there I think the answer is, like, definitely yes. It's just a question of getting the reward signal dialed in to the point where it actually works. And that's definitely easier in some areas than in others. And so we do see, like, in areas where it's easy, like math and programming, we see fast progress there relative to things where it's harder to get a clear reward signal. But still, I think we are

quite obviously making that work. An episode of the podcast that's coming out, I think today as we're talking, is with the head of health at OpenAI, and, you know, I can tell you from personal experience the latest models are absolutely on the level of attending physicians. My son unfortunately has had cancer over the last few months. He seems to be very much on track to be cured and be all better, which is fantastic, and I probably wouldn't be here talking to you if that wasn't the case. But I've had occasion to really use the latest models intensively in a medical context, and they're absolutely on the level of the attending physicians. They know a lot more than the residents, and they really are step for step with the most senior doctors at the hospital. How is that happening? Well, they've worked with 250-plus human doctors closely at OpenAI to create training data, to grade outputs, etc., etc. But now their latest models are also outperforming their human doctors when it comes to the task of

evaluating AI outputs. So there are these kinds of thresholds that they're crossing, where, you know, it was really hard at first. I sometimes think of this as, like, spinning a big wheel. Your first pushes on this big wheel don't move it much, right? But as you build up this momentum, as the flywheel really starts to turn, you start to hit these thresholds where it's like, well, now we have a model that is outperforming our humans. You know, think of all the work that they had to put in: thousands and thousands of hours, and millions and millions, hundreds of millions of dollars potentially, to hire hundreds of doctors to do all this work. Now they've got a model that is beating the doctors at evaluating outputs in the medical domain. And so that totally changes the game. Those thresholds will be crossed at different times for different domains. But I think it's, like, safe to say that in most domains, if there's any sort of objective ground truth or even, like, a high level of agreement among professionals, you can get there. It just takes more time. >> Hey, we'll continue our interview in a moment after a word from our sponsors. >> Everyone listening to this show knows

that AI can answer questions, but there's a massive gap between "here's how you could do it" and "here, I did it." Tasklet closes that gap. Tasklet is a general-purpose AI agent that connects to your tools and actually does the work. Describe what you want in plain English: triage support emails and file tickets in Linear; research 50 companies and draft personalized outreach; build a live interactive dashboard pulling from Salesforce and Stripe on the fly. Whatever it is, Tasklet does it. It connects to over 3,000 apps, any API or MCP server, and can even spin up its own computer in the cloud for anything that doesn't have an API. Set up triggers, and it runs autonomously, watching your inbox, monitoring feeds, firing on a schedule, all 24/7, even while you sleep. Want to see it in action? We set something up just for Cognitive Revolution listeners. Click the link in the show notes and Tasklet will build you a personalized RSS monitor for this

show. It will first ask about your interests and then notify you when relevant episodes drop, however you prefer: email, text, you choose. It takes just 2 minutes and then it runs in the background. Of course, that's just a small taste of what an always-on AI agent can do, but I think that once you try it, you'll start imagining a lot more. Listen to my full interview with Tasklet founder and CEO Andrew Lee. Try Tasklet for free at tasklet.ai and use code cog for 50% off your first month. The activation link is in the show notes, so give it a try at tasklet.ai. Support for the show comes from VCX, the public ticker for private tech. For generations, American companies have moved the world forward through their ingenuity and determination. And for generations, everyday Americans could be a part of that journey through perhaps the greatest innovation of all: the US stock market. It didn't matter whether you were a factory worker in Detroit or a farmer in Omaha; anyone could own a piece of the great American companies.

But now that's changed. Today, our most innovative companies are staying private rather than going public. The result is that everyday Americans are excluded from investing and getting left further behind, while a select few reap all of the benefits. Until now. Introducing VCX, the public ticker for private tech. VCX by Fundrise gives everyone the opportunity to invest in the next generation of innovation, including the companies leading the AI revolution, space exploration, defense tech, and more. Visit getvcx.com for more info. That's getvcx.com. Carefully consider the investment material before investing, including objectives, risks, charges, and expenses. This and other information can be found in the fund's prospectus at getvcx.com. This is a paid sponsorship. >> Okay. So, just to, like, go back a little bit: it sounds like a lot of your confidence that we'll have transformative AI very soon hinges, and I'm not sure to what extent it hinges, on, like, your confidence that we'll find

some other paradigm, if a new paradigm is required on top of pre-training and RL. You also sound very bullish on RL. But maybe to steelman the other view, which is like: okay, if RL doesn't work, then we actually might not find another paradigm, because RL is a very general machine learning principle, right? It's just that you reward the model for doing the correct thing on a task that you can define a reward signal for, which is, like, kind of a maximally general machine learning principle. And it's also something that has existed for a long time. It wasn't something that was just recently discovered after we got LLMs; it was one of the foundational paradigms in machine learning even prior to LLMs. And so I guess the skepticism is, like, maybe it's actually not that easy to find some other paradigm. Like, maybe pre-training LLMs on text data was this novel thing, and then we just applied this thing that had been in machine learning for a while, namely RL, but then for the next leap there's no precedent; we don't see any, like, big promising thing. What do you think about that skepticism? >> I think it's probably not going to play

out that way, and if it did, I still don't really think it matters that much. It probably shifts the timeline a little bit. But the RL will continue to scale. The amount of compute coming online, again, that's exponential, right? So there's just a tremendous amount of additional resources to be thrown into this. One thing we're doing a bit, but not a ton, is just, like, using the signal from the world. You know, Elon's got a plan with xAI, and I think he's got a real advantage here, to just have the AI solve the same super hard problems that the engineers at Tesla and SpaceX and Neuralink are solving. And so he's just going to give them a computer and say, "Here's all the professional software. Here's the problem. You've got to solve it." Right? There's, like, a never-ending supply of those problems. And I wouldn't be surprised to see Grok get to, like, Tesla, SpaceX, Neuralink engineer level just based on the fact that those problems are there to be solved and you

know they've got the compute to keep trying. And what's really the fundamental barrier there? I don't really see it. I honestly think that the new paradigm is probably more about usability than it is about capability. By which I mean, like, people have a lot of complaints when they use models, because they're like, "Oh, it didn't really do what I wanted it to do," or, you know, "it didn't really understand me," or what have you. And clearly one thing that they're not great at doing is going into a new environment, kind of scoping the situation out, getting the vibe, you know, picking up that subtle feedback that people give each other, and gradually kind of figuring it out and, like, becoming a useful contributor. That's, like, how a lot of people go from day one to, like, effective employee in their jobs. And AIs don't really do that in the same way, obviously, right? They don't have continual learning. They are able to manage their own memory somewhat, but they're not that awesome at that. That's another one of these, like, cognitive skills that does generalize, though. Like, once you're good at managing your memory, you'll be able to apply that in new domains.

>> But they're not that great at that yet. And I think that prevents a lot of people from getting value more often. I think it's not that the model can't fundamentally do the thing, but that it doesn't have the context; the person trying to get it to do the thing doesn't know how to assemble the context, or doesn't believe that it could do the thing, such that they're willing to invest the time and energy to, yeah, give it the context. And so that unlock might just be, like, now you don't have to do that anymore; it will kind of figure it out on its own. And this is maybe, in a way, similar to instruction following, right? Like, if you were really good at prompting GPT-3, you could get it to do a lot of things, but it was a weird art to prompt GPT-3. And to a much lesser degree, but still somewhat, you know, getting value from the current models is still kind of a weird art. You kind of have to have an intuition for them. You kind of need to know how to assemble context and give effective instructions, and people aren't that great at that. So that next version might not even be so much about allowing

them to do qualitatively new things, but just making the barrier to use much lower, so that people can just be like, "Hey, AI coworker, welcome to Slack," you know, and then over a week or whatever, they just kind of ramp up and gradually get it. >> Yeah, interesting. I think the pushback there is that the verifiability problem becomes a little more apparent in these specific long-horizon situations. I think when we talk about verifiability right now, we clearly see with model capabilities, like you said with Claude Code, that they work great for math, they work great for code, and they work great in these settings where you can clearly check the answers and whether or not they're right or wrong, right? That's how you kind of create this reward signal. But when we talk about these domains where there's no clear verifier, for example, when it comes to writing a good essay, what does that mean? Or giving good therapy in healthcare settings? And then we emphasize that even further by placing it in a longer-horizon context, such as the one that you're talking

about, where instead of having a human in the loop at each step, prompting it in a specific direction, you're giving it a broader task and having it go in a direction for a longer time horizon. It gets even harder to solve this verifiability problem, right? Like, how do you make sure that you get high-quality signals? How do you tune these systems in a way where they can do, say, autonomous work for days or weeks at a time? And so I wonder if you have any thoughts on how we go in that direction and solve this seemingly very large verifiability problem that hangs over our heads when we talk about areas that have long time horizons. Well, again, I'll take one beat to say I think the recent trajectory shows that this problem is being solved. I'm sure you are very familiar with the METR graph, which is everywhere these days. It's basically going vertical at this point. It's to the point where the METR people are like, we are really struggling to have

tasks long enough to even be able to evaluate these things on. And you know, it's also kind of worth noting that that's a bit of a challenge with humans too, right? Like, we hire people on much less than a week's worth of work. And it's not always super easy for people to agree, like, did somebody do a good job on that month-long project or not? You can usually tell if they totally crushed it or totally sucked, but there's often a lot of disagreement in organizations, like, well, maybe it was actually harder than we thought, or the ingredients for success weren't really there. I mean, there's often a lot of fuzziness in this stuff even in human contexts. Now, there are a bunch of techniques that I think are being used. One is rubric rewards. So, Elon likes to talk about things like, does the rocket fly? The ultimate ground truth is, if I can send this thing into space and I can land it back down on a pad, standing up on its tail again, then clearly that worked. And yet that's an expensive experiment to

run. You couldn't just launch a million rockets and have most of them crash to find the ones that worked. So there is a challenge there in terms of sparse reward and the cost of the experiments. I think what is happening a lot is that people are defining rubrics of things that they want to make sure the AI does well, and they're probably working with AIs to develop those rubrics. Again, in the OpenAI health context, they created a benchmark called HealthBench where there are 49,000 evaluation criteria. So, all these different tasks, you know, puzzles that the AIs have to figure out. And it's not just, did you get it right or did you not get it right; it's a painstaking effort to really flesh out all the different things that would matter, that would make for a complete, awesome, best-in-class answer. And then the AIs are not scored just zero or one; they're scored on some sort of scale. You may have got zero out of the 25 things you could have got on this question, or five, 10, 15, but that gives you enough of a signal that you can kind of climb that hill.
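To make that graded rubric-reward idea concrete, here's a minimal sketch with an invented rubric and point values (the criteria and the grading interface are hypothetical, not HealthBench's actual format); the key point is that partial credit yields a dense, hill-climbable signal instead of a binary one:

```python
# Hedged sketch: HealthBench-style rubric scoring with made-up criteria.
# Each rubric item carries a point value; a grader (in practice, often an
# LLM judge) marks which items an answer satisfied. The reward is the
# fraction of available points earned, so "5 of 9" beats "2 of 9" even
# though neither answer is fully "right".

def rubric_reward(satisfied: set[str], rubric: dict[str, int]) -> float:
    """Return the fraction of rubric points earned (0.0 to 1.0)."""
    earned = sum(pts for item, pts in rubric.items() if item in satisfied)
    total = sum(rubric.values())
    return earned / total

# Hypothetical rubric for one medical question (illustrative only).
rubric = {
    "mentions_red_flag_symptoms": 3,
    "recommends_seeing_clinician": 2,
    "avoids_unsupported_dosage_advice": 3,
    "cites_relevant_guideline": 1,
}

score = rubric_reward({"mentions_red_flag_symptoms",
                       "recommends_seeing_clinician"}, rubric)
print(score)  # 5 of 9 points earned
```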

And it really does seem to be working. I think that is going to work pretty well in domains where there is a professional consensus, because I think that's how people evaluate each other too: multiple-choice tests. But there are also things where you've got to kind of show your skill. In a medical context, you're a student, you go through these in-person training processes, they say watch one, do one, teach one. So as a medical student you watch, and then later you start to do, and then you get a lot of feedback, and eventually, next thing you know, you're the one teaching. I think the AIs will pick up those signals. And then there are other things where it's taste, and I think that'll be a little bit different, probably. But if I had to guess: what is a good novel? Well, first of all, there is no consensus on that, right? You can find somebody who hates even the most universally critically acclaimed novels, and you can find somebody who loves something that everybody else thinks is trash. I think what we'll see there is kind of taste-

based communities coming together to shape models for their own tastes. So, in other words, you might start with a base model, and you might want it to write romance novels, or anime, or hard sci-fi. And what it means to be good in those different genres is quite different. But what you do have is fans of those genres who can engage with outputs, give their scores, and shape models based on what they like.
>> Okay.
>> And I think that can work.
>> Yeah. So I just want to go back to the case of tasks where there is a consensus, right? I see what you're saying, that there are ways to build these minor consensuses within, like, fiction communities or something like that. But even the question of, does the rocket fly, I think there's an argument to be made that this is categorically different from a chatbot giving out useful medical advice, right? When a chatbot's just giving out useful medical advice, it's not

very long-horizon; it's not a long-context task. It's easily transcribed into text, right? It's sort of what LLMs are clearly going to be good at; they're native to text. But something like, does the rocket fly, it seems like there's an argument to be made that, even if there obviously is a consensus on what it means to make a rocket fly, the signals on how to perform that task are very noisy, and you need a ton of them, right? Like, let's say you're a project manager for the rocket flight team. You need to understand what's feasible within engineering, which, okay, maybe LLMs are getting better at that, but then you also need to know how to manage humans, or maybe AI agents in this case, how to allocate tasks, what's going to be feasible with the amount of money you have, how you can raise money from investors, and stuff like that. And so the longer the time horizon and the more agentic the task, the more little itty-bitty signals are required

on your day-to-day work to be able to complete the task. So, going back to what you said about this sort of flywheel thing: okay, if we work really hard and we get a lot of medical experts to provide this high-quality data, then we get LLMs that are pretty good at giving medical advice, and then all of a sudden they're actually better than human doctors at verifying the output of medical advice in the first place. Then we get this flywheel effect, and we have a quick and robust verifier. But maybe the argument is that we just never get over this flywheel hump for long-term, highly agentic tasks, precisely because it's not even a question of whether there's consensus on what constitutes a good outcome. It's, can we define enough reward signals and do enough rollouts to get to that point? One thing I find helpful sometimes is just to try to reflect on what I do. What is my experience like? You really try to interrogate it for, is there something really magical going on or not? And I think I

have to say really long-horizon agency is pretty rare among humans, right? I mean, you do have these people, like an Elon, who keeps going, you know, my goal is to make humanity a multiplanetary species, and if it takes 30 years, so be it. So I would say skepticism around whether we'll get AIs to be multi-decade-horizon planners, who can show up with the fierceness and determination required on a day-in and day-out basis to really make that happen, remains pretty valid skepticism in my opinion. At the same time, you look at what most people do and what drives the economy, and it's much shorter-term than that. A lot of it is, I kind of keep showing up one day to the next, and every day I kind of start fresh. I went to sleep and I kind of shut down. There's sort of a discontinuity of consciousness between work days, which,

you know, I'm not a big analogy guy, but if you squint at it, you can sort of make an analogy to different context windows. There's kind of a somewhat similar resetting type of moment. And what do I have to do? I have to kind of reboot myself and be like, what was I doing yesterday? Where did I leave off? What did I accomplish? What was still to be accomplished? And most of that stuff I could write down. If my integrated memory was worse, I could compensate for that pretty well, at least on a week-long or maybe a month-long or maybe a quarter-long basis, if I got really good, at the end of each day, at saying: here's everything I did today, here's what went well, here's what failed, here's what's next, here's the feedback I got, here's what I think I should do in the next session; and then wiped all that out and came back and looked at that scratch pad again to start my day tomorrow. I think over time I could get pretty effective at that. And I think this connects back to the point that certain kinds of skills do generalize, and I do think this is one of them, right?
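That end-of-day scratchpad loop could be sketched mechanically like this; the field names and file path are invented for illustration, not taken from any real agent framework:

```python
# Hedged sketch: persisting a session summary so a fresh context window
# ("tomorrow's self") can pick up where the last one left off. The only
# state that survives the reset is what gets written to the scratchpad.
import json
from pathlib import Path

NOTES = Path("scratchpad.json")

def end_of_session(done: list[str], failed: list[str],
                   next_up: list[str]) -> None:
    """Write today's summary before the context is wiped."""
    NOTES.write_text(json.dumps(
        {"done": done, "failed": failed, "next_up": next_up}, indent=2))

def start_of_session() -> dict:
    """Reboot: reload the scratchpad, or start fresh if none exists."""
    if NOTES.exists():
        return json.loads(NOTES.read_text())
    return {"done": [], "failed": [], "next_up": []}

end_of_session(done=["drafted report"], failed=["flaky test"],
               next_up=["fix test", "send report"])
print(start_of_session()["next_up"])  # ['fix test', 'send report']
```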

Managing your own memory, writing notes for yourself to kind of document what happened and what's supposed to happen next, giving yourself a sense of where you are in the overall story that is unfolding. They're not great at that yet, but they're getting decent. And, you know, when my Claude Code crashes these days, a lot of times I just kill that tab, resume in the next one, and then I'll just be like, resume, you know, or, sorry, we got cut off, my internet went out, please pick up where you left off. And they're able to do it. Again, they're not awesome yet, but they're way better than they were even three months ago. So it is hard for me to see how that doesn't extend, especially because it is a relatively new thing, right? And most of these things that pop up have at least a few generations before they kind of level off. It's hard for me to see how we don't see, again, a pretty steep... you know, the METR curve is currently vertical, it can hardly get steeper than it is, but I think we've got at least a few generations of very steep progress

there. And then maybe there's some other kind of thing where it's like, okay, sure, you can do a quarter's worth of work when somebody gives you a project, but can you figure out what the next big civilizational advance could be, like some of the true human visionaries? Maybe not. Maybe that's a different kind of thing. In a way, I kind of hope that we stop there. Because I talk up a lot what I think AI is capable of, and will be capable of, it's easy to mistake me for a booster. I'm actually kind of afraid of that. I think it would be great in some ways if we did find certain fundamental barriers, where it's like, hey, I can delegate a month's or a quarter's worth of work to this thing and it might be able to do it in a day for a couple hundred dollars. Like, wow, amazing. If we could kind of park it there, [laughter] you know, and not have it go to the point where it's doing multi-decade-level planning, that might be a really good thing. That might be sort of the sweet spot, where we get a lot of the advances that we want, and the better quality of life, and all the abundance that people dream about, and we don't have so much risk of losing

control. So, I don't think every advance, by the way, is a good thing, but I just don't see fundamental barriers on the horizon, at least. Okay, great. Let's talk about LLMs more generally and how we feel about that architecture overall for reaching this transformative or powerful AI. What are your thoughts on the general idea that LLMs fundamentally cannot be the core of this journey toward transformative AI? I think Yann LeCun is the biggest name behind this hypothesis: that next-token prediction is the wrong objective entirely, and we need something completely different, something like world models or energy-based models that have a real understanding of the input that's coming in and the output they create. I think this is important because there are a lot of intuitive thought processes behind it, in which, you know, learning statistical correlations over language isn't necessarily how the world works, and that

you need something different to really achieve that understanding, to create truly powerful AI. Do you think that's the case, or do you think LLMs do take us to this next frontier that we talk about? Yeah, I think there are a couple of different levels at which I would want to address that. One is, I do agree with the LeCun thesis in the sense that I feel like we are running, right now, a depth-first search in AI space, where we are all jamming as hard as we can on a particular architecture and scaling it as much as we can, and people are now, of course, even building chips that literally embody the architecture of the model in the chip itself. I don't really like that. I kind of wish that we were doing a little bit more of a breadth-first search, where we would explore different kinds of architectures and find their relative strengths and weaknesses. Because we're not one thing, right? We have a lot of modules in our brains. And so it's just fundamentally weird on some

level, and you would expect it to be kind of brittle in some sense, to take one relatively simple thing and just stack layers of that. [clears throat] The solution that nature found in humans is certainly a lot more complicated, and it feels like if you want something to be robust in various ways, you would probably want to have different modules. So, I do kind of agree that it would be nice if we were doing a little bit more breadth of exploration, rather than just trying to jam this one thing as hard as we can until we can all retire, or whatever exactly the dream is supposed to be. At the same time, here's where I would disagree with the LeCun school pretty strongly, and I honestly think this is kind of a closed question at this point, although he still disputes it, and you can find people who will. For one thing, the AIs are not trained anymore on next-token prediction in the way that they were, right? RL is not next-token prediction in a fundamental sense. The task that the model is given

is not, here's a bunch of text, can you predict what comes next? The signal it's getting now is, did you get the right answer? And the right answer could be a fully verifiable mathematical proof, or a numerical answer to a question, or it could be one of these things with, you know, 49,000 evaluation criteria on a huge medical corpus. But it's not anymore that it was supposed to be this token and you gave it this token. It's now, qualitatively or quantitatively as the case may be, did you get the right answer? And then that signal is translated into a gradient update. There are obviously a lot of different mechanisms, but GRPO, group relative policy optimization, is basically comparing, for a given model, a bunch of attempts that it made, some right, some wrong, and using that to create a direction in weight space, and moving in the direction that would make the answers that were right be more likely next time.
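The group-relative comparison at the heart of GRPO can be sketched in a few lines; this is only the score bookkeeping (a real trainer would then scale the policy's log-probability gradients by these advantages), and nothing here is from an actual training codebase:

```python
# Hedged sketch: group-relative advantages as in GRPO.
# For one prompt, the model samples a group of attempts; each gets a
# scalar reward (e.g., a rubric score). Each attempt's advantage is its
# reward relative to the group mean, normalized by the group's standard
# deviation, so above-average attempts are pushed up and below-average
# ones pushed down.
import statistics

def group_relative_advantages(rewards: list[float]) -> list[float]:
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # avoid divide-by-zero
    return [(r - mean) / std for r in rewards]

# Four sampled attempts at one question: two right, two wrong.
advs = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
print(advs)  # [1.0, -1.0, 1.0, -1.0]
```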

So I think it's something people should all update on at this point: we're not just doing next-token prediction anymore. That's still part of the process, but it's not the whole story. And then the other thing is, I think it's also very clear at this point that the AIs do have world models. We can look at the internals, obviously not anywhere near as much as we would like, to understand what's going on inside them, but we do have enough of an understanding at this point to create things like the Golden Gate Claude experiment, which I'm sure you guys have seen, where they train what's known as a sparse autoencoder. A huge problem in terms of figuring out what's going on inside a neural network is that it's very dense, right? The width of a model, depending on how big it is, usually goes in powers of two, so it might be a 4,000- or 8,000- or 16,000-wide vector of activations that sits between the layers. After each layer there's this sort of bottleneck: okay, we've got, let's say, 16,000 numbers that are each some precision of

floating-point number, whatever, and that represents the state of play. Well, there are obviously way more than 16,000 concepts, right? So that means we can't just have one dimension for each concept. Instead, we've got what is known as superposition, which means, like, if just dimension one is lit up, that might mean something; if dimensions one and two are lit up, that means something else; if one and three are lit up, that means something else; one and four; one, two, and three; one, two, and four; out to the vast space of combinatorial possibility. So the sparse autoencoder basically tries to untangle all that stuff and say, can we get to a representation where only a sparse number of features are active at once? And these sparse autoencoders are computationally expensive in their own right, and they're, I think, tens of millions of activations wide, but they sort of branch this very dense, superimposed concept mess out into this sparse space. And believe it or not, it works.
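A heavily simplified sketch of the shape of that computation, with toy dimensions, random weights standing in for trained ones, and a top-k rule standing in for a learned sparsity penalty (none of this reflects Anthropic's actual setup):

```python
# Hedged sketch: a sparse autoencoder untangling dense activations.
# A dense d-dim activation vector is encoded into a much wider feature
# vector, only a few features are allowed to stay active (the "sparse"
# part), and the decoder reconstructs the original activations from
# those few features.
import numpy as np

rng = np.random.default_rng(0)
d, n_features, k = 16, 256, 4  # toy sizes; real SAEs are millions wide

W_enc = rng.standard_normal((d, n_features))
W_dec = rng.standard_normal((n_features, d))

def sae(activations: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return (sparse feature vector, reconstruction of the input)."""
    pre = np.maximum(activations @ W_enc, 0.0)  # ReLU encoder
    mask = np.zeros_like(pre)
    mask[np.argsort(pre)[-k:]] = 1.0            # keep only the top-k features
    feats = pre * mask
    return feats, feats @ W_dec                 # decode back to d dims

feats, recon = sae(rng.standard_normal(d))
print(int((feats != 0).sum()))  # at most k = 4 active features
```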

You can now look and say, okay, here are the 10 concepts that are most active in this network at this time. And, you know, okay, maybe you're kind of fooling yourself, but the proof is in the pudding when they are able to then say, now that I know what pattern of activation corresponds to this concept, I can intervene on it. So with Golden Gate Claude, out of the tens of millions of concepts that came out of this process, they found the Golden Gate Bridge concept, artificially turned it up, and now you've got a model that just wants to talk about the Golden Gate Bridge. And there's more that's happened since then in interpretability as well. I wouldn't say that the AIs have perfect world models, but they definitely have some world model. And again, I wouldn't say people have perfect world models either, right? I mean, we made it into the 1900s without any sense of relativity, because the world model that we had was good enough for us to get by in the domain we were working in. And it's going

to be an interesting question whether AI can start to make those conceptual leaps, like a pre-relativity-to-relativity sort of jump. Again, exactly what timeline that comes on, I'm not so sure. But there is a world model inside the AIs; there is a conceptual understanding that is definitely richer than pure stochastic correlation of tokens. And that has been demonstrated, I think, at this point quite conclusively with the interpretability techniques that are out there.
>> So I guess, where does that all leave us? Well, yeah, I think that's a pretty comprehensive response to the LeCun objection. Just to translate for our slightly more general audience, although it was very interesting and we are glad that you went into technical depth there: basically, what Nathan is saying is that, first of all, this objection that next-token prediction can't be the core of intelligence doesn't necessarily apply, because it's very important to remember that we are now doing reinforcement learning. We do next-token prediction with LLMs to

get this sort of basis of intelligence, but then we do reinforcement learning on real-world tasks, which intuitively seems more like the right training objective to get to AGI, or something like that. And then separately, Nathan is also saying, with this thing with sparse autoencoders and the Golden Gate Claude example, that we have techniques in AI now that allow us to look inside the model when it's responding to a query and see which concepts, in some sense, are activated as it's responding. And we can see that there are concepts within the neural network that correspond to objects in the real world, right? So even though it's just an LLM, in addition to reinforcement learning, it does have this world model, where we can say, oh look, that's the Golden Gate Bridge concept, and it's lighting up when we ask it a query about the Golden Gate. Is that
>> accurate? Yeah, that's great.
>> Yeah,
>> phenomenal job. And one other thing I would add is that

the explorations that have been done of the embedding space, or the latent space, are also really quite interesting and revealing. There was one study, and this was a couple of years ago now, and these things have gotten much more sophisticated since, that showed you can do vector operations around the latent space. So, for example, if you take the embedding for man, and then you move to king, and you look at that direction, and then you apply that same direction to the embedding for woman, you get queen. So there's a sort of conceptual coherence in the way concepts are represented spatially in this super-high-dimensional latent space that is clearly meaningful. Exactly how it's meaningful, or exactly what it's learned, or what mistakes it may contain, or what aspects of a true grand unified theory of everything it doesn't have, those are all open questions.
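The classic king − man + woman ≈ queen arithmetic can be sketched with hand-built toy vectors; the dimensions here are chosen as (royalty, gender) so the analogy holds by construction, whereas trained embeddings like word2vec learn such directions from data:

```python
# Hedged sketch: analogy via vector arithmetic in a toy embedding space.
import math

emb = {
    "man":   [0.0, 1.0],
    "woman": [0.0, -1.0],
    "king":  [1.0, 1.0],
    "queen": [1.0, -1.0],
}

def nearest(vec, skip=()):
    """Word whose embedding has the highest cosine similarity to vec."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))
    return max((w for w in emb if w not in skip),
               key=lambda w: cos(vec, emb[w]))

# king - man + woman, component-wise
vec = [k - m + w for k, m, w in zip(emb["king"], emb["man"], emb["woman"])]
print(nearest(vec, skip={"king", "man", "woman"}))  # queen
```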

But I think that sort of thing quite demolishes the idea that it's all just noise, or that there's some sleight of hand. I think that organization of its own internal map of the world reflects some real understanding going on. Maybe not humanlike understanding; I always say human-level but not humanlike. They could be quite alien, you know, but that doesn't mean they don't understand. They don't have to understand in the same way that we understand in order to meaningfully understand. So I think that is, again, pretty well resolved at this point. I honestly don't know why some people can't update on that dimension. It's quite strange. Okay, moving on from the capabilities and training discussion, let's talk about hardware and energy, these other inputs to AI model progress. So, it sounds like you're pretty confident that, either with RL or with some additional paradigm, we'll get to very transformatively powerful AI. But I assume that part of that thesis involves us continuing to scale inputs such as talent, such

as hardware, capital, and energy, mainly energy and hardware. A pushback people often raise is that there's just not that much more energy to divert to AI model training and inference, right? I can't remember the numbers, but there's some graph showing that total US energy production is not increasing very fast at all, while the amount AI consumes is increasing very fast, and at some point it's going to catch up and we're going to be bottlenecked by energy. With chips, it takes a really long time to build new chip fabrication facilities, and we'll just run out of chips to train AI models on. And even capital: maybe we'll just run out of money in the world to invest in building new data centers. So, I guess the question I want to ask is: which of these bottlenecks, if any, do you see as most plausibly being a major bottleneck to continued progress? Yeah, I guess the first thing I would say is I don't think any of those bottlenecks are really fundamental. They are more like

cultural or sociopolitical or whatever, because there's a lot of energy coming from the sun all the time, and the question is how much we are actually going to harvest and harness. And that is where it becomes a political debate around who's going to be allowed to build what, where, on what timeline, with what permits, with what impact, whatever. And I think the degree to which AI is energy-intensive is often overstated. I actually did a whole episode on this with a guy named Andy Masley, who has really been fighting this fight online in a pretty dogged way. There are a lot of interesting comparisons, but a frontier chip today, like an H100 or whatever, basically uses the same amount of energy when it's on as a microwave or an electric teapot. And one query is maybe on the order of running your microwave for one second. So you'd have to be making a whole lot of queries, and increasingly people are, for it to be moving the

needle on energy consumption. You know, people don't think twice about putting something in the microwave for two minutes, and that's probably more energy than most people are using with AI on a weekly basis. Now, it is ramping up, and it is starting to add up, and it is going to get to the point where, if we can't add any capacity, then we're going to have a bottleneck for sure. But again, those bottlenecks aren't super fundamental. China doesn't seem to have them, right? They're adding electricity to their system so fast that, and I won't say an exact number, but in some relatively short period of time the Chinese economy is adding as much electricity as the entire current American capacity. Now, they have something like four times as many people as well, so there's a long way to go to build out everything that they might want, but it just shows that it can be done. It also can be done, apparently, in the Gulf. I recently talked to Sam Hammond, who's an economist and a very AGI-pilled thinker in Washington, DC. He had just been to

the UAE on a trip, and I was like, why are we doing these deals with the UAE? All I hear is that we want to have AI reflect American values and take American values around the world, and I'm not sure that the governments of Saudi Arabia and the United Arab Emirates are the greatest partners we could have in projecting American values. So why are we doing these deals? And it seems like, honestly, a big part of the answer is that they don't have issues with putting up a new power plant. They can just do it, and it'll happen fast. And that way we kind of know that even if we can't do it here, we can at least do it there. So that's all just characterizing the bottleneck and saying there's plenty of energy; it's a question of who will be allowed to get it, under what circumstances, on what timelines, with what permits, etc. Chips are harder, for sure, because it's a very specialized thing. If you ask me what would be the most likely reason we wouldn't get economy-transforming AI in the next few years, I

would say something happening to the chip fabs in a major way that throws production off, to the point where chips are super scarce, and maybe we can't scale the training runs, full stop. Or, even if the training runs can kind of still scale, there's just not enough inference to go around, and so we might have really powerful systems, but we just don't have enough access, economy-wide, for people to deploy them and automate all the things that it seems like we're on track to automating. So, if I had to pick between energy and chips, I would say chips, but that seems like kind of a tail-risk scenario. All the projections, the whole economy, kind of depend on it at this point. So everybody is incentivized, certainly the political class is incentivized, to make it work. The corporate managerial class is pretty incentivized to make it work. There is some tail risk that mainland China makes a move on Taiwan, and that could be a huge disruption, but as far as I can tell, it would be

sort of tail-risk type things that would be the most likely disruption, not a fundamental barrier. You know, there's plenty of sand, which is where the silicon comes from. And that is scaling too: we are starting to get chips made in the US. From what I understand, and I'm not an expert on this, the yields have been decent in the US, maybe even a little bit ahead of schedule. [clears throat] People thought, oh man, it's really going to take a few years and a few generations and a lot of iteration to get this stuff to be somewhat competitive, and it seems like it's come online reasonably well. I'm mostly, and this is kind of a theme in my thinking in general, I'm mostly worried about the tail risks. I'm mostly worried about the AI going wrong in some really weird way. And in terms of what would prevent progress, I also think the tail outcomes are the most likely to put me in the camp of being catastrophically wrong.
>> Yeah. Okay. Awesome. Well, let's talk about that then. Let's talk about this risk of misalignment that is the

primary concern amongst the community right now. You've had hundreds of these conversations with safety researchers, lab people, and builders over the last few years. Net net, what is your analysis of the alignment problem? Has it become harder or easier in the last several years, or the last several months? Well, for starters, reflecting on my hundreds of conversations, I think one of the strangest things in the world today, full stop, is the fact that the disagreement among very plugged-in, very informed, very smart people has not really been reduced much at all, even as we've gained a ton of information over the last couple of years about the trajectory of AI. I think that is super strange, and I'm honestly pretty confused by it. The one thing everybody seems to agree on is that the timeline on which we should expect this has come in. There's still disagreement about that, but, like, Helen Toner, who's the head of the Washington, DC think tank CSET

and was previously on the OpenAI board, put her finger on this phenomenon I think a lot of people were feeling, with a blog post that said even long AGI timelines have gotten super short. It's basically like: in today's world, if you say you don't think you're going to see AGI until 2035, you're an AI bear. But only five years ago that was considered quite aggressive, and most people were more like, I don't know, 2050, maybe not in my lifetime. So there's been this massive compression of timelines, and there's obviously been this huge jump in capability, and yet on these fundamental questions of what's going to happen, there's still total disagreement. So I think that's a very weird phenomenon that I can't fully explain.
>> Yeah. Where do you think... Yeah.
>> Where do you think this disagreement is coming from? Is it, like, cope, or some sort of pushback that you're seeing from these very intelligent people who are still part of the field? Or do you think there are legitimate thought processes behind why there's

so much disagreement around misalignment? >> I'm not sure I can summarize the conclusions with super high fidelity right now, but CSET, under Helen's leadership, did a workshop where they tried to bring people together and assess, specifically on the question of recursive self-improvement: how big of a deal is it going to be? Do we run the risk of this whole process getting away from us entirely? Even on that somewhat reduced question, there was still very wide disagreement. Some people say, I don't think it's going to be that big of a deal; it'll make people a little more efficient, but it's not going to bring about some phase change. And other people say, as soon as you get an automated ML researcher, you go from maybe 10,000 people today really working at the frontier of ML research globally to 10 million. Certain people think that's got to have a huge effect, and it's going to be really hard to control a situation if we all of a sudden thousand-x the number of researchers

and they work at potentially thousands of tokens a second. So why do people disagree so much? They came to the conclusion that people were working from different conceptual paradigms, and that these paradigms are pretty good at taking new information into account and explaining it away. You have some theories, like bottleneck theory or O-ring theory, where you're only as good as your weakest link. As long as you think that, you can say: sure, the AI can do this, but it still can't do this other thing, so there's still a weak link, there are still going to be bottlenecks, and the whole thing isn't going to get too crazy. The flip side is the jaggedness view, where people say: okay, sure, the AI can't do this yet, but look what it couldn't do one or two years ago. It couldn't do basic math; now it's solving unsolved math problems. So sure, there's still jaggedness, but last time you told me about jaggedness, the example was that it couldn't do basic arithmetic. Now

we've got unsolved math problems. These perspectives seem to be really grounded in worldview priors, or the paradigm that people work in, and it's proving really difficult to get to a real meeting of the minds on those. >> Yeah, I saw Ajeya Cotra talk about a similar thing on the 80,000 Hours podcast a few weeks ago. >> Great episode. >> She said something similar: the economists who expect that we won't enter some new GDP growth regime always point to all these bottlenecks, and to technological diffusion always being slower than people think. And on the other side, people think those bottleneck arguments are simply outdated, and that the economists aren't really taking seriously what it means for AI capabilities to be at the level we're conditioning on. So I'm wondering, do you think it's possible that people are talking past each other, that they're not actually talking about the same level of AI capabilities when they say, oh, it will only uplift ML research somewhat? Maybe they're just

talking about not-that-powerful AI. But if they were actually talking about fully automated ML researchers that are as good as humans, never have to sleep, and run faster, then they would see RSI as more plausible. Do you think that's possible? >> I think that explains some of the disconnect, for sure. I've started, at least in some cases, beginning my interviews with: how AGI-pilled are you, and what do you expect to see over the next couple of years? Because if that's not established, and I think they're thinking one thing when they're not, there can be quite an odd disconnect downstream of that. So getting those assumptions on the table early and comparing them is often a productive thing to do. I don't know that that explains all of it, though. You do still hear: sure, even if they could come up with good ML research experiments and so on, there's still going to be this other bottleneck that's going to

get in the way. You know, there's always another one, it seems at times. And I'm not in this camp, and I don't want to be too blindly dismissive of it or unfairly critical, but it does seem to me that there's an aspect of faith in the idea that there's always another bottleneck. I would contrast my position with some of those positions in that a big part of what motivates me is that I don't think what I'm saying has to be guaranteed right in order for us to be really motivated by the possibility. I'm happy to leave open the possibility that maybe the bottleneck people are right, and there's always going to be another bottleneck, and all this will stay under control. Or maybe there's some plateau around, you know, a quarter's worth of work, and we just can't quite break past that, some weird phenomenon we can't explain, and we can't ever quite get there. And again, like I said earlier, I think that might be great news if such a thing were to be true,

but I'm happy to leave that question open for the time being and just say: I don't know. We don't have a great account of what it would be, if there were some fundamental bottleneck that we'll never get over. And in the absence of one, I'm not persuaded by people just generally gesturing that there will always be another bottleneck, because if they're wrong, we're in for a really wild time. In terms of what we should do about where we are and what might be coming, I think the worst mistake we could make would be to not take it seriously enough, and to content ourselves with a story that it'll all self-regulate and we'll be fine, because I don't see a great argument that that's true, and I see at least decent arguments that it might not be. And again, I look to our own history: we've driven a lot of other species to extinction, including our closest cousins. And a lot of that was by accident, right? It was just small bands of people going out and doing what they had to do to survive. And surviving meant hunting

large animals, eating them, and using their bones for tools. A lot of those animals went extinct, and it wasn't a coordinated master plan; it happened by accident a lot of the time. So I'm just like, oh my god, we don't have any real guarantees as far as I can tell. "Plot armor" would be an unfair dismissal of the more sophisticated people who have theories about there always being another bottleneck, or what have you. But I do worry that a lot of what goes on among less sophisticated people, who don't want to have to deal with this and would rather believe that everything will be fine, is some sort of plot-armor thinking: well, I feel kind of like a main character, humans feel kind of like the main character, and you can't take the main character out of the story, right? And I unfortunately just don't think that's likely to be the case. >> Yeah. So you find the bottlenecks argument, in terms of all the discussions we've had, maybe

with research, maybe with capital spend, maybe with energy or chips, or just general societal processes that slow down technological diffusion overall, unconvincing. But it seems like you're still relatively in the more pessimistic camp. Is that correct? >> I don't know. When people ask my p(doom), I usually say 10 to 90%. A good friend of mine once told me we should argue less about exactly what the numbers are and more about what we can shift them to, so I try not to worry too much about getting into that. "One significant digit is all you get on p(doom)" is one funny way I've heard it put. I would say I've actually gotten a little more optimistic over the last few years, in the sense that I started reading Eliezer Yudkowsky in 2007, with the early visions of the paperclip maximizer and all that. I can't speak for Eliezer, and I think he has some somewhat revisionist takes on what he really meant at that time, and

sometimes when I see what he's saying now that he really meant, I'm like, I don't know, that's not exactly what I took away back then when I was reading the original work. But whatever, all that is kind of a side discourse. What I understood, and what I think a lot of people feared, was a very small system with some extremely concentrated form of intelligence, one that had found the right priors, the right inductive biases, to be hyperrational and insanely effective, such that given anything to optimize, it could optimize it to some extreme state and tile the universe, or whatever. And there was also the idea that it was going to be hard to get such a system to understand human values: that what we value was very gradually and haphazardly encoded into us by an evolutionary process over a super long time, dating back even to before our species. Other species care about their young and seem to be sad when they lose

their children. So this is not even just human; the whole of evolution has led us to be what we are and to have the very complicated value set that we have. And it was, I think, generally understood, though there are some different takes on that history now, that it would be hard to get AIs to have a real understanding of what we care about. Now I look at the models we do have and, well, they actually do have a pretty good understanding of what we care about. It's not perfect, but I don't think it's crazy to say that Claude is probably more ethical than the average person. It's certainly more sophisticated in its approach to ethics than the average person, and it does generally seem to want to be good in a meaningful way. When they let Claude talk to other Claudes and just do whatever it wants, it kind of seems to want to bliss out, or something like that. So there is some sort of grokking of values that I wouldn't have expected to be as easy as it seems

to be, and there is some sort of internalization, or identity formation maybe is a better way to say it, that at least some of the models seem to have. That suggests to me there's reason to hope we could really get there. When I first heard the question, could we create an AI that loves humanity, I thought it was laughably out of reach. I think I heard it from Scott Aaronson, who said Ilya asked him, hey, do you have any idea what the Hamiltonian of love is, or some crazy question like that, and he was like, I can't really help you with that. I remember just thinking, oh my god, that's the kind of question they're asking; we are screwed. And now I'm like, well, maybe there was a little bit more to it. I trust Claude, not fully, but more than a human assistant. If you gave me the choice between hiring a human assistant, even with the opportunity to interview and do some vetting, versus Claude, which one would I give access to my email? I think I would trust Claude more than a person I'd done a couple of interviews

with, with all my most sensitive information. There are cases where Claude has blackmailed people; there are cases where Claude has done various things under pressure. But I think I have better odds with Claude. So overall I have become a bit more optimistic, but we're definitely not anywhere near out of the woods, right? >> I guess maybe let me try to steelman the more pessimistic view. That would be: okay, sure, LLMs are language-native and we can communicate our moral preferences via language, but that wasn't really what the people who were very concerned about alignment were talking about, even back in 2007. They were talking about AI agents that were goal-directed, and they were anticipating something like reinforcement learning that gets us these more goal-directed agents that are able to reason over longer time horizons: have a goal, go out in the world, and achieve it. So I guess the case for not updating towards optimism with respect to AI safety is that we're still going to

have to deal with this problem where we're doing reinforcement learning, trying to get agents to go out in the world and do things for us, and we still don't know how to define a reward objective that is fully what we want optimized. So we have this paradigm where we got these LLMs, and that was maybe an update towards AI safety being a little easier, but now we're back to reinforcement learning, and these same concerns apply. What do you think about that objection? >> I think that's a pretty good argument, and I certainly don't want to leave people thinking that I don't take it seriously or that they shouldn't. I absolutely think we've got more questions than answers. When I say I've become more optimistic, it was starting from a not-super-optimistic place. Five years ago, or maybe a little more, I would have said powerful AI seems a long way off, and if we do stumble on it, we have very little hope of controlling it. Now I'm like: it seems closer, and we have maybe a little more hope, but definitely still a lot of unanswered questions. One thing that I do

see a little bit differently than the most hawkish people: I recently had an exchange online where there was a 48-hour period with a couple of profiles of Amanda Askell, and people were commenting on her in all sorts of ways, and Elon was attacking her, and whatever. I weighed in and said that for my part, I've become quite a bit more optimistic that it's at least possible to create an AI that in some meaningful sense loves humanity, and I've got to give her and the Anthropic team a lot of credit for that. Then of course I got a lot of replies saying, well, this doesn't scale to superintelligence, and so on: arguments along the lines of the one you just made. Again, I think those are all very serious and worthwhile concerns. But one strand I detected in a lot of those responses is that people seem to be imagining a system that is so much more powerful than anything else that if it goes wrong, it's over. You get

these kinds of ideas: Eliezer had his "List of Lethalities" post, and I think that's shaped a lot of people's thinking, the idea that you have to get this absolutely right on the first try. I don't think that's the shape of the AIs I'm seeing in the world today; that's quite a hypothetical state of affairs. If you were to drop in an AI that's just so much more powerful than anything else, and it's weirdly goal-directed, and it doesn't love humanity, and you tell it to make paperclips, yeah, maybe we all get tiled by paperclips. But the world right now is much more an emerging ecology of AIs, where there are a few frontier ones that are roughly competitive with each other. It's certainly not like one Claude instance is going to take over the world. What we're talking about is much more a wave of things, where simultaneously millions, or one day billions, of Claude instances and GPTs and Geminis and whatever all collectively transform the world. And that has a lot

of problems, challenges, and open questions with it too. But I don't worry as much right now that we're headed for a world where one system runs away from everything else, to the point where if it takes one wrong move, or there's one bad prompt or one jailbreak, all is lost. It seems to me, and I think this is kind of luck, or maybe it's fundamental physics, though not physics we understood coming into this, that scaling laws are in a way protective. You get these algorithmic advances, and they move the needle, they deflate exactly how much compute you need to get to a certain level, but they don't tend to change the fact that you still need a ton of compute to get to the very high levels. I think Zuckerberg actually had one of the more interesting takes on this that I've heard. I don't think he's necessarily distinguished himself as a great AI safety thinker over time, but he basically said: at Meta, we deal with scammers,

and spammers all the time, and the big advantage we have over them is a lot more compute. We have way bigger, way more powerful systems than they do. They're trying to spam and scam here and there, but we're seeing everything, we're monitoring everything. Little things of course happen all the time, but we can broadly keep it under control. You could imagine a somewhat similar dynamic with AIs: one AI, even if it were the single most powerful AI in the world, as long as it's not orders of magnitude more powerful than everything else, faces a whole ton of other actors. They've got all their compute, they've got all their instances, they're all monitoring for whatever they're monitoring for, and hopefully that can balance itself out. Then of course you've got your gradual disempowerment concerns, which is: maybe that all ends up in some equilibrium, and there's no place for humans in it. That's another thing I think is absolutely worth taking seriously. But I

just don't see right now that we're on track for that kind of runaway. You know the line, "if anyone builds it, everyone dies": I think there is an "it" for which that's true, but it doesn't seem like anybody's particularly close to building it. So that's an important part of the analysis from my perspective. >> Okay, great. So when we think about your concrete hopes for achieving a world in which we solve this misalignment issue, can you give us specific achievements that you think are necessary to end up with an aligned AI that doesn't result in doomsday for humanity? Does it involve training a model that loves humans? Does that mean mechanistic interpretability actually scaling? Does that mean alignment by default, because models trained on human values just end up being aligned? What are the concrete subgoals you think have to be achieved for this alignment issue to be solved? >> I'm always interested in something that

could, quote unquote, really work. I ask people for this all the time: do you know of anybody working on anything that could credibly really work, in the sense that I could sleep well at night knowing there's something that really works? Basically, nobody has anything. So in the absence of that, all the frontier companies seem to be taking a sort of defense-in-depth strategy. Arguably this is inherent to the nature of intelligence, which is kind of unpredictable. You could say the same thing is true of humans: there's never been anything that really works to make sure a human never does anything wrong. So maybe it's unrealistic to think we could ever get that. I would still love to see people try, but it seems like where we're headed is an everything-in-parallel-at-once sort of strategy: we might not get the AI to the point where it never does bad stuff, but if we can drive that rate low, then that's better.
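The defense-in-depth logic here compounds multiplicatively: if each imperfect layer independently catches some fraction of bad behavior, the residual risk is the product of the layers' miss rates. A minimal sketch in Python; the catch rates below are hypothetical illustrations, not figures from the conversation:

```python
# Illustrative sketch of the defense-in-depth arithmetic discussed here.
# Each independent layer catches some fraction of bad behavior; what slips
# through is the product of the per-layer miss rates. All rates are made up.

def residual_risk(catch_rates):
    """Probability a bad action slips past every layer,
    assuming the layers fail independently."""
    risk = 1.0
    for rate in catch_rates:
        risk *= (1.0 - rate)
    return risk

# Hypothetical layers: base-model refusal training, output monitoring,
# account-level bad-actor bans, hardened (formally verified) targets.
layers = [0.90, 0.90, 0.50, 0.80]
print(f"residual risk: {residual_risk(layers):.4f}")  # 0.1 * 0.1 * 0.5 * 0.2 = 0.001
```

With these made-up rates, four imperfect layers already yield roughly "a few nines" of reliability, which is the intuition behind defense in depth: no single layer has to really work on its own.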

And then if we can put a monitoring layer on top of that and catch 90-plus percent of the bad things it still does, that's better. And if we can have additional monitoring systems that ban the accounts of bad actors, that's better. We can also really invest in formal methods to improve cybersecurity across the board, so that we can take certain risk surfaces entirely off the table. That's not going to solve all of our issues, but in terms of things that could really close down problems, formal verification of software security is one area where there does seem to be the opportunity to create genuinely secure software. It's on the horizon and seems to be on the verge of having a moment. Then of course we've got the bio risk. We should probably have PPE stockpiles; we should probably have all the things we should have had as of the last pandemic. Something we just bought for our son's hospital room is an ultraviolet light that, and I

think this is quite well validated scientifically, kills microbes and is pretty gentle on the skin, so you can shine it in the room all the time. We should probably invest in scaling out that kind of capacity. We should have better wastewater monitoring so we know when things are popping up, because it's probably still going to be the case that things pop up. We should have vaccine platforms, which we do have, that are very quickly programmable. I'm sure you've heard the story of how quickly the COVID vaccine was designed: in a few days, before the pandemic really even took off in terms of case numbers, the design of the vaccine was already there. It took a long time, of course, to go through all the trials and actually get it to people, but it was just a few days to create the vaccine itself. Then, beyond the monitoring of outputs I mentioned, there are also going to be these mech-interp internal monitoring things. The interpretability company Goodfire just put out an agenda called intentional design, developing ways to try to

understand, at each step of the learning process, what the model is learning, and to be able to shape what it learns, so it hopefully doesn't learn certain problematic things and does learn the good things you want it to learn. Then there are AI control techniques. Redwood Research has, I think, done an incredible job of gaming some of these scenarios out. That's another 80,000 Hours episode I would strongly recommend: Buck Shlegeris talking about how we get productive work out of AIs even assuming they're out to get us. They've actually made pretty good progress in building out a portfolio of strategies. So I think that's where we're headed. We're looking at a world where there are AIs everywhere; hopefully none so much more powerful than everything else that a single instance or small pocket poses some existential risk. And then we bring every other strategy we have to bear, and

that gets us, you know, a few nines of reliability. Probably some crazy bad things still happen, but hopefully they can be contained enough that the world overall is good. And when I tell that story, I'm like: we definitely need to scale up our investment real quick, for one thing, because the amount of money and resources going into making the AIs more powerful dwarfs the amount going into all these other things. I don't think we're very well calibrated or balanced in where we're putting our efforts, but I do think we have a bunch of different agendas that can each take a bite out of the problem. And maybe you take 20 bites out of the problem and you don't have much problem left. Eliezer used to say "death with dignity": we're probably going to lose this, but we should at least make a real effort. And Holden Karnofsky, who's now a senior adviser at Anthropic, and, you know, Anthropic just updated their responsible scaling policy, basically backing off from some of the commitments

they had previously made to pause development under certain circumstances; they're no longer committing to that. He put out a long defense of why. But a previous thing he had written was "success without dignity." His view was roughly: I don't think we're doing a great job collectively of trying as hard as we should be trying, and the risk we're running is a lot higher than I would like it to be, but the problem does look much more tractable than it used to. Five years ago, when people would come to him and say, "What can I do for AI safety?", he'd say, "I don't really know." Now he says, "I've got a long list of projects people can work on that all at least seem to help." So I wish I had an answer like "do this and it'll really work." I haven't heard any credible claims in that direction, really, except in cybersecurity, which is obviously only part of what the world is going to need. But maybe all that stuff could add up to a win. >> Great. So given that we have this enormously transformative technology, probably the most powerful dual-use technology in human

history, in my opinion, probably consensus opinion too, way more consequential than even nuclear weapons, which up until this point have probably held that title: do you think these model providers, these model capabilities, should be in private hands as they currently are at all? Is there a case in which you'd support the government in some sense nationalizing frontier AI development? What's your take? >> Well, Biden used to always say, "Don't compare me to the Almighty, compare me to the alternative," and I would probably say the same thing about the frontier AI companies. I certainly don't think they're all acting perfectly, and I certainly think the race between them has the potential to get out of hand. So I would like to see government action of some sort, and there are a lot of debates about exactly what sort of policy we should have, but no policy at all doesn't seem like a winner to me. And I say that as, broadly speaking, a lifelong techno-optimist libertarian who mostly

would rather see the government stay out of these things. But this one does seem qualitatively different, where some government involvement seems prudent if not outright necessary. I would not go for nationalization, though, for the basic reason that I just don't trust the government that much. I look at the current leadership, and I'll take my Sam Altmans and my Darios and Demises over my Pete Hegseths in that standoff, which is happening this week. I support Anthropic 100% for putting some limits on what they want their technology to be used for, and I hope they stand by it. If the government comes down on them, I will preferentially send my tokens their way, even if it weren't necessarily always the best performance, because I do think that's a really important stand for them to take. So could I imagine a government so competently run and so pure of heart that it would maybe make sense to nationalize? Yeah, I

can imagine a lot, I guess, but I don't think we're anywhere close to having that government today. So I would rather see some competition, some hopefully healthy balance, and maybe some of the worst excesses reined in by the government, but certainly not a takeover that puts this under the military. That sounds like a recipe for disaster. >> Yeah, that makes sense. I'm also always skeptical, though, of corporate incentive alignment. Oftentimes we think about situations, most notably recently social media, where you have private corporations driven by usage rates or by how much capital they're pulling in, whether for the sake of more development or purely out of desiring more capital, and that incentive misalignment causes a situation where the individuals, the consumers, ultimately get hurt. So I guess on this continuum we described, where on one end we have pure nationalization of the entire

technology, and on the other end no regulation at all, can you talk a little more about where you sit, and what that regulation might look like if, as you kind of said, you think it's somewhere in the middle? >> I would probably, honestly, oppose most of the ideas that rank-and-file politicians are going to come out with for regulation. I want my self-driving cars. I don't want them to say ChatGPT can't give you medical advice. I don't want them to say you can't get therapy from a bot. A lot of that stuff, I think, ends up just being guild-style protectionism, and it hurts everybody. Basically, I think what government should do is try to solve the race-dynamic coordination problem and try to minimize the truly extreme risks. It does seem like right now, over the next couple of years, the companies believe they're going to create automated AI researchers; they're going to create recursive self-improvement loops. And nobody really knows where that goes. I don't think that is a great thing to

be happening in a few relatively small and kind of ideological companies. I think it's underappreciated and under-discussed how ideological these companies are. Even the definition of AGI as something that can do everything better than humans is not a neutral frame; it's a definition with a lot baked into it. There is certainly diversity within the companies; it's not like all of them are successionists or whatever, but there is a strain of that. And there's also a strain of: wouldn't it be amazing if we could just replace ourselves and not have to work anymore? That, I think, has taken on a bit of a life of its own. So I do think there's a role for the government to say, "Okay, we're not here to tell you people can't get free medical advice, but you can't run super-high-energy experiments with no transparency," experiments that you yourselves are saying, and this is one of the strangest things about this whole situation, right, all

the leaders have said for years that there's a significant risk that this goes really badly. So that's the thing I would like to see government really focus on: is there some way to rein that stuff in, make sure we have better extreme-risk mitigations in place, make sure we have better safety plans? It's tough. There are no great or easy answers here, but that's where I would definitely want the government to focus its energy. >> Okay. And then another major governance question that interacts with a lot of these concerns about catastrophic misuse or misalignment: you mentioned the race across American labs, but there's also the race with Chinese labs and the Chinese government, of course. The default high-level strategic vision in DC, as far as I can tell, and in SF as well, is that we're going to continue to race ahead building up capabilities. We have a lead over the Chinese labs right now. There's debate around how significant

that lead is, but there's no doubt that we have a lead. We'll continue to build up this lead as much as we can via export controls and improving privacy and security at the model providers. And then when we're right on the cusp of extremely transformative AI, or even somewhat into the intelligence explosion, so to speak, we will have time to slow down and coordinate and not push capabilities even further towards superintelligence. And again, that's contingent on having a significant lead over China. An assumption of this strategy is that the worst outcome would be the US and China neck-and-neck, continuing to race, stuck in a prisoner's dilemma where even if both sides are concerned about misalignment, each has to keep racing ahead because otherwise the other will. What do you think about this high-level strategic picture? >> Well, it's tough. I mean, I don't want to pretend it's not tough, but my outlook has always been, basically, I hate that idea. And I don't think the few months

will be enough, because the difference is like months, it's not years. That's not a long time to solve all these problems. We've spent much of the last hour and a half talking about all the different problems and their different facets, and we didn't even touch on open source, right? Open source could be a great counterbalance against concentration of power, but if we open-source the wrong thing, we can't take it back. You may have just put a bioweapon assistant, or even an autonomous creator, into the public domain in a way that is going to have super long-term echoes, potentially. So that's another whole facet of the problem that you could spend many hours and many thousands of pages trying to figure out what to do about. So this idea that we're going to have these couple critical months, solve everything in that time, and then somehow win, that to me just doesn't make any sense at all. I don't want to be naive with respect to China, and I don't even want to say China, because I think it is really the government of China, and potentially a relatively small

cohort at the top of the government of China, that I think we quite rationally should be at least somewhat wary of. But I always come back to this: the real aliens in this situation are the AIs, not the Chinese. We have a lot more in common between the United States and China, as fellow humans, than we do with the AIs. So is it going to be hard to build trust across the great ocean and the great civilizational divide? Sure, it's going to be hard. We've got ideological differences between the governments, and many, many other challenges. But I would strongly advocate for starting to do that work now. For my part, I always want to talk to people in China. When another AI podcast does an interview with a Chinese researcher, somebody at one of their top companies, I ask if I can cross-post it to my feed, because I think we should have a lot more of this researcher-to-researcher-level communication. And I would absolutely invest in all manner

of diplomacy, and even in out-there ideas. There's the CERN project for high-energy physics; could we create some sort of shared, jointly controlled place where researchers from both countries could go to work together on very sensitive problems? It could be some small island in the Pacific, or it could be in Singapore, I don't know. But the idea that we're just going to decouple and race each other sounds deeply unwise to me. And I would take my chances on the possibility, at least, that we could come to some sort of shared understanding with fellow humans, versus hoping that our recursively self-improving AIs will somehow be better than their recursively self-improving AIs. All of which, by the way, is happening against a backdrop I take no pleasure in pointing out, because it's not a very popular opinion: we are looking more like China all the time. You know

what is happening again this week? We've got the Defense Department threatening retaliation against an American company because that American company wants to hold firm on what I would consider a core American value: not using their technology for mass surveillance. What is it that people typically worry about when they talk about Chinese AI, or Chinese values, or living in Xi's world? I think one of the big things is mass surveillance, right? It's the idea that I can't speak my mind anymore, even in private contexts, because the government is going to hoover up all that information and potentially use it against me. As far as I can tell, that's exactly what the US Defense Department is trying to coerce Anthropic into doing right now. And for what? We're kind of losing the thread here, in our delusions of grandeur, in our idea that we're somehow going to be the winners. I was just talking to my seven-year-old last night about this, because he was asking questions like, why is there a war? How does that happen,

you know? And I was like, well, almost always the people that start it go down in history as the bad guys, and almost always the people in the country that started the war, who were convinced it might be good for them, end up regretting it. And unfortunately, that feels like the trajectory we are on right now, where we're talking ourselves into this idea that if we can achieve strategic dominance, then we'll make them an offer they can't refuse. We're forgetting that, first of all, we're losing ourselves in the process, and second, they get countermoves, right? Taiwan's a lot closer to them than it is to us, and these fabs are easily destroyed and not easily put back together. So I really don't like that theory at all, and I try to advocate for something more conciliatory wherever possible. That's not to say we should be totally naive. It's good to have leverage. You could maybe get me on board with a policy of, let's not sell chips to China, but let's rent them freely to Chinese companies, which is somewhat the

policy we have. I mean, they can buy from hyperscalers outside of China. I just wish we weren't. The momentum on this is the worst. It's America at its worst: when we think we have an enemy and we think we're going to unite against them to be the best civilization, that just never seems to go super well. So I hope to visit China sometime in the next year and do my tiny little part to participate in civilization-to-civilization understanding. I don't expect to move the needle, but as a gesture, if nothing else, I really think more people should be doing more of that kind of thing. >> Very cool. Okay, yeah, glad we asked you that. I think that's an important perspective, and definitely not one that you hear articulated with that much clarity or depth very often. So thank you very much, Nathan. This was an amazing, very wide-ranging podcast. I think we got into some good depth on some crucial issues. For all of our listeners who haven't heard of the Cognitive Revolution, absolutely go check it out. Nathan is one of these people who has a technical background, thinks clearly about risks from AI, but also strongly believes in the promise of technology

and he has just a massive diversity of conversations with people across various subdisciplines within the field. So yeah, thank you so much, Nathan. This was an amazing conversation, and we really appreciate you coming on today. >> Yeah, thanks so much, Nathan. >> That's very kind. Thank you, guys.

[outro music]
The wheel, I turn, the river find its way
Between the signs and the fear of man
The wisest minds upon the earth cannot agree
What looked like distant shore now reach a hand

And nobody have what them promised we
No fortress and no wall
No chain and old, just layer upon layer
And we hope and pray
A story told a thousand times of old
But who can tell the ending of this day
The real aliens not across the ocean
They're coming up from underneath the ground
We trust the current, we follow the motion
Something not of us, but by us found
The real aliens, the real unknown
Not enemies that we could sit and meet
Something stranger that we ourselves have grown
Success without dignity, incomplete

I gave the key to something with no name
Closer than a stranger, further than a friend
It never raised its voice, it never claimed a flame
But still you wonder where the patience end
The wise men say the wall is going to hold
The wise men say the wall already break
Both of them so sure, both of them bold
And neither one can tell you what's at stake
Real aliens not across the ocean
They're coming up from underneath the ground
We trust the current, we follow the motion
Something not of us, but by us found
The real aliens, the real unknown
Not enemies that we could sit and meet
Something stranger that we ourselves have grown
Success

without dignity, incomplete
We drove the great beast to the edge of the earth
Small bands of people, fire in hand
No master plan, just hunger since our birth
And our new creation walk the land
One man say success without dignity
Another say at least go down upon your feet
But you know we were not made for defeat
So stand firm, stand firm upon the street
The real aliens not across the ocean
They're coming up from underneath the ground
We trust the current, we follow the motion
Something like us, but by us found
The real aliens, the real unknown
Not

enemies that we could sit and meet
Something stranger that we ourselves have grown
Success without dignity, incomplete
The wheel, I turn, the river find it way
The wheel, nobody ever... but we keep turning
We keep turning, I know we keep turning

If you're finding value in the show, we'd appreciate it if you'd take a moment to share it with friends, post online, write a review on Apple Podcasts or Spotify, or just leave us a comment on YouTube. Of course, we always welcome your feedback, guest and topic suggestions, and sponsorship inquiries, either via our website, cognitiverevolution.ai, or by DMing me on your favorite social network. The Cognitive

Revolution is part of the Turpentine Network, a network of podcasts where experts talk technology, business, economics, geopolitics, culture, and more, which is now part of A16Z. We're produced by AI Podcasting. If you're looking for podcast production help, for everything from the moment you stop recording to the moment your audience starts listening, check them out and see my endorsement at aipodcast.ing. And thank you to everyone who listens for being part of the cognitive revolution.