Machine Learning Street Talk
The Brain Is Just Specialized Agents Talking To Each Other — Dr. Jeff Beck
Channel: Machine Learning Street Talk
Date: 2026-01-25
Duration: 46min
Views: 10,087
URL: https://www.youtube.com/watch?v=Ucqfb33GJJ4

What makes something truly *intelligent?* Is a rock an agent? Could a perfect simulation of your brain actually *be* you? In this fascinating conversation, Dr. Jeff Beck takes us on a journey through the philosophical and technical foundations of agency, intelligence, and the future of AI.

Jeff doesn't hold back on the big questions. He argues that, from a purely mathematical perspective, there's no structural difference between an agent and a rock – both execute policies that map inputs to outputs.

Geometric deep learning is a big part of the stack, if for no other reason than that modeling the physical world means incorporating the symmetries that exist in the physical world. We're highly motivated to employ those methods and techniques. >> But is the world written in code, or do you mean exploiting the regularities in the world? >> Exploiting the regularities. Look, the world is translation invariant. It's not fully rotation invariant, because gravity gives you a principal axis, but it's certainly rotationally invariant in the x-y plane. >> Yeah. >> And if you want a good model of the world as it actually is, it should incorporate those features. Of course, you can discover them in a brute-force way, but the mathematician in me really wants to build the symmetries in, and fortunately we've got a lot of great tools, developed over the last several years, that can do that. >> What's your view on agency? >> If I'm being an FEP

purist, I have to say that there's no difference between an agent and an object in a very real way, or at least that there's nothing structurally distinct between how we model an agent and how we model an object. It's really just a question of degree: an agent is a really sophisticated object. It has internal states that represent things over very long time scales, and it has sophisticated, context-dependent policies, which is basically saying "really long time scales" again. >> Yeah. There's the philosophical, highbrow notion of agency, where we introduce notions of intentionality and self-causation and things like that. But the real no-nonsense version of agency is just: an agent is a thing which acts and performs some kind of computation. And I guess you could model almost anything as an agent. >> Yeah. Well, so if your definition

of an agent is "something that executes a policy," then anything is an agent. A rock is an agent. Everything has an input-output relationship, and a policy is an input-output relationship. When many people talk about agents, they're adding a few additional elements that have a lot to do with how the policy is computed. For example, when we think about the difference between us and, say, amoebas, we cite things like planning, counterfactual reasoning, and goal-oriented behavior. Those are all statements about how we compute our policies: latent variables that represent policies, compatible with something like reinforcement learning. And that's taken as the defining characteristic of an agent. But you could very easily just say:

from an outside perspective, if you can't look at how something is doing its computations, if the only thing you observe is the policy, does that mean you can never conclude that something is an agent? I would say no. You'd still like to be able to conclude that something is an agent even though the only thing you ever get to measure is its policy. >> But do you think we should have some notion of the strength of an agent? >> The strength of an agent, like a measure of agency? Yeah. I think you could use notions like transfer entropy to estimate the timescale over which something is incorporating information, or the degree to which it exhibits context-dependent behavior, and that would be a pretty good measure. Is it normative? No, it's not. But it is a measure, and you could use things like that. At that point, though, you're really just talking about policy sophistication again, not "does it have a reward function?" Like, is it actually executing planning?
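The transfer-entropy idea Jeff mentions can be sketched concretely. Below is a toy plug-in estimator for binary time series; the estimator choice and the synthetic data are my assumptions, not something from the conversation. A signal that copies another with a one-step lag shows strong directed coupling, while the reverse direction shows essentially none:

```python
# Toy "degree of agency" signal: plug-in transfer entropy TE(X -> Y) for
# binary sequences. Illustrative sketch only, not a normative measure.
import numpy as np

def transfer_entropy(x, y):
    """TE(X -> Y) = sum p(y+, y, x) * log[ p(y+ | y, x) / p(y+ | y) ]."""
    x, y = np.asarray(x), np.asarray(y)
    c = np.zeros((2, 2, 2))                  # counts over (y_{t+1}, y_t, x_t)
    for yn, yc, xc in zip(y[1:], y[:-1], x[:-1]):
        c[yn, yc, xc] += 1
    p_joint = c / c.sum()                                          # p(y+, y, x)
    p_yx = c / np.maximum(c.sum(axis=0, keepdims=True), 1e-12)     # p(y+ | y, x)
    cy = c.sum(axis=2)
    p_y = cy / np.maximum(cy.sum(axis=0, keepdims=True), 1e-12)    # p(y+ | y)
    mask = p_joint > 0
    ratio = p_yx / np.maximum(p_y[:, :, None], 1e-12)
    return float((p_joint[mask] * np.log(ratio[mask])).sum())

rng = np.random.default_rng(0)
x = rng.integers(0, 2, 5000)
y = np.roll(x, 1)                            # y is x delayed by one step
te_xy = transfer_entropy(x, y)               # ~ log 2: y's next value is driven by x
te_yx = transfer_entropy(y, x)               # ~ 0: x ignores y
print(round(te_xy, 3), round(te_yx, 3))
```

As Jeff notes, a high value here only certifies a sophisticated, context-dependent policy; it says nothing about whether planning is happening inside.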

>> Yeah. Intuitively, agents seem to me to be causally disconnected, because they're planning into the future. They're not impulse-response machines, not just part of the mass of things going on around them; they're obviously decoupled from their immediate locality. >> So here's the trick. I've got this agent and I know exactly what it does: it takes in information, internally rolls out a whole bunch of future consequences of various actions or plans it could take, selects the best one, and then executes it. All of those variables occurred inside. From the outside, it just looked like a function transformation. Unless I somehow go in and record the fact that the manner in which it calculated its policy involved doing those rollouts,

I wouldn't be able to show that it's actually doing those rollouts. I'd just be able to conclude that it has a really sophisticated policy. So the question is: how do you identify that something is actually doing planning, as opposed to merely having an incredibly sophisticated policy? I think that's a really hard question. >> My intuition is that a simple input-output mapping can't be an agent. In a way this is related to what we were talking about with grounding: it seems that when things are physically embedded in the world, they're more likely to be agents. The functionalist idea that a bit of computer code running on a machine could be an agent, it feels like that can't be right. >> It does. So suppose I coded it up so it was doing all of that planning. It gets its inputs, does some massive Monte Carlo tree search, picks the best policy possible, and then executes it. You could say, "Oh, it's clearly

doing planning and counterfactual reasoning, look, there it is," because you coded it, so you know it's doing it. But if you're looking at it from the outside, if you don't know what's happening inside, all you have access to is: here's the action it took, given this long series of inputs. So it's really hard to identify something as an agent per se from the outside. You kind of have to know what's going on inside. This, by the way, is why I'm skeptical of calling these prediction-based approaches to AI agentic. You could say it's not doing anything even remotely agentic unless it's executing planning and counterfactual reasoning. Your chess program is clearly doing some planning and counterfactual reasoning, because you know it's doing it. But I could describe the exact same set of behaviors just with a policy function.

>> I think the counterfactual thing is an important feature here, because we could take something which was conscious, or which had agency, and just take a trace of the actual path that was found. As a reductio ad absurdum: now we've just got a computational trace, and that thing has clearly lost whatever agency or consciousness it had. So there's something about considering all of the possibilities. >> Yeah. In my mind that is the fundamental feature of an agent: if you can show that it's engaged in planning and counterfactual reasoning, then it's definitely an agent. My argument is simply that that's hard to do unless you crack it open and see what's going on inside. Now, you could take a pragmatic view and say: if the simplest computational model of the behavior models it as if it were doing planning and counterfactual reasoning, then you can draw the implicit conclusion that, yes, I may as well say it's an agent. And that's kind of the approach I've taken. One of the things that comes out of the physics discovery

algorithm is that you apply it to agents, and what do you get? You get a model. Bear in mind, I called them all objects before, and I didn't change anything to make it special for an actual agent. But what I do have the ability to do, because of the model, is look at the internal states associated with the object I want to call an agent and see how sophisticated they are. That degree of sophistication is what allows me to say, "I'm going to go ahead and call this an agent." And I like the whole idea of having a metric; I'm sure it would effectively be something like transfer entropy. So we have a metric on how sophisticated the internal states had to be in order to generate this output, and if it's above some threshold, we call it an agent. I don't like thresholds, though, so let's just say a degree of agency, a degree of sophistication. >> Coming back to Dennett's intentional stance: there is a level of representation which serves as a useful explanation even though it's not actually the microscopic causal graph. And maybe we

can agree that no agent can possibly be the cause of its own actions. But when there is a degree of planning sophistication, then macroscopically it's as if it's the cause of its own actions. >> Yes, and that's why this "as if" phrase comes up a lot. It's important to remember that no matter how clever your model is, and no matter how clever the words you use to describe it, a lot of this stuff is "as if". This is the best model; that's all. This is why I repeat over and over again, and grind into the students, that science is about prediction and data compression and nothing else. The same thing is going on here. Just looking at behavior, you'll never know for sure, in any meaningful way, whether it's doing a mere function transformation or whether it's engaged in planning and counterfactual reasoning. But if your best model of it, if you say,

"Well, I tried to model it as a function transformation, but damn it, it had a lot of parameters. Then I tried to model it as something doing Monte Carlo tree search on the inside and giving the answer, and that had 40 parameters," well, that's the model I'm going to go with, and now I'm going to call it an agent. >> If we had a physical agent in the real world doing all of this planning, would it have some kind of primacy over a computer simulation of an agent doing all of this planning? >> Is this like, "If I uploaded my brain onto a computer and didn't connect it to the world, would it still be thinking?" Is that the idea here? >> That works. So, yeah, let's say a high-fidelity computer simulation of Jeff. Would that Jeff be an agent? >> No. >> Oh, I wasn't expecting you to say that. >> Because I'm the agent. If you do a high-fidelity computer simulation and you put it in my body, then I think I'd have to say it's an agent. >> Yeah. >> Right. If it's doing exactly the same, I mean, this is the standard argument: it's doing exactly the same calculations

from a purely phenomenological perspective, it's the same; it's indistinguishable. >> Okay, so agents need to be physical. >> I do believe an agent needs to be physical, absolutely. I believe you can have a model of agency and not have an agent. You can put that model in a computer, run it, and make predictions as to what an agent would do, and it might even be 100% correct, but I still wouldn't call it an agent. But again, this is getting into philosophy, and philosophy frustrates the Bayesian, because philosophy is not probabilistic. [laughter] Philosophy is really about drawing clear lines and distinctions, and in my world those don't really exist. Everything has an error bar. There isn't a clear delineation between an object and an agent; from this modeling perspective it's really just a question of degrees, and philosophy is terrible at handling questions of degree. >> My friend Keith is a big fan of computability,

and he thinks an agent is basically a type of computation: it has access to ambient state, it can take action, and there's this kind of cybernetic loop. For him, the strength of the agency in the system is the compute class of the thing. If it's a finite state automaton, it's a weak agent; if it's a Turing machine, it's a strong agent. >> Yeah, the degree of sophistication of the compute. >> Pretty much. Does that ring true to you? >> If you forced me, at the point of a gun, to put a measure on agency, it would probably look a lot like that. >> Yes. Jeff, let's talk about energy-based models. >> Sure. >> Yann LeCun had a monograph out, I think in 2006, talking about this; he's been talking about it for a long time. >> Oh, yeah. When you fit your neural network to data via gradient descent, you have written an energy function in weight space and you are following it to its energetic minimum. The

advantage of taking an energy-based approach, as opposed to a straight-up function approximation approach, is that an energy-based model comes with something like an inductive prior. If you're just doing function approximation, you're basically saying: there's some mapping from x to y, where x is my inputs and y is my outputs, any mapping is out there, and I just want to figure out what it is. In an energy-based model, you're effectively placing constraints on what that input-output relationship can be. I like to think about the distinction between an energy-based model and a traditional feed-forward neural network in terms of where your cost function is applied. In a traditional neural network, you take in your inputs, you get your outputs, and the cost function is just a function of the inputs and the outputs; the only thing you're optimizing is the weights. In an energy-based model, there's another thing your cost function operates on, and that's the internal states of your model. And

as a result, in order to figure out the best solution, you actually have to do two minimizations: one that finds the energetic minimum associated with the part of the cost function that operates on the internal states, the hidden nodes of your network, and one that is your effective prediction error. This is very much consistent with the approach a Bayesian would take: you have a prior probability distribution, which gives you an energy function over every single latent variable in your model, and you are optimizing with respect to all of them. A good example of this is the variational autoencoder. I think the VAE is the best example of the most commonly used energy-based model out there. Why? Because you have an encoder network and a decoder network, and your cost function is based on the difference between inputs and outputs. That part is just a regular reconstruction objective. But, depending on what flavor of VAE, some part of your cost function also

is a function of the actual internal representation. In a traditional VAE, it's how Gaussian that representation is: you want the internal representation to be as Gaussian as possible. If it's a VQ-VAE, then it's more like a mixture of Gaussians, but it's still a cost function applied to the internal states as well as to the inputs and outputs. >> Very cool. So a VAE is a fairly canonical example of an energy-based model. And the whole deep learning world is obsessed with test-time inference at the moment, which in a way is a step towards what you're talking about. >> Yeah, you're treating some of the weights of your model as if they're latent variables, because when you're shown a new input, you're allowed to change some of the weights without looking at the output. So what are you doing? You're treating the weights as latent. Which makes it a great trick, in my opinion. They're moving in the direction of energy-based models; I love it. The only thing I don't like about test-time training is how the bulk of the

training is done. In a traditional energy-based model, you always find the minimum with respect to the latent variables, which, in the case of test-time training, are the subset of weights you're allowed to change at test time. When you do the training for a traditional energy-based model, you're allowed to make those changes throughout the entire course of training. The way we're often doing test-time training these days is that we just do regular old neural network learning, and then, finally, when we get to the deployment phase, we suddenly turn on these additional latents, which are basically some of the weights of the network, and do an additional bit of learning at that point. Now, I'm not an expert here, but this seems unwise to me, and the reason is that you didn't train the original network with that turned on; you trained it in a completely supervised way. >> Yes. >> Now, I'm sure people are aware of this and it's been addressed in the

literature, but I'm not personally aware of it, and I don't think that's how it's used in practice. >> We should also introduce the term "transduction." My definition of transduction is that you're actually doing search or optimization as a function of the test samples. I interviewed Clement Bonnet; he had a VAE on ARC, searching latent spaces, and he actually searched through the decoder as a function of the test sample. Because these models are maximum likelihood estimators, they're always giving you a kind of smoothed-out average, and there's so much information in the test sample. Let's riff on the relationship between energy-based models and Bayesian inference. Of course, they have the advantage that you don't need to do the very expensive, intractable normalization. >> Yes. Tell me about that. >> My take is that an energy-based model and a Bayesian model have a lot in common. In physics, quite literally, energy is negative log probability. Now, of course, there's the

normalization factor, which you don't need to worry about if you're just minimizing energy. And minimizing energy is sort of like saying, in a Bayesian framework: I'm not actually going to treat some of these latent variables probabilistically; I'm just going to do MAP estimation on some of my variables and be okay with that. That's one way to interpret the relationship between an energy-based model and a properly Bayesian model. There's a happy medium here, though: you don't have to just minimize the energy function. You can also calculate the curvature down at the minimum, do a Laplace approximation, and call yourself a Bayesian again. Yes, there's more computation involved, but we've got a lot of great tricks for making that totally tractable. >> What's the relationship between the free energy in the free energy principle and the energy in energy-based models? >> A regularization term is the short answer. If you're being

very pedantic, the difference between minimizing energy and minimizing free energy is that free energy has an additional entropy penalty term. If you're just doing maximum likelihood or MAP estimation, minimizing your energy function with respect to some particular variable (let's pretend there's only one) to get a point estimate and calling it a day, there's not that big a difference, because there's no probability distribution over the latent that would let you compute that regularization term. But that's the only difference. "Are you regularizing or not?" is, I think, the easiest way to think about it. >> So LeCun is a big advocate of JEPA, these joint embedding predictive architectures, using non-contrastive learning, where essentially the learning objective compares the latents of observed and unobserved parts of the space. It's an architectural design. >> Okay, so what does JEPA stand for? It's Joint Embedding Predictive Architecture.
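That entropy penalty can be made concrete with a toy calculation. The 1-D quadratic energy and the Gaussian variational family below are my assumptions, chosen so both terms of the free energy have closed forms: minimizing energy alone gives a point estimate, while minimizing free energy recovers the full posterior and the log-normalizer.

```python
# Sketch: energy minimization vs free-energy minimization for the 1-D
# quadratic energy E(x) = (x - mu)^2 / (2*sigma2). For a Gaussian
# q = N(m, s2), F[q] = E_q[E] - H[q] is closed-form; its minimum over
# (m, s2) sits at the true posterior and equals -log Z exactly.
import numpy as np

mu, sigma2 = 2.0, 0.25                       # energy parameters (assumed for the demo)

ms = np.linspace(0.0, 4.0, 401)              # candidate means
s2s = np.linspace(0.01, 1.0, 397)            # candidate variances
M, S2 = np.meshgrid(ms, s2s, indexing="ij")

expected_E = ((M - mu) ** 2 + S2) / (2 * sigma2)   # E_q[E(x)]
entropy = 0.5 * np.log(2 * np.pi * np.e * S2)      # H[q], the "entropy penalty"
F = expected_E - entropy

i, j = np.unravel_index(F.argmin(), F.shape)
print(ms[i], s2s[j])        # best q: mean ~ 2.0, variance ~ 0.25 (the true posterior)
print(F.min())              # ~ -log Z = -0.5 * log(2*pi*sigma2) ~ -0.226
```

Dropping the entropy term leaves only `expected_E`, whose minimum is the MAP point with zero variance preferred; the penalty is exactly what turns that point estimate back into a Bayesian answer.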

There we go. >> So, what's the joint embedding bit about? >> Well, the joint embedding bit is: I'm going to take my inputs, take my outputs, and embed them both in some space, and then I'm going to learn a prediction between the two embeddings. And that's a great idea, because it has some of the flavor of what we would like to get out of our models. In many situations (I should be very particular about this), we're not interested in predicting every single pixel in the image. We want something a little more gestalt, a little more high-level, a little more like a conceptual understanding of what's going on. Emphasizing the goal of predicting every single pixel, which is what's typically done in generative modeling right now, might lose some of the abstractive power of these networks. So the whole point of JEPA, as I understand it (I'm sure there are other points), is that you're going to compress your inputs and compress your outputs and then do all the learning in this compressed space. Love it. Science is about prediction and data compression. Let's make that compression

explicit on the front end and the back end. The downside of this approach is that it doesn't work out of the box, because it's very easy to find a compression, an embedding of the inputs and an embedding of the outputs, for which prediction is perfect: just map both of them to zero. So other tricks need to be employed to make it work. >> Yes, I remember LeCun talking about this. There's the traditional contrastive method, which is apparently Hinton's idea, with negative sampling and so on, and that's very expensive, because you actually have to do lots and lots of sampling; and then there's this non-contrastive thing. >> This, by the way, is what Hinton should have won the Nobel Prize for, in my opinion. [laughter] Because the whole point of the wake-sleep algorithm and contrastive divergence was that it's actually biologically plausible: it was an end run around the need to do backprop, and that's what made it so clever and interesting, in my opinion. >> LeCun is a big fan of this non-contrastive

approach, where you work in the latent space. There are many different algorithms that do this; we had a whole load of shows about non-contrastive learning. There are things like VICReg and BYOL and Barlow Twins, an entire thread of research, and in many different ways what they're trying to do is avoid this collapse problem you're describing, using different forms of regularization. >> There's an old-school way of accomplishing the same thing, and that's what's called pre-processing. This is something a lot of people do: take your data and transform it first. In fact, we do this all the time with vision language models. We want to use an LLM to predict images, so what do we do? The first thing we have to do is tokenize the image, and we do that by running a VAE. That's the pre-processing, and the pre-processing step is completely independent from the actual algorithm that's going to be tasked with solving the problem of interest.
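The collapse problem and the regularization fix can be sketched in a few lines. This is a heavily simplified, VICReg-flavored skeleton; the function names and loss weights are my assumptions, and the real methods differ in detail:

```python
# Why a joint-embedding objective collapses without regularization, and the
# variance/covariance fix (VICReg-flavored, simplified). Illustrative only.
import numpy as np

def invariance(za, zb):
    return float(np.mean((za - zb) ** 2))        # matching/prediction term

def variance_penalty(z, gamma=1.0, eps=1e-4):
    std = np.sqrt(z.var(axis=0) + eps)           # per-dimension spread
    return float(np.mean(np.maximum(0.0, gamma - std)))  # hinge: keep dims alive

def covariance_penalty(z):
    zc = z - z.mean(axis=0)
    cov = (zc.T @ zc) / (len(z) - 1)
    off = cov - np.diag(np.diag(cov))
    return float((off ** 2).sum() / z.shape[1])  # decorrelate dimensions

def loss(za, zb, w_inv=1.0, w_var=1.0, w_cov=0.1):
    return (w_inv * invariance(za, zb)
            + w_var * (variance_penalty(za) + variance_penalty(zb))
            + w_cov * (covariance_penalty(za) + covariance_penalty(zb)))

rng = np.random.default_rng(0)
z_collapsed = np.zeros((8, 4))                   # every input embedded to the same point
z_healthy = rng.normal(size=(8, 4))              # spread-out embeddings

print(invariance(z_collapsed, z_collapsed))      # -> 0.0: collapse makes matching "perfect"
print(variance_penalty(z_collapsed))             # -> ~0.99: the variance term exposes it
print(loss(z_collapsed, z_collapsed) > loss(z_healthy, z_healthy + 0.01))
```

The matching term alone is trivially minimized by mapping everything to zero; the variance and covariance penalties are what make that degenerate solution expensive.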

And that's not something we necessarily have to stick with. It would be very nice if there were a way of doing it jointly; we're getting right back to JEPA again. What we'd like to do is choose our pre-processing algorithm not a priori, not as a separate first step, but to choose the pre-processor that works best for the problem. >> Yeah. >> And I think that's the ultimate motivation for a lot of this work: what's the right embedding? One of my favorite tricks, of course, is that I pre-process with VAEs all the time. In fact, every time someone hands me a new neural dataset, the first thing I do, and I'm not ashamed to admit it, is run PCA on it, pass it through a VAE, and then take a look. It's the first thing you do with your data, because it gives you a good idea of the signal-to-noise ratio in the dataset itself. >> Yes. >> And then what do I do?
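As an aside, that "PCA first, then look" step can be sketched. The synthetic neural data below is an assumption for illustration, but it shows how the eigenvalue spectrum gives a quick read on signal-to-noise:

```python
# The "first look" recipe: run PCA on a new data matrix and read the
# eigenvalue spectrum as a rough signal-to-noise check. Synthetic data
# (assumed): a 3-D latent signal mixed into 50 channels plus unit noise.
import numpy as np

rng = np.random.default_rng(1)
T, N, K = 2000, 50, 3
latents = rng.normal(size=(T, K)) * np.array([5.0, 4.0, 3.0])   # strong signal
mixing = rng.normal(size=(K, N))
data = latents @ mixing + rng.normal(size=(T, N))               # + observation noise

X = data - data.mean(axis=0)
eigvals = np.linalg.svd(X, compute_uv=False) ** 2 / (T - 1)     # PCA spectrum
explained = eigvals / eigvals.sum()
print(explained[:5].round(3))    # a few dominant components, then a flat noise floor
```

Three eigenvalues stand far above a flat floor here, which is exactly the kind of at-a-glance signal-to-noise read Jeff describes.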

I subsequently do most of my analysis in that discovered embedding space. I don't see a huge problem with that from a purely pragmatic perspective, but it's certainly cleaner to have a single algorithm and approach, and not just be stringing these things together in an ad hoc way. And PCA is a really great example of why: there's a failure mode for principal component analysis which is actually really common in neural data. PCA basically says: where's the most variability? Okay, I'll worry about that, and all the stuff that's not varying very much I'm just going to throw away, as if dimensions with low variability were not important. Well, it turns out that in neural data, the dimensions with very little variability are some of the most important dimensions. So pre-processing with PCA runs the risk of throwing out the most valuable information in your dataset. >> Yes. >> And so there's a lot of wisdom in jointly

fitting your pre-processing model as well as your inference and prediction model. >> On this subject of not throwing things away: JEPA and non-contrastive learning are part of this bigger field of self-supervised learning, where we want to learn representations that maintain fidelity and richness. LeCun's hypothesis is that when you do something like supervised learning with some particular downstream task in mind, the neural network gets wise and discards all the long-tail stuff that isn't relevant for that particular task. So when you train these models, what you're trying to do is maintain enough ambiguity that it compresses the information, but enough fidelity that it works broadly for different things. >> Yes, and that is a laudable goal, and I certainly share it. Fortunately, networks are now so big that we don't run the risk of overfitting as much as we used to. But the last thing you want to do is

train your network to toss information that you might need down the road. That said, the vast majority of what the brain does, just like these neural networks, is decide what information is currently task-irrelevant. But that's all the more reason to do things in a self-supervised or unsupervised way, because you're basically not telling it what's task-relevant and task-irrelevant. >> I interviewed Chollet about version two of the ARC challenge, and one thing that struck me is that I think of intelligence as multi-dimensional. Version one got saturated. ARC was actually really amazing, because it's the only intelligence benchmark that survived for five years before being defeated; since the advent of these thinking models, it has been defeated very quickly. But they're working on version three, and there'll be a version four, and a version five. Will there always just be something left over? >> That sounds like another philosophical question. So yes is my answer. There will always

be something left over, in the sense that this has been the trajectory for a really long time. We get algorithms that do amazing new things, and then someone comes along and says, "Yeah, but it can't pull a rabbit out of a hat." So what does someone do? They figure out a new training protocol or a slightly different architecture, or they just train it to pull rabbits out of hats, and suddenly it can. Then someone proposes a new challenge, and a new challenge, and a new challenge, and it's always this game of one-upmanship. So the question becomes: what's the point at which there are no more new challenges? And I'm not entirely certain we're ever going to get there. It may very well be the case that we get algorithms capable of replicating the complete suite of human behaviors, and then someone will come up with some criticism like, "Yeah, but it's not really doing X; it's just faking it." This is just the direction things go, because people really do think these things are important. >> Yeah. Do you think the concept

of recursive self-improving intelligence is a valid one? >> Yes, I do. I think one of the most critical missing elements right now is some form of continual learning, right? At the end of the day, you really want an algorithm that doesn't just learn on the training set and then get deployed. You want something that runs around in the world, comes across things it doesn't understand, and is then able to extend its model in some sense. There are approaches to this based on Bayesian nonparametrics, Dirichlet process priors, and things like that, where you see something surprising or unique, something you didn't expect, and it causes you to say, "I need to turn learning on, because I have to figure this out." That is an absolutely critical element that we need to be developing, and we are developing it. It turns out that's one of the nice things about this sort of object-centered physics discovery approach: because it's object-centered, if it comes across a new

situation that it does not understand, it is capable of instantiating a completely brand-new object just to explain that new situation. >> Continually learning agents can acquire new knowledge autonomously, and the whole system just accumulates more knowledge. But intelligence feels different. In the system we've been describing, the intelligence is in the way we implement the Bayesian updates and actually build the algorithms. Could the systems meta-program themselves and develop better algorithms on their own? >> That's a very good question. Something that would be closer to true artificial intelligence than what we currently have would be capable of building models on the fly to deal with new situations, taking things it already knows about and combining them in new and different ways. There are approaches that have some of that flavor. GFlowNets, from Bengio's group, are a great example of something that, at least in principle, is

a generative model of generative models, right? It's sort of saying, "I might actually need a new node; it's time to create a new latent variable, because the current set just isn't cutting it anymore." Those are things I think are hallmarks of true intelligence. I never want to make the statement that as soon as it's got that, it's truly intelligent; I will never say that. But I do think it's a critical component that needs to be present: the ability to generate new models on the fly to deal with novel situations and data, as well as the ability to combine old models in new and interesting ways. This is actually how the brain evolved, right? We started out with really simple brains that had different regions solving different problems, and what eventually happened as we evolved is that these different regions of the brain learned to communicate with

each other in new ways, and through that communication acquired new abilities, which eventually evolved into new capabilities. I often like to point out that olfaction is, I think, the sense that's not studied nearly enough. The olfactory system is an incredibly old part of the brain, and arguably it's the first part of the brain that evolved the ability to do proper associative processing. Odor space, unlike visual space, where there are translation symmetries and everything is smooth, has none of that structure; it's really combinatorial and complicated. And the part of the brain that evolved to solve the olfactory problem is arguably the part that evolved into our frontal cortex. Don't quote me on that; there's a lot of disagreement there, and that's just my take. But it certainly has a lot of the features we associate with associative cortex. Wow, I just used three different senses of the word "associate" in that sentence. But I think you see

what I mean, right? It was all about taking old capabilities, combining simple models and modules to create something more complex, and iterating over time. That was what made the brain work: taking little things that worked and combining them in new and different ways in order to evolve emergent properties, emergent computational abilities, and an emergent understanding of the world in which we live. And I do think that when we get to the point where we start really saying, "Oh, this is actually truly intelligent," it's going to have that feature. It's going to have a modular description of the world and the ability to combine those modules in a way that creates a more sophisticated understanding. It's like Lego, right? The bricks all connect in certain ways, and out of them I can build all sorts of new and amazing things that were never built before. That's a capability that we

have, and that's the essence of creativity. It's why I refer to systems engineering as the thing we really want our AI models to be able to do. >> Collective intelligence is a bit different. We have this plasticity, right? We can adapt our behavior day by day. We might see some kind of meta-learning or some kind of change in our organizational dynamics. Maybe some agents will specialize, and it might be an existence proof of the kind of recursive superintelligence we're talking about. >> Yeah, I think that's absolutely correct. Specialization is great; in fact, I would argue that specialization is how we got all of this (I'm pointing at London, in case there was some confusion there). It was really the interconnected, highly specialized intelligences that are people, and their ability to learn how to work together, that gave rise to the technological revolution. The brain is the same way. In my view, it's highly specialized little

modules or agents that are capable of being repurposed and reused, capable of communicating with one another in order to solve really complicated problems. But there's always a benefit to specialization. I don't believe in AGI; AGI seems like a bit of a misnomer to me. What we really want is not artificial general intelligence, it's collective specialized intelligences. >> What about scientific discovery? What would the world look like when we could discover new drugs, new knowledge in science? >> Right now, the way we're doing that is largely focused on summarizing vast troves of data and looking for correlations present in it. Computers are good at that; they're really good at identifying small but highly relevant correlations you may not have seen. I think the next major milestone on this trajectory is experimental design: not just surfacing correlations, but constructing a

system that tests these hypotheses explicitly and generates the experiments that will fill in the gaps of our knowledge. All of this, I believe, can in fact be automated in a very sensible way. I don't see any major obstacles to automating empirical inquiry, other than that we probably want to place some safety constraints on it when we start letting the AIs run the labs, because you never know. You could always have an AI that says, "Well, the most effective experiment to determine whether this is correct is to set off a nuke," and that would be bad. >> Yes. >> So pure empirical inquiry does run risks like that, but I think that's not the biggest issue. I think what we need is a nice, concise framework for this. I'll give you an example. We had this problem pop up a while back. A gentleman we were talking to works with robots, and his robot sees something it has never seen before. So a robot is running around.

It comes across a beach ball. It has never seen a beach ball in its entire life. What you'd like is for the robot to figure out that it's a beach ball and what its properties are. If you tell the robot, "If you see something new, just stop," that's no good. What you really want is a relatively non-invasive procedure for the robot to do what a child would do. What does a kid do when they see a beach ball? They run up and poke it and say, "Oh, right, it moved." The child actually experiments with its environment for the purpose of identifying the properties of the objects in it. Now, I do think we probably want to test this virtually before it's deployed in the real world, because you never know: the optimal experiment might very well be to run up and kick the thing as hard as you possibly can, and we certainly want to avoid that. But something along those lines, a robot that is able to test the theories it has about how things work in an online way

and learn from those results in an online way, is definitely part of the goal. >> Looking forward, what do you think the future will look like when we have more autonomous AIs among us? A lot of people worry about enfeeblement, loss of control, AI making us dumb, all of this kind of stuff. >> I do worry about AI making us dumb. Offloading your thinking onto a machine, which is something AI allows, is potentially a big problem. I don't want a situation where humans are reduced to value-function selectors, just going, "Oh no, I don't like that outcome. Do this instead." I want a future where AI actually improves our understanding of the world. Simply automating everything runs the risk you specified: people becoming couch potatoes who just watch TV and occasionally say, "Yeah, these chips are no good." That seems like a bad outcome to me.
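The loop in the beach-ball anecdote a little earlier, notice something the current model cannot explain, turn learning on, and instantiate a brand-new object model, can be sketched as a toy program. Everything here is an illustrative assumption rather than Jeff's actual system: the class name, the hard surprise threshold, and the unit-variance Gaussian "object models" are all stand-ins for the Dirichlet-process-style machinery he alludes to.

```python
# Toy sketch of surprise-triggered model expansion, in the spirit of the
# "turn learning on when surprised" / Dirichlet-process idea discussed in
# this conversation. All names and thresholds are hypothetical.

class ObjectLibrary:
    def __init__(self, surprise_threshold=6.0):
        self.means = []                  # one simple "object model" per known object
        self.threshold = surprise_threshold

    @staticmethod
    def _surprise(x, mean):
        # negative log-likelihood of x under a unit-variance Gaussian object model
        return 0.5 * sum((a - b) ** 2 for a, b in zip(x, mean))

    def observe(self, x):
        """Return (object_id, is_novel); spawn a new object model when the
        observation is too surprising under every existing model."""
        if self.means:
            surprises = [self._surprise(x, m) for m in self.means]
            best = min(range(len(surprises)), key=surprises.__getitem__)
            if surprises[best] < self.threshold:
                # familiar: nudge the matched model toward the observation
                self.means[best] = [0.9 * m + 0.1 * a
                                    for m, a in zip(self.means[best], x)]
                return best, False
        # surprising: instantiate a brand-new object model to explain the data
        self.means.append(list(x))
        return len(self.means) - 1, True

lib = ObjectLibrary()
print(lib.observe([0.0, 0.0]))   # (0, True)  -- first observation is always novel
print(lib.observe([0.1, -0.1]))  # (0, False) -- close to a known object
print(lib.observe([9.0, 9.0]))   # (1, True)  -- unexplained, so a new object is spawned
```

A genuine Dirichlet-process prior would make the spawn-a-new-component decision probabilistic rather than a hard threshold; the threshold just keeps the sketch short.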

>> That said, I worry less about that than some people do, because people are remarkably adaptable, right? There are all these arguments about how a new technology comes along and is going to completely destroy some way of life, and that will be awful for people, and maybe it is in the short term. I think of tractors. How many hundred years do you have to go back to when 99% of people were involved in agriculture? Now it's, what, two percent? I consider that a solid improvement, because it allowed the rest of us to do a bunch of other things we find more satisfying and more interesting. I can spend some time reading a book; I don't have to labor in the fields all day. That's the future I see and hope for: one in which all of these artificial agents running around doing things autonomously are there to free us up to pursue

more interesting things, to improve ourselves in more interesting ways. But at the end of the day, at least initially, it will just be another technology, like the tractor. A hundred years from now, who knows? >> What will the value of work be if the AIs can do everything and there's nothing left for us to do? >> I don't think it will ever be the case that the AIs can do everything. Like I said, the future I worry about is one where the sole role of people is sitting around making sure the AIs aren't going rogue, which I don't consider a good outcome. I would really like to see human improvement. I envision a future of, I don't know, cybernetic transhumanism, if I'm going to go sci-fi on this: one where the technology and we evolve together in a way that's beneficial for both. That's the goal. Are there dystopic possibilities? Well, what are humans in a

world where everything can be done by a robot? >> Yeah. >> That's a good question. In that scenario, people end up just becoming reward-function selectors, saying, "Oh, I don't like this and I do like that." That's another nightmare scenario. But I don't like dwelling on these dystopian futures, because honestly I think people are too clever, too motivated, and too interested in how the world really works and in actually understanding things. I think AI will become a partner, not an adversary or a crutch. That's a statement more about my belief in humans than about the development of AI. I am a techno-optimist, if you will, not a pessimist. I believe we will find a way to adapt to an ever-changing world, as we have done for millions of years, including a world with technology that alleviates most

of our labors. >> On that note, there's an AI literacy issue. AI has moved so quickly that certainly my parents don't understand anything about it, but by the same token, policymakers don't understand anything about it either. There are people saying AI is going to kill everyone, people making negative arguments, people making positive arguments. There's a bit of a fog of war right now, because so many people are saying so many different things about AI. How should people make sense of it all? >> We are now well outside my area of expertise, so I'm going to say that before I say anything else. AI is developing very quickly, but I am much more concerned about what people will do with the new technology than about what the technology will do all by itself. I don't really believe that Skynet is going to take over, or that the internet is going to suddenly become conscious and kill us all, in part because AI is not that advanced, but also because we are still in the position where we specify the

goals of the system, and that will likely continue for a very long time. These systems will be subject to review; we will keep an eye on them, and at least initially they will be released in relatively restricted domains where we're keeping close watch on what they are and are not doing. So I don't worry too much about them going rogue. I worry a lot more about somebody building, say, some insane virus that takes down the internet, which is a class of problem we already have to deal with. I'm more worried about malicious human actors than malicious AI actors, because at the end of the day, all of these algorithms simply do what they are told: we train them, we tell them, "Here's your objective function." As long as we are specifying the objective function, and we understand the objective function, we're probably going to be okay. I think the

safest way to deal with AI concerns is to tell people: look, this AI is just doing what we told it to. We set it up to make really good predictions and to achieve certain outcomes. Now, is it dangerous to specify those outcomes without being very, very careful? Yes, it is. That's the whole "hey Skynet, end world hunger" scenario, where it ends world hunger by killing all the humans. That is a real possibility. But whose fault was that? The fault lies with the person who very naively specified the goals. There are, in fact, relatively straightforward ways to specify the reward function that don't run that risk nearly as badly. The best one, I think, is maximum entropy inverse reinforcement learning; I like to call it active inference, because it's really similar. What you're doing there is observing someone's policy and then doing maximum entropy inference on the reward function itself. At the end of the day, what ends up happening when you do this is

why it's basically just active inference. You have some organism or other system, you're trying to do this for it, and it has some stationary distribution over actions and outcomes, over its inputs and outputs. That stationary distribution becomes your reward function. Not directly, there's some math involved, but basically your reward function is a function of the steady-state distribution over actions and outcomes. So we could do this: we could take the current manner in which humans are making decisions and write down our current estimate of the stationary distribution of our actions and outcomes. This would include things like how many people are going hungry, all the stats that describe the inputs and outputs of our collective policy decisions. And then we could just tell an AI: your reward function is the one that results, on average, in the same outcomes we currently have. It would execute that, and to the extent that it works, it would ultimately result in an AI algorithm that just sort of is

like mimicking human behavior, right? Or at least achieving the same outcomes we were achieving before. Now, here's the safe way to improve the situation. You don't say "end world hunger." You perturb that distribution >> over outcomes, right? Just over outcomes, a little bit >> and then you evaluate the consequences. That's all you're doing. You make little changes to an empirically estimated reward function, rather than specifying one by hand, because that's the dangerous thing. >> Jeff, thank you so much for joining us today. >> It's my pleasure. >> Amazing.
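The empirical-reward recipe from this final exchange can be sketched in miniature. This is a deliberate simplification: real maximum entropy inverse RL recovers a reward over states and actions from demonstrated trajectories under the full dynamics, whereas the toy below just takes the log of an empirical outcome distribution as a status-quo reward and then nudges it slightly, which is the "perturb, don't rewrite" safety idea. The function names and the outcome labels are hypothetical.

```python
import math
from collections import Counter

def empirical_reward(observed_outcomes):
    """reward(o) = log p_hat(o), where p_hat is the empirical distribution
    of observed outcomes -- the status-quo-matching reward."""
    counts = Counter(observed_outcomes)
    total = sum(counts.values())
    return {o: math.log(c / total) for o, c in counts.items()}

def perturb(reward, outcome, delta):
    """Nudge the reward of one outcome by delta: a small, reviewable change,
    rather than specifying a whole new reward function by hand."""
    r = dict(reward)
    r[outcome] = r.get(outcome, min(r.values())) + delta
    return r

history = ["fed"] * 90 + ["hungry"] * 10       # hypothetical outcome log
r = empirical_reward(history)                  # status-quo reward
r2 = perturb(r, "hungry", -0.5)                # mildly discourage "hungry"
print(round(r["fed"], 3), round(r["hungry"], 3))   # -0.105 -2.303
```

An agent maximizing `r` would, on average, reproduce the observed outcome frequencies; maximizing `r2` would shift them slightly away from "hungry", and the consequences can be evaluated before perturbing further.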