Beads - Structured Memory for Coding Agents - Brandon Harvey: AI in Action 13 Feb 2026 — benchmark.space

Latent Space TV (see @LatentSpacePod for Pod)


2026-03-18 48min 182 views

Channel: Latent Space TV (see @LatentSpacePod for Pod)

Date: 2026-03-18

Duration: 48min

Views: 182

URL: https://www.youtube.com/watch?v=LJFL6bYyGHg

Latent Space AI in Action Weekly Jam - 13 Feb 2026

Brandon Harvey demos "Beads" (steveyegge/beads) - a persistent structured memory system for coding agents with dependency-aware graph storage. Also demonstrates "ergo" (sandover/ergo), a CLI tool for writing specs that coding agents can execute.

Timestamps:

0:00 - Introduction

2:00 - Brandon Harvey introduces Beads

5:00 - Persistent structured memory for coding agents

10:00 - Dependency-aware graph storage

20:00 - Beads CLI demo

30:00 - Ergo C

You on the treadmill, Swigs?

>> Certainly looks like it.

>> Yeah.

>> Obviously I'm not a regular here, but I just had some time today. So does anybody MC these, or do we just talk?

>> A little bit of both. Usually Kal will give a little bit of an intro. I saw him in chat, so he should probably be around shortly. How are my levels? Can you guys hear me, by the way? I'm not sure. Sometimes it kind of sucks.

>> You're good.

>> We're good.

>> Cool. Yeah, let's see. Let me see if I can get my recorder up.

So yeah, I suppose I'll take over on the spiel until Kal gets here. We might give it a couple of minutes just to see if more people trickle in. I wonder if the ping went out too. Okay, Kal's on his way. He's got a previous meeting. AI in Action: we have Brandon Harvey.

>> Oh, that is me.

>> That's you. Yeah. What are you showing for us today? What you got?

>> Well, I want to talk about planning workflow. I have kind of high-level thoughts, and then I have a concrete instantiation of how to put those thoughts into action, and I have a tool that I built to help me realize that concrete instantiation. So I'm going to start broad, go medium, and then narrow down on a specific tool that I want to show you.

>> That sounds perfect. Says, "Kal's still on a meeting." Should we...

>> I just got here. Don't wait for me.

>> "Just arrived." Yeah.

>> I just got here. But generally, don't wait for me. I just like to post that because sometimes people wait for me to MC even though Yikes has got it. David could do it. All y'all can do it. Just go.

>> We just do things. That is what we do.

>> I love it. You don't need me anymore, and that's how I like it.

>> Well, the same fate for all of us pretty soon, I'm sure. But in any case...

>> Cool.

>> So, Brandon, have at it. What you got for us, man?

>> Hey, my name is Brandon. I've worked at startups and Meta and things in between. I'm an engineer and a product manager, so I'm very excited to be vibe coding a whole bunch right now, working on a number of different projects. I'm going to talk to you about planning, because in my experience of the software development life cycle over many years, I think that in general there's a pattern that humans follow. First you specify what you want to build. There's a question of why, and whether it's the right thing, but at some point it comes down to a spec, which is a PRD or a technical architecture. Those things I would refer to as the spec. And if you've worked on a medium to large project, you know that there's then a process of backlogging or planning, which is how you devolve the spec into a sequence of tasks. Then you get into the question of how those tasks are grouped, how they are allocated, how you fit them to the right people, all that stuff.

One observation I'll make is that the backlog and the spec are two different things, and you want to think about them separately. It's good to have your priorities to go back to, and it's good to have your Jira or whatever, and to keep those things in correspondence, but they're not the same thing. Another observation is that you learn a lot about the problem domain from building the backlog. Typically the backlog loops back in and makes your plan a bit better.

Everything I'm saying applies to agents as well. Agents and plan mode, the implementations of plan mode that we get today, I think are pretty inadequate, because they blur the distinction between having a spec and having a backlog. Or they respect that distinction and have a really shitty implementation of a backlog: maybe a bunch of markdown files, or data that is internal to the agent harness itself and not accessible by other agents, other agent instances, or other model providers. So it's like, where's the backlog?

When you have a backlog made of tasks that can have dependency relationships (this must be done before that; this has acceptance criteria that are separate from those acceptance criteria), when you have that kind of chain of tasks, suddenly you have a plan that's much easier to parallelize. You can make much more robust invariants, like: when this task is done, these things must be true; these tests must run. You can set policy around tasks. You have something that has a skeleton. So it works really, really well to give your agents a structured backlog. It works better than giving them, or letting them generate, a whole bunch of confusing markdown documents. If you are using subagents, or your agent is using subagents, it's really great when the subagent wakes up in the morning and has a crisply defined task to work on, rather than having to navigate its way around a folder full of 20, 40, or 50 different markdown files.

So this is why Steve Yegge, who you may know as a loquacious engineer blogger, very OG at Google and Amazon and other places, diagnosed this problem and came up with a solution, and it's called Beads. I'll just show you his GitHub project. There we go. So this is Beads, and it's got this really beautiful interface to it. What you do is you install it, you go into your project, and you run bd init. bd is the Beads command-line tool. And then you tell your agent: use that for task tracking, use it for planning. You might want to spell it out a little more than this one phrase, but basically that works. Then, in place of its own internal planning tool, and in place of creating a bunch of markdown files, what your agent does is write structured tasks, with dependency relationships and epics for grouping, into a little beads folder that lives inside your repo. So it's like you had a little local Jira or a little local Linear kind of task planner inside the repo. Agents write the plans, agents read the plans, and you as a human can optionally go look at that.

The spec stage still happens. Then you tell your agent: break down this spec, this TDD, whatever it is, into a carefully sequenced collection of tasks. And you can specify how you want each task to be sized. For example, one pattern is: make it so each task is about a good size for a git commit, or make it so each task is something a human could do in about an hour. And you can specify: make it so each task has really clear acceptance criteria and automatic validation gates, so you can verify that it's done without consulting me. Or: if a task can't be verified or validated by you, stop and ask me, the human, to validate it for you, and be really clear about that.

Once you have a plan like this, 20, 40, 50, 100 beads, then you can just point an agent or multiple agents at it and say, "Implement this plan. Go." And you get beautiful one- and two-hour chunks of work, and you can watch your agent knocking down the backlog. This is not Ralph Wiggum, but it is highly Ralph Wiggum compatible. In fact, if you want to do Ralph Wiggum, this would be a much better way to keep your model on rails.

Have I lost anybody so far, or is this pretty reasonable? Is this pattern controversial, uncontroversial, confusing?

>> Chat seems to be on board with versions of this.

>> Thank you. Because I'm screen sharing, I'm realizing I have lost track of the chat, so let me see if I can remedy that. "Beads is very controversial." Yes, it's controversial, and I'll tell you about that. I'm totally annoyed with Beads. I think Yegge is insane, because the bloat in this project is incredible. If you run bd --help, it's like five pages long now. Beads is really slow. For his representation of the data on disk, he chose a SQLite database, but in parallel to the SQLite database that's in your repo there's a JSONL file, and every 5 seconds there's a synchronization task keeping them in sync. Beads has a daemon running in the background. There are corruption issues, there are versioning issues. So you update Beads and suddenly you go into an older project, two weeks older, and Beads says, oh, your database is old and can't be updated. Beads is an awful project, but he had a great idea.

So it's the era of vibe coding, and we're going to talk about ergo. I wrote ergo. Ergo is the exact same thing. Here's the quick start. You brew install ergo. You put an instruction in your AGENTS.md. Ergo stores its plans in your repo, in a directory called ergo, as a JSONL file. There is no SQLite database. There is no daemon. Get out of here with that. The JSONL file is append-only, and there's simple file locking around that. So it's concurrency safe, parallel-agent safe, and everything is plain text forever. It's hard to destroy data with ergo. No, I mean, SQLite is great. I'm just not going to commit SQLite to my git repo just to keep track of my tasks.

>> I am very much on board with markdown all the things. It's just that chat was very pro-SQLite.

>> Yeah. No, SQLite's great, but for this, it's markdown in, markdown out. And if you do run into a contention issue, which I have, or a worktree [ __ ] up, agents are easily able to resolve merge conflicts in this JSONL file. So the usage model around ergo is highly non-destructive. When you write a task, that's an append. When you change the state of a task, such as going from ready to claimed, that's an append. Most sets of the state of a task are appends. The only two destructive operations are compact, which removes all prior versions of a task's state from the history, and which you can optionally run, and prune, which actually removes all done and canceled tasks from the history. Otherwise, if you don't run ergo compact or ergo prune, you will literally never lose any information via ergo, and the representation is still quite small. I wrote it in Go, and I don't know Go.
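The append-only model described here can be illustrated with a minimal sketch. This is not ergo's code (ergo is written in Go, and its on-disk schema is not shown in the talk). The `id` and `set` field names, the state strings, the `EventLog` class, and the lock-by-exclusive-create approach are all assumptions made for illustration. Only the ideas come from the talk: one JSON object per line, every write is an append, and compact and prune are the only operations that rewrite history.

```python
import json
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def file_lock(lock_path, timeout=5.0):
    """Hold an exclusive lock by being the process that creates lock_path."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(0.01)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


class EventLog:
    """Append-only task log, one JSON object per line (hypothetical schema)."""

    def __init__(self, directory):
        self.path = os.path.join(directory, "events.jsonl")
        self.lock_path = os.path.join(directory, "lock")

    def append(self, task_id, **fields):
        """Creating or updating a task only ever adds a line."""
        line = json.dumps({"id": task_id, "set": fields})
        with file_lock(self.lock_path):
            with open(self.path, "a", encoding="utf-8") as f:
                f.write(line + "\n")

    def events(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def replay(self):
        """Fold events, oldest first, into the current state of every task."""
        tasks = {}
        for event in self.events():
            tasks.setdefault(event["id"], {}).update(event["set"])
        return tasks

    def _rewrite(self, keep):
        """Destructive: replace the log with one line per task passing keep()."""
        with file_lock(self.lock_path):
            tasks = {i: t for i, t in self.replay().items() if keep(t)}
            tmp = self.path + ".tmp"
            with open(tmp, "w", encoding="utf-8") as f:
                for task_id, fields in tasks.items():
                    f.write(json.dumps({"id": task_id, "set": fields}) + "\n")
            os.replace(tmp, self.path)

    def compact(self):
        """Drop superseded versions, keeping the latest state of each task."""
        self._rewrite(lambda task: True)

    def prune(self):
        """Drop tasks whose latest state is done or canceled."""
        self._rewrite(lambda task: task.get("state") not in ("done", "canceled"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        log = EventLog(d)
        log.append("T1", title="Define requirements", state="ready")
        log.append("T2", title="Write parser", state="ready")
        log.append("T1", state="claimed", agent="codex@laptop")
        log.append("T1", state="done")
        print(len(log.events()), log.replay()["T1"]["state"])
        log.compact()
        print(len(log.events()))
        log.prune()
        print(list(log.replay()))
```

In this sketch the current state of a task is never stored directly; it is recomputed by replaying the lines in order. That is why an unpruned, uncompacted log doubles as a full history.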

I've never written a Go program in my life. I just wasn't privileged enough to do that. But now I can write Go, and there's more testing in this ergo project than all the tests I ever wrote in my life put together as a software engineer, because AI is so good at writing tests.

Then there's also a CLI. If you run ergo list yourself, which you can, here's an example of a plan, and here's ergo show, an example of looking at a particular task. We generate a little task ID for each task; that's in the right-hand column. The E's are for epics. The check marks are done. O is for ready tasks. The small dots are not ready, and there's a little hourglass that shows what that task is depending on, or what it's waiting on. And again, if you want to see what a task is, you can ergo show that ID, and here's the body of a task. And ergo prune. The rest of the CLI interface is basically for your agent to use, and your agent can easily learn it. It just runs ergo --help and then it's off to the races.

So that is ergo, and I have been using it heavily in production for the last two months, running all my projects and all my life through it. It's pretty simple. It's pretty small. I'd be happy if you used it. I'd be happy if you contributed to it. But even if you don't, I hope I went some way toward convincing you that treating backlogging and planning as an intermediate step between specifying and implementing is a really good pattern, even in the era of agents. In fact, it gives you the potential to use dumber agents, run agents in parallel, be more confident in the work your agents are doing, and allow your agents to work for a longer period of time. And it gives you better introspection into where we are in the plan, exactly. Not where are we in this markdown file, but where are we in the task plan. That just feels good, and sometimes it's really important.

So I will stop there. I've been talking for 15 minutes. Happy to discuss this. I'm also happy to demonstrate the CLI if you want to see it made real. We can just jump to the terminal, but I'll stop.

>> Demo.

>> Sure. I'm in the ergo project. I'm going to run Codex. I'm not going to update Codex. I'm going to pick the Spark model. "Go really fast and learn ergo with ergo --help. Very quickly write out a fake plan to build a feature to add cowsay to ergo." Fake because I didn't have an idea for a feature I wanted to add. I should have prepared an idea for a new feature we could add to ergo, but cowsay was the first thing I thought of. Okay. Ooh, help says the planning... Yeah, I'm sorry. Codex Spark can be a little literal-minded and get its blinders on. "Did you actually write ergo tasks?"

"Go do that." Okay. While it's cranking on that, I'm going to open up another tab here and do ergo list. So here's what it's working on. Oh, this is some other demo junk. Ignore "demo epic" and "orphan to-do task." That was my own testing. So here's the cowsay feature: "Define feature requirements." I'll grab the task ID and do ergo show on that.

Okay. The structure you see here is not part of ergo itself: the acceptance criteria, the automated validation gates. This body can be anything. I have a sentence in my AGENTS.md file somewhere that says: when writing ergo tasks, please include these sections. So this is structure that I brought; it's not built into ergo. You can impose any structure you want. This is just how I like things to be. This pattern of automated validation gates is really great in my opinion, because I don't want the model to stop me and ask if something is right unless it's important for the model to stop me and ask if something is right.

So that's the first task. I'll show you another feature. The epics have an ID too. If you ergo show the epic ID, it's going to concatenate all of the tasks in that whole epic, so you can copy and paste the whole epic's plan. It's readable and pleasant, and it's in the right order. What else? This is ergo --help, which is a complete reference.
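As a rough illustration of the two ideas shown in the demo (a task is ready only when everything it depends on is done, and an epic can be printed as one document with its tasks "in the right order"), here is a small sketch. It is not ergo's implementation. The record layout, the `todo` state, and the function names `waiting_on`, `is_ready`, `epic_order`, and `show_epic` are hypothetical, and the ordering uses Python's standard `graphlib` topological sort as a stand-in for whatever ergo actually does.

```python
from graphlib import TopologicalSorter

# Hypothetical task records; ergo's real schema is not shown in the talk.
TASKS = {
    "E1": {"kind": "epic", "title": "Add cowsay output"},
    "T3": {"kind": "task", "epic": "E1", "title": "Wire up the CLI flag",
           "state": "todo", "deps": ["T2"],
           "body": "Acceptance criteria: the flag appears in the help text."},
    "T1": {"kind": "task", "epic": "E1", "title": "Define feature requirements",
           "state": "done", "deps": [],
           "body": "Acceptance criteria: requirements are written down."},
    "T2": {"kind": "task", "epic": "E1", "title": "Render the speech bubble",
           "state": "todo", "deps": ["T1"],
           "body": "Validation gate: unit tests for the renderer pass."},
}


def waiting_on(tasks, task_id):
    """Dependencies of task_id that are not done yet (the 'hourglass' list)."""
    return [d for d in tasks[task_id]["deps"] if tasks[d]["state"] != "done"]


def is_ready(tasks, task_id):
    """A task is ready when it is not started and nothing blocks it."""
    return tasks[task_id]["state"] == "todo" and not waiting_on(tasks, task_id)


def epic_order(tasks, epic_id):
    """Order an epic's tasks so every task comes after its dependencies."""
    members = [i for i, t in tasks.items() if t.get("epic") == epic_id]
    graph = {i: [d for d in tasks[i]["deps"] if d in members] for i in members}
    return list(TopologicalSorter(graph).static_order())


def show_epic(tasks, epic_id):
    """Concatenate the epic title and every task body in dependency order."""
    parts = [f"# {tasks[epic_id]['title']}"]
    for task_id in epic_order(tasks, epic_id):
        task = tasks[task_id]
        parts.append(f"## {task_id}: {task['title']}\n\n{task['body']}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print([i for i in epic_order(TASKS, "E1") if is_ready(TASKS, i)])
    print(show_epic(TASKS, "E1"))
```

Deriving readiness from dependencies, rather than storing it, means marking one task done can make several others ready at once. That is what makes a plan of this shape easy to hand to parallel agents.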

Down here at the bottom, for agents, it also gives hints. Agents have a --json version of all these commands. So if you do ergo list --json and you're an agent, you get this, which is a list of all the tasks in a structured form. Other advice for agents: they pass --agent when they're claiming. When a task moves from ready to being worked on, there's a claim command, and the claim command says who claimed this. So you could say a Codex agent at this host name claimed this. You could lean further into this feature, and if you had multiple agents on multiple machines, or multiple contributors to a project, you could potentially have some provenance: okay, that task is being worked on by whom?

>> Good question from the chat: what is the scope of the ergo store? Is it cross-project, one folder, worktrees, etc.?

>> The way it's designed is that it lives at the top level of the repo. That's where it looks for it. There is a --dir flag, so you could have it live somewhere else. You just have to make sure to tell your agent: always use this --dir flag, because I keep my ergo plans over there. I'll go into ergo and show you what's in there. It's pretty simple. There's a lock file that I use to mediate concurrency, and there's events.jsonl. I'm going to cat events.jsonl. Let's see what's in there. It's each task and each update as one line. These right here are depends links. I mentioned that I found it to be good if you can express a notion of task dependency: one thing must be done before another thing. So there's ergo sequence: tasks should happen in this order, A should happen before B. Again, your agent takes care of this. You don't do this, though you could.

The other question is: is it meant to be committed? It's agnostic. You can put it in .gitignore or you can commit it. I tend to commit it. As my dudes are working on tasks and marking them done or whatever, that dirties the events.jsonl file; it accrues those state updates. I tend to give them a little instruction to just commit that at the end of an epic. That's been my pattern lately and it seems to work well. You could commit it with every commit, or you could, again, gitignore this thing. And yes, if you're working on different branches, they will have different versions of the plan, different state updates, and then those will merge. Because JSONL stores one blob of JSON per line, those are lines, and Git is great at lines. I've not had any difficult merge conflicts, because the models kind of know how to do that. Similarly with worktrees. I've only used worktrees a little bit, through the new Codex app interface for worktrees, and I stubbed my toe a couple of times and gave up. But I would think that the same merge logic applies. You could have multiple different projects write to the same ergo store. There's nothing stopping you. What else? More questions?

>> So I guess that leads to a follow-up question: how hard have you pushed this? How wide and deep have you gone? How many subagents, how many agents at once? How hard have you stress-tested the system?

>> I have not gone full agent swarm myself.

>> Okay.

>> I tend to parallelize agents up to the capacity of my ability to keep track of them.

>> Valid.

>> So this morning I had four going, and there were some little conflicts, which they resolved. So I've done it with a handful. Concurrency has been tested up to the handful level.

>> And you were using this in the demo with the latest Codex in the Spark version. Have you tried seeing how cheap of a model you can get where it's still able to follow these instructions?

>> I haven't gone below Codex 5.3 medium for anything, generally.

>> Sure.

>> So your mileage may vary. I will say Codex Spark, based on my 24 hours of experience with it, is not doing a great job of following instructions. It may be good at coding, but I'll give you an example. In my AGENTS.md file I always have: I want conventional commits, I want you to describe what the change was, and so on and so forth. And Codex Spark, even on high, is repeatedly giving me one-line commit messages. And I was like, you didn't follow our commit policy, did you? So I really don't trust Spark to write any plans. I may trust it to implement some plans. We'll see about Spark. Other questions?

I had a former CTO who, like 10 or 15 years ago, was like, "Yeah, Git is a knife with no handle." Which is why I really like that agents are now my Git handle. I'm comfortable in Git, but they're way more comfortable than I am. Anything else? I'm just checking the chat. I think worktrees are nuts. I don't think worktrees are ready for prime time. I'll just say that.

I'm going to stop screen sharing. I think we've reached the natural end of my talk. But yeah, I'm new in this community, and I'm still struggling with Discord, so thanks to those of you who helped me schedule this talk and connect to it.

>> Thanks, AI in Action bot. I am curious, because I know you mentioned specs and planning versus the backlog. So I see that with ergo or Beads or whatever, we can make tickets, we can do dependencies for the backlog. How does that relate to how you formulate a spec? Or are those basically siloed processes for you?

>> For me, they're siloed. Right now my process of creating a spec is: I love Codex and I also like Opus's taste, so I use both. I'll start in one on an idea for a feature, or a plan for a TDD, a technical design document, and maybe a product vision or feature vision, keeping the product thinking separate from the implementation. But I'm clear with the model: we're not implementing this, this is not a backlog. We're trying to make this feature, or whatever it is, the best it can be. Then I will give it to the other one and say: this is a plan for a feature; evaluate it from the perspective of a CTO or staff-level engineer and see if you can make improvements. Measure twice, cut once. They always manage to make improvements, no matter what, and they're good improvements too. So I love the pattern of ping-ponging between different models a little bit until I have a TDD and a feature definition that I love.

And then, I forgot to mention this, I've created a skill, an ergo skill, which I can share with you. I can invoke it with a forward slash in Codex or whatever and say: read this TDD and use ergo to plan it. And in the skill is the way I want that plan to be shaped. I will post a link to that skill in the channel. Actually, let me see if I can find it now. I have to find my skills repo. That just makes it really quick. Then it sits around for 10 to 15 minutes and it cranks. And again, measure twice, cut once. Once it's done that, I either start a fresh context window or, if it hasn't used up too much context writing the plan, I tell the same agent: evaluate this plan with fresh eyes, from the perspective of whatever, and make improvements. And again, the first time you ask for that, it always makes improvements that you really did want. So it's a really good practice to do that second rev. Let's see.

I'm posting a link to the skill in the chat. I'll also post some links to follow up on this in the Discord conversation. If you look at that file, you shouldn't see anything too surprising.

>> Okay, so it's, yeah, "turn a feature request..." Yeah. So the skill there is potentially the spec-to-ticket bridge here.

>> Yeah.

>> That's what it looks like.

>> Yep.

>> Nice.

>> Though even before I developed the skill, which was only within the last two weeks, I was still getting good results with ergo just from my own guidance. This just puts it even more on rails.

>> Yeah. I think one of the things that I struggled with with Beads... because I tend to keep going back to the Manuel refrain of "just whip out a SQLite DB and do it with tables." It feels like: is there too much here? Is this too bulky? Is there too much added complexity? So I'm curious: since ergo is entirely JSON, presumably the focus is to keep it lightweight. Have you made any decisions where something sounded cool, but then you were like, wait, actually I shouldn't add the unnecessary complexity here, and I should leave it?

>> I went with real minimalism here. Text is the ultimate interface. Text is not versioned. Text will work for a century. On the feature set, there's only one verb... well, there are two verbs, ergo claim and ergo sequence, which are syntactic sugar over the set operation. For everything else there is the set operation. If you want to change the state of a task, or change the title, let's say, there's no set-title command. That's a set operation. This keeps a minimum of code paths, a minimum of test surface area, a minimum of command surface area. It makes it quicker for the models to learn the tool, and easier to reason about. So I debated whether to even have ergo claim and ergo sequence as separate commands. But for token efficiency I put them there, and because they're frequently run and it's nice for humans to be able to see that that's happening. For everything else, there's ergo set. So yeah, I tried to be as minimalist as I could and yet accommodate my own practice. Like, I think ergo claim is a really good atomic operation, and I want to be able to see when that happens. I'm always open to debating this, but to me the CLI is a UX. I treated this as a user interface problem. I want to have the best user interface I can over the simplest possible core. So that was what guided this project.

>> I'm going to raise a question that is a little bit of a hobby-horse question. I ask a lot of people this, but you're new, so I want to get your perspective as well. I think one of the big challenges we're all facing in agentic development is not getting code written, but understanding what has been written and understanding how our systems are evolving. I'm curious how you tackle that, and whether ergo has any support for it or it's a different layer and unrelated.

>> I will say it's not a general answer to that problem, and I have thoughts on that problem. But I will say that with ergo list, just reading the names of the tasks and being like, oh yeah, we're going to do that, then we're going to do that, yeah, that makes sense... it makes me think of things like: well, if we're going to do this and this, have you considered whether that's the right framework? It leads to conversations with my agent when I can see the plan. It's nice to be able to see things at different levels. This is the way humans think. So there's the feature, which is the 5,000-foot level. Then there are the epics; the title of the epic tells you something. That's another layer down the stack. Then there are the titles of the tasks. That's another layer down the stack. And then there's the body of the task, which is another layer down. So you can progressively disclose to yourself, with your human meat brain, how much you want to get into it. As opposed to, again, maybe it's just me, but these markdown files: oh [ __ ], I made another 4-kilobyte markdown file for you to look at. It's like, please. There's only one level of depth there. I either open the markdown file and read the whole thing or I don't, and [ __ ] me, I'm not doing it. So this helps. This is one thing that helps. I think there are lots of other things that may help. All right, David's clapping. I think I'm done. I'm going to show myself out.

>> All right, we're a little...

>> That wasn't a leave. That was just "I like what you said." And to one of my hobby-horse questions: in the process of building this out, you were inspired by Beads, and you have some ideas about keeping it super minimal. Do you have any other features you're considering, or any rough edges you've run into? Effectively I want to know about the dogfooding of it. What's in the ergo for ergo?

>> Yeah. In my post-Christmas development, I haven't needed anything. It's been working for the task that it does. So I'm really curious to know if this conversation stimulates ideas for folks, but at the moment it's stable.

>> So very much like classic Unix tools. They don't change much because they do the one thing, they do it well, and they're stable.

>> Yeah. I will also say that I expect that the shelf life of this tool is limited. This will not last you for more than another 3 to 6 months, and then the agent harnesses are just going to eat it. And that's actually okay. I had a good time building it. I had a good time using it. And then there's going to be something else.

>> I feel like we saw that with the plan.md strategy, where now most agents at this point have some version of planning a specific session. Before, we had to home-roll a lot of that, and now it's built in. This kind of thing is likely to start being more built in. Yeah, totally see that.

>> Yeah. Also, somebody said the long-term value of the ergo store is mostly good for coordination. I think that depends on your philosophy about your project. You could just never compact it and never prune it, and it would be a complete history of all the state changes of all the tasks in your repo, somewhat like how Atlassian is that system of record for people who use Jira for planning. It's all in there. It's not that big in terms of data, so you could just keep it all, and you might learn things from that. I don't have a theory of how important that is.

>> It almost starts to feel like a changelog.md in a way.

>> Yeah.

>> Do you typically prune them?

Um, I've done a little pruning uh just because I wanted like I use erggo list uh shows you the ready stuff and d-all uh shows you you know more. Um, but at some point if you never prune, olist-all shows you so much that you can't find the the the stuff you actually wanted. So, it's really just to to kind of get it down to a size where I can see the thing I want when I run dall all that makes me be like, I'm going to clean this out. And then I guess kind of related to that, I was talking to Manuel about this a little bit earlier on Discord, but like do you like closely review what gets written in the store? cuz I've noticed like in my own Obsidian vault there's like some notes where like an agent like h that wasn't exactly true and then you're kind of like off on a weird trajectory when future agents reference that and they kind of like make up this you know whole bunch

>> I like to have, you know, when I'm planning I'm using, you know, Codex high or even extra high, uh, sometimes, um, depending on how important it is to me, uh, and how concerned I am. Uh, like when it comes to build system and release stuff and packaging, I use extra high because that stuff is still really hard for the models. Um, so when I'm planning I'm using a high model and I'm asking for a second pass over the plan. Ask for second and third passes over the TDD. Ask for a second pass over the plan to kind of sniff some of that stuff out. And sometimes from a model from a different provider. Um, you know, I'll let Gemini have a look sometimes. Um, >> and I guess it's a matter of hygiene then, if the agent discovers something after the planning phase, to update the plan, right? It's probably kind of on you to make sure it does that. >> Yes. Yeah. Yeah. Though I haven't had too much trouble with that. Um, one failure

mode I have seen is that sometimes, um, agents may learn something in the course of an epic that makes it so that the victory condition of that epic is actually a little different than what we thought at the beginning. Um, and so you can have completed all the tasks but not actually accomplished the thing you wanted, and you have to be alert to that kind of drift and say like, "Hey, is this epic going to get us to where we wanted to go?" Um, and sometimes some replanning, you know, is needed. Usually a tell of that will be that your agent couldn't complete the whole epic by itself and you got bogged down into something and now you're talking about something else. It's like, "Oh, I forgot about that thing." And maybe, you know, some cross-cutting concerns arose or something like that. Um, so not

a bad idea to be like, hey, does this plan still get us to my victory condition, you know, or does it need to be, um, updated? And the model will just go and figure that out for you, >> right? >> Yeah, that's not really like a tool question. It's just like any tool you use for this kind of thing, you'll have to be thinking about that. >> Ergo is text only, so it can handle things at the level of UI design description. Um, you can put ASCII diagrams in there. Um, uh, design briefs, but, uh, it doesn't support, uh, images. You could absolutely >> Another question. So, you mentioned, um, you find yourself sort of tapping out at about, I think it was, four agents in parallel. Um, and I mean, I think many of us are still grappling with this as well, and it feeds back into those questions of how much of our mental model can we update and how are

we updating it, or are we full YOLOing things, or what have you. Um, but I'm curious what you find are the limiting factors for you in terms of being able to manage that spread. >> The context switches are tough, especially as I've been deep in build and release, because I have an interprocess communication library that I'll talk about some other time, where I'm trying to release for Windows, Linux, and Mac; CLI, Rust API, C API, Go, Node. So what's that, a combinatorial explosion of 20 things? Um, and, uh, I come back to that conversation 5 minutes later, 10 minutes later, 2 hours later. Um, and I'm like, you know, explain to me in lay terms, what the [ __ ] are happening right now? You know, what are hooks and gates and, um, smokes and whatever. So I have to have the model dumb down for me the thing that I

started out doing, because my context has rotted because I've been thinking about other things. Um, so yeah, my ability to understand and absorb and just make that context switch myself such that I can suggest the next right option. Codex has a nice pattern of being like, hey, here's some issues or, you know, things I flagged, as a bullet-pointed list or a numbered list, and I'll be like, I need you to advise me more on the trade-offs of each of those and remind me why those things are important again. Like, I'll kind of try to get it to be like, walk me through this decision. Um, because I lose the ability to, um, I'm just, I'm multitasking. I'm focusing on other things. So I'm not the focused engineer who's going to make that right choice. Um, again, fortunately, if you elicit that from the model and say like, give me good consult on this, these are my true underlying goals. This is what I care about. For example, this is a proof of concept, um, don't talk to me about smoke

tests, you know, versus like, I want this to be bulletproof, um, and perfect and robust to all security, um, issues forever. You know, those are very different conversations. So, you want to keep steering the model, um, toward your underlying goal. Uh, whereas it will bog itself down in minutiae and fall into a rut of doing things at a certain level of engineering, right? [snorts] But I don't have a deep theory of the multitasking problem. You know, we're getting into the realm where the human becomes the bottleneck, and like, what's the shape of that bottleneck? >> How much work can you pass through the human? We're going to find out. >> Well, and I mean, Yegge came up a couple times in this conversation, but I don't know if you all read his like AI vampire

thing, but like, you can only think about so many things in a day. You can only context switch so many times. And so, yeah, it is a real question, but then it does say like, okay, are we operating at the right level of abstraction here? Should we be operating mostly at the level of specification and not even looking at the code when it comes back to see if it fit our, you know, right? Like, can we create some way to better do a validation at the higher level? Um, I don't know, but yeah, I mean, I think we're all experiencing that we are the bottlenecks in our own systems right now. Yeah, great conversation. These are great prompts, great questions. Um, anything else we should touch on today? Anything else on your mind? >> Cool. >> Not seeing any. Yeah, let's run. Thank

you, Brandon. This is great. I really appreciate, uh, you jumping in as someone newer to the Discord. Like, awesome. Uh, we want to hear from you again, I hope. Um, and to everyone here, bring your projects, bring your stuff. Uh, you know, our mantra that was accidentally coined, and now I love it and we stick with it. You know, it doesn't have to be, um, useful, doesn't have to be polished, it just has to be interesting. So, whatever you want to bring for us, bring it. And to riff on that, um, if you see something interesting on the channel, someone shares one of their projects, like show your work, definitely like start a thread and be like, "Hey, you want to talk to the group about it and get some thoughts?" Um, and I know, uh, David, you're up next week if you wanted to give us a little, uh, teaser. >> Uh, yeah, for sure. Um, so it's a couple of things. One, you probably saw me talking about Git worktrees in chat. Uh, I don't know, maybe I just missed it, but I have not seen the same complaints uh

from other people about worktrees. Uh, but they made me crazy. So it is how I've fully embraced worktrees, how I fully embraced OpenClaw to create those worktrees, how I have a dev server automatically spin up those different worktrees, uh, on their own dev servers that I can access from everywhere, and pretty much just taken all my development to Discord and my phone, and what that's been like. >> Oh, that sounds hot. >> Awesome. Thanks, David. Thanks Brandon. Thanks Gable. >> Thanks everybody. I guess we will see you next week. Um, use the AI in Action bot in Discord to sign up or ask questions or whatever. You can just ping it and it should know the stuff. And if it doesn't, then make an issue on a repo and, uh, we'll get around to it. >> Or just pull down the repo and solution. You don't even need the Discord tokens. It's got a whole simulation layer so you can do console-level interactions with it and test it.

>> Mhm. which we vibe coded on one of these sessions. So, you know. >> Indeed. Cool. All right. See you guys next week. GG. Bye, everyone. Cheers.