When the Robots Started Optimising Themselves (And I'm Not Sure How to Feel About It)
Andrej Karpathy just casually dropped something on Twitter that’s got me sitting here with my third latte of the day, staring at my MacBook screen and feeling that familiar mix of excitement and low-key existential dread that seems to define 2025.
For those who don’t know, Karpathy is one of the godfathers of modern AI – co-founded OpenAI, former head of AI at Tesla, basically the kind of person who forgets more about neural networks before breakfast than most of us will ever learn. So when he posts about an AI agent that ran autonomously for two days and improved his tiny LLM training process by 11%, making it go from 2.02 hours to 1.80 hours to match GPT-2 performance, people pay attention.
But here’s the thing that’s really got me thinking: it’s not the 11% improvement that matters. It’s that the AI did the entire research loop – try something, measure results, think about what worked, try again – completely on its own. No human intervention. For 48 hours straight. And it actually beat Karpathy’s manual tuning.
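That loop — try something, measure, keep what worked, try again — is just hill-climbing over a configuration space. Here's a minimal sketch of the idea; everything in it (the knob names, the fake timing function, the iteration count) is hypothetical stand-in code, not Karpathy's actual setup, where "run the experiment" would mean launching a real training run and measuring wall-clock time to a target loss:

```python
import random

random.seed(0)

def run_experiment(config):
    # Hypothetical stand-in for a full training run: returns hours to
    # reach the target loss. The knob names and effect sizes are made up.
    return (2.02
            - 0.10 * config["lr_scale"]
            - 0.08 * config["overlap_io"]
            + random.uniform(-0.01, 0.01))  # measurement noise

def propose(config):
    # "Think about what worked, try again" — here just a random tweak
    # to one knob, clamped to [0, 1].
    candidate = dict(config)
    knob = random.choice(list(candidate))
    candidate[knob] = min(1.0, max(0.0, candidate[knob] + random.uniform(-0.2, 0.2)))
    return candidate

best = {"lr_scale": 0.0, "overlap_io": 0.0}
best_time = run_experiment(best)

for _ in range(200):  # the agent ran ~48h; we run 200 cheap iterations
    candidate = propose(best)
    t = run_experiment(candidate)
    if t < best_time:  # measure results, keep only improvements
        best, best_time = candidate, t

print(f"best time: {best_time:.2f} h")
```

The hard part isn't this loop — it's what L5 gets at: keeping the experiment state coherent for hundreds of real, expensive, noisy iterations without the whole thing drifting off the rails.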
Someone in the discussion thread put it perfectly: most agent loops fall apart well before 48 hours from context drift or accumulated side effects corrupting results. The fact that this held coherent experiment state across hundreds of iterations is the real achievement here. It’s the difference between a toddler randomly pressing buttons and a systematic researcher working through a hypothesis.
Now, before you think I’ve gone full doomer, I should point out that this is still optimising a relatively small model. We’re talking about making GPT-2-level performance faster, not creating AGI that’s going to recursively improve itself into godhood by next Tuesday. Several folks in the discussion noted that these optimisations might work beautifully on nano-scale or 1B-parameter models but make zero difference at the 100B or 1T scale. Fair point.
But the paradigm shift is what’s keeping me awake at night (well, that and too much coffee). We’re treating model architecture search as an iterative software problem rather than a theoretical one. It’s “just engineering” now, as Karpathy says. And honestly? That’s both brilliant and terrifying.
The discussion that followed was fascinating. Someone mentioned they’ve been doing this for a year, but their trainings take two hours each, making it impractical for bigger projects on consumer hardware. Another person estimated Karpathy used 3 billion tokens in a month, which translates to somewhere between $10K and $75K in API costs. Not exactly weekend-tinkerer territory, which frustrates me a bit. We’re reaching a point where meaningful AI experimentation is increasingly locked behind serious financial barriers.
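The back-of-envelope arithmetic behind that range is straightforward. The per-million-token prices below are my own hypothetical bounds to reproduce the commenter's figures; the real blended rate depends on the model and the input/output mix:

```python
tokens = 3_000_000_000  # the commenter's monthly estimate

# Hypothetical USD prices per 1M tokens, chosen to bracket typical
# frontier-model API pricing (not quoted from any provider).
low_price, high_price = 3.33, 25.00

low = tokens / 1_000_000 * low_price
high = tokens / 1_000_000 * high_price
print(f"${low:,.0f} to ${high:,.0f}")
```

So the $10K–$75K spread is really just uncertainty about which model (and which price tier) the agent was burning tokens on.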
There’s also the memory problem that someone raised, which really resonated with me from a DevOps perspective. Right now, these agentic swarms are like having a team where nobody takes notes. Each agent runs, gets results, and then that context either bloats the prompt forever or just disappears. You end up with expensive trial-and-error where agents keep trying things that already failed because there’s no institutional memory. One commenter mentioned forking a project to add persistent memory based on cognitive science principles, so agents could actually recall what didn’t work. That’s the kind of engineering that actually moves the needle.
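The fix the commenter describes doesn't need to be exotic. Even a dumb persistent store keyed on a config hash stops an agent from re-running experiments that already failed. A minimal sketch, assuming a JSON file as the store (the file name, key scheme, and API here are all mine, not the forked project's):

```python
import hashlib
import json
from pathlib import Path

MEMORY_FILE = Path("experiment_memory.json")  # hypothetical persistent store

def config_key(config: dict) -> str:
    # Stable hash so the same config always maps to the same record.
    return hashlib.sha256(json.dumps(config, sort_keys=True).encode()).hexdigest()[:16]

def load_memory() -> dict:
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return {}

def record(memory: dict, config: dict, result: float) -> None:
    # Persist after every experiment, so a crashed or restarted agent
    # inherits the institutional memory instead of starting from scratch.
    memory[config_key(config)] = {"config": config, "result": result}
    MEMORY_FILE.write_text(json.dumps(memory, indent=2))

def already_tried(memory: dict, config: dict) -> bool:
    return config_key(config) in memory

memory = load_memory()
cfg = {"lr": 3e-4, "batch_size": 64}
if not already_tried(memory, cfg):
    record(memory, cfg, 2.02)  # e.g. hours to target loss
```

The cognitive-science-inspired version would presumably also store *why* a run failed, so agents can generalise across configs rather than just deduplicate them — but even deduplication alone cuts the expensive trial-and-error.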
What really gets me is the alignment question. Someone pointed out – quite rightly – that autonomously improving swarms sound cool until you realise nobody has a good answer for how to keep them aligned once they start modifying themselves. The response was interesting: use a well-aligned smart AI model to verify the alignment of smarter models, like climbing a ladder. Each rung is checked by the previous one.
That sounds reasonable until you think about it for more than thirty seconds. What happens with alignment drift? Even a small deviation from “ideal” alignment will stack up over enough generations. It’s like compound interest, but for potential catastrophe. We’re essentially hoping that we can verify alignment rigorously enough at each step that errors don’t accumulate. And given my experience in software development where bugs always accumulate in unexpected ways, I’m… sceptical.
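The compound-interest analogy is easy to put in numbers. In a toy model where each verification rung preserves only a fraction f of the previous rung's alignment (the 1% loss per generation is a number I picked purely for illustration), the decay is geometric:

```python
# Toy model of alignment drift: assume each rung of the verification
# ladder preserves only a fraction f of the previous rung's alignment.
f = 0.99  # hypothetical: 1% lost per generation

for generations in (1, 10, 50, 100):
    remaining = f ** generations
    print(f"after {generations:3d} rungs: {remaining:.1%} of original alignment")
```

At 1% loss per rung you're down to roughly 37% after a hundred generations. The point isn't the specific numbers — it's that any per-step loss strictly below perfect verification compounds toward zero, which is exactly why "each rung checks the next" needs the per-rung error to be not just small but effectively zero.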
Look, I’m not a Yudkowsky-level doomer. I think AI is going to bring incredible benefits, and I’m genuinely excited about tools that can optimise complex systems better than humans can. My DevOps work would be transformed if I had an agent that could run deployment experiments, measure performance, and iteratively improve our infrastructure without human intervention. That’s not science fiction anymore – it’s apparently just expensive.
But the environmental footprint keeps nagging at me. Running an agent for 48 hours, burning through billions of tokens, consuming whatever ungodly amount of electricity that requires… for an 11% improvement in training time on a small model. Scale that up to every lab doing this for every experiment, and we’re talking about a significant chunk of energy consumption. In a country like Australia where we’re still running coal plants and battling climate change, that’s not nothing.
Maybe I’m overthinking this. Maybe it really is just another step in the long march of automation, no different in principle than manufacturing robots optimising assembly lines. We adapted to that. We’ll adapt to this.
Or maybe we’re watching the earliest stages of something genuinely transformative, where the feedback loop of AI improving AI is just starting to close. Not the singularity yet – we’re not there. But perhaps we’re in the awkward adolescent phase, where the systems are capable enough to be useful but not quite capable enough to be truly autonomous. And that middle ground might be the most dangerous place to be, because we don’t quite know the rules yet.
What I do know is that Karpathy’s experiment is going to be replicated everywhere. Labs that can afford it will absolutely start running these autonomous optimisation loops. It’s just too useful not to. The competitive pressure alone ensures it. And that means we need to figure out the alignment questions, the memory architecture, the environmental costs, and the access inequality issues sooner rather than later.
For now, I’ll keep watching these developments with fascination and just a touch of anxiety. And maybe I’ll cut back to two lattes a day. No need to burn through resources unnecessarily, right?