The Moment a Star Broke the Internet (A Little Bit)

There’s a specific kind of online moment that only makes sense if you’re already inside it. From the outside it looks like nothing. From the inside it’s genuinely delightful.

This week, a well-known figure in the local LLM community starred a GitHub repository. That’s it. That’s the whole event. He starred llama.cpp, the foundational codebase behind most of the quantised models that hobbyists and tinkerers run locally on consumer hardware. The catch is that he’s been producing quantised GGUF files with that very codebase for longer than most people in the space have known it existed. Thousands of them. A quiet, consistent, enormous contribution to making local AI actually usable for ordinary people. And apparently he’d never clicked the star button.

When he finally did, people noticed. The reactions ranged from genuine amusement to something approaching reverence.

I find this funny in a way that’s hard to explain to anyone outside the space. It’s a bit like finding out the person who built most of the road didn’t bother to look up the address of the quarry. There’s no judgment in that. It’s just a very particular kind of focused competence: head down, doing the work, not performing participation.

The local LLM scene has that quality more than most tech communities. People are running models on their own machines because they want control, or privacy, or just because they find it interesting. The pipeline someone described in the comments resonated: start with Ollama because it’s easy, move to LM Studio when you want more, get annoyed at the gaps, start building your own thing, end up a year later with a custom application that uses llama.cpp as a sidecar process. That’s a real arc. I recognise parts of it.

I’ve been poking at local models on and off for about a year. My enthusiasm for AI is genuine and my worry about it is also genuine, and I’ve mostly stopped trying to resolve that tension because it doesn’t resolve. Running models locally at least sidesteps some of the data-harvesting concerns, even if the energy cost of a beefy GPU session is its own question I don’t have a clean answer to.

The people who build the quantisation tools and share the outputs mostly aren’t getting paid for it. They’re doing it because the thing needs doing and they can do it. Open source at its least romantic and most functional. No manifesto, no funding round. Just a folder full of GGUFs and, eventually, one star on a repo.

It’s a small thing. It landed anyway.