Good Software Doesn’t Double Check

I’ve been wrestling lately with the balance between vibe coding and high-touch manual coding. At the level of quality I’m trying to output, you cannot delegate everything to agents, but you want to delegate as much as possible.

To do this effectively means developing a new set of code smells. The original code smells were quick tells of typical coding problems. They indicated muddled thinking, structural problems, or technical debt. But the sort of errors agents make are different and need a new set of nasal receptors. That is, for the next few months, until a new crop of tools makes current wisdom obsolete again.

One classic tic of agents is overly defensive programming:

if 'config' in settings and 'height' in settings['config'] and isinstance(settings['config']['height'], int):
   ...

I think it’s a quirk of the reinforcement learning process that leaves them particularly prone to this sort of behaviour. They get rewarded when the code they emit works, and they don’t get penalized for length or quality. So they quickly learn to shove in as many checks and conditions as possible.

Generally speaking, this is bad code – it’s very difficult for humans to read, and frequently these tests do little to improve actual robustness; they just produce prettier error messages.

I fight this particular behaviour with increasing amounts of strict static typing. That gives the reader confidence that the program is in a given state already, and additional checks would be redundant.
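As a sketch of what that looks like (the `Settings` and `Config` names here are invented for illustration, not from any real codebase): once the shape is captured in types, the call site needs no guards.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Config:
    height: int  # the type checker guarantees this is an int

@dataclass(frozen=True)
class Settings:
    config: Config  # always present: no "'config' in settings" check needed

def scaled_height(settings: Settings) -> int:
    # No isinstance or key-existence checks; the types already promise
    # that settings.config.height exists and is an int.
    return settings.config.height * 2
```

Under mypy or pyright, constructing a `Settings` with anything else fails before the program ever runs, so the three-clause `if` from above becomes redundant.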

But this smell is actually indicative of a bigger problem with agents. They can tunnel-vision on the current problem and fail to look at the bigger picture – particularly if a clueless user just copy-pastes error messages without thinking about what they actually want. This leads to all sorts of redundant behaviour:

  • Ad hoc type validation
  • Wrapping a failing function in a try/catch or retry¹
  • Checking state for things that are meant to be invariant
  • Reparsing strings, vestigial case statements, catching exceptions just to log and rethrow

It can be easy to believe that this is just defensive coding. It makes the code more robust, and code length is not a big deal if only agents are reading it, so what’s the harm? Well, all these checks make the code more brittle. You are adding checks of your assumptions, but not actually ensuring that the assumptions are consistent across the code base. As the code evolves, the checks drift out of sync with each other, leading to failures in random functions in the middle of your stack.

If instead you were careful to establish your assumptions as documented or enforced invariants, then checks are no longer needed, and you’ve got consistent behaviour across all your code.
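One way to make that concrete (a sketch with invented names, not code from any particular project): enforce the invariant exactly once, at construction time, so every later caller can rely on it without rechecking.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grid:
    width: int
    height: int

    def __post_init__(self) -> None:
        # The invariant is enforced once, here. Everything downstream
        # may assume positive dimensions without defensive checks.
        if self.width <= 0 or self.height <= 0:
            raise ValueError("Grid dimensions must be positive")

def cell_count(grid: Grid) -> int:
    # No guards needed: a Grid that exists is a valid Grid.
    return grid.width * grid.height
```

Every function that takes a `Grid` now shares one definition of validity, instead of each re-deriving (and eventually contradicting) its own.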

It’s a bit like the code version of a TOCTOU bug: the moment you rely on any given fact in more than one place, you open yourself up to drift between the copies. Establish things once, and move on.


  1. This particular one was a pet peeve at a previous job. Each team had its own retry logic, and calls between services could be quite deep. Retries can be multiplicative if naively set up – if your call to service A retries up to 3 times, and service A’s call to service B also retries up to 3 times, then you have to sit through 9 failures of B before giving up. We sometimes had to wait an hour before systems would surface that something we didn’t own wasn’t responding.

Infinite Random Rectangles – the Poisson Rect process

Previously, we looked at how to sample points randomly at a given density across an infinite plane.

It’s harder than it sounds, as I was looking for an algorithm that was not biased by the size/shape of the chunks used to calculate it.

Today let’s extend that to filling the infinite plane with random non-overlapping rectangles. As before, that means finding a deterministic chunked algorithm that we can prove is unaffected by the choice of chunking.

Continue reading

My Trip to NeurIPS 2025

I recently went to NeurIPS, the world’s largest academic AI conference. This was a multi-purpose trip: present a poster at the MechInterp Workshop and hobnob with other AI Safety researchers; get a general impression of the state of the art in AI, particularly in the gamedev/creative space; and represent Timaeus, the company I’ve recently joined.

Continue reading

Silksong Quick Hints

I’ve been enjoying Silksong a lot, but sadly it is not a game where you can trust the developers to leave good signposting and guidance. I don’t like to use game guides, so I ended up wasting a lot of time in this game, and then had to resort to guides anyway.

Here are some very light spoilers that will spare you the worst of this. I’m not mentioning alternate paths, just things that the developers likely intended you to find, but that you might not. Collectively, I think this would have saved me 15 hours of pain.

Continue reading

More Accelerated Game of Life

I got a good comment on my previous article about implementing the Game of Life in CUDA, pointing out that I was leaving a lot of performance on the table by only considering a single step at once.

Their point was that my implementations were bound by the speed of DRAM. An A40 can send 696 GB/s from DRAM to the cores, and my setup required sending at least one bit in each direction per cell per step, which worked out to about 1.4ms per step.

But DRAM is the slowest memory on a graphics card. The L1 cache is hosted inside each Streaming Multiprocessor, much closer to where the calculations occur. It’s tiny, but has incredible bandwidth – potentially 67 TB/s. Fully utilizing this is nigh impossible, but even with mediocre utilization we’d do far better than before.

Continue reading

The Culture Novels as a Dystopia

A couple of people have mentioned to me: “we need more fiction examples of positive AI superintelligence – utopias like the Culture novels”. And they’re right: AI can be tremendously positive, and some beacons lit into the future could help bring that about.

But one of my hobbies is “oppositional reading” – deliberately interpreting novels counter to the obvious / intended reading. And it’s not so clear to me that the Culture is all it is cracked up to be.

Continue reading

Accelerated Game Of Life with CUDA / Triton

Let’s look at implementing Conway’s Game of Life using a graphics card. I want to experiment with different libraries and techniques, to see how to get the best performance. I’m going to start simple, and get increasingly complex as we dive in.

The Game Of Life is a simple cellular automaton, so it should be really amenable to GPU acceleration. The rules are simple: each cell in the 2D grid is either alive or dead. At each step, count the alive neighbours of the cell (including diagonals). If the cell is alive, it remains alive if 2 or 3 neighbours are alive; otherwise it dies. If the cell is dead, it comes to life if exactly 3 neighbours are alive. These simple rules cause an amazing amount of emergent complexity, which has been written about copiously elsewhere.
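For reference, those rules translate directly into a few lines of array code – a CPU sketch with NumPy, not the CUDA implementation from the post (and note `np.roll` wraps the edges toroidally, whereas the post skips boundary cells):

```python
import numpy as np

def life_step(grid: np.ndarray) -> np.ndarray:
    """One Game of Life step on a 2D array of 0s and 1s."""
    # Count alive neighbours by summing the 8 shifted copies of the grid.
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # Born with exactly 3 neighbours; survives with 2 or 3.
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(grid.dtype)
```

A horizontal “blinker” (three cells in a row) flips to vertical and back, returning to its original state after two steps.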

For simplicity, I’ll only consider N×N grids, and skip calculations on the boundary. I ran everything on an A40, and I’ll benchmark performance at N = 2¹⁶. For now, we’ll store each cell as 1 byte, so this array equates to 4 GB of data.

All code is shared in the GitHub repo.

Continue reading