For my personal projects, I don't want to write tests anymore. The economics changed. With an AI in the loop, every test gets written, run, watched, and maintained as the code shifts under it. That is all tokens.
And the test is not even what I am really trusting. The AI writes the test too, so trusting its test but not its code is trusting the same author twice. What I am actually betting on is that the code matches what I described. A test is one way to check that bet, not the cheapest, and not an independent one when the same model wrote both sides.
The AI can already read the function and tell me what it does, so a test is really shorthand proof for a human. But if I am the one using and exercising the thing directly, do I need a test to prove it to myself?
So lean on clear descriptions. State the behavior in plain Given / When / Then, and most of the time stop there. When I genuinely doubt something, write the test on demand, confirm it once, and delete it. It was scaffolding.
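A minimal sketch of what that could look like, assuming a hypothetical `slugify` helper that is not from any real project. The Given / When / Then description sits in the docstring, and `prove_once` (also a made-up name) is the on-demand check: run it once when in doubt, then delete it.

```python
import re


def slugify(title: str) -> str:
    """Turn a post title into a URL slug (hypothetical example function).

    Given a title with mixed case, spaces, and punctuation
    When I slugify it
    Then I get lowercase words joined by single hyphens,
         with no leading or trailing hyphen
    """
    # Replace every run of non-alphanumeric characters with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    # Drop hyphens left over at either end.
    return slug.strip("-")


def prove_once() -> None:
    """Throwaway check: run it when in doubt, then delete it."""
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  Tests are   scaffolding ") == "tests-are-scaffolding"


prove_once()
```

The description stays with the code after the check is gone, so the only thing left to maintain is the part a reader actually needs.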
There is an equation hiding in here:
keep a test when P(catch) × cost(failure) > upkeep
P(catch) is the chance the test flags a real failure I would otherwise ship; cost(failure) is what that failure would cost me if it shipped; upkeep is the tokens spent keeping the test alive as the code moves. For a weekend project where the only user is me, the payoff is tiny, so it rarely clears the bar. So for now: describe clearly, prove on demand, throw the proof away.
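The inequality can be written as a small function, as a sketch only. The name `keep_test`, the choice of dollars as the shared unit, and every number below are assumptions made up for illustration; the one real constraint is that the failure cost and the upkeep have to be in the same unit before they can be compared.

```python
def keep_test(p_catch: float, cost_of_failure: float, upkeep: float) -> bool:
    """Return True when P(catch) * cost(failure) > upkeep.

    cost_of_failure and upkeep must share a unit (assumed here: dollars),
    otherwise the comparison is meaningless.
    """
    if not 0.0 <= p_catch <= 1.0:
        raise ValueError("p_catch must be a probability between 0 and 1")
    return p_catch * cost_of_failure > upkeep


# Made-up numbers, for illustration only.
# Same catch rate and same upkeep; only the cost of failure differs.
cheap_failure = keep_test(p_catch=0.1, cost_of_failure=5.0, upkeep=2.0)
costly_failure = keep_test(p_catch=0.1, cost_of_failure=5000.0, upkeep=2.0)

print(cheap_failure, costly_failure)  # prints "False True"
```

With the catch rate and upkeep held fixed, the decision flips purely on the cost term.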
The real question is the other side of that inequality. How does the trade-off work out for production workloads, where being wrong costs money, data, or other people, and the cost term is exactly what blows up? That is the one I actually want to answer.