The Era Of Bespoke Software?
In Star Trek: The Next Generation, Picard says “Computer, calculate the optimal route to Starbase 47, accounting for current ion storm activity.” A second passes. The computer answers. Nobody on the bridge looks surprised, because the show wasn’t trying to explain it; it was suspension of disbelief, narrative convenience, the kind of magic 1990s science fiction needed to keep the plot moving.
We never asked, at the time, what the computer was actually doing. Now we can, because we’re starting to live it. The only sensible explanation is that the computer must have been writing a piece of software for that exact request, drafting it, running it once, displaying the result, and discarding it. Single-use code for a single question. The writers reached for the most convenient hand-wave they could think of, and accidentally described what would turn out to be the future. Hindsight is making them look more prescient than they were.
Made For One
The word “bespoke” usually evokes Italian suits. Hand-tailored, expensive, for the well-off. I’m using it in its older, more literal meaning: software made for a single use case, drafted by an LLM in response to one specific need, not intended to outlive that need. Single-purpose. Single-use. Disposable.
For most of software engineering’s history, this was uneconomical. Writing code was expensive. So we wrote each piece carefully, abstracted it, reused it, and felt clever about not having to write it again. The whole edifice of software reuse (libraries, frameworks, package managers, the entire NPM ecosystem) exists because writing code was, until recently, a meaningful cost.
It still is, when humans write it. But when an LLM writes it, the cost approaches zero. So the question becomes obvious: if you can have the computer write a CSV parser for this one report, every time, why import one?
The Reuse Tax
Reuse has costs the industry stopped counting. When you import a library, you also import its transitive dependencies, its versioning schedule, and the maintainer’s choices about API shape (made without knowing your specific problem). You take on the indirection between your needs and the abstraction the library exposes, and the unused code that sits in your bundle, never executed, never read, but present.
And, increasingly, you take on the attack surface. Supply chain attacks have moved from theoretical to ubiquitous. Every dependency is now an active, perpetual liability: attack surface that someone could weaponize tomorrow, long after you stopped thinking about it. In 2026, your left-pad is a backdoor. The cost ledger against reuse has grown quietly, while the industry kept reciting “don’t reinvent the wheel” as if the wheel were still free.
None of this makes reuse bad. The trade has just never been as one-sided as the industry acted.
The Boring Ends
The two ends of the spectrum are settled, and they’re boring.
On one end: Postgres, MongoDB, Kafka, Linux. Nobody is going to generate these per use case. The scope is too vast, the operational maturity too hard-won, the surface area too large to fit in a prompt. They keep surviving.
On the other end: left-pad, single-line clipboard wrappers, micro-libraries that exist only because someone wanted a credit on their GitHub profile. Just generate them. The LLM was always going to win these.
The interesting question is everything in between: the library that does one useful thing that took someone six months to get right.
The Messy Middle
Start with the long tail of invisible edge cases. Date and time are the canonical example: locales, calendars, daylight saving time, Samoa skipping an entire day in 2011. You don’t think you care about all locales until a user in a forgotten one silently breaks your parser. CSV is, if anything, worse. Quoting, embedded commas, embedded newlines, character encodings, Excel’s creative interpretation of the “standard”. The bug where an embedded comma shifts every subsequent column by one position and throws no error at all is a real bug, and your LLM-generated CSV parser will ship it, every time. The library wins these because the value was always in the edge cases you couldn’t think of, the unknown unknowns, rather than in the code itself.
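To make the embedded-comma bug concrete, here is a minimal sketch in Python. The sample record and both function names are invented for illustration; the only real API used is the standard library's csv module. The naive version splits on every comma, raises nothing, and quietly puts part of a name where the amount should be.

```python
import csv
import io

# One record whose second field contains a comma inside quotes.
# (Sample data invented for this illustration.)
raw = 'id,name,amount\n1,"Smith, Jane",100\n'


def naive_parse(text):
    # What a quick hand-rolled parser often does: split on every comma,
    # ignoring quoting entirely.
    return [line.split(",") for line in text.splitlines()]


def library_parse(text):
    # The standard library's csv module honors the quoting rules.
    return list(csv.reader(io.StringIO(text)))


naive_rows = naive_parse(raw)
library_rows = library_parse(raw)

# The naive parser returns four fields for a three-column file and
# raises no error; the "amount" column now holds part of the name.
print(naive_rows[1])
print(library_rows[1])
```

Nothing here crashes. The failure only shows up downstream, in whatever reads the third column.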
Then there’s runtime behavior under failure, scale, or concurrency. Database connection pooling looks simple until a thirty-second restart turns into a ten-minute outage, because your pool kept handing out connections to a server that wasn’t there yet, and every request piled up behind a TCP timeout that wasn’t going to fire for another thirty seconds. You can’t generate your way to having been tested in production at scale. The bug is invisible until it isn’t, and then it’s the entire system. The library wins this category, on the back of production hours that have battered it into shape.
Then, the case where bespoke quietly beats the library: fitting a general abstraction to your specific shape. State machines. Configuration loaders. The argument parser for your internal CLI tool. Object-to-object mappers for the one API you happen to be talking to. A purpose-built pipeline that does your three transforms in your one specific order. Glue scripts. Throwaway migrations. Nobody imports a framework to rename a column once. In all of these, the library’s generality is a tax. The tell is when you find yourself writing more code to adapt your problem to the abstraction than the problem itself would have taken. Generate it.
The cleanest case for bespoke is the cheaply, locally verifiable. Regex is the canonical example. The weak argument for generating regex says “nobody needs to understand regex anymore”, which is exactly the same logic that ships the silent date bug, and I reject it. The strong argument is that regex is instantly and locally verifiable. You throw test strings at it, you see failures in milliseconds. Bespoke is safe wherever verification is cheaper than comprehension.
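As a sketch of what “throw test strings at it” looks like, assume a generated pattern for ISO-style dates. The pattern, the two string lists, and the failures() helper are all hypothetical; the point is only that the whole check runs locally in milliseconds.

```python
import re

# Hypothetical bespoke pattern: ISO-style dates such as 2011-12-29.
# It checks the shape only, not calendar validity (2023-02-31 would pass).
ISO_DATE = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")

SHOULD_MATCH = ["2011-12-29", "1999-01-01", "2024-02-29"]
SHOULD_REJECT = [
    "2011-13-01",        # month out of range
    "2011-00-10",        # month zero
    "11-12-29",          # two-digit year
    "2011-12-32",        # day out of range
    "2011-12-29T00:00",  # trailing time component
]


def failures():
    """Return every test string the pattern classifies wrongly."""
    bad = []
    for s in SHOULD_MATCH:
        if not ISO_DATE.fullmatch(s):
            bad.append(("expected match", s))
    for s in SHOULD_REJECT:
        if ISO_DATE.fullmatch(s):
            bad.append(("expected reject", s))
    return bad


print(failures())
```

An empty list means the pattern agrees with every example. A wrong pattern shows up immediately as a named offending string, which is what makes this category safe to generate.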
The Calculus
So how do you decide? Blast radius, reversibility, time to recovery. If the bespoke version fails, what does it cost you, and how fast can you recover?
But this doesn’t mean “just use the library”. A dependency has its own blast radius. A latent supply-chain vulnerability is a perpetual liability with an unknown detonation date. Sometimes (and I mean this literally) the right trade is to accept the occasional twenty-minute outage from a slightly wrong bespoke retry policy over the eventual breach from a compromised transitive dependency. What’s worse: waking up in the middle of the night to a pager alert about a retry storm, or learning too late that your customer data has been exfiltrated by the North Koreans?
The cleanest case of this is fetch versus axios. The platform gives you an HTTP client; importing another one no longer makes sense. The retry logic you’d want around it is small enough to verify locally. The bespoke win here is “stop importing what the platform already provides”.
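For a sense of scale, here is a minimal retry wrapper, sketched in Python rather than JavaScript. Everything in it is an assumption made for illustration: the names with_retries and make_flaky, the choice of exponential backoff, and the exception types treated as retryable. The wrapped call stands in for whatever HTTP client the platform provides.

```python
import time


def with_retries(call, attempts=3, base_delay=0.1,
                 retry_on=(ConnectionError, TimeoutError),
                 sleep=time.sleep):
    """Run call() up to `attempts` times, doubling the delay each time.

    `sleep` is injectable so the policy can be tested without waiting.
    """
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: fail loudly, do not swallow
            sleep(base_delay * (2 ** attempt))


def make_flaky(failures_before_success):
    """Build a fake network call that fails a fixed number of times."""
    state = {"calls": 0}

    def flaky():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise ConnectionError("simulated network failure")
        return "ok"

    return flaky, state


# Local verification: two failures, then success, with recorded delays.
delays = []
flaky, state = make_flaky(2)
result = with_retries(flaky, attempts=3, base_delay=0.1, sleep=delays.append)
print(result, state["calls"], delays)
```

The whole policy fits on one screen, and the fake call plus the injected sleep let you check the attempt count and the backoff schedule without touching a network.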
But blast radius isn’t one number. A CSV parser feeding a dropdown selector and a CSV parser feeding an accounting ledger are the same code with the same bug, but the consequences live six orders of magnitude apart, depending entirely on what consumes the output. So the right question is always “what depends on this being right?”. The parser is the same; the answer changes.
The actual decider is verification:
How cheaply can I verify this is correct?
Cheap to verify and fails loud: generate it. Regex sits here. Expensive to verify, or fails silent: the library’s accumulated correctness is the entire point. Dates and CSV sit here. Verifiable only in production under failure: library wins, every time. Connection pooling sits here.
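Written down as code, purely as an illustration, the rule is three branches. The function name, its parameters, and the return labels are hypothetical, and the boolean inputs are themselves judgment calls.

```python
def choose(cheap_to_verify, fails_loudly, only_verifiable_in_production=False):
    """Encode the three-way rule: returns "generate" or "library"."""
    if only_verifiable_in_production:
        # Correctness can only be earned through production hours.
        return "library"
    if cheap_to_verify and fails_loudly:
        # Mistakes surface immediately and locally.
        return "generate"
    # Expensive to verify, or fails silently.
    return "library"


print(choose(cheap_to_verify=True, fails_loudly=True))    # regex
print(choose(cheap_to_verify=False, fails_loudly=False))  # dates, CSV
print(choose(cheap_to_verify=False, fails_loudly=True,
             only_verifiable_in_production=True))         # connection pooling
```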
What turns this into a judgment call rather than a checklist is that verifiability is manufacturable. Property-based testing converts a silent, non-local failure into a cheap-to-verify one. Instead of enumerating the edge cases you can’t see, you assert a round-trip invariant and let the framework throw thousands of adversarial inputs at it. The library’s accumulated correctness stops being irreplaceable the moment you can regenerate equivalent confidence through a test harness in an afternoon.
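A round-trip invariant can be sketched with nothing but the standard library. In practice you would reach for a property-based framework such as Hypothesis, which also shrinks failing inputs. The hand-rolled generator below is a deliberately simple stand-in, and every function name, the alphabet, and the trial count are assumptions chosen for illustration. The property is that parsing what you serialized gives back the original rows.

```python
import csv
import io
import random

# Small alphabet deliberately weighted toward characters that break CSV.
ALPHABET = 'ab ,"\n'


def random_rows(rng, max_rows=4, max_cols=4, max_len=6):
    """Generate a small table of strings full of awkward characters."""
    cols = rng.randint(1, max_cols)
    return [
        ["".join(rng.choice(ALPHABET) for _ in range(rng.randint(0, max_len)))
         for _ in range(cols)]
        for _ in range(rng.randint(1, max_rows))
    ]


def find_counterexample(serialize, parse, trials=500, seed=0):
    """Return the first table that violates parse(serialize(t)) == t."""
    rng = random.Random(seed)
    for _ in range(trials):
        rows = random_rows(rng)
        if parse(serialize(rows)) != rows:
            return rows
    return None


# A naive bespoke pair: join and split on commas and newlines.
def naive_serialize(rows):
    return "\n".join(",".join(row) for row in rows)


def naive_parse(text):
    return [line.split(",") for line in text.split("\n")]


# The standard library pair, for comparison.
def csv_serialize(rows):
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


def csv_parse(text):
    return list(csv.reader(io.StringIO(text, newline="")))


print(find_counterexample(naive_serialize, naive_parse))
print(find_counterexample(csv_serialize, csv_parse))
```

The naive pair produces a counterexample almost at once, while the csv pair survives every trial. The same harness can be pointed at a generated parser, turning the silent column shift into a failure you can see.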
So the question becomes: can a competent engineer make this cheaply verifiable? This is why you still want an experienced engineer in the loop.
The New Compiler
The LLM is the new compiler. Production code is the new machine code. We stopped reading assembly decades ago because we trusted the compiler. We could still read it; we just had no reason to, since verifying at the source level was cheaper, and the compiler had earned the right to make that abstraction safe. The bespoke shift proposes the same move, one layer up: verify at the test or specification level, let the LLM compile from there, stop reading the production code as closely as we used to.
Trust is a dial. We trust the C compiler completely because it’s deterministic and has been exhaustively battle-tested for forty years. We don’t extend that to LLMs yet, and we’re right not to. The amount of trust we place in any tool is, and ought to be, proportional to the cost of that trust being misplaced. We already trust LLM-generated code for low-stakes throwaway work the way we trust a compiler. We withhold that trust where the cost of being wrong is high.
The compiler is the asymptote. We’re not there yet. What’s missing is the verification layer.
This isn’t speculative for me. The senior engineer’s job has always been asserting the behaviors the code had to exhibit, and understanding the system well enough to know what to assert. The typing was incidental. I’ve been reviewing tests far more carefully than implementations for the last decade; we just called it “code review”. The bespoke shift makes explicit, for everyone, what good engineers have always been doing.
In Theory
In the TNG episode “In Theory” (S04E25), Data, the android, tells Lieutenant Jenna D’Sora that he has written a subroutine specifically for her. A program within the program. He has generated bespoke software for a single use case (one romantic relationship) and shipped it straight into production, by which I mean acting on it, with a real person, in real time.
The episode is forty-five minutes of watching this go subtly, comically, painfully wrong. Crashes would have been easier. The subroutines almost-work, then degrade. They produce romantic responses assembled from pattern-matched fragments, without the judgment to know whether they fit the situation. The output is plausible without being right. This is the silent-failure category in a Starfleet uniform, and you watch the blast radius play out for the entire episode.
When Data writes a subroutine, how does Data trust Data?
No battle-tested library exists for being Data. Every subroutine he writes is bespoke. Zero accumulated trust. The blast radius is his own cognition and his own actions, which is to say the highest blast radius imaginable for a single agent. He is either running an internal verification layer we never see on screen, or he is shipping unverified generated code into production at the most dangerous scale conceivable. “In Theory” is the episode where he visibly didn’t verify enough.
We have been talking, this whole time, as if this were about generation. It wasn’t. It was always about verification. And underneath verification, it was about trust.
What reuse always amortized was trust.
You import the date library because someone else has already paid the cost of verifying it across thousands of locales, ten thousand production deployments, fifteen years of bug reports. Writing date math is hard, sure, but the library’s real value is the audit trail, and the trust that came with paying for it.
Kill the cost of manufacturing trust locally, and reuse retreats to exactly the cases where trust cannot be cheaply made.
Why Kafka Survives
We watched the magic for forty years and assumed the hard part was the writing. It is the verifying, and underneath that, the trusting.
We already have machines that write bespoke software. We’re still waiting on machines we can trust like a compiler, all the way up the dial, proportional to the stakes. Closing that gap is the engineering work of the next decade.
Which is why Kafka survives. An LLM can draft something Kafka-shaped tomorrow; the code is the easy part. Verifying you’ve rewritten it correctly takes years of failures in production, and manufacturing the trust that millions of cumulative production-hours have poured into it, alone and on demand, is impossible.
Reports of open source’s death are exaggerated. It survives at exactly the layer where trust is too expensive to generate locally: the deep infrastructure, the run-at-scale systems, the fails-silently primitives. Everything above that line becomes bespoke.
In Summary
LLMs collapsed the cost of writing single-use software, and bespoke became viable wherever reuse used to win by default. The first pass for deciding between them is blast radius. The actual decider is verification cost, which a competent engineer can manufacture through tests. What reuse always amortized was trust; reuse retreats to exactly the cases where trust is too expensive to generate locally. The engineering work of the next decade is closing the verification gap.
Data, after all, ran on tests.