Shai Yallin

Unit Tests Considered Harmful

Software engineers often describe software as being built from Lego bricks, where the engineer’s job is to assemble the bricks into a coherent system that solves a particular problem. This view is somewhat simplistic: new software aims to solve a novel problem (otherwise an existing product could be used), so at least some of those Lego bricks must themselves be new or unique to the system’s domain. Still, it is a useful enough analogy for dealing with the controversial topic of unit tests.


The problem begins with the very definition of a unit test. What constitutes a unit? Is it a function? A class? I once overheard a seasoned engineer say that he considers a unit to be something that delivers value to a paying client, which is the definition I agree with the most. In reality, though, most engineers attempt to unit test a single class while mocking out its dependencies. This approach has gained a lot of traction over the past 15 years. It’s taught in programming classes. It’s promoted by frameworks such as Nest.js and NX. And it’s complete and utter rubbish.


The reason is simple: a class might or might not be a useful representation of part of the business domain our software system deals with. A feature, on the other hand, is most likely a complex slice of functionality that spans multiple units of software, for instance a Component that talks to a Route, which talks to a Service, which accesses a DB via a Repository. Testing this feature would require running both the UI and the backend, using a UI driver (such as TestingLib or Playwright) to interact with the component, and asserting that the DB has been modified and that any other side effects (emails being sent, S3 files being created, etc.) have occurred as expected.


In many software systems, I see unit tests that achieve 100% coverage with a Service test that mocks the Repository, a Route test that mocks the Service, a Component test that mocks the backend, and so on. 


There are two main problems with this approach, which render it not only useless for preventing regressions but actually harmful:

First, strict coverage for classes prevents regression in those classes but does not assert that the feature actually works. Second, these tests make it difficult to change the behavior and interface of those classes, which might reflect incidental design or implementation details that have nothing to do with the feature we actually care about. It’s akin to making sure that each and every Lego brick has exactly the required number of studs and the right color, but forgetting to make sure that all the bricks are connected and that our Ferrari can actually drive.
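
To make this concrete, here is a minimal sketch in TypeScript of a class-level test that passes while the feature is broken. Every name in it (`User`, `UserRepository`, `SignupService`, `MapUserRepository`) is hypothetical and invented for illustration, and the mock is hand-rolled rather than taken from any particular mocking library.

```typescript
// All names here are hypothetical; this is an illustration, not a real codebase.
interface User {
  email: string;
}

interface UserRepository {
  save(user: User): Promise<void>;
  findByEmail(email: string): Promise<User | undefined>;
}

class SignupService {
  private readonly repo: UserRepository;

  constructor(repo: UserRepository) {
    this.repo = repo;
  }

  // Stores the email exactly as the user typed it.
  async signup(email: string): Promise<void> {
    await this.repo.save({ email });
  }

  // Looks the email up lowercased: an integration bug that only shows
  // when the Service runs against a Repository with real behavior.
  async isRegistered(email: string): Promise<boolean> {
    return (await this.repo.findByEmail(email.toLowerCase())) !== undefined;
  }
}

// Stand-in for the production Repository: exact-match lookup by email.
class MapUserRepository implements UserRepository {
  private readonly rows = new Map<string, User>();

  async save(user: User): Promise<void> {
    this.rows.set(user.email, user);
  }

  async findByEmail(email: string): Promise<User | undefined> {
    return this.rows.get(email);
  }
}

// Class-level test: the Repository is mocked, so the test only checks
// that the Service calls its collaborator the way the mock expects.
async function serviceTestWithMock(): Promise<boolean> {
  const saved: User[] = [];
  const mockRepo: UserRepository = {
    save: async (user) => {
      saved.push(user);
    },
    findByEmail: async () => ({ email: "stubbed" }),
  };
  const service = new SignupService(mockRepo);
  await service.signup("Ada@Example.com");
  const registered = await service.isRegistered("Ada@Example.com");
  return saved.length === 1 && saved[0]?.email === "Ada@Example.com" && registered;
}

// Feature-level check: sign up, then ask whether the user is registered.
async function featureCheck(): Promise<boolean> {
  const service = new SignupService(new MapUserRepository());
  await service.signup("Ada@Example.com");
  return service.isRegistered("Ada@Example.com");
}
```

In this sketch `serviceTestWithMock()` resolves to `true` while `featureCheck()` resolves to `false`: every brick is verified, yet the assembled feature does not work. The mocked test also pins down the exact `save` call, so renaming or reshaping the Repository interface would break it even if the feature kept working.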


This does not mean that we should rely on an extensive suite of E2E tests - these are often slow, cumbersome, hard to debug, and flaky. I prefer extracting IO operations into adapters, testing those separately, and then using reliable fakes to test the bulk of the system’s behavior from the outside - achieving the scope of an E2E test with the speed and comfort of an in-process unit test.
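
One way this could look is sketched below in TypeScript. It is a sketch under assumptions, not a definitive implementation: the adapter interfaces (`UserStore`, `Mailer`), the fakes (`InMemoryUserStore`, `RecordingMailer`), the `createApp` entry point and the signup rules are all hypothetical names and behavior invented for illustration.

```typescript
// All names and rules here are hypothetical, chosen only to illustrate the idea.
interface User {
  email: string;
}

// Adapters: the only places where IO would happen in production.
interface UserStore {
  save(user: User): Promise<void>;
  findByEmail(email: string): Promise<User | undefined>;
}

interface Mailer {
  send(to: string, subject: string): Promise<void>;
}

// Fakes: small in-memory implementations with real behavior, not call recorders.
class InMemoryUserStore implements UserStore {
  private readonly rows = new Map<string, User>();

  async save(user: User): Promise<void> {
    this.rows.set(user.email, user);
  }

  async findByEmail(email: string): Promise<User | undefined> {
    return this.rows.get(email);
  }
}

class RecordingMailer implements Mailer {
  readonly sent: Array<{ to: string; subject: string }> = [];

  async send(to: string, subject: string): Promise<void> {
    this.sent.push({ to, subject });
  }
}

// The system under test is wired from adapters, so a test can drive it
// from the outside without a browser, a server, or a database.
function createApp(adapters: { users: UserStore; mailer: Mailer }) {
  return {
    async signup(email: string): Promise<"created" | "duplicate"> {
      const key = email.toLowerCase();
      if (await adapters.users.findByEmail(key)) return "duplicate";
      await adapters.users.save({ email: key });
      await adapters.mailer.send(key, "Welcome");
      return "created";
    },
  };
}

// Feature-level scenario: exercises the whole slice in-process and
// observes outcomes through the fakes rather than through call counts.
async function signupScenario() {
  const users = new InMemoryUserStore();
  const mailer = new RecordingMailer();
  const app = createApp({ users, mailer });

  const first = await app.signup("Ada@Example.com");
  const second = await app.signup("ada@example.com");
  const stored = await users.findByEmail("ada@example.com");
  return { first, second, stored, sent: mailer.sent };
}

// Contract check for the adapter itself; the same function could be run
// against the real, IO-backed UserStore to keep the fake honest.
async function userStoreContract(store: UserStore): Promise<boolean> {
  await store.save({ email: "a@example.com" });
  const found = await store.findByEmail("a@example.com");
  const missing = await store.findByEmail("b@example.com");
  return found?.email === "a@example.com" && missing === undefined;
}
```

The scenario asserts on what the feature did (a user exists, exactly one welcome email went out, the duplicate was rejected), so the internals behind `createApp` can be reshaped freely. Running `userStoreContract` against both the fake and the real adapter is one possible way to keep the fake reliable.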


Finally, remember that our users don’t care about our test suite. They care about whether our software actually solves their problems and makes their lives easier. Our engineers don’t care about our test suite either. They want to develop new features, fix bugs, and keep everything tidy with minimal pain and restriction.
