February 4, 2024

This site is likely to accumulate a fair amount of configuration and code. In the past I've tried to automate reuse of this, but have then tended to get dragged down by the need to seamlessly connect all the pieces across the different systems in play. While I'll look to do that again at some point, for now the information is available for copy-paste without further automation. Practically that should be sufficient for me for the time being: for a while I had a bit of churn around the systems I was using (and would want to use the configurations on), but now things are fairly stable and such automation does not seem warranted.

Viewing the Test Pyramid

One of the common frameworks for approaching testing is the test pyramid, which can be summarized as having a base of those tests that are most focused and least expensive and introducing tests that include more components and are more expensive as the pyramid is ascended.

In my opinion and experience, the relative size of the layers is unfortunately too often interpreted as the mass of tests at each layer, leading to the majority of the tests being unit tests and progressively fewer towards the tip. This certainly makes sense from a cost perspective alone, but it does not necessarily drive towards increased safety or better testing or coding practices. More often than not it instead seems to lead to a raft of useless unit tests and a harness that overall provides very few guarantees.

Perhaps most perniciously, this perspective is not likely to encourage better coding practices: if your code is simple then verifying it (and achieving test coverage) is trivial, whereas if you expect to write unit tests for every bit of logic in your system then unnecessary complexity can easily be normalized.

A far healthier view of the pyramid is that of eliminating cases and providing guarantees. In this model there is a line, perpendicular to the base and running to the tip, that is the idealized happy path for the code, and there is some set of scenarios that surround that line. At each stage the components involved could behave in ways that deviate from that desired path, and tests are created to prevent or specify such behaviors. The perspective then shifts such that the crucial aspect of the pyramid is not the relative weights of the layers but the sloped lines: they represent all of the ways that the system could behave, narrowing in on how it should behave until they reach the tip, which is the value being delivered to the users of the system.

This perspective frames the test pyramid as a mechanism to tame complexity within a system rather than a naive mapping based on mass or cost. It also lends itself to each type of test focusing on an appropriate type of behavior, avoiding another common issue where the same behavior is tested redundantly at each layer. Each type of test can focus on eliminating classes of paths that would transgress across the sloping line towards desired behavior. Unit tests can focus on the logic within given components, interaction tests can focus on how a component handles calling a test double across a range of possible scenarios, contract tests can encode those scenarios in a form that can be verified with the other actor involved, narrow integration tests can sanity-check that contract and verify that any unsupported configurations are guarded against, and service tests can exercise the complete local configuration (and other tests can do other things, this isn't meant to be a comprehensive list). Rather than simply seeking to write fewer tests as the pyramid is ascended, the expectation is that the higher tests are able to rely upon the guarantees established by the lower tests, and as a result the lines that reach the tip are built with the confidence that all known stray paths have been accounted for.
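As a rough illustration of the first two of those roles, here is a minimal Python sketch. Everything in it (clamp_quantity, Reservations, the Inventory protocol and the two test doubles) is hypothetical and invented for this example rather than taken from any real project. The unit test pins down the logic inside the component, while the interaction tests cover how the component behaves across the scenarios its collaborator could present, so neither layer repeats the other.

```python
from dataclasses import dataclass
from typing import Protocol


class InventoryUnavailable(Exception):
    """Raised by the (hypothetical) inventory dependency when it cannot answer."""


class Inventory(Protocol):
    def available(self, sku: str) -> int: ...


def clamp_quantity(requested: int, in_stock: int) -> int:
    """Pure logic: never reserve more than is in stock, never a negative amount."""
    return max(0, min(requested, in_stock))


@dataclass
class Reservations:
    inventory: Inventory

    def reserve(self, sku: str, requested: int) -> int:
        try:
            in_stock = self.inventory.available(sku)
        except InventoryUnavailable:
            # Stray path eliminated here: an outage reserves nothing
            # instead of propagating the failure to the caller.
            return 0
        return clamp_quantity(requested, in_stock)


# Unit test: only the logic inside the component.
def test_clamp_quantity():
    assert clamp_quantity(5, 3) == 3
    assert clamp_quantity(2, 3) == 2
    assert clamp_quantity(-1, 3) == 0


# Test doubles standing in for the real collaborator.
class StubInventory:
    def __init__(self, in_stock: int):
        self.in_stock = in_stock

    def available(self, sku: str) -> int:
        return self.in_stock


class FailingInventory:
    def available(self, sku: str) -> int:
        raise InventoryUnavailable(sku)


# Interaction tests: how the component handles what the collaborator might do.
def test_reserve_interactions():
    assert Reservations(StubInventory(10)).reserve("widget", 4) == 4
    assert Reservations(FailingInventory()).reserve("widget", 4) == 0
```

Note that the interaction tests do not re-check the clamping rules. They rely on the guarantee the unit test already established and only address what can vary at the boundary.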

This also lends itself to better coding, which is potentially the most valuable aspect of writing tests but one which can be neglected by throwing powerful test tools at the problem. Code that is inherently testable (does not require such tools) is likely to demonstrate better modularity and separation of concerns(1). Keeping tests focused on exercising what is possible encourages components where fewer things are possible (and therefore fewer tests are needed), which breeds simplicity.
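A small sketch of what "inherently testable" can look like, using a hypothetical is_expired function made up for this example: the current time is passed in as an ordinary argument, so the test needs no clock patching or mocking tool, and the function cannot do anything beyond what its inputs allow.

```python
from datetime import datetime, timedelta


def is_expired(issued_at: datetime, ttl: timedelta, now: datetime) -> bool:
    """The current time is an explicit input, so no clock patching is needed."""
    return now - issued_at >= ttl


def test_is_expired():
    issued = datetime(2024, 2, 4, 12, 0)
    ttl = timedelta(minutes=30)
    # Just before the boundary the token is still valid.
    assert not is_expired(issued, ttl, issued + timedelta(minutes=29))
    # At the boundary it counts as expired.
    assert is_expired(issued, ttl, issued + timedelta(minutes=30))
```

A version that read the clock internally would behave the same in production, but testing it would typically call for a patching tool, which is the kind of pressure towards better structure that powerful tooling can hide.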

One of the most glaring issues I've seen in projects is very simple code which is tested in every layer…including unit tests that could only break if the language itself were broken (and in that case you couldn't trust the test either). In this perspective such noisy tests are unnecessary, as all of the lines are already leading towards the tip. Such components can instead be tested at whatever level introduces variations to be concerned about, or simply by blackbox functional tests if nothing warrants testing them before then.
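To make that concrete, here is a hypothetical Python sketch (Customer and greeting are invented for the example). The first test can only fail if dataclass attribute storage itself is broken, while the second sits at the first point where the component's behavior can actually vary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    name: str
    email: str


def test_customer_stores_name():
    # Noise: this only fails if the language's own attribute storage is broken.
    assert Customer("Ada", "ada@example.com").name == "Ada"


def greeting(customer: Customer) -> str:
    """The first place a variation appears: blank names need a fallback."""
    name = customer.name.strip()
    return f"Hello, {name}!" if name else "Hello!"


def test_greeting_handles_blank_name():
    # Worth having: it eliminates a real stray path (a blank name).
    assert greeting(Customer("Ada", "ada@example.com")) == "Hello, Ada!"
    assert greeting(Customer("   ", "ada@example.com")) == "Hello!"
```

The second test also exercises Customer along the way, so dropping the first one loses no guarantee.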

There are models other than the pyramid, but they all tend to suffer from the same simplistic mechanical view: the shapes provided are translated into a goal of what amounts to a mass of tests, rather than a focus on what roles different types of tests fill and what value is actually being delivered. As a result they all seem subject to a lot of wasted effort in the form of things like redundant tests. More crucially, they leave areas that remain poorly tested and poorly understood, because those areas are not attended to given the false sense of security provided by the existence of some volume of tests.

Testing and verification as a whole (along with other quality checks) tend to suffer pretty severely from the streetlamp effect. I have a few other ideas I'll capture, of which the next will probably be The Problem With Line Coverage, given its close association with this statement.

FARLEY, David and GEE, Trisha. Modern software engineering: doing what works to build better software faster. Boston Columbus New York San Francisco Amsterdam London : Addison-Wesley, 2022. ISBN 978-0-13-731491-1.

In Modern Software Engineering, continuous delivery pioneer David Farley helps software professionals think about their work more effectively, manage it more successfully, and genuinely improve the quality of their applications, their lives, and the lives of their colleagues. Writing for programmers, managers, and technical leads at all levels of experience, Farley illuminates durable principles at the heart of effective software development. He distills the discipline into two core exercises: learning and exploration and managing complexity. For each, he defines principles that can help you improve everything from your mindset to the quality of your code, and describes approaches proven to promote success.
