How good architecture emerges from following the best engineering practices

Andrey Lebedev
6 min read · Jul 23, 2024


There is a notion that architecture stands above engineering and deals only with high-level abstractions, while how things are implemented at the lower level of the individual components is not an architectural concern. In this article, I will challenge this notion and provide a concrete example that demonstrates how the best engineering practices lead to a good design.

Let me tell you a story…

Imagine you are building an application that is supposed to fetch certain data from a data warehouse, then denormalise, transform and store it in the form of a given business model. You sketch a high-level architecture for this task; it would look something like this:

The development team decides to use an ETL (Extract, Transform and Load) tool to implement the Transformation App and follows the organisation's standard for that purpose, IBM DataStage: an efficient and powerful ETL solution that, however, lacks off-the-shelf instruments for automated testing, code review, CI/CD and VCS routines. This can be worked around with a relatively complex set of scripts and Docker-based integration tests, but practising TDD or ATDD (Acceptance Test-Driven Development) in such a setup would be a nightmare. And since good engineering practices weren't fostered in this environment, quality control for the solution boils down to manual testing. This quickly leads to regression defects creeping into the production environment, and the end users become outraged and desperate.

Things wouldn't have gone so awry if the development team's first concern had been to adhere to Clean Code guidelines. The push for Test-Driven Development, combined with the given enterprise constraints, would force the team to consider a general-purpose programming language for implementing the Transformation App, despite some added overhead compared to the standard ETL framework. Let's assume that the team chooses the Java language and the Spring Boot framework for this solution. The first sketch of the application is a monolithic Spring Boot system with two different data sources (DS1 and DS2):
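To make this concrete, here is a minimal sketch of what such a two-data-source configuration might look like in a single Spring Boot application. The bean names and property prefixes are mine, invented only for illustration; they are not part of the original design.

// Hypothetical sketch of the first monolithic design: one Spring Boot
// application wired to both the warehouse (DS1) and the target store (DS2).
import javax.sql.DataSource;

import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

@Configuration
public class DataSourcesConfig {

    // DS1: the data warehouse we extract from. Connection properties
    // (jdbc-url, username, password) bind under the given prefix.
    @Bean
    @Primary
    @ConfigurationProperties("app.datasource.warehouse")
    public DataSource warehouseDataSource() {
        return DataSourceBuilder.create().build();
    }

    // DS2: the store that keeps the transformed business model.
    @Bean
    @ConfigurationProperties("app.datasource.business-model")
    public DataSource businessModelDataSource() {
        return DataSourceBuilder.create().build();
    }

    // Every repository, transaction manager and test fixture now has to be
    // explicitly bound to one of the two data sources, which is exactly the
    // overhead discussed below.
}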

It is, however, not good engineering practice to have multiple data sources in a single Spring Boot application. (Why? Because of the significant overhead and complexity of supporting more than one data source.) The next iteration of brainstorming leads the team to a microservice-like architecture:

eventually becoming a reactive-stack solution based on the Spring WebFlux framework:
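As an illustration, the reactive data-reader component in this style could be sketched roughly like this. The endpoint path, record shape and repository are made-up names, chosen only for the example, and the sketch assumes Spring WebFlux and Reactor on the classpath.

// Minimal sketch of the reactive data-reader component.
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import reactor.core.publisher.Flux;

@RestController
class WarehouseRecordController {

    private final WarehouseRepository repository;

    WarehouseRecordController(WarehouseRepository repository) {
        this.repository = repository;
    }

    // Streams denormalised warehouse records to the downstream loader
    // without buffering the whole result set in memory.
    @GetMapping(value = "/records", produces = MediaType.APPLICATION_NDJSON_VALUE)
    Flux<WarehouseRecord> records() {
        return repository.findAll();
    }
}

// Reactive abstraction over DS1; a real implementation could be backed by
// R2DBC or a plain JDBC bridge wrapped in a bounded scheduler.
interface WarehouseRepository {
    Flux<WarehouseRecord> findAll();
}

// Simplified record shape used only for this sketch.
record WarehouseRecord(String id, String payload) {}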

The simple pursuit of testability and a proper CI/CD process leads to a much better architecture, with all the well-known benefits of a microservices architecture, such as:

  • each component can be developed and ultimately deployed independently;
  • the data reader component can serve as an anti-corruption layer, whereas the data loader becomes a source-agnostic component implementing the business logic, which doesn't need to change if the data warehouse is replaced by any other source (say, a web service backed by a Databricks EDL solution);
  • each component can be tested in an isolated environment with mocked external components (see the test sketch after this list), which significantly reduces maintenance costs and build/testing times.
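Here is a hedged sketch of such an isolated test, assuming JUnit 5, Spring Boot's WebFluxTest slice and Mockito, and reusing the illustrative reader component from the sketch above.

// The external warehouse is replaced by a mock, so the test needs no
// database and no Docker container, and runs in milliseconds.
import static org.mockito.Mockito.when;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.web.reactive.WebFluxTest;
import org.springframework.boot.test.mock.mockito.MockBean;
import org.springframework.test.web.reactive.server.WebTestClient;

import reactor.core.publisher.Flux;

@WebFluxTest(WarehouseRecordController.class)
class WarehouseRecordControllerTest {

    @Autowired
    private WebTestClient webTestClient;

    @MockBean
    private WarehouseRepository repository;

    @Test
    void streamsRecordsFromTheWarehouse() {
        when(repository.findAll())
                .thenReturn(Flux.just(new WarehouseRecord("42", "payload")));

        webTestClient.get().uri("/records")
                .exchange()
                .expectStatus().isOk()
                .expectBodyList(WarehouseRecord.class)
                .hasSize(1);
    }
}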

Emergence

The term "emergence" is not well known to the wider audience. When I say that architecture "emerges", it sometimes causes confusion, so it is worth clarifying what exactly I mean by this term:

In philosophy, systems theory, science, and art, emergence occurs when a complex entity has properties or behaviours that its parts do not have on their own, and emerge only when they interact in a wider whole.

This is exactly what happens here: no single part of the solution described in the previous chapter has "good architecture" properties per se. For instance, testability as a mark of good architecture appears empirically: testable components, brought together, often give rise to a well-designed solution.

Let's look further at other Clean Code practices. Take, for instance, the evil of all evils: code duplication. The major lesson from the era of procedural programming is to reuse and not repeat yourself. Neglecting this rule and postponing refactoring and deduplication leads to a failure to see the bigger picture, in which things could be reorganised into different modules, and that is already an architectural question.

This picture is provided for illustrative purposes only. It is not mathematically accurate, but it should convey the idea: code refactoring and deduplication are akin to the edge contraction operation from graph theory. Similar ideas have already been proposed by some researchers (see this article, for example).
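To connect this to code, the following toy example (all names are made up) shows the kind of duplication whose removal carves out a new module boundary:

// Before deduplication this parsing lived verbatim in both the sales and the
// support transformer; extracting it surfaces a reusable module boundary.
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class WarehouseDates {

    // The warehouse exports dates as compact strings like "20240723".
    private static final DateTimeFormatter WAREHOUSE_FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMdd");

    private WarehouseDates() {
    }

    // The single shared entry point both transformers now depend on.
    static LocalDate parse(String raw) {
        return LocalDate.parse(raw, WAREHOUSE_FORMAT);
    }
}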

The single responsibility principle leads to both modularisation and clean naming. Properly chosen names of classes and identifiers, together with the separation of concerns principle, naturally go hand in hand with Domain-Driven Design, giving emergence to the latter. How? Striving to achieve the purpose of this principle, namely that one unit (a class, for instance) should do one job and do it properly, a developer will inevitably search for a name that describes the class and its elements best. The source for this naming will be the business model, and hence some sort of ubiquitous language will be born during this naming exercise. Thus, even if DDD is not part of the enterprise architecture, its elements will emerge from these practices.
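A tiny, purely hypothetical illustration of this naming effect: a vague "DataProcessor" that fetched, converted and persisted rows splits, under SRP, into units whose names come straight from the business vocabulary.

import java.math.BigDecimal;

// One class, one job: converting warehouse amounts into the reporting
// currency. The name is business vocabulary, not a technical noun.
class CurrencyConverter {

    private final BigDecimal rate;

    CurrencyConverter(BigDecimal rate) {
        this.rate = rate;
    }

    BigDecimal toReportingCurrency(BigDecimal amount) {
        return amount.multiply(rate);
    }
}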

The open-closed principle, applied to various layers of the application, leads in particular to better API design, naturally introducing API versioning (which is without doubt an architectural concern, because backward compatibility of an API is a crucial element of the bigger picture when different components are designed to talk to each other). In the context of API development, the interface segregation principle (ISP) also contributes to better-designed APIs, with separate specifications per domain. This can be achieved by grouping API methods by the domain they pertain to, thus eliminating the need for API clients to implement useless interfaces and models.
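A possible sketch of how this could look at the interface level (all names are invented for the example): each domain gets its own narrow contract, and a new version extends the old one instead of modifying it, so existing clients keep compiling.

import java.util.List;

// A sales client only depends on the sales operations...
interface SalesApiV1 {
    List<String> listOrders(String customerId);
}

// ...while reporting consumers see a separate, independently evolving slice.
interface ReportingApiV1 {
    String monthlyRevenueReport(String month);
}

// Open-closed in practice: version 2 extends the existing contract rather
// than changing it, preserving backward compatibility.
interface SalesApiV2 extends SalesApiV1 {
    List<String> listOrders(String customerId, String status);
}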

ISP also pushes for the natural appearance of Bounded Contexts, one of the central patterns in DDD. This may not be obvious at first glance, but think about it: when a developer observes that, say, the same class "Customer" exposes different interfaces to sales and support clients and thus violates ISP, it naturally leads to the idea of separating these contexts.
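Sketched in code, the split could look like this (the types are hypothetical): instead of one "Customer" serving both audiences through the same interface, each bounded context gets its own model.

// Sales context: cares about billing and orders.
record SalesCustomer(String id, String billingAccount) {}

// Support context: cares about tickets and SLAs, and no longer drags sales
// fields and methods along.
record SupportCustomer(String id, String supportTier) {}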

The Liskov substitution principle, together with dependency inversion, gives rise to loosely coupled designs that can easily be broken into independent pieces, thus helping to perform a transition from a monolith to a microservices architecture. (Don't get me wrong: I am not saying that a microservices architecture is better than a monolith. It all depends on the circumstances, and a monolith with loosely coupled components can perfectly fit the goal.)
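A minimal sketch of that inversion (names are illustrative): the loader depends on an abstraction, so the warehouse reader can be swapped, or moved into its own service, without touching the business logic.

import java.util.List;

// The abstraction the high-level policy depends on.
interface RecordSource {
    List<String> fetchRecords();
}

class BusinessModelLoader {

    private final RecordSource source;

    // Any substitutable implementation (JDBC, web service, in-memory test
    // double) will do, which is what keeps the coupling loose.
    BusinessModelLoader(RecordSource source) {
        this.source = source;
    }

    int load() {
        return source.fetchRecords().size();
    }
}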

Loosely coupled components can be easily separated into microservices.

I could continue this list for a long time, but I hope you get the idea: simply following the best engineering practices in a meticulous manner naturally leads to the emergence of a better design. This is also true, and crucial, for legacy systems or systems that previously lacked a thoughtful solution architecture. No wonder this resonates with the Agile architecture principle: one doesn't design the whole solution beforehand; instead, the architecture unfolds and transforms as new requirements arrive.


Andrey Lebedev

PhD in CS, a software engineer with more than 20 years of experience. Check my LinkedIn profile for more information: https://www.linkedin.com/in/andremoniy/