Research Results: Key software architecture metrics

If you remember my article about Software Architecture Quality Attributes, you know that we have been conducting a survey to find out which key software architecture metrics leading companies and software architects use. The quality of a software architecture is essential, yet very difficult to grasp and measure, and its quality features are not obvious because relations and dependencies can extend very far. So, today is the day! I would like to share the results with you. Please note that they reflect the positive experience of specific companies, which doesn’t mean that these metrics are transferable to all environments. Some you might find useful for your project, some maybe not. Here are the results, thanks to the following companies that place a special focus on software architecture: Apiumhub, Endava, Codurance, Thoughtworks, Mittelabs, DoItinternational, Developertoarchitect, wps, Xebia, Hello2morrow, Rollbar, Roche, ABB, Hoxell, Vidactive, CodingSans.

As we all know, in software development, early detection of architectural issues is key. It helps mitigate the risk of poor performance and lowers the cost of repairing these issues. So, let’s analyze the key software architecture metrics mentioned in the survey for building scalable projects.

Results: key software architecture metrics

Software architecture metrics by Andrew Hamel Law – Tech Principal @ThoughtWorks

Nicole Forsgren et al.’s four key metrics, as described in the book Accelerate and called out as ‘Adopt’ in the Thoughtworks Radar, differentiate between low, medium and high performing technology organisations: lead time, deployment frequency, mean time to restore (MTTR) and change fail percentage. Indeed, we’ve found that these four key metrics are a simple and yet powerful tool to help both leaders and teams focus on measuring and improving what matters. A good place to start is to instrument existing build pipelines so you can capture the four key metrics and make the software delivery value stream visible. GoCD pipelines, for example, provide the ability to measure these four key metrics as a first-class citizen of the GoCD analytics.
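To make the four key metrics concrete, here is a minimal Python sketch that derives them from a list of deployment records. The `Deployment` record shape, its field names and the `four_key_metrics` function are hypothetical and only illustrate the idea; a real pipeline tool will expose its own data model, and the choice of median for lead time is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a failure in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def four_key_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Summarise the four key metrics over a reporting period."""
    if not deployments:
        raise ValueError("need at least one deployment")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        # Median is an assumption; a mean or a percentile would also work.
        "lead_time": median(lead_times),
        "deployments_per_day": len(deployments) / period_days,
        "change_fail_percentage": 100 * len(failures) / len(deployments),
        "mean_time_to_restore": (
            sum(restore_times, timedelta()) / len(restore_times)
            if restore_times else None
        ),
    }
```

Feeding the function one record per pipeline run that reaches production would give a first view of the delivery value stream, which can then be tracked per week or per month.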

There are other, qualitative measures (e.g. number of diagrams you see devs drawing, number of ADRs, participation in architectural activities, engagement with governance processes, etc.) as well as feedback from the various Fitness Functions you might have running, but the four key metrics, backed by incredibly sound research and statistical work, are the place to start.

Alvaro Garcia – Principal Engineer @Apiumhub

Fitness Functions to validate that the configuration is correct (for example, the database configuration, at compile time).

Hadil Abukwaik – Software architect @ABB

Fan-in, fan-out, circular dependency, # components, size of components, fitness function, response time, cyclomatic complexity, technical debt specific metrics

Sandro Mancuso – Managing Director @Codurance

The 4 Key Metrics described in Accelerate: Deployment Frequency, Lead Time For Changes, MTTR, Change Failure Rate.

We also use some project-specific metrics related to performance.

Coupling and Cohesion across modules.

Eoin Woods – CTO @Endava

People use some code structure metrics and we have quite sophisticated tooling to derive patterns and anti-patterns, but these are just about code structure, not really the architecture.

Some architects also use structural measures on architectural structures (e.g. complexity of functional module dependencies).

Most of the architects spend more time looking at “metrics” related to qualities (transactions per second, mean-time-to-recover, effort to add a feature of a certain type). However, these aren’t measurements of the architecture itself, rather of the effect of the architecture.

I think this is really an active area of research (or could be). Some academics are looking at it.

It would be good to get reliable ways to measure or estimate coupling, cohesion, change propagation, inter-module interactions, compliance with patterns or styles and so on.

The practical problem is how do you measure this stuff? That implies a complete and accurate architecture model. Which normally doesn’t exist (partly because it is too expensive to justify based on its benefits). So you’re left with measuring the code, which we do already. But that means you can’t measure until it is built and some of the things I’m suggesting aren’t found in the code. So I think this is a really difficult (and so interesting) problem.

Alexander von Zitzewitz – CEO @Hello2morrow

Average Component Dependency
Relative Cyclicity
Size of Biggest Cycle Group
Structural Debt Index
Maintainability Level
Cyclomatic Complexity
Max Nesting Depth

I believe metrics regarding cyclic dependencies and coupling are very good indicators of architectural erosion. Since this is the most toxic form of technical debt, using those metrics will have a very good ROI.
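As a rough illustration of two of the metrics listed above, the following Python sketch computes Average Component Dependency and the size of the biggest cycle group from a dependency graph. It assumes the common definition of Average Component Dependency (the average number of components each component depends on directly or indirectly, counting itself) and treats a cycle group as a set of two or more mutually reachable components. The function names and the graph format are hypothetical; commercial tools compute these metrics on real code models and may define them in more detail.

```python
def reachable(graph: dict[str, set[str]], start: str) -> set[str]:
    """All components reachable from start, including start itself."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def structure_metrics(graph: dict[str, set[str]]) -> dict:
    """graph maps each component to the components it depends on directly."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    if not nodes:
        raise ValueError("graph has no components")
    reach = {n: reachable(graph, n) for n in nodes}

    # Average Component Dependency: mean size of the reachable sets.
    acd = sum(len(r) for r in reach.values()) / len(nodes)

    # Cycle groups: components that can reach each other in both directions.
    groups, assigned = [], set()
    for n in sorted(nodes):
        if n in assigned:
            continue
        group = {m for m in reach[n] if n in reach[m]}
        assigned |= group
        if len(group) > 1:
            groups.append(sorted(group))

    return {
        "average_component_dependency": acd,
        "cycle_groups": groups,
        "biggest_cycle_group": max((len(g) for g in groups), default=0),
    }
```

The reachability approach is quadratic in the worst case, which is fine for a sketch; a strongly connected components algorithm would scale better on large code bases.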

Mark Richards – Independent Consultant @developertoarchitect and Hands-on Software Architect

I currently measure performance and scalability through the use of automated continuous fitness functions running in production. When a negative trend is identified through the fitness function, it notifies me via email.
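A trend-based fitness function of this kind could look roughly like the sketch below. It is only an assumption of how such a check might be built: it fits a least-squares line through recent response-time samples and reports a failure when the slope exceeds a threshold. The function names, the millisecond unit and the threshold value are hypothetical, and the email notification is left out.

```python
def slope(samples: list[float]) -> float:
    """Least-squares slope of the samples against their index."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def response_time_fitness(samples_ms: list[float], max_slope_ms: float = 1.0) -> bool:
    """Return True while response times are not degrading faster than allowed.

    max_slope_ms is a hypothetical budget: the tolerated growth in
    milliseconds per sample interval.
    """
    return slope(samples_ms) <= max_slope_ms
```

A scheduled job could call `response_time_fitness` on the latest window of measurements and trigger a notification whenever it returns False.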

I also track component coupling (incoming, outgoing, and total), component size (number of statements across all classes in the component), and percentage of code the component represents across the entire code base. A component here is a namespace or Java package.
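The component measures described here can be sketched in a few lines of Python. The input format is an assumption: `dependencies` maps each component to the components it uses, and `statements` gives the statement count per component, with every component present in `statements`. The function name `component_report` is hypothetical.

```python
def component_report(
    dependencies: dict[str, set[str]],
    statements: dict[str, int],
) -> dict[str, dict]:
    """Coupling and size figures per component.

    Assumes every component that appears in dependencies is also
    a key of statements.
    """
    incoming = {name: 0 for name in statements}
    for source, targets in dependencies.items():
        for target in targets - {source}:   # ignore self-references
            incoming[target] += 1

    total_statements = sum(statements.values())
    report = {}
    for name, size in statements.items():
        outgoing = len(dependencies.get(name, set()) - {name})
        report[name] = {
            "incoming": incoming[name],
            "outgoing": outgoing,
            "total": incoming[name] + outgoing,
            "statements": size,
            "percent_of_code": 100 * size / total_statements,
        }
    return report
```

Tracking this report over time makes it easier to spot a component that keeps growing or that many others are starting to depend on.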

Ones that impact the modularity and cohesiveness of the system. Component complexity (cyclomatic complexity) is a good metric that points to overall maintainability of the code. I recommend tracking the following metrics from a structural standpoint:

– Component size (number of statements)

– Component coupling

– Component cohesion

– Component complexity (WMC or Avg Complexity)

Dr. Carola Lilienthal – Managing Director @WPS

MMI (Modularity Maturity Index) – this index comprises many other metrics and gives a good indication whether a system is modular or not.

Test coverage
Size metrics on all levels
Dependency metrics on all levels

Vlad Khononov – Senior Cloud Architect @DoiT International

Cyclomatic complexity
Coherence

João Rosa – Strategic Software Consultant @Xebia

From DORA:

Lead time for changes
Deployment frequency
Time to restore service
Change failure rate

Also, I use Mean time to detect.

Under the hood, we can use more technical metrics, such as cyclomatic complexity, number of CVEs, security, etc.

But most important is to link the first 5 metrics to the business KPIs for the software. That is the most important reason. Anything that makes sense from a business perspective. For example, if you are in a highly regulated industry, such as payments, you need to see transactions as a whole. It can involve latency or failure rate.

On the other hand, if you are in a forecasting business, data accuracy is important. So choose metrics that can act as proxies for the state of the business.

Last but not least, team metrics. Without people, we don’t have complete systems architecture. Happiness, autonomy, reteaming are things that I’m most interested in to have a healthy work environment.

Alberto Capellini – Cofounder & CTO @Hoxell

Test coverage is the metric I find most useful, as it can be checked and improved, and it can add value to the final product.

I would rather add Agile process metrics, like cycle time and team velocity

Alberto Villar – Manager @Vidactive

Lead time, to understand how long it takes to get from an idea to a product
Cycle time related to MTTR, crash rate and new features
Team velocity
Pure application crash rate
Security incidents

Edwin Maldonado – Solutions Architect @Mittelabs

Resilience, maintainability, scalability, testability, affordability (measure with numbers not only the costs in technology but also the people you need to keep it running)

Christian Ciceri – Software Architect @Apiumhub

As an overview: all the -ilities, trying to quantify them, and when I can’t, qualitatively (but using mathematical reasoning). Also, important to quantify bugs and defects, as external quality is a good indicator of internal quality process, automations and methodologies, and indirectly of modifiability, which is (IMHO) maybe the most important attribute. Also, the frequency of (successful) deployments.

Théo Coulin, Maxence Detante, William Mouchère, Fabio Petrillo – Département de génie informatique et génie logiciel, Polytechnique Montréal

The most used metric is coupling, along with its complement, cohesion. It is important to have high cohesion within modules and low coupling throughout the architecture. Low coupling is very important because it diminishes the risk of ripple effects when making changes in the program, and so keeps the architecture maintainable. Maintainability is the most important quality that low coupling brings.

Another very interesting approach is to measure decoupling instead of coupling, that is, the modularity of the architecture. Good modularity means easy maintenance and reusability.

Another very important metric is complexity. It affects the understandability of the architecture and possibly the performance. It can be expressed by the number of classes in the architecture, or the number of links between classes in the architecture.

Change Propagation evaluates the maintainability of an architecture based on the probability that a change in a class will have an impact on other classes.

Another one is a matrix of change propagation probabilities between components, which lets the architect assess, at a glance, the difficulty and cost of maintenance operations.
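The text does not say how the propagation probabilities are obtained. One possible approach, shown here purely as an assumption, is to estimate them from version control history: the probability that a change in component A propagates to B is approximated by the share of commits touching A that also touched B. The function name and the input format (one set of changed components per commit) are hypothetical.

```python
from collections import Counter
from itertools import permutations


def change_propagation_matrix(commits: list[set[str]]) -> dict[str, dict[str, float]]:
    """Estimate P(B changes | A changes) from co-changes in commit history.

    This co-change estimate is an assumption; it is not the method
    described in the surveyed research.
    """
    changed = Counter()      # commits touching each component
    co_changed = Counter()   # commits touching an ordered pair (a, b)
    for components in commits:
        changed.update(components)
        co_changed.update(permutations(sorted(components), 2))

    names = sorted(changed)
    return {
        a: {b: co_changed[(a, b)] / changed[a] for b in names if b != a}
        for a in names
    }
```

Rows with many high values point to components whose changes tend to ripple through the system, which is where maintenance is likely to be most expensive.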

Design Pattern Density measures the percentage of classes in the architecture that are part of a design pattern. It helps the designer evaluate the maturity of an architecture: the more mature an architecture is, the more design patterns are put into it, and the higher the design pattern density. It works very well when applied to frameworks, which should be very densely filled with design patterns. A framework with high pattern density is more understandable and likely performs better. This metric seems quite hard to use, though, as it does not express the maturity of the design on a fixed scale, but rather on a scale that depends on the problem the software deals with.

Metrics such as utilization and throughput assess the performance of a component-based, container-hosted solution. This requires prior modeling of several critical parts of the future system, such as the component that receives all requests and redirects them to the corresponding service, or the database activity. It therefore requires additional effort, but seems to provide an accurate profile of the platform performance through response time prediction.

For modularization, there are metrics that measure the coupling between modules, count the inter-module calls that are not made through the defined API, detect inheritance between classes of different modules, check that classes outside a module use its highest level of abstraction, and check that interfaces actually hold a single responsibility.

There are three metrics to evaluate different aspects of UML diagrams. The first is Information Content (IC). It is based on a hierarchy and weighting of the different elements in UML diagrams, and defines the quantity of information a diagram or the architecture conveys. The higher the IC, the higher the amount of information delivered. The second, Visual Effect, is more interesting as it is original and helps assess the quality of the architecture in terms of understandability. The higher the visual effect, the harder it is for a human being to comprehend the diagram at a glance. The third metric is close to being a coupling metric: the Connectivity Degree measures the number of associations relative to the number of entities in the diagram. Different types of associations have different weights.

Another interesting thing to mention is the architectural Software Quality Assurance technique (aSQA), a lightweight technique whose goal is to assess the quality of a software architecture and to prioritize what to work on. It also makes it possible to balance the quality attributes of the architecture. To be easy to use, the technique needs a component-based or service-based architecture. The next step is to define a mapping of quality measurements to aSQA levels. Since the aSQA technique puts emphasis on balance and prioritization between qualities, it is necessary to use the same scale for all metrics.

Also, there are four specific service-oriented (SO) metrics to fulfill the specific needs of a design. They evaluate, respectively, service granularity, service coupling, service cohesion and business entity convergence. The proposed model is conducted in three steps: 1) Modeling: identify services in the business process and model the structure of the identified portfolio; 2) Measuring: use the model to measure quality features of the services in the portfolio with the help of the corresponding design metrics; 3) Evaluation: evaluate the service set as a whole by normalizing the metrics and adding weights, which any designer can customize to adapt the model to their needs.
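The evaluation step, normalizing the metrics and combining them with designer-chosen weights, can be sketched as follows. The text does not specify the normalization or the weighting scheme, so min-max scaling and a weighted average are assumptions here, as are the function names and the example metric names. The sketch also does not decide whether a high score is good or bad; that depends on how each metric is defined.

```python
def normalise(values: dict[str, float]) -> dict[str, float]:
    """Min-max scale raw metric values to the range [0, 1] (an assumed scheme)."""
    low, high = min(values.values()), max(values.values())
    if high == low:
        return {name: 0.0 for name in values}
    return {name: (v - low) / (high - low) for name, v in values.items()}


def evaluate_portfolio(
    metrics: dict[str, dict[str, float]],
    weights: dict[str, float],
) -> dict[str, float]:
    """Combine normalised metrics into one weighted score per service.

    metrics maps metric name -> service name -> raw value;
    weights maps metric name -> designer-chosen weight.
    """
    total_weight = sum(weights.values())
    scores: dict[str, float] = {}
    for metric, per_service in metrics.items():
        for service, value in normalise(per_service).items():
            scores[service] = scores.get(service, 0.0) + weights[metric] * value / total_weight
    return scores
```

For example, a designer who cares mostly about coupling could pass `weights = {"coupling": 3, "granularity": 1}` so that coupling contributes three quarters of each service's score.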

Dave Farley – Director @Continuous Delivery

Stability: how stable is our code. Measures quality of output and detects rate & time to recover.
Throughput: measures efficiency of our approach and tracks lead time & deploy frequency

Software architecture metrics by other anonymous software architecture experts

Lead time
Mean life of software applications and services between deploys, i.e, how often changes happen.
Code coverage
Response time
Throughput
MTBF
Up-time
Requiring documentation of architecture
Afference/efference
Number of unit test on existing code and new code
Test Coverage on existing code and new code
Number of module / package / type of clients impacted by a bug or a requirement change
Measure interfaces quality
DDD + Microservices
Coupling, Complexity
Coverage, weighted methods
Maintainability, extensibility, reusability, complexity, design pattern density, modularization
UML diagrams evaluation, performance, simplicity
Number of new bugs per sprint.
Number of fixed bugs per sprint
Fitness Functions
Code coverage, crash free sessions
CPU usage, latency, battery usage, data consumption, memory usage
The domain of the project, the volume of users, the complexity of the business logic, the capability of extensibility or scalability, among others
The customer use case flows, it could say to us if we could split services or not
Cohesion
Number of components
Dead code
Untested code
Level of automation
Extensibility
Cost of adding new features
Integration testing coverage
JSP+bootstrap+jQuery
Serverless

Also, several useful tools have been mentioned:

Google Lighthouse
Datadog
Rollbar
Prometheus

Having key software architecture metrics and tools may make architecture checking much faster and less costly. This makes it possible for software architects to run checks from the start of the project and throughout the life of the software. Moreover, the architecture can be evaluated at each sprint to make sure it’s not drifting into something impossible to maintain. It also allows comparison between architectures to pick the one that best fits the project’s requirements.

I hope you found this article useful! If you would like to add something, please feel free to share it in the comments section below or by contacting us directly.

Author

Ekaterina Novoseltseva

Ekaterina Novoseltseva is an experienced CMO and Board Director, and a professor at prestigious business schools in Barcelona, where she teaches digital business design. Right now Ekaterina is CMO at Apiumhub, a software development hub based in Barcelona, and organiser of the Global Software Architecture Summit. Ekaterina is proud of having done software projects for companies like Tous, Inditex, Mango, Etnia, Adidas and many others. She took an active part in opening the Apiumhub office in Paseo de Gracia and in helping companies like Bitpanda open their tech hubs in Barcelona.