Friday 6 May 2011

Technical Debt

:: Definitions

Technical debt is a high-level metaphor used to describe the quality of software source code. It was first introduced by Ward Cunningham back in 1992 and has since taken shape in many different forms. Today it can be used informally to describe the state of a software application to a non-technical audience, or more formally for measuring technical aspects of software source code or the specific tasks required to "clean" it up. Given that the term "technical debt" can carry such a wide array of definitions, it is worth reading several different sources to get a deeper sense of what it means.
A software application is only as good as the team that designs, builds and tests it, so technical debt has many sources: it can be introduced by architects, developers or testers. It usually starts out as a small issue, but left unmanaged it can grow until it severely slows a development team's ability to produce software in a timely manner. Technical debt can exist at any level of the software application stack, whether embedded in the underlying architecture, classes, methods, algorithms, documentation or build systems. It is therefore important in software projects to know not only which requirements have yet to be fully implemented, but also what technical debt lurks within the software, so that it can be managed effectively.

:: Measurement

The holy grail for measuring technical debt would be to measure the impact of a single software change in terms of both the value it produces (most likely profitability) and the risk it either increases or mitigates. The ideal software change would maximize overall value while reducing overall risk. As an industry, software development is a very long way from this fine-grained approach to measuring technical debt, but there are many techniques that can be used, either individually or in combination, to measure the potential burden a piece of software carries and so provide better insight. Like tracking finances, you never really know where or how much money is being spent until you keep track of it; the same is true for technical debt: you never really know which software components need attention, or how bad they actually are, until you measure them directly.
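
As a purely illustrative sketch of that ideal (being able to estimate these numbers per change is itself the missing piece; the names and weighting below are assumptions, not an established method), the two quantities might be combined like this:

# Illustrative only: assumes we could estimate, for a single change, the
# value it delivers and the risk it adds or removes, numbers the industry
# largely cannot produce today. The risk_weight trade-off is made up.
def change_score(value_delivered, risk_delta, risk_weight=1.0):
    """Score a software change; higher is better.

    value_delivered: estimated value of the change (e.g. profitability)
    risk_delta: estimated change in overall risk (positive = riskier)
    """
    return value_delivered - risk_weight * risk_delta

# A change that adds value while also reducing risk scores highest.
print(change_score(value_delivered=10.0, risk_delta=-2.0))  # 12.0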

There are a number of metrics that can be used to measure specific aspects of technical debt across software source code. Unfortunately, none of these metrics provides a complete picture of the technical debt in a piece of software, but taken together they can offer key insights into its overall quality. Care should be taken to fully understand what a metric is in fact measuring, and then not to overstate its importance. I have seen many development teams become religious over certain metrics and subsequently implement policies to stay within some arbitrary bounds; this often leads to development practices that are detrimental to the actual software being developed. Measuring technical debt is more of an art than a science, but quantitative values can still be helpful in the arts.

Here are some of the metrics I have used to measure source code:
  • SLOC (Source Lines of Code): The size of a software application in SLOC isn't a great metric in and of itself, but when used as a comparison mechanism (e.g. the ratio of production code to testing code) it becomes a good indication of relative code quality (see the SLOC sketch after this list).
  • Cyclomatic Complexity: A great indication of how complicated the code paths are in a portion of code (usually a method). For methods, it is also a very good indication of how many unit tests are required to effectively traverse, and thus test, each code path. It is also one of the few metrics statistically correlated with buggy software (see the complexity sketch after this list).
  • Coupling: A number of metrics are defined in this category, but of particular interest are fan-in and fan-out, which measure the number of calling or called routines for a portion of code (see the coupling sketch after this list).
  • Branch Coverage (a form of code coverage): Measures how much of the production code is exercised by unit tests. If you think you have tested a method or class sufficiently, this is a good metric to observe to verify that assumption (see the coverage sketch after this list).
  • Duplicate code: A particularly interesting metric, because the code that is duplicated can in fact be of high quality. The problem arises when the duplicated code needs to be modified due to a bug: the developer must make sure that every occurrence of the bug is fixed, otherwise one occurrence will be fixed while one or possibly many others remain. A false sense of security that a bug is fixed when it is not can be a disaster in certain contexts (see the duplication sketch after this list).
  • Nested depth level: Some algorithms require nested 'if', 'for' or 'while' statements, and a large number of nested statements (also called the depth level) increases the complexity of a portion of code and makes it harder for any one developer to understand without considerable effort, leading to possibly buggier code (see the depth sketch after this list).
  • Violations of coding standards: If a coding standard is defined, there are usually a number of tools available that either enforce compliance with the standard or simply count the number of infractions.
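
Here is a minimal sketch of the SLOC ratio idea, assuming Python sources with production code under a "src" directory and tests under "tests" (both directory names are assumptions) and a crude definition of a source line as any line that is non-blank and not a comment:

from pathlib import Path

def sloc(root):
    """Count non-blank, non-comment lines across all .py files under root."""
    total = 0
    for path in Path(root).rglob("*.py"):
        for line in path.read_text().splitlines():
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                total += 1
    return total

production, tests = sloc("src"), sloc("tests")
print(f"test-to-production SLOC ratio: {tests / production:.2f}")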
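
Cyclomatic complexity can be approximated for a Python function with a short walk of its syntax tree. This sketch counts only the common decision points (if/for/while, exception handlers and boolean operators); dedicated tools such as radon or mccabe handle many more constructs:

import ast
import textwrap

def cyclomatic_complexity(func_source):
    tree = ast.parse(func_source)
    complexity = 1  # a single path through the code to begin with
    for node in ast.walk(tree):
        if isinstance(node, (ast.If, ast.For, ast.While, ast.ExceptHandler)):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            complexity += len(node.values) - 1  # each extra and/or operand
    return complexity

sample = textwrap.dedent("""
    def classify(x):
        if x < 0 and x != -1:
            return "negative"
        for i in range(x):
            if i % 2:
                return "odd seen"
        return "done"
    """)
print(cyclomatic_complexity(sample))  # 5: base 1 + two ifs + one for + one 'and'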
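
Fan-in and fan-out can likewise be approximated from the syntax tree. This sketch counts only direct calls by simple name between a module's functions, ignoring methods, imports and higher-order uses, so treat its numbers as a lower bound:

import ast
from collections import defaultdict

def coupling(module_source):
    """Return (fan_in, fan_out) maps of function name -> set of names."""
    tree = ast.parse(module_source)
    fan_out, fan_in = defaultdict(set), defaultdict(set)
    for func in ast.walk(tree):
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                fan_out[func.name].add(node.func.id)
                fan_in[node.func.id].add(func.name)
    return fan_in, fan_out

sample = "def a():\n    b()\n    c()\n\ndef b():\n    c()\n"
fan_in, fan_out = coupling(sample)
print(sorted(fan_out["a"]), sorted(fan_in["c"]))  # ['b', 'c'] ['a', 'b']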
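
Branch coverage is best gathered with a dedicated tool rather than hand-rolled. The sketch below uses the Python API of recent versions of coverage.py (the command-line equivalent is "coverage run --branch" followed by "coverage report"); "myapp" and its "run_tests" function are placeholders for your own code and tests:

import coverage

cov = coverage.Coverage(branch=True)  # branch=True enables branch tracking
cov.start()

import myapp           # placeholder: import under measurement so it is traced
myapp.run_tests()      # placeholder: exercise the code somehow

cov.stop()
cov.save()
cov.report(show_missing=True)  # per-file statement and branch summary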
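
A toy duplicate-code detector makes the idea concrete: hash every window of consecutive stripped source lines and report any window that occurs more than once. Real clone detectors also normalize identifiers; the window size and file names here are arbitrary:

from collections import defaultdict

def duplicate_windows(filenames, window=6):
    seen = defaultdict(list)  # window of lines -> where it occurs
    for name in filenames:
        with open(name) as f:
            lines = [line.strip() for line in f if line.strip()]
        for i in range(len(lines) - window + 1):
            seen[tuple(lines[i:i + window])].append((name, i))
    return {k: v for k, v in seen.items() if len(v) > 1}

# "a.py" and "b.py" are placeholders for your own source files.
for chunk, places in duplicate_windows(["a.py", "b.py"]).items():
    print(places)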
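
Nesting depth is also easy to approximate with a syntax-tree walk; this sketch reports the deepest nesting of block statements ('if', 'for', 'while', 'try', 'with') in a piece of Python code:

import ast

BLOCKS = (ast.If, ast.For, ast.While, ast.Try, ast.With)

def max_depth(source):
    def walk(node, depth):
        here = depth + isinstance(node, BLOCKS)  # bool adds 0 or 1
        return max([here] + [walk(c, here) for c in ast.iter_child_nodes(node)])
    return walk(ast.parse(source), 0)

print(max_depth("for i in range(3):\n    if i:\n        while i:\n            i -= 1\n"))  # 3
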
Apart from simply using metrics to quantify the technical debt of a software application, I have also found the need to capture various software "cleanup" tasks. Anything from dead code to code refactoring to build system improvements can be added to a ticket in your bug tracking system, labelled as "technical debt" and estimated accordingly. By doing this, the technical debt within an application can be tracked over time and observed as it either increases or decreases. It can therefore be managed in a pragmatic way instead of being avoided or simply left unknown.
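
As a hedged sketch of what that tracking might look like (the ticket format below, a label plus an estimate in days and an opening month, is an assumption; substitute an export or query from your own bug tracker):

from collections import defaultdict

tickets = [  # hypothetical export from a bug tracking system
    {"label": "technical debt", "estimate_days": 2, "opened": "2011-03"},
    {"label": "technical debt", "estimate_days": 5, "opened": "2011-04"},
    {"label": "feature",        "estimate_days": 3, "opened": "2011-04"},
]

debt_by_month = defaultdict(int)
for ticket in tickets:
    if ticket["label"] == "technical debt":
        debt_by_month[ticket["opened"]] += ticket["estimate_days"]

for month in sorted(debt_by_month):
    print(month, debt_by_month[month], "days of known debt opened")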

:: Tooling

There is a whole range of open-source and commercial static analysis tools available to capture these metrics and enforce compliance with them: