Code Change Impact Analysis

Measuring the impact of user/customer requirement changes upon code

No single metric is likely to be sufficient for a good profile. Otherwise, somebody could "game the system" by focusing on one narrow metric alone. It may ultimately be left to the reader of the study to apply the metric weighting factors they deem most appropriate. The general formula is:

  R = w1*m1 + w2*m2 + w3*m3 ...etc...

where wN is a weight and mN is a metric score. A zero weight would render a given metric "ignored".
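
As a minimal sketch of the formula above, the following Python computes R from parallel lists of weights and metric scores. The three metrics and their numbers are made up for illustration; they are not from any actual study.

```python
def weighted_score(weights, metrics):
    """Return R = w1*m1 + w2*m2 + ... for parallel sequences."""
    if len(weights) != len(metrics):
        raise ValueError("weights and metrics must have the same length")
    return sum(w * m for w, m in zip(weights, metrics))


# Hypothetical metric scores for one proposed change:
# statements changed, unit tests broken, modules touched.
metric_scores = [12, 3, 2]

# A zero weight means the third metric ("modules touched") is ignored.
weights = [1.0, 2.0, 0.0]

R = weighted_score(weights, metric_scores)  # 12*1.0 + 3*2.0 + 2*0.0 = 18.0
```

A reader who disagrees with the weights can call weighted_score again with their own list and compare the results.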

More on this can be found at:

Possibly, how many UnitTests are broken?

Isn't the cost of a proposed change covered as part of the PlanningGame?

It's really a bottom-up approach to requirements analysis on a deployed and possibly mature product. The client observes A, which seems wrong, and proposes behavior B. The developer finds a code section eligible for making B replace A. Now the fun begins. If that change is made, how does it impact the long-standing, undocumented requirements of the system in subtle ways? It's all about the dense interconnectedness of requirements, a topic young software pundits love to avoid. Just run the unit tests? Maybe, but what if it's not a unit that breaks?


If something needs to change, the first things that actually get changed are the unit tests. So, yes, you really can "just run the unit tests," and when everything passes again, you're good to go. Secondly, you should be working in a branch of the project, so as to not affect others. You'll want to periodically poll from the master branch to keep your stuff maximally synchronized with the working project, of course, to minimize integration issues. At any rate, it's very doable, and has been done before where I work. Your presumption is that you change production code without touching the unit tests in the process (preferably, first). If you do that, you get what you deserve. --SamuelFalvo

{How do you prove your unit-tests?}

[Well, if one was refactoring (even just for aesthetics, changing variable names and moving code into smaller methods/procedures) and/or replacing algorithms with better ones (say, more reliable or faster ones), and if the same behavior was expected out of a unit, then many tests may not change (especially since units always wrap other units, no matter how hard you try to dissect them all into individual units - that is impossible, since there are library calls and units relying on other units, etc). As for just running the unit tests and assuming all is okay if everything passes... bleh, I'll admit that I think unit testing creates some egotistical false sense of security (just like - if it compiles, it's good!). Hey, the tests pass! That means my program is perfect - neener neener neener! (Not.)]

[This above issue bit me several times where I've reported a bug to unit testing advocates in charge of a code base - they simply write some dumb unit tests to prove the bug doesn't exist, using the simplest code possible - but their simple code doesn't allow the more complex problem to show up, which involves more than one unit integrating with another (missing factors). Only I, the user, could write a more complex test and beat it into them that their overly simplified tests were not showing the real problem of the working system. A problem, therefore, with unit testing - is that sometimes the unit tests are too dumb and simple and don't take into account the bigger picture.]

They are certainly at fault here. Don't expect your unit tests to do the job of SystemTests. These are the sorts of small programs attached to BugZilla bug reports, which treat the system as a whole as a BlackBox, encapsulating overall desired behavior.

See also:

If something is broken, it is eventually going to have to be fixed. IOW, code change. Over the long run, wide-spread or unintentional impacts will be counted. However, what is not measured is the "2:00am" factor. Some side-effects will cause minor disruptions and some major. Nor is the cost of finding the spot to change counted. It would be interesting to see designs that are easy to change but hard to find the spots to change, and vice versa (MentalIndexability).

I would suggest you add a "beeper-during-vacation" multiplier for changes you feel would cost more from that perspective. The reader may or may not agree with your markups, but at least you document your interpretation of the ranking. People will not always agree with your weightings because their experience may paint a different picture from yours. However, the important thing is to document your assumptions. A future reader may still find your analysis useful even if they plug in different weights. Think of the analysis as being a framework and not necessarily a final product, at least outside of your needs. Plus, it can narrow the points of contention so that people can focus on those specifics.
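
One way to picture the "beeper-during-vacation" multiplier is the sketch below. The ChangeEstimate class, its field names, and all the numbers are hypothetical; the point is only that the author's markup and its rationale are recorded, so a later reader can plug in a different multiplier.

```python
from dataclasses import dataclass


@dataclass
class ChangeEstimate:
    name: str
    base_cost: float          # score from the ordinary metrics
    beeper_multiplier: float  # author's judgment call; 1.0 = no extra disruption
    rationale: str            # the documented assumption behind the multiplier

    def adjusted_cost(self, multiplier=None):
        """Cost under the author's multiplier, or under a reader's override."""
        m = self.beeper_multiplier if multiplier is None else multiplier
        return self.base_cost * m


# Illustrative values only.
estimate = ChangeEstimate(
    name="reorder columns in schema",
    base_cost=10.0,
    beeper_multiplier=3.0,
    rationale="failures would surface in unattended overnight jobs",
)

author_view = estimate.adjusted_cost()     # 10.0 * 3.0 = 30.0
reader_view = estimate.adjusted_cost(1.5)  # 10.0 * 1.5 = 15.0
```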

Scoring Technique

Here's a rough model of how to put the above suggestions together from a relational perspective. Note that this is not promoting relational techniques; they are only used here as a way to show the relationships between the different factors. "Ref" suffixes refer to foreign keys.

 scenarios table
 --------
 codeSnippet  // if applicable
 styleGroup   // paradigm/technique identifier

 metrics table
 --------
 metricID
 metDescript  // example: "Number of statements changed"

 scores table
 --------
 scenRef
 metricRef
 score        // example: number value of statements changed

 scenarioLikelyhood table
 --------
 scenRef
 proponentRef // person or group making estimate
 probability  // 1.0 = 100%

 metricsRelevancy table
 --------
 metricRef
 proponentRef
 relevancy    // 0 to 1.0
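
To make the model concrete, here is a rough sketch using Python's built-in sqlite3 module. The table and column names follow the model above, except scenarioID, which is an assumed key for scenRef to point at. The proponents ("alice", "bob"), the style groups, and every number are invented for illustration. The query weights each score by that proponent's scenario probability and metric relevancy, then totals per proponent and style group; that particular combination rule is one plausible reading, not something the model prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scenarios (scenarioID INTEGER PRIMARY KEY,  -- assumed key
                        codeSnippet TEXT, styleGroup TEXT);
CREATE TABLE metrics (metricID INTEGER PRIMARY KEY, metDescript TEXT);
CREATE TABLE scores (scenRef INTEGER, metricRef INTEGER, score REAL);
CREATE TABLE scenarioLikelyhood (scenRef INTEGER, proponentRef TEXT,
                                 probability REAL);
CREATE TABLE metricsRelevancy (metricRef INTEGER, proponentRef TEXT,
                               relevancy REAL);
""")

# Illustrative data only: the same change scored under two style groups.
conn.executemany("INSERT INTO scenarios VALUES (?, ?, ?)",
                 [(1, None, "styleA"), (2, None, "styleB")])
conn.executemany("INSERT INTO metrics VALUES (?, ?)",
                 [(1, "Number of statements changed"),
                  (2, "Number of unit tests broken")])
conn.executemany("INSERT INTO scores VALUES (?, ?, ?)",
                 [(1, 1, 10), (1, 2, 2), (2, 1, 4), (2, 2, 6)])
conn.executemany("INSERT INTO scenarioLikelyhood VALUES (?, ?, ?)",
                 [(1, "alice", 0.5), (2, "alice", 0.5),
                  (1, "bob", 0.25), (2, "bob", 0.25)])
conn.executemany("INSERT INTO metricsRelevancy VALUES (?, ?, ?)",
                 [(1, "alice", 1.0), (2, "alice", 0.0),
                  (1, "bob", 0.5), (2, "bob", 1.0)])

# Expected weighted impact per proponent and style group.
rows = conn.execute("""
SELECT l.proponentRef, s.styleGroup,
       SUM(sc.score * l.probability * r.relevancy) AS impact
FROM scores sc
JOIN scenarios s ON s.scenarioID = sc.scenRef
JOIN scenarioLikelyhood l ON l.scenRef = sc.scenRef
JOIN metricsRelevancy r ON r.metricRef = sc.metricRef
                       AND r.proponentRef = l.proponentRef
GROUP BY l.proponentRef, s.styleGroup
ORDER BY l.proponentRef, s.styleGroup
""").fetchall()

results = {(proponent, style): impact for proponent, style, impact in rows}
conn.close()
```

With these made-up inputs, alice (who ignores broken tests) ranks styleB as cheaper, while bob (who weights broken tests fully) ranks styleA as cheaper, which mirrors how different proponents can reach opposite conclusions from the same scores.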

In the past with measurements roughly based on this model, different proponents have given widely different weights of relevancy and frequency. Each side usually accuses the other of biases such as a selective memory. Another explanation is that CompaniesHireLikeMinded such that people tend to end up in companies or teams with somewhat similar techniques, habits, and preferences.

For example, when we were studying the change impact of asterisks in SQL SELECT statements on this wiki, somebody brought up the case of a fixed-position array being used to store specific "cells" in rows (cell[1], cell[2], etc.). If the position of columns in the schema changed, it could obviously break a lot of code. But that practice is not something I see very often in production code, and so I would rank it with a low probability. However, some shops or some languages may tolerate or encourage such a practice. The model above allows each shop to plug in its own frequency estimates if it doesn't like those of the original "judges". It thus allows one to pick and choose which givens to accept and which to reject.
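
The fixed-position scenario can be reproduced in a few lines. This is only a sketch with a hypothetical customers table, using Python's sqlite3 module (note that Python indexes from 0, so cell[1] here is the second column). It runs the same SELECT * against two column orders and compares access by position with access by name.

```python
import sqlite3


def fetch_first(column_defs):
    """Create a throwaway customers table and return its single row
    both as a positional tuple and as a name-keyed dict."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows support index and name access
    conn.execute(f"CREATE TABLE customers ({column_defs})")
    conn.execute(
        "INSERT INTO customers (id, name, city) VALUES (1, 'Ann', 'Oslo')"
    )
    row = conn.execute("SELECT * FROM customers").fetchone()
    cells, named = tuple(row), dict(row)
    conn.close()
    return cells, named


# Same data, but the schema's column order differs.
cells_before, named_before = fetch_first("id INTEGER, name TEXT, city TEXT")
cells_after, named_after = fetch_first("id INTEGER, city TEXT, name TEXT")

positional_before = cells_before[1]  # 'Ann'
positional_after = cells_after[1]    # 'Oslo': silently the wrong column
by_name_before = named_before["name"]  # 'Ann'
by_name_after = named_after["name"]    # 'Ann': unaffected by the reorder
```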


That's an awful lot of thin tables.

Some designs call for it, some not. The entities just happen to be skinny in this case.

See also: ChangePattern, WhyNoChangeShootout, SoftwareDevelopmentAsInvesting, DecisionMathAndYagni


CategoryMetrics, CategoryChange, CategoryDecisionMaking

Last edited November 17, 2014