From xp mailing list:
Message: 5
Date: Thu, 9 Nov 2000 11:29:05 -0500 From: "Arrizza, John" <john.arrizza@marconi.com> Subject: RE: Adding Features To Legacy Codelong reply, bail now.
> -----Original Message----- > From: ajalis@twu.net [mailto:ajalis@twu.net] > Question: What is a good strategy for adding new features to old code > written by someone else that I don't understand very well.I've done this for two small apps (2-6K lines) and one bigger one (initially 65K) lines), so I speak from few experiences. My experiences may not hold for your app.
The overall rule I've come to is that small, gradual changes are best (surprise, surprise). However, you can sometimes get away with a global sweep of changes on small apps. I just haven't had the courage to try it on the big app.
When I've gone the incremental route here are some things I've observed:
> The code is unit-test unfriendly.I generally don't do them. I use functional tests. However, if a there is a utility class that is nicely encapsulated and it's intent is clear to me, I spend the time to write the UT.
Even then, I still hesitate to write the UT. A couple of times the UT showed some "bugs" in the class. I dutifully fixed them. Bad idea. Running the FTs showed that there was code dependent on how those classes worked albeit badly. It seems better to drive the refactoring from the other direction, from the top down. Sometimes refactoring high level code causes the methods of lower classes to be unnecessary. Sometimes the high level refactorings cause the code to call low level classes differently, removing the dependency on a low level "bug".
On the other hand, fixing low level bugs has high impact on the app as a whole. I find when the app runs "better" it somehow seems easier to guess how it should be running. So I am tempted to fix low level bugs...
BTW it pays to make sure the FTs run in 10-20 seconds. You might want to spend some time stubbing the DB calls your app makes for example.
> In the past, this is what I have done. <snip>Your method seems riskier than doing things incrementally.
> (so I have an easy way to back out and restore the system to its initial state > -- an exit strategy).Excellent point. I use my source control tool to help me out. I make a few fixes, run some tests. If the code is stable, I check everything in and apply a label so I can regress (i.e. small releases). It doesn't matter that the difference between labelled versions is small, what counts is that each version is stable. When I feel there has been significant improvement I make a "major" release. I then do a full(er) range of tests on that release. If possible I get QA or other people to run the release. If the "major" release holds, I blow away the "minor" labels and start again on small fixes.
> Question: Is there a better way to refactor? > If I could do that, I could do some simple refactoring on them - like > split them up into smaller subroutines. But wrapping the unit tests around > them is not obvious. How do you write unit tests around legacy code which is > not organized into neat objects?I rearrange the code so much that writing many unit tests doesn't make sense to me. I just make sure I have a good suite of FTs. I also run the FTs with the app in the debugger. I put a breakpoint on the code I just refactored to make sure that the code is being executed!
One skill you must have if you're going to do this is the ability to look at two pieces of code and *accurately* determine if they are the same or not. You're doomed if you can't do this for short snippets of code. If you can't, you must have UTs to back you up. This also means that you can't refactor in languages that you're not well versed in. In short, you've got to be honest with yourself about your skill level.
I can do this for 4-10 lines of C++ max with a small (and I hope well-known) assumption of risk. After that it becomes too complicated to verify mentally the two snippets are identical. The smaller the snippets I refactor, the easier it is verify they're ok (without a UT that is).
Extracting a method however, is *nearly* risk free. If I'm worried about extracting a method in C++, I first put braces { } around the code I want to extract. I verify that all the variables and temps it uses are within the braces {}. If not, I move them in. I test and if it still works ok, I extract the method.
I still get bitten every so often of course but I've only changed a few lines so it's easy to track down where the problem is. If I can't find the problem, I don't try to fix it for very long. I back out to the last minor release I did and reapply my changes. The reasoning is: I've screwed up, I'm anxious, the worst thing for me to do is take on an even more complicated task. I swallow the lost time and go back to a known good version.
I found the *best* refactorings to do are:
I view the other refactorings as a way to get the code to a spot where I can extract a method or a class. I find renaming variables, methods and temps are great ways to help ease the burden of reading the code. The new names also serve to remind me of my guesses as to the code's intent as I continue refactoring. If I'm lucky, I "gather" up enough of the code's intent such that I can identify and extract common code into a new class.
Good luck,
-- John
I'm often working on legacy code; around 300k to 500k lines for single person projects, to millions of lines for multi-person projects. The most important things to do when you walk in the door are...
When you break the app while fixing and improving structure, and it's going to happen from time to time, you can, with a straight face, say that the app broke because of the new functionality you were implementing at their request -- you don't have to mention the refactoring. Yes it's underhanded and political, but it's a survival strategy.
-- AnonymousCoward (Nov 16, 2000)