What Is Refactoring

: "Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure." -- MartinFowler in RefactoringImprovingTheDesignOfExistingCode

If ReFactoring is an AmeliorationPattern, then IfItIsWorkingDontChange is an AntiPattern? -- GeraldoXexeo

If you have a poorly factored program that does what the customer wants and has no serious bugs, for the love of Mike leave it alone! When you need to fix a bug or add a feature, you RefactorMercilessly the code that you encounter in your efforts. Thus, RefactorMercilessly can live in harmony with IfItIsWorkingDontChange. -- rR

IME, the 'customer' benefiting from refactoring is my coworker, the person who has to add a new feature, or understand an old one. IfItIsWorkingDontChange applies to the maintenance programmers, too. If the first spike winds up being some ugly piece of spaghetti, for whatever reason, then we should be allowed to untangle it when the opportunity presents itself. Code stops working when the customer changes their mind. If we have an agile process, the artifacts can change quickly and frequently. 'Refactoring' is about some ways this is done correctly. -- SkipSailors

Refactoring is typically done in small steps. After each small step, you're left with a working system that's functionally unchanged. Practitioners typically interleave bug fixes and feature additions between these steps. So refactoring doesn't preclude changing functionality, it just says that it's a different activity from rearranging code.

The key insight is that it's easier to rearrange the code correctly if you don't simultaneously try to change its functionality. The secondary insight is that it's easier to change functionality when you have clean (refactored) code.

Refactoring is a kind of reorganization. Technically, it comes from mathematics when you factor an expression into an equivalence - the factors are cleaner ways of expressing the same statement. Refactoring implies equivalence; the beginning and end products must be functionally identical. You can view refactoring as a special case of reworking (see WhatIsReworking).

Practically, refactoring means making code clearer and cleaner and simpler and elegant. Or, in other words, clean up after yourself when you code. Examples would run the range from renaming a variable to introducing a method into a third-party class that you don't have source for.

Refactoring is not rewriting, although many people think they are the same. There are many good reasons to distinguish them, such as regression test requirements and knowledge of system functionality. The technical difference between the two is that refactoring, as stated above, doesn't change the functionality (or information content) of the system whereas rewriting does. Rewriting is reworking. See WhatIsReworking.

Refactoring is a good thing because complex expressions are typically built from simpler, more grokable components. Refactoring either exposes those simpler components or reduces them to the more efficient complex expression (depending on which way you are going).

For an example of efficiency, count the terms and operators: (x - 1) * (x + 1) = x^2 - 1. Four terms versus three. Three operators versus two. However, the left hand side expression is (arguably) simpler to understand because it uses simpler operations. Also, it provides you more information about the structure of the function f(x) = x^2 - 1, like the roots are +/- 1, that would be difficult to determine just by "looking" at the right hand side.

A lot of people want you to RefactorMercilessly, which means basically clean up after yourself in code after every edit, and don't stop until you can't clean any more.

And, hey, you already do it. Everyone refactors (or at least reworks) as a matter of solving problems. However, refactoring heuristics (aka RefactoringPatterns) have come under study recently. Hence the whizbang name for such a simple concept.

I suggest you read MartinFowler's excellent RefactoringImprovingTheDesignOfExistingCode. The code examples may be in Java, but it's not language-dependent.

You can also refactor other things besides formal expression languages. Like DocumentRefactoring. I refactored this definition several times in order to group similar ideas into their related paragraphs. Of course, I call that reorganization. It's all similar (except where it's different).

One more thing: pattern folk sometimes point out how patterns emerge from refactoring. In other words, you don't design from patterns, you refactor to patterns. In fact, refactoring is a process for emergent design if you want to think of it like that.

The shift from StructuredProgramming to ObjectOrientedProgramming is a fundamental example of refactoring. You rewrite from a style where you have a data structure blob and lots of calls to functions taking blob as a first argument [doThisTo(blob,param1); doThatTo(blob,param2);] into method calls [blob->doThis(param1); blob->doThat(param2);].

This change is, linguistically, using explicit subjects. In the StructuredProgramming Age, all the commands were toward the machine: (You = machine) doThisTo blob with param1. Now with ObjectOrientedProgramming, we are making orders not directly to the machine but to imaginary objects we made from problem domains: blob should(or will) doThis with param1. Notice the change in the word order. This is a great paradigm shift; from verb-centric to noun-centric. -- JuneKim

This does not change what the code does, but the code is cleaner as a result; and you unlock uncountable opportunities for further structural improvement - which opportunities end up embodied in OO languages, and give rise to a powerful theoretical foundation.

How exactly is it "cleaner"? One advantage of such approach is that you can do this:

  doSomething(blob1, blob2, blob3, etc....)

The OneAndOnlyOneNoun assumption of much of OO is not very realistic in my opinion. Verbs are more central to English than nouns. The (primary) verb is usually at the top of the parse tree, with the possibility of subject and object: 2 nouns.

That's a pretty big generalization to make about natural language, isn't it? What about a sentence like "I shouldn't run before I crawl, then walk" - how's the verb/noun ratio then?

That does not seem like a legitimate sentence. Perhaps a better example would be "Eat, drink, and be merry, for tomorrow we are outsourced to cheap overseas labor." Either way, this all sounds like a HolyWar about whether nouns are better than verbs in software design.

"I think the point of the "OneAndOnlyOneNoun assumption" has more to do with the real-world correlates of nouns and verbs than with their linguistic ones. In other words, one "thing" (noun) can perform many distinct "behaviors" (verbs), so our programming model ought to reflect this. Arguing about how well a programming language reflects "real world" English syntax seems beside the point to me. If English syntax was even remotely usable in a programming context, we wouldn't have had to invent new languages from the ground up."

 public class Ego
 {
private boolean hasRun = false,
                        hasCrawled = false,
                        hasWalked = false;

public void run()
{
 if (!hasCrawled || !hasWalked)
throw new IllegalStateException;
 // ... running ....
}

public void crawl()
{
 ...
 hasCrawled = true;
}

public void walk()
{
 ...
 hasWalked = true;
}
 }

Intransitive verbs have only one noun - the subject. The sentence Francis suggested has three verbs and one referent (speaker) - i.e., three pronouns of identical reference. -- TomRossen

Re: "I think the point of the "OneAndOnlyOneNoun assumption" has more to do with the real-world correlates of nouns and verbs than with their linguistic ones."

I disagree that OO better reflects the real world in any direct sense. Actions are more of a linker between multiple nouns rather than associated with one directly in the real world. But there are already multiple topics on this such as PrimaryNoun and SoftwarePlatonism.

I created this page to stay within the WhatIsFoo guidelines, because ReFactor is ugly, and I've seen this question at least twice this week. I'm hoping this will remain a summary of accepted definitions of refactoring for the "refactoring newbie." Discussions can easily stay in the billions of refactoring pages that already exist.

ExtremeProgramming is dependent on refactoring. Refactoring is not dependent on or from XP. (Damn, I tried hard not to talk about XP when I wrote this page.)

Strictly speaking, refactoring means reorganizing something but leaving its function in the world unchanged. Anything which depended on that thing to behave a certain way will never be disappointed. The notion of strictness was introduced in order to prevent the addition of bugs.

Pure refactoring isn't always necessary (or desired); if you try to do it, you may have to fend off some the possible improvements you notice. Normally people will aim to improve and change meaning (hopefully for the better). It's more efficient than doing one and then the other, usually. Sometimes it lands you in trouble, so the extremely cautious will refactor first and then add and then refactor. But who wants to live in fear?

Metaphorically, refactoring means to strive for that equivalence. In many contexts strict equivalence is impossible and/or unprovable. For example, RefactoringWikiIsaMetaphor.

Disagree. I have used both the refactoring approach and the "big change" approach, and the refactoring approach is more efficient. With the "big change" approach, one tries to do all of the work inside his head and then just type in the result. When refactoring, one has to enter each intermediate step and ensure it is correct and complete before proceeding to the next step. Refactoring gives one the ability to see when a particular part of a change is running off into infinite and allows one to rollback to the last working change. With the "big change" approach, one often runs into snowballing changes with the only choice being to either continue to completion or back out completely. Unless one's build, test, check in cycle is long, it is more effective to refactor and keep changes grounded in reality, than to rely on extensive mental models that one only hopes are correct.

It seems to me that refactoring is about continually rationalizing our commonality and variability assumptions. We look in the code for stuff that we previously treated as different (variable) but is really the same (common) and vice versa, then re-jig the code to more cleanly reflect our new commonality and variability assumption set. -- AnthonyLauder

The word 'factor' traces back to the Latin word for 'maker'. In other words, things are made up of their factors. To factor something is to find what it's made of, its component parts. To refactor is to find a different set of component parts that make the same thing. Deeper understanding (and control) comes from finding different part-whole relationships in the same problem. Or as James' (of giant peach fame) parents would say "Try looking at it a different way." Factoring and refactoring are essentially analytical, but without the stigma of paralysis attached. The part-whole discovery exists in both, but in current usage, analysis connotes something akin to planning, while refactoring connotes a post-build cleanup. A sort of thinker-doer dichotomy, on the surface anyway. -- WaldenMathews

In my opinion, refactoring is just a compression process for source code. OAOO is a CompressionMethod? to substitute a code chunk with its name. It seems related with LP.

-- SunnyDragon?

While compression is one type of refactoring, refactoring is not just a compression process. Often refactoring means replacing short, thrown together code with a more complete and less compact alternative.

"Practically, refactoring means making code clearer and cleaner and simpler and elegant."

How do we know what is clear, clean, simple and elegant? More importantly, how can we help our less enlightened colleagues to know? -- TomAnderson

Work with them in PairProgramming. Find a piece of code that needs work, then stare at it for ages trying to work out what it does and how it does it. Or doesn't do it. Who can tell? Code needs to be ScreaminglyObvious in order to be maintainable. If you know what the code is supposed to achieve but you can't quickly see how the code is trying to do it, it needs simplification. Use your colleagues' lack of knowledge to convince them that the code needs to be easier to read.

In NicholasNegroponte's book about the MIT Media Lab [http://www.media.mit.edu/] there were mice, robot or real animals, that kept modifying their environment. There is an animal motivation to modify the environment, and one mode by which this is accomplished is reiteration/re-thinking.

See also DouglasHofstadter's GoedelEscherBach: An Eternal Golden Braid, specifically Chapter Five: Recursive Structures and Processes.

See also ChristopherAlexander's TheTimelessWayOfBuilding, which talks about how towns are not developed by a master plan, but by lots of people making small changes, each of which makes the place a little better.

Refactoring is one of the ThreePhasesOfDesign. -- AlexChaffee

BobCringely doesn't like it [http://www.pbs.org/cringely/pulpit/pulpit20030501.html]:

: "Cleaning up code" is a terrible thing. Redesigning WORKING code into different WORKING code (also known as refactoring) is terrible. The reason is that once you touch WORKING code, it becomes NON-WORKING code, and the changes you make (once you get it working again) will never be known. It is basically a programmer's ego trip and nothing else. Cleaning up code, which generally does not occur in nature, is a prime example of amateur Open Source software.

A major motivation behind UnitTests is precisely to ensure that refactoring transformations do not break code. Good automated testing is a prerequisite of being able to refactor. And refactoring is a prerequisite of being able to adapt the structure of software to changing requirements. I have seen many projects stall and have to be totally rewritten precisely because they got to a stage where the code could not be refactored with any degree of confidence. Of course, I imagine BobCringely is aware of this in his saner moments.

I agree with Mr Cringely if you look at Refactoring in isolation; one should avoid doing Refactoring for the sake of Refactoring. The realization that changes everything is that code is not static, it continues to need to be updated. One cannot just produce "working code" and never have to touch it again. Refactoring as preparation for changing code and as completion immediately after changing code is far superior to the typical force fit method of making changes. Refactor in conjunction with code changes, but not in isolation from some need. -- WayneMack

It may be that "Cleaning up code is a terrible thing", but the alternative, namely NOT cleaning up code, is even worse. In addition to making code cleaner and easier to understand and test, I frequently find that the refactoring process highlights bugs and potential pitfalls that may otherwise lie undetected. And I cannot understand the second half of BobCringely's paragraph at all.

Ideally, code is perfectly written with function and maintainability in mind. However, this doesn't always happen (issues like deadlines force "good enough" solutions out the door). BobCringely assumes touching existing code is motivated by a programmer's ego. Has BobCringely worked with enough experienced programmers to know that encapsulation, maintainability and code reuse are good things you should try to achieve? Is it really less valuable if it is added later? Perhaps. Not all code is throw away, some of it is actually worth keeping and reusing.

Cringley's Comment is somewhat taken out of context. He says the above while referring to open source projects who have no new features to add and the developer is just refactoring "cleaning up code" for no reason at all. To me, the good points of refactoring are: Increase readability for future developers and to reduce the amount of work required to add a feature. Therefore, one should only be refactoring when the time comes to add a new feature and it is difficult because the code is either in your way or is too confusing. -- JamesPeckham

Studying this page as it has evolved, I've come to the conclusion that the term "Refactoring" is a projective test. By noticing the reaction to the term, I can find out about the psychology of the person giving the reaction. That in itself makes refactoring a most useful concept. -- JerryWeinberg

Refactoring is another example of Platonic perfectionism -based largely on the belief that something that looks Right, is Right. Some people are blessed with insights and others are not, so views will differ, but most people can see when something is elegant and simple. Problems occur when people try to bring evangelistic simplification to bear on systems that already work, even though they are clunky. Programming isn't meant to be an art form.

Maybe a simpler definition is more useful. In my view, to refactor is to alter a codebase in a way that it improves the potential utility of the code whilst not harming the existing capability.

Maybe it's time to refactor this page... It's a Wiki. Feel free.

I agree that "refactoring" is an overloaded word. What is "good" factoring often depends on one's personal viewpoint as often found in HolyWars. I tend to use "repetition factoring" to refer to the process of removing unnecessary repetition from software code. Of course, what is considered "unnecessary" itself can trigger a holy war. For example, I think good repetition factoring would remove the "dot path" from many function/method calls, but some like the dots (ResponsibilityDrivenDesignConflictsWithYagni). I have never met anybody who proposed that all repetition be removed, so the disagreement is where and how much.

Alan Fisher's article on pp 26-27 of the January 1, 2004 issue of SoftwareDevelopmentTimes? defines refactoring as "leveraging existing code assets into new Web-based applications." (This is mainly to keep track of how the term is being used.)

This is, of course, a horrible, lamentable definition of refactoring... --

Indeed, the rest of the article succeeds in maintaining the level of quality established by that usage.

The Feb 15, 2004 Letters to the Editor (p. 33) includes someone correcting the author's misuse of many terms, including Refactoring. (I harbor a fantasy in which the writer of the correction learned of the misunderstanding through my original posting to Wiki. -- JoeWeaver)

MartinFowler writes about misuse of the term refactoring in his blog: http://martinfowler.com/bliki/RefactoringMalapropism.html

External links: