Legacy Code, Refactoring, and Ownership

Refactoring is good. Everyone knows that. Since Fowler popularised the concept with the seminal Refactoring: Improving the Design of Existing Code it’s become a staple of the industry, and has pride of place on many a bookshelf. In the many, many articles and discussions of the subject, the key goals and benefits of refactoring are generally taken to be the improvement of readability, testability, decoupling, and other similar worthy ideals. For me, however, there is another very distinct benefit, often overlooked. Fowler touches upon it, but doesn’t really develop it, early on in Refactoring:

I use refactoring to help me understand unfamiliar code. When I look at unfamiliar code, I have to try to understand what it does. I look at a couple of lines and say to myself, oh yes, that’s what this bit of code is doing. With refactoring I don’t stop at the mental note. I actually change the code to better reflect my understanding, and then I test that understanding by rerunning the code to see if it still works.

(Fowler, Refactoring, 1999)

By investigating a piece of code thoroughly enough to understand how it works, refactoring it to map directly on to your understanding, and reinforcing everything with good unit tests, you take ownership of the code. It’s yours now.

This is very important, psychologically. Almost every developer feels more at home with their own code than somebody else’s. That’s why you feel uncomfortable and deflated when, 20 minutes into deciphering a nasty bit of opaque gibberish, you realise it was something you yourself wrote a year earlier and subsequently forgot about.

When you refactor, you rewrite code to a greater or lesser extent. Having done so, the resulting feeling of ownership (alongside increased understanding, of course) makes the code much less scary. The benefit of this is less marked in agile methodologies or TDD, of course, since in those cases quite often the code you are refactoring was written by you anyway. Working with legacy code, though, it’s a big deal.

In the preface to Working Effectively With Legacy Code, Feathers asks “what do you think about when you hear the term legacy code?” (Feathers, 2004). He answers by stating that the standard definition is “difficult-to-change code that we don’t understand” and adds his own preferred definition which is, in essence, “code without tests”.

My own definition of legacy code would include, in many cases, code that isn’t mine. By ‘mine’ I don’t exclusively mean code I wrote personally; I also mean code written by my team, or even code written by people who sit a couple of desks down who I can go and pester about it (which is stretching the definition a bit, admittedly).

In short, legacy code for me is code that no longer has any accessible owner. Like a stray cat or dog, code without an owner goes feral. Refactoring is the process of taming feral code, but as with stray cats much of the benefit comes from re-homing. This is a vital process, even if a fairly unconscious one. When you first come face to face with some hideous 5000-line spaghetti monster of a function your heart sinks - how can anyone ever hope to understand that, let alone modify it safely? Especially if the only people that ever worked with it left the company 3 years ago?

Refactoring allows you to split this code up, create classes to better represent the problem domain, improve abstraction, add tests, and all that other good stuff; at the same time, the process of doing so makes the code yours. You make the decisions about the classes to create and the abstractions to introduce. You write the tests that ferret out all the little idiosyncrasies, and uncover the unwritten assumptions. By the end of the process, the code feels like yours. And that means that the next time you have to make a change there, you benefit from the double whammy of code that is not only well-written and tested, but recognisably yours; and that’s the kind of code that you won’t mind working with.
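To make that a little more concrete, here is a minimal sketch in Python of what taking ownership can look like on a small scale. Everything in it is invented for illustration: calc stands in for some inherited function, and order_total, discount_rate, and the constants are names I made up while working out what it does. The loop at the end is a crude characterisation test in the spirit of Feathers: it doesn't claim the old behaviour was right, only that the rewrite hasn't changed it.

```python
# Hypothetical inherited function: opaque names, magic numbers, nested branches.
def calc(q, p, c):
    t = q * p
    if c == 1:
        if t > 100:
            t = t - t * 0.1
    elif c == 2:
        t = t - t * 0.2
    if t < 5:
        t = 5
    return t


# The same logic after refactoring. The names record my understanding of the
# original; they are assumptions about intent, not facts recovered from it.
REGULAR, TRADE = 1, 2
BULK_THRESHOLD = 100
MINIMUM_CHARGE = 5


def discount_rate(subtotal, customer_type):
    """Return the fractional discount that applies to this order."""
    if customer_type == REGULAR and subtotal > BULK_THRESHOLD:
        return 0.1
    if customer_type == TRADE:
        return 0.2
    return 0.0


def order_total(quantity, unit_price, customer_type):
    """Price an order, applying any discount and then the minimum charge."""
    subtotal = quantity * unit_price
    discounted = subtotal - subtotal * discount_rate(subtotal, customer_type)
    return max(discounted, MINIMUM_CHARGE)


def check_same_behaviour():
    """Characterisation test: the rewrite must agree with the original."""
    for quantity in range(0, 30):
        for unit_price in (0, 1, 4, 5, 10, 99):
            for customer_type in (0, 1, 2, 3):
                old = calc(quantity, unit_price, customer_type)
                new = order_total(quantity, unit_price, customer_type)
                assert new == old, (quantity, unit_price, customer_type)
    return True


if __name__ == "__main__":
    check_same_behaviour()
```

The decisions here (what counts as a discount, what to call the threshold) are exactly the ones that make the code feel like mine afterwards, and the test is what lets me trust that feeling.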


5 Responses to “Legacy Code, Refactoring, and Ownership”

  1. Tony Morris Says:

    > Refactoring is good. Everyone knows that.

    This is called an argumentum ad populum logical fallacy. Logical fallacies are very prolific in hyperbolic material such as that which purports refactoring to be anything but a perversion of science.

    There exists one person who doesn’t “know that Refactoring is good” and actively refutes the truth value of this statement (it is so false that the inverse is true). I wrote an article titled Refunctoring quite some time ago; I’ll happily support my position with evidence and reason upon request.

    I bring this to your attention so as not to continue the proliferation of mythology and mislead potentially innocent bystanders into a similar state of delusion.

  2. JS Says:

    After all, ‘terrible code’ is by definition any software code you yourself didn’t write. Right?

  3. russ Says:

    “I bring this to your attention so as not to continue the proliferation of mythology and mislead potentially innocent bystanders into a similar state of delusion.”

    The phrase “everyone knows that” was intended to be slightly tongue-in-cheek, which I had hoped would be picked up by being unnecessarily sweeping and all-encompassing. Generally speaking I think it’s a good thing, but I also think Steve Yegge hit the nail on the head when he said that refactoring had become the goal rather than the cure.

    I read your Refunctoring article and largely agree with it; I code C# for a living and often find myself edging towards a functional style for certain problems. However, I think you’re a bit strident; not all refactoring is simply a progression towards using map, foldr, and filter. Similarly, I think you’re overly harsh on Java and C#. C# at least is moving in interesting directions - LINQ provides something very similar to (if more verbose than) Haskell’s list comprehensions, and the language now also supports lambdas and a form of type inference. Whilst it’s never going to be as much fun to write C# as it is to write Haskell or Ruby (which are my preferred languages outside the office, along with OCaml), it’s encouraging that such a mainstream language is starting to “get it”.

    I hold my hand up and concede that my phrasing didn’t really communicate any of this. Sorry. I’m just getting used to blogging and realise I need to be careful about how people interpret my writing.

  4. russ Says:

    “After all, ‘terrible code’ is by definition any software code you yourself didn’t write. Right?”

    Oh give it a rest, that’s not what I said. Why have you put ‘terrible code’ in quote marks as if you’re quoting me, when that phrase appears nowhere in the article? In fact, the whole point of the post is that there are benefits to refactoring beyond simply trying to improve code. You understood that, though. Right?

  5. CD Says:

    Refactoring isn’t a pain in the boss?
    I’ve refactored terrible code from Prolog to MSX BASIC, then to MSX BASIC II Turbo, then to QuickBASIC and to Pascal.
    I think I have enough background to discuss refactoring with my boss, who actually fired me for trying to explain to him what refactoring means.
    Now I am a jobless .NET programmer. Wanna work with me? I take some time to refactor. When you work on legacy code your own brain refactors!
