Good Change Log Entry

In ThePalimpsestEffect:

I've never seen good ChangeLog entries, though. They always describe what happened in this file/entity in fairly low-level detail, rarely if ever relating this to associated changes in other files, or explaining why the source was changed as well as how, or what other seemingly obvious changes were considered and discarded, along with why they were not used.

I'd be interested in what others think makes a GoodChangeLogEntry.

Good things to include in a change log (assuming that change logs are good ;-):

- Who made the change.
- Date (and possibly time) of change.
- Summary of change.
- Change request / bug report number(s) -- where one can look for more documentation.

The first two bullets should absolutely be filled in automatically by the software (VCS or editor mode or whatnot) and they should be in clear, standard or at least unified formats.

Suppose some other person is looking at a diff of the changes, either because they're going back through the source history looking for regressions, or because they're considering whether your patch should go into the code base. So, I think the log entry should describe "why you thought that change was a good idea".

If it fixes a bug, it would be good to include the bug number in the log entry, or at least a description. If the change covers several files, then it should have more or less the same entry on each, because the reason for the change is the same. If you're making several logically separate changes then preferably they should be checked in and described separately. So, they're perhaps not quite NanoIncrements, but perhaps almost that small.

I think this is an OK fragment of a ChangeLog, though maybe a little FineGrained (but I'm biased). CriticismWelcome:

----------------------------
revision 1.22
date: 1999/06/03 06:33:21; author: martinp; state: Exp; lines: +3 -3
OSAgentBusURL:
* Default for a partial URL is now taken as a host, not a busname.
* Also, host names and bus names may include uppercase characters.
----------------------------
revision 1.21
date: 1999/06/02 07:33:34; author: martinp; state: Exp; lines: +18 -2
Add parsePartialURL to guess what somebody meant when they don't give a real URL.
----------------------------
revision 1.20
date: 1999/06/02 05:11:24; author: martinp; state: Exp; lines: +54 -54
untabify. No other significant changes.
----------------------------
revision 1.19
date: 1999/06/02 05:10:48; author: martinp; state: Exp; lines: +7 -4
Add OSAgentBusURL, an easier-to-use implementation of a bus URL for VisiBroker CORBA/OSAgent buses.
----------------------------
revision 1.18
date: 1999/05/26 04:39:48; author: martinp; state: Exp; lines: +10 -8
Doc. Fit to code guidelines.

I suppose with PairProgramming and a strong OralTradition this is less necessary; but even then it might be useful.

-- MartinPool

<FX: Criticism flamethrower being lit...> :-)

Problems: There's no hint in the change log that parsePartialURL was going to default to a busname. That might be obvious from the diffs, but it would be good to know why. Was this a micro design decision when you wrote it, a response to a UserStory, or what?

Similarly, why did it change to a hostname later? Was this a defect being fixed, an arbitrary change, a change in requirements? This kind of info is needed - perhaps later there would be an SPR complaining it's no longer a busname. The maintainer needs to know if the change to a hostname was an error, or what was supposed to happen.

"May include uppercase"? Does this mean they didn't before (but that was OK because that was the spec)? Is this input to parsePartialURL or the output from it? It's not even clear whether that "Also" line applies to parsePartialURL or to some other piece of code entirely.

So I think this ChangeLogEntry could supply more information about the Why. With diffs, it's mostly adequate on the What.

MartinPool responds:

You're quite right: there should be a deeper explanation of why. Most of these took place in response to user feedback: for example, 1.22 was made because people more often want to specify just a different host than a different busname. It would be better if they'd actually filed a change request so that I could quote that number and thus link into the whole discussion and approval of that request. That's not the way this process happened.

To some extent I think a ChangeLog should just be a commentary on and summary of the diffs; it doesn't have to completely summarize them. It helps to at least tell people which changes to look at and which developer to talk to if e.g. they've found a regression in the handling of default busurls.

By the way, some people on this project always write "no message" for their changes.

I think, at a minimum, people should reference customer "change request numbers" in the checkin comments they give to the source control system. People can look up the change request for the business-language justification. Some projects say that you're not allowed to change code without a change request number.

I find a short human-written summary helpful, both for myself and others, as I'm often refactoring really bad code, so automated "diffs" aren't very helpful to a 3rd party who wants to understand what really changed.

I would like each checkin to implement one and only one well-defined business change. But reality works out differently on most projects I've been on: In non-XP projects, I've found that fixing several related issues together is a more effective use of my time, and I'm often finding and fixing a number of unrelated problems whenever I start working on a new module. In conventional development, the effort required to understand the existing program dominates the maintenance effort, so fixing related problems together dramatically improves your efficiency.

I usually summarize changes by file, and have even been known to document changes in function headers. (But, to be honest, I usually find function-level change histories more annoying than useful.)

What really bothers me about existing source control "diff" tools is that they lack the ability to answer questions like "when was this line put in here?" or "What's the history of changes to this function (this range of lines)?" -- within a file that contains many functions. -- JeffGrigg

Jeff, you can get a rough answer to "when was this line put in here?" using the "cvs annotate" command, or (even better) the "granny" command in Emacs. Bonsai gives an even better web interface to the same query.

I like the idea of all changes being driven by change requests, but even if that were appropriate for all projects it is too large a change for OneLittleProgrammer to achieve. I can at least try to write decent little comments on things I touch.

Check out Bonsai (http://www.mozilla.org/bonsai.html). It at least gives you source with last-change line annotations with CVS.

JeffGrigg responds: Even a good "difference" display does not solve the problem, because I need to know where, in the last 50 revisions of this module, this range of lines was last changed. Existing tools can only show the differences between two selected versions. (Ironically, I don't like marking the source code with comments about who last touched which lines when, because they're irrelevant almost all of the time. And they discourage refactoring.) -- JeffGrigg

"cvs annotate" does give you some of that, in the form of the latest version in which a given line was modified. If you crave a webby version, Bonsai has one built in: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/js/src/jsscript.c is an example.
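The idea behind annotate-style output can be sketched in a few lines. This is only an illustration, not how CVS or Bonsai implement it: the function name annotate and the (revision id, lines) input format are assumptions made for the example, and the line matching relies on Python's standard difflib.

```python
from difflib import SequenceMatcher


def annotate(revisions):
    """Label each line of the newest revision with the revision that last changed it.

    `revisions` is a hypothetical input format: a list of
    (revision_id, list_of_lines) pairs, oldest first.
    Returns a list of (revision_id, line) pairs for the newest revision.
    """
    if not revisions:
        return []

    first_id, prev_lines = revisions[0]
    blame = [first_id] * len(prev_lines)  # every line starts out owned by the first revision

    for rev_id, lines in revisions[1:]:
        new_blame = []
        matcher = SequenceMatcher(a=prev_lines, b=lines, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines keep whatever revision already owned them.
                new_blame.extend(blame[i1:i2])
            else:
                # Replaced or inserted lines belong to this revision;
                # for a pure deletion j1 == j2, so nothing is added.
                new_blame.extend([rev_id] * (j2 - j1))
        blame, prev_lines = new_blame, lines

    return list(zip(blame, prev_lines))
```

Restricting the result to a range of lines gives a rough answer to "when was this function last touched?", with the caveat that it only reports the latest change per line, not the full history.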

Without such tools, you can perform a "manual" binary search for the version where the change is (so for 50 revisions you'll only have to do about 6 compares, rather than probably 25 or so).
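A minimal sketch of that binary search, assuming the revisions are ordered oldest first and that the thing you are looking for (say, "this range of lines has its current content") is false for every revision before the change and true from the change onward. The function name and the predicate argument are hypothetical; in practice the predicate would be a diff or a manual inspection of one checked-out revision.

```python
def first_revision_where(revisions, predicate):
    """Find the earliest revision for which `predicate` holds.

    Assumes `revisions` is ordered oldest first, that `predicate` is False
    for all revisions before some point and True from there on, and that it
    is True for the newest revision. Returns (index, number_of_compares).
    """
    lo, hi = 0, len(revisions) - 1
    compares = 0
    while lo < hi:
        mid = (lo + hi) // 2
        compares += 1
        if predicate(revisions[mid]):
            hi = mid        # change happened at mid or earlier
        else:
            lo = mid + 1    # change happened after mid
    return lo, compares
```

With 50 revisions this loop never needs more than 6 comparisons. If the lines were changed back and forth several times the assumption breaks, and the search only finds one of the change points.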

I want to make the admittedly provocative assertion that no file-based source-code control system can have a GoodChangeLogEntry, because changes typically result from bug reports or functionality enhancements, and thus come in groups that are fundamentally organized on a functional, rather than file, basis.

The grouping of source code into files is, at best, a reflection of the capabilities of the development environment. It generally has almost nothing to do with the semantics of the particular language or application domain: this is why a generic control system like RCS can "support" so many different languages.

A typical change request, on the other hand, implies a change to the behavior of a real application - a change that almost always touches code on a semantic or syntactic basis, and virtually never on a module basis.

The entity upon which a GoodChangeLogEntry would naturally hang is the change request itself, or an object that represents that change in an EnvyDeveloper-style environment that has deep awareness of the syntax and semantics of the code.

One client of mine, for example, uses configuration maps for this purpose, and hangs the description on the comment field of the ConfigurationMap. Another configuration map can then aggregate the config maps that comprise a set of bug fixes. The path to a subsequent maintenance release is then significantly more apparent.

In a file-based system like RCS or Continuus, you aren't gonna get there from here. -- TomStambaugh

In Continuus, you can use a task to group changes across files, and comment the task.

What is a ConfigurationMap?

SubVersion is file-based, and doesn't understand syntax or semantics, but I expect it should do a good job of keeping changes together.

No, no. It's quite possible to have good change log entries on our project, because our modules are ridiculously large. ;->

Are you looking for a GoodChangeLogEntry or a good description of the change? You should read over the changes broadcast on the gcc-cvs list (archives at http://gcc.gnu.org/ml/gcc-cvs/). I consider most of those entries good. They don't necessarily say why things change, and y'all consider that the most significant problem.

Go to the changed code. If the changed code is significant to understanding, most gcc developers explain what's going on in comments. If the changes themselves are significant, they often mention previous versions and why they were changed. This information is useful in the code itself, as most people won't read through many years of ChangeLog entries even after filtering for changes of interest. (gcc's changelogs are so long that not all of them are even distributed). Since it's in the code if necessary, duplicating it in the ChangeLog would bring all the damnation associated with breaking the OnceAndOnlyOnce commandment. ;)

Actually, I'd guess that any sufficiently distant ChangeLog entry will be ignored no matter how good it is. In a long-running project, almost all entries will be sufficiently distant. If the change is important, document it in the code. Explain how it currently is, how it's been done in the past, and why that was no longer believed sufficient. Keep ChangeLog entries current and briskly informative. -- JasonRiedy

Source code is in the now. Why clutter it with useless information like how it once was? The only time a history is important in a file is when there is no source control to work with (like on a source-code library distribution). Even then, histories should be small, contained and out of the way. -- SunirShah

My experience differs. In one job, support staff would quite normally look back a couple of years in the change history to see why a change had been made. If you're supporting multiple versions with many customers, many of whom do not want to upgrade to a later version until it makes sense for their business, a response along the lines of "source code is in the now" would be greeted with some displeasure.

Comments alongside the code are the wrong place for changes with effects in multiple places. There needs to be somewhere that explains the overall intent and justification for the change. Also, I might not even know where to look for the comment.

I'd agree that a change request is a better place to put the comment (and the change request handling system needs to be integrated in some fashion with the source control system). But then I would say that because I've built such a system (actually more than once) in the past :-)

It just begs the question, though. It's then reasonable to ask what makes a good change request change description. -- PaulHudson

Part of the solution is to understand that one-way communication is immensely hard: it's nearly impossible to write a comment today that will answer every question people might have in the future.

Another important reason to have GoodChangeLogEntrys is, as the gcc example hints, that they're a communication to other developers as to what's changed today/this week. "Oh, somebody just changed a file I depend upon. I wonder if the change affects me?" Scanning a good description can tell me whether I can skip it, whether it fixes a bug that was in my way, whether I have to look in more detail at the new code or the diff, or whether I should speak to the author. -- MartinPool

I give comments in change logs the same respect as comments in code - almost none.

I prefer to use a separate system for tracking problems/defects (e.g. BugZilla). Then I can check in files with (1) a bug number and (2) a one-line summary of the bug. This is useful for high-level scanning of the history of a file. It doesn't say much about the changes themselves (which is what diff is for) or why the change was made (which is what BugZilla is for), but it ties the checkin/diff/BugZilla bug together.
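As a rough sketch of that convention, the checkin comment could be built and read back with two small helpers. The "Bug N: summary" layout and the function names are assumptions made for this example, not anything BugZilla or a particular source control system prescribes.

```python
import re

# Hypothetical first-line layout: "Bug <number>: <one-line summary>"
CHECKIN_RE = re.compile(r"^Bug (\d+): (\S.*)$")


def format_checkin(bug_id: int, summary: str) -> str:
    """Build a one-line checkin comment from a bug number and a summary."""
    summary = " ".join(summary.split())  # collapse whitespace and newlines
    if not summary:
        raise ValueError("summary must not be empty")
    return f"Bug {bug_id}: {summary}"


def parse_checkin(message: str):
    """Return (bug_id, summary) from the first line, or None if it does not match."""
    first_line = message.splitlines()[0] if message else ""
    match = CHECKIN_RE.match(first_line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```

A checkin hook could call parse_checkin and reject comments such as "no message" that carry no bug number, which keeps the checkin, the diff and the bug report tied together.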

CategoryCodingIssues