Source Code Revision Tracking Branches

Under NoBugDatabase, the following exchange was probably destined to spin into a long discussion of it's own, so ignoring the Wiki equivalent to YouAreNotGonnaNeedIt?...

The thought of having NoBugDatabase is probably very silly idea for most projects of any appreciable size. However, the process cults quite often turn the sensible notion of a bug database into an obscenity. The perversion of integrating the debug database with a source code management system, and forcing branches based on "defect number" has become a common abuse.

Clarify for me why this is "obscene, perverted abuse", please. According to a number of sources, these are good practices, (well, maybe not the branch per defect). See for example, "Streamed Lines" http://acme.bradapp.net/branching

For instance, mandated by-defect-number branching usually ends up adding tremendous complexity into the branch structure. The reasoning used to justify it is the myth that someone (other than the programmer) can pick and choose which "defect" fixes go into a build.

The reasoning (at least according to 'Streamed Lines') is not what you suggest above but simply the ability to allow private checkpointing of file versions during such a change, and provide a simple mechanism for being able to quickly identify all the affected files and their revisions/deltas for a particular fix or feature/enhancement. It simply associates all the file revisions with the larger context of the reason for that particular set of edits to that particular set of files. So if I see something wrong with a particular line of code and I want to know which revision of the file introduced it, I can quickly relate that to the other changes made to other files in the same "change context" and decide whether or not I need to look through them too. Such identifiability of a change and its context can assist "separability", but as mentioned before, there may be other dependencies.

The above is correct. The 'Streamed Lines' paper does not suggest creating a branch-per-defect, nor does it suggest that bug-branches are an effective way of selecting/deselecting fixes to go into a release or codeline. What 'Streamed Lines' describes is a way of complementing a developer's sandbox to allow checkpointing of work in the version-control tool without breaking the codeline. There is no more or less merging into the codeline as a result of this. All of the discussion above and below suggesting the contrary is very much mistaken.

Programming doesn't work that way, because changes are usually interelated. Imagine this iteration I fix bug number 249302, and then continue on to fix bug 95843, and then perhaps add some features. You cannot select the fixes for bug 95843 without taking the fixes for bug 249302. You might try, and you might often get lucky, but as a rule it is a bad idea.

Programming often doesn't work that way, but it doesn't have to. We had a process - which worked effectively - that any checkin could fix at most one bug.

And programmers should not be choosing which fixes to do and which features to add. That's the job of a product manager role (that's what we called them) - of course this may be one of the programmers with another hat.

As a programmer, I could make that kind of granular selection possible. I could explicitly produce each change independently of any others, starting with the exact same code base for each change. But doing so increases my workload dramatically. Things get really silly if two of the bugs involve changing the same file. That would force me to retain 3 versions of that file, one for each independent change, and one for the merged changes. And here's an obscene twist: both bug fixes might have required the exact same change!

Ask your top programmers if they like managing their source code with tons of branches, and with "merges" that more often than not are really just copies. Under regimes that mandate branches where the code itself did not take divergent paths, the complexity gets crazy and knowing where in the tree to place a change becomes a chore.

Making a task branch per change doesnt involve any more or less merging of any complexity. Even if I dont branch, I can make changes in my sandbox in parallel with other people making changes to the same files in their private sandbox. And I will have to resolve those conflicts when I "commit" my change - regardless of whether or not I used a task branch. Using a task branch might also cause some trivial copymerges (where a merge form the task-branch to the mainline simply copies the contents from the branch to the mainline because there were no parallel changes for that file), but those are fast and trivial requiring no more manual effort than if I did them in my sandbox without the task-branch

Code naturally diverges in a finite number of situations. Here are the three biggies. (1) When a bug is discovered in an old version of a file, the bugfix must diverge from the main line to avoid picking up later feature changes. (2) Experiments that stand a good chance of being discarded also call out for a branch, although experiments only truly require a branch to satisfy the third situation. (3) The need for simultaneous revision of a single file, as when two programmers need to simultaneously make unrelated changes to that file. The key aspect of all three of these situations is that you wind up with two or more versions of a file that are simulaneously current, so a single dimension ceases being sufficient.

Only branch when the code diverges. Most code follows a linear evolution, so you can usually stick to the main line. (In the uncommon event that you don't like what happened along the main line, you can always copy an older revision back down to the tip.) When selecting code for a build, if you always use labels for selection, you won't get any code off the tips that you are not ready to accept.

Labels and comments are the better tools for recording descriptive information, like defect number correlations. Don't kid yourself, or worse yet let some SCM "guru" or SCM vendor kid you into thinking you can use branches to select individual bugfixes. That usually can't work reliably, and forcing processes that make it work reduces the productivity of your programmers dramatically. It's just the nature of programming.

Again, the 'Streamed Lines' paper and related "branching best-practices" do not suggest creating a branch-per-defect, nor does it suggest that bug-branches are an effective way of selecting/deselecting fixes to go into a release or codeline.


I'm curious as to all these bad effects you mention. Here's what we do: our "bug databse" consists in task cards that have written on them "fix defect #nnnn: blah blah blah" The #nnnn is a key into the prime contractor's web defect tracking thing. We roll those tasks into the current or next iteration depending on priority. Thereafter, it's just a task. What we do with tasks is, yes, take a branch off the mainline, since we don't know that some other task might need to change whatever files this task needs to change. We have a script ("branch") that makes this operation quick and easy. We have other scripts to change which components are in the branch if we guessed wrong when we created the branch.

Thus the pair working on the task can do as much changing and reverting and adding and deleting they like on their branch without having to worry about what any other pair is doing. If a pair does care what someone else has done then there's another script ("catchup") that pulls into their branch changes from the mainline since the branch was taken. When the pair has something working, after an incremental build and regression test, they use another script ("merge") to put their changes back onto the mainline. We do a full build at least once a day. Before doing releases to the prime contractor we label the mainline. What this gives us is a mainline with serialised changes, identifiable by release, each with a comment saying why it was done, and we can easily find the branch where the work was done to see how. Is this so bad?

It seems from your comments above that your fear of increased complexity is more to do with the process around branching, and how branches are thought of, and what happens to them after the work that required them is done, than with the cardinality of the relationship between defects and branches.


For my clarification, why create branches for a change? Why not just incorporate the fixes into the mainline? I want to do whatever is possible to avoid creating branches and subsequent code merges.

Having a change (fix, feature or enhancement) on a task-branch allows private checkpointing, and also provides a simple mechanism for identifying all the versions of all the files associated with a particular change.

The change/fix branch does not preclude incorporating fixes into the mainline. In fact that is usually what is done. The reason for doing the change on the branch for the duration of the fix is to allow the programmer to check-in intermediate points (checkpoints) of various files to create 'private versions' that may be of use to the programmer during the change, but which would break the mainline if checked-in to it at that time. Once the change task is done, it is typically incorporated into the mainline (just like when I do a "commit" in CVS).

This latter purpose isnt necessarily for the goal of 'granular selection of bugfixes'. Some may try to do that, but the latter purpose is simply one of being able to group and identify all the elements of a change. Some version control tools provide other mechanisms for such change-grouping. If yours doesn't, a task-branch is an easy way to do this that also allows private checkpointing. But the reason is to be able to easily know all the related file versions/deltas for a particular fix.


[original poster:] Actually my concern is primarily the addition of a great deal of complexity, particularly in fruitless pursuit of unattainable goals. (e.g. one being granular selection of bugfixes.) Your process doesn't sound as unreasonable as some of the contortions I've endured. Perhaps a SourceCodeRevisionTrackingHorrorStories? page would be fun. However, I do have a few points about your process.

(1) Scripts can't completely hide complexity. For example, your "catchup" and "merge" scripts would only be able to cope with trivial situations, although admittedly these are the most common. If the changes to a file were very close, for instance in the same few lines of code, merging must be done manually, so scripts probably cannot completely hide the procedure. That's not a big issue for you, because overall your process does not seem very complicated.

Sure, manual merges are needed from time to time, that's a given. It doesn't seem as if they will be any more or less frequent using either branches per task or mutliple checkouts from the mainline per task. If there's a conflict, either way there's a conflict

(2) Since you're merging and building at least daily, (good for you!) I'd bet your branches tend to have only a single node followed by a copy of that node to the main (1-1, 2-2, below), with the occasional merge (3/4->5). Would it not be a bit simpler to branch only if you know that you must branch, i.e. when someone else has the mainline node checked out? (Perhaps that would not be "simpler" for you, due to the likely complexity of your relationship to a prime contractor, but for projects without such headaches...)

    1   2   3---.      (branches)
   / \ / \ /     \     ()
  O---1---2---4---5--  (main line, same number used when "merge" was merely a copy)

That's pretty much what we have, except that there'd typically be two or three branches live at a time and probably off the same node, so merges like 3->5 are more frequent than you might expect. I wonder if we're talking about quite the same things. With our tools, if I branch off a bunch of files, six, lets say, but then end up only having to change two of them, when the branch is "merged" back in to the mainline only those two files get a delta on the mainline.

As to only branching when we need to, given that branching is, for us, quick, cheap, easy, and always safe, we find it agreeable to always branch and not have to think about whether or not someone else is working on the same files. It would be, I guess, a few keystrokes and a few seconds faster to not branch, but I bet we'd lose that saving while gathering the information to make the decision whether or not to branch.

I'm not sure what you mean by "granular selection of bugfixes". Would that be multiple configurations (leading to multiple release builds) each with a different selection of bugfixes applied (and therefore branches merged in)? I've worked on systems like that, and yes, it was a horror story.

There's also a cultural aspect to all this. We started this project with some quite inexperienced, and even more undisciplined, developers. They couldn't really have told you why version control is important, and had experience only of VisualSourceSafe when they's done any versioning at all. We wanted a mechism that we 1) couldn't get wrong, 2) was absolutely safe, meaning in part, guarenteeing isolation between the work of two pairs on the same files. Doubtless with more experience and more discipline we could do fine all just making mutliple check-outs off the mainline. I'd worry about it, though.


When you use multiple branches you not only have to merge back to the mainline, but also back to all the other interested branches. Depending on the scope of change in each branch this may be painful or no big deal at all. The worst is when a branch both needs changes from the mainline yet is making major changes. --AnonymousDonor

The above is true only for branches created to handle the "Multiple Maintenance Problem" where I have a requirement to maintain multiple versions/releases of the software in the field. Regardless of whether or not I use branching to solve the multiple maintenance problem, I still have to "effect" changes to one release/version to other actives releases/versions. Branching is simply a common way of handling that particular problem - it doesnt cause the problem. The problem is caused by having the requirement to do such multiple maintenance due to some business decision that was committed to providing that degree of support service.


You need to branch when a code line needs different policies than the current code line. This may be on release, feature, bug fix, whatever. Part of the policy is when a change is needed/accpeted in the mainline. Features/tasks/bug fixes often get pushed back because of the verification effort needed, customer feedback, resources, etc. Having the changes in a different branch makes this type of time-boxing easy. --AnonymousDonor


Source-code management regimes with a lot of conventions (like exactly-one-bug-per-checkin) are very helpful, but they presuppose a project culture that values the returns highly enough to follow the conventions strictly enough to be reliable. Without that kind of project culture, it is a nightmare. You really need to look at this when considering approaches like this, the risk can be high. --Ed


EditText of this page (last edited October 2, 2014) or FindPage with title or text search