Work in progress

January, 2015 ∙ 7 minute read

I’m positive this isn’t a novel idea, and in all likelihood it is standard development practice for many engineering teams, but I’d like to document a software engineering strategy that I’ve been using for a few years while working on github.com. The pattern uses a single, long-running, experimental, work-in-progress Pull Request (PR) from which multiple smaller PRs are extracted. Each extracted PR is based on master, and once it has been reviewed, deployed, and merged, master is merged back into the long-running PR. This method is especially useful in fast-moving, large, legacy code bases.

Taking point

When deciding to do some major work the first step is always to branch off the current origin/master with a new long running branch. This is pretty standard development practice with Git and GitHub.

❯ git checkout master       # Make sure you're on the master branch
❯ git pull origin master    # Pull down the latest changes
❯ git checkout -b wip       # Create your new branch called 'wip'

Often, when writing software, you have a goal or vision of what you want to do, but no clear idea of how you’re going to get there or even what the steps will be until you start experimenting. You can think of wip as your blank slate (wip stands for work in progress, but you can name yours anything you like). This wip branch serves as your scratch pad of the bleeding edge of your current thinking about how to tackle the problem. It isn’t intended to be merged directly, and as such, you can take some leeway in committing experimental code. The idea is to be able to move fast and change directions fast as you start to understand the scope of the problem. You also want to share your work as soon as possible for feedback and collaboration. This feels scary, but is almost always a good idea.

I like to push this branch out very early — maybe with only a couple of commits. Then, open a PR sketching out what you think you’d like to do.
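Here is a minimal sketch of pushing the branch early, run against a throwaway sandbox so it is safe to try. The bare repository standing in for origin, the file name, and the commit messages are all hypothetical. In real use only the final `git push` line matters, after which you would open the PR from the pushed branch.

```shell
set -eu
sandbox=$(mktemp -d)                       # throwaway directory, safe to delete
git init -q --bare "$sandbox/origin.git"   # hypothetical stand-in for the shared remote
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master    # name the initial branch 'master'
git config user.name "Example Dev"         # placeholder identity for the sandbox
git config user.email "dev@example.com"
git remote add origin "$sandbox/origin.git"
git commit -q --allow-empty -m "Initial commit"
git push -q origin master

git checkout -q -b wip                     # start the long-running branch
echo "first idea" > notes.txt
git add notes.txt
git commit -q -m "Sketch first idea"
git push -q -u origin wip                  # publish early and set up tracking
```

The `-u` flag records origin/wip as the upstream, so later pushes and pulls on wip need no extra arguments.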

The wip branch can often support more than a single developer pushing, but probably works best when 1-3 people participate so as not to get bogged down in merge conflicts. To coordinate a larger number of people working on the same feature it generally makes sense to spread people out on the extraction PRs and keep the wip PR under the oversight of the lead.

You should periodically merge origin/master back into the wip branch to stay up-to-date. In the GitHub codebase this is an automated hard requirement for branch deployment.

❯ git fetch                  # Fetch latest changes from the server
❯ git merge origin/master    # Merge origin/master into your branch

The wip branch should always be kept deployment safe and if you have a production environment that supports branch deployment, it’s worth getting the wip branch out and live early. This makes staying in sync with origin/master easier and allows you to push changes to the full production environment as you go instead of waiting until the end.

Extractions

As you develop on the wip branch, you’ll come to a place where you understand some discrete part of the system that can be extracted and pushed on its own. A simple example of this in Rails would be a model change and an associated database migration.

When you’ve identified one of these extractions, the first thing you do is start another brand new branch from origin/master with just the focused changes. In the case of a model change and related database migration, I will sometimes create two separate PRs so the changes can go out independently. In a legacy application like github.com, changing a model often means a database migration, changes to the Ruby models, a data transition of sorts from the old world to the new, followed by a cleanup PR (or set of PRs) that drops columns and fully commits to the new data model.

Assuming you’ve been developing on your wip branch, here are the steps to start an extraction branch:

❯ git checkout master          # Get back to the master branch
❯ git pull origin master       # Make sure you're up-to-date
❯ git checkout -b extraction-a # Create a new branch off master called `extraction-a`

Then, I pull in just the changes necessary from the wip branch. You can do this by cherry-picking commits, but I find it easier to check out the changed files directly from the wip branch and create a brand new set of commits. I will often dive deeper into the specific problem by adding tests, seeking out detailed code review from experts, and making sure the code is production quality and well documented. The extraction PR generally includes considerably more commits and work than the original spike on the wip PR.
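For completeness, here is a rough sketch of the cherry-pick route in a throwaway sandbox. The repository, file names, and commit messages are hypothetical, and a local master stands in for origin/master. It only works this cleanly when the commit you want on wip is already self-contained.

```shell
set -eu
sandbox=$(mktemp -d)                       # throwaway directory, safe to delete
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git symbolic-ref HEAD refs/heads/master    # name the initial branch 'master'
git config user.name "Example Dev"         # placeholder identity for the sandbox
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"

git checkout -q -b wip
echo "model change" > user.rb
git add user.rb
git commit -q -m "Change user model"
picked=$(git rev-parse HEAD)               # remember the commit worth extracting
echo "unfinished experiment" > spike.rb
git add spike.rb
git commit -q -m "Unfinished experiment"

git checkout -q master
git checkout -q -b extraction-a
git cherry-pick "$picked" > /dev/null      # replay only that commit on the new branch
```

The extraction branch ends up with the model change but not the unfinished experiment.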

Here’s an easy way to pull in changes in a particular file from another branch:

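❯ git checkout wip -- app/models/user.rb              # Copy the model file from wip
❯ git checkout wip -- db/migrate/some_migration.rb    # Copy the migration file from wip
❯ git commit -am "Enhance user to include..."         # Commit both on the extraction branch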
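Before and after pulling files across, it can help to list what each branch changes relative to master. This is a sandbox sketch under assumed names (the repository, `user.rb`, and `spike.rb` are all hypothetical, and a local master stands in for origin/master). The three-dot form of `git diff` compares a branch against the point where it diverged from master.

```shell
set -eu
sandbox=$(mktemp -d)                       # throwaway directory, safe to delete
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git symbolic-ref HEAD refs/heads/master    # name the initial branch 'master'
git config user.name "Example Dev"         # placeholder identity for the sandbox
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"

git checkout -q -b wip
echo "model change" > user.rb
echo "unfinished experiment" > spike.rb
git add user.rb spike.rb
git commit -q -m "Spike on wip"

git checkout -q master
git checkout -q -b extraction-a
git diff --name-only master...wip          # everything wip changes: candidates to extract
git checkout wip -- user.rb                # take only the file that is ready (also stages it)
git commit -q -m "Add user model change"
extracted=$(git diff --name-only master...extraction-a)
echo "$extracted"                          # what the extraction PR would contain
```

Checking the extraction's own diff this way is a cheap guard against accidentally carrying experimental files along.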

This is also a good place to rope in security review or even have other individuals push to the extraction branch. You can then branch deploy extraction-a, watch production for unintended behavior, further smoke test the new feature via the UI or a production console, run migrations and transitions, and so on.

When everything looks good, use the merge button or manually merge and re-deploy master. Your changes are now live and in the mainline (master).

Merge back

Now, you’ll want to merge origin/master back into your wip branch. Generally this is painless (and something you’re already doing periodically). Sometimes, you’ll have refactored so far on your extraction branch that the changes look nothing like what you first implemented in the experimental wip branch. This is OK too, and these conflicts are usually pretty easy to sort out. If you do run into conflicts, at least they are in a part of the codebase that you have a lot of context on. I usually sort out conflicts manually by looking at the conflict markers and editing each file into the proper state, then issuing a git add on the file and, once all conflicts are sorted, a git commit to complete the merge.
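The conflict case can be sketched in a throwaway sandbox like this. The repository, file contents, and commit messages are hypothetical, a local master stands in for origin/master, and overwriting the file with `echo` stands in for editing out the conflict markers by hand.

```shell
set -eu
sandbox=$(mktemp -d)                       # throwaway directory, safe to delete
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git symbolic-ref HEAD refs/heads/master    # name the initial branch 'master'
git config user.name "Example Dev"         # placeholder identity for the sandbox
git config user.email "dev@example.com"
echo "original" > user.rb
git add user.rb
git commit -q -m "Initial commit"

git checkout -q -b wip
echo "rough spike" > user.rb
git commit -q -am "Spike on wip"

git checkout -q master
echo "polished version" > user.rb          # the refactored extraction, now merged
git commit -q -am "Polished user change"

git checkout -q wip
git merge master > /dev/null 2>&1 || true  # stops with a conflict in user.rb
git status --short                         # conflicted files are flagged with UU
echo "polished version" > user.rb          # edit the file into its proper state
git add user.rb                            # mark the conflict as resolved
git commit -q --no-edit                    # complete the merge
```

Taking the polished version wholesale is only one possible resolution. In practice you would keep whatever parts of the spike are still ahead of master.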

In the happy path, this is how you’d merge back into wip:

❯ git checkout wip           # Make sure you're on the wip branch
❯ git fetch                  # Get latest
❯ git merge origin/master    # Merge origin/master into your wip branch

When you merge an extraction back into the wip branch, the wip branch diff actually shrinks because some of those changes are now on master. You can see this on the changes view in a PR.
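To see the shrinking effect locally, here is a small sandbox sketch. The repository and file names are hypothetical, a local master stands in for origin/master, and committing straight to master stands in for a merged extraction PR. It counts the files in the three-dot diff, which is the same comparison a PR's changes view is based on.

```shell
set -eu
sandbox=$(mktemp -d)                       # throwaway directory, safe to delete
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git symbolic-ref HEAD refs/heads/master    # name the initial branch 'master'
git config user.name "Example Dev"         # placeholder identity for the sandbox
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"

git checkout -q -b wip
echo "model change" > user.rb
echo "unfinished experiment" > spike.rb
git add user.rb spike.rb
git commit -q -m "Spike on wip"
before=$(git diff --name-only master...wip | wc -l | tr -d ' ')

git checkout -q master
git checkout wip -- user.rb                # the extracted change lands on master
git commit -q -m "Add user model change"

git checkout -q wip
git merge -q --no-edit master              # merge master back into wip
after=$(git diff --name-only master...wip | wc -l | tr -d ' ')
echo "files in wip diff: before=$before after=$after"
```

Only the unextracted experiment remains in the wip diff after the merge back.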

Move forward

Now, you can start moving forward on the experimental branch again. Usually a grow-shrink pattern emerges where your diff grows as you forge ahead with new ideas and features and then shrinks as you extract small PRs, get them merged and deployed, and get origin/master merged back in. Sometimes the real implementation is much cleaner than your initial prototype; other times it involves more code, tests, and docs. Either way, the wip branch does two things:

Moves slowly towards the end implementation goal.
Gets smaller and smaller as you back merge and narrow in on your feature.

I’ve often found that as you get to the end, the wip branch narrows in on a single set of final changes that can be easily reviewed and merged.

Conclusions

The work in progress Pull Request extraction process enables many things. It…

Lets you experiment in fast-moving, legacy code bases.
Provides a process for breaking up a single feature so that many contributors can share the work.
Lets you get code out early and often, which avoids the scary (and error-prone) process of merging a large set of changes.
Enables you to focus on a small piece of a larger goal, giving your full attention to test coverage, code quality, documentation and feedback.
Facilitates better code review by allowing experts to be called in to look at small excerpts of a larger idea that are easy to digest/review while still providing the balance of the big picture.

The wip PR lets you see the big picture and the grand vision. It serves as a reference of your goal and an anchor to provide context to the extraction PRs. (If you use cross references in the extraction PRs back to the wip PR you get a nice history of the work done to implement the feature.) Extraction PRs let you dive deep into a particular problem, write clean code, get your tests in order, pull in experts, and focus on a small part of the big picture.

Building GitHub since 2011, programming language connoisseur, San Francisco resident, aspiring surfer, father of two, life partner to @ktkates—all words by me, Tim Clem.