January, 2015 ∙ 7 minute read
I’m sure this isn’t a novel idea, and in all likelihood it is standard practice for many engineering teams, but I’d like to document a software engineering strategy that I’ve been using for a few years while working on github.com. The pattern uses a single long-running, experimental, work-in-progress Pull Request (PR) from which multiple smaller PRs are extracted. Each smaller PR is based on master and, after its own successful review, deployment, and merge, is merged back into the long-running PR. This method is especially useful in fast-moving, large, legacy code bases.
When deciding to do some major work, the first step is always to branch off the current origin/master with a new long-running branch. This is pretty standard development practice with Git and GitHub.
❯ git checkout master     # Make sure you're on the master branch
❯ git pull origin master  # Pull down the latest changes
❯ git checkout -b wip     # Create your new branch called 'wip'
Often, when writing software, you have a goal or vision of what you want to do, but no clear idea of how you’re going to get there or even what the steps will be until you start experimenting. You can think of wip as your blank slate (wip stands for work in progress, but you can name yours anything you like). This wip branch serves as your scratch pad of the bleeding edge of your current thinking about how to tackle the problem. It isn’t intended to be merged directly, and as such, you can take some leeway in committing experimental code. The idea is to be able to move fast and change directions fast as you start to understand the scope of the problem. You also want to share your work as soon as possible for feedback and collaboration. This feels scary, but is almost always a good idea.
I like to push this branch out very early — maybe with only a couple of commits. Then, open a PR sketching out what you think you’d like to do.
The wip branch can often support more than a single developer pushing, but probably works best when 1-3 people participate so as not to get bogged down in merge conflicts. To coordinate a larger number of people working on the same feature it generally makes sense to spread people out on the extraction PRs and keep the wip PR under the oversight of the lead.
You should periodically merge origin/master back into the wip branch to stay up-to-date. In the github.com codebase this is an automated hard requirement for branch deployment.
❯ git fetch                # Fetch latest changes from the server
❯ git merge origin/master  # Merge origin/master into your branch
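Before merging, it can help to see how far the wip branch has drifted. The following is a minimal sketch that runs in a throwaway repository, so it is safe to try. It assumes a local master branch standing in for origin/master, and the file names are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository; a local `master` stands in for origin/master (assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"
git config user.name "Example Dev"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Start the long-running branch and add an experimental commit.
git checkout -q -b wip
echo "spike" > spike.txt
git add spike.txt
git commit -q -m "Experimental spike"

# Meanwhile, master moves ahead by one commit.
git checkout -q master
echo "v2" > app.txt
git commit -q -am "Unrelated change on master"

# Count the commits on master that wip does not have yet.
git checkout -q wip
behind=$(git rev-list --count wip..master)
echo "wip is behind master by $behind commit(s)"

# Merge master in, then count again.
git merge -q --no-edit master
behind_after=$(git rev-list --count wip..master)
echo "after merging: behind by $behind_after commit(s)"
```

In a real checkout the same count would be taken against origin/master after a git fetch.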
The wip branch should always be kept deployment safe, and if you have a production environment that supports branch deployment, it’s worth getting the wip branch out and live early. This makes staying in sync with origin/master easier and allows you to push changes to the full production environment as you go instead of waiting until the end.
As you develop on the wip branch, you’ll come to a place where you understand some discrete part of the system that can be extracted and pushed on its own. A simple example of this in Rails would be a model change and an associated database migration.
When you’ve identified one of these extractions, the first thing you do is start another brand new branch from origin/master with just the focused changes. In the case of a model change and related database migration, I will sometimes create two separate PRs so the changes can go out independently. In a legacy application like github.com, changing a model often means a database migration, changes to the Ruby models, a data transition of sorts from the old world to the new, followed by a cleanup PR (or set of PRs) that drops columns and fully commits to the new data model.
Assuming you’ve been developing on your wip branch, here are the steps to start an extraction branch:
❯ git checkout master           # Get back to the master branch
❯ git pull origin master        # Make sure you're up-to-date
❯ git checkout -b extraction-a  # Create a new branch off master called `extraction-a`
Then, I pull in just the changes necessary from the wip branch. You can do this by cherry picking commits, but I find it easier to check out the changed files directly from your wip branch and create a brand new set of commits. I will often dive deeper into the specific problem by adding tests, seeking out detailed code review from experts, and making sure the code is production quality and well documented. The extraction PR generally includes considerably more commits and work than the original spike on the wip PR.
Here’s an easy way to pull in changes in a particular file from another branch:
❯ git checkout wip -- app/model/user.rb
❯ git checkout wip -- db/migrations/some_migration.rb
❯ git commit -am "Enhance user to include..."
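To decide which files to pull over, it can be useful to list everything the wip branch has changed since it diverged from master. Here is a minimal sketch in a throwaway repository. The file names and contents are hypothetical, and a local master stands in for origin/master.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository with made-up files (assumption, for illustration only).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"
git config user.name "Example Dev"

mkdir -p app
echo "class User; end" > app/user.rb
git add .
git commit -q -m "Initial commit"

# The wip branch touches two files: one ready to extract, one still experimental.
git checkout -q -b wip
echo "class User; def admin?; false; end; end" > app/user.rb
echo "half-finished idea" > app/experiment.rb
git add .
git commit -q -m "Spike: user changes plus an experiment"

# Files changed on wip since it diverged from master (extraction candidates).
candidates=$(git diff --name-only master...wip)
echo "$candidates"

# Extract only app/user.rb onto a fresh branch off master.
git checkout -q master
git checkout -q -b extraction-a
git checkout wip -- app/user.rb

# The checked-out file is already staged; the experiment stays behind on wip.
staged=$(git diff --cached --name-only)
echo "staged: $staged"
git commit -q -m "Add User#admin?"
```

Because git checkout with a path updates both the working tree and the index, the extracted file is ready to commit without a separate git add.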
This is also a good place to rope in security review or even have other individuals push to the extraction branch. You can then branch deploy extraction-a, watch production for unintended behavior, further smoke test the new feature via the UI or a production console, run migrations/transitions, etc.
When everything looks good, use the merge button or manually merge and re-deploy master. Your changes are now live and in the main line (master).
Now, you’ll want to merge origin/master back into your wip branch. Generally this is painless (and something you’re already doing periodically). Sometimes, you’ll have refactored so far on your extraction branch that the changes look nothing like what you first implemented in the experimental wip branch. This is OK too, and these conflicts are usually pretty easy to sort out. If you do run into conflicts, at least they are in a part of the codebase that you have a lot of context with. I usually sort out conflicts manually just by looking at the conflict markers and editing each file into the proper state before issuing a git add on the file and finally a git commit to complete the merge once all conflicts are sorted.
In the happy path, this is how you’d merge back into wip:
❯ git checkout wip         # Make sure you're on the wip branch
❯ git fetch                # Get latest
❯ git merge origin/master  # Merge origin/master into your wip branch
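When the merge does conflict, the manual resolution described above might look like the following. This is a sketch in a throwaway repository that forces a conflict on a single made-up file. A local master stands in for origin/master, and the "edit" step is simulated by overwriting the file.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository with a made-up file (assumption, for illustration only).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"
git config user.name "Example Dev"

echo "original" > user.txt
git add user.txt
git commit -q -m "Initial commit"

# The experimental version lives on wip.
git checkout -q -b wip
echo "rough spike" > user.txt
git commit -q -am "Spike on wip"

# The polished extraction has landed on master and touches the same line.
git checkout -q master
echo "polished version" > user.txt
git commit -q -am "Extraction merged to master"

# Merging master into wip now conflicts; `|| true` keeps the script going.
git checkout -q wip
git merge --no-edit master > /dev/null 2>&1 || true

# List the files that still have unresolved conflicts.
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted: $conflicted"

# Resolve by writing the intended final content (here, the version from master),
# then stage the file and commit to complete the merge.
echo "polished version" > user.txt
git add user.txt
git commit -q --no-edit
```

In practice the overwrite would be replaced by opening each listed file in an editor and removing the conflict markers by hand.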
When you merge an extraction back into the wip branch, the wip branch diff actually shrinks because some of those changes are now on master. You can see this on the changes view in a PR.
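The shrinking diff can also be observed locally. The sketch below, run in a throwaway repository with hypothetical file names, counts the files in the wip diff before and after master is merged back. The three-dot form compares wip against its merge base with master, which I assume is close to what the PR changes view shows.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository with made-up files (assumption, for illustration only).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"
git config user.name "Example Dev"

echo "base" > model.txt
git add model.txt
git commit -q -m "Initial commit"

# wip changes the model and adds an unfinished feature.
git checkout -q -b wip
echo "new model" > model.txt
echo "unfinished" > feature.txt
git add .
git commit -q -m "Spike"

# The model change is extracted and lands on master.
git checkout -q master
git checkout -q -b extraction-a
git checkout wip -- model.txt
git commit -q -m "Model change"
git checkout -q master
git merge -q --no-edit extraction-a

# Number of files in the wip diff before merging master back.
before=$(git diff --name-only master...wip | wc -l | tr -d ' ')

# Merge master into wip; both sides made the same model change, so it is clean.
git checkout -q wip
git merge -q --no-edit master

# Number of files in the wip diff afterwards.
after=$(git diff --name-only master...wip | wc -l | tr -d ' ')
echo "files in wip diff: before=$before after=$after"
```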
Now, you can start moving forward on the experimental branch again. Usually a grow-shrink pattern emerges where your diff grows as you forge ahead with new ideas and features and then shrinks as you extract small PRs, get them merged/deployed, and get origin/master merged back in. Sometimes the real implementation is much cleaner than your initial prototype, other times it involves more code, tests, and docs. Either way, the wip branch does two things:
I’ve often found that as you get to the end, the wip branch narrows in on a single set of final changes that can be easily reviewed and merged.
The work in progress Pull Request extraction process enables many things. It…
The wip PR lets you see the big picture and the grand vision. It serves as a reference of your goal and an anchor to provide context to the extraction PRs. (If you use cross references in the extraction PRs back to the wip PR you get a nice history of the work done to implement the feature.) Extraction PRs let you dive deep into a particular problem, write clean code, get your tests in order, pull in experts, and focus on a small part of the big picture.