8 Principles of Continuous Delivery

continuous deliveryDave Farley co-authored “Continuous Delivery”, an excellent book in the Martin Fowler signature series, which goes into great detail about the evolution of Continuous Integration, and how to achieve continuous delivery (or continuous deployment) using “build pipelines”.

I went along to hear Dave Farley give a talk on Continuous Delivery and how they’re doing it where he works, at LMAX. It was a really great session and he managed to cover, in quite a short session, a great deal of content from in the book. I’ve put together a highlight of what he covered in the talk, mixed with my own take on things

Here’s what I learned…

Continuous delivery is basically the logical extension of Continuous Integration  – it’s a more holistic solution than C.I. though, as it encapsulates a lot more than just the development of software.

For instance, continuous delivery focuses a lot more on requirements than C.I. ever did, and involves a great deal more people on the delivery chain than traditional C.I. as well. It also has a greater customer focus than C.I.

Now, here’s something I didn’t know about continuous delivery…

There are 12 principles behind the agile manifesto. the first of which is:

Our highest priority is to satisfy the customer through early and continuous delivery of valuable software

Well who’d have thought it? Continuous delivery was mentioned waaaaay back in the days of the agile manifesto, some 2500 years BC* and yet for most of us it seems like a pretty new idea.

Continuous delivery is based on the use of smart automation. This is all about creating a repeatable and reliable process for delivering software. You have to automate pretty much everything in order to be able to achieve continuous delivery. manual steps will get in the way or become a bottleneck. This goes for everything from requirements authoring to deploying to production.

The focus is on the finished article – again, this is described as being:

Working software in the hands of the user

software in the hands of the user

software in the hands of the user

Because the focus is on the software in the hands of the user, there’s less tendency from a developers perspective, to simply chuck software over the wall to the QA team, and similarly to the Netops/production team.

Continuous delivery is all about getting that product out there, and getting the feedback from the users. This might mean delivering “unfinished” demo software during your development iterations, and getting your users to give valuable early feedback, or it might mean deploying experimental software to a website cluster and tracking how successful this new site is as compared to the existing system. Either way, it’s all about feedback loops. Essentially you want to have as rapid a feedback loop with your users as possible.

Feedback loops are familiar to everyone who has worked on a Continuous Integration system. In C.I. feedback loops are generally about getting test feedback (unit test, acceptance test, performance test etc) as quickly as possible – “Fail Fast” – as you’ve probably heard.

Continuous Delivery, as described, takes this idea to it’s logical conclusion, and gets the users involved in the feedback loop. This is a good example of how Continuous Delivery is more holistic than its C.I. predecessor. In Continuous Delivery, the feedback loop provides feedback not only on the quality of your code, but on the quality of your requirements, and the quality of your processes for delivering software.

8 Principles of Continuous Delivery

  1. The process for releasing/deploying software MUST be repeatable and reliable. This leads onto the 2nd principle…
  2. Automate everything! A manual deployment can never be described as repeatable and reliable (not if I’m doing it anyway!). You have to invest seriously in automating all the tasks you do repeatedly, and this tends to lead to reliability.
  3. If somethings difficult or painful, do it more often. On the surface this sounds silly, but basically what this means is that doing something painful, more often, will lead you to improve it, probably automate it, and this will eventually make it painless and easy. Take for example doing a deployment of a database schema. If this is tricky, you tend to not do it very often, you put it off, maybe you’ll do 1 a month. Really what you should do is improve the process of doing the schema deployments, get good at it, and do it more often, like 1 a day if needed. If you’re doing something every day, you’re going to be a lot better at it than if you only do it once a month.
  4. Keep everything in source control – this sounds like a bit of a weird one in this day and age, I mean seriously, who doesn’t keep everything in source control? Apparently quite a few people. Who knew?
  5. Done means “released”. This implies ownership of a project right up until it’s in the hands of the user, and working properly. There’s none of this “I’ve checked in my code so it’s done as far as I’m concerned”. I have been fortunate enough to work with some software teams who eagerly made sure their code changes were working right up to the point when their changes were in production, and then monitored the live system to make sure their changes were successful. On the other hand I’ve worked with teams who though their responsibility ended when they checked their code in to the VCS.
  6. Build quality in! Take the time to invest in your quality metrics. A project with good, targeted quality metrics (we could be talking about unit test coverage, code styling, rules violations, complexity measurements – or preferably, all of the above) will invariably be better than one without, and easier to maintain in the long run.
  7. Everybody has responsibility for the release process. A program running on a developers laptop isn’t going to make any money for the company. Similarly, a project with no plan for deployment will never get released, and again make no money. Companies make money by getting their products released to customers, therefore, this process should be in the interest of everybody. Developers should develop projects with a mind for how to deploy them. Project managers should plan projects with attention to deployment. Testers should test for deployment defects with as much care and attention as they do for code defects (and this should be automated and built into the deployment task itself).
  8. Improve continuously. Don’t sit back and wait for your system to become out of date or impossible to maintain. Continuous improvement means your system will always be evolving and therefore easier to change when needs be.

To go with these principles there are also:

4 Practices of Continuous Delivery

  1. Build binaries only once. You’d be staggered by the number of times I’ve seen people recompile code between one environment and the next. Binaries should be compiled once and once only. The binary should then be stored someplace which is accessible only to your deployment mechanism, and your deployment mechanism should deploy this same binary to each successive environment…
  2. Use precisely the same mechanism to deploy to every environment. It sounds obvious, but I’ve genuinely seen cases where deployments to QA were automated, only for the production deployments to be manual. I’ve also seen cases where deployments to QA and production were both automated, but in 2 entirely different languages. This is obviously the work of mad people.
  3. Smoke test your deployment. Don’t leave it to chance that your deployment was a roaring success, write a smoke test and include that in the deployment process. I also like to include a simple diagnostics test, all it does it check that everything is where it’s meant to be – it compares a file list of what you expect to see in your deployment against what actually ends up on the server. I call it a diagnostics test because it’s a good first port of call when there’s a problem.
  4. If anything fails, stop the line! Throw it away and start the process again, don’t patch, don’t hack. If a problem arises, no matter where, discard the deployment (i.e. rollback), fix the issue properly, check it in to source control and repeat the deployment process. A lot of people comment that this is impossible, especially if you’ve got a tiny outage window to deploy things to your live system, or if you do your production changes are done in the middle of the night while nobody else is around to fix the issue properly. I would say that these arguments rather miss the point. Firstly, if you have only a tiny outage window, hacking your live system should be the last thing you want to do, because this will invalidate all your other environments unless you similarly hack all of them as well. Secondly, the next time you do a deployment, you may reintroduce the original issue. Thirdly, if you’re doing your deployments in the middle of the night with nobody around to fix issues, then you’re not paying enough attention to the 7th principle of Continuous Delivery – Everybody has responsibility for the release process. Unless you can’t avoid it, I wouldn’t recommend doing releases when there’s the least amount of support available, it simply goes against common sense.

* Approximate date.


Continuous Delivery in Practice

A couple of months ago I was fortunate enough to be invited to the Thoughtworks live 2011 event in London. The main topics of this event were agile (as you’d expect from Thoughtworks) and Continuous Delivery. The event was run over 2 days, the first day included talks from Jez Humble and Martin Fowler, as well as a number of other speakers (Dave West did a session on agile vs the rest, Beyond BudgetingBjarte Bogsnes presented a very interesting session called “Beyond Budgeting”, and Scott Durchslag was guest speaking about how agile has worked at Expedia, and his interest in Continuous Delivery). My colleague Steve Morgan was also in attendance on the first day and has produced this excellent post on the contents of day 1. I won’t say much more about the first day because Steve’s post covers it better than I ever could (he cheated by taking notes/paying attention). However, I will just add that it was great to see so many people who genuinely seemed passionate about the evolution of agile, and the coffee breaks were a great opportunity to pick the brains of Messrs Fowler and Humble. I particularly enjoyed the “Beyond Budgeting” talk from Bjarte, and am happy to say he’s written a book about it, which you can buy on the internets!

The only other thing I would say about day 1 was that there was a reasonably good choice of biscuits. Biscuit variety is important and mustn’t be underestimated.

Day 2

Day 2 was somewhat more hands on than the first day, and also more tools-oriented. It started off with another session from Jez Humble and Martin Fowler, where they discussed continuous delivery, covering some of the contents of Jez’s book, and also going into detail about branching (feature branching vs Continuous Integration).Continuous delivery book The sessions then moved on to cover some of the products being produced at Thoughtworks Studios. There were talks from Andy Kemp and Suzie Prince about their products Mingle and Twist. Mingle is marketed as an agile project management tool. I’ve not used it myself but it seems to be based around shared workspaces and good requirements tracking. As you’d expect, it seems to integrate well with the other Thoughtworks products, which is probably one of its main advantages. Twist is a functional testing platform, which is very agile-centric in that it allows you to run requirements specs as tests in the “As a user, I want to…” style. Again it’s the integration with the other tools and products that made it look good to me. I’ve used a lot of agile tools recently, which all seem to be great on their own, but putting them all together so that I can track a requirement through to a test and a change through to a build can be a lot of hard work. The final product that they presented was Go, which is their Continuous Integration system, and they even showed us how they used Go to deliver builds of Mingle, Twist and even Go itself (talk about eating your own dogfood).

The centrepiece of day 2 though was of course my very own talk on how we at Caplin Systems are using Go in our Continuous Delivery system 🙂 I was invited to present a session on how Go is helping us achieve our Continuous Integration goals, and how we are using it to implement our own brand of Continuous Delivery. I’ve embedded my presentation above – sorry it’s not a video. I won’t go into too much detail (I’ll save you the lecture), but basically we’re using build pipelines along with a high degree of automated tests to deliver builds which are suitable for deployment. The other feature I have to mention is how Go manages multiple agents. We have about 60 active agents which Go manages. A build can be farmed out to any of these 60 agents, and if a particular resource is needed (let’s say I need to run a test on a particular OS, like Centos) then Go can be configured to send the builds to the right agents. Particular builds can be configured to be excluded from certain agents, meaning the system is highly configurable.

In the afternoon we had an open-space session, during which I spent most of my time with a group discussing database deployments and how to bring db changes under Continuous Integration. It seems there are still a lot of people around who feel that this aspect is largely overlooked with respect to Continuous Integration, and this was reflected in the way people were talking about using manual db compare tools as a way of deploying db changes to production. My own feelings on this are that database changes should be treated more like code changes, with each change scripted and deployed as a code change would be. There needn’t be destructive changes, and a good set of test data should help catch the issues that are often only found in production.

Greasemonkey script for CI system

Here in Caplin Towers (it’s not really called that) we’ve got a couple of projectors displaying the Continuous Integration builds up on the walls. It’s pretty useful until you get to the point where you’ve got more projects than space on the wall. We got to that point a while ago, and have had to resort to only displaying the “most important” builds on the wall. Clearly this is not very cool, because all the builds are important.

Sorry, there's no room here!

Sorry, there's no room here!

I decided to write a script which would scroll through all of the build groups and display this on the wall. I worked out that it would take about a minute to scroll through the whole lot, with a 4 second pause on each build group. My first thought was to use Watir (a ruby based browser scripting tool), and this would have probably worked fine on a Jenkins, Bamboo or CruiseControl system, but not for Go (I needed my solution to work for Go as many of our builds are in this system at present). You see, Go displays build groups by use of “views” (like Jenkins does). Unfortunately in Go there isn’t a different url for each view, meaning I can’t just write a simple ruby script that loads up a different page for each build group. I guess it must be handled by javascript.

So, I decided to try selenium. In theory this should have worked fine, and indeed it would have if I could be bothered to spend a bit more time on it. My plan was to record a journey which loaded up each view, one after another, and then play back this journey using selenium RC so that I could put it into a scheduled cron job and have it run over and over again. Like I said, in theory it works fine, but in practice it wasn’t such a great idea afterall. Firstly, there’s always that delay as selenium initializes and loads the browser, then there’s the presence of the selenium window, and then there’s the problem of having to update the script every time a new build group is added. I know most of these issues can be overcome fairly easy, especially if you’re selenium savvy or if you have a java framework for laoding and running selenium tests in place. I was just about to go down the route of writing my journey in java (mainly so that I can manipulate the window sizes more easily), when my colleague Edmund Dipple, said “I saw you struggling, so I’ve knocked this up” and showed me a greasemonkey script which does exactly what I was looking for. 🙂

Basically the script runs through each pipeline group, one after the other, and pauses for 5 seconds on each one before moving on. Perfect. He used the chrome developer tools (or you could use Firebug on Firefox) to find out the name of the pipeline group container (which turned out to be “pipeline_groups_container”) and then iterate through each of the child elements (the child elements represent each pipeline group). The full script is here:

var timeout = 5000;

var counter = 0;
var groups = document.getElementById(“pipeline_groups_container”).children;
var groupsLength = groups.length;

function scroll()

groups[i].style.display = “none”;
groups[counter].style.display = “block”;


if(counter == groupsLength)
counter = 0;




And now we see each build group on screen, one at a time:

This is one pipeline group....

...and this is another

What is in a name? Usually a version number, actually.

Another fascinating topic for you – build versioning! Ok, fun it might not be, but it is important and mostly unavoidable. In an earlier blog I outlined a build versioning strategy I was proposing to use with our Java builds. Since then, the requirements have changed, as they tend to, and so I’ve had to change the versioning convention.

Essentially, what I’m after is a way of using artifact version numbers to tell me some useful at-a-glance information about the artifact I have created. Also, customers want the version number to meet their expectations – that is, when they get a new build, they want to see an easily identifiable difference in the version number between the new build and their old one. What they don’t want is a long complicated list of numbers which are hard to distinguish. For instance, it’s much easy to identify which of the following 2 versions is the latest:

  • 5.0.1
  • 5.0.4

but it’s not so easy to work out which of these is the latest:


As we’re practicing continuous delivery, any given check-in can feasibly produce a release build. So, I would like some way of identifying exactly which check in produced my builds, or at least have a way of working out which bits of source code went into my released package. There are a couple of ways we can do this:

Tag the source code – We could make the builds tag the source code in our SCM system (Perforce) with every build. This is relatively easy to do using Ant and Maven. With Ant there are numerous different ways of doing it depending on your SCM system, for instance, with subversion you need to use the SvnAnt tasks from subclipse (http://subclipse.tigris.org/svnant/svn.html) and basically perform a copy of your source url:

 <copy srcUrl=”${src.url}” destUrl=”${dest.url}” message=”${version.num}”/>

(this is because tags in svn are just cheap copies with a label).

With Maven you just need to use the release plugin – this automatically handles tagging for you.

Tagging the source code is great – it keeps the version numbers as simple as I’d like, and it’s nicely traceable. However, it’s time consuming, and can result in a lot of tags.  The other problem is, I can’t tell which check-in caused the build just by looking at the version number of an artifact.

Use the commit number in the build version – We use a build version of Major.Release.Patch-Build in our artifacts. The build number used to be an auto-incrementing number – this worked fine but it didn’t give us a link back to which commit had caused the build to be made. So, I decided to use the perforce changelist id (i.e. the commit version) as the build number in the version, so that builds would end up looking something like this: 1.0.0-11531.

The problem here is that the version number is not customer friendly – so I remove the build number as a final step, before the builds get released to customers. To track what version the customers have got, I still keep a record of the full build number (including the commit number) in the release notes, and I could also easily inject it into an assembly info or properties/config file if I so wished, so that customers could very easily read out the full version number just by looking in a menu somewhere.

There were several obstacles I had to overcome to get this working. The first obstacle, and really this was the one that stopped me from tagging the source code, was that the maven release plugin is abysmal when it comes to continuous delivery. I needed to use the release plugin to tag the source code, but one of the other things that the maven release plugin does is to remove the word SNAPSHOT, increment the version number, and check the pom back into source control. This would cause another build to trigger in the CI system, which in turn would increment the build number etc and cause another build to trigger – so on and so on. Basically it would create a continually building project.

So I have decided not to use the maven release plugin at all – it doesn’t seem to fit in with Continuous Delivery. In order to create potential release candidates with every successful build, I’ve removed the word SNAPSHOT from all the poms, so we aren’t making any snapshot builds anymore either (except when you build locally – more on that later). The version in the poms now takes the P4 commit number, which is injected via the Continuous Integration system, which in my case is Go. Jenkins also supports this, using the subversion plugin (if you use subversion), which sets an environment variable with the svn revision number (more details here). The Jenkins Perforce plugin does the same thing, setting the P4_CHANGELIST environment variable – so it can easily be consumed (more details here).

Go takes the P4 changelist number and puts it in an environment variable called “GO_PIPELINE_LABEL”. I read this variable in, and assign it to a property called p4.revision. I do this in the command that kicks off the build, so that it overwrites a default value which I can keep in my pom – this is useful because it means my colleagues and I don’t have to make any changes to the pom if we want to run a build locally (bear in mind if we run it locally this environment variable won’t exist on our PCs, so the build would otherwise fail). Here’s a basic run down of a sample pom, with more details to follow:


<description>Description about this application</description>







The value for p4.revision is “SNAP” by default, meaning that if I make a local build, I’ll get an artifact with the version 5.0.2-SNAP. I know that these builds should never be promoted to production or handed to customers because the word SNAP gives it away.  However, when a build is created by the CI system, the following command is passed:

clean deploy sonar:sonar -Dp4.revision=${env.GO_PIPELINE_LABEL}

This overwrites the value for p4.revision, passing in the Perforce commit number, and the build will create something like 5.0.2-1234 (where 1234 is my imaginary p4 commit number).

I’ve added a property called main.version, which is the same as the full version but without the build number. I’ve done this so that I can package up my customer builds (ina  zip) and label them with the version 5.0.2. After all, customers don’t care about the build number.

An important policy to follow is once a build is released to a customer, one of the other version numbers MUST be increased, meaning all further builds will be at least 5.0.3. The decision of which version number to increase depends on various business factors – I like to increase the 3rd number if I’m releasing a patch to a previously released build. If I’m releasing new functionality I increase the second number. The first number gets increased for major releases. The whole issue of version numbers becomes a lot less complicated if you’re in the business of releasing software to web servers and you don’t actually have to hand software over to customers. In this instance, I just keep the full version number with the build number at the end, as it’s usually someone like me who has to look after the production system anyway!

The Boy Who Cried Wolf – Why Broken Builds Must Not Be Ignored!

We’ve all heard the fable of the boy who cried wolf, an old tale written by a Greek slave who lived a long time ago and liked telling stories. I must stress thought that this was just a story, but like many good stories (Star Wars, Gremlins 2) it’s based on true events. It’s a little known fact (which I have just made up) that the story of the Boy Who Cried Wolf is actually based on an ancient Continuous Integration system (possibly belonging to Spartatech inc, a leading software development company of from around 600BCE).

The are only very sketchy records of what actually happened during this historical event, but here’s what we know for sure*:

  • The Continuous Integration system compiled code and ran unit tests.
  • The unit tests were followed by acceptance tests, which in turn were followed by integration tests.
  • The integration tests were followed by cross-platform tests.

One day, the C.I. system alerted the developers, and anyone else who would listen, that one of the builds was failing!



All the workers gathered around the C.I. system to take a look at the error, only to discover that in actual fact, the problem was that the machine on which the tests were running had restarted, thanks to a windows update! The developers went back to work and thought no more of it.

The very next day, the C.I. system went ballistic again, alerting the developers of yet another failure. Once again the team took notice and looked into the error. This time they discovered that the test machine was running slowly and their unit test had timed out. They restarted the machine and it all passed.

By the following week, the C.I. system had alerted the team to 5 other “failed” builds, which were simply a result of the test servers behaving oddly, and no change to the code was required at all. By the end of that week the dev team had stopped paying attention to these alerts, because they all had proper jobs to do, and a lot of work to get through by the end of the sprint (they were agile, even back then – so if you still work according to “Waterfall” THAT’S how far behind the times you are).

Then, one day, a developer checked in a piece of code which failed a unit test. The C.I. system rightly alerted the development team again, but this time nobody came running, in fact, nobody paid the blindest bit of notice. They were all so used to ignoring the C.I. system that when a real problem finally arose, it wasn’t picked up by anyone. Anyway, cutting an already long story a bit shorter, the bug was a biggie, and the next release of their software was so poor that Spartatech inc, the biggest employer in Sparta, went out of business making everyone redundant. This is what basically led to the collapse of Sparta, not sure if you knew that. And here concludes your exceedingly dodgy history lesson.

Back to the present-time: We all know that false-negatives are the enemy of a good Continuous Integration system, they lead us down a path from which it’s increasingly hard to recover. The problem is that it’s just so easy to get into that situation.  Thankfully, there are a few “Continuous Integration best practices” which we can follow, which can help us keep our system in good nick:

  • Make sure the servers on which you run your builds are cleaned regularly, preferably before every build. I would suggest a clean image be deployed each morning. This obviously means that deploying and configuring your system will need to be automatic.
  • Use tools such as VMWare, VirtualBox etc to manage your Virtual Machines.
  • Use tools like Puppet, Chef and Vagrant to automate the deployment and configuration of these VMs.
  • Add the deployment of the VMs to your C.I system!
  • If there are any randomly failing tests, which also randomly fail on your local machine, remove them or rewrite them to make them more reliable.
  • When doing sprint planning, make sure time is dedicated to investigating and fixing broken or flaky builds. It’s very important to ensure the Product and Project Managers are aware of the importance of the C.I. system, and the risk of not maintaining it properly.
  • Treat the rules of Continuous Integration as sacrosanct – if a build fails, fix it as soon as possible.

In an earlier post I wrote about the difference between having a Continuous Integration server and practicing Continuous Integration. If you start to get into the situation where you’re allowing broken builds to go un-fixed, you’re slowly slipping away from actually practicing continuous integration, and soon you’ll just be a development team who has a continuous integration server which is delivering much less value than it ought to.

*I may have got my chronology a little mixed up

Nant NUnit and FxCop Build Script

Here’s a nant script I’ve knocked up which does the following:

  • Compiles the solution in debug
  • Moves the Unit test dlls into a single common folder
  • Applies Unit tests
  • Generates reports
  • Runs an FxCop analysis
  • Outputs the FxCop analysis to an xml file

I’ve used this build script in a continuous integration system that doesn’t actually publish any code. The purpose of this system is to run the unit tests and the code analysis quickly so that feedback can be obtained as early as possible. I’ve reused most of this script in the nightly build, which actually deploys the resulting code to a test server. The main difference is that the build is done in release maode in that instance. Anyway, here’s the script:

<?xml version=”1.0″ encoding=”utf-8″?>
<project name=”Unit tests” default=”run”>

<property name=”namespace” value=”MyProjectName” />
<property name=”test.dir” value=”.\UNITTESTS” />
<property name=”nant.settings.currentframework” value=”net-2.0″ overwrite=”false”/>
<property name=”output.dir” value=”c:\output” overwrite=”false”/>
<!– fxcop props –>
<property name=”fxcop.basedir” value=”C:\FxCop” />
<property name=”fxcop.executable” value=”${fxcop.basedir}\fxcopcmd.exe” overwrite=”false”/>
<property name=”fxcop.template” value=”${fxcop.basedir}\Template.FXCop” overwrite=”false”/>
<property name=”fxcop.out” value=”${output.dir}\${namespace}.fxCop.xml” overwrite=”false”/>

<target name=”compile”>
<msbuild project=”${namespace}.sln” >
<arg value=”/property:Configuration=Debug”/>

<target name=”move” description=”moves the unittest dlls into a single common folder”>
<mkdir dir=”${test.dir}” />
<copy todir=”${test.dir}” includeemptydirs=”false” flatten=”true”>
<fileset basedir=”..\”>
<include name=”**\*UnitTests.dll” />

<target name=”test” depends=”compile” description=”Apply unit tests”>
<property name=”nant.onfailure” value=”fail.test”/>

<formatter type=”Xml” usefile=”true” extension=”.xml” outputdir=”${output.dir}” />
<assemblies basedir=”${test.dir}” >
<include name=”*UnitTests.dll”/>

<nunit2report out=”${output.dir}\${namespace}.html” >
<include name=”${output.dir}\*.dll-results.xml” />
<property name=”nant.onfailure” value=”fail”/>

<target name=”fail.test”>
<nunit2report out=”${output.dir}\${namespace}.html” >
<include name=”${output.dir}\*.dll-results.xml” />

<target name=”cop”>
<echo message=”Running FxCop Analysis”/>
<exec program=”${fxcop.executable}” commandline=”/p:${fxcop.template} /f:${test.dir}\*.dll /o:${fxcop.out} /s” failonerror=”false”/>
<!–<echo message=”Logging Analysis Results”/>
<fxcoplogger dbserver=”myDbServer” dbname=”CodeAnalysis” trustedconnection=”true” />–>

<target name=”run” depends=”compile, move, test, cop”>


One thing you might note is that there appears to be a bit of replication of the same code, namely the nunit2report. But if you follow the logic, you’ll see the reason for this. We don’t want the build to stop processing if one of the unit tests fail, we actually want to capture the results, carry on, and present them in the build report. So what you do is add this line:

<property name=”nant.onfailure” value=”fail.test”/>

This tells the the script to go to the fail.test target, only if the build fails. This target generates the nunit2report, thus:

<target name=”fail.test”>
<nunit2report out=”${output.dir}\${namespace}.html” >
<include name=”${output.dir}\*.dll-results.xml” />

However, I still want to see the nunit2report even if the build doesn’t fail, so I have put a copy of the same target inside the test target, this means that if the unit tests fail or pass, I get my report, and if the tests pass, the build process carries on to do an FxCop analysis.

I think the outline of this system is covered in a book called Expert .Net Delivery by Marc Holmes, which is a great resource for getting up and running with Nant. It was the first book I bought when I started using Nant, and I recall it covered the fundamentals very well. It’s a pity the title doesn’t have “Continuous Integration” in it, because it’s actually a good place to start when setting up a .Net based CI system too.

Speaking of CI – if you use CruiseControl.Net there are handly transforms for getting your FxCop and Nunit reports published alongside your build on the CI dashboard. Alternatively you can hack the CruiseControl.net dashboard and provide a customised link to the nunit2report’s output html – it all depends on which format you prefer. I can’t decide, so I have both!

Continuous Integration: The Last Mile

Conquering the Last Mile

I went to the London C.I. Meetup last week where Gus Power (how he chose a career in I.T. and not as a pro-wrestler with a name like that I do NOT know) delivered a talk entitled “C.I. The Last Mile”. Gus seems to have a comprehensive and varied background in lean and agile software engineering and delivery, and in this session he talked to us about “Conquering the last mile”, that is, getting working software out of the door.

This is NOT Gus Power

This is NOT Gus Power

We’ve all heard (or maybe even said a few times) the oft repeated mantra “but it works fine on my machine!”. This situation is commonplace, a build, for instance, will run just fine on your own PC, but as soon as it runs on the build server, it fails. Gus reminds us that, although this scenario is common, it’s a million miles away from working, releseable software, for a number of reasons, and that the only true definition of “working” software is “running, tested, and in the hands of the customer”.

Continuous Integration – is it Just a “Software” Practice?

Continuous Integration as a practice is traditionally quite a few steps removed from the world of running, tested software in the hands of the customer. In fact, Gus suggested that C.I. is usually associated with automating builds and running unit tests (although I would suggest that this is probably an increasingly naive view in this day and age, but for the sake of his talk it helped to make a point). This is obviously a long way away from delivering working software to a customer. Gus told us of his own company’s realisation that software, in their case, was not the product, and that their “product” included a lot more than just the software that the dev team delivered. He was of course referring to the supporting software and systems, and infrastructure.

Think of a web-based company, for instance Uswitch.com (for whom I used to work). While their core product may be a lot of web pages and associated content, their overall system consists of a lot more – for instance, there’s the IIS configurations, the data in the databases, the load balancers and of course the infrastructure (servers, Operating Systems etc). Note that Continuous Integration, in many cases, doesn’t cover most of this supporting system, only the company’s proprietary software (including the database in the case of Uswitch.com – at least, that was the case a few years ago when i worked there).

Gus pointed out that all these other parts of the system should also be brought under the watchful eye of Continuous Integration, and I agree with him wholeheartedly.

Lean Manufacturing

Gus went on to introduce us to the Lean Manufacturing model as used by Toyota.

Image: Toshifumi Kitamura/AFP/Getty

He said that Toyota enjoy a great deal of success due to their use of a system based on tolerances, rather than rigid specs. He seemed to suggest that this approach could be applied to some of the non-software aspects of a system, such as infrastructure.

Acceptance Criteria for Systems

I think Gus raised a particularly interesting point about the lean manufacturing system, and it made me think about how much attention we usually pay to “acceptance criteria”. In the current agile environment, much attention is focused on what constitutes “acceptance” when it comes to designing and delivering software. Many tools exist, such as Fitnesse and Cucumber, which allow us to capture acceptance criteria in relation to a user story or use case. However, while this practice is now common across many software companies, the focus tends to be on the software that’s being developed, and often doesn’t include the supporting parts of the system, such as the infrastructure (hardware, load balancers, data, etc and so forth).

cucumber - behaviour driven development with elegance and joy

Of course, there’s an obvious reason for this. Most of the stories come from user requirements, the stories are made into tasks and the tasks are worked on by the software engineers. There’s a disconnect between those responsible for delivering the software (i.e. the software engineers) and those responsible for the supporting systems (who are they anyway???). Also, the customer’s requirements are often agnostic when it comes to the supporting parts of the system – they generally don’t care about the load balancer’s capabilities, or what type of web server you use).

What we’re talking about here is the difference between “software requirements” and “system requirements”, and how the latter are quite often overlooked, or at least overshadowed by the former.

Software Acceptance vs Product Acceptance

Gus finished off his talk by recommending that all parts of your “product” (I prefer the term “system” here, but that’s just me) should be brought under the control of your C.I. system and that there should be some effort to capture and measure system acceptance criteria – here he refers back to the Toyota system of using tolerances.

I raised the subject of Chef & Puppet in the Q&A session at the end. These tools allow you to treat what are traditionally considered “infrastructure” parts of your system as code. With these tools you can easily script up and deploy VMs and Operating Systems along with their configurations,  and they also work well in a C.I. environment. Gus said that he has enjoyed some success with Chef.

This is not the chef we were referring to

This is not the chef we were referring to

Maven Release Plugin and Perforce Clientspecs

I’m getting a very annoying error trying to do maven release builds on our CI servers, which isn’t appearing on my local workstation. The build seems to fail because the pom file is under source control (I’m using Perforce) and so it’s read-only. However, It should check out the pom file so that it isn’t read only. Alas, that doesn’t seem top be working. Here are the errors I got initially:

[INFO] ————————————————————————
[INFO] Error writing POM: C:\Program Files\Cruise Agent\pipelines\yadda\yadda\pom.xml (Access is denied)

The reason behind it appeared to be:

[ERROR] CommandLineException Exit code: 1 – Client ‘xpcruisebuildvm1-STANDARD-ALL’ can only be used from host ‘xpcruisebuildvm1’.

Command line was:p4 -d “C:\Program Files\Cruise Agent\pipelines\yadda\yadda” -p perforce.mycompany.com:1666 ed
it pom.xml
org.codehaus.plexus.util.cli.CommandLineException: Exit code: 1 – Client ‘xpcruisebuildvm1-STANDARD-ALL’ can only be used from host ‘xpcruisebuildvm1’

So the first thing I did was create a new clientspec for that machine which had all the files mapped to its c:\temp directory, and then tried to run the build again, assuming this would fix the issue but present me with a whole new problem about how to fit this all in to my CI system. However, this also failed:

[INFO] ————————————————————————
[INFO] Error writing POM: C:\TEMP\yadda\pom.xml (Access is denied)

This is confusing because the clientspec I’m using does have these files in its view and therefore Maven should be able to edit them. Also, this is how I have it setup on my local workstation, and that works fine…

After a lot of trial and error I managed to make some progress. One of the issues was that the client spec had the following mapping in it:

//JavaDevelopment/main/yadda/… //XPCRUISEVM808_release/temp/yadda/…

and the root was set to C:\

This means all P4 files should sync to my C:\TEMP directory, which already existed on the machine. As expected, the sync worked fine and all files appeared in C:\TEMP\yadda.

And therein lies the rub: the client spec uses lowercase “temp”, while the windows directory was uppercase “TEMP”. As simple as that. I changed the client spec to math the filesystem and for good measure updated my release plugin version to 2.1. Problem solved (well, actually that presents me with a whole new problem because I don’t want every build agent to have different client specs, because I’m using Go, and that puts them in it’s own directory under the name of the build job. Grrrr).

Build Versioning Strategy

Over the last few years I’ve followed a build versioning strategy of the following format:

<Major Version>.<Release Version>.<Patch Number>.<Build ID>

The use of decimal points allows us to implement an auto-incrementing strategy for our builds, meaning the Build ID doesn’t need to be manually changed each time we produce a build, as this is taken care of by the build system. Both Maven and Ant have simple methods of incrementing this number.

Ensuring that each build has a unique version number (by incrementing the Build ID) allows us to distinguish between builds, as no two builds of the same project will have the same BuildID. The other numbers are changed manually, as and when required.

When to Change Versions:

Major Version – Typically changes when there are very large changes to product or project, such as after a rewrite, or a significant change to functionality

Release Version – Incremented when there is an official release of a project which is not considered a Major Version change. For example, we may plan to release a project to a customer in 2 or 3 separate releases. These releases may well represent the same major version (say version 5) but we would still like to be able to identify the fact that these are subsequent planned releases, and not patches.

Patch Number – This denotes a patch to an existing release. The release that is being patched is reflected in the Release Version. A patch is usually issued to fix a critical bug or a collection of major issues, and as such is different to a “planned” release.

Build ID – This auto-increments with each release build in the CI system. This ensures that each build has a unique version number. When the Major Version, Release Version or Patch Number is increased, the Build ID is reset to 1.

Examples: – This represents release 17.23. It is the 9th build of this release. – This is the next release, release 17.24. This is the first build of 17.24. – This represents a patch for release 17.24. This is the first patch release, and happens to be the 2nd build of that patch.

Continuous Delivery using build pipelines with Jenkins and Ant

My idea of a good build system is one which will give me fast, concise, relevant feedback, but I also want it to produce a proper finished article when I’ve checked in my code. I’d like every check-in to result in a potential release candidate. Why? Well, why not?

I used to employ a system where release candidates were produced separately to my check-in builds (also known as “snapshot” builds). This encouraged people to treat snapshot builds as second rate. The main focus was on the release builds. However, if every build is a potential release build, then the focus on each build is increased. Consequently, if every build could be a potential release candidate, then I need to make sure every build goes through the most rigorous testing possible, and I would like to see a comprehensive report on the stability and design of the build before it gets released. I would also like to do all of this automatically, as I am inherently lazy, and have a facebook profile to constantly update!

This presents me with a problem: I want instant feedback on check-in builds, and to have full static analysis performed on them and yet I still want every check-in build to undergo a full suite of testing, be packaged correctly AND be deployed to our test environments. Clearly this will take a lot longer than I’m prepared to wait! The solution to this problem is to break the build process down into smaller sections.

Pipelines to the Rescue!

The concept of build pipelines has been around for a couple of years at least. It’s nothing new, but it’s not yet standard practice, which is a pity because I think it has some wonderful advantages. The concept is simple: the build as a whole is broken down into sections, such as the unit test, acceptance test, packaging, reporting and deployment phases. The pipeline phases can be executed in series or parallel, and if one phase is successful, it automatically moves on to the next phase (hence the relevance of the name “pipeline”). This means I can setup a build system where unit tests, acceptance tests and my static analysis are all run simultaneously at commit stage (if I so wish), but the next stage in the pipeline will not start unless they all pass. This means I don’t have to wait around too long for my acceptance test results or static analysis report.

Continuous Delivery

Continuous delivery has also been around for a while. I remember hearing about it in about 2006 and loving the concept. It seems to be back in the news again since the publication of “Continuous Delivery”, an excellent book from Jez Humble and David Farley. Again the concept is simple, roughly speaking it means that every build gets made available for deployment to production if it passes all the quality gates along the way. Continuous Delivery is sometimes confused with Continuous Deployment. Both follow the same basic principle, the main difference is that with Continuous Deployment it is implied that each and every successful build will be deployed to production, whereas with continuous delivery it is implied that each successful build will be made available for deployment to production. The decision of whether or not to actually deploy the finished article to the production environment is entirely up to you.

Continuous Delivery using Build Pipelines

You can have continuous delivery without using build pipelines, and you can use build pipelines without doing continuous delivery, but the fact is they seem made for each other. Here’s my example framework for a continuous delivery system using build pipelines:

I check some code in to source control – this triggers some unit tests. If these pass it notifies me, and automatically triggers my acceptance tests AND produces my code-coverage and static analysis report at the same time. If the acceptance tests all pass my system will trigger the deployment of my project to an integration environment and then invoke my integration test suite AND a regression test suite. If these pass they will trigger another deployment, this time to UAT and a performance test environment, where performance tests are kicked off. If these all pass, my system will then automatically promote my project to my release repository and send out an alert, including test results and release notes.

So, in a nutshell, my “template” pipeline will consist of the following stages:

  • Unit-tests
  • Acceptance tests
  • Code coverage and static analysis
  • Deployment to integration environment
  • Integration tests
  • Scenario/regression tests
  • Deployments to UAT and Performance test environment
  • More scenario/regression tests
  • Performance tests
  • Alerts, reports and Release Notes sent out
  • Deployment to release repository

Introducing the Tools:

Thankfully, implementing continuous delivery doesn’t require any special tools outside of the usual toolset you’d find in a normal Continuous Integration system. It’s true to say that some tools and applications lend themselves to this system better than others, but I’ll demonstrate that it can be achieved with the most common/popular tools out there.

Who’s this Jenkins person??

Jenkins is an open-source Continuous Integration application, like Hudson, CruiseControl and many others (it’s basically Hudson, or was Hudson, but isn’t Hudson any more. It’s a trifle confusing*, but it’s not important right now!). So, what is Jenkins? Well, as a CI server, it’s basically a glorified scheduler, a cron job if you like, with a swish front end. Ok, so it’s a very swish front end, but my point is that your CI server isn’t usually very complicated, in a nutshell it just executes the build scripts whenever there’s a trigger. There’s a more important aspect than just this though, and that’s the fact that Jenkins has a build pipelines plugin, which was written recently by Centrum Systems. This pipelines plugin gives us exactly what we want, a way of breaking down our builds into smaller loops, and running stages in parallel.


Ant has been possibly the most popular build scripting language for the last few years. It’s been around for a long while, and its success lies in its simplicity. Ant is an XML based scripting language tailored specifically for software build related tasks (specifically Java. Nant is the .Net version of Ant and is almost identical).


Sonar is a quality measurement and reporting tool, which produces metrics on build quality such as unit test coverage (using Cobertura) and static analysis tools (Findbugs, PMD and Checkstyle). I like to use Sonar as it provides a very readable report and contains a great deal of useful information all in one place.

Setting up the Tools

Installing Jenkins is incredibly simple.  There’s a debian package for Operating Systems such as ubuntu, so you can install it using apt-get. For Redhat users there’s an rpm, so you can install via yum.

Alternatively, if you’re already running Tomcat v5 or above, you can simply deploy the jenkins.war to your tomcat container.

Yet another alternative, and probably the simplest way to quickly get up and running with Jenkins is to download the war and execute:

java -jar jenkins.war

This will run jenkins through it’s own Winstone servlet container.

You can also use this method for installing Jenkins on Windows, and then, once it’s up and running, you can go to “manage jenkins” and click on the option to install Jenkins as a Windows Service! There’s also a windows installer, which you can download from the Jenkins website

Ant is also fairly simple to install, however, you’ll need the java jdk installed as a pre-requisite. To install ant itself you just need to download and extract the tar, and then create the environment variable ANT_HOME (point this to the directory you unzipped Ant into). Then add ${ANT_HOME}/bin (or %ANT_HOME%/bin if you’re on Windows) to your PATH, and that’s about it.

Configuring Jenkins

One of the best things about Jenkins is the way it uses plugins, and how simple it is to get them up and running. The “Manage Jenkins” page has a”Manage Plugins” link on it, which takes you a list of all the available plugins for your Jenkins installation:

To install the build pipeline plugin, simply put a tick in the checkbox next to “build pipeline plugin” (it’s 2/3 of the way down on the list) and click “install”. It’s as simple as that.

The Project

The project I’m going to create for the purpose of this example is going to be a very simple java web application. I’m going to have a unit test and an acceptance test stage.  The build system will be written in Ant and it will compile the project and run the tests, and also deploy the build to a tomcat server. Sonar will be used for producing the reports (such as test coverage and static analysis).

The Pipelines

For the sake of simplicity, I’ve only created 6 pipeline sections, these are:

  • Unit test phase
  • Acceptance test phase
  • Deploy to test phase
  • Integration test phase
  • Sonar report phase
  • Deploy to UAT phase

The successful completion of the unit tests will initiate the acceptance tests. Once these complete, 2 pipeline stages are triggered:

  • Deployment to a test server


  • Production of Sonar reports.

Once the deployment to the test server has completed, the integration test pipeline phase will start. If these pass, we’ll deploy our application to our UAT environment.

To create a pipeline in Jenkins we first have to create the build jobs. Each pipeline section represents 1 build job, which in this case runs just 1 ant task each. You have to then tell each build job about the downstream build which is must trigger, using the “build other projects” option:

Obviously I only want each pipeline section to do the single task it’s designed to do, i.e. I want the unit test section to run just the unit tests, and not the whole build. You can easily do this by targeting the exact section(s) of the build file that you want to run. For instance, in my acceptance test stage, I only want to run my acceptance tests. There’s no need to do a clean, or recompile my source code, but I do need to compile my acceptance tests and execute them, so I choose the targets “compile_ATs” and “run_ATs” which I have written in my ant script. The build job configuration page allows me to specify which targets to call:

Once the 6 build jobs are created, we need to make a new view, so that we can start to visualise this as a pipeline:

We now have a new pipeline! The next thing to do is kick it off and see it in action:

Oops! Looks like the deploy to qa has failed. It turns out to be an error in my deploy script. But what this highlights is that the sonar report is still produced in parallel with the deploy step, so we still get our build metrics! This functionality can become very useful if you have a great deal of different tests which could all be run at the same time, for instance performance tests or OS/browser-compatibility tests, which could all be running on different Operating Systems or web browsers simultaneously.

Finally, I’ve got my deploy scripts working so all my stages are looking green! I’ve also edited my pipeline view to display the results of the last 3 pipeline builds:


The pipelines plugin also works for Hudson, as you would expect. However, I’m not aware of such a plugin for Bamboo. Bamboo does support the concept of downstream builds, but that’s really only half the story here. The pipeline “view” in Jenkins is what really brings it all together.

“Go”, the enterprise Continuous Integration effort from ThoughtWorks not only supports pipelines, but it was pretty much designed with them in mind. Suffice to say that it works exceedingly well, in fact, I use it every day at work! On the downside though, it costs money, whereas Jenkins doesn’t.

As far as build tools/scripts/languages are concerned, this system is largely agnostic. It really doesn’t matter whether you use Ant, Nant, Gradle or Maven, they all support the functionality required to get this system up and running (namely the ability to target specific build phases). However, Maven does make hard work of this in a couple of ways – firstly because of the way Maven lifecycles work, you cannot invoke the “deploy” phase in an isolated way, it implicitly calls all the preceding phases, such as the compile and test phases. If your tests are bound to one of these phases, and they take a long time to run, then this can make your deploy seem to take a lot longer than you would expect. In this instance there’s a workaround – you can skip the tests using –DskipTests, but this doesn’t work for all the other phases which are implicitly called. Another drawback with maven is the way it uses snapshot and release builds. Ultimately we want to create a release build, but at the point of check-in we want a release build. This suggests that at some point in the pipeline we’re going to have to recompile in “release mode”, which in my book is a bad thing, because it means we have to run ALL of the tests again. The only solution I have thought of so far is to make every build a release build and simply not use snapshots.

* A footnote about the Hudson/Jenkins “thing”: It’s a little confusing because there’s still Hudson, which is owned by Oracle. The whole thing came about when there was a dispute between Oracle, the “owners” of Hudson, and Kohsuke Kawaguchi along with most of the rest of the Hudson community. The story goes that Kawaguchi moved the codebase to GitHub and Oracle didn’t like that idea, and so the split started.