Upcoming DevOps & Agile Events

London Puppet User Group Meetup
London, Thursday May 21st, 2015
6:00pm
http://goo.gl/C2zuKb

DevOps Exchange London – DevOps & DevOps
London, Tuesday May 26th, 2015
6:30pm
http://goo.gl/Xmdqxl

London Agile Discussion Group – Should DevOps be a person or a team-wide skill?
London, Tuesday May 26th, 2015
6:30pm
http://goo.gl/xksVOH

AWS User Group UK – meetup #15
London, Wed May 27th, 2015
6:30pm
http://goo.gl/uBsiUj

Chef Users London – Microsoft Azure / Chef Taster Day
London, Friday May 29, 2015
9:00am to 5:00pm
http://goo.gl/VOvkC3

DevOps Cardiff – Herding ELKs with consul.io
Cardiff, Wednesday, June 3, 2015
6:30pm
http://goo.gl/WwOvkQ

Agile Testing – Visual Creativity: Using Sketchnotes & Mindmaps to aid testing @ #ltgworkshops
London, Thursday June 4th, 2015
8:30am
http://goo.gl/34iIXM

ABC (Agile Book Club) London – Review Jeff Patton’s User Story Mapping
London, Thursday June 4th, 2015
6:30pm
http://goo.gl/X0qPwb

Agile Testing – Hooking Docker Into Selenium @ #ltgworkshops
London, Thursday June 4th, 2015
8:30am
http://goo.gl/ONH8dQ

UK Azure User Group – Cloud Gaming Hackathon
London, Saturday June 6th, 2015
9:30am
http://goo.gl/ONH8dQ

London DevOps – London DevOps Meetup #10
London, Thursday June 11th, 2015
7:00pm
http://goo.gl/uolxJk

Kanban Coaching Exchange – Continuous learning through communities of practice – Emily Webber
London, Thursday June 11th, 2015
6:30pm
http://goo.gl/9aFD8x

Lean Agile Manchester
Manchester, Wednesday June 17th, 2015
6:30pm
http://goo.gl/Z15ac3

London Lean Coffee – Holborn
London, Thursday, June 18th, 2015
9-10am
http://goo.gl/QkIBhj

UK Azure User Group – Chris Risner
London, Thursday June 18th, 2015
7:00pm
http://goo.gl/EfbNnn

Jenkins User Conference – Europe (London)
London, Tuesday June 23rd – 24th, 2015
2 days
http://goo.gl/achJJX

BDD London June Meetup
London, Thursday June 25th, 2015
6:30pm
http://goo.gl/C2zuKb

Automated Database Deployment (Workshop – £300)
Belfast, Northern Ireland, Friday June 26th, 2015
1 day course
http://goo.gl/fXlJr7

Database Continuous Integration (Workshop – £300)
London, July 8th, 2015
1 day course
http://goo.gl/lW4TjA

Database Source Control (Workshop – £100)
London, July 8th, 2015
1 day course
http://goo.gl/C2zuKb

London Lean Coffee – Holborn
London, Thursday, July 16, 2015
9-10am
http://goo.gl/mtJ3k4

Agile Taster – a free introductory Agile training course
Cardiff, Saturday 18 July 2015
10am – 3pm
http://goo.gl/qFYS6b

AWS User Group UK – meetup #16
London, Wed July 22nd, 2015
6:30pm
http://goo.gl/Tc3hlD

Advertisement

Adopting Agile in 3 “Easy” Steps

All good plans come in 3 phases:

Profit

Although I won’t be collecting any underpants, I’ll be following this basic template (with a couple of tweaks here and there) during an Agile adoption initiative I’m currently working on.

In the South Park episode (from which I have taken the picture above) the boys discover a bunch of underpant-stealing gnomes, who are collecting underpants as part of a grand plan to make profit. The gnomes claim to be business experts but none of them appears to know what phase 2 of the plan is. All they know is that their business model is based on collecting underpants, and so that’s what they’ll do.

Unfortunately, I have been witness to a couple of attempts at adopting agile which weren’t very dissimilar to the underpants gnomes’ business plan. Namely, a business starts “Adopting Agile”, usually driven by the development team, where they start doing stand-ups and using a sprint board (this is phase 1) and somehow they are surprised when this doesn’t suddenly start producing profit. Clearly, “becoming Agile” isn’t as simple as that.

Phase 1 – Collect business reasons (not underpants)

So you’re going Agile. Presumably you’ve determined that this is what you want, and what your customers need. If you haven’t done this yet then stop right there and ask yourself “Do I Need Go Agile?”. The answer might be “no”, but does needing to go Agile have to be the only reason? Maybe you just want to go agile to see what the fuss is all about, or to make your business more attractive to potential new employees.

brush

So lets assume we’re going agile, and you have valid business reasons to do so. My first suggestion would be to make those business reasons highly visible. You have to outline the existing issues and how Agile can help to fix them. Mitchell and Webb once did a sketch about a toothbrush company who had to try to think of some gimmick to add to their toothbrushes in order to keep increasing their sales. They came up with the idea of “dirty tongue”. This is where microscopic “tonguanoids” build up, and basically result in social exclusion and a lack of sex. Their solution: to put bristles on the other side of the toothbrush so that people can brush their tongues while they brush their teeth. People will buy these toothbrushes despite the fact that “brushing your tongue makes you retch, everybody knows that”. The point I’m making, very badly, is that it’s a lot easier to sell things if people think that what they’re buying into will fix some very real, tangible issue.

The same goes with Agile. To get the buy-in you need to make your agile adoption a success, you’ll need to identify how “going Agile” is going to make life better for everyone concerned.

If the problem you’ve got is that you never ship software on time, or you constantly fail to deliver what the customer wants, then it’s fairly easy to “sell” agile as the solution. The concept of sprints are a doddle for everyone to understand, and they’ll love the idea that the customer will have regular interactions with the development team, and get to see regular progress in the demos. “Of course!” they’ll say “It’s so obvious, why didn’t I think of that before”. The business should easily be able to see how short, sharp sprints with an emphasis on “working software” will make it easier to deliver what the customer wants, and manage their expectations of when it’ll be ready.

But what if those aren’t the problems you need to solve?

What if your problem is quality? How do you convince the business that Agile will result in a higher quality product? It’s not quite so easy. Agile itself won’t deliver better quality, but the good practices you’ll have to implement in order to successfully be agile will help to improve your quality. I was thinking about this the other day because it’s exactly the problem I was faced with.

Agile isn’t going to make it easier to reliably test our software. But to be agile, we need to be able to build and deploy our project rapidly so that we can test it right there and then, not tomorrow, not next week, but right now, so that the testers and devs can work in tandem, building features and signing them off and moving on to the next one. We have to facilitate this in order to be agile, so as a byproduct of going agile we might have to invest in creating a new build and deployment system. And it has to be quick so it’ll have to be automated.

So we have an automated build and deployment system, but to be able to reliably test our features we’ll have to make sure the environments are reliable. We can throw people at this problem and dedicate a team to making sure our environments are clean and regularly audited, or we can automate all that as well! chef Fortunately there are numerous tools and good practices we can follow to do this, just take a look at Chef, Puppet, Vagrant, and VMWare as examples of tools for automating deployments of virtual machines, and the concepts of “infrastructure as code” for good practices. (of course, if your hardware isn’t already virtualised the first thing to do is see whether it can be, and if it really, honestly can’t, then look at tools like Norton Ghost and Powershell for ways of automating as much as you can).

“Agile” and “Improved Quality” might not be the most obvious bed partners, but the journey to becoming agile almost forces you to take steps which will naturally go towards improving your quality.

Hopefully you’ll have enough “sales material” to put forward a great case for agile – you can deliver exactly what the customer wants, to a higher quality, and you can manage their expectations in a way you could never do before. And that’s just scratching the surface of what Agile can do for a business, but for the purposes of keeping this post to a reasonable length, I’ll leave it at those 3 things!

Phase 2 – Pick the most appropriate project, and start doing Scrum

The sales pitch is over and now it’s time to start doing stuff. Make your life a lot easier by picking a project that has as many of the following features as possible:

  • Smart developers and testers
  • Isn’t suffering from a tonne of technical debt
  • Has users who are happy to get involved in early & regular feedback
  • Is small, new or yet to begin

If you’re taking on an existing project, a good idea at this point is to benchmark your existing processes. Consider trying to measure the following:

  • How long does it take to get a single change from request through to production deployment?
  • How much time and money does it cost to fix an issue on production?
  • How many bugs do you typically find on your production code every month?
  • How often do you deliver features that don’t satisfy the customer?
  • How often do you deliver features after the deadline?

Measuring some of the things above is clearly non-trivial, but if you can find these stats somewhere, they’ll be very useful benchmarks for you in the future. When you can demonstrate that all of these metrics are improved in your new Agile process, god-like status will soon follow.

You - after delivering "agile" to the business

Guess who just delivered a release on time…
Ohhhh Yeeeeaaaaahhhh!

I recommend doing Scrum because it’s simple and has the most support in terms of people with experience, material (books, courses etc), and tools. It’s a good “framework” to get you started, and once you’ve had success, you can evolve into other methods, or incorporate them into Scrum (such as BDD, TDD etc).

Succeeding With Agile

Here’s one of many great books to get you started on your agile journey

At about this point you’ll need to do some brainwashing training. The concept of doing analysis, design, development and testing all at the same time is going to sound absolutely bonkers to some people. Try your best to explain it to them, but don’t waste too much time on this – just crack on and make a start!

Most people will enjoy the experience of working in this “new” way, and the first few sprints will probably benefit from the fact that everyone is performing better simply because they feel more invigorated. Use this opportunity to promote scrum across the organisation.

In this phase, always maintain a focus on “the business” and not just on the technical team. It’s important that the business feels part of this new process or they’ll just see it as some crazy dev thing which doesn’t really affect them, and they won’t try to understand it. Business people might refer to this as “Promoting Synergy”, which I’ve just shoe-horned into this post so that I can add a picture from Lonely Island’s “Like A Boss” video. However, I do like to make a point of always highlighting the extra business value we’re delivering, and make sure the Product Managers (soon to be “Product Owners”) are involved all the way. They represent the traditional link between the customers and what we’re delivering, and so it’s essential that they understand the benefits of agile.

Promote Synergy!

Promote Synergy!

I was recently asked about the impact of “going agile” on a project’s release schedule, and when we would be able to deliver the features we’ve promised to the customers. It’s difficult to explain that we no longer know when we’ll deliver stuff, but at some point, people will have to realise that this is the wrong question. I prefer the idea of a rolling roadmap, which is continually reviewed and updated (as often as you can afford to do it, really). Rolling Roadmaps give the business, as well as the customers a good idea of our intentions, but it is very different to fixed dates on a release schedule, or a traditional yearly roadmap. Of course, everyone needs to understand that the main driver for our deliverables will be the customers, and what the customer wants will usually change over a shorter period than you expect. So for your new “Agile” project, try to work towards implementing a rolling roadmap culture, and move away from long-term fixed delivery dates (if you can).

One final note on Phase 2: Make it fun, and make it different.

Phase 3 – Improve

Agile promotes “fast feedback loops” all over the place: in development we get fast feedback on our code through Continuous Integration, with BDD we get fast feedback to the Product Owners/BAs and of course with our more frequent releases we get faster feedback from the customers. And so it is with our Agile processes as a whole. With short sprints and the clever use of retrospectives we can continually tinker with our fine tuning to see if we can improve our quality and velocity. Look at areas you can try to improve, change something and then see if your change has had a positive impact at the end of the sprint. This is basically the concept behind Deming’s Shewhart Cycle:

demingcycleDeming actually preferred “Plan, Do, Study, Act”, whereas I myself prefer “Plan, Do, Measure, Act”. The reason I prefer this is because it implies the use of quantifiable metrics to base our actions on, rather than some other non-quantifiable observations. Anyways, the point is that after agile is applied, you should keep looking at ways to continuously improve. This is key to keeping everyone feeling fresh and invigorated, helps us to learn from our mistakes, and encourages innovation.

So there you go, Agile delivered in 3 well easy steps. It shouldn’t take you much longer than an afternoon. Ok, it might take a bit longer but if you’re looking for a 30,000 foot overview of a simple 3-phase approach, then you could do a lot worse than apply the principles of “Sell the Agile Idea, Pick the Best Project, and Keep Improving”.

A Really $h!t Branching Policy

“As a topic of conversation, I find branching policies to be very interesting”, “Branching is great fun!”, “I wish we could do more branching” are just some of the comments on branching that you will never hear. And with good reason, because branching is boring. Merging is also boring. None of this stuff is fun. But for some strange reason, I still see the occasional branching policy which involves using the largest number of branches you can possibly justify, and of course the most random, highly complex merging process you can think of.

Here’s an example of a really $h!t branching policy:

Look how hateful it is!! I imagine this is the kind of conversation that leads to this sort of branching policy:

“Right, let’s just work on main and then take a release branch when we’re nearly ready to release”

“Waaaaait a second there… that sounds too easy. A better idea would be to have a branch for every environment, maybe one for each developer as well, and we should merge only at the most inconvenient time, and when we’ve merged to the production branch we should make a build and deploy it straight to Live, safe in the knowledge that the huge merge we just did went perfectly and couldn’t possibly have resulted in any integration issues”

“Er, what?”

“You see! Its complexity is beautiful”

Conclusion

Branching is boring. Merging is dull and risky. Don’t have more branches than you need. Work on main, take a branch at the latest possible time, release from there and merge daily. Don’t start conversations about branching with girls you’re trying to impress. Don’t talk about branches as if they have personalities, that’s just weird. Use a source control system that maintains branch history. Floss regularly. Stretch after exercise.

 

 

 

Why do we do Continuous Integration?

Continuous Integration is now very much a central process of most agile development efforts, but it hasn’t been around all that long. It may be widely regarded as a “development best practice” but some teams are still waiting to adopt C.I. Seriously, they are.

And it’s not just agile teams that can benefit from C.I. The principles behind good C.I. can apply to any development effort.

This article aims to explain where C.I. came from, why it has become so popular, and why you should adopt it on your development project, whether you’re agile or not.

Back in the Day…

Are you sitting comfortably? I want you to close your eyes, relax, and cast your mind back, waaay back, to 2003 or something like that…

You’re in an office somewhere, people are talking about The Matrix way too much, and there’s an alarming amount of corduroy on show… and developers are checking in code to their source control system….

Suddenly a developer swears violently as he checks out the latest code and finds it doesn’t compile. Someone’s check-in has broken the codebase.

He sets about fixing it and checking it back in.

Suddenly another developer swears violently….

Rinse and repeat.

CI started out as a way of minimising code integration headaches. The idea was, “if it’s painful, don’t put it off, do it more often”. It’s much better to do small and frequent code integrations rather than big ugly ones once in a while. Soon tools were invented to help us do these integrations more easily, and to check that our integrations weren’t breaking anything.

Tests!

Fossilized C.I. System

Fossil of a Primitive C.I. System

Excavations of fossilized C.I. systems from the early 21st Century suggest that these primitive C.I. systems basically just compiled code, and then, when unit tests became more popular, they started running unit tests as well. So every time someone checked in some code, the build would make sure that this integration would still result in a build which would compile, and pass the unit tests. Simple!

C.I. systems then started displaying test results and we started using them to run huuuuge overnight builds which would actually deploy our builds and run integration tests. The C.I. system was the automation centre, it ran all these tasks on a timer, and then provided the feedback – this was usually an email saying what had passed and broken. I think this was an important time in the evolution of C.I. because people started seeing C.I. as more of an information generator, and a communicator, rather than just a techie tool that ran some builds on a regular basis.

Information Generator

Management teams started to get information out of C.I. and so it became an “Enterprise Tool”.

Some processes and “best practices” were identified early on:

  • Builds should never be left in a broken state.
  • You should never check in on a broken build because it makes troubleshooting and fixing even harder.

With this new-found management buy-in, C.I. became a central tenet of modern development practices.

People started having fun with C.I. plugging lava lamps, traffic lights and talking rabbits into the system. These were fun, but they did something very important in the evolution of C.I. –  they turned it into an information radiator and a focal point of development efforts.

Automate Everything!

Automation was the big selling point for C.I. Tasks that would previously have been manual, error-prone and time-consuming could now be done automatically, or at night while we were in bed. For me it meant I didn’t have to come in to work on the weekends and do the builds! Whole suites of acceptance, integration and performance tests could automatically be executed on any given build, on a convenient schedule thanks to our C.I. system. This aspect, as much as any other, helped in the widespread adoption of C.I. because people could put a cost-saving value on it. C.I. could save companies money. I, on the other hand, lost out on my weekend overtime.

Code Quality

Static analysis and code coverage tools appeared all over the place, and were ideally suited to be plugged in to C.I. These days, most code coverage tools are designed specifically to be run via C.I. rather than manually. These tools provided a wealth of feedback to the developers and to the project team as a whole. Suddenly we were able to use our C.I. system to get a real feeling for our project’s quality. The unit test results combined with the static analysis could give us information about the code quality, the integration  and functional test results gave us verification of our design and ensured we were making the right stuff, and the nightly performance tests told us that what we were making was good enough for the real world. All of this information got presented to us, automatically, via our new best friend the Continuous Integration system.

Linking C.I. With Stories

When our C.I. system runs our acceptance tests, we’re actually testing to make sure that what we’ve intended to do, has in fact been done. I like the saying that our acceptance tests validate that we built the right thing, while our unit and functional tests verify that we built the thing right.

Linking the ATs to the stories is very important, because then we can start seeing, via the C.I. system, how many of the stories have been completed and pass their acceptance criteria. At this point, the C.I. system becomes a barometer of how complete our projects are.

So, it’s time for a brief recap of what our C.I. system is providing for us at this point:

1. It helps us identify our integration problems at the earliest opportunity

2. It runs our unit tests automatically, saving us time and verifying or code.

3. It runs static analysis, giving us a feel for the code quality and potential hotspots, so it’s an early warning system!

4. It’s an information radiator – it gives us all this information automatically

5. It runs our ATs, ensuring we’re building the right thing and it becomes a barometer of how complete our project is.

And we’re not done yet! We haven’t even started talking about deployments.

Deployments

Ok now we’ve started talking about deployments.

C.I. systems have long been used to deploy builds and execute tests. More recently, with the introduction of advanced C.I. tools such as Jenkins (Hudson), Bamboo and TeamCity, we can use the C.I. tool not only to deploy our builds but to manage deployments to multiple environments, including production. It’s now not uncommon to see a Jenkins build pipeline deploying products to all environments. Driving your production deployments via C.I. is the next logical step in the process, which we’re now calling “Continuous Delivery” (or Continuous Deployment if you’re actually deploying every single build which passes all the test stages etc).

Below is a diagram of the stages in a Continuous Delivery system I worked on recently. The build is automatically promoted to the next stage whenever it successfully completes the current stage, right up until the point where it’s available for deployment to production. As you can imagine, this process relies heavily on automation. The tests must be automated, the deployments automated, even the release email and it’s contents are automated.So what exactly is the cost saving with having a C.I. system?

Yeah, that’s a good question, well done me. Not sure I can give you a straight answer to that one though. Obviously one of the biggest factors is the time savings. As I mentioned earlier, back when I was a human C.I. machine I had to work weekends to sort out build issues and get working code ready for Monday morning. Also, C.I. sort of forces you to automate everything else, like the tests and the deployments, as well as the code analysis and all that good stuff. Again we’re talking about massive time savings.

But automating the hell out of everything doesn’t just save us time, it also eliminates human error. Consider the scope for human error in a system where some poor overworked person has to manually build every project, some other poor sap has to manually do all the testing and then someone else has to manually deploy this project to production and confidently say “Right, now that’s done, I’m sure it’ll work perfectly”. Of course, that never happened, because we were all making mistakes along the line, and they invariably came to light when the code was already live. How much time and money did we waste fixing live issues that we’d introduced by just not having the right processes and systems in place. And by systems, of course, I’m talking about Continuous Integration. I can’t put a value on it but I can tell you we wasted LOTS of money. We even had bugfix teams dedicated to fixing issues we’d introduced and not caught earlier (due in part to a lack of C.I.).

Conclusion

While for many companies C.I. is old news, there are still plenty of people yet to get on board. It can be hard for people to see how C.I. can really make that much of a difference, so hopefully this blog will help to highlight some of the benefits and explain how C.I. has been adopted as one of the most important and central tenets of modern software delivery.

For me, and for many others, Continuous Integration is a MUST.

 

PowerCLI: Reverting CI Agents to Snapshot

My friend Ed’s capacity to automate stuff is quite awesome. Yesterday he automated a way of making our Continuous Integration system alert us when one of the agents went offline. This is already automated in our CI system, but it just wasn’t automated enough for Ed’s liking, so he wrote a script. His script will send us an email whenever an agent goes offline. I haven’t recieved any emails so far, so either the agents are all fine, or the script isn’t working – there’s no way to tell, so I expect Ed will automate a way of telling us whether the automated script has run successfully.

Then today, in the true spirit of “DevOps”, he tells me he has automated a way of reverting our CI agents to a snapshot and plugged it in to the CI system, for good measure. The CI agents are all VMs deployed by VMware, so Ed has used the PowerCLI plugin to do the automation.

Basically the script just iterates over a list of VMs which are in a particular resource pool, and reverts them all to a snaphot. Here’s the script itself:

connect-viserver myserver.mycompany.com -User username -Password secret

$vmcsv = import-csv $args[0]

ForEach($line in $vmcsv){
Get-VM $line.name | Get-Snapshot -Name $line.snap | Set-VM -VM $line.name -confirm:$false
Get-VM $line.name | Start-VM
}

disconnect-viserver -confirm:$false

import-csv looks something like this:

name, snap
linuxSvr1, snap1
xpSvr1, snap1
xpSvrIE7, snap1
w7SvrIE8, snap1
w7SvrIE9, snap1

Ed has added the execution of this script to our CI system, so any of the devs can revert their CI servers to a snapshot by simply pressing a button. They key thing here is to organise VMs into resource pools. We’ve got dedicated resource pools per dev team/project, so it’s safe enough to allow the devs to do this without running the risk of affecting anyone elses CI builds.

You can follow Ed on twitter (@ElMundio87) and check out his blog here: http://www.elmundio.net/blog/

Upcoming Agile/DevOps/CI Events

There’s a free talk this evening at Skills Matter (London) about Continuous Delivery by Tom Duckering and Marc Hofer. Tom did a talk on “Coping with Big CI” a few months ago, which was interesting and very well delivered. I’m looking forward to tonight’s talk.

Then tomorrow there’s the DevOps summit (again in London), which is being chaired by Stephen Nelson-Smith, author of “Test-Driven Infrastructure with Chef” (you can find my review of the book here). Atlassian and CollabNet will have speakers/panelists at this event so I’m expecting it to be very worthwhile.

On the 26th June, again in London (it’s all happening in London for a change), there’s Software Experts Summit, subtitled “Mastering Uncertainty in the Software Industry: Risks, Rewards, and Reality”, I’m expecting there will be a decent amount of DevOps/Continuous Delivery coverage. Speakers include representatives from Microsoft and Groupon.

Next Thursday (June 28th) there’s an Agile Evangelists Meetup in London entitled “Agility within a Client Driven Environment” with talks from experienced agile experts from a range of industries. These are usually pretty good events and the speakers usually have a great deal to offer.

And as I mentioned in an earlier post, there’s the Jenkins User Conference in Israel on July 5th.

Maven the Version Number Nazi

Maven doesn’t like it when you use different verison numbers to the Maven standard format. Of course it doesn’t. It wouldn’t would it? It’s Maven, and Maven only likes it when you do what it tells you to do. I’m still a bit annoyed with Maven, as you can probably tell.

Don’t get me wrong, I’m not “Maven bashing”, it’s just that this particular problem doesn’t have quite the elegant solution I was looking for. I do appreciate Maven, honestly.

This was the problem:

I wanted to change our versioning system from something like 1.0.0-1234 to something like 1.0.0-1234-01

Why the hell would I want to do that?? I’ll explain…

Our verisoning is like this:

{major}.{minor}.{patch}-{build}

The only problem was, the build number was taken from the Perforce check-in number, and this number didn’t always change whenever a build was made, especially if the build was kicked off by an upstream dependency, or a forced build was triggered. Basically, if the build was kicked off by anything other than a commit to Perforce, the build would create an artifact of identical version to the previous build. This, in theory, shouldn’t be a problem, because it is actually building exactly the same thing, but I just don’t like it. Anything could happen, any environmental change could produce a slightly different build to the previous one.

The problem was that I wasn’t using an incremental counter anywhere in my version numbe. It’s essential to have an incrementing version number in order to ensure that every single build creates a unique identifier, so that no two different builds can appear to be the same build.

My first thought was to append a build counter on the end, like this:

{major}.{minor}.{patch}-{build}-{counter}

And that would have worked fine, if it wasn’t for the fact that we use version ranges in our dependencies, and we already have plenty of builds which use the previous versioning system. Maven kept picking up the builds with the previous version system, even though, in every possible sense, the new ones had higher version numbers. It made no sense. That’s when I looked into how maven works out versions. Basically it says “if you’re using version ranges, and not using the maven standard versioning format, you might as well forget it”. If it sees dependencies using the standard format, and ones using the non-standard maven format, it’ll pick up the standard format ones and basically ignore the new ones. To get around this you can delete all the old builds using the standard maven format, and then it’ll work, because it’ll treat each build version like a string and just get you the latest in whatever your range is.

Sadly, this isn’t an option for me, as I want to kep the old builds using the old format. So I tried a few things. I tried putting a string in as a separator, so it would look like this:

{major}.{minor}.{patch}-{build}rc{counter}

This effectively produces something looking like:

1.0.0-1234rc01

I’m fine with that. Maven, on the other hand, isn’t. I made a build with this version 1.0.0-9999rc01 and used it as a dependency in another build, but the other build still went and got 1.0.0-1234, the OLD build using the standard maven versioning. I mean, you’d think 1.0.0-9999rc01 > 1.0.0-1234 but apparently not.

I was a bit pushed for time so I couldn’t spend forever looking into this, so I’ve basically just appended the build counter directly onto the end of the perforce number. This works ok, but just looks a little ugly.

There’s more information on the Maven versioning rules here. It seems that you can break the rules no problems, but you’re in trouble if you use version ranges in your dependencies, and your dependencies need to live alongside binaries which use the standard maven versioning system 😦

If anyone has any better solutions I’d like to hear them. And please don’t say “stop using Maven”.

 

Continuous Delivery Using Maven

I’m currently working on a continuous delivery system where I work, so I thought I would write something up about what I’m doing. The continuous delivery system, in a nutshell, looks a bit like this:

I started out with a bit of a carte blanche with regards to what tools to use, but here’s a list of what was already in use, in one form or another, when I started my adventure:

  • Ant (the main build tool)
  • Maven (used for dependency management)
  • CruiseControl
  • CruiseControl.Net
  • Go
  • Monit
  • JUnit
  • js-test-driver
  • Selenium
  • Artifactory
  • Perforce

The decision of which of these tools to use for my system was influenced by a number of factors. Firstly I’ll explain why I decided to use Maven as the build tool (shock!!).

I’m a big fan of Ant, I’d usually choose it (or probably Gradle now) over Maven any day of the week, but there was already an existing Ant build system in place, which had grown a bit monolithic (that’s my polite way of saying it was a huge mess), so I didn’t want to go there! And besides, the first project that would be going into the new continuous delivery system was a simple Java project – way too straightforward to justify rewriting the whole ant system from scratch and improving it, so I went for Maven. Furthermore, since the project was (from a build perspective) fairly straightforward, I thought Maven could handle it without too much bother. I’ve used Maven before, so I’ve had my run-ins with it, and I know how hard it can be if you want to do anything outside of “The Maven Way”. But, as I said, the project I was working on seemed pretty simple so Maven got the nod.

GO was the latest and greatest C.I. server in use, and the CruiseControl systems were a bit of a handful already, so I went for GO (also I’d never used it before so I thought that would be cool, and it’s from Thoughtworks Studios, so I thought it might be pretty good). I particularly liked the pipeline feature it has, and the way it manages each of its own agents. A colleague of mine, Andy Berry, had already done quite a bit of work on the GO C.I. system, so there was already something to start from. I would have gone for Jenkins had there not already been a considerable investment in GO by the company prior to my arrival.

I decided to use Artifactory as the artifact repository manager, simply because there was already an instance installed, and it was sort-of already setup. The existing build system didn’t really use it, as most artifacts/dependencies were served from network shares. I would have considered Nexus if Artifactory wasn’t already installed.

I setup Sonar to act as a build analysis/reporting tool, because we were starting with a Java project. I really like what Sonar does, I think the information it presents can be used very effectively. Most of all I just like the way in which it delivers the information. The Maven site plugin can produce pretty much all of the information that Sonar does, but I think the way Sonar presents the information is far superior – more on this later.

Perforce was the incumbent source control system, and so it was a no-brainer to carry on with that. In fact, changing the SC system wasn’t ever in question. That said, I would have chosen Subversion if this was an option, just because it’s so utterly freeeeeeee!!!

That was about it for the tools I wanted to use. It was up to the rest of the project team to determine which tools to use for testing and developing. All that I needed for the system I was setting up was a distinction between the Unit Tests, Acceptance Tests and Integration Tests. In the end, the team went with Junit, Mockito and a couple of in-house apps to take care of the testing.

The Maven Build, and the Joys of the Release Plugin!

The idea behind my Continuous Delivery system was this:

  • Every check-in runs a load of unit tests
  • If they pass it runs a load of acceptance tests
  • If they pass we run more tests – Integration, scenario and performance tests
  • If they all pass we run a bunch of static analysis and produce pretty reports and eventually deploy the candidate to a “Release Candidate” repository where QA and other like-minded people can look at it, prod it, and eventually give it a seal of approval.

This is the basic outline of a build pipeline:

Maven isn’t exactly fantastic at fitting in to the pipeline process. For starters we’re running multiple test phases, and Maven follows a “lifecycle” process, meaning that every time you call a particular pipeline phase, it runs all the preceding phases again. Our pipeline needs to run the maven Surefire plugin twice, because that’s the plugin we use to execute our different tests. The first time we run it, we want to execute all the unit tests. The second time we run it we want to execute the acceptance tests – but we don’t want it to run the unit tests again, obviously.

You probably need some familiarity with the maven build lifecycle at this point, because we’re going to be binding the Surefire plugin to two different phases of the maven lifecycle so that we can run it twice and have it run different tests each time. Here is the maven lifecycle, (for a more detailed description check out the Maven’s own lifecycle page)

Clean Lifecycle

  • pre-clean
  • clean
  • post-clean

Default Lifecycle

  • validate
  • initialize
  • generate-sources
  • process-sources
  • generate-resources
  • process-resources
  • compile
  • process-classes
  • generate-test-sources
  • process-test-sources
  • generate-test-resources
  • process-test-resources
  • test-compile
  • process-test-classes
  • test
  • prepare-package
  • package
  • pre-integration-test
  • integration-test
  • post-integration-test
  • verify
  • install
  • deploy

Site Lifecycle

  • pre-site
  • site
  • post-site
  • site-deploy

So, we want to bind our Surefire plugin to both the test phase to execute the UTs, and the integration-test phase to run the ATs, like this:

<plugin>
<!-- Separates the unit tests from the integration tests. -->
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
  <configuration>
  -Xms256m -Xmx512m
  <skip>true</skip>
  </configuration>
  <executions>
    <execution>
      <id>unit-tests</id>
      <phase>test</phase>
      <goals>
        <goal>test</goal>
      </goals>
  <configuration>
    <testClassesDirectory>
      target/test-classes
    </testClassesDirectory>
    <skip>false</skip>
    <includes>
      <include>**/*Test.java</include>
    </includes>
    <excludes>
      <exclude>**/acceptance/*.java</exclude>
      <exclude>**/benchmark/*.java</exclude>
      <include>**/requestResponses/*Test.java</exclude>
    </excludes>
  </configuration>
</execution>
<execution>
  <id>acceptance-tests</id>
  <phase>integration-test</phase>
  <goals>
    <goal>test</goal>
  </goals>
  <configuration>
    <testClassesDirectory>
      target/test-classes
    </testClassesDirectory>
    <skip>false</skip>
    <includes>
      <include>**/acceptance/*.java</include>
      <include>**/benchmark/*.java</include>
      <include>**/requestResponses/*Test.java</exclude>
    </includes>
  </configuration>
</execution>
</executions>
</plugin>

Now in the first stage of our pipeline, which polls Perforce for changes, triggers a build and runs the unit tests, we simply call:

mvn clean test

This will run the surefire test phase of the maven lifecycle. As you can see from the Surefire plugin configuration above, during the “test” phase execution of Surefire (i.e. this time we run it) it’ll run all of the tests except for the acceptance tests – these are explicitly excluded from the execution in the “excludes” section. The other thing we want to do in this phase is quickly check the unit test coverage for our project, and maybe make the build fail if the test coverage is below a certain level. To do this we use the cobertura plugin, and configure it as follows:

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>cobertura-maven-plugin</artifactId>
  <version>2.4</version>
  <configuration>
    <instrumentation>
      <excludes><!-- this is why this isn't in the parent -->
        <exclude>**/acceptance/*.class</exclude>
        <exclude>**/benchmark/*.class</exclude>
        <exclude>**/requestResponses/*.class</exclude>
      </excludes>
    </instrumentation>
    <check>
      <haltOnFailure>true</haltOnFailure>
      <branchRate>80</branchRate>
      <lineRate>80</lineRate>
      <packageLineRate>80</packageLineRate>
      <packageBranchRate>80</packageBranchRate>
      <totalBranchRate>80</totalBranchRate>
      <totalLineRate>80</totalLineRate>
    </check>
    <formats>
      <format>html</format>
      <format>xml</format>
    </formats>
  </configuration>
  <executions>
    <execution>
      <phase>test</phase>
      <goals>
        <goal>clean</goal>
        <goal>check</goal>
      </goals>
    </execution>
  </executions>
</plugin>

To get the cobertura plugin to execute, we need to call “mvn cobertura:cobertura”, or run the maven “verify” phase by calling “mvn verify”, because the cobertura plugin by default binds to the verify lifecycle phase. But if we delve a little deeper into what this actually does, we see that it actually runs the whole test phase all over again, and of course the integration-test phase too, because they precede the verify phase, and cobertura:cobertura actually invokes execution of the test phase before executing itself. So what I’ve done is to change the lifecycle phase that cobertura binds to, as you can see above. I’ve made it bind to the test phase only, so that it only executes when the unit tests run. A consequence of this is that we can now change the maven command we run, to something like this:

mvn clean cobertura:cobertura

This will run the Unit Tests implicitly and also check the coverage!

In the second stage of the pipeline, which runs the acceptance tests, we can call:

mvn clean integration-test

This will again run the Surefire plugin, but this time it will run through the test phase (thus executing the unit tests again) and then execute the integration-test phase, which actually runs our acceptance tests.

You’ll notice that we’ve run the unit tests twice now, and this is a problem. Or is it? Well actually no it isn’t, not for me anyway. One of the reasons why the pipeline is broken down into sections is to allow us to separate different tasks according to their purpose. My Unit Tests are meant to run very quickly (less than 3 minutes ideally, they actually take 15 seconds on this particular project) so that if they fail, I know about it asap, and I don’t have to wait around for a lifetime before I can either continue checking in, or start fixing the failed tests. So my unit test pipeline phase needs to be quick, but what difference does an extra few seconds mean for my Acceptance Tests? Not too much to be honest, so I’m actually not too fussed about the unit tests running for a second time.  If it was a problem, I would of course have to somehow skip the unit tests, but only in the test phase on the second run. This is doable, but not very easy. The best way I’ve thought of is to exclude the tests using SkipTests, which actually just skips the execution of the surefire plugin, and then run your acceptance tests using a different plugin (the Antrun plugin for instance).

The next thing we want to do is create a built artifact (a jar or zip for example) and upload it to our artifact repository. We’ll use 5 artifact repositories in our continuous delivery system, these are:

  1. A cached copy of the maven central repo
  2. A C.I. repository where all builds go
  3. A Release Candidate (RC) repository where all builds under QA go
  4. A Release repository where all builds which have passed QA go
  5. A Downloads repository, from where the downloads to customers are actually served

Once our build has passed all the automated test phases it gets deployed to the C.I. repository. This is done by configuring the C.I. repository in the maven pom file as follows:

<distributionManagement>
<repository>
<id>CI-repo</id>
<url>http://artifactory.mycompany.com/ci-repo</url&gt;
</repository>
</distributionManagement>

and calling:

mvn clean deploy

Now, since Maven follows the lifecycle pattern, it’ll rerun the tests again, and we don’t want to do all that, we just want to deploy the artifacts. In fact, there’s no reason why we shouldn’t just deploy the artifact straight after the Acceptance Test stage is completed, so that’s what exactly what we’ll do. This means we need to go back and change our maven command for our Acceptance Test stage as follows:

mvn clean deploy

This does the same as it did before, because the integration-test phase is implicit and is executed on the way to reaching the “deploy” phase as part of the maven lifecycle, but of course it does more than it did before, it actually deploys the artifact to the C.I. repository.

One thing that is worth noting here is that I’m not using the maven release plugin, and that’s because it’s not very well suited to continuous delivery, as I’ve noted here. The main problem is that the release plugin will increment the build number in the pom and check it in, which will in turn kick off another build, and if every build is doing this, then you’ll have an infinitely building loop. Maven declares builds as either a “release build” which uses the release plugin, or a SNAPSHOT build, which is basically anything else. But I want to create releases out of SNAPSHOT builds, but I don’t want them to be called SNAPSHOT builds, because they’re releases! So what I need to do is simply remove the word SNAPSHOT from my pom. Get rid of it entirely. This will now build a normal “snapshot” build, but not add the SNAPSHOT label, and since we’re not running the release plugin, that’s fine (WARNING: if you try removing the word snapshot from your pom and then try to run a release build using the release plugin, it’ll fail).

Ok, let’s briefly catch up with what our system can now do:

  • We’ve got a build pipeline with 2 stages
  • It’s executed every time code is checked-in
  • Unit tests are executed in the first stage
  • Code coverage is checked, also in the first stage
  • The second stage runs the acceptance tests
  • The jar/zip is built and deployed to our ci repo, this also in the second stage of our pipeline

So we have a jar, and it’s in our “ci” repo, and we have a code coverage report. But where’s the rest of our static analysis? The build should report a lot more than just the code coverage. What about coding styles & standards, rules violations, potential defect hot spots, copy and pasted code etc and so forth??? Thankfully, there’s a great tool which collects all this information for us, and it’s called Sonar.

I won’t go into detail about how to setup and install Sonar, because I’ve already detailed it here.

Installing Sonar is very simple, and to get your builds to produce Sonar reports is as simple as adding a small amount of configuration to your pom, and adding the Sonar plugin to you plugin section. To produce the Sonar reports for your project, you can simply run:

mvn sonar:sonar

So that’s exactly what we’ll do in the next section of our build pipeline.

So we now have 3 pipeline sections and were producing Sonar reports with every build. The Sonar reports look something like this:

Sonar report

As you can see, Sonar produces a wealth of useful information which we can pour over and discuss in our daily stand-ups. As a rule we try to fix any “critical” rule violations, and keep the unit test coverage percentage up in the 90s (where appropriate). Some people might argue that unit test coverage isn’t a valuable metric, but bear in mind that Sonar allows you to exclude certain files and directories from your analysis, so that you’re only measuring the unit test coverage of the code you want to have covered by unit tests. For me, this makes it a valuable metric.

Moving on from Sonar now, we get to the next stage of my pipeline, and here I’m going to run some Integration Tests (finally!!). The ITs have a much wider scope than the Unit Test, and they also have greater requirements, in that we need an Integration Test Environment to run them in. I’m going to use Ant to control this phase of the pipeline, because it gives me more control than Maven does, and I need to do a couple of funky things, namely:

  • Provision an environment
  • Deploy all the components I need to test with
  • Get my newly built artifact from the ci repository in Artifactory
  • Deploy it to my IT environment
  • Kick of the tests

The Ant script is fairly straightforward, but I’ll just mention that getting our artifact from Artifactory is as simple as using Ant’s own “get” task (you don’t need to use Ivy juts to do this):

<get src=”${artifactory.url}/${repo.name}/${namespace}/${jarname}-${version}” dest=”${temp.dir}/${jarname}-${version}” />

The Integration Test stage takes a little longer than the previous stages, and so to speed things up we can run this stage in parallel with the previous stage. Go allows us to do this by setting up 2 jobs in one pipeline stage. Our Sonar stage now changes to “Reports and ITs”, and includes 2 jobs:

<jobs>
          <job name="sonar">
            <tasks>
              <exec command="mvn" args="sonar:sonar" workingdir="JavaDevelopment" />
            </tasks>
            <resources>
              <resource>windows</resource>
            </resources>
          </job>
 <job name="ITs">
            <tasks>
              <ant buildfile="run_ITs.xml" target="build" workingdir="JavaDevelopment" />
            </tasks>
            <resources>
              <resource>windows</resource>
            </resources>
          </job>
</jobs>

Once this phase completes successfully, we know we’ve got a half decent looking build! At this point I’m going to throw a bit of a spanner into the works. The QA team want to perform some manual exploratory tests on the build. Good idea! But how does that fit in with our Continuous Delivery model? Well, what I did was to create a separate “Release Candidate” (RC) repository, also known as a QA repo. Builds that pass the IT stage get promoted to the RC repo, and from there the QA team can take them and do their exploratory testing.

Does this stop us from practicing “Continuous Delivery”? Well, not really. In my opinion, Continuous Delivery is more about making sure that every build creates a potentially releasable artifact, rather that making every build actually deploy an artifact to production – that’s Continuous Deployment.

Our final stage in the deployment pipeline is to deploy our build to a performance test environment, and execute some load tests. Once this stage completes we deploy our build to the Release Repository, as it’s all signed off and ready to handover to customers. At this point there’s a manual decision gate, which in reality is a button in my CI system. At this point, only the product owner or some such responsible person, can decide whether or not to actually release this build into the wild. They may decide not to, simply because they don’t feel that the changes included in this build are particularly worth deploying. On the other hand, they may decide to release it, and to do this they simply click the button. What does the button do? Well, it simply copies the build to the “downloads” repository, from where a link is served and sent to customers, informing them that a new release is available – that’s just the way things are done here. In a hosted environment (like a web-based company), this button-press could initiate the deploy script to deploy this build to the production environment.

A Word on Version Numbers

This system is actually dependant on each build producing a unique artifact. If a code change is checked in, the resultant build must be uniquely identifiable, so that when we come to release it, we know we’re releasing theexact same build that has gone through the whole pipeline, not some older previous build. To do this, we need to version each build with a unique number. The CI system is very useful for doing this. In Go, as with most other CI systems, you can retrieve a unique “counter” for your build, which is incremented every time there’s a build. No two builds of the same name can have the same counter. So we could add this unique number to our artifact’s version, something like this (let’s say the counter is 33, meaning this is the 33rd build):

myproject.jar-1.0.33

This is good, but it doesn’t tell us much, apart from that this is the 33rd build of “myproject”. A more meaningful version number is the source control revision number, which relates to the code commit which kicked off the build. This is extremely useful. From this we can cross reference every build to the code in our source control system, and this saves us from having to “tag” the source code with every build. I can access the source control revision number via my CI system, because Go sets it as an environment variable at build time, so I simply pass it to my build script in my CI system’s xml, like this:

cobertura:cobertura -Dp4.revision=${env.GO_PIPELINE_LABEL}
-Dbuild.counter=${env.GO_PIPELINE_COUNTER"

p4.revision and build.counter are used in the maven build script, where I set the version number:

    <groupId>com.mycompany</groupId>
<artifactId>myproject</artifactId>
<packaging>jar</packaging>
<version>${main.version}-${build.number}-${build.counter}</version>
<name>myproject</name>

<properties>
<build.number>${p4.revision}</build.number>
<major.version>1</major.version>
<minor.version>0</minor.version>
<patch.version>0</patch.version>
<main.version>${major.version}.${minor.version}.${patch.version}</main.version>
</properties>

If my Perforce check-in number was 1234, then this build, for example, will produce:

myproject.jar-1.0.0-1234-33

And that just about covers it. I hope this is useful to some people, especially those who are using Maven and are struggling with the release plugin!

How Mature Is Your Continuous Integration?

As I’m sure I’ve ranted about mentioned in the past, Continuous Integration is far more than just a collection of tools and scripts. It’s “a practice”, a way of doing something, and it has to be part of our working culture to be truly effective.

I’ve seen instances of CI implemented which are truly magnificent, using great tools, great architecture, very smart scripts and a good process, but I’ve also seen this system fail. Unfortunatley, it’s all too easy to have your wonderful system, and then have it ignored unless there is the right level of buy-in from the people who this system is meant to cater for, nameley development, QA and Management.

The way I currently see it, is that there are a number of levels of CI maturity. I’ll just call these levels “Level 1”, “Level 2” and so on, rather than “Highly Immature”, “Stroppy Teenager” etc:

Level 1

No CI tools to speak of, no CI process. I’ve been there. Builds take about a day to get working. It’s a nightmare. I still shiver just thinking about it.

Level 2

We’ve got some CI tools like cruise control, repeatable builds, but no CI process. We’ve basically got a front-end to a system of chaos. Most of the builds are broken, but you now have a nice way of visualising that, and nobody cares. It’s Level 1 with a pretty wrapper on it.

Level 3

We’ve got a system, but not the right tools. We’ve got a policy of running our tests locally before checking-in, and some poor soul somewhere is left with the task of making the “official” builds for passing to QA. These builds will usually fail and everyone will have to chip in to help sort out the mess until a build can be made. We desperately need a computer to do this build lark for us!

Level 4

We’ve got some rudimentary tools, like Cruise Control or Jenkins, but we’re not using them to their full capacity, but we’ve got a build monkey! The build monkey does his or her best to make sure that people are aware that they’ve broken the build. The build monkey sets up the CI system and makes changes whenever necessary. The build monkey is the first to look into every build failure. The build monkey goes on holiday and the whole place grinds to a halt.

Level 5

We’ve got the tools, we’ve got the process, but nobody’s listening to us!!! All the tools are in place, we’ve got a suitable CI system and maybe we’re even trying to do continuous delivery. The build system is virtualised and we have a release engineering team (the collective noun for a group of build monkeys is a “release engineering”). The only trouble is, the unit test coverage is apalling and people don’t fix their broken builds, despite the fact that we’ve got a nice shiny wiki page saying we should aim to have 95% unit test coverage and broken builds should be fixed within 3o mins.

Level 6

We’ve got the tools, the processes and we’ve got management buy-in! This is looking good, we now have a lovely system, which our team of build engineers looks after, and we have a semi-compliant dev team who get told off if they don’t play ball! We’re all heading in roughly the right direction

Level 7

We’ve got the tools, the process and the right culture! Everyone has buy-in to the build system. Developers and build enginners alike can be trusted to edit build files and even the CI configuration because we all clearly understand what we’re trying to achieve. Best practices are being observed and so our build engineering team don’t need to spend all day chasing people or working on trivial tasks. Our time can be better invested and productivity increased.

Conclusion

Ultimately we’re all responsible for looking after the CI system – it’s for our own benefit afterall. As a developer I want to make sure I have some fast and reliable feedback on the quality of my code changes. If I see my build has failed, I would actually want to find out why, rather than ignore it. As a build engineer I want our CI system to be providing useful feedback to our developers, and useful information to management – if it isn’t, or if this “useful” information isn’t being acted upon, then it’s not really useful at all, and my job is less fulfilling! All of this means that we all have some responsibility to occasionally look under the hood and see what’s going on, and try to figure out why the system is telling me that something isn’t working quite right.

The hardest part to get right, particularly in distributed teams or in companies over a certain size, is the culture. You have to have a team of build engineers and developers who all understand the big picture. Developers need to understand that they are instrumental in making the system work – their input is vital, and they have to understand clearly what benefits they will personally get from this system, otherwise they’ll ignore it. Build engineers in turn need to understand that the more you are able to devolve the ownership of the system, the better it can work, and the more buy-in you will get in return. The build system needs guardians, but it doesn’t need treating like a holy relic.

Coping With Big C.I.

Last night I went along to another C.I. meetup to listen to Tom Duckering, a consultant devops at Thoughtworks, deliver a talk about managing a scaled-up build/release/CI system. In his talk, Tom discussed Continuous Delivery, common mistakes, best practices, monkeys, Jamie Oliver and McDonald’s.

Big CI and Build Monkeys

buildmonkeyFirst of all, Tom started out by defining what he meant by “Big CI”.

Big CI means large-scale build and Continuous Integration systems. We’re talking about maybe 100+ bits of software being built, and doing C.I. properly. In Big CI there’s usually a dedicated build team, which in itself raises a few issues which are covered a bit later. The general formula for getting to Big CI, as far as build engineers (henceforth termed “build monkeys”) are concerned goes as follows:

build monkey + projects = build monkeys

build monkeys + projects + projects = build monkey society

build monkey society + projects = über build monkey

über build monkey + build monkey society + projects = BIG CI

 

What are the Issues with Big CI?

Big CI isn’t without its problems, and Tom presented a number of Anti-Patterns which he has witnessed in practice. I’ve listed most of them and added my own thoughts where necessary:

Anti-Pattern: Slavish Standardisation

As build monkeys we all strive for a decent degree of standardisation – it makes our working lives so much easier! The fewer systems, technologies and languages we have to support the easier, it’s like macro configuration management in a way – the less variation the better. However, Tom highlighted that mass standardisation is the work of the devil, and by the devil of course, I mean McDonald’s.

McDonald’s vs Jamie Oliver 

ronnysmug git

Jamie Oliver might me a smug mockney git who loves the sound of his own voice BUT he does know how to make tasty food, apparently (I don’t know, he’s never cooked for me before). McDonald’s make incredibly tasty food if you’re a teenager or unemployed, but beyond that, they’re pretty lame. However, they do have consistency on their side. Go into a McDonald’s in London and have a cheeseburger – it’s likeley to taste exactly the same as a cheeseburger from a McDonald’s in Moscow (i.e. bland and rubbery, and nothing like in the pictures either). This is thanks to mass standardisation.

Jamie Oliver, or so Tom Duckering says (I’m staying well out of this) is less consistent. His meals may be of a higher standard, but they’re likely to be slightly different each time. Let’s just say that Jamie Oliver’s dishes are more “unique”.

Anyway, back to the Continuous Integration stuff! In Big CI, you can be tempted by mass standardisation, but all you’ll achieve is mediocrity. With less flexibility you’re preventing project teams from achieving their potential, by stifling their creativity and individuality. So, as Tom says, when we look at our C.I. system we have to ask ourselves “Are we making burgers?”

Are we making burgers?

– T. Duckering, 2011

Anti-Pattern: The Team Who Knew Too Much

There is a phenomenon in the natural world known as “Build Monkey Affinity”, in which build engineers tend to congregate and work together, rather than integrate with the rest of society. Fascinating stuff. The trouble is, this usually leads the build monkeys to assume all the responsibilities of the CI system, because their lack of integration with the rest of the known world makes them isolated, cold and bitter (Ok, I’m going overboard here). Anyway, the point is this, if the build team don’t work with the project team, and every build task has to go through the build team, there will be a disconnect, possibly bottlenecks and a general lack of agility. Responsibility for build related activities should be devolved to the project teams as much as possible, so that bottlenecks and disconnects don’t arise. It also stops all the build knowledge from staying in one place.

Anti-Pattern: Big Ball of CI Mud

This is where you have a load of complexity in your build system, loads of obscure build scripts, multitudes of properties files, and all sorts of nonsense, all just to get a build working. It tends to happen when you over engineer your build solution because you’re compensating for a project that’s not in a fit state. I’ve worked in places where there are projects that have no regard for configuration management, project structures in source control that don’t match what they need to look like to do a build, and projects where the team have no idea what the deployed artifact should look like – so they just check all their individual work into source control and leave it for the build system to sort the whole mess out. Obviously, in these situations, you’re going to end up with some sort of crazy Heath Robinson build system which is bordering on artificial intelligence in its complexity. This is a big ball of CI mud.

Heath Robinson Build System

Heath Robinson Build System a.k.a. "a mess"

Anti-Pattern: “They” Broke MY Build…

This is a situation that often arises when you have upstream and downstream dependencies. Let’s say your build depends on library X. Someone in another team makes a change to library X and now your build fails. This seriously blows. It happens most often when you are forced to accept the latest changes from an upstream build. This is called a “push” method. An alternative is to use the “pull” method, which is where you choose whether or not you want to accept a new release from an upstream build – i.e. you can choose to stick with the existing version that you know works with your project.

The truth is, neither system is perfect, but what would be nice is if the build system had the flexibility to be either push or pull.

The Solutions!

Fear not, for Tom has come up with some thoroughly decent solutions to some of these anti-patterns!

Project Teams Should Own Their Builds

Don’t have a separated build team – devolve the build responsibilities to the project team, share the knowledge and share the problems! Basically just buy into the whole agile idea of getting the expertise within the project team.

Project teams should involve the infrastructure team as early as possible in the project, and again, infrastructure responsibilities should be devolved to the project team as much as possible.

Have CI Experts

Have a small number of CI experts, then use them wisely! Have a program of pairing or secondment. Pair the experts with the developers, or have a rotational system of secondment where a developer or two are seconded into the build team for a couple of months. Meanwhile, the CI experts should be encouraged to go out and get a thoroughly rounded idea of current CI practices by getting involved in the wider CI community and attending meetups… like this one!

Personal Best Metrics

The trouble with targets, metrics and goals is that they can create an environment where it’s hard to take risks, for fear of missing your target. And without risks there’s less reward. Innovations come from taking the odd risk and not being afraid to try something different.

It’s also almost impossible to come up with “proper” metrics for CI. There are no standard rules, builds can’t all be under 10 minutes, projects are simply too diverse and different. So if you need to have metrics and targets, make them pertinent, or personal for each project.

Treat Your Build Environments Like They Are Production

Don’t hand crank your build environments. Sorry, I should have started with “You wouldn’t hand crank your production environments would you??” but of course, I know the answer to that isn’t always “no”. But let’s just take it as read that if you have a large production estate, to do anything other than automate the provision of new infrastructure would be very bad indeed. Tom suggests using the likes of Puppet and Chef, and here at Caplin we’re using VMWare which works pretty well for us. The point is, extend this same degree of infrastructure automation to your build and CI environments as well, make it easy to create new CI servers. And automate the configuration management while you’re at it!

Provide a Toolbox, Not a Rigid Framework

Flexibility is the name of the game here. The project teams have far more flexibility if you, as a build team, are able to offer a selection of technologies, processes and tricks, to help them create their own build system, rather than force a rigid framework on them which may not be ideal for each project. Wouldn’t it be nice, from a build team perspective, if you could allow the project teams to pick and choose whichever build language they wanted, without worrying that it’ll cause a nightmare for you? It would be great if you could support builds written in Maven, Ant, Gradle and MSBuild without any problems. This is where a toolkit comes in. If you can provide a certain level of flexibility and make your system build-language agnostic, and devolve the ownership of the scripts to the project team, then things will get done much quicker.

Consumer-Driven Contracts

It would be nice if we could somehow give upstream builds a “contract”, like a test API layer or something. Something that they must conform to, or make sure they don’t break, before they expose their build to your project. This is a sort of push/pull compromise.

And that pretty much covers it for the content of Tom’s talk. It was really well delivered, with good audience participation and the content was thought-provoking. I may have paraphrased him on the whole Jamie Oliver thing, but never mind that.

It was really interesting to hear someone so experienced in build management actually promote flexibility rather than standardisation. It’s been my experience that until recently the general mantra has been “standardise and conform!”. But the truth is that standardisation can very easily lead to inflexibility, and the cost is that projects take longer to get out of the door because we spend so much time compromising and conforming to a rigid process or framework.

Chatting to Christian Blunden a couple of months back about developer anarchy was about the first time I really thought that such a high degree of flexibility could actually be a good thing. It’s really not easy to get to a place where you can support such flexibility, it requires a LOT of collaboration with the rest of the dev team, and I really believe that secondment and pairing is a great way to achieve that. Fortunately, that’s something we do quite well at Caplin, which is pretty lucky because we’re up to 6 build languages and 4 different C.I. systems already!