Connecting your Azure environment to your office VPN

Okay, before I go anywhere with this topic I should point out that:

a) This is most definitely NOT a step-by-step guide on how to configure your VPN device

b) This is basically just an overview of stuff you need to know before you start

c) I can’t think of a third thing to put here, but 2 things just don’t feel like enough to justify a list

Why on earth would you need to connect your Azure environment to your office VPN anyway?

Actually there are all sorts of reasons for doing this. For instance, you might need your Azure-hosted services to connect directly to servers/services inside your office VPN. My main reason for needing to do this was to connect my Azure VMs to my Chef server running on a VM inside the office VPN. (“Why not just move your Chef server to Azure as well?!” I hear you ask. Well, let’s just imagine there was a really good reason for this, and move on.)

Setting up a VPN connection can be a bit of a pain (and take ages to implement) with some datacentre providers, but with Azure it’s actually rather quite easy. The first thing you need to determine is the type of VPN connection you want to set up. Your 2 main options are point-to-site and site-to-site.

Point-to-Site essentially just involves setting up a virtual network within Azure and connecting out to it from individually configured clients within your office (if you ever work from home and VPN into the office network then you’ll be very familiar with this type of setup).

Site-to-Site involves connecting an existing office VPN to a virtual network within Azure (it’s basically the equivalent of adding your Azure subscription to your local office network).

I opted for a site-to-site connection because it scales well, and once it’s set up there’s no need to use VPN clients on my on-premises servers.

If you want to set up a site-to-site VPN connection to Azure you’ve basically got 2 choices:

  • Set up a connection between your existing VPN hardware (you can find a list of supported VPN devices here) and an Azure Virtual Network
  • Set up a connection between an Azure Virtual Network and a local Windows Server 2012 R2 machine with Routing and Remote Access Service (RRAS)

Setting up a connection using your existing VPN hardware

Many organisations will have dedicated VPN devices, but as mentioned previously not all of these are suitable for connecting a site-to-site VPN to Azure. If your device does happen to be supported then you’ll need to get hands-on with the device configuration to set up the site-to-site connection. This will differ from one device to the next, so good luck with that!

Whatever supported device you’re using, you’ll still need to create and configure a virtual network in Azure. The full instructions on how to do this can be found here, but here’s a basic checklist of the sort of stuff you’ll need to know:

  • Your DNS Servers
  • Your local network name (obvs)
  • Your VPN device’s IP address
  • Your address space
  • Subnet details (if you want to create one)
  • Affinity group name (you can create one as you go through the Virtual Network setup)

Other than creating the virtual network, you just need to create a gateway within that virtual network. Details of how to do that can be found here. This stuff is all really simple from within the Azure Management UI.

And that’s about it from the Azure side. You now just need to configure your office VPN device. As mentioned earlier, the details of how to do this will depend on what device you have, so time to dig out your VPN device’s user manual!

But what if your VPN device isn’t on “The List”??

Well, fear not, for there is another way! All you need is a Windows Server 2012 machine with RRAS configured.

NOTE: I know you can also configure RRAS on Windows Server 2008 R2 but I don’t yet know if this will work (we’re still trying to test it out as I’m writing this). Here, try this guide if you fancy giving it a shot, and let me know if it works with Azure!

One thing to note is that the Microsoft documentation pretty much says this setup won’t work if your RRAS server is behind a NAT or a firewall, but this isn’t actually the case. It’ll work just as long as your RRAS server has a public IP address.

So, here’s a basic overview of what you’ll need:

  • The same shizzle as previously for the Azure Virtual Network
  • A Windows Server 2012 machine with 2 NICs
  • A public IP address on the 2012 server
  • A local Gateway server (you could just use the RRAS machine for this though)
  • ICMPv4 enabled on your firewall

So there we are, nothing too complicated at all. There’s plenty of configuration work to be done in setting all this stuff up, but the Azure side is definitely the easy part. As for the RRAS stuff, don’t install and configure this manually. You actually need to edit a PowerShell script with the details you get along the way, and then run the script. It sounds like a ball-ache, but it’s actually more fun than the usual Windows service installation! There are plenty of good step-by-step resources out there to help you work through a site-to-site setup.

Enabling WinRM using PowerShell

So, you’re doing stuff with these new “virtual” machines eh? Well if you’re using Windows, there’s a damn good chance you’ll need to enable and configure WinRM, otherwise you won’t be able to log in to your swanky new “virtual machine”! Even Chef needs this service running on the target in order to work with Windows. Anyway, here’s what to do: open a PowerShell prompt and type the following:

winrm quickconfig -q

winrm set winrm/config/winrs '@{MaxMemoryPerShellMB="512"}'

winrm set winrm/config '@{MaxTimeoutms="1800000"}'

winrm set winrm/config/service '@{AllowUnencrypted="true"}'

winrm set winrm/config/service/auth '@{Basic="true"}'

Start-Service WinRM

Set-Service WinRM -StartupType Automatic

Alternatively you could create a ps1 script containing the stuff above, open PowerShell, do the thingy that allows you to run unsigned scripts, namely:

Set-ExecutionPolicy Unrestricted

Then run the ps1 script.

There, I’ve blogged it, now I’ll never have to google this again!

Adopting Agile in 3 “Easy” Steps

All good plans come in 3 phases:

[Image: the underpants gnomes’ business plan. Phase 1: Collect Underpants. Phase 2: ? Phase 3: Profit]

Although I won’t be collecting any underpants, I’ll be following this basic template (with a couple of tweaks here and there) during an Agile adoption initiative I’m currently working on.

In the South Park episode (from which I have taken the picture above) the boys discover a bunch of underpant-stealing gnomes, who are collecting underpants as part of a grand plan to make profit. The gnomes claim to be business experts but none of them appears to know what phase 2 of the plan is. All they know is that their business model is based on collecting underpants, and so that’s what they’ll do.

Unfortunately, I have been witness to a couple of attempts at adopting agile which weren’t very dissimilar to the underpants gnomes’ business plan. Namely, a business starts “Adopting Agile”, usually driven by the development team, where they start doing stand-ups and using a sprint board (this is phase 1) and somehow they are surprised when this doesn’t suddenly start producing profit. Clearly, “becoming Agile” isn’t as simple as that.

Phase 1 – Collect business reasons (not underpants)

So you’re going Agile. Presumably you’ve determined that this is what you want, and what your customers need. If you haven’t done this yet then stop right there and ask yourself “Do I Need To Go Agile?”. The answer might be “no”, but does needing to go Agile have to be the only reason? Maybe you just want to go agile to see what the fuss is all about, or to make your business more attractive to potential new employees.

So let’s assume we’re going agile, and you have valid business reasons to do so. My first suggestion would be to make those business reasons highly visible. You have to outline the existing issues and how Agile can help to fix them. Mitchell and Webb once did a sketch about a toothbrush company that had to think of some gimmick to add to their toothbrushes in order to keep increasing their sales. They came up with the idea of “dirty tongue”. This is where microscopic “tonguanoids” build up, and basically result in social exclusion and a lack of sex. Their solution: to put bristles on the other side of the toothbrush so that people can brush their tongues while they brush their teeth. People will buy these toothbrushes despite the fact that “brushing your tongue makes you retch, everybody knows that”. The point I’m making, very badly, is that it’s a lot easier to sell things if people think that what they’re buying into will fix some very real, tangible issue.

The same goes with Agile. To get the buy-in you need to make your agile adoption a success, you’ll need to identify how “going Agile” is going to make life better for everyone concerned.

If the problem you’ve got is that you never ship software on time, or you constantly fail to deliver what the customer wants, then it’s fairly easy to “sell” agile as the solution. The concept of sprints is a doddle for everyone to understand, and they’ll love the idea that the customer will have regular interactions with the development team, and get to see regular progress in the demos. “Of course!” they’ll say, “It’s so obvious, why didn’t I think of that before?” The business should easily be able to see how short, sharp sprints with an emphasis on “working software” will make it easier to deliver what the customer wants, and manage their expectations of when it’ll be ready.

But what if those aren’t the problems you need to solve?

What if your problem is quality? How do you convince the business that Agile will result in a higher quality product? It’s not quite so easy. Agile itself won’t deliver better quality, but the good practices you’ll have to implement in order to successfully be agile will help to improve your quality. I was thinking about this the other day because it’s exactly the problem I was faced with.

Agile isn’t going to make it easier to reliably test our software. But to be agile, we need to be able to build and deploy our project rapidly so that we can test it right there and then, not tomorrow, not next week, but right now, so that the testers and devs can work in tandem, building features and signing them off and moving on to the next one. We have to facilitate this in order to be agile, so as a byproduct of going agile we might have to invest in creating a new build and deployment system. And it has to be quick so it’ll have to be automated.

So we have an automated build and deployment system, but to be able to reliably test our features we’ll have to make sure the environments are reliable. We can throw people at this problem and dedicate a team to making sure our environments are clean and regularly audited, or we can automate all that as well! Fortunately there are numerous tools and good practices we can follow to do this. Just take a look at Chef, Puppet, Vagrant, and VMware as examples of tools for automating deployments of virtual machines, and the concepts of “infrastructure as code” for good practices. (Of course, if your hardware isn’t already virtualised the first thing to do is see whether it can be, and if it really, honestly can’t, then look at tools like Norton Ghost and PowerShell for ways of automating as much as you can.)

“Agile” and “Improved Quality” might not be the most obvious bed partners, but the journey to becoming agile almost forces you to take steps which will naturally go towards improving your quality.

Hopefully you’ll have enough “sales material” to put forward a great case for agile – you can deliver exactly what the customer wants, to a higher quality, and you can manage their expectations in a way you could never do before. And that’s just scratching the surface of what Agile can do for a business, but for the purposes of keeping this post to a reasonable length, I’ll leave it at those 3 things!

Phase 2 – Pick the most appropriate project, and start doing Scrum

The sales pitch is over and now it’s time to start doing stuff. Make your life a lot easier by picking a project that has as many of the following features as possible:

  • Smart developers and testers
  • Isn’t suffering from a tonne of technical debt
  • Has users who are happy to get involved in early & regular feedback
  • Is small, new or yet to begin

If you’re taking on an existing project, a good idea at this point is to benchmark your existing processes. Consider trying to measure the following:

  • How long does it take to get a single change from request through to production deployment?
  • How much time and money does it cost to fix an issue on production?
  • How many bugs do you typically find on your production code every month?
  • How often do you deliver features that don’t satisfy the customer?
  • How often do you deliver features after the deadline?

Measuring some of the things above is clearly non-trivial, but if you can find these stats somewhere, they’ll be very useful benchmarks for you in the future. When you can demonstrate that all of these metrics are improved in your new Agile process, god-like status will soon follow.
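To make that a bit more concrete, here’s a minimal Python sketch of how two of those benchmarks (lead time, and how often you miss a deadline) might be calculated. The record layout and the field names (requested, deployed, deadline) are made up for illustration; in reality you’d be pulling this sort of data out of whatever ticketing and release tools you happen to use.

```python
from datetime import date
from statistics import median


def benchmark(changes):
    """Summarise lead time and lateness for a list of delivered changes.

    Each change is a dict with hypothetical keys: 'requested', 'deployed'
    and 'deadline', all datetime.date values.
    """
    if not changes:
        raise ValueError("need at least one change to benchmark")
    # Lead time: days from the original request to the production deployment.
    lead_times = [(c["deployed"] - c["requested"]).days for c in changes]
    # A change counts as late if it went live after its agreed deadline.
    late = sum(1 for c in changes if c["deployed"] > c["deadline"])
    return {
        "median_lead_time_days": median(lead_times),
        "late_percent": 100.0 * late / len(changes),
    }


if __name__ == "__main__":
    history = [
        {"requested": date(2012, 1, 1), "deployed": date(2012, 1, 11), "deadline": date(2012, 1, 15)},
        {"requested": date(2012, 2, 1), "deployed": date(2012, 3, 2), "deadline": date(2012, 2, 20)},
    ]
    print(benchmark(history))
```

Run the same calculation before and after the switch to Scrum and you’ve got a like-for-like comparison to wave in front of the business.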

You - after delivering "agile" to the business

Guess who just delivered a release on time…
Ohhhh Yeeeeaaaaahhhh!

I recommend doing Scrum because it’s simple and has the most support in terms of people with experience, material (books, courses etc), and tools. It’s a good “framework” to get you started, and once you’ve had success, you can evolve into other methods, or incorporate them into Scrum (such as BDD, TDD etc).

Succeeding With Agile

Here’s one of many great books to get you started on your agile journey

At about this point you’ll need to do some brainwashing (sorry, training). The concept of doing analysis, design, development and testing all at the same time is going to sound absolutely bonkers to some people. Try your best to explain it to them, but don’t waste too much time on this – just crack on and make a start!

Most people will enjoy the experience of working in this “new” way, and the first few sprints will probably benefit from the fact that everyone is performing better simply because they feel more invigorated. Use this opportunity to promote scrum across the organisation.

In this phase, always maintain a focus on “the business” and not just on the technical team. It’s important that the business feels part of this new process or they’ll just see it as some crazy dev thing which doesn’t really affect them, and they won’t try to understand it. Business people might refer to this as “Promoting Synergy”, which I’ve just shoe-horned into this post so that I can add a picture from Lonely Island’s “Like A Boss” video. However, I do like to make a point of always highlighting the extra business value we’re delivering, and make sure the Product Managers (soon to be “Product Owners”) are involved all the way. They represent the traditional link between the customers and what we’re delivering, and so it’s essential that they understand the benefits of agile.

Promote Synergy!

I was recently asked about the impact of “going agile” on a project’s release schedule, and when we would be able to deliver the features we’ve promised to the customers. It’s difficult to explain that we no longer know when we’ll deliver stuff, but at some point, people will have to realise that this is the wrong question. I prefer the idea of a rolling roadmap, which is continually reviewed and updated (as often as you can afford to do it, really). Rolling roadmaps give the business, as well as the customers, a good idea of our intentions, but they are very different to fixed dates on a release schedule, or a traditional yearly roadmap. Of course, everyone needs to understand that the main driver for our deliverables will be the customers, and what the customer wants will usually change over a shorter period than you expect. So for your new “Agile” project, try to work towards implementing a rolling roadmap culture, and move away from long-term fixed delivery dates (if you can).

One final note on Phase 2: Make it fun, and make it different.

Phase 3 – Improve

Agile promotes “fast feedback loops” all over the place: in development we get fast feedback on our code through Continuous Integration, with BDD we get fast feedback to the Product Owners/BAs and of course with our more frequent releases we get faster feedback from the customers. And so it is with our Agile processes as a whole. With short sprints and the clever use of retrospectives we can continually tinker with our fine tuning to see if we can improve our quality and velocity. Look at areas you can try to improve, change something and then see if your change has had a positive impact at the end of the sprint. This is basically the concept behind Deming’s Shewhart Cycle:

Deming actually preferred “Plan, Do, Study, Act”, whereas I myself prefer “Plan, Do, Measure, Act”. The reason I prefer this is because it implies the use of quantifiable metrics to base our actions on, rather than some other non-quantifiable observations. Anyways, the point is that after agile is applied, you should keep looking at ways to continuously improve. This is key to keeping everyone feeling fresh and invigorated, helps us to learn from our mistakes, and encourages innovation.

So there you go, Agile delivered in 3 well easy steps. It shouldn’t take you much longer than an afternoon. Ok, it might take a bit longer but if you’re looking for a 30,000 foot overview of a simple 3-phase approach, then you could do a lot worse than apply the principles of “Sell the Agile Idea, Pick the Best Project, and Keep Improving”.

Continuous Delivery Warts and All

Tom Duckering was back at Skills Matter this week, and this time he brought a friend (and fellow ThoughtWorker), Marc Hofer. They were there to talk to us about a “real life” continuous delivery project they’ve recently been working on. I sat, listened, took notes, and then I had to leave because I was meeting my girlfriend at the cinema to watch “Snow White and the Huntsman”, which was absolutely AWFUL by the way. Do not waste your time on this movie, it seriously drags on forever and I actually fell asleep before the end. It has Charlize Theron in it (is it me or is she in everything right now?), but don’t let that fool you, it’s still rubbish. Anyway, as I was saying, I took notes, and this is what I learned…

Warts?

The “warts and all” title was meant to be a caveat that they don’t claim to have got everything perfectly right, and that there were problems along the way on this project. The client for this particular project was “Springer” (a publishing company) and the job was to redesign the website (basically). One of the problems they were aiming to fix was the “time to release”, which was in the region of months, rather than hours, and so they decided to go all Continuous Delivery from the outset. Another thing worth mentioning was that this was a greenfield project, which has its advantages and disadvantages, as outlined here in my incredibly pointless table:

I did that table in Powerpoint, thus highlighting my potential as a senior manager.

Why Continuous Delivery?

The fact that they chose to follow the continuous delivery path right from the outset was an important decision. In my experience, continuous delivery isn’t something you can easily retrofit into an existing system; at least, it’s nowhere near as easy as setting out to follow continuous delivery right from the start. Tom put it like this:

You can’t sell continuous delivery as a bolt-on

Which, as usual, is a much better way of putting it than I just did.

One of the reasons why they went for the continuous delivery approach with this client was to sell more of Jez Humble’s Continuous Delivery book (available on Amazon at a very reasonable price). Just kidding! They would never do that. They actually chose continuous delivery because of the good practices (I’m trying to stop using the term “best practices” as I’ve learned that it’s evil) it enforces on a project. Continuous delivery allows you to have fast, frequent releases, which forces small changes rather than big ones, and also forces you to automate pretty much everything. They even automated the release notes, which is something we’ve also done on a project I’m working on currently! Our release notes are populated from a template, with content pulled in from Jira, and they’re packaged up in every single build. Neat, no? Well Tom seemed pretty impressed with the idea, and I’m quite chuffed that we’re doing the same stuff.
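For a rough idea of what “release notes populated from a template” can look like, here’s a minimal Python sketch. It isn’t what Tom’s team (or mine) actually runs: the template text, the render_release_notes function and the issue fields are all hypothetical, and the issues are passed in as plain dictionaries rather than being fetched from Jira.

```python
from string import Template

# Hypothetical template; a real one would live in source control with the build.
NOTES_TEMPLATE = Template(
    "Release notes for $product build $build\n"
    "\n"
    "Changes in this build:\n"
    "$changes\n"
)


def render_release_notes(product, build, issues):
    """Fill the template with one bullet per issue.

    'issues' is a list of dicts with assumed keys 'key' and 'summary',
    standing in for whatever the issue tracker would return.
    """
    lines = [f"  * {issue['key']}: {issue['summary']}" for issue in issues]
    changes = "\n".join(lines) if lines else "  * (no tracked changes)"
    return NOTES_TEMPLATE.substitute(product=product, build=build, changes=changes)


if __name__ == "__main__":
    print(render_release_notes(
        "website",
        "1.4.2",
        [{"key": "WEB-101", "summary": "Fix login redirect"}],
    ))
```

The useful bit is that the build calls this every time, so the notes can never drift out of step with what’s actually in the package.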

Another reason they opted for a continuous delivery approach was to overcome the IT bottleneck problem.

Look at all the cool stuff I can do with MS paint!!

It would seem that there was an IT black hole which was unable to produce as quickly as the business demanded. I usually hear people say “Agile” is the solution to the IT bottleneck, rather than continuous delivery, but Tom made a point of saying that they were agile as well. I think continuous delivery helps teams to focus on the delivery aspect of agile, and gives us a way of bringing the delivery issues much further back down the line, where they can be addressed more easily, and not at the last minute. As I mentioned earlier, time-to-market was an important driving factor in choosing continuous delivery. I would also add that, in my experience, having a predictable time to market is of great importance to the business. You tend to find that project sponsors don’t mind waiting a couple of weeks, maybe longer, for a change to go live, as long as that estimate is realistic.

The Details

I won’t go into too much technical detail about the project they were working on, so I’ll summarise it like this:

  • Local virtualisation was done using Vagrant and VirtualBox, so devs could easily spin up new environments locally.
  • They used Git, and it wasn’t easy. Steep learning curve etc. Using submodules didn’t help either.
  • They had on-site Git go-to people, which helped with the Git learning curve.
  • Devs could deploy to any environment – this was useful for building up environments, but is scary as hell.
  • They kept the branches to a minimum – only for bugfixes or when doing feature toggle releasing.
  • They do check-in stats analysis to “incentivize” people. Small and frequent commits were rewarded.
  • They used Go (they have my sympathy).
  • They deploy using Capistrano
  • They deploy to a versioned directory and use symlinks, which helps with rollbacks (I’d say this was a pretty standard practice)
  • They use Kickstart and Chef to build workstations, and Chef-Solo for other environments
  • The servers are provisioned with VMware, the base OS installed with Cobbler/Kickstart, and the “configuration” applied by Chef
  • Even the QA environment was load balanced!
  • This is a long list of bullet points
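
The versioned-directory-plus-symlink trick from the list above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the Capistrano setup the team used: the releases/current layout, the function names and the assumption that version names sort in release order are all mine.

```python
import os
import tempfile
from pathlib import Path


def activate(root: Path, version: str) -> None:
    """Point root/current at root/releases/<version>."""
    target = root / "releases" / version
    if not target.is_dir():
        raise FileNotFoundError(f"no such release: {target}")
    tmp_link = root / "current.tmp"
    if tmp_link.is_symlink():
        tmp_link.unlink()  # clear a leftover from an interrupted run
    tmp_link.symlink_to(target, target_is_directory=True)
    # Renaming over the old link swaps releases in a single step on POSIX.
    os.replace(tmp_link, root / "current")


def deploy(root: Path, version: str, files: dict) -> None:
    """Write the files into a fresh versioned directory, then switch to it."""
    release_dir = root / "releases" / version
    release_dir.mkdir(parents=True)
    for name, content in files.items():
        (release_dir / name).write_text(content)
    activate(root, version)


def rollback(root: Path) -> str:
    """Re-point current at the release before the live one and return its name."""
    # Assumption: version names sort in release order (e.g. timestamps).
    versions = sorted(p.name for p in (root / "releases").iterdir() if p.is_dir())
    live = (root / "current").resolve().name
    position = versions.index(live)
    if position == 0:
        raise RuntimeError("no earlier release to roll back to")
    previous = versions[position - 1]
    activate(root, previous)
    return previous


if __name__ == "__main__":
    demo_root = Path(tempfile.mkdtemp())
    deploy(demo_root, "20120601-1", {"index.html": "old site"})
    deploy(demo_root, "20120608-1", {"index.html": "new site"})
    print(rollback(demo_root), (demo_root / "current" / "index.html").read_text())
```

Because the old release directory is never touched, rolling back is just a matter of moving the link, which is why this makes rollbacks so much less stressful.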

I was pretty interested in the idea of load balancing the test environment because it reminded me of a problem I had at a company I was working for a few years ago. We didn’t have a load balanced test environment but we did have a load balanced live environment, and one night we did a scheduled production release which just wouldn’t work. It was about 4am and things weren’t looking good. Luckily for me, a particularly bright developer by the name of Andy Butterworth was on hand, and he got to the bottom of the problem and dug us out of a hole. The problem was load-balance related of course. Our new code hadn’t been written for a load balanced cluster, but we never picked it up until it was too late. I’m not sure what past experiences drove Tom and Marc to implement a load balanced test environment, but it’s a good job they did, as Tom testified that it has saved their bacon a few times.

Load balancing QA has saved our bacon a few times!

One of the other things that I was interested in was the idea of using Vagrant and VirtualBox for local VM stuff. I was surprised at this because they are also using VMware. I wondered why, if they’re already using VMware, they don’t just use VMware Player for their local VMs?

I was also interested in the way they’d configured Go, which, at a glance, looked totally different to how we’ve got ours set up where I’m currently working. I’m hoping Tom will shed some light on this in due course!

I loved the idea of using check-in stats to incentivize the team! I’m really keen on the whole gamification thing at the moment, and I’m trying to think of some cool gamified way of incentivizing teams where I work. The check-in stats approach that Tom talked about looked cool: they analyse the number of check-ins per person, look at the devs’ comments too, and produce a scoreboard 🙂
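
I don’t know how Tom and Marc’s scoreboard actually works, but here’s a minimal Python sketch of the general idea. Everything in it is an assumption on my part: the “author|message” input format (the sort of thing you could get out of git log with a custom pretty format), the scoreboard function, and the rule that a commit message only counts as “described” if it has at least three words.

```python
from collections import Counter


def scoreboard(log_lines, min_message_words=3):
    """Rank authors by number of check-ins, then by described check-ins.

    Each input line is assumed to look like 'author|commit message'.
    Returns a list of (author, check_ins, described_check_ins) tuples.
    """
    commits = Counter()
    described = Counter()
    for line in log_lines:
        author, _, message = line.partition("|")
        author = author.strip()
        if not author:
            continue  # skip blank or malformed lines
        commits[author] += 1
        # Made-up rule: a message with enough words counts as a proper comment.
        if len(message.split()) >= min_message_words:
            described[author] += 1
    rows = [(author, commits[author], described[author]) for author in commits]
    # Most check-ins first; ties broken by described check-ins, then by name.
    rows.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return rows


if __name__ == "__main__":
    sample = [
        "Tom|Add feature toggle for search",
        "Marc|fix",
        "Tom|Tidy up deploy task",
    ]
    for author, check_ins, with_comment in scoreboard(sample):
        print(author, check_ins, with_comment)
```

Rewarding the comment as well as the count matters, otherwise the quickest way up the table is a flood of tiny commits all called “fix”.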

More Than Tools

I’ve been to a few talks and conferences recently and one of the underlying messages I’ve got from most of them is that people and relationships are more important than tools, and by that I mean that it’s more important to get relationships right than it is to pick the right tools. Bringing in a new amazing tool isn’t going to fix the big problems if the big problems are down to relationships.

I can think of a few examples: introducing tools like VMware and Chef is great for speeding up the provisioning and configuring of environments, but if you don’t actually work on the relationships between the development and operations teams, then the tools won’t have any effect. The operations team might not buy into them, or maybe they’ll use them but not in the way the developers want them to. Another example: bringing in a new build tool because your old build system was unreliable. This isn’t going to fix your problem if your old system was unreliable because development weren’t communicating clearly with the build engineers.

So relationships are key. But how do we make sure we’ve got good relationships? Well, I think if anyone knew the answer to that one they’d bottle it and sell it for millions. The truth is that it’s different for every situation, but there are things which can make sure you’re all on the same page, which is a start:

  • Have shared goals! I’m often banging on about this. Everyone has to push in the same direction. For me, in reality this often means trying to educate people that we don’t make any money from having reliable builds on developers’ laptops if the builds are unreliable in the CI/build system. We don’t make money out of finishing all our story points on time. We don’t make money out of writing new features. We make money by delivering quality software to customers! So I think that is exactly what we should all be focused on.
  • Be agile! I know this might seem a bit like it’s the wrong way around, but I actually think that being agile helps to build relationships. It’s a practice and a mindset as much as a process, and so if people share that mindset they’re naturally going to work better together. In my experience, in Operations teams we’ve been quite slow at adopting agile in comparison to other teams. It’s time for this to change. Tom said that on the project he’s working on, the Ops team are agile, and he identified that as one of the success areas.
  • Pair up. There’s nothing quite like sitting next to someone for a couple of days to help you see things from their perspective! On Tom & Marc’s project at Springer they paired the ops guys with dev. I would recommend going further and pairing dev with support engineers, QA (obvs!) and build/release management on a regular basis. Pairing them with users/customers would be even better!
  • Skill up. Tom & Marc talked about cross pollination of skills, and by this they mean different people (possibly from different teams) learning parts of each other’s trades and skills. Increasing your skillset helps you understand other people’s issues and problems better, as well as making you more valuable, of course!

I became a better developer by understanding how things ran in Production – Marc Hofer

Summary

In summary – Tools are important, people and relationships are importanter (new word), you should automate everything, take little steps instead of big ones, stick to the principles of continuous delivery, and the new Snow White movie is bollocks.

Test-Driven Infrastructure with Chef – Book Review

A while ago I ordered a copy of “Test-Driven Infrastructure with Chef” from Amazon. It must have been sometime last summer in fact. It took months to arrive, because they simply didn’t have enough copies. So when it finally did arrive, I was very excited to see if my wait was worth the, er, wait.

Written by Stephen Nelson-Smith (he of @LordCope twitter fame and author of the excellent Agile Sysadmin blog), Test-Driven Infrastructure with Chef is by no means “War And Peace”. In fact it’s pretty tiny, and looks more like a pamphlet than a book. But what it lacks in size it more than makes up for in concise content.

What I really like about this book is that it feels absolutely perfect for someone like me, and by that I guess I’m trying to say that the target audience is well thought out. It’s aimed at Developers, Ops Engineers, DevOps, Sysadmins and Release engineers, those sorts of people. It assumes you know a certain amount about your own business, and so I don’t find myself sitting there reading some really basic stuff that anyone in my line of work is bound to know already. I’ll take the Continuous Delivery book as an example – it’s a great book, but some of it is about introducing Continuous Integration! I would have thought that if you were about to embark on Continuous Delivery, the first thing you’d already be VERY comfortable with would be C.I. so why the need to cover that all over again? Besides, there are plenty of good C.I. books on that subject. Of course, I know that the Continuous Delivery book is in actual fact aimed at a much wider audience, but what I’m trying to say here is that Test-Driven Infrastructure with Chef is more targeted, and I feel it’s a much easier read for it.

Infrastructure as Code

The fundamental premise of this book, and indeed the main point of Chef itself, is that we should treat infrastructure as code. What this means is that managing, designing, deploying and testing infrastructure should be done in an analogous fashion to how we do these same things with software. The code that builds, deploys and tests our infrastructure should be committed to source-control in the same way as the code that builds, deploys and tests our software is.

This approach brings with it many of the same principles as we have around building, deploying and testing our software, and these are listed in chapter 1. Also listed here are the advantages of treating your infrastructure as code, things such as repeatability, automation, agility, scalability, disaster recovery and very importantly, reassurance!

The book goes on to introduce the reader to Chef. Chef is an open source tool for managing, deploying and configuring infrastructure. It’s produced by Opscode – check out the website for more information. The book explains about the Chef tool, framework and API, and then goes on to give instructions on how to install it (you’ll need Ruby installed – and the book covers this, using RVM). You’ll also get an introduction to Git (and GitHub) here too, and how to install Git on Ubuntu. Incidentally, all the examples are based on an Ubuntu system, so if you follow the examples closely, it’s best to have Ubuntu, or an Ubuntu VM, at hand. That’s not to say that the examples can’t be done on other systems, but I would guess that CentOS would require a lot more behind-the-scenes configuring, thanks to its less-than-fantastic package repositories.

Cucumber-Chef

Chapter 4 provides a nice introduction and description of Test-Driven and Behavior-Driven Development, and talks a little about the Acceptance Test automation tool Cucumber, before chapter 5 goes into some more detail about Cucumber-Chef (don’t worry if you haven’t heard about this before, the book tells you all you need to know to get started, but for now let’s just say it does for infrastructure what Cucumber does for code).

I couldn't find the logo for cucumber-chef, so here's a picture of a cucumber...

...and chef

Chapter 5 introduces us to the Amazon EC2 Web Service. It shows you how to get set up with a personal account (which is nice and easy), because you’ll need it to work through the examples that follow! This is one of the things that I like most about this book: it’s a practical guide. It’s as if the author knew (which he obviously did) that his target readers are the types of people who like to get stuck in and do stuff. Chapter 5 finishes off with instructions on installing Chef and using a couple of the built-in tasks.

Recipes, Roles and Cookbooks

It’s chapter 6 and time for a worked example using cucumber-chef. This is where we first meet Recipes, Roles and Cookbooks. First, we learn how to write Cucumber-style Given, When, Then Acceptance Tests, and we start TDDing. This chapter really is about how to apply test-driven development to an operations solution. Requirements are gathered, Acceptance criteria are identified, Acceptance Tests are written (before we actually do any Chef scripting, as per TDD), and we follow the “Red, Green, Refactor” model until we have a working solution. In my copy there’s a glaring misprint on page 48, where the given, when, then example ends up being “given, when, when” 🙂 Hopefully this’ll get corrected in future editions.
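If you haven’t seen the Given, When, Then shape before, here’s a rough sketch of it in plain Python rather than Cucumber’s own feature-file syntax. The `provision_users` function and the user names are hypothetical stand-ins I’ve made up for illustration; they aren’t taken from the book’s example.

```python
def provision_users(server, users):
    """Hypothetical provisioning step: give each user an account on the server."""
    accounts = server.setdefault("accounts", set())
    for user in users:
        accounts.add(user)
    return server


def test_team_members_have_accounts():
    # Given a freshly built server with no accounts
    server = {"accounts": set()}
    # When the provisioning code is applied
    provision_users(server, ["alice", "bob"])
    # Then every member of the team has an account
    assert {"alice", "bob"} <= server["accounts"]


# Run the acceptance test; it raises AssertionError if the criteria aren't met.
test_team_members_have_accounts()
```

In the “Red, Green, Refactor” cycle you’d write the test first and watch it fail (red), then write just enough of `provision_users` to make it pass (green), and only then tidy up.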

The final chapter underlines how managing our infrastructure as code, and applying the principles of test-driven development can help us increase our quality and reduce the usual risk associated with deploying infrastructure.

 

Conclusion

This book will serve as a great introduction to Chef for anybody who learns best via hands-on examples. But it’s much more than an introduction to Chef, it’s really about the practice of test-driven development, and about how to apply the principles of TDD to infrastructure management.

Using Chef in itself is a step in the right direction – it allows us to treat infrastructure as code, and this has so many benefits – we can version our configurations much more easily, we can store our configurations in a proper source code management tool, we can deploy configurations sensibly, and so on. And applying TDD principles to your Chef “development” obviously looks like a great idea, it brings with it all the goodness of TDD, and gets us to think in terms of requirements and acceptance criteria before we start building our systems, ensuring that what we produce is fit for purpose.

Even if you don’t follow TDD, or don’t plan to follow TDD for your infrastructure development, this book is still very much well worth the read. The hands-on approach using examples you can actually work with, is refreshing. I’ve probably learned more from working through this little book than I have from reading most other voluminous technical guides.

The Boy Who Cried Wolf – Why Broken Builds Must Not Be Ignored!

We’ve all heard the fable of the boy who cried wolf, an old tale written by a Greek slave who lived a long time ago and liked telling stories. I must stress though that this was just a story, but like many good stories (Star Wars, Gremlins 2) it’s based on true events. It’s a little known fact (which I have just made up) that the story of the Boy Who Cried Wolf is actually based on an ancient Continuous Integration system (possibly belonging to Spartatech inc, a leading software development company from around 600 BCE).

There are only very sketchy records of what actually happened during this historical event, but here’s what we know for sure*:

  • The Continuous Integration system compiled code and ran unit tests.
  • The unit tests were followed by acceptance tests, which in turn were followed by integration tests.
  • The integration tests were followed by cross-platform tests.

One day, the C.I. system alerted the developers, and anyone else who would listen, that one of the builds was failing!

THE BUILD IS FAILING!!!!!!

THE BUILD IS FAILING!!!!!!

All the workers gathered around the C.I. system to take a look at the error, only to discover that in actual fact, the problem was that the machine on which the tests were running had restarted, thanks to a Windows update! The developers went back to work and thought no more of it.

The very next day, the C.I. system went ballistic again, alerting the developers of yet another failure. Once again the team took notice and looked into the error. This time they discovered that the test machine was running slowly and their unit test had timed out. They restarted the machine and it all passed.

By the following week, the C.I. system had alerted the team to 5 other “failed” builds, which were simply a result of the test servers behaving oddly, and no change to the code was required at all. By the end of that week the dev team had stopped paying attention to these alerts, because they all had proper jobs to do, and a lot of work to get through by the end of the sprint (they were agile, even back then – so if you still work according to “Waterfall” THAT’S how far behind the times you are).

Then, one day, a developer checked in a piece of code which failed a unit test. The C.I. system rightly alerted the development team again, but this time nobody came running. In fact, nobody paid a blind bit of notice. They were all so used to ignoring the C.I. system that when a real problem finally arose, it wasn’t picked up by anyone. Anyway, cutting an already long story a bit shorter, the bug was a biggie, and the next release of their software was so poor that Spartatech inc, the biggest employer in Sparta, went out of business, making everyone redundant. This is what basically led to the collapse of Sparta, not sure if you knew that. And here concludes your exceedingly dodgy history lesson.

Back to the present time: we all know that false alarms (builds that fail when there’s nothing wrong with the code) are the enemy of a good Continuous Integration system. They lead us down a path from which it’s increasingly hard to recover. The problem is that it’s just so easy to get into that situation. Thankfully, there are a few “Continuous Integration best practices” which we can follow, which can help us keep our system in good nick:

  • Make sure the servers on which you run your builds are cleaned regularly, preferably before every build. I would suggest a clean image be deployed each morning. This obviously means that deploying and configuring your system will need to be automatic.
  • Use tools such as VMware, VirtualBox etc to manage your Virtual Machines.
  • Use tools like Puppet, Chef and Vagrant to automate the deployment and configuration of these VMs.
  • Add the deployment of the VMs to your C.I. system!
  • If there are any randomly failing tests, which also randomly fail on your local machine, remove them or rewrite them to make them more reliable.
  • When doing sprint planning, make sure time is dedicated to investigating and fixing broken or flaky builds. It’s very important to ensure the Product and Project Managers are aware of the importance of the C.I. system, and the risk of not maintaining it properly.
  • Treat the rules of Continuous Integration as sacrosanct – if a build fails, fix it as soon as possible.
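As a rough illustration of the point about randomly failing tests, here’s a small Python sketch that re-runs a test a few times and labels it as a pass, a genuine failure, or flaky. The `classify` helper and the three example tests are hypothetical, made up for this post; a real C.I. system or test runner will have its own way of doing this.

```python
import itertools


def classify(test, runs=5):
    """Run `test` several times and label it 'pass', 'fail' or 'flaky'."""
    outcomes = []
    for _ in range(runs):
        try:
            test()
            outcomes.append(True)
        except AssertionError:
            outcomes.append(False)
    if all(outcomes):
        return "pass"
    if not any(outcomes):
        return "fail"
    return "flaky"


def always_passes():
    assert 1 + 1 == 2


def always_fails():
    assert 1 + 1 == 3


_calls = itertools.count()


def every_other_run_fails():
    # Stand-in for a test that depends on a slow or misbehaving machine.
    assert next(_calls) % 2 == 0


results = {
    test.__name__: classify(test)
    for test in (always_passes, always_fails, every_other_run_fails)
}
```

A test labelled “fail” deserves the team’s attention straight away, whereas one labelled “flaky” is a candidate for being rewritten or removed before it teaches everyone to ignore the alerts.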

In an earlier post I wrote about the difference between having a Continuous Integration server and practicing Continuous Integration. If you start to get into the situation where you’re allowing broken builds to go un-fixed, you’re slowly slipping away from actually practicing continuous integration, and soon you’ll just be a development team who has a continuous integration server which is delivering much less value than it ought to.

*I may have got my chronology a little mixed up