Tuesday, 17 November 2009

Behaviour Driven Infrastructure

I've been following the development of puppet for many years and this gem of a thread caught my attention recently. Martin Englund asks the Puppet Users mailing list:
"how do you validate that puppet has done what it is supposed to, and even troublesome, how you validate that it has done what you intended it to do?"
This is something I've struggled with over the years with my JASS/SST-based jumpstart build system. I've gone so far as to automate the build testing process using buildbot, passing or failing a build by using regexps to search for errors in the installation output. But my testing ends when the console login prompt appears. Validating whether a system build functions as intended is beyond what I currently test, and even beyond the capabilities of puppet.

This is where Martin's Behaviour Driven Infrastructure (BDI) approach comes into play. Martin is using Cucumber, a Behaviour Driven Development testing tool, to describe a system's behaviour in natural language that is readable and easy to understand for non-technical users (e.g. your IT helpdesk or even your business stakeholders).

Where this really gets interesting is combining puppet, cucumber and a monitoring system such as Nagios to do Test Driven Infrastructure. For example, you can use cucumber-nagios to integrate your cucumber tests with Nagios, then write a test for a new feature you want the system to have, e.g. "Should be able to send email". Initially this would result in Nagios marking the system as having a fault because you haven't yet implemented the feature that passes the test. You then proceed to implement the feature using puppet such that the test passes.
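To make the test-first cycle above a little more concrete, here is a minimal sketch in Python of a check that reports in the style of a Nagios plugin. It is not cucumber-nagios and not Martin's implementation. The function names `can_reach_smtp` and `run_checks` are my own inventions for illustration, and treating "something is listening on the SMTP port" as a stand-in for "should be able to send email" is a simplifying assumption. The only fixed convention used is the Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import socket

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def can_reach_smtp(host="localhost", port=25, timeout=5.0):
    """Hypothetical behaviour check: is anything accepting connections on the SMTP port?

    This only approximates "should be able to send email"; a real test
    would submit a message and confirm delivery.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks(checks):
    """Run named behaviour checks; return (exit_code, status_line), Nagios style.

    `checks` maps a plain-language behaviour description to a callable
    that returns True when the behaviour holds.
    """
    failed = []
    for name, check in checks.items():
        try:
            passed = bool(check())
        except Exception:
            # A check that blows up counts as a failing behaviour.
            passed = False
        if not passed:
            failed.append(name)
    if failed:
        return CRITICAL, "CRITICAL - failing behaviours: " + ", ".join(failed)
    return OK, "OK - %d behaviours passing" % len(checks)
```

A wrapper script would call `run_checks({"Should be able to send email": can_reach_smtp})`, print the status line and exit with the returned code. Before the mail service is configured the check reports CRITICAL; once puppet has implemented the feature it should flip to OK.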

Over time I would expect test coverage to grow so that system behaviours like DNS resolution, LDAP authentication, host-based firewall policies, etc. would all be tested. Any change to the system that broke one of these tests could be quickly pinpointed and fixed.

A perfect example in the Solaris world is the application of the latest recommended patches to a box. At a minimum it tends to break the sendmail and snmpd configs on my systems, so I know to manually back up these files before applying the patches and restore them afterwards. With BDI combined with a monitoring system you could be alerted to these breakages, or any others that you weren't previously aware of, and rapidly respond to them.



Jeff Anderson said...

This is uber brilliant.

Great concept. The worry I have is that without major vendor support, how realistic and relevant will this be?

What do I have to do to get this working with HP OpenView, BMC, etc.?

No major vendors are even aware of BDD. That doesn't stop me from using it, but it's always an uphill battle...

BDI sounds like another one. Still, I love the idea; it makes sysop duties actually interesting.

Kartar said...

@Jeff Anderson

Ditch HP and BMC and use something open source and extensible? :)

More seriously though, all you need is an API. If your enterprise monitoring tool can receive output from other sources (and I know BMC can) then you can create a Cucumber formatter to output in that format.

matthew said...

@Jeff Anderson

Your test runner only needs some way to notify your NMS of a failure. This could be via syslog, an SNMP trap, or even a custom SNMP MIB variable that you could poll.