Web Application Functional Regression Testing Using Selenium

At Foliotek, we use a rapid development methodology.  Typically, a new item will go from definition through coding to release in a month’s time (bucketed along with other new items for the month).  A bugfix will nearly always be released within a week of the time it was reported.  In fact, we are currently experimenting with a methodology that will allow us to test and deploy new items individually as well – which means that a new (small) item can go from definition to release in as little as a week, too.

Overall, this kind of workflow is great for us, and great for our customers.  We don’t need to wait a year to change something to make our product more compelling, and customers don’t have to wait a year to get something they want implemented.  We also avoid the shock of suddenly introducing a year’s worth of development to all our customers all at once – a handful of minor changes every month (or week) is much easier to cope with.

However, it also means that Foliotek is never exactly the same as it was the week before.  Every time something changes, there is some risk that something breaks.   We handle this risk in two ways:

  1. We test extremely thoroughly
  2. We fix any problems that arise within about a week (severe problems usually the same day)

At first, we did all testing manually.  This is the best way to test, assuming you have enough good testers with enough time to do it well.  Good testers can’t be just anyone – they have to have a thorough knowledge of how the system should work, they have to care that it does work perfectly, and they have to have a feel for how they might try to break things.  Having enough people like this with enough time to do testing is expensive.

Over time, two related things happened.  First, we added more developers to the project and started building more, faster.  Second, the system grew bigger and more complex.

As more people developed on it and the system grew more complex, our testing needs grew dramatically.  More complexity and more developers meant much, much more potential for side effects: problems where one change affects a different (but subtly related) subsystem.  Side effects are by their nature very hard to predict.  The only way to catch them was to test EVERYTHING any time ANYTHING changed.

We didn’t have enough experienced testers to do that every month (new development release), let alone every week (bugfix release).

To deal with that, we started by writing a manual regression test script to run through each week.  While this didn’t free up any time overall, it did mean that once the test was written well, anyone could execute it.  This was doable because we had interns who had to be around to help handle support calls anyway, and they were only intermittently busy.  In their free time they could execute the tests.

Another route we could have gone would have been to write automated unit tests (http://en.wikipedia.org/wiki/Unit_testing).  Basically, these are tiny contracts the developers would write that say something like “calling the Add function on the User class with name Luke will result in the User database table having a new row with name Luke”.  Each time the project is built, the contracts are verified.  This is great for projects like code libraries and APIs, where the product of the project IS the result of each function.  For a web application, though, the product is the complex interaction of functions and the on-screen behavior they produce.  There are lots of ways that the individual functions could all be correct while the behavior still fails.  It is also difficult, sometimes impossible, to unit test the client-side parts of a web application: JavaScript, AJAX, CSS, etc.  For us, unit testing would have cost a non-trivial amount (building and maintaining the tests) for a trivial gain.
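To make the idea of such a contract concrete, here is a minimal sketch of the “Add a user named Luke” example in Python.  The User class, the users table, and the in-memory SQLite database are all assumptions made for illustration; none of this is Foliotek’s actual code or schema.

```python
import sqlite3
import unittest


class User:
    """Hypothetical User class; the post names it but does not show its code."""

    def __init__(self, connection):
        self.connection = connection

    def add(self, name):
        # Insert one row into the (assumed) users table for the given name.
        self.connection.execute("INSERT INTO users (name) VALUES (?)", (name,))
        self.connection.commit()


class UserAddTest(unittest.TestCase):
    """The contract: adding 'Luke' results in a new row with name 'Luke'."""

    def setUp(self):
        # A fresh in-memory database per test keeps the contract isolated.
        self.connection = sqlite3.connect(":memory:")
        self.connection.execute("CREATE TABLE users (name TEXT NOT NULL)")

    def tearDown(self):
        self.connection.close()

    def test_add_creates_row_with_name(self):
        User(self.connection).add("Luke")
        rows = self.connection.execute("SELECT name FROM users").fetchall()
        self.assertEqual(rows, [("Luke",)])
```

Saved as a module, this can be run with Python’s built-in runner (python -m unittest) as part of a build.  Note what the contract does not cover: nothing here checks that the “Add user” page renders, that its button works, or that the new user shows up on screen.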

Eventually, we discovered the Selenium project (http://seleniumhq.org/download/).  The idea of Selenium is basically to take our manual regression test scripts and rewrite them so that a computer can run them automatically in a browser, (pretty much) just like a human tester would.  This allows us to greatly expand our regression test coverage, and to run it for every single change we make and release.

Here are the Selenium tools we use and what we use them for:

  • Selenium IDE (http://release.seleniumhq.org/selenium-ide/): A Firefox plugin that lets you quickly create tests using a ‘record’ function that builds them out of your clicks, lets you manually edit them to make your tests more complex, and runs them in Firefox.
  • Selenium RC (http://selenium.googlecode.com/files/selenium-remote-control-1.0.3.zip): A Java application that will take the tests you create with Selenium IDE and run them in multiple browsers (Firefox, IE, Chrome, etc.).  It runs from the command line, so it’s fairly easy to automate test runs as build actions, etc. as well.
  • Sauce RC (http://saucelabs.com/downloads): A fork of RC that adds a web UI on top of the command line interface.  It’s useful for quickly debugging tests that don’t execute properly in non-Firefox browsers.  It also integrates with SauceLabs, a service that lets you run your tests in the cloud on multiple operating systems and browsers (for a fee).
  • BrowserMob (http://browsermob.com/performance-testing): An online service that will take your Selenium scripts and use them to generate real user traffic on your site.  Essentially, it spawns as many real machines and instances of Firefox as it needs to run your test many times at once, each just as you would run it locally, for a fee.  It costs less than $10 to test up to 25 “real browser users”, which can actually map to many more human users than that, since the automated test doesn’t pause to think between clicks.  It gets expensive quickly to test more users than that.
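Because Selenium RC runs from the command line, a build step can drive it with a small script.  The Python sketch below builds one RC command per browser using the server’s -htmlSuite option, which takes a browser launcher string, a start URL, a suite file, and a results file.  The jar name, URL, and file names are placeholders, and treating a non-zero exit status as “tests failed” is an assumption you should check against your RC version (the results file is the authoritative record).

```python
import subprocess


def build_html_suite_command(browser, start_url, suite_file, results_file,
                             server_jar="selenium-server.jar"):
    """Build the command line that runs an HTML test suite through Selenium RC.

    server_jar is a placeholder path; point it at the jar from your RC download.
    """
    return [
        "java", "-jar", server_jar,
        "-htmlSuite", browser, start_url, suite_file, results_file,
    ]


def commands_for_browsers(browsers, start_url, suite_file):
    """Build one command per browser, each writing to its own results file."""
    return [
        build_html_suite_command(
            browser, start_url, suite_file,
            "results-%s.html" % browser.lstrip("*"),
        )
        for browser in browsers
    ]


def run_suite(command):
    """Run one suite; assumes the server exits with status 0 only on success."""
    return subprocess.run(command).returncode == 0
```

A build action could then call run_suite on each command returned by commands_for_browsers(["*firefox", "*iexplore"], "http://localhost/", "suite.html") and fail the build if any call returns False.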

Selenium is a huge boon for us.  We took the manual tests that would occupy a tester for as much as a day, and made it possible to run those same tests with minimal interaction in a half hour or less.  We’ll be able to cover more test cases and run them more often, even running them as development occurs to catch issues earlier.

In my next post, I’ll talk about the details of how you build tests, run them, maintain them, etc. with the tools mentioned above.