Automa Blog

The Bug Hunt Game at EuroSTAR 2013

Update 05/11/2013: The Bug Hunt Game will take place on Wednesday, Nov 6 in the Community Hub at EuroSTAR. Come find us!

As explained in a previous blog post, we won two places for this year's EuroSTAR testing conference in Gothenburg, Sweden. We can't wait to attend this great event, and have prepared a special game for the other attendees, to let them relax, have fun and meet new people.

The rules of the game are the following: Each player is either a "bug" or a "bug hunter". Players receive badges that show which of the two they are. When a bug and a bug hunter meet, they play rock paper scissors. The winner takes one of the other player's lives. A scoreboard will be kept and at the end of the day the winner (i.e. the person with the most lives) will be announced. We're curious to see whether it will be a bug or a bug hunter!

The game was developed during a workshop by Oana Juncu at Agile Testing Days 2013, in close collaboration with Jesper Lottosen (@jlottosen on Twitter).

The game will most likely take place on Wednesday or Thursday. You can sign up for it in the EuroSTAR Community Hub. We will publish further details here and via our Twitter account.

See you at the conference and, as always, happy automating! :-)

The number one rule for dealing with unstable builds

We'll soon have some pretty revolutionary news for you. While we're working hard to make it happen, here is a short post about our experiences of how best to deal with an unstable build.

Suppose you have a build that passes most of the time but fails with irreproducible errors every so often. Maybe the build accesses an unreliable service, or performs some automated GUI tests that are not as stable as they should be.

The next time the build fails, do not simply run it again. This is worth repeating: Do not simply re-run a broken build in the hope that it will magically succeed. Yes, you are busy working on something else and really don't have time to deal with this instability now. But if you ignore instabilities like this too often, you start going down a very dangerous route.

Every time the build fails, you are given the unique opportunity to fix one of your build's (or even software's!) instabilities. This is great! You get to experience first-hand how a bug that will most likely affect your users manifests itself. Now is your chance to fix it. If you simply re-run the build, you will most likely lose valuable information, such as log files.

If you make it a habit to simply re-run the build when an instability occurs, it is likely that more and more instabilities will sneak into your program. The failing builds cost you more and more time, and you feel even more like the last thing you have time for is to fix the occasional failing build. A vicious circle.

If you really don't have time to investigate an instability when it occurs, at least make time to save all information such as log files that might give you a clue as to what caused the problem. Once you have completed your pressing, pending task, go back and investigate what caused the build to fail.

Sometimes when an instability occurs, you don't have enough information to find out what caused it. Do not take this as an excuse to ignore the instability. Add more logging that might let you find out what the cause of the problem is the next time the instability occurs.

One approach we have had very good experiences with is to keep logs of when each particular instability occurred. The problem with an instability is that, since it is not reproducible, it is often not possible to test whether an attempted fix actually works. If you have logs of when the instability occurred, you can estimate roughly how often it occurs (e.g. once a week, or on every 10th build). If, after your fix attempt, you then don't see the instability for two weeks or 20 builds in the previous example, you can be pretty confident that your fix attempt was indeed successful.
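To put a rough number on "pretty confident", here is a minimal Python sketch. It rests on an assumption that is ours and not something we measured: every build fails independently of the others and with the same probability. The function names are made up for illustration.

```python
import math


def chance_of_clean_streak(failure_rate, builds):
    """Probability of `builds` passing builds in a row by pure luck,
    if the instability is NOT fixed and still strikes with `failure_rate`.
    Assumes that builds fail independently of each other."""
    return (1.0 - failure_rate) ** builds


def builds_needed(failure_rate, confidence):
    """Smallest number of consecutive clean builds after which the chance
    of a lucky streak drops below 1 - confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # Instability seen on roughly every 10th build, followed by 20 clean builds.
    luck = chance_of_clean_streak(0.1, 20)
    print("Chance the streak was luck: %.0f%%" % (luck * 100))
    print("Clean builds needed for 95%% confidence: %d" % builds_needed(0.1, 0.95))
```

Under these assumptions, 20 clean builds still leave roughly a 12% chance that the streak was luck, so it can pay to wait a little longer before closing the case.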

We'll be at the Agile Testing Days in Potsdam, Germany tomorrow. If you see us around, do come and say hi!

Happy automating! :-)

The Test Hourglass

An interesting question in the context of test automation is what proportion of your tests should be unit tests, integration tests or GUI (/system) tests. A common answer to this question (especially in Agile circles) is given by Mike Cohn's test pyramid:

Test Automation Pyramid

The test pyramid argues that automated tests that run against the user interface of an application are slow, brittle and difficult to maintain, and that you should therefore have many more low-level unit tests than high-level GUI tests. The test pyramid also argues for a medium amount of "service tests", which are similar to UI tests, but use programmatic interfaces rather than the GUI to drive the application.

The test pyramid is right in many ways. It is true that tests which operate the GUI are slower and more brittle than pure unit tests. It also makes a lot of sense to use a solid amount of service tests for verifying that the components that were tested in isolation by the unit tests then work together as expected.

One aspect the test pyramid does not describe is how the ratios of different types of automated tests change as the project progresses. Tools such as Automa are increasingly used to capture GUI tests before the implementation. This style of development has many advantages, including better fleshed-out requirements due to the precise nature of executable tests, and better test coverage. When you follow such an approach, you initially have more GUI tests than service- or unit tests. At least for a short time, you thus do not adhere to the test pyramid.

The theory behind the test pyramid also does not mention several of the advantages of GUI tests. First and foremost, GUI (/system) tests are the only type of test that can really check that an application meets the customer's requirements. Saying that the unit and service tests pass will not satisfy a customer when there is a bug that can be reproduced through the GUI. Second, an advantage of system tests is that they don't have to change as often as service- or unit tests: Whenever the code changes, it is likely that a unit test will have to be updated to reflect the new implementation. On the other hand, the system tests only have to be updated when the code change actually leads to a change in the user-visible behaviour of the application. In the common case of a refactoring, for instance, this would not be the case.

So what does our test portfolio look like? Firstly, it must be pointed out that our product, Automa, comes as a Python library or console application, hence does not have a GUI. On the other hand, Automa is a tool for GUI automation, so testing its functionality involves operating a GUI, even if it is not Automa's own. A typical example of a system test for Automa would be to use its GUI automation commands to start the text editor Notepad, write some text, save the file and verify that the file has the correct contents. The commands would be sent through the command line, but result in an action in the GUI. In this way, automated system tests for Automa are not GUI tests in the pure sense of the definition, but share a lot of the characteristics of an average GUI test.

As explained in a previous post, our development proceeds in an Acceptance Test-Driven style: When starting work on a new feature, we first define a set of examples that describe the required behaviour. These examples are then turned into automated acceptance tests. Only when this is finished do we begin with the implementation.

During the implementation, we write unit-, integration- and service tests to help us with our development. We do not adhere to a dogma of having to have such and such percentage of test coverage, but rather use common sense to determine when it makes sense to add a test. This keeps our test portfolio slim, while including checks for the most important functionalities.

Since our product does not have a GUI, we cannot readily apply the concepts of the test pyramid to our process. For us, it makes more sense to speak of system-, integration- and unit tests as comprising our test portfolio. System tests are those that operate the executable binary of Automa through its command line interface. This is essentially done by piping the input and output of Automa.exe using Windows' command redirection operators > and <. The next level in our test hierarchy are integration tests, which use Automa as a Python library. They test that Automa's Python API works, but also on a more technical level that Automa's internal components cooperate as expected. Finally, we have a significant number of unit tests that check the correctness of Automa's individual functions and classes.
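For readers who prefer to drive a console program from Python rather than from the shell, the redirection idea can be sketched as follows. This is not our actual test harness: the helper name run_with_redirected_io is hypothetical, and the example uses the Python interpreter itself as a stand-in for Automa.exe so that it runs anywhere (Python 3.7 or later is assumed).

```python
import subprocess
import sys


def run_with_redirected_io(command, input_text, timeout_secs=30):
    """Run `command`, feed `input_text` to its stdin and return its stdout.

    This is roughly what `command < input.txt > output.txt` does in a shell,
    except that input and output are strings instead of files."""
    completed = subprocess.run(
        command,
        input=input_text,
        capture_output=True,
        text=True,
        timeout=timeout_secs,
    )
    return completed.stdout


if __name__ == "__main__":
    # Stand-in console program: upper-cases whatever arrives on stdin.
    upper = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(run_with_redirected_io(upper, "hello\n").strip())
```

A system test then boils down to comparing the returned output with the expected one.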

Using the classification from the previous paragraph, Automa's test portfolio currently consists of 38% system tests, 16% integration tests and 45% unit tests. In a picture, this would look roughly like an hourglass:

Test Hourglass

Each of the compartments of the hourglass has an area that corresponds to the relative percentage of the respective type of test in our portfolio. We call this the Test Hourglass.

The test hourglass clearly has a very different shape from the test pyramid. Does this mean we're doing something wrong? Not necessarily. Firstly, as already mentioned a few times, the test pyramid does not readily apply to our case since our system tests do not check a GUI in the conventional sense. Second, we are very satisfied with our test portfolio. It captures all cases that are important to us, yet requires little maintenance effort when some existing functionality does change. It is also not too slow, because we frequently optimize both our product and the test suite for performance. All in all, it can be said that our test portfolio is one of our greatest assets.

The hourglass shape of our test portfolio might be the result of several factors. First, the acceptance test-driven development style naturally leads to a large number of system tests. Second, while our system tests are effectively automating a GUI when exercising Automa's API, the interface to our application is not graphical. The advantages of a service test over a GUI test in the test pyramid come from avoiding brittle and slow GUI operations. If the interface to the system under test is not graphical, as in our case or for instance for a web service, then not much is to be gained by choosing a service test over a system test. We suspect that the test hourglass could be for systems with a non-graphical interface what the test pyramid is for graphical applications.

The test pyramid provides a useful guideline for applications with a graphical interface. However, as with all guidelines, it needs to be evaluated in the context in which it is to be applied. Novel GUI automation tools might make it feasible to have a higher ratio of GUI tests than in the past. For applications without a graphical interface, the test hourglass might present a more applicable guideline than the test pyramid. In all cases, it is important to choose the approach that is right for your particular situation, rather than blindly following an established dogma.

This article is hosted on the Automa Blog. If you would like to know more about Automa, we welcome you to visit our home page.

Happy automating!


  • The Test Pyramid is a concept developed by Mike Cohn. It is described in his book Succeeding with Agile.
  • Martin Fowler's Bliki has a good summary of the main ideas and caveats of the test pyramid.
  • Mike Cohn also describes the test pyramid on his blog.

Automa 2.1: Deadly!

We just released version 2.1 of Automa. It contains a deadly (but very useful ;-) ) new feature.

The typical steps in a GUI automation case are:

  1. Start the application you wish to automate.
  2. Perform GUI automation.
  3. Close the application.

The last step ensures that the system is left in the same state as before the automation, and thus ready for the next automation run. In particular, you want to kill all applications and windows that were opened by the GUI automation.

A problem with step 3 is that unexpected windows might have popped up during the GUI automation in step 2, which the logic in step 3 is not prepared to handle (think "Automatic Updates"). Another problem is that closing the application might itself bring up additional windows or pop-ups such as "You have unsaved changes. Do you want to save?".

The solution to these problems is to not close the application and its opened windows using normal GUI operations such as click("File", "Exit"). Instead, one simply wants to kill the entire application, just like what you do when you force-close a program using the Windows task manager.

The new version of Automa brings with it a new command kill(...) and a new predicate Application(...) that let you do just that. When you start an application via start(...), you now get an Application object as a result:

>>> start("Notepad")
Application("Untitled - Notepad")

The result corresponds to the same-named entry in the "Applications" tab of the Windows task manager:

Notepad Application in the task manager

After you have performed your automation, you can use the new kill(...) command to close your application and all associated windows:

notepad = start("Notepad")
# Perform GUI Automation
kill(notepad)

Independently of which windows were open in the application (e.g. a "Save As" dialog) or would have opened during a normal closing ("Do you want to save?"), the kill(...) statement closes the open application. This ensures that your test system is left in exactly the same state it was in before the automation, and is thus ready for the next automation run.
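One way to make sure that the kill(...) call always runs, even when the automation in between raises an exception, is a small context manager. The following is only a sketch and not part of Automa's API: the helper name running and its parameters are hypothetical, and the start and kill functions are passed in as arguments so that the example also runs without Automa installed.

```python
from contextlib import contextmanager


@contextmanager
def running(start, kill, name):
    """Start an application, hand it to the `with` block and kill it
    afterwards, no matter how the block exits."""
    app = start(name)
    try:
        yield app
    finally:
        kill(app)


if __name__ == "__main__":
    # Fake start/kill functions (assumption: stand-ins for Automa's own).
    log = []

    def fake_start(name):
        log.append("start " + name)
        return name

    def fake_kill(app):
        log.append("kill " + app)

    with running(fake_start, fake_kill, "Notepad") as notepad:
        log.append("automating " + notepad)
    print(log)
```

With Automa's API imported, the same pattern would read: with running(start, kill, "Notepad") as notepad, followed by the automation steps.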

The two new features of course seamlessly combine with Automa's existing API:

>>> notepad = start("Notepad")
>>> start("Calculator")
>>> switch_to(notepad)
>>> find_all(Application())
[Application('Untitled - Notepad'), Application('Calculator')]
>>> kill("Calculator")
>>> find_all(Application())
[Application('Untitled - Notepad')]
>>> kill(notepad)
>>> find_all(Application())
[]

Deadly, useful or both? See for yourself on our download page ;-)

Happy automating!

Automa 1.9.0: A Sikuli Alternative

Lots of big news for us these weeks :-) We just released version 1.9.0 of Automa, our next generation GUI automation tool. For this version, we completely rewrote Automa's image search algorithms. These are used when you use the Image predicate to find (and interact with) an image shown on the screen. The new algorithms are more robust with respect to differences between the sought image and the actual image on screen. What's more, the new algorithms are compatible with the widely used Sikuli image automation tool.

Sikuli is a visual technology to automate and test graphical user interfaces using images (screenshots). It is open source and cross-platform. It supports text recognition using OCR, but unlike Automa does not support distinguishing different types of GUI elements or extracting data from them.

The image search algorithm used in the new version of Automa is fully compatible with that of Sikuli. That is, if you used Sikuli to search for an image "my_button.png" with a minimum similarity of 0.8 (say), then the following Automa command will return exactly the same results:

Image("my_button.png", min_similarity=0.8)

As always, you can combine this command with Automa's other functions. For example:

click(Image("my_button.png", min_similarity=0.8))

We like to think that this makes Automa's GUI element search strictly more powerful than Sikuli's. With the new algorithm, Automa can find any image that Sikuli can find, plus it lets you:

So, what are you waiting for? Grab your copy of Automa from our download page!

Happy automating ;-)

Website testing using Python and Automa

After each release, we download the newest version of Automa from our website, in exactly the same way as our users would do. We then quickly launch it to make sure everything works. We do this to make sure that the newest version of our product was deployed to the server and that it is not corrupted in any way.

The steps required to download the file can be easily automated using... Automa! And, by using Python's unittest library, we can add some assertions to make sure that the website is working correctly and that the file was indeed downloaded. Today I quickly wrote the appropriate test script, which I would like to share.

In order to start, you need to download the Automa ZIP archive and extract it somewhere on your disk (for example, extract its contents into C:\Automa). Then edit (or add) the PYTHONPATH environment variable so that it contains the path to the file (for example, PYTHONPATH=C:\Automa\). This allows importing and using the Automa API in your Python scripts:

from automa.api import *

Then, in any text editor (for example, Notepad), you can write a test script which uses Automa's API, save it with the *.py extension and run it using the Python interpreter. In order to test that downloading Automa works fine, I wrote the following test case:

from automa.api import *
from os.path import expanduser, exists, join
import unittest
class TestAutomaWebsite(unittest.TestCase):
    def setUp(self):
        # set-up steps omitted
        pass
    def test_download(self):
        # get downloads directory
        zip_path = join(expanduser("~"), "Downloads", "")
        self.assertFalse(exists(zip_path))
        # assert that the website is up and running and has been loaded
        write("BugFree Software")
        # if zip is not selected, then select it
        zip_radio_button = Image("zip_radio_button.png")
        if zip_radio_button.exists:
            click(zip_radio_button)
        # if Firefox opens window asking what to do with the file..
        if Image("download_window.png").exists:
            click("Save File")
        # close the "Downloads" window
        press(ALT + F4)
        # make sure that the correct page is displayed
        # check that the file has been downloaded
    def tearDown(self):
        # close Firefox
        press(ALT + F4)
if __name__ == "__main__":
    unittest.main()

As you can see, I used multiple *.png images in the test script. With Automa, you always have the choice of using either the image-based or text-based automation. Normally, the text-based automation is preferred as it is more robust and test scripts do not break if GUI elements change visually. However, in some cases, for example when it is difficult to refer to UI elements by text (for example, when they don't have any label assigned to them), you can always fall back to *.png images. In the above script I used the following images:


In order to run the test script, I saved it together with all the images in one directory and ran the Python interpreter from the command prompt:

Test Run Success

And that is it! After running this automated test script I am sure that our website is up and running and that Automa can be downloaded without any issues. You can try it yourself by downloading the archive below, which contains the script together with all the images.

Happy testing and automating! As always, we'd be very happy to hear what you think about our tool in the comments below :-)

WPF GUI Testing with Automa

Recently we wrote a simple WPF application to test Automa. It consists simply of a few text fields and buttons, which Automa can operate and whose state it can read. We believe that it might be worth sharing the details of our WPF tests, as they might be useful for anyone who develops applications with this technology.

This is what our test application looks like:

WPF Test Application

We can use Automa to read values of the text fields, press the buttons, read their state etc. In order to start, after opening Automa, we need to launch our simple WPF test application. Assuming the executable is located on the C drive, we run the following command:


Then we can easily interact with the application using Automa's API. For instance, we can perform the following operations on the text fields:

>>> example_text_field = TextField("Example Text Field")
>>> example_text_field.exists
>>> example_text_field.value
'Lorem ipsum'
>>> example_text_field.is_enabled
>>> example_text_field.is_readonly

Of course, when you ask whether the 'Disabled Text Field' text field is enabled, Automa will correctly give you the answer:

>>> TextField("Disabled Text Field").is_enabled

Similarly, Automa finds and allows you to read the state of the WPF buttons:

>>> Button("Enabled Button").exists
>>> Button("Enabled Button").is_enabled
>>> Button("Disabled Button").is_enabled

As you can see this is quite powerful and enables your application to perform actions on GUI elements only when it is possible to do so. For example:

if TextField("File Name").exists and TextField("File Name").is_enabled:
    write("MyFileName", into="File Name")

Or we can check whether a button exists and is enabled before clicking it:

if Button("Submit").exists and Button("Submit").is_enabled:
    click("Submit")
else:
    print("Couldn't click the 'Submit' button!")

You can also write any GUI tests you like. It's easy to import Automa's API as a Python library, combine it with the unittest library, and write assertions in your test class (extending TestCase), such as:


You can download our simple WPF test app to try it out yourself!


Why automation does not replace manual testing

Quite often when we explain the goal of our GUI test automation tool Automatest to someone from a less technical background, we get the response "So you're stealing the testers' jobs". This is a common misconception and this post explains why.

The main cost benefit of automation comes from the time you save: Whenever an automated process performs a task for you, you effectively gain the time it would have taken you to perform the task manually. The more long-running and frequent your process is, the more you gain from automating it. For this reason, it pays off most to automate tasks that take a long time to perform manually and have to be executed often.
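The arithmetic behind this argument fits into a few lines of Python. The sketch below is a simplification under assumptions of our own: all effort is measured in minutes, and an automated run costs only a fixed maintenance overhead. The function names and the example figures are invented for illustration.

```python
import math


def net_minutes_saved(manual_minutes, runs, automation_minutes,
                      maintenance_minutes_per_run=0.0):
    """Time gained by automating a task that would otherwise be done
    manually `runs` times, minus the effort of automating it."""
    saved = manual_minutes * runs
    spent = automation_minutes + maintenance_minutes_per_run * runs
    return saved - spent


def break_even_runs(manual_minutes, automation_minutes,
                    maintenance_minutes_per_run=0.0):
    """Number of runs after which the automation has paid for itself,
    or None if it never does."""
    gain_per_run = manual_minutes - maintenance_minutes_per_run
    if gain_per_run <= 0:
        return None
    return math.ceil(automation_minutes / gain_per_run)


if __name__ == "__main__":
    # Invented example: a 30-minute manual check, 4 hours to automate,
    # 2 minutes of upkeep per automated run.
    print(break_even_runs(30, 240, 2))
    print(net_minutes_saved(30, 50, 240, 2))
```

The longer the manual task and the more often it runs, the sooner the break-even point is reached, which is exactly why regression tests are such good candidates.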

Testing a new version of a piece of software consists of two parts: A regression test of the old functionality and a test of the new functionality. The regression test usually takes a long time and has to be performed often, hence should be automated. The test that the new functionality meets its specification has to be performed for the first time but will be a part of the regression test from the next release. Therefore, it should also be automated.

The last paragraph makes it sound like we have everything covered, but a crucial step is missing: New functionality that is added to a system might - will! - interact with the existing code base in unforeseen ways. While the interaction of existing and new functionality should be part of the automated tests, it is impossible to predict all places where the two generations of functionality might collide. This is where a lately trending approach called exploratory testing comes in.

In exploratory testing, a tester uses his or her creativity, analytic skills and experience to learn about the application under test and find new, unaccounted for ways of highlighting defects. Since this is a highly cognitive process that requires intuition and original human thinking, it cannot be automated.

There are many other situations where a computer cannot replace the insight, experience and understanding of a tester. The role of a tester demands a lot of personal communication, often involving the feelings of others. A good tester has the drive to find defects, a quality machines do not possess. Finally, software is made for humans with particular thoughts, needs and feelings and thus has to be tested by someone who can understand and experience these emotions in the same way.

By making it possible to automate repetitive tasks with our GUI test automation tool Automatest, we hope to let everybody do what they do best: Machines the grunt work and testers the creative, analytic and insightful process that is necessary for ensuring that a piece of software meets the needs of another human being.

Kill Bugs Dead

While our name BugFree Software certainly represents one of our core values, it goes without saying that all software of any appreciable complexity, including our products, will contain bugs. All we can do - what we have to do - is to fix any bugs we find and make sure they never return.

We recently had a bug where Automa would fail to start without displaying an error message to the user when the installation directory was not writeable. This would for instance occur when the user installed Automa with admin privileges to a folder requiring such privileges, but then started Automa without an elevated account. Since we require Automa's installation directory to be writeable, the fix to the problem was to display an error message telling the user that they might have to start Automa with administrator privileges:

Error message for non-writeable installation directory

In the spirit of acceptance test driven development, the first step in fixing the above problem consisted of writing an automated system test that captures the erroneous behaviour. The backbone of our system tests for Automa is a little Python class that allows sending inputs to and checking outputs from a console process:

class ApplicationConsoleTester(object):
    def send_input(self, input):
        # Implementation details hidden...
        pass
    def expect_output(self, expected_output):
        # Implementation details hidden...
        pass

You (roughly) use it like so:

cmd = ApplicationConsoleTester("cmd")
cmd.send_input("echo Hello World!")
cmd.expect_output("Hello World!")

When the expected output is not received within a given timeout, an AssertionError is raised. This makes it very easy to use ApplicationConsoleTester in conjunction with one of the unit testing frameworks available for Python.
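For the curious, a class with this behaviour could be sketched as follows. This is not our actual implementation, which stays hidden: the class name ConsoleTester, the timeout_secs parameter and the close method are assumptions made for this example, and the code is written for Python 3.7 or later. A background thread collects the output lines so that waiting for them can time out.

```python
import queue
import subprocess
import sys
import threading
import time


class ConsoleTester:
    """Sends input to a console process and checks its output."""

    def __init__(self, command, timeout_secs=5.0):
        self.timeout_secs = timeout_secs
        self.process = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            bufsize=1,  # line-buffered pipes
        )
        self._lines = queue.Queue()
        # Read output in the background so expect_output never blocks forever.
        threading.Thread(target=self._pump_output, daemon=True).start()

    def _pump_output(self):
        for line in self.process.stdout:
            self._lines.put(line.rstrip("\r\n"))

    def send_input(self, text):
        self.process.stdin.write(text + "\n")
        self.process.stdin.flush()

    def expect_output(self, expected_output):
        """Wait until a line containing `expected_output` arrives; raise
        AssertionError if that does not happen within the timeout."""
        message = "Did not see %r within %s seconds" % (
            expected_output, self.timeout_secs)
        deadline = time.monotonic() + self.timeout_secs
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                raise AssertionError(message)
            try:
                line = self._lines.get(timeout=remaining)
            except queue.Empty:
                raise AssertionError(message) from None
            if expected_output in line:
                return

    def close(self):
        self.process.stdin.close()
        self.process.wait(timeout=self.timeout_secs)


if __name__ == "__main__":
    # Stand-in console program: echoes every input line in upper case.
    echo = [sys.executable, "-u", "-c",
            "import sys\nfor line in sys.stdin:\n"
            "    print(line.strip().upper(), flush=True)"]
    tester = ConsoleTester(echo)
    tester.send_input("hello world")
    tester.expect_output("HELLO WORLD")
    tester.close()
```

Note that this sketch matches line by line, so an expected output that spans several lines would have to be checked with several expect_output calls.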

To highlight the above bug, we wrote the following Python TestCase:

class NonWriteableInstallDirST(unittest.TestCase):
    # Some implementation details hidden...
    def test_error_message_for_non_writeable_install_dir(self):
        automa_tester = ApplicationConsoleTester("Automa.exe")
        automa_tester.expect_output(
            "Cannot write to Automa's installation directory. If you "
            "installed Automa with administrator privileges then you "
            "might also have to start Automa with those privileges.\n"
            "\nType Enter to quit."
        )
        # A further check (details hidden) fails with the message:
        #     "Automa did not give the user enough time to "
        #     "see the error message."

Once we had written the test and seen it fail (in the style of good (acceptance)-test driven development), it was easy to fix the bug.

Having the test automated allows us to execute it every time Automa is built. This ensures that the bug will never occur again. But there is another benefit of having such a system test: By discovering and fixing the bug, we have effectively enriched Automa's feature set. Keeping the system tests in sync with development means that the system test suite forms a comprehensive, up-to-date documentation of the required functionality. There is no more partial knowledge as to what works and what doesn't, possibly spread amongst multiple individuals. There is only one truth: that determined by the tests.

The costs of a bug

Update 06/08/2012: Elisabeth Hendrickson wrote an interesting blog entry on the costs of not fixing a bug. You can find it in the section 'Further Reading' below.

Bugs are an inevitable aspect of software development and much effort is spent on preventing them (for example via code reviews), catching them early (via testing), tracking and fixing them. We don't like bugs, but how much effort should we reasonably invest into keeping them out of our systems? When does it make sense not to fix a bug? In order to answer these questions, we need an understanding of the costs involved in having a bug and those involved in fixing it. This is what this article sets out to explain.

The cheapest bug is the one you never had

Many techniques and tools exist to prevent bugs - code reviews, test driven development, code analysis tools and design by contract, to name but a few. These methods actually all have the greater effect of improving code quality, which naturally leads to fewer bugs being introduced. One of our aims is to help our clients write better software, i.e. higher quality code.

The costs of fixing a bug late

Despite all proactive measures, it will happen that a bug gets introduced into the system without our immediately noticing. How much effort should we put into trying to detect and fix the bug early, before it reaches our users? How expensive is it if we miss the bug before it goes live and we have to ship a fix in a later release? The following figure is an attempt at a quantification:

The timeline contains several key events:

  • A: The bug is introduced into the system, for example by committing it to the common code base.
  • B - after less than one day: The bug didn't occur, you made a few changes and now it does. Trivial to fix.
  • C - after a few days: You still remember the changes that may have caused the bug but the fix may require you to revisit some decisions you have made in the meantime. Still, the bug is not too hard to fix.
  • D - Go-live: The bug starts to affect your customers and/or the integrity of your application. Deploying a fix is more difficult now.
  • E: The bug has been in production for a while and the affected code parts have not been touched in quite some time. Your users may have learned to live with the bug; fixing it, however, is difficult because it is buried deep inside the system and you hardly remember the changes that have caused it.
  • F: The bug has been in the system for a long time and the person or team that introduced it no longer works at your company. Fixing it would be very difficult if not infeasible.

The green line indicates the accumulated costs (that is, the sum of all costs over time) of:

  1. documenting the bug once it is noticed. This cost is 0 if the bug is found and immediately fixed by the engineer who introduced the bug. It increases as time progresses and a tester or even a user through customer support reports the bug.
  2. context switching - putting aside other tasks and getting an understanding of the functional and technical background of the bug. Similarly to the cost of documenting the bug, this cost is 0 if the engineer who introduced the bug finds and immediately fixes it. The cost is extremely high when the knowledge of the bug's background no longer resides within the team, for instance at point (F).
  3. identifying the bug's cause. This cost increases steadily as the system grows in complexity and the people who were around when the bug was introduced forget the changes that were made and/or leave the team/company.
  4. performing the fix. This cost increases as new functionality is added to the code base, possibly even relying on the "buggy" behaviour.
  5. risk of introducing a new bug while performing the fix. This risk is particularly high close to a deadline such as a release (D).
  6. deploying the fix. This cost rapidly increases once the bug has been released at point (D).
  7. effects on customers / your company's reputation. This includes customer support and cancellations of orders. It only takes effect once the bug is released at point (D). While the figure above shows accumulated costs and can thus never decline, the running costs of the effects of a bug on customers can decline because people get used to the faulty behaviour (provided the bug is not too critical, see the next paragraph).

Caveats and Summary

The above figure does not have any backing data and is purely based on personal experience. What's more, it clearly does not apply to all bugs - consider a typo in a label vs. a blocker that makes your system unusable. Another case is where a bug disappears on its own (hence cannot be counted as "fixed" with the costs outlined above) because the functionality that was affected by it is replaced or removed.

If we do take the above discussion as a starting point, several conclusions can be drawn. The most obvious one is that because the running costs (cf. point 7. above) of having a bug eventually decrease while the costs of fixing it (cf. points 2. - 6.) increase, there may be a moment after which fixing a bug does not pay off. Second, the figure shows just how important it is to find and fix bugs as early as possible. Fortunately, this is where Automa can help ;-).
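The first conclusion can be illustrated with a toy calculation in Python. Like the figure, it has no backing data: the cost numbers are invented, the function name is hypothetical, and the model simply assumes that we know, for a fixed number of periods, what fixing the bug would cost in each period and what the bug would cost us if left alone.

```python
def last_period_worth_fixing(fix_costs, running_costs):
    """Return the index of the last period in which fixing the bug is
    cheaper than the running costs still to come, or None if fixing
    never pays off.

    fix_costs[t]     -- cost of fixing the bug in period t
    running_costs[t] -- cost caused by the unfixed bug during period t
    """
    last = None
    for period, fix_cost in enumerate(fix_costs):
        future_running_costs = sum(running_costs[period:])
        if fix_cost < future_running_costs:
            last = period
    return last


if __name__ == "__main__":
    # Invented numbers: fixing gets more expensive over time, while the
    # running costs only start at go-live (period 2) and then fade out.
    fix_costs = [1, 2, 4, 8, 16]
    running_costs = [0, 0, 10, 5, 2]
    print(last_period_worth_fixing(fix_costs, running_costs))
```

With these made-up numbers, the fix stops paying off after period 2, the go-live: from then on it costs more than all the trouble the bug is still expected to cause.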

Further Reading

The following blog posts are interesting reads related to the above topic: