
December 4, 2007

Solving Intractable Problems

"A solution to a given problem is called optimal if one can prove that no better solution exists. Some skeptics might ask, Why should intuition rely on a rule of thumb instead of the optimal strategy? To solve a problem by optimization -- rather than by a rule of thumb -- implies both that an optimal solution exists and that a strategy exists to find it. Computers would seem to be the ideal tool for finding the best solution to a problem. Yet paradoxically, the advent of high-speed computers has opened our eyes to the fact that the best strategy often cannot be found." - Gerd Gigerenzer, Gut Feelings: The Intelligence of the Unconscious


An intractable problem is one that cannot be solved efficiently. These aren't necessarily problems for which no solution exists. Instead, these are problems for which analyzing all the options takes too long.

One of the biggest challenges in testing software is to select useful tests from infinite options. Even finite sets of options can create intractable problems.

For example, state model-based test automation generally requires creation of an explicit finite model. Models are less complex than the real thing -- if they aren't, they are copies, not models. Model-based test automation can be used to generate and execute tests for many more paths through a computer program than people are willing or able to try. However, testing all paths in a model for a non-trivial program can easily become an intractable problem.

Gerd Gigerenzer demonstrates this with a challenge in his book Gut Feelings. Gigerenzer asks readers to find the shortest route to visit 50 cities starting and ending at the same city. Think you can find the solution? There are only 12 different routes to visit 5 cities. So how many combinations would you need to check to visit 50 cities? According to Gigerenzer, there are approximately 300,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 possible routes to visit 50 cities. Even with the help of computers, we do not have the time to calculate the best route.
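Gigerenzer's figures are easy to verify. For a round trip where direction doesn't matter, fixing the starting city leaves (n-1)!/2 distinct routes. A quick check in Python:

```python
from math import factorial

def round_trip_routes(n_cities):
    """Distinct round trips through n cities: fix the starting city,
    order the remaining cities (n-1)! ways, then halve because each
    route and its reversal cover the same ground."""
    return factorial(n_cities - 1) // 2

print(round_trip_routes(5))   # 12
print(round_trip_routes(50))  # about 3 x 10**62
```

Even checking billions of routes per second, brute-forcing 3 x 10**62 possibilities is hopeless.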

Even relatively simple computer programs have significantly more data and path possibilities -- especially when we consider retracing paths as part of a larger path. There are potentially infinite path possibilities through even the simplest programs. (As I type each letter of this article into my computer, I am selecting an input from many more than 50 options.)

The problem of visiting 50 cities is solvable if we do not insist on finding the shortest route. Finding the shortest route and trying every possible path through a computer program are not really required to find a satisfactory solution.
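One such satisfactory strategy is the nearest-neighbor rule of thumb: always visit the closest city you haven't seen yet. It rarely finds the shortest tour, but it finds a reasonable one in O(n^2) steps instead of examining (n-1)!/2 routes. A sketch (the function and the city representation are mine, not Gigerenzer's):

```python
import math

def nearest_neighbor_route(cities):
    """Greedy rule of thumb: from the current city, always drive to the
    closest unvisited city, then return home at the end."""
    unvisited = list(cities[1:])
    route = [cities[0]]
    while unvisited:
        here = route[-1]
        nxt = min(unvisited, key=lambda c: math.dist(here, c))
        unvisited.remove(nxt)
        route.append(nxt)
    return route + [cities[0]]  # close the loop
```

The result is usually within a modest factor of optimal -- good enough, quickly, which is the whole point of a heuristic.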

Intractable problems like this require that we not consider all possibilities. Trying to consider all the options distracts us from what could be productive testing. Instead of considering all options, we only need to consider enough options to be satisfied. Otherwise, we will spend more time analyzing the possibilities than we do testing. Sometimes we just need to stop analyzing and go with our gut feelings.

"Gut feelings are based on surprisingly little information that makes them look untrustworthy in the eyes of our superego, which has internalized the credo that more is always better. Yet experiments demonstrate the amazing fact that less time and information can improve decisions. Less is more means there is some range of information, time, or alternatives where a smaller amount is better ..." - Gerd Gigerenzer, Gut Feelings: The Intelligence of the Unconscious

Gut feelings -- this unconscious thinking -- are based on heuristics, or rules of thumb. A heuristic is a problem-solving device that helps narrow an intractable problem into a solvable one. Heuristics are not laws. They do not apply to all situations. Two or more useful heuristics may even contradict one another. Gigerenzer states that the value of a heuristic depends on the context in which it is applied.

I have encountered many good testers who just seem to have a gift for breaking software. I once thought that this gift was something that could not be taught. I was wrong. Writers like (and including) Gerald Weinberg have shown me that teaching such problem solving is possible. Testing coaches James Bach and Michael Bolton have shown me that intuitive testing can be taught.

The secret to teaching such testing is to help good testers identify the heuristics that they use while testing. Once they are identified, they can be stated in such a way that they make sense to others. Then they can be named to make them easy to remember.*

As (or after) you test, think about why you do what you do and write it down. Don't think in terms of absolutes and programmable logic. Think in terms of rules of thumb. Then give each heuristic a name. Then share it.

If you are having trouble, here is a heuristic I've found useful for starting many things:

Start with what you recognize.

Hopefully starting will trigger many more ideas that are not obvious at the start.


* For some examples, look here for a bunch of links compiled by Brian Marick.

July 12, 2007

Woodpeckers, Pinatas, and Dead Horses

Here are some short blurbs about a few things I took away from CAST sessions.

From Lee Copeland's keynote address:
  • "It's nonsensical to talk about automated tests as if they were automated human testing."
  • Write or speak about something you're knowledgeable and passionate about.
  • Combine things from multiple disciplines.

From Harry Robinson's keynote address:
  • Weinberg's Second Law: If Builders Built Buildings The Way Programmers Write Programs, Then The First Woodpecker That Came Along Would Destroy Civilization.

From Esther Derby's keynote:
  • To successfully coach someone, they must want to be coached and want to be coached by you.

From James Bach's tutorial:
  • Pinata Heuristic: Keep beating at it until the candy comes out. ... and stop once the candy drops.
unless ...
  • Dead Horse Heuristic: You may be beating a dead horse.
yet beware ...
  • If it is a pinata, don't stop beating at it until the candy drops; but if it is a dead horse, your beating is bringing no value. It can be a challenge to determine if it's a pinata or a dead horse.
From Antti Kervinen's presentation:
  • Separate automation models into high level (behavior) and low level (behavior implementation) components to reuse test models on a variety of platforms and configurations.

More from James Bach's tutorial:
  • Testing does not break software. Testing dispels illusions.
  • Rational Unified Process is none of the three. (attributed to Jerry Weinberg)

From the tester exhibition:

  • Testing what can't be fixed or controlled may be of little value. Some things may not be worth testing.
  • There is great value in the diversity of approaches and skills on a test team.
  • It may be possible to beat a dead horse and test (and analyze) too much. Sometimes we should just stop testing and act on the information we have.

From Doug Hoffman's tutorial:
  • Record and playback automation can be very useful for testing for the same behavior with many configurations. And, once the script stops finding errors: throw it out.

From Keith Stobie's keynote:
  • Reduce the paths through your system to improve quality. Fewer features may be better.
  • Free web sites often have higher quality than subscription sites. This is because it is easy to measure the cost of downtime on ad-supported systems.

From David Gilbert's session:
  • People expect hurricanes to blow around and change path. We should expect the same with software development projects. (David has some interesting ideas about forecasting in software development.)
  • Numbers tell a story only in context. You must understand the story behind the numbers.
One more from James:
  • Keep Notes!


What did you take away from CAST?

July 8, 2007

Too much testing?

In a recent blog post, Jason Gorman provides some thoughts about the following question:

How much testing is too much?
To me, this is like asking "how much cheese would it take to sink a battleship?" There probably is an answer - a real amount of cheese that really would sink a battleship. But very few of us are ever likely to see that amount of cheese in one place in our lifetimes.
As Jason states, we may never encounter too much testing. However, I believe that we testers often include too much repetition in our testing and miss many bugs that are waiting to be discovered. This becomes especially likely when we limit our testing to scripted testing or put our test plans in freezers. Repeating scripted tests -- whether manual or automated -- is unlikely to find new bugs. To find new bugs, we testers need to step outside the path cleared by previous testing and explore new paths through the subject of our testing.


Executing the same tests over and over again is like a grade school teacher giving a class the same spelling list and test each week. The children will eventually learn to spell the words on the list and ace the test; but this does not help them learn to spell any new words. At some point, the repeated testing stops adding value.
"If you have men who will only come if they know there is a good road, I don't want them. I want men who will come if there is no road at all." - Dr. David Livingstone
Like Dr. Livingstone, I want testers that are willing to explore paths that have not been trampled by the testers that went before them. I want automated tests to go out like probe droids and bring back useful information. I want each manual tester on a team to think and explore the system under test in a different manner than the rest. There is a time and place for repeatable consistency, but that's just a part of testing. Real human users don't follow our test scripts. I don't want testing to be limited to testers and robots (automation) that follow scripts through pre-cleared paths.


Want to learn more about exploratory testing?


Try exploring the web with Google or your favorite search engine.

June 15, 2007

Modeling the Windows Calculator: Part 2

Adding Basic Validations

In the previous post, I created a simple model for starting and stopping the Windows calculator, and for switching between standard and scientific modes. I then created the code needed to execute that test and ran a test that hit each of the defined actions once.

As the next step, I reran the test with the MBTE configured to capture GUI object information as it executes. This created a checkpoint table that I then ran through a script that removes duplicates and combines rows that are the same for multiple states. I also manually reviewed this table to verify that the reported results are as I expected. I made some tweaks to the table based on my expectations. I can then use this checkpoint table as input for the next test execution. You may view the edited checkpoint file using the link below.

The checkpoint table is one of two table formats that I use for defining test oracles. I call the other format a state table. The state tables contain one validation per row and have additional fields for creating user-friendly descriptions of the validation. The state tables can also be used to reference external code for complex results validations. The checkpoint files contain one GUI object per row and the columns define the properties to validate and the expected values. While not as user-friendly as state tables, checkpoint tables are easy to automatically generate during test execution and reuse as input for future tests.
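To make the distinction concrete, here is a toy rendering of both formats in Python. The column names are my illustration of the idea, not the MBTE's actual schema:

```python
import csv
import io

# Checkpoint table: one GUI object per row; columns name the
# properties to validate and their expected values.
checkpoint_csv = """object,state,enabled,visible,text
btnEquals,calc.standard,True,True,=
mnuSci,calc.standard,True,True,Scientific
"""

# State table: one validation per row, with a user-friendly
# description and a failState telling the engine how to recover.
state_csv = """state,description,check,failState
calc,Calculator window exists while running,window_exists,restart
stopped,Calculator window gone after close,window_absent,restart
"""

for row in csv.DictReader(io.StringIO(checkpoint_csv)):
    print(row["object"], "->", row["text"])
```

Because the checkpoint format is one object per row, it is trivial for a tool to regenerate it from a live GUI; the state format trades that convenience for readability.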

My calculator checkpoint table currently contains only positive tests to ensure that expected objects appear as expected. It does not yet contain any validations to ensure that the unexpected does not occur. For example, it contains no check to ensure that the calculator stops when the window is closed.

I then created a state table and added two oracles stating that the calculator window should exist when running and not exist when stopped. I gave each of these a failState value of "restart" to indicate that if these checks fail, the application should be restarted to resume testing.

My model currently contains the following files:
I then ran a test with this model. The MBTE executed a test that hit each of my test set actions once without me needing to give it a sequence of test steps. The MBTE automatically generated the test steps based on the model.

The results from this test execution may be viewed here. Some features in the results require Internet Explorer and may not function in other browsers. These results are usually placed on a file server, so there may be issues I have not yet noticed when accessing them from a web server.

There are some failures reported in the results. These appear to be tool issues rather than bugs in the Windows Calculator. I will look into these failures later. Do you have any ideas about the failures?

The color-coded HTML results make it easy to tell what happened. Each row indicates what happened, where the action or validation was defined, the code executed, and other pertinent information. Please explore the results and send me any feedback.

What would you like to add to this test next? More validations? Additional actions?

Do you have any observations or questions about this automation approach? Please add them to the comments.


Modeling the Windows Calculator

June 14, 2007

Modeling The Windows Calculator: Part 1

I have received a number of requests for some sample models. Based on a question I received a couple weeks ago, I'd like to create a test model for the Windows Calculator. The Windows calculator contains some things that are very simple to model as a state machine (such as switching between standard and scientific modes) and other things that do not have clear distinguishable states (such as performing the actual calculations).

I plan to model the calculator a piece at a time in a series of blog posts. I welcome your input.

I will start by modeling the obvious states that I see in the Calculator's user interface.

At the highest level, I can partition the Calculator's behavior into two states: running or not running. Next, the calculator has two major modes of operation: standard and scientific. After a little experimentation, I see that if I stop the calculator it will return to the previous mode when it is restarted. These transitions can be modeled as follows:

calc.standard -> calc.scientific
calc.scientific -> calc.standard
calc.standard -> stopped.standard
calc.scientific -> stopped.scientific
stopped.standard -> calc.standard
stopped.scientific -> calc.scientific

One problem with implementing the above in a machine-executable form is that we don't know the state of the calculator the first time we start it. This requires that we code detection of the starting state at the start of the test. This can be done by modeling virtual states that have guarded transitions going out. For example, the following can be used to start the test. The state of "start" is my MBTE's starting state.
start -> detectMode
detectMode (if standard) -> calc.standard
detectMode (if scientific) -> calc.scientific

In addition to the built-in state of "start", my MBTE has states called "restart" and "stop" that are used to restart an application after a failure and to shut down and cleanup at the end of a test. These state transitions should also be added:
restart -> detectMode
stop -> stopped

Now that I have defined the basic high-level transitions, I can put them in an action table and create the automation code needed to make these transitions happen.
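Stripped of GUI code, the heart of such an engine is just a transition table and a random walk over it. A minimal sketch using the six transitions above (the table layout and function names are mine, not the MBTE's):

```python
import random

# (start state, action, end state) -- the transitions modeled above
ACTIONS = [
    ("calc.standard",      "switch_to_scientific", "calc.scientific"),
    ("calc.scientific",    "switch_to_standard",   "calc.standard"),
    ("calc.standard",      "close",                "stopped.standard"),
    ("calc.scientific",    "close",                "stopped.scientific"),
    ("stopped.standard",   "launch",               "calc.standard"),
    ("stopped.scientific", "launch",               "calc.scientific"),
]

def generate_walk(start, steps, rng=random):
    """Build a test sequence by repeatedly picking a random action
    available from the current state."""
    state, walk = start, []
    for _ in range(steps):
        options = [a for a in ACTIONS if a[0] == state]
        _, action, state = rng.choice(options)
        walk.append(action)
    return walk

print(generate_walk("calc.standard", 5))
```

A real engine would bind each action name to automation code and run validations after each step, but the generation itself needs nothing more than this.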

The action table may be viewed here. The MBTE generated the following image for the model. (Click the image for a larger version.)



The next step will be to add some validations for the states modeled so far.

While I was running a test on this model, my daughter noticed a potential bug in the calculator that I had not noticed before. This model does not yet contain any calculations. Any idea what the bug may be?

Please send me your questions and suggestions for what should be added to this model next.

June 7, 2007

Model-Based Test Engine Benefit #4: Generate and execute new tests – and find new bugs

The last -- and perhaps the best -- major benefit of implementing a Model-Based Test Engine (MBTE) is automation that is capable of generating and executing tests that have not previously been executed manually.

Traditional regression test automation simply retraces test steps that have already been performed manually. This may find new bugs that show up in a previously tested path through an application but will not find bugs off the beaten path. In his book "Software Testing Techniques", Boris Beizer compares the eradication of software bugs to the extermination of insects.
Every method you use to prevent or find bugs leaves a residue of subtler bugs against which those methods are ineffectual.
- Boris Beizer
Software Testing Techniques
When we apply any method to finding bugs, we will find bugs that are found by that method. However, other bugs will remain. Finding new bugs requires variation in testing, not repeating the same thing over and over. Repeatability is often advertised as a benefit of traditional test automation. However, complete repeatability often hurts more than it helps. It can also give a false sense of security when repetitive automated test executions do not find bugs. James Bach compares repetitive tests to retracing a cleared path through a minefield.
Highly repeatable testing can actually minimize the chance of discovering all the important problems, for the same reason that stepping in someone else’s footprints minimizes the chance of being blown up by a land mine.
-James Bach
Test Automation Snake Oil
The randomized action selection of a MBTE leads to execution of a variety of paths through an application with a variety of data. This is likely to try things that have not been executed manually. Not every randomly generated test will be of value. However, computers are able and willing to run tests all night and on weekends for a lower cost than human testers.

As with any automation, tests generated and executed by a MBTE are no better than the model provided by the human designer. If the test or model designer does not model something, that thing will not be tested. Good test design is essential to useful automation.
Use model-based automation as an interactive part of the overall testing process. Use automation to collect information for human testers instead of attempting to replace them.

Use your brain. Do some exploratory modeling. Model the behavior you expect from the application and use the MBTE like a probe droid to go out and confirm your expectations. Then use the results of each test execution to refine your model for the next execution. Continually update your models to improve your testing.

Go out and find new bugs.

June 1, 2007

Model-Based Test Engine Benefit #3: Automatic handling of application changes and bugs


Automated tests based on models have one important capability that scripted tests lack: automated handling of application changes and bugs. I do not mean that model-based automation can think and make decisions like a human tester does when discovering something unexpected. Instead, the automated selection of test steps supports working around the unexpected without special exception-handling code for each situation.


For example: if there are two methods for logging into an application and one breaks, the test engine can try the alternate option to get to the rest of the application. If a traditional scripted automated test encounters an unexpected problem, it will not be able to complete.

The model-based test engine (MBTE) can be coded to not try an action after a pre-defined number of failures. The MBTE's selection algorithm can then seek out other options that have not yet been found to fail. This also results in the MBTE reattempting failed actions and exposing failures that only occur after specific sequences of actions.

To facilitate error detection, each action and validation should return a status to the MBTE framework. This allows error handling to be built into the framework instead of each test model or script. Standard error codes -- either your own or the tool's built-in codes -- help standardize reporting.

For example: return a zero (0) when an action successfully completes or a validation passes, return a negative number on failure, and return a positive number for inconclusive results that require manual investigation.

Code the test engine to detect the error status of each action and validation and take appropriate action. If an action passes, perform the validations for the action's expected end state. If an action fails, restart the application or do whatever other error recovery fits your situation.
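The dispatch this numeric convention enables can be sketched in a few lines (the names are illustrative, not an actual framework API):

```python
def handle_action_result(status, validate_end_state, recover):
    """Dispatch on the numeric convention described above:
    0 = pass, negative = failure, positive = inconclusive."""
    if status == 0:
        validate_end_state()   # run the oracles for the expected end state
        return "pass"
    if status < 0:
        recover()              # e.g. restart the application
        return "fail"
    return "inconclusive"      # flag for manual investigation

print(handle_action_result(0, lambda: None, lambda: None))   # pass
print(handle_action_result(-1, lambda: None, lambda: None))  # fail
```

Because every action and validation speaks the same numeric language, recovery logic lives in one place rather than being copied into each model.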

If a validation fails you can either code that the next validation be performed or identify validation failures that should stop further validation.

Validations can also be flagged to be state-changing failures by adding a "fail state" column to the oracle/validation tables. Give this field the name of the state that the application is in if the validation fails. You can even build standard states such as "restart" into the framework to indicate that the state is unknown and the application needs to be restarted. For example, a validation that an HTTP 404 error page is not displayed could have a "fail state" of "restart" defined to indicate that the application should be restarted when this validation fails.

Julian Harty has suggested that validations can be weighted and test execution be varied based on the combined score of failures.


Build error handling into the framework so that you can define the details with data instead of code.

May 21, 2007

Model-Based Test Engine Benefit #2: Simplified test result analysis


Automation is of little value if it does not report useful information that can be quickly reviewed by testers.

Reported results should contain enough information to answer the following questions:

  • What happened?
  • What is the state of the application?
  • How did the application get in that state?
  • What automation code was executed?
  • What automation data/parameters were used?

Some failures reported by automated tests will be errors in the system under test and others will be errors in the automation model or code. It is important that results point the reader to both.

I have found logging of the following information to be useful:

Test (test configuration information)
  • Title
  • Start Time
  • Script File(s)
  • Model Files
  • Test Set
  • Severity
  • Environment
  • Object Map
  • Action Table(s)
  • Oracle Table(s)
  • Computer Name
  • Operating System
  • Tester

Actions (controlling the application)
  • Source (where is the action defined?)
  • Title
  • Start Time
  • Action Details
  • Duration
  • State Transition
  • Automation Code
  • Result Details
  • Snapshot (screen capture, saved files, etc)
  • Status (Pass, Fail, Inconclusive)

Oracles (validating the results)
  • Source (where is the oracle defined?)
  • Title
  • State
  • Automation Code
  • Error Code / Description
  • Validation Details
  • Snapshot (screen capture, saved files, etc)
  • Status (Pass, Fail, Inconclusive)

Messages (report useful information not directly connected to an action or oracle)
  • Message
  • Link
  • Snapshot

Once you have decided what data to report, it is important to present the data in a manner that is conducive to efficient analysis. Results need to be both comprehensive and summarized (or linked) in ways that aid human testers and toolsmiths in quickly answering the questions listed above. A 10-hour automated test execution may be of little value if it takes another 10 hours to interpret the results.

Standardizing reporting and presentation is the first step to improving results analysis. Do not rely on your tool's built-in reporting. An expensive test automation tool should not be required to view results -- especially the incomplete results reported by many tools. Create a common reporting library that can be used by all your tests and use that library. Users of the reported results will not need to learn new formats for every project or test. Some suggested output formats are:

  • HTML: Human users like color-coded well-formed results presented in HTML. A little JavaScript can be added to customize the experience.
  • XML: Extensible Markup Language (XML) files can be processed by machines and can be displayed to human users when style sheets are applied.
  • Tab-Delimited / Excel: Simple tab-delimited, CSV, or Excel tables are useful reporting formats that are easily processed by both people and machines.
  • Database: Results written directly to a database can be easily compared to results from previous test executions.

Determine your needs and select the output formats that best meet those needs. If you standardize your reporting through a single small set of reporting functions, you can easily adapt reporting as your needs change.
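A single small reporting layer might collect result rows once and emit them in several of the formats above. A hedged sketch (not the author's library; JSON stands in here for the XML and database outputs):

```python
import csv
import io
import json

class ResultLog:
    """Collect result rows once; emit them in multiple formats so no
    proprietary tool is needed to read the results."""

    def __init__(self):
        self.rows = []

    def log(self, kind, title, status, **details):
        self.rows.append({"kind": kind, "title": title,
                          "status": status, "details": details})

    def as_tsv(self):
        buf = io.StringIO()
        writer = csv.writer(buf, delimiter="\t")
        writer.writerow(["kind", "title", "status"])
        for r in self.rows:
            writer.writerow([r["kind"], r["title"], r["status"]])
        return buf.getvalue()

    def as_json(self):  # stand-in for XML or database output
        return json.dumps(self.rows, indent=2)

log = ResultLog()
log.log("action", "Switch to scientific", "Pass", duration=0.4)
print(log.as_tsv())
```

Adding an HTML emitter later means changing this one class, not every test.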

May 19, 2007

Model-Based Test Engine Benefit #1: Simplified automation creation and maintenance

Model-Oriented Design

Procedural automated test scripts may be easy to record or script. However, they are difficult to maintain when applications change. They are also difficult to adapt to new test ideas. Maintenance is simplified by automating the procedure generation in addition to the execution. New actions, validations, and data can be added to existing tests. This allows testers to spend more time thinking up new test ideas instead of maintaining procedural scripts.

Simplified GUI Interaction Coding

Most GUI automation tools contain complex vocabularies for controlling objects and retrieving information from those objects. There are usually different methods for interacting with different classes of objects. This requires that toolsmiths learn a class-sensitive vocabulary and be aware of the class as they code tests. There is an easier way: create functions that automatically detect an object's class and apply the appropriate method. This allows for the same command to be used whether you are selecting from a list box or entering text into an edit box. The parameters for the functions can then be specified in tables that are processed by the test generation and execution engine.
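The wrapper amounts to single dispatch on the control's class. A toy sketch; the class names and per-class methods are placeholders for whatever your GUI tool actually exposes:

```python
def set_value(control, value):
    """One verb for every control: look up the control's class,
    then call the class-appropriate method."""
    handlers = {
        "EditBox":  lambda c, v: c.set_text(v),
        "ListBox":  lambda c, v: c.select(v),
        "CheckBox": lambda c, v: c.set_checked(bool(v)),
    }
    cls = control.class_name()  # ask the GUI tool what this object is
    if cls not in handlers:
        raise ValueError(f"No handler for control class {cls!r}")
    return handlers[cls](control, value)
```

Test tables can then say `set_value` everywhere, and toolsmiths extend the handler map once per new control class instead of per test.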

Common framework functions for interacting with the applications under test also allow for common solutions to tool bugs and limitations. Workarounds and enhancements can be put in the common framework code instead of being reimplemented for each test script.

Separate result validation from actions

Separating expected results definition from the test action execution simplifies maintenance and supports easy reuse of test oracle code. Validations can be specified at whatever level in the model hierarchy they apply and the test engine automatically applies them to all sub-states.

April 15, 2007

Model-Based Test Engine Benefits

A Model-Based Test Engine (MBTE) is a test automation framework that generates and executes tests based on a behavioral model. Instead of performing scripted test cases, a MBTE generates tests from the model during execution. Instead of implementing models in code, a MBTE can process models defined in tables. Both human testers and computers can understand models defined in tables. A MBTE can be built on top of most existing GUI test automation tools. Combining good automation framework practices with Model-Based Testing (MBT) can transform some common test automation pitfalls to benefits.

Implementing a MBTE can produce the following:

  1. Simplified automation creation and maintenance.
  2. Simplified test result analysis.
  3. Automatic handling of application changes and bugs.
  4. Generate and execute new tests – and find new bugs.

More to come...

March 25, 2007

Common Barriers to Model-Based Automation

If modeling is as simple as the previous blog entry implies, then why isn’t everyone using model-based automated testing?

1. Model-based testing requires a change in thinking.

Most testers have been trained to transform mental models into explicit test scripts – not to document behavior in a machine-readable format. However, most testers will find that modeling is actually easier than defining and maintaining explicit test cases for automation.

2. Large models are very difficult to create and maintain.

Small additions to a model can easily trigger exponential growth in the size and complexity. This state explosion usually requires that large models be defined using code instead of tables. The large model problem can be solved through the use of Hierarchical State Machines (HSMs) and state variables.

Most software states have hierarchical relationships in which child states inherit all attributes of the parent states plus have additional attributes that are specific to the child. Hierarchical state machines reduce redundant specification and allow behavior to be modeled in small pieces that can be assembled into the larger system. For example, the following HSM represents the same keyless entry system with less than half as many transitions defined. Actions that are possible from each state are also possible from all the child states. Validation requirements apply to the parent and all child states. This greatly reduces the size and complexity of the model. Large systems can be modeled by merging many small models.
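One lightweight way to get this inheritance is the dotted state naming used in the calculator posts: "calc.standard" is a child of "calc", so anything defined for "calc" applies to it as well. A sketch of the lookup:

```python
def applicable(defs, state):
    """Collect everything defined for a state or any of its ancestors.

    defs maps a state name to a list of actions/validations; a child
    state like 'calc.standard' inherits everything from 'calc'."""
    parts = state.split(".")
    found = []
    for i in range(1, len(parts) + 1):
        found.extend(defs.get(".".join(parts[:i]), []))
    return found

defs = {"calc": ["window_exists"],
        "calc.standard": ["menu_says_scientific"]}
print(applicable(defs, "calc.standard"))
# ['window_exists', 'menu_says_scientific']
```

Each validation is written once at the highest state where it holds, and every descendant state picks it up automatically.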






Defining some state information as variables instead of explicitly named states can reduce the state explosion. Sometimes it is easier to define some conditions as state variables instead of specific child states. These state variables can be used to define guarded transitions. Guarded transitions are transitions that are only possible when the specified data condition is met. A requirement that all doors be closed before the example keyless entry system will arm the alarm may be specified as shown below. Without guarded transitions, adding the difference in behavior based on whether doors are open or closed would require many new states and transitions.
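In code, a guard is just a predicate over the state variables attached to a transition. A sketch for the doors-closed rule (the state and variable names are illustrative):

```python
# Each transition carries a guard evaluated against the state variables.
transitions = [
    {"from": "unlocked", "action": "press_lock", "to": "armed",
     "guard": lambda v: v["doors_open"] == 0},
    {"from": "unlocked", "action": "press_lock", "to": "unlocked",
     "guard": lambda v: v["doors_open"] > 0},
]

def fire(state, action, variables):
    """Follow the first transition whose guard is satisfied."""
    for t in transitions:
        if t["from"] == state and t["action"] == action and t["guard"](variables):
            return t["to"]
    return state  # no enabled transition; stay put

print(fire("unlocked", "press_lock", {"doors_open": 0}))  # armed
print(fire("unlocked", "press_lock", {"doors_open": 2}))  # unlocked
```

One `doors_open` variable replaces what would otherwise be a separate child state for every combination of open doors.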






3. The leading test tool vendors do not offer model-based testing tools.

Modeling is not a “best practice” promoted by the tool vendors. Tool vendors often dictate the way that their tools are used. This results in automation practices being defined to fit the tools instead of making the tools fit the desired approach. The good news is that many test automation tools – both commercial and open source – provide enough flexibility to build new frameworks on top of the built-in functionality.

4. Model-based testing looks complicated.

The model-based testing literature often makes modeling look more complicated than necessary. The truth is that modeling does not require expert mathematicians and computer scientists. A relatively simple framework can support complex test generation and execution with less manual work than most other automation methodologies.

March 24, 2007

Finite State Machines

Software behavior can be modeled using Finite State Machines (FSMs). FSMs are composed of states, transitions, and actions. Each state is a possible condition of the modeled system. Transitions are the possible changes in states. Actions are the events that cause state transitions. For example, the following FSM shows the expected behavior of a car keyless entry system.




Images like the above are great for human use, but not machines. State transitions and the actions that trigger them can also be defined in a table format that can be processed by a computer. The above FSM can be represented using the table below.
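For illustration, a transition table in this style for a keyless entry system might look like the following; the states and actions are my guesses, not the author's table:

```python
# Action table: (current state, action) -> next state
FSM = {
    ("locked",   "press_unlock"): "unlocked",
    ("unlocked", "press_lock"):   "locked",
    ("unlocked", "open_door"):    "open",
    ("open",     "close_door"):   "unlocked",
}

def run(start, actions):
    """Replay a sequence of actions through the transition table."""
    state = start
    for a in actions:
        state = FSM[(state, a)]
    return state

print(run("locked", ["press_unlock", "open_door", "close_door"]))  # unlocked
```

The same dictionary that a machine walks mechanically is readable, row by row, by a human tester.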





The requirements for each state can also be defined using tables. The table below contains sample requirements for the example keyless entry system.
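Such a transition table can be represented as a simple mapping that a test generator can walk. The states and actions below are my own plausible guesses for a keyless entry system, since the original tables are images:

```python
# Hypothetical transition table for the keyless entry FSM described above.
# Keys are (current state, action); values are the expected next state.
fsm = {
    ("Locked",    "press_unlock"): "Unlocked",
    ("Unlocked",  "press_lock"):   "Locked",
    ("Unlocked",  "open_door"):    "Door Open",
    ("Door Open", "close_door"):   "Unlocked",
}

def next_state(state, action):
    """The model's expected result of taking an action in a state."""
    return fsm[(state, action)]

def actions_from(state):
    """All actions the model allows from a given state."""
    return [a for (s, a) in fsm if s == state]
```

This is the same information as the diagram, but in a form a machine can traverse.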

March 23, 2007

Artificial Intelligence Meets Random Selection

Automated tests can be defined using models instead of scripting specific test steps. Tests can then be randomly generated from those models. The computer can even use the model to recover from many errors that would stop scripted automation. Although the computer cannot question the software like a human tester, the automation tool can report anything it encounters that deviates from the model. Thinking human beings can then adjust the model based on what they learn from the automated test’s results. This is automated model-based testing.

As with any explicit model, a model built for test automation is going to be less complex than the system it represents. It does not need to be complex to be useful. New information can be added to existing models throughout the testing process to improve test coverage.

Intelligent Design

All testing is model-based. Good tests require intelligent design. Testers use mental models of system behavior when creating and executing tests. Scripted tests are created from the designers’ mental models prior to test execution and do not change based on the results. Exploratory testing starts with a pre-conceived mental model that is refined as tests are executed. Whether scripted or exploratory, human testers are capable of applying information they learn during test execution to improve the testing. Computers cannot adjust scripted test cases during execution. What if automated tests could apply behavioral models to generate tests that go where no manual tester has gone before?

March 18, 2007

SQuAD 2007 Conference Presentation

Click here to download my SQuAD conference presentation slides.

Please ask questions using the blog's comment feature or email them to me at ben@qualityfrog.com.

Ben

March 13, 2007

Expecting the unexpected. Part 2

How can we create automation that can deal with the unexpected?
The first step is to create test automation that "knows" what to expect. Most GUI test automation is built by telling the computer what to do instead of what to expect. Model-Based Automated Testing goes beyond giving the computer specific test steps to execute.

Model-Based Testing is testing based on behavioral models instead of specific test steps. Manual testers design and execute tests based on their mental models of a system's expected behavior.

Automated tests can also be defined using models instead of scripting specific test steps. Tests can then be randomly generated from those models -- by the computer instead of a manual tester. The computer can even recover from many errors that would stop traditional test automation because it knows how the system is expected to behave. And by knowing how it is expected to behave, it can detect unexpected behavior. Unexpected does not necessarily mean wrong behavior. The behavior could be wrong or it could be something that was not included in the model. The computer can report the unexpected behavior to human testers for investigation and future updates to the model.
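A minimal version of this is a random walk over the transition table: execute each randomly chosen action against the system and compare the observed state to the model's expectation. The model and the stand-in system under test here are illustrative:

```python
import random

# Illustrative model: (state, action) -> expected next state.
model = {
    ("Main",     "open_settings"): "Settings",
    ("Settings", "close"):         "Main",
    ("Main",     "open_help"):     "Help",
    ("Help",     "close"):         "Main",
}

def system_under_test(state, action):
    # Stand-in for driving the real application; here it simply agrees
    # with the model so the walk runs cleanly end to end.
    return model[(state, action)]

def random_walk(start, steps, seed=0):
    """Generate and execute a random test, reporting any deviations."""
    rng = random.Random(seed)
    state, deviations = start, []
    for _ in range(steps):
        action = rng.choice([a for (s, a) in model if s == state])
        expected = model[(state, action)]
        observed = system_under_test(state, action)
        if observed != expected:
            deviations.append((state, action, expected, observed))
        state = observed
    return deviations
```

Each deviation is exactly the report described above: not necessarily a bug, just behavior the model did not predict, handed to a human for investigation.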

For example, if one path to functionality to be tested fails, the MBT execution engine can attempt to access that functionality by another path defined in the model.
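Finding "another path" is a graph search over the model. This sketch (states and actions are hypothetical) uses breadth-first search over the transition table, skipping transitions known to have failed:

```python
from collections import deque

# Illustrative model: two routes from the main screen to the same dialog.
model = {
    ("Main",   "menu_open"): "Dialog",
    ("Main",   "shortcut"):  "Dialog",
    ("Dialog", "close"):     "Main",
}

def find_path(start, goal, blocked=frozenset()):
    """BFS for an action sequence from start to goal, avoiding
    transitions that have already failed during this run."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for (s, a), nxt in model.items():
            if s == state and (s, a) not in blocked and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [a]))
    return None  # no surviving route in the model
```

If the menu route fails mid-run, the engine can re-plan with that transition blocked and still reach the dialog via the shortcut.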

Of course, there will always be some cascading failures that stop both automated and manual tests. Even so, MBT inherently provides better error handling than scripted test automation.

February 26, 2007

Expecting the unexpected. Part 1

One of the expectations for GUI test automation is unattended running of tests. However, this is often difficult to accomplish. Unexpected application behavior can stop an automated test in its tracks. Manual intervention is then required to get the script running again. Some automation tools offer run-time options to help the user prod the test along. Other tools require that the script or system under test be fixed before test execution can continue. The process of running a partial test, fixing the script (or waiting for an application fix), and then running another partial test only to find another script-stopping change can be time consuming. This process often takes longer than manual testing.

The problem is that scripted automation cannot adjust to application issues like a thinking manual tester. The automation script can only do what the scripter told it to expect. Some automation tools offer complex exception handling features that allow users to define expected unexpected behavior. There lies the problem: someone has to expect and code for the unexpected. There will always be unexpected unexpected behavior.

How can we create automation that can deal with the unexpected?

February 7, 2007

People, Monkeys, and Models

Methods I have used for automating “black box” software testing…


I have approached test automation in a number of different ways over the past 15 years. Some have worked well and others have not. Most have worked when applied in the appropriate context. Many would be inappropriate for contexts other than that in which they were successful.

Below is a list of methods I’ve tried in the general order that I first implemented them.

Notice that I did not start out with the record-playback test automation that is demonstrated by tool vendors. The first test automation tool I used professionally was the DOS version of Word Perfect. (Yes, a word processor as a test tool. Right now, Excel is probably the one tool I find most useful.) Word Perfect had a great macro language that could be used for all kinds of automated data manipulation. I then moved to Pascal and C compilers. I even used a pre-HTML hyper-link system called First Class to create front ends for integrated computer-assisted testing systems.

I had been automating tests for many years before I saw my first commercial GUI test automation tool. My first reaction to such tools was something like: "Cool. A scripting language that can easily interact with the user interfaces of other programs."

I have approached test automation as software development since the beginning. I've seen (and helped recover from) a number of failed test automation efforts that were implemented using the guidelines (dare I say "Best Practices"?) of the tools' vendors. I had successfully implemented model-based testing solutions before I knew of keyword-driven testing (as a package by that name). I am currently using model-based test automation for most GUI test automation: including release acceptance and regression testing. I also use computer-assisted testing tools to help generate test data and model applications for MBT.

I've rambled on long enough. Here's my list of methods I've applied in automating "black box" software testing. What methods have worked for you?

Computer-assisted Testing
· How It Works
: Manual testers use software tools to assist them with testing testing. Specific tasks in the manual testing process are automated to improve consistency or speed.
· Pros: Tedious or difficult tasks can be given to the computer. A little coding effort greatly benefits testers. A thinking human being is involved throughout most of the testing process.
· Cons: A human being is involved throughout most of the testing process.

Static Scripted Testing
· How It Works: The test system steps through an application in a pre-defined order, validating a small number of pre-defined requirements. Every time a static test is repeated, it performs the same actions in the same order. This is the type of test created using the record and playback features in most test automation tools.
· Pros: Tests are easy to create for specific features and to retest known problems. Non-programmers can usually record and replay manual testing steps.
· Cons: Specific test cases need to be developed, automated, and maintained. Regular maintenance is required because most automated test tools are not able to adjust for minor application changes that may not even be noticed by a human tester. Test scripts can quickly become complex and may even require a complete redesign each time an application changes. Tests only retrace steps that have already been performed manually. Tests may miss problems that are only evident when actions are taken (or not taken) in a specific order. Recovery from failure can be difficult: a single failure can easily prevent testing of other parts of the application under test.

Wild (or Unmanaged) Monkey Testing
· How It Works:
The automated test system simulates a monkey banging on the keyboard by randomly generating input (key presses, mouse moves, clicks, drags, and drops) without knowledge of available input options. Activity is logged, and major malfunctions such as program crashes, system crashes, and server/page not found errors are detected and reported.
· Pros: Tests are easy to create, require little maintenance, and given time, can stumble into major defects that may be missed following pre-defined test procedures.
· Cons: The monkey is not able to detect whether or not the software is functioning properly. It can only detect major malfunctions. Reviewing logs to determine just what the monkey did to stumble into a defect can be time consuming.
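A wild monkey is little more than a random event generator with a crash detector and a log. In this sketch, `send_event` is a hypothetical stand-in for whatever GUI driver is available; the seeded generator addresses the log-review problem by making any failing run replayable:

```python
import random

EVENTS = ["key_a", "key_enter", "click", "double_click", "drag"]

def send_event(event):
    # Stand-in for a real GUI driver. A real driver would report a
    # crash or "page not found" here instead of always succeeding.
    return "ok"

def wild_monkey(steps, seed=42):
    """Fire random events, logging everything so a failure can be replayed."""
    rng = random.Random(seed)
    log = []
    for _ in range(steps):
        event = rng.choice(EVENTS)
        result = send_event(event)
        log.append((event, result))
        if result != "ok":
            break  # major malfunction detected; stop and report the log
    return log
```

Because the run is seeded, rerunning with the same seed retraces the monkey's exact steps, which cuts down the time spent reconstructing what it did before a crash.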

Trained (or Managed) Monkey Testing
· How It Works: The automated test system detects available options displayed to the user and randomly enters data and presses buttons that apply to the detected state of the application.
· Pros: Tests are relatively easy to create, require little maintenance, and easily find catastrophic software problems. May find errors more quickly than an unsupervised monkey test.
· Cons: Although a trained monkey is somewhat selective in performing actions, it also knows nothing (or very little) about the expected behavior of the application and can only detect defects that result in major application failures.

Tandem Monkey Testing
· How It Works:
The automated test system performs trained monkey tests, in tandem, in two versions of an application: one performing an action after the other. The test tool compares the results of each action and reports differences.
· Pros: Specific test cases are not required. Tests are relatively easy to create, require little maintenance, and easily identify differences between two versions of an application.
· Cons: Manual review of differences can be time consuming. Due to the requirement of running two versions of an application at the same time, this type of testing is usually only suited for testing through web browsers and terminal emulators. Both versions of the application under test must be using the same data – unless the data is the subject of the test.

Data-Reading Scripted Testing
· How It Works: The test system steps through an application using pre-defined procedures with a variety of pre-defined input data. Each time an action is executed, the same procedures are followed; however, the input data changes.
· Pros: Tests are easy to create for specific features and to retest known problems. Recorded manual tests can be parameterized to create data-reading static tests. Performing the same test with a variety of input data can identify data-related defects that may be missed by tests that always use the same data.
· Cons: All the development and maintenance problems associated with pure static scripted tests still exist with most data-reading tests.


Model-Based Testing
· How It Works:
Model-based testing is an approach in which the behavior of an application is described in terms of actions that change the state of the system. The test system can then dynamically create test cases by traversing the model and comparing results of each action to the action’s expected result state.
· Pros: Relatively easy to create and maintain. Models can be as simple or complex as desired. Models can be easily expanded to test additional functionality. There is no need to create specific test cases because the test system can generate endless tests from what is described in the model. Maintaining a model is usually easier than managing test cases (especially when an application changes often). Machine-generated “exploratory” testing is likely to find software defects that will be missed by traditional automation that simply repeats steps that have already been performed manually. Human testers can focus on bigger issues that require an intelligent thinker during execution. Model-based automation can also provide information to human testers to help direct manual testing.
· Cons: It requires a change in thinking. This is not how we are used to creating tests. Model-based test automation tools are not readily available.

Keyword-Driven Testing
· How It Works:
Test design and implementation are separated. Use case components are assigned keywords. Keywords are linked to create tests procedures. Small components are automated for each keyword process.
· Pros: Automation maintenance is simplified. Coding skills are not required to create tests from existing components. Small reusable components are easier to manage than long recorded scripts.
· Cons: Test cases still need to be defined. Managing the process can become as time consuming as automating with static scripts. Tools to manage the process are expensive. Cascading bugs can stop automation in its tracks. The same steps are repeated each time a test is executed. (Repeatability is not all it's cracked up to be.)
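The separation of design and implementation in keyword-driven testing can be sketched as a dispatch table: testers compose rows of keywords, and a small runner maps each keyword to its automated component. The keywords and components below are illustrative:

```python
# Hypothetical keyword-driven runner: each keyword maps to a small
# automated component; a test is just a list of (keyword, args) rows.
actions = []  # records what the components did, for demonstration

def login(user):
    actions.append(f"login:{user}")

def open_account(account):
    actions.append(f"open:{account}")

def logout():
    actions.append("logout")

KEYWORDS = {"Login": login, "OpenAccount": open_account, "Logout": logout}

def run_test(rows):
    """Execute a test defined as keyword rows; no coding required to compose."""
    for keyword, *args in rows:
        KEYWORDS[keyword](*args)

run_test([("Login", "alice"), ("OpenAccount", "savings"), ("Logout",)])
```

The rows could just as easily come from a spreadsheet, which is how non-programmers typically author keyword-driven tests.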

February 6, 2007

Slogans are models.

Harry Robinson posted an answer to inquiries about the Google Testing Blog's slogan: "Life is too short for manual testing." Some were concerned that the slogan implied that Google does not value manual and exploratory testing. I too had such concerns.

Harry pointed out that the slogan is just a slogan and that life really is too short to try all the combinations that might expose important bugs.

This got me to thinking about slogans as models. A slogan is really a model of an idea. It is not complete. It is simpler than the thing it describes.

Consider the following advertising slogans:
  • "The ultimate driving machine"
  • "When it absolutely, positively has to be there overnight."
  • "Finger lickin' good."
  • "Let your fingers do the walking."
  • "Reach out and touch someone."
  • "The quicker picker-upper."
  • "Have it your way."
  • "It's everywhere you want to be."
  • "Betcha can't eat just one."
These slogans bring to mind attributes about the companies and their products that are not an explicit part of the slogan. I don't even have to mention the companies or their products. This is your mind accessing your mental model of the company and products that the model represents.

In addition to the more detailed model invoked in your mind, it should not be difficult to find faults with these slogans. The slogans are incomplete; yet they are not useless.

Slogans demonstrate both the usefulness and potential complexity of models. A model does not need to be complete to be useful.

So, how does this apply to software testing ... and test automation?

When we develop test cases or perform exploratory testing we are implementing our mental models. When we execute tests, we (hopefully) learn more about the system under test and update our mental models.

In the same way, explicit models used for model-based test automation can be refined after each test execution. There is no need to model all the possible details before the first test run. Running tests based on incomplete models can provide valuable information about your test subject. It can validate or disprove your assumptions. Results from an incomplete model can help lead you to other useful tests -- both manual and automated.

Investigate using Hierarchical State Machines to simplify model definition and maintenance.

Build your models one step at a time.