
June 23, 2007

How many load generators do I need?


How many load generators do I need to run a [insert number here] user load test on a web application?

I am often asked how many load generators are needed for a load test with a certain number of simulated users.

My answer: It depends.

It depends on the system under test. It depends on your test tool. It depends on your specific script. It depends on your load generation hardware.

There is no straightforward answer to this question. There is no formula that can be used to extrapolate an answer. There is no one-size-fits-all rule of thumb. Some tool vendors will attempt to provide an answer, but they are wrong. I once spent half an hour arguing with a tool vendor support representative who claimed that I could run 200, and no more than 200, simulated users per load generator, regardless of what those simulated users did or what hardware I used to host them. I had successfully simulated 1200 users with this tool for one script but could not simulate 50 users with another. The number I could run on that same piece of hardware varied depending on the script.

The real question being asked is: How much load can I put on a load generator without impacting performance?

Isn't this one of the most common questions that load testing attempts to answer? Performance testers, of all people, should understand that there is no single formula for determining how much load you can place on a load generator system.

That "system" includes computers, software on those computers, and the network between the load generation computers and the system under test. To determine the requirements for this system we need to monitor that entire system. We need to monitor CPU usage. We need to monitor memory usage. We need to monitor whatever other system resources the script and test tool may impact. We need to monitor bandwidth usage. We need to monitor at the start of a test and at the end of a test. We need to monitor and load test the load generation environment just as much as we need to monitor and test the system under test.
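As a minimal sketch of that monitoring, assuming a Unix-like load generator host with Python available on it (the sampling approach and history size are arbitrary choices of mine, not anything the tools mandate):

```python
# Sketch: periodically sample the load generator itself during a test,
# so the generator's own health is recorded alongside the test results.
# Unix-only (os.getloadavg); all thresholds and sizes are hypothetical.
import os
import time

def sample(history, max_points=600):
    """Append one (timestamp, 1-minute load average) sample, bounded in size."""
    history.append((time.time(), os.getloadavg()[0]))
    del history[:-max_points]  # keep only the most recent samples
    return history

history = []
sample(history)
print(len(history))  # 1 sample recorded
```

In practice you would also capture memory and network counters, and sample at both the start and the end of a run so growth over time is visible.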

So, where do we start?

Test the test environment. I start by running a small number of simulated users from a single load generator system. I monitor the system resources on the load generator. I estimate the bandwidth between me and the system under test. Once I have system resource numbers for my small number of simulated users, I extrapolate how many I think can run on the system and I test it. If I find that it is too many, I decrease the number of users. If I think my environment can support more, I will test that.
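The extrapolation step itself is simple arithmetic. Here is a sketch with entirely hypothetical numbers; the real ones must come from monitoring your own generator with your own script:

```python
# Sketch: extrapolate load-generator capacity from a small calibration run.
# All inputs are hypothetical; measure your own with your tool and OS monitors.

def estimate_capacity(users_tested, cpu_pct, mem_pct, bw_mbps,
                      bw_limit_mbps, headroom_pct=70.0):
    """Project how many simulated users fit under a resource headroom cap.

    headroom_pct caps each resource well below 100% so the generator
    itself never becomes the bottleneck.
    """
    per_user_cpu = cpu_pct / users_tested
    per_user_mem = mem_pct / users_tested
    per_user_bw = bw_mbps / users_tested
    limits = [
        headroom_pct / per_user_cpu,
        headroom_pct / per_user_mem,
        (bw_limit_mbps * headroom_pct / 100.0) / per_user_bw,
    ]
    # The scarcest resource determines capacity -- then verify by testing!
    return int(min(limits))

# Example: 25 users used 12% CPU, 8% memory, 4 Mbps on a 100 Mbps link.
print(estimate_capacity(25, 12.0, 8.0, 4.0, 100.0))  # -> 145
```

Note the estimate is only a starting point: it assumes resource usage scales linearly with user count, which is exactly the assumption the follow-up test is there to check.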

Even when I think I know the answer, I monitor the test environment during load tests. This gives me information should I ever question the environment. Sometimes I discover that something in my test environment is the bottleneck and not the system I am testing. When this happens, we need to be willing to say "oops" and try again.

My real concern.

My real concern about such questions is not the answer, but that they are being asked with an expectation of a formulaic answer; and they are being asked by people that are designing, executing, and analyzing load tests. The following may seem harsh, but I believe it to be true:

I believe anyone that is looking for a one-size-fits-all answer to the question of how many load generators are required should not be leading any load testing.

They should not be designing tests. They should not be analyzing results. They may be able to safely script what other people have designed. Even new load testers quickly learn (if they are paying attention) that even minor changes in activity within an application can impact performance and the load generated on the system under test.

There are no one-size-fits-all answers. That's why we test. Understanding this should be requirement #1 for selecting any load tester.

What is a new load tester to do?

1) Educate yourself.

Most people I talk to got into load testing without much direction. This was also the case for me. There seems to be a perception that anyone can learn a tool and be a great load tester. This tool-centric emphasis often leads people astray. Some smart people I know at a leading tool vendor even tell me that their training is just the introduction. I encourage all new load testers to seek education in addition to the formal vendor training.

There are some great resources on the web.

Get to know your tools. However, it is more important to get to know performance testing. The tools are the easy part.

2) Befriend experienced load testers, network engineers, systems engineers, systems administrators, developers, and anyone else that may have useful information.

It is disappointing that new testers are often sent out into uncharted territory all by themselves. This has happened to me. I don't like being set up to fail. I don't like seeing other people being set up to fail.

So, why didn't I sink? I had some great mentors that helped keep me afloat. These mentors were rarely assigned to a project with me, so I had to seek them out.

Learn to ask questions. Performance and load testing often requires information gathering from more sources than any other testing. Asking questions is part of the job. If you don't understand something, seek out the answer. If you're embarrassed to ask publicly within your project team, ask someone privately. Look for resources outside your project. Find others in and outside your company that are willing to help. Participate in online forums -- with the understanding that there's also lots of misinformation on the web.

The Bottom Line

There are no magic formulas. That's why we test. Educate yourself. Get to know the people in your neighborhood.

June 10, 2007

Performance: Investigate Early, Validate Last


Performance and load testing is often viewed as something that has to be done late in the development cycle with a goal of validating that performance meets predefined requirements. The problem with this is that fixing performance problems can require major changes to the architecture of a system. When we do performance testing last, it is often too late or too expensive to fix anything.

The truth is that performance testing does not need to happen last. Load test scripting is often easier if we wait until the end, but should we sacrifice quality just to make testing easier?

Scott Barber divides performance testing requirements and goals into the following three categories:

  • Speed
  • Scalability
  • Stability

Speed is where things get fuzzy. Some speed requirements are quite definable, quantifiable and technical; others are not.
- Scott Barber

Scott says that hard measurable requirements can usually be defined for scalability and stability; however, meeting technical speed requirements does not ensure happy users. I often hear (and read) it said that one must have test criteria defined before performance testing can start. I disagree. When requirements are difficult to quantify, it is often better to do some investigative testing to collect information instead of validating the system against predefined requirements.

In addition to the three requirements categories, Scott argues that there are two different classifications of performance tests.

  • Investigation -- collect information that may assist in measuring or improving the quality of a system
  • Validation -- compare a system to predefined expectations
Performance testers often focus on the validation side and overlook the value they can bring on the investigation side. Sometimes we need to take off our quality cop (enforcement) hat, put on our private investigator hat, and test to find useful information instead of enforcing the law (requirements). Testers that work primarily with scripted testing are accustomed to the validation role of functional testing and try to carry that into performance testing. The problem is that most performance testing is really investigation -- we just have trouble admitting it.

Investigate performance early

Validate performance last
Traditional performance testing is treated as a validation effort with technical requirements. It is often said that a complete working system is required before testing can begin. Extensive up-front design is common. Tests are executed just before release and problems are fixed after a release. A couple years ago, Neill McCarthy asked attendees at his STAR West presentation if these really are axioms. When we consider the potential of investigative testing, these assumptions of traditional performance testing quickly dissolve.

Agile Manifesto

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan
Neill recommended that we apply the Agile Manifesto to early performance testing. How can we apply agile principles to investigative load testing?

Model user behavior as early as possible; and model often. A working application is not needed to model user behavior. Revise the model as the application and expected use change. Script simple tests based on the model. Be prepared to throw away scripts if the application changes.
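One lightweight way to keep such a model revisable is to store it as plain data, separate from any scripts. A sketch, with hypothetical scenario names and weights:

```python
# Sketch: a user-behavior model kept as data, easy to revise as the
# application and expected usage change. Scenario names and weights
# are hypothetical; a real model comes from logs, SMEs, and estimates.
import random

workload_model = {
    "browse_catalog": 0.55,   # share of simulated users doing this
    "search":         0.25,
    "checkout":       0.15,
    "admin_report":   0.05,   # low-use but potentially high-risk process
}

def pick_scenario(model, rng=random.random):
    """Choose a scenario for one simulated user, weighted by the model."""
    roll = rng()
    cumulative = 0.0
    for name, share in model.items():
        cumulative += share
        if roll < cumulative:
            return name
    return name  # fall through to last scenario on rounding edge cases

# Sanity check: the shares should account for all simulated users.
assert abs(sum(workload_model.values()) - 1.0) < 1e-9
```

When expected usage changes, only the data changes; the scripts that act out each scenario can stay (or be thrown away) independently.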

Conduct exploratory performance tests. Apply exploratory testing techniques to performance testing: simultaneous learning, test design, and test execution. Perform "what if" tests to see what happens if users behave in a certain way. Adapt your scripts based on what you learn from each execution.

Evaluate each build on some key user scenarios. Create a baseline test that contains some key user scenarios that can be run with each build. A common baseline in the midst of exploratory and investigative tests supports comparison of builds.
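Comparing builds against that baseline can be as simple as comparing median response times per scenario. A sketch, with made-up timings and an arbitrary 20% regression threshold:

```python
# Sketch: compare a baseline scenario's response times across builds.
# Timings are hypothetical; in practice they come from the tool's results.
from statistics import median

def regression_report(old_ms, new_ms, threshold=1.20):
    """Flag scenarios whose median response time grew past the threshold."""
    report = {}
    for scenario, old_times in old_ms.items():
        ratio = median(new_ms[scenario]) / median(old_times)
        status = "REGRESSED" if ratio > threshold else "ok"
        report[scenario] = (status, round(ratio, 2))
    return report

build_41 = {"login": [210, 230, 220], "search": [480, 500, 490]}
build_42 = {"login": [215, 225, 220], "search": [650, 700, 640]}
print(regression_report(build_41, build_42))
```

Medians are used here because a single slow outlier should not flag a whole build; whatever statistic you pick, keep it consistent between builds so the comparison stays meaningful.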

Investigative agile performance testing can increase our confidence in the systems we test. Exploratory tests allow us to find important problems early. Testing throughout the development lifecycle makes it easier to measure the impact of code changes on performance.



April 16, 2007

Performance Testing Lessons Learned

Web and client/server load testing can easily become a complex task. Most people I've met got started in load testing with only minimal training in using the test tools. This is how I got started in load testing -- although I had an advantage in that I had been exposed to load testing of communications systems. I also had experience with automated single-user performance testing. I had led some small-scale manual load tests with multiple testers on a conference call hitting the same client-server application at once. (And we found some show-stopping bugs doing that manual testing.) I had watched others perform load tests. I had read numerous load test plans and reports. However, I had never directly participated in executing automated load tests... then I was asked to lead a load testing project.

Through the years, I have made many mistakes designing, scripting, and executing load tests. Load testing easily becomes complex. Tool sales people sometimes tell us that nearly anyone can create tests with their tools. (Yet buying test tools is sometimes just like buying a new car: the salesman tells you that the car is reliable and has a great warranty; then the finance person warns of everything that could go wrong that isn't covered in the warranty and tries to sell you an extended warranty and maintenance contract.) Learning the mechanics of how to use a tool is often the easy part. It's what you do with the tool that matters.

Here is the short list of some of the important performance/load testing lessons I have learned. Some I learned from my own experience. Some I learned from the failures of others.

  • Bad assumptions waste time and effort
    • Ask questions
    • Performance testing is often exploratory
    • Expect surprises
    • Prepare to adapt
  • Get to know the people in your neighborhood: no single person or group is likely to have all the required information
    • Subject-matter experts
    • Developers
    • System administrators
    • Database administrators
    • Network engineers
  • Don’t script too much too soon: you may end up tossing out much of what you script
    • Applications change
    • Real usage is difficult to estimate
    • Tool limitations may be discovered
  • Different processes have different impacts: what users do can be as important as, or more important than, how many users are doing it
    • Include high-use processes (80/20 rule)
    • Include high-risk processes
    • Include “different” processes
  • Modularize scripts: simplify script maintenance -- but only when you intend to run the script again
  • Data randomization is not always a good thing: randomization can make result comparison difficult
  • Code error detection and handling
    • Don’t assume that your tool will handle errors
    • Catch and report errors when and where they happen
    • Realize that errors may change simulated user activity
  • Know your tools and test environment
    • Tool’s supported protocols and licensing
    • Load generator and network resource usage
    • Load balancing and caching mechanisms
    • Beware of test and production environment differences
  • Try to translate results into stories that matter to the applicable stakeholders
    • Tests are run to answer questions: don't overwhelm your audience with hundreds of numbers if they just want a "yes" or "no" answer
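The error-detection lesson above lends itself to a small sketch: never trust an HTTP 200 alone, check the page content too, and record enough context to know where and when the failure happened. The marker text and field names below are hypothetical:

```python
# Sketch: catch and report errors when and where they happen, rather
# than trusting the tool's default error handling. Names are illustrative.
import time

def check_response(status, body, expected_marker, user_id, step):
    """Return None on success, or a descriptive error record."""
    if status != 200:
        return {"when": time.time(), "user": user_id, "step": step,
                "error": f"HTTP {status}"}
    if expected_marker not in body:
        # A 200 with an error page in the body is still a failure.
        return {"when": time.time(), "user": user_id, "step": step,
                "error": "marker text missing"}
    return None

err = check_response(200, "<html>Sorry, an error occurred</html>",
                     "Order confirmed", user_id=17, step="checkout")
print(err["error"])  # -> marker text missing
```

A record like this also helps with the last point under error handling: when a simulated user hits an error, its subsequent activity may no longer match the model, and the log tells you exactly when that divergence began.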


and finally…


  • Most performance and load-related problems are due to software code or configuration, not hardware

    • Don’t throw more hardware at a software problem