Frequently Asked Questions
What is an Average Load Test?
A computer system has a typical level of activity in terms of end user interactions and the demand these interactions place on day-to-day operation. An average load test sets out to see how the system copes with this average activity. This test profile is often used as a standard or baseline to measure changes in system capacity as the computer's code and platform develop over time.
As with all performance testing, the key is to understand the number of end users anticipated in live use and to apply these figures to the average load test. The easiest way to do this is to look at how your system is behaving now. If you are implementing a new system, look at the business plan and architectural requirements, and estimate how the system will be used.
It is important to consider the 'concurrency' level of the system and to set a realistic set of goals. We discuss end user concurrency in our FAQs.
We would typically suggest having your virtual users load the system gradually (called a ramp-up) and then repeat their selected business executions, or journeys, for an hour or two.
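As a rough illustration, a ramp-up followed by a steady period can be described as a function of elapsed time. This is only a sketch: the function name, the linear ramp and the figures used (20 virtual users, a 10 minute ramp-up, a 60 minute hold) are assumptions chosen for the example, not BluCap settings.

```python
def virtual_users_at(minute, target_users, ramp_minutes, hold_minutes):
    """Return how many virtual users should be active at a given minute.

    Assumes a linear ramp-up to target_users followed by a steady hold.
    """
    if minute < 0 or minute > ramp_minutes + hold_minutes:
        return 0  # outside the test window
    if minute < ramp_minutes:
        # Linear ramp: a proportional share of the target, rounded down.
        return (target_users * minute) // ramp_minutes
    return target_users  # steady state: every user repeats their journey


# Hypothetical average load test: 20 users, 10 minute ramp, 60 minute hold.
for m in (0, 5, 10, 70):
    print(m, virtual_users_at(m, target_users=20, ramp_minutes=10, hold_minutes=60))
```

A linear ramp is the simplest choice; a stepped ramp (adding users in batches) would serve equally well.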
Things to look for
Your system will need to provide response times that are acceptable to your end users. These vary by system type and customer. Your end users will also expect a certain success rate for the requests they make. For example, searching for a product should return the product search results page rather than a system error. The percentage of end user calls returning the expected page is typically between 95% and 100%, and this is the range we'd hope you want to achieve.
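The two checks described here, response time and success rate, might be evaluated from raw test results along the following lines. This is a minimal sketch under stated assumptions: treating HTTP 2xx and 3xx codes as "the expected page", using the 90th percentile as the response time measure, and the 3 second limit are all illustrative choices, and the function names are hypothetical. Only the 95% success figure comes from the text above.

```python
import math


def success_rate(status_codes):
    """Percentage of requests that returned the expected page.

    Assumption: any HTTP 2xx or 3xx status counts as a success.
    """
    if not status_codes:
        raise ValueError("no requests recorded")
    ok = sum(1 for code in status_codes if 200 <= code < 400)
    return 100.0 * ok / len(status_codes)


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not values:
        raise ValueError("no samples recorded")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_criteria(status_codes, response_seconds,
                   min_success_pct=95.0, max_p90_seconds=3.0):
    """True when both the success rate and response time goals are met."""
    return (success_rate(status_codes) >= min_success_pct
            and percentile(response_seconds, 90) <= max_p90_seconds)
```

A percentile is used rather than a plain average so that a handful of very slow pages is not hidden by many fast ones.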
In addition, the machines running the system should all be behaving within performance and capacity thresholds. Measuring, understanding and setting acceptance criteria around this will depend on the hosting or infrastructure providers. BluCap can, of course, provide performance analysis services to assist in this area.
What is a Peak Load Test?
Most computer systems experience variations in the amount of traffic they have to cope with over time. For example, a children's toy online shop may have on average 20 people using it, but two weeks before Christmas this may jump to 200 people at a time looking to buy presents for their children.
The maximum volume of end users for a peak load test can be difficult to estimate. The simplest way to do so is to look at the history of your current application, or research similar applications.
When setting your peak load levels, it is also a good idea to understand how use of your system will grow over time. If, for example, we are starting up an online shop, 'BluTop Toys', then as our customer base grows over time, so will the demand on the online store system we are implementing. We typically see customers plan their capacity in three to five year terms.
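To show how growth feeds into a peak load figure, the sketch below projects a peak user count over a five year planning term. The 20% annual growth rate and the function name are assumptions made purely for illustration; the starting figure of 200 users is the Christmas peak from the toy shop example.

```python
def projected_peak_users(current_peak, annual_growth_pct, years):
    """Project peak concurrent users for each year, assuming steady compound growth."""
    factor = 1 + annual_growth_pct / 100
    return [round(current_peak * factor ** year) for year in range(years + 1)]


# Hypothetical: 200 peak users today, growing by 20% a year for five years.
for year, users in enumerate(projected_peak_users(200, 20, 5)):
    print(f"year {year}: {users} peak users")
```

Under these assumptions, a peak load test sized for today's 200 users would need to cover roughly two and a half times that number by the end of a five year plan.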
The peak load test will most likely increase the virtual user load over a short time, and run at the maximum number of users for one or two hours.
Things to look for
The key information you will gain from a peak load test is very similar to that from an average load test. All of the assumptions about how long it takes to return the information the customer wants still stand here, although you may want to think about the tolerances to cover high use periods.
The success rates for the end users are also important, as errors in responses from the system are likely to increase with the number of end users a system has to cope with. Bear in mind that there are two ways to assess the success rate as a user sees it:
- A single page returns an error, and the customer knows to press refresh and carries on successfully. This customer has managed to complete their journey, but you are faced with a loss of credibility as the computer system 'glitches'. We normally represent this as the success rate for all hits.
- The end user is a customer, and when they experience the error, they give up and take their business elsewhere. We can think of this as a success rate for whole journeys. Obviously this case has a greater impact on current and forward sales for online store applications.
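The difference between the two success rates can be seen in a small calculation. This is a sketch only; the data layout (each journey as a list of pass or fail results per hit) and the function name are assumptions, and a journey is counted as failed if any one of its hits failed, which matches the second case above.

```python
def hit_and_journey_success(journeys):
    """Return (success rate for all hits, success rate for whole journeys).

    journeys: a list of journeys, each a list of booleans where True means
    the hit returned the expected page.
    """
    hits = [ok for journey in journeys for ok in journey]
    if not hits:
        raise ValueError("no hits recorded")
    hit_rate = 100.0 * sum(hits) / len(hits)
    # A journey only succeeds if every hit within it succeeded.
    journey_rate = 100.0 * sum(all(j) for j in journeys) / len(journeys)
    return hit_rate, journey_rate


# Four journeys of five hits each; two journeys hit a single error.
example = [
    [True, True, True, True, True],
    [True, True, False, True, True],
    [True, True, True, True, True],
    [True, False, True, True, True],
]
print(hit_and_journey_success(example))
```

In this example 90% of hits succeed, yet only 50% of journeys complete without an error, which is why the journey figure is usually the more sobering of the two.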
Peak load tests also provide a good mechanism to check the capacity of the system running the site, in terms of machine behaviour. BluCap provides a consultancy service to assist in this if required.
What does the BluCap report look like?
Click here to view a sample report
What is a Stress Test (or Destructive Test or Break Test)?
There are two principal objectives of a stress test. These are to see how many end users or how much load a system can take, and to identify which components fail first and cause the performance problems or 'bottlenecks'.
The format of a stress test is usually to apply increased load over time. There are two approaches; each has its benefits and drawbacks.
- Run a single test, gradually increasing virtual users up to the failure point. This is the least time-consuming approach; however, it does not always give the most accurate results. The reason is that as we add more virtual users to a load test, they typically start their user journeys with a log-on or home page. As the system slows down under load, these pages are the ones handling the most requests, and their share of activity tends to become inflated. When planning a stress test, it can also be difficult to estimate the maximum number of virtual users you want to test to. If a test is planned with a limit of 300 virtual users and the system copes, another stress test will be required with a higher number of virtual users.
- Schedule several tests, each with a gradual loading period and a peak load period, and each at a higher load level than the last. At some point, the load level will cause a system failure, and we have identified the point at which the test cannot continue. The benefit is that we can measure scalability (the behaviour of the system in terms of response times and machine behaviour as the load levels increase) by comparing each test running at its peak level, where the virtual users settle into a repeatable loop and do not clump together. Of course, this approach takes more time, and consequently costs more, as it involves repeating tests. If the user journeys being tested use up or 'burn' data (registration details fed in, or products bought and removed from a stock management system), this data will need restoring or removing from the system between tests.
Things to look for
The failure point of a test will normally be identified by three factors:
- The success criteria for your test, in terms of the time taken for pages to be returned and the errors the end users encounter, are not met.
- The system simply fails to respond, or a component stops functioning.
- The capacity and scalability of the underlying machines are exceeded; typically the central processing unit (CPU) gets too busy or the memory on the servers runs low.
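The three factors above can be checked against the results of a series of stress tests to find the first load level that fails. The sketch below is illustrative only: the record fields, the function name and every threshold (3 seconds, 95% success, 85% CPU) are assumptions, and real criteria would come from your own test plan.

```python
def first_failure(results, max_response_seconds=3.0,
                  min_success_pct=95.0, max_cpu_pct=85.0):
    """Return (users, reasons) for the lowest load level that fails, or None.

    results: one dict per test run with the keys 'users', 'responding',
    'response_seconds', 'success_pct' and 'cpu_pct' (hypothetical layout).
    """
    for result in sorted(results, key=lambda r: r["users"]):
        reasons = []
        if not result["responding"]:
            reasons.append("system stopped responding")
        if result["response_seconds"] > max_response_seconds:
            reasons.append("response time too slow")
        if result["success_pct"] < min_success_pct:
            reasons.append("too many errors")
        if result["cpu_pct"] > max_cpu_pct:
            reasons.append("CPU too busy")
        if reasons:
            return result["users"], reasons
    return None  # the system coped at every level tested


runs = [
    {"users": 100, "responding": True, "response_seconds": 1.2,
     "success_pct": 99.8, "cpu_pct": 40.0},
    {"users": 200, "responding": True, "response_seconds": 2.1,
     "success_pct": 99.1, "cpu_pct": 65.0},
    {"users": 300, "responding": True, "response_seconds": 4.5,
     "success_pct": 96.0, "cpu_pct": 92.0},
]
print(first_failure(runs))
```

A result of None corresponds to the situation described earlier, where the system copes and a further test at a higher load is needed.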
What is a Duration Test (or Soak Test)?
Testing a computer system for an hour with the most expected virtual users is a proven way to understand a system's behaviour. However, we often see that computer code, internal routines, or database structures start to have a detrimental effect on the response times and stability of the system, as it continues to run.
A soak test has two main objectives.
- Illustrate that the end user response times do not degrade over time.
- Identify detrimental trends in technical capacity over time, and assist in rectifying them.
The approach typically taken in a soak test is to gradually add virtual users over a period of time and then continue the test loading for a long time, such as 8 to 24 hours.
The load levels chosen are normally higher than for an average load test, as high activity levels will expose technical trends more quickly than lower ones.
Things to look for
To understand how an end user's experience changes over time, assess the response times for pages over each hour, looking at the standard deviation and for events which may affect them. These can also help technicians look for background processes which may impact performance.
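An hour-by-hour summary of this kind could be produced as follows. This is a minimal sketch, assuming each sample is recorded as the elapsed seconds since the test started and the page response time in seconds; the function name and data layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def hourly_summary(samples):
    """Group response times by hour of the test.

    samples: iterable of (elapsed_seconds, response_seconds) pairs.
    Returns {hour: (mean response time, standard deviation)}.
    """
    by_hour = defaultdict(list)
    for elapsed, response in samples:
        by_hour[int(elapsed // 3600)].append(response)
    return {hour: (mean(times), pstdev(times))
            for hour, times in sorted(by_hour.items())}


# Hypothetical samples: the second hour is both slower and more variable.
samples = [(0, 1.0), (1800, 3.0), (3600, 2.0), (5400, 6.0)]
for hour, (avg, spread) in hourly_summary(samples).items():
    print(f"hour {hour}: mean {avg:.2f}s, standard deviation {spread:.2f}s")
```

A mean that climbs from hour to hour suggests degradation, while a sudden jump in the standard deviation for a single hour points to an event, such as a background process, worth investigating.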
Machines running custom code can have problems managing memory or processes. A soak test analysis would typically look at memory used over the peak load period, and the number of processes running. If memory is seen to be running out over time, then looking to see which processes use more and more memory will help identify the culprits.
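One way to spot processes that use more and more memory is to fit a straight line to each process's memory readings and rank them by slope. The sketch below assumes readings are held as (hours elapsed, megabytes used) pairs per process; the process names and figures are invented for the example.

```python
def slope_per_hour(readings):
    """Least-squares slope of (hours_elapsed, value) readings."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings")
    mean_x = sum(x for x, _ in readings) / n
    mean_y = sum(y for _, y in readings) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in readings)
    denominator = sum((x - mean_x) ** 2 for x, _ in readings)
    if denominator == 0:
        raise ValueError("readings must span more than one point in time")
    return numerator / denominator


def rank_by_memory_growth(readings_by_process):
    """Return (process, MB per hour) pairs, fastest growing first."""
    growth = {name: slope_per_hour(r) for name, r in readings_by_process.items()}
    return sorted(growth.items(), key=lambda item: item[1], reverse=True)


# Hypothetical readings in megabytes, taken hourly during a soak test.
readings = {
    "cache": [(0, 200), (1, 200), (2, 200)],
    "app_server": [(0, 500), (1, 550), (2, 600)],
}
print(rank_by_memory_growth(readings))
```

A process whose memory grows steadily and never falls back is the usual sign of a leak; one that rises and then levels off may simply be warming up.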
What is a Spike Test (or Surge Test)?
Surge testing is used to assess how a system copes with a sudden input of additional activity, which is then removed. For example, our online toy store 'BluTop Toys' runs an advertising campaign, which consists of a single advert during children's hour on Friday afternoons. The expected impact of that may be a rise in requests to 50% above average, which then eases off into the evening. BluTop Toys may wish to see that, as the system quietens down, the end users browsing the store later on are not slowed down by the previous activity.
The virtual user load will be increased over time to run at 'average' or 'peak' expected volumes. The load will typically be permitted to settle for up to 30 minutes, and then a large number of new virtual users will be added in a short period of time. These virtual users will run for a short period and then stop generating load. The 'base load' virtual users will then continue for between 30 minutes and two hours.
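The shape of a spike test can be written down as a function of elapsed time, with the spike users layered on top of the base load. The sketch below is an assumption-laden illustration: the function name, the linear base ramp, and the figures (20 base users, 10 spike users to give a 50% rise, a 5 minute spike) are chosen only for the example.

```python
def spike_profile(minute, base_users, ramp_minutes, settle_minutes,
                  spike_users, spike_minutes):
    """Active virtual users at a given minute of a spike test.

    Assumes the base load ramps up linearly, settles, and then the spike
    users are all added at once and all removed at once.
    """
    if minute < 0:
        return 0
    if minute >= ramp_minutes:
        base = base_users
    else:
        base = (base_users * minute) // ramp_minutes
    spike_start = ramp_minutes + settle_minutes
    in_spike = spike_start <= minute < spike_start + spike_minutes
    return base + (spike_users if in_spike else 0)


# Hypothetical: 20 base users, 10 minute ramp, 30 minute settle,
# then 10 extra users for 5 minutes.
for m in (5, 39, 40, 44, 45):
    print(m, spike_profile(m, 20, 10, 30, 10, 5))
```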
Things to look for
User page response times should be assessed from the start of the peak period and scrutinised over the spike point. If they slowed down, check that they recovered shortly after the spike users stopped their activity. Having a baseline to compare the response times against is very useful here.
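Recovery after the spike can be measured against a baseline in a few lines. This is a sketch under stated assumptions: response times are taken as per-minute averages starting from the moment the spike users stop, "recovered" means staying within 10% of the baseline for the rest of the test, and the function name is hypothetical.

```python
def minutes_to_recover(post_spike_times, baseline_seconds, tolerance_pct=10.0):
    """Minutes after the spike ends until response times return to baseline.

    post_spike_times: per-minute average response times, starting at the
    minute the spike users stop. Returns the first minute from which every
    later value stays within the tolerance, or None if that never happens.
    """
    limit = baseline_seconds * (1 + tolerance_pct / 100)
    for minute in range(len(post_spike_times)):
        if all(t <= limit for t in post_spike_times[minute:]):
            return minute
    return None


# Hypothetical: baseline of 1.0s; times settle two minutes after the spike.
print(minutes_to_recover([2.5, 1.8, 1.05, 1.0], baseline_seconds=1.0))
```

Requiring every later value to stay within the tolerance avoids declaring recovery on a single good minute that is followed by further slow ones.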
The machine level statistics will also help in identifying if the spike damages anything running behind the scenes, and if response times are slow, assist in identifying the cause.
What is External Influence or User Touch Testing?
Many online systems run with connections, or interfaces, to other systems, or have business-critical components which need proving under load but which, being infrequent events, do not warrant the use of performance tools.
For example 'BluTop Toys' have a monthly accounting period, and the sales team produce reports for the management team to identify sales by department and by advertising campaign. These reports are expected to run in the afternoon on the last Friday of every month.
The average load would be generated over a period of time, and once the peak load level was reached, end users, user acceptance test technicians or system administrators would start their testing, noting the start time of each specific activity and how long it took. When the testing is completed, the performance load would be stopped.
Things to look for
We would normally look to see what impact the external activity has on the page response times the loaded virtual users were experiencing, and also check that the external activities were completed within their expected timescales.
Reporting the specific manual or triggered activity timing against any differences reported in response times helps to identify the dependencies in end user activity and can be cross checked against anomalies seen in what the computers are actually doing.
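Cross-checking the noted activity timings against response times can be as simple as comparing the average response time inside and outside the recorded windows. The sketch below assumes response samples are stored as (minute of test, response time in seconds) pairs and that testers noted each activity as a (start minute, end minute) window; both layouts and the function name are hypothetical.

```python
from statistics import mean


def impact_of_activity(samples, windows):
    """Compare response times during and outside the manual activities.

    samples: (minute, response_seconds) pairs from the virtual users.
    windows: (start_minute, end_minute) pairs noted by the testers.
    Returns (mean during activity, mean otherwise); either value is None
    when no samples fall in that group.
    """
    during, otherwise = [], []
    for minute, response in samples:
        if any(start <= minute <= end for start, end in windows):
            during.append(response)
        else:
            otherwise.append(response)
    return (mean(during) if during else None,
            mean(otherwise) if otherwise else None)


# Hypothetical: a report was run between minutes 15 and 30 of the test.
samples = [(0, 1.0), (10, 1.0), (20, 3.0), (25, 5.0), (40, 1.0)]
print(impact_of_activity(samples, windows=[(15, 30)]))
```

A clear gap between the two averages indicates that the manual activity and the normal end user traffic depend on the same resources.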