Scalability Testing - A Guide and Checklist
You can build a web application that functions perfectly on your machine, yet if it doesn't scale, a sudden influx of traffic could cripple your whole system. That's why it's important to have measures in place that allow your application to quickly scale in order to handle these spikes in traffic. This guide to scalability testing will help you overcome such growing pains.
The importance of scalability
Scalability refers to the capacity of a network, system or process to adapt to sudden changes. Applications must be able to handle large increases in simultaneous users, data volume and other workloads. Scalability tests provide a way to simulate different scenarios to ensure that your application is ready for any situation.
Scalability testing is incredibly important because performance issues can result in ruinous first impressions that do irreparable damage to a brand. For example, if an online retailer is advertising a big sale, they should be prepared for increased traffic. If their website doesn't scale, it could crash, which would drive away potential customers who were excited about spending their money. In fact, at the time of writing, VIA Rail, a train company based in Canada, ran a promotion to celebrate Canada's 150th anniversary that proved so popular it brought their checkout process to a near halt.
The terms scalability, reliability and performance testing are sometimes used synonymously. Performance and workload are inextricably linked. There is no one-size-fits-all approach to improving scalability; what works best for a small website will not necessarily work for an MMORPG. Developers must consider the specific needs of their applications and their users. For example, if your application depends heavily on a database, then you must consider the relationship between the capacity of your database and the number of expected users.
Testing for scalability
The basic goal of a scalability test is simple: to determine at which point the application stops scaling, and then figure out how to fix it. However, the process isn't always as straightforward as it sounds. Developers must often make difficult decisions, compromises, and tradeoffs. For example, when building a PHP application, you may have to sacrifice some speed for scalability by writing a script that loads data in chunks rather than all at once.
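The chunked-loading tradeoff mentioned above can be sketched in a few lines. The example below uses Python rather than PHP purely for illustration, and the function names (`read_in_chunks`, `total_in_chunks`) and the default chunk size are assumptions, not part of any particular framework.

```python
from typing import Iterable, Iterator, List


def read_in_chunks(rows: Iterable[int], chunk_size: int) -> Iterator[List[int]]:
    """Yield successive lists of at most chunk_size rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunk: List[int] = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly shorter, chunk
        yield chunk


def total_in_chunks(rows: Iterable[int], chunk_size: int = 1000) -> int:
    """Sum all rows while holding only one chunk in memory at a time."""
    total = 0
    for chunk in read_in_chunks(rows, chunk_size):
        total += sum(chunk)
    return total
```

Processing one chunk at a time adds a little overhead per chunk, but memory use stays bounded by the chunk size instead of growing with the size of the data set.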
Testing can help you learn your application's user limit by assessing client-side degradation and end-user experience as well as server-side robustness and degradation under heavy loads. There are many factors to consider when testing for scalability, including:
- Response time
- Screen transitions
- Requests per second
- Network usage
- Memory usage
- The time it takes to execute tasks
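A few of the factors listed above (response time, requests per second, and memory usage) can be captured with Python's standard library alone. This is a minimal sketch: the `measure` helper and its result keys are hypothetical names, it only sees memory allocated by Python code in the current process, and `tracemalloc` itself slows execution somewhat, so treat the numbers as relative rather than absolute.

```python
import time
import tracemalloc
from typing import Callable, Dict


def measure(task: Callable[[], object], runs: int = 5) -> Dict[str, float]:
    """Run task several times and report timing and peak memory."""
    tracemalloc.start()
    started = time.perf_counter()
    for _ in range(runs):
        task()
    elapsed = time.perf_counter() - started
    # get_traced_memory() returns (current, peak) in bytes
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "avg_response_s": elapsed / runs,
        "requests_per_s": runs / elapsed if elapsed > 0 else float("inf"),
        "peak_memory_bytes": float(peak_bytes),
    }
```

For example, `measure(lambda: sorted(range(100_000)), runs=10)` returns a dictionary with the three metrics for that task.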
Setting goals for scalability tests
Ideally, you want your application to work perfectly for every user all the time; however, that goal is both nebulous and impossible to achieve. Therefore, you should aim to meet more realistic goals.
A concrete goal sounds more like "the application should be accessible with minimal delay in response time." You could even get more specific and define "minimal." Another achievable goal is "the server shouldn't crash under heavy loads," although you should define what a "heavy load" is. "Maintain an average transaction time of four seconds at 100 simultaneous users," is an appropriately explicit goal.
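An explicit goal like the last one can be written down in a form that a test script can check automatically. The sketch below encodes "an average transaction time of four seconds at 100 simultaneous users"; the `ScalabilityGoal` class and `meets_goal` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class ScalabilityGoal:
    simultaneous_users: int
    max_avg_transaction_s: float


def meets_goal(goal: ScalabilityGoal, users: int,
               transaction_times_s: Sequence[float]) -> bool:
    """Return True if the measured average is within the goal."""
    if users < goal.simultaneous_users:
        # A run with fewer users than the goal specifies proves nothing.
        raise ValueError("test ran with fewer users than the goal requires")
    return mean(transaction_times_s) <= goal.max_avg_transaction_s


# The goal quoted in the text: four seconds on average at 100 users.
goal = ScalabilityGoal(simultaneous_users=100, max_avg_transaction_s=4.0)
```

Calling `meets_goal(goal, 100, measured_times)` after a test run then gives a simple pass or fail answer.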
Different applications will need to be tested for different parameters, and successful testing should help you determine if performance problems are related to your network, your databases or your other hardware and software.
Preparing for scalability testing
If you're building a new web application, you have no way of knowing the number of simultaneous users you'll have years from now. Although you obviously want your project to be as successful as possible, your problems could multiply as you gain users. For instance, if you have a database that stores data about all newcomers, a sudden surge of customers could slow things down for everyone.
Since different scenarios create different challenges, scalability testing should be conducted in increments of small, medium, and large loads. Everything should work seamlessly under small loads, so your first round of testing will give you an idea of your application's baseline performance. A medium load gives you a more realistic sense of how the application will usually run, and a large load test shows you how it will cope under pressure. You are likely to encounter different obstacles along each step.
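The staged approach described above might look something like the following sketch. It simulates users with threads and uses a stand-in `fake_request` that merely sleeps; both the stage sizes in `STAGES` and all function names are assumptions for illustration. A real test would replace `fake_request` with a call to your application and would more likely use a dedicated load testing tool.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from typing import Callable, Dict


def timed_call(request: Callable[[], object]) -> float:
    """Return how long a single request took, in seconds."""
    started = time.perf_counter()
    request()
    return time.perf_counter() - started


def run_load_stage(request: Callable[[], object], users: int) -> float:
    """Issue one request per simulated user concurrently; return the average time."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        durations = list(pool.map(timed_call, [request] * users))
    return mean(durations)


def run_stages(request: Callable[[], object],
               stages: Dict[str, int]) -> Dict[str, float]:
    """Run each named stage in order and collect average response times."""
    return {name: run_load_stage(request, users) for name, users in stages.items()}


def fake_request() -> None:
    """Placeholder for a real request to the application under test."""
    time.sleep(0.01)


# Hypothetical stage sizes; choose values that match your own traffic.
STAGES = {"small": 5, "medium": 20, "large": 50}
```

Running `run_stages(fake_request, STAGES)` returns one average response time per stage, which gives you a baseline to compare against after each change.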
You must be sure to use the same environment throughout testing so that you get reliable results. The graph below illustrates the ideal relationship between memory usage and time. You should expect an initial ramp-up phase that quickly levels out and remains consistent. If you get a graph that looks like this, then you should have enough memory to handle all three stages of testing.
The following graph represents the relationship between how long it takes to execute a report and the number of users:
The x-axis represents the number of users while the y-axis represents response time in seconds. As you can see, the results are almost perfectly proportional. A load of 20 users gives you an average response time of 5.5 seconds. If you increase the load to 60 users, the response time rises to 18 seconds.
If the outcome is not proportional, then you will experience bottlenecks with more simultaneous users. You may have to make adjustments to your server, software or hardware to achieve your desired results. After every round of changes, you must perform testing all over again to make sure problems have been properly addressed.
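One simple way to check proportionality is to compare the response time per user across load levels. The sketch below uses the two data points quoted above (20 users at 5.5 seconds, 60 users at 18 seconds); the 15 percent tolerance is an arbitrary assumption, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def seconds_per_user(points: Sequence[Tuple[int, float]]) -> List[float]:
    """Convert (users, response_seconds) pairs into seconds per user."""
    return [seconds / users for users, seconds in points]


def is_roughly_proportional(points: Sequence[Tuple[int, float]],
                            tolerance: float = 0.15) -> bool:
    """True if the largest per-user time is within tolerance of the smallest."""
    ratios = seconds_per_user(points)
    return max(ratios) <= min(ratios) * (1 + tolerance)


# Figures from the text: (users, average response time in seconds).
measurements = [(20, 5.5), (60, 18.0)]
```

Here the per-user times are about 0.275 and 0.3 seconds, a difference of roughly 9 percent, so `is_roughly_proportional(measurements)` is true under the assumed tolerance. A result such as 30 seconds at 60 users would fail the check and point to a bottleneck.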
Vertical vs horizontal scaling
After your first round of testing, you can start upgrading your software and hardware by scaling up or scaling out. Scaling up and scaling out are not the same thing.
- Scaling up, or vertical scaling, involves replacing a component of your system with something that works better. Trading your processor for a faster one is scaling up.
- Scaling out, or horizontal scaling, means adding components. Bringing in a new server to share the workload with your current server is scaling out.
Both approaches to scaling come with their own advantages and disadvantages. While scaling up is theoretically simpler than scaling out, constantly upgrading hardware eventually produces diminishing returns. Each processor upgrade provides less benefit than the previous upgrade. On the other hand, horizontal scaling might give you better results, but it can get expensive fast when you factor in the cost of maintenance.
The scalability test checklist
Below is a basic checklist for the scalability testing process:
- Pick a repeatable process for conducting your scalability tests during the application's lifecycle.
- Define your scalability criteria.
- Make a list of tools you'll need to run tests.
- Establish the testing environment and configure any hardware you need to perform testing.
- Plan your test scenarios.
- Create and verify your visual scripts and load test scenarios.
- Execute your load tests.
- Analyze your results and generate reports.
The information in your reports will provide guidance for making improvements. For example, if your company's website is anticipating a 400 percent increase in traffic over the next month (five times its current volume), you may want to give your server performance a boost to minimize request processing time. Possible options include upgrading your server hardware to allow for additional RAM. Alternatively, you could switch to different server software.
Scalability testing should help you accurately project how changes in hardware and software will impact your server performance. That way you can gauge whether or not investing in upgrades is really worth it.
Designing scalability tests
Scalability test results are only helpful if the tests are designed correctly. Proper scalability testing consists of a series of load tests using a number of hardware or software specifications while maintaining a consistent testing environment. Variables you have control over include CPU speed, server types, and available memory. These are the essential steps to designing adequate tests:
- Come up with possible user scenarios. You can modify them in various ways. Make sure to verify each scenario so that requests get recorded properly.
- Design a load test with a set number of virtual users. Configure the settings to account for different bandwidths and browsers.
- Run your test to simulate user requests.
- Tweak your software or hardware.
- Repeat testing until you reach your desired outcome.
Make sure to specify the same number of users and the same settings whenever you repeat tests. IBM has a terrific guide to designing performance tests with more details.
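The run, tweak, and repeat cycle above can be sketched as a small loop that holds the test settings fixed between runs. Everything here is hypothetical: `LoadTestSettings`, `tune_until_goal`, and the idea of passing tweaks as callables are illustrative assumptions, and `run_test` stands in for whatever tool actually executes your load test.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)  # frozen: settings cannot change between runs
class LoadTestSettings:
    virtual_users: int
    bandwidth_kbps: int
    browser: str


def tune_until_goal(
    settings: LoadTestSettings,
    run_test: Callable[[LoadTestSettings], float],
    tweaks: Sequence[Tuple[str, Callable[[], None]]],
    target_s: float,
) -> List[Tuple[str, float]]:
    """Apply one tweak at a time, re-running the same test after each."""
    history = [("baseline", run_test(settings))]
    for label, apply_tweak in tweaks:
        if history[-1][1] <= target_s:
            break  # goal reached, no further changes needed
        apply_tweak()
        history.append((label, run_test(settings)))
    return history
```

Because each entry in the returned history corresponds to exactly one change measured under identical settings, it is easy to see which tweak helped and by how much.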
Tips and best practices for improving scalability
- Offload your database by limiting open connections and transactions. However, don't go overboard loading everything into the app layer, or else you could face other performance issues.
- Caches can help significantly with offloading resources. Consider implementing a CDN to help take some of the load off of your origin server and place it on the CDN's edge servers for even faster performance.
- There is no need to store transient data permanently in a database. Store only necessary data that helps improve your business or application.
- Restrict access to your limited resources. If you have multiple requests for the same resource that performs the same calculation, let each one finish before the next one starts. Otherwise, the process will slow down for everyone.
- Breaking processes into asynchronous stages and separating them in queues to be executed by a minimal number of workers can give you a major performance boost.
- Network communications take longer than in-memory communications, so limit the chatter between your application and your network.
- Alter just one variable at a time. This may sound time-consuming, but changing too much at once could make your application's performance worse, and then you'll have to backtrack to figure out what worked and what didn't.
- Reset everything you can before executing a test to make sure that previous tests do not influence your current one. It is generally recommended to restart your entire software system, but you can usually leave your hardware running.
- Whenever you come back for more testing, be sure to execute a baseline test. Don't rely on data from six months ago for comparison. Your disk could have become fragmented during that time, or you could have added software that consumes more CPU cycles and memory.
- Automate as much of the testing process as you can. That way, you can spend your working hours analyzing the tests that were conducted during off-hours. Automation also ensures that testing and retesting are performed consistently with the same settings.
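As an illustration of the queueing tip above, the following sketch pushes jobs onto a queue and lets a small, fixed number of worker threads drain it. The function name `process_with_workers` is hypothetical, and the sketch assumes that `None` is never a real job (it is used as a stop signal) and that the handler does not raise exceptions.

```python
import queue
import threading
from typing import Callable, Iterable, List


def process_with_workers(jobs: Iterable[object],
                         handler: Callable[[object], object],
                         worker_count: int = 2) -> List[object]:
    """Process jobs from a queue using a fixed number of worker threads."""
    pending: "queue.Queue[object]" = queue.Queue()
    results: List[object] = []
    results_lock = threading.Lock()

    def worker() -> None:
        while True:
            job = pending.get()
            if job is None:  # sentinel: no more work for this worker
                return
            outcome = handler(job)
            with results_lock:  # protect the shared results list
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for job in jobs:
        pending.put(job)
    for _ in threads:  # one sentinel per worker, queued after all jobs
        pending.put(None)
    for thread in threads:
        thread.join()
    return results
```

Results arrive in completion order rather than submission order, so sort them or attach an identifier to each job if ordering matters.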
Scalability testing tools
The best testing tools for you will depend on your project and your preferences. Fortunately, you'll find no shortage of free tools for testing performance, reliability, and scalability. We've written about a few testing tools in the past in our website performance optimization article.
Additionally, TechBeacon has an impressive list of 12 open source performance testing tools. BlazeMeter also has a detailed comparison of popular performance testing tools such as Gatling, Tsung, JMeter and The Grinder.
Scalability shouldn't be seen as optional; it's pivotal to the success of any growing web application. An old piece of business advice goes, "If you're not growing, you're dying." This adage is certainly true; however, if you are growing, you need to make sure that you're thriving. Businesses that don't anticipate large increases in demand may find themselves facing hard times, so run those scalability tests to make sure your application is ready for whatever gets thrown at it.