A Complete Guide to ab (Apache Bench) for Fast Load Testing

So, you want to see how your web server holds up under pressure? Getting started with load testing can feel intimidating, but it doesn’t have to be. The ab tool, short for Apache Bench, is one of the quickest and most straightforward ways to get a baseline on your application’s performance.
It’s a no-frills command-line program that comes bundled with the Apache HTTP Server project, which means it’s widely available and incredibly easy to pick up.
Your First Steps With AB Apache Benchmark
Installing Apache Bench
First things first, you need to get ab on your machine. Since it’s part of the standard Apache utilities, installing it is usually just a single command.
- On Debian/Ubuntu: Pop open a terminal and run sudo apt-get install apache2-utils.
- On CentOS/RHEL: The command you’re looking for is sudo yum install httpd-tools.
- On macOS: If you’re using Homebrew, it’s as simple as brew install httpd.
Once that’s done, you can confirm it’s ready to go by typing ab -V. This should print out the version information, letting you know it’s installed correctly.
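If you want to script that check, for example at the top of a benchmark script, a minimal sketch might look like the following. The helper name have_cmd is made up for illustration; command -v is the standard shell way to ask whether a program is on your PATH.

```shell
# Return success if the named command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd ab; then
    ab -V    # prints the ApacheBench version banner
else
    echo "ab not found on PATH; install it with your package manager" >&2
fi
```

Failing early like this is friendlier than letting a long script die halfway through with a "command not found" error.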
Running Your First Basic Test
Your first test is refreshingly simple. You only need two flags to get going: -n for the total number of requests you want to send, and -c for the concurrency level.
Think of concurrency (-c) as the number of virtual users hitting your site at the same time.
Let’s say you want to send 100 total requests with 10 concurrent users hammering your homepage. The command would look like this:
ab -n 100 -c 10 https://www.yourwebsite.com/
This tells ab to fire off 100 requests as fast as it can, while making sure 10 requests are always active at any given moment. The output you get back gives you immediate, raw feedback on how your server is handling that load. For a deeper dive, the official documentation has a complete list of all the flags you can play with.
Beyond -n and -c, ab stays simple while still giving you a good deal of control. You can set timeouts (-s), enable Keep-Alive to simulate more realistic browser behavior (-k), and much more.
Key Takeaway: The real power of ab is its simplicity. In a single command, you can get a meaningful performance snapshot. That’s invaluable for a quick gut check after a new deployment or a server configuration tweak.
Don’t underestimate the value of a simple test like this. In a well-documented WordPress performance case study, an unoptimized server could only handle 5.79 requests per second. After adding a basic caching layer, that same server jumped to 20.74 requests per second, roughly 3.6 times the original throughput (a 258% improvement), which they discovered with this kind of straightforward benchmarking. It just goes to show how ab can instantly validate the impact of your optimization efforts.
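When you compare two runs, it helps to be explicit about whether you are quoting a speedup factor or a percentage increase, since the two are easy to mix up. The sketch below works through the case study figures; the helper names pct_change and speedup are hypothetical.

```shell
# pct_change BEFORE AFTER: percentage increase from BEFORE to AFTER.
pct_change() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

# speedup BEFORE AFTER: how many times larger AFTER is than BEFORE.
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", after / before }'
}

pct_change 5.79 20.74   # 258.2
speedup 5.79 20.74      # 3.58
```

So going from 5.79 to 20.74 requests per second is about 3.58 times the throughput, which is an increase of about 258%.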
How to Make Sense of Your AB Benchmark Results
Running an ab test is the easy part. The real work starts when you’re staring at that wall of text it spits out. Buried in those numbers is everything you need to know about your server’s health, but you have to know where to look.
The first number everyone gravitates to is Requests per second. It’s a great starting point, but it’s only half the story. A high request rate doesn’t mean much if your latency is through the roof—your users are still having a miserable time. You need to look at both throughput and response times together to get the full picture.
A typical ab report ties back to the two foundational parameters of any test: the total number of requests (-n) and the number of concurrent users (-c).
Digging Into the Key Metrics
To find actionable insights, you need to scroll down to the “Connection Times” section. This is where you can start pinpointing bottlenecks.
- Connect: This is how long it took to establish the initial TCP connection. If this number is high, you might be looking at a network problem between your test machine and the server.
- Processing: This is the time from when the connection was established until the complete response was received. It covers both the server’s work and the transfer of the response.
- Waiting: Pay close attention to this one. It’s the time between sending the request and receiving the first byte of the response, which is effectively the time your server spent thinking: processing the request and getting the response ready. High Waiting times almost always point to slow database queries, inefficient code, or a CPU that’s struggling to keep up.
A high Waiting time is your clearest signal that the bottleneck is in your application logic, not the network.
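If you save a report to a file, you can pull these rows out with a line of awk. This is a rough sketch: the sample numbers are invented, the helper name conn_mean is hypothetical, and it assumes the column order min, mean, sd, median, max that ab prints under “Connection Times (ms)”.

```shell
# Sample "Connection Times" section; the numbers are invented for illustration.
sample_report=$(mktemp)
cat > "$sample_report" <<'EOF'
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    2   0.6      2       5
Processing:    20   45  10.2     43     120
Waiting:       18   40   9.8     39     110
Total:         21   47  10.3     45     125
EOF

# conn_mean ROW FILE: print the mean (ms) for Connect, Processing, Waiting, or Total.
conn_mean() {
    awk -v row="$1:" '$1 == row { print $3; exit }' "$2"
}

echo "connect=$(conn_mean Connect "$sample_report")ms waiting=$(conn_mean Waiting "$sample_report")ms"
```

In this made-up report, Waiting (40 ms) dwarfs Connect (2 ms), which is the pattern that points at application logic rather than the network.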
To help you get the most out of your ab reports, here’s a quick breakdown of the most important metrics and what they’re telling you about your application’s performance.
Key AB Apache Benchmark Metrics Explained
| Metric | What It Means | What to Look For |
|---|---|---|
| Requests per second | Your application’s throughput. How many requests it can handle every second. | A higher number is better, but it must be balanced with low latency. |
| Time per request | The average time it took to complete a single request, including network overhead. | A consistently low number indicates a responsive system. |
| Waiting Time | The time your server spent processing the request (CPU, DB, etc.). | High values often point to application-level bottlenecks. |
| 95th Percentile (p95) | The response time within which 95% of requests completed. | A large gap between the median (50%) and p95 suggests long-tail latency issues. |
| Failed requests | The number of requests that failed to connect, broke off while receiving, or came back with an unexpected response length. HTTP error statuses are reported separately under Non-2xx responses. | Anything above 0 is a red flag worth investigating. Your server may be dropping connections or failing. |
Understanding these numbers is the first step toward building a truly resilient system. You can get a more in-depth look at essential performance testing metrics in our guide to see how they all fit together.
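Throughput and mean latency in an ab report are two views of the same measurement. As a rough sketch, assuming all -c slots stay busy for the whole run, the mean “Time per request” is approximately concurrency × 1000 / requests per second. The helper name mean_time_ms is made up for illustration.

```shell
# mean_time_ms RPS CONCURRENCY: approximate mean time per request in ms,
# assuming every concurrent slot is busy for the entire run.
mean_time_ms() {
    awk -v rps="$1" -v c="$2" 'BEGIN { printf "%.1f\n", c * 1000 / rps }'
}

mean_time_ms 200 10    # 50.0
mean_time_ms 200 100   # 500.0
```

The second call is the interesting one: the same 200 requests per second at ten times the concurrency means each request takes ten times as long, which is why throughput alone can hide a miserable user experience.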
Interpreting Latency and Failures
The percentile distribution table is a goldmine. It tells you the response time story for different segments of your user base.
For example, the 95% line shows the time within which 95% of all requests were completed. If this number is way higher than your median (50%) value, it means a small but significant chunk of your users are experiencing frustratingly slow responses.
Failed requests are a non-negotiable red flag. If you see any number other than zero here under a moderate load, your server is buckling. It could be hitting max connection limits, the application might be crashing, or requests are simply timing out.
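Both checks are easy to automate once a report is saved to a file. The sketch below uses an invented sample report and hypothetical helper names (pctl, failed); it assumes the line layout ab uses for the percentile table and the “Failed requests” line.

```shell
# Invented sample report fragment for illustration.
report=$(mktemp)
cat > "$report" <<'EOF'
Failed requests:        0
Percentage of the requests served within a certain time (ms)
  50%     44
  90%     70
  95%    110
  99%    180
 100%    220 (longest request)
EOF

# pctl P FILE: print the response time (ms) on the P% line.
pctl() { awk -v p="$1%" '$1 == p { print $2; exit }' "$2"; }

# failed FILE: print the failed request count.
failed() { awk '/^Failed requests:/ { print $3; exit }' "$1"; }

p50=$(pctl 50 "$report")
p95=$(pctl 95 "$report")
ratio=$(awk -v a="$p95" -v b="$p50" 'BEGIN { printf "%.1f", a / b }')
echo "p50=${p50}ms p95=${p95}ms ratio=${ratio}x failed=$(failed "$report")"
```

For this sample the p95 is 2.5 times the median. What ratio counts as “too high” depends on your application, so treat any threshold you pick as a judgment call rather than a rule.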
Once you’ve used ab to find the weak spots, the next step is to act on that data. After all, the goal is to improve page load speed and create a better user experience. That’s what turns ab from a simple command-line tool into a powerful ally.
Simulating Realistic Scenarios With Advanced Commands
Hitting your server with a flood of simple GET requests is a decent start, but it’s not how people actually use your app. Real-world traffic is messy. Users log in, submit forms, and hammer your APIs, which involves a lot more than just fetching a static page. If you want truly useful data from the ab apache benchmark tool, you need to mimic these more complex interactions.

This is where you graduate from basic commands and start using ab’s advanced flags. They transform it from a simple page-hitter into a surprisingly versatile tool, letting you craft requests that look a lot more like real application usage.
Testing APIs and Forms With POST Requests
One of the first things you’ll want to test is a form submission or an API endpoint that handles incoming data. A standard ab command won’t cut it, since it defaults to GET requests. The key here is the -p flag, which lets you send a POST request with a data payload.
Let’s say you have a login API that expects JSON. You’d start by creating a file, we’ll call it payload.json, with the login credentials:
{ "username": "testuser", "password": "password123" }
Then, you have to tell ab what kind of data you’re sending. You do this with the -T flag to set the Content-Type header. The final command to blast your API with 500 POST requests from 50 concurrent users looks like this:
ab -n 500 -c 50 -p payload.json -T 'application/json' https://api.yourwebsite.com/login
Suddenly, you’re not just hitting a URL; you’re simulating 50 users trying to log in at the same time, putting real pressure on your authentication service and database.
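Before launching the full run, it can be worth sending a single verbose request to confirm the endpoint accepts your payload. The sketch below writes the example payload to a file and defines a hypothetical helper, post_smoke_test, that uses ab’s -v flag (verbosity level 4 prints response headers). The URL is the placeholder from the example above, and the function is defined here but not executed.

```shell
# Write the example payload from the text to a file.
cat > payload.json <<'EOF'
{ "username": "testuser", "password": "password123" }
EOF

# post_smoke_test URL: send one POST with verbose output so you can inspect
# the response status and headers before starting the real test.
post_smoke_test() {
    ab -n 1 -c 1 -v 4 -p payload.json -T 'application/json' "$1"
}

# Example (not run here):
#   post_smoke_test https://api.yourwebsite.com/login
```

If the single request comes back with an error status, 500 concurrent copies of it will mostly measure how fast your server can reject requests.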
Simulating Modern Browser Behavior
Modern browsers are smart. They don’t waste time opening and closing a new connection for every single request. Instead, they use HTTP Keep-Alive to reuse the same connection for multiple requests, which cuts down on overhead dramatically.
By default, ab doesn’t do this. To make your benchmark even remotely realistic, you should almost always include the -k flag to enable Keep-Alive.
A test without Keep-Alive can give you an artificially pessimistic view of your server’s performance. It forces the server to spend way too much time on connection setup instead of actually processing requests. Just adding the -k flag brings your test much closer to what a real browser does.
For more complex setups, especially when dealing with load balancing on AWS, understanding these nuances is crucial. When your traffic is spread across multiple servers, Keep-Alive becomes even more important for measuring the performance of the entire stack, not just a single instance. And when synthetic tests aren’t enough, you might want to replay production traffic for realistic load testing.
Another powerful flag for deeper analysis is -g, which dumps the raw data into a file perfect for plotting. This is how you can perform percentile analysis to see the difference between your fastest and slowest requests, which is critical for finding and fixing those painful long-tail latencies. By feeding this data into a tool like gnuplot, you can visualize exactly when and how response times degrade as the load increases.
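You don’t need gnuplot for a first look at that file. The sketch below computes a nearest-rank percentile straight from the -g output, assuming the tab-separated layout starttime, seconds, ctime, dtime, ttime, wait, with ttime (column 5) being the total time in milliseconds. The sample rows are generated for illustration and gp_percentile is a hypothetical helper name.

```shell
# Build a small sample in the assumed -g layout (tab-separated, one header row).
gp=$(mktemp)
printf 'starttime\tseconds\tctime\tdtime\tttime\twait\n' > "$gp"
for t in 30 10 100 50 20 90 40 70 60 80; do
    printf 'Mon Jan 01 00:00:00 2024\t1704067200\t1\t%s\t%s\t%s\n' \
        "$((t - 1))" "$t" "$((t - 2))" >> "$gp"
done

# gp_percentile FILE P: nearest-rank P-th percentile of the ttime column.
gp_percentile() {
    tail -n +2 "$1" | cut -f5 | sort -n |
        awk -v p="$2" '{ v[NR] = $1 } END { i = int((NR * p + 99) / 100); if (i < 1) i = 1; print v[i] }'
}

gp_percentile "$gp" 50   # 50
gp_percentile "$gp" 95   # 100
```

Splitting on tabs matters here because the starttime column contains spaces. With a real file you would run something like ab -g results.tsv against your URL and pass results.tsv to the helper.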
Common Load Testing Pitfalls and How to Avoid Them
The ab benchmark tool is incredibly powerful because it’s so simple, but that same simplicity makes it easy to generate misleading data if you’re not careful. A few common, yet critical, mistakes can completely torpedo your test results, leading you to optimize the wrong things or—even worse—miss a serious performance bottleneck.

Understanding these traps is the first step toward collecting numbers you can actually trust. When you avoid them, you can be confident your performance tuning efforts are based on a real picture of your server’s capabilities.
The Localhost Benchmarking Trap
The most frequent and damaging mistake I see is people running ab on the exact same machine as the web server they’re testing.
When you benchmark against localhost or 127.0.0.1, you bypass the physical network entirely, since traffic never leaves the loopback interface. This means your results will show artificially low latency and wildly inflated “Requests per second” numbers because there’s zero real-world network overhead involved.
Your server might look like a speed demon, but this setup tells you absolutely nothing about how it will perform when actual users try to access it over the internet.
Pro Tip: Always run your ab tests from a separate, dedicated machine. This client machine should have a solid network connection to your server to simulate a much more realistic environment and capture the true impact of network latency on response times.
This one simple change moves you from a theoretical lab experiment to a practical, meaningful measurement. You’ll get a much clearer picture of what your users are actually experiencing.
Understanding Coordinated Omission
Another subtle but significant pitfall is a phenomenon known as coordinated omission. At its core, this issue means that a testing tool can inadvertently paint a much rosier picture of latency than what’s really happening under load.
Here’s how it works:
- When the system you’re testing gets overloaded, it starts to slow down.
- Because the system is slow, the load generator (ab) also slows down, sitting idle while it waits for responses before it can fire off new requests.
- As a result, ab doesn’t measure the time it was “stalled” waiting to send a request. It only records the latency of the requests it did manage to send.
This leads to a dangerous underreporting of true user-perceived latency, especially in the higher percentiles like p99 and p99.9. The system might seem stable in your test report, but real users are experiencing significant delays that the test simply isn’t capturing.
While ab is fantastic for getting a baseline on throughput, just be aware of this limitation when you’re trying to hunt down high-percentile latency issues. For those deep-dive scenarios, you might need more advanced tools to get the full story.
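A toy simulation makes the effect concrete. The sketch below is not something ab does; it models a single worker that is supposed to send one request every fixed interval (an assumption standing in for users who keep arriving regardless of how the server is doing) and prints the latency each request would show if measured from its intended send time. The helper name co_corrected and all numbers are invented.

```shell
# co_corrected INTERVAL_MS SERVICE_MS...: for one worker that waits for each
# response before sending the next request, print the latency of every request
# measured from the moment it was supposed to be sent.
co_corrected() {
    interval=$1; shift
    printf '%s\n' "$@" | awk -v iv="$interval" '
        {
            intended = (NR - 1) * iv                      # scheduled send time
            start = (free > intended) ? free : intended   # worker may still be busy
            free = start + $1                             # worker is free after the response
            printf "%s%d", (NR > 1 ? " " : ""), free - intended
        }
        END { print "" }'
}

# Five requests scheduled 100 ms apart; the third takes a full second.
co_corrected 100 50 50 1000 50 50   # 50 50 1000 950 900
```

A tool that only records service time would report 50, 50, 1000, 50, 50: one slow request. Measured from the intended schedule, three of the five requests took around a second, which is much closer to what users queued behind the stall would have felt.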
When to Move Beyond Synthetic Testing
Apache Bench is a fantastic tool for generating a clean, consistent, and predictable stream of traffic. It’s the perfect way to establish a performance baseline, test raw server throughput, and see how your application behaves under sterile, lab-like conditions.
But real-world user traffic is anything but sterile. It’s chaotic, unpredictable, and full of complex patterns that a simple synthetic test just can’t replicate.
While ab is excellent at answering “How many requests can my server handle?”, it starts to fall short when you ask more nuanced questions. For instance, how does your database perform when users trigger a complex series of reads and writes in a specific sequence? How do caching layers hold up when hit with a diverse mix of URLs, not just a single endpoint?
Synthetic testing often misses these real-world scenarios, which can lead to a false sense of security. Your server might ace a synthetic benchmark but then crumble during a real traffic spike simply because the load pattern is completely different.
The Limits of Synthetic Load
The core limitation of tools like Apache Bench is that they test what you think your user traffic looks like. You’re essentially creating a hypothesis about user behavior and running a test against it. The problem is, this approach can easily miss critical edge cases and hidden bottlenecks that only pop up under the messy reality of production traffic.
- Request Diversity: Real users access hundreds of different URLs, not just one. This variety impacts database query performance, caching effectiveness, and overall resource usage in ways a single-URL test never could.
- Timing and Pacing: Users don’t hit your site with machine-gun precision. There are natural pauses and irregular intervals between their requests, which affects everything from session management to connection pooling.
- Complex User Flows: A true user journey might involve logging in, searching for a product, adding it to a cart, and checking out. That sequence of API calls puts a very different kind of strain on your system than just hammering the homepage over and over.
This evolution in performance validation has shown major limitations in traditional synthetic benchmarking. To achieve realistic load testing, it’s essential to mirror actual production traffic, which can uncover hidden performance issues that synthetic tests miss. Learn more about the differences in this load testing guide on goreplay.org.
Embracing Realistic Traffic With GoReplay
This is exactly where traffic shadowing tools like GoReplay come into play. Instead of simulating traffic, GoReplay captures live production HTTP requests and replays them against a staging or test environment. This isn’t a simulation; it’s a perfect copy of your real user load.
By replaying actual user sessions, you can validate exactly how new code deployments or infrastructure changes will behave under true production stress. It’s an invaluable method for testing database migrations, validating caching strategies, or simply ensuring a new feature doesn’t introduce unexpected performance regressions before it goes live.
AB Apache Benchmark vs GoReplay Comparison
So, how do you decide which tool is right for the job? It’s less about one being better and more about them serving different, complementary purposes. One gives you a clean baseline, while the other gives you messy reality.
Here’s a quick comparison of their strengths and ideal use cases.
| Feature | AB Apache Benchmark | GoReplay |
|---|---|---|
| Traffic Type | Synthetic and predictable | Real and unpredictable |
| Use Case | Baseline throughput and raw capacity testing | Pre-release validation and realistic scenario testing |
| Focus | How many requests a single endpoint can handle | How the entire system behaves under complex user flows |
| Complexity | Simple, single-URL focus | Captures and replays all user interactions across the system |
Using ab gives you a solid performance foundation, but pairing it with a tool like GoReplay provides a far more comprehensive picture. Together, they ensure your application isn’t just fast in a lab—it’s truly ready for the real world.
Common Questions (and Headaches) with ApacheBench
Even a tool as direct as ApacheBench can leave you scratching your head sometimes. When you’re in the middle of a test, the last thing you want is to get stuck wrestling with the tool itself. Let’s walk through some of the most common questions and issues that pop up.
Can I Test Multiple URLs at Once?
This one comes up a lot. The short answer is no—ab is built to hammer a single URL at a time.
This isn’t a limitation; it’s by design. The tool’s purpose is to isolate and measure the performance of one specific endpoint, giving you a clean baseline. If you need to test a complex user journey that hits several different pages, you’ll either need to write a script that calls ab multiple times or, more likely, reach for a more advanced load testing tool.
How Do I Handle Logins and Dynamic Content?
Testing a page that requires a user to be logged in is another frequent hurdle. If you just point ab at a protected page, all your requests will just hit the login screen, and your metrics will be completely useless.
The key is the -C flag, which lets you pass cookies with your requests. This is how you can simulate an authenticated user session.
For example, you’d grab a valid session cookie from your browser and use it like this:
ab -n 100 -c 10 -C 'sessionid=your_session_value' https://app.yourwebsite.com/dashboard
This command sends 100 requests, with 10 happening concurrently, all appearing as if they’re coming from a logged-in user. Now you’re getting a much more realistic picture of how your application performs for actual users.
Why Are My Results So Inconsistent?
Getting wildly different numbers between test runs can be incredibly frustrating, but it usually points to an issue in your testing environment, not ab itself.
Before you pull your hair out, check a few things:
- Is there other traffic hitting the server you’re testing?
- Is the machine running ab overloaded? A maxed-out CPU on the client side will bottleneck your tests and skew the results.
- Are you on a stable network connection?
For a clean test, you need to make sure both the client and server have plenty of CPU and network headroom.
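One way to put a number on “inconsistent” is to repeat the same test a few times and look at the spread. The sketch below summarizes a list of requests-per-second values; the helper name rps_stats and the sample values are made up, and the spread figure (max minus min, as a percentage of the mean) is just one simple choice of measure.

```shell
# rps_stats: read one requests-per-second value per line, print a summary.
rps_stats() {
    awk 'NR == 1 { min = max = $1 }
         { sum += $1; if ($1 < min) min = $1; if ($1 > max) max = $1 }
         END { if (NR) printf "mean=%.1f min=%s max=%s spread=%.1f%%\n",
                   sum / NR, min, max, (max - min) / (sum / NR) * 100 }'
}

# Invented values from three hypothetical runs.
printf '%s\n' 95 100 105 | rps_stats   # mean=100.0 min=95 max=105 spread=10.0%

# To collect a real value from one run (not run here), something like:
#   ab -n 1000 -c 10 URL | awk '/^Requests per second:/ { print $4 }'
```

If the spread stays large after you have ruled out the items in the checklist above, single runs are not trustworthy on their own and you should compare averages across several runs instead.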
It’s also critical to remember what ab is, and what it isn’t. It gives you a consistent, repeatable baseline. It does not replicate the messy, unpredictable chaos of real user traffic. For that, you need a different strategy.
This is where more advanced techniques come into play, like replaying captured production traffic at an accelerated speed (say, 10x) to see how your system handles a sudden spike. Many engineering teams use a hybrid approach: ApacheBench for establishing simple baselines and production traffic replay for true, real-world validation.
You can learn more about these advanced strategies in this comprehensive guide to load testing. Combining these methods ensures you’re prepared for both the traffic you expect and the traffic you don’t.
At GoReplay, we build tools that let you move beyond synthetic guesswork. By capturing and replaying real user traffic, you can test new releases with the confidence that your application is truly resilient under real-world pressure. Find out how to stop simulating and start replicating at https://goreplay.org.