Published on 8/26/2026

Maven Failsafe Plugin: 2026 Guide to Integration Tests

Your build passes unit tests, then falls apart when integration tests need a database, a message broker, a seeded data set, or a running service. Worse, the failure happens early enough that cleanup never runs. Containers stay up. Test data sticks around. The next CI job inherits a dirty environment and fails for the wrong reason.

That’s where the Maven Failsafe Plugin earns its place. It isn’t just a Maven detail. It’s the plugin that separates fast feedback from environment-dependent verification, and that separation matters when your pipeline has to be reliable under failure, not just when everything is green.

A lot of teams treat integration tests as “more tests” and wire them like unit tests. That works until the suite needs setup and teardown with real consequences. Then the lifecycle matters. A lot.

Why Your Integration Tests Need the Failsafe Plugin

A common CI failure looks like this. The build starts clean, integration tests bring up a database and a broker, one test fails early, and the job exits before teardown finishes. The next pipeline run inherits stale containers, dirty data, or ports that never got released. Teams lose time debugging the environment instead of the code.

The Maven Failsafe Plugin fixes that by separating integration test execution from the final build failure. Its goals bind to integration-test and verify, which leaves pre-integration-test for setup and post-integration-test for cleanup. That separation matters any time tests depend on infrastructure outside the JVM.

The practical benefit is Failsafe’s safe failure behavior.

Failsafe lets the build continue far enough for cleanup to run before Maven marks the build as failed. In a local project, that means fewer orphaned services and less manual cleanup. In CI/CD, it matters more. Docker containers get stopped, test fixtures get removed, temporary environments get torn down, and reports still get published even when the suite fails. That keeps one bad run from contaminating the next one.

Use it when tests exercise the system the way production does. That includes tests against a real database, HTTP calls between services, contract checks against deployed components, and heavier validation such as traffic replay and production-like verification, in line with integration testing best practices. Those tests are expensive, environment-dependent, and worth isolating so they do not interrupt the fast unit test loop.

Practical rule: If a test needs a running service, a real database, a network hop, or a built artifact, run it under Failsafe, not in the test phase.

The mistake that burns pipeline time is treating integration tests like larger unit tests. They are operational tests. They need ordered setup, predictable teardown, and a failure model that preserves the environment long enough to clean it up correctly. Failsafe gives you that control.

Failsafe vs Surefire: The Critical Difference

Developers often know both names but use them interchangeably. That’s where pipeline problems start. Surefire and Failsafe are related plugins, but they solve different problems and they should be treated differently in a production build.


The short version

Use Surefire for unit tests that should fail fast.

Use Failsafe for integration tests that need setup, execution, teardown, and only then a pass or fail decision.

Maven Surefire vs. Failsafe Plugin at a Glance

Characteristic	Surefire Plugin	Failsafe Plugin
Purpose	Unit testing	Integration testing
Maven phase	`test`	`integration-test`, `verify`
Failure handling	Fails the build immediately	Allows build to continue until `verify`
Lifecycle integration	Works with the default unit test flow	Needs explicit integration test configuration
Typical test naming	`Test*.java`, `*Test.java`	`IT*.java`, `*IT.java`, `*ITCase.java`
Best use case	Fast isolated checks	Environment-dependent validation

That table is the mechanical difference. The operational difference is more important.

Why immediate failure helps unit tests

Unit tests should fail fast because speed matters more than teardown complexity. If a pure unit test fails, there usually isn’t a container to stop, a temporary database to drop, or a replay process to terminate. Failing immediately gives quick feedback and keeps local development tight.

That’s why Surefire fits the test phase. It’s for isolated logic checks that don’t need environment orchestration.

Why deferred failure helps integration tests

Integration tests are different. They often bring up infrastructure in pre-integration-test, exercise the system in integration-test, and need cleanup in post-integration-test. If the build died the moment a test failed, cleanup could be skipped.

The Apache documentation describes that behavior clearly: Failsafe’s safe failure model defers the build failure to verify rather than failing during integration-test. OpenClover’s Maven Surefire and Failsafe integration guide notes that this pattern supports combined unit and integration workflows, and it dates Failsafe’s native JUnit 5 support to version 2.22.0, released in 2018.

If your test environment must be cleaned up even after failure, immediate build failure is the wrong behavior.

That’s the critical mental model. Failsafe doesn’t “delay failure” because it’s softer. It delays failure because teardown is part of correctness.
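
As a minimal sketch of that setup-and-teardown flow, assume the environment is described in a hypothetical docker-compose.it.yml and that the exec-maven-plugin is an acceptable way to call Docker from your build. The start and stop commands can then be bound to the phases on either side of integration-test. The file name, execution ids, and plugin version are illustrative assumptions, not something the Failsafe documentation prescribes.

```xml
<!-- Sketch only: docker-compose.it.yml and the execution ids are hypothetical. -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <!-- Assumed version; pin whichever release your build already uses. -->
  <version>3.1.0</version>
  <executions>
    <!-- Runs before Failsafe: bring the environment up in the background. -->
    <execution>
      <id>start-it-environment</id>
      <phase>pre-integration-test</phase>
      <goals>
        <goal>exec</goal>
      </goals>
      <configuration>
        <executable>docker</executable>
        <arguments>
          <argument>compose</argument>
          <argument>-f</argument>
          <argument>docker-compose.it.yml</argument>
          <argument>up</argument>
          <argument>-d</argument>
        </arguments>
      </configuration>
    </execution>
    <!-- Runs after Failsafe's integration-test goal, even when tests failed. -->
    <execution>
      <id>stop-it-environment</id>
      <phase>post-integration-test</phase>
      <goals>
        <goal>exec</goal>
      </goals>
      <configuration>
        <executable>docker</executable>
        <arguments>
          <argument>compose</argument>
          <argument>-f</argument>
          <argument>docker-compose.it.yml</argument>
          <argument>down</argument>
          <argument>-v</argument>
        </arguments>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Because Failsafe records test failures instead of failing the build during integration-test, the stop execution still runs after a failing test. If the start step itself fails, Maven stops before post-integration-test, so CI may still want an outer cleanup step as a backstop.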

Naming conventions aren’t cosmetic

A second source of confusion is test naming. Teams sometimes put integration tests under the same names as unit tests and then wonder why Maven runs them in the wrong phase.

A healthy split looks like this:

Use unit-style names for Surefire: UserServiceTest, OrderValidatorTest
Use integration-style names for Failsafe: UserServiceIT, CheckoutFlowIT, PaymentGatewayITCase
Keep intent obvious: if a test crosses process boundaries or needs external resources, name it like an integration test

That naming convention is part of the contract between your codebase and your pipeline. Break it, and Maven can’t help you.

Your First Failsafe Configuration in pom.xml

A correct setup is short, but the details matter. The two mistakes that waste the most time are simple: teams forget to bind the verify goal, or they name integration tests in a way Failsafe never discovers.


A working baseline

Start with this in your pom.xml:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-failsafe-plugin</artifactId>
      <version>3.5.5</version>
      <executions>
        <execution>
          <goals>
            <goal>integration-test</goal>
            <goal>verify</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

This binds the plugin’s two essential goals into the lifecycle: failsafe:integration-test runs in the integration-test phase and failsafe:verify runs in the verify phase, as the Maven plugin documentation describes. The official plugin info also lists the default include patterns, **/IT*.java, **/*IT.java, and **/*ITCase.java, and naming tests outside those patterns is a common pitfall.

What each part actually does

The <artifactId> is obvious. The part that deserves attention is <executions>.

<goal>integration-test</goal> runs the integration test classes. It binds to the integration-test phase by default, so this is the point where the tests execute.

<goal>verify</goal> is what turns test outcomes into a build decision. It reads the results recorded by the integration-test goal and fails the build if any test failed. If you leave it out, you can end up in the worst state possible: integration tests ran, something failed, and the build still reports success.

Key takeaway: Running integration tests without binding verify is not a partial setup. It’s a misleading setup.

Name tests so Failsafe can find them

The defaults are generally sufficient. Stick to them instead of inventing a naming scheme nobody remembers.

Use names like:

Prefix style: ITUserImport.java
Suffix style: CheckoutFlowIT.java
Case style: OrderSyncITCase.java

Avoid mixing integration tests into *Test.java names. That pushes them toward the unit-test lane and causes confusion around which plugin is executing them.
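
If a legacy suite cannot be renamed right away, one possible workaround is to extend the patterns explicitly. The sketch below assumes a hypothetical *IntegrationTest suffix. Two details matter: a custom includes list replaces Failsafe’s defaults, so they are restated here, and the suffix also matches Surefire’s default *Test.java pattern, so Surefire needs a matching exclude.

```xml
<!-- Sketch only: the *IntegrationTest suffix is a hypothetical legacy convention. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <!-- Version assumed to be managed elsewhere, for example in pluginManagement. -->
  <configuration>
    <excludes>
      <!-- Restates Surefire's default exclude of inner classes. -->
      <exclude>**/*$*</exclude>
      <!-- Keeps the legacy integration tests out of the unit-test phase. -->
      <exclude>**/*IntegrationTest.java</exclude>
    </excludes>
  </configuration>
</plugin>
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <version>3.5.5</version>
  <configuration>
    <includes>
      <!-- Failsafe's default patterns, restated because includes replaces them. -->
      <include>**/IT*.java</include>
      <include>**/*IT.java</include>
      <include>**/*ITCase.java</include>
      <!-- The extra legacy pattern. -->
      <include>**/*IntegrationTest.java</include>
    </includes>
  </configuration>
  <executions>
    <execution>
      <goals>
        <goal>integration-test</goal>
        <goal>verify</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

Treat this as a bridge, not a destination. Renaming the classes to one of the default patterns removes both configuration blocks.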

A stronger starter configuration

For CI, I usually make the intent a little more explicit:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-failsafe-plugin</artifactId>
      <version>3.5.5</version>
      <executions>
        <execution>
          <id>integration-tests</id>
          <goals>
            <goal>integration-test</goal>
            <goal>verify</goal>
          </goals>
        </execution>
      </executions>
      <configuration>
        <forkedProcessTimeoutInSeconds>60</forkedProcessTimeoutInSeconds>
      </configuration>
    </plugin>
  </plugins>
</build>

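That timeout matters when tests spawn forked JVMs and something hangs because of non-daemon threads. Instead of leaving a CI agent blocked indefinitely, the plugin kills the forked process after the configured timeout. The limit applies to the whole forked JVM rather than to a single test, so treat the 60 above as a placeholder and set it comfortably above your suite’s normal runtime.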

Common setup mistakes that break builds quietly

These are the ones I see most often:

Binding only one goal: teams add integration-test but forget verify. The suite runs, but the pipeline’s enforcement behavior is wrong.
Using the wrong test names: PaymentServiceTest looks harmless until Failsafe ignores it and Surefire tries to run it as a unit test.
Calling the wrong Maven command: if your pipeline stops at mvn test, Failsafe never gets involved. Integration tests need a lifecycle that reaches verify.
Treating setup and teardown as optional extras: if the test suite needs an environment, wire that environment around the proper lifecycle phases instead of inside ad hoc shell commands.

Keep the plugin boring

That’s the best compliment for a build plugin. A boring configuration is one everyone on the team understands. Start with the default patterns, bind the correct goals, and only add customization when a real need appears.

Running Integration Tests and Understanding Results

A pipeline that runs mvn test and skips mvn verify gives a false sense of coverage. Failsafe only does its full job when the build reaches verify, because that is where Maven turns integration test results into the final build outcome.

That separation matters in CI/CD. During integration-test, your suite can start containers, seed databases, replay production traffic against a staging service, or hold open network resources for end-to-end checks. If one test fails midway, Failsafe does not short-circuit the lifecycle the way an immediate failure would. Maven can still run the later phases that shut things down cleanly, then fail the build in verify. That is the safe failure model, and it prevents the classic mess of orphaned containers, dirty test data, and stuck agents.

What execution actually looks like

Failsafe runs integration-test classes in the integration-test phase, then evaluates the overall result in verify. The practical effect is simple. Your environment gets a chance to clean up before the job is marked failed.

That difference shows up fast in real pipelines. A broken API replay test should fail the build, but it should also leave the agent ready for the next job. If cleanup depends on later lifecycle phases, stopping early wastes CI time and creates flaky follow-on failures that are harder to diagnose than the original test failure.

Where the reports go

By default, Failsafe writes its output under target/failsafe-reports. The plugin documentation for using the Maven Failsafe Plugin describes the standard report directory and the files generated during execution.

Use those files based on who needs the answer:

Read the .txt reports first when debugging locally. They are faster to scan for stack traces and the exact test method that failed.
Feed TEST-*.xml files into CI tooling. Jenkins, GitLab, and similar systems parse XML test reports cleanly.
Check failsafe-summary.xml when the failure looks broader than one class. It helps separate test assertion failures from execution-level problems.

The text output is for humans. The XML output is for automation.

A practical triage flow

Start with the build log, but do not stop there.

Confirm the job reached verify. If it did not, the pipeline may have called the wrong Maven phase.
Open target/failsafe-reports. Verify which integration test classes ran.
Read the matching .txt file for the failing class. That usually gets you to the root cause fastest.
Review failsafe-summary.xml. Use it to confirm whether the run failed because of test failures, execution errors, or both.

This order saves time. CI consoles are noisy, especially when integration tests involve container startup, service logs, or traffic replay output. The Failsafe reports are usually the cleaner source of truth.

Advanced Failsafe Techniques for Real-World Projects

The default setup gets you clean lifecycle separation. Real projects usually need a bit more control. You may need to run only one integration suite while debugging, skip integration tests for quick local loops, or stop the suite after the first meaningful failure in CI.


Narrow the test scope when debugging

Don’t run every integration test while chasing one broken flow. Use explicit includes and excludes.

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <version>3.5.5</version>
  <executions>
    <execution>
      <goals>
        <goal>integration-test</goal>
        <goal>verify</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <includes>
      <include>**/*CheckoutIT.java</include>
    </includes>
    <excludes>
      <exclude>**/*SlowIT.java</exclude>
    </excludes>
  </configuration>
</plugin>

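That kind of filtering is useful when one feature area is unstable and you want a fast feedback loop without rewriting the whole pipeline. Keep in mind that an explicit <includes> replaces Failsafe’s default patterns, so with this configuration only classes matching *CheckoutIT run. Treat it as a temporary debugging filter.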

Stop wasting CI time after the first hard failure

Sometimes one failed integration test is enough to know the environment is broken. Failsafe supports -Dfailsafe.skipAfterFailureCount=1, documented in the plugin guidance, which skips the remaining tests once the given number of failures or errors is reached. In practice, that’s useful for expensive suites where a known environment problem would make the remaining results noisy.

Use it selectively. It’s good for branch validation when a broken database migration means the rest of the suite is meaningless. It’s less helpful when you want a full list of failures before a release branch is fixed.

A complete failure list is valuable in diagnosis. Early exit is valuable in throughput. Pick one intentionally.
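
The same setting can live in the POM instead of on the command line. As a sketch, assuming a hypothetical profile named fail-fast-it that branch builds activate with -Pfail-fast-it, while release builds leave it off to collect the full failure list:

```xml
<!-- Sketch only: the profile id fail-fast-it is a hypothetical name. -->
<profile>
  <id>fail-fast-it</id>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-failsafe-plugin</artifactId>
        <!-- Version and goal bindings are assumed to come from the main build section. -->
        <configuration>
          <!-- Skip the remaining integration tests after the first failure or error. -->
          <skipAfterFailureCount>1</skipAfterFailureCount>
        </configuration>
      </plugin>
    </plugins>
  </build>
</profile>
```

Keeping the setting in a profile makes the choice between early exit and a complete failure list visible in the command each pipeline runs.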

Control hanging test processes

Integration suites are more likely than unit tests to hang because they touch network calls, spawned processes, and non-daemon threads. That’s where <forkedProcessTimeoutInSeconds>60</forkedProcessTimeoutInSeconds> helps. It gives CI a defined escape hatch instead of letting one bad test leave an executor stuck.

I prefer using a timeout in the plugin rather than hoping every test library shuts down cleanly under every failure mode. Build infrastructure should defend itself.

Skip integration tests without deleting discipline

Developers often need a fast local compile-and-unit-test loop. CI needs the full validation path. You can support both without mangling the pom.xml.

A simple pattern is to use Failsafe’s skipITs property, for example -DskipITs=true, for local work. It skips the integration tests while unit tests still run. The main rule is organizational, not technical: skipping integration tests should be explicit and temporary, not the default behavior for shared branch validation.

Here’s what works well in practice:

Local development: allow quick iterations when someone is changing pure business logic and doesn’t need a full environment.
Pull request validation: run the meaningful integration subset tied to the service being changed.
Main branch or release pipeline: run the full suite and enforce verify as the definitive gate.
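
One way to keep that split explicit is an opt-in profile that sets skipITs, so the default command still runs everything. This is a sketch under the assumption that a profile named local-fast suits your team; the name is hypothetical.

```xml
<!-- Sketch only: the profile id local-fast is a hypothetical name. -->
<profiles>
  <profile>
    <id>local-fast</id>
    <properties>
      <!-- Failsafe reads this property and skips integration tests; unit tests still run. -->
      <skipITs>true</skipITs>
    </properties>
  </profile>
</profiles>
```

With this in place, mvn verify -Plocal-fast gives the quick loop, and a plain mvn verify in CI still runs the full suite. Skipping stays something a developer has to ask for.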

Use Maven profiles carefully

Profiles are useful when your integration environment differs by context. Maybe local runs use lightweight dependencies while CI points to provisioned services. Profiles can help, but they also create hidden behavior if overused.

Keep profiles readable:

Name them by intent: ci-it, local-it, contract-it
Avoid stacking many overlapping profiles: if nobody can predict which one is active, your build has become opaque
Document the command the team should run: confusion around profiles causes more pipeline waste than the profile mechanism itself
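
As a sketch of intent-named profiles, the fragment below passes a database URL to the tests through Failsafe’s systemPropertyVariables and lets each profile supply its own value. The property name it.db.url, the JDBC URLs, and the host names are hypothetical placeholders.

```xml
<!-- Sketch only: it.db.url and both JDBC URLs are hypothetical placeholders. -->

<!-- In the main build section: forward the property to the forked test JVM. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <version>3.5.5</version>
  <configuration>
    <systemPropertyVariables>
      <it.db.url>${it.db.url}</it.db.url>
    </systemPropertyVariables>
  </configuration>
</plugin>

<!-- At the project level: one profile per context, named by intent. -->
<profiles>
  <profile>
    <id>local-it</id>
    <properties>
      <it.db.url>jdbc:postgresql://localhost:5432/app_it</it.db.url>
    </properties>
  </profile>
  <profile>
    <id>ci-it</id>
    <properties>
      <it.db.url>jdbc:postgresql://postgres:5432/app_it</it.db.url>
    </properties>
  </profile>
</profiles>
```

Tests can then read the value with System.getProperty("it.db.url"). Each profile changes one property, which keeps it easy to predict what mvn verify -Pci-it will do.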

Coverage and mixed test reporting

If you care about combined coverage from unit and integration tests, tools like OpenClover can aggregate output from Surefire and Failsafe reports. The important practical point is that the split between unit and integration execution doesn’t prevent a combined quality view. It makes the coverage story cleaner because each test type has a clear execution boundary.

What doesn’t work well

A few habits make Failsafe harder than it needs to be:

Stuffing environment startup into arbitrary shell scripts instead of aligning it with Maven lifecycle intent
Renaming tests ad hoc so no one knows whether Surefire or Failsafe should pick them up
Using one build mode for every scenario instead of distinguishing local, PR, and release needs
Ignoring hanging-process protection and blaming CI when builds stall

The plugin does its job well when you give it a disciplined test taxonomy and a predictable lifecycle.

Mastering Failsafe in CI/CD Pipelines

The biggest misunderstanding about the Maven Failsafe Plugin is that its delayed failure behavior looks less strict than Surefire’s. In a CI/CD pipeline, the opposite is true. It’s stricter where it matters because it preserves cleanup and still enforces the outcome before deployment.


Common tutorials often gloss over a distinction that the Maven Failsafe plugin overview makes explicit: integration-test and verify are separate steps, and that separation is critical for deployment strategy, especially when teams need teardown to complete even after a failing run. That matters even more when your tests orchestrate containers, seeded services, or traffic replay.

Treat verify as the deployment gate

A reliable pipeline doesn’t ask, “Did tests execute?” It asks, “Did the environment get set up, exercised, cleaned up, and then verified successfully?”

That’s why verify is the primary gate. It’s the point where Maven can say all integration work completed, cleanup had its chance to run, and the build should now pass or fail as a release candidate.

In CI terms, that gives you a cleaner contract:

Pipeline concern	Better choice with Failsafe
Environment bootstrap	Run before integration tests
Test execution	Keep it in `integration-test`
Resource teardown	Let `post-integration-test` finish
Release decision	Enforce at `verify`

If your team is refining broader automation practice, this guide to continuous integration best practices complements the way Failsafe should be used as a quality gate.

Why cleanup discipline changes pipeline reliability

CI agents are shared systems, even when they’re ephemeral on paper. A failed job that leaves behind state still poisons the next run in practical terms. You see this with:

Containers left running
Databases left with partial fixtures
Ports still held by child processes
Temporary artifacts that make later jobs look nondeterministic

Failsafe’s lifecycle separation gives those teardown actions a reliable place to live. That’s not abstract Maven theory. It’s how you stop one failing integration suite from creating two more false failures downstream.

Don’t design CI around successful runs only. Design it around failed runs that still clean up after themselves.

Traffic replay is where lifecycle separation really pays off

Advanced integration testing gets much more interesting when a team replays realistic HTTP traffic against a staging-like environment. This kind of testing is valuable because it exercises routing, persistence, and service interactions under conditions that resemble production behavior more closely than hand-written happy-path tests.

It also creates more moving parts. You may have a replay process, a target environment, seeded dependencies, and result validation that depends on the full run completing. If a failure aborts the build too early, teardown can be skipped and the replay context becomes noisy or unusable for the next job.

That’s exactly why the Maven lifecycle split matters. Bring the environment up before execution, run the replay-backed integration tests during integration-test, let teardown happen, then let verify make the deployment decision.


A practical pipeline pattern

A sound CI/CD pattern with Failsafe looks like this:

Build the artifact
Provision the integration environment
Run unit tests separately in the fast lane
Execute integration tests through Failsafe
Always run teardown
Use verify as the release blocker
Publish Failsafe XML reports into your CI system

That structure keeps feedback layered. Unit failures stop cheap mistakes early. Integration failures stop unsafe releases later, after the environment has been managed correctly.

What teams usually get wrong

The common mistakes aren’t subtle:

They gate deployments on test execution rather than test verification
They call mvn test in CI and assume integration tests ran
They wire cleanup outside the Maven lifecycle and wonder why it’s brittle
They treat traffic replay failures as just another test assertion, when they’re often environment events that still require orderly teardown

The fix is usually architectural, not clever. Give integration work its own lifecycle. Let Failsafe do the job it was designed for.

When teams make that change, their pipelines become easier to trust. Not because failures disappear, but because failures happen in a controlled way. That’s what a mature build system is supposed to do.

If you want to validate releases with realistic production-like traffic before deployment, GoReplay is worth a serious look. It lets teams capture and replay live HTTP traffic into test environments, which pairs naturally with a Failsafe-driven integration stage that can set up, execute, tear down, and then enforce results at verify.