Cucumber Framework with Java: Build Robust Automation

Master Cucumber framework with Java for robust automation. Step-by-step guide from setup to CI, covering best practices & common pitfalls.


April 25, 2026

You’re probably in one of two situations right now. Either you’ve got a growing Java app and manual regression testing is slowing releases down, or you already tried UI automation and discovered that “just add Cucumber” turns into runners, hooks, WebDriver lifecycle bugs, brittle selectors, and a lot more glue code than anyone admits in quick-start guides.

That’s why the cucumber framework with java still matters. It gives teams a readable way to describe behavior, connect it to executable tests, and keep product, QA, and engineering aligned around the same scenarios. But the useful version of Cucumber isn’t the demo where one login test passes on localhost. The useful version is the one that survives a real codebase, CI runs, flaky browsers, and months of UI changes.

Getting Started with Your Cucumber Java Project

Cucumber became popular because it made test intent readable. Cucumber launched in 2008 and pioneered Gherkin syntax with Given, When, Then. Its Java implementation passed 1 million downloads on Maven Central by 2015, and its integration with JUnit 4/5 gives it 100% compatibility with modern CI/CD pipelines, according to this Cucumber history overview.

That history matters because it explains why so many Java teams still pick it. Cucumber isn’t just a test library. It’s a way to turn a requirement into something humans can read and machines can execute.

Choose a project shape that won’t fight you later

For Java teams, Maven is usually the least painful starting point. Gradle works too, but most examples, enterprise repos, and CI snippets you’ll find for Cucumber Java still lean Maven-first.

A practical folder layout looks like this:

  • src/test/resources/features for .feature files
  • src/test/java/steps for step definitions
  • src/test/java/runners for runner classes
  • src/test/java/pages for page objects once the suite grows
  • src/test/java/support for hooks, driver factory, and shared utilities

That split keeps feature text separate from execution code. It also stops step definitions from turning into a junk drawer.

Start with the minimum useful dependencies

If you’re using Maven, your pom.xml should include the pieces that each solve one job:

  • cucumber-java handles the step definition annotations and core runtime.
  • cucumber-junit (JUnit 4), cucumber-junit-platform-engine (JUnit 5), or cucumber-testng gives you a runner strategy.
  • selenium-java drives the browser for UI tests.
  • maven-surefire-plugin runs tests locally and in CI.

A minimal setup often grows into something like this:

<dependencies>
  <dependency>
    <groupId>io.cucumber</groupId>
    <artifactId>cucumber-java</artifactId>
    <version>YOUR_VERSION</version>
    <scope>test</scope>
  </dependency>

  <dependency>
    <groupId>io.cucumber</groupId>
    <artifactId>cucumber-junit</artifactId>
    <version>YOUR_VERSION</version>
    <scope>test</scope>
  </dependency>

  <dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>YOUR_VERSION</version>
  </dependency>
</dependencies>
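Those three YOUR_VERSION placeholders drift apart easily. One way to keep the io.cucumber artifacts in lockstep is Cucumber's published BOM, imported in dependencyManagement. A sketch (the version is still a placeholder):

```xml
<dependencyManagement>
  <dependencies>
    <!-- Pins all io.cucumber artifact versions in one place -->
    <dependency>
      <groupId>io.cucumber</groupId>
      <artifactId>cucumber-bom</artifactId>
      <version>YOUR_VERSION</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
```

With the BOM imported, the individual io.cucumber dependencies can drop their own version tags.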

If you know parallel execution is coming, it’s smart to think ahead and read through broader web application testing tool choices before you commit the whole team to a code-heavy stack.

Practical rule: Keep your first setup boring. One browser, one feature file, one runner, one passing scenario. Complexity added early usually becomes framework debt.

Create one clean baseline before adding abstractions

The temptation is to scaffold everything at once. Don’t. Get these working first:

  1. A single feature file under src/test/resources/features
  2. A step definition class that compiles
  3. One runner that discovers the feature
  4. One browser session that opens and closes correctly

If those four pieces aren’t stable, adding hooks, dependency injection, reporting, and parallel runs will only hide the underlying problems. Early success in Cucumber comes from clarity, not cleverness.

Writing Your First Feature and Step Definitions

The first useful Cucumber scenario should read like a product conversation, not a Selenium script disguised as English. If your feature file says “click the blue button with id submit-login,” you’ve already missed the point. The scenario should describe behavior users care about.

A login flow is a good first example because everyone understands it.

Write the feature file in user language

Create login.feature:

Feature: User login

  Scenario: Successful login with valid credentials
    Given the user is on the login page
    When the user enters a valid username and password
    And clicks the login button
    Then the user should see the dashboard

That wording does two jobs well. It’s readable by non-developers, and it avoids implementation detail. Good Gherkin usually starts from the same discipline used in writing better user stories, because both force you to focus on intent instead of internal mechanics.

Bind plain English to Java methods

Now map each step to Java:

package steps;

import io.cucumber.java.en.Given;
import io.cucumber.java.en.When;
import io.cucumber.java.en.Then;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.By;
import org.openqa.selenium.chrome.ChromeDriver;

import static org.junit.Assert.assertTrue;

public class LoginSteps {

    private WebDriver driver;

    @Given("the user is on the login page")
    public void userOnLoginPage() {
        driver = new ChromeDriver();
        driver.get("https://example.com/login");
    }

    @When("the user enters a valid username and password")
    public void enterValidCredentials() {
        driver.findElement(By.id("username")).sendKeys("demo");
        driver.findElement(By.id("password")).sendKeys("password");
    }

    @When("clicks the login button")
    public void clickLoginButton() {
        driver.findElement(By.id("loginButton")).click();
    }

    @Then("the user should see the dashboard")
    public void verifyDashboard() {
        assertTrue(driver.getCurrentUrl().contains("/dashboard"));
        driver.quit();
    }
}

This is the point where Cucumber clicks for most people. The text in the .feature file triggers annotated Java methods. That’s the core mechanism.

Keep step definitions thin. If they start holding locators, waits, assertions, and test data setup all in one class, they become harder to change than the UI they test.

What works in early step definitions

A few habits save pain later:

  • Use action-focused names. userOnLoginPage() is better than step1().
  • Match step text exactly enough to stay readable, but don’t obsess over regex complexity on day one.
  • Close the browser reliably. Don’t leave cleanup buried in the last assertion forever. Hooks are better once your suite grows.
  • Avoid giant composite steps like “the user logs in and creates an order and verifies the summary.” Those steps read fast and debug badly.

Here’s a quick smell test:

| Good step text                          | Weak step text                    |
| user enters valid username and password | user fills field1 and field2      |
| user should see the dashboard           | verify page redirected correctly  |
| user adds product to cart               | click add button                  |

First-pass mistakes you should expect

Your first run often fails for very ordinary reasons:

  • Glue path mismatch. The runner can’t find your step classes.
  • Undefined steps. Feature text and annotation text don’t match.
  • Driver setup problems. Browser binaries or local environment aren’t ready.
  • Assertions too early. The page hasn’t finished updating.

None of that means the framework is wrong. It means browser automation has friction from the start, even in the cleanest Cucumber tutorial.
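That last failure mode, assertions firing before the page settles, is what explicit waits exist for. In Selenium you would reach for WebDriverWait; the mechanism underneath is just poll-until-true with a deadline. A minimal plain-Java sketch of that idea (the Wait class here is illustrative, not a Selenium API):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.BooleanSupplier;

class Wait {
    // Poll a condition until it turns true or the deadline passes.
    static boolean until(BooleanSupplier condition, Duration timeout) {
        Instant deadline = Instant.now().plus(timeout);
        while (Instant.now().isBefore(deadline)) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(50); // polling interval between checks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return condition.getAsBoolean(); // one final check after the deadline
    }
}
```

WebDriverWait does the same thing against browser state, for example waiting for an element to become visible before you assert on it.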

Configuring a Test Runner and Using Advanced Features

One passing scenario is a toy. A useful suite needs a runner, a way to slice execution, and a browser lifecycle that doesn’t poison neighboring tests.

For modern Java projects, I prefer treating the runner as a control point rather than boilerplate. It decides where features live, which tags run, how reports are emitted, and whether the suite scales.

A basic runner that stays readable

A JUnit 5 style setup is common in current projects:

package runners;

import org.junit.platform.suite.api.ConfigurationParameter;
import org.junit.platform.suite.api.IncludeEngines;
import org.junit.platform.suite.api.SelectClasspathResource;
import org.junit.platform.suite.api.Suite;

import static io.cucumber.junit.platform.engine.Constants.GLUE_PROPERTY_NAME;
import static io.cucumber.junit.platform.engine.Constants.PLUGIN_PROPERTY_NAME;
import static io.cucumber.junit.platform.engine.Constants.FILTER_TAGS_PROPERTY_NAME;

@Suite
@IncludeEngines("cucumber")
@SelectClasspathResource("features")
@ConfigurationParameter(key = GLUE_PROPERTY_NAME, value = "steps,support")
@ConfigurationParameter(key = PLUGIN_PROPERTY_NAME, value = "pretty, html:target/cucumber-report.html")
@ConfigurationParameter(key = FILTER_TAGS_PROPERTY_NAME, value = "@smoke")
public class TestRunner {
}
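One catch: this JUnit 5 style runner is not powered by the cucumber-junit artifact shown in the earlier pom. It needs the JUnit Platform pieces instead. A sketch of the swap (versions are placeholders):

```xml
<!-- Replaces cucumber-junit for JUnit 5 / JUnit Platform runners -->
<dependency>
  <groupId>io.cucumber</groupId>
  <artifactId>cucumber-junit-platform-engine</artifactId>
  <version>YOUR_VERSION</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.junit.platform</groupId>
  <artifactId>junit-platform-suite</artifactId>
  <version>YOUR_VERSION</version>
  <scope>test</scope>
</dependency>
```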

This is enough to run a tagged subset. The tags matter because real suites rarely run everything on every commit.

Common tag patterns:

  • @smoke for a tiny confidence set
  • @regression for wider UI coverage
  • @checkout, @login, @billing for functional grouping
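Hard-coding the tag filter into the runner is one option. Many teams keep it in a junit-platform.properties file instead, so CI can typically override it with a system property without touching code. A sketch (the @wip tag is just an example):

```properties
# src/test/resources/junit-platform.properties (sketch)
cucumber.filter.tags=@smoke

# CI override, no code change needed:
#   mvn test -Dcucumber.filter.tags="@regression and not @wip"
```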

Hooks and state isolation matter more than people think

Hooks let you centralize setup and cleanup instead of scattering it through step files.

package support;

import io.cucumber.java.Before;
import io.cucumber.java.After;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class Hooks {

    public static WebDriver driver;

    @Before
    public void setUp() {
        driver = new ChromeDriver();
    }

    @After
    public void tearDown() {
        if (driver != null) {
            driver.quit();
        }
    }
}

That example is intentionally simple. In a real suite, you’ll usually move driver creation into a factory and stop using a static shared driver once parallel runs enter the picture.

For teams standardizing suite hygiene, these automation testing best practices are worth keeping next to your framework code because runner configuration alone won’t save brittle test design.

A flaky suite rarely fails because Cucumber is “bad.” It fails because state leaks, waits are inconsistent, selectors drift, and browsers are shared in unsafe ways.

Parallel execution helps, but only if the driver model is safe

A practical approach to parallel execution uses Maven Surefire, a @Suite runner, and ThreadLocal<WebDriver> to keep each thread isolated. That setup can reduce execution time by 70-80% for suites with over 500 scenarios, based on this parallel Cucumber Java guide.

A simple thread-local pattern looks like this:

public class DriverManager {
    private static final ThreadLocal<WebDriver> driver = new ThreadLocal<>();

    public static void setDriver(WebDriver webDriver) {
        driver.set(webDriver);
    }

    public static WebDriver getDriver() {
        return driver.get();
    }

    public static void unload() {
        driver.remove();
    }
}

And a Surefire configuration might look like:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <version>3.0.0</version>
  <configuration>
    <properties>
      <configurationParameters>cucumber.execution.parallel.enabled=true</configurationParameters>
    </properties>
  </configuration>
</plugin>

What doesn’t work is pretending parallelism is just a plugin switch. If your page objects share mutable state, or your hooks stash data in static fields, parallel execution just makes failure faster.
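The difference between a static field and a ThreadLocal is easy to demonstrate without a browser. In this sketch (class and thread names are ours), two "scenarios" run on separate threads and each reads back only the handle it stored, which is exactly the isolation DriverManager gives WebDriver instances:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ThreadIsolationDemo {
    // Stand-in for DriverManager: each thread gets its own slot.
    private static final ThreadLocal<String> currentDriver = new ThreadLocal<>();

    static Map<String, String> run() {
        Map<String, String> seenByThread = new ConcurrentHashMap<>();
        Runnable scenario = () -> {
            String name = Thread.currentThread().getName();
            currentDriver.set("driver-for-" + name);      // like @Before: create driver
            seenByThread.put(name, currentDriver.get());  // steps read their own driver
            currentDriver.remove();                       // like @After: unload
        };
        Thread t1 = new Thread(scenario, "worker-1");
        Thread t2 = new Thread(scenario, "worker-2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seenByThread;
    }
}
```

A plain static WebDriver field in Hooks would fail this kind of check: both threads would stomp on the same reference.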

Building a Scalable Framework with the Page Object Model

The biggest mistake in an early Cucumber Java suite is letting step definitions become mini test scripts. Locators creep in. Wait logic gets copied. Assertions spread across unrelated classes. A small project still passes. A growing one turns into a maintenance job.

That’s why the Page Object Model, usually shortened to POM, is still the right default for UI automation in Java. It gives each page its own class, keeps locators and interactions together, and leaves step definitions focused on intent.

This kind of project shape is easier to understand when you see a brittle step refactored into it step by step.

Refactor brittle steps into page objects

Here’s the before version that many teams start with:

@When("the user enters a valid username and password")
public void enterValidCredentials() {
    driver.findElement(By.id("username")).sendKeys("demo");
    driver.findElement(By.id("password")).sendKeys("password");
}

It works, but it hard-codes page structure inside the step layer.

A better page object looks like this:

package pages;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

public class LoginPage {
    private final WebDriver driver;

    private final By username = By.id("username");
    private final By password = By.id("password");
    private final By loginButton = By.id("loginButton");

    public LoginPage(WebDriver driver) {
        this.driver = driver;
    }

    public void open() {
        driver.get("https://example.com/login");
    }

    public void loginAs(String user, String pass) {
        driver.findElement(username).sendKeys(user);
        driver.findElement(password).sendKeys(pass);
        driver.findElement(loginButton).click();
    }
}

Then the step definition gets smaller:

@When("the user logs in with username {string} and password {string}")
public void login(String username, String password) {
    LoginPage loginPage = new LoginPage(Hooks.driver);
    loginPage.loginAs(username, password);
}

Data-driven tests are where Cucumber starts paying off

Cucumber gets more compelling when you stop duplicating scenarios and start feeding them data. Scenario Outline with Examples can reduce code duplication by 60-75%, and DataTableType can cut parsing overhead by 50% compared to manual map manipulation, according to this Cucumber parameterization reference.

A clean Scenario Outline example:

Scenario Outline: Login with multiple users
  Given the user is on the login page
  When the user logs in with username "<username>" and password "<password>"
  Then the login result should be "<result>"

Examples:
  | username | password | result  |
  | demo     | pass123  | success |
  | locked   | pass123  | failure |

For richer inputs, use data tables:

Given the following users exist
  | username | role  |
  | anna     | admin |
  | ben      | user  |

Then transform them into domain objects instead of passing Map<String, String> around forever.
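In Cucumber that transformation is usually registered with a @DataTableType annotated method; the conversion logic itself is plain Java. A sketch with a hypothetical User record:

```java
import java.util.List;
import java.util.Map;

// Hypothetical domain object for the users table above.
record User(String username, String role) {

    // In a real suite this conversion would live in a @DataTableType method,
    // so step definitions can accept List<User> instead of raw maps.
    static User fromRow(Map<String, String> row) {
        return new User(row.get("username"), row.get("role"));
    }

    static List<User> fromRows(List<Map<String, String>> rows) {
        return rows.stream().map(User::fromRow).toList();
    }
}
```

Once steps receive typed objects, assertions read like the feature file ("anna is an admin") instead of map lookups.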

A useful rule of thumb: If a step definition contains more Selenium code than business language, move that code into a page object or helper immediately.

What scales and what breaks

A scalable framework usually has these boundaries:

  • Feature files describe behavior
  • Step definitions translate behavior into calls
  • Page objects know the UI
  • Utilities handle waits, config, and data setup

What breaks is mixing all four in one file. That’s the pattern that makes a test suite feel productive in week one and expensive in month six.

Debugging, CI Integration, and the Maintenance Reality

A Cucumber suite becomes real the first time it fails in CI for a reason nobody can reproduce locally. That’s the moment when “human-readable tests” stop sounding magical and start sounding incomplete. Readable scenarios help. They don’t replace diagnostics.

Debug failures with artifacts, not guesses

When a scenario fails, start with the boring evidence:

  • The stack trace tells you whether the failure is a missing element, a timeout, a bad assertion, or a setup problem.
  • A screenshot on failure helps when the UI changed without immediate notice.
  • Browser console logs catch frontend errors your assertion never sees.
  • Network logs help when the page looks fine but data didn’t load.

Many teams wire screenshots into @After hooks and attach them to reports. That’s not fancy. It’s survival.

A simple @After failure hook often looks like this in practice:

@After
public void afterScenario(Scenario scenario) {
    if (scenario.isFailed()) {
        // Capture a screenshot and attach it to the Cucumber report output
        byte[] screenshot = ((TakesScreenshot) driver).getScreenshotAs(OutputType.BYTES);
        scenario.attach(screenshot, "image/png", scenario.getName());
    }
}

CI is straightforward. Reliability isn’t.

On paper, Cucumber is CI-friendly. In Java Selenium setups, parallel execution can reduce suite time by up to 94%, and JUnit XML output makes CI integration straightforward. But meaningful analysis of flaky tests usually needs extra reporting, and coverage can stay below 90% without dedicated effort, according to this Cucumber framework reporting and CI overview.

That’s why a GitHub Actions workflow is usually the easy part:

name: Run Cucumber Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Java
        uses: actions/setup-java@v3
        with:
          distribution: temurin
          java-version: '17'
          cache: maven
      - name: Run tests
        run: mvn test

The hard part is making mvn test trustworthy every day. If you’re tightening pipeline quality, it helps to frame your suite as part of a broader continuous integration testing workflow, not as a disconnected automation artifact.

The maintenance burden is the real cost

Three things usually hurt long term:

  1. Selector churn. Frontend teams rename attributes, move components, or change rendering timing.
  2. Shared state bugs. Test order starts affecting results.
  3. Test data drift. The environment no longer matches what the scenario assumes.

Teams often add better dashboards and orchestration around the suite. If you’re evaluating operational layers around automation, tools in the category of Agentic Test Management solutions like Testsigma can be useful to study because they focus on organizing, observing, and triaging testing work rather than only executing scripts.

“Passing locally” isn’t a quality signal. “Passing repeatedly in CI with useful failure data” is.

A maintainable framework needs more than green checks. It needs enough evidence to explain the red ones quickly.

The Modern Alternative When Cucumber Becomes a Chore

Cucumber with Java is still a legitimate engineering choice. It’s strong when your team wants code-level control, already lives in the JVM, and is willing to invest in framework ownership. But that ownership is the catch. Every gain in flexibility comes with maintenance work.

If you’ve built one of these suites before, you know where the time goes. Not just writing tests. Keeping selectors current. Stabilizing waits. Updating page objects after UI changes. Fixing parallel conflicts. Tuning reports so failures are diagnosable. None of that shows up in the happy-path tutorial.

Side-by-side trade-offs

Here’s the honest comparison many small teams eventually make:

| Question      | Cucumber with Java                              | No-code AI testing approach                |
| Test creation | Write Gherkin, step defs, page objects, runners | Describe behavior in plain English         |
| Maintenance   | Ongoing locator and framework upkeep            | Less code-specific upkeep                  |
| Edge cases    | Only what the team scripts                      | Can explore unexpected flows               |
| Debugging     | Depends on reporting setup                      | Often bundled with session artifacts       |
| Team fit      | Strong for automation-heavy engineering teams   | Strong for lean teams without dedicated QA |

That doesn’t mean code-based testing is obsolete. It means its cost profile is often underestimated.

Where script-based frameworks stop making sense

There’s also a scale mismatch that quick-start guides ignore. A startup with a handful of engineers may not benefit from building and owning a full automation framework, even if that framework is “best practice.” The team still has to maintain it.

One source analyzing over 200 projects reported a 45% reduction in manual test effort after adopting AI tools like Monito. The same source notes that parallel scripting can suffer from up to 70% flakiness, and gives a cost range of $0.08-$0.13 per test run versus over $2k/month for managed QA services in that comparison, as described in this AI testing and maintenance trade-off analysis.

The practical takeaway isn’t “never use Cucumber.” It’s narrower than that. If your team has the discipline and time to maintain a Java automation framework, Cucumber can work well. If your team keeps postponing testing because nobody wants to own the framework, then the framework is the bottleneck.

A blunt decision rule

Choose Cucumber with Java when:

  • your team wants source-controlled automation code
  • browser automation is part of your engineering practice
  • you can invest in stable framework design

Choose a no-code AI alternative when:

  • nobody wants to maintain test code
  • the team needs coverage fast
  • exploratory behavior matters as much as scripted flows

A lot of teams don’t need “more automation engineering.” They need more testing with less upkeep.


If you want web app testing without building and maintaining a Java framework, try Monito. You describe what to test in plain English, it runs the browser session for you, explores edge cases, and returns structured bug reports with logs, screenshots, and replay data. It’s a practical option for small teams that need better coverage but don’t want another codebase to maintain.
