How to Fix Flaky Playwright Tests

Published:
July 11, 2025

A few weeks ago, during a sprint, our QA team flagged a frustrating issue: a Playwright test that passed locally, failed in CI, then passed again—all without any code change. It was slowing us down and shaking confidence across the team.

Digging deeper, we found what many engineers face: flaky tests caused by bad timing, unstable selectors, and actions that bypass Playwright’s auto-waiting. In fast-moving CI/CD pipelines, these issues went unnoticed until they broke builds.

We used Playwright Trace Viewer and TestDino’s smart dashboards to debug the problem. By fixing brittle selectors, mocking unstable networks, and isolating tests properly, we cut reruns and improved test reliability.

Flaky tests don’t just fail randomly; they waste hours, slow releases, and kill team morale. Fixing them with tools like Trace Viewer, retry logic, and TestDino brings stability, speed, and peace of mind.


[Figure: Lifecycle of a flaky Playwright test]

Why Flaky Playwright Tests Matter in CI/CD

In modern development, CI/CD relies on fast, reliable automation. But flaky tests can break this flow by producing inconsistent results, disrupting testing processes and reducing overall efficiency. These failures waste time, confuse teams, and block critical deployments.

1. Signs of Flaky Tests

Flaky tests often pass locally but fail randomly in CI. They may pass on one commit and fail on the next, even with no code changes. These signs usually point to poor test isolation, actions that bypass Playwright’s auto-waiting, or fragile locators.

Common symptoms:

  • Tests fail intermittently without changes
  • Reruns produce different results
  • Failures disappear when run in isolation
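
The first two symptoms can be checked mechanically once run results are stored somewhere. The sketch below is a minimal illustration, not part of Playwright or TestDino: the `TestRun` shape and the `findFlakyTests` helper are hypothetical names, and it assumes you can export each run with its test name, commit SHA, and outcome. A test is flagged when the same commit produced both a pass and a failure.

```typescript
// Hypothetical record of one CI execution of a test; the field names are assumptions.
interface TestRun {
  testName: string;
  commitSha: string;
  passed: boolean;
}

function findFlakyTests(runs: TestRun[]): string[] {
  // For each test and commit, remember whether we saw a pass and whether we saw a failure.
  const seen: Record<string, Record<string, { pass: boolean; fail: boolean }>> = {};
  for (const run of runs) {
    const byCommit = seen[run.testName] || (seen[run.testName] = {});
    const outcome =
      byCommit[run.commitSha] || (byCommit[run.commitSha] = { pass: false, fail: false });
    if (run.passed) outcome.pass = true;
    else outcome.fail = true;
  }
  // Flaky: at least one commit where the same code both passed and failed.
  return Object.keys(seen)
    .filter((testName) =>
      Object.keys(seen[testName]).some(
        (sha) => seen[testName][sha].pass && seen[testName][sha].fail
      )
    )
    .sort();
}

const runHistory: TestRun[] = [
  { testName: "login", commitSha: "abc123", passed: true },
  { testName: "login", commitSha: "abc123", passed: false }, // same commit, different result
  { testName: "checkout", commitSha: "abc123", passed: false },
  { testName: "checkout", commitSha: "def456", passed: true }, // result changed along with the code
];

console.log(findFlakyTests(runHistory)); // [ 'login' ]
```

Here `checkout` is not flagged, because its result only changed when the commit changed, which is what a real regression or fix looks like.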

2. How They Affect End-to-End Testing

In end-to-end testing with Playwright, flakiness reduces trust in your suite. Developers stop relying on test feedback, and real bugs can get missed. It also increases noise in build logs and delays releases.

These tests:

  • Cause false positives and negatives
  • Hide actual defects during regression
  • Slow down feedback loops for dev teams

[Figure: Alphabin flaky test dashboard]

3. CI/CD Pipeline Risks

Flaky tests are harder to manage in shared CI/CD pipelines, where runtime is limited and tests execute in parallel. These failures trigger unnecessary re-runs and slow your test feedback loop. Frequent reruns also increase the total number of test executions in the pipeline, further hurting efficiency.

Over time, this reduces developer velocity, increases test execution time, and lowers morale, especially in resource-constrained environments such as containers or headless browser setups.
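
A rough calculation shows why even a small flake rate blocks builds. The sketch below is a back-of-the-envelope model, not a measurement: it assumes every test fails independently with the same probability, and the 1% rate and 200-test suite are made-up numbers for illustration.

```typescript
// Probability that a single test eventually passes when it may be attempted (1 + retries) times.
function testPassProbability(flakeRate: number, retries: number): number {
  return 1 - Math.pow(flakeRate, retries + 1);
}

// Probability that every test in the suite passes, assuming independent failures.
function suitePassProbability(flakeRate: number, testCount: number, retries: number): number {
  return Math.pow(testPassProbability(flakeRate, retries), testCount);
}

// Illustrative numbers: 200 tests, each failing spuriously 1% of the time.
const noRetries = suitePassProbability(0.01, 200, 0);
const twoRetries = suitePassProbability(0.01, 200, 2);

console.log(noRetries.toFixed(3)); // 0.134
console.log(twoRetries.toFixed(4)); // 0.9998
```

Under these assumptions, a suite with no retries goes green only about 13% of the time, while two retries per test push that close to 100%. Retries keep the pipeline moving, but they hide the flakiness and do not remove it, so the root cause still needs fixing.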

Characteristics of Flaky Tests

| Characteristic | Description |
| --- | --- |
| Inconsistent Results | Tests pass and fail randomly without code or data changes. |
| Timing Sensitivity | Tests fail when auto-waiting is bypassed or a needed wait is missing. |
| Environment-Specific | Tests fail only in CI/CD, not during local runs, due to resource or setup gaps. |
| Brittle Selectors | Tests use unstable locators that change across runs or environments. |
| Network Dependence | Tests rely on slow or flaky APIs that introduce random failures. |

Impact of Flaky Tests on Developer Workflow

Playwright flaky tests don’t just affect the test suite; they disrupt the entire developer workflow. From wasting time to delaying releases, they have a ripple effect across engineering teams.

1. Time Wasted in Debugging

Flaky tests often lead developers to chase bugs that don’t exist. A test might fail in CI but pass locally, wasting time and adding confusion to your Playwright test automation workflows.

Example 1: A test for a login page fails on CI due to an animation delay, but passes fine locally. The engineer spends 2 hours checking the auth code before realizing it’s a timing issue.

Example 2: A test fails once every 10 runs because of a missing await; this isn’t spotted until someone runs it with Playwright’s --repeat-each flag.

Tools like Playwright Inspector can help developers step through failing tests to identify timing or selector issues more efficiently.

These delays often stem from overlooked automation testing pitfalls, like overusing static waits or relying on unstable selectors.

2. How Stability Boosts Team Confidence

Fixing flaky tests improves more than just pipelines; it strengthens morale and momentum. A stable suite of test scripts gives developers confidence that their test code and features work reliably across all test scenarios, including CI pipelines and multi-browser setups using Firefox and WebKit.

Test stability is a key feature of Playwright that helps teams maintain confidence in their automation.

Flaky Playwright tests delay deployments and increase triage time in CI/CD environments. Left unresolved, they also lower team morale in fast-paced workflows.

Tools and techniques that prevent flakiness help teams move faster, streamline parallel test execution, and reduce reruns across test scenarios.

Common Causes of Playwright Flaky Tests

Understanding the root causes of flaky Playwright tests is the first step toward a more stable, reliable suite. These issues surface most often in fast-paced development cycles, and they usually trace back to bad practices or overlooked test conditions.

Leveraging Playwright's real browser input pipeline can help ensure user interactions are accurately simulated, reducing the likelihood of flaky tests caused by input timing issues.

1. Timing and Race Conditions

Tests often fail when elements aren’t fully loaded or interactive before actions are triggered. This is usually caused by missing waits or relying on assumptions about load timing.

Relying on Playwright’s auto-waiting helps avoid these race conditions when testing web applications. Web-first assertions reduce timing-related flakiness further, because they retry automatically until the condition is met or the timeout expires.
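
To see why a retrying assertion is more robust than a single check, consider the simplified model below. It is only an illustration of the retry idea, not Playwright's implementation: `retryUntil` and `makeSlowElement` are hypothetical helpers, and a fixed number of attempts stands in for the time budget that real web-first assertions poll against.

```typescript
// Re-evaluates a condition up to maxAttempts times and reports how many checks it took.
function retryUntil(
  condition: () => boolean,
  maxAttempts: number
): { passed: boolean; attempts: number } {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (condition()) return { passed: true, attempts: attempt };
  }
  return { passed: false, attempts: maxAttempts };
}

// Simulated element that only reports itself visible from the Nth check onward.
function makeSlowElement(visibleOnCheck: number): () => boolean {
  let checks = 0;
  return () => ++checks >= visibleOnCheck;
}

const singleCheck = retryUntil(makeSlowElement(3), 1); // like asserting once, too early
const polled = retryUntil(makeSlowElement(3), 10); // like a retrying assertion

console.log(singleCheck.passed); // false
console.log(polled.passed, polled.attempts); // true 3
```

The single check fails because the element is not ready yet, while the polled version succeeds as soon as the condition holds and never waits longer than it needs to.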

2. Using Hardcoded Waits

Hardcoded timeouts like waitForTimeout(3000) make tests brittle and environment-dependent. A static wait that is long enough locally can be too short under CI load. Replace it with a condition-based wait such as locator.waitFor() or page.waitForSelector(), or with an assertion that checks visibility.

3. Unstable or Dynamic Selectors

Selectors that depend on changing classes, auto-generated IDs, or text content are highly prone to failure. To reduce test brittleness, use consistent locators such as a dedicated data-testid attribute, which Playwright can target with page.getByTestId(). Stable selectors help Playwright interact reliably with web pages during automation.

CI/CD Tips to Handle Flaky Tests

1. Use Retries and Repeat Flags

Use Playwright’s built-in retry and repeat features to detect and reduce flakiness in CI environments.

# Repeat each test multiple times to catch flakiness
npx playwright test --repeat-each=5
# Retry failed tests up to two times
npx playwright test --retries=2

These help identify Playwright flaky tests before merging into the main branch. Reviewing the results of each test run helps identify patterns of flakiness and improve test reliability.

2. Tag and Separate Unstable Tests

Group flaky or unstable tests using @tag or custom file naming so they can be isolated in CI.

// Example: mark an unstable test with a @flaky tag in its title
import { test } from '@playwright/test';

test('@flaky Should handle slow animation', async ({ page }) => {
  // test logic
});

Then run stable and unstable suites separately:

# Run only stable tests
npx playwright test --grep-invert @flaky
# Run only flaky-tagged tests
npx playwright test --grep @flaky

Tagging tests in this way allows teams to run specific tests separately, making it easier to diagnose and address flaky behavior.

3. Add Debugging and Reporting Tools

Use tools like Trace Viewer and visual reports to debug flaky tests in CI.

# Record a trace for every test; use --trace=retain-on-failure to keep traces only for failed tests
npx playwright test --trace=on

Generate detailed HTML reports.

# View the Playwright report after a run
npx playwright show-report

These tools improve visibility for debugging Playwright tests and identifying flaky patterns in pipelines. Teams should also explore execution logs and related artifacts to diagnose the root causes of flaky tests.

Tools to Debug Playwright Flaky Tests

| Tool | Key Features for Debugging Flaky Tests |
| --- | --- |
| Playwright Trace Viewer | Step-by-step test replay with DOM and console logs |
| Playwright HTML Reporter | Basic test results with error messages and status |
| CI/CD Logs (Jenkins, GitHub Actions, etc.) | Shows failures, retries, and test durations in pipelines |
| Custom Dashboards (e.g., via Grafana) | Track flaky trends and failure patterns over time |
| AI-based Analysis Tools | Detects flaky patterns using clustering and smart insights |
| TestDino | AI-based flaky detection, error auto-tagging, smart suggestions, Slack alerts |


Catch Flaky Tests Early While Writing

1. Test Repeats During Development

Repeating the same test scenarios during local test execution helps spot instability early before it affects shared pipelines. This is crucial for maintaining robust test automation practices.

npx playwright test --repeat-each=5

2. Run --repeat-each Before Merging

Integrate the --repeat-each flag into your pre-merge workflow. It ensures your test doesn’t pass just once by luck and fail later when running tests in CI/CD environments. Combined with tools that capture execution trace, this improves visibility and lets teams configure the test retry strategy before merging.

npx playwright test my-test.spec.ts --repeat-each=3 --retries=1

3. Write Stable Test Templates

Write tests from reusable templates with clear patterns for test isolation, execution logging, and auto-wait handling. Keeping these templates in a dedicated tests folder helps maintain structure and stability.

For example, using stable attributes like data-testid reduces test brittleness and keeps runs clean, whether you are running UI tests or API tests.

Advanced Strategies to Prevent Flakiness

1. Leverage Playwright’s Auto-Waiting

Rely on auto-waiting actions and web-first assertions, or on condition-based waits like page.waitForSelector(), instead of page.waitForTimeout().

Example:

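// click() auto-waits until the button is visible, stable, and enabled
await page.locator('button#submit').click();
// the web-first assertion retries until the element is visible or 5 seconds pass
await expect(page.locator('.success')).toBeVisible({ timeout: 5000 });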

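Before each action, Playwright's auto-waiting runs actionability checks, for example that the element is visible, stable, and enabled. This mirrors how a real user interacts with a modern browser and improves test reliability.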

2. Mock or Stub External Dependencies

Isolate tests by mocking APIs or network calls:

await page.route('**/api/user', route =>
  route.fulfill({ json: { id: 1, name: 'Test' } })
);

Mocking external dependencies like APIs helps create independent tests that do not interfere with each other.

3. Adopt Retry Logic & Repeat-Each

Use Playwright flags to spot intermittent failures early:

npx playwright test --retries=2 --repeat-each=3

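Applying retry logic across all the tests in your suite helps catch intermittent failures early: Playwright reports a test that fails and then passes on retry as flaky, which makes it easy to spot. Treat that report as a prompt to investigate, not as a fix.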

Debugging Flaky Tests Effectively

1. Trace Viewer in CI

Enable trace collection in CI:

trace: 'on-first-retry'

Then use Trace Viewer to inspect step-by-step DOM snapshots, network activity, and timing details. Trace Viewer is especially useful for debugging complex tests that span multiple tabs or user sessions.
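
For context, this is roughly where the trace setting lives in a Playwright config file. The fragment below is a minimal sketch, not a complete configuration: it assumes your CI provider sets the `CI` environment variable, and the retry count of 2 is an arbitrary choice.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Retry only in CI so local failures stay visible (assumes the CI env variable is set there).
  retries: process.env.CI ? 2 : 0,
  use: {
    // Record a trace only when a test is retried after its first failure.
    trace: 'on-first-retry',
  },
});
```

Note that 'on-first-retry' only produces a trace when retries are enabled, so the two settings go together.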

2. TestDino Dashboards for Insights

Visual dashboards show:

  • Most flaky tests
  • Failure frequencies
  • Race-condition patterns

Use this analysis to refactor or isolate problematic test cases.
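
If you do not have a dashboard yet, the first two views can be approximated from raw results. The sketch below is a simplified illustration and not how TestDino computes its metrics: `RunResult`, `FlakinessRow`, and `rankByFailureRate` are hypothetical names, and the sample data is made up.

```typescript
// Hypothetical shape of one recorded test execution.
interface RunResult {
  testName: string;
  passed: boolean;
}

interface FlakinessRow {
  testName: string;
  runs: number;
  failures: number;
  failureRate: number;
}

function rankByFailureRate(results: RunResult[]): FlakinessRow[] {
  const rows: Record<string, FlakinessRow> = {};
  for (const result of results) {
    const row =
      rows[result.testName] ||
      (rows[result.testName] = { testName: result.testName, runs: 0, failures: 0, failureRate: 0 });
    row.runs += 1;
    if (!result.passed) row.failures += 1;
  }
  return Object.keys(rows)
    .map((testName) => {
      const row = rows[testName];
      row.failureRate = row.failures / row.runs;
      return row;
    })
    // Keep only tests that both passed and failed; always-failing tests are broken, not flaky.
    .filter((row) => row.failures > 0 && row.failures < row.runs)
    .sort((a, b) => b.failureRate - a.failureRate);
}

const results: RunResult[] = [
  { testName: "search", passed: true },
  { testName: "search", passed: false },
  { testName: "search", passed: true },
  { testName: "search", passed: true },
  { testName: "upload", passed: false },
  { testName: "upload", passed: true },
  { testName: "profile", passed: true },
  { testName: "profile", passed: true },
];

for (const row of rankByFailureRate(results)) {
  console.log(`${row.testName}: ${row.failures}/${row.runs} failed`);
}
// upload: 1/2 failed
// search: 1/4 failed
```

Sorting by failure rate puts the tests most worth refactoring or isolating at the top of the list.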

Best Practices to Reduce Flaky Tests

| Best Practice | Action |
| --- | --- |
| Write Reliable and Clear Tests | Ensure tests are well-structured, easy to understand, and free from complex logic. |
| Automate Tests on Every Commit | Run tests automatically in CI/CD pipelines to catch issues early. |
| Refactor and Maintain Test Code | Regularly clean up, simplify, and split complex test scenarios. |
| Test Across Environments | Validate tests on multiple browsers, devices, and configurations. |
| Use Parallel Execution | Run tests concurrently to speed up feedback and detect race conditions. |
| Investigate and Fix Root Causes | Analyze failures deeply instead of re-running tests without changes. |
| Track, Document & Monitor Flakiness | Use dashboards, reports, and logs to monitor flaky test patterns over time. |


Conclusion

At Alphabin, we help QA and DevOps teams move beyond fragile test automation. With deep Playwright expertise, CI/CD alignment, and in-house tools like TestDino, we ensure your releases are stable, fast, and reliable, even under the pressure of modern software delivery cycles.

This guide walked through real-world causes of flaky Playwright tests, like unstable selectors, hardcoded waits, and network dependencies, and how to fix them using auto-waiting, retry logic, mocking, and trace-based debugging. It also showed how TestDino helps detect flakiness early and streamlines root cause analysis at scale.

From developers writing their first test to teams maintaining large test suites across Chromium, Firefox, and WebKit, we turn reactive triaging into proactive automation. With TestGenX for script generation and TestDino for intelligent analysis, we empower you to build testing you can trust.


Frequently Asked Questions

What is a flaky test?

A test that intermittently passes and fails without any code or data changes.

How can I detect flaky tests in Playwright?

Use the --repeat-each flag or retry logic to surface unstable tests during execution.

Why do flaky tests often fail in CI/CD but pass locally?

CI/CD pipelines often run on slower machines with limited resources, and tests running in parallel can conflict with each other.

What’s the best way to fix timing-related flakiness?

Leverage Playwright’s auto-wait feature and avoid static timeouts like waitForTimeout().


About the author

Pratik Patel

Pratik Patel is the founder and CEO of Alphabin, an AI-powered Software Testing company.

He has over 10 years of experience in building automation testing teams and leading complex projects, and has worked with startups and Fortune 500 companies to improve QA processes.

At Alphabin, Pratik leads a team that uses AI to revolutionize testing in various industries, including Healthcare, PropTech, E-commerce, Fintech, and Blockchain.


Pro-tip

What Are Flaky Tests?

Flaky tests pass sometimes and fail other times without any change to the code or environment. Imagine a test that works fine in Chrome but fails randomly in Firefox; it’s like chasing a bug that isn’t there.

Playwright supports testing across Chromium, Firefox, and WebKit so that web apps work consistently for all users. Flaky tests undermine that goal: they quietly break CI pipelines, shake team confidence, and slow down releases.
