QA & testing glossary
Plain-English definitions of the testing terms BugBrain works with every day.
End-to-end (E2E) testing
Testing that exercises a complete user flow through the real application — UI, backend, and integrations — to verify the system behaves correctly from the user’s perspective.
Regression testing
Re-running tests after a change to confirm that previously working functionality still works. The goal is to catch features that a new change accidentally broke.
Flaky test
A test that passes and fails intermittently without any code change, often due to timing, network, or unstable selectors. Flaky tests erode trust in the suite because teams start ignoring red builds.
Self-healing tests
Tests that automatically re-resolve element locators when the UI changes — falling back through strategies like cached locators, the DOM, the accessibility tree, or vision — so they keep passing instead of breaking on minor edits.
Test coverage
A measure of how much of an application’s behavior is exercised by tests. For end-to-end testing this is usually framed as the share of critical user flows that are tested.
Exploratory testing
Unscripted testing where the tester (human or AI) navigates the app, forms hypotheses, and probes for issues in real time, rather than following a predefined script. Good at finding edge cases no one thought to script.
Pre-merge testing
Running tests on a pull request before it merges, typically as a status check, so regressions are caught while the change is still under review instead of after it ships.
Test oracle
The mechanism that decides whether a test’s observed behavior is correct. A strong oracle reduces false positives by verifying that a reported failure is a genuine defect.
False positive
A test result that reports a problem where none exists — for example, a failure caused by flakiness or environment rather than a real bug. High false-positive rates train teams to ignore alerts.
Visual regression testing
Comparing screenshots of the UI against a known-good baseline to catch unintended visual changes such as shifted layouts, color changes, or broken styling.
Accessibility (WCAG) testing
Checking that an application is usable by people with disabilities, measured against the Web Content Accessibility Guidelines (WCAG). Catches issues like missing labels, poor contrast, and keyboard traps.
Smoke testing
A quick, shallow set of checks that verify the most critical paths work before deeper testing or release — a fast “is it on fire?” gate.
Continuous testing
Automatically running tests throughout the delivery pipeline — on every commit, pull request, or deployment — so quality feedback is immediate rather than batched before release.
API contract testing
Verifying that an API’s requests and responses match an agreed specification (such as OpenAPI), catching breaking changes and drift between the contract and the implementation.
PR quality gate
An automated check on a pull request that decides whether the change is safe to merge. A good gate is advisory by default and only blocks on regressions the PR actually introduces — not on bugs that already existed.
Shift-left testing
Moving testing earlier in development — to the pull request or even the editor — so defects are caught while a change is cheap to fix, instead of during a release crunch.
Test impact analysis
Working out which tests are worth running for a specific code change by tracing the diff to the application areas it affects. It keeps pull-request feedback fast by skipping tests the change can’t have broken.
Autonomous testing
Testing where AI agents decide what to do — exploring the app, choosing actions, and judging outcomes — rather than replaying a fixed script. Forrester renamed the category “autonomous testing platforms” in 2025.
Agentic QA
A quality-assurance approach driven by autonomous AI agents that plan, act, and verify across an application, often coordinating multiple specialised agents (explorer, critic, triager) instead of one model doing everything.
Self-healing locator
An element locator that re-resolves itself when the UI changes, falling back through strategies such as a cached fingerprint, the DOM, the accessibility tree, or vision — so a renamed class or moved button doesn’t break the test.
LLM testing
Testing large-language-model features for accuracy, safety, and consistency — checking for hallucinations, prompt-injection vulnerabilities, and reliable multi-turn behaviour, rather than only deterministic pass/fail assertions.
Prompt injection
An attack where crafted input makes an AI system ignore its instructions or reveal data it shouldn’t. Testing for it means probing an LLM feature with adversarial prompts to confirm it holds its guardrails.
Hallucination (AI)
When an AI model produces confident but false or fabricated information. Testing for hallucination uses judges or golden answers to score outputs for factual accuracy before the feature ships.
LLM-as-judge
Using a language model to grade another model’s output against a criterion. Done honestly it judges one criterion per call, reasons before scoring, and is allowed to abstain when evidence is insufficient.
Model Context Protocol (MCP)
An open standard that lets AI tools like Claude Code and Cursor call external systems through a common interface. An MCP testing server lets you trigger and read test runs without leaving your editor.
No-code / codeless test automation
Creating automated tests without writing code — through recording, plain-English steps, or AI generation — so non-engineers can contribute and teams reach coverage faster.
Bug triage
Reviewing reported issues to decide which are real, how severe they are, and what to do next. AI triage adds root-cause hypotheses and evidence so engineers can act without re-investigating from scratch.
WCAG 2.2
The current version (2023) of the Web Content Accessibility Guidelines, adding criteria like focus appearance and dragging alternatives. Some checks are automatable; others still require human judgement.
Put the theory into practice
Start free and let BugBrain test your app — no scripts to write.