Using Claude Code SDK to Reduce E2E Test Time by 84%

Overview

End-to-end (E2E) tests, while essential for verifying full user workflows, tend to be slow, fragile, and costly. Often, teams run E2E tests only nightly due to time constraints, risking bugs slipping into production. The article explores an approach to run only relevant E2E tests for specific code changes in a pull request (PR), significantly cutting down test time from 44 minutes to under 10 minutes.

---

Traditional Approach: Using Globs

Teams often use glob patterns to decide which tests to run based on file changes. Example: a labeler.yml snippet specifying file path patterns that trigger tests.

Problems:

- High maintenance as the codebase evolves.
- Overly broad triggering of tests (a small shared component change can trigger many unrelated tests).
- No fine-grained precision.

---

Enter Claude Code SDK

The Challenge

- Need both coverage (not missing critical E2E tests) and precision (avoiding unnecessary test runs).
- Naively feeding the entire repo and changes to a large language model (LLM) is impractical due to token limits.

Claude Code's Approach

- Uses tool calls to selectively examine files.
- Searches for patterns and traces dependencies incrementally, replicating human intuition on test relevance.

Hypothesis: Given a PR, Claude Code can decide the necessary E2E tests by understanding code changes and test suites.

---

Building the AI-powered E2E Gatekeeper

Inputs Needed

PR Modifications: Use precise git diff commands to get clean diffs showing modified code, excluding noise like whitespace or irrelevant files (e.g., package.lock).

E2E Test Inventory: Extract the list of E2E tests dynamically from the test framework configuration (e.g., WebdriverIO's wdio.conf.ts).

Prompt Crafting: Build a clear, precise prompt for Claude Code to:

- Review the diff.
- Match changes with available E2E tests.
- "Think deep" to thoroughly analyze and include tests with reasonable impact.
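As a rough illustration of how the diff, the test inventory, and the instructions might be combined into a single prompt, here is a minimal TypeScript sketch. The function names (`gitDiffArgs`, `buildPrompt`), the section labels inside the prompt, and the excluded-file list are assumptions made for this example; the article's actual commands and prompt wording are not reproduced here. The only git features relied on are the `--ignore-all-space` option and the `:(exclude)` pathspec syntax.

```typescript
// Hypothetical helpers: names and prompt layout are illustrative, not the article's.

/** Build arguments for `git diff` that ignore whitespace changes and skip noisy files. */
function gitDiffArgs(baseRef: string, excluded: string[]): string[] {
  // `:(exclude)<path>` is git pathspec syntax for leaving a path out of the diff.
  const excludes = excluded.map((path) => `:(exclude)${path}`);
  // `<base>...HEAD` diffs HEAD against its merge base with the base branch.
  return ["diff", "--ignore-all-space", `${baseRef}...HEAD`, "--", ".", ...excludes];
}

/** Concatenate the instructions, the E2E test inventory, and the diff into one prompt. */
function buildPrompt(instructions: string, testFiles: string[], diff: string): string {
  const inventory = testFiles.map((file) => `- ${file}`).join("\n");
  return [
    instructions.trim(),
    "## Available E2E tests", // assumed section label
    inventory,
    "## PR diff", // assumed section label
    diff.trim(),
  ].join("\n\n");
}
```

The argument array from `gitDiffArgs("origin/main", ["package-lock.json"])` could be passed to a process runner such as Node's `execFileSync("git", args)`, and its output fed to `buildPrompt` along with the test list. Keeping both helpers as pure functions makes the prompt easy to snapshot and audit in CI.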
The prompt also tells Claude Code to always err on the side of inclusion when in doubt.

Structured Output: Instead of fighting with strict JSON output (which Claude struggles with), ask Claude Code to write a test-recommendations.json file containing:

- An array of test files to run.
- A concise explanation in markdown for auditability.

---

Integration & Execution

A custom bash command concatenates the prompt, E2E file list, and git diff, then pipes the result to Claude.

The --allowedTools "Edit Write" flag lets Claude write output files without granting riskier permissions such as network fetches.

Security matters here: never use --dangerously-skip-permissions, since doing so would expose the pipeline to exploits from malicious prompts.

---

Results

- Time Reduction: Core E2E test runtime dropped from 44 minutes to about 7 minutes on most PRs, even large ones.
- Accuracy: Claude Code never missed a relevant test but would sometimes include extra tests for safety.
- Cost: About $30/contributor/month, offset by savings on device farm resources and developer time.
- Scalability: The approach scales well because it focuses on semantically named test files and the relevant changed code paths rather than entire repos.

---

Summary

Using Claude Code SDK as an AI gatekeeper to selectively run E2E tests leads to:

- Major CI speedup by running only needed tests.
- Prevention of bugs reaching production by maintaining thorough test coverage.
- Savings in cost and developer time.
- A practical, maintainable system that can adapt dynamically with the codebase.

This method bridges the gap between traditional,