Phase 6 of 12 · QA Operator

Code Testing

Phase 6 proves that deterministic code behaves correctly: humans write behavioural assertions and review generated tests for real coverage.

Prove that deterministic code behaves correctly before judging agent behaviour.

Decision rules

Each rule connects a real situation to the skill or playbook that fits it. Linked terms open canonical sources.

Decision rules for Code Testing

Situation: A UI bug has been reported but no one has reproduced it in a real browser yet.
Missing skill: Browser reproduction
Recommended playbook: Playwright
Alternatives: Cypress
Why: Playwright is the default for agent-driven reproduction; Cypress is the right choice when the team already runs it in CI.

Situation: Agent-generated tests pass but the underlying bug still ships to users.
Missing skill: Behavioural assertion writing
Recommended playbook: verification-before-completion
Alternatives: Playwright
Why: Verification-before-completion rejects tests that don't assert behaviour; Playwright is the tool when the missing assertion is specifically about end-to-end browser behaviour.

Situation: A bug was marked fixed once already but has come back.
Missing skill: Repro validation
Recommended playbook: systematic-debugging
Alternatives: Regression suite addition
Why: Systematic-debugging captures the reproduction as a regression test before the bug is closed; adding to the regression suite without that discipline tends to encode the symptom rather than the cause.

Situation: A bug shows multiple symptoms and the root cause keeps moving.
Missing skill: Hypothesis-driven debugging
Recommended playbook: systematic-debugging
Alternatives: Bisect + logging
Why: Systematic-debugging forces one hypothesis at a time; bisect plus logging is faster when the regression has a clear before-and-after state.
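
The "bisect plus logging" alternative can be sketched in a few lines. This is a minimal illustration, not a replacement for a real tool such as `git bisect`. The revision names, the `first_bad` helper, and the `is_bad` check are all hypothetical. It assumes the history is ordered oldest to newest, the first revision is known good, the last is known bad, and the bug never disappears once introduced.

```python
import logging
from typing import Callable, Sequence

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("bisect")


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest revision for which is_bad() is true.

    Assumes revisions[0] is known good and revisions[-1] is known bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        bad = is_bad(revisions[mid])
        log.info("checked %s: %s", revisions[mid], "bad" if bad else "good")
        if bad:
            hi = mid
        else:
            lo = mid
    return revisions[hi]


# Hypothetical history in which the regression landed in r6.
revisions = ["r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8"]
broken_since = revisions.index("r6")
culprit = first_bad(revisions, lambda r: revisions.index(r) >= broken_since)
```

In practice `is_bad` would check out the revision and run the reproduction. The log lines record which revisions were tested, which keeps the search auditable when the root cause seems to move.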

Watch

Reality

Traditional tests remain necessary. Generated tests help, but they can encode the same misunderstanding as the generated code.
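
To make that risk concrete, here is a small sketch of a test that inherits the code's misunderstanding. Everything in it is hypothetical: it assumes a spec stating that orders of 100 or more ship free, and an implementation that reads this as "more than 100".

```python
FREE_SHIPPING_THRESHOLD = 100  # assumed spec: orders of 100 or more ship free
SHIPPING_FEE = 5


def shipping_fee(order_total: int) -> int:
    # Misunderstanding: ">" treats an order of exactly 100 as below the threshold.
    return 0 if order_total > FREE_SHIPPING_THRESHOLD else SHIPPING_FEE


def generated_style_test() -> bool:
    # Expectations derived from reading the code, so the boundary bug is locked in.
    return (
        shipping_fee(150) == 0
        and shipping_fee(50) == SHIPPING_FEE
        and shipping_fee(100) == SHIPPING_FEE
    )


def behavioural_test() -> bool:
    # Expectation derived from the spec: exactly 100 ships free.
    return shipping_fee(100) == 0


generated_passes = generated_style_test()   # passes, although the code is wrong
behavioural_passes = behavioural_test()     # fails, which exposes the bug
```

The generated-style test is green and still asserts the wrong behaviour. Only the assertion written from the spec catches the boundary error.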

Required skills

  • Regression test design
  • Behavioural assertion writing
  • Browser reproduction
  • Generated-test review
  • Exploratory test planning

Failure modes

  • Shallow coverage from generated tests
  • False confidence
  • Missing behavioural assertions

Next operating step

Maintain a deterministic test suite for code behaviour, and review generated tests for real assertions, edge cases, regressions, and browser-level behaviour.
