Agentic manual testing

Overview

Coding agents can write and execute code, but an agent that relies only on automated tests will miss the bugs those tests don’t cover. This article explains how to get agents to manually test their own code with command-line tools and browser automation, catching issues that slip through unit tests.


The Breakdown

Coding agents can execute the code they write, but automated tests miss real-world failures such as server crashes or UI bugs: the unit tests pass while the actual functionality is broken.
Running snippets with python -c lets agents quickly try library functions against edge cases without creating temporary files or complex test setups.
curl-based API exploration lets agents manually verify JSON APIs by running a dev server and calling its endpoints interactively, which surfaces integration issues.
Browser automation with Playwright lets agents test web UIs in real browsers, uncovering visual and interaction problems that code-only testing cannot detect.
Red-green TDD integration converts manual test discoveries into permanent automated tests, so issues found during exploration become part of the test suite.
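As a rough illustration of the python -c point above, the sketch below runs one-line snippets in a fresh interpreter and captures the result, much as an agent would from a shell. The helper name run_snippet and the choice of json.loads as the function under test are assumptions made for this example, not details from the original article.

```python
import subprocess
import sys


def run_snippet(code: str) -> subprocess.CompletedProcess:
    """Run `code` in a fresh interpreter, equivalent to `python -c "<code>"`.

    Output and errors are captured rather than raised, so a crashing
    snippet is reported as data the agent can inspect.
    """
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=10,
    )


# Hypothetical edge cases for a library function (json.loads is a stand-in).
EDGE_CASES = [
    "import json; print(json.loads('[]'))",    # empty list
    "import json; print(json.loads('null'))",  # JSON null
    "import json; print(json.loads(''))",      # empty input, expected to fail
]


def probe() -> None:
    """Print the exit code and the last line of output for each snippet."""
    for code in EDGE_CASES:
        result = run_snippet(code)
        output = (result.stdout or result.stderr).strip().splitlines()
        print(result.returncode, "|", output[-1] if output else "")


if __name__ == "__main__":
    probe()
```

Nothing is written to disk: each case is a single command, so discarding or tweaking one costs almost nothing.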
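The API-exploration point can be sketched in the same spirit. The example below starts a throwaway JSON server on a free local port and requests two endpoints, using Python's standard library as a stand-in for a real dev server and for curl. The handler, the /api/items route, and the response shapes are all hypothetical.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical JSON API standing in for the project's dev server."""

    def do_GET(self):
        if self.path == "/api/items":
            status, payload = 200, {"items": ["a", "b"]}
        else:
            status, payload = 404, {"error": "not found"}
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


# Opener that ignores proxy environment variables for local requests.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(url: str):
    """Return (status, parsed JSON), roughly what `curl -i <url>` would show."""
    try:
        with _opener.open(url, timeout=5) as resp:
            return resp.status, json.loads(resp.read())
    except urllib.error.HTTPError as err:
        return err.code, json.loads(err.read())


def explore():
    """Start the server on a free port, probe two paths, then shut down."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_address[1]}"
    try:
        return [fetch(base + path) for path in ("/api/items", "/api/missing")]
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for status, payload in explore():
        print(status, payload)
```

Probing a path that should not exist is deliberate: error responses are a common place for integration problems to hide.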
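To make the red-green step concrete, here is a minimal sketch, assuming a bug found during manual exploration: a slug function that mishandles repeated or trailing spaces. The functions slugify_buggy and slugify_fixed and the test class are invented for illustration. The same regression test is run first against the buggy version (red) and then against the fix (green).

```python
import unittest


def slugify_buggy(title: str) -> str:
    """Original version: breaks on repeated or trailing spaces."""
    return title.lower().replace(" ", "-")


def slugify_fixed(title: str) -> str:
    """Fixed version: collapses any run of whitespace into one hyphen."""
    return "-".join(title.lower().split())


class SlugifyRegressionTest(unittest.TestCase):
    """Permanent test capturing the input that failed during manual testing."""

    implementation = staticmethod(slugify_fixed)

    def test_collapses_repeated_whitespace(self):
        self.assertEqual(self.implementation("Hello   World "), "hello-world")


def run_against(impl) -> bool:
    """Run the regression test against `impl`; True means the test passed."""
    SlugifyRegressionTest.implementation = staticmethod(impl)
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyRegressionTest)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("red (buggy passes?):", run_against(slugify_buggy))
    print("green (fixed passes?):", run_against(slugify_fixed))
```

Seeing the test fail before the fix is what shows it actually exercises the bug; a test that was green from the start proves little.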