Overview
Traditional coding agents can write and execute code, but they miss bugs that automated tests don’t catch. This article explains how to get agents to manually test their own code using command-line tools and browser automation to catch issues that slip through unit tests.
The Breakdown
- Coding agents can run the code they write, but a passing unit-test suite does not rule out real-world failures such as server crashes or UI bugs that break actual functionality
- Running snippets with python -c lets agents quickly exercise library functions against edge cases without creating temporary files or complex test setups
- curl-based API exploration enables agents to manually verify JSON APIs by running dev servers and testing endpoints interactively to discover integration issues
- Browser automation with Playwright lets agents test web UIs in real browsers, uncovering visual and interaction problems that code-only testing cannot detect
- Red-green TDD turns issues found during manual exploration into permanent automated tests: write a failing test that reproduces the issue, then fix the code until the test passes
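
As a rough illustration of the python -c point and of keeping what manual exploration finds, here is a minimal sketch. It assumes the standard library's statistics.mean as a stand-in for whatever library function the agent just wrote. The names probe, SNIPPETS, explore, and MeanEdgeCases are hypothetical and do not come from the article.

```python
import statistics
import subprocess
import sys
import unittest


def probe(snippet: str) -> subprocess.CompletedProcess:
    """Run a one-off snippet via `python -c`, as an agent would from a shell."""
    return subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=30,
    )


# Hypothetical edge cases; statistics.mean stands in for the code under test.
SNIPPETS = {
    "typical": "import statistics; print(statistics.mean([1, 2, 3]))",
    "empty": "import statistics; print(statistics.mean([]))",
}


def explore() -> dict:
    """Return {case name: (exit code, last stderr line)} for each snippet."""
    results = {}
    for name, snippet in SNIPPETS.items():
        done = probe(snippet)
        stderr_lines = done.stderr.strip().splitlines()
        last_line = stderr_lines[-1] if stderr_lines else ""
        results[name] = (done.returncode, last_line)
    return results


class MeanEdgeCases(unittest.TestCase):
    """Permanent test recording what the manual probe observed."""

    def test_empty_input_raises(self):
        with self.assertRaises(statistics.StatisticsError):
            statistics.mean([])


if __name__ == "__main__":
    # Print one line per probe so the outcome is easy to scan in a terminal.
    for name, (code, err) in explore().items():
        print(f"{name}: exit={code} {err}")
```

Running the file prints one line per probe, and python -m unittest on the same file runs the recorded test. This sketch only pins down behavior that already exists. In a real red-green cycle the test would be written to fail against the buggy code first, and the fix would follow.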