harnesslog.dev

Claude Code, AI, and development stories

hwangjungmin

agent-browser shipped a skill called dogfood. It navigates a web app like a real user, finds bugs and UX issues, and returns a structured report — screenshots, repro videos, and step-by-step reproduction instructions.

Before this, running exploratory QA with an AI agent meant cobbling together browser automation yourself. This wraps that whole workflow into a single skill you can call from Claude Code.

I’m curious whether this makes QA genuinely delegatable. Not automated testing in the traditional sense — no test files, no assertions — just an agent walking through the app the way a human tester would. The structured output is the part that seems actually useful; it’s something you could hand off.
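To make “structured output” concrete: I haven’t seen dogfood’s actual report schema, but a finding along those lines might look something like this hypothetical TypeScript sketch. Every field name here is my assumption, not agent-browser’s real API.

```typescript
// Hypothetical shape for one finding in a dogfood-style QA report.
// Field names and structure are illustrative assumptions only.
interface Finding {
  severity: "bug" | "ux";   // what kind of issue was found
  title: string;            // one-line summary
  steps: string[];          // step-by-step reproduction instructions
  screenshot?: string;      // path to a captured screenshot, if any
  video?: string;           // path to a repro recording, if any
}

// An example finding, the kind of thing you could hand off as-is.
const example: Finding = {
  severity: "bug",
  title: "Checkout button unresponsive after coupon applied",
  steps: [
    "Add any item to the cart",
    "Apply a coupon code",
    "Click Checkout",
  ],
  screenshot: "screenshots/checkout-stuck.png",
};
```

The hand-off value is mostly in `steps`: a list like that drops straight into a bug tracker without anyone re-deriving the repro.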

I still need to test how fast it runs and how many tokens it burns; that will determine whether it’s practical day-to-day.