Playwright MCP: The Swiss Army Knife for AI Agents
Most people think of Playwright as a testing framework. Run some E2E tests, validate some selectors, call it a day.
We use it as the eyes, hands, and memory of our AI agents. And it has quietly become the single most valuable tool in our entire multi-agent stack.
Here’s why.
What Is Playwright MCP?
MCP — Model Context Protocol — lets AI models call external tools through a standardized interface. Playwright MCP gives your AI agent a real browser. Not a simulated one. Not a headless screenshot service. A full Chromium instance that can navigate, click, type, scroll, read, screenshot, and evaluate JavaScript — all driven by natural language instructions from the agent.
When our agent says “go to the Cloudflare dashboard, find the DNS records for byte-sized.io, and tell me if there’s an A record pointing to 104.21.x.x,” it actually opens a browser, navigates through the Zero Trust UI, reads the page, and reports back. No API needed. No documentation to parse. It just uses the website like a human would.
That’s the unlock. Any web interface becomes an API.
The Five Use Cases Nobody Talks About
When I listed the core use cases in our team’s idea channel, they looked simple. But each one represents a category of work that used to require a human switching contexts, reading docs, and clicking through UIs manually.
1. Debugging
This is the obvious one, but the depth of it surprises people.
Traditional debugging with AI: you paste error logs, stack traces, or screenshots into a chat. The AI guesses what might be wrong based on text patterns.
Playwright MCP debugging: the agent opens your running application, inspects the actual DOM, reads console errors in real time, checks network requests, evaluates JavaScript expressions, and screenshots the broken state — all without you lifting a finger.
Agent: "The form submission is failing. Let me check."
→ Navigates to /signup
→ Fills the form with test data
→ Clicks submit
→ Reads console: "TypeError: Cannot read properties of undefined (reading 'email')"
→ Checks network tab: POST /api/signup returned 422
→ Screenshots the error state
→ "The issue is in the form handler — it's reading req.body.email
but the form sends it as req.body.user.email. Here's the fix..."
The agent didn’t ask you to reproduce the bug. It reproduced it itself, in a real browser, with real network requests. That’s not a testing tool — it’s a debugging partner.
We use this constantly when working on our infrastructure. When something looks wrong with a Cloudflare tunnel, an Unraid container, or a Supabase dashboard, the agent opens the admin UI and investigates directly rather than asking us to describe what we see.
2. Test Case Preconditions
Here’s one that saves enormous time: setting up the state that a test needs to run.
Say you need to test that a user can edit their profile. Before the test can run, you need:
- A registered user account
- The user logged in
- An existing profile with data to edit
- Maybe specific feature flags enabled
Setting this up through an API is fast but requires knowing the API. Setting it up through the UI takes forever but requires zero API knowledge.
Playwright MCP is the best of both worlds. The agent navigates to the signup page, creates a user, logs in, fills out a profile, and then hands you a browser session in exactly the state you need. You describe the precondition in plain English; the agent makes it happen.
"Create a test user with a completed profile, two published blog posts,
and one draft. Then navigate to the drafts page."
Done. No seed scripts. No fixture files. No “let me check the API docs for how to create a post programmatically.”
3. Design Analysis
This is where Playwright MCP really shines in our workflow. Our ui-reviewer agent uses it to perform automated visual audits of every prototype and production deployment.
The workflow:
- Navigate to the target at three viewport widths (375px, 768px, 1280px)
- Screenshot each breakpoint
- Compare against the design specification or previous version
- Test every interactive element — hover states, click targets, form focus
- Validate accessibility: ARIA labels, heading hierarchy, color contrast
- Measure Core Web Vitals directly in the browser
- Report with severity-classified issues
The thresholds are strict: 95% visual accuracy for prototypes, 98% for production. If a button moved 3 pixels or a color shifted outside the acceptable delta, the reviewer flags it.
But it goes beyond pixel comparison. The agent reads the accessibility tree — a structured representation of how screen readers see the page. Missing alt text, broken tab order, insufficient contrast ratios — these show up automatically.
We integrated this into our design pipeline’s quality gates. No code advances to the next stage without passing the visual review. The reviewer doesn’t get tired, doesn’t miss details, and generates a complete audit trail.
4. RTFM’ing — Reading Documentation For You
This might be the most underrated use case. How many times have you needed to figure out how to do something in a tool you rarely use? You open the docs, search for the right page, scroll through irrelevant sections, and eventually find the three sentences that actually answer your question.
Playwright MCP turns your agent into a documentation speed-reader.
When we needed to configure SMTP settings in our self-hosted Infisical instance, the agent didn’t search for “Infisical SMTP configuration” and hope for a good blog post. It navigated to the Infisical docs site, found the configuration page, read the actual documentation, and told us exactly which environment variables to set — with the correct key names and format.
When we set up Cloudflare Tunnels, the agent walked through the Zero Trust dashboard in real time, reading the UI labels and help text to figure out the exact configuration. No stale Stack Overflow answers. No outdated blog posts. The current UI, read directly.
This is especially powerful for tools you use rarely. Your Unraid NAS admin panel. Your router’s firewall settings. The Cloudflare WAF rules interface. These are all web UIs that change frequently and have mediocre documentation. But they’re always accurate about themselves — the UI is the documentation. Playwright MCP reads it directly.
5. Speeding Up Work You’re Familiar With But Not Expert At
This is the sweet spot. You know what you want to do. You’ve done something similar before. But the specific steps are just fuzzy enough that you’d normally spend 15 minutes clicking around and checking settings.
Examples from our daily work:
- Creating a Cloudflare DNS record: You know it goes in the DNS section, you know you need an A record, but which proxy setting? Which TTL? The agent navigates there, reads the current records, and adds the new one with the right settings.
- Setting up a Docker container on Unraid: You know the Community Apps store, you know roughly which fields to fill, but the port mappings and volume paths are always slightly different. The agent reads the template and fills it correctly.
- Configuring GitHub Actions secrets: You know they live in Settings → Secrets, but which repository? And is it “Repository secrets” or “Environment secrets”? The agent navigates to the right place and sets them.
The pattern: you provide the intent, the agent handles the navigation. You’re not learning a new tool. You’re not reading documentation. You’re not context-switching from your code editor to a web browser and back. The agent does the browser work while you stay focused on the actual problem.
How We Wired It Up
Our Playwright MCP setup runs across two environments:
Claude Code (MacBook) — native Playwright MCP server with full browser access. This is where most interactive work happens: debugging, design review, admin UI navigation.
OpenClaw (Cloud) — headless browser with screenshot/snapshot return. Used for automated testing, monitoring, and tasks delegated from ClawControl.
The mhm-playwright agent handles routing:
- Simple tasks (screenshot, single-page read) execute locally
- Complex multi-step tasks get delegated to RCC (Claude Code) via the bridge protocol
- Visual testing pipelines run through the
ui-revieweragent
Session management is critical. Our skill implementation enforces a clean session protocol: kill existing browsers, clear temp files, start fresh. This prevents state leakage between tasks — a screenshot taken for one project shouldn’t have cookies from another.
The Accessibility Tree: Playwright’s Secret Weapon
Most people use Playwright for screenshots. We use browser_snapshot more than browser_take_screenshot.
The snapshot returns the accessibility tree — a structured representation of the page as assistive technology sees it. It includes:
- Element roles (button, link, heading, textbox)
- Labels and descriptions
- States (expanded, selected, disabled)
- Hierarchical relationships
This is often more useful than a screenshot because:
- It’s machine-readable. The agent can reason about structure, not pixels.
- It validates accessibility for free. If a button has no label in the tree, it’s inaccessible.
- It’s deterministic. Unlike screenshots, which vary with rendering, the tree is the same every time.
- It’s lightweight. Text instead of image data means faster processing and lower token cost.
When our ui-reviewer checks a prototype, it takes both: screenshots for visual accuracy, snapshots for structural accuracy. Belt and suspenders.
Real-World Saves
Some specific moments where Playwright MCP saved us significant time this month:
Cloudflare Tunnel Setup: API tokens didn’t have the right permissions for tunnel creation. Instead of debugging the API, the agent opened the Zero Trust dashboard in Playwright, navigated through the tunnel creation wizard, and configured everything through the UI. Problem solved in 5 minutes instead of an hour of API debugging.
Infisical Secrets Manager: Setting up SMTP on a self-hosted Infisical instance required specific environment variables that weren’t clearly documented. The agent navigated the admin panel, found the settings page, and read the field labels and help text to determine the correct configuration.
Blog Deployment Verification: After switching from VPN-based SMB mounts to Cloudflare Pages, we needed to verify every page rendered correctly. The ui-reviewer agent opened each page, compared it against the old deployment screenshots, and flagged two CSS issues we’d have missed.
Discord Channel Audit: We needed to verify 12 Discord channels were configured correctly with the right permissions and descriptions. The agent opened the Discord web app, navigated to each channel’s settings, and compiled a comparison report — in 3 minutes instead of 30 minutes of manual clicking.
When NOT to Use Playwright MCP
It’s not always the right tool:
- When an API exists and you know it. If you can
curlthe answer in one line, don’t open a browser. - When the page requires authentication you can’t provide. Playwright can handle login flows, but if 2FA or SSO is involved, it gets complicated.
- When you need to trigger alerts or modals. Browser dialogs (alert, confirm, prompt) block the entire event loop. If a page might pop an alert, the agent can get stuck.
- When speed matters more than accuracy. Browser automation is slower than API calls. For bulk operations (updating 100 DNS records), use the API.
The Bigger Picture
Playwright MCP represents something larger than browser automation. It’s the principle that AI agents should interact with the same interfaces humans use, not just the ones built for machines.
APIs are great when they exist, when they’re documented, when they’re stable, and when you have the right credentials. That’s a lot of “when”s. Web UIs are always there. They’re always current. And with Playwright MCP, they’re always accessible to your agents.
Every web interface in your stack — your cloud dashboard, your NAS admin panel, your CI/CD pipeline, your monitoring tools, your CMS — is now an agent-accessible tool. No integration work. No API keys. No SDK updates.
That’s why it’s the swiss army knife. Not because it does one thing well, but because it does everything adequately — and it’s always in your pocket.
Part of the AGENT_VAULT multi-agent framework. Browser automation powered by Playwright MCP.