
Playwright Tools for MCP

alex_hirner 181 points github.com
mgdev
This is so good. I've been using it with Claude Code with great success.

I just leave an instruction in CLAUDE.md to validate changes with Playwright. It automatically starts a dev server (wrote a little MCP server to do that), navigates to the page with the changes it just made, and validates that its changes worked. If there is anything unexpected, it self-corrects.

It's like working with a really great mid-level engineer.

What a time to be alive.

maleldil
Claude Code is amazing. Unfortunately, it's also very expensive. How do interactions with MCP servers affect token usage/cost?
drewnick
+1 for Claude Code being amazing, and especially +1 for the cost. I've spent $500 this week, $0.10 to $1 at a time, fixing bugs and adding features. It took a while to get used to not @-tagging all of the files and realizing it just "figures it out" (using tokens to do so, of course!)
Aeolun
I burned through $25 in just 3 hours. Claude Code will be great when they can get the cost down. If the cost is like 1/10th of that I’d be using it all the time, but +/- $10 / hour is too much.
ramesh31
> I burned through $25 in just 3 hours. Claude Code will be great when they can get the cost down. If the cost is like 1/10th of that I’d be using it all the time, but +/- $10 / hour is too much.

I've been trying to figure this out, and I don't think it's malicious, but it's just a matter of incentives. Anthropic devs are certainly not paying retail prices for Claude usage, so their benchmark (or just intuition) of efficiency is probably much different than the average user. Without that hard constraint the incentive just isn't there for them to squeeze out a few more pennies, and it ends up way more expensive than stuff like Cline or Cursor.

mgdev
A US-based dev costs 125/hr, on the low end.

A US-based dev directing Claude Code has like 3x output.

So the biz is spending 125 + AI costs, but saving 250/hr.

An individual dev might feel like a superhuman compared to those not using Claude Code. Could even earn them a substantial promotion.

Either way, seems to net out.
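
A rough worked version of the arithmetic above, as a sketch only. The $125/hr rate and the 3x multiplier come from this comment; the roughly $10/hr AI cost is the figure reported upthread. The function and variable names are made up for illustration, and the model assumes the extra output is worth the same hourly rate.

```python
# Back-of-the-envelope model of the estimate above (illustrative names).

def net_saving_per_hour(dev_rate: float, output_multiplier: float,
                        ai_cost_per_hour: float) -> float:
    """Value of the output produced minus what the business pays per hour."""
    value_produced = dev_rate * output_multiplier   # output valued at the dev rate
    total_cost = dev_rate + ai_cost_per_hour        # salary cost plus AI spend
    return value_produced - total_cost


def break_even_ai_cost(dev_rate: float, output_multiplier: float) -> float:
    """AI spend per hour at which the saving drops to zero."""
    return dev_rate * (output_multiplier - 1)


print(net_saving_per_hour(125.0, 3.0, 10.0))  # 240.0
print(break_even_ai_cost(125.0, 3.0))         # 250.0
```

Under these assumptions the "saving 250/hr" figure is the break-even AI spend, and at about $10/hr of usage the net is about $240/hr. The result is only as good as the 3x multiplier.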

codybontecou
Interesting use-case. Can you give an example of a prompt you use that triggers this tool? Are you validating UI changes (button color), navigation, or something more complex?
mettamage
I don't know playwright, but how is this different than puppeteer?

The issue I'm noticing with Puppeteer is that it doesn't always produce the right JavaScript on the first try for a simple task, such as accepting a cookie consent banner.

bdcravens
Playwright is a bit of an evolution of Puppeteer. Mostly the same API, extends the API a bit (I tend to prefer its abstractions over Puppeteer), and designed to work with multiple browsers. It came from many of the same developers as Puppeteer.
paulryanrogers
Does Playwright work with multiple browsers? I get the impression it can work with multiple engines, but they're just custom wrappers and not the full/original browsers.
epolanski
I've been using Playwright for years testing safari/FF/chromium-based engines. Playwright team compiles every single browser at each new release.

It's great, no worries. Besides very minor things like mobile Safari bugs (which you can't test on macOS Safari either; you need a real device or BrowserStack), it's perfect.

the_arun
Is this for test automation? or for using Playwright as "Operator" in an Agent?
codybontecou
I think it's to give the LLM control of your browser.
bdcravens
So instead of specifying explicit selectors, etc, you just use a prompt? (like "Go to eBay.com, search for Playstation 5, and click on the first result that isn't a promoted listing")
simonw
Yes, exactly. It defaults to using the Chrome accessibility tree but it can also be run so it uses Claude's vision feature against screenshots instead.
plondon514
Anyone know how you would use this _within_ playwright to write playwright tests in natural language?
mvdtnz
Click the link and find out.
lxe
It uses ariaSnapshot, which is an accessible representation of the DOM used by screen readers and accessibility validation tools as well as Playwright testing.

However, even with that, it will quickly exhaust the model context if you navigate to something like Gmail. I just verified this with Cursor.

I've been playing around with a much better textual representation of the page that's much more compact:

https://github.com/lxe/chrome-mcp/blob/master/src/runtime-te...

This uses your own chrome session and doesn't require a huge context size.

I might refactor this to use the aria interface available to the CDP, which I wasn't aware of at the time.
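
To illustrate why a compact text form matters for context size, here is a minimal sketch that flattens a nested accessibility-style tree into indented `role "name"` lines and drops unnamed generic wrapper nodes. The node shape (`role`, `name`, `children`) and the pruning rule are assumptions for illustration; this is not the format produced by Playwright's ariaSnapshot or by chrome-mcp.

```python
from typing import Any

Node = dict[str, Any]  # hypothetical node shape: {"role", "name", "children"}


def render(node: Node, depth: int = 0) -> list[str]:
    """Render a tree as indented lines, skipping unnamed generic wrappers."""
    role = node.get("role", "generic")
    name = node.get("name", "")
    children = node.get("children", [])

    # An unnamed generic node (think bare <div>) adds tokens but no meaning:
    # emit its children at the current depth instead of the node itself.
    if role == "generic" and not name:
        lines: list[str] = []
        for child in children:
            lines.extend(render(child, depth))
        return lines

    label = f'{role} "{name}"' if name else role
    lines = ["  " * depth + "- " + label]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines


page: Node = {
    "role": "main",
    "children": [
        {"role": "generic", "children": [
            {"role": "button", "name": "Accept cookies"},
        ]},
        {"role": "link", "name": "Inbox"},
    ],
}

print("\n".join(render(page)))
```

On the sample tree this prints three lines (`main`, the button, and the link); the wrapper node disappears. A real page would need more pruning than this to fit something like Gmail in context.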

dannyobrien
I agree -- I hacked up a CDP-driven MCP so that Claude can drive your own browser instance, and I think that's more in the spirit of how MCP is supposed to work (where it's driving your tools under supervision, rather than spinning up its own context)
atonse
I’m going to see if I can use this in combination with our JIRA MCP to read a bug ticket’s “steps to reproduce” and see if it can translate those steps into actions that actually reproduce the bug.

I don’t understand the hate against MCP. It is truly exciting to see the Cambrian explosion of “connectors” coming out.

This is going to be the “App Store” for models in a way that OpenAI’s custom GPTs never were.

samsepi01
Please share results!
adamtaylor_13
I want an MCP for capybara so LLMs can write my Rails system tests and debug them when they don't work.
mazacx5volvo240
Watermarking and synthesizing text for hosts and clients, private RAG over Slack MCP implementations would disperse LLMs to Local Data Source: A, B, and remote server C.

[1] https://dlsp2024.ieee-security.org/wagner.pdf

rahimnathwani
It seems this new tool from Microsoft is a competitor to https://github.com/executeautomation/mcp-playwright

The Microsoft one seems simpler, whereas the other one has more tools.

ramesh31
Microsoft is the official owner of Playwright
nikcub
and this is how I find out Cursor has a limit of 40 tools.
rpastuszak
Did you manage to get it to work with Cursor? How?
jauntywundrkind
Submitted a couple of times, would love to hear more.

Note also, there's a Fetch-MCP which is Playwright-based and supports batch requests. Would be interesting to compare. https://github.com/jae-jae/fetch-mcp https://news.ycombinator.com/item?id=43419713 (64 points, 6 days ago, 14 comments)

pal9000i
Great release! But I'm wondering why they didn't just support the original Playwright APIs instead of a subset of actions.
febed
Has anyone compared this with BrowserUse?

https://browser-use.com/

NinadSinha
I think the use cases are slightly different for the two. The Playwright MCP depends on the MCP client (like Claude Desktop or Cursor) to provide the intelligence, while browser-use can "think" by itself. Plus it seems that unless you use the vision mode, you are kind of restricted to the accessibility tree, which may not be present or well populated depending on the website you're using. This also means that it won't really work as well with stuff like Cursor/Windsurf, since they don't really process images from MCPs right now.

I'm more in the camp of using Claude computer-use/OpenAI CUA. I think they work better for most things, especially if you don't interact with hidden/obscured elements.

If you're interested in comparing these different services, you can try HyperPilot by Hyperbrowser at https://pilot.hyperbrowser.ai.

Disclaimer: I worked on HyperPilot so I might be a bit biased.

upcoming-sesame
code --add-mcp

VS Code comes with a built-in MCP client?

cendyne
Interestingly it transforms the page into markdown for navigation.
b0dhimind
How can we use this with Cursor agent?