Browser Use
Posted by AdInternational5848@reddit | LocalLLaMA | 9 comments
I'm currently using cloud models for browser use. It’s great when it works, but it’s one of the last things keeping me subscribed. What are you brilliant people doing to enable agentic browser use?
For context:
M1 Ultra
llama.cpp with my own UI
FoxiPanda@reddit
If you're rolling your own, an MCP server is probably your best bet. I'd suggest choosing one of Playwright MCP, Chrome DevTools MCP, or Firefox DevTools MCP.
I'm currently using Playwright in my harness and it works reasonably well for me. Some websites still block it, as they would any automation tool, but that is most likely also true of the cloud tools you're already using.
nunodonato@reddit
I'd suggest playwright-cli instead of the MCP
FoxiPanda@reddit
I toyed with this idea and ended up at the MCP, but I think your primary argument for using the CLI is that it is more token efficient, correct?
AdInternational5848@reddit (OP)
Thank you. I’ve already figured out web fetch and web search with deep think and deep research. Browser use is next on my list
FoxiPanda@reddit
Sure, good luck. I'd suggest breaking the task into two chunks: first enable MCP support in general, so you can use multiple MCP servers going forward, then make one of those browser-use MCPs your first implementation.
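One way to read that two-step plan: a generic layer that can talk to any MCP server, plus a single browser entry registered in it. The sketch below is a minimal illustration under assumptions, not a full client. The `SERVERS` registry and the helper names are hypothetical, the launch command is how the Playwright MCP server is commonly started via npx, and the tool name `browser_navigate` is an assumption that should be confirmed against the server's own `tools/list` reply. It only builds the JSON-RPC 2.0 messages; the `initialize` handshake and the process I/O are left out.

```python
import itertools
import json

# Hypothetical registry: one entry per MCP server the harness may launch.
# Supporting another server later means adding one more entry here.
SERVERS = {
    "playwright": ["npx", "@playwright/mcp@latest"],
}

_ids = itertools.count(1)  # JSON-RPC request ids should be unique per session


def rpc_request(method, params=None):
    """Build one JSON-RPC 2.0 request as a single line of JSON."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def list_tools_request():
    """Ask a server which tools it exposes (MCP method `tools/list`)."""
    return rpc_request("tools/list")


def call_tool_request(name, arguments):
    """Invoke one tool by name (MCP method `tools/call`)."""
    return rpc_request("tools/call", {"name": name, "arguments": arguments})


if __name__ == "__main__":
    print(list_tools_request())
    # `browser_navigate` is an assumed tool name; check the tools/list reply.
    print(call_tool_request("browser_navigate", {"url": "https://example.com"}))
```

Over the stdio transport, each of these strings would be written to the server process's stdin followed by a newline, once the handshake is done. The tool descriptions that come back from `tools/list` are what you would pass to the model as its available tools.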
AdInternational5848@reddit (OP)
Thanks again. Small, simple steps are the way, and I’ll try out Playwright MCP.
ItsFrehMrketBreh@reddit
Better than my idea of starting from scratch
llitz@reddit
agent-browser: runs directly from the CLI, can have different sessions, and works in headless or regular mode. Very similar to the chrome-mcp-devtools, but I think this works better.
You can even have multiple profiles and sessions open at the same time, allowing multiple agents to work independently.
ItsFrehMrketBreh@reddit
Just break down the problem of what surfing the web actually is.
Tools: click, type, analyze, scroll.
Make those tools in Selenium.
Then you need to bypass bot detection, which involves fingerprints and cookies, plus behavior such as scrolling to buttons before clicking them.
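The four tools named above (click, type, analyze, scroll) could be sketched in Selenium roughly as follows. This is a minimal illustration, not a hardened harness. The function names, the `TOOLS` table, and the `{"name": ..., "arguments": {...}}` call shape are assumptions about how your UI hands tool calls to Python, and nothing here deals with bot detection.

```python
# By.CSS_SELECTOR and By.TAG_NAME in Selenium are these plain strings.
CSS = "css selector"
TAG = "tag name"


def make_driver():
    """Start a Chrome session (needs `pip install selenium` and Chrome)."""
    from selenium import webdriver  # imported lazily so the tools stay testable
    return webdriver.Chrome()


def click(driver, selector):
    element = driver.find_element(CSS, selector)
    # Scroll the element into view first, then click it.
    driver.execute_script("arguments[0].scrollIntoView({block: 'center'});", element)
    element.click()
    return f"clicked {selector}"


def type_text(driver, selector, text):
    driver.find_element(CSS, selector).send_keys(text)
    return f"typed into {selector}"


def scroll(driver, pixels=600):
    driver.execute_script("window.scrollBy(0, arguments[0]);", pixels)
    return f"scrolled {pixels}px"


def analyze(driver, max_chars=2000):
    """Return a text snapshot of the page for the model to read."""
    body = driver.find_element(TAG, "body").text
    return f"{driver.title}\n{driver.current_url}\n{body[:max_chars]}"


# Assumed mapping from the tool names the model sees to the functions above.
TOOLS = {"click": click, "type": type_text, "scroll": scroll, "analyze": analyze}


def run_tool(driver, call):
    """Dispatch one model tool call shaped like {"name": ..., "arguments": {...}}."""
    tool = TOOLS.get(call.get("name"))
    if tool is None:
        return f"unknown tool: {call.get('name')}"
    return tool(driver, **call.get("arguments", {}))
```

A typical loop would call `driver = make_driver()`, `driver.get(url)`, then feed the string returned by `run_tool(driver, {"name": "analyze"})` back to the model and execute whatever tool call it emits next. Truncating the page text with `max_chars` is one simple way to keep the context small on a local model.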