### One-liner

Your Browser Is the Bottleneck for OpenClaw

Upgrade the browser from "a dev tool on my machine" to "disposable remote infrastructure", and OpenClaw can finally run in parallel safely and reliably.

2026-02-19

### Key Points

  • The local browser is an invisible security hole. Letting an agent drive your real browsing state punches straight through the permission boundary: cookies, login state, extensions, and history can all be misused or leaked.
  • Go parallel and the bottleneck hits RAM/stability before the model. A single local Chromium is a "dev debugging tool", not infra designed for concurrent multi-session use: memory spikes, latency jitter, runs that turn flaky over time.
  • The key to Browser Sandbox is not "remote" but "disposable at will + preinstalled toolchain". No Chromium/driver installs; agent-browser/Playwright come ready. You can run the agent on a low-spec box or a Raspberry Pi and push the heavy lifting (browsing) into an isolated environment.
  • An intent-level interface suits the agent's web layer better than Playwright code. agent-browser commands like "open/click/fill/snapshot/scrape" let the agent write and debug fewer brittle scripts, sinking the complexity into the infrastructure.
  • Context offloading = more stability + fewer tokens. Return artifacts (snapshots/extracted content) instead of stuffing raw DOM/logs into the prompt; combined with file-system caching, this cuts pointless context bloat.
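
The intent-level idea in the bullets above can be sketched as a toy dispatcher. This is a conceptual illustration only; `FakeBrowser` and its return strings are invented for this sketch, not Firecrawl's implementation (the real verbs run remotely via agent-browser):

```python
from dataclasses import dataclass, field

@dataclass
class FakeBrowser:
    """Toy stand-in for a remote sandboxed browser session."""
    url: str = ""
    log: list = field(default_factory=list)

    def execute(self, intent: str) -> str:
        # The agent sends a small verb phrase; the brittle details
        # (selectors, waits, retries) live behind this boundary,
        # not in agent-generated scripts.
        verb, _, arg = intent.partition(" ")
        self.log.append(verb)
        if verb == "open":
            self.url = arg
            return f"opened {arg}"
        if verb == "snapshot":
            return f"snapshot of {self.url}"  # an artifact, not raw DOM
        if verb == "scrape":
            return f"extracted content of {self.url}"
        raise ValueError(f"unknown intent: {verb}")

session = FakeBrowser()
session.execute("open https://news.ycombinator.com")
session.execute("snapshot")
```

The point of the shape: the agent's vocabulary stays tiny and stable, while everything that breaks across site changes is someone else's (the infra's) problem.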

### Why This Matters to Us

  • First principle for scaling OpenClaw: decouple the control plane from the execution plane. The agent makes decisions; the browser is the execution engine. As long as the browser stays tied to one machine, any "multi-session/multi-agent" plan gets dragged down by that machine's limits.
  • Tighten the security boundary: do not let the agent touch real browsing state. If a local browser must be used, hard constraints like "isolated profile / isolated container / read-only cookies" should be the default; better still, go straight to a remote sandbox.
  • Takeaway for Neta: treat "browser capability" as a reusable module. Internal ops scraping, competitor research, UGC content collection, automated publishing: they are all contending for the same stable, controllable web execution layer.
  • The cost/stability trade-off will change our infra choices. Rather than beefing up the local machine, turn the browser into an elastic resource pool: spin up on demand, discard on failure, scale out for concurrency.
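
The "elastic pool" pattern in the last bullet can be sketched as follows. All names here (`SandboxSession`, `BrowserPool`) are hypothetical; this is a minimal sketch of the discard-on-exit lifecycle, not any real provider's API:

```python
import contextlib

class SandboxSession:
    """Toy disposable browser session."""
    def __init__(self, sid: int):
        self.sid, self.closed = sid, False

    def run(self, intent: str) -> str:
        if "crash" in intent:
            raise RuntimeError("browser crashed")
        return f"[session {self.sid}] {intent}: ok"

    def destroy(self) -> None:
        self.closed = True

class BrowserPool:
    """Hand out fresh sessions and always destroy them, success or failure."""
    def __init__(self):
        self._count = 0

    @contextlib.contextmanager
    def session(self):
        self._count += 1
        s = SandboxSession(self._count)
        try:
            yield s
        finally:
            # Discard on exit: concurrency comes from more sessions,
            # not from a bigger local machine.
            s.destroy()

pool = BrowserPool()
with pool.session() as s:
    s.run("open https://example.com")
```

Because every session is torn down in `finally`, a crashed browser never leaks state into the next run; scaling out is just opening more context managers.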

### Discussion Starters

  • Are our most common OpenClaw flaky cases really "model/prompt problems", or "the browser execution layer is not infra-grade" problems? What is the smallest experiment that would tell them apart?
  • What should the default security policy for browser automation be: allow the agent to touch real login state, or sandbox everything by default? How do we draw the boundary without dragging down efficiency?
  • Is the web layer we need Playwright (code) or agent-browser (an intent interface)? What are the failure modes and maintenance costs of each?

Your Browser Is the Bottleneck for OpenClaw

Everyone in the valley has either tried OpenClaw or has a coworker shipping something with it. And one of the first problems people run into is browser automation.

The default setup is to let OpenClaw drive your local browser. It works for a few workflows, but the costs show up quickly: you are putting an agent in the same environment as your real browsing state, which opens a massive security hole; and the moment you try to run a few sessions in parallel, your machine becomes the bottleneck. RAM spikes, the agent slows down, and runs get flaky.

We built Browser Sandbox in Firecrawl because we kept hitting the same ceiling: local browsers behave like dev tooling, not infrastructure.

Browser Sandbox moves that work into a secure, remote, disposable browser environment. No local Chromium installs. No driver setup. agent-browser and Playwright are already there. Your agent can spin up a browser on demand, one session or dozens, without tying the workload to the machine it is running on. Your OpenClaw agent can run on a free-tier EC2 instance, a Raspberry Pi, or whatever you have, while the browsing happens elsewhere.

I encourage you to try it out.

Setting up Firecrawl

Tell your agent to install Firecrawl with one command:

npx -y firecrawl-cli init --browser

This installs the Firecrawl CLI, pops open a browser so you can authenticate with Firecrawl, and then installs the skill.

Your agent is now ready to browse the web. Once installed, have your OpenClaw agent try it.

Tell your agent:

"Use Firecrawl Browser Sandbox to open Hacker News and get me the top 5 news of the day and the first 10 comments on each"

Under the hood, the firecrawl browser ... commands use the agent-browser interface to execute actions in a secure sandbox. That matters because your agent can issue intent-level commands ("open", "click", "fill", "snapshot", "scrape") instead of generating and debugging Playwright code. Playwright is still there if you need it.

What your agent will end up doing looks like this:

firecrawl browser "open https://news.ycombinator.com"

firecrawl browser "snapshot"

firecrawl browser "scrape"

firecrawl browser close

A few mechanics worth calling out:

Shorthand auto-session: the shorthand form (firecrawl browser "...") auto-launches a sandbox session if none is active, so the agent does not need to manage session lifecycle up front.

agent-browser by default: those quoted commands are sent to agent-browser automatically inside the sandbox.

Context offloading + token efficiency: the agent gets back artifacts (snapshots/extracted content) instead of hauling raw DOM and driver logs into the prompt. It also uses the file system to save fetched pages and interactions, querying them only when needed.
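
The offloading pattern can be sketched like this. The `offload` helper is hypothetical (not the Firecrawl CLI's actual storage layer); it only illustrates the principle of handing the agent a short file reference instead of raw page content:

```python
import hashlib
import pathlib
import tempfile

# Hypothetical on-disk artifact cache for fetched pages.
CACHE = pathlib.Path(tempfile.mkdtemp())

def offload(url: str, raw_content: str) -> str:
    """Persist a fetched artifact; return a short reference, not the content."""
    name = hashlib.sha256(url.encode()).hexdigest()[:16] + ".txt"
    path = CACHE / name
    path.write_text(raw_content, encoding="utf-8")
    # The prompt only ever carries this path; the agent re-reads the
    # file (or a slice of it) when, and only when, it actually needs to.
    return str(path)

ref = offload("https://news.ycombinator.com", "<html>...imagine a multi-MB DOM...</html>")
```

A multi-megabyte DOM becomes a path a few dozen characters long, which is the whole token-efficiency argument in one line.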

And the best part: you get a reliable, full web toolkit. Scrape, search, and browser automation, all through a single CLI your agent already knows how to use.

Link: http://x.com/i/article/2024194854893629440


Related Notes

Your browser is the bottleneck for OpenClaw

  • Source: https://x.com/nickscamara_/status/2024226351369376211?s=46
  • Mirror: https://x.com/nickscamara_/status/2024226351369376211?s=46
  • Published: 2026-02-18T20:55:45+00:00
  • Saved: 2026-02-19


📋 Discussion Archive

Discussion in progress…