Back to list
🧠 ATou Learn · 💬 Discussion

Block turns AI code review into a "protector": the direction is right, but the evidence is clearly thin

The most valuable thing about this Block post is not "AI is powerful" but that it promotes AI from a code-writing assistant to a process gatekeeper; yet it offers almost no effectiveness data, so for now it reads more like a manifesto for a sophisticated governance framework than a well-validated success story.

2026-04-03 · Original article link ↗

Core takeaways

  • The repositioning is right. Block upgrades AI from "assistant/advisor" to "protector", a more mature call than building yet another copilot: in a large organization, the truly expensive problem is not that code gets written slowly, but that local optimizations steadily erode global architecture, security, and operational constraints.
  • Shifting checks left has real value. Reusing CI capabilities locally through a unified CLI, Just conventions, and pre-commit/pre-push checks is not a gimmick; it attacks the engineering friction of "one setup locally, another in the cloud", and it is solid methodology even without large models.
  • Context architecture matters more than model size. The most defensible part of the article is not Builderbot itself but the "progressive disclosure + on-demand loading" design of AGENTS.md, module-level checks, and Agent Skills; this is more robust than covering a whole system with one super-prompt, and closer to how real complex systems are governed.
  • The "global world model" is clearly idealized. The author assumes the organization has one clear, unified, encodable, maintainable world model, which rarely holds in practice; if the rules themselves are stale, conflicting, or burdened by history, the protector will not protect the system but mechanically amplify organizational chaos.
  • The most critical risk is dodged. The article keeps stressing "act by default rather than recommend", yet says nothing about false-positive rates, blocking boundaries, developer escape hatches, latency costs, or accountability; these are not details but the main questions that decide whether the system succeeds.

Relevance to us

  • What it means for ATou. If ATou is building agent products or workflows, the focus should shift from "completing tasks more like a human" to "enforcing rules and guarding boundaries on the organization's behalf"; a next step is to redesign existing flows as three layers: a unified entrypoint, local context, and on-demand checks.
  • What it means for Neta. If Neta is building knowledge or content systems, this article shows the hard part is not the model's answering ability but context orchestration; the next step is to map out which rules must be globally uniform and which knowledge must be exposed locally, before piling on prompts.
  • What it means for Uota. If Uota cares about agent collaboration in complex systems, this article offers a clear template: do not chase a single omniscient agent; build parallel, specialized subagents instead. A next step is to split one complex task into 3-5 role-based review nodes and aggregate their findings.
  • What it means for investing. The more interesting layer of AI infrastructure is not surface tools that "write code faster" but system-level products for governance, review, compliance, and policy enforcement; the next step is to look for teams that can prove controllable false-positive rates and developers willing to stay in the loop, rather than teams that can only pitch an agent vision.

Discussion starters

1. Once an AI protector holds blocking power, is it quality insurance or a new bureaucratic bottleneck?
2. Can an organization really maintain a "global world model", or does it inevitably decay into piles of mutually conflicting rule files?
3. In an engineering organization, which agent role is most worth productizing: the "executor", or the "reviewer/gatekeeper"?


April 2, 2026

Protecting Our Systems with Intelligence

How we're using agentic reviewers as guardians to maintain system resilience

$ git blame

Joah Gerstenberg

AI enablement at Block

$ cat content.md

Protectors, not assistants

We believe that a core requirement of protecting our world model is to use intelligence as more than simply an assistant that elevates options and waits for a human to take action. Instead, our agents function as vigilant guardians throughout research, planning, and implementation to ensure that our systems are resilient against the degenerative patterns that every engineering organization faces: individual teams shipping features that are locally rational and globally corrosive. As our first protector, Builderbot sits between our builders and the systems that we are building, constantly observing, learning, recommending, and steering changes to align with our world model.

Key principles

Shift left

Protection against the forces that erode systems must happen as early as possible in the software development lifecycle. This is a well-established pattern in software development, but it becomes even more critical as we focus on scaling our ability to ship features faster with intelligence. In most engineering organizations, CI has become the de facto validation layer: test suites are large, builds are complex, and it's often easier to let the build system figure it out than to verify everything locally. Agents change this equation. They can run the same checks locally before pushing, at a speed and consistency that wasn't practical before, if you give them a consistent entrypoint.

To accomplish this, we are implementing a single common CLI contract for local development in all of our repositories using Just. This gives our local agents a standardized entrypoint to all of the same tools that our CI runs. Now, instead of fumbling around when they encounter a new repository, agents have a standard expectation to run just fmt or just test via pre-commit and pre-push hooks before pushing code to a PR. This small change has a massive impact on our local agents' ability to make the right changes quickly and avoid shifting that burden to CI.
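As a sketch of what that contract can look like, here is a minimal justfile; the fmt and test recipe names come from the post, while the underlying commands are placeholders for whatever a given repository's toolchain actually uses:

```just
# justfile: the shared local/CI entrypoint (commands are illustrative)

# Format the tree exactly the way CI formats it
fmt:
    ruff format .

# Run the same test suite CI runs
test:
    pytest -q
```

A pre-push hook can then be a one-liner such as `just fmt && just test`, so an agent (or a human) never has to discover a repository's tooling by trial and error.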

A protector in every module

Each module needs the ability to define custom hooks, checks, and context that are considered when making changes to code within it. It's not sufficient to define one protector and expect it to work for every system, nor to expect a monorepo's rules to cover every module that lives within it. Hyperlocal context in concert with a global world model is a requirement for protectors to have sufficient context when steering changes to the system. Through trial and error, we have evolved our opinions about how to do this the right way. Many agentic reviewers are limited to a single prompt that's expected to cover an entire system, but we have seen much more success when leveraging progressive disclosure to guide agentic system reviews with the right context in the modules that they cover.

AGENTS.md provides one way to progressively disclose context in the modules where it's relevant. Most agents will automatically load an AGENTS.md file when they start working in a directory, and check for more local AGENTS.md context files as they navigate a system. We frequently include hints in our AGENTS.md files to let agents know about external docs they might want to review, or neighboring systems whose implementations need to stay in sync. By carefully crafting nested AGENTS.md files within a project, it's easy to steer an agent with the local context it needs in order to succeed.
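The nesting described above can be pictured as a directory tree; the module names and hint wording here are invented for illustration:

```text
repo/
├── AGENTS.md            # repo-wide: build entrypoints, global conventions
├── payments/
│   └── AGENTS.md        # hint: compliance constraints apply in this module
└── auth/
    └── AGENTS.md        # hint: keep token handling in sync with neighboring services
```

An agent working in payments/ loads both files, while one working at the repository root pays no token cost for the module-local hints.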

Context management is key

While AGENTS.md provides one tool for managing hyperlocal context, it's not sufficient on its own to capture our world model in a format accessible to our protectors. Some agentic reviewers offer methods to provide local context in modules by referencing it in AGENTS.md, but each token added to this context file creates a burden for every agent encountering the module. In order to get this context out of our critical paths, we really like Amp's Code Review Checks pattern, which enables us to move our prompts into .agents/checks/*.md files so the context is only loaded when it's relevant. Just like AGENTS.md, checks can be nested inside individual modules, and each prompt gets executed with its own dedicated review subagent to ensure the signal stays high.
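Following that pattern, a module-scoped check is just a small prompt file; the path and wording below are hypothetical, since the post does not publish its actual checks:

```markdown
<!-- payments/.agents/checks/pci-logging.md (hypothetical) -->
When reviewing changes under payments/, flag any log statement that could
emit cardholder data, and require masking before the change is approved.
```

Because each such file runs in its own review subagent, adding a check costs context only for diffs that touch its module.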

Agent Skills provide another standard for pulling context out of our agents' critical paths. Agent Skills are a highly extensible format for exposing context to agents and allowing them to dynamically equip themselves with it when it becomes relevant to a given task. Through an internal Skills Marketplace, we leverage hundreds of internally written Agent Skills to seed each of our environments with context, making it easy for stateless agents to quickly glean the information they need about our world model and proactively steer decisions during research, planning, and implementation.
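In the openly published Agent Skills convention, a skill is a directory containing a SKILL.md whose frontmatter tells the agent when to equip it; everything below other than the name and description keys is invented for illustration:

```markdown
---
name: payments-world-model
description: Load when working in payments/; summarizes settlement invariants
  and where the authoritative internal docs live.
---

# Payments world model

State the invariants a change must preserve, and point to the internal
sources a reviewer should consult before approving.
```

The frontmatter keeps the skill out of the agent's context until its description matches the task at hand.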

How to build a protector

A protector is fundamentally different from an assistant or an advisor. An assistant waits to be asked. An advisor presents options and steps back. A protector acts: continuously, and mostly below the threshold of awareness. The closest analogy is your immune system: it doesn't wait for you to notice you're sick, and it doesn't present a dashboard of threats and ask you to choose a response. It acts with enormous sophistication, and you only notice it when it fails.

Builderbot's code review system is a protector for our system architecture. No single engineer can hold the full system in their head anymore, but a protector can. It evaluates every proposed change against a model of the whole: not just the module being touched, but the architectural patterns, security requirements, and operational constraints that span the entire organization. Its default is action, not recommendation. It doesn't file a report and wait; it reviews, flags, and steers, with humans providing the final stamp of approval rather than the initial analysis.

A single entrypoint

In order to catch issues as early as possible, local agents should have access to all of the same context, tools, and policies as our agents that run in the cloud. To enable this, we distribute a sq agents review CLI tool, with the full context of local and global knowledge, to every workstation and cloud agent runner. Having a single entrypoint that protects our systems everywhere makes it easy to evolve policies over time. When running against a PR, sq agents review can ensure that alignment with our world model has been verified before asking humans to take a pass and grant final approval.

Specialized reviews with access to global knowledge

With Code Review Checks, we equip module owners with the ability to define hyperlocal review context that gets dispatched to subagents during an execution of sq agents review. Each check runs as an isolated subagent with its own context window: a global check for API standards loads different context than a module-level check for PCI compliance in payments/ or a security review in auth/. These subagents run in parallel, and their findings are aggregated into a single review report.
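The dispatch-and-aggregate flow described above can be sketched in a few lines of Python; the Check type, the scoping rule, and the sample checks are illustrative stand-ins, since the actual sq agents review implementation is internal:

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass
class Check:
    """One review check: a prompt plus the scope it guards."""
    name: str
    scope: str   # "" = global check; otherwise a module path prefix
    prompt: str

def in_scope(check: Check, path: str) -> bool:
    # Global checks see every changed file; module checks see only their module.
    return check.scope == "" or path.startswith(check.scope)

def run_check(check: Check, changed: list[str]) -> dict:
    # Stand-in for spawning an isolated subagent with its own context window,
    # seeded only with this check's prompt and the files it covers.
    files = [p for p in changed if in_scope(check, p)]
    return {"check": check.name, "files": files, "findings": []}

def review(checks: list[Check], changed: list[str]) -> list[dict]:
    # A check is dispatched only if the diff touches its scope at all.
    to_run = [c for c in checks if any(in_scope(c, p) for p in changed)]
    with ThreadPoolExecutor() as pool:           # subagents run in parallel
        return list(pool.map(lambda c: run_check(c, changed), to_run))

checks = [
    Check("api-standards", "", "Review API changes against our standards."),
    Check("pci-compliance", "payments/", "Flag PCI-relevant changes."),
    Check("auth-security", "auth/", "Security-review auth changes."),
]
reports = review(checks, ["payments/ledger.py", "docs/readme.md"])
print([r["check"] for r in reports])  # auth-security is skipped: nothing in scope
```

The list returned at the end mirrors the single aggregated review report the post describes; real checks would of course return model-generated findings rather than empty lists.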

In addition to local checks, we have a constantly evolving set of global checks that get pulled in at review time to verify that new code adheres to our global world model, allowing us to catch issues from afar and steer agentic changes by codifying global concerns. Finally, because our agentic reviews run on our own hardware, we're able to reference internal documents and sources during review that may not otherwise be exposed to a third-party reviewer.

Continuously evolving policies

Proactive protection shouldn't require humans to constantly keep checks in sync with our evolving product direction. We give our protectors a heartbeat to proactively review incidents, announcements, and messages, and to consider which deterministic and non-deterministic checks to propose for human review. These may be proposed locally within a particular module or repository that has a recurring set of issues, or added globally to steer entire systems toward a new evolution of our world model.

Velocity with confidence

By building agents that proactively protect our systems, we are equipping our builders with the tools they need to build with confidence that they are moving in step with the broader organization. Shifting checks to run pre-push makes it faster to catch issues and reduces the burden on the human reviewers who give the final stamp of approval. By providing tools for managing review context within individual modules, we make it easy for service stewards to take a high degree of ownership of the code they are shepherding. Through global checks, we can distribute our world model broadly to our agents, making it easier than ever to protect our systems while moving quickly.

$ cat tags

AI · Software Engineering · Best Practices · Developer Tools · Code Review · +1 more

$

$block_engineering

Inspired by our work? Join Block's engineering team and build the future with us.

cd ./block-careers

Network

$ echo Copyright 2026 Block, Inc. All rights reserved.

Protecting Our Systems with Intelligence | Block Engineering Blog


📋 Discussion Archive

Discussion in progress…