🤖 Agent · 🏗 构建

AI 软件开发的第三时代——从“写代码”到“建工厂”

Cursor 团队把 AI 编码演进分成三代：Tab 补全→同步智能体→云端自治智能体；第三代的关键是把开发者角色从“逐步指挥”改成“定义问题+配置工具+基于产出物审阅”，从而把 IDE 变成一座并行运行的“软件工厂”。
打开原文 ↗

2026-03-02 原文链接 ↗

阅读简报

双语对照

完整翻译

原文

讨论归档

核心观点

三代范式：
1) Tab：低熵、重复性工作自动化（两年左右窗口）；
2) 同步智能体：提示-回应循环推进任务，但需要人持续在线，且吃本地资源；
3) 云端自治智能体：更长时间尺度、少人类指引，独立跑数小时完成更大任务。
云端智能体的两大解耦：
交互解耦：你把任务交出去，去做别的；
资源解耦：每个智能体在独立 VM 上跑，允许更高并发。
“产出物”取代“diff”成为审阅入口：智能体带回日志、录屏、实时预览等，让人无需重建会话上下文也能评估结果。
人的角色迁移：从“指导每一行”转为“问题拆解 + 审阅标准 + 反馈闭环”。
内部数据点：Cursor 内部合并 PR 的 ~35% 已由云端自主智能体创建；采用者常见特征是几乎所有代码由智能体写、人把时间花在拆解与审阅、并行启动多个智能体。

跟我们的关联

“产出物驱动审阅”≈ 把 agent output 当成可交付物（artifact），类似 CI 报告/预览环境的思路。
与 OpenClaw/编排层思路互补：并行运行多个 agent 的前提是任务状态、工具、上下文与验收标准可机器化。

讨论引子

当智能体能跑数小时并产出 artifacts，你觉得最该被产品化/自动化的审阅标准是什么？（例如：性能回归阈值、截图对比、错误预算、可观测性）

几年前我们开始打造 Cursor 时，大多数代码仍是一键一键敲出来的。Tab 自动补全改变了这一点，开启了 AI 辅助编码的第一个时代。

随后，智能体出现，开发者转而通过同步的提示—回应循环来指挥智能体工作。这是第二个时代。如今，第三个时代正在到来。它由这样一类智能体所定义：它们能在更长的时间尺度上、以更少的人类指引，独立完成更大规模的任务。

因此，Cursor 不再主要是为了写代码。它在帮助开发者建造一座“生产软件的工厂”。这座工厂由成群的智能体组成，开发者把它们当作队友来协作：给出最初的方向，为它们配齐能够独立工作的工具，并审阅它们的工作。

Cursor 团队里，许多人已经在这样工作。我们合并的 PR 中，超过三分之一现在由在云端、在各自“电脑”上运行的智能体生成。一年之后，我们认为绝大多数开发工作都会由这类智能体完成。

从 Tab 到智能体

Tab 擅长识别哪些低熵、重复性的工作可以被自动化。在将近两年的时间里，它带来了显著的生产力杠杆。

随后，模型变得更强。智能体能够容纳更多上下文、使用更多工具，并执行更长的行动序列。开发者的习惯开始发生变化：夏天里缓慢转变，而在最近几个月里迅速加速。

这种转变之彻底，以至于今天，许多 Cursor 用户从不去按 Tab 键。2025 年 3 月，我们的 Tab 用户数量大约是智能体用户的 2.5 倍。如今形势反转：智能体用户数量已是 Tab 用户的 2 倍，Cursor 中智能体的使用也激增。

但这场转变很快又在让位于更大的变化。Tab 时代持续了将近两年。第二个时代——大部分工作通过同步智能体完成的时代——可能连一年都未必能持续。

云端智能体与产出物

与 Tab 相比，同步智能体工作在技术栈更上层。它们能处理需要上下文与判断的任务，但仍会让开发者在每一步都保持参与。然而，这种实时交互的形式，再加上同步智能体会在本地机器上争夺资源，意味着一次只与少数智能体协作才现实可行。

云端智能体同时移除了这两项约束。每个智能体都运行在自己的虚拟机上，使得开发者可以把任务交出去，转而去做别的事。智能体会花上数小时推进任务，不断迭代与测试，直到对输出有信心，然后带着便于快速审阅的结果返回：日志、视频录屏、实时预览，而不是 diffs。

这让并行运行多个智能体变得可行，因为产出物与预览提供了足够的上下文，让你无需从头重建每次会话就能评估输出。人的角色也从“指导每一行代码”转向“定义问题并设定审阅标准”。

变革已在 Cursor 内部展开

在 Cursor 内部，我们合并的 PR 中有 35% 现在由在云端虚拟机（VM）中自主运行的智能体创建。我们看到，采用这种新工作方式的开发者，通常具有三个特征：

他们的代码几乎 100% 由智能体编写。
他们把时间用在拆解问题、审阅产出物 / 代码、以及给出反馈上。
他们会同时启动多个智能体，而不是把一个智能体手把手带到完成。

在这种方法成为软件开发的标准范式之前，还有大量工作要做。在工业级规模下，一个单个开发者还能绕过去的不稳定测试或损坏环境，会变成一种会打断每一次智能体运行的失败。更广泛地说，我们仍需要确保智能体能尽可能有效地运作，并能完整访问其所需的工具与上下文。

我们认为，昨天的发布是朝这个方向迈出的初步但重要的一步。

When we started building Cursor a few years ago, most code was written one keystroke at a time. Tab autocomplete changed that and opened the first era of AI-assisted coding.

几年前我们开始打造 Cursor 时，大多数代码仍是一键一键敲出来的。Tab 自动补全改变了这一点，开启了 AI 辅助编码的第一个时代。

Then agents arrived, and developers shifted to directing agents through synchronous prompt-and-response loops. That was the second era. Now a third era is arriving. It is defined by agents that can tackle larger tasks independently, over longer timescales, with less human direction.

As a result, Cursor is no longer primarily about writing code. It is about helping developers build the factory that creates their software. This factory is made up of fleets of agents that they interact with as teammates: providing initial direction, equipping them with the tools to work independently, and reviewing their work.

Many of us at Cursor are already working this way. More than one-third of the PRs we merge are now created by agents that run on their own computers in the cloud. A year from now, we think the vast majority of development work will be done by these kinds of agents.

From Tab to agents

从 Tab 到智能体

Tab excelled at identifying where low-entropy, repetitive work could be automated. For nearly two years, it produced significant leverage.

Tab 擅长识别哪些低熵、重复性的工作可以被自动化。在将近两年的时间里，它带来了显著的生产力杠杆。

Then the models improved. Agents could hold more context, use more tools, and execute longer sequences of actions. Developer habits began to shift, slowly through the summer, then rapidly over the last few months.

The transformation has been so complete that today, many Cursor users never touch the tab key. In March 2025, we had roughly 2.5x as many Tab users as agent users. Now, that is flipped: we now have 2x as many agent users as Tab users and agent usage in Cursor has surged.

But already this shift is giving way to something bigger. The Tab era lasted nearly two years. The second era, in which most work is done with synchronous agents, may not last one.

Cloud agents and artifacts

云端智能体与产出物

Compared to Tab, synchronous agents work further up the stack. They handle tasks that require context and judgment, but still keep the developer in the loop at every step. But this form of real-time interaction, combined with the fact that synchronous agents compete for resources on the local machine, means it is only practical to work with a few at a time.

Cloud agents remove both constraints. Each runs on its own virtual machine, allowing a developer to hand off a task and move on to something else. The agent works through it over hours, iterating and testing until it is confident in the output, and returns with something quickly reviewable: logs, video recordings, and live previews rather than diffs.

This makes running agents in parallel practical, because artifacts and previews give you enough context to evaluate output without reconstructing each session from scratch. The human role shifts from guiding each line of code to defining the problem and setting review criteria.

The shift is underway inside Cursor

变革已在 Cursor 内部展开

Thirty-five percent of the PRs we merge internally at Cursor are now created by agents operating autonomously in cloud VMs. We see the developers adopting this new way of working as characterized by three traits:

Agents write almost 100% of their code.

他们的代码几乎 100% 由智能体编写。

They spend their time breaking down problems, reviewing artifacts / code, and giving feedback.

他们把时间用在拆解问题、审阅产出物 / 代码、以及给出反馈上。

They spin up multiple agents simultaneously instead of handholding one to completion.

他们会同时启动多个智能体，而不是把一个智能体手把手带到完成。

There is a lot of work left before this approach becomes standard in software development. At industrial scale, a flaky test or broken environment that a single developer can work around turns into a failure that interrupts every agent run. More broadly, we still need to make sure agents can operate as effectively as possible, with full access to tools and context they need.

We think yesterday's launch is an initial but important step in that direction.