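AI Coding Loops: From Prompts to Orchestrators

The center of value in AI coding has shifted decisively from model inference to loop orchestration; automation without a verification mechanism is just a money-burning machine that mass-produces errors.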

2026-06-09

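Key points

Definition reframed: A loop is not new magic but a cron job with decision-making authority. Its core value lies in the model's dynamic decisions inside the loop, not in a fixed script.
Cost shift: The expensive resource has moved from token consumption to loop management. The billing risk from an infinite loop far exceeds the cost of model inference.
Skills as assets: The library of named skills a loop calls is the reusable asset. A loop that relies purely on on-the-fly derivation is just burning budget.
Verification loop: A loop without self-verification and hard stop conditions is just a machine that mass-produces confident mistakes. The feedback mechanism determines trustworthiness.

How this relates to us

For ATou, this means the engineering-management paradigm moves up a level; the next step is to set budget ceilings and stop criteria for autonomous agents.
For Neta, this means the tech stack must shift from prompt optimization to building state persistence and verification gates.
For Uota, this means staying wary of marketing packaging; the next step is to verify whether existing automation workflows actually have a real decision loop.

Discussion starters

When a loop has autonomous decision-making authority, how should responsibility for code errors be assigned: to the developer or to the agent?
With a limited budget, which should come first: investing in model capability or in loop verification mechanisms?
Can existing CI/CD pipelines absorb the high-frequency commits that autonomous agents produce?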

The most repeated sentence in AI coding this week is six words long, and almost nobody saying it can define it. One tweet had the entire timeline in a chokehold this week, so I ran /last30days on the word everyone was fighting about. The answer is real, it has a five-year lineage, and the punchline is that the loop, not the model, is now the expensive part.

The tweet that has the timeline in a chokehold

One tweet has had the entire AI-coding timeline obsessed this week. Peter Steinberger posted it on June 7, it cleared 2.2 million views, and the replies turned into a brawl over what it actually meant.

“Here's your monthly reminder that you shouldn't be prompting coding agents anymore. You should be designing loops that prompt your agents.”

@steipete, June 7, 2026

That is the sentence everyone is quoting. The most telling reply came from Varadh Jain, who asked the only question that mattered: what does this look like in practice? And the answer that became the whole mood was Matthew Berman's.

“nobody knows but him and boris.”

@MatthewBerman, June 7, 2026

That is the real story. Not that loops are the future, but that a six-word phrase hit two million views while the people boosting it argued in the replies about what it meant. I did not roll my eyes, because I run a loop every night that opens pull requests across roughly thirty open-source repos while I sleep. Ninety seconds of research handed back fifteen Reddit threads, twenty-one X posts, and one uncomfortable pattern: the loudest idea in AI coding is one most people repeating it cannot explain. One camp shouted that prompt engineering is dead. Another camp, the one with their hands actually on a keyboard, was more careful.

“It's not ralph/goal loops, that's old hat by now. It's probably some kind of continuous orchestration loop that oversees other threads/agents.”

@trashpandaemoji, June 7, 2026

That reply is the closest thing to a correct answer anyone posted. Hold onto it.

What a loop actually is

Boris Cherny created Claude Code as a side project in September 2024. It now reportedly sits behind close to four percent of all public commits on GitHub. On stage at the Acquired Unplugged event hosted by WorkOS on June 2, he gave the cleanest definition of a loop you will find.

“Now it's actually leveled up, I think, again, to the next wave of abstraction where I don't prompt Claude anymore. I have loops that are running. They're the ones that are prompting Claude and figuring out what to do. My job is to write loops.”

Boris Cherny, WorkOS Acquired Unplugged, June 2, 2026

So here is the plain version. A loop is a small program you write that prompts the coding agent for you, reads what it produced, decides whether it is done, and if not, prompts it again. You stop being the thing inside the loop typing prompts. You become the author of the loop. The model becomes a subroutine.
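The plain version above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_loop`, `is_done`, and the stand-in `fake_agent` are hypothetical names, and a real loop would call an actual coding agent where the stand-in is used.

```python
from typing import Callable


def run_loop(
    agent: Callable[[str], str],
    is_done: Callable[[str], bool],
    task: str,
    max_ticks: int = 10,
) -> tuple[str, int]:
    """Prompt the agent, read its output, and re-prompt until is_done says stop."""
    prompt = task
    output = ""
    for tick in range(1, max_ticks + 1):
        output = agent(prompt)      # the model is called like a subroutine
        if is_done(output):         # the loop, not a human, checks the result
            return output, tick
        # Not done yet: build the next prompt from the task and the last attempt.
        prompt = f"{task}\n\nPrevious attempt:\n{output}\n\nContinue."
    return output, max_ticks


def make_fake_agent() -> Callable[[str], str]:
    """Hypothetical stand-in for a coding agent: it 'finishes' on its third call."""
    calls = {"n": 0}

    def fake_agent(prompt: str) -> str:
        calls["n"] += 1
        return "DONE" if calls["n"] >= 3 else f"draft {calls['n']}"

    return fake_agent


result, ticks = run_loop(make_fake_agent(), lambda out: out == "DONE", "fix the build")
print(result, ticks)
```

The `max_ticks` cap is there so that even this toy version halts if the done check never passes.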

Boris tells it as three stages, and placing yourself on his ladder is the fastest way to get it. A year ago he wrote code by hand with autocomplete. Then he ran five to ten Claude sessions in parallel and prompted each one. Now he does not prompt at all. He writes the loops that prompt Claude, and a couple hundred agents read his GitHub, Slack, and Twitter and decide what to build next. He has the receipt.

“In the last 30 days, 100% of my contributions to Claude Code were written by Claude Code. I landed 259 PRs.”

Boris Cherny, via Simon Willison, December 27, 2025

He deleted his IDE in November and has not opened it since. The nuance the prompt-engineering-is-dead crowd skips: he is not saying engineers are obsolete. Someone still has to decide what to build, talk to customers, and coordinate teams, and he says great engineers matter more than ever. The job did not vanish. It moved up an altitude, from writing the code to writing the thing that writes the code.

The spectrum: from ReAct to orchestration

The replies were a mess because “loop” hides at least five different things. Here is the ladder, oldest to newest, so you can stop talking past people.

Stage one is the academic while-loop. The 2022 ReAct paper formalized it: the model reasons, calls a tool, reads the result, and repeats until done. One model, one loop, a human watching. Stage two is AutoGPT in 2023, which gave the loop a goal and let it prompt itself, and which became famous for spinning forever doing nothing. That failure seeded years of “agents are a toy.”

Stage three is the one Trash Panda called old hat: the ralph loop, published by Geoffrey Huntley in July 2025. It is almost insultingly simple, a bash one-liner that pipes the same prompt file into the agent over and over. Its real innovation was discipline: every iteration resets the context to a fixed set of anchor files instead of letting the conversation grow. Huntley built an entire programming language with it for about 297 dollars. Stage four productized that: in spring 2026 both Codex and Claude Code shipped a /goal command that runs the ralph loop until a small validator model confirms the task is done.
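The discipline described for the ralph loop, rebuilding the context from fixed anchor files on every iteration, might look roughly like this. It is a Python sketch rather than Huntley's bash one-liner, and the file names `PROMPT.md` and `PLAN.md`, the `ralph` function, and the recording stand-in for the agent are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import Callable


def build_prompt(prompt_file: Path, anchors: list[Path]) -> str:
    """Rebuild the full prompt from disk; nothing from earlier iterations is carried over."""
    parts = [prompt_file.read_text()]
    for path in anchors:
        parts.append(f"--- {path.name} ---\n{path.read_text()}")
    return "\n\n".join(parts)


def ralph(
    agent: Callable[[str], object],
    prompt_file: Path,
    anchors: list[Path],
    iterations: int,
) -> list[int]:
    """Feed the same prompt file to the agent repeatedly, with a fresh context each time."""
    sizes = []
    for _ in range(iterations):
        prompt = build_prompt(prompt_file, anchors)  # context reset: read from disk again
        sizes.append(len(prompt))
        agent(prompt)  # whatever the agent changes on disk is the only memory
    return sizes


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "PROMPT.md").write_text("Implement the next unchecked item in the plan.")
    (root / "PLAN.md").write_text("- [ ] parser\n- [ ] tests")
    seen: list[str] = []  # hypothetical agent: it only records the prompts it receives
    sizes = ralph(seen.append, root / "PROMPT.md", [root / "PLAN.md"], iterations=3)

print(sizes)
```

Because the stand-in agent changes nothing on disk, every iteration sees an identical prompt. With a real agent the anchor files would evolve, but the conversation itself would still never accumulate.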

Stage five is what Boris and Steinberger actually mean, and it is genuinely new, not just renamed. Four things changed. The loop became the unit of work, not the task. Loops started supervising other loops, concurrently and on a schedule. Scheduling replaced the human kickoff, so the loop runs on infrastructure time instead of your attention. And durability became explicit, with git-backed state and crash recovery, because these things have to survive a restart. Ralph assumed your terminal stayed open. The 2026 version assumes it does not. So Trash Panda was right twice: the single-agent ralph loop is old hat, and the multi-agent orchestration loop on top of it is the new thing.
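The durability point can be illustrated with a checkpoint that survives a restart. The article describes git-backed state; the sketch below swaps in a plain JSON file with an atomic rename, purely to keep the example self-contained. The names `load_state`, `save_state`, `run`, and `loop_state.json` are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def load_state(path: Path) -> dict:
    """Resume from the checkpoint if one exists, otherwise start fresh."""
    if path.exists():
        return json.loads(path.read_text())
    return {"next_index": 0, "done": []}


def save_state(path: Path, state: dict) -> None:
    """Write to a temp file, then rename, so a crash never leaves a half-written checkpoint."""
    tmp_path = path.with_suffix(".tmp")
    tmp_path.write_text(json.dumps(state))
    os.replace(tmp_path, path)


def run(path: Path, tasks: list[str], crash_after: Optional[int] = None) -> dict:
    """Process tasks in order, checkpointing after each; optionally simulate a crash."""
    state = load_state(path)
    processed = 0
    while state["next_index"] < len(tasks):
        if crash_after is not None and processed == crash_after:
            raise RuntimeError("simulated crash")
        state["done"].append(tasks[state["next_index"]])
        state["next_index"] += 1
        save_state(path, state)
        processed += 1
    return state


with tempfile.TemporaryDirectory() as tmp:
    checkpoint = Path(tmp) / "loop_state.json"
    tasks = ["lint", "test", "deploy"]
    try:
        run(checkpoint, tasks, crash_after=2)  # dies after finishing two tasks
    except RuntimeError:
        pass
    final = run(checkpoint, tasks)  # the restart picks up at the third task

print(final["done"])
```

The second call neither repeats nor loses finished work, which is the property the 2026 version assumes it needs.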

It's just a cron job with a hat on

The best skeptic line in the entire corpus was five words, posted under someone gushing that “loops is where it will go.”

“Cronjobs have funny re-branding rn.”

X reply, loops discourse, June 2026

This deserves a straight answer, not a dodge, because it is half right. Yes, the scheduling layer is cron. Boris literally runs his on cron. The /loop command in Claude Code uses cron under the hood. If your whole definition of a loop is a thing that runs on a timer, then yes, we invented that in 1975 and you can go home.

What cron never had is the part in the middle. A cron job runs a fixed script. A loop runs a model that looks at the current state, decides what to do next, does it, checks whether it worked, and decides whether to keep going. The decision is the agent's, not yours, and not a hardcoded branch. Stack those, let one loop dispatch and supervise others, give them durable shared state, and you have something cron cannot express. The honest framing is not that loops are new magic and not that loops are just cron. It is that loops are cron plus a decision-maker in the body, and the interesting engineering is everything you wrap around that decision so it does not run off a cliff.
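The contrast between a fixed script and a decision-maker in the body can be made concrete with a small sketch. Everything here is hypothetical: `cron_tick`, `loop_tick`, and the state fields are invented for illustration. The stand-in `fake_model_decide` uses simple rules only so the example is deterministic; in a real loop that function would be a call to the agent.

```python
from typing import Callable

State = dict[str, int]


def cron_tick(state: State) -> str:
    """A fixed script: the same action on every tick, whatever the state looks like."""
    return "run_tests"


def loop_tick(state: State, decide: Callable[[State], str]) -> str:
    """A loop body: something inspects the current state and picks the next action."""
    return decide(state)


def fake_model_decide(state: State) -> str:
    """Deterministic stand-in for the model's decision (an assumption, not a real agent)."""
    if state["failing_tests"] > 0:
        return "fix_tests"
    if state["open_review_comments"] > 0:
        return "address_comments"
    return "stop"


states = [
    {"failing_tests": 2, "open_review_comments": 1},
    {"failing_tests": 0, "open_review_comments": 1},
    {"failing_tests": 0, "open_review_comments": 0},
]
cron_actions = [cron_tick(s) for s in states]
loop_actions = [loop_tick(s, fake_model_decide) for s in states]
print(cron_actions)
print(loop_actions)
```

The scheduling is identical in both cases. Only the body differs, and that body is where the guardrails have to go.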

What it looks like when you actually build one

Enough theory. The on-ramp is one line. Claude Code shipped /loop, and Boris's own example is the canonical starter. Paste this and change the nouns.

/loop babysit all my PRs. Auto-fix build issues, and when comments come in, use a worktree agent to fix them.

And here is his fuller recipe. Days later, Boris posted five tips for running Opus autonomously for hours or days.

Five tips, in his words: use auto mode for permissions so Claude doesn't ask for approval; use dynamic workflows to have Claude orchestrate hundreds or thousands of agents to get a task done; use /goal or /loop to nudge Claude to keep going until it's done; use Claude Code in the cloud so you can close your laptop; and make sure Claude has a way to self-verify its work end to end.

@bcherny, June 2026

Tip five is the one the hype skips and the practitioners obsess over: a loop is only as trustworthy as its ability to check its own work.

That is the whole idea in miniature. You did not write the steps. You wrote the intent and the stopping behavior, and the loop prompts the agent each tick. On TikTok the framing landed cleanly for a general audience.

“Loop mode is one of the clearest signs that AI coding is moving from one-off prompts to background operations.”

@ai.native.founder on TikTok, June 2026

The deep end is Steve Yegge's Gas Town, launched in January: twenty to thirty Claude Code instances coordinated by a Mayor agent, with patrol agents that run continuous loops and state stored in git so work survives a crash. That is the continuous orchestration loop that oversees other threads Trash Panda was reaching for, shipped and open source.

But the most practical lesson in the research is that a loop is only as good as its ability to check itself. The fastest-growing sub-theme was not orchestration, it was verification.

“Your coding agent can move fast, but bad commits compound fast too.”

@DanKornas, June 2026

Kornas is shipping roborev, a tool that reviews every commit in the background and feeds the findings back into the agent while the context is still fresh. An open loop that writes code with no feedback is a machine for generating confident mistakes. A loop that writes, runs, reads the result, and corrects is the thing that actually works. The loop is not the magic. The feedback inside it is.
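A loop that writes, runs, reads the result, and corrects might be sketched as follows. This is not how roborev works; it is a minimal illustration, assuming a hypothetical `fake_agent` that returns source code and a hard-coded `CHECK` that stands in for a real test suite.

```python
import subprocess
import sys
from typing import Callable

# Hypothetical acceptance check appended to whatever the agent writes.
CHECK = "\nassert add(2, 3) == 5, 'add(2, 3) should be 5'\n"


def verify(candidate: str) -> tuple[bool, str]:
    """Run the candidate plus the check in a fresh interpreter; return (passed, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", candidate + CHECK],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return proc.returncode == 0, proc.stderr


def write_verify_correct(
    agent: Callable[[str, str], str], task: str, max_rounds: int = 3
) -> tuple[str, int]:
    """Write, run, read the result, and feed failures back until the check passes."""
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        candidate = agent(task, feedback)
        passed, stderr = verify(candidate)
        if passed:
            return candidate, round_no
        feedback = stderr  # the failure output becomes part of the next prompt
    raise RuntimeError("verification never passed")


def fake_agent(task: str, feedback: str) -> str:
    """Stand-in agent: wrong on the first try, correct once it has seen any feedback."""
    if not feedback:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


code, rounds = write_verify_correct(fake_agent, "write add(a, b)")
print(rounds)
```

Without the `verify` step, the first and wrong draft would have been accepted with full confidence.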

The plot twist: the loop is now the expensive part

Here is where the research turned from philosophy to a finance problem. The sharpest deflation of the whole agents mythology came from a working engineer.

“Every ai agent i shipped this year is a for-loop, an llm call, and a try/catch around the json parsing. The only thing agentic about it is the anthropic bill at the end of the month.”

@rohit_jsfreaky, June 2026

That bill is not a joke. The receipt of the month: Uber capped its engineers at 1,500 dollars per person per tool per month for Claude Code and Cursor after burning its annual AI budget in four months. Once the model writes the code for almost nothing, the cost moves to the loop running it.

“The costliest thing in AI coding is no longer writing code, it's managing the agent loop.”

@runes_leo, June 2026

And the failure mode everyone in production is scared of is the loop that does not stop.

“Without guardrails, you get infinite loops and billing surprises orders of magnitude over budget.”

@cv_usk, June 2026

Which is why every serious 2026 write-up on loops converges on the same three hard stops: a maximum iteration count, no-progress detection, and a token or dollar budget ceiling. The romantic version of loops is that you write the loops and a thousand agents build your company overnight. The production version is that you write the loops, and most of your job is making sure they halt. Gartner puts agentic AI at the peak of inflated expectations, with only about seventeen percent of organizations actually deploying agents. The gap between the timeline and the receipts is the real state of play.
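The three hard stops can be combined into one small guard. The sketch below is an assumption about how such a guard might be shaped, not a reference implementation: `HardStops`, the state "fingerprint" used for no-progress detection, and the per-tick cost figures are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HardStops:
    max_iterations: int
    max_stalled: int      # consecutive ticks allowed with an unchanged fingerprint
    budget_usd: float
    iterations: int = 0
    stalled: int = 0
    spent_usd: float = 0.0
    last_fingerprint: Optional[str] = None

    def record(self, fingerprint: str, cost_usd: float) -> Optional[str]:
        """Record one tick; return a stop reason, or None to keep going."""
        self.iterations += 1
        self.spent_usd += cost_usd
        # No-progress detection: count ticks whose state fingerprint did not change.
        self.stalled = self.stalled + 1 if fingerprint == self.last_fingerprint else 0
        self.last_fingerprint = fingerprint
        if self.spent_usd >= self.budget_usd:
            return "budget"
        if self.stalled >= self.max_stalled:
            return "no_progress"
        if self.iterations >= self.max_iterations:
            return "max_iterations"
        return None


def run_until_stop(guard: HardStops, ticks: list[tuple[str, float]]) -> Optional[str]:
    """Replay (fingerprint, cost) pairs through the guard and report why it halted."""
    for fingerprint, cost in ticks:
        reason = guard.record(fingerprint, cost)
        if reason:
            return reason
    return None


stalled = run_until_stop(HardStops(10, 2, 5.0), [("a", 0.5), ("b", 0.5), ("b", 0.5), ("b", 0.5)])
over_budget = run_until_stop(HardStops(10, 5, 1.0), [("a", 0.6), ("b", 0.6)])
capped = run_until_stop(HardStops(2, 5, 100.0), [("a", 0.1), ("b", 0.1), ("c", 0.1)])
print(stalled, over_budget, capped)
```

In practice the fingerprint could be something like a hash of the working tree or the set of failing tests; that choice is left open here.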

It's not loops. It's skills.

Here is my own take, and it is where I land after a week of watching this. The loop is plumbing. The asset is the skill it calls.

Steinberger's other recurring point pairs with the loops one and is the more durable half: if you do something more than once, turn it into an automated skill, and if you do something hard, turn it into a skill afterward so next time is free. A loop with no reusable skills inside it is just a while-true around a stranger. A loop that calls a library of sharp, tested, named skills is a system that compounds. The Reddit practitioner who is actually converting said it best.

“A lot of people are rolling their eyes on Twitter, but my ears are perked up.”

r/ChatGPTCoding, June 2026
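One way to picture a library of named skills is a plain registry that a loop looks up by name. This is a hypothetical sketch, not the skills format of Claude Code or any other tool; `SKILLS`, the `skill` decorator, and the `summarize-diff` example are invented for illustration.

```python
from typing import Callable

# Registry of named skills; the loop refers to skills by name, not by prompt text.
SKILLS: dict[str, Callable[[str], str]] = {}


def skill(name: str) -> Callable[[Callable[[str], str]], Callable[[str], str]]:
    """Register a function under a stable name so any loop can call it."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        SKILLS[name] = fn
        return fn
    return register


@skill("summarize-diff")
def summarize_diff(diff: str) -> str:
    """A small, testable skill: count added and removed lines in a diff."""
    lines = diff.splitlines()
    added = sum(1 for line in lines if line.startswith("+"))
    removed = sum(1 for line in lines if line.startswith("-"))
    return f"+{added} -{removed}"


def call_skill(name: str, payload: str) -> str:
    """Look the skill up by name; an unknown name is an error, not something to improvise."""
    if name not in SKILLS:
        raise KeyError(f"no skill named {name!r}")
    return SKILLS[name](payload)


summary = call_skill("summarize-diff", "+a\n+b\n-c\n unchanged")
print(summary)
```

Each skill can be tested on its own, which is what lets a loop that calls it compound instead of re-deriving the same work every tick.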

So the answer to WTF is a loop is not a hot take about prompt engineering dying. It is this: stop being the thing in the loop. Write the loop once, give it skills worth calling and feedback so it can check itself, cap it so it halts, and let it run on cron while you go decide what to build next. Steinberger and Boris are describing the same animal from two sides. The only people who truly know are the ones who have already built one. The good news is that, as of this month, the on-ramp is a single slash command.

Key Patterns from the Research

A loop is cron plus a decision-maker in the body: the model, not a hardcoded branch, picks the next action each tick.

The lineage is real: ReAct in 2022, AutoGPT in 2023, ralph in 2025, /goal in spring 2026, orchestration loops now. Single-agent ralph is old hat; multi-agent supervision is the new layer.

The loop is only as good as its feedback. Continuous review and validation gates are what make a loop trustworthy.

The expensive resource shifted from tokens to loop management. Cap iterations, detect no-progress, set a dollar budget.

The reusable unit inside the loop is a skill, not a prompt. Loops that call sharp named skills compound; loops that re-derive everything just burn money.

All Agents Reported Back

Reddit: 17 voices (r/ClaudeAI, r/AI_Agents, r/ExperiencedDevs), 47 threads, 34k upvotes

X: 21 voices (steipete, bcherny, runes_leo), 56 posts, 175 reposts

YouTube: 4 voices (WorkOS, Lenny's Podcast, Y Combinator), talk transcripts

TikTok: 6 voices (ai.native.founder, nikpolale), 34 clips

Instagram: 4 voices (sequenzy_com, ai.builders), 14 reels

Hacker News: 12 voices, 54 stories, 1k comments

GitHub: 6 repos (gastownhall/gastown, NousResearch/hermes), steipete 259+ PRs

Top voices: steipete, bcherny, runes_leo, rohit_jsfreaky, MatthewBerman

Compiled from /last30days runs on 2026-06-07. Facets: designing loops that prompt coding agents, ai loops, coding loops.

Co-founded a self-driving oven company (acquired by Weber) and the company that became Lyft. Building again, more soon. I run loops that ship open-source PRs while I sleep, and I write them with /last30days research running in the background.
