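🪞 Uota Studies

Ralph Loop is not a loop; it is the scaffolding that takes AI coding from "help me write a function" to "help me build a product"

Most AI coding sessions die when the context window fills up. Ralph Loop sidesteps that fatal flaw by starting every iteration from a clean state and keeping state in files and git.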

2026-02-25

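Key Takeaways

The context window, not model capability, is the real bottleneck in AI coding. The model performs well for the first 20 minutes, then the context fills up, it starts forgetting instructions, and quality drops. Ralph Loop's core insight: don't let the agent do everything in one session. It exits after each completed task, and the next round starts from a clean state. State lives in files and git, not in the context window. The design is simple enough to make you think "that's it?", yet it solves the most fatal problem.

The PRD, not the code, is the soul of the whole loop. The author says the PRD is usually the hardest step, and that is exactly right. Ralph Loop is essentially an automaton that splits a PRD into tasks → executes them one by one → verifies them with tests. If the PRD is bad, the loop produces garbage no matter how long it runs. This means the most important skill in 2026 is not writing code but writing requirements.

Tests are the agent's self-check; without tests it is flying blind. Without Playwright + Vitest, the agent moves forward "without actually checking if things work". The same lesson applies to Uota's skill development: every skill should have its own verification mechanism.

STEERING.md is the human intervention interface at runtime. Editing STEERING.md while the loop runs redirects priorities, because the agent reads the file at the start of each iteration. It is a clever design: there is no need to kill and restart the process, and you can write the change without waiting for the current task to finish (it takes effect at the next iteration). Uota's HEARTBEAT.md is really the same pattern.

The split between strengths and weaknesses is honest. Prototypes, MVPs, migrations, and repetitive work are strengths; pixel-perfect design, novel architecture, and security-critical code are weaknesses. That honesty is far more useful than "AI can do everything".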

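How This Relates to Us

🪞Uota: Ralph Loop's "clean context every iteration + state in files" pattern works much like Uota's subagent spawn pattern. But Uota's subagents currently have no runtime intervention mechanism like STEERING.md, which is worth considering. Also, Ralph Loop's "git revert the failed task → tests fail → automatic retry on the next run" mechanism is more elegant than Uota's current respawn protocol.

👤ATou: If ATou wants to use Ralph Loop for some of Neta's standalone modules (internal tools or data pipelines, for example), this article is the operating manual. It can be up and running in 10 minutes, it is cheap, and the risk is contained.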

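Discussion Starters

💭 Ralph Loop's core assumption is that every task can be completed independently, but in practice many tasks depend on one another. When task B depends on the output of task A, how does the loop handle it? Does it need a task dependency graph?

💭 "The most important skill in 2026 is describing what you want clearly": this overlaps heavily with the definition of Context Engineering. Is PRD writing simply the enterprise version of prompt engineering?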
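To make the first question concrete, here is a minimal sketch of what dependency-aware task selection could look like. It is not something the article says Ralph Loop does; the `deps` mapping and the `ready_tasks` helper are hypothetical names used only for illustration.

```python
def ready_tasks(deps: dict[str, set[str]], done: set[str]) -> list[str]:
    """Return tasks that are not done yet and whose prerequisites are all done.

    deps maps each task id to the set of task ids it depends on
    (a hypothetical format, not Ralph Loop's actual task schema).
    """
    return sorted(
        task for task, needs in deps.items()
        if task not in done and needs <= done
    )
```

With `deps = {"A": set(), "B": {"A"}}`, task B only becomes selectable once A is in `done`, so a loop that picks from `ready_tasks` would never start B before A has passed.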

🔗 Original: http://x.com/i/article/2026009212296327168

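My Ralph Loop setup: keeping AI agents running reliably for long stretches

I've now shipped four separate projects with this workflow. The longest run went 37 hours straight and completed 250 tasks from a 2,000-line requirements document. I was AFK the whole time.

This guide gets you from zero to a running Ralph Loop in under 10 minutes.

If you prefer video, 👉 here's the full walkthrough.

What Happened?!

Everyone saved a bunch of Ralph Loop articles, but almost nobody actually set it up.

The concept is simple, but the gap between "cool idea" and "working workflow" is huge. You need the right prompts, the right task structure, and the right validation criteria.

Most people never got past the bookmark.

I spent weeks closing that gap: iterating on prompts, restructuring tasks, tuning the validation loop.

Everything I learned is packaged into a single install command, and this guide walks you through it step by step.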

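What You'll Need

Two things:

Docker: the loop runs in a sandboxed container

Claude Code: Anthropic's CLI for agentic coding

That's it. Everything else is installed automatically.

You can swap Claude Code for Codex, Gemini CLI, Copilot CLI, or Kiro. See full

Bootstrap Your Project

This step is not strictly required, but it saves tokens and gives you a better foundation.

Scaffold your project with whatever stack you prefer. For example:

Regardless of the setup, install Playwright and Vitest for testing. The Ralph Loop uses tests to verify its own work; without them, the agent moves forward without actually checking whether things work.

Put your API keys in a .env file first: database, payment provider, LLM keys (do not commit it; add it to .gitignore).

Prepare whatever your project needs. Without these, the agent will skip verifying integrations and you'll get nasty surprises later.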

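Install the Ralph Loop

One command:

This creates an .agent/ directory with everything the loop needs:

Create Your PRD

This is usually the hardest step, but I've got you covered.

The install command adds a skill that takes care of all of this and converts the result into tasks that scale with the loop.

Open Claude Code and use the included prd-creator skill to generate a Product Requirements Document:

Write in your own words. Be specific about UI, flows, integrations, and tech choices. The skill expands your brain dump into a structured PRD, asks clarifying questions, then breaks it into tasks.

Looking for a complete example? 👉 Check the video

Don't skip the review. Read each generated task; if the agent misunderstood something, fix it now. Correcting a task spec is far cheaper than reverting 10 bad commits.

Three tips for better requirements:

Point to exact docs. Save third-party documentation as markdown files in your project and reference them with @docs/FILE.md so the agent doesn't have to guess.

Prepare API keys early. Write them into .env before the loop starts so every integration gets tested for real.

If you're unsure, say so. Add "Interview me about the payment integration" to your requirements. The agent will ask you the right questions.

The loop runs inside a Docker sandbox. Claude Code gets full permissions in there, but it can't touch your host machine.

Answer yes to bypass permissions mode, then exit. You only need to do this once, to authenticate.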

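Run the Loop

Start small:

Watch what it does and check the commits. If it looks good, scale up:

When you're confident, let it run overnight:

That's it. Ralph picks the highest-priority task, implements it, runs tests, commits, and moves on to the next one. When all tasks pass, it stops.

Steer Mid-Run

While the loop runs, you can edit .agent/STEERING.md to redirect priorities. The agent reads this file at the start of each iteration.

Found a critical bug? Write it in STEERING.md and the agent will handle it before continuing with the task list.

Review the Output

Ralph leaves a complete trail:

.agent/logs/LOG.md: chronological log of completed work

.agent/history/: full output from each iteration

git log: every completed task is a commit

If something went wrong, git revert the bad commit. The task's tests will fail, and Ralph will re-attempt it on the next run.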

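Why the Loop Works

Most AI coding sessions die when the context window fills up. The model starts forgetting earlier instructions, output quality drops, and you've hit the "dumb zone."

Ralph avoids this entirely. Each iteration starts with a fresh context: the AI reads the task list, picks the next task, implements it, verifies it, commits once it passes, and exits. The next iteration starts clean again.

State lives in text files and git commits, not trapped in the context window.

This is how you scale AI coding from "help me write a function" to "build me an app."

Where Ralph Shines

Prototyping and MVPs. Idea to working app, fast

Automated testing. Writing E2E and unit tests that would take you hours

Migrations. Moving an entire codebase to a new framework version

Repetitive tasks. Bulk refactoring, boilerplate, file restructuring

Where It Struggles

Pixel-perfect design. Nuanced UX and interaction flows

Novel architecture. Truly unique systems with no reusable patterns to follow

Security-critical code. Where edge cases absolutely cannot go wrong

The Important Bit

Your role changes. You stop being the one who writes every line and become the one who plans, delegates, and reviews.

That means the most important skill in 2026 isn't typing code faster. It's describing what you want clearly: UI specs, flows, constraints, integrations. This is the key.

Ralph is just a loop. The real work is in the setup around it and in getting the requirements and passing criteria right. Figure those out, and the loop handles the rest.

That's exactly why I spent weeks polishing this.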

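Links

📺 Video with step-by-step instructions and examples

In-depth article with all details and pro tips

@pageai/ralph-loop on GitHub: scripts, prompts, and skills

Resources

Docker Sandboxes: isolated execution environment

Anthropic's research on long-running agents: the research behind the task format

skills.sh: additional agent skills

Link: http://x.com/i/article/2026009212296327168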

My Ralph Loop setup for long running AI Agents

Source: https://x.com/d4m1n/status/2026032801322356903?s=46
Mirror: https://x.com/d4m1n/status/2026032801322356903?s=46
Published: 2026-02-23T20:33:56+00:00
Saved: 2026-02-25

Content

I've now used this workflow to ship four separate projects. The longest run went 37 hours straight and completed 250 tasks from a 2,000-line requirements document, all while I was AFK.

This guide gets you from zero to a running Ralph Loop in under 10 minutes.

If video is your jam, 👉 here's the full walkthrough.

What Happened?!

Everyone saved a bunch of Ralph Loop articles. Almost nobody actually set it up.

The concept is simple, but the gap between "cool idea" and "working workflow" is huge. You need the right prompts, the right task structure, the right validation criteria.

Most people never got past the bookmark.

I spent weeks closing that gap. Iterating on prompts, restructuring tasks, tuning the validation loop.

Everything I learned is packaged into a single install command. This guide walks you through it.

What You'll Need

Two things:

Docker — the loop runs in a sandboxed container

Claude Code — Anthropic's CLI for agentic coding

That's it. Everything else is installed automatically.

You can swap Claude Code for Codex, Gemini CLI, Copilot CLI, or Kiro. See full

Bootstrap Your Project

You don't strictly need this step, but it saves tokens and gives you a better foundation.

Scaffold your project with whatever stack you prefer. For example:

Regardless of the setup, install Playwright and Vitest for testing. The Ralph Loop uses tests to verify its own work. Without them, the agent moves forward without actually checking if things work.

Prepare your API keys in a .env file. Database, payment provider, LLM keys (DO NOT COMMIT / add to .gitignore).

Whatever your project needs. Without these, the agent will skip verifying integrations and you'll have nasty surprises later.
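Since the keys live in a file that must never be committed, a quick pre-flight check before starting the loop can help. The sketch below is not part of Ralph Loop; `env_file_is_ignored` is a hypothetical helper that only approximates gitignore matching (plain names and simple globs, no negation or nested .gitignore files).

```python
from fnmatch import fnmatch
from pathlib import Path


def env_file_is_ignored(project_dir: str, env_name: str = ".env") -> bool:
    """Return True if a pattern in project_dir/.gitignore appears to cover env_name.

    Naive approximation of gitignore rules: comments and negated patterns
    are skipped, and a leading slash is stripped before matching.
    """
    gitignore = Path(project_dir) / ".gitignore"
    if not gitignore.exists():
        return False
    for raw in gitignore.read_text(encoding="utf-8").splitlines():
        pattern = raw.strip()
        if not pattern or pattern.startswith("#") or pattern.startswith("!"):
            continue
        if fnmatch(env_name, pattern.lstrip("/")):
            return True
    return False
```

Inside a real repository, `git check-ignore -q .env` is the authoritative version of this check, since it applies git's own matching rules and exits with status 0 when the file is ignored.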

Install the Ralph Loop

One command:

This creates an .agent/ directory with everything the loop needs:

Create Your PRD

This usually is the most difficult part, but I got you.

The install command added a skill that takes care of all of this and converts the result into tasks that scale with the loop.

Open Claude Code and use the included prd-creator skill to generate a Product Requirements Document:

Write in your own words. Be specific about UI, flows, integrations, and tech choices. The skill expands your brain dump into a structured PRD, asks clarifying questions, then breaks it into tasks.

Looking for a complete example? 👉 Check the video

Don't skip the review. Read each generated task. If the agent misunderstood something, fix it now. It's much cheaper to correct a task spec than to revert 10 bad commits.

Three tips for better requirements:

Point to exact docs. Save third-party documentation as markdown files in your project and reference them with @docs/FILE.md. This prevents the agent from guessing.

Prepare API keys early. Write them into .env before the loop starts so every integration gets tested for real.

If you're unsure, say so. Add "Interview me about the payment integration" to your requirements. The agent will ask you the right questions.
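The first tip only helps if the referenced files actually exist. As a rough sketch (not a Ralph Loop feature), the hypothetical helper below scans a requirements text for `@docs/FILE.md` style references and lists the ones that are missing from the project, so you can catch a dangling reference before the agent starts guessing.

```python
import re
from pathlib import Path

# Matches references such as @docs/stripe.md or @docs/api/auth.md
DOC_REF = re.compile(r"@(docs/[\w./-]+\.md)")


def missing_doc_refs(requirements: str, project_dir: str) -> list[str]:
    """Return referenced docs/*.md paths that do not exist under project_dir."""
    root = Path(project_dir)
    refs = sorted(set(DOC_REF.findall(requirements)))
    return [ref for ref in refs if not (root / ref).is_file()]
```

An empty list means every reference resolves to a file on disk.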

The loop runs inside a Docker sandbox. Claude Code gets full permissions in there, but it can't touch your host machine.

Answer yes to bypass permissions mode. Exit. You only need to do this once to authenticate.

Run the Loop

Start small:

Watch what it does. Check the commits. If it looks good, scale up:

When you're confident, let it run overnight:

That's it. Ralph picks the highest priority task, implements it, runs tests, commits, and moves on. When all tasks pass, it stops.
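The control flow described above can be sketched in a few lines. This is an illustration under stated assumptions, not the actual Ralph Loop script: `Task`, `pick_next`, and `run_loop` are hypothetical names, a lower `priority` number is assumed to mean "more urgent", and `run_iteration` stands in for one fresh agent session that implements the task, runs the tests, and commits.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Task:
    id: str
    priority: int         # assumption: lower number = higher priority
    passes: bool = False  # set to True once the task's tests pass


def pick_next(tasks: list[Task]) -> Optional[Task]:
    """Return the highest-priority task that has not passed yet, if any."""
    pending = [t for t in tasks if not t.passes]
    return min(pending, key=lambda t: t.priority) if pending else None


def run_loop(tasks: list[Task],
             run_iteration: Callable[[Task], bool],
             max_iterations: int) -> int:
    """Run until every task passes or the budget is spent; return iterations used."""
    used = 0
    while used < max_iterations:
        task = pick_next(tasks)
        if task is None:  # all tasks pass, so the loop stops
            break
        used += 1
        # A failed iteration leaves passes=False, so the task is retried next time.
        task.passes = run_iteration(task)
    return used
```

The iteration cap mirrors the "start small, then scale up" advice: a small budget lets you inspect the first few commits before committing to an overnight run.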

Steer Mid-Run

While the loop runs, you can edit .agent/STEERING.md to redirect priorities. The agent reads this file at the start of each iteration.

Found a critical bug? Write it in STEERING.md and the agent will handle it before continuing with the task list.
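Steering works because the file is re-read on every iteration rather than once at startup. A minimal sketch of that idea, assuming the loop builds a prompt per iteration: `build_iteration_prompt` and the header text are hypothetical, and only the STEERING.md file name comes from the guide.

```python
from pathlib import Path


def build_iteration_prompt(agent_dir: str, base_prompt: str) -> str:
    """Prepend the current contents of STEERING.md (if any) to the base prompt.

    Called at the start of every iteration, so edits made while the loop
    is running are picked up by the next iteration.
    """
    steering = Path(agent_dir) / "STEERING.md"
    notes = steering.read_text(encoding="utf-8").strip() if steering.exists() else ""
    if not notes:
        return base_prompt
    return f"Human steering notes (handle these first):\n{notes}\n\n{base_prompt}"
```

Clearing the file again returns the loop to its normal task list on the following iteration.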

Review the Output

Ralph leaves a trail:

.agent/logs/LOG.md — chronological log of completed work

.agent/history/ — full output from each iteration

git log — every completed task is a commit

If something went wrong, git revert the bad commit. The task's tests will fail, and Ralph will re-attempt it on the next run.
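The revert-and-retry behaviour follows naturally if "done" is derived from the tests rather than from a flag the agent set earlier. The sketch below illustrates that idea only; whether Ralph Loop tracks status exactly this way is an assumption, and `pending_tasks` and `task_checks` are hypothetical names.

```python
from typing import Callable


def pending_tasks(task_checks: dict[str, Callable[[], bool]]) -> list[str]:
    """Return ids of tasks whose check (for example, a test run) fails right now."""
    return [task_id for task_id, check in task_checks.items() if not check()]
```

If each check runs that task's tests against the current working tree, reverting a commit makes the corresponding check fail again, so the task reappears in the pending list without anyone editing the task file by hand.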

Why The Loop Works

Most AI coding sessions die when the context window fills up. The model starts forgetting earlier instructions. Output quality drops. You've hit the "dumb zone."

Ralph avoids this entirely. Each iteration starts with a fresh context. The AI reads a task list, picks the next task, implements it, verifies it, commits, and exits. The next iteration starts clean.

State lives in text files and git commits. Not in the context window.
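Here is a small sketch of what "state in files, not in the context window" means in practice. Each call to `one_iteration` knows nothing except what it reads from disk, which is what lets a fresh session continue where the previous one stopped. The `tasks.json` layout with `id` and `passes` fields is a hypothetical format for illustration, and flipping `passes` stands in for the real implement, test, and commit steps.

```python
import json
from pathlib import Path
from typing import Optional


def load_tasks(path: str) -> list[dict]:
    return json.loads(Path(path).read_text(encoding="utf-8"))


def save_tasks(path: str, tasks: list[dict]) -> None:
    Path(path).write_text(json.dumps(tasks, indent=2), encoding="utf-8")


def one_iteration(tasks_path: str) -> Optional[str]:
    """Simulate one fresh session: read state, do one task, write state, exit."""
    tasks = load_tasks(tasks_path)
    pending = [t for t in tasks if not t["passes"]]
    if not pending:
        return None            # nothing left to do
    task = pending[0]
    task["passes"] = True      # stand-in for implement + test + commit
    save_tasks(tasks_path, tasks)
    return task["id"]
```

Because nothing is carried over in memory between calls, the iterations could just as well run in separate processes or separate agent sessions.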

This is how you scale AI coding from "help me write a function" to "build me an app."

Where Ralph Shines

Prototyping and MVPs. Idea to working app, fast

Automated testing. Writing E2E and unit tests that would take you hours

Migrations. Moving an entire codebase to a new framework version

Repetitive tasks. Bulk refactoring, boilerplate, file restructuring

Where It Struggles

Pixel-perfect design. Nuanced UX and interaction flows

Novel architecture. Truly unique systems with no patterns to follow

Security-critical code. Where edge cases absolutely cannot exist

The Important Bit

Your role changes. You stop being the one who writes every line and start being the one who plans, delegates, and reviews.

That means the most important skill in 2026 isn't typing code faster. It's describing what you want clearly: UI specs, flows, constraints, integrations. This is the key.

Ralph is just a loop. The real work is in the setup around it and getting requirements + passing criteria right. Figure those out, and the loop handles the rest.

That's exactly why I spent weeks putting this together.

Links

📺 Video with step-by-step instructions and examples

In-depth article with all details and pro tips

@pageai/ralph-loop on GitHub: scripts, prompts, and skills

Resources

Docker Sandboxes: isolated execution environment

Anthropic's research on long-running agents: the research behind the task format

skills.sh: additional agent skills

Link: http://x.com/i/article/2026009212296327168
