Codex Best Practices: From Tool to Teammate Through Systematic Operation

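OpenAI is promoting a long-term operating model for an "AI teammate". At its core, the model turns scattered prompts into structured configuration (AGENTS.md, skills, automations), but the cost, risk, and general applicability of the approach are deliberately played down.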

2026-03-12

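Key takeaways

A paradigm shift from one-off conversations to long-lived configuration. The document's real claim is not "Codex is smart" but "mechanisms such as AGENTS.md, config.toml, MCP, and skills turn AI into a teammate you can keep improving". That raises the cost of use from "write one good prompt" to "maintain an AI engineering system", which sharply raises the barrier for small teams and individual developers.

The implicit assumption that "quality problems are usually system problems". The text repeatedly implies that when the AI gets something wrong, the cause is not the model but a poorly configured environment (missing context, wrong permissions, an incomplete workflow). The argument is partly persuasive, but it obscures a fact: even with perfect configuration, LLMs still hit a ceiling on complex reasoning and edge cases, and engineering alone cannot fully remove it.

"Closed-loop validation" is framed as a core responsibility of the AI. The document stresses having Codex write tests, run lint, and review diffs, implying that the AI can gate its own work. Final responsibility for quality still rests entirely with humans, and this wording can lead decision makers to conclude that review can safely be handed to the AI, which is dangerous in high-risk domains.

The complexity cost of multi-agent work and automation is badly underestimated. The text offers a simple order, "make it a skill first, then automate", but never discusses how to monitor the quality of automated results, how to avoid conflicts between parallel threads, or who is accountable when an automation fails. It extrapolates the local fact that automation saves time into the general claim that anything stable is fit for automation.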

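The document's model references deserve verification. It links to "GPT-5.4" and to a "GPT-5 Codex prompting guide". GPT-5 itself was publicly released in August 2025, so the prompting-guide link is plausible, but readers should check the "GPT-5.4" name against OpenAI's current model list before relying on it.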

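How this relates to us

What it means for ATou. If you are building an AI-native engineering team, the configuration-driven approach is worth borrowing, especially the layered AGENTS.md design (global, repository, directory), which maps directly onto a management hierarchy of company culture, team norms, and project conventions. Next step: trial AGENTS.md on one small project, observe whether it actually lowers the AI's error rate, and only then decide whether to roll it out broadly.

What it means for Uota. The document reveals OpenAI's product strategy: rather than selling a smarter model, it is building ecosystem lock-in around Codex. Skills and automations are set up for Codex specifically, although the document itself describes MCP as an open standard and AGENTS.md as an open format. If you are evaluating AI coding tools, what matters is not how impressive the demo looks but whether the tool fits into your existing build/test/review flow and whether it supports packaging work into reusable units.

What it means for Neta. The methodology transfers to any agent system design. The four-part task description (Goal, Context, Constraints, Done when) and the maturity ladder (ad hoc prompt → AGENTS.md → skill → MCP → automation) hold across products and vendors. Next step: use this framework to assess the AI tools you already use and check whether they have reached the "skill" or "automation" stage.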

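Discussion starters

1. The real ledger of costs and benefits. The document encourages Extra High reasoning, parallel multi-agent runs, and background automations, but under per-token LLM pricing each of these multiplies token usage and therefore cost. Has any team actually measured the ROI of a full AGENTS.md + skills + automations setup against plain prompting? Or does the methodology only pay off for large companies that are insensitive to cost?

2. Where is the risk floor for "self-review"? The text stresses having Codex write tests and run reviews, but in finance, healthcare, infrastructure, and other high-risk domains, can automatically generated tests and reviews really replace human ones? Or does the methodology only suit work where mistakes are cheap, such as internal tools and code off the critical path?

3. The "documentation debt" of maintaining an AI instruction set. The combination of AGENTS.md, PLANS.md, skills, and code_review.md amounts to an employee handbook for the AI. How high are the costs of maintaining it, versioning it, and coordinating changes across a team? Could it become another form of documentation debt that is written once and then left unmaintained?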

Best practices

Getting started with Codex and proven practices for better results

If you’re new to Codex or coding agents in general, this guide will help you get better results faster. It covers the core habits that make Codex more effective across the CLI, IDE extension, and the Codex app, from prompting and planning to validation, MCP, skills, and automations.

Codex works best when you treat it less like a one-off assistant and more like a teammate you configure and improve over time.

A useful way to think about this: start with the right task context, use AGENTS.md for durable guidance, configure Codex to match your workflow, connect external systems with MCP, turn repeated work into skills, and automate stable workflows.

Strong first use: Context and prompts

Codex is already strong enough to be useful even when your prompt isn’t perfect. You can often hand it a hard problem with minimal setup and still get a strong result. Clear prompting isn’t required to get value, but it does make results more reliable, especially in larger codebases or higher-stakes tasks.

If you work in a large or complex repository, the biggest unlock is giving Codex the right task context and a clear structure for what you want done.

A good default is to include four things in your prompt:

Goal: What are you trying to change or build?
Context: Which files, folders, docs, examples, or errors matter for this task? You can @ mention certain files as context.
Constraints: What standards, architecture, safety requirements, or conventions should Codex follow?
Done when: What should be true before the task is complete, such as tests passing, behavior changing, or a bug no longer reproducing?

This helps Codex stay scoped, make fewer assumptions, and produce work that’s easier to review.
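The four parts above can be kept in a small template so that no part is forgotten. The following is a minimal sketch, not part of Codex: the `TaskPrompt` class, its field names, and the example file paths are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TaskPrompt:
    """The four parts the guide suggests including in a prompt."""

    goal: str
    context: list[str] = field(default_factory=list)      # files, docs, errors
    constraints: list[str] = field(default_factory=list)  # standards, conventions
    done_when: list[str] = field(default_factory=list)    # conditions that must hold

    def render(self) -> str:
        """Render the prompt as plain text, skipping empty sections."""
        sections = [
            ("Goal", [self.goal]),
            ("Context", self.context),
            ("Constraints", self.constraints),
            ("Done when", self.done_when),
        ]
        lines = []
        for title, items in sections:
            if not items:
                continue
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


# Hypothetical task: the file names below are illustrative only.
prompt = TaskPrompt(
    goal="Fix the off-by-one error in pagination",
    context=["@src/pagination.py", "tests/test_pagination.py fails on the last page"],
    constraints=["Keep the public API unchanged"],
    done_when=["The pagination tests pass", "The bug no longer reproduces"],
)
print(prompt.render())
```

Writing the "Done when" items as checkable statements is the main point of the structure: it gives both Codex and the reviewer the same finish line.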

Choose a reasoning level based on how hard the task is and test what works best for your workflow. Different users and tasks work best with different settings.

Low for faster, well-scoped tasks
Medium or High for more complex changes or debugging
Extra High for long, agentic, reasoning-heavy tasks

To provide context faster, try using speech dictation inside the Codex app to dictate what you want Codex to do rather than typing it.

Plan first for difficult tasks

If the task is complex, ambiguous, or hard to describe well, ask Codex to plan before it starts coding.

A few approaches work well:

Use Plan mode: For most users, this is the easiest and most effective option. Plan mode lets Codex gather context, ask clarifying questions, and build a stronger plan before implementation. Toggle with /plan or Shift+Tab.

Ask Codex to interview you: If you have a rough idea of what you want but aren’t sure how to describe it well, ask Codex to question you first. Tell it to challenge your assumptions and turn the fuzzy idea into something concrete before writing code.

Use a PLANS.md template: For more advanced workflows, you can configure Codex to follow a PLANS.md or execution-plan template for longer-running or multi-step work. For more detail, see the execution plans guide.

Make guidance reusable with AGENTS.md

Once a prompting pattern works, the next step is to stop repeating it manually. That’s where AGENTS.md comes in.

Think of AGENTS.md as an open-format README for agents. It loads into context automatically and is the best place to encode how you and your team want Codex to work in a repository.

A good AGENTS.md covers:

Repo layout and important directories
How to run the project
Build, test, and lint commands
Engineering conventions and PR expectations
Constraints and do-not rules
What done means and how to verify work

The /init slash command in the CLI is the quick-start command to scaffold a starter AGENTS.md in the current directory. It’s a great starting point, but you should edit the result to match how your team actually builds, tests, reviews, and ships code.

You can create AGENTS.md files at different levels: a global AGENTS.md for personal defaults that sits in ~/.codex, a repo-level file for shared standards, and more specific files in subdirectories for local rules. If there’s a more specific file closer to your current directory, that guidance wins.
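To make the layering concrete, here is a minimal sketch that lists which AGENTS.md files would apply to a working directory, ordered from most general to most specific. It illustrates the "closer file wins" idea only; it is not Codex's actual loading logic, and the function name `find_agents_files` and the demo directory names are assumptions.

```python
import tempfile
from pathlib import Path


def find_agents_files(cwd: Path, repo_root: Path, global_dir: Path) -> list[Path]:
    """Return existing AGENTS.md files, most general first, most specific last."""
    cwd, repo_root = cwd.resolve(), repo_root.resolve()
    candidates = [global_dir / "AGENTS.md", repo_root / "AGENTS.md"]
    # Walk from the repo root down to the current directory.
    # relative_to raises ValueError if cwd is outside the repo.
    current = repo_root
    for part in cwd.relative_to(repo_root).parts:
        current = current / part
        candidates.append(current / "AGENTS.md")
    return [path for path in candidates if path.is_file()]


# Demo in a throwaway directory tree (all names are illustrative).
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    home, repo = base / "home" / ".codex", base / "repo"
    sub = repo / "services" / "api"
    for directory in (home, sub):
        directory.mkdir(parents=True)
    (home / "AGENTS.md").write_text("personal defaults\n")
    (repo / "AGENTS.md").write_text("shared standards\n")
    (sub / "AGENTS.md").write_text("local rules\n")
    labels = [p.read_text().strip() for p in find_agents_files(sub, repo, home)]
    print(labels)
```

The last entry in the list is the file closest to the working directory, which is the one whose guidance takes priority when rules conflict.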

Keep it practical. A short, accurate AGENTS.md is more useful than a long file full of vague rules. Start with the basics, then add new rules only after you notice repeated mistakes.

If AGENTS.md starts getting too large, keep the main file concise and reference task-specific markdown files for things like planning, code review, or architecture.

When Codex makes the same mistake twice, ask it for a retrospective and update AGENTS.md. That keeps the guidance practical and grounded in real friction.

Configure Codex for consistency

Configuration is one of the main ways to make Codex behave more consistently across sessions and surfaces. For example, you can set defaults for model choice, reasoning effort, sandbox mode, approval policy, profiles, and MCP setup.

A good starting pattern is:

Keep personal defaults in ~/.codex/config.toml (Settings → Configuration → Open config.toml from the Codex app)
Keep repo-specific behavior in .codex/config.toml
Use command-line overrides only for one-off situations (if you use the CLI)

config.toml is where you define durable preferences such as MCP servers, profiles, multi-agent setup, and experimental features. You can edit it directly or ask Codex to update it for you.
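The pattern of personal defaults, repo-specific settings, and one-off overrides can be pictured as a layered merge. The sketch below assumes that later layers override earlier ones (personal, then repo, then command line); that precedence is an assumption of this example. The keys (`reasoning`, `sandbox`, `mcp`) are placeholders, not real config.toml key names.

```python
def merge_layers(*layers: dict) -> dict:
    """Merge config layers left to right; later layers override earlier ones."""
    merged: dict = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Merge nested tables instead of replacing them wholesale.
                merged[key] = merge_layers(merged[key], value)
            else:
                merged[key] = value
    return merged


# Placeholder settings; real key names live in the Codex configuration docs.
personal = {"reasoning": "medium", "sandbox": "read-only",
            "mcp": {"docs": {"enabled": True}}}
repo = {"reasoning": "high", "mcp": {"tracker": {"enabled": True}}}
cli_override = {"sandbox": "workspace-write"}

effective = merge_layers(personal, repo, cli_override)
print(effective)
```

Keeping one-off changes in the last layer means they disappear with the command that set them, while the two files carry the durable preferences.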

Codex ships with operating-system-level sandboxing and has two key knobs that you can control. Approval mode determines when Codex asks for your permission to run a command, and sandbox mode determines whether Codex can read or write in the directory and which files the agent can access.

If you’re new to coding agents, start with the default permissions. Keep approval and sandboxing tight by default, then loosen permissions only for trusted repos or specific workflows once the need is clear.

Note that the CLI, IDE, and Codex app all share the same configuration layers. Learn more on the sample configuration page.

Configure Codex for your real environment early. Many quality issues are really setup issues, like the wrong working directory, missing write access, wrong model defaults, or missing tools and connectors.

Improve reliability with testing and review

Don’t stop at asking Codex to make a change. Ask it to create tests when needed, run the relevant checks, confirm the result, and review the work before you accept it.

Codex can do this loop for you, but only if it knows what “good” looks like. That guidance can come from either the prompt or AGENTS.md.

That can include:

Writing or updating tests for the change
Running the right test suites
Checking lint, formatting, or type checks
Confirming the final behavior matches the request
Reviewing the diff for bugs, regressions, or risky patterns
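The checks in this list are easier to hand to an agent when "good" is a single command with a clear pass or fail result. The following is a minimal sketch of such a runner. The `run_checks` helper is hypothetical, and the two commands are stand-ins that run anywhere Python is installed; replace them with your project's real test and lint commands.

```python
import subprocess
import sys


def run_checks(checks: dict[str, list[str]]) -> dict[str, bool]:
    """Run each named command and record whether it exited with status 0."""
    results = {}
    for name, command in checks.items():
        completed = subprocess.run(command, capture_output=True, text=True)
        results[name] = completed.returncode == 0
    return results


# Stand-in commands: the first succeeds, the second fails on purpose.
checks = {
    "tests": [sys.executable, "-c", "assert 1 + 1 == 2"],
    "lint": [sys.executable, "-c", "import sys; sys.exit(1)"],
}
results = run_checks(checks)
for name, ok in results.items():
    print(f"{name}: {'pass' if ok else 'FAIL'}")

# The change counts as done only when every check passes.
done = all(results.values())
```

A script like this could be named in the prompt or in AGENTS.md as the definition of done, so the agent and the reviewer run exactly the same checks.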

Toggle the diff panel in the Codex app to directly review changes locally. Click on a specific row to provide feedback that gets fed as context to the next Codex turn.

A useful option here is the slash command /review, which gives you a few ways to review code:

Review against a base branch for PR-style review
Review uncommitted changes
Review a commit
Use custom review instructions

If you and your team have a code_review.md file and reference it from AGENTS.md, Codex can follow that guidance during review as well. This is a strong pattern for teams that want review behavior to stay consistent across repositories and contributors.

Codex shouldn’t just generate code. With the right instructions, it can also help test it, check it, and review it.

If you use GitHub Cloud, you can set up Codex to run code reviews for your PRs. At OpenAI, Codex reviews 100% of PRs. You can enable automatic reviews or have Codex reactively review when you @Codex.

Use MCPs for external context

Use MCP servers when the context Codex needs lives outside the repo. They let Codex connect to the tools and systems you already use, so you don’t have to keep copying and pasting live information into prompts.

Model Context Protocol, or MCP, is an open standard for connecting Codex to external tools and systems.

Use MCP when:

The needed context lives outside the repo
The data changes frequently
You want Codex to use a tool rather than rely on pasted instructions
You need a repeatable integration across users or projects

Codex supports both STDIO and Streamable HTTP servers with OAuth.

In the Codex App, head to Settings → MCP servers to see custom and recommended servers. Often, Codex can help you install the needed servers. All you need to do is ask. You can also use the codex mcp add command in the CLI to add your custom servers with a name, URL, and other details.

Add tools only when they unlock a real workflow. Do not start by wiring in every tool you use. Start with one or two tools that clearly remove a manual loop you already do often, then expand from there.

Turn repeatable work into skills

Once a workflow becomes repeatable, stop relying on long prompts or repeated back-and-forth. Use a skill to package the instructions (in a SKILL.md file), the context, and the supporting logic that Codex should apply consistently. Skills work across the CLI, IDE extension, and Codex app.

Keep each skill scoped to one job. Start with 2 to 3 concrete use cases, define clear inputs and outputs, and write the description so it says what the skill does and when to use it. Include the kinds of trigger phrases a user would actually say.

Don’t try to cover every edge case up front. Start with one representative task, get it working well, then turn that workflow into a skill and improve from there. Include scripts or extra assets only when they improve reliability.

A good rule of thumb: if you keep reusing the same prompt or correcting the same workflow, it should probably become a skill.

Skills are especially useful for recurring jobs like:

Log triage
Release note drafting
PR review against a checklist
Migration planning
Telemetry or incident summaries
Standard debugging flows

The $skill-creator skill is the best place to start: use it to scaffold the first version of a skill, then use the $skill-installer skill to install it locally. One of the most important parts of a skill is the description. It should say what the skill does and when to use it.

Personal skills are stored in $HOME/.agents/skills, and shared team skills can be checked into .agents/skills inside a repository. This is especially helpful for onboarding new teammates.
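To see which skills are available from the two locations named above, a short script can scan both directories. This is a sketch under stated assumptions: it assumes each skill is a directory that directly contains a SKILL.md file, and it lets the team copy shadow a personal skill of the same name. That precedence is a choice made for this example, not documented Codex behavior. The `list_skills` function and the skill names are hypothetical.

```python
import tempfile
from pathlib import Path


def list_skills(*roots: Path) -> dict[str, Path]:
    """Map skill names to SKILL.md paths; later roots shadow earlier ones."""
    skills: dict[str, Path] = {}
    for root in roots:
        if not root.is_dir():
            continue
        for skill_file in sorted(root.glob("*/SKILL.md")):
            skills[skill_file.parent.name] = skill_file
    return skills


# Demo with throwaway directories standing in for $HOME and a repository.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    personal = base / "home" / ".agents" / "skills"
    team = base / "repo" / ".agents" / "skills"
    layout = ((personal, ["log-triage"]), (team, ["log-triage", "release-notes"]))
    for root, names in layout:
        for name in names:
            (root / name).mkdir(parents=True)
            (root / name / "SKILL.md").write_text(f"# {name}\n")
    skills = list_skills(personal, team)
    skill_names = sorted(skills)
    log_triage_is_team = skills["log-triage"].is_relative_to(team)
    print(skill_names)
```

In real use the roots would be `Path.home() / ".agents" / "skills"` and the repository's `.agents/skills` directory.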

Use automations for repeated work

Once a workflow is stable, you can schedule Codex to run it in the background for you. In the Codex app, automations let you choose the project, prompt, cadence, and execution environment for a recurring task.

Once a task becomes repetitive for you, you can create an automation in the Automations tab of the Codex app. You can choose which project it runs in, the prompt it runs (which can invoke skills), and the cadence at which it runs. You can also choose whether the automation runs in a dedicated git worktree or in your local environment. Learn more about git worktrees.

Good candidates include:

Summarizing recent commits
Scanning for likely bugs
Drafting release notes
Checking CI failures
Producing standup summaries
Running repeatable analysis workflows on a schedule

A useful rule is that skills define the method, automations define the schedule. If a workflow still needs a lot of steering, turn it into a skill first. Once it’s predictable, automation becomes a force multiplier.
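The split between method and schedule can be sketched as two small data types. This is an illustration of the rule only, not Codex's data model: the `Skill` and `Automation` classes, their fields, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Skill:
    """The method: what to do and when it applies."""

    name: str
    description: str   # what the skill does and when to use it
    instructions: str  # the steps to follow


@dataclass(frozen=True)
class Automation:
    """The schedule: which skill runs, where, and how often."""

    skill: Skill
    project: str
    every: timedelta

    def next_run(self, last_run: datetime) -> datetime:
        """Return the time of the next scheduled run."""
        return last_run + self.every


release_notes = Skill(
    name="release-notes",
    description="Drafts release notes from merged changes; use before a release.",
    instructions="Summarize merged changes since the last tag, grouped by area.",
)
weekly = Automation(skill=release_notes, project="example-repo", every=timedelta(days=7))
print(weekly.next_run(datetime(2026, 3, 9, 9, 0)))
```

Because the automation only refers to the skill, the method can keep improving without touching the schedule, and the same skill can still be invoked by hand.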

Use automations for reflection and maintenance, not just execution. Review recent sessions, summarize repeated friction, and improve prompts, instructions, or workflow setup over time.

Organize long-running work with session controls

Codex sessions aren’t just chat history. They’re working threads that accumulate context, decisions, and actions over time, so managing them well has a big impact on quality.

The Codex app UI makes thread management easiest because you can pin threads and create worktrees. If you are using the CLI, these slash commands are especially useful:

/experimental to toggle experimental features and add to your config.toml
/resume to resume a saved conversation
/fork to create a new thread while preserving the original transcript
/compact when the thread is getting long and you want a summarized version of earlier context. Note that Codex does automatically compact conversations for you
/agent when you are running parallel agents and want to switch the active agent thread
/theme to choose a syntax highlighting theme
/apps to use ChatGPT apps directly in Codex
/status to inspect the current session state

Keep one thread per coherent unit of work. If the work is still part of the same problem, staying in the same thread is often better because it preserves the reasoning trail. Fork only when the work truly branches.

Use Codex’s multi-agent workflows to offload bounded work from the main thread. Keep the main agent focused on the core problem, and use subagents for tasks like exploration, tests, or triage.

Common mistakes

A few common mistakes to avoid when first using Codex:

Overloading the prompt with durable rules instead of moving them into AGENTS.md or a skill
Leaving out the details of how to run build and test commands, so the agent cannot see or verify the results of its own work
Skipping planning on multi-step and complex tasks
Giving Codex full permission to your computer before you understand the workflow
Running multiple live threads on the same files without using git worktrees
Turning a recurring task into an automation before it’s reliable manually
Treating Codex like something you have to watch step by step instead of using it in parallel with your own work
Using one thread per project instead of one thread per task. This leads to bloated context and worse results over time
