
Lessons from Designing and Deploying Skills in Claude Code

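Anthropic has turned Skills from "prompt files" into miniature workflow engines built from "folders + scripts + verification + memory", defining a new paradigm for building agent capabilities. The cost, risks, and limits of applicability of this methodology, however, are seriously underestimated.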

2026-03-18

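Key takeaways

  • The real value of Skills is "progressive context", not knowledge transfer. The highest-signal content is not teaching the AI how to write code but recording the edge cases where it repeatedly trips (Gotchas). A folder structure lets the agent read files on demand (such as `references/api.md` or `assets/templates/`) instead of being fed everything at once, which addresses the core large-model problems of context overload and diluted attention.
  • Verification Skills are the real bottleneck in putting agents into production. The article stresses that "it can be worth having an engineer spend a week just making your verification skills excellent", which points straight at the key pain point of AI coding: not generating code, but confirming that the code actually works. Turning acceptance testing into a product with Playwright, tmux, and programmatic assertions is where agent ROI pulls ahead.
  • The Description field is a "routing rule", not a product summary. Traditional software documentation is written for people, but in the agent era a Skill's Description is written for the model's intent recognition: it defines "when should I be invoked". This changes how plugin marketplaces and internal tools have to be written, yet the article says too little about the impact of this shift.
  • On-demand dynamic guardrails (Hooks) offer a way to be both safe and smart. Global safety restrictions cripple the AI, whereas on-demand triggers (such as `/careful`, which blocks high-risk commands only when you are touching production) can balance risk control against efficiency. This is a useful reference for the governance design of agent products.
  • The "organic curation" of the internal plugin marketplace has a gap in its management logic. The article both praises natural emergence with no "centralized team that decides" and warns readers to have "some method of curation before release". Without centralized review, who is responsible for a public Skill that contains destructive scripts? This contradiction is left unresolved.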

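Relevance to us

  • What this means for ATou: This Skills framework is essentially turning the team's tacit knowledge (pitfall guides, acceptance criteria, workflows) into an API. It matters especially for growth teams: packaging the "diagnose the funnel → locate the break → verify the change" process as Skills (such as funnel-query and cohort-compare) can significantly reduce collaboration losses across time zones and markets. A next step is to turn your most frequent repetitive work (such as data analysis or A/B test verification) into one or two verification Skills and measure the hours actually saved.
  • What this means for Neta: The article is essentially Anthropic product marketing. While sharing internal experience, it repeatedly plugs the company's own products: Claude Code, Skill Creator, and the plugin marketplace. But the design ideas behind "accumulating Gotchas" and "progressive disclosure" are genuinely transferable: whatever agent framework you use, structuring failure experience and layering information are central to agent reliability. A next step is to ask which knowledge in your agent system should be distilled into a reusable pitfall guide.
  • What this means for Uota: The article exposes the real cost of building agent capabilities: not writing one Skill, but continuously maintaining Gotchas, updating scripts, managing dependencies, and reviewing marketplace content. With hundreds of complex Skills in play, token consumption, time-to-first-byte latency, and performance degradation from context overload become fatal systemic risks, and the article sidesteps them. The next step is to set up "Skill lifecycle management" and a "cost-benefit assessment" mechanism rather than piling up Skills blindly.
  • General takeaway: The maturity of an agent product lies not in the number of capabilities but in the clarity of its trigger and verification mechanisms. A good Skill is not feature-complete; it has a clear triggering scenario, predictable failure modes, and acceptance criteria that can be checked programmatically. This applies to any organization trying to scale AI capabilities.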

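Discussion prompts

1. What is your team's most frequent repetitive work? Does it meet the four conditions of being high-frequency, error-prone, hard to explain verbally, and verification-heavy? If so, why has it not yet been captured as a reusable Skill or workflow?

2. How should an agent system balance flexibility against constraint? The article asks both that Skills avoid hard-coding a process and that schemas, hooks, and guardrails impose strong constraints. Are these two goals fundamentally at odds?

3. Once the number of Skills grows into the hundreds, token consumption, context-window pressure, and model performance degradation become bottlenecks. Does your organization have a cost-benefit mechanism for deciding which Skills are worth keeping?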


Skills have become one of the most used extension points in Claude Code. They're flexible, easy to make, and simple to distribute.

But this flexibility also makes it hard to know what works best. What types of skills are worth making? What's the secret to writing a good skill? When do you share them with others?

We've been using skills in Claude Code extensively at Anthropic, with hundreds of them in active use. These are the lessons we've learned about using skills to accelerate our development.

What are Skills?

If you're new to skills, I'd recommend reading our docs (https://code.claude.com/docs/en/skills) or watching our newest Skilljar course on Agent Skills (https://anthropic.skilljar.com/introduction-to-agent-skills). This post assumes you already have some familiarity with skills.

A common misconception we hear about skills is that they are "just markdown files", but the most interesting part of skills is that they're not just text files. They're folders that can include scripts, assets, data, and so on, which the agent can discover, explore, and manipulate.

In Claude Code, skills also have a wide variety of configuration options (https://code.claude.com/docs/en/skills#frontmatter-reference), including registering dynamic hooks.

We've found that some of the most interesting skills in Claude Code use these configuration options and the folder structure creatively.

Types of Skills

After cataloging all of our skills, we noticed they cluster into a few recurring categories. The best skills fit cleanly into one; the more confusing ones straddle several. This isn't a definitive list, but it is a good way to check whether any category is missing inside your org.

1. Library & API Reference

Skills that explain how to correctly use a library, CLI, or SDK. These can cover internal libraries as well as common libraries that Claude Code sometimes has trouble with. These skills often include a folder of reference code snippets and a list of gotchas for Claude to avoid when writing a script.

Examples:

  • billing-lib: your internal billing library, with its edge cases, footguns, etc.
  • internal-platform-cli: every subcommand of your internal CLI wrapper, with examples of when to use each
  • frontend-design: makes Claude better at your design system

2. Product Verification

Skills that describe how to test or verify that your code is working. These are often paired with an external tool such as Playwright or tmux to do the verification.

Verification skills are extremely useful for ensuring Claude's output is correct. It can be worth having an engineer spend a week just making your verification skills excellent.

Consider techniques like having Claude record a video of its output so you can see exactly what it tested, or enforcing programmatic assertions on state at each step. These are often done by including a variety of scripts in the skill.

Examples:

  • signup-flow-driver: runs through signup → email verify → onboarding in a headless browser, with hooks for asserting state at each step
  • checkout-verifier: drives the checkout UI with Stripe test cards and verifies the invoice actually lands in the right state
  • tmux-cli-driver: for interactive CLI testing where the thing you're verifying needs a TTY
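
The idea of enforcing programmatic assertions on state at each step can be sketched as follows. This is a minimal illustration, not the article's implementation: the `Step` class, `run_flow`, and the in-memory `SIGNUP_FLOW` are hypothetical stand-ins, and a real skill would drive a headless browser where this sketch mutates a dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    """One step of a flow: an action plus an assertion on the resulting state."""
    name: str
    action: Callable[[Dict], None]  # hypothetical stand-in for driving the UI
    check: Callable[[Dict], bool]   # programmatic assertion on state


def run_flow(steps: List[Step], state: Optional[Dict] = None) -> List[str]:
    """Run each step in order and fail loudly as soon as a check does not hold."""
    state = {} if state is None else state
    passed = []
    for step in steps:
        step.action(state)
        if not step.check(state):
            raise AssertionError(f"step '{step.name}' left unexpected state: {state}")
        passed.append(step.name)
    return passed


# Hypothetical signup flow, modelled purely in memory for illustration.
SIGNUP_FLOW = [
    Step("signup", lambda s: s.update(account="created"),
         lambda s: s.get("account") == "created"),
    Step("email_verify", lambda s: s.update(email_verified=True),
         lambda s: s.get("email_verified") is True),
    Step("onboarding", lambda s: s.update(onboarded=True),
         lambda s: s.get("onboarded") is True),
]
```

Because every step carries its own check, a failure names the exact step that went wrong instead of surfacing only at the end of the flow.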

3. Data Fetching & Analysis

Skills that connect to your data and monitoring stacks. These skills might include libraries that fetch your data with credentials, specific dashboard IDs, and so on, as well as instructions on common workflows or ways to get data.

Examples:

  • funnel-query: "which events do I join to see signup → activation → paid", plus the table that actually has the canonical user_id
  • cohort-compare: compares two cohorts' retention or conversion, flags statistically significant deltas, and links to the segment definitions
  • grafana: datasource UIDs, cluster names, and a problem → dashboard lookup table
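
As one way a cohort-compare style script might flag a statistically significant delta, here is a sketch using a two-proportion z-test. The article does not say which test such a skill uses, so the choice of test, the function name, and the 0.05 threshold mentioned afterwards are all assumptions.

```python
import math
from typing import Tuple


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0, 1.0  # both cohorts converted at exactly 0% or 100%
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the normal distribution: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A caller could then flag the delta when the p-value falls below a chosen threshold such as 0.05, and report the cohort sizes alongside it so readers can judge the result.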

4. Business Process & Team Automation

Skills that automate repetitive workflows into one command. These skills are usually fairly simple instructions but might have more complicated dependencies on other skills or MCPs. For these skills, saving previous results in log files can help the model stay consistent and reflect on previous executions of the workflow.

Examples:

  • standup-post: aggregates your ticket tracker, GitHub activity, and prior Slack → formatted standup, delta-only
  • create-<ticket-system>-ticket: enforces schema (valid enum values, required fields) plus post-creation workflow (ping reviewer, link in Slack)
  • weekly-recap: merged PRs + closed tickets + deploys → formatted recap post

5. Code Scaffolding & Templates

Skills that generate framework boilerplate for a specific function in your codebase. You might combine these skills with scripts that can be composed. They are especially useful when your scaffolding has natural language requirements that can't be covered purely by code.

Examples:

  • new-<framework>-workflow: scaffolds a new service/workflow/handler with your annotations
  • new-migration: your migration file template plus common gotchas
  • create-app: new internal app with your auth, logging, and deploy config pre-wired

6. Code Quality & Review

Skills that enforce code quality inside your org and help review code. These can include deterministic scripts or tools for maximum robustness. You may want to run these skills automatically as part of hooks or inside a GitHub Action.

Examples:

  • adversarial-review: spawns a fresh-eyes subagent to critique, implements fixes, and iterates until findings degrade to nitpicks
  • code-style: enforces code style, especially styles that Claude does not do well by default
  • testing-practices: instructions on how to write tests and what to test

7. CI/CD & Deployment

Skills that help you fetch, push, and deploy code inside your codebase. These skills may reference other skills to collect data.

Examples:

  • babysit-pr: monitors a PR → retries flaky CI → resolves merge conflicts → enables auto-merge
  • deploy-<service>: build → smoke test → gradual traffic rollout with error-rate comparison → auto-rollback on regression
  • cherry-pick-prod: isolated worktree → cherry-pick → conflict resolution → PR with template
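
The "gradual traffic rollout with error-rate comparison → auto-rollback on regression" step can be sketched as a small decision loop. Everything here is hypothetical: the stage percentages, the `error_rate_at` callback that would query real monitoring, and the tolerance value are illustrative assumptions rather than anything the article specifies.

```python
from typing import Callable, Dict, Sequence


def gradual_rollout(
    stages: Sequence[int],
    error_rate_at: Callable[[int], float],
    baseline: float,
    tolerance: float = 0.01,
) -> Dict:
    """Advance through traffic stages; stop and report a rollback on regression."""
    completed = []
    for percent in stages:
        observed = error_rate_at(percent)  # hypothetical monitoring lookup
        if observed > baseline + tolerance:
            # Regression detected: signal the caller to roll back.
            return {"status": "rolled_back", "failed_at": percent, "completed": completed}
        completed.append(percent)
    return {"status": "deployed", "failed_at": None, "completed": completed}
```

Keeping the comparison in a deterministic script means the model decides when to deploy, while the rollback rule itself is not left to interpretation.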

8. Runbooks

Skills that take a symptom (such as a Slack thread, alert, or error signature), walk through a multi-tool investigation, and produce a structured report.

Examples:

  • <service>-debugging: maps symptoms → tools → query patterns for your highest-traffic services
  • oncall-runner: fetches the alert → checks the usual suspects → formats a finding
  • log-correlator: given a request ID, pulls matching logs from every system that might have touched it

9. Infrastructure Operations

Skills that perform routine maintenance and operational procedures, some of which involve destructive actions that benefit from guardrails. These make it easier for engineers to follow best practices in critical operations.

Examples:

  • <resource>-orphans: finds orphaned pods/volumes → posts to Slack → soak period → user confirms → cascading cleanup
  • dependency-management: your org's dependency approval workflow
  • cost-investigation: "why did our storage/egress bill spike", with the specific buckets and query patterns

Tips for Making Skills

Once you've decided on the skill to make, how do you write it? These are some of the best practices, tips, and tricks we've found.

We also recently released Skill Creator (https://claude.com/blog/improving-skill-creator-test-measure-and-refine-agent-skills) to make it easier to create skills in Claude Code.

Don't State the Obvious

Claude Code knows a lot about your codebase, and Claude knows a lot about coding, including many default opinions. If you're publishing a skill that is primarily about knowledge, try to focus on information that pushes Claude out of its normal way of thinking.

The frontend design skill (https://github.com/anthropics/skills/blob/main/skills/frontend-design/SKILL.md) is a great example: one of the engineers at Anthropic built it by iterating with customers on improving Claude's design taste and steering it away from classic patterns like the Inter font and purple gradients.

Build a Gotchas Section

The highest-signal content in any skill is the Gotchas section. These sections should be built up from common failure points that Claude runs into when using your skill. Ideally, you will update your skill over time to capture these gotchas.

Use the File System & Progressive Disclosure

As we said earlier, a skill is a folder, not just a markdown file. You should think of the entire file system as a form of context engineering and progressive disclosure. Tell Claude what files are in your skill, and it will read them at appropriate times.

The simplest form of progressive disclosure is to point to other markdown files for Claude to use. For example, you may split detailed function signatures and usage examples into references/api.md.

Another example: if your end output is a markdown file, you might include a template file for it in assets/ to copy and use.

You can have folders of references, scripts, examples, and so on, which help Claude work more effectively.
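
Since the advice is to tell Claude which files a skill contains, one small convenience is to generate that listing from the folder itself. The helper below is a hypothetical sketch, not part of Claude Code: it walks a skill directory and returns a markdown bullet list that an author could paste into SKILL.md.

```python
from pathlib import Path


def skill_file_index(skill_dir: Path) -> str:
    """Build a markdown list of every file in the skill folder except SKILL.md."""
    lines = []
    for path in sorted(skill_dir.rglob("*")):
        if path.is_file() and path.name != "SKILL.md":
            relative = path.relative_to(skill_dir).as_posix()
            lines.append(f"- `{relative}`")
    return "\n".join(lines)
```

The author would still add a short note next to each entry saying when Claude should read it, since the file name alone rarely conveys that.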

Avoid Railroading Claude

Claude will generally try to stick to your instructions, and because skills are so reusable, you'll want to be careful about being too specific in them. Give Claude the information it needs, but leave it the flexibility to adapt to the situation.

Think through the Setup

Some skills may need to be set up with context from the user. For example, if you are making a skill that posts your standup to Slack, you may want Claude to ask which Slack channel to post it in.

A good pattern is to store this setup information in a config.json file in the skill directory. If the config is not set up, the agent can then ask the user for the information.

If you want the agent to present structured, multiple-choice questions, you can instruct Claude to use the AskUserQuestion tool.
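
The config.json pattern might look like the sketch below in a skill's helper script. The `slack_channel` key and both function names are hypothetical; the point is only that the script reports which settings are missing so the agent knows what to ask the user.

```python
import json
from pathlib import Path
from typing import Dict, List, Tuple

REQUIRED_KEYS = ("slack_channel",)  # hypothetical setting for a standup skill


def load_config(skill_dir: Path) -> Tuple[Dict, List[str]]:
    """Return (config, missing_keys); a missing or invalid file counts as empty."""
    try:
        config = json.loads((skill_dir / "config.json").read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        config = {}
    if not isinstance(config, dict):
        config = {}
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    return config, missing


def save_config(skill_dir: Path, config: Dict) -> None:
    """Persist the answers the user gave so later runs skip the questions."""
    (skill_dir / "config.json").write_text(json.dumps(config, indent=2), encoding="utf-8")
```

On first run `load_config` returns the missing keys, the agent asks the user for them, and `save_config` stores the answers for next time.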

The Description Field Is For the Model

When Claude Code starts a session, it builds a listing of every available skill with its description. This listing is what Claude scans to decide "is there a skill for this request?" That means the description field is not a summary; it's a description of when to trigger this skill.

Memory & Storing Data

Some skills can include a form of memory by storing data within them. You could store data in anything as simple as an append-only text log or JSON files, or as complicated as a SQLite database.

For example, a standup-post skill might keep a standups.log with every post it has written, which means the next time you run it, Claude reads its own history and can tell what has changed since yesterday.

Data stored in the skill directory may be deleted when you upgrade the skill, so you should store it in a stable folder. As of today, we provide ${CLAUDE_PLUGIN_DATA} as a stable per-plugin folder to store data in.
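
An append-only log of this kind can be sketched in a few lines. The assumptions here are that `CLAUDE_PLUGIN_DATA` is exposed to the script as an environment variable, that a local `.plugin-data` folder is an acceptable fallback when it is not, and that one JSON record per line is a suitable format; none of these details come from the article.

```python
import json
import os
from datetime import date
from pathlib import Path
from typing import Dict, Optional


def data_dir() -> Path:
    """Stable data folder: CLAUDE_PLUGIN_DATA if set, else a local fallback (assumption)."""
    root = Path(os.environ.get("CLAUDE_PLUGIN_DATA", ".plugin-data"))
    root.mkdir(parents=True, exist_ok=True)
    return root


def append_standup(text: str, day: Optional[str] = None) -> None:
    """Append one standup as a JSON line; earlier entries are never rewritten."""
    record = {"date": day or date.today().isoformat(), "text": text}
    with open(data_dir() / "standups.log", "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def last_standup() -> Optional[Dict]:
    """Return the most recent entry, or None if nothing has been logged yet."""
    path = data_dir() / "standups.log"
    if not path.exists():
        return None
    lines = [ln for ln in path.read_text(encoding="utf-8").splitlines() if ln.strip()]
    return json.loads(lines[-1]) if lines else None
```

Reading `last_standup()` before writing a new post is what would let the skill report only the delta since the previous one.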

Store Scripts & Generate Code

One of the most powerful tools you can give Claude is code. Giving Claude scripts and libraries lets it spend its turns on composition, deciding what to do next rather than reconstructing boilerplate.

For example, in your data science skill you might have a library of functions to fetch data from your event source. To let Claude do complex analysis, you could give it a set of helper functions.

Claude can then generate scripts on the fly that compose this functionality to do more advanced analysis for prompts like "What happened on Tuesday?"
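
A helper library of this sort could be as small as the sketch below. The article's own helpers are not reproduced in this text, so `fetch_events`, `count_by_type`, and the in-memory `EVENTS` list are hypothetical stand-ins for a real event source.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, List, Optional

# Hypothetical in-memory stand-in for a real event source.
EVENTS = [
    {"ts": "2026-03-16T18:30:00", "type": "signup", "user_id": "u0"},
    {"ts": "2026-03-17T09:00:00", "type": "signup", "user_id": "u1"},
    {"ts": "2026-03-17T09:05:00", "type": "error", "user_id": "u1"},
    {"ts": "2026-03-17T14:20:00", "type": "signup", "user_id": "u2"},
    {"ts": "2026-03-18T08:00:00", "type": "purchase", "user_id": "u2"},
]


def fetch_events(start: str, end: str, event_type: Optional[str] = None) -> List[Dict]:
    """Return events with start <= ts < end, optionally filtered by type."""
    lo, hi = datetime.fromisoformat(start), datetime.fromisoformat(end)
    return [
        event for event in EVENTS
        if lo <= datetime.fromisoformat(event["ts"]) < hi
        and (event_type is None or event["type"] == event_type)
    ]


def count_by_type(events: List[Dict]) -> Dict[str, int]:
    """Summarize a list of events as {event type: count}."""
    return dict(Counter(event["type"] for event in events))
```

With helpers like these in place, a generated script only has to pick the time window for the day in question and call `count_by_type(fetch_events(...))`.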

On Demand Hooks

Skills can include hooks that are only activated when the skill is called and that last for the duration of the session. Use this for more opinionated hooks that you don't want to run all the time but that are extremely useful sometimes.

For example:

  • /careful: blocks rm -rf, DROP TABLE, force-push, and kubectl delete via a PreToolUse matcher on Bash. You only want this when you know you're touching prod; having it always on would drive you insane
  • /freeze: blocks any Edit/Write that's not in a specific directory. Useful when debugging: "I want to add logs but I keep accidentally 'fixing' unrelated ..."
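
The decision logic behind a /careful style hook might look like the sketch below. It assumes the hook receives the pending tool call as JSON with `tool_name` and `tool_input.command` fields and that exit status 2 signals a block; both should be checked against the current hooks documentation. The patterns are illustrative, not exhaustive.

```python
import re
from typing import Dict, Optional, Tuple

# Illustrative patterns only; a real guardrail would need a more careful list.
BLOCKED_PATTERNS = [
    (re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r"), "rm -rf"),
    (re.compile(r"\bdrop\s+table\b", re.IGNORECASE), "DROP TABLE"),
    (re.compile(r"\bgit\s+push\b.*(--force|\s-f\b)"), "force-push"),
    (re.compile(r"\bkubectl\s+delete\b"), "kubectl delete"),
]


def check_command(command: str) -> Optional[str]:
    """Return the label of the first blocked pattern found, or None."""
    for pattern, label in BLOCKED_PATTERNS:
        if pattern.search(command):
            return label
    return None


def decide(payload: Dict) -> Tuple[int, str]:
    """Map a tool-call payload to (exit_code, message); 2 is assumed to mean block."""
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    label = check_command(command)
    if label:
        return 2, f"Blocked by /careful: {label}"
    return 0, ""

# In an actual hook script one would read the payload with json.load(sys.stdin),
# print the message to stderr, and exit with the returned code.
```

Keeping `decide` a pure function makes the guardrail easy to unit test before it is ever attached to a live session.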

Distributing Skills

One of the biggest benefits of skills is that you can share them with the rest of your team.

There are two ways you might share skills with others:

  • check your skills into your repo (under ./.claude/skills)
  • make a plugin and set up a Claude Code plugin marketplace where users can upload and install plugins (read more in the documentation: https://code.claude.com/docs/en/plugin-marketplaces)

For smaller teams working across relatively few repos, checking your skills into repos works well. But every skill that is checked in also adds a little to the model's context. As you scale, an internal plugin marketplace lets you distribute skills and lets your team decide which ones to install.

Managing a Marketplace

How do you decide which skills go in a marketplace? How do people submit them?

We don't have a centralized team that decides; instead, we try to find the most useful skills organically. If you have a skill that you want people to try out, you can upload it to a sandbox folder in GitHub and point people to it in Slack or other forums.

Once a skill has gotten traction (which is up to the skill owner to decide), the owner can put in a PR to move it into the marketplace.

A note of warning: it can be quite easy to create bad or redundant skills, so it's important to have some method of curation before release.

Composing Skills

You may want to have skills that depend on each other. For example, you may have a file upload skill that uploads a file, and a CSV generation skill that makes a CSV and uploads it. This sort of dependency management is not natively built into marketplaces or skills yet, but you can simply reference other skills by name, and the model will invoke them if they are installed.

Measuring Skills

To understand how a skill is doing, we use a PreToolUse hook that lets us log skill usage within the company (example code: https://gist.github.com/ThariqS/24defad423d701746e23dc19aace4de5). This means we can find skills that are popular or that are undertriggering compared to our expectations.
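
A usage logger in this spirit could be sketched as follows. This is not the linked example code: it assumes that a skill invocation reaches the hook as a tool call named `Skill` whose input carries a `skill` name and that the payload includes a `session_id`, all of which should be verified against the actual hook payload.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Dict


def record_usage(payload: Dict, log_path: Path) -> bool:
    """Append one JSON line per skill invocation; ignore every other tool call."""
    if payload.get("tool_name") != "Skill":  # assumed tool name
        return False
    entry = {
        "skill": payload.get("tool_input", {}).get("skill", "unknown"),  # assumed field
        "session": payload.get("session_id"),
    }
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return True


def usage_counts(log_path: Path) -> Counter:
    """Aggregate the log into invocations per skill."""
    counts: Counter = Counter()
    if log_path.exists():
        for line in log_path.read_text(encoding="utf-8").splitlines():
            if line.strip():
                counts[json.loads(line)["skill"]] += 1
    return counts
```

Comparing these counts with how often a skill was expected to fire is one way to spot the undertriggering the article mentions.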

Conclusion

Skills are incredibly powerful, flexible tools for agents, but it's still early and we're all figuring out how to use them best.

Think of this more as a grab bag of useful tips that we've seen work than as a definitive guide. The best way to understand skills is to get started, experiment, and see what works for you. Most of ours began as a few lines and a single gotcha, and got better because people kept adding to them as Claude hit new edge cases.

I hope this was helpful. Let me know if you have any questions.

Skills have become one of the most used extension points in Claude Code. They’re flexible, easy to make, and simple to distribute.

But this flexibility also makes it hard to know what works best. What type of skills are worth making? What's the secret to writing a good skill? When do you share them with others?

We've been using skills in Claude Code extensively at Anthropic with hundreds of them in active use. These are the lessons we've learned about using skills to accelerate our development.

What are Skills?

If you’re new to skills, I’d recommend reading our docs or watching our newest course on new Skilljar on Agent Skills, this post will assume you already have some familiarity with skills.

A common misconception we hear about skills is that they are “just markdown files”, but the most interesting part of skills is that they’re not just text files. They’re folders that can include scripts, assets, data, etc. that the agent can discover, explore and manipulate.

In Claude Code, skills also have a wide variety of configuration options including registering dynamic hooks.

We’ve found that some of the most interesting skills in Claude Code use these configuration options and folder structure creatively.

Types of Skills

After cataloging all of our skills, we noticed they cluster into a few recurring categories. The best skills fit cleanly into one; the more confusing ones straddle several. This isn't a definitive list, but it is a good way to think about if you're missing any inside of your org.

https://code.claude.com/docs/en/plugin-marketplaces

1. Library & API Reference

Skills that explain how to correctly use a library, CLI, or SDK. These could be for internal libraries or for common libraries that Claude Code sometimes has trouble with. These skills often include a folder of reference code snippets and a list of gotchas for Claude to avoid when writing a script.

Examples:

  • billing-lib — your internal billing library: edge cases, footguns, etc.

  • internal-platform-cli — every subcommand of your internal CLI wrapper with examples on when to use them

  • frontend-design — make Claude better at your design system

2. Product Verification

Skills that describe how to test or verify that your code is working. These are often paired with an external tool such as Playwright or tmux to do the verification.

Verification skills are extremely useful for ensuring Claude's output is correct. It can be worth having an engineer spend a week just making your verification skills excellent.

Consider techniques like having Claude record a video of its output so you can see exactly what it tested, or enforcing programmatic assertions on state at each step. These are often done by including a variety of scripts in the skill.

Examples:

  • signup-flow-driver — runs through signup → email verify → onboarding in a headless browser, with hooks for asserting state at each step

  • checkout-verifier — drives the checkout UI with Stripe test cards, verifies the invoice actually lands in the right state

  • tmux-cli-driver — for interactive CLI testing where the thing you're verifying needs a TTY
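
The idea of enforcing programmatic assertions on state at each step can be sketched roughly as follows. This is a minimal illustration, not how any of the skills above are actually built: the `Step` type, `run_flow`, and the three stand-in actions are hypothetical, and a real verification skill would drive a browser or a terminal where this sketch only mutates a dictionary.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    """One step of a flow: an action plus an assertion on the resulting state."""
    name: str
    action: Callable[[dict], None]  # performs the step and updates shared state
    check: Callable[[dict], bool]   # returns True if the state looks right


def run_flow(steps: list[Step], state: Optional[dict] = None) -> list[str]:
    """Run steps in order, checking state after each; stop at the first failure."""
    state = {} if state is None else state
    report = []
    for step in steps:
        step.action(state)
        if not step.check(state):
            report.append(f"FAIL {step.name}: state={state}")
            break
        report.append(f"ok {step.name}")
    return report


# Hypothetical stand-ins for real browser actions.
def sign_up(state: dict) -> None:
    state["user"] = "test@example.com"


def verify_email(state: dict) -> None:
    state["email_verified"] = True


def onboard(state: dict) -> None:
    state["onboarded"] = True


SIGNUP_FLOW = [
    Step("signup", sign_up, lambda s: s.get("user") == "test@example.com"),
    Step("email verify", verify_email, lambda s: s.get("email_verified") is True),
    Step("onboarding", onboard, lambda s: s.get("onboarded") is True),
]
```

Calling `run_flow(SIGNUP_FLOW)` returns one report line per step. Stopping at the first failed check keeps the report pointed at the step that actually broke instead of at the downstream noise.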

3. Data Fetching & Analysis

Skills that connect to your data and monitoring stacks. These skills might include libraries that fetch your data with credentials, specific dashboard IDs, and similar details, as well as instructions for common workflows or ways to get data.

Examples:

  • funnel-query — "which events do I join to see signup → activation → paid" plus the table that actually has the canonical user_id

  • cohort-compare — compare two cohorts' retention or conversion, flag statistically significant deltas, link to the segment definitions

  • grafana — datasource UIDs, cluster names, problem → dashboard lookup table

4. Business Process & Team Automation

Skills that automate repetitive workflows into one command. These skills are usually fairly simple instructions but might have more complicated dependencies on other skills or MCPs. For these skills, saving previous results in log files can help the model stay consistent and reflect on previous executions of the workflow.

Examples:

  • standup-post — aggregates your ticket tracker, GitHub activity, and prior Slack → formatted standup, delta-only

  • create-<ticket-system>-ticket — enforces schema (valid enum values, required fields) plus post-creation workflow (ping reviewer, link in Slack)

  • weekly-recap — merged PRs + closed tickets + deploys → formatted recap post

5. Code Scaffolding & Templates

Skills that generate framework boilerplate for a specific function in your codebase. You might combine these skills with scripts that can be composed. They are especially useful when your scaffolding has natural language requirements that can’t be purely covered by code.

Examples:

  • new-<framework>-workflow — scaffolds a new service/workflow/handler with your annotations

  • new-migration — your migration file template plus common gotchas

  • create-app — new internal app with your auth, logging, and deploy config pre-wired

6. Code Quality & Review

Skills that enforce code quality inside of your org and help review code. These can include deterministic scripts or tools for maximum robustness. You may want to run these skills automatically as part of hooks or inside of a GitHub Action.

Examples:

  • adversarial-review — spawns a fresh-eyes subagent to critique, implements fixes, iterates until findings degrade to nitpicks

  • code-style — enforces code style, especially styles that Claude does not do well by default.

  • testing-practices — instructions on how to write tests and what to test.

7. CI/CD & Deployment

Skills that help you fetch, push, and deploy code inside of your codebase. These skills may reference other skills to collect data.

Examples:

  • babysit-pr — monitors a PR → retries flaky CI → resolves merge conflicts → enables auto-merge

  • deploy-<service> — build → smoke test → gradual traffic rollout with error-rate comparison → auto-rollback on regression

  • cherry-pick-prod — isolated worktree → cherry-pick → conflict resolution → PR with template

8. Runbooks

Skills that take a symptom (such as a Slack thread, alert, or error signature), walk through a multi-tool investigation, and produce a structured report.

Examples:

  • <service>-debugging — maps symptoms → tools → query patterns for your highest-traffic services

  • oncall-runner — fetches the alert → checks the usual suspects → formats a finding

  • log-correlator — given a request ID, pulls matching logs from every system that might have touched it

9. Infrastructure Operations

Skills that perform routine maintenance and operational procedures — some of which involve destructive actions that benefit from guardrails. These make it easier for engineers to follow best practices in critical operations.

Examples:

  • <resource>-orphans — finds orphaned pods/volumes → posts to Slack → soak period → user confirms → cascading cleanup

  • dependency-management — your org's dependency approval workflow

  • cost-investigation — "why did our storage/egress bill spike" with the specific buckets and query patterns

Tips for Making Skills

Once you've decided on the skill to make, how do you write it? These are some of the best practices, tips, and tricks we've found.

We also recently released Skill Creator to make it easier to create skills in Claude Code.

Don’t State the Obvious

Claude Code knows a lot about your codebase, and Claude knows a lot about coding, including many default opinions. If you’re publishing a skill that is primarily about knowledge, try to focus on information that pushes Claude out of its normal way of thinking.

The frontend design skill is a great example — it was built by one of the engineers at Anthropic by iterating with customers on improving Claude’s design taste, avoiding classic patterns like the Inter font and purple gradients.

Build a Gotchas Section

https://gist.github.com/ThariqS/24defad423d701746e23dc19aace4de5

The highest-signal content in any skill is the Gotchas section. These sections should be built up from common failure points that Claude runs into when using your skill. Ideally, you will update your skill over time to capture these gotchas.

Use the File System & Progressive Disclosure

https://claude.com/blog/improving-skill-creator-test-measure-and-refine-agent-skills

Like we said earlier, a skill is a folder, not just a markdown file. You should think of the entire file system as a form of context engineering and progressive disclosure. Tell Claude what files are in your skill, and it will read them at appropriate times.

The simplest form of progressive disclosure is to point to other markdown files for Claude to use. For example, you may split detailed function signatures and usage examples into references/api.md.

Another example: if your end output is a markdown file, you might include a template file for it in assets/ to copy and use.

You can have folders of references, scripts, examples, etc., which help Claude work more effectively.

Avoid Railroading Claude

Claude will generally try to stick to your instructions, and because skills are so reusable, you’ll want to be careful not to be too specific in them. Give Claude the information it needs, but leave it the flexibility to adapt to the situation. For example, rather than dictating an exact sequence of commands, describe the goal and the constraints and let Claude choose the steps.

Think through the Setup

https://github.com/anthropics/skills/blob/main/skills/frontend-design/SKILL.md

Some skills may need to be set up with context from the user. For example, if you are making a skill that posts your standup to Slack, you may want Claude to ask which Slack channel to post it in.

A good pattern is to store this setup information in a config.json file in the skill directory. If the config is not set up, the agent can then ask the user for the missing information.
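
A script bundled with the skill could check that config before the skill runs. The sketch below is only an illustration under assumptions: the `slack_channel` key and both helper names are hypothetical, chosen to match the standup example, and nothing here is a format that Claude Code itself defines.

```python
import json
from pathlib import Path

# Hypothetical required setting for a standup-posting skill.
REQUIRED_KEYS = ("slack_channel",)


def load_config(config_path: Path) -> dict:
    """Read config.json, treating a missing or malformed file as empty."""
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return config if isinstance(config, dict) else {}


def missing_settings(config_path: Path) -> list:
    """Return the required keys that are absent or empty, so the agent can ask for them."""
    config = load_config(config_path)
    return [key for key in REQUIRED_KEYS if not config.get(key)]


def save_setting(config_path: Path, key: str, value: str) -> None:
    """Store one answer from the user back into config.json."""
    config = load_config(config_path)
    config[key] = value
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
```

The skill's instructions could then tell Claude to run the check first, ask the user about anything it reports as missing, and save the answers so later runs skip the question.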

If you want the agent to present structured, multiple choice questions you can instruct Claude to use the AskUserQuestion tool.

The Description Field Is For the Model

When Claude Code starts a session, it builds a listing of every available skill with its description. This listing is what Claude scans to decide "is there a skill for this request?" This means the description field is not a summary; it is a description of when to trigger this skill.

https://anthropic.skilljar.com/introduction-to-agent-skills

Memory & Storing Data

Some skills can include a form of memory by storing data within them. You could store data in something as simple as an append-only text log or JSON files, or as complicated as a SQLite database.

For example, a standup-post skill might keep a standups.log with every post it's written, which means the next time you run it, Claude reads its own history and can tell what's changed since yesterday.

Data stored in the skill directory may be deleted when you upgrade the skill, so you should store it in a stable folder. As of today, we provide ${CLAUDE_PLUGIN_DATA} as a stable per-plugin folder for storing data.
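
An append-only log of this kind might look like the sketch below. It assumes `CLAUDE_PLUGIN_DATA` is visible to the script as an environment variable and falls back to a local `.skill-data` folder otherwise; the fallback, the JSON-lines layout, and the function names are all hypothetical choices for illustration.

```python
import json
import os
from datetime import datetime, timezone
from pathlib import Path


def data_dir() -> Path:
    """Resolve the stable data folder (assumed to arrive via CLAUDE_PLUGIN_DATA)."""
    root = Path(os.environ.get("CLAUDE_PLUGIN_DATA", ".skill-data"))
    root.mkdir(parents=True, exist_ok=True)
    return root


def append_entry(text: str, log_name: str = "standups.log") -> None:
    """Append one timestamped entry as a JSON line; earlier entries are never rewritten."""
    entry = {"ts": datetime.now(timezone.utc).isoformat(), "text": text}
    with (data_dir() / log_name).open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")


def last_entries(n: int = 1, log_name: str = "standups.log") -> list:
    """Return the n most recent entries (n >= 1), oldest first."""
    path = data_dir() / log_name
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines[-n:] if line.strip()]
```

With this in place, the skill can read `last_entries(1)` before writing a new standup and report only what has changed since the previous post.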

Store Scripts & Generate Code

One of the most powerful tools you can give Claude is code. Giving Claude scripts and libraries lets Claude spend its turns on composition, deciding what to do next rather than reconstructing boilerplate.

For example, in your data science skill you might have a library of functions to fetch data from your event source. In order for Claude to do complex analysis, you could give it a set of helper functions like so:
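
As a rough sketch of what such helpers could look like (the function names and the in-memory sample rows are hypothetical; a real skill would query your actual event source):

```python
from collections import Counter
from datetime import date

# Hypothetical sample rows standing in for a real event source.
SAMPLE_EVENTS = [
    {"day": date(2026, 3, 9), "name": "signup", "user_id": "u1"},
    {"day": date(2026, 3, 10), "name": "signup", "user_id": "u2"},
    {"day": date(2026, 3, 10), "name": "signup", "user_id": "u3"},
    {"day": date(2026, 3, 10), "name": "purchase", "user_id": "u2"},
    {"day": date(2026, 3, 10), "name": "error", "user_id": "u3"},
]


def fetch_events(day: date, source: list = SAMPLE_EVENTS) -> list:
    """Return every event recorded on the given day."""
    return [event for event in source if event["day"] == day]


def count_by_name(events: list) -> Counter:
    """Count events per event name."""
    return Counter(event["name"] for event in events)


def unique_users(events: list) -> set:
    """Return the set of user ids that appear in the events."""
    return {event["user_id"] for event in events}


def what_happened(day: date) -> dict:
    """The kind of throwaway composition Claude might write on the fly."""
    events = fetch_events(day)
    return {
        "events": len(events),
        "users": len(unique_users(events)),
        "by_name": dict(count_by_name(events)),
    }
```

The helpers stay small and single-purpose, so the composition step is where Claude spends its effort.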

https://code.claude.com/docs/en/skills

Claude can then generate scripts on the fly to compose this functionality to do more advanced analysis for prompts like “What happened on Tuesday?”

https://code.claude.com/docs/en/skills#frontmatter-reference

On Demand Hooks

Skills can include hooks that are only activated when the skill is called, and last for the duration of the session. Use this for more opinionated hooks that you don’t want to run all the time, but are extremely useful sometimes.

For example:

  • /careful — blocks rm -rf, DROP TABLE, force-push, kubectl delete via PreToolUse matcher on Bash. You only want this when you know you're touching prod — having it always on would drive you insane

  • /freeze — blocks any Edit/Write that's not in a specific directory. Useful when debugging: "I want to add logs but I keep accidentally 'fixing' unrelated code"
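
A hook script behind something like /careful could be sketched as below. It assumes the hook receives the pending tool call as JSON with `tool_name` and `tool_input.command` fields and that exiting with status 2 blocks the call, which is our understanding of Claude Code's hook contract rather than something this post spells out. The patterns are illustrative and far from exhaustive.

```python
import json
import re
import sys

# Illustrative patterns only; a real guard would need a more careful list.
BLOCKED_PATTERNS = [
    r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r",  # rm -rf / rm -fr
    r"\bdrop\s+table\b",
    r"\bgit\s+push\b.*(--force|-f\b)",
    r"\bkubectl\s+delete\b",
]


def blocked_reason(command: str):
    """Return a short reason if the command matches a blocked pattern, else None."""
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, command, flags=re.IGNORECASE):
            return f"blocked by /careful: command matches {pattern!r}"
    return None


def main(raw_payload: str) -> int:
    """Decide on one tool call; the return value is the intended exit status."""
    payload = json.loads(raw_payload) if raw_payload.strip() else {}
    if payload.get("tool_name") != "Bash":
        return 0  # only shell commands are inspected
    command = payload.get("tool_input", {}).get("command", "")
    reason = blocked_reason(command)
    if reason:
        print(reason, file=sys.stderr)  # assumed to be shown to Claude
        return 2  # assumed to mean "block this tool call"
    return 0
```

Installed as a hook, the script would end with `sys.exit(main(sys.stdin.read()))`. Keeping the matching logic in `blocked_reason` makes it easy to test the patterns without running a live session.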

Distributing Skills

One of the biggest benefits of Skills is that you can share them with the rest of your team.

There are two ways you might share skills with others:

  • check your skills into your repo (under ./.claude/skills)

  • make a plugin and set up a Claude Code plugin marketplace where users can upload and install plugins (read more in the plugin marketplace documentation)

For smaller teams working across relatively few repos, checking your skills into repos works well. But every skill that is checked in also adds a little bit to the context of the model. As you scale, an internal plugin marketplace allows you to distribute skills and let your team decide which ones to install.

Managing a Marketplace

How do you decide which skills go in a marketplace? How do people submit them?

We don't have a centralized team that decides; instead we try to find the most useful skills organically. If you have a skill that you want people to try out, you can upload it to a sandbox folder in GitHub and point people to it in Slack or other forums.

Once a skill has gotten traction (which is up to the skill owner to decide), they can put in a PR to move it into the marketplace.

A word of warning: it can be quite easy to create bad or redundant skills, so it is important to have some method of curation before release.

Composing Skills

You may want to have skills that depend on each other. For example, you may have a file upload skill that uploads a file, and a CSV generation skill that makes a CSV and uploads it. This sort of dependency management is not natively built into marketplaces or skills yet, but you can just reference other skills by name, and the model will invoke them if they are installed.

Measuring Skills

To understand how a skill is doing, we use a PreToolUse hook that lets us log skill usage within the company (example code here). This means we can find skills that are popular or are undertriggering compared to our expectations.
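
One way such a logging hook might work is sketched below. This is not the example code the post links to. It assumes skill invocations reach the hook as a tool call named `Skill` whose input carries the skill name in a `skill` field, and that the payload includes a `session_id`; all three are assumptions to check against your own hook payloads.

```python
import json
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path


def usage_record(payload: dict):
    """Build a log record for a skill invocation, or None for any other tool call."""
    if payload.get("tool_name") != "Skill":  # assumed tool name
        return None
    skill = payload.get("tool_input", {}).get("skill", "unknown")  # assumed field
    return {
        "ts": datetime.now(timezone.utc).isoformat(),
        "skill": skill,
        "session": payload.get("session_id"),
    }


def log_usage(payload: dict, log_path: Path) -> bool:
    """Append a record for skill calls; return whether anything was logged."""
    record = usage_record(payload)
    if record is None:
        return False
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return True


def usage_counts(log_path: Path) -> Counter:
    """Count logged invocations per skill."""
    if not log_path.exists():
        return Counter()
    lines = log_path.read_text(encoding="utf-8").splitlines()
    return Counter(json.loads(line)["skill"] for line in lines if line.strip())
```

Comparing `usage_counts` against the skills you expected to be popular is a simple way to spot ones that are undertriggering, which usually points back at the description field.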

Conclusion

Skills are incredibly powerful, flexible tools for agents, but it’s still early and we’re all figuring out how to use them best.

Think of this more as a grab bag of useful tips that we’ve seen work than a definitive guide. The best way to understand skills is to get started, experiment, and see what works for you. Most of ours began as a few lines and a single gotcha, and got better because people kept adding to them as Claude hit new edge cases.

I hope this was helpful, let me know if you have any questions.

Link: http://x.com/i/article/2033772621536591872
