🧠 阿头学 · 💬 Discussion topic

Don't outsource your learning

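The article's judgment largely holds: what is truly dangerous about AI is not that it does the work for you, but that by default it steals the step of understanding the problem. Its argument about long-term skill erosion is persuasive, but the evidence does not yet close the loop.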

2026-05-19

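Key points

  • Shipping and learning are not the same thing: The author's strongest claim is that today's AI coding tools optimize almost exclusively for closing the task as fast as possible, not for helping the user build a mental model, so a short-term gain in efficiency does not imply a long-term gain in capability.
  • The problem is not AI but the posture of use: The most defensible part of the article is its distinction between two modes of use. Using AI to explain, calibrate, and push back usually helps learning; using AI to ghostwrite, copy, and skip the reasoning usually creates cognitive debt. That distinction matters more than whether to use AI at all.
  • The earlier AI steps in, the more readily it defines the problem for you: The anchoring effect the article cites matters because the risk may come less from using AI too much than from seeing the AI's answer before you have formed your own judgment, which systematically weakens the ability to frame problems independently.
  • Deep understanding still pays off in abnormal situations: The author is right on one point. Boilerplate tasks can be delegated safely, but the high-value situations (debugging, architecture migration, handling hallucinations, solving non-standard problems) still require engineers who can read, question, and rebuild the code themselves.
  • The fix is to change the workflow, not to ban the tool: The article's suggestions (write a hypothesis first, ask for an explanation first, review the output like a PR, periodically rewrite by hand) are pragmatic, more actionable than moralizing about using less AI, and closer to how people actually work.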

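How this relates to us

  • What it means for ATou and what to do next: If ATou is using AI heavily to write code or do analysis, the article is a reminder not to downgrade yourself into an acceptance tester. A next step is to enforce one habit: before every question, write two sentences stating your own hypothesis, then let the AI answer.
  • What it means for Neta and what to do next: If Neta owns methodology, knowledge systems, or research frameworks, the article shows that framing first and then asking for help matters more than searching for the answer first. A next step is to turn the AI usage process into a template that separates explanation-type questions from do-it-for-me questions.
  • What it means for Uota and what to do next: If Uota focuses on product and user behavior, the article shows that default UX shapes users' capabilities, not just their efficiency. A next step is to examine whether the product rewards only fast completion and not understanding, review, and calibration.
  • What it means for the team and what to do next: If the team only looks at delivery speed, it is in effect condoning cognitive outsourcing. The next step should be to add "explain the design choice" and "can you rebuild the key module without AI" to review or retrospective criteria; otherwise the apparent productivity may be a false boom.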

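Discussion starters

1. Which tasks are really worth handing straight to AI, and which ones damage long-term competitiveness once they are outsourced?
2. If the order in which AI enters the work matters more than the total amount of AI use, how should we redesign our daily workflow?
3. Should the team institutionalize learning metrics, and if so, what is the least hand-wavy way to measure them?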

Right now, it's too easy to let AI write the code while you skip the learning. The bug gets fixed but your mental model doesn't move. It might get worse over time. We are silently trading future capability for present-day speed, and the tools won't force us to do otherwise. That part has to come from you.

There's a default loop most of us have settled into. You paste in a spec or error message. The model hands you a fix and the symptom vanishes. You ship. Somewhere in that loop, the messy struggle between problem and solution stops happening at all.

I've written before about cognitive surrender, the moment an AI reviewer's verdict quietly replaces your own. This is the solo version of that same loop. It's just you and the model. The model is faster, so you stop trying to compete on comprehension. Across thousands of these small interactions, what you can actually build without an AI looking over your shoulder gets a little weaker every week. None of these moments feel like a problem on the day they happen.

I'm not anti-AI. I use these tools daily and have shipped more with them in the last year than in the years before it. But the default way we use them is optimized for one thing: closing tasks.

That is a completely different goal from staying sharp enough to steer them over a long career.

The studies are converging on the same point

Several pieces of research over the last year have landed in roughly the same place.

Anthropic ran a randomized trial in early 2026 where engineers learned a new Python library, half with AI assistance and half without. Both groups finished the tasks at the same speed. But the AI group bombed the follow-up comprehension quiz: 50% versus 67% for the manual group, with the gap widening on debugging. The interesting cut was inside the AI group itself. Engineers who used AI to ask conceptual questions scored above 65%. Engineers who copy-pasted the generated code scored under 40%. The tool didn't determine the outcome. The posture did.

MIT's Your Brain on ChatGPT study compared essay writing across LLM, search-engine, and brain-only groups. EEG measurements showed brain connectivity scaling down with every layer of external support. The LLM group showed the weakest coupling. After writing the essay, 83% of LLM users couldn't quote a single line of what they had just produced. The researchers called this cognitive debt: saving mental effort today, paying for it in critical thinking tomorrow.

A CHI 2026 study added a related finding. When people had LLM access at the start of a task, the LLM framed the entire problem. Even when the human did the rest of the work themselves, that initial anchoring produced measurably worse decisions. The order of operations mattered more than the total amount of AI used.

Different methodologies are reaching the same conclusion: using AI without an active intent to learn quietly degrades the skill you're being paid for.

The tools default to shipping, not teaching

If you fire up a coding agent and stick to the defaults, everything is tuned for one metric: getting the task done.

The model writes the code. You accept it. The loop repeats. At no point does the tool pause and ask "what do you think the problem is?" or "try writing the first five lines yourself."

That's where UX gravity is right now. Product teams get rewarded for merged changes and shorter cycle times, not for making you a sharper engineer. We all want fewer keystrokes, so the tools have sanded the friction away. The trouble is that friction was where the learning lived.

A few companies have tried pushing back on these loops, which do not encourage us to really learn.

Anthropic shipped Learning Mode for Claude, which uses Socratic questioning and stops to ask you to write code before continuing. OpenAI and Google have shipped similar features. If we're being honest, though, almost nobody uses them for real production work. We've quietly filed them under "for students", and that's a mistake. The same feature that helps a sophomore learn React works for a senior engineer learning Rust. You just have to be willing to feel like a beginner again.

"If the AI can do it, why do I need to understand it?"

A fair question. For some work, the answer is: maybe...you don't? If it's boilerplate, glue code, or a throwaway CI script you'll never look at again, delegate it. The opportunity cost of memorizing some syntax is too high.

For real software, pure delegation breaks down in a few specific places.

When something breaks. AI-generated code crashes the same way human code does. "The agent wrote it" doesn't help you debug problems. Somebody on the team has to understand the architecture.

When it's confidently wrong. LLMs can still hallucinate. The only defense against a plausible-looking incorrect answer is enough expertise to spot it. Band-aids like skills, CLIs, etc. only get you so far.

When the foundation changes. Code is temporary; systems are permanent. When frameworks update or a security review flags a structural issue, you can't re-prompt your way out. You need engineers who understand the system well enough to migrate it.

When you leave the median. AI is brilliant at problems that have been solved a million times on GitHub. The further you stray from the median, the worse it gets. The hard, undocumented problems, the ones that justify a senior engineer's salary, still require deep understanding.

When the market adjusts. Engineers who can only ship with AI, and not without it, are entering a labor pool that is already re-pricing what expertise is worth. If you use AI to skip learning, you're trading future relevance for a slightly easier Tuesday.

The fix is in how you prompt

The good news is that the same tools that produce cognitive debt can produce sharper engineers. The difference is in what you ask of them.

Form a hypothesis before you ask. Before requesting a fix, write down two or three sentences on what you think the problem is. Use the model's answer to test your theory, not to replace it.

Ask for the explanation before the code. In unfamiliar territory, your first prompt should be something like "explain how this works, what the alternatives are, and what the tradeoffs are." Ask for the code only after you've grasped the concepts.

Turn on Learning Mode when you're out of your depth. Yes, it feels slower. That's the point.

Treat AI output like a PR from a junior engineer. Read it. Critique it. Push back on it. Would you merge it just because the tests passed? If not, don't merge it here either.

Re-derive things by hand once in a while. Take a piece of code the model wrote for you and try to recreate it from scratch. It's the calibration check that tells you how much you've quietly lost.

Ask the model to teach you what it just did. After it writes a clever function, ask what concepts it used and what you'd need to read to understand the design choice. One extra prompt changes what you take away from the session.

None of these are dramatic. They're small posture shifts inside the same tools you're already using.
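
One way to make the first two habits hard to skip is to put them in the thing that builds your prompt. The sketch below is only an illustration: `build_prompt`, `count_sentences`, and the two-sentence minimum are hypothetical names and choices, not part of any tool mentioned here. It refuses to produce a prompt until you have written a short hypothesis, and it asks for the explanation before any code.

```python
import re

# Wording adapted from the "explanation before the code" habit above.
EXPLAIN_FIRST = (
    "Explain how this works, what the alternatives are, "
    "and what the tradeoffs are. Do not write code yet."
)


def count_sentences(text: str) -> int:
    """Rough sentence count: non-empty chunks split on '.', '!' or '?'."""
    return len([s for s in re.split(r"[.!?]+", text) if s.strip()])


def build_prompt(problem: str, hypothesis: str, min_sentences: int = 2) -> str:
    """Build a prompt only once a hypothesis of a few sentences exists.

    The threshold is an assumption for illustration; adjust it to taste.
    """
    if count_sentences(hypothesis) < min_sentences:
        raise ValueError(
            f"Write at least {min_sentences} sentences on what you think is wrong first."
        )
    return (
        f"Problem:\n{problem.strip()}\n\n"
        f"My hypothesis:\n{hypothesis.strip()}\n\n"
        f"{EXPLAIN_FIRST} Then tell me where my hypothesis is wrong."
    )
```

The sentence counter is deliberately crude (a decimal like 3.5 would be miscounted). The point is the ordering, your theory first and the model's answer second, not the parsing.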

Two metrics, not one

I've started ending coding sessions with a simple question: did I learn anything today, or did I just close issues?

Sometimes the honest answer is "I just closed issues" and that's fine. If it becomes the answer for months in a row, cognitive debt is accumulating in the background.
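
If you want that end-of-session question to leave a trace, a tiny log is enough. This is a minimal sketch under assumptions: the `Session` record, its field names, and the sample dates and entries are all made up for illustration. It counts how many of the most recent sessions in a row ended with nothing learned.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Session:
    day: date
    issues_closed: int  # the "ship" metric
    learned: str        # the "learn" metric: one line, or "" if nothing


def closed_only_streak(sessions: list[Session]) -> int:
    """Count the most recent consecutive sessions where nothing was learned."""
    streak = 0
    for s in sorted(sessions, key=lambda s: s.day, reverse=True):
        if s.learned.strip():
            break
        streak += 1
    return streak


# Hypothetical sample data.
log = [
    Session(date(2026, 5, 11), 3, "how the retry middleware backs off"),
    Session(date(2026, 5, 12), 5, ""),
    Session(date(2026, 5, 13), 2, ""),
]
print(closed_only_streak(log))  # prints 2
```

A streak of two is fine. A streak that runs into months is the signal described above.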

Ship and learn are two separate metrics. Your manager and your customers will only ever ask about the first one. The second is on you.

I'd rather ship 80% of what I could have and learn 100% of what I needed to, than the reverse. Over years, those two strategies produce very different engineers.

You don't have to choose between using AI and learning. You do have to choose a workflow that does both, because the defaults won't choose it for you. The tools are ready whenever you are.

The next boring task you were about to delegate is a good place to start.
