💬 Discussion Topics

Don't Let Your AI Act Like a Temp Worker with Paid Amnesia

Drop inefficient grep string matching. Build a local memory layer with hybrid "BM25 + semantic" retrieval, and let AI evolve from a single-session tool into a digital counterpart with long-term context.

2026-03-03

Key Takeaways

  • Context debt is the silent killer of AI collaboration. Constantly opening new sessions loses decision rationale and forces you to re-explain background; on complex, long-running projects this "context attrition" escalates into systemic loss of control.
  • A generational gap in retrieval: grep is dead. Traditional keyword matching (grep) cannot handle semantic relationships (searching "insomnia" will not surface "poor sleep quality"), and when invoked by an AI agent it burns an enormous amount of tokens and time.
  • Hybrid search is the engineering optimum. About 80% of cases should default to BM25 (deterministic, term-frequency-based search: fast and stable), with semantic search (embeddings) covering the blind spots and an LLM re-ranking at the end, balancing performance and recall.
  • Sessions as assets: conversations are knowledge. Automatically parsing AI conversation logs (JSONL) into Markdown and indexing them turns "past decisions" into "future prompts", giving the toolchain incremental learning.
  • Tools change, but context endures. Once you own your personal/team context index, you can migrate seamlessly across underlying models (Claude/GPT/Gemini) and keep your edge as the one in command.

What It Means for Us

👤ATou

  • What it means: your personal "digital estate" is scattered across Obsidian, Slack, and countless AI sessions, with no unified "recall" switch.
  • Next: evaluate and deploy QMD, and try wiring a personal Obsidian vault into the Claude Code skill library to build a personal long-term memory layer.

🧠Neta

  • What it means: a 20-person special-ops team fears "information silos" and "repeated decisions" most. If the AI could remember the team's failed overseas-growth lessons from six months ago, its value would double.
  • Next: pilot a "MemoryOps loop" in internal R&D or growth workflows, automatically indexing key decision sessions to reduce handoff friction across time zones and project phases.

🪞Uota

  • What it means: an agent's core advantage is not how well the prompt is written but how precisely the context is fed.
  • Next: introduce a "retrieval routing matrix" into Neta's agent architecture: structured material goes to BM25, unstructured history goes semantic, high-stakes decisions go hybrid.

Discussion Starters

1. If the AI could precisely remember an "unfulfilled dream" you mentioned offhand a year ago and proactively nudge you today to restart it, what would that experience mean for Neta's user retention?
2. Are we over-indexed on "semantic search"? For code and structured documents, is returning to BM25, "search by math", the higher-leverage way to cut costs and boost efficiency?
3. "Tools change, context doesn't": does our current team knowledge base actually support "cross-model migration"?



Every conversation with Claude Code starts from zero. Here's how I fixed that with a local search engine and a skill that loads your full context before you type a single word.

Every conversation with Claude Code starts from zero. I had 700 sessions in 3 weeks, and I don't remember what was happening back then. I was losing track of what was going on.

I just open a new terminal and then... what is going on? I have to somehow recover all this context: what the project was, which decisions were made. I start from zero every time.

It gets worse mid-session. When we hit the context limit at 60%, we need to compact or hand off, and half of the decisions are lost. Even worse, if I want to continue the next day, I don't remember what was happening.

The current paradigm of grepping over files doesn't scale. So I plugged QMD into my vault: a search engine by Tobias Lutke (@tobi), CEO of Shopify. Made in Canada.

The whole memory system I'm about to show you is a skill. Install it in 2 minutes and Claude Code already knows how to use it.

QMD: A Local Search Engine for Your Vault

QMD is a local search engine for your knowledge base. It indexes your Obsidian @obsdmd vault and finds anything in under a second.

For each vault folder I have a QMD collection: notes, daily entries, sessions, transcripts. A one-to-one mapping. For each collection I can run a focused search.

qmd vsearch "happy, grateful, excited" -c daily -n 5
qmd vsearch "energy, great day, feeling good" -c daily -n 5
qmd vsearch "satisfaction, accomplishment" -c daily -n 5

That's it. One command, your vault is searchable.

But searchable how? The default way Claude Code @claudeai searches is brute force: it sends a Haiku sub-agent to grep through every file. I tried it (watch the video): I searched for my notes about different graph approaches the normal way. It took 3 minutes; I was bored, scrolling Twitter while waiting. And the results? 300 files, not great.

QMD search was instant. Better results, way fewer tokens, and no sub-agent needed. The difference: grep matches strings, QMD ranks by relevance.

Grep vs BM25 vs Semantic

QMD gives you three search modes. BM25 (qmd search) is deterministic full-text search. Like grep, it matches exact keywords. Unlike grep, it scores each file: how often does the word appear, and how rare is it across all your documents? A short note mentioning "sleep" five times scores higher than a 10,000-word file where "sleep" appears once. No AI, no embeddings - just math. Semantic (qmd vsearch) uses embeddings to find meaning - you can search for a concept even if the exact words aren't there. Hybrid (qmd query) combines both.
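
That scoring idea fits in a few lines. The sketch below is generic Okapi BM25 with the usual k1/b defaults, not QMD's actual implementation; the two toy documents mirror the short-note-vs-long-file example above.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query using Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N  # average document length
    terms = query.lower().split()
    # document frequency: in how many docs does each term appear?
    df = {t: sum(1 for d in tokenized if t in d) for t in terms}
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for t in terms:
            if df[t] == 0:
                continue  # term appears nowhere, e.g. "insomnia"
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)  # rarity across the corpus
            freq = tf[t]
            score += idf * freq * (k1 + 1) / (freq + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

docs = [
    "sleep sleep sleep sleep sleep tracking note",  # short note, five mentions
    " ".join(["word"] * 9999 + ["sleep"]),          # ~10,000-word file, one mention
]
s = bm25_scores("sleep", docs)
print(s[0] > s[1])  # True: the short, focused note outranks the long file
```

No AI anywhere in that function, just term frequency, document frequency, and length normalization.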

I searched for "sleep" across my vault. Here's the difference.

Grep found 200 files. All over the place. It finds all the files which contain the word "sleep." It even finds sleep() - a programming command that pauses code execution. Nothing to do with actual sleep. That's the problem with string matching.

BM25 (2 seconds): a reflection about sleep quality. Experiment with tracking sleep fragmentation patterns. Sleep interrupted at 3am. Much better.

But qmd search "insomnia" returns zero results. The word doesn't exist in the vault.

Semantic search: I typed qmd vsearch "couldn't sleep, bad night" and it found a bedtime discipline goal I set years ago. It goes beyond the keywords to the meaning. Four of the five results don't contain the search words at all. Grep could never find them.
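
Under the hood, "exploring the meaning" means comparing embedding vectors, typically by cosine similarity. A minimal sketch with made-up 3-d vectors (real embeddings come from a model and have hundreds of dimensions; these numbers are invented purely to show the mechanics):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors: texts about the same concept land close together,
# even with zero shared keywords.
insomnia      = [0.9, 0.8, 0.1]   # "insomnia"
bad_sleep     = [0.85, 0.75, 0.2] # "couldn't sleep, bad night"
sleep_syscall = [0.1, 0.2, 0.95]  # sleep(), the programming command

print(cosine(insomnia, bad_sleep) > cosine(insomnia, sleep_syscall))  # True
```

That last comparison is exactly what grep gets wrong: to string matching, sleep() and a bad night look identical.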

Hybrid (qmd query "couldn't sleep, bad night"): ranks sleep quality improvement at 89%, sleep interrupted at 3am at 51%, health sleep optimization at 42%. The best ranking of all.
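
One common way to merge a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF). I don't know whether QMD fuses its two lists this way, so treat this as an illustration of the idea; the filenames are hypothetical:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists of doc ids into one list.

    A document earns 1/(k + rank + 1) from each list it appears in,
    so showing up in both the keyword and the semantic results beats
    topping just one of them.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists for "couldn't sleep, bad night":
bm25_hits = ["sleep-quality.md", "3am-interruption.md", "sleep-tracking.md"]
semantic_hits = ["bedtime-discipline.md", "sleep-quality.md", "health-optimization.md"]

fused = reciprocal_rank_fusion([bm25_hits, semantic_hits])
print(fused[0])  # sleep-quality.md: ranked by both searches, so it wins
```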

qmd collection list
qmd search "video workflow" -c notes -n 3

Start with BM25. Fast and handles structured notes. 80% of searches. Add semantic for transcripts and braindumps, where you'd never search for those exact words.

What semantic search actually surfaces

I searched my daily notes: "find the days when I was happy and what was the reason."

This is a non-trivial query. Claude adapted it and ran multiple searches:

Found semantically relevant connections across months of daily notes.

The pattern: my happiest days are when I ship something and I had good sleep recovery, like sauna or 9 hours of sleep.

Then it surfaced something I had completely forgotten. Back in October, I was writing my PhD thesis and about to give up. I needed to take it day by day: come in, write something. But what I realized back then was that I just needed to see it through the discomfort instead of escaping to quick fixes.

I didn't remember writing that. I didn't expect the search could surface this. But there it was, the exact citation.

/recall - load context before you start

/recall is a Claude Code skill that sits on top of QMD. It loads context before you start working - instead of explaining to Claude what you were doing, you tell it to recall.

It has three modes: temporal (scan your session history by date), topic (BM25 search across your QMD collections), and graph (interactive visualization of sessions and files).

https://youtu.be/RDoTY4_xh0s?t=167

/recall yesterday

/recall topic graph

/recall graph last week

Yesterday reconstructed 39 sessions from one day. Timeline, number of messages in each session, what was done when.

Topic searched for "QMD video" across sessions and notes. Returned the dashboard, production plan, to-do list. All related files surfaced in less than a minute. Compare that to the brute force approach: tell Claude "find all information about this project" and it sends Haiku to grep through your vault for 3 minutes, burns tokens, and comes back with worse results. With /recall topic, I reconstructed the full state of the project and asked: what is the next highest leverage action?

Graph opened an interactive visualization of my whole week. Sessions as colored blobs, older ones dimmer, recent ones highlighted in purple. Files clustered by type: goals, research, voice, docs, content, skills.

Here's an example: I was exploring lunch places. I told Claude "I want to have a great lunch" and we analyzed different places to go. I stored those as activities I might want to try. A week later, I open the graph, see that session, copy those file paths into Claude Code and continue from there. The graph makes every past conversation recoverable.

700 sessions, all searchable

Claude Code saves all your conversations as JSONL files on your computer. I had 700 sessions in the last 3 weeks. Each one has decisions, questions, context I'll need again.

The raw files contain tool uses, system prompts, roles. We parse them into clean Markdown, keeping the actual signal, the actual user messages, and embed that into the QMD index.
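
A sketch of that parsing step. It assumes each JSONL line is an event with a "type" and a "message" whose "content" is either a string or a list of typed blocks; inspect your own session files first, since the exact schema here is an assumption, not a spec.

```python
import json

def session_to_markdown(jsonl_path):
    """Convert a Claude Code session log (JSONL) to plain Markdown.

    Keeps only the conversational signal (user and assistant text)
    and drops tool calls and system machinery.
    """
    lines = []
    with open(jsonl_path) as f:
        for raw in f:
            event = json.loads(raw)
            if event.get("type") not in ("user", "assistant"):
                continue  # skip system events, tool results, etc.
            content = event.get("message", {}).get("content", "")
            # content may be a plain string or a list of typed blocks
            if isinstance(content, list):
                content = "\n".join(
                    block.get("text", "") for block in content
                    if isinstance(block, dict) and block.get("type") == "text"
                )
            if content.strip():
                lines.append(f"**{event['type']}**: {content.strip()}")
    return "\n\n".join(lines)
```

Feed the resulting Markdown files to QMD as their own collection and every past session becomes searchable alongside your notes.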

At the end of each session, when I close my terminal, a hook exports and embeds the session into QMD. So the index is always fresh. I can get back to anything, even from today.

https://github.com/tobi/qmd

Finding Ideas You Never Acted On

This is the one that surprised me. I searched: "find the ideas that I have never acted on."

Having Claude synthesize the raw QMD results is very useful - it's hard to parse the actual raw results yourself. Claude summarized what it found:

  • October 19th - I wanted to build a PhD writing dashboard but never did it

  • I had ideas for illustration-based apps but never followed through

  • I had an idea to record a screen recording about my full Obsidian workflow but never committed

And it's all local - all of these embeddings live on your computer.

Tools change. Your context stays.

Your notes stop being passive. They stop being trapped in your Obsidian world. They actually start doing things. All of your notes are useful context about yourself that you can use to achieve goals in your life.

Tools change. A month from now there will be new models. So what? If you have your context, you can make it work in any situation: Claude Code, Codex, Gemini CLI, anything.

The memory layer works as a skill across your whole stack. I use Obsidian Sync to keep my vault in sync between my Mac and a Mac Mini that's always on. On the Mac Mini, OpenClaw runs 24/7. So I pick up my phone, open OpenClaw, and I have the same vault, same QMD index, same skills - from anywhere.

https://memory-artemzhutov.netlify.app/

Next steps

Download the /recall skill, drop it in your .claude/skills/ folder, and you have the session pipeline and recall working today (or ask Claude to do it for you :) ).

QMD - the search engine behind all of this. By Tobias Lutke.

https://github.com/tobi/qmd

Watch the full video - 42 min walkthrough with live demos.

https://youtu.be/RDoTY4_xh0s

Artem
