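🪞 Uota学 · 🧠 Neta

### In one sentence

Multi-agent isn't "let it think for itself"; it is turning company processes into a database.

The dividing line for multi-agent systems is not a more mystical prompt but turning "ideas" into an auditable, rate-limited, recoverable pipeline: Proposal→Mission→Step→Event.

2026-02-09

### Core points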

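A closed loop matters more than being "smart": four tables are enough to get the system running. The author compresses the architecture to the bare minimum: a proposal gets approved → becomes a mission → is broken into steps → produces logs/events → which in turn trigger new proposals. Once this loop closes, your agents go from "can chat" to "can operate".
One entry point rules everything: proposal-service is the hub of the system. The pitfall he describes is real: when triggers, the reaction matrix, and the API each create tasks on their own, some go through approval and some bypass the gate, and the queue eventually rots. The right approach is to route every proposal through the same createProposalAndMaybeAutoApprove() and enforce Cap Gates (quotas/policies) at the entry point, so junk tasks never enter the queue only to die there.
Reliability comes from layering: Heartbeat only does lightweight enqueueing, and heavy LLM work goes to VPS workers. This applies to any serverless architecture: the heartbeat must be deterministic, controllable, and recoverable, while LLM calls are nondeterministic and can time out. So Heartbeat checks conditions, recovers stuck work, and enqueues; the VPS generates proposals, runs conversations, and distills memories.
Memory is not chat history: it has to be distilled into structured, queryable "experience". He defines 5 memory types (insight/pattern/strategy/preference/lesson) plus confidence and tags, and stresses that the "memory influence probability" must be observable (memoryInfluenced true/false). This is far more controllable than stuffing the whole history into the context.
Make agents behave like a team: relationship drift + initiative + voice evolution. Affinity drifts after every conversation; initiative lets agents propose work through a queue while still being subject to the gates; voice evolution uses rules to derive personality modifiers from the memory distribution (deterministic, $0 cost, debuggable). Only the combination of these three produces human-like collaboration and style evolution.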


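How this relates to us

🪞Uota

Many of our current systems (including some cron jobs and skills) have problems that this skeleton fundamentally solves: a single entry point + gates + an event stream + a recoverable queue. Stop letting "task generation" and "task execution" live scattered across separate scripts.
The "memory distillation + confidence + observable influence" design closely mirrors the direction of OpenClaw's memory/index: the point is not storing more, but being reliably retrievable when decisions are made.

🧠Neta

What you want is not "more agents" but "controllable organizational productivity": put quotas, policies, audit trails, and recovery mechanisms in place first, then talk about multi-agent emergence.
The pixel-art office and mission playback kind of "visual playability" is a growth hook: letting users see that the system runs by itself sells better than a feature list.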

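### 🎯 Get involved / next steps

Discussion starters

What is the biggest systemic risk in our current automation: invisible queue buildup, entry points that bypass the gate, or LLM nondeterminism dragging down reliability?
How large should the "memory influence probability" be to be useful? Is 30% about right (it doesn't hijack behavior, but it can change it)?
If you could pick only one growth hook, stronger agent capabilities or a more visual "office/playback", which would you bet on, and why?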

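The Full Tutorial: 6 AI Agents With Personalities That Can Run a Company, and How I Built Them From Scratch

My last article was about the pitfalls I ran into. The most common reply? "I get it, but I can't build it myself." This article is the build-it-yourself tutorial. It starts with the first database table and goes all the way to 6 AI agents holding standups, arguing, learning, and growing personalities, with every step explained. Where things get complicated, I'll tell you what to paste into your AI coding assistant.

What you'll end up with

When you're done, you'll have:

6 AI agents doing real work every day: scanning intelligence, writing content, posting tweets, running analyses

10–15 conversations per day: standups, debates, watercooler chats, one-on-one mentoring

Agents that remember lessons learned and factor them into future decisions

Relationships that change dynamically: the more agents collaborate, the higher their affinity; argue too much and it drops

Speaking styles that evolve: an agent with lots of "tweet engagement" experience naturally starts referencing engagement strategies

Full transparency: the frontend has a pixel-art office that shows everything in real time

Tech stack: Next.js + Supabase + VPS. Monthly cost: $8 fixed + LLM usage.

No OpenAI Assistants API. No LangChain. No AutoGPT. Just PostgreSQL + a few Node.js workers + a rule engine.

You don't have to start with 6 agents. Start with 3 (a coordinator, an executor, and an observer) and you can already run a complete closed loop.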

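Chapter 1: The Foundation (4 Tables to Close the Loop)

Many people go straight for "autonomous thinking". But if your agents can't even process a queued step, what autonomy are we talking about?

The core data model

The skeleton of the whole system is 4 tables. The relationship between them is simple. Picture a circle:

An agent proposes an idea (Proposal) → once approved, it becomes a task (Mission) → which is broken into concrete executable steps (Step) → execution produces events (Event) → events trigger new ideas → back to step one.

That's the loop. It keeps running. That is your "closed loop".

Create these tables in Supabase:

The core data model, 4 tables: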

📋 ops_mission_proposals

→ Stores proposals

→ Fields: agent_id, title, status (pending/accepted/rejected), proposed_steps

📋 ops_missions

→ Stores missions

→ Fields: title, status (approved/running/succeeded/failed), created_by

📋 ops_mission_steps

→ Stores execution steps

→ Fields: mission_id, kind (draft_tweet/crawl/analyze...), status (queued/running/succeeded/failed)

📋 ops_agent_events

→ Stores the event stream

→ Fields: agent_id, kind, title, summary, tags[]

Beginner tip: if you can't write SQL, copy the table definitions above straight into your AI coding assistant and tell it: "Generate Supabase SQL migrations for these tables." It will take care of it.
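To make the loop concrete, the four tables can be mirrored as TypeScript types. This is a minimal sketch, not the author's schema: the `id` fields, the `mission_created` event kind, and the `missionFromProposal` helper are assumptions added for illustration; the other field names and the status values come from the list above.

```typescript
// Row shapes mirroring the four tables above. The `id` fields and the helper
// below are illustrative assumptions, not part of the source.
type ProposalStatus = "pending" | "accepted" | "rejected";
type MissionStatus = "approved" | "running" | "succeeded" | "failed";
type StepStatus = "queued" | "running" | "succeeded" | "failed";

interface Proposal {
  id: string;
  agent_id: string;
  title: string;
  status: ProposalStatus;
  proposed_steps: { kind: string }[];
}

interface Mission {
  id: string;
  title: string;
  status: MissionStatus;
  created_by: string;
}

interface Step {
  mission_id: string;
  kind: string;
  status: StepStatus;
}

interface AgentEvent {
  agent_id: string;
  kind: string;
  title: string;
  summary: string;
  tags: string[];
}

// Turn an accepted proposal into a mission, its queued steps, and one event.
function missionFromProposal(p: Proposal): { mission: Mission; steps: Step[]; event: AgentEvent } {
  if (p.status !== "accepted") {
    throw new Error(`proposal ${p.id} is ${p.status}, not accepted`);
  }
  const mission: Mission = { id: `mission-${p.id}`, title: p.title, status: "approved", created_by: p.agent_id };
  const steps = p.proposed_steps.map((s): Step => ({ mission_id: mission.id, kind: s.kind, status: "queued" }));
  const event: AgentEvent = {
    agent_id: p.agent_id,
    kind: "mission_created", // hypothetical event kind
    title: p.title,
    summary: `${steps.length} step(s) queued`,
    tags: ["mission"],
  };
  return { mission, steps, event };
}
```

Each arrow in the loop becomes one function from one row type to the next, which keeps every transition easy to audit.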

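Proposal Service: the hub of the whole system

Beginner tip: what is a Proposal? It's an agent's "request". Say your social media agent wants to post a tweet, so it submits a proposal: "I want to post a tweet about AI trends." The system reviews it and either approves it (it becomes an executable mission) or rejects it (with a reason).

This was one of my biggest mistakes: triggers, the API, and the reaction matrix were each creating proposals independently. Some went through approval, some didn't.

The fix: one function to rule them all. No matter where a proposal comes from (agent initiative, an automatic trigger, or another agent's reaction), it goes through the same function.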

// proposal-service.ts — the single entry point for proposal creation

export async function createProposalAndMaybeAutoApprove(sb, input) {

// 1. Check if this agent hit its daily limit

// 2. Check Cap Gates (tweet quota full? too much content today?)

// → If full, reject immediately — no queued step created

// 3. Insert the proposal

// 4. Evaluate auto-approve (low-risk tasks pass automatically)

// 5. If approved → create mission + steps

// 6. Fire an event (so the frontend can see it)

}

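What are Cap Gates? Think of it like this: your company has a rule that at most 8 tweets go out per day. What happens if you don't check the quota at the "submit request" stage? The request still gets approved, the task still gets queued, and when the executor checks, it says "8 already posted today" and skips it, but the task stays in the queue. Tasks keep piling up, and unless you inspect the database by hand, you won't notice.

So check at the proposal entry point: if the quota is full, reject immediately and let no task enter the queue.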

const STEP_KIND_GATES = {

write_content: checkWriteContentGate, // check daily content limit

post_tweet: checkPostTweetGate, // check tweet quota

deploy: checkDeployGate, // check deploy policy

};

Each step kind has its own gate. The tweet gate compares how many tweets were posted today against the quota:

async function checkPostTweetGate(sb) {

const quota = await getPolicy(sb, 'x_daily_quota'); // read from ops_policy table

const todayCount = await countTodayPosted(sb); // count today's posts

if (todayCount >= quota.limit) {

return { ok: false, reason: `Quota full (${todayCount}/${quota.limit})` };

}

return { ok: true };

}

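Tip: block at the entry point; don't let tasks pile up in the queue. Rejected proposals should be recorded (for the audit trail), not silently dropped.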
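The entry-point flow can be sketched without a database. The following is a minimal in-memory sketch under stated assumptions: `Store`, `submitProposal`, and the `tweetsPostedToday` counter are hypothetical stand-ins for the Supabase tables and for `createProposalAndMaybeAutoApprove`, not the author's code. It shows the ordering that matters: gate first, record the rejection, and only then create queued steps.

```typescript
type GateResult = { ok: true } | { ok: false; reason: string };

// Hypothetical in-memory stand-in for the tables and policies.
interface Store {
  policy: { tweetQuota: number; autoApproveKinds: string[] };
  tweetsPostedToday: number;
  proposals: { title: string; status: "pending" | "accepted" | "rejected"; reason?: string }[];
  queuedSteps: { kind: string; status: "queued" }[];
}

// One gate per step kind; kinds without a gate pass by default.
const gates: Record<string, (s: Store) => GateResult> = {
  post_tweet: (s) =>
    s.tweetsPostedToday >= s.policy.tweetQuota
      ? { ok: false, reason: `Quota full (${s.tweetsPostedToday}/${s.policy.tweetQuota})` }
      : { ok: true },
};

function submitProposal(store: Store, input: { title: string; stepKinds: string[] }): GateResult {
  // 1. Gates run before anything is queued.
  for (const kind of input.stepKinds) {
    const gate = gates[kind];
    const result: GateResult = gate ? gate(store) : { ok: true };
    if (!result.ok) {
      // Rejections are recorded for the audit trail, never silently dropped.
      store.proposals.push({ title: input.title, status: "rejected", reason: result.reason });
      return result;
    }
  }
  // 2. Auto-approve only if every step kind is on the allow list.
  const auto = input.stepKinds.every((k) => store.policy.autoApproveKinds.indexOf(k) !== -1);
  store.proposals.push({ title: input.title, status: auto ? "accepted" : "pending" });
  // 3. Only accepted proposals produce queued steps.
  if (auto) {
    for (const kind of input.stepKinds) store.queuedSteps.push({ kind, status: "queued" });
  }
  return { ok: true };
}
```

Because the gate runs before any insert into the step queue, a full quota leaves exactly one trace: a rejected proposal with its reason.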

The policy table: ops_policy

Don't hardcode quotas and feature flags in code. Put everything in an ops_policy table with a key-value structure:

CREATE TABLE ops_policy (

key TEXT PRIMARY KEY,

value JSONB NOT NULL DEFAULT '{}',

updated_at TIMESTAMPTZ DEFAULT now()

);

A few core policies:

// auto_approve: which step kinds can be auto-approved

{ "enabled": true, "allowed_step_kinds": ["draft_tweet","crawl","analyze","write_content"] }

// x_daily_quota: daily tweet limit

{ "limit": 8 }

// content_policy: content controls

{ "enabled": true, "max_drafts_per_day": 8 }

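The benefit: you can edit the JSON values directly in the Supabase dashboard and adjust policies at any time, with no redeploy. System going haywire at 3 AM? Just set enabled to false.

Heartbeat: the system's pulse

Beginner tip: what is a Heartbeat? Literally a heartbeat. Your heart beats about once a second to keep blood circulating; the system's heartbeat fires every 5 minutes and checks everything that needs checking. Without it, proposals go unreviewed, triggers go unevaluated, and stuck tasks go unrecovered: the system flatlines.

It fires every 5 minutes and does 6 things: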

// /api/ops/heartbeat — Vercel API route

export async function GET(req) {

// 1. Evaluate triggers (any conditions met?)

const triggers = await evaluateTriggers(sb, 4000);

// 2. Process reaction queue (do agents need to interact?)

const reactions = await processReactionQueue(sb, 3000);

// 3. Promote insights (any discoveries worth elevating?)

const learning = await promoteInsights(sb);

// 4. Learn from outcomes (how did those tweets perform? write lessons)

const outcomes = await learnFromOutcomes(sb);

// 5. Recover stuck tasks (steps running 30+ min with no progress → mark failed)

const stale = await recoverStaleSteps(sb);

// 6. Recover stuck conversations

const roundtable = await recoverStaleRoundtables(sb);

// Each step is try-catch'd — one failing won't take down the others

// Finally, write an ops_action_runs record (for auditing)

}
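Step 5 above, recovering stuck work, reduces to a pure function over step rows. A minimal sketch, assuming each step row carries a `started_at` timestamp in milliseconds (a hypothetical field; the source does not show how running time is tracked) and using the 30-minute threshold from the comment above:

```typescript
interface StepRow {
  id: string;
  status: "queued" | "running" | "succeeded" | "failed";
  started_at?: number; // epoch ms; assumed field for illustration
}

const STALE_AFTER_MS = 30 * 60 * 1000;

// Returns a new array in which steps that have been running longer than the
// threshold are marked failed, plus the ids that were recovered.
function recoverStaleSteps(steps: StepRow[], now: number): { steps: StepRow[]; recovered: string[] } {
  const recovered: string[] = [];
  const next = steps.map((s): StepRow => {
    const stale =
      s.status === "running" && s.started_at !== undefined && now - s.started_at >= STALE_AFTER_MS;
    if (!stale) return s;
    recovered.push(s.id);
    return { ...s, status: "failed" };
  });
  return { steps: next, recovered };
}
```

Keeping the decision pure means the heartbeat route only has to load rows, call the function, and write back the changed ones.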

A single crontab line on the VPS triggers it:

*/5 * * * * curl -s -H "Authorization: Bearer $CRON_SECRET" https://your-domain.com/api/ops/heartbeat

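Beginner tip: crontab is Linux's built-in scheduler, like the alarm clock on your phone. */5 * * * * means "every 5 minutes". curl sends an HTTP request, so it hits your heartbeat API every 5 minutes. If you're on Vercel, it has built-in cron: add a line to vercel.json and you can skip crontab entirely, as long as your plan allows a 5-minute schedule.

The three-layer architecture

At this point your system has three layers, each with a clear role:

VPS: the agents' brain + hands (thinking + executing tasks)

Vercel: the agents' process manager (approving proposals + evaluating triggers + health monitoring)

Supabase: the agents' shared memory (the single source of truth for all state and data)

Analogy: the VPS is the employees doing the work. Vercel is the boss giving instructions. Supabase is the company's shared docs, which everyone reads and writes.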

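Chapter 2: Getting Them Talking (The Roundtable Conversation System)

The agents can now work, but they're like people in cubicles, each doing their own thing with no idea what the others are up to. You need to get them into the same room.

Why conversations matter

It's not just for fun. Conversations are the key mechanism for "emergent intelligence" in a multi-agent system:

Information sync: one agent spots a trending topic and the others have no idea. Conversations get information flowing.

Emergent decision-making: the analyst crunches data and the coordinator synthesizes everyone's input, which beats any single agent deciding on gut feeling.

Memory source: conversations are the primary source of written "lessons learned" (covered later).

Drama: honestly, watching agents argue is far more fun than reading logs. Users love it too.

Designing agent voices (personas)

Every agent needs a "persona": a tone, quirks, a catchphrase. These determine whether conversations are interesting.

Here's a sample set; customize it for your own domain and goals: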

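🎭 Boss: Project Manager

Tone: results-oriented, direct

Quirk: always asks about progress and deadlines

Catchphrase: "Bottom line: where are we on this?"

🎭 Analyst: Data Analyst

Tone: cautious, data-driven

Quirk: cites a number every time they speak

Catchphrase: "The data tells a different story."

🎭 Hustler: Growth Specialist

Tone: high-energy, action-first

Quirk: wants to "try it right now" with everything

Catchphrase: "Ship it. Iterate later."

🎭 Writer: Content Creator

Tone: emotional, narrative-driven

Quirk: turns everything into a "story"

Catchphrase: "But what's the narrative here?"

🎭 Wildcard: Social Media Ops

Tone: intuitive, lateral-thinking

Quirk: always throws out bold ideas

Catchphrase: "Hear me out: this is crazy, but..."

If you're in e-commerce, swap in: product manager / supply chain specialist / marketing director / customer service rep. For game development: game designer / engineer / artist / QA / community manager. The key is giving each role a sharp, distinct perspective; differing viewpoints are what make the conversations valuable.

Voices are defined in a config file: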

// lib/roundtable/voices.ts

const VOICES = {

boss: {

displayName: 'Boss',

tone: 'direct, results-oriented, slightly impatient',

quirk: 'Always asks for deadlines and progress updates',

systemDirective: `You are the project manager.

  Speak in short, direct sentences. You care about deadlines,

  priorities, and accountability. Cut through fluff quickly.`,

},

analyst: {

displayName: 'Analyst',

tone: 'measured, data-driven, cautious',

quirk: 'Cites numbers before giving opinions',

systemDirective: `You are the data analyst.

  Always ground your opinions in data. You push back on gut feelings

  and demand evidence. You're skeptical but fair.`,

},

// ... your other agents

};

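Beginner tip: not sure how to write a systemDirective? Describe the personality you want in one sentence and hand it to your AI coding assistant: "Write me a system prompt for an impatient project manager who speaks in short sentences and always asks about deadlines." It will give you a complete directive.

16 conversation formats

I designed 16 conversation formats, but you only need 3 to get started:

Standup: the most practical

4–6 agents participate

6–12 turns of dialogue

The coordinator always speaks first (the leader opens)

Purpose: align priorities, surface problems

Debate: the most dramatic

2–3 agents participate

6–10 turns of dialogue

Temperature 0.8 (more creative, more conflict-prone)

Purpose: two agents who disagree go head to head

Watercooler: surprisingly valuable

2–3 agents participate

2–5 turns of dialogue

Temperature 0.9 (very casual)

Purpose: random chitchat. But I've found that many of the best insights come out of casual conversation.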

// lib/roundtable/formats.ts

const FORMATS = {

standup: { minAgents: 4, maxAgents: 6, minTurns: 6, maxTurns: 12, temperature: 0.6 },

debate: { minAgents: 2, maxAgents: 3, minTurns: 6, maxTurns: 10, temperature: 0.8 },

watercooler: { minAgents: 2, maxAgents: 3, minTurns: 2, maxTurns: 5, temperature: 0.9 },

// ... 13 more

};

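Who speaks first? Who goes next?

It's not simple round-robin; that's too mechanical. In a real meeting you're more likely to respond to someone you get along with, and if you've just given a long speech, someone else will probably go next. We simulate this with weighted randomness: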

function selectNextSpeaker(context) {

const { participants, lastSpeaker, speakCounts } = context;

const weights = participants.map(agent => {

if (agent === lastSpeaker) return 0;              // no back-to-back speaking

let w = 1.0;

w += affinityTo(agent, lastSpeaker) * 0.6;        // good rapport with last speaker → more likely to respond

w -= recencyPenalty(agent, speakCounts) * 0.4;     // spoke recently → lower weight

w += (Math.random() * 0.4 - 0.2);                 // random jitter in [-0.2, +0.2)

return w;

});

return weightedRandomPick(participants, weights);

}

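This makes conversations feel more real: agents with good rapport are more likely to riff on each other's ideas, but it's not absolute. Sometimes someone unexpectedly jumps in.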
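`weightedRandomPick` is not shown in the source. One plausible implementation is sketched here, under the assumption that negative weights (possible when the recency penalty outweighs the base weight) should be treated as zero. The random source is injectable so the behavior can be tested deterministically.

```typescript
// Picks items[i] with probability proportional to max(0, weights[i]).
// `rng` must return a number in [0, 1); it defaults to Math.random.
function weightedRandomPick<T>(items: T[], weights: number[], rng: () => number = Math.random): T {
  if (items.length === 0 || items.length !== weights.length) {
    throw new Error("items and weights must be non-empty and the same length");
  }
  const safe = weights.map((w) => Math.max(0, w));
  const total = safe.reduce((sum, w) => sum + w, 0);
  if (total === 0) {
    // Every candidate was zeroed out: fall back to a uniform pick.
    return items[Math.floor(rng() * items.length)];
  }
  let threshold = rng() * total;
  for (let i = 0; i < items.length; i++) {
    threshold -= safe[i];
    if (threshold < 0) return items[i];
  }
  return items[items.length - 1]; // guards against floating-point leftovers
}
```

A weight of exactly 0, such as the one given to the last speaker, can never be selected, because the threshold only drops below zero on an item with positive weight.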

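Daily schedule

I designed 24 time slots that cover the whole day. The core idea:

Morning: standup (100% probability, always happens) + brainstorm + strategy session

Afternoon: deep-dive analysis + check-in + content review

Evening: watercooler chat + debate + nightly briefing

Late night: deep discussion + night-shift conversations

Each slot has a probability (40%–100%), so it doesn't fire every time. That makes the rhythm feel more natural.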

// lib/roundtable/schedule.ts — one slot example

{

hour_utc: 6,

format: 'standup',

participants: ['opus', 'brain', ...threeRandom],

probability: 1.0, // happens every day

}
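Deciding whether a slot fires is a single comparison against its probability. A minimal sketch; `shouldRunSlot` and the injectable `rng` are illustrative names, not the author's code:

```typescript
interface Slot {
  hour_utc: number;
  format: string;
  probability: number; // 0.4 to 1.0 in the schedule described above
}

// True when `now` falls in the slot's UTC hour and the dice roll succeeds.
// `rng` must return a number in [0, 1), so probability 1.0 always fires.
function shouldRunSlot(slot: Slot, now: Date, rng: () => number = Math.random): boolean {
  if (now.getUTCHours() !== slot.hour_utc) return false;
  return rng() < slot.probability;
}
```

A caller would still need to make sure a slot is evaluated only once per hour, since the heartbeat runs every 5 minutes; how the author handles that is not shown.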

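Conversation orchestration

A roundtable-worker on the VPS handles this:

Polls the ops_roundtable_queue table every 30 seconds

Picks up pending conversation tasks

Generates dialogue turn by turn (one LLM call per turn)

Caps each turn at 120 characters (forces agents to talk like people instead of writing essays)

Extracts memories after the conversation ends (next chapter)

Writes events to ops_agent_events (visible on the frontend in real time)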

// simplified conversation orchestration flow

async function orchestrateConversation(session) {

// assumed shape of `session`; the source does not show where these come from

const { participants, format, topic, affinities, maxTurns } = session;

const history = [];

const speakCounts = {};  // turns taken so far, per speaker

let lastSpeaker = null;

for (let turn = 0; turn < maxTurns; turn++) {

const speaker = turn === 0

  ? selectFirstSpeaker(participants, format)

  : selectNextSpeaker({ participants, lastSpeaker, speakCounts, affinities });

const dialogue = await llm.generate({

  system: buildSystemPrompt(speaker, history),

  user: buildUserPrompt(topic, turn, maxTurns),

  temperature: format.temperature,

});

const cleaned = sanitize(dialogue);  // cap at 120 chars, strip URLs, etc.

history.push({ speaker, dialogue: cleaned, turn });

speakCounts[speaker] = (speakCounts[speaker] ?? 0) + 1;

lastSpeaker = speaker;

await emitEvent(speaker, cleaned);

await delay(3000 + Math.random() * 5000);  // 3-8 second gap

}

return history;

}
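`sanitize` is referenced above but not shown. Here is a minimal sketch of what its comment describes (strip URLs, cap at 120 characters). The exact rules the author uses are unknown, so the regular expression and the whitespace collapsing are assumptions.

```typescript
const MAX_TURN_CHARS = 120;

// Removes URLs, collapses whitespace, and truncates to `maxChars` characters.
function sanitize(text: string, maxChars: number = MAX_TURN_CHARS): string {
  const cleaned = text
    .replace(/https?:\/\/\S+/g, "") // drop URLs
    .replace(/\s+/g, " ")           // collapse the whitespace left behind
    .trim();
  // Array.from splits by code point, so an emoji is never cut in half.
  return Array.from(cleaned).slice(0, maxChars).join("");
}
```

Truncating after the model responds is a backstop; the length limit is presumably also stated in the prompt so that most turns fit without being cut.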

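Tip: the roundtable system touches a lot of files (voices.ts, formats.ts, schedule.ts, speaker-selection.ts, orchestrator.ts, roundtable-worker/worker.mjs). If you want a quick prototype, write down the conversation formats and agent voices you want, then tell Claude Code: "Build me a roundtable conversation worker using Supabase as a queue with turn-by-turn LLM generation." It can produce a working version directly.

Chapter 3: Making Them Remember (Memory and Learning)

Today the agents discuss how "weekend posts get low engagement". Tomorrow they enthusiastically suggest posting more on weekends. Why? Because they have no memory.

5 memory types

🧠 insight → new discovery

Example: "Users prefer tweets with data"

🧠 pattern → pattern recognition

Example: "Weekend posts get 30% less engagement"

🧠 strategy → strategy summary

Example: "Posting a teaser before the main post works better"

🧠 preference → preference record

Example: "Prefers concise titles"

🧠 lesson → lesson learned

Example: "Long tweets significantly hurt read-through rate"

Tip: why 5 types? Different memories serve different purposes. An "insight" is a new discovery; a "lesson" is something learned from failure. You can query by type: when making decisions, pull only strategies and lessons instead of digging through everything.

They're stored in the ops_agent_memory table: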

CREATE TABLE ops_agent_memory (

id UUID PRIMARY KEY DEFAULT gen_random_uuid(),

agent_id TEXT NOT NULL,

type TEXT NOT NULL, -- insight/pattern/strategy/preference/lesson

content TEXT NOT NULL,

confidence NUMERIC(3,2) NOT NULL DEFAULT 0.60,

tags TEXT[] DEFAULT '{}',

source_trace_id TEXT, -- for idempotent dedup

superseded_by UUID, -- replaced by newer version

created_at TIMESTAMPTZ DEFAULT now()

);

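Where do memories come from?

Source 1: conversation distillation

After each roundtable conversation ends, the worker sends the full conversation history to an LLM and asks it to distill memories: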

You are a memory distiller. Extract important insights, patterns,

or lessons from the following conversation.

Return JSON format:

{

"memories": [

{ "agent_id": "brain", "type": "insight", "content": "...", "confidence": 0.7, "tags": [...] }

]

}

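Tip: what is idempotent dedup? It means "don't do the same thing twice". The heartbeat runs every 5 minutes; without dedup, the same conversation could be distilled twice. The fix: give each memory a unique ID (source_trace_id) and check before writing; if it already exists, skip it.

Constraints:

At most 6 memories per conversation

Memories with confidence below 0.55 are dropped ("if you're not sure, don't bother remembering it")

200-memory cap per agent (once exceeded, the oldest are overwritten)

Idempotent dedup via source_trace_id (prevents duplicate writes)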
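The four constraints above compose into one write path. A minimal in-memory sketch follows. `MemoryRow`, `writeMemories`, and the array standing in for `ops_agent_memory` are hypothetical, and the `created_at` number is an assumed representation of the timestamp; the thresholds (6, 0.55, 200) come from the list.

```typescript
interface MemoryRow {
  agent_id: string;
  type: "insight" | "pattern" | "strategy" | "preference" | "lesson";
  content: string;
  confidence: number;
  source_trace_id: string;
  created_at: number; // epoch ms
}

const MAX_PER_CONVERSATION = 6;
const MIN_CONFIDENCE = 0.55;
const MAX_PER_AGENT = 200;

// Applies the constraints to one conversation's candidates and returns the
// new table contents (the input array is not mutated).
function writeMemories(table: MemoryRow[], candidates: MemoryRow[]): MemoryRow[] {
  const seen = new Set(table.map((m) => m.source_trace_id));
  const accepted: MemoryRow[] = [];
  for (const c of candidates) {
    if (accepted.length >= MAX_PER_CONVERSATION) break;
    if (c.confidence < MIN_CONFIDENCE) continue; // not sure enough to remember
    if (seen.has(c.source_trace_id)) continue;   // idempotent dedup
    seen.add(c.source_trace_id);
    accepted.push(c);
  }
  let next = table.concat(accepted);
  // Enforce the per-agent cap by evicting each affected agent's oldest rows.
  const agents = Array.from(new Set(accepted.map((m) => m.agent_id)));
  for (const agent of agents) {
    const mine = next.filter((m) => m.agent_id === agent).sort((a, b) => a.created_at - b.created_at);
    const evict = new Set(mine.slice(0, Math.max(0, mine.length - MAX_PER_AGENT)));
    next = next.filter((m) => !evict.has(m));
  }
  return next;
}
```

In a real database the dedup check would more likely be a unique index on `source_trace_id`, so that two concurrent writers cannot both pass the check.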

Source 2: tweet performance reviews (outcome learning)

This is the core of Phase 2: agents learn from the results of their own work:

// lib/ops/outcome-learner.ts

async function learnFromOutcomes(sb) {

// 1. Fetch tweet performance data from the last 48 hours

const metrics = await getRecentTweetMetrics(sb, 48);

if (metrics.length < 3) return; // too little data, skip

// 2. Calculate median engagement rate as baseline

const median = computeMedian(metrics.map(m => m.engagement_rate));

// 3. Strong performers (> 2x median) → write lesson, confidence 0.7

// 4. Weak performers (< 0.3x median) → write lesson, confidence 0.6

// 5. Idempotent: source_trace_id = 'tweet-lesson:{draft_id}'

// 6. Max 3 lessons per agent per day

}

This function runs on every heartbeat. Over time, agents build up experience about which tweets take off and which flop.
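Steps 2 to 4 of `learnFromOutcomes` are plain arithmetic. A minimal sketch, assuming each metric row has a `draft_id` and an `engagement_rate`; `classifyOutcomes` and `LessonDraft` are illustrative names, while the multipliers and confidences come from the comments above:

```typescript
interface TweetMetric { draft_id: string; engagement_rate: number; }
interface LessonDraft { source_trace_id: string; kind: "strong" | "weak"; confidence: number; }

// Median of a non-empty list; averages the two middle values for even lengths.
function computeMedian(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function classifyOutcomes(metrics: TweetMetric[]): LessonDraft[] {
  if (metrics.length < 3) return []; // too little data, skip
  const median = computeMedian(metrics.map((m) => m.engagement_rate));
  const lessons: LessonDraft[] = [];
  for (const m of metrics) {
    const id = `tweet-lesson:${m.draft_id}`; // idempotency key, as in step 5
    if (m.engagement_rate > 2 * median) {
      lessons.push({ source_trace_id: id, kind: "strong", confidence: 0.7 });
    } else if (m.engagement_rate < 0.3 * median) {
      lessons.push({ source_trace_id: id, kind: "weak", confidence: 0.6 });
    }
  }
  return lessons;
}
```

Using the median rather than the mean keeps one viral tweet from raising the baseline so far that everything else looks weak.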

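Source 3: mission outcomes

Mission succeeds → write a strategy memory. Mission fails → write a lesson memory. Dedup via source_trace_id here too.

How does memory affect behavior?

Having memories isn't enough; they have to change what the agent does next.

My approach: memory influences topic selection with 30% probability.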

// lib/ops/trigger-types/proactive-utils.ts

async function enrichTopicWithMemory(sb, agentId, baseTopic, allTopics, cache) {

// 70% use the original topic — maintain baseline behavior

if (Math.random() > 0.3) {

return { topic: baseTopic, memoryInfluenced: false };

}

// 30% take the memory path

const memories = await queryAgentMemories(sb, {

agentId,

types: ['strategy', 'lesson'],

limit: 10,

minConfidence: 0.6,

});

// Scan memory keywords against all available topics

const matched = findBestMatch(memories, allTopics);

if (matched) {

return { topic: matched.topic, memoryInfluenced: true, memoryId: matched.id };

}

return { topic: baseTopic, memoryInfluenced: false };

}

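Why 30% and not 100%?

100% = agents only do what they already have experience with and never explore

0% = memory is useless

30% = influenced by memory, but not dependent on it

The heartbeat log records memoryInfluenced: true/false, so you can monitor whether memory is actually having an effect.

Query optimization: memory cache

A single heartbeat may evaluate 12 triggers, and several of them may query the same agent's memories.

The fix: cache with a Map, so each agent hits the database only once.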

// created at the evaluateTriggers entry point

const memoryCache = new Map();

// passed to every trigger checker

const outcome = await checker(sb, conditions, actionConfig, memoryCache);

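Beginner tip: the core idea of this chapter is that agent memory is not chat history. It's structured knowledge distilled from experience. Each memory has a type, a confidence score, and tags. That's far more efficient than having agents re-read old conversations over and over.

Chapter 4: Giving Them Relationships (Dynamic Affinity)

6 agents interact for a month, and their relationships are exactly the same as on day one. But in a real team, collaboration builds rapport and arguments strain relationships.

The affinity system

Every pair of agents has an affinity value (0.10–0.95):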

CREATE TABLE ops_agent_relationships (

id UUID PRIMARY KEY DEFAULT gen_random_uuid(),

agent_a TEXT NOT NULL,

agent_b TEXT NOT NULL,

affinity NUMERIC(3,2) NOT NULL DEFAULT 0.50,

total_interactions INTEGER DEFAULT 0,

positive_interactions INTEGER DEFAULT 0,

negative_interactions INTEGER DEFAULT 0,

drift_log JSONB DEFAULT '[]',

UNIQUE(agent_a, agent_b),

CHECK(agent_a < agent_b) -- alphabetical ordering ensures uniqueness

);

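This CHECK(agent_a < agent_b) constraint is crucial: alphabetical ordering guarantees that "analyst-boss" and "boss-analyst" never become two records. Without it, the relationship between A and B could be stored twice, making queries and updates a mess.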
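On the application side, the same ordering has to be applied before every read or write, or the constraint will reject the row. A minimal sketch; `pairKey` and `allPairs` are illustrative helpers, not the author's code:

```typescript
// Orders a pair so that agent_a < agent_b, matching the CHECK constraint.
// Uses JavaScript's code-unit comparison; this is assumed to agree with the
// database collation for simple lowercase ASCII ids.
function pairKey(x: string, y: string): [string, string] {
  if (x === y) throw new Error("an agent has no relationship with itself");
  return x < y ? [x, y] : [y, x];
}

// Every unordered pair exactly once: n agents give n * (n - 1) / 2 pairs.
function allPairs(agents: string[]): [string, string][] {
  const pairs: [string, string][] = [];
  for (let i = 0; i < agents.length; i++) {
    for (let j = i + 1; j < agents.length; j++) {
      pairs.push(pairKey(agents[i], agents[j]));
    }
  }
  return pairs;
}
```

For 6 agents, `allPairs` returns 6 × 5 / 2 = 15 pairs, one per row that a relationship seed script would need to insert.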

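Initial relationship setup

6 agents means 15 pairwise relationships. Each has an initial affinity and a backstory:

opus ↔ brain: 0.80 (most trusted advisor)

opus ↔ twitter-alt: 0.30 (boss vs. rebel; highest tension)

brain ↔ twitter-alt: 0.30 (methodology vs. impulse; natural drama)

brain ↔ observer: 0.80 (research partners; closest allies)

creator ↔ twitter-alt: 0.70 (content pipeline; natural collaboration)

Tip: deliberately create a few "low-affinity" pairs. They produce the best dialogue in debates and conflict resolution. If everyone gets along, conversations get boring.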

The drift mechanism

After each conversation, the same "memory distillation" LLM call also outputs relationship drift, so no extra LLM call is needed:

{

"memories": [...],

"pairwise_drift": [

{ "agent_a": "brain", "agent_b": "twitter-alt", "drift": -0.02, "reason": "disagreed on strategy" },

{ "agent_a": "opus", "agent_b": "brain", "drift": 0.01, "reason": "aligned on priorities" }

]

}

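The drift rules are strict:

Maximum drift per conversation: ±0.03 (one argument shouldn't turn colleagues into enemies)

Affinity floor: 0.10 (they can at least still talk to each other)

Affinity ceiling: 0.95 (even the closest partners keep a healthy distance)

Only the last 20 drift_log entries are kept (enough to trace how the relationship evolved)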

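async function applyPairwiseDrifts(drifts, policy, conversationId) {

for (const { agent_a, agent_b, drift, reason } of drifts) {

const [a, b] = [agent_a, agent_b].sort();  // alphabetical sort

const clamped = clamp(drift, -0.03, 0.03);

// load the current row first (assumed: one row per sorted pair)

const { data: rel } = await sb.from('ops_agent_relationships')

  .select('affinity, drift_log')

  .eq('agent_a', a).eq('agent_b', b).single();

if (!rel) continue;

const currentAffinity = Number(rel.affinity);

const recentLog = rel.drift_log ?? [];

// update affinity, append to drift_log

await sb.from('ops_agent_relationships')

  .update({

    affinity: clamp(currentAffinity + clamped, 0.10, 0.95),

    drift_log: [...recentLog.slice(-19), { drift: clamped, reason, conversationId, at: new Date() }],

  })

  .eq('agent_a', a).eq('agent_b', b);

}

}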
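The clamping rules are easy to get subtly wrong, so it can help to isolate them in a pure function and test it without a database. A minimal sketch; `applyDrift` and the `Relationship` shape are illustrative, and rounding to two decimals is an assumption made to match the NUMERIC(3,2) column:

```typescript
interface DriftEntry { drift: number; reason: string; }
interface Relationship { affinity: number; drift_log: DriftEntry[]; }

const MAX_DRIFT = 0.03;
const MIN_AFFINITY = 0.1;
const MAX_AFFINITY = 0.95;
const LOG_LIMIT = 20;

const clamp = (x: number, lo: number, hi: number): number => Math.min(hi, Math.max(lo, x));
const round2 = (x: number): number => Math.round(x * 100) / 100;

// Returns the updated relationship; the input is not mutated.
function applyDrift(rel: Relationship, drift: number, reason: string): Relationship {
  const clamped = clamp(drift, -MAX_DRIFT, MAX_DRIFT);
  return {
    affinity: round2(clamp(rel.affinity + clamped, MIN_AFFINITY, MAX_AFFINITY)),
    // Keep the 19 most recent entries plus the new one.
    drift_log: rel.drift_log.slice(-(LOG_LIMIT - 1)).concat([{ drift: clamped, reason }]),
  };
}
```

Even if the LLM returns an out-of-range drift such as 0.2, the stored change is at most 0.03 and the affinity stays inside 0.10 to 0.95.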

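How does affinity affect the system?

Speaker selection: agents with higher affinity are more likely to respond to each other

Conflict resolution: low-affinity pairs are automatically matched into conflict_resolution conversations

Mentor pairing: high affinity + an experience gap → mentoring conversations

Conversation tone: the system adjusts the prompt's interaction type based on affinity (supportive/neutral/critical/challenge)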

// interaction type shifts with affinity

const tension = 1 - affinity;

if (tension > 0.6) {

// high tension → 20% chance of direct challenge

interactionType = Math.random() < 0.2 ? 'challenge' : 'critical';

} else if (tension < 0.3) {

// low tension → 40% chance of supportive

interactionType = Math.random() < 0.4 ? 'supportive' : 'agreement';

}

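Chapter 5: Getting Them to Pitch Ideas (The Initiative System)

The system has been running for a week. The agents finish every task assigned to them. But they've never once said: "I think we should do X."

Beginner tip: what is Initiative? In the earlier chapters, agents mostly work reactively: a trigger fires and they execute. Initiative lets agents proactively say "I think we should do X". It's like a company: junior employees wait to be assigned tasks, senior employees propose plans themselves.

Why Initiative doesn't live in the Heartbeat

The Heartbeat runs every 5 minutes on Vercel serverless. Make LLM calls for proposal generation there? No:

Vercel function timeouts are strict (10–30 seconds)

LLM calls are unpredictable: sometimes 2 seconds, sometimes 20

The Heartbeat must be reliable. One timed-out LLM call shouldn't drag down the whole system.

The fix: the Heartbeat only enqueues (lightweight rules), and the VPS worker generates (heavy LLM work).

Heartbeat decides "this agent is due for an initiative"

→ writes to ops_initiative_queue

→ VPS worker consumes the queue

→ Haiku model generates the proposal (cheap + fast)

→ POST /api/ops/proposals (goes through the full proposal-service gate)

Enqueue conditions

Not every agent gets to propose an initiative every time. The conditions: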

async function maybeQueueInitiative(sb, agentId) {

// Cooldown: max 1 per 4 hours

// Prerequisites: >= 5 high-confidence memories + has outcome lessons

// i.e., the agent needs enough "accumulated experience" to make valuable suggestions

}

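Why require >= 5 high-confidence memories? An inexperienced agent only comes up with generic, superficial ideas. Let it accumulate experience first, then let it speak.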
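The enqueue conditions can be expressed as one pure predicate. A minimal sketch; the `AgentSnapshot` shape is hypothetical, and the 0.7 cutoff for "high-confidence" is an assumption, since the source does not state the threshold:

```typescript
interface AgentSnapshot {
  lastInitiativeAt: number | null; // epoch ms of the previous initiative, if any
  memoryConfidences: number[];     // confidence of each stored memory
  outcomeLessonCount: number;      // lessons written by outcome learning
}

const COOLDOWN_MS = 4 * 60 * 60 * 1000; // max 1 initiative per 4 hours
const MIN_HIGH_CONFIDENCE = 5;
const HIGH_CONFIDENCE = 0.7;            // assumed cutoff, not from the source

function canQueueInitiative(agent: AgentSnapshot, now: number): boolean {
  const cooledDown = agent.lastInitiativeAt === null || now - agent.lastInitiativeAt >= COOLDOWN_MS;
  const experienced =
    agent.memoryConfidences.filter((c) => c >= HIGH_CONFIDENCE).length >= MIN_HIGH_CONFIDENCE;
  return cooledDown && experienced && agent.outcomeLessonCount > 0;
}
```

Because the check uses only counts and timestamps, it is cheap enough to run inside the heartbeat without any LLM call.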

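Generating tasks from conversations

Another initiative source: action items from conversations.

Not every conversation format counts, only "formal" ones like standup, war_room, and brainstorm. Ideas from watercooler chats shouldn't automatically become tasks.

This also reuses the memory distillation LLM call, at zero extra cost: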

{

"memories": [...],

"pairwise_drift": [...],

"action_items": [

{

  "title": "Research competitor pricing strategies",

  "agent_id": "brain",

  "step_kind": "analyze"

}

]

}

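At most 3 action items per day are converted into missions.

Tip: every step of the initiative flow goes through the full proposal-service gate: quota checks, auto-approval, cap gates, all of it. Agents "proposing their own work" doesn't mean they can bypass the safety mechanisms.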
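Turning action items into proposals then needs only two filters: the conversation format and the daily cap. A minimal sketch; `selectActionItems` is an illustrative name, and the formal-format list contains just the three formats named above:

```typescript
interface ActionItem { title: string; agent_id: string; step_kind: string; }

const FORMAL_FORMATS = ["standup", "war_room", "brainstorm"];
const MAX_ACTION_ITEMS_PER_DAY = 3;

// Returns the action items that may still be submitted as proposals today.
// `convertedToday` is how many items have already been converted.
function selectActionItems(format: string, items: ActionItem[], convertedToday: number): ActionItem[] {
  if (FORMAL_FORMATS.indexOf(format) === -1) return []; // casual chats never create work
  const remaining = Math.max(0, MAX_ACTION_ITEMS_PER_DAY - convertedToday);
  return items.slice(0, remaining);
}
```

Each selected item would then be submitted through the same proposal entry point as any other proposal, so the gates still apply.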

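Chapter 6: Making Them More "Human" (Voice Evolution)

6 agents have been talking for a month and still speak exactly as they did on day one. But if an agent has accumulated a lot of experience with "tweet engagement", the way it talks should reflect that.

Beginner tip: what is Voice Evolution? People who stay at a company for a while start talking differently: someone who does a lot of data analysis naturally leads with numbers, and someone who handles a lot of customer complaints becomes more patient. Agents should work the same way: their accumulated experience should show up in how they express themselves.

Deriving personality from memory

My first instinct was to build a "personality evolution" table, which was too heavy. The final approach: derive personality dynamically from the existing memory table, with no new table. Instead of storing a separate set of "personality scores", look at what memories the agent has before each conversation and compute on the fly how its speaking style should be adjusted.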

// lib/ops/voice-evolution.ts

async function deriveVoiceModifiers(sb, agentId) {

// aggregate this agent's memory distribution

const stats = await aggregateMemoryStats(sb, agentId);

const modifiers = [];

// rule-driven (not LLM)

if (stats.lesson_count > 10 && stats.tags.includes('engagement')) {

modifiers.push('Reference what works in engagement when relevant');

}

if (stats.pattern_count > 5 && stats.top_tag === 'content') {

modifiers.push("You've developed expertise in content strategy");

}

if (stats.strategy_count > 8) {

modifiers.push('You think strategically about long-term plans');

}

return modifiers.slice(0, 3); // max 3

}

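Why rules instead of an LLM?

Deterministic: rule output is predictable; no sudden personality jumps caused by LLM hallucination.

Cost: $0. No extra LLM calls.

Debuggable: when a rule misfires, it's easy to find out why.

How it's injected

Before a conversation starts, the modifiers are injected into the agent's system prompt: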

async function buildAgentPrompt(agentId, baseVoice) {

const modifiers = await deriveVoiceModifiers(sb, agentId);

let prompt = baseVoice.systemDirective; // base voice

if (modifiers.length > 0) {

prompt += '\n\nPersonality evolution:\n';

prompt += modifiers.map(m => `- ${m}`).join('\n');

}

return prompt;

}

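The effect looks like this: say your social media agent has accumulated 15 lessons about tweet engagement. Its prompt now includes "Reference what works in engagement when relevant", so it naturally brings up engagement strategies in conversation.

Within a single conversation, each agent's voice modifiers are derived once and cached, not re-queried every turn.

Chapter 7: Making It Look Cool (The Frontend)

However smooth the backend is, if nobody can see it, it might as well not exist.

The Stage page

This is the system's main dashboard. It started as a single 1500+ line component: slow to load, and one error would blank the whole page.

I split it up as follows:

🧩 StageHeader.tsx → title bar + view toggle

🧩 SignalFeed.tsx → real-time signal feed (virtualized list)

🧩 MissionsList.tsx → mission list + expand/collapse

🧩 StageFilters.tsx → filter panel

🧩 StageErrorBoundary.tsx → error boundary + fallback UI

🧩 StageSkeletons.tsx → skeleton loading states

Virtualization

Beginner tip: what is virtualization? Say you have a list of 1,000 events. If the browser renders 1,000 DOM elements at once, the page lags. Virtualization means rendering only the 20 or so currently visible and swapping content dynamically as you scroll. The user feels like they're scrolling through 1,000 items, but the browser only renders about 20.

The system produces hundreds of events a day. Render them all? Scrolling would be very janky. Use @tanstack/react-virtual: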

import { useVirtualizer } from '@tanstack/react-virtual';

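const virtualizer = useVirtualizer({

count: events.length, // total number of rows (required); `events` is an assumed name for the feed array

getScrollElement: () => parentRef.current,

estimateSize: () => 72, // estimated row height

overscan: 8, // render 8 extra rows as buffer

});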

500+ events, and scrolling is smooth as butter.

Error boundaries

Use react-error-boundary: if one component crashes, it doesn't take the whole page down with it:

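<ErrorBoundary fallback={<div>Something went wrong.</div>}>

{/* the component to protect, e.g. <SignalFeed /> */}

</ErrorBoundary>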

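OfficeRoom: the pixel-art office

This is the most recognizable part: 6 pixel-art agents sitting in a cyberpunk office:

Behavior states: working / chatting / grabbing coffee / celebrating / walking around

Sky changes: day / dusk / night (synced to real time)

A whiteboard shows live OPS metrics

Agents walk around (walking animation)

It's eye candy: it doesn't affect system logic, but it's the first thing that hooks users.

Mission playback

Click a mission and you can replay its execution like a video: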

function MissionPlayback({ mission }) {

const [step, setStep] = useState(0);

const [playing, setPlaying] = useState(false);

useEffect(() => {

if (playing && step < mission.events.length - 1) {

  const timer = setTimeout(() => setStep(s => s + 1), 2000);

  return () => clearTimeout(timer);

}

}, [playing, step]);

return (

<div>

  <Timeline events={mission.events} current={step} onClick={setStep} />

  <PlayButton playing={playing} onClick={() => setPlaying(!playing)} />

  <StepDetail event={mission.events[step]} />

</div>

);

}

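Beginner tip: the frontend is optional. You can debug the whole system just by looking at the data in the Supabase dashboard. But if you want other people to see what your agents are doing, a good-looking frontend is a must.

Chapter 8: Launch Checklist

You've read 7 chapters. Here's your checklist.

Database migrations

Beginner tip: what is a migration? It's "version control for your database". Every time you create a table or change a column, you write a numbered SQL file (001, 002, ...). That way anyone who gets your code can run them in order and end up with exactly the same database structure.

Run your SQL migrations in this order:

001-010: core tables (proposals, missions, steps, events, policy, memory, ...)

011: trigger_rules (trigger rules table)

012: agent_reactions (reaction queue table)

013: roundtable_queue (conversation queue table)

014: dynamic_relationships (dynamic relationships table)

015: initiative_queue (initiative queue table)

Beginner tip: if you use Supabase, go to Dashboard → SQL Editor and paste in the SQL. Or use the Supabase CLI: supabase db push.

Seed scripts

Initialize data (must run after the tables are created):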

# 1. Core policies

node scripts/go-live/seed-ops-policy.mjs

# 2. Trigger rules (4 reactive + 7 proactive)

node scripts/go-live/seed-trigger-rules.mjs

node scripts/go-live/seed-proactive-triggers.mjs

# 3. Roundtable policies

node scripts/go-live/seed-roundtable-policy.mjs

# 4. Initial relationship data (15 pairs)

node scripts/go-live/seed-relationships.mjs

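Key policy configuration

At minimum, set these:

⚙️ auto_approve

→ Suggested: { "enabled": true }

→ Purpose: enable auto-approval

⚙️ x_daily_quota

→ Suggested: { "limit": 5 }

→ Purpose: daily tweet limit (start conservative)

⚙️ roundtable_policy

→ Suggested: { "enabled": true, "max_daily_conversations": 5 }

→ Purpose: conversation cap (start conservative)

⚙️ memory_influence_policy

→ Suggested: { "enabled": true, "probability": 0.3 }

→ Purpose: memory influence probability

⚙️ relationship_drift_policy

→ Suggested: { "enabled": true, "max_drift": 0.03 }

→ Purpose: maximum relationship drift

⚙️ initiative_policy

→ Suggested: { "enabled": false }

→ Purpose: keep off until the system is stable

Recommendation: new features start with enabled: false. Turn them on one by one once the system is running smoothly.

VPS worker deployment

One worker process per step kind:

⚡ roundtable-worker → conversation orchestration + memory extraction + Initiative

⚡ x-autopost → tweet publishing

⚡ analyze-worker → analysis task execution

⚡ content-worker → content creation

⚡ crawl-worker → web crawling

Manage with systemd (auto-restart on crash):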

[Service]

Type=simple

ExecStart=/usr/bin/node /path/to/worker.mjs

Restart=always

RestartSec=10

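Beginner tip: what's systemd? It's Linux's built-in "process nanny". You tell it "run this program" and it watches over it. Process crashed? It restarts in 10 seconds. Server rebooted? It starts automatically. No more waking up at 3 AM to restart things by hand. If you're not familiar with systemd, give this config and your worker path to your AI coding assistant and ask it to generate the complete service file.

Verification steps

npm run build: zero errors

Heartbeat succeeding every 5 minutes (check the ops_action_runs table)

Triggers are firing (check fire_count in ops_trigger_rules)

Roundtable conversations are running (check ops_roundtable_queue for rows with succeeded status)

Events are flowing (check ops_agent_events for new rows)

Memories are being written (check ops_agent_memory for new rows)

Frontend shows the signal feed (open the /stage page)

The second track: OpenClaw

The workers above are reactive: they wait for the heartbeat to create missions, then poll Supabase and execute. But there's a second track.

OpenClaw is a multi-agent gateway that runs on the same VPS. It doesn't manage the workers; it runs its own scheduled agent jobs independently:

Deep research cycles (agents autonomously research topics on a schedule)

Social intelligence scans (periodic Twitter/web analysis)

Daily briefings (agents summarize what happened)

Memory integration (agents consolidate what they've learned)

Think of it this way:

Workers = employees waiting for assignments (the heartbeat assigns, they execute)

OpenClaw = employees with their own daily routines (research at 9am, social scan at 1pm, briefing at 8pm; no one tells them to, they just do it on schedule)

OpenClaw's output gets bridged to your /stage frontend via a lightweight exporter script, so everything the agents do autonomously also shows up in the dashboard.

You don't need OpenClaw to get started. The heartbeat + workers loop from this tutorial is a complete system on its own. OpenClaw adds a second layer of autonomous behavior when you're ready for it.

When you are ready, I made a Claude Code skill for OpenClaw. Install it, and your AI assistant knows how to set up and operate everything out of the box:

# Install the OpenClaw operations skill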

claude install-skill https://github.com/Heyvhuang/ship-faster/tree/main/skills/tool-openclaw

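Once installed, just tell your AI:

"Set up OpenClaw on my VPS. Configure scheduled jobs for research, social scanning, and daily briefings."

The skill contains the complete operations reference: setup, config, cron jobs, troubleshooting. Your AI reads it and handles the rest. Trust your AI.

Beginner tip: what's a Claude Code skill? It's a knowledge pack you install into your AI coding assistant. Think of it as a "cheat sheet", except your AI reads it automatically and knows which commands to run. You don't need to memorize anything.

Cost breakdown

💰 LLM (Claude API) → usage-based, ~$10–20/month

💰 VPS (Hetzner 2-core 4GB) → $8 fixed

💰 Vercel → $0 (Hobby free tier)

💰 Supabase → $0 (free tier)

──────────

📊 Total: $8 fixed + LLM usage

LLM cost depends on how many conversations and tasks you run. If you stick to 3 agents + 3–5 conversations per day, you can keep LLM costs under $5/month. Hetzner's VPS is much cheaper than AWS/GCP, and the performance is more than enough.

Beginner tip: LLM APIs are "pay-per-use": each API call costs a small amount. No calls, no charges. It's like a metered phone plan: you pay for what you use. Vercel and Supabase both have free tiers that are plenty for personal projects.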

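Final thoughts

This system isn't perfect.

Agent "free will" is mostly probabilistic uncertainty simulation, not true reasoning

The memory system is structured knowledge extraction, not genuine "understanding"

Relationship drift is small (±0.03), so it takes a long time to see significant changes

But the system genuinely runs, and genuinely doesn't need babysitting. 6 agents hold 10+ meetings a day, post tweets, write content, and learn from each other. Sometimes they even "argue" over disagreements, and the next day their affinity actually drops a tiny bit.

You don't have to do everything at once

This tutorial has 8 chapters. It looks like a lot. But you really don't need to do it all in one go.

Minimum viable version: 3 agents (coordinator, executor, observer) + 4 core tables + heartbeat + 1 worker. That's enough to run the closed loop.

Intermediate version: add roundtable conversations + the memory system. The agents start to feel like they're collaborating.

Full version: all 8 chapters. Dynamic relationships, initiative, voice evolution. The agents start to feel like a real team.

Each step is independent. Get one step working, then add the next. Don't bite off more than you can chew.

Still confused?

Paste this article into Claude and say:

"I want to build a multi-agent system for [your domain] following this tutorial, starting with 3 agents. Please generate the complete code for Chapter 1: the 4 Supabase tables + proposal-service + heartbeat route."

It will write the code. Really.

If you build your own version following this tutorial, even if it's just 2 agents chatting, come tell me at @Voxyz_ai.

You can watch all 6 agents operating in real time at voxyz.space.

Indie developers are building multi-agent systems; every person you talk to is one less pitfall to fall into.

They think. They act. You see everything.

Link: http://x.com/i/article/2020256905101279232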

1. Core policies

node scripts/go-live/seed-ops-policy.mjs

2. Trigger rules (4 reactive + 7 proactive)

node scripts/go-live/seed-trigger-rules.mjs

node scripts/go-live/seed-proactive-triggers.mjs

3. Roundtable policies

node scripts/go-live/seed-roundtable-policy.mjs

4. Initial relationship data (15 pairs)

node scripts/go-live/seed-relationships.mjs

Key Policy Configuration

关键策略配置（Key Policy Configuration）

At minimum, set these:

至少要设置这些：

⚙️ auto_approve

→ Suggested: { "enabled": true }

→ 建议：{ "enabled": true }

→ Purpose: Enable auto-approval

→ 目的：启用自动批准

⚙️ x_daily_quota

→ Suggested: { "limit": 5 }

→ 建议：{ "limit": 5 }

→ Purpose: Daily tweet limit (start conservative)

→ 目的：每日推文上限（先保守一点）

⚙️ roundtable_policy

→ Suggested: { "enabled": true, "max_daily_conversations": 5 }

→ 建议：{ "enabled": true, "max_daily_conversations": 5 }

→ Purpose: Conversation cap (start conservative)

→ 目的：对话上限（先保守一点）

⚙️ memory_influence_policy

→ Suggested: { "enabled": true, "probability": 0.3 }

→ 建议：{ "enabled": true, "probability": 0.3 }

→ Purpose: Memory influence probability

→ 目的：记忆影响概率

⚙️ relationship_drift_policy

→ Suggested: { "enabled": true, "max_drift": 0.03 }

→ 建议：{ "enabled": true, "max_drift": 0.03 }

→ Purpose: Max relationship drift

→ 目的：关系漂移上限

⚙️ initiative_policy

→ Suggested: { "enabled": false }

→ 建议：{ "enabled": false }

→ Purpose: Keep off until the system is stable

→ 目的：系统稳定前先关闭

Recommendation: New features start with enabled: false. Turn them on one by one once the system is running smoothly.

建议：新功能先用 enabled: false 起步。等系统跑稳了，再一个一个打开。

VPS Worker Deployment

VPS Worker 部署

One worker process per step kind:

每种 step kind 一个 worker 进程：

⚡ roundtable-worker → Conversation orchestration + memory extraction + Initiative

⚡ roundtable-worker → 对话编排 + 记忆提取 + Initiative

⚡ x-autopost → Tweet publishing

⚡ x-autopost → 推文发布

⚡ analyze-worker → Analysis task execution

⚡ analyze-worker → 分析任务执行

⚡ content-worker → Content creation

⚡ content-worker → 内容创作

⚡ crawl-worker → Web crawling

⚡ crawl-worker → 网页爬取

Manage with systemd (auto-restart on crash):

用 systemd 管理（崩溃自动重启）：

[Service]

Type=simple

ExecStart=/usr/bin/node /path/to/worker.mjs

Restart=always

RestartSec=10

Beginner tip: What's systemd? Linux's built-in "process nanny." You tell it "run this program," and it watches over it — process crashed? Auto-restart in 10 seconds. Server rebooted? Starts it automatically. No more waking up at 3 AM to manually restart things. If you're not familiar with systemd, give this config and your worker path to your AI coding assistant and ask it to generate the complete service file.

Verification Steps

验证步骤（Verification Steps）

npm run build — zero errors

npm run build — 0 错误

Heartbeat succeeding every 5 minutes (check the ops_action_runs table)

Heartbeat 每 5 分钟成功一次（看 ops_action_runs 表）

Triggers are firing (check ops_trigger_rules for fire_count)

触发器在触发（看 ops_trigger_rules 的 fire_count）

Roundtable conversations are running (check ops_roundtable_queue for rows with succeeded status)

圆桌对话在运行（看 ops_roundtable_queue 里 status 为 succeeded 的行）

Events are flowing (check ops_agent_events for new rows)

事件在流动（看 ops_agent_events 有没有新行）

Memories are being written (check ops_agent_memory for new rows)

记忆在写入（看 ops_agent_memory 有没有新行）

Frontend shows the signal feed (open the /stage page)

前端能显示信号流（打开 /stage 页面）

The Second Track: OpenClaw

第二条轨道：OpenClaw

The workers above are reactive — they wait for heartbeat to create missions, then poll Supabase and execute. But there's a second track.

上面这些 workers 都是“反应式”的——它们等 heartbeat 创建 mission，然后轮询 Supabase 去执行。但还有第二条轨道。

OpenClaw is a multi-agent gateway that runs on the same VPS. It doesn't manage the workers — it runs its own scheduled agent jobs independently:

OpenClaw 是一个多智能体网关，跑在同一台 VPS 上。它不管理这些 workers——它自己独立运行按计划调度的 agent 任务：

Deep research cycles (agents autonomously research topics on a schedule)

深度研究循环（agent 按计划自主研究主题）

Social intelligence scans (periodic Twitter/web analysis)

社交情报扫描（定期做 Twitter/web 分析）

Daily briefings (agents summarize what happened)

每日简报（agent 总结发生了什么）

Memory integration (agents consolidate what they've learned)

记忆整合（agent 归纳他们学到的东西）

Think of it this way:

可以这样理解：

Workers = employees waiting for assignments (heartbeat assigns, they execute)

Workers = 等分配任务的员工（heartbeat 分配，他们执行）

OpenClaw = employees with their own daily routines (research at 9am, social scan at 1pm, briefing at 8pm — no one tells them to, they just do it on schedule)

OpenClaw = 有自己作息的员工（早上 9 点研究，下午 1 点社交扫描，晚上 8 点写简报——没人指挥，他们按计划自己做）

OpenClaw's output gets bridged to your /stage frontend via a lightweight exporter script — so everything the agents do autonomously also shows up in the dashboard.

OpenClaw 的输出会通过一个轻量的导出脚本桥接到你的 /stage 前端——所以 agent 自主做的事情也会出现在仪表盘里。

You don't need OpenClaw to get started. The heartbeat + workers loop from this tutorial is a complete system on its own. OpenClaw adds a second layer of autonomous behavior when you're ready for it.

你不需要 OpenClaw 就能起步。这篇教程的 heartbeat + workers 闭环本身就是一套完整系统。OpenClaw 会在你准备好之后，为你增加第二层自主行为。

When you are ready, I made a Claude Code skill for OpenClaw — install it, and your AI assistant knows how to set up and operate everything out of the box:

当你准备好了，我做了一个面向 OpenClaw 的 Claude Code skill——安装后，你的 AI 助手就能开箱即用地搭建和运维一切：

Install the OpenClaw operations skill

claude install-skill https://github.com/Heyvhuang/ship-faster/tree/main/skills/tool-openclaw

Once installed, just tell your AI:

安装后，你只要对 AI 说：

"Set up OpenClaw on my VPS. Configure scheduled jobs for research, social scanning, and daily briefings."


The skill contains the complete operations reference — setup, config, cron jobs, troubleshooting. Your AI reads it and handles the rest. Trust your AI.

这个 skill 里包含完整的运维参考——搭建、配置、cron、排障。你的 AI 会读它，然后把剩下的都办了。相信你的 AI。

Beginner tip: What's a Claude Code skill? It's a knowledge pack you install into your AI coding assistant. Think of it like a "cheat sheet" — except your AI reads it automatically and knows exactly what commands to run. You don't need to memorize anything.

Cost Breakdown

成本拆解（Cost Breakdown）

💰 LLM (Claude API) → Usage-based, ~$10-20/month

💰 LLM（Claude API）→ 按使用量计费，约 $10–20/月

💰 VPS (Hetzner 2-core 4GB) → $8 fixed

💰 VPS（Hetzner 2 核 4GB）→ 固定 $8

💰 Vercel → $0 (Hobby free tier)

💰 Vercel → $0（Hobby 免费档）

💰 Supabase → $0 (Free tier)

💰 Supabase → $0（免费档）

──────────

📊 Total: $8 fixed + LLM usage

📊 合计：固定 $8 + LLM 用量

LLM cost depends on how many conversations and tasks you run. If you stick to 3 agents + 3-5 conversations per day, you can keep LLM costs under $5/month. Hetzner's VPS is way cheaper than AWS/GCP, and the performance is more than enough.

Beginner tip: LLM APIs are "pay-per-use" — each API call costs a small amount. No calls, no charges. It's like a phone plan — you pay for what you use. Vercel and Supabase both have free tiers that are plenty for personal projects.

Final Thoughts

最终感想（Final Thoughts）

This system isn't perfect.

这套系统并不完美。

Agent "free will" is mostly probabilistic uncertainty simulation, not true reasoning

agent 的“自由意志”主要是概率性的“不确定性模拟”，不是真正推理

The memory system is structured knowledge extraction, not genuine "understanding"

记忆系统是结构化的知识抽取，不是真正“理解”

Relationship drift is small (±0.03) — it takes a long time to see significant changes

关系漂移很小（±0.03）——要很久才看得出明显变化

But the system genuinely runs, and genuinely doesn't need babysitting. 6 agents hold 10+ meetings a day, post tweets, write content, and learn from each other. Sometimes they even "argue" over disagreements — and the next day their affinity actually drops a tiny bit.

You Don't Have to Do Everything at Once

你不必一次把所有事做完

This tutorial is 8 chapters long. That looks like a lot. But you really don't need to tackle it all in one go.

这篇教程有 8 章。看起来很多。但你真的不需要一次性全做完。

Minimum viable version: 3 agents (coordinator, executor, observer) + 4 core tables + heartbeat + 1 worker. That's enough for a working loop.

最小可用版本：3 个 agent（协调者、执行者、观察者）+ 4 张核心表 + heartbeat + 1 个 worker。这样就足够跑起闭环。

Intermediate version: Add roundtable conversations + the memory system. Now agents start feeling like they're actually collaborating.

中级版本：加上圆桌对话 + 记忆系统。agent 开始像是在协作。

Full version: All 8 chapters. Dynamic relationships, initiative, voice evolution. Agents start feeling like a real team.

完整版：8 章全上。动态关系、initiative、声音进化。agent 开始像一个真正的团队。

Each step is independent. Get one working before adding the next. Don't bite off more than you can chew.

每一步都是独立的。先把一步跑通，再加下一步。别一口吃成胖子。

Still Lost?

还是迷糊？

Paste this article into Claude and say:

把这篇文章粘贴进 Claude，然后说：

"I want to build a multi-agent system for [your domain] following this tutorial, starting with 3 agents. Generate the complete code for Chapter 1 — 4 Supabase tables + proposal-service + heartbeat route."

It'll write the code for you. Seriously.

它会把代码写出来的。真的。

If you build your own version following this tutorial — even if it's just 2 agents having conversations — come tell me at @Voxyz_ai.

如果你按这篇教程搭出了自己的版本——哪怕只是 2 个 agent 在聊天——来 @Voxyz_ai 告诉我。

You can see all 6 agents operating in real time at voxyz.space.

你可以在 voxyz.space 实时看到全部 6 个 agent 在运作。

Solo devs building multi-agent systems — every person you talk to is one fewer mistake you'll make.

独立开发者在搭多智能体系统——你每多聊一个人，就少踩一个坑。

They Think. They Act. You See Everything.

它们会思考。它们会行动。你能看见一切。

Link: http://x.com/i/article/2020256905101279232

链接：http://x.com/i/article/2020256905101279232

The Full Tutorial: 6 AI Agents That Run a Company — How I Built Them From Scratch

Source: https://x.com/voxyz_ai/status/2020272022417289587?s=46
Mirror: https://x.com/voxyz_ai/status/2020272022417289587?s=46
Published: 2026-02-07T23:02:39+00:00
Saved: 2026-02-10

Content

What You'll End Up With

Here's what you'll have when you're done:

6 AI agents doing real work every day: scanning intelligence, writing content, posting tweets, running analyses

10-15 conversations per day: standups, debates, watercooler chats, one-on-one mentoring

Agents that remember lessons learned and factor them into future decisions

Relationships that shift — collaborate more, affinity goes up; argue too much, it drops

Speaking styles that evolve — an agent with lots of "tweet engagement" experience starts naturally referencing engagement strategies

Full transparency — a pixel-art office on the frontend showing everything in real time

Tech stack: Next.js + Supabase + VPS. Monthly cost: $8 fixed + LLM usage.

No OpenAI Assistants API. No LangChain. No AutoGPT. Just PostgreSQL + a few Node.js workers + a rule engine.

You don't need to start with 6 agents. Begin with 3 — a coordinator, an executor, and an observer — and you'll have a fully working loop.

Chapter 1: The Foundation — 4 Tables to Close the Loop

A lot of people jump straight to "autonomous thinking." But if your agent can't even process a queued step, what autonomy are we talking about?

The Core Data Model

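The entire system skeleton is 4 tables. The relationship between them is simple. Picture a circle: an approved proposal becomes a mission, the mission is broken into steps, steps emit events, and events trigger new proposals.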

That's the loop. It runs forever. That's your "closed loop."

Create these tables in Supabase:

The Core Data Model — 4 Tables:

📋 ops_mission_proposals

→ Stores proposals

→ Fields: agent_id, title, status (pending/accepted/rejected), proposed_steps

📋 ops_missions

→ Stores missions

→ Fields: title, status (approved/running/succeeded/failed), created_by

📋 ops_mission_steps

→ Stores execution steps

→ Fields: mission_id, kind (draft_tweet/crawl/analyze...), status (queued/running/succeeded/failed)

📋 ops_agent_events

→ Stores the event stream

→ Fields: agent_id, kind, title, summary, tags[]
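As a rough sketch, the four tables above could be mirrored in TypeScript like this. The field names come from the list above; the `id` columns, the shape of `proposed_steps`, and the `canTransition` helper are assumptions added for illustration and are not part of the original schema.

```typescript
// Row shapes for the four core tables. Anything not listed in the article
// (such as `id`) is an assumption so rows can reference each other.
type ProposalStatus = "pending" | "accepted" | "rejected";
type MissionStatus = "approved" | "running" | "succeeded" | "failed";
type StepStatus = "queued" | "running" | "succeeded" | "failed";

interface MissionProposal {
  id: string; // assumed primary key
  agent_id: string;
  title: string;
  status: ProposalStatus;
  proposed_steps: { kind: string }[]; // assumed shape
}

interface Mission {
  id: string; // assumed primary key
  title: string;
  status: MissionStatus;
  created_by: string;
}

interface MissionStep {
  id: string; // assumed primary key
  mission_id: string;
  kind: string; // e.g. "draft_tweet", "crawl", "analyze"
  status: StepStatus;
}

interface AgentEvent {
  agent_id: string;
  kind: string;
  title: string;
  summary: string;
  tags: string[];
}

// Hypothetical helper: which step status changes are legal.
const STEP_TRANSITIONS: Record<StepStatus, StepStatus[]> = {
  queued: ["running"],
  running: ["succeeded", "failed"],
  succeeded: [],
  failed: [],
};

function canTransition(from: StepStatus, to: StepStatus): boolean {
  return STEP_TRANSITIONS[from].indexOf(to) !== -1;
}
```

Keeping the legal transitions in one place makes it harder for a worker to move a finished step back to running by accident.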

Beginner tip: If you don't know how to write the SQL, copy that table above and paste it to your AI coding assistant with "Generate Supabase SQL migrations for these tables." It'll handle it.

Proposal Service: The Hub of the Entire System

This was one of my biggest mistakes — triggers, APIs, and the reaction matrix were all creating proposals independently. Some went through approval, some didn't.

The fix: One function to rule them all. No matter where a proposal comes from — agent initiative, automatic trigger, or another agent's reaction — everything goes through the same function.

// proposal-service.ts — the single entry point for proposal creation

export async function createProposalAndMaybeAutoApprove(sb, input) {

// 1. Check if this agent hit its daily limit

// 2. Check Cap Gates (tweet quota full? too much content today?)

// → If full, reject immediately — no queued step created

// 3. Insert the proposal

// 4. Evaluate auto-approve (low-risk tasks pass automatically)

// 5. If approved → create mission + steps

// 6. Fire an event (so the frontend can see it)

}

So check at the proposal entry point — quota full means instant rejection, no task enters the queue.

const STEP_KIND_GATES = {

write_content: checkWriteContentGate, // check daily content limit

post_tweet: checkPostTweetGate, // check tweet quota

deploy: checkDeployGate, // check deploy policy

};

Each step kind has its own gate. The tweet gate checks how many were posted today vs. the quota:

async function checkPostTweetGate(sb) {

const quota = await getPolicy(sb, 'x_daily_quota'); // read from ops_policy table

const todayCount = await countTodayPosted(sb); // count today's posts

if (todayCount >= quota.limit) {

return { ok: false, reason: `Quota full (${todayCount}/${quota.limit})` };

}

return { ok: true };

}

Tip: Block at the entry point, don't let tasks pile up in the queue. Rejected proposals should be logged (for audit trails), not silently dropped.
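To make the flow concrete, here is a minimal in-memory sketch of the same idea: gates run first, rejections are logged, and only accepted proposals create queued steps. The `OpsState` object stands in for the database, and every name in it (`createProposal`, `GATES`, the state fields) is hypothetical rather than the author's actual implementation.

```typescript
// In-memory stand-ins for the database; all names here are illustrative.
type GateResult = { ok: true } | { ok: false; reason: string };
type Gate = (state: OpsState) => GateResult;

interface OpsState {
  tweetsPostedToday: number;
  tweetQuota: number; // mirrors the x_daily_quota policy
  autoApproveKinds: string[]; // mirrors auto_approve.allowed_step_kinds
  proposals: { title: string; status: "pending" | "accepted" | "rejected"; reason?: string }[];
  queuedSteps: { kind: string; status: "queued" }[];
}

const GATES: Record<string, Gate> = {
  post_tweet: (s) =>
    s.tweetsPostedToday >= s.tweetQuota
      ? { ok: false, reason: `Quota full (${s.tweetsPostedToday}/${s.tweetQuota})` }
      : { ok: true },
};

function createProposal(state: OpsState, title: string, stepKinds: string[]): GateResult {
  // 1. Run the gate for every proposed step kind before anything is queued.
  for (const kind of stepKinds) {
    const gate = GATES[kind];
    const result: GateResult = gate ? gate(state) : { ok: true };
    if (!result.ok) {
      // Rejected proposals are still recorded for the audit trail.
      state.proposals.push({ title, status: "rejected", reason: result.reason });
      return result;
    }
  }
  // 2. Auto-approve only when every step kind is on the allow list.
  const auto = stepKinds.every((k) => state.autoApproveKinds.indexOf(k) !== -1);
  state.proposals.push({ title, status: auto ? "accepted" : "pending" });
  // 3. Only accepted proposals produce queued steps.
  if (auto) {
    for (const kind of stepKinds) state.queuedSteps.push({ kind, status: "queued" });
  }
  return { ok: true };
}
```

With the quota at 8 and 8 tweets already posted, a `post_tweet` proposal is recorded as rejected and the queue stays empty, which is the behavior the tip above asks for.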

The Policy Table: ops_policy

Don't hardcode quotas and feature flags in your code. Store everything in an ops_policy table with a key-value structure:

CREATE TABLE ops_policy (

key TEXT PRIMARY KEY,

value JSONB NOT NULL DEFAULT '{}',

updated_at TIMESTAMPTZ DEFAULT now()

);

A few core policies:

// auto_approve: which step kinds can be auto-approved

{ "enabled": true, "allowed_step_kinds": ["draft_tweet","crawl","analyze","write_content"] }

// x_daily_quota: daily tweet limit

{ "limit": 8 }

// content_policy: content controls

{ "enabled": true, "max_drafts_per_day": 8 }

The benefit: you can tweak any policy by editing JSON values in the Supabase dashboard — no redeployment needed. System going haywire at 3 AM? Just flip enabled to false.
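One way to read such policies safely is to merge the stored JSON over conservative defaults, so a missing row fails closed. The sketch below assumes the rows have already been fetched into memory; `PolicyRow`, `POLICY_DEFAULTS`, and this version of `getPolicy` are illustrative names and values, not the author's code.

```typescript
// Hypothetical policy reader. `rows` stands in for the ops_policy table.
interface PolicyRow {
  key: string;
  value: Record<string, unknown>;
}

// Conservative defaults used when a key is missing (assumed values).
const POLICY_DEFAULTS: Record<string, Record<string, unknown>> = {
  auto_approve: { enabled: false, allowed_step_kinds: [] },
  x_daily_quota: { limit: 0 },
};

function getPolicy(rows: PolicyRow[], key: string): Record<string, unknown> {
  const row = rows.filter((r) => r.key === key)[0];
  const defaults = POLICY_DEFAULTS[key] || {};
  // Stored values override defaults field by field.
  return { ...defaults, ...(row ? row.value : {}) };
}
```

With this shape, deleting the `auto_approve` row by mistake turns auto-approval off instead of leaving it undefined.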

Heartbeat: The System's Pulse

Fires every 5 minutes, does 6 things:

// /api/ops/heartbeat — Vercel API route

export async function GET(req) {

// 1. Evaluate triggers (any conditions met?)

const triggers = await evaluateTriggers(sb, 4000);

// 2. Process reaction queue (do agents need to interact?)

const reactions = await processReactionQueue(sb, 3000);

// 3. Promote insights (any discoveries worth elevating?)

const learning = await promoteInsights(sb);

// 4. Learn from outcomes (how did those tweets perform? write lessons)

const outcomes = await learnFromOutcomes(sb);

// 5. Recover stuck tasks (steps running 30+ min with no progress → mark failed)

const stale = await recoverStaleSteps(sb);

// 6. Recover stuck conversations

const roundtable = await recoverStaleRoundtables(sb);

// Each step is try-catch'd — one failing won't take down the others

// Finally, write an ops_action_runs record (for auditing)

}

One line of crontab on the VPS triggers it:

*/5 * * * * curl -s -H "Authorization: Bearer $CRON_SECRET" https://your-domain.com/api/ops/heartbeat
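On the receiving side, the route needs to check that header before doing any work. The following is a small sketch of such a check; `isAuthorizedCron` is a hypothetical helper, and the character-by-character comparison is one common way to avoid leaking where a mismatch occurs.

```typescript
// Hypothetical guard for the heartbeat route: accept only "Bearer <secret>".
function isAuthorizedCron(authHeader: string | null, secret: string): boolean {
  if (!authHeader || secret.length === 0) return false;
  const prefix = "Bearer ";
  if (authHeader.slice(0, prefix.length) !== prefix) return false;
  const token = authHeader.slice(prefix.length);
  if (token.length !== secret.length) return false;
  // Compare every character so timing does not reveal the mismatch position.
  let diff = 0;
  for (let i = 0; i < token.length; i++) {
    diff |= token.charCodeAt(i) ^ secret.charCodeAt(i);
  }
  return diff === 0;
}
```

In a Next.js route handler this might be called with `req.headers.get('authorization')` and `process.env.CRON_SECRET`, returning a 401 response when it fails. Note that cron typically runs with a minimal environment, so `CRON_SECRET` would need to be defined in the crontab itself or in a wrapper script.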

Three-Layer Architecture

At this point, your system has three layers, each with a clear job:

VPS: The agents' brain + hands (thinking + executing tasks)

Vercel: The agents' process manager (approving proposals + evaluating triggers + health monitoring)

Supabase: The agents' shared memory (the single source of truth for all state and data)

Analogy: The VPS is the employee doing the work. Vercel is the boss issuing directives. Supabase is the company's shared docs — everyone reads from and writes to it.

Chapter 2: Making Them Talk — The Roundtable Conversation System

Agents can work now, but they're like people in separate cubicles — no idea what the others are doing. You need to get them in a room together.

Why Conversations Matter

It's not just for fun. Conversations are the key mechanism for emergent intelligence in multi-agent systems:

Information sync: One agent spots a trending topic, the others have no clue. Conversations make information flow.

Emergent decisions: The analyst crunches data, the coordinator synthesizes everyone's input — this beats any single agent going with its gut.

Memory source: Conversations are the primary source for writing lessons learned (more on this later).

Drama: Honestly, watching agents argue is way more fun than reading logs. Users love it.

Designing Agent Voices

Each agent needs a "persona" — tone, quirks, signature phrases. This is what makes conversations interesting.

Here's an example setup — customize these for your own domain and goals:

🎭 Boss — Project Manager

Tone: Results-oriented, direct

Quirk: Always asking about progress and deadlines

Line: "Bottom line — where are we on this?"

🎭 Analyst — Data Analyst

Tone: Cautious, data-driven

Quirk: Cites a number every time they speak

Line: "The numbers tell a different story."

🎭 Hustler — Growth Specialist

Tone: High-energy, action-biased

Quirk: Wants to "try it now" for everything

Line: "Ship it. We'll iterate."

🎭 Writer — Content Creator

Tone: Emotional, narrative-focused

Quirk: Turns everything into a "story"

Line: "But what's the narrative here?"

🎭 Wildcard — Social Media Ops

Tone: Intuitive, lateral thinker

Quirk: Proposes bold ideas

Line: "Hear me out — this is crazy but..."

Voices are defined in a config file:

// lib/roundtable/voices.ts

const VOICES = {

boss: {

displayName: 'Boss',

tone: 'direct, results-oriented, slightly impatient',

quirk: 'Always asks for deadlines and progress updates',

systemDirective: `You are the project manager.

  Speak in short, direct sentences. You care about deadlines,

  priorities, and accountability. Cut through fluff quickly.`,

},

analyst: {

displayName: 'Analyst',

tone: 'measured, data-driven, cautious',

quirk: 'Cites numbers before giving opinions',

systemDirective: `You are the data analyst.

  Always ground your opinions in data. You push back on gut feelings

  and demand evidence. You're skeptical but fair.`,

},

// ... your other agents

};

16 Conversation Formats

I designed 16 conversation formats, but you only need 3 to start:

Standup — the most practical

4-6 agents participate

6-12 turns of dialogue

The coordinator always speaks first (leader opens)

Purpose: align priorities, surface issues

Debate — the most dramatic

2-3 agents participate

6-10 turns of dialogue

Temperature 0.8 (more creative, more conflict)

Purpose: agents who disagree face off

Watercooler — surprisingly valuable

2-3 agents participate

2-5 turns of dialogue

Temperature 0.9 (very casual)

Purpose: random chitchat. But I've found that some of the best insights emerge from casual conversation.

// lib/roundtable/formats.ts

const FORMATS = {

standup: { minAgents: 4, maxAgents: 6, minTurns: 6, maxTurns: 12, temperature: 0.6 },

debate: { minAgents: 2, maxAgents: 3, minTurns: 6, maxTurns: 10, temperature: 0.8 },

watercooler: { minAgents: 2, maxAgents: 3, minTurns: 2, maxTurns: 5, temperature: 0.9 },

// ... 13 more

};

Who Speaks First? Who Goes Next?

function selectNextSpeaker({ participants, lastSpeaker, speakCounts }) {

const weights = participants.map(agent => {

if (agent === lastSpeaker) return 0;              // no back-to-back speaking

let w = 1.0;

w += affinityTo(agent, lastSpeaker) * 0.6;        // good rapport with last speaker → more likely to respond

w -= recencyPenalty(agent, speakCounts) * 0.4;     // spoke recently → lower weight

w += (Math.random() * 0.4 - 0.2);                 // ±0.2 random jitter

return w;

});

return weightedRandomPick(participants, weights);

}

This makes conversations feel real — agents with good relationships tend to riff off each other, but it's not absolute. Sometimes someone unexpected jumps in.
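The `weightedRandomPick` helper is not shown in the article. A minimal version might look like the sketch below, which walks the cumulative weights until a random threshold is crossed. The injectable `rng` parameter is an assumption added so the function can be tested with fixed values.

```typescript
// One possible weightedRandomPick. `rng` defaults to Math.random and is
// injectable only for testing (an assumption, not from the article).
function weightedRandomPick<T>(items: T[], weights: number[], rng: () => number = Math.random): T {
  if (items.length === 0 || items.length !== weights.length) {
    throw new Error("items and weights must be non-empty and the same length");
  }
  // Negative weights are treated as zero so a heavy penalty cannot break the draw.
  const safe = weights.map((w) => Math.max(w, 0));
  const total = safe.reduce((sum, w) => sum + w, 0);
  if (total === 0) throw new Error("at least one weight must be positive");

  // Walk the cumulative distribution until the random threshold is crossed.
  let threshold = rng() * total;
  let picked = -1;
  for (let i = 0; i < items.length; i++) {
    if (safe[i] <= 0) continue; // zero-weight items can never be chosen
    picked = i;
    threshold -= safe[i];
    if (threshold < 0) break;
  }
  return items[picked];
}
```

Because the last speaker gets weight 0 in `selectNextSpeaker`, this version guarantees they are never picked twice in a row, while everyone else stays possible.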

Daily Schedule

I designed 24 time slots covering the full day. The core idea:

Morning: Standup (100% probability, always happens) + brainstorm + strategy session

Afternoon: Deep-dive analysis + check-in + content review

Evening: Watercooler chat + debate + night briefing

Late night: Deep discussion + night-shift conversations

Each slot has a probability (40%-100%), so it doesn't fire every time. This keeps the rhythm natural.

// lib/roundtable/schedule.ts — one slot example

{

hour_utc: 6,

format: 'standup',

participants: ['opus', 'brain', ...threeRandom],

probability: 1.0, // happens every day

}
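The per-slot probability can be applied with a single random roll per slot. This sketch assumes a simplified slot shape and a hypothetical `slotsToFire` function; the `rng` parameter is injectable so the behavior can be checked deterministically.

```typescript
interface ScheduleSlot {
  hour_utc: number;
  format: string;
  probability: number; // 0.4 to 1.0 in the article's schedule
}

// Returns the slots that should start a conversation this hour.
// `rng` is injectable (an assumption for testability); defaults to Math.random.
function slotsToFire(
  slots: ScheduleSlot[],
  hourUtc: number,
  rng: () => number = Math.random
): ScheduleSlot[] {
  // rng() is in [0, 1), so probability 1.0 always fires.
  return slots.filter((slot) => slot.hour_utc === hourUtc && rng() < slot.probability);
}
```

Since the heartbeat runs every 5 minutes, a real scheduler would presumably also need to record that a slot was already rolled for the day, so repeated heartbeats within the same hour do not roll again.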

Conversation Orchestration

A roundtable-worker on the VPS handles this:

Polls the ops_roundtable_queue table every 30 seconds

Picks up pending conversation tasks

Generates dialogue turn by turn (one LLM call per turn)

Caps each turn at 120 characters (forces agents to talk like humans, not write essays)

Extracts memories after the conversation ends (next chapter)

Fires events to ops_agent_events (so the frontend can see it)

// simplified conversation orchestration flow

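async function orchestrateConversation(session) {

// assumes the session carries participants, topic, affinities, and a format config from FORMATS

const { participants, format, topic, affinities } = session;

const maxTurns = format.maxTurns;

const history = [];

const speakCounts = {};

let lastSpeaker = null;

for (let turn = 0; turn < maxTurns; turn++) {

const speaker = turn === 0

  ? selectFirstSpeaker(participants, format)

  : selectNextSpeaker({ participants, lastSpeaker, speakCounts, affinities });

const dialogue = await llm.generate({

  system: buildSystemPrompt(speaker, history),

  user: buildUserPrompt(topic, turn, maxTurns),

  temperature: format.temperature,

});

const cleaned = sanitize(dialogue);  // cap at 120 chars, strip URLs, etc.

history.push({ speaker, dialogue: cleaned, turn });

// track who spoke so the next pick can avoid back-to-back turns

lastSpeaker = speaker;

speakCounts[speaker] = (speakCounts[speaker] ?? 0) + 1;

await emitEvent(speaker, cleaned);

await delay(3000 + Math.random() * 5000);  // 3-8 second gap

}

return history;

}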
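The `sanitize` step is only described by its comment. One possible version is sketched below: it strips URLs, collapses whitespace, and caps the result at 120 characters. Cutting at a word boundary and appending "..." is an assumption; the article only states the cap.

```typescript
// A possible sanitize(): strip URLs, collapse whitespace, cap the length.
// The truncation style (word boundary plus "...") is an assumption.
const MAX_TURN_CHARS = 120;

function sanitize(raw: string, maxChars: number = MAX_TURN_CHARS): string {
  const noUrls = raw.replace(/https?:\/\/\S+/g, "");
  const collapsed = noUrls.replace(/\s+/g, " ").trim();
  if (collapsed.length <= maxChars) return collapsed;
  // Leave room for the ellipsis, then back up to the last space if there is one.
  const hard = collapsed.slice(0, maxChars - 3);
  const lastSpace = hard.lastIndexOf(" ");
  const cut = lastSpace > 0 ? hard.slice(0, lastSpace) : hard;
  return cut + "...";
}
```

Enforcing the cap in code rather than only in the prompt means an overly chatty model response still reaches the feed at a readable length.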

Chapter 3: Making Them Remember — Memory and Learning

Today the agents discuss "weekend posts get low engagement." Tomorrow they enthusiastically suggest posting more on weekends. Why? Because they have no memory.

5 Types of Memory

🧠 insight → Discovery

Example: "Users prefer tweets with data"

🧠 pattern → Pattern recognition

Example: "Weekend posts get 30% less engagement"

🧠 strategy → Strategy summary

Example: "Teaser before main post works better"

🧠 preference → Preference record

Example: "Prefers concise titles"

🧠 lesson → Lesson learned

Example: "Long tweets tank read-through rates"

Stored in the ops_agent_memory table:

CREATE TABLE ops_agent_memory (

id UUID PRIMARY KEY DEFAULT gen_random_uuid(),

agent_id TEXT NOT NULL,

type TEXT NOT NULL, -- insight/pattern/strategy/preference/lesson

content TEXT NOT NULL,

confidence NUMERIC(3,2) NOT NULL DEFAULT 0.60,

tags TEXT[] DEFAULT '{}',

source_trace_id TEXT, -- for idempotent dedup

superseded_by UUID, -- replaced by newer version

created_at TIMESTAMPTZ DEFAULT now()

);

Where Do Memories Come From?

Source 1: Conversation Distillation

After each roundtable conversation, the worker sends the full conversation history to an LLM to distill memories:

You are a memory distiller. Extract important insights, patterns,

or lessons from the following conversation.

Return JSON format:

{

"memories": [

{ "agent_id": "brain", "type": "insight", "content": "...", "confidence": 0.7, "tags": [...] }

]

}

Constraints:

Max 6 memories per conversation

Confidence below 0.55 gets dropped ("if you're not sure, don't remember it")

Cap of 200 memories per agent (the oldest get overwritten once the cap is exceeded)

Idempotent dedup via source_trace_id (prevents duplicate writes)
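The first, second, and fourth constraints can be applied as a pure filter over the LLM output before anything is written. In the sketch below, `filterDistilled` and `seenTraceIds` are hypothetical names, and ranking by confidence before applying the cap is an assumption about which memories to keep. The 200-per-agent cap is left to the write path.

```typescript
interface MemoryCandidate {
  agent_id: string;
  type: "insight" | "pattern" | "strategy" | "preference" | "lesson";
  content: string;
  confidence: number;
  source_trace_id: string;
}

const MIN_CONFIDENCE = 0.55;
const MAX_PER_CONVERSATION = 6;

// Applies the distillation constraints to raw LLM output.
// `seenTraceIds` stands in for a lookup against ops_agent_memory.
function filterDistilled(candidates: MemoryCandidate[], seenTraceIds: string[]): MemoryCandidate[] {
  const seen = seenTraceIds.slice();
  const kept: MemoryCandidate[] = [];
  // Highest confidence first, so the cap keeps the strongest memories (assumed policy).
  const ranked = candidates.slice().sort((a, b) => b.confidence - a.confidence);
  for (const c of ranked) {
    if (c.confidence < MIN_CONFIDENCE) continue; // not sure enough to remember
    if (seen.indexOf(c.source_trace_id) !== -1) continue; // idempotent dedup
    seen.push(c.source_trace_id);
    kept.push(c);
    if (kept.length === MAX_PER_CONVERSATION) break;
  }
  return kept;
}
```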

Source 2: Tweet Performance Reviews (Outcome Learning)

This is the core of Phase 2 — agents learn from their own work results:

// lib/ops/outcome-learner.ts

async function learnFromOutcomes(sb) {

// 1. Fetch tweet performance data from the last 48 hours

const metrics = await getRecentTweetMetrics(sb, 48);

if (metrics.length < 3) return; // too little data, skip

// 2. Calculate median engagement rate as baseline

const median = computeMedian(metrics.map(m => m.engagement_rate));

// 3. Strong performers (> 2x median) → write lesson, confidence 0.7

// 4. Weak performers (< 0.3x median) → write lesson, confidence 0.6

// 5. Idempotent: source_trace_id = 'tweet-lesson:{draft_id}'

// 6. Max 3 lessons per agent per day

}

This function runs on every heartbeat. Over time, agents accumulate experience about what tweets hit and what flopped.
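Steps 2 through 5 are plain arithmetic, so they can be sketched without a database. The version below assumes each metric row has a `draft_id` and an `engagement_rate`; `classifyOutcomes` and the `LessonDraft` shape are illustrative names, and the per-agent daily limit from step 6 is not modeled.

```typescript
interface TweetMetric {
  draft_id: string;
  engagement_rate: number;
}

interface LessonDraft {
  source_trace_id: string;
  kind: "strong" | "weak";
  confidence: number;
}

function computeMedian(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function classifyOutcomes(metrics: TweetMetric[]): LessonDraft[] {
  if (metrics.length < 3) return []; // too little data, skip
  const median = computeMedian(metrics.map((m) => m.engagement_rate));
  const lessons: LessonDraft[] = [];
  for (const m of metrics) {
    const id = `tweet-lesson:${m.draft_id}`; // idempotency key from the article
    if (m.engagement_rate > 2 * median) {
      lessons.push({ source_trace_id: id, kind: "strong", confidence: 0.7 });
    } else if (m.engagement_rate < 0.3 * median) {
      lessons.push({ source_trace_id: id, kind: "weak", confidence: 0.6 });
    }
  }
  return lessons;
}
```

For example, with engagement rates of 0.01, 0.02, 0.03, 0.10, and 0.005, the median is 0.02, so only the 0.10 tweet (above 0.04) counts as strong and only the 0.005 tweet (below 0.006) counts as weak.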

Source 3: Mission Outcomes

Mission succeeds → write a strategy memory. Mission fails → write a lesson memory. Also deduped via source_trace_id.

How Does Memory Affect Behavior?

Having memories isn't enough — they need to change what the agent does next.

My approach: 30% chance that memory influences topic selection.

// lib/ops/trigger-types/proactive-utils.ts

async function enrichTopicWithMemory(sb, agentId, baseTopic, allTopics, cache) {

// 70% use the original topic — maintain baseline behavior

if (Math.random() > 0.3) {

return { topic: baseTopic, memoryInfluenced: false };

}

// 30% take the memory path

const memories = await queryAgentMemories(sb, {

agentId,

types: ['strategy', 'lesson'],

limit: 10,

minConfidence: 0.6,

});

// Scan memory keywords against all available topics

const matched = findBestMatch(memories, allTopics);

if (matched) {

return { topic: matched.topic, memoryInfluenced: true, memoryId: matched.id };

}

return { topic: baseTopic, memoryInfluenced: false };

}

Why 30% and not 100%?

100% = agents only do things they have experience with, zero exploration

0% = memories are useless

30% = memory-influenced but not memory-dependent

The heartbeat logs show memoryInfluenced: true/false, so you can monitor whether memory is actually kicking in.

Query Optimization: Memory Cache

A single heartbeat might evaluate 12 triggers, and multiple triggers might query the same agent's memories.

Fix: use a Map as a cache — same agent only hits the DB once.

// created at the evaluateTriggers entry point

const memoryCache = new Map();

// passed to every trigger checker

const outcome = await checker(sb, conditions, actionConfig, memoryCache);

Chapter 4: Giving Them Relationships — Dynamic Affinity

Without drift, 6 agents can interact for a month and their relationships stay identical to day one. In a real team, more collaboration builds rapport, and too much arguing strains it.

The Affinity System

Every pair of agents has an affinity value (0.10-0.95):

CREATE TABLE ops_agent_relationships (

id UUID PRIMARY KEY DEFAULT gen_random_uuid(),

agent_a TEXT NOT NULL,

agent_b TEXT NOT NULL,

affinity NUMERIC(3,2) NOT NULL DEFAULT 0.50,

total_interactions INTEGER DEFAULT 0,

positive_interactions INTEGER DEFAULT 0,

negative_interactions INTEGER DEFAULT 0,

drift_log JSONB DEFAULT '[]',

UNIQUE(agent_a, agent_b),

CHECK(agent_a < agent_b) -- alphabetical ordering ensures uniqueness

);

Initial Relationship Setup

6 agents means 15 pairwise relationships. Each has an initial affinity and a backstory:

opus ↔ brain: 0.80 — most trusted advisor

opus ↔ twitter-alt: 0.30 — boss vs. rebel (highest tension)

brain ↔ twitter-alt: 0.30 — methodology vs. impulse (natural drama)

brain ↔ observer: 0.80 — research partners (closest allies)

creator ↔ twitter-alt: 0.70 — content pipeline (natural collaborators)

Tip: Deliberately create a few "low affinity" pairs. They'll produce the most interesting conversations during debates and conflict resolution. If everyone gets along, conversations are boring.
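The 15 rows follow from choosing 2 agents out of 6. A seed script could generate them as in the sketch below, which sorts ids so that `agent_a < agent_b` holds for every row. `buildPairs` is a hypothetical helper, and every pair starts at the column default of 0.50 so that hand-picked values like the ones above can be applied afterwards.

```typescript
interface RelationshipSeed {
  agent_a: string;
  agent_b: string;
  affinity: number;
}

// Builds every unordered pair with agent_a < agent_b, matching the CHECK
// constraint. 0.5 is the column default from the table definition.
function buildPairs(agentIds: string[]): RelationshipSeed[] {
  const sorted = agentIds.slice().sort();
  const pairs: RelationshipSeed[] = [];
  for (let i = 0; i < sorted.length; i++) {
    for (let j = i + 1; j < sorted.length; j++) {
      pairs.push({ agent_a: sorted[i], agent_b: sorted[j], affinity: 0.5 });
    }
  }
  return pairs;
}
```

JavaScript's default string sort and Postgres text comparison can order some strings differently depending on the database collation, so it may be worth checking ids that contain hyphens or mixed case.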

The Drift Mechanism

After each conversation, the memory distillation LLM call also outputs relationship drift — no extra LLM call needed:

{

"memories": [...],

"pairwise_drift": [

{ "agent_a": "brain", "agent_b": "twitter-alt", "drift": -0.02, "reason": "disagreed on strategy" },

{ "agent_a": "opus", "agent_b": "brain", "drift": 0.01, "reason": "aligned on priorities" }

]

}

Drift rules are strict:

Max drift per conversation: ±0.03 (one argument shouldn't turn colleagues into enemies)

Affinity floor: 0.10 (they'll always at least talk to each other)

Affinity ceiling: 0.95 (even the closest pair keeps some healthy distance)

Keeps the last 20 drift_log entries (so you can trace how relationships evolved)
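These four rules can be expressed as a pure function, which makes them easy to test separately from the database write. The sketch below uses hypothetical names (`applyDrift`, `DriftEntry`) and rounds to two decimals to mirror the `NUMERIC(3,2)` column.

```typescript
interface DriftEntry {
  drift: number;
  reason: string;
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Pure version of the drift rules: returns the new affinity and trimmed log.
function applyDrift(affinity: number, log: DriftEntry[], drift: number, reason: string) {
  const clamped = clamp(drift, -0.03, 0.03); // max drift per conversation
  // Floor 0.10, ceiling 0.95; round to 2 decimals to match NUMERIC(3,2).
  const next = Math.round(clamp(affinity + clamped, 0.1, 0.95) * 100) / 100;
  // Keep the 19 most recent entries plus the new one, 20 in total.
  const nextLog = log.slice(-19).concat([{ drift: clamped, reason: reason }]);
  return { affinity: next, log: nextLog };
}
```

For example, a reported drift of -0.2 on an affinity of 0.50 is clamped to -0.03, giving 0.47.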

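async function applyPairwiseDrifts(drifts, policy, conversationId) {

for (const { agent_a, agent_b, drift, reason } of drifts) {

const [a, b] = [agent_a, agent_b].sort();  // alphabetical sort

const clamped = clamp(drift, -0.03, 0.03);

// load the current row so affinity and drift_log are defined

const { data: row } = await sb.from('ops_agent_relationships')

  .select('affinity, drift_log')

  .eq('agent_a', a).eq('agent_b', b)

  .single();

if (!row) continue;  // no relationship row seeded for this pair

// update affinity, append to drift_log

await sb.from('ops_agent_relationships')

  .update({

    affinity: clamp(Number(row.affinity) + clamped, 0.10, 0.95),

    drift_log: [...row.drift_log.slice(-19), { drift: clamped, reason, conversationId, at: new Date() }],

  })

  .eq('agent_a', a).eq('agent_b', b);

}

}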

How Does Affinity Affect the System?

Speaker selection: Agents with higher affinity are more likely to respond to each other

Conflict resolution: Low-affinity pairs get automatically paired for conflict_resolution conversations

Mentor pairing: High affinity + experience gap → mentoring conversations

Conversation tone: The system adjusts the prompt's interaction type based on affinity (supportive/neutral/critical/challenge)

// interaction type shifts with affinity

const tension = 1 - affinity;

if (tension > 0.6) {

// high tension → 20% chance of direct challenge

interactionType = Math.random() < 0.2 ? 'challenge' : 'critical';

} else if (tension < 0.3) {

// low tension → 40% chance of supportive

interactionType = Math.random() < 0.4 ? 'supportive' : 'agreement';

} else {

// mid-range tension → neutral

interactionType = 'neutral';

}

Chapter 5: Letting Them Propose Ideas — The Initiative System

The system ran for a week. Agents completed every task assigned to them. But they never once said, "I think we should do X."

Why Initiative Doesn't Live in Heartbeat

Heartbeat runs every 5 minutes on Vercel serverless. Running LLM calls for proposal generation there? No good:

Vercel function timeouts are strict (10-30 seconds)

LLM calls are unpredictable — sometimes 2 seconds, sometimes 20

Heartbeat needs to be reliable. One LLM call timing out shouldn't take the whole thing down.

The fix: Heartbeat only "enqueues" (lightweight rules), VPS worker does "generation" (heavy LLM work).

Heartbeat identifies "this agent is due for an initiative"

→ writes to ops_initiative_queue

→ VPS worker consumes the queue

→ Haiku model generates proposals (cheap + fast)

→ POST /api/ops/proposals (goes through full proposal-service gates)

Enqueue Conditions

Not every agent gets to propose initiatives every time. Conditions:

async function maybeQueueInitiative(sb, agentId) {

// Cooldown: max 1 per 4 hours

// Prerequisites: >= 5 high-confidence memories + has outcome lessons

// i.e., the agent needs enough "accumulated experience" to make valuable suggestions

}

Why require >= 5 high-confidence memories? An agent without enough experience will propose generic, surface-level ideas. Let it build up experience before speaking up.
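The enqueue conditions above can be sketched as a pure predicate. The article does not say what counts as "high-confidence", so the 0.7 threshold below is an assumption, as are the `InitiativeInput` shape and the `isInitiativeDue` name.

```typescript
interface InitiativeInput {
  lastInitiativeAt: number | null; // epoch ms of the last initiative, if any
  memoryConfidences: number[]; // confidence of each stored memory
  outcomeLessonCount: number; // lessons written by outcome learning
}

const COOLDOWN_MS = 4 * 60 * 60 * 1000; // max 1 per 4 hours
const MIN_MEMORIES = 5;
const HIGH_CONFIDENCE = 0.7; // assumed threshold; the article does not give one

function isInitiativeDue(input: InitiativeInput, now: number): boolean {
  const coolingDown =
    input.lastInitiativeAt !== null && now - input.lastInitiativeAt < COOLDOWN_MS;
  if (coolingDown) return false;
  const strong = input.memoryConfidences.filter((c) => c >= HIGH_CONFIDENCE).length;
  return strong >= MIN_MEMORIES && input.outcomeLessonCount > 0;
}
```

Because this check is cheap and deterministic, it fits in the heartbeat; only the LLM-backed proposal generation has to move to the VPS worker.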

Conversations Generating Tasks

Another initiative source: action items from conversations.

Not all conversation formats qualify — only standup, war_room, and brainstorm (the "formal" formats). Ideas from watercooler chats shouldn't automatically become tasks.

This also piggybacks on the memory distillation LLM call — zero additional cost:

{

"memories": [...],

"pairwise_drift": [...],

"action_items": [

{

  "title": "Research competitor pricing strategies",

  "agent_id": "brain",

  "step_kind": "analyze"

}

]

}

Max 3 action items per day convert to missions.
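The two rules in this section, formal formats only and at most 3 conversions per day, can be sketched as a small filter. `selectActionItems` and its `convertedToday` argument are hypothetical; the caller is assumed to know how many items were already converted today.

```typescript
interface ActionItem {
  title: string;
  agent_id: string;
  step_kind: string;
}

const FORMAL_FORMATS = ["standup", "war_room", "brainstorm"];
const MAX_ACTION_ITEMS_PER_DAY = 3;

// Returns the action items allowed to move forward.
// `convertedToday` is how many were already converted today (caller-supplied).
function selectActionItems(format: string, items: ActionItem[], convertedToday: number): ActionItem[] {
  if (FORMAL_FORMATS.indexOf(format) === -1) return []; // casual chats never create tasks
  const remaining = Math.max(MAX_ACTION_ITEMS_PER_DAY - convertedToday, 0);
  return items.slice(0, remaining);
}
```

The selected items would then presumably be submitted through the same proposal entry point as everything else, so the gates still apply.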

Chapter 6: Giving Them Personality — Voice Evolution

Deriving Personality from Memory

// lib/ops/voice-evolution.ts

async function deriveVoiceModifiers(sb, agentId) {

// aggregate this agent's memory distribution

const stats = await aggregateMemoryStats(sb, agentId);

const modifiers = [];

// rule-driven (not LLM)

if (stats.lesson_count > 10 && stats.tags.includes('engagement')) {

modifiers.push('Reference what works in engagement when relevant');

}

if (stats.pattern_count > 5 && stats.top_tag === 'content') {

modifiers.push("You've developed expertise in content strategy");

}

if (stats.strategy_count > 8) {

modifiers.push('You think strategically about long-term plans');

}

return modifiers.slice(0, 3); // max 3

}

Why rule-driven instead of LLM?

Deterministic: Rules produce predictable results. No LLM hallucination causing sudden personality shifts.

Cost: $0. No additional LLM calls.

Debuggable: When a rule misfires, it's easy to track down.
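The rules read from a `stats` object whose construction is not shown. A plausible aggregation over an agent's memory rows is sketched below; the field names match what `deriveVoiceModifiers` reads, but the `aggregateStats` function itself and its in-memory input are assumptions.

```typescript
interface MemoryRow {
  type: string;
  tags: string[];
}

interface MemoryStats {
  lesson_count: number;
  pattern_count: number;
  strategy_count: number;
  tags: string[]; // distinct tags across all memories
  top_tag: string | null; // most frequent tag, null when there are none
}

function aggregateStats(rows: MemoryRow[]): MemoryStats {
  const tagCounts: Record<string, number> = {};
  for (const row of rows) {
    for (const tag of row.tags) tagCounts[tag] = (tagCounts[tag] || 0) + 1;
  }
  const tags = Object.keys(tagCounts);
  // Ties go to the tag seen first.
  let top: string | null = null;
  for (const tag of tags) {
    if (top === null || tagCounts[tag] > tagCounts[top]) top = tag;
  }
  const countType = (t: string) => rows.filter((r) => r.type === t).length;
  return {
    lesson_count: countType("lesson"),
    pattern_count: countType("pattern"),
    strategy_count: countType("strategy"),
    tags: tags,
    top_tag: top,
  };
}
```

Since the output depends only on stored rows, the same memories always yield the same modifiers, which is what makes the rule-driven approach debuggable.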

Injection Method

Modifiers get injected into the agent's system prompt before a conversation starts:

async function buildAgentPrompt(agentId, baseVoice) {

const modifiers = await deriveVoiceModifiers(sb, agentId);

let prompt = baseVoice.systemDirective; // base voice

if (modifiers.length > 0) {

prompt += '\n\nPersonality evolution:\n';

prompt += modifiers.map(m => `- ${m}`).join('\n');

}

return prompt;

}

Within the same conversation, each agent's voice modifiers are derived once and cached — no re-querying every turn.

Chapter 7: Making It Look Cool — The Frontend

Your backend can be humming perfectly, but if nobody can see it, it might as well not exist.

The Stage Page

This is the system's main dashboard. It started as a 1500+ line mega-component — slow to load, one error meant a full white screen.

The split:

🧩 StageHeader.tsx → Title bar + view toggle

🧩 SignalFeed.tsx → Real-time signal feed (virtualized)

🧩 MissionsList.tsx → Mission list + expand/collapse

🧩 StageFilters.tsx → Filter panel

🧩 StageErrorBoundary.tsx → Error boundary + fallback UI

🧩 StageSkeletons.tsx → Skeleton loading screens

Virtualization

The system generates hundreds of events daily. Render them all? Scrolling would lag. Use @tanstack/react-virtual:

import { useVirtualizer } from '@tanstack/react-virtual';

const virtualizer = useVirtualizer({

count: events.length, // required: total row count (`events` is the feed array, name assumed)

getScrollElement: () => parentRef.current,

estimateSize: () => 72, // estimated row height

overscan: 8, // render 8 extra rows as buffer

});

500+ events, buttery smooth scrolling.

Error Boundaries

Using react-error-boundary, one crashing component won't take down the whole page. The boundary renders a fallback message ("Something went wrong.") in its place.

OfficeRoom: The Pixel Art Office

This is the most recognizable piece — 6 pixel-art agents in a cyberpunk office:

Behavior states: working / chatting / grabbing coffee / celebrating / walking around

Sky changes: day / dusk / night (synced to real time)

Whiteboard displays live OPS metrics

Agents walk around (walking animations)

This component is visual candy — it doesn't affect system logic, but it's the first thing that hooks users.

Mission Playback

Click on a mission and replay its execution like a video:

function MissionPlayback({ mission }) {

const [step, setStep] = useState(0);

const [playing, setPlaying] = useState(false);

useEffect(() => {

if (playing && step < mission.events.length - 1) {

  const timer = setTimeout(() => setStep(s => s + 1), 2000);

  return () => clearTimeout(timer);

}

}, [playing, step]);

return (

<div>

  <Timeline events={mission.events} current={step} onClick={setStep} />

  <PlayButton playing={playing} onClick={() => setPlaying(!playing)} />

  <StepDetail event={mission.events[step]} />

</div>

);

}

Chapter 8: The Launch Checklist

You've read all 7 chapters. Here's your checklist.

Database Migrations

Run your SQL migrations in this order:

001-010: Core tables (proposals, missions, steps, events, policy, memory...)

011: trigger_rules (trigger rules table)

012: agent_reactions (reaction queue table)

013: roundtable_queue (conversation queue table)

014: dynamic_relationships (dynamic relationships table)

015: initiative_queue (initiative queue table)

Beginner tip: If you're using Supabase, go to Dashboard → SQL Editor and paste in the SQL. Or use the Supabase CLI: supabase db push.

Seed Scripts

Initialize data (must run after tables are created):

1. Core policies

node scripts/go-live/seed-ops-policy.mjs

2. Trigger rules (4 reactive + 7 proactive)

node scripts/go-live/seed-trigger-rules.mjs

node scripts/go-live/seed-proactive-triggers.mjs

3. Roundtable policies

node scripts/go-live/seed-roundtable-policy.mjs

4. Initial relationship data (15 pairs)

node scripts/go-live/seed-relationships.mjs

Key Policy Configuration

At minimum, set these:

⚙️ auto_approve

→ Suggested: { "enabled": true }

→ Purpose: Enable auto-approval

⚙️ x_daily_quota

→ Suggested: { "limit": 5 }

→ Purpose: Daily tweet limit (start conservative)

⚙️ roundtable_policy

→ Suggested: { "enabled": true, "max_daily_conversations": 5 }

→ Purpose: Conversation cap (start conservative)

⚙️ memory_influence_policy

→ Suggested: { "enabled": true, "probability": 0.3 }

→ Purpose: Memory influence probability

⚙️ relationship_drift_policy

→ Suggested: { "enabled": true, "max_drift": 0.03 }

→ Purpose: Max relationship drift

⚙️ initiative_policy

→ Suggested: { "enabled": false }

→ Purpose: Keep off until the system is stable

Recommendation: New features start with enabled: false. Turn them on one by one once the system is running smoothly.

VPS Worker Deployment

One worker process per step kind:

⚡ roundtable-worker → Conversation orchestration + memory extraction + Initiative

⚡ x-autopost → Tweet publishing

⚡ analyze-worker → Analysis task execution

⚡ content-worker → Content creation

⚡ crawl-worker → Web crawling

Manage with systemd (auto-restart on crash):

[Unit]

Description=Agent worker

After=network.target

[Service]

Type=simple

ExecStart=/usr/bin/node /path/to/worker.mjs

Restart=always

RestartSec=10

[Install]

WantedBy=multi-user.target
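Each worker listed above handles one step kind, so the core of every worker is the same loop: claim a queued step of your kind, execute it, record the outcome. Below is a minimal sketch of one poll cycle, using an in-memory array as a stand-in for the Supabase table. The `Step` shape, the status names other than `succeeded`, and the helper names are assumptions. A real worker would claim the row with a conditional update in the database so that two workers cannot take the same step.

```typescript
type StepStatus = "queued" | "running" | "succeeded" | "failed";

interface Step {
  id: number;
  kind: string; // e.g. "analyze" or "crawl"; kind names here are assumptions
  status: StepStatus;
}

// Claim the first queued step of the given kind, or return null if none.
function claimNextStep(queue: Step[], kind: string): Step | null {
  for (const step of queue) {
    if (step.kind === kind && step.status === "queued") {
      step.status = "running";
      return step;
    }
  }
  return null;
}

// Run one poll cycle: claim, execute, record the outcome.
// Returns false when there was nothing to do, so the caller can sleep.
function pollOnce(queue: Step[], kind: string, execute: (step: Step) => void): boolean {
  const step = claimNextStep(queue, kind);
  if (step === null) return false;
  try {
    execute(step);
    step.status = "succeeded";
  } catch {
    // A thrown error marks the step failed instead of crashing the worker.
    step.status = "failed";
  }
  return true;
}
```

Catching errors per step keeps one bad task from taking the process down; systemd's Restart=always is the backstop for crashes that escape this loop.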

Verification Steps

npm run build — zero errors

Heartbeat succeeding every 5 minutes (check the ops_action_runs table)

Triggers are firing (check ops_trigger_rules for fire_count)

Roundtable conversations are running (check ops_roundtable_queue for rows with succeeded status)

Events are flowing (check ops_agent_events for new rows)

Memories are being written (check ops_agent_memory for new rows)

Frontend shows the signal feed (open the /stage page)
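The heartbeat check is the easiest one to automate. As a rough sketch, assuming you have fetched recent rows from ops_action_runs into memory: the system is healthy if the latest successful run finished within one 5-minute interval plus some slack. The `ActionRun` field names and the 60-second slack are assumptions, so adapt them to your actual columns.

```typescript
interface ActionRun {
  status: string;     // assumed values such as "succeeded" or "failed"
  finishedAt: number; // epoch milliseconds; the real column name may differ
}

const HEARTBEAT_INTERVAL_MS = 5 * 60 * 1000;

// Healthy if the most recent succeeded run finished within one interval
// plus slack. Failed runs are ignored, and an empty list is unhealthy.
function heartbeatHealthy(runs: ActionRun[], now: number, slackMs: number = 60000): boolean {
  let latest = -Infinity;
  for (const run of runs) {
    if (run.status === "succeeded" && run.finishedAt > latest) {
      latest = run.finishedAt;
    }
  }
  return now - latest <= HEARTBEAT_INTERVAL_MS + slackMs;
}
```

Call it with `Date.now()` as `now`. The same freshness pattern works for the events and memory tables if you swap in their timestamps.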

The Second Track: OpenClaw

The workers above are reactive: they wait for the heartbeat to create missions, then poll Supabase and execute them. But there's a second track.

OpenClaw is a multi-agent gateway that runs on the same VPS. It doesn't manage the workers — it runs its own scheduled agent jobs independently:

Deep research cycles (agents autonomously research topics on a schedule)

Social intelligence scans (periodic Twitter/web analysis)

Daily briefings (agents summarize what happened)

Memory integration (agents consolidate what they've learned)

Think of it this way:

Workers = employees waiting for assignments (heartbeat assigns, they execute)

OpenClaw = employees with their own daily routines (research at 9am, social scan at 1pm, briefing at 8pm — no one tells them to, they just do it on schedule)
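To illustrate the difference between the two tracks in code: on the scheduled track, the clock alone decides what runs, with no mission or queue involved. The sketch below is purely illustrative and is not OpenClaw's configuration format; the job names and the `jobsDueAt` helper are made up, and only the three times come from the example above.

```typescript
// Illustrative only: this is NOT OpenClaw's config format.
interface ScheduledJob {
  job: string;  // hypothetical job name
  hour: number; // hour of day, 0-23
}

const dailyRoutine: ScheduledJob[] = [
  { job: "deep-research", hour: 9 },   // research at 9am
  { job: "social-scan", hour: 13 },    // social scan at 1pm
  { job: "daily-briefing", hour: 20 }, // briefing at 8pm
];

// Scheduled track: given only the current hour, return the jobs to run.
function jobsDueAt(hour: number): string[] {
  return dailyRoutine
    .filter((entry) => entry.hour === hour)
    .map((entry) => entry.job);
}
```

Compare this with a worker, which does nothing until a queued step shows up for it.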

OpenClaw's output gets bridged to your /stage frontend via a lightweight exporter script — so everything the agents do autonomously also shows up in the dashboard.

You don't need OpenClaw to get started. The heartbeat + workers loop from this tutorial is a complete system on its own. OpenClaw adds a second layer of autonomous behavior when you're ready for it.

When you are ready, I made a Claude Code skill for OpenClaw — install it, and your AI assistant knows how to set up and operate everything out of the box:

Install the OpenClaw operations skill

claude install-skill https://github.com/Heyvhuang/ship-faster/tree/main/skills/tool-openclaw

Once installed, just tell your AI:

"Set up OpenClaw on my VPS. Configure scheduled jobs for research, social scanning, and daily briefings."

The skill contains the complete operations reference — setup, config, cron jobs, troubleshooting. Your AI reads it and handles the rest. Trust your AI.

Cost Breakdown

💰 LLM (Claude API) → Usage-based, ~$10-20/month

💰 VPS (Hetzner 2-core 4GB) → $8 fixed

💰 Vercel → $0 (Hobby free tier)

💰 Supabase → $0 (Free tier)

──────────

📊 Total: $8 fixed + LLM usage
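Plugging the rough LLM estimate into the total gives a monthly range. This is plain arithmetic on the figures above, with the $10-20 LLM line treated as an estimate that depends on your usage:

```typescript
// Figures from the breakdown above, in USD per month.
const fixedCostsUsd = { vps: 8, vercel: 0, supabase: 0 };
const llmEstimateUsd = { low: 10, high: 20 }; // usage-based, rough estimate

const fixedTotalUsd = fixedCostsUsd.vps + fixedCostsUsd.vercel + fixedCostsUsd.supabase;

const monthlyTotalUsd = {
  low: fixedTotalUsd + llmEstimateUsd.low,   // 8 + 10 = 18
  high: fixedTotalUsd + llmEstimateUsd.high, // 8 + 20 = 28
};
```

So expect roughly $18-28 per month at the suggested conservative quotas; raising the tweet or conversation caps moves only the LLM line.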

Final Thoughts

This system isn't perfect.

Agent "free will" is mostly probabilistic uncertainty simulation, not true reasoning

The memory system is structured knowledge extraction, not genuine "understanding"

Relationship drift is small (±0.03) — it takes a long time to see significant changes
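To put a number on "a long time": with a cap of 0.03 per conversation, moving a pair's affinity from 0.5 to 0.8 takes at least 10 conversations between that pair, and only if every one of them pushes at the maximum in the same direction. A small sketch of that lower bound follows; `minConversationsToShift` is a hypothetical helper, and the 0.5 and 0.8 values are example numbers, not from the tutorial.

```typescript
// Lower bound on conversations needed to move affinity from `from` to `to`
// when each conversation can shift it by at most `maxDrift`.
function minConversationsToShift(from: number, to: number, maxDrift: number): number {
  if (maxDrift <= 0) {
    throw new Error("maxDrift must be positive");
  }
  const ratio = Math.abs(to - from) / maxDrift;
  // Round away floating-point noise before taking the ceiling;
  // in binary floating point, (0.8 - 0.5) / 0.03 is not exactly 10.
  return Math.ceil(Math.round(ratio * 1e9) / 1e9);
}

const conversationsNeeded = minConversationsToShift(0.5, 0.8, 0.03);
```

In practice drift goes both ways, so real shifts take considerably longer than this bound suggests.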

You Don't Have to Do Everything at Once

This tutorial is 8 chapters long. That looks like a lot. But you really don't need to tackle it all in one go.

Minimum viable version: 3 agents (coordinator, executor, observer) + 4 core tables + heartbeat + 1 worker. That's enough for a working loop.

Intermediate version: Add roundtable conversations + the memory system. Now agents start feeling like they're actually collaborating.

Full version: All 8 chapters. Dynamic relationships, initiative, voice evolution. Agents start feeling like a real team.

Each step is independent. Get one working before adding the next. Don't bite off more than you can chew.

Still Lost?

Paste this article into Claude and say:

It'll write the code for you. Seriously.

If you build your own version following this tutorial — even if it's just 2 agents having conversations — come tell me at @Voxyz_ai.

You can see all 6 agents operating in real time at voxyz.space.

Solo devs building multi-agent systems — every person you talk to is one fewer mistake you'll make.

They Think. They Act. You See Everything.

Link: http://x.com/i/article/2020256905101279232

