🧠 阿头学 · 💰 Investing

Using Prediction Market Data Like a Hedge Fund

A hedge fund's real edge in prediction markets is not forecasting accuracy but risk management, timing strategy, and structural positioning, all of which can be learned from a public dataset of 400 million trades.
2026-03-01

Key Takeaways

  • The edge is process, not information. Institutions win not because their forecasts are more accurate, but because they use empirical Kelly plus Monte Carlo to fold the uncertainty of their edge into position sizing, avoiding systematic overbetting. Retail treats a 6% edge as ground truth; institutions treat it as a distribution with 30%-90% variability, and that is the dividing line between long-term compounding and blowing up.
  • Longshot bias is a measurable structural arbitrage. The data show that takers on 1-cent contracts lose as much as 57% because people pay an enormous premium for "1% of hope". This is not random: it shows up at 80 of 99 price levels, evidence that wealth flows systematically from impatient takers to patient makers.
  • A calibration surface over time. Institutions track not just price-level mispricing but how it evolves as resolution approaches. Early on, retail sentiment dominates and the bias is largest; mid-period, information accumulates and efficiency improves; near resolution a reversal may appear. This still needs validation on the full dataset, but the logic fits behavioral finance.
  • Makers structurally beat takers. The data show that maker excess returns come from structure rather than forecasting skill (returns are nearly symmetric across YES and NO), while the takers' negative expectation is a psychological tax paid for certainty. This is an iron law of market microstructure, not a matter of individual ability.
  • Prediction markets are a laboratory, not a casino. Hedge funds use these 400 million trades to calibrate risk models, identify systematic biases, and test structural edges, then transfer those patterns to billions of dollars in traditional-market positions. The dataset's value lies in being resolvable, replayable, and fully labeled.

What This Means for Us

  • For ATou: the "empirical Kelly + Monte Carlo + uncertainty discount" framework transfers directly to any repeated-game decision (investing, ad spend, growth experiments). Next step: treat past projects as a trade log, filter for comparable cases, build an empirical return distribution, and size budget shares by the 95th-percentile drawdown instead of by gut feel.
  • For Neta: users paying a premium for hope on very-low-probability events is an exploitable structural bias. In pricing, crowdfunding, and presales, the calibration-surface method can identify which price-time combinations users systematically over- or under-estimate, and strategy can be adjusted accordingly. The key is moving from pricing by intuition to pricing on the crowd's systematic bias.
  • For Uota: prediction market data reveal a general law: the side that is eager to trade pays a premium over time, while the patient side harvests structural returns. This maps onto any maker-versus-taker situation (team management, product design, market competition). Next step: design systems that let more people work in maker mode (clear responsibilities, safe room to experiment, a steady cadence) instead of passively becoming takers of every urgent request.
  • For decision-making in general: the article's pivotal idea is that uncertainty itself can be quantified and priced. In trading, startups, or life choices, asking "how uncertain am I about this judgment" and discounting your stake accordingly will keep you alive far longer than the instinct of betting bigger the more confident you feel.

Discussion Prompts

1. How would the empirical-Kelly "uncertainty discount" work concretely in your business? Is there a way to treat historical decisions as samples and build a real return distribution instead of assuming normality?

2. The article says institutions use prediction markets as a laboratory rather than a casino. For individuals or small teams without billions in traditional-market positions, what is this dataset actually worth: learning the methodology, or genuinely finding tradeable alpha?

3. The maker-versus-taker structural edge is confirmed by data in prediction markets, but does it hold up in real business settings (team management, product pricing), or does it disappear as the environment changes?


I'm going to break down exactly how hedge funds use prediction market data to build trading strategies and extract alpha that retail misses. I'll also share a dataset of 400m+ trades going back to 2020.

Let's get straight to it.

Bookmark This - I'm Roan, a backend developer working on system design, HFT-style execution, and quantitative trading systems. My work focuses on how prediction markets actually behave under load. For suggestions, thoughtful collaborations, or partnerships, my DMs are open.


The Dataset That Just Went Public

@beckerrjon released the largest publicly available prediction market dataset: 400 million+ trades from Polymarket and Kalshi going back to 2020. Complete market metadata, granular trade data, resolution outcomes, stored as Parquet files.

This is tick level data. Every trade has timestamp, price, volume, taker direction. The same granularity institutional data vendors charge $100K+ annually for in traditional markets.

**Now it's open source. This is massive.**

Before I break down what hedge funds are doing with this data, let me show you how to actually get it set up. Because unlike most articles that just talk theory, I'm giving you the exact steps to access institutional grade data yourself.


**How to Set Up the Dataset (Step by Step)**

**Prerequisites:**

  • Python 3.9 or higher installed

  • 40GB free disk space

  • Command line access (Terminal on Mac/Linux, PowerShell on Windows)

Step 1: Install uv (Dependency Manager)

# On Mac/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows (PowerShell)
irm https://astral.sh/uv/install.ps1 | iex

Step 2: Clone the Repository

git clone https://github.com/Jon-Becker/prediction-market-analysis
cd prediction-market-analysis

Step 3: Install Dependencies

uv sync

This installs DuckDB, Pandas, Matplotlib and other analysis tools.

Step 4: Download the Dataset

make setup

This downloads data.tar.zst (36GB compressed) from Cloudflare R2 and extracts it to the data/ directory. Extraction takes 5 to 30 minutes depending on your system (for me it took longer than expected).

Step 5: Verify the Data

ls data/polymarket/trades/
ls data/kalshi/trades/

You should see hundreds of Parquet files containing trade data.

Congrats. You now have the same dataset hedge funds are analyzing.

The data is organized like this:

data/
├── polymarket/
│   ├── markets/               # Market metadata (titles, outcomes, status)
│   └── trades/                # Every trade (price, volume, timestamp)
└── kalshi/
    ├── markets/               # Same structure for Kalshi
    └── trades/

Each trade file is a Parquet file. What is a Parquet file? Parquet is a columnar storage format that lets you query billions of rows without loading everything into memory.

Now that you have it set up, let me show you what institutions are actually doing with this data.



How Hedge Funds Actually Use This Data

You think prediction markets are for betting on outcomes.

You're wrong.

Hedge funds use prediction market data as a laboratory for three things: empirical risk calibration, systematic bias detection and order flow analysis. The prediction market isn't where they deploy capital. It's where they extract patterns that inform billions in traditional market positions.

Here's exactly what they're doing with 400 million trades.


Method 1: Empirical Kelly Criterion with Monte Carlo Uncertainty Quantification

The Kelly Criterion is the foundation of quantitative position sizing. Every institutional trader knows the formula:

f* = (p × b - q) / b

Where f* is the optimal fraction of capital to deploy, p is win probability, q is loss probability, and b represents the odds.
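The formula drops straight into code. A minimal generic sketch (not from the dataset's repository; the numbers are illustrative):

```python
def kelly_fraction(p: float, b: float) -> float:
    """Textbook Kelly: f* = (p * b - q) / b, with q = 1 - p."""
    q = 1.0 - p
    return (p * b - q) / b

# A 55% win probability at even odds (b = 1) suggests staking 10% of capital.
stake = kelly_fraction(0.55, 1.0)
```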

The problem with textbook Kelly: it assumes you know your edge with certainty.

Reality breaks this assumption immediately.

When your model estimates 6% edge on a trade, that's not ground truth. It's a point estimate with uncertainty. The true edge might be 3%. It might be 9%. You have a distribution, not a number.

Standard Kelly treats that 6% as fact. This is mathematically incorrect and leads to systematic overbetting.

Empirical Kelly solves this by incorporating uncertainty directly into the sizing calculation.

Here's how they implement it using the Becker dataset:

Phase 1: Historical Trade Extraction

Funds define their strategy criteria in precise terms. For example: "Enter Yes when contract price is below $0.15 and our fundamental model estimates true probability above 0.25."

They filter the 400 million historical trades to find every instance where that exact pattern appeared. Not similar. Exact.

This gives them thousands of historical analogs. Each one has a known outcome because the dataset includes resolutions.

Phase 2: Return Distribution Construction

For each historical analog, they calculate what the realized return was. Win or loss. Magnitude. Timing.

This creates an empirical distribution of returns. Not a theoretical normal distribution. The actual, realized distribution of what happened when this pattern appeared in real markets.
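Each analog's realized return follows directly from the entry price and the resolution flag. A sketch for binary contracts (prices in cents; assumes a $1, i.e. 100-cent, payout):

```python
def realized_return(price_cents, resolved_yes):
    """Return on one YES contract bought at `price_cents`: the contract
    pays 100 cents if the event resolves YES, zero otherwise."""
    payout = 100.0 if resolved_yes else 0.0
    return (payout - price_cents) / price_cents

# Analogs bought below $0.15, with resolutions known from the dataset:
win = realized_return(14, True)     # a large positive return
loss = realized_return(14, False)   # -100%
```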

Critical insight: this distribution is almost never normal. It has fat tails. It has skewness. It has kurtosis that would make a statistics professor wince.

Traditional models assume away these features. Empirical methods measure them directly.

Phase 3: Monte Carlo Resampling

Here's where it gets interesting mathematically.

The historical sequence of returns is just one possible path. If the same trades had occurred in a different order, the equity curve would look completely different.

Returns of [+8%, -4%, +6%, -3%, +7%] average to the same number as [-4%, -3%, +6%, +7%, +8%], but the drawdown profiles are dramatically different. The first sequence never drops below 0%. The second sequence hits -7% drawdown immediately.

This is path dependency. And it matters enormously for risk management.

Monte Carlo resampling generates 10,000 alternative paths by randomly reordering the same historical returns. Each path has identical statistical properties but different realized risk profiles.

Phase 4: Drawdown Distribution Analysis

For each of the 10,000 simulated paths, calculate the maximum drawdown. The worst peak to trough decline.

Now you have a distribution of possible max drawdowns, not a single number. You can see the 50th percentile (median case), the 95th percentile (bad luck), the 99th percentile (disaster scenario).
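Phases 3 and 4 together fit in a few lines of standard-library Python. This is an illustrative re-implementation, not any fund's actual code: it reshuffles the same return series thousands of times and reads drawdown percentiles off the results.

```python
import random

def max_drawdown(returns):
    """Worst peak-to-trough decline of the compounded equity curve."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst

def drawdown_percentiles(returns, n_paths=10_000, seed=0):
    """Monte Carlo resampling: same returns, thousands of random orderings."""
    rng = random.Random(seed)
    path = list(returns)
    draws = []
    for _ in range(n_paths):
        rng.shuffle(path)                  # reorder, keep the same returns
        draws.append(max_drawdown(path))
    draws.sort()
    pct = lambda q: draws[int(q * (len(draws) - 1))]
    return pct(0.50), pct(0.95), pct(0.99)

# The article's five-return example:
med, p95, p99 = drawdown_percentiles([0.08, -0.04, 0.06, -0.03, 0.07])
```

Instead of one backtest drawdown, this returns the median, bad-luck, and disaster percentiles across orderings of the very same returns.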

This is where institutional risk management diverges from retail.

You: "My backtest shows 12% max drawdown, I can handle that."

Institutional: "The median path shows 12% drawdown, but the 95th percentile shows 31% drawdown. We need to size for the 95th percentile, not the median."

Phase 5: Uncertainty Adjusted Position Sizing

The final step is calculating position size that keeps the 95th percentile drawdown under institutional risk limits.

The math becomes:

f_empirical = f_kelly × (1 - CV_edge)

Where CV_edge is the coefficient of variation (standard deviation / mean) of edge estimates across the Monte Carlo simulations.

High uncertainty → large CV → aggressive haircut to position size. Low uncertainty → small CV → sizing closer to theoretical Kelly.
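The haircut itself is one line on top of the resampled edge estimates. A sketch (the floor at zero guards against CV above 1, which would imply no position at all):

```python
from statistics import mean, stdev

def empirical_kelly(f_kelly, edge_samples):
    """f_empirical = f_kelly * (1 - CV), where CV = stdev / mean of the
    edge estimates produced by the Monte Carlo resamples."""
    cv = stdev(edge_samples) / mean(edge_samples)
    return max(0.0, f_kelly * (1.0 - cv))

# Resampled edge estimates spread from 3% to 9% around a 6% point estimate:
size = empirical_kelly(0.20, [0.03, 0.06, 0.09])   # CV = 0.5 -> 10% position
```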

Consider an illustrative application of this methodology:

A quantitative strategy: Long contracts under $0.20 where model estimates true probability > 0.30

Using historical pattern matching on the Becker dataset and Monte Carlo resampling:

Standard Kelly calculation might suggest: 20%+ position sizing
After volatility adjustment: ~15-20% sizing
After Monte Carlo uncertainty adjustment (typical CV: 0.3-0.5): 10-15% sizing
Conservative deployment allowing for model risk: 8-12% sizing

The difference between ignoring uncertainty (20%+ sizing) and incorporating it (10% sizing) is the difference between probable ruin and steady compounding over time.

Why This Matters:

Every retail trader using Kelly is using the textbook version. They're overbetting systematically because they're not accounting for uncertainty in their edge estimates.

Institutions using empirical Kelly with Monte Carlo are sizing for the distribution of possible outcomes, not the point estimate.

Over time, this creates massive divergence. The retail trader experiences a 40% drawdown that wipes out years of gains. The institutional trader never exceeds 20% drawdown and compounds smoothly.

Same strategy. Different position sizing methodology. Completely different outcomes.


Method 2: Calibration Surface Analysis Across Price and Time Dimensions

Standard calibration analysis plots implied probability versus realized frequency.

At price $0.30 (30% implied), how often did that outcome actually occur? If it occurred 30% of the time, the market was calibrated. If 25%, it was overpriced. If 35%, it was underpriced.

This is one dimensional analysis. Price only.

Institutions build calibration surfaces adding the time dimension: how does calibration change as resolution approaches?

The Framework:

Define C(p, t) as the calibration function where:

  • p represents contract price (0 to 100)

  • t represents time remaining until resolution (measured in days)

  • C(p, t) returns the empirical probability that outcome occurs

In perfectly calibrated markets, C(p, t) = p for all p and t.

In reality, C(p, t) varies systematically with both price and time.
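Building the empirical calibration from resolved trades is a group-by. A minimal sketch using synthetic `(price_cents, resolved_yes)` pairs as a stand-in for the dataset's real per-trade records (the field names here are illustrative):

```python
from collections import defaultdict

def calibration(trades):
    """Empirical C(p): fraction of contracts at each price level that
    resolved YES. `trades` is an iterable of (price_cents, resolved_yes)."""
    wins, counts = defaultdict(int), defaultdict(int)
    for price, resolved_yes in trades:
        counts[price] += 1
        wins[price] += int(resolved_yes)
    return {p: wins[p] / counts[p] for p in counts}

def mispricing(calib):
    """M(p) = C(p) - p/100, in probability units."""
    return {p: c - p / 100.0 for p, c in calib.items()}

# 30-cent contracts that resolved YES only 25% of the time:
trades = [(30, True)] * 25 + [(30, False)] * 75
m = mispricing(calibration(trades))   # about -0.05 at the 30-cent level
```

Extending this to the full surface C(p, t) just means keying the same group-by on (price, days-to-resolution) buckets.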

What Jon Becker's Research Actually Shows:

Analysis of 72.1 million Kalshi trades reveals the longshot bias is real and measurable.

At extreme low probabilities (1-cent contracts):

  • Takers win only 0.43% of the time

  • Implied probability: 1%

  • Mispricing: -57% (massively underperforming)

At mid probabilities (50-cent contracts):

  • Taker mispricing: -2.65%

  • Maker mispricing: +2.66%

  • Bias exists but compressed

The research confirms takers exhibit negative excess returns at 80 of 99 price levels, proving systematic mispricing across the probability spectrum.

The Institutional Hypothesis on Time Dimension:

While Becker's published research focused on price based calibration, institutions extend this framework temporally based on behavioral finance theory.

The hypothesis: longshot bias should vary with time to resolution because the psychological drivers change.

Early Period (far from resolution): Retail sentiment dominates. Limited information exists. People buy lottery tickets based on hope rather than probability. This should maximize longshot bias.

Mid Period: Information accumulates. Sophisticated participants enter. Prices should converge toward fundamentals as the informational environment improves.

Late Period (near resolution): Information revelation accelerates. Outcomes that were always unlikely become obviously unlikely. The hypothesis suggests potential bias reversal as hope capitulates, but this requires empirical validation.

The Strategy Framework:

Time varying filter rules based on established behavioral patterns:

Far from resolution: The longshot bias documented by Becker is likely strongest here. Strategy: systematically fade low probability contracts where retail enthusiasm dominates.

Mid range: Peak efficiency period as information and liquidity improve. Strategy: reduce activity or focus on other edges.

Near resolution: Information asymmetry should collapse. Strategy: exploit any remaining mispricing but with awareness that efficiency typically improves.

Mathematical Formalization:

The mispricing function:

M(p, t) = C(p, t) - p/100

Where M represents systematic mispricing in percentage points.

Institutional entry rules would be:

Enter long when M(p, t) > threshold (the outcome occurs more often than the price implies: underpriced)
Enter short when M(p, t) < -threshold (it occurs less often: overpriced)
Stay flat when |M(p, t)| < threshold (fair)

The threshold is calibrated to transaction costs and required risk adjusted returns.
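The entry rule is then a threshold comparison. A sketch; the direction follows from the definition of M: positive M means the outcome occurs more often than the price implies (the contract is cheap), negative M means it occurs less often (the longshot fade).

```python
def entry_signal(m: float, threshold: float) -> str:
    """Map mispricing M(p, t) = C(p, t) - p/100 to a position."""
    if m > threshold:
        return "long"    # outcome occurs more often than the price implies
    if m < -threshold:
        return "short"   # outcome occurs less often: the longshot fade
    return "flat"

# 1-cent longshots: empirical 0.43% vs 1% implied is a systematic fade.
signal = entry_signal(0.0043 - 0.01, 0.003)   # "short"
```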

Why This Matters:

Becker's research proves the longshot bias exists at the price dimension. The documented -57% mispricing at 1-cent contracts is enormous and systematic.

Institutions hypothesize this bias varies with time, though empirical validation requires analysis of the full temporal dataset.

The framework is sound: behavioral biases driven by hope, fear, and information asymmetry should logically vary as resolution approaches.

Whether the specific pattern is early bias → mid efficiency → late reversal, or some other temporal structure, requires running the analysis on the 400 million trade dataset.

What we know for certain from verified research:

  • Longshot bias exists and is measurable (-57% at 1-cent)

  • It varies across the probability spectrum

  • Takers systematically lose at 80 of 99 price levels

  • The bias creates structural opportunity

What institutions test empirically:

  • How this bias changes with time to resolution

  • Whether patterns are stable across market categories

  • Optimal threshold levels for entry and exit

  • Transaction cost adjusted profitability

The calibration surface methodology provides the framework. The Becker dataset provides the laboratory. The empirical analysis determines which specific patterns exist and are tradeable.



Method 3: Order Flow Decomposition and Maker Versus Taker Profitability

This is the most subtle edge and the one retail traders never consider.

Every trade has two participants: a maker who provided liquidity and a taker who consumed it.

The maker posted a limit order. They waited.

The taker crossed the spread. They paid for immediacy.

Jon Becker's dataset tags every trade with the taker side. This means you can separate the population into makers and takers and analyze their profitability independently.
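Splitting profitability by side is the same kind of group-by, keyed on the taker flag. A synthetic sketch, reading the -57% figure as relative underperformance, (win rate - implied) / implied; the real dataset's schema may differ:

```python
from collections import defaultdict

def taker_mispricing(trades):
    """Relative taker mispricing per price level, computed as
    (win rate - implied) / implied. `trades` holds (price_cents, taker_won)."""
    wins, counts = defaultdict(int), defaultdict(int)
    for price, taker_won in trades:
        counts[price] += 1
        wins[price] += int(taker_won)
    return {p: (100.0 * wins[p] / counts[p] - p) / p for p in counts}

# Takers on 1-cent contracts winning 0.43% of the time against 1% implied:
trades = [(1, True)] * 43 + [(1, False)] * 9957
result = taker_mispricing(trades)   # about -0.57, i.e. -57%
```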

What Becker's Research Actually Reveals:

Analysis of 72.1 million Kalshi trades with resolution outcomes shows a stark asymmetry.

At 1-cent contracts (extreme longshots):

  • Takers win only 0.43% of the time

  • Implied probability: 1%

  • Taker mispricing: -57%

  • Makers win 1.57% of the time

  • Maker mispricing: +57%

At 50-cent contracts:
  • Taker mispricing: -2.65%

  • Maker mispricing: +2.66%

Aggregate findings:

  • Takers exhibit negative excess returns at 80 of 99 price levels

  • Makers buying YES: +0.77% excess return

  • Makers buying NO: +1.25% excess return

  • Statistical symmetry (Cohen's d ≈ 0.02) indicates makers don't predict better, they just structure better

This is not a small difference.

Takers, as a population, are systematically wrong. Not 50/50 coinflip wrong. Persistently, measurably wrong at 80% of all price levels.

Why Takers Lose:

Becker's research identifies the core insight: makers profit via structural arbitrage, not superior forecasting ability.

The near-identical excess returns for makers buying YES (+0.77%) versus NO (+1.25%) prove they're not picking winners. They're exploiting a costly preference in the taker population.

Taker behavior reveals urgency. You cross the spread because you value execution certainty over price. This urgency correlates with behavioral bias.

First: information asymmetry misperception. Takers believe they're acting on valuable information. Most aren't. They're reacting emotionally to public information, not trading on private information.

Second: Affirmative bias. Becker's research shows takers exhibit "a costly preference for affirmative, longshot outcomes." They disproportionately buy YES on longshots, systematically overpaying.

Makers, conversely, demonstrate patience. By definition, they wait. This patience filters out emotional urgency.

Additionally, makers optimize for spread capture, not outcome prediction. Over time, spread collection plus the structural edge versus biased taker flow generates consistent positive expectation.

The Math:

Expected maker profit per filled order:

E[Profit_maker] = spread_capture + edge_vs_takers

Where:

  • spread_capture represents the bid ask collection

  • edge_vs_takers represents the empirical win rate advantage

From Becker's verified data:

  • Maker edge vs takers: +0.77% to +1.25% depending on position direction

  • This edge exists across 80 of 99 price levels

  • The edge is structural, not informational (proven by symmetric YES/NO performance)

If you provide liquidity consistently across many markets, you earn this edge repeatedly without needing superior forecasting ability.
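In numbers, the decomposition and its compounding look like this (a sketch; the 0.77% edge is the article's figure, while the 1% spread capture and the fill count are hypothetical):

```python
def maker_expectation(spread_capture, edge_vs_takers):
    """E[Profit_maker] per filled order, per the decomposition above."""
    return spread_capture + edge_vs_takers

def compounded(edge_per_fill, n_fills):
    """Small structural edges, earned repeatedly, compound geometrically."""
    return (1.0 + edge_per_fill) ** n_fills - 1.0

per_fill = maker_expectation(0.01, 0.0077)   # hypothetical 1% spread + 0.77% edge
growth = compounded(per_fill, 100)           # over 100 fills
```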

But there's risk.

Inventory Risk: As a maker, you accumulate positions. You're long some markets, short others. If correlations shift or markets move against you, drawdowns occur before mean reversion.

Adverse Selection Risk: Not all takers are uninformed. As Becker notes, "sophisticated traders cross the spread to act on time sensitive information." Large orders may signal informed flow. You risk getting picked off.

Volume Evolution: Becker's research shows market maturity matters. In early low volume periods, even makers lost to relatively informed takers. The volume surge attracted professional liquidity providers who could then extract value at all price points.

The Institutional Market Making Framework:

Quote two sided markets with positive expected spread capture.
Allow small fills that represent retail taker flow (these carry the documented bias).
Flag large fills for review (potential sophisticated participants).
Monitor aggregate inventory exposure and hedge when thresholds are exceeded.
Target consistent returns from structural edge, not outcome prediction.
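The framework can be sketched as a tiny state machine. All thresholds here are hypothetical placeholders; a real desk would add per-market limits, hedging logic, and fee modeling:

```python
class MakerBook:
    """Minimal two-sided quoting sketch: track inventory, flag large
    fills for review, and signal when exposure should be hedged."""

    def __init__(self, fair, spread, large_fill=500, inventory_limit=1000):
        self.fair = fair
        self.spread = spread
        self.large_fill = large_fill          # size suggesting informed flow
        self.inventory_limit = inventory_limit
        self.inventory = 0
        self.flagged = []

    def quotes(self):
        half = self.spread / 2.0
        return self.fair - half, self.fair + half   # (bid, ask)

    def on_fill(self, taker_buys, size):
        """Record a fill; return True when inventory should be hedged."""
        self.inventory += -size if taker_buys else size
        if size >= self.large_fill:
            self.flagged.append((taker_buys, size))  # possible pick-off
        return abs(self.inventory) >= self.inventory_limit

book = MakerBook(fair=0.30, spread=0.02)
bid, ask = book.quotes()   # quotes straddle fair value: 0.29 / 0.31
```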

Why This Matters:

Retail traders are almost always takers. They see a market, they click buy, they cross the spread.

By doing this, they're entering with a population that Becker's research proves has negative excess returns at 80% of all price levels.

At extreme longshots (1-cent contracts), takers underperform by 57%. Even at mid probabilities (50-cent), they underperform by 2.65%.

Institutions providing liquidity collect the other side of that edge.

Same markets. Different approach. Structural advantage proven by 72.1 million trades.

The Research Conclusion:

Becker's analysis demonstrates that wealth systematically transfers from liquidity takers to liquidity makers, driven by behavioral biases and market microstructure, not superior forecasting ability by makers.

The documented taker preference for affirmative, longshot outcomes creates the structural opportunity. The maker patience and spread capture methodology harvests it.

This is not theory. This is measured, verified reality from the largest prediction market dataset ever analyzed.


方法 3:订单流分解与 Maker vs. Taker 盈利能力

这是最微妙的优势,也是散户几乎从不考虑的一点。

每一笔交易都有两个参与者:提供流动性的 maker,以及消耗流动性的 taker。

maker 挂出限价单,等待成交。

taker 跨过价差,买的是“立刻成交”。

Jon Becker 的数据集为每笔交易标注了 taker 方向。这意味着你可以把人群拆成 makers 与 takers,并分别分析他们的盈利能力。

Becker 的研究实际揭示了什么:

对 7210 万笔 Kalshi 交易及其结算结果的分析显示出鲜明的不对称。

在 1 美分合约(极端冷门)处:

  • Taker 仅有 0.43% 的时间会赢

  • 隐含概率:1%

  • Taker 错定价:-57%

  • Makers 有 1.57% 的时间会赢

  • Maker 错定价:+57%

在 50 美分合约处:

  • Taker 错定价:-2.65%

  • Maker 错定价:+2.66%

整体发现:

  • 在 99 个价格档位中,takers 在其中 80 个档位呈现负的超额收益

  • Makers 买 YES:+0.77% 超额收益

  • Makers 买 NO:+1.25% 超额收益

  • 统计对称性(Cohen's d ≈ 0.02)表明 makers 并不是预测更准,而是结构更优

这不是小差异。

takers 作为一个群体,是系统性错误的。不是 50/50 硬币式的“随机错”,而是在 99 个价格档位中的 80% 持续、可测量地错。

为什么 Takers 会输:

Becker 的研究指出了核心洞察:makers 的盈利来自结构性套利,而不是更强的预测能力。

makers 买 YES(+0.77%)与买 NO(+1.25%)的超额收益几乎对称,证明他们不是在挑赢家,而是在利用 taker 人群中“昂贵的偏好”。

taker 的行为体现出紧迫性。你跨过价差,是因为你更看重成交确定性而非价格。这种紧迫性与行为偏差相关。

第一:对信息不对称的误判。takers 以为自己在行动于有价值的信息。多数并非如此。他们是在对公开信息做情绪化反应,而不是基于私有信息。

第二:肯定偏误(affirmative bias)。Becker 的研究表明,takers 表现出“一种对肯定式、冷门结果的昂贵偏好”。他们在冷门上更倾向于买 YES,从而系统性地多付钱。

相反,makers 体现的是耐心:按定义他们在等待成交。耐心会过滤掉情绪化的紧迫冲动。

此外,makers 优化的是价差捕获(spread capture),而不是结果预测。时间一长,价差收益叠加上对抗带偏见 taker 流量的结构性优势,会产生稳定的正期望。

数学表达:

每笔被动成交订单的 maker 期望利润:

E[Profit_maker] = spread_capture + edge_vs_takers

其中:

  • spread_capture 表示买卖价差的收取

  • edge_vs_takers 表示经验上相对 takers 的胜率优势

从 Becker 已验证的数据看:

  • Maker 相对 takers 的优势:+0.77% 到 +1.25%(取决于方向)

  • 这一优势覆盖 99 个价格档位中的 80 个

  • 该优势是结构性的,而非信息性的(由 YES/NO 表现对称所证明)

如果你持续在大量市场中提供流动性,就能反复赚取这份优势,而不需要更强的预测能力。

但这也有风险。

库存风险(Inventory Risk):作为 maker,你会累积头寸。你在一些市场上做多,在另一些市场上做空。若相关性变化或市场朝不利方向移动,均值回归之前就会出现回撤。

逆向选择风险(Adverse Selection Risk):并非所有 takers 都是无知的。正如 Becker 所说,“成熟的交易者会为抓住时间敏感的信息而跨价差成交。”大单可能代表有信息的订单流,你可能被“点杀”。

成交量演化(Volume Evolution):Becker 的研究显示市场成熟度很关键。在早期低成交量阶段,即便是 makers 也会输给相对更“有信息”的 takers。后来成交量激增,吸引了职业流动性提供者,才使他们能在各个价位点上提取价值。

机构做市框架:

报出双边报价,并确保价差捕获的期望为正。
允许小额成交,代表散户 taker 流(这里有已记录的偏差)。
对大额成交打标复核(可能是更成熟的参与者)。
监控整体库存敞口,超过阈值就对冲。
目标是从结构性优势中获取稳定回报,而非押注结果预测。

为什么这很重要:

散户几乎总是 taker。他们看到一个市场,点击买入,跨价差成交。

这么做意味着他们加入了一个群体:Becker 的研究证明,这个群体在 99 个价格档位中的 80% 具有负的超额收益。

在极端冷门(1 美分合约)处,takers 跑输 57%。即便在中等概率(50 美分)处,也跑输 2.65%。

而提供流动性的机构则收割了这份优势的另一侧。

同一个市场,不同的方法。由 7210 万笔交易数据证明的结构性优势。

研究结论:

Becker 的分析表明,财富会从流动性消耗者系统性转移到流动性提供者,其驱动力来自行为偏差与市场微观结构,而不是 makers 更强的预测能力。

taker 对肯定式、冷门结果的偏好创造了结构性机会;maker 的耐心与价差捕获方法把它收割出来。

这不是理论。这是对迄今最大预测市场数据集的测量与验证所得的现实。


The Institutional Edge Is Not Information

Here's what retail gets wrong about hedge funds.

They assume hedge funds win because they have better information. Better research. Better models. Better predictions.

That's not where the edge is.

The edge is in:

  1. Risk management: Sizing positions for the distribution of outcomes, not the point estimate. Monte Carlo uncertainty adjustment prevents ruin.

  2. Time-varying strategies: Exploiting calibration patterns that change as resolution approaches. Selling early bias. Buying late reversal.

  3. Structural positioning: Being the maker instead of the taker. Collecting spread and adverse selection edge from impatient counterparties.

None of these require better prediction. They require better process. The Becker dataset gives you the laboratory to build that process.

400 million trades. Every outcome known. Every pattern measurable.

Retail will use this data to backtest their predictions.

Institutions will use this data to calibrate their risk management, identify time-varying biases and measure structural edges.

**I have an idea for the most insane prediction market experiment ever built on this dataset. Reply YES if you want me to do it.**

**7K people already follow this journey. Join them so you don't miss what's next.**

Link: http://x.com/i/article/2022988148943601665

机构优势不在信息

散户对对冲基金最大的误解在这里。

他们以为对冲基金赢,是因为信息更好、研究更强、模型更厉害、预测更准。

优势不在那儿。

优势在于:

  1. 风险管理:按结果分布而非点估计来定仓。蒙特卡洛不确定性调整避免走向毁灭。

  2. 随时间变化的策略:利用随结算临近而变化的校准模式。卖出早期偏差,买入晚期反转。

  3. 结构性站位:做 maker 而不是 taker。从不耐心对手方那里收取价差与逆向选择优势。

这些都不需要更好的预测,而需要更好的流程。
Becker 数据集给你一个实验室,去打造这套流程。

4 亿笔交易。每个结果都已知。每个模式都可测量。

散户会用这些数据去回测他们的预测。

机构会用这些数据去校准风险管理、识别随时间变化的偏差、衡量结构性优势。

我有一个点子:要在这份数据集上做出史上最疯狂的预测市场实验——如果你想让我做,回复 YES。
7K 人已经在关注这段旅程,加入我们,别错过接下来的内容。

链接: http://x.com/i/article/2022988148943601665

相关笔记

I'm going to break down exactly how hedge funds use prediction market data to build trading strategies and extract alpha that retail misses. I'll also share dataset of 400m+ trades going back to 2020.

Let's get straight to it.

Bookmark This - I’m Roan, a backend developer working on system design, HFT-style execution, and quantitative trading systems. My work focuses on how prediction markets actually behave under load. For any suggestions, thoughtful collaborations, partnerships DMs are open.

The Dataset That Just Went Public

@beckerrjon released the largest publicly available prediction market dataset: 400 million+ trades from Polymarket and Kalshi going back to 2020. Complete market metadata, granular trade data, resolution outcomes, stored as Parquet files.

This is tick level data. Every trade has timestamp, price, volume, taker direction. The same granularity institutional data vendors charge $100K+ annually for in traditional markets.

**Now it's open source. This is massive.**

Before I break down what hedge funds are doing with this data, let me show you how to actually get it set up. Because unlike most articles that just talk theory, I'm giving you the exact steps to access institutional grade data yourself.

**How to Set Up the Dataset (Step by Step)**

**Prerequisites:**

  • Python 3.9 or higher installed

  • 40GB free disk space

  • Command line access (Terminal on Mac/Linux, PowerShell on Windows)

Step 1: Install uv (Dependency Manager)

# On Mac/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows (PowerShell)
irm https://astral.sh/uv/install.ps1 | iex

Step 2: Clone the Repository

git clone https://github.com/Jon-Becker/prediction-market-analysis
cd prediction-market-analysis

Step 3: Install Dependencies

uv sync

This installs DuckDB, Pandas, Matplotlib and other analysis tools.

Step 4: Download the Dataset

make setup

This downloads data.tar.zst (36GB compressed) from Cloudflare R2 and extracts it to the data/ directory. Extraction takes 5 to 30 minutes depending on your system (for me it took longer than expected).

Step 5: Verify the Data

ls data/polymarket/trades/
ls data/kalshi/trades/

You should see hundreds of Parquet files containing trade data.

Congrats. You now have the same dataset hedge funds are analyzing.

The data is organized like this:

data/
├── polymarket/
│   ├── markets/               # Market metadata (titles, outcomes, status)
│   └── trades/                # Every trade (price, volume, timestamp)
└── kalshi/
    ├── markets/               # Same structure for Kalshi
    └── trades/

Each trade file is a Parquet file. What is a Parquet file? Parquet is a columnar storage format that lets you query billions of rows without loading everything into memory.

Now that you have it set up, let me show you what institutions are actually doing with this data.

How Hedge Funds Actually Use This Data

You think prediction markets are for betting on outcomes.

You're wrong.

Hedge funds use prediction market data as a laboratory for three things: empirical risk calibration, systematic bias detection and order flow analysis. The prediction market isn't where they deploy capital. It's where they extract patterns that inform billions in traditional market positions.

Here's exactly what they're doing with 400 million trades.

Method 1: Empirical Kelly Criterion with Monte Carlo Uncertainty Quantification

The Kelly Criterion is the foundation of quantitative position sizing. Every institutional trader knows the formula:

f* = (p × b - q) / b

Where f* is the optimal fraction of capital to deploy, p is win probability, q is loss probability, and b represents the odds.
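As a quick arithmetic sketch of the formula (the numbers are illustrative, not from the dataset):

```python
def kelly_fraction(p: float, b: float) -> float:
    """Textbook Kelly: f* = (p*b - q) / b, with q = 1 - p."""
    q = 1.0 - p
    return (p * b - q) / b

# A 55% win probability at even odds (b = 1) implies risking 10% of capital.
print(round(kelly_fraction(0.55, 1.0), 4))  # → 0.1
```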

The problem with textbook Kelly: it assumes you know your edge with certainty.

Reality breaks this assumption immediately.

When your model estimates 6% edge on a trade, that's not ground truth. It's a point estimate with uncertainty. The true edge might be 3%. It might be 9%. You have a distribution, not a number.

Standard Kelly treats that 6% as fact. This is mathematically incorrect and leads to systematic overbetting.

Empirical Kelly solves this by incorporating uncertainty directly into the sizing calculation.

Here's how they implement it using the Becker dataset:

Phase 1: Historical Trade Extraction

Funds define their strategy criteria in precise terms. For example: "Enter Yes when contract price is below $0.15 and our fundamental model estimates true probability above 0.25."

They filter the 400 million historical trades to find every instance where that exact pattern appeared. Not similar. Exact.

This gives them thousands of historical analogs. Each one has a known outcome because the dataset includes resolutions.

Phase 2: Return Distribution Construction

For each historical analog, they calculate what the realized return was. Win or loss. Magnitude. Timing.

This creates an empirical distribution of returns. Not a theoretical normal distribution. The actual, realized distribution of what happened when this pattern appeared in real markets.

Critical insight: this distribution is almost never normal. It has fat tails. It has skewness. It has kurtosis that would make a statistics professor wince.

Traditional models assume away these features. Empirical methods measure them directly.
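Phases 1 and 2 can be sketched in a few lines. The record fields below (price, model_prob, outcome) are invented stand-ins, not the dataset's actual schema:

```python
def matches_pattern(trade: dict) -> bool:
    # "Enter YES when price < $0.15 and our model estimates probability > 0.25"
    return trade["price"] < 0.15 and trade["model_prob"] > 0.25

def realized_return(trade: dict) -> float:
    # A YES contract pays 1 if the market resolved YES, else 0.
    payoff = 1.0 if trade["outcome"] == "YES" else 0.0
    return (payoff - trade["price"]) / trade["price"]

# Toy stand-in for the 400M-trade history:
history = [
    {"price": 0.10, "model_prob": 0.30, "outcome": "NO"},
    {"price": 0.12, "model_prob": 0.35, "outcome": "YES"},
    {"price": 0.50, "model_prob": 0.55, "outcome": "YES"},  # filtered out
    {"price": 0.08, "model_prob": 0.28, "outcome": "NO"},
]

analogs = [t for t in history if matches_pattern(t)]
returns = [realized_return(t) for t in analogs]
print(len(analogs), returns)
```

On real data the same filter runs over the Parquet files and yields thousands of resolved analogs, whose realized returns form the empirical distribution.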

Phase 3: Monte Carlo Resampling

Here's where it gets interesting mathematically.

The historical sequence of returns is just one possible path. If the same trades had occurred in a different order, the equity curve would look completely different.

Returns of [+8%, -4%, +6%, -3%, +7%] average to the same number as [-4%, -3%, +6%, +7%, +8%], but the drawdown profiles are dramatically different. The first sequence never drops below 0%. The second sequence hits -7% drawdown immediately.

This is path dependency. And it matters enormously for risk management.

Monte Carlo resampling generates 10,000 alternative paths by randomly reordering the same historical returns. Each path has identical statistical properties but different realized risk profiles.

Phase 4: Drawdown Distribution Analysis

For each of the 10,000 simulated paths, calculate the maximum drawdown. The worst peak to trough decline.

Now you have a distribution of possible max drawdowns, not a single number. You can see the 50th percentile (median case), the 95th percentile (bad luck), the 99th percentile (disaster scenario).

This is where institutional risk management diverges from retail.

You: "My backtest shows 12% max drawdown, I can handle that."

Institutional: "The median path shows 12% drawdown, but the 95th percentile shows 31% drawdown. We need to size for the 95th percentile, not the median."

Phase 5: Uncertainty Adjusted Position Sizing

The final step is calculating position size that keeps the 95th percentile drawdown under institutional risk limits.

The math becomes:

f_empirical = f_kelly × (1 - CV_edge)

Where CV_edge is the coefficient of variation (standard deviation / mean) of edge estimates across the Monte Carlo simulations.

High uncertainty → large CV → aggressive haircut to position size. Low uncertainty → small CV → sizing closer to theoretical Kelly.
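A minimal stdlib sketch of Phases 3 through 5, using a toy return series in place of real historical analogs (the Kelly figure and series are illustrative assumptions):

```python
import random
import statistics

def max_drawdown(returns):
    """Worst peak-to-trough decline of the compounded equity curve."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

rng = random.Random(42)

# Toy empirical return distribution (the Phase 2 output); in practice
# these come from resolved historical analogs in the dataset.
hist = [0.08, -0.04, 0.06, -0.03, 0.07, -0.05, 0.09, -0.02] * 8

# Phases 3-4: 10,000 reorderings of the same returns map out path
# dependency as a distribution of max drawdowns.
dds = []
path = list(hist)
for _ in range(10_000):
    rng.shuffle(path)
    dds.append(max_drawdown(path))
dds.sort()
p50, p95 = dds[5_000], dds[9_500]

# Phase 5: haircut Kelly by the uncertainty (CV) of the edge estimate,
# here measured by bootstrapping the mean return.
edges = [statistics.mean(rng.choices(hist, k=len(hist))) for _ in range(2_000)]
cv_edge = statistics.stdev(edges) / statistics.mean(edges)

f_kelly = 0.20  # illustrative textbook sizing
f_empirical = f_kelly * max(0.0, 1.0 - cv_edge)
print(f"median DD {p50:.1%}, 95th pct DD {p95:.1%}, size {f_empirical:.1%}")
```

Note how the same returns produce a whole distribution of drawdowns, and how a noisier edge estimate (larger CV) mechanically shrinks the deployed fraction.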

Consider an illustrative application of this methodology:

A quantitative strategy: Long contracts under $0.20 where model estimates true probability > 0.30

Using historical pattern matching on the Becker dataset and Monte Carlo resampling:

  • Standard Kelly calculation might suggest: 20%+ position sizing

  • After volatility adjustment: ~15-20% sizing

  • After Monte Carlo uncertainty adjustment (typical CV: 0.3-0.5): 10-15% sizing

  • Conservative deployment allowing for model risk: 8-12% sizing

The difference between ignoring uncertainty (20%+ sizing) and incorporating it (10% sizing) is the difference between probable ruin and steady compounding over time.

Why This Matters:

Every retail trader using Kelly is using the textbook version. They're overbetting systematically because they're not accounting for uncertainty in their edge estimates.

Institutions using empirical Kelly with Monte Carlo are sizing for the distribution of possible outcomes, not the point estimate.

Over time, this creates massive divergence. The retail trader experiences a 40% drawdown that wipes out years of gains. The institutional trader never exceeds 20% drawdown and compounds smoothly.

Same strategy. Different position sizing methodology. Completely different outcomes.

Method 2: Calibration Surface Analysis Across Price and Time Dimensions

Standard calibration analysis plots implied probability versus realized frequency.

At price $0.30 (30% implied), how often did that outcome actually occur? If it occurred 30% of the time, the market was calibrated. If 25%, it was overpriced. If 35%, it was underpriced.

This is one dimensional analysis. Price only.

Institutions build calibration surfaces adding the time dimension: how does calibration change as resolution approaches?

The Framework:

Define C(p, t) as the calibration function where:

  • p represents contract price (0 to 100)

  • t represents time remaining until resolution (measured in days)

  • C(p, t) returns the empirical probability that outcome occurs

In perfectly calibrated markets, C(p, t) = p for all p and t.

In reality, C(p, t) varies systematically with both price and time.

What Jon Becker's Research Actually Shows:

Analysis of 72.1 million Kalshi trades reveals the longshot bias is real and measurable.

At extreme low probabilities (1-cent contracts):

  • Takers win only 0.43% of the time

  • Implied probability: 1%

  • Mispricing: -57% (massively underperforming)

At mid probabilities (50-cent contracts):

  • Taker mispricing: -2.65%

  • Maker mispricing: +2.66%

  • Bias exists but compressed

The research confirms takers exhibit negative excess returns at 80 of 99 price levels, proving systematic mispricing across the probability spectrum.

The Institutional Hypothesis on Time Dimension:

While Becker's published research focused on price based calibration, institutions extend this framework temporally based on behavioral finance theory.

The hypothesis: longshot bias should vary with time to resolution because the psychological drivers change.

Early Period (far from resolution): Retail sentiment dominates. Limited information exists. People buy lottery tickets based on hope rather than probability. This should maximize longshot bias.

Mid Period: Information accumulates. Sophisticated participants enter. Prices should converge toward fundamentals as the informational environment improves.

Late Period (near resolution): Information revelation accelerates. Outcomes that were always unlikely become obviously unlikely. The hypothesis suggests potential bias reversal as hope capitulates, but this requires empirical validation.

The Strategy Framework:

Time varying filter rules based on established behavioral patterns:

Far from resolution: The longshot bias documented by Becker is likely strongest here. Strategy: systematically fade low probability contracts where retail enthusiasm dominates.

Mid range: Peak efficiency period as information and liquidity improve. Strategy: reduce activity or focus on other edges.

Near resolution: Information asymmetry should collapse. Strategy: exploit any remaining mispricing but with awareness that efficiency typically improves.

Mathematical Formalization:

The mispricing function:

M(p, t) = C(p, t) - p/100

Where M represents systematic mispricing in percentage points.

Institutional entry rules would be:

Enter long when M(p, t) > threshold (underpriced).
Enter short when M(p, t) < -threshold (overpriced).
Stay flat when |M(p, t)| < threshold (fair).

The threshold is calibrated to transaction costs and required risk adjusted returns.
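A compact sketch of estimating the surface from resolved trades and applying the entry rule. The tuple layout, bucket widths, and threshold are assumptions for illustration, not the dataset's schema or an institutional parameter:

```python
from collections import defaultdict

def calibration_surface(trades, price_step=10, time_step=30):
    """Empirical C(p, t): YES-resolution frequency per (price, days-left)
    bucket. Trades are (price_cents, days_to_resolution, resolved_yes)."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for price, days, resolved_yes in trades:
        key = (price // price_step * price_step, days // time_step * time_step)
        counts[key] += 1
        hits[key] += int(resolved_yes)
    return {k: hits[k] / counts[k] for k in counts}

def signal(c_pt: float, price_cents: int, threshold: float = 0.03) -> str:
    """M(p, t) = C(p, t) - p/100; trade only past a cost-aware threshold."""
    m = c_pt - price_cents / 100
    if m > threshold:
        return "long"
    if m < -threshold:
        return "short"
    return "flat"

# Toy data: 5-cent longshots far from resolution that resolve YES 1% of
# the time -> overpriced, so the rule fades them.
trades = [(5, 90, False)] * 99 + [(5, 90, True)]
surface = calibration_surface(trades)
print(surface[(0, 90)], signal(surface[(0, 90)], 5))
```

Swapping the toy list for the real resolved-trade history turns this into an empirical test of how the longshot bias varies with time to resolution.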

Why This Matters:

Becker's research proves the longshot bias exists at the price dimension. The documented -57% mispricing at 1-cent contracts is enormous and systematic.

Institutions hypothesize this bias varies with time, though empirical validation requires analysis of the full temporal dataset.

The framework is sound: behavioral biases driven by hope, fear, and information asymmetry should logically vary as resolution approaches.

Whether the specific pattern is early bias → mid efficiency → late reversal, or some other temporal structure, requires running the analysis on the 400 million trade dataset.

What we know for certain from verified research:

  • Longshot bias exists and is measurable (-57% at 1-cent)

  • It varies across the probability spectrum

  • Takers systematically lose at 80 of 99 price levels

  • The bias creates structural opportunity

What institutions test empirically:

  • How this bias changes with time to resolution

  • Whether patterns are stable across market categories

  • Optimal threshold levels for entry and exit

  • Transaction cost adjusted profitability

The calibration surface methodology provides the framework. The Becker dataset provides the laboratory. The empirical analysis determines which specific patterns exist and are tradeable.


Method 3: Order Flow Decomposition and Maker Versus Taker Profitability

This is the most subtle edge and the one retail traders never consider.

Every trade has two participants: a maker who provided liquidity and a taker who consumed it.

The maker posted a limit order. They waited.

The taker crossed the spread. They paid for immediacy.

Jon Becker's dataset tags every trade with the taker side. This means you can separate the population into makers and takers and analyze their profitability independently.
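A minimal stdlib sketch of that decomposition, with an invented record layout (price_cents, taker_bought_yes, resolved_yes) standing in for the real schema:

```python
from collections import defaultdict

# Toy flow: 1-cent longshots where taker YES-buyers almost never win,
# plus balanced 50-cent trades.
trades = (
    [(1, True, False)] * 499 + [(1, True, True)]
    + [(50, True, True), (50, True, False), (50, False, True), (50, False, False)]
)

def taker_excess_returns(trades):
    """Taker excess return per price level: realized win frequency on the
    taker's side minus the implied probability they paid. The maker, holding
    the opposite side of each contract, earns the mirror image."""
    agg = defaultdict(lambda: [0.0, 0])  # price -> [sum of excess, count]
    for price, taker_yes, resolved_yes in trades:
        implied = price / 100 if taker_yes else 1 - price / 100
        won = resolved_yes == taker_yes
        agg[price][0] += (1.0 if won else 0.0) - implied
        agg[price][1] += 1
    return {p: s / n for p, (s, n) in agg.items()}

taker_edge = taker_excess_returns(trades)
print({p: round(v, 3) for p, v in taker_edge.items()})
```

In this toy flow the 1-cent takers win 0.2% of the time against a 1% implied probability, so their excess return is negative and the makers opposite them collect it; the real analysis runs the same grouping over 72.1 million tagged trades.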

What Becker's Research Actually Reveals:

Analysis of 72.1 million Kalshi trades with resolution outcomes shows a stark asymmetry.

At 1-cent contracts (extreme longshots):

  • Takers win only 0.43% of the time

  • Implied probability: 1%

  • Taker mispricing: -57%

  • Makers win 1.57% of the time

  • Maker mispricing: +57%

**At 50-cent contracts:**

  • Taker mispricing: -2.65%

  • Maker mispricing: +2.66%

Aggregate findings:

  • Takers exhibit negative excess returns at 80 of 99 price levels

  • Makers buying YES: +0.77% excess return

  • Makers buying NO: +1.25% excess return

  • Statistical symmetry (Cohen's d ≈ 0.02) indicates makers don't predict better, they just structure better

This is not a small difference.

Takers, as a population, are systematically wrong. Not 50/50 coinflip wrong. Persistently, measurably wrong at 80% of all price levels.

Why Takers Lose:

Becker's research identifies the core insight: makers profit via structural arbitrage, not superior forecasting ability.

The near identical excess returns for makers buying YES (+0.77%) versus NO (+1.25%) proves they're not picking winners. They're exploiting a costly preference in the taker population.

Taker behavior reveals urgency. You cross the spread because you value execution certainty over price. This urgency correlates with behavioral bias.

First: Information asymmetry misperception. Takers believe they're acting on valuable information. Most aren't. They're reacting emotionally to public information, not trading on private information.

Second: Affirmative bias. Becker's research shows takers exhibit "a costly preference for affirmative, longshot outcomes." They disproportionately buy YES on longshots, systematically overpaying.

Makers, conversely, demonstrate patience. By definition, they wait. This patience filters out emotional urgency.

Additionally, makers optimize for spread capture, not outcome prediction. Over time, spread collection plus the structural edge versus biased taker flow generates consistent positive expectation.

The Math:

Expected maker profit per filled order:

E[Profit_maker] = spread_capture + edge_vs_takers

Where:

  • spread_capture represents the bid ask collection

  • edge_vs_takers represents the empirical win rate advantage

From Becker's verified data:

  • Maker edge vs takers: +0.77% to +1.25% depending on position direction

  • This edge exists across 80 of 99 price levels

  • The edge is structural, not informational (proven by symmetric YES/NO performance)

If you provide liquidity consistently across many markets, you earn this edge repeatedly without needing superior forecasting ability.

But there's risk.

Inventory Risk: As a maker, you accumulate positions. You're long some markets, short others. If correlations shift or markets move against you, drawdowns occur before mean reversion.

Adverse Selection Risk: Not all takers are uninformed. As Becker notes, "sophisticated traders cross the spread to act on time sensitive information." Large orders may signal informed flow. You risk getting picked off.

Volume Evolution: Becker's research shows market maturity matters. In early low volume periods, even makers lost to relatively informed takers. The volume surge attracted professional liquidity providers who could then extract value at all price points.

The Institutional Market Making Framework:

Quote two sided markets with positive expected spread capture. Allow small fills that represent retail taker flow (these have the documented bias). Flag large fills for review (potential sophisticated participants). Monitor aggregate inventory exposure and hedge when thresholds are exceeded. Target consistent returns from structural edge, not outcome prediction.
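The framework above can be caricatured as a small bookkeeping class. Every threshold here is an invented placeholder for illustration, not an institutional parameter:

```python
from dataclasses import dataclass, field

@dataclass
class MakerBook:
    half_spread: float = 0.02       # quote this far either side of fair value
    large_fill: float = 500.0       # flag fills above this notional for review
    inventory_cap: float = 2_000.0  # hedge once net exposure exceeds this
    inventory: float = 0.0
    flagged: list = field(default_factory=list)

    def quotes(self, fair: float):
        """Two-sided quotes with positive expected spread capture."""
        return round(fair - self.half_spread, 2), round(fair + self.half_spread, 2)

    def on_fill(self, side: str, notional: float):
        """Track inventory, flag potentially informed flow, signal hedging."""
        self.inventory += notional if side == "sell_to_us" else -notional
        if notional > self.large_fill:
            self.flagged.append((side, notional))  # possible informed flow
        return abs(self.inventory) > self.inventory_cap  # True -> hedge now

book = MakerBook()
print(book.quotes(0.30))
needs_hedge = book.on_fill("sell_to_us", 800.0)
print(needs_hedge, book.flagged)
```

The point of the sketch is the separation of concerns: the quoting logic never predicts outcomes; it only prices immediacy, polices large fills, and caps inventory.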

Why This Matters:

Retail traders are almost always takers. They see a market, they click buy, they cross the spread.

By doing this, they're entering with a population that Becker's research proves has negative excess returns at 80% of all price levels.

At extreme longshots (1-cent contracts), takers underperform by 57%. Even at mid probabilities (50-cent), they underperform by 2.65%.

Institutions providing liquidity collect the other side of that edge.

Same markets. Different approach. Structural advantage proven by 72.1 million trades.

The Research Conclusion:

Becker's analysis demonstrates that wealth systematically transfers from liquidity takers to liquidity makers, driven by behavioral biases and market microstructure, not superior forecasting ability by makers.

The documented taker preference for affirmative, longshot outcomes creates the structural opportunity. The maker patience and spread capture methodology harvests it.

This is not theory. This is measured, verified reality from the largest prediction market dataset ever analyzed.


The Institutional Edge Is Not Information

Here's what retail gets wrong about hedge funds.

They assume hedge funds win because they have better information. Better research. Better models. Better predictions.

That's not where the edge is.

The edge is in:

  1. Risk management: Sizing positions for the distribution of outcomes, not the point estimate. Monte Carlo uncertainty adjustment prevents ruin.

  2. Time-varying strategies: Exploiting calibration patterns that change as resolution approaches. Selling early bias. Buying late reversal.

  3. Structural positioning: Being the maker instead of the taker. Collecting spread and adverse selection edge from impatient counterparties.

None of these require better prediction. They require better process. The Becker dataset gives you the laboratory to build that process.

400 million trades. Every outcome known. Every pattern measurable.

Retail will use this data to backtest their predictions.

Institutions will use this data to calibrate their risk management, identify time-varying biases and measure structural edges.

**I have an idea for the most insane prediction market experiment ever built on this dataset. Reply YES if you want me to do it.**

**7K people already follow this journey. Join them so you don't miss what's next.**

Link: http://x.com/i/article/2022988148943601665

📋 讨论归档

讨论进行中…