🧠 阿头学 · 🪞 Uota学

18个月成为量化交易员?先看看这条路的真实成本

一份看似民主化的量化金融自学指南,实则揭示了一个残酷真相:工具可以民主化,但确信度(conviction)不能——而后者才是真正的护城河。

2026-03-04

核心观点

  • "估计误差"是理论与实践之间的真正鸿沟 完美的Kelly下注、Markowitz组合、ML模型在实践中都会因参数估计的噪声而崩溃。这不是数学问题,是认识论问题——你永远拿不到"真实参数",只能拿到"带噪声的估计"。量化新手的通病是高估自己找到了多少信号(NOT BS),实际上前10个策略都是噪声(NOISY BS)。这个洞察可以迁移到所有数据驱动决策:Neta的增长指标、A/B测试结果、用户反馈——都要问"这是信号还是噪声"。
  • 数学流利度是AI时代少数剩下的护城河 文章最站得住的论点:AI能写代码、能回测,但推导伊藤引理、理解测度论、判断凸松弛的紧性——这些深度数学能力短期内很难被AI替代。迁移到ATou的处境:不是"会用Claude/GPT",而是"理解Transformer架构、知道attention为什么work、能设计出让AI表现提升10倍的系统"。表层工具会被民主化,底层原理是护城河。
  • 但这条路径本身充满幸存者偏差 作者已经成功了,所以回溯出的路径天然带有"事后合理化"。文章完全没提失败率——Jane Street录取率可能低于1%,但语气让人觉得"只要肯学就能进"。时间估算也过于乐观:每个Level的"3-4周"是基于什么样本?对没有数学背景的人,Shreve的随机微积分可能需要6个月而不是6-8周。更关键的是:如果1万人都按这条路走,18个月后市场会涌入大量同质化候选人,门槛必然上升。
  • "条件式思维"是可迁移的最强框架 量化不按"要么真要么假"思考,而是"在我已知信息的前提下,它有多大概率成立"。这不只是概率论,是整个认知框架的升级。在Neta的产品决策里,不是"这个功能好不好",而是"在当前用户画像、竞争格局、技术成熟度的条件下,这个功能成功的概率是多少"。条件式思维强迫你显式列出假设,而不是凭直觉拍板。
  • "工具民主化,确信没有"的悖论 作者在文末说"工具被民主化了,确信没有",但整篇文章恰恰在推销一条"民主化"的路径——用开源工具、免费教材、18个月自学。如果确信度才是护城河,那这条路径能给你的只是"入场券",不是"优势"。这个悖论很有启发:Neta在做的AI原生创作工具也是"民主化"——让所有人都能创作角色、故事。但真正的护城河不是工具本身,而是"用户在Neta上积累的数据、社交关系、创作历史"——这些才是不可迁移的确信度。

跟我们的关联

👤ATou(个人技能树规划) 文章把成为quant拆成5个Level,每个Level都是下一个的前置条件。ATou的2026目标(成为top 0.0001%的AI指挥者)也可以Level化:

  • Level 1: 会用Claude/GPT做单点任务
  • Level 2: 设计multi-agent系统
  • Level 3: 构建自己的agent基础设施
  • Level 4: 理解LLM的数学原理并能调优
  • Level 5: 创造新的agent范式

每个Level有明确的能力边界和验证方式(不是模糊的"我觉得我会了")。下一步:用这个框架重新审视当前技能树,标注"我在哪个Level"和"下一个Level的具体作业是什么"。

🧠Neta(产品与增长策略)

1. "NOT BS vs NOISY BS"框架应用到数据分析:强制每个数据结论附上"这个结论的噪声有多大"。用户增长是真实增长还是营销活动的短期波动?某个功能的留存提升是信号还是样本太小的随机波动?在做A/B测试时,警惕多重比较问题——测试10个变体,总有1-2个"显著提升",但可能只是运气。解决方案:只测试你有强先验的假设,不要shotgun式乱测。

2. "贝叶斯更新速度"是Real-time Era的核心:文章强调"更新得最快、最准确的交易员才能把钱赚到手"。Neta的2026年10月Real-time Era本质上也是贝叶斯更新:用户每次互动都是新数据,系统要实时更新对"这个用户想要什么"的后验概率。关键不是模型有多复杂,而是更新速度——能不能在100ms内完成一次更新并调整下一轮输出。

3. "可视化不确定性"而非隐藏它:量化金融不会假装自己有确定性,而是显式建模风险(VaR、压力测试、置信区间)。Neta在做AI生成内容时也可以借鉴:不是藏起不确定性,而是可视化置信区间——让用户看到"这个角色有70%概率是这样,30%概率是那样",把选择权还给用户。这比"AI说了算"更诚实,也更能建立信任。

🪞Uota(agent架构与学习系统) "Level化学习路径"可以用在agent的能力进化上:不是一次性给agent 100个skill,而是设计渐进式解锁——Level 1完成后才能访问Level 2的工具。每个Level有明确的验证标准(不是"我觉得agent学会了",而是"agent在benchmark上达到X%准确率")。这样可以避免agent在早期就被复杂工具淹没,也能让能力增长更可追踪。

讨论引子

💭 Neta的"确信度"从哪里来? 文章说"工具民主化了,确信没有"。如果所有AI社交产品都能调用同样的模型API(GPT-4/Claude),Neta的确信度(不可复制的优势)是什么?是数据飞轮(用户生成内容→训练模型→更好体验→更多用户)?是社交网络效应?还是团队对"什么是好的AI原生社交"的独特理解?如果答案是"暂时没有",那接下来12个月最该建立的是什么?

💭 ATou在建立"数学流利度"的等价物吗? 量化用深度数学建立护城河,因为这很难速成、很难被AI替代。ATou的Context Engineering、agent系统设计、多模态内容理解——这些是"数学流利度"的等价物吗?还是只是"会用工具"的表层能力?如果是后者,那"深度能力"应该是什么?是理解Transformer架构?是能从零训练模型?还是对人类认知/创作的深刻洞察?

💭 "前10次都会错"——Neta接受这个现实了吗? 文章说"你的前10个策略都会是NOISY BS,现在就接受这一点"。Neta的前10个增长实验、前10个功能迭代、前10个市场假设——有多少已经被证伪了?团队是把失败当"学费"(正常的探索成本),还是当"挫折"(士气打击)?如果是后者,如何建立"快速试错、快速学习"的文化?

在 2025 年,顶级机构的入门级量化总包能拿到 $300K-$500K。

金融行业对 AI/ML 的招聘同比增长了 88%。

这篇文章就是我刚走上这条路时,最希望有人直接递给我的东西——并且把路径按你应该学习的确切顺序铺好了。

这条路径就像电子游戏的层层关卡,你不能跳关。

每一个概念都建立在上一个之上。但前提是你得投入真正的努力:不是去看那些关于金融的无聊 YouTube 视频——那只是在浪费时间——而是做真实的、需要动脑的解题训练。这样,你大概可以在 18 个月内从一无所知变得有模有样。

免责声明: 非投资建议 & 请自行研究 & 市场有风险。 我的项目 - @coldvisionXYZ

忘掉你以为自己懂的交易

大多数人以为量化交易就是选股:对特斯拉有观点、预测财报、押对方向。

量化交易关乎数学。

你大部分时间处理的是统计关系、定价低效,以及结构性优势——这些优势之所以存在,是因为市场是由人运行的复杂系统,而人会犯系统性的错误。

Part I: 概率是不确定性的语言

量化金融里的一切,某种程度上都可以归结为一个问题:

胜率是多少?而且这个胜率是否对我有利?

这就是概率。如果你对概率没有深刻理解,这篇文章里其他内容都不重要。

条件式思维

大多数人按“绝对”方式思考:要么真,要么假。
量化按“条件”方式思考:在我已知信息的前提下,它有多大概率成立?

A 在给定 B 的条件下发生的概率,等于 A 和 B 同时发生的概率除以 B 发生的概率:P(A|B) = P(A∩B) / P(B)。其影响非常深远。

某只股票在 60% 的交易日上涨——这就是基准概率(base rate)。但在成交量高于平均的日子里,它有 75% 的时间上涨。

这个条件概率才是“不是瞎扯”的(NOT BS)。而那个未加条件的 60%,则是“噪声很大的瞎扯”(NOISY BS)。
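下面是一个最小的 Python 模拟,验证上述条件概率的含义。60%/75% 等数字沿用正文,但数据完全是虚构生成的,仅作演示:

```python
import numpy as np

# 验证条件概率 P(A|B) = P(A 且 B) / P(B)
# 参数(40% 高量日、75%/50% 上涨概率)是虚构的演示假设
rng = np.random.default_rng(0)
n_days = 100_000

high_volume = rng.random(n_days) < 0.4            # 40% 的交易日成交量高于均值
p_up = np.where(high_volume, 0.75, 0.50)          # 高量日上涨概率 75%,否则 50%
up = rng.random(n_days) < p_up

base_rate = up.mean()                                 # 无条件上涨概率(基准概率)
cond = (up & high_volume).sum() / high_volume.sum()   # P(上涨 | 高量) = P(两者) / P(高量)

print(f"P(up)            = {base_rate:.3f}")   # 约 0.60
print(f"P(up | high vol) = {cond:.3f}")        # 约 0.75
```

基准概率 0.4×0.75 + 0.6×0.50 = 0.60,与正文的例子一致。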

贝叶斯定理

你更新后的信念等于

(如果你的假设为真,你看到这组数据的可能性) * (你的先验信念) /(在任何假设下看到这组数据的总概率)。

分母会对所有假设求和。

在实践中,你会用 Monte Carlo 采样来计算它。

但逻辑完全一样。贝叶斯就是你如何实时更新自己的确信程度。

某个模型说一只股票应该值 $50。财报出来,营收比预期高 3%。贝叶斯后验就会向上移动。更新得最快、最准确的交易员才能把钱赚到手。
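正文里的“贝叶斯更新器”可以用几行代码勾勒出来。下面是离散假设空间上的一个最小示例(三个候选概率 0.4/0.5/0.6 是虚构的演示假设):

```python
import numpy as np

# 最小的离散贝叶斯更新器:后验 ∝ 似然 × 先验
# 三个候选假设(虚构):股票单日上涨的"真实"概率为 0.4 / 0.5 / 0.6
hypotheses = np.array([0.4, 0.5, 0.6])
prior = np.array([1/3, 1/3, 1/3])

def update(prior, hypotheses, went_up):
    """观察到一天涨/跌之后,返回各假设的后验概率。"""
    likelihood = hypotheses if went_up else 1 - hypotheses
    unnorm = likelihood * prior
    return unnorm / unnorm.sum()   # 分母 = 在所有假设下看到该数据的总概率

posterior = prior
for went_up in [True, True, False, True]:   # 观察序列:涨、涨、跌、涨
    posterior = update(posterior, hypotheses, went_up)

print(np.round(posterior, 3))   # 概率质量向 0.6 的假设移动
```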

期望值与方差:你最好的两位朋友

期望值代表你的“胜算/确定性”。
方差代表你的风险。

如果你的策略期望值为正,而且你能扛住方差(不被波动打爆),你大概率会赚钱。

Level 1 作业(3-4 周,每天 2 小时):

  1. 阅读 Blitzstein & Hwang, Introduction to Probability(哈佛提供免费 PDF)。做完第 1-6 章的每一道题。
  2. 编程:模拟 10,000 次抛硬币,用可视化方式验证大数定律。
  3. 编程:实现一个贝叶斯更新器:输入先验和似然,输出后验。

Part II: 统计学

当你会说概率这门语言后,你还得学会听数据说话。

这就是统计学,而统计学教给你的头号教训是:“看起来像 NOT BS 的,大多数其实是 NOISY BS。”

假设检验就是你的“瞎扯探测器”。

你做了一个模型,回测年化 15%。这是真的吗?

设定 H_0: “这个策略的期望收益为 0。”
计算一个检验统计量。
计算 p-value——在 H_0 为真的情况下,得到这么好结果的概率。

但如果你随机测试 1,000 个策略,其中大约 50 个会纯靠运气出现 p-value 低于 0.05。

这就是多重比较问题。

你的修复方式是 Bonferroni 校正:用显著性阈值除以测试次数。
或者用 Benjamini-Hochberg 来控制错误发现率(false discovery rate)。

几乎每一个新手都会严重高估自己找到了多少 NOT BS。你的前 10 个策略都会是 NOISY BS。现在就接受这一点,你能省下很多钱。
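多重比较问题可以直接模拟出来。下面的草图生成 1,000 个期望收益为 0 的假策略(数据为虚构),统计有多少个“纯靠运气”显著,并对比 Bonferroni 校正后的结果:

```python
import numpy as np
from scipy import stats

# 1,000 个真实均值为 0 的"策略",看有多少个伪装成显著
rng = np.random.default_rng(42)
n_strategies, n_days = 1_000, 252

fake_returns = rng.normal(0.0, 0.01, size=(n_strategies, n_days))
t_stat, p_vals = stats.ttest_1samp(fake_returns, 0.0, axis=1)

naive_hits = (p_vals < 0.05).sum()                       # 未校正:期望约 50 个假阳性
bonferroni_hits = (p_vals < 0.05 / n_strategies).sum()   # 阈值除以测试次数

print(f"p < 0.05 的策略数(纯运气): {naive_hits}")
print(f"Bonferroni 校正后仍显著的: {bonferroni_hits}")
```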

用回归分解收益

线性回归 y=Xβ+ϵ 是最核心的主力工具。
在金融里,你会把策略收益对已知风险因子做回归:

截距 α 就是你的 alpha,也就是无法被已知因子解释的收益。如果在控制因子后 α 为 0,那么你的“优势”只是伪装过的市场暴露。

OLS 估计量:β̂ = (XᵀX)⁻¹ Xᵀy。

最重要的数字是 α。
使用 Newey-West 标准误:金融数据存在自相关与异方差,所以默认的 OLS 标准误是错的。继续用默认标准误,就像开车时挡风玻璃裂了还硬开。

极大似然估计(MLE)

给定来自参数为 θ 的模型的数据 x_1, …, x_n,似然函数为 L(θ) = ∏_i f(x_i; θ),最大化 log L(θ):

令导数为 0 并求解。(不然就没戏了 gng)

在金融里,你用 MLE 来校准几乎所有模型:给波动率拟合 GARCH、估计跳跃扩散参数、把期权定价模型校准到市场报价。

它在渐近意义下是有效的:对于大样本,没有任何一致估计量能拥有更低的方差(Cramér-Rao 下界)。

当公司里有人说自己在“校准(calibrating)”模型时,他们几乎总是指 MLE。

Level 2 作业(4-5 周):

  1. 阅读 Wasserman, All of Statistics,第 1-13 章。
  2. 编程:用 yfinance 下载真实股票收益率。检验正态性(会失败)。用 MLE 拟合 t 分布并比较。
  3. 编程:用 statsmodels 对一个股票组合做 Fama-French 三因子回归。
  4. 编程:实现一个置换检验:把日期打乱 10,000 次,把打乱后的表现与真实表现比较。

import numpy as np
import cvxpy as cp

# ============================================
# Markowitz optimization with cvxpy
# ============================================
np.random.seed(42)
n_assets = 10
mu = np.random.uniform(0.04, 0.15, n_assets)
A = np.random.randn(n_assets, n_assets) * 0.1
cov = A @ A.T + np.eye(n_assets) * 0.01

w = cp.Variable(n_assets)
objective = cp.Minimize(cp.quad_form(w, cov))
constraints = [
    mu @ w >= 0.08,      # minimum return
    cp.sum(w) == 1,       # fully invested
    w >= -0.1,            # max 10% short
    w <= 0.3              # max 30% long
]

prob = cp.Problem(objective, constraints)
prob.solve()

ret = mu @ w.value
vol = np.sqrt(w.value @ cov @ w.value)
sharpe = (ret - 0.03) / vol

print(f"Portfolio return:  {ret:.4f}")
print(f"Portfolio vol:     {vol:.4f}")
print(f"Sharpe ratio:      {sharpe:.4f}")
print(f"Weights: {np.round(w.value, 4)}")

Part III: 线性代数

线性代数听起来很无聊。但它是驱动一切的机器:组合构建、PCA、神经网络、协方差估计、因子模型。你不可能在不熟练矩阵的情况下成为量化。

import numpy as np
from scipy.stats import norm

def black_scholes(S, K, T, r, sigma, option_type='call'):
    d1 = (np.log(S/K) + (r + sigma**2/2)*T) / (sigma*np.sqrt(T))
    d2 = d1 - sigma*np.sqrt(T)
    if option_type == 'call':
        return S*norm.cdf(d1) - K*np.exp(-r*T)*norm.cdf(d2)
    else:
        return K*np.exp(-r*T)*norm.cdf(-d2) - S*norm.cdf(-d1)

def monte_carlo_option(S0, K, T, r, sigma, n_sims=500_000):
    """Price via risk-neutral simulation (drift = r, not mu)"""
    Z = np.random.standard_normal(n_sims)
    ST = S0 * np.exp((r - sigma**2/2)*T + sigma*np.sqrt(T)*Z)
    payoffs = np.maximum(ST - K, 0)
    price = np.exp(-r*T) * np.mean(payoffs)
    stderr = np.exp(-r*T) * np.std(payoffs) / np.sqrt(n_sims)
    return price, stderr

def greeks(S, K, T, r, sigma):
    d1 = (np.log(S/K) + (r + sigma**2/2)*T) / (sigma*np.sqrt(T))
    d2 = d1 - sigma*np.sqrt(T)
    return {
        'delta': norm.cdf(d1),
        'gamma': norm.pdf(d1) / (S * sigma * np.sqrt(T)),
        'theta': -(S*norm.pdf(d1)*sigma)/(2*np.sqrt(T)) - r*K*np.exp(-r*T)*norm.cdf(d2),
        'vega':  S * np.sqrt(T) * norm.pdf(d1),
        'rho':   K * T * np.exp(-r*T) * norm.cdf(d2),
    }

# Verify: Monte Carlo converges to Black-Scholes
S, K, T, r, sigma = 100, 105, 1.0, 0.05, 0.2

bs = black_scholes(S, K, T, r, sigma)
mc, err = monte_carlo_option(S, K, T, r, sigma)
g = greeks(S, K, T, r, sigma)

print(f"Black-Scholes: ${bs:.4f}")
print(f"Monte Carlo:   ${mc:.4f} ± {err:.4f}")
print(f"Difference:    ${abs(bs - mc):.4f}\n")
for name, val in g.items():
    print(f"  {name:>6}: {val:.6f}")

用矩阵思考

协方差矩阵 Σ 描述了每个资产相对于其他资产的联动方式。对 500 只股票而言,Σ 是一个 500×500 的矩阵,包含 125,250 个不重复的条目。组合方差会收敛为一个单一表达式:σ_p² = wᵀ Σ w。

这个二次型是 Markowitz 组合理论、风险管理以及一切的核心。
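这个二次型可以用几行 numpy 验证(协方差矩阵为随机构造的演示数据):

```python
import numpy as np

# 验证组合方差的二次型:w^T Σ w 等于逐对协方差的加权求和
rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))
cov = A @ A.T + np.eye(n) * 0.1        # 构造一个正定协方差矩阵
w = np.full(n, 1 / n)                  # 等权组合

quad = w @ cov @ w
pairwise = sum(w[i] * w[j] * cov[i, j] for i in range(n) for j in range(n))
print(quad, pairwise)  # 两个数完全相等
```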

在股票宇宙里,真正重要的是特征值/特征向量

看一个 500 只股票的宇宙,前 5 个特征向量就能解释 70% 的整体方差。剩下的全是 NOISY BS。

第一次用特征分解时,你会觉得整个世界都变了。这就是降维,也是因子投资的基础。
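下面是一个最小的 PCA 草图:用一个虚构的“市场因子”生成 50 只股票的收益,再用特征分解验证第一个主成分确实吃掉大部分方差(具体占比取决于这里的虚构参数):

```python
import numpy as np

# 用公共市场因子 + 个股噪声生成收益,验证特征值谱高度集中
rng = np.random.default_rng(7)
n_days, n_stocks = 2_000, 50
market = rng.normal(0, 0.01, n_days)                 # 公共因子
betas = rng.uniform(0.5, 1.5, n_stocks)              # 各股票对因子的暴露
returns = np.outer(market, betas) + rng.normal(0, 0.005, (n_days, n_stocks))

cov = np.cov(returns, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)[::-1]              # 特征值,从大到小
explained = eigvals / eigvals.sum()

print(f"第 1 个主成分解释的方差占比: {explained[0]:.1%}")
```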

Level 3 作业(4-6 周):

  1. 观看 Gilbert Strang 的 MIT 18.06 线性代数课程:全部看完。没有商量余地。
  2. 阅读 Strang, Introduction to Linear Algebra。做题。
  3. 编程:对标普 500 收益率做 PCA 分解。画出特征值谱。识别前三个主成分。
  4. 编程:从零实现 Markowitz 均值-方差优化。

import numpy as np
import matplotlib.pyplot as plt

# Law of large numbers: running average converges to true probability
np.random.seed(42)
flips = np.random.choice([0, 1], size=10000, p=[0.5, 0.5])
running_avg = np.cumsum(flips) / np.arange(1, 10001)

plt.figure(figsize=(10, 4))
plt.plot(running_avg, linewidth=0.7)
plt.axhline(y=0.5, color='r', linestyle='--', label='True probability')
plt.xlabel('Number of flips')
plt.ylabel('Running average')
plt.title('Law of Large Numbers in Action')
plt.legend()
plt.savefig('lln.png', dpi=150)
print(f"After 10,000 flips: {running_avg[-1]:.4f} (true: 0.5000)")

Part IV: 微积分与优化

微积分是变化的语言。在金融里,一切都在变:价格、波动率、相关性,整张概率分布都在每一秒重新塑形。微积分描述并利用这些变化。

导数(数学意义上的):出现在每一个神经网络的反向传播里,也出现在每一次 Greeks 计算里。

泰勒展开:f(x + Δx) ≈ f(x) + f′(x)·Δx + ½·f″(x)·Δx²。

import numpy as np
from scipy import optimize, stats

# Demonstrate fat tails: MLE fit of Student-t to return data
np.random.seed(42)

# Simulate "realistic" returns (fat tails, slight positive drift)
true_df = 4
returns = stats.t.rvs(df=true_df, loc=0.0005, scale=0.015, size=1000)

def neg_log_likelihood(params, data):
    df, loc, scale = params
    if df <= 2 or scale <= 0:
        return 1e10
    return -np.sum(stats.t.logpdf(data, df=df, loc=loc, scale=scale))

result = optimize.minimize(
    neg_log_likelihood, x0=[5, 0, 0.01], args=(returns,),
    method='Nelder-Mead'
)
fitted_df, fitted_loc, fitted_scale = result.x

print(f"MLE degrees of freedom: {fitted_df:.2f} (true: {true_df})")
print(f"MLE location:           {fitted_loc:.6f}")
print(f"MLE scale:              {fitted_scale:.6f}")

# Normality test
_, p_normal = stats.normaltest(returns)
print(f"\nNormality test p-value: {p_normal:.2e}")
print(f"Reject normality? {'YES  fat tails confirmed' if p_normal < 0.05 else 'NO'}")

Delta 对冲是一阶近似。
Gamma 对冲加入二阶修正。
而伊藤微积分之所以不同于普通微积分,恰恰是因为对随机过程来说,二阶泰勒项并不会消失。记住这一点。
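一阶 delta 与二阶 gamma 修正的差别可以直接算出来。下面的草图对一个欧式看涨期权(参数沿用上文代码中的 100/105/1.0/0.05/0.2)比较泰勒近似与真实重定价:

```python
import numpy as np
from scipy.stats import norm

# 用泰勒展开近似期权价格变化:delta 是一阶项,gamma 是二阶修正
def bs_call(S, K, T, r, sigma):
    d1 = (np.log(S/K) + (r + sigma**2/2)*T) / (sigma*np.sqrt(T))
    d2 = d1 - sigma*np.sqrt(T)
    return S*norm.cdf(d1) - K*np.exp(-r*T)*norm.cdf(d2)

S, K, T, r, sigma = 100.0, 105.0, 1.0, 0.05, 0.2
d1 = (np.log(S/K) + (r + sigma**2/2)*T) / (sigma*np.sqrt(T))
delta = norm.cdf(d1)
gamma = norm.pdf(d1) / (S * sigma * np.sqrt(T))

dS = 3.0                                           # 假设股价上涨 $3
true_change = bs_call(S + dS, K, T, r, sigma) - bs_call(S, K, T, r, sigma)
first_order = delta * dS                           # 仅 delta(一阶)
second_order = delta * dS + 0.5 * gamma * dS**2    # delta + gamma(二阶)

print(f"真实变化:      {true_change:.4f}")
print(f"一阶(delta):  {first_order:.4f}")
print(f"二阶(+gamma): {second_order:.4f}")
```

二阶近似明显更贴近真实重定价,这正是 gamma 对冲存在的理由。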

Level 4 作业(4-5 周):

  1. 阅读 Boyd & Vandenberghe, Convex Optimization(斯坦福提供免费 PDF),第 1-5 章。
  2. 编程:从零实现梯度下降。最小化 Rosenbrock 函数。
  3. 编程:用 cvxpy 解一个带交易成本约束的组合优化问题。

Part V: 随机微积分

在学随机微积分之前,你只是个喜欢金融的数据科学家。

学完之后,你就是量化。你听到了吗?QUANTITATIVE FINANCE EXPERT。

你会在这里学会:如何在连续时间里建模随机性、如何从第一性原理推导 Black-Scholes 方程、以及为什么万亿美元级别的衍生品市场会以那样的方式运作。

布朗运动:纯随机,被形式化

布朗运动(维纳过程)W_t 是一个连续时间随机游走:

  • W_0 = 0

  • 增量 W_t - W_s ~ N(0, t - s),对任意 t > s

  • 不重叠的增量相互独立

  • 路径连续,但处处不可导

一个所有后续内容都依赖的关键洞见:dW_t 的“量级”是 √dt,这意味着 (dW_t)^2 = dt。听起来像技术细节,但它是量化金融里最重要的单一事实。
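(dW_t)^2 = dt 可以用模拟验证:布朗运动的二次变差 Σ(ΔW)² 会收敛到总时长 T,而且几乎不随具体路径改变——这正是伊藤引理中二阶项不消失的原因。下面是一个最小草图:

```python
import numpy as np

# 验证二次变差:把 [0, T] 切成 n 步,增量 ΔW ~ N(0, dt)
rng = np.random.default_rng(0)
T, n_steps = 1.0, 1_000_000
dt = T / n_steps

dW = rng.normal(0.0, np.sqrt(dt), n_steps)
quadratic_variation = np.sum(dW**2)          # Σ(ΔW)² → T
print(f"Σ(ΔW)² = {quadratic_variation:.4f}  (理论值 T = {T})")
```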

几何布朗运动(GBM)用于建模股票价格:dS_t = μ S_t dt + σ S_t dW_t。

伊藤引理

在普通微积分里,df = f'(x)dx。你做泰勒展开,然后 (dx)^2 项是无穷小的高阶项,于是你把它扔掉。

但当 x 是随机过程时,(dW_t)^2 = dt 是一阶项。你不能扔。

伊藤引理:若 dS = μS dt + σS dW_t,则对 f(S,t) 有 df = (∂f/∂t + μS·∂f/∂S + ½σ²S²·∂²f/∂S²) dt + σS·(∂f/∂S) dW_t。

把它应用到期权价格上,你就能得到 Black-Scholes。这条公式就是整个衍生品行业的发动机。

从零推导 Black-Scholes

拿起纸笔跟着推。

Step 1: 令 V(S,t) 为期权价格。应用伊藤引理:dV = (∂V/∂t + μS·∂V/∂S + ½σ²S²·∂²V/∂S²) dt + σS·(∂V/∂S) dW_t。

Step 2: 构造一个 delta 对冲组合 Π = V − (∂V/∂S)·S。计算 dΠ = dV − (∂V/∂S)·dS:

dW_t​ 项会被完美抵消。该组合在局部是无风险的。

Step 3: 无风险组合必须赚取无风险利率:dΠ = rΠ dt。

Step 4: 代入并整理:

∂V/∂t + ½σ²S²·∂²V/∂S² + rS·∂V/∂S − rV = 0

这就是 Black-Scholes PDE。

注意发生了什么——漂移项 μ 消失了。期权价格不依赖股票的期望收益。风险偏好不重要。你可以像所有人都是风险中性那样为期权定价。第一次真正理解这一点,会让人直呼离谱。

解这个 PDE(执行价 K、到期 T 的欧式看涨)得到:

C = S·N(d₁) − K·e^(−rT)·N(d₂),其中 d₁ = [ln(S/K) + (r + σ²/2)T] / (σ√T),d₂ = d₁ − σ√T。

The Greeks

  • Delta Δ 是股票每变动 $1,期权变动多少。你的对冲比率。

  • Gamma Γ​:delta 变化的速度。你的凸性敞口。

  • Theta Θ:时间价值衰减。对做多期权通常为负。

  • Vega V:对波动率的敏感度。衍生品赚钱的大头往往在这里。

  • Rho ρ:对利率的敏感度。

Delta 告诉你对冲比率。Gamma 告诉你多久需要重新对冲。Theta 是持有成本。Vega 是波动率交易台的饭碗。

Level 5 作业(6-8 周——最难的一关):

  1. 阅读 Shreve, Stochastic Calculus for Finance II。金标准。
  2. 备选:Arguin, A First Course in Stochastic Calculus(更新、更易读)。
  3. 推导:对 f(S) = ln(S) 应用伊藤引理,其中 S 服从 GBM,推出漂移中的 −σ²/2 修正项。
  4. 推导:用 delta 对冲论证推导完整 Black-Scholes 方程。
  5. 编程:从零实现 Black-Scholes。与 Monte Carlo 比较。验证收敛性。

Polymarket

这是当下世界上最有意思的市场,而它背后的数学把本文所有东西都串起来了:概率、信息论、凸优化、整数规划。

LMSR 如何为信念定价

对数市场评分规则(LMSR)由 Robin Hanson 发明,为自动化预测市场提供动力。对 n 个结果,其成本函数是 C(q) = b·ln(Σ_i e^(q_i/b)),其中 q_i 追踪结果 i 的未平仓份额,b 是流动性参数。结果 i 的价格为:p_i = e^(q_i/b) / Σ_j e^(q_j/b)。

这就是 softmax 函数——驱动每一个神经网络分类器的函数。

价格总和永远为 1,永远落在 (0,1) 内,并且永远存在,提供无限流动性。做市商的最大亏损上界为 b·ln(n)。
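LMSR 的成本函数与 softmax 价格可以用几行代码实现。下面是一个最小做市商草图(b=100、3 个结果、买入 50 份均为虚构的演示参数):

```python
import numpy as np

# 最小 LMSR 做市商:成本函数 C(q) = b·ln(Σ exp(q_i/b)),价格即 softmax
b = 100.0
n = 3
q = np.zeros(n)                       # 各结果的未平仓份额

def cost(q):
    return b * np.log(np.sum(np.exp(q / b)))

def prices(q):
    e = np.exp(q / b)
    return e / e.sum()                # softmax:恒为正,总和恒为 1

print(np.round(prices(q), 4))         # 初始均匀价格

buy = np.array([50.0, 0.0, 0.0])      # 有人买入 50 份结果 0
trade_cost = cost(q + buy) - cost(q)  # 交易者支付的成本
q += buy
print(np.round(prices(q), 4))         # 结果 0 的价格上升

max_loss = b * np.log(n)              # 做市商最大亏损上界
print(f"最大亏损上界 b·ln(n) = {max_loss:.2f}")
```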

量化职业版图

4 种原型:

量化研究员(Quant Researcher) 最顶的那类人:在 PB 级数据里找模式、建预测模型、设计算法策略。需要博士级的数学/统计/ML,或极其突出的本科能力。在 Jane Street 这类公司,QR 会用上成千上万块 GPU。

量化开发/工程师(Quant Developer/Engineer) 中间那档的强者,主要是“建造者”:交易平台、执行引擎、实时数据管道。把研究员的模型变成真正能交易的系统。需要生产级 C++/Rust/Python,以及低延迟系统能力。

量化交易员(Quant Trader) 要么是最猛的赌徒,要么是最顶的强者,主要是决策者:跑资金、管风险、实时拍板。薪酬波动最大——极端年份能到八位数。

风险量化(Risk Quant) 要么是最顶的强者,要么是经验极其老到的“企业老手”,主要是守门人:模型验证、VaR、压力测试、合规监管。职业更稳、上限更低。

正在兴起的 AI/ML Quant 角色(用深度学习做信号生成)增长最快,2025 年招聘同比增长 88%。

薪酬大概是这样:

顶级(Jane Street, Citadel, HRT):
  • 应届:$300K–$500K+ 总包
  • 中期(3-7 年):$550K–$950K
  • 资深(8+ 年):$1M–$3M+
  • 明星交易员/PM:$3M–$30M+

中档(Two Sigma, DE Shaw):
  • 应届:$250K–$350K
  • 中期(3-7 年):$350K–$625K
  • 资深(8+ 年):$575K–$1.2M
  • 明星交易员/PM:不详

据报道,Jane Street 在 2025 年上半年的人均薪酬折合每年 $1.4 million。不过,那只是平均值。

面试地狱

简历筛选 -> 在线测评(Zetamac 心算——目标 50+;逻辑题) -> 电话面(概率题、下注游戏) -> Superday(连续 3-5 场面试,模拟交易、编程、白板推导)。

Jane Street 会故意出一些你一个人解不出来的题——他们测试的是你如何利用提示、如何协作。

他们最近的实习生里,超过三分之二学的是 CS;超过三分之一学的是数学。一般不要求金融知识。

头号备考资源:Xinfeng Zhou 的 Green Book(A Practical Guide to Quantitative Finance Interviews)——200+ 道真实题。再补充:QuantGuide.io(“量化版 LeetCode”)、Brainstellar,以及 Jane Street 的 Figgie 纸牌游戏。

完整工具箱

Python 技术栈:

  • 数据:pandas, polars(Polars 在大数据集上快 10-50 倍)
  • 数值计算:numpy, scipy
  • ML(表格数据):xgboost, lightgbm, catboost
  • ML(深度):pytorch
  • 优化:cvxpy
  • 衍生品:QuantLib(工业级,C++ 后端)
  • 统计:statsmodels
  • 回测:NautilusTrader;更容易上手的有 backtrader, vectorbt
  • 量化研究:Microsoft Qlib(17K+ stars,AI 导向)
  • 交易强化学习:FinRL(10K+ stars)

C++ 与 Rust 说实话我对这块啥也不懂。这是我找到的: C++ 库:QuantLib, Eigen, Boost。 Rust:RustQuant(期权定价),以及 NautilusTrader 的 Rust+Python 范式(Rust 内核提速,Python API 做研究)。

数据源:

  • 免费:yfinance, Finnhub(60 次调用/分钟), Alpha Vantage
  • 中档:Polygon.io($199/月,<20ms 延迟), Tiingo
  • 企业级:Bloomberg Terminal(约 $32K/年), Refinitiv, FactSet
  • 区块链:Alchemy(免费层含归档访问)

求解器 Gurobi:最快的商业 MIP 求解器,有免费的学术许可。做组合套利必备。 Google OR-Tools:最强免费求解器。 PuLP/Pyomo:Python 建模接口。

阅读清单(按顺序)

数学

  1. Blitzstein & Hwang - Introduction to Probability(哈佛提供免费 PDF)

  2. Strang - Introduction to Linear Algebra + MIT 18.06 lectures

  3. Wasserman - All of Statistics

  4. Boyd & Vandenberghe - Convex Optimization(斯坦福提供免费 PDF)

  5. Shreve - Stochastic Calculus for Finance I & II

量化金融

  1. Hull - Options, Futures, and Other Derivatives

  2. Natenberg - Option Volatility and Pricing

  3. López de Prado - Advances in Financial Machine Learning

  4. Ernest Chan - Quantitative Trading

  5. Zuckerman - The Man Who Solved the Market

面试准备

  1. Zhou - Practical Guide to Quantitative Finance Interviews(Green Book #1)

  2. Crack - Heard on the Street

  3. Joshi - Quant Job Interview Questions

竞赛

  • Jane Street Kaggle($100K 奖金)

  • WorldQuant BRAIN(100K+ 用户,为 alpha 信号付费)

  • Citadel Datathon(快速通道拿 offer)

  • Jane Street 每月谜题(难度高于面试)

三件我希望自己更早知道的事

估计误差才是真正的敌人。 满 Kelly 下注、无约束 Markowitz、特征太多的 ML 模型——它们都会因为同一个原因失败:在参数估计里对 NOISY BS 过拟合。

在真实参数存在时,数学完美无缺。但你永远拿不到真实参数。理论与实践之间的差距永远是估计误差,而最好的量化就是那些尊重这一点的人。

工具被民主化了。确信没有。 任何人都能用 QuantLib、Polygon.io、PyTorch。技术是必要条件,但远远不够。优势来自独特数据、独特模型或独特执行——不是更会 pip install。

数学是护城河。 AI 能写代码、能给策略建议。但能推导出为什么伊藤引理多了一项、能证明在风险中性测度下贴现价格是鞅、能判断组合市场里某个凸松弛到底是紧的还是松的——这种数学流利度,才把“自己打造优势的量化”和“借来优势的量化”区分开。而借来的优势会过期。

Part 2 会讲什么

Part 2 覆盖:奇异期权(障碍、亚式、回望)、随机波动率(Heston 模型校准)、跳跃扩散(Merton)、更高阶测度论(鞅表示、可选停止)、最优执行的随机控制(Almgren-Chriss)、做市的强化学习、面向金融时间序列的 Transformer 架构、FPGA 交易基础设施、WebSocket 行情、并行执行、用 Gurobi 做跨上千条件的组合套利(Frank-Wolfe)。

数学更难。薪水更长。



Part IV: Calculus & Optimization

Part IV: 微积分与优化

Calculus is the language of change. In finance, everything changes: prices, volatilities, correlations, the entire probability distribution shifts second by second. Calculus describes and exploits those changes.

微积分是变化的语言。在金融里,一切都在变:价格、波动率、相关性,整张概率分布都在每一秒重新塑形。微积分描述并利用这些变化。

Derivatives (the math kind): appears in every neural network backpropagation and every Greek calculation.

导数(数学意义上的):出现在每一个神经网络的反向传播里,也出现在每一次 Greeks 计算里。

Taylor expansion:

泰勒展开

import numpy as np
from scipy import optimize, stats

# Demonstrate fat tails: MLE fit of Student-t to return data
np.random.seed(42)

# Simulate "realistic" returns (fat tails, slight positive drift)
true_df = 4
returns = stats.t.rvs(df=true_df, loc=0.0005, scale=0.015, size=1000)

def neg_log_likelihood(params, data):
    df, loc, scale = params
    if df <= 2 or scale <= 0:
        return 1e10
    return -np.sum(stats.t.logpdf(data, df=df, loc=loc, scale=scale))

result = optimize.minimize(
    neg_log_likelihood, x0=[5, 0, 0.01], args=(returns,),
    method='Nelder-Mead'
)
fitted_df, fitted_loc, fitted_scale = result.x

print(f"MLE degrees of freedom: {fitted_df:.2f} (true: {true_df})")
print(f"MLE location:           {fitted_loc:.6f}")
print(f"MLE scale:              {fitted_scale:.6f}")

# Normality test
_, p_normal = stats.normaltest(returns)
print(f"\nNormality test p-value: {p_normal:.2e}")
print(f"Reject normality? {'YES  fat tails confirmed' if p_normal < 0.05 else 'NO'}")

Delta hedging is the first-order approximation. Gamma hedging adds the second-order correction. And the reason Itô calculus differs from ordinary calculus is precisely because the second-order Taylor term doesn't vanish for random processes. Remember that.

Delta 对冲是一阶近似。
Gamma 对冲加入二阶修正。
而伊藤微积分之所以不同于普通微积分,恰恰是因为对随机过程来说,二阶泰勒项并不会消失。 记住这一点
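To see the first- and second-order picture numerically, here is a minimal sketch (a standard textbook Black-Scholes call pricer; the specific strike, vol, and $3 move are made up for illustration). It compares how the option actually reprices after a stock move against the delta-only and delta-plus-gamma Taylor approximations:

```python
import numpy as np
from scipy.stats import norm

def bs_call(S, K, T, r, sigma):
    # Standard Black-Scholes European call price
    d1 = (np.log(S / K) + (r + sigma**2 / 2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

S, K, T, r, sigma = 100.0, 100.0, 0.5, 0.03, 0.25
d1 = (np.log(S / K) + (r + sigma**2 / 2) * T) / (sigma * np.sqrt(T))
delta = norm.cdf(d1)                             # first-order sensitivity
gamma = norm.pdf(d1) / (S * sigma * np.sqrt(T))  # second-order sensitivity

dS = 3.0  # the stock jumps $3
actual = bs_call(S + dS, K, T, r, sigma) - bs_call(S, K, T, r, sigma)
delta_only = delta * dS                           # what a pure delta hedge assumes
delta_gamma = delta * dS + 0.5 * gamma * dS**2    # adds the convexity correction

print(f"actual option repricing: {actual:.4f}")
print(f"delta approximation    : {delta_only:.4f}")
print(f"delta + gamma          : {delta_gamma:.4f}")
```

Because a call is convex in S, the delta-only estimate always undershoots; adding the gamma term closes most of the gap.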

Level 4 homework (4-5 weeks): 1. Read Boyd & Vandenberghe, Convex Optimization (free PDF from Stanford), Chapters 1-5. 2. Code Implement gradient descent from scratch. Minimize the Rosenbrock function. 3. Code Solve a portfolio optimization problem with cvxpy including transaction cost constraints.

Level 4 作业(4-5 周): 1. 阅读 Boyd & Vandenberghe, Convex Optimization(斯坦福提供免费 PDF),第 1-5 章。 2. 编程 从零实现梯度下降。最小化 Rosenbrock 函数。 3. 编程 用 cvxpy 解一个带交易成本约束的组合优化问题。
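Homework item 2 can be sketched like this (plain steepest descent with a backtracking Armijo line search; the starting point, tolerance, and iteration count are illustrative choices, not prescribed by the text):

```python
import numpy as np

def rosenbrock(p):
    x, y = p
    return (1 - x)**2 + 100 * (y - x**2)**2

def rosenbrock_grad(p):
    x, y = p
    return np.array([
        -2 * (1 - x) - 400 * x * (y - x**2),
        200 * (y - x**2),
    ])

p = np.array([-1.0, 1.0])  # classic starting point in the curved valley
for _ in range(50_000):
    g = rosenbrock_grad(p)
    t = 1.0
    # Backtracking (Armijo) line search: shrink the step until it makes real progress
    while rosenbrock(p - t * g) > rosenbrock(p) - 1e-4 * t * (g @ g):
        t *= 0.5
    p = p - t * g

print(f"minimum found at {np.round(p, 6)}")  # true minimizer is (1, 1)
print(f"f(p) = {rosenbrock(p):.3e}")
```

The Rosenbrock valley is badly conditioned, which is exactly why naive fixed-step gradient descent crawls; the line search is what keeps this sketch stable.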

Part V: Stochastic Calculus

Part V: 随机微积分

Before stochastic calculus, you're a data scientist who likes finance.

在学随机微积分之前,你只是个喜欢金融的数据科学家。

After it, you're a quant. QUANTITATIVE FINANCE EXPERT, you heard?

学完之后,你就是量化。你听到了吗?QUANTITATIVE FINANCE EXPERT。

This is where you learn to model randomness in continuous time, derive the Black-Scholes equation from first principles, and understand why the trillion-dollar derivatives market works the way it does.

你会在这里学会:如何在连续时间里建模随机性、如何从第一性原理推导 Black-Scholes 方程、以及为什么万亿美元级别的衍生品市场会以那样的方式运作。

Brownian motion: pure randomness, formalized

布朗运动:纯随机,被形式化

A Brownian motion (Wiener process) W_t is a continuous-time random walk:

布朗运动(维纳过程)W_t 是一个连续时间随机游走:

  • W_0 = 0
  • W_0 = 0
  • Increments W_t - W_s ~ N(0, t - s) for t > s
  • 增量 W_t - W_s ~ N(0, t - s),对 t > s
  • Non-overlapping increments are independent
  • 不重叠的增量相互独立
  • Paths are continuous but nowhere differentiable
  • 路径连续,但处处不可导

The critical insight that everything else depends on: dW_t has "size" √dt, which means (dW_t)^2 = dt. This sounds like a technicality, but it's the single most important fact in quantitative finance.

一个所有后续内容都依赖的关键洞见:dW_t 的“量级”是 √dt,这意味着 (dW_t)^2 = dt。听起来像技术细节,但它是量化金融里最重要的单一事实。
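You can watch (dW_t)^2 = dt happen in simulation. This sketch (illustrative grid sizes, seed fixed for reproducibility) sums the squared increments of a Brownian path over [0, T]: each dW is random, but the sum of squares locks onto T as the partition gets finer:

```python
import numpy as np

# Quadratic variation of Brownian motion: sum of (dW)^2 over [0, T] converges to T
np.random.seed(0)
T = 1.0
for n in [100, 10_000, 1_000_000]:
    dt = T / n
    dW = np.random.normal(0.0, np.sqrt(dt), size=n)  # increments ~ N(0, dt)
    qv = np.sum(dW**2)
    print(f"n = {n:>9,}: sum of (dW)^2 = {qv:.4f} (target: T = {T})")
```

The fluctuation of the sum shrinks like 1/√n, which is why the second-order term survives as a deterministic dt while all higher-order terms vanish.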

Geometric Brownian Motion models stock prices: dS_t = μ S_t dt + σ S_t dW_t

几何布朗运动(GBM)用于建模股票价格:dS_t = μ S_t dt + σ S_t dW_t

Itô's lemma

伊藤引理

In normal calculus, df = f'(x)dx. You Taylor-expand, and the (dx)^2 term is an infinitesimally small higher-order term, so you drop it.

在普通微积分里,df = f'(x)dx。你做泰勒展开,然后 (dx)^2 项是无穷小的高阶项,于是你把它扔掉。

But when x is a stochastic process, (dW_t)^2 = dt is first order. You can't drop it.

但当 x 是随机过程时,(dW_t)^2 = dt 是一阶项。你不能扔。

Itô's lemma: for f(X_t, t) where dX_t = μ_t dt + σ_t dW_t,

df = (∂f/∂t + μ_t ∂f/∂x + (1/2) σ_t^2 ∂^2f/∂x^2) dt + σ_t ∂f/∂x dW_t

伊藤引理:对 f(X_t, t),其中 dX_t = μ_t dt + σ_t dW_t,

df = (∂f/∂t + μ_t ∂f/∂x + (1/2) σ_t^2 ∂^2f/∂x^2) dt + σ_t ∂f/∂x dW_t

Apply it to an option price and you get Black-Scholes. That formula is the engine behind the entire derivatives industry.

把它应用到期权价格上,你就能得到 Black-Scholes。这条公式就是整个衍生品行业的发动机。

Deriving Black-Scholes from scratch

从零推导 Black-Scholes

Follow along with pen and paper.

拿起纸笔跟着推。

Step 1: Let V(S,t) be an option price. Apply Itô's lemma with dS = μS dt + σS dW_t:

dV = (∂V/∂t + μS ∂V/∂S + (1/2) σ^2 S^2 ∂^2V/∂S^2) dt + σS ∂V/∂S dW_t

Step 1: 令 V(S,t) 为期权价格。应用伊藤引理(dS = μS dt + σS dW_t):

dV = (∂V/∂t + μS ∂V/∂S + (1/2) σ^2 S^2 ∂^2V/∂S^2) dt + σS ∂V/∂S dW_t

Step 2: Construct a delta-hedged portfolio Π = V − (∂V/∂S)·S. Compute dΠ:

dΠ = (∂V/∂t + (1/2) σ^2 S^2 ∂^2V/∂S^2) dt

Step 2: 构造一个 delta 对冲组合 Π = V − (∂V/∂S)·S。计算 dΠ:

dΠ = (∂V/∂t + (1/2) σ^2 S^2 ∂^2V/∂S^2) dt

The dW_t​ terms cancel perfectly. The portfolio is locally riskless.

dW_t​ 项会被完美抵消。该组合在局部是无风险的。

Step 3: A riskless portfolio must earn the risk-free rate: dΠ = rΠ dt.

Step 3: 无风险组合必须赚取无风险利率:dΠ = rΠ dt。

Step 4: Substitute dΠ and Π, then rearrange:

∂V/∂t + (1/2) σ^2 S^2 ∂^2V/∂S^2 + rS ∂V/∂S − rV = 0

Step 4: 代入并整理:

∂V/∂t + (1/2) σ^2 S^2 ∂^2V/∂S^2 + rS ∂V/∂S − rV = 0

This is the Black-Scholes PDE.

这就是 Black-Scholes PDE

Notice what happened - the drift μ vanished. The option price doesn't depend on the expected return of the stock. Risk preferences don't matter. You can price options as if everyone is risk-neutral. The first time this sinks in, it's genuinely mind-bending.

注意发生了什么——漂移项 μ 消失了。期权价格不依赖股票的期望收益。风险偏好不重要。你可以像所有人都是风险中性那样为期权定价。第一次真正理解这一点,会让人直呼离谱。

Solving this PDE for a European call with strike K and expiry T gives:

C = S·N(d1) − K·e^(−rT)·N(d2), where d1 = [ln(S/K) + (r + σ^2/2)T] / (σ√T) and d2 = d1 − σ√T

解这个 PDE(执行价 K、到期 T 的欧式看涨)得到:

C = S·N(d1) − K·e^(−rT)·N(d2),其中 d1 = [ln(S/K) + (r + σ^2/2)T] / (σ√T),d2 = d1 − σ√T

The Greeks

The Greeks

  • Delta Δ: How much the option moves per $1 stock move. Your hedge ratio.
  • Delta Δ 是股票每变动 $1,期权变动多少。你的对冲比率。
  • Gamma Γ​: How fast delta changes. Your convexity exposure.
  • Gamma Γ​:delta 变化的速度。你的凸性敞口。
  • Theta Θ: Time decay. Typically negative for long options.
  • Theta Θ:时间价值衰减。对做多期权通常为负。
  • Vega V: Sensitivity to volatility. Where most derivatives money is made.
  • Vega V:对波动率的敏感度。衍生品赚钱的大头往往在这里。
  • Rho ρ: Sensitivity to interest rates.
  • Rho ρ:对利率的敏感度。

Delta tells you your hedge ratio. Gamma tells you how often to re-hedge. Theta is the cost of holding. Vega is the bread and butter of vol trading desks.

Delta 告诉你对冲比率。Gamma 告诉你多久需要重新对冲。Theta 是持有成本。Vega 是波动率交易台的饭碗。

Level 5 homework (6-8 weeks - the hardest level): 1. Read Shreve, Stochastic Calculus for Finance II. The gold standard. 2. Alternative Arguin, A First Course in Stochastic Calculus (newer, more accessible). 3. Derive Apply Itô's lemma to f(S)=ln(S) where S follows GBM. Get the −σ^2/2 drift correction. 4. Derive The full Black-Scholes equation from the delta-hedging argument. 5. Code Black-Scholes from scratch. Compare to Monte Carlo. Verify convergence.

Level 5 作业(6-8 周——最难的一关): 1. 阅读 Shreve, Stochastic Calculus for Finance II。金标准。 2. 备选 Arguin, A First Course in Stochastic Calculus(更新、更易读)。 3. 推导 对 f(S)=ln(S) 应用伊藤引理,其中 S 服从 GBM。推出 −σ^2/2 漂移修正项。 4. 推导 用 delta 对冲论证推导完整 Black-Scholes 方程。 5. 编程 从零实现 Black-Scholes。与 Monte Carlo 比较。验证收敛性。
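Homework item 3 can also be checked numerically. This sketch (illustrative μ, σ, and path counts; Euler discretization of the GBM SDE) verifies that the mean of ln S_T lands on ln S_0 + (μ − σ^2/2)T, not on ln S_0 + μT:

```python
import numpy as np

# Ito's lemma on f(S) = ln(S) under GBM predicts
# d(ln S) = (mu - sigma^2/2) dt + sigma dW, so
# E[ln S_T] = ln S_0 + (mu - sigma^2/2) T  -- the -sigma^2/2 correction.
np.random.seed(42)
S0, mu, sigma, T = 100.0, 0.10, 0.30, 1.0
n_steps, n_paths = 252, 200_000
dt = T / n_steps

# Euler-discretized GBM: S_{t+dt} = S_t * (1 + mu*dt + sigma*sqrt(dt)*Z)
S = np.full(n_paths, S0)
for _ in range(n_steps):
    S *= 1 + mu * dt + sigma * np.sqrt(dt) * np.random.standard_normal(n_paths)

empirical = np.log(S).mean()
ito = np.log(S0) + (mu - sigma**2 / 2) * T   # Ito's prediction
naive = np.log(S0) + mu * T                  # forgets the correction

print(f"empirical E[ln S_T]  : {empirical:.4f}")
print(f"Ito prediction       : {ito:.4f}")
print(f"naive (no -sigma^2/2): {naive:.4f}")
```

The gap between the naive and correct answers is exactly σ^2 T / 2, which is the second-order Taylor term that refuses to vanish.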

Polymarket

Polymarket

This is the most interesting market in the world right now, and the math behind it connects everything in this article: probability, information theory, convex optimization, integer programming.

这是当下世界上最有意思的市场,而它背后的数学把本文所有东西都串起来了:概率、信息论、凸优化、整数规划。

How LMSR prices beliefs

LMSR 如何为信念定价

The Logarithmic Market Scoring Rule (LMSR), invented by Robin Hanson, powers automated prediction markets. The cost function for n outcomes:

C(q) = b · ln( Σ_i exp(q_i / b) )

对数市场评分规则(LMSR)由 Robin Hanson 发明,为自动化预测市场提供动力。对 n 个结果,其成本函数是:

C(q) = b · ln( Σ_i exp(q_i / b) )

where q_i tracks outstanding shares of outcome i and b is the liquidity parameter. The price of outcome i:

p_i = exp(q_i / b) / Σ_j exp(q_j / b)

其中 q_i 追踪结果 i 的未平仓份额,b 是流动性参数。结果 i 的价格为:

p_i = exp(q_i / b) / Σ_j exp(q_j / b)

That's the softmax function - the same function powering every neural network classifier.

这就是 softmax 函数——驱动每一个神经网络分类器的函数。

Prices always sum to 1, always lie in (0,1), and always exist, providing infinite liquidity. The market maker's maximum loss is bounded at b · ln(n).

价格总和永远为 1,永远落在 (0,1) 内,并且永远存在,提供无限流动性。做市商的最大亏损上界为 b · ln(n)。
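The two formulas above fit in a few lines. A minimal sketch (liquidity parameter b and the 50-share trade are arbitrary illustration values): a trader pays the cost difference C(q_after) − C(q_before), and the price of the outcome they bought rises via the softmax:

```python
import numpy as np

# LMSR: cost C(q) = b * ln(sum_i exp(q_i / b)), prices = softmax(q / b)
b = 100.0

def cost(q):
    return b * np.log(np.sum(np.exp(q / b)))

def prices(q):
    e = np.exp(q / b)
    return e / e.sum()

q = np.zeros(3)                       # fresh 3-outcome market
print(prices(q))                      # uniform: each outcome at 1/3

# A trader buys 50 shares of outcome 0 and pays the cost difference
q_after = q + np.array([50.0, 0.0, 0.0])
trade_cost = cost(q_after) - cost(q)
print(f"trade cost: {trade_cost:.4f}")
print(prices(q_after))                # outcome 0 now priced above 1/3

# The market maker's worst-case loss is bounded by b * ln(n)
print(f"max loss bound: {b * np.log(3):.4f}")
```

For large q/b a production implementation would use a log-sum-exp trick for numerical stability; the toy values here stay well-behaved.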

The Quant Career Landscape

量化职业版图

4 archetypes: Quant Researcher The most-cracked guy who finds patterns in petabytes, builds predictive models, designs strategies. Needs PhD-level math/stats/ML, or exceptional undergraduate achievement. At firms like Jane Street, QRs work with tens of thousands of GPUs.

4 种原型: 量化研究员(Quant Researcher) 最顶的那类人:在 PB 级数据里找模式、建预测模型、设计算法策略。需要博士级的数学/统计/ML,或极其突出的本科能力。在 Jane Street 这类公司,QR 会用上成千上万块 GPU。

Quant Developer/Engineer The mid-cracked guy, mostly the builder. Trading platforms, execution engines, real-time data pipelines. Makes the researcher's model actually trade. Needs production C++/Rust/Python, low-latency systems.

量化开发/工程师(Quant Developer/Engineer) 中间那档的强者,主要是“建造者”:交易平台、执行引擎、实时数据管道。把研究员的模型变成真正能交易的系统。需要生产级 C++/Rust/Python,以及低延迟系统能力。

Quant Trader Either the biggest degen or the most-cracked guy, mostly the decision-maker. Runs capital, manages risk, makes real-time calls. Highest compensation variance - eight figures in exceptional years.

量化交易员(Quant Trader) 要么是最猛的赌徒,要么是最顶的强者,主要是决策者:跑资金、管风险、实时拍板。薪酬波动最大——极端年份能到八位数。

Risk Quant The most-cracked guy or just an insanely experienced corporate guy, mostly the guardian. Model validation, VaR, stress testing, regulatory compliance. Steadier career, lower ceiling. The emerging AI/ML Quant role (signal generation with deep learning) is the fastest-growing, with hiring up 88% year-over-year in 2025.

风险量化(Risk Quant) 要么是最顶的强者,要么是经验极其老到的“企业老手”,主要是守门人:模型验证、VaR、压力测试、合规监管。职业更稳、上限更低。正在兴起的 AI/ML Quant 角色(用深度学习做信号生成)增长最快,2025 年招聘同比增长 88%。

What it pays:

薪酬大概是这样:

Top Tier (Jane Street, Citadel, HRT)
New grad: $300K-$500K+ total comp
Mid career (3-7yr): $550K-$950K
Senior (8+yr): $1M-$3M+
Star trader/PM: $3M-$30M+
Mid Tier (Two Sigma, DE Shaw)
New grad: $250K–$350K
Mid career (3-7yr): $350K–$625K
Senior (8+yr): $575K–$1.2M
Star trader/PM: idk

顶级(Jane Street, Citadel, HRT) 应届 $300K-$500K+ 总包
中期(3-7 年)$550K-$950K
资深(8+ 年)$1M-$3M+
明星交易员/PM $3M-$30M+
中档(Two Sigma, DE Shaw) 应届 $250K–$350K
中期(3-7 年) $350K–$625K
资深(8+ 年) $575K–$1.2M
明星交易员/PM 不知道

Jane Street's average employee compensation was reported at $1.4 million/year in H1 2025. That's the average, though.

据报道,Jane Street 在 2025 年上半年的人均薪酬为每年 $1.4 million。那是平均值,不过

The interview gauntlet

面试地狱

Resume screen -> Online assessment (mental math via Zetamac - target 50+, logic puzzles) -> Phone screen (probability problems, betting games) -> Superday (3-5 back-to-back interviews, mock trading, coding, whiteboard derivations).

简历筛选 -> 在线测评(Zetamac 心算——目标 50+;逻辑题) -> 电话面(概率题、下注游戏) -> Superday(连续 3-5 场面试,模拟交易、编程、白板推导)。

Jane Street gives problems intentionally too hard to solve alone - they test how you use hints and collaborate.

Jane Street 会故意出一些你一个人解不出来的题——他们测试的是你如何利用提示、如何协作。

Over two-thirds of their recent intern class studied CS; over a third studied math. Finance knowledge generally not required.

他们最近的实习生里,超过三分之二学的是 CS;超过三分之一学的是数学。一般不要求金融知识。

The #1 prep resource: Xinfeng Zhou's Green Book (A Practical Guide to Quantitative Finance Interviews) - 200+ real problems. Supplement with QuantGuide.io ("LeetCode for quants"), Brainstellar, and Jane Street's Figgie card game.

头号备考资源:Xinfeng Zhou 的 Green Book(A Practical Guide to Quantitative Finance Interviews)——200+ 道真实题。再补充 QuantGuide.io(“量化版 LeetCode”)、Brainstellar、以及 Jane Street 的 Figgie 纸牌游戏。

The Complete Toolbox

完整工具箱

Python stack
Data: pandas, polars (Polars is 10-50x faster on large datasets)
Numerics: numpy, scipy
ML (tabular): xgboost, lightgbm, catboost
ML (deep): pytorch
Optimization: cvxpy
Derivatives: QuantLib (industry-grade, C++ backend)
Stats: statsmodels
Backtesting: NautilusTrader
Backtesting (simpler): backtrader, vectorbt (easier starting point)
Quant research: Microsoft Qlib (17K+ stars, AI-oriented)
RL for trading: FinRL (10K+ stars)

Python 技术栈
数据:pandas, polars(Polars 在大数据集上快 10-50 倍)
数值计算:numpy, scipy
ML(表格数据):xgboost, lightgbm, catboost
ML(深度):pytorch
优化:cvxpy
衍生品:QuantLib(工业级,C++ 后端)
统计:statsmodels
回测:NautilusTrader
回测(更简单):backtrader, vectorbt(更容易上手)
量化研究:Microsoft Qlib(17K+ stars,AI 导向)
交易强化学习:FinRL(10K+ stars)

C++ and Rust Tbh I don't know much about this area. This is what I've found: C++ libraries: QuantLib, Eigen, Boost. Rust: RustQuant for option pricing, NautilusTrader as the Rust+Python paradigm (Rust core for speed, Python API for research).

C++ 与 Rust 说实话我对这块啥也不懂。这是我找到的: C++ 库:QuantLib, Eigen, Boost。 Rust:RustQuant(期权定价),以及 NautilusTrader 的 Rust+Python 范式(Rust 内核提速,Python API 做研究)。

Data sources Free: yfinance, Finnhub (60 calls/min), Alpha Vantage. Mid-range: Polygon.io ($199/mo, sub-20ms latency), Tiingo. Enterprise: Bloomberg Terminal (~$32K/yr), Refinitiv, FactSet. Blockchain: Alchemy (free tier with archive access).

数据源 免费:yfinance, Finnhub(60 次调用/分钟), Alpha Vantage。 中档:Polygon.io($199/月,<20ms 延迟), Tiingo。 企业级:Bloomberg Terminal(约 $32K/年), Refinitiv, FactSet。 区块链:Alchemy(免费层含归档访问)。

Solvers Gurobi: Fastest commercial MIP solver, free academic license. Essential for combinatorial arbitrage. Google OR-Tools: Strongest free solver. PuLP/Pyomo: Python modeling interfaces.

求解器 Gurobi:最快的商业 MIP 求解器,有免费的学术许可。做组合套利必备。 Google OR-Tools:最强免费求解器。 PuLP/Pyomo:Python 建模接口。

The Reading List (In Order)

阅读清单(按顺序)

Mathematics

数学

  1. Blitzstein & Hwang - Introduction to Probability (free PDF from Harvard)
  1. Blitzstein & Hwang - Introduction to Probability(哈佛提供免费 PDF)
  2. Strang - Introduction to Linear Algebra + MIT 18.06 lectures
  2. Strang - Introduction to Linear Algebra + MIT 18.06 lectures
  3. Wasserman - All of Statistics
  3. Wasserman - All of Statistics
  4. Boyd & Vandenberghe - Convex Optimization (free PDF from Stanford)
  4. Boyd & Vandenberghe - Convex Optimization(斯坦福提供免费 PDF)
  5. Shreve - Stochastic Calculus for Finance I & II
  5. Shreve - Stochastic Calculus for Finance I & II

Quant finance

量化金融

  1. Hull - Options, Futures, and Other Derivatives
  1. Hull - Options, Futures, and Other Derivatives
  2. Natenberg - Option Volatility and Pricing
  2. Natenberg - Option Volatility and Pricing
  3. López de Prado - Advances in Financial Machine Learning
  3. López de Prado - Advances in Financial Machine Learning
  4. Ernest Chan - Quantitative Trading
  4. Ernest Chan - Quantitative Trading
  5. Zuckerman - The Man Who Solved the Market
  5. Zuckerman - The Man Who Solved the Market

Interview prep

面试准备

  1. Zhou - A Practical Guide to Quantitative Finance Interviews (Green Book #1)
  1. Zhou - A Practical Guide to Quantitative Finance Interviews(Green Book #1)
  2. Crack - Heard on the Street
  2. Crack - Heard on the Street
  3. Joshi - Quant Job Interview Questions
  3. Joshi - Quant Job Interview Questions

Competitions

竞赛

  • Jane Street Kaggle ($100K prize)
  • Jane Street Kaggle($100K 奖金)
  • WorldQuant BRAIN (100K+ users, pays for alpha signals)
  • WorldQuant BRAIN(100K+ 用户,为 alpha 信号付费)
  • Citadel Datathon (fast-track to employment)
  • Citadel Datathon(快速通道拿 offer)
  • Jane Street monthly puzzles (above interview difficulty)
  • Jane Street 每月谜题(难度高于面试)

Three things I wish I'd known earlier

我更早知道就好了的三件事

Estimation error is the real enemy. Full Kelly betting, unconstrained Markowitz, ML models with too many features - they all fail for the same reason: overfitting NOISY BS in parameter estimates.

估计误差才是真正的敌人。 满 Kelly 下注、无约束 Markowitz、特征太多的 ML 模型——它们都会因为同一个原因失败:在参数估计里对 NOISY BS 过拟合。

The math works perfectly with true parameters. You never have true parameters. The gap between theory and practice is always estimation error, and the best quants are the ones who respect it.

在真实参数存在时,数学完美无缺。但你永远拿不到真实参数。理论与实践之间的差距永远是估计误差,而最好的量化就是那些尊重这一点的人。

Tools have democratized. Conviction hasn't. Anyone can access QuantLib, Polygon.io, and PyTorch. Technology is necessary but not sufficient. Edge lives in unique data, unique models, or unique execution - not better pip installs.

工具被民主化了。确信没有。 任何人都能用 QuantLib、Polygon.io、PyTorch。技术是必要条件,但远远不够。优势来自独特数据、独特模型或独特执行——不是更会 pip install

The math is the moat. AI can write code and suggest strategies. But the ability to derive why Itô's lemma has an extra term, to prove that discounted prices are martingales under the risk-neutral measure, to know when a convex relaxation is tight versus loose in a combinatorial market - that mathematical fluency separates quants who build edge from quants who borrow it. And borrowed edge expires.

数学是护城河 AI 能写代码、能给策略建议。但能推导出为什么伊藤引理多了一项,能证明在风险中性测度下贴现价格是鞅,能判断组合市场里某个凸松弛到底是紧的还是松的——这种数学流利度,才把“自己打造优势的量化”和“借来优势的量化”区分开。而借来的优势会过期。

What comes in Part 2

Part 2 会讲什么

Part 2 covers: exotic derivatives (barriers, Asians, lookbacks), stochastic volatility (Heston model calibration), jump-diffusion (Merton), advanced measure theory (martingale representation, optional stopping), stochastic control for optimal execution (Almgren-Chriss), reinforcement learning for market making, transformer architectures for financial time series, FPGA trading infrastructure, WebSocket feeds, parallel execution, Frank-Wolfe with Gurobi for combinatorial arbitrage across thousands of conditions.

Part 2 覆盖:奇异期权(障碍、亚式、回望)、随机波动率(Heston 模型校准)、跳跃扩散(Merton)、更高阶测度论(鞅表示、可选停止)、最优执行的随机控制(Almgren-Chriss)、做市的强化学习、面向金融时间序列的 Transformer 架构、FPGA 交易基础设施、WebSocket 行情、并行执行、用 Gurobi 做跨上千条件的组合套利(Frank-Wolfe)。

The math gets harder. The paycheck gets longer.

数学更难。薪水更长

In 2025, entry-level quants at top firms pulled $300K-$500K total comp.

AI/ML hiring in finance grew 88% year-over-year.

This article is everything I wish someone had handed me when I started: my path, laid out in the exact order you should learn it.

The path is like layers of a video game, where you can't skip levels.

Every concept builds on the last. But if you put in real work - actual problem-solving work, not watching some lame ahh YouTube videos about finance (that's just wasting your time) - you can go from knowing nothing to being something in about 18 months.

Disclaimer: Not Financial Advice & Do Your Own Research & Markets involve risk. My own project - @coldvisionXYZ

Forget everything you think you know about trading

Most people think quantitative trading is about picking stocks. Having opinions on Tesla. Predicting earnings.

Quant trading is about math.

You are mostly working with statistical relationships, pricing inefficiencies, and structural edges that exist because markets are complex systems run by humans who make systematic errors.

Part I: Probability is the Language of Uncertainty

Everything in quantitative finance kinda reduces to one question:

What are the odds, and are the odds in my favor?

That's probability. If you don't understand probability at a deep level, nothing else in this article matters.

Conditional thinking

Most people think in absolutes. Something is true or it isn't. Quants think in conditionals. Given what I know, how likely is this?

The probability of A given B equals the probability of both happening divided by the probability of B: P(A|B) = P(A and B) / P(B). Profound implications.

A stock goes up 60% of days - that's the base rate. But on days when volume is above average, it goes up 75% of the time.

That conditional probability is NOT BS. The raw 60% is NOISY BS.
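The volume example above can be simulated directly. This sketch (invented numbers chosen to match the 60%/75% example: 40% of days get high volume) shows how the conditional probability separates from the base rate:

```python
import numpy as np

# Simulate the example above: up-days are more likely when volume is high
np.random.seed(1)
n = 100_000
high_volume = np.random.rand(n) < 0.4         # 40% of days have high volume
p_up = np.where(high_volume, 0.75, 0.50)      # P(up | high volume) = 0.75
up = np.random.rand(n) < p_up

base_rate = up.mean()                         # unconditional P(up)
conditional = up[high_volume].mean()          # P(up | high volume)
print(f"P(up)               = {base_rate:.3f}")
print(f"P(up | high volume) = {conditional:.3f}")
```

The unconditional rate lands near 0.4 × 0.75 + 0.6 × 0.50 = 0.60, exactly the law of total probability at work.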

Bayes' theorem

Your updated belief equals

(how likely you'd see this data if your hypothesis were true) * (your prior belief) / (the total probability of seeing this data under any hypothesis). In symbols: P(H|D) = P(D|H) · P(H) / P(D).

The denominator sums over all hypotheses.

In practice, you compute this with Monte Carlo sampling.

But the logic is the same. Bayes is how you update your conviction in real time.

A model says a stock should be worth $50. Earnings come out, revenue is 3% above estimate. The Bayesian posterior shifts upward. The traders who update fastest and most accurately win bread.
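A discrete Bayesian updater (Level 1 homework item 3) is a few lines. The coin hypotheses below are a toy stand-in for "is this edge real?"; the grid and the 8-heads-in-10 observation are illustrative:

```python
import numpy as np
from scipy.stats import binom

def bayes_update(prior, likelihood):
    """Discrete Bayes: posterior is proportional to likelihood * prior."""
    posterior = likelihood * prior
    return posterior / posterior.sum()

# Three hypotheses about a coin's P(heads), equally likely a priori
p_grid = np.array([0.3, 0.5, 0.7])
prior = np.array([1 / 3, 1 / 3, 1 / 3])

# New data arrives: 8 heads in 10 flips
likelihood = binom.pmf(8, 10, p_grid)
posterior = bayes_update(prior, likelihood)

for p, post in zip(p_grid, posterior):
    print(f"P(heads)={p}: posterior {post:.4f}")
```

Each new batch of data just feeds the posterior back in as the next prior, which is exactly the real-time updating loop described above.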

Expected value and variance your two best friends

Expected value is your conviction. Variance is your risk.

If your strategy has positive expected value and you can survive the variance, you likely will make money.

Level 1 homework (3-4 weeks at 2 hours/day): 1. Read Blitzstein & Hwang, Introduction to Probability (free PDF from Harvard). Every problem in Chapters 1-6. 2. Code Simulate 10,000 coin flips, verify the law of large numbers visually. 3. Code Implement a Bayesian updater: takes a prior and likelihood, returns a posterior.

Part II: Statistics

Once you speak probability, you need to learn to listen to data.

That's statistics, and the #1 lesson statistics teaches is: most of what looks like NOT BS is actually NOISY BS.

Hypothesis testing is the BS detector

You build a model. It backtests at 15% annual return. Is it real?

Set up H_0: "this strategy has zero expected return." Compute a test statistic. Calculate a p-value - the probability of seeing results this good if H_0 were true.

BUT if you test 1,000 random strategies, about 50 of them will show p-values below 0.05 purely by chance.

That's the multiple comparisons problem.

Your fix is the Bonferroni correction: divide your significance threshold by the number of tests. Or use Benjamini-Hochberg for false discovery rate control.

Every single beginner massively overestimates how much NOT BS they've found. Your first 10 strategies will all be NOISY BS. Accept this now and save yourself a lot of money.
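The multiple comparisons trap can be demonstrated in a few lines. This sketch (illustrative sizes: 1,000 pure-noise "strategies" over 252 trading days) t-tests each one and counts the false discoveries, then applies the Bonferroni threshold from above:

```python
import numpy as np
from scipy import stats

# 1,000 "strategies" that are pure noise: zero true edge by construction
np.random.seed(42)
n_strategies, n_days = 1000, 252
returns = np.random.normal(0.0, 0.01, size=(n_strategies, n_days))

# t-test each strategy against H0: mean daily return = 0
_, pvals = stats.ttest_1samp(returns, 0.0, axis=1)

discoveries = int(np.sum(pvals < 0.05))
survivors = int(np.sum(pvals < 0.05 / n_strategies))  # Bonferroni threshold
print(f"'significant' at p < 0.05    : {discoveries} of {n_strategies}")
print(f"survive Bonferroni correction: {survivors} of {n_strategies}")
```

Roughly 5% of the noise "passes" the naive test; almost nothing survives the corrected threshold, which is the whole point.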

Regression: decomposing returns

Linear regression y = Xβ + ϵ is the workhorse. In finance, you regress your strategy's returns against known risk factors: r_t = α + β_1 F_1,t + ... + β_k F_k,t + ϵ_t

The intercept α is your alpha - the return that can't be explained by known factors. If α is zero after accounting for factors, your "edge" is just disguised market exposure.

The OLS estimator: β̂ = (X^T X)^(-1) X^T y

The most important number is α. Use Newey-West standard errors: financial data has autocorrelation and heteroskedasticity, so default OLS standard errors are wrong. Using them is like driving with a cracked windshield.
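A toy factor regression makes the decomposition concrete. This sketch uses invented numbers (two synthetic factors, a planted 2bp/day alpha) and iid noise, so plain OLS errors are fine here; real return data would need the Newey-West correction just mentioned:

```python
import numpy as np

# Toy factor regression: returns built from two known factors plus a true alpha
np.random.seed(7)
n = 1000
factors = np.random.normal(0, 0.01, size=(n, 2))  # e.g. market and value factors
true_beta = np.array([1.2, 0.4])
true_alpha = 0.0002                               # 2bp/day of genuine edge
y = true_alpha + factors @ true_beta + np.random.normal(0, 0.005, size=n)

# OLS: beta_hat = (X'X)^{-1} X'y, with a column of ones so the intercept is alpha
X = np.column_stack([np.ones(n), factors])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

print(f"alpha_hat: {beta_hat[0]:.6f}  (true: {true_alpha})")
print(f"beta_hat : {np.round(beta_hat[1:], 3)}  (true: {true_beta})")
```

If you drop the column of ones, factor exposure leaks into the intercept estimate, which is the mechanical version of "disguised market exposure".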

Maximum Likelihood Estimation

Given data x_1, …, x_n from a model with parameter θ, maximize the log-likelihood: θ̂ = argmax_θ Σ_i ln f(x_i; θ)

Set the derivative to zero and solve. (or it's over gng)

MLE is how you calibrate every model in finance: fit a GARCH model to volatility, estimate jump-diffusion parameters, calibrate option pricing to market quotes.

It's asymptotically efficient: no other consistent estimator has lower variance for large samples (the Cramér-Rao lower bound).

When someone at a firm says they're "calibrating" a model, they almost always mean MLE.

Level 2 homework (4-5 weeks): 1. Read Wasserman, All of Statistics, Chapters 1-13. 2. Code Download real stock returns with yfinance. Test normality (they'll fail). Fit a t-distribution via MLE. Compare. 3. Code Run a Fama-French 3-factor regression on a stock portfolio using statsmodels. 4. Code Implement a permutation test: shuffle dates 10,000 times, compare shuffled performance to actual.
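Homework item 4 can be sketched like this (an assumed setup, since the homework doesn't fix one: the "strategy" holds sign(signal) each day, and the null hypothesis re-pairs the signal with returns at random dates, destroying any real alignment):

```python
import numpy as np

# Permutation test: is the signal/return alignment better than chance?
np.random.seed(3)
n_days = 252
r = np.random.normal(0, 0.01, size=n_days)
signal = np.sign(r + np.random.normal(0, 0.02, size=n_days))  # noisy but real signal
observed = np.mean(signal * r)                                # actual strategy mean pnl

# Null distribution: shuffle the dates of the signal 10,000 times
null_means = np.array([
    np.mean(np.random.permutation(signal) * r) for _ in range(10_000)
])
p_value = np.mean(null_means >= observed)
print(f"observed mean pnl  : {observed:.5f}")
print(f"permutation p-value: {p_value:.4f}")
```

Because the signal here genuinely predicts returns, the observed mean sits far out in the shuffled distribution; feed in a noise signal and the p-value drifts back toward uniform.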

import numpy as np
import cvxpy as cp

# ============================================
# Markowitz optimization with cvxpy
# ============================================
np.random.seed(42)
n_assets = 10
mu = np.random.uniform(0.04, 0.15, n_assets)
A = np.random.randn(n_assets, n_assets) * 0.1
cov = A @ A.T + np.eye(n_assets) * 0.01

w = cp.Variable(n_assets)
objective = cp.Minimize(cp.quad_form(w, cov))
constraints = [
    mu @ w >= 0.08,      # minimum return
    cp.sum(w) == 1,       # fully invested
    w >= -0.1,            # max 10% short
    w <= 0.3              # max 30% long
]

prob = cp.Problem(objective, constraints)
prob.solve()

ret = mu @ w.value
vol = np.sqrt(w.value @ cov @ w.value)
sharpe = (ret - 0.03) / vol

print(f"Portfolio return:  {ret:.4f}")
print(f"Portfolio vol:     {vol:.4f}")
print(f"Sharpe ratio:      {sharpe:.4f}")
print(f"Weights: {np.round(w.value, 4)}")

Part III: Linear Algebra

Linear algebra sounds boring. It's the machinery that runs everything: portfolio construction, PCA, neural networks, covariance estimation, factor models. You cannot be a quant without being fluent in matrices.

import numpy as np
from scipy.stats import norm

def black_scholes(S, K, T, r, sigma, option_type='call'):
    d1 = (np.log(S/K) + (r + sigma**2/2)*T) / (sigma*np.sqrt(T))
    d2 = d1 - sigma*np.sqrt(T)
    if option_type == 'call':
        return S*norm.cdf(d1) - K*np.exp(-r*T)*norm.cdf(d2)
    else:
        return K*np.exp(-r*T)*norm.cdf(-d2) - S*norm.cdf(-d1)

def monte_carlo_option(S0, K, T, r, sigma, n_sims=500_000):
    """Price via risk-neutral simulation (drift = r, not mu)"""
    Z = np.random.standard_normal(n_sims)
    ST = S0 * np.exp((r - sigma**2/2)*T + sigma*np.sqrt(T)*Z)
    payoffs = np.maximum(ST - K, 0)
    price = np.exp(-r*T) * np.mean(payoffs)
    stderr = np.exp(-r*T) * np.std(payoffs) / np.sqrt(n_sims)
    return price, stderr

def greeks(S, K, T, r, sigma):
    d1 = (np.log(S/K) + (r + sigma**2/2)*T) / (sigma*np.sqrt(T))
    d2 = d1 - sigma*np.sqrt(T)
    return {
        'delta': norm.cdf(d1),
        'gamma': norm.pdf(d1) / (S * sigma * np.sqrt(T)),
        'theta': -(S*norm.pdf(d1)*sigma)/(2*np.sqrt(T)) - r*K*np.exp(-r*T)*norm.cdf(d2),
        'vega':  S * np.sqrt(T) * norm.pdf(d1),
        'rho':   K * T * np.exp(-r*T) * norm.cdf(d2),
    }

# Verify: Monte Carlo converges to Black-Scholes
S, K, T, r, sigma = 100, 105, 1.0, 0.05, 0.2

bs = black_scholes(S, K, T, r, sigma)
mc, err = monte_carlo_option(S, K, T, r, sigma)
g = greeks(S, K, T, r, sigma)

print(f"Black-Scholes: ${bs:.4f}")
print(f"Monte Carlo:   ${mc:.4f} ± {err:.4f}")
print(f"Difference:    ${abs(bs - mc):.4f}\n")
for name, val in g.items():
    print(f"  {name:>6}: {val:.6f}")

Thinking in matrices

A covariance matrix Σ captures how every asset moves relative to every other asset. For 500 stocks, Σ is 500×500 with 125,250 unique entries. Portfolio variance collapses to a single expression: σ_p^2 = w^T Σ w

This quadratic form is the core of Markowitz portfolio theory, of risk management, of everything.

Eigenvalues are what actually matters in a universe of stocks

Look at a 500-stock universe and the first 5 eigenvectors explain 70% of all variance. Everything else is NOISY BS.

The first time you use eigendecomposition, the whole world changes. This is dimensionality reduction, and it's the foundation of factor investing.
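You can build the effect yourself. This sketch uses synthetic data (invented sizes: 500 "stocks" driven by 3 common factors plus idiosyncratic noise) and shows a handful of eigenvectors of the covariance matrix soaking up most of the variance:

```python
import numpy as np

# Factor structure: returns = factor_returns @ loadings + idiosyncratic noise
np.random.seed(0)
n_days, n_stocks, n_factors = 1000, 500, 3
factor_returns = np.random.normal(0, 0.01, size=(n_days, n_factors))
loadings = np.random.normal(0, 1.0, size=(n_factors, n_stocks))
idio = np.random.normal(0, 0.005, size=(n_days, n_stocks))
returns = factor_returns @ loadings + idio

cov = np.cov(returns, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigenvalues, sorted descending

explained = eigvals[:5].sum() / eigvals.sum()
print(f"variance explained by top 5 eigenvectors: {explained:.1%}")
```

With real S&P 500 returns (homework item 3) the split is messier, but the same few-large-eigenvalues-plus-noise-floor shape appears.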

Level 3 homework (4-6 weeks): 1. Watch Gilbert Strang's MIT 18.06 lectures - all of them. Non-negotiable. 2. Read Strang, Introduction to Linear Algebra. Do the problems. 3. Code PCA decomposition of S&P 500 returns. Plot eigenvalue spectrum. Identify top 3 components. 4. Code Markowitz mean-variance optimization from scratch.

import numpy as np
import matplotlib.pyplot as plt

# Law of large numbers: running average converges to true probability
np.random.seed(42)
flips = np.random.choice([0, 1], size=10000, p=[0.5, 0.5])
running_avg = np.cumsum(flips) / np.arange(1, 10001)

plt.figure(figsize=(10, 4))
plt.plot(running_avg, linewidth=0.7)
plt.axhline(y=0.5, color='r', linestyle='--', label='True probability')
plt.xlabel('Number of flips')
plt.ylabel('Running average')
plt.title('Law of Large Numbers in Action')
plt.legend()
plt.savefig('lln.png', dpi=150)
print(f"After 10,000 flips: {running_avg[-1]:.4f} (true: 0.5000)")

Part IV: Calculus & Optimization

Calculus is the language of change. In finance, everything changes: prices, volatilities, correlations, the entire probability distribution shifts second by second. Calculus describes and exploits those changes.

Derivatives (the math kind): appears in every neural network backpropagation and every Greek calculation.

Taylor expansion:

import numpy as np
from scipy import optimize, stats

# Demonstrate fat tails: MLE fit of Student-t to return data
np.random.seed(42)

# Simulate "realistic" returns (fat tails, slight positive drift)
true_df = 4
returns = stats.t.rvs(df=true_df, loc=0.0005, scale=0.015, size=1000)

def neg_log_likelihood(params, data):
    df, loc, scale = params
    if df <= 2 or scale <= 0:
        return 1e10
    return -np.sum(stats.t.logpdf(data, df=df, loc=loc, scale=scale))

result = optimize.minimize(
    neg_log_likelihood, x0=[5, 0, 0.01], args=(returns,),
    method='Nelder-Mead'
)
fitted_df, fitted_loc, fitted_scale = result.x

print(f"MLE degrees of freedom: {fitted_df:.2f} (true: {true_df})")
print(f"MLE location:           {fitted_loc:.6f}")
print(f"MLE scale:              {fitted_scale:.6f}")

# Normality test
_, p_normal = stats.normaltest(returns)
print(f"\nNormality test p-value: {p_normal:.2e}")
print(f"Reject normality? {'YES  fat tails confirmed' if p_normal < 0.05 else 'NO'}")

Delta hedging is the first-order approximation. Gamma hedging adds the second-order correction. And the reason Itô calculus differs from ordinary calculus is precisely because the second-order Taylor term doesn't vanish for random processes. Just Remember it

Level 4 homework (4-5 weeks): 1. Read Boyd & Vandenberghe, Convex Optimization (free PDF from Stanford), Chapters 1-5. 2. Code Implement gradient descent from scratch. Minimize the Rosenbrock function. 3. Code Solve a portfolio optimization problem with cvxpy including transaction cost constraints.

Part V: Stochastic Calculus

Before stochastic calculus, you're a data scientist who likes finance.

After it, you're a quant. QUANTATIVE FINANCE EXPERT, you heard?

This is where you learn to model randomness in continuous time, derive the Black-Scholes equation from first principles, and understand why the trillion-dollar derivatives market works the way it does.

Brownian motion pure randomness, formalized

A Brownian motion (Wiener process) W_t is a continuous-time random walk:

  • W_0 = 0

  • Increments W_t - W_s ~ N(0, t - s) for t > s

  • Non-overlapping increments are independent

  • Paths are continuous but nowhere differentiable

The critical insight that everything else depends on: dW_t has "size" dt, which means (dW_t)^2 = dt. This sounds like a technicality, but its the single most important fact in quantitative finance.

Geometric Brownian Motion models stock prices:

Itô's lemma

In normal calculus, df = f'(x)dx. You Taylor-expand, and the (dx)^2 term is infinitesimally small you drop it.

But when x is a stochastic process, (dW_t)^2 = dt is first order. You can't drop it.

Itô's lemma:

Apply it to an option price and you get Black-Scholes. Formula is the engine behind the entire derivatives industry.

Deriving Black-Scholes from scratch

Follow along with pen and paper.

Step 1: Let V(S,t) be an option price. Apply Itô's lemma:

Step 2: Construct a delta-hedged portfolio Π=V−∂S/∂V​⋅S. Compute dΠ:

The dW_t​ terms cancel perfectly. The portfolio is locally riskless.

Step 3: A riskless portfolio must earn the risk-free rate: dΠ=rΠ dtd\Pi = r\Pi \, dt dΠ=rΠdt.

Step 4: Substitute and rearrange:

This is the Black-Scholes PDE.

Notice what happened - the drift μ vanished. The option price doesn't depend on the expected return of the stock. Risk preferences don't matter. You can price options as if everyone is risk-neutral. The first time this sinks in genuinely mind-bending.

Solving this PDE for a European call with strike K and expiry T gives:

https://x.com/@coldvisionXYZ

The Greeks

  • Delta Δ is How much the option moves per $1 stock move. Your hedge ratio.

  • Gamma Γ​: How fast delta changes. Your convexity exposure.

  • Theta Θ: Time decay. Typically negative for long options.

  • Vega V: Sensitivity to volatility. Where most derivatives money is made.

  • Rho ρ: Sensitivity to interest rates.

Delta tells you your hedge ratio. Gamma tells you how often to re-hedge. Theta is the cost of holding. Vega is the bread and butter of vol trading desks.

Level 5 homework (6-8 weeks - the hardest level): 1. Read Shreve, Stochastic Calculus for Finance II. The gold standard. 2. Alternative Arguin, A First Course in Stochastic Calculus (newer, more accessible). 3. Derive Apply Itô's lemma to f(S)=ln⁡(S) where S follows GBM. Get the −σ^2/2. 4. Derive The full Black-Scholes equation from the delta-hedging argument. 5. Code Black-Scholes from scratch. Compare to Monte Carlo. Verify convergence.

Polymarket

This is the most interesting market in the world right now and the math behind it connects everything in this article: probability, information theory, convex optimization, integer programming

How LMSR prices beliefs

The Logarithmic Market Scoring Rule (LMSR), invented by Robin Hanson, powers automated prediction markets. The cost function for n outcomes:

where q_i​ tracks outstanding shares of outcome i and b is the liquidity parameter. The price of outcome i:

That's the softmax function - the same function powering every neural network classifier.

Prices always sum to 1, always lie in (0,1), and always exist, providing infinite liquidity. The market maker's maximum loss is bounded at b · ln(n).
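These properties are easy to verify in a few lines of Python (a minimal sketch; the liquidity parameter and share vectors are illustrative):

```python
import math

def lmsr_cost(q, b):
    """LMSR cost function C(q) = b * ln(sum_i exp(q_i / b)),
    computed with a max-shift for numerical stability."""
    m = max(q)
    return m + b * math.log(sum(math.exp((qi - m) / b) for qi in q))

def lmsr_prices(q, b):
    """Instantaneous prices: the softmax of q / b."""
    m = max(q)
    exps = [math.exp((qi - m) / b) for qi in q]
    total = sum(exps)
    return [e / total for e in exps]

def trade_cost(q, delta, b):
    """What a trader pays to buy delta[i] shares of each outcome:
    C(q + delta) - C(q)."""
    q_new = [qi + di for qi, di in zip(q, delta)]
    return lmsr_cost(q_new, b) - lmsr_cost(q, b)

b = 100.0                              # liquidity parameter (illustrative)
q = [0.0, 0.0]                         # fresh binary market
print(lmsr_prices(q, b))               # an untraded market prices 50/50
cost = trade_cost(q, [50.0, 0.0], b)   # buying YES pushes its price up
```

The worst-case loss bound follows directly: at q = 0 the cost is already b·ln(n), and the market maker can never pay out more than that above what traders put in.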

The Quant Career Landscape

4 archetypes:

Quant Researcher: The most-cracked guy who finds patterns in petabytes, builds predictive models, designs strategies. Needs PhD-level math/stats/ML, or exceptional undergraduate achievement. At firms like Jane Street, QRs work with tens of thousands of GPUs.

Quant Developer/Engineer: The mid-cracked guy, mostly the builder. Trading platforms, execution engines, real-time data pipelines. Makes the researcher's model actually trade. Needs production C++/Rust/Python, low-latency systems.

Quant Trader: Either the biggest degen or the most-cracked guy, mostly the decision-maker. Runs capital, manages risk, makes real-time calls. Highest compensation variance - eight figures in exceptional years.

Risk Quant: The most-cracked guy or just an insanely experienced corporate guy, mostly the guardian. Model validation, VaR, stress testing, regulatory compliance. Steadier career, lower ceiling. The emerging AI/ML Quant role - signal generation with deep learning - is the fastest-growing, with hiring up 88% year-over-year in 2025.

What it pays:

Top Tier (Jane Street, Citadel, HRT):

  • New grad: $300K-$500K+ total comp

  • Mid career (3-7yr): $550K-$950K

  • Senior (8+yr): $1M-$3M+

  • Star trader/PM: $3M-$30M+

Mid Tier (Two Sigma, DE Shaw):

  • New grad: $250K-$350K

  • Mid career (3-7yr): $350K-$625K

  • Senior (8+yr): $575K-$1.2M

  • Star trader/PM: idk

Jane Street's average employee compensation was reported at $1.4 million/year in H1 2025. That's the average, though.

The interview gauntlet

Resume screen -> Online assessment (mental math via Zetamac - target 50+, logic puzzles) -> Phone screen (probability problems, betting games) -> Superday (3-5 back-to-back interviews, mock trading, coding, whiteboard derivations).

Jane Street gives problems intentionally too hard to solve alone - they test how you use hints and collaborate.

Over two-thirds of their recent intern class studied CS; over a third studied math. Finance knowledge generally not required.

The #1 prep resource: Xinfeng Zhou's Green Book (A Practical Guide to Quantitative Finance Interviews) - 200+ real problems. Supplement with QuantGuide.io ("LeetCode for quants"), Brainstellar, and Jane Street's Figgie card game.

The Complete Toolbox

Python stack:

  • Data: pandas, polars (Polars is 10-50x faster on large datasets)

  • Numerics: numpy, scipy

  • ML (tabular): xgboost, lightgbm, catboost

  • ML (deep): pytorch

  • Optimization: cvxpy

  • Derivatives: QuantLib (industry-grade, C++ backend)

  • Stats: statsmodels

  • Backtesting: NautilusTrader; easier starting points: backtrader, vectorbt

  • Quant research: Microsoft Qlib (17K+ stars, AI-oriented)

  • RL for trading: FinRL (10K+ stars)

C++ and Rust: Tbh I don't know anything about this. Here's what I've found - C++ libraries: QuantLib, Eigen, Boost. Rust: RustQuant for option pricing, NautilusTrader as the Rust+Python paradigm (Rust core for speed, Python API for research).

Data sources:

  • Free: yfinance, Finnhub (60 calls/min), Alpha Vantage

  • Mid-range: Polygon.io ($199/mo, sub-20ms latency), Tiingo

  • Enterprise: Bloomberg Terminal (~$32K/yr), Refinitiv, FactSet

  • Blockchain: Alchemy (free tier with archive access)

Solvers:

  • Gurobi: fastest commercial MIP solver, free academic license. Essential for combinatorial arbitrage.

  • Google OR-Tools: strongest free solver.

  • PuLP/Pyomo: Python modeling interfaces.

The Reading List (In Order)

Mathematics

  1. Blitzstein & Hwang - Introduction to Probability (free PDF from Harvard)

  2. Strang - Introduction to Linear Algebra + MIT 18.06 lectures

  3. Wasserman - All of Statistics

  4. Boyd & Vandenberghe - Convex Optimization (free PDF from Stanford)

  5. Shreve - Stochastic Calculus for Finance I & II

Quant finance

  1. Hull - Options, Futures, and Other Derivatives

  2. Natenberg - Option Volatility and Pricing

  3. López de Prado - Advances in Financial Machine Learning

  4. Ernest Chan - Quantitative Trading

  5. Zuckerman - The Man Who Solved the Market

Interview prep

  1. Zhou - Practical Guide to Quantitative Finance Interviews (Green Book #1)

  2. Crack - Heard on the Street

  3. Joshi - Quant Job Interview Questions

Competitions

  • Jane Street Kaggle ($100K prize)

  • WorldQuant BRAIN (100K+ users, pays for alpha signals)

  • Citadel Datathon (fast-track to employment)

  • Jane Street monthly puzzles (above interview difficulty)

Three things I wish I'd known earlier

Estimation error is the real enemy. Full Kelly betting, unconstrained Markowitz, ML models with too many features - they all fail for the same reason: overfitting to NOISY BS in parameter estimates.

The math works perfectly with true parameters. You never have true parameters. The gap between theory and practice is always estimation error, and the best quants are the ones who respect it.
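A toy numpy illustration of that gap (all numbers hypothetical): perturb the expected-return estimates slightly and watch unconstrained mean-variance weights swing.

```python
import numpy as np

rng = np.random.default_rng(0)

# "True" parameters for 3 correlated assets (never observable in practice).
mu_true = np.array([0.05, 0.06, 0.055])
cov = 0.01 * np.eye(3) + 0.03 * np.ones((3, 3))   # variance 0.04, correlation 0.75

def markowitz_weights(mu, cov, risk_aversion=1.0):
    """Unconstrained mean-variance weights: w = Sigma^{-1} mu / gamma."""
    return np.linalg.solve(cov, mu) / risk_aversion

w_true = markowitz_weights(mu_true, cov)

# Add noise of about +-1% to the return estimates - tiny next to the 5-6%
# returns, but large next to the 0.5% cross-sectional differences the
# optimizer actually keys on. The "optimal" weights swing violently.
for _ in range(3):
    mu_hat = mu_true + rng.normal(0.0, 0.01, size=3)
    print(np.round(markowitz_weights(mu_hat, cov), 2))
```

With highly correlated assets, Σ⁻¹ has a large eigenvalue on the return-difference directions (here a factor of 100), so 1% estimation noise moves weights by order 1.0 - the estimation-error amplification the paragraph above describes.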

Tools have democratized. Conviction hasn't. Anyone can access QuantLib, Polygon.io, and PyTorch. Technology is necessary but not sufficient. Edge lives in unique data, unique models, or unique execution - not better pip installs.

The math is the moat. AI can write code and suggest strategies. But the ability to derive why Itô's lemma has an extra term, to prove that discounted prices are martingales under the risk-neutral measure, to know when a convex relaxation is tight versus loose in a combinatorial market - that mathematical fluency separates quants who build edge from quants who borrow it. And borrowed edge expires.

What comes in Part 2

Part 2 covers: exotic derivatives (barriers, Asians, lookbacks), stochastic volatility (Heston model calibration), jump-diffusion (Merton), advanced measure theory (martingale representation, optional stopping), stochastic control for optimal execution (Almgren-Chriss), reinforcement learning for market making, transformer architectures for financial time series, FPGA trading infrastructure, WebSocket feeds, parallel execution, Frank-Wolfe with Gurobi for combinatorial arbitrage across thousands of conditions.

The math gets harder. The paycheck gets longer.
