---
created: "2026-04-06"
type: resource
tags: [resource, claude-code, AI-tools, ralphinho, RFC, DAG, multi-agent, orchestration, ECC]
source: "~/.claude/skills/ralphinho-rfc-pipeline/SKILL.md"
---

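# Ralphinho RFC-DAG Orchestration Pattern

The most complex of the autonomous loop patterns. It decomposes an RFC/PRD into a dependency DAG, executes the DAG layer by layer with parallelism inside each layer, runs every unit through a tiered quality pipeline, and lands the results through a merge queue. Created by enitrat.

Related notes: [[Autonomous Loops 自主循环模式]], [[dmux 多Agent并行编排]], [[Autonomous Agent Harness 自主代理框架]], [[ECC 编排替代方案 (orchestrate 迁移)]]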

## Architecture Overview

```
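RFC document
  |
  v
AI decomposes it into WorkUnits (with a dependency DAG)
  |
  v
RALPH LOOP (up to 3 passes)
  |
  +-- Execute by DAG layer (parallel within a layer):
  |     each unit in its own worktree:
  |     Research -> Plan -> Implement -> Test -> Review
  |     (depth tiered by complexity)
  |
  +-- Merge queue:
        Rebase onto main -> Run tests -> Land or Evict
        evicted units re-enter with their conflict context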
```

## WorkUnit Definition

```typescript
interface WorkUnit {
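  id: string;              // kebab-case identifier
  name: string;            // human-readable name
  rfcSections: string[];   // which RFC sections this unit covers
  description: string;     // detailed description
  deps: string[];          // dependencies (IDs of other units)
  acceptance: string[];    // concrete acceptance criteria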
  tier: "trivial" | "small" | "medium" | "large";
}
```
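
A decomposition can be sanity-checked before anything runs. The sketch below is illustrative only: `validateUnits` and its messages are hypothetical helpers, not part of the Ralphinho skill, and the sample acceptance criteria are made up. The interface is repeated so the block stands alone.

```typescript
type Tier = "trivial" | "small" | "medium" | "large";

interface WorkUnit {
  id: string;
  name: string;
  rfcSections: string[];
  description: string;
  deps: string[];
  acceptance: string[];
  tier: Tier;
}

// Hypothetical helper: returns a list of problems; an empty list means the plan looks sane.
function validateUnits(units: WorkUnit[]): string[] {
  const problems: string[] = [];
  const ids = new Set<string>();
  const kebab = /^[a-z0-9]+(-[a-z0-9]+)*$/;

  for (const u of units) {
    if (!kebab.test(u.id)) problems.push(`${u.id}: id is not kebab-case`);
    if (ids.has(u.id)) problems.push(`${u.id}: duplicate id`);
    ids.add(u.id);
    if (u.acceptance.length === 0) problems.push(`${u.id}: no acceptance criteria`);
  }
  // Second pass, once every id is known: each dep must point at another existing unit.
  for (const u of units) {
    for (const dep of u.deps) {
      if (dep === u.id) problems.push(`${u.id}: depends on itself`);
      else if (!ids.has(dep)) problems.push(`${u.id}: unknown dependency ${dep}`);
    }
  }
  return problems;
}

// Illustrative units; the acceptance strings are invented for the example.
const exampleUnits: WorkUnit[] = [
  {
    id: "tenant-model",
    name: "Tenant model",
    rfcSections: ["Work Units"],
    description: "Tenant SQLAlchemy model + migration",
    deps: [],
    acceptance: ["migration applies cleanly"],
    tier: "small",
  },
  {
    id: "tenant-middleware",
    name: "Tenant middleware",
    rfcSections: ["Work Units"],
    description: "FastAPI middleware, extract tenant from JWT",
    deps: ["tenant-model"],
    acceptance: ["requests without a tenant claim are rejected"],
    tier: "medium",
  },
];

const problems = validateUnits(exampleUnits); // empty when the plan is consistent
```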

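### Decomposition Principles

- Prefer fewer, more cohesive units (lower merge risk)
- Minimize file overlap across units (avoids conflicts)
- Keep tests with their implementation (do not split into "implement X" + "test X")
- Declare a dependency only where there is a real code dependency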

## DAG Layered Execution

The dependency DAG determines execution order:

```
Layer 0: [unit-a, unit-b]      <- no dependencies, run in parallel
Layer 1: [unit-c]              <- depends on unit-a
Layer 2: [unit-d, unit-e]      <- depend on unit-c
```

Units in the same layer run in parallel; layers run in sequence.
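
One way to derive those layers from the `deps` lists is repeated "peel off everything whose dependencies are done". This is a minimal sketch under that assumption; `computeLayers` is a hypothetical name, and the real pipeline may compute layers differently.

```typescript
// Group unit ids into layers: a unit lands in the first layer after all of its deps.
// Input maps each unit id to the ids it depends on.
function computeLayers(deps: Record<string, string[]>): string[][] {
  const remaining = new Set(Object.keys(deps));
  const done = new Set<string>();
  const layers: string[][] = [];

  while (remaining.size > 0) {
    // A unit is ready once every one of its dependencies has been placed in an earlier layer.
    const ready = Array.from(remaining).filter((id) =>
      (deps[id] ?? []).every((d) => done.has(d)),
    );
    if (ready.length === 0) {
      // Nothing can make progress: the graph has a cycle or points at an unknown unit.
      throw new Error(`cycle or unknown dependency among: ${Array.from(remaining).join(", ")}`);
    }
    ready.sort(); // stable, readable output
    for (const id of ready) {
      remaining.delete(id);
      done.add(id);
    }
    layers.push(ready);
  }
  return layers;
}

const layers = computeLayers({
  "unit-a": [],
  "unit-b": [],
  "unit-c": ["unit-a"],
  "unit-d": ["unit-c"],
  "unit-e": ["unit-c"],
});
// layers is [["unit-a", "unit-b"], ["unit-c"], ["unit-d", "unit-e"]]
```

Throwing on an empty `ready` set surfaces a cyclic plan before any agent time is spent on it.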

## Complexity-Tiered Pipelines

Units of different complexity go through quality pipelines of different depth:

| Tier | Pipeline stages |
|------|---------|
| trivial | implement -> test |
| small | implement -> test -> code-review |
| medium | research -> plan -> implement -> test -> PRD-review + code-review -> review-fix |
| large | research -> plan -> implement -> test -> PRD-review + code-review -> review-fix -> final-review |
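
The table maps directly onto a lookup from tier to stage list. The sketch below assumes a flat ordering (it does not model whether PRD-review and code-review run concurrently), and the `Stage` identifiers and helper names are hypothetical spellings of the stage names above.

```typescript
type Tier = "trivial" | "small" | "medium" | "large";
type Stage =
  | "research"
  | "plan"
  | "implement"
  | "test"
  | "prd-review"
  | "code-review"
  | "review-fix"
  | "final-review";

// One entry per row of the tier table.
const PIPELINES: Record<Tier, readonly Stage[]> = {
  trivial: ["implement", "test"],
  small: ["implement", "test", "code-review"],
  medium: ["research", "plan", "implement", "test", "prd-review", "code-review", "review-fix"],
  large: [
    "research", "plan", "implement", "test",
    "prd-review", "code-review", "review-fix", "final-review",
  ],
};

// Stages a unit of the given tier goes through, in order.
function stagesFor(tier: Tier): readonly Stage[] {
  return PIPELINES[tier];
}

// True when the tier's pipeline includes the given stage.
function needsStage(tier: Tier, stage: Stage): boolean {
  return PIPELINES[tier].indexOf(stage) !== -1;
}

const smallStages = stagesFor("small");                         // implement, test, code-review
const trivialGetsReview = needsStage("trivial", "code-review"); // false
```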

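## Separate Context Windows (Eliminating Self-Review Bias)

Each stage runs in a separate agent process, so the reviewer is never the author:

| Stage | Model | Purpose |
|------|------|------|
| Research | Sonnet | Read the code and the RFC; produce a context document |
| Plan | Opus | Design the implementation steps |
| Implement | Codex/Sonnet | Write the code |
| Test | Sonnet | Run the build and tests |
| PRD Review | Sonnet | Spec compliance check |
| Code Review | Opus | Quality and security check |
| Review Fix | Codex/Sonnet | Address review comments |
| Final Review | Opus | Quality gate (large tier only) |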

## Merge Queue

```
Unit branch
  |
  +-- Rebase onto main
  |     Conflict? -> EVICT (capture conflict context)
  |
  +-- Run build + tests
  |     Failure? -> EVICT (capture test output)
  |
  +-- Pass -> Fast-forward main, push, delete branch
```
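
The land-or-evict decision above can be sketched as a small function over injected operations. Everything named here (`GitOps`, `tryLand`, the result shapes) is hypothetical; the real queue shells out to git and the project's test runner, which this sketch replaces with a fake so it can run anywhere.

```typescript
interface StepOutcome {
  ok: boolean;
  output: string; // conflict report or test log, kept as eviction context
}

// Hypothetical abstraction over the git and test commands the queue would run.
interface GitOps {
  rebaseOntoMain(branch: string): StepOutcome;
  runBuildAndTests(branch: string): StepOutcome;
  fastForwardMain(branch: string): void;
}

type LandResult =
  | { status: "landed" }
  | { status: "evicted"; reason: "conflict" | "test-failure"; context: string };

function tryLand(branch: string, ops: GitOps): LandResult {
  const rebase = ops.rebaseOntoMain(branch);
  if (!rebase.ok) {
    return { status: "evicted", reason: "conflict", context: rebase.output };
  }
  const tests = ops.runBuildAndTests(branch);
  if (!tests.ok) {
    return { status: "evicted", reason: "test-failure", context: tests.output };
  }
  ops.fastForwardMain(branch); // only reached when both gates pass
  return { status: "landed" };
}

// Fake operations for demonstration: "unit-b" always conflicts on rebase.
const landed: string[] = [];
const fakeOps: GitOps = {
  rebaseOntoMain: (b) =>
    b === "unit-b" ? { ok: false, output: "CONFLICT in src/app.ts" } : { ok: true, output: "" },
  runBuildAndTests: () => ({ ok: true, output: "all tests passed" }),
  fastForwardMain: (b) => {
    landed.push(b);
  },
};

const resultA = tryLand("unit-a", fakeOps); // lands
const resultB = tryLand("unit-b", fakeOps); // evicted with the conflict output as context
```

Returning the captured output with the eviction, rather than just a failure flag, is what lets the next implementation pass see why it was rejected.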

### File Overlap Intelligence

- Units with no overlap: land speculatively in parallel
- Units with overlap: land one at a time, rebasing each time
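
A minimal sketch of that split, assuming each finished unit reports the files it touched (for instance from its implement output or a diff against main). `ReadyUnit`, `LandingPlan`, and `planLanding` are hypothetical names, and the file paths are invented.

```typescript
interface ReadyUnit {
  id: string;
  files: string[]; // files the unit's branch touches
}

interface LandingPlan {
  parallel: string[];   // no file shared with any other unit
  sequential: string[]; // shares at least one file; land one at a time with a rebase each
}

function planLanding(units: ReadyUnit[]): LandingPlan {
  // Count how many distinct units touch each file.
  const touchCount = new Map<string, number>();
  for (const unit of units) {
    for (const file of Array.from(new Set(unit.files))) {
      touchCount.set(file, (touchCount.get(file) ?? 0) + 1);
    }
  }

  const plan: LandingPlan = { parallel: [], sequential: [] };
  for (const unit of units) {
    const overlaps = unit.files.some((f) => (touchCount.get(f) ?? 0) > 1);
    (overlaps ? plan.sequential : plan.parallel).push(unit.id);
  }
  return plan;
}

const landingPlan = planLanding([
  { id: "unit-a", files: ["src/a.ts"] },
  { id: "unit-b", files: ["src/b.ts", "src/shared.ts"] },
  { id: "unit-c", files: ["src/c.ts", "src/shared.ts"] },
]);
// landingPlan.parallel is ["unit-a"]; landingPlan.sequential is ["unit-b", "unit-c"]
```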

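### Eviction Recovery

When a unit is evicted, the full context (conflicting files, diff, test output) is passed to its next implementation attempt: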

```markdown
## MERGE CONFLICT -- RESOLVE BEFORE NEXT LANDING

Your previous implementation conflicted with another unit that landed first.
Restructure your changes to avoid the conflicting files/lines below.

{full eviction context and diff}
```
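
Filling that template is plain string assembly. In the sketch below the heading and the two instruction sentences come from the template above; the `EvictionContext` shape, the "Conflicting files" and "Diff" labels, and the sample values are assumptions made for illustration.

```typescript
// Hypothetical shape for what the merge queue captured at eviction time.
interface EvictionContext {
  conflictingFiles: string[];
  diff: string;
}

function buildEvictionPrompt(ctx: EvictionContext): string {
  return [
    "## MERGE CONFLICT -- RESOLVE BEFORE NEXT LANDING",
    "",
    "Your previous implementation conflicted with another unit that landed first.",
    "Restructure your changes to avoid the conflicting files/lines below.",
    "",
    "Conflicting files:",
    ...ctx.conflictingFiles.map((f) => `- ${f}`),
    "",
    "Diff:",
    ctx.diff,
  ].join("\n");
}

const evictionPrompt = buildEvictionPrompt({
  conflictingFiles: ["backend/app/registry.py"],
  diff: "@@ -1,3 +1,4 @@ (illustrative diff)",
});
```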

## Data Flow Between Stages

```
research.contextFilePath --------> plan
plan.implementationSteps --------> implement
implement.{filesCreated} --------> test, reviews
test.failingSummary ------------> reviews, implement (next pass)
reviews.{feedback} -------------> review-fix -> implement (next pass)
final-review.reasoning ---------> implement (next pass)
evictionContext -----------------> implement (after merge conflict)
```
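
The arrows into `implement` can be read as "everything earlier passes have learned feeds the next attempt". The sketch below models only that part of the diagram. The field names (`implementationSteps`, `failingSummary`, `feedback`, `reasoning`, `evictionContext`) come from the diagram; their types, the `PassState` container, and `implementInputs` are assumptions.

```typescript
// Accumulated outputs for one unit; a field is absent until its stage has run.
interface PassState {
  plan?: { implementationSteps: string[] };
  test?: { failingSummary: string };
  reviews?: { feedback: string[] };
  finalReview?: { reasoning: string };
  evictionContext?: string;
}

// Collect, in a fixed order, everything the next implement run should be told.
function implementInputs(state: PassState): string[] {
  const inputs: string[] = [];
  if (state.plan) {
    inputs.push(...state.plan.implementationSteps.map((s) => `step: ${s}`));
  }
  if (state.test) inputs.push(`failing tests: ${state.test.failingSummary}`);
  if (state.reviews) {
    inputs.push(...state.reviews.feedback.map((f) => `review feedback: ${f}`));
  }
  if (state.finalReview) inputs.push(`final review: ${state.finalReview.reasoning}`);
  if (state.evictionContext) inputs.push(`eviction: ${state.evictionContext}`);
  return inputs;
}

// First pass: only the plan exists.
const firstPass = implementInputs({
  plan: { implementationSteps: ["add model", "write migration"] },
});

// Second pass: the same plan plus what the first pass produced.
const secondPass = implementInputs({
  plan: { implementationSteps: ["add model", "write migration"] },
  test: { failingSummary: "2 tests failed in test_tenant.py" },
  reviews: { feedback: ["hash the API key before storing it"] },
});
```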

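## Worktree Isolation

Each unit runs in its own worktree. All pipeline stages of the same unit share that worktree, which preserves state across stages (context files, plan files, code changes).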

---

## Worked Example: smart-support Multi-Tenant Migration

### Step 1: Write the RFC

```markdown
# RFC: Multi-Tenant Agent Architecture

## Goal
Support multiple tenants, each with own agent config and conversation history.

## Work Units
1. tenant-model: Tenant SQLAlchemy model + migration
2. tenant-middleware: FastAPI middleware, extract tenant from JWT
3. agent-scoping: Scope agent registry per tenant
4. conversation-isolation: Filter conversations by tenant_id
5. frontend-tenant-selector: Tenant switcher in UI header
6. e2e-multi-tenant: E2E test for full flow

## Dependencies
tenant-model -> tenant-middleware -> agent-scoping
tenant-model -> conversation-isolation
agent-scoping + conversation-isolation -> frontend-tenant-selector
all -> e2e-multi-tenant
```

### Step 2: DAG Decomposition

```
Layer 0: [tenant-model]                                    # tier: small
Layer 1: [tenant-middleware, conversation-isolation]        # tier: medium, small
Layer 2: [agent-scoping]                                   # tier: medium
Layer 3: [frontend-tenant-selector]                        # tier: small
Layer 4: [e2e-multi-tenant]                                # tier: small
```

### Step 3: Execution Script

```bash
#!/bin/bash
set -e

# --- Layer 0: tenant-model (small: implement -> test -> review) ---
claude -p --model sonnet "Implement Tenant SQLAlchemy model in backend/app/models/tenant.py.
  Fields: id, name, api_key_hash, created_at. Write migration. Tests first."
claude -p --model opus "Review changes for security (api_key hashing) and schema design."

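# --- Layer 1: parallel (medium + small) ---
# Note: both jobs below share this checkout; the full pattern gives each unit its own worktree.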

# tenant-middleware (medium: research -> plan -> implement -> test -> review)
(
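  # Read-only tools cannot write files, so capture the printed report from stdout instead
  claude -p --model sonnet --allowedTools "Read,Grep,Glob" \
    "Research how FastAPI middleware works in this project. Print the findings as a markdown report." \
    > /tmp/middleware-research.md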
  claude -p --model opus \
    "Read /tmp/middleware-research.md. Plan tenant extraction from JWT. Write to /tmp/middleware-plan.md"
  claude -p --model sonnet \
    "Read /tmp/middleware-plan.md. Implement tenant middleware. Tests first."
  claude -p --model opus \
    "Review tenant-middleware changes for security and correctness."
) &
PID1=$!

# conversation-isolation (small: implement -> test -> review)
(
  claude -p --model sonnet \
    "Add tenant_id to conversations table. Filter all conversation queries by tenant_id. Tests first."
  claude -p --model opus \
    "Review conversation-isolation changes."
) &
PID2=$!

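# Wait on each job separately: `wait PID1 PID2` only reports the last job's exit status,
# so a failure in the first job would slip past `set -e`.
wait "$PID1"
wait "$PID2"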

# De-sloppify Layer 1
claude -p "Review all uncommitted changes. Remove test slop. Run pytest --cov=app."

# --- Layer 2: agent-scoping (medium) ---
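# Read-only tools cannot write files, so capture the printed report from stdout instead
claude -p --model sonnet --allowedTools "Read,Grep,Glob" \
  "Research how backend/app/registry.py loads agents. Print the findings as a markdown report." \
  > /tmp/registry-research.md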
claude -p --model opus \
  "Read /tmp/registry-research.md. Plan tenant-scoped agent loading. Write to /tmp/scoping-plan.md"
claude -p --model sonnet \
  "Read /tmp/scoping-plan.md. Implement tenant-scoped agent loading. Tests first."
claude -p --model opus \
  "Review agent-scoping changes for correctness and security."

# --- Layer 3: frontend (small) ---
claude -p "Add tenant selector to frontend header. Call GET /api/tenants.
  Store selected tenant in context. Pass tenant_id header on all API calls."

# --- Layer 4: E2E (small) ---
claude -p "Write E2E test in backend/tests/e2e/test_multi_tenant.py:
  1. Create two tenants
  2. Send chat as tenant A
  3. Verify tenant B cannot see A's conversations
  Run pytest -m e2e"

# --- Final verification ---
claude -p "Run pytest --cov=app --cov-report=term-missing. Fix any failures."
```

---

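## When to Use Ralphinho vs. a Simpler Pattern

| Signal | Use Ralphinho | Use something simpler |
|------|-------------|-----------|
| Multiple interdependent work units | Yes | No |
| Parallel implementation needed | Yes | No |
| Merge conflicts likely | Yes | No (sequential is enough) |
| Single-file change | No | Yes (sequential) |
| Multi-day project | Yes | Maybe (continuous-claude) |
| Spec/RFC already written | Yes | Maybe |
| Fast iteration on a single thing | No | Yes (NanoClaw or pipeline) |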

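## Key Design Principles

1. **Deterministic execution** -- upfront decomposition locks in parallelism and ordering
2. **Human review at the key leverage point** -- the work plan is the highest-leverage place to intervene
3. **Separation of concerns** -- each stage gets its own context and its own agent
4. **Conflict recovery with context** -- not blind retries
5. **Tiered depth** -- trivial skips research/review; large gets the heaviest review
6. **Resumable workflow** -- state is persisted to SQLite, so a run can resume from any point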

## Related

- [[Autonomous Loops 自主循环模式]]
- [[dmux 多Agent并行编排]]
- [[Everything Claude Code 完整指南]]