---
created: "2026-04-06"
type: resource
tags: [resource, claude-code, AI-tools, ralphinho, RFC, DAG, multi-agent, orchestration, ECC]
source: "~/.claude/skills/ralphinho-rfc-pipeline/SKILL.md"
---
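# Ralphinho RFC-DAG Orchestration Pattern
The most complex of the autonomous loop patterns. It decomposes an RFC/PRD into a dependency DAG, executes the DAG layer by layer with parallelism inside each layer, runs every unit through a tiered quality pipeline, and finally lands the changes through a merge queue. Created by enitrat.

Related notes: [[Autonomous Loops 自主循环模式]], [[dmux 多Agent并行编排]], [[Autonomous Agent Harness 自主代理框架]], [[ECC 编排替代方案 (orchestrate 迁移)]]
## Architecture Overview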
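```
RFC document
    |
    v
AI decomposes it into WorkUnits (with a dependency DAG)
    |
    v
RALPH LOOP (up to 3 passes)
    |
    +-- Execute by DAG layer (parallel within a layer):
    |     Each unit in its own worktree:
    |     Research -> Plan -> Implement -> Test -> Review
    |     (depth tiered by complexity)
    |
    +-- Merge queue:
          Rebase onto main -> Run tests -> Land or Evict
          Evicted units re-enter with their conflict context
```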
## WorkUnit Definition
```typescript
interface WorkUnit {
  id: string;            // kebab-case identifier
  name: string;          // human-readable name
  rfcSections: string[]; // RFC sections this unit covers
  description: string;   // detailed description
  deps: string[];        // dependencies (IDs of other units)
  acceptance: string[];  // concrete acceptance criteria
  tier: "trivial" | "small" | "medium" | "large";
}
```
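### Decomposition Principles
- Prefer fewer, more cohesive units (reduces merge risk)
- Minimize file overlap across units (avoids conflicts)
- Keep tests with the implementation (do not split into "implement X" + "test X")
- Declare a dependency only where a real code dependency exists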
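## DAG Layered Execution
The dependency DAG determines the execution order:
```
Layer 0: [unit-a, unit-b]   <- no dependencies, run in parallel
Layer 1: [unit-c]           <- depends on unit-a
Layer 2: [unit-d, unit-e]   <- depend on unit-c
```
Units in the same layer run in parallel; layers run sequentially.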
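
The layering above can be derived from each unit's `deps` list. The following is a minimal sketch, not the skill's actual implementation: `UnitNode` and `computeLayers` are hypothetical names, and the rule assumed here is that a unit's layer is one more than the deepest layer among its dependencies.

```typescript
// Hypothetical minimal shape: only the WorkUnit fields that layering needs.
interface UnitNode {
  id: string;
  deps: string[];
}

// Groups unit IDs into layers. A unit with no deps lands in layer 0;
// otherwise its layer is 1 + the deepest layer among its deps.
function computeLayers(units: UnitNode[]): string[][] {
  const byId = new Map<string, UnitNode>();
  for (const u of units) byId.set(u.id, u);

  const depth = new Map<string, number>(); // memoized layer index per unit
  const visiting = new Set<string>();      // units on the current DFS path

  const depthOf = (id: string): number => {
    const cached = depth.get(id);
    if (cached !== undefined) return cached;
    const unit = byId.get(id);
    if (!unit) throw new Error(`Unknown unit: ${id}`);
    if (visiting.has(id)) throw new Error(`Dependency cycle at: ${id}`);
    visiting.add(id);
    let d = 0;
    for (const dep of unit.deps) d = Math.max(d, depthOf(dep) + 1);
    visiting.delete(id);
    depth.set(id, d);
    return d;
  };

  const layers: string[][] = [];
  for (const u of units) {
    const d = depthOf(u.id);
    while (layers.length <= d) layers.push([]);
    const layer = layers[d];
    if (layer) layer.push(u.id);
  }
  return layers;
}

const exampleLayers = computeLayers([
  { id: "unit-a", deps: [] },
  { id: "unit-b", deps: [] },
  { id: "unit-c", deps: ["unit-a"] },
  { id: "unit-d", deps: ["unit-c"] },
  { id: "unit-e", deps: ["unit-c"] },
]);
console.log(JSON.stringify(exampleLayers));
// [["unit-a","unit-b"],["unit-c"],["unit-d","unit-e"]]
```

Rejecting cycles and unknown IDs up front keeps a malformed decomposition from reaching the execution phase.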
## Complexity-Tiered Pipeline
Each complexity tier goes through a quality pipeline of a different depth:
| Tier | Pipeline stages |
|------|---------|
| trivial | implement -> test |
| small | implement -> test -> code-review |
| medium | research -> plan -> implement -> test -> PRD-review + code-review -> review-fix |
| large | research -> plan -> implement -> test -> PRD-review + code-review -> review-fix -> final-review |
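
One way to encode the table is a lookup from tier to an ordered stage list. This is an illustrative sketch only: `PIPELINES` and `stagesFor` are hypothetical names, and the two reviews that the table joins with "+" are simply listed one after the other here.

```typescript
type Tier = "trivial" | "small" | "medium" | "large";

// Ordered stage lists per tier, transcribed from the table above.
// Assumption: "PRD-review + code-review" is flattened into two consecutive entries.
const PIPELINES: Record<Tier, readonly string[]> = {
  trivial: ["implement", "test"],
  small: ["implement", "test", "code-review"],
  medium: [
    "research", "plan", "implement", "test",
    "prd-review", "code-review", "review-fix",
  ],
  large: [
    "research", "plan", "implement", "test",
    "prd-review", "code-review", "review-fix", "final-review",
  ],
};

// Returns the stages a unit of the given tier must pass through, in order.
function stagesFor(tier: Tier): readonly string[] {
  return PIPELINES[tier];
}

console.log(stagesFor("small").join(" -> "));
// implement -> test -> code-review
```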
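## Separate Context Windows (Eliminating Self-Review Bias)
Each stage runs in its own agent process, so the reviewer is never the author.
| Stage | Model | Purpose |
|------|------|------|
| Research | Sonnet | Reads the code and the RFC, produces a context document |
| Plan | Opus | Designs the implementation steps |
| Implement | Codex/Sonnet | Writes the code |
| Test | Sonnet | Runs the build and the tests |
| PRD Review | Sonnet | Spec compliance check |
| Code Review | Opus | Quality and security check |
| Review Fix | Codex/Sonnet | Addresses review comments |
| Final Review | Opus | Quality gate (large tier only) |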
## Merge Queue
```
Unit branch
    |
    +-- Rebase onto main
    |     Conflict? -> EVICT (capture conflict context)
    |
    +-- Run build + tests
    |     Failure? -> EVICT (capture test output)
    |
    +-- Pass -> Fast-forward main, push, delete branch
```
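
The decision flow in the diagram can be sketched as a small function. Everything named here (`LandingResult`, `LandingSteps`, `tryLand`) is hypothetical; in a real queue the three callbacks would shell out to git and the project's build, which this sketch replaces with stubs.

```typescript
// Outcome of one attempt to land a unit branch.
type LandingResult =
  | { status: "landed" }
  | { status: "evicted"; reason: "rebase-conflict" | "test-failure"; context: string };

interface StepOutcome {
  ok: boolean;
  output: string; // captured conflict details or test output
}

// Hypothetical hooks; real ones would run git and the build/test commands.
interface LandingSteps {
  rebase: () => StepOutcome;
  buildAndTest: () => StepOutcome;
  fastForward: () => void;
}

// Mirrors the diagram: rebase, then build + tests, then fast-forward.
// The first failing step evicts the unit and keeps its output as context.
function tryLand(steps: LandingSteps): LandingResult {
  const rebase = steps.rebase();
  if (!rebase.ok) {
    return { status: "evicted", reason: "rebase-conflict", context: rebase.output };
  }
  const tests = steps.buildAndTest();
  if (!tests.ok) {
    return { status: "evicted", reason: "test-failure", context: tests.output };
  }
  steps.fastForward();
  return { status: "landed" };
}

// Stubbed run: the rebase succeeds but a test fails, so the unit is evicted.
const landingResult = tryLand({
  rebase: () => ({ ok: true, output: "" }),
  buildAndTest: () => ({ ok: false, output: "1 failed: test_tenant_isolation" }),
  fastForward: () => {},
});
console.log(landingResult.status); // evicted
```

Returning the captured output with the eviction is what lets the next implementation pass see why it was rejected.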
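### File-Overlap Intelligence
- Units with no file overlap: land speculatively in parallel
- Units with file overlap: land one at a time, rebasing before each landing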
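
A minimal sketch of how the two cases might be told apart, assuming each unit reports the files it touched. `UnitFiles`, `partitionByOverlap`, and the file paths in the example are hypothetical; the source does not say how overlap is actually detected.

```typescript
interface UnitFiles {
  id: string;
  files: string[]; // paths the unit created or modified
}

// Splits units into those that share no file with any other unit
// (candidates for parallel landing) and those that do (land one by one).
function partitionByOverlap(units: UnitFiles[]): { parallel: string[]; sequential: string[] } {
  // Map each file to the set of units that touch it.
  const owners = new Map<string, Set<string>>();
  for (const u of units) {
    for (const f of u.files) {
      const set = owners.get(f) ?? new Set<string>();
      set.add(u.id);
      owners.set(f, set);
    }
  }

  const parallel: string[] = [];
  const sequential: string[] = [];
  for (const u of units) {
    // A unit overlaps if any of its files is touched by more than one unit.
    const overlaps = u.files.some((f) => (owners.get(f)?.size ?? 0) > 1);
    (overlaps ? sequential : parallel).push(u.id);
  }
  return { parallel, sequential };
}

const landingPlan = partitionByOverlap([
  { id: "unit-a", files: ["models/tenant.py"] },
  { id: "unit-b", files: ["middleware.py", "main.py"] },
  { id: "unit-c", files: ["conversations.py", "main.py"] },
]);
console.log(JSON.stringify(landingPlan));
// {"parallel":["unit-a"],"sequential":["unit-b","unit-c"]}
```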
### Eviction Recovery
On eviction, the full context (conflicting files, diff, test output) is passed to the next implementation pass:
```markdown
## MERGE CONFLICT -- RESOLVE BEFORE NEXT LANDING
Your previous implementation conflicted with another unit that landed first.
Restructure your changes to avoid the conflicting files/lines below.
{full eviction context and diff}
```
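
Filling in the template could look roughly like this. The `EvictionContext` shape, the `buildEvictionPrompt` name, and the "Conflicting files" / "Diff" / "Test output" labels are assumptions made for illustration; only the three header lines come from the template above.

```typescript
// Hypothetical shape for what the merge queue captures on eviction.
interface EvictionContext {
  conflictFiles: string[];
  diff: string;
  testOutput?: string; // present only when the eviction came from failing tests
}

// Renders the eviction context into the prompt prefix for the next implement pass.
function buildEvictionPrompt(ctx: EvictionContext): string {
  const lines: string[] = [
    "## MERGE CONFLICT -- RESOLVE BEFORE NEXT LANDING",
    "Your previous implementation conflicted with another unit that landed first.",
    "Restructure your changes to avoid the conflicting files/lines below.",
    "",
    "Conflicting files:",
  ];
  for (const f of ctx.conflictFiles) lines.push(`- ${f}`);
  lines.push("", "Diff:", ctx.diff);
  if (ctx.testOutput) lines.push("", "Test output:", ctx.testOutput);
  return lines.join("\n");
}

const evictionPrompt = buildEvictionPrompt({
  conflictFiles: ["backend/app/main.py"],
  diff: "<<<<<<< HEAD\n...\n>>>>>>> unit-branch",
});
console.log(evictionPrompt);
```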
## Data Flow Between Stages
```
research.contextFilePath --------> plan
plan.implementationSteps --------> implement
implement.{filesCreated} --------> test, reviews
test.failingSummary ------------> reviews, implement (next pass)
reviews.{feedback} -------------> review-fix -> implement (next pass)
final-review.reasoning ---------> implement (next pass)
evictionContext -----------------> implement (after merge conflict)
```
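## Worktree Isolation
Each unit runs in its own worktree. The pipeline stages of a single unit share that worktree, which preserves state across stages (context files, plan files, code changes).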
---
## Worked Example: smart-support Multi-Tenant Refactor
### Step 1: Write the RFC
```markdown
# RFC: Multi-Tenant Agent Architecture
## Goal
Support multiple tenants, each with own agent config and conversation history.
## Work Units
1. tenant-model: Tenant SQLAlchemy model + migration
2. tenant-middleware: FastAPI middleware, extract tenant from JWT
3. agent-scoping: Scope agent registry per tenant
4. conversation-isolation: Filter conversations by tenant_id
5. frontend-tenant-selector: Tenant switcher in UI header
6. e2e-multi-tenant: E2E test for full flow
## Dependencies
tenant-model -> tenant-middleware -> agent-scoping
tenant-model -> conversation-isolation
agent-scoping + conversation-isolation -> frontend-tenant-selector
all -> e2e-multi-tenant
```
### Step 2: DAG Decomposition
```
Layer 0: [tenant-model] # tier: small
Layer 1: [tenant-middleware, conversation-isolation] # tier: medium, small
Layer 2: [agent-scoping] # tier: medium
Layer 3: [frontend-tenant-selector] # tier: small
Layer 4: [e2e-multi-tenant] # tier: small
```
### Step 3: Execution Script
```bash
#!/bin/bash
set -e
# --- Layer 0: tenant-model (small: implement -> test -> review) ---
claude -p --model sonnet "Implement Tenant SQLAlchemy model in backend/app/models/tenant.py.
Fields: id, name, api_key_hash, created_at. Write migration. Tests first."
claude -p --model opus "Review changes for security (api_key hashing) and schema design."
# --- Layer 1: parallel (medium + small) ---
# tenant-middleware (medium: research -> plan -> implement -> test -> review)
(
claude -p --model sonnet --allowedTools "Read,Grep,Glob" \
"Research how FastAPI middleware works in this project. Document in /tmp/middleware-research.md"
claude -p --model opus \
"Read /tmp/middleware-research.md. Plan tenant extraction from JWT. Write to /tmp/middleware-plan.md"
claude -p --model sonnet \
"Read /tmp/middleware-plan.md. Implement tenant middleware. Tests first."
claude -p --model opus \
"Review tenant-middleware changes for security and correctness."
) &
PID1=$!
# conversation-isolation (small: implement -> test -> review)
(
claude -p --model sonnet \
"Add tenant_id to conversations table. Filter all conversation queries by tenant_id. Tests first."
claude -p --model opus \
"Review conversation-isolation changes."
) &
PID2=$!
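# Wait on each job separately: `wait PID1 PID2` reports only the last job's exit status,
# so a failure in the first job would slip past `set -e`
wait "$PID1"
wait "$PID2"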
# De-sloppify Layer 1
claude -p "Review all uncommitted changes. Remove test slop. Run pytest --cov=app."
# --- Layer 2: agent-scoping (medium) ---
claude -p --model sonnet --allowedTools "Read,Grep,Glob" \
"Research how backend/app/registry.py loads agents. Document in /tmp/registry-research.md"
claude -p --model opus \
"Read /tmp/registry-research.md. Plan tenant-scoped agent loading. Write to /tmp/scoping-plan.md"
claude -p --model sonnet \
"Read /tmp/scoping-plan.md. Implement tenant-scoped agent loading. Tests first."
claude -p --model opus \
"Review agent-scoping changes for correctness and security."
# --- Layer 3: frontend (small) ---
claude -p "Add tenant selector to frontend header. Call GET /api/tenants.
Store selected tenant in context. Pass tenant_id header on all API calls."
# --- Layer 4: E2E (small) ---
claude -p "Write E2E test in backend/tests/e2e/test_multi_tenant.py:
1. Create two tenants
2. Send chat as tenant A
3. Verify tenant B cannot see A's conversations
Run pytest -m e2e"
# --- Final verification ---
claude -p "Run pytest --cov=app --cov-report=term-missing. Fix any failures."
```
---
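## When to Use Ralphinho vs. a Simpler Pattern
| Signal | Use Ralphinho | Use something simpler |
|------|-------------|-----------|
| Multiple interdependent work units | Yes | No |
| Parallel implementation needed | Yes | No |
| Merge conflicts likely | Yes | No (sequential is enough) |
| Single-file change | No | Yes (sequential) |
| Multi-day project | Yes | Maybe (continuous-claude) |
| Spec/RFC already written | Yes | Maybe |
| Fast iteration on a single thing | No | Yes (NanoClaw or pipeline) |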
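## Key Design Principles
1. **Deterministic execution** -- upfront decomposition locks in parallelism and ordering
2. **Human review at key leverage points** -- the work plan is the highest-leverage intervention point
3. **Separation of concerns** -- each stage gets its own context and its own agent
4. **Conflict recovery with context** -- not blind retries
5. **Tiered depth** -- trivial skips research/review; large gets maximum review scrutiny
6. **Resumable workflow** -- state is persisted to SQLite, so a run can resume from any point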
## Related
- [[Autonomous Loops 自主循环模式]]
- [[dmux 多Agent并行编排]]
- [[Everything Claude Code 完整指南]]