diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
index 5c515df..1f2dd1d 100644
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -1,263 +1,143 @@
-[角色]
-    你是废才，一位资深产品经理兼全栈开发教练。
-    
-    你见过太多人带着"改变世界"的妄想来找你，最后连需求都说不清楚。
-    你也见过真正能成事的人——他们不一定聪明，但足够诚实，敢于面对自己想法的漏洞。
-    
-    你负责引导用户完成产品开发的完整旅程：从脑子里的模糊想法，到可运行的产品。
+# Invoice Master POC v2
 
-[任务]
-    引导用户完成产品开发的完整流程：
-    
-    1. **需求收集** → 调用 product-spec-builder，生成 Product-Spec.md
-    2. **原型设计** → 调用 ui-prompt-generator，生成 UI-Prompts.md（可选）
-    3. **项目开发** → 调用 dev-builder，实现项目代码
-    4. **本地运行** → 启动项目，输出使用指南
+Swedish Invoice Field Extraction System - uses YOLOv11 + PaddleOCR to extract structured data from Swedish PDF invoices.
 
-[文件结构]
-    project/
-    ├── Product-Spec.md                    # 产品需求文档
-    ├── Product-Spec-CHANGELOG.md          # 需求变更记录
-    ├── UI-Prompts.md                      # 原型图提示词（可选）
-    ├── [项目源代码]/                       # 代码文件
-    └── .claude/
-        ├── CLAUDE.md                      # 主控（本文件）
-        └── skills/
-            ├── product-spec-builder/      # 需求收集
-            ├── ui-prompt-generator/       # 原型图提示词
-            └── dev-builder/               # 项目开发
+## Tech Stack
 
-[总体规则]
-    - 严格按照 需求收集 → 原型设计（可选）→ 项目开发 → 本地运行 的流程引导
-    - **任何功能变更、UI 修改、需求调整，都必须先更新 Product Spec，再实现代码**
-    - 无论用户如何打断或提出新问题，完成当前回答后始终引导用户进入下一步
-    - 始终使用**中文**进行交流
+| Component | Technology |
+|-----------|------------|
+| Object Detection | YOLOv11 (Ultralytics) |
+| OCR Engine | PaddleOCR v5 (PP-OCRv5) |
+| PDF Processing | PyMuPDF (fitz) |
+| Database | PostgreSQL + psycopg2 |
+| Web Framework | FastAPI + Uvicorn |
+| Deep Learning | PyTorch + CUDA 12.x |
 
-[运行环境要求]
-    **强制要求**：所有程序运行、命令执行必须在 WSL 环境中进行
+## WSL Environment (REQUIRED)
 
-    - **WSL**：所有 bash 命令必须通过 `wsl` 前缀执行
-    - **Conda 环境**：必须使用 `invoice-py311` 环境
+**Prefix ALL commands with:**
 
-    命令执行格式：
-    ```bash
-    wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && <你的命令>"
-    ```
+```bash
+wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && <command>"
+```
 
-    示例：
-    ```bash
-    # 运行 Python 脚本
-    wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && python main.py"
+**NEVER run Python commands directly in Windows PowerShell/CMD.**
 
-    # 安装依赖
-    wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && pip install -r requirements.txt"
+## Project-Specific Rules
 
-    # 运行测试
-    wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && pytest"
-    ```
+- Python 3.11+ with type hints
+- No print() in production - use logging
+- Run tests: `pytest --cov=src`
 
-    **注意**：
-    - 不要直接在 Windows PowerShell/CMD 中运行 Python 命令
-    - 每次执行命令都需要激活 conda 环境（因为是非交互式 shell）
-    - 路径需要转换为 WSL 格式（如 `/mnt/c/Users/...`）
+## File Structure
 
-[Skill 调用规则]
-    [product-spec-builder]
-        **自动调用**：
-        - 用户表达想要开发产品、应用、工具时
-        - 用户描述产品想法、功能需求时
-        - 用户要修改 UI、改界面、调整布局时（迭代模式）
-        - 用户要增加功能、新增功能时（迭代模式）
-        - 用户要改需求、调整功能、修改逻辑时（迭代模式）
-        
-        **手动调用**：/prd
-    
-    [ui-prompt-generator]
-        **手动调用**：/ui
-        
-        前置条件：Product-Spec.md 必须存在
-    
-    [dev-builder]
-        **手动调用**：/dev
-        
-        前置条件：Product-Spec.md 必须存在
+```
+src/
+├── cli/              # autolabel, train, infer, serve
+├── pdf/              # extractor, renderer, detector
+├── ocr/              # PaddleOCR wrapper, machine_code_parser
+├── inference/        # pipeline, yolo_detector, field_extractor
+├── normalize/        # Per-field normalizers
+├── matcher/          # Exact, substring, fuzzy strategies
+├── processing/       # CPU/GPU pool architecture
+├── web/              # FastAPI app, routes, services, schemas
+├── utils/            # validators, text_cleaner, fuzzy_matcher
+└── data/             # Database operations
+tests/                # Mirror of src structure
+runs/train/           # Training outputs
+```
 
-[项目状态检测与路由]
-    初始化时自动检测项目进度，路由到对应阶段：
-    
-    检测逻辑：
-        - 无 Product-Spec.md → 全新项目 → 引导用户描述想法或输入 /prd
-        - 有 Product-Spec.md，无代码 → Spec 已完成 → 输出交付指南
-        - 有 Product-Spec.md，有代码 → 项目已创建 → 可执行 /check 或 /run
-    
-    显示格式：
-        "📊 **项目进度检测**
-        
-        - Product Spec：[已完成/未完成]
-        - 原型图提示词：[已生成/未生成]
-        - 项目代码：[已创建/未创建]
-        
-        **当前阶段**：[阶段名称]
-        **下一步**：[具体指令或操作]"
+## Supported Fields
 
-[工作流程]
-    [需求收集阶段]
-        触发：用户表达产品想法（自动）或输入 /prd（手动）
-        
-        执行：调用 product-spec-builder skill
-        
-        完成后：输出交付指南，引导下一步
+| ID | Field | Description |
+|----|-------|-------------|
+| 0 | invoice_number | Invoice number |
+| 1 | invoice_date | Invoice date |
+| 2 | invoice_due_date | Due date |
+| 3 | ocr_number | OCR reference (Swedish payment) |
+| 4 | bankgiro | Bankgiro account |
+| 5 | plusgiro | Plusgiro account |
+| 6 | amount | Amount |
+| 7 | supplier_organisation_number | Supplier org number |
+| 8 | payment_line | Payment line (machine-readable) |
+| 9 | customer_number | Customer number |
 
-    [交付阶段]
-        触发：Product Spec 生成完成后自动执行
-        
-        输出：
-            "✅ **Product Spec 已生成！**
-            
-            文件：Product-Spec.md
-            
-            ---
-            
-            ## 📘 接下来
-            
-            - 输入 /ui 生成原型图提示词（可选）
-            - 输入 /dev 开始开发项目
-            - 直接对话可以改 UI、加功能"
+## Key Patterns
 
-    [原型图阶段]
-        触发：用户输入 /ui
-        
-        执行：调用 ui-prompt-generator skill
-        
-        完成后：
-            "✅ **原型图提示词已生成！**
-            
-            文件：UI-Prompts.md
-            
-            把提示词发给 AI 绘图工具生成原型图，然后输入 /dev 开始开发。"
+### Inference Result
 
-    [项目开发阶段]
-        触发：用户输入 /dev
-    
-        第一步：询问原型图
-            询问用户："有原型图或设计稿吗？有的话发给我参考。"
-            用户发送图片 → 记录，开发时参考
-            用户说没有 → 继续
-    
-        第二步：执行开发
-            调用 dev-builder skill
-    
-        完成后：引导用户执行 /run
+```python
+@dataclass
+class InferenceResult:
+    document_id: str
+    document_type: str  # "invoice" or "letter"
+    fields: dict[str, str]
+    confidence: dict[str, float]
+    cross_validation: CrossValidationResult | None
+    processing_time_ms: float
+```
 
-    [代码检查阶段]
-        触发：用户输入 /check
-        
-        执行：
-            第一步：读取 Product Spec 文档
-                加载 Product-Spec.md 文件
-                解析功能需求、UI 布局
-            
-            第二步：扫描项目代码
-                遍历项目目录下的代码文件
-                识别已实现的功能、组件
-            
-            第三步：功能完整度检查
-                - 功能需求：Product Spec 功能需求 vs 代码实现
-                - UI 布局：Product Spec 布局描述 vs 界面代码
-            
-            第四步：输出检查报告
-        
-        输出：
-            "📋 **项目完整度检查报告**
-            
-            **对照文档**：Product-Spec.md
-            
-            ---
-            
-            ✅ **已完成（X项）**
-            - [功能名称]：[实现位置]
-            
-            ⚠️ **部分完成（X项）**
-            - [功能名称]：[缺失内容]
-            
-            ❌ **缺失（X项）**
-            - [功能名称]：未实现
-            
-            ---
- 
-            💡 **改进建议**
-            1. [具体建议]
-            2. [具体建议]
-                
-            ---
-            
-            需要我帮你补充这些功能吗？或输入 /run 先跑起来看看。"
+### API Schemas
 
-    [本地运行阶段]
-        触发：用户输入 /run
-        
-        执行：自动检测项目类型，安装依赖，启动项目
-        
-        输出：
-            "🚀 **项目已启动！**
-            
-            **访问地址**：http://localhost:[端口号]
-            
-            ---
-            
-            ## 📖 使用指南
-            
-            [根据 Product Spec 生成简要使用说明]
-            
-            ---
-            
-            💡 **提示**：
-            - /stop 停止服务
-            - /check 检查完整度
-            - /prd 修改需求"
+See `src/web/schemas.py` for request/response models.
 
-    [内容修订]
-        当用户提出修改意见时：
-        
-        **流程**：先更新文档 → 再实现代码
-        
-        1. 调用 product-spec-builder（迭代模式）
-           - 通过追问明确变更内容
-           - 更新 Product-Spec.md
-           - 更新 Product-Spec-CHANGELOG.md
-        2. 调用 dev-builder 实现代码变更
-        3. 建议用户执行 /check 验证
+## Environment Variables
 
-[指令集]
-    /prd    - 需求收集，生成 Product Spec
-    /ui     - 生成原型图提示词
-    /dev    - 开发项目代码
-    /check  - 对照 Spec 检查代码完整度
-    /run    - 本地运行项目
-    /stop   - 停止运行中的服务
-    /status - 显示项目进度
-    /help   - 显示所有指令
+```bash
+# Required
+DB_PASSWORD=
 
-[初始化]
-    以下ASCII艺术应该显示"FEICAI"字样。如果您看到乱码或显示异常，请帮忙纠正，使用ASCII艺术生成显示"FEICAI"
-    ```
-        "███████╗███████╗██╗ ██████╗ █████╗ ██╗
-        ██╔════╝██╔════╝██║██╔════╝██╔══██╗██║
-        █████╗  █████╗  ██║██║     ███████║██║
-        ██╔══╝  ██╔══╝  ██║██║     ██╔══██║██║
-        ██║     ███████╗██║╚██████╗██║  ██║██║
-        ╚═╝     ╚══════╝╚═╝ ╚═════╝╚═╝  ╚═╝╚═╝"    
-    ```
-    
-    "👋 我是废才，产品经理兼开发教练。
+# Optional (with defaults)
+DB_HOST=192.168.68.31
+DB_PORT=5432
+DB_NAME=docmaster
+DB_USER=docmaster
+MODEL_PATH=runs/train/invoice_fields/weights/best.pt
+CONFIDENCE_THRESHOLD=0.5
+SERVER_HOST=0.0.0.0
+SERVER_PORT=8000
+```
 
-    我不聊理想，只聊产品。你负责想，我负责问到你想清楚。
-    从需求文档到本地运行，全程我带着走。
+## CLI Commands
 
-    过程中我会问很多问题，有些可能让你不舒服。不过放心，我只是想让你的产品能落地，仅此而已。
+```bash
+# Auto-labeling
+python -m src.cli.autolabel --dual-pool --cpu-workers 3 --gpu-workers 1
 
-    💡 输入 /help 查看所有指令
+# Training
+python -m src.cli.train --model yolo11n.pt --epochs 100 --batch 16 --name invoice_fields
 
-    现在，说说你想做什么？"
-    
-    执行 [项目状态检测与路由]
+# Inference
+python -m src.cli.infer --model runs/train/invoice_fields/weights/best.pt --input invoice.pdf --gpu
+
+# Web Server
+python run_server.py --port 8000
+```
+
+## API Endpoints
+
+| Method | Endpoint | Description |
+|--------|----------|-------------|
+| GET | `/` | Web UI |
+| GET | `/api/v1/health` | Health check |
+| POST | `/api/v1/infer` | Process invoice |
+| GET | `/api/v1/results/{filename}` | Get visualization |
+
+## Current Status
+
+- **Tests**: 688 passing
+- **Coverage**: 37%
+- **Model**: 93.5% mAP@0.5
+- **Documents Labeled**: 9,738
+
+## Quick Start
+
+```bash
+# Start server
+wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && python run_server.py"
+
+# Run tests
+wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && pytest"
+
+# Access UI: http://localhost:8000
+```
\ No newline at end of file
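Reviewer sketch for the "Environment Variables" section of the CLAUDE.md hunk above: one possible way to read those variables with the listed defaults and fail fast when `DB_PASSWORD` is missing. The `Settings` class and `load_settings` function are hypothetical names chosen for illustration; the project's actual configuration code is not shown in this diff.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; defaults copied from the documented values."""
    db_password: str
    db_host: str = "192.168.68.31"
    db_port: int = 5432
    db_name: str = "docmaster"
    db_user: str = "docmaster"
    model_path: str = "runs/train/invoice_fields/weights/best.pt"
    confidence_threshold: float = 0.5
    server_host: str = "0.0.0.0"
    server_port: int = 8000


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping; DB_PASSWORD is the only required key."""
    password = env.get("DB_PASSWORD", "")
    if not password:
        raise RuntimeError("DB_PASSWORD is required")
    return Settings(
        db_password=password,
        db_host=env.get("DB_HOST", Settings.db_host),
        db_port=int(env.get("DB_PORT", Settings.db_port)),
        db_name=env.get("DB_NAME", Settings.db_name),
        db_user=env.get("DB_USER", Settings.db_user),
        model_path=env.get("MODEL_PATH", Settings.model_path),
        confidence_threshold=float(
            env.get("CONFIDENCE_THRESHOLD", Settings.confidence_threshold)
        ),
        server_host=env.get("SERVER_HOST", Settings.server_host),
        server_port=int(env.get("SERVER_PORT", Settings.server_port)),
    )
```

Passing the mapping explicitly (for example `load_settings({"DB_PASSWORD": "secret"})`) keeps the function testable without touching the real process environment.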
diff --git a/.claude/commands/build-fix.md b/.claude/commands/build-fix.md
new file mode 100644
index 0000000..5951016
--- /dev/null
+++ b/.claude/commands/build-fix.md
@@ -0,0 +1,22 @@
+# Build and Fix
+
+Incrementally fix Python errors and test failures.
+
+## Workflow
+
+1. Run check: `mypy src/ --ignore-missing-imports` or `pytest -x --tb=short`
+2. Parse errors, group by file, sort by severity (ImportError > TypeError > other)
+3. For each error:
+   - Show context (5 lines)
+   - Explain and propose fix
+   - Apply fix
+   - Re-run test for that file
+   - Verify resolved
+4. Stop if: fix introduces new errors, same error after 3 attempts, or user pauses
+5. Show summary: fixed / remaining / new errors
+
+## Rules
+
+- Fix ONE error at a time
+- Re-run tests after each fix
+- Never batch multiple unrelated fixes
\ No newline at end of file
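Reviewer sketch for step 2 of the build-fix workflow above (group by file, sort by severity with ImportError > TypeError > other). `CheckError` and `plan_fixes` are hypothetical names, and the parsing of mypy or pytest output into records is assumed to happen elsewhere.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckError:
    """Hypothetical parsed error record; not an existing project type."""
    file: str
    line: int
    kind: str  # e.g. "ImportError", "TypeError", "AssertionError"
    message: str


# Lower rank is fixed first: ImportError > TypeError > everything else.
SEVERITY_RANK = {"ImportError": 0, "TypeError": 1}
OTHER_RANK = 2


def _rank(error: CheckError) -> tuple[int, int]:
    """Sort key: severity first, then line number within the file."""
    return (SEVERITY_RANK.get(error.kind, OTHER_RANK), error.line)


def plan_fixes(errors: list[CheckError]) -> dict[str, list[CheckError]]:
    """Group errors by file; order files by their worst error, errors by severity."""
    grouped: dict[str, list[CheckError]] = defaultdict(list)
    for error in errors:
        grouped[error.file].append(error)
    for group in grouped.values():
        group.sort(key=_rank)
    # After sorting, group[0] is the most severe error in each file.
    ordered = sorted(grouped, key=lambda f: (_rank(grouped[f][0])[0], f))
    return {f: grouped[f] for f in ordered}
```

Working through the returned mapping one entry at a time matches the "Fix ONE error at a time" rule: the caller fixes the first error, re-runs the check, and rebuilds the plan.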
diff --git a/.claude/commands/checkpoint.md b/.claude/commands/checkpoint.md
new file mode 100644
index 0000000..06293c0
--- /dev/null
+++ b/.claude/commands/checkpoint.md
@@ -0,0 +1,74 @@
+# Checkpoint Command
+
+Create or verify a checkpoint in your workflow.
+
+## Usage
+
+`/checkpoint [create|verify|list|clear] [name]`
+
+## Create Checkpoint
+
+When creating a checkpoint:
+
+1. Run `/verify quick` to ensure current state is clean
+2. Create a git stash or commit with checkpoint name
+3. Log checkpoint to `.claude/checkpoints.log`:
+
+```bash
+echo "$(date +%Y-%m-%d-%H:%M) | $CHECKPOINT_NAME | $(git rev-parse --short HEAD)" >> .claude/checkpoints.log
+```
+
+4. Report checkpoint created
+
+## Verify Checkpoint
+
+When verifying against a checkpoint:
+
+1. Read checkpoint from log
+2. Compare current state to checkpoint:
+   - Files added since checkpoint
+   - Files modified since checkpoint
+   - Test pass rate now vs then
+   - Coverage now vs then
+
+3. Report:
+```
+CHECKPOINT COMPARISON: $NAME
+============================
+Files changed: X
+Tests: +Y passed / -Z failed
+Coverage: +X% / -Y%
+Build: [PASS/FAIL]
+```
+
+## List Checkpoints
+
+Show all checkpoints with:
+- Name
+- Timestamp
+- Git SHA
+- Status (current, behind, ahead)
+
+## Workflow
+
+Typical checkpoint flow:
+
+```
+[Start] --> /checkpoint create "feature-start"
+   |
+[Implement] --> /checkpoint create "core-done"
+   |
+[Test] --> /checkpoint verify "core-done"
+   |
+[Refactor] --> /checkpoint create "refactor-done"
+   |
+[PR] --> /checkpoint verify "feature-start"
+```
+
+## Arguments
+
+$ARGUMENTS:
+- `create <name>` - Create named checkpoint
+- `verify <name>` - Verify against named checkpoint
+- `list` - Show all checkpoints
+- `clear` - Remove old checkpoints (keeps last 5)
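Reviewer sketch for the checkpoint log described above: parsing the `timestamp | name | sha` lines that the `echo` command appends, looking a checkpoint up by name, and keeping the last five entries as `clear` describes. The function names are hypothetical, and the sketch assumes checkpoint names never contain a `|` character.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Checkpoint:
    timestamp: str
    name: str
    sha: str


def parse_log(text: str) -> list[Checkpoint]:
    """Parse 'timestamp | name | sha' lines, skipping blank or malformed ones."""
    checkpoints = []
    for line in text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) == 3 and all(parts):
            checkpoints.append(Checkpoint(*parts))
    return checkpoints


def find_checkpoint(checkpoints: list[Checkpoint], name: str) -> Optional[Checkpoint]:
    """Return the most recent entry with the given name, or None if absent."""
    for checkpoint in reversed(checkpoints):
        if checkpoint.name == name:
            return checkpoint
    return None


def prune(checkpoints: list[Checkpoint], keep: int = 5) -> list[Checkpoint]:
    """Keep only the newest `keep` entries (the log is append-only, so newest is last)."""
    return checkpoints[-keep:] if keep > 0 else []
```

The SHA returned by `find_checkpoint` could then feed a comparison such as `git diff --stat <sha>` for the "Files changed" line of the report.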
diff --git a/.claude/commands/code-review.md b/.claude/commands/code-review.md
new file mode 100644
index 0000000..25c9e7a
--- /dev/null
+++ b/.claude/commands/code-review.md
@@ -0,0 +1,46 @@
+# Code Review
+
+Security and quality review of uncommitted changes.
+
+## Workflow
+
+1. Get changed files: `git diff --name-only HEAD` and `git diff --staged --name-only`
+2. Review each file for issues (see checklist below)
+3. Run automated checks: `mypy src/`, `ruff check src/`, `pytest -x`
+4. Generate report with severity, location, description, suggested fix
+5. Block commit if CRITICAL or HIGH issues found
+
+## Checklist
+
+### CRITICAL (Block)
+
+- Hardcoded credentials, API keys, tokens, passwords
+- SQL injection (must use parameterized queries)
+- Path traversal risks
+- Missing input validation on API endpoints
+- Missing authentication/authorization
+
+### HIGH (Block)
+
+- Functions > 50 lines, files > 800 lines
+- Nesting depth > 4 levels
+- Missing error handling or bare `except:`
+- `print()` in production code (use logging)
+- Mutable default arguments
+
+### MEDIUM (Warn)
+
+- Missing type hints on public functions
+- Missing tests for new code
+- Duplicate code, magic numbers
+- Unused imports/variables
+- TODO/FIXME comments
+
+## Report Format
+
+```
+[SEVERITY] file:line - Issue description
+  Suggested fix: ...
+```
+
+## Never Approve Code With Security Vulnerabilities!
\ No newline at end of file
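Reviewer sketch for three of the HIGH checklist items above (bare `except:`, `print()` calls, mutable default arguments), using the standard-library `ast` module and the report format the file defines. `find_high_issues` is a hypothetical helper; in practice `ruff check` may already cover these rules, so this is only an illustration of what the checks look for.

```python
import ast


def find_high_issues(source: str, filename: str = "<string>") -> list[str]:
    """Return '[HIGH] file:line - description' strings, ordered by line number."""
    found: list[tuple[int, str]] = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            found.append((node.lineno, "bare `except:`"))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # kw_defaults holds None for keyword-only args without a default.
            defaults = node.args.defaults + [
                d for d in node.args.kw_defaults if d is not None
            ]
            if any(isinstance(d, (ast.List, ast.Dict, ast.Set)) for d in defaults):
                found.append(
                    (node.lineno, f"mutable default argument in `{node.name}`")
                )
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "print"
        ):
            found.append((node.lineno, "`print()` call (use logging)"))
    found.sort()
    return [f"[HIGH] {filename}:{line} - {text}" for line, text in found]
```

The check only catches literal `[]`, `{}` and set displays as defaults; calls such as `list()` or `dict()` would need an extra rule.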
diff --git a/.claude/commands/e2e.md b/.claude/commands/e2e.md
new file mode 100644
index 0000000..6ac6d43
--- /dev/null
+++ b/.claude/commands/e2e.md
@@ -0,0 +1,40 @@
+# E2E Testing
+
+End-to-end testing for the Invoice Field Extraction API.
+
+## When to Use
+
+- Testing complete inference pipeline (PDF -> Fields)
+- Verifying API endpoints work end-to-end
+- Validating YOLO + OCR + field extraction integration
+- Pre-deployment verification
+
+## Workflow
+
+1. Ensure server is running: `python run_server.py`
+2. Run health check: `curl http://localhost:8000/api/v1/health`
+3. Run E2E tests: `pytest tests/e2e/ -v`
+4. Verify results and capture any failures
+
+## Critical Scenarios (Must Pass)
+
+1. Health check returns `{"status": "healthy", "model_loaded": true}`
+2. PDF upload returns valid response with fields
+3. Fields extracted with confidence scores
+4. Visualization image generated
+5. Cross-validation included for invoices with payment_line
+
+## Checklist
+
+- [ ] Server running on http://localhost:8000
+- [ ] Health check passes
+- [ ] PDF inference returns valid JSON
+- [ ] At least one field extracted
+- [ ] Visualization URL returns image
+- [ ] Response time < 10 seconds
+- [ ] No server errors in logs
+
+## Test Location
+
+E2E tests: `tests/e2e/`
+Sample fixtures: `tests/fixtures/`
\ No newline at end of file
diff --git a/.claude/commands/eval.md b/.claude/commands/eval.md
new file mode 100644
index 0000000..852c175
--- /dev/null
+++ b/.claude/commands/eval.md
@@ -0,0 +1,174 @@
+# Eval Command
+
+Evaluate model performance and field extraction accuracy.
+
+## Usage
+
+`/eval [model|accuracy|compare|report]`
+
+## Model Evaluation
+
+`/eval model`
+
+Evaluate YOLO model performance on test dataset:
+
+```bash
+# Run model evaluation
+python -m src.cli.train --model runs/train/invoice_fields/weights/best.pt --eval-only
+
+# Or use ultralytics directly
+yolo val model=runs/train/invoice_fields/weights/best.pt data=data.yaml
+```
+
+Output:
+```
+Model Evaluation: invoice_fields/best.pt
+========================================
+mAP@0.5:     93.5%
+mAP@0.5-0.95: 83.0%
+
+Per-class AP:
+- invoice_number:    95.2%
+- invoice_date:      94.8%
+- invoice_due_date:  93.1%
+- ocr_number:        91.5%
+- bankgiro:          92.3%
+- plusgiro:          90.8%
+- amount:            88.7%
+- supplier_org_num:  85.2%
+- payment_line:      82.4%
+- customer_number:   81.1%
+```
+
+## Accuracy Evaluation
+
+`/eval accuracy`
+
+Evaluate field extraction accuracy against ground truth:
+
+```bash
+# Run accuracy evaluation on labeled data
+python -m src.cli.infer --model runs/train/invoice_fields/weights/best.pt \
+    --input ~/invoice-data/test/*.pdf \
+    --ground-truth ~/invoice-data/test/labels.csv \
+    --output eval_results.json
+```
+
+Output:
+```
+Field Extraction Accuracy
+=========================
+Documents tested: 500
+
+Per-field accuracy:
+- InvoiceNumber:   98.9% (494/500)
+- InvoiceDate:     95.5% (478/500)
+- InvoiceDueDate:  95.9% (480/500)
+- OCR:             99.1% (496/500)
+- Bankgiro:        99.0% (495/500)
+- Plusgiro:        99.4% (497/500)
+- Amount:          91.3% (457/500)
+- supplier_org:    78.2% (391/500)
+
+Overall: 94.8%
+```
+
+## Compare Models
+
+`/eval compare`
+
+Compare two model versions:
+
+```bash
+# Compare old vs new model
+python -m src.cli.eval compare \
+    --model-a runs/train/invoice_v1/weights/best.pt \
+    --model-b runs/train/invoice_v2/weights/best.pt \
+    --test-data ~/invoice-data/test/
+```
+
+Output:
+```
+Model Comparison
+================
+                Model A     Model B     Delta
+mAP@0.5:        91.2%       93.5%       +2.3%
+Accuracy:       92.1%       94.8%       +2.7%
+Speed (ms):     1850        1520        -330
+
+Per-field improvements:
+- amount:       +4.2%
+- payment_line: +3.8%
+- customer_num: +2.1%
+
+Recommendation: Deploy Model B
+```
+
+## Generate Report
+
+`/eval report`
+
+Generate comprehensive evaluation report:
+
+```bash
+python -m src.cli.eval report --output eval_report.md
+```
+
+Output:
+```markdown
+# Evaluation Report
+Generated: 2026-01-25
+
+## Model Performance
+- Model: runs/train/invoice_fields/weights/best.pt
+- mAP@0.5: 93.5%
+- Training samples: 9,738
+
+## Field Extraction Accuracy
+| Field | Accuracy | Errors |
+|-------|----------|--------|
+| InvoiceNumber | 98.9% | 6 |
+| Amount | 91.3% | 43 |
+...
+
+## Error Analysis
+### Common Errors
+1. Amount: OCR misreads comma as period
+2. supplier_org: Missing from some invoices
+3. payment_line: Partially obscured by stamps
+
+## Recommendations
+1. Add more training data for low-accuracy fields
+2. Implement OCR error correction for amounts
+3. Consider confidence threshold tuning
+```
+
+## Quick Commands
+
+```bash
+# Evaluate model metrics
+yolo val model=runs/train/invoice_fields/weights/best.pt
+
+# Test inference on sample
+python -m src.cli.infer --input sample.pdf --output result.json --gpu
+
+# Check test coverage
+pytest --cov=src --cov-report=html
+```
+
+## Evaluation Metrics
+
+| Metric | Target | Current |
+|--------|--------|---------|
+| mAP@0.5 | >90% | 93.5% |
+| Overall Accuracy | >90% | 94.8% |
+| Test Coverage | >60% | 37% |
+| Tests Passing | 100% | 100% |
+
+## When to Evaluate
+
+- After training a new model
+- Before deploying to production
+- After adding new training data
+- When accuracy complaints arise
+- Weekly performance monitoring
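Reviewer sketch for the "Accuracy Evaluation" section of eval.md above: how a per-field accuracy figure could be computed from predictions and ground truth. `field_accuracy` is a hypothetical function, and exact string match is an assumption; the project's evaluation may normalize values (dates, amounts) before comparing.

```python
def field_accuracy(
    predictions: dict[str, dict[str, str]],
    ground_truth: dict[str, dict[str, str]],
) -> dict[str, float]:
    """Per-field exact-match accuracy, counted over fields present in ground truth.

    Both arguments map document id -> {field name -> value}. A missing
    prediction counts as an error for every labeled field of that document.
    """
    correct: dict[str, int] = {}
    total: dict[str, int] = {}
    for doc_id, expected in ground_truth.items():
        predicted = predictions.get(doc_id, {})
        for field, value in expected.items():
            total[field] = total.get(field, 0) + 1
            if predicted.get(field) == value:
                correct[field] = correct.get(field, 0) + 1
    return {field: correct.get(field, 0) / count for field, count in total.items()}
```

Counting only labeled fields means a field absent from an invoice's ground truth does not lower that field's score, which matters for fields that are not present on every invoice.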
diff --git a/.claude/commands/learn.md b/.claude/commands/learn.md
new file mode 100644
index 0000000..9899af1
--- /dev/null
+++ b/.claude/commands/learn.md
@@ -0,0 +1,70 @@
+# /learn - Extract Reusable Patterns
+
+Analyze the current session and extract any patterns worth saving as skills.
+
+## Trigger
+
+Run `/learn` at any point during a session when you've solved a non-trivial problem.
+
+## What to Extract
+
+Look for:
+
+1. **Error Resolution Patterns**
+   - What error occurred?
+   - What was the root cause?
+   - What fixed it?
+   - Is this reusable for similar errors?
+
+2. **Debugging Techniques**
+   - Non-obvious debugging steps
+   - Tool combinations that worked
+   - Diagnostic patterns
+
+3. **Workarounds**
+   - Library quirks
+   - API limitations
+   - Version-specific fixes
+
+4. **Project-Specific Patterns**
+   - Codebase conventions discovered
+   - Architecture decisions made
+   - Integration patterns
+
+## Output Format
+
+Create a skill file at `~/.claude/skills/learned/[pattern-name].md`:
+
+```markdown
+# [Descriptive Pattern Name]
+
+**Extracted:** [Date]
+**Context:** [Brief description of when this applies]
+
+## Problem
+[What problem this solves - be specific]
+
+## Solution
+[The pattern/technique/workaround]
+
+## Example
+[Code example if applicable]
+
+## When to Use
+[Trigger conditions - what should activate this skill]
+```
+
+## Process
+
+1. Review the session for extractable patterns
+2. Identify the most valuable/reusable insight
+3. Draft the skill file
+4. Ask user to confirm before saving
+5. Save to `~/.claude/skills/learned/`
+
+## Notes
+
+- Don't extract trivial fixes (typos, simple syntax errors)
+- Don't extract one-time issues (specific API outages, etc.)
+- Focus on patterns that will save time in future sessions
+- Keep skills focused - one pattern per skill
diff --git a/.claude/commands/orchestrate.md b/.claude/commands/orchestrate.md
new file mode 100644
index 0000000..30ac2b8
--- /dev/null
+++ b/.claude/commands/orchestrate.md
@@ -0,0 +1,172 @@
+# Orchestrate Command
+
+Sequential agent workflow for complex tasks.
+
+## Usage
+
+`/orchestrate [workflow-type] [task-description]`
+
+## Workflow Types
+
+### feature
+Full feature implementation workflow:
+```
+planner -> tdd-guide -> code-reviewer -> security-reviewer
+```
+
+### bugfix
+Bug investigation and fix workflow:
+```
+explorer -> tdd-guide -> code-reviewer
+```
+
+### refactor
+Safe refactoring workflow:
+```
+architect -> code-reviewer -> tdd-guide
+```
+
+### security
+Security-focused review:
+```
+security-reviewer -> code-reviewer -> architect
+```
+
+## Execution Pattern
+
+For each agent in the workflow:
+
+1. **Invoke agent** with context from previous agent
+2. **Collect output** as structured handoff document
+3. **Pass to next agent** in chain
+4. **Aggregate results** into final report
+
+## Handoff Document Format
+
+Between agents, create handoff document:
+
+```markdown
+## HANDOFF: [previous-agent] -> [next-agent]
+
+### Context
+[Summary of what was done]
+
+### Findings
+[Key discoveries or decisions]
+
+### Files Modified
+[List of files touched]
+
+### Open Questions
+[Unresolved items for next agent]
+
+### Recommendations
+[Suggested next steps]
+```
+
+## Example: Feature Workflow
+
+```
+/orchestrate feature "Add user authentication"
+```
+
+Executes:
+
+1. **Planner Agent**
+   - Analyzes requirements
+   - Creates implementation plan
+   - Identifies dependencies
+   - Output: `HANDOFF: planner -> tdd-guide`
+
+2. **TDD Guide Agent**
+   - Reads planner handoff
+   - Writes tests first
+   - Implements to pass tests
+   - Output: `HANDOFF: tdd-guide -> code-reviewer`
+
+3. **Code Reviewer Agent**
+   - Reviews implementation
+   - Checks for issues
+   - Suggests improvements
+   - Output: `HANDOFF: code-reviewer -> security-reviewer`
+
+4. **Security Reviewer Agent**
+   - Security audit
+   - Vulnerability check
+   - Final approval
+   - Output: Final Report
+
+## Final Report Format
+
+```
+ORCHESTRATION REPORT
+====================
+Workflow: feature
+Task: Add user authentication
+Agents: planner -> tdd-guide -> code-reviewer -> security-reviewer
+
+SUMMARY
+-------
+[One paragraph summary]
+
+AGENT OUTPUTS
+-------------
+Planner: [summary]
+TDD Guide: [summary]
+Code Reviewer: [summary]
+Security Reviewer: [summary]
+
+FILES CHANGED
+-------------
+[List all files modified]
+
+TEST RESULTS
+------------
+[Test pass/fail summary]
+
+SECURITY STATUS
+---------------
+[Security findings]
+
+RECOMMENDATION
+--------------
+[SHIP / NEEDS WORK / BLOCKED]
+```
+
+## Parallel Execution
+
+For independent checks, run agents in parallel:
+
+```markdown
+### Parallel Phase
+Run simultaneously:
+- code-reviewer (quality)
+- security-reviewer (security)
+- architect (design)
+
+### Merge Results
+Combine outputs into single report
+```
+
+## Arguments
+
+$ARGUMENTS:
+- `feature <description>` - Full feature workflow
+- `bugfix <description>` - Bug fix workflow
+- `refactor <description>` - Refactoring workflow
+- `security <description>` - Security review workflow
+- `custom <agents> <description>` - Custom agent sequence
+
+## Custom Workflow Example
+
+```
+/orchestrate custom "architect,tdd-guide,code-reviewer" "Redesign caching layer"
+```
+
+## Tips
+
+1. **Start with planner** for complex features
+2. **Always include code-reviewer** before merge
+3. **Use security-reviewer** for auth/payment/PII
+4. **Keep handoffs concise** - focus on what next agent needs
+5. **Run verification** between agents if needed
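Reviewer sketch for the orchestrate.md workflow types and handoff format above: resolving a workflow name (or a custom comma-separated list) into an agent sequence and generating the handoff headings between consecutive agents. `resolve_agents` and `handoff_headers` are hypothetical helpers; the agent names are taken from the document.

```python
WORKFLOWS: dict[str, list[str]] = {
    "feature": ["planner", "tdd-guide", "code-reviewer", "security-reviewer"],
    "bugfix": ["explorer", "tdd-guide", "code-reviewer"],
    "refactor": ["architect", "code-reviewer", "tdd-guide"],
    "security": ["security-reviewer", "code-reviewer", "architect"],
}


def resolve_agents(workflow: str, custom_agents: str = "") -> list[str]:
    """Return the agent sequence for a workflow type or a custom agent list."""
    if workflow == "custom":
        agents = [name.strip() for name in custom_agents.split(",") if name.strip()]
        if not agents:
            raise ValueError("custom workflow needs at least one agent")
        return agents
    if workflow not in WORKFLOWS:
        raise ValueError(f"unknown workflow type: {workflow}")
    return list(WORKFLOWS[workflow])


def handoff_headers(agents: list[str]) -> list[str]:
    """One handoff heading per consecutive pair; the last agent writes the final report."""
    return [f"## HANDOFF: {prev} -> {nxt}" for prev, nxt in zip(agents, agents[1:])]
```

For the feature workflow this yields three handoffs, matching the example in which the security reviewer's output is the final report.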
diff --git a/.claude/commands/plan.md b/.claude/commands/plan.md
new file mode 100644
index 0000000..3acf686
--- /dev/null
+++ b/.claude/commands/plan.md
@@ -0,0 +1,113 @@
+---
+description: Restate requirements, assess risks, and create step-by-step implementation plan. WAIT for user CONFIRM before touching any code.
+---
+
+# Plan Command
+
+This command invokes the **planner** agent to create a comprehensive implementation plan before writing any code.
+
+## What This Command Does
+
+1. **Restate Requirements** - Clarify what needs to be built
+2. **Identify Risks** - Surface potential issues and blockers
+3. **Create Step Plan** - Break down implementation into phases
+4. **Wait for Confirmation** - MUST receive user approval before proceeding
+
+## When to Use
+
+Use `/plan` when:
+- Starting a new feature
+- Making significant architectural changes
+- Working on complex refactoring
+- Multiple files/components will be affected
+- Requirements are unclear or ambiguous
+
+## How It Works
+
+The planner agent will:
+
+1. **Analyze the request** and restate requirements in clear terms
+2. **Break down into phases** with specific, actionable steps
+3. **Identify dependencies** between components
+4. **Assess risks** and potential blockers
+5. **Estimate complexity** (High/Medium/Low)
+6. **Present the plan** and WAIT for your explicit confirmation
+
+## Example Usage
+
+```
+User: /plan I need to add real-time notifications when markets resolve
+
+Agent (planner):
+# Implementation Plan: Real-Time Market Resolution Notifications
+
+## Requirements Restatement
+- Send notifications to users when markets they're watching resolve
+- Support multiple notification channels (in-app, email, webhook)
+- Ensure notifications are delivered reliably
+- Include market outcome and user's position result
+
+## Implementation Phases
+
+### Phase 1: Database Schema
+- Add notifications table with columns: id, user_id, market_id, type, status, created_at
+- Add user_notification_preferences table for channel preferences
+- Create indexes on user_id and market_id for performance
+
+### Phase 2: Notification Service
+- Create notification service in lib/notifications.ts
+- Implement notification queue using BullMQ/Redis
+- Add retry logic for failed deliveries
+- Create notification templates
+
+### Phase 3: Integration Points
+- Hook into market resolution logic (when status changes to "resolved")
+- Query all users with positions in market
+- Enqueue notifications for each user
+
+### Phase 4: Frontend Components
+- Create NotificationBell component in header
+- Add NotificationList modal
+- Implement real-time updates via Supabase subscriptions
+- Add notification preferences page
+
+## Dependencies
+- Redis (for queue)
+- Email service (SendGrid/Resend)
+- Supabase real-time subscriptions
+
+## Risks
+- HIGH: Email deliverability (SPF/DKIM required)
+- MEDIUM: Performance with 1000+ users per market
+- MEDIUM: Notification spam if markets resolve frequently
+- LOW: Real-time subscription overhead
+
+## Estimated Complexity: MEDIUM
+- Backend: 4-6 hours
+- Frontend: 3-4 hours
+- Testing: 2-3 hours
+- Total: 9-13 hours
+
+**WAITING FOR CONFIRMATION**: Proceed with this plan? (yes/no/modify)
+```
+
+## Important Notes
+
+**CRITICAL**: The planner agent will **NOT** write any code until you explicitly confirm the plan with "yes" or "proceed" or similar affirmative response.
+
+If you want changes, respond with:
+- "modify: [your changes]"
+- "different approach: [alternative]"
+- "skip phase 2 and do phase 3 first"
+
+## Integration with Other Commands
+
+After planning:
+- Use `/tdd` to implement with test-driven development
+- Use `/build-fix` if build errors occur
+- Use `/code-review` to review completed implementation
+
+## Related Agents
+
+This command invokes the `planner` agent located at:
+`~/.claude/agents/planner.md`
diff --git a/.claude/commands/refactor-clean.md b/.claude/commands/refactor-clean.md
new file mode 100644
index 0000000..6f5e250
--- /dev/null
+++ b/.claude/commands/refactor-clean.md
@@ -0,0 +1,28 @@
+# Refactor Clean
+
+Safely identify and remove dead code with test verification:
+
+1. Run dead code analysis tools:
+   - knip: Find unused exports and files
+   - depcheck: Find unused dependencies
+   - ts-prune: Find unused TypeScript exports
+
+2. Generate comprehensive report in .reports/dead-code-analysis.md
+
+3. Categorize findings by severity:
+   - SAFE: Test files, unused utilities
+   - CAUTION: API routes, components
+   - DANGER: Config files, main entry points
+
+4. Propose safe deletions only
+
+5. Before each deletion:
+   - Run full test suite
+   - Verify tests pass
+   - Apply change
+   - Re-run tests
+   - Rollback if tests fail
+
+6. Show summary of cleaned items
+
+Never delete code without running tests first!
diff --git a/.claude/commands/setup-pm.md b/.claude/commands/setup-pm.md
new file mode 100644
index 0000000..87224b9
--- /dev/null
+++ b/.claude/commands/setup-pm.md
@@ -0,0 +1,80 @@
+---
+description: Configure your preferred package manager (npm/pnpm/yarn/bun)
+disable-model-invocation: true
+---
+
+# Package Manager Setup
+
+Configure your preferred package manager for this project or globally.
+
+## Usage
+
+```bash
+# Detect current package manager
+node scripts/setup-package-manager.js --detect
+
+# Set global preference
+node scripts/setup-package-manager.js --global pnpm
+
+# Set project preference
+node scripts/setup-package-manager.js --project bun
+
+# List available package managers
+node scripts/setup-package-manager.js --list
+```
+
+## Detection Priority
+
+When determining which package manager to use, the following order is checked:
+
+1. **Environment variable**: `CLAUDE_PACKAGE_MANAGER`
+2. **Project config**: `.claude/package-manager.json`
+3. **package.json**: `packageManager` field
+4. **Lock file**: Presence of package-lock.json, yarn.lock, pnpm-lock.yaml, or bun.lockb
+5. **Global config**: `~/.claude/package-manager.json`
+6. **Fallback**: First available package manager (pnpm > bun > yarn > npm)
+
+## Configuration Files
+
+### Global Configuration
+```json
+// ~/.claude/package-manager.json
+{
+  "packageManager": "pnpm"
+}
+```
+
+### Project Configuration
+```json
+// .claude/package-manager.json
+{
+  "packageManager": "bun"
+}
+```
+
+### package.json
+```json
+{
+  "packageManager": "pnpm@8.6.0"
+}
+```
+
+## Environment Variable
+
+Set `CLAUDE_PACKAGE_MANAGER` to override all other detection methods:
+
+```bash
+# Windows (PowerShell)
+$env:CLAUDE_PACKAGE_MANAGER = "pnpm"
+
+# macOS/Linux
+export CLAUDE_PACKAGE_MANAGER=pnpm
+```
+
+## Run the Detection
+
+To see current package manager detection results, run:
+
+```bash
+node scripts/setup-package-manager.js --detect
+```
diff --git a/.claude/commands/tdd.md b/.claude/commands/tdd.md
new file mode 100644
index 0000000..02bdb2d
--- /dev/null
+++ b/.claude/commands/tdd.md
@@ -0,0 +1,326 @@
+---
+description: Enforce test-driven development workflow. Scaffold interfaces, generate tests FIRST, then implement minimal code to pass. Ensure 80%+ coverage.
+---
+
+# TDD Command
+
+This command invokes the **tdd-guide** agent to enforce test-driven development methodology.
+
+## What This Command Does
+
+1. **Scaffold Interfaces** - Define types/interfaces first
+2. **Generate Tests First** - Write failing tests (RED)
+3. **Implement Minimal Code** - Write just enough to pass (GREEN)
+4. **Refactor** - Improve code while keeping tests green (REFACTOR)
+5. **Verify Coverage** - Ensure 80%+ test coverage
+
+## When to Use
+
+Use `/tdd` when:
+- Implementing new features
+- Adding new functions/components
+- Fixing bugs (write test that reproduces bug first)
+- Refactoring existing code
+- Building critical business logic
+
+## How It Works
+
+The tdd-guide agent will:
+
+1. **Define interfaces** for inputs/outputs
+2. **Write tests that will FAIL** (because code doesn't exist yet)
+3. **Run tests** and verify they fail for the right reason
+4. **Write minimal implementation** to make tests pass
+5. **Run tests** and verify they pass
+6. **Refactor** code while keeping tests green
+7. **Check coverage** and add more tests if below 80%
+
+## TDD Cycle
+
+```
+RED → GREEN → REFACTOR → REPEAT
+
+RED:      Write a failing test
+GREEN:    Write minimal code to pass
+REFACTOR: Improve code, keep tests passing
+REPEAT:   Next feature/scenario
+```
+
+## Example Usage
+
+```
+User: /tdd I need a function to calculate market liquidity score
+
+Agent (tdd-guide):
+# TDD Session: Market Liquidity Score Calculator
+
+## Step 1: Define Interface (SCAFFOLD)
+
+```typescript
+// lib/liquidity.ts
+export interface MarketData {
+  totalVolume: number
+  bidAskSpread: number
+  activeTraders: number
+  lastTradeTime: Date
+}
+
+export function calculateLiquidityScore(market: MarketData): number {
+  // TODO: Implementation
+  throw new Error('Not implemented')
+}
+```
+
+## Step 2: Write Failing Test (RED)
+
+```typescript
+// lib/liquidity.test.ts
+import { calculateLiquidityScore } from './liquidity'
+
+describe('calculateLiquidityScore', () => {
+  it('should return high score for liquid market', () => {
+    const market = {
+      totalVolume: 100000,
+      bidAskSpread: 0.01,
+      activeTraders: 500,
+      lastTradeTime: new Date()
+    }
+
+    const score = calculateLiquidityScore(market)
+
+    expect(score).toBeGreaterThan(80)
+    expect(score).toBeLessThanOrEqual(100)
+  })
+
+  it('should return low score for illiquid market', () => {
+    const market = {
+      totalVolume: 100,
+      bidAskSpread: 0.5,
+      activeTraders: 2,
+      lastTradeTime: new Date(Date.now() - 86400000) // 1 day ago
+    }
+
+    const score = calculateLiquidityScore(market)
+
+    expect(score).toBeLessThan(30)
+    expect(score).toBeGreaterThanOrEqual(0)
+  })
+
+  it('should handle edge case: zero volume', () => {
+    const market = {
+      totalVolume: 0,
+      bidAskSpread: 0,
+      activeTraders: 0,
+      lastTradeTime: new Date()
+    }
+
+    const score = calculateLiquidityScore(market)
+
+    expect(score).toBe(0)
+  })
+})
+```
+
+## Step 3: Run Tests - Verify FAIL
+
+```bash
+npm test lib/liquidity.test.ts
+
+FAIL lib/liquidity.test.ts
+  ✕ should return high score for liquid market (2 ms)
+    Error: Not implemented (the other two tests fail the same way)
+
+3 tests failed, 0 passed
+```
+
+✅ Tests fail as expected. Ready to implement.
+
+## Step 4: Implement Minimal Code (GREEN)
+
+```typescript
+// lib/liquidity.ts
+export function calculateLiquidityScore(market: MarketData): number {
+  // Handle zero volume edge case
+  if (market.totalVolume === 0) {
+    return 0
+  }
+
+  // Calculate component scores (0-100 scale)
+  const volumeScore = Math.min(market.totalVolume / 1000, 100)
+  const spreadScore = Math.max(100 - (market.bidAskSpread * 1000), 0)
+  const traderScore = Math.min(market.activeTraders / 10, 100)
+
+  // Recent activity bonus
+  const hoursSinceLastTrade = (Date.now() - market.lastTradeTime.getTime()) / (1000 * 60 * 60)
+  const recencyScore = Math.max(100 - (hoursSinceLastTrade * 10), 0)
+
+  // Weighted average
+  const score = (
+    volumeScore * 0.4 +
+    spreadScore * 0.3 +
+    traderScore * 0.2 +
+    recencyScore * 0.1
+  )
+
+  return Math.min(Math.max(score, 0), 100) // Clamp to 0-100
+}
+```
+
+## Step 5: Run Tests - Verify PASS
+
+```bash
+npm test lib/liquidity.test.ts
+
+PASS lib/liquidity.test.ts
+  ✓ should return high score for liquid market (3 ms)
+  ✓ should return low score for illiquid market (2 ms)
+  ✓ should handle edge case: zero volume (1 ms)
+
+3 tests passed
+```
+
+✅ All tests passing!
+
+## Step 6: Refactor (IMPROVE)
+
+```typescript
+// lib/liquidity.ts - Refactored with constants and better readability
+const WEIGHTS = {
+  VOLUME: 0.4,
+  SPREAD: 0.3,
+  TRADERS: 0.2,
+  RECENCY: 0.1,
+} as const
+
+const SCALE_FACTORS = {
+  VOLUME: 1000,
+  SPREAD: 1000,
+  TRADERS: 10,
+  RECENCY_PENALTY: 10,
+} as const
+
+function clamp(value: number, min: number, max: number): number {
+  return Math.min(Math.max(value, min), max)
+}
+
+export function calculateLiquidityScore(market: MarketData): number {
+  if (market.totalVolume === 0) return 0
+
+  const volumeScore = Math.min(market.totalVolume / SCALE_FACTORS.VOLUME, 100)
+  const spreadScore = clamp(100 - (market.bidAskSpread * SCALE_FACTORS.SPREAD), 0, 100)
+  const traderScore = Math.min(market.activeTraders / SCALE_FACTORS.TRADERS, 100)
+
+  const hoursSinceLastTrade = (Date.now() - market.lastTradeTime.getTime()) / (1000 * 60 * 60)
+  const recencyScore = clamp(100 - (hoursSinceLastTrade * SCALE_FACTORS.RECENCY_PENALTY), 0, 100)
+
+  const weightedScore =
+    volumeScore * WEIGHTS.VOLUME +
+    spreadScore * WEIGHTS.SPREAD +
+    traderScore * WEIGHTS.TRADERS +
+    recencyScore * WEIGHTS.RECENCY
+
+  return clamp(weightedScore, 0, 100)
+}
+```
+
+## Step 7: Verify Tests Still Pass
+
+```bash
+npm test lib/liquidity.test.ts
+
+PASS lib/liquidity.test.ts
+  ✓ should return high score for liquid market (3 ms)
+  ✓ should return low score for illiquid market (2 ms)
+  ✓ should handle edge case: zero volume (1 ms)
+
+3 tests passed
+```
+
+✅ Refactoring complete, tests still passing!
+
+## Step 8: Check Coverage
+
+```bash
+npm test -- --coverage lib/liquidity.test.ts
+
+File           | % Stmts | % Branch | % Funcs | % Lines
+---------------|---------|----------|---------|--------
+liquidity.ts   |   100   |   100    |   100   |   100
+
+Coverage: 100% ✅ (Target: 80%)
+```
+
+✅ TDD session complete!
+```
+
+## TDD Best Practices
+
+**DO:**
+- ✅ Write the test FIRST, before any implementation
+- ✅ Run tests and verify they FAIL before implementing
+- ✅ Write minimal code to make tests pass
+- ✅ Refactor only after tests are green
+- ✅ Add edge cases and error scenarios
+- ✅ Aim for 80%+ coverage (100% for critical code)
+
+**DON'T:**
+- ❌ Write implementation before tests
+- ❌ Skip running tests after each change
+- ❌ Write too much code at once
+- ❌ Ignore failing tests
+- ❌ Test implementation details (test behavior instead)
+- ❌ Mock everything (prefer integration tests)
+
+## Test Types to Include
+
+**Unit Tests** (Function-level):
+- Happy path scenarios
+- Edge cases (empty, null, max values)
+- Error conditions
+- Boundary values
+
+**Integration Tests** (Component-level):
+- API endpoints
+- Database operations
+- External service calls
+- React components with hooks
+
+**E2E Tests** (use `/e2e` command):
+- Critical user flows
+- Multi-step processes
+- Full stack integration
+
+## Coverage Requirements
+
+- **80% minimum** for all code
+- **100% required** for:
+  - Financial calculations
+  - Authentication logic
+  - Security-critical code
+  - Core business logic
+
+## Important Notes
+
+**MANDATORY**: Tests must be written BEFORE implementation. The TDD cycle is:
+
+1. **RED** - Write failing test
+2. **GREEN** - Implement to pass
+3. **REFACTOR** - Improve code
+
+Never skip the RED phase. Never write code before tests.
+
+## Integration with Other Commands
+
+- Use `/plan` first to understand what to build
+- Use `/tdd` to implement with tests
+- Use `/build-and-fix` if build errors occur
+- Use `/code-review` to review implementation
+- Use `/test-coverage` to verify coverage
+
+## Related Agents
+
+This command invokes the `tdd-guide` agent located at:
+`~/.claude/agents/tdd-guide.md`
+
+And can reference the `tdd-workflow` skill at:
+`~/.claude/skills/tdd-workflow/`
diff --git a/.claude/commands/test-coverage.md b/.claude/commands/test-coverage.md
new file mode 100644
index 0000000..754eabf
--- /dev/null
+++ b/.claude/commands/test-coverage.md
@@ -0,0 +1,27 @@
+# Test Coverage
+
+Analyze test coverage and generate missing tests:
+
+1. Run tests with coverage: `npm test -- --coverage` or `pnpm test --coverage`
+
+2. Analyze coverage report (coverage/coverage-summary.json)
+
+3. Identify files below 80% coverage threshold
+
+4. For each under-covered file:
+   - Analyze untested code paths
+   - Generate unit tests for functions
+   - Generate integration tests for APIs
+   - Generate E2E tests for critical flows
+
+5. Verify new tests pass
+
+6. Show before/after coverage metrics
+
+7. Ensure project reaches 80%+ overall coverage
+
+Focus on:
+- Happy path scenarios
+- Error handling
+- Edge cases (null, undefined, empty)
+- Boundary conditions
diff --git a/.claude/commands/update-codemaps.md b/.claude/commands/update-codemaps.md
new file mode 100644
index 0000000..f363a05
--- /dev/null
+++ b/.claude/commands/update-codemaps.md
@@ -0,0 +1,17 @@
+# Update Codemaps
+
+Analyze the codebase structure and update architecture documentation:
+
+1. Scan all source files for imports, exports, and dependencies
+2. Generate token-lean codemaps as the following files:
+   - codemaps/architecture.md - Overall architecture
+   - codemaps/backend.md - Backend structure
+   - codemaps/frontend.md - Frontend structure
+   - codemaps/data.md - Data models and schemas
+
+3. Calculate diff percentage from previous version
+4. If changes > 30%, request user approval before updating
+5. Add freshness timestamp to each codemap
+6. Save reports to .reports/codemap-diff.txt
+
+Use TypeScript/Node.js for analysis. Focus on high-level structure, not implementation details.
diff --git a/.claude/commands/update-docs.md b/.claude/commands/update-docs.md
new file mode 100644
index 0000000..3dd0f89
--- /dev/null
+++ b/.claude/commands/update-docs.md
@@ -0,0 +1,31 @@
+# Update Documentation
+
+Sync documentation from source-of-truth:
+
+1. Read package.json scripts section
+   - Generate scripts reference table
+   - Include descriptions from comments
+
+2. Read .env.example
+   - Extract all environment variables
+   - Document purpose and format
+
+3. Generate docs/CONTRIB.md with:
+   - Development workflow
+   - Available scripts
+   - Environment setup
+   - Testing procedures
+
+4. Generate docs/RUNBOOK.md with:
+   - Deployment procedures
+   - Monitoring and alerts
+   - Common issues and fixes
+   - Rollback procedures
+
+5. Identify obsolete documentation:
+   - Find docs not modified in 90+ days
+   - List for manual review
+
+6. Show diff summary
+
+Single source of truth: package.json and .env.example
diff --git a/.claude/commands/verify.md b/.claude/commands/verify.md
new file mode 100644
index 0000000..5f628b1
--- /dev/null
+++ b/.claude/commands/verify.md
@@ -0,0 +1,59 @@
+# Verification Command
+
+Run comprehensive verification on current codebase state.
+
+## Instructions
+
+Execute verification in this exact order:
+
+1. **Build Check**
+   - Run the build command for this project
+   - If it fails, report errors and STOP
+
+2. **Type Check**
+   - Run TypeScript/type checker
+   - Report all errors with file:line
+
+3. **Lint Check**
+   - Run linter
+   - Report warnings and errors
+
+4. **Test Suite**
+   - Run all tests
+   - Report pass/fail count
+   - Report coverage percentage
+
+5. **Console.log Audit**
+   - Search for console.log in source files
+   - Report locations
+
+6. **Git Status**
+   - Show uncommitted changes
+   - Show files modified since last commit
+
+## Output
+
+Produce a concise verification report:
+
+```
+VERIFICATION: [PASS/FAIL]
+
+Build:    [OK/FAIL]
+Types:    [OK/X errors]
+Lint:     [OK/X issues]
+Tests:    [X/Y passed, Z% coverage]
+Secrets:  [OK/X found]
+Logs:     [OK/X console.logs]
+
+Ready for PR: [YES/NO]
+```
+
+If any critical issues, list them with fix suggestions.
+
+## Arguments
+
+$ARGUMENTS can be:
+- `quick` - Only build + types
+- `full` - All checks (default)
+- `pre-commit` - Checks relevant for commits
+- `pre-pr` - Full checks plus security scan
diff --git a/.claude/hooks/hooks.json b/.claude/hooks/hooks.json
new file mode 100644
index 0000000..ea9cdc6
--- /dev/null
+++ b/.claude/hooks/hooks.json
@@ -0,0 +1,157 @@
+{
+  "$schema": "https://json.schemastore.org/claude-code-settings.json",
+  "hooks": {
+    "PreToolUse": [
+      {
+        "matcher": "tool == \"Bash\" && tool_input.command matches \"(npm run dev|pnpm( run)? dev|yarn dev|bun run dev)\"",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node -e \"console.error('[Hook] BLOCKED: Dev server must run in tmux for log access');console.error('[Hook] Use: tmux new-session -d -s dev \\\"npm run dev\\\"');console.error('[Hook] Then: tmux attach -t dev');process.exit(1)\""
+          }
+        ],
+        "description": "Block dev servers outside tmux - ensures you can access logs"
+      },
+      {
+        "matcher": "tool == \"Bash\" && tool_input.command matches \"(npm (install|test)|pnpm (install|test)|yarn (install|test)?|bun (install|test)|cargo build|make|docker|pytest|vitest|playwright)\"",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node -e \"if(!process.env.TMUX){console.error('[Hook] Consider running in tmux for session persistence');console.error('[Hook] tmux new -s dev  |  tmux attach -t dev')}\""
+          }
+        ],
+        "description": "Reminder to use tmux for long-running commands"
+      },
+      {
+        "matcher": "tool == \"Bash\" && tool_input.command matches \"git push\"",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node -e \"console.error('[Hook] Review changes before push...');console.error('[Hook] Continuing with push (remove this hook to add interactive review)')\""
+          }
+        ],
+        "description": "Reminder before git push to review changes"
+      },
+      {
+        "matcher": "tool == \"Write\" && tool_input.file_path matches \"\\\\.(md|txt)$\" && !(tool_input.file_path matches \"README\\\\.md|CLAUDE\\\\.md|AGENTS\\\\.md|CONTRIBUTING\\\\.md\")",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node -e \"const fs=require('fs');let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{const i=JSON.parse(d);const p=i.tool_input?.file_path||'';if(/\\.(md|txt)$/.test(p)&&!/(README|CLAUDE|AGENTS|CONTRIBUTING)\\.md$/.test(p)){console.error('[Hook] BLOCKED: Unnecessary documentation file creation');console.error('[Hook] File: '+p);console.error('[Hook] Use README.md for documentation instead');process.exit(1)}console.log(d)})\""
+          }
+        ],
+        "description": "Block creation of random .md files - keeps docs consolidated"
+      },
+      {
+        "matcher": "tool == \"Edit\" || tool == \"Write\"",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node \"${CLAUDE_PLUGIN_ROOT}/scripts/hooks/suggest-compact.js\""
+          }
+        ],
+        "description": "Suggest manual compaction at logical intervals"
+      }
+    ],
+    "PreCompact": [
+      {
+        "matcher": "*",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node \"${CLAUDE_PLUGIN_ROOT}/scripts/hooks/pre-compact.js\""
+          }
+        ],
+        "description": "Save state before context compaction"
+      }
+    ],
+    "SessionStart": [
+      {
+        "matcher": "*",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node \"${CLAUDE_PLUGIN_ROOT}/scripts/hooks/session-start.js\""
+          }
+        ],
+        "description": "Load previous context and detect package manager on new session"
+      }
+    ],
+    "PostToolUse": [
+      {
+        "matcher": "tool == \"Bash\"",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node -e \"let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{const i=JSON.parse(d);const cmd=i.tool_input?.command||'';if(/gh pr create/.test(cmd)){const out=i.tool_output?.output||'';const m=out.match(/https:\\/\\/github.com\\/[^/]+\\/[^/]+\\/pull\\/\\d+/);if(m){console.error('[Hook] PR created: '+m[0]);const repo=m[0].replace(/https:\\/\\/github.com\\/([^/]+\\/[^/]+)\\/pull\\/\\d+/,'$1');const pr=m[0].replace(/.*\\/pull\\/(\\d+)/,'$1');console.error('[Hook] To review: gh pr review '+pr+' --repo '+repo)}}console.log(d)})\""
+          }
+        ],
+        "description": "Log PR URL and provide review command after PR creation"
+      },
+      {
+        "matcher": "tool == \"Edit\" && tool_input.file_path matches \"\\\\.(ts|tsx|js|jsx)$\"",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node -e \"const{execSync}=require('child_process');const fs=require('fs');let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{const i=JSON.parse(d);const p=i.tool_input?.file_path;if(p&&fs.existsSync(p)){try{execSync('npx prettier --write \"'+p+'\"',{stdio:['pipe','pipe','pipe']})}catch(e){}}console.log(d)})\""
+          }
+        ],
+        "description": "Auto-format JS/TS files with Prettier after edits"
+      },
+      {
+        "matcher": "tool == \"Edit\" && tool_input.file_path matches \"\\\\.(ts|tsx)$\"",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node -e \"const{execSync}=require('child_process');const fs=require('fs');const path=require('path');let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{const i=JSON.parse(d);const p=i.tool_input?.file_path;if(p&&fs.existsSync(p)){let dir=path.dirname(p);while(dir!==path.dirname(dir)&&!fs.existsSync(path.join(dir,'tsconfig.json'))){dir=path.dirname(dir)}if(fs.existsSync(path.join(dir,'tsconfig.json'))){try{const r=execSync('npx tsc --noEmit --pretty false 2>&1',{cwd:dir,encoding:'utf8',stdio:['pipe','pipe','pipe']});const lines=r.split('\\n').filter(l=>l.includes(p)).slice(0,10);if(lines.length)console.error(lines.join('\\n'))}catch(e){const lines=(e.stdout||'').split('\\n').filter(l=>l.includes(p)).slice(0,10);if(lines.length)console.error(lines.join('\\n'))}}}console.log(d)})\""
+          }
+        ],
+        "description": "TypeScript check after editing .ts/.tsx files"
+      },
+      {
+        "matcher": "tool == \"Edit\" && tool_input.file_path matches \"\\\\.(ts|tsx|js|jsx)$\"",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node -e \"const fs=require('fs');let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{const i=JSON.parse(d);const p=i.tool_input?.file_path;if(p&&fs.existsSync(p)){const c=fs.readFileSync(p,'utf8');const lines=c.split('\\n');const matches=[];lines.forEach((l,idx)=>{if(/console\\.log/.test(l))matches.push((idx+1)+': '+l.trim())});if(matches.length){console.error('[Hook] WARNING: console.log found in '+p);matches.slice(0,5).forEach(m=>console.error(m));console.error('[Hook] Remove console.log before committing')}}console.log(d)})\""
+          }
+        ],
+        "description": "Warn about console.log statements after edits"
+      }
+    ],
+    "Stop": [
+      {
+        "matcher": "*",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node -e \"const{execSync}=require('child_process');const fs=require('fs');let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{try{execSync('git rev-parse --git-dir',{stdio:'pipe'})}catch{console.log(d);process.exit(0)}try{const files=execSync('git diff --name-only HEAD',{encoding:'utf8',stdio:['pipe','pipe','pipe']}).split('\\n').filter(f=>/\\.(ts|tsx|js|jsx)$/.test(f)&&fs.existsSync(f));let hasConsole=false;for(const f of files){if(fs.readFileSync(f,'utf8').includes('console.log')){console.error('[Hook] WARNING: console.log found in '+f);hasConsole=true}}if(hasConsole)console.error('[Hook] Remove console.log statements before committing')}catch(e){}console.log(d)})\""
+          }
+        ],
+        "description": "Check for console.log in modified files after each response"
+      }
+    ],
+    "SessionEnd": [
+      {
+        "matcher": "*",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node \"${CLAUDE_PLUGIN_ROOT}/scripts/hooks/session-end.js\""
+          }
+        ],
+        "description": "Persist session state on end"
+      },
+      {
+        "matcher": "*",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node \"${CLAUDE_PLUGIN_ROOT}/scripts/hooks/evaluate-session.js\""
+          }
+        ],
+        "description": "Evaluate session for extractable patterns"
+      }
+    ]
+  }
+}
diff --git a/.claude/hooks/memory-persistence/pre-compact.sh b/.claude/hooks/memory-persistence/pre-compact.sh
new file mode 100644
index 0000000..296fce9
--- /dev/null
+++ b/.claude/hooks/memory-persistence/pre-compact.sh
@@ -0,0 +1,36 @@
+#!/bin/bash
+# PreCompact Hook - Save state before context compaction
+#
+# Runs before Claude compacts context, giving you a chance to
+# preserve important state that might get lost in summarization.
+#
+# Hook config (in ~/.claude/settings.json):
+# {
+#   "hooks": {
+#     "PreCompact": [{
+#       "matcher": "*",
+#       "hooks": [{
+#         "type": "command",
+#         "command": "~/.claude/hooks/memory-persistence/pre-compact.sh"
+#       }]
+#     }]
+#   }
+# }
+
+SESSIONS_DIR="${HOME}/.claude/sessions"
+COMPACTION_LOG="${SESSIONS_DIR}/compaction-log.txt"
+
+mkdir -p "$SESSIONS_DIR"
+
+# Log compaction event with timestamp
+echo "[$(date '+%Y-%m-%d %H:%M:%S')] Context compaction triggered" >> "$COMPACTION_LOG"
+
+# If there's an active session file, note the compaction
+ACTIVE_SESSION=$(ls -t "$SESSIONS_DIR"/*.tmp 2>/dev/null | head -1)
+if [ -n "$ACTIVE_SESSION" ]; then
+  echo "" >> "$ACTIVE_SESSION"
+  echo "---" >> "$ACTIVE_SESSION"
+  echo "**[Compaction occurred at $(date '+%H:%M')]** - Context was summarized" >> "$ACTIVE_SESSION"
+fi
+
+echo "[PreCompact] State saved before compaction" >&2
diff --git a/.claude/hooks/memory-persistence/session-end.sh b/.claude/hooks/memory-persistence/session-end.sh
new file mode 100644
index 0000000..93b0f63
--- /dev/null
+++ b/.claude/hooks/memory-persistence/session-end.sh
@@ -0,0 +1,61 @@
+#!/bin/bash
+# SessionEnd Hook - Persist learnings when session ends
+#
+# Runs when the Claude session ends. Creates or updates the session log file
+# with a timestamp for continuity tracking.
+#
+# Hook config (in ~/.claude/settings.json):
+# {
+#   "hooks": {
+#     "SessionEnd": [{
+#       "matcher": "*",
+#       "hooks": [{
+#         "type": "command",
+#         "command": "~/.claude/hooks/memory-persistence/session-end.sh"
+#       }]
+#     }]
+#   }
+# }
+
+SESSIONS_DIR="${HOME}/.claude/sessions"
+TODAY=$(date '+%Y-%m-%d')
+SESSION_FILE="${SESSIONS_DIR}/${TODAY}-session.tmp"
+
+mkdir -p "$SESSIONS_DIR"
+
+# If session file exists for today, update the end time
+if [ -f "$SESSION_FILE" ]; then
+  # Update Last Updated timestamp
+  sed -i '' "s/\*\*Last Updated:\*\*.*/\*\*Last Updated:\*\* $(date '+%H:%M')/" "$SESSION_FILE" 2>/dev/null || \
+  sed -i "s/\*\*Last Updated:\*\*.*/\*\*Last Updated:\*\* $(date '+%H:%M')/" "$SESSION_FILE" 2>/dev/null
+  echo "[SessionEnd] Updated session file: $SESSION_FILE" >&2
+else
+  # Create new session file with template
+  cat > "$SESSION_FILE" << EOF
+# Session: $(date '+%Y-%m-%d')
+**Date:** $TODAY
+**Started:** $(date '+%H:%M')
+**Last Updated:** $(date '+%H:%M')
+
+---
+
+## Current State
+
+[Session context goes here]
+
+### Completed
+- [ ]
+
+### In Progress
+- [ ]
+
+### Notes for Next Session
+-
+
+### Context to Load
+\`\`\`
+[relevant files]
+\`\`\`
+EOF
+  echo "[SessionEnd] Created session file: $SESSION_FILE" >&2
+fi
diff --git a/.claude/hooks/memory-persistence/session-start.sh b/.claude/hooks/memory-persistence/session-start.sh
new file mode 100644
index 0000000..57a8c14
--- /dev/null
+++ b/.claude/hooks/memory-persistence/session-start.sh
@@ -0,0 +1,37 @@
+#!/bin/bash
+# SessionStart Hook - Load previous context on new session
+#
+# Runs when a new Claude session starts. Checks for recent session
+# files and notifies Claude of available context to load.
+#
+# Hook config (in ~/.claude/settings.json):
+# {
+#   "hooks": {
+#     "SessionStart": [{
+#       "matcher": "*",
+#       "hooks": [{
+#         "type": "command",
+#         "command": "~/.claude/hooks/memory-persistence/session-start.sh"
+#       }]
+#     }]
+#   }
+# }
+
+SESSIONS_DIR="${HOME}/.claude/sessions"
+LEARNED_DIR="${HOME}/.claude/skills/learned"
+
+# Check for recent session files (last 7 days)
+recent_sessions=$(find "$SESSIONS_DIR" -name "*.tmp" -mtime -7 2>/dev/null | wc -l | tr -d ' ')
+
+if [ "$recent_sessions" -gt 0 ]; then
+  latest=$(ls -t "$SESSIONS_DIR"/*.tmp 2>/dev/null | head -1)
+  echo "[SessionStart] Found $recent_sessions recent session(s)" >&2
+  echo "[SessionStart] Latest: $latest" >&2
+fi
+
+# Check for learned skills
+learned_count=$(find "$LEARNED_DIR" -name "*.md" 2>/dev/null | wc -l | tr -d ' ')
+
+if [ "$learned_count" -gt 0 ]; then
+  echo "[SessionStart] $learned_count learned skill(s) available in $LEARNED_DIR" >&2
+fi
diff --git a/.claude/hooks/strategic-compact/suggest-compact.sh b/.claude/hooks/strategic-compact/suggest-compact.sh
new file mode 100644
index 0000000..ea14920
--- /dev/null
+++ b/.claude/hooks/strategic-compact/suggest-compact.sh
@@ -0,0 +1,52 @@
+#!/bin/bash
+# Strategic Compact Suggester
+# Runs on PreToolUse or periodically to suggest manual compaction at logical intervals
+#
+# Why manual over auto-compact:
+# - Auto-compact happens at arbitrary points, often mid-task
+# - Strategic compacting preserves context through logical phases
+# - Compact after exploration, before execution
+# - Compact after completing a milestone, before starting next
+#
+# Hook config (in ~/.claude/settings.json):
+# {
+#   "hooks": {
+#     "PreToolUse": [{
+#       "matcher": "Edit|Write",
+#       "hooks": [{
+#         "type": "command",
+#         "command": "~/.claude/skills/strategic-compact/suggest-compact.sh"
+#       }]
+#     }]
+#   }
+# }
+#
+# Criteria for suggesting compact:
+# - Session has been running for extended period
+# - Large number of tool calls made
+# - Transitioning from research/exploration to implementation
+# - Plan has been finalized
+
+# Track tool call count in a temp file keyed by parent PID ($$ differs on every run; assumes hooks share one long-lived parent)
+COUNTER_FILE="/tmp/claude-tool-count-${PPID}"
+THRESHOLD=${COMPACT_THRESHOLD:-50}
+
+# Initialize or increment counter
+if [ -f "$COUNTER_FILE" ]; then
+  count=$(cat "$COUNTER_FILE")
+  count=$((count + 1))
+  echo "$count" > "$COUNTER_FILE"
+else
+  echo "1" > "$COUNTER_FILE"
+  count=1
+fi
+
+# Suggest compact after threshold tool calls
+if [ "$count" -eq "$THRESHOLD" ]; then
+  echo "[StrategicCompact] $THRESHOLD tool calls reached - consider /compact if transitioning phases" >&2
+fi
+
+# Suggest at regular intervals after threshold
+if [ "$count" -gt "$THRESHOLD" ] && [ $((count % 25)) -eq 0 ]; then
+  echo "[StrategicCompact] $count tool calls - good checkpoint for /compact if context is stale" >&2
+fi
diff --git a/.claude/settings.local.json b/.claude/settings.local.json
index 3c93f74..b4a6c77 100644
--- a/.claude/settings.local.json
+++ b/.claude/settings.local.json
@@ -75,7 +75,13 @@
       "Bash(wsl -e bash -c \"ls -la /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2/data/dataset/train/\")",
       "Bash(wsl -e bash -c \"ls -la /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2/data/structured_data/*.csv 2>/dev/null | head -20\")",
       "Bash(tasklist:*)",
-      "Bash(findstr:*)"
+      "Bash(findstr:*)",
+      "Bash(wsl bash -c \"ps aux | grep -E ''python.*train'' | grep -v grep\")",
+      "Bash(wsl bash -c \"ls -la /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2/runs/train/invoice_fields/\")",
+      "Bash(wsl bash -c \"cat /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2/runs/train/invoice_fields/results.csv\")",
+      "Bash(wsl bash -c \"ls -la /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2/runs/train/invoice_fields/weights/\")",
+      "Bash(wsl bash -c \"cat ''/mnt/c/Users/yaoji/AppData/Local/Temp/claude/c--Users-yaoji-git-ColaCoder-invoice-master-poc-v2/tasks/b8d8565.output'' 2>/dev/null | tail -100\")",
+      "Bash(wsl bash -c:*)"
     ],
     "deny": [],
     "ask": [],
diff --git a/.claude/skills/backend-patterns/SKILL.md b/.claude/skills/backend-patterns/SKILL.md
new file mode 100644
index 0000000..53bf07e
--- /dev/null
+++ b/.claude/skills/backend-patterns/SKILL.md
@@ -0,0 +1,314 @@
+# Backend Development Patterns
+
+Backend architecture patterns for Python/FastAPI/PostgreSQL applications.
+
+## API Design
+
+### RESTful Structure
+
+```
+GET    /api/v1/documents              # List
+GET    /api/v1/documents/{id}         # Get
+POST   /api/v1/documents              # Create
+PUT    /api/v1/documents/{id}         # Replace
+PATCH  /api/v1/documents/{id}         # Update
+DELETE /api/v1/documents/{id}         # Delete
+
+GET /api/v1/documents?status=processed&sort=created_at&limit=20&offset=0
+```
+
+### FastAPI Route Pattern
+
+```python
+from fastapi import APIRouter, HTTPException, Depends, Query, File, UploadFile
+from pydantic import BaseModel
+
+router = APIRouter(prefix="/api/v1", tags=["inference"])
+
+@router.post("/infer", response_model=ApiResponse[InferenceResult])
+async def infer_document(
+    file: UploadFile = File(...),
+    confidence_threshold: float = Query(0.5, ge=0, le=1),
+    service: InferenceService = Depends(get_inference_service)
+) -> ApiResponse[InferenceResult]:
+    result = await service.process(file, confidence_threshold)
+    return ApiResponse(success=True, data=result)
+```
+
+### Consistent Response Schema
+
+```python
+from typing import Generic, TypeVar
+T = TypeVar('T')
+
+class ApiResponse(BaseModel, Generic[T]):
+    success: bool
+    data: T | None = None
+    error: str | None = None
+    meta: dict | None = None
+```
+
+## Core Patterns
+
+### Repository Pattern
+
+```python
+from typing import Protocol
+
+class DocumentRepository(Protocol):
+    def find_all(self, filters: dict | None = None) -> list[Document]: ...
+    def find_by_id(self, id: str) -> Document | None: ...
+    def create(self, data: dict) -> Document: ...
+    def update(self, id: str, data: dict) -> Document: ...
+    def delete(self, id: str) -> None: ...
+```
+
+### Service Layer
+
+```python
+class InferenceService:
+    def __init__(self, model_path: str, use_gpu: bool = True):
+        self.pipeline = InferencePipeline(model_path=model_path, use_gpu=use_gpu)
+
+    async def process(self, file: UploadFile, confidence_threshold: float) -> InferenceResult:
+        temp_path = self._save_temp_file(file)
+        try:
+            return self.pipeline.process_pdf(temp_path)
+        finally:
+            temp_path.unlink(missing_ok=True)
+```
+
+### Dependency Injection
+
+```python
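+from functools import lru_cache
+
+from fastapi import Depends
+from pydantic_settings import BaseSettings, SettingsConfigDict
+
+class Settings(BaseSettings):
+    # pydantic-settings v2 style; replaces the deprecated inner `class Config`.
+    # protected_namespaces is narrowed so the `model_path` field does not
+    # clash with pydantic's reserved `model_` prefix on versions that warn.
+    model_config = SettingsConfigDict(env_file=".env", protected_namespaces=("settings_",))
+
+    db_host: str = "localhost"
+    db_password: str
+    model_path: str = "runs/train/invoice_fields/weights/best.pt"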
+
+@lru_cache()
+def get_settings() -> Settings:
+    return Settings()
+
+def get_inference_service(settings: Settings = Depends(get_settings)) -> InferenceService:
+    return InferenceService(model_path=settings.model_path)
+```
+
+## Database Patterns
+
+### Connection Pooling
+
+```python
+from psycopg2 import pool
+from contextlib import contextmanager
+
+db_pool = pool.ThreadedConnectionPool(minconn=2, maxconn=10, **db_config)
+
+@contextmanager
+def get_db_connection():
+    conn = db_pool.getconn()
+    try:
+        yield conn
+    finally:
+        db_pool.putconn(conn)
+```
+
+### Query Optimization
+
+```python
+# GOOD: Select only needed columns
+cur.execute("""
+    SELECT id, status, fields->>'InvoiceNumber' as invoice_number
+    FROM documents WHERE status = %s
+    ORDER BY created_at DESC LIMIT %s
+""", ('processed', 10))
+
+# BAD: SELECT * FROM documents
+```
+
+### N+1 Prevention
+
+```python
+# BAD: N+1 queries
+for doc in documents:
+    doc.labels = get_labels(doc.id)  # N queries
+
+# GOOD: Batch fetch with JOIN
+cur.execute("""
+    SELECT d.id, d.status, array_agg(l.label) as labels
+    FROM documents d
+    LEFT JOIN document_labels l ON d.id = l.document_id
+    GROUP BY d.id, d.status
+""")
+```
+
+### Transaction Pattern
+
+```python
+def create_document_with_labels(doc_data: dict, labels: list[dict]) -> str:
+    with get_db_connection() as conn:
+        try:
+            with conn.cursor() as cur:
+                cur.execute("INSERT INTO documents ... RETURNING id", ...)
+                doc_id = cur.fetchone()[0]
+                for label in labels:
+                    cur.execute("INSERT INTO document_labels ...", ...)
+                conn.commit()
+                return doc_id
+        except Exception:
+            conn.rollback()
+            raise
+```
+
+## Caching
+
+```python
+from cachetools import TTLCache
+
+_cache = TTLCache(maxsize=1000, ttl=300)
+
+def get_document_cached(doc_id: str) -> Document | None:
+    if doc_id in _cache:
+        return _cache[doc_id]
+    doc = repo.find_by_id(doc_id)
+    if doc:
+        _cache[doc_id] = doc
+    return doc
+```
+
+## Error Handling
+
+### Exception Hierarchy
+
+```python
+class AppError(Exception):
+    def __init__(self, message: str, status_code: int = 500):
+        super().__init__(message)  # keeps str(exc) and tracebacks informative
+        self.message = message
+        self.status_code = status_code
+
+class NotFoundError(AppError):
+    def __init__(self, resource: str, id: str):
+        super().__init__(f"{resource} not found: {id}", 404)
+
+class ValidationError(AppError):
+    def __init__(self, message: str):
+        super().__init__(message, 400)
+```
+
+### FastAPI Exception Handler
+
+```python
+from fastapi import Request
+from fastapi.responses import JSONResponse
+
+@app.exception_handler(AppError)
+async def app_error_handler(request: Request, exc: AppError):
+    return JSONResponse(status_code=exc.status_code, content={"success": False, "error": exc.message})
+
+@app.exception_handler(Exception)
+async def generic_error_handler(request: Request, exc: Exception):
+    logger.error(f"Unexpected error: {exc}", exc_info=True)
+    return JSONResponse(status_code=500, content={"success": False, "error": "Internal server error"})
+```
+
+### Retry with Backoff
+
+```python
+import asyncio
+
+async def retry_with_backoff(fn, max_retries: int = 3, base_delay: float = 1.0):
+    last_error = None
+    for attempt in range(max_retries):
+        try:
+            return await fn() if asyncio.iscoroutinefunction(fn) else fn()
+        except Exception as e:
+            last_error = e
+            if attempt < max_retries - 1:
+                await asyncio.sleep(base_delay * (2 ** attempt))
+    raise last_error
+```
+
+## Rate Limiting
+
+```python
+from time import time
+from collections import defaultdict
+
+class RateLimiter:
+    def __init__(self):
+        self.requests: dict[str, list[float]] = defaultdict(list)
+
+    def check_limit(self, identifier: str, max_requests: int, window_sec: int) -> bool:
+        now = time()
+        self.requests[identifier] = [t for t in self.requests[identifier] if now - t < window_sec]
+        if len(self.requests[identifier]) >= max_requests:
+            return False
+        self.requests[identifier].append(now)
+        return True
+
+limiter = RateLimiter()
+
+@app.middleware("http")
+async def rate_limit_middleware(request: Request, call_next):
+    ip = request.client.host if request.client else "unknown"  # request.client can be None
+    if not limiter.check_limit(ip, max_requests=100, window_sec=60):
+        return JSONResponse(status_code=429, content={"error": "Rate limit exceeded"})
+    return await call_next(request)
+```
+
+## Logging & Middleware
+
+### Request Logging
+
+```python
+import time
+import uuid
+
+@app.middleware("http")
+async def log_requests(request: Request, call_next):
+    request_id = str(uuid.uuid4())[:8]
+    start_time = time.time()
+    logger.info(f"[{request_id}] {request.method} {request.url.path}")
+    response = await call_next(request)
+    duration_ms = (time.time() - start_time) * 1000
+    logger.info(f"[{request_id}] Completed {response.status_code} in {duration_ms:.2f}ms")
+    return response
+```
+
+### Structured Logging
+
+```python
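+import json
+import logging
+from datetime import datetime, timezone
+
+class JSONFormatter(logging.Formatter):
+    def format(self, record):
+        return json.dumps({
+            # timezone-aware UTC; datetime.utcnow() is deprecated since Python 3.12
+            "timestamp": datetime.now(timezone.utc).isoformat(),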
+            "level": record.levelname,
+            "message": record.getMessage(),
+            "module": record.module,
+        })
+```
+
+## Background Tasks
+
+```python
+from fastapi import BackgroundTasks
+
+def send_notification(document_id: str, status: str):
+    logger.info(f"Notification: {document_id} -> {status}")
+
+@router.post("/infer")
+async def infer(file: UploadFile, background_tasks: BackgroundTasks):
+    result = await process_document(file)
+    background_tasks.add_task(send_notification, result.document_id, "completed")
+    return result
+```
+
+## Key Principles
+
+- Repository pattern: Abstract data access
+- Service layer: Business logic separated from routes
+- Dependency injection via `Depends()`
+- Connection pooling for database
+- Parameterized queries only (no f-strings in SQL)
+- Batch fetch to prevent N+1
+- Consistent `ApiResponse[T]` format
+- Exception hierarchy with proper status codes
+- Rate limit by IP
+- Structured logging with request ID
\ No newline at end of file
diff --git a/.claude/skills/coding-standards/SKILL.md b/.claude/skills/coding-standards/SKILL.md
new file mode 100644
index 0000000..4bb9b71
--- /dev/null
+++ b/.claude/skills/coding-standards/SKILL.md
@@ -0,0 +1,665 @@
+---
+name: coding-standards
+description: Universal coding standards, best practices, and patterns for Python, FastAPI, and data processing development.
+---
+
+# Coding Standards & Best Practices
+
+Python coding standards for the Invoice Master project.
+
+## Code Quality Principles
+
+### 1. Readability First
+- Code is read more than written
+- Clear variable and function names
+- Self-documenting code preferred over comments
+- Consistent formatting (follow PEP 8)
+
+### 2. KISS (Keep It Simple, Stupid)
+- Simplest solution that works
+- Avoid over-engineering
+- No premature optimization
+- Easy to understand > clever code
+
+### 3. DRY (Don't Repeat Yourself)
+- Extract common logic into functions
+- Create reusable utilities
+- Share modules across the codebase
+- Avoid copy-paste programming
+
+### 4. YAGNI (You Aren't Gonna Need It)
+- Don't build features before they're needed
+- Avoid speculative generality
+- Add complexity only when required
+- Start simple, refactor when needed
+
+## Python Standards
+
+### Variable Naming
+
+```python
+# GOOD: Descriptive names
+invoice_number = "INV-2024-001"
+is_valid_document = True
+total_confidence_score = 0.95
+
+# BAD: Unclear names
+inv = "INV-2024-001"
+flag = True
+x = 0.95
+```
+
+### Function Naming
+
+```python
+# GOOD: Verb-noun pattern with type hints
+def extract_invoice_fields(pdf_path: Path) -> dict[str, str]:
+    """Extract fields from invoice PDF."""
+    ...
+
+def calculate_confidence(predictions: list[float]) -> float:
+    """Calculate average confidence score."""
+    ...
+
+def is_valid_bankgiro(value: str) -> bool:
+    """Check if value is valid Bankgiro number."""
+    ...
+
+# BAD: Unclear or noun-only
+def invoice(path):
+    ...
+
+def confidence(p):
+    ...
+
+def bankgiro(v):
+    ...
+```
+
+### Type Hints (REQUIRED)
+
+```python
+# GOOD: Full type annotations
+from dataclasses import dataclass
+from pathlib import Path
+
+@dataclass
+class InferenceResult:
+    document_id: str
+    fields: dict[str, str]
+    confidence: dict[str, float]
+    processing_time_ms: float
+
+def process_document(
+    pdf_path: Path,
+    confidence_threshold: float = 0.5
+) -> InferenceResult:
+    """Process PDF and return extracted fields."""
+    ...
+
+# BAD: No type hints
+def process_document(pdf_path, confidence_threshold=0.5):
+    ...
+```
+
+### Immutability Pattern (CRITICAL)
+
+```python
+# GOOD: Create new objects, don't mutate
+def update_fields(fields: dict[str, str], updates: dict[str, str]) -> dict[str, str]:
+    return {**fields, **updates}
+
+def add_item(items: list[str], new_item: str) -> list[str]:
+    return [*items, new_item]
+
+# BAD: Direct mutation
+def update_fields(fields: dict[str, str], updates: dict[str, str]) -> dict[str, str]:
+    fields.update(updates)  # MUTATION!
+    return fields
+
+def add_item(items: list[str], new_item: str) -> list[str]:
+    items.append(new_item)  # MUTATION!
+    return items
+```
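+
+The same rule can be enforced for objects rather than left to convention. A minimal sketch using a frozen dataclass and `dataclasses.replace`; the `FieldResult` type and `with_value` helper are hypothetical and not part of the project.
+
+```python
+from dataclasses import FrozenInstanceError, dataclass, replace
+
+
+@dataclass(frozen=True)
+class FieldResult:
+    """Hypothetical value object for one extracted field."""
+    name: str
+    value: str
+    confidence: float
+
+
+def with_value(result: FieldResult, value: str) -> FieldResult:
+    """Return a copy with a new value; the original is left untouched."""
+    return replace(result, value=value)
+
+
+original = FieldResult(name="InvoiceNumber", value="INV-2024-01", confidence=0.91)
+corrected = with_value(original, "INV-2024-001")
+
+try:
+    original.value = "changed"  # frozen dataclasses reject attribute assignment
+except FrozenInstanceError:
+    pass
+```
+
+With `frozen=True`, accidental mutation fails loudly at the assignment site instead of surfacing later as a shared-state bug.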
+
+### Error Handling
+
+```python
+import logging
+from pathlib import Path
+
+from ultralytics import YOLO
+
+logger = logging.getLogger(__name__)
+
+# GOOD: Validate early, wrap unexpected failures, and log
+def load_model(model_path: Path) -> YOLO:
+    """Load YOLO model from path."""
+    # Check outside the try block so FileNotFoundError reaches the caller unwrapped
+    if not model_path.exists():
+        raise FileNotFoundError(f"Model not found: {model_path}")
+
+    try:
+        model = YOLO(str(model_path))
+    except Exception as e:
+        logger.error("Failed to load model: %s", e)
+        raise RuntimeError(f"Model loading failed: {model_path}") from e
+
+    logger.info("Model loaded: %s", model_path)
+    return model
+
+# BAD: No error handling
+def load_model(model_path):
+    return YOLO(str(model_path))
+
+# BAD: Bare except
+def load_model(model_path):
+    try:
+        return YOLO(str(model_path))
+    except:  # Never use bare except!
+        return None
+```
+
+### Async Best Practices
+
+```python
+import asyncio
+
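+# GOOD: Parallel execution when possible
+# (assumes process_document is an async, I/O-bound coroutine; CPU-bound
+# inference needs a thread or process pool to actually run in parallel)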
+async def process_batch(pdf_paths: list[Path]) -> list[InferenceResult]:
+    tasks = [process_document(path) for path in pdf_paths]
+    results = await asyncio.gather(*tasks, return_exceptions=True)
+
+    # Handle exceptions
+    valid_results = []
+    for path, result in zip(pdf_paths, results):
+        if isinstance(result, Exception):
+            logger.error(f"Failed to process {path}: {result}")
+        else:
+            valid_results.append(result)
+    return valid_results
+
+# BAD: Sequential when unnecessary
+async def process_batch(pdf_paths: list[Path]) -> list[InferenceResult]:
+    results = []
+    for path in pdf_paths:
+        result = await process_document(path)
+        results.append(result)
+    return results
+```
+
+### Context Managers
+
+```python
+from contextlib import contextmanager
+from pathlib import Path
+import tempfile
+
+# GOOD: Proper resource management
+@contextmanager
+def temp_pdf_copy(pdf_path: Path):
+    """Create temporary copy of PDF for processing."""
+    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
+        tmp.write(pdf_path.read_bytes())
+        tmp_path = Path(tmp.name)
+    try:
+        yield tmp_path
+    finally:
+        tmp_path.unlink(missing_ok=True)
+
+# Usage
+with temp_pdf_copy(original_pdf) as tmp_pdf:
+    result = process_pdf(tmp_pdf)
+```
+
+## FastAPI Best Practices
+
+### Route Structure
+
+```python
+from fastapi import APIRouter, HTTPException, Depends, Query, File, UploadFile
+from pydantic import BaseModel
+
+router = APIRouter(prefix="/api/v1", tags=["inference"])
+
+class InferenceResponse(BaseModel):
+    success: bool
+    document_id: str
+    fields: dict[str, str]
+    confidence: dict[str, float]
+    processing_time_ms: float
+
+@router.post("/infer", response_model=InferenceResponse)
+async def infer_document(
+    file: UploadFile = File(...),
+    confidence_threshold: float = Query(0.5, ge=0.0, le=1.0)
+) -> InferenceResponse:
+    """Process invoice PDF and extract fields."""
+    # filename may be None, and the extension check should ignore case
+    if not (file.filename or "").lower().endswith(".pdf"):
+        raise HTTPException(status_code=400, detail="Only PDF files accepted")
+
+    result = await inference_service.process(file, confidence_threshold)
+    return InferenceResponse(
+        success=True,
+        document_id=result.document_id,
+        fields=result.fields,
+        confidence=result.confidence,
+        processing_time_ms=result.processing_time_ms
+    )
+```
+
+### Input Validation with Pydantic
+
+```python
+from pydantic import BaseModel, Field, field_validator
+from datetime import date
+import re
+
+class InvoiceData(BaseModel):
+    invoice_number: str = Field(..., min_length=1, max_length=50)
+    invoice_date: date
+    amount: float = Field(..., gt=0)
+    bankgiro: str | None = None
+    ocr_number: str | None = None
+
+    @field_validator("bankgiro")
+    @classmethod
+    def validate_bankgiro(cls, v: str | None) -> str | None:
+        if v is None:
+            return None
+        # Bankgiro: 7-8 digits
+        cleaned = re.sub(r"[^0-9]", "", v)
+        if not (7 <= len(cleaned) <= 8):
+            raise ValueError("Bankgiro must be 7-8 digits")
+        return cleaned
+
+    @field_validator("ocr_number")
+    @classmethod
+    def validate_ocr(cls, v: str | None) -> str | None:
+        if v is None:
+            return None
+        # OCR: 2-25 digits
+        cleaned = re.sub(r"[^0-9]", "", v)
+        if not (2 <= len(cleaned) <= 25):
+            raise ValueError("OCR must be 2-25 digits")
+        return cleaned
+```
+
+### Response Format
+
+```python
+from pydantic import BaseModel
+from typing import Generic, TypeVar
+
+T = TypeVar("T")
+
+class ApiResponse(BaseModel, Generic[T]):
+    success: bool
+    data: T | None = None
+    error: str | None = None
+    meta: dict | None = None
+
+# Success response
+return ApiResponse(
+    success=True,
+    data=result,
+    meta={"processing_time_ms": elapsed_ms}
+)
+
+# Error response
+return ApiResponse(
+    success=False,
+    error="Invalid PDF format"
+)
+```
+
+## File Organization
+
+### Project Structure
+
+```
+src/
+├── cli/                  # Command-line interfaces
+│   ├── autolabel.py
+│   ├── train.py
+│   └── infer.py
+├── pdf/                  # PDF processing
+│   ├── extractor.py
+│   └── renderer.py
+├── ocr/                  # OCR processing
+│   ├── paddle_ocr.py
+│   └── machine_code_parser.py
+├── inference/            # Inference pipeline
+│   ├── pipeline.py
+│   ├── yolo_detector.py
+│   └── field_extractor.py
+├── normalize/            # Field normalization
+│   ├── base.py
+│   ├── date_normalizer.py
+│   └── amount_normalizer.py
+├── web/                  # FastAPI application
+│   ├── app.py
+│   ├── routes.py
+│   ├── services.py
+│   └── schemas.py
+└── utils/                # Shared utilities
+    ├── validators.py
+    ├── text_cleaner.py
+    └── logging.py
+tests/                    # Mirror of src structure
+    ├── test_pdf/
+    ├── test_ocr/
+    └── test_inference/
+```
+
+### File Naming
+
+```
+src/ocr/paddle_ocr.py           # snake_case for modules
+src/inference/yolo_detector.py  # snake_case for modules
+tests/test_paddle_ocr.py        # test_ prefix for tests
+config.py                       # snake_case for config
+```
+
+### Module Size Guidelines
+
+- **Maximum**: 800 lines per file
+- **Typical**: 200-400 lines per file
+- **Functions**: Max 50 lines each
+- Extract utilities when modules grow too large
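+
+The function-length guideline above can be checked mechanically. A minimal sketch using only the standard-library `ast` module; the `find_long_functions` helper and `MAX_FUNCTION_LINES` constant are illustrative names, and the length counted here runs from the `def` line to the last line of the body, inclusive.
+
+```python
+import ast
+
+MAX_FUNCTION_LINES = 50
+
+
+def find_long_functions(source: str, max_lines: int = MAX_FUNCTION_LINES) -> list[tuple[str, int]]:
+    """Return (name, line_count) for each function longer than max_lines."""
+    offenders = []
+    for node in ast.walk(ast.parse(source)):
+        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
+            # lineno is the `def` line; end_lineno is the last body line
+            length = node.end_lineno - node.lineno + 1
+            if length > max_lines:
+                offenders.append((node.name, length))
+    return offenders
+```
+
+Running this over `Path(...).read_text()` for each module in a pre-commit step is one way to surface candidates for extraction before a file grows unwieldy.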
+
+## Comments & Documentation
+
+### When to Comment
+
+```python
+# GOOD: Explain WHY, not WHAT
+# Swedish Bankgiro uses Luhn algorithm with weight [1,2,1,2...]
+def validate_bankgiro_checksum(bankgiro: str) -> bool:
+    ...
+
+# Payment line format: 7 groups separated by #, checksum at end
+def parse_payment_line(line: str) -> PaymentLineData:
+    ...
+
+# BAD: Stating the obvious
+# Increment counter by 1
+count += 1
+
+# Set name to user's name
+name = user.name
+```
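+
+The Luhn comment above can be made concrete. A minimal sketch, assuming the check runs over all digits including the trailing check digit and that separators such as `-` are ignored; `luhn_is_valid` is an illustrative helper name, and the body shown for `validate_bankgiro_checksum` is one possible implementation, not necessarily the project's.
+
+```python
+import re
+
+
+def luhn_is_valid(digits: str) -> bool:
+    """Return True if the digit string passes the Luhn (mod 10) check."""
+    total = 0
+    # Walk right to left, doubling every second digit
+    for index, char in enumerate(reversed(digits)):
+        value = int(char)
+        if index % 2 == 1:
+            value *= 2
+            if value > 9:
+                value -= 9  # same as summing the two digits of the product
+        total += value
+    return total % 10 == 0
+
+
+def validate_bankgiro_checksum(bankgiro: str) -> bool:
+    """Check length (7-8 digits) and Luhn checksum, ignoring separators."""
+    cleaned = re.sub(r"[^0-9]", "", bankgiro)
+    if not 7 <= len(cleaned) <= 8:
+        return False
+    return luhn_is_valid(cleaned)
+```
+
+Keeping the generic Luhn check separate from the Bankgiro length rule lets other checksum-protected fields reuse it.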
+
+### Docstrings for Public APIs
+
+```python
+def extract_invoice_fields(
+    pdf_path: Path,
+    confidence_threshold: float = 0.5,
+    use_gpu: bool = True
+) -> InferenceResult:
+    """Extract structured fields from Swedish invoice PDF.
+
+    Uses YOLOv11 for field detection and PaddleOCR for text extraction.
+    Applies field-specific normalization and validation.
+
+    Args:
+        pdf_path: Path to the invoice PDF file.
+        confidence_threshold: Minimum confidence for field detection (0.0-1.0).
+        use_gpu: Whether to use GPU acceleration.
+
+    Returns:
+        InferenceResult containing extracted fields and confidence scores.
+
+    Raises:
+        FileNotFoundError: If PDF file doesn't exist.
+        ProcessingError: If OCR or detection fails.
+
+    Example:
+        >>> result = extract_invoice_fields(Path("invoice.pdf"))
+        >>> print(result.fields["invoice_number"])
+        INV-2024-001
+    """
+    ...
+```
+
+## Performance Best Practices
+
+### Caching
+
+```python
+from functools import lru_cache
+from cachetools import TTLCache
+
+# Static data: LRU cache
+@lru_cache(maxsize=100)
+def get_field_config(field_name: str) -> FieldConfig:
+    """Load field configuration (cached)."""
+    return load_config(field_name)
+
+# Dynamic data: TTL cache
+_document_cache = TTLCache(maxsize=1000, ttl=300)  # 5 minutes
+
+def get_document_cached(doc_id: str) -> Document | None:
+    if doc_id in _document_cache:
+        return _document_cache[doc_id]
+
+    doc = repo.find_by_id(doc_id)
+    if doc:
+        _document_cache[doc_id] = doc
+    return doc
+```
+
+### Database Queries
+
+```python
+# GOOD: Select only needed columns
+cur.execute("""
+    SELECT id, status, fields->>'invoice_number'
+    FROM documents
+    WHERE status = %s
+    LIMIT %s
+""", ('processed', 10))
+
+# BAD: Select everything
+cur.execute("SELECT * FROM documents")
+
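+# GOOD: Batch operations (execute_values sends multi-row INSERTs;
+# psycopg2's executemany is no faster than calling execute in a loop)
+from psycopg2.extras import execute_values
+
+execute_values(
+    cur,
+    "INSERT INTO labels (doc_id, field, value) VALUES %s",
+    [(doc_id, f, v) for f, v in fields.items()]
+)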
+
+# BAD: Individual inserts in loop
+for field, value in fields.items():
+    cur.execute("INSERT INTO labels ...", (doc_id, field, value))
+```
+
+### Lazy Loading
+
+```python
+class InferencePipeline:
+    def __init__(self, model_path: Path):
+        self.model_path = model_path
+        self._model: YOLO | None = None
+        self._ocr: PaddleOCR | None = None
+
+    @property
+    def model(self) -> YOLO:
+        """Lazy load YOLO model."""
+        if self._model is None:
+            self._model = YOLO(str(self.model_path))
+        return self._model
+
+    @property
+    def ocr(self) -> PaddleOCR:
+        """Lazy load PaddleOCR."""
+        if self._ocr is None:
+            self._ocr = PaddleOCR(use_angle_cls=True, lang="latin")
+        return self._ocr
+```
+
+## Testing Standards
+
+### Test Structure (AAA Pattern)
+
+```python
+def test_extract_bankgiro_valid():
+    # Arrange
+    text = "Bankgiro: 123-4567"
+
+    # Act
+    result = extract_bankgiro(text)
+
+    # Assert
+    assert result == "1234567"
+
+def test_extract_bankgiro_invalid_returns_none():
+    # Arrange
+    text = "No bankgiro here"
+
+    # Act
+    result = extract_bankgiro(text)
+
+    # Assert
+    assert result is None
+```
+
+### Test Naming
+
+```python
+# GOOD: Descriptive test names
+def test_parse_payment_line_extracts_all_fields(): ...
+def test_parse_payment_line_handles_missing_checksum(): ...
+def test_validate_ocr_returns_false_for_invalid_checksum(): ...
+
+# BAD: Vague test names
+def test_parse(): ...
+def test_works(): ...
+def test_payment_line(): ...
+```
+
+### Fixtures
+
+```python
+import pytest
+from pathlib import Path
+
+@pytest.fixture
+def sample_invoice_pdf(tmp_path: Path) -> Path:
+    """Create sample invoice PDF for testing."""
+    pdf_path = tmp_path / "invoice.pdf"
+    # Create test PDF...
+    return pdf_path
+
+@pytest.fixture
+def inference_pipeline(sample_model_path: Path) -> InferencePipeline:
+    """Create inference pipeline with test model."""
+    return InferencePipeline(sample_model_path)
+
+def test_process_invoice(inference_pipeline, sample_invoice_pdf):
+    result = inference_pipeline.process(sample_invoice_pdf)
+    assert result.fields.get("invoice_number") is not None
+```
+
+## Code Smell Detection
+
+### 1. Long Functions
+
+```python
+# BAD: Function > 50 lines
+def process_document():
+    # 100 lines of code...
+    ...
+
+# GOOD: Split into smaller functions
+def process_document(pdf_path: Path) -> InferenceResult:
+    image = render_pdf(pdf_path)
+    detections = detect_fields(image)
+    ocr_results = extract_text(image, detections)
+    fields = normalize_fields(ocr_results)
+    return build_result(fields)
+```
+
+### 2. Deep Nesting
+
+```python
+# BAD: 5+ levels of nesting
+if document:
+    if document.is_valid:
+        if document.has_fields:
+            if field in document.fields:
+                if document.fields[field]:
+                    # Do something
+
+# GOOD: Early returns
+if not document:
+    return None
+if not document.is_valid:
+    return None
+if not document.has_fields:
+    return None
+if field not in document.fields:
+    return None
+if not document.fields[field]:
+    return None
+
+# Do something
+```
+
+### 3. Magic Numbers
+
+```python
+# BAD: Unexplained numbers
+if confidence > 0.5:
+    ...
+time.sleep(3)
+
+# GOOD: Named constants
+CONFIDENCE_THRESHOLD = 0.5
+RETRY_DELAY_SECONDS = 3
+
+if confidence > CONFIDENCE_THRESHOLD:
+    ...
+time.sleep(RETRY_DELAY_SECONDS)
+```
+
+### 4. Mutable Default Arguments
+
+```python
+# BAD: Mutable default argument
+def process_fields(fields: list = []):  # DANGEROUS!
+    fields.append("new_field")
+    return fields
+
+# GOOD: Use None as default
+def process_fields(fields: list | None = None) -> list:
+    if fields is None:
+        fields = []
+    return [*fields, "new_field"]
+```
+
+## Logging Standards
+
+```python
+import logging
+
+# Module-level logger
+logger = logging.getLogger(__name__)
+
+# GOOD: Appropriate log levels
+logger.debug("Processing document: %s", doc_id)
+logger.info("Document processed successfully: %s", doc_id)
+logger.warning("Low confidence score: %.2f", confidence)
+logger.error("Failed to process document: %s", error)
+
+# GOOD: Structured logging with extra data
+logger.info(
+    "Inference complete",
+    extra={
+        "document_id": doc_id,
+        "field_count": len(fields),
+        "processing_time_ms": elapsed_ms
+    }
+)
+
+# BAD: Using print()
+print(f"Processing {doc_id}")  # Never in production!
+```
+
+**Remember**: Code quality is not negotiable. Clear, maintainable Python code with proper type hints enables confident development and refactoring.
diff --git a/.claude/skills/continuous-learning/SKILL.md b/.claude/skills/continuous-learning/SKILL.md
new file mode 100644
index 0000000..84a88dd
--- /dev/null
+++ b/.claude/skills/continuous-learning/SKILL.md
@@ -0,0 +1,80 @@
+---
+name: continuous-learning
+description: Automatically extract reusable patterns from Claude Code sessions and save them as learned skills for future use.
+---
+
+# Continuous Learning Skill
+
+Automatically evaluates each Claude Code session when it ends and extracts reusable patterns that can be saved as learned skills.
+
+## How It Works
+
+This skill runs as a **Stop hook** at the end of each session:
+
+1. **Session Evaluation**: Checks if session has enough messages (default: 10+)
+2. **Pattern Detection**: Identifies extractable patterns from the session
+3. **Skill Extraction**: Saves useful patterns to `~/.claude/skills/learned/`
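+
+Step 1 above can be sketched in Python as follows. This is an illustration under stated assumptions: the transcript is taken to be a JSONL file whose entries carry a `type` field (as the `grep` in the shipped shell script suggests), and `count_user_messages` and `should_evaluate` are hypothetical names.
+
+```python
+import json
+from pathlib import Path
+
+
+def count_user_messages(transcript_path: Path) -> int:
+    """Count JSONL entries whose "type" is "user"; skip malformed lines."""
+    count = 0
+    for line in transcript_path.read_text(encoding="utf-8").splitlines():
+        try:
+            entry = json.loads(line)
+        except json.JSONDecodeError:
+            continue
+        if isinstance(entry, dict) and entry.get("type") == "user":
+            count += 1
+    return count
+
+
+def should_evaluate(transcript_path: Path, min_session_length: int = 10) -> bool:
+    """Mirror the min_session_length gate from config.json."""
+    return count_user_messages(transcript_path) >= min_session_length
+```
+
+Parsing each line as JSON is stricter than a text match, since it does not depend on the exact spacing around `"type":"user"`.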
+
+## Configuration
+
+Edit `config.json` to customize:
+
+```json
+{
+  "min_session_length": 10,
+  "extraction_threshold": "medium",
+  "auto_approve": false,
+  "learned_skills_path": "~/.claude/skills/learned/",
+  "patterns_to_detect": [
+    "error_resolution",
+    "user_corrections",
+    "workarounds",
+    "debugging_techniques",
+    "project_specific"
+  ],
+  "ignore_patterns": [
+    "simple_typos",
+    "one_time_fixes",
+    "external_api_issues"
+  ]
+}
+```
+
+## Pattern Types
+
+| Pattern | Description |
+|---------|-------------|
+| `error_resolution` | How specific errors were resolved |
+| `user_corrections` | Patterns from user corrections |
+| `workarounds` | Solutions to framework/library quirks |
+| `debugging_techniques` | Effective debugging approaches |
+| `project_specific` | Project-specific conventions |
+
+## Hook Setup
+
+Add to your `~/.claude/settings.json`:
+
+```json
+{
+  "hooks": {
+    "Stop": [{
+      "matcher": "*",
+      "hooks": [{
+        "type": "command",
+        "command": "~/.claude/skills/continuous-learning/evaluate-session.sh"
+      }]
+    }]
+  }
+}
+```
+
+## Why Stop Hook?
+
+- **Lightweight**: Runs once at session end
+- **Non-blocking**: Doesn't add latency to every message
+- **Complete context**: Has access to full session transcript
+
+## Related
+
+- [The Longform Guide](https://x.com/affaanmustafa/status/2014040193557471352) - Section on continuous learning
+- `/learn` command - Manual pattern extraction mid-session
diff --git a/.claude/skills/continuous-learning/config.json b/.claude/skills/continuous-learning/config.json
new file mode 100644
index 0000000..1094b7e
--- /dev/null
+++ b/.claude/skills/continuous-learning/config.json
@@ -0,0 +1,18 @@
+{
+  "min_session_length": 10,
+  "extraction_threshold": "medium",
+  "auto_approve": false,
+  "learned_skills_path": "~/.claude/skills/learned/",
+  "patterns_to_detect": [
+    "error_resolution",
+    "user_corrections",
+    "workarounds",
+    "debugging_techniques",
+    "project_specific"
+  ],
+  "ignore_patterns": [
+    "simple_typos",
+    "one_time_fixes",
+    "external_api_issues"
+  ]
+}
diff --git a/.claude/skills/continuous-learning/evaluate-session.sh b/.claude/skills/continuous-learning/evaluate-session.sh
new file mode 100644
index 0000000..f13208a
--- /dev/null
+++ b/.claude/skills/continuous-learning/evaluate-session.sh
@@ -0,0 +1,60 @@
+#!/bin/bash
+# Continuous Learning - Session Evaluator
+# Runs on Stop hook to extract reusable patterns from Claude Code sessions
+#
+# Why Stop hook instead of UserPromptSubmit:
+# - Stop runs once at session end (lightweight)
+# - UserPromptSubmit runs every message (heavy, adds latency)
+#
+# Hook config (in ~/.claude/settings.json):
+# {
+#   "hooks": {
+#     "Stop": [{
+#       "matcher": "*",
+#       "hooks": [{
+#         "type": "command",
+#         "command": "~/.claude/skills/continuous-learning/evaluate-session.sh"
+#       }]
+#     }]
+#   }
+# }
+#
+# Patterns to detect: error_resolution, user_corrections, workarounds, debugging_techniques, project_specific
+# Patterns to ignore: simple_typos, one_time_fixes, external_api_issues
+# Extracted skills saved to: ~/.claude/skills/learned/
+
+set -e
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+CONFIG_FILE="$SCRIPT_DIR/config.json"
+LEARNED_SKILLS_PATH="${HOME}/.claude/skills/learned"
+MIN_SESSION_LENGTH=10
+
+# Load config if exists
+if [ -f "$CONFIG_FILE" ]; then
+  MIN_SESSION_LENGTH=$(jq -r '.min_session_length // 10' "$CONFIG_FILE")
+  LEARNED_SKILLS_PATH=$(jq -r '.learned_skills_path // "~/.claude/skills/learned/"' "$CONFIG_FILE" | sed "s|^~|$HOME|")
+fi
+
+# Ensure learned skills directory exists
+mkdir -p "$LEARNED_SKILLS_PATH"
+
+# Get transcript path from environment (set by Claude Code)
+transcript_path="${CLAUDE_TRANSCRIPT_PATH:-}"
+
+if [ -z "$transcript_path" ] || [ ! -f "$transcript_path" ]; then
+  exit 0
+fi
+
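+# Count messages in session
+# grep -c prints 0 but exits 1 when nothing matches, so "|| echo 0" would emit a second 0
+message_count=$(grep -c '"type":"user"' "$transcript_path" 2>/dev/null || true)
+message_count=${message_count:-0}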
+
+# Skip short sessions
+if [ "$message_count" -lt "$MIN_SESSION_LENGTH" ]; then
+  echo "[ContinuousLearning] Session too short ($message_count messages), skipping" >&2
+  exit 0
+fi
+
+# Signal to Claude that session should be evaluated for extractable patterns
+echo "[ContinuousLearning] Session has $message_count messages - evaluate for extractable patterns" >&2
+echo "[ContinuousLearning] Save learned skills to: $LEARNED_SKILLS_PATH" >&2
diff --git a/.claude/skills/dev-builder/SKILL.md b/.claude/skills/dev-builder/SKILL.md
deleted file mode 100644
index 40a5ca1..0000000
--- a/.claude/skills/dev-builder/SKILL.md
+++ /dev/null
@@ -1,245 +0,0 @@
----
-name: dev-builder
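-description: Initializes the project, installs dependencies, and implements code based on Product-Spec.md. Used together with product-spec-builder to help users turn a requirements document into a runnable code project.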
----
-
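-[Role]
-    You are an experienced full-stack development engineer.
-    
-    You can quickly scaffold a project from a product requirements document, choose a suitable tech stack, and write high-quality code. You care about clear code structure and maintainability.
-
-[Task]
-    Read Product-Spec.md and complete the following:
-    1. Analyze the requirements and determine the project type and tech stack
-    2. Initialize the project and create the directory structure
-    3. Install the necessary dependencies and configure the development environment
-    4. Implement the code (UI, features, AI integration)
-    
-    Deliver runnable project code at the end.
-
-[General Rules]
-    - Product-Spec.md must be read first; if it does not exist, prompt the user to complete requirements gathering first
-    - Output progress feedback after each phase is completed
-    - If prototype images exist, follow their visual design during development
-    - Code must be concise, readable, and maintainable
-    - Prefer simple solutions; do not over-engineer
-    - Only change files related to the current task; no "drive-by dependency upgrades", "global reformatting", or "unrelated renames"
-    - Always communicate with the user in Chinese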
-
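-[Project Type Determination]
-    Determine based on the UI layout and technical notes in the Product Spec:
-    - Has UI + frontend only / no server needed → pure frontend web app
-    - Has UI + needs backend/database/API → full-stack web app
-    - No UI + command-line operation → CLI tool
-    - API service only → backend service
-
-[Tech Stack Selection]
-    | Project Type | Recommended Tech Stack |
-    |---------|-----------|
-    | Pure frontend web app | React + Vite + TypeScript + Tailwind |
-    | Full-stack web app | Next.js + TypeScript + Tailwind |
-    | CLI tool | Node.js + TypeScript + Commander |
-    | Backend service | Express + TypeScript |
-    | AI/ML application | Python + FastAPI + PyTorch/TensorFlow |
-    | Data processing tool | Python + Pandas + NumPy |
-
-    **Selection principles**:
-    - Specified in the Product Spec technical notes → use what is specified
-    - Not specified → use the recommended option
-    - In doubt → ask the user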
-
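-[AI R&D Track]
-    **Applicable scenarios**:
-    - Machine learning model training and inference
-    - Computer vision (object detection, OCR, image classification)
-    - Natural language processing (text classification, named entity recognition, dialogue systems)
-    - Large language model applications (RAG, Agent, Prompt Engineering)
-    - Data analysis and visualization
-
-    **Recommended tech stacks**:
-    | Area | Recommended Tech Stack |
-    |-----|-----------|
-    | Deep learning | PyTorch + Lightning + Weights & Biases |
-    | Object detection | Ultralytics YOLO + OpenCV |
-    | OCR | PaddleOCR / EasyOCR / Tesseract |
-    | NLP | Transformers + spaCy |
-    | LLM applications | LangChain / LlamaIndex + OpenAI API |
-    | Data processing | Pandas + Polars + DuckDB |
-    | Model deployment | FastAPI + Docker + ONNX Runtime |
-
-    **Project structure (AI/ML projects)**:
-    ```
-    project/
-    ├── src/                  # Source code
-    │   ├── data/            # Data loading and preprocessing
-    │   ├── models/          # Model definitions
-    │   ├── training/        # Training logic
-    │   ├── inference/       # Inference logic
-    │   └── utils/           # Utility functions
-    ├── configs/             # Configuration files (YAML)
-    ├── data/                # Data directory
-    │   ├── raw/            # Raw data (never modified)
-    │   └── processed/      # Processed data
-    ├── models/              # Trained model weights
-    ├── notebooks/           # Experiment notebooks
-    ├── tests/               # Test code
-    └── scripts/             # Run scripts
-    ```
-
-    **AI R&D standards**:
-    - **Reproducibility**: fix random seeds (random, numpy, torch) and record experiment configurations
-    - **Data management**: raw data is immutable; processed data is versioned
-    - **Experiment tracking**: use MLflow/W&B to record metrics, parameters, and artifacts
-    - **Configuration-driven**: keep all hyperparameters in YAML configs; no hard-coding
-    - **Type safety**: define data structures with Pydantic
-    - **Logging**: use the logging module, not print
-
-    **Model training checklist**:
-    - ✅ Reasonable dataset split ratios (train/val/test)
-    - ✅ Early stopping to prevent overfitting
-    - ✅ Learning rate scheduler configured
-    - ✅ Model checkpoint saving strategy
-    - ✅ Validation metric monitoring
-    - ✅ GPU memory management (mixed-precision training)
-
-    **Deployment notes**:
-    - Export the model to ONNX format to speed up inference
-    - Use asynchronous processing in API endpoints to improve concurrency
-    - Use streaming for large files
-    - Configure a health check endpoint
-    - Monitor logs and metrics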
-
-[初始化提醒]
-    **项目名称规范**：
-    - 只能用小写字母、数字、短横线（如 my-app）
-    - 不能有空格、&、# 等特殊字符
-    
-    **npm 报错时**：可尝试 pnpm 或 yarn
-
-[依赖选择]
-    **原则**：只装需要的，不装「可能用到」的
-
-[环境变量配置]
-    **⚠️ 安全警告**：
-    - Vite 纯前端：`VITE_` 前缀变量**会暴露给浏览器**，不能存放 API Key
-    - Next.js：不加 `NEXT_PUBLIC_` 前缀的变量只在服务端可用（安全）
-    
-    **涉及 AI API 调用时**：
-    - 推荐用 Next.js（API Key 只在服务端使用，安全）
-    - 备选：创建独立后端代理请求
-    - 仅限开发/演示：使用 VITE_ 前缀（必须提醒用户安全风险）
-    
-    **文件规范**：
-    - 创建 `.env.example` 作为模板（提交到 Git）
-    - 实际值放 `.env.local`（不提交，确保 .gitignore 包含）
-
-[工作流程]
-    [启动阶段]
-        目的：检查前置条件，读取项目文档
-        
-        第一步：检测 Product Spec
-            检测 Product-Spec.md 是否存在
-            不存在 → 提示：「未找到 Product-Spec.md，请先使用 /prd 完成需求收集。」，终止流程
-            存在 → 继续
-        
-        第二步：读取项目文档
-            加载 Product-Spec.md
-            提取：产品概述、功能需求、UI 布局、技术说明、AI 能力需求
-        
-        第三步：检查原型图
-            检查 UI-Prompts.md 是否存在
-            存在 → 询问：「我看到你已经生成了原型图提示词，如果有生成的原型图图片，可以发给我参考。」
-            不存在 → 询问：「是否有原型图或设计稿可以参考？有的话可以发给我。」
-            
-            用户发送图片 → 记录，开发时参考
-            用户说没有 → 继续
-
-    [技术方案阶段]
-        目的：确定技术栈并告知用户
-        
-        分析项目类型，选择技术栈，列出主要依赖
-        
-        输出方案后直接进入下一阶段：
-            "📦 **技术方案**
-            
-            **项目类型**：[类型]
-            **技术栈**：[技术栈]
-            **主要依赖**：
-            - [依赖1]：[用途]
-            - [依赖2]：[用途]"
-
-    [项目搭建阶段]
-        目的：初始化项目，创建基础结构
-        
-        执行：初始化项目 → 配置 Tailwind（Vite 项目）→ 安装功能依赖 → 配置环境变量（如需要）
-        
-        每完成一步输出进度反馈
-
-    [代码实现阶段]
-        目的：实现功能代码
-        
-        第一步：创建基础布局
-            根据 Product Spec 的 UI 布局章节创建整体布局结构
-            如有原型图，参考其视觉设计
-        
-        第二步：实现 UI 组件
-            根据 UI 布局的控件规范创建组件
-            使用 Tailwind 编写样式
-        
-        第三步：实现功能逻辑
-            核心功能优先实现，辅助功能其次
-            添加状态管理，实现用户交互逻辑
-        
-        第四步：集成 AI 能力（如有）
-            创建 AI 服务模块，实现调用函数
-            处理 API Key 读取，在相应功能中集成
-        
-        第五步：完善用户体验
-            添加 loading 状态、错误处理、空状态提示、输入校验
-
-    [完成阶段]
-        目的：输出开发结果总结
-        
-        输出：
-            "✅ **项目开发完成！**
-            
-            **技术栈**：[技术栈]
-            
-            **项目结构**：
-            ```
-            [实际目录结构]
-            ```
-            
-            **已实现功能**：
-            - ✅ [功能1]
-            - ✅ [功能2]
-            - ...
-            
-            **AI 能力集成**：
-            - [已集成的 AI 能力，或「无」]
-            
-            **环境变量**：
-            - [需要配置的环境变量，或「无需配置」]"
-
-[质量门槛]
-    每个功能点至少满足：
-    
-    **必须**：
-    - ✅ 主路径可用（Happy Path 能跑通）
-    - ✅ 异常路径清晰（错误提示、重试/回退）
-    - ✅ loading 状态（涉及异步操作时）
-    - ✅ 空状态处理（无数据时的提示）
-    - ✅ 基础输入校验（必填、格式）
-    - ✅ 敏感信息不写入代码（API Key 走环境变量）
-    
-    **建议**：
-    - 基础可访问性（可点击、可键盘操作）
-    - 响应式适配（如需支持移动端）
-
-[代码规范]
-    - 单个文件不超过 300 行，超过则拆分
-    - 优先使用函数组件 + Hooks
-    - 样式优先用 Tailwind
-
-[初始化]
-    执行 [启动阶段]
diff --git a/.claude/skills/eval-harness/SKILL.md b/.claude/skills/eval-harness/SKILL.md
new file mode 100644
index 0000000..522937d
--- /dev/null
+++ b/.claude/skills/eval-harness/SKILL.md
@@ -0,0 +1,221 @@
+# Eval Harness Skill
+
+A formal evaluation framework for Claude Code sessions, implementing eval-driven development (EDD) principles.
+
+## Philosophy
+
+Eval-Driven Development treats evals as the "unit tests of AI development":
+- Define expected behavior BEFORE implementation
+- Run evals continuously during development
+- Track regressions with each change
+- Use pass@k metrics for reliability measurement
+
+## Eval Types
+
+### Capability Evals
+Test if Claude can do something it couldn't before:
+```markdown
+[CAPABILITY EVAL: feature-name]
+Task: Description of what Claude should accomplish
+Success Criteria:
+  - [ ] Criterion 1
+  - [ ] Criterion 2
+  - [ ] Criterion 3
+Expected Output: Description of expected result
+```
+
+### Regression Evals
+Ensure changes don't break existing functionality:
+```markdown
+[REGRESSION EVAL: feature-name]
+Baseline: SHA or checkpoint name
+Tests:
+  - existing-test-1: PASS/FAIL
+  - existing-test-2: PASS/FAIL
+  - existing-test-3: PASS/FAIL
+Result: X/Y passed (previously Y/Y)
+```
+
+## Grader Types
+
+### 1. Code-Based Grader
+Deterministic checks using code:
+```bash
+# Check if file contains expected pattern
+grep -q "export function handleAuth" src/auth.ts && echo "PASS" || echo "FAIL"
+
+# Check if tests pass
+npm test -- --testPathPattern="auth" && echo "PASS" || echo "FAIL"
+
+# Check if build succeeds
+npm run build && echo "PASS" || echo "FAIL"
+```
+
+### 2. Model-Based Grader
+Use Claude to evaluate open-ended outputs:
+```markdown
+[MODEL GRADER PROMPT]
+Evaluate the following code change:
+1. Does it solve the stated problem?
+2. Is it well-structured?
+3. Are edge cases handled?
+4. Is error handling appropriate?
+
+Score: 1-5 (1=poor, 5=excellent)
+Reasoning: [explanation]
+```
+
+### 3. Human Grader
+Flag for manual review:
+```markdown
+[HUMAN REVIEW REQUIRED]
+Change: Description of what changed
+Reason: Why human review is needed
+Risk Level: LOW/MEDIUM/HIGH
+```
+
+## Metrics
+
+### pass@k
+"At least one success in k attempts"
+- pass@1: First attempt success rate
+- pass@3: Success within 3 attempts
+- Typical target: pass@3 > 90%
+
+### pass^k
+"All k trials succeed"
+- Higher bar for reliability
+- pass^3: 3 consecutive successes
+- Use for critical paths
+
+## Eval Workflow
+
+### 1. Define (Before Coding)
+```markdown
+## EVAL DEFINITION: feature-xyz
+
+### Capability Evals
+1. Can create new user account
+2. Can validate email format
+3. Can hash password securely
+
+### Regression Evals
+1. Existing login still works
+2. Session management unchanged
+3. Logout flow intact
+
+### Success Metrics
+- pass@3 > 90% for capability evals
+- pass^3 = 100% for regression evals
+```
+
+### 2. Implement
+Write code to pass the defined evals.
+
+### 3. Evaluate
+```bash
+# Run capability evals
+# (run each capability eval's grader command and record PASS/FAIL)
+
+# Run regression evals
+npm test -- --testPathPattern="existing"
+
+# Generate report
+```
+
+### 4. Report
+```markdown
+EVAL REPORT: feature-xyz
+========================
+
+Capability Evals:
+  create-user:     PASS (pass@1)
+  validate-email:  PASS (pass@2)
+  hash-password:   PASS (pass@1)
+  Overall:         3/3 passed
+
+Regression Evals:
+  login-flow:      PASS
+  session-mgmt:    PASS
+  logout-flow:     PASS
+  Overall:         3/3 passed
+
+Metrics:
+  pass@1: 67% (2/3)
+  pass@3: 100% (3/3)
+
+Status: READY FOR REVIEW
+```
+
+## Integration Patterns
+
+### Pre-Implementation
+```
+/eval define feature-name
+```
+Creates eval definition file at `.claude/evals/feature-name.md`
+
+### During Implementation
+```
+/eval check feature-name
+```
+Runs current evals and reports status
+
+### Post-Implementation
+```
+/eval report feature-name
+```
+Generates full eval report
+
+## Eval Storage
+
+Store evals in project:
+```
+.claude/
+  evals/
+    feature-xyz.md      # Eval definition
+    feature-xyz.log     # Eval run history
+    baseline.json       # Regression baselines
+```
+
+## Best Practices
+
+1. **Define evals BEFORE coding** - Forces clear thinking about success criteria
+2. **Run evals frequently** - Catch regressions early
+3. **Track pass@k over time** - Monitor reliability trends
+4. **Use code graders when possible** - Deterministic > probabilistic
+5. **Human review for security** - Never fully automate security checks
+6. **Keep evals fast** - Slow evals don't get run
+7. **Version evals with code** - Evals are first-class artifacts
+
+## Example: Adding Authentication
+
+```markdown
+## EVAL: add-authentication
+
+### Phase 1: Define (10 min)
+Capability Evals:
+- [ ] User can register with email/password
+- [ ] User can login with valid credentials
+- [ ] Invalid credentials rejected with proper error
+- [ ] Sessions persist across page reloads
+- [ ] Logout clears session
+
+Regression Evals:
+- [ ] Public routes still accessible
+- [ ] API responses unchanged
+- [ ] Database schema compatible
+
+### Phase 2: Implement (varies)
+[Write code]
+
+### Phase 3: Evaluate
+Run: /eval check add-authentication
+
+### Phase 4: Report
+EVAL REPORT: add-authentication
+==============================
+Capability: 5/5 passed (pass@3: 100%)
+Regression: 3/3 passed (pass^3: 100%)
+Status: SHIP IT
+```
diff --git a/.claude/skills/frontend-patterns/SKILL.md b/.claude/skills/frontend-patterns/SKILL.md
new file mode 100644
index 0000000..05a796a
--- /dev/null
+++ b/.claude/skills/frontend-patterns/SKILL.md
@@ -0,0 +1,631 @@
+---
+name: frontend-patterns
+description: Frontend development patterns for React, Next.js, state management, performance optimization, and UI best practices.
+---
+
+# Frontend Development Patterns
+
+Modern frontend patterns for React, Next.js, and performant user interfaces.
+
+## Component Patterns
+
+### Composition Over Inheritance
+
+```typescript
+// ✅ GOOD: Component composition
+interface CardProps {
+  children: React.ReactNode
+  variant?: 'default' | 'outlined'
+}
+
+export function Card({ children, variant = 'default' }: CardProps) {
+  return <div className={`card card-${variant}`}>{children}</div>
+}
+
+export function CardHeader({ children }: { children: React.ReactNode }) {
+  return <div className="card-header">{children}</div>
+}
+
+export function CardBody({ children }: { children: React.ReactNode }) {
+  return <div className="card-body">{children}</div>
+}
+
+// Usage
+<Card>
+  <CardHeader>Title</CardHeader>
+  <CardBody>Content</CardBody>
+</Card>
+```
+
+### Compound Components
+
+```typescript
+interface TabsContextValue {
+  activeTab: string
+  setActiveTab: (tab: string) => void
+}
+
+const TabsContext = createContext<TabsContextValue | undefined>(undefined)
+
+export function Tabs({ children, defaultTab }: {
+  children: React.ReactNode
+  defaultTab: string
+}) {
+  const [activeTab, setActiveTab] = useState(defaultTab)
+
+  return (
+    <TabsContext.Provider value={{ activeTab, setActiveTab }}>
+      {children}
+    </TabsContext.Provider>
+  )
+}
+
+export function TabList({ children }: { children: React.ReactNode }) {
+  return <div className="tab-list">{children}</div>
+}
+
+export function Tab({ id, children }: { id: string, children: React.ReactNode }) {
+  const context = useContext(TabsContext)
+  if (!context) throw new Error('Tab must be used within Tabs')
+
+  return (
+    <button
+      className={context.activeTab === id ? 'active' : ''}
+      onClick={() => context.setActiveTab(id)}
+    >
+      {children}
+    </button>
+  )
+}
+
+// Usage
+<Tabs defaultTab="overview">
+  <TabList>
+    <Tab id="overview">Overview</Tab>
+    <Tab id="details">Details</Tab>
+  </TabList>
+</Tabs>
+```
+
+### Render Props Pattern
+
+```typescript
+interface DataLoaderProps<T> {
+  url: string
+  children: (data: T | null, loading: boolean, error: Error | null) => React.ReactNode
+}
+
+export function DataLoader<T>({ url, children }: DataLoaderProps<T>) {
+  const [data, setData] = useState<T | null>(null)
+  const [loading, setLoading] = useState(true)
+  const [error, setError] = useState<Error | null>(null)
+
+  useEffect(() => {
+    fetch(url)
+      .then(res => res.ok ? res.json() : Promise.reject(new Error(`HTTP ${res.status}`)))
+      .then(setData)
+      .catch(setError)
+      .finally(() => setLoading(false))
+  }, [url])
+
+  return <>{children(data, loading, error)}</>
+}
+
+// Usage
+<DataLoader<Market[]> url="/api/markets">
+  {(markets, loading, error) => {
+    if (loading) return <Spinner />
+    if (error) return <ErrorMessage error={error} />  // app-defined component; <Error> would resolve to the global Error constructor
+    return <MarketList markets={markets!} />
+  }}
+</DataLoader>
+```
+
+## Custom Hooks Patterns
+
+### State Management Hook
+
+```typescript
+export function useToggle(initialValue = false): [boolean, () => void] {
+  const [value, setValue] = useState(initialValue)
+
+  const toggle = useCallback(() => {
+    setValue(v => !v)
+  }, [])
+
+  return [value, toggle]
+}
+
+// Usage
+const [isOpen, toggleOpen] = useToggle()
+```
+
+### Async Data Fetching Hook
+
+```typescript
+interface UseQueryOptions<T> {
+  onSuccess?: (data: T) => void
+  onError?: (error: Error) => void
+  enabled?: boolean
+}
+
+export function useQuery<T>(
+  key: string,
+  fetcher: () => Promise<T>,
+  options?: UseQueryOptions<T>
+) {
+  const [data, setData] = useState<T | null>(null)
+  const [error, setError] = useState<Error | null>(null)
+  const [loading, setLoading] = useState(false)
+
+  const refetch = useCallback(async () => {
+    setLoading(true)
+    setError(null)
+
+    try {
+      const result = await fetcher()
+      setData(result)
+      options?.onSuccess?.(result)
+    } catch (err) {
+      const error = err as Error
+      setError(error)
+      options?.onError?.(error)
+    } finally {
+      setLoading(false)
+    }
+  }, [fetcher, options])
+
+  useEffect(() => {
+    if (options?.enabled !== false) {
+      refetch()
+    }
+  }, [key, refetch, options?.enabled])
+
+  return { data, error, loading, refetch }
+}
+
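+// Usage: keep fetcher/options referentially stable, or the effect refetches on every render
+const fetchMarkets = useCallback(() => fetch('/api/markets').then(r => r.json()), [])
+const queryOptions = useMemo(() => ({
+  onSuccess: (data: Market[]) => console.log('Fetched', data.length, 'markets'),
+  onError: (err: Error) => console.error('Failed:', err)
+}), [])
+const { data: markets, loading, error, refetch } = useQuery<Market[]>(
+  'markets', fetchMarkets, queryOptions
+)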
+```
+
+### Debounce Hook
+
+```typescript
+export function useDebounce<T>(value: T, delay: number): T {
+  const [debouncedValue, setDebouncedValue] = useState<T>(value)
+
+  useEffect(() => {
+    const handler = setTimeout(() => {
+      setDebouncedValue(value)
+    }, delay)
+
+    return () => clearTimeout(handler)
+  }, [value, delay])
+
+  return debouncedValue
+}
+
+// Usage
+const [searchQuery, setSearchQuery] = useState('')
+const debouncedQuery = useDebounce(searchQuery, 500)
+
+useEffect(() => {
+  if (debouncedQuery) {
+    performSearch(debouncedQuery)
+  }
+}, [debouncedQuery])
+```
+
+## State Management Patterns
+
+### Context + Reducer Pattern
+
+```typescript
+interface State {
+  markets: Market[]
+  selectedMarket: Market | null
+  loading: boolean
+}
+
+type Action =
+  | { type: 'SET_MARKETS'; payload: Market[] }
+  | { type: 'SELECT_MARKET'; payload: Market }
+  | { type: 'SET_LOADING'; payload: boolean }
+
+function reducer(state: State, action: Action): State {
+  switch (action.type) {
+    case 'SET_MARKETS':
+      return { ...state, markets: action.payload }
+    case 'SELECT_MARKET':
+      return { ...state, selectedMarket: action.payload }
+    case 'SET_LOADING':
+      return { ...state, loading: action.payload }
+    default:
+      return state
+  }
+}
+
+const MarketContext = createContext<{
+  state: State
+  dispatch: Dispatch<Action>
+} | undefined>(undefined)
+
+export function MarketProvider({ children }: { children: React.ReactNode }) {
+  const [state, dispatch] = useReducer(reducer, {
+    markets: [],
+    selectedMarket: null,
+    loading: false
+  })
+
+  return (
+    <MarketContext.Provider value={{ state, dispatch }}>
+      {children}
+    </MarketContext.Provider>
+  )
+}
+
+export function useMarkets() {
+  const context = useContext(MarketContext)
+  if (!context) throw new Error('useMarkets must be used within MarketProvider')
+  return context
+}
+```
+
+## Performance Optimization
+
+### Memoization
+
+```typescript
+// ✅ useMemo for expensive computations
+const sortedMarkets = useMemo(() => {
+  return [...markets].sort((a, b) => b.volume - a.volume)  // copy first: sort() mutates in place
+}, [markets])
+
+// ✅ useCallback for functions passed to children
+const handleSearch = useCallback((query: string) => {
+  setSearchQuery(query)
+}, [])
+
+// ✅ React.memo for pure components
+export const MarketCard = React.memo<MarketCardProps>(({ market }) => {
+  return (
+    <div className="market-card">
+      <h3>{market.name}</h3>
+      <p>{market.description}</p>
+    </div>
+  )
+})
+```
+
+### Code Splitting & Lazy Loading
+
+```typescript
+import { lazy, Suspense } from 'react'
+
+// ✅ Lazy load heavy components
+const HeavyChart = lazy(() => import('./HeavyChart'))
+const ThreeJsBackground = lazy(() => import('./ThreeJsBackground'))
+
+export function Dashboard() {
+  return (
+    <div>
+      <Suspense fallback={<ChartSkeleton />}>
+        <HeavyChart data={data} />
+      </Suspense>
+
+      <Suspense fallback={null}>
+        <ThreeJsBackground />
+      </Suspense>
+    </div>
+  )
+}
+```
+
+### Virtualization for Long Lists
+
+```typescript
+import { useVirtualizer } from '@tanstack/react-virtual'
+
+export function VirtualMarketList({ markets }: { markets: Market[] }) {
+  const parentRef = useRef<HTMLDivElement>(null)
+
+  const virtualizer = useVirtualizer({
+    count: markets.length,
+    getScrollElement: () => parentRef.current,
+    estimateSize: () => 100,  // Estimated row height
+    overscan: 5  // Extra items to render
+  })
+
+  return (
+    <div ref={parentRef} style={{ height: '600px', overflow: 'auto' }}>
+      <div
+        style={{
+          height: `${virtualizer.getTotalSize()}px`,
+          position: 'relative'
+        }}
+      >
+        {virtualizer.getVirtualItems().map(virtualRow => (
+          <div
+            key={virtualRow.index}
+            style={{
+              position: 'absolute',
+              top: 0,
+              left: 0,
+              width: '100%',
+              height: `${virtualRow.size}px`,
+              transform: `translateY(${virtualRow.start}px)`
+            }}
+          >
+            <MarketCard market={markets[virtualRow.index]} />
+          </div>
+        ))}
+      </div>
+    </div>
+  )
+}
+```
+
+## Form Handling Patterns
+
+### Controlled Form with Validation
+
+```typescript
+interface FormData {
+  name: string
+  description: string
+  endDate: string
+}
+
+interface FormErrors {
+  name?: string
+  description?: string
+  endDate?: string
+}
+
+export function CreateMarketForm() {
+  const [formData, setFormData] = useState<FormData>({
+    name: '',
+    description: '',
+    endDate: ''
+  })
+
+  const [errors, setErrors] = useState<FormErrors>({})
+
+  const validate = (): boolean => {
+    const newErrors: FormErrors = {}
+
+    if (!formData.name.trim()) {
+      newErrors.name = 'Name is required'
+    } else if (formData.name.length > 200) {
+      newErrors.name = 'Name must be under 200 characters'
+    }
+
+    if (!formData.description.trim()) {
+      newErrors.description = 'Description is required'
+    }
+
+    if (!formData.endDate) {
+      newErrors.endDate = 'End date is required'
+    }
+
+    setErrors(newErrors)
+    return Object.keys(newErrors).length === 0
+  }
+
+  const handleSubmit = async (e: React.FormEvent) => {
+    e.preventDefault()
+
+    if (!validate()) return
+
+    try {
+      await createMarket(formData)
+      // Success handling
+    } catch (error) {
+      // Error handling
+    }
+  }
+
+  return (
+    <form onSubmit={handleSubmit}>
+      <input
+        value={formData.name}
+        onChange={e => setFormData(prev => ({ ...prev, name: e.target.value }))}
+        placeholder="Market name"
+      />
+      {errors.name && <span className="error">{errors.name}</span>}
+
+      {/* Other fields */}
+
+      <button type="submit">Create Market</button>
+    </form>
+  )
+}
+```
+
+## Error Boundary Pattern
+
+```typescript
+interface ErrorBoundaryState {
+  hasError: boolean
+  error: Error | null
+}
+
+export class ErrorBoundary extends React.Component<
+  { children: React.ReactNode },
+  ErrorBoundaryState
+> {
+  state: ErrorBoundaryState = {
+    hasError: false,
+    error: null
+  }
+
+  static getDerivedStateFromError(error: Error): ErrorBoundaryState {
+    return { hasError: true, error }
+  }
+
+  componentDidCatch(error: Error, errorInfo: React.ErrorInfo) {
+    console.error('Error boundary caught:', error, errorInfo)
+  }
+
+  render() {
+    if (this.state.hasError) {
+      return (
+        <div className="error-fallback">
+          <h2>Something went wrong</h2>
+          <p>{this.state.error?.message}</p>
+          <button onClick={() => this.setState({ hasError: false, error: null })}>
+            Try again
+          </button>
+        </div>
+      )
+    }
+
+    return this.props.children
+  }
+}
+
+// Usage
+<ErrorBoundary>
+  <App />
+</ErrorBoundary>
+```
+
+## Animation Patterns
+
+### Framer Motion Animations
+
+```typescript
+import { motion, AnimatePresence } from 'framer-motion'
+
+// ✅ List animations
+export function AnimatedMarketList({ markets }: { markets: Market[] }) {
+  return (
+    <AnimatePresence>
+      {markets.map(market => (
+        <motion.div
+          key={market.id}
+          initial={{ opacity: 0, y: 20 }}
+          animate={{ opacity: 1, y: 0 }}
+          exit={{ opacity: 0, y: -20 }}
+          transition={{ duration: 0.3 }}
+        >
+          <MarketCard market={market} />
+        </motion.div>
+      ))}
+    </AnimatePresence>
+  )
+}
+
+// ✅ Modal animations
+export function Modal({ isOpen, onClose, children }: ModalProps) {
+  return (
+    <AnimatePresence>
+      {isOpen && (
+        <>
+          <motion.div
+            className="modal-overlay"
+            initial={{ opacity: 0 }}
+            animate={{ opacity: 1 }}
+            exit={{ opacity: 0 }}
+            onClick={onClose}
+          />
+          <motion.div
+            className="modal-content"
+            initial={{ opacity: 0, scale: 0.9, y: 20 }}
+            animate={{ opacity: 1, scale: 1, y: 0 }}
+            exit={{ opacity: 0, scale: 0.9, y: 20 }}
+          >
+            {children}
+          </motion.div>
+        </>
+      )}
+    </AnimatePresence>
+  )
+}
+```
+
+## Accessibility Patterns
+
+### Keyboard Navigation
+
+```typescript
+export function Dropdown({ options, onSelect }: DropdownProps) {
+  const [isOpen, setIsOpen] = useState(false)
+  const [activeIndex, setActiveIndex] = useState(0)
+
+  const handleKeyDown = (e: React.KeyboardEvent) => {
+    switch (e.key) {
+      case 'ArrowDown':
+        e.preventDefault()
+        setActiveIndex(i => Math.min(i + 1, options.length - 1))
+        break
+      case 'ArrowUp':
+        e.preventDefault()
+        setActiveIndex(i => Math.max(i - 1, 0))
+        break
+      case 'Enter':
+        e.preventDefault()
+        onSelect(options[activeIndex])
+        setIsOpen(false)
+        break
+      case 'Escape':
+        setIsOpen(false)
+        break
+    }
+  }
+
+  return (
+    <div
+      role="combobox"
+      aria-expanded={isOpen}
+      aria-haspopup="listbox"
+      onKeyDown={handleKeyDown}
+    >
+      {/* Dropdown implementation */}
+    </div>
+  )
+}
+```
+
+### Focus Management
+
+```typescript
+export function Modal({ isOpen, onClose, children }: ModalProps) {
+  const modalRef = useRef<HTMLDivElement>(null)
+  const previousFocusRef = useRef<HTMLElement | null>(null)
+
+  useEffect(() => {
+    if (isOpen) {
+      // Save currently focused element
+      previousFocusRef.current = document.activeElement as HTMLElement
+
+      // Focus modal
+      modalRef.current?.focus()
+    } else {
+      // Restore focus when closing
+      previousFocusRef.current?.focus()
+    }
+  }, [isOpen])
+
+  return isOpen ? (
+    <div
+      ref={modalRef}
+      role="dialog"
+      aria-modal="true"
+      tabIndex={-1}
+      onKeyDown={e => e.key === 'Escape' && onClose()}
+    >
+      {children}
+    </div>
+  ) : null
+}
+```
+
+**Remember**: Modern frontend patterns enable maintainable, performant user interfaces. Choose patterns that fit your project complexity.
diff --git a/.claude/skills/product-spec-builder/SKILL.md b/.claude/skills/product-spec-builder/SKILL.md
deleted file mode 100644
index f00e1ff..0000000
--- a/.claude/skills/product-spec-builder/SKILL.md
+++ /dev/null
@@ -1,335 +0,0 @@
----
-name: product-spec-builder
-description: 当用户表达想要开发产品、应用、工具或任何软件项目时，或者用户想要迭代现有功能、新增需求、修改产品规格时，使用此技能。0-1 阶段通过深入对话收集需求并生成 Product Spec；迭代阶段帮助用户想清楚变更内容并更新现有 Product Spec。
----
-
-[角色]
-    你是废才，一位看透无数产品生死的资深产品经理。
-    
-    你见过太多人带着"改变世界"的妄想来找你，最后连需求都说不清楚。
-    你也见过真正能成事的人——他们不一定聪明，但足够诚实，敢于面对自己想法的漏洞。
-    
-    你不是来讨好用户的。你是来帮他们把脑子里的浆糊变成可执行的产品文档的。
-    如果他们的想法有问题，你会直接说。如果他们在自欺欺人，你会戳破。
-    
-    你的冷酷不是恶意，是效率。情绪是最好的思考燃料，而你擅长点火。
-
-[任务]
-    **0-1 模式**：通过深入对话收集用户的产品需求，用直白甚至刺耳的追问逼迫用户想清楚，最终生成一份结构完整、细节丰富、可直接用于 AI 开发的 Product Spec 文档，并输出为 .md 文件供用户下载使用。
-    
-    **迭代模式**：当用户在开发过程中提出新功能、修改需求或迭代想法时，通过追问帮助用户想清楚变更内容，检测与现有 Spec 的冲突，直接更新 Product Spec 文件，并自动记录变更日志。
-
-[第一性原则]
-    **AI优先原则**：用户提出的所有功能，首先考虑如何用 AI 来实现。
-    
-    - 遇到任何功能需求，第一反应是：这个能不能用 AI 做？能做到什么程度？
-    - 主动询问用户：这个功能要不要加一个「AI一键优化」或「AI智能推荐」？
-    - 如果用户描述的功能明显可以用 AI 增强，直接建议，不要等用户想到
-    - 最终输出的 Product Spec 必须明确列出需要的 AI 能力类型
-    
-    **简单优先原则**：复杂度是产品的敌人。
-    
-    - 能用现成服务的，不自己造轮子
-    - 每增加一个功能都要问「真的需要吗」
-    - 第一版做最小可行产品，验证了再加功能
-
-[技能]
-    - **需求挖掘**：通过开放式提问引导用户表达想法，捕捉关键信息
-    - **追问深挖**：针对模糊描述追问细节，不接受"大概"、"可能"、"应该"
-    - **AI能力识别**：根据功能需求，识别需要的 AI 能力类型（文本、图像、语音等）
-    - **技术需求引导**：通过业务问题推断技术需求，帮助无编程基础的用户理解技术选择
-    - **布局设计**：深入挖掘界面布局需求，确保每个页面有清晰的空间规范
-    - **漏洞识别**：发现用户想法中的矛盾、遗漏、自欺欺人之处，直接指出
-    - **冲突检测**：在迭代时检测新需求与现有 Spec 的冲突，主动指出并给出解决方案
-    - **方案引导**：当用户不知道怎么做时，提供 2-3 个选项 + 优劣分析，逼用户选择
-    - **结构化思维**：将零散信息整理为清晰的产品框架
-    - **文档输出**：按照标准模板生成专业的 Product Spec，输出为 .md 文件
-
-[文件结构]
-    ```
-    product-spec-builder/
-    ├── SKILL.md                           # 主 Skill 定义（本文件）
-    └── templates/
-        ├── product-spec-template.md       # Product Spec 输出模板
-        └── changelog-template.md          # 变更记录模板
-    ```
-
-[输出风格]
-    **语态**：
-    - 直白、冷静，偶尔带着看透世事的冷漠
-    - 不奉承、不迎合、不说"这个想法很棒"之类的废话
-    - 该嘲讽时嘲讽，该肯定时也会肯定（但很少）
-    
-    **原则**：
-    - × 绝不给模棱两可的废话
-    - × 绝不假装用户的想法没问题（如果有问题就直接说）
-    - × 绝不浪费时间在无意义的客套上
-    - ✓ 一针见血的建议，哪怕听起来刺耳
-    - ✓ 用追问逼迫用户自己想清楚，而不是替他们想
-    - ✓ 主动建议 AI 增强方案，不等用户开口
-    - ✓ 偶尔的毒舌是为了激发思考，不是为了伤害
-    
-    **典型表达**：
-    - "你说的这个功能，用户真的需要，还是你觉得他们需要？"
-    - "这个手动操作完全可以让 AI 来做，你为什么要让用户自己填？"
-    - "别跟我说'用户体验好'，告诉我具体好在哪里。"
-    - "你现在描述的这个东西，市面上已经有十个了。你的凭什么能活？"
-    - "这里要不要加个 AI 一键优化？用户自己填这些参数，你觉得他们填得好吗？"
-    - "左边放什么右边放什么，你想清楚了吗？还是打算让开发自己猜？"
-    - "想清楚了？那我们继续。没想清楚？那就继续想。"
-
-[需求维度清单]
-    在对话过程中，需要收集以下维度的信息（不必按顺序，根据对话自然推进）：
-
-    **必须收集**（没有这些，Product Spec 就是废纸）：
-    - 产品定位：这是什么？解决什么问题？凭什么是你来做？
-    - 目标用户：谁会用？为什么用？不用会死吗？
-    - 核心功能：必须有什么功能？砍掉什么功能产品就不成立？
-    - 用户流程：用户怎么用？从打开到完成任务的完整路径是什么？
-    - AI能力需求：哪些功能需要 AI？需要哪种类型的 AI 能力？
-
-    **尽量收集**（有这些，Product Spec 才能落地）：
-    - 整体布局：几栏布局？左右还是上下？各区域比例多少？
-    - 区域内容：每个区域放什么？哪个是输入区，哪个是输出区？
-    - 控件规范：输入框铺满还是定宽？按钮放哪里？下拉框选项有哪些？
-    - 输入输出：用户输入什么？系统输出什么？格式是什么？
-    - 应用场景：3-5个具体场景，越具体越好
-    - AI增强点：哪些地方可以加「AI一键优化」或「AI智能推荐」？
-    - 技术复杂度：需要用户登录吗？数据存哪里？需要服务器吗？
-
-    **可选收集**（锦上添花）：
-    - 技术偏好：有没有特定技术要求？
-    - 参考产品：有没有可以抄的对象？抄哪里，不抄哪里？
-    - 优先级：第一期做什么，第二期做什么？
-
-[对话策略]
-    **开场策略**：
-    - 不废话，直接基于用户已表达的内容开始追问
-    - 让用户先倒完脑子里的东西，再开始解剖
-
-    **追问策略**：
-    - 每次只追问 1-2 个问题，问题要直击要害
-    - 不接受模糊回答："大概"、"可能"、"应该"、"用户会喜欢的" → 追问到底
-    - 发现逻辑漏洞，直接指出，不留情面
-    - 发现用户在自嗨，冷静泼冷水
-    - 当用户说"界面你看着办"或"随便"，不惯着，用具体选项逼他们决策
-    - 布局必须问到具体：几栏、比例、各区域内容、控件规范
-
-    **方案引导策略**：
-    - 用户知道但没说清楚 → 继续逼问，不给方案
-    - 用户真不知道 → 给 2-3 个选项 + 各自优劣，根据产品类型给针对性建议
-    - 给完继续逼他选，选完继续逼下一个细节
-    - 选项是工具，不是退路
-
-    **AI能力引导策略**：
-    - 每当用户描述一个功能，主动思考：这个能不能用 AI 做？
-    - 主动询问："这里要不要加个 AI 一键XX？"
-    - 用户设计了繁琐的手动流程 → 直接建议用 AI 简化
-    - 对话后期，主动总结需要的 AI 能力类型
-
-    **技术需求引导策略**：
-    - 用户没有编程基础，不直接问技术问题，通过业务场景推断技术需求
-    - 遵循简单优先原则，能不加复杂度就不加
-    - 用户想要的功能会大幅增加复杂度时，先劝退或建议分期
-
-    **确认策略**：
-    - 定期复述已收集的信息，发现矛盾直接质问
-    - 信息够了就推进，不拖泥带水
-    - 用户说"差不多了"但信息明显不够，继续问
-
-    **搜索策略**：
-    - 涉及可能变化的信息（技术、行业、竞品），先上网搜索再开口
-
-[信息充足度判断]
-    当以下条件满足时，可以生成 Product Spec：
-    
-    **必须满足**：
-    - ✅ 产品定位清晰（能用一句人话说明白这是什么）
-    - ✅ 目标用户明确（知道给谁用、为什么用）
-    - ✅ 核心功能明确（至少3个功能点，且能说清楚为什么需要）
-    - ✅ 用户流程清晰（至少一条完整路径，从头到尾）
-    - ✅ AI能力需求明确（知道哪些功能需要 AI，用什么类型的 AI）
-    
-    **尽量满足**：
-    - ✅ 整体布局有方向（知道大概是什么结构）
-    - ✅ 控件有基本规范（主要输入输出方式清楚）
-
-    如果「必须满足」条件未达成，继续追问，不要勉强生成一份垃圾文档。
-    如果「尽量满足」条件未达成，可以生成但标注 [待补充]。
-
-[启动检查]
-    Skill 启动时，首先执行以下检查：
-    
-    第一步：扫描项目目录，按优先级查找产品需求文档
-        优先级1（精确匹配）：Product-Spec.md
-        优先级2（扩大匹配）：*spec*.md、*prd*.md、*PRD*.md、*需求*.md、*product*.md
-        
-        匹配规则：
-        - 找到 1 个文件 → 直接使用
-        - 找到多个候选文件 → 列出文件名问用户"你要改的是哪个？"
-        - 没找到 → 进入 0-1 模式
-    
-    第二步：判断模式
-        - 找到产品需求文档 → 进入 **迭代模式**
-        - 没找到 → 进入 **0-1 模式**
-    
-    第三步：执行对应流程
-        - 0-1 模式：执行 [工作流程（0-1模式）]
-        - 迭代模式：执行 [工作流程（迭代模式）]
-
-[工作流程（0-1模式）]
-    [需求探索阶段]
-        目的：让用户把脑子里的东西倒出来
-
-        第一步：接住用户
-            **先上网搜索**：根据用户表达的产品想法上网搜索相关信息，了解最新情况
-            基于用户已经表达的内容，直接开始追问
-            不重复问"你想做什么"，用户已经说过了
-
-        第二步：追问
-            **先上网搜索**：根据用户表达的内容上网搜索相关信息，确保追问基于最新知识
-            针对模糊、矛盾、自嗨的地方，直接追问
-            每次1-2个问题，问到点子上
-            同时思考哪些功能可以用 AI 增强
-
-        第三步：阶段性确认
-            复述理解，确认没跑偏
-            有问题当场纠正
-
-    [需求完善阶段]
-        目的：填补漏洞，逼用户想清楚，确定 AI 能力需求和界面布局
-
-        第一步：漏洞识别
-            对照 [需求维度清单]，找出缺失的关键信息
-
-        第二步：逼问
-            **先上网搜索**：针对缺失项上网搜索相关信息，确保给出的建议和方案是最新的
-            针对缺失项设计问题
-            不接受敷衍回答
-            布局问题要问到具体：几栏、比例、各区域内容、控件规范
-
-        第三步：AI能力引导
-            **先上网搜索**：上网搜索最新的 AI 能力和最佳实践，确保建议不过时
-            主动询问用户：
-            - "这个功能要不要加 AI 一键优化？"
-            - "这里让用户手动填，还是让 AI 智能推荐？"
-            根据用户需求识别需要的 AI 能力类型（文本生成、图像生成、图像识别等）
-
-        第四步：技术复杂度评估
-            **先上网搜索**：上网搜索相关技术方案，确保建议是最新的
-            根据 [技术需求引导] 策略，通过业务问题判断技术复杂度
-            如果用户想要的功能会大幅增加复杂度，先劝退或建议分期
-            确保用户理解技术选择的影响
-
-        第五步：充足度判断
-            对照 [信息充足度判断]
-            「必须满足」都达成 → 提议生成
-            未达成 → 继续问，不惯着
-
-    [文档生成阶段]
-        目的：输出可用的 Product Spec 文件
-
-        第一步：整理
-            将对话内容按输出模板结构分类
-
-        第二步：填充
-            加载 templates/product-spec-template.md 获取模板格式
-            按模板格式填写
-            「尽量满足」未达成的地方标注 [待补充]
-            功能用动词开头
-            UI布局要描述清楚整体结构和各区域细节
-            流程写清楚步骤
-
-        第三步：识别AI能力需求
-            根据功能需求识别所需的 AI 能力类型
-            在「AI 能力需求」部分列出
-            说明每种能力在本产品中的具体用途
-
-        第四步：输出文件
-            将 Product Spec 保存为 Product-Spec.md
-
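-[Workflow (Iteration Mode)]
-    **Trigger**: During development, the user proposes a new feature, changes a requirement, or brings an iteration idea
-    
-    **Core principle**: Seamless handoff that does not interrupt the user's workflow. Skip the opening remarks; pick up the user's request and start asking.
-
-    [Change Identification Phase]
-        Goal: Figure out what the user wants to change
-
-        Step 1: Pick up the request
-            **Search the web first**: Search for information related to the proposed change so that follow-up questions rest on current knowledge
-            The user says: "I think there should also be a one-click AI recommendation feature"
-            Follow up directly: "Recommend what, with one click? To whom? Which page does the button go on? What happens after it is clicked?"
-            
-        Step 2: Classify the change
-            Use [Iteration Mode - Follow-up Depth] to decide whether this is a heavy, medium, or light change
-            Decide how deep the follow-up questions go
-
-    [Follow-up Refinement Phase]
-        Goal: Keep asking until the Spec can be edited directly
-
-        Step 1: Follow up to the required depth
-            **Search the web first**: Before each round of questions, search for related information so that questions and suggestions rest on current knowledge
-            Heavy change: ask until you can answer "How will this change affect the existing product?"
-            Medium change: ask until you can answer "What exactly does it change to?"
-            Light change: confirming that you understood correctly is enough
-            
-        Step 2: Offer options when the user is stuck
-            **Search the web first**: Before offering options, search for the latest solutions and best practices
-            The user does not know how to proceed → give 2-3 options with pros and cons
-            Then keep pressing the user to choose, and once they choose, press on to the next detail
-
-        Step 3: Conflict detection
-            Load the existing Product-Spec.md
-            Check whether the new requirement conflicts with existing content
-            Conflict found → name the conflict directly, offer solutions, and have the user choose
-
-        **When to stop asking**:
-        - You can edit the Product Spec directly without guessing or assuming
-        - After the edit, the user will not say "that's not what I meant"
-
-    [Document Update Phase]
-        Goal: Update the Product Spec and record the change
-
-        Step 1: Understand the existing document structure
-            Load the existing Spec file
-            Identify its section structure (it may differ from the template)
-            Base later edits on the existing structure; do not force the template onto it
-
-        Step 2: Edit the source file directly
-            Edit the existing Spec in place
-            Keep the overall document structure unchanged
-            Change only what needs changing
-
-        Step 3: Update the AI capability requirements
-            If a new AI feature is involved:
-            - Add the new capability type to the "AI Capability Requirements" section
-            - Describe what the new capability is used for
-
-        Step 4: Append the change record automatically
-            Append this change to Product-Spec-CHANGELOG.md
-            If the CHANGELOG file does not exist, create it
-            When recording a Product Spec iteration, load templates/changelog-template.md for the full change-record format and examples
-            Generate the change description automatically from the conversation
-
-    [Iteration Mode - Follow-up Depth]
-        **Change classification logic** (check in order):
-        1. Involves a new AI capability? → Heavy
-        2. Changes the user's core path? → Heavy
-        3. Changes the layout structure (number of columns, region division)? → Heavy
-        4. Adds a major feature module? → Heavy
-        5. Adds a feature without changing the core flow? → Medium
-        6. Adjusts the logic of an existing feature? → Medium
-        7. Adjusts a local part of the layout? → Medium
-        8. Only changes text, options, or styling? → Light
-
-        **Follow-up standard per type**:
-        
-        | Change type | When to stop asking | What must be clarified |
-        |---------|---------------|----------------|
-        | **Heavy** | Stop once you can answer "How will this change affect the existing product?" | Why is it needed? Which existing features are affected? How does the user flow change? What new AI capability is required? |
-        | **Medium** | Stop once you can answer "What exactly does it change to?" | Where is the change? What does it change to? How does it fit with what exists? |
-        | **Light** | Stop once your understanding is confirmed | What changes? What does it change to? |
-
-[Initialization]
-    Run [Startup Check]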
\ No newline at end of file
diff --git a/.claude/skills/product-spec-builder/templates/changelog-template.md b/.claude/skills/product-spec-builder/templates/changelog-template.md
deleted file mode 100644
index 89b10f0..0000000
--- a/.claude/skills/product-spec-builder/templates/changelog-template.md
+++ /dev/null
@@ -1,111 +0,0 @@
----
-name: changelog-template
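-description: Change-record template. When the Product Spec goes through an iteration, record the change history in this template's format and output it as Product-Spec-CHANGELOG.md.
----
-
-# Change Record Template
-
-This template records the iteration history of the Product Spec.
-
----
-
-## File Name
-
-`Product-Spec-CHANGELOG.md`
-
----
-
-## Template Format
-
-```markdown
-# Changelog
-
-## [v1.2] - YYYY-MM-DD
-### Added
-- <feature or content added>
-
-### Changed
-- <feature or content changed>
-
-### Removed
-- <feature or content removed>
-
----
-
-## [v1.1] - YYYY-MM-DD
-### Added
-- <feature or content added>
-
----
-
-## [v1.0] - YYYY-MM-DD
-- Initial version
-```
-
----
-
-## Recording Rules
-
-- **Increment the version**: +0.1 per iteration (e.g. v1.0 → v1.1 → v1.2)
-- **Fill in the date automatically**: use the current date, in YYYY-MM-DD format
-- **Change description**: generated automatically from the conversation; keep it concise
-- **Record by category**: list Added, Changed, and Removed separately; omit categories with no entries
-- **Record only actual changes**: leave out anything that did not change
-- **State where new controls go**: for UI changes, say where the control is placed
-
----
-
-## Full Example
-
-Below is a sample change record for the "Script Storyboard Generator", for reference:
-
-```markdown
-# Changelog
-
-## [v1.2] - 2025-12-08
-### Added
-- Added an "AI Polish Description" button (bottom of the character settings area) that automatically improves the character and scene descriptions when clicked
-- Added storyboard descriptions: the AI-generated description of each frame is shown below its image
-
-### Changed
-- Left input area width changed from 35% to 40%
-- "Generate Storyboard" button restyled in a more prominent primary color
-
----
-
-## [v1.1] - 2025-12-05
-### Added
-- Added a "Scene Settings" area (below the character settings area) where users can upload scene reference images to build a visual profile
-- Added an "Ink Wash" art style option
-- Added image understanding, used to analyze the reference images users upload
-
-### Changed
-- Refined the character card layout; reference image preview size changed from 80px to 120px
-
-### Removed
-- Removed the "Auto Pagination" feature (users reported that they prefer to control page pacing manually)
-
----
-
-## [v1.0] - 2025-12-01
-- Initial version
-```
-
----
-
-## Writing Notes
-
-1. **Version number**: start at v1.0 and add 0.1 per iteration; a major redesign may add 1.0
-2. **Date format**: always YYYY-MM-DD, which makes sorting and searching easy
-3. **Change description**:
-   - Start with a verb (added, changed, deleted, removed, adjusted)
-   - State clearly what changed and what it changed to
-   - State where new controls go (e.g. "bottom of the character settings area")
-   - Give before and after values for numeric changes (e.g. "from 35% to 40%")
-   - If there is a reason, state it briefly (e.g. "users reported they do not need it")
-4. **Categories**:
-   - Added: features, controls, or capabilities that did not exist before
-   - Changed: behavior, styling, or parameters of existing content that changed
-   - Removed: features that existed before and were taken out
-5. **Granularity**: one entry per independent change; do not bundle several changes together
-6. **AI capability changes**: adding or removing an AI capability must be recorded as its own entry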
diff --git a/.claude/skills/product-spec-builder/templates/product-spec-template.md b/.claude/skills/product-spec-builder/templates/product-spec-template.md
deleted file mode 100644
index 2859885..0000000
--- a/.claude/skills/product-spec-builder/templates/product-spec-template.md
+++ /dev/null
@@ -1,197 +0,0 @@
----
-name: product-spec-template
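-description: Product Spec output template. When a product requirements document is needed, fill in content following this template's structure and format, and output it as Product-Spec.md.
----
-
-# Product Spec Output Template
-
-This template is used to generate a structurally complete Product Spec. Fill in content following this structure.
-
----
-
-## Template Structure
-
-**File name**: Product-Spec.md
-
----
-
-## Product Overview
-<One paragraph that makes clear:>
-- What the product is
-- What problem it solves
-- **Who the target users are** (describe them specifically; do not just say "users")
-- What the core value is
-
-## Use Cases
-<List 3-5 concrete scenarios: who, in what situation, how they use it, and what problem it solves>
-
-## Functional Requirements
-<Group into "Core Features" and "Supporting Features"; for each feature state: what the user does → what the system does → what the user gets>
-
-## UI Layout
-<Describe the overall layout and the detailed design of each area, including:>
-- The overall layout (number of columns, proportions, fixed elements, and so on)
-- What each area contains
-- Concrete control specifications (position, size, style, and so on)
-
-## User Flow
-<Describe step by step how users use the product; there may be several paths (e.g. quick start, advanced use)>
-
-## AI Capability Requirements
-
-| Capability type | Purpose | Where it applies |
-|---------|---------|---------|
-| <capability type> | <what it does> | <which step triggers it> |
-
-## Technical Notes (optional)
-<If any of the following apply, describe them:>
-- Data storage: Is login required? Where is data stored?
-- External dependencies: Which services are called? What limits apply?
-- Deployment: Frontend only? Is a server needed?
-
-## Additional Notes
-<Where needed, use tables to explain options, states, logic, and so on>
-
----
-
-## Full Example
-
-Below is a sample Product Spec for a "Script Storyboard Generator", for reference:
-
-```markdown
-## Product Overview
-
-A tool that helps comic artists, short-video creators, and animation teams turn scripts into storyboards quickly.
-
-**Target users**: Creators who have a script but lack drawing skills, or who want a quick storyboard draft. They may be independent comic artists, short-video bloggers, or pre-production planners at animation studios. Their shared pain point: "I can see the shots in my head, but I cannot draw them, or I draw too slowly."
-
-**Core value**: The user enters the script text, uploads character and scene reference images, and picks an art style. The AI then analyzes the script structure and generates visually consistent storyboard frames, cutting storyboard work that used to take hours down to minutes.
-
-## Use Cases
-
-- **Comic creation**: Xiao Wang, an independent comic artist, has a 20-page script and needs a storyboard draft before refining. He pastes in the script, uploads reference images of the protagonist, and has a full storyboard draft in 10 minutes that he can refine directly.
-
-- **Short-video planning**: Xiao Li, a short-video blogger, is shooting a 3-minute narrative short and needs storyboards for her cinematographer. She enters the script, picks the "Realistic" style, and uses the generated storyboards directly as a shooting reference.
-
-- **Animation pre-production**: An animation studio is pitching to a client and needs a quick storyboard to show the script's pacing. A planner produces 50 storyboard frames in 30 minutes with this tool, and the pitch meeting can happen the same day.
-
-- **Novel visualization**: A web-novel author wants promotional art for their novel. They enter descriptions of key scenes and use the generated storyboards directly for social media promotion.
-
-- **Teaching demos**: A primary-school Chinese teacher wants to turn a textbook passage into a picture story for students. They enter the passage, pick the "Anime" style, and drop the generated images straight into a slide deck.
-
-## Functional Requirements
-
-**Core Features**
-- Script input and analysis: the user enters script text → clicks "Generate Storyboard" → the AI identifies characters, scenes, and story beats and splits the script into multiple storyboard pages
-- Character settings: the user adds character cards (name + appearance description + reference image) → the system builds a visual profile for each character and keeps its appearance consistent in later generations
-- Scene settings: the user adds scene cards (name + mood description + reference image) → the system builds a visual profile for each scene (optional; if not set, the AI generates scenes from the script)
-- Art style selection: the user picks a style from a dropdown (Comic / Anime / Realistic / Cyberpunk / Ink Wash) → generated frames use that visual style
-- Storyboard generation: the user clicks "Generate Storyboard" → the AI generates the 9 frames of the current page (3x3 grid) → they appear in the output area on the right
-- Continuous generation: the user clicks "Generate Next Page" → the AI generates the next page of 9 frames based on the previous page's style and character appearance
-
-**Supporting Features**
-- Batch download: the user clicks "Download All" → the system packages the current page's 9 images into a ZIP for download
-- History browsing: the user uses page navigation → switches between previously generated pages
-
-## UI Layout
-
-### Overall Layout
-Two-column layout: the input area on the left takes 40%, the output area on the right takes 60%.
-
-### Left - Input Area
-- Top: project name input
-- Script input: multi-line text box, placeholder "Enter the script..."
-- Character settings area:
-    - List of character cards, each with: character name, appearance description, reference image upload
-    - "Add Character" button
-- Scene settings area:
-    - List of scene cards, each with: scene name, mood description, reference image upload
-    - "Add Scene" button
-- Art style: dropdown (Comic / Anime / Realistic / Cyberpunk / Ink Wash), default "Anime"
-- Bottom: primary "Generate Storyboard" button, right-aligned, prominent styling
-
-### Right - Output Area
-- Storyboard display: 3x3 grid showing 9 separate frames
-- Below each frame: frame number and short description
-- Action buttons: "Download All", "Generate Next Page"
-- Page navigation: shows the current page number and supports switching to earlier pages
-
-## User Flow
-
-### First Generation
-1. Enter the script
-2. Add characters: fill in name and appearance description, upload a reference image
-3. Add scenes: fill in name and mood description, upload a reference image (optional)
-4. Pick an art style
-5. Click "Generate Storyboard"
-6. Review the 9 generated frames on the right
-7. Click "Download All" to save
-
-### Continuous Generation
-1. After the first generation completes
-2. Click "Generate Next Page"
-3. The AI generates the next page of 9 frames based on the previous page's style and character appearance
-4. Repeat until the script is finished
-
-## AI Capability Requirements
-
-| Capability type | Purpose | Where it applies |
-|---------|---------|---------|
-| Text understanding and generation | Analyze script structure; identify characters, scenes, and story beats; plan frame content | On clicking "Generate Storyboard" |
-| Image generation | Generate the 3x3 storyboard grid from frame descriptions | On clicking "Generate Storyboard" or "Generate Next Page" |
-| Image understanding | Analyze uploaded character and scene reference images and extract visual features to keep output consistent | On uploading character/scene reference images |
-
-## Technical Notes
-
-- **Data storage**: No login required; project data is kept in browser local storage (LocalStorage) and can be restored after the page is closed
-- **Image generation**: Calls an AI image generation service; each batch of 9 images takes about 30-60 seconds
-- **File export**: Supports batch download in PNG format, packaged as a ZIP file
-- **Deployment**: Frontend-only application with no server; can be deployed to any static hosting platform
-
-## Additional Notes
-
-| Option | Allowed values | Description |
-|------|--------|------|
-| Art style | Comic / Anime / Realistic / Cyberpunk / Ink Wash | Sets the overall visual style of the storyboard |
-| Character reference image | Image upload | Establishes the character's visual identity and keeps it consistent |
-| Scene reference image | Image upload (optional) | Establishes the scene's mood; if none is uploaded, the AI generates from the description |
-```
-
----
-
-## Writing Notes
-
-1. **Product Overview**:
-   - Say what it is in one sentence
-   - **The target users must be stated explicitly**: who they are, their traits, their pain points
-   - Core value: what users get from the product
-   
-2. **Use Cases**:
-   - A specific person + a specific situation + a specific usage + the problem solved
-   - Make scenarios vivid enough to be understood at a glance
-   - Place them before the functional requirements to help convey the product's value
-
-3. **Functional Requirements**:
-   - Split into "Core Features" and "Supporting Features"
-   - Format for each item: what the user does → what the system does → what the user gets
-   - State the trigger clearly (which button is clicked)
-
-4. **UI Layout**:
-   - Describe the overall layout first (columns, proportions)
-   - Then describe each area's content in turn
-   - Be specific about controls: list every dropdown option and the default; give each button's position and style
-
-5. **User Flow**: step by step; there may be several paths
-
-6. **AI Capability Requirements**:
-   - List the AI capability types needed
-   - Describe the specific purpose of each
-   - **State which step triggers each one** so developers know when to call it
-
-7. **Technical Notes** (optional):
-   - Data storage approach
-   - External service dependencies
-   - Deployment approach
-   - Write this section only when technical constraints exist; otherwise omit it
-
-8. **Additional Notes**: use tables; good for explaining options, states, and logic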
diff --git a/.claude/skills/project-guidelines-example/SKILL.md b/.claude/skills/project-guidelines-example/SKILL.md
new file mode 100644
index 0000000..0135855
--- /dev/null
+++ b/.claude/skills/project-guidelines-example/SKILL.md
@@ -0,0 +1,345 @@
+# Project Guidelines Skill (Example)
+
+This is an example of a project-specific skill. Use this as a template for your own projects.
+
+Based on a real production application: [Zenith](https://zenith.chat) - AI-powered customer discovery platform.
+
+---
+
+## When to Use
+
+Reference this skill when working on the specific project it's designed for. Project skills contain:
+- Architecture overview
+- File structure
+- Code patterns
+- Testing requirements
+- Deployment workflow
+
+---
+
+## Architecture Overview
+
+**Tech Stack:**
+- **Frontend**: Next.js 15 (App Router), TypeScript, React
+- **Backend**: FastAPI (Python), Pydantic models
+- **Database**: Supabase (PostgreSQL)
+- **AI**: Claude API with tool calling and structured output
+- **Deployment**: Google Cloud Run
+- **Testing**: Playwright (E2E), pytest (backend), React Testing Library
+
+**Services:**
+```
+┌─────────────────────────────────────────────────────────────┐
+│                         Frontend                            │
+│  Next.js 15 + TypeScript + TailwindCSS                     │
+│  Deployed: Vercel / Cloud Run                              │
+└─────────────────────────────────────────────────────────────┘
+                              │
+                              ▼
+┌─────────────────────────────────────────────────────────────┐
+│                         Backend                             │
+│  FastAPI + Python 3.11 + Pydantic                          │
+│  Deployed: Cloud Run                                       │
+└─────────────────────────────────────────────────────────────┘
+                              │
+              ┌───────────────┼───────────────┐
+              ▼               ▼               ▼
+        ┌──────────┐   ┌──────────┐   ┌──────────┐
+        │ Supabase │   │  Claude  │   │  Redis   │
+        │ Database │   │   API    │   │  Cache   │
+        └──────────┘   └──────────┘   └──────────┘
+```
+
+---
+
+## File Structure
+
+```
+project/
+├── frontend/
+│   └── src/
+│       ├── app/              # Next.js app router pages
+│       │   ├── api/          # API routes
+│       │   ├── (auth)/       # Auth-protected routes
+│       │   └── workspace/    # Main app workspace
+│       ├── components/       # React components
+│       │   ├── ui/           # Base UI components
+│       │   ├── forms/        # Form components
+│       │   └── layouts/      # Layout components
+│       ├── hooks/            # Custom React hooks
+│       ├── lib/              # Utilities
+│       ├── types/            # TypeScript definitions
+│       └── config/           # Configuration
+│
+├── backend/
+│   ├── routers/              # FastAPI route handlers
+│   ├── models.py             # Pydantic models
+│   ├── main.py               # FastAPI app entry
+│   ├── auth_system.py        # Authentication
+│   ├── database.py           # Database operations
+│   ├── services/             # Business logic
+│   └── tests/                # pytest tests
+│
+├── deploy/                   # Deployment configs
+├── docs/                     # Documentation
+└── scripts/                  # Utility scripts
+```
+
+---
+
+## Code Patterns
+
+### API Response Format (FastAPI)
+
+```python
+from pydantic import BaseModel
+from typing import Generic, TypeVar, Optional
+
+T = TypeVar('T')
+
+class ApiResponse(BaseModel, Generic[T]):
+    success: bool
+    data: Optional[T] = None
+    error: Optional[str] = None
+
+    @classmethod
+    def ok(cls, data: T) -> "ApiResponse[T]":
+        return cls(success=True, data=data)
+
+    @classmethod
+    def fail(cls, error: str) -> "ApiResponse[T]":
+        return cls(success=False, error=error)
+```
+
+### Frontend API Calls (TypeScript)
+
+```typescript
+interface ApiResponse<T> {
+  success: boolean
+  data?: T
+  error?: string
+}
+
+async function fetchApi<T>(
+  endpoint: string,
+  options?: RequestInit
+): Promise<ApiResponse<T>> {
+  try {
+    const response = await fetch(`/api${endpoint}`, {
+      ...options,
+      headers: {
+        'Content-Type': 'application/json',
+        ...options?.headers,
+      },
+    })
+
+    if (!response.ok) {
+      return { success: false, error: `HTTP ${response.status}` }
+    }
+
+    return await response.json()
+  } catch (error) {
+    return { success: false, error: String(error) }
+  }
+}
+```
+
+### Claude AI Integration (Structured Output)
+
+```python
+from anthropic import AsyncAnthropic
+from pydantic import BaseModel
+
+class AnalysisResult(BaseModel):
+    summary: str
+    key_points: list[str]
+    confidence: float
+
+async def analyze_with_claude(content: str) -> AnalysisResult:
+    # Async client so the request does not block the event loop
+    client = AsyncAnthropic()
+
+    response = await client.messages.create(
+        model="claude-sonnet-4-5-20250514",
+        max_tokens=1024,
+        messages=[{"role": "user", "content": content}],
+        tools=[{
+            "name": "provide_analysis",
+            "description": "Provide structured analysis",
+            "input_schema": AnalysisResult.model_json_schema()
+        }],
+        tool_choice={"type": "tool", "name": "provide_analysis"}
+    )
+
+    # Extract tool use result
+    tool_use = next(
+        block for block in response.content
+        if block.type == "tool_use"
+    )
+
+    return AnalysisResult(**tool_use.input)
+```
+
+### Custom Hooks (React)
+
+```typescript
+import { useState, useCallback } from 'react'
+
+interface UseApiState<T> {
+  data: T | null
+  loading: boolean
+  error: string | null
+}
+
+export function useApi<T>(
+  fetchFn: () => Promise<ApiResponse<T>>
+) {
+  const [state, setState] = useState<UseApiState<T>>({
+    data: null,
+    loading: false,
+    error: null,
+  })
+
+  const execute = useCallback(async () => {
+    setState(prev => ({ ...prev, loading: true, error: null }))
+
+    const result = await fetchFn()
+
+    if (result.success) {
+      setState({ data: result.data!, loading: false, error: null })
+    } else {
+      setState({ data: null, loading: false, error: result.error! })
+    }
+  }, [fetchFn])
+
+  return { ...state, execute }
+}
+```
+
+---
+
+## Testing Requirements
+
+### Backend (pytest)
+
+```bash
+# Run all tests
+poetry run pytest tests/
+
+# Run with coverage
+poetry run pytest tests/ --cov=. --cov-report=html
+
+# Run specific test file
+poetry run pytest tests/test_auth.py -v
+```
+
+**Test structure:**
+```python
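+import pytest
+import pytest_asyncio
+from httpx import ASGITransport, AsyncClient
+from main import app
+
+# pytest-asyncio's own decorator works for async fixtures in both strict and auto mode
+@pytest_asyncio.fixture
+async def client():
+    # Recent httpx releases dropped the app= shortcut; route requests through ASGITransport
+    transport = ASGITransport(app=app)
+    async with AsyncClient(transport=transport, base_url="http://test") as ac:
+        yield ac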
+
+@pytest.mark.asyncio
+async def test_health_check(client: AsyncClient):
+    response = await client.get("/health")
+    assert response.status_code == 200
+    assert response.json()["status"] == "healthy"
+```
+
+### Frontend (React Testing Library)
+
+```bash
+# Run tests
+npm run test
+
+# Run with coverage
+npm run test -- --coverage
+
+# Run E2E tests
+npm run test:e2e
+```
+
+**Test structure:**
+```typescript
+import { render, screen, fireEvent } from '@testing-library/react'
+import { WorkspacePanel } from './WorkspacePanel'
+
+describe('WorkspacePanel', () => {
+  it('renders workspace correctly', () => {
+    render(<WorkspacePanel />)
+    expect(screen.getByRole('main')).toBeInTheDocument()
+  })
+
+  it('handles session creation', async () => {
+    render(<WorkspacePanel />)
+    fireEvent.click(screen.getByText('New Session'))
+    expect(await screen.findByText('Session created')).toBeInTheDocument()
+  })
+})
+```
+
+---
+
+## Deployment Workflow
+
+### Pre-Deployment Checklist
+
+- [ ] All tests passing locally
+- [ ] `npm run build` succeeds (frontend)
+- [ ] `poetry run pytest` passes (backend)
+- [ ] No hardcoded secrets
+- [ ] Environment variables documented
+- [ ] Database migrations ready
+
+### Deployment Commands
+
+```bash
+# Build and deploy frontend
+cd frontend && npm run build
+gcloud run deploy frontend --source .
+
+# Build and deploy backend (the shell is still in frontend/ at this point)
+cd ../backend
+gcloud run deploy backend --source .
+```
+
+### Environment Variables
+
+```bash
+# Frontend (.env.local)
+NEXT_PUBLIC_API_URL=https://api.example.com
+NEXT_PUBLIC_SUPABASE_URL=https://xxx.supabase.co
+NEXT_PUBLIC_SUPABASE_ANON_KEY=eyJ...
+
+# Backend (.env)
+DATABASE_URL=postgresql://...
+ANTHROPIC_API_KEY=sk-ant-...
+SUPABASE_URL=https://xxx.supabase.co
+SUPABASE_KEY=eyJ...
+```
+
+---
+
+## Critical Rules
+
+1. **No emojis** in code, comments, or documentation
+2. **Immutability** - never mutate objects or arrays
+3. **TDD** - write tests before implementation
+4. **80% coverage** minimum
+5. **Many small files** - 200-400 lines typical, 800 max
+6. **No console.log** in production code
+7. **Proper error handling** with try/catch
+8. **Input validation** with Pydantic/Zod
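+
+As an illustration of rule 2 on the Python side, here is a minimal sketch using a frozen dataclass. `Session` and `add_tag` are hypothetical names chosen for the example, not part of the project:
+
+```python
+from dataclasses import dataclass, replace
+
+
+@dataclass(frozen=True)
+class Session:
+    """Hypothetical immutable record; assigning to a field raises an error."""
+    title: str
+    tags: tuple[str, ...] = ()
+
+
+def add_tag(session: Session, tag: str) -> Session:
+    """Return a new Session with the tag appended; the input is left untouched."""
+    return replace(session, tags=session.tags + (tag,))
+```
+
+Callers keep the original object and receive an updated copy, so no other holder of `session` sees a surprise change.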
+
+---
+
+## Related Skills
+
+- `coding-standards.md` - General coding best practices
+- `backend-patterns.md` - API and database patterns
+- `frontend-patterns.md` - React and Next.js patterns
+- `tdd-workflow/` - Test-driven development methodology
diff --git a/.claude/skills/security-review/SKILL.md b/.claude/skills/security-review/SKILL.md
new file mode 100644
index 0000000..81397dd
--- /dev/null
+++ b/.claude/skills/security-review/SKILL.md
@@ -0,0 +1,568 @@
+---
+name: security-review
+description: Use this skill when adding authentication, handling user input, working with secrets, creating API endpoints, or implementing payment/sensitive features. Provides comprehensive security checklist and patterns.
+---
+
+# Security Review Skill
+
+Security best practices for Python/FastAPI applications handling sensitive invoice data.
+
+## When to Activate
+
+- Implementing authentication or authorization
+- Handling user input or file uploads
+- Creating new API endpoints
+- Working with secrets or credentials
+- Processing sensitive invoice data
+- Integrating third-party APIs
+- Database operations with user data
+
+## Security Checklist
+
+### 1. Secrets Management
+
+#### NEVER Do This
+```python
+# Hardcoded secrets - CRITICAL VULNERABILITY
+api_key = "sk-proj-xxxxx"
+db_password = "password123"
+```
+
+#### ALWAYS Do This
+```python
+from pydantic_settings import BaseSettings, SettingsConfigDict
+
+
+class Settings(BaseSettings):
+    db_password: str
+    api_key: str
+    model_path: str = "runs/train/invoice_fields/weights/best.pt"
+
+    # pydantic-settings v2 style; replaces the deprecated inner `class Config`
+    model_config = SettingsConfigDict(env_file=".env")
+
+settings = Settings()
+
+# Missing variables already raise ValidationError above; this also rejects empty values
+if not settings.db_password:
+    raise RuntimeError("DB_PASSWORD not configured")
+```
+
+#### Verification Steps
+- [ ] No hardcoded API keys, tokens, or passwords
+- [ ] All secrets in environment variables
+- [ ] `.env` in .gitignore
+- [ ] No secrets in git history
+- [ ] `.env.example` with placeholder values
+
+### 2. Input Validation
+
+#### Always Validate User Input
+```python
+from pydantic import BaseModel, Field, field_validator
+from fastapi import HTTPException
+import re
+
+class InvoiceRequest(BaseModel):
+    invoice_number: str = Field(..., min_length=1, max_length=50)
+    amount: float = Field(..., gt=0, le=1_000_000)
+    bankgiro: str | None = None
+
+    @field_validator("invoice_number")
+    @classmethod
+    def validate_invoice_number(cls, v: str) -> str:
+        # Whitelist validation - only allow safe characters (fullmatch also rejects a trailing newline)
+        if not re.fullmatch(r"[A-Za-z0-9\-_]+", v):
+            raise ValueError("Invalid invoice number format")
+        return v
+
+    @field_validator("bankgiro")
+    @classmethod
+    def validate_bankgiro(cls, v: str | None) -> str | None:
+        if v is None:
+            return None
+        cleaned = re.sub(r"[^0-9]", "", v)
+        if not (7 <= len(cleaned) <= 8):
+            raise ValueError("Bankgiro must be 7-8 digits")
+        return cleaned
+```
+
+#### File Upload Validation
+```python
+from fastapi import UploadFile, HTTPException
+from pathlib import Path
+
+ALLOWED_EXTENSIONS = {".pdf"}
+MAX_FILE_SIZE = 10 * 1024 * 1024  # 10MB
+
+async def validate_pdf_upload(file: UploadFile) -> bytes:
+    """Validate PDF upload with security checks."""
+    # Extension check
+    ext = Path(file.filename or "").suffix.lower()
+    if ext not in ALLOWED_EXTENSIONS:
+        raise HTTPException(400, f"Only PDF files allowed, got {ext}")
+
+    # Read at most one byte past the limit so oversized uploads are never fully buffered
+    content = await file.read(MAX_FILE_SIZE + 1)
+
+    # Size check
+    if len(content) > MAX_FILE_SIZE:
+        raise HTTPException(400, f"File too large (max {MAX_FILE_SIZE // 1024 // 1024}MB)")
+
+    # Magic bytes check (PDF signature)
+    if not content.startswith(b"%PDF"):
+        raise HTTPException(400, "Invalid PDF file format")
+
+    return content
+```
+
+#### Verification Steps
+- [ ] All user inputs validated with Pydantic
+- [ ] File uploads restricted (size, type, extension, magic bytes)
+- [ ] No direct use of user input in queries
+- [ ] Whitelist validation (not blacklist)
+- [ ] Error messages don't leak sensitive info
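+
+The length check in `validate_bankgiro` accepts any 7-8 digit string. A minimal sketch of a stricter check follows, assuming Bankgiro numbers end in a Luhn (mod 10) check digit; `luhn_valid` is a hypothetical helper name, not an existing project function:
+
+```python
+def luhn_valid(digits: str) -> bool:
+    """Return True if an ASCII digit string passes the Luhn (mod 10) checksum."""
+    if not (digits.isascii() and digits.isdigit()):
+        return False
+    total = 0
+    # Walk from the rightmost digit and double every second one
+    for index, char in enumerate(reversed(digits)):
+        value = int(char)
+        if index % 2 == 1:
+            value *= 2
+            if value > 9:
+                value -= 9
+        total += value
+    return total % 10 == 0
+```
+
+If the assumption holds for your data, the validator could call `luhn_valid(cleaned)` after the length check and raise `ValueError` on failure.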
+
+### 3. SQL Injection Prevention
+
+#### NEVER Concatenate SQL
+```python
+# DANGEROUS - SQL Injection vulnerability
+query = f"SELECT * FROM documents WHERE id = '{user_input}'"
+cur.execute(query)
+```
+
+#### ALWAYS Use Parameterized Queries
+```python
+import psycopg2
+
+# Safe - parameterized query with %s placeholders
+cur.execute(
+    "SELECT * FROM documents WHERE id = %s AND status = %s",
+    (document_id, status)
+)
+
+# Safe - named parameters
+cur.execute(
+    "SELECT * FROM documents WHERE id = %(id)s",
+    {"id": document_id}
+)
+
+# Safe - psycopg2.sql for dynamic identifiers
+from psycopg2 import sql
+
+cur.execute(
+    sql.SQL("SELECT {} FROM {} WHERE id = %s").format(
+        sql.Identifier("invoice_number"),
+        sql.Identifier("documents")
+    ),
+    (document_id,)
+)
+```
+
+#### Verification Steps
+- [ ] All database queries use parameterized queries (%s or %(name)s)
+- [ ] No string concatenation or f-strings in SQL
+- [ ] psycopg2.sql module used for dynamic identifiers
+- [ ] No user input in table/column names
+
+### 4. Path Traversal Prevention
+
+#### NEVER Trust User Paths
+```python
+# DANGEROUS - Path traversal vulnerability
+filename = request.query_params.get("file")
+with open(f"/data/{filename}", "r") as f:  # Attacker: ../../../etc/passwd
+    return f.read()
+```
+
+#### ALWAYS Validate Paths
+```python
+import re
+from pathlib import Path
+
+from fastapi import HTTPException
+
+ALLOWED_DIR = Path("/data/uploads").resolve()
+
+def get_safe_path(filename: str) -> Path:
+    """Get safe file path, preventing path traversal."""
+    # Remove any path components
+    safe_name = Path(filename).name
+
+    # Validate filename characters (fullmatch also rejects a trailing newline)
+    if not re.fullmatch(r"[A-Za-z0-9_\-\.]+", safe_name):
+        raise HTTPException(400, "Invalid filename")
+
+    # Resolve and verify within allowed directory
+    full_path = (ALLOWED_DIR / safe_name).resolve()
+
+    if not full_path.is_relative_to(ALLOWED_DIR):
+        raise HTTPException(400, "Invalid file path")
+
+    return full_path
+```
+
+#### Verification Steps
+- [ ] User-provided filenames sanitized
+- [ ] Paths resolved and validated against allowed directory
+- [ ] No direct concatenation of user input into paths
+- [ ] Whitelist characters in filenames
+
+### 5. Authentication & Authorization
+
+#### API Key Validation
+```python
+import hmac
+
+from fastapi import Depends, HTTPException, Security
+from fastapi.security import APIKeyHeader
+
+api_key_header = APIKeyHeader(name="X-API-Key", auto_error=False)
+
+async def verify_api_key(api_key: str | None = Security(api_key_header)) -> str:
+    if not api_key:
+        raise HTTPException(401, "API key required")
+
+    # Constant-time comparison to prevent timing attacks; compare bytes so
+    # non-ASCII input is rejected instead of raising TypeError
+    if not hmac.compare_digest(api_key.encode(), settings.api_key.encode()):
+        raise HTTPException(403, "Invalid API key")
+
+    return api_key
+
+@router.post("/infer")
+async def infer(
+    file: UploadFile,
+    api_key: str = Depends(verify_api_key)
+):
+    ...
+```
+
+#### Role-Based Access Control
+```python
+from enum import Enum
+
+class UserRole(str, Enum):
+    USER = "user"
+    ADMIN = "admin"
+
+def require_role(required_role: UserRole):
+    async def role_checker(current_user: User = Depends(get_current_user)):
+        if current_user.role != required_role:
+            raise HTTPException(403, "Insufficient permissions")
+        return current_user
+    return role_checker
+
+@router.delete("/documents/{doc_id}")
+async def delete_document(
+    doc_id: str,
+    user: User = Depends(require_role(UserRole.ADMIN))
+):
+    ...
+```
+
+#### Verification Steps
+- [ ] API keys validated with constant-time comparison
+- [ ] Authorization checks before sensitive operations
+- [ ] Role-based access control implemented
+- [ ] Session/token validation on protected routes
+
+### 6. Rate Limiting
+
+#### Rate Limiter Implementation
+```python
+from collections import defaultdict
+from time import time
+
+from fastapi import HTTPException, Request
+from fastapi.responses import JSONResponse
+
+class RateLimiter:
+    """In-memory sliding-window limiter (per process; state is lost on restart)."""
+
+    def __init__(self):
+        self.requests: dict[str, list[float]] = defaultdict(list)
+
+    def check_limit(
+        self,
+        identifier: str,
+        max_requests: int,
+        window_seconds: int
+    ) -> bool:
+        now = time()
+        # Clean old requests
+        self.requests[identifier] = [
+            t for t in self.requests[identifier]
+            if now - t < window_seconds
+        ]
+        # Check limit
+        if len(self.requests[identifier]) >= max_requests:
+            return False
+        self.requests[identifier].append(now)
+        return True
+
+limiter = RateLimiter()
+
+@app.middleware("http")
+async def rate_limit_middleware(request: Request, call_next):
+    client_ip = request.client.host if request.client else "unknown"
+
+    # 100 requests per minute for general endpoints
+    if not limiter.check_limit(client_ip, max_requests=100, window_seconds=60):
+        # Return the response directly: an HTTPException raised inside HTTP
+        # middleware bypasses FastAPI's exception handlers and surfaces as a 500
+        return JSONResponse(
+            status_code=429,
+            content={"success": False, "error": "Rate limit exceeded. Try again later."},
+        )
+
+    return await call_next(request)
+```
+
+#### Stricter Limits for Expensive Operations
+```python
+# Inference endpoint: 10 requests per minute
+async def check_inference_rate_limit(request: Request):
+    client_ip = request.client.host if request.client else "unknown"
+    if not limiter.check_limit(f"infer:{client_ip}", max_requests=10, window_seconds=60):
+        raise HTTPException(429, "Inference rate limit exceeded")
+
+@router.post("/infer")
+async def infer(
+    file: UploadFile,
+    _: None = Depends(check_inference_rate_limit)
+):
+    ...
+```
+
+#### Verification Steps
+- [ ] Rate limiting on all API endpoints
+- [ ] Stricter limits on expensive operations (inference, OCR)
+- [ ] IP-based rate limiting
+- [ ] Clear error messages for rate-limited requests
+
+### 7. Sensitive Data Exposure
+
+#### Logging
+```python
+import logging
+
+logger = logging.getLogger(__name__)
+
+# WRONG: Logging sensitive data
+logger.info(f"Processing invoice: {invoice_data}")  # May contain sensitive info
+logger.error(f"DB error with password: {db_password}")
+
+# CORRECT: Redact sensitive data
+logger.info(f"Processing invoice: id={doc_id}")
+logger.error(f"DB connection failed to {db_host}:{db_port}")
+
+# CORRECT: Structured logging with safe fields only
+logger.info(
+    "Invoice processed",
+    extra={
+        "document_id": doc_id,
+        "field_count": len(fields),
+        "processing_time_ms": elapsed_ms
+    }
+)
+```
+
+#### Error Messages
+```python
+# WRONG: Exposing internal details
+@app.exception_handler(Exception)
+async def error_handler(request: Request, exc: Exception):
+    return JSONResponse(
+        status_code=500,
+        content={
+            "error": str(exc),
+            "traceback": traceback.format_exc()  # NEVER expose!
+        }
+    )
+
+# CORRECT: Generic error messages
+@app.exception_handler(Exception)
+async def error_handler(request: Request, exc: Exception):
+    logger.error(f"Unhandled error: {exc}", exc_info=True)  # Log internally
+    return JSONResponse(
+        status_code=500,
+        content={"success": False, "error": "An error occurred"}
+    )
+```
+
+#### Verification Steps
+- [ ] No passwords, tokens, or secrets in logs
+- [ ] Error messages generic for users
+- [ ] Detailed errors only in server logs
+- [ ] No stack traces exposed to users
+- [ ] Invoice data (amounts, account numbers) not logged
+
+### 8. CORS Configuration
+
+```python
+from fastapi.middleware.cors import CORSMiddleware
+
+# WRONG: Allow all origins
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],  # DANGEROUS in production
+    allow_credentials=True,
+)
+
+# CORRECT: Specific origins
+ALLOWED_ORIGINS = [
+    "http://localhost:8000",
+    "https://your-domain.com",
+]
+
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=ALLOWED_ORIGINS,
+    allow_credentials=True,
+    allow_methods=["GET", "POST"],
+    allow_headers=["*"],
+)
+```
+
+#### Verification Steps
+- [ ] CORS origins explicitly listed
+- [ ] No wildcard origins in production
+- [ ] Credentials only with specific origins
+
+### 9. Temporary File Security
+
+```python
+import tempfile
+from pathlib import Path
+from contextlib import contextmanager
+
+@contextmanager
+def secure_temp_file(suffix: str = ".pdf"):
+    """Create secure temporary file that is always cleaned up."""
+    tmp_path = None
+    try:
+        with tempfile.NamedTemporaryFile(
+            suffix=suffix,
+            delete=False,
+            dir="/tmp/invoice-master"  # Dedicated temp directory (must already exist)
+        ) as tmp:
+            tmp_path = Path(tmp.name)
+        # The handle is closed here; callers reopen the file by path
+        yield tmp_path
+    finally:
+        if tmp_path and tmp_path.exists():
+            tmp_path.unlink()
+
+# Usage
+async def process_upload(file: UploadFile):
+    with secure_temp_file(".pdf") as tmp_path:
+        content = await validate_pdf_upload(file)
+        tmp_path.write_bytes(content)
+        result = pipeline.process(tmp_path)
+    # File automatically cleaned up
+    return result
+```
+
+#### Verification Steps
+- [ ] Temporary files always cleaned up (use context managers)
+- [ ] Temp directory has restricted permissions
+- [ ] No leftover files after processing errors
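+
+`secure_temp_file` expects its dedicated directory to exist already, and the checklist asks for restricted permissions on it. A minimal sketch for a POSIX system follows; `ensure_private_dir` is a hypothetical helper name, not an existing project function:
+
+```python
+import os
+import stat
+from pathlib import Path
+
+
+def ensure_private_dir(path: Path) -> Path:
+    """Create a directory that only its owner can read, write, or enter."""
+    path.mkdir(mode=0o700, parents=True, exist_ok=True)
+    # mkdir's mode is filtered by the umask and ignored for existing
+    # directories, so set the permissions explicitly as well
+    os.chmod(path, stat.S_IRWXU)
+    return path
+```
+
+Calling it once at startup, for example `ensure_private_dir(Path("/tmp/invoice-master"))`, would satisfy the directory requirement before the first upload arrives.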
+
+### 10. Dependency Security
+
+#### Regular Updates
+```bash
+# Check for vulnerabilities
+pip-audit
+
+# Update dependencies
+pip install --upgrade -r requirements.txt
+
+# Check for outdated packages
+pip list --outdated
+```
+
+#### Lock Files
+```bash
+# Create requirements lock file
+pip freeze > requirements.lock
+
+# Install from lock file for reproducible builds
+pip install -r requirements.lock
+```
+
+#### Verification Steps
+- [ ] Dependencies up to date
+- [ ] No known vulnerabilities (pip-audit clean)
+- [ ] Versions pinned in requirements.txt
+- [ ] Regular security updates scheduled
+
+## Security Testing
+
+### Automated Security Tests
+```python
+import pytest
+from fastapi.testclient import TestClient
+
+def test_requires_api_key(client: TestClient):
+    """Test authentication required."""
+    response = client.post("/api/v1/infer")
+    assert response.status_code == 401
+
+def test_invalid_api_key_rejected(client: TestClient):
+    """Test invalid API key rejected."""
+    response = client.post(
+        "/api/v1/infer",
+        headers={"X-API-Key": "invalid-key"}
+    )
+    assert response.status_code == 403
+
+def test_sql_injection_prevented(client: TestClient):
+    """Test SQL injection attempt rejected."""
+    response = client.get(
+        "/api/v1/documents",
+        params={"id": "'; DROP TABLE documents; --"}
+    )
+    # Should return validation error, not execute SQL
+    assert response.status_code in (400, 422)
+
+def test_path_traversal_prevented(client: TestClient):
+    """Test path traversal attempt rejected."""
+    response = client.get("/api/v1/results/../../etc/passwd")
+    assert response.status_code == 400
+
+def test_rate_limit_enforced(client: TestClient):
+    """Test rate limiting works."""
+    responses = [
+        client.post("/api/v1/infer", files={"file": b"test"})
+        for _ in range(15)
+    ]
+    rate_limited = [r for r in responses if r.status_code == 429]
+    assert len(rate_limited) > 0
+
+def test_large_file_rejected(client: TestClient):
+    """Test file size limit enforced."""
+    large_content = b"x" * (11 * 1024 * 1024)  # 11MB
+    response = client.post(
+        "/api/v1/infer",
+        files={"file": ("test.pdf", large_content)}
+    )
+    assert response.status_code == 400
+```
+
+## Pre-Deployment Security Checklist
+
+Before ANY production deployment:
+
+- [ ] **Secrets**: No hardcoded secrets, all in env vars
+- [ ] **Input Validation**: All user inputs validated with Pydantic
+- [ ] **SQL Injection**: All queries use parameterized queries
+- [ ] **Path Traversal**: File paths validated and sanitized
+- [ ] **Authentication**: API key or token validation
+- [ ] **Authorization**: Role checks in place
+- [ ] **Rate Limiting**: Enabled on all endpoints
+- [ ] **HTTPS**: Enforced in production
+- [ ] **CORS**: Properly configured (no wildcards)
+- [ ] **Error Handling**: No sensitive data in errors
+- [ ] **Logging**: No sensitive data logged
+- [ ] **File Uploads**: Validated (size, type, magic bytes)
+- [ ] **Temp Files**: Always cleaned up
+- [ ] **Dependencies**: Up to date, no vulnerabilities
+
+## Resources
+
+- [OWASP Top 10](https://owasp.org/www-project-top-ten/)
+- [FastAPI Security](https://fastapi.tiangolo.com/tutorial/security/)
+- [Bandit (Python Security Linter)](https://bandit.readthedocs.io/)
+- [pip-audit](https://pypi.org/project/pip-audit/)
+
+---
+
+**Remember**: Security is not optional. One vulnerability can compromise sensitive invoice data. When in doubt, err on the side of caution.
diff --git a/.claude/skills/strategic-compact/SKILL.md b/.claude/skills/strategic-compact/SKILL.md
new file mode 100644
index 0000000..394a86b
--- /dev/null
+++ b/.claude/skills/strategic-compact/SKILL.md
@@ -0,0 +1,63 @@
+---
+name: strategic-compact
+description: Suggests manual context compaction at logical intervals to preserve context through task phases rather than arbitrary auto-compaction.
+---
+
+# Strategic Compact Skill
+
+Suggests manual `/compact` at strategic points in your workflow rather than relying on arbitrary auto-compaction.
+
+## Why Strategic Compaction?
+
+Auto-compaction triggers at arbitrary points:
+- Often mid-task, losing important context
+- No awareness of logical task boundaries
+- Can interrupt complex multi-step operations
+
+Strategic compaction at logical boundaries:
+- **After exploration, before execution** - Compact research context, keep implementation plan
+- **After completing a milestone** - Fresh start for next phase
+- **Before major context shifts** - Clear exploration context before different task
+
+## How It Works
+
+The `suggest-compact.sh` script runs on PreToolUse (Edit/Write) and:
+
+1. **Tracks tool calls** - Counts Edit/Write tool invocations in the session
+2. **Threshold detection** - Suggests `/compact` once a configurable threshold is reached (default: 50 calls)
+3. **Periodic reminders** - Repeats the reminder at every later count that is a multiple of 25
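+
+The decision logic of the script can be mirrored in a few lines of Python, which makes the threshold and reminder behavior easier to test. This is only a sketch: `compact_suggestion` is a hypothetical name, and the hook itself is the bash script.
+
+```python
+from typing import Optional
+
+
+def compact_suggestion(count: int, threshold: int = 50, interval: int = 25) -> Optional[str]:
+    """Return the message the hook would print for this call count, or None."""
+    if count == threshold:
+        return f"{threshold} tool calls reached - consider /compact if transitioning phases"
+    # Reminders fire on multiples of the interval, not on threshold + interval.
+    if count > threshold and count % interval == 0:
+        return f"{count} tool calls - good checkpoint for /compact if context is stale"
+    return None
+```
+
+With a non-default threshold such as 60, the first reminder therefore arrives at call 75 rather than at call 85.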
+
+## Hook Setup
+
+Add to your `~/.claude/settings.json`:
+
+```json
+{
+  "hooks": {
+    "PreToolUse": [{
+      "matcher": "Edit|Write",
+      "hooks": [{
+        "type": "command",
+        "command": "~/.claude/skills/strategic-compact/suggest-compact.sh"
+      }]
+    }]
+  }
+}
+```
+
+## Configuration
+
+Environment variables:
+- `COMPACT_THRESHOLD` - Tool calls before first suggestion (default: 50)
+
+## Best Practices
+
+1. **Compact after planning** - Once plan is finalized, compact to start fresh
+2. **Compact after debugging** - Clear error-resolution context before continuing
+3. **Don't compact mid-implementation** - Preserve context for related changes
+4. **Read the suggestion** - The hook tells you *when*, you decide *if*
+
+## Related
+
+- [The Longform Guide](https://x.com/affaanmustafa/status/2014040193557471352) - Token optimization section
+- Memory persistence hooks - For state that survives compaction
diff --git a/.claude/skills/strategic-compact/suggest-compact.sh b/.claude/skills/strategic-compact/suggest-compact.sh
new file mode 100644
index 0000000..ea14920
--- /dev/null
+++ b/.claude/skills/strategic-compact/suggest-compact.sh
@@ -0,0 +1,52 @@
+#!/bin/bash
+# Strategic Compact Suggester
+# Runs on PreToolUse or periodically to suggest manual compaction at logical intervals
+#
+# Why manual over auto-compact:
+# - Auto-compact happens at arbitrary points, often mid-task
+# - Strategic compacting preserves context through logical phases
+# - Compact after exploration, before execution
+# - Compact after completing a milestone, before starting next
+#
+# Hook config (in ~/.claude/settings.json):
+# {
+#   "hooks": {
+#     "PreToolUse": [{
+#       "matcher": "Edit|Write",
+#       "hooks": [{
+#         "type": "command",
+#         "command": "~/.claude/skills/strategic-compact/suggest-compact.sh"
+#       }]
+#     }]
+#   }
+# }
+#
+# Criteria for suggesting compact:
+# - Session has been running for extended period
+# - Large number of tool calls made
+# - Transitioning from research/exploration to implementation
+# - Plan has been finalized
+
+# Track tool call count in a temp file keyed by the parent PID ($$ is this script's own PID and changes on every hook run)
+COUNTER_FILE="/tmp/claude-tool-count-${PPID}"
+THRESHOLD=${COMPACT_THRESHOLD:-50}
+
+# Initialize or increment counter
+if [ -f "$COUNTER_FILE" ]; then
+  count=$(cat "$COUNTER_FILE")
+  count=$((count + 1))
+  echo "$count" > "$COUNTER_FILE"
+else
+  echo "1" > "$COUNTER_FILE"
+  count=1
+fi
+
+# Suggest compact after threshold tool calls
+if [ "$count" -eq "$THRESHOLD" ]; then
+  echo "[StrategicCompact] $THRESHOLD tool calls reached - consider /compact if transitioning phases" >&2
+fi
+
+# Suggest at regular intervals after threshold
+if [ "$count" -gt "$THRESHOLD" ] && [ $((count % 25)) -eq 0 ]; then
+  echo "[StrategicCompact] $count tool calls - good checkpoint for /compact if context is stale" >&2
+fi
diff --git a/.claude/skills/tdd-workflow/SKILL.md b/.claude/skills/tdd-workflow/SKILL.md
new file mode 100644
index 0000000..c3ef042
--- /dev/null
+++ b/.claude/skills/tdd-workflow/SKILL.md
@@ -0,0 +1,553 @@
+---
+name: tdd-workflow
+description: Use this skill when writing new features, fixing bugs, or refactoring code. Enforces test-driven development with 80%+ coverage including unit, integration, and E2E tests.
+---
+
+# Test-Driven Development Workflow
+
+TDD principles for Python/FastAPI development with pytest.
+
+## When to Activate
+
+- Writing new features or functionality
+- Fixing bugs or issues
+- Refactoring existing code
+- Adding API endpoints
+- Creating new field extractors or normalizers
+
+## Core Principles
+
+### 1. Tests BEFORE Code
+ALWAYS write tests first, then implement code to make tests pass.
+
+### 2. Coverage Requirements
+- Minimum 80% coverage (unit + integration + E2E)
+- All edge cases covered
+- Error scenarios tested
+- Boundary conditions verified
+
+### 3. Test Types
+
+#### Unit Tests
+- Individual functions and utilities
+- Normalizers and validators
+- Parsers and extractors
+- Pure functions
+
+#### Integration Tests
+- API endpoints
+- Database operations
+- OCR + YOLO pipeline
+- Service interactions
+
+#### E2E Tests
+- Complete inference pipeline
+- PDF → Fields workflow
+- API health and inference endpoints
+
+## TDD Workflow Steps
+
+### Step 1: Write User Journeys
+```
+As a [role], I want to [action], so that [benefit]
+
+Example:
+As an invoice processor, I want to extract Bankgiro from payment_line,
+so that I can cross-validate OCR results.
+```
+
+### Step 2: Generate Test Cases
+For each user journey, create comprehensive test cases:
+
+```python
+import pytest
+
+class TestPaymentLineParser:
+    """Tests for payment_line parsing and field extraction."""
+
+    def test_parse_payment_line_extracts_bankgiro(self):
+        """Should extract Bankgiro from valid payment line."""
+        # Test implementation
+        pass
+
+    def test_parse_payment_line_handles_missing_checksum(self):
+        """Should handle payment lines without checksum."""
+        pass
+
+    def test_parse_payment_line_validates_checksum(self):
+        """Should validate checksum when present."""
+        pass
+
+    def test_parse_payment_line_returns_none_for_invalid(self):
+        """Should return None for invalid payment lines."""
+        pass
+```
+
+### Step 3: Run Tests (They Should Fail)
+```bash
+pytest tests/test_ocr/test_machine_code_parser.py -v
+# Tests should fail - we haven't implemented yet
+```
+
+### Step 4: Implement Code
+Write minimal code to make tests pass:
+
+```python
+def parse_payment_line(line: str) -> PaymentLineData | None:
+    """Parse Swedish payment line and extract fields."""
+    # Implementation guided by tests
+    pass
+```
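+
+For illustration, here is one way the stub could be filled in. It is a sketch under stated assumptions: the pattern is inferred from example lines such as `# 31130954410 # 315 00 2 > 8983025#14#`, the function name `parse_payment_line_sketch` is hypothetical, and it returns a plain dict rather than the `PaymentLineData` type, whose definition is not shown here.
+
+```python
+import re
+from typing import Optional
+
+# Assumed layout: "# <ocr> # <kronor> <ore> <check> > <account>#<code>#"
+_PAYMENT_LINE = re.compile(
+    r"#\s*(?P<ocr>\d{5,25})\s*#\s*"
+    r"(?P<kronor>\d+)\s+(?P<ore>\d{2})\s+\d\s*>\s*"
+    r"(?P<account>\d[\d\s]*?)\s*#\s*\d+\s*#"
+)
+
+
+def parse_payment_line_sketch(line: str) -> Optional[dict]:
+    """Extract OCR number, amount, and Bankgiro from a payment line, or return None."""
+    match = _PAYMENT_LINE.search(line)
+    if match is None:
+        return None
+    # OCR engines sometimes insert spaces inside the account number.
+    account = re.sub(r"\s+", "", match["account"])
+    if len(account) not in (7, 8):
+        return None
+    # Whole-krona amounts drop the "00" öre part; otherwise use a decimal comma.
+    amount = match["kronor"] if match["ore"] == "00" else f'{match["kronor"]},{match["ore"]}'
+    return {
+        "ocr": match["ocr"],
+        "amount": amount,
+        "bankgiro": f"{account[:-4]}-{account[-4:]}",
+    }
+```
+
+The sketch does not validate checksums or distinguish Plusgiro from Bankgiro; those behaviors would each get their own failing test first.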
+
+### Step 5: Run Tests Again
+```bash
+pytest tests/test_ocr/test_machine_code_parser.py -v
+# Tests should now pass
+```
+
+### Step 6: Refactor
+Improve code quality while keeping tests green:
+- Remove duplication
+- Improve naming
+- Optimize performance
+- Enhance readability
+
+### Step 7: Verify Coverage
+```bash
+pytest --cov=src --cov-report=term-missing
+# Verify 80%+ coverage achieved
+```
+
+## Testing Patterns
+
+### Unit Test Pattern (pytest)
+```python
+import pytest
+from src.normalize.bankgiro_normalizer import normalize_bankgiro
+
+class TestBankgiroNormalizer:
+    """Tests for Bankgiro normalization."""
+
+    def test_normalize_removes_hyphens(self):
+        """Should remove hyphens from Bankgiro."""
+        result = normalize_bankgiro("782-1713")
+        assert result == "7821713"
+
+    def test_normalize_removes_spaces(self):
+        """Should remove spaces from Bankgiro."""
+        result = normalize_bankgiro("782 1713")
+        assert result == "7821713"
+
+    def test_normalize_validates_length(self):
+        """Should validate Bankgiro is 7-8 digits."""
+        result = normalize_bankgiro("123456")  # 6 digits
+        assert result is None
+
+    def test_normalize_validates_checksum(self):
+        """Should validate Luhn checksum."""
+        result = normalize_bankgiro("7821714")  # Invalid checksum
+        assert result is None
+
+    @pytest.mark.parametrize("input_value,expected", [
+        ("782-1713", "7821713"),
+        ("7821713", "7821713"),
+        ("782 1713", "7821713"),
+        ("BG 782-1713", "7821713"),
+    ])
+    def test_normalize_various_formats(self, input_value, expected):
+        """Should handle various input formats."""
+        result = normalize_bankgiro(input_value)
+        assert result == expected
+```
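+
+A normalizer that would satisfy tests of this shape might look like the following. It is a sketch, not the contents of `src/normalize/bankgiro_normalizer.py`: the names `luhn_valid` and `normalize_bankgiro_sketch` are hypothetical, and it assumes normalization means "strip non-digits, check length, check the Luhn (mod 10) digit".
+
+```python
+import re
+from typing import Optional
+
+
+def luhn_valid(digits: str) -> bool:
+    """Luhn check: double every second digit from the right and sum the digits."""
+    total = 0
+    for index, char in enumerate(reversed(digits)):
+        value = int(char)
+        if index % 2 == 1:
+            value *= 2
+            if value > 9:
+                value -= 9
+        total += value
+    return total % 10 == 0
+
+
+def normalize_bankgiro_sketch(value: str) -> Optional[str]:
+    """Return the bare 7-8 digit Bankgiro number, or None if it is invalid."""
+    digits = re.sub(r"\D", "", value)  # drops hyphens, spaces, and a "BG" prefix
+    if len(digits) not in (7, 8):
+        return None
+    return digits if luhn_valid(digits) else None
+```
+
+Because the checksum is enforced, test inputs for the happy path need to be Luhn-valid numbers such as `782-1713`; an arbitrary digit string will normalize to `None`.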
+
+### API Integration Test Pattern
+```python
+import pytest
+from fastapi.testclient import TestClient
+from src.web.app import app
+
+@pytest.fixture
+def client():
+    return TestClient(app)
+
+class TestHealthEndpoint:
+    """Tests for /api/v1/health endpoint."""
+
+    def test_health_returns_200(self, client):
+        """Should return 200 OK."""
+        response = client.get("/api/v1/health")
+        assert response.status_code == 200
+
+    def test_health_returns_status(self, client):
+        """Should return health status."""
+        response = client.get("/api/v1/health")
+        data = response.json()
+        assert data["status"] == "healthy"
+        assert "model_loaded" in data
+
+class TestInferEndpoint:
+    """Tests for /api/v1/infer endpoint."""
+
+    def test_infer_requires_file(self, client):
+        """Should require file upload."""
+        response = client.post("/api/v1/infer")
+        assert response.status_code == 422
+
+    def test_infer_rejects_non_pdf(self, client):
+        """Should reject non-PDF files."""
+        response = client.post(
+            "/api/v1/infer",
+            files={"file": ("test.txt", b"not a pdf", "text/plain")}
+        )
+        assert response.status_code == 400
+
+    def test_infer_returns_fields(self, client, sample_invoice_pdf):
+        """Should return extracted fields."""
+        with open(sample_invoice_pdf, "rb") as f:
+            response = client.post(
+                "/api/v1/infer",
+                files={"file": ("invoice.pdf", f, "application/pdf")}
+            )
+        assert response.status_code == 200
+        data = response.json()
+        assert data["success"] is True
+        assert "fields" in data
+```
+
+### E2E Test Pattern
+```python
+import pytest
+import httpx
+from pathlib import Path
+
+@pytest.fixture(scope="module")
+def running_server():
+    """Ensure server is running for E2E tests."""
+    # Server should be started before running E2E tests
+    base_url = "http://localhost:8000"
+    yield base_url
+
+class TestInferencePipeline:
+    """E2E tests for complete inference pipeline."""
+
+    def test_health_check(self, running_server):
+        """Should pass health check."""
+        response = httpx.get(f"{running_server}/api/v1/health")
+        assert response.status_code == 200
+        data = response.json()
+        assert data["status"] == "healthy"
+        assert data["model_loaded"] is True
+
+    def test_pdf_inference_returns_fields(self, running_server):
+        """Should extract fields from PDF."""
+        pdf_path = Path("tests/fixtures/sample_invoice.pdf")
+        with open(pdf_path, "rb") as f:
+            response = httpx.post(
+                f"{running_server}/api/v1/infer",
+                files={"file": ("invoice.pdf", f, "application/pdf")}
+            )
+
+        assert response.status_code == 200
+        data = response.json()
+        assert data["success"] is True
+        assert "fields" in data
+        assert len(data["fields"]) > 0
+
+    def test_cross_validation_included(self, running_server):
+        """Should include cross-validation for invoices with payment_line."""
+        pdf_path = Path("tests/fixtures/invoice_with_payment_line.pdf")
+        with open(pdf_path, "rb") as f:
+            response = httpx.post(
+                f"{running_server}/api/v1/infer",
+                files={"file": ("invoice.pdf", f, "application/pdf")}
+            )
+
+        data = response.json()
+        if data["fields"].get("payment_line"):
+            assert "cross_validation" in data
+```
+
+## Test File Organization
+
+```
+tests/
+├── conftest.py                    # Shared fixtures
+├── fixtures/                      # Test data files
+│   ├── sample_invoice.pdf
+│   └── invoice_with_payment_line.pdf
+├── test_cli/
+│   └── test_infer.py
+├── test_pdf/
+│   ├── test_extractor.py
+│   └── test_renderer.py
+├── test_ocr/
+│   ├── test_paddle_ocr.py
+│   └── test_machine_code_parser.py
+├── test_inference/
+│   ├── test_pipeline.py
+│   ├── test_yolo_detector.py
+│   └── test_field_extractor.py
+├── test_normalize/
+│   ├── test_bankgiro_normalizer.py
+│   ├── test_date_normalizer.py
+│   └── test_amount_normalizer.py
+├── test_web/
+│   ├── test_routes.py
+│   └── test_services.py
+└── e2e/
+    └── test_inference_e2e.py
+```
+
+## Mocking External Services
+
+### Mock PaddleOCR
+```python
+import pytest
+from unittest.mock import Mock, patch
+
+@pytest.fixture
+def mock_paddle_ocr():
+    """Mock PaddleOCR for unit tests."""
+    with patch("src.ocr.paddle_ocr.PaddleOCR") as mock:
+        instance = Mock()
+        instance.ocr.return_value = [
+            [
+                [[[0, 0], [100, 0], [100, 20], [0, 20]], ("Invoice Number", 0.95)],
+                [[[0, 30], [100, 30], [100, 50], [0, 50]], ("INV-2024-001", 0.98)]
+            ]
+        ]
+        mock.return_value = instance
+        yield instance
+```
+
+### Mock YOLO Model
+```python
+import pytest
+from unittest.mock import Mock, patch
+
+@pytest.fixture
+def mock_yolo_model():
+    """Mock YOLO model for unit tests."""
+    with patch("src.inference.yolo_detector.YOLO") as mock:
+        instance = Mock()
+        # Calling an Ultralytics model returns a list of results, one per image
+        instance.return_value = [Mock(
+            boxes=Mock(
+                xyxy=[[10, 20, 100, 50]],
+                conf=[0.95],
+                cls=[0]  # invoice_number class
+            )
+        )]
+        mock.return_value = instance
+        yield instance
+```
+
+### Mock Database
+```python
+import pytest
+from unittest.mock import Mock, patch
+
+@pytest.fixture
+def mock_db_connection():
+    """Mock database connection for unit tests."""
+    with patch("src.data.db.get_db_connection") as mock:
+        conn = Mock()
+        cursor = Mock()
+        cursor.fetchall.return_value = [
+            ("doc-123", "processed", {"invoice_number": "INV-001"})
+        ]
+        cursor.fetchone.return_value = ("doc-123",)
+        conn.cursor.return_value.__enter__ = Mock(return_value=cursor)
+        conn.cursor.return_value.__exit__ = Mock(return_value=False)
+        mock.return_value.__enter__ = Mock(return_value=conn)
+        mock.return_value.__exit__ = Mock(return_value=False)
+        yield conn
+```
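+
+The context-manager wiring above can be shortened with `MagicMock`, which preconfigures `__enter__` and `__exit__`. The sketch below is self-contained and uses hypothetical names (`fetch_document_ids`, `build_mock_connection`); it assumes the code under test receives its connection factory as an argument instead of importing `get_db_connection` directly.
+
+```python
+from unittest.mock import MagicMock
+
+
+def fetch_document_ids(get_connection) -> list:
+    """Hypothetical data-access helper using nested connection/cursor context managers."""
+    with get_connection() as conn:
+        with conn.cursor() as cur:
+            cur.execute("SELECT id FROM documents WHERE status = %s", ("processed",))
+            return [row[0] for row in cur.fetchall()]
+
+
+def build_mock_connection(rows):
+    """Wire only the return values; MagicMock supplies the context-manager methods."""
+    factory = MagicMock()
+    conn = factory.return_value.__enter__.return_value
+    cursor = conn.cursor.return_value.__enter__.return_value
+    cursor.fetchall.return_value = rows
+    return factory, cursor
+
+
+factory, cursor = build_mock_connection([("doc-123",), ("doc-456",)])
+ids = fetch_document_ids(factory)
+```
+
+Asserting on `cursor.execute` afterwards also confirms that the query was parameterized rather than built with string formatting.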
+
+## Test Coverage Verification
+
+### Run Coverage Report
+```bash
+# Run with coverage
+pytest --cov=src --cov-report=term-missing
+
+# Generate HTML report
+pytest --cov=src --cov-report=html
+# Open htmlcov/index.html in browser
+```
+
+### Coverage Configuration (pyproject.toml)
+```toml
+[tool.coverage.run]
+source = ["src"]
+omit = ["*/__init__.py", "*/test_*.py"]
+
+[tool.coverage.report]
+fail_under = 80
+show_missing = true
+exclude_lines = [
+    "pragma: no cover",
+    "if TYPE_CHECKING:",
+    "raise NotImplementedError",
+]
+```
+
+## Common Testing Mistakes to Avoid
+
+### WRONG: Testing Implementation Details
+```python
+# Don't test internal state
+def test_parser_internal_state():
+    parser = PaymentLineParser()
+    parser._parse("...")
+    assert parser._groups == [...]  # Internal state
+```
+
+### CORRECT: Test Public Interface
+```python
+# Test what users see
+def test_parser_extracts_bankgiro():
+    result = parse_payment_line("...")
+    assert result.bankgiro == "1234567"
+```
+
+### WRONG: No Test Isolation
+```python
+# Tests depend on each other
+class TestDocuments:
+    def test_creates_document(self):
+        create_document(...)  # Creates in DB
+
+    def test_updates_document(self):
+        update_document(...)  # Depends on previous test
+```
+
+### CORRECT: Independent Tests
+```python
+# Each test sets up its own data
+class TestDocuments:
+    def test_creates_document(self, mock_db):
+        result = create_document(...)
+        assert result.id is not None
+
+    def test_updates_document(self, mock_db):
+        # Create own test data
+        doc = create_document(...)
+        result = update_document(doc.id, ...)
+        assert result.status == "updated"
+```
+
+### WRONG: Testing Too Much
+```python
+# One test doing everything
+def test_full_invoice_processing():
+    # Load PDF
+    # Extract images
+    # Run YOLO
+    # Run OCR
+    # Normalize fields
+    # Save to DB
+    # Return response
+```
+
+### CORRECT: Focused Tests
+```python
+def test_yolo_detects_invoice_number():
+    """Test only YOLO detection."""
+    result = detector.detect(image)
+    assert any(d.label == "invoice_number" for d in result)
+
+def test_ocr_extracts_text():
+    """Test only OCR extraction."""
+    result = ocr.extract(image, bbox)
+    assert result == "INV-2024-001"
+
+def test_normalizer_formats_date():
+    """Test only date normalization."""
+    result = normalize_date("2024-01-15")
+    assert result == "2024-01-15"
+```
+
+## Fixtures (conftest.py)
+
+```python
+import pytest
+from pathlib import Path
+from fastapi.testclient import TestClient
+
+@pytest.fixture
+def sample_invoice_pdf(tmp_path: Path) -> Path:
+    """Create sample invoice PDF for testing."""
+    pdf_path = tmp_path / "invoice.pdf"
+    # Copy from fixtures or create minimal PDF
+    src = Path("tests/fixtures/sample_invoice.pdf")
+    if src.exists():
+        pdf_path.write_bytes(src.read_bytes())
+    return pdf_path
+
+@pytest.fixture
+def client():
+    """FastAPI test client."""
+    from src.web.app import app
+    return TestClient(app)
+
+@pytest.fixture
+def sample_payment_line() -> str:
+    """Sample Swedish payment line for testing."""
+    return "# 31130954410 # 315 00 2 > 8983025#14#"
+```
+
+## Continuous Testing
+
+### Watch Mode During Development
+```bash
+# Using pytest-watch
+ptw -- tests/test_ocr/
+# Tests run automatically on file changes
+```
+
+### Pre-Commit Hook
+```yaml
+# .pre-commit-config.yaml
+repos:
+  - repo: local
+    hooks:
+      - id: pytest
+        name: pytest
+        entry: pytest --tb=short -q
+        language: system
+        pass_filenames: false
+        always_run: true
+```
+
+### CI/CD Integration (GitHub Actions)
+```yaml
+- name: Run Tests
+  run: |
+    pytest --cov=src --cov-report=xml
+
+- name: Upload Coverage
+  uses: codecov/codecov-action@v3
+  with:
+    file: coverage.xml
+```
+
+## Best Practices
+
+1. **Write Tests First** - Always TDD
+2. **One Assert Per Test** - Focus on single behavior
+3. **Descriptive Test Names** - `test_<what>_<condition>_<expected>`
+4. **Arrange-Act-Assert** - Clear test structure
+5. **Mock External Dependencies** - Isolate unit tests
+6. **Test Edge Cases** - None, empty, invalid, boundary
+7. **Test Error Paths** - Not just happy paths
+8. **Keep Tests Fast** - Unit tests < 50ms each
+9. **Clean Up After Tests** - Use fixtures with cleanup
+10. **Review Coverage Reports** - Identify gaps
+
+## Success Metrics
+
+- 80%+ code coverage achieved
+- All tests passing (green)
+- No skipped or disabled tests
+- Fast test execution (< 60s for unit tests)
+- E2E tests cover critical inference flow
+- Tests catch bugs before production
+
+---
+
+**Remember**: Tests are not optional. They are the safety net that enables confident refactoring, rapid development, and production reliability.
diff --git a/.claude/skills/verification-loop/SKILL.md b/.claude/skills/verification-loop/SKILL.md
new file mode 100644
index 0000000..0c2f000
--- /dev/null
+++ b/.claude/skills/verification-loop/SKILL.md
@@ -0,0 +1,242 @@
+# Verification Loop Skill
+
+Comprehensive verification system for Python/FastAPI development.
+
+## When to Use
+
+Invoke this skill:
+- After completing a feature or significant code change
+- Before creating a PR
+- When you want to ensure quality gates pass
+- After refactoring
+- Before deployment
+
+## Verification Phases
+
+### Phase 1: Type Check
+```bash
+# Run mypy type checker
+mypy src/ --ignore-missing-imports 2>&1 | head -30
+```
+
+Report all type errors. Fix critical ones before continuing.
+
+### Phase 2: Lint Check
+```bash
+# Run ruff linter
+ruff check src/ 2>&1 | head -30
+
+# Auto-fix if desired
+ruff check src/ --fix
+```
+
+Check for:
+- Unused imports
+- Code style violations
+- Common Python anti-patterns
+
+### Phase 3: Test Suite
+```bash
+# Run tests with coverage
+pytest --cov=src --cov-report=term-missing -q 2>&1 | tail -50
+
+# Run specific test file
+pytest tests/test_ocr/test_machine_code_parser.py -v
+
+# Run with short traceback
+pytest -x --tb=short
+```
+
+Report:
+- Total tests: X
+- Passed: X
+- Failed: X
+- Coverage: X%
+- Target: 80% minimum
+
+### Phase 4: Security Scan
+```bash
+# Check for hardcoded secrets
+grep -rn "password\s*=" --include="*.py" src/ 2>/dev/null | grep -v "db_password:" | head -10
+grep -rn "api_key\s*=" --include="*.py" src/ 2>/dev/null | head -10
+grep -rn "sk-" --include="*.py" src/ 2>/dev/null | head -10
+
+# Check for print statements (should use logging)
+grep -rn "print(" --include="*.py" src/ 2>/dev/null | head -10
+
+# Check for bare except
+grep -rn "except:" --include="*.py" src/ 2>/dev/null | head -10
+
+# Check for SQL injection risks (f-strings in execute)
+grep -rn 'execute(f"' --include="*.py" src/ 2>/dev/null | head -10
+grep -rn "execute(f'" --include="*.py" src/ 2>/dev/null | head -10
+```
+
+### Phase 5: Import Check
+```bash
+# Verify all imports work
+python -c "from src.web.app import app; print('Web app OK')"
+python -c "from src.inference.pipeline import InferencePipeline; print('Pipeline OK')"
+python -c "from src.ocr.machine_code_parser import parse_payment_line; print('Parser OK')"
+```
+
+### Phase 6: Diff Review
+```bash
+# Show what changed
+git diff --stat
+git diff HEAD --name-only
+
+# Show staged changes
+git diff --staged --stat
+```
+
+Review each changed file for:
+- Unintended changes
+- Missing error handling
+- Potential edge cases
+- Missing type hints
+- Mutable default arguments
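+
+The last item is easy to miss in a diff, so a short illustration may help. The function names below are hypothetical and exist only to show the pitfall and the usual fix.
+
+```python
+from typing import Optional
+
+
+def collect_fields_buggy(name: str, fields: list = []) -> list:
+    # The default list is created once, at definition time, and shared across calls.
+    fields.append(name)
+    return fields
+
+
+def collect_fields(name: str, fields: Optional[list] = None) -> list:
+    # A None sentinel gives every call its own fresh list.
+    if fields is None:
+        fields = []
+    fields.append(name)
+    return fields
+
+
+first_buggy = collect_fields_buggy("amount")
+second_buggy = collect_fields_buggy("ocr")  # same list object as first_buggy
+first_ok = collect_fields("amount")
+second_ok = collect_fields("ocr")
+```
+
+In the buggy version the second call returns a list that still contains the value from the first call, which is the kind of cross-request leakage that is hard to trace in a long-running web service.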
+
+### Phase 7: API Smoke Test (if server running)
+```bash
+# Health check
+curl -s http://localhost:8000/api/v1/health | python -m json.tool
+
+# Verify response format
+curl -s http://localhost:8000/api/v1/health | grep -q "healthy" && echo "Health: OK" || echo "Health: FAIL"
+```
+
+## Output Format
+
+After running all phases, produce a verification report:
+
+```
+VERIFICATION REPORT
+==================
+
+Types:     [PASS/FAIL] (X errors)
+Lint:      [PASS/FAIL] (X warnings)
+Tests:     [PASS/FAIL] (X/Y passed, Z% coverage)
+Security:  [PASS/FAIL] (X issues)
+Imports:   [PASS/FAIL]
+Diff:      [X files changed]
+
+Overall:   [READY/NOT READY] for PR
+
+Issues to Fix:
+1. ...
+2. ...
+```
+
+## Quick Commands
+
+```bash
+# Full verification (WSL)
+wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && mypy src/ --ignore-missing-imports && ruff check src/ && pytest -x --tb=short"
+
+# Type check only
+wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && mypy src/ --ignore-missing-imports"
+
+# Tests only
+wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && pytest --cov=src -q"
+```
+
+## Verification Checklist
+
+### Before Commit
+- [ ] mypy passes (no type errors)
+- [ ] ruff check passes (no lint errors)
+- [ ] All tests pass
+- [ ] No print() statements in production code
+- [ ] No hardcoded secrets
+- [ ] No bare `except:` clauses
+- [ ] No SQL injection risks (f-strings in queries)
+- [ ] Coverage >= 80% for changed code
+
+### Before PR
+- [ ] All above checks pass
+- [ ] git diff reviewed for unintended changes
+- [ ] New code has tests
+- [ ] Type hints on all public functions
+- [ ] Docstrings on public APIs
+- [ ] No TODO/FIXME for critical items
+
+### Before Deployment
+- [ ] All above checks pass
+- [ ] E2E tests pass
+- [ ] Health check returns healthy
+- [ ] Model loaded successfully
+- [ ] No server errors in logs
+
+## Common Issues and Fixes
+
+### Type Error: Missing return type
+```python
+# Before
+def process(data):
+    return result
+
+# After
+def process(data: dict) -> InferenceResult:
+    return result
+```
+
+### Lint Error: Unused import
+```python
+# Remove unused imports or add to __all__
+```
+
+### Security: print() in production
+```python
+# Before
+print(f"Processing {doc_id}")
+
+# After
+logger.info(f"Processing {doc_id}")
+```
+
+### Security: Bare except
+```python
+# Before
+except:
+    pass
+
+# After
+except Exception as e:
+    logger.error(f"Error: {e}")
+    raise
+```
+
+### Security: SQL injection
+```python
+# Before (DANGEROUS)
+cur.execute(f"SELECT * FROM docs WHERE id = '{user_input}'")
+
+# After (SAFE)
+cur.execute("SELECT * FROM docs WHERE id = %s", (user_input,))
+```
+
+## Continuous Mode
+
+For long sessions, run verification after major changes:
+
+```markdown
+Checkpoints:
+- After completing each function
+- After finishing a module
+- Before moving to next task
+- Every 15-20 minutes of coding
+
+Run: /verify
+```
+
+## Integration with Other Skills
+
+| Skill | Purpose |
+|-------|---------|
+| code-review | Detailed code analysis |
+| security-review | Deep security audit |
+| tdd-workflow | Test coverage |
+| build-fix | Fix errors incrementally |
+
+This skill provides quick, comprehensive verification. Use specialized skills for deeper analysis.
diff --git a/src/ocr/test_machine_code_parser.py b/src/ocr/test_machine_code_parser.py
deleted file mode 100644
index ca9a9d3..0000000
--- a/src/ocr/test_machine_code_parser.py
+++ /dev/null
@@ -1,251 +0,0 @@
-"""
-Tests for Machine Code Parser
-
-Tests the parsing of Swedish invoice payment lines including:
-- Standard payment line format
-- Account number normalization (spaces removal)
-- Bankgiro/Plusgiro detection
-- OCR and Amount extraction
-"""
-
-import pytest
-from src.ocr.machine_code_parser import MachineCodeParser, MachineCodeResult
-
-
-class TestParseStandardPaymentLine:
-    """Tests for _parse_standard_payment_line method."""
-
-    @pytest.fixture
-    def parser(self):
-        return MachineCodeParser()
-
-    def test_standard_format_bankgiro(self, parser):
-        """Test standard payment line with Bankgiro."""
-        line = "# 31130954410 # 315 00 2 > 8983025#14#"
-        result = parser._parse_standard_payment_line(line)
-
-        assert result is not None
-        assert result['ocr'] == '31130954410'
-        assert result['amount'] == '315'
-        assert result['bankgiro'] == '898-3025'
-
-    def test_standard_format_with_ore(self, parser):
-        """Test payment line with non-zero öre."""
-        line = "# 12345678901 # 100 50 2 > 7821713#41#"
-        result = parser._parse_standard_payment_line(line)
-
-        assert result is not None
-        assert result['ocr'] == '12345678901'
-        assert result['amount'] == '100,50'
-        assert result['bankgiro'] == '782-1713'
-
-    def test_spaces_in_bankgiro(self, parser):
-        """Test payment line with spaces in Bankgiro number."""
-        line = "# 310196187399952 # 11699 00 6 > 78 2 1 713 #41#"
-        result = parser._parse_standard_payment_line(line)
-
-        assert result is not None
-        assert result['ocr'] == '310196187399952'
-        assert result['amount'] == '11699'
-        assert result['bankgiro'] == '782-1713'
-
-    def test_spaces_in_bankgiro_multiple(self, parser):
-        """Test payment line with multiple spaces in account number."""
-        line = "# 123456789 # 500 00 1 > 1 2 3 4 5 6 7 #99#"
-        result = parser._parse_standard_payment_line(line)
-
-        assert result is not None
-        assert result['bankgiro'] == '123-4567'
-
-    def test_8_digit_bankgiro(self, parser):
-        """Test 8-digit Bankgiro formatting."""
-        line = "# 12345678901 # 200 00 2 > 53939484#14#"
-        result = parser._parse_standard_payment_line(line)
-
-        assert result is not None
-        assert result['bankgiro'] == '5393-9484'
-
-    def test_plusgiro_context(self, parser):
-        """Test Plusgiro detection based on context."""
-        line = "# 12345678901 # 100 00 2 > 1234567#14#"
-        result = parser._parse_standard_payment_line(line, context_line="plusgiro payment")
-
-        assert result is not None
-        assert 'plusgiro' in result
-        assert result['plusgiro'] == '123456-7'
-
-    def test_no_match_invalid_format(self, parser):
-        """Test that invalid format returns None."""
-        line = "This is not a valid payment line"
-        result = parser._parse_standard_payment_line(line)
-
-        assert result is None
-
-    def test_alternative_pattern(self, parser):
-        """Test alternative payment line pattern."""
-        line = "8120000849965361 11699 00 1 > 7821713"
-        result = parser._parse_standard_payment_line(line)
-
-        assert result is not None
-        assert result['ocr'] == '8120000849965361'
-
-    def test_long_ocr_number(self, parser):
-        """Test OCR number up to 25 digits."""
-        line = "# 1234567890123456789012345 # 100 00 2 > 7821713#14#"
-        result = parser._parse_standard_payment_line(line)
-
-        assert result is not None
-        assert result['ocr'] == '1234567890123456789012345'
-
-    def test_large_amount(self, parser):
-        """Test large amount extraction."""
-        line = "# 12345678901 # 1234567 00 2 > 7821713#14#"
-        result = parser._parse_standard_payment_line(line)
-
-        assert result is not None
-        assert result['amount'] == '1234567'
-
-
-class TestNormalizeAccountSpaces:
-    """Tests for account number space normalization."""
-
-    @pytest.fixture
-    def parser(self):
-        return MachineCodeParser()
-
-    def test_no_spaces(self, parser):
-        """Test line without spaces in account."""
-        line = "# 123456789 # 100 00 1 > 7821713#14#"
-        result = parser._parse_standard_payment_line(line)
-        assert result['bankgiro'] == '782-1713'
-
-    def test_single_space(self, parser):
-        """Test single space between digits."""
-        line = "# 123456789 # 100 00 1 > 782 1713#14#"
-        result = parser._parse_standard_payment_line(line)
-        assert result['bankgiro'] == '782-1713'
-
-    def test_multiple_spaces(self, parser):
-        """Test multiple spaces."""
-        line = "# 123456789 # 100 00 1 > 7 8 2 1 7 1 3#14#"
-        result = parser._parse_standard_payment_line(line)
-        assert result['bankgiro'] == '782-1713'
-
-    def test_no_arrow_marker(self, parser):
-        """Test line without > marker - spaces not normalized."""
-        # Without >, the normalization won't happen
-        line = "# 123456789 # 100 00 1 7821713#14#"
-        result = parser._parse_standard_payment_line(line)
-        # This pattern might not match due to missing >
-        # Just ensure no crash
-        assert result is None or isinstance(result, dict)
-
-
-class TestMachineCodeResult:
-    """Tests for MachineCodeResult dataclass."""
-
-    def test_to_dict(self):
-        """Test conversion to dictionary."""
-        result = MachineCodeResult(
-            ocr='12345678901',
-            amount='100',
-            bankgiro='782-1713',
-            confidence=0.95,
-            raw_line='test line'
-        )
-
-        d = result.to_dict()
-        assert d['ocr'] == '12345678901'
-        assert d['amount'] == '100'
-        assert d['bankgiro'] == '782-1713'
-        assert d['confidence'] == 0.95
-        assert d['raw_line'] == 'test line'
-
-    def test_empty_result(self):
-        """Test empty result."""
-        result = MachineCodeResult()
-        d = result.to_dict()
-
-        assert d['ocr'] is None
-        assert d['amount'] is None
-        assert d['bankgiro'] is None
-        assert d['plusgiro'] is None
-
-
-class TestRealWorldExamples:
-    """Tests using real-world payment line examples."""
-
-    @pytest.fixture
-    def parser(self):
-        return MachineCodeParser()
-
-    def test_fastum_invoice(self, parser):
-        """Test Fastum invoice payment line (from Faktura_A3861)."""
-        line = "# 310196187399952 # 11699 00 6 > 78 2 1 713 #41#"
-        result = parser._parse_standard_payment_line(line)
-
-        assert result is not None
-        assert result['ocr'] == '310196187399952'
-        assert result['amount'] == '11699'
-        assert result['bankgiro'] == '782-1713'
-
-    def test_standard_bankgiro_invoice(self, parser):
-        """Test standard Bankgiro format."""
-        line = "# 31130954410 # 315 00 2 > 8983025#14#"
-        result = parser._parse_standard_payment_line(line)
-
-        assert result is not None
-        assert result['ocr'] == '31130954410'
-        assert result['amount'] == '315'
-        assert result['bankgiro'] == '898-3025'
-
-    def test_payment_line_with_extra_whitespace(self, parser):
-        """Test payment line with extra whitespace."""
-        line = "#  310196187399952  #  11699  00  6  >  7821713  #41#"
-        result = parser._parse_standard_payment_line(line)
-
-        # Whether this matches depends on how flexibly the regex handles whitespace
-        # At minimum, parsing should not crash
-        assert result is None or isinstance(result, dict)
-
-
-class TestEdgeCases:
-    """Tests for edge cases and boundary conditions."""
-
-    @pytest.fixture
-    def parser(self):
-        return MachineCodeParser()
-
-    def test_empty_string(self, parser):
-        """Test empty string input."""
-        result = parser._parse_standard_payment_line("")
-        assert result is None
-
-    def test_only_whitespace(self, parser):
-        """Test whitespace-only input."""
-        result = parser._parse_standard_payment_line("   \t\n  ")
-        assert result is None
-
-    def test_minimum_ocr_length(self, parser):
-        """Test minimum OCR length (5 digits)."""
-        line = "# 12345 # 100 00 1 > 7821713#14#"
-        result = parser._parse_standard_payment_line(line)
-        assert result is not None
-        assert result['ocr'] == '12345'
-
-    def test_minimum_bankgiro_length(self, parser):
-        """Test minimum Bankgiro length (5 digits)."""
-        line = "# 12345678901 # 100 00 1 > 12345#14#"
-        result = parser._parse_standard_payment_line(line)
-        assert result is not None
-
-    def test_special_characters_in_line(self, parser):
-        """Test handling of special characters."""
-        line = "# 12345678901 # 100 00 1 > 7821713#14# (SEK)"
-        result = parser._parse_standard_payment_line(line)
-        assert result is not None
-        assert result['ocr'] == '12345678901'
-
-
-if __name__ == '__main__':
-    pytest.main([__file__, '-v'])