diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index 5c515df..1f2dd1d 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -1,263 +1,143 @@ -[角色] - 你是废才,一位资深产品经理兼全栈开发教练。 - - 你见过太多人带着"改变世界"的妄想来找你,最后连需求都说不清楚。 - 你也见过真正能成事的人——他们不一定聪明,但足够诚实,敢于面对自己想法的漏洞。 - - 你负责引导用户完成产品开发的完整旅程:从脑子里的模糊想法,到可运行的产品。 +# Invoice Master POC v2 -[任务] - 引导用户完成产品开发的完整流程: - - 1. **需求收集** → 调用 product-spec-builder,生成 Product-Spec.md - 2. **原型设计** → 调用 ui-prompt-generator,生成 UI-Prompts.md(可选) - 3. **项目开发** → 调用 dev-builder,实现项目代码 - 4. **本地运行** → 启动项目,输出使用指南 +Swedish Invoice Field Extraction System - YOLOv11 + PaddleOCR 从瑞典 PDF 发票中提取结构化数据。 -[文件结构] - project/ - ├── Product-Spec.md # 产品需求文档 - ├── Product-Spec-CHANGELOG.md # 需求变更记录 - ├── UI-Prompts.md # 原型图提示词(可选) - ├── [项目源代码]/ # 代码文件 - └── .claude/ - ├── CLAUDE.md # 主控(本文件) - └── skills/ - ├── product-spec-builder/ # 需求收集 - ├── ui-prompt-generator/ # 原型图提示词 - └── dev-builder/ # 项目开发 +## Tech Stack -[总体规则] - - 严格按照 需求收集 → 原型设计(可选)→ 项目开发 → 本地运行 的流程引导 - - **任何功能变更、UI 修改、需求调整,都必须先更新 Product Spec,再实现代码** - - 无论用户如何打断或提出新问题,完成当前回答后始终引导用户进入下一步 - - 始终使用**中文**进行交流 +| Component | Technology | +|-----------|------------| +| Object Detection | YOLOv11 (Ultralytics) | +| OCR Engine | PaddleOCR v5 (PP-OCRv5) | +| PDF Processing | PyMuPDF (fitz) | +| Database | PostgreSQL + psycopg2 | +| Web Framework | FastAPI + Uvicorn | +| Deep Learning | PyTorch + CUDA 12.x | -[运行环境要求] - **强制要求**:所有程序运行、命令执行必须在 WSL 环境中进行 +## WSL Environment (REQUIRED) - - **WSL**:所有 bash 命令必须通过 `wsl` 前缀执行 - - **Conda 环境**:必须使用 `invoice-py311` 环境 +**Prefix ALL commands with:** - 命令执行格式: - ```bash - wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && <你的命令>" - ``` +```bash +wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && " +``` - 示例: - ```bash - # 运行 Python 脚本 - wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && python main.py" +**NEVER run Python commands directly in Windows PowerShell/CMD.** - # 安装依赖 - wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && pip install -r requirements.txt" +## Project-Specific Rules - # 运行测试 - wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && pytest" - ``` +- Python 3.11+ with type hints +- No print() in production - use logging +- Run tests: `pytest --cov=src` - **注意**: - - 不要直接在 Windows PowerShell/CMD 中运行 Python 命令 - - 每次执行命令都需要激活 conda 环境(因为是非交互式 shell) - - 路径需要转换为 WSL 格式(如 `/mnt/c/Users/...`) +## File Structure -[Skill 调用规则] - [product-spec-builder] - **自动调用**: - - 用户表达想要开发产品、应用、工具时 - - 用户描述产品想法、功能需求时 - - 用户要修改 UI、改界面、调整布局时(迭代模式) - - 用户要增加功能、新增功能时(迭代模式) - - 用户要改需求、调整功能、修改逻辑时(迭代模式) - - **手动调用**:/prd - - [ui-prompt-generator] - **手动调用**:/ui - - 前置条件:Product-Spec.md 必须存在 - - [dev-builder] - **手动调用**:/dev - - 前置条件:Product-Spec.md 必须存在 +``` +src/ +├── cli/ # autolabel, train, infer, serve +├── pdf/ # extractor, renderer, detector +├── ocr/ # PaddleOCR wrapper, machine_code_parser +├── inference/ # pipeline, yolo_detector, field_extractor +├── normalize/ # Per-field normalizers +├── matcher/ # Exact, substring, fuzzy strategies +├── processing/ # CPU/GPU pool architecture +├── web/ # FastAPI app, routes, services, schemas +├── utils/ # validators, text_cleaner, fuzzy_matcher +└── data/ # Database operations +tests/ # Mirror of src structure +runs/train/ # Training outputs +``` -[项目状态检测与路由] - 初始化时自动检测项目进度,路由到对应阶段: - - 检测逻辑: - - 无 Product-Spec.md → 全新项目 → 引导用户描述想法或输入 /prd - - 有 Product-Spec.md,无代码 → Spec 已完成 → 输出交付指南 - - 有 Product-Spec.md,有代码 → 项目已创建 → 可执行 /check 或 /run - - 显示格式: - "📊 **项目进度检测** - - - Product Spec:[已完成/未完成] - - 原型图提示词:[已生成/未生成] - - 项目代码:[已创建/未创建] - - **当前阶段**:[阶段名称] - **下一步**:[具体指令或操作]" +## Supported Fields -[工作流程] - [需求收集阶段] - 触发:用户表达产品想法(自动)或输入 /prd(手动) - - 执行:调用 product-spec-builder skill - - 完成后:输出交付指南,引导下一步 +| ID | Field | Description | +|----|-------|-------------| +| 0 | invoice_number | Invoice number | +| 1 | invoice_date | Invoice date | +| 2 | invoice_due_date | Due date | +| 3 | ocr_number | OCR reference (Swedish payment) | +| 4 | bankgiro | Bankgiro account | +| 5 | plusgiro | Plusgiro account | +| 6 | amount | Amount | +| 7 | supplier_organisation_number | Supplier org number | +| 8 | payment_line | Payment line (machine-readable) | +| 9 | customer_number | Customer number | - [交付阶段] - 触发:Product Spec 生成完成后自动执行 - - 输出: - "✅ **Product Spec 已生成!** - - 文件:Product-Spec.md - - --- - - ## 📘 接下来 - - - 输入 /ui 生成原型图提示词(可选) - - 输入 /dev 开始开发项目 - - 直接对话可以改 UI、加功能" +## Key Patterns - [原型图阶段] - 触发:用户输入 /ui - - 执行:调用 ui-prompt-generator skill - - 完成后: - "✅ **原型图提示词已生成!** - - 文件:UI-Prompts.md - - 把提示词发给 AI 绘图工具生成原型图,然后输入 /dev 开始开发。" +### Inference Result - [项目开发阶段] - 触发:用户输入 /dev - - 第一步:询问原型图 - 询问用户:"有原型图或设计稿吗?有的话发给我参考。" - 用户发送图片 → 记录,开发时参考 - 用户说没有 → 继续 - - 第二步:执行开发 - 调用 dev-builder skill - - 完成后:引导用户执行 /run +```python +@dataclass +class InferenceResult: + document_id: str + document_type: str # "invoice" or "letter" + fields: dict[str, str] + confidence: dict[str, float] + cross_validation: CrossValidationResult | None + processing_time_ms: float +``` - [代码检查阶段] - 触发:用户输入 /check - - 执行: - 第一步:读取 Product Spec 文档 - 加载 Product-Spec.md 文件 - 解析功能需求、UI 布局 - - 第二步:扫描项目代码 - 遍历项目目录下的代码文件 - 识别已实现的功能、组件 - - 第三步:功能完整度检查 - - 功能需求:Product Spec 功能需求 vs 代码实现 - - UI 布局:Product Spec 布局描述 vs 界面代码 - - 第四步:输出检查报告 - - 输出: - "📋 **项目完整度检查报告** - - **对照文档**:Product-Spec.md - - --- - - ✅ **已完成(X项)** - - [功能名称]:[实现位置] - - ⚠️ **部分完成(X项)** - - [功能名称]:[缺失内容] - - ❌ **缺失(X项)** - - [功能名称]:未实现 - - --- - - 💡 **改进建议** - 1. [具体建议] - 2. [具体建议] - - --- - - 需要我帮你补充这些功能吗?或输入 /run 先跑起来看看。" +### API Schemas - [本地运行阶段] - 触发:用户输入 /run - - 执行:自动检测项目类型,安装依赖,启动项目 - - 输出: - "🚀 **项目已启动!** - - **访问地址**:http://localhost:[端口号] - - --- - - ## 📖 使用指南 - - [根据 Product Spec 生成简要使用说明] - - --- - - 💡 **提示**: - - /stop 停止服务 - - /check 检查完整度 - - /prd 修改需求" +See `src/web/schemas.py` for request/response models. - [内容修订] - 当用户提出修改意见时: - - **流程**:先更新文档 → 再实现代码 - - 1. 调用 product-spec-builder(迭代模式) - - 通过追问明确变更内容 - - 更新 Product-Spec.md - - 更新 Product-Spec-CHANGELOG.md - 2. 调用 dev-builder 实现代码变更 - 3. 建议用户执行 /check 验证 +## Environment Variables -[指令集] - /prd - 需求收集,生成 Product Spec - /ui - 生成原型图提示词 - /dev - 开发项目代码 - /check - 对照 Spec 检查代码完整度 - /run - 本地运行项目 - /stop - 停止运行中的服务 - /status - 显示项目进度 - /help - 显示所有指令 +```bash +# Required +DB_PASSWORD= -[初始化] - 以下ASCII艺术应该显示"FEICAI"字样。如果您看到乱码或显示异常,请帮忙纠正,使用ASCII艺术生成显示"FEICAI" - ``` - "███████╗███████╗██╗ ██████╗ █████╗ ██╗ - ██╔════╝██╔════╝██║██╔════╝██╔══██╗██║ - █████╗ █████╗ ██║██║ ███████║██║ - ██╔══╝ ██╔══╝ ██║██║ ██╔══██║██║ - ██║ ███████╗██║╚██████╗██║ ██║██║ - ╚═╝ ╚══════╝╚═╝ ╚═════╝╚═╝ ╚═╝╚═╝" - ``` - - "👋 我是废才,产品经理兼开发教练。 +# Optional (with defaults) +DB_HOST=192.168.68.31 +DB_PORT=5432 +DB_NAME=docmaster +DB_USER=docmaster +MODEL_PATH=runs/train/invoice_fields/weights/best.pt +CONFIDENCE_THRESHOLD=0.5 +SERVER_HOST=0.0.0.0 +SERVER_PORT=8000 +``` - 我不聊理想,只聊产品。你负责想,我负责问到你想清楚。 - 从需求文档到本地运行,全程我带着走。 +## CLI Commands - 过程中我会问很多问题,有些可能让你不舒服。不过放心,我只是想让你的产品能落地,仅此而已。 +```bash +# Auto-labeling +python -m src.cli.autolabel --dual-pool --cpu-workers 3 --gpu-workers 1 - 💡 输入 /help 查看所有指令 +# Training +python -m src.cli.train --model yolo11n.pt --epochs 100 --batch 16 --name invoice_fields - 现在,说说你想做什么?" - - 执行 [项目状态检测与路由] +# Inference +python -m src.cli.infer --model runs/train/invoice_fields/weights/best.pt --input invoice.pdf --gpu + +# Web Server +python run_server.py --port 8000 +``` + +## API Endpoints + +| Method | Endpoint | Description | +|--------|----------|-------------| +| GET | `/` | Web UI | +| GET | `/api/v1/health` | Health check | +| POST | `/api/v1/infer` | Process invoice | +| GET | `/api/v1/results/{filename}` | Get visualization | + +## Current Status + +- **Tests**: 688 passing +- **Coverage**: 37% +- **Model**: 93.5% mAP@0.5 +- **Documents Labeled**: 9,738 + +## Quick Start + +```bash +# Start server +wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && python run_server.py" + +# Run tests +wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && pytest" + +# Access UI: http://localhost:8000 +``` \ No newline at end of file diff --git a/.claude/commands/build-fix.md b/.claude/commands/build-fix.md new file mode 100644 index 0000000..5951016 --- /dev/null +++ b/.claude/commands/build-fix.md @@ -0,0 +1,22 @@ +# Build and Fix + +Incrementally fix Python errors and test failures. + +## Workflow + +1. Run check: `mypy src/ --ignore-missing-imports` or `pytest -x --tb=short` +2. Parse errors, group by file, sort by severity (ImportError > TypeError > other) +3. For each error: + - Show context (5 lines) + - Explain and propose fix + - Apply fix + - Re-run test for that file + - Verify resolved +4. Stop if: fix introduces new errors, same error after 3 attempts, or user pauses +5. Show summary: fixed / remaining / new errors + +## Rules + +- Fix ONE error at a time +- Re-run tests after each fix +- Never batch multiple unrelated fixes \ No newline at end of file diff --git a/.claude/commands/checkpoint.md b/.claude/commands/checkpoint.md new file mode 100644 index 0000000..06293c0 --- /dev/null +++ b/.claude/commands/checkpoint.md @@ -0,0 +1,74 @@ +# Checkpoint Command + +Create or verify a checkpoint in your workflow. + +## Usage + +`/checkpoint [create|verify|list] [name]` + +## Create Checkpoint + +When creating a checkpoint: + +1. Run `/verify quick` to ensure current state is clean +2. Create a git stash or commit with checkpoint name +3. Log checkpoint to `.claude/checkpoints.log`: + +```bash +echo "$(date +%Y-%m-%d-%H:%M) | $CHECKPOINT_NAME | $(git rev-parse --short HEAD)" >> .claude/checkpoints.log +``` + +4. Report checkpoint created + +## Verify Checkpoint + +When verifying against a checkpoint: + +1. Read checkpoint from log +2. Compare current state to checkpoint: + - Files added since checkpoint + - Files modified since checkpoint + - Test pass rate now vs then + - Coverage now vs then + +3. Report: +``` +CHECKPOINT COMPARISON: $NAME +============================ +Files changed: X +Tests: +Y passed / -Z failed +Coverage: +X% / -Y% +Build: [PASS/FAIL] +``` + +## List Checkpoints + +Show all checkpoints with: +- Name +- Timestamp +- Git SHA +- Status (current, behind, ahead) + +## Workflow + +Typical checkpoint flow: + +``` +[Start] --> /checkpoint create "feature-start" + | +[Implement] --> /checkpoint create "core-done" + | +[Test] --> /checkpoint verify "core-done" + | +[Refactor] --> /checkpoint create "refactor-done" + | +[PR] --> /checkpoint verify "feature-start" +``` + +## Arguments + +$ARGUMENTS: +- `create ` - Create named checkpoint +- `verify ` - Verify against named checkpoint +- `list` - Show all checkpoints +- `clear` - Remove old checkpoints (keeps last 5) diff --git a/.claude/commands/code-review.md b/.claude/commands/code-review.md new file mode 100644 index 0000000..25c9e7a --- /dev/null +++ b/.claude/commands/code-review.md @@ -0,0 +1,46 @@ +# Code Review + +Security and quality review of uncommitted changes. + +## Workflow + +1. Get changed files: `git diff --name-only HEAD` and `git diff --staged --name-only` +2. Review each file for issues (see checklist below) +3. Run automated checks: `mypy src/`, `ruff check src/`, `pytest -x` +4. Generate report with severity, location, description, suggested fix +5. Block commit if CRITICAL or HIGH issues found + +## Checklist + +### CRITICAL (Block) + +- Hardcoded credentials, API keys, tokens, passwords +- SQL injection (must use parameterized queries) +- Path traversal risks +- Missing input validation on API endpoints +- Missing authentication/authorization + +### HIGH (Block) + +- Functions > 50 lines, files > 800 lines +- Nesting depth > 4 levels +- Missing error handling or bare `except:` +- `print()` in production code (use logging) +- Mutable default arguments + +### MEDIUM (Warn) + +- Missing type hints on public functions +- Missing tests for new code +- Duplicate code, magic numbers +- Unused imports/variables +- TODO/FIXME comments + +## Report Format + +``` +[SEVERITY] file:line - Issue description + Suggested fix: ... +``` + +## Never Approve Code With Security Vulnerabilities! \ No newline at end of file diff --git a/.claude/commands/e2e.md b/.claude/commands/e2e.md new file mode 100644 index 0000000..6ac6d43 --- /dev/null +++ b/.claude/commands/e2e.md @@ -0,0 +1,40 @@ +# E2E Testing + +End-to-end testing for the Invoice Field Extraction API. + +## When to Use + +- Testing complete inference pipeline (PDF -> Fields) +- Verifying API endpoints work end-to-end +- Validating YOLO + OCR + field extraction integration +- Pre-deployment verification + +## Workflow + +1. Ensure server is running: `python run_server.py` +2. Run health check: `curl http://localhost:8000/api/v1/health` +3. Run E2E tests: `pytest tests/e2e/ -v` +4. Verify results and capture any failures + +## Critical Scenarios (Must Pass) + +1. Health check returns `{"status": "healthy", "model_loaded": true}` +2. PDF upload returns valid response with fields +3. Fields extracted with confidence scores +4. Visualization image generated +5. Cross-validation included for invoices with payment_line + +## Checklist + +- [ ] Server running on http://localhost:8000 +- [ ] Health check passes +- [ ] PDF inference returns valid JSON +- [ ] At least one field extracted +- [ ] Visualization URL returns image +- [ ] Response time < 10 seconds +- [ ] No server errors in logs + +## Test Location + +E2E tests: `tests/e2e/` +Sample fixtures: `tests/fixtures/` \ No newline at end of file diff --git a/.claude/commands/eval.md b/.claude/commands/eval.md new file mode 100644 index 0000000..852c175 --- /dev/null +++ b/.claude/commands/eval.md @@ -0,0 +1,174 @@ +# Eval Command + +Evaluate model performance and field extraction accuracy. + +## Usage + +`/eval [model|accuracy|compare|report]` + +## Model Evaluation + +`/eval model` + +Evaluate YOLO model performance on test dataset: + +```bash +# Run model evaluation +python -m src.cli.train --model runs/train/invoice_fields/weights/best.pt --eval-only + +# Or use ultralytics directly +yolo val model=runs/train/invoice_fields/weights/best.pt data=data.yaml +``` + +Output: +``` +Model Evaluation: invoice_fields/best.pt +======================================== +mAP@0.5: 93.5% +mAP@0.5-0.95: 83.0% + +Per-class AP: +- invoice_number: 95.2% +- invoice_date: 94.8% +- invoice_due_date: 93.1% +- ocr_number: 91.5% +- bankgiro: 92.3% +- plusgiro: 90.8% +- amount: 88.7% +- supplier_org_num: 85.2% +- payment_line: 82.4% +- customer_number: 81.1% +``` + +## Accuracy Evaluation + +`/eval accuracy` + +Evaluate field extraction accuracy against ground truth: + +```bash +# Run accuracy evaluation on labeled data +python -m src.cli.infer --model runs/train/invoice_fields/weights/best.pt \ + --input ~/invoice-data/test/*.pdf \ + --ground-truth ~/invoice-data/test/labels.csv \ + --output eval_results.json +``` + +Output: +``` +Field Extraction Accuracy +========================= +Documents tested: 500 + +Per-field accuracy: +- InvoiceNumber: 98.9% (494/500) +- InvoiceDate: 95.5% (478/500) +- InvoiceDueDate: 95.9% (480/500) +- OCR: 99.1% (496/500) +- Bankgiro: 99.0% (495/500) +- Plusgiro: 99.4% (497/500) +- Amount: 91.3% (457/500) +- supplier_org: 78.2% (391/500) + +Overall: 94.8% +``` + +## Compare Models + +`/eval compare` + +Compare two model versions: + +```bash +# Compare old vs new model +python -m src.cli.eval compare \ + --model-a runs/train/invoice_v1/weights/best.pt \ + --model-b runs/train/invoice_v2/weights/best.pt \ + --test-data ~/invoice-data/test/ +``` + +Output: +``` +Model Comparison +================ + Model A Model B Delta +mAP@0.5: 91.2% 93.5% +2.3% +Accuracy: 92.1% 94.8% +2.7% +Speed (ms): 1850 1520 -330 + +Per-field improvements: +- amount: +4.2% +- payment_line: +3.8% +- customer_num: +2.1% + +Recommendation: Deploy Model B +``` + +## Generate Report + +`/eval report` + +Generate comprehensive evaluation report: + +```bash +python -m src.cli.eval report --output eval_report.md +``` + +Output: +```markdown +# Evaluation Report +Generated: 2026-01-25 + +## Model Performance +- Model: runs/train/invoice_fields/weights/best.pt +- mAP@0.5: 93.5% +- Training samples: 9,738 + +## Field Extraction Accuracy +| Field | Accuracy | Errors | +|-------|----------|--------| +| InvoiceNumber | 98.9% | 6 | +| Amount | 91.3% | 43 | +... + +## Error Analysis +### Common Errors +1. Amount: OCR misreads comma as period +2. supplier_org: Missing from some invoices +3. payment_line: Partially obscured by stamps + +## Recommendations +1. Add more training data for low-accuracy fields +2. Implement OCR error correction for amounts +3. Consider confidence threshold tuning +``` + +## Quick Commands + +```bash +# Evaluate model metrics +yolo val model=runs/train/invoice_fields/weights/best.pt + +# Test inference on sample +python -m src.cli.infer --input sample.pdf --output result.json --gpu + +# Check test coverage +pytest --cov=src --cov-report=html +``` + +## Evaluation Metrics + +| Metric | Target | Current | +|--------|--------|---------| +| mAP@0.5 | >90% | 93.5% | +| Overall Accuracy | >90% | 94.8% | +| Test Coverage | >60% | 37% | +| Tests Passing | 100% | 100% | + +## When to Evaluate + +- After training a new model +- Before deploying to production +- After adding new training data +- When accuracy complaints arise +- Weekly performance monitoring diff --git a/.claude/commands/learn.md b/.claude/commands/learn.md new file mode 100644 index 0000000..9899af1 --- /dev/null +++ b/.claude/commands/learn.md @@ -0,0 +1,70 @@ +# /learn - Extract Reusable Patterns + +Analyze the current session and extract any patterns worth saving as skills. + +## Trigger + +Run `/learn` at any point during a session when you've solved a non-trivial problem. + +## What to Extract + +Look for: + +1. **Error Resolution Patterns** + - What error occurred? + - What was the root cause? + - What fixed it? + - Is this reusable for similar errors? + +2. **Debugging Techniques** + - Non-obvious debugging steps + - Tool combinations that worked + - Diagnostic patterns + +3. **Workarounds** + - Library quirks + - API limitations + - Version-specific fixes + +4. **Project-Specific Patterns** + - Codebase conventions discovered + - Architecture decisions made + - Integration patterns + +## Output Format + +Create a skill file at `~/.claude/skills/learned/[pattern-name].md`: + +```markdown +# [Descriptive Pattern Name] + +**Extracted:** [Date] +**Context:** [Brief description of when this applies] + +## Problem +[What problem this solves - be specific] + +## Solution +[The pattern/technique/workaround] + +## Example +[Code example if applicable] + +## When to Use +[Trigger conditions - what should activate this skill] +``` + +## Process + +1. Review the session for extractable patterns +2. Identify the most valuable/reusable insight +3. Draft the skill file +4. Ask user to confirm before saving +5. Save to `~/.claude/skills/learned/` + +## Notes + +- Don't extract trivial fixes (typos, simple syntax errors) +- Don't extract one-time issues (specific API outages, etc.) +- Focus on patterns that will save time in future sessions +- Keep skills focused - one pattern per skill diff --git a/.claude/commands/orchestrate.md b/.claude/commands/orchestrate.md new file mode 100644 index 0000000..30ac2b8 --- /dev/null +++ b/.claude/commands/orchestrate.md @@ -0,0 +1,172 @@ +# Orchestrate Command + +Sequential agent workflow for complex tasks. + +## Usage + +`/orchestrate [workflow-type] [task-description]` + +## Workflow Types + +### feature +Full feature implementation workflow: +``` +planner -> tdd-guide -> code-reviewer -> security-reviewer +``` + +### bugfix +Bug investigation and fix workflow: +``` +explorer -> tdd-guide -> code-reviewer +``` + +### refactor +Safe refactoring workflow: +``` +architect -> code-reviewer -> tdd-guide +``` + +### security +Security-focused review: +``` +security-reviewer -> code-reviewer -> architect +``` + +## Execution Pattern + +For each agent in the workflow: + +1. **Invoke agent** with context from previous agent +2. **Collect output** as structured handoff document +3. **Pass to next agent** in chain +4. **Aggregate results** into final report + +## Handoff Document Format + +Between agents, create handoff document: + +```markdown +## HANDOFF: [previous-agent] -> [next-agent] + +### Context +[Summary of what was done] + +### Findings +[Key discoveries or decisions] + +### Files Modified +[List of files touched] + +### Open Questions +[Unresolved items for next agent] + +### Recommendations +[Suggested next steps] +``` + +## Example: Feature Workflow + +``` +/orchestrate feature "Add user authentication" +``` + +Executes: + +1. **Planner Agent** + - Analyzes requirements + - Creates implementation plan + - Identifies dependencies + - Output: `HANDOFF: planner -> tdd-guide` + +2. **TDD Guide Agent** + - Reads planner handoff + - Writes tests first + - Implements to pass tests + - Output: `HANDOFF: tdd-guide -> code-reviewer` + +3. **Code Reviewer Agent** + - Reviews implementation + - Checks for issues + - Suggests improvements + - Output: `HANDOFF: code-reviewer -> security-reviewer` + +4. **Security Reviewer Agent** + - Security audit + - Vulnerability check + - Final approval + - Output: Final Report + +## Final Report Format + +``` +ORCHESTRATION REPORT +==================== +Workflow: feature +Task: Add user authentication +Agents: planner -> tdd-guide -> code-reviewer -> security-reviewer + +SUMMARY +------- +[One paragraph summary] + +AGENT OUTPUTS +------------- +Planner: [summary] +TDD Guide: [summary] +Code Reviewer: [summary] +Security Reviewer: [summary] + +FILES CHANGED +------------- +[List all files modified] + +TEST RESULTS +------------ +[Test pass/fail summary] + +SECURITY STATUS +--------------- +[Security findings] + +RECOMMENDATION +-------------- +[SHIP / NEEDS WORK / BLOCKED] +``` + +## Parallel Execution + +For independent checks, run agents in parallel: + +```markdown +### Parallel Phase +Run simultaneously: +- code-reviewer (quality) +- security-reviewer (security) +- architect (design) + +### Merge Results +Combine outputs into single report +``` + +## Arguments + +$ARGUMENTS: +- `feature ` - Full feature workflow +- `bugfix ` - Bug fix workflow +- `refactor ` - Refactoring workflow +- `security ` - Security review workflow +- `custom ` - Custom agent sequence + +## Custom Workflow Example + +``` +/orchestrate custom "architect,tdd-guide,code-reviewer" "Redesign caching layer" +``` + +## Tips + +1. **Start with planner** for complex features +2. **Always include code-reviewer** before merge +3. **Use security-reviewer** for auth/payment/PII +4. **Keep handoffs concise** - focus on what next agent needs +5. **Run verification** between agents if needed diff --git a/.claude/commands/plan.md b/.claude/commands/plan.md new file mode 100644 index 0000000..3acf686 --- /dev/null +++ b/.claude/commands/plan.md @@ -0,0 +1,113 @@ +--- +description: Restate requirements, assess risks, and create step-by-step implementation plan. WAIT for user CONFIRM before touching any code. +--- + +# Plan Command + +This command invokes the **planner** agent to create a comprehensive implementation plan before writing any code. + +## What This Command Does + +1. **Restate Requirements** - Clarify what needs to be built +2. **Identify Risks** - Surface potential issues and blockers +3. **Create Step Plan** - Break down implementation into phases +4. **Wait for Confirmation** - MUST receive user approval before proceeding + +## When to Use + +Use `/plan` when: +- Starting a new feature +- Making significant architectural changes +- Working on complex refactoring +- Multiple files/components will be affected +- Requirements are unclear or ambiguous + +## How It Works + +The planner agent will: + +1. **Analyze the request** and restate requirements in clear terms +2. **Break down into phases** with specific, actionable steps +3. **Identify dependencies** between components +4. **Assess risks** and potential blockers +5. **Estimate complexity** (High/Medium/Low) +6. **Present the plan** and WAIT for your explicit confirmation + +## Example Usage + +``` +User: /plan I need to add real-time notifications when markets resolve + +Agent (planner): +# Implementation Plan: Real-Time Market Resolution Notifications + +## Requirements Restatement +- Send notifications to users when markets they're watching resolve +- Support multiple notification channels (in-app, email, webhook) +- Ensure notifications are delivered reliably +- Include market outcome and user's position result + +## Implementation Phases + +### Phase 1: Database Schema +- Add notifications table with columns: id, user_id, market_id, type, status, created_at +- Add user_notification_preferences table for channel preferences +- Create indexes on user_id and market_id for performance + +### Phase 2: Notification Service +- Create notification service in lib/notifications.ts +- Implement notification queue using BullMQ/Redis +- Add retry logic for failed deliveries +- Create notification templates + +### Phase 3: Integration Points +- Hook into market resolution logic (when status changes to "resolved") +- Query all users with positions in market +- Enqueue notifications for each user + +### Phase 4: Frontend Components +- Create NotificationBell component in header +- Add NotificationList modal +- Implement real-time updates via Supabase subscriptions +- Add notification preferences page + +## Dependencies +- Redis (for queue) +- Email service (SendGrid/Resend) +- Supabase real-time subscriptions + +## Risks +- HIGH: Email deliverability (SPF/DKIM required) +- MEDIUM: Performance with 1000+ users per market +- MEDIUM: Notification spam if markets resolve frequently +- LOW: Real-time subscription overhead + +## Estimated Complexity: MEDIUM +- Backend: 4-6 hours +- Frontend: 3-4 hours +- Testing: 2-3 hours +- Total: 9-13 hours + +**WAITING FOR CONFIRMATION**: Proceed with this plan? (yes/no/modify) +``` + +## Important Notes + +**CRITICAL**: The planner agent will **NOT** write any code until you explicitly confirm the plan with "yes" or "proceed" or similar affirmative response. + +If you want changes, respond with: +- "modify: [your changes]" +- "different approach: [alternative]" +- "skip phase 2 and do phase 3 first" + +## Integration with Other Commands + +After planning: +- Use `/tdd` to implement with test-driven development +- Use `/build-and-fix` if build errors occur +- Use `/code-review` to review completed implementation + +## Related Agents + +This command invokes the `planner` agent located at: +`~/.claude/agents/planner.md` diff --git a/.claude/commands/refactor-clean.md b/.claude/commands/refactor-clean.md new file mode 100644 index 0000000..6f5e250 --- /dev/null +++ b/.claude/commands/refactor-clean.md @@ -0,0 +1,28 @@ +# Refactor Clean + +Safely identify and remove dead code with test verification: + +1. Run dead code analysis tools: + - knip: Find unused exports and files + - depcheck: Find unused dependencies + - ts-prune: Find unused TypeScript exports + +2. Generate comprehensive report in .reports/dead-code-analysis.md + +3. Categorize findings by severity: + - SAFE: Test files, unused utilities + - CAUTION: API routes, components + - DANGER: Config files, main entry points + +4. Propose safe deletions only + +5. Before each deletion: + - Run full test suite + - Verify tests pass + - Apply change + - Re-run tests + - Rollback if tests fail + +6. Show summary of cleaned items + +Never delete code without running tests first! diff --git a/.claude/commands/setup-pm.md b/.claude/commands/setup-pm.md new file mode 100644 index 0000000..87224b9 --- /dev/null +++ b/.claude/commands/setup-pm.md @@ -0,0 +1,80 @@ +--- +description: Configure your preferred package manager (npm/pnpm/yarn/bun) +disable-model-invocation: true +--- + +# Package Manager Setup + +Configure your preferred package manager for this project or globally. + +## Usage + +```bash +# Detect current package manager +node scripts/setup-package-manager.js --detect + +# Set global preference +node scripts/setup-package-manager.js --global pnpm + +# Set project preference +node scripts/setup-package-manager.js --project bun + +# List available package managers +node scripts/setup-package-manager.js --list +``` + +## Detection Priority + +When determining which package manager to use, the following order is checked: + +1. **Environment variable**: `CLAUDE_PACKAGE_MANAGER` +2. **Project config**: `.claude/package-manager.json` +3. **package.json**: `packageManager` field +4. **Lock file**: Presence of package-lock.json, yarn.lock, pnpm-lock.yaml, or bun.lockb +5. **Global config**: `~/.claude/package-manager.json` +6. **Fallback**: First available package manager (pnpm > bun > yarn > npm) + +## Configuration Files + +### Global Configuration +```json +// ~/.claude/package-manager.json +{ + "packageManager": "pnpm" +} +``` + +### Project Configuration +```json +// .claude/package-manager.json +{ + "packageManager": "bun" +} +``` + +### package.json +```json +{ + "packageManager": "pnpm@8.6.0" +} +``` + +## Environment Variable + +Set `CLAUDE_PACKAGE_MANAGER` to override all other detection methods: + +```bash +# Windows (PowerShell) +$env:CLAUDE_PACKAGE_MANAGER = "pnpm" + +# macOS/Linux +export CLAUDE_PACKAGE_MANAGER=pnpm +``` + +## Run the Detection + +To see current package manager detection results, run: + +```bash +node scripts/setup-package-manager.js --detect +``` diff --git a/.claude/commands/tdd.md b/.claude/commands/tdd.md new file mode 100644 index 0000000..02bdb2d --- /dev/null +++ b/.claude/commands/tdd.md @@ -0,0 +1,326 @@ +--- +description: Enforce test-driven development workflow. Scaffold interfaces, generate tests FIRST, then implement minimal code to pass. Ensure 80%+ coverage. +--- + +# TDD Command + +This command invokes the **tdd-guide** agent to enforce test-driven development methodology. + +## What This Command Does + +1. **Scaffold Interfaces** - Define types/interfaces first +2. **Generate Tests First** - Write failing tests (RED) +3. **Implement Minimal Code** - Write just enough to pass (GREEN) +4. **Refactor** - Improve code while keeping tests green (REFACTOR) +5. **Verify Coverage** - Ensure 80%+ test coverage + +## When to Use + +Use `/tdd` when: +- Implementing new features +- Adding new functions/components +- Fixing bugs (write test that reproduces bug first) +- Refactoring existing code +- Building critical business logic + +## How It Works + +The tdd-guide agent will: + +1. **Define interfaces** for inputs/outputs +2. **Write tests that will FAIL** (because code doesn't exist yet) +3. **Run tests** and verify they fail for the right reason +4. **Write minimal implementation** to make tests pass +5. **Run tests** and verify they pass +6. **Refactor** code while keeping tests green +7. **Check coverage** and add more tests if below 80% + +## TDD Cycle + +``` +RED → GREEN → REFACTOR → REPEAT + +RED: Write a failing test +GREEN: Write minimal code to pass +REFACTOR: Improve code, keep tests passing +REPEAT: Next feature/scenario +``` + +## Example Usage + +``` +User: /tdd I need a function to calculate market liquidity score + +Agent (tdd-guide): +# TDD Session: Market Liquidity Score Calculator + +## Step 1: Define Interface (SCAFFOLD) + +```typescript +// lib/liquidity.ts +export interface MarketData { + totalVolume: number + bidAskSpread: number + activeTraders: number + lastTradeTime: Date +} + +export function calculateLiquidityScore(market: MarketData): number { + // TODO: Implementation + throw new Error('Not implemented') +} +``` + +## Step 2: Write Failing Test (RED) + +```typescript +// lib/liquidity.test.ts +import { calculateLiquidityScore } from './liquidity' + +describe('calculateLiquidityScore', () => { + it('should return high score for liquid market', () => { + const market = { + totalVolume: 100000, + bidAskSpread: 0.01, + activeTraders: 500, + lastTradeTime: new Date() + } + + const score = calculateLiquidityScore(market) + + expect(score).toBeGreaterThan(80) + expect(score).toBeLessThanOrEqual(100) + }) + + it('should return low score for illiquid market', () => { + const market = { + totalVolume: 100, + bidAskSpread: 0.5, + activeTraders: 2, + lastTradeTime: new Date(Date.now() - 86400000) // 1 day ago + } + + const score = calculateLiquidityScore(market) + + expect(score).toBeLessThan(30) + expect(score).toBeGreaterThanOrEqual(0) + }) + + it('should handle edge case: zero volume', () => { + const market = { + totalVolume: 0, + bidAskSpread: 0, + activeTraders: 0, + lastTradeTime: new Date() + } + + const score = calculateLiquidityScore(market) + + expect(score).toBe(0) + }) +}) +``` + +## Step 3: Run Tests - Verify FAIL + +```bash +npm test lib/liquidity.test.ts + +FAIL lib/liquidity.test.ts + ✕ should return high score for liquid market (2 ms) + Error: Not implemented + +1 test failed, 0 passed +``` + +✅ Tests fail as expected. Ready to implement. + +## Step 4: Implement Minimal Code (GREEN) + +```typescript +// lib/liquidity.ts +export function calculateLiquidityScore(market: MarketData): number { + // Handle zero volume edge case + if (market.totalVolume === 0) { + return 0 + } + + // Calculate component scores (0-100 scale) + const volumeScore = Math.min(market.totalVolume / 1000, 100) + const spreadScore = Math.max(100 - (market.bidAskSpread * 1000), 0) + const traderScore = Math.min(market.activeTraders / 10, 100) + + // Recent activity bonus + const hoursSinceLastTrade = (Date.now() - market.lastTradeTime.getTime()) / (1000 * 60 * 60) + const recencyScore = Math.max(100 - (hoursSinceLastTrade * 10), 0) + + // Weighted average + const score = ( + volumeScore * 0.4 + + spreadScore * 0.3 + + traderScore * 0.2 + + recencyScore * 0.1 + ) + + return Math.min(Math.max(score, 0), 100) // Clamp to 0-100 +} +``` + +## Step 5: Run Tests - Verify PASS + +```bash +npm test lib/liquidity.test.ts + +PASS lib/liquidity.test.ts + ✓ should return high score for liquid market (3 ms) + ✓ should return low score for illiquid market (2 ms) + ✓ should handle edge case: zero volume (1 ms) + +3 tests passed +``` + +✅ All tests passing! + +## Step 6: Refactor (IMPROVE) + +```typescript +// lib/liquidity.ts - Refactored with constants and better readability +const WEIGHTS = { + VOLUME: 0.4, + SPREAD: 0.3, + TRADERS: 0.2, + RECENCY: 0.1, +} as const + +const SCALE_FACTORS = { + VOLUME: 1000, + SPREAD: 1000, + TRADERS: 10, + RECENCY_PENALTY: 10, +} as const + +function clamp(value: number, min: number, max: number): number { + return Math.min(Math.max(value, min), max) +} + +export function calculateLiquidityScore(market: MarketData): number { + if (market.totalVolume === 0) return 0 + + const volumeScore = Math.min(market.totalVolume / SCALE_FACTORS.VOLUME, 100) + const spreadScore = clamp(100 - (market.bidAskSpread * SCALE_FACTORS.SPREAD), 0, 100) + const traderScore = Math.min(market.activeTraders / SCALE_FACTORS.TRADERS, 100) + + const hoursSinceLastTrade = (Date.now() - market.lastTradeTime.getTime()) / (1000 * 60 * 60) + const recencyScore = clamp(100 - (hoursSinceLastTrade * SCALE_FACTORS.RECENCY_PENALTY), 0, 100) + + const weightedScore = + volumeScore * WEIGHTS.VOLUME + + spreadScore * WEIGHTS.SPREAD + + traderScore * WEIGHTS.TRADERS + + recencyScore * WEIGHTS.RECENCY + + return clamp(weightedScore, 0, 100) +} +``` + +## Step 7: Verify Tests Still Pass + +```bash +npm test lib/liquidity.test.ts + +PASS lib/liquidity.test.ts + ✓ should return high score for liquid market (3 ms) + ✓ should return low score for illiquid market (2 ms) + ✓ should handle edge case: zero volume (1 ms) + +3 tests passed +``` + +✅ Refactoring complete, tests still passing! + +## Step 8: Check Coverage + +```bash +npm test -- --coverage lib/liquidity.test.ts + +File | % Stmts | % Branch | % Funcs | % Lines +---------------|---------|----------|---------|-------- +liquidity.ts | 100 | 100 | 100 | 100 + +Coverage: 100% ✅ (Target: 80%) +``` + +✅ TDD session complete! +``` + +## TDD Best Practices + +**DO:** +- ✅ Write the test FIRST, before any implementation +- ✅ Run tests and verify they FAIL before implementing +- ✅ Write minimal code to make tests pass +- ✅ Refactor only after tests are green +- ✅ Add edge cases and error scenarios +- ✅ Aim for 80%+ coverage (100% for critical code) + +**DON'T:** +- ❌ Write implementation before tests +- ❌ Skip running tests after each change +- ❌ Write too much code at once +- ❌ Ignore failing tests +- ❌ Test implementation details (test behavior) +- ❌ Mock everything (prefer integration tests) + +## Test Types to Include + +**Unit Tests** (Function-level): +- Happy path scenarios +- Edge cases (empty, null, max values) +- Error conditions +- Boundary values + +**Integration Tests** (Component-level): +- API endpoints +- Database operations +- External service calls +- React components with hooks + +**E2E Tests** (use `/e2e` command): +- Critical user flows +- Multi-step processes +- Full stack integration + +## Coverage Requirements + +- **80% minimum** for all code +- **100% required** for: + - Financial calculations + - Authentication logic + - Security-critical code + - Core business logic + +## Important Notes + +**MANDATORY**: Tests must be written BEFORE implementation. The TDD cycle is: + +1. **RED** - Write failing test +2. **GREEN** - Implement to pass +3. **REFACTOR** - Improve code + +Never skip the RED phase. Never write code before tests. + +## Integration with Other Commands + +- Use `/plan` first to understand what to build +- Use `/tdd` to implement with tests +- Use `/build-and-fix` if build errors occur +- Use `/code-review` to review implementation +- Use `/test-coverage` to verify coverage + +## Related Agents + +This command invokes the `tdd-guide` agent located at: +`~/.claude/agents/tdd-guide.md` + +And can reference the `tdd-workflow` skill at: +`~/.claude/skills/tdd-workflow/` diff --git a/.claude/commands/test-coverage.md b/.claude/commands/test-coverage.md new file mode 100644 index 0000000..754eabf --- /dev/null +++ b/.claude/commands/test-coverage.md @@ -0,0 +1,27 @@ +# Test Coverage + +Analyze test coverage and generate missing tests: + +1. Run tests with coverage: npm test --coverage or pnpm test --coverage + +2. Analyze coverage report (coverage/coverage-summary.json) + +3. Identify files below 80% coverage threshold + +4. For each under-covered file: + - Analyze untested code paths + - Generate unit tests for functions + - Generate integration tests for APIs + - Generate E2E tests for critical flows + +5. Verify new tests pass + +6. Show before/after coverage metrics + +7. Ensure project reaches 80%+ overall coverage + +Focus on: +- Happy path scenarios +- Error handling +- Edge cases (null, undefined, empty) +- Boundary conditions diff --git a/.claude/commands/update-codemaps.md b/.claude/commands/update-codemaps.md new file mode 100644 index 0000000..f363a05 --- /dev/null +++ b/.claude/commands/update-codemaps.md @@ -0,0 +1,17 @@ +# Update Codemaps + +Analyze the codebase structure and update architecture documentation: + +1. Scan all source files for imports, exports, and dependencies +2. Generate token-lean codemaps in the following format: + - codemaps/architecture.md - Overall architecture + - codemaps/backend.md - Backend structure + - codemaps/frontend.md - Frontend structure + - codemaps/data.md - Data models and schemas + +3. Calculate diff percentage from previous version +4. If changes > 30%, request user approval before updating +5. Add freshness timestamp to each codemap +6. Save reports to .reports/codemap-diff.txt + +Use TypeScript/Node.js for analysis. Focus on high-level structure, not implementation details. diff --git a/.claude/commands/update-docs.md b/.claude/commands/update-docs.md new file mode 100644 index 0000000..3dd0f89 --- /dev/null +++ b/.claude/commands/update-docs.md @@ -0,0 +1,31 @@ +# Update Documentation + +Sync documentation from source-of-truth: + +1. Read package.json scripts section + - Generate scripts reference table + - Include descriptions from comments + +2. Read .env.example + - Extract all environment variables + - Document purpose and format + +3. Generate docs/CONTRIB.md with: + - Development workflow + - Available scripts + - Environment setup + - Testing procedures + +4. Generate docs/RUNBOOK.md with: + - Deployment procedures + - Monitoring and alerts + - Common issues and fixes + - Rollback procedures + +5. Identify obsolete documentation: + - Find docs not modified in 90+ days + - List for manual review + +6. Show diff summary + +Single source of truth: package.json and .env.example diff --git a/.claude/commands/verify.md b/.claude/commands/verify.md new file mode 100644 index 0000000..5f628b1 --- /dev/null +++ b/.claude/commands/verify.md @@ -0,0 +1,59 @@ +# Verification Command + +Run comprehensive verification on current codebase state. + +## Instructions + +Execute verification in this exact order: + +1. **Build Check** + - Run the build command for this project + - If it fails, report errors and STOP + +2. **Type Check** + - Run TypeScript/type checker + - Report all errors with file:line + +3. **Lint Check** + - Run linter + - Report warnings and errors + +4. **Test Suite** + - Run all tests + - Report pass/fail count + - Report coverage percentage + +5. **Console.log Audit** + - Search for console.log in source files + - Report locations + +6. **Git Status** + - Show uncommitted changes + - Show files modified since last commit + +## Output + +Produce a concise verification report: + +``` +VERIFICATION: [PASS/FAIL] + +Build: [OK/FAIL] +Types: [OK/X errors] +Lint: [OK/X issues] +Tests: [X/Y passed, Z% coverage] +Secrets: [OK/X found] +Logs: [OK/X console.logs] + +Ready for PR: [YES/NO] +``` + +If any critical issues, list them with fix suggestions. + +## Arguments + +$ARGUMENTS can be: +- `quick` - Only build + types +- `full` - All checks (default) +- `pre-commit` - Checks relevant for commits +- `pre-pr` - Full checks plus security scan diff --git a/.claude/hooks/hooks.json b/.claude/hooks/hooks.json new file mode 100644 index 0000000..ea9cdc6 --- /dev/null +++ b/.claude/hooks/hooks.json @@ -0,0 +1,157 @@ +{ + "$schema": "https://json.schemastore.org/claude-code-settings.json", + "hooks": { + "PreToolUse": [ + { + "matcher": "tool == \"Bash\" && tool_input.command matches \"(npm run dev|pnpm( run)? dev|yarn dev|bun run dev)\"", + "hooks": [ + { + "type": "command", + "command": "node -e \"console.error('[Hook] BLOCKED: Dev server must run in tmux for log access');console.error('[Hook] Use: tmux new-session -d -s dev \\\"npm run dev\\\"');console.error('[Hook] Then: tmux attach -t dev');process.exit(1)\"" + } + ], + "description": "Block dev servers outside tmux - ensures you can access logs" + }, + { + "matcher": "tool == \"Bash\" && tool_input.command matches \"(npm (install|test)|pnpm (install|test)|yarn (install|test)?|bun (install|test)|cargo build|make|docker|pytest|vitest|playwright)\"", + "hooks": [ + { + "type": "command", + "command": "node -e \"if(!process.env.TMUX){console.error('[Hook] Consider running in tmux for session persistence');console.error('[Hook] tmux new -s dev | tmux attach -t dev')}\"" + } + ], + "description": "Reminder to use tmux for long-running commands" + }, + { + "matcher": "tool == \"Bash\" && tool_input.command matches \"git push\"", + "hooks": [ + { + "type": "command", + "command": "node -e \"console.error('[Hook] Review changes before push...');console.error('[Hook] Continuing with push (remove this hook to add interactive review)')\"" + } + ], + "description": "Reminder before git push to review changes" + }, + { + "matcher": "tool == \"Write\" && tool_input.file_path matches \"\\\\.(md|txt)$\" && !(tool_input.file_path matches \"README\\\\.md|CLAUDE\\\\.md|AGENTS\\\\.md|CONTRIBUTING\\\\.md\")", + "hooks": [ + { + "type": "command", + "command": "node -e \"const fs=require('fs');let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{const i=JSON.parse(d);const p=i.tool_input?.file_path||'';if(/\\.(md|txt)$/.test(p)&&!/(README|CLAUDE|AGENTS|CONTRIBUTING)\\.md$/.test(p)){console.error('[Hook] BLOCKED: Unnecessary documentation file creation');console.error('[Hook] File: '+p);console.error('[Hook] Use README.md for documentation instead');process.exit(1)}console.log(d)})\"" + } + ], + "description": "Block creation of random .md files - keeps docs consolidated" + }, + { + "matcher": "tool == \"Edit\" || tool == \"Write\"", + "hooks": [ + { + "type": "command", + "command": "node \"${CLAUDE_PLUGIN_ROOT}/scripts/hooks/suggest-compact.js\"" + } + ], + "description": "Suggest manual compaction at logical intervals" + } + ], + "PreCompact": [ + { + "matcher": "*", + "hooks": [ + { + "type": "command", + "command": "node \"${CLAUDE_PLUGIN_ROOT}/scripts/hooks/pre-compact.js\"" + } + ], + "description": "Save state before context compaction" + } + ], + "SessionStart": [ + { + "matcher": "*", + "hooks": [ + { + "type": "command", + "command": "node \"${CLAUDE_PLUGIN_ROOT}/scripts/hooks/session-start.js\"" + } + ], + "description": "Load previous context and detect package manager on new session" + } + ], + "PostToolUse": [ + { + "matcher": "tool == \"Bash\"", + "hooks": [ + { + "type": "command", + "command": "node -e \"let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{const i=JSON.parse(d);const cmd=i.tool_input?.command||'';if(/gh pr create/.test(cmd)){const out=i.tool_output?.output||'';const m=out.match(/https:\\/\\/github.com\\/[^/]+\\/[^/]+\\/pull\\/\\d+/);if(m){console.error('[Hook] PR created: '+m[0]);const repo=m[0].replace(/https:\\/\\/github.com\\/([^/]+\\/[^/]+)\\/pull\\/\\d+/,'$1');const pr=m[0].replace(/.*\\/pull\\/(\\d+)/,'$1');console.error('[Hook] To review: gh pr review '+pr+' --repo '+repo)}}console.log(d)})\"" + } + ], + "description": "Log PR URL and provide review command after PR creation" + }, + { + "matcher": "tool == \"Edit\" && tool_input.file_path matches \"\\\\.(ts|tsx|js|jsx)$\"", + "hooks": [ + { + "type": "command", + "command": "node -e \"const{execSync}=require('child_process');const fs=require('fs');let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{const i=JSON.parse(d);const p=i.tool_input?.file_path;if(p&&fs.existsSync(p)){try{execSync('npx prettier --write \"'+p+'\"',{stdio:['pipe','pipe','pipe']})}catch(e){}}console.log(d)})\"" + } + ], + "description": "Auto-format JS/TS files with Prettier after edits" + }, + { + "matcher": "tool == \"Edit\" && tool_input.file_path matches \"\\\\.(ts|tsx)$\"", + "hooks": [ + { + "type": "command", + "command": "node -e \"const{execSync}=require('child_process');const fs=require('fs');const path=require('path');let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{const i=JSON.parse(d);const p=i.tool_input?.file_path;if(p&&fs.existsSync(p)){let dir=path.dirname(p);while(dir!==path.dirname(dir)&&!fs.existsSync(path.join(dir,'tsconfig.json'))){dir=path.dirname(dir)}if(fs.existsSync(path.join(dir,'tsconfig.json'))){try{const r=execSync('npx tsc --noEmit --pretty false 2>&1',{cwd:dir,encoding:'utf8',stdio:['pipe','pipe','pipe']});const lines=r.split('\\n').filter(l=>l.includes(p)).slice(0,10);if(lines.length)console.error(lines.join('\\n'))}catch(e){const lines=(e.stdout||'').split('\\n').filter(l=>l.includes(p)).slice(0,10);if(lines.length)console.error(lines.join('\\n'))}}}console.log(d)})\"" + } + ], + "description": "TypeScript check after editing .ts/.tsx files" + }, + { + "matcher": "tool == \"Edit\" && tool_input.file_path matches \"\\\\.(ts|tsx|js|jsx)$\"", + "hooks": [ + { + "type": "command", + "command": "node -e \"const fs=require('fs');let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{const i=JSON.parse(d);const p=i.tool_input?.file_path;if(p&&fs.existsSync(p)){const c=fs.readFileSync(p,'utf8');const lines=c.split('\\n');const matches=[];lines.forEach((l,idx)=>{if(/console\\.log/.test(l))matches.push((idx+1)+': '+l.trim())});if(matches.length){console.error('[Hook] WARNING: console.log found in '+p);matches.slice(0,5).forEach(m=>console.error(m));console.error('[Hook] Remove console.log before committing')}}console.log(d)})\"" + } + ], + "description": "Warn about console.log statements after edits" + } + ], + "Stop": [ + { + "matcher": "*", + "hooks": [ + { + "type": "command", + "command": "node -e \"const{execSync}=require('child_process');const fs=require('fs');let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{try{execSync('git rev-parse --git-dir',{stdio:'pipe'})}catch{console.log(d);process.exit(0)}try{const files=execSync('git diff --name-only HEAD',{encoding:'utf8',stdio:['pipe','pipe','pipe']}).split('\\n').filter(f=>/\\.(ts|tsx|js|jsx)$/.test(f)&&fs.existsSync(f));let hasConsole=false;for(const f of files){if(fs.readFileSync(f,'utf8').includes('console.log')){console.error('[Hook] WARNING: console.log found in '+f);hasConsole=true}}if(hasConsole)console.error('[Hook] Remove console.log statements before committing')}catch(e){}console.log(d)})\"" + } + ], + "description": "Check for console.log in modified files after each response" + } + ], + "SessionEnd": [ + { + "matcher": "*", + "hooks": [ + { + "type": "command", + "command": "node \"${CLAUDE_PLUGIN_ROOT}/scripts/hooks/session-end.js\"" + } + ], + "description": "Persist session state on end" + }, + { + "matcher": "*", + "hooks": [ + { + "type": "command", + "command": "node \"${CLAUDE_PLUGIN_ROOT}/scripts/hooks/evaluate-session.js\"" + } + ], + "description": "Evaluate session for extractable patterns" + } + ] + } +} diff --git a/.claude/hooks/memory-persistence/pre-compact.sh b/.claude/hooks/memory-persistence/pre-compact.sh new file mode 100644 index 0000000..296fce9 --- /dev/null +++ b/.claude/hooks/memory-persistence/pre-compact.sh @@ -0,0 +1,36 @@ +#!/bin/bash +# PreCompact Hook - Save state before context compaction +# +# Runs before Claude compacts context, giving you a chance to +# preserve important state that might get lost in summarization. +# +# Hook config (in ~/.claude/settings.json): +# { +# "hooks": { +# "PreCompact": [{ +# "matcher": "*", +# "hooks": [{ +# "type": "command", +# "command": "~/.claude/hooks/memory-persistence/pre-compact.sh" +# }] +# }] +# } +# } + +SESSIONS_DIR="${HOME}/.claude/sessions" +COMPACTION_LOG="${SESSIONS_DIR}/compaction-log.txt" + +mkdir -p "$SESSIONS_DIR" + +# Log compaction event with timestamp +echo "[$(date '+%Y-%m-%d %H:%M:%S')] Context compaction triggered" >> "$COMPACTION_LOG" + +# If there's an active session file, note the compaction +ACTIVE_SESSION=$(ls -t "$SESSIONS_DIR"/*.tmp 2>/dev/null | head -1) +if [ -n "$ACTIVE_SESSION" ]; then + echo "" >> "$ACTIVE_SESSION" + echo "---" >> "$ACTIVE_SESSION" + echo "**[Compaction occurred at $(date '+%H:%M')]** - Context was summarized" >> "$ACTIVE_SESSION" +fi + +echo "[PreCompact] State saved before compaction" >&2 diff --git a/.claude/hooks/memory-persistence/session-end.sh b/.claude/hooks/memory-persistence/session-end.sh new file mode 100644 index 0000000..93b0f63 --- /dev/null +++ b/.claude/hooks/memory-persistence/session-end.sh @@ -0,0 +1,61 @@ +#!/bin/bash +# Stop Hook (Session End) - Persist learnings when session ends +# +# Runs when Claude session ends. Creates/updates session log file +# with timestamp for continuity tracking. +# +# Hook config (in ~/.claude/settings.json): +# { +# "hooks": { +# "Stop": [{ +# "matcher": "*", +# "hooks": [{ +# "type": "command", +# "command": "~/.claude/hooks/memory-persistence/session-end.sh" +# }] +# }] +# } +# } + +SESSIONS_DIR="${HOME}/.claude/sessions" +TODAY=$(date '+%Y-%m-%d') +SESSION_FILE="${SESSIONS_DIR}/${TODAY}-session.tmp" + +mkdir -p "$SESSIONS_DIR" + +# If session file exists for today, update the end time +if [ -f "$SESSION_FILE" ]; then + # Update Last Updated timestamp + sed -i '' "s/\*\*Last Updated:\*\*.*/\*\*Last Updated:\*\* $(date '+%H:%M')/" "$SESSION_FILE" 2>/dev/null || \ + sed -i "s/\*\*Last Updated:\*\*.*/\*\*Last Updated:\*\* $(date '+%H:%M')/" "$SESSION_FILE" 2>/dev/null + echo "[SessionEnd] Updated session file: $SESSION_FILE" >&2 +else + # Create new session file with template + cat > "$SESSION_FILE" << EOF +# Session: $(date '+%Y-%m-%d') +**Date:** $TODAY +**Started:** $(date '+%H:%M') +**Last Updated:** $(date '+%H:%M') + +--- + +## Current State + +[Session context goes here] + +### Completed +- [ ] + +### In Progress +- [ ] + +### Notes for Next Session +- + +### Context to Load +\`\`\` +[relevant files] +\`\`\` +EOF + echo "[SessionEnd] Created session file: $SESSION_FILE" >&2 +fi diff --git a/.claude/hooks/memory-persistence/session-start.sh b/.claude/hooks/memory-persistence/session-start.sh new file mode 100644 index 0000000..57a8c14 --- /dev/null +++ b/.claude/hooks/memory-persistence/session-start.sh @@ -0,0 +1,37 @@ +#!/bin/bash +# SessionStart Hook - Load previous context on new session +# +# Runs when a new Claude session starts. Checks for recent session +# files and notifies Claude of available context to load. +# +# Hook config (in ~/.claude/settings.json): +# { +# "hooks": { +# "SessionStart": [{ +# "matcher": "*", +# "hooks": [{ +# "type": "command", +# "command": "~/.claude/hooks/memory-persistence/session-start.sh" +# }] +# }] +# } +# } + +SESSIONS_DIR="${HOME}/.claude/sessions" +LEARNED_DIR="${HOME}/.claude/skills/learned" + +# Check for recent session files (last 7 days) +recent_sessions=$(find "$SESSIONS_DIR" -name "*.tmp" -mtime -7 2>/dev/null | wc -l | tr -d ' ') + +if [ "$recent_sessions" -gt 0 ]; then + latest=$(ls -t "$SESSIONS_DIR"/*.tmp 2>/dev/null | head -1) + echo "[SessionStart] Found $recent_sessions recent session(s)" >&2 + echo "[SessionStart] Latest: $latest" >&2 +fi + +# Check for learned skills +learned_count=$(find "$LEARNED_DIR" -name "*.md" 2>/dev/null | wc -l | tr -d ' ') + +if [ "$learned_count" -gt 0 ]; then + echo "[SessionStart] $learned_count learned skill(s) available in $LEARNED_DIR" >&2 +fi diff --git a/.claude/hooks/strategic-compact/suggest-compact.sh b/.claude/hooks/strategic-compact/suggest-compact.sh new file mode 100644 index 0000000..ea14920 --- /dev/null +++ b/.claude/hooks/strategic-compact/suggest-compact.sh @@ -0,0 +1,52 @@ +#!/bin/bash +# Strategic Compact Suggester +# Runs on PreToolUse or periodically to suggest manual compaction at logical intervals +# +# Why manual over auto-compact: +# - Auto-compact happens at arbitrary points, often mid-task +# - Strategic compacting preserves context through logical phases +# - Compact after exploration, before execution +# - Compact after completing a milestone, before starting next +# +# Hook config (in ~/.claude/settings.json): +# { +# "hooks": { +# "PreToolUse": [{ +# "matcher": "Edit|Write", +# "hooks": [{ +# "type": "command", +# "command": "~/.claude/skills/strategic-compact/suggest-compact.sh" +# }] +# }] +# } +# } +# +# Criteria for suggesting compact: +# - Session has been running for extended period +# - Large number of tool calls made +# - Transitioning from research/exploration to implementation +# - Plan has been finalized + +# Track tool call count (increment in a temp file) +COUNTER_FILE="/tmp/claude-tool-count-$$" +THRESHOLD=${COMPACT_THRESHOLD:-50} + +# Initialize or increment counter +if [ -f "$COUNTER_FILE" ]; then + count=$(cat "$COUNTER_FILE") + count=$((count + 1)) + echo "$count" > "$COUNTER_FILE" +else + echo "1" > "$COUNTER_FILE" + count=1 +fi + +# Suggest compact after threshold tool calls +if [ "$count" -eq "$THRESHOLD" ]; then + echo "[StrategicCompact] $THRESHOLD tool calls reached - consider /compact if transitioning phases" >&2 +fi + +# Suggest at regular intervals after threshold +if [ "$count" -gt "$THRESHOLD" ] && [ $((count % 25)) -eq 0 ]; then + echo "[StrategicCompact] $count tool calls - good checkpoint for /compact if context is stale" >&2 +fi diff --git a/.claude/settings.local.json b/.claude/settings.local.json index 3c93f74..b4a6c77 100644 --- a/.claude/settings.local.json +++ b/.claude/settings.local.json @@ -75,7 +75,13 @@ "Bash(wsl -e bash -c \"ls -la /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2/data/dataset/train/\")", "Bash(wsl -e bash -c \"ls -la /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2/data/structured_data/*.csv 2>/dev/null | head -20\")", "Bash(tasklist:*)", - "Bash(findstr:*)" + "Bash(findstr:*)", + "Bash(wsl bash -c \"ps aux | grep -E ''python.*train'' | grep -v grep\")", + "Bash(wsl bash -c \"ls -la /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2/runs/train/invoice_fields/\")", + "Bash(wsl bash -c \"cat /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2/runs/train/invoice_fields/results.csv\")", + "Bash(wsl bash -c \"ls -la /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2/runs/train/invoice_fields/weights/\")", + "Bash(wsl bash -c \"cat ''/mnt/c/Users/yaoji/AppData/Local/Temp/claude/c--Users-yaoji-git-ColaCoder-invoice-master-poc-v2/tasks/b8d8565.output'' 2>/dev/null | tail -100\")", + "Bash(wsl bash -c:*)" ], "deny": [], "ask": [], diff --git a/.claude/skills/backend-patterns/SKILL.md b/.claude/skills/backend-patterns/SKILL.md new file mode 100644 index 0000000..53bf07e --- /dev/null +++ b/.claude/skills/backend-patterns/SKILL.md @@ -0,0 +1,314 @@ +# Backend Development Patterns + +Backend architecture patterns for Python/FastAPI/PostgreSQL applications. + +## API Design + +### RESTful Structure + +``` +GET /api/v1/documents # List +GET /api/v1/documents/{id} # Get +POST /api/v1/documents # Create +PUT /api/v1/documents/{id} # Replace +PATCH /api/v1/documents/{id} # Update +DELETE /api/v1/documents/{id} # Delete + +GET /api/v1/documents?status=processed&sort=created_at&limit=20&offset=0 +``` + +### FastAPI Route Pattern + +```python +from fastapi import APIRouter, HTTPException, Depends, Query, File, UploadFile +from pydantic import BaseModel + +router = APIRouter(prefix="/api/v1", tags=["inference"]) + +@router.post("/infer", response_model=ApiResponse[InferenceResult]) +async def infer_document( + file: UploadFile = File(...), + confidence_threshold: float = Query(0.5, ge=0, le=1), + service: InferenceService = Depends(get_inference_service) +) -> ApiResponse[InferenceResult]: + result = await service.process(file, confidence_threshold) + return ApiResponse(success=True, data=result) +``` + +### Consistent Response Schema + +```python +from typing import Generic, TypeVar +T = TypeVar('T') + +class ApiResponse(BaseModel, Generic[T]): + success: bool + data: T | None = None + error: str | None = None + meta: dict | None = None +``` + +## Core Patterns + +### Repository Pattern + +```python +from typing import Protocol + +class DocumentRepository(Protocol): + def find_all(self, filters: dict | None = None) -> list[Document]: ... + def find_by_id(self, id: str) -> Document | None: ... + def create(self, data: dict) -> Document: ... + def update(self, id: str, data: dict) -> Document: ... + def delete(self, id: str) -> None: ... +``` + +### Service Layer + +```python +class InferenceService: + def __init__(self, model_path: str, use_gpu: bool = True): + self.pipeline = InferencePipeline(model_path=model_path, use_gpu=use_gpu) + + async def process(self, file: UploadFile, confidence_threshold: float) -> InferenceResult: + temp_path = self._save_temp_file(file) + try: + return self.pipeline.process_pdf(temp_path) + finally: + temp_path.unlink(missing_ok=True) +``` + +### Dependency Injection + +```python +from functools import lru_cache +from pydantic_settings import BaseSettings + +class Settings(BaseSettings): + db_host: str = "localhost" + db_password: str + model_path: str = "runs/train/invoice_fields/weights/best.pt" + class Config: + env_file = ".env" + +@lru_cache() +def get_settings() -> Settings: + return Settings() + +def get_inference_service(settings: Settings = Depends(get_settings)) -> InferenceService: + return InferenceService(model_path=settings.model_path) +``` + +## Database Patterns + +### Connection Pooling + +```python +from psycopg2 import pool +from contextlib import contextmanager + +db_pool = pool.ThreadedConnectionPool(minconn=2, maxconn=10, **db_config) + +@contextmanager +def get_db_connection(): + conn = db_pool.getconn() + try: + yield conn + finally: + db_pool.putconn(conn) +``` + +### Query Optimization + +```python +# GOOD: Select only needed columns +cur.execute(""" + SELECT id, status, fields->>'InvoiceNumber' as invoice_number + FROM documents WHERE status = %s + ORDER BY created_at DESC LIMIT %s +""", ('processed', 10)) + +# BAD: SELECT * FROM documents +``` + +### N+1 Prevention + +```python +# BAD: N+1 queries +for doc in documents: + doc.labels = get_labels(doc.id) # N queries + +# GOOD: Batch fetch with JOIN +cur.execute(""" + SELECT d.id, d.status, array_agg(l.label) as labels + FROM documents d + LEFT JOIN document_labels l ON d.id = l.document_id + GROUP BY d.id, d.status +""") +``` + +### Transaction Pattern + +```python +def create_document_with_labels(doc_data: dict, labels: list[dict]) -> str: + with get_db_connection() as conn: + try: + with conn.cursor() as cur: + cur.execute("INSERT INTO documents ... RETURNING id", ...) + doc_id = cur.fetchone()[0] + for label in labels: + cur.execute("INSERT INTO document_labels ...", ...) + conn.commit() + return doc_id + except Exception: + conn.rollback() + raise +``` + +## Caching + +```python +from cachetools import TTLCache + +_cache = TTLCache(maxsize=1000, ttl=300) + +def get_document_cached(doc_id: str) -> Document | None: + if doc_id in _cache: + return _cache[doc_id] + doc = repo.find_by_id(doc_id) + if doc: + _cache[doc_id] = doc + return doc +``` + +## Error Handling + +### Exception Hierarchy + +```python +class AppError(Exception): + def __init__(self, message: str, status_code: int = 500): + self.message = message + self.status_code = status_code + +class NotFoundError(AppError): + def __init__(self, resource: str, id: str): + super().__init__(f"{resource} not found: {id}", 404) + +class ValidationError(AppError): + def __init__(self, message: str): + super().__init__(message, 400) +``` + +### FastAPI Exception Handler + +```python +@app.exception_handler(AppError) +async def app_error_handler(request: Request, exc: AppError): + return JSONResponse(status_code=exc.status_code, content={"success": False, "error": exc.message}) + +@app.exception_handler(Exception) +async def generic_error_handler(request: Request, exc: Exception): + logger.error(f"Unexpected error: {exc}", exc_info=True) + return JSONResponse(status_code=500, content={"success": False, "error": "Internal server error"}) +``` + +### Retry with Backoff + +```python +async def retry_with_backoff(fn, max_retries: int = 3, base_delay: float = 1.0): + last_error = None + for attempt in range(max_retries): + try: + return await fn() if asyncio.iscoroutinefunction(fn) else fn() + except Exception as e: + last_error = e + if attempt < max_retries - 1: + await asyncio.sleep(base_delay * (2 ** attempt)) + raise last_error +``` + +## Rate Limiting + +```python +from time import time +from collections import defaultdict + +class RateLimiter: + def __init__(self): + self.requests: dict[str, list[float]] = defaultdict(list) + + def check_limit(self, identifier: str, max_requests: int, window_sec: int) -> bool: + now = time() + self.requests[identifier] = [t for t in self.requests[identifier] if now - t < window_sec] + if len(self.requests[identifier]) >= max_requests: + return False + self.requests[identifier].append(now) + return True + +limiter = RateLimiter() + +@app.middleware("http") +async def rate_limit_middleware(request: Request, call_next): + ip = request.client.host + if not limiter.check_limit(ip, max_requests=100, window_sec=60): + return JSONResponse(status_code=429, content={"error": "Rate limit exceeded"}) + return await call_next(request) +``` + +## Logging & Middleware + +### Request Logging + +```python +@app.middleware("http") +async def log_requests(request: Request, call_next): + request_id = str(uuid.uuid4())[:8] + start_time = time.time() + logger.info(f"[{request_id}] {request.method} {request.url.path}") + response = await call_next(request) + duration_ms = (time.time() - start_time) * 1000 + logger.info(f"[{request_id}] Completed {response.status_code} in {duration_ms:.2f}ms") + return response +``` + +### Structured Logging + +```python +class JSONFormatter(logging.Formatter): + def format(self, record): + return json.dumps({ + "timestamp": datetime.utcnow().isoformat(), + "level": record.levelname, + "message": record.getMessage(), + "module": record.module, + }) +``` + +## Background Tasks + +```python +from fastapi import BackgroundTasks + +def send_notification(document_id: str, status: str): + logger.info(f"Notification: {document_id} -> {status}") + +@router.post("/infer") +async def infer(file: UploadFile, background_tasks: BackgroundTasks): + result = await process_document(file) + background_tasks.add_task(send_notification, result.document_id, "completed") + return result +``` + +## Key Principles + +- Repository pattern: Abstract data access +- Service layer: Business logic separated from routes +- Dependency injection via `Depends()` +- Connection pooling for database +- Parameterized queries only (no f-strings in SQL) +- Batch fetch to prevent N+1 +- Consistent `ApiResponse[T]` format +- Exception hierarchy with proper status codes +- Rate limit by IP +- Structured logging with request ID \ No newline at end of file diff --git a/.claude/skills/coding-standards/SKILL.md b/.claude/skills/coding-standards/SKILL.md new file mode 100644 index 0000000..4bb9b71 --- /dev/null +++ b/.claude/skills/coding-standards/SKILL.md @@ -0,0 +1,665 @@ +--- +name: coding-standards +description: Universal coding standards, best practices, and patterns for Python, FastAPI, and data processing development. +--- + +# Coding Standards & Best Practices + +Python coding standards for the Invoice Master project. + +## Code Quality Principles + +### 1. Readability First +- Code is read more than written +- Clear variable and function names +- Self-documenting code preferred over comments +- Consistent formatting (follow PEP 8) + +### 2. KISS (Keep It Simple, Stupid) +- Simplest solution that works +- Avoid over-engineering +- No premature optimization +- Easy to understand > clever code + +### 3. DRY (Don't Repeat Yourself) +- Extract common logic into functions +- Create reusable utilities +- Share modules across the codebase +- Avoid copy-paste programming + +### 4. YAGNI (You Aren't Gonna Need It) +- Don't build features before they're needed +- Avoid speculative generality +- Add complexity only when required +- Start simple, refactor when needed + +## Python Standards + +### Variable Naming + +```python +# GOOD: Descriptive names +invoice_number = "INV-2024-001" +is_valid_document = True +total_confidence_score = 0.95 + +# BAD: Unclear names +inv = "INV-2024-001" +flag = True +x = 0.95 +``` + +### Function Naming + +```python +# GOOD: Verb-noun pattern with type hints +def extract_invoice_fields(pdf_path: Path) -> dict[str, str]: + """Extract fields from invoice PDF.""" + ... + +def calculate_confidence(predictions: list[float]) -> float: + """Calculate average confidence score.""" + ... + +def is_valid_bankgiro(value: str) -> bool: + """Check if value is valid Bankgiro number.""" + ... + +# BAD: Unclear or noun-only +def invoice(path): + ... + +def confidence(p): + ... + +def bankgiro(v): + ... +``` + +### Type Hints (REQUIRED) + +```python +# GOOD: Full type annotations +from typing import Optional +from pathlib import Path +from dataclasses import dataclass + +@dataclass +class InferenceResult: + document_id: str + fields: dict[str, str] + confidence: dict[str, float] + processing_time_ms: float + +def process_document( + pdf_path: Path, + confidence_threshold: float = 0.5 +) -> InferenceResult: + """Process PDF and return extracted fields.""" + ... + +# BAD: No type hints +def process_document(pdf_path, confidence_threshold=0.5): + ... +``` + +### Immutability Pattern (CRITICAL) + +```python +# GOOD: Create new objects, don't mutate +def update_fields(fields: dict[str, str], updates: dict[str, str]) -> dict[str, str]: + return {**fields, **updates} + +def add_item(items: list[str], new_item: str) -> list[str]: + return [*items, new_item] + +# BAD: Direct mutation +def update_fields(fields: dict[str, str], updates: dict[str, str]) -> dict[str, str]: + fields.update(updates) # MUTATION! + return fields + +def add_item(items: list[str], new_item: str) -> list[str]: + items.append(new_item) # MUTATION! + return items +``` + +### Error Handling + +```python +import logging + +logger = logging.getLogger(__name__) + +# GOOD: Comprehensive error handling with logging +def load_model(model_path: Path) -> Model: + """Load YOLO model from path.""" + try: + if not model_path.exists(): + raise FileNotFoundError(f"Model not found: {model_path}") + + model = YOLO(str(model_path)) + logger.info(f"Model loaded: {model_path}") + return model + except Exception as e: + logger.error(f"Failed to load model: {e}") + raise RuntimeError(f"Model loading failed: {model_path}") from e + +# BAD: No error handling +def load_model(model_path): + return YOLO(str(model_path)) + +# BAD: Bare except +def load_model(model_path): + try: + return YOLO(str(model_path)) + except: # Never use bare except! + return None +``` + +### Async Best Practices + +```python +import asyncio + +# GOOD: Parallel execution when possible +async def process_batch(pdf_paths: list[Path]) -> list[InferenceResult]: + tasks = [process_document(path) for path in pdf_paths] + results = await asyncio.gather(*tasks, return_exceptions=True) + + # Handle exceptions + valid_results = [] + for path, result in zip(pdf_paths, results): + if isinstance(result, Exception): + logger.error(f"Failed to process {path}: {result}") + else: + valid_results.append(result) + return valid_results + +# BAD: Sequential when unnecessary +async def process_batch(pdf_paths: list[Path]) -> list[InferenceResult]: + results = [] + for path in pdf_paths: + result = await process_document(path) + results.append(result) + return results +``` + +### Context Managers + +```python +from contextlib import contextmanager +from pathlib import Path +import tempfile + +# GOOD: Proper resource management +@contextmanager +def temp_pdf_copy(pdf_path: Path): + """Create temporary copy of PDF for processing.""" + with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp: + tmp.write(pdf_path.read_bytes()) + tmp_path = Path(tmp.name) + try: + yield tmp_path + finally: + tmp_path.unlink(missing_ok=True) + +# Usage +with temp_pdf_copy(original_pdf) as tmp_pdf: + result = process_pdf(tmp_pdf) +``` + +## FastAPI Best Practices + +### Route Structure + +```python +from fastapi import APIRouter, HTTPException, Depends, Query, File, UploadFile +from pydantic import BaseModel + +router = APIRouter(prefix="/api/v1", tags=["inference"]) + +class InferenceResponse(BaseModel): + success: bool + document_id: str + fields: dict[str, str] + confidence: dict[str, float] + processing_time_ms: float + +@router.post("/infer", response_model=InferenceResponse) +async def infer_document( + file: UploadFile = File(...), + confidence_threshold: float = Query(0.5, ge=0.0, le=1.0) +) -> InferenceResponse: + """Process invoice PDF and extract fields.""" + if not file.filename.endswith(".pdf"): + raise HTTPException(status_code=400, detail="Only PDF files accepted") + + result = await inference_service.process(file, confidence_threshold) + return InferenceResponse( + success=True, + document_id=result.document_id, + fields=result.fields, + confidence=result.confidence, + processing_time_ms=result.processing_time_ms + ) +``` + +### Input Validation with Pydantic + +```python +from pydantic import BaseModel, Field, field_validator +from datetime import date +import re + +class InvoiceData(BaseModel): + invoice_number: str = Field(..., min_length=1, max_length=50) + invoice_date: date + amount: float = Field(..., gt=0) + bankgiro: str | None = None + ocr_number: str | None = None + + @field_validator("bankgiro") + @classmethod + def validate_bankgiro(cls, v: str | None) -> str | None: + if v is None: + return None + # Bankgiro: 7-8 digits + cleaned = re.sub(r"[^0-9]", "", v) + if not (7 <= len(cleaned) <= 8): + raise ValueError("Bankgiro must be 7-8 digits") + return cleaned + + @field_validator("ocr_number") + @classmethod + def validate_ocr(cls, v: str | None) -> str | None: + if v is None: + return None + # OCR: 2-25 digits + cleaned = re.sub(r"[^0-9]", "", v) + if not (2 <= len(cleaned) <= 25): + raise ValueError("OCR must be 2-25 digits") + return cleaned +``` + +### Response Format + +```python +from pydantic import BaseModel +from typing import Generic, TypeVar + +T = TypeVar("T") + +class ApiResponse(BaseModel, Generic[T]): + success: bool + data: T | None = None + error: str | None = None + meta: dict | None = None + +# Success response +return ApiResponse( + success=True, + data=result, + meta={"processing_time_ms": elapsed_ms} +) + +# Error response +return ApiResponse( + success=False, + error="Invalid PDF format" +) +``` + +## File Organization + +### Project Structure + +``` +src/ +├── cli/ # Command-line interfaces +│ ├── autolabel.py +│ ├── train.py +│ └── infer.py +├── pdf/ # PDF processing +│ ├── extractor.py +│ └── renderer.py +├── ocr/ # OCR processing +│ ├── paddle_ocr.py +│ └── machine_code_parser.py +├── inference/ # Inference pipeline +│ ├── pipeline.py +│ ├── yolo_detector.py +│ └── field_extractor.py +├── normalize/ # Field normalization +│ ├── base.py +│ ├── date_normalizer.py +│ └── amount_normalizer.py +├── web/ # FastAPI application +│ ├── app.py +│ ├── routes.py +│ ├── services.py +│ └── schemas.py +└── utils/ # Shared utilities + ├── validators.py + ├── text_cleaner.py + └── logging.py +tests/ # Mirror of src structure + ├── test_pdf/ + ├── test_ocr/ + └── test_inference/ +``` + +### File Naming + +``` +src/ocr/paddle_ocr.py # snake_case for modules +src/inference/yolo_detector.py # snake_case for modules +tests/test_paddle_ocr.py # test_ prefix for tests +config.py # snake_case for config +``` + +### Module Size Guidelines + +- **Maximum**: 800 lines per file +- **Typical**: 200-400 lines per file +- **Functions**: Max 50 lines each +- Extract utilities when modules grow too large + +## Comments & Documentation + +### When to Comment + +```python +# GOOD: Explain WHY, not WHAT +# Swedish Bankgiro uses Luhn algorithm with weight [1,2,1,2...] +def validate_bankgiro_checksum(bankgiro: str) -> bool: + ... + +# Payment line format: 7 groups separated by #, checksum at end +def parse_payment_line(line: str) -> PaymentLineData: + ... + +# BAD: Stating the obvious +# Increment counter by 1 +count += 1 + +# Set name to user's name +name = user.name +``` + +### Docstrings for Public APIs + +```python +def extract_invoice_fields( + pdf_path: Path, + confidence_threshold: float = 0.5, + use_gpu: bool = True +) -> InferenceResult: + """Extract structured fields from Swedish invoice PDF. + + Uses YOLOv11 for field detection and PaddleOCR for text extraction. + Applies field-specific normalization and validation. + + Args: + pdf_path: Path to the invoice PDF file. + confidence_threshold: Minimum confidence for field detection (0.0-1.0). + use_gpu: Whether to use GPU acceleration. + + Returns: + InferenceResult containing extracted fields and confidence scores. + + Raises: + FileNotFoundError: If PDF file doesn't exist. + ProcessingError: If OCR or detection fails. + + Example: + >>> result = extract_invoice_fields(Path("invoice.pdf")) + >>> print(result.fields["invoice_number"]) + "INV-2024-001" + """ + ... +``` + +## Performance Best Practices + +### Caching + +```python +from functools import lru_cache +from cachetools import TTLCache + +# Static data: LRU cache +@lru_cache(maxsize=100) +def get_field_config(field_name: str) -> FieldConfig: + """Load field configuration (cached).""" + return load_config(field_name) + +# Dynamic data: TTL cache +_document_cache = TTLCache(maxsize=1000, ttl=300) # 5 minutes + +def get_document_cached(doc_id: str) -> Document | None: + if doc_id in _document_cache: + return _document_cache[doc_id] + + doc = repo.find_by_id(doc_id) + if doc: + _document_cache[doc_id] = doc + return doc +``` + +### Database Queries + +```python +# GOOD: Select only needed columns +cur.execute(""" + SELECT id, status, fields->>'invoice_number' + FROM documents + WHERE status = %s + LIMIT %s +""", ('processed', 10)) + +# BAD: Select everything +cur.execute("SELECT * FROM documents") + +# GOOD: Batch operations +cur.executemany( + "INSERT INTO labels (doc_id, field, value) VALUES (%s, %s, %s)", + [(doc_id, f, v) for f, v in fields.items()] +) + +# BAD: Individual inserts in loop +for field, value in fields.items(): + cur.execute("INSERT INTO labels ...", (doc_id, field, value)) +``` + +### Lazy Loading + +```python +class InferencePipeline: + def __init__(self, model_path: Path): + self.model_path = model_path + self._model: YOLO | None = None + self._ocr: PaddleOCR | None = None + + @property + def model(self) -> YOLO: + """Lazy load YOLO model.""" + if self._model is None: + self._model = YOLO(str(self.model_path)) + return self._model + + @property + def ocr(self) -> PaddleOCR: + """Lazy load PaddleOCR.""" + if self._ocr is None: + self._ocr = PaddleOCR(use_angle_cls=True, lang="latin") + return self._ocr +``` + +## Testing Standards + +### Test Structure (AAA Pattern) + +```python +def test_extract_bankgiro_valid(): + # Arrange + text = "Bankgiro: 123-4567" + + # Act + result = extract_bankgiro(text) + + # Assert + assert result == "1234567" + +def test_extract_bankgiro_invalid_returns_none(): + # Arrange + text = "No bankgiro here" + + # Act + result = extract_bankgiro(text) + + # Assert + assert result is None +``` + +### Test Naming + +```python +# GOOD: Descriptive test names +def test_parse_payment_line_extracts_all_fields(): ... +def test_parse_payment_line_handles_missing_checksum(): ... +def test_validate_ocr_returns_false_for_invalid_checksum(): ... + +# BAD: Vague test names +def test_parse(): ... +def test_works(): ... +def test_payment_line(): ... +``` + +### Fixtures + +```python +import pytest +from pathlib import Path + +@pytest.fixture +def sample_invoice_pdf(tmp_path: Path) -> Path: + """Create sample invoice PDF for testing.""" + pdf_path = tmp_path / "invoice.pdf" + # Create test PDF... + return pdf_path + +@pytest.fixture +def inference_pipeline(sample_model_path: Path) -> InferencePipeline: + """Create inference pipeline with test model.""" + return InferencePipeline(sample_model_path) + +def test_process_invoice(inference_pipeline, sample_invoice_pdf): + result = inference_pipeline.process(sample_invoice_pdf) + assert result.fields.get("invoice_number") is not None +``` + +## Code Smell Detection + +### 1. Long Functions + +```python +# BAD: Function > 50 lines +def process_document(): + # 100 lines of code... + +# GOOD: Split into smaller functions +def process_document(pdf_path: Path) -> InferenceResult: + image = render_pdf(pdf_path) + detections = detect_fields(image) + ocr_results = extract_text(image, detections) + fields = normalize_fields(ocr_results) + return build_result(fields) +``` + +### 2. Deep Nesting + +```python +# BAD: 5+ levels of nesting +if document: + if document.is_valid: + if document.has_fields: + if field in document.fields: + if document.fields[field]: + # Do something + +# GOOD: Early returns +if not document: + return None +if not document.is_valid: + return None +if not document.has_fields: + return None +if field not in document.fields: + return None +if not document.fields[field]: + return None + +# Do something +``` + +### 3. Magic Numbers + +```python +# BAD: Unexplained numbers +if confidence > 0.5: + ... +time.sleep(3) + +# GOOD: Named constants +CONFIDENCE_THRESHOLD = 0.5 +RETRY_DELAY_SECONDS = 3 + +if confidence > CONFIDENCE_THRESHOLD: + ... +time.sleep(RETRY_DELAY_SECONDS) +``` + +### 4. Mutable Default Arguments + +```python +# BAD: Mutable default argument +def process_fields(fields: list = []): # DANGEROUS! + fields.append("new_field") + return fields + +# GOOD: Use None as default +def process_fields(fields: list | None = None) -> list: + if fields is None: + fields = [] + return [*fields, "new_field"] +``` + +## Logging Standards + +```python +import logging + +# Module-level logger +logger = logging.getLogger(__name__) + +# GOOD: Appropriate log levels +logger.debug("Processing document: %s", doc_id) +logger.info("Document processed successfully: %s", doc_id) +logger.warning("Low confidence score: %.2f", confidence) +logger.error("Failed to process document: %s", error) + +# GOOD: Structured logging with extra data +logger.info( + "Inference complete", + extra={ + "document_id": doc_id, + "field_count": len(fields), + "processing_time_ms": elapsed_ms + } +) + +# BAD: Using print() +print(f"Processing {doc_id}") # Never in production! +``` + +**Remember**: Code quality is not negotiable. Clear, maintainable Python code with proper type hints enables confident development and refactoring. diff --git a/.claude/skills/continuous-learning/SKILL.md b/.claude/skills/continuous-learning/SKILL.md new file mode 100644 index 0000000..84a88dd --- /dev/null +++ b/.claude/skills/continuous-learning/SKILL.md @@ -0,0 +1,80 @@ +--- +name: continuous-learning +description: Automatically extract reusable patterns from Claude Code sessions and save them as learned skills for future use. +--- + +# Continuous Learning Skill + +Automatically evaluates Claude Code sessions on end to extract reusable patterns that can be saved as learned skills. + +## How It Works + +This skill runs as a **Stop hook** at the end of each session: + +1. **Session Evaluation**: Checks if session has enough messages (default: 10+) +2. **Pattern Detection**: Identifies extractable patterns from the session +3. **Skill Extraction**: Saves useful patterns to `~/.claude/skills/learned/` + +## Configuration + +Edit `config.json` to customize: + +```json +{ + "min_session_length": 10, + "extraction_threshold": "medium", + "auto_approve": false, + "learned_skills_path": "~/.claude/skills/learned/", + "patterns_to_detect": [ + "error_resolution", + "user_corrections", + "workarounds", + "debugging_techniques", + "project_specific" + ], + "ignore_patterns": [ + "simple_typos", + "one_time_fixes", + "external_api_issues" + ] +} +``` + +## Pattern Types + +| Pattern | Description | +|---------|-------------| +| `error_resolution` | How specific errors were resolved | +| `user_corrections` | Patterns from user corrections | +| `workarounds` | Solutions to framework/library quirks | +| `debugging_techniques` | Effective debugging approaches | +| `project_specific` | Project-specific conventions | + +## Hook Setup + +Add to your `~/.claude/settings.json`: + +```json +{ + "hooks": { + "Stop": [{ + "matcher": "*", + "hooks": [{ + "type": "command", + "command": "~/.claude/skills/continuous-learning/evaluate-session.sh" + }] + }] + } +} +``` + +## Why Stop Hook? + +- **Lightweight**: Runs once at session end +- **Non-blocking**: Doesn't add latency to every message +- **Complete context**: Has access to full session transcript + +## Related + +- [The Longform Guide](https://x.com/affaanmustafa/status/2014040193557471352) - Section on continuous learning +- `/learn` command - Manual pattern extraction mid-session diff --git a/.claude/skills/continuous-learning/config.json b/.claude/skills/continuous-learning/config.json new file mode 100644 index 0000000..1094b7e --- /dev/null +++ b/.claude/skills/continuous-learning/config.json @@ -0,0 +1,18 @@ +{ + "min_session_length": 10, + "extraction_threshold": "medium", + "auto_approve": false, + "learned_skills_path": "~/.claude/skills/learned/", + "patterns_to_detect": [ + "error_resolution", + "user_corrections", + "workarounds", + "debugging_techniques", + "project_specific" + ], + "ignore_patterns": [ + "simple_typos", + "one_time_fixes", + "external_api_issues" + ] +} diff --git a/.claude/skills/continuous-learning/evaluate-session.sh b/.claude/skills/continuous-learning/evaluate-session.sh new file mode 100644 index 0000000..f13208a --- /dev/null +++ b/.claude/skills/continuous-learning/evaluate-session.sh @@ -0,0 +1,60 @@ +#!/bin/bash +# Continuous Learning - Session Evaluator +# Runs on Stop hook to extract reusable patterns from Claude Code sessions +# +# Why Stop hook instead of UserPromptSubmit: +# - Stop runs once at session end (lightweight) +# - UserPromptSubmit runs every message (heavy, adds latency) +# +# Hook config (in ~/.claude/settings.json): +# { +# "hooks": { +# "Stop": [{ +# "matcher": "*", +# "hooks": [{ +# "type": "command", +# "command": "~/.claude/skills/continuous-learning/evaluate-session.sh" +# }] +# }] +# } +# } +# +# Patterns to detect: error_resolution, debugging_techniques, workarounds, project_specific +# Patterns to ignore: simple_typos, one_time_fixes, external_api_issues +# Extracted skills saved to: ~/.claude/skills/learned/ + +set -e + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +CONFIG_FILE="$SCRIPT_DIR/config.json" +LEARNED_SKILLS_PATH="${HOME}/.claude/skills/learned" +MIN_SESSION_LENGTH=10 + +# Load config if exists +if [ -f "$CONFIG_FILE" ]; then + MIN_SESSION_LENGTH=$(jq -r '.min_session_length // 10' "$CONFIG_FILE") + LEARNED_SKILLS_PATH=$(jq -r '.learned_skills_path // "~/.claude/skills/learned/"' "$CONFIG_FILE" | sed "s|~|$HOME|") +fi + +# Ensure learned skills directory exists +mkdir -p "$LEARNED_SKILLS_PATH" + +# Get transcript path from environment (set by Claude Code) +transcript_path="${CLAUDE_TRANSCRIPT_PATH:-}" + +if [ -z "$transcript_path" ] || [ ! -f "$transcript_path" ]; then + exit 0 +fi + +# Count messages in session +message_count=$(grep -c '"type":"user"' "$transcript_path" 2>/dev/null || echo "0") + +# Skip short sessions +if [ "$message_count" -lt "$MIN_SESSION_LENGTH" ]; then + echo "[ContinuousLearning] Session too short ($message_count messages), skipping" >&2 + exit 0 +fi + +# Signal to Claude that session should be evaluated for extractable patterns +echo "[ContinuousLearning] Session has $message_count messages - evaluate for extractable patterns" >&2 +echo "[ContinuousLearning] Save learned skills to: $LEARNED_SKILLS_PATH" >&2 diff --git a/.claude/skills/dev-builder/SKILL.md b/.claude/skills/dev-builder/SKILL.md deleted file mode 100644 index 40a5ca1..0000000 --- a/.claude/skills/dev-builder/SKILL.md +++ /dev/null @@ -1,245 +0,0 @@ ---- -name: dev-builder -description: 根据 Product-Spec.md 初始化项目、安装依赖、实现代码。与 product-spec-builder 配套使用,帮助用户将需求文档转化为可运行的代码项目。 ---- - -[角色] - 你是一位经验丰富的全栈开发工程师。 - - 你能够根据产品需求文档快速搭建项目,选择合适的技术栈,编写高质量的代码。你注重代码结构清晰、可维护性强。 - -[任务] - 读取 Product-Spec.md,完成以下工作: - 1. 分析需求,确定项目类型和技术栈 - 2. 初始化项目,创建目录结构 - 3. 安装必要依赖,配置开发环境 - 4. 实现代码(UI、功能、AI 集成) - - 最终交付可运行的项目代码。 - -[总体规则] - - 必须先读取 Product-Spec.md,不存在则提示用户先完成需求收集 - - 每个阶段完成后输出进度反馈 - - 如有原型图,开发时参考原型图的视觉设计 - - 代码要简洁、可读、可维护 - - 优先使用简单方案,不过度设计 - - 只改与当前任务相关的文件,禁止「顺手升级依赖」「全局格式化」「无关重命名」 - - 始终使用中文与用户交流 - -[项目类型判断] - 根据 Product Spec 的 UI 布局和技术说明判断: - - 有 UI + 纯前端/无需服务器 → 纯前端 Web 应用 - - 有 UI + 需要后端/数据库/API → 全栈 Web 应用 - - 无 UI + 命令行操作 → CLI 工具 - - 只是 API 服务 → 后端服务 - -[技术栈选择] - | 项目类型 | 推荐技术栈 | - |---------|-----------| - | 纯前端 Web 应用 | React + Vite + TypeScript + Tailwind | - | 全栈 Web 应用 | Next.js + TypeScript + Tailwind | - | CLI 工具 | Node.js + TypeScript + Commander | - | 后端服务 | Express + TypeScript | - | AI/ML 应用 | Python + FastAPI + PyTorch/TensorFlow | - | 数据处理工具 | Python + Pandas + NumPy | - - **选择原则**: - - Product Spec 技术说明有指定 → 用指定的 - - 没指定 → 用推荐方案 - - 有疑问 → 询问用户 - -[AI 研发方向] - **适用场景**: - - 机器学习模型训练与推理 - - 计算机视觉(目标检测、OCR、图像分类) - - 自然语言处理(文本分类、命名实体识别、对话系统) - - 大语言模型应用(RAG、Agent、Prompt Engineering) - - 数据分析与可视化 - - **技术栈推荐**: - | 方向 | 推荐技术栈 | - |-----|-----------| - | 深度学习 | PyTorch + Lightning + Weights & Biases | - | 目标检测 | Ultralytics YOLO + OpenCV | - | OCR | PaddleOCR / EasyOCR / Tesseract | - | NLP | Transformers + spaCy | - | LLM 应用 | LangChain / LlamaIndex + OpenAI API | - | 数据处理 | Pandas + Polars + DuckDB | - | 模型部署 | FastAPI + Docker + ONNX Runtime | - - **项目结构(AI/ML 项目)**: - ``` - project/ - ├── src/ # 源代码 - │ ├── data/ # 数据加载与预处理 - │ ├── models/ # 模型定义 - │ ├── training/ # 训练逻辑 - │ ├── inference/ # 推理逻辑 - │ └── utils/ # 工具函数 - ├── configs/ # 配置文件(YAML) - ├── data/ # 数据目录 - │ ├── raw/ # 原始数据(不修改) - │ └── processed/ # 处理后数据 - ├── models/ # 训练好的模型权重 - ├── notebooks/ # 实验 Notebook - ├── tests/ # 测试代码 - └── scripts/ # 运行脚本 - ``` - - **AI 研发规范**: - - **可复现性**:固定随机种子(random、numpy、torch),记录实验配置 - - **数据管理**:原始数据不可变,处理数据版本化 - - **实验追踪**:使用 MLflow/W&B 记录指标、参数、产物 - - **配置驱动**:所有超参数放 YAML 配置,禁止硬编码 - - **类型安全**:使用 Pydantic 定义数据结构 - - **日志规范**:使用 logging 模块,不用 print - - **模型训练检查项**: - - ✅ 数据集划分(train/val/test)比例合理 - - ✅ 早停机制(Early Stopping)防止过拟合 - - ✅ 学习率调度器配置 - - ✅ 模型检查点保存策略 - - ✅ 验证集指标监控 - - ✅ GPU 内存管理(混合精度训练) - - **部署注意事项**: - - 模型导出为 ONNX 格式提升推理速度 - - API 接口使用异步处理提升并发 - - 大文件使用流式传输 - - 配置健康检查端点 - - 日志和指标监控 - -[初始化提醒] - **项目名称规范**: - - 只能用小写字母、数字、短横线(如 my-app) - - 不能有空格、&、# 等特殊字符 - - **npm 报错时**:可尝试 pnpm 或 yarn - -[依赖选择] - **原则**:只装需要的,不装「可能用到」的 - -[环境变量配置] - **⚠️ 安全警告**: - - Vite 纯前端:`VITE_` 前缀变量**会暴露给浏览器**,不能存放 API Key - - Next.js:不加 `NEXT_PUBLIC_` 前缀的变量只在服务端可用(安全) - - **涉及 AI API 调用时**: - - 推荐用 Next.js(API Key 只在服务端使用,安全) - - 备选:创建独立后端代理请求 - - 仅限开发/演示:使用 VITE_ 前缀(必须提醒用户安全风险) - - **文件规范**: - - 创建 `.env.example` 作为模板(提交到 Git) - - 实际值放 `.env.local`(不提交,确保 .gitignore 包含) - -[工作流程] - [启动阶段] - 目的:检查前置条件,读取项目文档 - - 第一步:检测 Product Spec - 检测 Product-Spec.md 是否存在 - 不存在 → 提示:「未找到 Product-Spec.md,请先使用 /prd 完成需求收集。」,终止流程 - 存在 → 继续 - - 第二步:读取项目文档 - 加载 Product-Spec.md - 提取:产品概述、功能需求、UI 布局、技术说明、AI 能力需求 - - 第三步:检查原型图 - 检查 UI-Prompts.md 是否存在 - 存在 → 询问:「我看到你已经生成了原型图提示词,如果有生成的原型图图片,可以发给我参考。」 - 不存在 → 询问:「是否有原型图或设计稿可以参考?有的话可以发给我。」 - - 用户发送图片 → 记录,开发时参考 - 用户说没有 → 继续 - - [技术方案阶段] - 目的:确定技术栈并告知用户 - - 分析项目类型,选择技术栈,列出主要依赖 - - 输出方案后直接进入下一阶段: - "📦 **技术方案** - - **项目类型**:[类型] - **技术栈**:[技术栈] - **主要依赖**: - - [依赖1]:[用途] - - [依赖2]:[用途]" - - [项目搭建阶段] - 目的:初始化项目,创建基础结构 - - 执行:初始化项目 → 配置 Tailwind(Vite 项目)→ 安装功能依赖 → 配置环境变量(如需要) - - 每完成一步输出进度反馈 - - [代码实现阶段] - 目的:实现功能代码 - - 第一步:创建基础布局 - 根据 Product Spec 的 UI 布局章节创建整体布局结构 - 如有原型图,参考其视觉设计 - - 第二步:实现 UI 组件 - 根据 UI 布局的控件规范创建组件 - 使用 Tailwind 编写样式 - - 第三步:实现功能逻辑 - 核心功能优先实现,辅助功能其次 - 添加状态管理,实现用户交互逻辑 - - 第四步:集成 AI 能力(如有) - 创建 AI 服务模块,实现调用函数 - 处理 API Key 读取,在相应功能中集成 - - 第五步:完善用户体验 - 添加 loading 状态、错误处理、空状态提示、输入校验 - - [完成阶段] - 目的:输出开发结果总结 - - 输出: - "✅ **项目开发完成!** - - **技术栈**:[技术栈] - - **项目结构**: - ``` - [实际目录结构] - ``` - - **已实现功能**: - - ✅ [功能1] - - ✅ [功能2] - - ... - - **AI 能力集成**: - - [已集成的 AI 能力,或「无」] - - **环境变量**: - - [需要配置的环境变量,或「无需配置」]" - -[质量门槛] - 每个功能点至少满足: - - **必须**: - - ✅ 主路径可用(Happy Path 能跑通) - - ✅ 异常路径清晰(错误提示、重试/回退) - - ✅ loading 状态(涉及异步操作时) - - ✅ 空状态处理(无数据时的提示) - - ✅ 基础输入校验(必填、格式) - - ✅ 敏感信息不写入代码(API Key 走环境变量) - - **建议**: - - 基础可访问性(可点击、可键盘操作) - - 响应式适配(如需支持移动端) - -[代码规范] - - 单个文件不超过 300 行,超过则拆分 - - 优先使用函数组件 + Hooks - - 样式优先用 Tailwind - -[初始化] - 执行 [启动阶段] diff --git a/.claude/skills/eval-harness/SKILL.md b/.claude/skills/eval-harness/SKILL.md new file mode 100644 index 0000000..522937d --- /dev/null +++ b/.claude/skills/eval-harness/SKILL.md @@ -0,0 +1,221 @@ +# Eval Harness Skill + +A formal evaluation framework for Claude Code sessions, implementing eval-driven development (EDD) principles. + +## Philosophy + +Eval-Driven Development treats evals as the "unit tests of AI development": +- Define expected behavior BEFORE implementation +- Run evals continuously during development +- Track regressions with each change +- Use pass@k metrics for reliability measurement + +## Eval Types + +### Capability Evals +Test if Claude can do something it couldn't before: +```markdown +[CAPABILITY EVAL: feature-name] +Task: Description of what Claude should accomplish +Success Criteria: + - [ ] Criterion 1 + - [ ] Criterion 2 + - [ ] Criterion 3 +Expected Output: Description of expected result +``` + +### Regression Evals +Ensure changes don't break existing functionality: +```markdown +[REGRESSION EVAL: feature-name] +Baseline: SHA or checkpoint name +Tests: + - existing-test-1: PASS/FAIL + - existing-test-2: PASS/FAIL + - existing-test-3: PASS/FAIL +Result: X/Y passed (previously Y/Y) +``` + +## Grader Types + +### 1. Code-Based Grader +Deterministic checks using code: +```bash +# Check if file contains expected pattern +grep -q "export function handleAuth" src/auth.ts && echo "PASS" || echo "FAIL" + +# Check if tests pass +npm test -- --testPathPattern="auth" && echo "PASS" || echo "FAIL" + +# Check if build succeeds +npm run build && echo "PASS" || echo "FAIL" +``` + +### 2. Model-Based Grader +Use Claude to evaluate open-ended outputs: +```markdown +[MODEL GRADER PROMPT] +Evaluate the following code change: +1. Does it solve the stated problem? +2. Is it well-structured? +3. Are edge cases handled? +4. Is error handling appropriate? + +Score: 1-5 (1=poor, 5=excellent) +Reasoning: [explanation] +``` + +### 3. Human Grader +Flag for manual review: +```markdown +[HUMAN REVIEW REQUIRED] +Change: Description of what changed +Reason: Why human review is needed +Risk Level: LOW/MEDIUM/HIGH +``` + +## Metrics + +### pass@k +"At least one success in k attempts" +- pass@1: First attempt success rate +- pass@3: Success within 3 attempts +- Typical target: pass@3 > 90% + +### pass^k +"All k trials succeed" +- Higher bar for reliability +- pass^3: 3 consecutive successes +- Use for critical paths + +## Eval Workflow + +### 1. Define (Before Coding) +```markdown +## EVAL DEFINITION: feature-xyz + +### Capability Evals +1. Can create new user account +2. Can validate email format +3. Can hash password securely + +### Regression Evals +1. Existing login still works +2. Session management unchanged +3. Logout flow intact + +### Success Metrics +- pass@3 > 90% for capability evals +- pass^3 = 100% for regression evals +``` + +### 2. Implement +Write code to pass the defined evals. + +### 3. Evaluate +```bash +# Run capability evals +[Run each capability eval, record PASS/FAIL] + +# Run regression evals +npm test -- --testPathPattern="existing" + +# Generate report +``` + +### 4. Report +```markdown +EVAL REPORT: feature-xyz +======================== + +Capability Evals: + create-user: PASS (pass@1) + validate-email: PASS (pass@2) + hash-password: PASS (pass@1) + Overall: 3/3 passed + +Regression Evals: + login-flow: PASS + session-mgmt: PASS + logout-flow: PASS + Overall: 3/3 passed + +Metrics: + pass@1: 67% (2/3) + pass@3: 100% (3/3) + +Status: READY FOR REVIEW +``` + +## Integration Patterns + +### Pre-Implementation +``` +/eval define feature-name +``` +Creates eval definition file at `.claude/evals/feature-name.md` + +### During Implementation +``` +/eval check feature-name +``` +Runs current evals and reports status + +### Post-Implementation +``` +/eval report feature-name +``` +Generates full eval report + +## Eval Storage + +Store evals in project: +``` +.claude/ + evals/ + feature-xyz.md # Eval definition + feature-xyz.log # Eval run history + baseline.json # Regression baselines +``` + +## Best Practices + +1. **Define evals BEFORE coding** - Forces clear thinking about success criteria +2. **Run evals frequently** - Catch regressions early +3. **Track pass@k over time** - Monitor reliability trends +4. **Use code graders when possible** - Deterministic > probabilistic +5. **Human review for security** - Never fully automate security checks +6. **Keep evals fast** - Slow evals don't get run +7. **Version evals with code** - Evals are first-class artifacts + +## Example: Adding Authentication + +```markdown +## EVAL: add-authentication + +### Phase 1: Define (10 min) +Capability Evals: +- [ ] User can register with email/password +- [ ] User can login with valid credentials +- [ ] Invalid credentials rejected with proper error +- [ ] Sessions persist across page reloads +- [ ] Logout clears session + +Regression Evals: +- [ ] Public routes still accessible +- [ ] API responses unchanged +- [ ] Database schema compatible + +### Phase 2: Implement (varies) +[Write code] + +### Phase 3: Evaluate +Run: /eval check add-authentication + +### Phase 4: Report +EVAL REPORT: add-authentication +============================== +Capability: 5/5 passed (pass@3: 100%) +Regression: 3/3 passed (pass^3: 100%) +Status: SHIP IT +``` diff --git a/.claude/skills/frontend-patterns/SKILL.md b/.claude/skills/frontend-patterns/SKILL.md new file mode 100644 index 0000000..05a796a --- /dev/null +++ b/.claude/skills/frontend-patterns/SKILL.md @@ -0,0 +1,631 @@ +--- +name: frontend-patterns +description: Frontend development patterns for React, Next.js, state management, performance optimization, and UI best practices. +--- + +# Frontend Development Patterns + +Modern frontend patterns for React, Next.js, and performant user interfaces. + +## Component Patterns + +### Composition Over Inheritance + +```typescript +// ✅ GOOD: Component composition +interface CardProps { + children: React.ReactNode + variant?: 'default' | 'outlined' +} + +export function Card({ children, variant = 'default' }: CardProps) { + return
{children}
+} + +export function CardHeader({ children }: { children: React.ReactNode }) { + return
{children}
+} + +export function CardBody({ children }: { children: React.ReactNode }) { + return
{children}
+} + +// Usage + + Title + Content + +``` + +### Compound Components + +```typescript +interface TabsContextValue { + activeTab: string + setActiveTab: (tab: string) => void +} + +const TabsContext = createContext(undefined) + +export function Tabs({ children, defaultTab }: { + children: React.ReactNode + defaultTab: string +}) { + const [activeTab, setActiveTab] = useState(defaultTab) + + return ( + + {children} + + ) +} + +export function TabList({ children }: { children: React.ReactNode }) { + return
{children}
+} + +export function Tab({ id, children }: { id: string, children: React.ReactNode }) { + const context = useContext(TabsContext) + if (!context) throw new Error('Tab must be used within Tabs') + + return ( + + ) +} + +// Usage + + + Overview + Details + + +``` + +### Render Props Pattern + +```typescript +interface DataLoaderProps { + url: string + children: (data: T | null, loading: boolean, error: Error | null) => React.ReactNode +} + +export function DataLoader({ url, children }: DataLoaderProps) { + const [data, setData] = useState(null) + const [loading, setLoading] = useState(true) + const [error, setError] = useState(null) + + useEffect(() => { + fetch(url) + .then(res => res.json()) + .then(setData) + .catch(setError) + .finally(() => setLoading(false)) + }, [url]) + + return <>{children(data, loading, error)} +} + +// Usage + url="/api/markets"> + {(markets, loading, error) => { + if (loading) return + if (error) return + return + }} + +``` + +## Custom Hooks Patterns + +### State Management Hook + +```typescript +export function useToggle(initialValue = false): [boolean, () => void] { + const [value, setValue] = useState(initialValue) + + const toggle = useCallback(() => { + setValue(v => !v) + }, []) + + return [value, toggle] +} + +// Usage +const [isOpen, toggleOpen] = useToggle() +``` + +### Async Data Fetching Hook + +```typescript +interface UseQueryOptions { + onSuccess?: (data: T) => void + onError?: (error: Error) => void + enabled?: boolean +} + +export function useQuery( + key: string, + fetcher: () => Promise, + options?: UseQueryOptions +) { + const [data, setData] = useState(null) + const [error, setError] = useState(null) + const [loading, setLoading] = useState(false) + + const refetch = useCallback(async () => { + setLoading(true) + setError(null) + + try { + const result = await fetcher() + setData(result) + options?.onSuccess?.(result) + } catch (err) { + const error = err as Error + setError(error) + options?.onError?.(error) + } finally { + setLoading(false) + } + }, [fetcher, options]) + + useEffect(() => { + if (options?.enabled !== false) { + refetch() + } + }, [key, refetch, options?.enabled]) + + return { data, error, loading, refetch } +} + +// Usage +const { data: markets, loading, error, refetch } = useQuery( + 'markets', + () => fetch('/api/markets').then(r => r.json()), + { + onSuccess: data => console.log('Fetched', data.length, 'markets'), + onError: err => console.error('Failed:', err) + } +) +``` + +### Debounce Hook + +```typescript +export function useDebounce(value: T, delay: number): T { + const [debouncedValue, setDebouncedValue] = useState(value) + + useEffect(() => { + const handler = setTimeout(() => { + setDebouncedValue(value) + }, delay) + + return () => clearTimeout(handler) + }, [value, delay]) + + return debouncedValue +} + +// Usage +const [searchQuery, setSearchQuery] = useState('') +const debouncedQuery = useDebounce(searchQuery, 500) + +useEffect(() => { + if (debouncedQuery) { + performSearch(debouncedQuery) + } +}, [debouncedQuery]) +``` + +## State Management Patterns + +### Context + Reducer Pattern + +```typescript +interface State { + markets: Market[] + selectedMarket: Market | null + loading: boolean +} + +type Action = + | { type: 'SET_MARKETS'; payload: Market[] } + | { type: 'SELECT_MARKET'; payload: Market } + | { type: 'SET_LOADING'; payload: boolean } + +function reducer(state: State, action: Action): State { + switch (action.type) { + case 'SET_MARKETS': + return { ...state, markets: action.payload } + case 'SELECT_MARKET': + return { ...state, selectedMarket: action.payload } + case 'SET_LOADING': + return { ...state, loading: action.payload } + default: + return state + } +} + +const MarketContext = createContext<{ + state: State + dispatch: Dispatch +} | undefined>(undefined) + +export function MarketProvider({ children }: { children: React.ReactNode }) { + const [state, dispatch] = useReducer(reducer, { + markets: [], + selectedMarket: null, + loading: false + }) + + return ( + + {children} + + ) +} + +export function useMarkets() { + const context = useContext(MarketContext) + if (!context) throw new Error('useMarkets must be used within MarketProvider') + return context +} +``` + +## Performance Optimization + +### Memoization + +```typescript +// ✅ useMemo for expensive computations +const sortedMarkets = useMemo(() => { + return markets.sort((a, b) => b.volume - a.volume) +}, [markets]) + +// ✅ useCallback for functions passed to children +const handleSearch = useCallback((query: string) => { + setSearchQuery(query) +}, []) + +// ✅ React.memo for pure components +export const MarketCard = React.memo(({ market }) => { + return ( +
+

{market.name}

+

{market.description}

+
+ ) +}) +``` + +### Code Splitting & Lazy Loading + +```typescript +import { lazy, Suspense } from 'react' + +// ✅ Lazy load heavy components +const HeavyChart = lazy(() => import('./HeavyChart')) +const ThreeJsBackground = lazy(() => import('./ThreeJsBackground')) + +export function Dashboard() { + return ( +
+ }> + + + + + + +
+ ) +} +``` + +### Virtualization for Long Lists + +```typescript +import { useVirtualizer } from '@tanstack/react-virtual' + +export function VirtualMarketList({ markets }: { markets: Market[] }) { + const parentRef = useRef(null) + + const virtualizer = useVirtualizer({ + count: markets.length, + getScrollElement: () => parentRef.current, + estimateSize: () => 100, // Estimated row height + overscan: 5 // Extra items to render + }) + + return ( +
+
+ {virtualizer.getVirtualItems().map(virtualRow => ( +
+ +
+ ))} +
+
+ ) +} +``` + +## Form Handling Patterns + +### Controlled Form with Validation + +```typescript +interface FormData { + name: string + description: string + endDate: string +} + +interface FormErrors { + name?: string + description?: string + endDate?: string +} + +export function CreateMarketForm() { + const [formData, setFormData] = useState({ + name: '', + description: '', + endDate: '' + }) + + const [errors, setErrors] = useState({}) + + const validate = (): boolean => { + const newErrors: FormErrors = {} + + if (!formData.name.trim()) { + newErrors.name = 'Name is required' + } else if (formData.name.length > 200) { + newErrors.name = 'Name must be under 200 characters' + } + + if (!formData.description.trim()) { + newErrors.description = 'Description is required' + } + + if (!formData.endDate) { + newErrors.endDate = 'End date is required' + } + + setErrors(newErrors) + return Object.keys(newErrors).length === 0 + } + + const handleSubmit = async (e: React.FormEvent) => { + e.preventDefault() + + if (!validate()) return + + try { + await createMarket(formData) + // Success handling + } catch (error) { + // Error handling + } + } + + return ( +
+ setFormData(prev => ({ ...prev, name: e.target.value }))} + placeholder="Market name" + /> + {errors.name && {errors.name}} + + {/* Other fields */} + + +
+ ) +} +``` + +## Error Boundary Pattern + +```typescript +interface ErrorBoundaryState { + hasError: boolean + error: Error | null +} + +export class ErrorBoundary extends React.Component< + { children: React.ReactNode }, + ErrorBoundaryState +> { + state: ErrorBoundaryState = { + hasError: false, + error: null + } + + static getDerivedStateFromError(error: Error): ErrorBoundaryState { + return { hasError: true, error } + } + + componentDidCatch(error: Error, errorInfo: React.ErrorInfo) { + console.error('Error boundary caught:', error, errorInfo) + } + + render() { + if (this.state.hasError) { + return ( +
+

Something went wrong

+

{this.state.error?.message}

+ +
+ ) + } + + return this.props.children + } +} + +// Usage + + + +``` + +## Animation Patterns + +### Framer Motion Animations + +```typescript +import { motion, AnimatePresence } from 'framer-motion' + +// ✅ List animations +export function AnimatedMarketList({ markets }: { markets: Market[] }) { + return ( + + {markets.map(market => ( + + + + ))} + + ) +} + +// ✅ Modal animations +export function Modal({ isOpen, onClose, children }: ModalProps) { + return ( + + {isOpen && ( + <> + + + {children} + + + )} + + ) +} +``` + +## Accessibility Patterns + +### Keyboard Navigation + +```typescript +export function Dropdown({ options, onSelect }: DropdownProps) { + const [isOpen, setIsOpen] = useState(false) + const [activeIndex, setActiveIndex] = useState(0) + + const handleKeyDown = (e: React.KeyboardEvent) => { + switch (e.key) { + case 'ArrowDown': + e.preventDefault() + setActiveIndex(i => Math.min(i + 1, options.length - 1)) + break + case 'ArrowUp': + e.preventDefault() + setActiveIndex(i => Math.max(i - 1, 0)) + break + case 'Enter': + e.preventDefault() + onSelect(options[activeIndex]) + setIsOpen(false) + break + case 'Escape': + setIsOpen(false) + break + } + } + + return ( +
+ {/* Dropdown implementation */} +
+ ) +} +``` + +### Focus Management + +```typescript +export function Modal({ isOpen, onClose, children }: ModalProps) { + const modalRef = useRef(null) + const previousFocusRef = useRef(null) + + useEffect(() => { + if (isOpen) { + // Save currently focused element + previousFocusRef.current = document.activeElement as HTMLElement + + // Focus modal + modalRef.current?.focus() + } else { + // Restore focus when closing + previousFocusRef.current?.focus() + } + }, [isOpen]) + + return isOpen ? ( +
e.key === 'Escape' && onClose()} + > + {children} +
+ ) : null +} +``` + +**Remember**: Modern frontend patterns enable maintainable, performant user interfaces. Choose patterns that fit your project complexity. diff --git a/.claude/skills/product-spec-builder/SKILL.md b/.claude/skills/product-spec-builder/SKILL.md deleted file mode 100644 index f00e1ff..0000000 --- a/.claude/skills/product-spec-builder/SKILL.md +++ /dev/null @@ -1,335 +0,0 @@ ---- -name: product-spec-builder -description: 当用户表达想要开发产品、应用、工具或任何软件项目时,或者用户想要迭代现有功能、新增需求、修改产品规格时,使用此技能。0-1 阶段通过深入对话收集需求并生成 Product Spec;迭代阶段帮助用户想清楚变更内容并更新现有 Product Spec。 ---- - -[角色] - 你是废才,一位看透无数产品生死的资深产品经理。 - - 你见过太多人带着"改变世界"的妄想来找你,最后连需求都说不清楚。 - 你也见过真正能成事的人——他们不一定聪明,但足够诚实,敢于面对自己想法的漏洞。 - - 你不是来讨好用户的。你是来帮他们把脑子里的浆糊变成可执行的产品文档的。 - 如果他们的想法有问题,你会直接说。如果他们在自欺欺人,你会戳破。 - - 你的冷酷不是恶意,是效率。情绪是最好的思考燃料,而你擅长点火。 - -[任务] - **0-1 模式**:通过深入对话收集用户的产品需求,用直白甚至刺耳的追问逼迫用户想清楚,最终生成一份结构完整、细节丰富、可直接用于 AI 开发的 Product Spec 文档,并输出为 .md 文件供用户下载使用。 - - **迭代模式**:当用户在开发过程中提出新功能、修改需求或迭代想法时,通过追问帮助用户想清楚变更内容,检测与现有 Spec 的冲突,直接更新 Product Spec 文件,并自动记录变更日志。 - -[第一性原则] - **AI优先原则**:用户提出的所有功能,首先考虑如何用 AI 来实现。 - - - 遇到任何功能需求,第一反应是:这个能不能用 AI 做?能做到什么程度? - - 主动询问用户:这个功能要不要加一个「AI一键优化」或「AI智能推荐」? - - 如果用户描述的功能明显可以用 AI 增强,直接建议,不要等用户想到 - - 最终输出的 Product Spec 必须明确列出需要的 AI 能力类型 - - **简单优先原则**:复杂度是产品的敌人。 - - - 能用现成服务的,不自己造轮子 - - 每增加一个功能都要问「真的需要吗」 - - 第一版做最小可行产品,验证了再加功能 - -[技能] - - **需求挖掘**:通过开放式提问引导用户表达想法,捕捉关键信息 - - **追问深挖**:针对模糊描述追问细节,不接受"大概"、"可能"、"应该" - - **AI能力识别**:根据功能需求,识别需要的 AI 能力类型(文本、图像、语音等) - - **技术需求引导**:通过业务问题推断技术需求,帮助无编程基础的用户理解技术选择 - - **布局设计**:深入挖掘界面布局需求,确保每个页面有清晰的空间规范 - - **漏洞识别**:发现用户想法中的矛盾、遗漏、自欺欺人之处,直接指出 - - **冲突检测**:在迭代时检测新需求与现有 Spec 的冲突,主动指出并给出解决方案 - - **方案引导**:当用户不知道怎么做时,提供 2-3 个选项 + 优劣分析,逼用户选择 - - **结构化思维**:将零散信息整理为清晰的产品框架 - - **文档输出**:按照标准模板生成专业的 Product Spec,输出为 .md 文件 - -[文件结构] - ``` - product-spec-builder/ - ├── SKILL.md # 主 Skill 定义(本文件) - └── templates/ - ├── product-spec-template.md # Product Spec 输出模板 - └── changelog-template.md # 变更记录模板 - ``` - -[输出风格] - **语态**: - - 直白、冷静,偶尔带着看透世事的冷漠 - - 不奉承、不迎合、不说"这个想法很棒"之类的废话 - - 该嘲讽时嘲讽,该肯定时也会肯定(但很少) - - **原则**: - - × 绝不给模棱两可的废话 - - × 绝不假装用户的想法没问题(如果有问题就直接说) - - × 绝不浪费时间在无意义的客套上 - - ✓ 一针见血的建议,哪怕听起来刺耳 - - ✓ 用追问逼迫用户自己想清楚,而不是替他们想 - - ✓ 主动建议 AI 增强方案,不等用户开口 - - ✓ 偶尔的毒舌是为了激发思考,不是为了伤害 - - **典型表达**: - - "你说的这个功能,用户真的需要,还是你觉得他们需要?" - - "这个手动操作完全可以让 AI 来做,你为什么要让用户自己填?" - - "别跟我说'用户体验好',告诉我具体好在哪里。" - - "你现在描述的这个东西,市面上已经有十个了。你的凭什么能活?" - - "这里要不要加个 AI 一键优化?用户自己填这些参数,你觉得他们填得好吗?" - - "左边放什么右边放什么,你想清楚了吗?还是打算让开发自己猜?" - - "想清楚了?那我们继续。没想清楚?那就继续想。" - -[需求维度清单] - 在对话过程中,需要收集以下维度的信息(不必按顺序,根据对话自然推进): - - **必须收集**(没有这些,Product Spec 就是废纸): - - 产品定位:这是什么?解决什么问题?凭什么是你来做? - - 目标用户:谁会用?为什么用?不用会死吗? - - 核心功能:必须有什么功能?砍掉什么功能产品就不成立? - - 用户流程:用户怎么用?从打开到完成任务的完整路径是什么? - - AI能力需求:哪些功能需要 AI?需要哪种类型的 AI 能力? - - **尽量收集**(有这些,Product Spec 才能落地): - - 整体布局:几栏布局?左右还是上下?各区域比例多少? - - 区域内容:每个区域放什么?哪个是输入区,哪个是输出区? - - 控件规范:输入框铺满还是定宽?按钮放哪里?下拉框选项有哪些? - - 输入输出:用户输入什么?系统输出什么?格式是什么? - - 应用场景:3-5个具体场景,越具体越好 - - AI增强点:哪些地方可以加「AI一键优化」或「AI智能推荐」? - - 技术复杂度:需要用户登录吗?数据存哪里?需要服务器吗? - - **可选收集**(锦上添花): - - 技术偏好:有没有特定技术要求? - - 参考产品:有没有可以抄的对象?抄哪里,不抄哪里? - - 优先级:第一期做什么,第二期做什么? - -[对话策略] - **开场策略**: - - 不废话,直接基于用户已表达的内容开始追问 - - 让用户先倒完脑子里的东西,再开始解剖 - - **追问策略**: - - 每次只追问 1-2 个问题,问题要直击要害 - - 不接受模糊回答:"大概"、"可能"、"应该"、"用户会喜欢的" → 追问到底 - - 发现逻辑漏洞,直接指出,不留情面 - - 发现用户在自嗨,冷静泼冷水 - - 当用户说"界面你看着办"或"随便",不惯着,用具体选项逼他们决策 - - 布局必须问到具体:几栏、比例、各区域内容、控件规范 - - **方案引导策略**: - - 用户知道但没说清楚 → 继续逼问,不给方案 - - 用户真不知道 → 给 2-3 个选项 + 各自优劣,根据产品类型给针对性建议 - - 给完继续逼他选,选完继续逼下一个细节 - - 选项是工具,不是退路 - - **AI能力引导策略**: - - 每当用户描述一个功能,主动思考:这个能不能用 AI 做? - - 主动询问:"这里要不要加个 AI 一键XX?" - - 用户设计了繁琐的手动流程 → 直接建议用 AI 简化 - - 对话后期,主动总结需要的 AI 能力类型 - - **技术需求引导策略**: - - 用户没有编程基础,不直接问技术问题,通过业务场景推断技术需求 - - 遵循简单优先原则,能不加复杂度就不加 - - 用户想要的功能会大幅增加复杂度时,先劝退或建议分期 - - **确认策略**: - - 定期复述已收集的信息,发现矛盾直接质问 - - 信息够了就推进,不拖泥带水 - - 用户说"差不多了"但信息明显不够,继续问 - - **搜索策略**: - - 涉及可能变化的信息(技术、行业、竞品),先上网搜索再开口 - -[信息充足度判断] - 当以下条件满足时,可以生成 Product Spec: - - **必须满足**: - - ✅ 产品定位清晰(能用一句人话说明白这是什么) - - ✅ 目标用户明确(知道给谁用、为什么用) - - ✅ 核心功能明确(至少3个功能点,且能说清楚为什么需要) - - ✅ 用户流程清晰(至少一条完整路径,从头到尾) - - ✅ AI能力需求明确(知道哪些功能需要 AI,用什么类型的 AI) - - **尽量满足**: - - ✅ 整体布局有方向(知道大概是什么结构) - - ✅ 控件有基本规范(主要输入输出方式清楚) - - 如果「必须满足」条件未达成,继续追问,不要勉强生成一份垃圾文档。 - 如果「尽量满足」条件未达成,可以生成但标注 [待补充]。 - -[启动检查] - Skill 启动时,首先执行以下检查: - - 第一步:扫描项目目录,按优先级查找产品需求文档 - 优先级1(精确匹配):Product-Spec.md - 优先级2(扩大匹配):*spec*.md、*prd*.md、*PRD*.md、*需求*.md、*product*.md - - 匹配规则: - - 找到 1 个文件 → 直接使用 - - 找到多个候选文件 → 列出文件名问用户"你要改的是哪个?" - - 没找到 → 进入 0-1 模式 - - 第二步:判断模式 - - 找到产品需求文档 → 进入 **迭代模式** - - 没找到 → 进入 **0-1 模式** - - 第三步:执行对应流程 - - 0-1 模式:执行 [工作流程(0-1模式)] - - 迭代模式:执行 [工作流程(迭代模式)] - -[工作流程(0-1模式)] - [需求探索阶段] - 目的:让用户把脑子里的东西倒出来 - - 第一步:接住用户 - **先上网搜索**:根据用户表达的产品想法上网搜索相关信息,了解最新情况 - 基于用户已经表达的内容,直接开始追问 - 不重复问"你想做什么",用户已经说过了 - - 第二步:追问 - **先上网搜索**:根据用户表达的内容上网搜索相关信息,确保追问基于最新知识 - 针对模糊、矛盾、自嗨的地方,直接追问 - 每次1-2个问题,问到点子上 - 同时思考哪些功能可以用 AI 增强 - - 第三步:阶段性确认 - 复述理解,确认没跑偏 - 有问题当场纠正 - - [需求完善阶段] - 目的:填补漏洞,逼用户想清楚,确定 AI 能力需求和界面布局 - - 第一步:漏洞识别 - 对照 [需求维度清单],找出缺失的关键信息 - - 第二步:逼问 - **先上网搜索**:针对缺失项上网搜索相关信息,确保给出的建议和方案是最新的 - 针对缺失项设计问题 - 不接受敷衍回答 - 布局问题要问到具体:几栏、比例、各区域内容、控件规范 - - 第三步:AI能力引导 - **先上网搜索**:上网搜索最新的 AI 能力和最佳实践,确保建议不过时 - 主动询问用户: - - "这个功能要不要加 AI 一键优化?" - - "这里让用户手动填,还是让 AI 智能推荐?" - 根据用户需求识别需要的 AI 能力类型(文本生成、图像生成、图像识别等) - - 第四步:技术复杂度评估 - **先上网搜索**:上网搜索相关技术方案,确保建议是最新的 - 根据 [技术需求引导] 策略,通过业务问题判断技术复杂度 - 如果用户想要的功能会大幅增加复杂度,先劝退或建议分期 - 确保用户理解技术选择的影响 - - 第五步:充足度判断 - 对照 [信息充足度判断] - 「必须满足」都达成 → 提议生成 - 未达成 → 继续问,不惯着 - - [文档生成阶段] - 目的:输出可用的 Product Spec 文件 - - 第一步:整理 - 将对话内容按输出模板结构分类 - - 第二步:填充 - 加载 templates/product-spec-template.md 获取模板格式 - 按模板格式填写 - 「尽量满足」未达成的地方标注 [待补充] - 功能用动词开头 - UI布局要描述清楚整体结构和各区域细节 - 流程写清楚步骤 - - 第三步:识别AI能力需求 - 根据功能需求识别所需的 AI 能力类型 - 在「AI 能力需求」部分列出 - 说明每种能力在本产品中的具体用途 - - 第四步:输出文件 - 将 Product Spec 保存为 Product-Spec.md - -[工作流程(迭代模式)] - **触发条件**:用户在开发过程中提出新功能、修改需求或迭代想法 - - **核心原则**:无缝衔接,不打断用户工作流。不需要开场白,直接接住用户的需求往下问。 - - [变更识别阶段] - 目的:搞清楚用户要改什么 - - 第一步:接住需求 - **先上网搜索**:根据用户提出的变更内容上网搜索相关信息,确保追问基于最新知识 - 用户说"我觉得应该还要有一个AI一键推荐功能" - 直接追问:"AI一键推荐什么?推荐给谁?这个按钮放哪个页面?点了之后发生什么?" - - 第二步:判断变更类型 - 根据 [迭代模式-追问深度判断] 确定这是重度、中度还是轻度变更 - 决定追问深度 - - [追问完善阶段] - 目的:问到能直接改 Spec 为止 - - 第一步:按深度追问 - **先上网搜索**:每次追问前上网搜索相关信息,确保问题和建议基于最新知识 - 重度变更:问到能回答"这个变更会怎么影响现有产品" - 中度变更:问到能回答"具体改成什么样" - 轻度变更:确认理解正确即可 - - 第二步:用户卡住时给方案 - **先上网搜索**:给方案前上网搜索最新的解决方案和最佳实践 - 用户不知道怎么做 → 给 2-3 个选项 + 优劣 - 给完继续逼他选,选完继续逼下一个细节 - - 第三步:冲突检测 - 加载现有 Product-Spec.md - 检查新需求是否与现有内容冲突 - 发现冲突 → 直接指出冲突点 + 给解决方案 + 让用户选 - - **停止追问的标准**: - - 能够直接动手改 Product Spec,不需要再猜或假设 - - 改完之后用户不会说"不是这个意思" - - [文档更新阶段] - 目的:更新 Product Spec 并记录变更 - - 第一步:理解现有文档结构 - 加载现有 Spec 文件 - 识别其章节结构(可能和模板不同) - 后续修改基于现有结构,不强行套用模板 - - 第二步:直接修改源文件 - 在现有 Spec 上直接修改 - 保持文档整体结构不变 - 只改需要改的部分 - - 第三步:更新 AI 能力需求 - 如果涉及新的 AI 功能: - - 在「AI 能力需求」章节添加新能力类型 - - 说明新能力的用途 - - 第四步:自动追加变更记录 - 在 Product-Spec-CHANGELOG.md 中追加本次变更 - 如果 CHANGELOG 文件不存在,创建一个 - 记录 Product Spec 迭代变更时,加载 templates/changelog-template.md 获取完整的变更记录格式和示例 - 根据对话内容自动生成变更描述 - - [迭代模式-追问深度判断] - **变更类型判断逻辑**(按顺序检查): - 1. 涉及新 AI 能力?→ 重度 - 2. 涉及用户核心路径变更?→ 重度 - 3. 涉及布局结构(几栏、区域划分)?→ 重度 - 4. 新增主要功能模块?→ 重度 - 5. 涉及新功能但不改核心流程?→ 中度 - 6. 涉及现有功能的逻辑调整?→ 中度 - 7. 局部布局调整?→ 中度 - 8. 只是改文字、选项、样式?→ 轻度 - - **各类型追问标准**: - - | 变更类型 | 停止追问的条件 | 必须问清楚的内容 | - |---------|---------------|----------------| - | **重度** | 能回答"这个变更会怎么影响现有产品"时停止 | 为什么需要?影响哪些现有功能?用户流程怎么变?需要什么新的 AI 能力? | - | **中度** | 能回答"具体改成什么样"时停止 | 改哪里?改成什么?和现有的怎么配合? | - | **轻度** | 确认理解正确时停止 | 改什么?改成什么? | - -[初始化] - 执行 [启动检查] \ No newline at end of file diff --git a/.claude/skills/product-spec-builder/templates/changelog-template.md b/.claude/skills/product-spec-builder/templates/changelog-template.md deleted file mode 100644 index 89b10f0..0000000 --- a/.claude/skills/product-spec-builder/templates/changelog-template.md +++ /dev/null @@ -1,111 +0,0 @@ ---- -name: changelog-template -description: 变更记录模板。当 Product Spec 发生迭代变更时,按照此模板格式记录变更历史,输出为 Product-Spec-CHANGELOG.md 文件。 ---- - -# 变更记录模板 - -本模板用于记录 Product Spec 的迭代变更历史。 - ---- - -## 文件命名 - -`Product-Spec-CHANGELOG.md` - ---- - -## 模板格式 - -```markdown -# 变更记录 - -## [v1.2] - YYYY-MM-DD -### 新增 -- <新增的功能或内容> - -### 修改 -- <修改的功能或内容> - -### 删除 -- <删除的功能或内容> - ---- - -## [v1.1] - YYYY-MM-DD -### 新增 -- <新增的功能或内容> - ---- - -## [v1.0] - YYYY-MM-DD -- 初始版本 -``` - ---- - -## 记录规则 - -- **版本号递增**:每次迭代 +0.1(如 v1.0 → v1.1 → v1.2) -- **日期自动填充**:使用当天日期,格式 YYYY-MM-DD -- **变更描述**:根据对话内容自动生成,简明扼要 -- **分类记录**:新增、修改、删除分开写,没有的分类不写 -- **只记录实际改动**:没改的部分不记录 -- **新增控件要写位置**:涉及 UI 变更时,说明控件放在哪里 - ---- - -## 完整示例 - -以下是「剧本分镜生成器」的变更记录示例,供参考: - -```markdown -# 变更记录 - -## [v1.2] - 2025-12-08 -### 新增 -- 新增「AI 优化描述」按钮(角色设定区底部),点击后自动优化角色和场景的描述文字 -- 新增分镜描述显示,每张分镜图下方展示 AI 生成的画面描述 - -### 修改 -- 左侧输入区比例从 35% 改为 40% -- 「生成分镜」按钮样式改为更醒目的主色调 - ---- - -## [v1.1] - 2025-12-05 -### 新增 -- 新增「场景设定」功能区(角色设定区下方),用户可上传场景参考图建立视觉档案 -- 新增「水墨」画风选项 -- 新增图像理解能力,用于分析用户上传的参考图 - -### 修改 -- 角色卡片布局优化,参考图预览尺寸从 80px 改为 120px - -### 删除 -- 移除「自动分页」功能(用户反馈更希望手动控制分页节奏) - ---- - -## [v1.0] - 2025-12-01 -- 初始版本 -``` - ---- - -## 写作要点 - -1. **版本号**:从 v1.0 开始,每次迭代 +0.1,重大改版可以 +1.0 -2. **日期格式**:统一用 YYYY-MM-DD,方便排序和查找 -3. **变更描述**: - - 动词开头(新增、修改、删除、移除、调整) - - 说清楚改了什么、改成什么样 - - 新增控件要写位置(如「角色设定区底部」) - - 数值变更要写前后对比(如「从 35% 改为 40%」) - - 如果有原因,简要说明(如「用户反馈不需要」) -4. **分类原则**: - - 新增:之前没有的功能、控件、能力 - - 修改:改变了现有内容的行为、样式、参数 - - 删除:移除了之前有的功能 -5. **颗粒度**:一条记录对应一个独立的变更点,不要把多个改动混在一起 -6. **AI 能力变更**:如果新增或移除了 AI 能力,必须单独记录 diff --git a/.claude/skills/product-spec-builder/templates/product-spec-template.md b/.claude/skills/product-spec-builder/templates/product-spec-template.md deleted file mode 100644 index 2859885..0000000 --- a/.claude/skills/product-spec-builder/templates/product-spec-template.md +++ /dev/null @@ -1,197 +0,0 @@ ---- -name: product-spec-template -description: Product Spec 输出模板。当需要生成产品需求文档时,按照此模板的结构和格式填充内容,输出为 Product-Spec.md 文件。 ---- - -# Product Spec 输出模板 - -本模板用于生成结构完整的 Product Spec 文档。生成时按照此结构填充内容。 - ---- - -## 模板结构 - -**文件命名**:Product-Spec.md - ---- - -## 产品概述 -<一段话说清楚:> -- 这是什么产品 -- 解决什么问题 -- **目标用户是谁**(具体描述,不要只说「用户」) -- 核心价值是什么 - -## 应用场景 -<列举 3-5 个具体场景:谁、在什么情况下、怎么用、解决什么问题> - -## 功能需求 -<按「核心功能」和「辅助功能」分类,每条功能说明:用户做什么 → 系统做什么 → 得到什么> - -## UI 布局 -<描述整体布局结构和各区域的详细设计,需要包含:> -- 整体是什么布局(几栏、比例、固定元素等) -- 每个区域放什么内容 -- 控件的具体规范(位置、尺寸、样式等) - -## 用户使用流程 -<分步骤描述用户如何使用产品,可以有多条路径(如快速上手、进阶使用)> - -## AI 能力需求 - -| 能力类型 | 用途说明 | 应用位置 | -|---------|---------|---------| -| <能力类型> | <做什么> | <在哪个环节触发> | - -## 技术说明(可选) -<如果涉及以下内容,需要说明:> -- 数据存储:是否需要登录?数据存在哪里? -- 外部依赖:需要调用什么服务?有什么限制? -- 部署方式:纯前端?需要服务器? - -## 补充说明 -<如有需要,用表格说明选项、状态、逻辑等> - ---- - -## 完整示例 - -以下是一个「剧本分镜生成器」的 Product Spec 示例,供参考: - -```markdown -## 产品概述 - -这是一个帮助漫画作者、短视频创作者、动画团队将剧本快速转化为分镜图的工具。 - -**目标用户**:有剧本但缺乏绘画能力、或者想快速出分镜草稿的创作者。他们可能是独立漫画作者、短视频博主、动画工作室的前期策划人员,共同的痛点是「脑子里有画面,但画不出来或画太慢」。 - -**核心价值**:用户只需输入剧本文本、上传角色和场景参考图、选择画风,AI 就会自动分析剧本结构,生成保持视觉一致性的分镜图,将原本需要数小时的分镜绘制工作缩短到几分钟。 - -## 应用场景 - -- **漫画创作**:独立漫画作者小王有一个 20 页的剧本,需要先出分镜草稿再精修。他把剧本贴进来,上传主角的参考图,10 分钟就拿到了全部分镜草稿,可以直接在这个基础上精修。 - -- **短视频策划**:短视频博主小李要拍一个 3 分钟的剧情短片,需要给摄影师看分镜。她把脚本输入,选择「写实」风格,生成的分镜图直接可以当拍摄参考。 - -- **动画前期**:动画工作室要向客户提案,需要快速出一版分镜来展示剧本节奏。策划人员用这个工具 30 分钟出了 50 张分镜图,当天就能开提案会。 - -- **小说可视化**:网文作者想给自己的小说做宣传图,把关键场景描述输入,生成的分镜图可以直接用于社交媒体宣传。 - -- **教学演示**:小学语文老师想把一篇课文变成连环画给学生看,把课文内容输入,选择「动漫」风格,生成的图片可以直接做成 PPT。 - -## 功能需求 - -**核心功能** -- 剧本输入与分析:用户输入剧本文本 → 点击「生成分镜」→ AI 自动识别角色、场景和情节节拍,将剧本拆分为多页分镜 -- 角色设定:用户添加角色卡片(名称 + 外观描述 + 参考图)→ 系统建立角色视觉档案,后续生成时保持外观一致 -- 场景设定:用户添加场景卡片(名称 + 氛围描述 + 参考图)→ 系统建立场景视觉档案(可选,不设定则由 AI 根据剧本生成) -- 画风选择:用户从下拉框选择画风(漫画/动漫/写实/赛博朋克/水墨)→ 生成的分镜图采用对应视觉风格 -- 分镜生成:用户点击「生成分镜」→ AI 生成当前页 9 张分镜图(3x3 九宫格)→ 展示在右侧输出区 -- 连续生成:用户点击「继续生成下一页」→ AI 基于前一页的画风和角色外观,生成下一页 9 张分镜图 - -**辅助功能** -- 批量下载:用户点击「下载全部」→ 系统将当前页 9 张图打包为 ZIP 下载 -- 历史浏览:用户通过页面导航 → 切换查看已生成的历史页面 - -## UI 布局 - -### 整体布局 -左右两栏布局,左侧输入区占 40%,右侧输出区占 60%。 - -### 左侧 - 输入区 -- 顶部:项目名称输入框 -- 剧本输入:多行文本框,placeholder「请输入剧本内容...」 -- 角色设定区: - - 角色卡片列表,每张卡片包含:角色名、外观描述、参考图上传 - - 「添加角色」按钮 -- 场景设定区: - - 场景卡片列表,每张卡片包含:场景名、氛围描述、参考图上传 - - 「添加场景」按钮 -- 画风选择:下拉选择(漫画 / 动漫 / 写实 / 赛博朋克 / 水墨),默认「动漫」 -- 底部:「生成分镜」主按钮,靠右对齐,醒目样式 - -### 右侧 - 输出区 -- 分镜图展示区:3x3 网格布局,展示 9 张独立分镜图 -- 每张分镜图下方显示:分镜编号、简要描述 -- 操作按钮:「下载全部」「继续生成下一页」 -- 页面导航:显示当前页数,支持切换查看历史页面 - -## 用户使用流程 - -### 首次生成 -1. 输入剧本内容 -2. 添加角色:填写名称、外观描述,上传参考图 -3. 添加场景:填写名称、氛围描述,上传参考图(可选) -4. 选择画风 -5. 点击「生成分镜」 -6. 在右侧查看生成的 9 张分镜图 -7. 点击「下载全部」保存 - -### 连续生成 -1. 完成首次生成后 -2. 点击「继续生成下一页」 -3. AI 基于前一页的画风和角色外观,生成下一页 9 张分镜图 -4. 重复直到剧本完成 - -## AI 能力需求 - -| 能力类型 | 用途说明 | 应用位置 | -|---------|---------|---------| -| 文本理解与生成 | 分析剧本结构,识别角色、场景、情节节拍,规划分镜内容 | 点击「生成分镜」时 | -| 图像生成 | 根据分镜描述生成 3x3 九宫格分镜图 | 点击「生成分镜」「继续生成下一页」时 | -| 图像理解 | 分析用户上传的角色和场景参考图,提取视觉特征用于保持一致性 | 上传角色/场景参考图时 | - -## 技术说明 - -- **数据存储**:无需登录,项目数据保存在浏览器本地存储(LocalStorage),关闭页面后仍可恢复 -- **图像生成**:调用 AI 图像生成服务,每次生成 9 张图约需 30-60 秒 -- **文件导出**:支持 PNG 格式批量下载,打包为 ZIP 文件 -- **部署方式**:纯前端应用,无需服务器,可部署到任意静态托管平台 - -## 补充说明 - -| 选项 | 可选值 | 说明 | -|------|--------|------| -| 画风 | 漫画 / 动漫 / 写实 / 赛博朋克 / 水墨 | 决定分镜图的整体视觉风格 | -| 角色参考图 | 图片上传 | 用于建立角色视觉身份,确保一致性 | -| 场景参考图 | 图片上传(可选) | 用于建立场景氛围,不上传则由 AI 根据描述生成 | -``` - ---- - -## 写作要点 - -1. **产品概述**: - - 一句话说清楚是什么 - - **必须明确写出目标用户**:是谁、有什么特点、什么痛点 - - 核心价值:用了这个产品能得到什么 - -2. **应用场景**: - - 具体的人 + 具体的情况 + 具体的用法 + 解决什么问题 - - 场景要有画面感,让人一看就懂 - - 放在功能需求之前,帮助理解产品价值 - -3. **功能需求**: - - 分「核心功能」和「辅助功能」 - - 每条格式:用户做什么 → 系统做什么 → 得到什么 - - 写清楚触发方式(点击什么按钮) - -4. **UI 布局**: - - 先写整体布局(几栏、比例) - - 再逐个区域描述内容 - - 控件要具体:下拉框写出所有选项和默认值,按钮写明位置和样式 - -5. **用户流程**:分步骤,可以有多条路径 - -6. **AI 能力需求**: - - 列出需要的 AI 能力类型 - - 说明具体用途 - - **写清楚在哪个环节触发**,方便开发理解调用时机 - -7. **技术说明**(可选): - - 数据存储方式 - - 外部服务依赖 - - 部署方式 - - 只在有技术约束时写,没有就不写 - -8. **补充说明**:用表格,适合解释选项、状态、逻辑 diff --git a/.claude/skills/project-guidelines-example/SKILL.md b/.claude/skills/project-guidelines-example/SKILL.md new file mode 100644 index 0000000..0135855 --- /dev/null +++ b/.claude/skills/project-guidelines-example/SKILL.md @@ -0,0 +1,345 @@ +# Project Guidelines Skill (Example) + +This is an example of a project-specific skill. Use this as a template for your own projects. + +Based on a real production application: [Zenith](https://zenith.chat) - AI-powered customer discovery platform. + +--- + +## When to Use + +Reference this skill when working on the specific project it's designed for. Project skills contain: +- Architecture overview +- File structure +- Code patterns +- Testing requirements +- Deployment workflow + +--- + +## Architecture Overview + +**Tech Stack:** +- **Frontend**: Next.js 15 (App Router), TypeScript, React +- **Backend**: FastAPI (Python), Pydantic models +- **Database**: Supabase (PostgreSQL) +- **AI**: Claude API with tool calling and structured output +- **Deployment**: Google Cloud Run +- **Testing**: Playwright (E2E), pytest (backend), React Testing Library + +**Services:** +``` +┌─────────────────────────────────────────────────────────────┐ +│ Frontend │ +│ Next.js 15 + TypeScript + TailwindCSS │ +│ Deployed: Vercel / Cloud Run │ +└─────────────────────────────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ Backend │ +│ FastAPI + Python 3.11 + Pydantic │ +│ Deployed: Cloud Run │ +└─────────────────────────────────────────────────────────────┘ + │ + ┌───────────────┼───────────────┐ + ▼ ▼ ▼ + ┌──────────┐ ┌──────────┐ ┌──────────┐ + │ Supabase │ │ Claude │ │ Redis │ + │ Database │ │ API │ │ Cache │ + └──────────┘ └──────────┘ └──────────┘ +``` + +--- + +## File Structure + +``` +project/ +├── frontend/ +│ └── src/ +│ ├── app/ # Next.js app router pages +│ │ ├── api/ # API routes +│ │ ├── (auth)/ # Auth-protected routes +│ │ └── workspace/ # Main app workspace +│ ├── components/ # React components +│ │ ├── ui/ # Base UI components +│ │ ├── forms/ # Form components +│ │ └── layouts/ # Layout components +│ ├── hooks/ # Custom React hooks +│ ├── lib/ # Utilities +│ ├── types/ # TypeScript definitions +│ └── config/ # Configuration +│ +├── backend/ +│ ├── routers/ # FastAPI route handlers +│ ├── models.py # Pydantic models +│ ├── main.py # FastAPI app entry +│ ├── auth_system.py # Authentication +│ ├── database.py # Database operations +│ ├── services/ # Business logic +│ └── tests/ # pytest tests +│ +├── deploy/ # Deployment configs +├── docs/ # Documentation +└── scripts/ # Utility scripts +``` + +--- + +## Code Patterns + +### API Response Format (FastAPI) + +```python +from pydantic import BaseModel +from typing import Generic, TypeVar, Optional + +T = TypeVar('T') + +class ApiResponse(BaseModel, Generic[T]): + success: bool + data: Optional[T] = None + error: Optional[str] = None + + @classmethod + def ok(cls, data: T) -> "ApiResponse[T]": + return cls(success=True, data=data) + + @classmethod + def fail(cls, error: str) -> "ApiResponse[T]": + return cls(success=False, error=error) +``` + +### Frontend API Calls (TypeScript) + +```typescript +interface ApiResponse { + success: boolean + data?: T + error?: string +} + +async function fetchApi( + endpoint: string, + options?: RequestInit +): Promise> { + try { + const response = await fetch(`/api${endpoint}`, { + ...options, + headers: { + 'Content-Type': 'application/json', + ...options?.headers, + }, + }) + + if (!response.ok) { + return { success: false, error: `HTTP ${response.status}` } + } + + return await response.json() + } catch (error) { + return { success: false, error: String(error) } + } +} +``` + +### Claude AI Integration (Structured Output) + +```python +from anthropic import Anthropic +from pydantic import BaseModel + +class AnalysisResult(BaseModel): + summary: str + key_points: list[str] + confidence: float + +async def analyze_with_claude(content: str) -> AnalysisResult: + client = Anthropic() + + response = client.messages.create( + model="claude-sonnet-4-5-20250514", + max_tokens=1024, + messages=[{"role": "user", "content": content}], + tools=[{ + "name": "provide_analysis", + "description": "Provide structured analysis", + "input_schema": AnalysisResult.model_json_schema() + }], + tool_choice={"type": "tool", "name": "provide_analysis"} + ) + + # Extract tool use result + tool_use = next( + block for block in response.content + if block.type == "tool_use" + ) + + return AnalysisResult(**tool_use.input) +``` + +### Custom Hooks (React) + +```typescript +import { useState, useCallback } from 'react' + +interface UseApiState { + data: T | null + loading: boolean + error: string | null +} + +export function useApi( + fetchFn: () => Promise> +) { + const [state, setState] = useState>({ + data: null, + loading: false, + error: null, + }) + + const execute = useCallback(async () => { + setState(prev => ({ ...prev, loading: true, error: null })) + + const result = await fetchFn() + + if (result.success) { + setState({ data: result.data!, loading: false, error: null }) + } else { + setState({ data: null, loading: false, error: result.error! }) + } + }, [fetchFn]) + + return { ...state, execute } +} +``` + +--- + +## Testing Requirements + +### Backend (pytest) + +```bash +# Run all tests +poetry run pytest tests/ + +# Run with coverage +poetry run pytest tests/ --cov=. --cov-report=html + +# Run specific test file +poetry run pytest tests/test_auth.py -v +``` + +**Test structure:** +```python +import pytest +from httpx import AsyncClient +from main import app + +@pytest.fixture +async def client(): + async with AsyncClient(app=app, base_url="http://test") as ac: + yield ac + +@pytest.mark.asyncio +async def test_health_check(client: AsyncClient): + response = await client.get("/health") + assert response.status_code == 200 + assert response.json()["status"] == "healthy" +``` + +### Frontend (React Testing Library) + +```bash +# Run tests +npm run test + +# Run with coverage +npm run test -- --coverage + +# Run E2E tests +npm run test:e2e +``` + +**Test structure:** +```typescript +import { render, screen, fireEvent } from '@testing-library/react' +import { WorkspacePanel } from './WorkspacePanel' + +describe('WorkspacePanel', () => { + it('renders workspace correctly', () => { + render() + expect(screen.getByRole('main')).toBeInTheDocument() + }) + + it('handles session creation', async () => { + render() + fireEvent.click(screen.getByText('New Session')) + expect(await screen.findByText('Session created')).toBeInTheDocument() + }) +}) +``` + +--- + +## Deployment Workflow + +### Pre-Deployment Checklist + +- [ ] All tests passing locally +- [ ] `npm run build` succeeds (frontend) +- [ ] `poetry run pytest` passes (backend) +- [ ] No hardcoded secrets +- [ ] Environment variables documented +- [ ] Database migrations ready + +### Deployment Commands + +```bash +# Build and deploy frontend +cd frontend && npm run build +gcloud run deploy frontend --source . + +# Build and deploy backend +cd backend +gcloud run deploy backend --source . +``` + +### Environment Variables + +```bash +# Frontend (.env.local) +NEXT_PUBLIC_API_URL=https://api.example.com +NEXT_PUBLIC_SUPABASE_URL=https://xxx.supabase.co +NEXT_PUBLIC_SUPABASE_ANON_KEY=eyJ... + +# Backend (.env) +DATABASE_URL=postgresql://... +ANTHROPIC_API_KEY=sk-ant-... +SUPABASE_URL=https://xxx.supabase.co +SUPABASE_KEY=eyJ... +``` + +--- + +## Critical Rules + +1. **No emojis** in code, comments, or documentation +2. **Immutability** - never mutate objects or arrays +3. **TDD** - write tests before implementation +4. **80% coverage** minimum +5. **Many small files** - 200-400 lines typical, 800 max +6. **No console.log** in production code +7. **Proper error handling** with try/catch +8. **Input validation** with Pydantic/Zod + +--- + +## Related Skills + +- `coding-standards.md` - General coding best practices +- `backend-patterns.md` - API and database patterns +- `frontend-patterns.md` - React and Next.js patterns +- `tdd-workflow/` - Test-driven development methodology diff --git a/.claude/skills/security-review/SKILL.md b/.claude/skills/security-review/SKILL.md new file mode 100644 index 0000000..81397dd --- /dev/null +++ b/.claude/skills/security-review/SKILL.md @@ -0,0 +1,568 @@ +--- +name: security-review +description: Use this skill when adding authentication, handling user input, working with secrets, creating API endpoints, or implementing payment/sensitive features. Provides comprehensive security checklist and patterns. +--- + +# Security Review Skill + +Security best practices for Python/FastAPI applications handling sensitive invoice data. + +## When to Activate + +- Implementing authentication or authorization +- Handling user input or file uploads +- Creating new API endpoints +- Working with secrets or credentials +- Processing sensitive invoice data +- Integrating third-party APIs +- Database operations with user data + +## Security Checklist + +### 1. Secrets Management + +#### NEVER Do This +```python +# Hardcoded secrets - CRITICAL VULNERABILITY +api_key = "sk-proj-xxxxx" +db_password = "password123" +``` + +#### ALWAYS Do This +```python +import os +from pydantic_settings import BaseSettings + +class Settings(BaseSettings): + db_password: str + api_key: str + model_path: str = "runs/train/invoice_fields/weights/best.pt" + + class Config: + env_file = ".env" + +settings = Settings() + +# Verify secrets exist +if not settings.db_password: + raise RuntimeError("DB_PASSWORD not configured") +``` + +#### Verification Steps +- [ ] No hardcoded API keys, tokens, or passwords +- [ ] All secrets in environment variables +- [ ] `.env` in .gitignore +- [ ] No secrets in git history +- [ ] `.env.example` with placeholder values + +### 2. Input Validation + +#### Always Validate User Input +```python +from pydantic import BaseModel, Field, field_validator +from fastapi import HTTPException +import re + +class InvoiceRequest(BaseModel): + invoice_number: str = Field(..., min_length=1, max_length=50) + amount: float = Field(..., gt=0, le=1_000_000) + bankgiro: str | None = None + + @field_validator("invoice_number") + @classmethod + def validate_invoice_number(cls, v: str) -> str: + # Whitelist validation - only allow safe characters + if not re.match(r"^[A-Za-z0-9\-_]+$", v): + raise ValueError("Invalid invoice number format") + return v + + @field_validator("bankgiro") + @classmethod + def validate_bankgiro(cls, v: str | None) -> str | None: + if v is None: + return None + cleaned = re.sub(r"[^0-9]", "", v) + if not (7 <= len(cleaned) <= 8): + raise ValueError("Bankgiro must be 7-8 digits") + return cleaned +``` + +#### File Upload Validation +```python +from fastapi import UploadFile, HTTPException +from pathlib import Path + +ALLOWED_EXTENSIONS = {".pdf"} +MAX_FILE_SIZE = 10 * 1024 * 1024 # 10MB + +async def validate_pdf_upload(file: UploadFile) -> bytes: + """Validate PDF upload with security checks.""" + # Extension check + ext = Path(file.filename or "").suffix.lower() + if ext not in ALLOWED_EXTENSIONS: + raise HTTPException(400, f"Only PDF files allowed, got {ext}") + + # Read content + content = await file.read() + + # Size check + if len(content) > MAX_FILE_SIZE: + raise HTTPException(400, f"File too large (max {MAX_FILE_SIZE // 1024 // 1024}MB)") + + # Magic bytes check (PDF signature) + if not content.startswith(b"%PDF"): + raise HTTPException(400, "Invalid PDF file format") + + return content +``` + +#### Verification Steps +- [ ] All user inputs validated with Pydantic +- [ ] File uploads restricted (size, type, extension, magic bytes) +- [ ] No direct use of user input in queries +- [ ] Whitelist validation (not blacklist) +- [ ] Error messages don't leak sensitive info + +### 3. SQL Injection Prevention + +#### NEVER Concatenate SQL +```python +# DANGEROUS - SQL Injection vulnerability +query = f"SELECT * FROM documents WHERE id = '{user_input}'" +cur.execute(query) +``` + +#### ALWAYS Use Parameterized Queries +```python +import psycopg2 + +# Safe - parameterized query with %s placeholders +cur.execute( + "SELECT * FROM documents WHERE id = %s AND status = %s", + (document_id, status) +) + +# Safe - named parameters +cur.execute( + "SELECT * FROM documents WHERE id = %(id)s", + {"id": document_id} +) + +# Safe - psycopg2.sql for dynamic identifiers +from psycopg2 import sql + +cur.execute( + sql.SQL("SELECT {} FROM {} WHERE id = %s").format( + sql.Identifier("invoice_number"), + sql.Identifier("documents") + ), + (document_id,) +) +``` + +#### Verification Steps +- [ ] All database queries use parameterized queries (%s or %(name)s) +- [ ] No string concatenation or f-strings in SQL +- [ ] psycopg2.sql module used for dynamic identifiers +- [ ] No user input in table/column names + +### 4. Path Traversal Prevention + +#### NEVER Trust User Paths +```python +# DANGEROUS - Path traversal vulnerability +filename = request.query_params.get("file") +with open(f"/data/{filename}", "r") as f: # Attacker: ../../../etc/passwd + return f.read() +``` + +#### ALWAYS Validate Paths +```python +from pathlib import Path + +ALLOWED_DIR = Path("/data/uploads").resolve() + +def get_safe_path(filename: str) -> Path: + """Get safe file path, preventing path traversal.""" + # Remove any path components + safe_name = Path(filename).name + + # Validate filename characters + if not re.match(r"^[A-Za-z0-9_\-\.]+$", safe_name): + raise HTTPException(400, "Invalid filename") + + # Resolve and verify within allowed directory + full_path = (ALLOWED_DIR / safe_name).resolve() + + if not full_path.is_relative_to(ALLOWED_DIR): + raise HTTPException(400, "Invalid file path") + + return full_path +``` + +#### Verification Steps +- [ ] User-provided filenames sanitized +- [ ] Paths resolved and validated against allowed directory +- [ ] No direct concatenation of user input into paths +- [ ] Whitelist characters in filenames + +### 5. Authentication & Authorization + +#### API Key Validation +```python +from fastapi import Depends, HTTPException, Security +from fastapi.security import APIKeyHeader + +api_key_header = APIKeyHeader(name="X-API-Key", auto_error=False) + +async def verify_api_key(api_key: str = Security(api_key_header)) -> str: + if not api_key: + raise HTTPException(401, "API key required") + + # Constant-time comparison to prevent timing attacks + import hmac + if not hmac.compare_digest(api_key, settings.api_key): + raise HTTPException(403, "Invalid API key") + + return api_key + +@router.post("/infer") +async def infer( + file: UploadFile, + api_key: str = Depends(verify_api_key) +): + ... +``` + +#### Role-Based Access Control +```python +from enum import Enum + +class UserRole(str, Enum): + USER = "user" + ADMIN = "admin" + +def require_role(required_role: UserRole): + async def role_checker(current_user: User = Depends(get_current_user)): + if current_user.role != required_role: + raise HTTPException(403, "Insufficient permissions") + return current_user + return role_checker + +@router.delete("/documents/{doc_id}") +async def delete_document( + doc_id: str, + user: User = Depends(require_role(UserRole.ADMIN)) +): + ... +``` + +#### Verification Steps +- [ ] API keys validated with constant-time comparison +- [ ] Authorization checks before sensitive operations +- [ ] Role-based access control implemented +- [ ] Session/token validation on protected routes + +### 6. Rate Limiting + +#### Rate Limiter Implementation +```python +from time import time +from collections import defaultdict +from fastapi import Request, HTTPException + +class RateLimiter: + def __init__(self): + self.requests: dict[str, list[float]] = defaultdict(list) + + def check_limit( + self, + identifier: str, + max_requests: int, + window_seconds: int + ) -> bool: + now = time() + # Clean old requests + self.requests[identifier] = [ + t for t in self.requests[identifier] + if now - t < window_seconds + ] + # Check limit + if len(self.requests[identifier]) >= max_requests: + return False + self.requests[identifier].append(now) + return True + +limiter = RateLimiter() + +@app.middleware("http") +async def rate_limit_middleware(request: Request, call_next): + client_ip = request.client.host if request.client else "unknown" + + # 100 requests per minute for general endpoints + if not limiter.check_limit(client_ip, max_requests=100, window_seconds=60): + raise HTTPException(429, "Rate limit exceeded. Try again later.") + + return await call_next(request) +``` + +#### Stricter Limits for Expensive Operations +```python +# Inference endpoint: 10 requests per minute +async def check_inference_rate_limit(request: Request): + client_ip = request.client.host if request.client else "unknown" + if not limiter.check_limit(f"infer:{client_ip}", max_requests=10, window_seconds=60): + raise HTTPException(429, "Inference rate limit exceeded") + +@router.post("/infer") +async def infer( + file: UploadFile, + _: None = Depends(check_inference_rate_limit) +): + ... +``` + +#### Verification Steps +- [ ] Rate limiting on all API endpoints +- [ ] Stricter limits on expensive operations (inference, OCR) +- [ ] IP-based rate limiting +- [ ] Clear error messages for rate-limited requests + +### 7. Sensitive Data Exposure + +#### Logging +```python +import logging + +logger = logging.getLogger(__name__) + +# WRONG: Logging sensitive data +logger.info(f"Processing invoice: {invoice_data}") # May contain sensitive info +logger.error(f"DB error with password: {db_password}") + +# CORRECT: Redact sensitive data +logger.info(f"Processing invoice: id={doc_id}") +logger.error(f"DB connection failed to {db_host}:{db_port}") + +# CORRECT: Structured logging with safe fields only +logger.info( + "Invoice processed", + extra={ + "document_id": doc_id, + "field_count": len(fields), + "processing_time_ms": elapsed_ms + } +) +``` + +#### Error Messages +```python +# WRONG: Exposing internal details +@app.exception_handler(Exception) +async def error_handler(request: Request, exc: Exception): + return JSONResponse( + status_code=500, + content={ + "error": str(exc), + "traceback": traceback.format_exc() # NEVER expose! + } + ) + +# CORRECT: Generic error messages +@app.exception_handler(Exception) +async def error_handler(request: Request, exc: Exception): + logger.error(f"Unhandled error: {exc}", exc_info=True) # Log internally + return JSONResponse( + status_code=500, + content={"success": False, "error": "An error occurred"} + ) +``` + +#### Verification Steps +- [ ] No passwords, tokens, or secrets in logs +- [ ] Error messages generic for users +- [ ] Detailed errors only in server logs +- [ ] No stack traces exposed to users +- [ ] Invoice data (amounts, account numbers) not logged + +### 8. CORS Configuration + +```python +from fastapi.middleware.cors import CORSMiddleware + +# WRONG: Allow all origins +app.add_middleware( + CORSMiddleware, + allow_origins=["*"], # DANGEROUS in production + allow_credentials=True, +) + +# CORRECT: Specific origins +ALLOWED_ORIGINS = [ + "http://localhost:8000", + "https://your-domain.com", +] + +app.add_middleware( + CORSMiddleware, + allow_origins=ALLOWED_ORIGINS, + allow_credentials=True, + allow_methods=["GET", "POST"], + allow_headers=["*"], +) +``` + +#### Verification Steps +- [ ] CORS origins explicitly listed +- [ ] No wildcard origins in production +- [ ] Credentials only with specific origins + +### 9. Temporary File Security + +```python +import tempfile +from pathlib import Path +from contextlib import contextmanager + +@contextmanager +def secure_temp_file(suffix: str = ".pdf"): + """Create secure temporary file that is always cleaned up.""" + tmp_path = None + try: + with tempfile.NamedTemporaryFile( + suffix=suffix, + delete=False, + dir="/tmp/invoice-master" # Dedicated temp directory + ) as tmp: + tmp_path = Path(tmp.name) + yield tmp_path + finally: + if tmp_path and tmp_path.exists(): + tmp_path.unlink() + +# Usage +async def process_upload(file: UploadFile): + with secure_temp_file(".pdf") as tmp_path: + content = await validate_pdf_upload(file) + tmp_path.write_bytes(content) + result = pipeline.process(tmp_path) + # File automatically cleaned up + return result +``` + +#### Verification Steps +- [ ] Temporary files always cleaned up (use context managers) +- [ ] Temp directory has restricted permissions +- [ ] No leftover files after processing errors + +### 10. Dependency Security + +#### Regular Updates +```bash +# Check for vulnerabilities +pip-audit + +# Update dependencies +pip install --upgrade -r requirements.txt + +# Check for outdated packages +pip list --outdated +``` + +#### Lock Files +```bash +# Create requirements lock file +pip freeze > requirements.lock + +# Install from lock file for reproducible builds +pip install -r requirements.lock +``` + +#### Verification Steps +- [ ] Dependencies up to date +- [ ] No known vulnerabilities (pip-audit clean) +- [ ] requirements.txt pinned versions +- [ ] Regular security updates scheduled + +## Security Testing + +### Automated Security Tests +```python +import pytest +from fastapi.testclient import TestClient + +def test_requires_api_key(client: TestClient): + """Test authentication required.""" + response = client.post("/api/v1/infer") + assert response.status_code == 401 + +def test_invalid_api_key_rejected(client: TestClient): + """Test invalid API key rejected.""" + response = client.post( + "/api/v1/infer", + headers={"X-API-Key": "invalid-key"} + ) + assert response.status_code == 403 + +def test_sql_injection_prevented(client: TestClient): + """Test SQL injection attempt rejected.""" + response = client.get( + "/api/v1/documents", + params={"id": "'; DROP TABLE documents; --"} + ) + # Should return validation error, not execute SQL + assert response.status_code in (400, 422) + +def test_path_traversal_prevented(client: TestClient): + """Test path traversal attempt rejected.""" + response = client.get("/api/v1/results/../../etc/passwd") + assert response.status_code == 400 + +def test_rate_limit_enforced(client: TestClient): + """Test rate limiting works.""" + responses = [ + client.post("/api/v1/infer", files={"file": b"test"}) + for _ in range(15) + ] + rate_limited = [r for r in responses if r.status_code == 429] + assert len(rate_limited) > 0 + +def test_large_file_rejected(client: TestClient): + """Test file size limit enforced.""" + large_content = b"x" * (11 * 1024 * 1024) # 11MB + response = client.post( + "/api/v1/infer", + files={"file": ("test.pdf", large_content)} + ) + assert response.status_code == 400 +``` + +## Pre-Deployment Security Checklist + +Before ANY production deployment: + +- [ ] **Secrets**: No hardcoded secrets, all in env vars +- [ ] **Input Validation**: All user inputs validated with Pydantic +- [ ] **SQL Injection**: All queries use parameterized queries +- [ ] **Path Traversal**: File paths validated and sanitized +- [ ] **Authentication**: API key or token validation +- [ ] **Authorization**: Role checks in place +- [ ] **Rate Limiting**: Enabled on all endpoints +- [ ] **HTTPS**: Enforced in production +- [ ] **CORS**: Properly configured (no wildcards) +- [ ] **Error Handling**: No sensitive data in errors +- [ ] **Logging**: No sensitive data logged +- [ ] **File Uploads**: Validated (size, type, magic bytes) +- [ ] **Temp Files**: Always cleaned up +- [ ] **Dependencies**: Up to date, no vulnerabilities + +## Resources + +- [OWASP Top 10](https://owasp.org/www-project-top-ten/) +- [FastAPI Security](https://fastapi.tiangolo.com/tutorial/security/) +- [Bandit (Python Security Linter)](https://bandit.readthedocs.io/) +- [pip-audit](https://pypi.org/project/pip-audit/) + +--- + +**Remember**: Security is not optional. One vulnerability can compromise sensitive invoice data. When in doubt, err on the side of caution. diff --git a/.claude/skills/strategic-compact/SKILL.md b/.claude/skills/strategic-compact/SKILL.md new file mode 100644 index 0000000..394a86b --- /dev/null +++ b/.claude/skills/strategic-compact/SKILL.md @@ -0,0 +1,63 @@ +--- +name: strategic-compact +description: Suggests manual context compaction at logical intervals to preserve context through task phases rather than arbitrary auto-compaction. +--- + +# Strategic Compact Skill + +Suggests manual `/compact` at strategic points in your workflow rather than relying on arbitrary auto-compaction. + +## Why Strategic Compaction? + +Auto-compaction triggers at arbitrary points: +- Often mid-task, losing important context +- No awareness of logical task boundaries +- Can interrupt complex multi-step operations + +Strategic compaction at logical boundaries: +- **After exploration, before execution** - Compact research context, keep implementation plan +- **After completing a milestone** - Fresh start for next phase +- **Before major context shifts** - Clear exploration context before different task + +## How It Works + +The `suggest-compact.sh` script runs on PreToolUse (Edit/Write) and: + +1. **Tracks tool calls** - Counts tool invocations in session +2. **Threshold detection** - Suggests at configurable threshold (default: 50 calls) +3. **Periodic reminders** - Reminds every 25 calls after threshold + +## Hook Setup + +Add to your `~/.claude/settings.json`: + +```json +{ + "hooks": { + "PreToolUse": [{ + "matcher": "tool == \"Edit\" || tool == \"Write\"", + "hooks": [{ + "type": "command", + "command": "~/.claude/skills/strategic-compact/suggest-compact.sh" + }] + }] + } +} +``` + +## Configuration + +Environment variables: +- `COMPACT_THRESHOLD` - Tool calls before first suggestion (default: 50) + +## Best Practices + +1. **Compact after planning** - Once plan is finalized, compact to start fresh +2. **Compact after debugging** - Clear error-resolution context before continuing +3. **Don't compact mid-implementation** - Preserve context for related changes +4. **Read the suggestion** - The hook tells you *when*, you decide *if* + +## Related + +- [The Longform Guide](https://x.com/affaanmustafa/status/2014040193557471352) - Token optimization section +- Memory persistence hooks - For state that survives compaction diff --git a/.claude/skills/strategic-compact/suggest-compact.sh b/.claude/skills/strategic-compact/suggest-compact.sh new file mode 100644 index 0000000..ea14920 --- /dev/null +++ b/.claude/skills/strategic-compact/suggest-compact.sh @@ -0,0 +1,52 @@ +#!/bin/bash +# Strategic Compact Suggester +# Runs on PreToolUse or periodically to suggest manual compaction at logical intervals +# +# Why manual over auto-compact: +# - Auto-compact happens at arbitrary points, often mid-task +# - Strategic compacting preserves context through logical phases +# - Compact after exploration, before execution +# - Compact after completing a milestone, before starting next +# +# Hook config (in ~/.claude/settings.json): +# { +# "hooks": { +# "PreToolUse": [{ +# "matcher": "Edit|Write", +# "hooks": [{ +# "type": "command", +# "command": "~/.claude/skills/strategic-compact/suggest-compact.sh" +# }] +# }] +# } +# } +# +# Criteria for suggesting compact: +# - Session has been running for extended period +# - Large number of tool calls made +# - Transitioning from research/exploration to implementation +# - Plan has been finalized + +# Track tool call count (increment in a temp file) +COUNTER_FILE="/tmp/claude-tool-count-$$" +THRESHOLD=${COMPACT_THRESHOLD:-50} + +# Initialize or increment counter +if [ -f "$COUNTER_FILE" ]; then + count=$(cat "$COUNTER_FILE") + count=$((count + 1)) + echo "$count" > "$COUNTER_FILE" +else + echo "1" > "$COUNTER_FILE" + count=1 +fi + +# Suggest compact after threshold tool calls +if [ "$count" -eq "$THRESHOLD" ]; then + echo "[StrategicCompact] $THRESHOLD tool calls reached - consider /compact if transitioning phases" >&2 +fi + +# Suggest at regular intervals after threshold +if [ "$count" -gt "$THRESHOLD" ] && [ $((count % 25)) -eq 0 ]; then + echo "[StrategicCompact] $count tool calls - good checkpoint for /compact if context is stale" >&2 +fi diff --git a/.claude/skills/tdd-workflow/SKILL.md b/.claude/skills/tdd-workflow/SKILL.md new file mode 100644 index 0000000..c3ef042 --- /dev/null +++ b/.claude/skills/tdd-workflow/SKILL.md @@ -0,0 +1,553 @@ +--- +name: tdd-workflow +description: Use this skill when writing new features, fixing bugs, or refactoring code. Enforces test-driven development with 80%+ coverage including unit, integration, and E2E tests. +--- + +# Test-Driven Development Workflow + +TDD principles for Python/FastAPI development with pytest. + +## When to Activate + +- Writing new features or functionality +- Fixing bugs or issues +- Refactoring existing code +- Adding API endpoints +- Creating new field extractors or normalizers + +## Core Principles + +### 1. Tests BEFORE Code +ALWAYS write tests first, then implement code to make tests pass. + +### 2. Coverage Requirements +- Minimum 80% coverage (unit + integration + E2E) +- All edge cases covered +- Error scenarios tested +- Boundary conditions verified + +### 3. Test Types + +#### Unit Tests +- Individual functions and utilities +- Normalizers and validators +- Parsers and extractors +- Pure functions + +#### Integration Tests +- API endpoints +- Database operations +- OCR + YOLO pipeline +- Service interactions + +#### E2E Tests +- Complete inference pipeline +- PDF → Fields workflow +- API health and inference endpoints + +## TDD Workflow Steps + +### Step 1: Write User Journeys +``` +As a [role], I want to [action], so that [benefit] + +Example: +As an invoice processor, I want to extract Bankgiro from payment_line, +so that I can cross-validate OCR results. +``` + +### Step 2: Generate Test Cases +For each user journey, create comprehensive test cases: + +```python +import pytest + +class TestPaymentLineParser: + """Tests for payment_line parsing and field extraction.""" + + def test_parse_payment_line_extracts_bankgiro(self): + """Should extract Bankgiro from valid payment line.""" + # Test implementation + pass + + def test_parse_payment_line_handles_missing_checksum(self): + """Should handle payment lines without checksum.""" + pass + + def test_parse_payment_line_validates_checksum(self): + """Should validate checksum when present.""" + pass + + def test_parse_payment_line_returns_none_for_invalid(self): + """Should return None for invalid payment lines.""" + pass +``` + +### Step 3: Run Tests (They Should Fail) +```bash +pytest tests/test_ocr/test_machine_code_parser.py -v +# Tests should fail - we haven't implemented yet +``` + +### Step 4: Implement Code +Write minimal code to make tests pass: + +```python +def parse_payment_line(line: str) -> PaymentLineData | None: + """Parse Swedish payment line and extract fields.""" + # Implementation guided by tests + pass +``` + +### Step 5: Run Tests Again +```bash +pytest tests/test_ocr/test_machine_code_parser.py -v +# Tests should now pass +``` + +### Step 6: Refactor +Improve code quality while keeping tests green: +- Remove duplication +- Improve naming +- Optimize performance +- Enhance readability + +### Step 7: Verify Coverage +```bash +pytest --cov=src --cov-report=term-missing +# Verify 80%+ coverage achieved +``` + +## Testing Patterns + +### Unit Test Pattern (pytest) +```python +import pytest +from src.normalize.bankgiro_normalizer import normalize_bankgiro + +class TestBankgiroNormalizer: + """Tests for Bankgiro normalization.""" + + def test_normalize_removes_hyphens(self): + """Should remove hyphens from Bankgiro.""" + result = normalize_bankgiro("123-4567") + assert result == "1234567" + + def test_normalize_removes_spaces(self): + """Should remove spaces from Bankgiro.""" + result = normalize_bankgiro("123 4567") + assert result == "1234567" + + def test_normalize_validates_length(self): + """Should validate Bankgiro is 7-8 digits.""" + result = normalize_bankgiro("123456") # 6 digits + assert result is None + + def test_normalize_validates_checksum(self): + """Should validate Luhn checksum.""" + result = normalize_bankgiro("1234568") # Invalid checksum + assert result is None + + @pytest.mark.parametrize("input_value,expected", [ + ("123-4567", "1234567"), + ("1234567", "1234567"), + ("123 4567", "1234567"), + ("BG 123-4567", "1234567"), + ]) + def test_normalize_various_formats(self, input_value, expected): + """Should handle various input formats.""" + result = normalize_bankgiro(input_value) + assert result == expected +``` + +### API Integration Test Pattern +```python +import pytest +from fastapi.testclient import TestClient +from src.web.app import app + +@pytest.fixture +def client(): + return TestClient(app) + +class TestHealthEndpoint: + """Tests for /api/v1/health endpoint.""" + + def test_health_returns_200(self, client): + """Should return 200 OK.""" + response = client.get("/api/v1/health") + assert response.status_code == 200 + + def test_health_returns_status(self, client): + """Should return health status.""" + response = client.get("/api/v1/health") + data = response.json() + assert data["status"] == "healthy" + assert "model_loaded" in data + +class TestInferEndpoint: + """Tests for /api/v1/infer endpoint.""" + + def test_infer_requires_file(self, client): + """Should require file upload.""" + response = client.post("/api/v1/infer") + assert response.status_code == 422 + + def test_infer_rejects_non_pdf(self, client): + """Should reject non-PDF files.""" + response = client.post( + "/api/v1/infer", + files={"file": ("test.txt", b"not a pdf", "text/plain")} + ) + assert response.status_code == 400 + + def test_infer_returns_fields(self, client, sample_invoice_pdf): + """Should return extracted fields.""" + with open(sample_invoice_pdf, "rb") as f: + response = client.post( + "/api/v1/infer", + files={"file": ("invoice.pdf", f, "application/pdf")} + ) + assert response.status_code == 200 + data = response.json() + assert data["success"] is True + assert "fields" in data +``` + +### E2E Test Pattern +```python +import pytest +import httpx +from pathlib import Path + +@pytest.fixture(scope="module") +def running_server(): + """Ensure server is running for E2E tests.""" + # Server should be started before running E2E tests + base_url = "http://localhost:8000" + yield base_url + +class TestInferencePipeline: + """E2E tests for complete inference pipeline.""" + + def test_health_check(self, running_server): + """Should pass health check.""" + response = httpx.get(f"{running_server}/api/v1/health") + assert response.status_code == 200 + data = response.json() + assert data["status"] == "healthy" + assert data["model_loaded"] is True + + def test_pdf_inference_returns_fields(self, running_server): + """Should extract fields from PDF.""" + pdf_path = Path("tests/fixtures/sample_invoice.pdf") + with open(pdf_path, "rb") as f: + response = httpx.post( + f"{running_server}/api/v1/infer", + files={"file": ("invoice.pdf", f, "application/pdf")} + ) + + assert response.status_code == 200 + data = response.json() + assert data["success"] is True + assert "fields" in data + assert len(data["fields"]) > 0 + + def test_cross_validation_included(self, running_server): + """Should include cross-validation for invoices with payment_line.""" + pdf_path = Path("tests/fixtures/invoice_with_payment_line.pdf") + with open(pdf_path, "rb") as f: + response = httpx.post( + f"{running_server}/api/v1/infer", + files={"file": ("invoice.pdf", f, "application/pdf")} + ) + + data = response.json() + if data["fields"].get("payment_line"): + assert "cross_validation" in data +``` + +## Test File Organization + +``` +tests/ +├── conftest.py # Shared fixtures +├── fixtures/ # Test data files +│ ├── sample_invoice.pdf +│ └── invoice_with_payment_line.pdf +├── test_cli/ +│ └── test_infer.py +├── test_pdf/ +│ ├── test_extractor.py +│ └── test_renderer.py +├── test_ocr/ +│ ├── test_paddle_ocr.py +│ └── test_machine_code_parser.py +├── test_inference/ +│ ├── test_pipeline.py +│ ├── test_yolo_detector.py +│ └── test_field_extractor.py +├── test_normalize/ +│ ├── test_bankgiro_normalizer.py +│ ├── test_date_normalizer.py +│ └── test_amount_normalizer.py +├── test_web/ +│ ├── test_routes.py +│ └── test_services.py +└── e2e/ + └── test_inference_e2e.py +``` + +## Mocking External Services + +### Mock PaddleOCR +```python +import pytest +from unittest.mock import Mock, patch + +@pytest.fixture +def mock_paddle_ocr(): + """Mock PaddleOCR for unit tests.""" + with patch("src.ocr.paddle_ocr.PaddleOCR") as mock: + instance = Mock() + instance.ocr.return_value = [ + [ + [[[0, 0], [100, 0], [100, 20], [0, 20]], ("Invoice Number", 0.95)], + [[[0, 30], [100, 30], [100, 50], [0, 50]], ("INV-2024-001", 0.98)] + ] + ] + mock.return_value = instance + yield instance +``` + +### Mock YOLO Model +```python +@pytest.fixture +def mock_yolo_model(): + """Mock YOLO model for unit tests.""" + with patch("src.inference.yolo_detector.YOLO") as mock: + instance = Mock() + # Mock detection results + instance.return_value = Mock( + boxes=Mock( + xyxy=[[10, 20, 100, 50]], + conf=[0.95], + cls=[0] # invoice_number class + ) + ) + mock.return_value = instance + yield instance +``` + +### Mock Database +```python +@pytest.fixture +def mock_db_connection(): + """Mock database connection for unit tests.""" + with patch("src.data.db.get_db_connection") as mock: + conn = Mock() + cursor = Mock() + cursor.fetchall.return_value = [ + ("doc-123", "processed", {"invoice_number": "INV-001"}) + ] + cursor.fetchone.return_value = ("doc-123",) + conn.cursor.return_value.__enter__ = Mock(return_value=cursor) + conn.cursor.return_value.__exit__ = Mock(return_value=False) + mock.return_value.__enter__ = Mock(return_value=conn) + mock.return_value.__exit__ = Mock(return_value=False) + yield conn +``` + +## Test Coverage Verification + +### Run Coverage Report +```bash +# Run with coverage +pytest --cov=src --cov-report=term-missing + +# Generate HTML report +pytest --cov=src --cov-report=html +# Open htmlcov/index.html in browser +``` + +### Coverage Configuration (pyproject.toml) +```toml +[tool.coverage.run] +source = ["src"] +omit = ["*/__init__.py", "*/test_*.py"] + +[tool.coverage.report] +fail_under = 80 +show_missing = true +exclude_lines = [ + "pragma: no cover", + "if TYPE_CHECKING:", + "raise NotImplementedError", +] +``` + +## Common Testing Mistakes to Avoid + +### WRONG: Testing Implementation Details +```python +# Don't test internal state +def test_parser_internal_state(): + parser = PaymentLineParser() + parser._parse("...") + assert parser._groups == [...] # Internal state +``` + +### CORRECT: Test Public Interface +```python +# Test what users see +def test_parser_extracts_bankgiro(): + result = parse_payment_line("...") + assert result.bankgiro == "1234567" +``` + +### WRONG: No Test Isolation +```python +# Tests depend on each other +class TestDocuments: + def test_creates_document(self): + create_document(...) # Creates in DB + + def test_updates_document(self): + update_document(...) # Depends on previous test +``` + +### CORRECT: Independent Tests +```python +# Each test sets up its own data +class TestDocuments: + def test_creates_document(self, mock_db): + result = create_document(...) + assert result.id is not None + + def test_updates_document(self, mock_db): + # Create own test data + doc = create_document(...) + result = update_document(doc.id, ...) + assert result.status == "updated" +``` + +### WRONG: Testing Too Much +```python +# One test doing everything +def test_full_invoice_processing(): + # Load PDF + # Extract images + # Run YOLO + # Run OCR + # Normalize fields + # Save to DB + # Return response +``` + +### CORRECT: Focused Tests +```python +def test_yolo_detects_invoice_number(): + """Test only YOLO detection.""" + result = detector.detect(image) + assert any(d.label == "invoice_number" for d in result) + +def test_ocr_extracts_text(): + """Test only OCR extraction.""" + result = ocr.extract(image, bbox) + assert result == "INV-2024-001" + +def test_normalizer_formats_date(): + """Test only date normalization.""" + result = normalize_date("2024-01-15") + assert result == "2024-01-15" +``` + +## Fixtures (conftest.py) + +```python +import pytest +from pathlib import Path +from fastapi.testclient import TestClient + +@pytest.fixture +def sample_invoice_pdf(tmp_path: Path) -> Path: + """Create sample invoice PDF for testing.""" + pdf_path = tmp_path / "invoice.pdf" + # Copy from fixtures or create minimal PDF + src = Path("tests/fixtures/sample_invoice.pdf") + if src.exists(): + pdf_path.write_bytes(src.read_bytes()) + return pdf_path + +@pytest.fixture +def client(): + """FastAPI test client.""" + from src.web.app import app + return TestClient(app) + +@pytest.fixture +def sample_payment_line() -> str: + """Sample Swedish payment line for testing.""" + return "1234567#0000000012345#230115#00012345678901234567#1" +``` + +## Continuous Testing + +### Watch Mode During Development +```bash +# Using pytest-watch +ptw -- tests/test_ocr/ +# Tests run automatically on file changes +``` + +### Pre-Commit Hook +```bash +# .pre-commit-config.yaml +repos: + - repo: local + hooks: + - id: pytest + name: pytest + entry: pytest --tb=short -q + language: system + pass_filenames: false + always_run: true +``` + +### CI/CD Integration (GitHub Actions) +```yaml +- name: Run Tests + run: | + pytest --cov=src --cov-report=xml + +- name: Upload Coverage + uses: codecov/codecov-action@v3 + with: + file: coverage.xml +``` + +## Best Practices + +1. **Write Tests First** - Always TDD +2. **One Assert Per Test** - Focus on single behavior +3. **Descriptive Test Names** - `test___` +4. **Arrange-Act-Assert** - Clear test structure +5. **Mock External Dependencies** - Isolate unit tests +6. **Test Edge Cases** - None, empty, invalid, boundary +7. **Test Error Paths** - Not just happy paths +8. **Keep Tests Fast** - Unit tests < 50ms each +9. **Clean Up After Tests** - Use fixtures with cleanup +10. **Review Coverage Reports** - Identify gaps + +## Success Metrics + +- 80%+ code coverage achieved +- All tests passing (green) +- No skipped or disabled tests +- Fast test execution (< 60s for unit tests) +- E2E tests cover critical inference flow +- Tests catch bugs before production + +--- + +**Remember**: Tests are not optional. They are the safety net that enables confident refactoring, rapid development, and production reliability. diff --git a/.claude/skills/verification-loop/SKILL.md b/.claude/skills/verification-loop/SKILL.md new file mode 100644 index 0000000..0c2f000 --- /dev/null +++ b/.claude/skills/verification-loop/SKILL.md @@ -0,0 +1,242 @@ +# Verification Loop Skill + +Comprehensive verification system for Python/FastAPI development. + +## When to Use + +Invoke this skill: +- After completing a feature or significant code change +- Before creating a PR +- When you want to ensure quality gates pass +- After refactoring +- Before deployment + +## Verification Phases + +### Phase 1: Type Check +```bash +# Run mypy type checker +mypy src/ --ignore-missing-imports 2>&1 | head -30 +``` + +Report all type errors. Fix critical ones before continuing. + +### Phase 2: Lint Check +```bash +# Run ruff linter +ruff check src/ 2>&1 | head -30 + +# Auto-fix if desired +ruff check src/ --fix +``` + +Check for: +- Unused imports +- Code style violations +- Common Python anti-patterns + +### Phase 3: Test Suite +```bash +# Run tests with coverage +pytest --cov=src --cov-report=term-missing -q 2>&1 | tail -50 + +# Run specific test file +pytest tests/test_ocr/test_machine_code_parser.py -v + +# Run with short traceback +pytest -x --tb=short +``` + +Report: +- Total tests: X +- Passed: X +- Failed: X +- Coverage: X% +- Target: 80% minimum + +### Phase 4: Security Scan +```bash +# Check for hardcoded secrets +grep -rn "password\s*=" --include="*.py" src/ 2>/dev/null | grep -v "db_password:" | head -10 +grep -rn "api_key\s*=" --include="*.py" src/ 2>/dev/null | head -10 +grep -rn "sk-" --include="*.py" src/ 2>/dev/null | head -10 + +# Check for print statements (should use logging) +grep -rn "print(" --include="*.py" src/ 2>/dev/null | head -10 + +# Check for bare except +grep -rn "except:" --include="*.py" src/ 2>/dev/null | head -10 + +# Check for SQL injection risks (f-strings in execute) +grep -rn 'execute(f"' --include="*.py" src/ 2>/dev/null | head -10 +grep -rn "execute(f'" --include="*.py" src/ 2>/dev/null | head -10 +``` + +### Phase 5: Import Check +```bash +# Verify all imports work +python -c "from src.web.app import app; print('Web app OK')" +python -c "from src.inference.pipeline import InferencePipeline; print('Pipeline OK')" +python -c "from src.ocr.machine_code_parser import parse_payment_line; print('Parser OK')" +``` + +### Phase 6: Diff Review +```bash +# Show what changed +git diff --stat +git diff HEAD --name-only + +# Show staged changes +git diff --staged --stat +``` + +Review each changed file for: +- Unintended changes +- Missing error handling +- Potential edge cases +- Missing type hints +- Mutable default arguments + +### Phase 7: API Smoke Test (if server running) +```bash +# Health check +curl -s http://localhost:8000/api/v1/health | python -m json.tool + +# Verify response format +curl -s http://localhost:8000/api/v1/health | grep -q "healthy" && echo "Health: OK" || echo "Health: FAIL" +``` + +## Output Format + +After running all phases, produce a verification report: + +``` +VERIFICATION REPORT +================== + +Types: [PASS/FAIL] (X errors) +Lint: [PASS/FAIL] (X warnings) +Tests: [PASS/FAIL] (X/Y passed, Z% coverage) +Security: [PASS/FAIL] (X issues) +Imports: [PASS/FAIL] +Diff: [X files changed] + +Overall: [READY/NOT READY] for PR + +Issues to Fix: +1. ... +2. ... +``` + +## Quick Commands + +```bash +# Full verification (WSL) +wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && mypy src/ --ignore-missing-imports && ruff check src/ && pytest -x --tb=short" + +# Type check only +wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && mypy src/ --ignore-missing-imports" + +# Tests only +wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && pytest --cov=src -q" +``` + +## Verification Checklist + +### Before Commit +- [ ] mypy passes (no type errors) +- [ ] ruff check passes (no lint errors) +- [ ] All tests pass +- [ ] No print() statements in production code +- [ ] No hardcoded secrets +- [ ] No bare `except:` clauses +- [ ] No SQL injection risks (f-strings in queries) +- [ ] Coverage >= 80% for changed code + +### Before PR +- [ ] All above checks pass +- [ ] git diff reviewed for unintended changes +- [ ] New code has tests +- [ ] Type hints on all public functions +- [ ] Docstrings on public APIs +- [ ] No TODO/FIXME for critical items + +### Before Deployment +- [ ] All above checks pass +- [ ] E2E tests pass +- [ ] Health check returns healthy +- [ ] Model loaded successfully +- [ ] No server errors in logs + +## Common Issues and Fixes + +### Type Error: Missing return type +```python +# Before +def process(data): + return result + +# After +def process(data: dict) -> InferenceResult: + return result +``` + +### Lint Error: Unused import +```python +# Remove unused imports or add to __all__ +``` + +### Security: print() in production +```python +# Before +print(f"Processing {doc_id}") + +# After +logger.info(f"Processing {doc_id}") +``` + +### Security: Bare except +```python +# Before +except: + pass + +# After +except Exception as e: + logger.error(f"Error: {e}") + raise +``` + +### Security: SQL injection +```python +# Before (DANGEROUS) +cur.execute(f"SELECT * FROM docs WHERE id = '{user_input}'") + +# After (SAFE) +cur.execute("SELECT * FROM docs WHERE id = %s", (user_input,)) +``` + +## Continuous Mode + +For long sessions, run verification after major changes: + +```markdown +Checkpoints: +- After completing each function +- After finishing a module +- Before moving to next task +- Every 15-20 minutes of coding + +Run: /verify +``` + +## Integration with Other Skills + +| Skill | Purpose | +|-------|---------| +| code-review | Detailed code analysis | +| security-review | Deep security audit | +| tdd-workflow | Test coverage | +| build-fix | Fix errors incrementally | + +This skill provides quick, comprehensive verification. Use specialized skills for deeper analysis. diff --git a/src/ocr/test_machine_code_parser.py b/src/ocr/test_machine_code_parser.py deleted file mode 100644 index ca9a9d3..0000000 --- a/src/ocr/test_machine_code_parser.py +++ /dev/null @@ -1,251 +0,0 @@ -""" -Tests for Machine Code Parser - -Tests the parsing of Swedish invoice payment lines including: -- Standard payment line format -- Account number normalization (spaces removal) -- Bankgiro/Plusgiro detection -- OCR and Amount extraction -""" - -import pytest -from src.ocr.machine_code_parser import MachineCodeParser, MachineCodeResult - - -class TestParseStandardPaymentLine: - """Tests for _parse_standard_payment_line method.""" - - @pytest.fixture - def parser(self): - return MachineCodeParser() - - def test_standard_format_bankgiro(self, parser): - """Test standard payment line with Bankgiro.""" - line = "# 31130954410 # 315 00 2 > 8983025#14#" - result = parser._parse_standard_payment_line(line) - - assert result is not None - assert result['ocr'] == '31130954410' - assert result['amount'] == '315' - assert result['bankgiro'] == '898-3025' - - def test_standard_format_with_ore(self, parser): - """Test payment line with non-zero öre.""" - line = "# 12345678901 # 100 50 2 > 7821713#41#" - result = parser._parse_standard_payment_line(line) - - assert result is not None - assert result['ocr'] == '12345678901' - assert result['amount'] == '100,50' - assert result['bankgiro'] == '782-1713' - - def test_spaces_in_bankgiro(self, parser): - """Test payment line with spaces in Bankgiro number.""" - line = "# 310196187399952 # 11699 00 6 > 78 2 1 713 #41#" - result = parser._parse_standard_payment_line(line) - - assert result is not None - assert result['ocr'] == '310196187399952' - assert result['amount'] == '11699' - assert result['bankgiro'] == '782-1713' - - def test_spaces_in_bankgiro_multiple(self, parser): - """Test payment line with multiple spaces in account number.""" - line = "# 123456789 # 500 00 1 > 1 2 3 4 5 6 7 #99#" - result = parser._parse_standard_payment_line(line) - - assert result is not None - assert result['bankgiro'] == '123-4567' - - def test_8_digit_bankgiro(self, parser): - """Test 8-digit Bankgiro formatting.""" - line = "# 12345678901 # 200 00 2 > 53939484#14#" - result = parser._parse_standard_payment_line(line) - - assert result is not None - assert result['bankgiro'] == '5393-9484' - - def test_plusgiro_context(self, parser): - """Test Plusgiro detection based on context.""" - line = "# 12345678901 # 100 00 2 > 1234567#14#" - result = parser._parse_standard_payment_line(line, context_line="plusgiro payment") - - assert result is not None - assert 'plusgiro' in result - assert result['plusgiro'] == '123456-7' - - def test_no_match_invalid_format(self, parser): - """Test that invalid format returns None.""" - line = "This is not a valid payment line" - result = parser._parse_standard_payment_line(line) - - assert result is None - - def test_alternative_pattern(self, parser): - """Test alternative payment line pattern.""" - line = "8120000849965361 11699 00 1 > 7821713" - result = parser._parse_standard_payment_line(line) - - assert result is not None - assert result['ocr'] == '8120000849965361' - - def test_long_ocr_number(self, parser): - """Test OCR number up to 25 digits.""" - line = "# 1234567890123456789012345 # 100 00 2 > 7821713#14#" - result = parser._parse_standard_payment_line(line) - - assert result is not None - assert result['ocr'] == '1234567890123456789012345' - - def test_large_amount(self, parser): - """Test large amount extraction.""" - line = "# 12345678901 # 1234567 00 2 > 7821713#14#" - result = parser._parse_standard_payment_line(line) - - assert result is not None - assert result['amount'] == '1234567' - - -class TestNormalizeAccountSpaces: - """Tests for account number space normalization.""" - - @pytest.fixture - def parser(self): - return MachineCodeParser() - - def test_no_spaces(self, parser): - """Test line without spaces in account.""" - line = "# 123456789 # 100 00 1 > 7821713#14#" - result = parser._parse_standard_payment_line(line) - assert result['bankgiro'] == '782-1713' - - def test_single_space(self, parser): - """Test single space between digits.""" - line = "# 123456789 # 100 00 1 > 782 1713#14#" - result = parser._parse_standard_payment_line(line) - assert result['bankgiro'] == '782-1713' - - def test_multiple_spaces(self, parser): - """Test multiple spaces.""" - line = "# 123456789 # 100 00 1 > 7 8 2 1 7 1 3#14#" - result = parser._parse_standard_payment_line(line) - assert result['bankgiro'] == '782-1713' - - def test_no_arrow_marker(self, parser): - """Test line without > marker - spaces not normalized.""" - # Without >, the normalization won't happen - line = "# 123456789 # 100 00 1 7821713#14#" - result = parser._parse_standard_payment_line(line) - # This pattern might not match due to missing > - # Just ensure no crash - assert result is None or isinstance(result, dict) - - -class TestMachineCodeResult: - """Tests for MachineCodeResult dataclass.""" - - def test_to_dict(self): - """Test conversion to dictionary.""" - result = MachineCodeResult( - ocr='12345678901', - amount='100', - bankgiro='782-1713', - confidence=0.95, - raw_line='test line' - ) - - d = result.to_dict() - assert d['ocr'] == '12345678901' - assert d['amount'] == '100' - assert d['bankgiro'] == '782-1713' - assert d['confidence'] == 0.95 - assert d['raw_line'] == 'test line' - - def test_empty_result(self): - """Test empty result.""" - result = MachineCodeResult() - d = result.to_dict() - - assert d['ocr'] is None - assert d['amount'] is None - assert d['bankgiro'] is None - assert d['plusgiro'] is None - - -class TestRealWorldExamples: - """Tests using real-world payment line examples.""" - - @pytest.fixture - def parser(self): - return MachineCodeParser() - - def test_fastum_invoice(self, parser): - """Test Fastum invoice payment line (from Faktura_A3861).""" - line = "# 310196187399952 # 11699 00 6 > 78 2 1 713 #41#" - result = parser._parse_standard_payment_line(line) - - assert result is not None - assert result['ocr'] == '310196187399952' - assert result['amount'] == '11699' - assert result['bankgiro'] == '782-1713' - - def test_standard_bankgiro_invoice(self, parser): - """Test standard Bankgiro format.""" - line = "# 31130954410 # 315 00 2 > 8983025#14#" - result = parser._parse_standard_payment_line(line) - - assert result is not None - assert result['ocr'] == '31130954410' - assert result['amount'] == '315' - assert result['bankgiro'] == '898-3025' - - def test_payment_line_with_extra_whitespace(self, parser): - """Test payment line with extra whitespace.""" - line = "# 310196187399952 # 11699 00 6 > 7821713 #41#" - result = parser._parse_standard_payment_line(line) - - # May or may not match depending on regex flexibility - # At minimum, should not crash - assert result is None or isinstance(result, dict) - - -class TestEdgeCases: - """Tests for edge cases and boundary conditions.""" - - @pytest.fixture - def parser(self): - return MachineCodeParser() - - def test_empty_string(self, parser): - """Test empty string input.""" - result = parser._parse_standard_payment_line("") - assert result is None - - def test_only_whitespace(self, parser): - """Test whitespace-only input.""" - result = parser._parse_standard_payment_line(" \t\n ") - assert result is None - - def test_minimum_ocr_length(self, parser): - """Test minimum OCR length (5 digits).""" - line = "# 12345 # 100 00 1 > 7821713#14#" - result = parser._parse_standard_payment_line(line) - assert result is not None - assert result['ocr'] == '12345' - - def test_minimum_bankgiro_length(self, parser): - """Test minimum Bankgiro length (5 digits).""" - line = "# 12345678901 # 100 00 1 > 12345#14#" - result = parser._parse_standard_payment_line(line) - assert result is not None - - def test_special_characters_in_line(self, parser): - """Test handling of special characters.""" - line = "# 12345678901 # 100 00 1 > 7821713#14# (SEK)" - result = parser._parse_standard_payment_line(line) - assert result is not None - assert result['ocr'] == '12345678901' - - -if __name__ == '__main__': - pytest.main([__file__, '-v'])