Add claude config

Yaojia Wang
2026-01-25 16:17:23 +01:00
parent e599424a92
commit d5101e3604
40 changed files with 5559 additions and 1378 deletions

View File

@@ -0,0 +1,22 @@
# Build and Fix
Incrementally fix Python errors and test failures.
## Workflow
1. Run check: `mypy src/ --ignore-missing-imports` or `pytest -x --tb=short`
2. Parse errors, group by file, sort by severity (ImportError > TypeError > other)
3. For each error:
- Show context (5 lines)
- Explain and propose fix
- Apply fix
- Re-run test for that file
- Verify resolved
4. Stop if: fix introduces new errors, same error after 3 attempts, or user pauses
5. Show summary: fixed / remaining / new errors
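A rough sketch of the grouping in step 2, assuming mypy's standard `file:line: error: ...` output format:
```bash
# Count errors per file, worst offenders first (sketch; assumes mypy's file:line: error format)
mypy src/ --ignore-missing-imports | grep ': error:' | cut -d: -f1 | sort | uniq -c | sort -rn
```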
## Rules
- Fix ONE error at a time
- Re-run tests after each fix
- Never batch multiple unrelated fixes

View File

@@ -0,0 +1,74 @@
# Checkpoint Command
Create or verify a checkpoint in your workflow.
## Usage
`/checkpoint [create|verify|list|clear] [name]`
## Create Checkpoint
When creating a checkpoint:
1. Run `/verify quick` to ensure current state is clean
2. Create a git stash or commit with checkpoint name
3. Log checkpoint to `.claude/checkpoints.log`:
```bash
echo "$(date +%Y-%m-%d-%H:%M) | $CHECKPOINT_NAME | $(git rev-parse --short HEAD)" >> .claude/checkpoints.log
```
4. Report checkpoint created
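A minimal sketch of the create path, assuming the commit (rather than stash) variant:
```bash
# Sketch: commit the current state and log it in the format shown above
CHECKPOINT_NAME="feature-start"   # example name
git add -A && git commit -m "checkpoint: $CHECKPOINT_NAME"
echo "$(date +%Y-%m-%d-%H:%M) | $CHECKPOINT_NAME | $(git rev-parse --short HEAD)" >> .claude/checkpoints.log
```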
## Verify Checkpoint
When verifying against a checkpoint:
1. Read checkpoint from log
2. Compare current state to checkpoint:
- Files added since checkpoint
- Files modified since checkpoint
- Test pass rate now vs then
- Coverage now vs then
3. Report:
```
CHECKPOINT COMPARISON: $NAME
============================
Files changed: X
Tests: +Y passed / -Z failed
Coverage: +X% / -Y%
Build: [PASS/FAIL]
```
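The git side of the comparison can be sketched as follows, assuming the log format written by `create`:
```bash
# Recover the checkpoint SHA from the log, then diff the working tree against it
NAME="core-done"   # example checkpoint name
SHA=$(grep "| $NAME |" .claude/checkpoints.log | tail -1 | awk -F' [|] ' '{print $3}')
git diff --stat "$SHA"   # files added/modified since the checkpoint
```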
## List Checkpoints
Show all checkpoints with:
- Name
- Timestamp
- Git SHA
- Status (current, behind, ahead)
## Workflow
Typical checkpoint flow:
```
[Start] --> /checkpoint create "feature-start"
|
[Implement] --> /checkpoint create "core-done"
|
[Test] --> /checkpoint verify "core-done"
|
[Refactor] --> /checkpoint create "refactor-done"
|
[PR] --> /checkpoint verify "feature-start"
```
## Arguments
$ARGUMENTS:
- `create <name>` - Create named checkpoint
- `verify <name>` - Verify against named checkpoint
- `list` - Show all checkpoints
- `clear` - Remove old checkpoints (keeps last 5)

View File

@@ -0,0 +1,46 @@
# Code Review
Security and quality review of uncommitted changes.
## Workflow
1. Get changed files: `git diff --name-only HEAD` and `git diff --staged --name-only`
2. Review each file for issues (see checklist below)
3. Run automated checks: `mypy src/`, `ruff check src/`, `pytest -x`
4. Generate report with severity, location, description, suggested fix
5. Block commit if CRITICAL or HIGH issues found
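Steps 1 and 3 as a one-screen sketch, assuming mypy, ruff, and pytest are already configured for the project:
```bash
# Collect changed files, then run the automated gate
git diff --name-only HEAD && git diff --staged --name-only
mypy src/ && ruff check src/ && pytest -x
```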
## Checklist
### CRITICAL (Block)
- Hardcoded credentials, API keys, tokens, passwords
- SQL injection (must use parameterized queries)
- Path traversal risks
- Missing input validation on API endpoints
- Missing authentication/authorization
### HIGH (Block)
- Functions > 50 lines, files > 800 lines
- Nesting depth > 4 levels
- Missing error handling or bare `except:`
- `print()` in production code (use logging)
- Mutable default arguments
### MEDIUM (Warn)
- Missing type hints on public functions
- Missing tests for new code
- Duplicate code, magic numbers
- Unused imports/variables
- TODO/FIXME comments
## Report Format
```
[SEVERITY] file:line - Issue description
Suggested fix: ...
```
**Never approve code with security vulnerabilities.**

.claude/commands/e2e.md
View File

@@ -0,0 +1,40 @@
# E2E Testing
End-to-end testing for the Invoice Field Extraction API.
## When to Use
- Testing complete inference pipeline (PDF -> Fields)
- Verifying API endpoints work end-to-end
- Validating YOLO + OCR + field extraction integration
- Pre-deployment verification
## Workflow
1. Ensure server is running: `python run_server.py`
2. Run health check: `curl http://localhost:8000/api/v1/health`
3. Run E2E tests: `pytest tests/e2e/ -v`
4. Verify results and capture any failures
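A manual smoke test of the same flow; the upload route and fixture name below are hypothetical, so substitute the project's real endpoint and a file from `tests/fixtures/`:
```bash
# Health check, then a sample PDF upload (hypothetical route and filename)
curl http://localhost:8000/api/v1/health
curl -F "file=@tests/fixtures/sample_invoice.pdf" \
  http://localhost:8000/api/v1/extract
```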
## Critical Scenarios (Must Pass)
1. Health check returns `{"status": "healthy", "model_loaded": true}`
2. PDF upload returns valid response with fields
3. Fields extracted with confidence scores
4. Visualization image generated
5. Cross-validation included for invoices with payment_line
## Checklist
- [ ] Server running on http://localhost:8000
- [ ] Health check passes
- [ ] PDF inference returns valid JSON
- [ ] At least one field extracted
- [ ] Visualization URL returns image
- [ ] Response time < 10 seconds
- [ ] No server errors in logs
## Test Location
E2E tests: `tests/e2e/`
Sample fixtures: `tests/fixtures/`

.claude/commands/eval.md
View File

@@ -0,0 +1,174 @@
# Eval Command
Evaluate model performance and field extraction accuracy.
## Usage
`/eval [model|accuracy|compare|report]`
## Model Evaluation
`/eval model`
Evaluate YOLO model performance on test dataset:
```bash
# Run model evaluation
python -m src.cli.train --model runs/train/invoice_fields/weights/best.pt --eval-only
# Or use ultralytics directly
yolo val model=runs/train/invoice_fields/weights/best.pt data=data.yaml
```
Output:
```
Model Evaluation: invoice_fields/best.pt
========================================
mAP@0.5: 93.5%
mAP@0.5-0.95: 83.0%
Per-class AP:
- invoice_number: 95.2%
- invoice_date: 94.8%
- invoice_due_date: 93.1%
- ocr_number: 91.5%
- bankgiro: 92.3%
- plusgiro: 90.8%
- amount: 88.7%
- supplier_org_num: 85.2%
- payment_line: 82.4%
- customer_number: 81.1%
```
## Accuracy Evaluation
`/eval accuracy`
Evaluate field extraction accuracy against ground truth:
```bash
# Run accuracy evaluation on labeled data
python -m src.cli.infer --model runs/train/invoice_fields/weights/best.pt \
--input ~/invoice-data/test/*.pdf \
--ground-truth ~/invoice-data/test/labels.csv \
--output eval_results.json
```
Output:
```
Field Extraction Accuracy
=========================
Documents tested: 500
Per-field accuracy:
- InvoiceNumber: 98.8% (494/500)
- InvoiceDate: 95.6% (478/500)
- InvoiceDueDate: 96.0% (480/500)
- OCR: 99.2% (496/500)
- Bankgiro: 99.0% (495/500)
- Plusgiro: 99.4% (497/500)
- Amount: 91.4% (457/500)
- supplier_org: 78.2% (391/500)
Overall: 94.7%
```
## Compare Models
`/eval compare`
Compare two model versions:
```bash
# Compare old vs new model
python -m src.cli.eval compare \
--model-a runs/train/invoice_v1/weights/best.pt \
--model-b runs/train/invoice_v2/weights/best.pt \
--test-data ~/invoice-data/test/
```
Output:
```
Model Comparison
================
Model A Model B Delta
mAP@0.5: 91.2% 93.5% +2.3%
Accuracy: 92.1% 94.7% +2.6%
Speed (ms): 1850 1520 -330
Per-field improvements:
- amount: +4.2%
- payment_line: +3.8%
- customer_num: +2.1%
Recommendation: Deploy Model B
```
## Generate Report
`/eval report`
Generate comprehensive evaluation report:
```bash
python -m src.cli.eval report --output eval_report.md
```
Output:
```markdown
# Evaluation Report
Generated: 2026-01-25
## Model Performance
- Model: runs/train/invoice_fields/weights/best.pt
- mAP@0.5: 93.5%
- Training samples: 9,738
## Field Extraction Accuracy
| Field | Accuracy | Errors |
|-------|----------|--------|
| InvoiceNumber | 98.8% | 6 |
| Amount | 91.4% | 43 |
...
## Error Analysis
### Common Errors
1. Amount: OCR misreads comma as period
2. supplier_org: Missing from some invoices
3. payment_line: Partially obscured by stamps
## Recommendations
1. Add more training data for low-accuracy fields
2. Implement OCR error correction for amounts
3. Consider confidence threshold tuning
```
## Quick Commands
```bash
# Evaluate model metrics
yolo val model=runs/train/invoice_fields/weights/best.pt
# Test inference on sample
python -m src.cli.infer --input sample.pdf --output result.json --gpu
# Check test coverage
pytest --cov=src --cov-report=html
```
## Evaluation Metrics
| Metric | Target | Current |
|--------|--------|---------|
| mAP@0.5 | >90% | 93.5% |
| Overall Accuracy | >90% | 94.7% |
| Test Coverage | >60% | 37% |
| Tests Passing | 100% | 100% |
## When to Evaluate
- After training a new model
- Before deploying to production
- After adding new training data
- When accuracy complaints arise
- Weekly performance monitoring

.claude/commands/learn.md
View File

@@ -0,0 +1,70 @@
# /learn - Extract Reusable Patterns
Analyze the current session and extract any patterns worth saving as skills.
## Trigger
Run `/learn` at any point during a session when you've solved a non-trivial problem.
## What to Extract
Look for:
1. **Error Resolution Patterns**
- What error occurred?
- What was the root cause?
- What fixed it?
- Is this reusable for similar errors?
2. **Debugging Techniques**
- Non-obvious debugging steps
- Tool combinations that worked
- Diagnostic patterns
3. **Workarounds**
- Library quirks
- API limitations
- Version-specific fixes
4. **Project-Specific Patterns**
- Codebase conventions discovered
- Architecture decisions made
- Integration patterns
## Output Format
Create a skill file at `~/.claude/skills/learned/[pattern-name].md`:
```markdown
# [Descriptive Pattern Name]
**Extracted:** [Date]
**Context:** [Brief description of when this applies]
## Problem
[What problem this solves - be specific]
## Solution
[The pattern/technique/workaround]
## Example
[Code example if applicable]
## When to Use
[Trigger conditions - what should activate this skill]
```
## Process
1. Review the session for extractable patterns
2. Identify the most valuable/reusable insight
3. Draft the skill file
4. Ask user to confirm before saving
5. Save to `~/.claude/skills/learned/`
## Notes
- Don't extract trivial fixes (typos, simple syntax errors)
- Don't extract one-time issues (specific API outages, etc.)
- Focus on patterns that will save time in future sessions
- Keep skills focused - one pattern per skill

View File

@@ -0,0 +1,172 @@
# Orchestrate Command
Sequential agent workflow for complex tasks.
## Usage
`/orchestrate [workflow-type] [task-description]`
## Workflow Types
### feature
Full feature implementation workflow:
```
planner -> tdd-guide -> code-reviewer -> security-reviewer
```
### bugfix
Bug investigation and fix workflow:
```
explorer -> tdd-guide -> code-reviewer
```
### refactor
Safe refactoring workflow:
```
architect -> code-reviewer -> tdd-guide
```
### security
Security-focused review:
```
security-reviewer -> code-reviewer -> architect
```
## Execution Pattern
For each agent in the workflow:
1. **Invoke agent** with context from previous agent
2. **Collect output** as structured handoff document
3. **Pass to next agent** in chain
4. **Aggregate results** into final report
## Handoff Document Format
Between agents, create handoff document:
```markdown
## HANDOFF: [previous-agent] -> [next-agent]
### Context
[Summary of what was done]
### Findings
[Key discoveries or decisions]
### Files Modified
[List of files touched]
### Open Questions
[Unresolved items for next agent]
### Recommendations
[Suggested next steps]
```
## Example: Feature Workflow
```
/orchestrate feature "Add user authentication"
```
Executes:
1. **Planner Agent**
- Analyzes requirements
- Creates implementation plan
- Identifies dependencies
- Output: `HANDOFF: planner -> tdd-guide`
2. **TDD Guide Agent**
- Reads planner handoff
- Writes tests first
- Implements to pass tests
- Output: `HANDOFF: tdd-guide -> code-reviewer`
3. **Code Reviewer Agent**
- Reviews implementation
- Checks for issues
- Suggests improvements
- Output: `HANDOFF: code-reviewer -> security-reviewer`
4. **Security Reviewer Agent**
- Security audit
- Vulnerability check
- Final approval
- Output: Final Report
## Final Report Format
```
ORCHESTRATION REPORT
====================
Workflow: feature
Task: Add user authentication
Agents: planner -> tdd-guide -> code-reviewer -> security-reviewer
SUMMARY
-------
[One paragraph summary]
AGENT OUTPUTS
-------------
Planner: [summary]
TDD Guide: [summary]
Code Reviewer: [summary]
Security Reviewer: [summary]
FILES CHANGED
-------------
[List all files modified]
TEST RESULTS
------------
[Test pass/fail summary]
SECURITY STATUS
---------------
[Security findings]
RECOMMENDATION
--------------
[SHIP / NEEDS WORK / BLOCKED]
```
## Parallel Execution
For independent checks, run agents in parallel:
```markdown
### Parallel Phase
Run simultaneously:
- code-reviewer (quality)
- security-reviewer (security)
- architect (design)
### Merge Results
Combine outputs into single report
```
## Arguments
$ARGUMENTS:
- `feature <description>` - Full feature workflow
- `bugfix <description>` - Bug fix workflow
- `refactor <description>` - Refactoring workflow
- `security <description>` - Security review workflow
- `custom <agents> <description>` - Custom agent sequence
## Custom Workflow Example
```
/orchestrate custom "architect,tdd-guide,code-reviewer" "Redesign caching layer"
```
## Tips
1. **Start with planner** for complex features
2. **Always include code-reviewer** before merge
3. **Use security-reviewer** for auth/payment/PII
4. **Keep handoffs concise** - focus on what next agent needs
5. **Run verification** between agents if needed

.claude/commands/plan.md
View File

@@ -0,0 +1,113 @@
---
description: Restate requirements, assess risks, and create step-by-step implementation plan. WAIT for user CONFIRM before touching any code.
---
# Plan Command
This command invokes the **planner** agent to create a comprehensive implementation plan before writing any code.
## What This Command Does
1. **Restate Requirements** - Clarify what needs to be built
2. **Identify Risks** - Surface potential issues and blockers
3. **Create Step Plan** - Break down implementation into phases
4. **Wait for Confirmation** - MUST receive user approval before proceeding
## When to Use
Use `/plan` when:
- Starting a new feature
- Making significant architectural changes
- Working on complex refactoring
- Multiple files/components will be affected
- Requirements are unclear or ambiguous
## How It Works
The planner agent will:
1. **Analyze the request** and restate requirements in clear terms
2. **Break down into phases** with specific, actionable steps
3. **Identify dependencies** between components
4. **Assess risks** and potential blockers
5. **Estimate complexity** (High/Medium/Low)
6. **Present the plan** and WAIT for your explicit confirmation
## Example Usage
```
User: /plan I need to add real-time notifications when markets resolve
Agent (planner):
# Implementation Plan: Real-Time Market Resolution Notifications
## Requirements Restatement
- Send notifications to users when markets they're watching resolve
- Support multiple notification channels (in-app, email, webhook)
- Ensure notifications are delivered reliably
- Include market outcome and user's position result
## Implementation Phases
### Phase 1: Database Schema
- Add notifications table with columns: id, user_id, market_id, type, status, created_at
- Add user_notification_preferences table for channel preferences
- Create indexes on user_id and market_id for performance
### Phase 2: Notification Service
- Create notification service in lib/notifications.ts
- Implement notification queue using BullMQ/Redis
- Add retry logic for failed deliveries
- Create notification templates
### Phase 3: Integration Points
- Hook into market resolution logic (when status changes to "resolved")
- Query all users with positions in market
- Enqueue notifications for each user
### Phase 4: Frontend Components
- Create NotificationBell component in header
- Add NotificationList modal
- Implement real-time updates via Supabase subscriptions
- Add notification preferences page
## Dependencies
- Redis (for queue)
- Email service (SendGrid/Resend)
- Supabase real-time subscriptions
## Risks
- HIGH: Email deliverability (SPF/DKIM required)
- MEDIUM: Performance with 1000+ users per market
- MEDIUM: Notification spam if markets resolve frequently
- LOW: Real-time subscription overhead
## Estimated Complexity: MEDIUM
- Backend: 4-6 hours
- Frontend: 3-4 hours
- Testing: 2-3 hours
- Total: 9-13 hours
**WAITING FOR CONFIRMATION**: Proceed with this plan? (yes/no/modify)
```
## Important Notes
**CRITICAL**: The planner agent will **NOT** write any code until you explicitly confirm the plan with "yes", "proceed", or a similar affirmative response.
If you want changes, respond with:
- "modify: [your changes]"
- "different approach: [alternative]"
- "skip phase 2 and do phase 3 first"
## Integration with Other Commands
After planning:
- Use `/tdd` to implement with test-driven development
- Use `/build-and-fix` if build errors occur
- Use `/code-review` to review completed implementation
## Related Agents
This command invokes the `planner` agent located at:
`~/.claude/agents/planner.md`

View File

@@ -0,0 +1,28 @@
# Refactor Clean
Safely identify and remove dead code with test verification:
1. Run dead code analysis tools:
- knip: Find unused exports and files
- depcheck: Find unused dependencies
- ts-prune: Find unused TypeScript exports
2. Generate comprehensive report in .reports/dead-code-analysis.md
3. Categorize findings by severity:
- SAFE: Test files, unused utilities
- CAUTION: API routes, components
- DANGER: Config files, main entry points
4. Propose safe deletions only
5. Before each deletion:
- Run full test suite
- Verify tests pass
- Apply change
- Re-run tests
- Rollback if tests fail
6. Show summary of cleaned items
Never delete code without running tests first!
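Step 1 might look like this, assuming the three tools are fetched on demand via npx:
```bash
# Run each analyzer and keep raw output for the report in step 2
mkdir -p .reports
npx knip     > .reports/knip.txt       # unused exports and files
npx depcheck > .reports/depcheck.txt   # unused dependencies
npx ts-prune > .reports/ts-prune.txt   # unused TypeScript exports
```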

View File

@@ -0,0 +1,80 @@
---
description: Configure your preferred package manager (npm/pnpm/yarn/bun)
disable-model-invocation: true
---
# Package Manager Setup
Configure your preferred package manager for this project or globally.
## Usage
```bash
# Detect current package manager
node scripts/setup-package-manager.js --detect
# Set global preference
node scripts/setup-package-manager.js --global pnpm
# Set project preference
node scripts/setup-package-manager.js --project bun
# List available package managers
node scripts/setup-package-manager.js --list
```
## Detection Priority
When determining which package manager to use, the following order is checked:
1. **Environment variable**: `CLAUDE_PACKAGE_MANAGER`
2. **Project config**: `.claude/package-manager.json`
3. **package.json**: `packageManager` field
4. **Lock file**: Presence of package-lock.json, yarn.lock, pnpm-lock.yaml, or bun.lockb
5. **Global config**: `~/.claude/package-manager.json`
6. **Fallback**: First available package manager (pnpm > bun > yarn > npm)
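Step 4 of the priority order reduces to a lock-file check; a rough shell equivalent, assuming it runs from the project root:
```bash
# Infer the package manager from whichever lock file is present
if   [ -f pnpm-lock.yaml ];    then echo pnpm
elif [ -f bun.lockb ];         then echo bun
elif [ -f yarn.lock ];         then echo yarn
elif [ -f package-lock.json ]; then echo npm
fi
```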
## Configuration Files
### Global Configuration
`~/.claude/package-manager.json`:
```json
{
  "packageManager": "pnpm"
}
```
### Project Configuration
`.claude/package-manager.json`:
```json
{
  "packageManager": "bun"
}
```
### package.json
```json
{
  "packageManager": "pnpm@8.6.0"
}
```
## Environment Variable
Set `CLAUDE_PACKAGE_MANAGER` to override all other detection methods:
```bash
# Windows (PowerShell)
$env:CLAUDE_PACKAGE_MANAGER = "pnpm"
# macOS/Linux
export CLAUDE_PACKAGE_MANAGER=pnpm
```
## Run the Detection
To see current package manager detection results, run:
```bash
node scripts/setup-package-manager.js --detect
```

.claude/commands/tdd.md
View File

@@ -0,0 +1,326 @@
---
description: Enforce test-driven development workflow. Scaffold interfaces, generate tests FIRST, then implement minimal code to pass. Ensure 80%+ coverage.
---
# TDD Command
This command invokes the **tdd-guide** agent to enforce test-driven development methodology.
## What This Command Does
1. **Scaffold Interfaces** - Define types/interfaces first
2. **Generate Tests First** - Write failing tests (RED)
3. **Implement Minimal Code** - Write just enough to pass (GREEN)
4. **Refactor** - Improve code while keeping tests green (REFACTOR)
5. **Verify Coverage** - Ensure 80%+ test coverage
## When to Use
Use `/tdd` when:
- Implementing new features
- Adding new functions/components
- Fixing bugs (write test that reproduces bug first)
- Refactoring existing code
- Building critical business logic
## How It Works
The tdd-guide agent will:
1. **Define interfaces** for inputs/outputs
2. **Write tests that will FAIL** (because code doesn't exist yet)
3. **Run tests** and verify they fail for the right reason
4. **Write minimal implementation** to make tests pass
5. **Run tests** and verify they pass
6. **Refactor** code while keeping tests green
7. **Check coverage** and add more tests if below 80%
## TDD Cycle
```
RED → GREEN → REFACTOR → REPEAT
RED: Write a failing test
GREEN: Write minimal code to pass
REFACTOR: Improve code, keep tests passing
REPEAT: Next feature/scenario
```
## Example Usage
User: `/tdd I need a function to calculate market liquidity score`

Agent (tdd-guide):
# TDD Session: Market Liquidity Score Calculator
## Step 1: Define Interface (SCAFFOLD)
```typescript
// lib/liquidity.ts
export interface MarketData {
  totalVolume: number
  bidAskSpread: number
  activeTraders: number
  lastTradeTime: Date
}

export function calculateLiquidityScore(market: MarketData): number {
  // TODO: Implementation
  throw new Error('Not implemented')
}
```
## Step 2: Write Failing Test (RED)
```typescript
// lib/liquidity.test.ts
import { calculateLiquidityScore } from './liquidity'

describe('calculateLiquidityScore', () => {
  it('should return high score for liquid market', () => {
    const market = {
      totalVolume: 100000,
      bidAskSpread: 0.01,
      activeTraders: 500,
      lastTradeTime: new Date()
    }
    const score = calculateLiquidityScore(market)
    expect(score).toBeGreaterThan(80)
    expect(score).toBeLessThanOrEqual(100)
  })

  it('should return low score for illiquid market', () => {
    const market = {
      totalVolume: 100,
      bidAskSpread: 0.5,
      activeTraders: 2,
      lastTradeTime: new Date(Date.now() - 86400000) // 1 day ago
    }
    const score = calculateLiquidityScore(market)
    expect(score).toBeLessThan(30)
    expect(score).toBeGreaterThanOrEqual(0)
  })

  it('should handle edge case: zero volume', () => {
    const market = {
      totalVolume: 0,
      bidAskSpread: 0,
      activeTraders: 0,
      lastTradeTime: new Date()
    }
    const score = calculateLiquidityScore(market)
    expect(score).toBe(0)
  })
})
```
## Step 3: Run Tests - Verify FAIL
```bash
npm test lib/liquidity.test.ts
FAIL lib/liquidity.test.ts
  ✕ should return high score for liquid market (2 ms)
  ✕ should return low score for illiquid market (1 ms)
  ✕ should handle edge case: zero volume (1 ms)
    Error: Not implemented
3 tests failed, 0 passed
```
✅ Tests fail as expected. Ready to implement.
## Step 4: Implement Minimal Code (GREEN)
```typescript
// lib/liquidity.ts
export function calculateLiquidityScore(market: MarketData): number {
  // Handle zero volume edge case
  if (market.totalVolume === 0) {
    return 0
  }

  // Calculate component scores (0-100 scale)
  const volumeScore = Math.min(market.totalVolume / 1000, 100)
  const spreadScore = Math.max(100 - (market.bidAskSpread * 1000), 0)
  const traderScore = Math.min(market.activeTraders / 10, 100)

  // Recent activity bonus
  const hoursSinceLastTrade = (Date.now() - market.lastTradeTime.getTime()) / (1000 * 60 * 60)
  const recencyScore = Math.max(100 - (hoursSinceLastTrade * 10), 0)

  // Weighted average
  const score = (
    volumeScore * 0.4 +
    spreadScore * 0.3 +
    traderScore * 0.2 +
    recencyScore * 0.1
  )

  return Math.min(Math.max(score, 0), 100) // Clamp to 0-100
}
```
## Step 5: Run Tests - Verify PASS
```bash
npm test lib/liquidity.test.ts
PASS lib/liquidity.test.ts
  ✓ should return high score for liquid market (3 ms)
  ✓ should return low score for illiquid market (2 ms)
  ✓ should handle edge case: zero volume (1 ms)
3 tests passed
```
✅ All tests passing!
## Step 6: Refactor (IMPROVE)
```typescript
// lib/liquidity.ts - Refactored with constants and better readability
const WEIGHTS = {
  VOLUME: 0.4,
  SPREAD: 0.3,
  TRADERS: 0.2,
  RECENCY: 0.1,
} as const

const SCALE_FACTORS = {
  VOLUME: 1000,
  SPREAD: 1000,
  TRADERS: 10,
  RECENCY_PENALTY: 10,
} as const

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max)
}

export function calculateLiquidityScore(market: MarketData): number {
  if (market.totalVolume === 0) return 0

  const volumeScore = Math.min(market.totalVolume / SCALE_FACTORS.VOLUME, 100)
  const spreadScore = clamp(100 - (market.bidAskSpread * SCALE_FACTORS.SPREAD), 0, 100)
  const traderScore = Math.min(market.activeTraders / SCALE_FACTORS.TRADERS, 100)
  const hoursSinceLastTrade = (Date.now() - market.lastTradeTime.getTime()) / (1000 * 60 * 60)
  const recencyScore = clamp(100 - (hoursSinceLastTrade * SCALE_FACTORS.RECENCY_PENALTY), 0, 100)

  const weightedScore =
    volumeScore * WEIGHTS.VOLUME +
    spreadScore * WEIGHTS.SPREAD +
    traderScore * WEIGHTS.TRADERS +
    recencyScore * WEIGHTS.RECENCY

  return clamp(weightedScore, 0, 100)
}
```
## Step 7: Verify Tests Still Pass
```bash
npm test lib/liquidity.test.ts
PASS lib/liquidity.test.ts
  ✓ should return high score for liquid market (3 ms)
  ✓ should return low score for illiquid market (2 ms)
  ✓ should handle edge case: zero volume (1 ms)
3 tests passed
```
✅ Refactoring complete, tests still passing!
## Step 8: Check Coverage
```bash
npm test -- --coverage lib/liquidity.test.ts
File | % Stmts | % Branch | % Funcs | % Lines
---------------|---------|----------|---------|--------
liquidity.ts | 100 | 100 | 100 | 100
Coverage: 100% ✅ (Target: 80%)
```
✅ TDD session complete!
## TDD Best Practices
**DO:**
- ✅ Write the test FIRST, before any implementation
- ✅ Run tests and verify they FAIL before implementing
- ✅ Write minimal code to make tests pass
- ✅ Refactor only after tests are green
- ✅ Add edge cases and error scenarios
- ✅ Aim for 80%+ coverage (100% for critical code)
**DON'T:**
- ❌ Write implementation before tests
- ❌ Skip running tests after each change
- ❌ Write too much code at once
- ❌ Ignore failing tests
- ❌ Test implementation details (test behavior)
- ❌ Mock everything (prefer integration tests)
## Test Types to Include
**Unit Tests** (Function-level):
- Happy path scenarios
- Edge cases (empty, null, max values)
- Error conditions
- Boundary values
**Integration Tests** (Component-level):
- API endpoints
- Database operations
- External service calls
- React components with hooks
**E2E Tests** (use `/e2e` command):
- Critical user flows
- Multi-step processes
- Full stack integration
## Coverage Requirements
- **80% minimum** for all code
- **100% required** for:
- Financial calculations
- Authentication logic
- Security-critical code
- Core business logic
## Important Notes
**MANDATORY**: Tests must be written BEFORE implementation. The TDD cycle is:
1. **RED** - Write failing test
2. **GREEN** - Implement to pass
3. **REFACTOR** - Improve code
Never skip the RED phase. Never write code before tests.
## Integration with Other Commands
- Use `/plan` first to understand what to build
- Use `/tdd` to implement with tests
- Use `/build-and-fix` if build errors occur
- Use `/code-review` to review implementation
- Use `/test-coverage` to verify coverage
## Related Agents
This command invokes the `tdd-guide` agent located at:
`~/.claude/agents/tdd-guide.md`
And can reference the `tdd-workflow` skill at:
`~/.claude/skills/tdd-workflow/`

View File

@@ -0,0 +1,27 @@
# Test Coverage
Analyze test coverage and generate missing tests:
1. Run tests with coverage: npm test -- --coverage or pnpm test --coverage
2. Analyze coverage report (coverage/coverage-summary.json)
3. Identify files below 80% coverage threshold
4. For each under-covered file:
- Analyze untested code paths
- Generate unit tests for functions
- Generate integration tests for APIs
- Generate E2E tests for critical flows
5. Verify new tests pass
6. Show before/after coverage metrics
7. Ensure project reaches 80%+ overall coverage
Focus on:
- Happy path scenarios
- Error handling
- Edge cases (null, undefined, empty)
- Boundary conditions
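Step 3 as a sketch, assuming Jest's `coverage/coverage-summary.json` layout and `jq` installed:
```bash
# List every file whose line coverage is below the 80% threshold
jq -r 'to_entries[]
       | select(.key != "total" and .value.lines.pct < 80)
       | "\(.key): \(.value.lines.pct)% lines"' coverage/coverage-summary.json
```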

View File

@@ -0,0 +1,17 @@
# Update Codemaps
Analyze the codebase structure and update architecture documentation:
1. Scan all source files for imports, exports, and dependencies
2. Generate token-lean codemaps in the following format:
- codemaps/architecture.md - Overall architecture
- codemaps/backend.md - Backend structure
- codemaps/frontend.md - Frontend structure
- codemaps/data.md - Data models and schemas
3. Calculate diff percentage from previous version
4. If changes > 30%, request user approval before updating
5. Add freshness timestamp to each codemap
6. Save reports to .reports/codemap-diff.txt
Use TypeScript/Node.js for analysis. Focus on high-level structure, not implementation details.
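As a quick shell approximation of step 1 (the real pass should use the TypeScript/Node.js tooling noted above; the `src/` layout is an assumption):
```bash
# Tally the most-imported modules across TypeScript sources
grep -rhoE "from '[^']+'" src --include='*.ts' | sort | uniq -c | sort -rn | head -20
```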

View File

@@ -0,0 +1,31 @@
# Update Documentation
Sync documentation from source-of-truth:
1. Read package.json scripts section
- Generate scripts reference table
- Include descriptions from comments
2. Read .env.example
- Extract all environment variables
- Document purpose and format
3. Generate docs/CONTRIB.md with:
- Development workflow
- Available scripts
- Environment setup
- Testing procedures
4. Generate docs/RUNBOOK.md with:
- Deployment procedures
- Monitoring and alerts
- Common issues and fixes
- Rollback procedures
5. Identify obsolete documentation:
- Find docs not modified in 90+ days
- List for manual review
6. Show diff summary
Sources of truth: package.json and .env.example
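Step 2's extraction can be sketched in one line, assuming `.env.example` uses the conventional `NAME=value` form:
```bash
# List the variable names declared in .env.example
grep -E '^[A-Z][A-Z0-9_]*=' .env.example | cut -d= -f1
```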

View File

@@ -0,0 +1,59 @@
# Verification Command
Run comprehensive verification on current codebase state.
## Instructions
Execute verification in this exact order:
1. **Build Check**
- Run the build command for this project
- If it fails, report errors and STOP
2. **Type Check**
- Run TypeScript/type checker
- Report all errors with file:line
3. **Lint Check**
- Run linter
- Report warnings and errors
4. **Test Suite**
- Run all tests
- Report pass/fail count
- Report coverage percentage
5. **Console.log & Secrets Audit**
- Search for console.log in source files
- Search for hardcoded credentials, API keys, and tokens
- Report locations
6. **Git Status**
- Show uncommitted changes
- Show files modified since last commit
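Steps 5-6, plus the secrets line in the report below, as a sketch (the secret patterns are illustrative only):
```bash
# Console.log audit, naive secrets grep, and git status
grep -rn "console\.log" src/ || echo "no console.log found"
grep -rnE "(api[_-]?key|secret|token|password)\s*[:=]" src/ || echo "no obvious secrets"
git status --short
```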
## Output
Produce a concise verification report:
```
VERIFICATION: [PASS/FAIL]
Build: [OK/FAIL]
Types: [OK/X errors]
Lint: [OK/X issues]
Tests: [X/Y passed, Z% coverage]
Secrets: [OK/X found]
Logs: [OK/X console.logs]
Ready for PR: [YES/NO]
```
If any critical issues, list them with fix suggestions.
## Arguments
$ARGUMENTS can be:
- `quick` - Only build + types
- `full` - All checks (default)
- `pre-commit` - Checks relevant for commits
- `pre-pr` - Full checks plus security scan