invoice-master-poc-v2/.claude/CLAUDE.md
Yaojia Wang ad5ed46b4c WIP
2026-02-11 23:40:38 +01:00

Invoice Master POC v2

Swedish Invoice Field Extraction System - YOLO26 + PaddleOCR. Extracts structured data from Swedish PDF invoices.

Tech Stack

Component         Technology
Object Detection  YOLO26 (Ultralytics >= 8.4.0)
OCR Engine        PaddleOCR v5 (PP-OCRv5)
PDF Processing    PyMuPDF (fitz)
Database          PostgreSQL + psycopg2
Web Framework     FastAPI + Uvicorn
Deep Learning     PyTorch + CUDA 12.x

WSL Environment (REQUIRED)

Prefix ALL commands with:

wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-sm120 && <command>"

NEVER run Python commands directly in Windows PowerShell/CMD.
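
When a script has to launch project commands from the Windows side, the prefix above could be applied in one place rather than retyped. The following is a minimal sketch, not part of the project: the helper name `wsl_command` is hypothetical, and the conda path and environment name are copied from the prefix shown above.

```python
# Hypothetical helper: wraps a shell command in the required WSL + conda prefix.
CONDA_INIT = "source ~/miniconda3/etc/profile.d/conda.sh"
CONDA_ENV = "invoice-sm120"


def wsl_command(command: str) -> list[str]:
    """Return the argv that runs `command` inside WSL with the conda env active."""
    script = f"{CONDA_INIT} && conda activate {CONDA_ENV} && {command}"
    # Passing a list (not a single string) avoids an extra layer of shell quoting.
    return ["wsl", "bash", "-c", script]
```

The result can be passed to `subprocess.run(wsl_command("pytest --cov=src"), check=True)` from a Windows-side script.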

Project-Specific Rules

  • Python 3.11+ with type hints
  • No print() in production - use logging
  • Run tests: pytest --cov=src

Critical Rules

Code Organization

  • Many small files over few large files
  • High cohesion, low coupling
  • 200-400 lines typical, 800 max per file
  • Organize by feature/domain, not by type

Code Style

  • No emojis in code, comments, or documentation
  • Immutability always - never mutate objects or arrays
  • No print() or console.log in production code
  • Proper error handling with try/except (try/catch in JavaScript)
  • Input validation with Pydantic, Zod, or similar
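
In Python, the immutability and validation rules can be approximated with a frozen dataclass from the standard library. This is a sketch under assumptions: the `ExtractedField` type and the `with_value` helper are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ExtractedField:
    """Hypothetical value object; frozen=True blocks attribute assignment."""

    name: str
    value: str
    confidence: float

    def __post_init__(self) -> None:
        # Validate on construction so an invalid object can never exist.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {self.confidence}")


def with_value(field: ExtractedField, value: str) -> ExtractedField:
    """Return a modified copy instead of mutating the original."""
    return replace(field, value=value)
```

Because `replace` builds a new instance through the normal constructor, the validation in `__post_init__` also runs for every copy.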

Testing

  • TDD: Write tests first
  • 80% minimum coverage
  • Unit tests for utilities
  • Integration tests for APIs
  • E2E tests for critical flows
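
A test-first unit test for a small utility could be sketched as follows. The utility `normalize_org_number` is hypothetical; in a TDD flow the two test functions would be written first and would fail until the function is implemented. The tests use plain `assert` so that pytest can collect them without extra imports.

```python
def normalize_org_number(raw: str) -> str:
    """Hypothetical utility: reduce '556677-8899' to its ten digits."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return digits


def test_normalize_strips_separator() -> None:
    assert normalize_org_number("556677-8899") == "5566778899"


def test_normalize_rejects_short_input() -> None:
    try:
        normalize_org_number("12345")
    except ValueError:
        return
    raise AssertionError("expected ValueError for short input")
```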

Security

  • No hardcoded secrets
  • Environment variables for sensitive data
  • Validate all user inputs
  • Parameterized queries only
  • CSRF protection enabled

Environment Variables

# Required
DB_PASSWORD=

# Optional (with defaults)
DB_HOST=192.168.68.31
DB_PORT=5432
DB_NAME=docmaster
DB_USER=docmaster
MODEL_PATH=runs/train/invoice_fields/weights/best.pt
CONFIDENCE_THRESHOLD=0.5
SERVER_HOST=0.0.0.0
SERVER_PORT=8000
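
One way these variables might be read at startup is sketched below. The variable names and defaults are taken from the list above; the `Settings` class and `load_settings` function are hypothetical and may not match how the project actually loads its configuration.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    db_password: str
    db_host: str
    db_port: int
    db_name: str
    db_user: str
    model_path: str
    confidence_threshold: float
    server_host: str
    server_port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the environment, failing fast on a missing password."""
    password = env.get("DB_PASSWORD", "")
    if not password:
        raise RuntimeError("DB_PASSWORD is required")
    return Settings(
        db_password=password,
        db_host=env.get("DB_HOST", "192.168.68.31"),
        db_port=int(env.get("DB_PORT", "5432")),
        db_name=env.get("DB_NAME", "docmaster"),
        db_user=env.get("DB_USER", "docmaster"),
        model_path=env.get(
            "MODEL_PATH", "runs/train/invoice_fields/weights/best.pt"
        ),
        confidence_threshold=float(env.get("CONFIDENCE_THRESHOLD", "0.5")),
        server_host=env.get("SERVER_HOST", "0.0.0.0"),
        server_port=int(env.get("SERVER_PORT", "8000")),
    )
```

Taking the environment as a parameter keeps the function easy to test: a plain dict can stand in for `os.environ`.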

Available Commands

  • /tdd - Test-driven development workflow
  • /plan - Create implementation plan
  • /code-review - Review code quality
  • /build-fix - Fix build errors

Git Workflow

  • Conventional commits: feat:, fix:, refactor:, docs:, test:
  • Never commit to main directly
  • PRs require review
  • All tests must pass before merge

Push the code before review, and again once the review fixes are finished.
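
The conventional-commit prefixes listed under Git Workflow can be checked mechanically, for example in a commit-msg hook. The sketch below is an assumption about how such a check could look, not an existing project script; the optional `(scope)` part follows the general Conventional Commits format and is not mentioned in the rules above.

```python
import re

ALLOWED_TYPES = ("feat", "fix", "refactor", "docs", "test")

# type, optional "(scope)", then ": " followed by a non-empty description.
_PATTERN = re.compile(
    r"^(?:%s)(?:\([\w./-]+\))?: \S" % "|".join(ALLOWED_TYPES)
)


def is_conventional(message: str) -> bool:
    """Return True if the first line of `message` starts with an allowed prefix."""
    first_line = message.splitlines()[0] if message else ""
    return _PATTERN.match(first_line) is not None
```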