# Invoice Master POC v2

Swedish Invoice Field Extraction System - YOLO26 + PaddleOCR

Extracts structured data from Swedish PDF invoices.

## Tech Stack

| Component | Technology |
|-----------|------------|
| Object Detection | YOLO26 (Ultralytics >= 8.4.0) |
| OCR Engine | PaddleOCR v5 (PP-OCRv5) |
| PDF Processing | PyMuPDF (fitz) |
| Database | PostgreSQL + psycopg2 |
| Web Framework | FastAPI + Uvicorn |
| Deep Learning | PyTorch + CUDA 12.x |

## WSL Environment (REQUIRED)

**Prefix ALL commands with:**

```bash
wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-sm120 && "
```

**NEVER run Python commands directly in Windows PowerShell/CMD.**

## Project-Specific Rules

- Python 3.11+ with type hints
- No print() in production - use logging
- Run tests: `pytest --cov=src`

## Critical Rules

### Code Organization

- Many small files over few large files
- High cohesion, low coupling
- 200-400 lines typical, 800 max per file
- Organize by feature/domain, not by type

### Code Style

- No emojis in code, comments, or documentation
- Immutability always - never mutate objects or arrays
- No console.log in production code
- Proper error handling with try/catch
- Input validation with Zod or similar

### Testing

- TDD: Write tests first
- 80% minimum coverage
- Unit tests for utilities
- Integration tests for APIs
- E2E tests for critical flows

### Security

- No hardcoded secrets
- Environment variables for sensitive data
- Validate all user inputs
- Parameterized queries only
- CSRF protection enabled

## Environment Variables

```bash
# Required
DB_PASSWORD=

# Optional (with defaults)
DB_HOST=192.168.68.31
DB_PORT=5432
DB_NAME=docmaster
DB_USER=docmaster
MODEL_PATH=runs/train/invoice_fields/weights/best.pt
CONFIDENCE_THRESHOLD=0.5
SERVER_HOST=0.0.0.0
SERVER_PORT=8000
```

## Available Commands

- `/tdd` - Test-driven development workflow
- `/plan` - Create implementation plan
- `/code-review` - Review code quality
- `/build-fix` - Fix build errors

## Git Workflow

- Conventional commits: `feat:`, `fix:`, `refactor:`, `docs:`, `test:`
- Never commit to main directly
- PRs require review
- All tests must pass before merge

Push the code before the review and fixes are finished.
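
As an illustration of the Environment Variables section, the following is a minimal sketch of how the listed variables might be read in Python, with the documented defaults applied and `DB_PASSWORD` treated as required. The `Settings` class and `load_settings` function are hypothetical names introduced here for the example; they are not taken from the project's code. The frozen dataclass is chosen to match the immutability rule above.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical immutable settings container (not part of the project)."""

    db_password: str
    db_host: str
    db_port: int
    db_name: str
    db_user: str
    model_path: str
    confidence_threshold: float
    server_host: str
    server_port: int


def load_settings(env: Mapping[str, str] | None = None) -> Settings:
    """Build Settings from a mapping of environment variables.

    Falls back to os.environ when no mapping is given. The defaults mirror
    the values listed in the Environment Variables section.
    """
    source = os.environ if env is None else env

    # DB_PASSWORD is the only required variable; an empty value counts as missing.
    password = source.get("DB_PASSWORD", "")
    if not password:
        raise RuntimeError("DB_PASSWORD must be set")

    return Settings(
        db_password=password,
        db_host=source.get("DB_HOST", "192.168.68.31"),
        db_port=int(source.get("DB_PORT", "5432")),
        db_name=source.get("DB_NAME", "docmaster"),
        db_user=source.get("DB_USER", "docmaster"),
        model_path=source.get(
            "MODEL_PATH", "runs/train/invoice_fields/weights/best.pt"
        ),
        confidence_threshold=float(source.get("CONFIDENCE_THRESHOLD", "0.5")),
        server_host=source.get("SERVER_HOST", "0.0.0.0"),
        server_port=int(source.get("SERVER_PORT", "8000")),
    )
```

Passing an explicit mapping, for example `load_settings({"DB_PASSWORD": "secret", "DB_PORT": "5433"})`, keeps the function easy to unit test without touching the real process environment.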