# Invoice Master POC v2

Swedish Invoice Field Extraction System - YOLOv11 + PaddleOCR

Extracts structured data from Swedish PDF invoices.

## Tech Stack

| Component | Technology |
|-----------|------------|
| Object Detection | YOLOv11 (Ultralytics) |
| OCR Engine | PaddleOCR v5 (PP-OCRv5) |
| PDF Processing | PyMuPDF (fitz) |
| Database | PostgreSQL + psycopg2 |
| Web Framework | FastAPI + Uvicorn |
| Deep Learning | PyTorch + CUDA 12.x |

## WSL Environment (REQUIRED)

**Prefix ALL commands with the following** (place the command to run after the final `&&`, inside the quotes):

```bash
wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && "
```

**NEVER run Python commands directly in Windows PowerShell/CMD.**

## Project-Specific Rules

- Python 3.11+ with type hints
- No `print()` in production code; use `logging` instead
- Run tests: `pytest --cov=src`

## File Structure

```
src/
├── cli/          # autolabel, train, infer, serve
├── pdf/          # extractor, renderer, detector
├── ocr/          # PaddleOCR wrapper, machine_code_parser
├── inference/    # pipeline, yolo_detector, field_extractor
├── normalize/    # Per-field normalizers
├── matcher/      # Exact, substring, fuzzy strategies
├── processing/   # CPU/GPU pool architecture
├── web/          # FastAPI app, routes, services, schemas
├── utils/        # validators, text_cleaner, fuzzy_matcher
└── data/         # Database operations
tests/            # Mirror of src structure
runs/train/       # Training outputs
```

## Supported Fields

| ID | Field | Description |
|----|-------|-------------|
| 0 | invoice_number | Invoice number |
| 1 | invoice_date | Invoice date |
| 2 | invoice_due_date | Due date |
| 3 | ocr_number | OCR reference (Swedish payment) |
| 4 | bankgiro | Bankgiro account |
| 5 | plusgiro | Plusgiro account |
| 6 | amount | Amount |
| 7 | supplier_organisation_number | Supplier org number |
| 8 | payment_line | Payment line (machine-readable) |
| 9 | customer_number | Customer number |

## Key Patterns

### Inference Result

```python
from dataclasses import dataclass


@dataclass
class InferenceResult:
    document_id: str
    document_type: str  # "invoice" or "letter"
    fields: dict[str, str]
    confidence: dict[str, float]
    cross_validation: CrossValidationResult | None
    processing_time_ms: float
```

### API Schemas

See `src/web/schemas.py` for request/response models.

## Environment Variables

```bash
# Required
DB_PASSWORD=

# Optional (with defaults)
DB_HOST=192.168.68.31
DB_PORT=5432
DB_NAME=docmaster
DB_USER=docmaster
MODEL_PATH=runs/train/invoice_fields/weights/best.pt
CONFIDENCE_THRESHOLD=0.5
SERVER_HOST=0.0.0.0
SERVER_PORT=8000
```

## CLI Commands

```bash
# Auto-labeling
python -m src.cli.autolabel --dual-pool --cpu-workers 3 --gpu-workers 1

# Training
python -m src.cli.train --model yolo11n.pt --epochs 100 --batch 16 --name invoice_fields

# Inference
python -m src.cli.infer --model runs/train/invoice_fields/weights/best.pt --input invoice.pdf --gpu

# Web Server
python run_server.py --port 8000
```

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/` | Web UI |
| GET | `/api/v1/health` | Health check |
| POST | `/api/v1/infer` | Process invoice |
| GET | `/api/v1/results/{filename}` | Get visualization |

## Current Status

- **Tests**: 688 passing
- **Coverage**: 37%
- **Model**: 93.5% mAP@0.5
- **Documents Labeled**: 9,738

## Quick Start

```bash
# Start server
wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && python run_server.py"

# Run tests
wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && pytest"

# Access UI: http://localhost:8000
```
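
As an illustration of how the settings listed under Environment Variables might be read in Python, here is a minimal sketch. The names `Settings` and `load_settings` are hypothetical and not taken from the project; only the variable names and default values come from this README. It assumes `DB_PASSWORD` should fail fast when missing, since it is the only variable marked as required.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables documented in this README."""

    db_password: str
    db_host: str
    db_port: int
    db_name: str
    db_user: str
    model_path: str
    confidence_threshold: float
    server_host: str
    server_port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, applying the README defaults."""
    password = env.get("DB_PASSWORD", "")
    if not password:
        # Assumption: a missing required variable is treated as a hard error.
        raise ValueError("DB_PASSWORD is required")
    return Settings(
        db_password=password,
        db_host=env.get("DB_HOST", "192.168.68.31"),
        db_port=int(env.get("DB_PORT", "5432")),
        db_name=env.get("DB_NAME", "docmaster"),
        db_user=env.get("DB_USER", "docmaster"),
        model_path=env.get(
            "MODEL_PATH", "runs/train/invoice_fields/weights/best.pt"
        ),
        confidence_threshold=float(env.get("CONFIDENCE_THRESHOLD", "0.5")),
        server_host=env.get("SERVER_HOST", "0.0.0.0"),
        server_port=int(env.get("SERVER_PORT", "8000")),
    )
```

Passing the mapping as a parameter rather than reading `os.environ` directly keeps the loader easy to exercise in tests, for example `load_settings({"DB_PASSWORD": "secret", "DB_PORT": "6543"})`.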