Invoice Master POC v2
Swedish Invoice Field Extraction System - YOLOv11 + PaddleOCR. Extracts structured data from Swedish PDF invoices.
Tech Stack
| Component | Technology |
|---|---|
| Object Detection | YOLOv11 (Ultralytics) |
| OCR Engine | PaddleOCR v5 (PP-OCRv5) |
| PDF Processing | PyMuPDF (fitz) |
| Database | PostgreSQL + psycopg2 |
| Web Framework | FastAPI + Uvicorn |
| Deep Learning | PyTorch + CUDA 12.x |
WSL Environment (REQUIRED)
Prefix ALL commands with:
wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && <command>"
NEVER run Python commands directly in Windows PowerShell/CMD.
Project-Specific Rules
- Python 3.11+ with type hints
- No print() in production - use logging
- Run tests: pytest --cov=src
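The logging rule above can be followed with the standard library's `logging` module. The following is a minimal sketch only: the function `normalize_amount` and its parsing behavior are illustrative assumptions, not code taken from this repository.

```python
import logging
from typing import Optional

# Module-level logger; handlers and levels are configured by the application.
logger = logging.getLogger(__name__)


def normalize_amount(raw: str) -> Optional[float]:
    """Hypothetical normalizer: parse an amount written like '1 234,50'."""
    # Drop regular and non-breaking spaces, then use a dot as decimal separator.
    cleaned = raw.replace("\u00a0", "").replace(" ", "").replace(",", ".")
    try:
        value = float(cleaned)
    except ValueError:
        # Report the failure through logging instead of print().
        logger.warning("Could not parse amount: %r", raw)
        return None
    logger.debug("Parsed amount %r as %.2f", raw, value)
    return value
```

Using `%r` placeholders rather than f-strings defers string formatting until a handler actually emits the record.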
File Structure
src/
├── cli/ # autolabel, train, infer, serve
├── pdf/ # extractor, renderer, detector
├── ocr/ # PaddleOCR wrapper, machine_code_parser
├── inference/ # pipeline, yolo_detector, field_extractor
├── normalize/ # Per-field normalizers
├── matcher/ # Exact, substring, fuzzy strategies
├── processing/ # CPU/GPU pool architecture
├── web/ # FastAPI app, routes, services, schemas
├── utils/ # validators, text_cleaner, fuzzy_matcher
└── data/ # Database operations
tests/ # Mirror of src structure
runs/train/ # Training outputs
Supported Fields
| ID | Field | Description |
|---|---|---|
| 0 | invoice_number | Invoice number |
| 1 | invoice_date | Invoice date |
| 2 | invoice_due_date | Due date |
| 3 | ocr_number | OCR reference (Swedish payment) |
| 4 | bankgiro | Bankgiro account |
| 5 | plusgiro | Plusgiro account |
| 6 | amount | Amount |
| 7 | supplier_organisation_number | Supplier org number |
| 8 | payment_line | Payment line (machine-readable) |
| 9 | customer_number | Customer number |
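The table above maps naturally onto a class-ID lookup. The sketch below shows one way to express it; the names `FIELD_CLASSES`, `CLASS_IDS`, and `field_name` are assumptions made for illustration, and the repository may define this mapping differently.

```python
# Class ID to field name, copied from the Supported Fields table.
FIELD_CLASSES: dict[int, str] = {
    0: "invoice_number",
    1: "invoice_date",
    2: "invoice_due_date",
    3: "ocr_number",
    4: "bankgiro",
    5: "plusgiro",
    6: "amount",
    7: "supplier_organisation_number",
    8: "payment_line",
    9: "customer_number",
}

# Reverse lookup: field name to class ID.
CLASS_IDS: dict[str, int] = {name: cid for cid, name in FIELD_CLASSES.items()}


def field_name(class_id: int) -> str:
    """Translate a detector class ID into its field name."""
    try:
        return FIELD_CLASSES[class_id]
    except KeyError:
        raise ValueError(f"Unknown class ID: {class_id}") from None
```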
Key Patterns
Inference Result
@dataclass
class InferenceResult:
document_id: str
document_type: str # "invoice" or "letter"
fields: dict[str, str]
confidence: dict[str, float]
cross_validation: CrossValidationResult | None
processing_time_ms: float
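As a usage illustration, the sketch below filters a result down to fields that meet a confidence threshold. It redefines the dataclass so the example is self-contained; `CrossValidationResult` is not shown in this document, so that field is typed loosely here. The helper `confident_fields` and the sample values are hypothetical, and the real pipeline may apply the threshold at a different stage.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class InferenceResult:
    # Stand-in mirroring the dataclass above, for a runnable example.
    document_id: str
    document_type: str  # "invoice" or "letter"
    fields: dict[str, str]
    confidence: dict[str, float]
    cross_validation: Optional[Any]  # real type: CrossValidationResult | None
    processing_time_ms: float


def confident_fields(result: InferenceResult, threshold: float = 0.5) -> dict[str, str]:
    """Return only the fields whose confidence is at or above `threshold`."""
    return {
        name: value
        for name, value in result.fields.items()
        # A field with no recorded confidence is treated as 0.0.
        if result.confidence.get(name, 0.0) >= threshold
    }


# Made-up sample values, for illustration only.
example = InferenceResult(
    document_id="doc-001",
    document_type="invoice",
    fields={"invoice_number": "12345", "bankgiro": "123-4567"},
    confidence={"invoice_number": 0.97, "bankgiro": 0.42},
    cross_validation=None,
    processing_time_ms=812.0,
)
```

Calling `confident_fields(example)` keeps `invoice_number` and drops `bankgiro`, whose confidence falls below the default of 0.5.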
API Schemas
See src/web/schemas.py for request/response models.
Environment Variables
# Required
DB_PASSWORD=
# Optional (with defaults)
DB_HOST=192.168.68.31
DB_PORT=5432
DB_NAME=docmaster
DB_USER=docmaster
MODEL_PATH=runs/train/invoice_fields/weights/best.pt
CONFIDENCE_THRESHOLD=0.5
SERVER_HOST=0.0.0.0
SERVER_PORT=8000
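One way to read these variables with their defaults is sketched below. The `Settings` class and `load_settings` function are assumptions for illustration; the project's actual configuration loader is not shown in this document.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    db_password: str
    db_host: str
    db_port: int
    db_name: str
    db_user: str
    model_path: str
    confidence_threshold: float
    server_host: str
    server_port: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, using the documented defaults."""
    env = os.environ if env is None else env
    password = env.get("DB_PASSWORD", "")
    if not password:
        # DB_PASSWORD is the only required variable.
        raise RuntimeError("DB_PASSWORD must be set")
    return Settings(
        db_password=password,
        db_host=env.get("DB_HOST", "192.168.68.31"),
        db_port=int(env.get("DB_PORT", "5432")),
        db_name=env.get("DB_NAME", "docmaster"),
        db_user=env.get("DB_USER", "docmaster"),
        model_path=env.get("MODEL_PATH", "runs/train/invoice_fields/weights/best.pt"),
        confidence_threshold=float(env.get("CONFIDENCE_THRESHOLD", "0.5")),
        server_host=env.get("SERVER_HOST", "0.0.0.0"),
        server_port=int(env.get("SERVER_PORT", "8000")),
    )
```

Passing a plain dict as `env` keeps the loader easy to test without touching the real environment.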
CLI Commands
# Auto-labeling
python -m src.cli.autolabel --dual-pool --cpu-workers 3 --gpu-workers 1
# Training
python -m src.cli.train --model yolo11n.pt --epochs 100 --batch 16 --name invoice_fields
# Inference
python -m src.cli.infer --model runs/train/invoice_fields/weights/best.pt --input invoice.pdf --gpu
# Web Server
python run_server.py --port 8000
API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Web UI |
| GET | /api/v1/health | Health check |
| POST | /api/v1/infer | Process invoice |
| GET | /api/v1/results/{filename} | Get visualization |
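A small client-side sketch for the health endpoint follows, using only the standard library. The helper names `endpoint` and `is_healthy` are hypothetical, and the check assumes a healthy server answers with HTTP 200; the response body format is not documented here. The request format for POST /api/v1/infer is defined in src/web/schemas.py and is not reproduced in this sketch.

```python
import urllib.request

BASE_URL = "http://localhost:8000"  # default address from the Quick Start section


def endpoint(path: str, base_url: str = BASE_URL) -> str:
    """Join a base URL and an API path without doubling slashes."""
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}"


def is_healthy(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Return True if GET /api/v1/health responds with HTTP 200."""
    url = endpoint("/api/v1/health", base_url)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection failures, timeouts, and HTTP error statuses.
        return False
```

With the server running, `is_healthy()` can serve as a quick readiness check before submitting invoices.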
Current Status
- Tests: 688 passing
- Coverage: 37%
- Model: 93.5% mAP@0.5
- Documents Labeled: 9,738
Quick Start
# Start server
wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && python run_server.py"
# Run tests
wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && pytest"
# Access UI: http://localhost:8000