# Invoice Master POC v2

Swedish invoice field extraction system: YOLOv11 + PaddleOCR extract structured data from Swedish PDF invoices.

## Tech Stack

| Component | Technology |
|-----------|------------|
| Object Detection | YOLOv11 (Ultralytics) |
| OCR Engine | PaddleOCR v5 (PP-OCRv5) |
| PDF Processing | PyMuPDF (fitz) |
| Database | PostgreSQL + psycopg2 |
| Web Framework | FastAPI + Uvicorn |
| Deep Learning | PyTorch + CUDA 12.x |

## WSL Environment (REQUIRED)

**Prefix ALL commands with:**

```bash
wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && <command>"
```

**NEVER run Python commands directly in Windows PowerShell/CMD.**

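When commands are launched from a script on the Windows side, the prefix above can be assembled programmatically. This is a minimal sketch only: `wsl_command` is a hypothetical helper, not something the project provides.

```python
# Hypothetical helper: wraps a command in the WSL + conda prefix shown above.
CONDA_INIT = "source ~/miniconda3/etc/profile.d/conda.sh"
CONDA_ENV = "invoice-py311"


def wsl_command(command: str) -> list[str]:
    """Return an argv list that runs `command` inside WSL with the conda env active."""
    script = f"{CONDA_INIT} && conda activate {CONDA_ENV} && {command}"
    return ["wsl", "bash", "-c", script]
```

The returned list can be passed to `subprocess.run(wsl_command("pytest --cov=src"), check=True)`, which avoids quoting the inner script by hand.
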
## Project-Specific Rules

- Python 3.11+ with type hints
- No `print()` in production - use `logging`
- Run tests: `pytest --cov=src`

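The first two rules can be illustrated with a small sketch. The logger name and the function below are made up for the example and are not taken from the codebase.

```python
import logging

# Hypothetical logger name; the project may organize its loggers differently.
logger = logging.getLogger("invoice_master.example")


def count_nonempty_fields(fields: dict[str, str]) -> int:
    """Return how many extracted fields carry a non-empty value."""
    found = sum(1 for value in fields.values() if value.strip())
    # Log instead of print(); %-style arguments defer formatting until the record is emitted.
    logger.info("Extracted %d of %d fields", found, len(fields))
    return found
```

Calling `logging.basicConfig(level=logging.INFO)` once at program start makes these messages visible on the console.
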
## File Structure

```
src/
├── cli/          # autolabel, train, infer, serve
├── pdf/          # extractor, renderer, detector
├── ocr/          # PaddleOCR wrapper, machine_code_parser
├── inference/    # pipeline, yolo_detector, field_extractor
├── normalize/    # Per-field normalizers
├── matcher/      # Exact, substring, fuzzy strategies
├── processing/   # CPU/GPU pool architecture
├── web/          # FastAPI app, routes, services, schemas
├── utils/        # validators, text_cleaner, fuzzy_matcher
└── data/         # Database operations
tests/            # Mirror of src structure
runs/train/       # Training outputs
```

## Supported Fields

| ID | Field | Description |
|----|-------|-------------|
| 0 | invoice_number | Invoice number |
| 1 | invoice_date | Invoice date |
| 2 | invoice_due_date | Due date |
| 3 | ocr_number | OCR reference (Swedish payment) |
| 4 | bankgiro | Bankgiro account |
| 5 | plusgiro | Plusgiro account |
| 6 | amount | Amount |
| 7 | supplier_organisation_number | Supplier org number |
| 8 | payment_line | Payment line (machine-readable) |
| 9 | customer_number | Customer number |

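The table above can be mirrored in code as a lookup, which is handy when translating detector output into field names. The sketch assumes the IDs correspond to the detector's class indices. `FIELD_NAMES`, `FIELD_IDS`, and `field_name` are hypothetical names.

```python
# Field IDs and names copied from the table above.
FIELD_NAMES: dict[int, str] = {
    0: "invoice_number",
    1: "invoice_date",
    2: "invoice_due_date",
    3: "ocr_number",
    4: "bankgiro",
    5: "plusgiro",
    6: "amount",
    7: "supplier_organisation_number",
    8: "payment_line",
    9: "customer_number",
}

# Reverse lookup: field name -> ID.
FIELD_IDS: dict[str, int] = {name: idx for idx, name in FIELD_NAMES.items()}


def field_name(class_id: int) -> str:
    """Translate a class ID into its field name, rejecting unknown IDs."""
    try:
        return FIELD_NAMES[class_id]
    except KeyError:
        raise ValueError(f"Unknown field ID: {class_id}") from None
```
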
## Key Patterns

### Inference Result

```python
from dataclasses import dataclass


# CrossValidationResult is defined elsewhere in the project.
@dataclass
class InferenceResult:
    document_id: str
    document_type: str  # "invoice" or "letter"
    fields: dict[str, str]  # field name -> extracted value
    confidence: dict[str, float]  # field name -> confidence score
    cross_validation: CrossValidationResult | None
    processing_time_ms: float
```

### API Schemas

See `src/web/schemas.py` for request/response models.

## Environment Variables

```bash
# Required
DB_PASSWORD=

# Optional (with defaults)
DB_HOST=192.168.68.31
DB_PORT=5432
DB_NAME=docmaster
DB_USER=docmaster
MODEL_PATH=runs/train/invoice_fields/weights/best.pt
CONFIDENCE_THRESHOLD=0.5
SERVER_HOST=0.0.0.0
SERVER_PORT=8000
```

## CLI Commands

```bash
# Auto-labeling
python -m src.cli.autolabel --dual-pool --cpu-workers 3 --gpu-workers 1

# Training
python -m src.cli.train --model yolo11n.pt --epochs 100 --batch 16 --name invoice_fields

# Inference
python -m src.cli.infer --model runs/train/invoice_fields/weights/best.pt --input invoice.pdf --gpu

# Web Server
python run_server.py --port 8000
```

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/` | Web UI |
| GET | `/api/v1/health` | Health check |
| POST | `/api/v1/infer` | Process invoice |
| GET | `/api/v1/results/{filename}` | Get visualization |

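A minimal client-side sketch for the health endpoint follows, using only the standard library. It assumes the server listens on `http://localhost:8000` and that a healthy server answers with HTTP 200. The response body is not inspected because its shape is not documented here. `endpoint_url` and `is_healthy` are hypothetical helpers.

```python
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # assumed local server address


def endpoint_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def is_healthy(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Return True if GET /api/v1/health answers with HTTP 200."""
    try:
        with urlopen(endpoint_url("/api/v1/health", base_url), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection failures and HTTP error statuses (URLError subclasses OSError).
        return False
```
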
## Current Status

- **Tests**: 688 passing
- **Coverage**: 37%
- **Model**: 93.5% mAP@0.5
- **Documents Labeled**: 9,738

## Quick Start

```bash
# Start server
wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && python run_server.py"

# Run tests
wsl bash -c "source ~/miniconda3/etc/profile.d/conda.sh && conda activate invoice-py311 && cd /mnt/c/Users/yaoji/git/ColaCoder/invoice-master-poc-v2 && pytest"

# Access UI: http://localhost:8000
```