Features:
- Auto-labeling pipeline: CSV values -> PDF search -> YOLO annotations
- Flexible date matching: year-month match, nearby date tolerance
- PDF text extraction with PyMuPDF
- OCR support for scanned documents (PaddleOCR)
- YOLO training and inference pipeline
- 7 field types: InvoiceNumber, InvoiceDate, InvoiceDueDate, OCR, Bankgiro, Plusgiro, Amount

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
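The last step of the auto-labeling pipeline listed above (a located value becomes a YOLO annotation) could look roughly like the sketch below. It is not the repository's code. The function name `to_yolo_line`, the class order in `FIELD_CLASSES`, and the top-left pixel coordinate convention are all assumptions made for illustration. Only the output layout (`class x_center y_center width height`, normalized to 0..1) is the standard YOLO label format.

```python
# Hypothetical class order: the commit message names the 7 fields
# but does not say which class ID each one gets.
FIELD_CLASSES = [
    "InvoiceNumber",
    "InvoiceDate",
    "InvoiceDueDate",
    "OCR",
    "Bankgiro",
    "Plusgiro",
    "Amount",
]


def to_yolo_line(field, bbox, page_width, page_height):
    """Convert one matched field to a YOLO label line.

    bbox is (x0, y0, x1, y1) in the same units as page_width and
    page_height, with the origin at the top-left corner (assumed).
    """
    x0, y0, x1, y1 = bbox
    class_id = FIELD_CLASSES.index(field)  # raises ValueError for unknown fields

    # YOLO stores the box centre and size, each normalized by the page size.
    x_center = (x0 + x1) / 2 / page_width
    y_center = (y0 + y1) / 2 / page_height
    width = (x1 - x0) / page_width
    height = (y1 - y0) / page_height

    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 200x50 box on a 1000x500 page, labelled as the Amount field.
    print(to_yolo_line("Amount", (100, 200, 300, 250), 1000, 500))
```

One such line per matched field would be written to a `.txt` file alongside each rendered page image. The ignore rules cover that output: the `data/dataset/*/images/` and `data/dataset/*/labels/` directories are excluded from version control.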
72 lines
693 B
Plaintext
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Virtual environments
venv/
ENV/
env/
.venv/

# IDE
.idea/
.vscode/
*.swp
*.swo
*~

# Data files (large files)
data/raw_pdfs/
data/dataset/train/images/
data/dataset/val/images/
data/dataset/test/images/
data/dataset/train/labels/
data/dataset/val/labels/
data/dataset/test/labels/
*.pdf
*.png
*.jpg
*.jpeg

# Model weights
models/weights/
runs/
*.pt
*.onnx

# Reports and logs
reports/*.jsonl
logs/
*.log

# Jupyter
.ipynb_checkpoints/

# OS
.DS_Store
Thumbs.db

# Credentials
.env
*.key
*.pem
credentials.json