invoice-master-poc-v2/requirements.txt
Yaojia Wang c4e3773df1 feat: upgrade PaddlePaddle and PaddleOCR to 3.x
- Update paddlepaddle from >=2.5.0 to >=3.0.0,<3.3.0
- Update paddleocr from >=2.7.0 to >=3.0.0
- Update paddlepaddle-gpu from >=2.5.0 to >=3.0.0,<3.3.0

Note: PaddlePaddle 3.3.0 has a oneDNN bug that breaks CPU inference
(ConvertPirAttribute2RuntimeAttribute not implemented). Pinning to <3.3.0
until the bug is fixed upstream.
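
As a rough illustration of the version bounds above, the following sketch checks whether a version string falls inside `>=3.0.0,<3.3.0`. The helper names (`parse_release`, `in_supported_range`, `installed_paddle_ok`) are hypothetical and not part of this repository. The parser only looks at the numeric release segment, so it is an approximation of pip's behaviour, not a replacement for it (for example, it treats `3.0.0rc1` as `3.0.0`).

```python
from importlib import metadata

LOWER = (3, 0, 0)  # inclusive lower bound from requirements.txt
UPPER = (3, 3, 0)  # exclusive upper bound (3.3.0 is excluded)


def parse_release(version: str) -> tuple:
    """Parse the leading numeric release segment, e.g. '3.2.1' -> (3, 2, 1)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad to three components so '3.3' compares equal to '3.3.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def in_supported_range(version: str) -> bool:
    """True if the version satisfies >=3.0.0,<3.3.0 (release segment only)."""
    return LOWER <= parse_release(version) < UPPER


def installed_paddle_ok():
    """Check the installed CPU or GPU wheel; None if neither is installed."""
    for dist in ("paddlepaddle", "paddlepaddle-gpu"):
        try:
            return in_supported_range(metadata.version(dist))
        except metadata.PackageNotFoundError:
            continue
    return None
```

Calling `installed_paddle_ok()` at application start-up would be one way to fail early with a clear message if an environment drifts to an excluded release.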

This upgrade enables PP-StructureV3 for table extraction and uses
PP-OCRv5 for improved text recognition accuracy. The existing codebase
is already compatible with the 3.x API (predict() method and new
response format).

Verified:
- PaddleOCR import works
- PPStructureV3 is available
- OCREngine initializes correctly
- Inference API returns correct field extractions
- 2117 unit tests pass
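
The first two checks in the list above could be scripted along these lines. This is a minimal sketch, not the verification that was actually run: the function name `check_paddleocr_symbols` is hypothetical, and the only assumption about the library is that `PaddleOCR` and `PPStructureV3` are exposed at the top level of the `paddleocr` package, as the commit message implies.

```python
import importlib


def check_paddleocr_symbols(names=("PaddleOCR", "PPStructureV3")):
    """Return {name: bool} saying whether each name is available in paddleocr."""
    try:
        module = importlib.import_module("paddleocr")
    except Exception:
        # Package missing, or it failed while importing (e.g. a broken
        # paddlepaddle install); report every symbol as unavailable.
        return {name: False for name in names}
    return {name: hasattr(module, name) for name in names}


if __name__ == "__main__":
    for name, available in check_paddleocr_symbols().items():
        print(f"{name}: {'ok' if available else 'MISSING'}")
```

The check only confirms that the names can be resolved. It does not load models or run inference, so it complements the unit tests and does not replace them.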

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 12:15:02 +01:00


# Invoice Master POC v2 - Dependencies
# PDF Processing
PyMuPDF>=1.23.0 # PDF rendering and text extraction
# OCR
paddlepaddle>=3.0.0,<3.3.0 # PaddlePaddle framework (3.3.0 has OneDNN bug)
paddleocr>=3.0.0 # PaddleOCR (PP-OCRv5)
# YOLO
ultralytics>=8.1.0 # YOLOv8/v11
# Image Processing
Pillow>=10.0.0 # Image handling
numpy>=1.24.0 # Array operations
opencv-python>=4.8.0 # Image processing
# Data Processing
pyyaml>=6.0 # YAML config files
# Utilities
tqdm>=4.65.0 # Progress bars
python-dotenv>=1.0.0 # Environment variable management
# Database
psycopg2-binary>=2.9.0 # PostgreSQL driver
sqlmodel>=0.0.22 # SQLModel ORM (SQLAlchemy + Pydantic)