- Update paddlepaddle from >=2.5.0 to >=3.0.0,<3.3.0
- Update paddleocr from >=2.7.0 to >=3.0.0
- Update paddlepaddle-gpu from >=2.5.0 to >=3.0.0,<3.3.0

Note: PaddlePaddle 3.3.0 has a OneDNN bug that breaks CPU inference
(ConvertPirAttribute2RuntimeAttribute not implemented). Using <3.3.0
until the bug is fixed upstream.

This upgrade enables PP-StructureV3 for table extraction and uses
PP-OCRv5 for improved text recognition accuracy.

The existing codebase is already compatible with the 3.x API
(predict() method and new response format).

Verified:
- PaddleOCR import works
- PPStructureV3 is available
- OCREngine initializes correctly
- Inference API returns correct field extractions
- 2117 unit tests pass

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
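The version window described above (>=3.0.0,<3.3.0) could be guarded at startup with a small check. The following is only a sketch under stated assumptions: the helper names `parse_release`, `in_supported_range`, and `check_installed` are hypothetical and not part of this repository, and the parsing is a simplified stand-in for full PEP 440 handling that only looks at the leading numeric release segment.

```python
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Return the leading numeric release segment, e.g. '3.2.0.post1' -> (3, 2, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component such as 'post1'
        parts.append(int(digits))
    return tuple(parts)


def in_supported_range(version: str, lower=(3, 0, 0), upper=(3, 3, 0)) -> bool:
    """Check lower <= version < upper, mirroring the pin '>=3.0.0,<3.3.0'."""
    release = parse_release(version)
    # Pad short versions such as '3.0' so tuple comparison behaves as expected.
    release = release + (0,) * (3 - len(release))
    return lower <= release < upper


def check_installed(package: str = "paddlepaddle"):
    """Return True/False for the installed package, or None if it is not installed."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return in_supported_range(installed)


if __name__ == "__main__":
    print("paddlepaddle within supported range:", check_installed())
```

A check like this only reports whether the installed version falls inside the pinned window; pip still enforces the pin itself at install time.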
# Invoice Master POC v2 - Dependencies

# PDF Processing
PyMuPDF>=1.23.0  # PDF rendering and text extraction

# OCR
paddlepaddle>=3.0.0,<3.3.0  # PaddlePaddle framework (3.3.0 has OneDNN bug)
paddleocr>=3.0.0  # PaddleOCR (PP-OCRv5)

# YOLO
ultralytics>=8.1.0  # YOLOv8/v11

# Image Processing
Pillow>=10.0.0  # Image handling
numpy>=1.24.0  # Array operations
opencv-python>=4.8.0  # Image processing

# Data Processing
pyyaml>=6.0  # YAML config files

# Utilities
tqdm>=4.65.0  # Progress bars
python-dotenv>=1.0.0  # Environment variable management

# Database
psycopg2-binary>=2.9.0  # PostgreSQL driver
sqlmodel>=0.0.22  # SQLModel ORM (SQLAlchemy + Pydantic)