# Shared Package

Shared utilities and abstractions for the Invoice Master system.

## Storage Abstraction Layer

A unified storage abstraction supporting multiple backends:

- **Local filesystem** - Development and testing
- **Azure Blob Storage** - Azure cloud deployments
- **AWS S3** - AWS cloud deployments

### Installation

```bash
# Basic installation (local storage only)
pip install -e packages/shared

# With Azure support
pip install -e "packages/shared[azure]"

# With S3 support
pip install -e "packages/shared[s3]"

# All cloud providers
pip install -e "packages/shared[all]"
```

### Quick Start

```python
from pathlib import Path

from shared.storage import get_storage_backend

# Option 1: From configuration file
storage = get_storage_backend("storage.yaml")

# Option 2: From environment variables
from shared.storage import create_storage_backend_from_env

storage = create_storage_backend_from_env()

# Upload a file
storage.upload(Path("local/file.pdf"), "documents/file.pdf")

# Download a file
storage.download("documents/file.pdf", Path("local/downloaded.pdf"))

# Get pre-signed URL for frontend access
url = storage.get_presigned_url("documents/file.pdf", expires_in_seconds=3600)
```

### Configuration File Format

Create a `storage.yaml` file. Values may reference environment variables as `${VAR}` or `${VAR:-default}`:

```yaml
# Backend selection: local, azure_blob, or s3
backend: ${STORAGE_BACKEND:-local}

# Default pre-signed URL expiry (seconds)
presigned_url_expiry: 3600

# Local storage configuration
local:
  base_path: ${STORAGE_BASE_PATH:-./data/storage}

# Azure Blob Storage configuration
azure:
  connection_string: ${AZURE_STORAGE_CONNECTION_STRING}
  container_name: ${AZURE_STORAGE_CONTAINER:-documents}
  create_container: false

# AWS S3 configuration
s3:
  bucket_name: ${AWS_S3_BUCKET}
  region_name: ${AWS_REGION:-us-east-1}
  access_key_id: ${AWS_ACCESS_KEY_ID}
  secret_access_key: ${AWS_SECRET_ACCESS_KEY}
  endpoint_url: ${AWS_ENDPOINT_URL}  # Optional, for S3-compatible services
  create_bucket: false
```

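The `${VAR}` and `${VAR:-default}` placeholders follow shell-style parameter expansion. How the package's loader resolves them is not shown here, so the following is only a minimal sketch of one way such substitution could work. The helper name `substitute_env_vars` and the rule that an unset or empty variable without a default becomes an empty string are assumptions, not the package's API.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} and ${NAME:-default}; the default part may be empty.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def substitute_env_vars(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${VAR} and ${VAR:-default} placeholders in text.

    Hypothetical helper: an unset or empty variable falls back to its default,
    or to an empty string when no default is given (shell-like behaviour).
    """
    values = os.environ if env is None else env

    def replace(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        value = values.get(name)
        if value:  # variable is set and non-empty
            return value
        return default if default is not None else ""

    return _PLACEHOLDER.sub(replace, text)


line = "base_path: ${STORAGE_BASE_PATH:-./data/storage}"
print(substitute_env_vars(line, env={}))  # base_path: ./data/storage
print(substitute_env_vars(line, env={"STORAGE_BASE_PATH": "/srv/files"}))  # base_path: /srv/files
```

Passing an explicit `env` mapping keeps the sketch easy to test without touching the real process environment.
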
### Environment Variables

| Variable | Backend | Description |
|----------|---------|-------------|
| `STORAGE_BACKEND` | All | Backend type: `local`, `azure_blob`, `s3` |
| `STORAGE_BASE_PATH` | Local | Base directory path |
| `AZURE_STORAGE_CONNECTION_STRING` | Azure | Connection string |
| `AZURE_STORAGE_CONTAINER` | Azure | Container name |
| `AWS_S3_BUCKET` | S3 | Bucket name |
| `AWS_REGION` | S3 | AWS region (default: us-east-1) |
| `AWS_ACCESS_KEY_ID` | S3 | Access key (optional, uses credential chain) |
| `AWS_SECRET_ACCESS_KEY` | S3 | Secret key (optional) |
| `AWS_ENDPOINT_URL` | S3 | Custom endpoint for S3-compatible services |

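To make the table concrete, here is a rough sketch of how the variables could be grouped per backend. `read_storage_env` is a hypothetical helper, not something the package exports, and the fallback values for the base path and container name are borrowed from the sample `storage.yaml`; `create_storage_backend_from_env()` may well behave differently.

```python
import os
from typing import Mapping, Optional


def read_storage_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the documented variables for the selected backend (hypothetical)."""
    env = os.environ if env is None else env
    backend = env.get("STORAGE_BACKEND", "local")

    if backend == "local":
        return {
            "backend": backend,
            "base_path": env.get("STORAGE_BASE_PATH", "./data/storage"),
        }
    if backend == "azure_blob":
        return {
            "backend": backend,
            # Indexing raises KeyError when the connection string is missing.
            "connection_string": env["AZURE_STORAGE_CONNECTION_STRING"],
            "container_name": env.get("AZURE_STORAGE_CONTAINER", "documents"),
        }
    if backend == "s3":
        return {
            "backend": backend,
            "bucket_name": env["AWS_S3_BUCKET"],
            "region_name": env.get("AWS_REGION", "us-east-1"),
            # None here means "no explicit value"; the table marks these as optional.
            "access_key_id": env.get("AWS_ACCESS_KEY_ID"),
            "secret_access_key": env.get("AWS_SECRET_ACCESS_KEY"),
            "endpoint_url": env.get("AWS_ENDPOINT_URL"),
        }
    raise ValueError(f"Unknown STORAGE_BACKEND: {backend!r}")


settings = read_storage_env({"STORAGE_BACKEND": "s3", "AWS_S3_BUCKET": "test-bucket"})
print(settings["region_name"])  # us-east-1
```
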
### API Reference

#### StorageBackend Interface

```python
from abc import ABC
from pathlib import Path


class StorageBackend(ABC):
    def upload(self, local_path: Path, remote_path: str, overwrite: bool = False) -> str:
        """Upload a file to storage."""

    def download(self, remote_path: str, local_path: Path) -> Path:
        """Download a file from storage."""

    def exists(self, remote_path: str) -> bool:
        """Check if a file exists."""

    def list_files(self, prefix: str) -> list[str]:
        """List files with given prefix."""

    def delete(self, remote_path: str) -> bool:
        """Delete a file."""

    def get_url(self, remote_path: str) -> str:
        """Get URL for a file."""

    def get_presigned_url(self, remote_path: str, expires_in_seconds: int = 3600) -> str:
        """Generate a pre-signed URL for temporary access (1-604800 seconds)."""

    def upload_bytes(self, data: bytes, remote_path: str, overwrite: bool = False) -> str:
        """Upload bytes directly."""

    def download_bytes(self, remote_path: str) -> bytes:
        """Download file as bytes."""
```

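For unit tests that should not touch a real backend, a small in-memory stand-in with the same method names can be handy. The sketch below is hypothetical and not part of the package: it does not subclass `StorageBackend`, it covers only the byte-oriented methods, and its choices (returning the remote path from `upload_bytes`, raising the built-in `FileExistsError` and `FileNotFoundError`) are assumptions rather than the documented behaviour of the real backends.

```python
class InMemoryStorage:
    """Hypothetical test double mirroring part of the interface above."""

    def __init__(self) -> None:
        self._files = {}  # remote path -> file contents

    def upload_bytes(self, data: bytes, remote_path: str, overwrite: bool = False) -> str:
        # Assumption: refuse to replace an existing file unless overwrite=True.
        if remote_path in self._files and not overwrite:
            raise FileExistsError(remote_path)
        self._files[remote_path] = data
        return remote_path

    def download_bytes(self, remote_path: str) -> bytes:
        if remote_path not in self._files:
            raise FileNotFoundError(remote_path)
        return self._files[remote_path]

    def exists(self, remote_path: str) -> bool:
        return remote_path in self._files

    def list_files(self, prefix: str) -> list[str]:
        return sorted(path for path in self._files if path.startswith(prefix))

    def delete(self, remote_path: str) -> bool:
        # True when something was removed, False when the path was unknown.
        return self._files.pop(remote_path, None) is not None


storage = InMemoryStorage()
storage.upload_bytes(b"%PDF-1.7", "documents/invoice.pdf")
print(storage.list_files("documents/"))  # ['documents/invoice.pdf']
```
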
#### Factory Functions

```python
# Create from configuration file
storage = create_storage_backend_from_file("storage.yaml")

# Create from environment variables
storage = create_storage_backend_from_env()

# Create from StorageConfig object
config = StorageConfig(backend_type="local", base_path=Path("./data"))
storage = create_storage_backend(config)

# Convenience function with fallback chain: config file -> env vars -> local default
storage = get_storage_backend("storage.yaml")  # or None for env-only
```

### Pre-signed URLs

Pre-signed URLs provide temporary access to files without exposing credentials:

```python
# Generate URL valid for 1 hour (default)
url = storage.get_presigned_url("documents/invoice.pdf")

# Generate URL valid for 24 hours
url = storage.get_presigned_url("documents/invoice.pdf", expires_in_seconds=86400)

# Maximum expiry: 7 days (604800 seconds)
url = storage.get_presigned_url("documents/invoice.pdf", expires_in_seconds=604800)
```

**Note:** Local storage returns `file://` URLs that don't actually expire.

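Because the accepted range is 1 to 604800 seconds, callers that take the expiry from user input may want to check it before requesting a URL. The sketch below shows one possible check; `validate_expiry` and the choice of `ValueError` are assumptions, and what the backends themselves raise for an out-of-range value is not documented here.

```python
MIN_EXPIRY_SECONDS = 1
MAX_EXPIRY_SECONDS = 7 * 24 * 60 * 60  # 7 days = 604800 seconds


def validate_expiry(expires_in_seconds: int) -> int:
    """Return the value unchanged if it lies in the documented range (hypothetical helper)."""
    if not MIN_EXPIRY_SECONDS <= expires_in_seconds <= MAX_EXPIRY_SECONDS:
        raise ValueError(
            f"expires_in_seconds must be between {MIN_EXPIRY_SECONDS} "
            f"and {MAX_EXPIRY_SECONDS}, got {expires_in_seconds}"
        )
    return expires_in_seconds


print(validate_expiry(86400))  # 86400
```

A call such as `storage.get_presigned_url(path, expires_in_seconds=validate_expiry(requested))` would then fail early with a clear message.
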
### Error Handling

```python
from pathlib import Path

from shared.storage import (
    StorageError,
    FileNotFoundStorageError,
    PresignedUrlNotSupportedError,
)

try:
    storage.download("nonexistent.pdf", Path("local.pdf"))
except FileNotFoundStorageError as e:
    print(f"File not found: {e}")
except StorageError as e:
    print(f"Storage error: {e}")
```

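The order of the `except` clauses matters if `FileNotFoundStorageError` derives from `StorageError`, which the example above suggests but does not state. The sketch below uses local stand-in classes under that assumption (it does not import the real exceptions) to show why the more specific handler has to come first; `describe_failure` is a hypothetical helper.

```python
class StorageError(Exception):
    """Stand-in for shared.storage.StorageError."""


class FileNotFoundStorageError(StorageError):
    """Stand-in; assumed here to be a subclass of StorageError."""


def describe_failure(action) -> str:
    """Run action() and report which handler caught its failure, if any."""
    try:
        action()
    except FileNotFoundStorageError as e:
        # Must precede the StorageError handler, which would also match.
        return f"missing: {e}"
    except StorageError as e:
        return f"storage failure: {e}"
    return "ok"


def missing_file():
    raise FileNotFoundStorageError("nonexistent.pdf")


def broken_connection():
    raise StorageError("connection reset")


print(describe_failure(missing_file))       # missing: nonexistent.pdf
print(describe_failure(broken_connection))  # storage failure: connection reset
```
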
### Testing with MinIO (S3-compatible)

```bash
# Start MinIO locally
docker run -p 9000:9000 -p 9001:9001 minio/minio server /data --console-address ":9001"

# Configure environment
export STORAGE_BACKEND=s3
export AWS_S3_BUCKET=test-bucket
export AWS_ENDPOINT_URL=http://localhost:9000
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin
```

### Module Structure

```
shared/storage/
├── __init__.py          # Public exports
├── base.py              # Abstract interface and exceptions
├── local.py             # Local filesystem backend
├── azure.py             # Azure Blob Storage backend
├── s3.py                # AWS S3 backend
├── config_loader.py     # YAML configuration loader
└── factory.py           # Backend factory functions
```