# Shared Package

Shared utilities and abstractions for the Invoice Master system.

## Storage Abstraction Layer

A unified storage abstraction supporting multiple backends:

- **Local filesystem** - Development and testing
- **Azure Blob Storage** - Azure cloud deployments
- **AWS S3** - AWS cloud deployments

### Installation

```bash
# Basic installation (local storage only)
pip install -e packages/shared

# With Azure support
pip install -e "packages/shared[azure]"

# With S3 support
pip install -e "packages/shared[s3]"

# All cloud providers
pip install -e "packages/shared[all]"
```

### Quick Start

```python
from pathlib import Path

from shared.storage import create_storage_backend_from_env, get_storage_backend

# Option 1: From configuration file
storage = get_storage_backend("storage.yaml")

# Option 2: From environment variables
storage = create_storage_backend_from_env()

# Upload a file
storage.upload(Path("local/file.pdf"), "documents/file.pdf")

# Download a file
storage.download("documents/file.pdf", Path("local/downloaded.pdf"))

# Get pre-signed URL for frontend access
url = storage.get_presigned_url("documents/file.pdf", expires_in_seconds=3600)
```

### Configuration File Format

Create a `storage.yaml` file. Values support environment variable substitution in the form `${VAR}` or `${VAR:-default}`:

```yaml
# Backend selection: local, azure_blob, or s3
backend: ${STORAGE_BACKEND:-local}

# Default pre-signed URL expiry (seconds)
presigned_url_expiry: 3600

# Local storage configuration
local:
  base_path: ${STORAGE_BASE_PATH:-./data/storage}

# Azure Blob Storage configuration
azure:
  connection_string: ${AZURE_STORAGE_CONNECTION_STRING}
  container_name: ${AZURE_STORAGE_CONTAINER:-documents}
  create_container: false

# AWS S3 configuration
s3:
  bucket_name: ${AWS_S3_BUCKET}
  region_name: ${AWS_REGION:-us-east-1}
  access_key_id: ${AWS_ACCESS_KEY_ID}
  secret_access_key: ${AWS_SECRET_ACCESS_KEY}
  endpoint_url: ${AWS_ENDPOINT_URL}  # Optional, for S3-compatible services
  create_bucket: false
```

### Environment Variables

| Variable | Backend | Description |
|----------|---------|-------------|
| `STORAGE_BACKEND` | All | Backend type: `local`, `azure_blob`, `s3` |
| `STORAGE_BASE_PATH` | Local | Base directory path |
| `AZURE_STORAGE_CONNECTION_STRING` | Azure | Connection string |
| `AZURE_STORAGE_CONTAINER` | Azure | Container name |
| `AWS_S3_BUCKET` | S3 | Bucket name |
| `AWS_REGION` | S3 | AWS region (default: us-east-1) |
| `AWS_ACCESS_KEY_ID` | S3 | Access key (optional; otherwise uses the credential chain) |
| `AWS_SECRET_ACCESS_KEY` | S3 | Secret key (optional) |
| `AWS_ENDPOINT_URL` | S3 | Custom endpoint for S3-compatible services |

### API Reference

#### StorageBackend Interface

```python
from abc import ABC
from pathlib import Path


class StorageBackend(ABC):
    def upload(self, local_path: Path, remote_path: str, overwrite: bool = False) -> str:
        """Upload a file to storage."""

    def download(self, remote_path: str, local_path: Path) -> Path:
        """Download a file from storage."""

    def exists(self, remote_path: str) -> bool:
        """Check if a file exists."""

    def list_files(self, prefix: str) -> list[str]:
        """List files with given prefix."""

    def delete(self, remote_path: str) -> bool:
        """Delete a file."""

    def get_url(self, remote_path: str) -> str:
        """Get URL for a file."""

    def get_presigned_url(self, remote_path: str, expires_in_seconds: int = 3600) -> str:
        """Generate a pre-signed URL for temporary access (1-604800 seconds)."""

    def upload_bytes(self, data: bytes, remote_path: str, overwrite: bool = False) -> str:
        """Upload bytes directly."""

    def download_bytes(self, remote_path: str) -> bytes:
        """Download file as bytes."""
```

#### Factory Functions

```python
from pathlib import Path

# Create from configuration file
storage = create_storage_backend_from_file("storage.yaml")

# Create from environment variables
storage = create_storage_backend_from_env()

# Create from StorageConfig object
config = StorageConfig(backend_type="local", base_path=Path("./data"))
storage = create_storage_backend(config)

# Convenience function with fallback chain: config file -> env vars -> local default
storage = get_storage_backend("storage.yaml")  # or None for env-only
```

### Pre-signed URLs

Pre-signed URLs provide temporary access to files without exposing credentials:

```python
# Generate URL valid for 1 hour (default)
url = storage.get_presigned_url("documents/invoice.pdf")

# Generate URL valid for 24 hours
url = storage.get_presigned_url("documents/invoice.pdf", expires_in_seconds=86400)

# Maximum expiry: 7 days (604800 seconds)
url = storage.get_presigned_url("documents/invoice.pdf", expires_in_seconds=604800)
```

**Note:** Local storage returns `file://` URLs that do not expire.

### Error Handling

```python
from pathlib import Path

from shared.storage import (
    StorageError,
    FileNotFoundStorageError,
    PresignedUrlNotSupportedError,
)

try:
    storage.download("nonexistent.pdf", Path("local.pdf"))
except FileNotFoundStorageError as e:
    print(f"File not found: {e}")
except StorageError as e:
    print(f"Storage error: {e}")
```

### Testing with MinIO (S3-compatible)

```bash
# Start MinIO locally
docker run -p 9000:9000 -p 9001:9001 minio/minio server /data --console-address ":9001"

# Configure environment
export STORAGE_BACKEND=s3
export AWS_S3_BUCKET=test-bucket
export AWS_ENDPOINT_URL=http://localhost:9000
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin
```

### Module Structure

```
shared/storage/
├── __init__.py       # Public exports
├── base.py           # Abstract interface and exceptions
├── local.py          # Local filesystem backend
├── azure.py          # Azure Blob Storage backend
├── s3.py             # AWS S3 backend
├── config_loader.py  # YAML configuration loader
└── factory.py        # Backend factory functions
```
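
The `${VAR}` and `${VAR:-default}` placeholders used in `storage.yaml` follow shell-style parameter expansion. The sketch below illustrates how such substitution could work. It is not the code in `config_loader.py`: the function name `substitute_env` is hypothetical, and treating an unset variable without a default as an empty string is an assumption borrowed from shell behavior.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}; NAME follows shell identifier rules.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def substitute_env(text, env=None):
    """Replace ${VAR} and ${VAR:-default} placeholders in text (hypothetical helper)."""
    env = os.environ if env is None else env

    def _replace(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if value:
            # Variable is set and non-empty: use it.
            return value
        # Unset or empty: use the default, or an empty string if none was given
        # (assumption: mirrors shell semantics for ":-").
        return default if default is not None else ""

    return _PLACEHOLDER.sub(_replace, text)


example = "backend: ${STORAGE_BACKEND:-local}\nbucket: ${AWS_S3_BUCKET}"
print(substitute_env(example, env={"AWS_S3_BUCKET": "invoices"}))
```

Passing an explicit `env` mapping keeps the helper easy to test without touching the real process environment. The substituted text would then be handed to a YAML parser.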