WIP
packages/shared/README.md
# Shared Package

Shared utilities and abstractions for the Invoice Master system.

## Storage Abstraction Layer

A unified storage abstraction supporting multiple backends:

- **Local filesystem** - Development and testing
- **Azure Blob Storage** - Azure cloud deployments
- **AWS S3** - AWS cloud deployments

### Installation

```bash
# Basic installation (local storage only)
pip install -e packages/shared

# With Azure support
pip install -e "packages/shared[azure]"

# With S3 support
pip install -e "packages/shared[s3]"

# All cloud providers
pip install -e "packages/shared[all]"
```

### Quick Start

```python
from pathlib import Path

from shared.storage import create_storage_backend_from_env, get_storage_backend

# Option 1: From configuration file
storage = get_storage_backend("storage.yaml")

# Option 2: From environment variables
storage = create_storage_backend_from_env()

# Upload a file
storage.upload(Path("local/file.pdf"), "documents/file.pdf")

# Download a file
storage.download("documents/file.pdf", Path("local/downloaded.pdf"))

# Get pre-signed URL for frontend access
url = storage.get_presigned_url("documents/file.pdf", expires_in_seconds=3600)
```

### Configuration File Format

Create a `storage.yaml` file. Values may reference environment variables as `${VAR}` or, with a fallback, as `${VAR:-default}`:

```yaml
# Backend selection: local, azure_blob, or s3
backend: ${STORAGE_BACKEND:-local}

# Default pre-signed URL expiry (seconds)
presigned_url_expiry: 3600

# Local storage configuration
local:
  base_path: ${STORAGE_BASE_PATH:-./data/storage}

# Azure Blob Storage configuration
azure:
  connection_string: ${AZURE_STORAGE_CONNECTION_STRING}
  container_name: ${AZURE_STORAGE_CONTAINER:-documents}
  create_container: false

# AWS S3 configuration
s3:
  bucket_name: ${AWS_S3_BUCKET}
  region_name: ${AWS_REGION:-us-east-1}
  access_key_id: ${AWS_ACCESS_KEY_ID}
  secret_access_key: ${AWS_SECRET_ACCESS_KEY}
  endpoint_url: ${AWS_ENDPOINT_URL}  # Optional, for S3-compatible services
  create_bucket: false
```
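
The `${VAR}` and `${VAR:-default}` placeholders follow shell-style parameter expansion. The sketch below shows one way such substitution could work on the raw file text before it is parsed as YAML. The function name `substitute_env` and the choice to expand an unset variable without a default to an empty string are assumptions for illustration; the actual loader in `config_loader.py` may behave differently.

```python
import re

# Matches ${NAME} and ${NAME:-default}; NAME follows shell identifier rules.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def substitute_env(text: str, env: dict[str, str]) -> str:
    """Expand ${VAR} and ${VAR:-default} placeholders in text.

    Assumption: an unset variable without a default expands to an empty
    string; the real loader may raise or leave the placeholder untouched.
    """

    def replace(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if value:  # shell semantics: ":-" also applies to empty values
            return value
        return default if default is not None else ""

    return _PLACEHOLDER.sub(replace, text)
```

Calling `substitute_env(raw_yaml, dict(os.environ))` on the file contents would yield plain YAML that any YAML parser can read.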

### Environment Variables

| Variable | Backend | Description |
|----------|---------|-------------|
| `STORAGE_BACKEND` | All | Backend type: `local`, `azure_blob`, `s3` |
| `STORAGE_BASE_PATH` | Local | Base directory path |
| `AZURE_STORAGE_CONNECTION_STRING` | Azure | Connection string |
| `AZURE_STORAGE_CONTAINER` | Azure | Container name |
| `AWS_S3_BUCKET` | S3 | Bucket name |
| `AWS_REGION` | S3 | AWS region (default: us-east-1) |
| `AWS_ACCESS_KEY_ID` | S3 | Access key (optional, uses credential chain) |
| `AWS_SECRET_ACCESS_KEY` | S3 | Secret key (optional) |
| `AWS_ENDPOINT_URL` | S3 | Custom endpoint for S3-compatible services |

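
A deployment script may want to check the environment before the backend is created. The sketch below reports which variables are missing for the selected backend. Which variables count as mandatory is an assumption inferred from the table (those with neither a documented default nor an "optional" note), and `missing_variables` is a hypothetical helper, not part of the package.

```python
import os
from typing import Mapping

# Assumption: the mandatory variables per backend are inferred from the
# table above, not taken from the package's own validation logic.
REQUIRED_VARIABLES = {
    "local": [],
    "azure_blob": ["AZURE_STORAGE_CONNECTION_STRING"],
    "s3": ["AWS_S3_BUCKET"],
}


def missing_variables(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset for the selected backend."""
    backend = env.get("STORAGE_BACKEND", "local")
    if backend not in REQUIRED_VARIABLES:
        raise ValueError(f"Unknown STORAGE_BACKEND: {backend!r}")
    return [name for name in REQUIRED_VARIABLES[backend] if not env.get(name)]
```

An empty result means the selected backend has everything this sketch considers mandatory.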

### API Reference

#### StorageBackend Interface

```python
from abc import ABC
from pathlib import Path


class StorageBackend(ABC):
    def upload(self, local_path: Path, remote_path: str, overwrite: bool = False) -> str:
        """Upload a file to storage."""

    def download(self, remote_path: str, local_path: Path) -> Path:
        """Download a file from storage."""

    def exists(self, remote_path: str) -> bool:
        """Check if a file exists."""

    def list_files(self, prefix: str) -> list[str]:
        """List files with given prefix."""

    def delete(self, remote_path: str) -> bool:
        """Delete a file."""

    def get_url(self, remote_path: str) -> str:
        """Get URL for a file."""

    def get_presigned_url(self, remote_path: str, expires_in_seconds: int = 3600) -> str:
        """Generate a pre-signed URL for temporary access (1-604800 seconds)."""

    def upload_bytes(self, data: bytes, remote_path: str, overwrite: bool = False) -> str:
        """Upload bytes directly."""

    def download_bytes(self, remote_path: str) -> bytes:
        """Download file as bytes."""
```
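
For unit tests it can help to have a stand-in that follows the same method signatures without touching disk or network. The class below is a hypothetical in-memory test double covering only the byte-oriented subset of the interface. It does not subclass the real `StorageBackend`, and several behaviors are assumptions: that `upload_bytes` returns the remote path, that `overwrite=False` rejects an existing path, and that `delete` reports whether anything was removed. It also raises built-in exceptions rather than the package's own.

```python
class InMemoryStorage:
    """Hypothetical test double for the byte-oriented storage methods."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}

    def upload_bytes(self, data: bytes, remote_path: str, overwrite: bool = False) -> str:
        # Assumption: refusing to overwrite is signalled with an exception.
        if remote_path in self._files and not overwrite:
            raise FileExistsError(remote_path)
        self._files[remote_path] = data
        return remote_path

    def download_bytes(self, remote_path: str) -> bytes:
        if remote_path not in self._files:
            raise FileNotFoundError(remote_path)
        return self._files[remote_path]

    def exists(self, remote_path: str) -> bool:
        return remote_path in self._files

    def list_files(self, prefix: str) -> list[str]:
        # Sorted so that results are deterministic in tests.
        return sorted(path for path in self._files if path.startswith(prefix))

    def delete(self, remote_path: str) -> bool:
        # True if a file was removed, False if there was nothing to delete.
        return self._files.pop(remote_path, None) is not None
```

Because the method names and parameters match the interface above, code written against `StorageBackend` can be exercised with this double through duck typing.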

#### Factory Functions

```python
from pathlib import Path

# Create from configuration file
storage = create_storage_backend_from_file("storage.yaml")

# Create from environment variables
storage = create_storage_backend_from_env()

# Create from StorageConfig object
config = StorageConfig(backend_type="local", base_path=Path("./data"))
storage = create_storage_backend(config)

# Convenience function with fallback chain: config file -> env vars -> local default
storage = get_storage_backend("storage.yaml")  # or None for env-only
```
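
The fallback chain of `get_storage_backend` (config file, then environment variables, then local default) can be pictured with a small decision function. This is only a sketch of the ordering: `choose_config_source` is a hypothetical name, and two details are assumptions, namely that a missing config file falls through to the next step instead of raising, and that a set `STORAGE_BACKEND` is what signals environment-based configuration.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def choose_config_source(
    config_path: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return which source the chain would pick: 'file', 'env' or 'default'."""
    # Step 1: an existing configuration file wins.
    if config_path is not None and Path(config_path).is_file():
        return "file"
    # Step 2 (assumption): STORAGE_BACKEND marks env-based configuration.
    if env.get("STORAGE_BACKEND"):
        return "env"
    # Step 3: nothing configured, so use the local default.
    return "default"
```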

### Pre-signed URLs

Pre-signed URLs provide temporary access to files without exposing credentials:

```python
# Generate URL valid for 1 hour (default)
url = storage.get_presigned_url("documents/invoice.pdf")

# Generate URL valid for 24 hours
url = storage.get_presigned_url("documents/invoice.pdf", expires_in_seconds=86400)

# Maximum expiry: 7 days (604800 seconds)
url = storage.get_presigned_url("documents/invoice.pdf", expires_in_seconds=604800)
```

**Note:** Local storage returns `file://` URLs that don't actually expire.
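
Callers that accept an expiry from user input may want to check it against the documented range of 1 to 604800 seconds before requesting a URL. The helper below is a minimal sketch of such a check. The name `validate_expiry` and the use of `ValueError` are assumptions; the backends themselves may reject out-of-range values with a different exception.

```python
MIN_EXPIRY_SECONDS = 1
MAX_EXPIRY_SECONDS = 7 * 24 * 60 * 60  # 604800 seconds, i.e. 7 days


def validate_expiry(expires_in_seconds: int) -> int:
    """Return the value unchanged if it lies in the documented range."""
    if not MIN_EXPIRY_SECONDS <= expires_in_seconds <= MAX_EXPIRY_SECONDS:
        # Assumption: ValueError is used here purely for illustration.
        raise ValueError(
            f"expires_in_seconds must be between {MIN_EXPIRY_SECONDS} and "
            f"{MAX_EXPIRY_SECONDS}, got {expires_in_seconds}"
        )
    return expires_in_seconds
```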

### Error Handling

```python
from pathlib import Path

from shared.storage import (
    StorageError,
    FileNotFoundStorageError,
    PresignedUrlNotSupportedError,
)

try:
    storage.download("nonexistent.pdf", Path("local.pdf"))
except FileNotFoundStorageError as e:
    print(f"File not found: {e}")
except StorageError as e:
    print(f"Storage error: {e}")
```
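
`PresignedUrlNotSupportedError` is imported above but not used in the example. One plausible use is to fall back to a plain URL when a backend cannot sign. The sketch below uses local stand-in classes so that it runs on its own. The exception hierarchy (both errors deriving from `StorageError`), the `PlainUrlBackend` class, the `url_for_frontend` helper, and the example host name are all assumptions for illustration.

```python
class StorageError(Exception):
    """Stand-in for shared.storage.StorageError."""


class PresignedUrlNotSupportedError(StorageError):
    """Stand-in; assumed to derive from StorageError."""


class PlainUrlBackend:
    """Hypothetical backend that can build URLs but cannot sign them."""

    def get_url(self, remote_path: str) -> str:
        return f"https://files.example.com/{remote_path}"

    def get_presigned_url(self, remote_path: str, expires_in_seconds: int = 3600) -> str:
        raise PresignedUrlNotSupportedError(remote_path)


def url_for_frontend(storage, remote_path: str, expires_in_seconds: int = 3600) -> str:
    """Prefer a pre-signed URL; fall back to the plain URL if unsupported."""
    try:
        return storage.get_presigned_url(remote_path, expires_in_seconds=expires_in_seconds)
    except PresignedUrlNotSupportedError:
        return storage.get_url(remote_path)
```

Falling back to a URL that never expires is a trade-off: it keeps the frontend working, but it may not be acceptable where time-limited access is a requirement.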

### Testing with MinIO (S3-compatible)

```bash
# Start MinIO locally
docker run -p 9000:9000 -p 9001:9001 minio/minio server /data --console-address ":9001"

# Configure environment
export STORAGE_BACKEND=s3
export AWS_S3_BUCKET=test-bucket
export AWS_ENDPOINT_URL=http://localhost:9000
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin
```

### Module Structure

```
shared/storage/
├── __init__.py       # Public exports
├── base.py           # Abstract interface and exceptions
├── local.py          # Local filesystem backend
├── azure.py          # Azure Blob Storage backend
├── s3.py             # AWS S3 backend
├── config_loader.py  # YAML configuration loader
└── factory.py        # Backend factory functions
```