Web Directory Refactoring Plan
Current Structure Issues
- Flat structure: All files in one directory (20 Python files)
- Naming inconsistency: Mix of admin_*, async_*, batch_* prefixes
- Mixed concerns: Routes, schemas, services, and workers in same directory
- Poor scalability: Hard to navigate and maintain as project grows
Proposed Structure (Best Practices)
src/web/
├── __init__.py                    # Main exports
├── app.py                         # FastAPI app factory
├── config.py                      # App configuration
├── dependencies.py                # Global dependencies
│
├── api/                           # API Routes Layer
│   ├── __init__.py
│   └── v1/                        # API version 1
│       ├── __init__.py
│       ├── routes.py              # Public API routes (inference)
│       ├── admin/                 # Admin API routes
│       │   ├── __init__.py
│       │   ├── documents.py       # admin_routes.py → documents.py
│       │   ├── annotations.py     # admin_annotation_routes.py → annotations.py
│       │   ├── training.py        # admin_training_routes.py → training.py
│       │   └── auth.py            # admin_auth.py → auth.py (routes only)
│       ├── async_api/             # Async processing API
│       │   ├── __init__.py
│       │   └── routes.py          # async_routes.py → routes.py
│       └── batch/                 # Batch upload API
│           ├── __init__.py
│           └── routes.py          # batch_upload_routes.py → routes.py
│
├── schemas/                       # Pydantic Models
│   ├── __init__.py
│   ├── common.py                  # Shared schemas (ErrorResponse, etc.)
│   ├── inference.py               # schemas.py → inference.py
│   ├── admin.py                   # admin_schemas.py → admin.py
│   ├── async_api.py               # New: async API schemas
│   └── batch.py                   # New: batch upload schemas
│
├── services/                      # Business Logic Layer
│   ├── __init__.py
│   ├── inference.py               # services.py → inference.py
│   ├── autolabel.py               # admin_autolabel.py → autolabel.py
│   ├── async_processing.py        # async_service.py → async_processing.py
│   └── batch_upload.py            # batch_upload_service.py → batch_upload.py
│
├── core/                          # Core Components
│   ├── __init__.py
│   ├── auth.py                    # admin_auth.py → auth.py (logic only)
│   ├── rate_limiter.py            # rate_limiter.py → rate_limiter.py
│   └── scheduler.py               # admin_scheduler.py → scheduler.py
│
└── workers/                       # Background Task Queues
    ├── __init__.py
    ├── async_queue.py             # async_queue.py → async_queue.py
    └── batch_queue.py             # batch_queue.py → batch_queue.py
File Mapping
Current → New Location
| Current File | New Location | Purpose |
|---|---|---|
| admin_routes.py | api/v1/admin/documents.py | Document management routes |
| admin_annotation_routes.py | api/v1/admin/annotations.py | Annotation routes |
| admin_training_routes.py | api/v1/admin/training.py | Training routes |
| admin_auth.py | Split: api/v1/admin/auth.py + core/auth.py | Auth routes + logic |
| admin_schemas.py | schemas/admin.py | Admin Pydantic models |
| admin_autolabel.py | services/autolabel.py | Auto-label service |
| admin_scheduler.py | core/scheduler.py | Training scheduler |
| routes.py | api/v1/routes.py | Public inference API |
| schemas.py | schemas/inference.py | Inference models |
| services.py | services/inference.py | Inference service |
| async_routes.py | api/v1/async_api/routes.py | Async API routes |
| async_service.py | services/async_processing.py | Async processing service |
| async_queue.py | workers/async_queue.py | Async task queue |
| batch_upload_routes.py | api/v1/batch/routes.py | Batch upload routes |
| batch_upload_service.py | services/batch_upload.py | Batch upload service |
| batch_queue.py | workers/batch_queue.py | Batch task queue |
| rate_limiter.py | core/rate_limiter.py | Rate limiting logic |
| config.py | config.py | Keep as-is |
| dependencies.py | dependencies.py | Keep as-is |
| app.py | app.py | Keep as-is (update imports) |
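
The mapping can be encoded as data and applied by a small script. The following is a minimal sketch, not a finished migration tool: FILE_MAP holds only a few rows from the table, and copy_to_new_layout is a hypothetical helper name that does not exist in the project. It copies rather than moves, so the original files stay in place.

```python
import shutil
from pathlib import Path

# A few rows from the mapping table (old file name -> new path relative to src/web/).
# The remaining rows would need to be added before real use.
FILE_MAP = {
    "admin_routes.py": "api/v1/admin/documents.py",
    "admin_schemas.py": "schemas/admin.py",
    "async_service.py": "services/async_processing.py",
    "batch_queue.py": "workers/batch_queue.py",
    "rate_limiter.py": "core/rate_limiter.py",
}


def copy_to_new_layout(web_dir: Path, file_map: dict[str, str]) -> list[Path]:
    """Copy each mapped file to its new location, leaving the original in place.

    Returns the files that were created. Missing sources and destinations
    that already exist are skipped rather than overwritten.
    """
    created = []
    for old_name, new_rel in file_map.items():
        src = web_dir / old_name
        dst = web_dir / new_rel
        if not src.is_file() or dst.exists():
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        created.append(dst)
    return created
```

Calling copy_to_new_layout(Path("src/web"), FILE_MAP) a second time creates nothing, because existing destinations are skipped. The split of admin_auth.py into two files cannot be expressed as a plain copy and would still need to be done by hand.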
Benefits
1. Clear Separation of Concerns
- Routes: API endpoint definitions
- Schemas: Data validation models
- Services: Business logic
- Core: Reusable components
- Workers: Background processing
2. Better Scalability
- Easy to add new API versions (v2/)
- Clear namespace for each domain
- Reduced file size (no 800+ line files)
3. Improved Maintainability
- Find files by function, not by prefix
- Each module has single responsibility
- Easier to write focused tests
4. Standard Python Patterns
- Package-based organization
- Follows the multi-file layout that FastAPI's "Bigger Applications" guide describes (routers split into packages)
- Similar to Django/Flask structures
Implementation Steps
Phase 1: Create New Structure (No Breaking Changes)
- Create new directories: api/, schemas/, services/, core/, workers/
- Copy files to new locations (don't delete originals yet)
- Update imports in new files
- Add __init__.py with proper exports
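
The directory creation in Phase 1 could be scripted along these lines. This is a sketch under the assumption that empty __init__.py files are an acceptable starting point (exports would be filled in afterwards); scaffold_packages is a hypothetical helper name.

```python
from collections.abc import Iterable
from pathlib import Path

# Package directories from the proposed structure, relative to src/web/.
PACKAGES = (
    "api",
    "api/v1",
    "api/v1/admin",
    "api/v1/async_api",
    "api/v1/batch",
    "schemas",
    "services",
    "core",
    "workers",
)


def scaffold_packages(web_dir: Path, packages: Iterable[str] = PACKAGES) -> list[Path]:
    """Create each package directory with an empty __init__.py if it is missing.

    Existing directories and __init__.py files are left untouched, so the
    function can be run repeatedly without side effects.
    """
    created = []
    for rel in packages:
        init_file = web_dir / rel / "__init__.py"
        init_file.parent.mkdir(parents=True, exist_ok=True)
        if not init_file.exists():
            init_file.touch()
            created.append(init_file)
    return created
```

Because nothing existing is overwritten, running this against a partially migrated tree only fills in what is missing.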
Phase 2: Update Tests
- Update test imports to use new structure
- Run tests to verify nothing breaks
- Fix any import issues
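
Updating test imports is mostly mechanical and could be assisted by a script. The sketch below assumes tests use absolute src.web.* imports, as in the compatibility snippet in this plan; MODULE_MAP lists only a few modules and rewrite_imports is a hypothetical helper. It rewrites only the module path on from/import lines and does not touch dotted names used elsewhere in a file.

```python
import re

# Old module path -> new module path (a subset of the file mapping).
MODULE_MAP = {
    "src.web.admin_routes": "src.web.api.v1.admin.documents",
    "src.web.admin_schemas": "src.web.schemas.admin",
    "src.web.async_service": "src.web.services.async_processing",
    "src.web.services": "src.web.services.inference",
    "src.web.schemas": "src.web.schemas.inference",
}

# Match an old module path at the start of a from/import statement.
# The negative lookahead rejects longer names (src.web.admin_routes_extra)
# and already migrated paths (src.web.services.inference).
_IMPORT_RE = re.compile(
    r"^([ \t]*(?:from|import)[ \t]+)("
    + "|".join(re.escape(name) for name in sorted(MODULE_MAP, key=len, reverse=True))
    + r")(?![\w.])",
    re.MULTILINE,
)


def rewrite_imports(source: str) -> str:
    """Return source with old module paths on import lines replaced."""
    return _IMPORT_RE.sub(lambda m: m.group(1) + MODULE_MAP[m.group(2)], source)
```

Running the function on its own output changes nothing, which makes it safe to apply to files that are already partly migrated. The result should still be reviewed by hand, since names imported from a module may also have moved.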
Phase 3: Update Main App
- Update app.py to import from new locations
- Run full test suite
- Verify all endpoints work
Phase 4: Cleanup
- Delete old files
- Update documentation
- Final test run
Migration Priority
High Priority (Most used):
- Routes and schemas (user-facing APIs)
- Services (core business logic)
Medium Priority:
- Core components (auth, rate limiter)
- Workers (background tasks)
Low Priority:
- Config and dependencies (already well-located)
Backwards Compatibility
During migration, maintain backwards compatibility:
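# src/web/__init__.py
# Re-export from the new locations so names imported from src.web keep resolving.
# Imports of the old modules themselves (e.g. src.web.admin_routes) only work
# while the original files still exist.
from src.web.api.v1.admin.documents import router as admin_router
from src.web.schemas.admin import AdminDocument

# Keep old names for compatibility (temporary)
admin_routes = admin_router  # Deprecated alias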
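
A plain alias gives callers no signal that the old name is going away. One option, sketched here as an assumption rather than something the plan prescribes, is a module-level __getattr__ (PEP 562) that resolves old names and emits a DeprecationWarning. deprecated_aliases is a hypothetical helper.

```python
import warnings
from typing import Any, Callable


def deprecated_aliases(
    namespace: dict[str, Any], aliases: dict[str, str]
) -> Callable[[str], Any]:
    """Build a module-level __getattr__ that resolves deprecated names.

    aliases maps an old name to the name it now has in namespace
    (typically the module's globals()). Looking up an old name emits a
    DeprecationWarning and returns the object bound to the new name.
    """

    def __getattr__(name: str) -> Any:
        if name in aliases:
            new_name = aliases[name]
            warnings.warn(
                f"{name!r} is deprecated; use {new_name!r} instead",
                DeprecationWarning,
                stacklevel=2,
            )
            return namespace[new_name]
        raise AttributeError(f"module has no attribute {name!r}")

    return __getattr__


# Hypothetical use in src/web/__init__.py, after admin_router is imported:
# __getattr__ = deprecated_aliases(globals(), {"admin_routes": "admin_router"})
```

With this approach the old name must not also be assigned directly in the module, because __getattr__ is only consulted when normal attribute lookup fails.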
Testing Strategy
- Unit Tests: Test each module independently
- Integration Tests: Test API endpoints still work
- Import Tests: Verify all old imports still work
- Coverage: Keep test coverage at or above the current 23%
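
The import tests could start from a helper like the one below, a minimal sketch using only importlib from the standard library. find_broken_imports is a hypothetical name, and the list of module paths to check is left to the caller.

```python
import importlib
from collections.abc import Iterable


def find_broken_imports(module_names: Iterable[str]) -> dict[str, str]:
    """Try to import each module and report the ones that fail.

    Returns a dict mapping each failing module name to a short error
    description. An empty dict means every module imported cleanly.
    """
    broken = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # ImportError, or any error raised at import time
            broken[name] = f"{type(exc).__name__}: {exc}"
    return broken
```

During migration this would be called with both the old and the new module paths (for example "src.web.admin_routes" and "src.web.api.v1.admin.documents"), with a test asserting that the result is empty.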
Rollback Plan
If issues arise:
- Keep old files until fully migrated
- Git allows easy revert
- Tests catch breaking changes early
Next Steps
Possible starting points:
- Start Phase 1: Create the new directory structure and move files
- Create migration script: Automate the file moves and import updates
- Focus on specific area: Start with the admin API or the async API first