refactor: engineering improvements -- API versioning, structured logging, Alembic, error standardization, test coverage
- API versioning: all REST endpoints prefixed with /api/v1/
- Structured logging: replaced stdlib logging with structlog (console/JSON modes)
- Alembic migrations: versioned DB schema with initial migration
- Error standardization: global exception handlers for consistent envelope format
- Interrupt cleanup: asyncio background task for expired interrupt removal
- Integration tests: +30 tests (analytics, replay, openapi, error, session APIs)
- Frontend tests: +57 tests (all components, pages, useWebSocket hook)
- Backend: 557 tests, 89.75% coverage | Frontend: 80 tests, 16 test files
76
docs/phases/eng-improvements-dev-log.md
Normal file
@@ -0,0 +1,76 @@
# Engineering Improvements -- Development Log

> Status: COMPLETED
> Branch: `eng/engineering-improvements`
> Date started: 2026-04-06
> Date completed: 2026-04-06

## What Was Built

### Phase 1: Quick Wins (no new deps)

1. **Interrupt Cleanup Background Task** -- Added asyncio background task in lifespan that calls `interrupt_manager.cleanup_expired()` every 60 seconds. Prevents unbounded memory growth from expired interrupts.

2. **API Versioning** -- All REST endpoints prefixed with `/api/v1/` (was `/api/`). Updated 4 router prefixes, Docker healthcheck, all frontend fetch URLs, and all test assertions. WebSocket `/ws` endpoint unchanged.

3. **Error Response Standardization** -- Added global exception handlers for `HTTPException`, `RequestValidationError`, and `Exception`. All error responses now use the same envelope format as success responses: `{"success": false, "data": null, "error": "..."}`.
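
As a rough illustration of item 1, the sketch below shows one way a periodic cleanup task can be started and cancelled around an application's lifetime. It is not the project's code: `InterruptManager` is a hypothetical in-memory stand-in, and `lifespan` takes the manager directly instead of a framework app object. Only the `cleanup_expired()` call and the 60-second default come from the description above.

```python
import asyncio
import contextlib
import time


class InterruptManager:
    """Hypothetical in-memory stand-in for the real interrupt manager."""

    def __init__(self) -> None:
        self._expires_at: dict[str, float] = {}

    def register(self, interrupt_id: str, ttl_seconds: float) -> None:
        self._expires_at[interrupt_id] = time.monotonic() + ttl_seconds

    def cleanup_expired(self) -> int:
        """Drop expired interrupts and return how many were removed."""
        now = time.monotonic()
        expired = [key for key, deadline in self._expires_at.items() if deadline <= now]
        for key in expired:
            del self._expires_at[key]
        return len(expired)

    def pending(self) -> list[str]:
        return sorted(self._expires_at)


async def cleanup_loop(manager: InterruptManager, interval_seconds: float = 60.0) -> None:
    """Call cleanup_expired() repeatedly, sleeping between passes."""
    while True:
        await asyncio.sleep(interval_seconds)
        manager.cleanup_expired()


@contextlib.asynccontextmanager
async def lifespan(manager: InterruptManager, interval_seconds: float = 60.0):
    """Start the cleanup task on startup and cancel it on shutdown."""
    task = asyncio.create_task(cleanup_loop(manager, interval_seconds))
    try:
        yield
    finally:
        task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await task


async def demo() -> list[str]:
    manager = InterruptManager()
    manager.register("short-lived", ttl_seconds=0.01)
    manager.register("long-lived", ttl_seconds=60.0)
    # A tiny interval keeps the demo fast; the dev log uses 60 seconds.
    async with lifespan(manager, interval_seconds=0.02):
        await asyncio.sleep(0.1)
    return manager.pending()
```

Running `asyncio.run(demo())` leaves only the long-lived entry registered. Cancelling the task in the `finally` branch is what keeps the loop from outliving the application.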

### Phase 2: Medium Items (new deps)

4. **Alembic Database Migrations** -- Replaced inline DDL in `setup_app_tables()` with versioned Alembic migrations. Initial migration `001_initial_schema.py` captures all 4 tables + ALTER TABLE migration. `setup_app_tables()` preserved for tests. Production uses `run_alembic_migrations()`.

5. **Structured Logging** -- Replaced stdlib `logging.getLogger()` with `structlog.get_logger()` across 10 files. Added `logging_config.py` with console (dev) and JSON (production) modes. Configurable via `LOG_FORMAT` env var.

### Phase 3: Test Coverage

7. **Integration Tests (+30)** -- Created 5 new test files: analytics API, replay API, OpenAPI API, error responses, session/interrupt lifecycle. Uses httpx.AsyncClient with ASGITransport for full API layer testing.

8. **Frontend Tests (+57)** -- Created 12 new test files covering all components (ChatInput, ChatMessages, InterruptPrompt, ErrorBanner, NavBar, MetricCard, ReplayTimeline, AgentAction, Layout), pages (ChatPage, ReviewPage), and hooks (useWebSocket).

## Code Structure

### New files created

- `backend/app/logging_config.py` -- structlog configuration
- `backend/alembic.ini` -- Alembic config
- `backend/alembic/env.py` -- Migration environment
- `backend/alembic/versions/001_initial_schema.py` -- Initial migration
- `backend/tests/unit/test_interrupt_cleanup.py` (3 tests)
- `backend/tests/unit/test_error_responses.py` (6 tests)
- `backend/tests/unit/test_logging_config.py` (2 tests)
- `backend/tests/integration/test_analytics_api.py` (6 tests)
- `backend/tests/integration/test_replay_api.py` (6 tests)
- `backend/tests/integration/test_openapi_api.py` (5 tests)
- `backend/tests/integration/test_error_responses.py` (5 tests)
- `backend/tests/integration/test_session_interrupt_lifecycle.py` (8 tests)
- 12 frontend test files (57 tests total)

### Modified files

- `backend/app/main.py` -- cleanup task, exception handlers, alembic, structlog
- `backend/app/db.py` -- added run_alembic_migrations()
- `backend/app/config.py` -- added log_format setting
- `backend/pyproject.toml` -- added alembic, structlog deps
- 4 router files -- `/api/v1/` prefix
- 10 files -- structlog migration
- `docker-compose.yml` -- healthcheck URL
- `frontend/src/api.ts` -- `/api/v1/` URLs
- All existing test files -- API path updates + error envelope assertions

## Test Coverage

- Backend: 557 tests (was 516), 89.75% coverage
  - Unit: ~490 tests
  - Integration: ~60 tests
  - E2E: ~7 tests
- Frontend: 80 tests (was 23), 16 test files (was 4)

## Deviations from Plan

- Redis rate limiting deferred (single-worker sufficient for now)
- ConversationTracker verified correct by design (pool per-method), skipped
- Coverage dropped slightly from 90.26% to 89.75% due to new alembic/logging modules with partial test coverage (still well above 80% threshold)

## Known Issues / Tech Debt

- Rate limiting remains process-global (needs Redis for multi-worker)
- Alembic migrations not tested against real PostgreSQL in CI (would need running DB)
- Frontend test coverage could be deeper (e.g., WebSocket reconnect edge cases)