refactor: formalize safety rules, extract shared styles, reconcile docs (P2)

- Add backend/app/safety.py with explicit confirmation policy, multi-intent
  semantics, and MCP error taxonomy with retry classification
- Add 26 unit tests for safety module (confirmation rules, error taxonomy)
- Extract repeated inline styles into shared CSS classes in index.css
  (section-card, stat-label, status-badge, data-table, empty/error-state,
  pagination-bar)
- Refactor DashboardPage, ReplayListPage, ReplayPage to use shared classes
- Update README: add missing API endpoints, document safety/confirmation rules
- Use proper HTML entities for arrow/dash characters to fix encoding glitches
This commit is contained in:
Yaojia Wang
2026-04-05 23:10:50 +02:00
parent e0931daece
commit 036e12349d
7 changed files with 448 additions and 120 deletions

View File

@@ -128,11 +128,24 @@ agents:
|--------|------|-------------|
| WS | `/ws` | Main WebSocket chat endpoint |
| GET | `/api/health` | Health check |
| GET | `/api/conversations` | List conversations |
| GET | `/api/replay/{thread_id}` | Replay conversation |
| GET | `/api/analytics` | Analytics summary |
| POST | `/api/openapi/import` | Import OpenAPI spec |
| GET | `/api/conversations` | List conversations (paginated) |
| GET | `/api/replay/{thread_id}` | Replay conversation steps (paginated) |
| GET | `/api/analytics` | Analytics summary (`?range=7d`) |
| POST | `/api/openapi/import` | Start OpenAPI import job |
| GET | `/api/openapi/jobs/{id}` | Check import job status |
| GET | `/api/openapi/jobs/{id}/classifications` | Get endpoint classifications |
| PUT | `/api/openapi/jobs/{id}/classifications/{idx}` | Update a classification |
| POST | `/api/openapi/jobs/{id}/approve` | Approve and generate tools |
## Safety and Confirmation Rules
Destructive-action confirmation is explicit and auditable (see `backend/app/safety.py`):
- **Read actions** execute immediately -- no confirmation required.
- **Write actions** require human-in-the-loop approval via an interrupt gate.
- **OpenAPI-imported endpoints** use the `needs_interrupt` classification flag.
- **Multi-intent handling** is sequential: if a write action is blocked by an interrupt, subsequent actions are paused until the interrupt is resolved or rejected.
- **MCP errors** are classified into `transient` (retryable, up to 3 attempts), `validation` (not retryable), `auth` (not retryable, escalate), and `unknown` (not retryable, log and escalate).
## Security