Yaojia Wang 036e12349d refactor: formalize safety rules, extract shared styles, reconcile docs (P2)
- Add backend/app/safety.py with explicit confirmation policy, multi-intent
  semantics, and MCP error taxonomy with retry classification
- Add 26 unit tests for safety module (confirmation rules, error taxonomy)
- Extract repeated inline styles into shared CSS classes in index.css
  (section-card, stat-label, status-badge, data-table, empty/error-state,
  pagination-bar)
- Refactor DashboardPage, ReplayListPage, ReplayPage to use shared classes
- Update README: add missing API endpoints, document safety/confirmation rules
- Use proper HTML entities for arrow/dash characters to fix encoding glitches
2026-04-05 23:10:50 +02:00


# Smart Support
AI customer support action layer. Paste your API spec, get an AI agent that executes real actions.
## The Problem
Existing support tools (Zendesk, Intercom, Ada) answer FAQs well, but automation
rates stall at 20-30%. The remaining 70-80% of tickets require agents to manually
log into internal systems to look up orders, cancel orders, or issue coupons.
Smart Support fills that gap as the "action layer" -- it does not replace your
existing support platform; it lets the AI call your internal systems directly.
## How It Works
```
User message -> Chat UI -> FastAPI WebSocket -> LangGraph Supervisor -> Specialist Agent -> MCP Tools -> Your systems
                                                          |                     |
                                                   Agent Registry          interrupt()
                                                    (YAML config)       (human approval)
                                                          |
                                                    PostgresSaver
                                                (session persistence)
```
1. User sends a message in the chat UI.
2. LangGraph Supervisor classifies intent and routes to the right agent.
3. Agent calls your internal systems via MCP tools.
4. Write operations trigger a human-in-the-loop approval gate.
5. All operations are logged with full replay and analytics.
## Key Features
- **Multi-agent routing** -- each operation goes to a specialist agent with its own tools and permissions
- **Zero-config import** -- paste an OpenAPI 3.0 URL, agents are generated automatically
- **Human-in-the-loop** -- all write operations (cancel, refund, modify) require approval; reads execute immediately
- **Session context** -- multi-turn conversation with persistent state across reconnects
- **Real-time streaming** -- WebSocket token streaming with live tool call visibility
- **Conversation replay** -- step-by-step audit trail of every agent decision
- **Analytics dashboard** -- resolution rate, agent usage, escalation rate, cost per conversation
- **YAML-driven config** -- agents, personas, and vertical templates in a single file
## Tech Stack
| Component | Technology |
|-----------|-----------|
| Backend | Python 3.11+, FastAPI |
| Agent orchestration | LangGraph v1.1 |
| Session state | PostgreSQL + langgraph-checkpoint-postgres |
| LLM | Claude Sonnet 4.6 (configurable: OpenAI, Google) |
| Frontend | React 19, TypeScript, Vite |
| Deployment | Docker Compose |
## Quick Start
```bash
git clone <repo-url>
cd smart-support
# Configure your LLM API key
cp .env.example .env
# Edit .env: set ANTHROPIC_API_KEY (or OPENAI_API_KEY)
# Start all services
docker compose up -d
# Open the app (macOS; on Linux use xdg-open or visit the URL in a browser)
open http://localhost
```
## Project Structure
```
smart-support/
├── backend/
│   ├── app/
│   │   ├── main.py                  # FastAPI + WebSocket entry point
│   │   ├── graph.py                 # LangGraph Supervisor
│   │   ├── ws_handler.py            # WebSocket message dispatch + rate limiting
│   │   ├── conversation_tracker.py  # Conversation lifecycle tracking
│   │   ├── agents/                  # Agent definitions and tools
│   │   ├── registry.py              # YAML agent registry loader
│   │   ├── openapi/                 # OpenAPI parser and review API
│   │   ├── replay/                  # Conversation replay API
│   │   ├── analytics/               # Analytics queries and API
│   │   └── tools/                   # Error handling and retry utilities
│   ├── agents.yaml                  # Agent registry configuration
│   ├── fixtures/                    # Demo data and sample OpenAPI spec
│   └── tests/                       # Unit, integration, and E2E tests
├── frontend/
│   ├── src/
│   │   ├── pages/                   # Chat, Replay, Dashboard, Review pages
│   │   ├── components/              # NavBar, Layout, MetricCard, ReplayTimeline
│   │   ├── hooks/                   # useWebSocket with reconnect support
│   │   └── api.ts                   # Typed API client
│   └── Dockerfile                   # Multi-stage nginx build
├── docs/                            # Architecture, deployment, guides
├── docker-compose.yml               # Full-stack compose
└── .env.example                     # Environment variable template
```
## Agent Configuration
```yaml
# agents.yaml
agents:
  - name: order_agent
    description: "Handles order status, tracking, and cancellations."
    permission: write
    tools:
      - get_order_status
      - cancel_order
    personality:
      tone: friendly
      greeting: "I can help with your order. What is the order number?"
      escalation_message: "I'm escalating this to a human agent."
  - name: general_agent
    description: "Answers general questions and FAQs."
    permission: read
    tools:
      - search_faq
```
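
As a rough illustration of how a registry like this might be validated after loading, the sketch below works on the already-parsed dictionary (the shape `yaml.safe_load` from PyYAML would return for the file above). `AgentConfig`, `parse_agents`, and `can_use` are hypothetical names, not the contents of `registry.py`.

```python
from dataclasses import dataclass

VALID_PERMISSIONS = ("read", "write")


@dataclass(frozen=True)
class AgentConfig:
    """One validated entry from the agents list (hypothetical shape)."""
    name: str
    description: str
    permission: str
    tools: tuple[str, ...]


def parse_agents(raw: dict) -> dict[str, AgentConfig]:
    """Validate parsed YAML and index the agents by name."""
    agents: dict[str, AgentConfig] = {}
    for entry in raw.get("agents", []):
        name = entry["name"]
        if name in agents:
            raise ValueError(f"duplicate agent name: {name}")
        if entry["permission"] not in VALID_PERMISSIONS:
            raise ValueError(f"{name}: unknown permission {entry['permission']!r}")
        agents[name] = AgentConfig(
            name=name,
            description=entry["description"],
            permission=entry["permission"],
            tools=tuple(entry.get("tools", [])),
        )
    return agents


def can_use(agent: AgentConfig, tool: str) -> bool:
    """Permission isolation: an agent may only call tools listed in its config."""
    return tool in agent.tools


# The dictionary below mirrors the YAML example (personality fields omitted).
RAW = {
    "agents": [
        {
            "name": "order_agent",
            "description": "Handles order status, tracking, and cancellations.",
            "permission": "write",
            "tools": ["get_order_status", "cancel_order"],
        },
        {
            "name": "general_agent",
            "description": "Answers general questions and FAQs.",
            "permission": "read",
            "tools": ["search_faq"],
        },
    ]
}
```

With this data, `can_use(parse_agents(RAW)["general_agent"], "cancel_order")` is `False`, which is the isolation property the registry is meant to provide.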
## API Endpoints
| Method | Path | Description |
|--------|------|-------------|
| WS | `/ws` | Main WebSocket chat endpoint |
| GET | `/api/health` | Health check |
| GET | `/api/conversations` | List conversations (paginated) |
| GET | `/api/replay/{thread_id}` | Replay conversation steps (paginated) |
| GET | `/api/analytics` | Analytics summary (`?range=7d`) |
| POST | `/api/openapi/import` | Start OpenAPI import job |
| GET | `/api/openapi/jobs/{id}` | Check import job status |
| GET | `/api/openapi/jobs/{id}/classifications` | Get endpoint classifications |
| PUT | `/api/openapi/jobs/{id}/classifications/{idx}` | Update a classification |
| POST | `/api/openapi/jobs/{id}/approve` | Approve and generate tools |
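
A small client-side sketch for the read endpoints in the table, using only the standard library. The base URL is an assumption taken from the Quick Start (`http://localhost`), the helper names are hypothetical, and the response shapes are not documented here, so `get_json` simply returns whatever JSON the server sends.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost"  # assumption: the host used in Quick Start


def replay_url(thread_id: str, base: str = BASE_URL) -> str:
    """URL for GET /api/replay/{thread_id}; the ID is percent-encoded."""
    return f"{base}/api/replay/{quote(thread_id, safe='')}"


def analytics_url(range_: str = "7d", base: str = BASE_URL) -> str:
    """URL for GET /api/analytics with the documented ?range= parameter."""
    return f"{base}/api/analytics?{urlencode({'range': range_})}"


def get_json(url: str, timeout: float = 5.0):
    """Fetch a URL and decode the body as JSON (requires a running server)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For example, `get_json(analytics_url("7d"))` would request the analytics summary for the last seven days once the stack is running.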
## Safety and Confirmation Rules
Destructive-action confirmation is explicit and auditable (see `backend/app/safety.py`):
- **Read actions** execute immediately -- no confirmation required.
- **Write actions** require human-in-the-loop approval via an interrupt gate.
- **OpenAPI-imported endpoints** use the `needs_interrupt` classification flag.
- **Multi-intent handling** is sequential: if a write action is blocked by an interrupt, subsequent actions are paused until the interrupt is approved or rejected.
- **MCP errors** are classified into `transient` (retryable, up to 3 attempts), `validation` (not retryable), `auth` (not retryable, escalate), and `unknown` (not retryable, log and escalate).
## Security
- **SSRF protection** -- OpenAPI import blocks private IPs and metadata service URLs
- **Input validation** -- messages are validated for total size (max 32 KB), content length (max 10 KB), and thread ID format
- **Rate limiting** -- 10 messages per 10 seconds per session
- **Audit trail** -- every tool call logged with agent, params, result, timestamp
- **Permission isolation** -- each agent only accesses its configured tools
- **Interrupt TTL** -- unanswered approval prompts expire after 30 minutes
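
Three of these limits are simple enough to sketch with the standard library. The code below is illustrative only: the class and function names are hypothetical, the sliding-window strategy is an assumption (the README states only the 10-per-10-seconds limit), and a real SSRF check would also need to resolve hostnames before testing the address.

```python
import ipaddress
from collections import deque

INTERRUPT_TTL_SECONDS = 30 * 60  # approval prompts expire after 30 minutes


class SlidingWindowLimiter:
    """Allow at most max_messages per window_seconds for one session."""

    def __init__(self, max_messages: int = 10, window_seconds: float = 10.0) -> None:
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._timestamps: deque[float] = deque()

    def allow(self, now: float) -> bool:
        """Record a message at time `now` (seconds) if the session is under the limit."""
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_messages:
            return False
        self._timestamps.append(now)
        return True


def is_blocked_address(ip: str) -> bool:
    """Reject private, loopback, and link-local targets (covers 169.254.169.254)."""
    addr = ipaddress.ip_address(ip)
    return addr.is_private or addr.is_loopback or addr.is_link_local


def interrupt_expired(created_at: float, now: float) -> bool:
    """True once an unanswered approval prompt has outlived its TTL."""
    return now - created_at >= INTERRUPT_TTL_SECONDS
```

Passing the clock in as `now` keeps the limiter deterministic in tests; production code would typically supply `time.monotonic()`.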
## Running Tests
```bash
cd backend
pytest --cov=app --cov-report=term-missing
```
Coverage is enforced at 80%+.
## Documentation
- [Architecture](docs/ARCHITECTURE.md) -- System design and component diagram
- [Development Plan](docs/DEVELOPMENT-PLAN.md) -- Phase breakdown and status
- [Agent Config Guide](docs/agent-config-guide.md) -- How to configure agents
- [OpenAPI Import Guide](docs/openapi-import-guide.md) -- Auto-discovery workflow
- [Deployment Guide](docs/deployment.md) -- Docker and production deployment
- [Demo Script](docs/demo-script.md) -- Step-by-step live demo walkthrough
## License
MIT