# Smart Support

AI customer support action layer. Paste your API spec, get an AI agent that executes real actions.

## The Problem

Existing support tools (Zendesk, Intercom, Ada) answer FAQs well, but automation
rates stall at 20-30%. The remaining 70-80% of tickets require agents to log into
internal systems manually to look up orders, cancel them, or issue coupons.

Smart Support fills that gap as the "action layer": it does not replace your
existing support platform; it lets AI call your internal systems directly.

## How It Works

```
User message -> Chat UI -> FastAPI WebSocket -> LangGraph Supervisor -> Specialist Agent -> MCP Tools -> Your systems
                                                        |                      |
                                                  Agent Registry          interrupt()
                                                  (YAML config)         (human approval)
                                                        |
                                                  PostgresSaver
                                               (session persistence)
```

1. User sends a message in the chat UI.
2. LangGraph Supervisor classifies intent and routes to the right agent.
3. Agent calls your internal systems via MCP tools.
4. Write operations trigger a human-in-the-loop approval gate.
5. All operations are logged with full replay and analytics.
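
The flow in steps 2 to 5 can be sketched in plain Python. This is an illustration only, not the project's LangGraph code: keyword matching stands in for LLM intent classification, an `approve` callback stands in for `interrupt()`, and every name here (`Tool`, `Agent`, `route`, `call_tool`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    name: str
    is_write: bool                      # write tools need approval (step 4)
    run: Callable[[dict], str]


@dataclass
class Agent:
    name: str
    keywords: tuple[str, ...]           # stand-in for LLM intent classification
    tools: dict[str, Tool]


def route(message: str, agents: list[Agent], fallback: Agent) -> Agent:
    """Step 2: return the first agent whose keywords match, else the fallback."""
    text = message.lower()
    for agent in agents:
        if any(keyword in text for keyword in agent.keywords):
            return agent
    return fallback


def call_tool(agent: Agent, tool_name: str, params: dict,
              approve: Callable[[str, dict], bool], audit_log: list) -> str:
    """Steps 3-5: run a tool, gate writes behind approval, log the outcome."""
    tool = agent.tools[tool_name]       # KeyError if the agent lacks this tool
    if tool.is_write and not approve(tool_name, params):
        outcome = "rejected"
    else:
        outcome = tool.run(params)
    audit_log.append({"agent": agent.name, "tool": tool_name,
                      "params": params, "result": outcome})
    return outcome


# Hypothetical demo agents mirroring the names used in agents.yaml.
ORDER = Agent(
    name="order_agent",
    keywords=("order", "cancel", "tracking"),
    tools={
        "get_order_status": Tool("get_order_status", False,
                                 lambda p: f"order {p['order_id']}: shipped"),
        "cancel_order": Tool("cancel_order", True,
                             lambda p: f"order {p['order_id']}: cancelled"),
    },
)
GENERAL = Agent(
    name="general_agent",
    keywords=(),
    tools={"search_faq": Tool("search_faq", False, lambda p: "no match")},
)
AGENTS = [ORDER]
```

In this sketch a rejected write is still appended to the audit log, so the record shows that the action was attempted and declined.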

## Key Features

- **Multi-agent routing** -- each operation goes to a specialist agent with its own tools and permissions
- **Zero-config import** -- paste an OpenAPI 3.0 URL and agents are generated automatically
- **Human-in-the-loop** -- all write operations (cancel, refund, modify) require approval; reads execute immediately
- **Session context** -- multi-turn conversation with persistent state across reconnects
- **Real-time streaming** -- WebSocket token streaming with live tool call visibility
- **Conversation replay** -- step-by-step audit trail of every agent decision
- **Analytics dashboard** -- resolution rate, agent usage, escalation rate, cost per conversation
- **YAML-driven config** -- agents, personas, and vertical templates in a single file

## Tech Stack

| Component | Technology |
|-----------|-----------|
| Backend | Python 3.11+, FastAPI |
| Agent orchestration | LangGraph v1.1 |
| Session state | PostgreSQL + langgraph-checkpoint-postgres |
| LLM | Claude Sonnet 4.6 (configurable: OpenAI, Google) |
| Frontend | React 19, TypeScript, Vite |
| Deployment | Docker Compose |

## Quick Start

```bash
git clone <repo-url>
cd smart-support

# Configure your LLM API key
cp .env.example .env
# Edit .env: set ANTHROPIC_API_KEY (or OPENAI_API_KEY)

# Start all services
docker compose up -d

# Open the app (macOS; on Linux use xdg-open, or browse to the URL)
open http://localhost
```

## Project Structure

```
smart-support/
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI + WebSocket entry point
│   │   ├── graph.py             # LangGraph Supervisor
│   │   ├── ws_handler.py        # WebSocket message dispatch + rate limiting
│   │   ├── conversation_tracker.py  # Conversation lifecycle tracking
│   │   ├── agents/              # Agent definitions and tools
│   │   ├── registry.py          # YAML agent registry loader
│   │   ├── openapi/             # OpenAPI parser and review API
│   │   ├── replay/              # Conversation replay API
│   │   ├── analytics/           # Analytics queries and API
│   │   └── tools/               # Error handling and retry utilities
│   ├── agents.yaml              # Agent registry configuration
│   ├── fixtures/                # Demo data and sample OpenAPI spec
│   └── tests/                   # Unit, integration, and E2E tests
├── frontend/
│   ├── src/
│   │   ├── pages/               # Chat, Replay, Dashboard, Review pages
│   │   ├── components/          # NavBar, Layout, MetricCard, ReplayTimeline
│   │   ├── hooks/               # useWebSocket with reconnect support
│   │   └── api.ts               # Typed API client
│   └── Dockerfile               # Multi-stage nginx build
├── docs/                        # Architecture, deployment, guides
├── docker-compose.yml           # Full-stack compose
└── .env.example                 # Environment variable template
```

## Agent Configuration

```yaml
# agents.yaml
agents:
  - name: order_agent
    description: "Handles order status, tracking, and cancellations."
    permission: write
    tools:
      - get_order_status
      - cancel_order
    personality:
      tone: friendly
      greeting: "I can help with your order. What is the order number?"
      escalation_message: "I'm escalating this to a human agent."

  - name: general_agent
    description: "Answers general questions and FAQs."
    permission: read
    tools:
      - search_faq
```
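
As a rough sketch of how a loader might validate this file, the snippet below checks a parsed config and builds typed specs. It assumes the YAML has already been parsed into a dict (for example with PyYAML's `yaml.safe_load`); `AgentSpec`, `parse_registry`, and the default of `read` for a missing `permission` are hypothetical and not taken from `registry.py`.

```python
from dataclasses import dataclass

VALID_PERMISSIONS = {"read", "write"}


@dataclass(frozen=True)
class AgentSpec:
    name: str
    description: str
    permission: str
    tools: tuple[str, ...]


def parse_registry(config: dict) -> dict[str, AgentSpec]:
    """Validate a parsed agents.yaml dict and index the agents by name."""
    specs: dict[str, AgentSpec] = {}
    for entry in config.get("agents", []):
        name = entry["name"]
        if name in specs:
            raise ValueError(f"duplicate agent name: {name}")
        # Assumption: a missing permission falls back to the safer "read".
        permission = entry.get("permission", "read")
        if permission not in VALID_PERMISSIONS:
            raise ValueError(f"{name}: invalid permission {permission!r}")
        tools = tuple(entry.get("tools", []))
        if not tools:
            raise ValueError(f"{name}: at least one tool is required")
        specs[name] = AgentSpec(
            name=name,
            description=entry.get("description", ""),
            permission=permission,
            tools=tools,
        )
    return specs


# The same structure yaml.safe_load would return for the example above.
EXAMPLE_CONFIG = {
    "agents": [
        {
            "name": "order_agent",
            "description": "Handles order status, tracking, and cancellations.",
            "permission": "write",
            "tools": ["get_order_status", "cancel_order"],
        },
        {
            "name": "general_agent",
            "description": "Answers general questions and FAQs.",
            "permission": "read",
            "tools": ["search_faq"],
        },
    ]
}
```

Failing fast on an unknown permission or an empty tool list keeps a typo in the config from silently producing an agent with the wrong capabilities.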

## API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| WS | `/ws` | Main WebSocket chat endpoint |
| GET | `/api/health` | Health check |
| GET | `/api/conversations` | List conversations (paginated) |
| GET | `/api/replay/{thread_id}` | Replay conversation steps (paginated) |
| GET | `/api/analytics` | Analytics summary (`?range=7d`) |
| POST | `/api/openapi/import` | Start OpenAPI import job |
| GET | `/api/openapi/jobs/{id}` | Check import job status |
| GET | `/api/openapi/jobs/{id}/classifications` | Get endpoint classifications |
| PUT | `/api/openapi/jobs/{id}/classifications/{idx}` | Update a classification |
| POST | `/api/openapi/jobs/{id}/approve` | Approve and generate tools |

## Safety and Confirmation Rules

Destructive-action confirmation is explicit and auditable (see `backend/app/safety.py`):

- **Read actions** execute immediately -- no confirmation required.
- **Write actions** require human-in-the-loop approval via an interrupt gate.
- **OpenAPI-imported endpoints** use the `needs_interrupt` classification flag.
- **Multi-intent handling** is sequential: if a write action is blocked by an interrupt, subsequent actions are paused until the interrupt is resolved or rejected.
- **MCP errors** are classified into `transient` (retryable, up to 3 attempts), `validation` (not retryable), `auth` (not retryable, escalate), and `unknown` (not retryable, log and escalate).
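
The error classes in the last rule could drive a retry helper along these lines. This is a minimal sketch, not the code in `backend/app/tools/`: `ToolError` and `call_with_retry` are hypothetical names, the exponential backoff is an assumption, and what happens once transient retries are exhausted is not specified above, so the sketch simply reports the failure.

```python
import time
from typing import Callable

KNOWN_CATEGORIES = {"transient", "validation", "auth", "unknown"}
ESCALATE = {"auth", "unknown"}          # per the rules above
MAX_ATTEMPTS = 3                        # transient errors: up to 3 attempts


class ToolError(Exception):
    """Hypothetical error type carrying one of the four categories."""

    def __init__(self, category: str, message: str = ""):
        super().__init__(message)
        self.category = category


def call_with_retry(fn: Callable[[], object],
                    max_attempts: int = MAX_ATTEMPTS,
                    base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fn, retrying only transient errors, and report the outcome."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return {"status": "ok", "result": fn(), "attempts": attempt}
        except ToolError as exc:
            # Anything unrecognised is treated as "unknown".
            category = exc.category if exc.category in KNOWN_CATEGORIES else "unknown"
            if category == "transient" and attempt < max_attempts:
                sleep(base_delay * 2 ** (attempt - 1))   # assumed backoff
                continue
            return {"status": "error", "category": category,
                    "attempts": attempt, "escalate": category in ESCALATE}
    raise AssertionError("unreachable: the loop always returns")
```

Passing `sleep` as a parameter lets tests run without real delays.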

## Security

- **SSRF protection** -- OpenAPI import blocks private IPs and metadata service URLs
- **Input validation** -- messages are validated for size (32 KB), content length (10 KB), and thread ID format
- **Rate limiting** -- 10 messages per 10 seconds per session
- **Audit trail** -- every tool call logged with agent, params, result, timestamp
- **Permission isolation** -- each agent only accesses its configured tools
- **Interrupt TTL** -- unanswered approval prompts expire after 30 minutes
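
Two of these controls are easy to illustrate with the standard library. The sketch below shows one possible sliding-window limiter for the 10-messages-per-10-seconds rule and a literal-address check for SSRF. `SlidingWindowLimiter` and `is_blocked_host` are hypothetical names, and the hostname list is an assumption; a production check would also need to resolve DNS names and validate the resolved addresses.

```python
import ipaddress
import time
from collections import deque
from typing import Callable
from urllib.parse import urlparse


class SlidingWindowLimiter:
    """Allow at most `limit` events per `window_seconds` for each session."""

    def __init__(self, limit: int = 10, window_seconds: float = 10.0,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._events: dict[str, deque] = {}

    def allow(self, session_id: str) -> bool:
        now = self._clock()
        events = self._events.setdefault(session_id, deque())
        # Drop timestamps that have left the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False
        events.append(now)
        return True


# Assumed examples of hostnames to refuse outright.
BLOCKED_HOSTNAMES = {"localhost", "metadata.google.internal"}


def is_blocked_host(url: str) -> bool:
    """Return True for URLs pointing at private, loopback, or link-local hosts."""
    host = urlparse(url).hostname
    if host is None:
        return True                     # malformed URL: refuse
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. This sketch only checks a name list and does
        # NOT resolve DNS, which a real SSRF defence must do.
        return host.lower() in BLOCKED_HOSTNAMES
    # 169.254.169.254 (cloud metadata) is covered by is_link_local.
    return addr.is_private or addr.is_loopback or addr.is_link_local
```

The injectable `clock` makes the limiter deterministic under test, and `time.monotonic` avoids surprises when the wall clock is adjusted.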

## Running Tests

```bash
cd backend
pytest --cov=app --cov-report=term-missing
```

Coverage is enforced at 80%+.

## Documentation

- [Architecture](docs/ARCHITECTURE.md) -- System design and component diagram
- [Development Plan](docs/DEVELOPMENT-PLAN.md) -- Phase breakdown and status
- [Agent Config Guide](docs/agent-config-guide.md) -- How to configure agents
- [OpenAPI Import Guide](docs/openapi-import-guide.md) -- Auto-discovery workflow
- [Deployment Guide](docs/deployment.md) -- Docker and production deployment
- [Demo Script](docs/demo-script.md) -- Step-by-step live demo walkthrough

## License

MIT