Files
smart-support/docs/phases/phase-1-dev-log.md
Yaojia Wang 33488fd634 feat: complete phase 1 -- core framework with chat loop, agents, and React UI
Backend:
- FastAPI WebSocket /ws endpoint with streaming via LangGraph astream
- LangGraph Supervisor connecting 3 mock agents (order_lookup, order_actions, fallback)
- YAML Agent Registry with Pydantic validation and immutable configs
- PostgresSaver checkpoint persistence via langgraph-checkpoint-postgres
- Session TTL with 30-min sliding window and interrupt extension
- LLM provider abstraction (Anthropic/OpenAI/Google)
- Token usage + cost tracking callback handler
- Input validation: message size cap, thread_id format, content length
- Security: no hardcoded defaults, startup API key validation, no input reflection

Frontend:
- React 19 + TypeScript + Vite chat UI
- WebSocket hook with reconnect + exponential backoff
- Streaming token display with agent attribution
- Interrupt approval/reject UI for write operations
- Collapsible tool call viewer

Testing:
- 87 unit tests, 87% coverage (exceeds 80% requirement)
- Ruff lint + format clean

Infrastructure:
- Docker Compose (PostgreSQL 16 + backend)
- pyproject.toml with full dependency management
2026-03-30 00:54:21 +02:00

4.0 KiB

Phase 1: Core Framework -- Development Log

Status: IN PROGRESS Phase branch: phase-1/core-framework Date started: 2026-03-30 Date completed: -- Related plan section: Phase 1 in DEVELOPMENT-PLAN

What Was Built

  • FastAPI WebSocket backend with /ws endpoint for real-time chat
  • LangGraph Supervisor (via langgraph-supervisor) connecting 3 agents
  • YAML-based Agent Registry with Pydantic validation
  • 3 Mock Agents: order_lookup (read), order_actions (write + interrupt), fallback
  • PostgresSaver checkpoint persistence via langgraph-checkpoint-postgres
  • Session TTL management with 30-minute sliding window and interrupt extension
  • LLM provider abstraction (Anthropic/OpenAI/Google) with prompt caching support
  • Token usage tracking callback handler
  • React Chat UI with streaming display, interrupt confirmation, and agent action viewer
  • Docker Compose configuration (PostgreSQL 16 + backend)

Code Structure

New files

Backend (backend/app/):

  • config.py -- pydantic-settings centralized configuration
  • db.py -- Async PostgreSQL pool + AsyncPostgresSaver setup
  • llm.py -- LLM provider factory (ChatAnthropic/ChatOpenAI/ChatGoogleGenerativeAI)
  • callbacks.py -- Token usage + cost tracking callback handler
  • registry.py -- YAML agent registry with validation + immutable config models
  • session_manager.py -- Session TTL with sliding window + interrupt extension
  • graph.py -- LangGraph Supervisor construction from registry
  • ws_handler.py -- WebSocket message dispatch + streaming logic
  • main.py -- FastAPI app entry with lifespan + WebSocket endpoint
  • agents/__init__.py -- Tool name-to-function bridge
  • agents/order_lookup.py -- Mock order status/tracking tools
  • agents/order_actions.py -- Mock cancel_order with interrupt()
  • agents/fallback.py -- Fallback response tool

Frontend (frontend/src/):

  • types.ts -- WebSocket message protocol TypeScript types
  • hooks/useWebSocket.ts -- WebSocket connection + reconnect + message dispatch
  • components/ChatMessages.tsx -- Streaming message display
  • components/ChatInput.tsx -- Message input
  • components/InterruptPrompt.tsx -- Approve/reject interrupt UI
  • components/AgentAction.tsx -- Tool call inline display
  • pages/ChatPage.tsx -- Main chat page composing all components

Infrastructure:

  • backend/pyproject.toml -- Dependencies + pytest + ruff config
  • backend/agents.yaml -- Agent registry YAML config
  • backend/Dockerfile -- Backend container
  • docker-compose.yml -- PostgreSQL 16 + backend services
  • .gitignore -- Updated for Python + Node artifacts

Tests (backend/tests/unit/):

  • test_config.py -- Settings validation tests
  • test_registry.py -- 17 tests for registry loading/validation
  • test_agents.py -- 10 tests for tool functions + tool bridge
  • test_llm.py -- 3 tests for LLM provider factory
  • test_callbacks.py -- 9 tests for token usage tracking
  • test_session_manager.py -- 9 tests for session TTL logic
  • test_graph.py -- 4 tests for supervisor construction
  • test_db.py -- 5 tests for database setup
  • test_ws_handler.py -- 12 tests for WebSocket message handling
  • test_main.py -- 5 tests for app configuration

Test Coverage

  • Unit test count: 82
  • Integration test count: 0 (requires running PostgreSQL)
  • E2E test count: 0 (manual verification in plan)
  • Overall coverage: 88%

Deviations from Plan

  • Used astream(stream_mode="messages") instead of astream_events() per langgraph best practices
  • Separated WebSocket handler logic into ws_handler.py for testability (not in original plan)
  • Session manager uses in-memory storage instead of DB-backed (sufficient for Phase 1 single-instance)

Known Issues / Tech Debt

  • Session manager not DB-backed (loses state on restart) -- acceptable for Phase 1 single-instance
  • WebSocket reconnect does not re-send pending interrupt state from server
  • No rate limiting on WebSocket endpoint (Phase 2)
  • No authentication (Phase 2)
  • main.py coverage at 47% -- lifespan function not unit-testable without full DB