
Yaojia Wang 33488fd634 feat: complete phase 1 -- core framework with chat loop, agents, and React UI

Backend:
- FastAPI WebSocket /ws endpoint with streaming via LangGraph astream
- LangGraph Supervisor connecting 3 mock agents (order_lookup, order_actions, fallback)
- YAML Agent Registry with Pydantic validation and immutable configs
- PostgresSaver checkpoint persistence via langgraph-checkpoint-postgres
- Session TTL with 30-min sliding window and interrupt extension
- LLM provider abstraction (Anthropic/OpenAI/Google)
- Token usage + cost tracking callback handler
- Input validation: message size cap, thread_id format, content length
- Security: no hardcoded defaults, startup API key validation, no input reflection

Frontend:
- React 19 + TypeScript + Vite chat UI
- WebSocket hook with reconnect + exponential backoff
- Streaming token display with agent attribution
- Interrupt approval/reject UI for write operations
- Collapsible tool call viewer

Testing:
- 87 unit tests, 87% coverage (exceeds 80% requirement)
- Ruff lint + format clean

Infrastructure:
- Docker Compose (PostgreSQL 16 + backend)
- pyproject.toml with full dependency management

2026-03-30 00:54:21 +02:00


Phase 1: Core Framework -- Development Log

Status: IN PROGRESS
Phase branch: phase-1/core-framework
Date started: 2026-03-30
Date completed: --
Related plan section: Phase 1 in DEVELOPMENT-PLAN

What Was Built

FastAPI WebSocket backend with /ws endpoint for real-time chat
LangGraph Supervisor (via langgraph-supervisor) connecting 3 agents
YAML-based Agent Registry with Pydantic validation
3 Mock Agents: order_lookup (read), order_actions (write + interrupt), fallback
PostgresSaver checkpoint persistence via langgraph-checkpoint-postgres
Session TTL management with 30-minute sliding window and interrupt extension
LLM provider abstraction (Anthropic/OpenAI/Google) with prompt caching support
Token usage tracking callback handler
React Chat UI with streaming display, interrupt confirmation, and agent action viewer
Docker Compose configuration (PostgreSQL 16 + backend)
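
The session TTL item above (30-minute sliding window with interrupt extension) can be illustrated with a minimal in-memory sketch. This is not the code in session_manager.py: the class name, method names, injectable clock, and the length of the interrupt extension are all assumptions made for illustration, since the log only states the 30-minute window.

```python
import time
from dataclasses import dataclass
from typing import Callable

SESSION_TTL_SECONDS = 30 * 60          # 30-minute sliding window (stated in the log)
INTERRUPT_EXTENSION_SECONDS = 30 * 60  # assumed value; the log does not give one


@dataclass
class Session:
    thread_id: str
    expires_at: float


class InMemorySessionManager:
    """Hypothetical sketch of a sliding-window TTL store keyed by thread_id."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so tests can control time
        self._sessions: dict[str, Session] = {}

    def touch(self, thread_id: str) -> Session:
        """Create the session, or slide its expiry forward on activity."""
        now = self._clock()
        session = self._sessions.get(thread_id)
        if session is None or session.expires_at <= now:
            # Unknown or already expired: start a fresh window.
            session = Session(thread_id, now + SESSION_TTL_SECONDS)
            self._sessions[thread_id] = session
        else:
            # Never shorten an expiry that an interrupt has already extended.
            session.expires_at = max(session.expires_at, now + SESSION_TTL_SECONDS)
        return session

    def extend_for_interrupt(self, thread_id: str) -> Session:
        """Give the user extra time while an approval prompt is pending."""
        session = self.touch(thread_id)
        session.expires_at += INTERRUPT_EXTENSION_SECONDS
        return session

    def is_active(self, thread_id: str) -> bool:
        session = self._sessions.get(thread_id)
        return session is not None and session.expires_at > self._clock()
```

Passing the clock in as a parameter keeps the expiry logic testable without sleeping. Because the store is a plain dictionary, all sessions are lost when the process restarts, which matches the limitation recorded under Known Issues.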

Code Structure

New files

Backend (backend/app/):

config.py -- pydantic-settings centralized configuration
db.py -- Async PostgreSQL pool + AsyncPostgresSaver setup
llm.py -- LLM provider factory (ChatAnthropic/ChatOpenAI/ChatGoogleGenerativeAI)
callbacks.py -- Token usage + cost tracking callback handler
registry.py -- YAML agent registry with validation + immutable config models
session_manager.py -- Session TTL with sliding window + interrupt extension
graph.py -- LangGraph Supervisor construction from registry
ws_handler.py -- WebSocket message dispatch + streaming logic
main.py -- FastAPI app entry with lifespan + WebSocket endpoint
agents/__init__.py -- Tool name-to-function bridge
agents/order_lookup.py -- Mock order status/tracking tools
agents/order_actions.py -- Mock cancel_order with interrupt()
agents/fallback.py -- Fallback response tool
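
The tool name-to-function bridge listed for agents/__init__.py could look roughly like the sketch below, where tool names taken from the registry are resolved to Python callables. The tool functions, the TOOL_BRIDGE mapping, and resolve_tools are hypothetical names chosen for illustration; the real module may be organised differently.

```python
from typing import Callable


def get_order_status(order_id: str) -> str:
    """Mock lookup that returns a canned status (hypothetical tool)."""
    return f"Order {order_id}: shipped"


def fallback_response(query: str) -> str:
    """Mock fallback reply (hypothetical tool)."""
    return "Sorry, I can't help with that yet."


# Maps the tool names an agent config may reference to their implementations.
TOOL_BRIDGE: dict[str, Callable[..., str]] = {
    "get_order_status": get_order_status,
    "fallback_response": fallback_response,
}


def resolve_tools(names: list[str]) -> list[Callable[..., str]]:
    """Return the callables for the given names, failing fast on unknown ones."""
    unknown = [name for name in names if name not in TOOL_BRIDGE]
    if unknown:
        raise KeyError(f"Unknown tool name(s): {', '.join(unknown)}")
    return [TOOL_BRIDGE[name] for name in names]
```

Failing at resolution time means a typo in the agent config surfaces at startup instead of in the middle of a conversation.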

Frontend (frontend/src/):

types.ts -- WebSocket message protocol TypeScript types
hooks/useWebSocket.ts -- WebSocket connection + reconnect + message dispatch
components/ChatMessages.tsx -- Streaming message display
components/ChatInput.tsx -- Message input
components/InterruptPrompt.tsx -- Approve/reject interrupt UI
components/AgentAction.tsx -- Tool call inline display
pages/ChatPage.tsx -- Main chat page composing all components

Infrastructure:

backend/pyproject.toml -- Dependencies + pytest + ruff config
backend/agents.yaml -- Agent registry YAML config
backend/Dockerfile -- Backend container
docker-compose.yml -- PostgreSQL 16 + backend services
.gitignore -- Updated for Python + Node artifacts

Tests (backend/tests/unit/):

test_config.py -- Settings validation tests
test_registry.py -- 17 tests for registry loading/validation
test_agents.py -- 10 tests for tool functions + tool bridge
test_llm.py -- 3 tests for LLM provider factory
test_callbacks.py -- 9 tests for token usage tracking
test_session_manager.py -- 9 tests for session TTL logic
test_graph.py -- 4 tests for supervisor construction
test_db.py -- 5 tests for database setup
test_ws_handler.py -- 12 tests for WebSocket message handling
test_main.py -- 5 tests for app configuration

Test Coverage

Unit test count: 82
Integration test count: 0 (requires running PostgreSQL)
E2E test count: 0 (manual verification in plan)
Overall coverage: 88%

Deviations from Plan

Used astream(stream_mode="messages") instead of astream_events(), following LangGraph streaming best practices
Separated WebSocket handler logic into ws_handler.py for testability (not in the original plan)
Session manager uses in-memory storage instead of DB-backed storage (sufficient for a Phase 1 single-instance deployment)
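
As a rough illustration of the first deviation: with stream_mode="messages", LangGraph's astream yields (message chunk, metadata) pairs, and the metadata carries the emitting node under the langgraph_node key. The sketch below consumes a stream of that shape and turns it into WebSocket-style frames with agent attribution. The fake stream stands in for a real compiled graph so the example runs without LangGraph installed, and the frame fields (type, agent, content) are assumed, not the project's actual protocol.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterator


@dataclass
class FakeChunk:
    """Stand-in for a LangChain message chunk; only .content is used here."""
    content: str


async def fake_message_stream() -> AsyncIterator[tuple[FakeChunk, dict[str, Any]]]:
    # Mimics the (chunk, metadata) pairs produced by
    # graph.astream(..., stream_mode="messages").
    for text, node in [("Your ", "order_lookup"), ("", "order_lookup"),
                       ("order shipped.", "order_lookup")]:
        yield FakeChunk(text), {"langgraph_node": node}


async def stream_to_frames(
    stream: AsyncIterator[tuple[Any, dict[str, Any]]],
) -> list[dict[str, str]]:
    """Convert streamed chunks into frames; the frame shape is hypothetical."""
    frames: list[dict[str, str]] = []
    async for chunk, metadata in stream:
        if not chunk.content:
            continue  # skip empty chunks, e.g. tool-call bookkeeping
        frames.append({
            "type": "token",
            "agent": metadata.get("langgraph_node", "unknown"),
            "content": chunk.content,
        })
    return frames


def collect_demo_frames() -> list[dict[str, str]]:
    return asyncio.run(stream_to_frames(fake_message_stream()))
```

In a real handler each frame would be sent over the WebSocket as it is produced instead of being collected into a list; the list is used here only to keep the sketch easy to inspect.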

Known Issues / Tech Debt

Session manager not DB-backed (loses state on restart) -- acceptable for Phase 1 single-instance
WebSocket reconnect does not re-send pending interrupt state from server
No rate limiting on WebSocket endpoint (Phase 2)
No authentication (Phase 2)
main.py coverage at 47% -- the lifespan function is not unit-testable without a running database
