smart-support

Author	SHA1	Message	Date
Yaojia Wang	f0699436c5	refactor: engineering improvements -- API versioning, structured logging, Alembic, error standardization, test coverage - API versioning: all REST endpoints prefixed with /api/v1/ - Structured logging: replaced stdlib logging with structlog (console/JSON modes) - Alembic migrations: versioned DB schema with initial migration - Error standardization: global exception handlers for consistent envelope format - Interrupt cleanup: asyncio background task for expired interrupt removal - Integration tests: +30 tests (analytics, replay, openapi, error, session APIs) - Frontend tests: +57 tests (all components, pages, useWebSocket hook) - Backend: 557 tests, 89.75% coverage | Frontend: 80 tests, 16 test files	2026-04-06 23:19:29 +02:00
Yaojia Wang	af53111928	refactor: fix architectural issues across frontend and backend Address all architecture review findings: P0 fixes: - Add API key authentication for admin endpoints (analytics, replay, openapi) and WebSocket connections via ADMIN_API_KEY env var - Add PostgreSQL-backed PgSessionManager and PgInterruptManager for multi-worker production deployments (in-memory defaults preserved) P1 fixes: - Implement actual tool generation in OpenAPI approve_job endpoint using generate_tool_code() and generate_agent_yaml() - Add missing clarification, interrupt_expired, and tool_result message handlers in frontend ChatPage P2 fixes: - Replace monkey-patching on CompiledStateGraph with typed GraphContext - Replace 9-param dispatch_message with WebSocketContext dataclass - Extract duplicate _envelope() into shared app/api_utils.py - Replace mutable module-level counter with crypto.randomUUID() - Remove hardcoded mock data from ReviewPage, use api.ts wrappers - Remove `as any` type escape from ReplayPage All 516 tests passing, 0 TypeScript errors.	2026-04-06 15:59:14 +02:00
Yaojia Wang	b8654aa31f	feat: upgrade LangGraph to 1.x and migrate deprecated APIs - Bump langgraph from 0.4 to 1.0+, langgraph-supervisor from 0.0.12 to 0.0.30+ - Bump langchain-core, langchain-anthropic, langchain-openai to 1.x - Add langchain>=1.0 dependency for new create_agent location - Migrate create_react_agent -> create_agent (prompt -> system_prompt) - Fix create_supervisor positional arg to named agents= parameter - Replace AsyncMock checkpointer with InMemorySaver in tests (v1 type validation) - Update version references in README, ARCHITECTURE, eng-review-plan	2026-04-06 14:51:51 +02:00
Yaojia Wang	19fc9f3289	test: close coverage gaps and add frontend test infrastructure Backend (516 tests, 94% coverage): - Add azure_openai endpoint/deployment validation tests (config.py -> 100%) - Add _total_conversations and _avg_turns direct tests (queries.py -> 100%) - Add transformer edge cases: list content, string checkpoint, invalid JSON, malformed message graceful skip (transformer.py -> 93%) - Add safety combined status_code+error_message interaction tests - Fix ambiguous 200/422 assertion to strict 422 - Add E2E pagination shape assertions (total, page, per_page, row count) - Fix ReplayPool mock to respect LIMIT/OFFSET params Frontend (23 tests, vitest + happy-dom + @testing-library/react): - Add vitest infrastructure with happy-dom environment - Add api.ts tests: success, HTTP error, success=false, URL encoding - Add DashboardPage tests: loading, data, error, empty states - Add ReplayListPage tests: loading, empty, data, error, status badge classes - Add ReplayPage tests: loading, steps, empty, error states	2026-04-06 13:32:10 +02:00
Yaojia Wang	036e12349d	refactor: formalize safety rules, extract shared styles, reconcile docs (P2) - Add backend/app/safety.py with explicit confirmation policy, multi-intent semantics, and MCP error taxonomy with retry classification - Add 26 unit tests for safety module (confirmation rules, error taxonomy) - Extract repeated inline styles into shared CSS classes in index.css (section-card, stat-label, status-badge, data-table, empty/error-state, pagination-bar) - Refactor DashboardPage, ReplayListPage, ReplayPage to use shared classes - Update README: add missing API endpoints, document safety/confirmation rules - Use proper HTML entities for arrow/dash characters to fix encoding glitches	2026-04-05 23:10:50 +02:00
Yaojia Wang	e0931daece	feat: wire frontend pages to live APIs and standardize response contracts (P1) - Backend: Add COUNT query and paginated response shape to conversations endpoint Returns { conversations: [...], total, page, per_page } instead of flat array - Frontend: Replace mock data in DashboardPage with fetchAnalytics() API calls - Frontend: Replace mock data in ReplayListPage with fetchConversations() API calls - Frontend: Replace mock data in ReplayPage with fetchReplay() API calls - Add proper loading, empty, and error states to all three pages - Align ConversationSummary type with actual DB columns (created_at, status) - Update unit and E2E tests for new paginated conversation response shape - Add fetchone() to FakeCursor for COUNT query support in E2E tests	2026-04-05 23:06:00 +02:00
Yaojia Wang	e55ec42ae5	fix: restore green builds and align frontend-backend contracts (P0) - Isolate Settings tests from .env and process env leakage - Fix analytics metadata test to unwrap psycopg Json wrapper - Remove unused state variables causing frontend build failures - Fix ReviewPage to use /classifications endpoint instead of nonexistent /result - Normalize ReviewPage status enums (failed not error) and access_type values - Align api.ts types with backend response shapes (ReplayPage, AnalyticsData, AgentUsage)	2026-04-05 23:00:39 +02:00
Yaojia Wang	189a0fad34	feat(ui): implement premium beige design system and ux refinements	2026-04-05 22:35:48 +02:00
Yaojia Wang	d2b4610df9	fix: address code and security review findings for Phase 5 - Add nginx security headers (X-Frame-Options, X-Content-Type-Options, etc.) - Fix postgres networking: add to app_network, comment out host port exposure - Fix rate limit memory leak: add bounded eviction for stale thread entries - Use immutable update pattern in rate limit check (no .append mutation) - Extract _VERSION constant to avoid duplicate hardcoded version string	2026-03-31 21:35:13 +02:00
Yaojia Wang	0e78e5b06b	feat: complete phase 5 -- error hardening, frontend, Docker, demo, docs Backend: - ConversationTracker: Protocol + PostgresConversationTracker for lifecycle tracking - Error handler: ErrorCategory enum, classify_error(), with_retry() exponential backoff - Wire PostgresAnalyticsRecorder + ConversationTracker into ws_handler - Rate limiting (10 msg/10s per thread), edge case hardening - Health endpoint GET /api/health, version 0.5.0 - Demo seed data script + sample OpenAPI spec Frontend (all new): - React Router with NavBar (Chat / Replay / Dashboard / Review) - ReplayListPage + ReplayPage with ReplayTimeline component - DashboardPage with MetricCard, range selector, zero-state - ReviewPage for OpenAPI classification review - ErrorBanner for WebSocket disconnect handling - API client (api.ts) with typed fetch wrappers Infrastructure: - Frontend Dockerfile (multi-stage node -> nginx) - nginx.conf with SPA routing + API/WS proxy - docker-compose.yml with frontend service + healthchecks - .env.example files (root + backend) Documentation: - README.md with quick start and architecture - Agent configuration guide - OpenAPI import guide - Deployment guide - Demo script 48 new tests, 449 total passing, 92.87% coverage	2026-03-31 21:20:06 +02:00
Yaojia Wang	38644594d2	test: add thread_id validation tests for replay API - Test invalid thread_id with spaces returns 400 - Test thread_id with special chars returns 400 - Tighten existing 404 test assertion	2026-03-31 13:44:04 +02:00
Yaojia Wang	ef6e5ac2be	fix: address security findings in Phase 4 analytics and replay - Fix CRITICAL: use parameterized INTERVAL arithmetic (%(days)s * INTERVAL '1 day') instead of string interpolation inside SQL literal - Use asyncio.gather() for parallel query execution in get_analytics() - Add range upper bound (max 365 days) to prevent DoS via full-table scans - Add thread_id validation (alphanumeric, max 128 chars) in replay API - Sanitize error messages to not reflect user input	2026-03-31 13:38:09 +02:00
Yaojia Wang	33db5aeb10	feat: complete phase 4 -- conversation replay API + analytics dashboard - Replay models: StepType enum, ReplayStep, ReplayPage frozen dataclasses - Checkpoint transformer: PostgresSaver JSONB -> structured timeline steps - Replay API: GET /api/conversations (paginated), GET /api/replay/{thread_id} - Analytics models: AgentUsage, InterruptStats, AnalyticsResult - Analytics event recorder: Protocol + PostgresAnalyticsRecorder + NoOp - Analytics queries: resolution_rate, agent_usage, escalation_rate, cost, interrupts - Analytics API: GET /api/analytics?range=Xd with envelope response - DB migration: analytics_events table + conversations column additions - 74 new tests, 399 total passing, 92.87% coverage	2026-03-31 13:35:45 +02:00
Yaojia Wang	a2f750269d	fix: address critical security and code review findings in Phase 3 - Wire ImportOrchestrator into review_api start_import via BackgroundTasks - Sanitize docstrings in generated tool code to prevent code injection - Add Literal["read", "write"] validation for access_type - Add regex validation for agent_group - Validate URL scheme (http/https only) in ImportRequest - Validate LLM output fields (clamp confidence, validate access_type) - Use dataclasses.replace instead of manual reconstruction in importer - Expand SSRF blocked networks (Carrier-Grade NAT, IPv4-mapped IPv6, etc.) - Make _BLOCKED_NETWORKS immutable tuple - Use yaml.safe_dump instead of yaml.dump - Fix _to_snake_case for empty strings and Python keywords	2026-03-31 00:28:28 +02:00
Yaojia Wang	a54eb224e0	feat: complete phase 3 -- OpenAPI auto-discovery, SSRF protection, tool generation - SSRF protection: private IP blocking, DNS rebinding defense, redirect validation - OpenAPI fetcher with SSRF guard, JSON/YAML auto-detection, 10MB limit - Structural spec validator (3.0.x/3.1.x) - Endpoint parser with $ref resolution, auto-generated operation IDs - Heuristic + LLM endpoint classifier with Protocol interface - Review API at /api/openapi (import, job status, classification CRUD, approve) - @tool code generator + Agent YAML generator - Import orchestrator (fetch -> validate -> parse -> classify pipeline) - 125 new tests, 322 total passing, 93.23% coverage	2026-03-31 00:10:44 +02:00
Yaojia Wang	006b4ee5d7	fix: resolve ruff lint errors in Phase 2 code - Move intent imports to TYPE_CHECKING block in graph.py (TC001) - Rename test classes to CapWords convention (N801) - Fix line length violations across test files (E501) - Auto-fix import sorting (I001)	2026-03-30 21:44:47 +02:00
Yaojia Wang	b861ff055f	test: add routing integration tests for Phase 2 test requirements 9 tests covering the complete multi-agent routing flow: - Single-intent routing to each agent (order_lookup, order_actions, discount, fallback) - Multi-intent routing hint injection for sequential execution - Ambiguity detection skips graph and returns clarification - Low confidence threshold triggers ambiguity - No-classifier fallback to supervisor prompt routing Fills Phase 2 test requirement for integration-level routing coverage. Total: 197 tests, 92.60% coverage.	2026-03-30 21:41:01 +02:00
Yaojia Wang	512f988dd0	test: add Phase 2 checkpoint acceptance tests 18 integration tests validating all 7 Phase 2 checkpoint criteria: 1. Order query routes to order_lookup agent 2. Multi-intent classification with routing hint injection 3. Ambiguous message triggers clarification prompt 4. 30-min interrupt TTL auto-cancel with retry prompt 5. Webhook POST escalation with retry on failure 6. E-commerce template loads 4 correctly configured agents 7. Coverage at 92.60% (188 tests total)	2026-03-30 21:38:25 +02:00
Yaojia Wang	6e7b824b64	test: add integration tests for WebSocket message flow 17 integration tests covering: - Happy path: token streaming, tool calls, multi-message sessions - Interrupt flow: approve and reject paths with manager tracking - Session TTL: expiration, sliding window reset, interrupt extension - Validation: invalid JSON, missing fields, size limits - Interrupt TTL: expired interrupt sends retry prompt Fills Phase 1 test gap for integration-level WebSocket coverage. Total: 170 tests, 92.15% coverage.	2026-03-30 21:24:31 +02:00
Yaojia Wang	1050df780d	feat: complete phase 2 -- multi-agent routing, interrupt TTL, escalation, templates - Intent classification with LLM structured output (single/multi/ambiguous) - Discount agent with apply_discount and generate_coupon tools - Interrupt manager with 30-min TTL auto-expiration and retry prompts - Webhook escalation module with exponential backoff retry (max 3) - Three vertical industry templates (e-commerce, SaaS, fintech) - Template loading in AgentRegistry - Enhanced supervisor prompt with dynamic agent descriptions - 153 tests passing, 90.18% coverage	2026-03-30 21:04:39 +02:00
Yaojia Wang	33488fd634	feat: complete phase 1 -- core framework with chat loop, agents, and React UI Backend: - FastAPI WebSocket /ws endpoint with streaming via LangGraph astream - LangGraph Supervisor connecting 3 mock agents (order_lookup, order_actions, fallback) - YAML Agent Registry with Pydantic validation and immutable configs - PostgresSaver checkpoint persistence via langgraph-checkpoint-postgres - Session TTL with 30-min sliding window and interrupt extension - LLM provider abstraction (Anthropic/OpenAI/Google) - Token usage + cost tracking callback handler - Input validation: message size cap, thread_id format, content length - Security: no hardcoded defaults, startup API key validation, no input reflection Frontend: - React 19 + TypeScript + Vite chat UI - WebSocket hook with reconnect + exponential backoff - Streaming token display with agent attribution - Interrupt approval/reject UI for write operations - Collapsible tool call viewer Testing: - 87 unit tests, 87% coverage (exceeds 80% requirement) - Ruff lint + format clean Infrastructure: - Docker Compose (PostgreSQL 16 + backend) - pyproject.toml with full dependency management	2026-03-30 00:54:21 +02:00

21 Commits
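
Commits 0e78e5b06b and d2b4610df9 describe a per-thread rate limit (10 msg/10s), an immutable update pattern (no .append mutation), and bounded eviction of stale thread entries. The following is a minimal sketch of how those three ideas could fit together, not the repository's actual code. The class name, method name, `max_threads` parameter, and injectable `clock` are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import time
from typing import Callable


class SlidingWindowRateLimiter:
    """Hypothetical sketch: at most `limit` messages per `window` seconds per thread."""

    def __init__(
        self,
        limit: int = 10,
        window: float = 10.0,
        max_threads: int = 1000,  # assumed threshold that triggers eviction
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window
        self._max_threads = max_threads
        self._clock = clock
        # thread_id -> timestamps of accepted messages, oldest first
        self._hits: dict[str, tuple[float, ...]] = {}

    def allow(self, thread_id: str) -> bool:
        now = self._clock()
        cutoff = now - self._window

        # Build a fresh tuple of in-window timestamps instead of mutating in place.
        recent = tuple(t for t in self._hits.get(thread_id, ()) if t > cutoff)
        allowed = len(recent) < self._limit
        if allowed:
            recent = (*recent, now)
        self._hits[thread_id] = recent

        # Once the table grows past max_threads, drop threads whose newest
        # accepted message has already fallen out of the window.
        if len(self._hits) > self._max_threads:
            self._hits = {
                tid: hits
                for tid, hits in self._hits.items()
                if hits and hits[-1] > cutoff
            }
        return allowed
```

A WebSocket handler could call `allow(thread_id)` before dispatching each message and send an error frame when it returns `False`. Injecting the clock keeps the window logic testable without sleeping.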