
Yaojia Wang 33488fd634 feat: complete phase 1 -- core framework with chat loop, agents, and React UI

Backend:
- FastAPI WebSocket /ws endpoint with streaming via LangGraph astream
- LangGraph Supervisor connecting 3 mock agents (order_lookup, order_actions, fallback)
- YAML Agent Registry with Pydantic validation and immutable configs
- PostgresSaver checkpoint persistence via langgraph-checkpoint-postgres
- Session TTL with 30-min sliding window and interrupt extension
- LLM provider abstraction (Anthropic/OpenAI/Google)
- Token usage + cost tracking callback handler
- Input validation: message size cap, thread_id format, content length
- Security: no hardcoded defaults, startup API key validation, no input reflection

Frontend:
- React 19 + TypeScript + Vite chat UI
- WebSocket hook with reconnect + exponential backoff
- Streaming token display with agent attribution
- Interrupt approval/reject UI for write operations
- Collapsible tool call viewer

Testing:
- 87 unit tests, 87% coverage (exceeds 80% requirement)
- Ruff lint + format clean

Infrastructure:
- Docker Compose (PostgreSQL 16 + backend)
- pyproject.toml with full dependency management

2026-03-30 00:54:21 +02:00

11 KiB


Smart Support -- Project Instructions

Project Overview

AI customer support action-layer framework. Core value: "Paste your API, get an AI agent that executes real actions."

Tech Stack: Python 3.11+, FastAPI, LangGraph v1.1, React, PostgreSQL 16, Docker Compose
Planning Docs: docs/DEVELOPMENT-PLAN.md, docs/ARCHITECTURE.md, eng-review-plan.md
Phases: 5 phases (see docs/DEVELOPMENT-PLAN.md)

Phase Execution Workflow (MANDATORY)

Every phase MUST follow this exact workflow. No exceptions.

Step 0: Pre-flight Checks, Checkpoint, and Branch

Before starting ANY phase work:

# 1. Verify ECC hooks are working
#    - ECC plugin provides all hooks (quality-gate, security, session management)
#    - Controlled by ECC_HOOK_PROFILE env var (standard/strict) in ~/.claude/settings.json
#    - Run a test edit on a .py file to confirm quality-gate.js fires (ruff auto-format)
#    - If hooks are broken, check ECC plugin status before proceeding

# 2. Run full test suite to confirm main is green (regression gate)
pytest --cov=app --cov-report=term-missing
#    - If any test fails, fix it before starting the new phase

# 3. Create checkpoint to snapshot the starting state
/everything-claude-code:checkpoint create [phase name]

# 4. Create the phase branch
git checkout main
git pull origin main
git checkout -b phase-{N}/{short-description}
# Example: phase-1/core-framework

# 5. Mark phase as IN PROGRESS
#    Update the Phase Summary table in this CLAUDE.md:
#    change Status from `NOT STARTED` to `IN PROGRESS`

Step 1: Read Plan and Prepare

- Read docs/DEVELOPMENT-PLAN.md -- locate the current phase section
- Read docs/ARCHITECTURE.md -- understand relevant components
- Identify all tasks, acceptance criteria, and dependencies for this phase
- Create a phase dev log skeleton at docs/phases/phase-{N}-dev-log.md (date, branch name, plan link only -- content filled in Step 5)
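The skeleton from the last item could be generated rather than typed by hand. The following is a minimal sketch, not part of the project: the function name `dev_log_skeleton` and its parameters are hypothetical, and the plan link points at the file only because the section anchor depends on the phase title.

```python
from datetime import date


def dev_log_skeleton(phase: int, title: str, slug: str, started: date) -> str:
    """Render the Step 1 skeleton: date, branch name, and plan link only."""
    branch = f"phase-{phase}/{slug}"  # branch naming rule from Step 0
    lines = [
        f"# Phase {phase}: {title} -- Development Log",
        "",
        "> Status: IN PROGRESS",
        f"> Phase branch: `{branch}`",
        f"> Date started: {started.isoformat()}",
        "> Date completed: YYYY-MM-DD",  # left as a placeholder until Step 7
        f"> Related plan section: [Phase {phase} in DEVELOPMENT-PLAN](../DEVELOPMENT-PLAN.md)",
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(dev_log_skeleton(1, "Core Framework", "core-framework", date.today()))
```

The returned text would be written to docs/phases/phase-{N}-dev-log.md; the remaining sections are filled in during Step 5.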

Step 2: Develop Using Orchestrate Skill

Route to the correct orchestration mode based on work type:

Work Type	Skill Command
New feature	`/everything-claude-code:orchestrate feature`
Bug fix	`/everything-claude-code:orchestrate bugfix`
Refactor	`/everything-claude-code:orchestrate refactor`

ALWAYS use the appropriate orchestrate skill. Never develop without it.

A single phase may contain mixed work types (e.g., Phase 5 has feature + bugfix + refactor). Call the orchestrate skill per sub-task with the matching mode. Example:

# Within Phase 5:
/everything-claude-code:orchestrate feature    # for demo script
/everything-claude-code:orchestrate bugfix     # for error handling fixes
/everything-claude-code:orchestrate refactor   # for code cleanup

Step 3: Module Independence (CRITICAL)

Every module MUST be independently runnable (with mocked deps) and independently testable. Use Protocol for interfaces and dependency injection for all external deps (see ~/.claude/rules/python/patterns.md). No circular imports -- dependency graph must be a DAG.
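As an illustration of the Protocol-plus-injection rule, here is a minimal sketch. All names in it (`OrderStore`, `OrderLookup`, `FakeOrderStore`) are hypothetical and do not come from the codebase; the point is only that a module depends on an interface and receives its implementation from outside.

```python
from dataclasses import dataclass
from typing import Protocol


class OrderStore(Protocol):
    """Hypothetical interface a module depends on instead of a concrete client."""

    def get_status(self, order_id: str) -> str: ...


@dataclass(frozen=True)
class OrderLookup:
    """Hypothetical module; its only external dependency is injected."""

    store: OrderStore

    def describe(self, order_id: str) -> str:
        return f"Order {order_id} is {self.store.get_status(order_id)}"


class FakeOrderStore:
    """In-memory stand-in; satisfies OrderStore structurally, without inheritance."""

    def __init__(self, statuses: dict[str, str]) -> None:
        self._statuses = statuses

    def get_status(self, order_id: str) -> str:
        return self._statuses.get(order_id, "unknown")


if __name__ == "__main__":
    lookup = OrderLookup(store=FakeOrderStore({"A1": "shipped"}))
    print(lookup.describe("A1"))
```

Because `OrderLookup` never imports a concrete store, it can run and be unit-tested with the fake alone, which is what "independently runnable (with mocked deps)" asks for.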

Module boundaries for this project:

backend/app/
  agents/       -- Agent definitions and tools (depends on: registry)
  registry.py   -- YAML agent registry (depends on: nothing)
  graph.py      -- LangGraph supervisor (depends on: agents, registry)
  openapi/      -- OpenAPI parser + MCP gen (depends on: nothing)
  replay/       -- Conversation replay API (depends on: nothing, reads DB)
  analytics/    -- Analytics queries (depends on: nothing, reads DB)
  callbacks.py  -- Token usage logging (depends on: nothing)
  main.py       -- FastAPI entry point (composes all modules)
frontend/       -- React UI (depends on: backend API contract only)
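The "dependency graph must be a DAG" rule can be checked mechanically. The sketch below transcribes the listing above into a dictionary by hand (an assumption: "composes all modules" is read as main depending on every other backend module) and uses the standard library's `graphlib` to detect cycles. The names `MODULE_DEPS` and `import_order` are illustrative.

```python
from graphlib import CycleError, TopologicalSorter

# Dependency map transcribed from the module boundary listing.
# Keys are modules; values are the modules they depend on.
MODULE_DEPS: dict[str, set[str]] = {
    "registry": set(),
    "agents": {"registry"},
    "graph": {"agents", "registry"},
    "openapi": set(),
    "replay": set(),
    "analytics": set(),
    "callbacks": set(),
    "main": {
        "agents", "registry", "graph", "openapi",
        "replay", "analytics", "callbacks",
    },
}


def import_order(deps: dict[str, set[str]]) -> list[str]:
    """Return an order with dependencies first, or raise ValueError on a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # graphlib puts the offending cycle in the second element of args.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc


if __name__ == "__main__":
    print(import_order(MODULE_DEPS))
```

A real check would derive the map from actual imports instead of a hand-written dictionary; this version only validates the documented boundaries.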

Step 4: Testing (MANDATORY -- 80%+ Coverage)

Follow ~/.claude/rules/python/testing.md for pytest patterns and pytest.mark categorization. Use these project-specific markers:

@pytest.mark.unit          # per-module isolated tests
@pytest.mark.integration   # cross-module with real PostgreSQL (Docker)
@pytest.mark.e2e           # full-stack user flow tests

E2E scope by phase:

- Phase 1: Backend-only E2E via FastAPI TestClient (WebSocket chat loop). Frontend and backend are developed in parallel, so full-stack E2E is not yet possible.
- Phase 2+: Full-stack E2E (frontend + backend) required for all critical user flows.

Test file structure:

backend/tests/
  unit/           -- @pytest.mark.unit per module
  integration/    -- @pytest.mark.integration with real DB
  e2e/            -- @pytest.mark.e2e full-stack flows
  conftest.py     -- Shared fixtures, marker registration
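Marker registration in conftest.py might look like the sketch below. It relies on pytest's `pytest_configure` hook and `config.addinivalue_line`, both documented pytest APIs; the `MARKERS` dictionary is an illustrative name, and its descriptions are taken from the marker comments above.

```python
# backend/tests/conftest.py (sketch)

MARKERS = {
    "unit": "per-module isolated tests",
    "integration": "cross-module with real PostgreSQL (Docker)",
    "e2e": "full-stack user flow tests",
}


def pytest_configure(config) -> None:
    """Register the project markers so `pytest -m <name>` runs without warnings."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

Registering markers this way keeps them next to the shared fixtures; declaring them under `[tool.pytest.ini_options]` in pyproject.toml would be an equally valid choice.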

NOTE: This project uses --cov=app (not --cov=src as in the global Python rules):

pytest --cov=app --cov-report=term-missing
pytest -m unit              # run only unit tests
pytest -m integration       # run only integration tests
pytest -m e2e               # run only e2e tests
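The 80% bar can also be checked from a machine-readable report. This sketch assumes the suite was run with `--cov-report=json`, which makes pytest-cov write a coverage.py JSON report containing a `totals.percent_covered` field; the function names and the `REQUIRED_COVERAGE` constant are hypothetical.

```python
import json
from pathlib import Path

REQUIRED_COVERAGE = 80.0  # the 80%+ bar from this step


def coverage_ok(report: dict, minimum: float = REQUIRED_COVERAGE) -> bool:
    """True when the report's total coverage meets the minimum."""
    return float(report["totals"]["percent_covered"]) >= minimum


def load_report(path: Path) -> dict:
    """Read a coverage.py JSON report from disk."""
    return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    report_path = Path("coverage.json")  # default output name of the JSON report
    if report_path.exists():
        print("coverage ok:", coverage_ok(load_report(report_path)))
```

For a plain pass/fail gate, pytest-cov's own `--cov-fail-under=80` option does the same job without extra code.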

Step 5: Phase Development Documentation

Fill in the dev log skeleton created in Step 1 (docs/phases/phase-{N}-dev-log.md):

# Phase {N}: {Title} -- Development Log

> Status: IN PROGRESS | COMPLETED
> Phase branch: `phase-{N}/{short-description}`
> Date started: YYYY-MM-DD
> Date completed: YYYY-MM-DD
> Related plan section: [Phase {N} in DEVELOPMENT-PLAN](../DEVELOPMENT-PLAN.md#phase-{n}-xxx)

## What Was Built

- List of features/components implemented
- Architecture decisions made during development

## Code Structure

- New files created and their purposes
- Modified files and what changed
- Module dependency changes

## Test Coverage

- Unit test count and coverage %
- Integration test count
- E2E test count
- Overall coverage: XX%

## Deviations from Plan

- Any changes from the original plan and why

## Known Issues / Tech Debt

- Items deferred to future phases

Add a link to this dev log in docs/DEVELOPMENT-PLAN.md under the corresponding phase section.

Step 6: Verification and Checkpoint Verify

After all development and testing, run verification in this exact order:

# 1. Run the verification skill -- must pass
/everything-claude-code:verify

# 2. Verify the checkpoint -- validates all phase deliverables
/everything-claude-code:checkpoint verify [phase name]

The checkpoint verify step validates:

- All tests passing (80%+ coverage)
- Phase dev log exists and is linked
- No unresolved TODOs for this phase
- Module independence constraints met
- Code quality checks pass

Both MUST pass before proceeding. Fix any issues found.

Step 7: Mark Complete, Merge to Main

After both verify steps pass:

# Ensure all changes are committed on the phase branch
# IMPORTANT: Never use `git add -A`. Add specific files to avoid committing
# .env, __pycache__, IDE configs, or other untracked artifacts.
# Verify .gitignore covers all exclusions before staging.
git add backend/ frontend/ docs/ docker-compose.yml pyproject.toml CLAUDE.md
git status  # review staged files before committing
git commit -m "feat: complete phase {N} -- {description}"

# Tag the checkpoint
git tag checkpoint/phase-{N}

# Merge to main
git checkout main
git merge phase-{N}/{short-description}

# Push (after user confirmation)
git push origin main --tags

Mark phase as completed (MANDATORY -- do ALL three):

- Update Phase Summary table in this file (CLAUDE.md): change Status from IN PROGRESS to COMPLETED (YYYY-MM-DD)
- Update dev log (docs/phases/phase-{N}-dev-log.md): fill in Date completed and set status to COMPLETED
- Check off tasks in docs/DEVELOPMENT-PLAN.md: mark all completed task checkboxes - [x] under the current phase section

All three markers must be consistent. If any is missed, the next phase's Step 0 regression gate will catch the discrepancy.
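A consistency check over the three markers could be sketched as follows. It is not an existing tool. It assumes the dev log carries a literal `Status: COMPLETED` line and that open tasks in the plan appear as `- [ ]` checkboxes; `completion_gaps` is a hypothetical name, and the caller is expected to pass in the relevant text already extracted from each file.

```python
import re

UNCHECKED = re.compile(r"^\s*- \[ \]", re.MULTILINE)
COMPLETED = re.compile(r"COMPLETED \(\d{4}-\d{2}-\d{2}\)")


def completion_gaps(table_status: str, dev_log_text: str, plan_section: str) -> list[str]:
    """List which of the three completion markers are still missing."""
    gaps = []
    # 1. Phase Summary table cell must read COMPLETED (YYYY-MM-DD).
    if not COMPLETED.fullmatch(table_status.strip()):
        gaps.append("CLAUDE.md Phase Summary status")
    # 2. Dev log header must say COMPLETED (the untouched template does not match).
    if "Status: COMPLETED" not in dev_log_text:
        gaps.append("dev log status")
    # 3. No unchecked task boxes may remain in the phase's plan section.
    if UNCHECKED.search(plan_section):
        gaps.append("DEVELOPMENT-PLAN.md task checkboxes")
    return gaps
```

An empty list means all three locations agree that the phase is complete.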

A checkpoint includes:

- /everything-claude-code:checkpoint create at phase start
- /everything-claude-code:checkpoint verify at phase end
- All tests passing (80%+ coverage)
- Phase dev log written and linked
- /everything-claude-code:verify passed
- Git tag checkpoint/phase-{N} created
- Phase marked COMPLETED in three locations
- Branch merged to main

Phase Summary

| Phase | Branch | Focus | Status |
| --- | --- | --- | --- |
| 1 | `phase-1/core-framework` | FastAPI + LangGraph + React chat loop + PostgresSaver | IN PROGRESS |
| 2 | `phase-2/multi-agent-safety` | Supervisor routing + interrupts + templates | NOT STARTED |
| 3 | `phase-3/openapi-discovery` | OpenAPI parsing + MCP generation + SSRF protection | NOT STARTED |
| 4 | `phase-4/analytics-replay` | Replay API + analytics dashboard | NOT STARTED |
| 5 | `phase-5/polish-demo` | Error hardening + demo prep + Docker deploy | NOT STARTED |

Status values: NOT STARTED -> IN PROGRESS -> COMPLETED (YYYY-MM-DD)
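The status lifecycle is a three-stage, forward-only sequence. A minimal sketch of a transition check is shown below, under the assumption that phases may not skip a stage or move backwards; `stage` and `is_valid_transition` are illustrative names.

```python
import re

_COMPLETED = re.compile(r"COMPLETED \(\d{4}-\d{2}-\d{2}\)")


def stage(status: str) -> int:
    """Map a status string to its position in the lifecycle (0, 1, or 2)."""
    if status == "NOT STARTED":
        return 0
    if status == "IN PROGRESS":
        return 1
    if _COMPLETED.fullmatch(status):
        return 2
    raise ValueError(f"unknown status: {status!r}")


def is_valid_transition(old: str, new: str) -> bool:
    """Allow only a single forward step: no skipping and no going back."""
    return stage(new) == stage(old) + 1
```

A bare `COMPLETED` without the date is rejected on purpose, since the table format requires the completion date.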

Rules Reference

This project inherits from ~/.claude/rules/. CLAUDE.md only contains project-specific overrides -- do NOT duplicate rules content here.

Rule File	Project Override
`common/coding-style.md`	None -- follow as-is
`python/coding-style.md`	Use `ruff` only (not black + isort separately)
`python/patterns.md`	`Protocol` for all module interfaces. Module boundary map above
`python/testing.md`	`--cov=app` (not `--cov=src`). Add `@pytest.mark.e2e` marker
`python/security.md`	`bandit` scan recommended before phase merge
`python/hooks.md`	All hooks provided by ECC plugin (see below). Type checker deferred to Phase 2+
`common/testing.md`	80%+ coverage enforced by workflow Step 6 verify
`common/security.md`	ECC `governance-capture.js` + `block-no-verify` handle security checks

Hooks (ECC Plugin -- No Custom Hooks)

All hooks come from the ECC plugin (everything-claude-code). No project-level hooks in .claude/settings.local.json.

ECC Hook	Type	What It Does
`quality-gate.js`	PostToolUse (Edit/Write)	Auto-runs ruff check + format on .py files
`post-edit-format.js`	PostToolUse (Edit)	Auto-format after edits
`block-no-verify`	PreToolUse (Bash)	Blocks `--no-verify` flag on git commits
`governance-capture.js`	Pre+Post (Bash/Edit/Write)	Captures secrets/policy violations (enable: `ECC_GOVERNANCE_CAPTURE=1`)
`pre-bash-git-push-reminder.js`	PreToolUse (Bash)	Confirms before git push
`suggest-compact.js`	PreToolUse (Edit/Write)	Suggests context compaction when needed
`session-end.js`	Stop	Persists session state
`evaluate-session.js`	Stop	Extracts reusable patterns
`cost-tracker.js`	Stop	Tracks token/cost metrics
`check-console-log.js`	Stop	Checks for debug statements
`mcp-health-check.js`	Pre + PostFailure	MCP server health monitoring

Controlled by ECC_HOOK_PROFILE env var in ~/.claude/settings.json (currently: standard).

Quick Reference

Plan doc: docs/DEVELOPMENT-PLAN.md
Architecture doc: docs/ARCHITECTURE.md
Phase dev logs: docs/phases/phase-{N}-dev-log.md
Test command: pytest --cov=app --cov-report=term-missing
Phase start: /everything-claude-code:checkpoint create [phase name]
Phase end: /everything-claude-code:checkpoint verify [phase name]
Verify command: /everything-claude-code:verify
Orchestrate: /everything-claude-code:orchestrate {feature|bugfix|refactor}
