ColaFlow/reports/2025-11-03-10-Day-Implementation-Plan.md
Yaojia Wang 1f66b25f30
2025-11-03 14:00:24 +01:00
# ColaFlow Enterprise Multi-Tenant - 10-Day Implementation Plan
**Plan Date:** 2025-11-03
**Plan Horizon:** 10 working days (2025-11-03 to 2025-11-13)
**Plan Status:** Active - Day 1 Complete
**Project:** ColaFlow M1.2 - Enterprise Multi-Tenant Architecture
**Owner:** Product Manager + Technical Leads

---
## Executive Summary
This document provides a detailed 10-day implementation plan for ColaFlow's enterprise multi-tenant architecture upgrade, including multi-tenancy, SSO integration, and MCP authentication. The plan covers backend development (Days 1-4), frontend development (Days 5-7), integration testing (Day 8), database migration (Day 9), and production validation (Day 10).
**Overall Progress:** 10% complete (Day 1 of 10)
**Status:** On Track
**Projected Completion:** 2025-11-13

---
## Table of Contents
1. [Plan Overview](#plan-overview)
2. [Day-by-Day Breakdown](#day-by-day-breakdown)
3. [Dependencies and Critical Path](#dependencies-and-critical-path)
4. [Resource Allocation](#resource-allocation)
5. [Risk Management](#risk-management)
6. [Success Criteria](#success-criteria)
7. [Communication Plan](#communication-plan)
---
## Plan Overview
### Phases
| Phase | Days | Owner | Deliverables | Status |
|-------|------|-------|--------------|--------|
| **Phase 1: Backend Foundation** | 1-4 | Backend Team | Domain, Application, Infrastructure, Migration | Day 1 Complete |
| **Phase 2: Frontend Development** | 5-7 | Frontend Team | Auth UI, Settings, MCP Token Management | Planned |
| **Phase 3: Integration & Testing** | 8 | QA + All Teams | E2E tests, Security tests, Bug fixes | Planned |
| **Phase 4: Production Deployment** | 9-10 | DevOps + All Teams | Migration, Monitoring, Validation | Planned |
### Milestones
| Milestone | Target Date | Dependencies | Deliverables |
|-----------|-------------|--------------|--------------|
| **M1.2.1: Backend Complete** | Day 4 (2025-11-06) | Domain, Application, Infrastructure | Functional APIs, Tests passing |
| **M1.2.2: Frontend Complete** | Day 7 (2025-11-09) | Backend APIs ready | Full UI implementation |
| **M1.2.3: Integration Complete** | Day 8 (2025-11-10) | Backend + Frontend | All tests passing |
| **M1.2.4: Production Ready** | Day 10 (2025-11-13) | Testing complete | Live in production |
### Key Metrics
| Metric | Target | Current | Status |
|--------|--------|---------|--------|
| **Backend Completion** | 100% by Day 4 | 25% (Day 1) | On Track |
| **Frontend Completion** | 100% by Day 7 | 0% | Scheduled |
| **Test Coverage** | >80% | 100% (Domain) | Exceeds Target |
| **Integration Test Pass Rate** | 100% | N/A | Scheduled Day 8 |
| **Production Deployment** | Day 10 | N/A | Scheduled |
---
## Day-by-Day Breakdown
### Day 1: Domain Layer (COMPLETE)
**Date:** 2025-11-03
**Status:** COMPLETE
**Owner:** Backend Engineer
**Estimated Time:** 6 hours
**Actual Time:** 6 hours
#### Deliverables
- **Projects Created:**
  - `ColaFlow.Modules.Identity.Domain`
  - `ColaFlow.Modules.Identity.Application` (empty scaffold)
  - `ColaFlow.Modules.Identity.Infrastructure` (empty scaffold)
  - `ColaFlow.Modules.Identity.Domain.Tests`
- **Domain Layer (27 files):**
  - Tenant aggregate root + 4 value objects + 3 enums + 7 domain events
  - User aggregate root + 3 value objects + 2 enums + 4 domain events
  - 2 repository interfaces (ITenantRepository, IUserRepository)
- **Unit Tests (44 tests):**
  - TenantTests (15 tests)
  - TenantSlugTests (7 tests)
  - UserTests (22 tests)
  - **Result:** 44/44 passing (100%)
#### Acceptance Criteria
- Domain entities follow DDD principles (aggregate roots, value objects)
- All business rules enforced at domain level
- Unit tests cover all critical paths
- No compilation warnings or errors
- Code follows Clean Architecture principles
**Status:** All criteria met

---
### Day 2: Infrastructure Layer + TenantContext
**Date:** 2025-11-04
**Status:** PLANNED
**Owner:** Backend Engineer
**Estimated Time:** 4-5 hours
#### Deliverables
1. **TenantContext Service**
- Resolve current tenant from JWT claims
- Inject into DbContext
- Handle missing tenant (throw exception)
- Unit tests (5 tests)
2. **EF Core Global Query Filter**
- Configure in `ApplicationDbContext.OnModelCreating`
- Apply to all `IHasTenant` entities via `HasQueryFilter(e => e.TenantId == _tenantContext.CurrentTenantId)`
- Test with sample data
3. **Repository Implementations**
- `TenantRepository` (9 methods)
- `UserRepository` (10 methods)
- Integration tests (20 tests)
4. **Database Configuration**
- Connection string management
- DbContext configuration
- EF Core migrations setup
#### Tasks Checklist
- [ ] Create `TenantContext.cs` in Infrastructure
- [ ] Inject TenantContext into ApplicationDbContext
- [ ] Configure Global Query Filter for all entities
- [ ] Implement TenantRepository with all 9 methods
- [ ] Implement UserRepository with all 10 methods
- [ ] Write integration tests for repositories
- [ ] Test Global Query Filter with sample data
- [ ] Verify cross-tenant isolation (Tenant A cannot see Tenant B's data)
- [ ] Update connection strings in appsettings.json
- [ ] Run all tests (Unit + Integration)
#### Acceptance Criteria
- [ ] TenantContext resolves tenant from the JWT claims on the current HttpContext
- [ ] Global Query Filter applied to all queries automatically
- [ ] Repositories pass all integration tests
- [ ] Cross-tenant queries return empty results (isolation verified)
- [ ] No database queries missing `WHERE tenant_id = ?`
- [ ] Performance: Queries complete in <50ms with proper indexes
#### Expected Files Created
```
Infrastructure/
  Services/
    TenantContext.cs (new)
  Persistence/
    ApplicationDbContext.cs (updated)
    Repositories/
      TenantRepository.cs (new)
      UserRepository.cs (new)
    Configuration/
      TenantEntityConfiguration.cs (new)
      UserEntityConfiguration.cs (new)
Tests/Integration/
  Repositories/
    TenantRepositoryTests.cs (new, 10 tests)
    UserRepositoryTests.cs (new, 10 tests)
  TenantContextTests.cs (new, 5 tests)
```
#### Dependencies
- Day 1 Domain layer complete
- PostgreSQL database running locally
#### Risks
- Global Query Filter might be bypassed accidentally (mitigation: code review)
- Performance issues if indexes not configured correctly (mitigation: EXPLAIN ANALYZE)
---
### Day 3: Application Layer + Tenant Registration API
**Date:** 2025-11-05
**Status:** PLANNED
**Owner:** Backend Engineer
**Estimated Time:** 4-5 hours
#### Deliverables
1. **Application Layer (Commands)**
- `RegisterTenantCommand` + Handler + Validator (FluentValidation)
- `UpdateSsoConfigCommand` + Handler + Validator
- `CreateUserCommand` + Handler + Validator
- `CreateUserFromSsoCommand` + Handler
2. **Application Layer (Queries)**
- `GetTenantBySlugQuery` + Handler
- `GetTenantByIdQuery` + Handler
- `CheckSlugAvailabilityQuery` + Handler
- `GetUserByEmailQuery` + Handler
3. **API Endpoints (TenantsController)**
- `POST /api/tenants` - Register new tenant
- `GET /api/tenants/{id}` - Get tenant details
- `POST /api/tenants/check-slug` - Check slug availability
- `PUT /api/tenants/{id}/sso-config` - Configure SSO
4. **Validation**
- FluentValidation for all commands
- Custom validators (slug format, email format)
- Error responses with validation details
#### Tasks Checklist
- [ ] Create Commands folder in Application layer
- [ ] Implement RegisterTenantCommand + Handler
- [ ] Implement UpdateSsoConfigCommand + Handler
- [ ] Implement CreateUserCommand + Handler
- [ ] Create Queries folder in Application layer
- [ ] Implement GetTenantBySlugQuery + Handler
- [ ] Implement CheckSlugAvailabilityQuery + Handler
- [ ] Create TenantsController in API project
- [ ] Implement all 4 tenant endpoints
- [ ] Add FluentValidation for all commands
- [ ] Write unit tests for commands/queries (20 tests)
- [ ] Write API integration tests (10 tests)
- [ ] Update API documentation (Scalar/Swagger)
- [ ] Test with Postman/Insomnia
#### Acceptance Criteria
- [ ] All commands execute successfully
- [ ] Validation errors return 400 Bad Request with details
- [ ] Slug availability check returns true/false correctly
- [ ] Tenant registration creates both tenant and admin user
- [ ] API returns proper HTTP status codes (200, 201, 400, 404)
- [ ] All unit tests pass
- [ ] All API integration tests pass
- [ ] API documentation updated
#### Expected Files Created
```
Application/
  Tenants/
    Commands/
      RegisterTenant/
        RegisterTenantCommand.cs (new)
        RegisterTenantCommandHandler.cs (new)
        RegisterTenantCommandValidator.cs (new)
      UpdateSsoConfig/
        UpdateSsoConfigCommand.cs (new)
        UpdateSsoConfigCommandHandler.cs (new)
    Queries/
      GetTenantBySlug/
        GetTenantBySlugQuery.cs (new)
        GetTenantBySlugQueryHandler.cs (new)
      CheckSlugAvailability/
        CheckSlugAvailabilityQuery.cs (new)
        CheckSlugAvailabilityQueryHandler.cs (new)
API/
  Controllers/
    TenantsController.cs (new)
```
#### Dependencies
- Day 2 Infrastructure layer complete
- TenantRepository and UserRepository working
#### Risks
- Slug validation regex might be too restrictive (mitigation: user testing)
- SSO config validation complex (mitigation: provider-specific validators)
---
### Day 4: Database Migration Preparation + Performance Testing
**Date:** 2025-11-06
**Status:** PLANNED
**Owner:** Backend Engineer + DBA
**Estimated Time:** 4-5 hours
#### Deliverables
1. **EF Core Migrations**
- Generate migration: `AddMultiTenancySupport`
- Review generated SQL scripts
- Test migration in local database
- Test rollback procedure
2. **Manual SQL Scripts**
- 01_create_tenants.sql
- 02_insert_default_tenant.sql
- 03_add_tenant_columns.sql
- 04_migrate_data.sql
- 05_validate_migration.sql
- 06_set_not_null.sql
- 07_create_indexes.sql
- 08_update_constraints.sql
- 09_add_sso_columns.sql
- 10_create_mcp_tables.sql
- 11_verify_performance.sql
3. **Performance Testing**
- Load sample data (100 tenants × 10,000 records)
- Run EXPLAIN ANALYZE on common queries
- Verify index usage (no sequential scans)
- Benchmark query performance (<50ms target)
- Stress test: 10,000 requests with tenant filtering
4. **Staging Environment Testing**
- Clone production-like database to staging
- Run full migration on staging
- Validate data integrity (0 rows with NULL tenant_id)
- Test rollback procedure
- Document any issues
#### Tasks Checklist
- [ ] Generate EF Core migration: `dotnet ef migrations add AddMultiTenancySupport`
- [ ] Review generated migration code
- [ ] Create manual SQL scripts (11 scripts)
- [ ] Test migration on local database
- [ ] Load sample data: 100 tenants, 10,000 records each
- [ ] Run EXPLAIN ANALYZE on:
  - `SELECT * FROM projects WHERE tenant_id = 'xxx'`
  - `SELECT * FROM issues WHERE tenant_id = 'xxx' AND status = 1`
  - `SELECT * FROM users WHERE tenant_id = 'xxx' AND email = 'test@acme.com'`
- [ ] Verify all queries use index scans (not sequential scans)
- [ ] Benchmark query performance: <50ms p95
- [ ] Stress test: 10,000 concurrent requests
- [ ] Clone production database to staging
- [ ] Run full migration on staging
- [ ] Run validation script: check for NULL tenant_id
- [ ] Test rollback: restore from backup
- [ ] Document migration steps
- [ ] Create runbook for production migration
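
The `<50ms p95` benchmark can be evaluated directly from raw timing samples. A minimal sketch, assuming the nearest-rank percentile definition and latencies recorded in milliseconds; `percentile` and `meetsLatencyTarget` are illustrative names, not part of the plan:

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) {
    throw new Error("percentile requires at least one sample");
  }
  if (p <= 0 || p > 100) {
    throw new Error("p must be in the range (0, 100]");
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  // Multiply before dividing to avoid floating-point drift in the rank.
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// True when the p95 latency is strictly below the target (50 ms by default).
function meetsLatencyTarget(samplesMs: number[], targetMs: number = 50): boolean {
  return percentile(samplesMs, 95) < targetMs;
}
```

Feeding this the per-query durations from a benchmark run gives a single pass/fail signal that can be recorded in `performance-benchmarks.md`.
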
#### Acceptance Criteria
- [ ] EF Core migration generated successfully
- [ ] All SQL scripts reviewed and approved
- [ ] Local migration completes without errors
- [ ] No NULL tenant_id values after migration
- [ ] Query performance meets target (<50ms p95)
- [ ] All queries use appropriate indexes
- [ ] Staging migration successful (no data loss)
- [ ] Rollback procedure tested and documented
- [ ] Runbook created for production deployment
#### Expected Files Created
```
Infrastructure/
  Persistence/
    Migrations/
      YYYYMMDDHHMMSS_AddMultiTenancySupport.cs (new)
migration_scripts/
  01_create_tenants.sql (new)
  02_insert_default_tenant.sql (new)
  03_add_tenant_columns.sql (new)
  04_migrate_data.sql (new)
  05_validate_migration.sql (new)
  06_set_not_null.sql (new)
  07_create_indexes.sql (new)
  08_update_constraints.sql (new)
  09_add_sso_columns.sql (new)
  10_create_mcp_tables.sql (new)
  11_verify_performance.sql (new)
  99_rollback.sql (new)
docs/
  deployment/
    migration-runbook.md (new)
    rollback-procedure.md (new)
    performance-benchmarks.md (new)
```
#### Dependencies
- Day 3 Application layer complete
- Staging environment available
- DBA review scheduled
#### Risks
- Migration scripts might fail on large databases (mitigation: test with production-sized data)
- Performance issues discovered late (mitigation: early performance testing on Day 4)
- Rollback procedure untested (mitigation: test rollback on staging)
---
### Day 5: Frontend Core Infrastructure + Authentication Pages
**Date:** 2025-11-07
**Status:** PLANNED
**Owner:** Frontend Engineer
**Estimated Time:** 8 hours
#### Deliverables
1. **Core Infrastructure (Morning - 3-4 hours)**
- API client with Axios interceptors
- Auth Store (Zustand)
- TypeScript types (auth.ts, api.ts, mcp.ts)
- Next.js middleware for route protection
2. **Authentication Pages (Afternoon - 4-5 hours)**
- Login page (local + SSO buttons)
- Signup page (3-step wizard)
- SSO callback page
- Suspended tenant page
#### Tasks Checklist - Morning
**API Client (`lib/api-client.ts`):**
- [ ] Create Axios instance with base URL
- [ ] Add request interceptor (inject Authorization header)
- [ ] Add response interceptor (handle 401, refresh token)
- [ ] Implement token refresh logic (queue concurrent requests)
- [ ] Add error handling (network errors, timeouts)
- [ ] Unit tests (5 tests)
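
Queueing concurrent requests during a token refresh means that when several requests receive a 401 at once, only one refresh call goes out and the others wait on its result. A minimal, library-free sketch of that single-flight pattern; `createSingleFlightRefresher` and the injected `refresh` callback are hypothetical names, and the wiring into an Axios response interceptor is deliberately left out:

```typescript
// Hypothetical helper: wraps a refresh call so that concurrent callers
// share one in-flight request instead of each triggering their own.
function createSingleFlightRefresher(
  refresh: () => Promise<string>,
): () => Promise<string> {
  let inFlight: Promise<string> | null = null;

  return function getFreshToken(): Promise<string> {
    if (inFlight === null) {
      const clear = (): void => {
        inFlight = null;
      };
      // Clear the slot on success and on failure so a later 401 can retry.
      inFlight = refresh().then(
        (token) => {
          clear();
          return token;
        },
        (error) => {
          clear();
          throw error;
        },
      );
    }
    return inFlight;
  };
}
```

In an interceptor, each failed request would await `getFreshToken()` and then retry with the returned token; if the promise rejects, the caller redirects to `/login`.
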
**Auth Store (`stores/useAuthStore.ts`):**
- [ ] Define AuthState interface (user, tenant, accessToken)
- [ ] Implement login action
- [ ] Implement logout action
- [ ] Implement updateToken action (for refresh)
- [ ] NO persistence for accessToken (security)
- [ ] Unit tests (8 tests)
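
The "no persistence for accessToken" rule comes down to choosing which slice of the state is ever written to storage. A minimal sketch, assuming illustrative `User` and `Tenant` shapes (the real fields are not specified in this plan); Zustand's `persist` middleware accepts a selector of this kind through its `partialize` option:

```typescript
// Assumed shapes, for illustration only.
interface User {
  id: string;
  email: string;
}
interface Tenant {
  id: string;
  slug: string;
}
interface AuthState {
  user: User | null;
  tenant: Tenant | null;
  accessToken: string | null; // kept in memory only
}

// Only this subset may be written to localStorage or sessionStorage.
type PersistedAuthState = Pick<AuthState, "user" | "tenant">;

function toPersistedState(state: AuthState): PersistedAuthState {
  // Build the object explicitly so new AuthState fields are not persisted by accident.
  return { user: state.user, tenant: state.tenant };
}
```
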
**TypeScript Types:**
- [ ] Create `types/auth.ts` (User, Tenant, LoginRequest, LoginResponse)
- [ ] Create `types/api.ts` (ApiResponse, ApiError)
- [ ] Create `types/mcp.ts` (McpToken, McpPermission, AuditLog)
- [ ] Export all types in `types/index.ts`
**Next.js Middleware (`middleware.ts` at the project root, alongside `app/`):**
- [ ] Protect routes requiring authentication
- [ ] Validate JWT token (jose library)
- [ ] Check tenant status (active/suspended)
- [ ] Redirect logic:
  - Unauthenticated + protected route → `/login?redirect=/original`
  - Authenticated + `/login` → `/dashboard`
  - Suspended tenant → `/suspended`
- [ ] Unit tests (6 tests)
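
The redirect rules are easiest to unit test when they live in a pure function that the middleware calls after validating the JWT. A minimal sketch under stated assumptions: the protected prefixes `/dashboard` and `/settings` are guesses at the route layout, the redirect target is URL-encoded, and `resolveRedirect` is a hypothetical name:

```typescript
type TenantStatus = "active" | "suspended";

interface RouteRequest {
  path: string; // e.g. "/dashboard/projects"
  isAuthenticated: boolean; // outcome of JWT validation, performed elsewhere
  tenantStatus?: TenantStatus;
}

// Assumption: everything under these prefixes requires authentication.
const PROTECTED_PREFIXES = ["/dashboard", "/settings"];

function isProtected(path: string): boolean {
  return PROTECTED_PREFIXES.some(
    (prefix) => path === prefix || path.startsWith(prefix + "/"),
  );
}

// Returns the redirect target, or null when the request may proceed.
function resolveRedirect(req: RouteRequest): string | null {
  if (!req.isAuthenticated) {
    return isProtected(req.path)
      ? `/login?redirect=${encodeURIComponent(req.path)}`
      : null;
  }
  if (req.tenantStatus === "suspended") {
    // Avoid a redirect loop once the user is already on the suspended page.
    return req.path === "/suspended" ? null : "/suspended";
  }
  if (req.path === "/login") {
    return "/dashboard";
  }
  return null;
}
```

Keeping the decision separate from the Next.js request and response objects lets the six planned unit tests run without a framework harness.
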
#### Tasks Checklist - Afternoon
**Login Page (`app/(auth)/login/page.tsx`):**
- [ ] Create login form (email, password)
- [ ] Add SSO buttons (Azure AD, Google, Okta)
- [ ] "Remember me" checkbox
- [ ] "Forgot password" link
- [ ] Form validation with Zod
- [ ] Loading states
- [ ] Error handling (toast notifications)
- [ ] Responsive design (mobile-friendly)
- [ ] Integration with Auth Store
- [ ] Unit tests (10 tests)
**Signup Page (`app/(auth)/signup/page.tsx`):**
- [ ] Step 1: Company info (name, slug)
- [ ] Step 2: Admin account (name, email, password)
- [ ] Step 3: Plan selection
- [ ] Progress indicator (steps)
- [ ] Real-time slug validation (debounced 500ms)
- [ ] Password strength indicator
- [ ] Form validation with Zod
- [ ] Submit registration
- [ ] Success animation + redirect
- [ ] Unit tests (12 tests)
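
Before the debounced availability request is sent, the slug can be checked locally so that obviously invalid input never reaches the backend. A minimal sketch; the slug rules here (3 to 50 characters, lowercase letters, digits, single hyphens between segments) are assumptions, since this plan does not define the format, and both function names are illustrative:

```typescript
// Assumed slug rules (not specified in the plan): 3-50 characters,
// lowercase letters and digits, single hyphens only between segments.
const SLUG_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

function isValidSlugFormat(slug: string): boolean {
  return slug.length >= 3 && slug.length <= 50 && SLUG_PATTERN.test(slug);
}

// Derive a candidate slug from the company name entered in Step 1.
function suggestSlug(companyName: string): string {
  return companyName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse every other run of characters into one hyphen
    .replace(/^-+|-+$/g, "") // trim leading and trailing hyphens
    .slice(0, 50)
    .replace(/-+$/g, ""); // slicing may leave a trailing hyphen
}
```

The same pattern should be mirrored in the backend validator so the frontend never reports a slug as valid that the API would reject.
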
**SSO Callback Page (`app/(auth)/auth/callback/page.tsx`):**
- [ ] Parse URL parameters (?token=xxx&state=yyy)
- [ ] Validate state parameter (CSRF protection)
- [ ] Store token in Auth Store
- [ ] Redirect to original page or dashboard
- [ ] Error handling (SSO failed)
- [ ] Loading state
- [ ] Unit tests (5 tests)
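
The callback checks above can be expressed as one function that parses the query string and compares the returned `state` with the value stored before the redirect. A minimal sketch with no browser-only APIs; the `error` query parameter and the outcome names are assumptions, not part of the documented callback contract:

```typescript
interface CallbackParams {
  token: string | null;
  state: string | null;
  error: string | null; // assumed parameter for failed SSO attempts
}

// Parse a query string such as "?token=xxx&state=yyy".
function parseCallbackParams(search: string): CallbackParams {
  const result: CallbackParams = { token: null, state: null, error: null };
  const query = search.startsWith("?") ? search.slice(1) : search;
  for (const pair of query.split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    const key = decodeURIComponent(eq === -1 ? pair : pair.slice(0, eq));
    const value =
      eq === -1 ? "" : decodeURIComponent(pair.slice(eq + 1).replace(/\+/g, " "));
    if (key === "token" || key === "state" || key === "error") {
      result[key] = value;
    }
  }
  return result;
}

type CallbackOutcome =
  | { ok: true; token: string }
  | { ok: false; reason: "sso_error" | "state_mismatch" | "missing_token" };

// expectedState is the value generated and stored before redirecting to the IdP.
function validateCallback(search: string, expectedState: string | null): CallbackOutcome {
  const { token, state, error } = parseCallbackParams(search);
  if (error !== null) return { ok: false, reason: "sso_error" };
  if (expectedState === null || state !== expectedState) {
    return { ok: false, reason: "state_mismatch" };
  }
  if (token === null || token === "") return { ok: false, reason: "missing_token" };
  return { ok: true, token };
}
```

Only the `ok: true` branch should write to the Auth Store; every other outcome maps to the error state of the page.
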
#### Acceptance Criteria
**Morning (Infrastructure):**
- [ ] API client automatically injects Authorization header
- [ ] 401 errors trigger automatic token refresh
- [ ] Token refresh only happens once for concurrent requests
- [ ] Failed refresh redirects to `/login`
- [ ] Auth Store persists user info (but not token)
- [ ] Middleware protects all dashboard routes
- [ ] JWT validation works correctly
**Afternoon (Pages):**
- [ ] Login works with local credentials
- [ ] SSO buttons redirect to backend SSO endpoints
- [ ] Signup wizard navigates through 3 steps
- [ ] Slug validation shows "Available" or "Taken" in real-time
- [ ] Password strength indicator works (weak/medium/strong)
- [ ] SSO callback handles success and error cases
- [ ] All forms have proper validation and error messages
- [ ] Mobile responsive (tested on 375px width)
#### Expected Files Created
```
lib/
  api-client.ts (new)
  query-client.ts (new)
  utils.ts (new)
  validations.ts (new)
stores/
  useAuthStore.ts (new)
  useUiStore.ts (new)
types/
  auth.ts (new)
  api.ts (new)
  mcp.ts (new)
  index.ts (new)
middleware.ts (new)
app/
  (auth)/
    login/page.tsx (new)
    signup/page.tsx (new)
    auth/callback/page.tsx (new)
    suspended/page.tsx (new)
components/
  auth/
    SsoButton.tsx (new)
    TenantSlugInput.tsx (new)
    PasswordStrengthIndicator.tsx (new)
hooks/
  auth/
    useLogin.ts (new)
    useSignup.ts (new)
    useLoginWithSso.ts (new)
services/
  auth.service.ts (new)
__tests__/
  lib/api-client.test.ts (new)
  stores/useAuthStore.test.ts (new)
  components/auth/SsoButton.test.tsx (new)
  pages/login.test.tsx (new)
  pages/signup.test.tsx (new)
```
#### Dependencies
- Backend APIs ready: `/api/auth/login`, `/api/tenants/check-slug`, `/api/tenants` (POST)
- Design specs complete (from docs/design/)
- shadcn/ui components installed
#### Risks
- Token refresh logic complex (mitigation: test extensively)
- SSO redirect flow confusing (mitigation: clear loading states)
- Slug validation debouncing issues (mitigation: use TanStack Query with proper config)
---
### Day 6: Frontend Settings + SSO Configuration
**Date:** 2025-11-08
**Status:** PLANNED
**Owner:** Frontend Engineer
**Estimated Time:** 6-7 hours
#### Deliverables
1. **Organization Settings Page**
- Tabs: General, SSO, Billing, Usage
- SSO tab with configuration form
- Dynamic form fields based on provider
- Test connection button
- Save configuration
2. **Components**
- SsoConfigForm (dynamic fields)
- ProviderSpecificFields (OIDC vs SAML)
- TestConnectionButton
- AllowedDomainsInput
#### Tasks Checklist
**Settings Layout (`app/(dashboard)/settings/layout.tsx`):**
- [ ] Create settings sidebar navigation
- [ ] Tabs: General, SSO, Billing, Usage
- [ ] Active tab highlighting
- [ ] Responsive layout
**SSO Configuration Page (`app/(dashboard)/settings/organization/page.tsx`):**
- [ ] Provider selection dropdown (Azure AD, Google, Okta, SAML)
- [ ] Dynamic form fields based on provider:
  - OIDC: Authority URL, Client ID, Client Secret, Metadata URL
  - SAML: Entity ID, SSO URL, X.509 Certificate, Metadata URL
- [ ] Auto-provision users toggle
- [ ] Allowed email domains input (TagInput)
- [ ] Callback URL display (read-only, copy button)
- [ ] Test connection button (with loading state)
- [ ] Save configuration button
- [ ] Success/error toast notifications
- [ ] Form validation with Zod
- [ ] Unit tests (15 tests)
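
The provider-dependent fields can be driven from a single lookup table, so rendering and validation stay in sync. A minimal sketch under stated assumptions: Azure AD, Google, and Okta are treated as OIDC providers, Metadata URL is treated as optional, and the property names are illustrative rather than the backend's actual contract:

```typescript
type Provider = "AzureAD" | "Google" | "Okta" | "SAML";
type Protocol = "oidc" | "saml";

// Field lists mirror the form fields in the checklist; which ones are
// required is an assumption.
const SSO_FIELDS: Record<Protocol, { required: string[]; optional: string[] }> = {
  oidc: {
    required: ["authorityUrl", "clientId", "clientSecret"],
    optional: ["metadataUrl"],
  },
  saml: {
    required: ["entityId", "ssoUrl", "x509Certificate"],
    optional: ["metadataUrl"],
  },
};

function protocolFor(provider: Provider): Protocol {
  return provider === "SAML" ? "saml" : "oidc";
}

// Fields to render for the selected provider, in display order.
function fieldsFor(provider: Provider): string[] {
  const fields = SSO_FIELDS[protocolFor(provider)];
  return [...fields.required, ...fields.optional];
}

// Names of required fields that are absent or blank in the submitted values.
function missingRequiredFields(
  provider: Provider,
  values: Record<string, string | undefined>,
): string[] {
  return SSO_FIELDS[protocolFor(provider)].required.filter((name) => {
    const value = values[name];
    return value === undefined || value.trim() === "";
  });
}
```

With Zod, the same table would translate into one schema per protocol combined in a discriminated union.
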
**Components:**
**SsoConfigForm (`components/settings/SsoConfigForm.tsx`):**
- [ ] Provider selection logic
- [ ] Conditional field rendering (OIDC vs SAML)
- [ ] Form state management (React Hook Form)
- [ ] Validation
- [ ] Submit handler
- [ ] Unit tests (10 tests)
**AllowedDomainsInput (`components/settings/AllowedDomainsInput.tsx`):**
- [ ] Tag input for domains
- [ ] Add/remove domain
- [ ] Domain validation (@domain.com format)
- [ ] Unit tests (5 tests)
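
Domain entries need to be normalized before they are compared, otherwise `@Acme.com` and `acme.com` end up as two separate tags. A minimal sketch; the validation pattern is an assumption (dot-separated labels ending in an alphabetic TLD) and both function names are illustrative:

```typescript
// Assumed rule: two or more dot-separated labels of letters, digits, and
// inner hyphens, ending in an alphabetic TLD of at least two characters.
const DOMAIN_PATTERN = /^(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}$/;

// Accepts "acme.com" or "@acme.com"; returns the bare lowercase domain, or null.
function normalizeDomain(input: string): string | null {
  const domain = input.trim().toLowerCase().replace(/^@/, "");
  return DOMAIN_PATTERN.test(domain) ? domain : null;
}

// Add a domain to the tag list, ignoring invalid input and duplicates.
function addDomain(domains: string[], input: string): string[] {
  const domain = normalizeDomain(input);
  if (domain === null || domains.indexOf(domain) !== -1) {
    return domains;
  }
  return [...domains, domain];
}
```
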
**TestConnectionButton (`components/settings/TestConnectionButton.tsx`):**
- [ ] Call `/api/tenants/{id}/sso-config/test` endpoint
- [ ] Loading state (spinner)
- [ ] Success modal with details (metadata reachable, certificate valid, etc.)
- [ ] Error modal with troubleshooting steps
- [ ] Unit tests (5 tests)
#### Acceptance Criteria
- [ ] Provider selection changes form fields dynamically
- [ ] All form fields validate correctly
- [ ] Test connection shows success/error with details
- [ ] Save configuration updates tenant SSO config
- [ ] Callback URL is displayed and copyable
- [ ] Allowed domains can be added/removed
- [ ] Auto-provision toggle works
- [ ] Form handles errors gracefully (backend errors, network errors)
- [ ] Only Admin users can access (permission check)
- [ ] All unit tests pass
#### Expected Files Created
```
app/
  (dashboard)/
    settings/
      layout.tsx (new)
      organization/page.tsx (new)
      sso/page.tsx (new)
components/
  settings/
    SsoConfigForm.tsx (new)
    AllowedDomainsInput.tsx (new)
    TestConnectionButton.tsx (new)
    SsoProviderLogo.tsx (new)
hooks/
  tenants/
    useSsoConfig.ts (new)
    useTestSsoConnection.ts (new)
    useUpdateSsoConfig.ts (new)
services/
  tenant.service.ts (new)
__tests__/
  components/settings/SsoConfigForm.test.tsx (new)
  pages/settings/organization.test.tsx (new)
```
#### Dependencies
- Backend APIs ready: GET/PUT `/api/tenants/{id}/sso-config`, POST `/api/tenants/{id}/sso-config/test`
- shadcn/ui components: Select, Tabs, Form, Alert
- Day 5 Auth infrastructure complete
#### Risks
- SSO form complexity (many fields, validation) (mitigation: provider-specific Zod schemas)
- Test connection endpoint slow (mitigation: show loading state, timeout after 10s)
---
### Day 7: Frontend MCP Token Management
**Date:** 2025-11-09
**Status:** PLANNED
**Owner:** Frontend Engineer
**Estimated Time:** 7-8 hours
#### Deliverables
1. **MCP Tokens List Page**
- Token list table
- Generate token button
- Revoke token action
- Filter by status (Active/Revoked/Expired)
2. **Create Token Dialog**
- 3-step wizard (Token info, Permissions, Review)
- Permission matrix (resource × operations)
- Token display (one-time only)
3. **Token Details Page**
- Token metadata
- Usage statistics
- Audit log table
- Revoke button
#### Tasks Checklist
**MCP Tokens List Page (`app/(dashboard)/settings/mcp-tokens/page.tsx`):**
- [ ] Token list table (@tanstack/react-table)
- [ ] Columns: Name, Permissions, Last Used, Expires, Status, Actions
- [ ] Generate token button (opens dialog)
- [ ] Revoke button for each token
- [ ] Filter dropdown (All, Active, Revoked, Expired)
- [ ] Empty state (no tokens)
- [ ] Loading skeleton
- [ ] Unit tests (8 tests)
**Create Token Dialog (`components/mcp/CreateTokenDialog.tsx`):**
- [ ] 3-step wizard UI
- [ ] Step 1: Token info (name, description, expiration)
- [ ] Step 2: Permissions (permission matrix)
- [ ] Step 3: Review & create
- [ ] Token display modal (one-time, with warning)
- [ ] Copy to clipboard button
- [ ] Download as .env file button
- [ ] "I've saved the token" checkbox
- [ ] Form validation
- [ ] Unit tests (12 tests)
**Permission Matrix (`components/mcp/McpPermissionMatrix.tsx`):**
- [ ] Checkbox grid (resources × operations)
- [ ] Resources: Projects, Issues, Documents, Reports, Sprints, Comments
- [ ] Operations: Read, Create, Update, Delete, Search
- [ ] "Select all" shortcuts (per resource, per operation)
- [ ] Permission templates (Read Only, Read+Write, Custom)
- [ ] Unit tests (5 tests)
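
The checkbox grid maps naturally onto a record of resources to granted operations, with every update returning a new object so React state changes are detected. A minimal sketch; the contents of the "Read Only" and "Read+Write" templates are assumptions, and the function names are illustrative:

```typescript
const RESOURCES = ["Projects", "Issues", "Documents", "Reports", "Sprints", "Comments"] as const;
const OPERATIONS = ["Read", "Create", "Update", "Delete", "Search"] as const;
type Resource = typeof RESOURCES[number];
type Operation = typeof OPERATIONS[number];
type PermissionMatrix = Record<Resource, Operation[]>;

function emptyMatrix(): PermissionMatrix {
  const matrix = {} as PermissionMatrix;
  for (const resource of RESOURCES) matrix[resource] = [];
  return matrix;
}

// Flip a single checkbox without mutating the previous state.
function toggle(m: PermissionMatrix, resource: Resource, op: Operation): PermissionMatrix {
  const granted = m[resource].indexOf(op) !== -1;
  const next = granted ? m[resource].filter((o) => o !== op) : [...m[resource], op];
  return { ...m, [resource]: next };
}

// "Select all" shortcut for one row.
function selectAllForResource(m: PermissionMatrix, resource: Resource): PermissionMatrix {
  return { ...m, [resource]: [...OPERATIONS] };
}

// "Select all" shortcut for one column.
function selectAllForOperation(m: PermissionMatrix, op: Operation): PermissionMatrix {
  const next = emptyMatrix();
  for (const resource of RESOURCES) {
    next[resource] = m[resource].indexOf(op) !== -1 ? m[resource] : [...m[resource], op];
  }
  return next;
}

// Assumption: "Read Only" grants Read and Search; "Read+Write" adds Create and Update.
function fromTemplate(template: "ReadOnly" | "ReadWrite"): PermissionMatrix {
  const ops: Operation[] =
    template === "ReadOnly" ? ["Read", "Search"] : ["Read", "Search", "Create", "Update"];
  const matrix = emptyMatrix();
  for (const resource of RESOURCES) matrix[resource] = [...ops];
  return matrix;
}
```
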
**Token Display (`components/mcp/TokenDisplay.tsx`):**
- [ ] Display generated token (once only)
- [ ] Copy button (with success toast)
- [ ] Download button (generates .env file)
- [ ] Warning message ("You won't see this again")
- [ ] Checkbox: "I've saved this token"
- [ ] Close button (enabled only after checkbox)
- [ ] Unit tests (5 tests)
**Token Details Page (`app/(dashboard)/settings/mcp-tokens/[id]/page.tsx`):**
- [ ] Token metadata (name, created, expires, status)
- [ ] Usage statistics (total calls, last used)
- [ ] Activity chart (last 7 days)
- [ ] Audit log table (timestamp, action, resource, result)
- [ ] Pagination for audit logs
- [ ] Revoke button (with confirmation dialog)
- [ ] Unit tests (8 tests)
**Audit Log Table (`components/mcp/AuditLogTable.tsx`):**
- [ ] Table with columns: Timestamp, HTTP Method, Endpoint, Status Code, Duration, IP Address
- [ ] Pagination (server-side)
- [ ] Date range filter
- [ ] Status code filter (200, 401, 403, 500)
- [ ] Export to CSV button
- [ ] Unit tests (5 tests)
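
The CSV export needs to quote fields that contain commas, quotes, or line breaks, because endpoints with query strings routinely include them. A minimal sketch following RFC 4180 quoting; the `AuditLogEntry` property names are assumptions derived from the column list, not the API's actual response shape:

```typescript
// Assumed entry shape, derived from the table columns.
interface AuditLogEntry {
  timestamp: string; // ISO 8601
  method: string;
  endpoint: string;
  statusCode: number;
  durationMs: number;
  ipAddress: string;
}

// Quote a field when it contains a comma, a double quote, or a line break.
function escapeCsvField(value: string | number): string {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function auditLogsToCsv(entries: AuditLogEntry[]): string {
  const header: Array<string | number> = [
    "Timestamp",
    "HTTP Method",
    "Endpoint",
    "Status Code",
    "Duration (ms)",
    "IP Address",
  ];
  const rows: Array<Array<string | number>> = entries.map((e) => [
    e.timestamp,
    e.method,
    e.endpoint,
    e.statusCode,
    e.durationMs,
    e.ipAddress,
  ]);
  return [header, ...rows]
    .map((row) => row.map(escapeCsvField).join(","))
    .join("\r\n");
}
```

Because pagination is server-side, this only covers rows already loaded in the browser; exporting the full log would need a dedicated backend endpoint.
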
#### Acceptance Criteria
- [ ] Token list loads and displays all tokens
- [ ] Generate token wizard works (3 steps)
- [ ] Permission matrix selects/deselects correctly
- [ ] Generated token displayed only once
- [ ] Copy and download buttons work
- [ ] Token details page shows metadata and audit logs
- [ ] Revoke confirmation dialog works
- [ ] Revoked tokens marked as "Revoked" (red badge)
- [ ] All unit tests pass
- [ ] Mobile responsive
#### Expected Files Created
```
app/
  (dashboard)/
    settings/
      mcp-tokens/
        page.tsx (new)
        [id]/page.tsx (new)
components/
  mcp/
    CreateTokenDialog.tsx (new)
    McpPermissionMatrix.tsx (new)
    TokenDisplay.tsx (new)
    AuditLogTable.tsx (new)
    RevokeTokenDialog.tsx (new)
hooks/
  mcp/
    useMcpTokens.ts (new)
    useCreateMcpToken.ts (new)
    useRevokeMcpToken.ts (new)
    useMcpAuditLogs.ts (new)
services/
  mcp.service.ts (new)
__tests__/
  components/mcp/CreateTokenDialog.test.tsx (new)
  components/mcp/McpPermissionMatrix.test.tsx (new)
  pages/mcp-tokens.test.tsx (new)
```
#### Dependencies
- Backend APIs ready: GET/POST `/api/mcp-tokens`, DELETE `/api/mcp-tokens/{id}`, GET `/api/mcp-tokens/{id}/audit-logs`
- shadcn/ui components: Dialog, Table, Checkbox
- Day 6 Settings infrastructure complete
#### Risks
- Permission matrix UI complex (many checkboxes) (mitigation: clear visual grouping, templates)
- Token display security (prevent screenshots) (mitigation: warning messages only, no technical prevention)
- Audit log pagination performance (mitigation: server-side pagination, limit to 50 per page)
---
### Day 8: Integration Testing + Security Testing
**Date:** 2025-11-10
**Status:** PLANNED
**Owner:** QA Engineer + All Teams
**Estimated Time:** 8 hours
#### Deliverables
1. **End-to-End Tests**
- Registration → Login → Dashboard flow
- SSO login flow (mocked IdP)
- MCP token creation → Usage → Revocation
- Cross-tenant isolation tests
2. **Security Tests**
- XSS protection (tokens in memory)
- CSRF protection (SameSite cookies)
- SQL injection (parameterized queries)
- Authorization (tenant isolation)
3. **Performance Tests**
- API response times (<100ms)
- Frontend render times (<16ms)
- Database query performance (<50ms)
- Load test: 1,000 concurrent users
4. **Bug Fixes**
- Address all issues found during testing
- Re-test after fixes
#### Tasks Checklist
**E2E Tests (Playwright):**
- [ ] Test: New tenant registration (3-step wizard)
- [ ] Test: Local login → Dashboard
- [ ] Test: SSO login (mocked IdP) → Dashboard
- [ ] Test: Configure SSO (admin user)
- [ ] Test: Generate MCP token (3-step wizard)
- [ ] Test: Copy token, download .env file
- [ ] Test: Use MCP token to call API (success)
- [ ] Test: Revoke token, verify API call fails (401)
- [ ] Test: Cross-tenant isolation (Tenant A cannot access Tenant B's data)
- [ ] Test: Logout → Clear auth state
**Security Tests:**
- [ ] Test: Access token not in localStorage/sessionStorage/cookies
- [ ] Test: Refresh token in httpOnly cookie
- [ ] Test: XSS attack simulation (inject script, cannot steal token)
- [ ] Test: CSRF attack simulation (forged request, rejected)
- [ ] Test: SQL injection attempt (parameterized queries protect)
- [ ] Test: Attempt to access other tenant's data (403 Forbidden)
- [ ] Test: Attempt to use revoked MCP token (401 Unauthorized)
- [ ] Test: JWT signature validation (tampered token rejected)
**Performance Tests:**
- [ ] Test: API response time <100ms (p95)
- [ ] Test: Frontend render time <16ms (60fps)
- [ ] Test: Database query time <50ms (p95)
- [ ] Test: Token validation <10ms
- [ ] Load test: 1,000 concurrent users
- [ ] Load test: 10,000 API requests with tenant filtering
**Integration Tests:**
- [ ] Test: Frontend calls backend APIs successfully
- [ ] Test: Error handling (network errors, 500 errors)
- [ ] Test: Token refresh flow (401 → refresh → retry)
- [ ] Test: API client retries failed requests
- [ ] Test: TanStack Query cache invalidation
**Bug Tracking:**
- [ ] Log all bugs in issue tracker
- [ ] Prioritize: P0 (blocker), P1 (critical), P2 (major), P3 (minor)
- [ ] Assign to developers
- [ ] Re-test after fixes
- [ ] Sign-off when all P0/P1 bugs fixed
#### Acceptance Criteria
- [ ] All E2E tests pass (10/10)
- [ ] All security tests pass (8/8)
- [ ] Performance tests meet targets
- [ ] All P0 bugs fixed
- [ ] All P1 bugs fixed or scheduled for M1.3
- [ ] No regressions in existing features
- [ ] Test reports generated and reviewed
#### Expected Deliverables
```
tests/
  e2e/
    registration.spec.ts (new)
    login.spec.ts (new)
    sso.spec.ts (new)
    mcp-tokens.spec.ts (new)
    tenant-isolation.spec.ts (new)
  security/
    xss.spec.ts (new)
    csrf.spec.ts (new)
    authorization.spec.ts (new)
  performance/
    api-benchmarks.spec.ts (new)
    load-test.spec.ts (new)
reports/
  2025-11-10-Testing-Report.md (new)
  bug-tracker.csv (new)
```
#### Dependencies
- Day 7 Frontend complete
- Day 4 Backend complete
- Test environment ready (staging)
#### Risks
- Bugs discovered late (mitigation: prioritize fixes, may extend to Day 9 if critical)
- Performance issues (mitigation: profiling, optimization)
- SSO mocking complex (mitigation: use MSW with realistic responses)
---
### Day 9: Database Migration + Production Deployment
**Date:** 2025-11-11
**Status:** PLANNED
**Owner:** DBA + Backend Engineer + DevOps
**Estimated Time:** 6-8 hours (includes 30-60 min downtime)
#### Deliverables
1. **Pre-Migration**
- Full database backup
- Verify backup integrity
- Copy backup to S3
- Enable maintenance mode
2. **Database Migration**
- Run 11 SQL migration scripts
- Validate data integrity
- Verify performance (indexes working)
3. **Code Deployment**
- Deploy backend API (updated code)
- Deploy frontend (updated code)
- Restart services
4. **Smoke Tests**
- Login test
- Create project test
- API health check
- Frontend loads correctly
#### Tasks Checklist
**Pre-Migration (30 minutes):**
- [ ] Schedule maintenance window (2 hours, off-peak)
- [ ] Notify all users 24 hours in advance
- [ ] Enable maintenance mode (503 page)
- [ ] Full database backup: `pg_dump -Fc colaflow > backup.dump`
- [ ] Verify backup: `pg_restore --list backup.dump | head -20`
- [ ] Copy backup to S3: `aws s3 cp backup.dump s3://backups/pre-migration/`
- [ ] Team on standby (Backend, Frontend, DevOps)
**Database Migration (30-60 minutes):**
- [ ] Run script 01: `psql -f 01_create_tenants.sql`
- [ ] Run script 02: `psql -f 02_insert_default_tenant.sql`
- [ ] Run script 03: `psql -f 03_add_tenant_columns.sql`
- [ ] Run script 04: `psql -f 04_migrate_data.sql`
- [ ] Run script 05: `psql -f 05_validate_migration.sql` (check for errors)
- [ ] Run script 06: `psql -f 06_set_not_null.sql`
- [ ] Run script 07: `psql -f 07_create_indexes.sql`
- [ ] Run script 08: `psql -f 08_update_constraints.sql`
- [ ] Run script 09: `psql -f 09_add_sso_columns.sql`
- [ ] Run script 10: `psql -f 10_create_mcp_tables.sql`
- [ ] Run validation: Check for NULL tenant_id (should be 0 rows)
- [ ] Verify indexes: EXPLAIN ANALYZE sample queries
**Code Deployment (20 minutes):**
- [ ] Deploy backend: `dotnet publish -c Release -o /var/www/api`
- [ ] Restart backend: `systemctl restart colaflow-api`
- [ ] Verify backend: `curl https://api.colaflow.com/health`
- [ ] Deploy frontend: `npm run build && rsync -avz out/ /var/www/web/`
- [ ] Restart frontend: `systemctl restart colaflow-web`
- [ ] Verify frontend: `curl https://colaflow.com`
**Smoke Tests (20 minutes):**
- [ ] Test 1: Login with existing user (default tenant)
- [ ] Test 2: View projects list (default tenant)
- [ ] Test 3: Create new project
- [ ] Test 4: View issue board
- [ ] Test 5: API health check returns 200
- [ ] Test 6: Frontend loads without errors (check browser console)
- [ ] Test 7: Database query performance <50ms
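The health-check and frontend-load tests can be partly automated, since services may need a few seconds after a restart before they answer. The sketch below is one possible approach under stated assumptions: the retry count and delay are arbitrary defaults, and `http_status`, `wait_for_200`, and the `CURL` variable are hypothetical names, not part of the project.

```shell
#!/bin/sh
# Sketch only: poll an endpoint until it returns HTTP 200 or the attempts run out.
# Assumptions (not from the plan): 5 attempts and a 3-second delay as defaults.
set -eu

# Override CURL with a stub command to exercise the logic without network access.
CURL="${CURL:-curl}"

# Print the HTTP status code for a URL (curl reports "000" if the request itself fails).
http_status() {
  "$CURL" -s -o /dev/null -w '%{http_code}' --max-time 10 "$1" || true
}

wait_for_200() {
  url=$1
  attempts=${2:-5}
  delay=${3:-3}
  i=1
  while [ "$i" -le "$attempts" ]; do
    code=$(http_status "$url")
    if [ "$code" = "200" ]; then
      echo "OK   $url"
      return 0
    fi
    echo "WAIT $url (attempt $i/$attempts, status $code)"
    i=$((i + 1))
    sleep "$delay"
  done
  echo "FAIL $url"
  return 1
}
```

With the URLs from the deployment checklist, this would be run as `wait_for_200 https://api.colaflow.com/health && wait_for_200 https://colaflow.com`. It covers only the status-code checks; login, project creation, and browser-console checks still need a manual or browser-driven test.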
**Post-Deployment (10 minutes):**
- [ ] Disable maintenance mode
- [ ] Monitor error logs (30 minutes)
- [ ] Monitor database performance (pg_stat_statements)
- [ ] Monitor API response times (APM dashboard)
- [ ] Notify users: Maintenance complete
- [ ] Post-deployment retrospective
#### Acceptance Criteria
- [ ] Migration completes successfully (no errors)
- [ ] No data loss (validation script passes)
- [ ] All smoke tests pass
- [ ] API response times normal (<100ms p95)
- [ ] No 500 errors in first hour
- [ ] Error rate <1% in first 24 hours
#### Rollback Plan
**If migration fails:**
- [ ] Stop application (maintenance mode)
- [ ] Drop migrated database: `DROP DATABASE colaflow;`
- [ ] Restore from backup: `pg_restore -C -d postgres backup.dump`
- [ ] Deploy previous code version
- [ ] Restart services
- [ ] Verify rollback successful
- [ ] Notify stakeholders
- [ ] Schedule post-mortem
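Because the rollback drops the live database, it is worth refusing to proceed unless the backup file is present and readable. The following is a hedged sketch of the database steps only, assuming the backup was taken with `pg_dump -Fc` as in the pre-migration checklist. `rollback_database` and the `RUN` dry-run variable are hypothetical names introduced here.

```shell
#!/bin/sh
# Sketch only: guarded version of the database rollback steps.
# Assumptions (not from the plan): the caller is allowed to drop and create databases,
# and connection details come from the standard PG* environment variables.
set -eu

# Set RUN=echo to print the commands instead of executing them (dry run).
RUN="${RUN:-}"

rollback_database() {
  dump=$1
  db=${2:-colaflow}
  # Refuse to drop anything unless the backup file exists and is non-empty.
  if [ ! -s "$dump" ]; then
    echo "backup file missing or empty: $dump" >&2
    return 1
  fi
  # Listing the archive contents confirms pg_restore can read the dump.
  $RUN pg_restore --list "$dump" >/dev/null || return 1
  # DROP DATABASE fails while other sessions are connected,
  # so the application must already be stopped at this point.
  $RUN psql -d postgres -v ON_ERROR_STOP=1 -c "DROP DATABASE IF EXISTS $db;" || return 1
  # -C recreates the database recorded in the dump before restoring into it.
  $RUN pg_restore -C -d postgres "$dump" || return 1
}
```

Running `RUN=echo` first, then `rollback_database backup.dump`, prints the two destructive commands for review before anything is dropped.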
#### Dependencies
- Day 8 testing complete (all P0 bugs fixed)
- Migration scripts reviewed and approved
- Backup strategy verified
- Rollback procedure tested
#### Risks
- Data loss (mitigation: full backup, tested rollback)
- Extended downtime (mitigation: pre-test migration, have rollback ready)
- Performance issues (mitigation: indexes tested on Day 4)
---
### Day 10: Production Validation + Monitoring
**Date:** 2025-11-13
**Status:** PLANNED
**Owner:** All Teams
**Estimated Time:** 4-8 hours (continuous monitoring)
#### Deliverables
1. **Monitoring Setup**
- Application error tracking (Sentry)
- Performance monitoring (APM)
- Database monitoring (pg_stat_statements)
- User analytics (PostHog)
2. **User Acceptance Testing**
- Internal team testing
- Beta user testing (if available)
- Stakeholder demo
3. **Documentation**
- Update README
- API documentation (Scalar)
- User guides (SSO setup, MCP tokens)
- Admin guides (tenant management)
4. **Post-Deployment Review**
- Retrospective meeting
- Lessons learned
- Plan for M1.3
#### Tasks Checklist
**Monitoring (Morning):**
- [ ] Configure error tracking (Sentry):
- Backend errors (500, exceptions)
- Frontend errors (React errors, network failures)
- Error rate alerts (>5% triggers)
- [ ] Configure performance monitoring:
- API response times (p50, p95, p99)
- Database query times
- Slow query alerts (>100ms)
- [ ] Configure database monitoring:
- pg_stat_statements enabled
- Slow query log analysis
- Index usage monitoring
- [ ] Configure user analytics:
- Page views
- User flows (registration, login, SSO)
- Feature usage (MCP tokens created)
**Validation Testing (Afternoon):**
- [ ] Internal team testing:
- All team members register new tenants
- Test SSO configuration (Azure AD, Google)
- Generate MCP tokens
- Report any issues
- [ ] Beta user testing (if available):
- Invite 5-10 beta users
- Monitor their usage
- Collect feedback
- [ ] Stakeholder demo:
- Present new features to leadership
- Demo multi-tenancy, SSO, MCP tokens
- Discuss roadmap
**Documentation:**
- [ ] Update README.md:
- Multi-tenant architecture overview
- Environment variables
- Deployment instructions
- [ ] Update API documentation (Scalar):
- All new endpoints documented
- Request/response examples
- Authentication instructions
- [ ] Create user guides:
- "How to configure SSO" (with screenshots)
- "How to generate MCP tokens"
- "How to integrate with Claude/ChatGPT"
- [ ] Create admin guides:
- "How to manage tenants"
- "How to troubleshoot SSO issues"
- "How to monitor token usage"
**Post-Deployment Review:**
- [ ] Schedule retrospective meeting (all teams)
- [ ] Discuss: What went well?
- [ ] Discuss: What could be improved?
- [ ] Discuss: What should we do differently next time?
- [ ] Document lessons learned
- [ ] Plan M1.3 features and timeline
#### Acceptance Criteria
- [ ] Error rate <1% in first 24 hours
- [ ] API response times normal (<100ms p95)
- [ ] No critical bugs reported
- [ ] Internal team successfully uses all features
- [ ] Stakeholder demo successful
- [ ] All documentation updated
- [ ] Monitoring and alerts configured
- [ ] Retrospective completed
#### Expected Deliverables
```
docs/
  user-guides/
    sso-setup.md (new)
    mcp-token-generation.md (new)
    claude-integration.md (new)
  admin-guides/
    tenant-management.md (new)
    troubleshooting-sso.md (new)
    monitoring.md (new)
reports/
  2025-11-13-Post-Deployment-Report.md (new)
  2025-11-13-Retrospective-Notes.md (new)
  2025-11-13-M1.2-Final-Report.md (new)
```
#### Dependencies
- Day 9 deployment successful
- No critical issues in production
#### Risks
- Unexpected production issues (mitigation: rollback plan ready)
- User adoption issues (mitigation: clear documentation, training)
---
## Dependencies and Critical Path
### Critical Path Analysis
**Critical Path:** Day 1 → Day 2 → Day 3 → Day 4 → Day 9 (Backend-dependent path)
```
Day 1: Domain Layer (COMPLETE)
Day 2: Infrastructure + TenantContext
Day 3: Application Layer + APIs
Day 4: Migration Prep + Testing
↓ (Backend Ready)
Day 5: Frontend Core (can start if APIs mocked)
Day 6: Frontend Settings
Day 7: Frontend MCP
↓ (Frontend Ready)
Day 8: Integration Testing
Day 9: Production Migration
Day 10: Validation
```
### Parallel Work Opportunities
**Days 5-7 (Frontend) can start in parallel with Days 2-4 (Backend) if:**
- API contracts defined (DONE - API integration guide complete)
- MSW mocks configured for frontend development
- Backend APIs deployed to staging environment when ready
**Optimization:**
- Frontend team can start Day 5 tasks while backend completes Day 3
- This can recover 1-2 days of schedule if the backend APIs are delayed
### Dependencies Matrix
| Task | Depends On | Blocks |
|------|-----------|--------|
| Day 1: Domain | - | Day 2 |
| Day 2: Infrastructure | Day 1 | Day 3 |
| Day 3: Application + APIs | Day 2 | Day 4, Day 5 (for real API testing) |
| Day 4: Migration Prep | Day 3 | Day 9 |
| Day 5: Frontend Core | API contracts (done), ideally Day 3 for real APIs | Day 6 |
| Day 6: Frontend Settings | Day 5 | Day 7 |
| Day 7: Frontend MCP | Day 6 | Day 8 |
| Day 8: Integration Testing | Days 4 + 7 | Day 9 |
| Day 9: Migration | Days 4 + 8 | Day 10 |
| Day 10: Validation | Day 9 | - |
---
## Resource Allocation
### Team Assignments
| Day | Backend | Frontend | QA | DevOps | DBA | PM |
|-----|---------|----------|----|---------|----|-----|
| **1** | Full time (6h) | - | - | - | - | Review (1h) |
| **2** | Full time (5h) | - | - | - | - | Review (1h) |
| **3** | Full time (5h) | - | - | - | - | Review (1h) |
| **4** | Part time (3h) | - | - | - | Full time (5h) | Review (1h) |
| **5** | Support (1h) | Full time (8h) | - | - | - | Review (1h) |
| **6** | Support (1h) | Full time (7h) | - | - | - | Review (1h) |
| **7** | Support (1h) | Full time (8h) | - | - | - | Review (1h) |
| **8** | Part time (3h) | Part time (3h) | Full time (8h) | - | - | Review (2h) |
| **9** | Part time (3h) | - | Testing (4h) | Full time (8h) | Part time (4h) | Oversight (4h) |
| **10** | Support (2h) | Support (2h) | Testing (4h) | Monitoring (4h) | - | Reporting (6h) |
**Total Effort:**
- Backend Engineer: ~30 hours
- Frontend Engineer: ~28 hours
- QA Engineer: ~16 hours
- DevOps Engineer: ~12 hours
- DBA: ~9 hours
- Product Manager: ~19 hours
- **Total: ~114 team-hours over 10 days**
---
## Risk Management
### High-Risk Items
| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| **Data loss during migration** | Low | Critical | Full backup, tested rollback, validation scripts |
| **Extended downtime** | Medium | High | Pre-test migration, have rollback ready, schedule off-peak |
| **Performance degradation** | Medium | High | Performance testing on Day 4, proper indexes |
| **Critical bugs found on Day 8** | Medium | High | Allow extra time on Day 9 for fixes |
| **SSO integration issues** | Low | Medium | Test with real IdPs, clear error messages |
### Risk Mitigation Strategies
1. **Technical Risks:**
- Comprehensive testing (unit, integration, E2E, security)
- Code reviews (all PRs reviewed by at least one other developer)
- Performance testing before production (Day 4)
- Staging environment mirrors production
2. **Schedule Risks:**
- Buffer time built into Days 8-9 (can absorb 1-day delay)
- Frontend can start earlier if APIs mocked
- Rollback plan tested and ready
3. **Quality Risks:**
- TDD approach (tests written alongside code)
- Integration tests verify tenant isolation
- Security tests before production
- Code coverage target: >80%
---
## Success Criteria
### M1.2 Completion Criteria
- [ ] All 10 days complete
- [ ] All backend features implemented and tested
- [ ] All frontend features implemented and tested
- [ ] Database migration successful (no data loss)
- [ ] Production deployment successful
- [ ] All P0 and P1 bugs fixed
- [ ] Documentation complete (user guides + admin guides)
- [ ] Monitoring and alerts configured
- [ ] Stakeholder demo complete
### Quality Metrics
| Metric | Target | Measurement Method |
|--------|--------|-------------------|
| **Test Coverage** | >80% | Code coverage report |
| **Unit Test Pass Rate** | 100% | CI/CD pipeline |
| **Integration Test Pass Rate** | 100% | CI/CD pipeline |
| **E2E Test Pass Rate** | 100% | Playwright test report |
| **API Response Time** | <100ms (p95) | APM dashboard |
| **Database Query Time** | <50ms (p95) | pg_stat_statements |
| **Error Rate** | <1% | Sentry dashboard |
| **Uptime** | >99.9% | Status page |
### Feature Completion Checklist
**Multi-Tenancy:**
- [ ] Tenant registration works (3-step wizard)
- [ ] Tenant isolation enforced (cross-tenant queries fail)
- [ ] Tenant context resolved from JWT claims
- [ ] Global Query Filter applies automatically
- [ ] Composite indexes improve performance
**SSO Integration:**
- [ ] OIDC login works (Azure AD, Google, Okta)
- [ ] SAML 2.0 login works (generic IdP)
- [ ] SSO configuration UI functional for admins
- [ ] User auto-provisioning works
- [ ] Email domain restrictions enforced
- [ ] SSO errors handled gracefully
**MCP Authentication:**
- [ ] MCP token generation works (3-step wizard)
- [ ] Token displayed once with warning
- [ ] Token authentication works (API calls)
- [ ] Fine-grained permissions enforced
- [ ] Token revocation works instantly
- [ ] Audit logs created for all operations
---
## Communication Plan
### Daily Standups
**Time:** 9:00 AM daily
**Duration:** 15 minutes
**Attendees:** All team members
**Format:**
- What did you complete yesterday?
- What are you working on today?
- Any blockers?
### Progress Updates
**Frequency:** End of each day
**Owner:** Product Manager
**Format:** Slack message with:
- Day summary (what was completed)
- Current status (on track / at risk / blocked)
- Next day plan
- Any issues or decisions needed
### Milestone Reviews
**Schedule:**
- Day 4 (End of Backend phase): 1-hour review meeting
- Day 7 (End of Frontend phase): 1-hour review meeting
- Day 10 (End of M1.2): 2-hour retrospective
### Stakeholder Updates
**Frequency:** Every 3 days
**Owner:** Product Manager
**Audience:** Executive team, stakeholders
**Format:** Written update with:
- Progress summary
- Key achievements
- Risks and issues
- Next steps
---
## Appendix A: File Count Summary
### Expected Files Created
**Backend (Days 1-4):**
- Domain: 27 source files
- Application: ~20 files (Commands, Queries, Handlers, Validators)
- Infrastructure: ~15 files (Repositories, DbContext, Configurations)
- Tests: ~15 test files (~100 total tests)
- **Total: ~77 files**
**Frontend (Days 5-7):**
- Pages: ~10 files
- Components: ~20 files
- Hooks: ~15 files
- Services: ~5 files
- Stores: ~2 files
- Types: ~5 files
- Tests: ~30 test files (~150 total tests)
- **Total: ~87 files**
**Documentation (Days 8-10):**
- User guides: ~3 files
- Admin guides: ~3 files
- Reports: ~5 files
- **Total: ~11 files**
**Overall: ~175 new files created**
---
## Appendix B: Testing Matrix
| Test Type | Count | Tool | Owner | When |
|-----------|-------|------|-------|------|
| **Unit Tests (Backend)** | ~80 | xUnit | Backend | Days 1-4 |
| **Unit Tests (Frontend)** | ~70 | Vitest | Frontend | Days 5-7 |
| **Integration Tests (Backend)** | ~30 | xUnit + TestContainers | Backend | Days 3-4 |
| **Integration Tests (Frontend)** | ~20 | React Testing Library | Frontend | Days 6-7 |
| **E2E Tests** | ~10 | Playwright | QA | Day 8 |
| **Security Tests** | ~8 | Custom + OWASP ZAP | QA | Day 8 |
| **Performance Tests** | ~5 | k6 | DevOps | Day 4, Day 8 |
| **Total Tests** | ~223 | - | All | Days 1-8 |
---
**Plan Status:** Active - In Execution
**Next Update:** Day 2 Evening (2025-11-04)
**Contact:** Product Manager for questions or updates
---
**End of 10-Day Implementation Plan**