feat(backend): Implement Refresh Token mechanism (Day 5 Phase 1)

Implemented secure refresh token rotation with the following features:
- RefreshToken domain entity with IsExpired(), IsRevoked(), IsActive(), Revoke() methods
- IRefreshTokenService with token generation, rotation, and revocation
- RefreshTokenService with SHA-256 hashing and token family tracking
- RefreshTokenRepository for database operations
- Database migration for refresh_tokens table with proper indexes
- Updated LoginCommandHandler and RegisterTenantCommandHandler to return refresh tokens
- Added POST /api/auth/refresh endpoint (token rotation)
- Added POST /api/auth/logout endpoint (revoke single token)
- Added POST /api/auth/logout-all endpoint (revoke all user tokens)
- Updated JWT access token expiration to 15 minutes (from 60)
- Refresh token expiration set to 7 days
- Security features: token reuse detection, IP address tracking, user-agent logging

Changes:
- Domain: RefreshToken.cs, IRefreshTokenRepository.cs
- Application: IRefreshTokenService.cs, updated LoginResponseDto and RegisterTenantResult
- Infrastructure: RefreshTokenService.cs, RefreshTokenRepository.cs, RefreshTokenConfiguration.cs
- API: AuthController.cs (3 new endpoints), RefreshTokenRequest.cs, LogoutRequest.cs
- Configuration: appsettings.Development.json (updated JWT settings)
- DI: DependencyInjection.cs (registered new services)
- Migration: AddRefreshTokens migration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Yaojia Wang
2025-11-03 14:44:36 +01:00
parent 1f66b25f30
commit 9e2edb2965
32 changed files with 4669 additions and 28 deletions

@@ -0,0 +1,948 @@
# Day 5 Priority Analysis and Requirements Document
**Date**: 2025-11-03
**Project**: ColaFlow Authentication System
**Milestone**: M1 - Core Project Module
---
## Executive Summary
Based on Day 4's authentication implementation (JWT + BCrypt + Middleware) and ColaFlow's M1-M6 roadmap, this document prioritizes 4 pending features and defines Day 5 implementation focus.
**Day 5 Recommendation**: Focus on **Refresh Token** + **Role-Based Authorization (RBAC)**
---
## 1. Priority Analysis
### Feature Priority Matrix
| Feature | Business Value | Technical Complexity | MCP Dependency | Risk | Priority |
|---------|---------------|---------------------|----------------|------|----------|
| **Refresh Token** | HIGH | LOW | HIGH | LOW | **P0 (Must Have)** |
| **Role-Based Authorization** | HIGH | MEDIUM | CRITICAL | MEDIUM | **P0 (Must Have)** |
| **Email Verification** | MEDIUM | LOW | LOW | LOW | **P1 (Should Have)** |
| **SSO Integration** | LOW | HIGH | LOW | HIGH | **P2 (Nice to Have)** |
---
### 1.1 Refresh Token Implementation
**Priority**: **P0 (Must Have)**
#### Why P0?
1. **Security Best Practice**: Current 60-minute JWT is too long for production (increases vulnerability window)
2. **User Experience**: Prevents frequent re-logins (enables 7-day "Remember Me" functionality)
3. **MCP Integration**: AI tools need long-lived sessions to perform multi-step operations (create PRD → generate tasks → update progress)
4. **Industry Standard**: Short-lived access tokens paired with refresh tokens are the common pattern in production auth systems (see RFC 6749)
#### Business Value
- **High**: Essential for production security and UX
- **MCP Relevance**: Critical - AI agents need persistent sessions to complete multi-turn workflows
#### Technical Complexity
- **Low**: Interface already exists (`GenerateRefreshTokenAsync()`)
- **Effort**: 2-3 hours
- **Dependencies**: Database or Redis storage
#### Risk
- **Low**: Well-defined pattern, no architectural changes needed
---
### 1.2 Role-Based Authorization (RBAC)
**Priority**: **P0 (Must Have)**
#### Why P0?
1. **MCP Security Requirement**: AI tools must have restricted permissions (read-only vs. read-write)
2. **Multi-Tenant Architecture**: Tenant Admins vs. Members vs. Guests need different access levels
3. **Project Core Requirement**: Epic/Story/Task management requires role-based access control
4. **Audit & Compliance**: ColaFlow's audit log system requires role tracking for accountability
#### Business Value
- **High**: Foundation for all access control in M1-M6
- **MCP Relevance**: Critical - AI agents must operate under restricted roles (e.g., "AI Agent" role with write-preview permissions)
#### Technical Complexity
- **Medium**: Requires database schema changes (User-Role mapping), claims modification, authorization policies
- **Effort**: 4-5 hours
- **Dependencies**: JWT claims, authorization middleware
#### Risk
- **Medium**: Requires migration of existing users, potential breaking changes
---
### 1.3 Email Verification
**Priority**: **P1 (Should Have)**
#### Why P1?
1. **Security Enhancement**: Prevents fake account registrations
2. **User Validation**: Ensures users own their email addresses
3. **Password Reset Prerequisite**: Required for secure password reset flow
#### Business Value
- **Medium**: Improves security but not blocking for M1
- **MCP Relevance**: Low - AI tools don't require email verification
#### Technical Complexity
- **Low**: Standard email verification flow
- **Effort**: 3-4 hours
- **Dependencies**: Email service (SendGrid/AWS SES), verification token storage
#### Risk
- **Low**: Non-breaking addition to registration flow
#### Deferral Justification
- Not blocking for M1 Core Project Module
- Can be added in M2 or M3 without architectural changes
- Focus on MCP-critical features first
---
### 1.4 SSO Integration
**Priority**: **P2 (Nice to Have)**
#### Why P2?
1. **Enterprise Feature**: Primarily for M5 Enterprise Pilot
2. **High Complexity**: Requires OAuth 2.0/OIDC implementation, multiple provider support
3. **Not MCP-Critical**: AI tools use API tokens, not SSO
#### Business Value
- **Low**: Enterprise convenience feature, not required for M1-M3
- **MCP Relevance**: None - AI tools don't use SSO
#### Technical Complexity
- **High**: Multiple providers (Azure AD, Google, GitHub), token exchange, user mapping
- **Effort**: 10-15 hours
- **Dependencies**: OAuth libraries, provider registrations, user linking logic
#### Risk
- **High**: Complex integration, provider-specific quirks, testing overhead
#### Deferral Justification
- Target for M4 (External Integration) or M5 (Enterprise Pilot)
- Does not block M1-M3 development
- Local authentication + API tokens sufficient for early milestones
---
## 2. Day 5 Focus: Refresh Token + RBAC
### Recommended Scope
**Day 5 Goals**:
1. Implement **Refresh Token** mechanism (2-3 hours)
2. Implement **Role-Based Authorization** foundation (4-5 hours)
**Total Effort**: 6-8 hours (achievable in 1 day)
---
## 3. Feature Requirements
---
## 3.1 Refresh Token Implementation
### 3.1.1 Background & Goals
#### Business Context
- Current JWT tokens expire in 60 minutes, forcing users to re-login frequently
- AI agents performing long-running tasks (multi-step PRD generation) lose authentication mid-workflow
- Industry standard: Short-lived access tokens (15-30 min) + long-lived refresh tokens (7-30 days)
#### User Pain Points
- Users lose session while actively working
- AI tools fail mid-operation due to token expiration
- No "Remember Me" functionality
#### Project Objectives
- Reduce access token lifetime to 15 minutes (increase security)
- Implement 7-day refresh tokens (improve UX)
- Enable seamless token refresh for AI agents
---
### 3.1.2 Requirements
#### Core Functionality
**FR-RT-1**: JWT Access Token Generation
- Reduce JWT expiration to 15 minutes (configurable)
- Keep existing JWT structure and claims
- Access tokens remain stateless
**FR-RT-2**: Refresh Token Generation
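- Generate cryptographically secure refresh tokens from a CSPRNG (at least 32 random bytes, per AC-RT-10); a GUID alone carries at most 122 random bits and is not sufficient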
- Store refresh tokens in database (or Redis)
- Associate refresh tokens with User + Tenant + Device/Client
- Set expiration to 7 days (configurable)
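A minimal sketch of FR-RT-2 token generation, assuming the token is 32 random bytes and that only a SHA-256 hash is persisted (the commit message mentions SHA-256 hashing; the class and method names here are hypothetical, not the project's actual `RefreshTokenService`):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration only.
public static class RefreshTokenFactory
{
    // 32 random bytes from the OS CSPRNG = 256 bits of entropy (AC-RT-10).
    public static string Generate()
    {
        byte[] bytes = RandomNumberGenerator.GetBytes(32);
        // Base64url without padding, so the value is safe in JSON bodies and URLs.
        return Convert.ToBase64String(bytes).TrimEnd('=').Replace('+', '-').Replace('/', '_');
    }

    // Only this hash would be stored; the raw token is handed to the client once.
    public static string Hash(string token)
    {
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(token));
        return Convert.ToHexString(digest);
    }
}
```

Storing the hash instead of the raw value means a leaked database row cannot be replayed as a token; lookups then compare `Hash(presentedToken)` against the stored column.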
**FR-RT-3**: Refresh Token Storage
```sql
CREATE TABLE RefreshTokens (
Id UUID PRIMARY KEY,
UserId UUID NOT NULL REFERENCES Users(Id),
TenantId UUID NOT NULL REFERENCES Tenants(Id),
Token VARCHAR(500) NOT NULL UNIQUE,
ExpiresAt TIMESTAMP NOT NULL,
CreatedAt TIMESTAMP NOT NULL DEFAULT NOW(),
RevokedAt TIMESTAMP NULL,
ReplacedByToken VARCHAR(500) NULL
);
CREATE INDEX IX_RefreshTokens_Token ON RefreshTokens(Token);
CREATE INDEX IX_RefreshTokens_UserId ON RefreshTokens(UserId);
```
**FR-RT-4**: Token Refresh Endpoint
- **POST /api/auth/refresh**
- **Request Body**: `{ "refreshToken": "..." }`
- **Response**: New access token + new refresh token (token rotation)
- **Validation**:
- Refresh token exists and not revoked
- Refresh token not expired
- User and Tenant still active
- **Behavior**: Issue new access token + rotate refresh token (invalidate old token)
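The validation and rotation steps of FR-RT-4 can be sketched with an in-memory store. This is an illustration under assumptions: the `StoredToken` and `InMemoryRotator` types are hypothetical, the dictionary stands in for the `RefreshTokens` table, and the "User and Tenant still active" check is omitted.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical in-memory stand-in for a RefreshTokens row.
public sealed class StoredToken
{
    public Guid UserId { get; init; }
    public DateTime ExpiresAt { get; init; }
    public DateTime? RevokedAt { get; set; }
    public string? ReplacedByHash { get; set; }
}

public sealed class InMemoryRotator
{
    private static readonly TimeSpan Lifetime = TimeSpan.FromDays(7);
    private readonly Dictionary<string, StoredToken> _byHash = new();

    private static string Hash(string token) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(token)));

    // Creates a token, stores it under its hash, and returns the raw value.
    public string Issue(Guid userId, DateTime now)
    {
        string raw = Convert.ToBase64String(RandomNumberGenerator.GetBytes(32));
        _byHash[Hash(raw)] = new StoredToken { UserId = userId, ExpiresAt = now + Lifetime };
        return raw;
    }

    // Returns the replacement token, or null when the endpoint should answer 401.
    public string? Rotate(string presented, DateTime now)
    {
        if (!_byHash.TryGetValue(Hash(presented), out var stored)) return null; // unknown token
        if (stored.RevokedAt is not null) return null;                          // revoked or already rotated
        if (stored.ExpiresAt <= now) return null;                               // expired

        string replacement = Issue(stored.UserId, now);
        stored.RevokedAt = now;                       // invalidate the old token
        stored.ReplacedByHash = Hash(replacement);    // keep the rotation chain
        return replacement;
    }
}
```

In a database-backed version the lookup, revoke, and insert would need to run in one transaction so that two concurrent refresh calls cannot both succeed with the same token.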
**FR-RT-5**: Token Revocation
- **POST /api/auth/logout**
- Mark refresh token as revoked
- Prevent reuse of revoked tokens
**FR-RT-6**: Automatic Cleanup
- Background job to delete refresh tokens that expired more than 30 days ago
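The core of the cleanup job in FR-RT-6 is a cutoff comparison. The sketch below assumes the 30 days is a retention period measured from each token's expiry, which is one reading of the requirement; `RefreshTokenCleanup` is a hypothetical name and the list stands in for the table.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; a real job would issue a DELETE against the database instead.
public static class RefreshTokenCleanup
{
    // Removes entries whose expiry lies more than `retention` in the past.
    // Returns the number of entries removed.
    public static int Purge(List<DateTime> expiries, DateTime now, TimeSpan retention)
    {
        DateTime cutoff = now - retention;
        return expiries.RemoveAll(expiresAt => expiresAt < cutoff);
    }
}
```

Keeping recently expired rows for a while preserves evidence for audit and reuse detection before they are deleted.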
---
#### User Scenarios
**Scenario 1: User Login**
1. User submits credentials → `/api/auth/login`
2. System validates credentials
3. System generates:
- Access Token (15-minute JWT)
- Refresh Token (7-day random token stored in database)
4. System returns both tokens
5. Client stores refresh token securely (HttpOnly cookie or secure storage)
**Expected Result**: User receives short-lived access token + long-lived refresh token
---
**Scenario 2: Access Token Expiration**
1. Client makes API request with expired access token
2. API returns `401 Unauthorized`
3. Client automatically calls `/api/auth/refresh` with refresh token
4. System validates refresh token and issues new access token + new refresh token
5. Client retries original API request with new access token
**Expected Result**: Seamless token refresh without user re-login
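The client side of Scenario 2 can be sketched as a retry-once wrapper. The two delegates are assumptions for illustration: `send` issues the API request with a given access token, and `refreshAccessToken` calls `/api/auth/refresh` and returns the new access token, or null if the refresh is rejected.

```csharp
#nullable enable
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical client helper; not part of the ColaFlow codebase.
public static class RefreshingClient
{
    public static async Task<HttpResponseMessage> SendAsync(
        string accessToken,
        Func<string, Task<HttpResponseMessage>> send,
        Func<Task<string?>> refreshAccessToken)
    {
        HttpResponseMessage first = await send(accessToken);
        if (first.StatusCode != HttpStatusCode.Unauthorized) return first;

        string? renewed = await refreshAccessToken();
        if (renewed is null) return first; // refresh failed: caller redirects to login

        first.Dispose();
        return await send(renewed); // retry exactly once to avoid a 401 loop
    }
}
```

Limiting the retry to a single attempt keeps an expired refresh token (Scenario 3) from causing repeated refresh calls.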
---
**Scenario 3: Refresh Token Expiration**
1. User hasn't accessed app for 7+ days
2. Refresh token expired
3. Client attempts token refresh → System returns `401 Unauthorized`
4. Client redirects user to login page
**Expected Result**: User must re-authenticate after 7 days of inactivity
---
**Scenario 4: User Logout**
1. User clicks "Logout"
2. Client calls `/api/auth/logout` with refresh token
3. System marks refresh token as revoked
4. Client clears stored tokens
**Expected Result**: Refresh token becomes invalid, user must re-login
---
#### Priority Levels
**P0 (Must Have)**:
- Refresh token generation and storage
- `/api/auth/refresh` endpoint with token rotation
- Database schema for refresh tokens
- Token revocation on logout
**P1 (Should Have)**:
- Automatic expired token cleanup job
- Multiple device/session support (one refresh token per device)
- Admin endpoint to revoke all user tokens
**P2 (Nice to Have)**:
- Refresh token usage analytics
- Suspicious activity detection (token reuse, concurrent sessions)
---
### 3.1.3 Acceptance Criteria
#### Functional Criteria
- [ ] **AC-RT-1**: Access tokens expire in 15 minutes (configurable via `appsettings.json`)
- [ ] **AC-RT-2**: Refresh tokens expire in 7 days (configurable)
- [ ] **AC-RT-3**: `/api/auth/login` returns both access token and refresh token
- [ ] **AC-RT-4**: `/api/auth/refresh` validates refresh token and issues new tokens
- [ ] **AC-RT-5**: Old refresh token is revoked when new token is issued (token rotation)
- [ ] **AC-RT-6**: Revoked refresh tokens cannot be reused
- [ ] **AC-RT-7**: Expired refresh tokens cannot be used
- [ ] **AC-RT-8**: `/api/auth/logout` revokes refresh token
- [ ] **AC-RT-9**: Refresh tokens are stored securely (hashed or encrypted)
#### Security Criteria
- [ ] **AC-RT-10**: Refresh tokens are cryptographically secure (min 256-bit entropy)
- [ ] **AC-RT-11**: Token rotation limits token replay (a rotated refresh token is rejected if presented again)
- [ ] **AC-RT-12**: Refresh tokens are unique per user session
- [ ] **AC-RT-13**: Concurrent refresh attempts invalidate all tokens (suspicious activity detection - P1)
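One way to satisfy AC-RT-13 is token family tracking, which the commit message lists as a feature. The sketch below is an assumption about how that could work, not the committed implementation: every token descended from one login shares a `FamilyId`, and presenting an already-rotated token revokes the whole family. All type names are hypothetical.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical token record: tokens from the same login share a FamilyId.
public sealed record FamilyToken(string Value, Guid FamilyId)
{
    public bool Revoked { get; set; }
}

public sealed class ReuseDetector
{
    private readonly Dictionary<string, FamilyToken> _tokens = new();

    // Called at login: starts a new family.
    public FamilyToken StartFamily() => Add(new FamilyToken(NewValue(), Guid.NewGuid()));

    // Returns the successor token, or null if the presented token is rejected.
    public FamilyToken? Rotate(string presented)
    {
        if (!_tokens.TryGetValue(presented, out var current)) return null;
        if (current.Revoked)
        {
            // A rotated token came back: treat it as theft and revoke the whole family.
            foreach (var t in _tokens.Values.Where(t => t.FamilyId == current.FamilyId))
                t.Revoked = true;
            return null;
        }
        current.Revoked = true;
        return Add(new FamilyToken(NewValue(), current.FamilyId));
    }

    public bool IsActive(string value) => _tokens.TryGetValue(value, out var t) && !t.Revoked;

    private FamilyToken Add(FamilyToken token) { _tokens[token.Value] = token; return token; }

    private static string NewValue() => Convert.ToHexString(RandomNumberGenerator.GetBytes(32));
}
```

Revoking the family forces both the legitimate client and the attacker back to the login page, which is the conservative choice when the server cannot tell them apart.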
#### Performance Criteria
- [ ] **AC-RT-14**: Token refresh completes in < 200ms (database lookup + JWT generation)
- [ ] **AC-RT-15**: Database indexes on `Token` and `UserId` for fast lookups
---
### 3.1.4 Timeline
- **Epic**: Identity & Authentication
- **Story**: Refresh Token Implementation
- **Tasks**:
1. Create `RefreshToken` entity and DbContext configuration (30 min)
2. Add database migration for `RefreshTokens` table (15 min)
3. Implement `GenerateRefreshTokenAsync()` in `JwtService` (30 min)
4. Implement `RefreshTokenRepository` for storage (30 min)
5. Update `/api/auth/login` to return refresh token (15 min)
6. Implement `/api/auth/refresh` endpoint (45 min)
7. Implement `/api/auth/logout` token revocation (15 min)
8. Update JWT expiration to 15 minutes (5 min)
9. Write integration tests (30 min)
10. Update documentation (15 min)
**Estimated Effort**: 3 hours
**Target Milestone**: M1
---
## 3.2 Role-Based Authorization (RBAC)
### 3.2.1 Background & Goals
#### Business Context
- ColaFlow is a multi-tenant system with hierarchical permissions
- Different users need different access levels (Tenant Admin, Project Admin, Member, Guest, AI Agent)
- MCP integration requires AI agents to operate under restricted roles
- Audit logs require role information for accountability
#### User Pain Points
- No granular access control (all users have same permissions)
- Cannot restrict AI agents to read-only or preview-only operations
- Cannot enforce tenant-level vs. project-level permissions
#### Project Objectives
- Implement role hierarchy: Tenant Admin > Project Admin > Member > Guest, with AI Agent as a separate restricted role (read + write-preview, per FR-RBAC-1)
- Support role-based JWT claims for authorization
- Enable `[Authorize(Roles = "Admin")]` attribute usage
- Prepare for MCP-specific roles (AI agents with write-preview permissions)
---
### 3.2.2 Requirements
#### Core Functionality
**FR-RBAC-1**: Role Definitions
Define 5 core roles:
| Role | Scope | Permissions |
|------|-------|------------|
| **TenantAdmin** | Tenant-wide | Full control: manage users, roles, projects, billing |
| **ProjectAdmin** | Project-specific | Manage project: create/edit/delete tasks, assign members |
| **Member** | Project-specific | Create/edit own tasks, view all project data |
| **Guest** | Project-specific | Read-only access to assigned tasks |
| **AIAgent** | Tenant-wide | Read all + Write with preview (requires human approval) |
**FR-RBAC-2**: Database Schema
```sql
-- Enum or lookup table for roles
CREATE TABLE Roles (
Id UUID PRIMARY KEY,
Name VARCHAR(50) NOT NULL UNIQUE, -- TenantAdmin, ProjectAdmin, Member, Guest, AIAgent
Description VARCHAR(500),
IsSystemRole BOOLEAN NOT NULL DEFAULT TRUE
);
-- User-Role mapping (many-to-many)
CREATE TABLE UserRoles (
Id UUID PRIMARY KEY,
UserId UUID NOT NULL REFERENCES Users(Id) ON DELETE CASCADE,
RoleId UUID NOT NULL REFERENCES Roles(Id) ON DELETE CASCADE,
TenantId UUID NOT NULL REFERENCES Tenants(Id) ON DELETE CASCADE,
ProjectId UUID NULL REFERENCES Projects(Id) ON DELETE CASCADE, -- NULL for tenant-level roles
GrantedAt TIMESTAMP NOT NULL DEFAULT NOW(),
GrantedBy UUID NULL REFERENCES Users(Id), -- Who assigned this role
-- PostgreSQL treats NULLs as distinct, so this constraint alone does not block duplicate tenant-level rows (ProjectId NULL)
UNIQUE(UserId, RoleId, TenantId, ProjectId)
);
CREATE INDEX IX_UserRoles_UserId ON UserRoles(UserId);
CREATE INDEX IX_UserRoles_TenantId ON UserRoles(TenantId);
CREATE INDEX IX_UserRoles_ProjectId ON UserRoles(ProjectId);
```
**FR-RBAC-3**: JWT Claims Enhancement
Add role claims to JWT:
```json
{
"sub": "user-guid",
"email": "user@example.com",
"role": "TenantAdmin", // Primary role
"roles": ["TenantAdmin", "ProjectAdmin"], // All roles (array)
"tenant_id": "tenant-guid",
"permissions": ["users:read", "users:write", "projects:admin"] // Optional: fine-grained permissions
}
```
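A minimal sketch of building the role claims for FR-RBAC-3 with `System.Security.Claims`. `RoleClaims.Build` is a hypothetical helper, not the project's `JwtService`; how these claims are named in the serialized token depends on the JWT handler's claim-type mapping, which is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Claims;

// Hypothetical helper for illustration only.
public static class RoleClaims
{
    public static List<Claim> Build(Guid userId, string email, Guid tenantId, IReadOnlyList<string> roles)
    {
        var claims = new List<Claim>
        {
            new Claim("sub", userId.ToString()),
            new Claim("email", email),
            new Claim("tenant_id", tenantId.ToString()),
        };
        // One ClaimTypes.Role claim per role, so IsInRole and [Authorize(Roles = "...")]
        // can see every role the user holds.
        foreach (string role in roles)
            claims.Add(new Claim(ClaimTypes.Role, role));
        return claims;
    }
}
```

Emitting one claim per role, instead of a single comma-joined string, matches how `ClaimsPrincipal.IsInRole` evaluates role membership.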
**FR-RBAC-4**: Authorization Policies
Configure policies in `Program.cs`:
```csharp
builder.Services.AddAuthorization(options =>
{
options.AddPolicy("RequireTenantAdmin", policy =>
policy.RequireRole("TenantAdmin"));
options.AddPolicy("RequireProjectAdmin", policy =>
policy.RequireRole("TenantAdmin", "ProjectAdmin"));
options.AddPolicy("RequireMemberOrHigher", policy =>
policy.RequireRole("TenantAdmin", "ProjectAdmin", "Member"));
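options.AddPolicy("RequireHumanUser", policy =>
policy.RequireAuthenticatedUser() // an assertion alone would also admit anonymous callers
.RequireAssertion(ctx =>
!ctx.User.IsInRole("AIAgent"))); // IsInRole honors the configured role claim type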
});
```
**FR-RBAC-5**: Controller Protection
Apply role-based authorization to endpoints:
```csharp
[Authorize(Roles = "TenantAdmin")]
[HttpPost("api/tenants/{tenantId}/users")]
public async Task<IActionResult> CreateUser(...) { }
[Authorize(Policy = "RequireProjectAdmin")]
[HttpDelete("api/projects/{projectId}")]
public async Task<IActionResult> DeleteProject(...) { }
[Authorize(Policy = "RequireMemberOrHigher")]
[HttpPost("api/projects/{projectId}/tasks")]
public async Task<IActionResult> CreateTask(...) { }
```
**FR-RBAC-6**: Default Role Assignment
- New tenant registration: First user gets `TenantAdmin` role
- Invited users: Get `Member` role by default
- AI agents: Require explicit `AIAgent` role assignment
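The three default-assignment rules in FR-RBAC-6 reduce to a small decision function. This is a sketch with hypothetical names; returning null for AI agents is one way to express "no default, explicit assignment required".

```csharp
#nullable enable
using System;

// Hypothetical helper for illustration only.
public static class DefaultRoles
{
    // Returns the role to assign automatically, or null when none may be assigned implicitly.
    public static string? For(bool isFirstUserInTenant, bool isAiAgent)
    {
        if (isAiAgent) return null;                 // AIAgent role must be granted explicitly
        return isFirstUserInTenant ? "TenantAdmin"  // tenant registration
                                   : "Member";      // invited users
    }
}
```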
---
#### User Scenarios
**Scenario 1: Tenant Admin Creates User**
1. Tenant Admin invites new user via `/api/tenants/{tenantId}/users`
2. System validates requester has `TenantAdmin` role
3. System creates user with `Member` role by default
4. System sends invitation email
**Expected Result**: User created successfully, assigned Member role
---
**Scenario 2: Member Attempts Tenant Admin Action**
1. Member user attempts to delete tenant via `/api/tenants/{tenantId}`
2. System validates JWT role claim
3. System returns `403 Forbidden` (insufficient permissions)
**Expected Result**: Request rejected with clear error message
---
**Scenario 3: Project Admin Assigns Roles**
1. Project Admin assigns user to project with `ProjectAdmin` role
2. System validates requester has `TenantAdmin` or `ProjectAdmin` role for this project
3. System creates `UserRoles` entry (UserId, ProjectAdmin, ProjectId)
4. User receives notification
**Expected Result**: User gains ProjectAdmin role for specific project
---
**Scenario 4: AI Agent Creates Task (MCP Integration)**
1. AI agent calls `/api/projects/{projectId}/tasks` with `AIAgent` role token
2. System detects `AIAgent` role → triggers diff preview mode
3. System generates task preview (not committed to database)
4. System returns preview to AI agent → AI presents to human for approval
5. Human approves → AI agent calls `/api/tasks/preview/{previewId}/commit`
6. System validates approval and commits task
**Expected Result**: AI agent creates task only after human approval
---
#### Priority Levels
**P0 (Must Have)**:
- Role definitions (TenantAdmin, ProjectAdmin, Member, Guest, AIAgent)
- Database schema: `Roles` + `UserRoles` tables
- JWT role claims
- Authorization policies in `Program.cs`
- Controller-level `[Authorize(Roles = "...")]` protection
- Default role assignment (TenantAdmin for first user, Member for new users)
**P1 (Should Have)**:
- Project-specific role assignment (UserRoles with ProjectId)
- Role management API (assign/revoke roles)
- Admin UI for role management
- Role-based audit logging
**P2 (Nice to Have)**:
- Fine-grained permissions (users:read, users:write, etc.)
- Custom role creation
- Role inheritance (ProjectAdmin inherits Member permissions)
---
### 3.2.3 Acceptance Criteria
#### Functional Criteria
- [ ] **AC-RBAC-1**: 5 system roles exist in database (TenantAdmin, ProjectAdmin, Member, Guest, AIAgent)
- [ ] **AC-RBAC-2**: First user in new tenant is automatically assigned `TenantAdmin` role
- [ ] **AC-RBAC-3**: JWT tokens include `role` and `roles` claims
- [ ] **AC-RBAC-4**: Endpoints protected with `[Authorize(Roles = "...")]` reject unauthorized users with `403 Forbidden`
- [ ] **AC-RBAC-5**: `TenantAdmin` can access all tenant-level endpoints
- [ ] **AC-RBAC-6**: `Member` cannot access admin endpoints (returns `403`)
- [ ] **AC-RBAC-7**: Role assignment is logged in audit trail (P1)
#### Security Criteria
- [ ] **AC-RBAC-8**: Role claims are cryptographically signed in JWT (tamper-evident: any modification invalidates the signature)
- [ ] **AC-RBAC-9**: Role validation happens on every request (no role caching vulnerabilities)
- [ ] **AC-RBAC-10**: AI agents cannot access endpoints requiring human user (RequireHumanUser policy)
#### MCP Integration Criteria
- [ ] **AC-RBAC-11**: `AIAgent` role is distinguishable in authorization logic
- [ ] **AC-RBAC-12**: Endpoints can detect AI agent role and trigger preview mode (P0 for M2)
- [ ] **AC-RBAC-13**: Human-only endpoints (e.g., approve preview) reject AI agent tokens
#### Performance Criteria
- [ ] **AC-RBAC-14**: Role lookup from JWT claims (no database query per request)
- [ ] **AC-RBAC-15**: Authorization decision completes in < 10ms
---
### 3.2.4 Timeline
- **Epic**: Identity & Authentication
- **Story**: Role-Based Authorization (RBAC)
- **Tasks**:
1. Design role hierarchy and permissions matrix (30 min)
2. Create `Role` and `UserRole` entities (30 min)
3. Add database migration for RBAC tables (15 min)
4. Seed default roles (TenantAdmin, ProjectAdmin, Member, Guest, AIAgent) (15 min)
5. Update `JwtService` to include role claims (30 min)
6. Update `RegisterTenantCommandHandler` to assign TenantAdmin role (15 min)
7. Configure authorization policies in `Program.cs` (30 min)
8. Add `[Authorize(Roles = "...")]` to existing controllers (30 min)
9. Implement role assignment/revocation API (P1) (45 min)
10. Write integration tests for RBAC (45 min)
11. Update API documentation (15 min)
**Estimated Effort**: 4.5 hours
**Target Milestone**: M1
---
## 4. MCP Integration Requirements
### 4.1 Authentication System Capabilities for MCP
To support M2 (MCP Server Implementation) and M3 (ChatGPT Integration PoC), the authentication system must provide:
---
#### MCP-1: AI Agent Authentication
**Requirement**: AI tools must authenticate with ColaFlow using API tokens (not username/password)
**Implementation**:
- Generate long-lived API tokens (30-90 days) for AI agents
- API tokens stored in database (hashed) with metadata (agent name, permissions, expiration)
- API tokens map to User with `AIAgent` role
- Endpoint: **POST /api/auth/tokens** (generate API token for AI agent)
**Example**:
```json
POST /api/auth/tokens
{
"agentName": "ChatGPT-PRD-Generator",
"permissions": ["projects:read", "tasks:write_preview"],
"expiresInDays": 90
}
Response:
{
"token": "cola_live_sk_abc123...",
"expiresAt": "2026-02-01T00:00:00Z"
}
```
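Issuing an API token for MCP-1 could look like the sketch below. The `cola_live_sk_` prefix comes from the example response above; the 32-byte secret, the SHA-256 storage hash, and the `ApiTokens` / `IssuedApiToken` names are assumptions for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Token is returned to the caller once; TokenHash is what would be persisted.
public sealed record IssuedApiToken(string Token, string TokenHash, DateTime ExpiresAt);

// Hypothetical helper for illustration only.
public static class ApiTokens
{
    private const string Prefix = "cola_live_sk_"; // prefix taken from the example response

    public static IssuedApiToken Issue(DateTime nowUtc, int expiresInDays)
    {
        // 32 random bytes, hex-encoded, as the secret part of the token.
        string secret = Convert.ToHexString(RandomNumberGenerator.GetBytes(32)).ToLowerInvariant();
        string token = Prefix + secret;
        string hash = Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(token)));
        return new IssuedApiToken(token, hash, nowUtc.AddDays(expiresInDays));
    }
}
```

A recognizable prefix makes leaked tokens easier to spot in logs and secret scanners, while the stored hash keeps the database from holding usable credentials.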
---
#### MCP-2: AI Agent Role & Permissions
**Requirement**: AI agents must have restricted permissions (read + write-preview only)
**Implementation**:
- `AIAgent` role defined with permissions:
- **Read**: All projects, tasks, docs (tenant-scoped)
- **Write Preview**: Generate diffs for tasks/docs (not committed)
- **No Direct Write**: Cannot commit changes without human approval
- Authorization policies detect `AIAgent` role and enforce preview mode
**Example**:
```csharp
[Authorize(Roles = "Member,ProjectAdmin,TenantAdmin,AIAgent")] // AIAgent must be admitted, or the preview branch below is unreachable
[HttpPost("api/projects/{projectId}/tasks")]
public async Task<IActionResult> CreateTask(...)
{
if (User.IsInRole("AIAgent"))
{
// Generate preview, return for human approval
return Ok(new { preview = taskPreview, requiresApproval = true });
}
// Direct commit for human users
await _taskService.CreateTaskAsync(...);
return Created(...);
}
```
---
#### MCP-3: Multi-Turn Session Management
**Requirement**: AI agents need persistent sessions for multi-turn workflows (e.g., create PRD → generate tasks → update status)
**Implementation**:
- Refresh tokens for AI agents (90-day expiration)
- Session storage for AI agent context (e.g., current project, draft document ID)
- Session cleanup after 24 hours of inactivity
**Example Workflow**:
```
1. AI: Generate PRD draft → System: Creates draft (not committed), returns previewId
2. AI: Review PRD draft → System: Returns preview with previewId
3. Human: Approve PRD → System: Commits draft to database
4. AI: Generate tasks from PRD → System: Creates task previews
5. Human: Approve tasks → System: Commits tasks
```
---
#### MCP-4: Audit Trail for AI Actions
**Requirement**: All AI agent actions must be logged for compliance and debugging
**Implementation**:
- Audit log entries include:
- Actor: AI agent name (from JWT `sub` or `agent_name` claim)
- Action: Resource + Operation (e.g., "tasks.create_preview")
- Timestamp
- Request payload (diff)
- Approval status (pending, approved, rejected)
- Queryable audit log: **GET /api/audit?actorType=AIAgent**
---
#### MCP-5: Human Approval Workflow
**Requirement**: All AI write operations require human approval
**Implementation**:
- Preview storage: Store AI-generated changes in temporary table
- Approval API:
- **GET /api/previews/{previewId}** - View diff
- **POST /api/previews/{previewId}/approve** - Commit changes
- **POST /api/previews/{previewId}/reject** - Discard changes
- Preview expiration: Auto-delete after 24 hours
**Database Schema**:
```sql
CREATE TABLE Previews (
Id UUID PRIMARY KEY,
EntityType VARCHAR(50) NOT NULL, -- Task, Document, etc.
Operation VARCHAR(50) NOT NULL, -- Create, Update, Delete
Payload JSONB NOT NULL, -- Full entity data or diff
CreatedBy UUID NOT NULL REFERENCES Users(Id), -- AI agent user
CreatedAt TIMESTAMP NOT NULL DEFAULT NOW(),
ExpiresAt TIMESTAMP NOT NULL,
ApprovedBy UUID NULL REFERENCES Users(Id),
ApprovedAt TIMESTAMP NULL,
RejectedBy UUID NULL REFERENCES Users(Id),
RejectedAt TIMESTAMP NULL,
Status VARCHAR(20) NOT NULL DEFAULT 'Pending' -- Pending, Approved, Rejected, Expired
);
```
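The `Status` column implies a small state machine: a preview leaves `Pending` exactly once. A sketch under assumptions: the `Preview` class is hypothetical, the 24-hour lifetime comes from the implementation notes above, and an expired preview is marked `Expired` here instead of being deleted.

```csharp
using System;

public enum PreviewStatus { Pending, Approved, Rejected, Expired }

// Hypothetical in-memory model of one Previews row.
public sealed class Preview
{
    public PreviewStatus Status { get; private set; } = PreviewStatus.Pending;
    public DateTime ExpiresAt { get; }
    public Guid? DecidedBy { get; private set; }

    public Preview(DateTime createdAt) => ExpiresAt = createdAt.AddHours(24);

    public bool TryApprove(Guid reviewerId, DateTime now) => Decide(PreviewStatus.Approved, reviewerId, now);

    public bool TryReject(Guid reviewerId, DateTime now) => Decide(PreviewStatus.Rejected, reviewerId, now);

    // Only a pending, unexpired preview can be decided; every other transition is refused.
    private bool Decide(PreviewStatus outcome, Guid reviewerId, DateTime now)
    {
        if (Status != PreviewStatus.Pending) return false;
        if (now >= ExpiresAt)
        {
            Status = PreviewStatus.Expired;
            return false;
        }
        Status = outcome;
        DecidedBy = reviewerId;
        return true;
    }
}
```

The check that the reviewer is a human user (AC-RBAC-13) would sit in the endpoint's authorization policy and is not modeled here.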
---
#### MCP-6: Rate Limiting for AI Agents
**Requirement**: Prevent AI agents from overwhelming the system
**Implementation**:
- Rate limits per AI agent token:
- Read operations: 100 requests/minute
- Write preview operations: 10 requests/minute
- Commit operations: N/A (human-initiated)
- Return `429 Too Many Requests` when limit exceeded
- Use Redis or in-memory cache for rate limit tracking
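The per-token limits above can be illustrated with a fixed-window counter. This is a sketch of one possible algorithm, not a statement of what ColaFlow uses: `FixedWindowLimiter` is a hypothetical class, it is not thread-safe, and it keeps state in process memory where a shared store such as Redis would be needed across instances.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical fixed-window limiter keyed by API token ID.
public sealed class FixedWindowLimiter
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly Dictionary<string, (DateTime WindowStart, int Count)> _state = new();

    public FixedWindowLimiter(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    // Returns false when the caller should receive 429 Too Many Requests.
    public bool TryAcquire(string tokenId, DateTime now)
    {
        // Start a fresh window for unseen tokens or once the current window has elapsed.
        if (!_state.TryGetValue(tokenId, out var s) || now - s.WindowStart >= _window)
            s = (now, 0);

        if (s.Count >= _limit) return false;

        _state[tokenId] = (s.WindowStart, s.Count + 1);
        return true;
    }
}
```

Read and write-preview operations would each get their own instance, for example `new FixedWindowLimiter(100, TimeSpan.FromMinutes(1))` and `new FixedWindowLimiter(10, TimeSpan.FromMinutes(1))`.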
---
### 4.2 MCP Integration Readiness Checklist
For Day 5 implementation, ensure authentication system supports:
- [ ] **MCP-Ready-1**: AI agent user creation (User with `AIAgent` role)
- [ ] **MCP-Ready-2**: API token generation and validation (long-lived tokens)
- [ ] **MCP-Ready-3**: Role-based authorization (AIAgent role defined)
- [ ] **MCP-Ready-4**: Refresh tokens for multi-turn AI sessions
- [ ] **MCP-Ready-5**: Audit logging foundation (log actor role in all operations)
- [ ] **MCP-Ready-6**: Preview storage schema (P1 - can be added in M2)
---
## 5. Technical Constraints & Dependencies
### 5.1 Technology Stack
- **.NET 9.0**: Use latest C# 13 features
- **PostgreSQL**: Primary database (RBAC tables, refresh tokens)
- **Entity Framework Core 9.0**: ORM for database access
- **System.IdentityModel.Tokens.Jwt**: JWT token handling
- **Redis** (Optional): For refresh token storage (if high throughput needed)
---
### 5.2 Dependencies
#### Internal Dependencies
- **Day 4 Completion**: JWT service, password hashing, authentication middleware
- **Database Migrations**: Existing `IdentityDbContext` must be migrated
- **Tenant & User Entities**: Must support role relationships
#### External Dependencies
- **PostgreSQL Instance**: Running and accessible
- **Configuration**: `appsettings.json` updated with token lifetimes
- **Testing Environment**: Integration tests require test database
---
### 5.3 Breaking Changes
#### Refresh Token Implementation
- **Breaking**: Access token lifetime changes from 60 min → 15 min
- **Migration Path**: Clients must implement token refresh logic
- **Backward Compatibility**: Old tokens valid until expiration (no immediate break)
#### RBAC Implementation
- **Breaking**: Existing users have no roles (must assign default role in migration)
- **Migration Path**: Data migration to assign `TenantAdmin` to first user per tenant
- **Backward Compatibility**: Endpoints without `[Authorize(Roles)]` remain accessible
---
### 5.4 Testing Requirements
#### Refresh Token Tests
1. Token refresh succeeds with valid refresh token
2. Token refresh fails with expired refresh token
3. Token refresh fails with revoked refresh token
4. Token rotation invalidates old refresh token
5. Logout revokes refresh token
6. Concurrent refresh attempts handled correctly (P1)
#### RBAC Tests
1. TenantAdmin can access admin endpoints
2. Member cannot access admin endpoints (403 Forbidden)
3. Guest has read-only access
4. AIAgent role triggers preview mode
5. Role claims present in JWT
6. Authorization policies enforce role requirements
---
## 6. Next Steps After Day 5
### Day 6-7: Complete M1 Core Project Module
- Implement Project/Epic/Story/Task entities
- Implement Kanban workflow (To Do → In Progress → Done)
- Basic audit log for entity changes
### Day 8-9: Email Verification + Password Reset
- Email verification flow (P1 from this document)
- Password reset with secure tokens
- Email service integration (SendGrid)
### Day 10-12: M2 MCP Server Foundation
- Implement Preview storage and approval API (MCP-5)
- Implement API token generation for AI agents (MCP-1)
- Rate limiting for AI agents (MCP-6)
- MCP protocol implementation (Resources + Tools)
---
## 7. Success Metrics
### Day 5 Success Criteria
#### Refresh Token
- [ ] Access token lifetime: 15 minutes
- [ ] Refresh token lifetime: 7 days
- [ ] Token refresh endpoint response time: < 200ms
- [ ] All refresh token tests passing
#### RBAC
- [ ] 5 system roles seeded in database
- [ ] JWT includes role claims
- [ ] Admin endpoints protected with role-based authorization
- [ ] All RBAC tests passing
#### MCP Readiness
- [ ] AIAgent role defined and assignable
- [ ] Role-based authorization policies configured
- [ ] Audit logging includes actor role (foundation)
---
## 8. Risk Mitigation
### Risk 1: Refresh Token Implementation Complexity
**Risk**: Token rotation logic may introduce race conditions
**Mitigation**: Use database transactions, test concurrent refresh attempts
**Fallback**: Implement simple refresh without rotation (P0), add rotation in P1
### Risk 2: RBAC Migration Breaks Existing Users
**Risk**: Existing users have no roles, which breaks the auth flow once role checks are enforced
**Mitigation**: Data migration assigns default roles before deploying RBAC
**Fallback**: Add fallback logic (users without roles get Member role temporarily)
### Risk 3: Day 5 Scope Too Large
**Risk**: Cannot complete both features in 1 day
**Mitigation**: Prioritize Refresh Token (P0), defer RBAC project-level roles to Day 6
**Fallback**: Complete Refresh Token only, move RBAC to Day 6
---
## 9. Approval & Sign-Off
### Stakeholders
- **Product Manager**: Approved
- **Architect**: Pending review
- **Backend Lead**: Pending review
- **Security Team**: Pending review (refresh token security)
### Next Steps
1. Review this PRD with architect and backend lead
2. Create detailed technical design for refresh token storage (database vs. Redis)
3. Begin Day 5 implementation
---
## Appendix A: Alternative Approaches Considered
### Refresh Token Storage: Database vs. Redis
#### Option 1: PostgreSQL (Recommended)
**Pros**:
- Simple setup, no additional infrastructure
- ACID guarantees for token rotation
- Easy audit trail integration
**Cons**:
- Slower than Redis (but < 200ms acceptable)
- Database load for high-traffic scenarios
**Decision**: Use PostgreSQL for M1-M3, evaluate Redis for M4-M6 if needed
---
#### Option 2: Redis
**Pros**:
- Extremely fast (< 10ms lookup)
- TTL-based automatic expiration
- Scales horizontally
**Cons**:
- Additional infrastructure complexity
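- No ACID transactions with rollback (MULTI/EXEC and Lua scripts run atomically, but rotation logic must be written around them to avoid race conditions)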
- Audit trail requires separate logging
**Decision**: Defer to M4+ if performance bottleneck identified
---
### RBAC Implementation: Enum vs. Database Roles
#### Option 1: Database Roles (Recommended)
**Pros**:
- Flexible, supports custom roles in future
- Queryable, auditable
- Supports project-level roles
**Cons**:
- More complex schema
- Requires migration for role changes
**Decision**: Use database roles for extensibility
---
#### Option 2: Enum Roles
**Pros**:
- Simple, type-safe in C#
- No database lookups
**Cons**:
- Cannot add custom roles without code changes
- No project-level role support
**Decision**: Rejected - too rigid for M2+ requirements
---
## Appendix B: References
- [RFC 6749: OAuth 2.0](https://datatracker.ietf.org/doc/html/rfc6749) - Refresh token spec
- [OWASP Authentication Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Authentication_Cheat_Sheet.html)
- [ASP.NET Core Authorization](https://learn.microsoft.com/en-us/aspnet/core/security/authorization/introduction)
- ColaFlow Product Plan: `product.md`
- Day 4 Implementation: `DAY4-IMPLEMENTATION-SUMMARY.md`
---
**Document Version**: 1.0
**Last Updated**: 2025-11-03
**Next Review**: Day 6 (Post-Implementation Review)