| story_id | sprint_id | status | priority | assignee | created_date | story_type | estimated_weeks |
|---|---|---|---|---|---|---|---|
| story_0 | sprint_5 | not_started | P0 | backend | 2025-11-09 | epic | 8 |
Story 0 (EPIC): Integrate Microsoft .NET MCP SDK
Type: Epic / Feature Story
Priority: P0 - Critical Infrastructure Improvement
Estimated Effort: 8 weeks (40 working days)
Epic Goal
Migrate ColaFlow from its custom MCP implementation to Microsoft's official .NET MCP SDK using a hybrid architecture. The SDK will handle the protocol layer, Tool/Resource registration, and transport, while ColaFlow retains its own business logic (Diff Preview, multi-tenant isolation, Pending Changes).
Business Value
Why This Matters
- Code Reduction: 60-70% less boilerplate code (protocol parsing, JSON-RPC, handshake)
- Performance Gain: 30-40% faster response times (SDK optimizations)
- Maintenance: Microsoft-maintained protocol updates (no manual updates)
- Standard Compliance: 100% MCP specification compliance guaranteed
- Developer Experience: Attribute-based registration (cleaner, more intuitive)
Success Metrics
- Code Reduction: Remove 500-700 lines of custom protocol code
- Performance: ≥ 20% response time improvement
- Test Coverage: Maintain ≥ 80% coverage
- Zero Breaking Changes: All existing MCP clients work without changes
- SDK Integration: 100% of Tools and Resources migrated
Research Context
Research Report: docs/research/mcp-sdk-integration-research.md
Key findings from research team:
- SDK Maturity: Production-ready (v1.0+), 4000+ GitHub stars
- Architecture Fit: Excellent fit with ColaFlow's Clean Architecture
- Attribute System: [McpTool], [McpResource] attributes simplify registration
- Transport Options: stdio (CLI), HTTP/SSE (Server), WebSocket (future)
- Performance: Faster JSON parsing, optimized middleware
- Compatibility: Supports Claude Desktop, Continue, Cline
Hybrid Architecture Strategy
What SDK Handles (Replace Custom Code)
┌──────────────────────────────────────┐
│ Microsoft .NET MCP SDK │
├──────────────────────────────────────┤
│ ✅ Protocol Layer │
│ - JSON-RPC 2.0 parsing │
│ - MCP handshake (initialize) │
│ - Request/response routing │
│ - Error handling │
│ │
│ ✅ Transport Layer │
│ - stdio (Standard In/Out) │
│ - HTTP/SSE (Server-Sent Events) │
│ - WebSocket (future) │
│ │
│ ✅ Registration System │
│ - Attribute-based discovery │
│ - Tool/Resource/Prompt catalog │
│ - Schema validation │
└──────────────────────────────────────┘
What ColaFlow Keeps (Business Logic)
┌──────────────────────────────────────┐
│ ColaFlow Business Layer │
├──────────────────────────────────────┤
│ 🔒 Security & Multi-Tenant │
│ - TenantContext extraction │
│ - API Key authentication │
│ - Field-level permissions │
│ │
│ 🔍 Diff Preview System │
│ - Before/after snapshots │
│ - Changed fields detection │
│ - HTML diff generation │
│ │
│ ✅ Approval Workflow │
│ - PendingChange management │
│ - Human approval required │
│ - SignalR notifications │
│ │
│ 📊 Advanced Features │
│ - Redis caching │
│ - Audit logging │
│ - Rate limiting │
└──────────────────────────────────────┘
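The "changed fields detection" step in the Diff Preview box can be illustrated with a minimal sketch. It assumes snapshots are plain field-name/value dictionaries; `SnapshotDiff`, `FieldChange`, and the sample fields are hypothetical names chosen for this example and are not taken from ColaFlow's DiffPreviewService.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Usage: compare a before/after snapshot of a hypothetical issue.
var before = new Dictionary<string, object?>
{
    ["title"] = "Login bug",
    ["status"] = "todo",
    ["assignee"] = null,
};
var after = new Dictionary<string, object?>
{
    ["title"] = "Login bug",
    ["status"] = "in_progress",
    ["assignee"] = "alice",
};

foreach (var change in SnapshotDiff.ChangedFields(before, after))
{
    Console.WriteLine($"{change.Field}: {change.Before} -> {change.After}");
}

// One changed field with its old and new value (hypothetical type).
public sealed record FieldChange(string Field, object? Before, object? After);

public static class SnapshotDiff
{
    // Returns the fields whose values differ between the two snapshots,
    // including fields present in only one of them, ordered by field name.
    public static IReadOnlyList<FieldChange> ChangedFields(
        IReadOnlyDictionary<string, object?> before,
        IReadOnlyDictionary<string, object?> after)
    {
        return before.Keys
            .Union(after.Keys)
            .OrderBy(name => name, StringComparer.Ordinal)
            .Select(name => new FieldChange(
                name,
                before.TryGetValue(name, out var b) ? b : null,
                after.TryGetValue(name, out var a) ? a : null))
            .Where(change => !object.Equals(change.Before, change.After))
            .ToList();
    }
}
```

In this example only `assignee` and `status` are reported, because `title` is identical in both snapshots. A real implementation would likely also need to handle nested objects and collections, which this sketch ignores.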
Migration Phases (8 Weeks)
Phase 1: Foundation (Week 1-2) - Story 13
Goal: Setup SDK infrastructure and validate compatibility
Tasks:
- Install Microsoft.MCP NuGet package
- Create PoC Tool/Resource using SDK
- Verify compatibility with existing architecture
- Performance baseline benchmarks
- Team training on SDK APIs
Deliverables:
- SDK installed and configured
- PoC validates SDK works with ColaFlow
- Performance baseline report
- Migration guide for developers
Acceptance Criteria:
- SDK integrated into ColaFlow.Modules.Mcp project
- PoC Tool successfully called from Claude Desktop
- Performance baseline recorded (response time, throughput)
- Zero conflicts with existing Clean Architecture
Phase 2: Tool Migration (Week 3-4) - Story 14
Goal: Migrate all 10 Tools to SDK attribute-based registration
Tools to Migrate:
- create_issue → [McpTool] attribute
- update_status → [McpTool] attribute
- add_comment → [McpTool] attribute
- assign_issue → [McpTool] attribute
- create_sprint → [McpTool] attribute
- update_sprint → [McpTool] attribute
- log_decision → [McpTool] attribute
- generate_prd → [McpTool] attribute
- split_epic → [McpTool] attribute
- detect_risks → [McpTool] attribute
Migration Pattern:
// BEFORE (Custom)
public class CreateIssueTool : IMcpTool
{
    public string Name => "create_issue";
    public string Description => "Create a new issue";
    public McpToolInputSchema InputSchema => ...;

    public async Task<McpToolResult> ExecuteAsync(...)
    {
        // Custom routing logic
    }
}

// AFTER (SDK)
[McpTool(
    Name = "create_issue",
    Description = "Create a new issue (Epic/Story/Task)"
)]
public class CreateIssueTool
{
    [McpToolParameter(Required = true)]
    public Guid ProjectId { get; set; }

    [McpToolParameter(Required = true)]
    public string Title { get; set; }

    [McpToolParameter]
    public string? Description { get; set; }

    public async Task<McpToolResult> ExecuteAsync(
        McpContext context,
        CancellationToken cancellationToken)
    {
        // Business logic stays the same
        // DiffPreviewService integration preserved
    }
}
Deliverables:
- All 10 Tools migrated to SDK attributes
- DiffPreviewService integration maintained
- Integration tests updated
- Performance comparison report
Acceptance Criteria:
- All Tools work with [McpTool] attribute
- Diff Preview workflow preserved (no breaking changes)
- Integration tests pass (>80% coverage)
- Performance improvement measured (target: 20%+)
Phase 3: Resource Migration (Week 5) - Story 15
Goal: Migrate all 11 Resources to SDK attribute-based registration
Resources to Migrate:
- projects.list → [McpResource]
- projects.get/{id} → [McpResource]
- issues.search → [McpResource]
- issues.get/{id} → [McpResource]
- sprints.current → [McpResource]
- sprints.list → [McpResource]
- users.list → [McpResource]
- docs.prd/{projectId} → [McpResource]
- reports.daily/{date} → [McpResource]
- reports.velocity → [McpResource]
- audit.history/{entityId} → [McpResource]
Migration Pattern:
// BEFORE (Custom)
public class ProjectsListResource : IMcpResource
{
    public string Uri => "colaflow://projects.list";
    public string Name => "Projects List";

    public async Task<McpResourceContent> GetContentAsync(...)
    {
        // Custom logic
    }
}

// AFTER (SDK)
[McpResource(
    Uri = "colaflow://projects.list",
    Name = "Projects List",
    Description = "List all projects in current tenant",
    MimeType = "application/json"
)]
public class ProjectsListResource
{
    private readonly IProjectRepository _repo;
    private readonly ITenantContext _tenant;
    // IDistributedCache rather than IMemoryCache: the Redis-backed cache is preserved
    private readonly IDistributedCache _cache;

    public async Task<McpResourceContent> GetContentAsync(
        McpContext context,
        CancellationToken cancellationToken)
    {
        // Business logic stays the same
        // Multi-tenant filtering preserved
        // Redis caching preserved
    }
}
Deliverables:
- All 11 Resources migrated to SDK attributes
- Multi-tenant isolation verified
- Redis caching maintained
- Performance tests passed
Acceptance Criteria:
- All Resources work with [McpResource] attribute
- Multi-tenant isolation 100% verified
- Redis cache hit rate > 80% maintained
- Response time < 200ms (P95)
Phase 4: Transport Layer (Week 6) - Story 16
Goal: Replace custom HTTP middleware with SDK transport
Current Custom Transport:
// Custom middleware (will be removed)
app.UseMiddleware<McpProtocolMiddleware>();
app.UseMiddleware<ApiKeyAuthMiddleware>();
SDK Transport Configuration:
// SDK-based transport
builder.Services.AddMcpServer(options =>
{
    // stdio transport (for CLI tools like Claude Desktop)
    options.UseStdioTransport();

    // HTTP/SSE transport (for web-based clients)
    options.UseHttpTransport(http =>
    {
        http.BasePath = "/mcp";
        http.EnableSse = true; // Server-Sent Events
    });

    // Custom authentication (preserve API Key auth)
    options.AddAuthentication<ApiKeyAuthHandler>();

    // Custom authorization (preserve field-level permissions)
    options.AddAuthorization<FieldLevelAuthHandler>();
});
Deliverables:
- Custom middleware removed
- SDK transport configured (stdio + HTTP/SSE)
- API Key authentication migrated to SDK pipeline
- Field-level permissions preserved
Acceptance Criteria:
- stdio transport works (Claude Desktop compatibility)
- HTTP/SSE transport works (web client compatibility)
- API Key authentication functional
- Field-level permissions enforced
- Zero breaking changes for existing clients
Phase 5: Testing & Optimization (Week 7-8) - Story 17
Goal: Comprehensive testing, performance tuning, and documentation
Week 7: Integration Testing & Bug Fixes
Tasks:
- End-to-End Testing
  - Claude Desktop integration test (stdio)
  - Web client integration test (HTTP/SSE)
  - Multi-tenant isolation verification
  - Diff Preview workflow validation
- Performance Testing
  - Benchmark Tools (target: 20%+ improvement)
  - Benchmark Resources (target: 30%+ improvement)
  - Concurrent request testing (100 req/s)
  - Memory usage profiling
- Security Audit
  - API Key brute force test
  - Cross-tenant access attempts
  - Field-level permission bypass tests
  - SQL injection attempts
- Bug Fixes
  - Fix integration test failures
  - Address performance bottlenecks
  - Fix security vulnerabilities (if found)
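The cross-tenant access check in the security audit can be sketched as follows. This is an illustration under stated assumptions, not ColaFlow's actual data layer: `Project` and `ProjectStore` are hypothetical in-memory stand-ins, and the point is only that every read path takes the caller's tenant ID and filters on it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var tenantA = Guid.NewGuid();
var tenantB = Guid.NewGuid();
var betaId = Guid.NewGuid();
var store = new ProjectStore(new[]
{
    new Project(Guid.NewGuid(), tenantA, "Alpha"),
    new Project(betaId, tenantB, "Beta"),
});

// A caller authenticated as tenant A must never see tenant B's data.
var visibleToA = store.ListForTenant(tenantA);
if (visibleToA.Any(p => p.TenantId != tenantA))
    throw new InvalidOperationException("Cross-tenant leak detected in list");

// Guessing another tenant's project ID must not return that project.
if (store.GetForTenant(tenantA, betaId) != null)
    throw new InvalidOperationException("Cross-tenant leak detected in lookup");

Console.WriteLine($"Tenant A sees {visibleToA.Count} project(s)");

// Hypothetical entity carrying its owning tenant.
public sealed record Project(Guid Id, Guid TenantId, string Name);

// Hypothetical store in which every read is scoped to a tenant.
public sealed class ProjectStore
{
    private readonly IReadOnlyList<Project> _projects;

    public ProjectStore(IEnumerable<Project> projects) => _projects = projects.ToList();

    public IReadOnlyList<Project> ListForTenant(Guid tenantId) =>
        _projects.Where(p => p.TenantId == tenantId).ToList();

    // Returns null for a project that exists but belongs to another tenant.
    public Project? GetForTenant(Guid tenantId, Guid projectId) =>
        _projects.FirstOrDefault(p => p.Id == projectId && p.TenantId == tenantId);
}
```

The same two assertions (list contains only own-tenant rows, lookup by foreign ID yields nothing) could be run against each migrated Resource after Phase 3.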
Week 8: Documentation & Production Readiness
Tasks:
- Architecture Documentation
  - Update mcp-server-architecture.md with SDK details
  - Create SDK migration guide for developers
  - Document hybrid architecture decisions
  - Add troubleshooting guide
- API Documentation
  - Update OpenAPI/Swagger specs
  - Document Tool parameter schemas
  - Document Resource URI patterns
  - Add example requests/responses
- Code Cleanup
  - Remove old custom protocol code
  - Delete obsolete interfaces (IMcpTool, IMcpResource)
  - Clean up unused NuGet packages
  - Update code comments
- Production Readiness
  - Deploy to staging environment
  - Smoke testing with real AI clients
  - Performance validation
  - Final code review
Deliverables:
- Comprehensive test suite (>80% coverage)
- Performance report (vs. baseline)
- Security audit report (zero CRITICAL issues)
- Updated architecture documentation
- Production deployment guide
Acceptance Criteria:
- Integration tests pass (>80% coverage)
- Performance improved by ≥20%
- Security audit clean (0 CRITICAL, 0 HIGH)
- Documentation complete and reviewed
- Production-ready checklist signed off
Stories Breakdown
This Epic is broken down into 5 child Stories:
- Story 13 - MCP SDK Foundation & PoC (Week 1-2) - not_started
- Story 14 - Tool Migration to SDK (Week 3-4) - not_started
- Story 15 - Resource Migration to SDK (Week 5) - not_started
- Story 16 - Transport Layer Migration (Week 6) - not_started
- Story 17 - Testing & Optimization (Week 7-8) - not_started
Progress: 0/5 stories completed (0%)
Risk Assessment
High-Priority Risks
| Risk ID | Description | Impact | Probability | Mitigation |
|---|---|---|---|---|
| RISK-001 | SDK breaking changes during migration | High | Low | Lock SDK version, gradual migration |
| RISK-002 | Performance regression | High | Medium | Continuous benchmarking, rollback plan |
| RISK-003 | DiffPreview integration conflicts | Medium | Medium | Thorough testing, preserve interfaces |
| RISK-004 | Client compatibility issues | High | Low | Test with Claude Desktop early |
| RISK-005 | Multi-tenant isolation bugs | Critical | Very Low | 100% test coverage, security audit |
Mitigation Strategies
- Phased Migration: 5 phases allow early detection of issues
- Parallel Systems: Keep old code until SDK fully validated
- Feature Flags: Enable/disable SDK via configuration
- Rollback Plan: Can revert to custom implementation if needed
- Continuous Testing: Run tests after each phase
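The "Parallel Systems", "Feature Flags", and "Rollback Plan" items fit together roughly as in the sketch below. All names here are assumptions made for illustration: `IToolDispatcher`, the two dispatcher classes, and the `MCP_USE_SDK` environment variable are hypothetical, and a real deployment would more likely read the flag from application configuration.

```csharp
using System;

// Usage: read the flag (hypothetical variable name) and pick a pipeline.
var useSdk = string.Equals(
    Environment.GetEnvironmentVariable("MCP_USE_SDK"),
    "true",
    StringComparison.OrdinalIgnoreCase);

IToolDispatcher dispatcher = ToolDispatcherFactory.Create(useSdk);
Console.WriteLine(dispatcher.Dispatch("create_issue"));

// Common contract so callers do not care which pipeline is active.
public interface IToolDispatcher
{
    string Dispatch(string toolName);
}

// Stand-in for the existing custom protocol pipeline (kept during migration).
public sealed class CustomToolDispatcher : IToolDispatcher
{
    public string Dispatch(string toolName) => $"custom:{toolName}";
}

// Stand-in for the SDK-backed pipeline.
public sealed class SdkToolDispatcher : IToolDispatcher
{
    public string Dispatch(string toolName) => $"sdk:{toolName}";
}

public static class ToolDispatcherFactory
{
    // Setting the flag back to false is the rollback path: the old
    // implementation is still present and selected without a redeploy of code.
    public static IToolDispatcher Create(bool useSdk)
    {
        if (useSdk)
        {
            return new SdkToolDispatcher();
        }
        return new CustomToolDispatcher();
    }
}
```

Keeping both implementations behind one interface is what makes the rollback cheap; the cost is that the old code cannot be deleted until the flag itself is removed in the Week 8 cleanup.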
Dependencies
Prerequisites
- ✅ Sprint 5 Phase 1-3 infrastructure (Stories 1-12)
- ✅ Custom MCP implementation complete and working
- ✅ DiffPreview service production-ready
- ✅ Multi-tenant security verified
External Dependencies
- Microsoft .NET MCP SDK v1.0+ (NuGet)
- Claude Desktop 1.0+ (for testing)
- Continue VS Code Extension (for testing)
Technical Requirements
- .NET 9+ (already installed)
- PostgreSQL 15+ (already configured)
- Redis 7+ (already configured)
Acceptance Criteria (Epic-Level)
Functional Requirements
- All 10 Tools migrated to SDK [McpTool] attributes
- All 11 Resources migrated to SDK [McpResource] attributes
- stdio transport works (Claude Desktop compatible)
- HTTP/SSE transport works (web client compatible)
- Diff Preview workflow preserved (no breaking changes)
- Multi-tenant isolation 100% verified
- API Key authentication functional
- Field-level permissions enforced
Performance Requirements
- Response time improved by ≥20%
- Tool execution time < 500ms (P95)
- Resource query time < 200ms (P95)
- Throughput ≥100 requests/second
- Memory usage optimized (no leaks)
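The P95 targets above need an agreed way to compute a percentile from benchmark samples. The sketch below uses the nearest-rank definition, which is one common choice and an assumption here, since the document does not say which method the benchmarks use. `LatencyStats` and the sample values are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical response times in milliseconds from a benchmark run.
var samplesMs = new double[] { 120, 95, 180, 210, 150, 130, 175, 160, 140, 110 };

double p95 = LatencyStats.Percentile(samplesMs, 95);
Console.WriteLine($"P95 = {p95} ms; within the 200 ms Resource budget: {p95 < 200}");

public static class LatencyStats
{
    // Nearest-rank percentile: the smallest sample such that at least
    // `percentile` percent of all samples are less than or equal to it.
    public static double Percentile(IReadOnlyCollection<double> samples, int percentile)
    {
        if (samples.Count == 0)
        {
            throw new ArgumentException("At least one sample is required.", nameof(samples));
        }
        if (percentile < 1 || percentile > 100)
        {
            throw new ArgumentOutOfRangeException(nameof(percentile));
        }

        var sorted = samples.OrderBy(x => x).ToArray();

        // Integer form of ceil(percentile * n / 100), avoiding floating-point rounding.
        int rank = (percentile * sorted.Length + 99) / 100;
        return sorted[rank - 1];
    }
}
```

With only ten samples the 95th percentile is the slowest sample (210 ms here), so this run would miss the 200 ms Resource target even though nine of ten requests were under it. Benchmarks for the acceptance criteria would need far more samples for P95 to be meaningful.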
Quality Requirements
- Test coverage ≥80%
- Zero CRITICAL security vulnerabilities
- Zero HIGH security vulnerabilities
- Code duplication <5%
- All integration tests pass
Documentation Requirements
- Architecture documentation updated
- API documentation complete
- Migration guide published
- Troubleshooting guide published
- Code examples updated
Success Metrics
Code Quality
- Lines Removed: 500-700 lines of custom protocol code
- Code Duplication: <5%
- Test Coverage: ≥80%
- Security Score: 0 CRITICAL, 0 HIGH vulnerabilities
Performance
- Response Time: 20-40% improvement
- Throughput: 100+ req/s (from 70 req/s)
- Memory Usage: 10-20% reduction
- Cache Hit Rate: >80% maintained
Developer Experience
- Onboarding Time: 50% faster (simpler SDK APIs)
- Code Readability: +30% (attributes vs. manual registration)
- Maintenance Effort: -60% (Microsoft maintains protocol)
Related Documents
Research & Design
Sprint Planning
- Sprint 5 Plan
- Product Roadmap - M2 section
Technical References
Notes
Why Hybrid Architecture?
Question: Why not use 100% SDK?
Answer: ColaFlow has unique business requirements:
- Diff Preview: SDK doesn't provide preview mechanism (ColaFlow custom)
- Approval Workflow: SDK doesn't have human-in-the-loop (ColaFlow custom)
- Multi-Tenant: SDK doesn't enforce tenant isolation (ColaFlow custom)
- Field Permissions: SDK doesn't have field-level security (ColaFlow custom)
The hybrid approach gets the best of both worlds:
- SDK handles boring protocol stuff (60-70% code reduction)
- ColaFlow handles business-critical stuff (security, approval)
What Gets Deleted?
Custom Code to Remove (~700 lines):
- McpProtocolHandler.cs (JSON-RPC parsing)
- McpProtocolMiddleware.cs (HTTP middleware)
- IMcpTool.cs interface (replaced by SDK attributes)
- IMcpResource.cs interface (replaced by SDK attributes)
- McpRegistry.cs (replaced by SDK discovery)
- McpRequest.cs / McpResponse.cs DTOs (SDK provides)
Custom Code to Keep (~1200 lines):
- DiffPreviewService.cs (business logic)
- PendingChangeService.cs (approval workflow)
- ApiKeyAuthHandler.cs (security)
- FieldLevelAuthHandler.cs (permissions)
- TenantContextService.cs (multi-tenant)
Timeline Justification
Why 8 weeks?
- Week 1-2: PoC + training (can't rush, need to understand SDK)
- Week 3-4: 10 Tools migration (careful testing required)
- Week 5: 11 Resources migration (simpler than Tools)
- Week 6: Transport layer (critical, can't break clients)
- Week 7-8: Testing + docs (quality gate, can't skip)
Could it be faster?
- Yes, if we skip testing (NOT RECOMMENDED)
- Yes, if we accept higher risk (NOT RECOMMENDED)
- This is already an aggressive timeline (8 weeks across 5 phases, an average of 1.6 weeks per phase)
Post-Migration Benefits
Developer Velocity:
- New Tool creation: 30 min (was 2 hours)
- New Resource creation: 15 min (was 1 hour)
- Onboarding new developers: 2 days (was 5 days)
Maintenance Burden:
- Protocol updates: 0 hours (Microsoft handles)
- Bug fixes: -60% effort (less custom code)
- Feature additions: +40% faster (SDK simplifies)
Created: 2025-11-09 by Product Manager Agent
Epic Owner: Backend Team Lead
Estimated Start: 2025-11-27 (After Sprint 5 Phase 1-3)
Estimated Completion: 2026-01-22 (Week 8 of Sprint 5)
Status: Not Started (planning complete)