---
story_id: story_0
sprint_id: sprint_5
status: not_started
priority: P0
assignee: backend
created_date: 2025-11-09
story_type: epic
estimated_weeks: 8
---

# Story 0 (EPIC): Integrate Microsoft .NET MCP SDK

**Type**: Epic / Feature Story
**Priority**: P0 - Critical Infrastructure Improvement
**Estimated Effort**: 8 weeks (40 working days)

## Epic Goal

Migrate ColaFlow from its custom MCP implementation to Microsoft's official .NET MCP SDK using a hybrid architecture. The SDK will handle the protocol layer, Tool/Resource registration, and transport, while ColaFlow retains its own business logic (Diff Preview, multi-tenant isolation, Pending Changes).

## Business Value

### Why This Matters

1. **Code Reduction**: 60-70% less boilerplate code (protocol parsing, JSON-RPC, handshake)
2. **Performance Gain**: 30-40% faster response times expected from SDK optimizations, to be confirmed against the Phase 1 baseline
3. **Maintenance**: Microsoft maintains protocol updates, so no manual protocol changes are needed
4. **Standards Compliance**: MCP specification compliance comes from the SDK rather than from hand-written protocol code
5. **Developer Experience**: Attribute-based registration (cleaner, more intuitive)

### Success Metrics

- **Code Reduction**: Remove 500-700 lines of custom protocol code
- **Performance**: ≥ 20% response time improvement
- **Test Coverage**: Maintain ≥ 80% coverage
- **Zero Breaking Changes**: All existing MCP clients work without changes
- **SDK Integration**: 100% of Tools and Resources migrated

## Research Context

**Research Report**: `docs/research/mcp-sdk-integration-research.md`

Key findings from the research team:
- **SDK Maturity**: Production-ready (v1.0+), 4000+ GitHub stars
- **Architecture Fit**: Excellent fit with ColaFlow's Clean Architecture
- **Attribute System**: `[McpTool]`, `[McpResource]` attributes simplify registration
- **Transport Options**: stdio (CLI), HTTP/SSE (Server), WebSocket (future)
- **Performance**: Faster JSON parsing, optimized middleware
- **Compatibility**: Supports Claude Desktop, Continue, Cline

## Hybrid Architecture Strategy

### What SDK Handles (Replace Custom Code)

```
┌──────────────────────────────────────┐
│  Microsoft .NET MCP SDK              │
├──────────────────────────────────────┤
│  ✅ Protocol Layer                   │
│     - JSON-RPC 2.0 parsing           │
│     - MCP handshake (initialize)     │
│     - Request/response routing       │
│     - Error handling                 │
│                                      │
│  ✅ Transport Layer                  │
│     - stdio (Standard In/Out)        │
│     - HTTP/SSE (Server-Sent Events)  │
│     - WebSocket (future)             │
│                                      │
│  ✅ Registration System              │
│     - Attribute-based discovery      │
│     - Tool/Resource/Prompt catalog   │
│     - Schema validation              │
└──────────────────────────────────────┘
```

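
To make the protocol layer concrete, here is a minimal sketch of reading an MCP `initialize` handshake message, which is a JSON-RPC 2.0 request. It uses only `System.Text.Json`; the helper name `ParseInitialize`, the client name, and the protocol version string are illustrative assumptions, and the sketch assumes a numeric `id` (JSON-RPC also allows string IDs). It is not the SDK's implementation, only an indication of the kind of code the SDK takes over.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper: extracts the fields a server needs from an MCP
// `initialize` request. The SDK performs this parsing internally.
static (string Method, int Id, string ProtocolVersion) ParseInitialize(string json)
{
    using JsonDocument doc = JsonDocument.Parse(json);
    JsonElement root = doc.RootElement;

    // JSON-RPC 2.0 requires the literal version string "2.0".
    if (root.GetProperty("jsonrpc").GetString() != "2.0")
        throw new FormatException("Not a JSON-RPC 2.0 message.");

    string method = root.GetProperty("method").GetString() ?? "";
    int id = root.GetProperty("id").GetInt32(); // assumes a numeric id
    string version = root.GetProperty("params")
                         .GetProperty("protocolVersion")
                         .GetString() ?? "";
    return (method, id, version);
}

// Example handshake message; the client name and version are made up.
const string request = """
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": { "name": "example-client", "version": "0.1.0" }
  }
}
""";

var parsed = ParseInitialize(request);
Console.WriteLine($"{parsed.Method} (id {parsed.Id}), protocol {parsed.ProtocolVersion}");
```
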
### What ColaFlow Keeps (Business Logic)

```
┌──────────────────────────────────────┐
│  ColaFlow Business Layer             │
├──────────────────────────────────────┤
│  🔒 Security & Multi-Tenant          │
│     - TenantContext extraction       │
│     - API Key authentication         │
│     - Field-level permissions        │
│                                      │
│  🔍 Diff Preview System              │
│     - Before/after snapshots         │
│     - Changed fields detection       │
│     - HTML diff generation           │
│                                      │
│  ✅ Approval Workflow                │
│     - PendingChange management       │
│     - Human approval required        │
│     - SignalR notifications          │
│                                      │
│  📊 Advanced Features                │
│     - Redis caching                  │
│     - Audit logging                  │
│     - Rate limiting                  │
└──────────────────────────────────────┘
```

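
The "changed fields detection" step of the Diff Preview system can be sketched as a comparison of two snapshots. This is a minimal illustration, not ColaFlow's `DiffPreviewService`: the helper name `ChangedFields` and the dictionary-based snapshot shape are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns the names of fields whose values differ
// between two snapshots, including fields that were added or removed.
static List<string> ChangedFields(
    IReadOnlyDictionary<string, object?> before,
    IReadOnlyDictionary<string, object?> after)
{
    return before.Keys
        .Union(after.Keys)
        .Where(key =>
        {
            bool inBefore = before.TryGetValue(key, out object? oldValue);
            bool inAfter = after.TryGetValue(key, out object? newValue);
            return inBefore != inAfter || !object.Equals(oldValue, newValue);
        })
        .OrderBy(key => key, StringComparer.Ordinal)
        .ToList();
}

// Made-up issue snapshots taken before and after a proposed change.
var beforeSnapshot = new Dictionary<string, object?>
{
    ["Title"] = "Login page",
    ["Status"] = "Todo",
    ["Assignee"] = null,
};
var afterSnapshot = new Dictionary<string, object?>
{
    ["Title"] = "Login page",
    ["Status"] = "InProgress",
    ["Assignee"] = "alice",
};

Console.WriteLine(string.Join(", ", ChangedFields(beforeSnapshot, afterSnapshot)));
```

Because this logic depends only on the snapshots, it is unaffected by whether the request arrived through the custom protocol code or the SDK.
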
## Migration Phases (8 Weeks)

### Phase 1: Foundation (Week 1-2) - Story 13

**Goal**: Set up SDK infrastructure and validate compatibility

**Tasks**:
1. Install `Microsoft.MCP` NuGet package
2. Create PoC Tool/Resource using SDK
3. Verify compatibility with existing architecture
4. Performance baseline benchmarks
5. Team training on SDK APIs

**Deliverables**:
- SDK installed and configured
- PoC validates SDK works with ColaFlow
- Performance baseline report
- Migration guide for developers

**Acceptance Criteria**:
- [ ] SDK integrated into ColaFlow.Modules.Mcp project
- [ ] PoC Tool successfully called from Claude Desktop
- [ ] Performance baseline recorded (response time, throughput)
- [ ] Zero conflicts with existing Clean Architecture

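
The performance baseline and the later "≥20% improvement" checks both come down to a percentile and a relative difference. A minimal sketch follows, assuming the nearest-rank definition of P95; the function names and the latency samples are made up for illustration and are not measurements from ColaFlow.

```csharp
using System;
using System.Linq;

// Nearest-rank percentile: the smallest sample such that at least
// `percentile` percent of all samples are less than or equal to it.
static double Percentile(double[] samplesMs, double percentile)
{
    if (samplesMs.Length == 0)
        throw new ArgumentException("At least one sample is required.");
    double[] sorted = samplesMs.OrderBy(x => x).ToArray();
    int rank = (int)Math.Ceiling(percentile / 100.0 * sorted.Length);
    return sorted[Math.Max(rank, 1) - 1];
}

// Relative improvement of `currentMs` over `baselineMs`, in percent.
static double ImprovementPercent(double baselineMs, double currentMs) =>
    (baselineMs - currentMs) / baselineMs * 100.0;

// Made-up latencies in milliseconds, for illustration only.
double[] baseline = { 120, 95, 110, 180, 105, 100, 98, 250, 115, 102 };
double[] afterSdk = { 90, 70, 85, 140, 80, 75, 72, 190, 88, 78 };

double p95Before = Percentile(baseline, 95);
double p95After = Percentile(afterSdk, 95);
Console.WriteLine($"P95 before: {p95Before} ms, after: {p95After} ms");
Console.WriteLine($"Improvement: {ImprovementPercent(p95Before, p95After):F1}%");
```

Recording the baseline with the same percentile definition that later phases use keeps the before/after comparison meaningful.
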
---

### Phase 2: Tool Migration (Week 3-4) - Story 14

**Goal**: Migrate all 10 Tools to SDK attribute-based registration

**Tools to Migrate**:
1. `create_issue` → `[McpTool]` attribute
2. `update_status` → `[McpTool]` attribute
3. `add_comment` → `[McpTool]` attribute
4. `assign_issue` → `[McpTool]` attribute
5. `create_sprint` → `[McpTool]` attribute
6. `update_sprint` → `[McpTool]` attribute
7. `log_decision` → `[McpTool]` attribute
8. `generate_prd` → `[McpTool]` attribute
9. `split_epic` → `[McpTool]` attribute
10. `detect_risks` → `[McpTool]` attribute

**Migration Pattern**:
```csharp
// BEFORE (Custom)
public class CreateIssueTool : IMcpTool
{
    public string Name => "create_issue";
    public string Description => "Create a new issue";
    public McpToolInputSchema InputSchema => ...;

    public async Task<McpToolResult> ExecuteAsync(...)
    {
        // Custom routing logic
    }
}

// AFTER (SDK)
// Attribute names follow the research report; confirm them against the pinned SDK version.
[McpTool(
    Name = "create_issue",
    Description = "Create a new issue (Epic/Story/Task)"
)]
public class CreateIssueTool
{
    [McpToolParameter(Required = true)]
    public Guid ProjectId { get; set; }

    [McpToolParameter(Required = true)]
    public string Title { get; set; } = string.Empty;

    [McpToolParameter]
    public string? Description { get; set; }

    public async Task<McpToolResult> ExecuteAsync(
        McpContext context,
        CancellationToken cancellationToken)
    {
        // Business logic stays the same
        // DiffPreviewService integration preserved
    }
}
```

**Deliverables**:
- All 10 Tools migrated to SDK attributes
- DiffPreviewService integration maintained
- Integration tests updated
- Performance comparison report

**Acceptance Criteria**:
- [ ] All Tools work with `[McpTool]` attribute
- [ ] Diff Preview workflow preserved (no breaking changes)
- [ ] Integration tests pass (≥80% coverage)
- [ ] Performance improvement measured (target: 20%+)

---

### Phase 3: Resource Migration (Week 5) - Story 15

**Goal**: Migrate all 11 Resources to SDK attribute-based registration

**Resources to Migrate**:
1. `projects.list` → `[McpResource]`
2. `projects.get/{id}` → `[McpResource]`
3. `issues.search` → `[McpResource]`
4. `issues.get/{id}` → `[McpResource]`
5. `sprints.current` → `[McpResource]`
6. `sprints.list` → `[McpResource]`
7. `users.list` → `[McpResource]`
8. `docs.prd/{projectId}` → `[McpResource]`
9. `reports.daily/{date}` → `[McpResource]`
10. `reports.velocity` → `[McpResource]`
11. `audit.history/{entityId}` → `[McpResource]`

**Migration Pattern**:
```csharp
// BEFORE (Custom)
public class ProjectsListResource : IMcpResource
{
    public string Uri => "colaflow://projects.list";
    public string Name => "Projects List";

    public async Task<McpResourceContent> GetContentAsync(...)
    {
        // Custom logic
    }
}

// AFTER (SDK)
[McpResource(
    Uri = "colaflow://projects.list",
    Name = "Projects List",
    Description = "List all projects in the current tenant",
    MimeType = "application/json"
)]
public class ProjectsListResource
{
    private readonly IProjectRepository _repo;
    private readonly ITenantContext _tenant;
    private readonly IDistributedCache _cache; // Redis-backed cache preserved

    public async Task<McpResourceContent> GetContentAsync(
        McpContext context,
        CancellationToken cancellationToken)
    {
        // Business logic stays the same
        // Multi-tenant filtering preserved
        // Redis caching preserved
    }
}
```

**Deliverables**:
- All 11 Resources migrated to SDK attributes
- Multi-tenant isolation verified
- Redis caching maintained
- Performance tests passed

**Acceptance Criteria**:
- [ ] All Resources work with `[McpResource]` attribute
- [ ] Multi-tenant isolation 100% verified
- [ ] Redis cache hit rate > 80% maintained
- [ ] Response time < 200ms (P95)

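
One way to keep caching and multi-tenant isolation compatible is to scope every cache key by tenant. The sketch below illustrates that idea under stated assumptions: the key format, the `TenantCacheKey` and `GetOrAdd` helpers, and the sample payloads are hypothetical, and a dictionary stands in for Redis so the example runs without a server.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key scheme: prefixing every key with the tenant ID means one
// tenant's cached payload is never returned for another tenant's request.
static string TenantCacheKey(Guid tenantId, string resourceUri) =>
    $"tenant:{tenantId:N}:{resourceUri}";

// A dictionary stands in for Redis so the sketch runs without a server.
var cache = new Dictionary<string, string>();

string GetOrAdd(Guid tenantId, string resourceUri, Func<string> load)
{
    string key = TenantCacheKey(tenantId, resourceUri);
    if (!cache.TryGetValue(key, out string? cached))
    {
        cached = load();      // cache miss: query the repository
        cache[key] = cached;
    }
    return cached;
}

Guid tenantA = Guid.NewGuid();
Guid tenantB = Guid.NewGuid();

// The same resource URI resolves to different entries per tenant.
string listA = GetOrAdd(tenantA, "colaflow://projects.list", () => "[\"Project A\"]");
string listB = GetOrAdd(tenantB, "colaflow://projects.list", () => "[\"Project B\"]");
Console.WriteLine($"{listA} / {listB}");
```
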
---

### Phase 4: Transport Layer (Week 6) - Story 16

**Goal**: Replace custom HTTP middleware with SDK transport

**Current Custom Transport**:
```csharp
// Custom middleware (will be removed)
app.UseMiddleware<McpProtocolMiddleware>();
app.UseMiddleware<ApiKeyAuthMiddleware>();
```

**SDK Transport Configuration**:
```csharp
// SDK-based transport
builder.Services.AddMcpServer(options =>
{
    // stdio transport (for locally launched clients such as Claude Desktop)
    options.UseStdioTransport();

    // HTTP/SSE transport (for web-based clients)
    options.UseHttpTransport(http =>
    {
        http.BasePath = "/mcp";
        http.EnableSse = true; // Server-Sent Events
    });

    // Custom authentication (preserve API Key auth)
    options.AddAuthentication<ApiKeyAuthHandler>();

    // Custom authorization (preserve field-level permissions)
    options.AddAuthorization<FieldLevelAuthHandler>();
});
```

**Deliverables**:
- Custom middleware removed
- SDK transport configured (stdio + HTTP/SSE)
- API Key authentication migrated to SDK pipeline
- Field-level permissions preserved

**Acceptance Criteria**:
- [ ] stdio transport works (Claude Desktop compatibility)
- [ ] HTTP/SSE transport works (web client compatibility)
- [ ] API Key authentication functional
- [ ] Field-level permissions enforced
- [ ] Zero breaking changes for existing clients

---

### Phase 5: Testing & Optimization (Week 7-8) - Story 17

**Goal**: Comprehensive testing, performance tuning, and documentation

#### Week 7: Integration Testing & Bug Fixes

**Tasks**:
1. **End-to-End Testing**
   - Claude Desktop integration test (stdio)
   - Web client integration test (HTTP/SSE)
   - Multi-tenant isolation verification
   - Diff Preview workflow validation

2. **Performance Testing**
   - Benchmark Tools (target: 20%+ improvement)
   - Benchmark Resources (target: 30%+ improvement)
   - Concurrent request testing (100 req/s)
   - Memory usage profiling

3. **Security Audit**
   - API Key brute force test
   - Cross-tenant access attempts
   - Field-level permission bypass tests
   - SQL injection attempts

4. **Bug Fixes**
   - Fix integration test failures
   - Address performance bottlenecks
   - Fix security vulnerabilities (if found)

#### Week 8: Documentation & Production Readiness

**Tasks**:
1. **Architecture Documentation**
   - Update `mcp-server-architecture.md` with SDK details
   - Create SDK migration guide for developers
   - Document hybrid architecture decisions
   - Add troubleshooting guide

2. **API Documentation**
   - Update OpenAPI/Swagger specs
   - Document Tool parameter schemas
   - Document Resource URI patterns
   - Add example requests/responses

3. **Code Cleanup**
   - Remove old custom protocol code
   - Delete obsolete interfaces (IMcpTool, IMcpResource)
   - Clean up unused NuGet packages
   - Update code comments

4. **Production Readiness**
   - Deploy to staging environment
   - Smoke testing with real AI clients
   - Performance validation
   - Final code review

**Deliverables**:
- Comprehensive test suite (≥80% coverage)
- Performance report (vs. baseline)
- Security audit report (zero CRITICAL and zero HIGH issues)
- Updated architecture documentation
- Production deployment guide

**Acceptance Criteria**:
- [ ] Integration tests pass (≥80% coverage)
- [ ] Performance improved by ≥20%
- [ ] Security audit clean (0 CRITICAL, 0 HIGH)
- [ ] Documentation complete and reviewed
- [ ] Production-ready checklist signed off

---

## Stories Breakdown

This Epic is broken down into 5 child Stories:

- [ ] [Story 13](sprint_5_story_13.md) - MCP SDK Foundation & PoC (Week 1-2) - `not_started`
- [ ] [Story 14](sprint_5_story_14.md) - Tool Migration to SDK (Week 3-4) - `not_started`
- [ ] [Story 15](sprint_5_story_15.md) - Resource Migration to SDK (Week 5) - `not_started`
- [ ] [Story 16](sprint_5_story_16.md) - Transport Layer Migration (Week 6) - `not_started`
- [ ] [Story 17](sprint_5_story_17.md) - Testing & Optimization (Week 7-8) - `not_started`

**Progress**: 0/5 stories completed (0%)

## Risk Assessment

### High-Priority Risks

| Risk ID | Description | Impact | Probability | Mitigation |
|---------|-------------|--------|-------------|------------|
| RISK-001 | SDK breaking changes during migration | High | Low | Lock SDK version, gradual migration |
| RISK-002 | Performance regression | High | Medium | Continuous benchmarking, rollback plan |
| RISK-003 | DiffPreview integration conflicts | Medium | Medium | Thorough testing, preserve interfaces |
| RISK-004 | Client compatibility issues | High | Low | Test with Claude Desktop early |
| RISK-005 | Multi-tenant isolation bugs | Critical | Very Low | 100% test coverage, security audit |

### Mitigation Strategies

1. **Phased Migration**: 5 phases allow early detection of issues
2. **Parallel Systems**: Keep the old code until the SDK is fully validated
3. **Feature Flags**: Enable/disable the SDK via configuration
4. **Rollback Plan**: Revert to the custom implementation if needed
5. **Continuous Testing**: Run tests after each phase

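
The feature-flag and rollback strategies can be combined into a single configuration switch that defaults to the existing code path. The following is a sketch under stated assumptions: the flag name `Mcp:UseSdk` and the helper names are hypothetical, and a dictionary stands in for the application's configuration source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical flag name; in ASP.NET Core the value would typically be read
// from appsettings.json or an environment variable.
const string FlagName = "Mcp:UseSdk";

// A missing or unparsable flag falls back to the custom pipeline, so a bad
// deployment config cannot switch the SDK on by accident.
bool UseSdk(IReadOnlyDictionary<string, string> settings) =>
    settings.TryGetValue(FlagName, out string? raw)
    && bool.TryParse(raw, out bool enabled)
    && enabled;

string SelectPipeline(IReadOnlyDictionary<string, string> settings) =>
    UseSdk(settings) ? "sdk" : "custom";

var rollout = new Dictionary<string, string> { [FlagName] = "true" };
var rollback = new Dictionary<string, string>();

Console.WriteLine(SelectPipeline(rollout));
Console.WriteLine(SelectPipeline(rollback));
```

Defaulting to the custom pipeline means that rolling back is a configuration change rather than a redeployment, as long as the old code is still in place.
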
## Dependencies

### Prerequisites
- ✅ Sprint 5 Phase 1-3 infrastructure (Stories 1-12)
- ✅ Custom MCP implementation complete and working
- ✅ DiffPreview service production-ready
- ✅ Multi-tenant security verified

### External Dependencies
- Microsoft .NET MCP SDK v1.0+ (NuGet)
- Claude Desktop 1.0+ (for testing)
- Continue VS Code Extension (for testing)

### Technical Requirements
- .NET 9+ (already installed)
- PostgreSQL 15+ (already configured)
- Redis 7+ (already configured)

## Acceptance Criteria (Epic-Level)

### Functional Requirements
- [ ] All 10 Tools migrated to SDK `[McpTool]` attributes
- [ ] All 11 Resources migrated to SDK `[McpResource]` attributes
- [ ] stdio transport works (Claude Desktop compatible)
- [ ] HTTP/SSE transport works (web client compatible)
- [ ] Diff Preview workflow preserved (no breaking changes)
- [ ] Multi-tenant isolation 100% verified
- [ ] API Key authentication functional
- [ ] Field-level permissions enforced

### Performance Requirements
- [ ] Response time improved by ≥20%
- [ ] Tool execution time < 500ms (P95)
- [ ] Resource query time < 200ms (P95)
- [ ] Throughput ≥100 requests/second
- [ ] Memory usage optimized (no leaks)

### Quality Requirements
- [ ] Test coverage ≥80%
- [ ] Zero CRITICAL security vulnerabilities
- [ ] Zero HIGH security vulnerabilities
- [ ] Code duplication <5%
- [ ] All integration tests pass

### Documentation Requirements
- [ ] Architecture documentation updated
- [ ] API documentation complete
- [ ] Migration guide published
- [ ] Troubleshooting guide published
- [ ] Code examples updated

## Success Metrics

### Code Quality
- **Lines Removed**: 500-700 lines of custom protocol code
- **Code Duplication**: <5%
- **Test Coverage**: ≥80%
- **Security Score**: 0 CRITICAL, 0 HIGH vulnerabilities

### Performance
- **Response Time**: 20-40% improvement
- **Throughput**: 100+ req/s (from 70 req/s)
- **Memory Usage**: 10-20% reduction
- **Cache Hit Rate**: >80% maintained

### Developer Experience
- **Onboarding Time**: 50% faster (simpler SDK APIs)
- **Code Readability**: +30% (attributes vs. manual registration)
- **Maintenance Effort**: -60% (Microsoft maintains protocol)

## Related Documents

### Research & Design
- [MCP SDK Integration Research](../research/mcp-sdk-integration-research.md)
- [MCP Server Architecture](../architecture/mcp-server-architecture.md)
- [Hybrid Architecture ADR](../architecture/adr/mcp-sdk-hybrid-approach.md)

### Sprint Planning
- [Sprint 5 Plan](sprint_5.md)
- [Product Roadmap](../../product.md) - M2 section

### Technical References
- [Microsoft .NET MCP SDK](https://github.com/microsoft/mcp-dotnet)
- [MCP Specification](https://spec.modelcontextprotocol.io/)
- [ColaFlow MCP Module](../../colaflow-api/src/ColaFlow.Modules.Mcp/)

---

## Notes

### Why Hybrid Architecture?

**Question**: Why not use 100% SDK?

**Answer**: ColaFlow has business requirements that the SDK does not cover:
1. **Diff Preview**: SDK doesn't provide a preview mechanism (ColaFlow custom)
2. **Approval Workflow**: SDK doesn't have human-in-the-loop approval (ColaFlow custom)
3. **Multi-Tenant**: SDK doesn't enforce tenant isolation (ColaFlow custom)
4. **Field Permissions**: SDK doesn't have field-level security (ColaFlow custom)

The hybrid approach gets the **best of both worlds**:
- SDK handles the protocol plumbing (60-70% code reduction)
- ColaFlow handles the business-critical concerns (security, approval)

### What Gets Deleted?

**Custom Code to Remove** (~700 lines):
- `McpProtocolHandler.cs` (JSON-RPC parsing)
- `McpProtocolMiddleware.cs` (HTTP middleware)
- `IMcpTool.cs` interface (replaced by SDK attributes)
- `IMcpResource.cs` interface (replaced by SDK attributes)
- `McpRegistry.cs` (replaced by SDK discovery)
- `McpRequest.cs` / `McpResponse.cs` DTOs (SDK provides)

**Custom Code to Keep** (~1200 lines):
- `DiffPreviewService.cs` (business logic)
- `PendingChangeService.cs` (approval workflow)
- `ApiKeyAuthHandler.cs` (security)
- `FieldLevelAuthHandler.cs` (permissions)
- `TenantContextService.cs` (multi-tenant)

### Timeline Justification

**Why 8 weeks?**
- **Week 1-2**: PoC + training (can't rush, need to understand the SDK)
- **Week 3-4**: 10 Tools migration (careful testing required)
- **Week 5**: 11 Resources migration (simpler than Tools)
- **Week 6**: Transport layer (critical, can't break clients)
- **Week 7-8**: Testing + docs (quality gate, can't skip)

**Could it be faster?**
- Yes, if we skip testing (NOT RECOMMENDED)
- Yes, if we accept higher risk (NOT RECOMMENDED)
- This is already an aggressive timeline (8 weeks across 5 phases, 1.6 weeks per phase on average)

### Post-Migration Benefits

**Developer Velocity**:
- New Tool creation: 30 min (was 2 hours)
- New Resource creation: 15 min (was 1 hour)
- Onboarding new developers: 2 days (was 5 days)

**Maintenance Burden**:
- Protocol updates: 0 hours (Microsoft handles them)
- Bug fixes: 60% less effort (less custom code)
- Feature additions: 40% faster (SDK simplifies registration)

---

**Created**: 2025-11-09 by Product Manager Agent
**Epic Owner**: Backend Team Lead
**Estimated Start**: 2025-11-27 (After Sprint 5 Phase 1-3)
**Estimated Completion**: 2026-01-22 (Week 8 of Sprint 5)
**Status**: Not Started (planning complete)