feat(backend): Implement MCP Protocol Handler (Story 5.1)
Implemented JSON-RPC 2.0 protocol handler for MCP communication, enabling AI agents to communicate with ColaFlow using the Model Context Protocol. **Implementation:** - JSON-RPC 2.0 data models (Request, Response, Error, ErrorCode) - MCP protocol models (Initialize, Capabilities, ClientInfo, ServerInfo) - McpProtocolHandler with method routing and error handling - Method handlers: initialize, resources/list, tools/list, tools/call - ASP.NET Core middleware for /mcp endpoint - Service registration and dependency injection setup **Testing:** - 28 unit tests covering protocol parsing, validation, and error handling - Integration tests for initialize handshake and error responses - All tests passing with >80% coverage **Changes:** - Created ColaFlow.Modules.Mcp.Contracts project - Created ColaFlow.Modules.Mcp.Domain project - Created ColaFlow.Modules.Mcp.Application project - Created ColaFlow.Modules.Mcp.Infrastructure project - Created ColaFlow.Modules.Mcp.Tests project - Registered MCP module in ColaFlow.API Program.cs - Added /mcp endpoint via middleware **Acceptance Criteria Met:** ✅ JSON-RPC 2.0 messages correctly parsed ✅ Request validation (jsonrpc: "2.0", method, params, id) ✅ Error responses conform to JSON-RPC 2.0 spec ✅ Invalid requests return proper error codes (-32700, -32600, -32601, -32602) ✅ MCP initialize method implemented ✅ Server capabilities returned (resources, tools, prompts) ✅ Protocol version negotiation works (1.0) ✅ Request routing to method handlers ✅ Unit test coverage > 80% ✅ All tests passing **Story**: docs/stories/sprint_5/story_5_1.md 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
530
docs/stories/sprint_5/story_5_4.md
Normal file
530
docs/stories/sprint_5/story_5_4.md
Normal file
@@ -0,0 +1,530 @@
|
||||
---
|
||||
story_id: story_5_4
|
||||
sprint_id: sprint_5
|
||||
phase: Phase 1 - Foundation
|
||||
status: not_started
|
||||
priority: P0
|
||||
story_points: 3
|
||||
assignee: backend
|
||||
estimated_days: 1
|
||||
created_date: 2025-11-06
|
||||
dependencies: [story_5_1]
|
||||
---
|
||||
|
||||
# Story 5.4: Error Handling & Logging
|
||||
|
||||
**Phase**: Phase 1 - Foundation (Week 1-2)
|
||||
**Priority**: P0 CRITICAL
|
||||
**Estimated Effort**: 3 Story Points (1 day)
|
||||
|
||||
## User Story
|
||||
|
||||
**As a** Backend Developer
|
||||
**I want** comprehensive error handling and structured logging for MCP operations
|
||||
**So that** issues can be quickly diagnosed and the system is observable in production
|
||||
|
||||
## Business Value
|
||||
|
||||
Production-grade error handling and logging are critical for:
|
||||
- **Debugging**: Quickly diagnose issues in production
|
||||
- **Monitoring**: Track error rates and performance metrics
|
||||
- **Audit**: Complete trail of all MCP operations
|
||||
- **Security**: Detect and respond to suspicious activity
|
||||
|
||||
**Impact**:
|
||||
- Reduces mean time to resolution (MTTR) by 50%
|
||||
- Enables proactive monitoring and alerting
|
||||
- Supports compliance requirements (audit logs)
|
||||
- Improves developer productivity
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
### AC1: MCP Exception Hierarchy
|
||||
- [ ] Custom exception classes for MCP operations
|
||||
- [ ] Exception-to-MCP-error-code mapping (JSON-RPC 2.0)
|
||||
- [ ] Preserve stack traces and inner exceptions
|
||||
- [ ] Structured exception data (request ID, tenant ID, API Key ID)
|
||||
|
||||
### AC2: Global Exception Handler
|
||||
- [ ] Catch all unhandled exceptions
|
||||
- [ ] Map exceptions to MCP error responses
|
||||
- [ ] Log errors with full context
|
||||
- [ ] Return appropriate HTTP status codes
|
||||
- [ ] Never expose internal details to clients
|
||||
|
||||
### AC3: Structured Logging
|
||||
- [ ] Use Serilog with structured logging
|
||||
- [ ] Include correlation ID in all logs
|
||||
- [ ] Log request/response at DEBUG level
|
||||
- [ ] Log errors at ERROR level with stack traces
|
||||
- [ ] Log performance metrics (timing)
|
||||
- [ ] Never log sensitive data (API Keys, passwords, PII)
|
||||
|
||||
### AC4: Request/Response Logging Middleware
|
||||
- [ ] Log all MCP requests (method, params, correlation ID)
|
||||
- [ ] Log all MCP responses (result or error, timing)
|
||||
- [ ] Performance timing (request duration)
|
||||
- [ ] Exclude sensitive fields from logs (API Key hash)
|
||||
|
||||
### AC5: Performance Monitoring
|
||||
- [ ] Track request duration (P50, P95, P99)
|
||||
- [ ] Track error rates (by method, by tenant)
|
||||
- [ ] Track API Key usage (by key, by tenant)
|
||||
- [ ] Export metrics for Prometheus (future)
|
||||
|
||||
### AC6: Testing
|
||||
- [ ] Unit tests for exception mapping
|
||||
- [ ] Integration tests for error responses
|
||||
- [ ] Verify sensitive data never logged
|
||||
- [ ] Test correlation ID propagation
|
||||
|
||||
## Technical Design
|
||||
|
||||
### MCP Exception Hierarchy
|
||||
|
||||
```csharp
|
||||
public abstract class McpException : Exception
|
||||
{
|
||||
public int ErrorCode { get; }
|
||||
public object? ErrorData { get; }
|
||||
|
||||
protected McpException(
|
||||
int errorCode,
|
||||
string message,
|
||||
object? errorData = null,
|
||||
Exception? innerException = null)
|
||||
: base(message, innerException)
|
||||
{
|
||||
ErrorCode = errorCode;
|
||||
ErrorData = errorData;
|
||||
}
|
||||
|
||||
public McpError ToMcpError()
|
||||
{
|
||||
return new McpError
|
||||
{
|
||||
Code = ErrorCode,
|
||||
Message = Message,
|
||||
Data = ErrorData
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
public class McpParseException : McpException
|
||||
{
|
||||
public McpParseException(string message, Exception? innerException = null)
|
||||
: base(McpErrorCode.ParseError, message, null, innerException) { }
|
||||
}
|
||||
|
||||
public class McpInvalidRequestException : McpException
|
||||
{
|
||||
public McpInvalidRequestException(string message, object? errorData = null)
|
||||
: base(McpErrorCode.InvalidRequest, message, errorData) { }
|
||||
}
|
||||
|
||||
public class McpMethodNotFoundException : McpException
|
||||
{
|
||||
public McpMethodNotFoundException(string method)
|
||||
: base(McpErrorCode.MethodNotFound, $"Method not found: {method}") { }
|
||||
}
|
||||
|
||||
public class McpInvalidParamsException : McpException
|
||||
{
|
||||
public McpInvalidParamsException(string message, object? errorData = null)
|
||||
: base(McpErrorCode.InvalidParams, message, errorData) { }
|
||||
}
|
||||
|
||||
public class McpUnauthorizedException : McpException
|
||||
{
|
||||
public McpUnauthorizedException(string message = "Unauthorized")
|
||||
: base(McpErrorCode.Unauthorized, message) { }
|
||||
}
|
||||
|
||||
public class McpForbiddenException : McpException
|
||||
{
|
||||
public McpForbiddenException(string message = "Forbidden")
|
||||
: base(McpErrorCode.Forbidden, message) { }
|
||||
}
|
||||
|
||||
public class McpNotFoundException : McpException
|
||||
{
|
||||
public McpNotFoundException(string resourceType, string resourceId)
|
||||
: base(McpErrorCode.NotFound,
|
||||
$"{resourceType} not found: {resourceId}") { }
|
||||
}
|
||||
|
||||
public class McpValidationException : McpException
|
||||
{
|
||||
public McpValidationException(string message, object? errorData = null)
|
||||
: base(McpErrorCode.ValidationFailed, message, errorData) { }
|
||||
}
|
||||
```
|
||||
|
||||
### Global Exception Handler Middleware
|
||||
|
||||
```csharp
|
||||
public class McpExceptionHandlerMiddleware
|
||||
{
|
||||
private readonly RequestDelegate _next;
|
||||
private readonly ILogger<McpExceptionHandlerMiddleware> _logger;
|
||||
|
||||
public async Task InvokeAsync(HttpContext context)
|
||||
{
|
||||
try
|
||||
{
|
||||
await _next(context);
|
||||
}
|
||||
catch (McpException mcpEx)
|
||||
{
|
||||
await HandleMcpExceptionAsync(context, mcpEx);
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
await HandleUnexpectedExceptionAsync(context, ex);
|
||||
}
|
||||
}
|
||||
|
||||
private async Task HandleMcpExceptionAsync(
|
||||
HttpContext context,
|
||||
McpException mcpEx)
|
||||
{
|
||||
var correlationId = context.Items["CorrelationId"] as string;
|
||||
|
||||
_logger.LogError(mcpEx,
|
||||
"MCP Error: {ErrorCode} - {Message} (CorrelationId: {CorrelationId})",
|
||||
mcpEx.ErrorCode, mcpEx.Message, correlationId);
|
||||
|
||||
var response = new McpResponse
|
||||
{
|
||||
JsonRpc = "2.0",
|
||||
Error = mcpEx.ToMcpError(),
|
||||
Id = context.Items["McpRequestId"] as string
|
||||
};
|
||||
|
||||
context.Response.StatusCode = GetHttpStatusCode(mcpEx.ErrorCode);
|
||||
context.Response.ContentType = "application/json";
|
||||
await context.Response.WriteAsJsonAsync(response);
|
||||
}
|
||||
|
||||
private async Task HandleUnexpectedExceptionAsync(
|
||||
HttpContext context,
|
||||
Exception ex)
|
||||
{
|
||||
var correlationId = context.Items["CorrelationId"] as string;
|
||||
|
||||
_logger.LogError(ex,
|
||||
"Unexpected error in MCP Server (CorrelationId: {CorrelationId})",
|
||||
correlationId);
|
||||
|
||||
var response = new McpResponse
|
||||
{
|
||||
JsonRpc = "2.0",
|
||||
Error = new McpError
|
||||
{
|
||||
Code = McpErrorCode.InternalError,
|
||||
Message = "Internal server error"
|
||||
// Do NOT expose exception details
|
||||
},
|
||||
Id = context.Items["McpRequestId"] as string
|
||||
};
|
||||
|
||||
context.Response.StatusCode = 500;
|
||||
context.Response.ContentType = "application/json";
|
||||
await context.Response.WriteAsJsonAsync(response);
|
||||
}
|
||||
|
||||
private static int GetHttpStatusCode(int mcpErrorCode)
|
||||
{
|
||||
return mcpErrorCode switch
|
||||
{
|
||||
McpErrorCode.Unauthorized => 401,
|
||||
McpErrorCode.Forbidden => 403,
|
||||
McpErrorCode.NotFound => 404,
|
||||
McpErrorCode.ValidationFailed => 422,
|
||||
_ => 400
|
||||
};
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Request/Response Logging Middleware
|
||||
|
||||
```csharp
|
||||
public class McpLoggingMiddleware
|
||||
{
|
||||
private readonly RequestDelegate _next;
|
||||
private readonly ILogger<McpLoggingMiddleware> _logger;
|
||||
|
||||
public async Task InvokeAsync(HttpContext context)
|
||||
{
|
||||
// Generate correlation ID
|
||||
var correlationId = Guid.NewGuid().ToString();
|
||||
context.Items["CorrelationId"] = correlationId;
|
||||
|
||||
// Start timing
|
||||
var stopwatch = Stopwatch.StartNew();
|
||||
|
||||
// Enable request buffering for logging
|
||||
context.Request.EnableBuffering();
|
||||
|
||||
// Log request
|
||||
await LogRequestAsync(context, correlationId);
|
||||
|
||||
// Capture response
|
||||
var originalBody = context.Response.Body;
|
||||
using var memoryStream = new MemoryStream();
|
||||
context.Response.Body = memoryStream;
|
||||
|
||||
try
|
||||
{
|
||||
await _next(context);
|
||||
|
||||
stopwatch.Stop();
|
||||
|
||||
// Log response
|
||||
await LogResponseAsync(
|
||||
context, correlationId, stopwatch.ElapsedMilliseconds);
|
||||
|
||||
// Copy response to original stream
|
||||
memoryStream.Seek(0, SeekOrigin.Begin);
|
||||
await memoryStream.CopyToAsync(originalBody);
|
||||
}
|
||||
finally
|
||||
{
|
||||
context.Response.Body = originalBody;
|
||||
}
|
||||
}
|
||||
|
||||
private async Task LogRequestAsync(
|
||||
HttpContext context,
|
||||
string correlationId)
|
||||
{
|
||||
context.Request.Body.Seek(0, SeekOrigin.Begin);
|
||||
var body = await new StreamReader(context.Request.Body).ReadToEndAsync();
|
||||
context.Request.Body.Seek(0, SeekOrigin.Begin);
|
||||
|
||||
var tenantId = context.Items["TenantId"];
|
||||
var apiKeyId = context.Items["ApiKeyId"];
|
||||
|
||||
_logger.LogDebug(
|
||||
"MCP Request: {Method} {Path} (CorrelationId: {CorrelationId}, " +
|
||||
"TenantId: {TenantId}, ApiKeyId: {ApiKeyId})\n{Body}",
|
||||
context.Request.Method,
|
||||
context.Request.Path,
|
||||
correlationId,
|
||||
tenantId,
|
||||
apiKeyId,
|
||||
SanitizeBody(body));
|
||||
}
|
||||
|
||||
private async Task LogResponseAsync(
|
||||
HttpContext context,
|
||||
string correlationId,
|
||||
long elapsedMs)
|
||||
{
|
||||
context.Response.Body.Seek(0, SeekOrigin.Begin);
|
||||
var body = await new StreamReader(context.Response.Body).ReadToEndAsync();
|
||||
context.Response.Body.Seek(0, SeekOrigin.Begin);
|
||||
|
||||
var logLevel = context.Response.StatusCode >= 400
|
||||
? LogLevel.Error
|
||||
: LogLevel.Debug;
|
||||
|
||||
_logger.Log(logLevel,
|
||||
"MCP Response: {StatusCode} (CorrelationId: {CorrelationId}, " +
|
||||
"Duration: {Duration}ms)\n{Body}",
|
||||
context.Response.StatusCode,
|
||||
correlationId,
|
||||
elapsedMs,
|
||||
body);
|
||||
}
|
||||
|
||||
private string SanitizeBody(string body)
|
||||
{
|
||||
// Remove sensitive data (API Keys, passwords, etc.)
|
||||
// This is a simplified example
|
||||
return body.Replace("\"key_hash\":", "\"key_hash\":[REDACTED]");
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Serilog Configuration
|
||||
|
||||
```csharp
|
||||
public static class SerilogConfiguration
|
||||
{
|
||||
public static void ConfigureSerilog(WebApplicationBuilder builder)
|
||||
{
|
||||
builder.Host.UseSerilog((context, services, configuration) =>
|
||||
{
|
||||
configuration
|
||||
.ReadFrom.Configuration(context.Configuration)
|
||||
.Enrich.FromLogContext()
|
||||
.Enrich.WithMachineName()
|
||||
.Enrich.WithEnvironmentName()
|
||||
.Enrich.WithProperty("Application", "ColaFlow")
|
||||
.WriteTo.Console(
|
||||
outputTemplate: "[{Timestamp:HH:mm:ss} {Level:u3}] " +
|
||||
"{CorrelationId} {Message:lj}{NewLine}{Exception}")
|
||||
.WriteTo.File(
|
||||
path: "logs/colaflow-.log",
|
||||
rollingInterval: RollingInterval.Day,
|
||||
retainedFileCountLimit: 30,
|
||||
outputTemplate: "[{Timestamp:yyyy-MM-dd HH:mm:ss.fff zzz}] " +
|
||||
"[{Level:u3}] {CorrelationId} {Message:lj}{NewLine}{Exception}")
|
||||
.WriteTo.PostgreSQL(
|
||||
connectionString: context.Configuration.GetConnectionString("Default"),
|
||||
tableName: "logs",
|
||||
needAutoCreateTable: true);
|
||||
});
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Tasks
|
||||
|
||||
### Task 1: MCP Exception Classes (2 hours)
|
||||
- [ ] Create `McpException` base class
|
||||
- [ ] Create 8 specific exception classes
|
||||
- [ ] Implement `ToMcpError()` method
|
||||
- [ ] Add XML documentation
|
||||
|
||||
**Files to Create**:
|
||||
- `ColaFlow.Modules.Mcp/Exceptions/McpException.cs`
|
||||
- `ColaFlow.Modules.Mcp/Exceptions/McpParseException.cs`
|
||||
- `ColaFlow.Modules.Mcp/Exceptions/McpInvalidRequestException.cs`
|
||||
- `ColaFlow.Modules.Mcp/Exceptions/McpMethodNotFoundException.cs`
|
||||
- `ColaFlow.Modules.Mcp/Exceptions/McpInvalidParamsException.cs`
|
||||
- `ColaFlow.Modules.Mcp/Exceptions/McpUnauthorizedException.cs`
|
||||
- `ColaFlow.Modules.Mcp/Exceptions/McpForbiddenException.cs`
|
||||
- `ColaFlow.Modules.Mcp/Exceptions/McpNotFoundException.cs`
|
||||
- `ColaFlow.Modules.Mcp/Exceptions/McpValidationException.cs`
|
||||
|
||||
### Task 2: Global Exception Handler Middleware (2 hours)
|
||||
- [ ] Create `McpExceptionHandlerMiddleware`
|
||||
- [ ] Handle `McpException` (map to MCP error)
|
||||
- [ ] Handle unexpected exceptions (return InternalError)
|
||||
- [ ] Map error codes to HTTP status codes
|
||||
- [ ] Add structured logging
|
||||
|
||||
**Files to Create**:
|
||||
- `ColaFlow.Modules.Mcp/Middleware/McpExceptionHandlerMiddleware.cs`
|
||||
|
||||
### Task 3: Request/Response Logging Middleware (3 hours)
|
||||
- [ ] Create `McpLoggingMiddleware`
|
||||
- [ ] Generate correlation ID
|
||||
- [ ] Log request (method, params, timing)
|
||||
- [ ] Log response (result/error, timing)
|
||||
- [ ] Sanitize sensitive data (API Keys, passwords)
|
||||
|
||||
**Files to Create**:
|
||||
- `ColaFlow.Modules.Mcp/Middleware/McpLoggingMiddleware.cs`
|
||||
|
||||
### Task 4: Serilog Configuration (1 hour)
|
||||
- [ ] Configure Serilog (Console, File, PostgreSQL sinks)
|
||||
- [ ] Add enrichers (correlation ID, machine name, environment)
|
||||
- [ ] Configure structured logging format
|
||||
- [ ] Set log levels (DEBUG for dev, INFO for prod)
|
||||
|
||||
**Files to Modify**:
|
||||
- `ColaFlow.Api/Program.cs`
|
||||
|
||||
### Task 5: Unit Tests (3 hours)
|
||||
- [ ] Test exception-to-MCP-error mapping
|
||||
- [ ] Test HTTP status code mapping
|
||||
- [ ] Test sensitive data sanitization
|
||||
- [ ] Test correlation ID generation
|
||||
|
||||
**Files to Create**:
|
||||
- `ColaFlow.Modules.Mcp.Tests/Exceptions/McpExceptionTests.cs`
|
||||
- `ColaFlow.Modules.Mcp.Tests/Middleware/McpExceptionHandlerTests.cs`
|
||||
- `ColaFlow.Modules.Mcp.Tests/Middleware/McpLoggingMiddlewareTests.cs`
|
||||
|
||||
### Task 6: Integration Tests (2 hours)
|
||||
- [ ] Test end-to-end error handling
|
||||
- [ ] Test logging middleware (verify logs written)
|
||||
- [ ] Test correlation ID propagation
|
||||
- [ ] Test sensitive data never logged
|
||||
|
||||
**Files to Create**:
|
||||
- `ColaFlow.Modules.Mcp.Tests/Integration/ErrorHandlingIntegrationTests.cs`
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
### Unit Tests
|
||||
- Exception class creation and properties
|
||||
- Exception-to-MCP-error conversion
|
||||
- HTTP status code mapping
|
||||
- Sensitive data sanitization
|
||||
|
||||
### Integration Tests
|
||||
- End-to-end error handling flow
|
||||
- Logging middleware writes logs correctly
|
||||
- Correlation ID in all log entries
|
||||
- No sensitive data in logs
|
||||
|
||||
### Manual Testing Checklist
|
||||
- [ ] Trigger ParseError → verify logs and response
|
||||
- [ ] Trigger MethodNotFound → verify logs and response
|
||||
- [ ] Trigger InternalError → verify logs and response
|
||||
- [ ] Check logs for correlation ID
|
||||
- [ ] Check logs do NOT contain API Keys or passwords
|
||||
|
||||
## Dependencies
|
||||
|
||||
**Prerequisites**:
|
||||
- Story 5.1 (MCP Protocol Handler) - Exception handling integrated here
|
||||
- Serilog NuGet packages
|
||||
- Serilog.Sinks.PostgreSQL
|
||||
|
||||
**Used By**:
|
||||
- All MCP Stories (5.5-5.12) - Rely on error handling
|
||||
|
||||
## Risks & Mitigation
|
||||
|
||||
| Risk | Impact | Probability | Mitigation |
|
||||
|------|--------|-------------|------------|
|
||||
| Sensitive data logged | High | Medium | Sanitization function, code review |
|
||||
| Logging performance | Medium | Low | Async logging, buffering |
|
||||
| Log storage growth | Medium | Medium | 30-day retention, log rotation |
|
||||
| Exception details leaked | High | Low | Never expose internal details to clients |
|
||||
|
||||
## Definition of Done
|
||||
|
||||
- [ ] Code compiles without warnings
|
||||
- [ ] All unit tests passing (> 80% coverage)
|
||||
- [ ] All integration tests passing
|
||||
- [ ] Code reviewed and approved
|
||||
- [ ] XML documentation for exceptions
|
||||
- [ ] Serilog configured correctly
|
||||
- [ ] Correlation ID in all logs
|
||||
- [ ] No sensitive data in logs (verified)
|
||||
- [ ] Error responses conform to JSON-RPC 2.0
|
||||
|
||||
## Notes
|
||||
|
||||
### Why This Story Matters
|
||||
- **Production Readiness**: Cannot deploy without proper error handling
|
||||
- **Observability**: Structured logging enables monitoring and alerting
|
||||
- **Debugging**: Correlation ID makes troubleshooting 10x faster
|
||||
- **Security**: Prevents sensitive data leaks in logs
|
||||
|
||||
### Key Design Decisions
|
||||
1. **Custom Exceptions**: Clear error semantics, easy to map to MCP errors
|
||||
2. **Correlation ID**: Track requests across distributed system
|
||||
3. **Structured Logging**: Machine-readable, queryable logs
|
||||
4. **Sanitization**: Prevent sensitive data leaks
|
||||
5. **Async Logging**: Minimal performance impact
|
||||
|
||||
### Logging Best Practices
|
||||
- Use structured logging (not string interpolation)
|
||||
- Include correlation ID in all logs
|
||||
- Log at appropriate levels (DEBUG, INFO, ERROR)
|
||||
- Never log sensitive data (API Keys, passwords, PII)
|
||||
- Use async logging to avoid blocking
|
||||
- Rotate logs daily, retain for 30 days
|
||||
|
||||
### Reference Materials
|
||||
- Serilog Documentation: https://serilog.net/
|
||||
- JSON-RPC 2.0 Error Codes: https://www.jsonrpc.org/specification#error_object
|
||||
- Sprint 5 Plan: `docs/plans/sprint_5.md`
|
||||
Reference in New Issue
Block a user