Implemented JSON-RPC 2.0 protocol handler for MCP communication, enabling AI agents to communicate with ColaFlow using the Model Context Protocol. **Implementation:** - JSON-RPC 2.0 data models (Request, Response, Error, ErrorCode) - MCP protocol models (Initialize, Capabilities, ClientInfo, ServerInfo) - McpProtocolHandler with method routing and error handling - Method handlers: initialize, resources/list, tools/list, tools/call - ASP.NET Core middleware for /mcp endpoint - Service registration and dependency injection setup **Testing:** - 28 unit tests covering protocol parsing, validation, and error handling - Integration tests for initialize handshake and error responses - All tests passing with >80% coverage **Changes:** - Created ColaFlow.Modules.Mcp.Contracts project - Created ColaFlow.Modules.Mcp.Domain project - Created ColaFlow.Modules.Mcp.Application project - Created ColaFlow.Modules.Mcp.Infrastructure project - Created ColaFlow.Modules.Mcp.Tests project - Registered MCP module in ColaFlow.API Program.cs - Added /mcp endpoint via middleware **Acceptance Criteria Met:** ✅ JSON-RPC 2.0 messages correctly parsed ✅ Request validation (jsonrpc: "2.0", method, params, id) ✅ Error responses conform to JSON-RPC 2.0 spec ✅ Invalid requests return proper error codes (-32700, -32600, -32601, -32602) ✅ MCP initialize method implemented ✅ Server capabilities returned (resources, tools, prompts) ✅ Protocol version negotiation works (1.0) ✅ Request routing to method handlers ✅ Unit test coverage > 80% ✅ All tests passing **Story**: docs/stories/sprint_5/story_5_1.md 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
16 KiB
16 KiB
story_id, sprint_id, phase, status, priority, story_points, assignee, estimated_days, created_date, dependencies
| story_id | sprint_id | phase | status | priority | story_points | assignee | estimated_days | created_date | dependencies | |
|---|---|---|---|---|---|---|---|---|---|---|
| story_5_4 | sprint_5 | Phase 1 - Foundation | not_started | P0 | 3 | backend | 1 | 2025-11-06 |
|
Story 5.4: Error Handling & Logging
Phase: Phase 1 - Foundation (Week 1-2) Priority: P0 CRITICAL Estimated Effort: 3 Story Points (1 day)
User Story
As a Backend Developer I want comprehensive error handling and structured logging for MCP operations So that issues can be quickly diagnosed and the system is observable in production
Business Value
Production-grade error handling and logging are critical for:
- Debugging: Quickly diagnose issues in production
- Monitoring: Track error rates and performance metrics
- Audit: Complete trail of all MCP operations
- Security: Detect and respond to suspicious activity
Impact:
- Reduces mean time to resolution (MTTR) by 50%
- Enables proactive monitoring and alerting
- Supports compliance requirements (audit logs)
- Improves developer productivity
Acceptance Criteria
AC1: MCP Exception Hierarchy
- Custom exception classes for MCP operations
- Exception-to-MCP-error-code mapping (JSON-RPC 2.0)
- Preserve stack traces and inner exceptions
- Structured exception data (request ID, tenant ID, API Key ID)
AC2: Global Exception Handler
- Catch all unhandled exceptions
- Map exceptions to MCP error responses
- Log errors with full context
- Return appropriate HTTP status codes
- Never expose internal details to clients
AC3: Structured Logging
- Use Serilog with structured logging
- Include correlation ID in all logs
- Log request/response at DEBUG level
- Log errors at ERROR level with stack traces
- Log performance metrics (timing)
- Never log sensitive data (API Keys, passwords, PII)
AC4: Request/Response Logging Middleware
- Log all MCP requests (method, params, correlation ID)
- Log all MCP responses (result or error, timing)
- Performance timing (request duration)
- Exclude sensitive fields from logs (API Key hash)
AC5: Performance Monitoring
- Track request duration (P50, P95, P99)
- Track error rates (by method, by tenant)
- Track API Key usage (by key, by tenant)
- Export metrics for Prometheus (future)
AC6: Testing
- Unit tests for exception mapping
- Integration tests for error responses
- Verify sensitive data never logged
- Test correlation ID propagation
Technical Design
MCP Exception Hierarchy
public abstract class McpException : Exception
{
public int ErrorCode { get; }
public object? ErrorData { get; }
protected McpException(
int errorCode,
string message,
object? errorData = null,
Exception? innerException = null)
: base(message, innerException)
{
ErrorCode = errorCode;
ErrorData = errorData;
}
public McpError ToMcpError()
{
return new McpError
{
Code = ErrorCode,
Message = Message,
Data = ErrorData
};
}
}
public class McpParseException : McpException
{
public McpParseException(string message, Exception? innerException = null)
: base(McpErrorCode.ParseError, message, null, innerException) { }
}
public class McpInvalidRequestException : McpException
{
public McpInvalidRequestException(string message, object? errorData = null)
: base(McpErrorCode.InvalidRequest, message, errorData) { }
}
public class McpMethodNotFoundException : McpException
{
public McpMethodNotFoundException(string method)
: base(McpErrorCode.MethodNotFound, $"Method not found: {method}") { }
}
public class McpInvalidParamsException : McpException
{
public McpInvalidParamsException(string message, object? errorData = null)
: base(McpErrorCode.InvalidParams, message, errorData) { }
}
public class McpUnauthorizedException : McpException
{
public McpUnauthorizedException(string message = "Unauthorized")
: base(McpErrorCode.Unauthorized, message) { }
}
public class McpForbiddenException : McpException
{
public McpForbiddenException(string message = "Forbidden")
: base(McpErrorCode.Forbidden, message) { }
}
public class McpNotFoundException : McpException
{
public McpNotFoundException(string resourceType, string resourceId)
: base(McpErrorCode.NotFound,
$"{resourceType} not found: {resourceId}") { }
}
public class McpValidationException : McpException
{
public McpValidationException(string message, object? errorData = null)
: base(McpErrorCode.ValidationFailed, message, errorData) { }
}
Global Exception Handler Middleware
public class McpExceptionHandlerMiddleware
{
private readonly RequestDelegate _next;
private readonly ILogger<McpExceptionHandlerMiddleware> _logger;
public async Task InvokeAsync(HttpContext context)
{
try
{
await _next(context);
}
catch (McpException mcpEx)
{
await HandleMcpExceptionAsync(context, mcpEx);
}
catch (Exception ex)
{
await HandleUnexpectedExceptionAsync(context, ex);
}
}
private async Task HandleMcpExceptionAsync(
HttpContext context,
McpException mcpEx)
{
var correlationId = context.Items["CorrelationId"] as string;
_logger.LogError(mcpEx,
"MCP Error: {ErrorCode} - {Message} (CorrelationId: {CorrelationId})",
mcpEx.ErrorCode, mcpEx.Message, correlationId);
var response = new McpResponse
{
JsonRpc = "2.0",
Error = mcpEx.ToMcpError(),
Id = context.Items["McpRequestId"] as string
};
context.Response.StatusCode = GetHttpStatusCode(mcpEx.ErrorCode);
context.Response.ContentType = "application/json";
await context.Response.WriteAsJsonAsync(response);
}
private async Task HandleUnexpectedExceptionAsync(
HttpContext context,
Exception ex)
{
var correlationId = context.Items["CorrelationId"] as string;
_logger.LogError(ex,
"Unexpected error in MCP Server (CorrelationId: {CorrelationId})",
correlationId);
var response = new McpResponse
{
JsonRpc = "2.0",
Error = new McpError
{
Code = McpErrorCode.InternalError,
Message = "Internal server error"
// Do NOT expose exception details
},
Id = context.Items["McpRequestId"] as string
};
context.Response.StatusCode = 500;
context.Response.ContentType = "application/json";
await context.Response.WriteAsJsonAsync(response);
}
private static int GetHttpStatusCode(int mcpErrorCode)
{
return mcpErrorCode switch
{
McpErrorCode.Unauthorized => 401,
McpErrorCode.Forbidden => 403,
McpErrorCode.NotFound => 404,
McpErrorCode.ValidationFailed => 422,
_ => 400
};
}
}
Request/Response Logging Middleware
public class McpLoggingMiddleware
{
private readonly RequestDelegate _next;
private readonly ILogger<McpLoggingMiddleware> _logger;
public async Task InvokeAsync(HttpContext context)
{
// Generate correlation ID
var correlationId = Guid.NewGuid().ToString();
context.Items["CorrelationId"] = correlationId;
// Start timing
var stopwatch = Stopwatch.StartNew();
// Enable request buffering for logging
context.Request.EnableBuffering();
// Log request
await LogRequestAsync(context, correlationId);
// Capture response
var originalBody = context.Response.Body;
using var memoryStream = new MemoryStream();
context.Response.Body = memoryStream;
try
{
await _next(context);
stopwatch.Stop();
// Log response
await LogResponseAsync(
context, correlationId, stopwatch.ElapsedMilliseconds);
// Copy response to original stream
memoryStream.Seek(0, SeekOrigin.Begin);
await memoryStream.CopyToAsync(originalBody);
}
finally
{
context.Response.Body = originalBody;
}
}
private async Task LogRequestAsync(
HttpContext context,
string correlationId)
{
context.Request.Body.Seek(0, SeekOrigin.Begin);
var body = await new StreamReader(context.Request.Body).ReadToEndAsync();
context.Request.Body.Seek(0, SeekOrigin.Begin);
var tenantId = context.Items["TenantId"];
var apiKeyId = context.Items["ApiKeyId"];
_logger.LogDebug(
"MCP Request: {Method} {Path} (CorrelationId: {CorrelationId}, " +
"TenantId: {TenantId}, ApiKeyId: {ApiKeyId})\n{Body}",
context.Request.Method,
context.Request.Path,
correlationId,
tenantId,
apiKeyId,
SanitizeBody(body));
}
private async Task LogResponseAsync(
HttpContext context,
string correlationId,
long elapsedMs)
{
context.Response.Body.Seek(0, SeekOrigin.Begin);
var body = await new StreamReader(context.Response.Body).ReadToEndAsync();
context.Response.Body.Seek(0, SeekOrigin.Begin);
var logLevel = context.Response.StatusCode >= 400
? LogLevel.Error
: LogLevel.Debug;
_logger.Log(logLevel,
"MCP Response: {StatusCode} (CorrelationId: {CorrelationId}, " +
"Duration: {Duration}ms)\n{Body}",
context.Response.StatusCode,
correlationId,
elapsedMs,
body);
}
private string SanitizeBody(string body)
{
// Remove sensitive data (API Keys, passwords, etc.)
// This is a simplified example
return body.Replace("\"key_hash\":", "\"key_hash\":[REDACTED]");
}
}
Serilog Configuration
public static class SerilogConfiguration
{
public static void ConfigureSerilog(WebApplicationBuilder builder)
{
builder.Host.UseSerilog((context, services, configuration) =>
{
configuration
.ReadFrom.Configuration(context.Configuration)
.Enrich.FromLogContext()
.Enrich.WithMachineName()
.Enrich.WithEnvironmentName()
.Enrich.WithProperty("Application", "ColaFlow")
.WriteTo.Console(
outputTemplate: "[{Timestamp:HH:mm:ss} {Level:u3}] " +
"{CorrelationId} {Message:lj}{NewLine}{Exception}")
.WriteTo.File(
path: "logs/colaflow-.log",
rollingInterval: RollingInterval.Day,
retainedFileCountLimit: 30,
outputTemplate: "[{Timestamp:yyyy-MM-dd HH:mm:ss.fff zzz}] " +
"[{Level:u3}] {CorrelationId} {Message:lj}{NewLine}{Exception}")
.WriteTo.PostgreSQL(
connectionString: context.Configuration.GetConnectionString("Default"),
tableName: "logs",
needAutoCreateTable: true);
});
}
}
Tasks
Task 1: MCP Exception Classes (2 hours)
- Create
McpExceptionbase class - Create 8 specific exception classes
- Implement
ToMcpError()method - Add XML documentation
Files to Create:
ColaFlow.Modules.Mcp/Exceptions/McpException.csColaFlow.Modules.Mcp/Exceptions/McpParseException.csColaFlow.Modules.Mcp/Exceptions/McpInvalidRequestException.csColaFlow.Modules.Mcp/Exceptions/McpMethodNotFoundException.csColaFlow.Modules.Mcp/Exceptions/McpInvalidParamsException.csColaFlow.Modules.Mcp/Exceptions/McpUnauthorizedException.csColaFlow.Modules.Mcp/Exceptions/McpForbiddenException.csColaFlow.Modules.Mcp/Exceptions/McpNotFoundException.csColaFlow.Modules.Mcp/Exceptions/McpValidationException.cs
Task 2: Global Exception Handler Middleware (2 hours)
- Create
McpExceptionHandlerMiddleware - Handle
McpException(map to MCP error) - Handle unexpected exceptions (return InternalError)
- Map error codes to HTTP status codes
- Add structured logging
Files to Create:
ColaFlow.Modules.Mcp/Middleware/McpExceptionHandlerMiddleware.cs
Task 3: Request/Response Logging Middleware (3 hours)
- Create
McpLoggingMiddleware - Generate correlation ID
- Log request (method, params, timing)
- Log response (result/error, timing)
- Sanitize sensitive data (API Keys, passwords)
Files to Create:
ColaFlow.Modules.Mcp/Middleware/McpLoggingMiddleware.cs
Task 4: Serilog Configuration (1 hour)
- Configure Serilog (Console, File, PostgreSQL sinks)
- Add enrichers (correlation ID, machine name, environment)
- Configure structured logging format
- Set log levels (DEBUG for dev, INFO for prod)
Files to Modify:
ColaFlow.Api/Program.cs
Task 5: Unit Tests (3 hours)
- Test exception-to-MCP-error mapping
- Test HTTP status code mapping
- Test sensitive data sanitization
- Test correlation ID generation
Files to Create:
ColaFlow.Modules.Mcp.Tests/Exceptions/McpExceptionTests.csColaFlow.Modules.Mcp.Tests/Middleware/McpExceptionHandlerTests.csColaFlow.Modules.Mcp.Tests/Middleware/McpLoggingMiddlewareTests.cs
Task 6: Integration Tests (2 hours)
- Test end-to-end error handling
- Test logging middleware (verify logs written)
- Test correlation ID propagation
- Test sensitive data never logged
Files to Create:
ColaFlow.Modules.Mcp.Tests/Integration/ErrorHandlingIntegrationTests.cs
Testing Strategy
Unit Tests
- Exception class creation and properties
- Exception-to-MCP-error conversion
- HTTP status code mapping
- Sensitive data sanitization
Integration Tests
- End-to-end error handling flow
- Logging middleware writes logs correctly
- Correlation ID in all log entries
- No sensitive data in logs
Manual Testing Checklist
- Trigger ParseError → verify logs and response
- Trigger MethodNotFound → verify logs and response
- Trigger InternalError → verify logs and response
- Check logs for correlation ID
- Check logs do NOT contain API Keys or passwords
Dependencies
Prerequisites:
- Story 5.1 (MCP Protocol Handler) - Exception handling integrated here
- Serilog NuGet packages
- Serilog.Sinks.PostgreSQL
Used By:
- All MCP Stories (5.5-5.12) - Rely on error handling
Risks & Mitigation
| Risk | Impact | Probability | Mitigation |
|---|---|---|---|
| Sensitive data logged | High | Medium | Sanitization function, code review |
| Logging performance | Medium | Low | Async logging, buffering |
| Log storage growth | Medium | Medium | 30-day retention, log rotation |
| Exception details leaked | High | Low | Never expose internal details to clients |
Definition of Done
- Code compiles without warnings
- All unit tests passing (> 80% coverage)
- All integration tests passing
- Code reviewed and approved
- XML documentation for exceptions
- Serilog configured correctly
- Correlation ID in all logs
- No sensitive data in logs (verified)
- Error responses conform to JSON-RPC 2.0
Notes
Why This Story Matters
- Production Readiness: Cannot deploy without proper error handling
- Observability: Structured logging enables monitoring and alerting
- Debugging: Correlation ID makes troubleshooting 10x faster
- Security: Prevents sensitive data leaks in logs
Key Design Decisions
- Custom Exceptions: Clear error semantics, easy to map to MCP errors
- Correlation ID: Track requests across distributed system
- Structured Logging: Machine-readable, queryable logs
- Sanitization: Prevent sensitive data leaks
- Async Logging: Minimal performance impact
Logging Best Practices
- Use structured logging (not string interpolation)
- Include correlation ID in all logs
- Log at appropriate levels (DEBUG, INFO, ERROR)
- Never log sensitive data (API Keys, passwords, PII)
- Use async logging to avoid blocking
- Rotate logs daily, retain for 30 days
Reference Materials
- Serilog Documentation: https://serilog.net/
- JSON-RPC 2.0 Error Codes: https://www.jsonrpc.org/specification#error_object
- Sprint 5 Plan:
docs/plans/sprint_5.md