Correctover implements the MAPE-K autonomic loop (Monitor-Analyze-Plan-Execute over Knowledge) for self-healing LLM API calls. This is the technical foundation behind verified failover.
Every LLM API call is intercepted at the SDK level. The monitor collects response metadata and performance metrics before passing them to the analysis engine.
Correctover's proprietary CANON engine validates every response across 6 dimensions simultaneously, in-process, before the response reaches application code.
CANON validation process:
1. Parse response → extract structure
2. Validate JSON schema (8µs)
3. Check field types (5µs)
4. Measure latency (1µs)
5. Calculate cost (2µs)
6. Verify model identity (3µs)
7. Semantic coherence check (22µs)
Total P50: 22µs | P99: 99µs
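The validation steps above can be illustrated with a minimal sketch. CANON is proprietary, so this is not its implementation. The `CONTRACT` layout, the `ValidationResult` type, the `validate_response` function, and all thresholds are assumptions made for illustration. The sketch runs the checks one after another, whereas the text describes them as running simultaneously, and the semantic coherence step is reduced to a non-empty check as a stand-in.

```python
import json
from dataclasses import dataclass


@dataclass
class ValidationResult:
    validated: bool
    reason: str = ""


# Hypothetical contract: required fields with their types, plus limits.
CONTRACT = {
    "fields": {"answer": str, "confidence": float},
    "max_latency_s": 5.0,
    "max_cost_usd": 0.05,
    "expected_model": "gpt-4o",
}


def validate_response(raw_body, latency_s, cost_usd, model, contract=CONTRACT):
    """Run the checks in order and stop at the first failure."""
    # 1. Parse the response and extract its structure.
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return ValidationResult(False, "invalid JSON")
    if not isinstance(payload, dict):
        return ValidationResult(False, "schema mismatch")

    # 2-3. Schema (required fields present) and field types.
    for name, expected_type in contract["fields"].items():
        if name not in payload:
            return ValidationResult(False, "schema mismatch")
        if not isinstance(payload[name], expected_type):
            return ValidationResult(False, f"wrong type for {name}")

    # 4. Latency limit.
    if latency_s > contract["max_latency_s"]:
        return ValidationResult(False, f"latency > {contract['max_latency_s']:g}s")

    # 5. Cost limit.
    if cost_usd > contract["max_cost_usd"]:
        return ValidationResult(False, "cost over budget")

    # 6. Model identity.
    if model != contract["expected_model"]:
        return ValidationResult(False, "model mismatch")

    # 7. Semantic coherence: placeholder check only (assumption).
    if not payload["answer"].strip():
        return ValidationResult(False, "empty answer")

    return ValidationResult(True)
```

Returning a reason string with every failure keeps the result compatible with the `validated=False, reason="..."` form used in the fallback chain.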
When contract validation fails on the primary provider, Correctover escalates through the configured fallback chain:
Provider 1 (OpenAI gpt-4o)
↓ contract fails → validated=False, reason="schema mismatch"
Provider 2 (Anthropic claude-3-opus)
↓ contract fails → validated=False, reason="latency > 5s"
Provider 3 (Google gemini-2.0-pro)
↓ contract passes ✅
E2E failover time: 949ms
├─ DNS resolution: ~20ms
├─ Connection setup: ~150ms
├─ API call: ~750ms
└─ Contract validation: ~22µs
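The escalation through the fallback chain can be sketched as a loop that stops at the first provider whose response passes validation. This is an illustrative sketch, not Correctover's SDK: `Outcome`, `call_with_failover`, the stub providers, and `demo_validate` are hypothetical names, and the stubs return canned strings instead of calling real APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    provider: str
    validated: bool
    reason: str = ""
    response: Optional[str] = None


def call_with_failover(prompt, chain, validate):
    """Try providers in order; return (first validated outcome, full trail)."""
    trail = []
    for name, call in chain:
        try:
            response = call(prompt)
        except Exception as exc:  # transport errors also trigger escalation
            trail.append(Outcome(name, False, f"call failed: {exc}"))
            continue
        ok, reason = validate(response)
        outcome = Outcome(name, ok, reason, response if ok else None)
        trail.append(outcome)
        if ok:
            return outcome, trail
    return None, trail  # every provider failed its contract


def demo_validate(response):
    """Stand-in contract check: non-empty and JSON-object shaped."""
    if not response:
        return False, "empty response"
    if not response.startswith("{"):
        return False, "schema mismatch"
    return True, ""


# Stub providers mirroring the three-step chain above (canned outputs).
demo_chain = [
    ("openai/gpt-4o", lambda prompt: "plain text, not JSON"),
    ("anthropic/claude-3-opus", lambda prompt: ""),
    ("google/gemini-2.0-pro", lambda prompt: '{"answer": "ok"}'),
]

result, trail = call_with_failover("Summarize this.", demo_chain, demo_validate)
```

Keeping the trail of failed outcomes alongside the result gives the caller the per-provider failure reasons without a second lookup.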
The monitor tracks per-provider metrics: response time, error rate, token usage, contract failure rate, and drift indicators. It runs at 0.4µs per record with 177,582 rec/s throughput.
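A minimal sketch of per-provider metric tracking, assuming simple running counters. The `ProviderStats` and `Monitor` classes are hypothetical and not part of any documented Correctover API. Drift indicators are left out because the text does not define how they are computed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ProviderStats:
    calls: int = 0
    errors: int = 0
    contract_failures: int = 0
    total_latency_s: float = 0.0
    total_tokens: int = 0

    def rate(self, count):
        # Guard against division by zero before any call is recorded.
        return count / self.calls if self.calls else 0.0

    @property
    def error_rate(self):
        return self.rate(self.errors)

    @property
    def contract_failure_rate(self):
        return self.rate(self.contract_failures)

    @property
    def mean_latency_s(self):
        return self.total_latency_s / self.calls if self.calls else 0.0


class Monitor:
    def __init__(self):
        # One stats record per provider, created on first use.
        self.stats = defaultdict(ProviderStats)

    def record(self, provider, latency_s, tokens, error=False, validated=True):
        s = self.stats[provider]
        s.calls += 1
        s.total_latency_s += latency_s
        s.total_tokens += tokens
        if error:
            s.errors += 1
        elif not validated:
            s.contract_failures += 1
```

Running totals keep each `record` call constant-time, which suits a hot path that sees every request.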
The analysis engine detects patterns and maps each one to a response: repeated timeouts open the circuit breaker, schema drift raises an alert, and cost anomalies trigger the budget cap. Analysis runs at 47µs P99 for 1M samples.
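The "repeated timeouts open the circuit breaker" rule can be sketched as follows. This is a generic circuit breaker, not Correctover's implementation: the `CircuitBreaker` class, the threshold of 3, and the 30-second cool-down are assumptions. The clock is injectable so the behavior can be tested without sleeping.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.consecutive_timeouts = 0
        self.opened_at = None  # None means the circuit is closed

    def record_timeout(self):
        self.consecutive_timeouts += 1
        if self.consecutive_timeouts >= self.failure_threshold:
            # Open (or re-open) the circuit and restart the cool-down.
            self.opened_at = self.clock()

    def record_success(self):
        # Any success closes the circuit and clears the streak.
        self.consecutive_timeouts = 0
        self.opened_at = None

    def allows_request(self):
        if self.opened_at is None:
            return True
        # After the cool-down, let a trial request through (half-open).
        return self.clock() - self.opened_at >= self.reset_after_s
```

A provider whose breaker is open would be skipped when the failover path is chosen, until a trial request succeeds.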
The planning stage selects the optimal failover path based on failure type, provider health scores, cost budgets, latency requirements, and geographic proximity. The plan is re-evaluated on every request.
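One way to combine these inputs is to filter out providers that violate a hard limit and then rank the rest. The sketch below assumes that approach. The `Candidate` fields, the `plan_failover` function, and the ranking order (region first, then health, then cost) are illustrative choices, not the documented selection logic. Failure type is modeled only as an `exclude` set of providers that have already failed the current request.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    health: float              # hypothetical score: 0.0 (bad) to 1.0 (good)
    cost_per_call_usd: float
    expected_latency_s: float
    region: str


def plan_failover(candidates, max_cost_usd, max_latency_s, preferred_region, exclude=()):
    """Return eligible providers, best first."""
    eligible = [
        c for c in candidates
        if c.name not in exclude
        and c.cost_per_call_usd <= max_cost_usd
        and c.expected_latency_s <= max_latency_s
    ]
    # Prefer the caller's region, then higher health, then lower cost.
    return sorted(
        eligible,
        key=lambda c: (c.region != preferred_region, -c.health, c.cost_per_call_usd),
    )
```

Because the function is pure and cheap, it can be called again on each request with fresh health scores.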
The execution stage switches provider, sets up a new connection, resends the request with the same prompt, and applies contract validation to the response. If the response validates, it returns the result; if not, it escalates to the next plan.
The self-healing rule database grows through the MAPE-K flywheel. It started with 62 high-confidence rules and now holds 84 (62 of them high-confidence, verified in 70,000+ fault injection scenarios across 7 failure types).
Real-time monitoring across all 6 dimensions with automatic alerting. Detects:
For multi-step agent chains, Correctover's Checkpoint feature saves intermediate states at each validated step. If a chain fails mid-execution, it can be resumed from the last validated checkpoint instead of restarting from scratch.
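The resume behavior can be sketched with a small runner that stores each validated step output and skips completed steps on the next run. The `CheckpointedChain` class and its in-memory list are assumptions made for illustration. The actual Checkpoint feature's API and storage are not described here.

```python
class CheckpointedChain:
    def __init__(self, steps, validate):
        self.steps = steps          # callables: state -> new state
        self.validate = validate    # callable: state -> bool
        self.checkpoints = []       # validated outputs, one per completed step

    def run(self, initial_state):
        """Run the chain, resuming after the last validated checkpoint."""
        done = len(self.checkpoints)
        state = self.checkpoints[-1] if done else initial_state
        for index, step in enumerate(self.steps[done:], start=done):
            result = step(state)    # may raise; earlier checkpoints survive
            if not self.validate(result):
                raise ValueError(f"step {index} failed validation")
            self.checkpoints.append(result)
            state = result
        return state
```

Calling `run` again after a failure re-executes only the step that failed and the steps after it, since every earlier output is already stored.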