
🏆 Production Best Practices

1. Circuit Breaker Configuration

Set conservative thresholds to avoid cascading failures. A circuit breaker opens when a provider's failure count reaches the threshold within a sliding window.

# Start with these values and tune based on real traffic
circuit_breaker={"threshold": 3, "window_seconds": 60}
# - 3 failures within 60s open the circuit
# - After a 30s cooldown, the circuit goes half-open and allows a probe request
# - If the probe succeeds, the circuit closes
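The state transitions described in the comments above can be sketched as a small standalone class. This is an illustration of the general pattern, not the SDK's internals: the `CircuitBreaker` class, its method names, and the `cooldown_seconds` and `clock` parameters are assumptions made for this example.

```python
import time


class CircuitBreaker:
    """Illustrative sliding-window breaker: closed -> open -> half-open -> closed."""

    def __init__(self, threshold=3, window_seconds=60, cooldown_seconds=30,
                 clock=time.monotonic):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.cooldown_seconds = cooldown_seconds  # assumed name for the 30s cooldown
        self.clock = clock                        # injectable so tests can fake time
        self.failures = []                        # timestamps of recent failures
        self.opened_at = None                     # None while the circuit is closed

    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.cooldown_seconds:
            return "half-open"
        return "open"

    def allow_request(self):
        # Simplification: every request is allowed while half-open.
        # A production breaker would usually admit a single probe.
        return self.state() != "open"

    def record_success(self):
        # A successful probe closes the circuit and clears the failure history.
        if self.opened_at is not None:
            self.failures.clear()
            self.opened_at = None

    def record_failure(self):
        now = self.clock()
        if self.state() == "half-open":
            # Failed probe: reopen and restart the cooldown.
            self.opened_at = now
            return
        # Drop failures that have slid out of the window, then count this one.
        self.failures = [t for t in self.failures if now - t < self.window_seconds]
        self.failures.append(now)
        if len(self.failures) >= self.threshold:
            self.opened_at = now
```

Passing a fake `clock` makes the cooldown and window behavior testable without sleeping, which helps when tuning thresholds against recorded traffic.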

2. Budget-Aware Routing

Ensure failover targets don't cause cost surprises. A failover to a 16x more expensive model should be intentional.

provider = AIProvider(
    default="openai/gpt-4o-mini",
    fallbacks=["openai/gpt-4o"],
    contracts=[Contract.cost(max_cents=5.0)],  # Cap per request
)
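To make the cost risk concrete, the sketch below filters a fallback chain against a per-request cap before routing. It is independent of the SDK: the helper names are hypothetical, and the prices are made-up numbers chosen only to give a 16x ratio, not real provider pricing.

```python
# Hypothetical prices in cents per 1,000 tokens. These are NOT real
# provider prices; they only illustrate a 16x gap between two models.
PRICE_CENTS_PER_1K = {
    "openai/gpt-4o-mini": 0.015,
    "openai/gpt-4o": 0.24,
}


def estimate_cents(model, tokens):
    """Estimated cost of one request, in cents."""
    return PRICE_CENTS_PER_1K[model] * tokens / 1000


def affordable_models(chain, tokens, max_cents):
    """Keep only the models in the chain whose estimated cost fits the cap."""
    return [m for m in chain if estimate_cents(m, tokens) <= max_cents]


chain = ["openai/gpt-4o-mini", "openai/gpt-4o"]
small_request = affordable_models(chain, tokens=10_000, max_cents=5.0)
large_request = affordable_models(chain, tokens=100_000, max_cents=5.0)
```

With these illustrative numbers, a small request may fail over to either model, while a large one is restricted to the cheaper model because the expensive fallback would exceed the 5-cent cap.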

3. Failover Testing in CI/CD

4. Observability

Track these metrics in production:

Every Correctover SDK emits OpenTelemetry-compatible metrics with zero configuration.

5. Graceful Degradation

When all providers are exhausted, or a response fails validation, return a degraded result instead of crashing:

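def complete_or_degrade(prompt, cached=None):
    # cached: last known-good result, supplied by the caller
    try:
        response = provider.complete(prompt)
        if not response.validated:
            # Validation failed: serve cached data and flag it as degraded
            return {"data": cached, "confidence": "degraded"}
        return response
    except Exception:
        # Not a bare except, so KeyboardInterrupt and SystemExit still propagate
        return {"error": "Service temporarily unavailable", "retry_after": 30}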

6. Provider Health Monitoring

Track per-provider health scores and adjust routing dynamically. Providers with degrading performance get fewer requests until they recover.
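One way to picture this is an exponentially weighted health score per provider that feeds weighted random selection. The following is a minimal sketch under that assumption; `HealthTracker`, `alpha`, and `floor` are hypothetical names for this example and do not describe how the SDK computes health.

```python
import random


class HealthTracker:
    """Illustrative per-provider health scores with weighted selection."""

    def __init__(self, providers, alpha=0.2, floor=0.05):
        self.alpha = alpha    # weight given to the newest observation
        self.floor = floor    # minimum weight so a provider can still recover
        self.scores = {name: 1.0 for name in providers}

    def record(self, name, success):
        # Exponentially weighted moving average of success (1.0) or failure (0.0).
        sample = 1.0 if success else 0.0
        self.scores[name] = (1 - self.alpha) * self.scores[name] + self.alpha * sample

    def pick(self, rng=random):
        # Healthier providers get proportionally more traffic.
        names = list(self.scores)
        weights = [max(self.scores[n], self.floor) for n in names]
        return rng.choices(names, weights=weights, k=1)[0]
```

The `floor` keeps a trickle of traffic flowing to an unhealthy provider, so its score has a chance to rise again once it recovers.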