Health Endpoints¶
Learn how to add database health monitoring to your FastAPI application.
What Are Health Endpoints?¶
Health endpoints are HTTP endpoints that report whether your application and its dependencies (like databases) are working correctly.
They're essential for: - Kubernetes liveness probes - Is the app alive? - Kubernetes readiness probes - Is the app ready to serve traffic? - Monitoring systems - Prometheus, Datadog, New Relic, etc. - Load balancers - Should traffic be routed here? - Debugging - Quick status check during incidents
Quick Example¶
Add health endpoints to your app in one line:
from fastapi import FastAPI
from dbwarden.fastapi import DBWardenHealthRouter
app = FastAPI()
app.include_router(DBWardenHealthRouter(), prefix="/health")
That's it! You now have:
- GET /health/ - Overall health for all databases
- GET /health/{database_name} - Health for a specific database
Your First Health Check¶
Let's start with a complete minimal example:
from contextlib import asynccontextmanager
from fastapi import FastAPI
from dbwarden.fastapi import DBWardenHealthRouter, migration_context
@asynccontextmanager
async def lifespan(app: FastAPI):
async with migration_context(mode="check"):
yield
app = FastAPI(lifespan=lifespan)
# Add health endpoints
app.include_router(DBWardenHealthRouter(), prefix="/health")
Start your app:
uvicorn main:app --reload
Test Your Endpoints¶
Check Overall Health¶
curl http://localhost:8000/health/
Response:
{
"status": "ok",
"databases": [
{
"database": "primary",
"status": "ok",
"connected": true,
"pending_migrations": 0,
"lock_active": false,
"error": null
}
]
}
Check Specific Database¶
curl http://localhost:8000/health/primary
Response:
{
"status": "ok",
"databases": [
{
"database": "primary",
"status": "ok",
"connected": true,
"pending_migrations": 0,
"lock_active": false,
"error": null
}
]
}
Test with httpie
For prettier output, use httpie:
bash
http :8000/health/
Understanding the Response¶
Let's break down what each field means:
Response Schema¶
{
"status": str, # Overall status: "ok" | "degraded" | "error"
"databases": [ # List of database health details
{
"database": str, # Database name from config
"status": str, # This database's status: "ok" | "degraded" | "error"
"connected": bool, # Can we execute SELECT 1?
"pending_migrations": int, # Number of unapplied migrations
"lock_active": bool, # Is a migration currently running?
"error": str | None # Error message if connection failed
}
]
}
Status Values¶
"ok" - Everything is healthy
- ✅ Database is connected
- ✅ Zero pending migrations
- ✅ No migration lock
"degraded" - Functional but needs attention
- ✅ Database is connected
- ⚠️ Has pending migrations
- ℹ️ App works but schema is outdated
"error" - Not functional
- ❌ Cannot connect to database
- ❌ Health check failed
Overall Status Logic¶
The overall status field follows these rules:
if any database has status "error":
overall_status = "error"
elif any database has status "degraded":
overall_status = "degraded"
else:
overall_status = "ok"
HTTP Status Codes¶
DBWarden health endpoints return these HTTP status codes:
| Scenario | HTTP Status | Overall Status |
|---|---|---|
| All databases healthy | 200 | "ok" |
| Pending migrations exist | 200 | "degraded" |
| Database unreachable | 503 | "error" |
| Database name not found | 404 | N/A |
Why 200 for Degraded?
Pending migrations are a state, not a failure. The app still works, the schema is just outdated. You decide whether to block traffic on degraded state.
Why These Status Codes Matter¶
200 OK - App is functional - Continue routing traffic - Report as healthy to load balancers - Use for readiness probes (optionally)
503 Service Unavailable - App cannot function - Stop routing traffic - Restart the pod (liveness probe) - Alert the on-call engineer
404 Not Found - Configuration error - Database name doesn't exist in config - Only returned for per-database route
Health Check Flow¶
Here's what happens when you hit /health/:
1. Request arrives → /health/
2. For each database in config:
a. Get or create engine
b. Attempt connection
c. Execute SELECT 1
d. If connected:
- Count applied migrations
- Count total migration files
- Calculate pending = total - applied
- Check migration lock status
e. If connection fails:
- Set status = "error"
- Set error message
- Skip migration checks
3. Aggregate all database statuses
4. Return JSON response with appropriate HTTP status
What Gets Checked¶
For each database, DBWarden checks:
- Connectivity - Can we connect? (
SELECT 1) - Migration state - Are migrations pending?
- Migration lock - Is a migration currently running?
Performance
Health checks are fast - they only run SELECT 1 and query the dbwarden_migrations table. No expensive queries or full schema scans.
Common Use Cases¶
Kubernetes Liveness Probe¶
Liveness probes check if your app is alive. If it fails, Kubernetes restarts the pod.
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
template:
spec:
containers:
- name: app
image: myapp:latest
livenessProbe:
httpGet:
path: /health/
port: 8000
initialDelaySeconds: 10
periodSeconds: 30
failureThreshold: 3
This checks health every 30 seconds. If it fails 3 times in a row, Kubernetes restarts the pod.
Kubernetes Readiness Probe¶
Readiness probes check if your app is ready to serve traffic. If it fails, Kubernetes stops routing to the pod (but doesn't restart it).
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
template:
spec:
containers:
- name: app
image: myapp:latest
readinessProbe:
httpGet:
path: /health/
port: 8000
initialDelaySeconds: 5
periodSeconds: 10
This checks health every 10 seconds. If degraded or error, traffic stops routing to this pod.
Degraded State
If you use /health/ for readiness and have pending migrations, your pods will be marked not ready and receive no traffic. This might be what you want, or you might want a separate /ping endpoint for basic liveness.
Separate Liveness and Readiness¶
For more control, use different endpoints:
from fastapi import FastAPI
from dbwarden.fastapi import DBWardenHealthRouter
app = FastAPI()
# Database health at /health/ (for readiness)
app.include_router(DBWardenHealthRouter(), prefix="/health")
# Simple ping at /ping (for liveness)
@app.get("/ping")
async def ping():
return {"status": "ok"}
Then configure Kubernetes:
livenessProbe:
httpGet:
path: /ping # Simple check - app is alive
port: 8000
readinessProbe:
httpGet:
path: /health/ # Full check - app is ready
port: 8000
Monitoring and Alerting¶
Use health endpoints to feed monitoring systems:
Prometheus¶
Create a script that exports metrics:
# monitoring/exporter.py
import httpx
from prometheus_client import Gauge, start_http_server
database_health = Gauge('dbwarden_database_health', 'Database health status', ['database'])
pending_migrations = Gauge('dbwarden_pending_migrations', 'Pending migrations', ['database'])
async def collect_metrics():
async with httpx.AsyncClient() as client:
response = await client.get("http://localhost:8000/health/")
data = response.json()
for db in data["databases"]:
# 1 = ok, 0.5 = degraded, 0 = error
health_value = 1 if db["status"] == "ok" else (0.5 if db["status"] == "degraded" else 0)
database_health.labels(database=db["database"]).set(health_value)
pending_migrations.labels(database=db["database"]).set(db["pending_migrations"])
if __name__ == "__main__":
start_http_server(9090)
# Run collect_metrics periodically
Datadog¶
# monitoring/datadog.py
import httpx
from datadog import statsd
async def report_health():
async with httpx.AsyncClient() as client:
response = await client.get("http://localhost:8000/health/")
data = response.json()
for db in data["databases"]:
statsd.gauge(
'app.database.pending_migrations',
db["pending_migrations"],
tags=[f'database:{db["database"]}']
)
statsd.service_check(
'app.database.health',
statsd.OK if db["status"] == "ok" else statsd.WARNING,
tags=[f'database:{db["database"]}']
)
Pre-Query Health Check¶
Check database health before running expensive operations:
from fastapi import FastAPI, HTTPException
import httpx
app = FastAPI()
@app.post("/analytics/generate-report")
async def generate_report():
# Check if analytics database is healthy
async with httpx.AsyncClient() as client:
response = await client.get("http://localhost:8000/health/analytics")
data = response.json()
db_health = data["databases"][0]
if not db_health["connected"]:
raise HTTPException(
status_code=503,
detail="Analytics database unavailable"
)
if db_health["pending_migrations"] > 0:
raise HTTPException(
status_code=503,
detail=f"Analytics database has {db_health['pending_migrations']} pending migrations"
)
# Proceed with expensive report generation
...
Load Balancer Health Checks¶
Configure your load balancer (ALB, nginx, etc.) to check /health/:
AWS Application Load Balancer¶
# target-group.yaml
TargetGroup:
HealthCheckPath: /health/
HealthCheckIntervalSeconds: 30
HealthCheckTimeoutSeconds: 5
HealthyThresholdCount: 2
UnhealthyThresholdCount: 3
Matcher:
HttpCode: 200
Nginx¶
# nginx.conf
upstream myapp {
server app1:8000;
server app2:8000;
server app3:8000;
# Health check
check interval=10000 rise=2 fall=3 timeout=5000 type=http;
check_http_send "GET /health/ HTTP/1.0\r\n\r\n";
check_http_expect_alive http_2xx;
}
Relationship to Startup Checks¶
Health endpoints and startup checks serve different purposes:
| Health Endpoints | Startup Checks | |
|---|---|---|
| When | On demand (HTTP request) | Once at app boot |
| Failure | Returns HTTP 503 | Blocks app startup |
| Use case | Runtime monitoring | Enforce readiness before traffic |
| Frequency | Every request | Once |
Startup Check (Lifespan)¶
from contextlib import asynccontextmanager
from dbwarden.fastapi import migration_context
@asynccontextmanager
async def lifespan(app: FastAPI):
async with migration_context(mode="check", fail_fast=True):
# ❌ If this fails, app won't start
yield
Runtime Health (HTTP Endpoint)¶
from dbwarden.fastapi import DBWardenHealthRouter
# ✅ Always available - returns status code based on health
app.include_router(DBWardenHealthRouter(), prefix="/health")
Shared Logic
Both use the same underlying check_database_health() function, so behavior is consistent.
Common Patterns¶
Pattern 1: Basic Health Only¶
from fastapi import FastAPI
from dbwarden.fastapi import DBWardenHealthRouter
app = FastAPI()
app.include_router(DBWardenHealthRouter(), prefix="/health")
Pattern 2: Health + Simple Ping¶
from fastapi import FastAPI
from dbwarden.fastapi import DBWardenHealthRouter
app = FastAPI()
# Database health (readiness)
app.include_router(DBWardenHealthRouter(), prefix="/health")
# Simple ping (liveness)
@app.get("/ping")
async def ping():
return {"status": "ok"}
Pattern 3: Custom Prefix¶
# Health at /api/v1/health/
app.include_router(DBWardenHealthRouter(), prefix="/api/v1/health")
Pattern 4: With Tags for Documentation¶
# Show in OpenAPI docs with "Health" tag
router = DBWardenHealthRouter()
router.tags = ["Health"]
app.include_router(router, prefix="/health")
Troubleshooting¶
503 Errors in Production¶
If you're getting 503 errors, check:
-
Is the database reachable?
bash # Check database connectivity curl http://localhost:8000/health/ | jq '.databases[].error' -
Are migrations pending?
bash # Check pending migrations curl http://localhost:8000/health/ | jq '.databases[].pending_migrations' -
Is a migration lock active?
bash # Check lock status curl http://localhost:8000/health/ | jq '.databases[].lock_active'
Degraded State in Production¶
If your app is marked degraded:
{
"status": "degraded",
"databases": [
{
"pending_migrations": 3,
...
}
]
}
This means migrations need to be applied:
# Apply migrations
dbwarden migrate
# Or let the app auto-migrate (if configured)
# See: Startup Checks documentation
404 on Per-Database Route¶
curl http://localhost:8000/health/analytics
# 404: Database 'analytics' not found
Check your DBWarden config - the database name must be defined:
# dbwarden.py
database_config(
database_name="analytics", # ← Must match route parameter
...
)
Health Check Too Slow¶
Health checks should be fast (< 100ms). If they're slow:
- Database connection is slow - Check network latency
- Migration table is huge - Consider squashing migrations
- Multiple databases - Each one adds latency
For very fast health checks, consider:
# Simple ping (no database check)
@app.get("/ping")
async def ping():
return {"status": "ok"}
Recap¶
You learned:
✅ Add health endpoints with DBWardenHealthRouter()
✅ Health endpoints return database connectivity and migration state
✅ Status codes: 200 (healthy/degraded), 503 (error), 404 (not found)
✅ Use for Kubernetes liveness/readiness probes
✅ Feed metrics to Prometheus, Datadog, etc.
✅ Health endpoints vs startup checks - different purposes
✅ Troubleshoot common issues (503, degraded, 404)
What's Next?¶
- Startup Checks - Validate on app boot
- Complete Application - Full working example
- Production Patterns - K8s, CI/CD, monitoring
- Multi-Database - Multiple databases