firecrawl-incident-runbook
Execute Firecrawl incident response procedures with triage, mitigation, and postmortem. Use when responding to Firecrawl-related outages, investigating scrape/crawl failures, or running post-incident reviews for Firecrawl integration issues. Trigger with phrases like "firecrawl incident", "firecrawl outage", "firecrawl down", "firecrawl on-call", "firecrawl emergency", "firecrawl broken".
claude-codecodexopenclaw
Allowed Tools
ReadGrepBash(curl:*)Bash(kubectl:*)
Provided by Plugin
firecrawl-pack
Claude Code skill pack for FireCrawl (30 skills)
Installation
This skill is included in the firecrawl-pack plugin:
/plugin install firecrawl-pack@claude-code-plugins-plus
Click to copy
Instructions
Firecrawl Incident Runbook
Overview
Rapid incident response procedures for Firecrawl integration failures. Covers API outage triage, credential issues, credit exhaustion, crawl job failures, and webhook delivery problems.
Severity Levels
| Level | Definition | Response Time | Examples |
|---|---|---|---|
| P1 | Complete failure | < 15 min | API returns 401/500 on all requests |
| P2 | Degraded service | < 1 hour | High latency, partial failures, 429s |
| P3 | Minor impact | < 4 hours | Webhook delays, some empty scrapes |
| P4 | No user impact | Next business day | Monitoring gaps, credit warnings |
Quick Triage (Run First)
set -euo pipefail
# 1. Test Firecrawl API directly
echo "=== API Health ==="
curl -s -w "\nHTTP %{http_code}\n" https://api.firecrawl.dev/v1/scrape \
-H "Authorization: Bearer $FIRECRAWL_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url":"https://example.com","formats":["markdown"]}' | jq '{success, error}'
# 2. Check credit balance
echo "=== Credits ==="
curl -s https://api.firecrawl.dev/v1/team/credits \
-H "Authorization: Bearer $FIRECRAWL_API_KEY" | jq .
# 3. Check our app health
echo "=== App Health ==="
curl -sf https://api.yourapp.com/health | jq '.services.firecrawl' || echo "App unhealthy"
Decision Tree
Firecrawl API returning errors?
├─ 401: API key invalid
│ → Verify key at firecrawl.dev/app, rotate if needed
├─ 402: Credits exhausted
│ → Upgrade plan or wait for monthly reset
├─ 429: Rate limited
│ → Reduce concurrency, enable backoff, check Retry-After
├─ 500/503: Firecrawl outage
│ → Enable fallback mode, monitor firecrawl.dev status
└─ API working fine
└─ Our integration issue
├─ Empty markdown → Increase waitFor, check target site
├─ Crawl stuck → Check job status, enforce timeout
└─ Webhook not firing → Verify endpoint, check signature
Immediate Actions by Error Type
401 — Authentication Failure
set -euo pipefail
# Verify current key
echo "Key prefix: ${FIRECRAWL_API_KEY:0:5}"
echo "Key length: ${#FIRECRAWL_API_KEY}"
# Test with explicit key
curl -s https://api.firecrawl.dev/v1/scrape \
-H "Authorization: Bearer $FIRECRAWL_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url":"https://example.com","formats":["markdown"]}' | jq .success
# If fails: regenerate key at firecrawl.dev/app and update all environments
402 — Credits Exhausted
set -euo pipefail
# Check balance
curl -s https://api.firecrawl.dev/v1/team/credits \
-H "Authorization: Bearer $FIRECRAWL_API_KEY" | jq .
# Immediate: disable non-critical scraping
# Long-term: upgrade plan or implement credit budget
429 — Rate Limited
// Enable emergency rate limiting
const EMERGENCY_DELAY_MS = 5000; // 5s between requests
async function emergencyScrape(url: string) {
await new Promise(r => setTimeout(r, EMERGENCY_DELAY_MS));
return firecrawl.scrapeUrl(url, { formats: ["markdown"] });
}
500/503 — Firecrawl Outage
// Enable graceful degradation
async function scrapeWithFallback(url: string) {
try {
return await firecrawl.scrapeUrl(url, { formats: ["markdown"] });
} catch (error: any) {
if (error.statusCode >= 500) {
console.error("Firecrawl unavailable — using cached content");
return getCachedContent(url); // serve stale data
}
throw error;
}
}
Communication Templates
Internal (Slack)
P[1-4] INCIDENT: Firecrawl Integration
Status: INVESTIGATING
Impact: [Describe user-facing impact]
Error: [401/402/429/500] — [brief description]
Action: [What you're doing right now]
Next update: [time]
Post-Incident
Evidence Collection
set -euo pipefail
# Collect debug bundle
mkdir -p incident-$(date +%Y%m%d)
curl -s https://api.firecrawl.dev/v1/team/credits \
-H "Authorization: Bearer $FIRECRAWL_API_KEY" > incident-$(date +%Y%m%d)/credits.json
# Application logs
kubectl logs -l app=my-app --since=1h | grep -i firecrawl > incident-$(date +%Y%m%d)/logs.txt 2>/dev/null || true
Postmortem Template
## Incident: Firecrawl [Error Type]
Date: YYYY-MM-DD | Duration: X hours | Severity: P[1-4]
### Summary
[1-2 sentence description]
### Timeline
- HH:MM — [First alert]
- HH:MM — [Investigation started]
- HH:MM — [Root cause identified]
- HH:MM — [Resolved]
### Root Cause
[Technical explanation]
### Action Items
- [ ] [Preventive measure] — Owner — Due date
Error Handling
| Issue | Cause | Solution |
|---|---|---|
| Can't reach Firecrawl API | Network/DNS issue | Try from different network, check DNS |
| All scrapes return empty | Target site changed | Verify manually, adjust scrape options |
| Crawl jobs never complete | Queue backup | Cancel stuck jobs, reduce concurrency |
| Webhook endpoint unreachable | Deployment issue | Check HTTPS cert, DNS, firewall |
Resources
Next Steps
For data handling, see firecrawl-data-handling.