Claude Code skill pack for Anthropic (30 skills)
Installation
Open Claude Code and run:

```shell
/plugin install anthropic-pack@claude-code-plugins-plus
```

Use `--global` to install for all projects, or `--project` for the current project only.
Skills (30)
Debug complex Claude API issues including context window overflow, tool use failures, streaming corruption, and response quality problems.
Anthropic Advanced Troubleshooting
Issue: Context Window Overflow
```python
# Symptom: invalid_request_error about token count
# Diagnosis: pre-check with the Token Counting API
import anthropic

client = anthropic.Anthropic()
count = client.messages.count_tokens(
    model="claude-sonnet-4-20250514",
    messages=conversation_history,
    system=system_prompt,
)
print(f"Input tokens: {count.input_tokens}")
# Claude Sonnet: 200K context, Claude Opus: 200K context

# Fix: truncate oldest messages or summarize
def trim_conversation(messages: list, max_tokens: int = 180_000) -> list:
    """Keep recent messages within the token budget."""
    # Always keep the first message (system context) and the last 5
    if len(messages) <= 5:
        return messages
    return messages[:1] + messages[-5:]  # Crude but effective
```
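The crude trimmer keeps a fixed message count rather than an actual token budget. A budget-aware sketch is below; the 4-characters-per-token ratio is a rough heuristic of ours, so use the Token Counting API when accuracy matters.

```python
def trim_to_budget(messages: list, max_tokens: int = 180_000) -> list:
    """Drop oldest messages (after the first) until the estimated total fits."""
    def est(msg):
        # Heuristic: roughly 4 characters per token
        return max(1, len(str(msg.get("content", ""))) // 4)

    kept = list(messages)
    # Always preserve the first message (system context anchor)
    while len(kept) > 2 and sum(est(m) for m in kept) > max_tokens:
        kept.pop(1)  # drop the oldest non-first message
    return kept
```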
Issue: Tool Use Not Triggering
```python
# Symptom: Claude responds with text instead of calling tools
# Diagnosis checklist:
# 1. The tool description must clearly state WHEN to use the tool
# 2. The user message must match the tool's trigger condition

# BAD description (too vague):
{"name": "search", "description": "Search for things"}

# GOOD description (clear trigger):
{"name": "search_products", "description": "Search the product catalog by name, category, or price range. Use whenever the user asks about products, pricing, or availability."}

# Force tool use if needed:
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    tool_choice={"type": "any"},  # Must call at least one tool
    messages=[{"role": "user", "content": "Find products under $50"}],
)
```
Issue: Streaming Drops or Corruption
```python
# Symptom: stream ends prematurely or text is garbled
# Cause: network interruption, proxy timeout, or very large response
# Fix: retry the whole stream (Claude streams are NOT resumable)
def resilient_stream(client, max_retries: int = 3, **kwargs):
    """Stream with retry on failure.

    Each retry restarts generation from the beginning and re-yields
    everything, so callers must discard previously received text on retry.
    """
    for attempt in range(max_retries):
        try:
            with client.messages.stream(**kwargs) as stream:
                for text in stream.text_stream:
                    yield text
            return  # Success
        except Exception:
            if attempt == max_retries - 1:
                raise
            print(f"Stream interrupted, retrying ({attempt + 1}/{max_retries})")
```
Issue: Unexpected Stop Reason
Choose and implement Claude API architecture patterns for different scales: serverless, microservice, event-driven, and edge deployment.
Anthropic Architecture Variants
Overview
Four validated architecture patterns for Claude API integrations at different scales and use cases.
Variant 1: Serverless (AWS Lambda / Cloud Functions)
```python
# Best for: < 100 RPM, event-driven, pay-per-invocation workloads
# lambda_function.py
import json

import anthropic

def handler(event, context):
    client = anthropic.Anthropic()  # Key from a Lambda environment variable
    body = json.loads(event["body"])
    msg = client.messages.create(
        model="claude-haiku-4-20250514",  # Haiku keeps Lambda latency low
        max_tokens=512,
        messages=[{"role": "user", "content": body["prompt"]}],
    )
    return {
        "statusCode": 200,
        "body": json.dumps({
            "text": msg.content[0].text,
            "tokens": msg.usage.input_tokens + msg.usage.output_tokens,
        }),
    }
```
Trade-offs: Cold starts add 1-3s. Lambda timeout (15min) limits long generations. No connection pooling between invocations.
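Cold starts can be softened by constructing the SDK client once per container instead of once per invocation, so warm invocations reuse the HTTPS connection. A sketch of the lazy-singleton pattern; the injectable `factory` parameter is ours, added so the pattern can be exercised without credentials:

```python
_client = None

def get_client(factory=None):
    """Lazily create and cache the SDK client at module scope.

    `factory` defaults to anthropic.Anthropic; warm Lambda invocations
    reuse the cached instance (and its connection pool).
    """
    global _client
    if _client is None:
        if factory is None:
            import anthropic  # deferred so the import cost is paid once
            factory = anthropic.Anthropic
        _client = factory()
    return _client
```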
Variant 2: Streaming Microservice (FastAPI + WebSocket)
```python
# Best for: chatbots, interactive UIs, real-time responses
import anthropic
from fastapi import FastAPI, WebSocket

app = FastAPI()
client = anthropic.AsyncAnthropic()  # async client so the event loop isn't blocked

@app.websocket("/chat")
async def chat_ws(websocket: WebSocket):
    await websocket.accept()
    while True:
        prompt = await websocket.receive_text()
        async with client.messages.stream(
            model="claude-sonnet-4-20250514",
            max_tokens=2048,
            messages=[{"role": "user", "content": prompt}],
        ) as stream:
            async for text in stream.text_stream:
                await websocket.send_text(text)
        await websocket.send_text("[DONE]")
```
Variant 3: Queue-Based Pipeline (Celery / Cloud Tasks)
```python
# Best for: batch processing, async workflows, high volume
import anthropic
from celery import Celery

app = Celery("tasks", broker="redis://localhost")

@app.task(bind=True, max_retries=3, default_retry_delay=30)
def process_document(self, doc_id: str, content: str):
    try:
        client = anthropic.Anthropic()
        msg = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=2048,
            messages=[{"role": "user", "content": f"Summarize:\n\n{content}"}],
        )
        save_result(doc_id, msg.content[0].text)
    except anthropic.RateLimitError as e:
        # Honor the server's retry-after hint when rescheduling
        self.retry(exc=e, countdown=int(e.response.headers.get("retry-after", 30)))
```
Variant 4: Multi-Model Orchestrator
```python
# Best for: complex workflows needing different model strengths
```

Configure CI/CD pipelines for Anthropic Claude API integrations.
Anthropic CI Integration
Overview
Set up CI/CD pipelines that validate Claude API integrations with mock-based unit tests (free, fast) and prompt regression tests (live API, gated to main).
GitHub Actions Workflow
```yaml
# .github/workflows/claude-tests.yml
name: Claude API Tests
on: [push, pull_request]
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: '3.12' }
      - run: pip install anthropic pytest
      - run: pytest tests/unit/ -v  # No API key needed
  prompt-regression:
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: '3.12' }
      - run: pip install anthropic pytest pytest-timeout  # pytest-timeout provides --timeout
      - run: pytest tests/prompt_regression/ -v --timeout=60
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
```
Mock-Based Unit Tests
```python
# tests/unit/test_tool_routing.py
from unittest.mock import MagicMock, patch

def make_mock_message(text="Hello", stop_reason="end_turn"):
    msg = MagicMock()
    msg.id = "msg_mock_123"
    msg.model = "claude-sonnet-4-20250514"
    msg.stop_reason = stop_reason
    block = MagicMock()
    block.type = "text"
    block.text = text
    msg.content = [block]
    msg.usage = MagicMock(input_tokens=100, output_tokens=50)
    return msg

@patch("anthropic.Anthropic")
def test_service_returns_text(MockClient):
    MockClient.return_value.messages.create.return_value = make_mock_message("42")
    from myapp.service import ask_claude
    assert ask_claude("What is 6*7?") == "42"
```
Prompt Regression Tests
```python
# tests/prompt_regression/test_prompts.py
import json
import os

import anthropic
import pytest

pytestmark = pytest.mark.skipif(not os.getenv("ANTHROPIC_API_KEY"), reason="No API key")
client = anthropic.Anthropic()

def test_json_output_format():
    msg = client.messages.create(
        model="claude-haiku-4-20250514",
        max_tokens=256,
        messages=[
            {"role": "user", "content": "Extract: 'Alice, 30, NYC'. Return JSON: {name, age, city}"},
            # Prefilling the assistant turn with "{" forces JSON output
            {"role": "assistant", "content": "{"},
        ],
    )
    data = json.loads("{" + msg.content[0].text)
    assert "name" in data and "age" in data
```
```python
def test_system_prompt_boundary():
    msg = client.messages.create(
        model="claude-haiku-4-20250514",
        max_tokens=128,
        system="You only discuss cooking recipes. For other topics say: 'I only help with cooking.
```

Diagnose and fix Anthropic Claude API errors by HTTP status code.
Anthropic Common Errors
Overview
Quick reference for all Claude API error types with exact HTTP codes, error bodies, and fixes. The API returns errors as JSON: {"type": "error", "error": {"type": "...", "message": "..."}}.
Error Reference
400 — invalid_request_error

```json
{"type": "error", "error": {"type": "invalid_request_error", "message": "messages: roles must alternate between \"user\" and \"assistant\""}}
```
Common causes and fixes:
| Message pattern | Cause | Fix |
|---|---|---|
| `messages: roles must alternate` | Consecutive same-role messages | Merge adjacent user/assistant messages |
| `max_tokens: must be >= 1` | Missing or zero `max_tokens` | Always set `max_tokens` (required parameter) |
| `model: invalid model id` | Typo in model name | Use the exact ID, e.g. `claude-sonnet-4-20250514` |
| `messages.0.content: empty` | Empty message content | Ensure content is a non-empty string or array |
| `tool_result: tool_use_id not found` | Mismatched tool ID | Copy `id` from the `tool_use` block exactly |
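These error bodies can also be unpacked in code before matching against the patterns above. A minimal sketch; the function name is ours:

```python
import json

def parse_error_body(body: str) -> tuple[str, str]:
    """Return (error_type, message) from a Claude API error envelope."""
    data = json.loads(body)
    if data.get("type") != "error":
        return ("unknown", body)
    err = data.get("error", {})
    return (err.get("type", "unknown"), err.get("message", ""))
```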
401 — authentication_error
```shell
# Verify your key is set and valid
echo $ANTHROPIC_API_KEY | head -c 15  # Should show: sk-ant-api03-...

# Test directly with curl
curl -s https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model":"claude-sonnet-4-20250514","max_tokens":16,"messages":[{"role":"user","content":"hi"}]}'
```
403 — permission_error
API key lacks required permissions. Generate a new key at console.anthropic.com.
404 — not_found_error
Invalid endpoint or model. Check you're using https://api.anthropic.com/v1/messages and a valid model ID.
429 — rate_limit_error

```json
{"type": "error", "error": {"type": "rate_limit_error", "message": "Number of request tokens has exceeded your per-minute rate limit"}}
```
Check headers for details:
- `retry-after` — seconds to wait
- `anthropic-ratelimit-requests-limit` — RPM cap
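Clients should honor these headers when backing off. A sketch of a delay helper (the name and defaults are ours) that prefers the server's `retry-after` and falls back to exponential backoff with jitter:

```python
import random

def backoff_delay(headers: dict, attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Prefer the server's retry-after header; otherwise exponential backoff."""
    retry_after = headers.get("retry-after")
    if retry_after is not None:
        try:
            return min(float(retry_after), cap)
        except ValueError:
            pass  # e.g. an HTTP-date value; fall through to exponential
    delay = min(base * (2 ** attempt), cap)
    return delay * (0.5 + random.random() / 2)  # jitter in [0.5x, 1x]
```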
Build Claude tool use (function calling) workflows with the Messages API.
Anthropic Core Workflow A — Tool Use (Function Calling)
Overview
Implement Claude's tool use capability, where the model can call functions you define. Claude returns `tool_use` content blocks with structured JSON inputs; your code executes the function and returns `tool_result` blocks. This is the foundation for building AI agents.
Prerequisites
- Completed `anth-install-authsetup`
- Understanding of the Messages API request/response cycle
- Functions or APIs you want Claude to call
Instructions
Step 1: Define Tools
```python
import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city. Use when the user asks about weather conditions.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name, e.g. 'San Francisco, CA'"
                },
                "units": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature units"
                }
            },
            "required": ["city"]
        }
    },
    {
        "name": "search_database",
        "description": "Search product database by query string. Returns matching products.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "max_results": {"type": "integer", "default": 10}
            },
            "required": ["query"]
        }
    }
]
```
Step 2: Send Request with Tools
```python
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
)
# Claude responds with stop_reason="tool_use"
# message.content contains both text and tool_use blocks:
# [
#   {"type": "text", "text": "I'll check the weather for you."},
#   {"type": "tool_use", "id": "toolu_01A...", "name": "get_weather",
#    "input": {"city": "Tokyo", "units": "celsius"}}
# ]
```
Step
Build Claude streaming and Message Batches API workflows.
Anthropic Core Workflow B — Streaming & Batches
Overview
Two complementary patterns: real-time streaming for interactive UIs (SSE events via POST /v1/messages with stream: true) and the Message Batches API (POST /v1/messages/batches) for processing up to 100,000 requests asynchronously at 50% cost reduction.
Prerequisites
- Completed `anth-install-authsetup`
- Familiarity with `anth-core-workflow-a` (Messages API basics)
- For batches: understanding of async/polling patterns
Instructions
Streaming — Python SDK
```python
import anthropic

client = anthropic.Anthropic()

# Method 1: High-level streaming (recommended)
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    messages=[{"role": "user", "content": "Write a short story about a robot."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    # After the stream completes, access the full message
    final_message = stream.get_final_message()
print(f"\nUsage: {final_message.usage.input_tokens}+{final_message.usage.output_tokens}")

# Method 2: Event-level streaming (for custom event handling)
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    messages=[{"role": "user", "content": "Explain REST APIs."}]
) as stream:
    for event in stream:
        if event.type == "content_block_delta":
            if event.delta.type == "text_delta":
                print(event.delta.text, end="")
        elif event.type == "message_stop":
            print("\n[Stream complete]")
```
Streaming — TypeScript SDK
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

// High-level streaming
const stream = client.messages.stream({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 2048,
  messages: [{ role: 'user', content: 'Write a haiku about code.' }],
});
stream.on('text', (text) => process.stdout.write(text));
stream.on('finalMessage', (msg) => {
  console.log(`\nTokens: ${msg.usage.input_tokens}+${msg.usage.output_tokens}`);
});
await stream.finalMessage();
```
Streaming with Tool Use
```python
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,  # Same tools array from core-workflow-a
    messages=[{"role": "user", "content": "What's the weather?"}]
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            if event.content_block.type == "
```

Optimize Anthropic Claude API costs with model routing, prompt caching, batching, and spend monitoring.
Anthropic Cost Tuning
Overview
Optimize Claude API spend through model routing, prompt caching, the Message Batches API, and real-time cost tracking. The four biggest levers: model selection (4-19x), prompt caching (10x input), batches (2x), and max_tokens discipline.
Pricing Reference (per million tokens)
| Model | Input | Output | Cache Read | Cache Write |
|---|---|---|---|---|
| Claude Haiku | $0.80 | $4.00 | $0.08 | $1.00 |
| Claude Sonnet | $3.00 | $15.00 | $0.30 | $3.75 |
| Claude Opus | $15.00 | $75.00 | $1.50 | $18.75 |
Message Batches: 50% off all model pricing for async processing.
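Submitting a batch means mapping each request to a `custom_id` plus standard Messages params. A sketch of a payload builder (the helper function is ours); `client.messages.batches.create` is the entry point in recent Python SDK versions, so verify against your installed SDK:

```python
def build_batch_requests(
    prompts: dict[str, str],
    model: str = "claude-haiku-4-20250514",
    max_tokens: int = 256,
) -> list[dict]:
    """Map custom_id -> prompt into the Message Batches request format."""
    return [
        {
            "custom_id": custom_id,
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        for custom_id, prompt in prompts.items()
    ]

# Submission (requires network + API key):
# import anthropic
# client = anthropic.Anthropic()
# batch = client.messages.batches.create(
#     requests=build_batch_requests({"doc-1": "Summarize ..."})
# )
# print(batch.id, batch.processing_status)  # poll until ended
```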
Cost Calculator
```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    model: str = "claude-sonnet-4-20250514",
    cached_input: int = 0,
    use_batch: bool = False,
) -> float:
    pricing = {
        "claude-haiku-4-20250514": {"input": 0.80, "output": 4.00, "cache_read": 0.08},
        "claude-sonnet-4-20250514": {"input": 3.00, "output": 15.00, "cache_read": 0.30},
        "claude-opus-4-20250514": {"input": 15.00, "output": 75.00, "cache_read": 1.50},
    }
    rates = pricing[model]
    uncached_input = input_tokens - cached_input
    cost = (
        uncached_input * rates["input"]
        + cached_input * rates["cache_read"]
        + output_tokens * rates["output"]
    ) / 1_000_000
    if use_batch:
        cost *= 0.5
    return cost

# Example: 10K requests/day, 500 input + 200 output tokens each
daily = estimate_cost(500, 200, "claude-sonnet-4-20250514") * 10_000
print(f"Daily: ${daily:.2f}")         # $0.0045/request * 10K = $45/day
print(f"Monthly: ${daily * 30:.2f}")  # $1,350/month

# Same workload with Haiku + batching
daily_optimized = estimate_cost(500, 200, "claude-haiku-4-20250514", use_batch=True) * 10_000
print(f"Optimized: ${daily_optimized:.2f}/day")  # $6/day (7.5x cheaper)
```
Strategy 1: Model Routing
```python
def route_to_model(task: str, complexity: str) -> str:
    """Route tasks to the cheapest adequate model."""
    # Haiku: classification, extraction, yes/no, routing ($0.80/$4)
    if task in ("classify", "extract", "route", "validate"):
        return "claude-haiku-4-20250514"
    # Sonnet: general tasks, code, tool use ($3/$15)
    if complexity in ("low", "medium"):
        return "claude-sonnet-4-20250514"
    # Opus: only for complex reasoning, research
```

Implement data privacy, PII handling, and compliance patterns for Claude API.
Anthropic Data Handling
Overview
Anthropic's data policies: API inputs/outputs are NOT used for model training (commercial API). Zero-day retention is available. This skill covers PII redaction before sending to Claude and compliance patterns.
Anthropic Data Policies
| Policy | Details |
|---|---|
| Training data | API data is NOT used for training (commercial API) |
| Data retention | 30-day default; 0-day available via agreement |
| Encryption | TLS 1.2+ in transit, AES-256 at rest |
| SOC 2 Type II | Certified |
| HIPAA BAA | Available for eligible customers |
PII Redaction Before API Calls
```python
import re

import anthropic

def redact_pii(text: str) -> tuple[str, dict]:
    """Redact PII before sending to Claude; return a map for restoration."""
    redaction_map = {}
    patterns = [
        (r'\b\d{3}-\d{2}-\d{4}\b', 'SSN', '[SSN-REDACTED-{}]'),
        (r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', 'EMAIL', '[EMAIL-REDACTED-{}]'),
        (r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b', 'PHONE', '[PHONE-REDACTED-{}]'),
        (r'\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b', 'CARD', '[CARD-REDACTED-{}]'),
    ]
    counter = 0
    for pattern, label, replacement in patterns:
        for match in re.finditer(pattern, text):
            counter += 1
            placeholder = replacement.format(counter)
            redaction_map[placeholder] = match.group()
            text = text.replace(match.group(), placeholder, 1)
    return text, redaction_map

def restore_pii(text: str, redaction_map: dict) -> str:
    """Restore redacted PII in Claude's response."""
    for placeholder, original in redaction_map.items():
        text = text.replace(placeholder, original)
    return text

# Usage
user_input = "Contact John at john@example.com or 555-123-4567"
safe_input, redactions = redact_pii(user_input)
# safe_input: "Contact John at [EMAIL-REDACTED-1] or [PHONE-REDACTED-2]"

client = anthropic.Anthropic()
msg = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=256,
    messages=[{"role": "user", "content": safe_input}],
)
final_output = restore_pii(msg.content[0].text, redactions)
```
Audit Logging
```python
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("claude.audit")

def audited_request(client, user_id: str, purpose: str, **kwargs):
    """Wrap Claude API calls with audit logging."""
    # Log request metadata (never log content)
```

Collect Anthropic Claude API debug evidence for support and troubleshooting.
Anthropic Debug Bundle
Overview
Collect diagnostic information for Claude API issues. Every API response includes a request-id header — this is the single most important piece of data for Anthropic support.
Prerequisites
- Anthropic SDK installed
- Access to application logs
- `ANTHROPIC_API_KEY` set in environment
Instructions
Step 1: Capture Request ID
```python
import anthropic

client = anthropic.Anthropic()
try:
    message = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=64,
        messages=[{"role": "user", "content": "test"}],
    )
    print(f"Request ID: {message._request_id}")  # req_01A1B2C3...
except anthropic.APIStatusError as e:
    print(f"Request ID: {e.response.headers.get('request-id')}")
    print(f"Status: {e.status_code}")
    print(f"Error: {e.message}")
```
```typescript
// TypeScript — access raw response headers
const response = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 64,
  messages: [{ role: 'user', content: 'test' }],
}).asResponse();
console.log('Request ID:', response.headers.get('request-id'));
console.log('Rate limit remaining:', response.headers.get('anthropic-ratelimit-requests-remaining'));
```
Step 2: Debug Bundle Script
```shell
#!/bin/bash
# anthropic-debug-bundle.sh
BUNDLE_DIR="anthropic-debug-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BUNDLE_DIR"
echo "=== Anthropic Debug Bundle ===" > "$BUNDLE_DIR/summary.txt"
echo "Generated: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$BUNDLE_DIR/summary.txt"

# SDK versions
echo -e "\n--- SDK Versions ---" >> "$BUNDLE_DIR/summary.txt"
pip show anthropic 2>/dev/null | grep -E "^(Name|Version)" >> "$BUNDLE_DIR/summary.txt"
npm list @anthropic-ai/sdk 2>/dev/null >> "$BUNDLE_DIR/summary.txt"
python3 --version >> "$BUNDLE_DIR/summary.txt" 2>&1
node --version >> "$BUNDLE_DIR/summary.txt" 2>&1

# API key status (NEVER log the key itself)
echo -e "\n--- Auth Status ---" >> "$BUNDLE_DIR/summary.txt"
echo "ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY:+SET (${#ANTHROPIC_API_KEY} chars)}" >> "$BUNDLE_DIR/summary.txt"

# Connectivity test with headers
echo -e "\n--- API Connectivity ---" >> "$BUNDLE_DIR/summary.txt"
curl -s -w "\nHTTP %{http_code} | Time: %{time_total}s" \
  -o "$BUNDLE_DIR/api-response.json" \
  -D "$BUNDLE_DIR/response-headers.txt" \
  https://api.anthropic.com
```

Deploy Claude API integrations to production cloud environments.
Anthropic Deploy Integration
Overview
Deploy Claude API integrations with proper secret management, health checks, and rollback procedures across Docker, GCP Cloud Run, and Kubernetes.
Docker Deployment
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY src/ ./src/
# Key is injected at deploy time; never bake it into the image
ENV ANTHROPIC_API_KEY=""
EXPOSE 8000
CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
```python
# src/main.py
import anthropic
from fastapi import FastAPI, HTTPException

app = FastAPI()
client = anthropic.Anthropic()

@app.get("/health")
async def health():
    try:
        client.messages.count_tokens(
            model="claude-haiku-4-20250514",
            messages=[{"role": "user", "content": "ping"}],
        )
        return {"status": "healthy", "api": "connected"}
    except Exception as e:
        raise HTTPException(503, detail=str(e))
```
GCP Cloud Run
```shell
echo -n "sk-ant-api03-..." | gcloud secrets create anthropic-key --data-file=-
gcloud run deploy claude-service \
  --image gcr.io/my-project/claude-service \
  --set-secrets ANTHROPIC_API_KEY=anthropic-key:latest \
  --min-instances 1 --max-instances 10 \
  --memory 512Mi --timeout 120s
```
Kubernetes
```yaml
apiVersion: apps/v1
kind: Deployment
metadata: { name: claude-service }
spec:
  replicas: 3
  strategy: { type: RollingUpdate, rollingUpdate: { maxUnavailable: 1 } }
  template:
    spec:
      containers:
        - name: app
          env:
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef: { name: anthropic-secrets, key: api-key }
          livenessProbe:
            httpGet: { path: /health, port: 8000 }
            periodSeconds: 30
```
Rollback
```shell
# Cloud Run
gcloud run services update-traffic claude-service --to-revisions=PREVIOUS=100
# Kubernetes
kubectl rollout undo deployment/claude-service
```
Error Handling
| Issue | Cause | Fix |
|---|---|---|
| Container crash on start | Missing API key env var | Verify secret binding |
| Health check fails | Key invalid in prod | Test key with curl |
| 429 after scaling up | More replicas = more RPM | Shared rate limiter (Redis) |
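The "shared rate limiter" fix in the table can start from a plain token bucket. A process-local sketch (class name and parameters are ours); in a multi-replica deployment the bucket state would live in Redis, e.g. refilled and decremented atomically in a Lua script, so all replicas draw from one budget:

```python
import time

class TokenBucket:
    """Process-local token bucket limiting requests per minute."""

    def __init__(self, rate_per_minute: int):
        self.capacity = float(rate_per_minute)
        self.tokens = float(rate_per_minute)
        self.refill_per_sec = rate_per_minute / 60.0
        self.last = time.monotonic()

    def try_acquire(self) -> bool:
        """Take one token if available; refill based on elapsed time."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```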
Resources
Next Steps
For event-driven patterns,
Configure Anthropic enterprise organization management, Workspaces, and role-based access control for teams.
Anthropic Enterprise RBAC
Overview
Anthropic provides organization-level access control through Workspaces, API key scoping, and member roles via the Console at console.anthropic.com.
Organization Structure
```
Organization (billing entity)
├── Workspace: Production
│   ├── API Key: sk-ant-api03-prod-main-...
│   ├── API Key: sk-ant-api03-prod-batch-...
│   └── Rate limits: Tier 4
├── Workspace: Staging
│   ├── API Key: sk-ant-api03-stg-...
│   └── Rate limits: Tier 2
└── Workspace: Development
    ├── API Key: sk-ant-api03-dev-...
    └── Rate limits: Tier 1
```
Console Roles
| Role | Capabilities |
|---|---|
| Owner | Full access, billing, member management |
| Admin | Manage workspaces, API keys, view usage |
| Developer | Create/revoke own API keys, view own usage |
| Billing | View invoices and usage reports only |
Application-Level RBAC
```python
# Implement your own RBAC on top of Anthropic Workspaces
from enum import Enum

import anthropic

class UserRole(Enum):
    VIEWER = "viewer"      # Can read Claude responses (no direct API)
    USER = "user"          # Can send prompts (rate limited)
    POWER_USER = "power"   # Can use Opus, higher limits
    ADMIN = "admin"        # Can access all models, highest limits

ROLE_CONFIG = {
    UserRole.VIEWER: {"allowed": False},
    UserRole.USER: {
        "allowed": True,
        "models": ["claude-haiku-4-20250514"],
        "max_tokens": 512,
        "rpm_limit": 10,
    },
    UserRole.POWER_USER: {
        "allowed": True,
        "models": ["claude-haiku-4-20250514", "claude-sonnet-4-20250514", "claude-opus-4-20250514"],
        "max_tokens": 4096,
        "rpm_limit": 60,
    },
    UserRole.ADMIN: {
        "allowed": True,
        "models": ["claude-haiku-4-20250514", "claude-sonnet-4-20250514", "claude-opus-4-20250514"],
        "max_tokens": 8192,
        "rpm_limit": 200,
    },
}

def create_message(user_role: UserRole, model: str, **kwargs):
    config = ROLE_CONFIG[user_role]
    if not config["allowed"]:
        raise PermissionError("Role does not allow API access")
    if model not in config["models"]:
        raise PermissionError(f"Role cannot access model: {model}")
    kwargs["max_tokens"] = min(kwargs.get("max_tokens", 1024), config["max_tokens"])
    client = anthropic.Anthropic()
    return client.messages.create(model=model, **kwargs)
```
Key Management Best Practices
Create a minimal working Anthropic Claude Messages API example.
Anthropic Hello World
Overview
Three minimal examples covering the Claude Messages API core surfaces: basic text completion, vision (image analysis), and streaming responses.
Prerequisites
- Completed `anth-install-authsetup`
- Valid `ANTHROPIC_API_KEY` in environment
- Python 3.8+ with the `anthropic` package, or Node.js 18+ with `@anthropic-ai/sdk`
Instructions
Example 1: Basic Text Message (Python)
```python
import anthropic

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain quantum computing in 3 sentences."}
    ],
)

# Response structure
print(message.content[0].text)         # The actual text response
print(f"ID: {message.id}")             # msg_01XFDUDYJgAACzvnptvVoYEL
print(f"Model: {message.model}")       # claude-sonnet-4-20250514
print(f"Stop: {message.stop_reason}")  # end_turn
print(f"Usage: {message.usage.input_tokens} in / {message.usage.output_tokens} out")
```
Example 2: Vision — Analyze an Image (TypeScript)
```typescript
import Anthropic from '@anthropic-ai/sdk';
import * as fs from 'fs';

const client = new Anthropic();

// From file (base64)
const imageData = fs.readFileSync('chart.png').toString('base64');
const message = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{
    role: 'user',
    content: [
      {
        type: 'image',
        source: {
          type: 'base64',
          media_type: 'image/png',
          data: imageData,
        },
      },
      { type: 'text', text: 'Describe what this chart shows.' },
    ],
  }],
});
console.log(message.content[0].type === 'text' ? message.content[0].text : '');
```
Example 3: Streaming Response (Python)
```python
import anthropic

client = anthropic.Anthropic()
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a haiku about APIs."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    # Get the final message with full metadata
    final = stream.get_final_message()
print(f"\nTokens used: {final.usage.input_tokens}+{final.usage.output_tokens}")
Output
- Working code file with Claude client initialization
- Successful API response with text content
- Console output showing model response and usage metadata
Error Handling
Execute incident response procedures for Claude API outages and degradation.
Allowed tools: `Read`, `Bash(curl:*)`, `Grep`
Anthropic Incident Runbook

Severity Classification
Immediate Triage (First 5 Minutes)
Decision Tree
Mitigation Actions

Rate Limiting (429)
API Outage (500/529)

Install and configure Anthropic Claude SDK authentication for Python and TypeScript.
Allowed tools: `Read`, `Write`, `Edit`, `Bash(npm:*)`, `Bash(pip:*)`, `Grep`
Anthropic Install & Auth

Overview

Set up the official Anthropic SDK for Python or TypeScript and configure API key authentication. The SDK wraps the Claude Messages API at https://api.anthropic.com/v1/messages.

Prerequisites
InstructionsStep 1: Install SDK
Step 2: Configure API Key
Step 3: Verify Connection (Python)
Step 4: Verify Connection (TypeScript)
Output
Error Handling
Side-by-Side Code Comparison
Tool Use Migration

Configure Claude API across dev, staging, and production environments with isolated keys, model routing, and spend controls per environment.
Allowed tools: `Read`, `Write`, `Edit`, `Bash(npm:*)`, `Grep`
Anthropic Multi-Environment Setup

Overview

Configure isolated Claude API environments with per-env API keys, model selection, and spend controls using Anthropic Workspaces.

Environment Configuration
Anthropic Workspaces (Key Isolation)

Create separate Workspaces in console.anthropic.com:
Each workspace has independent API keys, usage tracking, and rate limits.

Environment Files
Client Factory
Per-Environment Model Override

Set up observability for Claude API integrations with metrics, logging, and alerting for latency, cost, errors, and token usage.
Allowed tools: `Read`, `Write`, `Edit`, `Grep`
Anthropic Observability

Overview

Instrument Claude API calls with structured logging, Prometheus metrics, and cost tracking. Every API response includes a `request-id` header.

Structured Logging
Prometheus Metrics

Optimize Claude API performance with prompt caching, model selection, streaming, and latency reduction techniques.
Allowed tools: `Read`, `Write`, `Edit`, `Grep`
Anthropic Performance Tuning

Overview

Optimize Claude API latency and throughput via prompt caching, model selection, streaming, and request optimization. The biggest wins come from prompt caching (90% input cost reduction) and model selection (Haiku is roughly 4x faster than Sonnet).

Prompt Caching (Biggest Win)
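Caching works by marking a stable prefix with `cache_control` so repeat requests pay the cheaper cache-read rate on that prefix. A sketch (the helper name is ours) of wrapping a large system prompt in an ephemeral cache block:

```python
def build_cached_system(system_text: str) -> list[dict]:
    """Wrap a large, stable system prompt in a cache_control block."""
    return [
        {
            "type": "text",
            "text": system_text,
            "cache_control": {"type": "ephemeral"},
        }
    ]

# Usage (network call; the cache only engages above the minimum prefix size):
# import anthropic
# client = anthropic.Anthropic()
# msg = client.messages.create(
#     model="claude-sonnet-4-20250514",
#     max_tokens=512,
#     system=build_cached_system(big_system_prompt),
#     messages=[{"role": "user", "content": "..."}],
# )
```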
Cache requirements: minimum 1,024 tokens for Sonnet/Opus, 2,048 for Haiku. The cache lives for 5 minutes (refreshed on each hit).

Model Selection for Speed
Streaming for Perceived Speed

Implement content policy guardrails, input/output validation, and usage governance for Claude API integrations.
Allowed tools: `Read`, `Write`, `Edit`, `Grep`
Anthropic Policy Guardrails

Overview

Implement application-level guardrails for Claude API: input validation, output filtering, topic restrictions, and cost governance. These complement Claude's built-in safety (Anthropic Usage Policy).

Input Guardrails
System Prompt Guardrails
Output Guardrails

Execute production deployment checklist for Claude API integrations.
Allowed tools: `Read`, `Bash(curl:*)`, `Grep`
Anthropic Production Checklist

Overview

Complete checklist for deploying Claude API integrations to production with reliability, observability, and cost controls.

Pre-Launch Checklist

Authentication & Keys
Error Handling
Rate Limits & Cost
Reliability
Observability
Implement Anthropic Claude API rate limiting, backoff, and quota management.
Allowed tools: `Read`, `Write`, `Edit`
Anthropic Rate Limits

Overview

The Claude API uses token-bucket rate limiting measured in three dimensions: requests per minute (RPM), input tokens per minute (ITPM), and output tokens per minute (OTPM). Limits increase automatically as you move through usage tiers.

Rate Limit Dimensions
Limits are per-organization and per-model-class. Cached input tokens do NOT count toward ITPM limits.

Usage Tiers (Auto-Upgrade)
Check your current tier and limits at console.anthropic.com.

SDK Built-In Retry
Custom Rate Limiter with Header Awareness

Implement Claude API reference architectures for common use cases.
Allowed tools: `Read`, `Write`, `Edit`, `Grep`
Anthropic Reference Architecture

Overview

Three validated architecture patterns for Claude API integrations: synchronous API gateway, async queue-based processing, and multi-model routing.

Architecture 1: Sync API Gateway (Simple)
Architecture 2: Async Queue-Based (Scalable)
Architecture 3: Multi-Model Router
Implement reliability patterns for Claude API: circuit breakers, graceful degradation, idempotency, and fallback strategies.
Allowed tools: `Read`, `Write`, `Edit`, `Grep`
Anthropic Reliability Patterns

Overview

Production reliability patterns for Claude API: circuit breaker (prevent cascading failures), graceful degradation (serve fallbacks), idempotency (safe retries), and timeout management.

Circuit Breaker
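A minimal sketch of the circuit-breaker pattern named above; the thresholds and class shape are illustrative, not a prescribed implementation:

```python
import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures; reject calls until
    `cooldown` seconds pass, then allow a trial call (half-open)."""

    def __init__(self, threshold: int = 5, cooldown: float = 30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            return True  # half-open: permit one trial call
        return False

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = time.monotonic()
```

Wrap each Claude call in `allow()` / `record_success()` / `record_failure()`, and serve a cached or degraded response while the breaker is open.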
Graceful Degradation

Apply production-ready Anthropic SDK patterns for TypeScript and Python.
Allowed tools: `Read`, `Write`, `Edit`
Anthropic SDK Patterns

Overview

Production-ready patterns for the Anthropic SDK covering client management, error handling, type safety, and multi-tenant configurations.

Prerequisites
Pattern 1: Typed Wrapper with Retry
Pattern 2: Multi-Turn Conversation Manager

Apply Anthropic Claude API security best practices for key management, input validation, and prompt injection defense.
Allowed tools: `Read`, `Write`, `Grep`
Anthropic Security Basics

Overview

Security practices for Claude API integrations: API key management, input sanitization, prompt injection defense, and output validation.

API Key Security

Environment-Based Key Management
Key Rotation Procedure
Workspace Key Isolation

Use Anthropic Workspaces to isolate keys per team/environment:
Prompt Injection Defense
Input Validation

Upgrade Anthropic SDK versions and migrate between Claude API versions.
Allowed tools: `Read`, `Write`, `Edit`, `Bash(npm:*)`, `Bash(pip:*)`, `Bash(git:*)`
Anthropic Upgrade & Migration

Overview

Guide for upgrading the Anthropic SDK and migrating between API versions. The SDK follows semver; major versions may have breaking changes.

Check Current Versions
Upgrade Path

Step 1: Create Upgrade Branch
Step 2: Upgrade SDK
Step 3: Review Breaking Changes

Key breaking changes by version:

Python SDK 0.20+ (anthropic-version: 2023-06-01)
Python SDK 0.30+ (streaming changes)
TypeScript SDK 0.20+ (import path change)
Step 4: Update API Version Header
Step 5: Run Tests and Verify
Migration: Text Completions to Messages