elevenlabs-performance-tuning
Optimize ElevenLabs API performance with caching, batching, and connection pooling. Use when experiencing slow API responses, implementing caching strategies, or optimizing request throughput for ElevenLabs integrations. Trigger with phrases like "elevenlabs performance", "optimize elevenlabs", "elevenlabs latency", "elevenlabs caching", "elevenlabs slow", "elevenlabs batch".
Provided by Plugin
elevenlabs-pack
Claude Code skill pack for ElevenLabs (18 skills)
Installation
This skill is included in the elevenlabs-pack plugin:
/plugin install elevenlabs-pack@claude-code-plugins-plus
Instructions
ElevenLabs Performance Tuning
Overview
Optimize ElevenLabs API performance with caching, batching, and connection pooling.
Prerequisites
- ElevenLabs SDK installed
- Understanding of async patterns
- Redis or in-memory cache available (optional)
- Performance monitoring in place
Latency Benchmarks
| Operation | P50 | P95 | P99 |
|---|---|---|---|
| Read | 50ms | 150ms | 300ms |
| Write | 100ms | 250ms | 500ms |
| List | 75ms | 200ms | 400ms |
Caching Strategy
Response Caching
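import { LRUCache } from 'lru-cache';

const cache = new LRUCache<string, any>({
  max: 1000,            // evict least-recently-used entries beyond this count
  ttl: 60000,           // default lifetime: 1 minute
  updateAgeOnGet: true, // each read resets the entry's TTL
});

async function cachedElevenLabsRequest<T>(
  key: string,
  fetcher: () => Promise<T>,
  ttl?: number
): Promise<T> {
  const cached = cache.get(key);
  // Compare against undefined so falsy results (0, '', false) still count as hits.
  if (cached !== undefined) return cached as T;

  const result = await fetcher();
  // When ttl is omitted, the cache-wide default above applies.
  cache.set(key, result, { ttl });
  return result;
}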
Redis Caching (Distributed)
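import Redis from 'ioredis';

// REDIS_URL must be set; the non-null assertion only satisfies the type checker.
const redis = new Redis(process.env.REDIS_URL!);

async function cachedWithRedis<T>(
  key: string,
  fetcher: () => Promise<T>,
  ttlSeconds = 60
): Promise<T> {
  const cached = await redis.get(key);
  // redis.get resolves to null on a miss.
  if (cached !== null) return JSON.parse(cached) as T;

  const result = await fetcher();
  // Values are stored as JSON, so this suits metadata rather than binary audio.
  await redis.setex(key, ttlSeconds, JSON.stringify(result));
  return result;
}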
Request Batching
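import DataLoader from 'dataloader';

const elevenlabsLoader = new DataLoader<string, any>(
  async (ids) => {
    // `elevenlabsClient.batchGet` stands in for a bulk-fetch call; confirm
    // which method your SDK version actually provides before using this.
    const results = await elevenlabsClient.batchGet(ids);
    // DataLoader requires one result per key, in the same order as `ids`.
    return ids.map(id => results.find(r => r.id === id) || null);
  },
  {
    maxBatchSize: 100,
    // Collect load() calls for 10 ms before dispatching a batch.
    batchScheduleFn: callback => setTimeout(callback, 10),
  }
);

// Usage - automatically batched into a single call
const [item1, item2, item3] = await Promise.all([
  elevenlabsLoader.load('id-1'),
  elevenlabsLoader.load('id-2'),
  elevenlabsLoader.load('id-3'),
]);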
Connection Optimization
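import { Agent } from 'https';
// Package name assumed; older SDK releases were published as `elevenlabs`.
import { ElevenLabsClient } from '@elevenlabs/elevenlabs-js';

// Keep-alive connection pooling
const agent = new Agent({
  keepAlive: true,    // reuse sockets instead of a new TLS handshake per request
  maxSockets: 10,     // cap on concurrent sockets per host
  maxFreeSockets: 5,  // idle sockets kept open for reuse
  timeout: 30000,     // socket timeout in milliseconds
});

const client = new ElevenLabsClient({
  apiKey: process.env.ELEVENLABS_API_KEY!,
  // Assumes the client accepts an `httpAgent` option; verify that your
  // SDK version supports it.
  httpAgent: agent,
});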
Pagination Optimization
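async function* paginatedElevenLabsList<T>(
  fetcher: (cursor?: string) => Promise<{ data: T[]; nextCursor?: string }>
): AsyncGenerator<T> {
  let cursor: string | undefined;
  do {
    // Fetch one page at a time so only a single page is held in memory.
    const { data, nextCursor } = await fetcher(cursor);
    for (const item of data) {
      yield item;
    }
    cursor = nextCursor;
  } while (cursor);
}

// Usage. `elevenlabsClient.list` and `handleItem` are placeholders for your
// own list call and per-item handler (named to avoid shadowing Node's
// global `process`).
for await (const item of paginatedElevenLabsList(cursor =>
  elevenlabsClient.list({ cursor, limit: 100 })
)) {
  await handleItem(item);
}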
Performance Monitoring
async function measuredElevenLabsCall<T>(
  operation: string,
  fn: () => Promise<T>
): Promise<T> {
  const start = performance.now();
  try {
    const result = await fn();
    const duration = performance.now() - start;
    console.log({ operation, duration, status: 'success' });
    return result;
  } catch (error) {
    const duration = performance.now() - start;
    console.error({ operation, duration, status: 'error', error });
    throw error;
  }
}
Instructions
Step 1: Establish Baseline
Measure current latency for critical ElevenLabs operations.
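One way to turn recorded durations into a baseline is to compute the same P50/P95/P99 figures used in the Latency Benchmarks table. The sketch below assumes you collect the `duration` values logged by `measuredElevenLabsCall` into an array per operation; the helper names `percentile` and `summarizeLatency` are illustrative and not part of any SDK. It uses the nearest-rank method, which is one of several common percentile definitions.

```typescript
// Nearest-rank percentile over recorded durations (milliseconds).
// `p` is a percentage between 0 and 100.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples recorded');
  const sorted = [...samples].sort((a, b) => a - b);
  // Smallest rank such that at least p% of samples are at or below it.
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Summarize one operation's samples in the shape of the benchmark table.
function summarizeLatency(samples: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}
```

Record a summary before making any changes, then again after each step, so an improvement or regression can be attributed to a specific optimization.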
Step 2: Implement Caching
Add response caching for frequently accessed data.
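A cache is only as good as its keys: two requests with the same parameters should map to the same entry regardless of property order. The following is a minimal sketch of deterministic key derivation using Node's built-in `crypto` module. The `cacheKey` and `stableStringify` helpers, the `elevenlabs:` prefix, and the parameter names in the usage note are assumptions for illustration.

```typescript
import { createHash } from 'node:crypto';

// Serialize with sorted object keys so property order does not affect the key.
function stableStringify(value: unknown): string {
  if (value === null || typeof value !== 'object') {
    // JSON.stringify(undefined) yields undefined at runtime; treat it as null.
    return JSON.stringify(value) ?? 'null';
  }
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  const obj = value as Record<string, unknown>;
  const entries = Object.keys(obj)
    .filter(k => obj[k] !== undefined) // omit undefined, as JSON does
    .sort()
    .map(k => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
  return `{${entries.join(',')}}`;
}

// Build a fixed-length key from an operation name and its parameters.
function cacheKey(operation: string, params: Record<string, unknown>): string {
  const digest = createHash('sha256').update(stableStringify(params)).digest('hex');
  return `elevenlabs:${operation}:${digest}`;
}
```

A key built this way, for example `cacheKey('voices.get', { voiceId })` with a hypothetical `voiceId`, can be passed as the `key` argument of `cachedElevenLabsRequest` or `cachedWithRedis`.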
Step 3: Enable Batching
Use DataLoader or similar for automatic request batching.
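Where DataLoader is not a good fit, the "or similar" option can be as small as splitting a list of IDs into fixed-size batches and issuing one call per batch. The sketch below assumes a caller-supplied `fetchBatch` function; `chunk` and `fetchInChunks` are illustrative names, and nothing here relies on a specific ElevenLabs endpoint.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run `fetchBatch` over each batch in turn and flatten the results.
// Batches run sequentially here to keep load on the API predictable.
async function fetchInChunks<K, V>(
  keys: readonly K[],
  size: number,
  fetchBatch: (batch: K[]) => Promise<V[]>
): Promise<V[]> {
  const results: V[] = [];
  for (const batch of chunk(keys, size)) {
    results.push(...(await fetchBatch(batch)));
  }
  return results;
}
```

Lowering `size` is also the simplest response to the batch timeouts listed under Error Handling.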
Step 4: Optimize Connections
Configure connection pooling with keep-alive.
Output
- Reduced API latency
- Caching layer implemented
- Request batching enabled
- Connection pooling configured
Error Handling
| Issue | Cause | Solution |
|---|---|---|
| Cache miss storm | TTL expired | Use stale-while-revalidate |
| Batch timeout | Too many items | Reduce batch size |
| Connection exhausted | No pooling | Configure max sockets |
| Memory pressure | Cache too large | Set max cache entries |
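The stale-while-revalidate remedy in the table can be sketched as follows. This is a minimal in-memory illustration, not a drop-in replacement for the LRU or Redis caches above: the `SwrCache` class, its `freshMs`/`staleMs` parameters, and the injectable `now` clock are all assumptions made for the example. It also shares a single in-flight request per key, so concurrent misses trigger one fetch rather than many.

```typescript
type Entry<T> = { value: T; freshUntil: number; staleUntil: number };
type Freshness = 'fresh' | 'stale' | 'expired';

// Classify an entry relative to the current time.
function freshness(entry: Entry<unknown> | undefined, now: number): Freshness {
  if (!entry || now >= entry.staleUntil) return 'expired';
  return now < entry.freshUntil ? 'fresh' : 'stale';
}

class SwrCache<T> {
  private entries = new Map<string, Entry<T>>();
  private inFlight = new Map<string, Promise<T>>();
  private freshMs: number;
  private staleMs: number;
  private now: () => number;

  constructor(freshMs: number, staleMs: number, now: () => number = Date.now) {
    this.freshMs = freshMs; // how long a value is served without refreshing
    this.staleMs = staleMs; // extra window in which a stale value may be served
    this.now = now;         // injectable clock, mainly for testing
  }

  async get(key: string, fetcher: () => Promise<T>): Promise<T> {
    const entry = this.entries.get(key);
    const state = freshness(entry, this.now());
    if (entry && state !== 'expired') {
      if (state === 'stale') {
        // Refresh in the background; on failure keep serving the stale value.
        void this.refresh(key, fetcher).catch(() => {});
      }
      return entry.value;
    }
    // No usable value: the caller waits for the fetch.
    return this.refresh(key, fetcher);
  }

  private refresh(key: string, fetcher: () => Promise<T>): Promise<T> {
    const pending = this.inFlight.get(key);
    if (pending) return pending; // share the request already in progress

    const request = Promise.resolve()
      .then(() => fetcher())
      .then(value => {
        const t = this.now();
        this.entries.set(key, {
          value,
          freshUntil: t + this.freshMs,
          staleUntil: t + this.freshMs + this.staleMs,
        });
        return value;
      })
      .finally(() => {
        this.inFlight.delete(key);
      });
    this.inFlight.set(key, request);
    return request;
  }
}
```

With this shape, an expired TTL costs one background refresh instead of a burst of blocking requests. The map is unbounded, so pair it with an entry limit if memory pressure is a concern.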
Examples
Quick Performance Wrapper
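// The cache key is derived from `name` alone, so use a distinct name for
// each distinct set of request arguments.
const withPerformance = <T>(name: string, fn: () => Promise<T>) =>
  measuredElevenLabsCall(name, () =>
    cachedElevenLabsRequest(`cache:${name}`, fn)
  );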
Next Steps
For cost optimization, see elevenlabs-cost-tuning.