Complete Deepgram integration skill pack with 24 skills covering speech-to-text, real-time transcription, voice intelligence, and audio processing. Flagship tier vendor pack.
Installation
Open Claude Code and run this command:
/plugin install deepgram-pack@claude-code-plugins-plus
Use --global to install for all projects, or --project for current project only.
Skills (24)
Configure Deepgram CI/CD integration for automated testing and deployment.
Deepgram CI Integration
Overview
Set up CI/CD pipelines for Deepgram integrations with GitHub Actions. Includes unit tests with mocked SDK, integration tests against the real API, smoke tests, automated key rotation, and deployment gates.
Prerequisites
- GitHub repository with Actions enabled
- `DEEPGRAM_API_KEY` stored as a repository secret
- `@deepgram/sdk` and `vitest` installed
- Test fixtures committed (or downloaded in CI)
Instructions
Step 1: GitHub Actions Workflow
# .github/workflows/deepgram-ci.yml
name: Deepgram CI
on:
push:
branches: [main]
pull_request:
branches: [main]
env:
NODE_VERSION: '20'
jobs:
unit-tests:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
cache: npm
- run: npm ci
- run: npm run lint
- run: npm run typecheck
- run: npm test -- --reporter=verbose
# Unit tests use mocked SDK — no API key needed
integration-tests:
runs-on: ubuntu-latest
needs: unit-tests
if: github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
cache: npm
- run: npm ci
- run: npm run test:integration
env:
DEEPGRAM_API_KEY: ${{ secrets.DEEPGRAM_API_KEY }}
timeout-minutes: 5
smoke-test:
runs-on: ubuntu-latest
needs: integration-tests
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
cache: npm
- run: npm ci && npm run build
- name: Smoke test
run: npx tsx scripts/smoke-test.ts
env:
DEEPGRAM_API_KEY: ${{ secrets.DEEPGRAM_API_KEY }}
timeout-minutes: 2
Step 2: Integration Test Suite
// tests/integration/deepgram.test.ts
import { describe, it, expect, beforeAll } from 'vitest';
import { createClient, DeepgramClient } from '@deepgram/sdk';
const SAMPLE_URL = 'https://static.deepgram.com/examples/Bueller-Life-moves-702702706.wav';
describe('Deepgram Integration', () => {
let client: DeepgramClient;
beforeAll(() => {
const key = process.env.DEEPGRAM_API_KEY;
if (!key) throw new Error('DEEPGRAM_API_KEY required for integration tests');
client = createClient(key);
});
it('authenticates successfully', async () => {
const { result, error } = await client.manage.getProjects();
expect(error).toBeNull();
expect(result.projects.length).toBeGreaterThan(0);
});
});
Diagnose and fix common Deepgram errors and issues.
Deepgram Common Errors
Overview
Comprehensive error reference for Deepgram API integration. Covers HTTP error codes, WebSocket errors, transcription quality issues, SDK-specific problems, and audio format debugging with real diagnostic commands.
Prerequisites
- Deepgram API key configured
- `curl` available for API testing
- Access to application logs
Instructions
Step 1: Quick Diagnostic
# Test API key validity
curl -s -w "\nHTTP %{http_code}\n" \
'https://api.deepgram.com/v1/projects' \
-H "Authorization: Token $DEEPGRAM_API_KEY"
# Test transcription endpoint
curl -s -w "\nHTTP %{http_code}\n" \
-X POST 'https://api.deepgram.com/v1/listen?model=nova-3&smart_format=true' \
-H "Authorization: Token $DEEPGRAM_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url":"https://static.deepgram.com/examples/Bueller-Life-moves-702702706.wav"}'
Step 2: HTTP Error Reference
| Code | Error | Cause | Solution |
|---|---|---|---|
| 400 | Bad Request | Invalid audio format, bad params | Check audio headers, validate query params |
| 401 | Unauthorized | Invalid/expired API key | Regenerate in Console > API Keys |
| 403 | Forbidden | Key lacks scope | Create key with listen scope for STT |
| 404 | Not Found | Wrong endpoint URL | Use api.deepgram.com/v1/listen |
| 408 | Timeout | Audio too long for sync | Use callback param for async |
| 413 | Payload Too Large | File exceeds 2GB | Split with ffmpeg -f segment -segment_time 3600 |
| 429 | Too Many Requests | Concurrency limit hit | Implement backoff, check plan limits |
| 500 | Internal Error | Deepgram server error | Retry with backoff, check status.deepgram.com |
| 502 | Bad Gateway | Upstream failure | Retry after 5-10 seconds |
| 503 | Service Unavailable | Maintenance/overload | Check status.deepgram.com, retry later |
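The retry guidance in the table above can be wrapped in a small helper that fails fast on configuration errors (4xx) and backs off on transient ones. A minimal sketch — the function names and error shape are our own conventions, not part of the SDK:

```typescript
// Status codes from the table above that are worth retrying
const RETRYABLE = new Set([408, 429, 500, 502, 503]);

export function isRetryable(status: number): boolean {
  return RETRYABLE.has(status);
}

export async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      lastError = err;
      // If the error carries a known non-retryable status, fail fast
      const status = err?.status ?? err?.response?.status;
      if (status !== undefined && !isRetryable(status)) throw err;
      if (attempt < maxAttempts) {
        // Exponential backoff: 1s, 2s, 4s...
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
      }
    }
  }
  throw lastError;
}
```

Wrap any transcription call in `withRetry(() => transcriber.transcribeUrl(url))`; 401/403 errors surface immediately instead of burning retry budget.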
Step 3: WebSocket Errors
import { LiveTranscriptionEvents } from '@deepgram/sdk';
connection.on(LiveTranscriptionEvents.Error, (error) => {
console.error('WebSocket error:', {
message: error.message,
type: error.type,
});
});
// Common WebSocket issues:
// 1. Connection closes after ~10s of silence
// Fix: Send keepAlive() every 8 seconds
connection.keepAlive();
// 2. "CouImplement production pre-recorded speech-to-text with Deepgram.
Deepgram Core Workflow A: Pre-recorded Transcription
Overview
Production pre-recorded transcription service using Deepgram's REST API. Covers transcribeUrl and transcribeFile, speaker diarization, audio intelligence (summarization, topic detection, sentiment, intent), batch processing with concurrency control, and callback-based async transcription for large files.
Prerequisites
- `@deepgram/sdk` installed, `DEEPGRAM_API_KEY` configured
- Audio files: WAV, MP3, FLAC, OGG, M4A, or WebM
- For batch: `p-limit` package (`npm install p-limit`)
Instructions
Step 1: Transcription Service Class
import { createClient, DeepgramClient } from '@deepgram/sdk';
import { readFileSync } from 'fs';
interface TranscribeOptions {
model?: 'nova-3' | 'nova-2' | 'nova-2-meeting' | 'nova-2-phonecall' | 'base';
language?: string;
diarize?: boolean;
utterances?: boolean;
paragraphs?: boolean;
smart_format?: boolean;
summarize?: boolean; // Audio intelligence
detect_topics?: boolean; // Topic detection
sentiment?: boolean; // Sentiment analysis
intents?: boolean; // Intent recognition
keywords?: string[]; // Keyword boosting: ["term:weight"]
callback?: string; // Async callback URL
}
class DeepgramTranscriber {
private client: DeepgramClient;
constructor(apiKey: string) {
this.client = createClient(apiKey);
}
async transcribeUrl(url: string, opts: TranscribeOptions = {}) {
const { result, error } = await this.client.listen.prerecorded.transcribeUrl(
{ url },
{
model: opts.model ?? 'nova-3',
language: opts.language ?? 'en',
smart_format: opts.smart_format ?? true,
diarize: opts.diarize ?? false,
utterances: opts.utterances ?? false,
paragraphs: opts.paragraphs ?? false,
summarize: opts.summarize ? 'v2' : undefined,
detect_topics: opts.detect_topics ?? false,
sentiment: opts.sentiment ?? false,
intents: opts.intents ?? false,
keywords: opts.keywords,
callback: opts.callback,
}
);
if (error) throw new Error(`Transcription failed: ${error.message}`);
return result;
}
async transcribeFile(filePath: string, opts: TranscribeOptions = {}) {
const audio = readFileSync(filePath);
const mimetype = this.detectMimetype(filePath);
const { result, error } = await this.client.listen.prerecorded.transcribeFile(
audio,
{
model: opts.model ?? 'nova-3',
smart_format: opts.smart_format ?? true,
mimetype,
diarize: opts.diarize ?? false,
utterances: opts.utterances ?? false,
summarize: opts.summarize ? 'v2' : undefined,
}
);
if (error) throw new Error(`Transcription failed: ${error.message}`);
return result;
}
// Minimal extension-to-mimetype mapping used by transcribeFile above
private detectMimetype(filePath: string): string {
const ext = filePath.split('.').pop()?.toLowerCase() ?? '';
const map: Record<string, string> = {
wav: 'audio/wav', mp3: 'audio/mpeg', flac: 'audio/flac',
ogg: 'audio/ogg', m4a: 'audio/mp4', webm: 'audio/webm',
};
return map[ext] ?? 'audio/wav';
}
}
Implement real-time streaming transcription with Deepgram WebSocket.
Deepgram Core Workflow B: Live Streaming Transcription
Overview
Real-time streaming transcription using Deepgram's WebSocket API. The SDK manages the WebSocket connection via listen.live(). Covers microphone capture, interim/final result handling, speaker diarization, UtteranceEnd detection, auto-reconnect, and building an SSE endpoint for browser clients.
Prerequisites
- `@deepgram/sdk` installed, `DEEPGRAM_API_KEY` configured
- Audio source: microphone (via Sox `rec`), file stream, or WebSocket audio from browser
- For mic capture: `sox` installed (`apt install sox` / `brew install sox`)
Instructions
Step 1: Basic Live Transcription
import { createClient, LiveTranscriptionEvents } from '@deepgram/sdk';
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);
const connection = deepgram.listen.live({
model: 'nova-3',
language: 'en',
smart_format: true,
punctuate: true,
interim_results: true, // Show in-progress results
utterance_end_ms: 1000, // Silence threshold for utterance end
vad_events: true, // Voice activity detection events
encoding: 'linear16', // 16-bit PCM
sample_rate: 16000, // 16 kHz
channels: 1, // Mono
});
// Connection lifecycle events
connection.on(LiveTranscriptionEvents.Open, () => {
console.log('WebSocket connected to Deepgram');
});
connection.on(LiveTranscriptionEvents.Close, () => {
console.log('WebSocket closed');
});
connection.on(LiveTranscriptionEvents.Error, (err) => {
console.error('Deepgram error:', err);
});
// Transcript events
connection.on(LiveTranscriptionEvents.Transcript, (data) => {
const transcript = data.channel.alternatives[0]?.transcript;
if (!transcript) return;
if (data.is_final) {
console.log(`[FINAL] ${transcript}`);
} else {
process.stdout.write(`\r[interim] ${transcript}`);
}
});
// UtteranceEnd — fires when speaker pauses
connection.on(LiveTranscriptionEvents.UtteranceEnd, () => {
console.log('\n--- utterance end ---');
});
Step 2: Microphone Capture with Sox
import { spawn } from 'child_process';
function startMicrophone(connection: any) {
// Sox captures from default mic: 16kHz, 16-bit signed LE, mono
const mic = spawn('rec', [
'-q', // Quiet (no progress)
'-r', '16000', // Sample rate
'-e', 'signed', // Encoding
'-b', '16', // Bit depth
'-c', '1', // Mono
'-t', 'raw', // Raw PCM output
'-', // Output to stdout
]);
mic.stdout.on('data', (chunk: Buffer) => {
connection.send(chunk); // Stream raw PCM straight to Deepgram
});
mic.stderr.on('data', (d: Buffer) => console.error(`sox: ${d}`));
return mic;
}
Optimize Deepgram costs and usage for budget-conscious deployments.
Deepgram Cost Tuning
Overview
Optimize Deepgram API costs through smart model selection, audio preprocessing to reduce billable minutes, usage monitoring via the Deepgram API, budget guardrails, and feature-aware cost estimation. Deepgram bills per audio minute processed.
Deepgram Pricing (2026)
| Product | Model | Price/Minute | Notes |
|---|---|---|---|
| STT (Batch) | Nova-3 | $0.0043 | Best accuracy |
| STT (Batch) | Nova-2 | $0.0043 | Proven stable |
| STT (Streaming) | Nova-3 | $0.0059 | Real-time |
| STT (Streaming) | Nova-2 | $0.0059 | Real-time |
| STT (Batch) | Base | $0.0048 | Fastest |
| STT (Batch) | Whisper | $0.0048 | Multilingual |
| TTS | Aura-2 | Pay-per-character | See TTS pricing |
| Intelligence | Summarize/Topics/Sentiment | Included with STT | No extra cost |
Add-on costs:
- Diarization: +$0.0044/min
- Multichannel: billed per channel
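Using the rates above, per-request cost can be estimated up front. A sketch for Nova models — verify current rates against Deepgram's pricing page before relying on the constants:

```typescript
interface CostOptions {
  streaming?: boolean;
  diarize?: boolean;
}

// Rates from the pricing table above (USD per audio minute)
const BATCH_RATE = 0.0043;     // Nova-3 batch
const STREAMING_RATE = 0.0059; // Nova-3 streaming
const DIARIZE_ADDON = 0.0044;  // Diarization add-on

export function estimateNovaCostUsd(minutes: number, opts: CostOptions = {}): number {
  let rate = opts.streaming ? STREAMING_RATE : BATCH_RATE;
  if (opts.diarize) rate += DIARIZE_ADDON;
  return minutes * rate;
}
```

Note that diarization roughly doubles the per-minute batch cost, so enable it only where speaker labels are actually consumed.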
Instructions
Step 1: Budget-Aware Transcription Service
import { createClient } from '@deepgram/sdk';
interface BudgetConfig {
monthlyLimitUsd: number;
warningThreshold: number; // 0.0-1.0 (e.g., 0.8 = warn at 80%)
costPerMinute: number; // Base STT cost
}
class BudgetAwareTranscriber {
private client: ReturnType<typeof createClient>;
private config: BudgetConfig;
private monthlySpendUsd = 0;
private monthlyMinutes = 0;
constructor(apiKey: string, config: BudgetConfig) {
this.client = createClient(apiKey);
this.config = config;
}
async transcribe(source: any, options: any) {
// Estimate cost before transcription
const estimatedCost = this.estimateCost(options);
const projected = this.monthlySpendUsd + estimatedCost;
if (projected > this.config.monthlyLimitUsd) {
throw new Error(
`Budget exceeded: $${this.monthlySpendUsd.toFixed(2)} spent, ` +
`$${this.config.monthlyLimitUsd} limit`
);
}
if (projected > this.config.monthlyLimitUsd * this.config.warningThreshold) {
console.warn(
`Budget warning: ${((projected / this.config.monthlyLimitUsd) * 100).toFixed(0)}% ` +
`of $${this.config.monthlyLimitUsd} limit`
);
}
const { result, error } = await this.client.listen.prerecorded.transcribeUrl(
source, options
);
if (error) throw error;
// Track actual usage
const duration = result.metadata.duration / 60; // Convert to minutes
const actualCost = this.calculateCost(duration, options);
this.monthlyMinutes += duration;
this.monthlySpendUsd += actualCost;
return result;
}
// Sketch: assume 10 minutes when duration is unknown before transcription
private estimateCost(options: any): number {
return 10 * this.perMinuteRate(options);
}
private calculateCost(minutes: number, options: any): number {
return minutes * this.perMinuteRate(options);
}
private perMinuteRate(options: any): number {
// Diarization adds $0.0044/min on top of the base STT rate
return this.config.costPerMinute + (options?.diarize ? 0.0044 : 0);
}
}
Implement audio data handling best practices for Deepgram integrations.
Deepgram Data Handling
Overview
Best practices for handling audio and transcript data with Deepgram. Covers Deepgram's built-in redact parameter for PII, secure audio upload with encryption, transcript storage patterns, data retention policies, and GDPR/HIPAA compliance workflows.
Data Privacy Quick Reference
| Deepgram Feature | What It Does | Enable |
|---|---|---|
| `redact: ['pci']` | Masks credit card numbers in transcript | Query param |
| `redact: ['ssn']` | Masks Social Security numbers | Query param |
| `redact: ['numbers']` | Masks all numeric sequences | Query param |
| Data retention | Deepgram does NOT store audio or transcripts | Default behavior |
Deepgram's data policy: Audio is processed in real-time and not stored. Transcripts are not retained unless you use Deepgram's optional storage features.
Instructions
Step 1: Deepgram Built-in PII Redaction
import { createClient } from '@deepgram/sdk';
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);
// Deepgram redacts PII directly during transcription
const { result } = await deepgram.listen.prerecorded.transcribeUrl(
{ url: audioUrl },
{
model: 'nova-3',
smart_format: true,
redact: ['pci', 'ssn'], // Credit cards + SSNs -> [REDACTED]
}
);
// Output: "My card is [REDACTED] and SSN is [REDACTED]"
console.log(result.results.channels[0].alternatives[0].transcript);
// For maximum privacy, redact all numbers:
// redact: ['pci', 'ssn', 'numbers']
Step 2: Application-Level PII Redaction
// Additional redaction patterns beyond Deepgram's built-in
const piiPatterns: Array<{ name: string; pattern: RegExp; replacement: string }> = [
{ name: 'email', pattern: /\b[\w.-]+@[\w.-]+\.\w{2,}\b/g, replacement: '[EMAIL]' },
{ name: 'phone', pattern: /\b(\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b/g, replacement: '[PHONE]' },
{ name: 'dob', pattern: /\b(0[1-9]|1[0-2])[\/.-](0[1-9]|[12]\d|3[01])[\/.-](19|20)\d{2}\b/g, replacement: '[DOB]' },
{ name: 'address', pattern: /\b\d{1,5}\s[\w\s]+(?:Street|St|Avenue|Ave|Road|Rd|Drive|Dr|Lane|Ln|Boulevard|Blvd)\b/gi, replacement: '[ADDRESS]' },
];
function redactPII(text: string): { redacted: string; found: string[] } {
let redacted = text;
const found: string[] = [];
for (const { name, pattern, replacement } of piiPatterns) {
const matches = text.match(pattern);
if (matches) {
found.push(`${name}: ${matches.length} occurrence(s)`);
redacted = redacted.replace(pattern, replacement);
}
}
return { redacted, found };
}
Collect Deepgram debug evidence for support and troubleshooting.
Deepgram Debug Bundle
Current State
!node --version 2>/dev/null || echo 'Node.js not installed'
!npm list @deepgram/sdk 2>/dev/null | grep deepgram || echo '@deepgram/sdk not found'
!python3 --version 2>/dev/null || echo 'Python not installed'
Overview
Collect comprehensive debug information for Deepgram support tickets. Generates a sanitized bundle with environment info, API connectivity tests, audio analysis, request/response logs, and a minimal reproduction script. All API keys are automatically redacted.
Prerequisites
- Deepgram API key configured
- `ffprobe` available for audio analysis (part of ffmpeg)
- Sample audio that reproduces the issue
Instructions
Step 1: Environment Collection Script
#!/bin/bash
set -euo pipefail
BUNDLE_DIR="deepgram-debug-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BUNDLE_DIR"
# System info
{
echo "=== System ==="
uname -a
echo ""
echo "=== Node.js ==="
node --version 2>/dev/null || echo "Not installed"
echo ""
echo "=== @deepgram/sdk ==="
npm list @deepgram/sdk 2>/dev/null || echo "Not installed"
echo ""
echo "=== Python ==="
python3 --version 2>/dev/null || echo "Not installed"
pip show deepgram-sdk 2>/dev/null || echo "Not installed"
} > "$BUNDLE_DIR/environment.txt"
Step 2: API Connectivity Tests
# Test REST API
{
echo "=== REST API Test ==="
echo "Timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
echo ""
echo "--- Project listing ---"
curl -s -w "\nHTTP: %{http_code} | Time: %{time_total}s\n" \
'https://api.deepgram.com/v1/projects' \
-H "Authorization: Token $DEEPGRAM_API_KEY" 2>&1 | \
sed "s/$DEEPGRAM_API_KEY/REDACTED/g"
echo ""
echo "--- Transcription test (Bueller sample) ---"
curl -s -w "\nHTTP: %{http_code} | Time: %{time_total}s\n" \
-X POST 'https://api.deepgram.com/v1/listen?model=nova-3&smart_format=true' \
-H "Authorization: Token $DEEPGRAM_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url":"https://static.deepgram.com/examples/Bueller-Life-moves-702702706.wav"}' 2>&1 | \
sed "s/$DEEPGRAM_API_KEY/REDACTED/g"
echo ""
echo "--- WebSocket handshake test ---"
curl -s -w "\nHTTP: %{http_code}\n" -o /dev/null \
'https://api.deepgram.com/v1/listen' \
-H "Authorization: Token $DEEPGRAM_API_KEY" \
-H "Upgrade: websocket" 2>&1 | \
sed "s/$DEEPGRAM_API_KEY/REDACTED/g"
Deploy Deepgram integrations to production environments.
Deepgram Deploy Integration
Overview
Deploy Deepgram transcription services to Docker, Kubernetes, AWS Lambda, and Google Cloud Run. Includes production Dockerfile, K8s manifests with secret management, serverless handlers for event-driven transcription, and health check patterns.
Prerequisites
- Working Deepgram integration (tested locally)
- Production API key in secret manager
- Container registry access (Docker Hub, ECR, GCR)
- Target platform CLI installed
Instructions
Step 1: Production Dockerfile
# Multi-stage build for minimal production image
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --production=false
COPY tsconfig.json ./
COPY src/ ./src/
RUN npm run build
FROM node:20-alpine AS runtime
# Security: non-root user
RUN addgroup -g 1001 -S app && adduser -S app -u 1001
WORKDIR /app
# Production dependencies only
COPY package*.json ./
RUN npm ci --production && npm cache clean --force
# Copy built application
COPY --from=builder /app/dist ./dist
# Health check (tests Deepgram connectivity)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD wget -q --spider http://localhost:3000/health || exit 1
USER app
EXPOSE 3000
CMD ["node", "dist/server.js"]
Step 2: Docker Compose
# docker-compose.yml
version: '3.8'
services:
deepgram-service:
build: .
ports:
- "3000:3000"
environment:
- NODE_ENV=production
- DEEPGRAM_API_KEY=${DEEPGRAM_API_KEY}
- DEEPGRAM_MODEL=nova-3
healthcheck:
test: ["CMD", "wget", "-q", "--spider", "http://localhost:3000/health"]
interval: 30s
timeout: 10s
retries: 3
restart: unless-stopped
deploy:
resources:
limits:
memory: 512M
cpus: '1.0'
redis:
image: redis:7-alpine
ports:
- "6379:6379"
volumes:
- redis-data:/data
volumes:
redis-data:
Step 3: Kubernetes Deployment
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: deepgram-service
labels:
app: deepgram-service
spec:
replicas: 3
selector:
matchLabels:
app: deepgram-service
template:
metadata:
labels:
app: deepgram-service
spec:
containers:
- name: deepgram-service
image: your-registry/deepgram-service:latest
ports:
- containerPort: 3000
env:
- name: NODE_ENV
value: production
- name: DEEPGRAM_API_KEY
valueFrom:
secretKeyRef:
name: deepgram-secrets
key: api-key
- name: DEEPGRAM_MODEL
value: "nova-3"
Configure enterprise role-based access control for Deepgram integrations.
Deepgram Enterprise RBAC
Overview
Role-based access control for enterprise Deepgram deployments. Maps five application roles to Deepgram API key scopes, implements scoped key provisioning via the Deepgram Management API, Express permission middleware, team management with auto-provisioned keys, and automated key rotation.
Deepgram Scope Reference
| Scope | Permission | Used By |
|---|---|---|
| `member` | Full access (all scopes) | Admin only |
| `listen` | STT transcription | Developers, Services |
| `speak` | TTS synthesis | Developers, Services |
| `manage` | Project/key management | Admin |
| `usage:read` | View usage metrics | Analysts, Auditors |
| `keys:read` | List API keys | Auditors |
| `keys:write` | Create/delete keys | Admin |
Instructions
Step 1: Define Roles and Scope Mapping
interface Role {
name: string;
deepgramScopes: string[];
keyExpiry: number; // Days
description: string;
}
const ROLES: Record<string, Role> = {
admin: {
name: 'Admin',
deepgramScopes: ['member'],
keyExpiry: 90,
description: 'Full access — project and key management',
},
developer: {
name: 'Developer',
deepgramScopes: ['listen', 'speak'],
keyExpiry: 90,
description: 'STT and TTS — no management access',
},
analyst: {
name: 'Analyst',
deepgramScopes: ['usage:read'],
keyExpiry: 365,
description: 'Read-only usage metrics',
},
service: {
name: 'Service Account',
deepgramScopes: ['listen'],
keyExpiry: 90,
description: 'STT only — for automated systems',
},
auditor: {
name: 'Auditor',
deepgramScopes: ['usage:read', 'keys:read'],
keyExpiry: 30,
description: 'Read-only audit access',
},
};
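The Express permission middleware from the overview checks a caller's role scopes before any Deepgram call is proxied. A framework-agnostic sketch — the `req.user` shape and the role map shown here are assumptions about your auth layer:

```typescript
type Handler = (req: any, res: any, next: () => void) => void;

// Mirrors the ROLES scope mapping above (subset shown for illustration)
const roleScopes: Record<string, string[]> = {
  admin: ['member'],
  developer: ['listen', 'speak'],
  analyst: ['usage:read'],
};

// Assumes an earlier auth middleware set req.user.role
export function requireScope(scope: string): Handler {
  return (req, res, next) => {
    const scopes = roleScopes[req.user?.role ?? ''] ?? [];
    // 'member' is Deepgram's full-access scope, so it implies everything
    if (scopes.includes('member') || scopes.includes(scope)) {
      next();
    } else {
      res.status(403).json({ error: `Missing scope: ${scope}` });
    }
  };
}

// Usage: app.post('/transcribe', requireScope('listen'), transcribeHandler);
```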
Step 2: Scoped Key Provisioning
import { createClient } from '@deepgram/sdk';
class DeepgramKeyManager {
private admin: ReturnType<typeof createClient>;
private projectId: string;
constructor(adminKey: string, projectId: string) {
this.admin = createClient(adminKey);
this.projectId = projectId;
}
async createScopedKey(userId: string, roleName: string): Promise<{
keyId: string;
key: string;
scopes: string[];
expiresAt: string;
}> {
const role = ROLES[roleName];
if (!role) throw new Error(`Unknown role: ${roleName}`);
const expirationDate = new Date(Date.now() + role.keyExpiry * 24 * 60 * 60 * 1000);
const { result, error } = await this.admin.manage.createProjectKey(this.projectId, {
comment: `${role.name} key for ${userId}`,
scopes: role.deepgramScopes,
expiration_date: expirationDate.toISOString(),
});
if (error || !result) throw new Error(`Key creation failed: ${error?.message}`);
return {
keyId: result.api_key_id,
key: result.key,
scopes: role.deepgramScopes,
expiresAt: expirationDate.toISOString(),
};
}
}
Create a minimal working Deepgram transcription example.
Deepgram Hello World
Overview
Minimal working examples for Deepgram speech-to-text. Transcribe an audio URL in 5 lines with createClient + listen.prerecorded.transcribeUrl. Includes local file transcription, Python equivalent, and Nova-3 model selection.
Prerequisites
- `npm install @deepgram/sdk` completed
- `DEEPGRAM_API_KEY` environment variable set
- Audio source: URL or local file (WAV, MP3, FLAC, OGG, M4A)
Instructions
Step 1: Transcribe Audio from URL (TypeScript)
import { createClient } from '@deepgram/sdk';
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);
async function main() {
const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
{ url: 'https://static.deepgram.com/examples/Bueller-Life-moves-702702706.wav' },
{
model: 'nova-3', // Latest model — best accuracy
smart_format: true, // Auto-punctuation, paragraphs, numerals
language: 'en',
}
);
if (error) throw error;
const transcript = result.results.channels[0].alternatives[0].transcript;
console.log('Transcript:', transcript);
console.log('Confidence:', result.results.channels[0].alternatives[0].confidence);
}
main();
Step 2: Transcribe a Local File
import { createClient } from '@deepgram/sdk';
import { readFileSync } from 'fs';
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);
async function transcribeFile(filePath: string) {
const audio = readFileSync(filePath);
const { result, error } = await deepgram.listen.prerecorded.transcribeFile(
audio,
{
model: 'nova-3',
smart_format: true,
// Deepgram auto-detects format, but you can specify:
mimetype: 'audio/wav',
}
);
if (error) throw error;
console.log(result.results.channels[0].alternatives[0].transcript);
}
transcribeFile('./meeting-recording.wav');
Step 3: Python Equivalent
import os
from deepgram import DeepgramClient, PrerecordedOptions
client = DeepgramClient(os.environ["DEEPGRAM_API_KEY"])
# URL transcription
url = {"url": "https://static.deepgram.com/examples/Bueller-Life-moves-702702706.wav"}
options = PrerecordedOptions(model="nova-3", smart_format=True, language="en")
response = client.listen.rest.v("1").transcribe_url(url, options)
transcript = response.results.channels[0].alternatives[0].transcript
print(f"Transcript: {transcript}")
print(f"Confidence: {response.results.channels[0].alternatives[0].confidence}")
# Local file transcription
with open("meeting.wav", "rb") as f:
    payload = {"buffer": f.read()}
response = client.listen.rest.v("1").transcribe_file(payload, options)
print(response.results.channels[0].alternatives[0].transcript)
Execute Deepgram incident response procedures for production issues.
Deepgram Incident Runbook
Overview
Standardized incident response for Deepgram-related production issues. Includes automated triage script, severity classification (SEV1-SEV4), immediate mitigation actions, fallback activation, and post-incident review template.
Quick Reference
| Resource | URL |
|---|---|
| Deepgram Status | https://status.deepgram.com |
| Deepgram Console | https://console.deepgram.com |
| Support Email | support@deepgram.com |
| Community | https://github.com/orgs/deepgram/discussions |
Severity Classification
| Level | Definition | Response Time | Example |
|---|---|---|---|
| SEV1 | Complete outage, all transcriptions failing | Immediate | 100% 5xx errors |
| SEV2 | Major degradation, >50% error rate | < 15 min | Specific model failing |
| SEV3 | Minor degradation, elevated latency | < 1 hour | P95 > 30s |
| SEV4 | Single feature affected, cosmetic | < 24 hours | Diarization inaccurate |
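The classification in the table can be automated from your error-rate and latency metrics. A sketch using the table's thresholds — the metric names and the exact cutoffs for "complete outage" vs. "major degradation" are our own interpretation:

```typescript
interface Metrics {
  errorRate: number;         // 0.0–1.0 over the last monitoring window
  p95LatencySeconds: number; // P95 transcription latency
}

export function classifySeverity(m: Metrics): 'SEV1' | 'SEV2' | 'SEV3' | 'SEV4' {
  if (m.errorRate >= 0.99) return 'SEV1';  // complete outage
  if (m.errorRate > 0.5) return 'SEV2';    // major degradation
  if (m.p95LatencySeconds > 30 || m.errorRate > 0.05) return 'SEV3'; // elevated latency/errors
  return 'SEV4'; // single-feature or cosmetic issues — triage manually
}
```

Wire this into your alerting so pages carry a pre-computed severity and the matching response-time expectation.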
Instructions
Step 1: Automated Triage (First 5 Minutes)
#!/bin/bash
set -euo pipefail
echo "=== Deepgram Incident Triage ==="
echo "Timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
echo ""
# 1. Check Deepgram status page
echo "--- Status Page ---"
STATUS=$(curl -s -o /dev/null -w "%{http_code}" https://status.deepgram.com)
echo "Status page: HTTP $STATUS"
# 2. Test API connectivity
echo ""
echo "--- API Connectivity ---"
curl -s -w "\nHTTP: %{http_code} | Latency: %{time_total}s\n" \
'https://api.deepgram.com/v1/projects' \
-H "Authorization: Token $DEEPGRAM_API_KEY" | head -5
# 3. Test transcription
echo ""
echo "--- Transcription Test ---"
RESULT=$(curl -s -w "\n%{http_code}" \
-X POST 'https://api.deepgram.com/v1/listen?model=nova-3&smart_format=true' \
-H "Authorization: Token $DEEPGRAM_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url":"https://static.deepgram.com/examples/Bueller-Life-moves-702702706.wav"}')
HTTP_CODE=$(echo "$RESULT" | tail -1)
echo "Transcription: HTTP $HTTP_CODE"
# 4. Test multiple models
echo ""
echo "--- Model Tests ---"
for MODEL in nova-3 nova-2 base; do
CODE=$(curl -s -o /dev/null -w "%{http_code}" \
-X POST "https://api.deepgram.com/v1/listen?model=$MODEL" \
-H "Authorization: Token $DEEPGRAM_API_KEY" \
-H "Content-Type: application/json" \
-d '{"Install and configure Deepgram SDK authentication.
Deepgram Install & Auth
Current State
!npm list @deepgram/sdk 2>/dev/null || echo '@deepgram/sdk not installed'
!pip show deepgram-sdk 2>/dev/null | grep Version || echo 'deepgram-sdk (Python) not installed'
Overview
Install the Deepgram SDK and configure API key authentication. Deepgram provides speech-to-text (Nova-3, Nova-2), text-to-speech (Aura-2), and audio intelligence APIs. The JS SDK uses createClient() (v3/v4) or new DeepgramClient() (v5+).
Prerequisites
- Node.js 18+ or Python 3.10+
- Deepgram account at console.deepgram.com
- API key from Console > Settings > API Keys
Instructions
Step 1: Install SDK
Node.js (v3/v4 — current stable):
npm install @deepgram/sdk
# or
pnpm add @deepgram/sdk
Python:
pip install deepgram-sdk
Step 2: Configure API Key
# Option A: Environment variable (recommended)
export DEEPGRAM_API_KEY="your-api-key-here"
# Option B: .env file (add .env to .gitignore)
echo 'DEEPGRAM_API_KEY=your-api-key-here' >> .env
Never hardcode keys. Use dotenv for local dev, secret managers in production.
Step 3: Initialize Client (TypeScript)
import { createClient } from '@deepgram/sdk';
// Reads DEEPGRAM_API_KEY from env automatically
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);
SDK v5+ uses a different constructor:
import { DeepgramClient } from '@deepgram/sdk';
const deepgram = new DeepgramClient({ apiKey: process.env.DEEPGRAM_API_KEY });
Step 4: Initialize Client (Python)
from deepgram import DeepgramClient, PrerecordedOptions, LiveOptions
import os
client = DeepgramClient(os.environ["DEEPGRAM_API_KEY"])
Step 5: Verify Connection
// TypeScript — list projects to verify key is valid
async function verify() {
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);
const { result, error } = await deepgram.manage.getProjects();
if (error) {
console.error('Auth failed:', error.message);
process.exit(1);
}
console.log(`Connected. Projects: ${result.projects.length}`);
result.projects.forEach(p => console.log(` - ${p.name} (${p.project_id})`));
}
verify();
# Python — verify connection
from deepgram import DeepgramClient
import os
client = DeepgramClient(os.environ["DEEPGRAM_API_KEY"])
response = client.manage.v("1").get_projects()
print(f"Connected. Projects: {len(response.projects)}")
Configure Deepgram local development workflow with testing and mocks.
Deepgram Local Dev Loop
Overview
Set up a fast local development workflow for Deepgram: test fixtures with sample audio, mock responses for offline unit tests, Vitest integration tests against the real API, and a watch-mode transcription dev server.
Prerequisites
- `@deepgram/sdk` installed, `DEEPGRAM_API_KEY` configured
- `npm install -D vitest tsx dotenv` for testing and dev server
- Optional: `curl` for downloading test fixtures
Instructions
Step 1: Project Structure
mkdir -p src tests/mocks fixtures
touch src/transcribe.ts tests/transcribe.test.ts tests/mocks/deepgram-responses.ts
Step 2: Download Test Fixtures
# Deepgram provides free sample audio files
curl -o fixtures/nasa-podcast.wav \
https://static.deepgram.com/examples/nasa-podcast.wav
curl -o fixtures/bueller.wav \
https://static.deepgram.com/examples/Bueller-Life-moves-702702706.wav
Step 3: Environment Config
# .env.development
DEEPGRAM_API_KEY=your-dev-key
DEEPGRAM_MODEL=nova-3
# .env.test (use a separate test key with low limits)
DEEPGRAM_API_KEY=your-test-key
DEEPGRAM_MODEL=base
{
"scripts": {
"dev": "tsx watch src/transcribe.ts",
"test": "vitest",
"test:watch": "vitest --watch",
"test:integration": "vitest run tests/integration/"
}
}
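Unit tests can then exercise your parsing logic against a canned response without touching the API. A sketch — the `getTranscript` helper is our own, not an SDK function:

```typescript
// Extracts the best transcript from a prerecorded response shape
export function getTranscript(result: any): { text: string; confidence: number } {
  const alt = result?.results?.channels?.[0]?.alternatives?.[0];
  if (!alt) throw new Error('No transcript in response');
  return { text: alt.transcript, confidence: alt.confidence };
}

// Minimal canned response in the prerecorded shape
export const canned = {
  results: {
    channels: [{
      alternatives: [{ transcript: 'Life moves pretty fast.', confidence: 0.98 }],
    }],
  },
};
```

In a Vitest suite this becomes `expect(getTranscript(canned).text).toContain('Life moves')`, running entirely offline.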
Step 4: Mock Deepgram Responses
// tests/mocks/deepgram-responses.ts
export const mockPrerecordedResult = {
metadata: {
request_id: 'mock-request-id-001',
created: '2026-01-01T00:00:00.000Z',
duration: 12.5,
channels: 1,
models: ['nova-3'],
model_info: { 'nova-3': { name: 'nova-3', version: '2026-01-01' } },
},
results: {
channels: [{
alternatives: [{
transcript: 'Life moves pretty fast. If you don\'t stop and look around once in a while, you could miss it.',
confidence: 0.98,
words: [
{ word: 'life', start: 0.08, end: 0.32, confidence: 0.99, punctuated_word: 'Life' },
{ word: 'moves', start: 0.32, end: 0.56, confidence: 0.98, punctuated_word: 'moves' },
{ word: 'pretty', start: 0.56, end: 0.88, confidence: 0.97, punctuated_word: 'pretty' },
{ word: 'fast', start: 0.88, end: 1.12, confidence: 0.99, punctuated_word: 'fast.' },
],
}],
}],
utterances: [{
speaker: 0,
transcript: 'Life moves pretty fast. If you don\'t stop and look around once in a while, you could miss it.',
start: 0.08,
end: 5.44,
}],
},
};
Deep dive into migrating to Deepgram from other transcription providers.
Deepgram Migration Deep Dive
Current State
!npm list @deepgram/sdk 2>/dev/null | grep deepgram || echo 'Not installed'
!npm list @aws-sdk/client-transcribe 2>/dev/null | grep transcribe || echo 'AWS Transcribe SDK not found'
!pip show google-cloud-speech 2>/dev/null | grep Version || echo 'Google STT not found'
Overview
Migrate to Deepgram from AWS Transcribe, Google Cloud Speech-to-Text, Azure Cognitive Services, or OpenAI Whisper. Uses an adapter pattern with a unified interface, parallel running for quality validation, percentage-based traffic shifting, and automated rollback.
Feature Mapping
AWS Transcribe -> Deepgram
| AWS Transcribe | Deepgram | Notes |
|---|---|---|
| `LanguageCode: 'en-US'` | `language: 'en'` | ISO 639-1 (2-letter) |
| `ShowSpeakerLabels: true` | `diarize: true` | Same feature, different param |
| `VocabularyName: 'custom'` | `keywords: ['term:1.5']` | Inline boosting, no pre-upload |
| `ContentRedactionType: 'PII'` | `redact: ['pci', 'ssn']` | Granular PII categories |
| `OutputBucketName` | `callback: 'https://...'` | Callback URL, not S3 |
| Job polling model | Sync response or callback | No polling needed |
Google Cloud STT -> Deepgram
| Google STT | Deepgram | Notes |
|---|---|---|
| `RecognitionConfig.encoding` | Auto-detected | Deepgram auto-detects format |
| `RecognitionConfig.sampleRateHertz` | `sample_rate` (live only) | REST auto-detects |
| `RecognitionConfig.model: 'latest_long'` | `model: 'nova-3'` | Direct mapping |
| `SpeakerDiarizationConfig` | `diarize: true` | Simpler configuration |
| `StreamingRecognize` | `listen.live()` | WebSocket vs gRPC |
OpenAI Whisper -> Deepgram
| Whisper | Deepgram | Notes |
|---|---|---|
| Local GPU processing | API call | No GPU needed |
| `whisper.transcribe(audio)` | `listen.prerecorded.transcribeFile()` | Similar interface |
| `model='large-v3'` | `model: 'nova-3'` | 10-100x faster |
| `language='en'` | `language: 'en'` | Same format |
| No diarization | `diarize: true` | Built-in speaker diarization |
| Setting | Development | Staging | Production |
|---|---|---|---|
| Model | base (fast, cheap) | nova-3 | nova-3 |
| Concurrency | 5 | 20 | 100 |
| Diarization | Off | On | On |
| PII Redaction | Off | On | On |
| Callback URL | localhost:3000 | staging.example.com | api.example.com |
| Key Rotation | Manual | Monthly | 90-day auto |
Instructions
Step 1: Typed Environment Configuration
interface DeepgramEnvConfig {
apiKey: string;
projectId: string;
model: 'base' | 'nova-2' | 'nova-3';
maxConcurrency: number;
features: {
diarize: boolean;
smart_format: boolean;
redact: string[] | false;
summarize: boolean;
};
callbackBaseUrl?: string;
timeout: number;
}
function loadConfig(env: string): DeepgramEnvConfig {
const configs: Record<string, DeepgramEnvConfig> = {
development: {
apiKey: process.env.DEEPGRAM_API_KEY_DEV!,
projectId: process.env.DEEPGRAM_PROJECT_ID_DEV!,
model: 'base',
maxConcurrency: 5,
features: {
diarize: false,
smart_format: true,
redact: false,
summarize: false,
},
callbackBaseUrl: 'http://localhost:3000',
timeout: 60000,
},
staging: {
apiKey: process.env.DEEPGRAM_API_KEY_STAGING!,
projectId: process.env.DEEPGRAM_PROJECT_ID_STAGING!,
model: 'nova-3',
maxConcurrency: 20,
features: {
diarize: true,
smart_format: true,
redact: ['pci', 'ssn'],
summarize: true,
},
callbackBaseUrl: 'https://staging.example.com',
timeout: 30000,
},
production: {
apiKey: process.env.DEEPGRAM_API_KEY_PRODUCTION!,
projectId: process.env.DEEPGRAM_PROJECT_ID_PRODUCTION!,
model: 'nova-3',
maxConcurrency: 100,
features: {
diarize: true,
smart_format: true,
redact: ['pci', 'ssn'],
summarize: true,
},
callbackBaseUrl: 'https://api.example.com',
timeout: 30000,
},
};
const config = configs[env];
if (!config) throw new Error(`Unknown environment: ${env}. Use: development, staging, production`);
return config;
}
Set up comprehensive observability for Deepgram integrations.
Deepgram Observability
Overview
Full observability stack for Deepgram: Prometheus metrics (request counts, latency histograms, audio processed, cost tracking), OpenTelemetry distributed tracing, structured JSON logging with Pino, Grafana dashboard JSON, and AlertManager rules.
Four Pillars
| Pillar | Tool | What It Tracks |
|---|---|---|
| Metrics | Prometheus | Request rate, latency, error rate, audio minutes, estimated cost |
| Traces | OpenTelemetry | End-to-end request flow, Deepgram API span timing |
| Logs | Pino (JSON) | Request details, errors, audit trail |
| Alerts | AlertManager | Error rate >5%, P95 latency >10s, rate limit hits |
Instructions
Step 1: Prometheus Metrics Definition
import { Counter, Histogram, Gauge, Registry, collectDefaultMetrics } from 'prom-client';
const registry = new Registry();
collectDefaultMetrics({ register: registry });
// Request metrics
const requestsTotal = new Counter({
name: 'deepgram_requests_total',
help: 'Total Deepgram API requests',
labelNames: ['method', 'model', 'status'] as const,
registers: [registry],
});
const latencyHistogram = new Histogram({
name: 'deepgram_request_duration_seconds',
help: 'Deepgram API request duration',
labelNames: ['method', 'model'] as const,
buckets: [0.1, 0.5, 1, 2, 5, 10, 30, 60],
registers: [registry],
});
// Usage metrics
const audioProcessedSeconds = new Counter({
name: 'deepgram_audio_processed_seconds_total',
help: 'Total audio seconds processed',
labelNames: ['model'] as const,
registers: [registry],
});
const estimatedCostDollars = new Counter({
name: 'deepgram_estimated_cost_dollars_total',
help: 'Estimated cost in USD',
labelNames: ['model', 'method'] as const,
registers: [registry],
});
// Operational metrics
const activeConnections = new Gauge({
name: 'deepgram_active_websocket_connections',
help: 'Currently active WebSocket connections',
registers: [registry],
});
const rateLimitHits = new Counter({
name: 'deepgram_rate_limit_hits_total',
help: 'Number of 429 rate limit responses',
registers: [registry],
});
export { registry, requestsTotal, latencyHistogram, audioProcessedSeconds,
estimatedCostDollars, activeConnections, rateLimitHits };
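The `deepgram_estimated_cost_dollars_total` counter needs a rate table and a small conversion from audio seconds to dollars. A sketch of that computation (the per-minute rates here are assumptions for illustration; verify against your plan's pricing):

```typescript
// Assumed pay-as-you-go rates, USD per audio minute — illustrative only.
const COST_PER_MINUTE: Record<string, number> = {
  'nova-3': 0.0043,
  'base': 0.0015,
};

// Convert processed audio duration into an estimated cost for the counter.
function estimateCost(model: string, audioSeconds: number): number {
  const rate = COST_PER_MINUTE[model];
  if (rate === undefined) return 0; // unknown model: record nothing rather than guess
  return (audioSeconds / 60) * rate;
}

// After each response:
//   audioProcessedSeconds.inc({ model }, durationSeconds);
//   estimatedCostDollars.inc({ model, method }, estimateCost(model, durationSeconds));
```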
Step 2: Instrumented Deepgram Client
import { createClient, DeepgramClient } from '@deepgram/sdk';
class InstrumentedDeepgram {
private client: DeepgramClient;
private costPerMinute: Record<string, number> = {
'nova-3': 0.0043,
Optimize Deepgram API performance for faster transcription and lower latency.
Deepgram Performance Tuning
Overview
Optimize Deepgram transcription performance through audio preprocessing with ffmpeg, model selection for speed vs accuracy, streaming for large files, parallel processing, result caching, and connection reuse. Targets: <2s latency for short files, 100+ files/minute batch throughput.
Performance Levers
| Factor | Impact | Default | Optimized |
|---|---|---|---|
| Audio format | High | Any format | 16kHz mono WAV |
| Model | High | nova-3 | base (speed) or nova-3 (accuracy) |
| File size | High | Full file sync | Stream >60s, callback >5min |
| Concurrency | Medium | Sequential | 50 parallel (p-limit) |
| Caching | Medium | None | Redis hash by audio+options |
| Features | Medium | All enabled | Disable unused (diarize, utterances) |
Instructions
Step 1: Audio Preprocessing with ffmpeg
# Optimal format for Deepgram: 16kHz, 16-bit, mono, WAV
ffmpeg -i input.mp3 \
-ar 16000 \ # 16kHz sample rate (ideal for speech)
-ac 1 \ # Mono channel
-acodec pcm_s16le \ # 16-bit signed LE PCM
-f wav \
output.wav
# Remove silence (saves API cost + processing time)
ffmpeg -i input.wav \
-af "silenceremove=stop_periods=-1:stop_duration=0.5:stop_threshold=-30dB" \
-ar 16000 -ac 1 -acodec pcm_s16le \
trimmed.wav
# Noise reduction + normalization
ffmpeg -i input.wav \
-af "highpass=f=200,lowpass=f=3000,loudnorm=I=-16:TP=-1.5:LRA=11" \
-ar 16000 -ac 1 -acodec pcm_s16le \
clean.wav
import { execSync } from 'child_process';
import { statSync } from 'fs';
function preprocessAudio(inputPath: string, outputPath: string): {
originalSize: number;
optimizedSize: number;
savings: string;
} {
const originalSize = statSync(inputPath).size;
execSync(`ffmpeg -y -i "${inputPath}" \
-af "silenceremove=stop_periods=-1:stop_duration=0.5:stop_threshold=-30dB,\
highpass=f=200,lowpass=f=3000" \
-ar 16000 -ac 1 -acodec pcm_s16le \
"${outputPath}" 2>/dev/null`);
const optimizedSize = statSync(outputPath).size;
const savings = ((1 - optimizedSize / originalSize) * 100).toFixed(1);
console.log(`Preprocessed: ${inputPath}`);
console.log(` Original: ${(originalSize / 1024).toFixed(0)}KB`);
console.log(` Optimized: ${(optimizedSize / 1024).toFixed(0)}KB (${savings}% smaller)`);
return { originalSize, optimizedSize, savings };
}
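The caching lever from the table above keys cached results by a hash of the audio bytes plus the request options, so any option change busts the cache. A minimal sketch (`transcriptCacheKey` is an illustrative helper; assumes a flat options object):

```typescript
import { createHash } from 'crypto';

// Deterministic cache key: audio content + request options.
function transcriptCacheKey(audio: Buffer, options: Record<string, unknown>): string {
  return createHash('sha256')
    .update(audio)
    // Sort keys so { model, diarize } and { diarize, model } hash identically.
    .update(JSON.stringify(options, Object.keys(options).sort()))
    .digest('hex');
}

// Usage with Redis (sketch):
//   const key = `dg:${transcriptCacheKey(buf, opts)}`;
//   const cached = await redis.get(key);
//   if (cached) return JSON.parse(cached);
```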
Step 2: Model Selection Strategy
import { createClient } from '@deepgram/sdk';
type Priority = 'accuracy' | 'speed';
Execute Deepgram production deployment checklist.
Deepgram Production Checklist
Overview
Comprehensive go-live checklist for Deepgram integrations. Covers singleton client, health checks, Prometheus metrics, alert rules, error handling, and a phased go-live timeline.
Production Readiness Matrix
| Category | Item | Status |
|---|---|---|
| Auth | Production API key with scoped permissions | [ ] |
| Auth | Key stored in secret manager (not env file) | [ ] |
| Auth | Key rotation schedule (90-day) configured | [ ] |
| Auth | Fallback key provisioned and tested | [ ] |
| Resilience | Retry with exponential backoff on 429/5xx | [ ] |
| Resilience | Circuit breaker for cascade failure prevention | [ ] |
| Resilience | Request timeout set (30s pre-recorded, 10s TTS) | [ ] |
| Resilience | Graceful degradation when API unavailable | [ ] |
| Performance | Singleton client (not creating per-request) | [ ] |
| Performance | Concurrency limited (50-80% of plan limit) | [ ] |
| Performance | Audio preprocessed (16kHz mono for best results) | [ ] |
| Performance | Large files use callback URL (async) | [ ] |
| Monitoring | Health check endpoint testing Deepgram API | [ ] |
| Monitoring | Prometheus metrics: latency, error rate, usage | [ ] |
| Monitoring | Alerts: error rate >5%, latency >10s, circuit open | [ ] |
| Security | PII redaction enabled if handling sensitive audio | [ ] |
| Security | Audio URLs validated (HTTPS, no private IPs) | [ ] |
| Security | Audit logging on all operations | [ ] |
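The "Request timeout set" item above can be enforced with a generic deadline wrapper around any SDK call. A minimal sketch (`withTimeout` is an illustrative helper, not part of `@deepgram/sdk`):

```typescript
// Race the SDK call against a deadline; clean up the timer either way.
function withTimeout<T>(promise: Promise<T>, ms: number, label = 'deepgram'): Promise<T> {
  let timer!: ReturnType<typeof setTimeout>;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} request timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Usage (sketch):
//   await withTimeout(deepgram.listen.prerecorded.transcribeUrl(src, opts), 30_000);
```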
Instructions
Step 1: Production Singleton Client
import { createClient, DeepgramClient } from '@deepgram/sdk';
class ProductionDeepgram {
private static client: DeepgramClient | null = null;
static getClient(): DeepgramClient {
if (!this.client) {
const key = process.env.DEEPGRAM_API_KEY;
if (!key) throw new Error('DEEPGRAM_API_KEY required for production');
this.client = createClient(key);
}
return this.client;
}
// Force re-init (for key rotation)
static reset() { this.client = null; }
}
Implement Deepgram rate limiting and backoff strategies.
Deepgram Rate Limits
Overview
Implement rate limiting, exponential backoff, and circuit breaker patterns for Deepgram API. Deepgram limits by concurrent connections (not requests per second). Understanding this model is key to building reliable integrations.
Deepgram Rate Limit Model
Deepgram uses concurrency-based limits, not traditional requests-per-minute:
| Plan | Concurrent Requests (STT) | Concurrent Connections (Live) | Concurrent Requests (TTS) |
|---|---|---|---|
| Pay As You Go | 100 | 100 | 100 |
| Growth | 200 | 200 | 200 |
| Enterprise | Custom | Custom | Custom |
When you exceed your concurrency limit, Deepgram returns 429 Too Many Requests.
Key insight: You can send unlimited total requests — just not more than your concurrency limit simultaneously.
Instructions
Step 1: Concurrency-Aware Queue
import pLimit from 'p-limit';
import { createClient } from '@deepgram/sdk';
class DeepgramRateLimiter {
private limit: ReturnType<typeof pLimit>;
private client: ReturnType<typeof createClient>;
private stats = { total: 0, active: 0, queued: 0, errors: 0 };
constructor(apiKey: string, maxConcurrent = 50) {
// Stay well under plan limit (e.g., 50 of 100 allowed)
this.limit = pLimit(maxConcurrent);
this.client = createClient(apiKey);
}
async transcribe(source: { url: string }, options: Record<string, any>) {
this.stats.queued++;
return this.limit(async () => {
this.stats.queued--;
this.stats.active++;
this.stats.total++;
try {
const { result, error } = await this.client.listen.prerecorded.transcribeUrl(
source, options
);
if (error) {
this.stats.errors++;
throw error;
}
return result;
} catch (err) {
this.stats.errors++;
throw err;
} finally {
this.stats.active--;
}
});
}
getStats() { return { ...this.stats }; }
}
// Usage:
const limiter = new DeepgramRateLimiter(process.env.DEEPGRAM_API_KEY!, 50);
const urls = ['audio1.wav', 'audio2.wav', /* ...hundreds more */];
const results = await Promise.allSettled(
urls.map(url => limiter.transcribe({ url }, { model: 'nova-3', smart_format: true }))
);
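When a 429 does slip through, retry with an exponentially growing, capped, jittered delay so queued retries don't stampede the API at the same instant. A minimal delay calculator (constants are illustrative):

```typescript
// "Full jitter" backoff: delay drawn uniformly from [0, min(cap, base * 2^attempt)).
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * exp;
}

// attempt 0 -> up to 1s, attempt 1 -> up to 2s, ..., capped at 30s.
```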
Step 2: Exponential Backoff with Jitter
class RetryableDeepgramClient {
private client: ReturnType<typeof createClient>;
constructor(apiKey: string) {
this.client = createClient(apiKey);
}
async transcribeWithRetry(
source: any,
options: any,
config = { maxRetries: 5, baseDelayMs: 1000 }
Implement Deepgram reference architecture for scalable transcription systems.
Deepgram Reference Architecture
Overview
Four reference architectures for Deepgram transcription at scale: synchronous REST for short files, async queue (BullMQ) for batch processing, WebSocket proxy for real-time streaming, and a hybrid router that auto-selects the best pattern based on audio duration.
Architecture Selection Guide
| Pattern | Best For | Latency | Throughput | Complexity |
|---|---|---|---|---|
| Sync REST | Files <60s, low volume | Low | Low | Simple |
| Async Queue | Batch, files >60s | Medium | High | Medium |
| WebSocket Proxy | Live audio, real-time | Real-time | Medium | Medium |
| Hybrid Router | Mixed workloads | Varies | High | High |
| Callback | Files >5min, fire-and-forget | N/A | Very High | Low |
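The hybrid router from the table above picks a pattern by audio duration. A sketch, with thresholds taken from the "Best For" column (<60s sync, >5min callback):

```typescript
type Pattern = 'sync-rest' | 'async-queue' | 'callback';

// Route by duration: short files inline, medium through the queue,
// long files fire-and-forget via callback URL.
function choosePattern(durationSeconds: number): Pattern {
  if (durationSeconds < 60) return 'sync-rest';
  if (durationSeconds <= 300) return 'async-queue';
  return 'callback';
}
```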
Instructions
Step 1: Synchronous REST Pattern
import express from 'express';
import { createClient } from '@deepgram/sdk';
const app = express();
app.use(express.json());
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);
// Direct API call — best for short files (<60s)
app.post('/api/transcribe', async (req, res) => {
const { url, model = 'nova-3', diarize = false } = req.body;
try {
const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
{ url },
{ model, smart_format: true, diarize, utterances: diarize }
);
if (error) return res.status(502).json({ error: error.message });
res.json({
transcript: result.results.channels[0].alternatives[0].transcript,
confidence: result.results.channels[0].alternatives[0].confidence,
duration: result.metadata.duration,
request_id: result.metadata.request_id,
utterances: diarize ? result.results.utterances : undefined,
});
} catch (err: any) {
res.status(500).json({ error: err.message });
}
});
Step 2: Async Queue Pattern (BullMQ)
import { Queue, Worker, Job } from 'bullmq';
import { createClient } from '@deepgram/sdk';
import Redis from 'ioredis';
const connection = new Redis(process.env.REDIS_URL ?? 'redis://localhost:6379');
// Producer: submit transcription jobs
const transcriptionQueue = new Queue('transcription', { connection });
async function submitJob(audioUrl: string, options: Record<string, any> = {}) {
const job = await transcriptionQueue.add('transcribe', {
audioUrl,
model: options.model ?? 'nova-3',
diarize: options.diarize ?? false,
submittedAt: new Date().toISOString(),
}, {
attempts: 3,
backoff: { type: 'exponential', delay: 1000 },
Apply production-ready Deepgram SDK patterns for TypeScript and Python.
Deepgram SDK Patterns
Overview
Production patterns for @deepgram/sdk (TypeScript) and deepgram-sdk (Python). Covers singleton client, typed wrappers, text-to-speech with Aura, audio intelligence pipeline, error handling, and SDK v5 migration path.
Prerequisites
- npm install @deepgram/sdk or pip install deepgram-sdk
- DEEPGRAM_API_KEY environment variable configured
Instructions
Step 1: Singleton Client (TypeScript)
import { createClient, DeepgramClient } from '@deepgram/sdk';
class DeepgramService {
private static instance: DeepgramService;
private client: DeepgramClient;
private constructor() {
const apiKey = process.env.DEEPGRAM_API_KEY;
if (!apiKey) throw new Error('DEEPGRAM_API_KEY is required');
this.client = createClient(apiKey);
}
static getInstance(): DeepgramService {
if (!this.instance) this.instance = new DeepgramService();
return this.instance;
}
getClient(): DeepgramClient { return this.client; }
}
export const deepgram = DeepgramService.getInstance().getClient();
Step 2: Text-to-Speech with Aura
import { createClient } from '@deepgram/sdk';
import { writeFileSync } from 'fs';
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);
async function textToSpeech(text: string, outputPath: string) {
const response = await deepgram.speak.request(
{ text },
{
model: 'aura-2-thalia-en', // Female English voice
encoding: 'linear16',
container: 'wav',
sample_rate: 24000,
}
);
const stream = await response.getStream();
if (!stream) throw new Error('No audio stream returned');
// Collect stream into buffer
const reader = stream.getReader();
const chunks: Uint8Array[] = [];
while (true) {
const { done, value } = await reader.read();
if (done) break;
chunks.push(value);
}
const buffer = Buffer.concat(chunks);
writeFileSync(outputPath, buffer);
console.log(`Audio saved: ${outputPath} (${buffer.length} bytes)`);
return buffer;
}
// Aura-2 voice options:
// aura-2-thalia-en — Female, warm
// aura-2-asteria-en — Female, default
// aura-2-orion-en — Male, deep
// aura-2-luna-en — Female, soft
// aura-2-helios-en — Male, authoritative
// aura-asteria-en — Aura v1 fallback
Step 3: Audio Intelligence Pipeline
async function analyzeConversation(audioUrl: string) {
const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
{ url: audioUrl },
{
model: 'nova-3',
smart_format: true,
diarize: true,
utterances: true,
// Audio Intelligence features
summarize: 'v2', // Generates a summary of the conversation
Apply Deepgram security best practices for API key management and data protection.
Deepgram Security Basics
Overview
Security best practices for Deepgram integration: scoped API keys, key rotation, Deepgram's built-in PII redaction feature, client-side temporary keys, SSRF prevention for audio URLs, and audit logging.
Security Checklist
- [ ] API keys in environment variables or secret manager (never in code)
- [ ] Separate keys per environment (dev/staging/prod)
- [ ] Keys scoped to minimum required permissions
- [ ] Key rotation schedule (90 days recommended)
- [ ] Deepgram `redact` option enabled for PII-sensitive audio
- [ ] Audio URLs validated (HTTPS only, no private IPs)
- [ ] Audit logging on all transcription operations
Instructions
Step 1: Scoped API Keys
Create keys with minimal permissions in Console > Settings > API Keys:
// Production transcription service — only needs listen scope
const sttKey = process.env.DEEPGRAM_STT_KEY; // Scope: listen
// TTS service — only needs speak scope
const ttsKey = process.env.DEEPGRAM_TTS_KEY; // Scope: speak
// Monitoring dashboard — only needs usage read
const monitorKey = process.env.DEEPGRAM_MONITOR_KEY; // Scope: usage:read
// Admin operations — separate key, restricted access
const adminKey = process.env.DEEPGRAM_ADMIN_KEY; // Scope: manage, keys
Step 2: Deepgram Built-in PII Redaction
import { createClient } from '@deepgram/sdk';
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);
// Deepgram redacts PII directly in the transcript
const { result } = await deepgram.listen.prerecorded.transcribeUrl(
{ url: audioUrl },
{
model: 'nova-3',
smart_format: true,
// Built-in redaction — replaces sensitive data in transcript
redact: ['pci', 'ssn', 'numbers'],
// pci — Credit card numbers → [REDACTED]
// ssn — Social Security numbers → [REDACTED]
// numbers — All numeric sequences → [REDACTED]
}
);
// Transcript will contain [REDACTED] in place of sensitive numbers
console.log(result.results.channels[0].alternatives[0].transcript);
// "My card number is [REDACTED] and my SSN is [REDACTED]"
Step 3: Temporary Keys for Client-Side
// Generate short-lived keys for browser/mobile clients
// This prevents exposing your main API key
import { createClient } from '@deepgram/sdk';
import express from 'express';
const app = express();
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);
app.post('/api/deepgram/token', async (req, res) => {
// Create a temporary key that expires in 10 seconds
// Use for browser WebSocket connections
const { result, error } = await deepgram.manage.createProjectKey(
process.env.DEEPGRAM_PROJECT_ID!,
{
comment: `temp-key-${
Plan and execute Deepgram SDK upgrades and model migrations.
Deepgram Upgrade Migration
Current State
!npm list @deepgram/sdk 2>/dev/null | grep deepgram || echo 'SDK not installed'
Overview
Guide for Deepgram SDK version upgrades (v3 -> v4 -> v5) and model migrations (Nova-2 -> Nova-3). Includes breaking change maps, side-by-side API comparison, A/B testing scripts, automated validation, and rollback procedures.
SDK Version History
| Version | Client Init | STT API | Live API | TTS API | Status |
|---|---|---|---|---|---|
| v3.x | `createClient(key)` | `listen.prerecorded.transcribeUrl()` | `listen.live()` | `speak.request()` | Stable |
| v4.x | `createClient(key)` | `listen.prerecorded.transcribeUrl()` | `listen.live()` | `speak.request()` | Stable |
| v5.x | `new DeepgramClient({apiKey})` | `listen.v1.media.transcribeUrl()` | `listen.v1.connect()` | `speak.v1.audio.generate()` | Beta |
Instructions
Step 1: Identify Current Version and Breaking Changes
# Check installed version
npm list @deepgram/sdk
# Check latest available
npm view @deepgram/sdk versions --json | tail -5
Step 2: v3/v4 to v5 Migration Map
// ============= CLIENT CREATION =============
// v3/v4:
import { createClient } from '@deepgram/sdk';
const dg = createClient(process.env.DEEPGRAM_API_KEY!);
// v5:
import { DeepgramClient } from '@deepgram/sdk';
const dg = new DeepgramClient({ apiKey: process.env.DEEPGRAM_API_KEY });
// ============= PRE-RECORDED STT =============
// v3/v4:
const { result, error } = await dg.listen.prerecorded.transcribeUrl(
{ url: audioUrl },
{ model: 'nova-3', smart_format: true }
);
// v5:
const response = await dg.listen.v1.media.transcribeUrl(
{ url: audioUrl },
{ model: 'nova-3', smart_format: true }
);
// v5 throws on error instead of returning { error }
// ============= FILE TRANSCRIPTION =============
// v3/v4:
const { result, error } = await dg.listen.prerecorded.transcribeFile(
buffer,
{ model: 'nova-3', mimetype: 'audio/wav' }
);
// v5:
const response = await dg.listen.v1.media.transcribeFile(
createReadStream('audio.wav'),
{ model: 'nova-3' }
);
// ============= LIVE STREAMING =============
// v3/v4:
const connection = dg.listen.live({ model: 'nova-3', encoding: 'linear16' });
connection.on(LiveTranscriptionEvents.Transcript, (data) => { ... });
// v5:
const connection = await dg.listen.v1.connect({ model: 'nova-3', encoding: 'linear16' });
// Note: v5 connect() is async
Implement Deepgram callback and webhook handling for async transcription.
Deepgram Webhooks & Callbacks
Overview
Implement async transcription with Deepgram's callback feature. When you pass a callback URL, Deepgram returns a request_id immediately, processes audio in the background, and POSTs results to your endpoint. Supports HTTP and WebSocket callbacks with automatic retry (10 attempts, 30s intervals).
Deepgram Callback Flow
1. Client -> POST /v1/listen?callback=https://you.com/webhook (with audio)
2. Deepgram -> 200 { request_id: "..." } (immediate)
3. Deepgram processes audio asynchronously
4. Deepgram -> POST https://you.com/webhook (results)
Retries up to 10 times (30s delay) on non-2xx response
Instructions
Step 1: Submit Async Transcription
import { createClient } from '@deepgram/sdk';
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);
async function submitAsync(audioUrl: string, callbackUrl: string) {
// Deepgram sends transcription via callback URL instead of
// holding the connection open.
const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
{ url: audioUrl },
{
model: 'nova-3',
smart_format: true,
diarize: true,
utterances: true,
callback: callbackUrl, // Your HTTPS endpoint
// callback_method: 'put', // Optional: use PUT instead of POST
}
);
if (error) throw new Error(`Submit failed: ${error.message}`);
// Deepgram returns immediately with request_id
const requestId = result.metadata.request_id;
console.log(`Submitted. Request ID: ${requestId}`);
console.log(`Results will be POSTed to: ${callbackUrl}`);
return requestId;
}
// Also works with direct curl:
// curl -X POST 'https://api.deepgram.com/v1/listen?model=nova-3&callback=https://you.com/webhook' \
// -H "Authorization: Token $DEEPGRAM_API_KEY" \
// -H "Content-Type: application/json" \
// -d '{"url":"https://example.com/audio.wav"}'
Step 2: Callback Server
import express from 'express';
import crypto from 'crypto';
const app = express();
// IMPORTANT: Use raw body for HMAC signature verification
app.use('/webhooks/deepgram', express.raw({ type: 'application/json', limit: '50mb' }));
app.post('/webhooks/deepgram', async (req, res) => {
try {
// 1. Verify signature (if webhook secret configured)
const signature = req.headers['x-deepgram-signature'] as string;
if (process.env.DEEPGRAM_WEBHOOK_SECRET && signature) {
const expected = crypto
.createHmac('sha256', process.env.DEEPGRAM_WEBHOOK_SECRET)
.update(req.body)
.digest('hex');
// Timing-safe comparison to prevent timing attacks