Complete Perplexity integration skill pack with 30 skills covering AI search, real-time answers, citations, and research workflows. Flagship+ tier vendor pack.
Installation
Open Claude Code and run this command:
/plugin install perplexity-pack@claude-code-plugins-plus
Use --global to install for all projects, or --project for current project only.
Skills (30)
Apply advanced debugging techniques for hard-to-diagnose Perplexity Sonar API issues.
Perplexity Advanced Troubleshooting
Overview
Deep debugging for Perplexity Sonar API issues that resist standard fixes. Common hard problems: inconsistent citations between identical queries, intermittent timeouts on sonar-pro, search results not matching recency filter, and response quality degradation.
Prerequisites
- Access to production logs and metrics
- curl for direct API testing
- Understanding of Perplexity's search-augmented generation model
Diagnostic Tools
Layer-by-Layer Test
#!/bin/bash
set -euo pipefail
echo "=== Perplexity Layer Diagnostics ==="
# Layer 1: DNS
echo -n "1. DNS: "
dig +short api.perplexity.ai || echo "FAIL"
# Layer 2: TCP connectivity
echo -n "2. TCP: "
timeout 5 bash -c 'echo > /dev/tcp/api.perplexity.ai/443 && echo "OK"' 2>/dev/null || echo "FAIL"
# Layer 3: TLS handshake
echo -n "3. TLS: "
echo | openssl s_client -connect api.perplexity.ai:443 2>/dev/null | grep -c "Verify return code: 0" | sed 's/1/OK/;s/0/FAIL/'
# Layer 4: HTTP with auth
echo -n "4. Auth: "
curl -s -o /dev/null -w "%{http_code}" \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions
echo ""
# Layer 5: Response quality
echo "5. Quality check:"
RESPONSE=$(curl -s \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"What is 2+2?"}],"max_tokens":50}' \
https://api.perplexity.ai/chat/completions)
echo " Model: $(echo "$RESPONSE" | jq -r '.model')"
echo " Answer: $(echo "$RESPONSE" | jq -r '.choices[0].message.content' | head -c 100)"
echo " Citations: $(echo "$RESPONSE" | jq -r '.citations | length')"
echo " Tokens: $(echo "$RESPONSE" | jq -r '.usage.total_tokens')"
Inconsistent Citation Investigation
// Same query can return different citations due to live web search
// Run N times and compare to identify pattern vs randomness
async function citationStabilityTest(query: string, runs: number = 5) {
const results: Array<{ citations: string[]; answer: string }> = [];
for (let i = 0; i < runs; i++) {
const response = await perplexity.chat.completions.create({
model: "sonar",
messages: [{ role: "user", content: query }],
});
results.push({
citations: (response as any).citations || [],
answer: response.choices[0].message.content || "",
});
}
return results;
}
Choose and implement Perplexity architecture blueprints for different scales: direct search widget, cached research layer, and multi-query pipeline.
Perplexity Architecture Variants
Overview
Three validated architectures for Perplexity Sonar API at different scales. Each builds on the previous, adding caching and orchestration as volume grows.
Decision Matrix
| Factor | Direct Widget | Cached Layer | Research Pipeline |
|---|---|---|---|
| Volume | <500/day | 500-5K/day | 5K+/day |
| Latency (p50) | 2-5s | 50ms (cached) / 2-5s (miss) | 10-30s |
| Model | sonar | sonar + cache | sonar + sonar-pro |
| Monthly Cost | <$150 | $50-$300 | $300+ |
| Complexity | Minimal | Moderate | High |
Instructions
Variant 1: Direct Search Widget (<500 queries/day)
Best for: Adding AI search to an existing app. No cache needed at this scale.
// Simple endpoint — add to any Express/Next.js app
import OpenAI from "openai";
const perplexity = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY!,
baseURL: "https://api.perplexity.ai",
});
app.post("/api/search", async (req, res) => {
try {
const response = await perplexity.chat.completions.create({
model: "sonar",
messages: [{ role: "user", content: req.body.query }],
max_tokens: 1024,
});
res.json({
answer: response.choices[0].message.content,
citations: (response as any).citations || [],
});
} catch (err: any) {
if (err.status === 429) {
res.status(429).json({ error: "Rate limited. Try again shortly." });
} else {
res.status(500).json({ error: "Search unavailable" });
}
}
});
Variant 2: Cached Research Layer (500-5K queries/day)
Best for: Repeated queries, knowledge base search, FAQ bots. Cache eliminates duplicate API calls.
import { createHash } from "crypto";
import { LRUCache } from "lru-cache";
const cache = new LRUCache<string, any>({
max: 5000,
ttl: 4 * 3600_000, // 4-hour TTL
});
class CachedSearchService {
constructor(private client: OpenAI) {}
async search(query: string, model = "sonar") {
const key = this.cacheKey(query, model);
const cached = cache.get(key);
if (cached) return { ...cached, cached: true };
const response = await this.client.chat.completions.create({
model,
messages: [{ role: "user", content: query }],
max_tokens: 1024,
});
const result = {
answer: response.choices[0].message.content || "",
citations: (response as any).citations || [],
model: response.model,
};
cache.set(key, result);
return { ...result, cached: false };
}
// Normalize the query so trivially different phrasings share a cache entry
private cacheKey(query: string, model: string): string {
return createHash("sha256").update(`${model}:${query.trim().toLowerCase()}`).digest("hex");
}
}
Configure CI/CD for Perplexity Sonar API integrations with GitHub Actions.
Perplexity CI Integration
Overview
Set up CI/CD pipelines for Perplexity Sonar API integrations. Key CI concerns: live API calls cost money (use mocks for unit tests, reserve live calls for integration tests), API keys must be in GitHub Secrets, and rate limits apply even in CI.
Prerequisites
- GitHub repository with Actions enabled
- Perplexity API key for CI (separate from production)
- Test suite with mocked and live test separation
Instructions
Step 1: Configure GitHub Secret
set -euo pipefail
# Store API key as a GitHub secret
gh secret set PERPLEXITY_API_KEY --body "pplx-your-ci-key-here"
Step 2: GitHub Actions Workflow
# .github/workflows/perplexity-tests.yml
name: Perplexity Integration Tests
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
unit-tests:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: "20"
cache: "npm"
- run: npm ci
- run: npm test -- --coverage
# Unit tests use mocked responses — no API key needed
integration-tests:
runs-on: ubuntu-latest
needs: unit-tests
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
env:
PERPLEXITY_API_KEY: ${{ secrets.PERPLEXITY_API_KEY }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: "20"
cache: "npm"
- run: npm ci
- name: Run live Perplexity integration tests
run: npm run test:integration
timeout-minutes: 5
Step 3: Test Structure
// tests/perplexity.unit.test.ts — runs on every PR, uses mocks
import { describe, it, expect, vi } from "vitest";
import fixture from "./fixtures/sonar-response.json";
vi.mock("openai", () => ({
default: vi.fn().mockImplementation(() => ({
chat: {
completions: {
create: vi.fn().mockResolvedValue(fixture),
},
},
})),
}));
describe("Perplexity Search (mocked)", () => {
it("parses citations from response", async () => {
const { search } = await import("../src/perplexity/search");
const result = await search("test query");
expect(result.citations.length).toBeGreaterThan(0);
});
it("formats answer with citation links", async () => {
const { formatCitationsAsMarkdown } = await import("../src/perplexity/citations");
const formatted = formatCitationsAsMarkdown("See [1] for details", ["https://example.com"]);
expect(formatted).toContain("[1](https://example.com)");
});
});
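The fixture those mocked tests import can be tiny. A hypothetical shape — capture a real sonar response once and replace these values so the mock stays honest (field names here mirror what the parsing code in this pack reads):

```typescript
// tests/fixtures/sonar-response.ts — hypothetical mock shaped like a sonar completion.
const fixture = {
  id: "mock-response-1",
  model: "sonar",
  choices: [
    { message: { role: "assistant", content: "Mock answer with a source [1]." } },
  ],
  citations: ["https://example.com/source"],
  usage: { prompt_tokens: 10, completion_tokens: 12, total_tokens: 22 },
};

console.log(fixture.citations.length); // 1
```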
Diagnose and fix Perplexity Sonar API errors and exceptions.
Perplexity Common Errors
Overview
Quick reference for the most common Perplexity Sonar API errors, their root causes, and fixes. All Perplexity errors follow the OpenAI error format since the API is OpenAI-compatible.
Prerequisites
- PERPLEXITY_API_KEY environment variable set
- curl available for diagnostic commands
Error Reference
401 Unauthorized — Invalid API Key
{"error": {"message": "Invalid API key", "type": "authentication_error", "code": 401}}
Causes: Key missing, expired, revoked, or doesn't start with pplx-.
Fix:
set -euo pipefail
# Verify key is set and has correct prefix
echo "${PERPLEXITY_API_KEY:0:5}" # Should print "pplx-"
# Test key directly
curl -s -o /dev/null -w "%{http_code}" \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions
# 200 = valid, 401 = invalid key
Regenerate at perplexity.ai/settings/api.
429 Too Many Requests — Rate Limited
{"error": {"message": "Rate limit exceeded", "type": "rate_limit_error", "code": 429}}
Causes: Exceeded requests per minute (RPM). Most tiers allow 50 RPM. Perplexity uses a leaky bucket algorithm.
Fix:
async function withBackoff<T>(fn: () => Promise<T>, maxRetries = 5): Promise<T> {
for (let i = 0; i <= maxRetries; i++) {
try {
return await fn();
} catch (err: any) {
if (err.status !== 429 || i === maxRetries) throw err;
const delay = Math.pow(2, i) * 1000 + Math.random() * 500;
console.log(`Rate limited. Retrying in ${delay.toFixed(0)}ms...`);
await new Promise(r => setTimeout(r, delay));
}
}
throw new Error("Unreachable");
}
See perplexity-rate-limits for queue-based solutions.
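The helper composes with any async call. A self-contained demo with a stubbed flaky function and shortened delays, so it can run without an API key:

```typescript
// Demo of the backoff pattern above. The stub fails twice with a 429-shaped
// error before succeeding; delays are shortened from seconds to milliseconds.
async function withBackoff<T>(fn: () => Promise<T>, maxRetries = 5): Promise<T> {
  for (let i = 0; i <= maxRetries; i++) {
    try {
      return await fn();
    } catch (err: any) {
      if (err.status !== 429 || i === maxRetries) throw err;
      const delay = Math.pow(2, i) * 10; // demo-scale delays
      await new Promise((r) => setTimeout(r, delay));
    }
  }
  throw new Error("Unreachable");
}

let attempts = 0;
const flaky = async () => {
  attempts++;
  if (attempts < 3) throw Object.assign(new Error("rate limited"), { status: 429 });
  return "ok";
};

const demo = withBackoff(flaky).then((result) => {
  console.log(`${result} after ${attempts} attempts`);
  return result;
});
```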
400 Bad Request — Invalid Model
{"error": {"message": "Invalid model: gpt-4", "type": "invalid_request_error"}}
Cause: Using a non-Perplexity model name.
Valid models: sonar, sonar-pro, sonar-reasoning-pro, sonar-deep-research.
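A small pre-flight guard (a hypothetical helper, not part of the API) catches this before a request is spent:

```typescript
// Hypothetical guard: fail fast on non-Perplexity model names
// instead of paying for a 400 round trip.
const VALID_MODELS = new Set([
  "sonar",
  "sonar-pro",
  "sonar-reasoning-pro",
  "sonar-deep-research",
]);

function assertValidModel(model: string): void {
  if (!VALID_MODELS.has(model)) {
    throw new Error(
      `Invalid Perplexity model "${model}". Valid: ${[...VALID_MODELS].join(", ")}`
    );
  }
}
```

Call it immediately before `chat.completions.create` so misconfigured model names surface as a local error rather than an API 400.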
Execute Perplexity primary workflow: single-query search with citations.
Perplexity Core Workflow A: Search with Citations
Overview
Primary money-path workflow: send a search query to Perplexity Sonar, receive a web-grounded answer with inline citations, parse and display the results. This is the single-query pattern used for search widgets, fact-checking, and real-time information retrieval.
Prerequisites
- Completed perplexity-install-auth setup
- openai package installed
- PERPLEXITY_API_KEY set
Instructions
Step 1: Initialize Client and Send Query
import OpenAI from "openai";
const perplexity = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY,
baseURL: "https://api.perplexity.ai",
});
async function searchWithCitations(query: string) {
const response = await perplexity.chat.completions.create({
model: "sonar",
messages: [
{
role: "system",
content: "Provide accurate, well-sourced answers. Cite your sources inline.",
},
{ role: "user", content: query },
],
// Perplexity-specific parameters
search_recency_filter: "week", // hour | day | week | month
} as any);
return response;
}
Step 2: Parse Response with Citations
interface SearchResult {
answer: string;
citations: string[];
searchResults: Array<{ title: string; url: string; snippet: string }>;
tokensUsed: number;
}
function parseResponse(response: any): SearchResult {
return {
answer: response.choices[0].message.content,
citations: response.citations || [],
searchResults: response.search_results || [],
tokensUsed: response.usage?.total_tokens || 0,
};
}
Step 3: Format Citations for Display
function formatAnswer(result: SearchResult): string {
let formatted = result.answer;
// Replace [1], [2] markers with markdown links
result.citations.forEach((url, i) => {
formatted = formatted.replaceAll(`[${i + 1}]`, `[${i + 1}](${url})`);
});
// Append source list
if (result.citations.length > 0) {
formatted += "\n\n**Sources:**\n";
result.citations.forEach((url, i) => {
formatted += `${i + 1}. ${url}\n`;
});
}
return formatted;
}
Step 4: Complete Workflow
async function main() {
const query = "What are the latest advances in battery technology?";
const response = await searchWithCitations(query);
const result = parseResponse(response);
const formatted = formatAnswer(result);
console.log(formatted);
console.log(`\n[${result.tokensUsed} tokens | ${result.citations.length} sources]`);
}
main().catch(console.error);
Step 5: Domain-Filtered Search
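A sketch of what this step would send — search_domain_filter is the Perplexity-specific parameter used elsewhere in this pack; the helper name and example domains are illustrative:

```typescript
// Hypothetical helper building a domain-restricted search request.
// search_domain_filter is not in the OpenAI SDK types, hence the
// `as any` cast used throughout this pack when calling create().
function domainFilteredParams(query: string, domains: string[]) {
  return {
    model: "sonar",
    messages: [{ role: "user" as const, content: query }],
    search_domain_filter: domains, // restrict search to these domains
    max_tokens: 1024,
  };
}

// const response = await perplexity.chat.completions.create(
//   domainFilteredParams("Latest CPython release notes", ["python.org", "docs.python.org"]) as any
// );
```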
Execute Perplexity multi-turn research sessions and batch query pipelines.
Perplexity Core Workflow B: Multi-Query Research
Overview
Multi-turn research workflow using Perplexity Sonar API. Decomposes a broad topic into focused sub-queries, runs them with context continuity, deduplicates citations, and synthesizes a structured research document. Use sonar for fast passes and sonar-pro for deep dives.
Prerequisites
- Completed perplexity-install-auth setup
- Familiarity with perplexity-core-workflow-a
- PERPLEXITY_API_KEY set
Instructions
Step 1: Conversational Research Session
import OpenAI from "openai";
const perplexity = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY,
baseURL: "https://api.perplexity.ai",
});
type Message = OpenAI.ChatCompletionMessageParam;
class ResearchSession {
private messages: Message[] = [];
private allCitations: Set<string> = new Set();
constructor(systemPrompt: string = "You are a research assistant. Provide thorough, cited answers.") {
this.messages.push({ role: "system", content: systemPrompt });
}
async ask(question: string, model: "sonar" | "sonar-pro" = "sonar"): Promise<{
answer: string;
citations: string[];
}> {
this.messages.push({ role: "user", content: question });
const response = await perplexity.chat.completions.create({
model,
messages: this.messages,
} as any);
const answer = response.choices[0].message.content || "";
const citations = (response as any).citations || [];
// Maintain conversation context
this.messages.push({ role: "assistant", content: answer });
// Accumulate all citations across the session
citations.forEach((url: string) => this.allCitations.add(url));
return { answer, citations };
}
getAllCitations(): string[] {
return [...this.allCitations];
}
// Keep context manageable (Perplexity searches per turn)
trimHistory(keepLast: number = 6) {
const system = this.messages[0];
const recent = this.messages.slice(-(keepLast * 2));
this.messages = [system, ...recent];
}
}
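The trimming rule is easy to get wrong by one exchange; it can be checked in isolation (a standalone re-statement of the class method above):

```typescript
// Standalone check of the trimming rule used by ResearchSession.trimHistory:
// keep the system prompt plus the last keepLast exchanges (2 messages per turn).
type Msg = { role: string; content: string };

function trimHistory(messages: Msg[], keepLast = 6): Msg[] {
  const system = messages[0];
  const recent = messages.slice(-(keepLast * 2));
  return [system, ...recent];
}

// Build 1 system message + 10 user/assistant turns (21 messages total)
const history: Msg[] = [{ role: "system", content: "sys" }];
for (let i = 0; i < 10; i++) {
  history.push({ role: "user", content: `q${i}` });
  history.push({ role: "assistant", content: `a${i}` });
}

const trimmed = trimHistory(history);
console.log(trimmed.length); // 13: system + 6 most recent exchanges
```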
Step 2: Batch Query Pipeline
interface ResearchPlan {
topic: string;
questions: string[];
}
interface ResearchReport {
topic: string;
sections: Array<{ question: string; answer: string; citations: string[] }>;
allCitations: string[];
totalTokens: number;
}
async function conductResearch(plan: ResearchPlan): Promise<ResearchReport> {
const sections: ResearchReport["sections"] = [];
const allCitations = new Set<string>();
let totalTokens = 0;
for (const question of plan.questions) {
const response = await perplexity.chat.completions.create({
model: "sonar-pro",
messages: [{ role: "user", content: question }],
} as any);
const answer = response.choices[0].message.content || "";
const citations = (response as any).citations || [];
citations.forEach((url: string) => allCitations.add(url));
totalTokens += response.usage?.total_tokens || 0;
sections.push({ question, answer, citations });
}
return { topic: plan.topic, sections, allCitations: [...allCitations], totalTokens };
}
Optimize Perplexity costs through model routing, caching, token limits, and budget monitoring.
Perplexity Cost Tuning
Overview
Reduce Perplexity Sonar API costs. Perplexity charges per-token (input + output) plus a per-request fee that varies by search context size. The biggest cost lever is model selection: sonar-pro costs 3-15x more than sonar per request.
Pricing Reference
| Model | Input $/M tokens | Output $/M tokens | Request Fee |
|---|---|---|---|
| sonar | $1 | $1 | $5 per 1K requests |
| sonar-pro | $3 | $15 | $5 per 1K requests |
| sonar-reasoning-pro | $3 | $15 | $5 per 1K requests |
| sonar-deep-research | $2 | $8 | $5 per 1K searches |
Search context size (Low/Medium/High) affects the request fee. More context = higher fee.
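The context size can reportedly be tuned per request. A hedged sketch, assuming the web_search_options.search_context_size field from recent Perplexity docs — verify the exact field name against the current API reference before relying on it:

```typescript
// Hedged sketch: web_search_options.search_context_size is an assumption here —
// confirm against current Perplexity API docs before shipping.
function costTunedParams(
  query: string,
  contextSize: "low" | "medium" | "high" = "low",
) {
  return {
    model: "sonar",
    messages: [{ role: "user" as const, content: query }],
    max_tokens: 256, // cap output for factual queries
    web_search_options: { search_context_size: contextSize }, // lower context = lower request fee
  };
}

// Passed with a cast, since the field is not in the OpenAI SDK types:
// await perplexity.chat.completions.create(costTunedParams("Tokyo population") as any);
```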
Prerequisites
- Perplexity API account with usage dashboard
- Understanding of query patterns in your application
- Cache infrastructure for search results
Instructions
Step 1: Route Queries to the Right Model
// 60-70% of queries can use sonar, saving 3-15x per query
function selectModel(query: string): "sonar" | "sonar-pro" {
const simplePatterns = [
/^what is/i, /^define/i, /^who is/i, /^when did/i,
/current price/i, /^how many/i, /^is it true/i,
];
if (simplePatterns.some((p) => p.test(query))) return "sonar";
const complexPatterns = [
/compare.*vs/i, /analysis of/i, /comprehensive/i,
/pros and cons/i, /in-depth/i, /research/i,
];
if (complexPatterns.some((p) => p.test(query))) return "sonar-pro";
return "sonar"; // Default to cheapest
}
Step 2: Limit Output Tokens
set -euo pipefail
# Factual queries need ~100 tokens, not 4096
# Setting max_tokens dramatically reduces output costs
# Simple fact: 100 tokens = $0.0001 output
curl -X POST https://api.perplexity.ai/chat/completions \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "sonar",
"messages": [{"role": "user", "content": "Current population of Tokyo"}],
"max_tokens": 100
}'
# Research query: keep at 2048 only when needed
curl -X POST https://api.perplexity.ai/chat/completions \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "sonar-pro",
"messages": [{"role": "user", "content": "Compare React vs Vue in 2025 for enterprise apps"}],
"max_tokens": 2048
}'
Implement Perplexity query sanitization, citation validation, result caching, and conversation context management for search workflows.
Perplexity Data Handling
Overview
Manage data flowing through Perplexity Sonar API. Critical concern: queries are sent to Perplexity for web search, so any PII in queries is exposed to external infrastructure. Responses contain citations (third-party URLs) that must be validated before displaying to users.
Data Flow
User Input → Query Sanitization → Perplexity API → Response Parsing
                                                          │
                                  ┌───────────────────────┼──────────────────┐
                                  │                       │                  │
                             Answer Text             Citations        Search Results
                                  │                       │                  │
                              Format &              Validate &          Store for
                              Display               Deduplicate         Analytics
Prerequisites
- Perplexity API key configured
- Understanding of PII regulations (GDPR/CCPA)
- Cache storage (Redis or in-memory)
Instructions
Step 1: Query Sanitization
function sanitizeQuery(query: string): { clean: string; redacted: boolean } {
let clean = query;
let redacted = false;
const patterns: Array<[RegExp, string]> = [
[/\b[\w.+-]+@[\w-]+\.[\w.]+\b/g, "[email]"],
[/\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/g, "[phone]"],
[/\b\d{3}-\d{2}-\d{4}\b/g, "[ssn]"],
[/\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/g, "[card]"],
[/\b(pplx-|sk-|pk_|sk_live_)\w{20,}\b/g, "[token]"],
[/\b(user|customer|account)\s*#?\s*\d+\b/gi, "[id]"],
];
for (const [pattern, replacement] of patterns) {
if (pattern.test(clean)) {
clean = clean.replace(pattern, replacement);
redacted = true;
}
}
return { clean, redacted };
}
async function safeSearch(rawQuery: string) {
const { clean, redacted } = sanitizeQuery(rawQuery);
if (redacted) {
console.warn("[Data] PII redacted from Perplexity query");
}
return perplexity.chat.completions.create({
model: "sonar",
messages: [{ role: "user", content: clean }],
});
}
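The email rule, for example, can be checked in isolation:

```typescript
// Standalone check of the email redaction pattern from sanitizeQuery above.
const emailPattern = /\b[\w.+-]+@[\w-]+\.[\w.]+\b/g;
const query = "Find pricing plans mentioned by jane.doe@example.com last week";
const clean = query.replace(emailPattern, "[email]");
console.log(clean); // "Find pricing plans mentioned by [email] last week"
```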
Step 2: Citation Validation
interface ValidatedCitation {
url: string;
domain: string;
valid: boolean;
index: number;
}
function validateCitations(citations: string[]): ValidatedCitation[] {
return citations.map((url, i) => {
try {
const parsed = new URL(url);
return {
url: url.replace(/[.,;:]+$/, ""),
domain: parsed.hostname,
valid: ["http:", "https:"].includes(parsed.protocol),
index: i + 1,
};
} catch {
return { url, domain: "unknown", valid: false, index: i + 1 };
}
});
}
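The same rules can be exercised standalone with the WHATWG URL parser:

```typescript
// Standalone demo of the validation rules above: strip trailing punctuation,
// extract the domain, and accept only http(s) schemes.
const raw = ["https://example.com/report,", "ftp://files.example.com", "not a url"];
const validated = raw.map((url, i) => {
  try {
    const parsed = new URL(url);
    return {
      url: url.replace(/[.,;:]+$/, ""),
      domain: parsed.hostname,
      valid: ["http:", "https:"].includes(parsed.protocol),
      index: i + 1,
    };
  } catch {
    return { url, domain: "unknown", valid: false, index: i + 1 };
  }
});
console.log(validated.filter((c) => c.valid).length); // 1 — only the https URL survives
```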
Collect Perplexity debug evidence for support tickets and troubleshooting.
Perplexity Debug Bundle
Current State
!node --version 2>/dev/null || echo 'N/A'
!python3 --version 2>/dev/null || echo 'N/A'
!echo "PERPLEXITY_API_KEY: ${PERPLEXITY_API_KEY:+SET (${#PERPLEXITY_API_KEY} chars)}${PERPLEXITY_API_KEY:-NOT SET}"
Overview
Collect all diagnostic information needed to troubleshoot Perplexity Sonar API issues. Generates a redacted bundle safe for sharing with support or teammates.
Prerequisites
- PERPLEXITY_API_KEY environment variable set
- curl and tar available
- Permission to collect environment info
Instructions
Step 1: Create Debug Bundle Script
#!/bin/bash
set -euo pipefail
# perplexity-debug-bundle.sh
BUNDLE_DIR="perplexity-debug-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BUNDLE_DIR"
echo "=== Perplexity Debug Bundle ===" > "$BUNDLE_DIR/summary.txt"
echo "Generated: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$BUNDLE_DIR/summary.txt"
echo "" >> "$BUNDLE_DIR/summary.txt"
Step 2: Collect Environment Info
set -euo pipefail
cat >> "$BUNDLE_DIR/summary.txt" << 'EOF'
--- Environment ---
EOF
echo "Node: $(node --version 2>/dev/null || echo 'not installed')" >> "$BUNDLE_DIR/summary.txt"
echo "Python: $(python3 --version 2>/dev/null || echo 'not installed')" >> "$BUNDLE_DIR/summary.txt"
echo "OS: $(uname -sr)" >> "$BUNDLE_DIR/summary.txt"
echo "OpenAI SDK (npm): $(npm list openai 2>/dev/null | grep openai || echo 'not found')" >> "$BUNDLE_DIR/summary.txt"
echo "OpenAI SDK (pip): $(pip show openai 2>/dev/null | grep Version || echo 'not found')" >> "$BUNDLE_DIR/summary.txt"
echo "API Key: ${PERPLEXITY_API_KEY:+SET (prefix: ${PERPLEXITY_API_KEY:0:5}...)}${PERPLEXITY_API_KEY:-NOT SET}" >> "$BUNDLE_DIR/summary.txt"
Step 3: Test API Connectivity
set -euo pipefail
echo "" >> "$BUNDLE_DIR/summary.txt"
echo "--- API Connectivity ---" >> "$BUNDLE_DIR/summary.txt"
# DNS resolution
echo -n "DNS: " >> "$BUNDLE_DIR/summary.txt"
dig +short api.perplexity.ai >> "$BUNDLE_DIR/summary.txt" 2>&1
# API response test
echo -n "API Health: " >> "$BUNDLE_DIR/summary.txt"
curl -s -w "HTTP %{http_code} in %{time_total}s" \
-o "$BUNDLE_DIR/api-response.json" \
-H "Authorization: Bearer ${PERPLEXITY_API_KEY}" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions >> "$BUNDLE_DIR/summary.txt"
Deploy Perplexity Sonar API integrations to Vercel, Cloud Run, and Docker.
Perplexity Deploy Integration
Overview
Deploy applications using Perplexity Sonar API to edge and server platforms. Perplexity's OpenAI-compatible endpoint at https://api.perplexity.ai/chat/completions works from any platform that can make HTTPS requests.
Prerequisites
- Perplexity API key stored in PERPLEXITY_API_KEY
- Platform CLI installed (vercel, gcloud, or docker)
- Application tested locally
Instructions
Step 1: Vercel Edge Function
// api/search.ts
import OpenAI from "openai";
export const config = { runtime: "edge" };
const perplexity = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY!,
baseURL: "https://api.perplexity.ai",
});
export default async function handler(req: Request) {
const { query, model = "sonar", stream = false } = await req.json();
if (stream) {
const response = await perplexity.chat.completions.create({
model,
messages: [{ role: "user", content: query }],
stream: true,
max_tokens: 2048,
});
return new Response(response.toReadableStream(), {
headers: { "Content-Type": "text/event-stream" },
});
}
const response = await perplexity.chat.completions.create({
model,
messages: [{ role: "user", content: query }],
max_tokens: 2048,
});
return Response.json({
answer: response.choices[0].message.content,
citations: (response as any).citations || [],
model: response.model,
});
}
set -euo pipefail
# Deploy to Vercel
vercel env add PERPLEXITY_API_KEY production
vercel deploy --prod
Step 2: Cloud Run with Redis Cache
// server.ts
import express from "express";
import OpenAI from "openai";
import { createClient } from "redis";
import { createHash } from "crypto";
const app = express();
app.use(express.json());
const perplexity = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY!,
baseURL: "https://api.perplexity.ai",
});
const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();
app.post("/api/search", async (req, res) => {
const { query, model = "sonar" } = req.body;
const cacheKey = `pplx:${createHash("sha256").update(`${model}:${query}`).digest("hex")}`;
// Check cache first
const cached = await redis.get(cacheKey);
if (cached) {
return res.json({ ...JSON.parse(cached), cached: true });
}
const response = await perplexity.chat.completions.create({
model,
messages: [{ role: "user", content: query }],
max_tokens: 2048,
});
const result = {
answer: response.choices[0].message.content,
citations: (response as any).citations || [],
model: response.model,
};
// Cache for 4 hours (TTL assumption — tune to your content's freshness needs)
await redis.set(cacheKey, JSON.stringify(result), { EX: 4 * 3600 });
res.json(result);
});
app.listen(process.env.PORT || 8080);
Configure Perplexity API key scoping, per-team model access, cost controls, and search domain restrictions for enterprise deployments.
Perplexity Enterprise RBAC
Overview
Control access to Perplexity Sonar API at the organizational level. Perplexity does not have built-in RBAC -- you implement access control through: separate API keys per team/environment, a gateway that enforces model and budget policies, and domain restrictions for compliance.
Access Control Strategy
| Layer | Mechanism | Perplexity Support |
|---|---|---|
| Authentication | API key per team | Yes (multiple keys) |
| Model restriction | Gateway enforcement | Build yourself |
| Budget cap | Per-key monthly limit | Via dashboard |
| Domain restriction | search_domain_filter | Yes (per-request) |
| Rate limiting | Gateway + key limits | Yes (per-key RPM) |
Prerequisites
- Perplexity API account with admin access
- Separate API keys per team/environment
- Gateway or middleware for policy enforcement
Instructions
Step 1: Create Per-Team API Keys
Generate separate keys at perplexity.ai/settings/api:
Key: pplx-support-bot-prod → Budget: $200/mo, sonar only
Key: pplx-research-team → Budget: $1000/mo, sonar + sonar-pro
Key: pplx-data-team → Budget: $500/mo, sonar only
Key: pplx-executive-reports → Budget: $300/mo, sonar-pro
Step 2: Gateway with Policy Enforcement
// perplexity-gateway.ts
import OpenAI from "openai";
interface TeamPolicy {
apiKey: string;
allowedModels: string[];
maxTokensPerRequest: number;
maxRequestsPerMinute: number;
requiredDomainFilter?: string[]; // Force search to specific domains
blockedDomainFilter?: string[]; // Block specific domains
}
const TEAM_POLICIES: Record<string, TeamPolicy> = {
support: {
apiKey: process.env.PPLX_KEY_SUPPORT!,
allowedModels: ["sonar"],
maxTokensPerRequest: 512,
maxRequestsPerMinute: 30,
},
research: {
apiKey: process.env.PPLX_KEY_RESEARCH!,
allowedModels: ["sonar", "sonar-pro", "sonar-reasoning-pro"],
maxTokensPerRequest: 4096,
maxRequestsPerMinute: 50,
},
compliance: {
apiKey: process.env.PPLX_KEY_COMPLIANCE!,
allowedModels: ["sonar", "sonar-pro"],
maxTokensPerRequest: 2048,
maxRequestsPerMinute: 20,
requiredDomainFilter: ["sec.gov", "edgar.sec.gov", "law.cornell.edu"],
},
marketing: {
apiKey: process.env.PPLX_KEY_MARKETING!,
allowedModels: ["sonar"],
maxTokensPerRequest: 1024,
maxRequestsPerMinute: 20,
blockedDomainFilter: ["-competitor1.com", "-competitor2.com"],
},
};
Create a minimal working Perplexity Sonar search example with citations.
Perplexity Hello World
Overview
Minimal working example demonstrating Perplexity's core value: web-grounded answers with citations. Unlike standard LLMs, Perplexity searches the web for every query and returns cited sources.
Prerequisites
- Completed perplexity-install-auth setup
- openai package installed
- PERPLEXITY_API_KEY environment variable set
Instructions
Step 1: Basic Search with Citations (TypeScript)
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY,
baseURL: "https://api.perplexity.ai",
});
async function main() {
const response = await client.chat.completions.create({
model: "sonar",
messages: [
{
role: "system",
content: "Be precise and cite your sources.",
},
{
role: "user",
content: "What are the latest features in Node.js 22?",
},
],
});
const answer = response.choices[0].message.content;
console.log("Answer:", answer);
// Citations are returned as a top-level array on the response
const citations = (response as any).citations || [];
console.log("\nSources:");
citations.forEach((url: string, i: number) => {
console.log(` [${i + 1}] ${url}`);
});
// Usage breakdown
console.log("\nUsage:", {
prompt_tokens: response.usage?.prompt_tokens,
completion_tokens: response.usage?.completion_tokens,
total_tokens: response.usage?.total_tokens,
});
}
main().catch(console.error);
Step 2: Basic Search with Citations (Python)
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ["PERPLEXITY_API_KEY"],
base_url="https://api.perplexity.ai",
)
response = client.chat.completions.create(
model="sonar",
messages=[
{"role": "system", "content": "Be precise and cite your sources."},
{"role": "user", "content": "What are the latest features in Node.js 22?"},
],
)
answer = response.choices[0].message.content
print("Answer:", answer)
# Citations from the raw response
raw = response.model_dump()
citations = raw.get("citations", [])
print("\nSources:")
for i, url in enumerate(citations, 1):
print(f" [{i}] {url}")
print(f"\nTokens: {response.usage.total_tokens}")
Step 3: Search with Domain Filter
// Restrict search to specific domains
const response = await client.chat.completions.create({
model: "sonar",
messages: [
{ role: "user", content: "What is the latest Python release?" },
],
search_domain_filter: ["python.org", "docs.python.org"],
} as any);
Execute Perplexity incident response procedures with triage, mitigation, and postmortem.
Perplexity Incident Runbook
Overview
Rapid incident response for Perplexity Sonar API issues. Perplexity-specific: the API depends on live web search, so outages can be partial (search degraded but API responding), model-specific (sonar-pro down but sonar working), or citation-related (answers returned but no sources).
Severity Levels
| Level | Definition | Response Time | Example |
|---|---|---|---|
| P1 | Complete API failure | < 15 min | All requests returning 500/503 |
| P2 | Degraded service | < 1 hour | High latency, 429 rate limits, no citations |
| P3 | Minor impact | < 4 hours | Single model unavailable, sporadic errors |
| P4 | No user impact | Next business day | Monitoring gap, stale cache |
Quick Triage (Run Immediately)
set -euo pipefail
echo "=== Perplexity Triage ==="
# 1. Test sonar model
echo -n "sonar: "
curl -s -w "HTTP %{http_code} in %{time_total}s" -o /dev/null \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions
echo ""
# 2. Test sonar-pro model
echo -n "sonar-pro: "
curl -s -w "HTTP %{http_code} in %{time_total}s" -o /dev/null \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sonar-pro","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions
echo ""
# 3. Check API key validity
echo -n "Auth: "
curl -s -o /dev/null -w "%{http_code}" \
-H "Authorization: Bearer invalid-key" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions
echo " (expect 401 = API reachable)"
# 4. DNS check
echo -n "DNS: "
dig +short api.perplexity.ai
Decision Tree
API returning errors?
├─ 401/402: Auth issue
│ └─ Verify API key → Regenerate at perplexity.ai/settings/api
├─ 429: Rate limited
│ └─ Enable request queue → Reduce concurrency → Wait
├─ 500/503: Server error
│ ├─ All models affected?
│ │ ├─ YES → Perplexity outage. Enable fallback/cache.
│ │ └─ NO → Model-specific issue. Route traffic to the working model.
Install and configure Perplexity Sonar API authentication.
Perplexity Install & Auth
Overview
Set up Perplexity Sonar API access using the OpenAI-compatible chat completions endpoint at https://api.perplexity.ai. Perplexity does not have a custom SDK -- you use the standard OpenAI client library pointed at Perplexity's base URL.
Prerequisites
- Node.js 18+ or Python 3.10+
- Perplexity account at perplexity.ai
- API key from perplexity.ai/settings/api
Instructions
Step 1: Install OpenAI Client Library
set -euo pipefail
# Node.js / TypeScript
npm install openai
# Python
pip install openai
There is no @perplexity/sdk package. Perplexity uses the OpenAI wire format, so you use the official openai package with a custom baseURL.
Step 2: Configure API Key
# Set environment variable
export PERPLEXITY_API_KEY="pplx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
# Or create .env file (add .env to .gitignore)
echo 'PERPLEXITY_API_KEY=pplx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' >> .env
API keys start with pplx- and are generated at perplexity.ai/settings/api. You must add credits to your account before making API calls.
Step 3: Verify Connection (TypeScript)
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY,
baseURL: "https://api.perplexity.ai",
});
async function verify() {
const response = await client.chat.completions.create({
model: "sonar",
messages: [{ role: "user", content: "What is 2+2?" }],
max_tokens: 50,
});
console.log("Connected:", response.choices[0].message.content);
console.log("Model:", response.model);
console.log("Tokens used:", response.usage?.total_tokens);
}
verify().catch(console.error);
Step 4: Verify Connection (Python)
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ["PERPLEXITY_API_KEY"],
base_url="https://api.perplexity.ai",
)
response = client.chat.completions.create(
model="sonar",
messages=[{"role": "user", "content": "What is 2+2?"}],
max_tokens=50,
)
print("Connected:", response.choices[0].message.content)
print("Model:", response.model)
print("Tokens:", response.usage.total_tokens)
Available Models
| Model | Use Case | Input $/M tokens | Output $/M tokens |
|---|---|---|---|
Identify and avoid Perplexity anti-patterns and common integration mistakes.
Allowed tools: Read, Grep
Perplexity Known Pitfalls
Overview
Real gotchas when integrating Perplexity Sonar API. Perplexity uses an OpenAI-compatible chat endpoint but performs live web searches -- a fundamentally different paradigm from standard LLM completions. These pitfalls come from treating it like a regular chatbot.
Prerequisites
Pitfalls
1. Using It as a Generic Chatbot
Perplexity searches the web per request. Using it for tasks that don't need web search wastes money.
2. Ignoring Citations
Perplexity returns source citations with each response; discarding them throws away the main value of search-augmented answers.
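A minimal sketch of using the citations instead of ignoring them. It assumes the documented response shape: a top-level `citations` array of source URLs alongside the usual OpenAI-style `choices` (newer API versions may return richer `search_results` objects instead, so verify against the current docs).

```typescript
// Hypothetical helper: pull the answer text and its sources out of a
// parsed Perplexity response. The `citations` field is assumed here.
interface PerplexityResponse {
  choices: { message: { content: string } }[];
  citations?: string[];
}

function extractAnswer(res: PerplexityResponse): { text: string; sources: string[] } {
  return {
    text: res.choices[0]?.message.content ?? "",
    sources: res.citations ?? [],
  };
}

// Example with a canned response (no network call):
const sample: PerplexityResponse = {
  choices: [{ message: { content: "The answer is 4 [1]." } }],
  citations: ["https://example.com/math"],
};
const { text, sources } = extractAnswer(sample);
```

Logging `sources` alongside `text` also gives you the citation-count quality signal the observability skill below tracks.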
3. Using the Wrong SDK Import
There is no @perplexity/sdk package; use the official openai client with baseURL set to https://api.perplexity.ai.
4. Not Setting max_tokens
Without an explicit max_tokens cap, responses can run long and inflate per-token costs.
5. No Recency Filter for Time-Sensitive Queries
Without a recency filter, time-sensitive queries can be answered from stale sources.
Load test Perplexity Sonar API integrations and plan capacity.
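For the recency pitfall, a hedged sketch: Perplexity documents a `search_recency_filter` request field (values such as "day", "week", "month"). It is not part of the OpenAI SDK types, so this builds the raw request body directly; verify the field name and values against the current Perplexity docs.

```typescript
// Build a chat-completions body with a recency constraint attached.
// `search_recency_filter` is a Perplexity-specific field (assumption:
// current docs still use this name).
function buildRecentSearchBody(query: string, recency: "day" | "week" | "month") {
  return {
    model: "sonar",
    messages: [{ role: "user", content: query }],
    max_tokens: 300,
    search_recency_filter: recency,
  };
}

// Would be POSTed with fetch:
// fetch("https://api.perplexity.ai/chat/completions", {
//   method: "POST",
//   headers: {
//     Authorization: `Bearer ${process.env.PERPLEXITY_API_KEY}`,
//     "Content-Type": "application/json",
//   },
//   body: JSON.stringify(buildRecentSearchBody("latest Node.js LTS release", "week")),
// });
```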
Allowed tools: Read, Write, Edit, Bash(k6:*), Bash(kubectl:*)
Perplexity Load & Scale
Overview
Load testing and capacity planning for Perplexity Sonar API. Key constraint: Perplexity rate limits at 50 RPM (default tier), and every request performs a live web search with variable latency. Load testing must respect these limits to avoid burning through credits.
Capacity Constraints
Prerequisites
Instructions
Step 1: k6 Load Test Script
Configure Perplexity local development with mocking, testing, and hot reload.
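The 50 RPM constraint in the Load & Scale skill above suggests pacing requests client-side so a load test never trips the server-side limit. A minimal sketch of a leaky-bucket limiter (class name and numbers are illustrative):

```typescript
// Client-side leaky-bucket rate limiter: the bucket refills continuously
// at `rpm` tokens per minute, capped at `rpm` burst capacity -- mirroring
// the leaky-bucket behavior described for Perplexity's own limits.
class RpmLimiter {
  private allowance: number;
  private last: number;

  constructor(private rpm: number, now: number = Date.now()) {
    this.allowance = rpm;
    this.last = now;
  }

  // Returns true if a request may be sent right now.
  tryAcquire(now: number = Date.now()): boolean {
    const elapsedMin = (now - this.last) / 60_000;
    this.last = now;
    // Refill continuously, capped at the bucket size.
    this.allowance = Math.min(this.rpm, this.allowance + elapsedMin * this.rpm);
    if (this.allowance < 1) return false;
    this.allowance -= 1;
    return true;
  }
}
```

A load-test loop would call `tryAcquire()` before each request and sleep briefly when it returns false.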
Allowed tools: Read, Write, Edit, Bash(npm:*), Bash(pnpm:*), Grep
Perplexity Local Dev Loop
Overview
Set up a fast, cost-effective local development workflow for Perplexity Sonar API. Key challenge: every real API call performs a web search and costs money, so mocking and caching are essential for development.
Prerequisites
Instructions
Step 1: Project Structure
Step 2: Type-Safe Client Wrapper
Step 3: Save Fixtures for Offline Development
Migrate to Perplexity Sonar from other search/LLM APIs using the strangler fig pattern.
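The fixture step in the Local Dev Loop skill above can be sketched as a record-and-replay wrapper: the first call hits the live API (here abstracted as any `fetcher` function) and saves the response to disk; later calls replay the saved fixture. Paths and names are illustrative.

```typescript
// Hypothetical fixture cache: record once, replay offline afterwards.
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

function cachedSearch(
  query: string,
  fetcher: (q: string) => string, // the live API call in real code
  dir = ".fixtures",
): string {
  mkdirSync(dir, { recursive: true });
  const key = createHash("sha256").update(query).digest("hex").slice(0, 16);
  const file = join(dir, `${key}.json`);
  if (existsSync(file)) return readFileSync(file, "utf8"); // replay fixture
  const result = fetcher(query); // record once
  writeFileSync(file, result);
  return result;
}
```

Committing `.fixtures/` (or keeping it per-developer) makes tests deterministic and free.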
Allowed tools: Read, Write, Edit, Bash(npm:*), Bash(node:*), Bash(kubectl:*)
Perplexity Migration Deep Dive
Overview
Migrate from traditional search APIs (Google Custom Search, Bing, SerpAPI) or legacy LLMs to Perplexity Sonar. Key advantage: Perplexity combines search + LLM summarization in a single API call, replacing a multi-step pipeline.
Migration Comparison
Instructions
Step 1: Assess Current Integration
Step 2: Build Adapter Layer
Configure Perplexity Sonar API across development, staging, and production environments.
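The adapter step of the strangler-fig migration above can be sketched as a common interface that both the legacy provider and a Perplexity-backed provider implement, so callers never notice the switch. The interface and field names are illustrative, not from an existing codebase.

```typescript
// Hypothetical adapter layer for the strangler-fig migration.
interface SearchResult { answer: string; urls: string[] }

interface SearchProvider {
  search(query: string): Promise<SearchResult>;
}

// The legacy pipeline (e.g. SerpAPI + summarizer) would also implement
// SearchProvider; traffic shifts provider-by-provider.
class PerplexityProvider implements SearchProvider {
  // `send` abstracts the HTTP call so the adapter is testable offline.
  constructor(private send: (body: object) => Promise<any>) {}

  buildBody(query: string): object {
    return {
      model: "sonar",
      messages: [{ role: "user", content: query }],
      max_tokens: 500,
    };
  }

  async search(query: string): Promise<SearchResult> {
    const res = await this.send(this.buildBody(query));
    return {
      answer: res.choices?.[0]?.message?.content ?? "",
      urls: res.citations ?? [],
    };
  }
}
```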
Allowed tools: Read, Write, Edit, Bash(aws:*), Bash(gcloud:*), Bash(vault:*)
Perplexity Multi-Environment Setup
Overview
Configure Perplexity Sonar API across dev/staging/prod. Key decisions per environment: which models are allowed (sonar vs sonar-pro), rate limits, and cost caps. All environments use the same base URL (https://api.perplexity.ai).
Environment Strategy
Prerequisites
Instructions
Step 1: Configuration Structure
Step 2: Base Configuration
Step 3: Environment Configs
Set up monitoring for Perplexity Sonar API with latency, cost, citation quality, and error tracking.
Allowed tools: Read, Write, Edit
Perplexity Observability
Overview
Monitor Perplexity Sonar API performance, cost, and quality. Key signals unique to Perplexity: citation count per response (quality indicator), search latency variability (web search is non-deterministic), and per-model cost differences.
Key Metrics
Prerequisites
Instructions
Step 1: Instrument the Perplexity Client
Step 2: Prometheus Metrics Export
Optimize Perplexity Sonar API performance with caching, streaming, model routing, and batching.
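The Prometheus export step in the Observability skill above can be sketched without any library: a tiny in-process counter registry that renders the Prometheus text exposition format (`name{labels} value`, one line per series). In production you would likely use prom-client instead; metric names here are illustrative.

```typescript
// Minimal counter registry with Prometheus text-format exposition.
class Metrics {
  private counters = new Map<string, number>();

  inc(name: string, labels: Record<string, string> = {}, by = 1): void {
    const pairs = Object.entries(labels)
      .map(([k, v]) => `${k}="${v}"`)
      .join(",");
    const key = pairs ? `${name}{${pairs}}` : name;
    this.counters.set(key, (this.counters.get(key) ?? 0) + by);
  }

  // One `name{labels} value` line per series, as /metrics would serve it.
  expose(): string {
    return [...this.counters.entries()]
      .map(([k, v]) => `${k} ${v}`)
      .join("\n");
  }
}

const metrics = new Metrics();
metrics.inc("perplexity_requests_total", { model: "sonar", status: "ok" });
metrics.inc("perplexity_requests_total", { model: "sonar", status: "ok" });
metrics.inc("perplexity_citations_total", { model: "sonar" }, 3);
```

Serving `metrics.expose()` from a `/metrics` HTTP endpoint is enough for a Prometheus scrape target.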
Allowed tools: Read, Write, Edit
Perplexity Performance Tuning
Overview
Optimize Perplexity Sonar API for latency, throughput, and cost. Key insight: every Perplexity call performs a live web search, so response times are inherently variable. Typical latencies: sonar 1-3s, sonar-pro 3-8s, sonar-deep-research 10-60s.
Latency Benchmarks
Prerequisites
Instructions
Step 1: Smart Model Routing
Step 2: Query Hash Caching
Implement content moderation, model selection policy, citation quality enforcement, and per-user usage quotas for Perplexity Sonar API.
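The query-hash caching step in the Performance Tuning skill above reduces to one decision: what makes two queries "the same"? A sketch that normalizes whitespace and case before hashing, so trivially different phrasings share one cache entry (the normalization rules are illustrative; remember web results go stale, so pair keys with a TTL):

```typescript
// Deterministic cache key for a (model, query) pair.
import { createHash } from "node:crypto";

function cacheKey(model: string, query: string): string {
  const normalized = query.trim().toLowerCase().replace(/\s+/g, " ");
  return createHash("sha256").update(`${model}:${normalized}`).digest("hex");
}
```

Including the model in the key matters because sonar and sonar-pro can return different-quality answers for the same query.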
Allowed tools: Read, Write, Edit, Bash(npx:*)
Perplexity Policy Guardrails
Overview
Policy enforcement for Perplexity Sonar API. Since Perplexity performs live web searches, guardrails must address: query content moderation (what users can search for), citation reliability (filtering low-quality sources), cost control (model selection + token limits), and responsible AI usage.
Policy Pipeline
Prerequisites
Instructions
Step 1: Query Content Moderation
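A minimal sketch of the moderation step: check each query before anything is sent to the open web. The patterns below (an email regex and a blocklist) are illustrative minimums; a real deployment needs a fuller PII and policy list.

```typescript
// Pre-flight query moderation. Patterns are illustrative, not exhaustive.
const EMAIL = /[\w.+-]+@[\w-]+\.[\w.]+/;
const BLOCKED = [/\bcredit card number\b/i];

type Verdict = { allowed: true } | { allowed: false; reason: string };

function moderateQuery(query: string): Verdict {
  if (EMAIL.test(query)) {
    return { allowed: false, reason: "query contains an email address" };
  }
  for (const pattern of BLOCKED) {
    if (pattern.test(query)) return { allowed: false, reason: "blocked topic" };
  }
  return { allowed: true };
}
```

Rejected queries should be logged (without the PII itself) so the blocklist can be tuned.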
Step 2: Model Selection Policy
Execute Perplexity production deployment checklist for Sonar API integrations.
Allowed tools: Read, Bash(kubectl:*), Bash(curl:*), Grep
Perplexity Production Checklist
Overview
Complete checklist for deploying Perplexity Sonar API integrations to production. Perplexity-specific concerns: every API call performs a live web search (variable latency), citations link to third-party sites (must validate), and costs scale per-request plus per-token.
Prerequisites
Production Readiness Checklist
API Configuration
Code Quality
Performance
Monitoring
Cost Controls
Graceful Degradation
Implement Perplexity rate limiting, backoff, and request queuing.
Allowed tools: Read, Write, Edit
Perplexity Rate Limits
Overview
Handle Perplexity Sonar API rate limits. Perplexity uses a leaky bucket algorithm: burst capacity is available, with tokens refilling continuously at your assigned rate. Rate limits are based on requests per minute (RPM).
Rate Limit Tiers
Rate limits apply per API key, not per model. Using multiple models does not raise your limit.
Prerequisites
Instructions
Step 1: Exponential Backoff with Jitter
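The backoff step above can be sketched with the "full jitter" variant: the delay is a uniform random value between zero and min(cap, base × 2^attempt), which avoids retry stampedes after a shared 429. The constants are illustrative.

```typescript
// Full-jitter exponential backoff for 429 responses.
function backoffDelayMs(
  attempt: number,                 // 0-based retry count
  baseMs = 500,
  capMs = 30_000,
  rand: () => number = Math.random, // injectable for testing
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(rand() * ceiling); // "full jitter": uniform in [0, ceiling)
}
```

A retry loop would `await sleep(backoffDelayMs(attempt))` after each 429 until a max-attempts budget is exhausted.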
Step 2: Queue-Based Rate Limiting
Implement Perplexity reference architecture with model routing, citation pipeline, and research automation.
Allowed tools: Read, Grep
Perplexity Reference Architecture
Overview
Production architecture for AI-powered search with Perplexity Sonar API. Three tiers: search service (model routing + caching), citation pipeline (extract, validate, store), and research orchestrator (multi-query synthesis).
Architecture
Prerequisites
Instructions
Step 1: Search Service with Model Routing
Implement reliability patterns for Perplexity Sonar API: circuit breaker, model fallback, streaming timeout, and citation validation.
Allowed tools: Read, Write, Edit
Perplexity Reliability Patterns
Overview
Production reliability patterns for Perplexity Sonar API. Perplexity performs live web searches per request, making response times inherently variable. The key reliability challenges: search can stall, citations can break, and model tiers have different availability.
Prerequisites
Instructions
Step 1: Model Tier Fallback
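A sketch of the tier-fallback step: try the preferred model first and walk down a tier list when a call fails. The tier order and error handling are illustrative; the actual API call is abstracted as `call` so the pattern is testable offline.

```typescript
// Try each model tier in order; return the first success.
async function searchWithFallback(
  query: string,
  call: (model: string, q: string) => Promise<string>, // the real API call
  tiers: string[] = ["sonar-pro", "sonar"],
): Promise<{ model: string; answer: string }> {
  let lastErr: unknown;
  for (const model of tiers) {
    try {
      return { model, answer: await call(model, query) };
    } catch (err) {
      lastErr = err; // fall through to the next, cheaper tier
    }
  }
  throw lastErr; // every tier failed
}
```

Pairing this with the circuit breaker in the next step avoids hammering a tier that is known to be down.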
Step 2: Circuit Breaker
Apply production-ready Perplexity Sonar API patterns for TypeScript and Python.
Allowed tools: Read, Write, Edit
Perplexity SDK Patterns
Overview
Production-ready patterns for Perplexity Sonar API. Since Perplexity uses the OpenAI wire format, you build wrappers around the standard openai client.
Prerequisites
Instructions
Step 1: Typed Client Singleton (TypeScript)
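The singleton step above boils down to building the configured client once per process rather than once per request. A generic sketch (the commented usage assumes the `openai` package from the install step):

```typescript
// Lazy singleton factory: `create` runs at most once.
function lazySingleton<T>(create: () => T): () => T {
  let instance: T | undefined;
  return () => (instance ??= create());
}

// Usage sketch with the openai client from the install step:
// const getClient = lazySingleton(() => new OpenAI({
//   apiKey: process.env.PERPLEXITY_API_KEY,
//   baseURL: "https://api.perplexity.ai",
// }));

// Demonstration with a plain object instead of a network client:
let built = 0;
const getConfig = lazySingleton(() => ({ id: ++built }));
```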
Step 2: Search with Full Response Parsing
Apply Perplexity security best practices for API key management and query safety.
Allowed tools: Read, Write, Grep
Perplexity Security Basics
Overview
Security best practices for Perplexity Sonar API. Key concerns: API key protection (keys start with pplx-) and query safety.
Prerequisites
Instructions
Step 1: API Key Management
Step 2: Query Sanitization (Critical)
Perplexity sends your query to the open web for search. Any PII in the query is exposed to external search infrastructure.
Step 3: Restrict Search Domains
Use a domain allowlist to limit which sites search results may come from.
Migrate between Perplexity model generations and API parameter changes.
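For the domain-restriction step, a hedged sketch: Perplexity documents a `search_domain_filter` request field (an array of domains; entries prefixed with "-" exclude a domain). It sits outside the OpenAI SDK types, so the body is built by hand here; verify the field's current semantics and limits against the Perplexity docs.

```typescript
// Build a request body restricted to (or excluding) specific domains.
// `search_domain_filter` is a Perplexity-specific field (assumption:
// current docs still use this name and the "-" exclusion prefix).
function buildDomainRestrictedBody(query: string, domains: string[]) {
  return {
    model: "sonar",
    messages: [{ role: "user", content: query }],
    max_tokens: 300,
    search_domain_filter: domains, // e.g. ["docs.python.org", "-pinterest.com"]
  };
}
```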
Allowed tools: Read, Write, Edit, Bash(npm:*), Bash(git:*)
Perplexity Upgrade & Migration
Overview
Guide for migrating between Perplexity model generations and API changes. Perplexity has evolved from pplx-api with third-party models to the Sonar family with built-in web search.
Model Evolution
Instructions
Step 1: Identify Legacy Patterns to Update
Step 2: Model Name Migration Map
Build event-driven architectures around Perplexity Sonar API with streaming, batch pipelines, and scheduled search monitoring.
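The migration-map step above can be sketched as a lookup from retired names to the current Sonar family. The specific mappings below are a best-effort reading of Perplexity's past deprecation notices, not an authoritative list; verify each against the current model documentation before relying on them.

```typescript
// Assumed mapping from retired model names to current Sonar models --
// treat every entry as a hypothesis to confirm against current docs.
const MODEL_MIGRATION: Record<string, string> = {
  "pplx-7b-online": "sonar",
  "pplx-70b-online": "sonar-pro",
  "llama-3.1-sonar-small-128k-online": "sonar",
  "llama-3.1-sonar-large-128k-online": "sonar-pro",
  "llama-3.1-sonar-huge-128k-online": "sonar-pro",
};

function migrateModel(name: string): string {
  return MODEL_MIGRATION[name] ?? name; // current names pass through unchanged
}
```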
Allowed tools: Read, Write, Edit, Bash(curl:*)
Perplexity Events & Async Patterns
Overview
Build event-driven architectures around Perplexity Sonar API. Perplexity does not have webhooks -- all interactions are request/response. Event patterns are built using streaming SSE, job queues for batch processing, and cron-triggered monitoring.
Event Patterns
Prerequisites
Instructions
Step 1: Streaming Search (Server-Sent Events)
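A sketch of the streaming step: with `stream: true`, the openai client yields OpenAI-style delta chunks, and a consumer accumulates tokens while forwarding each one (for example to an SSE response). The chunk shape below mirrors the OpenAI streaming delta format; the generic `AsyncIterable` lets the consumer run against a fake stream too.

```typescript
// Consume a streamed chat completion chunk-by-chunk.
interface StreamChunk {
  choices: { delta: { content?: string } }[];
}

async function collectStream(
  stream: AsyncIterable<StreamChunk>,
  onToken: (t: string) => void = () => {},
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    const token = chunk.choices[0]?.delta.content ?? "";
    if (token) {
      full += token;
      onToken(token); // e.g. flush to an SSE response here
    }
  }
  return full;
}

// Real usage sketch with the openai client from the install step:
// const stream = await client.chat.completions.create({
//   model: "sonar", messages, stream: true,
// });
// const answer = await collectStream(stream as AsyncIterable<StreamChunk>);
```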
Step 2: Batch Research Pipeline
Tags
perplexity, ai-search, research, citations, real-time, knowledge, answers