Complete Perplexity integration skill pack with 30 skills covering AI search, real-time answers, citations, and research workflows. Flagship+ tier vendor pack.
Installation
Open Claude Code and run this command:
/plugin install perplexity-pack@claude-code-plugins-plus
Use --global to install for all projects, or --project for current project only.
Skills (30)
Apply advanced debugging techniques for hard-to-diagnose Perplexity Sonar API issues.
Perplexity Advanced Troubleshooting
Overview
Deep debugging for Perplexity Sonar API issues that resist standard fixes. Common hard problems: inconsistent citations between identical queries, intermittent timeouts on sonar-pro, search results not matching recency filter, and response quality degradation.
Prerequisites
- Access to production logs and metrics
- curl for direct API testing
- Understanding of Perplexity's search-augmented generation model
Diagnostic Tools
Layer-by-Layer Test
#!/bin/bash
set -euo pipefail
echo "=== Perplexity Layer Diagnostics ==="
# Layer 1: DNS
echo -n "1. DNS: "
dig +short api.perplexity.ai || echo "FAIL"
# Layer 2: TCP connectivity
echo -n "2. TCP: "
timeout 5 bash -c 'echo > /dev/tcp/api.perplexity.ai/443 && echo "OK"' 2>/dev/null || echo "FAIL"
# Layer 3: TLS handshake
echo -n "3. TLS: "
echo | openssl s_client -connect api.perplexity.ai:443 2>/dev/null | grep -c "Verify return code: 0" | sed 's/1/OK/;s/0/FAIL/'
# Layer 4: HTTP with auth
echo -n "4. Auth: "
curl -s -o /dev/null -w "%{http_code}" \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions
echo ""
# Layer 5: Response quality
echo "5. Quality check:"
RESPONSE=$(curl -s \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"What is 2+2?"}],"max_tokens":50}' \
https://api.perplexity.ai/chat/completions)
echo " Model: $(echo "$RESPONSE" | jq -r '.model')"
echo " Answer: $(echo "$RESPONSE" | jq -r '.choices[0].message.content' | head -c 100)"
echo " Citations: $(echo "$RESPONSE" | jq -r '.citations | length')"
echo " Tokens: $(echo "$RESPONSE" | jq -r '.usage.total_tokens')"
Inconsistent Citation Investigation
// Same query can return different citations due to live web search
// Run N times and compare to identify pattern vs randomness
async function citationStabilityTest(query: string, runs: number = 5) {
const results: Array<{ citations: string[]; answer: string }> = [];
for (let i = 0; i < runs; i++) {
const response = await perplexity.chat.completions.create({
model: "sonar",
messages: [{ role: "user", content: query }],
});
results.push({
citations: (response as any).citations || [],
answer: response.choices[0].message.content || "",
});
}
return results;
}
Choose and implement Perplexity architecture blueprints for different scales: direct search widget, cached research layer, and multi-query pipeline.
Perplexity Architecture Variants
Overview
Three validated architectures for Perplexity Sonar API at different scales. Each builds on the previous, adding caching and orchestration as volume grows.
Decision Matrix
| Factor | Direct Widget | Cached Layer | Research Pipeline |
|---|---|---|---|
| Volume | <500/day | 500-5K/day | 5K+/day |
| Latency (p50) | 2-5s | 50ms (cached) / 2-5s (miss) | 10-30s |
| Model | sonar | sonar + cache | sonar + sonar-pro |
| Monthly Cost | <$150 | $50-$300 | $300+ |
| Complexity | Minimal | Moderate | High |
Instructions
Variant 1: Direct Search Widget (<500 queries/day)
Best for: Adding AI search to an existing app. No cache needed at this scale.
// Simple endpoint — add to any Express/Next.js app
import OpenAI from "openai";
const perplexity = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY!,
baseURL: "https://api.perplexity.ai",
});
app.post("/api/search", async (req, res) => {
try {
const response = await perplexity.chat.completions.create({
model: "sonar",
messages: [{ role: "user", content: req.body.query }],
max_tokens: 1024,
});
res.json({
answer: response.choices[0].message.content,
citations: (response as any).citations || [],
});
} catch (err: any) {
if (err.status === 429) {
res.status(429).json({ error: "Rate limited. Try again shortly." });
} else {
res.status(500).json({ error: "Search unavailable" });
}
}
});
Variant 2: Cached Research Layer (500-5K queries/day)
Best for: Repeated queries, knowledge base search, FAQ bots. Cache eliminates duplicate API calls.
import { createHash } from "crypto";
import { LRUCache } from "lru-cache";
const cache = new LRUCache<string, any>({
max: 5000,
ttl: 4 * 3600_000, // 4-hour TTL
});
class CachedSearchService {
constructor(private client: OpenAI) {}
async search(query: string, model = "sonar") {
const key = this.cacheKey(query, model);
const cached = cache.get(key);
if (cached) return { ...cached, cached: true };
const response = await this.client.chat.completions.create({
model,
messages: [{ role: "user", content: query }],
max_tokens: 1024,
});
const result = {
answer: response.choices[0].message.content || "",
citations: (response as any).citations || [],
model: response.model,
};
cache.set(key, result);
return { ...result, cached: false };
}
// Normalize the query so trivially different phrasings share a cache entry
private cacheKey(query: string, model: string): string {
return createHash("sha256").update(`${model}:${query.trim().toLowerCase()}`).digest("hex");
}
}
Configure CI/CD for Perplexity Sonar API integrations with GitHub Actions.
Perplexity CI Integration
Overview
Set up CI/CD pipelines for Perplexity Sonar API integrations. Key CI concerns: live API calls cost money (use mocks for unit tests, reserve live calls for integration tests), API keys must be in GitHub Secrets, and rate limits apply even in CI.
Prerequisites
- GitHub repository with Actions enabled
- Perplexity API key for CI (separate from production)
- Test suite with mocked and live test separation
Instructions
Step 1: Configure GitHub Secret
set -euo pipefail
# Store API key as a GitHub secret
gh secret set PERPLEXITY_API_KEY --body "pplx-your-ci-key-here"
Step 2: GitHub Actions Workflow
# .github/workflows/perplexity-tests.yml
name: Perplexity Integration Tests
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
unit-tests:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: "20"
cache: "npm"
- run: npm ci
- run: npm test -- --coverage
# Unit tests use mocked responses — no API key needed
integration-tests:
runs-on: ubuntu-latest
needs: unit-tests
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
env:
PERPLEXITY_API_KEY: ${{ secrets.PERPLEXITY_API_KEY }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: "20"
cache: "npm"
- run: npm ci
- name: Run live Perplexity integration tests
run: npm run test:integration
timeout-minutes: 5
Step 3: Test Structure
// tests/perplexity.unit.test.ts — runs on every PR, uses mocks
import { describe, it, expect, vi } from "vitest";
import fixture from "./fixtures/sonar-response.json";
vi.mock("openai", () => ({
default: vi.fn().mockImplementation(() => ({
chat: {
completions: {
create: vi.fn().mockResolvedValue(fixture),
},
},
})),
}));
describe("Perplexity Search (mocked)", () => {
it("parses citations from response", async () => {
const { search } = await import("../src/perplexity/search");
const result = await search("test query");
expect(result.citations.length).toBeGreaterThan(0);
});
it("formats answer with citation links", async () => {
const { formatCitationsAsMarkdown } = await import("../src/perplexity/citations");
const formatted = formatCitationsAsMarkdown("See [1] for details", ["https://example.com"]);
expect(formatted).toContain("[1](https://example.com)");
});
});
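The fixture those mocked tests import can be tiny. A hypothetical shape — capture a real sonar response once and replace these values so the mock stays honest (field names here mirror what the parsing code in this pack reads):

```typescript
// tests/fixtures/sonar-response.ts — hypothetical mock shaped like a sonar completion.
const fixture = {
  id: "mock-response-1",
  model: "sonar",
  choices: [
    { message: { role: "assistant", content: "Mock answer with a source [1]." } },
  ],
  citations: ["https://example.com/source"],
  usage: { prompt_tokens: 10, completion_tokens: 12, total_tokens: 22 },
};

console.log(fixture.citations.length); // 1
```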
Diagnose and fix Perplexity Sonar API errors and exceptions.
Perplexity Common Errors
Overview
Quick reference for the most common Perplexity Sonar API errors, their root causes, and fixes. All Perplexity errors follow the OpenAI error format since the API is OpenAI-compatible.
Prerequisites
- PERPLEXITY_API_KEY environment variable set
- curl available for diagnostic commands
Error Reference
401 Unauthorized — Invalid API Key
{"error": {"message": "Invalid API key", "type": "authentication_error", "code": 401}}
Causes: Key missing, expired, revoked, or doesn't start with pplx-.
Fix:
set -euo pipefail
# Verify key is set and has correct prefix
echo "${PERPLEXITY_API_KEY:0:5}" # Should print "pplx-"
# Test key directly
curl -s -o /dev/null -w "%{http_code}" \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions
# 200 = valid, 401 = invalid key
Regenerate at perplexity.ai/settings/api.
429 Too Many Requests — Rate Limited
{"error": {"message": "Rate limit exceeded", "type": "rate_limit_error", "code": 429}}
Causes: Exceeded requests per minute (RPM). Most tiers allow 50 RPM. Perplexity uses a leaky bucket algorithm.
Fix:
async function withBackoff<T>(fn: () => Promise<T>, maxRetries = 5): Promise<T> {
for (let i = 0; i <= maxRetries; i++) {
try {
return await fn();
} catch (err: any) {
if (err.status !== 429 || i === maxRetries) throw err;
const delay = Math.pow(2, i) * 1000 + Math.random() * 500;
console.log(`Rate limited. Retrying in ${delay.toFixed(0)}ms...`);
await new Promise(r => setTimeout(r, delay));
}
}
throw new Error("Unreachable");
}
See perplexity-rate-limits for queue-based solutions.
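The helper composes with any async call. A self-contained demo with a stubbed flaky function and shortened delays, so it can run without an API key:

```typescript
// Demo of the backoff pattern above. The stub fails twice with a 429-shaped
// error before succeeding; delays are shortened from seconds to milliseconds.
async function withBackoff<T>(fn: () => Promise<T>, maxRetries = 5): Promise<T> {
  for (let i = 0; i <= maxRetries; i++) {
    try {
      return await fn();
    } catch (err: any) {
      if (err.status !== 429 || i === maxRetries) throw err;
      const delay = Math.pow(2, i) * 10; // demo-scale delays
      await new Promise((r) => setTimeout(r, delay));
    }
  }
  throw new Error("Unreachable");
}

let attempts = 0;
const flaky = async () => {
  attempts++;
  if (attempts < 3) throw Object.assign(new Error("rate limited"), { status: 429 });
  return "ok";
};

const demo = withBackoff(flaky).then((result) => {
  console.log(`${result} after ${attempts} attempts`);
  return result;
});
```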
400 Bad Request — Invalid Model
{"error": {"message": "Invalid model: gpt-4", "type": "invalid_request_error"}}
Cause: Using a non-Perplexity model name.
Valid models: sonar, sonar-pro, sonar-reasoning-pro, sonar-deep-research.
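A small pre-flight guard (a hypothetical helper, not part of the API) catches this before a request is spent:

```typescript
// Hypothetical guard: fail fast on non-Perplexity model names
// instead of paying for a 400 round trip.
const VALID_MODELS = new Set([
  "sonar",
  "sonar-pro",
  "sonar-reasoning-pro",
  "sonar-deep-research",
]);

function assertValidModel(model: string): void {
  if (!VALID_MODELS.has(model)) {
    throw new Error(
      `Invalid Perplexity model "${model}". Valid: ${[...VALID_MODELS].join(", ")}`
    );
  }
}
```

Call it immediately before `chat.completions.create` so misconfigured model names surface as a local error rather than an API 400.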
Execute Perplexity primary workflow: single-query search with citations.
Perplexity Core Workflow A: Search with Citations
Overview
Primary money-path workflow: send a search query to Perplexity Sonar, receive a web-grounded answer with inline citations, parse and display the results. This is the single-query pattern used for search widgets, fact-checking, and real-time information retrieval.
Prerequisites
- Completed perplexity-install-auth setup
- openai package installed
- PERPLEXITY_API_KEY set
Instructions
Step 1: Initialize Client and Send Query
import OpenAI from "openai";
const perplexity = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY,
baseURL: "https://api.perplexity.ai",
});
async function searchWithCitations(query: string) {
const response = await perplexity.chat.completions.create({
model: "sonar",
messages: [
{
role: "system",
content: "Provide accurate, well-sourced answers. Cite your sources inline.",
},
{ role: "user", content: query },
],
// Perplexity-specific parameters
search_recency_filter: "week", // hour | day | week | month
} as any);
return response;
}
Step 2: Parse Response with Citations
interface SearchResult {
answer: string;
citations: string[];
searchResults: Array<{ title: string; url: string; snippet: string }>;
tokensUsed: number;
}
function parseResponse(response: any): SearchResult {
return {
answer: response.choices[0].message.content,
citations: response.citations || [],
searchResults: response.search_results || [],
tokensUsed: response.usage?.total_tokens || 0,
};
}
Step 3: Format Citations for Display
function formatAnswer(result: SearchResult): string {
let formatted = result.answer;
// Replace [1], [2] markers with markdown links
result.citations.forEach((url, i) => {
formatted = formatted.replaceAll(`[${i + 1}]`, `[${i + 1}](${url})`);
});
// Append source list
if (result.citations.length > 0) {
formatted += "\n\n**Sources:**\n";
result.citations.forEach((url, i) => {
formatted += `${i + 1}. ${url}\n`;
});
}
return formatted;
}
Step 4: Complete Workflow
async function main() {
const query = "What are the latest advances in battery technology?";
const response = await searchWithCitations(query);
const result = parseResponse(response);
const formatted = formatAnswer(result);
console.log(formatted);
console.log(`\n[${result.tokensUsed} tokens | ${result.citations.length} sources]`);
}
main().catch(console.error);
Step 5: Domain-Filtered Search
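A sketch of what this step would send — search_domain_filter is the Perplexity-specific parameter used elsewhere in this pack; the helper name and example domains are illustrative:

```typescript
// Hypothetical helper building a domain-restricted search request.
// search_domain_filter is not in the OpenAI SDK types, hence the
// `as any` cast used throughout this pack when calling create().
function domainFilteredParams(query: string, domains: string[]) {
  return {
    model: "sonar",
    messages: [{ role: "user" as const, content: query }],
    search_domain_filter: domains, // restrict search to these domains
    max_tokens: 1024,
  };
}

// const response = await perplexity.chat.completions.create(
//   domainFilteredParams("Latest CPython release notes", ["python.org", "docs.python.org"]) as any
// );
```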
Execute Perplexity multi-turn research sessions and batch query pipelines.
Perplexity Core Workflow B: Multi-Query Research
Overview
Multi-turn research workflow using Perplexity Sonar API. Decomposes a broad topic into focused sub-queries, runs them with context continuity, deduplicates citations, and synthesizes a structured research document. Use sonar for fast passes and sonar-pro for deep dives.
Prerequisites
- Completed perplexity-install-auth setup
- Familiarity with perplexity-core-workflow-a
- PERPLEXITY_API_KEY set
Instructions
Step 1: Conversational Research Session
import OpenAI from "openai";
const perplexity = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY,
baseURL: "https://api.perplexity.ai",
});
type Message = OpenAI.ChatCompletionMessageParam;
class ResearchSession {
private messages: Message[] = [];
private allCitations: Set<string> = new Set();
constructor(systemPrompt: string = "You are a research assistant. Provide thorough, cited answers.") {
this.messages.push({ role: "system", content: systemPrompt });
}
async ask(question: string, model: "sonar" | "sonar-pro" = "sonar"): Promise<{
answer: string;
citations: string[];
}> {
this.messages.push({ role: "user", content: question });
const response = await perplexity.chat.completions.create({
model,
messages: this.messages,
} as any);
const answer = response.choices[0].message.content || "";
const citations = (response as any).citations || [];
// Maintain conversation context
this.messages.push({ role: "assistant", content: answer });
// Accumulate all citations across the session
citations.forEach((url: string) => this.allCitations.add(url));
return { answer, citations };
}
getAllCitations(): string[] {
return [...this.allCitations];
}
// Keep context manageable (Perplexity searches per turn)
trimHistory(keepLast: number = 6) {
const system = this.messages[0];
const recent = this.messages.slice(-(keepLast * 2));
this.messages = [system, ...recent];
}
}
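The trimming rule is easy to get wrong by one exchange; it can be checked in isolation (a standalone re-statement of the class method above):

```typescript
// Standalone check of the trimming rule used by ResearchSession.trimHistory:
// keep the system prompt plus the last keepLast exchanges (2 messages per turn).
type Msg = { role: string; content: string };

function trimHistory(messages: Msg[], keepLast = 6): Msg[] {
  const system = messages[0];
  const recent = messages.slice(-(keepLast * 2));
  return [system, ...recent];
}

// Build 1 system message + 10 user/assistant turns (21 messages total)
const history: Msg[] = [{ role: "system", content: "sys" }];
for (let i = 0; i < 10; i++) {
  history.push({ role: "user", content: `q${i}` });
  history.push({ role: "assistant", content: `a${i}` });
}

const trimmed = trimHistory(history);
console.log(trimmed.length); // 13: system + 6 most recent exchanges
```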
Step 2: Batch Query Pipeline
interface ResearchPlan {
topic: string;
questions: string[];
}
interface ResearchReport {
topic: string;
sections: Array<{ question: string; answer: string; citations: string[] }>;
allCitations: string[];
totalTokens: number;
}
async function conductResearch(plan: ResearchPlan): Promise<ResearchReport> {
const sections: ResearchReport["sections"] = [];
const allCitations = new Set<string>();
let totalTokens = 0;
for (const question of plan.questions) {
const response = await perplexity.chat.completions.create({
model: "sonar-pro",
messages: [{ role: "user", content: question }],
} as any);
const answer = response.choices[0].message.content || "";
const citations = (response as any).citations || [];
citations.forEach((url: string) => allCitations.add(url));
totalTokens += response.usage?.total_tokens || 0;
sections.push({ question, answer, citations });
}
return { topic: plan.topic, sections, allCitations: [...allCitations], totalTokens };
}
Optimize Perplexity costs through model routing, caching, token limits, and budget monitoring.
Perplexity Cost Tuning
Overview
Reduce Perplexity Sonar API costs. Perplexity charges per-token (input + output) plus a per-request fee that varies by search context size. The biggest cost lever is model selection: sonar-pro costs 3-15x more than sonar per request.
Pricing Reference
| Model | Input $/M tokens | Output $/M tokens | Request Fee |
|---|---|---|---|
| sonar | $1 | $1 | $5 per 1K requests |
| sonar-pro | $3 | $15 | $5 per 1K requests |
| sonar-reasoning-pro | $3 | $15 | $5 per 1K requests |
| sonar-deep-research | $2 | $8 | $5 per 1K searches |
Search context size (Low/Medium/High) affects the request fee. More context = higher fee.
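The context size can reportedly be tuned per request. A hedged sketch, assuming the web_search_options.search_context_size field from recent Perplexity docs — verify the exact field name against the current API reference before relying on it:

```typescript
// Hedged sketch: web_search_options.search_context_size is an assumption here —
// confirm against current Perplexity API docs before shipping.
function costTunedParams(
  query: string,
  contextSize: "low" | "medium" | "high" = "low",
) {
  return {
    model: "sonar",
    messages: [{ role: "user" as const, content: query }],
    max_tokens: 256, // cap output for factual queries
    web_search_options: { search_context_size: contextSize }, // lower context = lower request fee
  };
}

// Passed with a cast, since the field is not in the OpenAI SDK types:
// await perplexity.chat.completions.create(costTunedParams("Tokyo population") as any);
```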
Prerequisites
- Perplexity API account with usage dashboard
- Understanding of query patterns in your application
- Cache infrastructure for search results
Instructions
Step 1: Route Queries to the Right Model
// 60-70% of queries can use sonar, saving 3-15x per query
function selectModel(query: string): "sonar" | "sonar-pro" {
const simplePatterns = [
/^what is/i, /^define/i, /^who is/i, /^when did/i,
/current price/i, /^how many/i, /^is it true/i,
];
if (simplePatterns.some((p) => p.test(query))) return "sonar";
const complexPatterns = [
/compare.*vs/i, /analysis of/i, /comprehensive/i,
/pros and cons/i, /in-depth/i, /research/i,
];
if (complexPatterns.some((p) => p.test(query))) return "sonar-pro";
return "sonar"; // Default to cheapest
}
Step 2: Limit Output Tokens
set -euo pipefail
# Factual queries need ~100 tokens, not 4096
# Setting max_tokens dramatically reduces output costs
# Simple fact: 100 tokens = $0.0001 output
curl -X POST https://api.perplexity.ai/chat/completions \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "sonar",
"messages": [{"role": "user", "content": "Current population of Tokyo"}],
"max_tokens": 100
}'
# Research query: keep at 2048 only when needed
curl -X POST https://api.perplexity.ai/chat/completions \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "sonar-pro",
"messages": [{"role": "user", "content": "Compare React vs Vue in 2025 for enterprise apps"}],
"max_tokens": 2048
}'
Implement Perplexity query sanitization, citation validation, result caching, and conversation context management for search workflows.
Perplexity Data Handling
Overview
Manage data flowing through Perplexity Sonar API. Critical concern: queries are sent to Perplexity for web search, so any PII in queries is exposed to external infrastructure. Responses contain citations (third-party URLs) that must be validated before displaying to users.
Data Flow
User Input → Query Sanitization → Perplexity API → Response Parsing
                                                          │
                                  ┌───────────────────────┼──────────────────┐
                                  │                       │                  │
                             Answer Text             Citations        Search Results
                                  │                       │                  │
                              Format &              Validate &          Store for
                              Display               Deduplicate         Analytics
Prerequisites
- Perplexity API key configured
- Understanding of PII regulations (GDPR/CCPA)
- Cache storage (Redis or in-memory)
Instructions
Step 1: Query Sanitization
function sanitizeQuery(query: string): { clean: string; redacted: boolean } {
let clean = query;
let redacted = false;
const patterns: Array<[RegExp, string]> = [
[/\b[\w.+-]+@[\w-]+\.[\w.]+\b/g, "[email]"],
[/\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/g, "[phone]"],
[/\b\d{3}-\d{2}-\d{4}\b/g, "[ssn]"],
[/\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/g, "[card]"],
[/\b(pplx-|sk-|pk_|sk_live_)\w{20,}\b/g, "[token]"],
[/\b(user|customer|account)\s*#?\s*\d+\b/gi, "[id]"],
];
for (const [pattern, replacement] of patterns) {
if (pattern.test(clean)) {
clean = clean.replace(pattern, replacement);
redacted = true;
}
}
return { clean, redacted };
}
async function safeSearch(rawQuery: string) {
const { clean, redacted } = sanitizeQuery(rawQuery);
if (redacted) {
console.warn("[Data] PII redacted from Perplexity query");
}
return perplexity.chat.completions.create({
model: "sonar",
messages: [{ role: "user", content: clean }],
});
}
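The email rule, for example, can be checked in isolation:

```typescript
// Standalone check of the email redaction pattern from sanitizeQuery above.
const emailPattern = /\b[\w.+-]+@[\w-]+\.[\w.]+\b/g;
const query = "Find pricing plans mentioned by jane.doe@example.com last week";
const clean = query.replace(emailPattern, "[email]");
console.log(clean); // "Find pricing plans mentioned by [email] last week"
```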
Step 2: Citation Validation
interface ValidatedCitation {
url: string;
domain: string;
valid: boolean;
index: number;
}
function validateCitations(citations: string[]): ValidatedCitation[] {
return citations.map((url, i) => {
try {
const parsed = new URL(url);
return {
url: url.replace(/[.,;:]+$/, ""),
domain: parsed.hostname,
valid: ["http:", "https:"].includes(parsed.protocol),
index: i + 1,
};
} catch {
return { url, domain: "unknown", valid: false, index: i + 1 };
}
});
}
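The same rules can be exercised standalone with the WHATWG URL parser:

```typescript
// Standalone demo of the validation rules above: strip trailing punctuation,
// extract the domain, and accept only http(s) schemes.
const raw = ["https://example.com/report,", "ftp://files.example.com", "not a url"];
const validated = raw.map((url, i) => {
  try {
    const parsed = new URL(url);
    return {
      url: url.replace(/[.,;:]+$/, ""),
      domain: parsed.hostname,
      valid: ["http:", "https:"].includes(parsed.protocol),
      index: i + 1,
    };
  } catch {
    return { url, domain: "unknown", valid: false, index: i + 1 };
  }
});
console.log(validated.filter((c) => c.valid).length); // 1 — only the https URL survives
```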
Collect Perplexity debug evidence for support tickets and troubleshooting.
Perplexity Debug Bundle
Current State
!node --version 2>/dev/null || echo 'N/A'
!python3 --version 2>/dev/null || echo 'N/A'
!echo "PERPLEXITY_API_KEY: ${PERPLEXITY_API_KEY:+SET (${#PERPLEXITY_API_KEY} chars)}${PERPLEXITY_API_KEY:-NOT SET}"
Overview
Collect all diagnostic information needed to troubleshoot Perplexity Sonar API issues. Generates a redacted bundle safe for sharing with support or teammates.
Prerequisites
- PERPLEXITY_API_KEY environment variable set
- curl and tar available
- Permission to collect environment info
Instructions
Step 1: Create Debug Bundle Script
#!/bin/bash
set -euo pipefail
# perplexity-debug-bundle.sh
BUNDLE_DIR="perplexity-debug-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BUNDLE_DIR"
echo "=== Perplexity Debug Bundle ===" > "$BUNDLE_DIR/summary.txt"
echo "Generated: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$BUNDLE_DIR/summary.txt"
echo "" >> "$BUNDLE_DIR/summary.txt"
Step 2: Collect Environment Info
set -euo pipefail
cat >> "$BUNDLE_DIR/summary.txt" << 'EOF'
--- Environment ---
EOF
echo "Node: $(node --version 2>/dev/null || echo 'not installed')" >> "$BUNDLE_DIR/summary.txt"
echo "Python: $(python3 --version 2>/dev/null || echo 'not installed')" >> "$BUNDLE_DIR/summary.txt"
echo "OS: $(uname -sr)" >> "$BUNDLE_DIR/summary.txt"
echo "OpenAI SDK (npm): $(npm list openai 2>/dev/null | grep openai || echo 'not found')" >> "$BUNDLE_DIR/summary.txt"
echo "OpenAI SDK (pip): $(pip show openai 2>/dev/null | grep Version || echo 'not found')" >> "$BUNDLE_DIR/summary.txt"
echo "API Key: ${PERPLEXITY_API_KEY:+SET (prefix: ${PERPLEXITY_API_KEY:0:5}...)}${PERPLEXITY_API_KEY:-NOT SET}" >> "$BUNDLE_DIR/summary.txt"
Step 3: Test API Connectivity
set -euo pipefail
echo "" >> "$BUNDLE_DIR/summary.txt"
echo "--- API Connectivity ---" >> "$BUNDLE_DIR/summary.txt"
# DNS resolution
echo -n "DNS: " >> "$BUNDLE_DIR/summary.txt"
dig +short api.perplexity.ai >> "$BUNDLE_DIR/summary.txt" 2>&1
# API response test
echo -n "API Health: " >> "$BUNDLE_DIR/summary.txt"
curl -s -w "HTTP %{http_code} in %{time_total}s" \
-o "$BUNDLE_DIR/api-response.json" \
-H "Authorization: Bearer ${PERPLEXITY_API_KEY}" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions >> "$BUNDLE_DIR/summary.txt"
Deploy Perplexity Sonar API integrations to Vercel, Cloud Run, and Docker.
Perplexity Deploy Integration
Overview
Deploy applications using Perplexity Sonar API to edge and server platforms. Perplexity's OpenAI-compatible endpoint at https://api.perplexity.ai/chat/completions works from any platform that can make HTTPS requests.
Prerequisites
- Perplexity API key stored in PERPLEXITY_API_KEY
- Platform CLI installed (vercel, gcloud, or docker)
- Application tested locally
Instructions
Step 1: Vercel Edge Function
// api/search.ts
import OpenAI from "openai";
export const config = { runtime: "edge" };
const perplexity = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY!,
baseURL: "https://api.perplexity.ai",
});
export default async function handler(req: Request) {
const { query, model = "sonar", stream = false } = await req.json();
if (stream) {
const response = await perplexity.chat.completions.create({
model,
messages: [{ role: "user", content: query }],
stream: true,
max_tokens: 2048,
});
return new Response(response.toReadableStream(), {
headers: { "Content-Type": "text/event-stream" },
});
}
const response = await perplexity.chat.completions.create({
model,
messages: [{ role: "user", content: query }],
max_tokens: 2048,
});
return Response.json({
answer: response.choices[0].message.content,
citations: (response as any).citations || [],
model: response.model,
});
}
set -euo pipefail
# Deploy to Vercel
vercel env add PERPLEXITY_API_KEY production
vercel deploy --prod
Step 2: Cloud Run with Redis Cache
// server.ts
import express from "express";
import OpenAI from "openai";
import { createClient } from "redis";
import { createHash } from "crypto";
const app = express();
app.use(express.json());
const perplexity = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY!,
baseURL: "https://api.perplexity.ai",
});
const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();
app.post("/api/search", async (req, res) => {
const { query, model = "sonar" } = req.body;
const cacheKey = `pplx:${createHash("sha256").update(`${model}:${query}`).digest("hex")}`;
// Check cache first
const cached = await redis.get(cacheKey);
if (cached) {
return res.json({ ...JSON.parse(cached), cached: true });
}
const response = await perplexity.chat.completions.create({
model,
messages: [{ role: "user", content: query }],
max_tokens: 2048,
});
const result = {
answer: response.choices[0].message.content,
citations: (response as any).citations || [],
model: response.model,
};
// Cache for 4 hours (TTL assumption — tune to your content's freshness needs)
await redis.set(cacheKey, JSON.stringify(result), { EX: 4 * 3600 });
res.json(result);
});
app.listen(process.env.PORT || 8080);
Configure Perplexity API key scoping, per-team model access, cost controls, and search domain restrictions for enterprise deployments.
Perplexity Enterprise RBAC
Overview
Control access to Perplexity Sonar API at the organizational level. Perplexity does not have built-in RBAC -- you implement access control through: separate API keys per team/environment, a gateway that enforces model and budget policies, and domain restrictions for compliance.
Access Control Strategy
| Layer | Mechanism | Perplexity Support |
|---|---|---|
| Authentication | API key per team | Yes (multiple keys) |
| Model restriction | Gateway enforcement | Build yourself |
| Budget cap | Per-key monthly limit | Via dashboard |
| Domain restriction | search_domain_filter | Yes (per-request) |
| Rate limiting | Gateway + key limits | Yes (per-key RPM) |
Prerequisites
- Perplexity API account with admin access
- Separate API keys per team/environment
- Gateway or middleware for policy enforcement
Instructions
Step 1: Create Per-Team API Keys
Generate separate keys at perplexity.ai/settings/api:
Key: pplx-support-bot-prod → Budget: $200/mo, sonar only
Key: pplx-research-team → Budget: $1000/mo, sonar + sonar-pro
Key: pplx-data-team → Budget: $500/mo, sonar only
Key: pplx-executive-reports → Budget: $300/mo, sonar-pro
Step 2: Gateway with Policy Enforcement
// perplexity-gateway.ts
import OpenAI from "openai";
interface TeamPolicy {
apiKey: string;
allowedModels: string[];
maxTokensPerRequest: number;
maxRequestsPerMinute: number;
requiredDomainFilter?: string[]; // Force search to specific domains
blockedDomainFilter?: string[]; // Block specific domains
}
const TEAM_POLICIES: Record<string, TeamPolicy> = {
support: {
apiKey: process.env.PPLX_KEY_SUPPORT!,
allowedModels: ["sonar"],
maxTokensPerRequest: 512,
maxRequestsPerMinute: 30,
},
research: {
apiKey: process.env.PPLX_KEY_RESEARCH!,
allowedModels: ["sonar", "sonar-pro", "sonar-reasoning-pro"],
maxTokensPerRequest: 4096,
maxRequestsPerMinute: 50,
},
compliance: {
apiKey: process.env.PPLX_KEY_COMPLIANCE!,
allowedModels: ["sonar", "sonar-pro"],
maxTokensPerRequest: 2048,
maxRequestsPerMinute: 20,
requiredDomainFilter: ["sec.gov", "edgar.sec.gov", "law.cornell.edu"],
},
marketing: {
apiKey: process.env.PPLX_KEY_MARKETING!,
allowedModels: ["sonar"],
maxTokensPerRequest: 1024,
maxRequestsPerMinute: 20,
blockedDomainFilter: ["-competitor1.com", "-competitor2.com"],
},
};
Create a minimal working Perplexity Sonar search example with citations.
Perplexity Hello World
Overview
Minimal working example demonstrating Perplexity's core value: web-grounded answers with citations. Unlike standard LLMs, Perplexity searches the web for every query and returns cited sources.
Prerequisites
- Completed perplexity-install-auth setup
- openai package installed
- PERPLEXITY_API_KEY environment variable set
Instructions
Step 1: Basic Search with Citations (TypeScript)
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY,
baseURL: "https://api.perplexity.ai",
});
async function main() {
const response = await client.chat.completions.create({
model: "sonar",
messages: [
{
role: "system",
content: "Be precise and cite your sources.",
},
{
role: "user",
content: "What are the latest features in Node.js 22?",
},
],
});
const answer = response.choices[0].message.content;
console.log("Answer:", answer);
// Citations are returned as a top-level array on the response
const citations = (response as any).citations || [];
console.log("\nSources:");
citations.forEach((url: string, i: number) => {
console.log(` [${i + 1}] ${url}`);
});
// Usage breakdown
console.log("\nUsage:", {
prompt_tokens: response.usage?.prompt_tokens,
completion_tokens: response.usage?.completion_tokens,
total_tokens: response.usage?.total_tokens,
});
}
main().catch(console.error);
Step 2: Basic Search with Citations (Python)
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ["PERPLEXITY_API_KEY"],
base_url="https://api.perplexity.ai",
)
response = client.chat.completions.create(
model="sonar",
messages=[
{"role": "system", "content": "Be precise and cite your sources."},
{"role": "user", "content": "What are the latest features in Node.js 22?"},
],
)
answer = response.choices[0].message.content
print("Answer:", answer)
# Citations from the raw response
raw = response.model_dump()
citations = raw.get("citations", [])
print("\nSources:")
for i, url in enumerate(citations, 1):
print(f" [{i}] {url}")
print(f"\nTokens: {response.usage.total_tokens}")
Step 3: Search with Domain Filter
// Restrict search to specific domains
const response = await client.chat.completions.create({
model: "sonar",
messages: [
{ role: "user", content: "What is the latest Python release?" },
],
search_domain_filter: ["python.org", "docs.python.org"],
} as any);
Execute Perplexity incident response procedures with triage, mitigation, and postmortem.
Perplexity Incident Runbook
Overview
Rapid incident response for Perplexity Sonar API issues. Perplexity-specific: the API depends on live web search, so outages can be partial (search degraded but API responding), model-specific (sonar-pro down but sonar working), or citation-related (answers returned but no sources).
Severity Levels
| Level | Definition | Response Time | Example |
|---|---|---|---|
| P1 | Complete API failure | < 15 min | All requests returning 500/503 |
| P2 | Degraded service | < 1 hour | High latency, 429 rate limits, no citations |
| P3 | Minor impact | < 4 hours | Single model unavailable, sporadic errors |
| P4 | No user impact | Next business day | Monitoring gap, stale cache |
Quick Triage (Run Immediately)
set -euo pipefail
echo "=== Perplexity Triage ==="
# 1. Test sonar model
echo -n "sonar: "
curl -s -w "HTTP %{http_code} in %{time_total}s" -o /dev/null \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions
echo ""
# 2. Test sonar-pro model
echo -n "sonar-pro: "
curl -s -w "HTTP %{http_code} in %{time_total}s" -o /dev/null \
-H "Authorization: Bearer $PERPLEXITY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sonar-pro","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions
echo ""
# 3. Check API key validity
echo -n "Auth: "
curl -s -o /dev/null -w "%{http_code}" \
-H "Authorization: Bearer invalid-key" \
-H "Content-Type: application/json" \
-d '{"model":"sonar","messages":[{"role":"user","content":"test"}],"max_tokens":5}' \
https://api.perplexity.ai/chat/completions
echo " (expect 401 = API reachable)"
# 4. DNS check
echo -n "DNS: "
dig +short api.perplexity.ai
Decision Tree
API returning errors?
├─ 401/402: Auth issue
│ └─ Verify API key → Regenerate at perplexity.ai/settings/api
├─ 429: Rate limited
│ └─ Enable request queue → Reduce concurrency → Wait
├─ 500/503: Server error
│ ├─ All models affected?
│ │ ├─ YES → Perplexity outage. Enable fallback/cache.
│ │ └─ NO → Model-specific issue. Route traffic to the working model.
Install and configure Perplexity Sonar API authentication.
Perplexity Install & Auth
Overview
Set up Perplexity Sonar API access using the OpenAI-compatible chat completions endpoint at https://api.perplexity.ai. Perplexity does not have a custom SDK -- you use the standard OpenAI client library pointed at Perplexity's base URL.
Prerequisites
- Node.js 18+ or Python 3.10+
- Perplexity account at perplexity.ai
- API key from perplexity.ai/settings/api
Instructions
Step 1: Install OpenAI Client Library
set -euo pipefail
# Node.js / TypeScript
npm install openai
# Python
pip install openai
There is no @perplexity/sdk package. Perplexity uses the OpenAI wire format, so you use the official openai package with a custom baseURL.
Step 2: Configure API Key
# Set environment variable
export PERPLEXITY_API_KEY="pplx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
# Or create .env file (add .env to .gitignore)
echo 'PERPLEXITY_API_KEY=pplx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' >> .env
API keys start with pplx- and are generated at perplexity.ai/settings/api. You must add credits to your account before making API calls.
Step 3: Verify Connection (TypeScript)
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.PERPLEXITY_API_KEY,
baseURL: "https://api.perplexity.ai",
});
async function verify() {
const response = await client.chat.completions.create({
model: "sonar",
messages: [{ role: "user", content: "What is 2+2?" }],
max_tokens: 50,
});
console.log("Connected:", response.choices[0].message.content);
console.log("Model:", response.model);
console.log("Tokens used:", response.usage?.total_tokens);
}
verify().catch(console.error);
Step 4: Verify Connection (Python)
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ["PERPLEXITY_API_KEY"],
base_url="https://api.perplexity.ai",
)
response = client.chat.completions.create(
model="sonar",
messages=[{"role": "user", "content": "What is 2+2?"}],
max_tokens=50,
)
print("Connected:", response.choices[0].message.content)
print("Model:", response.model)
print("Tokens:", response.usage.total_tokens)
Available Models
| Model | Use Case | Input $/M tokens | Output $/M tokens |
|---|---|---|---|
Identify and avoid Perplexity anti-patterns and common integration mistakes.
Allowed tools: Read, Grep
Perplexity Known Pitfalls
Overview
Real gotchas when integrating Perplexity Sonar API. Perplexity uses an OpenAI-compatible chat endpoint but performs live web searches -- a fundamentally different paradigm from standard LLM completions. These pitfalls come from treating it like a regular chatbot.
Prerequisites
Pitfalls
1. Using It as a Generic Chatbot
Perplexity searches the web per request. Using it for tasks that don't need web search wastes money.
2. Ignoring Citations
Perplexity returns source citations with each response; discarding them throws away the main value of search-augmented answers.
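A minimal sketch of using the citations instead of ignoring them. It assumes the documented response shape: a top-level `citations` array of source URLs alongside the usual OpenAI-style `choices` (newer API versions may return richer `search_results` objects instead, so verify against the current docs).

```typescript
// Hypothetical helper: pull the answer text and its sources out of a
// parsed Perplexity response. The `citations` field is assumed here.
interface PerplexityResponse {
  choices: { message: { content: string } }[];
  citations?: string[];
}

function extractAnswer(res: PerplexityResponse): { text: string; sources: string[] } {
  return {
    text: res.choices[0]?.message.content ?? "",
    sources: res.citations ?? [],
  };
}

// Example with a canned response (no network call):
const sample: PerplexityResponse = {
  choices: [{ message: { content: "The answer is 4 [1]." } }],
  citations: ["https://example.com/math"],
};
const { text, sources } = extractAnswer(sample);
```

Logging `sources` alongside `text` also gives you the citation-count quality signal the observability skill below tracks.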
3. Using the Wrong SDK Import
There is no @perplexity/sdk package; use the official openai client with baseURL set to https://api.perplexity.ai.
4. Not Setting max_tokens
Without an explicit max_tokens cap, responses can run long and inflate per-token costs.
5. No Recency Filter for Time-Sensitive Queries
Without a recency filter, time-sensitive queries can be answered from stale sources.
Load test Perplexity Sonar API integrations and plan capacity.
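For the recency pitfall, a hedged sketch: Perplexity documents a `search_recency_filter` request field (values such as "day", "week", "month"). It is not part of the OpenAI SDK types, so this builds the raw request body directly; verify the field name and values against the current Perplexity docs.

```typescript
// Build a chat-completions body with a recency constraint attached.
// `search_recency_filter` is a Perplexity-specific field (assumption:
// current docs still use this name).
function buildRecentSearchBody(query: string, recency: "day" | "week" | "month") {
  return {
    model: "sonar",
    messages: [{ role: "user", content: query }],
    max_tokens: 300,
    search_recency_filter: recency,
  };
}

// Would be POSTed with fetch:
// fetch("https://api.perplexity.ai/chat/completions", {
//   method: "POST",
//   headers: {
//     Authorization: `Bearer ${process.env.PERPLEXITY_API_KEY}`,
//     "Content-Type": "application/json",
//   },
//   body: JSON.stringify(buildRecentSearchBody("latest Node.js LTS release", "week")),
// });
```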
Allowed tools: Read, Write, Edit, Bash(k6:*), Bash(kubectl:*)
Perplexity Load & Scale
Overview
Load testing and capacity planning for Perplexity Sonar API. Key constraint: Perplexity rate limits at 50 RPM (default tier), and every request performs a live web search with variable latency. Load testing must respect these limits to avoid burning through credits.
Capacity Constraints
Prerequisites
Instructions
Step 1: k6 Load Test Script
Configure Perplexity local development with mocking, testing, and hot reload.
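The 50 RPM constraint in the Load & Scale skill above suggests pacing requests client-side so a load test never trips the server-side limit. A minimal sketch of a leaky-bucket limiter (class name and numbers are illustrative):

```typescript
// Client-side leaky-bucket rate limiter: the bucket refills continuously
// at `rpm` tokens per minute, capped at `rpm` burst capacity -- mirroring
// the leaky-bucket behavior described for Perplexity's own limits.
class RpmLimiter {
  private allowance: number;
  private last: number;

  constructor(private rpm: number, now: number = Date.now()) {
    this.allowance = rpm;
    this.last = now;
  }

  // Returns true if a request may be sent right now.
  tryAcquire(now: number = Date.now()): boolean {
    const elapsedMin = (now - this.last) / 60_000;
    this.last = now;
    // Refill continuously, capped at the bucket size.
    this.allowance = Math.min(this.rpm, this.allowance + elapsedMin * this.rpm);
    if (this.allowance < 1) return false;
    this.allowance -= 1;
    return true;
  }
}
```

A load-test loop would call `tryAcquire()` before each request and sleep briefly when it returns false.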
Allowed tools: Read, Write, Edit, Bash(npm:*), Bash(pnpm:*), Grep
Perplexity Local Dev Loop
Overview
Set up a fast, cost-effective local development workflow for Perplexity Sonar API. Key challenge: every real API call performs a web search and costs money, so mocking and caching are essential for development.
Prerequisites
Instructions
Step 1: Project Structure
Step 2: Type-Safe Client Wrapper
Step 3: Save Fixtures for Offline Development
Migrate to Perplexity Sonar from other search/LLM APIs using the strangler fig pattern.
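The fixture step in the Local Dev Loop skill above can be sketched as a record-and-replay wrapper: the first call hits the live API (here abstracted as any `fetcher` function) and saves the response to disk; later calls replay the saved fixture. Paths and names are illustrative.

```typescript
// Hypothetical fixture cache: record once, replay offline afterwards.
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

function cachedSearch(
  query: string,
  fetcher: (q: string) => string, // the live API call in real code
  dir = ".fixtures",
): string {
  mkdirSync(dir, { recursive: true });
  const key = createHash("sha256").update(query).digest("hex").slice(0, 16);
  const file = join(dir, `${key}.json`);
  if (existsSync(file)) return readFileSync(file, "utf8"); // replay fixture
  const result = fetcher(query); // record once
  writeFileSync(file, result);
  return result;
}
```

Committing `.fixtures/` (or keeping it per-developer) makes tests deterministic and free.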
Allowed tools: Read, Write, Edit, Bash(npm:*), Bash(node:*), Bash(kubectl:*)
Perplexity Migration Deep Dive
Overview
Migrate from traditional search APIs (Google Custom Search, Bing, SerpAPI) or legacy LLMs to Perplexity Sonar. Key advantage: Perplexity combines search + LLM summarization in a single API call, replacing a multi-step pipeline.
Migration Comparison
Instructions
Step 1: Assess Current Integration
Step 2: Build Adapter Layer
Configure Perplexity Sonar API across development, staging, and production environments.
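The adapter step of the strangler-fig migration above can be sketched as a common interface that both the legacy provider and a Perplexity-backed provider implement, so callers never notice the switch. The interface and field names are illustrative, not from an existing codebase.

```typescript
// Hypothetical adapter layer for the strangler-fig migration.
interface SearchResult { answer: string; urls: string[] }

interface SearchProvider {
  search(query: string): Promise<SearchResult>;
}

// The legacy pipeline (e.g. SerpAPI + summarizer) would also implement
// SearchProvider; traffic shifts provider-by-provider.
class PerplexityProvider implements SearchProvider {
  // `send` abstracts the HTTP call so the adapter is testable offline.
  constructor(private send: (body: object) => Promise<any>) {}

  buildBody(query: string): object {
    return {
      model: "sonar",
      messages: [{ role: "user", content: query }],
      max_tokens: 500,
    };
  }

  async search(query: string): Promise<SearchResult> {
    const res = await this.send(this.buildBody(query));
    return {
      answer: res.choices?.[0]?.message?.content ?? "",
      urls: res.citations ?? [],
    };
  }
}
```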
Allowed tools: Read, Write, Edit, Bash(aws:*), Bash(gcloud:*), Bash(vault:*)
Perplexity Multi-Environment Setup
Overview
Configure Perplexity Sonar API across dev/staging/prod. Key decisions per environment: which models are allowed (sonar vs sonar-pro), rate limits, and cost caps. All environments use the same base URL (https://api.perplexity.ai).
Environment Strategy
Prerequisites
Instructions
Step 1: Configuration Structure
Step 2: Base Configuration
Step 3: Environment Configs
Set up monitoring for Perplexity Sonar API with latency, cost, citation quality, and error tracking.
Allowed tools: Read, Write, Edit
Perplexity Observability
Overview
Monitor Perplexity Sonar API performance, cost, and quality. Key signals unique to Perplexity: citation count per response (quality indicator), search latency variability (web search is non-deterministic), and per-model cost differences.
Key Metrics
Prerequisites
Instructions
Step 1: Instrument the Perplexity Client
Step 2: Prometheus Metrics Export
Optimize Perplexity Sonar API performance with caching, streaming, model routing, and batching.
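The Prometheus export step in the Observability skill above can be sketched without any library: a tiny in-process counter registry that renders the Prometheus text exposition format (`name{labels} value`, one line per series). In production you would likely use prom-client instead; metric names here are illustrative.

```typescript
// Minimal counter registry with Prometheus text-format exposition.
class Metrics {
  private counters = new Map<string, number>();

  inc(name: string, labels: Record<string, string> = {}, by = 1): void {
    const pairs = Object.entries(labels)
      .map(([k, v]) => `${k}="${v}"`)
      .join(",");
    const key = pairs ? `${name}{${pairs}}` : name;
    this.counters.set(key, (this.counters.get(key) ?? 0) + by);
  }

  // One `name{labels} value` line per series, as /metrics would serve it.
  expose(): string {
    return [...this.counters.entries()]
      .map(([k, v]) => `${k} ${v}`)
      .join("\n");
  }
}

const metrics = new Metrics();
metrics.inc("perplexity_requests_total", { model: "sonar", status: "ok" });
metrics.inc("perplexity_requests_total", { model: "sonar", status: "ok" });
metrics.inc("perplexity_citations_total", { model: "sonar" }, 3);
```

Serving `metrics.expose()` from a `/metrics` HTTP endpoint is enough for a Prometheus scrape target.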
Allowed tools: Read, Write, Edit
Perplexity Performance Tuning
Overview
Optimize Perplexity Sonar API for latency, throughput, and cost. Key insight: every Perplexity call performs a live web search, so response times are inherently variable. Typical latencies: sonar 1-3s, sonar-pro 3-8s, sonar-deep-research 10-60s.
Latency Benchmarks
Prerequisites
Instructions
Step 1: Smart Model Routing
Step 2: Query Hash Caching
Implement content moderation, model selection policy, citation quality enforcement, and per-user usage quotas for Perplexity Sonar API.
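The query-hash caching step in the Performance Tuning skill above reduces to one decision: what makes two queries "the same"? A sketch that normalizes whitespace and case before hashing, so trivially different phrasings share one cache entry (the normalization rules are illustrative; remember web results go stale, so pair keys with a TTL):

```typescript
// Deterministic cache key for a (model, query) pair.
import { createHash } from "node:crypto";

function cacheKey(model: string, query: string): string {
  const normalized = query.trim().toLowerCase().replace(/\s+/g, " ");
  return createHash("sha256").update(`${model}:${normalized}`).digest("hex");
}
```

Including the model in the key matters because sonar and sonar-pro can return different-quality answers for the same query.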
Allowed tools: Read, Write, Edit, Bash(npx:*)
Perplexity Policy Guardrails
Overview
Policy enforcement for Perplexity Sonar API. Since Perplexity performs live web searches, guardrails must address: query content moderation (what users can search for), citation reliability (filtering low-quality sources), cost control (model selection + token limits), and responsible AI usage.
Policy Pipeline
Prerequisites
Instructions
Step 1: Query Content Moderation
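A minimal sketch of the moderation step: check each query before anything is sent to the open web. The patterns below (an email regex and a blocklist) are illustrative minimums; a real deployment needs a fuller PII and policy list.

```typescript
// Pre-flight query moderation. Patterns are illustrative, not exhaustive.
const EMAIL = /[\w.+-]+@[\w-]+\.[\w.]+/;
const BLOCKED = [/\bcredit card number\b/i];

type Verdict = { allowed: true } | { allowed: false; reason: string };

function moderateQuery(query: string): Verdict {
  if (EMAIL.test(query)) {
    return { allowed: false, reason: "query contains an email address" };
  }
  for (const pattern of BLOCKED) {
    if (pattern.test(query)) return { allowed: false, reason: "blocked topic" };
  }
  return { allowed: true };
}
```

Rejected queries should be logged (without the PII itself) so the blocklist can be tuned.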
Step 2: Model Selection Policy
Execute Perplexity production deployment checklist for Sonar API integrations.
Allowed tools: Read, Bash(kubectl:*), Bash(curl:*), Grep
Perplexity Production Checklist
Overview
Complete checklist for deploying Perplexity Sonar API integrations to production. Perplexity-specific concerns: every API call performs a live web search (variable latency), citations link to third-party sites (must validate), and costs scale per-request plus per-token.
Prerequisites
Production Readiness Checklist
API Configuration
Code Quality
Performance
Monitoring
Cost Controls
Graceful Degradation
Implement Perplexity rate limiting, backoff, and request queuing.
Allowed tools: Read, Write, Edit
Perplexity Rate Limits
Overview
Handle Perplexity Sonar API rate limits. Perplexity uses a leaky bucket algorithm: burst capacity is available, with tokens refilling continuously at your assigned rate. Rate limits are based on requests per minute (RPM).
Rate Limit Tiers
Rate limits apply per API key, not per model. Using multiple models does not raise your limit.
Prerequisites
Instructions
Step 1: Exponential Backoff with Jitter
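The backoff step above can be sketched with the "full jitter" variant: the delay is a uniform random value between zero and min(cap, base × 2^attempt), which avoids retry stampedes after a shared 429. The constants are illustrative.

```typescript
// Full-jitter exponential backoff for 429 responses.
function backoffDelayMs(
  attempt: number,                 // 0-based retry count
  baseMs = 500,
  capMs = 30_000,
  rand: () => number = Math.random, // injectable for testing
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(rand() * ceiling); // "full jitter": uniform in [0, ceiling)
}
```

A retry loop would `await sleep(backoffDelayMs(attempt))` after each 429 until a max-attempts budget is exhausted.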
Step 2: Queue-Based Rate Limiting
Implement Perplexity reference architecture with model routing, citation pipeline, and research automation.
Allowed tools: Read, Grep
Perplexity Reference Architecture
Overview
Production architecture for AI-powered search with Perplexity Sonar API. Three tiers: search service (model routing + caching), citation pipeline (extract, validate, store), and research orchestrator (multi-query synthesis).
Architecture
Prerequisites
Instructions
Step 1: Search Service with Model Routing
Implement reliability patterns for Perplexity Sonar API: circuit breaker, model fallback, streaming timeout, and citation validation.
Allowed tools: Read, Write, Edit
Perplexity Reliability Patterns
Overview
Production reliability patterns for Perplexity Sonar API. Perplexity performs live web searches per request, making response times inherently variable. The key reliability challenges: search can stall, citations can break, and model tiers have different availability.
Prerequisites
Instructions
Step 1: Model Tier Fallback
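A sketch of the tier-fallback step: try the preferred model first and walk down a tier list when a call fails. The tier order and error handling are illustrative; the actual API call is abstracted as `call` so the pattern is testable offline.

```typescript
// Try each model tier in order; return the first success.
async function searchWithFallback(
  query: string,
  call: (model: string, q: string) => Promise<string>, // the real API call
  tiers: string[] = ["sonar-pro", "sonar"],
): Promise<{ model: string; answer: string }> {
  let lastErr: unknown;
  for (const model of tiers) {
    try {
      return { model, answer: await call(model, query) };
    } catch (err) {
      lastErr = err; // fall through to the next, cheaper tier
    }
  }
  throw lastErr; // every tier failed
}
```

Pairing this with the circuit breaker in the next step avoids hammering a tier that is known to be down.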
Step 2: Circuit Breaker
Apply production-ready Perplexity Sonar API patterns for TypeScript and Python.
Allowed tools: Read, Write, Edit
Perplexity SDK Patterns
Overview
Production-ready patterns for Perplexity Sonar API. Since Perplexity uses the OpenAI wire format, you build wrappers around the standard openai client.
Prerequisites
Instructions
Step 1: Typed Client Singleton (TypeScript)
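The singleton step above boils down to building the configured client once per process rather than once per request. A generic sketch (the commented usage assumes the `openai` package from the install step):

```typescript
// Lazy singleton factory: `create` runs at most once.
function lazySingleton<T>(create: () => T): () => T {
  let instance: T | undefined;
  return () => (instance ??= create());
}

// Usage sketch with the openai client from the install step:
// const getClient = lazySingleton(() => new OpenAI({
//   apiKey: process.env.PERPLEXITY_API_KEY,
//   baseURL: "https://api.perplexity.ai",
// }));

// Demonstration with a plain object instead of a network client:
let built = 0;
const getConfig = lazySingleton(() => ({ id: ++built }));
```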
Step 2: Search with Full Response Parsing
Apply Perplexity security best practices for API key management and query safety.
Allowed tools: Read, Write, Grep
Perplexity Security Basics
Overview
Security best practices for Perplexity Sonar API. Key concerns: API key protection (keys start with pplx-) and query safety.
Prerequisites
Instructions
Step 1: API Key Management
Step 2: Query Sanitization (Critical)
Perplexity sends your query to the open web for search. Any PII in the query is exposed to external search infrastructure.
Step 3: Restrict Search Domains
Use a domain allowlist to limit which sites search results may come from.
Migrate between Perplexity model generations and API parameter changes.
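For the domain-restriction step, a hedged sketch: Perplexity documents a `search_domain_filter` request field (an array of domains; entries prefixed with "-" exclude a domain). It sits outside the OpenAI SDK types, so the body is built by hand here; verify the field's current semantics and limits against the Perplexity docs.

```typescript
// Build a request body restricted to (or excluding) specific domains.
// `search_domain_filter` is a Perplexity-specific field (assumption:
// current docs still use this name and the "-" exclusion prefix).
function buildDomainRestrictedBody(query: string, domains: string[]) {
  return {
    model: "sonar",
    messages: [{ role: "user", content: query }],
    max_tokens: 300,
    search_domain_filter: domains, // e.g. ["docs.python.org", "-pinterest.com"]
  };
}
```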
Allowed tools: Read, Write, Edit, Bash(npm:*), Bash(git:*)
Perplexity Upgrade & Migration
Overview
Guide for migrating between Perplexity model generations and API changes. Perplexity has evolved from pplx-api with third-party models to the Sonar family with built-in web search.
Model Evolution
Instructions
Step 1: Identify Legacy Patterns to Update
Step 2: Model Name Migration Map
Build event-driven architectures around Perplexity Sonar API with streaming, batch pipelines, and scheduled search monitoring.
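The migration-map step above can be sketched as a lookup from retired names to the current Sonar family. The specific mappings below are a best-effort reading of Perplexity's past deprecation notices, not an authoritative list; verify each against the current model documentation before relying on them.

```typescript
// Assumed mapping from retired model names to current Sonar models --
// treat every entry as a hypothesis to confirm against current docs.
const MODEL_MIGRATION: Record<string, string> = {
  "pplx-7b-online": "sonar",
  "pplx-70b-online": "sonar-pro",
  "llama-3.1-sonar-small-128k-online": "sonar",
  "llama-3.1-sonar-large-128k-online": "sonar-pro",
  "llama-3.1-sonar-huge-128k-online": "sonar-pro",
};

function migrateModel(name: string): string {
  return MODEL_MIGRATION[name] ?? name; // current names pass through unchanged
}
```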
Allowed tools: Read, Write, Edit, Bash(curl:*)
Perplexity Events & Async Patterns
Overview
Build event-driven architectures around Perplexity Sonar API. Perplexity does not have webhooks -- all interactions are request/response. Event patterns are built using streaming SSE, job queues for batch processing, and cron-triggered monitoring.
Event Patterns
Prerequisites
Instructions
Step 1: Streaming Search (Server-Sent Events)
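A sketch of the streaming step: with `stream: true`, the openai client yields OpenAI-style delta chunks, and a consumer accumulates tokens while forwarding each one (for example to an SSE response). The chunk shape below mirrors the OpenAI streaming delta format; the generic `AsyncIterable` lets the consumer run against a fake stream too.

```typescript
// Consume a streamed chat completion chunk-by-chunk.
interface StreamChunk {
  choices: { delta: { content?: string } }[];
}

async function collectStream(
  stream: AsyncIterable<StreamChunk>,
  onToken: (t: string) => void = () => {},
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    const token = chunk.choices[0]?.delta.content ?? "";
    if (token) {
      full += token;
      onToken(token); // e.g. flush to an SSE response here
    }
  }
  return full;
}

// Real usage sketch with the openai client from the install step:
// const stream = await client.chat.completions.create({
//   model: "sonar", messages, stream: true,
// });
// const answer = await collectStream(stream as AsyncIterable<StreamChunk>);
```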
Step 2: Batch Research Pipeline
Tags
perplexity, ai-search, research, citations, real-time, knowledge, answers