perplexity-known-pitfalls

Identify and avoid Perplexity anti-patterns and common integration mistakes.

v1.12.0

Jeremy Longshore

MIT

Allowed Tools

Read, Grep

Provided by Plugin

perplexity-pack

Claude Code skill pack for Perplexity (30 skills)

saas packs v1.12.0

Installation

This skill is included in the perplexity-pack plugin:

/plugin install perplexity-pack@claude-code-plugins-plus

Instructions

Perplexity Known Pitfalls

Overview

Real gotchas when integrating the Perplexity Sonar API. Perplexity exposes an OpenAI-compatible chat endpoint but performs a live web search for each request -- a fundamentally different paradigm from standard LLM completions. Most of these pitfalls come from treating it like a regular chatbot.

Prerequisites

Perplexity API key configured
Understanding of OpenAI-compatible chat API format

Pitfalls

1. Using It as a Generic Chatbot

Perplexity searches the web per request. Using it for tasks that don't need web search wastes money.


# BAD: general chatbot (wastes a search query)
response = call_perplexity("Write me a haiku about cats")
# Costs $0.005+ for something any LLM can do offline

# GOOD: leverage web search capability
response = call_perplexity(
    "What are the latest Next.js 15 features released this month?",
    search_recency_filter="month"
)
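The Python snippets in this skill call a `call_perplexity` helper that is never defined. A minimal sketch of what such a helper might look like, assuming raw HTTP against the chat completions endpoint and only the standard library. The names `build_payload` and `API_URL`, the default `max_tokens`, and the choice to return the parsed JSON body as a dict are illustrative assumptions, not part of any official client:

```python
import json
import os
import urllib.request

# Assumed endpoint path on the base URL this skill recommends.
API_URL = "https://api.perplexity.ai/chat/completions"


def build_payload(prompt, model="sonar", max_tokens=512, search_recency_filter=None):
    """Build the JSON body for one chat completion request (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    # Only send the recency filter when the caller asked for one.
    if search_recency_filter is not None:
        payload["search_recency_filter"] = search_recency_filter
    return payload


def call_perplexity(prompt, **options):
    """POST the payload and return the parsed JSON response as a dict."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt, **options)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping payload construction separate from the network call makes the request shape easy to unit test without spending a search query.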

2. Ignoring Citations

Perplexity returns [1], [2] markers in text with a separate citations array. Ignoring them loses the key value prop.


data = response.model_dump()  # or response.json() for raw HTTP
answer = data["choices"][0]["message"]["content"]
citations = data.get("citations", [])  # NOT in choices — top-level field

# BAD: displaying raw markers
print(answer)  # "According to [1], Node.js 22 adds..."

# GOOD: replace markers with links
for i, url in enumerate(citations, 1):
    # Turn each [n] marker into a Markdown link to its source URL
    answer = answer.replace(f"[{i}]", f"[{i}]({url})")
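A slightly more defensive variant of the marker replacement above, sketched with a single regex pass so that markers without a matching entry in the citations array are left untouched. The function name `link_citations` and the Markdown link output format are assumptions chosen for illustration:

```python
import re


def link_citations(answer, citations):
    """Replace [n] markers with Markdown links to the n-th citation URL."""

    def replace(match):
        index = int(match.group(1))
        # Citation markers are 1-based; leave out-of-range markers as-is.
        if 1 <= index <= len(citations):
            return f"[{index}]({citations[index - 1]})"
        return match.group(0)

    return re.sub(r"\[(\d+)\]", replace, answer)
```

For example, `link_citations("See [1] and [3].", ["https://a.example"])` links the first marker and keeps `[3]` as plain text, since only one citation was returned.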

3. Using Wrong SDK Import

There is no @perplexity/sdk or perplexity Python package. Use the standard OpenAI client.


// BAD — this package doesn't exist
import { PerplexityClient } from "@perplexity/sdk";

// GOOD — use OpenAI client with Perplexity base URL
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});

4. Not Setting max_tokens

Without max_tokens, response length is bounded only by the model's output limit, so output costs become unpredictable.


// BAD: no token limit — output cost can spike
await client.chat.completions.create({
  model: "sonar-pro",  // $15/M output tokens!
  messages: [{ role: "user", content: "Tell me about AI" }],
});

// GOOD: always set max_tokens
await client.chat.completions.create({
  model: "sonar-pro",
  messages: [{ role: "user", content: "Tell me about AI" }],
  max_tokens: 1024,
});
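To see what the cap buys, the worst-case output cost of a single request can be estimated from max_tokens and the per-token price. A small sketch, using the $15 per million output tokens figure quoted above for sonar-pro; it deliberately ignores input tokens and any per-request search fee, and the function name is hypothetical:

```python
def max_output_cost_usd(max_tokens, price_per_million_usd):
    """Upper bound on output-token cost for one request, in US dollars."""
    return max_tokens * price_per_million_usd / 1_000_000


# With max_tokens=1024 at $15 per million output tokens,
# one response costs at most about 1.5 cents in output tokens.
capped = max_output_cost_usd(1024, 15.0)
```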

5. No Recency Filter for Time-Sensitive Queries

Without search_recency_filter, Perplexity may cite outdated articles.


# BAD: may return articles from any time period
response = call_perplexity("current Bitcoin price")

# GOOD: constrain to recent results
response = call_perplexity(
    "current Bitcoin price",
    search_recency_filter="day"  # hour | day | week | month
)

6. Sending Full Conversation History

Each message in the conversation may trigger new search queries. Sending 20 turns of history is expensive and slow.


# BAD: 20 turns of history = many search queries
messages = long_history + [{"role": "user", "content": "summarize"}]

# GOOD: summarize context, send focused query
messages = [
    {"role": "system", "content": "Answer based on web search."},
    {"role": "user", "content": f"Context: {summary}\nQuestion: {question}"}
]
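When a running summary is not available, a simpler option is to trim the history before sending it. A minimal sketch, assuming messages are dicts with `role` and `content` keys as in the snippet above; the function name `trim_history`, the default window size, and the choice to start the kept window on a user message are illustrative assumptions:

```python
def trim_history(messages, max_messages=4):
    """Keep system messages plus the most recent non-system messages."""
    system = [m for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]
    recent = dialogue[-max_messages:] if max_messages > 0 else []
    # Drop leading assistant replies so the kept window starts with a user turn.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return system + recent
```

Trimming loses older context entirely, so it suits follow-up questions that depend only on the last exchange; summarizing, as shown above, preserves more at the cost of an extra step.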

7. Using sonar-pro for Simple Queries

sonar-pro costs about 3x more than sonar per input token and 15x more per output token. Using it for simple factual lookups wastes budget.


// BAD: sonar-pro for a trivial question
await client.chat.completions.create({
  model: "sonar-pro",  // $3 input + $15 output per M tokens
  messages: [{ role: "user", content: "What is the capital of France?" }],
});

// GOOD: match model to complexity
const model = isComplexQuery(query) ? "sonar-pro" : "sonar";
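The `isComplexQuery` check above is left undefined. One possible heuristic, sketched in Python under loud assumptions: the keyword list, the word-count threshold, and both function names are hypothetical and would need tuning against real traffic:

```python
# Hypothetical phrases that suggest multi-step research rather than a lookup.
COMPLEX_HINTS = ("compare", "analyze", "trade-off", "step by step", "pros and cons")


def is_complex_query(query, long_query_words=40):
    """Guess whether a query needs the larger model (illustrative heuristic only)."""
    lowered = query.lower()
    is_long = len(lowered.split()) > long_query_words
    return is_long or any(hint in lowered for hint in COMPLEX_HINTS)


def choose_model(query):
    """Route simple lookups to sonar and complex questions to sonar-pro."""
    return "sonar-pro" if is_complex_query(query) else "sonar"
```

A keyword heuristic will misroute some queries; logging the chosen model alongside user feedback is one way to check whether the routing is worth keeping.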

8. Mixing Allowlist and Denylist in Domain Filter

search_domain_filter supports either an allowlist (include) or a denylist (exclude with a - prefix), but not both in the same request.


// BAD: mixing modes
search_domain_filter: ["python.org", "-reddit.com"]  // ERROR

// GOOD: pick one mode
search_domain_filter: ["python.org", "docs.python.org"]  // Allowlist
// OR
search_domain_filter: ["-reddit.com", "-quora.com"]  // Denylist
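The rule above is easy to enforce before a request is sent. A minimal sketch of a client-side check, assuming the `-` prefix convention described in this section; the function name `validate_domain_filter` is hypothetical:

```python
def validate_domain_filter(domains):
    """Return the filter unchanged, or raise if it mixes allowlist and denylist entries."""
    denied = [d for d in domains if d.startswith("-")]
    allowed = [d for d in domains if not d.startswith("-")]
    if denied and allowed:
        raise ValueError(
            f"search_domain_filter mixes allowlist {allowed} with denylist {denied}"
        )
    return list(domains)
```

Failing fast on the client gives a clearer error message than waiting for the API to reject the request.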

9. Not Caching Search Results

Every uncached call performs a web search. At scale, duplicate queries burn budget.


// BAD: same query hits API every time
app.get("/search", async (req, res) => {  // handler must be async to use await
  const result = await client.chat.completions.create({ ... });
  res.json(result);
});

// GOOD: cache by query hash
// Assumes LRUCache comes from the lru-cache package and hash() is any stable string hash
const cache = new LRUCache({ max: 1000, ttl: 3600_000 });  // 1000 entries, 1-hour TTL
app.get("/search", async (req, res) => {
  const key = hash(req.query.q);
  if (cache.has(key)) return res.json(cache.get(key));
  const result = await client.chat.completions.create({ ... });
  cache.set(key, result);
  res.json(result);
});
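The same idea can be sketched in Python with only the standard library. This version keys on a SHA-256 hash of the whitespace- and case-normalized query and expires entries after a TTL. The class name `QueryCache`, the normalization rule, and the injected `fetch` callable are assumptions for illustration; unlike the LRU cache above, it has no size bound:

```python
import hashlib
import time


class QueryCache:
    """In-memory TTL cache keyed by a hash of the normalized query text."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}

    @staticmethod
    def _key(query):
        # Collapse whitespace and case so trivially different queries share an entry.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get_or_fetch(self, query, fetch):
        """Return a fresh cached result, or call fetch(query) and store it."""
        key = self._key(query)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        result = fetch(query)
        self._entries[key] = (now, result)
        return result
```

A one-hour TTL suits reference-style questions; time-sensitive queries such as prices probably want a much shorter TTL or no caching at all.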

10. Wrong Base URL

The API is at api.perplexity.ai, not api.perplexity.com.


// BAD
baseURL: "https://api.perplexity.com"  // Wrong domain

// GOOD
baseURL: "https://api.perplexity.ai"   // Correct

Code Review Checklist

[ ] Uses openai package, not fake @perplexity/sdk
[ ] Base URL is https://api.perplexity.ai
[ ] max_tokens set on every request
[ ] Citations parsed from response.citations array
[ ] search_recency_filter used for time-sensitive queries
[ ] Caching implemented for repeated queries
[ ] Model routing: sonar for simple, sonar-pro for complex
[ ] Conversation history trimmed before sending
[ ] PII sanitized from queries
[ ] Domain filter uses only allowlist OR denylist, not both

Error Handling

Pitfall	Impact	Detection
No caching	3-5x cost overrun	Check cache hit rate metric
Wrong model	Budget waste	Grep for `sonar-pro` in simple query paths
No max_tokens	Unpredictable costs	Grep for `create()` calls without `max_tokens`
PII in queries	Privacy violation	Run sanitization check in CI
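The "No max_tokens" detection in the table can be approximated with a short script instead of a plain grep, since a call usually spans several lines. A rough sketch, assuming calls look like `.create(...)` and that parentheses inside string literals are rare enough to ignore; the function name is hypothetical and the result is a heuristic, not a parser:

```python
import re

CALL_PATTERN = re.compile(r"\.create\(")


def find_calls_missing_max_tokens(source):
    """Return 1-based line numbers of .create( calls whose arguments lack max_tokens."""
    missing = []
    for match in CALL_PATTERN.finditer(source):
        depth = 1
        position = match.end()
        # Walk forward to the parenthesis that closes this call.
        while position < len(source) and depth > 0:
            char = source[position]
            if char == "(":
                depth += 1
            elif char == ")":
                depth -= 1
            position += 1
        call_text = source[match.end():position]
        if "max_tokens" not in call_text:
            missing.append(source.count("\n", 0, match.start()) + 1)
    return missing
```

Running this over each source file in CI and failing on a non-empty result is one way to keep the checklist item enforced.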

Output

Identified anti-patterns in existing code
Applied fixes for each pitfall
Code review checklist for ongoing quality
