langchain-sdk-patterns

Compose LangChain 1.0 Python runnables with the production defaults the docs do not warn about: parallel batching, narrow fallbacks, and brace-safe prompts. Use when building an LCEL chain with RunnableSequence / RunnableParallel, adding resilience via `.with_fallbacks()`, tuning throughput with `.batch()` or `.abatch()`, or wrapping user input in a prompt template. Trigger with "langchain runnable", "with_fallbacks", "langchain batch", "runnable sequence", "lcel", "runnableparallel", "chain composition".

Platforms: claude-code, codex
Tools: 5
Plugin: langchain-py-pack
Category: saas packs

Allowed Tools

Read, Write, Edit, Bash(python:*), Bash(pip:*)

Provided by Plugin

langchain-py-pack

Claude Code skill pack for LangChain 1.0 + LangGraph 1.0 (Python) - 34 skills covering chains, agents, RAG, middleware, checkpointing, HITL, streaming, and production patterns

saas packs v2.0.0

Installation

This skill is included in the langchain-py-pack plugin:

/plugin install langchain-py-pack@claude-code-plugins-plus


Instructions

LangChain SDK Patterns (Python)

Overview

`chain.batch(inputs)` in LangChain 1.0 does not parallelize by default. The `max_concurrency` parameter defaults to 1 in several provider packages (notably older `langchain-openai`), so a call like `chain.batch(inputs_1000)` runs 1,000 sequential round-trips: the same wall-clock time as a for loop, plus the overhead of the batch machinery. Users file "batch is slow" tickets, benchmark it against asyncio, and move to a different framework, when the fix is two lines:


# BAD — silently serializes (P08)
chain.batch(inputs_1000)

# GOOD — 10 in flight at once
chain.batch(inputs_1000, config={"max_concurrency": 10})

Then three more traps wait:

  • P07: `.with_fallbacks([backup])` defaults `exceptions_to_handle=(Exception,)`, and on Python <3.12 that tuple includes KeyboardInterrupt. A Ctrl+C during a long run does not stop the process; it silently hands off to the fallback chain and keeps billing.
  • P57: `ChatPromptTemplate.from_messages(..., template_format="f-string")` (the default) parses every `{` in every string, including user input. A user who pastes `{"error": "..."}` raises `KeyError: 'error'` at invoke time.
  • P53: Pydantic v2 rejects extra fields by default; models cheerfully add `summary` or `confidence` to your `Plan` schema and `with_structured_output` crashes with `ValidationError: extra fields not permitted`.

This skill walks through LCEL composition (RunnableSequence, RunnableParallel, RunnableBranch, RunnablePassthrough, RunnableLambda); the correct `exceptions_to_handle` whitelist per provider; `max_concurrency` tuning with safe ceilings (10 for most providers, 20+ behind a semaphore); and prompt templates that survive untrusted input. Pin: langchain-core 1.0.x, langchain-anthropic 1.0.x, langchain-openai 1.0.x. Pain-catalog anchors: P07, P08, P53, P57.

Prerequisites

  • Python 3.10+ (3.12+ fixes the KeyboardInterrupt half of P07 — upgrade if you can)
  • langchain-core >= 1.0, < 2.0
  • At least one provider: pip install langchain-anthropic langchain-openai
  • pydantic >= 2.0 for schema-aware composition
  • Completed langchain-model-inference — the chat-model factory from that skill is reused here

Instructions

Step 1 — Compose with typed runnables, not lambdas


from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel, RunnablePassthrough

llm = ChatAnthropic(model="claude-sonnet-4-6", timeout=30, max_retries=2)

prompt = ChatPromptTemplate.from_messages(
    [("system", "You are a summarizer."), ("human", "{text}")],
    template_format="jinja2",  # P57 — see Step 4
)

# Sequence: prompt -> llm -> str
chain = prompt | llm | StrOutputParser()

# Parallel: run two sub-chains and merge
enriched = RunnableParallel(
    summary=chain,
    original=RunnablePassthrough(),
)

The `|` operator creates a RunnableSequence. Each step has a declared input and output shape: swap a concrete model for a router and the type contract holds. See Runnable Composition Matrix for when to reach for RunnableSequence vs RunnableParallel vs RunnableBranch vs RunnableLambda, with input/output shape conventions for each.

Step 2 — Add fallbacks with a narrow exception whitelist


from anthropic import APIError, APITimeoutError, RateLimitError
from langchain_openai import ChatOpenAI

backup = ChatOpenAI(model="gpt-4o", timeout=30, max_retries=2)
backup_chain = prompt | backup | StrOutputParser()

# GOOD — only retry on transient provider errors
resilient = chain.with_fallbacks(
    [backup_chain],
    exceptions_to_handle=(RateLimitError, APIError, APITimeoutError),
)

# BAD — default `(Exception,)` catches KeyboardInterrupt on Python <3.12 (P07)
# resilient_bad = chain.with_fallbacks([backup_chain])

The default `exceptions_to_handle=(Exception,)` on Python <3.12 inherits KeyboardInterrupt and SystemExit into the caught set, which means a Ctrl+C during a long `.batch()` run falls through to the backup instead of stopping. Python 3.12+ moved these under BaseException directly, which fixes the inheritance path, but the default is still too broad: a Pydantic ValidationError or a ToolException will trigger a pointless backup call. See Fallback Exception List for the curated whitelist per provider with concrete imports.

Step 3 — Batch with explicit concurrency


import asyncio

inputs = [{"text": doc} for doc in documents]

# Synchronous batch — blocks until done
results = chain.batch(inputs, config={"max_concurrency": 10})

# Async batch — non-blocking
results = await chain.abatch(inputs, config={"max_concurrency": 10})

Safe ceilings: 10 for Anthropic and OpenAI at default tier; 20+ only behind an `asyncio.Semaphore` if you are also tracking rate-limit headers. Claude TPM/RPM limits vary by tier; OpenAI's TPD (tokens per day) is the binding limit at scale. See Batch Concurrency Tuning for per-provider ceilings and the semaphore pattern.

invoke vs batch vs stream — when each is correct:

| Method | Input shape | Concurrency | Error behavior | When to use |
| --- | --- | --- | --- | --- |
| `.invoke(x)` | Single | 1 | Raises on failure | One-shot call, interactive, tests |
| `.batch(xs, config={"max_concurrency": N})` | List | N parallel | Raises on first failure unless `return_exceptions=True` | Bulk sync workloads, ETL, eval harnesses |
| `.abatch(xs, config={"max_concurrency": N})` | List | N parallel (async) | Same as `.batch` | Event loops, async web servers, LangGraph nodes |
| `.stream(x)` | Single | 1, chunked | Raises on failure | Interactive UI, live token display |
| `.astream(x)` / `.astream_events(x, version="v2")` | Single | 1, chunked (async) | Raises on failure | Async UIs, event-driven pipelines, token metering (see langchain-model-inference) |

Pass `return_exceptions=True` as a keyword argument to `.batch()` / `.abatch()` (it is a direct parameter, not a config key) to keep a batch from aborting on the first failure; exceptions come back in the result list instead of raising.
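The result list preserves input order, so failed items can be retried by index. A stdlib-only sketch; the `results` list is hand-built here to stand in for the return value of `chain.batch(inputs, config={"max_concurrency": 10}, return_exceptions=True)`:

```python
# A fault-tolerant batch interleaves outputs with exception objects,
# keeping input order. Hand-built here for illustration.
results = ["summary A", RuntimeError("rate limited"), "summary C"]

successes = [r for r in results if not isinstance(r, Exception)]
failed = [(i, r) for i, r in enumerate(results) if isinstance(r, Exception)]

print(successes)    # ['summary A', 'summary C']
print(len(failed))  # 1; only the input at index 1 needs a retry
```

The index list is what makes targeted retries cheap: re-batch only the failed inputs instead of re-running the whole job.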

Step 4 — Escape prompt templates for untrusted input


from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# BAD — default f-string format crashes on literal `{` in user input (P57)
bad = ChatPromptTemplate.from_messages(
    [("system", "Reply in JSON"), ("human", "{user_text}")]
)
bad.invoke({"user_text": '{"error": "oops"}'})  # KeyError: 'error'

# GOOD — jinja2 treats `{...}` as literal, uses `{{ var }}` for substitution
good = ChatPromptTemplate.from_messages(
    [("system", "Reply in JSON"), ("human", "{{ user_text }}")],
    template_format="jinja2",
)
good.invoke({"user_text": '{"error": "oops"}'})  # OK

# MIXED — message history is a list, use MessagesPlaceholder
with_history = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder("history"),
    ("human", "{{ question }}"),
], template_format="jinja2")

Rule of thumb: if any variable can contain user-provided free text (a paste, a transcript, a code block), use `template_format="jinja2"`. The f-string format is fine for trusted template authors composing fixed instructions, but it is the wrong tool for user input. See Prompt Template Escaping for the full brace-escaping rules and a MessagesPlaceholder reference.

Step 5 — Validate structured output with extra="ignore"


from pydantic import BaseModel, ConfigDict, Field

class Plan(BaseModel):
    # P53 — without this, the chain crashes when the model adds extra fields
    model_config = ConfigDict(extra="ignore")
    steps: list[str] = Field(default_factory=list)
    estimated_minutes: int

structured_chain = prompt | llm.with_structured_output(Plan, method="json_schema")

Pydantic v2 rejects unknown fields by default. Models trained on "be helpful" add `summary`, `confidence`, `rationale`; the schema crashes instead of dropping them. `extra="ignore"` is the right default for model outputs.
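The behavior can be verified without a model call. A standalone snippet validating a payload that carries an uninvited field (the payload itself is made up for illustration):

```python
from pydantic import BaseModel, ConfigDict, Field

class Plan(BaseModel):
    # Drop unknown keys instead of raising ValidationError (P53).
    model_config = ConfigDict(extra="ignore")
    steps: list[str] = Field(default_factory=list)
    estimated_minutes: int

# A response with an extra `confidence` field still validates; the
# unknown key is silently discarded rather than crashing the chain.
plan = Plan.model_validate(
    {"steps": ["draft"], "estimated_minutes": 5, "confidence": 0.9}
)
print(plan.estimated_minutes)       # 5
print(hasattr(plan, "confidence"))  # False
```

Switching to `extra="forbid"` (or leaving the v2 default) turns the same payload into a ValidationError, which is exactly the P53 crash described above.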

Output

  • RunnableSequence / RunnableParallel composition with declared input/output shapes
  • .with_fallbacks(exceptions_to_handle=(...)) with a narrow, provider-specific whitelist
  • .batch() / .abatch() with explicit max_concurrency (10 default, 20+ behind semaphore)
  • ChatPromptTemplate.from_messages(..., template_format="jinja2") for any template touching user input
  • Pydantic schemas with ConfigDict(extra="ignore") for structured output
  • A clear invoke / batch / abatch / stream / astream decision matrix for each chain stage

Error Handling

| Error | Cause | Fix |
| --- | --- | --- |
| Ctrl+C does not stop a long `.batch()`; fallback keeps running | `exceptions_to_handle=(Exception,)` swallows KeyboardInterrupt on Python <3.12 (P07) | Pass a narrow tuple: `exceptions_to_handle=(RateLimitError, APIError, APITimeoutError)` |
| `.batch(inputs)` takes same time as sequential loop | `max_concurrency` defaults to 1 (P08) | `config={"max_concurrency": 10}`; raise to 20+ only with a semaphore |
| `KeyError: ''` when invoking a ChatPromptTemplate | f-string parser reads user input's `{` as a variable (P57) | `template_format="jinja2"`; escape literals as `{{`/`}}` in f-string mode |
| `ValidationError: extra fields not permitted` on structured output | Pydantic v2 strict-by-default (P53) | `model_config = ConfigDict(extra="ignore")` on the schema |
| ValidationError caught by fallback and treated as transient | Fallback whitelist too broad | Remove ValidationError from `exceptions_to_handle` so it surfaces |
| `.batch` aborts on the first failure, losing all results | Default raises on first error | Pass `return_exceptions=True` alongside `config={"max_concurrency": 10}` and filter |
| Fallback chain never fires even on genuine RateLimitError | Provider's own `max_retries` consumes the error first | Lower `max_retries=0` on the primary when a fallback chain is the retry strategy |

Examples

Fan-out enrichment with RunnableParallel

A common pattern — given a document, produce a summary, extracted entities,

and sentiment in parallel. RunnableParallel runs sub-chains concurrently and

merges results into a dict. Combined with .batch() at the outer level, you

get N documents times 3 sub-chains in flight up to max_concurrency.

See Runnable Composition Matrix

for the fan-out/fan-in pattern and the input/output shape of each runnable type.
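The fan-out/fan-in shape itself can be sketched with the stdlib alone. The three coroutines below are made-up stubs; a real version would await `sub_chain.ainvoke(doc)` for each key:

```python
import asyncio

# Stdlib-only sketch of the dict-merging fan-out RunnableParallel gives you.
async def summarize(doc: str) -> str:
    return f"summary: {doc[:12]}..."

async def entities(doc: str) -> list[str]:
    return [w for w in doc.split() if w.istitle()]

async def sentiment(doc: str) -> str:
    return "positive" if "great" in doc else "neutral"

async def enrich(doc: str) -> dict:
    # Run all three concurrently, then merge into one dict keyed like
    # RunnableParallel(summary=..., entities=..., sentiment=...).
    s, e, m = await asyncio.gather(summarize(doc), entities(doc), sentiment(doc))
    return {"summary": s, "entities": e, "sentiment": m}

result = asyncio.run(enrich("Alice shipped a great release"))
print(result["entities"])   # ['Alice']
print(result["sentiment"])  # positive
```

RunnableParallel does this merging for you and additionally respects the chain-level `max_concurrency` budget when nested under `.batch()`.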

Resilient chain with per-provider fallback

Primary: Claude Sonnet 4.6. Fallback: GPT-4o. Catch only RateLimitError, APIError, and APITimeoutError from each SDK; let AuthenticationError and ValidationError crash the process so they get debugged, not masked. See Fallback Exception List for the concrete imports per provider and a note on why BadRequestError should not be in the whitelist.
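The whitelist logic reduces to a try/except over a tuple. A stdlib sketch; the two exception classes are stand-ins for the provider SDKs' (e.g. `anthropic.RateLimitError`, `pydantic.ValidationError`):

```python
# Stand-in exception classes for illustration only.
class RateLimitError(Exception): ...
class ValidationError(Exception): ...

TRANSIENT = (RateLimitError,)  # the curated whitelist

def call_with_fallback(primary, backup):
    try:
        return primary()
    except TRANSIENT:  # only transient provider errors hand off
        return backup()

def rate_limited():
    raise RateLimitError("429")

def schema_bug():
    raise ValidationError("extra fields not permitted")

# Transient error: the backup answers.
print(call_with_fallback(rate_limited, lambda: "backup answered"))

# Programming error: surfaces for debugging instead of burning a backup call.
try:
    call_with_fallback(schema_bug, lambda: "backup answered")
except ValidationError:
    print("surfaced for debugging")
```

Widening `TRANSIENT` to `(Exception,)` would make the second case silently answer from the backup, which is exactly the masking behavior P07 warns about.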

High-throughput batch with semaphore-bounded concurrency

At N >= 20 concurrent in-flight calls, provider rate-limit headers become the bottleneck. Wrap `.abatch()` in an `asyncio.Semaphore` and honor the retry-after header on 429 responses. See Batch Concurrency Tuning for the semaphore pattern and a table of provider TPM/RPM limits per tier.
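A minimal stdlib sketch of the semaphore bound; the sleep stands in for `chain.ainvoke(item)`, and real 429 handling with retry-after backoff is omitted:

```python
import asyncio

async def bounded_call(sem: asyncio.Semaphore, item: str) -> str:
    # At most the semaphore's limit of these run at once; the sleep
    # stands in for an actual provider call.
    async with sem:
        await asyncio.sleep(0.01)
        return item.upper()

async def run_bounded(items: list[str], limit: int = 20) -> list[str]:
    sem = asyncio.Semaphore(limit)
    # gather preserves input order even though completion order varies.
    return list(await asyncio.gather(*(bounded_call(sem, it) for it in items)))

results = asyncio.run(run_bounded(["a", "b", "c"], limit=2))
print(results)  # ['A', 'B', 'C']
```

The same shape works with `.abatch()` chunks instead of single calls: split the input list, guard each chunk's `abatch` with the semaphore, and concatenate.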

Prompt template over user-pasted JSON payload

Support ticket triage where users paste arbitrary JSON from their app's error log. Without `template_format="jinja2"`, every single ticket with a JSON body crashes the chain at template-render time. See Prompt Template Escaping for the worked example and the MessagesPlaceholder pattern for chat history.

Resources
