langchain-content-blocks
Works correctly with LangChain 1.0's typed content blocks on AIMessage.content — text, tool_use, image, thinking, document — across Claude, GPT-4o, and Gemini, including multi-modal composition and tool-call iteration. Use when composing multi-modal messages, iterating tool_use blocks, handling Claude's thinking content, or unifying image inputs across providers. Trigger with "langchain content blocks", "AIMessage.content", "tool_use block", "claude image input", "langchain multimodal", "thinking block replay", "claude citations".
Allowed Tools
Provided by Plugin
langchain-py-pack
Claude Code skill pack for LangChain 1.0 + LangGraph 1.0 (Python) - 34 skills covering chains, agents, RAG, middleware, checkpointing, HITL, streaming, and production patterns
Installation
This skill is included in the langchain-py-pack plugin:
/plugin install langchain-py-pack@claude-code-plugins-plus
Instructions
LangChain Content Blocks (Python)
Overview
On Claude, `AIMessage.content` is `list[dict]` even for pure text — so any
code from an OpenAI-first tutorial that calls `message.content.lower()` or
`message.content.split()` crashes with `AttributeError: 'list' object has
no attribute 'lower'` on the first production Claude call (P02).
Multi-modal code that works on GPT-4o breaks on Claude because pre-1.0
image-block shapes differed across providers (P64). Multi-turn Claude
replay with extended thinking fails with
`anthropic.BadRequestError: ... missing signature` when prior `thinking`
blocks are stripped. Forced `tool_choice` prevents
`stop_reason="end_turn"` and loops forever (P63).
This is the deep-dive companion to langchain-model-inference. That
skill's `references/content-blocks.md` covers the `str` vs `list[dict]`
divergence and a safe text extractor. This skill goes further:

- `tool_use` block iteration mechanics — IDs, args as dict vs JSON string, streaming deltas
- `thinking` blocks — signature, redaction, multi-turn replay semantics
- `document` blocks — Claude citations API, source types, citation extraction
- Multi-modal composition — universal 1.0 `image` shape, per-provider adapter behavior
- Per-provider size limits (Anthropic 5 MB/image up to 20 images, OpenAI 20 MB/image, Gemini 20 MB/request)
Pin: `langchain-core` 1.0.x, `langchain-anthropic >= 1.0`,
`langchain-openai >= 1.0`, `anthropic >= 0.40`. Pain-catalog anchors:
P02, P58, P63, P64.
Prerequisites
- Python 3.10+
- `langchain-core >= 1.0, < 2.0`
- At least one provider package: `pip install langchain-anthropic langchain-openai`
- For extended thinking: `langchain-anthropic >= 1.0` and Claude Sonnet 4+ / Opus 4+
- For citations: `anthropic >= 0.40` and Claude Sonnet 4+
- Familiarity with langchain-model-inference (read its `references/content-blocks.md` first)
Instructions
Step 1 — Learn the block-type taxonomy
LangChain 1.0 defines six typed content blocks on AIMessage.content
(and on chunks during streaming):
| Block type | Produced by | Notes |
|---|---|---|
| `text` | All providers | On Claude, always wrapped as `[{"type":"text","text":"..."}]` |
| `tool_use` | Claude, GPT-4o, Gemini | Always round-trip via `msg.tool_calls`, not hand-parsed |
| `tool_result` | You (via `ToolMessage`) | One per `tool_use`; `tool_call_id` must match byte-for-byte |
| `image` | Claude vision, GPT-4o, Gemini | Universal 1.0 shape; adapter handles wire format per provider |
| `thinking` | Claude extended thinking only | Must preserve `signature` for replay |
| `document` | Claude citations API (Sonnet 4+) | Input-side only; citations attach to output `text` blocks |
See Block-Type Matrix for the full table
with streaming behavior and per-type gotchas.
Step 2 — Iterate mixed content safely
For most code, use the helpers:
```python
text = msg.text()            # concatenated text across all text blocks
tool_calls = msg.tool_calls  # normalized list[ToolCall]
usage = msg.usage_metadata   # input_tokens, output_tokens, cache_*
```
Hand-roll block iteration only when you need to (a) preserve order,
(b) extract thinking blocks for replay, or (c) read citations
metadata from text blocks. Order-preserving iteration:
```python
from langchain_core.messages import AIMessage

def iter_blocks(msg: AIMessage):
    if isinstance(msg.content, str):
        yield "text", {"type": "text", "text": msg.content}
        return
    for block in msg.content:
        if isinstance(block, dict):
            yield block.get("type", "unknown"), block
        else:
            yield getattr(block, "type", "unknown"), block
```
Step 3 — Compose multi-modal messages with the universal image block
```python
import base64
from pathlib import Path

from langchain_core.messages import HumanMessage

def image_block(path: str) -> dict:
    data = base64.standard_b64encode(Path(path).read_bytes()).decode("ascii")
    mime = {"png": "image/png", "jpg": "image/jpeg",
            "jpeg": "image/jpeg", "webp": "image/webp"}[
        Path(path).suffix.lstrip(".").lower()]
    return {
        "type": "image",
        "source_type": "base64",  # or "url"
        "data": data,
        "mime_type": mime,
    }

msg = HumanMessage(content=[
    image_block("screenshot.png"),                     # put image FIRST
    {"type": "text", "text": "What is broken here?"},  # instruction LAST
])
response = claude.invoke([msg])
```
Three invariants:
1. `content` must be `list[dict]` when including non-text blocks.
2. Put the image before the instruction — Claude attends most to trailing tokens.
3. Respect provider limits (Anthropic: 5 MB/image, up to 20 images; OpenAI: 20 MB/image; Gemini: 20 MB/request total).
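The size invariant can be checked before sending. A minimal pre-flight sketch, stdlib only — the per-image byte limits below are illustrative constants derived from the numbers quoted above, not values read from any provider SDK:

```python
from pathlib import Path

# Illustrative per-image limits in bytes, from the figures quoted above
LIMITS = {"anthropic": 5 * 1024 * 1024, "openai": 20 * 1024 * 1024}

def check_image(path: str, provider: str = "anthropic") -> int:
    """Return the raw byte size, raising if it exceeds the provider limit."""
    size = Path(path).stat().st_size
    if size > LIMITS[provider]:
        raise ValueError(f"{path}: {size} bytes exceeds {provider} limit")
    return size
```

Note this checks raw bytes; base64 encoding inflates the payload by roughly a third, so staying well under the limit is safer.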
LangChain's adapter translates the universal shape to each provider's
wire format. See Multi-Modal Composition
for the full adapter table, MIME-type compatibility, and the
document/citations pattern.
Step 4 — Iterate tool_use correctly across stream deltas
Canonical non-streaming:
```python
for tc in msg.tool_calls:
    output = tools[tc["name"]](**tc["args"])
    history.append(ToolMessage(content=str(output), tool_call_id=tc["id"]))
```
`tc["args"]` is already a parsed dict — do not `json.loads` it.
`tc["id"]` is provider-shaped (`toolu_...` on Anthropic, `call_...` on
OpenAI, 24+ chars) and must be copied verbatim to the `ToolMessage`.
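Combining the rules — one result per call, id copied verbatim, `status="error"` instead of skipping — an error-tolerant execution loop looks like this. It is a standalone sketch using plain dicts in place of `ToolMessage` objects; with LangChain you would build `ToolMessage(content=..., tool_call_id=tc["id"], status="error")` the same way:

```python
def run_tool_calls(tool_calls: list[dict], tools: dict) -> list[dict]:
    """Emit exactly one result per call, echoing each id verbatim."""
    results = []
    for tc in tool_calls:
        try:
            out = tools[tc["name"]](**tc["args"])  # args is already a dict
            results.append({"tool_call_id": tc["id"],
                            "content": str(out), "status": "success"})
        except Exception as exc:
            # Still answer the call — unanswered tool_use ids are rejected
            results.append({"tool_call_id": tc["id"],
                            "content": f"error: {exc}", "status": "error"})
    return results
```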
Streaming is different. `tool_use.input` arrives as partial JSON
fragments across `on_chat_model_stream` events. Buffer with
`tool_call_chunks`, parse once at `on_chat_model_end`:
```python
from collections import defaultdict
import json

partial = defaultdict(str)  # index -> accumulated JSON fragment
meta = {}                   # index -> {name, id}

async for event in model.astream_events({"messages": [...]}, version="v2"):
    if event["event"] != "on_chat_model_stream":
        continue
    for tc_chunk in getattr(event["data"]["chunk"], "tool_call_chunks", []) or []:
        idx = tc_chunk["index"]
        if tc_chunk.get("name"):
            meta[idx] = {"name": tc_chunk["name"], "id": tc_chunk["id"]}
        if tc_chunk.get("args"):
            partial[idx] += tc_chunk["args"]

# `or "{}"` guards against no-arg tools, which stream no fragments
completed = [{**meta[i], "args": json.loads(partial[i] or "{}")} for i in meta]
```
See Tool-Use Iteration for multi-tool-per-turn handling, `ToolMessage`
ordering, and the forced-`tool_choice` infinite-loop trap (P63).
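The buffering logic can be exercised without a live model by replaying synthetic chunks. The chunk dicts below mimic the `tool_call_chunks` shape (`index`, `name`, `id`, `args`); the helper name `accumulate` is introduced here for illustration:

```python
import json
from collections import defaultdict

def accumulate(chunks: list[dict]) -> list[dict]:
    """Buffer partial JSON args per tool-call index; parse once at the end."""
    partial = defaultdict(str)
    meta = {}
    for ck in chunks:
        idx = ck["index"]
        if ck.get("name"):
            meta[idx] = {"name": ck["name"], "id": ck["id"]}
        if ck.get("args"):
            partial[idx] += ck["args"]
    # json.loads only after all fragments have arrived
    return [{**meta[i], "args": json.loads(partial[i] or "{}")} for i in meta]

# Two fragments that only form valid JSON once concatenated
chunks = [
    {"index": 0, "name": "search", "id": "toolu_01", "args": '{"que'},
    {"index": 0, "name": None, "id": None, "args": 'ry": "weather"}'},
]
```

Parsing either fragment alone raises `json.JSONDecodeError` — exactly the failure mode the buffer-then-parse pattern avoids.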
Step 5 — Preserve Claude thinking blocks for replay
Claude extended thinking (Sonnet 4+, Opus 4+) returns thinking blocks
carrying a cryptographic signature. The next turn must round-trip
those blocks intact or Anthropic rejects the request:
`anthropic.BadRequestError: messages.1.content.0: missing signature`
The foot-gun: `msg.text()` strips thinking blocks. Never do:

```python
# WRONG — thinking blocks lost, replay fails
history.append(AIMessage(content=ai_1.text()))
```

Correct — pass the `AIMessage` back verbatim:

```python
history.append(ai_1)  # preserves full content list + signatures
```
For persistence across sessions, serialize with
`messages_to_dict(...)` (not custom JSON), which preserves block
structure:
```python
import json
from langchain_core.messages import messages_to_dict, messages_from_dict

serialized = json.dumps(messages_to_dict([ai_1]))
restored = messages_from_dict(json.loads(serialized))
```
See Thinking Blocks for redaction
handling, the budget-tokens rule, and the interaction with tool calls.
Step 6 — Provider-adapter checklist
Before sending any multi-modal or tool-using message:
- Is `content` a `list[dict]` when it contains non-text blocks?
- Are image blocks in the universal 1.0 shape (`source_type`, `data`, `mime_type`)?
- Is each image under the target provider's limit? (5 MB / 20 MB / 20 MB total.)
- If `tool_use` is involved, am I passing `msg.tool_calls` — not parsed `content`?
- If extended thinking is on, am I returning the full `AIMessage` — not `msg.text()`?
- Is the system message at position 0 (P58) — not reordered by middleware?
Output
- Block-type matrix applied to a specific response (which types present, which helper used)
- Safe iteration that preserves order, citations, and thinking signatures
- Multi-modal `HumanMessage` in the universal 1.0 `image` shape, portable across Claude/GPT-4o/Gemini
- `tool_use` stream-delta accumulator that buffers partial `input` JSON and parses once at end
- Multi-turn Claude replay that keeps `thinking` blocks intact (no `missing signature` errors)
- `document`/citations extractor that reads `citations` metadata from `text` blocks
Error Handling
| Error | Cause | Fix |
|---|---|---|
| `AttributeError: 'list' object has no attribute 'lower'` | Treating `AIMessage.content` as `str` on Claude (P02) | Use `msg.text()` or iterate blocks |
| `anthropic.BadRequestError: messages.N.content.M: missing signature` | Stripped `thinking` block on replay | Pass the `AIMessage` object back verbatim; never rebuild from `text()` |
| `anthropic.BadRequestError: tool_use_id not found in corresponding tool_result` | Typo / case mismatch in `ToolMessage.tool_call_id` | Copy `tc["id"]` verbatim |
| `anthropic.BadRequestError: tool_use ids were found without tool_result blocks` | Skipped a tool call | Emit one `ToolMessage` per tool call (use `status="error"` on failure) |
| `anthropic.BadRequestError: image exceeds 5 MB limit` | Un-resized screenshot | Pre-resize to < 5 MB (a 1024x1024 JPEG at quality 85 is ~500 KB) |
| `openai.BadRequestError: Invalid image data` | Hand-rolled `image_url` with wrong prefix | Use the universal block; the adapter emits the `data:image/...;base64,` prefix |
| Infinite agent loop | Forced `tool_choice` inside a loop (P63) | Use `tool_choice="auto"` for agents; forced-choice only for single-call extraction |
| `json.JSONDecodeError` inside stream loop | Parsing a partial `tool_use.input` fragment | Buffer in a `defaultdict(str)`; parse once at `on_chat_model_end` |
| Citations silently missing | Reading via `msg.text()`, which strips metadata | Iterate `msg.content` and read `block["citations"]` on `text` blocks |
Examples
Single-shot multi-modal on Claude + GPT-4o with one message object
```python
msg = HumanMessage(content=[
    image_block("ui.png"),
    {"type": "text", "text": "Identify the broken UI element."},
])

# Same message works on both providers via adapter translation
claude_resp = claude.invoke([msg])
gpt4o_resp = gpt4o.invoke([msg])
```
Multi-turn Claude replay with extended thinking
```python
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage

claude = ChatAnthropic(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    thinking={"type": "enabled", "budget_tokens": 4096},
)

ai_1 = claude.invoke([HumanMessage(content="What is the capital of France?")])
# ai_1.content == [{"type":"thinking",...,"signature":"..."}, {"type":"text",...}]

# Turn 2 — pass ai_1 VERBATIM
ai_2 = claude.invoke([
    HumanMessage(content="What is the capital of France?"),
    ai_1,  # thinking preserved
    HumanMessage(content="And the population?"),
])
```
See Thinking Blocks for the full replay
invariants and persistence pattern.
Extracting Claude citations from document input
```python
doc_block = {
    "type": "document",
    "source": {"type": "base64", "media_type": "application/pdf", "data": pdf_b64},
    "title": "Q3 Earnings Report",
    "citations": {"enabled": True},
}

resp = claude.invoke([HumanMessage(content=[
    doc_block,
    {"type": "text", "text": "What drove revenue this quarter?"},
])])

for block in resp.content:
    if block.get("type") != "text":
        continue
    print(block["text"])
    for c in block.get("citations", []):
        print(f"  -> {c['document_title']}: {c['cited_text']!r}")
```
`msg.text()` flattens this — you lose citations. See
Multi-Modal Composition for the
full document block reference including supported source types.
Streaming tool_use with live argument rendering
See Tool-Use Iteration for the
complete `tool_call_chunks` accumulator including multi-tool-per-turn
handling and the `ToolMessage` ordering invariant.
Resources
- LangChain messages concept
- LangChain multimodality
- `AIMessage` API reference
- Anthropic content blocks / messages API
- Anthropic extended thinking
- Anthropic citations
- OpenAI vision
- Gemini multimodal
- Companion skill: langchain-model-inference (read its `references/content-blocks.md` for the `str` vs `list[dict]` fundamentals)
- Pack pain catalog: `docs/pain-catalog.md` (entries P02, P58, P63, P64)