langchain-langgraph-human-in-loop

"Build LangGraph 1.0 human-in-the-loop approval flows with `interrupt_before`\

v2.5.0

Jeremy Longshore

MIT

Allowed Tools

Read, Write, Edit, Bash(python:*)

Provided by Plugin

langchain-py-pack

Claude Code skill pack for LangChain 1.0 + LangGraph 1.0 (Python) - 34 skills covering chains, agents, RAG, middleware, checkpointing, HITL, streaming, and production patterns

saas packs v2.5.0

Installation

This skill is included in the langchain-py-pack plugin:

/plugin install langchain-py-pack@claude-code-plugins-plus

Instructions

LangChain LangGraph Human-in-the-Loop (Python)

Overview

A team adds interrupt_before=["send_email"] to require human approval before the email goes out. The first integration test crashes at the interrupt boundary with:


TypeError: Object of type datetime is not JSON serializable

The culprit is two nodes upstream: a classify node stashed "received_at": datetime.utcnow() into state. Every node-level unit test passed because calling a node function directly never serializes state; only the checkpointer does, when it persists the state of the compiled graph. The failure stays invisible until the graph runs with persistence and pauses at the interrupt (P17).

A week later the resume path ships. The human reviews the draft, clicks "approve with edits," and the backend runs:


graph.invoke(Command(update={"messages": [corrected_msg]}, resume="approved"), config)

The prior 47 messages vanish. messages was typed as plain list[AnyMessage] with no reducer, so update replaces the field instead of appending (P18).

This skill covers: three interrupt styles (interrupt_before, interrupt_after, inline interrupt()), the JSON-only state invariant with a pre-interrupt scanner, the Command(resume=...) / Command(update=..., resume=...) contract, an approval UI wire format (GET pending / POST decision with optimistic concurrency), safe-cancellation routing to END, and the tradeoff between native interrupts and a separate approval service. Pin: langgraph 1.0.x, langgraph-checkpoint 2.0.x.

Pain-catalog anchors: P17, P18 (adjacent: P16, P20).

Prerequisites

Python 3.10+
langgraph >= 1.0, < 2.0
A checkpointer: MemorySaver (dev), PostgresSaver (prod), or SqliteSaver (single-box)
A thread_id contract at the app boundary (see langchain-langgraph-checkpointing)
Familiarity with langchain-langgraph-basics — nodes, edges, TypedDict state with reducers

Instructions

Step 1 — Choose the interrupt style

LangGraph 1.0 exposes three interrupt mechanisms. They are not interchangeable.

Style	Syntax	Use when
`interrupt_before=[node]`	`compile(interrupt_before=["send_email"])`	Review inputs before an irreversible tool. Graph pauses before node runs. State shown is the input.
`interrupt_after=[node]`	`compile(interrupt_after=["draft_email"])`	Review output of a node (e.g., an LLM draft). Graph pauses after node completes.
Inline `interrupt()`	Inside a node: `decision = interrupt({"kind": "..."})`	Structured prompt mid-node with custom payload. Most flexible; lives in node code.

Rule of thumb: prefer interrupt_before for hard gates (tool must not run without approval). Use interrupt_after for review loops (draft → approve → send). Use inline interrupt() when the prompt varies on intermediate computation.

Typical interrupt round-trip latency in production is 50-300 ms from pause to checkpoint write (local Postgres) plus UI time; budget 1-5 s total for a Slack-based approval. Checkpoint row sizes average 2-20 KB on small graphs and cap at ~1 MB on PostgresSaver before historical checkpoints need pruning.

See Interrupt Decision Tree for full criteria, multiple-interrupt-per-graph patterns, and the interrupt-vs-tool comparison.

Step 2 — Enforce the JSON-serializable state invariant (P17)

Checkpointers serialize state to JSON on every superstep. Any non-JSON type raises TypeError at the interrupt boundary, not at the offending node.

Canonical offenders:

Type	Fix
`datetime` / `date`	`dt.isoformat()` — ISO 8601 string
`bytes`	`base64.b64encode(b).decode()`
`set`	`sorted(s)`
Pydantic `BaseModel` with non-primitive fields	`.model_dump(mode="json")`
Custom classes	`dataclasses.asdict(obj)` or `vars(obj)`
`numpy.ndarray`	`.tolist()`
`decimal.Decimal`	`str(d)` or `float(d)` (lossy)
`float("nan")` / `float("inf")`	`None` (JSON forbids them; some savers crash on `allow_nan=False`)
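
The fixes in the table can be collected into one conversion helper that nodes call before returning an update. The following is a minimal sketch using only the standard library; `to_json_safe` and the sample `raw_update` are hypothetical names for illustration, not part of LangGraph, and Pydantic and NumPy values are left out so the block runs without extra dependencies.

```python
import base64
import dataclasses
import json
import math
from datetime import date, datetime, timezone
from decimal import Decimal
from typing import Any


def to_json_safe(value: Any) -> Any:
    """Recursively convert common offenders into JSON-native values."""
    if value is None or isinstance(value, (bool, int, str)):
        return value
    if isinstance(value, float):
        # JSON has no NaN/Infinity literal, so map them to None.
        return value if math.isfinite(value) else None
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, bytes):
        return base64.b64encode(value).decode("ascii")
    if isinstance(value, Decimal):
        return str(value)  # str() keeps full precision; float() would not
    if isinstance(value, (set, frozenset)):
        return sorted(to_json_safe(v) for v in value)
    if dataclasses.is_dataclass(value) and not isinstance(value, type):
        return to_json_safe(dataclasses.asdict(value))
    if isinstance(value, dict):
        return {str(k): to_json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_safe(v) for v in value]
    raise TypeError(f"no JSON conversion registered for {type(value).__name__}")


# Hypothetical node output containing one offender of each kind.
raw_update = {
    "received_at": datetime(2026, 4, 21, 15, 32, 11, tzinfo=timezone.utc),
    "attachment": b"hi",
    "tags": {"urgent", "billing"},
    "amount": Decimal("450.00"),
    "score": float("nan"),
}
safe_update = to_json_safe(raw_update)
json.dumps(safe_update, allow_nan=False)  # raises if anything slipped through
```

Converting at the node boundary keeps the rest of the graph unaware of the original Python types; any node that needs a `datetime` back can parse the ISO string with `datetime.fromisoformat`.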

Ship a pre-interrupt scanner in dev and CI:


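import math
from typing import Any

class NonSerializableStateError(TypeError):
    """Raised when state contains values the checkpointer cannot serialize."""

def assert_state_is_json_serializable(state: dict[str, Any], *, path: str = "state") -> None:
    """Walk state depth-first and raise a typed error naming the offending key path."""
    _walk(state, path)

def _walk(v: Any, path: str) -> None:
    # JSON-native scalars pass through untouched.
    if v is None or isinstance(v, (bool, int, str)):
        return
    if isinstance(v, float):
        # NaN and +/-inf are floats but not valid JSON (see the table above).
        if not math.isfinite(v):
            raise NonSerializableStateError(f"{path} is {v!r}, which JSON forbids. Use None.")
        return
    if isinstance(v, list):
        for i, item in enumerate(v):
            _walk(item, f"{path}[{i}]")
        return
    if isinstance(v, dict):
        for k, val in v.items():
            # JSON object keys must be strings.
            if not isinstance(k, str):
                raise NonSerializableStateError(f"{path} has non-string key {k!r}.")
            _walk(val, f"{path}.{k}")
        return
    raise NonSerializableStateError(
        f"{path} is {type(v).__name__}, not JSON-serializable. "
        f"Convert at node boundary."
    )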

Call assert_state_is_json_serializable(state) at the end of every node preceding an interrupt-flagged node, or attach it as LangGraph middleware. In CI, run the full graph to the interrupt against a fixture that exercises every branch; that is the only way to catch P17 before prod.
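
One way to run the check at every node exit is a decorator around the node function. The sketch below is an assumption about how such a wrapper could look, not a LangGraph API: `json_checked`, `classify_bad`, and `classify_good` are hypothetical names, and it uses `json.dumps(..., allow_nan=False)` as a simpler stand-in for the scanner above (it is looser: tuples and non-string keys pass).

```python
import functools
import json
from datetime import datetime, timezone
from typing import Any, Callable

NodeFn = Callable[[dict[str, Any]], dict[str, Any]]


def json_checked(node: NodeFn) -> NodeFn:
    """Wrap a node so a non-JSON update fails inside the node that produced it."""

    @functools.wraps(node)
    def wrapper(state: dict[str, Any]) -> dict[str, Any]:
        update = node(state)
        try:
            # allow_nan=False also rejects NaN and infinity with ValueError.
            json.dumps(update, allow_nan=False)
        except (TypeError, ValueError) as exc:
            raise TypeError(
                f"node {node.__name__!r} returned a non-JSON update: {exc}"
            ) from exc
        return update

    return wrapper


@json_checked
def classify_bad(state: dict[str, Any]) -> dict[str, Any]:
    # Stores a datetime object: fails the check as soon as the node returns.
    return {"received_at": datetime.now(timezone.utc)}


@json_checked
def classify_good(state: dict[str, Any]) -> dict[str, Any]:
    # Stores an ISO 8601 string: passes.
    return {"received_at": datetime.now(timezone.utc).isoformat()}
```

With the wrapper in place, a node-level unit test for `classify_bad` fails with the node's name in the message, instead of the error surfacing two nodes later at the interrupt.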

See State Serialization for Interrupts for the full forbidden-types list, the Pydantic-in-state pattern, and the integration-test harness.

Step 3 — The resume contract

Two shapes. They are not equivalent.


from langgraph.types import Command

# Shape A — resume only: human approved as-is
graph.invoke(Command(resume="approved"), config)

# Shape B — update + resume: human edited state mid-graph
graph.invoke(
    Command(update={"recipient": "new@example.com"}, resume="approved"),
    config,
)

resume="..." is the value returned from inline interrupt() inside the node (if any). For interrupt_before / interrupt_after, no node reads resume, but the checkpoint records it for audit.

update={...} merges into state via the reducer declared in the TypedDict. Without a reducer, update replaces the field (P18). Always annotate list and dict state:


from typing import Annotated, TypedDict
from langchain_core.messages import AnyMessage
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    messages: Annotated[list[AnyMessage], add_messages]      # append, not replace
    approvals: Annotated[list[dict], lambda l, r: l + r]     # custom append reducer
    draft: Annotated[dict, lambda l, r: {**l, **r}]          # dict merge reducer
    last_decision: str                                        # scalar: replace is fine

See Resume Patterns for the five canonical resume shapes (plain approve, approve with edits, reject to END, partial approval, inline-interrupt structured return), the reducer cookbook, and the audit-log write order.
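
The replace-versus-merge behavior described in this step can be illustrated with a toy model. This is not LangGraph's implementation; `apply_update` is a hypothetical function that only mimics the rule "use the reducer when one is declared, otherwise replace", and the sample approver names are made up.

```python
from typing import Any, Callable

Reducer = Callable[[Any, Any], Any]


def apply_update(
    state: dict[str, Any],
    update: dict[str, Any],
    reducers: dict[str, Reducer],
) -> dict[str, Any]:
    """Toy model of a state update: reduce when a reducer exists, else replace."""
    merged = dict(state)
    for key, new_value in update.items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], new_value)
        else:
            merged[key] = new_value
    return merged


state = {
    "approvals": [{"by": "alice"}],
    "draft": {"to": "user@example.com", "subject": "Welcome"},
}
update = {
    "approvals": [{"by": "bob"}],
    "draft": {"subject": "Welcome aboard"},
}

# No reducers: both fields are overwritten, which is the P18 failure mode.
without = apply_update(state, update, reducers={})

# Same reducers as the AgentState example: list append and dict merge.
with_reducers = apply_update(
    state,
    update,
    reducers={
        "approvals": lambda left, right: left + right,
        "draft": lambda left, right: {**left, **right},
    },
)
```

In the first call the earlier approval and the draft's `to` field are lost; in the second both survive, which is the behavior the annotated `AgentState` fields are meant to guarantee.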

Step 4 — Wire the approval UI

Two HTTP endpoints. Keep them boring.

GET /approvals/pending lists paused threads:


[
  {
    "thread_id": "conv-abc123",
    "checkpoint_id": "01JABC...",
    "interrupted_at": "2026-04-21T15:32:11Z",
    "node": "send_email",
    "state_diff": {"draft": {"to": "user@example.com", "subject": "Welcome"}}
  }
]

POST /approvals//decision applies the decision:


{
  "decision": "approve" | "reject" | "edit",
  "edits": {"recipient": "corrected@example.com"},
  "approver": "jeremy@intentsolutions.io",
  "reason": "Verified against ticket INT-4821",
  "expected_checkpoint_id": "01JABC...",
  "idempotency_key": "c2f5e8a0-..."
}

Optimistic concurrency (the expected_checkpoint_id check) matters the moment two approvers open the same thread in two browser tabs. Without it, the second click silently overwrites the first. Return 409 Conflict on mismatch; the UI then refreshes.

Server-side flow: authz → idempotency dedupe → checkpoint check → audit-log write (BEFORE mutation) → build Command → graph.ainvoke(cmd, config) → audit-log finalize.
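
The ordering of the dedupe, checkpoint check, and audit write can be sketched without a web framework. Everything here is a hypothetical in-memory stand-in: `PendingThread` and `handle_decision` are illustrative names, authz is omitted, and the graph resume is simulated by advancing a checkpoint id rather than calling LangGraph.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class PendingThread:
    """Hypothetical in-memory record of one paused thread."""

    checkpoint_id: str
    seen_keys: dict[str, tuple[int, dict[str, Any]]] = field(default_factory=dict)
    audit_log: list[dict[str, Any]] = field(default_factory=list)


def handle_decision(
    thread: PendingThread, body: dict[str, Any]
) -> tuple[int, dict[str, Any]]:
    """Return (http_status, response_body) for one POST decision request."""
    key = body["idempotency_key"]
    if key in thread.seen_keys:
        # Idempotency dedupe: a retried request gets the original response.
        return thread.seen_keys[key]
    if body["expected_checkpoint_id"] != thread.checkpoint_id:
        # Optimistic concurrency: someone else already acted on this thread.
        return 409, {"error": "checkpoint changed", "current": thread.checkpoint_id}
    # The audit entry is written BEFORE any state mutation.
    thread.audit_log.append(
        {
            "approver": body["approver"],
            "decision": body["decision"],
            "checkpoint_id_before": thread.checkpoint_id,
        }
    )
    # A real handler would build a Command and resume the graph here;
    # this sketch only simulates the checkpoint advancing.
    thread.checkpoint_id = f"{thread.checkpoint_id}-next"
    response = (200, {"status": "applied", "checkpoint_id": thread.checkpoint_id})
    thread.seen_keys[key] = response
    return response


thread = PendingThread(checkpoint_id="ckpt-1")
first = {
    "decision": "approve",
    "approver": "a@example.com",
    "expected_checkpoint_id": "ckpt-1",
    "idempotency_key": "k1",
}
second = {**first, "approver": "b@example.com", "idempotency_key": "k2"}

r1 = handle_decision(thread, first)        # applied
r1_retry = handle_decision(thread, first)  # same key: cached response
r2 = handle_decision(thread, second)       # stale checkpoint: 409
```

The second approver's request carries the checkpoint id it saw when the page loaded, so it is rejected instead of silently overwriting the first decision.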

See Approval UI Wiring for the full HTTP contract with status codes, FastAPI implementation, Slack Block Kit mapping, state-diff redaction, and an audit-log schema compatible with SOC2 evidence requirements.

Step 5 — Safe cancellation: route to `END` on reject

When the human rejects, the gated node must NOT execute. Two clean patterns:

Pattern A — conditional edge after the interrupted node (preferred):


from langgraph.graph import END

def route_after_approval(state: AgentState) -> str:
    if state.get("last_decision") == "rejected":
        return END
    return "send_email"

builder.add_conditional_edges("await_approval", route_after_approval, {
    "send_email": "send_email",
    END: END,
})

Pattern B — Command(goto=END) at resume:


graph.invoke(Command(resume="rejected", goto=END), config)

Prefer Pattern A in production: graph topology stays the source of truth, and audit replays work without the UI. Always log the rejection to the checkpoint via Command(update={"last_decision": "rejected", "reject_reason": ...}) BEFORE routing to END; otherwise the audit trail lives only in the UI DB.
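
Pattern A's router is a plain function, so it can be unit-tested without compiling a graph. The sketch below assumes `END` is the string sentinel `"__end__"` and defines a local stand-in so it runs without LangGraph installed; `check_router` and `PATH_MAP` are hypothetical helpers that verify every router output is covered by the path map.

```python
from typing import Any, Callable

# Stand-in for langgraph.graph.END (assumed to be this string sentinel).
END = "__end__"

PATH_MAP = {"send_email": "send_email", END: END}


def route_after_approval(state: dict[str, Any]) -> str:
    """Route to END on rejection; otherwise continue to the gated node."""
    if state.get("last_decision") == "rejected":
        return END
    return "send_email"


def check_router(
    router: Callable[[dict[str, Any]], str],
    path_map: dict[str, str],
    sample_states: list[dict[str, Any]],
) -> None:
    """Fail fast if the router can return a key that path_map does not cover."""
    for sample in sample_states:
        target = router(sample)
        if target not in path_map:
            raise AssertionError(
                f"router returned {target!r}, not in path_map, for state {sample!r}"
            )


check_router(
    route_after_approval,
    PATH_MAP,
    [{"last_decision": "rejected"}, {"last_decision": "approved"}, {}],
)
```

Running this check in CI guards against a router that returns a key missing from the path map.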

Step 6 — Native interrupts vs a separate approval service

Dimension	LangGraph interrupts	Separate approval service
Latency	50-300 ms pause + human time	Human time + queue latency
State coherence	Single source of truth (checkpoint)	Two systems to reconcile
Concurrency	Checkpoint-based optimistic locking	Whatever the queue provides
Multi-graph	Per-graph, per-thread	Centralized policy engine
Observability	`get_state()` + checkpoint history	Separate audit system
Failure mode	JSON-serialization at interrupt (P17)	Network partition between services
Best for	Single LangGraph app, 1-10 approval types, <1k/day	Multi-app enterprise, complex RBAC, 10k+/day

Single LangGraph app with fewer than a dozen approval types: native interrupts are simpler and more reliable. Cross-app approval platform with escalations, delegations, and SLAs: run a dedicated service and call it from a tool, not from an interrupt.

Output

Graph compiled with explicit interrupt_before / interrupt_after lists, or inline interrupt() calls where payload structure matters
JSON-only state: datetime → ISO strings, bytes → base64, Pydantic → .model_dump(mode="json"), custom classes → dicts
TypedDict state with explicit reducers on every list and dict field
Pre-interrupt state scanner attached as middleware or called at node exits; raises NonSerializableStateError with a key path
Approval HTTP endpoints: GET pending with state diffs, POST decision with expected_checkpoint_id optimistic-concurrency check and idempotency_key dedupe
Rejection routes to END via conditional edge (Pattern A) with last_decision recorded in state for audit
Audit log written BEFORE state mutation with approver, reason, thread_id, checkpoint_id_before, checkpoint_id_after

Error Handling

Error	Cause	Fix
`TypeError: Object of type datetime is not JSON serializable` at interrupt	Non-JSON value in state (P17)	Convert at node boundary; add pre-interrupt scanner in CI
Resume with `Command(update={"messages": [new]})` loses history	`messages` field missing reducer (P18)	Annotate as `Annotated[list[AnyMessage], add_messages]`
`ValueError: Thread ... has no interrupted nodes` on resume	Graph already ran to completion, or `thread_id` mismatch	Call `graph.get_state(config)` first; assert `snapshot.next` is non-empty
Human clicks approve, nothing happens	Missing checkpointer on `compile()`; interrupts require persistence	`builder.compile(checkpointer=MemorySaver())` in dev, a `PostgresSaver` instance in prod
Two approvers both click approve, second one's edits win silently	No optimistic concurrency	Include `expected_checkpoint_id` in POST body; return 409 on mismatch
`KeyError: 'configurable'` at resume	`config` dict missing `thread_id`	`config = {"configurable": {"thread_id": tid}}` — required by every checkpointer
Approval UI shows stale state after another approver acted	Cached GET /pending response	`Cache-Control: no-store` on the pending endpoint
Graph halts silently after reject	Conditional edge router returned value not in `path_map`	Include `END` in `path_map`; assert router output in keyset

Examples

Approval gate before an expensive tool

Email-sending agent that must not send without approval. State carries draft: {to, subject, body}, graph compiles with interrupt_before=["send_email"], resume either invokes the send tool or routes to END on reject. See Resume Patterns for the full worked example including audit-log write order.

Partial approval — approve one argument, edit another

Human accepts the recipient but rewrites the subject. Resume is Command(update={"draft": {**state["draft"], "subject": new_subject}}, resume="approved"). Note the spread: without it, and without a reducer on draft, the whole draft is replaced. Dict fields replace by default; declare a dict reducer to merge partials cleanly. See Resume Patterns.

Inline `interrupt()` with a custom payload

Inside a validate_purchase node, the model has decided to buy three items at USD 450 total. The node calls decision = interrupt({"kind": "confirm_purchase", "items": items, "total": 450}) and the UI reads the payload to render a rich confirmation dialog. On resume, decision is whatever the UI sent via Command(resume={"approved": True, "notes": "..."}). See Interrupt Decision Tree.

Slack-driven approval

GET /pending feeds a cron that posts Block Kit messages with approve/reject buttons. Button callback POSTs to /decision. Slack's interaction payload carries user.id, which becomes approver in the audit log. See Approval UI Wiring for the Block Kit template and signing-secret validation.
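
Signing-secret validation for the button callback can be sketched with the standard library. It follows Slack's documented v0 scheme: an HMAC-SHA256 over `v0:{timestamp}:{raw body}`, compared against the `X-Slack-Signature` header, with the timestamp taken from `X-Slack-Request-Timestamp`. The function names, the test secret, and the 300-second replay window are assumptions for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional


def slack_signature(signing_secret: str, timestamp: str, raw_body: bytes) -> str:
    """Compute the v0 signature Slack is expected to send for this request."""
    base = b"v0:" + timestamp.encode() + b":" + raw_body
    digest = hmac.new(signing_secret.encode(), base, hashlib.sha256).hexdigest()
    return f"v0={digest}"


def is_valid_slack_request(
    signing_secret: str,
    timestamp: str,
    raw_body: bytes,
    received_signature: str,
    *,
    now: Optional[float] = None,
    max_age_seconds: int = 300,
) -> bool:
    """Reject stale or tampered requests before touching the approval flow."""
    current = time.time() if now is None else now
    try:
        age = abs(current - int(timestamp))
    except ValueError:
        return False
    if age > max_age_seconds:
        return False  # likely a replay
    expected = slack_signature(signing_secret, timestamp, raw_body)
    # Constant-time comparison on bytes avoids timing leaks.
    return hmac.compare_digest(expected.encode(), received_signature.encode())


# Hypothetical request values for illustration only.
secret = "test-secret"
body = b"payload=%7B%22type%22%3A%22block_actions%22%7D"
ts = "1700000000"
sig = slack_signature(secret, ts, body)

ok = is_valid_slack_request(secret, ts, body, sig, now=1700000010)
tampered = is_valid_slack_request(secret, ts, body + b"x", sig, now=1700000010)
stale = is_valid_slack_request(secret, ts, body, sig, now=1700000301)
```

The check must run on the raw request bytes, before any form or JSON parsing, because re-serializing the payload changes the bytes and therefore the signature.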

Resources

LangGraph: Human-in-the-loop (concepts)
LangGraph: How to add human-in-the-loop
LangGraph: Command type reference
LangGraph: interrupt function reference
LangGraph: Persistence and checkpointers
LangGraph 1.0 release notes
Pack pain catalog: docs/pain-catalog.md (entries P16, P17, P18, P20)
Related skills: langchain-langgraph-basics, langchain-langgraph-checkpointing, langchain-middleware-patterns
