langchain-multi-env-setup

Build reliable dev / staging / prod isolation for LangChain 1.0 services — Pydantic `Settings` + `SecretStr`, cloud Secret Manager in prod, per-env prompt and model version pinning, env-specific checkpointer and observability. Use when graduating from `.env`-in-dev to real prod infra, or debugging a config that loaded the wrong values in the wrong env. Trigger with "langchain multi-env", "langchain pydantic settings", "langchain secret manager", "langchain env config", "langchain prod setup".

claude-code, codex
5 Tools
Plugin: langchain-py-pack
Category: saas packs

Allowed Tools

Read, Write, Edit, Bash(python:*), Bash(gcloud:*)

Provided by Plugin

langchain-py-pack

Claude Code skill pack for LangChain 1.0 + LangGraph 1.0 (Python) - 34 skills covering chains, agents, RAG, middleware, checkpointing, HITL, streaming, and production patterns

saas packs v2.0.0

Installation

This skill is included in the langchain-py-pack plugin:

/plugin install langchain-py-pack@claude-code-plugins-plus


Instructions

LangChain Multi-Env Setup (Python)

Overview

A team ships a LangChain 1.0 service to staging with python-dotenv loading .env.staging into os.environ. A security audit runs docker exec STAGING-POD env and finds ANTHROPIC_API_KEY=sk-ant-api03-... printed in plain text. Anyone with kubectl exec, any sidecar, any core dump, any error tracker that auto-captures process env sees the key. This is pain P37: secrets loaded from .env in production containers leak via env.

A second failure chains on. A developer runs the staging deploy from a shell where LANGCHAIN_ENV=production was set hours earlier. The loader picks the prod .env, staging answers with a prompt commit tuned only for the prod model tier, and latency doubles. Two root causes: no type-safe env gate, and no startup validation that would have caught the mismatched model id.

Both are fixed by one refactor:


# BAD — dotenv populates os.environ; any process with container access sees it
import os
from dotenv import load_dotenv

load_dotenv(".env.production")
api_key = os.environ["ANTHROPIC_API_KEY"]  # P37: leaks via `docker exec env`

# GOOD — SecretStr in a validated Settings object, pulled from Secret Manager
from typing import Literal

from pydantic import SecretStr
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    env: Literal["dev", "staging", "prod"]
    anthropic_api_key: SecretStr

settings = build_settings()  # Step 2's loader — pulls from GCP Secret Manager in prod
api_key = settings.anthropic_api_key.get_secret_value()
# repr(settings) prints SecretStr('**********') — safe to log

This skill owns the per-env config plumbing: Settings skeleton, Secret Manager integration, per-env pinning, startup smoke test. It does not own the full secrets lifecycle (rotation, revocation, scope) — that belongs to langchain-security-basics.

Pin: langchain-core 1.0.x, langchain-anthropic 1.0.x, pydantic >= 2.5, pydantic-settings >= 2.1. Pain anchors: P37 (primary), P20 (checkpointer schema — cross-ref langchain-langgraph-checkpointing).

Two numbers: smoke test < 10 seconds; env-var count ~15-30 (more than 30 means Settings is absorbing feature flags and should split).

Prerequisites

  • Python 3.10+ (3.11+ recommended for Literal and StrEnum ergonomics)
  • langchain-core >= 1.0, < 2.0
  • pydantic >= 2.5, pydantic-settings >= 2.1
  • One secret backend: GCP Secret Manager (google-cloud-secret-manager), AWS Secrets Manager (boto3), or HashiCorp Vault (hvac)
  • Completed langchain-sdk-patterns — the Settings object is injected into the chain factories from that skill

Instructions

Run these six steps in order — each adds one invariant the next step depends on:

  1. Define a Settings class with SecretStr keys, Literal env, and fail-fast validation.
  2. Add a per-env loader — file in dev, env vars in staging, Secret Manager in prod.
  3. Use the cloud Secret Manager client to pull keys into memory only.
  4. Pin model_id, prompt_commit_hash, and vector_index_name per env.
  5. Configure the checkpointer per env — memory in dev, Postgres elsewhere.
  6. Run a startup smoke test under 10 seconds before the HTTP server binds.
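The six steps compose into one startup sequence. A minimal sketch of the wiring, with the build/validate names taken from the steps in this skill and stub callables standing in for the real implementations:

```python
# Hypothetical entrypoint wiring. build_settings, build_checkpointer, and
# validate_integrations are the functions sketched in Steps 2, 5, and 6;
# serve() stands in for your HTTP framework's bind-and-run call.
def startup(build_settings, build_checkpointer, validate_integrations, serve):
    settings = build_settings()                  # Steps 1-3: typed, secret-safe config
    checkpointer = build_checkpointer(settings)  # Step 5: per-env persistence
    validate_integrations(settings)              # Step 6: raises, rollout halts
    serve(settings, checkpointer)                # only now bind the HTTP server

# Stubs that record the order of operations, to show the sequencing contract:
calls: list[str] = []
startup(
    build_settings=lambda: (calls.append("settings") or {"env": "dev"}),
    build_checkpointer=lambda s: calls.append("checkpointer"),
    validate_integrations=lambda s: calls.append("smoke"),
    serve=lambda s, c: calls.append("bind"),
)
assert calls == ["settings", "checkpointer", "smoke", "bind"]
```

The only hard ordering constraint is that validate_integrations runs before serve: a failed probe must abort the process before the readiness probe can go green.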

Step 1 — Create a Settings class with SecretStr and fail-fast validation


from typing import Literal
from pydantic import SecretStr, HttpUrl, Field, ValidationError
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=None,              # see Step 2 — loader picks the file
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="forbid",             # reject unknown env vars — typo detection
    )

    # --- env switch (drives everything else) ---
    env: Literal["dev", "staging", "prod"] = Field(..., alias="LANGCHAIN_ENV")

    # --- secrets (always SecretStr — never str) ---
    anthropic_api_key: SecretStr = Field(..., alias="ANTHROPIC_API_KEY")
    openai_api_key: SecretStr = Field(..., alias="OPENAI_API_KEY")
    langsmith_api_key: SecretStr = Field(..., alias="LANGSMITH_API_KEY")

    # --- per-env pinning (see Step 4) ---
    model_id: str = Field(..., alias="LANGCHAIN_MODEL_ID")
    prompt_commit_hash: str = Field(..., alias="LANGCHAIN_PROMPT_COMMIT")
    vector_index_name: str = Field(..., alias="LANGCHAIN_VECTOR_INDEX")

    # --- endpoints (validated URLs — typo caught at startup) ---
    checkpointer_url: HttpUrl | None = Field(None, alias="LANGCHAIN_CHECKPOINTER_URL")
    otel_endpoint: HttpUrl = Field(..., alias="OTEL_EXPORTER_OTLP_ENDPOINT")

    # --- budget guards (per-env) ---
    max_cost_usd_per_day: float = Field(10.0, alias="LANGCHAIN_DAILY_BUDGET_USD")
    max_rpm: int = Field(60, alias="LANGCHAIN_MAX_RPM")

SecretStr masks repr(settings) to SecretStr('**********') — a routine logger.info(settings) cannot leak the key. The only way to read the plaintext is .get_secret_value(), which is trivially greppable in review. extra="forbid" catches typos (LANGCHIN_MODEL_ID) at import time. HttpUrl rejects http:/otel:4318 before the exporter wastes 60s on DNS. See Settings Skeleton for the full class.

Step 2 — Per-env config loading (file OR Secret Manager, never both)


import os
from pathlib import Path

def build_settings() -> Settings:
    env = os.environ.get("LANGCHAIN_ENV", "dev")

    if env == "dev":
        # Local dev: .env.dev file, values checked into 1Password not git
        return Settings(_env_file=Path(".env.dev"))

    if env == "staging":
        # CI / staging: env vars injected by the orchestrator
        # (GitHub Actions secrets, k8s envFrom: secretRef, etc.)
        return Settings()  # reads os.environ directly

    if env == "prod":
        # Prod: pull from Secret Manager into memory ONLY
        values = pull_from_secret_manager()
        return Settings(**values)

    raise ValueError(f"unknown LANGCHAIN_ENV: {env!r}")

Three loaders, one class. Dev touches a file on disk. Staging inherits env vars from the orchestrator — envFrom: secretRef is readable via docker exec env, but the blast radius is bounded and rotation is weekly. Prod is the P37 fix: pull_from_secret_manager() builds a dict and passes kwargs to Settings(...). Values land in the instance attributes and never touch os.environ; a subprocess will not inherit them.

Step 3 — Secret Manager pull (GCP example; AWS / Vault in reference)


import os

from google.cloud import secretmanager

def pull_from_secret_manager() -> dict[str, str]:
    client = secretmanager.SecretManagerServiceClient()
    project = os.environ["GCP_PROJECT_ID"]
    secret_names = ["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "LANGSMITH_API_KEY"]
    out: dict[str, str] = {}
    for name in secret_names:
        resource = f"projects/{project}/secrets/{name}/versions/latest"
        response = client.access_secret_version(request={"name": resource})
        out[name] = response.payload.data.decode("utf-8")
    # Non-secret passthrough (model id, prompt hash, endpoints)
    for key in ["LANGCHAIN_ENV", "LANGCHAIN_MODEL_ID", "LANGCHAIN_PROMPT_COMMIT",
                "LANGCHAIN_VECTOR_INDEX", "LANGCHAIN_CHECKPOINTER_URL",
                "OTEL_EXPORTER_OTLP_ENDPOINT"]:
        if key in os.environ:
            out[key] = os.environ[key]
    return out

No os.environ[k] = v line anywhere: the dict goes straight into Settings(**values). Workload-identity IAM handles auth; no static key on disk. For AWS / Vault see Secret Manager Integration.

Step 4 — Per-env model and prompt pinning

Dev, staging, and prod run different model ids and different prompt

commit hashes. Pinning happens at env-var level so app code is env-agnostic

(see the Env Matrix below for values). One function reads

settings.promptcommithash and pulls from LangSmith

(cross-ref langchain-prompt-engineering):


from langchain_core.prompts import ChatPromptTemplate
from langsmith import Client

ls = Client(api_key=settings.langsmith_api_key.get_secret_value())

def get_prompt(settings: Settings) -> ChatPromptTemplate:
    return ls.pull_prompt(f"triage-prompt:{settings.prompt_commit_hash}")

Prevents: staging loading a prod prompt commit. Pinning per env makes promotion explicit — dev → staging → prod moves one hash at a time. See Per-Env Pinning.

Step 5 — Per-env checkpointer selection

Checkpointer choice is per-env too:


from langgraph.checkpoint.memory import MemorySaver
from langgraph.checkpoint.postgres import PostgresSaver

def build_checkpointer(settings: Settings):
    if settings.env == "dev":
        return MemorySaver()          # ephemeral, resets on restart
    # staging + prod: Postgres with env-isolated schema
    # cross-ref langchain-langgraph-checkpointing (P20) for schema migration
    return PostgresSaver.from_conn_string(
        str(settings.checkpointer_url)
    )

Dev uses MemorySaver — no infra dependency, no state between runs. Staging and prod use PostgresSaver against separate databases (or separate schemas). Never share a checkpointer DB between envs; P20 explains why — schema migrations on a version bump corrupt cross-env threads.

Step 6 — Startup smoke test (< 10 seconds budget)


import time
from anthropic import Anthropic

def validate_integrations(settings: Settings) -> None:
    t0 = time.monotonic()

    # 1. Model reachable (1-token ping ~ $0.00001)
    anthropic = Anthropic(api_key=settings.anthropic_api_key.get_secret_value())
    anthropic.messages.create(
        model=settings.model_id,
        max_tokens=1,
        messages=[{"role": "user", "content": "hi"}],
    )

    # 2. Checkpointer reachable
    if settings.env != "dev":
        checkpointer = build_checkpointer(settings)
        checkpointer.setup()  # runs SELECT 1 + schema check

    # 3. Vector store reachable (see langchain-embeddings-search)
    # ... describe_index call here ...

    # 4. Observability endpoint reachable (OTLP HTTP health)
    # ... requests.get(f"{settings.otel_endpoint}/health", timeout=2) ...

    elapsed = time.monotonic() - t0
    if elapsed > 10.0:
        raise RuntimeError(
            f"startup smoke test took {elapsed:.1f}s (budget 10s)"
        )

Call validate_integrations(settings) before the HTTP server binds. Failure aborts the deploy — the readiness probe never goes green, the rollout halts, the bad version takes no traffic. Budget: 10 seconds. Past 10s an integration is degraded — fail loudly rather than ship a 30s cold start. See Startup Smoke Test.
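To stay under the budget, the probes can run concurrently instead of serialized. A sketch using only the standard library, with dummy sleep probes standing in for the real model / checkpointer / vector / OTEL checks:

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def run_probes(probes: dict[str, Callable[[], None]], budget_s: float = 10.0) -> float:
    """Run all probes in parallel; re-raise the first failure; enforce the budget."""
    t0 = time.monotonic()
    with ThreadPoolExecutor(max_workers=len(probes)) as pool:
        futures = {name: pool.submit(fn) for name, fn in probes.items()}
        for name, fut in futures.items():
            fut.result(timeout=budget_s)  # a failed probe re-raises here
    elapsed = time.monotonic() - t0
    if elapsed > budget_s:
        raise RuntimeError(f"smoke test took {elapsed:.1f}s (budget {budget_s}s)")
    return elapsed

# Dummy probes, each sleeping 0.2s: serialized they would cost ~0.8s of wall
# time, in parallel the whole set finishes in roughly the slowest probe.
probes = {name: (lambda: time.sleep(0.2))
          for name in ("model", "checkpointer", "vector", "otel")}
elapsed = run_probes(probes, budget_s=10.0)
```

The per-future timeout is a simplification; a stricter version would subtract time already spent from the remaining budget on each result() call.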

Output

  • Settings class on pydantic-settings with SecretStr for keys, Literal env, HttpUrl endpoints, extra="forbid"
  • Env-specific loader (file → dev; env vars → staging; Secret Manager → prod); values land in Settings only, never os.environ
  • Cloud Secret Manager integration (GCP / AWS / Vault) with IAM-bound auth; no static keys on disk
  • Per-env pinning for model_id, prompt_commit_hash, vector_index_name, checkpointer_url
  • Per-env checkpointer (MemorySaver dev, PostgresSaver on isolated DBs staging/prod)
  • Startup smoke test — model / vector / checkpointer / observability under 10-second budget

Env Matrix

| Dimension | dev | staging | prod |
|---|---|---|---|
| Secret backend | .env.dev file (git-ignored) | orchestrator env vars | cloud Secret Manager, memory only |
| os.environ holds keys | yes (local) | yes (sidecar visible) | no (P37 fix) |
| model_id | claude-haiku-4-6 | claude-sonnet-4-6 | claude-sonnet-4-6 |
| prompt_commit_hash | WIP | canary | stable (1 week old) |
| temperature | 0.7 | 0.2 | 0.2 |
| Checkpointer | MemorySaver | PostgresSaver (staging DB) | PostgresSaver (prod DB) |
| Vector index | dev-index | staging-index | prod-index |
| OTEL sample rate | 1.0 | 1.0 | 0.1 |
| RPM limit | 10 | 60 | provider tier |
| Daily budget | $1 | $10 | $500-$5000 |
| Smoke probes | model | model + checkpointer + OTEL | all four |

Error Handling

| Error | Cause | Fix |
|---|---|---|
| docker exec POD env shows ANTHROPIC_API_KEY=... in prod (P37) | dotenv / plain env injection in prod | Pull from Secret Manager into Settings(**values); never write to os.environ |
| Staging answers with prod prompts / wrong model | Loader defaulted or picked stale LANGCHAIN_ENV | Literal["dev","staging","prod"] on env; raise on unknown; no default |
| ValidationError: extra fields forbidden at startup | Typo (LANGCHIN_MODEL_ID) | Fix the typo — extra="forbid" working as intended |
| Startup takes 30s before first request | Serialized probes or degraded integration | Enforce 10s budget; parallelize probes; fail the deploy |
| repr(settings) in a log leaks the API key | Plain str used, not SecretStr | Change field to SecretStr; repr masks to '**********' |
| Prod silently using MemorySaver | build_checkpointer defaulted when checkpointer_url was None | Require checkpointer_url in staging/prod via a model validator |
| Secret Manager auth fails in CI | SA not bound; google.auth fell back to ADC | Bind the SA with roles/secretmanager.secretAccessor |
| Prompt hash rolled forward in staging without dev validation | Promotion skipped the dev gate | Enforce dev → staging → prod order in CI (see per-env pinning ref) |
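The promotion-order gate from the last row can live as a small pure function in CI. A hypothetical sketch (check_promotion and the pins mapping are illustrative names, not part of this skill's API):

```python
# Hedged sketch of a CI promotion gate: a prompt hash may enter an env only
# after it is live in the env before it in the dev -> staging -> prod chain.
ORDER = ["dev", "staging", "prod"]

def check_promotion(pins: dict[str, str], target_env: str, new_hash: str) -> None:
    """pins maps env name to the prompt_commit_hash currently live there."""
    idx = ORDER.index(target_env)
    if idx > 0 and pins.get(ORDER[idx - 1]) != new_hash:
        raise ValueError(
            f"{new_hash!r} is not live in {ORDER[idx - 1]!r}; "
            f"promote it there before {target_env!r}"
        )

pins = {"dev": "abc123", "staging": "abc123", "prod": "9f0e11"}
check_promotion(pins, "prod", "abc123")        # ok: already live in staging
skipped = None
try:
    check_promotion(pins, "prod", "deadbeef")  # never ran in staging: rejected
except ValueError as exc:
    skipped = exc
assert skipped is not None
```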

Examples

Graduating a .env-in-dev service to prod

Start: a single .env committed (or leaked via docker exec env). End:

Settings class, three loaders, Secret Manager in prod, smoke test under

10s. Three PRs — (1) introduce Settings without changing loader behavior,

(2) add SecretStr and migrate call sites to .getsecretvalue(),

(3) swap prod to Secret Manager and remove the prod .env from the image.

See Settings Skeleton and

Secret Manager Integration.

Wrong-env prompt loaded in staging — postmortem

Staging inherited LANGCHAIN_ENV=production from a stale shell. The Literal["dev","staging","prod"] field rejects production; CI promotion sets LANGCHAIN_ENV explicitly; direnv pins it per-project. See Per-Env Pinning.

Smoke test blocked a bad model id

A prod deploy went out with LANGCHAIN_MODEL_ID=claude-sonnet-4-7 (not yet rolled out). The 1-token ping failed with model not found, validate_integrations raised, the container crash-looped, the rollout halted, and the previous version kept taking traffic. Zero user impact; the failure surfaced in under 3s. See Startup Smoke Test.

Resources
