local-tts
Offline text-to-speech via VoxCPM2 — 30 languages, voice design, voice cloning. Runs locally on Apple Silicon.
Installation
Open Claude Code and run this command:
/plugin install local-tts@claude-code-plugins-plus
Use --global to install for all projects, or --project to install for the current project only.
What It Does
Generate speech locally with VoxCPM2. 30 languages, voice design, voice cloning. Zero cloud, zero cost.
Runs 100% on your machine using VoxCPM2 (2B-parameter model, Apache-2.0). Optimized for Apple Silicon via Metal (MPS). No API keys, no rate limits, no telemetry.
Features
| Feature | Description |
|---|---|
| Text-to-Speech | 30 languages, auto-detected from input text |
| Voice Design | Describe a voice in natural language (e.g. "warm female voice, mid-30s") |
| Voice Cloning | Clone any voice from a 3-10 second reference clip |
| Ultimate Cloning | Reference + prompt for maximum fidelity (vocal micro-nuances) |
| 48 kHz output | Production-quality WAV ready for Telegram, video, podcast |
Skills (1)
Generate speech locally from text using VoxCPM2 (2B params, Apache-2.0).
How It Works
Via natural language (auto-triggered)
Just ask Claude to:
- "Say hello in French"
- "Generate a voiceover for this text: ..."
- "Clone this voice: /path/to/sample.wav and say ..."
- "Make a warm female voice reading: ..."
Direct invocation
# Paths used by the commands below
VENV=~/.local-tts/venv
SCRIPT="${CLAUDE_PLUGIN_ROOT}/scripts/generate.py"

# Basic text-to-speech
"$VENV/bin/python" "$SCRIPT" --text "Hello world" --out /tmp/hello.wav

# Voice design: put the voice description in parentheses before the text
"$VENV/bin/python" "$SCRIPT" \
  --text "(warm female voice, mid-30s, calm)Welcome back." \
  --out /tmp/design.wav

# Voice cloning from a reference clip
"$VENV/bin/python" "$SCRIPT" \
  --text "This is my cloned voice." \
  --ref /path/to/sample.wav \
  --out /tmp/clone.wav

# Read the text from standard input
cat article.txt | "$VENV/bin/python" "$SCRIPT" --stdin --out /tmp/article.wav
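If you drive the script from Python rather than a shell, the same calls can be assembled as argument lists. The sketch below is only an illustration: `design_prompt` and `build_command` are hypothetical helper names, not part of the plugin, and the code assumes the venv location, the `CLAUDE_PLUGIN_ROOT` variable, and the `--text`, `--ref`, and `--out` flags behave exactly as in the commands above.

```python
import os
from pathlib import Path
from typing import List, Optional


def design_prompt(description: str, text: str) -> str:
    """Prefix the text with a parenthesised voice description (voice design)."""
    return f"({description}){text}"


def build_command(
    text: str,
    out: str,
    ref: Optional[str] = None,
    venv: str = "~/.local-tts/venv",
    plugin_root: Optional[str] = None,
) -> List[str]:
    """Return the argv list for one generate.py call. Nothing is executed here."""
    if plugin_root is None:
        # Assumption: CLAUDE_PLUGIN_ROOT is set, as in the shell examples.
        plugin_root = os.environ["CLAUDE_PLUGIN_ROOT"]
    python = Path(venv).expanduser() / "bin" / "python"
    script = Path(plugin_root) / "scripts" / "generate.py"
    cmd = [str(python), str(script), "--text", text, "--out", out]
    if ref is not None:
        # Optional reference clip for voice cloning.
        cmd += ["--ref", ref]
    return cmd
```

Pass the result to `subprocess.run(cmd, check=True)` to run it. Building a list instead of a shell string avoids quoting problems when the text contains quotes or parentheses.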
Supported languages (30)
Arabic, Burmese, Chinese (+ dialects), Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Khmer, Korean, Lao, Malay, Norwegian, Polish, Portuguese, Russian, Spanish, Swahili, Swedish, Tagalog, Thai, Turkish, Vietnamese.
No language tag is needed: VoxCPM2 detects the language from the text.
FAQ
No such file: VoxCPM2 weights - the Hugging Face cache is missing. The first run downloads the weights, which requires network access.
Slow first call - this is normal. Once the weights are cached, loading the model takes about 30 s. Further calls within the same Python process skip that load, but this script starts a fresh process on every call, so each invocation pays it again. For batch work, write a wrapper that loads the model once.
French pronunciation of names - most names work out of the box. If one is mispronounced, add an IPA-style hint or rephrase.
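The "Slow first call" entry recommends a wrapper that loads the model once for batch work. A minimal sketch of that shape follows. It deliberately does not call the VoxCPM2 API: `load_model` and `synthesize` are hypothetical callables that you would supply yourself, for example by adapting whatever `generate.py` does internally.

```python
from pathlib import Path
from typing import Any, Callable, Iterable, List, Tuple


def synthesize_batch(
    load_model: Callable[[], Any],
    synthesize: Callable[[Any, str, Path], None],
    jobs: Iterable[Tuple[str, str]],
) -> List[Path]:
    """Load the model once, then render every (text, output path) job with it."""
    model = load_model()  # the slow step, paid once per batch
    written: List[Path] = []
    for text, out_path in jobs:
        target = Path(out_path)
        # Hypothetical callable: expected to write the audio for `text` to `target`.
        synthesize(model, text, target)
        written.append(target)
    return written
```

Because the loader is called once, outside the loop, a batch of N clips pays the model load a single time instead of N times.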