Execute Fly.
Read, Write, Edit, Bash(fly:*), Bash(psql:*), Grep
Fly.io Core Workflow B: Postgres, Volumes & Networking
Overview
Set up Fly Postgres, persistent Fly Volumes, and private networking between apps. Fly Postgres runs as a regular Fly app with automated replication. Volumes provide persistent NVMe storage attached to specific machines.
Instructions
Step 1: Create Fly Postgres
# Create a Postgres cluster
fly postgres create --name my-db --region iad --vm-size shared-cpu-1x --volume-size 10
# Attach to your app (sets DATABASE_URL secret automatically)
fly postgres attach my-db -a my-app
# Connect directly
fly postgres connect -a my-db
# psql> SELECT version();
# Proxy to local machine for dev tools
fly proxy 5432 -a my-db
# Now connect: psql postgres://postgres:password@localhost:5432
Step 2: Create Persistent Volumes
# Create a volume (same region as your machine)
fly volumes create data --size 10 --region iad -a my-app
# List volumes
fly volumes list -a my-app
# Mount in fly.toml
# fly.toml
[mounts]
source = "data"
destination = "/data"
# Deploy to pick up mount
fly deploy
# Verify mount inside machine
fly ssh console -C "df -h /data"
Step 3: Private Networking (6PN)
# Apps in the same org can reach each other via .internal DNS
# my-app can reach my-db at: my-db.internal:5432
# Internal DNS format: <app-name>.internal
# Machine-specific: <machine-id>.vm.<app-name>.internal
# Example: connect from app code
DATABASE_URL=postgres://postgres:password@my-db.internal:5432/my_db
// Access internal services (no public internet)
const dbUrl = `postgres://postgres:${process.env.DB_PASSWORD}@my-db.internal:5432/mydb`;
const apiUrl = `http://my-api.internal:3000/health`; // Internal HTTP
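As a small illustration of the `.internal` naming scheme above, the following sketch builds internal hostnames and a Postgres connection string. The helper names (`internalHost`, `machineHost`, `internalPostgresUrl`) are hypothetical and not part of any Fly.io SDK; percent-encoding the credentials is an assumption that matters only when a password contains reserved URL characters.

```typescript
// Hypothetical helpers for illustration; not part of any Fly.io SDK.

// App-level address: <app-name>.internal
function internalHost(appName: string): string {
  return `${appName}.internal`;
}

// Machine-specific address: <machine-id>.vm.<app-name>.internal
function machineHost(machineId: string, appName: string): string {
  return `${machineId}.vm.${appName}.internal`;
}

// Builds a connection string for a database reachable over private networking.
function internalPostgresUrl(
  appName: string,
  user: string,
  password: string,
  database: string,
  port = 5432,
): string {
  // encodeURIComponent keeps characters such as "@" or "/" in the
  // password from being parsed as part of the URL structure.
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `postgres://${credentials}@${internalHost(appName)}:${port}/${database}`;
}
```

For example, `internalPostgresUrl("my-db", "postgres", "p@ss/word", "mydb")` yields a URL whose password segment is `p%40ss%2Fword`.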
Step 4: Postgres Backups and Failover
# List backups
fly postgres barman list-backups -a my-db
# Create manual backup
fly postgres barman backup -a my-db
# Check replication status
fly postgres barman check -a my-db
# Failover to standby (if primary fails)
fly postgres failover -a my-db
Error Handling
| Error | Cause | Solution |
|---|---|---|
| volume not found | Volume in different region | Create volume in same region as machine |
| connection refused on .internal | App not running | Check fly status -a target-app |
| database does not exist | Not yet created | Run CREATE DATABASE mydb; via fly postgres connect |
| disk full | Volume full | |
Optimize Fly.
Read, Write, Edit, Bash(fly:*)
Fly.io Cost Tuning
Overview
Fly.io charges per-second for running machines plus storage. Key levers: auto-stop idle machines, suspend instead of stop, right-size VMs, and clean up unused volumes.
Pricing Quick Reference
| Resource | Free Tier | Cost |
|---|---|---|
| shared-cpu-1x (256mb) | 3 VMs free | ~$1.94/month each |
| shared-cpu-1x (512mb) | included | ~$3.88/month |
| shared-cpu-2x (1gb) | - | ~$11.62/month |
| Volumes | 3GB free | $0.15/GB/month |
| Bandwidth | 100GB free | $0.02/GB after |
| IPv4 | 1 free per org | $2/month each |
Instructions
Strategy 1: Auto-Stop Idle Machines
# fly.toml — stop machines when no traffic
[http_service]
auto_stop_machines = "stop" # Full stop (cheapest, ~5s cold start)
auto_start_machines = true
min_machines_running = 0 # Allow all machines to stop
# Use min_machines_running = 1 only for production apps
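To get a feel for what auto-stop can save, here is a rough estimate under stated assumptions: compute cost is taken to scale linearly with running time, and storage for stopped machines, bandwidth, and IPv4 are ignored. The function name `estimateMonthlyCompute` is hypothetical, and the always-on prices come from the quick reference table in this guide.

```typescript
// Rough estimate only. Assumes compute cost scales linearly with running
// time and ignores volumes, stopped-machine storage, bandwidth, and IPv4.
function estimateMonthlyCompute(alwaysOnMonthlyUsd: number, runningHoursPerDay: number): number {
  if (runningHoursPerDay < 0 || runningHoursPerDay > 24) {
    throw new RangeError("runningHoursPerDay must be between 0 and 24");
  }
  const cost = alwaysOnMonthlyUsd * (runningHoursPerDay / 24);
  // Round to whole cents for display.
  return Math.round(cost * 100) / 100;
}
```

With the table's ~$3.88/month figure for a 512mb shared-cpu-1x, a machine that runs about 6 hours a day would come to roughly `estimateMonthlyCompute(3.88, 6)`, or about a quarter of the always-on price.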
Strategy 2: Suspend for Faster Resume
# Suspend keeps memory state — resumes in ~100ms but costs ~$0.50/month
[http_service]
auto_stop_machines = "suspend"
Strategy 3: Audit and Clean Up
# List all apps and their machine counts
fly apps list
# Find idle/stopped machines
fly machine list -a my-app --json | jq '.[] | select(.state != "started") | {id, state, region}'
# Destroy unused apps
fly apps destroy old-app --yes
# List and delete orphaned volumes
fly volumes list -a my-app
fly volumes destroy vol_xxx
Strategy 4: Right-Size VMs
# Check memory usage to see if oversized
fly ssh console -a my-app -C "cat /proc/meminfo | head -3"
# Downgrade if using <50% of allocated memory
fly scale vm shared-cpu-1x --memory 256 -a my-app
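The "<50% of allocated memory" rule above can be checked mechanically. The sketch below parses the first lines of `/proc/meminfo` (as printed by the `fly ssh console` command) and applies that threshold. The names `parseMeminfo` and `shouldDowngrade` are hypothetical, and treating `MemAvailable` as the measure of free memory is an assumption.

```typescript
interface MemoryUsage {
  totalKb: number;
  availableKb: number;
  usedFraction: number; // 0..1, share of memory not available
}

// Parses `cat /proc/meminfo | head -3` output (MemTotal, MemFree, MemAvailable).
function parseMeminfo(meminfo: string): MemoryUsage {
  const read = (key: string): number => {
    const match = meminfo.match(new RegExp(`^${key}:\\s+(\\d+)\\s*kB`, "m"));
    if (!match) throw new Error(`missing ${key} in meminfo output`);
    return Number(match[1]);
  };
  const totalKb = read("MemTotal");
  const availableKb = read("MemAvailable");
  return { totalKb, availableKb, usedFraction: 1 - availableKb / totalKb };
}

// Applies the guide's rule of thumb: downgrade when under 50% is in use.
function shouldDowngrade(usage: MemoryUsage, threshold = 0.5): boolean {
  return usage.usedFraction < threshold;
}
```

A single sample is only a snapshot; sampling at peak traffic gives a safer basis for a downgrade decision.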
Cost Monitoring
# Check current month's usage
fly billing # Shows org-level billing
# Estimate per-app cost
fly scale show -a my-app # See VM count and size
Next Steps
For architecture design, see flyio-reference-architecture.
Collect Fly.
Read, Bash(fly:*), Bash(curl:*), Bash(tar:*), Grep
Fly.io Debug Bundle
Overview
Collect machine state, app health, volume status, deploy history, network connectivity, and platform diagnostics into a single archive for Fly.io support tickets. This bundle captures everything needed to troubleshoot stuck deployments, machine boot failures, volume corruption, and edge networking problems.
Debug Collection Script
#!/bin/bash
set -euo pipefail
APP="${1:?Usage: fly-debug.sh <app-name>}"
BUNDLE="debug-flyio-${APP}-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BUNDLE"
# Environment check
echo "=== Fly.io Debug Bundle: $APP ===" | tee "$BUNDLE/summary.txt"
echo "Generated: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$BUNDLE/summary.txt"
echo "FLY_API_TOKEN: ${FLY_API_TOKEN:+[SET]}" >> "$BUNDLE/summary.txt"
echo "flyctl: $(fly version 2>/dev/null || echo 'not found')" >> "$BUNDLE/summary.txt"
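# API connectivity
# ":-" keeps set -u from aborting when the token is unset. curl already prints
# 000 on connection failure, so the fallback only has to swallow the exit code.
HTTP=$(curl -s -o /dev/null -w "%{http_code}" \
-H "Authorization: Bearer ${FLY_API_TOKEN:-}" \
"https://api.machines.dev/v1/apps/${APP}" 2>/dev/null || true)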
echo "Machines API: HTTP $HTTP" >> "$BUNDLE/summary.txt"
# App status and machine state
fly status -a "$APP" > "$BUNDLE/status.txt" 2>&1 || true
fly machine list -a "$APP" --json > "$BUNDLE/machines.json" 2>&1 || true
# Recent logs (last 200 lines)
fly logs -a "$APP" --no-tail 2>&1 | tail -200 > "$BUNDLE/logs.txt" || true
# Volumes, releases, and doctor
fly volumes list -a "$APP" > "$BUNDLE/volumes.txt" 2>&1 || true
fly releases -a "$APP" > "$BUNDLE/releases.txt" 2>&1 || true
fly doctor > "$BUNDLE/doctor.txt" 2>&1 || true
# Network and platform status
curl -s -o /dev/null -w "App endpoint: HTTP %{http_code}\n" \
"https://${APP}.fly.dev/" >> "$BUNDLE/summary.txt" 2>/dev/null || echo "App: unreachable" >> "$BUNDLE/summary.txt"
curl -s https://status.flyio.net/api/v2/status.json 2>/dev/null | \
jq -r '"Platform: " + .status.description' >> "$BUNDLE/summary.txt" || true
tar -czf "$BUNDLE.tar.gz" "$BUNDLE" && rm -rf "$BUNDLE"
echo "Bundle: $BUNDLE.tar.gz"
Analyzing the Bundle
tar -xzf debug-flyio-*.tar.gz
cat debug-flyio-*/summary.txt # Quick health overview
jq '.[] | {id, state, region}' debug-flyio-*/machines.json # Machine states
grep -i "error\|fail\|crash" debug-flyio-*/logs.txt # Error patterns
cat debug-flyio-*/doctor.txt # Fly.io self-diagnosis
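For bundles with many machines, the `jq` one-liner above can be complemented by a small summary step. This sketch assumes each entry in `machines.json` carries at least the `id`, `state`, and `region` fields that the `jq` filter already relies on; the function names are hypothetical.

```typescript
interface MachineSummary {
  id: string;
  state: string;
  region: string;
}

// Counts machines per state, e.g. { started: 2, stopped: 1 }.
function countByState(machines: MachineSummary[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const m of machines) {
    counts[m.state] = (counts[m.state] ?? 0) + 1;
  }
  return counts;
}

// Returns the machines worth a closer look: anything not in the "started" state.
function notStarted(machines: MachineSummary[]): MachineSummary[] {
  return machines.filter((m) => m.state !== "started");
}
```

The input would typically come from `JSON.parse` on the contents of `machines.json`. Because the collection script redirects stderr into the same file, parsing can fail when `fly machine list` itself errored, so wrapping the parse in a `try`/`catch` is advisable.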
Advanced Fly.
Read, Write, Edit, Bash(fly:*), Bash(curl:*), Grep
Fly.io Deploy Integration
Overview
Deploy edge applications on Fly.io with Docker containers and the fly.toml configuration file. This skill covers building production images optimized for Fly's micro-VM architecture, configuring fly.toml for services, health checks, and multi-region placement, verifying API connectivity from edge locations, and executing rolling updates with automatic rollback. Fly.io deploys as Firecracker micro-VMs, so containers start in under a second and scale to zero when idle.
Docker Configuration
FROM node:20-slim AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY tsconfig.json ./
COPY src/ ./src/
RUN npm run build
FROM node:20-slim
RUN addgroup --system app && adduser --system --ingroup app app
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./
USER app
EXPOSE 8080
CMD ["node", "dist/index.js"]
Fly.io Configuration
# fly.toml
app = "my-integration"
primary_region = "iad"
[build]
dockerfile = "Dockerfile"
[env]
LOG_LEVEL = "info"
PORT = "8080"
[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = "stop"
auto_start_machines = true
[[http_service.checks]]
interval = "30s"
timeout = "5s"
grace_period = "10s"
method = "GET"
path = "/health"
Environment Variables
export FLY_API_TOKEN="fo1_xxxxxxxxxxxx"
fly secrets set FLYIO_APP_NAME="my-integration"
fly secrets set LOG_LEVEL="info"
Health Check Endpoint
import express from 'express';
const app = express();
app.get('/health', async (req, res) => {
try {
const region = process.env.FLY_REGION || 'unknown';
const appName = process.env.FLY_APP_NAME || 'unknown';
res.json({ status: 'healthy', service: 'flyio-integration', region, app: appName, timestamp: new Date().toISOString() });
} catch (error) {
res.status(503).json({ status: 'unhealthy', error: (error as Error).message });
}
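});
// Listen on the port that fly.toml declares as internal_port (PORT is set in [env])
app.listen(Number(process.env.PORT) || 8080);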
Deployment Steps
Step 1: Build
fly launch --no-deploy
Step 2: Run
fly deploy --strategy rolling
Step 3: Verify
fly status
curl -s https://my-integration.fly.dev/health | jq .
Step 4: Rolling Update
fly deploy --strategy rolling --wait-timeout 300
fly releases --image
fly deploy --image <previous-image-ref> # if health check fails: redeploy an image listed by fly releases --image
Error Handling
Deploy your first app to Fly.
Read, Write, Edit, Bash(fly:*), Bash(curl:*), Bash(docker:*)
Fly.io Hello World
Overview
Deploy a minimal app to Fly.io using fly launch. Fly.io runs Docker containers on Firecracker microVMs across 30+ regions worldwide. Two paths: flyctl CLI (simple) or Machines API (programmatic).
Instructions
Step 1: Launch with flyctl
# Create a new directory with a Dockerfile
mkdir fly-hello && cd fly-hello
cat > Dockerfile << 'EOF'
FROM node:20-alpine
WORKDIR /app
COPY server.js .
EXPOSE 3000
CMD ["node", "server.js"]
EOF
cat > server.js << 'EOF'
const http = require('http');
const server = http.createServer((req, res) => {
res.writeHead(200, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({
message: 'Hello from Fly.io!',
region: process.env.FLY_REGION,
app: process.env.FLY_APP_NAME,
}));
});
server.listen(3000, () => console.log('Listening on :3000'));
EOF
# Launch — creates app, generates fly.toml, deploys
fly launch --name hello-fly --region iad --now
Step 2: Verify Deployment
# Check status
fly status
# Open in browser
fly open
# View logs
fly logs
# Test with cURL
curl https://hello-fly.fly.dev/
# {"message":"Hello from Fly.io!","region":"iad","app":"hello-fly"}
Step 3: Deploy via Machines API
const FLY_API = 'https://api.machines.dev';
const headers = {
'Authorization': `Bearer ${process.env.FLY_API_TOKEN}`,
'Content-Type': 'application/json',
};
// Create an app
const app = await fetch(`${FLY_API}/v1/apps`, {
method: 'POST',
headers,
body: JSON.stringify({
app_name: 'hello-api',
org_slug: 'personal',
}),
}).then(r => r.json());
// Create a machine in the app
const machine = await fetch(`${FLY_API}/v1/apps/hello-api/machines`, {
method: 'POST',
headers,
body: JSON.stringify({
region: 'iad',
config: {
image: 'nginx:alpine',
services: [{
ports: [{ port: 443, handlers: ['tls', 'http'] }],
protocol: 'tcp',
internal_port: 80,
}],
guest: { cpu_kind: 'shared', cpus: 1, memory_mb: 256 },
},
}),
}).then(r => r.json());
console.log(`Machine ${machine.id} created in ${machine.region}`);
Output
Machine e784079f004d86 created in iad
App URL: https://hello-api.fly.dev
Error Handling
| Error | Cause | Solution |
|---|---|---|
| No machines in group | App exists but no machines | Run fly deploy or create via API |
| Could not find image | | |
Install flyctl CLI and configure Fly.
Read, Write, Edit, Bash(fly:*), Bash(curl:*), Grep
Fly.io Install & Auth
Overview
Install flyctl CLI and configure authentication for Fly.io edge compute platform. Two auth methods: interactive login (opens browser) and API tokens (CI/CD and Machines API). The Machines API base URL is https://api.machines.dev.
Prerequisites
- Fly.io account at fly.io
- macOS, Linux, or WSL2
Instructions
Step 1: Install flyctl
# macOS / Linux
curl -L https://fly.io/install.sh | sh
# Or via Homebrew
brew install flyctl
# Verify
fly version
Step 2: Authenticate
# Interactive login (opens browser)
fly auth login
# Or with token (CI/CD)
fly auth token # Get current token
export FLY_API_TOKEN="fo1_your_token_here"
# Verify auth
fly auth whoami
Step 3: Create API Token for Machines API
# Create deploy token (scoped to an app)
fly tokens create deploy -a my-app
# Create org-level token
fly tokens create org
# Use with Machines API (the list endpoint takes an org_slug; "personal" is the default org)
curl -s -H "Authorization: Bearer $FLY_API_TOKEN" \
"https://api.machines.dev/v1/apps?org_slug=personal" | jq '.apps[].name'
Step 4: Verify Machines API Access
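const FLY_API = 'https://api.machines.dev';
// The list endpoint takes an org_slug and returns { total_apps, apps: [...] }
async function verifyFlyAccess(orgSlug = 'personal') {
  const res = await fetch(`${FLY_API}/v1/apps?org_slug=${encodeURIComponent(orgSlug)}`, {
    headers: { 'Authorization': `Bearer ${process.env.FLY_API_TOKEN}` },
  });
  // Surface auth and permission failures instead of parsing an error body
  if (!res.ok) throw new Error(`Machines API returned HTTP ${res.status}`);
  const { apps } = await res.json();
  console.log(`Connected. Found ${apps.length} apps.`);
  apps.forEach((app: any) => console.log(`  ${app.name}`));
}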
Token Types
| Token Type | Scope | Lifetime | Use Case |
|---|---|---|---|
| User token | All orgs/apps | Until revoked | Development, personal |
| Deploy token | Single app | Until revoked | CI/CD per app |
| Org token | All apps in org | Until revoked | Org-wide automation |
| Machines token | API access | Until revoked | Machines API calls |
Error Handling
| Error | Cause | Solution |
|---|---|---|
| Error: not authenticated | No token set | Run fly auth login or set FLY_API_TOKEN |
| 401 Unauthorized | Invalid/expired token | Regenerate with fly tokens create |
| Could not find app | Wrong app name | Check with fly apps list |
| flyctl not found | CLI not installed | Run |
Configure Fly.
Read, Write, Edit, Bash(fly:*), Bash(docker:*), Grep
Fly.io Local Dev Loop
Overview
Fast local development workflow for Fly.io apps: build and test Docker containers locally, proxy remote Fly services (Postgres, Redis) to localhost, and use fly deploy for integration testing.
Instructions
Step 1: Local Docker Testing
# Build and run locally — same Dockerfile used by Fly
docker build -t my-app .
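# Inside the container, "localhost" is the container itself, so reach the host's Postgres
# through host.docker.internal (the --add-host mapping is needed on Linux)
docker run -p 3000:3000 \
--add-host=host.docker.internal:host-gateway \
-e NODE_ENV=development \
-e DATABASE_URL="postgres://host.docker.internal:5432/dev" \
my-app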
# Test
curl http://localhost:3000/health
Step 2: Proxy Remote Fly Services
# Proxy Fly Postgres to localhost:5432
fly proxy 5432 -a my-db &
# Now use local tools against remote Fly Postgres
psql postgres://postgres:password@localhost:5432/mydb
npx prisma studio # Prisma GUI works against proxied DB
# Proxy Redis
fly proxy 6379 -a my-redis &
redis-cli -h localhost -p 6379
Step 3: Development fly.toml
# fly.dev.toml — dev overrides (not committed)
app = "my-app-dev"
primary_region = "iad"
[env]
NODE_ENV = "development"
LOG_LEVEL = "debug"
[http_service]
internal_port = 3000
auto_stop_machines = "off" # Keep running for debugging
min_machines_running = 1
[[vm]]
cpu_kind = "shared"
cpus = 1
memory = "256mb" # Smaller for dev
Step 4: Fast Deploy Cycle
# Deploy to dev app
fly deploy -a my-app-dev --config fly.dev.toml
# Watch logs while testing
fly logs -a my-app-dev &
# SSH in for debugging
fly ssh console -a my-app-dev
# Quick restart after config change
fly apps restart my-app-dev
Dev Scripts
{
"scripts": {
"dev": "tsx watch src/index.ts",
"docker:build": "docker build -t my-app .",
"docker:run": "docker run -p 3000:3000 --env-file .env.local my-app",
"fly:dev": "fly deploy -a my-app-dev --config fly.dev.toml",
"fly:proxy:db": "fly proxy 5432 -a my-db",
"fly:logs": "fly logs -a my-app-dev",
"fly:ssh": "fly ssh console -a my-app-dev"
}
}
Next Steps
See flyio-sdk-patterns for Machines API client patterns.
Optimize Fly.
Read, Write, Edit, Bash(fly:*)
Fly.io Performance Tuning
Overview
Optimize Fly.io performance: eliminate cold starts, right-size VMs, leverage multi-region for low latency, and tune concurrency settings.
Instructions
Step 1: Eliminate Cold Starts
# fly.toml — suspend instead of stop for faster resume (~100ms vs ~5s)
[http_service]
auto_stop_machines = "suspend" # Suspend to RAM, not full stop
auto_start_machines = true
min_machines_running = 1 # Always-warm in primary region
# min_machines_running only applies to machines in the primary region.
# For latency-critical apps, set auto_stop_machines = "off" to keep machines in every region running.
Step 2: Right-Size VMs
# Check current allocation
fly scale show -a my-app
# Start small, scale up based on metrics
fly scale vm shared-cpu-1x --memory 256 # Start here
fly scale vm shared-cpu-1x --memory 512 # If memory-constrained
fly scale vm shared-cpu-2x --memory 1024 # If CPU-bound
fly scale vm performance-2x --memory 4096 # For compute-heavy workloads
| Workload | VM | Memory | When |
|---|---|---|---|
| Static site / API proxy | shared-cpu-1x | 256mb | Low traffic |
| Node.js API | shared-cpu-1x | 512mb | Most apps |
| Heavy processing | shared-cpu-2x | 1gb | Background jobs |
| Database / ML | performance-2x | 4gb | Compute-intensive |
Step 3: Multi-Region Latency Optimization
# Deploy close to your users
fly scale count 1 --region iad # US East
fly scale count 1 --region lhr # Europe
fly scale count 1 --region nrt # Asia Pacific
# Fly automatically routes to nearest region via Anycast
# Verify: curl with timing
curl -w "DNS: %{time_namelookup}s, Connect: %{time_connect}s, Total: %{time_total}s\n" \
-o /dev/null -s https://my-app.fly.dev/health
Step 4: Connection Pooling for Postgres
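// Fly Postgres does not bundle PgBouncer. Port 5432 goes through the cluster's proxy,
// which routes to the current primary; port 5433 connects straight to a single instance,
// which may be a read replica. Pool connections in the app (for example with pg.Pool)
// or run a PgBouncer of your own in front of the cluster.
const directUrl = process.env.DATABASE_URL?.replace(':5432/', ':5433/'); // direct, possibly read-only
// Prisma: add pgbouncer=true only when a PgBouncer you operate sits in front of the database
// DATABASE_URL="postgres://user:pass@<your-pgbouncer-host>:<port>/db?pgbouncer=true"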
Step 5: Tune Concurrency
[http_service.concurrency]
type = "requests" # or "connections"
hard_limit = 250 # Max before rejecting
soft_limit = 200 # Start scaling at this point
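One way to reason about these two limits is as thresholds on a machine's current load. The sketch below is a simplified mental model, not Fly Proxy's actual routing algorithm: below `soft_limit` a machine takes traffic normally, between the limits other machines are preferred, and at `hard_limit` new requests are turned away. The type and function names are hypothetical.

```typescript
type LoadDecision = "accept" | "prefer-other-machine" | "reject";

// Simplified model of how soft_limit and hard_limit partition load.
function classifyLoad(current: number, softLimit: number, hardLimit: number): LoadDecision {
  if (softLimit > hardLimit) {
    throw new RangeError("soft_limit must not exceed hard_limit");
  }
  if (current >= hardLimit) return "reject";
  if (current >= softLimit) return "prefer-other-machine";
  return "accept";
}
```

With the values above, a machine at 120 concurrent requests would still accept traffic, one at 220 would be deprioritized, and one at 250 would be at its hard limit.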
Execute Fly.
Read, Bash(fly:*), Bash(curl:*), Grep
Fly.io Production Checklist
Overview
Fly.io runs applications on edge infrastructure across 30+ regions with Machines, Volumes, and managed Postgres. A production deployment requires multi-region redundancy, proper secret management, health checks, and rollback procedures. Misconfigured auto-scaling means cold starts; missing volume backups mean data loss. This checklist ensures your Fly.io app is production-hardened.
Authentication & Secrets
- [ ] FLY_API_TOKEN stored in CI secrets (never in fly.toml or source)
- [ ] All app secrets set via fly secrets (not the [env] block)
- [ ] Deploy tokens scoped per app (not org-wide personal tokens)
- [ ] Key rotation scheduled (quarterly, or after team changes)
- [ ] No hardcoded secrets in Dockerfile or codebase
API Integration
- [ ] Production base URL: app deployed to https://<app-name>.fly.dev
- [ ] force_https = true in fly.toml [http_service]
- [ ] Custom domain with TLS certificate active and auto-renewing
- [ ] min_machines_running = 1 to avoid cold starts
- [ ] Machines deployed in 2+ regions for redundancy
- [ ] Concurrency limits tuned (soft_limit/hard_limit per workload)
- [ ] Volumes backed up if using persistent storage
Error Handling & Resilience
- [ ] Health check endpoint configured with appropriate grace period
- [ ] Graceful shutdown handles SIGTERM within 10s window
- [ ] Auto-stop/auto-start configured for cost optimization
- [ ] Postgres standby replica provisioned for database apps
- [ ] Rollback procedure tested: redeploy a previous image from fly releases --image using fly deploy --image
- [ ] Dockerfile builds and runs identically local vs deployed
Monitoring & Alerting
- [ ] fly logs streaming configured for centralized logging
- [ ] Machine health monitored via fly machine status
- [ ] Platform status checked: https://status.flyio.net
- [ ] Alert on health check failures across any region
- [ ] VM resource utilization tracked (fly scale show)
Validation Script
async function checkFlyioReadiness(): Promise<void> {
const checks: { name: string; pass: boolean; detail: string }[] = [];
// Fly.io API connectivity
try {
const res = await fetch('https://api.machines.dev/v1/apps', {
headers: { Authorization: `Bearer ${process.env.FLY_API_TOKEN}` },
});
checks.push({ name: 'Fly API', pass: res.ok, detail: res.ok ? 'Connected' : `HTTP ${res.status}` });
} catch (e: any) { checks.push({ name: 'Fly API', pass: false, detail: e.message }); }
// Token present
checks.push({ name: 'API
Handle Fly.
Read, Write, Edit
Fly.io Rate Limits
Overview
The Fly.io Machines API rate-limits per organization, with write operations (create, delete, update) throttled much more aggressively than reads. Deploying fleets of edge machines across multiple regions can easily trigger 429s, especially during rolling deployments or auto-scaling events. The API returns a Retry-After header on rate-limited responses, and organizations running 50+ machines should implement client-side token bucket limiting to avoid cascading failures during high-churn operations.
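As a minimal sketch of honoring the `Retry-After` header mentioned above, the helper below converts the header value into a wait time in milliseconds. It assumes the header carries delta-seconds (an HTTP-date value falls back to the default), and the name `retryDelayMs`, the 10-second fallback, and the optional jitter parameter are illustrative choices rather than documented Fly.io behavior.

```typescript
// Converts a Retry-After header (delta-seconds) into a delay in milliseconds.
// Missing, negative, or non-numeric values fall back to fallbackSeconds.
function retryDelayMs(
  retryAfterHeader: string | null,
  fallbackSeconds = 10,
  jitterMs = 0,
): number {
  const parsed = retryAfterHeader === null ? NaN : Number.parseInt(retryAfterHeader, 10);
  const seconds = Number.isFinite(parsed) && parsed >= 0 ? parsed : fallbackSeconds;
  // Jitter is passed in by the caller so the function stays deterministic.
  return seconds * 1000 + jitterMs;
}
```

A caller would typically pass `res.headers.get("Retry-After")` and a random jitter such as `Math.random() * 2000`, so that many clients throttled at the same moment do not retry in lockstep.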
Rate Limit Reference
| Endpoint | Limit | Window | Scope |
|---|---|---|---|
| Machine create/delete | 10 req | 1 minute | Per org |
| Machine start/stop | 30 req | 1 minute | Per org |
| Machine list/get | 120 req | 1 minute | Per org |
| App create/delete | 5 req | 1 minute | Per org |
| Volume operations | 15 req | 1 minute | Per org |
Rate Limiter Implementation
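class FlyRateLimiter {
  private tokens: number;
  private lastRefill: number;
  private readonly max: number;
  private readonly refillRate: number; // tokens per millisecond
  private queue: Array<() => void> = [];
  private timer: ReturnType<typeof setTimeout> | null = null;
  constructor(maxPerMinute: number) {
    this.max = maxPerMinute;
    this.tokens = maxPerMinute;
    this.lastRefill = Date.now();
    this.refillRate = maxPerMinute / 60_000;
  }
  async acquire(): Promise<void> {
    this.refill();
    // Take a token directly only when nobody is already waiting (keeps FIFO order)
    if (this.queue.length === 0 && this.tokens >= 1) { this.tokens -= 1; return; }
    return new Promise(resolve => { this.queue.push(resolve); this.schedule(); });
  }
  private refill() {
    const now = Date.now();
    this.tokens = Math.min(this.max, this.tokens + (now - this.lastRefill) * this.refillRate);
    this.lastRefill = now;
    while (this.tokens >= 1 && this.queue.length) {
      this.tokens -= 1;
      this.queue.shift()!();
    }
  }
  // Wake up when the next token is due; without this, queued callers would
  // only be released if some later acquire() happened to trigger a refill
  private schedule() {
    if (this.timer || this.queue.length === 0) return;
    const waitMs = Math.max(1, Math.ceil((1 - this.tokens) / this.refillRate));
    this.timer = setTimeout(() => { this.timer = null; this.refill(); this.schedule(); }, waitMs);
  }
}
const writeLimiter = new FlyRateLimiter(8); // leave headroom under 10/min
const readLimiter = new FlyRateLimiter(100);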
Retry Strategy
async function flyRetry<T>(fn: () => Promise<Response>, maxRetries = 4): Promise<T> {
for (let attempt = 0; attempt <= maxRetries; attempt++) {
const res = await fn();
if (res.ok) return res.json();
if (res.status === 429) {
const retryAfter = parseInt(res.headers.get("Retry-After") || "10", 10);
const delay = retryAfter * 1000 + Math.random() * 2000;
await new Promise(r => setTimeout(r, delay));
continue;
}
if (res.status >= 500 && attempt < maxRetries) {
await new Promise(r => setTimeout(r, Math.pow(2, attempt) * 1000));
continue;
}
Implement Fly.
Read, Write, Edit
Fly.io Reference Architecture
Overview
Production architecture for Fly.io: multi-region web tier, Postgres with read replicas, Redis for caching, background workers, and private networking.
Architecture
       ┌─────────── Fly.io Anycast DNS ───────────┐
       │                                          │
┌──────▼──────┐  ┌──────────────┐   ┌─────────────▼───┐
│ Web (iad)   │  │ Web (lhr)    │   │ Web (nrt)       │
│ shared-1x   │  │ shared-1x    │   │ shared-1x       │
└──────┬──────┘  └──────┬───────┘   └────────┬────────┘
       │                │                    │
───────┴────────────────┴────────────────────┴─── .internal DNS
       │                │                    │
┌──────▼──────┐  ┌──────▼───────┐   ┌────────▼────────┐
│ Postgres    │  │ Postgres     │   │ Redis           │
│ Primary     │  │ Replica      │   │ (upstash.io)    │
│ (iad)       │  │ (lhr)        │   │                 │
└─────────────┘  └──────────────┘   └─────────────────┘
       │
┌──────▼──────┐
│ Worker      │
│ (iad)       │
│ shared-1x   │
└─────────────┘
Setup Commands
# 1. Web app — multi-region
fly launch --name my-web --region iad
fly scale count 1 --region lhr
fly scale count 1 --region nrt
# 2. Postgres with replica
fly postgres create --name my-db --region iad
fly postgres attach my-db -a my-web
# Add read replica in Europe
fly machine clone <primary-machine-id> --region lhr -a my-db
# 3. Background worker (same codebase, different process)
fly launch --name my-worker --region iad --no-deploy
# fly.toml for worker: no [http_service], use [processes]
# 4. All communicate via .internal DNS
# my-db.internal:5432 (Postgres)
# my-web.internal:3000 (internal API)
fly.toml Configurations
Web App
app = "my-web"
primary_region = "iad"
[http_service]
internal_port = 3000
force_https = true
auto_stop_machines = "suspend"
min_machines_running = 1
[[vm]]
cpu_kind = "shared"
cpus = 1
memory = "512mb"
Background Worker
app = "my-worker"
primary_region = "iad"
[processes]
worker = "node dist/worker.js"
# No [http_service] — worker doesn't serve HTTP
[[vm]]
cpu_kind = "shared"
cpus = 1
memory = "512mb"
Key Design Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Web tier | 3 regions | Low latency for global users |
| Database | Fly Postgres + replica | Read replicas near users |
| Cache | Upstash Redis (or Fly Redis) | Managed, multi-region |
| Workers | Separate Fly app | |
Apply production-ready Fly.
Read, Write, Edit
Fly.io SDK Patterns
Overview
Production-ready patterns for the Fly.io Machines REST API at https://api.machines.dev. Fly.io exposes both GraphQL (organization queries) and REST (machine lifecycle) APIs. The Machines REST API is the primary integration surface for creating, starting, stopping, and destroying VMs across 30+ global regions. A structured client ensures consistent auth, typed machine states, and reliable wait-for-state polling.
Singleton Client
const FLY_API = 'https://api.machines.dev';
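// One client per app name; a single shared instance would silently keep
// talking to whichever app was requested first
const _clients = new Map<string, FlyClient>();
export function getClient(appName: string): FlyClient {
  let client = _clients.get(appName);
  if (!client) {
    const token = process.env.FLY_API_TOKEN;
    if (!token) throw new Error('FLY_API_TOKEN must be set');
    client = new FlyClient(appName, token);
    _clients.set(appName, client);
  }
  return client;
}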
class FlyClient {
private h: Record<string, string>;
constructor(private app: string, token: string) {
this.h = { 'Authorization': `Bearer ${token}`, 'Content-Type': 'application/json' };
}
async listMachines(): Promise<FlyMachine[]> {
const r = await fetch(`${FLY_API}/v1/apps/${this.app}/machines`, { headers: this.h });
if (!r.ok) throw new FlyError(r.status, await r.text()); return r.json();
}
async createMachine(config: MachineConfig, region: string): Promise<FlyMachine> {
const r = await fetch(`${FLY_API}/v1/apps/${this.app}/machines`, {
method: 'POST', headers: this.h, body: JSON.stringify({ region, config }) });
if (!r.ok) throw new FlyError(r.status, await r.text()); return r.json();
}
async waitForState(id: string, state: string, timeout = 30): Promise<void> {
const r = await fetch(`${FLY_API}/v1/apps/${this.app}/machines/${id}/wait?state=${state}&timeout=${timeout}`,
{ headers: this.h });
if (!r.ok) throw new FlyError(r.status, `Wait for ${state} timed out`);
}
}
Error Wrapper
export class FlyError extends Error {
constructor(public status: number, message: string) { super(message); this.name = 'FlyError'; }
}
export async function safeCall<T>(operation: string, fn: () => Promise<T>): Promise<T> {
try { return await fn(); }
catch (err: any) {
if (err instanceof FlyError && err.status === 429) { await new Promise(r => setTimeout(r, 2000)); return fn(); }
if (err instanceof FlyError && err.status === 401) throw new FlyError(401, 'Invalid FLY_API_TOKEN');
throw new FlyError(err.status ?? 0, `${operation} failed: ${err.message}`);
}
}
Request Builder
class DeployBuilder {
private regions: string[] = []; private config: Partial<MachineConfig> = {};
toRegions(...r: string[]) { this.regions = r; return this; }
withImage(img: string) { this.config.image =
Apply Fly.
Read, Write, Edit, Bash(fly:*)
Fly.io Security Basics
Overview
Fly.io deploys applications to edge locations worldwide using Firecracker microVMs. Security concerns center on deploy token scoping (org-wide vs per-app), secrets management (encrypted at rest, injected as env vars), private networking via WireGuard mesh (6PN), and TLS certificate management. A leaked deploy token can push arbitrary code to production machines across all regions.
API Key Management
function validateFlyToken(): void {
const token = process.env.FLY_API_TOKEN;
if (!token) {
throw new Error("Missing FLY_API_TOKEN — use `fly tokens create deploy -a <app>`");
}
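  // Never log tokens; log only the token format for debugging.
  // Tokens from `fly tokens create` (deploy and org tokens alike) start with "FlyV1",
  // so the prefix identifies the format, not the scope.
  const isFlyV1 = token.startsWith("FlyV1");
  console.log("Fly.io token loaded, format:", isFlyV1 ? "FlyV1 (deploy or org)" : "legacy user token");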
}
Webhook Signature Verification
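import crypto from "crypto";
import { Request, Response, NextFunction } from "express";
// Fly.io itself does not send webhooks; this guards endpoints that receive signed
// callbacks from your own tooling. The x-fly-signature header name is this guide's convention.
// Mount after express.raw({ type: "application/json" }) so req.body is the exact bytes that were signed.
function verifyFlyWebhook(req: Request, res: Response, next: NextFunction): void {
  const signature = req.headers["x-fly-signature"];
  const secret = process.env.FLY_WEBHOOK_SECRET;
  if (typeof signature !== "string" || !secret) {
    res.status(401).send("Invalid signature");
    return;
  }
  const expected = crypto.createHmac("sha256", secret).update(req.body).digest("hex");
  const received = Buffer.from(signature);
  const wanted = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so compare lengths first
  if (received.length !== wanted.length || !crypto.timingSafeEqual(received, wanted)) {
    res.status(401).send("Invalid signature");
    return;
  }
  next();
}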
Input Validation
import { z } from "zod";
const FlyDeploySchema = z.object({
app_name: z.string().regex(/^[a-z0-9-]+$/).max(63),
region: z.enum(["iad", "ord", "lax", "sjc", "ams", "lhr", "nrt", "syd", "gru"]),
image: z.string().regex(/^registry\..+\/.+:.+$/),
vm_size: z.enum(["shared-cpu-1x", "shared-cpu-2x", "performance-1x", "performance-2x"]).optional(),
min_machines: z.number().int().min(0).max(20).optional(),
});
function validateDeployConfig(data: unknown) {
return FlyDeploySchema.parse(data);
}
Data Protection
const FLY_SENSITIVE_FIELDS = ["fly_api_token", "deploy_token", "db_password", "wireguard_private_key", "tls_private_key"];
function redactFlyLog(record: Record<string, unknown>): Record<string, unknown> {
const redacted = { ...record };
for (const field of FLY_SENSITIVE_FIELDS) {
if (field in redacted) redacted[field] = "[REDACTED]";
}
return redacted;
}
Security Checklist
- [ ] All sensitive values in
fly secrets, never in [env] se
Migrate between Fly.
Read, Write, Edit, Bash(fly:*), Grep
Fly.io Upgrade & Migration
Overview
Guide for Fly.io platform migrations: Apps v1 (Nomad) to v2 (Machines), flyctl CLI upgrades, Postgres major version upgrades, and region migrations.
Instructions
Apps v1 to v2 Migration
# Check current platform version
fly status -a my-app # Look for "Platform: machines" vs "nomad"
# Migrate to Apps v2 (Machines)
fly migrate-to-v2 -a my-app
# Verify
fly status -a my-app
fly machine list -a my-app
flyctl CLI Upgrade
# Check current version
fly version
# Upgrade
fly version update
# Or reinstall
curl -L https://fly.io/install.sh | sh
Postgres Major Version Upgrade
# Check current version
fly postgres connect -a my-db
# psql> SELECT version();
# Create new cluster with target version
fly postgres create --name my-db-v16 --region iad --image-ref flyio/postgres-flex:16
# Migrate data
fly postgres import <source-postgres-uri> -a my-db-v16
# Update app to point to new cluster
fly postgres detach my-db -a my-app
fly postgres attach my-db-v16 -a my-app
fly deploy -a my-app # Picks up new DATABASE_URL
Region Migration
# Add machines in new region
fly scale count 1 --region fra -a my-app
# Verify new region is healthy
fly status -a my-app
# Remove machines from old region
fly scale count 0 --region iad -a my-app
# For volumes: create new volume, migrate data, destroy old
fly volumes create data --size 10 --region fra -a my-app
Migration Checklist
- [ ] Current state documented (fly status, fly scale show)
- [ ] Database backed up before migration
- [ ] Tested migration in staging app first
- [ ] DNS/certificates transferred if changing domains
- [ ] Monitoring confirms healthy after cutover
- [ ] Old resources cleaned up
Next Steps
For CI integration, see flyio-ci-integration.
Implement Fly.
Read, Write, Edit, Bash(fly:*), Bash(curl:*)
Fly.io Events & Monitoring
Overview
Fly.io does not have traditional webhooks. Instead, monitor machine state changes via the Machines API, process structured logs via fly logs, and use health check endpoints for automated responses.
Instructions
Step 1: Poll Machine State Changes
// Monitor machine state transitions via Machines API.
// Returns a function that stops the polling loop.
function watchMachines(appName: string, callback: (event: MachineEvent) => void): () => void {
  const client = new FlyClient(appName, process.env.FLY_API_TOKEN!);
  const stateCache = new Map<string, string>();
  const timer = setInterval(async () => {
    try {
      const machines = await client.listMachines();
      for (const m of machines) {
        const prev = stateCache.get(m.id);
        if (prev && prev !== m.state) {
          callback({
            machineId: m.id,
            region: m.region,
            previousState: prev,
            currentState: m.state,
            timestamp: new Date(),
          });
        }
        stateCache.set(m.id, m.state);
      }
    } catch (err) {
      // A failed poll must not become an unhandled rejection; the next tick retries
      console.error('Machine poll failed:', err);
    }
  }, 10_000); // Check every 10 seconds
  return () => clearInterval(timer);
}
interface MachineEvent {
machineId: string;
region: string;
previousState: string;
currentState: string;
timestamp: Date;
}
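The change-detection step inside the polling loop can be separated from the network call, which makes it easy to test. The sketch below is one way to do that under the same assumptions as the watcher above (machines expose `id`, `state`, and `region`); `diffStates` and the two interfaces are hypothetical names.

```typescript
interface MachineState {
  id: string;
  state: string;
  region: string;
}

interface StateChange {
  machineId: string;
  region: string;
  previousState: string;
  currentState: string;
}

// Compares a fresh machine list against the cached states, returns the
// transitions, and updates the cache in place for the next poll.
function diffStates(cache: Map<string, string>, machines: MachineState[]): StateChange[] {
  const changes: StateChange[] = [];
  for (const m of machines) {
    const prev = cache.get(m.id);
    // First sighting of a machine is recorded but not reported as a change.
    if (prev !== undefined && prev !== m.state) {
      changes.push({ machineId: m.id, region: m.region, previousState: prev, currentState: m.state });
    }
    cache.set(m.id, m.state);
  }
  return changes;
}
```

Note that machines that disappear between polls are not reported by this sketch; detecting destroyed machines would need a pass over cache keys missing from the new list.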
Step 2: Health Check Event Handler
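// Implement a health check that reports machine health.
// Fly Proxy uses passing and failing checks to decide which machines receive traffic;
// a failing check takes the machine out of rotation but does not restart it.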
import express from 'express';
const app = express();
app.get('/health', async (req, res) => {
const checks = {
database: await checkPostgres(),
redis: await checkRedis(),
memory: process.memoryUsage().heapUsed < 500 * 1024 * 1024, // < 500MB
};
const healthy = Object.values(checks).every(Boolean);
res.status(healthy ? 200 : 503).json({
status: healthy ? 'healthy' : 'unhealthy',
region: process.env.FLY_REGION,
machine: process.env.FLY_MACHINE_ID,
checks,
});
});
Step 3: Structured Log Processing
# Stream logs and process with jq
fly logs -a my-app --json | jq -c 'select(.level == "error")' | while read -r line; do
echo "$line" >> errors.jsonl
# Send to Slack, PagerDuty, etc.
done
# Search recent logs for specific patterns
fly logs -a my-app --no-tail | grep -i "error\|crash\|oom"
Step 4: Deployment Event Notifications
# Post-deploy notification in CI
fly deploy -a my-app && \
curl -X POST "$SLACK_WEBHOOK_URL" \
-H "Content-Type: application/json" \
-d "{\"text\": \"Deployed my-app to Fly.io. Status: $(fly status -a my-app --json | jq -r '.Status')\"}"