castai-debug-bundle

Collect a CAST AI diagnostic bundle for support tickets and troubleshooting.

Allowed Tools

Read, Bash(kubectl:*), Bash(curl:*), Bash(tar:*), Bash(helm:*), Grep

Provided by Plugin

castai-pack

Claude Code skill pack for Cast AI (18 skills)

saas packs v1.0.0
Installation

This skill is included in the castai-pack plugin:

/plugin install castai-pack@claude-code-plugins-plus

Instructions

CAST AI Debug Bundle

Overview

Collect CAST AI component logs (the last 200 lines per deployment), cluster state, and configuration into a single archive for troubleshooting or support tickets. The bundle captures pod status, Helm releases, autoscaler policies, node inventory, RBAC cluster roles, and recent events from the castai-agent namespace.

Prerequisites

  • kubectl access to the cluster running CAST AI
  • CASTAI_API_KEY and CASTAI_CLUSTER_ID exported in the environment (if either is missing, the script skips the API calls)
  • helm, curl, and jq installed
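
The prerequisites above can be checked before running the bundle script. The following is a minimal sketch, not part of castai-pack: the file name preflight.sh and both helper functions are hypothetical. curl, jq, and tar are on the list because the script in Step 1 calls them.

```shell
#!/bin/bash
# preflight.sh (hypothetical helper, not shipped with castai-pack)

# Print the name of every listed command that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Print the name of every listed environment variable that is unset or empty.
# printenv only sees exported variables, which is what the bundle script inherits.
missing_vars() {
  for var in "$@"; do
    [ -n "$(printenv "$var")" ] || echo "$var"
  done
}

missing=$(missing_tools kubectl helm curl jq tar)
unset_vars=$(missing_vars CASTAI_API_KEY CASTAI_CLUSTER_ID)

# Unquoted $missing joins the one-per-line names with spaces for display.
[ -z "$missing" ] || echo "Missing tools: $(echo $missing)"
[ -z "$unset_vars" ] || echo "Unset variables (API calls will be skipped): $(echo $unset_vars)"
```

The sketch only reports problems and does not exit with an error, since the bundle is still useful without the API data.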

Instructions

Step 1: Run the Debug Bundle Script


#!/bin/bash
# castai-debug-bundle.sh
set -euo pipefail

BUNDLE_DIR="castai-debug-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BUNDLE_DIR"

echo "=== CAST AI Debug Bundle ===" | tee "$BUNDLE_DIR/summary.txt"
echo "Generated: $(date -u)" | tee -a "$BUNDLE_DIR/summary.txt"
echo "Cluster ID: ${CASTAI_CLUSTER_ID:-unknown}" | tee -a "$BUNDLE_DIR/summary.txt"

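# 1. Helm releases (a failure is noted in the summary instead of aborting under set -e)
echo "--- Helm Releases ---" >> "$BUNDLE_DIR/summary.txt"
helm list -n castai-agent -o yaml > "$BUNDLE_DIR/helm-releases.yaml" 2>&1 \
  || echo "helm list failed" >> "$BUNDLE_DIR/summary.txt"

# 2. Pod status
echo "--- Pod Status ---" >> "$BUNDLE_DIR/summary.txt"
kubectl get pods -n castai-agent -o wide > "$BUNDLE_DIR/pod-status.txt" 2>&1 \
  || echo "kubectl get pods failed" >> "$BUNDLE_DIR/summary.txt"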

# 3. Agent logs (last 200 lines each)
for deploy in castai-agent cluster-controller castai-evictor castai-spot-handler castai-workload-autoscaler; do
  kubectl logs -n castai-agent "deployment/$deploy" --tail=200 \
    > "$BUNDLE_DIR/${deploy}-logs.txt" 2>&1 || echo "Not found: $deploy" >> "$BUNDLE_DIR/summary.txt"
done

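# 4. Cluster events (castai-agent namespace only)
kubectl get events -n castai-agent --sort-by='.lastTimestamp' \
  > "$BUNDLE_DIR/events.txt" 2>&1 || echo "kubectl get events failed" >> "$BUNDLE_DIR/summary.txt"

# 5. Node inventory
kubectl get nodes -o wide > "$BUNDLE_DIR/nodes.txt" 2>&1 \
  || echo "kubectl get nodes failed" >> "$BUNDLE_DIR/summary.txt"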

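# 6. API cluster status (the key is sent only as a request header and is never written to the bundle)
if [ -n "${CASTAI_API_KEY:-}" ] && [ -n "${CASTAI_CLUSTER_ID:-}" ]; then
  # -f makes curl exit non-zero on HTTP errors; with pipefail the || branch records the failure
  curl -sf -H "X-API-Key: ${CASTAI_API_KEY}" \
    "https://api.cast.ai/v1/kubernetes/external-clusters/${CASTAI_CLUSTER_ID}" \
    | jq '{name, status, agentStatus, providerType, createdAt}' \
    > "$BUNDLE_DIR/cluster-status.json" 2>&1 \
    || echo "Cluster status API call failed" >> "$BUNDLE_DIR/summary.txt"

  curl -sf -H "X-API-Key: ${CASTAI_API_KEY}" \
    "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/policies" \
    > "$BUNDLE_DIR/policies.json" 2>&1 \
    || echo "Policies API call failed" >> "$BUNDLE_DIR/summary.txt"
else
  echo "API checks skipped: CASTAI_API_KEY or CASTAI_CLUSTER_ID not set" >> "$BUNDLE_DIR/summary.txt"
fi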

# 7. RBAC check
kubectl get clusterrole -l app.kubernetes.io/managed-by=castai \
  > "$BUNDLE_DIR/rbac.txt" 2>&1 || echo "kubectl get clusterrole failed" >> "$BUNDLE_DIR/summary.txt"

# 8. Package bundle
tar -czf "$BUNDLE_DIR.tar.gz" "$BUNDLE_DIR"
rm -rf "$BUNDLE_DIR"
echo "Bundle created: $BUNDLE_DIR.tar.gz"
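
Because the script removes the working directory after packaging, the archive is the only place left to confirm what was collected. The sketch below lists the archive with tar -tzf and reports which of the always-written files are missing. The helper name and file name are hypothetical; the expected file names are taken from the script above, and the per-deployment logs and API outputs are left out because they are optional.

```shell
#!/bin/bash
# verify-bundle.sh <bundle.tar.gz>  (hypothetical helper, not shipped with castai-pack)

# Files the bundle script writes on every run.
EXPECTED_FILES="summary.txt helm-releases.yaml pod-status.txt events.txt nodes.txt rbac.txt"

# Print each expected file that is absent from the archive given as $1.
missing_from_bundle() {
  listing=$(tar -tzf "$1") || return 1
  for f in $EXPECTED_FILES; do
    echo "$listing" | grep -q "/$f\$" || echo "$f"
  done
}

# Only run when an archive path is passed on the command line.
if [ "$#" -ge 1 ]; then
  missing_from_bundle "$1"
fi
```

An empty result means every expected file is present; it says nothing about whether the individual commands succeeded, which summary.txt records.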

Step 2: Review Before Sharing

Safe to include:

  • Pod logs (CAST AI agent logs are not expected to contain secrets, but skim them before sharing)
  • Helm release metadata
  • Node names and instance types
  • Autoscaler policies
  • Cluster events

Redact before sharing:

  • API keys (the script never writes them)
  • Custom environment variables
  • Internal hostnames if sensitive
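
One way to carry out this review is to extract the archive into a temporary directory and search it. The sketch below is an assumption about how such a check could look, not part of castai-pack: the helper names are hypothetical and the keyword pattern is only a starting point, so a match is a prompt for a manual look rather than proof of a leak.

```shell
#!/bin/bash
# review-bundle.sh <bundle.tar.gz>  (hypothetical helper, not shipped with castai-pack)

# Print the files under directory $2 that contain the literal string $1.
files_containing() {
  grep -rlF -- "$1" "$2" 2>/dev/null || true
}

# Print file:line:text for lines under directory $1 that mention secret-like keywords.
suspicious_lines() {
  grep -rniE 'api[-_]?key|token|password|secret' "$1" 2>/dev/null || true
}

# Only run when an archive path is passed on the command line.
if [ "$#" -ge 1 ]; then
  workdir=$(mktemp -d)
  tar -xzf "$1" -C "$workdir"
  if [ -n "${CASTAI_API_KEY:-}" ]; then
    echo "Files containing the API key value:"
    files_containing "$CASTAI_API_KEY" "$workdir"
  fi
  echo "Lines worth a manual look:"
  suspicious_lines "$workdir"
  echo "Extracted copy left in: $workdir"
fi
```

If anything needs redacting, edit the extracted copy and repackage it with tar -czf before attaching it to a ticket.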

Step 3: Submit to CAST AI Support

  1. Generate bundle: bash castai-debug-bundle.sh
  2. Attach castai-debug-*.tar.gz to support ticket at support.cast.ai
  3. Include your cluster ID and a description of the issue

Error Handling

| Issue | Cause | Solution |
|---|---|---|
| kubectl permission denied | Missing RBAC | Use cluster-admin kubeconfig |
| Empty log files | Pod not running | Note which components are down |
| API call fails | Key expired | Bundle still useful with kubectl data |
| tar fails | Disk full | Clean temp files first |
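
Two of these failures can be caught before the bundle script runs. The sketch below checks free disk space with df and read access with kubectl auth can-i. It is an illustration under stated assumptions: the helper names are hypothetical, and the 50 MiB threshold is an arbitrary guess, not a documented bundle size.

```shell
#!/bin/bash
# Hypothetical pre-run checks, not shipped with castai-pack.

# Free space, in KiB, on the filesystem holding directory $1 (default: current directory).
free_kib() {
  df -Pk "${1:-.}" | awk 'NR==2 {print $4}'
}

# Succeed if at least $1 KiB are free under directory $2.
has_free_kib() {
  [ "$(free_kib "${2:-.}")" -ge "$1" ]
}

# 51200 KiB = 50 MiB; an assumed threshold, adjust to taste.
if ! has_free_kib 51200 .; then
  echo "Less than 50 MiB free here; tar may fail" >&2
fi

# kubectl auth can-i exits non-zero when the answer is "no".
if command -v kubectl >/dev/null 2>&1; then
  for resource in pods events; do
    kubectl auth can-i get "$resource" -n castai-agent >/dev/null 2>&1 \
      || echo "No permission to get $resource in castai-agent" >&2
  done
  kubectl auth can-i list nodes >/dev/null 2>&1 \
    || echo "No permission to list nodes" >&2
fi
```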

Next Steps

For rate limit handling, see castai-rate-limits.
