castai-cost-tuning

Maximize Kubernetes cost savings with CAST AI spot strategies and right-sizing.

Allowed Tools

Read, Write, Edit, Bash(curl:*), Grep

Provided by Plugin

castai-pack

Claude Code skill pack for Cast AI (18 skills)

saas packs v1.0.0

Installation

This skill is included in the castai-pack plugin:

/plugin install castai-pack@claude-code-plugins-plus

Instructions

CAST AI Cost Tuning

Overview

Maximize Kubernetes cost savings through CAST AI: spot instance strategies, workload right-sizing, cluster hibernation, and savings tracking. Typical savings: 50-70% on cloud compute costs.

Prerequisites

  • CAST AI Phase 2 enabled with full automation
  • Savings report available (requires 24h+ of data)
  • Understanding of workload criticality tiers

Instructions

Step 1: Analyze Current Savings


# Get savings breakdown
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
  "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/savings" \
  | jq '{
    currentMonthlyCost: .currentMonthlyCost,
    optimizedMonthlyCost: .optimizedMonthlyCost,
    monthlySavings: .monthlySavings,
    savingsPercentage: .savingsPercentage,
    spotSavings: .spotSavings,
    rightSizingSavings: .rightSizingSavings
  }'
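
As a cross-check on the reported `savingsPercentage`, the figure can be recomputed from the two cost fields. The sketch below is a minimal example, assuming the response carries `currentMonthlyCost` and `optimizedMonthlyCost` as plain numbers (as the jq projection above suggests); the `SavingsBreakdown` type and `savingsPercent` function are illustrative names, not part of the CAST AI API.

```typescript
// Assumed shape: field names mirror the jq projection above.
interface SavingsBreakdown {
  currentMonthlyCost: number;
  optimizedMonthlyCost: number;
}

// Returns savings as a percentage of current cost, or 0 when there is no spend.
function savingsPercent(s: SavingsBreakdown): number {
  if (s.currentMonthlyCost <= 0) return 0;
  const saved = s.currentMonthlyCost - s.optimizedMonthlyCost;
  return (saved * 100) / s.currentMonthlyCost;
}

// Example: 10,000 current and 4,000 optimized gives 60 (percent).
const exampleSavingsPercent = savingsPercent({
  currentMonthlyCost: 10000,
  optimizedMonthlyCost: 4000,
});
```

If the recomputed value differs noticeably from the API's `savingsPercentage`, the two figures may be based on different time windows.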

Step 2: Maximize Spot Usage


# Enable aggressive spot with diversity and fallbacks
curl -X PUT -H "X-API-Key: ${CASTAI_API_KEY}" \
  -H "Content-Type: application/json" \
  "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/policies" \
  -d '{
    "enabled": true,
    "spotInstances": {
      "enabled": true,
      "clouds": ["aws"],
      "spotDiversityEnabled": true,
      "spotDiversityPriceIncreaseLimitPercent": 20,
      "spotBackups": {
        "enabled": true,
        "spotBackupRestoreRateSeconds": 600
      }
    }
  }'

Spot allocation strategy by workload tier:

Workload Type                | Spot %        | Rationale
Batch jobs, CI runners       | 100% spot     | Interruptible, restartable
Stateless APIs (behind LB)   | 80% spot      | Can handle brief interruptions
Stateful services, databases | 0% spot       | Use on-demand or reserved
ML training                  | 80-100% spot  | Checkpointing handles interrupts
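
To make the tier table concrete, the following sketch splits a replica count into spot and on-demand replicas. The tier names and the `splitReplicas` helper are hypothetical; the percentages come from the table above, with ML training pinned to its lower bound of 80% as an assumption.

```typescript
// Hypothetical tier names; percentages are taken from the table above.
type WorkloadTier = "batch" | "statelessApi" | "stateful" | "mlTraining";

const SPOT_PERCENT: Record<WorkloadTier, number> = {
  batch: 100,
  statelessApi: 80,
  stateful: 0,
  mlTraining: 80, // assumption: lower bound of the 80-100% range
};

// Splits replicas into spot and on-demand counts.
// Spot is rounded down so the on-demand share never falls below the tier target.
function splitReplicas(
  replicas: number,
  tier: WorkloadTier
): { spot: number; onDemand: number } {
  const spot = Math.floor((replicas * SPOT_PERCENT[tier]) / 100);
  return { spot, onDemand: replicas - spot };
}

// Example: 10 stateless API replicas give 8 spot and 2 on-demand.
const apiSplit = splitReplicas(10, "statelessApi");
```

Rounding spot down is a conservative choice: for small replica counts it keeps at least one on-demand replica in any tier below 100% spot.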

Step 3: Workload Right-Sizing


# Get resource waste analysis
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
  "https://api.cast.ai/v1/workload-autoscaling/clusters/${CASTAI_CLUSTER_ID}/workloads" \
  | jq '[.items[] | select(.estimatedSavingsPercent > 20) | {
    name: .workloadName,
    namespace: .namespace,
    wastedCpu: (.currentCpuRequest - .recommendedCpuRequest),
    wastedMemory: (.currentMemoryRequest - .recommendedMemoryRequest),
    savingsPercent: .estimatedSavingsPercent
  }] | sort_by(-.savingsPercent) | .[0:10]'
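
The same filter can be expressed in TypeScript when the results feed a script rather than a terminal. This is a sketch under the assumption that each item exposes the fields referenced in the jq filter above; `WorkloadRecommendation`, `WasteSummary`, and `topWastefulWorkloads` are illustrative names.

```typescript
// Assumed item shape: fields are those referenced by the jq filter above.
interface WorkloadRecommendation {
  workloadName: string;
  namespace: string;
  currentCpuRequest: number;
  recommendedCpuRequest: number;
  currentMemoryRequest: number;
  recommendedMemoryRequest: number;
  estimatedSavingsPercent: number;
}

interface WasteSummary {
  name: string;
  namespace: string;
  wastedCpu: number;
  wastedMemory: number;
  savingsPercent: number;
}

// Keeps workloads above the savings threshold, computes over-provisioned
// CPU and memory, and returns the top entries by estimated savings.
function topWastefulWorkloads(
  items: WorkloadRecommendation[],
  minSavingsPercent = 20,
  limit = 10
): WasteSummary[] {
  return items
    .filter((w) => w.estimatedSavingsPercent > minSavingsPercent)
    .map((w) => ({
      name: w.workloadName,
      namespace: w.namespace,
      wastedCpu: w.currentCpuRequest - w.recommendedCpuRequest,
      wastedMemory: w.currentMemoryRequest - w.recommendedMemoryRequest,
      savingsPercent: w.estimatedSavingsPercent,
    }))
    .sort((a, b) => b.savingsPercent - a.savingsPercent)
    .slice(0, limit);
}
```

The defaults of 20 and 10 match the threshold and slice in the jq version, so both produce the same ranking for the same input.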

Step 4: Cluster Hibernation (Dev/Staging)


# Hibernate non-production clusters during off-hours.
# Hibernation scales nodes to zero; the cluster resumes on demand or on schedule.

# Enable scheduled hibernation
curl -X POST -H "X-API-Key: ${CASTAI_API_KEY}" \
  -H "Content-Type: application/json" \
  "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/hibernate" \
  -d '{
    "schedule": {
      "enabled": true,
      "hibernateAt": "20:00",
      "wakeUpAt": "08:00",
      "timezone": "America/New_York",
      "weekdaysOnly": true
    }
  }'
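
To estimate what a schedule like the one above is worth, the hibernated share of the week can be computed directly. This sketch assumes that `weekdaysOnly` means the schedule fires on five nights and the cluster otherwise stays up; if weekends are hibernated entirely, the real fraction is higher. The function names are illustrative.

```typescript
// Parses "HH:MM" into minutes since midnight.
function toMinutes(hhmm: string): number {
  const parts = hhmm.split(":");
  return Number(parts[0]) * 60 + Number(parts[1]);
}

// Hours between hibernateAt and wakeUpAt, wrapping past midnight when needed.
function hibernatedHoursPerNight(hibernateAt: string, wakeUpAt: string): number {
  const minutes = (toMinutes(wakeUpAt) - toMinutes(hibernateAt) + 1440) % 1440;
  return minutes / 60;
}

// Fraction of a 168-hour week spent hibernated.
// Assumption: the schedule runs on `nightsPerWeek` nights and nowhere else.
function weeklyHibernationFraction(
  hibernateAt: string,
  wakeUpAt: string,
  nightsPerWeek: number
): number {
  return (hibernatedHoursPerNight(hibernateAt, wakeUpAt) * nightsPerWeek) / 168;
}

// Example: 20:00 to 08:00 is 12 hours; over 5 nights that is 60 of 168 hours.
const exampleFraction = weeklyHibernationFraction("20:00", "08:00", 5);
```

With the schedule shown above, roughly 36% of node-hours would be avoided under this assumption, before accounting for any resume latency or storage that keeps billing while hibernated.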

Step 5: Cost Tracking Dashboard


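// Minimal GET helper for the CAST AI API (skip if castaiGet is already
// defined elsewhere in your project). Reads the key from CASTAI_API_KEY.
async function castaiGet(path: string): Promise<any> {
  const res = await fetch(`https://api.cast.ai${path}`, {
    headers: { "X-API-Key": process.env.CASTAI_API_KEY ?? "" },
  });
  if (!res.ok) {
    throw new Error(`CAST AI request failed: ${res.status} ${path}`);
  }
  return res.json();
}

// One row of the monthly report, per cluster.
interface CostReport {
  cluster: string;
  period: string; // "YYYY-MM"
  currentCost: number;
  optimizedCost: number;
  savings: number;
  spotPercent: number; // share of nodes (not cost) running on spot
}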

async function generateMonthlyCostReport(
  clusterIds: string[]
): Promise<CostReport[]> {
  const reports: CostReport[] = [];

  for (const clusterId of clusterIds) {
    const [cluster, savings, nodes] = await Promise.all([
      castaiGet(`/v1/kubernetes/external-clusters/${clusterId}`),
      castaiGet(`/v1/kubernetes/clusters/${clusterId}/savings`),
      castaiGet(`/v1/kubernetes/external-clusters/${clusterId}/nodes`),
    ]);

    const spotNodes = nodes.items.filter(
      (n: { lifecycle: string }) => n.lifecycle === "spot"
    ).length;

    reports.push({
      cluster: cluster.name,
      period: new Date().toISOString().slice(0, 7),
      currentCost: savings.currentMonthlyCost,
      optimizedCost: savings.optimizedMonthlyCost,
      savings: savings.monthlySavings,
      spotPercent:
        nodes.items.length > 0
          ? (spotNodes / nodes.items.length) * 100
          : 0,
    });
  }

  return reports;
}
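
For a dashboard it is often useful to roll the per-cluster rows up into one fleet-wide figure. The sketch below is one way to do that; `summarizeReports` and `FleetSummary` are illustrative names, and the `CostReport` interface is repeated only so the example is self-contained.

```typescript
// Same shape as the CostReport interface above, repeated for self-containment.
interface CostReport {
  cluster: string;
  period: string;
  currentCost: number;
  optimizedCost: number;
  savings: number;
  spotPercent: number;
}

interface FleetSummary {
  totalCurrent: number;
  totalOptimized: number;
  totalSavings: number;
  savingsPercent: number;
}

// Sums costs across clusters and derives an overall savings percentage.
function summarizeReports(reports: CostReport[]): FleetSummary {
  let totalCurrent = 0;
  let totalOptimized = 0;
  let totalSavings = 0;
  for (const r of reports) {
    totalCurrent += r.currentCost;
    totalOptimized += r.optimizedCost;
    totalSavings += r.savings;
  }
  const savingsPercent =
    totalCurrent > 0 ? (totalSavings * 100) / totalCurrent : 0;
  return { totalCurrent, totalOptimized, totalSavings, savingsPercent };
}
```

The percentage is computed from the totals rather than by averaging per-cluster percentages, so large clusters weigh proportionally more.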

Cost Optimization Checklist

  • [ ] Spot instances enabled with diversity
  • [ ] Workload autoscaler right-sizing resources
  • [ ] Dev/staging clusters hibernated off-hours
  • [ ] Empty node downscaler enabled
  • [ ] Instance families include the latest generation (often better price-performance)
  • [ ] Reserved/savings plan for baseline on-demand nodes
  • [ ] Weekly savings report review

Error Handling

Issue                           | Cause                          | Solution
Savings lower than expected     | Too many on-demand constraints | Relax node template constraints
Spot interruptions too frequent | Single instance type           | Enable spot diversity
Hibernation not triggering      | Schedule timezone wrong        | Use IANA timezone format
Right-sizing too aggressive     | Low headroom                   | Increase memory headroom to 20%
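
To illustrate what a 20% memory headroom means in practice, the sketch below applies a headroom percentage on top of observed peak usage. This is a simplified reading of headroom as a multiplier on usage; it is an assumption for illustration and not necessarily the exact formula CAST AI applies. `requestWithHeadroom` is a hypothetical helper.

```typescript
// Returns a memory request (MiB) that leaves `headroomPercent` above observed peak.
// Assumption: headroom is a simple percentage added on top of peak usage.
function requestWithHeadroom(peakUsageMiB: number, headroomPercent: number): number {
  return Math.ceil((peakUsageMiB * (100 + headroomPercent)) / 100);
}

// Example: a 1000 MiB peak with 20% headroom yields a 1200 MiB request,
// compared with 1100 MiB at 10% headroom.
const withTwentyPercent = requestWithHeadroom(1000, 20);
const withTenPercent = requestWithHeadroom(1000, 10);
```

More headroom reduces the risk of OOM kills after right-sizing at the cost of some of the savings.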

Next Steps

For architecture patterns, see castai-reference-architecture.
