castai-cost-tuning

Maximize Kubernetes cost savings with CAST AI spot strategies and right-sizing.

v1.0.0

Jeremy Longshore

MIT

5 Tools

castai-pack Plugin

saas packs Category

Allowed Tools
Read, Write, Edit, Bash(curl:*), Grep

Provided by Plugin

castai-pack

Claude Code skill pack for Cast AI (18 skills)

saas packs v1.0.0

Installation

This skill is included in the castai-pack plugin:

/plugin install castai-pack@claude-code-plugins-plus

Instructions

CAST AI Cost Tuning

Overview

Maximize Kubernetes cost savings through CAST AI: spot instance strategies, workload right-sizing, cluster hibernation, and savings tracking. Typical savings: 50-70% on cloud compute costs.

Prerequisites

CAST AI Phase 2 enabled with full automation
Savings report available (requires 24h+ of data)
Understanding of workload criticality tiers

Instructions

Step 1: Analyze Current Savings


# Get savings breakdown
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
  "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/savings" \
  | jq '{
    currentMonthlyCost: .currentMonthlyCost,
    optimizedMonthlyCost: .optimizedMonthlyCost,
    monthlySavings: .monthlySavings,
    savingsPercentage: .savingsPercentage,
    spotSavings: .spotSavings,
    rightSizingSavings: .rightSizingSavings
  }'
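The figures returned in Step 1 can be sanity-checked against the 50-70% range quoted in the Overview. The following is a minimal sketch, assuming the two cost fields are plain numbers in the same currency; `savingsPercent` and `inTypicalRange` are hypothetical helper names, not part of the CAST AI API.

```typescript
// Percentage saved when moving from the current to the optimized monthly cost.
// Returns 0 when there is no current cost, to avoid dividing by zero.
function savingsPercent(currentMonthlyCost: number, optimizedMonthlyCost: number): number {
  if (currentMonthlyCost <= 0) {
    return 0;
  }
  const saved = currentMonthlyCost - optimizedMonthlyCost;
  return (saved * 100) / currentMonthlyCost;
}

// True when a percentage falls inside the 50-70% range cited in the Overview.
function inTypicalRange(percent: number): boolean {
  return percent >= 50 && percent <= 70;
}
```

A cluster that drops from 10000 to 4000 per month would come out at 60%, inside the quoted range; a result well below 50% suggests reviewing the constraints listed under Error Handling.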

Step 2: Maximize Spot Usage


# Enable aggressive spot with diversity and fallbacks
curl -X PUT -H "X-API-Key: ${CASTAI_API_KEY}" \
  -H "Content-Type: application/json" \
  "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/policies" \
  -d '{
    "enabled": true,
    "spotInstances": {
      "enabled": true,
      "clouds": ["aws"],
      "spotDiversityEnabled": true,
      "spotDiversityPriceIncreaseLimitPercent": 20,
      "spotBackups": {
        "enabled": true,
        "spotBackupRestoreRateSeconds": 600
      }
    }
  }'

Spot allocation strategy by workload tier:

Workload Type	Spot %	Rationale
Batch jobs, CI runners	100% spot	Interruptible, restartable
Stateless APIs (behind LB)	80% spot	Can handle brief interruptions
Stateful services, databases	0% spot	Use on-demand or reserved
ML training	80-100% spot	Checkpointing handles interrupts
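The tier table above can be turned into a small lookup for planning how many replicas of a workload to place on spot capacity. This is only an illustrative sketch: the tier names and the `splitReplicas` helper are hypothetical, and the ML training entry uses the lower bound (80%) of the table's 80-100% range as a conservative assumption.

```typescript
type WorkloadTier = "batch" | "stateless-api" | "stateful" | "ml-training";

// Spot share per tier in whole percent, copied from the table above.
const SPOT_PERCENT: Record<WorkloadTier, number> = {
  "batch": 100,
  "stateless-api": 80,
  "stateful": 0,
  "ml-training": 80, // assumption: lower bound of the 80-100% range
};

interface ReplicaSplit {
  spot: number;
  onDemand: number;
}

// Split a replica count into spot and on-demand parts. The spot part is
// rounded down, so any rounding remainder lands on on-demand capacity.
function splitReplicas(tier: WorkloadTier, replicas: number): ReplicaSplit {
  if (!Number.isInteger(replicas) || replicas < 0) {
    throw new RangeError("replicas must be a non-negative integer");
  }
  const spot = Math.floor((replicas * SPOT_PERCENT[tier]) / 100);
  return { spot, onDemand: replicas - spot };
}
```

Rounding the spot share down is a deliberate choice here: for small replica counts it keeps at least one on-demand replica for any tier below 100% spot.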

Step 3: Workload Right-Sizing


# Get resource waste analysis
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
  "https://api.cast.ai/v1/workload-autoscaling/clusters/${CASTAI_CLUSTER_ID}/workloads" \
  | jq '[.items[] | select(.estimatedSavingsPercent > 20) | {
    name: .workloadName,
    namespace: .namespace,
    wastedCpu: (.currentCpuRequest - .recommendedCpuRequest),
    wastedMemory: (.currentMemoryRequest - .recommendedMemoryRequest),
    savingsPercent: .estimatedSavingsPercent
  }] | sort_by(-.savingsPercent) | .[0:10]'
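For use inside a script rather than a shell pipeline, the same selection can be written in TypeScript. This sketch mirrors the jq filter above step by step; the `Workload` field names are taken from that filter, and the response shape itself is an assumption rather than a confirmed API contract.

```typescript
// Fields referenced by the jq filter in Step 3.
interface Workload {
  workloadName: string;
  namespace: string;
  currentCpuRequest: number;
  recommendedCpuRequest: number;
  currentMemoryRequest: number;
  recommendedMemoryRequest: number;
  estimatedSavingsPercent: number;
}

interface WasteEntry {
  name: string;
  namespace: string;
  wastedCpu: number;
  wastedMemory: number;
  savingsPercent: number;
}

// Keep workloads above the savings threshold, compute the over-provisioned
// CPU and memory, sort by savings (highest first), and return the top entries.
function topWastefulWorkloads(
  items: Workload[],
  minSavingsPercent: number = 20,
  limit: number = 10
): WasteEntry[] {
  return items
    .filter((w) => w.estimatedSavingsPercent > minSavingsPercent)
    .map((w) => ({
      name: w.workloadName,
      namespace: w.namespace,
      wastedCpu: w.currentCpuRequest - w.recommendedCpuRequest,
      wastedMemory: w.currentMemoryRequest - w.recommendedMemoryRequest,
      savingsPercent: w.estimatedSavingsPercent,
    }))
    .sort((a, b) => b.savingsPercent - a.savingsPercent)
    .slice(0, limit);
}
```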

Step 4: Cluster Hibernation (Dev/Staging)


# Hibernate non-production clusters during off-hours.
# Scales nodes to zero; the cluster resumes on demand.

# Enable hibernation
curl -X POST -H "X-API-Key: ${CASTAI_API_KEY}" \
  -H "Content-Type: application/json" \
  "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/hibernate" \
  -d '{
    "schedule": {
      "enabled": true,
      "hibernateAt": "20:00",
      "wakeUpAt": "08:00",
      "timezone": "America/New_York",
      "weekdaysOnly": true
    }
  }'
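A malformed schedule is one reason hibernation may silently fail to trigger (see Error Handling), so it can help to validate the payload before sending it. The sketch below assumes the schedule fields shown in the request above and relies only on the standard `Intl.DateTimeFormat` constructor, which throws a `RangeError` for unknown time zone names; `validateSchedule` is a hypothetical helper.

```typescript
// Shape of the "schedule" object used in the request above.
interface HibernationSchedule {
  enabled: boolean;
  hibernateAt: string;
  wakeUpAt: string;
  timezone: string;
  weekdaysOnly: boolean;
}

// 24-hour HH:MM, for example "08:00" or "20:00".
const HH_MM = /^([01]\d|2[0-3]):[0-5]\d$/;

// Intl.DateTimeFormat rejects time zone names it does not recognize.
function isKnownTimezone(tz: string): boolean {
  try {
    new Intl.DateTimeFormat("en-US", { timeZone: tz });
    return true;
  } catch (_err) {
    return false;
  }
}

// Returns a list of problems; an empty list means the schedule looks sane.
function validateSchedule(s: HibernationSchedule): string[] {
  const problems: string[] = [];
  if (!HH_MM.test(s.hibernateAt)) {
    problems.push(`hibernateAt "${s.hibernateAt}" is not HH:MM`);
  }
  if (!HH_MM.test(s.wakeUpAt)) {
    problems.push(`wakeUpAt "${s.wakeUpAt}" is not HH:MM`);
  }
  if (!isKnownTimezone(s.timezone)) {
    problems.push(`timezone "${s.timezone}" is not a recognized IANA name`);
  }
  return problems;
}
```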

Step 5: Cost Tracking Dashboard


interface CostReport {
  cluster: string;
  period: string;
  currentCost: number;
  optimizedCost: number;
  savings: number;
  spotPercent: number;
}

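// Assumes a castaiGet(path) helper, defined elsewhere, that performs an
// authenticated GET against the CAST AI API and returns the parsed JSON body.
async function generateMonthlyCostReport(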
  clusterIds: string[]
): Promise<CostReport[]> {
  const reports: CostReport[] = [];

  for (const clusterId of clusterIds) {
    const [cluster, savings, nodes] = await Promise.all([
      castaiGet(`/v1/kubernetes/external-clusters/${clusterId}`),
      castaiGet(`/v1/kubernetes/clusters/${clusterId}/savings`),
      castaiGet(`/v1/kubernetes/external-clusters/${clusterId}/nodes`),
    ]);

    const spotNodes = nodes.items.filter(
      (n: { lifecycle: string }) => n.lifecycle === "spot"
    ).length;

    reports.push({
      cluster: cluster.name,
      period: new Date().toISOString().slice(0, 7),
      currentCost: savings.currentMonthlyCost,
      optimizedCost: savings.optimizedMonthlyCost,
      savings: savings.monthlySavings,
      spotPercent:
        nodes.items.length > 0
          ? (spotNodes / nodes.items.length) * 100
          : 0,
    });
  }

  return reports;
}
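The report function above calls `castaiGet`, which this section does not define. One possible shape for it is sketched below, assuming the same base URL and `X-API-Key` header used by the curl examples. The `createCastaiGet` factory and the `FetchLike` type are hypothetical; the fetch implementation is passed in so the helper can be exercised without network access.

```typescript
// Minimal subset of the Fetch API that the helper relies on.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<HttpResponse>;

// Base URL taken from the curl examples in this document.
const CASTAI_BASE_URL = "https://api.cast.ai";

// Build a castaiGet(path) function bound to an API key and a fetch implementation.
function createCastaiGet(apiKey: string, fetchImpl: FetchLike) {
  return async function castaiGet(path: string): Promise<any> {
    const res = await fetchImpl(`${CASTAI_BASE_URL}${path}`, {
      headers: { "X-API-Key": apiKey },
    });
    if (!res.ok) {
      // Surface HTTP failures instead of parsing an error body as report data.
      throw new Error(`CAST AI GET ${path} failed with HTTP ${res.status}`);
    }
    return res.json();
  };
}
```

In a runtime that provides a global `fetch` (such as Node 18 or later), the helper could be created once with the API key from the environment and the global `fetch`, then used by `generateMonthlyCostReport` unchanged.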

Cost Optimization Checklist

[ ] Spot instances enabled with diversity
[ ] Workload autoscaler right-sizing resources
[ ] Dev/staging clusters hibernated off-hours
[ ] Empty node downscaler enabled
[ ] Instance families include the latest generation (often cheaper for the same capacity)
[ ] Reserved/savings plan for baseline on-demand nodes
[ ] Weekly savings report review

Error Handling

Issue	Cause	Solution
Savings lower than expected	Too many on-demand constraints	Relax node template constraints
Spot interruptions too frequent	Single instance type	Enable spot diversity
Hibernation not triggering	Schedule timezone wrong	Use an IANA timezone name, e.g. America/New_York
Right-sizing too aggressive	Low headroom	Increase memory headroom to 20%

Resources

Next Steps

For architecture patterns, see castai-reference-architecture.
