coreweave-cost-tuning

Optimize CoreWeave GPU cloud costs with right-sizing and scheduling.

5 Tools
Plugin: coreweave-pack
Category: saas packs

Allowed Tools

Read, Write, Edit, Bash(kubectl:*), Grep

Provided by Plugin

coreweave-pack

Claude Code skill pack for CoreWeave (24 skills)

saas packs v1.0.0

Installation

This skill is included in the coreweave-pack plugin:

/plugin install coreweave-pack@claude-code-plugins-plus


Instructions

CoreWeave Cost Tuning

GPU Pricing Reference (approximate)

| GPU | Price per GPU/hour | Best For |
|---|---|---|
| A100 40GB PCIe | ~$1.50 | Development, smaller models |
| A100 80GB PCIe | ~$2.21 | Production inference |
| H100 80GB PCIe | ~$4.76 | High-throughput inference |
| H100 SXM5 (8x) | ~$6.15/GPU | Training, multi-GPU |
| L40 | ~$1.10 | Image generation, light inference |
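To compare options concretely, here is a quick sketch using the approximate hourly rates above. The dictionary keys and the 730-hour month are illustrative assumptions, not CoreWeave identifiers:

```python
# Rough monthly cost comparison using the approximate hourly rates above.
# Keys and the 730-hour month are assumptions for illustration only.
HOURLY_RATES = {
    "A100_40GB_PCIE": 1.50,
    "A100_80GB_PCIE": 2.21,
    "H100_80GB_PCIE": 4.76,
    "H100_SXM5": 6.15,
    "L40": 1.10,
}

def monthly_cost(gpu: str, gpu_count: int = 1, hours: float = 730) -> float:
    """Estimated monthly cost for `gpu_count` GPUs running `hours` per month."""
    return HOURLY_RATES[gpu] * gpu_count * hours

# Example: a single L40 running 24/7 vs. a single H100 PCIe
print(f"L40:  ${monthly_cost('L40'):,.0f}/mo")             # ~$803/mo
print(f"H100: ${monthly_cost('H100_80GB_PCIE'):,.0f}/mo")  # ~$3,475/mo
```

At these rates, picking the smallest GPU that fits the workload is usually the single largest cost lever.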

Cost Optimization Strategies

Scale-to-Zero for Dev/Staging

Dev and staging endpoints rarely need to run around the clock. These Knative autoscaling annotations scale an idle service down to zero GPUs after five minutes, so you stop paying until the next request arrives:

```yaml
autoscaling.knative.dev/minScale: "0"
autoscaling.knative.dev/scaleDownDelay: "5m"
```
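The savings are easy to estimate. A back-of-envelope sketch, assuming an A100 80GB dev endpoint at the rate from the table above that is only active about 40 hours per week:

```python
# Back-of-envelope savings from scale-to-zero on a dev endpoint.
# Assumes an A100 80GB at ~$2.21/hr that is actively serving
# ~40 hours/week instead of running 24/7. All figures are illustrative.
RATE = 2.21              # $/GPU-hour, approximate
ALWAYS_ON = 730          # hours in a month
ACTIVE = 40 * 730 / 168  # ~174 active hours/month at 40 h/week

always_on_cost = RATE * ALWAYS_ON
scale_to_zero_cost = RATE * ACTIVE
print(f"always-on:     ${always_on_cost:,.0f}/mo")
print(f"scale-to-zero: ${scale_to_zero_cost:,.0f}/mo "
      f"({100 * (1 - ACTIVE / ALWAYS_ON):.0f}% saved)")
```

Even with a generous scale-down delay, an endpoint that is idle most of the day costs a fraction of an always-on one.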

Right-Size GPU Selection

A simple heuristic maps model size (billions of parameters, assuming fp16 weights) to the cheapest GPU that fits:

```python
def recommend_gpu(model_size_b: float, inference_only: bool = True) -> str:
    """Suggest the cheapest suitable GPU for a model of `model_size_b` billion params."""
    if model_size_b <= 7:
        # ~14 GB of fp16 weights fits an L40 for inference; training needs
        # extra room for gradients and optimizer state.
        return "L40" if inference_only else "A100_PCIE_80GB"
    elif model_size_b <= 13:
        return "A100_PCIE_80GB"
    elif model_size_b <= 70:
        return "A100_PCIE_80GB (4x tensor parallel)"
    else:
        return "H100_SXM5 (8x tensor parallel)"
```
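The size thresholds above roughly track fp16 weight memory, about 2 bytes per parameter. A minimal sketch with an assumed ~20% overhead for KV cache and activations (the overhead factor is illustrative, not a CoreWeave figure):

```python
# Rough VRAM estimate behind the size thresholds: fp16 weights take
# ~2 bytes per parameter, plus an assumed ~20% overhead for KV cache
# and activations.
def estimated_vram_gb(model_size_b: float, bytes_per_param: float = 2.0,
                      overhead: float = 1.2) -> float:
    return model_size_b * bytes_per_param * overhead

print(estimated_vram_gb(7))   # ~16.8 GB -> fits an L40 (48 GB)
print(estimated_vram_gb(70))  # ~168 GB  -> needs multiple 80 GB GPUs
```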

Quantization to Use Smaller GPUs

Use AWQ or GPTQ quantization to fit larger models on smaller (and cheaper) GPUs:

```bash
# A 70B model at 4-bit fits on a single A100-80GB instead of 4x
vllm serve meta-llama/Llama-3.1-70B-Instruct-AWQ --quantization awq
```
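The arithmetic behind that claim, as a sketch: 4-bit weights take roughly 0.5 bytes per parameter versus ~2 bytes for fp16, and the rates come from the pricing table above:

```python
# Why 4-bit quantization changes the GPU requirement: weight memory drops
# from ~2 bytes/param (fp16) to ~0.5 bytes/param (4-bit). Figures are
# approximations for illustration.
def weight_gb(params_b: float, bytes_per_param: float) -> float:
    return params_b * bytes_per_param

fp16 = weight_gb(70, 2.0)  # 140 GB -> needs 4x A100-80GB with headroom
awq4 = weight_gb(70, 0.5)  # 35 GB  -> fits a single A100-80GB
hourly_saving = 2.21 * 4 - 2.21  # ~$6.63/hr at the approximate rates above
print(fp16, awq4, round(hourly_saving, 2))
```

The trade-off is a small accuracy loss from quantization, so validate quality on your own evaluation set before switching production traffic.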


Next Steps

For architecture patterns, see coreweave-reference-architecture.
