coreweave-hello-world

'Deploy a GPU workload on CoreWeave with kubectl.

v1.0.0

Jeremy Longshore

MIT

4 Tools

coreweave-pack Plugin

saas packs Category

Allowed Tools
        ReadWriteEditBash(kubectl:*)
      

Provided by Plugin

coreweave-pack

Claude Code skill pack for CoreWeave (24 skills)

saas packs v1.0.0

View Plugin

Installation

This skill is included in the coreweave-pack plugin:

/plugin install coreweave-pack@claude-code-plugins-plus

Click to copy

Instructions

CoreWeave Hello World

Overview

Deploy your first GPU workload on CoreWeave: a simple inference service using vLLM or a batch CUDA job. CoreWeave runs Kubernetes on bare-metal GPU nodes with A100, H100, and L40 GPUs.

Prerequisites

Completed coreweave-install-auth setup
kubectl configured with CoreWeave kubeconfig
Namespace with GPU quota

Instructions

Step 1: Deploy a vLLM Inference Server


# vllm-inference.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-server
  template:
    metadata:
      labels:
        app: vllm-server
    spec:
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest
          args:
            - "--model"
            - "meta-llama/Llama-3.1-8B-Instruct"
            - "--port"
            - "8000"
          ports:
            - containerPort: 8000
          resources:
            limits:
              nvidia.com/gpu: 1
              memory: 48Gi
              cpu: "8"
            requests:
              nvidia.com/gpu: 1
              memory: 32Gi
              cpu: "4"
          env:
            - name: HUGGING_FACE_HUB_TOKEN
              valueFrom:
                secretKeyRef:
                  name: hf-token
                  key: token
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: gpu.nvidia.com/class
                    operator: In
                    values: ["A100_PCIE_80GB"]
---
apiVersion: v1
kind: Service
metadata:
  name: vllm-server
spec:
  selector:
    app: vllm-server
  ports:
    - port: 8000
      targetPort: 8000
  type: ClusterIP


# Create HuggingFace token secret
kubectl create secret generic hf-token --from-literal=token="${HF_TOKEN}"

# Deploy
kubectl apply -f vllm-inference.yaml
kubectl get pods -w  # Wait for Running state

# Port-forward and test
kubectl port-forward svc/vllm-server 8000:8000 &
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.1-8B-Instruct", "messages": [{"role": "user", "content": "Hello!"}]}'

Step 2: Batch GPU Job


# gpu-batch-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-benchmark
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: benchmark
          image: pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime
          command: ["python3", "-c"]
          args:
            - |
              import torch
              print(f"CUDA available: {torch.cuda.is_available()}")
              print(f"GPU: {torch.cuda.get_device_name(0)}")
              x = torch.randn(10000, 10000, device="cuda")
              y = torch.matmul(x, x)
              print(f"Matrix multiply result shape: {y.shape}")
              print("CoreWeave GPU test passed!")
          resources:
            limits:
              nvidia.com/gpu: 1
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: gpu.nvidia.com/class
                    operator: In
                    values: ["A100_PCIE_80GB"]


kubectl apply -f gpu-batch-job.yaml
kubectl logs job/gpu-benchmark --follow

Error Handling

Error	Cause	Solution
Pod stuck Pending	No GPU capacity	Try different GPU type or check quota
`nvidia-smi` not found	Wrong base image	Use NVIDIA CUDA images
OOMKilled	Insufficient GPU memory	Use larger GPU (80GB A100)
Image pull error	Registry auth	Create imagePullSecret

Resources

Next Steps

Proceed to coreweave-local-dev-loop for development workflow setup.

Allowed Tools

Provided by Plugin

coreweave-pack

Installation

Instructions

CoreWeave Hello World

Overview

Prerequisites

Instructions

Step 1: Deploy a vLLM Inference Server

Step 2: Batch GPU Job

Error Handling

Resources

Next Steps

Ready to use coreweave-pack?

Related Skills

abridge-ci-integration

abridge-common-errors

abridge-core-workflow-a

abridge-core-workflow-b

abridge-cost-tuning

abridge-debug-bundle