Terraform configurations for Vertex AI platform and Agent Engine
Installation
Open Claude Code and run this command:
/plugin install jeremy-vertex-terraform@claude-code-plugins-plus
Use --global to install for all projects, or --project for current project only.
What It Does
This plugin provides Terraform modules for deploying Vertex AI services including Model Garden foundation models, Gemini API endpoints, vector search for RAG applications, ML pipelines, and production model serving infrastructure.
Key Infrastructure Components:
- `google_vertex_ai_endpoint` for model serving
- `google_vertex_ai_deployed_model` for model versions
- `google_vertex_ai_index` for vector search
- `google_vertex_ai_index_endpoint` for similarity search
- `google_vertex_ai_featurestore` for feature management
- Cloud Storage for model artifacts
- BigQuery for ML model training
Features
✅ Model Garden Deployment: Foundation models (Gemini, PaLM, Claude, Llama)
✅ Gemini API Endpoints: Dedicated endpoints with rate limiting
✅ Vector Search: ScaNN-based similarity search for RAG
✅ ML Pipelines: Kubeflow Pipelines for training workflows
✅ Model Serving: Production endpoints with auto-scaling
✅ Batch Predictions: Large-scale inference jobs
✅ Feature Store: Centralized feature management
✅ Monitoring: Model performance tracking and drift detection
Skills (1)
Execute — use when provisioning Vertex AI infrastructure with Terraform.
How It Works
Natural Language Activation
"Create Terraform for Gemini endpoint deployment"
"Deploy vector search for RAG application"
"Set up Vertex AI Pipeline for model training"
"Create Feature Store for ML features"
"Deploy custom model to Vertex AI endpoint"
Use Cases
Gemini API Deployment
"Create Terraform for Gemini 2.0 Flash endpoint"
"Deploy Gemini Pro with auto-scaling"
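A minimal sketch of the endpoint piece, using the google provider's `google_vertex_ai_endpoint` resource; the endpoint name and region are placeholders, and the plugin's generated modules will add project/network settings on top of this:

```hcl
# Hypothetical example: a dedicated Vertex AI endpoint for serving Gemini.
resource "google_vertex_ai_endpoint" "gemini" {
  name         = "gemini-endpoint"   # placeholder name
  display_name = "gemini-endpoint"
  location     = "us-central1"       # placeholder region
}
```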
Vector Search for RAG
"Set up vector search infrastructure for RAG application"
"Deploy embeddings index with 768 dimensions"
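The 768-dimension embeddings index above can be sketched with the google provider's `google_vertex_ai_index` resource; the display name, region, and GCS URI are placeholders, and the tree-AH tuning values are illustrative, not recommendations:

```hcl
# Hypothetical example: a tree-AH (ScaNN) vector index for RAG retrieval.
resource "google_vertex_ai_index" "embeddings" {
  display_name = "rag-embeddings"    # placeholder name
  region       = "us-central1"       # placeholder region
  metadata {
    contents_delta_uri = "gs://example-bucket/embeddings/"  # placeholder bucket
    config {
      dimensions                  = 768
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count = 500
        }
      }
    }
  }
}
```

Queries go through a separate `google_vertex_ai_index_endpoint` to which this index is deployed.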
Custom Model Serving
"Deploy custom scikit-learn model to Vertex AI"
"Create endpoint for TensorFlow model with GPU"
Batch Predictions
"Set up batch prediction job for large dataset"
"Deploy batch inference with T4 GPUs"
Feature Store
"Create Feature Store for user features"
"Deploy feature serving for real-time predictions"
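A minimal sketch of the Feature Store piece, using the google provider's `google_vertex_ai_featurestore` resource; the store name, region, and node count are placeholders:

```hcl
# Hypothetical example: a feature store with online serving for
# real-time predictions. Entity types and features are defined
# separately (google_vertex_ai_featurestore_entitytype).
resource "google_vertex_ai_featurestore" "users" {
  name   = "user_features"   # placeholder name
  region = "us-central1"     # placeholder region
  online_serving_config {
    fixed_node_count = 1     # illustrative sizing
  }
}
```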