Getting Started
Install CastSlice into your Kubernetes cluster and start sharing GPUs across your AI workloads in under five minutes.
Prerequisites
Before installing CastSlice, ensure you have the following:
Any CNCF-conformant cluster: EKS, GKE, AKS, or on-prem.
A device plugin that exposes nvidia.com/gpu resources (the NVIDIA device plugin or an equivalent).
Cluster admin access to apply manifests and inspect Pods.
Don't have a GPU yet? You can still verify the webhook mutation logic using the Local Testing (No GPU) guide.
Install CastSlice
CastSlice uses cert-manager to inject TLS certificates into the MutatingWebhookConfiguration automatically.
The single-file install.yaml includes the Namespace, Deployment, Service, Certificate (cert-manager), and MutatingWebhookConfiguration.
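As a rough illustration of how cert-manager injection usually works for a webhook, the sketch below shows a MutatingWebhookConfiguration carrying cert-manager's cert-manager.io/inject-ca-from annotation, which asks cert-manager's CA injector to fill in the webhook's caBundle from a Certificate. This is not the content of install.yaml: the resource names, namespace, webhook name, and service path are all hypothetical placeholders, and the real manifest may differ.

```yaml
# Illustrative sketch only; every name below is an assumption, not taken from install.yaml.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: castslice-webhook                      # hypothetical name
  annotations:
    # Format is <namespace>/<Certificate name>; both values are hypothetical here.
    cert-manager.io/inject-ca-from: castslice-system/castslice-serving-cert
webhooks:
  - name: mutate.castslice.example.com         # hypothetical webhook name
    admissionReviewVersions: ["v1"]
    sideEffects: None
    clientConfig:
      # caBundle is left out on purpose: the CA injector populates it.
      service:
        name: castslice-webhook                # hypothetical Service name
        namespace: castslice-system            # hypothetical namespace
        path: /mutate                          # hypothetical path
    rules:
      # Intercept Pod creation so the Pod spec can be mutated at admission time.
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
```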
A READY value of 1/1 with status Running means the cert-manager certificate was issued, TLS was injected into the webhook, and the readiness probe at /readyz passed.
Enable GPU Slicing on a Workload
Add castops.io/optimize: "true" to the Pod's metadata.annotations. Optionally set castops.io/workload-type to control the slice count. For Deployments, add annotations to the Pod template (spec.template.metadata.annotations), not the Deployment's own metadata.
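A minimal sketch of the annotation placement for a Deployment is shown here. The Deployment name, labels, container name, and image are hypothetical placeholders; only the castops.io/optimize annotation and the nvidia.com/gpu resource come from this guide. The optional castops.io/workload-type annotation is omitted because its accepted values are not listed in this section.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-demo                 # hypothetical name; annotations here are NOT read
spec:
  replicas: 1
  selector:
    matchLabels:
      app: inference-demo
  template:
    metadata:
      labels:
        app: inference-demo
      annotations:
        # The annotation belongs on the Pod template, since the webhook mutates Pods.
        castops.io/optimize: "true"
    spec:
      containers:
        - name: worker                 # hypothetical container name
          image: registry.example.com/inference:latest   # placeholder image
          resources:
            limits:
              nvidia.com/gpu: 1        # the GPU request that gets rewritten
```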
Or use an explicit ratio for fine-grained control:
After the Pod is created, inspect its actual resource spec to confirm the rewrite:
nvidia.com/gpu was rewritten to nvidia.com/gpu-shared with the correct ratio: GPU slicing is active.
Uninstall
To remove CastSlice from your cluster:
This removes the MutatingWebhookConfiguration, so new Pods will no longer be mutated. Existing running Pods are unaffected since admission happens at creation time.