Single Node Training
Interactive Access
See the Jupyter notebooks documentation for how to configure and create a notebook with access to one or more GPUs on a single node.
Submission-based
With a Kubernetes Job resource, you can start your container with one or more GPUs.
apiVersion: batch/v1
kind: Job
metadata:
  name: job
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"
      labels:
        mount-eos: "true"
        inject-oauth2-token: "true"
    spec:
      containers:
        - name: job
          image: REPLACE_WITH_YOUR_IMAGE
          command:
            - REPLACE_WITH_COMMAND # one list entry per argument, e.g. "python", then "train.py"
          resources:
            limits:
              nvidia.com/gpu: REPLACE_WITH_NUMBER_OF_GPUS # e.g. 2
      restartPolicy: Never
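
As an illustration, the template above might look as follows once the placeholders are filled in. This is only a sketch: the Job name `train-job`, the image `registry.example.com/my-team/train:latest`, and the script `train.py` are hypothetical values chosen for the example, not names provided by the platform. The command is written as one list entry per argument, because Kubernetes executes the list directly rather than passing it through a shell.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: train-job  # hypothetical name; pick your own
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"
      labels:
        mount-eos: "true"
        inject-oauth2-token: "true"
    spec:
      containers:
        - name: train-job
          # Hypothetical image; replace with an image you have built and pushed.
          image: registry.example.com/my-team/train:latest
          # Each argument is a separate list entry (no shell parsing happens).
          command:
            - python
            - train.py
          resources:
            limits:
              nvidia.com/gpu: 2  # request two GPUs on a single node
      restartPolicy: Never
```

Assuming the manifest is saved as `job.yaml` and `kubectl` is configured for your namespace, it can typically be submitted with `kubectl apply -f job.yaml`, and its output followed with `kubectl logs -f job/train-job`.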