Jupyter Notebook
To provide a full data science experience, ml.cern.ch offers Jupyter Notebooks.
Every Notebook can be customized by installing the required packages.
For every Notebook, resources such as CPUs or GPUs can be added.
- Go to https://ml.cern.ch/_/jupyter/
- Select New Notebook
- Select a name that starts with a letter (not a number)
- Select an image under Custom Notebook
Tip
The fastest way to start development is to select one of the pre-built images to run a Jupyter notebook. Pre-built images contain most of the functionality needed for machine learning training and integrate with existing CERN services.
Pre-built images contain:
- Python 3
- TensorFlow, PyTorch or SciPy
- Common Python libraries: matplotlib, boto3, sklearn
- kfp library for integration with Kubeflow Pipelines
- Select the number of CPUs and amount of Memory
- Select the number of GPUs
Available GPUs
Personal profiles have a quota limit of 1 GPU.
- Configuring Additional Notebook Storage
- By default, EOS is automatically authenticated and mounted as your home directory in the notebook.
Storage
Volume storage should not be used for large datasets. Use EOS for most data storage needs.
- Under Workspace Volume, click on Add new volume
- Configure the new volume:
- Size: Specify the desired size (e.g., 3Gi)
- Storage Class: Select manila-meyrin-cephfs
- Access Mode: Select ReadWriteMany (recommended for most use cases)
- This volume is mounted at the default path of /home/jovyan
- Wait until the notebook server starts, then click Connect
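Once connected, the choices made in the steps above (image, GPU, storage) can be checked from the notebook terminal. The following is a minimal sketch, not an official procedure: the `check_module` helper is a hypothetical name introduced here, and it assumes `python3` is on the PATH and that `nvidia-smi` is only available when a GPU is attached.

```shell
# Report whether a Python module from the pre-built image list can be imported.
# check_module is a hypothetical helper, not part of ml.cern.ch.
check_module() {
  if python3 -c "import $1" >/dev/null 2>&1; then
    echo "$1: available"
  else
    echo "$1: missing"
  fi
}

# Import names of the common libraries listed for the pre-built images.
for module in matplotlib boto3 sklearn kfp; do
  check_module "$module"
done

# Show the GPU assigned to the notebook, if one was requested.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi
else
  echo "nvidia-smi not found: probably no GPU is attached to this notebook"
fi

# Show what is mounted at the home directory and how much space is left.
df -h "${HOME:-/home/jovyan}"
```

A "missing" result for a library is not necessarily an error; it may simply mean that the selected image does not ship it.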
How to deal with custom dependencies?
They can be installed within a notebook by running !pip3 install LIBRARY --user in a notebook cell, or by running pip3 install LIBRARY --user in the notebook terminal.
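Because of the `--user` flag, packages are installed into a per-user directory rather than into the image itself. A short sketch for the notebook terminal shows where that directory is and what it currently holds. It assumes `python3` and `pip` are available, as they are expected to be in the pre-built images; `USER_SITE` is just an illustrative variable name.

```shell
# Print the per-user site-packages directory targeted by `pip3 install --user`.
USER_SITE="$(python3 -m site --user-site)"
echo "User packages are installed under: ${USER_SITE}"

# List the packages installed there so far (no rows means nothing yet).
python3 -m pip list --user
```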
If you plan to use these dependencies more often, you can create a custom image with your dependencies.
FROM registry.cern.ch/kubeflow/kubeflownotebookswg/jupyter-scipy:v1.8.0
#Switch the image with any of the following based on your requirements
#registry.cern.ch/kubeflow/kubeflownotebookswg/jupyter-pytorch-cuda-full:v1.8.0
#registry.cern.ch/kubeflow/kubeflownotebookswg/jupyter-pytorch-full:v1.8.0
#registry.cern.ch/kubeflow/kubeflownotebookswg/jupyter-tensorflow-cuda-full:v1.8.0
#registry.cern.ch/kubeflow/kubeflownotebookswg/jupyter-tensorflow-full:v1.8.0
# Install required packages and modify the text file based on your requirements
COPY requirements.txt /requirements.txt
RUN pip3 install -r /requirements.txt
# The following line is mandatory:
CMD ["sh", "-c", \
"jupyter lab --notebook-dir=/home/jovyan --ip=0.0.0.0 --no-browser \
--allow-root --port=8888 --LabApp.token='' --LabApp.password='' \
--LabApp.allow_origin='*' --LabApp.base_url=${NB_PREFIX}"]
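The Dockerfile above copies a requirements.txt file from the build directory. As a rough illustration, such a file could be created as follows; the package names are examples only and are not prescribed by ml.cern.ch.

```shell
# Write an illustrative requirements.txt next to the Dockerfile.
# The package names are placeholders; list your own dependencies
# and pin versions where reproducibility matters.
cat > requirements.txt <<'EOF'
uproot
awkward
EOF

# Display the file to confirm its contents.
cat requirements.txt
```

If the dependencies were already installed in a running notebook with `--user`, running `pip3 freeze --user > requirements.txt` in the notebook terminal is one way to capture them with pinned versions instead of writing the file by hand.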
- Build the Docker image.
- Log in to the CERN Docker registry (see the Kubernetes docs for more help).
- Push the built custom image to a container registry.
- Create the new Notebook Server by following the steps here and use registry.cern.ch/[PROJECT_NAME]/[REPOSITORY_NAME]:[IMAGE_TAG] as the image.
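The build, login and push steps can be sketched with the standard Docker CLI. This is an assumption-laden outline rather than the exact commands of the original guide: the project, repository and tag values are hypothetical, and the `run` helper is introduced here only so that the commands are printed for review by default and executed when `RUN=1` is set.

```shell
# Hypothetical values; replace them with your own project, repository and tag.
PROJECT_NAME="my-project"
REPOSITORY_NAME="my-notebook"
IMAGE_TAG="v1"
IMAGE="registry.cern.ch/${PROJECT_NAME}/${REPOSITORY_NAME}:${IMAGE_TAG}"

# Print each command; execute it only when RUN=1 is set in the environment.
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

# Build from the Dockerfile in the current directory.
run docker build -t "${IMAGE}" .
# Log in to the registry (prompts for credentials when executed).
run docker login registry.cern.ch
# Push the image so that it can be selected as a custom image.
run docker push "${IMAGE}"
```

Saving this as a script and invoking it with `RUN=1` executes the three commands in order; without it, only the command lines are shown.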
How to reuse a custom image?
If you already have a base image that you want to use, you can extend it with the necessary tools to run a Jupyter notebook on top of it.
FROM [REPLACE_WITH_YOUR_BASE_IMAGE] AS jupyter
# Install kubectl so you can interact with Kubernetes
RUN apt update && apt install -y curl
RUN curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
RUN chmod +x ./kubectl
RUN mv ./kubectl /usr/local/bin
# Install Jupyter
RUN pip install jupyter
RUN mkdir /home/jovyan
RUN chmod 777 /home/jovyan
RUN useradd jovyan
# It is important to use this user in order to properly authenticate against EOS
USER jovyan
CMD ["sh", "-c", \
"jupyter lab --notebook-dir=/home/jovyan --ip=0.0.0.0 --no-browser \
--allow-root --port=8888 --LabApp.token='' --LabApp.password='' \
--LabApp.allow_origin='*' --LabApp.base_url=${NB_PREFIX}"]
- Build the Docker image.
- Log in to the CERN Docker registry.
- Push the built custom image to a container registry.
- Create the new Notebook Server by following the steps here and use registry.cern.ch/[PROJECT_NAME]/[REPOSITORY_NAME]:[IMAGE_TAG] as the image.
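Before creating the Notebook Server, it may be worth checking locally that the extended image contains what the Dockerfile above is meant to add. The sketch below assumes Docker is installed on the local machine and that the image has already been built; the image reference is a hypothetical example and `CHECKS` is an illustrative variable name.

```shell
# Hypothetical image reference; substitute your own project, repository and tag.
IMAGE="registry.cern.ch/my-project/my-notebook:v1"

# Commands run inside the container to confirm the Dockerfile's effects:
#   id -un                    prints the default user, expected to be jovyan
#   kubectl version --client  confirms that kubectl is on the PATH
#   jupyter lab --version     confirms that the command used in CMD exists
CHECKS='id -un && kubectl version --client && jupyter lab --version'

if command -v docker >/dev/null 2>&1; then
  # The trailing arguments replace the image's CMD for this one-off run.
  docker run --rm "${IMAGE}" sh -c "${CHECKS}"
else
  echo "docker not found; run this on a machine with Docker installed"
fi
```

If any of the three checks fails, the corresponding step in the Dockerfile is the first place to look.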