# Getting Started
This section will walk you through the steps for creating a notebook, setting up a volume, and running an end-to-end MNIST example that demonstrates hyperparameter tuning, distributed training, and model serving.
## Table of Contents
- Creating a Notebook
- Creating a Volume
- Running the MNIST E2E Example
- Predicting with the Trained Model
## Creating a Notebook

To create a new notebook:

1. Navigate to the **Notebooks** section in the left sidebar.
2. Click the **New Notebook** button.
3. Fill in the required fields:
   - **Name**: Choose a unique name for your notebook.
   - **Image**: Navigate to the **Custom Notebook** panel and choose the TensorFlow notebook image: `kubeflownotebookswg/jupyter-tensorflow-full:v1.8.0`
4. Click **Launch** to create and start your notebook server.
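Behind the scenes, the form creates a Kubeflow `Notebook` custom resource in your namespace. As a rough sketch of what the equivalent manifest looks like, built here as a Python dict (the exact field layout is an assumption based on the `kubeflow.org/v1` Notebook CRD and may differ between Kubeflow versions):

```python
# Sketch of the Notebook custom resource the UI form roughly corresponds to.
# Field layout is assumed from the kubeflow.org/v1 Notebook CRD and may vary.
notebook_manifest = {
    "apiVersion": "kubeflow.org/v1",
    "kind": "Notebook",
    "metadata": {
        "name": "my-notebook",            # the unique name you chose
        "namespace": "<your-namespace>",  # your user namespace
    },
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "my-notebook",
                        # the TensorFlow image chosen in the Custom Notebook panel
                        "image": "kubeflownotebookswg/jupyter-tensorflow-full:v1.8.0",
                    }
                ]
            }
        }
    },
}
```

The UI submits this resource for you; you normally never need to write it by hand.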
## Creating a Volume

To create a persistent volume for storing your models and data using the UI:

1. Navigate to the **Volumes** section in the left sidebar.
2. Click the **New Volume** button.
3. Fill in the required fields:
   - **Name**: Choose a unique name for your volume (e.g., `my-model-volume`)
   - **Size**: Specify the desired size (e.g., `3Gi`)
   - **Storage Class**: Select `manila-meyrin-cephfs`
   - **Access Mode**: Select `ReadWriteMany` for most use cases
4. Click **Create** to create the persistent volume claim.
You can now use this volume in your notebooks and pipelines to store persistent data.
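The form above corresponds to a standard Kubernetes `PersistentVolumeClaim`. As a sketch, using the example values from the steps above (built as a Python dict so the shape is easy to inspect; serialize it to YAML for `kubectl` if you prefer the command line):

```python
# Sketch of the PersistentVolumeClaim the volume form creates,
# using the example values from the steps above.
pvc_manifest = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "my-model-volume"},
    "spec": {
        "storageClassName": "manila-meyrin-cephfs",
        "accessModes": ["ReadWriteMany"],       # shared read/write across pods
        "resources": {"requests": {"storage": "3Gi"}},
    },
}
```

`ReadWriteMany` is what lets the training job and the serving pod mount the same volume, which is why it suits most use cases here.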
## Running the MNIST E2E Example

### Overview

This guide walks you through running an end-to-end MNIST example that demonstrates hyperparameter tuning, distributed training, and model serving with the MNIST dataset.

### Step-by-Step Guide
1. **Access Your Notebook Server**:
   - Navigate to the **Notebooks** section in the left sidebar.
   - Click the **CONNECT** button of the notebook server you created earlier to access it.

2. **Upload the MNIST E2E Notebook**:
   - Navigate to the notebook example.
   - Download the notebook and upload it to your notebook server.

3. **Open the Notebook**: Click the `mnist-e2e.ipynb` file in your Jupyter file browser to open it.

4. **Install Required Packages**: Run the package-installation commands in the first code cell of the notebook.

5. **Set Up Pipeline Parameters**: Look for the cell containing the pipeline parameters and update them as follows:

   ```python
   name = "mnist-e2e"
   namespace = "<your-namespace>"
   training_steps = "200"
   model_volume_name = "<your-volume-name>"
   ```

   Replace `<your-namespace>` with your user namespace and `<your-volume-name>` with the name of the persistent volume claim you created earlier.

6. **Run the Notebook Cells**: Execute each cell in the notebook sequentially by clicking the **Run** button or using the Shift+Enter keyboard shortcut. This will:
   - Define the Katib experiment for hyperparameter tuning
   - Create a TFJob for distributed training
   - Set up KServe for model inference
   - Launch the pipeline from the notebook

7. **Launch the Pipeline**:
   - Look for the cell that launches the pipeline.
   - Run this cell to launch the pipeline from your notebook.

8. **Monitor the Pipeline**:
   - After launching the pipeline, you'll see a link to the pipeline run in the notebook output under **Run Details**.
   - Click this link to open the Pipelines UI and monitor the progress of your pipeline.
   - Alternatively, navigate to the **Pipelines** section in the dashboard to see all running and completed pipelines.
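A common failure mode in step 5 is forgetting to replace the angle-bracket placeholders, which then makes later cells fail with confusing errors. A minimal sanity check you could paste after the parameters cell (variable names taken from the notebook; the example values are assumptions for illustration):

```python
# Pipeline parameters as set in the notebook. The namespace and volume name
# below are example values standing in for <your-namespace> and
# <your-volume-name>; use your own.
name = "mnist-e2e"
namespace = "kubeflow-user"
training_steps = "200"
model_volume_name = "my-model-volume"

# Fail fast if any placeholder was left unreplaced.
for value in (name, namespace, training_steps, model_volume_name):
    assert "<" not in value and ">" not in value, f"unreplaced placeholder: {value}"
```

If the assertion fires, go back to step 5 before running the rest of the notebook.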
## Predicting with the Trained Model
Once the pipeline has completed successfully, you can use the trained model for predictions:
1. In your notebook, the following code is used to send a prediction request:

   ```python
   import numpy as np
   from PIL import Image
   import requests

   # Specify the image URL
   image_url = "https://raw.githubusercontent.com/kubeflow/katib/master/examples/v1beta1/kubeflow-pipelines/images/9.bmp"
   image = Image.open(requests.get(image_url, stream=True).raw)

   # Convert to grayscale, resize to 28x28, and shape for the model
   data = np.array(image.convert('L').resize((28, 28))).astype(np.float64).reshape(-1, 28, 28, 1)
   data_formatted = np.array2string(data, separator=",", formatter={"float": lambda x: "%.1f" % x})
   json_request = '{{ "instances" : {} }}'.format(data_formatted)

   # Specify the prediction URL
   url = f"http://{name}.{namespace}.svc.cluster.local/v1/models/{name}:predict"
   response = requests.post(url, data=json_request)

   print("Prediction for the image")
   display(image)
   print(response.json())
   ```

2. Run the cell to see the prediction results for the sample image.
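The response follows the KServe v1 protocol, whose body contains a `predictions` list with one score vector per instance; the predicted digit is the index of the largest entry. A sketch of how you might read it, using a made-up response (in the notebook, the real values come from `response.json()`):

```python
# Example response in the KServe v1 shape; the probability values here are
# made up for illustration only.
response_json = {
    "predictions": [
        [0.0, 0.0, 0.01, 0.0, 0.0, 0.0, 0.0, 0.02, 0.0, 0.97]
    ]
}

# The predicted digit is the index of the highest score.
probs = response_json["predictions"][0]
predicted_digit = max(range(len(probs)), key=probs.__getitem__)
print(predicted_digit)  # 9 for this example vector
```

For the sample `9.bmp` image above, a well-trained model should report a high score for digit 9.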