
Getting Started

This section will walk you through the steps for creating a notebook, setting up a volume, and running an end-to-end MNIST example that demonstrates hyperparameter tuning, distributed training, and model serving.

Table of Contents

  1. Creating a Notebook
  2. Creating a Volume
  3. Running the MNIST E2E Example
  4. Predicting with the Trained Model

Creating a Notebook

To create a new notebook:

  1. Navigate to the Notebooks section in the left sidebar.
  2. Click the New Notebook button.
  3. Fill in the required fields:

    • Name: Choose a unique name for your notebook
    • Image:

      • Navigate to the Custom Notebook panel.

      • Choose the TensorFlow notebook image:

        • kubeflownotebookswg/jupyter-tensorflow-full:v1.8.0
  4. Click Launch to create and start your notebook server.

Creating a Volume

To create a persistent volume for storing your models and data using the UI:

  1. Navigate to the Volumes section in the left sidebar.
  2. Click the New Volume button.
  3. Fill in the required fields:
    • Name: Choose a unique name for your volume (e.g., my-model-volume)
    • Size: Specify the desired size (e.g., 3Gi)
    • Storage Class: Select manila-meyrin-cephfs
    • Access Mode: Select ReadWriteMany for most use cases
  4. Click Create to create the persistent volume claim.

You can now use this volume in your notebooks and pipelines to store persistent data.
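If you prefer the command line, an equivalent persistent volume claim can be created with kubectl. This is a sketch mirroring the UI steps above, using the example name, size, and storage class from that list:

```yaml
# PersistentVolumeClaim equivalent to the UI steps above.
# Apply with: kubectl apply -f pvc.yaml -n <your-namespace>
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-model-volume
spec:
  accessModes:
    - ReadWriteMany              # matches the Access Mode selected in the UI
  resources:
    requests:
      storage: 3Gi               # matches the Size field
  storageClassName: manila-meyrin-cephfs
```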

Running the MNIST E2E Example

Overview

This guide walks you through running the end-to-end MNIST example, which demonstrates hyperparameter tuning with Katib, distributed training with a TFJob, and model serving with KServe on the MNIST dataset.

Step-by-Step Guide

  1. Access Your Notebook Server:

    • Navigate to the "Notebooks" section in the left sidebar.
    • Click on the CONNECT button of the notebook server you created earlier to access it.
  2. Upload the MNIST E2E Notebook:

    • Navigate to the notebook example.
    • Download the notebook and upload it to your notebook server via the Jupyter file browser.
  3. Open the Notebook:

    • Click on the mnist-e2e.ipynb file in your Jupyter file browser to open it.
  4. Install Required Packages: Run the following commands in the first code cell of the notebook:

    !python3 -m pip install --no-cache-dir --force-reinstall --pre kfp
    !pip install kubeflow-katib==0.12.0
    
  5. Set Up Pipeline Parameters: Look for the cell containing the pipeline parameters and update them as follows:

    name = "mnist-e2e"
    namespace = "<your-namespace>"
    training_steps = "200"
    model_volume_name = "<your-volume-name>"
    

    Replace <your-namespace> with your user namespace and <your-volume-name> with the name of the persistent volume claim you created earlier.

  6. Run the Notebook Cells: Execute each cell in the notebook sequentially by clicking the Run button or using the Shift+Enter keyboard shortcut. This will:

    • Define the Katib experiment for hyperparameter tuning
    • Create a TFJob for distributed training
    • Set up KServe for model inference
    • Launch the pipeline from the notebook
  7. Launch the Pipeline:

    • Look for the cell that launches the pipeline:

      import kfp  # installed in step 4

      kfp_client = kfp.Client()
      run_id = kfp_client.create_run_from_pipeline_func(
          mnist_pipeline,
          namespace=namespace,
          arguments={},
      ).run_id
      print("Run ID:", run_id)
      
    • Run this cell to launch the pipeline from your notebook.
  8. Monitor the Pipeline:

    • After launching the pipeline, you'll see a link to the pipeline run in the notebook output under Run Details.
    • Click on this link to open the Pipelines UI and monitor the progress of your pipeline.
    • Alternatively, you can navigate to the "Pipelines" section in the dashboard to see all running and completed pipelines.
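To make step 6 more concrete, the Katib experiment the notebook defines has roughly the following shape. This is an illustrative sketch written as a plain dict; the actual notebook builds the object with the kubeflow-katib SDK, and the exact metric, parameter names, and ranges may differ from what is shown here:

```python
# Illustrative sketch of a Katib Experiment spec as a plain dict.
# Names, metric, and ranges are assumptions for illustration only.
experiment_spec = {
    "apiVersion": "kubeflow.org/v1beta1",
    "kind": "Experiment",
    "metadata": {"name": "mnist-e2e", "namespace": "<your-namespace>"},
    "spec": {
        # The objective tells Katib which metric to optimize and how.
        "objective": {
            "type": "minimize",
            "objectiveMetricName": "loss",
        },
        # Search strategy; Katib also supports grid, bayesian, etc.
        "algorithm": {"algorithmName": "random"},
        "maxTrialCount": 12,
        "parallelTrialCount": 3,
        # The hyperparameters Katib will vary across trials.
        "parameters": [
            {
                "name": "learning_rate",
                "parameterType": "double",
                "feasibleSpace": {"min": "0.01", "max": "0.05"},
            },
            {
                "name": "batch_size",
                "parameterType": "int",
                "feasibleSpace": {"min": "80", "max": "100"},
            },
        ],
    },
}

print(experiment_spec["spec"]["algorithm"]["algorithmName"])
```

Each trial runs the training job with one sampled combination of these parameters and reports the objective metric back to Katib.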

Predicting with the Trained Model

Once the pipeline has completed successfully, you can use the trained model for predictions:

  1. In your notebook, run the following code to send a prediction request to the inference service:

    import numpy as np
    from PIL import Image
    import requests
    
    # Download a sample digit image
    image_url = "https://raw.githubusercontent.com/kubeflow/katib/master/examples/v1beta1/kubeflow-pipelines/images/9.bmp"
    image = Image.open(requests.get(image_url, stream=True).raw)
    
    # Convert to the 28x28 grayscale format the MNIST model expects
    data = np.array(image.convert('L').resize((28, 28))).astype(np.float64).reshape(-1, 28, 28, 1)
    data_formatted = np.array2string(data, separator=",", formatter={"float": lambda x: "%.1f" % x})
    json_request = '{{ "instances" : {} }}'.format(data_formatted)
    
    # Send the request to the KServe inference service
    url = f"http://{name}.{namespace}.svc.cluster.local/v1/models/{name}:predict"
    response = requests.post(url, data=json_request)
    
    print("Prediction for the image")
    display(image)
    print(response.json())
    
  2. Run the cell to see the prediction results for the sample image.
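The response body follows the TensorFlow Serving v1 predict format: a JSON object with a "predictions" list containing one probability vector per input instance. A small sketch of extracting the predicted digit from such a response (the probabilities below are made up for illustration, not real model output):

```python
import numpy as np

# Example response body in the TF Serving v1 predict format;
# these probabilities are illustrative, not real model output.
response_json = {
    "predictions": [
        [0.0, 0.0, 0.01, 0.0, 0.0, 0.0, 0.0, 0.02, 0.0, 0.97]
    ]
}

# Each entry in "predictions" is one probability vector per input
# instance; the predicted class is the index of the largest value.
probs = np.array(response_json["predictions"][0])
predicted_digit = int(np.argmax(probs))
print("Predicted digit:", predicted_digit)  # → 9
```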