Pipelines: The Argo Workflows and Tekton backends have been consolidated, giving users the flexibility to run pipelines on either engine.
Hyperparameter Optimization: Katib now officially supports the ARM64 architecture.
Model Training: The Ray operator is now available, bringing Ray-based distributed computing to Kubeflow clusters; a minimal sketch follows below.
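As a hedged illustration of what the Ray operator enables, the sketch below creates a small `RayCluster` custom resource through the official Kubernetes Python client. The namespace, cluster name, image tag, and replica counts are placeholder assumptions, not values defined by this release.

```python
# Illustrative only: create a small RayCluster via the KubeRay CRD using the
# official Kubernetes Python client. Namespace, name, image tag, and replica
# counts below are placeholder values.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

ray_cluster = {
    "apiVersion": "ray.io/v1",
    "kind": "RayCluster",
    "metadata": {"name": "demo-raycluster", "namespace": "kubeflow-user-example-com"},
    "spec": {
        "headGroupSpec": {
            "rayStartParams": {"dashboard-host": "0.0.0.0"},
            "template": {
                "spec": {"containers": [{"name": "ray-head", "image": "rayproject/ray:2.9.0"}]}
            },
        },
        "workerGroupSpecs": [
            {
                "groupName": "workers",
                "replicas": 2,
                "minReplicas": 1,
                "maxReplicas": 4,
                "rayStartParams": {},
                "template": {
                    "spec": {"containers": [{"name": "ray-worker", "image": "rayproject/ray:2.9.0"}]}
                },
            }
        ],
    },
}

# Submit the custom resource; the Ray operator reconciles it into head/worker pods.
client.CustomObjectsApi().create_namespaced_custom_object(
    group="ray.io",
    version="v1",
    namespace="kubeflow-user-example-com",
    plural="rayclusters",
    body=ray_cluster,
)
```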
Model Serving: Enhanced Hugging Face Runtime Support: Hugging Face models are now supported out of the box via the new KServe Hugging Face Serving Runtime. Currently supported tasks include sequence classification, token classification, fill mask, text generation, and text-to-text generation; a deployment sketch follows below.
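The sketch below shows one way to use the runtime: an `InferenceService` that declares the `huggingface` model format and pulls a model from the Hugging Face Hub, applied with the official Kubernetes Python client. The service name, namespace, model id, runtime args, and resource limits are illustrative assumptions and may differ between KServe versions.

```python
# Hedged sketch: deploy a Hugging Face model with the KServe Hugging Face
# Serving Runtime by creating an InferenceService custom resource.
from kubernetes import client, config

config.load_kube_config()

isvc = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {
        "name": "bert-sequence-classifier",          # placeholder name
        "namespace": "kubeflow-user-example-com",    # placeholder namespace
    },
    "spec": {
        "predictor": {
            "model": {
                "modelFormat": {"name": "huggingface"},
                # Pull the model directly from the Hugging Face Hub; args are
                # illustrative and may vary by KServe release.
                "args": [
                    "--model_name=bert",
                    "--model_id=distilbert-base-uncased-finetuned-sst-2-english",
                ],
                "resources": {"limits": {"cpu": "2", "memory": "4Gi"}},
            }
        }
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="serving.kserve.io",
    version="v1beta1",
    namespace="kubeflow-user-example-com",
    plural="inferenceservices",
    body=isvc,
)
```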
LLMs
Fine-Tune APIs for LLMs: New APIs simplify fine-tuning LLMs with custom datasets; see the sketch below.
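As a hedged sketch of how such a fine-tune call might look, the example below uses the Training Operator's Python SDK `train()` entry point with Hugging Face model and dataset parameters. The import paths, class names, model/dataset ids, and resource values approximate the published SDK and are assumptions; they may differ between SDK releases.

```python
# Hedged sketch of an LLM fine-tuning call through the Training Operator SDK.
# Names and arguments below are approximations, not a definitive API reference.
import transformers
from kubeflow.training import TrainingClient
from kubeflow.storage_initializer.hugging_face import (
    HuggingFaceModelParams,
    HuggingFaceTrainerParams,
    HfDatasetParams,
)

TrainingClient().train(
    name="fine-tune-bert",                       # placeholder job name
    num_workers=2,
    resources_per_worker={"gpu": 1, "cpu": 4, "memory": "16G"},
    # Base model pulled from the Hugging Face Hub.
    model_provider_parameters=HuggingFaceModelParams(
        model_uri="hf://google-bert/bert-base-cased",
        transformer_type=transformers.AutoModelForSequenceClassification,
    ),
    # Custom dataset; referenced from the Hub in this sketch.
    dataset_provider_parameters=HfDatasetParams(
        repo_id="yelp_review_full",
        split="train[:1000]",
    ),
    trainer_parameters=HuggingFaceTrainerParams(
        training_parameters=transformers.TrainingArguments(
            output_dir="/tmp/fine-tuned",
            num_train_epochs=1,
            per_device_train_batch_size=8,
        ),
    ),
)
```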
vLLM Support: Dedicated runtime support for vLLM is now included, streamlining the deployment process for LLMs.
OpenAI Schema Integration: KServe now supports endpoints for generative transformer models, following the OpenAI protocol. This enables KServe to be used directly with OpenAI’s client libraries or third-party tools like LangChain and LlamaIndex. A usage sketch follows below.
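A minimal sketch of that integration, using the official OpenAI Python client pointed at a KServe-hosted generative model: the host, route prefix, and model name are placeholders, and the exact OpenAI-compatible path exposed by KServe may vary by release.

```python
# Hedged sketch: call a KServe generative endpoint through the OpenAI client.
from openai import OpenAI

client = OpenAI(
    base_url="http://llama-default.example.com/openai/v1",  # placeholder KServe endpoint
    api_key="not-used",  # no OpenAI key is needed; cluster auth is handled separately
)

response = client.chat.completions.create(
    model="llama3",  # must match the name of the served model
    messages=[{"role": "user", "content": "Summarize the Kubeflow release highlights."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```

Because the endpoint speaks the OpenAI protocol, the same base URL can also be handed to tools such as LangChain or LlamaIndex wherever they accept an OpenAI-compatible server.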