Fundamentals

Overview

The Model Management module streamlines the handling of machine learning models from inception to deployment and monitoring, encompassing three key stages.

Registered models and their versions form the core of our model management system, allowing users to organize and track different iterations of machine learning models. Each registered model can have multiple versions, each associated with specific configurations necessary for deployment. After setting up the desired configuration, a model version can be readily deployed to meet operational needs. For detailed guidance on managing registered models and their versions, visit the Registered Models & Versions page. To learn more about configuring and deploying model versions, refer to the Model Version Configuration page.
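As a rough illustration of the relationship described above, the following sketch models a registered model that holds several versions, each carrying its own deployment configuration. The class names, fields, and the rule that a version is deployable once it has a configuration are assumptions made for this example, not the module's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    """One iteration of a registered model (hypothetical structure)."""
    number: int
    config: dict = field(default_factory=dict)  # deployment configuration

    def is_deployable(self) -> bool:
        # Assumption: a version can be deployed once a configuration is attached.
        return bool(self.config)


@dataclass
class RegisteredModel:
    """A named model that groups its versions (hypothetical structure)."""
    name: str
    versions: list = field(default_factory=list)

    def add_version(self, config=None) -> ModelVersion:
        # Version numbers simply count up from 1 in this sketch.
        version = ModelVersion(number=len(self.versions) + 1,
                               config=dict(config or {}))
        self.versions.append(version)
        return version


# Hypothetical usage: two versions, only the second has a configuration.
model = RegisteredModel(name="churn-classifier")
v1 = model.add_version()
v2 = model.add_version({"cpu": "2", "memory": "4Gi"})
```

Here `v1` has no configuration yet, so `v1.is_deployable()` is false, while `v2` is ready to deploy under the assumed rule.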

Key Features

  • Model Selection Options: Choose from a comprehensive model repository, bundle custom inference code with a model for specific runtime behavior, or supply a custom Docker image.

  • Resource Specification: Detailed specification of computational resources such as CPUs, GPUs, and memory, ensuring optimal performance.

  • Advanced Configuration Setup: Configuration of serving parameters and dependencies tailored to enhance model performance and compatibility.

  • Kubernetes-Powered Deployment: Utilization of Kubernetes for deploying models in a containerized environment, leveraging its scalability and reliability.

  • Streamlined Deployment Process: Efficient and guided deployment process that includes a clear manifest and scheduler for optimized resource allocation.
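To make the resource specification and Kubernetes deployment features more concrete, here is a minimal sketch that assembles a standard Kubernetes `Deployment` manifest as a Python dictionary. The function name, its parameters, the image name, and the port are hypothetical; only the manifest field names (`apiVersion`, `resources.requests`, `resources.limits`, `nvidia.com/gpu`, and so on) follow the public Kubernetes schema. The manifest this module actually generates may look different.

```python
def build_deployment_manifest(name, image, cpu, memory, gpus=0, replicas=1):
    """Return a Kubernetes Deployment manifest as a dict (illustrative helper)."""
    resources = {
        "requests": {"cpu": cpu, "memory": memory},
        "limits": {"cpu": cpu, "memory": memory},
    }
    if gpus > 0:
        # GPUs are extended resources and are declared under limits.
        resources["limits"]["nvidia.com/gpu"] = str(gpus)

    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "resources": resources,
                            # Assumed serving port for this example only.
                            "ports": [{"containerPort": 8080}],
                        }
                    ]
                },
            },
        },
    }


# Hypothetical usage: one replica with 2 CPUs, 4 GiB of memory, and 1 GPU.
manifest = build_deployment_manifest(
    name="churn-classifier-v2",
    image="registry.example.com/churn-classifier:2",
    cpu="2",
    memory="4Gi",
    gpus=1,
)
```

A dictionary like this can be serialized to YAML or JSON and applied to a cluster. Setting requests equal to limits, as done here, is one simple choice that keeps scheduling predictable.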