Alauda AI

# Guides

Inference Service

  • Advantages
  • Core Features
  • Create Inference Service
  • Inference Service Template Management
  • Inference Service Update
  • Calling the Published Inference Service
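The topics above center on KServe's `InferenceService` resource (`serving.kserve.io/v1beta1`, documented under the API Reference). As a rough sketch of what "Create Inference Service" involves, a minimal manifest might look like the following; the name, namespace, model format, and storage URI are all illustrative placeholders, not values from this documentation:

```yaml
# Hypothetical minimal InferenceService manifest (illustrative values only)
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-model          # illustrative service name
  namespace: example-namespace # illustrative namespace
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn          # example model format; depends on your runtime
      storageUri: gs://example-bucket/models/example-model  # illustrative model location
```

Applied with `kubectl apply -f`, such a manifest asks the serving controller to deploy a predictor for the referenced model; the guides listed above cover the platform-specific details of creation, templates, updates, and invocation.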