Introduction to Red Hat build of Kueue
Red Hat build of Kueue is a Kubernetes-native system that manages access to resources for jobs. It decides when a job should wait, when a job is admitted to start (at which point its pods can be created), and when a job should be preempted (at which point its active pods are deleted).
Note
In the context of Red Hat build of Kueue, a job can be defined as a one-time or on-demand task that runs to completion.
Red Hat build of Kueue is based on the Kueue open source project.
Red Hat build of Kueue is compatible with environments that use heterogeneous, elastic resources. This means that the environment has many different resource types, and those resources are capable of dynamic scaling.
Red Hat build of Kueue does not replace any existing components in a Kubernetes cluster, but instead integrates with the existing Kubernetes API server, scheduler, and cluster autoscaler components.
Red Hat build of Kueue supports all-or-nothing semantics. This means that either an entire job with all of its components is admitted to the cluster, or the entire job is rejected if it does not fit on the cluster.
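The all-or-nothing rule can be illustrated with a minimal sketch. This is not the Red Hat build of Kueue implementation: the PodGroup class, the admit_all_or_nothing function, and the single CPU quota are hypothetical simplifications used only to show that a job is admitted as a whole or not at all.

```python
from dataclasses import dataclass


@dataclass
class PodGroup:
    """A set of identical pods that belong to one job (hypothetical model)."""
    count: int
    cpu_per_pod: int  # whole CPUs, to keep the arithmetic simple


def admit_all_or_nothing(pod_groups, free_cpu):
    """Return True only if every pod of the job fits in the free quota.

    There is no partial admission: if the total request exceeds the
    free quota, the whole job is turned away.
    """
    needed = sum(group.count * group.cpu_per_pod for group in pod_groups)
    return needed <= free_cpu


# A job with one 2-CPU driver pod and four 1-CPU worker pods needs 6 CPUs.
job = [PodGroup(count=1, cpu_per_pod=2), PodGroup(count=4, cpu_per_pod=1)]

print(admit_all_or_nothing(job, free_cpu=8))  # True: all 5 pods fit
print(admit_all_or_nothing(job, free_cpu=5))  # False: nothing is admitted
```

In the second call, four of the five pods would fit on their own, but the job is still not admitted, because running only part of it would consume resources without allowing the job to complete.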
Personas
Different personas exist in a Red Hat build of Kueue workflow.
- Batch administrators
  Batch administrators manage the cluster infrastructure and establish quotas and queues.
- Batch users
  Batch users run jobs on the cluster. Examples of batch users might be researchers, AI/ML engineers, or data scientists.
- Serving users
  Serving users run jobs on the cluster, for example, to expose a trained AI/ML model for inference.
- Platform developers
  Platform developers integrate Red Hat build of Kueue with other software. They might also contribute to the Kueue open source project.
Workflow overview
The Red Hat build of Kueue workflow can be described at a high level as follows:
- Batch administrators create and configure ResourceFlavor, LocalQueue, and ClusterQueue resources.
- User personas create jobs on the cluster.
- The Kubernetes API server validates and accepts job data.
- Red Hat build of Kueue admits jobs based on configured options, such as order or quota. It injects affinity into the job by using resource flavors, and creates a Workload object that corresponds to each job.
- The applicable controller for the job type creates pods.
- The Kubernetes scheduler assigns pods to a node in the cluster.
- The Kubernetes cluster autoscaler provisions more nodes as required.
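The first two steps of the workflow can be sketched as the objects that an administrator and a user might create. The following is an illustration only. It assumes the kueue.x-k8s.io/v1beta1 API and the kueue.x-k8s.io/queue-name label from the upstream Kueue project; verify both against your installed version. All object names (default-flavor, cluster-queue, user-queue, team-a, sample-job) and the container image are hypothetical placeholders.

```python
import json

# Assumption: API group/version and queue label as used by upstream Kueue.
API_VERSION = "kueue.x-k8s.io/v1beta1"
QUEUE_LABEL = "kueue.x-k8s.io/queue-name"


def resource_flavor(name):
    """Cluster-scoped flavor describing one kind of resource."""
    return {"apiVersion": API_VERSION, "kind": "ResourceFlavor",
            "metadata": {"name": name}}


def cluster_queue(name, flavor, cpu_quota):
    """Cluster-scoped queue holding a CPU quota for one flavor."""
    return {
        "apiVersion": API_VERSION,
        "kind": "ClusterQueue",
        "metadata": {"name": name},
        "spec": {
            "namespaceSelector": {},  # empty selector: all namespaces
            "resourceGroups": [{
                "coveredResources": ["cpu"],
                "flavors": [{
                    "name": flavor,
                    "resources": [{"name": "cpu", "nominalQuota": cpu_quota}],
                }],
            }],
        },
    }


def local_queue(name, namespace, cluster_queue_name):
    """Namespaced queue that points users at a ClusterQueue."""
    return {"apiVersion": API_VERSION, "kind": "LocalQueue",
            "metadata": {"name": name, "namespace": namespace},
            "spec": {"clusterQueue": cluster_queue_name}}


def queued_job(name, namespace, queue):
    """A batch/v1 Job labeled for a LocalQueue and created suspended."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace,
                     "labels": {QUEUE_LABEL: queue}},
        "spec": {
            "suspend": True,  # the job stays paused until it is admitted
            "template": {"spec": {
                "restartPolicy": "Never",
                "containers": [{
                    "name": "main",
                    "image": "registry.example.com/sleep:latest",  # placeholder
                    "resources": {"requests": {"cpu": "1"}},
                }],
            }},
        },
    }


manifests = [
    resource_flavor("default-flavor"),
    cluster_queue("cluster-queue", "default-flavor", cpu_quota=8),
    local_queue("user-queue", "team-a", "cluster-queue"),
    queued_job("sample-job", "team-a", "user-queue"),
]
print(json.dumps(manifests, indent=2))
```

The printed JSON shows how the pieces reference each other: the job names a LocalQueue through its label, the LocalQueue names a ClusterQueue, and the ClusterQueue names the ResourceFlavor that carries the quota.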