Configuring virtual GPUs
If you have graphics processing unit (GPU) cards, OpenShift Virtualization can automatically create virtual GPUs (vGPUs) that you can assign to virtual machines (VMs).
About using virtual GPUs with OpenShift Virtualization
Some graphics processing unit (GPU) cards support the creation of virtual GPUs (vGPUs). OpenShift Virtualization can automatically create vGPUs and other mediated devices if an administrator provides configuration details in the HyperConverged custom resource (CR).
This automation is especially useful for large clusters.
Note
Refer to your hardware vendor’s documentation for functionality and support details.
- Mediated device: A physical device that is divided into one or more virtual devices. A vGPU is a type of mediated device (mdev); the performance of the physical GPU is divided among the virtual devices. You can assign mediated devices to one or more virtual machines (VMs), but the number of guests must be compatible with your GPU. Some GPUs do not support multiple guests.
Preparing hosts for mediated devices
You must enable the Input-Output Memory Management Unit (IOMMU) driver before you can configure mediated devices.
Adding kernel arguments to enable the IOMMU driver
To enable the IOMMU driver in the kernel, create a `MachineConfig` object that adds the required kernel arguments.
Prerequisites:

- You have cluster administrator permissions.
- Your CPU hardware is Intel or AMD.
- You enabled Intel Virtualization Technology for Directed I/O extensions or AMD IOMMU in the BIOS.
- You have installed the OpenShift CLI (`oc`).
Procedure:

- Create a `MachineConfig` object that identifies the kernel argument. The following example shows a kernel argument for an Intel CPU:

  ```yaml
  apiVersion: machineconfiguration.openshift.io/v1
  kind: MachineConfig
  metadata:
    labels:
      machineconfiguration.openshift.io/role: worker
    name: 100-worker-iommu
  spec:
    config:
      ignition:
        version: 3.2.0
    kernelArguments:
      - intel_iommu=on
  # ...
  ```

  where:

  - `machineconfiguration.openshift.io/role: worker`: Applies the new kernel argument only to worker nodes.
  - `100-worker-iommu`: Indicates the ranking of this kernel argument (100) among the machine configs and its purpose.
  - `intel_iommu=on`: Identifies the kernel argument for an Intel CPU. If you have an AMD CPU, specify `amd_iommu=on` instead.
- Create the new `MachineConfig` object:

  ```shell
  $ oc create -f 100-worker-kernel-arg-iommu.yaml
  ```
- Verify that the new `MachineConfig` object was added by entering the following command and observing the output:

  ```shell
  $ oc get MachineConfig
  ```

  Example output:

  ```text
  NAME                                IGNITIONVERSION   AGE
  00-master                           3.5.0             164m
  00-worker                           3.5.0             164m
  01-master-container-runtime         3.5.0             164m
  01-master-kubelet                   3.5.0             164m
  01-worker-container-runtime         3.5.0             164m
  01-worker-kubelet                   3.5.0             164m
  100-master-chrony-configuration     3.5.0             169m
  100-master-set-core-user-password   3.5.0             169m
  100-worker-chrony-configuration     3.5.0             169m
  100-worker-iommu                    3.5.0             14s
  ```
- Verify that IOMMU is enabled at the operating system (OS) level by entering the following command:

  ```shell
  $ dmesg | grep -i iommu
  ```

  If IOMMU is enabled, output is displayed as shown in the following example:

  Example output:

  ```text
  Intel: [    0.000000] DMAR: Intel(R) IOMMU Driver
  AMD:   [    0.000000] AMD-Vi: IOMMU Initialized
  ```
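The Intel and AMD cases above differ only in the kernel argument. The following bash sketch, which is illustrative and not part of the OpenShift docs, maps the `vendor_id` string reported in `/proc/cpuinfo` to the matching argument for the `MachineConfig` object:

```shell
#!/bin/bash
# Sketch: map a CPU vendor_id string (as reported in /proc/cpuinfo)
# to the IOMMU kernel argument the MachineConfig object should carry.
iommu_kernel_arg() {
  case "$1" in
    GenuineIntel) echo "intel_iommu=on" ;;
    AuthenticAMD) echo "amd_iommu=on" ;;
    *) echo "unsupported CPU vendor: $1" >&2; return 1 ;;
  esac
}

# On a live node you could detect the vendor with:
#   awk -F': ' '/vendor_id/ {print $2; exit}' /proc/cpuinfo
iommu_kernel_arg GenuineIntel   # prints intel_iommu=on
iommu_kernel_arg AuthenticAMD   # prints amd_iommu=on
```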
Configuring the NVIDIA GPU Operator
You can use the NVIDIA GPU Operator to provision worker nodes for running GPU-accelerated virtual machines (VMs) in OpenShift Virtualization.
Note
The NVIDIA GPU Operator is supported only by NVIDIA. For more information, see Obtaining Support from NVIDIA in the Red Hat Knowledgebase.
Using the NVIDIA GPU Operator
You can use the NVIDIA GPU Operator with OpenShift Virtualization to accelerate the deployment of worker nodes for running GPU-enabled virtual machines (VMs). The NVIDIA GPU Operator manages NVIDIA GPU resources in an OpenShift Container Platform cluster and automates tasks when preparing nodes for GPU workloads.
The NVIDIA GPU Operator can also facilitate provisioning complex artificial intelligence and machine learning (AI/ML) workloads.
Procedure:

- Configure your `ClusterPolicy` manifest to match the following example:

  ```yaml
  apiVersion: nvidia.com/v1
  kind: ClusterPolicy
  metadata:
    name: gpu-cluster-policy
  spec:
    daemonsets:
      updateStrategy: RollingUpdate
    dcgm:
      enabled: true
    dcgmExporter: {}
    devicePlugin: {}
    driver:
      enabled: false
      kernelModuleType: auto
    gfd: {}
    mig:
      strategy: single
    migManager:
      enabled: true
    nodeStatusExporter:
      enabled: true
    operator:
      defaultRuntime: crio
      initContainer: {}
      runtimeClass: nvidia
      use_ocp_driver_toolkit: true
    sandboxDevicePlugin:
      enabled: true
    sandboxWorkloads:
      defaultWorkload: vm-vgpu
      enabled: true
    toolkit:
      enabled: true
      installDir: /usr/local/nvidia
    validator:
      plugin:
        env:
          - name: WITH_WORKLOAD
            value: "true"
    vfioManager:
      enabled: true
    vgpuDeviceManager:
      config:
        default: default
        name: vgpu-devices-config
      enabled: true
    vgpuManager:
      enabled: true
      image: <vgpu_image_name>
      repository: <vgpu_container_registry>
      version: <nvidia_vgpu_manager_version>
  ```

  where:
  - `<vgpu_image_name>`: Specifies the vGPU image name.
  - `<vgpu_container_registry>`: Specifies the vGPU container registry value.
  - `<nvidia_vgpu_manager_version>`: Specifies the version of the vGPU driver that you downloaded from the NVIDIA website and used to build the image.
- Use the NVIDIA GPU Operator to configure mediated devices. For more information, see NVIDIA GPU Operator with OpenShift Virtualization.
Labeling nodes with a MIG-backed vGPU profile
If you have GPUs that support NVIDIA Multi-Instance GPU (MIG), you can select a MIG-backed vGPU instance instead of time-sliced vGPU instances. When you use MIG, you give a partition of dedicated hardware to selected VMs.
Prerequisites:

- You have configured vGPU support. For more information, see MIG Support in OpenShift Container Platform.
- You have the NVIDIA GPU Operator version 25.10 or higher.
- You are using the NVIDIA AI Enterprise (AIE) vGPU Manager image.
Procedure:

- Label the node with the name of the MIG-backed vGPU profile:

  ```shell
  $ oc label node <node> --overwrite nvidia.com/vgpu.config=<profile>
  ```

  - Replace `<node>` with the fully qualified domain name (FQDN) of your compute node.
  - Replace `<profile>` with a supported MIG profile.

  For example:

  ```shell
  $ oc label node worker_1 --overwrite nvidia.com/vgpu.config=A30-1-6C
  ```
For more information about MIG profiles, see the MIG User Guide.
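If you need to apply the same profile to several nodes, you can loop over them. The following bash sketch is a hypothetical helper, not part of the product; it prints each `oc label` command as a dry run so you can review before executing:

```shell
#!/bin/bash
# Sketch: emit the "oc label" command for each node in a list (dry run).
# Remove the leading "echo" to run the commands against a live cluster.
label_nodes_with_profile() {
  local profile="$1"; shift
  local node
  for node in "$@"; do
    echo oc label node "$node" --overwrite "nvidia.com/vgpu.config=$profile"
  done
}

label_nodes_with_profile A30-1-6C worker_1 worker_2
```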
How vGPUs are assigned to nodes
For each physical device, OpenShift Virtualization configures the following values:
- A single `mdev` type.
- The maximum number of instances of the selected `mdev` type.
The cluster architecture affects how devices are created and assigned to nodes.
- Large cluster with multiple cards per node: On nodes with multiple cards that can support similar vGPU types, the relevant device types are created in a round-robin manner. For example:

  ```yaml
  # ...
  mediatedDevicesConfiguration:
    mediatedDeviceTypes:
      - nvidia-222
      - nvidia-228
      - nvidia-105
      - nvidia-108
  # ...
  ```

  In this scenario, each node has two cards, both of which support the following vGPU types:

  ```text
  nvidia-105
  # ...
  nvidia-108
  nvidia-217
  nvidia-299
  # ...
  ```

  On each node, OpenShift Virtualization creates the following vGPUs:

  - 16 vGPUs of type nvidia-105 on the first card.
  - 2 vGPUs of type nvidia-108 on the second card.
- One node has a single card that supports more than one requested vGPU type: OpenShift Virtualization uses the supported type that comes first on the `mediatedDeviceTypes` list.

  For example, the card on a node supports `nvidia-223` and `nvidia-224`, and the following `mediatedDeviceTypes` list is configured:

  ```yaml
  # ...
  mediatedDevicesConfiguration:
    mediatedDeviceTypes:
      - nvidia-22
      - nvidia-223
      - nvidia-224
  # ...
  ```

  In this example, OpenShift Virtualization uses the `nvidia-223` type.
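The first-match rule above can be expressed as a short script. The following bash sketch is illustrative only, not OpenShift Virtualization code: it walks the configured `mediatedDeviceTypes` in order and returns the first type the card supports:

```shell
#!/bin/bash
# Sketch: pick the mdev type for a card the way the docs describe -
# the first entry in mediatedDeviceTypes that the card supports.
select_mdev_type() {
  local configured="$1"   # space-separated mediatedDeviceTypes, in order
  local supported="$2"    # space-separated types the card supports
  local t s
  for t in $configured; do
    for s in $supported; do
      if [ "$t" = "$s" ]; then
        echo "$t"
        return 0
      fi
    done
  done
  return 1  # no configured type is supported by this card
}

select_mdev_type "nvidia-22 nvidia-223 nvidia-224" "nvidia-223 nvidia-224"
# prints nvidia-223: nvidia-22 is skipped because the card does not support it
```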
Managing mediated devices
Before you can assign mediated devices to virtual machines, you must create the devices and expose them to the cluster. You can also reconfigure and remove mediated devices.
Creating and exposing mediated devices
As an administrator, you can create mediated devices and expose them to the cluster by editing the HyperConverged custom resource (CR). Before you edit the CR, explore a worker node to find the configuration values that are specific to your hardware devices.
Prerequisites:

- You installed the OpenShift CLI (`oc`).
- You enabled the Input-Output Memory Management Unit (IOMMU) driver.
- If your hardware vendor provides drivers, you installed them on the nodes where you want to create mediated devices.
- If you use NVIDIA cards, you installed the NVIDIA GRID driver.

Procedure:
- Identify the name selector and resource name values for the mediated devices by exploring a worker node:

  - Start a debugging session with the worker node by using the `oc debug` command. For example:

    ```shell
    $ oc debug node/node-11.redhat.com
    ```

  - Change the root directory of the shell process to the file system of the host node by running the following command:

    ```shell
    # chroot /host
    ```

  - Navigate to the `mdev_bus` directory and view its contents. Each subdirectory name is a PCI address of a physical GPU. For example:

    ```shell
    # cd /sys/class/mdev_bus && ls
    ```

    Example output:

    ```text
    0000:4b:00.4
    ```

  - Go to the directory for your physical device and list the supported mediated device types as defined by the hardware vendor. For example:

    ```shell
    # cd 0000:4b:00.4 && ls mdev_supported_types
    ```

    Example output:

    ```text
    nvidia-742  nvidia-744  nvidia-746  nvidia-748  nvidia-750  nvidia-752
    nvidia-743  nvidia-745  nvidia-747  nvidia-749  nvidia-751  nvidia-753
    ```

  - Select the mediated device type that you want to use and identify its name selector value by viewing the contents of its `name` file. For example:

    ```shell
    # cat nvidia-745/name
    ```

    Example output:

    ```text
    NVIDIA A2-2Q
    ```
- Open the `HyperConverged` CR in your default editor by running the following command:

  ```shell
  $ oc edit hyperconverged kubevirt-hyperconverged -n openshift-cnv
  ```
- Create and expose the mediated devices by updating the configuration:

  - Create mediated devices by adding them to the `spec.mediatedDevicesConfiguration` stanza.
  - Expose the mediated devices to the cluster by adding the `mdevNameSelector` and `resourceName` values to the `spec.permittedHostDevices.mediatedDevices` stanza. The `resourceName` value is based on the `mdevNameSelector` value, but with underscores instead of spaces.

  Example `HyperConverged` CR:

  ```yaml
  apiVersion: hco.kubevirt.io/v1
  kind: HyperConverged
  metadata:
    name: kubevirt-hyperconverged
    namespace: openshift-cnv
  spec:
    mediatedDevicesConfiguration:
      mediatedDeviceTypes:
        - nvidia-745
      nodeMediatedDeviceTypes:
        - mediatedDeviceTypes:
            - nvidia-746
          nodeSelector:
            kubernetes.io/hostname: node-11.redhat.com
    permittedHostDevices:
      mediatedDevices:
        - mdevNameSelector: NVIDIA A2-2Q
          resourceName: nvidia.com/NVIDIA_A2-2Q
          externalResourceProvider: true
        - mdevNameSelector: NVIDIA A2-4Q
          resourceName: nvidia.com/NVIDIA_A2-4Q
          externalResourceProvider: true
  # ...
  ```

  where:
  - `mediatedDeviceTypes`: Specifies global settings for the cluster. Required.
  - `nodeMediatedDeviceTypes`: Specifies global configuration overrides for a specific node or group of nodes. Optional; must be used with the global `mediatedDeviceTypes` configuration.
  - `mediatedDeviceTypes` (under `nodeMediatedDeviceTypes`): Specifies an override to the global `mediatedDeviceTypes` configuration for the specified nodes. Required if you use `nodeMediatedDeviceTypes`.
  - `nodeSelector`: Specifies the node selector and must include a `key: value` pair. Required if you use `nodeMediatedDeviceTypes`.
  - `mdevNameSelector`: Specifies the mediated devices that map to this value on the host.
  - `resourceName`: Specifies the matching resource name that is allocated on the node.
- Save your changes and exit the editor.
- Confirm that the virtual GPU is attached to the node by running the following command:

  ```shell
  $ oc get node <node_name> -o json | jq '.status.allocatable
    | with_entries(select(.key | startswith("nvidia.com/")))
    | with_entries(select(.value != "0"))'
  ```
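The `resourceName` convention used above, the `mdevNameSelector` string with spaces replaced by underscores and a vendor-domain prefix, can be sketched as a small helper. This is an illustrative bash function, not product code, and it assumes the `nvidia.com` prefix shown in the example:

```shell
#!/bin/bash
# Sketch: derive resourceName from an mdevNameSelector value by replacing
# spaces with underscores and prefixing the vendor domain.
mdev_resource_name() {
  local selector="$1" vendor="${2:-nvidia.com}"
  printf '%s/%s\n' "$vendor" "$(printf '%s' "$selector" | tr ' ' '_')"
}

mdev_resource_name "NVIDIA A2-2Q"   # prints nvidia.com/NVIDIA_A2-2Q
```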
About changing and removing mediated devices
As an administrator, you can change or remove mediated devices by editing the HyperConverged custom resource (CR).
You can reconfigure or remove mediated devices in several ways:
- Edit the `HyperConverged` CR and change the contents of the `mediatedDeviceTypes` stanza.
- Change the node labels that match the `nodeMediatedDeviceTypes` node selector.
- Remove the device information from the `spec.mediatedDevicesConfiguration` and `spec.permittedHostDevices` stanzas of the `HyperConverged` CR.

Note
If you remove the device information from the `spec.permittedHostDevices` stanza without also removing it from the `spec.mediatedDevicesConfiguration` stanza, you cannot create a new mediated device type on the same node. To properly remove mediated devices, remove the device information from both stanzas.
Removing mediated devices from the cluster
To remove a mediated device from the cluster, delete the information for that device from the HyperConverged custom resource (CR).
Prerequisites:

- You have installed the OpenShift CLI (`oc`).
Procedure:

- Edit the `HyperConverged` CR in your default editor by running the following command:

  ```shell
  $ oc edit hyperconverged kubevirt-hyperconverged -n openshift-cnv
  ```

- Remove the device information from the `spec.mediatedDevicesConfiguration` and `spec.permittedHostDevices` stanzas of the `HyperConverged` CR. Removing both entries ensures that you can later create a new mediated device type on the same node. For example:

  ```yaml
  apiVersion: hco.kubevirt.io/v1
  kind: HyperConverged
  metadata:
    name: kubevirt-hyperconverged
    namespace: openshift-cnv
  spec:
    mediatedDevicesConfiguration:
      mediatedDeviceTypes:
        - nvidia-231
    permittedHostDevices:
      mediatedDevices:
        - mdevNameSelector: GRID T4-2Q
          resourceName: nvidia.com/GRID_T4-2Q
  ```

  - To remove the `nvidia-231` device type, delete it from the `mediatedDeviceTypes` array.
  - To remove the `GRID T4-2Q` device, delete the `mdevNameSelector` field and its corresponding `resourceName` field.
- Save your changes and exit the editor.
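As an alternative to interactive editing, both entries can be removed non-interactively with a JSON patch. The following bash sketch is illustrative: it only builds the patch document, and it assumes you already know the array indexes of the entries to remove (JSON Patch removes by index, so verify positions first):

```shell
#!/bin/bash
# Sketch: build a JSON patch that removes a mediated device type and its
# permitted-host-device entry together, so both stanzas stay in sync.
build_removal_patch() {
  local type_index="$1" device_index="$2"
  printf '[{"op":"remove","path":"/spec/mediatedDevicesConfiguration/mediatedDeviceTypes/%s"},{"op":"remove","path":"/spec/permittedHostDevices/mediatedDevices/%s"}]\n' \
    "$type_index" "$device_index"
}

# In a live cluster, you could apply the patch with:
#   oc patch hyperconverged kubevirt-hyperconverged -n openshift-cnv \
#     --type=json -p "$(build_removal_patch 0 0)"
build_removal_patch 0 0
```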
Using mediated devices
You can assign mediated devices to one or more virtual machines.
Assigning a vGPU to a VM by using the CLI
Assign mediated devices such as virtual GPUs (vGPUs) to virtual machines (VMs).
Prerequisites:

- The mediated device is configured in the `HyperConverged` custom resource.
- The virtual machine (VM) is stopped.
Procedure:

- Assign the mediated device to a VM by editing the `spec.template.spec.domain.devices.gpus` stanza of the `VirtualMachine` manifest.

  Example virtual machine manifest:

  ```yaml
  apiVersion: kubevirt.io/v1
  kind: VirtualMachine
  spec:
    template:
      spec:
        domain:
          devices:
            gpus:
              - deviceName: nvidia.com/TU104GL_Tesla_T4
                name: gpu1
              - deviceName: nvidia.com/GRID_T4-2Q
                name: gpu2
  ```

  - `spec.template.spec.domain.devices.gpus.deviceName` specifies the resource name associated with the mediated device.
  - `spec.template.spec.domain.devices.gpus.name` specifies a name to identify the device on the VM.
- To verify that the device is available from the virtual machine, run the following command, substituting `<device_name>` with the `deviceName` value from the `VirtualMachine` manifest:

  ```shell
  $ lspci -nnk | grep <device_name>
  ```
Assigning a vGPU to a VM by using the web console
You can assign virtual GPUs to virtual machines by using the OpenShift Container Platform web console.
Note
You can add hardware devices to virtual machines created from customized templates or a YAML file. You cannot add devices to pre-supplied boot source templates for specific operating systems.
Prerequisites:

- The vGPU is configured as a mediated device in your cluster.

  To view the devices that are connected to your cluster, click Compute → Hardware Devices from the side menu.

- The VM is stopped.
Procedure:

- In the OpenShift Container Platform web console, click Virtualization → VirtualMachines from the side menu.
- Select the VM that you want to assign the device to.
- On the Details tab, click GPU devices.
- Click Add GPU device.
- Enter an identifying value in the Name field.
- From the Device name list, select the device that you want to add to the VM.
- Click Save.
- To confirm that the devices were added to the VM, click the YAML tab and review the `VirtualMachine` configuration. Mediated devices are added to the `spec.template.spec.domain.devices` stanza.