Collecting data for Red Hat Support

When you submit a support case to Red Hat Support, it is helpful to provide debugging information for OpenShift Container Platform and OpenShift Virtualization by using the following tools:

must-gather tool: The must-gather tool collects diagnostic information, including resource definitions and service logs.
Prometheus: Prometheus is a time-series database and a rule evaluation engine for metrics. Prometheus sends alerts to Alertmanager for processing.
Alertmanager: The Alertmanager service handles alerts received from Prometheus and sends them to external notification systems.

For information about the OpenShift Container Platform monitoring stack, see About OpenShift Container Platform monitoring.

Collecting data about your environment

Collecting data about your environment minimizes the time required to analyze and determine the root cause.

Prerequisites

Set the retention time for Prometheus metrics data to a minimum of seven days.
Configure the Alertmanager to capture relevant alerts and to send alert notifications to a dedicated mailbox so that they can be viewed and persisted outside the cluster.
Record the exact number of affected nodes and virtual machines.
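
As an illustration of the retention prerequisite, the following sketch prints a cluster-monitoring-config config map that sets the Prometheus retention time to seven days. The retention_configmap function is a hypothetical helper, not part of any Red Hat tool. The sketch assumes that you use the default platform monitoring stack and that no other settings exist in the config map. If the config map already contains settings, merge the retention field into it instead of replacing it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a cluster-monitoring-config ConfigMap that sets
# the retention time for the platform Prometheus instance.
# Usage: retention_configmap <retention>   (for example: 7d)
retention_configmap() {
  local retention="${1:-7d}"
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      retention: ${retention}
EOF
}

# Print the manifest for review. To apply it, pipe the output to:
#   oc apply -f -
retention_configmap 7d
```

Printing the manifest instead of applying it directly lets you compare it with the existing configuration before you change the cluster.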

Procedure

Collecting data about virtual machines

Collecting data about malfunctioning virtual machines (VMs) minimizes the time required to analyze and determine the root cause.

Prerequisites

Linux VMs: Install the latest QEMU guest agent.
Windows VMs:
- Record the Windows patch update details.
- Install the latest VirtIO drivers.
- Install the latest QEMU guest agent.
- If Remote Desktop Protocol (RDP) is enabled, connect by using the desktop viewer to determine whether there is a problem with the connection software.

Procedure

Collect must-gather data for the VMs using the /usr/bin/gather script.
Collect screenshots of VMs that have crashed before you restart them.
Collect memory dumps from VMs before remediation attempts.
Record factors that the malfunctioning VMs have in common. For example, the VMs have the same host or network.

Using the must-gather tool for OpenShift Virtualization

You can collect data about OpenShift Virtualization resources by running the must-gather command with the OpenShift Virtualization image.

The default data collection includes information about the following resources:

OpenShift Virtualization Operator namespaces, including child objects
OpenShift Virtualization custom resource definitions
Namespaces that contain virtual machines
Basic virtual machine definitions

You can add optional environment details and scripts to the must-gather command to collect additional information. Use these environment variables and scripts to collect data about specific VMs, images, or instance types.

Prerequisites

You have installed the OpenShift CLI (oc).

Procedure

Run the must-gather command to collect data about OpenShift Virtualization:
```
$ oc adm must-gather \
  --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel9:v4.21.0 \
  -- /usr/bin/gather
```
Note

You can also collect must-gather logs for all Operators and products on your cluster by running the following command:
```
$ oc adm must-gather --all-images
```
1. Run the following command to modify the number of processes running in parallel when collecting must-gather data:
  $ oc adm must-gather \
    --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel9:v4.21.0 \
    -- PROS=<number> /usr/bin/gather
  PROS defines the number of parallel processes running to collect data. The default number of processes is 5. Increasing the number of processes may result in faster data collection, but uses more resources. Increasing the maximum number of parallel processes is not recommended.
2. Run the following command to collect detailed information for a specific VM in a specific namespace:
  $ oc adm must-gather \
    --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel9:v4.21.0 \
    -- NS=<namespace_name> VM=<vm_name> /usr/bin/gather --vms_details
  NS is the environment variable for namespace. It is mandatory when using the VM environment variable.
3. Run the following command to collect image, image-stream, and image-stream-tags information from the cluster:
  $ oc adm must-gather \
    --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel9:v4.21.0 \
    -- /usr/bin/gather --images
4. Run the following command to collect information about instance types from the cluster:
  $ oc adm must-gather \
    --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel9:v4.21.0 \
    -- /usr/bin/gather --instancetypes

must-gather tool options

To troubleshoot complex issues and collect specific data beyond the default logs, add optional parameters to the must-gather command when gathering information from your cluster.

You can specify a combination of scripts and environment variables for the following options:

Collecting detailed virtual machine (VM) information from a namespace
Collecting detailed information about specified VMs
Collecting image, image-stream, and image-stream-tags information
Limiting the maximum number of parallel processes used by the must-gather tool

Environment variables

You can specify environment variables for a compatible script.

NS=<namespace_name>: Collect virtual machine information, including virt-launcher pod details, from the namespace that you specify. The VirtualMachine and VirtualMachineInstance CR data is collected for all namespaces.
VM=<vm_name>: Collect details about a particular virtual machine. To use this option, you must also specify a namespace by using the NS environment variable.
PROS=<number_of_processes>: Modify the maximum number of parallel processes that the must-gather tool uses. The default value is 5.

Important

Using too many parallel processes can cause performance issues. Increasing the maximum number of parallel processes is not recommended.

Scripts

Each script is compatible only with certain environment variable combinations.

/usr/bin/gather: Use the default must-gather script, which collects cluster data from all namespaces and includes only basic VM information. This script is compatible only with the PROS variable.
/usr/bin/gather --vms_details: Collect VM log files, VM definitions, control-plane logs, and namespaces that belong to OpenShift Virtualization resources. Specifying namespaces includes their child objects. If you use this parameter without specifying a namespace or VM, the must-gather tool collects this data for all VMs in the cluster. This script is compatible with all environment variables, but you must specify a namespace if you use the VM variable.
/usr/bin/gather --images: Collect image, image-stream, and image-stream-tags custom resource information. This script is compatible only with the PROS variable.
/usr/bin/gather --instancetypes: Collect instance type information. This information is not collected by default, but you can collect it optionally.

Usage and examples

You can run a script by itself or with one or more compatible environment variables.

must-gather syntax with optional parameters

$ oc adm must-gather \
  --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel9:v4.21.0 \
  -- <environment_variable_1> <environment_variable_2> <script_name>

Table 1. Compatible parameters
| Script | Compatible environment variables |
| --- | --- |
| `/usr/bin/gather` | `PROS=<number_of_processes>` |
| `/usr/bin/gather --vms_details` | For a namespace: `NS=<namespace_name>`; for a VM: `VM=<vm_name> NS=<namespace_name>`; `PROS=<number_of_processes>` |
| `/usr/bin/gather --images` | `PROS=<number_of_processes>` |
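
The compatibility rules for the --vms_details script can be sketched as a small wrapper that assembles the command and rejects a VM value without a namespace. The build_vms_details_cmd function and the names my-namespace and my-vm are hypothetical and only illustrate the syntax shown earlier. The sketch prints the command instead of running it, so you can review it first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) a must-gather command for the
# /usr/bin/gather --vms_details script.
IMAGE="registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel9:v4.21.0"

# Usage: build_vms_details_cmd <namespace> <vm> <processes>
# Pass an empty string for any value that you want to omit.
build_vms_details_cmd() {
  local ns="${1:-}" vm="${2:-}" pros="${3:-}"
  local cmd=(oc adm must-gather "--image=${IMAGE}" --)

  # The VM variable is valid only together with the NS variable.
  if [ -n "$vm" ] && [ -z "$ns" ]; then
    echo "error: VM requires NS" >&2
    return 1
  fi

  if [ -n "$ns" ]; then cmd+=("NS=${ns}"); fi
  if [ -n "$vm" ]; then cmd+=("VM=${vm}"); fi
  if [ -n "$pros" ]; then cmd+=("PROS=${pros}"); fi

  cmd+=(/usr/bin/gather --vms_details)
  echo "${cmd[*]}"
}

# Example: one VM in one namespace, limited to 3 parallel processes.
build_vms_details_cmd my-namespace my-vm 3
```

Building the command as an array keeps each argument separate, which makes it straightforward to execute the array directly later if you decide to extend the sketch.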

Generating a VM memory dump

When a virtual machine (VM) terminates unexpectedly, you can use the virtctl memory-dump command to generate a VM memory dump and save it on a persistent volume claim (PVC). Afterwards, you can analyze the memory dump to diagnose and troubleshoot issues on the VM.

Prerequisites

The hot plug feature gate is enabled in the HyperConverged custom resource. To enable it, run the following command:

$ oc patch hyperconverged kubevirt-hyperconverged -n openshift-cnv \
  --type json \
  -p '[{"op": "add", "path": "/spec/featureGates", "value": "HotplugVolumes"}]'

Optional: You have an existing PVC on which you want to save the memory dump.
- The PVC volume mode must be Filesystem.
- The PVC must be large enough to contain the memory dump.
  
  The formula for calculating the PVC size is (VMMemorySize + 100Mi) * FileSystemOverhead, where 100Mi is the memory dump overhead, and FileSystemOverhead is defined in the HCO object.
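
The formula can be worked through with a short calculation. This sketch assumes that FileSystemOverhead is a fraction, such as 0.055, that is applied as a multiplier of 1 plus the fraction. Both that interpretation and the value 0.055 are assumptions for illustration, so check the actual value in the HCO object. The memory_dump_pvc_size_mi function is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate the PVC size for a memory dump, in Mi,
# rounded up to the next whole Mi.
# Assumption: the overhead is passed in thousandths (55 means 0.055) and is
# applied as a multiplier of (1 + overhead).
# Usage: memory_dump_pvc_size_mi <vm_memory_mi> <overhead_thousandths>
memory_dump_pvc_size_mi() {
  local vm_memory_mi="$1" overhead_permille="$2"
  local dump_overhead_mi=100   # memory dump overhead from the formula
  local scaled=$(( (vm_memory_mi + dump_overhead_mi) * (1000 + overhead_permille) ))
  # Integer ceiling division by 1000.
  echo $(( (scaled + 999) / 1000 ))
}

# Example: a VM with 4Gi (4096Mi) of memory and an assumed overhead of 0.055.
echo "$(memory_dump_pvc_size_mi 4096 55)Mi"   # prints 4427Mi
```

Rounding up ensures that the requested size is never smaller than the calculated value.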

Procedure

Create a memory dump of the required VM:
- If you have an existing PVC selected on which you want to save the memory dump:
  $ virtctl memory-dump get <vm_name> --claim-name=<pvc_name>
- If you want to create a new PVC for the memory dump:
  $ virtctl memory-dump get <vm_name> --claim-name=<new_pvc_name> --create-claim

Download the memory dump:

$ virtctl memory-dump download <vm_name> --output=<output_file>

Attach the memory dump to a Red Hat Support case.

Alternatively, you can inspect the memory dump, for example by using the volatility3 tool.
Optional: Remove the memory dump:
```
$ virtctl memory-dump remove <vm_name>
```