Maintenance and support for monitoring
Not all configuration options for the monitoring stack are exposed. The only supported way of configuring OpenShift Container Platform monitoring is by configuring the Cluster Monitoring Operator (CMO) using the options described in the Config map reference for the Cluster Monitoring Operator. Do not use other configurations, as they are unsupported.
Configuration paradigms might change across Prometheus releases, and such cases can only be handled gracefully if all configuration possibilities are controlled. If you use configurations other than those described in the Config map reference for the Cluster Monitoring Operator, your changes will be lost: by design, the CMO automatically reconciles any differences and resets unsupported changes to the originally defined state.
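As a minimal sketch of what a supported change looks like, the following Python snippet builds a manifest for the CMO config map. It assumes that the config map is named cluster-monitoring-config, lives in the openshift-monitoring project, and carries its settings under a config.yaml key. The retention setting is only an illustrative option, so confirm names and options against the Config map reference for your release.

```python
import json

# Assumed names: verify them against the Config map reference for your release.
CONFIG_MAP_NAME = "cluster-monitoring-config"
NAMESPACE = "openshift-monitoring"


def build_cmo_config_map(config_yaml: str) -> dict:
    """Return a ConfigMap manifest that carries CMO settings in data["config.yaml"]."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": CONFIG_MAP_NAME, "namespace": NAMESPACE},
        "data": {"config.yaml": config_yaml},
    }


# Illustrative setting only: keep platform Prometheus data for 24 hours.
example_config = "prometheusK8s:\n  retention: 24h\n"
manifest = build_cmo_config_map(example_config)

# JSON is used here so that the sketch needs only the standard library.
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

The printed JSON could be saved to a file and applied with oc apply -f. Only the content of config.yaml changes between configurations, which keeps every setting inside the path that the CMO controls.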
Support considerations for monitoring
Note
Backward compatibility for metrics, recording rules, or alerting rules is not guaranteed.
The following modifications are explicitly not supported:
- Creating additional ServiceMonitor, PodMonitor, and PrometheusRule objects in the openshift-* and kube-* projects.
- Modifying any resources or objects deployed in the openshift-monitoring or openshift-user-workload-monitoring projects. The resources created by the OpenShift Container Platform monitoring stack are not meant to be used by any other resources, as there are no guarantees about their backward compatibility.
  Note
  The Alertmanager configuration is deployed as the alertmanager-main secret resource in the openshift-monitoring namespace. If you have enabled a separate Alertmanager instance for user-defined alert routing, an Alertmanager configuration is also deployed as the alertmanager-user-workload secret resource in the openshift-user-workload-monitoring namespace. To configure additional routes for any instance of Alertmanager, you need to decode, modify, and then encode that secret. This procedure is a supported exception to the preceding statement.
- Modifying resources of the stack. The OpenShift Container Platform monitoring stack ensures its resources are always in the state it expects them to be. If they are modified, the stack will reset them.
- Deploying user-defined workloads to openshift-*, and kube-* projects. These projects are reserved for Red Hat provided components and they should not be used for user-defined workloads.
- Enabling symptom based monitoring by using the Probe custom resource definition (CRD) in Prometheus Operator.
- Manually deploying monitoring resources into namespaces that have the openshift.io/cluster-monitoring: "true" label.
- Adding the openshift.io/cluster-monitoring: "true" label to namespaces. This label is reserved only for the namespaces with core OpenShift Container Platform components and Red Hat certified components.
- Installing custom Prometheus instances on OpenShift Container Platform. A custom instance is a Prometheus custom resource (CR) managed by the Prometheus Operator.
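The decode, modify, and encode cycle described in the Alertmanager note can be sketched in Python. This is illustrative only. It assumes that you have already extracted the base64-encoded configuration value from the secret, and add_marker is a hypothetical stand-in for a real routing change.

```python
import base64
from typing import Callable


def rewrite_secret_value(encoded: str, modify: Callable[[str], str]) -> str:
    """Decode a base64 secret value, apply `modify` to the text, and re-encode it."""
    decoded = base64.b64decode(encoded).decode("utf-8")
    updated = modify(decoded)
    return base64.b64encode(updated.encode("utf-8")).decode("ascii")


def add_marker(config_text: str) -> str:
    # Hypothetical edit: a real change would add routes or receivers instead.
    return config_text + "# reviewed\n"


# Stand-in for the encoded value read from the secret (not real cluster data).
original_value = base64.b64encode(b"route:\n  receiver: default\n").decode("ascii")
updated_value = rewrite_secret_value(original_value, add_marker)

print(base64.b64decode(updated_value).decode("utf-8"))
```

Keeping the edit in a separate function makes it easy to inspect the decoded text before it is encoded again and written back to the secret.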
Support policy for monitoring Operators
Monitoring Operators ensure that OpenShift Container Platform monitoring resources function as designed and tested. If Cluster Version Operator (CVO) control of an Operator is overridden, the Operator does not respond to configuration changes, reconcile the intended state of cluster objects, or receive updates.
While overriding CVO control for an Operator can be helpful during debugging, this is unsupported and the cluster administrator assumes full control of the individual component configurations and upgrades.
The spec.overrides parameter can be added to the configuration for the CVO to allow administrators to provide a list of overrides to the behavior of the CVO for a component. Setting the spec.overrides[].unmanaged parameter to true for a component blocks cluster upgrades and alerts the administrator after a CVO override has been set:
Disabling ownership via cluster version overrides prevents upgrades. Please remove overrides before continuing.
Warning
Setting a CVO override puts the entire cluster in an unsupported state and prevents the monitoring stack from being reconciled to its intended state. This impacts the reliability features built into Operators and prevents updates from being received. Reported issues must be reproduced after removing any overrides for support to proceed.
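Because overrides must be removed before support can proceed, it can help to check which ones are set. The following Python sketch lists unmanaged overrides from a ClusterVersion object that has been exported as JSON, for example with oc get clusterversion version -o json. The sample document is made up for illustration, and the field names (kind, group, namespace, name, unmanaged) are assumptions about the override entry layout that you should verify on your cluster.

```python
import json
from typing import List


def unmanaged_overrides(cluster_version: dict) -> List[str]:
    """Return kind/namespace/name for every override that has unmanaged set to true."""
    overrides = cluster_version.get("spec", {}).get("overrides") or []
    return [
        f'{entry.get("kind")}/{entry.get("namespace")}/{entry.get("name")}'
        for entry in overrides
        if entry.get("unmanaged") is True
    ]


# Illustrative sample, not real cluster output. The second entry is hypothetical.
sample = json.loads("""
{
  "kind": "ClusterVersion",
  "metadata": {"name": "version"},
  "spec": {
    "overrides": [
      {"kind": "Deployment", "group": "apps", "namespace": "openshift-monitoring",
       "name": "cluster-monitoring-operator", "unmanaged": true},
      {"kind": "Deployment", "group": "apps", "namespace": "openshift-example",
       "name": "example-operator", "unmanaged": false}
    ]
  }
}
""")

found = unmanaged_overrides(sample)
print(found)
```

An empty result means that no component is currently excluded from CVO management according to the exported object.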
Support version matrix for monitoring components
The following matrix contains information about versions of monitoring components for OpenShift Container Platform 4.12 and later releases:
| OpenShift Container Platform | Prometheus Operator | Prometheus | Metrics Server | Alertmanager | kube-state-metrics agent | monitoring-plugin | node-exporter agent | Thanos |
|---|---|---|---|---|---|---|---|---|
| 4.20 | 0.85.0 | 3.5.0 | 0.8.0 | 0.28.1 | 2.16.0 | 1.0.0 | 1.9.1 | 0.39.2 |
| 4.19 | 0.81.0 | 3.2.1 | 0.7.2 | 0.28.1 | 2.15.0 | 1.0.0 | 1.9.1 | 0.37.2 |
| 4.18 | 0.78.1 | 2.55.1 | 0.7.2 | 0.27.0 | 2.13.0 | 1.0.0 | 1.8.2 | 0.36.1 |
| 4.17 | 0.75.2 | 2.53.1 | 0.7.1 | 0.27.0 | 2.13.0 | 1.0.0 | 1.8.2 | 0.35.1 |
| 4.16 | 0.73.2 | 2.52.0 | 0.7.1 | 0.26.0 | 2.12.0 | 1.0.0 | 1.8.0 | 0.35.0 |
| 4.15 | 0.70.0 | 2.48.0 | 0.6.4 | 0.26.0 | 2.10.1 | 1.0.0 | 1.7.0 | 0.32.5 |
| 4.14 | 0.67.1 | 2.46.0 | N/A | 0.25.0 | 2.9.2 | 1.0.0 | 1.6.1 | 0.30.2 |
| 4.13 | 0.63.0 | 2.42.0 | N/A | 0.25.0 | 2.8.1 | N/A | 1.5.0 | 0.30.2 |
| 4.12 | 0.60.1 | 2.39.1 | N/A | 0.24.0 | 2.6.0 | N/A | 1.4.0 | 0.28.1 |
Note
The openshift-state-metrics agent and Telemeter Client are OpenShift-specific components. Therefore, their versions correspond with the versions of OpenShift Container Platform.
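As a small worked example of reading the matrix, the following Python sketch finds the OpenShift Container Platform releases between which the bundled Prometheus major version changes. The version values are copied from the Prometheus column of the matrix. The helper names are hypothetical.

```python
from typing import Dict, List, Tuple

# Prometheus versions copied from the matrix, keyed by OpenShift Container Platform release.
PROMETHEUS_BY_RELEASE: Dict[str, str] = {
    "4.20": "3.5.0",
    "4.19": "3.2.1",
    "4.18": "2.55.1",
    "4.17": "2.53.1",
    "4.16": "2.52.0",
    "4.15": "2.48.0",
    "4.14": "2.46.0",
    "4.13": "2.42.0",
    "4.12": "2.39.1",
}


def release_key(release: str) -> Tuple[int, ...]:
    """Sort key that compares releases numerically, so 4.9 would sort before 4.12."""
    return tuple(int(part) for part in release.split("."))


def major_version(version: str) -> int:
    """Return the leading component of a dotted version string."""
    return int(version.split(".")[0])


def major_version_changes(matrix: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (older, newer) release pairs where the component major version differs."""
    releases = sorted(matrix, key=release_key)
    return [
        (older, newer)
        for older, newer in zip(releases, releases[1:])
        if major_version(matrix[older]) != major_version(matrix[newer])
    ]


changes = major_version_changes(PROMETHEUS_BY_RELEASE)
print(changes)
```

With the values listed here, the only major version change falls between releases 4.18 and 4.19.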