Updating a cluster using the CLI
You can perform minor version and patch updates on an OpenShift Container Platform cluster by using the OpenShift CLI (oc).
Prerequisites
- Have access to the cluster as a user with `admin` privileges. See Using RBAC to define and apply permissions.
- Have a recent etcd backup in case your update fails and you must restore your cluster to a previous state.
- Have a recent Container Storage Interface (CSI) volume snapshot in case you need to restore persistent volumes due to a pod failure.
- Your RHEL7 workers are replaced with RHEL8 or RHCOS workers. Red Hat does not support in-place RHEL7 to RHEL8 updates for RHEL workers; those hosts must be replaced with a clean operating system install.
- You have updated all Operators previously installed through Operator Lifecycle Manager (OLM) to a version that is compatible with your target release. Updating the Operators ensures they have a valid update path when the default software catalogs switch from the current minor version to the next during a cluster update. See Updating installed Operators for more information on how to check compatibility and, if necessary, update the installed Operators.
- Ensure that all machine config pools (MCPs) are running and not paused. Nodes associated with a paused MCP are skipped during the update process. You can pause the MCPs if you are performing a canary rollout update strategy.
- If your cluster uses manually maintained credentials, update the cloud provider resources for the new release. For more information, including how to determine if this is a requirement for your cluster, see Preparing to update a cluster with manually maintained credentials.
- Ensure that you address all `Upgradeable=False` conditions so the cluster allows an update to the next minor version. An alert displays at the top of the Cluster Settings page when you have one or more cluster Operators that cannot be updated. You can still update to the next available patch update for the minor release you are currently on.
- If you run an Operator or you have configured any application with a pod disruption budget, you might experience an interruption during the update process. If `minAvailable` is set to 1 in `PodDisruptionBudget`, the nodes are drained to apply pending machine configs, which might block the eviction process. If several nodes are rebooted, all the pods might run on only one node, and the `PodDisruptionBudget` field can prevent the node drain.
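To illustrate the last point, a `PodDisruptionBudget` similar to the following sketch (all names here are hypothetical, not taken from this procedure) can block a drain: if only one matching pod exists, `minAvailable: 1` means the pod can never be evicted, so the node drain during the update can hang.

```yaml
# Hypothetical PodDisruptionBudget that can block node drains during an update:
# with minAvailable: 1 and a single matching pod, eviction is never permitted.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-pdb        # illustrative name
  namespace: example-app   # illustrative namespace
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: example
```

Scaling the workload to more than one replica, or relaxing `minAvailable`, lets the drain proceed.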
Important
- When an update is failing to complete, the Cluster Version Operator (CVO) reports the status of any blocking components while attempting to reconcile the update. Rolling your cluster back to a previous version is not supported. If your update is failing to complete, contact Red Hat support.
- Using the `unsupportedConfigOverrides` section to modify the configuration of an Operator is unsupported and might block cluster updates. You must remove this setting before you can update your cluster.
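One way to check for this setting, sketched here against the etcd operator resource (other `*.operator.openshift.io` resources carry the same field), is to query the field directly; empty output means no overrides are set on that resource:

```shell
# Print any unsupportedConfigOverrides on the etcd operator resource (etcd is
# used as an example; repeat for other operator resources you have modified).
oc get etcd cluster -o jsonpath='{.spec.unsupportedConfigOverrides}{"\n"}'
```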
Pausing a MachineHealthCheck resource
During the update process, nodes in the cluster might become temporarily unavailable. In the case of worker nodes, the MachineHealthCheck resources might identify such nodes as unhealthy and reboot them. To avoid rebooting such nodes, pause all the MachineHealthCheck resources before updating the cluster.
Note
Some MachineHealthCheck resources might not need to be paused. If your MachineHealthCheck resource relies only on unrecoverable conditions, pausing that resource is unnecessary.
- Install the OpenShift CLI (`oc`).
- To list all the available `MachineHealthCheck` resources that you want to pause, run the following command:

```
$ oc get machinehealthcheck -n openshift-machine-api
```

- To pause the machine health checks, add the `cluster.x-k8s.io/paused=""` annotation to the `MachineHealthCheck` resource. Run the following command:

```
$ oc -n openshift-machine-api annotate mhc <mhc-name> cluster.x-k8s.io/paused=""
```

The annotated `MachineHealthCheck` resource resembles the following YAML file:

```yaml
apiVersion: machine.openshift.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: example
  namespace: openshift-machine-api
  annotations:
    cluster.x-k8s.io/paused: ""
spec:
  selector:
    matchLabels:
      role: worker
  unhealthyConditions:
  - type: "Ready"
    status: "Unknown"
    timeout: "300s"
  - type: "Ready"
    status: "False"
    timeout: "300s"
  maxUnhealthy: "40%"
status:
  currentHealthy: 5
  expectedMachines: 5
```

Important
Resume the machine health checks after updating the cluster. To resume the check, remove the pause annotation from the `MachineHealthCheck` resource by running the following command:

```
$ oc -n openshift-machine-api annotate mhc <mhc-name> cluster.x-k8s.io/paused-
```
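If the cluster has several MachineHealthCheck resources, they can be paused in one pass. The following is a sketch, not part of the official procedure, assuming `oc` is logged in with sufficient privileges:

```shell
# Pause every MachineHealthCheck in the namespace: -o name prints one resource
# per line, and --overwrite makes the command safe to re-run.
oc get machinehealthcheck -n openshift-machine-api -o name \
  | xargs -I{} oc -n openshift-machine-api annotate {} cluster.x-k8s.io/paused="" --overwrite
```

Running the same pipeline with `cluster.x-k8s.io/paused-` as the final annotate argument removes the annotation from every resource again after the update.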
About updating single node OpenShift Container Platform
You can update, or upgrade, a single-node OpenShift Container Platform cluster by using either the console or CLI.
However, note the following limitations:
- The prerequisite to pause the `MachineHealthCheck` resources is not required because there is no other node to perform the health check.
- Restoring a single-node OpenShift Container Platform cluster using an etcd backup is not officially supported. However, it is good practice to perform the etcd backup in case your update fails. If your control plane is healthy, you might be able to restore your cluster to a previous state by using the backup.
- Updating a single-node OpenShift Container Platform cluster requires downtime and can include an automatic reboot. The amount of downtime depends on the update payload, as described in the following scenarios:
  - If the update payload contains an operating system update, which requires a reboot, the downtime is significant and impacts cluster management and user workloads.
  - If the update contains machine configuration changes that do not require a reboot, the downtime is less, and the impact on cluster management and user workloads is lessened. In this case, the node draining step is skipped with single-node OpenShift Container Platform because there is no other node in the cluster to reschedule the workloads to.
  - If the update payload does not contain an operating system update or machine configuration changes, a short API outage occurs and resolves quickly.

Important
There are conditions, such as bugs in an updated package, that can cause the single node to fail to restart after a reboot. In this case, the update does not roll back automatically.

- For information on which machine configuration changes require a reboot, see the note in About the Machine Config Operator.
Updating a cluster by using the CLI
You can use the OpenShift CLI (oc) to review and request cluster updates.
You can find information about available OpenShift Container Platform advisories and updates in the errata section of the Customer Portal.
- Install the OpenShift CLI (`oc`) that matches the version of your updated cluster.
- Log in to the cluster as a user with `cluster-admin` privileges.
- Pause all `MachineHealthCheck` resources.
- View the available updates and note the version number of the update that you want to apply:

```
$ oc adm upgrade recommend
```

Example output

```
Failing=True:

  Reason: ClusterOperatorNotAvailable
  Message: Cluster operator monitoring is not available

...

Upstream update service: https://api.integration.openshift.com/api/upgrades_info/graph
Channel: candidate-4.16 (available channels: candidate-4.16, candidate-4.17, candidate-4.18, eus-4.16, fast-4.16, fast-4.17, stable-4.16, stable-4.17)

Updates to 4.16:

  VERSION   ISSUES
  4.16.32   no known issues relevant to this cluster
  4.16.30   no known issues relevant to this cluster

And 2 older 4.16 updates you can see with '--show-outdated-releases' or '--version VERSION'.
```

Note
- You can use the `--version` flag to determine whether a specific version is recommended for your update. If there are no recommended updates, updates that have known issues might still be available.
- For details and information on how to perform a Control Plane Only update, see the Preparing to perform a Control Plane Only update page, listed in the Additional resources section.
- Based on your organization's requirements, set the appropriate update channel. For example, you can set your channel to `stable-4.19` or `fast-4.19`. For more information about channels, refer to Understanding update channels and releases, listed in the Additional resources section.

```
$ oc adm upgrade channel <channel>
```

For example, to set the channel to `stable-4.19`:

```
$ oc adm upgrade channel stable-4.19
```

Important
For production clusters, you must subscribe to a `stable-*`, `eus-*`, or `fast-*` channel.

Note
When you are ready to move to the next minor version, choose the channel that corresponds to that minor version. The sooner the update channel is declared, the more effectively the cluster can recommend update paths to your target version. The cluster might take some time to evaluate all the possible updates that are available and offer the best update recommendations to choose from. Update recommendations can change over time, as they are based on what update options are available at the time.

If you cannot see an update path to your target minor version, keep updating your cluster to the latest patch release for your current version until the next minor version is available in the path.
- Apply an update:
  - To update to the latest version:

```
$ oc adm upgrade --to-latest=true
```

  - To update to a specific version:

```
$ oc adm upgrade --to=<version>
```

`<version>` is the update version that you obtained from the output of the `oc adm upgrade recommend` command.

Important
When using `oc adm upgrade --help`, there is a listed option for `--force`. This is heavily discouraged, because using the `--force` option bypasses cluster-side guards, including release verification and precondition checks. Using `--force` does not guarantee a successful update. Bypassing guards puts the cluster at risk.
- If the cluster administrator evaluates the potential known risks and decides it is acceptable for the current cluster, then the administrator can waive the safety guards and proceed with the update by running the following command:

```
$ oc adm upgrade --allow-not-recommended --to <version>
```

- Optional: Review the status of the Cluster Version Operator by running the following command:

```
$ oc adm upgrade status
```

Note
To monitor the update in real time, run `oc adm upgrade status` in a `watch` utility.
- After the update completes, you can confirm that the cluster version has updated to the new version:

```
$ oc adm upgrade
```

Example output

```
Cluster version is <version>

Upstream is unset, so the cluster will use an appropriate default.
Channel: stable-<version> (available channels: candidate-<version>, eus-<version>, fast-<version>, stable-<version>)
No updates available. You may force an update to a specific release image, but doing so might not be supported and might result in downtime or data loss.
```

- If you are updating your cluster to the next minor version, such as from version X.y to X.(y+1), it is recommended to confirm that your nodes are updated before deploying workloads that rely on a new feature:

```
$ oc get nodes
```

Example output

```
NAME                           STATUS   ROLES    AGE   VERSION
ip-10-0-168-251.ec2.internal   Ready    master   82m   v1.34.2
ip-10-0-170-223.ec2.internal   Ready    master   82m   v1.34.2
ip-10-0-179-95.ec2.internal    Ready    worker   70m   v1.34.2
ip-10-0-182-134.ec2.internal   Ready    worker   70m   v1.34.2
ip-10-0-211-16.ec2.internal    Ready    master   82m   v1.34.2
ip-10-0-250-100.ec2.internal   Ready    worker   69m   v1.34.2
```
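Rather than scanning the VERSION column by eye, you can filter the output for stragglers. The sketch below runs the filter against a captured sample instead of a live cluster; the target version string and the sample node lines are assumptions for illustration:

```shell
# Print the names of nodes whose kubelet version (5th column of `oc get nodes`)
# does not match the target. On a live cluster, pipe
# `oc get nodes --no-headers` into the awk filter instead of the sample text.
target="v1.34.2"
sample='ip-10-0-168-251.ec2.internal Ready master 82m v1.34.2
ip-10-0-170-223.ec2.internal Ready master 82m v1.33.5'
printf '%s\n' "$sample" | awk -v t="$target" '$5 != t {print $1}'
```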
Cluster update status using oc adm upgrade status
When updating your cluster, the oc adm upgrade command returns limited information about the status of your update. The cluster administrator can use the oc adm upgrade status command to decouple status information from the oc adm upgrade command and return specific information about a cluster update, including the status of the control plane and worker node updates. Worker nodes are also known as compute nodes.
The oc adm upgrade status command is read-only and does not alter any state in your cluster.
The oc adm upgrade status command can be used for clusters from version 4.12 up to the latest supported release.
The oc adm upgrade status command outputs three sections: control plane update, worker nodes update, and health insights.
- Control Plane Update: Displays details about the updating cluster control plane, including a high-level assessment, completion status, duration estimate, and cluster Operator health. The section also shows a table with control plane node update information.
The control plane update section can also show an additional table that lists the cluster Operators being updated if the `--details=operators` or `--details=all` flags are used. Note that due to the asynchronous, distributed nature of OpenShift Container Platform, an Operator might appear in this section more than once during the update, or not at all. The section is shown only when a cluster Operator is observed to be updating. It is normal during an update to observe no updating cluster Operator at certain periods; not every performed action can be assigned to an observable updating cluster Operator.
- Worker Nodes Update: Displays the worker node update information. The worker nodes section starts with a table that displays a summary of information about each worker pool configured in the cluster. Each non-empty worker pool output shows a dedicated table listing update information about nodes that belong to that pool. If a cluster does not have any worker nodes, the output does not contain the worker node section. You can make the node tables show all lines by using `--details=nodes` or `--details=all`.
- Health Insights: Displays insights about states and events present in the cluster that might be relevant to the ongoing update. You can use `--details=health` to expand the items in this section into a more verbose form with more content, such as documentation links, longer descriptions, or the cluster resources involved in the insight.
Note
The oc adm upgrade status command is currently not supported on hosted control planes clusters.
The following is an example of the output for an update that is progressing successfully:

```
= Control Plane =
Assessment:      Progressing
Target Version:  4.17.1 (from 4.17.0)
Updating:        machine-config
Completion:      97% (32 operators updated, 1 updating, 0 waiting)
Duration:        54m (Est. Time Remaining: <10m)
Operator Status: 32 Healthy, 1 Unavailable

Control Plane Nodes
NAME                                        ASSESSMENT    PHASE      VERSION   EST    MESSAGE
ip-10-0-53-40.us-east-2.compute.internal    Progressing   Draining   4.17.0    +10m
ip-10-0-30-217.us-east-2.compute.internal   Outdated      Pending    4.17.0    ?
ip-10-0-92-180.us-east-2.compute.internal   Outdated      Pending    4.17.0    ?

= Worker Upgrade =
WORKER POOL   ASSESSMENT    COMPLETION   STATUS
worker        Progressing   0% (0/2)     1 Available, 1 Progressing, 1 Draining
infra         Progressing   50% (1/2)    1 Available, 1 Progressing, 1 Draining

Worker Pool Nodes: worker
NAME                                       ASSESSMENT    PHASE      VERSION   EST    MESSAGE
ip-10-0-4-159.us-east-2.compute.internal   Progressing   Draining   4.17.0    +10m
ip-10-0-99-40.us-east-2.compute.internal   Outdated      Pending    4.17.0    ?

Worker Pool Nodes: infra
NAME                                             ASSESSMENT    PHASE      VERSION   EST    MESSAGE
ip-10-0-4-159-infra.us-east-2.compute.internal   Progressing   Draining   4.17.0    +10m
ip-10-0-20-162.us-east-2.compute.internal        Completed     Updated    4.17.1    -

= Update Health =
SINCE   LEVEL   IMPACT   MESSAGE
54m4s   Info    None     Update is proceeding well
```
Changing the update server by using the CLI
Changing the update server is optional. If you have an OpenShift Update Service (OSUS) installed and configured locally, you must set the URL for the server as the upstream to use the local server during updates. The default value for upstream is https://api.openshift.com/api/upgrades_info/v1/graph.
- Change the `upstream` parameter value in the cluster version:

```
$ oc patch clusterversion/version --patch '{"spec":{"upstream":"<update-server-url>"}}' --type=merge
```

The `<update-server-url>` variable specifies the URL for the update server.

Example output

```
clusterversion.config.openshift.io/version patched
```
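The merge patch must be valid JSON, and the nested quoting is easy to get wrong. One defensive sketch builds the patch in a variable and prints it for review before applying it; the OSUS URL shown is a made-up example, not a real endpoint:

```shell
# Build the spec.upstream merge patch from a variable so the quoting stays correct.
update_server_url="https://osus.example.com/api/upgrades_info/graph"  # hypothetical local OSUS URL
patch=$(printf '{"spec":{"upstream":"%s"}}' "$update_server_url")
printf '%s\n' "$patch"
# To apply on a live cluster:
# oc patch clusterversion/version --patch "$patch" --type=merge
```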