Evicting pods using the descheduler
You can run the descheduler in OpenShift Container Platform by installing the Kube Descheduler Operator and setting the required profiles and other customizations.
Installing the descheduler
The descheduler is not available by default. To enable the descheduler, you must install the Kube Descheduler Operator from the software catalog and enable one or more descheduler profiles.
By default, the descheduler runs in predictive mode, which means that it only simulates pod evictions. You must change the mode to automatic for the descheduler to perform the pod evictions.
Important
If you have enabled hosted control planes in your cluster, set a custom priority threshold to lower the chance that pods in the hosted control plane namespaces are evicted. Set the priority threshold class name to hypershift-control-plane, because it has the lowest priority value (100000000) of the hosted control plane priority classes.
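As an illustration of the note above, the following is a minimal sketch of a KubeDescheduler object that sets the priority threshold class name to hypershift-control-plane. The object name, namespace, and field names are taken from the KubeDescheduler example later in this document. The single AffinityAndTaints profile is only a placeholder, so adjust the profile list to your needs.

```yaml
# Sketch only: set the priority threshold for a cluster with hosted control planes.
apiVersion: operator.openshift.io/v1
kind: KubeDescheduler
metadata:
  name: cluster
  namespace: openshift-kube-descheduler-operator
spec:
  profileCustomizations:
    # Pods are considered for eviction only if their priority is lower than
    # the priority of this class (100000000 for hypershift-control-plane).
    thresholdPriorityClassName: hypershift-control-plane
  profiles:
  - AffinityAndTaints   # placeholder profile; enabled by default
```

Because thresholdPriority and thresholdPriorityClassName must not be set together, this sketch sets only the class name.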
Prerequisites
- You are logged in to OpenShift Container Platform as a user with the cluster-admin role.
- You have access to the OpenShift Container Platform web console.
Procedure
- Log in to the OpenShift Container Platform web console.
- Create the required namespace for the Kube Descheduler Operator.
  - Navigate to Administration → Namespaces and click Create Namespace.
  - Enter openshift-kube-descheduler-operator in the Name field, enter openshift.io/cluster-monitoring=true in the Labels field to enable descheduler metrics, and click Create.
- Install the Kube Descheduler Operator.
  - Navigate to Ecosystem → Software Catalog.
  - Type Kube Descheduler Operator into the filter box.
  - Select the Kube Descheduler Operator and click Install.
  - On the Install Operator page, select A specific namespace on the cluster. Select openshift-kube-descheduler-operator from the drop-down menu.
  - Set the Update Channel and Approval Strategy to the desired values.
  - Click Install.
- Create a descheduler instance.
  - From the Ecosystem → Installed Operators page, click the Kube Descheduler Operator.
  - Select the Kube Descheduler tab and click Create KubeDescheduler.
  - Edit the settings as necessary.
    - To evict pods instead of simulating the evictions, change the Mode field to Automatic.
    - Expand the Profiles section to select one or more profiles to enable. The AffinityAndTaints profile is enabled by default. Click Add Profile to select additional profiles.
      Note
      Do not enable both TopologyAndDuplicates and SoftTopologyAndDuplicates. Enabling both results in a conflict.
    - Optional: Expand the Profile Customizations section to set optional configurations for the descheduler.
      - Set a custom pod lifetime value for the LifecycleAndUtilization profile. Use the podLifetime field to set a numerical value and a valid unit (s, m, or h). The default pod lifetime is 24 hours (24h).
      - Set a custom priority threshold to consider pods for eviction only if their priority is lower than a specified priority level. Use the thresholdPriority field to set a numerical priority threshold or use the thresholdPriorityClassName field to specify a priority class name.
        Note
        Do not specify both thresholdPriority and thresholdPriorityClassName for the descheduler.
      - Set specific namespaces to include in or exclude from descheduler operations. Expand the namespaces field and add namespaces to the excluded or included list. You can set either a list of namespaces to exclude or a list of namespaces to include, but not both. Note that protected namespaces (openshift-*, kube-system, hypershift) are excluded by default.
      - Experimental: Set thresholds for underutilization and overutilization for the LowNodeUtilization strategy. Use the devLowNodeUtilizationThresholds field to set one of the following values:
        - Low: 10% underutilized and 30% overutilized
        - Medium: 20% underutilized and 50% overutilized (default)
        - High: 40% underutilized and 70% overutilized
        Note
        This setting is experimental and should not be used in a production environment.
    - Optional: Use the Descheduling Interval Seconds field to change the number of seconds between descheduler runs. The default is 3600 seconds.
  - Click Create.
You can also configure the profiles and settings for the descheduler later using the OpenShift CLI (oc). If you did not adjust the profiles when creating the descheduler instance from the web console, the AffinityAndTaints profile is enabled by default.
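For readers who prefer manifests, the web console choices above can be approximated in YAML. The following is a minimal sketch, not an official installation method. It covers only the namespace, with its monitoring label, and a descheduler instance that evicts pods with the default profile. It assumes that the Operator itself is already installed, and it takes the KubeDescheduler field names from the example in the next section.

```yaml
# Sketch only: namespace with the label that enables descheduler metrics.
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-kube-descheduler-operator
  labels:
    openshift.io/cluster-monitoring: "true"   # label values must be strings
---
# Sketch only: descheduler instance that evicts pods instead of simulating evictions.
apiVersion: operator.openshift.io/v1
kind: KubeDescheduler
metadata:
  name: cluster
  namespace: openshift-kube-descheduler-operator
spec:
  mode: Automatic                    # the default mode only simulates evictions
  deschedulingIntervalSeconds: 3600  # default interval between runs
  profiles:
  - AffinityAndTaints                # profile enabled by default
```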
Configuring descheduler profiles
To manage cluster pod eviction behavior, select which descheduler profiles to enable.
Prerequisites
- You are logged in to OpenShift Container Platform as a user with the cluster-admin role.
Procedure
- Edit the KubeDescheduler object:

    $ oc edit kubedeschedulers.operator.openshift.io cluster -n openshift-kube-descheduler-operator

- Specify one or more profiles in the spec.profiles section.

    apiVersion: operator.openshift.io/v1
    kind: KubeDescheduler
    metadata:
      name: cluster
      namespace: openshift-kube-descheduler-operator
    spec:
      deschedulingIntervalSeconds: 3600
      logLevel: Normal
      managementState: Managed
      operatorLogLevel: Normal
      mode: Predictive
      profileCustomizations:
        namespaces:
          excluded:
          - my-namespace
        podLifetime: 48h
        thresholdPriorityClassName: my-priority-class-name
      evictionLimits:
        total: 20
      profiles:
      - AffinityAndTaints
      - TopologyAndDuplicates
      - LifecycleAndUtilization
      - EvictPodsWithLocalStorage
      - EvictPodsWithPVC

  where:
  spec.mode
    Specifies the eviction mode. By default, the descheduler does not evict pods. To evict pods, set mode to Automatic.
  spec.profileCustomizations.namespaces
    Specifies a list of user-created namespaces to include or exclude from descheduler operations. Use excluded to set a list of namespaces to exclude or use included to set a list of namespaces to include. Note that protected namespaces (openshift-*, kube-system, hypershift) are excluded by default. This value is optional.
  spec.profileCustomizations.podLifetime
    Specifies a custom pod lifetime value for the LifecycleAndUtilization profile. Valid units are s, m, or h. The default pod lifetime is 24 hours. This value is optional.
  spec.profileCustomizations.thresholdPriorityClassName
    Specifies a priority threshold to consider pods for eviction only if their priority is lower than the specified level. Use the thresholdPriority field to set a numerical priority threshold (for example, 10000) or use the thresholdPriorityClassName field to specify a priority class name (for example, my-priority-class-name). If you specify a priority class name, it must already exist or the descheduler will throw an error. Do not set both thresholdPriority and thresholdPriorityClassName. This value is optional.
  spec.evictionLimits.total
    Specifies the maximum number of pods to evict during each descheduler run. This value is optional.
  spec.profiles
    Specifies one or more profiles to enable. Available profiles: AffinityAndTaints, TopologyAndDuplicates, LifecycleAndUtilization, SoftTopologyAndDuplicates, EvictPodsWithLocalStorage, EvictPodsWithPVC, CompactAndScale, and LongLifecycle. You can enable multiple profiles, but ensure that you do not enable profiles that conflict with each other. The order of the list of profiles is not important.
- Save the file to apply the changes.
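The field descriptions above mention alternatives that the example does not show: a numerical thresholdPriority instead of a class name, and an included namespace list instead of an excluded one. The following sketch illustrates that variant under the same object name and namespace. The values 10000 and my-namespace are the placeholder values used in this document, not recommendations.

```yaml
# Sketch only: alternative profile customizations for the same KubeDescheduler object.
apiVersion: operator.openshift.io/v1
kind: KubeDescheduler
metadata:
  name: cluster
  namespace: openshift-kube-descheduler-operator
spec:
  mode: Automatic
  profileCustomizations:
    namespaces:
      included:            # set either included or excluded, not both
      - my-namespace
    thresholdPriority: 10000   # numerical threshold; thresholdPriorityClassName is left unset
  profiles:
  - AffinityAndTaints
  - SoftTopologyAndDuplicates  # do not combine with TopologyAndDuplicates
```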
Configuring the descheduler interval
You can configure the amount of time between descheduler runs. The default is 3600 seconds (one hour).
Prerequisites
- You are logged in to OpenShift Container Platform as a user with the cluster-admin role.
Procedure
- Edit the KubeDescheduler object:

    $ oc edit kubedeschedulers.operator.openshift.io cluster -n openshift-kube-descheduler-operator

- Update the deschedulingIntervalSeconds field to the required value:

    apiVersion: operator.openshift.io/v1
    kind: KubeDescheduler
    metadata:
      name: cluster
      namespace: openshift-kube-descheduler-operator
    spec:
      deschedulingIntervalSeconds: 3600
    ...

  Set the spec.deschedulingIntervalSeconds field to the number of seconds you want between descheduler runs. A value of 0 in this field runs the descheduler once and exits.
- Save the file to apply the changes.
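As a worked example of the interval setting, the following sketch runs the descheduler every 30 minutes, because 30 × 60 = 1800 seconds. The value 1800 is an arbitrary illustration, not a recommended setting. All other fields of the object are omitted here and are assumed to stay unchanged.

```yaml
# Sketch only: run the descheduler every 30 minutes instead of the default hour.
apiVersion: operator.openshift.io/v1
kind: KubeDescheduler
metadata:
  name: cluster
  namespace: openshift-kube-descheduler-operator
spec:
  deschedulingIntervalSeconds: 1800   # 30 minutes; 0 runs the descheduler once and exits
```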