Configuring MCO-related custom resources
Besides managing MachineConfig objects, the MCO manages two custom resources (CRs): KubeletConfig and ContainerRuntimeConfig. Those CRs let you change node-level settings impacting how the kubelet and CRI-O container runtime services behave.
Creating a KubeletConfig CR to edit kubelet parameters
The kubelet configuration is currently serialized as an Ignition configuration, so it can be directly edited. However, there is also a new kubelet-config-controller added to the Machine Config Controller (MCC). This lets you use a KubeletConfig custom resource (CR) to edit the kubelet parameters.
Note
As the fields in the kubeletConfig object are passed directly to the kubelet from upstream Kubernetes, the kubelet validates those values directly. Invalid values in the kubeletConfig object might cause cluster nodes to become unavailable. For valid values, see the Kubernetes documentation.
Consider the following guidance:
- Edit an existing KubeletConfig CR to modify existing settings or add new settings, instead of creating a CR for each change. It is recommended that you create a CR only to modify a different machine config pool, or for changes that are intended to be temporary, so that you can revert the changes.
- Create one KubeletConfig CR for each machine config pool with all the config changes you want for that pool.
- As needed, create multiple KubeletConfig CRs with a limit of 10 per cluster. For the first KubeletConfig CR, the Machine Config Operator (MCO) creates a machine config appended with kubelet. With each subsequent CR, the controller creates another kubelet machine config with a numeric suffix. For example, if you have a kubelet machine config with a -2 suffix, the next kubelet machine config is appended with -3.
Note
If you are applying a kubelet or container runtime config to a custom machine config pool, the custom role in the machineConfigSelector must match the name of the custom machine config pool.
For example, because the following custom machine config pool is named infra, the custom role must also be infra:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: infra
spec:
  machineConfigSelector:
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,infra]}
# ...
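The naming rule in the note above can be checked mechanically. The following is a minimal sketch, assuming the pool has been loaded into a Python dictionary (for example from `oc get machineconfigpool infra -o json`); the function name `custom_role_matches_pool` is hypothetical and not part of any OpenShift tooling.

```python
ROLE_KEY = "machineconfiguration.openshift.io/role"


def custom_role_matches_pool(pool: dict) -> bool:
    """Return True if the pool name appears as a role value in machineConfigSelector.

    Only `In` match expressions on the role key are inspected; this is a
    simplification for illustration, not the MCO's own selector logic.
    """
    name = pool["metadata"]["name"]
    selector = pool["spec"]["machineConfigSelector"]
    for expr in selector.get("matchExpressions", []):
        if (
            expr.get("key") == ROLE_KEY
            and expr.get("operator") == "In"
            and name in expr.get("values", [])
        ):
            return True
    return False


# Dictionary form of the infra pool shown above.
infra_pool = {
    "metadata": {"name": "infra"},
    "spec": {
        "machineConfigSelector": {
            "matchExpressions": [
                {"key": ROLE_KEY, "operator": "In", "values": ["worker", "infra"]}
            ]
        }
    },
}

print(custom_role_matches_pool(infra_pool))
```

A pool whose name is missing from the role values would return False, which signals the mismatch that the note warns about.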
If you want to delete the machine configs, delete them in reverse order to avoid exceeding the limit. For example, you delete the kubelet-3 machine config before deleting the kubelet-2 machine config.
Note
If you have a machine config with a kubelet-9 suffix, and you create another KubeletConfig CR, a new machine config is not created, even if there are fewer than 10 kubelet machine configs.
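The suffix behavior described above can be modeled in a few lines. This is a sketch of the documented rules only, not the controller's implementation. It assumes the `99-<pool>-generated-<kind>` name pattern seen in the example output in this section, and it treats the unsuffixed name as suffix 0, which is an assumption. The function name `next_generated_name` is hypothetical.

```python
import re
from typing import Iterable, Optional


def next_generated_name(
    existing: Iterable[str], pool: str = "worker", kind: str = "kubelet"
) -> Optional[str]:
    """Predict the name of the next generated machine config, or None if blocked.

    Models the documented rules: the next suffix is one more than the highest
    existing suffix, and nothing is created once a -9 suffix exists, even if
    fewer than 10 machine configs are present.
    """
    base = f"99-{pool}-generated-{kind}"
    pattern = re.compile(rf"^{re.escape(base)}(?:-(\d+))?$")
    suffixes = []
    for name in existing:
        match = pattern.match(name)
        if match:
            # Assumption: the unsuffixed machine config counts as suffix 0.
            suffixes.append(int(match.group(1)) if match.group(1) else 0)
    if not suffixes:
        return base
    highest = max(suffixes)
    if highest >= 9:
        return None
    return f"{base}-{highest + 1}"


print(next_generated_name(["99-worker-generated-kubelet-2"]))
print(next_generated_name(["99-worker-generated-kubelet-9"]))
```

The second call returns None even though only one machine config exists, which mirrors the note about the kubelet-9 suffix.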
KubeletConfig CR
$ oc get kubeletconfig
NAME AGE
set-kubelet-config 15m
KubeletConfig machine config
$ oc get mc | grep kubelet
...
99-worker-generated-kubelet-1 b5c5119de007945b6fe6fb215db3b8e2ceb12511 3.5.0 26m
...
The following procedure is an example that shows how to configure the maximum number of pods per node, the maximum number of PIDs per pod, and the maximum container log size on the worker nodes.
- Obtain the label associated with the static MachineConfigPool CR for the type of node you want to configure. Perform one of the following steps:
  - View the machine config pool:
    $ oc describe machineconfigpool <name>
    For example:
    $ oc describe machineconfigpool worker
    Example output
    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfigPool
    metadata:
      creationTimestamp: 2019-02-08T14:52:39Z
      generation: 1
      labels:
        custom-kubelet: set-kubelet-config
    If a label has been added, it appears under labels.
  - If the label is not present, add a key/value pair:
    $ oc label machineconfigpool worker custom-kubelet=set-kubelet-config
- View the available machine configuration objects that you can select:
  $ oc get machineconfig
  By default, the two kubelet-related configs are 01-master-kubelet and 01-worker-kubelet.
- Check the current value for the maximum pods per node:
  $ oc describe node <node_name>
  For example:
  $ oc describe node ci-ln-5grqprb-f76d1-ncnqq-worker-a-mdv94
  Look for value: pods: <value> in the Allocatable stanza:
  Example output
  Allocatable:
    attachable-volumes-aws-ebs: 25
    cpu: 3500m
    hugepages-1Gi: 0
    hugepages-2Mi: 0
    memory: 15341844Ki
    pods: 250
- Configure the worker nodes as needed:
  - Create a YAML file similar to the following that contains the kubelet configuration:
    Important
    Kubelet configurations that target a specific machine config pool also affect any dependent pools. For example, creating a kubelet configuration for the pool containing worker nodes will also apply to any subset pools, including the pool containing infrastructure nodes. To avoid this, you must create a new machine config pool with a selection expression that only includes worker nodes, and have your kubelet configuration target this new pool.
    apiVersion: machineconfiguration.openshift.io/v1
    kind: KubeletConfig
    metadata:
      name: set-kubelet-config
    spec:
      machineConfigPoolSelector:
        matchLabels:
          custom-kubelet: set-kubelet-config
      kubeletConfig:
        podPidsLimit: 8192
        containerLogMaxSize: 50Mi
        maxPods: 500
    - In matchLabels, enter the label from the machine config pool.
    - In kubeletConfig, add the kubelet configuration. For example:
      - Use podPidsLimit to set the maximum number of PIDs in any pod.
      - Use containerLogMaxSize to set the maximum size of the container log file before it is rotated.
      - Use maxPods to set the maximum pods per node.
    Note
    The rate at which the kubelet talks to the API server depends on queries per second (QPS) and burst values. The default values, 50 for kubeAPIQPS and 100 for kubeAPIBurst, are sufficient if there are limited pods running on each node. It is recommended to update the kubelet QPS and burst rates if there are enough CPU and memory resources on the node.
    apiVersion: machineconfiguration.openshift.io/v1
    kind: KubeletConfig
    metadata:
      name: set-kubelet-config
    spec:
      machineConfigPoolSelector:
        matchLabels:
          custom-kubelet: set-kubelet-config
      kubeletConfig:
        maxPods: <pod_count>
        kubeAPIBurst: <burst_rate>
        kubeAPIQPS: <QPS>
  - Update the machine config pool for workers with the label:
    $ oc label machineconfigpool worker custom-kubelet=set-kubelet-config
  - Create the KubeletConfig object:
    $ oc create -f change-maxPods-cr.yaml
- Verify that the KubeletConfig object is created:
  $ oc get kubeletconfig
  Example output
  NAME                 AGE
  set-kubelet-config   15m
  Depending on the number of worker nodes in the cluster, wait for the worker nodes to be rebooted one by one. For a cluster with 3 worker nodes, this could take about 10 to 15 minutes.
- Verify that the changes are applied to the node:
  - Check on a worker node that the maxPods value changed:
    $ oc describe node <node_name>
  - Locate the Allocatable stanza:
    ...
    Allocatable:
      attachable-volumes-gce-pd: 127
      cpu: 3500m
      ephemeral-storage: 123201474766
      hugepages-1Gi: 0
      hugepages-2Mi: 0
      memory: 14225400Ki
      pods: 500
    ...
    In this example, the pods parameter should report the value you set in the KubeletConfig object.
- Verify the change in the KubeletConfig object:
  $ oc get kubeletconfigs set-kubelet-config -o yaml
  This should show a status of True and type:Success, as shown in the following example:
  spec:
    kubeletConfig:
      containerLogMaxSize: 50Mi
      maxPods: 500
      podPidsLimit: 8192
    machineConfigPoolSelector:
      matchLabels:
        custom-kubelet: set-kubelet-config
  status:
    conditions:
    - lastTransitionTime: "2021-06-30T17:04:07Z"
      message: Success
      status: "True"
      type: Success
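The final verification step can also be scripted. The sketch below assumes the object was retrieved as JSON instead of YAML (for example with `oc get kubeletconfigs set-kubelet-config -o json`) and reports whether any status condition has type Success and status "True". The function name `kubelet_config_succeeded` is hypothetical, and the sample JSON is a trimmed version of the example output in this procedure.

```python
import json


def kubelet_config_succeeded(resource_json: str) -> bool:
    """Return True if any status condition is type Success with status "True"."""
    resource = json.loads(resource_json)
    conditions = resource.get("status", {}).get("conditions", [])
    return any(
        cond.get("type") == "Success" and cond.get("status") == "True"
        for cond in conditions
    )


# Trimmed JSON equivalent of the example output in this procedure.
sample = """
{
  "spec": {"kubeletConfig": {"maxPods": 500}},
  "status": {
    "conditions": [
      {
        "lastTransitionTime": "2021-06-30T17:04:07Z",
        "message": "Success",
        "status": "True",
        "type": "Success"
      }
    ]
  }
}
"""

print(kubelet_config_succeeded(sample))
```

An object with no status block yet, which can happen right after creation, yields False instead of raising an error.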
Creating a ContainerRuntimeConfig CR to edit CRI-O parameters
You can change some of the settings associated with the OpenShift Container Platform CRI-O runtime for the nodes associated with a specific machine config pool (MCP). Using a ContainerRuntimeConfig custom resource (CR), you set the configuration values and add a label to match the MCP. The MCO then rebuilds the crio.conf and storage.conf configuration files on the associated nodes with the updated values.
Note
To revert the changes implemented by using a ContainerRuntimeConfig CR, you must delete the CR. Removing the label from the machine config pool does not revert the changes.
You can modify the following settings by using a ContainerRuntimeConfig CR:
- Log level: The logLevel parameter sets the CRI-O log_level parameter, which is the level of verbosity for log messages. The default is info (log_level = info). Other options include fatal, panic, error, warn, debug, and trace.
- Overlay size: The overlaySize parameter sets the CRI-O Overlay storage driver size parameter, which is the maximum size of a container image.
- Container runtime: The defaultRuntime parameter sets the container runtime to either crun or runc. The default is crun.
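Before creating a CR, the values can be checked against the options listed above. This is a minimal sketch: the allowed sets are copied from the list in this section and might not be exhaustive for every release, overlaySize is not validated, and the function name `check_container_runtime_config` is hypothetical.

```python
from typing import Dict, List

# Values taken from the settings list above; treat them as a snapshot.
LOG_LEVELS = {"fatal", "panic", "error", "warn", "info", "debug", "trace"}
RUNTIMES = {"crun", "runc"}


def check_container_runtime_config(settings: Dict[str, str]) -> List[str]:
    """Return a list of problems found in a containerRuntimeConfig mapping."""
    problems = []
    level = settings.get("logLevel")
    if level is not None and level not in LOG_LEVELS:
        problems.append(f"logLevel {level!r} is not one of {sorted(LOG_LEVELS)}")
    runtime = settings.get("defaultRuntime")
    if runtime is not None and runtime not in RUNTIMES:
        problems.append(f"defaultRuntime {runtime!r} is not one of {sorted(RUNTIMES)}")
    return problems


print(check_container_runtime_config({"logLevel": "debug", "defaultRuntime": "runc"}))
print(check_container_runtime_config({"logLevel": "verbose"}))
```

Omitted keys are accepted because every setting is optional and falls back to its default.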
You should have one ContainerRuntimeConfig CR for each machine config pool with all the config changes you want for that pool. If you are applying the same content to all the pools, you only need one ContainerRuntimeConfig CR for all the pools.
You should edit an existing ContainerRuntimeConfig CR to modify existing settings or add new settings instead of creating a new CR for each change. It is recommended to create a new ContainerRuntimeConfig CR only to modify a different machine config pool, or for changes that are intended to be temporary so that you can revert the changes.
You can create multiple ContainerRuntimeConfig CRs, as needed, with a limit of 10 per cluster. For the first ContainerRuntimeConfig CR, the MCO creates a machine config appended with containerruntime. With each subsequent CR, the controller creates a new containerruntime machine config with a numeric suffix. For example, if you have a containerruntime machine config with a -2 suffix, the next containerruntime machine config is appended with -3.
If you want to delete the machine configs, you should delete them in reverse order to avoid exceeding the limit. For example, you should delete the containerruntime-3 machine config before deleting the containerruntime-2 machine config.
Note
If you have a machine config with a containerruntime-9 suffix, and you create another ContainerRuntimeConfig CR, a new machine config is not created, even if there are fewer than 10 containerruntime machine configs.
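The reverse-order deletion described above amounts to sorting the generated machine configs by numeric suffix, highest first. The sketch below only computes the order and does not delete anything. Treating the unsuffixed name as suffix 0 is an assumption, and the function name `deletion_order` is hypothetical.

```python
import re
from typing import Iterable, List


def deletion_order(names: Iterable[str]) -> List[str]:
    """Sort generated machine config names so the highest suffix comes first."""

    def suffix(name: str) -> int:
        match = re.search(r"-(\d+)$", name)
        # Assumption: a name without a numeric suffix sorts last (suffix 0).
        return int(match.group(1)) if match else 0

    return sorted(names, key=suffix, reverse=True)


configs = [
    "99-worker-generated-containerruntime",
    "99-worker-generated-containerruntime-2",
    "99-worker-generated-containerruntime-1",
    "99-worker-generated-containerruntime-3",
]
for name in deletion_order(configs):
    print(name)
```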
ContainerRuntimeConfig CRs
$ oc get ctrcfg
NAME AGE
ctr-overlay 15m
ctr-level 5m45s
containerruntime machine configs
$ oc get mc | grep container
...
01-master-container-runtime b5c5119de007945b6fe6fb215db3b8e2ceb12511 3.5.0 57m
...
01-worker-container-runtime b5c5119de007945b6fe6fb215db3b8e2ceb12511 3.5.0 57m
...
99-worker-generated-containerruntime b5c5119de007945b6fe6fb215db3b8e2ceb12511 3.5.0 26m
99-worker-generated-containerruntime-1 b5c5119de007945b6fe6fb215db3b8e2ceb12511 3.5.0 17m
99-worker-generated-containerruntime-2 b5c5119de007945b6fe6fb215db3b8e2ceb12511 3.5.0 7m26s
...
The following example sets the log_level field to debug, sets the overlay size to 8 GB, and configures runC as the container runtime:
ContainerRuntimeConfig CR
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: overlay-size
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: ''
  containerRuntimeConfig:
    logLevel: debug
    overlaySize: 8G
    defaultRuntime: "runc"
- matchLabels specifies the machine config pool label. For a container runtime config, the role must match the name of the associated machine config pool.
- Optional: logLevel specifies the level of verbosity for log messages.
- Optional: overlaySize specifies the maximum size of a container image.
- Optional: defaultRuntime specifies the container runtime to deploy to new containers, either crun or runc. The default value is crun.
To change CRI-O settings using the ContainerRuntimeConfig CR:
- Create a YAML file for the ContainerRuntimeConfig CR:
  apiVersion: machineconfiguration.openshift.io/v1
  kind: ContainerRuntimeConfig
  metadata:
    name: overlay-size
  spec:
    machineConfigPoolSelector:
      matchLabels:
        pools.operator.machineconfiguration.openshift.io/worker: ''
    containerRuntimeConfig:
      logLevel: debug
      overlaySize: 8G
      defaultRuntime: "runc"
  - In matchLabels, specify a label for the machine config pool that you want to modify.
  - In containerRuntimeConfig, set the parameters as needed.
- Create the ContainerRuntimeConfig CR:
  $ oc create -f <file_name>.yaml
- Verify that the CR is created:
  $ oc get ContainerRuntimeConfig
  Example output
  NAME           AGE
  overlay-size   3m19s
- Check that a new containerruntime machine config is created:
  $ oc get machineconfigs | grep containerrun
  Example output
  99-worker-generated-containerruntime   2c9371fbb673b97a6fe8b1c52691999ed3a1bfc2   3.5.0   31s
- Monitor the machine config pool until all machines are shown as ready:
  $ oc get mcp worker
  Example output
  NAME     CONFIG                UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
  worker   rendered-worker-169   False     True       False      3              1                   1                     0                      9h
- Verify that the settings were applied in CRI-O:
  - Open an oc debug session to a node in the machine config pool and run chroot /host.
    $ oc debug node/<node_name>
    sh-4.4# chroot /host
  - Verify the changes in the crio.conf file:
    sh-4.4# crio config | grep 'log_level'
    Example output
    log_level = "debug"
  - Verify the changes in the storage.conf file:
    sh-4.4# head -n 7 /etc/containers/storage.conf
    Example output
    [storage]
      driver = "overlay"
      runroot = "/var/run/containers/storage"
      graphroot = "/var/lib/containers/storage"
      [storage.options]
        additionalimagestores = []
        size = "8G"
  - Verify the changes in the crio/crio.conf.d/01-ctrcfg-defaultRuntime file:
    sh-5.1# cat /etc/crio/crio.conf.d/01-ctrcfg-defaultRuntime
    Example output
    [crio]
      [crio.runtime]
        default_runtime = "runc"
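The monitoring step in this procedure can be automated by parsing the `oc get mcp` output. The sketch below treats a pool as finished when UPDATED is True, UPDATING is False, and UPDATEDMACHINECOUNT equals MACHINECOUNT. It splits columns on whitespace, which assumes that no column is empty, and the function name `rollout_complete` is hypothetical.

```python
def rollout_complete(oc_get_mcp_output: str, pool: str) -> bool:
    """Return True when the named pool has finished applying its new config."""
    lines = [line for line in oc_get_mcp_output.strip().splitlines() if line.strip()]
    header = lines[0].split()
    for line in lines[1:]:
        # Assumption: every column has a value, so whitespace splitting is safe.
        row = dict(zip(header, line.split()))
        if row.get("NAME") == pool:
            return (
                row["UPDATED"] == "True"
                and row["UPDATING"] == "False"
                and row["UPDATEDMACHINECOUNT"] == row["MACHINECOUNT"]
            )
    raise ValueError(f"pool {pool!r} not found in output")


in_progress = """
NAME     CONFIG                UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-169   False     True       False      3              1                   1                     0                      9h
"""

print(rollout_complete(in_progress, "worker"))
```

In practice the output would come from running `oc get mcp worker` repeatedly until the function returns True.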
Setting the default maximum container root partition size for Overlay with CRI-O
The root partition of each container shows all of the available disk space of the underlying host. Follow this guidance to set a maximum partition size for the root disk of all containers.
To configure the maximum Overlay size, as well as other CRI-O options like the log level, you can create the following ContainerRuntimeConfig custom resource (CR):
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: overlay-size
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-crio: overlay-size
  containerRuntimeConfig:
    logLevel: debug
    overlaySize: 8G
- Create the configuration object:
  $ oc apply -f overlaysize.yml
- To apply the new CRI-O configuration to your worker nodes, edit the worker machine config pool:
  $ oc edit machineconfigpool worker
- Add the custom-crio label based on the matchLabels name you set in the ContainerRuntimeConfig CR:
  apiVersion: machineconfiguration.openshift.io/v1
  kind: MachineConfigPool
  metadata:
    creationTimestamp: "2020-07-09T15:46:34Z"
    generation: 3
    labels:
      custom-crio: overlay-size
      machineconfiguration.openshift.io/mco-built-in: ""
- Save the changes, then view the machine configs:
  $ oc get machineconfigs
  New 99-worker-generated-containerruntime and rendered-worker-xyz objects are created:
  Example output
  99-worker-generated-containerruntime   4173030d89fbf4a7a0976d1665491a4d9a6e54f1   3.5.0   7m42s
  rendered-worker-xyz                    4173030d89fbf4a7a0976d1665491a4d9a6e54f1   3.5.0   7m36s
- After those objects are created, monitor the machine config pool for the changes to be applied:
  $ oc get mcp worker
  The worker nodes show UPDATING as True, as well as the number of machines, the number updated, and other details:
  Example output
  NAME     CONFIG                UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
  worker   rendered-worker-xyz   False     True       False      3              2                   2                     0                      20h
  When complete, the worker nodes transition back to UPDATING as False, and the UPDATEDMACHINECOUNT number matches the MACHINECOUNT:
  Example output
  NAME     CONFIG                UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
  worker   rendered-worker-xyz   True      False      False      3              3                   3                     0                      20h
  Looking at a worker machine, you see that the new 8 GB max size configuration is applied to all of the workers:
  Example output
  head -n 7 /etc/containers/storage.conf
  [storage]
    driver = "overlay"
    runroot = "/var/run/containers/storage"
    graphroot = "/var/lib/containers/storage"
    [storage.options]
      additionalimagestores = []
      size = "8G"
  Looking inside a container, you see that the root partition is now 8 GB:
  Example output
  ~ $ df -h
  Filesystem                Size      Used Available Use% Mounted on
  overlay                   8.0G      8.0K      8.0G   0% /
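The label added to the pool in this procedure is what connects the ContainerRuntimeConfig object to the machine config pool. The sketch below illustrates standard Kubernetes matchLabels semantics, where every key/value pair in the selector must be present in the labels of the object. It is an illustration, not MCO code, and the function name `selector_matches` is hypothetical.

```python
from typing import Dict


def selector_matches(match_labels: Dict[str, str], pool_labels: Dict[str, str]) -> bool:
    """Return True if every matchLabels pair is present in the pool labels."""
    return all(pool_labels.get(key) == value for key, value in match_labels.items())


# matchLabels from the ContainerRuntimeConfig object in this section.
match_labels = {"custom-crio": "overlay-size"}

# Labels on the worker pool before and after the edit in this procedure.
before = {"machineconfiguration.openshift.io/mco-built-in": ""}
after = {
    "custom-crio": "overlay-size",
    "machineconfiguration.openshift.io/mco-built-in": "",
}

print(selector_matches(match_labels, before))
print(selector_matches(match_labels, after))
```

Extra labels on the pool do not prevent a match; only the pairs named in the selector are compared.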
Creating a drop-in file for the default CRI-O capabilities
You can change some of the settings associated with the OpenShift Container Platform CRI-O runtime for the nodes associated with a specific machine config pool (MCP). By using a controller custom resource (CR), you set the configuration values and add a label to match the MCP. The Machine Config Operator (MCO) then rebuilds the crio.conf and default.conf configuration files on the associated nodes with the updated values.
Earlier versions of OpenShift Container Platform included specific machine configs by default. If you updated to a later version of OpenShift Container Platform, those machine configs were retained to ensure that clusters running on the same OpenShift Container Platform version have the same machine configs.
You can create multiple ContainerRuntimeConfig CRs, as needed, with a limit of 10 per cluster. For the first ContainerRuntimeConfig CR, the MCO creates a machine config appended with containerruntime. With each subsequent CR, the controller creates a containerruntime machine config with a numeric suffix. For example, if you have a containerruntime machine config with a -2 suffix, the next containerruntime machine config is appended with -3.
If you want to delete the machine configs, delete them in reverse order to avoid exceeding the limit. For example, delete the containerruntime-3 machine config before you delete the containerruntime-2 machine config.
Note
If you have a machine config with a containerruntime-9 suffix and you create another ContainerRuntimeConfig CR, a new machine config is not created, even if there are fewer than 10 containerruntime machine configs.
$ oc get ctrcfg
NAME AGE
ctr-overlay 15m
ctr-level 5m45s
$ cat /proc/1/status | grep Cap
$ capsh --decode=<decode_CapBnd_value>
- Replace <decode_CapBnd_value> with the specific value you want to decode.
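To make clear what the `capsh --decode` step reports, the following sketch decodes a CapBnd hexadecimal mask by hand. Bit positions follow the capability numbering in the Linux `linux/capability.h` header. The mask `00000000a80425fb` is an illustrative value, not one taken from this document, and the function names are hypothetical. The output format of `capsh` itself differs from the plain list returned here.

```python
from typing import List

# Capability names indexed by bit position (linux/capability.h numbering).
CAPABILITIES = [
    "cap_chown", "cap_dac_override", "cap_dac_read_search", "cap_fowner",
    "cap_fsetid", "cap_kill", "cap_setgid", "cap_setuid", "cap_setpcap",
    "cap_linux_immutable", "cap_net_bind_service", "cap_net_broadcast",
    "cap_net_admin", "cap_net_raw", "cap_ipc_lock", "cap_ipc_owner",
    "cap_sys_module", "cap_sys_rawio", "cap_sys_chroot", "cap_sys_ptrace",
    "cap_sys_pacct", "cap_sys_admin", "cap_sys_boot", "cap_sys_nice",
    "cap_sys_resource", "cap_sys_time", "cap_sys_tty_config", "cap_mknod",
    "cap_lease", "cap_audit_write", "cap_audit_control", "cap_setfcap",
    "cap_mac_override", "cap_mac_admin", "cap_syslog", "cap_wake_alarm",
    "cap_block_suspend", "cap_audit_read", "cap_perfmon", "cap_bpf",
    "cap_checkpoint_restore",
]


def parse_cap_bnd(status_text: str) -> str:
    """Extract the CapBnd hex value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapBnd:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no CapBnd line found")


def decode_capabilities(hex_mask: str) -> List[str]:
    """Return the capability names whose bits are set in the mask."""
    mask = int(hex_mask, 16)
    names = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            # Bits beyond the known table are reported by number.
            names.append(CAPABILITIES[bit] if bit < len(CAPABILITIES) else str(bit))
        bit += 1
    return names


# Illustrative status excerpt; the mask value is an assumption.
status = "CapInh:\t0000000000000000\nCapBnd:\t00000000a80425fb\n"
print(decode_capabilities(parse_cap_bnd(status)))
```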