Managing symmetric routing with MetalLB
As a cluster administrator, you can effectively manage traffic for pods behind a MetalLB load-balancer service with multiple host interfaces by implementing features from MetalLB, NMState, and OVN-Kubernetes. By combining these features in this context, you can provide symmetric routing, traffic segregation, and support clients on different networks with overlapping CIDR addresses.
To achieve this functionality, learn how to implement virtual routing and forwarding (VRF) instances with MetalLB, and configure egress services.
Important
Configuring symmetric traffic by using a VRF instance with MetalLB and an egress service is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
Challenges of managing symmetric routing with MetalLB
To resolve network isolation and asymmetric routing challenges on hosts with multiple interfaces, implement a configuration that combines MetalLB, NMState, and OVN-Kubernetes. This solution ensures symmetric routing and supports clients with overlapping CIDR addresses without requiring manual static route maintenance.
One option to ensure that return traffic reaches the correct client is to use static routes. However, with this solution, MetalLB cannot isolate the services and then announce each service through a different interface. Additionally, static routes require manual configuration, and you must update them whenever remote sites are added.
A further challenge arises when external systems expect the source and destination IP address for an application to be the same. By default, OpenShift Container Platform assigns the IP address of the host network interface as the source IP address for traffic originating from pods. This is problematic on hosts with multiple interfaces.
You can overcome these challenges by implementing a configuration that combines features from MetalLB, NMState, and OVN-Kubernetes.
Overview of managing symmetric routing by using VRFs with MetalLB
You can overcome the challenges of implementing symmetric routing by using NMState to configure a VRF instance on a host, associating the VRF instance with a MetalLB BGPPeer resource, and configuring an egress service for egress traffic with OVN-Kubernetes.
The configuration process involves three stages:
- 1: Define a VRF and routing rules
  - Configure a NodeNetworkConfigurationPolicy custom resource (CR) to associate a VRF instance with a network interface.
  - Use the VRF routing table to direct ingress and egress traffic.
- 2: Link the VRF to a MetalLB BGPPeer
  - Configure a MetalLB BGPPeer resource to use the VRF instance on a network interface.
  - By associating the BGPPeer resource with the VRF instance, the designated network interface becomes the primary interface for the BGP session, and MetalLB advertises the services through this interface.
- 3: Configure an egress service
  - Configure an egress service to choose the network associated with the VRF instance for egress traffic.
  - Optional: Configure an egress service to use the IP address of the MetalLB load-balancer service as the source IP for egress traffic.
Configuring symmetric routing by using VRFs with MetalLB
To ensure that applications behind a MetalLB service use the same network path for both ingress and egress, configure symmetric routing by using Virtual Routing and Forwarding (VRF).
The example in the procedure associates a VRF routing table with MetalLB and an egress service to enable symmetric routing for ingress and egress traffic for pods behind a LoadBalancer service.
Important
- If you use the sourceIPBy: "LoadBalancerIP" setting in the EgressService CR, you must specify the load-balancer node in the BGPAdvertisement custom resource (CR).
- You can use the sourceIPBy: "Network" setting only on clusters that use OVN-Kubernetes configured with the gatewayConfig.routingViaHost specification set to true. Additionally, if you use the sourceIPBy: "Network" setting, you must schedule the application workload on nodes configured with the network VRF instance.
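The two constraints in this note can be written down as a small pre-flight check. The following Python sketch is illustrative only: the function name and its parameters are hypothetical, and the inputs are values that you would read from your own EgressService CR, BGPAdvertisement CR, and OVN-Kubernetes gateway configuration. It does not query a cluster, and it does not check where the application workload is scheduled.

```python
def check_egress_service_constraints(source_ip_by, advertisement_node_selectors, routing_via_host):
    """Return a list of problems; an empty list means no conflict was found.

    source_ip_by: value of spec.sourceIPBy in the EgressService CR.
    advertisement_node_selectors: spec.nodeSelectors list from the BGPAdvertisement CR.
    routing_via_host: value of gatewayConfig.routingViaHost (bool).
    """
    problems = []
    if source_ip_by == "LoadBalancerIP" and not advertisement_node_selectors:
        problems.append(
            "sourceIPBy: LoadBalancerIP requires the BGPAdvertisement CR "
            "to specify the load-balancer node"
        )
    if source_ip_by == "Network" and not routing_via_host:
        problems.append(
            "sourceIPBy: Network requires gatewayConfig.routingViaHost set to true"
        )
    return problems


# Example input modeled on the CRs used later in this procedure.
selectors = [{"matchLabels": {"egress-service.k8s.ovn.org/test-server1": ""}}]
print(check_egress_service_constraints("LoadBalancerIP", selectors, False))
```

An empty list means that neither of the two documented conflicts applies to the values you passed in.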
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
- Install the Kubernetes NMState Operator.
- Install the MetalLB Operator.
Procedure
- Create a NodeNetworkConfigurationPolicy CR to define the VRF instance:
  - Create a file, such as node-network-vrf.yaml, with content like the following example:
      apiVersion: nmstate.io/v1
      kind: NodeNetworkConfigurationPolicy
      metadata:
        name: vrfpolicy
      spec:
        nodeSelector:
          vrf: "true"
        maxUnavailable: 3
        desiredState:
          interfaces:
          - name: ens4vrf
            type: vrf
            state: up
            vrf:
              port:
              - ens4
              route-table-id: 2
          - name: ens4
            type: ethernet
            state: up
            ipv4:
              address:
              - ip: 192.168.130.130
                prefix-length: 24
              dhcp: false
              enabled: true
          routes:
            config:
            - destination: 0.0.0.0/0
              metric: 150
              next-hop-address: 192.168.130.1
              next-hop-interface: ens4
              table-id: 2
          route-rules:
            config:
            - ip-to: 172.30.0.0/16
              priority: 998
              route-table: 254
            - ip-to: 10.132.0.0/14
              priority: 998
              route-table: 254
            - ip-to: 169.254.0.0/17
              priority: 998
              route-table: 254
      # ...
    where:
    metadata.name
        Specifies the name of the policy.
    nodeSelector.vrf
        Applies the policy to all nodes with the label vrf: "true".
    interfaces.name.ens4vrf
        Specifies the name of the interface.
    interfaces.type
        Specifies the type of interface. This example creates a VRF instance.
    vrf.port
        Specifies the node interface that the VRF attaches to.
    vrf.route-table-id
        Specifies the route table ID for the VRF.
    interfaces.name.ens4
        Specifies the interface associated with the VRF, including its IPv4 address.
    routes
        Specifies the configuration for network routes. The next-hop-address field defines the IP address of the next hop for the route. The next-hop-interface field defines the outgoing interface for the route. In this example, the VRF routing table ID is 2, which must match the ID that you define in the EgressService CR.
    route-rules
        Specifies additional route rules. The ip-to fields must match the Cluster Network CIDR, Service Network CIDR, and Internal Masquerade subnet CIDR. You can view the values for these CIDR address specifications by running the following command: oc describe network.operator/cluster.
    route-rules.route-table
        Specifies the routing table that the rule uses. The main routing table that the Linux kernel uses when calculating routes has the ID 254.
  - Apply the policy by running the following command:
      $ oc apply -f node-network-vrf.yaml
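Because the ip-to values in the route rules must match the cluster network, service network, and internal masquerade subnet CIDRs, it can help to compare the two lists before you apply the policy. The following Python sketch is a minimal illustration that uses only the standard ipaddress module. The function name is hypothetical, and the required CIDRs are typed in by hand from this example. On a real cluster you would copy them from the output of oc describe network.operator/cluster.

```python
import ipaddress


def missing_route_rules(required_cidrs, rule_cidrs):
    """Return the required CIDRs that have no exactly matching ip-to route rule."""
    rules = {ipaddress.ip_network(cidr) for cidr in rule_cidrs}
    return [cidr for cidr in required_cidrs if ipaddress.ip_network(cidr) not in rules]


# Assumed values for illustration, taken from the example policy above.
required = ["10.132.0.0/14", "172.30.0.0/16", "169.254.0.0/17"]
ip_to_rules = ["172.30.0.0/16", "10.132.0.0/14", "169.254.0.0/17"]

print(missing_route_rules(required, ip_to_rules))  # prints [] when every CIDR has a rule
```

The comparison is an exact match on parsed networks, so differences in list order do not matter, but a rule with a different prefix length is reported as missing.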
- Create a BGPPeer custom resource (CR):
  - Create a file, such as frr-via-vrf.yaml, with content like the following example:
      apiVersion: metallb.io/v1beta2
      kind: BGPPeer
      metadata:
        name: frrviavrf
        namespace: metallb-system
      spec:
        myASN: 100
        peerASN: 200
        peerAddress: 192.168.130.1
        vrf: ens4vrf
      # ...
    where:
    spec.vrf
        Specifies the VRF instance to associate with the BGP peer. MetalLB can advertise services and make routing decisions based on the routing information in the VRF.
  - Apply the configuration for the BGP peer by running the following command:
      $ oc apply -f frr-via-vrf.yaml
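In this example, the peerAddress value is the same address as the next hop of the VRF default route, so it sits on the subnet of the ens4 interface. The following Python sketch shows one way to sanity-check that relationship with the standard ipaddress module. The function name is hypothetical, and the check reflects this example topology only. It is not a general MetalLB requirement.

```python
import ipaddress


def peer_on_interface_subnet(interface_cidr, peer_address):
    """Return True if peer_address belongs to the subnet of the interface address."""
    interface = ipaddress.ip_interface(interface_cidr)
    return ipaddress.ip_address(peer_address) in interface.network


# ens4 address and prefix length from the policy, peerAddress from the BGPPeer CR.
print(peer_on_interface_subnet("192.168.130.130/24", "192.168.130.1"))  # True
```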
- Create an IPAddressPool CR:
  - Create a file, such as first-pool.yaml, with content like the following example:
      apiVersion: metallb.io/v1beta1
      kind: IPAddressPool
      metadata:
        name: first-pool
        namespace: metallb-system
      spec:
        addresses:
        - 192.169.10.0/32
      # ...
  - Apply the configuration for the IP address pool by running the following command:
      $ oc apply -f first-pool.yaml
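The /32 prefix in this pool means that MetalLB has exactly one address to assign. As a rough illustration, the following Python sketch counts the addresses in a list of CIDR entries with the standard ipaddress module. The function name is hypothetical, and the sketch handles only CIDR notation, as used in this example.

```python
import ipaddress


def pool_size(addresses):
    """Return the total number of IP addresses covered by a list of CIDR strings."""
    return sum(ipaddress.ip_network(cidr).num_addresses for cidr in addresses)


print(pool_size(["192.169.10.0/32"]))  # 1
```

A pool of this size can serve a single load-balancer service, which is enough for the one service in this procedure.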
- Create a BGPAdvertisement CR:
  - Create a file, such as first-adv.yaml, with content like the following example:
      apiVersion: metallb.io/v1beta1
      kind: BGPAdvertisement
      metadata:
        name: first-adv
        namespace: metallb-system
      spec:
        ipAddressPools:
        - first-pool
        peers:
        - frrviavrf
        nodeSelectors:
        - matchLabels:
            egress-service.k8s.ovn.org/test-server1: ""
      # ...
    where:
    peers
        In this example, MetalLB advertises a range of IP addresses from the first-pool IP address pool to the frrviavrf BGP peer.
    nodeSelectors
        In this example, the EgressService CR configures the source IP address for egress traffic to use the load-balancer service IP address. Therefore, you must specify the load-balancer node so that return traffic uses the same path as the traffic originating from the pod.
  - Apply the configuration for the BGP advertisement by running the following command:
      $ oc apply -f first-adv.yaml
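The label key in the nodeSelectors field follows the format egress-service.k8s.ovn.org/<svc_namespace>-<svc_name>, which OVN-Kubernetes applies to the node that it selects for an egress service. The following Python sketch builds that key from a namespace and a service name so that you can compare it with the value in your BGPAdvertisement CR. The function and constant names are hypothetical.

```python
LABEL_PREFIX = "egress-service.k8s.ovn.org/"


def egress_service_node_label(namespace, name):
    """Build the node label key for the egress service <name> in <namespace>."""
    return f"{LABEL_PREFIX}{namespace}-{name}"


# The service in this procedure is named server1 and lives in the test namespace.
print(egress_service_node_label("test", "server1"))
# egress-service.k8s.ovn.org/test-server1
```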
- Create an EgressService CR:
  - Create a file, such as egress-service.yaml, with content like the following example:
      apiVersion: k8s.ovn.org/v1
      kind: EgressService
      metadata:
        name: server1
        namespace: test
      spec:
        sourceIPBy: "LoadBalancerIP"
        nodeSelector:
          matchLabels:
            vrf: "true"
        network: "2"
      # ...
    where:
    metadata.name
        Specifies the name for the egress service. The name of the EgressService resource must match the name of the load-balancer service that you want to modify.
    metadata.namespace
        Specifies the namespace for the egress service. The namespace for the EgressService must match the namespace of the load-balancer service that you want to modify. The egress service is namespace-scoped.
    spec.sourceIPBy
        Specifies the LoadBalancer service ingress IP address as the source IP address for egress traffic.
    matchLabels.vrf
        If you specify LoadBalancerIP for the sourceIPBy specification, a single node handles the LoadBalancer service traffic. In this example, only a node with the label vrf: "true" can handle the service traffic. If you do not specify a node, OVN-Kubernetes selects a worker node to handle the service traffic. When a node is selected, OVN-Kubernetes labels the node in the following format: egress-service.k8s.ovn.org/<svc_namespace>-<svc_name>: "".
    network
        Specifies the routing table ID for egress traffic. Ensure that the value matches the route-table-id value defined in the NodeNetworkConfigurationPolicy resource, for example, route-table-id: 2.
  - Apply the configuration for the egress service by running the following command:
      $ oc apply -f egress-service.yaml
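The network value in the EgressService CR is a string, while route-table-id in the NodeNetworkConfigurationPolicy CR is an integer, so a mismatch is easy to miss. The following Python sketch compares the two, using plain dictionaries that mirror trimmed versions of the example CRs. The function name is hypothetical, and loading the CRs from YAML files or from the cluster is left out.

```python
def egress_network_matches_vrf(nncp, egress_service):
    """Return True if spec.network names a route-table-id of a VRF in the policy."""
    interfaces = nncp["spec"]["desiredState"]["interfaces"]
    # Collect the table IDs of all VRF interfaces, normalized to strings.
    table_ids = {
        str(iface["vrf"]["route-table-id"])
        for iface in interfaces
        if iface.get("type") == "vrf"
    }
    return str(egress_service["spec"].get("network")) in table_ids


# Trimmed copies of the example CRs from this procedure.
nncp = {
    "spec": {
        "desiredState": {
            "interfaces": [
                {"name": "ens4vrf", "type": "vrf",
                 "vrf": {"port": ["ens4"], "route-table-id": 2}},
                {"name": "ens4", "type": "ethernet"},
            ]
        }
    }
}
egress_service = {"spec": {"sourceIPBy": "LoadBalancerIP", "network": "2"}}

print(egress_network_matches_vrf(nncp, egress_service))  # True
```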
Verification
- Verify that you can access the application endpoint of the pods running behind the MetalLB service by running the following command:
    $ curl <external_ip_address>:<port_number>
  - <external_ip_address>:<port_number>: Specifies the external IP address and port number to suit your application endpoint.
- Optional: If you assigned the LoadBalancer service ingress IP address as the source IP address for egress traffic, verify this configuration by using tools such as tcpdump to analyze packets received at the external client.