19. Integration with Kubernetes¶

Robin CNP integrates with the Kubernetes Topology Manager, CPU Manager, and Memory Manager. For information about Memory Manager, see: Memory Manager.

19.1. Topology and CPU Manager¶

Robin Cloud Native Platform integrates with the Kubernetes Topology Manager and CPU Manager to support Non-Uniform Memory Access (NUMA) aware CPU allocation, Single Root I/O Virtualization (SR-IOV) devices, dedicated CPU allocation and FPGA devices for deployments using YAML files and Helm charts. This integration enables you to run workloads, such as telecommunication applications, that are latency-sensitive in native Kubernetes environments.

In environments where you cannot use Robin Bundles due to a preference for homegrown standardized Helm charts, operators, or YAMLs, the integration of the Topology Manager and CPU Manager enables you to utilize the advanced networking and compute features supported by Bundles in a native Kubernetes environments. To support this, Robin CNP provides special annotations for adding multiple interfaces and devices in Helm charts.

The following are advantages of the integration:

Multiple interface support for Pods
Dedicated CPUs for Guaranteed Pods
Helm charts that work in native Kubernetes environment work with Robin CNP
No hardcoding of resources is required in Helm charts per environment (CPU IDs or SR-IOV resource names)
Easy to use annotations for specifying device and interface requests

Note

NUMA-aware memory allocation is not currently supported by the Topology Manager.

19.1.1. Role of Topology Manager¶

The Topology Manager communicates with the CPU Manager and Device Managers to determine the layout of the physical infrastructure of the nodes within a cluster in order to place a Pod on an appropriate NUMA node based on the configured policies. For more information, see Topology Manager.

Note

When an application has an SR-IOV interface, it is not possible to allocate the CPU and memory resources across NUMA nodes. Also, it is not possible for SRIOV interfaces to be distributed across NUMA nodes.

19.1.2. Topology Manager Policies¶

Robin Supports the following Topology Manager policies:

None
Best-effort
Restricted
Single-numa-node

Note

The default Topology Manager policy is Restricted .

More information about the different Topology Manager policies can be found here.

19.1.3. Isolated shared CPU on Kubernetes CPU Manager¶

Starting from Robin CNP v5.4.3 HF4, support to configure isolated shared CPUs on Kubernetes CPU Manager is available. To use the isolated shared CPU on the Kubernetes CPU manager, the CPUs must be from the isolated pool of CPUs on a host.

19.1.3.1. Configure Isolated shared CPU¶

To configure isolated-shared CPUs on the host, run the following command on each worker node:

# /root/bin/isolated-shared-cpu-patch.sh <ISOLSHARED_CPUS>

Example

# /root/bin/isolated-shared-cpu-patch.sh 5-8,45-48
Redirecting to /bin/systemctl restart libvirtd.service
Thu Nov 30 19:40:22 PST 2023 PID:306438 docker daemon started
Thu Nov 30 19:40:22 PST 2023 PID:306438 docker daemon is running
Thu Nov 30 19:40:22 PST 2023 PID:306438 Updating haproxy container   cpuset to 0-8,40-48
haproxy-keepalived-robink8s_monitor
Thu Nov 30 19:40:23 PST 2023 PID:306438 robin-cri doesnt exist. creating robin-cri
Deleting robin-cri
Thu Nov 30 19:40:23 PST 2023 PID:306438 robin-cri deleted
79113c63c1537d66b2d66045f59a2e204bfb75a558ace559298ffda0640d8035
Thu Nov 30 19:40:23 PST 2023 PID:306438 robin-cri exist. Waiting for it to start
Thu Nov 30 19:40:24 PST 2023 PID:306438 robin-cri started
Thu Nov 30 19:40:24 PST 2023 PID:306438 robin-cri started
DefaultCpuset 0-79
2ac81c6f834c: 0-79
Updating 2ac81c6f834c with 0-79
2ac81c6f834c
9055c52facfd: 0-79
Updating 9055c52facfd with 0-79
9055c52facfd
2456d1a2ec3f: 0-79
…… O/P truncated

After running the above command, you can check which CPUs are configured as isolated shared CPUs in the following file:

# cat /root/bin/ROBIN_INSTALL_CONTEXT | grep -i ISOLATED_SHARED_CPUS

Example

# cat /root/bin/ROBIN_INSTALL_CONTEXT | grep -i isolated
ISOLATED_SHARED_CPUS=default=5-8,45-48

19.1.3.2. Request Isolated Shared CPU for Pod¶

After configuring isolated shared CPUs on a host, a Pod can request for these isolated shared CPUs as needed. To request the isolated shared CPUs for a Pod, you need to add the following annotation in the Pod YAML for Helm apps and Bundle’s manifest file or input.yaml file for Robin Bundle apps:

robin.runtime.isolated_shared: "default"

Helm Apps

You must add the robin.runtime.isolated_shared: "default" annotation in Pod YAML for Helm apps.

Sample Pod YAML requesting for isolated shared CPUs for Helm apps:

  apiVersion: v1
  kind: Pod
  metadata:
    name: isol1
    annotations:
      robin.runtime.isolated_shared: "default"
  spec:
    tolerations:
    - effect: NoSchedule
    key: node-role.kubernetes.io/master
    - effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
    containers:
    - name: app
    image: robinsys/virtlauncher:rocky8
    command: ["/bin/bash","-c","while true; do sleep 5; done"]

Robin Bundle Apps

You need to add the robin.runtime.isolated_shared: "default" annotation in the following ways for Robin Bundle apps:
- Bundle’s manifest file - in the annotations section under the role.
- Bundle’s input.yaml - in the annotations section under the role.

19.1.4. Memory Manager¶

Robin CNP integrates the Kubernetes Memory Manager plugin starting with Robin CNP v5.6.0.

The Memory Manager plugin allocates guaranteed memory and hugepages for guaranteed QoS Pods at the NUMA level.

The Memory Manager plugin works along with the CPU Manager and Topology Manager. It provides hints to the Topology manager and enables resource allocations. The Memory Manager plugin ensures that the memory requested by a pod is allocated from a minimum number of NUMA nodes.

Note

Robin CNP supports only the Static policy for Memory Manager and supports only Pod as the scope for Topology Manager(topology-manager-scope=Pod).

19.1.4.1. Enable Memory Manager plugin¶

You must enable the Memory Manager plugin using the "memory-manager-policy":"Static" parameter when installing Robin CNP or upgrading to a supported version.

Note

You must provide this parameter in the config.json only and the plugin functions at the cluster level.

19.1.4.2. Memory Manager state file¶

The memory manager state file tracks memory and hugepage assignments for Pods and containers with the Guaranteed QoS class. And, it also tracks from which NUMA nodes the memory or hugepages is assigned.

The Memory Manager state file is stored at the following location on each node: /var/lib/kubelet/memory_manager_state.

For more information about Memory Manager, see Utilizing the NUMA-aware Memory Manager.

When you create, delete, or resize Pods, the memory manager state changes dynamically. Robin CNP periodically refreshes this information to maintain an accurate view of resource availability in the memory manager state file.
The Memory Manager state file only tracks memory allocated to Pods that meet the Guaranteed QoS class requirement. The plugin does not manage BestEffort and Burstable Pods.

19.1.4.3. Memory Manager integration advantages¶

Robin CNP uses the information from the Memory Manager state file for the following points:

To determine the available memory resources on a NUMA node for scheduling Guaranteed QoS pods
To schedule Pods on NUMA nodes with sufficient available memory
To monitor and detect potential memory shortages or imbalances across NUMA nodes.
To troubleshoot if any Pods are failing to schedule due to memory constraints. The memory manager state file helps you to understand the memory allocation state of the node.

19.2. CNI plugins¶

Robin CNP supports the following three CNI plugins:

Open vSwitch (OVS)
Calico
SR-IOV

These CNI plugins enables you to orchestrate container networking for applications deployed on Robin CNP. Each of these is regarded as a driver for an IP-Pool without which an application cannot be created. Each has its own advantages and might be the preferred network specification based on the workload. Detailed below are some notes on each driver and the benefits they bring.

Note

Selecting a driver is mandatory for IP-Pool creation.

19.2.1. Calico¶

Robin CNP enhances Kubernetes networking with Calico CNI plugin driven network address assignments and enables a fully distributed network architecture scaling smoothly for any size of deployment (in terms of number of PODs). Moreover this driver enables policy based networking wherein which ingress/egress policies for Pods can be set up. In addition Robin CNP integrates with Calico Typha in order to scale Kubernetes clusters beyond 50 nodes in size by minimizing the calico-node, present on every node given the daemonset nature of the Calico CNI plugin, pod’s impact on the Kubernetes API server datastore. This in turn improves the performance of the cluster.

19.2.2. Open vSwitch¶

Robin CNP provides flat networking support with Open vSwitch (OVS) network driver based CNI IP address allocations to Robin application workloads thereby allowing users to access applications from outside the Robin cluster via a Kubernetes NodePort. This driver is clearly the preferred choice if applications need to be accessed by external services. Robin CNP also provides dual stack networking support with OVS allowing users to assign both Ipv4 and Ipv6 addresses to their application workloads. This flexibility can be an advantage depending on application requirements.

19.2.3. SR-IOV¶

SR-IOV is a specification that allows a PCIe device to appear as multiple separate physical PCIe devices. It works by introducing the idea of physical functions (PFs) and virtual functions (VFs). Physical functions (PFs) are full-featured PCIe functions, where as virtual functions (VFs) are “lightweight” functions that lack configuration resources. To function correctly, SR-IOV requires support at BIOS level as well as in the operating system instance or hypervisor that is running on the hardware. Robin CNP allows a user to assign one or more virtual functions from physical functions to PODs created through Robin bundles. To achieve this Robin CNP leverages the following CNI’s: Multus, Bond CNI and Intel CNI for SR-IOV.

Robin CNP discovers SR-IOV NICs alongside their Virtual Functions as part of the discovery process (more details available here) and accounts for them similarly to any other compute resource. As a result, Robin supports NUMA awareness for applications not only in terms of memory and CPU but also with regard to VF’s from SR-IOV NICs that are within the NUMA boundary.

Some additional features that complement SR-IOV support include:

Robin CNP has the ability to tag VLAN traffic for networks based on the VFs present.
Robin CNP has the ability to bind a DPDK driver to a VF associated with a POD. The user can specify this kernel driver whilst creating an IP-Pool. Note that the kernel driver module has to be pre-loaded on the node for full functionality.
Robin CNP has the ability to bond virtual functions, which are attached to POD, from different physical functions.

19.3. Soft Affinity Support¶

Robin CNP supports the soft affinity feature with a few limitations.

In Kubernetes, the soft affinity feature refers to a way of guiding the Kubernetes Scheduler to make a decision about where to place Pods based on preferences, rather than strict requirements. This preference helps to increase the likelihood of co-locating certain Pods on the same node, while still allowing the Kubernetes Scheduler to make adjustments based on resource availability and other constraints. For more information, see Affinity and anti-affinity.

Limitations

The following are the limitations of support for soft affinity and anti-affinity support:

These operators are not supported: DoesNotExist, Gt, and Lt.
Multiple weight parameters for node and Pod affinity are not supported.
Soft anti-affinity doesn’t check or match for the label selector coming from a different Deployment.
During a complete cluster restart, if all nodes are not up at the same time, Pods will not be spread across nodes with soft anti-affinity.
After a Pod restart, it might not come back on the same node.
Post downsizing the number of replicas in a Deployment, soft Pod anti-affinity might not delete the Pods in the same order as creation.
As the affinity information is handled in the cache, restarting the robin-server will flush the cache, resulting in scaled-up Pods not being placed as per anti-affinity.
Creating, deleting, or recreating Pods multiple times will not honour soft affinity. Also, you must ensure that the Pods are not in the Terminating status when increasing the Pod replicas.
Pods will be unequally distributed on nodes when all Pods in a deployment are deleted.