19. Integration with Kubernetes¶
19.1. Topology and CPU Manager¶
Robin Cloud Native Platform integrates with the Kubernetes Topology Manager and CPU Manager to support Non-Uniform Memory Access (NUMA) aware CPU allocation, Single Root I/O Virtualization (SR-IOV) devices, dedicated CPU allocation, and FPGA devices for deployments using YAML files and Helm charts. This integration enables you to run latency-sensitive workloads, such as telecommunication applications, in native Kubernetes environments.
In environments where you cannot use Robin Bundles because you prefer homegrown standardized Helm charts, operators, or YAMLs, the Topology Manager and CPU Manager integration lets you use the advanced networking and compute features that Bundles support in native Kubernetes environments. To support this, Robin CNP provides special annotations for adding multiple interfaces and devices in Helm charts.
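For example, a Pod template in a Helm chart might request additional interfaces through an annotation along the following lines. This is a sketch only: the annotation key robin.io/networks and the IP-Pool names are assumptions, so confirm the exact key and value format against your Robin CNP release documentation.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-nic-pod
  annotations:
    # Assumption: annotation key and IP-Pool names are illustrative;
    # check the Robin CNP documentation for the exact key and format.
    robin.io/networks: '[{"name": "ovs-pool"}, {"name": "sriov-pool"}]'
spec:
  containers:
  - name: app
    image: registry.example.com/telco-app:latest  # illustrative image
```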
The following are advantages of the integration:
Multiple interface support for Pods
Dedicated CPUs for Guaranteed Pods (see the sketch after this list)
Helm charts that work in a native Kubernetes environment work with Robin CNP
No per-environment hardcoding of resources (CPU IDs or SR-IOV resource names) is required in Helm charts
Easy-to-use annotations for specifying device and interface requests
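Dedicated CPU allocation applies to Pods in the Guaranteed QoS class. As a reminder of the upstream Kubernetes requirement, a container must request whole CPUs with requests equal to limits to receive exclusive CPUs under the static CPU Manager policy; the image name in this minimal sketch is illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod
spec:
  containers:
  - name: app
    image: registry.example.com/latency-sensitive-app:latest  # illustrative image
    resources:
      # requests == limits and an integer CPU count => Guaranteed QoS,
      # eligible for exclusive CPUs under the static CPU Manager policy
      requests:
        cpu: "2"
        memory: "4Gi"
      limits:
        cpu: "2"
        memory: "4Gi"
```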
Note
NUMA-aware memory allocation is not currently supported by the Topology Manager.
19.1.1. Role of Topology Manager¶
The Topology Manager communicates with the CPU Manager and Device Manager to determine the physical layout of the nodes within a cluster so that it can place a Pod on an appropriate NUMA node based on the configured policies. For more information, see Topology Manager.
Note
When an application has an SR-IOV interface, its CPU and memory resources cannot be allocated across NUMA nodes, and its SR-IOV interfaces cannot be distributed across NUMA nodes either.
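For reference, NUMA-aware placement in upstream Kubernetes is driven by kubelet settings such as the following; the exact policies in effect depend on how your Robin CNP cluster is provisioned, so treat this as an illustrative sketch.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Static CPU Manager policy enables exclusive CPU assignment
# for Guaranteed Pods with integer CPU requests.
cpuManagerPolicy: static
# single-numa-node rejects Pods whose CPU and device resources
# cannot all be satisfied from a single NUMA node.
topologyManagerPolicy: single-numa-node
```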
19.2. CNI plugins¶
Robin CNP supports the following three CNI plugins:
Open vSwitch (OVS)
Calico
SR-IOV
These CNI plugins enable you to orchestrate container networking for applications deployed on Robin CNP. Each plugin acts as the driver for an IP-Pool, without which an application cannot be created. Each has its own advantages and might be the preferred network specification depending on the workload. Detailed below are some notes on each driver and the benefits it brings.
Note
Selecting a driver is mandatory for IP-Pool creation.
19.2.1. Calico¶
Robin CNP enhances Kubernetes networking with Calico CNI plugin-driven network address assignment and enables a fully distributed network architecture that scales smoothly to any deployment size (in terms of the number of Pods). This driver also enables policy-based networking, in which ingress and egress policies can be set up for Pods. In addition, Robin CNP integrates with Calico Typha to scale Kubernetes clusters beyond 50 nodes by minimizing the impact that each calico-node Pod (present on every node, given the DaemonSet nature of the Calico CNI plugin) has on the Kubernetes API server datastore. This in turn improves the performance of the cluster.
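As an illustration of policy-based networking, a standard Kubernetes NetworkPolicy such as the following sketch (the namespace, labels, and port are illustrative assumptions) restricts ingress so that only frontend Pods can reach backend Pods.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: demo            # illustrative namespace
spec:
  podSelector:
    matchLabels:
      app: backend           # the policy applies to backend Pods
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend      # only frontend Pods may connect
    ports:
    - protocol: TCP
      port: 8080
```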
19.2.2. Open vSwitch¶
Robin CNP provides flat networking support with Open vSwitch (OVS) network driver-based CNI IP address allocation for Robin application workloads, allowing users to access applications from outside the Robin cluster via a Kubernetes NodePort. This driver is the preferred choice when applications need to be accessed by external services. Robin CNP also provides dual-stack networking support with OVS, allowing users to assign both IPv4 and IPv6 addresses to their application workloads. This flexibility can be an advantage depending on application requirements.
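For example, a workload can be exposed outside the cluster with a dual-stack NodePort Service using standard Kubernetes fields; the selector, ports, and names below are illustrative.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: app-nodeport
spec:
  type: NodePort              # reachable from outside the cluster on every node
  ipFamilyPolicy: PreferDualStack
  ipFamilies:                 # request both address families where available
  - IPv4
  - IPv6
  selector:
    app: my-app               # illustrative label
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30080           # optional; must fall within the NodePort range
```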
19.2.3. SR-IOV¶
SR-IOV is a specification that allows a PCIe device to appear as multiple separate physical PCIe devices. It works by introducing the concepts of physical functions (PFs) and virtual functions (VFs). Physical functions are full-featured PCIe functions, whereas virtual functions are “lightweight” functions that lack configuration resources. To function correctly, SR-IOV requires support at the BIOS level as well as in the operating system instance or hypervisor that is running on the hardware. Robin CNP allows a user to assign one or more virtual functions from physical functions to Pods created through Robin Bundles. To achieve this, Robin CNP leverages the following CNIs: Multus, Bond CNI, and Intel CNI for SR-IOV.
Robin CNP discovers SR-IOV NICs alongside their virtual functions as part of the discovery process and accounts for them like any other compute resource. As a result, Robin supports NUMA awareness for applications not only in terms of memory and CPU but also with regard to VFs from SR-IOV NICs that are within the NUMA boundary.
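The usual upstream pattern for consuming a VF, shown here as a hedged sketch, is to reference a NetworkAttachmentDefinition through the Multus annotation and request the VF as an extended resource. The network name sriov-net and the resource name intel.com/intel_sriov_netdevice are placeholders that depend on how the SR-IOV device plugin is configured in your environment.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sriov-pod
  annotations:
    # Multus attaches the secondary SR-IOV interface named here;
    # "sriov-net" must match an existing NetworkAttachmentDefinition.
    k8s.v1.cni.cncf.io/networks: sriov-net
spec:
  containers:
  - name: app
    image: registry.example.com/dpdk-app:latest  # illustrative image
    resources:
      requests:
        intel.com/intel_sriov_netdevice: "1"  # VF resource name is environment-specific
      limits:
        intel.com/intel_sriov_netdevice: "1"
```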
Some additional features that complement SR-IOV support include:
Robin CNP can tag VLAN traffic for networks based on the VFs present.
Robin CNP can bind a DPDK driver to a VF associated with a Pod. The user can specify this kernel driver while creating an IP-Pool. Note that the kernel driver module must be preloaded on the node for full functionality.
Robin CNP can bond virtual functions that are attached to a Pod from different physical functions (see the sketch below).
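The exact bonding format is release-specific, but with the upstream Bond CNI a bond over two VF interfaces already attached to the Pod (net1 and net2) is typically described along these lines; treat every field here as an assumption-laden sketch rather than Robin’s exact format.

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: bond-net             # illustrative name
spec:
  config: '{
      "type": "bond",
      "cniVersion": "0.3.1",
      "name": "bond-net",
      "mode": "active-backup",
      "failOverMac": 1,
      "linksInContainer": true,
      "miimon": "100",
      "links": [
        {"name": "net1"},
        {"name": "net2"}
      ]
    }'
```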
19.3. Soft Affinity Support¶
Robin CNP supports the soft affinity feature with a few limitations.
In Kubernetes, the soft affinity feature refers to a way of guiding the Kubernetes Scheduler to make a decision about where to place Pods based on preferences, rather than strict requirements. This preference helps to increase the likelihood of co-locating certain Pods on the same node, while still allowing the Kubernetes Scheduler to make adjustments based on resource availability and other constraints. For more information, see Affinity and anti-affinity.
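In upstream Kubernetes terms, soft affinity is expressed with preferredDuringSchedulingIgnoredDuringExecution; the labels, weight, and image in this sketch are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-pod
  labels:
    app: web
spec:
  affinity:
    podAffinity:
      # "preferred" makes this a soft rule: the scheduler favors nodes
      # already running Pods labeled app=cache, but can place the Pod
      # elsewhere if resource availability requires it.
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: cache          # illustrative label
          topologyKey: kubernetes.io/hostname
  containers:
  - name: web
    image: registry.example.com/web:latest  # illustrative image
```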
Limitations
The following are the limitations of soft affinity and anti-affinity support:
These operators are not supported: DoesNotExist, Gt, and Lt.
Multiple weight parameters for node and Pod affinity are not supported.
Soft anti-affinity does not check or match label selectors coming from a different Deployment.
During a complete cluster restart, if all nodes are not up at the same time, Pods will not be spread across nodes with soft anti-affinity.
After a restart, a Pod might not come back up on the same node.
After scaling down the number of replicas in a Deployment, soft Pod anti-affinity might not delete the Pods in the same order they were created.
Because the affinity information is handled in a cache, restarting the robin-server flushes the cache, so scaled-up Pods might not be placed according to anti-affinity rules.
Creating, deleting, or recreating Pods multiple times will not honor soft affinity. Also, you must ensure that Pods are not in the Terminating status when increasing the Pod replicas.
Pods will be unequally distributed across nodes when all Pods in a Deployment are deleted.