19. Integration with Kubernetes

19.1. Topology and CPU Manager

Robin Cloud Native Platform (CNP) integrates with the Kubernetes Topology Manager and CPU Manager to support Non-Uniform Memory Access (NUMA) aware CPU allocation, Single Root I/O Virtualization (SR-IOV) devices, dedicated CPU allocation, and FPGA devices for deployments that use YAML files and Helm charts. This integration enables you to run latency-sensitive workloads, such as telecommunication applications, in native Kubernetes environments.

In environments where you cannot use Robin Bundles because you prefer homegrown standardized Helm charts, operators, or YAML files, the integration of the Topology Manager and CPU Manager enables you to use the advanced networking and compute features supported by Bundles in a native Kubernetes environment. To support this, Robin CNP provides special annotations for adding multiple interfaces and devices in Helm charts.

The following are advantages of the integration:

  • Multiple interface support for Pods

  • Dedicated CPUs for Guaranteed Pods

  • Helm charts that work in a native Kubernetes environment also work with Robin CNP

  • No per-environment hardcoding of resources (such as CPU IDs or SR-IOV resource names) is required in Helm charts

  • Easy-to-use annotations for specifying device and interface requests

Note

NUMA-aware memory allocation is not currently supported by the Topology Manager.
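As a sketch of the "dedicated CPUs for Guaranteed Pods" item above: in upstream Kubernetes, a Pod falls into the Guaranteed QoS class when every container sets CPU and memory requests equal to its limits, and the CPU Manager static policy hands out exclusive CPUs only when the CPU value is a whole number. The manifest below is illustrative. The Pod name, image, and output path are placeholders, and it assumes that the static CPU Manager policy is active on the target node.

```shell
# Illustrative only: Pod name, image, and output path are placeholders.
# Assumes the CPU Manager static policy is active on the target node.
GUARANTEED_POD_YAML="${TMPDIR:-/tmp}/guaranteed-pod.yaml"

cat > "$GUARANTEED_POD_YAML" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: dedicated-cpu-demo
spec:
  containers:
  - name: app
    image: registry.example.com/demo/app:latest
    resources:
      # requests == limits for CPU and memory -> Guaranteed QoS class
      # whole-number CPU value -> eligible for exclusive CPUs
      requests:
        cpu: "2"
        memory: 1Gi
      limits:
        cpu: "2"
        memory: 1Gi
EOF

# Apply with: kubectl apply -f "$GUARANTEED_POD_YAML"
# Check with: kubectl get pod dedicated-cpu-demo -o jsonpath='{.status.qosClass}'
```

Because nothing in the manifest names a CPU ID, the same file can be reused across environments with different CPU layouts.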

19.1.1. Role of Topology Manager

The Topology Manager communicates with the CPU Manager and Device Managers to determine the layout of the physical infrastructure of the nodes within a cluster in order to place a Pod on an appropriate NUMA node based on the configured policies. For more information, see Topology Manager.

Note

When an application has an SR-IOV interface, its CPU and memory resources cannot be allocated across NUMA nodes. Also, SR-IOV interfaces cannot be distributed across NUMA nodes.
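To see the NUMA layout that placement decisions are based on, you can read the standard Linux sysfs entries on a node. The helper below is a minimal sketch and not a Robin CNP tool. The function name is hypothetical, and the optional argument exists only so that the function can be pointed at a different directory.

```shell
# Print each NUMA node and the CPUs that belong to it.
# The optional argument overrides the sysfs directory (default: the live system).
numa_layout() (
  root="${1:-/sys/devices/system/node}"
  for dir in "$root"/node[0-9]*; do
    # Skip entries without a readable CPU list (also covers an unmatched glob).
    [ -r "$dir/cpulist" ] || continue
    printf '%s: %s\n' "$(basename "$dir")" "$(cat "$dir/cpulist")"
  done
)

numa_layout
```

On a two-socket host this typically prints one line per NUMA node, each with a CPU range list in the same notation used elsewhere in this section (for example, 5-8,45-48).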

19.1.2. Topology Manager Policies

Robin CNP supports the following Topology Manager policies:

  • None

  • Best-effort

  • Restricted

  • Single-numa-node

Note

The default Topology Manager policy is Restricted.

For more information about the Topology Manager policies, see the Kubernetes Topology Manager documentation.
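In the upstream kubelet configuration, these policies are spelled in lowercase (none, best-effort, restricted, single-numa-node) and set through the topologyManagerPolicy field. The sketch below reads that field from a KubeletConfiguration file and checks a name against the four upstream values. The function names are hypothetical, and the default path is an assumption (the kubeadm default). Your nodes might keep the kubelet configuration elsewhere or set the policy through command-line flags.

```shell
# Print the topologyManagerPolicy value from a KubeletConfiguration file.
# The default path is an assumption and may differ on your nodes.
topology_policy() (
  file="${1:-/var/lib/kubelet/config.yaml}"
  sed -n 's/^topologyManagerPolicy:[[:space:]]*//p' "$file" | tr -d "\"'" | head -n 1
)

# Succeed only for the four policy names that upstream Kubernetes defines.
is_valid_topology_policy() {
  case "$1" in
    none|best-effort|restricted|single-numa-node) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on a node:
#   policy=$(topology_policy)
#   is_valid_topology_policy "$policy" && echo "policy: $policy"
```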

19.1.3. Isolated shared CPU on Kubernetes CPU Manager

Starting with Robin CNP v5.4.3 HF4, you can configure isolated shared CPUs on the Kubernetes CPU Manager. The CPUs that you use for this purpose must come from the isolated pool of CPUs on a host.
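Before you choose a CPU list, it can help to confirm that every CPU in it belongs to the isolated pool. The sketch below expands CPU range lists and checks that one list is contained in another. The function names are hypothetical. The usage comment assumes that the isolated pool corresponds to the kernel's list in /sys/devices/system/cpu/isolated, which might not match how your hosts define the pool.

```shell
# Expand a CPU list such as "5-8,45-48" into one CPU ID per line.
expand_cpu_list() (
  printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r first last; do
    [ -n "$first" ] || continue
    # A single CPU ("7") has no upper bound, so it expands to itself.
    seq "$first" "${last:-$first}"
  done
)

# Succeed only if every CPU in list $1 also appears in list $2.
cpus_within() (
  pool=$(expand_cpu_list "$2")
  for cpu in $(expand_cpu_list "$1"); do
    printf '%s\n' "$pool" | grep -qx "$cpu" || exit 1
  done
)

# Usage on a worker node (assumed source of the isolated pool):
#   cpus_within 5-8,45-48 "$(cat /sys/devices/system/cpu/isolated)" \
#     && echo "all requested CPUs are isolated"
```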

19.1.3.1. Configure Isolated shared CPU

  1. To configure isolated-shared CPUs on the host, run the following command on each worker node:

    # /root/bin/isolated-shared-cpu-patch.sh <ISOLSHARED_CPUS>
    

    Example

    # /root/bin/isolated-shared-cpu-patch.sh 5-8,45-48
    Redirecting to /bin/systemctl restart libvirtd.service
    Thu Nov 30 19:40:22 PST 2023 PID:306438 docker daemon started
    Thu Nov 30 19:40:22 PST 2023 PID:306438 docker daemon is running
    Thu Nov 30 19:40:22 PST 2023 PID:306438 Updating haproxy container   cpuset to 0-8,40-48
    haproxy-keepalived-robink8s_monitor
    Thu Nov 30 19:40:23 PST 2023 PID:306438 robin-cri doesnt exist. creating robin-cri
    Deleting robin-cri
    Thu Nov 30 19:40:23 PST 2023 PID:306438 robin-cri deleted
    79113c63c1537d66b2d66045f59a2e204bfb75a558ace559298ffda0640d8035
    Thu Nov 30 19:40:23 PST 2023 PID:306438 robin-cri exist. Waiting for it to start
    Thu Nov 30 19:40:24 PST 2023 PID:306438 robin-cri started
    Thu Nov 30 19:40:24 PST 2023 PID:306438 robin-cri started
    DefaultCpuset 0-79
    2ac81c6f834c: 0-79
    Updating 2ac81c6f834c with 0-79
    2ac81c6f834c
    9055c52facfd: 0-79
    Updating 9055c52facfd with 0-79
    9055c52facfd
    2456d1a2ec3f: 0-79
    …… (output truncated)
    
  2. After running the above command, you can check which CPUs are configured as isolated shared CPUs in the following file:

    # cat /root/bin/ROBIN_INSTALL_CONTEXT | grep -i ISOLATED_SHARED_CPUS
    

    Example

    # cat /root/bin/ROBIN_INSTALL_CONTEXT | grep -i isolated
    ISOLATED_SHARED_CPUS=default=5-8,45-48
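If you want to use the configured value in a script, the entry can be split into its pool name and CPU list. This is a minimal sketch that assumes the entry always has the form shown in the example above (ISOLATED_SHARED_CPUS=<pool>=<cpu-list>, with a single pool). The function name is hypothetical.

```shell
# Print "<pool> <cpu-list>" from the ISOLATED_SHARED_CPUS entry.
# Assumes the single-pool format shown in the example output.
isolated_shared_entry() (
  file="${1:-/root/bin/ROBIN_INSTALL_CONTEXT}"
  sed -n 's/^ISOLATED_SHARED_CPUS=\([^=]*\)=\(.*\)$/\1 \2/p' "$file"
)

# Usage on a worker node:
#   isolated_shared_entry
```

For the example above, the function prints the pool name default followed by the CPU list 5-8,45-48.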
    

19.1.3.2. Request Isolated Shared CPU for Pod

After you configure isolated shared CPUs on a host, a Pod can request them as needed. To request isolated shared CPUs for a Pod, add the following annotation to the Pod YAML for Helm apps, or to the Bundle’s manifest file or input.yaml file for Robin Bundle apps:

robin.runtime.isolated_shared: "default"

  • Helm Apps

    You must add the robin.runtime.isolated_shared: "default" annotation in the Pod YAML for Helm apps.

    Sample Pod YAML requesting isolated shared CPUs for Helm apps:

      apiVersion: v1
      kind: Pod
      metadata:
        name: isol1
        annotations:
          robin.runtime.isolated_shared: "default"
      spec:
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
        - effect: NoSchedule
          key: node-role.kubernetes.io/control-plane
        containers:
        - name: app
          image: robinsys/virtlauncher:rocky8
          command: ["/bin/bash","-c","while true; do sleep 5; done"]
    
  • Robin Bundle Apps

    For Robin Bundle apps, add the robin.runtime.isolated_shared: "default" annotation in one of the following locations:

    • Bundle’s manifest file - in the annotations section under the role.

    • Bundle’s input.yaml - in the annotations section under the role.
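One way to check the result after the Pod starts is to look at the CPU list that the kernel allows for a process inside the container. The sketch below extracts that list from a /proc status file. The function name is hypothetical, and the check assumes that the annotation shows up as the container's CPU affinity.

```shell
# Print the allowed CPU list from a /proc/<pid>/status file
# (default: the status of the current process).
cpus_allowed_list() (
  awk '/^Cpus_allowed_list:/ { print $2 }' "${1:-/proc/self/status}"
)

# Usage against the sample Pod (requires kubectl access to the cluster):
#   kubectl exec isol1 -- grep Cpus_allowed_list /proc/self/status
```

If the assumption holds, the reported list should match the CPUs configured as isolated shared CPUs on the node where the Pod runs.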

19.2. CNI plugins

Robin CNP supports the following three CNI plugins:

  • Open vSwitch (OVS)

  • Calico

  • SR-IOV

These CNI plugins enable you to orchestrate container networking for applications deployed on Robin CNP. Each plugin is treated as a driver for an IP-Pool, and an application cannot be created without an IP-Pool. Each driver has its own advantages and might be the preferred network specification depending on the workload. The following sections describe each driver and the benefits it brings.

Note

Selecting a driver is mandatory for IP-Pool creation.

19.2.1. Calico

Robin CNP enhances Kubernetes networking with network address assignment driven by the Calico CNI plugin and enables a fully distributed network architecture that scales smoothly for deployments of any size (in terms of the number of Pods). This driver also enables policy-based networking, in which ingress and egress policies can be set up for Pods. In addition, Robin CNP integrates with Calico Typha to scale Kubernetes clusters beyond 50 nodes. Because the Calico CNI plugin runs as a DaemonSet, a calico-node Pod is present on every node; Typha minimizes the impact of these Pods on the Kubernetes API server datastore, which in turn improves the performance of the cluster.
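As an illustration of policy-based networking, the sketch below writes a standard Kubernetes NetworkPolicy that restricts both ingress and egress for a set of Pods. It uses only the upstream networking.k8s.io/v1 API. The policy name, namespace, labels, ports, and output path are placeholders.

```shell
# Illustrative only: names, namespace, labels, ports, and path are placeholders.
NETWORK_POLICY_YAML="${TMPDIR:-/tmp}/backend-network-policy.yaml"

cat > "$NETWORK_POLICY_YAML" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-policy
  namespace: demo
spec:
  # The policy applies to Pods labeled app=backend.
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  - Egress
  ingress:
  # Only frontend Pods may connect, and only on TCP 8080.
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  # Backend Pods may only reach database Pods on TCP 5432.
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
EOF

# Apply with: kubectl apply -f "$NETWORK_POLICY_YAML"
```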

19.2.2. Open vSwitch

Robin CNP provides flat networking support through CNI IP address allocation based on the Open vSwitch (OVS) network driver for Robin application workloads, which allows users to access applications from outside the Robin cluster via a Kubernetes NodePort. This driver is the preferred choice if applications need to be accessed by external services. Robin CNP also provides dual-stack networking support with OVS, which allows users to assign both IPv4 and IPv6 addresses to their application workloads. This flexibility can be an advantage depending on application requirements.
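For reference, a NodePort is exposed through a standard Kubernetes Service of type NodePort. The sketch below writes such a Service. The Service name, label, ports, and output path are placeholders, and the nodePort value assumes the default Kubernetes NodePort range (30000-32767).

```shell
# Illustrative only: name, label, ports, and path are placeholders.
NODEPORT_SERVICE_YAML="${TMPDIR:-/tmp}/demo-nodeport-service.yaml"

cat > "$NODEPORT_SERVICE_YAML" <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: demo-nodeport
spec:
  type: NodePort
  selector:
    app: demo
  ports:
  - protocol: TCP
    port: 80          # Service port inside the cluster
    targetPort: 8080  # container port on the selected Pods
    nodePort: 30080   # port opened on every node (default range 30000-32767)
EOF

# Apply with: kubectl apply -f "$NODEPORT_SERVICE_YAML"
```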

19.2.3. SR-IOV

SR-IOV is a specification that allows a PCIe device to appear as multiple separate physical PCIe devices. It works by introducing physical functions (PFs) and virtual functions (VFs). PFs are full-featured PCIe functions, whereas VFs are “lightweight” functions that lack configuration resources. To function correctly, SR-IOV requires support at the BIOS level as well as in the operating system instance or hypervisor that runs on the hardware. Robin CNP allows a user to assign one or more VFs from PFs to Pods created through Robin Bundles. To achieve this, Robin CNP leverages the following CNIs: Multus, Bond CNI, and Intel CNI for SR-IOV.

Robin CNP discovers SR-IOV NICs alongside their Virtual Functions as part of the discovery process (more details available here) and accounts for them similarly to any other compute resource. As a result, Robin supports NUMA awareness for applications not only in terms of memory and CPU but also with regard to VFs from SR-IOV NICs that are within the NUMA boundary.

Some additional features that complement SR-IOV support include:

  • Robin CNP has the ability to tag VLAN traffic for networks based on the VFs present.

  • Robin CNP has the ability to bind a DPDK driver to a VF associated with a Pod. The user can specify this kernel driver while creating an IP-Pool. Note that the kernel driver module must be preloaded on the node for full functionality.

  • Robin CNP has the ability to bond VFs that are attached to a Pod and come from different PFs.
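To see which NICs on a host expose SR-IOV and which NUMA node each one belongs to, you can read the standard Linux sysfs attributes for PCI devices. The helper below is a minimal sketch for manual inspection. It is not the mechanism that Robin CNP discovery uses, and the function name is hypothetical.

```shell
# List SR-IOV capable interfaces with their VF counts and NUMA node.
# The optional argument overrides the sysfs directory (default: the live system).
sriov_summary() (
  root="${1:-/sys/class/net}"
  for dev in "$root"/*/device; do
    # Only SR-IOV capable PFs expose sriov_totalvfs.
    [ -r "$dev/sriov_totalvfs" ] || continue
    iface=$(basename "$(dirname "$dev")")
    printf '%s total_vfs=%s enabled_vfs=%s numa_node=%s\n' \
      "$iface" \
      "$(cat "$dev/sriov_totalvfs")" \
      "$(cat "$dev/sriov_numvfs")" \
      "$(cat "$dev/numa_node")"
  done
)

sriov_summary
```

Comparing the numa_node value of a PF with the CPU list of that NUMA node shows which CPUs sit within the same NUMA boundary as its VFs.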

19.3. Soft Affinity Support

Robin CNP supports the soft affinity feature with a few limitations.

In Kubernetes, the soft affinity feature refers to a way of guiding the Kubernetes Scheduler to make a decision about where to place Pods based on preferences, rather than strict requirements. This preference helps to increase the likelihood of co-locating certain Pods on the same node, while still allowing the Kubernetes Scheduler to make adjustments based on resource availability and other constraints. For more information, see Affinity and anti-affinity.

Limitations

Soft affinity and anti-affinity support has the following limitations:

  • These operators are not supported: DoesNotExist, Gt, and Lt.

  • Multiple weight parameters for node and Pod affinity are not supported.

  • Soft anti-affinity does not check or match label selectors that come from a different Deployment.

  • During a complete cluster restart, if not all nodes are up at the same time, Pods will not be spread across nodes with soft anti-affinity.

  • After a Pod restart, it might not come back on the same node.

  • After the number of replicas in a Deployment is reduced, soft Pod anti-affinity might not delete the Pods in the same order in which they were created.

  • Because the affinity information is handled in the cache, restarting the robin-server flushes the cache, and scaled-up Pods are then not placed as per anti-affinity.

  • Soft affinity is not honored when Pods are created, deleted, or recreated multiple times. Also, you must ensure that the Pods are not in the Terminating status when increasing the Pod replicas.

  • Pods will be unequally distributed on nodes when all Pods in a Deployment are deleted.
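The sketch below writes a Deployment that uses soft Pod anti-affinity while staying within the limitations above: it has a single weighted term, uses the In operator, and selects only the labels of its own Deployment. It uses the standard Kubernetes affinity fields. The Deployment name, label, image, and output path are placeholders.

```shell
# Illustrative only: name, label, image, and path are placeholders.
SOFT_AFFINITY_YAML="${TMPDIR:-/tmp}/soft-anti-affinity-deployment.yaml"

cat > "$SOFT_AFFINITY_YAML" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          # "preferred" makes this a soft rule: the scheduler tries to
          # spread replicas across nodes but may still co-locate them.
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - web
              topologyKey: kubernetes.io/hostname
      containers:
      - name: web
        image: registry.example.com/demo/web:latest
EOF

# Apply with: kubectl apply -f "$SOFT_AFFINITY_YAML"
```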