*********************** Release Notes *********************** =================================== Robin Cloud Native Platform v5.3.11 =================================== The Robin Cloud Native Platform (CNP) v5.3.11 release has new features, an improvement, a bug fix, and known issues. **Release Date:** 10 November 2021 Infrastructure Versions ----------------------- The following software applications are included in this CNP release. ==================== ======== Software Application Version ==================== ======== Kubernetes 1.21.5 Docker 19.03.9 Prometheus 2.16.0 Node-exporter 1.1.2 Calico 3.12.3 HA-Proxy 1.5.18 PostgreSQL 9.6.11 Grafana 6.5.3 ==================== ======== Upgrade Path ------------- The following is the supported upgrade path for Robin CNP v5.3.11: * Robin v5.3.9-286 (GA) to Robin v5.3.11-69 (GA) New Features ------------ --------------------------- Support for NVIDIA A100 MIG --------------------------- Robin CNP v5.3.11 supports the Multi-Instance GPU (MIG) mode of operation for the NVIDIA A100 GPU. Robin allows you to use partitioned GPUs in Robin bundles and also supports chargeback functionality for these GPU partitions. A sketch of a Pod that requests a MIG partition appears at the end of the Improvement section below. ------------------- Rocky Linux Support ------------------- Starting with Robin CNP v5.3.11, Rocky Linux 8.4 is supported. You can install Robin CNP v5.3.11 on this version of Linux. ----------------------------------------- Support for Application Ephemeral Volumes ----------------------------------------- Robin CNP v5.3.11 supports Application Ephemeral Volumes (AEVs). An AEV is temporary storage that Robin bundle applications can use, and it exists only while its application is running. When you create an application, the AEV is created for it to use, and Robin reclaims its space when the application stops. Other applications can use the reclaimed storage space, and Robin provisions the storage space back to the application when it starts again. You can add AEVs only from the Robin UI when creating an application using a Robin bundle. You can also create templates of applications with AEVs for future use. You can add a maximum of 10 AEVs per application. Improvement ----------- --------------------------------------- Support for @ Symbol in Robin Usernames --------------------------------------- Starting with Robin CNP v5.3.11, you can use the @ symbol as part of Robin usernames. This enables you to use email addresses as usernames.
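The following is a minimal sketch of how a workload could request one of the A100 MIG partitions described above. It assumes the cluster advertises the partitions through the NVIDIA device plugin's MIG resource names (for example, ``nvidia.com/mig-1g.5gb``); the Pod name, namespace, and image are illustrative and are not taken from this release.

.. code-block:: yaml

   # Illustrative sketch only: assumes MIG partitions are exposed as
   # nvidia.com/mig-1g.5gb resources by the NVIDIA device plugin.
   apiVersion: v1
   kind: Pod
   metadata:
     name: mig-demo                  # hypothetical name
     namespace: default
   spec:
     restartPolicy: Never
     containers:
     - name: cuda-app
       image: nvidia/cuda:11.0-base  # placeholder image
       command: ["nvidia-smi", "-L"] # list the GPU partition visible to the Pod
       resources:
         limits:
           nvidia.com/mig-1g.5gb: 1  # one 1g.5gb partition of an A100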
Fixed Issue ------------- ============= ========================================================================================================================================================================== Reference ID Description PP-24202 The security issue with SSL Medium Strength Cipher Suites is fixed by supporting Strong Cipher Suites with keys longer than 128 bits in Robin CNP services. The following is the list of supported Strong Cipher Suites: * TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 * TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 * TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305 * TLS_RSA_WITH_AES_128_CBC_SHA * TLS_RSA_WITH_AES_256_CBC_SHA * TLS_RSA_WITH_AES_128_GCM_SHA256 * TLS_RSA_WITH_AES_256_GCM_SHA384 * TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA The above-mentioned Strong Cipher Suites are supported in the following Robin CNP services: * K8s API server * K8s controller manager * K8s scheduler * K8s kubelet * ROBIN UI https server * ROBIN event server ============= ========================================================================================================================================================================== Known Issues ------------- ============= ============================================================================================================================================================================================================================================ Reference ID Description PP-24270 **Symptom** In Robin CNP, Kubelet might go down due to the presence of a stale ``cpu_manager_state`` file. **Workaround** Complete the following steps to fix this issue: 1. Remove the stale ``/var/lib/kubelet/cpu_manager_state`` file using the following command: .. code-block:: text # rm -rf /var/lib/kubelet/cpu_manager_state 2. Restart Kubelet by running the following command: .. code-block:: text # systemctl restart kubelet 3. Make sure the ``etcd`` and ``apiserver`` Pods on this node are up and running. PP-24248 **Symptom** When you create a new resource pool, assign it to a node, and later try to deploy a Pod with storage affinity on that node, the Pod deployment fails because the node does not pick up the newly assigned resource pool. **Workaround** Complete the following steps to fix this issue: 1. Run the following command to edit the node. .. code-block:: text # kubectl edit node 2. Remove the ``robin.io/robinrpool`` resource pool. 3. Add the correct resource pool name. PP-22853 **Symptom** GPUs might not be detected after CNP installation, upgrade, or addition of a new node. **Workaround** Run the following host probe rediscover command: .. code-block:: text # robin host probe --rediscover PP-22626 **Symptom** If NVIDIA GPU drivers are already installed on your setup, operator deployments might fail. **Workaround** Complete the following steps to fix this issue: 1. Remove the NVIDIA driver package: .. code-block:: text # yum remove nvidia-driver-latest-dkms 2. Remove the NVIDIA container toolkit: .. code-block:: text # yum remove nvidia-container-toolkit 3. Reboot the node. PP-21832 **Symptom** After you reboot a node, it might be in the ``NotReady`` state. **Workaround** Complete the following steps to fix this issue: 1. Restart Kubelet: .. code-block:: text # systemctl restart kubelet 2. Restart Dockershim: .. code-block:: text # systemctl restart dockershim 3. Restart robin-cri: .. code-block:: text # docker restart robin-cri PP-22781 **Symptom** After removing a taint on a master node, GPUs are not detected automatically. **Workaround** You need to run the ``robin host probe --rediscover --all --wait`` command for the GPUs to be detected on the primary master node. ============= ============================================================================================================================================================================================================================================ Technical Support ----------------- Contact `Robin Technical support `_ for any assistance. ======================================= Robin Cloud Native Platform v5.3.11 HF1 ======================================= The Robin CNP v5.3.11 HF1 release has improvements, bug fixes, and known issues. **Release Date:** 08 December 2021 Infrastructure Versions ----------------------- The following software applications are included in this CNP release. ==================== ======== Software Application Version ==================== ======== Kubernetes 1.21.5 Docker 19.03.9 Prometheus 2.16.0 Node-exporter 1.1.2 Calico 3.12.3 HA-Proxy 1.5.18 PostgreSQL 9.6.11 Grafana 6.5.3 ==================== ======== Upgrade Paths ------------- The following are the supported upgrade paths for Robin CNP v5.3.11 HF1: * Robin v5.3.5 (HF3) **to** Robin v5.3.11 (HF1) * Robin v5.3.5 (HF5) **to** Robin v5.3.11 (HF1) * Robin v5.3.9 (GA) **to** Robin v5.3.11 (HF1) * Robin v5.3.11 (GA) **to** Robin v5.3.11 (HF1) Improvements ------------ --------------------------------------------------------------------- Network Planning Support for Apps with Pod Affinity and Anti-affinity --------------------------------------------------------------------- Robin CNP v5.3.11 HF1 provides network planning support for apps with Pod affinity and anti-affinity (see the sketch at the end of this section). -------------------------------------------- Application Ephemeral Volumes UI Improvement -------------------------------------------- The APPLICATION EPHEMERAL STORAGE section of the Robin UI in Robin CNP v5.3.11 HF1 is improved to display the following drop-down options for replication: * Storage-Compute Affinity * Not Replicated * Replicated (2 copies) * Replicated (3 copies) .. Note:: These options appear when you create AEVs from the Robin UI if the AEVs are not defined in a Robin bundle manifest file.
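The following is a minimal sketch of the kind of app the network planner now handles: a Pod that combines a standard Kubernetes ``podAntiAffinity`` rule with a Robin network annotation. The labels, image, and IP-Pool name are placeholders, and the JSON-style value shown for ``robin.io/networks`` is only an assumed format; refer to the Robin CNP networking documentation for the exact annotation value.

.. code-block:: yaml

   # Illustrative sketch only: labels, image, and IP-Pool name are placeholders.
   apiVersion: v1
   kind: Pod
   metadata:
     name: web-replica                                     # hypothetical name
     labels:
       app: web
     annotations:
       robin.io/networks: '[{"ippool": "sample-ippool"}]'  # assumed value format
   spec:
     affinity:
       podAntiAffinity:
         requiredDuringSchedulingIgnoredDuringExecution:
         - labelSelector:
             matchLabels:
               app: web
           topologyKey: kubernetes.io/hostname             # spread replicas across hosts
     containers:
     - name: web
       image: nginx:1.21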
Fixed Issues ------------ ============= ============================================================================================================================================================================================================================================ Reference ID Description PP-24650 Robin CNP v5.3.11 HF1 fixed the SWEET32 vulnerability issue. PP-24528 The issue of the CNP planner assigning Pods without checking the status of network interfaces is fixed in this version. With Robin CNP v5.3.11 HF1, the planner skips network interfaces that are down. PP-24428 The creation of an Application Ephemeral Volume (AEV) failed in a cloud deployment if the replication factor was not specified in the AEV specification. This issue is fixed in Robin CNP v5.3.11 HF1 by setting a default replication factor for each AEV when one is not provided. PP-22941 The issue of a Pod not coming up successfully when you provide a Robin annotation for network planning but do not provide any limits and requests in the container resources section is fixed. PP-21983 A Pod controlled by a Deployment, StatefulSet, or DaemonSet might not get created even when its IP address is not listed in ``robin ip-pool info --ip-allocations`` and no other running Pod in the cluster is using that IP address. This issue is fixed.
PP-24589 The issue of Calico's CIDR value not being set correctly when updating a Calico IP pool in an IPv6 Robin CNP setup is fixed. PP-24313 The ``robin bundle add`` command was incorrectly storing the bundle files in the file-object directory of the log collection service instead of the file-object directory of the file collection service. This issue is fixed. ============= ============================================================================================================================================================================================================================================ Known Issues ------------- ============= ============================================================================================================================================================================================================================================================== Reference ID Description PP-24697 **Symptom** The Network attachment definitions (net-attach-def) might not be cleaned up when a Pod is bounced to recreate it from the webhook start phase. If this happens, follow this workaround. **Workaround** You must delete the net-attach-def that is not cleaned up. To delete it, run the following command: .. code-block:: text # kubectl delete net-attach-def -n PP-24600 When you deploy a Pod using Kubernetes Deployment, StatefulSet, or DaemonSet, in some scenarios, the deployment process might take longer than usual due to the exponential back-off delay during scheduling. This is a known Kubernetes behavior. ============= ============================================================================================================================================================================================================================================================== Technical Support ----------------- Contact `Robin Technical support `_ for any assistance. ======================================= Robin Cloud Native Platform v5.3.11 HF2 ======================================= The Robin CNP v5.3.11 HF2 release has a new feature, improvements, bug fixes, and known issues. **Release Date:** 02 February 2022 Infrastructure Versions ----------------------- The following software applications are included in this CNP release. ==================== ======== Software Application Version ==================== ======== Kubernetes 1.21.5 Docker 19.03.9 Prometheus 2.16.0 Prometheus-adapter 0.9.1 Node-exporter 1.1.2 Calico 3.12.3 HA-Proxy 1.5.18 PostgreSQL 9.6.22 Grafana 6.5.3 ==================== ======== Upgrade Paths ------------- The following are the supported upgrade paths for Robin CNP v5.3.11 HF2: * Robin v5.3.5-232 (HF5) **to** Robin v5.3.11 HF2 * Robin v5.3.7-120 (HF1) **to** Robin v5.3.11 HF2 * Robin v5.3.11-104 (HF1) **to** Robin v5.3.11 HF2 New Feature ----------- ------------------------------------------------------ Intel Cache Allocation Technology Support for vDU Pods ------------------------------------------------------ Robin CNP v5.3.11 HF2 supports Intel Cache Allocation Technology (CAT) to deploy 4G vDU (Virtual Distributed Unit) Pods on a single non-uniform memory access (NUMA) node host. Intel CAT enables vDUs in a 4G environment to access the CPUs on the host using dedicated cache lines/ways. When you deploy a vDU Pod using Robin CNP, the vDUs get a dedicated number of cache lines/ways configured on the host to access the CPUs. You can use the following annotation in your Pod YAML file to request cache lines/ways from the CPU for vDUs. **Example:** .. code-block:: yaml # "robin.runtime.num_cache_ways": "4" .. Note:: You must use the annotation at the Pod level only.
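To show where this annotation sits in a complete manifest, the following is a minimal sketch of a Pod that requests dedicated cache ways using the ``robin.runtime.num_cache_ways`` annotation from the example above. The Pod name, image, and resource values are illustrative and are not taken from this release.

.. code-block:: yaml

   # Illustrative sketch only: shows the Pod-level placement of the annotation.
   apiVersion: v1
   kind: Pod
   metadata:
     name: vdu-sample                       # hypothetical name
     annotations:
       robin.runtime.num_cache_ways: "4"    # request 4 dedicated cache ways
   spec:
     containers:
     - name: vdu
       image: registry.example.com/vdu:v1   # placeholder image
       resources:
         limits:
           cpu: "4"
           memory: 8Gi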
Improvements ------------ ------------------------------------------------------------------- Default replication factor and fault domain parameters for RWX PVCs ------------------------------------------------------------------- Starting with Robin CNP v5.3.11 HF2, for RWX PVCs, the default ``replication`` factor is ``2`` and the default ``faultdomain`` is ``host``. If you want to change the ``replication`` factor to ``1``, you can use the following parameter under annotations in the RWX PVC: ``robin.io/rwx_force_single_replica: "1"``. A sketch of a PVC that uses this annotation appears at the end of this Improvements section. .. Note:: You must not use the ``robin.io/replication`` and ``robin.io/rwx_force_single_replica`` annotations together in a PVC. The following is an example of a PVC file: .. code-block:: yaml :emphasize-lines: 7 apiVersion: v1 kind: PersistentVolumeClaim metadata: name: nfs-shared-1 annotations: robin.io/nfs-server-type: "shared" robin.io/replication: "2" robin.io/faultdomain: "host" spec: storageClassName: robin accessModes: - ReadWriteMany resources: requests: storage: 500Gi --------------------------------------------------- IP-Pool Prefix as Environment Variable inside a Pod --------------------------------------------------- Starting with Robin CNP v5.3.11 HF2, Robin CNP provides an IP-Pool prefix as an environment variable inside a Pod. When you deploy a Robin bundle or Helm app and use a Calico or OVS IP-Pool, Robin CNP adds an IP-Pool prefix as an environment variable inside the Pod. Using the IP-Pool prefix environment variable, you can discover the IP prefix of an IP-Pool from the command line. **Example:** .. code-block:: text # kubectl exec -it -n t001-u000004 c1-server-01 -- env | grep -i prefix ROBIN_SAMPLE_IPPOOL_PREFIX=16 In the above command output, ``ROBIN_SAMPLE_IPPOOL_PREFIX=16`` shows that **SAMPLE_IPPOOL** is the IP-Pool name, prefixed with **ROBIN** and suffixed with the word **PREFIX**, and **16** is the IP prefix for the IP-Pool. ------------------------------------------ Robin StorageClass with runAsAny parameter ------------------------------------------ Robin CNP v5.3.11 HF2 provides a new parameter ``runAsAny`` in the StorageClass object to enable any user other than the root user to read or write to an NFS mountpoint of an RWX volume. You can use this parameter in a scenario where a Pod has multiple containers running as different users and you want to allow any user accessing the Pod (containers) to read or write to an NFS mountpoint of an RWX volume. In the StorageClass object file, set the ``runAsAny`` parameter to ``"true"``. The following is an example of a StorageClass with the ``runAsAny`` parameter: .. code-block:: yaml :emphasize-lines: 16 apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: run-as-any-imm labels: app.kubernetes.io/instance: robin app.kubernetes.io/managed-by: robin.io app.kubernetes.io/name: robin provisioner: robin reclaimPolicy: Delete allowVolumeExpansion: true volumeBindingMode: Immediate parameters: replication: '2' media: HDD runAsAny: "true" ---------------------------------------------------- Optimization of CPU and memory for Kubernetes Events ---------------------------------------------------- Robin CNP v5.3.11 HF2 is optimized to reduce the usage of CPU and memory when processing Kubernetes events.
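To complement the RWX PVC defaults described earlier in this section, the following is a minimal sketch of a PVC that forces a single replica with the ``robin.io/rwx_force_single_replica`` annotation instead of ``robin.io/replication``. The annotation names are taken from the improvement above; the PVC name and requested size are illustrative only.

.. code-block:: yaml

   # Illustrative sketch only: forces a single replica for an RWX volume.
   # Do not combine this annotation with robin.io/replication.
   apiVersion: v1
   kind: PersistentVolumeClaim
   metadata:
     name: nfs-shared-single           # hypothetical name
     annotations:
       robin.io/nfs-server-type: "shared"
       robin.io/rwx_force_single_replica: "1"
   spec:
     storageClassName: robin
     accessModes:
     - ReadWriteMany
     resources:
       requests:
         storage: 100Gi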
Fixed Issues ------------ ============= ============================================================================================================================================================================================================================================ Reference ID Description PP-25070 Vulnerability CVE-2021-41103 is related to containerd runtime. The container root directories and some plugins had insufficiently restricted permissions. It allows unprivileged Linux users to traverse directory contents and execute programs. For more information about this vulnerability, see `CVE-2021-41103 `_. In this release, Robin.io has upgraded the containerd package to containerd version 1.5.7 for handling this vulnerability. PP-24947 In the recent versions of Robin CNP, the source-based route is not configured properly for the first additional IP-Pool. This issue is fixed. PP-24938 After upgrading to Robin CNP v5.3.11 (HF1), the static IP address was not allocated to a Pod, and the Pod did not come up. This issue is fixed. PP-24796 The scheduler is unable to create a Pod within 30 seconds because Nmap showed that the requested static IP address was already in use due to an external firewall. This issue is fixed, and Nmap is disabled by default. You can enable Nmap to allow the Robin server to scan the network before IP address allocation by running the following command. .. code-block:: text # robin config update manager mutating_webhook_staticip_validation_enable true PP-24776 The ``robin ip-pool add`` command with the ``nictags pci_addr`` option is failing in Robin CNP v5.3.11 HF1. This issue is fixed. PP-24697 When a Pod with ``robin.io/networks`` annotation is deleted, the Network attachment definitions (net-attach-defs) entries are left behind. You need to manually delete these entries. This issue is fixed. PP-24789 An inaccessible device does not get elected by marking the device offline when the node goes down. This issue is fixed now. PP-25116 When you bounce a Pod or reinstall an app, the deletion event might take time to complete the process as the earlier event is stuck at registration due to a missing API in the kubectl API resources. As a result, the static IP address allocation is failing. This issue is fixed. PP-25109 In a scenario, when two MANAGER nodes are becoming SLAVE nodes and waiting for the third node to become the MASTER node, it fails to become the MASTER node due to internal issues. And, the other two nodes are waiting for the third node to become the MASTER without checking whether any node is holding the MASTER lock. As a result, the control plane is down. This issue is fixed. PP-24645 The existence of a ``recover.conf`` file in the PostgreSQL data directory was preventing a Manager node from becoming MASTER during a high availability transition. This issue is fixed. Instead of failing, Robin CNP now allows the node to continue with its transition to become MASTER. PP-25221 A Pod fails to come up to running state, and the ``kubectl describe pod -n namespace `` command shows an error that the network CNI plugin failed to set up the Pod. This issue is fixed. 
============= ============================================================================================================================================================================================================================================ Known Issues ------------- ============= ============================================================================================================================================================================== Reference ID Description PP-25360 **Symptom** If containers in a Pod use an RWX PVC and are stuck in the ``ContainerCreating`` state for a long time with a timeout error, apply the following workaround. **Workaround** Delete the Pods if they are part of a Deployment or StatefulSet. PP-24937 **Symptom** When upgrading to Robin CNP v5.3.11 HF2 from any supported version using GoRobin, you might encounter an error related to Paramiko. **Workaround** Check the login credentials of the Robin node or try restarting the sshd service on the Robin node. PP-25370 **Symptom** A Robin node in a cluster might go into the ``NotReady`` state when an RWX PVC's mount path is not responding. This issue could occur due to several internal Kubernetes known issues. **Workaround** For workaround steps, see `Troubleshooting a Robin Cluster Node with the NotReady State `_. PP-25430 **Symptom** After upgrading to Robin CNP v5.3.11 HF2, you might observe that automatically restarted Pods containing the ``robin.io/networks`` annotation do not have their secondary IP addresses. **Workaround** Bounce such Pods manually. PP-25422 **Symptom** An upgrade to Robin v5.3.11 HF2 might fail because the robinds and node plugin Pods of one of the worker nodes are stuck in the ``Terminating`` state. **Workaround** Perform the following steps to recover all stale NFS mount points: 1. Make sure that no application Pod uses the PVC. 2. Run the following command to scale down the replica count to 0. .. code-block:: text # kubectl scale --replicas=0 3. Run the following command to unmount the stale mount point. .. code-block:: text # umount -f -l 4. When all stale NFS mount points on a node are recovered, run the following command to restart Kubelet. .. code-block:: text # systemctl restart kubelet 5. Scale the replicas back up to the original count. .. code-block:: text # kubectl scale --replicas= .. Note:: You must not delete the PVC or the application Pod. PP-25425 **Symptom** When you create an application using an Application Ephemeral Volume (AEV) with Storage-Compute Affinity, the app creation might fail due to the missing ``robin.io/hostname`` tag on the host. **Workaround** Add the ``robin.io/hostname`` tag to the host. Run the following command to add the tag: .. code-block:: text # robin host add-tags **Example:** .. code-block:: text # robin host add-tags cscale-82-37.robinsystems.com robin.io/hostname=cscale-82-37.robinsystems.com PP-25296 **Symptom** When a cluster is recovered from a network partition, the Pods deployed on the worker nodes in the minority partition are redeployed in the majority partition. The Robin control plane is unable to access the worker nodes in the minority partition. The Pods and their volume mounts are cleaned up automatically when the network partition is resolved. In some cases, this automatic clean-up fails to remove the stale Pods on the worker nodes. **Workaround** Reboot the server. When the server restarts, it rejoins the cluster without the stale Pods and volume mounts.
PP-21832 **Symptom** After upgrading your cluster to Robin CNP v5.3.11 HF2, if you notice that a cluster node is in the ``NotReady`` state when you reboot the cluster, you must apply the following workaround on all nodes of the cluster. **Workaround** To resolve this issue, complete the following steps: 1. Run the following commands. .. code-block:: text /bin/cp /root/bin/robin-reboot.service /etc/systemd/system/robin-reboot.service kernel_version=$(uname -r) if [[ $kernel_version == "3.10"* ]]; then sed -i "/EL8/d;s/#EL7//" /etc/systemd/system/robin-reboot.service else sed -i "/EL7/d;s/#EL8//" /etc/systemd/system/robin-reboot.service fi 2. Run the following command to restart Kubelet. .. code-block:: text # systemctl restart kubelet 3. Run the following command to restart Dockershim. .. code-block:: text # systemctl restart dockershim 4. Run the following command to restart robin-cri. .. code-block:: text # docker restart robin-cri PP-25286 **Symptom** When you try to resize a PDV to a large size, the resize task fails due to insufficient storage, and due to this resize task failure, the subsequent PDV resizing tasks also fail. **Workaround** For workaround steps, see `PDV Resize issue `_. PP-25441 **Symptom** In Robin CNP v5.3.11, provisioning clones with Application Ephemeral Volumes (AEVs) is not supported. PP-25412 **Symptom** Storage-Compute affinity is not enforced on an Application Ephemeral Volume (AEV) when storage is available on the same host. PP-25453 **Symptom** When upgrading to Robin CNP v5.3.11 HF2 from any supported version, CNS fails to execute post-Robin upgrade actions on one of the nodes as the IO Manager might be down and the node displays the ``NotReady`` state. Apply the following workaround if you notice this issue. **Workaround** Run the following command on the node where you executed the upgrade command: .. code-block:: text ./ onprem post-upgrade-robin --hosts-json /root/hosts.json --gorobintar --robin-admin-user --robin-admin-passwd PP-25461 **Symptom** A Pod might not get allocated one or more of its static IP addresses under multiple conditions. If you discover that a Pod comes up without a static IP address, apply the following workaround. **Workaround** Bounce the Pods that are not allocated the required number of static IP addresses. PP-25423 **Symptom** After upgrading to Robin CNP v5.3.11 HF2 from Robin CNP v5.3.5, application Pods might be in the ``ContainerCreating`` state. If you notice this issue, apply the following workaround. **Workaround** 1. Using the RWX PVC, find the Deployment or StatefulSet and the Pod details by running the following command: .. code-block:: text # kubectl describe pvc -n ns 2. Note the replica count by running the following command: .. code-block:: text # kubectl get all -n 3. Scale the replicas to 0 by running the following command. .. code-block:: text # kubectl scale --replicas=0 -n 4. Observe that the Pod is terminated and does not exist anymore. 5. Scale the replicas back to the count that you noted in step 2. Use the following command to scale up. .. code-block:: text # kubectl scale --replicas= -n PP-25381 **Symptom** Robin CNP does not support the rack fault domain for Application Ephemeral Volumes (AEVs). However, the Robin CNP UI incorrectly displays the FaultDomain (Rack) option. PP-25467 **Symptom** You might observe that Kubelet is slow or unresponsive and periodically goes into the error state, resulting in issues with the Robin storage layer.
This issue could be due to an orphan Pod or Kubelet trying to mount a Pod on old PVCs. **Workaround** 1. Run the following command to check the status of Kubelet. .. code-block:: text # systemctl status kubelet -l 2. In the command output, find the following message. .. code-block:: text "orphaned pod pod_id found, but error not a directory occurred when trying to remove the volumes dir" 3. Run the following command to list the PVC names from the node. .. code-block:: text # kubectl get pvc -A 4. Run the following command to check whether the PVC exists. .. code-block:: text # kubectl get pvc -A | grep 5. If the PVC does not exist, delete the directory by running the following command. .. code-block:: text # rm -rf /var/lib/kubelet/pods/ PP-25463 **Symptom** The volume mounts in a Pod fail due to duplicate FS UUIDs. **Workaround** A duplicate FS UUID is present when the device is already mounted on the same node. When a volume mount fails, the FS UUID is displayed in ``syslog/dmesg``. Perform the following steps to resolve the duplicate FS UUID conflict. 1. Run the following command to check whether any device has the same FS UUID: .. code-block:: text # blkid | grep 2. Run the following command to check whether the device is mounted: .. code-block:: text # mount | grep 3. If the device is mounted, run the following command to unmount it: .. code-block:: text # umount After unmounting the device, the duplicate FS UUID conflict will be resolved. PP-25466 **Symptom** A Pod fails to come up because the volume is not accessible, and the volume is in the faulted state. **Workaround** You need to probe the Robin hosts by running the following command: .. code-block:: text # robin host probe --all PP-25508 **Symptom** When you try to modify an IP Pool, the modification process might fail with an error message due to missing values in the IP Pool. If you notice any error message when modifying the IP Pool, apply the following workaround. **Workaround** 1. Run the following command only once on the cluster. .. code-block:: text # robin schedule update K8sResSync k8s_resource_sync 63072000 2. Run the following command and make a note of the IP Pool values. .. code-block:: text robin ip-pool info 3. Run the following command to update the missing values in the IP Pool that you noted in the previous step. .. code-block:: text kubectl edit ripp --validate=false **Example:** In the following example, you need to add the missing values in the ``spec:`` section. You do not need to update all values. For example, for a network-based IP-Pool, the ``prefix`` field is not required. Similarly, you can ignore other values that are not required. .. code-block:: text [root@centos-60-205 ~]# kubectl edit ripp ovs-1 --validate=false ... spec: available: "15" dns_search: domain.com driver: ovs gateway: fd74:ca9b:3a09:868c::1 ifcount: 1 name: ovs-1 nameserver: fd74:ca9b:3a09:868c:10:9:60:62 netmask: ffff:ffff:ffff:ffff:0000:0000:0000:0000 ranges: - fd74:ca9b:3a09:868c:0010:0009:0109:0010-0020 - fd74:ca9b:3a09:868c:0010:0009:0109:0040-0050 subnet: fd74:ca9b:3a09:868c:0000:0000:0000:0000 used: "2" zone: default prefix: 64 vfdriver: xyz vlan_number: 100 4. Rerun the failed IP Pool command to verify.
============= ============================================================================================================================================================================== Appendix -------- ---------------------------------------------------------------- Troubleshooting a Robin Cluster Node with the ``NotReady`` State ---------------------------------------------------------------- **The following content is the workaround for PP-25370.** A Robin node in a cluster might go into the ``NotReady`` state when an RWX PVC's mount path is not responding. This issue could occur due to several internal Kubernetes known issues. The RWX PVC's mount path may not respond due to the following issues/symptoms on your cluster. You can troubleshoot these issues and bring back the node to the ``Ready`` state. This document section provides troubleshooting steps for the following issues: * NFS server's service IP address entry in the conntrack table might go into ``SYN_SENT`` or ``TIME_WAIT`` state * NFS Servers may not be ready * NFS Server Failover Issues * I/O hangs on the volume With Robin v5.3.11 HF2, you might notice the ``NotReady`` state issue when you are upgrading from Robin v5.3.11 HF1 to Robin v5.3.11 HF2. * **Troubleshoot NFS Server’s service IP address entry in the conntrack table in SYN_SENT or TIME_WAIT state** The Robin node could be in the ``NotReady`` state if the NFS Server’s service IP address entry in the conntrack table in ``SYN_SENT`` or ``TIME_WAIT``. The following steps enable you to troubleshoot this issue and bring the node to the ``Ready`` state. 1. Run the following command to know if your node is in the ``NotReady`` state when you notice any of the above-mentioned symptoms: .. code-block:: text # kubectl get node **Example:** .. code-block:: text # kubectl get node hypervvm-61-46 NAME STATUS ROLES AGE VERSION hypervvm-61-46 NotReady 25h v1.21.5 2. Run the following command and grep the NFS server mount paths: .. code-block:: text # mount|grep :/pvc 3. Copy the mount paths for verification from the command output. 4. Run the following command to check the status of the mount path: .. code-block:: text # ls **Example:** .. code-block:: text # ls /var/lib/kubelet/pods/25d256d5-e6cc-4865-a3ee-88640e0d1fc8/volumes/kubernetes.io~csi/pvc-210829ca-96d4-4a12-aab8-5646d087054d/mount .. Note:: If any mount paths do not respond or hang, you must check the status of conntrack. You need the service IP of the NFS Server Pod for checking conntrack status. 5. Run the following command to get the NFS server Pod service IP address: .. code-block:: text # mount|grep **Example:** .. code-block:: text # mount|grep pvc-210829ca-96d4-4a12-aab8-5646d087054d [fd74:ca9b:3a09:868c:172:18:0:e23e]:/pvc-210829ca-96d4-4a12-aab8-5646d087054d on /var/lib/kubelet/pods/25d256d5-e6cc-4865-a3ee-88640e0d1fc8/volumes/kubernetes.io~csi/pvc-210829ca-96d4-4a12-aab8-5646d087054d/mount type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp6,timeo=600,retrans=2,sec=sys,clientaddr=fd74:ca9b:3a09:868c:10:9:82:127,local_lock=none,addr=fd74:ca9b:3a09:868c:172:18:0:e23e) 6. Verify if the conntrack entry state using NFS server Pod IP address by running the following command: .. code-block:: text # conntrack -L -d .. Note:: If you notice the conntrack entry state as ``SYN_SENT`` or ``TIME_WAIT``, you need to delete the entry from conntrack table entries to allow connections to the NFS service. **Example:** .. 
code-block:: text # conntrack -L -d fd74:ca9b:3a09:868c:172:18:0:e23e tcp 6 110 SYN_SENT src=fd74:ca9b:3a09:868c:10:9:82:127 dst=fd74:ca9b:3a09:868c:172:18:0:e23e sport=980 dport=2049 [UNREPLIED] src=fd74:ca9b:3a09:868c:172:18:0:71d4 dst=fd74:ca9b:3a09:868c:10:9:82:127 sport=2049 dport=614 mark=0 use=1 conntrack v1.4.4 (conntrack-tools): 1 flow entries have been shown. 7. Run the following command to delete the ``SYN_SENT`` or ``TIME_WAIT`` entry: .. code-block:: text # conntrack -D -d **Example:** .. code-block:: text # conntrack -D -d fd74:ca9b:3a09:868c:172:18:0:e23e tcp 6 102 SYN_SENT src=fd74:ca9b:3a09:868c:10:9:82:127 dst=fd74:ca9b:3a09:868c:172:18:0:e23e sport=980 dport=2049 [UNREPLIED] src=fd74:ca9b:3a09:868c:172:18:0:71d4 dst=fd74:ca9b:3a09:868c:10:9:82:127 sport=2049 dport=614 mark=0 use=1 conntrack v1.4.4 (conntrack-tools): 1 flow entries have been deleted. .. Note:: After deleting the ``SYN_SENT`` or ``TIME_WAIT`` entry from the conntrack table, you should be able to access the NFS mount path. 8. Run the following command to verify the mount path status. .. code-block:: text # ls /var/lib/kubelet/pods/25d256d5-e6cc-4865-a3ee-88640e0d1fc8/volumes/kubernetes.io~csi/pvc-210829ca-96d4-4a12-aab8-5646d087054d/mount * **Additional Troubleshooting Checks** If you have verified the NFS Server's service IP address entry in the conntrack table for the ``SYN_SENT`` or ``TIME_WAIT`` status and your node is still in the ``NotReady`` state, you need to perform additional checks to troubleshoot the issue. The following are some additional checks for troubleshooting the issue: * Check NFS Exports Status * Check NFS server failover Status * Check NFS server Pod is provisioned * **Check NFS Exports Status** All NFS exports must be in the ``READY`` state. To check the NFS exports status, run the following command: .. code-block:: text # robin nfs export-list **Example:** .. code-block:: text # robin nfs export-list +--------------+-----------+------------------------------------------+---------------------+-----------------------------------------------------------------------+ | Export State | Export ID | Volume | NFS Server Pod | Export Clients | +--------------+-----------+------------------------------------------+---------------------+-----------------------------------------------------------------------+ | READY | 7 | pvc-9b1ef05e-5e4a-4e6a-ab3e-f7c95d1ae920 | robin-nfs-shared-9 | ["hypervvm-61-48.robinsystems.com","hypervvm-61-43.robinsystems.com"] | +--------------+-----------+------------------------------------------+---------------------+-----------------------------------------------------------------------+ .. Note:: If NFS exports are not in the ``READY`` state, make sure that NFS server failover is enabled. Generally, it is enabled by default. * **Check NFS server failover Status** NFS Server failover is enabled by default. However, you should check it for confirmation and enable it if it is disabled. To check the NFS server failover status, run the following command: .. code-block:: text # robin config list nfs|grep failover_enabled nfs | failover_enabled * **Check NFS server Pod is provisioned** To check whether the NFS server Pod is provisioned, run the following command: .. code-block:: text # robin job list|grep -i NFSServerPodCreate|tail .. Note:: If all of these checks are fine, then it could be a bug in the NFS Server Failover. To troubleshoot the NFS Server failover issue, see **Troubleshoot NFS Server Failover Issues**.
* **Troubleshoot NFS Server Failover Issues** A node could go to the ``NotReady`` state due to NFS Server failover issues as well, apart from other issues mentioned in this section. .. Note:: 1. You can use the following steps even if your NFS Server has no issues, however, the PVC mount path is hung. 2. Before you troubleshoot the NFS Server failover issues, check the **Troubleshoot NFS Server’s service IP address entry in the conntrack table in SYN_SENT or TIME_WAIT** state and **Additional Troubleshooting Checks**. To fix the NFS server failover issues, complete the following steps: 1. Run the following command to check if any NFS exports are in the ``ASSIGNED_ERR`` state and identify corresponding PVCs: .. code-block:: text # robin nfs export-list 2. Run the following command to note the replica count in the deployment or StatefulSet: .. code-block:: text # kubectl get all -n **Example:** .. code-block:: text # kubectl get all -n ... NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/app1 2/2 2 2 27h NAME DESIRED CURRENT READY AGE replicaset.apps/app1-5cbbc6d9db 2 2 2 27h 3. Run the following command to scale the application Pods that use those PVCs to 0: .. Note:: Do not delete the application. Scaling down the application Pods will make sure that new Pods do not come up and results in the NFS exports being cleared. .. code-block:: text # kubectl scale --replicas=0 -n 4. Run the following command to check all NFS exports are healthy: .. code-block:: text # robin nfs export-list 5. (Optional) Run the following command on the hung paths if you notice some mount paths are still hung: .. code-block:: text # umount -f -l 6. Run the following command to check the node status: .. code-block:: text # kubectl get node .. Note:: If you notice the node is still not showing the ``Ready`` state, wait for 2 minutes for kubelet to refresh the status. If the status is still not showing ``Ready``, stop and start kubelet by running following commands: .. code-block:: text # systemctl stop kubelet #systemctl start kubelet 7. Check the node status again. If the status is ``Ready``, then go to the last step. .. code-block:: text # kubectl get node 8. If the node is still not in the ``Ready`` state or flapping between ``Ready/NotReady`` and you do not see any Pods in k8s that are using the RWX PVC, it may be Pods are deleted by force from Kubernetes. In this case, k8s does not see Pods, but Docker is still hanging on to those Pods. THIS IS A RARE CASE and is hit only when Pods are deleted forcefully. In this case, run the following commands: .. code-block:: text # docker rm <> # systemctl restart docker 9. Run the following command to check the node status: .. code-block:: text # kubectl get node The node should be in the ``Ready`` state. 10. Run the following command to scale up the application Pods back to the original count that you noted earlier: .. code-block:: text # kubectl scale --replicas= -n PDV Resize issue ---------------- When you try to resize a PDV to a large size, the resize task fails due to insufficient storage, and due to this resize task failure, the subsequent PDV resizing tasks also fail. If you face this issue, complete the following troubleshooting steps: 1. Run the following command to verify the PersistentDataVolumeResize job status: .. code-block:: text # robin job info .. Note:: In the command output, notice the PersistentDataVolumeResize job failure, and similarly, you might notice multiple failed volume expansion jobs. 
This is because the Robin CNP is continuously trying to allocate storage for volume expansion. **Example:** .. code-block:: text # robin job info 935 ID |Type | Desc | State | Start |End | Duration | Dependson| Error | Message 935 | PersistentDataVolumeResize | Resizing PersistentDataVolume 'test-pdv-202201102020041' from 108447924224 to 151G | COMPLETED|FAILED | 27 Jan 12:58:53 | 12:59:06 | 0:00:13 | [] | 1 | Unable to allocate storage for volume pvc-2a9ffb4e-fc25-4536-b700-501c2a7a8d80 of logical size 50.0G. Needed 51.56G of type HDD in physical space but found only 42.28G available. Check available capacity, maximum volume count, physical sector size and maintenance mode for the drives. 2. Run the following command to get information about the PV and PVC for the impacted PDV volumes: .. code-block:: text # robin pdv list **Example:** .. code-block:: text # robin pdv list test-pdv-202201102020041 Name | Owner/Tenant | Access | Size | Media | PV | PVC test-pdv-202201102020041 | u1/tenant1 | Private | 100G | HDD | pvc-2a9ffb4e-fc25-4536-b700-501c2a7a8d80 | t038-u000040/pvc-38-40-16420352860585 3. Save the PVC configuration file. .. code-block:: text # kubectl get pvc -n -o yaml > **Example:** .. code-block:: text # kubectl get pvc -n t002-u000006 pvc-38-40-16420352860585 -o yaml > pvc.yaml **Example PVC file:** .. code-block:: yaml :emphasize-lines: 21, 22 apiVersion: v1 kind: PersistentVolumeClaim metadata: annotations: pv.kubernetes.io/bind-completed: "yes" pv.kubernetes.io/bound-by-controller: "yes" robin.io/faultdomain: host robin.io/media: HDD robin.io/replication: "2" robin.io/rpool: default volume.beta.kubernetes.io/storage-provisioner: robin volume.kubernetes.io/storage-provisioner: robin creationTimestamp: "2022-01-13T00:54:46Z" finalizers: - kubernetes.io/pvc-protection labels: robin.io/domain: ROBIN_PDV robin.io/tenant: tenant1 robin.io/tenant_id: "38" robin.io/user_id: "40" robin.io/username: u1 name: pvc-38-40-16420352860585 namespace: t038-u000040 resourceVersion: "2378648" uid: 2a9ffb4e-fc25-4536-b700-501c2a7a8d80 spec: accessModes: - ReadWriteMany resources: requests: storage: 200Gi storageClassName: robin-immediate volumeMode: Filesystem volumeName: pvc-2a9ffb4e-fc25-4536-b700-501c2a7a8d80 status: accessModes: - ReadWriteMany capacity: storage: 100Gi conditions: - lastProbeTime: null lastTransitionTime: "2022-01-27T17:01:41Z" status: "True" type: Resizing phase: Bound 4. Edit the PVC YAML file and remove the following attributes: .. code-block:: text vi * In the metadata annotations, remove the following attributes: .. code-block:: text pv.kubernetes.io/bind-completed pv.kubernetes.io/bound-by-controller * In the In metadata, remove the following attributes: .. code-block:: text creationTimestamp resourceVersion Uid * Remove the complete status section in the PVC YAML file. **Edited PVC YAML file example:** .. 
code-block:: yaml :emphasize-lines: 18, 19 apiVersion: v1 kind: PersistentVolumeClaim metadata: annotations: robin.io/faultdomain: host robin.io/media: HDD robin.io/replication: "2" robin.io/rpool: default volume.beta.kubernetes.io/storage-provisioner: robin volume.kubernetes.io/storage-provisioner: robin finalizers: - kubernetes.io/pvc-protection labels: robin.io/domain: ROBIN_PDV robin.io/tenant: tenant1 robin.io/tenant_id: "38" robin.io/user_id: "40" robin.io/username: u1 name: pvc-38-40-16420352860585 namespace: t038-u000040 spec: accessModes: - ReadWriteMany resources: requests: storage: 200Gi storageClassName: robin-immediate volumeMode: Filesystem volumeName: pvc-2a9ffb4e-fc25-4536-b700-501c2a7a8d80 5. In the edited YAML file, change the ``spec.resources.requests.storage`` size of the underlying Robin volume. 6. Mark the PersistentVolume(PV) that is bound to the PersistentVolumeClaim(PVC) with the ``Retain`` reclaim policy. This will prevent the underlying volume from being deleted when the PVC is deleted. **Example:** .. code-block:: text # kubectl patch pv pvc-2a9ffb4e-fc25-4536-b700-501c2a7a8d80 -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' persistentvolume/pvc-2a9ffb4e-fc25-4536-b700-501c2a7a8d80 patched 7. Patch the PVC to disable Robin admission control from preventing the PDV’s deletion. **Example** .. code-block:: text # kubectl patch pvc -n t038-u000040 pvc-38-40-16420352860585 -p '{"metadata":{"labels": {"robin.io/override_delete_protection":"true"}}}' persistentvolumeclaim/pvc-38-40-16420352860585 patched 8. Delete the PVC. As PV now has a Retain reclaim policy, you will not lose any data when the PVC is recreated. **Example:** .. code-block:: text # kubectl delete pvc -n t038-u000040 pvc-38-40-16420352860585 persistentvolumeclaim "pvc-38-40-16420352860585" deleted 9. Delete the ``claimRef`` entry from PV specs, so the new PVC can bind to it. This should make the PV Available. **Example:** .. code-block:: text # kubectl patch pv pvc-2a9ffb4e-fc25-4536-b700-501c2a7a8d80 --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]' persistentvolume/pvc-2a9ffb4e-fc25-4536-b700-501c2a7a8d80 patched 10. Re-create the PVC with a required size without exceeding the available resources. .. Note:: In the ``pvc.yaml`` file, that is created above, has the name of the existing PV set in the ``volumeName`` attribute. This will bind the new PVC to the existing PV. **Example:** .. code-block:: text # kubectl create -f pvc.yaml persistentvolumeclaim/pvc-38-40-16420352860585 created 11. Restore the original reclaim policy of the PV. **Example:** .. code-block:: text # kubectl patch pv pvc-2a9ffb4e-fc25-4536-b700-501c2a7a8d80 -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}' persistentvolume/pvc-2a9ffb4e-fc25-4536-b700-501c2a7a8d80 patched Technical Support ----------------- Contact `Robin Technical support `_ for any assistance. ======================================= Robin Cloud Native Platform v5.3.11 HF3 ======================================= The Robin CNP v5.3.11 HF3 release has a new feature and a known issue. **Release Date:** 06 May 2022 Infrastructure Versions ----------------------- The following software applications are included in this CNP release. 
==================== ======== Software Application Version ==================== ======== Kubernetes 1.21.5 Docker 19.03.9 Prometheus 2.16.0 Prometheus-adapter 0.9.1 Node-exporter 1.1.2 Calico 3.12.3 HA-Proxy 1.5.18 PostgreSQL 9.6.22 Grafana 6.5.3 ==================== ======== Upgrade Path ------------- The following is the supported upgrade path for Robin CNP v5.3.11 HF3: * Robin v5.3.11 (HF2) **to** Robin v5.3.11 (HF3) New Feature ----------- ------------------------------------------------ Support for Cisco DCNM E1000 Virtual Interface ------------------------------------------------ Robin CNP v5.3.11 HF3 supports the Cisco Data Center Network Manager (DCNM) E1000 virtual network interface for KVMs. You can deploy the Cisco DCNM application on the Robin CNP cluster. .. Note:: The Cisco DCNM E1000 Virtual Interface is supported only on KVMs with an OVS IP Pool. You can configure the interface only using an ``input.yaml`` file. For more information, see `here `_. Known Issue ------------- ============= ============================================================================================================================================================================== Reference ID Description PP-27192 **Symptom** In some rare scenarios, creating an app from a snapshot of a KVM app fails with this error: *Failed to ping instance.* **Workaround** Run the following command to restart the KVM instance: .. code-block:: text # robin instance restart ============= ============================================================================================================================================================================== Technical Support ----------------- Contact `Robin Technical support `_ for any assistance. ======================================= Robin Cloud Native Platform v5.3.11 HF4 ======================================= The Robin CNP v5.3.11 HF4 release has improvements, a fixed issue, and known issues. **Release Date:** 19 June 2022 Infrastructure Versions ----------------------- The following software applications are included in this CNP release. ==================== ======== Software Application Version ==================== ======== Kubernetes 1.21.5 Docker 19.03.9 Prometheus 2.16.0 Prometheus-adapter 0.9.1 Node-exporter 1.1.2 Calico 3.12.3 HA-Proxy 1.5.18 PostgreSQL 9.6.22 Grafana 6.5.3 ==================== ======== Upgrade Path ------------- The following is the supported upgrade path for Robin CNP v5.3.11 HF4: * Robin v5.3.11 (HF2) **to** Robin v5.3.11 (HF4) Improvements ------------ -------------------- Rocky Linux Support -------------------- Robin CNP v5.3.11 HF4 supports Rocky Linux 8.6. You can install Robin CNP v5.3.11 HF4 on Rocky Linux 8.6 servers. The following are the supported Rocky Linux 8.6 kernel versions: - 4.18.0-372.9.1.rt7.166.el8.x86_64 (RT Kernel) - 4.18.0-372.9.1.el8.x86_64 (Non-RT Kernel) --------------------------------------------------- Disable Init Containers and Sidecars in Bundle App --------------------------------------------------- Robin CNP v5.3.11 HF4 supports disabling the Init Containers and Sidecars in Robin Bundle apps by using the ``input.yaml`` file when deploying the Bundle apps. The following is the sample Robin Bundle file: ..
code-block:: yaml :emphasize-lines: 34, 45, 23 name: dpdk-intel version: v1 icon: icon.png snapshot: enabled clone: enabled roles: - pktgen pktgen: name: pktgen norootfs: true image: name: robinsys/dpdk-intel version: v1 engine: docker imagePullPolicy: IfNotPresent entrypoint: entry.sh compute: memory: 1G cpu: reserve: true cores: 2 initContainers: - name: init1 image: 'robinsys/dpdk-intel:v1' imagePullPolicy: IfNotPresent resources: limits: cpu: 25m memory: 128Mi command: - sleep - '5' sidecars: - name: side1 image: 'robinsys/dpdk-intel:v1' imagePullPolicy: IfNotPresent command: - /bin/bash - '-c' - trap 'exit 0' SIGTERM; while true; do sleep 1; done resources: limits: memory: 200Mi cpu: '1' - name: side2 image: 'robinsys/dpdk-intel:v1' imagePullPolicy: IfNotPresent command: - /bin/bash - '-c' - trap 'exit 0' SIGTERM; while true; do sleep 1; done resources: limits: memory: 200Mi cpu: '1' **Input Yaml file for disabling Init Containers and Sidecars** In the earlier Robin Bundle sample file, we have ``side1`` and ``side2`` sidecars and Init container ``init1``. Using the following sample ``Input.yaml`` file you can disable the Init Containers and sidecars. From the above sample Bundle Yaml file example, we are disabling ``side1`` sidecar and Init container ``init1``. The following is the sample input.yaml file for disabling Init Containers and sidecars. .. code-block:: yaml :emphasize-lines: 7, 9 roles: - name: pktgen containers: - name: side2 disabled: false - name: side1 disabled: true - name: init1 disabled: true You can use the input.yaml file when creating an app using the Robin Bundle. **Syntax** Run the following command when creating an app using the Robin Bundle. ``# robin app create from-bundle --rpool --wait`` Fixed Issues ------------ ================== ============================================================================================================================================= Reference ID Description PP-27304 The 503 error message appears due to timeout of the HAProxy. To fix this issue, you need to increase the timeout values of the HAProxy using the ``robin config update`` command to 60 seconds for the ``connect_timeout`` attribute. ================== ============================================================================================================================================= Known Issues ------------- ============= ============================================================================================================================================================================== Reference ID Description PP-27400 **Symptom** The ``--disablerepo=*`` option does not work with CentOS 8 and Rocky Linux 8. **Workaround** You can disable all repos by creating a backup folder and manually moving all repo files to it. Run the following commands to manually move all repo files: 1. ``mkdir /etc/yum.repos.d.backup`` 2. ``mv /etc/yum.repos.d/* /etc/yum.repos.d.backup/`` PP-27613 When you create an IP pool, the IP pool creation succeeds; however, the app creation fails with the error *IP Pool does not exist*. This issue occurs as the pool creation is registered on Kubernetes and fails to register on the database. **Workaround** 1. Log in to the Robin server pod from any node using ``rbash master``. 2. Run the following command to restart Robin cluster. ``systemctl restart robin-server`` 3. Run the following command to verify the IP pool list. 
``robin ip-pool list`` ============= ============================================================================================================================================================================== Technical Support ----------------- Contact `Robin Technical support `_ for any assistance.