25. Release Notes¶
25.1. Robin Cloud Native Platform v5.5.1¶
The Robin Cloud Native Platform (CNP) v5.5.1 release notes cover pre- and post-upgrade steps, new features, improvements, fixed issues, and known issues.
Release Date: September 30, 2025
25.1.1. Infrastructure Versions¶
The following software applications are included in this CNP release:
Software Application | Version
---|---
Kubernetes | 1.31.6
Docker | 25.0.2
Prometheus | 2.39.1
Prometheus Adapter | 0.10.0
Node Exporter | 1.4.0
Calico | 3.28.2
HAProxy | 2.4.7
PostgreSQL | 14.12
Grafana | 9.2.3
CRI Tools | 1.31.1
25.1.2. Supported Operating System¶
The following is the supported operating system and kernel version for Robin CNP v5.5.1:
CentOS 7.9 (kernel version: 3.10.0-1160.71.1.el7.x86_64)
25.1.3. Upgrade Paths¶
The following are the supported upgrade paths for Robin CNP v5.5.1:
Robin CNP v5.4.3 HF6 to Robin CNP v5.5.1-1950
Robin CNP v5.5.0-1857 to Robin CNP v5.5.1-1950
25.1.3.1. Pre-upgrade considerations¶
The following are the pre-upgrade considerations:

- For a successful upgrade, you must run the possible_job_stuck.py script before and after the upgrade. Contact the Robin Support team for the upgrade procedure using the script.
- When upgrading from supported Robin CNP versions to Robin CNP v5.5.1, if your cluster already has cert-manager installed, you must uninstall it before upgrading to Robin CNP v5.5.1 (see the sketch after this list).
- Robin CNP v5.5.1 does not support the OnDelete strategy for IOMGR Pods during the upgrade process.
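How you remove cert-manager depends on how it was installed. The following is a minimal sketch, assuming cert-manager was installed with Helm into the cert-manager namespace (both are assumptions; adjust to your installation method):

# Sketch: assumes a Helm-based cert-manager install in the cert-manager namespace
helm uninstall cert-manager -n cert-manager
# Confirm nothing is left running before starting the upgrade
kubectl get pods -n cert-manager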
25.1.3.2. Post-upgrade considerations¶
The following are the post-upgrade considerations:

- After upgrading to Robin CNP v5.5.1, you must run the robin schedule update K8sResSync k8s_resource_sync 60000 command to update the K8sResSync schedule.
- After upgrading to Robin CNP v5.5.1, you must run the robin-server validate-role-bindings command. To run this command, you need to log in to the robin-master Pod. This command verifies the roles assigned to each user in the cluster and corrects them if necessary.
- After upgrading to Robin CNP v5.5.1, the k8s_auto_registration config parameter is disabled by default. The config setting is deactivated to prevent all Kubernetes apps from automatically registering and consuming resources. The following are the points you must be aware of with this change:
  - You can register Kubernetes apps manually using the robin app register command and use Robin CNP for snapshots, clones, and backup operations of the Kubernetes app.
  - As this config parameter is disabled, when you run the robin app nfs-list command, the mappings between Kubernetes apps and NFS server Pods are not listed in the command output.
  - If you need the mapping between a Kubernetes app and an NFS server Pod when the k8s_auto_registration config parameter is disabled or the Kubernetes app is not manually registered, get the PVC name from the Pod YAML (kubectl get pod <pod-name> -n <namespace> -o yaml) and run the robin nfs export list | grep <pvc-name> command, as shown after this list. The robin nfs export list command output displays the PVC name and namespace.
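For example, the lookup described in the last point can be scripted as follows; the Pod name my-pod and namespace my-namespace are placeholders:

# Placeholder Pod and namespace names; substitute your own
kubectl get pod my-pod -n my-namespace -o jsonpath='{.spec.volumes[*].persistentVolumeClaim.claimName}'
# Use the returned PVC name to locate its NFS export
robin nfs export list | grep <pvc-name>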
25.1.3.3. Pre-upgrade steps¶
Upgrading from Robin CNP v5.4.3 HF6 or Robin CNP v5.5.0-1857 to Robin CNP v5.5.1
Before upgrading from Robin CNP v5.4.3 HF6 or Robin CNP v5.5.0-1857 to Robin CNP v5.5.1, perform the following steps:
1. Update the value of the suicide_threshold config parameter to 1800:

# robin config update agent suicide_threshold 1800

2. Disable the NFS Server Monitor schedule:

# robin schedule disable "NFS Server" Monitor

3. Set the toleration seconds for all NFS server Pods to 86400 seconds. After the upgrade, you must change the toleration seconds according to the post-upgrade steps:
# for pod in `kubectl get pod -n robinio -l robin.io/instance=robin-nfs --output=jsonpath={.items..metadata.name}`; do echo "Updating $pod tolerationseconds to 86400"; kubectl patch pod $pod -n robinio --type='json' -p='[{"op": "replace", "path": "/spec/tolerations/0/tolerationSeconds", "value": 86400}, {"op": "replace", "path": "/spec/tolerations/1/tolerationSeconds", "value": 86400}]'; done
Verify and configure pod-max-pids on master nodes
Before you change the maximum Pods per node, you need to verify the pod-max-pids configuration on master nodes and change it if required.
Verify the pod-max-pids configuration in kubelet on the master nodes. Based on the number of vCPUs on the host, set pod-max-pids to at least 4096 on master nodes.
Note
If the current value exceeds 4096 based on application requirements (for example, 10000), you do not need to change it.
Always verify the current value before making changes. You can tune this setting further based on cluster observations.
To verify the current pod-max-pids value and modify it if required, complete the following steps:

1. Check the current pod-max-pids value:

# cat /etc/sysconfig/kubelet
# systemctl status kubelet -l | grep -i pod-max-pids

2. If the value is less than 4096, open the kubelet configuration file and update it, as illustrated after these steps. If the value is already 4096 or more, you do not need to change it.

# vi /etc/sysconfig/kubelet

3. Restart the kubelet service:

# systemctl restart kubelet

4. Verify the updated configuration:

# systemctl status kubelet -l | grep -i pod-max-pids
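After the edit in step 2, the relevant part of /etc/sysconfig/kubelet should carry the flag. The following fragment is illustrative only and omits the other flags your environment already sets:

# Illustrative fragment of /etc/sysconfig/kubelet; other flags omitted
KUBELET_EXTRA_ARGS="... --pod-max-pids=4096 ..."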
25.1.3.4. Post-upgrade steps¶
After upgrading from Robin CNP v5.4.3 HF6 or Robin CNP v5.5.0-1857 to Robin CNP v5.5.1
After upgrading from Robin CNP v5.4.3 HF6 or Robin CNP v5.5.0-1857 to Robin CNP v5.5.1, perform the following steps:
1. Update the value of the suicide_threshold config parameter to 40:

# robin config update agent suicide_threshold 40

2. Enable the NFS Server Monitor schedule:

# robin schedule enable "NFS Server" Monitor

3. Set the check_helm_apps config parameter to False:

# robin config update cluster check_helm_apps False

4. Set the chargeback_track_k8s_resusage config parameter to False:

# robin config update server chargeback_track_k8s_resusage False

5. Set the robin_k8s_extension config parameter to True:

# robin config update manager robin_k8s_extension True
6. Verify whether the following mutating webhooks are present:

# kubectl get mutatingwebhookconfigurations -A | grep robin
k8srobin-deployment-mutating-webhook   1   20d
k8srobin-ds-mutating-webhook           1   20d
k8srobin-pod-mutating-webhook          1   20d
k8srobin-sts-mutating-webhook          1   20d
robin-deployment-mutating-webhook      1   20d
robin-ds-mutating-webhook              1   20d
robin-pod-mutating-webhook             1   20d
robin-sts-mutating-webhook             1   20d

7. If the above k8srobin-* mutating webhooks are not present, bounce the robink8s-serverext Pods:

# kubectl delete pod -n robinio -l app=robink8s-serverext
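After the Pods are recreated, you can re-run the check from step 6 to confirm that the k8srobin-* webhooks are present before moving on:

# kubectl get mutatingwebhookconfigurations -A | grep k8srobin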
8. Verify whether the following validating webhooks are present:

# kubectl get validatingwebhookconfigurations
NAME                             WEBHOOKS   AGE
cert-manager-webhook             1          45h
controllers-validating-webhook   1          31h
ippoolcr-validating-webhook      1          31h
namespaces-validating-webhook    1          31h
pods-validating-webhook          1          31h
pvcs-validating-webhook          1          31h

9. If the robin-* mutating webhooks shown in the step 6 output or the validating webhooks shown in the step 8 output are not present on your setup, restart the robin-server-bg service:

# rbash master
# supervisorctl restart robin-server-bg
10. Set the toleration seconds for all NFS server Pods to 60 seconds for when the node is in the notready state, and to 0 seconds for when the node is in the unreachable state:

for pod in `kubectl get pod -n robinio -l robin.io/instance=robin-nfs --output=jsonpath={.items..metadata.name}`; do echo "Updating $pod tolerationseconds"; kubectl patch pod $pod -n robinio --type='json' -p='[{"op": "replace", "path": "/spec/tolerations/0/tolerationSeconds", "value": 60}, {"op": "replace", "path": "/spec/tolerations/1/tolerationSeconds", "value": 0}]'; done 2>/dev/null
25.1.4. New Features¶
25.1.4.1. Support for Clusters with AMD Processors¶
Robin CNP v5.5.1 supports clusters with AMD processors. The following improvements are available to support clusters with AMD processors:

- Support for Mellanox VFs using the mlx5_core driver for clusters with AMD processors.
- Increased number of Pods per node.
- A parameter to configure max-pods per node using the config.json file.
25.1.4.1.1. Support for Mellanox VFs using the mlx5_core driver for clusters with AMD processors¶
Starting with Robin CNP v5.5.1, support for the mlx5_core VF (Virtual Function) driver is provided as part of IP pools for clusters with Intel and AMD processors.
You can now configure IP pools with --vfdriver mlx5_core to utilize Mellanox Virtual Functions on clusters.
Robin CNP continues to support the native iavf VF driver. IP pools configured with iavf can still be used to allocate VFs from Mellanox NICs.
You can use the following command to configure the mlx5_core driver:

robin ip-pool add <pool-name> --driver sriov --vfdriver mlx5_core --subnet <subnet> --gateway <gateway> --device-ids <device-ids> --nic <nic-name> --nodes <node-names>

Example:

# robin ip-pool add mlx-1 --driver sriov --prefix 64 --vfdriver mlx5_core --range 2a00:fbc:1270:1f3b:0:0:0:1-1000 --vlan 3897
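To confirm that the pool was created with the expected VF driver, you can list the configured IP pools. This assumes the standard robin ip-pool list command; mlx-1 is the pool name from the example above:

# robin ip-pool list | grep mlx-1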
25.1.4.1.2. Change the maximum pods per node after installation¶
You can set the maximum number of Pods per node any time after installing Robin CNP. Decide the maximum number of Pods based on your requirements and resources on your cluster.
You can also set the maximum Pods per node while installing Robin CNP using the max-pods parameter in the config.json file.
Prerequisites
You must complete the following steps on all nodes of the cluster.
Before you change the maximum Pods per node, you need to verify the pod-max-pids configuration on master nodes and change it if required.
To change the maximum Pods per node after installation, complete the following steps:
Update the kubelet configuration
Edit the kubelet service configuration file:
# vi /etc/sysconfig/kubelet
Add or update the --max-pods parameter in KUBELET_EXTRA_ARGS:

# cat /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--container-runtime-endpoint=unix:///var/run/crirobin.sock --image-service-endpoint=unix:///var/run/crirobin.sock --enable-controller-attach-detach=true --cluster-dns=fd74:ca9b:3a09:868c:0252:0059:0124:800a --cluster-domain=abhinav.mantina.robin --authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt --feature-gates=RotateKubeletServerCertificate=true,MemoryManager=true --container-log-max-size=260M --read-only-port=0 --event-qps=0 --streaming-connection-idle-timeout=30m --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA --reserved-cpus=0 --cpu-manager-policy=static --topology-manager-policy=restricted --topology-manager-scope=pod --pod-max-pids=4096 --max-pods=260"
Restart the kubelet service
Restart the kubelet service:
# systemctl restart kubelet.service
Update Robin’s host information
Probe the host to rediscover the configuration changes:
# robin host probe --rediscover <hostname>

Example:

# robin host probe --rediscover r7515-01
Verify the changes
Check that the maximum pods value is updated in Robin:
# robin host list | egrep "<hostname>|Pod|----"

Example:

# robin host list | egrep "r7515-01|Pod|----"
Id | Hostname | Version | Status | RPool | Avail. Zone | LastOpr | Roles | Cores | GPUs | Mem | HDD(#/Alloc/Total) | SSD(#/Alloc/Total) | Pod Usage | Joined Time
--------------+----------+------------+--------+---------+-------------+---------+-------+-----------------+-------+------------------+--------------------+--------------------+------------+----------------------
1755834757:36 | r7515-01 | 5.5.1-1939 | Ready | default | N/A | ONLINE | S,C | 316.75/3.25/320 | 0/0/0 | 1636G/629G/2266G | -/-/- | 8/4414G/47693G | 232/28/260 | 09 Sep 2025 14:34:16
25.1.4.1.3. Change the write unit size for SSD disk drives¶
Robin CNP supports write unit sizes of 4096 bytes and 512 bytes for SSD disk drives. You must make sure that all disk drives on the cluster have the same write unit size.
Note
Before changing the write unit size, make sure your disk drive supports the write unit size that you intend to change. If you try to change to the unsupported write unit size, the operation fails.
When adding a new cluster to your environment, if the new cluster has disk drives that by default come with a write unit size different from the one on your existing clusters, you must change the write unit to match your existing Robin CNP clusters. The following steps help you change the write unit size for disk drives in a Robin CNP cluster from 4096 bytes to 512 bytes or vice versa.
Note
You must run the steps on all nodes of the cluster.
To change the write unit size, complete the following steps:
Unregister the SSD drive
List drives to find your target drive:
# robin drive list | grep <drive_id>
Example:
# robin drive list | grep 290
290 | 0xui.3656313058a046000025384300000002 | r7715-04 | default | nvme-eui.3656313058a046000025384300000002 | 5961 | N | SSD | 4624/4624 (100%) | 0/400 | Storage | ONLINE | READY | 4096
Unregister the drive:
# robin drive unregister <drive_wwn> --wait -y
Example:
# robin drive unregister 0xui.3656313058a046000025384300000002 --wait -y
Job: 71534 Name: DiskUnregister State: INITIALIZED Error: 0
Job: 71534 Name: DiskUnregister State: COMPLETED Error: 0
Verify the drive is unregistered:
# robin drive list | grep <drive_wwn>
Example:
# robin drive list | grep 0xui.3656313058a046000025384300000002
Rediscover the drive
Probe the host to rediscover all drives:
# robin host probe <hostname> --rediscover --all --wait
Example:
# robin host probe r7715-04 --rediscover --all --wait
Job: 71535 Name: HostProbe State: VALIDATED Error: 0
Job: 71535 Name: HostProbe State: COMPLETED Error: 0
Verify the drive appears with UNKNOWN status:
# robin drive list | grep <drive_wwn>
Example:
# robin drive list | grep 0xui.3656313058a046000025384300000002
- | 0xui.3656313058a046000025384300000002 | r7715-04 | default | nvme-eui.3656313058a046000025384300000002 | 5961 | N | SSD | 4624/4624 (100%) | 0/100000 | Storage | UNKNOWN | INIT | 4096
Check the current write unit:
# robin disk info <drive_wwn> | grep -i write
Example:
# robin disk info 0xui.3656313058a046000025384300000002 | grep -i write
Write Unit: 4096
Update the write unit
Change the write unit to the required size (512 or 4096 bytes):
# robin disk update --writeunit <size> <drive_wwn> --wait
Example:
# robin disk update --writeunit 512 0xui.3656313058a046000025384300000002 --wait
Job: 71539 Name: DiskModify State: INITIALIZED Error: 0
Job: 71539 Name: DiskModify State: COMPLETED Error: 0
Reassign the storage role
Add the storage role back to the node:
# robin host add-role <hostname> Storage --wait
Example:
# robin host add-role r7715-04 Storage --wait
Job: 71540 Name: HostAddRoles State: VALIDATED Error: 0
Job: 71540 Name: HostAddRoles State: COMPLETED Error: 0
Verify the drive is online with the new write unit:
# robin drive list | grep <drive_wwn>
Example:
# robin drive list | grep 0xui.3656313058a046000025384300000002
291 | 0xui.3656313058a046000025384300000002 | r7715-04 | default | nvme-eui.3656313058a046000025384300000002 | 5961 | N | SSD | 4624/4624 (100%) | 0/100000 | Storage | ONLINE | READY | 512
Confirm the write unit changed to 512:
# robin disk info <drive_wwn> | grep -i write
Example:
# robin disk info 0xui.3656313058a046000025384300000002 | grep -i write
Write Unit: 512
After completing the earlier steps:

- The drive status changes from UNKNOWN to ONLINE.
- The drive state changes from INIT to READY.
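Because all disk drives on the cluster must have the same write unit size, a quick uniformity check can help after the change. This sketch assumes the write unit is the last pipe-separated column of the robin drive list output, as in the examples above:

# Sketch: count distinct write unit values across all drives
robin drive list | awk -F'|' 'NF > 1 {gsub(/ /, "", $NF); print $NF}' | sort | uniq -c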
25.1.5. Improvements¶
25.1.5.1. Remove NUMA restrictions for KVM apps¶
Starting with Robin CNP v5.5.1, you can remove NUMA restrictions for KVM apps when creating them. This helps in deploying KVM app Pods on all worker nodes of a cluster. To remove NUMA restrictions, you must add the following annotation in the input.yaml and create KVM apps using this input.yaml:

robin.runtime.skip_cpuset_mems: "ENABLED"
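For example, the annotation might sit under the annotations section of the input.yaml. Only the annotation key and value below are prescribed by this release; the surrounding structure is a hypothetical fragment, so place it wherever your bundle's input.yaml defines annotations:

# Hypothetical input.yaml fragment; only the annotation itself is prescribed
metadata:
  annotations:
    robin.runtime.skip_cpuset_mems: "ENABLED"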
25.1.5.2. Support to create an application with static IPs and static MACs from a snapshot¶
Starting with Robin CNP v5.5.1, Robin CNP supports creating an application with static IP and static MAC addresses from a snapshot of that application.
Note
Creating an application with static IP and static MAC addresses is supported only for Robin bundle applications.
To create an application, you must specify the static IP and static MAC addresses in the following parameters:
- static-ips
- static-macs
For more information, see Create an application from a snapshot.
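As a sketch, an invocation might look like the following. The from-snapshot command shape is an assumption; only the static-ips and static-macs parameter names come from this release, and all values are placeholders:

# Sketch: command shape assumed; parameter names from this release, values are placeholders
robin app create from-snapshot <snapshot-name> <new-app-name> --static-ips 10.9.1.21,10.9.1.22 --static-macs 52:54:00:aa:bb:01,52:54:00:aa:bb:02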
25.1.5.3. Support to create an application with static IPs and static MACs from a backup¶
Starting with Robin CNP v5.5.1, Robin CNP supports creating an application with static IP and static MAC addresses from a backup of that application.
Note
Creating an application with static IP and static MAC addresses is supported only for Robin bundle applications.
To create an application, you must specify the static IP and static MAC addresses in the following parameters:
- static-ips
- static-macs
For more information, see Create an application from a backup.
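The backup case follows the same pattern. As with the snapshot sketch above, the from-backup command shape is an assumption and all values are placeholders:

# Sketch: command shape assumed; parameter names from this release, values are placeholders
robin app create from-backup <backup-id> <new-app-name> --static-ips 10.9.1.21,10.9.1.22 --static-macs 52:54:00:aa:bb:01,52:54:00:aa:bb:02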
25.1.6. Fixed Issues¶
Reference ID | Description
---|---
RSD-8083 | The IO hang issue observed on clusters with large disk sizes is fixed.
RSD-9127 | The output of Robin CLI commands (
RSD-9981 | After upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.5.0, auto-deployment of KVM apps failed on certain nodes due to insufficient CPU resources on those nodes. This issue is fixed.
RSD-9911 | Kafka Pods restart due to I/O timeouts on volumes during auto-rebalance operations. This issue is fixed.
RSD-5327 | The issue of IOMGR restarting slowly is fixed.
RSD-8854 | The issue of the IOMGR service crashing on a node when it came back online after a reboot is fixed.
RSD-8104 | A delay in creating a large volume is observed when the volume size is larger than the individual disks on the cluster. This issue is fixed.
RSD-8083 | The issue of the dev slices leader changing tasks, which delays epoch update tasks and results in IO timeouts on the application side, is fixed.
RSD-9478 | When a node makes RPC calls to an unhealthy node, these RPC calls are blocked for a long time because the TCP keepalive timeout is configured only for client-side RPC sockets. This issue is fixed.
RSD-8083 | The default CPU and memory resource limits for the Robin Patroni PostgreSQL database lead to performance issues, particularly for Storage Manager (SM) tasks in larger cluster environments. This issue is fixed.
RSD-8846 | The Robin CAT profiles feature does not work as expected on RHEL 8.10. This issue is fixed.
RSD-9127 | For a Pod with a sidecar container, Robin CNP erroneously allocates 0.5 CPU when the Pod actually needs 1.5 CPUs. Due to this issue, Pod deployments fail, indicating insufficient CPU. This issue is fixed.
RSD-9316 | When you try to deploy a large KVM app on a Robin CNP cluster, the deployment fails with the following error message: Failed to download file_object c3cc99163f225f167ae886339eb02fca, not accessible at this point. Ensure the file collection is ONLINE. Error: Connection broken: ConnectionResetError(104, ‘Connection reset by peer’). This issue is fixed.
RSD-9919 | When upgrading from the supported Robin CNP version to v5.5.0-1857, the upgrade failed due to the
PP-38268 | The
PP-39285 | In a rare circumstance, when Patroni instance reboots happen in a particular order, a lagging Patroni replica erroneously claims the Leader role. This issue is fixed.
PP-38087 | In certain cases, the snapshot size allocated to a volume could be less than what is requested. This occurs when the volume is allocated from multiple disks. This issue is fixed.
PP-34457 | When you have a Robin CNP cluster with the Metrics feature enabled, the Grafana application does not display metrics under certain conditions. This issue is fixed.
PP-38061 | In rare scenarios, when upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.5.0, the upgrade may get stuck while executing Robin upgrade actions on the primary master node because some of the hosts are not in the Ready state. This issue is fixed.
RSD-9176 | Creating an application from a bundle fails due to missing IP pool configuration details in the
RSD-9146 | Prune, Purge, and Archive schedules trigger duplicate jobs in the Robin cluster. This issue is fixed.
RSD-9088 | An error occurs when running the command, with the following message: ERROR - local variable ‘total_cnt’ referenced before assignment. This issue is fixed.
RSD-9289 | The Robin CLI commands (
RSD-9208 | The issue where creating a Robin Bundle app failed with the following error after upgrading to Robin CNP v5.5.0 is fixed: IndexError: list index out of range
RSD-9204 | The issue where creating a KVM app failed with the following error after upgrading to Robin CNP v5.5.0 is fixed: NameError: name ‘net_attr’ is not defined
RSD-9202 | The issue of
RSD-9222 | The issue where the
RSD-9075 | The issue in Robin CNP v5.5.0 where Pod deployments that depend on VLANs configured as ‘ALL’ on the host network interfaces fail with the Pending status is fixed.
RSD-9642 | The robin-nfs bundle application experiences continuous restarts when Autopilot is enabled. This issue is fixed.
RSD-9455 | In Robin version 5.5.0, the Robin CLI command output (
RSD-9346 | When upgrading to Robin CNP v5.5.0, the issue of intermittent failure of the
RSD-9290 | The issue of the Robin Master Pod (robinrcm) restarting due to exceeding the
RSD-9273 | The issue where restarting a Robin Bundle application failed with the following error after upgrading to Robin CNP v5.5.0 is fixed: ‘int object’ has no attribute ‘split’
25.1.7. Known Issues¶
The following are the known issues in Robin CNP v5.5.1:

PP-39656
Symptom: When you deploy applications with
Workaround: Restart the Pod to place it on the node with a soft label.

PP-39645
Symptom: Robin CNP v5.5.1 may rarely fail to honor soft Pod anti-affinity, resulting in uneven Pod distribution on labeled nodes. This can occur when you deploy an application with the recommended preferred
Workaround: Bounce the Pod that has not honored soft affinity.

PP-39632
Symptom: After upgrading to Robin CNP v5.5.1, the NFS client might hang with a no pending IO message. For no pending IO, refer to this path:
Also, you can find the following message in the
Workaround:

PP-39429
Symptom: When you try to create an application from a snapshot or backup, the configuration of static IP addresses and static MAC addresses is supported only for Robin Bundle applications; it is not supported for Helm-based applications.

PP-38044
Symptom: When attempting to detach a repository from a hydrated Helm application, the operation might fail with the following error: Can’t detach repo as the application is in IMPORTED state, hydrate it in order to detach the repo from it. This issue occurs even if the application has already been hydrated. The system incorrectly marks the application in the
Workaround: To detach the repository, manually rehydrate the application and then retry the detach operation:

PP-37652
Symptom: When you deploy a multi-container application using Helm with static IPs assigned from an IP pool, only a subset of the Pods appear on the Robin CNP UI.
Workaround: Run the following CLI command to view all the Pods:

# robin app info <appname> --status

PP-37416
Symptom: In rare scenarios, when upgrading from Robin CNP v5.4.3 HF6 to Robin CNP v5.5.1, the upgrade might fail with the following error during the Kubernetes upgrade process on other master nodes: Failed to execute kubeadm upgrade command for K8S upgrade. Please make sure you have the correct version of kubeadm rpm binary installed
Steps to identify the issue:
Workaround: If you notice the above error, restart the kubelet:

# systemctl restart kubelet

PP-35015
Symptom: After renewing the expired Robin license successfully, Robin CNP incorrectly displays the License Violation error when you try to add a new user to the cluster.
Workaround: Restart the robin-server-bg service:

# rbash master
# supervisorctl restart robin-server-bg

PP-34492
Symptom: When you run the
Workaround:
The host should now transition to the

PP-34414
Symptom: In rare scenarios, the
To confirm the above issues, complete the following steps:
Workaround: If the device is not in use, restart the iomgr service:

# supervisorctl restart iomgr

PP-34226
Symptom: When a PersistentVolumeClaim (PVC) is created, the CSI provisioner initiates a VolumeCreate job. If this job fails, the CSI provisioner calls a new VolumeCreate job for the same PVC. However, if the PVC is deleted during this process, the CSI provisioner continues to call the VolumeCreate job because it does not verify the existence of the PVC before calling the job.
Workaround: Bounce the CSI provisioner Pod:

# kubectl delete pod <csi-provisioner-robin> -n robinio

PP-38251
Symptom: When evacuating a disk from an offline node, the operation might fail with the following error: Json deserialize error: invalid value: integer -‘10’, expected u64 at line 1 column 2440.
Workaround: If you notice the above issue, contact the Robin CS team.

PP-37965
Symptom: In Robin CNP v5.5.1, when you scale up a Robin Bundle app, Robin CNP does not consider the existing CPU cores and memory already in use by a vnode. As a result, Robin CNP cannot find a suitable host, even though additional resources are available.
Workaround: If you notice this issue, apply the following workaround:

PP-39619
Symptom: After creating an app from a backup, the app is stuck, and deleting it fails with the following error: App <app-name> couldn’t be deleted. Please detach app from repos before deleting.
Workaround: If you notice the above issue, contact the Robin CS team.

PP-36865
Symptom: After rebooting a node, the node might not come back online for a long time, and the host BMC console displays the following message for RWX PVCs mounted on that node: Remounting nfs rwx pvc timed out, issuing SIGKILL
Workaround: Power cycle the host machine.

PP-39806
Symptom: When a node hosting KVM applications is shut down for technical reasons, you might get the following error message during the migration of these Pods to another node: Target /usr/local/robin/instances/kvm/clone-kvm-ovs2-server.svc.cluster.local is busy, please retry later.
Workaround: Run the following command to restart the Robin instance after five minutes:

# robin instance restart <instance-name>

PP-38471
Symptom: When StatefulSet Pods restart, the Pods might get stuck.
Workaround: If you notice this issue, restart the csi-nodeplugin Pod:

# kubectl delete pod <csi-nodeplugin> -n robinio

PP-38039
Symptom: During node reboot or power reset scenarios, application volumes may be force shut down due to I/O errors. As a result, application Pods might get stuck with the error Context Deadline Exceeded. On the affected node where the volume is mounted or the application Pod is scheduled, the following error might be observed: Log I/O Error Detected. Shutting down filesystem
Workaround: If you notice this issue, contact the Robin Customer Support team for assistance.

PP-37330
Symptom: During or after upgrading from the supported versions to Robin CNP v5.5.1, or following node reboots and failover events, applications relying on ReadWriteMany (RWX) NFS volumes may experience critical failures. These failures might manifest as the following:
The underlying cause for these symptoms could be duplicate filesystem UUIDs. You might observe one of the following error messages:
Example: /bin/mount /dev/sdn /var/lib/robin/nfs/robin-nfs-shared-35/ganesha/pvc-822e76f0-9bb8-4629-8aae-8318fb2d3b41 -o discard failed with return code 32: mount: /var/lib/robin/nfs/robin-nfs-shared-35/ganesha/pvc-822e76f0-9bb8-4629-8aae-8318fb2d3b41: wrong fs type, bad option, bad superblock on /dev/sdn, missing codepage or helper program, or other error.
Workaround: If you notice this issue, contact the Robin Customer Support team for assistance.

PP-38078
Symptom: After a network partition, the
As a result, stale devices may not be cleaned up, potentially leading to resource contention and other issues.
Workaround: Manually restart the robin-agent and iomgr-server services:

# supervisorctl restart robin-agent iomgr-server

PP-39842
Symptom: In Robin CNP v5.5.1, the
You can identify this issue by comparing the output of the following commands:

# kubectl describe node
# robin host list
# robin k8s-collect info

The resource usage reported by

PP-39901
Symptom: After rebooting a worker node that is hosting Pods with Robin RWX volumes, one or more application Pods using these volumes might get stuck in the ContainerCreating state indefinitely.
Workaround: If you notice the above issue, contact the Robin CS team.

PP-38924
Symptom: After you delete multiple Helm applications, one of the Pods might get stuck.
Workaround: Restart Docker and kubelet on the node where the Pod is stuck.

PP-39936
Symptom: When relocating a Pod to another node using the
Workaround: Check affinity rules and violations manually when using the

PP-39467
Symptom: When deploying applications with ReadWriteMany (RWX) PersistentVolumeClaims (PVCs), application Pods fail to mount volumes and get stuck.
Workaround: Reboot the affected host.
25.1.8. Technical Support¶
Contact the Robin Technical Support team for any assistance.