26. Release Notes¶
26.1. Robin Cloud Native Platform v5.7.0¶
The Robin Cloud Native Platform (CNP) v5.7.0 release notes include pre- and post-upgrade steps, a new feature, improvements, fixed issues, and known issues.
Release Date: January 30, 2026
26.1.1. Infrastructure Versions¶
The following software applications are included in this CNP release:
| Software Application | Version |
|---|---|
| Kubernetes | 1.33.5 |
| Docker | 25.0.2 (RHEL 8.10 or Rocky Linux 8.10) |
| Podman | 5.4.0 (RHEL 9.6) |
| Prometheus | 2.39.1 |
| Prometheus Adapter | 0.10.0 |
| Node Exporter | 1.4.0 |
| Calico | 3.28.2 |
| HAProxy | 2.4.7 |
| PostgreSQL | 14.12 |
| Grafana | 9.2.3 |
| CRI Tools | 1.33.0 |
| cert-manager | 1.19.1 |
26.1.2. Supported Operating Systems¶
The following are the supported operating systems and kernel versions for Robin CNP v5.7.0:
| OS Version | Kernel Version |
|---|---|
| Red Hat Enterprise Linux 8.10 | 4.18.0-553.el8_10.x86_64 |
| Rocky Linux 8.10 | 4.18.0-553.el8_10.x86_64 |
| Red Hat Enterprise Linux 9.6 | 5.14.0-570.24.1.el9_6.x86_64+rt |
Note
Robin CNP supports both RT and non-RT kernels on the above supported operating systems.
26.1.3. Upgrade Paths¶
The following are the supported upgrade paths for Robin CNP v5.7.0:
- Robin CNP v5.4.3 HF5+PP to Robin CNP v5.7.0-296
- Robin CNP v5.4.3 HF6 to Robin CNP v5.7.0-296
- Robin CNP v5.4.3 HF7 to Robin CNP v5.7.0-296
- Robin CNP v5.5.1 to Robin CNP v5.7.0-296
26.1.3.1. Pre-upgrade considerations¶
- For a successful upgrade, you must run the `possible_job_stuck.py` script before and after the upgrade. Contact the Robin Support team for the upgrade procedure using the script.
- When upgrading from supported Robin CNP versions to Robin CNP v5.7.0, if your cluster already has cert-manager installed, you must uninstall it before upgrading to Robin CNP v5.7.0.
- Before upgrading to Robin CNP v5.7.0, you must stop the `robin-certs-check` Job or CronJob. To stop the `robin-certs-check` Job, run the `kubectl delete job robin-certs-check -n robinio` command; to stop the `robin-certs-check` CronJob, run the `robin cert check --stop-cronjob` command.
26.1.3.2. Post-upgrade considerations¶
- After upgrading to Robin CNP v5.7.0, verify that the value of the `k8s_resource_sync` config parameter is set to `60000` using the `robin schedule list | grep -i K8sResSync` command. If it is not set, you must run the `robin schedule update K8sResSync k8s_resource_sync 60000` command to update the value.
- After upgrading to Robin CNP v5.7.0, you must run the `robin-server validate-role-bindings` command. To run this command, you need to log in to the `robin-master` Pod. This command verifies the roles assigned to each user in the cluster and corrects them if necessary.
- After upgrading to Robin CNP v5.7.0, the `k8s_auto_registration` config parameter is disabled by default. The setting is deactivated to prevent all Kubernetes apps from automatically registering and consuming resources. Be aware of the following points with this change:
  - You can manually register Kubernetes apps using the `robin app register` command and use Robin CNP for snapshot, clone, and backup operations on the Kubernetes app.
  - As this config parameter is disabled, when you run the `robin app nfs-list` command, the mappings between Kubernetes apps and NFS server Pods are not listed in the command output.
  - If you need the mapping between a Kubernetes app and an NFS server Pod while the `k8s_auto_registration` parameter is disabled or the Kubernetes app is not manually registered, get the PVC name from the Pod YAML file (`kubectl get pod -n <name> -o yaml`) and run the `robin nfs export list | grep <pvc name>` command. The `robin nfs export list` command output displays the PVC name and namespace.
- After upgrading to Robin CNP v5.7.0, you must start the `robin-certs-check` CronJob using the `robin cert check --start-cronjob` command if it was stopped before the upgrade.
26.1.3.3. Pre-upgrade steps¶
Upgrading from Robin CNP v5.4.3 to Robin CNP v5.7.0-296
Before upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.7.0, perform the following steps:
1. Update the value of the `suicide_threshold` config parameter to `1800`:

   ```
   # robin config update agent suicide_threshold 1800
   ```

2. Verify the NFS monitor is enabled. It must be `True`:

   ```
   # robin schedule list | grep -i NFS
   ```

3. Set the toleration seconds for all NFS server Pods to `86400` seconds. After the upgrade, you must change the toleration seconds according to the post-upgrade steps.

   ```
   for pod in `kubectl get pod -n robinio -l robin.io/instance=robin-nfs --output=jsonpath={.items..metadata.name}`; do echo "Updating $pod tolerationSeconds to 86400"; kubectl patch pod $pod -n robinio --type='json' -p='[{"op": "replace", "path": "/spec/tolerations/0/tolerationSeconds", "value": 86400}, {"op": "replace", "path": "/spec/tolerations/1/tolerationSeconds", "value": 86400}]'; done
   ```

4. Verify the webhooks are enabled:

   ```
   # robin config list | grep -i robin_k8s_extension
   ```

   It must be `True`. If it is disabled, enable it:

   ```
   # robin config update manager robin_k8s_extension True
   ```

5. Check that the mutating webhooks are present on the cluster:

   ```
   # kubectl get mutatingwebhookconfiguration -A
   ```

6. Enable `robin_k8s_extension`. Note: You must create the `robin-schedulextender-policy-template.yaml` and `preentry-<version>.sh` files in the `/usr/local/robin` directory on all three Kubernetes control-plane hosts.

   Create a `robin-schedulextender-policy-template.yaml` file and add the following:

   ```yaml
   apiVersion: kubescheduler.config.k8s.io/v1
   kind: KubeSchedulerConfiguration
   clientConnection:
     kubeconfig: /etc/kubernetes/scheduler.conf
   extenders:
   - urlPrefix: "https://{{hostname}}:{{port}}/{{urlsuffix}}"
     filterVerb: predicates
     enableHTTPS: true
     nodeCacheCapable: false
     ignorable: {{ignorable}}
     httpTimeout: {{httptimeout}}
     tlsConfig:
       insecure: true
     managedResources:
     - name: robin.io/robin-required
       ignoredByScheduler: true
   ```

   Create a `preentry-<version>.sh` file and add the following:

   ```
   # cp /usr/local/robin/robin-schedulextender-policy-template.yaml /opt/robin/current/etc/robin/k8s/robin-schedulextender-policy-template.yaml
   # cp /usr/local/robin/robin-schedulextender-policy-template.yaml /etc/robin/k8s/robin-schedulextender-policy-template.yaml
   ```
Upgrading from Robin CNP v5.5.1 to Robin CNP v5.7.0-296
Before upgrading from Robin CNP v5.5.1 to Robin CNP v5.7.0, perform the following steps:
1. Verify the NFS monitor is enabled using the `robin schedule list | grep -i NFS` command. It must be `True`.

2. Update the value of the `suicide_threshold` config parameter to `1800`:

   ```
   # robin config update agent suicide_threshold 1800
   ```

3. Set the toleration seconds for all NFS server Pods to `86400` seconds. After the upgrade, you must change the toleration seconds according to the post-upgrade steps.

   ```
   for pod in `kubectl get pod -n robinio -l robin.io/instance=robin-nfs --output=jsonpath={.items..metadata.name}`; do echo "Updating $pod tolerationSeconds to 86400"; kubectl patch pod $pod -n robinio --type='json' -p='[{"op": "replace", "path": "/spec/tolerations/1/tolerationSeconds", "value": 86400}, {"op": "replace", "path": "/spec/tolerations/2/tolerationSeconds", "value": 86400}]'; done
   ```
26.1.3.4. Post-upgrade steps¶
After upgrading from Robin CNP v5.4.3 or Robin CNP v5.5.1 to Robin CNP v5.7.0-296
After upgrading from Robin CNP v5.4.3 or Robin CNP v5.5.1 to Robin CNP v5.7.0, perform the following steps:

1. Update the value of the `suicide_threshold` config parameter to `40`:

   ```
   # robin config update agent suicide_threshold 40
   ```

2. Set the `check_helm_apps` config parameter to `False`:

   ```
   # robin config update cluster check_helm_apps False
   ```

3. Set the `chargeback_track_k8s_resusage` config parameter to `False`:

   ```
   # robin config update server chargeback_track_k8s_resusage False
   ```

4. Verify that the `robin_k8s_extension` config parameter is set to `True`. If not, set it to `True`:

   ```
   # robin config update manager robin_k8s_extension True
   ```

5. Verify whether the following mutating webhooks are present:

   ```
   # kubectl get mutatingwebhookconfigurations -A | grep robin
   k8srobin-deployment-mutating-webhook   1   20d
   k8srobin-ds-mutating-webhook           1   20d
   k8srobin-pod-mutating-webhook          1   20d
   k8srobin-sts-mutating-webhook          1   20d
   robin-deployment-mutating-webhook      1   20d
   robin-ds-mutating-webhook              1   20d
   robin-pod-mutating-webhook             1   20d
   robin-sts-mutating-webhook             1   20d
   ```

6. If the above `k8srobin-*` mutating webhooks are not present, bounce the `robink8s-serverext` Pods:

   ```
   # kubectl delete pod -n robinio -l app=robink8s-serverext
   ```

7. Verify whether the following validating webhooks are present:

   ```
   # kubectl get validatingwebhookconfigurations
   NAME                             WEBHOOKS   AGE
   cert-manager-webhook             1          45h
   controllers-validating-webhook   1          31h
   ippoolcr-validating-webhook      1          31h
   namespaces-validating-webhook    1          31h
   pods-validating-webhook          1          31h
   pvcs-validating-webhook          1          31h
   ```

8. If the `robin-*` mutating webhooks from the step 5 output or the validating webhooks from the step 7 output are not present on your setup, restart the `robin-server-bg` service:

   ```
   # rbash master
   # supervisorctl restart robin-server-bg
   ```

9. Set the toleration seconds for all NFS server Pods to `60` seconds when the node is in the `notready` state and to `0` seconds when the node is in the `unreachable` state:

   ```
   for pod in `kubectl get pod -n robinio -l robin.io/instance=robin-nfs --output=jsonpath={.items..metadata.name}`; do echo "Updating $pod tolerationSeconds"; kubectl patch pod $pod -n robinio --type='json' -p='[{"op": "replace", "path": "/spec/tolerations/0/tolerationSeconds", "value": 60}, {"op": "replace", "path": "/spec/tolerations/1/tolerationSeconds", "value": 0}]'; done 2>/dev/null
   ```
26.1.4. New Features¶
26.1.4.1. Ephemeral storage limits with LimitRange¶
Starting with Robin CNP v5.7.0, you can use the Kubernetes LimitRange object to manage ephemeral storage limits for containers in pods. This prevents the host’s root filesystem from filling up when containers write excessive data.
This feature is disabled by default and applies at the cluster level.
To configure these limits, modify the parameters in the limitrange section of the Robin configuration. Run the following command to enable the LimitRange object:
# robin config update limitrange enabled True
Note
This feature is supported only on hosts running Red Hat Enterprise Linux 9.6.
For more information, see Limit Ranges for Ephemeral Storage.
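For illustration, a standard Kubernetes LimitRange that constrains container ephemeral storage is sketched below. The object shape is standard Kubernetes; the name, namespace, and values are example assumptions, and the limits Robin CNP actually enforces are governed by the limitrange config section described above.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: ephemeral-storage-limits   # example name
  namespace: demo                  # example namespace
spec:
  limits:
  - type: Container
    defaultRequest:
      ephemeral-storage: "1Gi"     # request applied when a container specifies none
    default:
      ephemeral-storage: "2Gi"     # limit applied when a container specifies none
    max:
      ephemeral-storage: "4Gi"     # hard per-container ceiling
```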
26.1.5. Improvements¶
26.1.5.1. Support for cert-manager v1.19.1¶
Starting with Robin CNP v5.7.0, Robin CNP supports cert-manager v1.19.1. When you install or upgrade to Robin CNP v5.7.0 from any supported version, the latest cert-manager v1.19.1 is installed.
26.1.5.2. Support for Red Hat Enterprise Linux 9.6¶
Starting with Robin CNP v5.7.0, Robin CNP supports Red Hat Enterprise Linux (RHEL) 9.6 OS and kernel version 5.14.0-570.24.1.el9_6.x86_64+rt.
26.1.5.3. Enhanced NIC tagging for SR-IOV IP Pools¶
The Robin CNP v5.7.0 release provides enhanced NIC-level tagging for SR-IOV Physical Functions (PFs), giving you more granular control over Virtual Function (VF) placement.
You can now label each PF with custom key-value pairs called NIC tags, such as nictype=active or location=tor-a. By adding these tags to your IP pools, Robin CNP ensures that VFs for an application are allocated only from specific, tagged interfaces. This enhancement enables the deterministic grouping of PFs, leading to predictable and consistent VF placement within pods.
Previously, NICs could only be identified in IP pools by their name or pci_addr, which limited the ability to group multiple SR-IOV interfaces into flexible pools. The new NIC tagging feature solves this issue by allowing you to define custom NIC tags, assign them to different interfaces, and create IP pools that allocate VFs from one or more tagged groups. For example, you can now create a bonded interface pool that draws VFs from interfaces tagged as nictype=active and nictype=standby, giving you precise control over your network topology and resource allocation.
For more information, see NIC Allocation for SR-IOV interfaces.
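The sketch below illustrates the flow only: the tag keys and values (nictype, location) come from the examples above, but the command and flag names are hypothetical placeholders, not confirmed Robin CLI syntax. See the linked NIC Allocation documentation for the actual commands.

```bash
# HYPOTHETICAL sketch; command and flag names are placeholders.
# Step 1: label each PF with custom NIC tags (key-value pairs).
robin host add-nic-tags node-1 ens2f0 nictype=active,location=tor-a
robin host add-nic-tags node-1 ens2f1 nictype=standby,location=tor-b

# Step 2: create an IP pool that allocates VFs only from PFs tagged nictype=active.
robin ip-pool add pool-active --nictags nictype=active
```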
26.1.5.4. VF allocation based on Bandwidth¶
Robin CNP v5.7.0 supports Virtual Function (VF) allocation based on the bandwidth of an SR-IOV NIC, allowing for optimized resource utilization and granular control over network resource allocation.
The following VF allocation policies are added in Robin CNP:
bandwidth-spread: Utilizes VFs across multiple interfaces to spread bandwidth usage evenly. This ensures better utilization of network interface bandwidth and prevents a single interface from becoming a bottleneck.
bandwidth-pack: Packs and utilizes VFs onto fewer interfaces to maximize the utilization of SR-IOV NICs’ bandwidth.
You can now create an IP pool with specific bandwidth weights (in Gbps) using the --bandwidth option. When a VF is being allocated, Robin CNP uses these weights to enforce capacity limits and determine placement based on your selected policy.
Previously, VFs were allocated based on VF counts. This led to over-allocation of a NIC, where the aggregate bandwidth requirements of VFs exceeded the physical capacity of the NIC, resulting in unpredictable latency and performance throttling. Bandwidth-based allocation solves this issue by validating the capacity of a NIC before allocation. Robin CNP now ensures that allocated VFs never exceed the physical capacity of a NIC.
For more information, see NIC Allocation policies.
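A usage sketch follows: the `--bandwidth` option and the `bandwidth-spread` policy name are taken from this release note, while the remaining flag names are hypothetical placeholders, so confirm the exact syntax in the NIC Allocation policies documentation.

```bash
# Sketch: create an SR-IOV IP pool with a 25 Gbps bandwidth weight and a
# spread policy. Only --bandwidth and the policy names are documented above;
# the --vf-policy flag shown here is a hypothetical placeholder.
robin ip-pool add sriov-25g --bandwidth 25 --vf-policy bandwidth-spread
```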
26.1.5.5. Create application with static IPs and static MACs from snapshot¶
Starting with Robin CNP v5.7.0, you can create an application with static IP and static MAC addresses from its snapshot.
Note
Creating an application with static IP and static MAC addresses is supported only for Robin bundle applications.
To create an application, you must specify the static IP and static MAC addresses in the following parameters:
- `static-ips`
- `static-macs`
For more information, see Create an application from a snapshot.
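A sketch of the CLI invocation follows. The `robin app create from-snapshot ... --rpool ... --wait` form appears elsewhere in these notes and the `static-ips`/`static-macs` parameter names are documented above, but the comma-separated value format shown here is an assumption.

```bash
# Sketch: clone an app from a snapshot with fixed addressing.
# The comma-separated value format for the two flags is an assumption.
robin app create from-snapshot myapp-clone <snapshot_id> \
    --static-ips 10.9.1.21,10.9.1.22 \
    --static-macs 52:54:00:12:34:01,52:54:00:12:34:02 \
    --rpool default --wait
```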
26.1.5.6. Create application with static IPs and static MACs from backup¶
Starting with Robin CNP v5.7.0, you can create an application with static IP and static MAC addresses from its backup.
Note
Creating an application with static IP and static MAC addresses is supported only for Robin bundle applications.
To create an application, you must specify the static IP and static MAC addresses in the following parameters:
- `static-ips`
- `static-macs`
For more information, see Create an application from a backup.
26.1.5.7. Robin CNP installation qualification on 2 P-Cores with Best-effort-QoS¶
Tests validated CNP installation with best-effort QoS enabled on a single-node cluster with one storage disk and 40 P-cores. The test environment allocated 2 P-cores for CNP, the OS, and Kubernetes, and ran approximately 70 application Pods to simulate a heavy workload.
The testing focused on platform behavior and CPU allocation. No platform-side issues regarding CPU allocation or stability occurred.
26.1.5.8. Support for the Docker overlay2 driver¶
Robin CNP v5.7.0 supports the Docker overlay2 storage driver (docker_storage_driver: overlay2). The storage driver is the primary mechanism for managing how Docker images and containers are stored and accessed on a host's filesystem. The overlay2 driver prevents installation failures caused by slow Docker image load times. You can configure the Docker storage driver per node using the config.json file while installing CNP.
Note
By default, the Robin CNP v5.7.0 installation process uses the robin-graph driver. However, you can use the overlay2 driver, which offers faster load times and can prevent timeouts during the installation process.
This configuration applies at the node level. You must apply the storage driver option ("docker_storage_driver": "overlay2") to every node in your cluster using the configuration JSON file (config.json).
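A minimal sketch of the per-node setting follows; only the `docker_storage_driver` key is documented above, while the surrounding structure (the node list and host fields) is illustrative and must be merged into your actual install config.json.

```json
{
  "_note": "Illustrative sketch: only docker_storage_driver is the documented key.",
  "nodes": [
    { "host": "node-1.example.com", "docker_storage_driver": "overlay2" },
    { "host": "node-2.example.com", "docker_storage_driver": "overlay2" },
    { "host": "node-3.example.com", "docker_storage_driver": "overlay2" }
  ]
}
```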
26.1.5.9. Accelerator abstraction¶
Robin CNP v5.7.0 supports accelerator abstraction, a feature that simplifies the allocation of hardware accelerator resources (such as FPGA) across mixed clusters. This feature allows you to configure PCI resources using abstract names, decoupling Pod specifications from specific hardware identifiers like device IDs or vendor IDs.
With this update, you can apply a single, common annotation to Pod specifications and Helm charts. This enables seamless deployments on clusters that contain different types of accelerator cards, such as those found on existing servers or new GNR-D servers. For more information, see Configure abstract accelerator resources.
26.1.5.10. Schedule Pods of same application on a node using Pod-level affinity and anti-affinity¶
Robin CNP v5.7.0 enables you to schedule application Pods of the same type on a single node to maximize resource utilization. This approach helps when you have a limited number of nodes and prefer to separate specific workloads, such as ensuring different application types run on different nodes.
To configure this behavior, define soft affinity and anti-affinity rules in the podAffinity and podAntiAffinity fields.
Use the preferredDuringSchedulingIgnoredDuringExecution rule to establish these preferences.
This rule attempts to schedule Pods according to your criteria but permits placement on other nodes if resources are limited. For more information, see Schedule Pods of same application on a node using Pod-level affinity and anti-affinity.
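As an illustration, the standard Kubernetes manifest shape for such soft rules is sketched below. The Deployment name, labels, and image are example assumptions; the podAffinity, podAntiAffinity, and preferredDuringSchedulingIgnoredDuringExecution fields are the ones named above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-a                       # example workload
spec:
  replicas: 3
  selector:
    matchLabels:
      app: app-a
  template:
    metadata:
      labels:
        app: app-a
    spec:
      affinity:
        podAffinity:                # prefer packing app-a Pods onto one node
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: app-a
              topologyKey: kubernetes.io/hostname
        podAntiAffinity:            # prefer keeping app-b Pods on other nodes
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: app-b
              topologyKey: kubernetes.io/hostname
      containers:
      - name: app
        image: registry.example.com/app-a:1.0   # example image
```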
26.1.5.11. CRI-O support¶
Robin CNP v5.7.0 supports CRI-O as a high-level container runtime with RHEL 9.6 OS only. CRI-O is a lightweight container runtime for Kubernetes. It is an optimized implementation of the Kubernetes Container Runtime Interface (CRI) to run pods by using any OCI (Open Container Initiative) compliant runtime. It is open-source and an alternative to Docker for Kubernetes clusters.
For more information, see CRI-O.
26.1.5.12. crun support¶
Robin CNP v5.7.0 supports crun as a low-level container runtime with RHEL 9.6 OS only. crun is a fast, lightweight, low-memory container runtime used to execute containers within a Pod, and it is the default container runtime in RHEL 9.6 OS. It is fully compliant with the Open Container Initiative (OCI) specifications.
26.1.5.13. cgroup V2 support¶
Robin CNP v5.7.0 supports cgroup v2 of the Linux kernel with RHEL 9.6 OS only. cgroup v2 is the latest, more efficient version of the Linux kernel’s control groups feature, which is the underlying mechanism for managing and enforcing resource limits (CPU, memory, I/O) on containers and Pods. It is the default cgroup version in the RHEL 9.6 OS.
For more information, see cgroup v2.
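To confirm that a host is running cgroup v2, you can use the standard Linux check below (not a Robin-specific command); it reports the filesystem type mounted at /sys/fs/cgroup.

```bash
# Prints "cgroup2fs" on a cgroup v2 host; legacy cgroup v1 hosts print "tmpfs".
stat -fc %T /sys/fs/cgroup/
```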
26.1.6. Fixed Issues¶
| Reference ID | Description |
|---|---|
| RSD-8083 | Dev slice leader change tasks delayed epoch update tasks, resulting in IO timeouts on the application side. This issue is fixed. |
| RSD-8104 | A delay was observed when creating a large volume whose size exceeds that of the individual disks on the cluster. This issue is fixed. |
| RSD-7814 | The issue of the application creation operation failing with the following error is now fixed: `Failed to mount volume <volume-name>: Node <node-name> has mount_blocked STORMGR_NODE_BLOCK_MOUNT. No new mounts are allowed.` |
| RSD-7499 | The issue of storage size mismatch between Robin storage and Kubernetes storage, which caused calculation and quota errors when users requested sizes in GB, is fixed. |
| RSD-10741 | The issue of the |
| RSD-6763 | The issue of Helm binary version mismatch between the host and the downloaded Helm client or the robin-master Pod is fixed. |
| RSD-7183 | A device may run out of space, and you might observe disk usage alerts or out-of-space errors when an application is writing data, resulting in failed writes. You might also observe that the physical size of a volume is greater than the logical size when you run the command. This issue could be because the garbage collector (GC) failed to reclaim space. This issue is fixed. |
| RSD-6162 | The issue of the snapshot-controller Pod stuck in the |
| RSD-5771 | IPv6 IP pool creation failed when the gateway was the same as the broadcast address of the IP pool subnet. This issue is fixed. |
| RSD-8150 | The race condition issue where the device evacuation operation would fail if a replica was assigned to a drive marked as not ready is fixed. |
| RSD-4634 | When Robin CNP is running on SuperMicro nodes, the IPMI tool incorrectly displays the BMC IPv6 address as follows: |
| PP-37360 | The issue of the Robin UI displaying a maximum of 1000 users when adding users from an LDAP server is fixed. |
| RSD-8690 | The issue of a newly added node not appearing in the |
| PP-39842 | The |
| PP-39806 | When a node hosting KVM applications is shut down due to technical reasons, you might get the following error message during the migration of these Pods to another node: `Target /usr/local/robin/instances/kvm/clone-kvm-ovs2-server.svc.cluster.local is busy, please retry later.` This issue is fixed. |
| RSD-10622 | Istio installation failed with the following error when calling the k8srobin.kubernetes.service.robin webhook: |
| RSD-10265 | The kubelet service failed to reload client certificates after automatic rotation. This caused nodes to report a degraded status with |
| RSD-10366 | Expanding a ReadWriteMany (RWX) Persistent Volume Claim (PVC) failed with the following error: |
| RSD-10471 | When affinity rules defined in a bundle |
| RSD-10754 | Nodes remain in a |
| RSD-10243 | Application auto-redeployment failed because the scheduler did not correctly validate the SR-IOV configuration on the target node. This issue resulted in Pods being scheduled on nodes unable to satisfy the resource requirements, causing the application to remain in a |
| RSD-10105 | Bundle application deployment failed when multiple IP addresses resolved to the same hostname during reverse DNS lookup. This issue occurred because the system prohibited different IPs from mapping to the same hostname. This issue is fixed. You can now prevent this conflict by setting the |
| RSD-9892 | The issue of VMs provisioned using an ISO image on Robin CNP KVM environments entering a reboot is fixed. |
| RSD-9843 | The cluster license status became FORCEFULLY EXPIRED immediately after installation. This occurred because the system incorrectly compared the current UTC time against an installation timestamp recorded in the local timezone, leading to a false detection of a system clock rollback. This issue is fixed. |
| RSD-9767 | Deployment of Helm-based applications took longer than expected due to Pod scheduling delays. This issue is fixed. As part of the fix, the following new cluster-wide configurable attribute is provided: |
| RSD-9791 | The |
| PP-39285 | In rare circumstances, when Patroni instances rebooted in a particular order, a lagging Patroni replica could erroneously claim the Leader role. This issue is fixed. |
| RSD-9323 | When you try to restore an application from a backup that previously had a static IP address, the restore process fails to honor the static IP configuration and reports the following error: `Non static IP allocations cannot be done from non-range(network) IP pools -> 'nc-bss-ov-internal-mgmt-int-v6'.` This issue is fixed. |
| RSD-9042 | The issue where the GoRobin installer did not verify that SELinux was disabled on target hosts is fixed. The installation pre-check now validates the SELinux status and fails if it detects that SELinux is enabled. |
| RSD-8630 | Application creation from a bundle failed when a namespace resource quota was active. This issue occurred because the hook jobs generated during deployment did not define CPU and memory limits, violating the namespace's quota requirements. This issue is fixed. |
| RSD-9143, RSD-8589 | Relocating a master Pod caused nodes to become UNREACHABLE, triggering unintended host failovers and workload evacuations. This issue occurred because the agent incorrectly detected a communication failure during the master Pod relocation. This issue is fixed. The system now verifies the health of the Patroni service before initiating an agent restart, preventing false positives and unnecessary service disruptions. |
| PP-39619 | After creating an app from a backup, the app is stuck, and deleting it fails with the following error: `App <app-name> couldn't be deleted. Please detach app from repos before deleting.` This issue is fixed. |
| PP-37652 | When you deploy a multi-container application using Helm with static IPs assigned from an IP pool, only a subset of the Pods appear on the Robin CNP UI. This issue is fixed. |
| PP-34457 | When you have a Robin CNP cluster with the Metrics feature enabled, the Grafana application does not display metrics under certain conditions. This issue is fixed. |
| RSD-10972 | The issue of different VF driver names for the |
| RSD-10971 | The issue of the |
| RSD-10973 | The issue of Pods failing to create in Robin CNP v5.7.0 with the following error when specifying tolerations in the Pod YAML is fixed: `Json deserialize error: missing field key at line 1 column 237` |
| RSD-10990 | The issue of image upgrades for Robin bundle applications failing on RHEL 9 hosts with the following error is fixed: |
| RSD-4584 | If you added a range of blacklisted IPs in an unexpanded form, Robin CNP did not allow you to remove the range of blacklisted IPs from the IP pool. This issue is fixed. |
| RSD-3885 | The |
| RSD-4065 | When creating a superadmin user with |
| RSD-3447 | The issue of |
| RSD-10063 | The issue of the |
| RSD-9738, RSD-9753 | The issue of the |
| RSD-9765 | The issue of the |
| RSD-10720 | When upgrading from Robin CNP v5.5.0 to Robin CNP v5.5.1, Pods and containers restarted unexpectedly because of a `FileNotFoundError (No such file or directory)` error. This issue is fixed. |
26.1.7. Known Issues¶
| Reference ID | Description |
|---|---|
| PP-35015 | **Symptom:** After rebooting a worker node that is hosting Pods with Robin RWX volumes, one or more application Pods using these volumes might get stuck in the `ContainerCreating` state indefinitely. **Workaround:** If you notice the above issue, contact the Robin CS team. |
| PP-39901 | **Symptom:** A Pod IP is not pingable from any other node in the cluster, apart from the node where it is running. **Workaround:** Bounce the Calico Pod running on the node where the issue is seen. |
| PP-39900 | **Symptom:** After upgrading your Robin cluster from a supported version to Robin CNP v5.5.1, some Helm application Pods might get stuck in the `ContainerCreating` state. When you run the command, you see the following error: `Error: Failed to mount volume pvc-6f29f4a5-4009-4a99-b37e-a37f34ca5165: Volume 1:22 active snapshot is already mounted elsewhere` You also observe repeated VolumeMount jobs. **Workaround:** Important: Apply this workaround only after confirming both of the following conditions: **Note:** For further assistance, contact the Robin Customer Support team. |
| PP-39645 | **Symptom:** Robin CNP v5.7.0 may rarely fail to honor soft Pod anti-affinity, resulting in uneven Pod distribution on labeled nodes. When you deploy an application with the recommended **Workaround:** Bounce the Pod that has not honored soft affinity. |
| PP-34226 | **Symptom:** When a PersistentVolumeClaim (PVC) is created, the CSI provisioner initiates a **Workaround:** Bounce the CSI provisioner Pod: `# kubectl delete pod -n robinio <csi-provisioner-robin>` |
| PP-34414 | **Symptom:** In rare scenarios, the IOMGR service might fail to open devices in exclusive mode when it starts, as other processes are using these disks. **Workaround:** If the device is not in use, restart the `iomgr` service: `# supervisorctl restart iomgr` |
| PP-39632 | **Symptom:** After upgrading to Robin CNP v5.5.1, the NFS client might hang with a no pending IO message. For no pending IO, refer to this path: `CsiServer_9 - robin.utils - INFO - Executing command /usr/bin/nc -z -w 6 172.19.149.161 2049 with timeout 60 seconds` `CsiServer_9 - robin.utils - INFO - Command /usr/bin/nc -z -w 6 172.19.149.161 2049 completed with return code 0.` `CsiServer_9 - robin.utils - INFO - Standard out:` Also, you can find the following message in the `nfs: server 172.19.131.218 not responding, timed out` **Workaround:** |
| PP-34492 | **Symptom:** When you run the **Workaround:** |
| PP-35478 | **Symptom:** In rare scenarios, the kube-scheduler may not function as expected when many Pods are deployed in a cluster due to issues with the **Workaround:** Complete the following workaround steps to resolve issues with the |
| PP-36865 | **Symptom:** The **Workaround:** To resolve this, manually restart the robin-server and robin-server-bg services using the following commands: `# rbash master`, `# supervisorctl restart robin-server`, `# supervisorctl restart robin-server-bg` |
| PP-37330 | **Symptom:** During or after upgrading to Robin CNP v5.7.0, the following mount failure occurs: `/bin/mount /dev/sdn /var/lib/robin/nfs/robin-nfs-shared-35/ganesha/pvc-822e76f0-9bb8-4629-8aae-8318fb2d3b41 -o discard failed with return code 32: mount: /var/lib/robin/nfs/robin-nfs-shared-35/ganesha/pvc-822e76f0-9bb8-4629-8aae-8318fb2d3b41: wrong fs type, bad option, bad superblock on /dev/sdn, missing codepage or helper program, or other error.` **Workaround:** If you notice this issue, contact the Robin Customer Support team for assistance. |
| PP-37416 | **Symptom:** In rare scenarios, when upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.7.0, the upgrade might fail with the following error during the Kubernetes upgrade process on other master nodes: `Failed to execute kubeadm upgrade command for K8S upgrade. Please make sure you have the correct version of kubeadm rpm binary installed` **Workaround:** If you notice the above error, restart the kubelet: `# systemctl restart kubelet` |
| PP-39467 | **Symptom:** When deploying applications with RWX PVCs, application Pods fail to mount volumes and get stuck in the **Workaround:** Reboot the host that is in the |
| PP-39429 | **Symptom:** When you try to create an application from a snapshot or backup, the configuration for static IP addresses and static MAC addresses is supported only for Robin bundle applications; it is not supported for Helm-based applications. |
| PP-38044 | **Symptom:** When attempting to detach a repository from a hydrated Helm application, the operation might fail with the following error: `Can't detach repo as the application is in IMPORTED state, hydrate it in order to detach the repo from it.` This issue occurs even if the application has already been hydrated. The system incorrectly marks the application as being in the `IMPORTED` state. **Workaround:** To detach the repository, manually rehydrate the application and then retry the detach operation. |
| PP-38251 | **Symptom:** When evacuating a disk from an offline node in a large cluster, the `robin drive evacuate` command fails with the following error message: `Json deserialize error: invalid value: integer -10, expected u64 at line 1 column 2440.` **Workaround:** If you notice the above issue, contact the Robin CS team. |
| PP-38471 | **Symptom:** When StatefulSet Pods restart, the Pods might get stuck in the **Workaround:** If you notice this issue, restart the csi-nodeplugin Pod: `# kubectl delete pod <csi-nodeplugin> -n robinio` |
| PP-38087 | **Symptom:** In certain cases, the snapshot size allocated to a volume could be less than what is requested. This occurs when the volume is allocated from multiple disks. |
| PP-38924 | **Symptom:** After you delete multiple Helm applications, one of the Pods might get stuck in the `Error` state. **Workaround:** On the node where the Pod is stuck in the `Error` state, restart Docker and kubelet. |
| PP-34451 | **Symptom:** In rare scenarios, the RWX Pod might be stuck, with the following error: `mount.nfs: mount system call failed` Perform the following steps to confirm the issue: `# robin disk info` If you notice any input and output errors in step 4, apply the following workaround. **Workaround:** Find the Pods using the PVC and delete them: `# kubectl get pods --all-namespaces -o=jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.spec.volumes[*].persistentVolumeClaim.claimName}{"\n"}{end}' \| grep <pvc_name>` and then `# kubectl delete pod <pod> -n <namespace>` |
| PP-21916 | **Symptom:** A Pod IP is not pingable from any other node in the cluster, apart from the node where it is running. **Workaround:** Bounce the Calico Pod running on the node where the issue is seen. |
| PP-40819 | **Symptom:** From the Robin CNP UI, when you try to deploy an application by cloning from a snapshot, the operation might fail with an error message similar to the following, indicating an invalid negative CPU value: `Invalid value: "-200m": must be greater than or equal to 0.` You might observe this issue specifically when the application has sidecar containers configured with CPU requests/limits. This is a CNP UI issue; you can use the CNP CLI to perform the same operation successfully. **Workaround:** Use the following Robin CLI command to clone the snapshot and create an app: `# robin app create from-snapshot <new_app_name> <snapshot_id> --rpool default --wait` |
| PP-41022 | **Symptom:** The **Workaround:** If you notice this issue, restart kubelet on the affected node: `# systemctl restart kubelet` |
| PP-40993 | **Symptom:** During large cluster upgrades, the upgrade might fail during Robin pre-upgrade actions if Robin Auto Pilot creates active jobs. This occurs when multiple Robin Auto Pilot watchers are configured for a single Pod, resulting in lingering jobs (for example, VnodeDeploy) that block the upgrade process. **Workaround:** Restart the |
26.1.8. Technical Support¶
Contact Robin Technical Support for any assistance.