25. Release Notes¶
25.1. Robin Cloud Native Platform v5.6.0¶
The Robin Cloud Native Platform (CNP) v5.6.0 release notes cover pre- and post-upgrade steps, new features, improvements, fixed issues, and known issues.
Release Date: June 25, 2025
25.1.1. Infrastructure Versions¶
The following software applications are included in this CNP release:
Software Application | Version
---|---
Kubernetes | 1.32.4
Docker | 25.0.2
Prometheus | 2.39.1
Prometheus Adapter | 0.10.0
Node Exporter | 1.4.0
Calico | 3.28.2
HAProxy | 2.4.7
PostgreSQL | 14.12
Grafana | 9.2.3
CRI Tools | 1.32.0
25.1.2. Supported Operating Systems¶
The following are the supported operating systems and kernel versions for Robin CNP v5.6.0:
OS Version | Kernel Version
---|---
RHEL 8.10 | 4.18.0-553.el8_10.x86_64
Rocky Linux 8.10 | 4.18.0-553.el8_10.x86_64
25.1.3. Upgrade Paths¶
The following are the supported upgrade paths for Robin CNP v5.6.0:
Robin CNP v5.4.3 HF4 to Robin CNP v5.6.0-128
Robin CNP v5.4.3 HF4 PP2 to Robin CNP v5.6.0-128
Robin CNP v5.4.3 HF5 to Robin CNP v5.6.0-128
Robin CNP v5.4.3 HF5 PP1 to Robin CNP v5.6.0-128
25.1.3.1. Pre-upgrade considerations¶
For a successful upgrade, you must run the possible_job_stuck.py script before and after the upgrade. Contact the Robin Support team for the upgrade procedure using the script.
When upgrading from supported Robin CNP versions to Robin CNP v5.6.0, if your cluster already has cert-manager installed, you must uninstall it before upgrading to Robin CNP v5.6.0.
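A quick way to check whether a cert-manager instance is already installed is a generic kubectl query (cert-manager Pods typically run in a namespace named cert-manager, but your installation may differ):
# kubectl get pods -A | grep cert-manager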
25.1.3.2. Post-upgrade considerations¶
After upgrading to Robin CNP v5.6.0, you must run the robin schedule update K8sResSync k8s_resource_sync 60000 command to update the K8sResSync schedule.
After upgrading to Robin CNP v5.6.0, you must run the robin-server validate-role-bindings command. To run this command, you need to log in to the robin-master Pod. This command verifies the roles assigned to each user in the cluster and corrects them if necessary.
After upgrading to Robin CNP v5.6.0, the k8s_auto_registration config parameter is disabled by default. The setting is deactivated to prevent all Kubernetes apps from automatically registering and consuming resources. The following are the points you must be aware of with this change:
You can manually register Kubernetes apps using the robin app register command and use Robin CNP for snapshots, clones, and backup operations of the Kubernetes app.
As this config parameter is disabled, when you run the robin app nfs-list command, the mappings between Kubernetes apps and NFS server Pods are not listed in the command output.
If you need the mapping between a Kubernetes app and its NFS server Pod when the k8s_auto_registration config parameter is disabled or the Kubernetes app is not manually registered, get the PVC name from the Pod YAML (kubectl get pod -n <namespace> -o yaml) and run the robin nfs export list | grep <pvc name> command, as shown in the example below. The robin nfs export list command output displays the PVC name and namespace.
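For example, assuming a Pod named my-app-0 in the namespace demo backed by a PVC (the Pod name, namespace, and output line are hypothetical), the lookup chain would be:
# kubectl get pod my-app-0 -n demo -o yaml | grep claimName
  claimName: data-my-app-0
# robin nfs export list | grep data-my-app-0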
25.1.3.3. Pre-upgrade steps¶
Upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.6.0-128
Before upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.6.0, perform the following steps:
Update the value of the suicide_threshold config parameter to 1800:
# robin config update agent suicide_threshold 1800
Disable the NFS Server Monitor schedule:
# robin schedule disable "NFS Server" Monitor
Set the toleration seconds for all NFS server Pods to 86400 seconds. After the upgrade, you must change the toleration seconds according to the post-upgrade steps.
for pod in `kubectl get pod -n robinio -l robin.io/instance=robin-nfs --output=jsonpath={.items..metadata.name}`; do echo "Updating $pod tolerationSeconds to 86400"; kubectl patch pod $pod -n robinio --type='json' -p='[{"op": "replace", "path": "/spec/tolerations/0/tolerationSeconds", "value": 86400}, {"op": "replace", "path": "/spec/tolerations/1/tolerationSeconds", "value": 86400}]'; done
Upgrading from Robin CNP v5.4.3 HF4+PP to Robin CNP v5.6.0-128
Before upgrading from Robin CNP v5.4.3 HF4+PP to Robin CNP v5.6.0, perform the following steps:
Update the value of the suicide_threshold config parameter to 1800:
# robin config update agent suicide_threshold 1800
Set the NFS Server schedule CronJob to more than 6 months:
# rbash master
# rsql
# update schedule set kwargs='{"cron":"1 1 1 1 *"}' where callback='nfs_server_monitor';
# \q
# systemctl restart robin-server
Set the toleration seconds for all NFS server Pods to 86400 seconds. After the upgrade, you must change the toleration seconds according to the post-upgrade steps.
for pod in `kubectl get pod -n robinio -l robin.io/instance=robin-nfs --output=jsonpath={.items..metadata.name}`; do echo "Updating $pod tolerationSeconds to 86400"; kubectl patch pod $pod -n robinio --type='json' -p='[{"op": "replace", "path": "/spec/tolerations/0/tolerationSeconds", "value": 86400}, {"op": "replace", "path": "/spec/tolerations/1/tolerationSeconds", "value": 86400}]'; done
25.1.3.4. Post-upgrade steps¶
After upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.6.0-128
After upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.6.0, perform the following steps:
Update the value of the suicide_threshold config parameter to 40:
# robin config update agent suicide_threshold 40
Enable the NFS Server Monitor schedule:
# robin schedule enable "NFS Server" Monitor
Set the check_helm_apps config parameter to False:
# robin config update cluster check_helm_apps False
Set the chargeback_track_k8s_resusage config parameter to False:
# robin config update server chargeback_track_k8s_resusage False
Set the robin_k8s_extension config parameter to True:
# robin config update manager robin_k8s_extension True
Verify whether the following mutating webhooks are present:
# kubectl get mutatingwebhookconfigurations -A | grep robin
k8srobin-deployment-mutating-webhook   1   20d
k8srobin-ds-mutating-webhook           1   20d
k8srobin-pod-mutating-webhook          1   20d
k8srobin-sts-mutating-webhook          1   20d
robin-deployment-mutating-webhook      1   20d
robin-ds-mutating-webhook              1   20d
robin-pod-mutating-webhook             1   20d
robin-sts-mutating-webhook             1   20d
If the above k8srobin-* mutating webhooks are not present, bounce the robink8s-serverext Pods:
# kubectl delete pod -n robinio -l app=robink8s-serverext
Verify whether the following validating webhooks are present:
# kubectl get validatingwebhookconfigurations
NAME                              WEBHOOKS   AGE
cert-manager-webhook              1          45h
controllers-validating-webhook    1          31h
ippoolcr-validating-webhook       1          31h
namespaces-validating-webhook     1          31h
pods-validating-webhook           1          31h
pvcs-validating-webhook           1          31h
If the robin-* mutating webhooks displayed in the step 6 output and the validating webhooks displayed in the step 8 output are not present on your setup, restart the robin-server-bg service:
# rbash master
# supervisorctl restart robin-server-bg
Set the toleration seconds for all NFS server Pods to 60 seconds for when a node is in the NotReady state and to 0 seconds for when a node is in the Unreachable state:
for pod in `kubectl get pod -n robinio -l robin.io/instance=robin-nfs --output=jsonpath={.items..metadata.name}`; do echo "Updating $pod tolerationSeconds"; kubectl patch pod $pod -n robinio --type='json' -p='[{"op": "replace", "path": "/spec/tolerations/0/tolerationSeconds", "value": 60}, {"op": "replace", "path": "/spec/tolerations/1/tolerationSeconds", "value": 0}]'; done 2>/dev/null
After upgrading from Robin CNP v5.4.3 HF4+PP to Robin CNP v5.6.0-128
After upgrading from Robin CNP v5.4.3 HF4+PP to Robin CNP v5.6.0, perform the following steps:
Update the value of the suicide_threshold config parameter to 40:
# robin config update agent suicide_threshold 40
Enable the NFS Server Monitor schedule:
# robin schedule enable "NFS Server" Monitor
Set the check_helm_apps config parameter to False:
# robin config update cluster check_helm_apps False
Set the chargeback_track_k8s_resusage config parameter to False:
# robin config update server chargeback_track_k8s_resusage False
Set the robin_k8s_extension config parameter to True:
# robin config update manager robin_k8s_extension True
Delete the NFS Server schedule CronJob and restart the robin-server and robin-server-bg services:
# rbash master
# rsql
# DELETE from schedule where callback='nfs_server_monitor';
# \q
# supervisorctl restart robin-server
# supervisorctl restart robin-server-bg
Verify whether the following mutating webhooks are present:
# kubectl get mutatingwebhookconfigurations -A | grep robin
k8srobin-deployment-mutating-webhook   1   20d
k8srobin-ds-mutating-webhook           1   20d
k8srobin-pod-mutating-webhook          1   20d
k8srobin-sts-mutating-webhook          1   20d
robin-deployment-mutating-webhook      1   20d
robin-ds-mutating-webhook              1   20d
robin-pod-mutating-webhook             1   20d
robin-sts-mutating-webhook             1   20d
If the above k8srobin-* mutating webhooks are not present, bounce the robink8s-serverext Pods:
# kubectl delete pod -n robinio -l app=robink8s-serverext
Verify whether the following validating webhooks are present:
# kubectl get validatingwebhookconfigurations
NAME                              WEBHOOKS   AGE
cert-manager-webhook              1          45h
controllers-validating-webhook    1          31h
ippoolcr-validating-webhook       1          31h
namespaces-validating-webhook     1          31h
pods-validating-webhook           1          31h
pvcs-validating-webhook           1          31h
If the robin-* mutating webhooks displayed in the step 7 output and the validating webhooks displayed in the step 9 output are not present on your setup, restart the robin-server-bg service:
# rbash master
# supervisorctl restart robin-server-bg
Set the toleration seconds for all NFS server Pods to 60 seconds for when a node is in the NotReady state and to 0 seconds for when a node is in the Unreachable state:
for pod in `kubectl get pod -n robinio -l robin.io/instance=robin-nfs --output=jsonpath={.items..metadata.name}`; do echo "Updating $pod tolerationSeconds"; kubectl patch pod $pod -n robinio --type='json' -p='[{"op": "replace", "path": "/spec/tolerations/0/tolerationSeconds", "value": 60}, {"op": "replace", "path": "/spec/tolerations/1/tolerationSeconds", "value": 0}]'; done 2>/dev/null
25.1.4. New Features¶
25.1.4.1. Robin Certificate Management¶
Starting with Robin CNP v5.6.0, you can manage all certificates for your cluster without manual intervention using the Robin certificate management feature. Robin CNP uses the functionality of cert-manager for this feature. The cert-manager feature is a native Kubernetes certificate management controller. It helps in issuing certificates from various certificate authorities, such as Let’s Encrypt, Entrust, DigiCert, HashiCorp Vault, and Venafi. It can also issue certificates from a local CA (self-signed).
cert-manager adds Certificate and Issuer resources in Kubernetes clusters, which simplifies the process of obtaining, generating, and renewing the certificates for the cluster. For more information, see cert-manager.
The Robin certificate management feature manages certificates only for Robin internal services deployed in the robinio namespace. It ensures that all certificates are valid and up to date, and it automatically renews certificates before they expire.
The Robin certificate management feature has the following certificate issuers:
cluster-issuer - Responsible for all certificates used internally by the various control plane services.
ident-issuer - Responsible for the Cluster Identity certificate used by all outward-facing services, such as the Kubernetes API server, the Robin client, and the GUI.
Points to consider for Robin Certificate Management
When you install or upgrade to Robin CNP v5.6.0, cert-manager is deployed by default, and a new service named robin-cert-monitor is deployed to monitor the state of all certificates required by the various Pods and containers in the Robin CNP cluster, ensuring that all required certificates exist and are valid.
During installation or upgrade to Robin CNP v5.6.0, only the cert-manager option is supported. If you want to manage your cluster's certificates using the local control mode, you can use the robin cert reset-cluster-certs command to enable local control mode.
You can have only one cert-manager instance in a cluster.
If your cluster is already installed with a Cluster Identity certificate signed by an external CA, you must reconfigure it using the robin cert reset-cluster-identity command after upgrading to Robin CNP v5.6.0.
If you want to utilize a Cluster Identity certificate signed by an external CA after installing Robin CNP v5.6.0, you can use the robin cert reset-cluster-identity command to configure it.
If you want to install Robin CNP v5.6.0 with both a Cluster Identity certificate signed by an external CA and cert-manager, you must pass the following options in the config.json file for one of the master nodes. For more information, see Installation with Custom Cluster Identity certificate.
ident-ca-path
ident-cert-path
ident-key-path
You cannot install your own cert-manager on a Robin CNP cluster. If you want to utilize the functionality of cert-manager, you must use the cert-manager deployed as part of the Robin certificate management feature to create Issuers and Certificates in other namespaces, as sketched below.
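As an illustration of that last point, a self-signed Issuer and a Certificate created in an application namespace would look like the following sketch. It uses the upstream cert-manager.io/v1 API rather than anything Robin-specific, and the namespace, resource names, and DNS name are hypothetical:
# cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: demo-selfsigned-issuer   # hypothetical name
  namespace: demo                # hypothetical application namespace
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: demo-tls                 # hypothetical name
  namespace: demo
spec:
  secretName: demo-tls           # Secret where the issued key pair is stored
  dnsNames:
    - demo.example.com           # hypothetical DNS name
  issuerRef:
    name: demo-selfsigned-issuer
    kind: Issuer
EOF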
For more information, see Robin Certificate Management.
25.1.4.2. Recreate a Faulted Volume for Helm Apps¶
Robin CNP v5.6.0 enables you to recreate a volume that is in the Faulted status using the same configuration as that of the faulted one. The feature is only supported for volumes used by Helm applications. To support this feature, the following new command is made available:
# robin volume recreate --name <faulted volume name> or --pvc-name <PVC name of a faulted volume> --force
Note
You must use the --force option along with the command.
When you recreate a new volume in place of a faulted volume, all data on the volume is lost permanently. For more information, see Recreate a Faulted Volume for Helm Apps.
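For example, to recreate a faulted volume by its PVC name (the PVC name here is hypothetical):
# robin volume recreate --pvc-name pvc-89382d8e-66c4-4d42-8d8c-62f7a328c713 --force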
25.1.4.3. Memory Manager Integration¶
Robin CNP integrates the Kubernetes Memory Manager plugin starting with Robin CNP v5.6.0.
The Memory Manager plugin allocates guaranteed memory and hugepages for guaranteed QoS Pods at the NUMA level.
The Memory Manager plugin works along with the CPU Manager and Topology Manager. It provides topology hints to the Topology Manager, enabling NUMA-aware resource allocation. The Memory Manager plugin ensures that the memory requested by a Pod is allocated from a minimum number of Non-Uniform Memory Access (NUMA) nodes.
Note
Robin CNP supports only the Static policy for Memory Manager and supports only Pod as the scope for Topology Manager (topology-manager-scope=Pod).
You can enable this plugin using the "memory-manager-policy":"Static" parameter in the config.json file during Robin CNP installation or when upgrading to Robin CNP v5.6.0 from a supported version. For more information, see Memory Manager.
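A minimal sketch of how the parameter might sit in config.json; the assumption here is that it appears as a sibling key to other documented install options such as "ip-protocol", and the exact file structure depends on your deployment:
{
  "memory-manager-policy": "Static",
  "ip-protocol": "dualstack"
}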
25.1.4.4. Integrating Helm Support¶
Starting with Robin CNP v5.6.0, Robin CNP introduces native support for Helm chart management. The feature allows you to easily deploy, manage, and upgrade applications packaged as Helm charts within the CNP environment.
A new CLI (robin helm) is available to support this feature. For more information, see Helm Operations.
25.1.4.5. Istio Integration¶
Robin CNP supports integration of Istio 1.23. You can install Istio after installing or upgrading to Robin CNP v5.6.0.
Istio is a service mesh that helps manage communication between microservices in distributed applications. For more information, see Istio.
After installing the Istio control plane, you must install Ingress and Egress gateways to manage the incoming and outgoing traffic. For more information, see Integrate Istio with Robin CNP.
25.1.4.6. Dual Stack (IPv4 & IPv6) Support¶
Starting with Robin CNP v5.6.0, Robin CNP supports dual-stack networking on the Calico interface for a cluster, allowing it to accept traffic from both Internet Protocol version 4 (IPv4) and Internet Protocol version 6 (IPv6) devices. For more information, see IPv4/IPv6 dual-stack.
Dual-stack Pod networking assigns both IPv4 and IPv6 Calico addresses to Pods. A service can utilize an IPv4 address, an IPv6 address, or both. For more information, see Services. Pod Egress routing works through both IPv4 and IPv6 interfaces.
You can enable the dual-stack networking feature during the Robin CNP installation only, not during the upgrade of an existing Robin CNP cluster. To enable this feature, you must specify the following option in the Config JSON file for one of the master nodes:
"ip-protocol":"dualstack"
Note
Hosts must have dual-stack (IPv4 and IPv6) network interfaces.
For more information, see Dual-stack (IPv4 and IPv6) installation.
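After installation with dual-stack enabled, you can confirm that a Pod received both address families with a standard kubectl query (the Pod name and the addresses in the sample output are hypothetical):
# kubectl get pod my-app-0 -o jsonpath='{.status.podIPs}'
[{"ip":"10.244.1.5"},{"ip":"fd00:10:244::5"}]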
25.1.4.7. Auto Release Static IP of Terminating Pod¶
Starting with Robin CNP v5.6.0, Robin CNP supports automatically releasing the static IP address of a Pod that is stuck in the terminating state on a node with the NotReady status. If a Pod with a static IP address is stuck in the terminating state, Kubernetes cannot assign this static IP address to a new Pod because the IP address remains in use by the terminating Pod. The IP address must be released before it can be reassigned to any Pod.
To address this, Robin CNP deploys a system service named robin-kubelet-watcher. This service monitors the health and connectivity of the API server, kubelet, CRI, and Docker services on NotReady nodes every 10 seconds. If any of these services are unhealthy for 60 seconds, robin-kubelet-watcher terminates all Pods running on that node, releasing their IP addresses.
For more information, see Auto Release Static IP address of Terminating Pod.
25.1.4.8. Secure communication between Kubelet and Kube apiserver¶
Starting with Robin CNP v5.6.0, Robin CNP supports secure communication between the kubelet and kube-apiserver. In a Kubernetes cluster, the kubelet and kube-apiserver communicate with each other securely using TLS certificates. This communication is secured through mutual TLS: both the kubelet and kube-apiserver present their certificates to verify each other's identity. This ensures that only authorized kubelets connect to the kube-apiserver and that communication between them is secure.
By default, the kubelet's server certificate is self-signed, meaning it is signed by a temporary Certificate Authority (CA) that is created on the fly and then discarded. To enable secure communication between the kubelet and kube-apiserver, you must configure the kubelet to obtain its server certificate by issuing a Certificate Signing Request (CSR), rather than using a server certificate signed by a self-signed CA. After configuring the kubelet, you must also configure the kube-apiserver to process and approve the CSR.
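For background, the standard Kubernetes mechanism for this is the kubelet's serverTLSBootstrap setting combined with approval of kubelet-serving CSRs; see the linked documentation for the Robin-specific procedure. A generic sketch:
# In the kubelet configuration file (KubeletConfiguration), enable
# CSR-based serving certificates:
#   serverTLSBootstrap: true
# Pending kubelet-serving CSRs can then be listed and approved:
# kubectl get csr
# kubectl certificate approve <csr-name>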
For more information, see Secure communication between kubelet and kube-apiserver.
25.1.4.9. Large cluster support¶
Starting with Robin CNP v5.6.0, support for large clusters is available. You can now have a Robin CNP cluster with up to 110 nodes.
25.1.5. Improvements¶
25.1.5.1. Persistent Prometheus Configuration¶
Robin CNP v5.6.0 keeps the Prometheus configuration persistent when you stop and start the metrics feature.
With this improvement, when you update any of the following Prometheus-related configuration parameters, the values persist across metrics stop and start operations:
node_exporter_ds_cpu_limit
node_exporter_ds_memory_limit
prom_evaluation_interval
prom_scrape_interval
prom_scrape_timeout
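These parameters are updated with the usual robin config update command. The config section and value shown here are assumptions for illustration only; use the section and units your cluster actually reports for these parameters:
# robin config update server prom_scrape_interval 30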
25.1.5.2. New Volume Metrics¶
Starting with Robin CNP v5.6.0, the robin_vol_psize metric is introduced.
robin_vol_psize represents the physical (or raw) storage space, in bytes, used by a single replica of the volume. This metric provides further insight into storage consumption.
Example:
# curl -k https://localhost:29446/metrics
robin_vol_rawused{name="pvc-89382d8e-66c4-4d42-8d8c-62f7a328c713",volid="2"} 134217728
robin_vol_size{name="pvc-89382d8e-66c4-4d42-8d8c-62f7a328c713",volid="2"} 1073741824
robin_vol_psize{name="pvc-89382d8e-66c4-4d42-8d8c-62f7a328c713",volid="2"} 67108864
In the above example, the value 67108864 (64 MiB) for robin_vol_psize is the physical (or raw) storage space, in bytes, used by a single replica of the volume.
25.1.5.3. Helm Version Upgrade¶
Starting with Robin CNP v5.6.0, the Helm version is upgraded from v3.6.3 to v3.16.1.
25.1.5.4. New Node Level Events¶
Robin CNP v5.6.0 provides the following new events to enhance the system’s ability to monitor and detect node readiness issues at both the Kubernetes and service/component levels:
EVENT_NODE_K8S_NOTREADY - This event is generated when a node is marked as down due to an issue with a Kubernetes component. It is a warning alert.
EVENT_NODE_K8S_READY - This event is generated when a node is up after being marked as down. It is an info alert.
EVENT_NODE_NOTREADY - This event is generated when a node is marked as not ready due to an unhealthy service or component. It is a warning alert.
EVENT_NODE_READY - This event is generated when a node is ready after being marked as not ready. It is an info alert.
25.1.5.5. Updated the Default Reclaim Policy for robin-patroni PVs¶
Starting with Robin CNP v5.6.0, the reclaim policy for robin-patroni PVs is set to Retain by default.
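You can verify the reclaim policy on the PVs with a standard kubectl query:
# kubectl get pv -o custom-columns=NAME:.metadata.name,RECLAIM:.spec.persistentVolumeReclaimPolicy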
25.1.5.6. HTTPS support for license proxy server¶
Starting from Robin CNP v5.6.0, Robin CNP supports Hypertext Transfer Protocol Secure (HTTPS) for the license proxy server to activate and renew Robin CNP cluster’s licenses.
25.1.5.7. VDI access support for Windows VMs¶
Starting with Robin CNP v5.6.0, you can access Windows-based VMs using the RDP console from the Robin UI.
25.1.5.8. KVM console access for tenant users¶
Starting with Robin CNP v5.6.0, tenant admins and tenant users can access the KVM application console from the Robin UI.
25.1.5.9. Events for certificates add and remove¶
Robin CNP generates an event when you add or remove a certificate. The following new Info events are added as part of this release:
EVENT_CERT_ADDED - This event is generated when a certificate is added.
EVENT_CERT_REMOVED - This event is generated when a certificate is removed.
25.1.5.10. Archive failed job logs¶
Starting with Robin CNP v5.6.0, Robin CNP automatically archives failed job logs. A new config parameter, failed_job_archive_age, controls this behavior. The default value of this parameter is 3 days, which means failed job logs older than 3 days are automatically archived.
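To change the archive age, the parameter would be updated with the usual robin config update command; the config section and value format shown here are assumptions for illustration:
# robin config update server failed_job_archive_age 7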
25.1.5.11. Relaxation in NIC bonding policy¶
Starting with Robin CNP v5.6.0, Robin CNP considers a NIC bonding interface operational and up when at least one of the two interfaces used to create the bond is up.
25.1.5.12. Resume upgrade after a failure¶
The Robin CNP upgrade process is idempotent starting with Robin CNP v5.6.0 and allows you to resume it after a failure.
25.1.5.13. Support to provide static IP when creating an app from backup¶
When you are creating an app from a backup, you can provide static IPs from an IP pool starting from Robin CNP v5.6.0.
The following new option is added to the existing robin app create from-backup command:
--static-ips
Note
You must use the --ip-pools option along with the --static-ips option.
The following is the format for this new option:
<ippool1>@<ip1/ip2>
Note
You can only provide multiple IPs from the same IP pool by separating the list of IPs using the “/” symbol.
Example
--static-ips ovs-2@192.0.2.14/192.0.2.15/192.0.2.16
25.1.5.14. MetalLB new install options¶
Starting with Robin CNP v5.6.0, the following new install options are added for MetalLB:
metallb-skip-nodes - Skip nodes from deploying MetalLB speaker Pods.
metallb-skip-controlplane - Skip master nodes from deploying MetalLB controller Pods.
metallb-k8sfrr-mode - Deploy MetalLB using the K8s-FRR mode instead of the default FRR mode.
25.1.5.15. Patroni and Robin Manager Services metrics¶
Robin CNP v5.6.0 provides support for Patroni metrics and Robin manager service metrics. For more information, see Patroni and service metrics.
25.1.6. Fixed Issues¶
Reference ID | Description
---|---
RSD-8287 | Under specific conditions, volumes are unable to recover from a fault, leading them to enter a
RSD-3885 | The
RSD-4634 | When Robin CNP is running on SuperMicro nodes, the IPMI tool incorrectly displays the BMC IPv6 address as follows:
RSD-4584 | If you added a range of blacklisted IPs in an unexpanded form, Robin CNP does not allow you to remove the range of blacklisted IPs from the IP Pool. This issue is fixed.
RSD-5771 | IPv6 IP pool creation fails when the gateway is the same as the broadcast address for the IP pool subnet. This issue is fixed.
RSD-8104 | The issue of the
RSD-7814 | The issue of the application creation operation failing with the following error is now fixed: Failed to mount volume <volume-name>: Node <node-name> has mount_blocked STORMGR_NODE_BLOCK_MOUNT. No new mounts are allowed.
RSD-7499 | There was an issue with the storage creation request calculation between Robin CNP and Kubernetes. Due to this mismatched calculation, some application Pods failed to deploy as desired. This issue is fixed.
RSD-9323 | When you try to restore an application from a backup that previously had a static IP address, the restore process fails to honor the static IP and reports the following error: Non static IP allocations cannot be done from non-range(network) IP pools -> ‘nc-bss-ov-internal-mgmt-int-v6’. This issue is fixed.
PP-34457 | When the Metrics feature is enabled, the Grafana metrics application does not display. This issue is fixed.
PP-38087 | In certain cases, the snapshot size allocated to a volume could be less than what was requested. This occurs when the volume is allocated from multiple disks. This issue is fixed.
PP-38397 | Robin CNP upgrade fails due to a Docker installation failure. The failure is caused by missing
PP-38071 | The issue of application creation failing with the following error is fixed: Failed to mount volume : Node has mount_blocked STORMGR_NODE_BLOCK_MOUNT. No new mounts are allowed.
25.1.7. Known Issues¶
Reference ID | Description
---|---
PP-35015 |
Symptom After renewing the expired Robin license successfully, Robin CNP incorrectly displays the Workaround You need to restart the robin-server-bg service. # rbash master
# supervisorctl restart robin-server-bg
|
PP-21916 |
Symptom A Pod IP is not pingable from any other node in the cluster, apart from the node where it is running. Workaround Bounce the Calico Pod running on the node where the issue is seen. |
PP-30247 |
Symptom After upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.6.0, the RWX apps might report the following error event type: wrong fs type, bad option, bad superblock on /dev/sdj, missing codepage or helper program, or other error. Workaround To resolve this issue, contact the Robin Customer Support team. |
PP-30398 |
Symptom After removing an offline master node from the cluster and power cycling it, the removed master node is automatically added back as a worker node. Workaround
|
PP-34226 |
Symptom When a PersistentVolumeClaim (PVC) is created, the CSI provisioner initiates a Workaround Bounce the CSI provisioner Pod. # kubectl delete pod -n robinio
|
PP-34414 |
Symptom In rare scenarios, the IOMGR service might fail to open devices in the exclusive mode when it starts as other processes are using these disks. You might observe the following issue:
Steps to identify the issue:
Workaround If the device is not in use, restart the IOMGR service on the respective node: # supervisorctl restart iomgr
|
PP-34451 |
Symptom In rare scenarios, the RWX Pod might be stuck in the mount.nfs: mount system call failed Perform the following steps to confirm the issue:
If you notice any input and output errors in step 4, apply the following workaround: Workaround
|
PP-34492 |
Symptom When you run the Workaround
In rare scenarios, when upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.6.0, the upgrade might fail with the following error during the Kubernetes upgrade process on other master nodes: Failed to execute kubeadm upgrade command for K8S upgrade. Please make sure you have the correct version of kubeadm rpm binary installed Steps to identify the issue:
Workaround If you notice the above error, restart the kubelet and rerun the upgrade: # systemctl restart kubelet
|
PP-35478 |
Symptom In rare scenarios, the kube-scheduler may not function as expected when many Pods are deployed in a cluster due to issues with the kube-scheduler lease. Workaround Complete the following workaround steps to resolve issues with the kube-scheduler lease:
|
PP-36865 |
Symptom After rebooting a node, the node might not come back online after a long time, and the host BMC console displays the following message for RWX PVCs mounted on that node:
Workaround Power cycle the host system. |
PP-37330 |
Symptom During or after upgrading to Robin CNP v5.6.0, the /bin/mount /dev/sdn /var/lib/robin/nfs/robin-nfs-shared-35/ganesha/pvc-822e76f0-9bb8-4629-8aae-8318fb2d3b41 -o discard failed with return code 32: mount: /var/lib/robin/nfs/robin-nfs-shared-35/ganesha/pvc-822e76f0-9bb8-4629-8aae-8318fb2d3b41: wrong fs type, bad option, bad superblock on /dev/sdn, missing codepage or helper program, or other error. Workaround If you notice this issue, contact the Robin Customer Support team for assistance. |
PP-37416 |
Symptom In rare scenarios, when upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.6.0, the upgrade might fail with the following error during the Kubernetes upgrade process on other master nodes: Failed to execute kubeadm upgrade command for K8S upgrade. Please make sure you have the correct version of kubeadm rpm binary installed Steps to identify the issue:
Workaround If you notice the above error, restart the kubelet: # systemctl restart kubelet
|
PP-37965 |
Symptom In Robin CNP v5.6.0, when you scale up a Robin Bundle app, it does not consider the CPU cores and memory already in use by a vnode. As a result, Robin CNP cannot find a suitable host, even though additional resources are available. Workaround If you notice this issue, apply the following workaround:
# robin app computeqos <appname> --role <rolename> --cpus <newcnt> --memory <newmem> --wait
# robin app stop <appname> --wait
# robin app computeqos <appname> --role <rolename> --cpus <newcnt> --memory <newmem> --wait
|
PP-38039 |
Symptom During node reboot or power reset scenarios, application volumes may be forcibly shut down due to I/O errors. As a result, application Pods might get stuck with the Context Deadline Exceeded error. On the affected node where the volume is mounted or the application Pod is scheduled, the following error might be observed in the
Workaround If you notice this issue, contact the Robin Customer Support team for assistance. |
PP-38044 |
Symptom When attempting to detach a repository from a hydrated Helm application, the operation might fail with the following error: Can’t detach repo as the application is in IMPORTED state, hydrate it in order to detach the repo from it. This issue occurs even if the application has already been hydrated. The system incorrectly marks the application in the Workaround To detach the repository, manually rehydrate the application and then retry the detach operation:
|
PP-38078 |
Symptom After a network partition, the robin-agent and iomgr-server services may not restart automatically, and stale devices may not be cleaned up. This issue occurs because the consulwatch thread responsible for monitoring Consul and triggering restarts may fail to detect the network partition. As a result, stale devices may not be cleaned up, potentially leading to resource contention and other issues. Workaround Manually restart the robin-agent and iomgr-server services:
|
PP-38471 |
Symptom When StatefulSet Pods restart, the Pods might get stuck in the Workaround If you notice this issue, restart the # kubectl delete pod <csi-nodeplugin> -n robinio
|
PP-39098 |
Symptom When you create a Robin bundle app with an affinity rule, the bundle app Pod might get stuck in the ContainerCreating and Terminating states in a continuous loop after a node reboot. Workaround If you notice this issue, restart the robin-server-bg service: # rbash master
# supervisorctl restart robin-server-bg
|
PP-38924 |
Symptom After you delete multiple Helm applications, one of the Pods might get stuck in the "Error" state. Workaround On the node where the Pod is stuck in the Error state, restart Docker and Kubelet. |
PP-38524 |
When you upgrade your cluster from any supported Robin CNP version to Robin CNP v5.6.0, the upgrade process might get stuck while upgrading Kubernetes and display this error: Workaround Restart the Calico Pods by performing a rolling restart of the calico-node DaemonSet: # kubectl rollout restart ds -n kube-system calico-node
|
PP-39200 |
After upgrading a non-HA (single-node) Robin cluster from a supported version to Robin CNP v5.6.0, application deployments and scaling operations might fail with the following error: Failed to download file_object, not accessible at this point. |
PP-38411 |
Symptom After upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.6.0, the ERROR - ippoolcr-validating-webhook not found. Please wait for Robin Server Start up to complete. This issue occurs because the necessary validating webhooks for Robin’s IP Pool Custom Resource Definition (CRD) are not properly created during the upgrade process. Workaround To resolve this issue, enable the
If the output does not list any webhooks related to robin, proceed to the next step.
This will add the
|
PP-39087 |
Symptom In a scenario where there are multiple placement constraints with Pod-level anti-affinity for each role and role affinity (co-locating the roles) with explicit tags limiting the placement of Pods and roles, the application deployment fails. Workaround Use tags, maintenance mode, taints, and tolerations to manage the placement of Pods. |
PP-39188 |
Symptom After a Pod using an RWX volume is bounced (deleted and recreated), the new Pod may become stuck in the Workaround
From the output, identify the VolumeFailoverAddNFSExport job ID that is holding the lock.
From the output, identify the sub-job in the AGENT_WAIT state.
After canceling the job, the Pod should eventually transition to the Running state. |
PP-37652 |
Symptom When you deploy a multi-container application using Helm with static IPs assigned from an IP pool, only a subset of the Pods appear on the Robin CNP UI. Workaround Run the following CLI command to view all the Pods: # robin app info <appname> --status
|
PP-39260 |
Symptom Backup operations for applications with sidecar containers are not supported. Contact the Robin Customer Support team for further queries. |
PP-39263 |
Symptom When you try to create a volume using the Workaround Use the unit |
PP-39264 |
Symptom In the Robin UI, when you have an empty Helm chart, the Helm Charts UI page displays the following error: Failed to fetch the helm charts. Workaround You can ignore the error message. |
PP-39265 |
Symptom When you try to share a Helm app using the Robin UI, the Share button in the UI does not respond. Workaround Use the following CLI command to share the Helm app. # robin app share <name> <user name> --all-tenant-users
|
25.1.8. Technical Support¶
Contact Robin Technical support for any assistance.