25. Release Notes¶
25.1. Robin Cloud Native Platform v5.5.0¶
The Robin Cloud Native Platform (CNP) v5.5.0 release notes cover pre- and post-upgrade steps, new features, improvements, fixed issues, and known issues.
Release Date: April 17, 2025
25.1.1. Infrastructure Versions¶
The following software applications are included in this CNP release:
Software Application | Version
---|---
Kubernetes | 1.31.6
Docker | 25.0.2
Prometheus | 2.39.1
Prometheus Adapter | 0.10.0
Node Exporter | 1.4.0
Calico | 3.28.2
HAProxy | 2.4.7
PostgreSQL | 14.12
Grafana | 9.2.3
CRI Tools | 1.31.1
25.1.2. Supported Operating System¶
The following is the supported operating system and kernel version for Robin CNP v5.5.0:
CentOS 7.9 (kernel version: 3.10.0-1160.71.1.el7.x86_64)
25.1.3. Upgrade Paths¶
The following are the supported upgrade paths for Robin CNP v5.5.0:
Robin CNP v5.5.0-1841 to Robin CNP v5.5.0-1852
Robin CNP v5.4.3 HF5 to Robin CNP v5.5.0-1852
Robin CNP v5.4.3 HF3+PP to Robin CNP v5.5.0-1852
25.1.3.1. Pre-upgrade consideration¶
For a successful upgrade, you must run the possible_job_stuck.py script before and after the upgrade. Contact the Robin Support team for the upgrade procedure using the script.

When upgrading from supported Robin CNP versions to Robin CNP v5.5.0, if your cluster already has cert-manager installed, you must uninstall it before upgrading to Robin CNP v5.5.0.
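For example, you can confirm whether cert-manager is already present before starting the upgrade. The following is a minimal sketch that assumes cert-manager was installed with Helm into a cert-manager namespace; adjust the release name and namespace to match your installation:

# kubectl get pods -n cert-manager
# kubectl get crds | grep cert-manager.io
# helm uninstall cert-manager -n cert-manager
# kubectl delete namespace cert-manager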
25.1.3.2. Post-upgrade considerations¶
After upgrading to Robin CNP v5.5.0, you must run the robin schedule update K8sResSync k8s_resource_sync 60000 command to update the K8sResSync schedule.

After upgrading to Robin CNP v5.5.0, you must run the robin-server validate-role-bindings command. To run this command, you need to log in to the robin-master Pod. This command verifies the roles assigned to each user in the cluster and corrects them if necessary.

After upgrading to Robin CNP v5.5.0, the k8s_auto_registration config parameter is disabled by default. The config setting is deactivated to prevent all Kubernetes apps from automatically registering and consuming resources. The following are the points you must be aware of with this change:

You can manually register Kubernetes apps using the robin app register command and use Robin CNP for snapshots, clones, and backup operations of the Kubernetes app.

As this config parameter is disabled, when you run the robin app nfs-list command, the mappings between Kubernetes apps and NFS server Pods are not listed in the command output.

If you need the mapping between a Kubernetes app and an NFS server Pod when the k8s_auto_registration config parameter is disabled or the Kubernetes app is not manually registered, get the PVC name from the Pod YAML output (kubectl get pod <pod name> -n <namespace> -o yaml) and run the robin nfs export list | grep <pvc name> command. The robin nfs export list command output displays the PVC name and namespace.
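For example, assuming a Pod named my-app-0 in a namespace named demo (both hypothetical names), the lookup could look like this:

# kubectl get pod my-app-0 -n demo -o jsonpath='{.spec.volumes[*].persistentVolumeClaim.claimName}'

Then search the export list for the returned PVC name; the output includes the PVC name and namespace:

# robin nfs export list | grep data-my-app-0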
25.1.3.3. Pre-upgrade steps¶
Upgrading from Robin CNP v5.4.3 HF5 or Robin CNP v5.5.0-1841 to Robin CNP v5.5.0-1852
Before upgrading from Robin CNP v5.4.3 HF5 or Robin CNP v5.5.0-1841 to Robin CNP v5.5.0-1852, perform the following steps:
1. Update the value of the suicide_threshold config parameter to 1800:

# robin config update agent suicide_threshold 1800

2. Disable the NFS Server Monitor schedule:

# robin schedule disable "NFS Server" Monitor

3. Set the toleration seconds for all NFS server Pods to 86400 seconds. After the upgrade, you must change the toleration seconds according to the post-upgrade steps.

for pod in `kubectl get pod -n robinio -l robin.io/instance=robin-nfs --output=jsonpath={.items..metadata.name}`; do echo "Updating $pod tolerationseconds to 86400"; kubectl patch pod $pod -n robinio --type='json' -p='[{"op": "replace", "path": "/spec/tolerations/0/tolerationSeconds", "value": 86400}, {"op": "replace", "path": "/spec/tolerations/1/tolerationSeconds", "value": 86400}]'; done
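After running the loop, you can optionally confirm that the patch took effect. A minimal check, printing each NFS server Pod with its toleration values:

# kubectl get pod -n robinio -l robin.io/instance=robin-nfs -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.tolerations[*].tolerationSeconds}{"\n"}{end}'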
Upgrading from Robin CNP v5.4.3 HF3+PP to Robin CNP v5.5.0
Before upgrading from Robin CNP v5.4.3 HF3+PP to Robin CNP v5.5.0, perform the following steps:
1. Update the value of the suicide_threshold config parameter to 1800:

# robin config update agent suicide_threshold 1800

2. Set the NFS Server schedule CronJob so that it does not trigger for at least six months:

# rbash master
# rsql
# update schedule set kwargs='{"cron":"1 1 1 1 *"}' where callback='nfs_server_monitor';
# \q
# systemctl restart robin-server

3. Set the toleration seconds for all NFS server Pods to 86400 seconds. After the upgrade, you must change the toleration seconds according to the post-upgrade steps.

for pod in `kubectl get pod -n robinio -l robin.io/instance=robin-nfs --output=jsonpath={.items..metadata.name}`; do echo "Updating $pod tolerationseconds to 86400"; kubectl patch pod $pod -n robinio --type='json' -p='[{"op": "replace", "path": "/spec/tolerations/0/tolerationSeconds", "value": 86400}, {"op": "replace", "path": "/spec/tolerations/1/tolerationSeconds", "value": 86400}]'; done
25.1.3.4. Post-upgrade steps¶
After upgrading from Robin CNP v5.4.3 HF5 or Robin CNP v5.5.0-1841 to Robin CNP v5.5.0-1852
After upgrading from Robin CNP v5.4.3 HF5 or Robin CNP v5.5.0-1841 to Robin CNP v5.5.0-1852, perform the following steps:
1. Update the value of the suicide_threshold config parameter to 40:

# robin config update agent suicide_threshold 40

2. Enable the NFS Server Monitor schedule:

# robin schedule enable "NFS Server" Monitor

3. Set the check_helm_apps config parameter to False:

# robin config update cluster check_helm_apps False

4. Set the chargeback_track_k8s_resusage config parameter to False:

# robin config update server chargeback_track_k8s_resusage False

5. Set the robin_k8s_extension config parameter to True:

# robin config update manager robin_k8s_extension True

6. Verify whether the following mutating webhooks are present:

# kubectl get mutatingwebhookconfigurations -A | grep robin
k8srobin-deployment-mutating-webhook   1   20d
k8srobin-ds-mutating-webhook           1   20d
k8srobin-pod-mutating-webhook          1   20d
k8srobin-sts-mutating-webhook          1   20d
robin-deployment-mutating-webhook      1   20d
robin-ds-mutating-webhook              1   20d
robin-pod-mutating-webhook             1   20d
robin-sts-mutating-webhook             1   20d

7. If the above k8srobin-* mutating webhooks are not present, bounce the robink8s-serverext Pods:

# kubectl delete pod -n robinio -l app=robink8s-serverext

8. Verify whether the following validating webhooks are present:

# kubectl get validatingwebhookconfigurations
NAME                             WEBHOOKS   AGE
cert-manager-webhook             1          45h
controllers-validating-webhook   1          31h
ippoolcr-validating-webhook      1          31h
namespaces-validating-webhook    1          31h
pods-validating-webhook          1          31h
pvcs-validating-webhook          1          31h

9. If the robin-* mutating webhooks displayed in the step 6 output and the validating webhooks displayed in the step 8 output are not present on your setup, restart the robin-server-bg service:

# rbash master
# supervisorctl restart robin-server-bg

10. Set the toleration seconds for all NFS server Pods to 60 seconds for when the node is in the notready state, and to 0 seconds for when the node is in the unreachable state:

for pod in `kubectl get pod -n robinio -l robin.io/instance=robin-nfs --output=jsonpath={.items..metadata.name}`; do echo "Updating $pod tolerationseconds"; kubectl patch pod $pod -n robinio --type='json' -p='[{"op": "replace", "path": "/spec/tolerations/0/tolerationSeconds", "value": 60}, {"op": "replace", "path": "/spec/tolerations/1/tolerationSeconds", "value": 0}]'; done 2>/dev/null
After upgrading from Robin CNP v5.4.3 HF3+PP to Robin CNP v5.5.0
After upgrading from Robin CNP v5.4.3 HF3+PP to Robin CNP v5.5.0, perform the following steps:
1. Update the value of the suicide_threshold config parameter to 40:

# robin config update agent suicide_threshold 40

2. Enable the NFS Server Monitor schedule:

# robin schedule enable "NFS Server" Monitor

3. Set the check_helm_apps config parameter to False:

# robin config update cluster check_helm_apps False

4. Set the chargeback_track_k8s_resusage config parameter to False:

# robin config update server chargeback_track_k8s_resusage False

5. Set the robin_k8s_extension config parameter to True:

# robin config update manager robin_k8s_extension True

6. Delete the NFS Server schedule CronJob and restart the robin-server and robin-server-bg services:

# rbash master
# rsql
# DELETE from schedule where callback='nfs_server_monitor';
# \q
# supervisorctl restart robin-server
# supervisorctl restart robin-server-bg

7. Verify whether the following mutating webhooks are present:

# kubectl get mutatingwebhookconfigurations -A | grep robin
k8srobin-deployment-mutating-webhook   1   20d
k8srobin-ds-mutating-webhook           1   20d
k8srobin-pod-mutating-webhook          1   20d
k8srobin-sts-mutating-webhook          1   20d
robin-deployment-mutating-webhook      1   20d
robin-ds-mutating-webhook              1   20d
robin-pod-mutating-webhook             1   20d
robin-sts-mutating-webhook             1   20d

8. If the above k8srobin-* mutating webhooks are not present, bounce the robink8s-serverext Pods:

# kubectl delete pod -n robinio -l app=robink8s-serverext

9. Verify whether the following validating webhooks are present:

# kubectl get validatingwebhookconfigurations
NAME                             WEBHOOKS   AGE
cert-manager-webhook             1          45h
controllers-validating-webhook   1          31h
ippoolcr-validating-webhook      1          31h
namespaces-validating-webhook    1          31h
pods-validating-webhook          1          31h
pvcs-validating-webhook          1          31h

10. If the robin-* mutating webhooks displayed in the step 7 output and the validating webhooks displayed in the step 9 output are not present on your setup, restart the robin-server-bg service:

# rbash master
# supervisorctl restart robin-server-bg

11. Set the toleration seconds for all NFS server Pods to 60 seconds for when the node is in the notready state, and to 0 seconds for when the node is in the unreachable state:

for pod in `kubectl get pod -n robinio -l robin.io/instance=robin-nfs --output=jsonpath={.items..metadata.name}`; do echo "Updating $pod tolerationseconds"; kubectl patch pod $pod -n robinio --type='json' -p='[{"op": "replace", "path": "/spec/tolerations/0/tolerationSeconds", "value": 60}, {"op": "replace", "path": "/spec/tolerations/1/tolerationSeconds", "value": 0}]'; done 2>/dev/null
25.1.4. New Features¶
25.1.4.1. Robin Patroni Monitor¶
The Robin Patroni Monitor feature allows you to monitor the status of the Patroni instances (Pods) in a cluster. The Robin CNP architecture includes a highly available PostgreSQL cluster managed by Patroni, referred to as the Patroni Cluster.
To ensure high availability (HA), Patroni maintains three copies of its database, meaning a maximum of three Patroni instances are present in a cluster at any given time.
A Patroni cluster might become unavailable for a number of reasons. To monitor the status of the Patroni cluster, Robin CNP provides the Robin Patroni Monitor feature, which generates events as required.
Note
After you upgrade from the previous Robin CNP versions to Robin CNP v5.5.0, the Robin Patroni Monitor feature is automatically enabled.
Also, in the Robin CNP v5.5.0 release, the robin event-type list command displays the following event types related to Patroni, which are generated when there are changes in the status of the Patroni replicas:
EVENT_PATRONI_LEADER_CHANGE
EVENT_PATRONI_INSTANCE_NOT_READY
EVENT_PATRONI_INSTANCE_FAILED
EVENT_PATRONI_INSTANCE_READY
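To confirm that the new event types are available after the upgrade, you can filter the event-type list:

# robin event-type list | grep PATRONI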
For more information, see Robin Patroni Monitor.
25.1.4.2. Robin Certificate Management¶
Starting with Robin CNP v5.5.0, you can manage all certificates for your cluster without manual intervention using the Robin certificate management feature. Robin CNP uses the functionality of cert-manager for this feature. cert-manager is a native Kubernetes certificate management controller. It helps in issuing certificates from various certificate authorities, such as Let's Encrypt, Entrust, DigiCert, HashiCorp Vault, and Venafi. It can also issue certificates from a local CA (self-signed).

cert-manager adds Certificate and Issuer resources in Kubernetes clusters, which simplifies the process of obtaining, generating, and renewing the certificates for the cluster. For more information, see cert-manager.

The Robin certificate management feature manages certificates only for Robin internal services deployed in the robinio namespace. It also ensures that all certificates are valid and up to date, and it automatically renews certificates before they expire.

The Robin certificate management feature has the following certificate issuers:

cluster-issuer - responsible for all certificates used internally by the various control plane services.

ident-issuer - responsible for the Cluster Identity certificate used by all outward-facing services, such as the Kubernetes API server, the Robin client, and the GUI.
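Because cert-manager represents these as standard Kubernetes resources, you can inspect what the feature has configured. A minimal sketch, assuming the default cert-manager CRDs:

# kubectl get issuers,certificates -n robinio
# kubectl describe certificate <certificate name> -n robinio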
Points to consider for Robin Certificate Management feature
When you install or upgrade to Robin CNP v5.5.0, cert-manager is deployed by default, and a new service named robin-cert-monitor is deployed to monitor the state of all certificates required by various Pods and containers in the Robin CNP cluster, ensuring that all required certificates exist and are valid.

During installation or upgrade to Robin CNP v5.5.0, only the cert-manager option is supported. If you want to manage your cluster's certificates using the local control mode, you can use the robin cert reset-cluster-certs command to enable local control mode.

You can have only one cert-manager instance in a cluster.

If your cluster is already installed with a Cluster Identity certificate signed by an external CA, you must reconfigure it using the robin cert reset-cluster-identity command after upgrading to Robin CNP v5.5.0.

If you want to use a Cluster Identity certificate signed by an external CA after installing Robin CNP v5.5.0, you can use the robin cert reset-cluster-identity command to configure it.

If you want to install Robin CNP v5.5.0 with both a Cluster Identity certificate signed by an external CA and cert-manager, you must pass the following options in the config.json file for one of the master nodes. For more information, see Installation with Custom Cluster Identity certificate.

ident-ca-path

ident-cert-path

ident-key-path

You cannot install your own cert-manager on a Robin CNP cluster. If you want to use the functionality of cert-manager, use the cert-manager instance deployed as part of the Robin certificate management feature to create Issuers and Certificates in other namespaces.
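For example, you can use the bundled cert-manager to issue a certificate for your own workload in another namespace. The following is a generic cert-manager sketch with illustrative names (demo namespace, demo-tls certificate), not a Robin-specific API:

# kubectl apply -f - <<'EOF'
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: demo-selfsigned
  namespace: demo
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: demo-tls
  namespace: demo
spec:
  secretName: demo-tls      # Secret where the issued key pair is stored
  duration: 2160h           # 90 days
  renewBefore: 360h         # renew 15 days before expiry
  dnsNames:
    - demo.example.com
  issuerRef:
    name: demo-selfsigned
    kind: Issuer
EOF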
For more information, see Robin Certificate Management.
25.1.4.3. Large cluster support¶
Starting with Robin CNP v5.5.0, support for large clusters is available. You can now have a Robin CNP cluster with up to 110 nodes.
25.1.5. Improvements¶
25.1.5.1. Support for SSL certificate-based authentication for Kafka Subscribers¶
Starting with Robin CNP v5.5.0, Robin CNP supports SSL certificate-based authentication for Kafka subscribers for alerts and events.
Prior to this release, Robin CNP only supported username and password-based authentication.
To set up SSL certificate-based authentication, you must specify the following certificates and key as part of the robin subscriber add command:
CA certificate
Client certificate
Client key
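The exact option names are documented under Registering a Robin subscriber. As an illustrative sketch only (the flag names below are assumptions, not confirmed CLI syntax), the command could take a form like:

# robin subscriber add kafka-alerts --ca-cert /path/ca.crt --client-cert /path/client.crt --client-key /path/client.key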
For more information, see Registering a Robin subscriber.
25.1.5.2. Support for MetalLB with BGP peering¶
Starting with Robin CNP v5.5.0, Robin CNP supports MetalLB layer 3 mode with Border Gateway Protocol (BGP) peering. In this mode, each node in the cluster establishes a BGP peering session with the upstream router and advertises the load balancer IP address assigned to a service over each BGP peering. As a result, the upstream router has multiple routes for each load balancer IP address. When the router receives traffic, it selects one of the nodes that advertised the load balancer IP address and sends the traffic to that node.
You can set up MetalLB either during the Robin CNP installation or post-installation.
Note
Robin CNP supports MetalLB with BGP peering in FRR mode.
For more information, see Load Balancer Support using MetalLB.
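As a generic illustration of MetalLB's BGP resources (standard MetalLB CRDs with placeholder ASNs and addresses, not Robin-specific values), the peering and advertisement could be declared as follows:

# kubectl apply -f - <<'EOF'
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: upstream-router
  namespace: metallb-system
spec:
  myASN: 64512            # local ASN used by the cluster nodes
  peerASN: 64513          # ASN of the upstream router
  peerAddress: 10.0.0.1   # placeholder router address
---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: lb-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.0.2.10-192.0.2.20   # placeholder LoadBalancer IP range
---
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: lb-advertisement
  namespace: metallb-system
spec:
  ipAddressPools:
    - lb-pool
EOF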
25.1.5.3. New Node Level Events¶
Robin CNP v5.5.0 provides the following new events to enhance the system’s ability to monitor and detect node readiness issues at both the Kubernetes and service/component levels:
EVENT_NODE_K8S_NOTREADY - This event is generated when a node is marked as down due to an issue with a Kubernetes component. It is a warning alert.
EVENT_NODE_K8S_READY - This event is generated when a node is up after being marked as down. It is an info alert.
EVENT_NODE_NOTREADY - This event is generated when a node is marked as not ready due to an unhealthy service or component. It is a warning alert.
EVENT_NODE_READY - This event is generated when a node is ready after being marked as not ready. It is an info alert.
25.1.5.4. PostgreSQL’s Archive mode is Disabled¶
Starting with Robin CNP 5.5.0, PostgreSQL’s archive mode is disabled. When archive mode is enabled, WAL files are closed and switched at regular intervals, even if minimal or no data is written during that time, resulting in unnecessary resource usage.
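You can verify the setting from the PostgreSQL (Patroni) instance; a minimal check, assuming psql access from within the Patroni Pod:

# psql -U postgres -c 'SHOW archive_mode;'

The expected output is off.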
25.1.5.5. New Metrics¶
Robin CNP v5.5.0 provides new metrics in the following categories:
Manager Services

robin_manager_services_robin_server
robin_manager_services_consul_server
robin_manager_services_robin_event_server
robin_manager_services_stormgr_server
robin_manager_services_pgsql
robin_manager_services_robin_master

Agent Services

robin_agent_services_robin_agent
robin_agent_services_iomgr_service
robin_agent_services_monitor_server
robin_agent_services_consul_client

Node Metrics

robin_node_state
robin_node_maintenance_mode

Disk Metrics

robin_disk_state
robin_disk_maintenance_mode

Volume Metrics

robin_vol_storstatus
robin_vol_status
robin_vol_mount_node_id
robin_vol_snapshot_space_used
robin_vol_snapshot_space_limit
robin_vol_total_snapshot_count
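Once the Metrics feature is enabled, these series can be queried from Prometheus like any other metric. For example (the Prometheus host is a placeholder):

# curl -s 'http://<prometheus-host>:9090/api/v1/query' --data-urlencode 'query=robin_node_state'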
25.1.5.6. Superadmin with limited capabilities¶
Starting with Robin CNP v5.5.0, new user capabilities allow you to create a superadmin user with limited capabilities.
You can create a superadmin user with limited capabilities by disabling the following newly added user capabilities:
ManageUserCapabilities - When this capability is disabled, the user cannot create, edit, or delete custom user capabilities.

ManageAdministratorsTenant - When this capability is disabled, the user cannot manage resources and users in the Administrators tenant.

AddSelfToTenants - When this capability is disabled, the user cannot add themselves as a member of other tenants in the cluster.
Note
These user capabilities are enabled by default for the superadmin user. You must disable these capabilities if you need to create a superadmin user without them.
For more information, see Superadmin with limited capabilities.
25.1.6. Fixed Issues¶
Reference ID | Description
---|---
RSD-8104 | The issue of the
RSD-8150 | The issue of the device evacuation operation failing when a replica is allocated to a drive marked for evacuation is fixed.
PP-37627 | The issue of volume expansion operations failing despite having sufficient disk space available in the pool is fixed.
PP-37694 | The issue of Pods failing to start, due to the system reaching the file descriptor limit on CentOS with Dockershim, is fixed. In this version, the limit on the number of open files in the Dockershim service is increased to a higher value.
RSD-8846 | The Robin CAT profiles feature does not work as expected on RHEL 8.10. This issue is fixed.
RSD-8083 | The IO hang issue observed on clusters with large disk sizes is fixed.
RSD-5711 | In Robin CNP v5.4.3, the
RSD-8083 | Dev slice leader change tasks delayed epoch update tasks, resulting in IO timeouts on the application side. This issue is fixed.
RSD-8854 | The issue of the IOMGR service crashing on a node when it came back online after a reboot is fixed.
RSD-8083 | The current resource (CPU and memory) limits for Patroni in Robin CNP are insufficient and result in cluster performance issues. This issue is fixed by increasing the resource (CPU and memory) limits for Patroni in Robin CNP v5.5.0.
PP-37695 | The issue where Pods are restricted to accessing memory from a single NUMA node, limiting workloads that require larger memory pools, is fixed. Starting with Robin CNP v5.5.0, a new annotation can be specified to bypass the NUMA restriction.
PP-38088 | The issue of volume evacuation not being supported is fixed, and the corresponding error message no longer appears during the evacuation operation.
PP-38008 | When you perform a disk evacuation operation on a disk that is hosting volume leader slices, and at the same time, if other replicas of the slices are marked as
PP-38038 | The issue of Robin CNP v5.5.0 not supporting scaling up and scaling down of resources (CPU and memory) for Robin Bundle apps is fixed.
25.1.7. Known Issues¶
PP-21916

Symptom: A Pod IP is not pingable from any other node in the cluster, apart from the node where it is running.

Workaround: Bounce the Calico Pod running on the node where the issue is seen.

PP-30247

Symptom: After upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.5.0, the RWX apps might report the following error event type: wrong fs type, bad option, bad superblock on /dev/sdj, missing codepage or helper program, or other error.

Workaround: To resolve this issue, contact the Robin Customer Support team.

PP-30398

Symptom: After removing an offline master node from the cluster and power cycling it, the removed master node is automatically added back as a worker node.

Workaround:

PP-34226

Symptom: When a PersistentVolumeClaim (PVC) is created, the CSI provisioner initiates a

Workaround: Bounce the CSI provisioner Pod.

# kubectl delete pod -n robinio

PP-34414

Symptom: In rare scenarios, the IOMGR service might fail to open devices in exclusive mode when it starts because other processes are using these disks. You might observe the following issue:

Steps to identify the issue:

Workaround: If the device is not in use, restart the IOMGR service on the respective node:

# supervisorctl restart iomgr

PP-34451

Symptom: In rare scenarios, the RWX Pod might be stuck in the

mount.nfs: mount system call failed

Perform the following steps to confirm the issue:

If you notice any input and output errors in step 4, apply the following workaround:

Workaround:

PP-34457

Symptom: If the Metrics feature is enabled on your Robin CNP cluster and you are using Grafana for monitoring, after upgrading the cluster from any supported Robin CNP versions to Robin CNP v5.4.3 HF5, the Grafana metrics will not work.

Note: You need to take a backup of the configmaps of the Prometheus and Grafana apps in the

Workaround: You need to stop and restart the Metrics feature.

PP-34492

Symptom: When you run the

Workaround:

PP-35478

Symptom: In rare scenarios, the kube-scheduler may not function as expected when many Pods are deployed in a cluster due to issues with the kube-scheduler lease.

Workaround: Complete the following workaround steps to resolve issues with the kube-scheduler lease:

PP-36865

Symptom: After rebooting a node, the node might not come back online for a long time, and the host BMC console displays the following message for RWX PVCs mounted on that node:

Workaround: Power cycle the host system.

PP-37330

Symptom: During or after upgrading to Robin CNP v5.5.0, the following error might appear:

/bin/mount /dev/sdn /var/lib/robin/nfs/robin-nfs-shared-35/ganesha/pvc-822e76f0-9bb8-4629-8aae-8318fb2d3b41 -o discard failed with return code 32: mount: /var/lib/robin/nfs/robin-nfs-shared-35/ganesha/pvc-822e76f0-9bb8-4629-8aae-8318fb2d3b41: wrong fs type, bad option, bad superblock on /dev/sdn, missing codepage or helper program, or other error.

Workaround: If you notice this issue, contact the Robin Customer Support team for assistance.

PP-37416

Symptom: In rare scenarios, when upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.5.0, the upgrade might fail with the following error during the Kubernetes upgrade process on other master nodes:

Failed to execute kubeadm upgrade command for K8S upgrade. Please make sure you have the correct version of kubeadm rpm binary installed

Steps to identify the issue:

Workaround: If you notice the above error, restart the kubelet and rerun the upgrade:

# systemctl restart kubelet

PP-37965

Symptom: In Robin CNP v5.5.0, when you scale up a Robin Bundle app, it does not consider the existing CPU cores and memory already in use by a vnode. As a result, Robin CNP is not able to find a suitable host, even though additional resources are available.

Workaround: If you notice this issue, apply the following workaround:

# robin app computeqos <appname> --role <rolename> --cpus <newcnt> --memory <newmem> --wait
# robin app stop <appname> --wait
# robin app computeqos <appname> --role <rolename> --cpus <newcnt> --memory <newmem> --wait

PP-38039

Symptom: During node reboot or power reset scenarios, application volumes may be force shut down due to I/O errors. As a result, application Pods might get stuck in Context Deadline Exceeded. On the affected node where the volume is mounted or the application Pod is scheduled, the following error might be observed in the

Workaround: If you notice this issue, contact the Robin Customer Support team for assistance.

PP-38044

Symptom: When attempting to detach a repository from a hydrated Helm application, the operation might fail with the following error: Can't detach repo as the application is in IMPORTED state, hydrate it in order to detach the repo from it. This issue occurs even if the application has already been hydrated. The system incorrectly marks the application in the

Workaround: To detach the repository, manually rehydrate the application and then retry the detach operation:

PP-38061

Symptom: In rare scenarios, when upgrading from Robin CNP v5.4.3 HF5 to Robin CNP v5.5.0, the upgrade may get stuck while executing Robin upgrade actions on the primary master node because some of the hosts are not in the

Steps to identify the issue:

Workaround: If any host in the cluster is in the

PP-38071

Symptom: Application creation might fail with the following error: Failed to mount volume : Node has mount_blocked STORMGR_NODE_BLOCK_MOUNT. No new mounts are allowed. This issue occurs when a node enters a mount-blocked state.

Workaround: Try to create the application after 15 minutes.

PP-38078

Symptom: After a network partition, the robin-agent and iomgr-server may not restart automatically, and stale devices may not be cleaned up. This issue occurs because the consulwatch thread responsible for monitoring Consul and triggering restarts may fail to detect the network partition. As a result, stale devices may not be cleaned up, potentially leading to resource contention and other issues.

Workaround: Manually restart the robin-agent and iomgr-server:

# supervisorctl restart robin-agent iomgr-server

PP-38087

Symptom: In certain cases, the snapshot size allocated to a volume could be less than what is requested. This occurs when the volume is allocated from multiple disks.

PP-38397

Symptom: When upgrading from supported Robin CNP versions to Robin CNP v5.5.0, the Robin upgrade process might fail due to a Docker installation failure caused by missing dependencies. This issue occurs when the cluster is missing the fuse-overlayfs and slirp4netns packages, which are required by the new Docker version. The upgrade process removes the existing Docker version but fails to install the new version, and the Docker service file gets masked, preventing Docker from starting.

Workaround:

PP-38471

Symptom: When StatefulSet Pods restart, the Pods might get stuck in the

Workaround: If you notice this issue, restart the csi-nodeplugin Pod:

# kubectl delete pod <csi-nodeplugin> -n robinio
25.1.8. Technical Support¶
Contact the Robin Technical Support team for any assistance.