Release Notes
#############

Robin Cloud Native Storage v6.0.0-226
***************************************

The Robin CNS v6.0.0-226 release notes document provides information about upgrade paths, new features, an improvement, fixed issues, and known issues.

**Release Date:** March 13, 2026 

.. Robin CNS v6.0.0-211 was released on February 23, 2026. Robin CNS v6.0.0-211 and Robin CNS v6.0.0-226 are merged to do only 226 release as per Santosh's ask (Penn is not using 211 build no.)

Upgrade Paths
==============

The following are the supported upgrade paths for Robin CNS v6.0.0-226:

- Robin CNS v5.4.18-257 to Robin CNS v6.0.0-226
- Robin CNS v5.4.18-278 to Robin CNS v6.0.0-226
- Robin CNS v5.4.18-281 to Robin CNS v6.0.0-226

.. Note:: - After upgrading to Robin CNS v6.0.0-226, if you are using the Robin Client outside the ``robincli`` Pod, you must upgrade to the latest version of the Robin Client.
          - If you have installed Robin CNS with the ``skip_postgres_operator`` parameter to use the Zalando PostgreSQL operator, then you must first upgrade the Zalando PostgreSQL operator to v1.11.0 or later before upgrading to Robin CNS v6.0.0-226.

New Features
=============

Data Locality Tracking for Volumes
-----------------------------------
Starting with Robin CNS v6.0.0, Robin CNS provides visibility into the data locality ratio for each volume mount. The data locality ratio indicates the percentage of a volume's data that is physically stored on the same node where the volume is currently mounted. 

When the data locality ratio for a volume is high, most I/O operations are served locally without network hops, which lowers latency and improves storage performance.

The data locality ratio is reported as a percentage between 0% and 100%.

* **Hundred percent (100%)** - All leader data for a volume resides on the mount node (fully local).
* **Zero percent (0%)** - None of the leader data for a volume resides on the mount node (fully remote).
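
As a rough illustration of how such a percentage can be derived, the following Python sketch computes a locality ratio from per-slice leader placement. The function name, the ``(leader_node, size)`` input format, and the size-weighted calculation are assumptions made for this example only; Robin CNS might compute the ratio differently internally.

.. code-block:: python

      def data_locality_percent(slices, mount_node):
          """Return the share of leader data (in percent) held on mount_node.

          slices: iterable of (leader_node, size_in_bytes) pairs (hypothetical input).
          """
          slices = list(slices)
          total = sum(size for _, size in slices)
          if total == 0:
              return 0.0  # assumption: an empty volume is reported as 0%
          local = sum(size for node, size in slices if node == mount_node)
          return 100.0 * local / total


      # Hypothetical volume with four equally sized slices; three leaders are on node-1.
      example = [("node-1", 1024), ("node-1", 1024), ("node-1", 1024), ("node-2", 1024)]
      print(data_locality_percent(example, "node-1"))  # 75.0

In this hypothetical case, a mount on ``node-1`` would be 75% local, while a mount on a node that holds no leader data would be 0% local.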

**Viewing data locality ratio**

You can view the data locality ratio for a volume in the following ways:

* The ``Mount`` column in the output of the ``robin volume list`` command.
* The ``Data Locality`` column in the Mounts table in the output of the ``robin volume info`` command.
* The ``mount_data_locality`` label in the ``robin_vol_mount_node_ids`` metric.

For more information, see `Data Locality Tracking for Volumes <manage_storage.html#data-locality-tracking-for-volumes>`__.

New Metrics
------------

Robin CNS v6.0.0 provides the following new metrics for the CSI sidecar containers and Manager services categories:

CSI sidecar containers
^^^^^^^^^^^^^^^^^^^^^^^^

* **_total (Counters)** - Total number of CSI operations (e.g., ``CreateVolume``, ``DeleteVolume``).
* **_errors_total (Counters)** - Total number of errors encountered during CSI operations.
* **_duration_seconds (Histograms)** - Duration and latency of CSI calls in seconds.

**Example**

.. code-block:: text

      csi_sidecar_operations_seconds_count{driver_name="robin",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerGetCapabilities"} 1
      csi_sidecar_operations_seconds_sum{driver_name="robin",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerGetCapabilities"} 0.000508898
      csi_sidecar_operations_seconds_bucket{driver_name="robin",grpc_status_code="OK",method_name="/csi.v1.Identity/Probe",le="0.1"} 1
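
Because the ``_sum`` and ``_count`` samples belong to a histogram, dividing one by the other gives the mean call duration. The following Python sketch does this for two of the sample lines above. The helper functions are hypothetical and handle only this simple ``name{labels} value`` form, not the full Prometheus exposition format.

.. code-block:: python

      import re

      SAMPLE = '''
      csi_sidecar_operations_seconds_count{driver_name="robin",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerGetCapabilities"} 1
      csi_sidecar_operations_seconds_sum{driver_name="robin",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerGetCapabilities"} 0.000508898
      '''

      LINE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
      LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


      def parse_samples(text):
          """Yield (metric_name, labels_dict, value) for each line that matches."""
          for line in text.strip().splitlines():
              match = LINE_RE.match(line.strip())
              if match:
                  labels = dict(LABEL_RE.findall(match.group("labels")))
                  yield match.group("name"), labels, float(match.group("value"))


      def mean_duration_seconds(text, method_name):
          """Mean duration for one CSI method: histogram _sum divided by _count."""
          total = count = 0.0
          for name, labels, value in parse_samples(text):
              if labels.get("method_name") != method_name:
                  continue
              if name.endswith("_sum"):
                  total += value
              elif name.endswith("_count"):
                  count += value
          return total / count if count else None


      print(mean_duration_seconds(SAMPLE, "/csi.v1.Controller/ControllerGetCapabilities"))

In a monitoring system, the same ratio is usually computed over a time window rather than from a single scrape; this sketch only shows the arithmetic.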

Manager services
^^^^^^^^^^^^^^^^^
* robin_manager_services_auth_server
* robin_manager_services_node_monitor_server
* robin_manager_services_control_infra_server

For more information, see `Storage Metrics <manage_storage.html#storage-metrics>`__.

Improvement
=============

Enhanced Master Pods upgrade strategy 
------------------------------------------
Starting with Robin CNS v6.0.0, the Robin master Pods are upgraded sequentially to ensure continuous service and prevent downtime.

With this enhancement, Robin CNS first upgrades the standby master Pods. Once the standby master Pods are upgraded successfully, it upgrades the active master Pod.
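
The ordering described above can be sketched in Python as follows. This is only an illustration of the sequence: the ``upgrade_pod`` callback, the Pod names, and the choice to stop when a standby upgrade fails are assumptions, not the actual Robin CNS implementation.

.. code-block:: python

      def upgrade_masters(standby_pods, active_pod, upgrade_pod):
          """Upgrade standby master Pods one at a time, then the active master Pod.

          upgrade_pod is a hypothetical callback that returns True on success.
          Returns the list of Pods that were upgraded, in order.
          """
          upgraded = []
          for pod in standby_pods:
              if not upgrade_pod(pod):
                  # Assumption: leave the active master untouched if a standby fails.
                  return upgraded
              upgraded.append(pod)
          if upgrade_pod(active_pod):
              upgraded.append(active_pod)
          return upgraded


      # Hypothetical Pod names; every upgrade succeeds in this run.
      order = upgrade_masters(["robin-master-1", "robin-master-2"], "robin-master-0", lambda pod: True)
      print(order)  # ['robin-master-1', 'robin-master-2', 'robin-master-0']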

Fixed Issues 
=============

=============    =======================================================================================================================
Reference ID     Description
=============    =======================================================================================================================
RSD-10850        The issue that caused unnecessary slice leader changes during iomgr-server restarts is fixed.
RSD-10242        The Robin worker Pod crashed because the ``/etc/robin/rcm/config.ini`` file was empty. This issue is fixed.
RSD-10160        The issue where ``VolumeAttachment`` jobs could get stuck due to a race condition and lock contention within the job manager and prevent volume mount and unmount operations from progressing is fixed.
RSD-10372        The lease mechanism in Robin CNS failed to handle temporary etcd connection interruptions, resulting in frequent Robin master failovers. This issue is fixed.
RSD-11041        The issue of the Robin cluster resource reporting the ``NotReady`` status for an extended period is fixed. The backoff time is adjusted so that the cluster status is reported accurately.
RSD-10603        During the Robin CNS upgrade, IOMgr failed to start because of the empty ``/etc/consul/client/checks.json`` file on the node. This issue is fixed. 
RSD-11044        The priorityClass of the ``robin-tcmuproxy`` DaemonSet was not set to ``critical``. This issue is fixed, and it is now set to ``critical``.
RSD-11042        The issue of the standby Robin master Pod failing its readiness probe and becoming unhealthy due to a mismatch between the in-memory phase state and the value stored in ``/etc/robin/robin_master/phase`` is fixed.
RSD-11147        When the ``robin-patroni-pre-install-hook`` job failed on a node during the Robin CNS installation, the job was not rescheduled on that node, which caused the installation to fail. This issue is fixed.
RSD-11233        The issue of the Robin cluster remaining in the ``NotReady`` state because the host got stuck in the ``INIT_FAILED`` state due to Kubernetes API server flakiness is fixed. The Robin CNS operator now retries in case of failure.
RSD-11238        For a multi-attach volume, when one of the nodes became faulty, in addition to the faulty node, the volume was incorrectly unmounted on the other nodes. This issue is fixed.
RSD-11258        During the initial bootstrap, a namespace creation failure caused the creation of the default system user to fail in the first iteration, although it succeeded in a subsequent iteration. As a result, internal API calls used a stale token, causing authorization failures. This issue is fixed.
RSD-11319        When Robin CNS was reinstalled on the same Kubernetes cluster, the stale ``/home/robinds`` folder from the previous Robin CNS installation caused the IOMGR to get stuck in the ``NotReady`` state. This issue is fixed.
=============    =======================================================================================================================

Known Issues
=============

=============    =======================================================================================================================
Reference ID     Description
=============    =======================================================================================================================
PP-41215         **Symptom**

                 In rare scenarios, a VM volume's slice leader can temporarily appear on two nodes. This happens if the I/O Manager (IOMgr) is down for an extended period, active I/O occurs on volumes, and the IOMgr is then restarted. The issue is caused by a race condition between the Robin Cluster Manager (RCM) updating node and device states and the IOMgr initiating remounts.
                 However, this issue corrects itself: once the volume slices resynchronize, the slice leader automatically consolidates to the node where the volume is mounted.

PP-40480         **Symptom**

                 In rare scenarios, you might observe that one of the Pods is stuck in the ``ContainerCreating`` state, and the ``kubectl describe pod`` command shows the following volume mount error:

                 *Failed to mount volume pvc-d16fa6b1-5bcb-4c69-805d-ab4df9018cee: Node <default:vnode-87-237> has mount_blocked STORMGR_NODE_BLOCK_MOUNT. No new mounts are allowed.*

                 **Workaround**

                 Restart the worker Pod running on the affected node.
PP-39632         **Symptom**

                 After upgrading to Robin CNS v6.0.0, the NFS client might hang with no pending IO message.

                 To check for no pending IO, refer to the ``/var/log/robin/nodeplugin/robin-csi.log`` file, which contains messages similar to the following:

                 .. code-block:: text

                        CsiServer_9 - robin.utils - INFO - Executing command 
                        /usr/bin/nc -z -w 6 2049 with timeout 60 seconds
                        CsiServer_9 - robin.utils - INFO - Command 
                        /usr/bin/nc -z -w 6 2049 completed with return code 0.
                        CsiServer_9 - robin.utils - INFO - Standard out:

                 You can also find the following message in the ``dmesg`` output:

                 .. code-block:: text
              
                        nfs: server 192.02.1.218 not responding, timed out
                        nfs: server 192.02.1.218 not responding, timed out
                        nfs: server 192.02.1.218 not responding, timed out

                 **Workaround**

                 1. Check the node provisioner logs to identify the PVC for which the path check is hung.
                 2. For the Deployment or StatefulSet that uses the problematic PVC, scale down the replica count to ``0``.
                 3. Ensure that all Pods associated with the application have terminated.
                 4. Scale the replica count back up to the original value.
PP-34414         **Symptom**

                 In rare scenarios, the ``IOMGR`` service might fail to open devices in exclusive mode when it starts because other processes are using these disks. When this happens, all app Pods restart, and some app Pods get stuck in the ``ContainerCreating`` state.

                 To confirm this issue, complete the following steps:

                 1. Check for the ``EVENT_DISK_FAULTED`` event type in the disk events:

                    .. code-block:: text

                          # robin event list --type EVENT_DISK_FAULTED

                 2. If you see a disk faulted event, check the ``IOMGR`` logs on the node where the disks are present for the **dev_open()** error and the **Failed to exclusively open** message:

                    .. code-block:: text

                          # cat iomgr.log.0 | grep <device> | grep "dev_open"

                 3. If you see the **Device or resource busy** error in the log file, use the ``fuser`` command with the device path to confirm whether the device is in use:

                    .. code-block:: text

                          # fuser /dev/disk/by-id/scsi-SATA_Micron_M500_MTFD_1401096049D5

                 **Workaround**

                 If the device is not in use, restart the ``IOMGR`` service on the respective node:

                 .. code-block:: text

                     # supervisorctl restart iomgr-server
=============    =======================================================================================================================

.. Technical Support
.. =================

.. Contact `Robin Technical support <https://www.robin.io/support/>`_ for any assistance.