6. Managing Storage

Robin discovers disks attached to the nodes and uses them for providing storage to applications. This applies to local disks, cloud volumes, and SAN stoage that are available to nodes which have the storage role assigned to them. During the process Robin collects all the required metadata about the disks and identifies them as HDD, SSD or NVMes. All the elligible disks (this excludes disks with parititions and/or without WWNs) are marked as Robin storage.

Cloud volumes like EBS volumes on AWS and Persistent disks on GCP are not physically tied to a cloud compute instance. They can be detached from one cloud compute instance and attached to another instance at any time. Logical Units from a Storage Array Network can similarly be visible from multiple physical servers in the same storage fabric or network. Robin assigns a primary host for these devices and ensures that all IO access to a device only happens through one server at a given time. As with local disks, these devices are automatically discovered by Robin and registered. However they are considered to be re-attachable given their properties. Robin makes sure these re-attachable disks are always accessible even in the event of node failure. When a node goes down, Robin detaches the disk from failed node and attaches it to a healthy node so that application data remains accessible.

Note

Whilst installing if it is known that certain disks will not have a WWN, the --set-uuid option can be provided via the installer alongside a list of disks on which there is no WWN. This in turn will result in Robin stamping a UUID on the disk and thus enabling it to be used as Robin storage.

Robin allows multiple operations, detailed below, to be performed on registered disks.

robin disk attach

Attach an existing disk

robin disk detach

Detach a registered disk

robin disk create

Provision and attach a cloud disk

robin disk list

List disks

robin disk info

Display detailed information about a disk

robin disk evacuate

Evacuate allocations off a disk

robin disk remove

Remove a disk

robin disk update

Update the attributes of a disk

robin disk unfault

Unfault a disk

robin volume snapshot-space-limit

Change snapshot space limit

6.1. Attaching a disk

If a disk was previously detached from a host due to the host undergoing decommissioning or to rebalance the storage available on the cluster one will need to attach the disk back to another host in the cluster. Issue the following command to do so:

Note

This command is only for supported for re-attachable disks. Moreover if a host became unreachable Robin will automatically detach the disk and choose a new server to reattach it to based on the accessibility of the disk and the load on the other servers.

# robin disk attach <wwn>
                    --hostname <hostname>
                    --force

wwn

WWN of disk to attach

--hostname <hostname>

Name of host to attach disk to. Note this is a mandatory parameter.

--force

Forcibly re-attach a disk that is already ONLINE

Example:

# robin disk detach 0xQEMU_QEMU_HARDDISK_3c71c872-fe13-4fd5-b --wait
Job:  187 Name: DiskAttach     State: VALIDATED       Error: 0
Job:  187 Name: DiskAttach     State: WAITING         Error: 0
Job:  187 Name: DiskAttach     State: FINALIZED       Error: 0
Job:  187 Name: DiskAttach     State: COMPLETED       Error: 0

Attaches a disk, which might have previously been detached due to the host undergoing decommissioning or a rebalancing of storage disks, to a particular host.

End Point: /api/v3/robin_server/disks/<disk_wwn>

Method: PUT

URL Parameters: None

Data Parameters:

  • action: attach - This mandatory field within the payload specifies that the attach operation is to be performed.

  • hostname: <hostname> - Utilizing this parameter results in the disk being attached to the specified host.

  • force: true - Utilizing this parameter enables one to force re-attach an ONLINE disk.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid Api Usage Error)

Example Response:

Output
{
   "jobid": 128
}

6.2. Detaching a disk

If a physical host is temporarily or permanently taken down and one can detach its storage disks to ensure future access. Issue the following command to do so:

Note

This command is only for supported for re-attachable disks.

# robin disk detach <wwn>
                    --hostname <hostname>
                    --force

wwn

WWN of disk to detach

--hostname <hostname>

Name of host to which disk is attached to. Note this is an optional parameter

--force

Forcibly detach a disk that is currently ONLINE

Example:

# robin disk detach 0xQEMU_QEMU_HARDDISK_3c71c872-fe13-4fd5-b --wait
Job:  187 Name: DiskDetach     State: VALIDATED       Error: 0
Job:  187 Name: DiskDetach     State: WAITING         Error: 0
Job:  187 Name: DiskDetach     State: FINALIZED       Error: 0
Job:  187 Name: DiskDetach     State: COMPLETED       Error: 0

Detaches a disk from a host.

End Point: /api/v3/robin_server/disks/<disk_wwn>

Method: PUT

URL Parameters: None

Data Parameters:

  • action: detach - This mandatory field within the payload specifies that the detach operation is to be performed.

  • hostname: <hostname> - Utilizing this parameter results in the disk being detached from the specified host. This is optional as the host to which the disk is attached can be discovered implicitly.

  • force: true - Utilizing this parameter enables one to force detach an ONLINE disk.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid Api Usage Error)

Example Response:

Output
{
   "jobid": 130
}

6.3. Provisioning a disk

Robin provides a utility through which it can provision disks of any size, attach them to hosts and discover them automatically on multiple cloud platforms. This enables users to expand the storage available for their cluster with convenience and ease. Detailed below are the general options for the robin drive create command, followed by specific examples for each supported cloud environment.

# robin disk create <hostname>
                    --type <type>
                    --number <number>
                    --size <size>
                    --iops <iops>

hostname

Name of host to attach disk to after creation

--type <type>

Type of disks. Choices for GCP include: pd-ssd, pd-standard. Choices for AWS include: gp2, io1, st1. Choices for Anthos include: independent-persistent.

--number <number>

Number of disks to be created. The default value is 1

--size <size>

Size of disk to be created in GB. The default value is 500 GB

--iops <iops>

IOPs for AWS ‘io1’ disk type. Can be between 100-160000

Provisions disks of any size, attaches and discovers them automatically on cloud based nodes.

End Point: /api/v3/robin_server/disks/

Method: POST

URL Parameters: None

Data Parameters:

  • hostname: <hostname> - This mandatory field within the payload specifies the host to which the provisioned disk should be attached.

  • type: <disk_type> - This mandatory field within the payload specifies the type of disk to be created. Supported types include: pd-ssd, pd-standard (for GCP); gp2, io1, st1 (for AWS); independent-persistent (for Anthos).

  • number: <num_of_disks> - This mandatory field within the payload specifies the number of disks to be created. It should be an integer value.

  • size: <size_of_disk> - This mandatory field within the payload specifies the size of the disks to be created. It should be a string. An example value is ‘200GB’.

  • iops: <iops_of_disk> - Utilizing this parameter results in disks that can handle a maximum number of IOPs equal to <iops_of_disks> being created. Note this parameter is only valid for the ‘io1’ disk type for AWS nodes.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid Api Usage Error)

Example Response:

Output
{
   "jobid": 196
}

On Google Cloud Platform, you can attach disks to your instance via the UI or Google APIs and have them available for use by Robin by running the below command:

$ robin host probe <hostname> --rediscover

On the other hand you can utilize Robin to provision disks in GCP to use for application deployment. To create 100 GB disk in GCP, run following command:

$ robin disk create <hostname> --type <pd-standard | pd-ssd> --size 100

These disks will be attached automatically and auto discovered by Robin so they will be ready to use straightaway.

Note

Due to Robin’s advanced feature to make sure disks are always accessible, it needs the manage disks permission to be selected while deploying cluster on GCP.

On the Google Anthos platform, you can add disks to cluster VMs from vSphere and have them available for use by Robin by running the below command:

$ robin host probe <hostname> --rediscover

On the other hand you can utilize Robin to provision virtual disks to use for application deployment. To create 100 GB disk for Anthos, run following command:

$ robin disk create <hostname> --type independent-persistent --size 100

These disks will be attached automatically and auto discovered by Robin so they will be ready to use straightaway.

Note

Due to Robin’s advanced feature to make sure disks are always accessible, it needs credentials, provided via Kubernetes secret, to have all cluster and disk level API privileges.

On AWS, you can attach disks to your EC2 instance via the UI or AWS CLI/APIs and have them available for use by Robin by running the below command:

$ robin host probe <hostname> --rediscover

On the other hand you can utilize Robin to provision disks in AWS to use for application deployment. To create 100 GB disk in AWS, run following command:

$ robin disk create <hostname> --type <gp2 | io1 | st1> --size 100

These disks will be attached automatically to the EC2 instances and auto discovered by Robin so they will be ready to use straightaway.

Note

Due to Robin’s advanced feature to make sure disks are always accessible, IAM Profiles associated with the host (or permissions granted to a user) must contain all Volume write and list actions.

6.4. Listing all disks

In order to view all disks currently present on the cluster and some additional details such as the number of volumes allocated from each, their size, media type etc, issue the following command:

# robin disk list --host <hostname>
                  --role <role>
                  --media <media>
                  --reattachable
                  --eligible
                  --tags
                  --json

--host <hostname>

Filter list by hostname

--role <role>

Filter list by role. Valid choices include: all, storage, rootdisk and reserved

--media <media>

Filter list by media type. Valid choices include: HDD and SSD

--reattachable

Filter list to only display reattachable disks

--eligible

Filter list to only display disks that have free capacity and have not reached the maximum volume count

--tags

Display tags for each disk

--json

Output in JSON

Example:

# robin disk list
ID | WWN                                       | Host    | Path /dev/disk/by-id                          | Size(GB) | Movable | Type | Free/Max(GB) | Vols | Role    | Status | LastOpr
---+-------------------------------------------+---------+-----------------------------------------------+----------+---------+------+--------------+------+---------+--------+---------
1  | 0xQEMU_QEMU_HARDDISK_3c71c872-fe13-4fd5-b | vnode36 | scsi-0QEMU_QEMU_HARDDISK_3c71c872-fe13-4fd5-b | 50       | N       | HDD  | 38/38 (100%) | 0/10 | Storage | ONLINE | READY
2  | 0xQEMU_QEMU_HARDDISK_f12b1f33-8b71-4a8c-a | vnode36 | scsi-0QEMU_QEMU_HARDDISK_f12b1f33-8b71-4a8c-a | 50       | N       | HDD  | 28/38 (74%)  | 1/10 | Storage | ONLINE | READY
3  | 0xQEMU_QEMU_HARDDISK_e54d6149-0a4e-48ce-b | vnode88 | scsi-0QEMU_QEMU_HARDDISK_e54d6149-0a4e-48ce-b | 100      | N       | HDD  | 77/77 (100%) | 0/10 | Storage | ONLINE | READY
4  | 0xQEMU_QEMU_HARDDISK_89fc0488-2050-4f44-a | vnode88 | scsi-0QEMU_QEMU_HARDDISK_89fc0488-2050-4f44-a | 100      | N       | HDD  | 63/77 (82%)  | 2/10 | Storage | ONLINE | READY
5  | 0xQEMU_QEMU_HARDDISK_19f9ac67-5e7e-4f00-8 | vnode89 | scsi-0QEMU_QEMU_HARDDISK_19f9ac67-5e7e-4f00-8 | 100      | N       | HDD  | 77/77 (100%) | 0/10 | Storage | ONLINE | READY
6  | 0xQEMU_QEMU_HARDDISK_d523b7f2-eba7-4edc-b | vnode89 | scsi-0QEMU_QEMU_HARDDISK_d523b7f2-eba7-4edc-b | 100      | N       | HDD  | 77/77 (100%) | 0/10 | Storage | ONLINE | READY

Returns all disks currently present on the cluster and some additional details such as the number of volumes allocated from each, their size, and media type.

End Point: /api/v5/robin_server/disks

Method: GET

URL Parameters:

  • details=tags : Utilizing this parameter results in tag information for each disk being present in the response payload.

  • host=<hostname> : Utilizing this parameter results in only disks attached to the specified host being returned.

Data Parameters: None

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 200

Error Response Code: 500 (Internal Server Error)

Example Response:

Output
{
   "items":[
      {
         "node_hostname":"cscale-82-140.robinsystems.com",
         "spf":0.8,
         "state":"READY",
         "type":"HDD",
         "nvols":3,
         "dev":"\/dev\/sdb",
         "aslices":7,
         "stormgrid":1,
         "pused":234881024,
         "nodeid":1,
         "maintenance_mode":"DISABLED",
         "availability_zone":null,
         "max_alloc_slices":77,
         "role":"Storage",
         "wwn":"0x600224804c48fd7e16c608dea0919064",
         "status":"ONLINE",
         "free_alloc_slices":68,
         "lused_size":0,
         "devpath":"\/dev\/disk\/by-id\/scsi-3600224804c48fd7e16c608dea0919064",
         "alloc_slices":9,
         "reattachable":0,
         "max_volumes_per_disk":10,
         "protected":0,
         "slices":6390,
         "capacity":107374182400,
         "pfree":104287174656
      },
      {
         "node_hostname":"cscale-82-140.robinsystems.com",
         "spf":0.8,
         "state":"READY",
         "type":"HDD",
         "nvols":1,
         "dev":"\/dev\/sdc",
         "aslices":20,
         "stormgrid":2,
         "pused":939524096,
         "nodeid":1,
         "maintenance_mode":"DISABLED",
         "availability_zone":null,
         "max_alloc_slices":77,
         "role":"Storage",
         "wwn":"0x600224803bcdafde95b1f5cd27ceb5fb",
         "status":"ONLINE",
         "free_alloc_slices":53,
         "lused_size":0,
         "devpath":"\/dev\/disk\/by-id\/scsi-3600224803bcdafde95b1f5cd27ceb5fb",
         "alloc_slices":24,
         "reattachable":0,
         "max_volumes_per_disk":10,
         "protected":0,
         "slices":6390,
         "capacity":107374182400,
         "pfree":103582531584
      },
      {
         "node_hostname":"cscale-82-140.robinsystems.com",
         "spf":0.8,
         "state":"INIT",
         "type":"HDD",
         "nvols":0,
         "dev":"\/dev\/dm-1",
         "aslices":0,
         "stormgrid":0,
         "pused":0,
         "nodeid":1,
         "maintenance_mode":"DISABLED",
         "availability_zone":null,
         "max_alloc_slices":5,
         "role":"RootDisk",
         "wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-swap",
         "status":"UNKNOWN",
         "free_alloc_slices":5,
         "lused_size":0,
         "devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphaFy4aq3EUo1yluonS8FG0LF16ycBrdEw",
         "alloc_slices":0,
         "reattachable":0,
         "max_volumes_per_disk":10,
         "protected":0,
         "slices":0,
         "capacity":8254390272,
         "pfree":0
      },
      {
         "node_hostname":"cscale-82-140.robinsystems.com",
         "spf":0.8,
         "state":"INIT",
         "type":"HDD",
         "nvols":0,
         "dev":"\/dev\/dm-0",
         "aslices":0,
         "stormgrid":0,
         "pused":0,
         "nodeid":1,
         "maintenance_mode":"DISABLED",
         "availability_zone":null,
         "max_alloc_slices":38,
         "role":"RootDisk",
         "wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-root",
         "status":"UNKNOWN",
         "free_alloc_slices":38,
         "lused_size":0,
         "devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphgpZcvqGdfOKaXbEbOZzNthc6btsoSXDj",
         "alloc_slices":0,
         "reattachable":0,
         "max_volumes_per_disk":10,
         "protected":0,
         "slices":0,
         "capacity":53687091200,
         "pfree":0
      },
      {
         "node_hostname":"cscale-82-140.robinsystems.com",
         "spf":0.8,
         "state":"INIT",
         "type":"HDD",
         "nvols":0,
         "dev":"\/dev\/dm-2",
         "aslices":0,
         "stormgrid":0,
         "pused":0,
         "nodeid":1,
         "maintenance_mode":"DISABLED",
         "availability_zone":null,
         "max_alloc_slices":32,
         "role":"RootDisk",
         "wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-home",
         "status":"UNKNOWN",
         "free_alloc_slices":32,
         "lused_size":0,
         "devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphQObDlS6eMUSpSxH5zsvyg9I5a0Gpuj5W",
         "alloc_slices":0,
         "reattachable":0,
         "max_volumes_per_disk":10,
         "protected":0,
         "slices":0,
         "capacity":44350570496,
         "pfree":0
      },
      {
         "node_hostname":"cscale-82-140.robinsystems.com",
         "spf":0.8,
         "state":"INIT",
         "type":"HDD",
         "nvols":0,
         "dev":"\/dev\/sda",
         "aslices":0,
         "stormgrid":0,
         "pused":0,
         "nodeid":1,
         "maintenance_mode":"DISABLED",
         "availability_zone":null,
         "max_alloc_slices":77,
         "role":"RootDisk",
         "wwn":"0x600224801d3ac9b6650afd3280aa5898",
         "status":"UNKNOWN",
         "free_alloc_slices":77,
         "lused_size":0,
         "devpath":"\/dev\/disk\/by-id\/scsi-3600224801d3ac9b6650afd3280aa5898",
         "alloc_slices":0,
         "reattachable":0,
         "max_volumes_per_disk":10,
         "protected":0,
         "slices":0,
         "capacity":107374182400,
         "pfree":0
      }
   ]
}

6.5. Show information about a specific disk

In order to view detailed information about a disk such as the breakdown of its available capacity, current allocations including the associated applications, write unit etc, issue the following command:

# robin disk info <wwn>
                  --json

wwn

WWN of disk to detach

--json

Output in JSON

Example:

# robin disk info 0x600224804c48fd7e16c608dea0919064
Drive: 0x600224804c48fd7e16c608dea0919064
  Role: Storage
  Type: HDD
  Make: None
  Model: None
  Write Unit: 4096
  Availability Zone: None
  Zone Id: 1596601846
  Path: /dev/disk/by-id/scsi-3600224804c48fd7e16c608dea0919064
  Node: cscale-82-140.robinsystems.com
  Protected: No
  Reattachable: No
  State: READY
  Status: ONLINE
  Maintenance: OFF
  Allocations:
    Current: 9G
    Maximum: 77G
    Free: 68G
    Capacity: 100G
    Factor: 9%
    AllocScore: 95
    Physical Usage: 0.22G
    Physical Free: 97G
  Volumes: 3
    Max volumes: 10
    Max latency volumes: 2
    Max throughput volumes: 1
  Applications: 2
    file-collection-1596578146092
      Instance: file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53

          Volume Name                                                        | Size (GB) | Allocated (GB)
          -------------------------------------------------------------------+-----------+----------------
          file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53 |         5 |              5


    test-RIC-1
      Instance: test-RIC-1.server.01

          Volume Name                                                       | Size (GB) | Allocated (GB)
          ------------------------------------------------------------------+-----------+----------------
          test-RIC-1.server.01.data.1.382f1ad5-1294-4e24-8297-9c6025eacfe5  |         1 |              2
          test-RIC-1.server.01.block.1.1053eaeb-4542-42a5-a173-d69a76703ead |         1 |              2

Returns detailed information about a disk such as the breakdown of its available capacity, current allocations including the associated applications and write unit.

End Point: /api/v3/robin_server/disks/<wwn>

Method: GET

URL Parameters: None

Data Parameters: None

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 200

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error)

Example Response:

Output
{
   "items":[
      {
         "node_hostname":"cscale-82-140.robinsystems.com",
         "tags":{

         },
         "dev":"\/dev\/sdb",
         "type":"HDD",
         "zoneid":1596601846,
         "nvols":3,
         "state":"READY",
         "reattachpolicy":{
            "restarts_done":0,
            "burst_count":0,
            "burst_start_time":0,
            "burst_interval":600,
            "id":2,
            "restart_limit":5
         },
         "aslices":7,
         "stormgrid":1,
         "model":null,
         "nodeid":1,
         "slices":6390,
         "availability_zone":null,
         "max_throughput_intensive_vols_per_disk":1,
         "write_unit":4096,
         "role":"Storage",
         "wwn":"0x600224804c48fd7e16c608dea0919064",
         "status":"ONLINE",
         "make":null,
         "lused_size":0,
         "reattachable_nodes":[
            [
               "cscale-82-140.robinsystems.com",
               "ONLINE"
            ]
         ],
         "devpath":"\/dev\/disk\/by-id\/scsi-3600224804c48fd7e16c608dea0919064",
         "alloc_score":95,
         "reattachable":0,
         "allocations":[
            {
               "vols":[
                  {
                     "size":5368709120,
                     "media":"HDD",
                     "state":"ONLINE",
                     "name":"file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53",
                     "id":"1"
                  }
               ],
               "volume_group":"file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53.72.1.673abece-0975-4234-9fc2-56a06bf54031",
               "name":"file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53.0.970a44c7-a15c-4612-ac57-9b4f15ae386e",
               "volume":{
                  "size":5368709120,
                  "media":"HDD",
                  "state":"ONLINE",
                  "name":"file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53",
                  "id":"1"
               },
               "slices":5
            },
            {
               "vols":[
                  {
                     "size":1073741824,
                     "media":"HDD",
                     "state":"ONLINE",
                     "name":"test-RIC-1.server.01.data.1.382f1ad5-1294-4e24-8297-9c6025eacfe5",
                     "id":"8"
                  }
               ],
               "volume_group":"test-RIC-1.server.01.72.1.ea297971-f931-4787-99cc-6782e026b77c",
               "name":"test-RIC-1.server.01.72.1.ea297971-f931-4787-99cc-6782e026b77c.0.1e8feb41-fd42-4409-b8c1-751331febdc1",
               "volume":{
                  "size":1073741824,
                  "media":"HDD",
                  "state":"ONLINE",
                  "name":"test-RIC-1.server.01.data.1.382f1ad5-1294-4e24-8297-9c6025eacfe5",
                  "id":"8"
               },
               "slices":2
            },
            {
               "vols":[
                  {
                     "size":1073741824,
                     "media":"HDD",
                     "state":"ONLINE",
                     "name":"test-RIC-1.server.01.block.1.1053eaeb-4542-42a5-a173-d69a76703ead",
                     "id":"9"
                  }
               ],
               "volume_group":"test-RIC-1.server.01.72.1.1e483fd0-2d5c-434c-aef0-91a87796977a",
               "name":"test-RIC-1.server.01.72.1.1e483fd0-2d5c-434c-aef0-91a87796977a.0.5bf728c4-ea26-4e0f-82d7-c584fcf0bd9a",
               "volume":{
                  "size":1073741824,
                  "media":"HDD",
                  "state":"ONLINE",
                  "name":"test-RIC-1.server.01.block.1.1053eaeb-4542-42a5-a173-d69a76703ead",
                  "id":"9"
               },
               "slices":2
            }
         ],
         "max_volumes_per_disk":10,
         "protected":0,
         "capacity":107374182400,
         "maintenance_mode":"OFF",
         "max_latency_sensitive_vols_per_disk":2,
         "preserved":0,
         "pused":234881024,
         "pfree":104287174656
      }
   ]
}

6.6. Evacuating volumes from a disk

This command allows users to move Robin storage volumes from one disk to another. As a result, it can be utilized to free up space on a disk when it is getting full, or to move data out of a disk before it is decommissioned. To evacuate a volume, issue the following command:

Note

Robin’s placement algorithm will determine the best disks to evacuate volumes to if target disks are not specified.

# robin disk evacuate <wwn>
                      --volume <volume>
                      --to-disks <target_disks>
                      --exclude-disks <excluded_disks>

wwn

WWN of disk to evacuate

--volume <volume>

Name of volume to be evacuated. Note if not provided all the volumes will be evacuated

--to-disks <target_disks>

List of WWNs representing disks to evacuate to

--exclude-disks <excluded_disks>

List of WWNs representing disks to avoid evacuating to

--yes

Do not prompt the user for confirmation of evacuation

Example:

# robin disk evacuate 0xQEMU_QEMU_HARDDISK_3c71c872-fe13-4fd5-b --wait --yes
Job:   65 Name: DiskEvacuate         State: VALIDATED       Error: 0
Job:   65 Name: DiskEvacuate         State: WAITING         Error: 0
Job:   65 Name: DiskEvacuate         State: COMPLETED       Error: 0

Evacuates volumes residing on one disk to another.

End Point: /api/v3/robin_server/disks/<disk_wwn>

Method: PUT

URL Parameters: None

Data Parameters:

  • action: evacuate - This mandatory field within the payload specifies that the evacuate operation is to be performed.

  • volume: <volume_name> - Utilizing this parameter results in only the specified volume being evacuated from the disk. If this is not specified all volumes on the disk are evacuated.

  • target_drives: <list_of_target_drives> - Utilizing this parameter, by specifying a list of WWNs representing respective target drives, ensures the evacuated volumes will be transferred to one of the specified disks.

  • exclude_drives: <list_of_excluded_drives> - Utilizing this parameter, by specifying a list of WWNs representing respective excluded drives, ensures the evacuated volumes will not transferred to one of the specified disks.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid Api Usage Error)

Example Response:

Output
{
   "jobid": 145
}

6.7. Removing a disk

In order to remove disk from the Robin cluster due to the fact it has physically malfunctioned or it is faulted, issue the following command:

Note

A disk can only be removed if there are no storage volumes allocated from it.

# robin disk remove <wwn>
                    --yes

wwn

WWN of disk to remove

--yes

Do not prompt the user for confirmation of disk removal

Example:

# robin disk remove 0xQEMU_QEMU_HARDDISK_3c71c872-fe13-4fd5-b --wait --yes
Job:   101 Name: DiskRemove         State: VALIDATED       Error: 0
Job:   101 Name: DiskRemove         State: WAITING         Error: 0
Job:   101 Name: DiskRemove         State: COMPLETED       Error: 0

Removes a disk from the Robin cluster.

End Point: /api/v3/robin_server/disks/<disk_wwn>

Method: DELETE

URL Parameters: None

Data Parameters: None

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid Api Usage Error)

Example Response:

Output
{
   "jobid": 130
}

6.8. Updating disk properties

In certain circumstances, due to the hardware or environmental reasons, Robin may not correctly discover all the attributes of a device correctly. As a result, in order to modify a disk properties such that they are correct, issue the following command:

# robin disk update <wwn>
                    --maxvolumes <max_volumes>
                    --maxlatencysensitivevolumes <max_latency_volumes>
                    --maxthroughputintensivevolumes <max_throughput_volumes>
                    --role <role>
                    --type <type>
                    --all
                    --set-reattachable
                    --unset-reattachable
                    --set-maintenance
                    --unset-maintenance

wwn

WWN of disk to update

--maxvolumes <max_volumes>

Max number of volumes allowed on the disk

--maxthroughputintensivevolumes <max_latency_volumes>

Max number of latency sensitive volumes allowed on the disk

--maxthroughputinstensivevolumes <max_throughput_volumes>

Max number of throughput intensive volumes allowed on the disk

--role <role>

Role of the disk to update to

--type <type>

Type of the disk to update to

--all

Run the update for all disks. Should be specified if no WWN is given

--set-reattachable

Mark the disk as re-attachable

--unset-reattachable

Mark the disk as not re-attachable

--set-maintenance

Put the disk into maintenance mode

--unset-maintenance

Remove the disk from maintenance mode

Example:

# robin disk update 0xQEMU_QEMU_HARDDISK_3c71c872-fe13-4fd5-b --maxvolumes 100 --type SSD --wait --yes
Job:   101 Name: DiskModify         State: VALIDATED       Error: 0
Job:   101 Name: DiskModify         State: WAITING         Error: 0
Job:   101 Name: DiskModify         State: COMPLETED       Error: 0

Modifies a disk’s discovered and/or Robin specific properties.

End Point: /api/v3/robin_server/disks/<disk_wwn>

Method: PUT

URL Parameters: None

Data Parameters:

  • action: update - This mandatory field within the payload specifies that the update operation is to be performed.

  • role: <role> - Utilizing this parameter results in the role of the disk being set to the value specified. Options include: Storage, Swap, RootDisk, and Reserved.

  • type: <type> - Utilizing this parameter results in the type of the disk being set to the value specified. Options include: HDD and SSD.

  • maxvolumesperdisk: <max_vols_on_disk> - Utilizing this parameter results in the maximum number of volumes allowed on the disk being set to the value specified.

  • maxlatencysensitivevolumesperdisk: <max_lat_sens_vols_on_disk> - Utilizing this parameter results in the maximum number of latency sensitive volumes allowed on the disk being set to the value specified.

  • maxthroughputintensivevolumesperdisk: <max_through_ints_vols_on_disk> - Utilizing this parameter results in the maximum number of throughput intensive volumes allowed on the disk being set to the value specified.

  • reattachable: [0,1] - Utilizing this parameter results in the reattachable attribute of the disk being set. By specifying a value of 1, the disk is said to be reattachable and vice versa for a value of 0.

  • maintenance: [0,1] - Utilizing this parameter results in the maintenance mode attribute of the disk being set. By specifying a value of 1, the disk is said to be in maintenance mode and vice versa for a value of 0.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid Api Usage Error)

Example Response:

Output
{
   "jobid": 54
}

6.9. Unfaulting a disk

When the Robin storage stack encounters an IO error on a disk, it will mark the disk as FAULTED. Sometimes the error is NOT due to the disk, but due to environmental or other hardware errors such as controller errors. In this case one can determine that the disk is actually healthy. In turn this command allows a user to inform Robin that the disk which is presumed to be FAULTED is actually healthy and the external issues that caused the IO failure have been resolved. Robin will proceed to mark the disk as ONLINE and resume normal access to the disk. In order to an unfault a disk, issue the following command:

Note

This functionality should be used with extreme care. If the disk is actually bad, and a user utilizes this command to reverse the error reported by the Robin storage stack, it could result in data loss.

# robin disk unfault <wwn>
                     --yes

wwn

WWN of disk to unfault

--yes

Do not prompt the user for confirmation of unfaulting the drive

Example:

# robin disk unfault 0xQEMU_QEMU_HARDDISK_3c71c872-fe13-4fd5-b --wait --yes
Job:   101 Name: DiskUnfault         State: VALIDATED       Error: 0
Job:   101 Name: DiskUnfault         State: WAITING         Error: 0
Job:   101 Name: DiskUnfault         State: COMPLETED       Error: 0

Unfaults a disk that has been marked as FAULTED due to an IO error. Note this functionality should be used with extreme care and only in situations where one has determined that external issues such as hardware/controller errors, that have since been resolved, caused the IO failure. If this condition is not met, unfaulting the disk could result in data loss.

End Point: /api/v3/robin_server/disks/<disk_wwn>

Method: PUT

URL Parameters: None

Data Parameters:

  • action: unfault - This mandatory field within the payload specifies that the unfault operation is to be performed.

  • wwn: <disk_wwn> - This mandatory field within the payload specifies the WWN of the disk on which the unfault operation should be performed on.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid Api Usage Error)

Example Response:

Output
{
   "jobid": 187
}

6.10. Configure the snapshot space limit of a volume

Robin stores snapshot(s) of a volume within a dedicated space, referred to as the snapshot space of the volume. The snapshot space is considered to be a percentage of the allocated space for a volume (by default this percentage is 40%). However after the volume is provisioned the size of the snapshot might need to be modified. In order to achieve this, issue the following command:

# robin volume snapshot-space-limit <volume_name>  <snapshot_space_limit>

Note

Only the snapshot space limit of individual volumes can be configured at this time. To configure the snapshot space limit for an application, update the snapshot space limit of each of the volumes it consists of.

volume_name

Name of the volume to configure the snapshot space limit for

snapshot_space_limit

Size of the snapshot space limit in bytes

Example:

# robin volume snapshot-space-limit pvc-bd01bcd9-9e4d-4a34-ac13-5a98395935b3 4G
Successfully updated volume pvc-bd01bcd9-9e4d-4a34-ac13-5a98395935b3 snapshot space limit to 4G.

6.11. Handling Disruptions

With Robin, highly available applications can be deployed on Kubernetes as Robin can handle failures of drives, rack or hosts automatically. On a Baremetal setup, volumes can be setup with a replication factor of 2 or 3 to ensure that storage is available even if a drive fails. Users can also choose the fault domain to be ‘host’ to protect against node reboots or lost.

However, in a public cloud environment the cloud disks can be detached from one cloud node and reattached to another one. For example, in AWS an EBS volume can be detached on one EC2 host and reattached to a different EC2 host. Same with GCP where a PD can be moved across GCE nodes. If a cloud node (EC2, GCE, Azure VM) is terminated or rebooted, one would want any cloud drive attached to them (EBS, PD, Block) to be moved to the one or more of the remaining healthy nodes automatically. This is not limited to just cloud disks, but also SAN LUNS that are offered to Robin as disks. The SAN LUNS can also be multi-mounted onto multiple nodes or moved around from node to node. User can still choose to replicate volume on public cloud as it takes sometime to detach and attach drives on cloud platforms.

Just having the storage available during a disruption will not help if Kubernetes can not access it from the Pod. For example a Kubernetes StatefulSet serializes the mounting and unmounting of a volume to protect against possible corruptions. Robin utilizes smart detection techniques to ensure that even if a volume is mounted on multiple nodes, it can differentiate the IOs issued from the previous stale mount and the new mount. With this consistency guarantees, Robin enables the Kubernetes StatefulSet to unmount a volume from a dead node and remount it on a healthy node where the Pod is scheduled to run. Robin actively monitors these events to allow for the fast failover of the Pods without user intervention and consequently enables users to reliably deploy highly available stateful applications on Kubernetes.