*********************
Managing Nodes
*********************

===================
Resource Discovery
===================

As part of the Robin storage installation process, resource discovery is run on the node to capture details about its physical configuration, hardware limits, and resource availability. The purpose of this is twofold. First, it gives Robin a better understanding of the storage resources the machine can provide for application deployment. Second, it allows Robin to better optimize the node's usage within the cluster. The following properties of the node are discovered:

- Disks

Details on what is captured for each of the above aspects, and how it is captured, are described below.

--------------------
Disk Discovery
--------------------

Robin leverages a multitude of sources to discover the disks that are available to a node. Some of the commands and files used to obtain the details below are: ``lsblk``, ``partprobe``, ``pvs``, ``blkid`` and ``/proc/mounts``. The following details are captured for each disk (if present):

- Devpath
- Capacity
- Physical Sector size
- WWN (along with make and model)
- Media type

--------------------------------
Disk Partitions (LVM) Discovery
--------------------------------

Environments where resources are constrained, such as Edge servers, may not have dedicated data disks for Robin to consume and instead may only contain disks which are partitioned. By default, partitioned disks are discovered and marked as ``Reserved`` to avoid any user data being overwritten. As a result, for the partition(s) to serve as data disks, they have to be set up manually using the process described below.

First, ensure the target partitions are discovered appropriately by Robin by running the following commands:

.. code-block:: text

    # lsblk
    sdb              8:16   0   50G  0 disk
    ├─sdb1           8:17   0   10G  0 part
    └─sdb2           8:18   0   40G  0 part
      └─vg-robinds 253:0    0   39G  0 lvm

    # robin drive list --role=all
    ID | WWN                                                  | Host         | Path /dev/disk/by-id                                                         | Size(GB) | Movable | Type | Free/Max(GB) | Vols | Role     | Status  | LastOpr
    ---+------------------------------------------------------+--------------+------------------------------------------------------------------------------+----------+---------+------+--------------+------+----------+---------+---------
    -  | 0xQEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8            | vnode-89-142 | scsi-0QEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8                                | 50       | N       | HDD  | 38/38 (100%) | 0/10 | RootDisk | UNKNOWN | INIT
    -  | 0xQEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8-vg-robinds | vnode-89-142 | dm-uuid-LVM-DOy7w9WSdGi2PcSERuceOHkfM7dotzr7nm5EuoAWsyAHkvbYGT02MaeWDro05F3R | 39       | N       | HDD  | 30/30 (100%) | 0/10 | Reserved | UNKNOWN | INIT

.. Note:: Details on the ``robin drive list`` command can be found `here `__.

To mark the partition as ready to be used as storage and confirm that its role has been updated accordingly, run the following commands:

.. code-block:: text

    # robin drive update 0xQEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8-vg-robinds --role storage --wait
    Job: 5922 Name: DiskModify State: PROCESSED Error: 0
    Job: 5922 Name: DiskModify State: COMPLETED Error: 0

    # robin drive list
    ID | WWN                                                  | Host         | Path /dev/disk/by-id                                                         | Size(GB) | Movable | Type | Free/Max(GB) | Vols | Role     | Status  | LastOpr
    ---+------------------------------------------------------+--------------+------------------------------------------------------------------------------+----------+---------+------+--------------+------+----------+---------+---------
    -  | 0xQEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8-vg-robinds | vnode-89-142 | dm-uuid-LVM-DOy7w9WSdGi2PcSERuceOHkfM7dotzr7nm5EuoAWsyAHkvbYGT02MaeWDro05F3R | 39       | N       | HDD  | 30/30 (100%) | 0/10 | Storage  | UNKNOWN | INIT

.. Note:: Details on the ``robin drive update`` command can be found `here `__.

Lastly, to initialize the partition and confirm that it is ready for use, retrieve the name of the host it is currently associated with and run the following commands:

.. code-block:: text

    # robin host add-role vnode-89-142 storage --wait
    Job: 5923 Name: HostAddRoles State: PROCESSED Error: 0
    Job: 5923 Name: HostAddRoles State: WAITING Error: 0
    Job: 5923 Name: HostAddRoles State: COMPLETED Error: 0

    # robin drive list
    ID | WWN                                                  | Host         | Path /dev/disk/by-id                                                         | Size(GB) | Movable | Type | Free/Max(GB) | Vols | Role     | Status  | LastOpr
    ---+------------------------------------------------------+--------------+------------------------------------------------------------------------------+----------+---------+------+--------------+------+----------+---------+---------
    4  | 0xQEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8-vg-robinds | vnode-89-142 | dm-uuid-LVM-DOy7w9WSdGi2PcSERuceOHkfM7dotzr7nm5EuoAWsyAHkvbYGT02MaeWDro05F3R | 39       | N       | HDD  | 30/30 (100%) | 0/10 | Storage  | ONLINE  | READY

.. Note:: Details on the ``robin host add-role`` command can be found `here `__.

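When several nodes need the same treatment, the ``Reserved`` entries can be picked out of the ``robin drive list --role=all`` output programmatically. The following is a minimal sketch, assuming the pipe-delimited layout shown in this section stays stable. The helper names ``parse_drive_table`` and ``reserved_drives`` are hypothetical and not part of Robin, and ``SAMPLE`` is a trimmed, made-up table with fewer columns than the real output.

```python
def parse_drive_table(text):
    """Parse pipe-delimited `robin drive list` output into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the shell prompt line, and the ---+--- separator row.
        if not line or line.startswith("#") or set(line) <= set("-+"):
            continue
        cells = [cell.strip() for cell in line.split("|")]
        if header is None:
            header = cells  # the first remaining line is the column header
            continue
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def reserved_drives(text):
    """Return the WWNs of drives whose Role column is 'Reserved'."""
    return [row["WWN"] for row in parse_drive_table(text) if row.get("Role") == "Reserved"]


# Trimmed sample table (hypothetical values, fewer columns than the real output).
SAMPLE = """\
ID | WWN              | Host    | Role     | Status
---+------------------+---------+----------+--------
-  | 0xAAA            | vnode-1 | RootDisk | UNKNOWN
-  | 0xAAA-vg-robinds | vnode-1 | Reserved | UNKNOWN
"""

if __name__ == "__main__":
    for wwn in reserved_drives(SAMPLE):
        print(wwn)
```

Each WWN returned this way is a candidate for the ``robin drive update`` step described in this section. Whether a given partition is actually safe to hand over to Robin still has to be decided by the operator.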
===================
Robin Node Roles
===================

In Robin CNS, only the Storage role is assigned to hosts. It is automatically assigned by the operator. You can manually assign the Storage role when required.

- The Storage role is assigned to a node that is intended to provide storage for applications deployed on Robin. As a result, any volumes needed for deployed applications are created and mounted on devices on nodes with this role set.

The following commands are described in this section:

=============================== =================================================================================
``robin host add-role``         Add one or more role(s) to a host
``robin host enable-role``      Move one or more role(s) out of maintenance mode for a host
``robin host disable-role``     Move one or more role(s) into maintenance mode for a host
``robin host remove-role``      Remove one or more role(s) from a host
=============================== =================================================================================

-----------------
Adding role(s)
-----------------

.. tabs::

   .. tab:: CLI

      To add a role to a host that is already registered within the Robin cluster, issue the following command:

      .. code-block:: text

         # robin host add-role [] [] --rpool --disks

      ====================== ==============================================================================================
      ``hosts``              Comma separated list of hosts to add role to.
      ``roles``              Comma separated list of roles. The only valid value is 'storage'.
      ``--rpool``            Assign a resource pool prior to adding a role.
      ``--disks``            Comma separated list of disk WWNs to add as part of storage role addition.
      ====================== ==============================================================================================

      **Example:**

      .. code-block:: text

         # robin host add-role centos-60-212,centos-60-214 storage --wait
         Job: 18 Name: HostAddRoles State: VALIDATED Error: 0
         Job: 18 Name: HostAddRoles State: WAITING Error: 0
         Job: 18 Name: HostAddRoles State: COMPLETED Error: 0

   .. tab:: API

      Adds a role to a host that is already registered within the Robin cluster.

      **End Point:** /api/v3/robin_server/hosts/

      **Method:** PUT

      **URL Parameters:** None

      **Data Parameters:**

      - ``action: add_roles`` - This mandatory field within the payload specifies that the add role operation is to be performed.
      - ``roles: `` - This mandatory field within the payload is a list of roles that should be added to the specified host. The only valid value is 'storage'.
      - ``drives: `` - Specifying a list of drive WWNs adds the corresponding disks as part of the storage role addition.
      - ``rpool: `` - Specifying a resource pool name assigns that resource pool to the host prior to the addition of roles.

      **Port:** RCM Port (default value is 29442)

      **Headers:**

      - ``Authorization: `` : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

      **Success Response Code:** 202

      **Error Response Code:** 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)

      **Example Response:**

      .. code-block:: text

         {
            "jobid":20
         }

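The data parameters listed for the add role API can be assembled into a request as in the following sketch. It only builds the request and does not send it. Several details are assumptions rather than documented behaviour: that the host identifier is appended to the end point, that the payload is sent as a JSON body with a ``Content-Type`` header, and that the token is passed verbatim in ``Authorization``. The function name ``build_add_roles_request`` and the base URL in the usage note are hypothetical.

```python
import json
import urllib.request


def build_add_roles_request(base_url, host, token, roles, drives=None, rpool=None):
    """Build (but do not send) a PUT request for the add_roles action."""
    payload = {"action": "add_roles", "roles": list(roles)}
    # drives and rpool are only included when the caller supplies them.
    if drives:
        payload["drives"] = list(drives)
    if rpool:
        payload["rpool"] = rpool
    # Assumption: the host identifier is appended to the documented end point.
    url = f"{base_url}/api/v3/robin_server/hosts/{host}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        # Assumption: JSON body; the Authorization token comes from the login API.
        headers={"Authorization": token, "Content-Type": "application/json"},
    )
```

A request built with, for example, ``build_add_roles_request("https://robin.example:29442", "centos-60-212", token, ["storage"])`` could then be passed to ``urllib.request.urlopen``. According to this section, a successful call answers with status 202 and a body containing a ``jobid``.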
-------------------
Enabling role(s)
-------------------

.. tabs::

   .. tab:: CLI

      To move a role that is already added to a host out of maintenance mode, and thus enable it for use again, issue the following command:

      .. code-block:: text

         # robin host enable-role [] []

      ====================== ==============================================================================================
      ``hosts``              Fully qualified hostname
      ``roles``              Valid values include: 'storage'
      ====================== ==============================================================================================

      **Example:**

      .. code-block:: text

         # robin host enable-role centos-60-212.robinsystems.com Storage
         Role(s) 'Storage' enabled on host centos-60-212.robinsystems.com

   .. tab:: API

      Enables a role that was previously disabled on a host.

      **End Point:** /api/v3/robin_server/hosts/

      **Method:** PUT

      **URL Parameters:** None

      **Data Parameters:**

      - ``action: enable-role`` - This mandatory field within the payload specifies that the enable role operation is to be performed.
      - ``roles: `` - This mandatory field within the payload is a list of roles that should be enabled on the specified host.

      **Port:** RCM Port (default value is 29442)

      **Headers:**

      - ``Authorization: `` : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

      **Success Response Code:** 200

      **Error Response Code:** 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)

      **Example Response:**

      On success the response is empty.

--------------------
Disabling role(s)
--------------------

.. tabs::

   .. tab:: CLI

      When a role is disabled in Robin, it is said to be put into maintenance mode. For all intents and purposes, the host then does not have access to this role. This is useful for debugging and for temporarily reserving the host's resources.

      To move a role into maintenance mode and thus disable it for use, issue the following command:

      .. code-block:: text

         # robin host disable-role [] []

      ====================== ==============================================================================================
      ``hosts``              Fully qualified hostname
      ``roles``              Valid values include: 'storage'
      ====================== ==============================================================================================

      **Example:**

      .. code-block:: text

         # robin host disable-role centos-60-212.robinsystems.com Storage
         Role(s) 'Storage' disabled on host centos-60-212.robinsystems.com

   .. tab:: API

      Disables a role for a host such that the host temporarily does not have access to it.

      **End Point:** /api/v3/robin_server/hosts/

      **Method:** PUT

      **URL Parameters:** None

      **Data Parameters:**

      - ``action: disable-role`` - This mandatory field within the payload specifies that the disable role operation is to be performed.
      - ``roles: `` - This mandatory field within the payload is a list of roles that should be disabled on the specified host. Valid values include: 'storage'.

      **Port:** RCM Port (default value is 29442)

      **Headers:**

      - ``Authorization: `` : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

      **Success Response Code:** 200

      **Error Response Code:** 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)

      **Example Response:**

      On success the response is empty.

-----------------
Removing role(s)
-----------------

.. tabs::

   .. tab:: CLI

      To remove a role that was previously assigned to a node, issue the following command:

      .. code-block:: text

         # robin host remove-role [] [] --force --yes

      ====================== ==============================================================================================
      ``hosts``              Comma separated list of hosts to remove role from.
      ``roles``              Comma separated list of roles. The only valid value is 'storage'.
      ``--force``            Required if retrying a remove-role operation.
      ``--yes``              Do not prompt the user for confirmation of removal
      ====================== ==============================================================================================

      **Example:**

      .. code-block:: text

         # robin host remove-role centos-60-212 storage --wait
         Job: 192 Name: HostRemoveRoles State: VALIDATED Error: 0
         Job: 192 Name: HostRemoveRoles State: WAITING Error: 0
         Job: 192 Name: HostRemoveRoles State: COMPLETED Error: 0

   .. tab:: API

      Removes a role that had previously been assigned to a host.

      **End Point:** /api/v3/robin_server/hosts/

      **Method:** PUT

      **URL Parameters:** None

      **Data Parameters:**

      - ``action: remove_roles`` - This mandatory field within the payload specifies that the remove role operation is to be performed.
      - ``roles: `` - This mandatory field within the payload is a list of roles that should be removed from the specified host. The only valid value is 'storage'.
      - ``force: true`` - This field is mandatory when retrying the role removal operation, otherwise it can be excluded.

      **Port:** RCM Port (default value is 29442)

      **Headers:**

      - ``Authorization: `` : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

      **Success Response Code:** 202

      **Error Response Code:** 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)

      **Example Response:**

      .. code-block:: text

         {
            "jobid":24
         }

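Scripts that wrap the role commands with ``--wait`` usually need to know whether the job finished cleanly. The sketch below, which assumes the ``Job: ... Name: ... State: ... Error: ...`` progress format shown in the examples of this section, extracts the last reported state from captured CLI output. The function names are hypothetical.

```python
import re

# Matches one progress entry such as:
#   Job: 192 Name: HostRemoveRoles State: COMPLETED Error: 0
JOB_LINE = re.compile(r"Job:\s*(\d+)\s+Name:\s*(\S+)\s+State:\s*(\S+)\s+Error:\s*(\d+)")


def final_job_state(output):
    """Return (job_id, name, state, error) for the last Job entry, or None."""
    matches = JOB_LINE.findall(output)
    if not matches:
        return None
    job_id, name, state, error = matches[-1]
    return int(job_id), name, state, int(error)


def job_succeeded(output):
    """True when the last reported state is COMPLETED with Error 0."""
    result = final_job_state(output)
    return result is not None and result[2] == "COMPLETED" and result[3] == 0


SAMPLE_OUTPUT = """\
Job: 192 Name: HostRemoveRoles State: VALIDATED Error: 0
Job: 192 Name: HostRemoveRoles State: WAITING Error: 0
Job: 192 Name: HostRemoveRoles State: COMPLETED Error: 0
"""

if __name__ == "__main__":
    print(final_job_state(SAMPLE_OUTPUT))
    print(job_succeeded(SAMPLE_OUTPUT))
```

Because the pattern is applied with ``findall``, it works whether the progress entries arrive on separate lines or on one line.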
===============================
Gathering information on nodes
===============================

Robin exposes multiple endpoints that provide a user the means to obtain information about the hosts registered with a Robin cluster. The information returned is a combination of the physical attributes (obtained by the resource discovery described `here `_), resource utilization, and the status of services for the host. This gives a user insight into the state of the cluster alongside granular details for each individual host and enables application deployment planning.

The following commands are described in this section:

=============================== =================================================================================
``robin host list``             View all hosts in a cluster
``robin host info``             Display detailed information about a host
=============================== =================================================================================

-------------------------------------
List all hosts
-------------------------------------

.. tabs::

   .. tab:: CLI

      To view all hosts within a cluster alongside information on their statuses (from Robin's perspective), resource consumption, and roles within the cluster, issue the following command:

      .. code-block:: text

         # robin host list --services --resources --network --devices --tags --json

      ============================ ======================================================================================
      ``--services``               Show status information for each host
      ``--resources``              Show resource utilization for each host
      ``--network``                Show network resource utilization info for each host
      ``--devices``                Show devices resource utilization info for each host
      ``--tags``                   Show tag information for each host
      ``--json``                   Output in JSON
      ============================ ======================================================================================

      **Example 1 (Listing all hosts):**

      .. 
code-block:: text # robin host list Id | Hostname | Version | Status | RPool | LastOpr | Roles | Isol Cores(SHR/DED/Total) | Non-Isol Cores | GPUs | Mem(Free/Alloc/Total) | HDD(#/Alloc/Total) | SSD(#/Alloc/Total) | Pod Usage | Joined Time -------------+-------------------------------+-----------+--------+---------+---------+--------+---------------------------+----------------+------+-----------------------+--------------------+--------------------+-----------+---------------------- 1596566663:1 | cscale-82-81.robinsystems.com | 5.3.0-172 | Ready | default | ONLINE | M*,S,C | 0/0/0 | 1/40 | 0/0 | 24G/6G/31G | 1/5G/100G | -/-/- | 11/110 | 04 Aug 2020 04:44:47 1596566663:2 | cscale-82-82.robinsystems.com | 5.3.0-172 | Ready | default | ONLINE | M,S,C | 0/0/0 | 2/40 | 0/0 | 24G/7G/31G | 2/40G/200G | -/-/- | 13/110 | 04 Aug 2020 04:51:52 1596566663:3 | cscale-82-83.robinsystems.com | 5.3.0-172 | Ready | default | ONLINE | M,S,C | 0/0/0 | 1/40 | 0/0 | 24G/6G/31G | 2/-/200G | -/-/- | 11/110 | 04 Aug 2020 04:58:25 1596566663:4 | qct-07.robinsystems.com | 5.3.0-172 | Ready | workers | ONLINE | S,C | 8/68/76 | 1/4 | 0/0 | 280G/95G/376G | 1/554G/893G | -/-/- | 83/110 | 04 Aug 2020 05:05:24 1596566663:5 | qct-08.robinsystems.com | 5.3.0-172 | Ready | workers | ONLINE | S,C | 0/76/76 | 1/4 | 0/0 | 284G/91G/376G | 1/547G/893G | -/-/- | 88/110 | 04 Aug 2020 05:12:06 1596566663:6 | qct-11.robinsystems.com | 5.3.0-172 | Ready | workers | ONLINE | S,C | 0/27/76 | 1/4 | 0/0 | 147G/40G/187G | 1/581G/893G | -/-/- | 34/110 | 04 Aug 2020 05:18:47 1596566663:7 | cscale-82-80.robinsystems.com | 5.3.0-172 | Ready | workers | ONLINE | C,S | 0/21/36 | 1/40 | 0/0 | 0.7G/30G/31G | 1/-/100G | -/-/- | 28/110 | 04 Aug 2020 20:21:40 **Example 2 (Retrieving status information):** .. 
code-block:: text # robin host list --services +-------------------------------+-------+-------+------+-------+-------+------+-------+------+------+------+-------+------+------+-------+---------+---------+ | Host | ConCl | ConSr | GCli | Httpd | Iomgr | RMon | Pgsql | RAgt | RAer | REvt | RFile | NMon | RSer | RWdog | Metrics | Stormgr | +-------------------------------+-------+-------+------+-------+-------+------+-------+------+------+------+-------+------+------+-------+---------+---------+ | cscale-82-81.robinsystems.com | DOWN | UP | UP | UP | UP | UP | UP | UP | UP | UP | UP | UP | UP | UP | DOWN | UP | | cscale-82-82.robinsystems.com | DOWN | UP | UP | UP | UP | UP | UP | UP | DOWN | DOWN | DOWN | DOWN | DOWN | UP | DOWN | DOWN | | cscale-82-83.robinsystems.com | DOWN | UP | UP | UP | UP | UP | UP | UP | DOWN | DOWN | DOWN | DOWN | DOWN | UP | DOWN | DOWN | | qct-07.robinsystems.com | UP | DOWN | UP | UP | UP | UP | DOWN | UP | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | | qct-08.robinsystems.com | UP | DOWN | UP | UP | UP | UP | DOWN | UP | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | | qct-11.robinsystems.com | UP | DOWN | UP | UP | UP | UP | DOWN | UP | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | | cscale-82-80.robinsystems.com | UP | DOWN | UP | UP | UP | UP | DOWN | UP | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | +-------------------------------+-------+-------+------+-------+-------+------+-------+------+------+------+-------+------+------+-------+---------+---------+ UP: Running CRIT: Critical and Down DOWN: Not Running **Example 3 (Listing resource utilization for all hosts):** .. code-block:: text [root@cscale-82-73 ~]# robin host list --resources Id | Hostname | Version | Status | RPool | Avail. 
Zone | Cores | GPUs | Mem | Hpages-2Mi | Hpages-1Gi | HDD(#/Alloc/Total) | SSD(#/Alloc/Total) | Pod Usage -------------+-------------------------------+-----------+--------+---------+-------------+---------+-------+---------------+------------+------------+--------------------+--------------------+----------- 1637691944:1 | cscale-82-73.robinsystems.com | 5.3.5-214 | Ready | default | N/A | 35/5/40 | 0/0/0 | 22G/8G/31G | - | - | 2/-/200G | -/-/- | 73/27/100 1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | 5/75/80 | 0/0/0 | 300G/36G/336G | - | 33/6/40 | 1/-/8043G | -/-/- | 47/53/100 1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | 5/75/80 | 0/0/0 | 300G/36G/336G | - | 33/6/40 | 1/-/8043G | -/-/- | 48/52/100 * Note: all values indicated above in the format XX/XX/XX represent the Free/Allocated/Total values of the respective resource unless otherwise specified. In addition allocated values for compute resource such as cpu, memory and pod usage includes reserved values for the corresponding resource. **Example 4 (Listing network resource utilization for all hosts):** .. code-block:: text [root@cscale-82-73 ~]# robin host list --network Id | Hostname | Version | Status | RPool | Avail. 
Zone | NIC | State | PCIAddr | NUMA | VLANs | VFDrv | VFs -------------+-------------------------------+-----------+--------+---------+-------------+------------+-------+--------------+------+-------+--------+---------- 1637691944:1 | cscale-82-73.robinsystems.com | 5.3.5-214 | Ready | default | N/A | br0 | - | - | - | | - | - 1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f0 | - | 0000:af:00.0 | 1 | 20 | iavf | 16/16/32 1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f0 | - | 0000:af:00.0 | 1 | 20 | igbuio | 32/0/32 1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f1 | - | 0000:af:00.1 | 1 | 20 | iavf | 16/16/32 1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f1 | - | 0000:af:00.1 | 1 | 20 | igbuio | 32/0/32 1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f0 | - | 0000:3b:00.0 | 0 | 20 | iavf | 16/16/32 1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f0 | - | 0000:3b:00.0 | 0 | 20 | igbuio | 32/0/32 1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f1 | - | 0000:3b:00.1 | 0 | 20 | iavf | 16/16/32 1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f1 | - | 0000:3b:00.1 | 0 | 20 | igbuio | 32/0/32 1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | br0 | - | 0000:5e:00.1 | 0 | 10 | - | - 1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f0 | - | 0000:af:00.0 | 1 | 20 | iavf | 16/16/32 1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f0 | - | 0000:af:00.0 | 1 | 20 | igbuio | 32/0/32 1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f1 | - | 0000:af:00.1 | 1 | 20 | iavf | 16/16/32 1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | 
enp175s0f1 | - | 0000:af:00.1 | 1 | 20 | igbuio | 32/0/32 1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f0 | - | 0000:3b:00.0 | 0 | 20 | iavf | 16/16/32 1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f0 | - | 0000:3b:00.0 | 0 | 20 | igbuio | 32/0/32 1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f1 | - | 0000:3b:00.1 | 0 | 20 | iavf | 16/16/32 1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f1 | - | 0000:3b:00.1 | 0 | 20 | igbuio | 32/0/32 1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | br0 | - | 0000:5e:00.1 | 0 | 10 | - | - * Note: all values indicated above in the format XX/XX/XX represent the Free/Allocated/Total values of the respective resource unless otherwise specified. In addition allocated values for compute resource such as cpu, memory and pod usage includes reserved values for the corresponding resource. **Example 5 (Listing device utilization for all hosts):** .. code-block:: text [root@qct-26 ~]# robin host list --devices Id | Hostname | Version | Status | RPool | Avail. Zone | Type | Vendor | State | PCIAddr | NUMA | Devid | Driver | Count -------------+-------------------------+-----------+-----------+---------+-------------+------+--------+-------+--------------+------+--------+---------+------- 1633606675:1 | qct-26.robinsystems.com | 5.3.9-286 | Ready | default | N/A | fpga | 0x8086 | - | 0000:1e:00.0 | 0 | 0x0d90 | vfiopci | 2/0/2 1633606675:2 | qct-28.robinsystems.com | 5.3.9-286 | Ready | default | N/A | fpga | 0x8086 | - | 0000:1e:00.0 | 0 | 0x0d90 | vfiopci | 0/2/2 1633606675:3 | qct-27.robinsystems.com | 5.3.9-286 | Ready | default | N/A | fpga | 0x8086 | - | 0000:1e:00.0 | 0 | 0x0d90 | vfiopci | 0/2/2 * Note: all values indicated above in the format XX/XX/XX represent the Free/Allocated/Total values of the respective resource unless otherwise specified. 
In addition allocated values for compute resource such as cpu, memory and pod usage includes reserved values for the corresponding resource. **Example 6 (Listing tags for all hosts):** .. code-block:: text [root@cscale-82-73 ~]# robin host list --tags Id | Hostname | Version | Status | RPool | Avail. Zone | Rack | Lab | DC | Tags -------------+-------------------------------+-----------+--------+---------+-------------+------+-----+----+----------------------------------------------------------------------------------------------------------------------------------------------------- 1637691944:1 | cscale-82-73.robinsystems.com | 5.3.5-214 | Ready | default | N/A | - | - | - | {'node-role.kubernetes.io/control-plane': [''], 'kubernetes.io/arch': ['amd64'], 'kubernetes.io/os': ['linux'], 'robin.io/robinrpool': ['default']} 1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | - | - | - | {'node-role.kubernetes.io/control-plane': [''], 'kubernetes.io/arch': ['amd64'], 'kubernetes.io/os': ['linux'], 'robin.io/robinrpool': ['default']} 1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | - | - | - | {'node-role.kubernetes.io/control-plane': [''], 'kubernetes.io/arch': ['amd64'], 'kubernetes.io/os': ['linux'], 'robin.io/robinrpool': ['default']} * Note: all values indicated above in the format XX/XX/XX represent the Free/Allocated/Total values of the respective resource unless otherwise specified. In addition allocated values for compute resource such as cpu, memory and pod usage includes reserved values for the corresponding resource. .. tab:: API Returns information on all hosts within a cluster including details on their statuses (from Robin's perspective), resource consumption, and roles within the cluster. **End Point:** /api/v5/robin_server/hosts **Method:** GET **URL Parameters:** - ``details=tags`` : Utilizing this parameter results in tag information for each host being present in the response payload. 
**Data Parameters:** None **Port:** RCM Port (default value is 29442) **Headers:** - ``Authorization: `` : Authorization token to identify which user is sending the request. The token can be acquired from the login API. **Success Response Code:** 200 **Error Response Code:** 500 (Internal Server Error) **Example Response:** .. raw:: html
Output .. code-block:: text { "items":[ { "memory_used":2692743168, "memory":33555709952, "isol_shared_map":{ }, "zoneid":1596601846, "non_isol_cores_used":3, "pods_used":26, "rack":"default", "napps":2, "non_isol_total":400, "k8s_node_name":"cscale-82-140", "mem_for_storage":1073741824, "id":1, "lab":"default", "gpu_cores_allocated":0, "isol_dedicated_cores_used":0, "roles":[ [ "MANAGER", "ONLINE", "READY" ], [ "COMPUTE", "ONLINE", "READY" ], [ "STORAGE", "ONLINE", "READY" ] ], "cpu_cores_used":0, "cpu_prov_factor":10, "services":"{\"update_time\":1596761892.3700919151,\"services\":{\"consul_dns\":true,\"stormgr-server\":{\"Id\":\"stormgr-server\",\"MainPID\":2299,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:31:26.794916\",\"ActiveState\":\"active\"},\"gui-cli\":{\"Id\":\"gui-cli\",\"MainPID\":2647,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:31:32.291459\",\"ActiveState\":\"active\"},\"consul-client\":{\"Id\":\"consul-client\",\"MainPID\":0,\"Type\":\"simple\",\"ExecMainStartTimestamp\":0,\"ActiveState\":\"inactive\"},\"httpd\":{\"Id\":\"httpd\",\"MainPID\":2613,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:31:32.124472\",\"ActiveState\":\"active\"},\"robin-node-monitor\":{\"Id\":\"robin-node-monitor\",\"MainPID\":1278,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:31:13.459063\",\"ActiveState\":\"active\"},\"iomgr-server\":{\"Id\":\"iomgr-server\",\"MainPID\":7384,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:55:13.851492\",\"ActiveState\":\"active\"},\"robin-event-server\":{\"Id\":\"robin-event-server\",\"MainPID\":1039,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:30:55.322709\",\"ActiveState\":\"active\"},\"consul-server\":{\"Id\":\"consul-server\",\"MainPID\":564,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 
14:30:32.432940\",\"ActiveState\":\"active\"},\"consul_members\":[{\"DelegateMax\":5,\"ProtocolMin\":1,\"Port\":29460,\"Status\":1,\"ProtocolMax\":5,\"DelegateCur\":4,\"ProtocolCur\":2,\"Name\":\"cscale-82-140.robinsystems.com\",\"Tags\":{\"dc\":\"consul\",\"role\":\"consul\",\"vsn\":\"2\",\"wan_join_port\":\"29461\",\"segment\":\"\",\"port\":\"29459\",\"raft_vsn\":\"2\",\"vsn_min\":\"2\",\"vsn_max\":\"3\",\"id\":\"9dbc13cd-bbb4-1bf1-9bcd-f3d7e0f0026f\",\"bootstrap\":\"1\",\"build\":\"0.9.4:40f243a+\"},\"Addr\":\"10.9.82.140\",\"DelegateMin\":2}],\"robin-file-server\":{\"Id\":\"robin-file-server\",\"MainPID\":1071,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:30:55.801664\",\"ActiveState\":\"active\"},\"robin-watchdog\":{\"Id\":\"robin-watchdog\",\"MainPID\":860,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:30:37.982382\",\"ActiveState\":\"active\"},\"sherlock-server\":{\"Id\":\"sherlock-server\",\"MainPID\":0,\"Type\":\"simple\",\"ExecMainStartTimestamp\":0,\"ActiveState\":\"inactive\"},\"robin-agent\":{\"Id\":\"robin-agent\",\"MainPID\":9186,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:59:23.165676\",\"ActiveState\":\"active\"},\"postgresql-9.6\":{\"Id\":\"postgresql-9.6\",\"MainPID\":660,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:30:34.988682\",\"ActiveState\":\"active\"},\"monitor-server\":{\"Id\":\"monitor-server\",\"MainPID\":2687,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:31:32.455445\",\"ActiveState\":\"active\"},\"robin-server\":{\"Id\":\"robin-server\",\"MainPID\":64400,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-06 17:57:55.208502\",\"ActiveState\":\"active\"},\"robin-auth-server\":{\"Id\":\"robin-auth-server\",\"MainPID\":1010,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:30:50.844131\",\"ActiveState\":\"active\"}}}", "gpu_cores":0, "sysmem":[ 33555709952, 2689277952, 7223099392, 0, 0, 0, 22075457536, 843776 ], "ssd_faulted":0, 
"isol_shared_cores_used":0, "hugepages_1g":0, "ninstances":2, "maintenance_mode":"DISABLED", "memory_allocated":0, "hostname":"cscale-82-140.robinsystems.com", "datacenter":"default", "tags":{ "kubernetes.io\/os":[ "linux" ], "robin.io\/robinrpool":[ "default" ], "kubernetes.io\/arch":[ "amd64" ] }, "remove_taint":true, "hugepages_2m_allocated":0, "rpool":"default", "pods":110, "state":"ONLINE", "name":"cscale-82-140", "rcm_ha_role":"MANAGER_MASTER", "isol_total":0, "hugepages_2m":0, "hdd_faulted":0, "ipaddresses":[ { "mac_address":"00:15:5d:14:06:0e", "netmask":"255.255.0.0", "ip_address":"10.9.82.140" } ], "cpu_cores_allocated":0, "memory_reserved":6442450944.0, "cpu_cores":40, "hugepages_1g_allocated":0, "k8s_node_status":"Ready", "status":"Ready", "nics":[ { "allowed_vlans":[ ], "function":null, "numa_node":null, "vendor":null, "mtu":1500, "mac_address":"00:15:5d:14:06:0e", "bus":null, "name":"br0", "physical_nic":"eth0", "num_vfs":0, "linkstate":"", "all_vlans_allowed":false, "used_vfs":0, "native_vfdriver":null, "native_vlan":null, "vendor_desc":null, "domain":null, "untagged":false, "slot":null, "vfdrivers":[ ] } ], "public_hostname":"cscale-82-140.robinsystems.com", "cpu_cores_present":40, "sysinfo":{ "join_time":1596576678, "current_version":"5.3.0-171", "iqn":"iqn.1994-05.com.redhat:329b8568de1", "install_date":"Tue Mar 17 23:49:17 UTC 2020", "wwpns":[ ], "distribution":"CentOS Linux", "version":"#1 SMP Tue Mar 17 23:49:17 UTC 2020", "uuid":"", "boot_time":1596576222, "robin_software":[ { "version":"5.3.0", "patch":"", "full_version":"5.3.0-171", "install_date":"2020-08-03", "patch_date":"", "release":"171", "build_info":"robin-c2edf85eaa83a42ced9512e7de9c7c2f1e4fa962:robin-ui:9ee33fd00273ba19861d4dc3ef8c6169d822d3e0:robingraph:cf0ceefe696ccac2dbd2eeb1d28b859955452843" } ], "release":"3.10.0-1062.18.1.el7.x86_64", "system":"Linux", "processor":"x86_64" }, "disks":[ { "spf":0.8, "state":"READY", "type":"HDD", "zoneid":1596601846, "dev":"\/dev\/sdb", 
"max_alloc_slices":77, "free_alloc_slices":68, "model":null, "allocated":7, "maintenance_mode":"DISABLED", "max_throughput_intensive_vols_per_disk":1, "role":"Storage", "wwn":"0x600224804c48fd7e16c608dea0919064", "status":"ONLINE", "make":null, "devpath":"\/dev\/disk\/by-id\/scsi-3600224804c48fd7e16c608dea0919064", "alloc_slices":9, "reattachable":0, "max_volumes_per_disk":10, "protected":0, "capacity":107374182400, "node_ref":1, "max_latency_sensitive_vols_per_disk":2, "pused":234881024, "pfree":104287174656 }, { "spf":0.8, "state":"READY", "type":"HDD", "zoneid":1596601846, "dev":"\/dev\/sdc", "max_alloc_slices":77, "free_alloc_slices":53, "model":null, "allocated":20, "maintenance_mode":"DISABLED", "max_throughput_intensive_vols_per_disk":1, "role":"Storage", "wwn":"0x600224803bcdafde95b1f5cd27ceb5fb", "status":"ONLINE", "make":null, "devpath":"\/dev\/disk\/by-id\/scsi-3600224803bcdafde95b1f5cd27ceb5fb", "alloc_slices":24, "reattachable":0, "max_volumes_per_disk":10, "protected":0, "capacity":107374182400, "node_ref":1, "max_latency_sensitive_vols_per_disk":2, "pused":939524096, "pfree":103582531584 }, { "spf":0.8, "state":"INIT", "type":"HDD", "zoneid":1596601846, "dev":"\/dev\/dm-1", "max_alloc_slices":5, "free_alloc_slices":5, "model":null, "allocated":0, "maintenance_mode":"DISABLED", "max_throughput_intensive_vols_per_disk":1, "role":"RootDisk", "wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-swap", "status":"UNKNOWN", "make":null, "devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphaFy4aq3EUo1yluonS8FG0LF16ycBrdEw", "alloc_slices":0, "reattachable":0, "max_volumes_per_disk":10, "protected":0, "capacity":8254390272, "node_ref":1, "max_latency_sensitive_vols_per_disk":2, "pused":0, "pfree":0 }, { "spf":0.8, "state":"INIT", "type":"HDD", "zoneid":1596601846, "dev":"\/dev\/dm-0", "max_alloc_slices":38, "free_alloc_slices":38, "model":null, "allocated":0, "maintenance_mode":"DISABLED", "max_throughput_intensive_vols_per_disk":1, 
"role":"RootDisk", "wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-root", "status":"UNKNOWN", "make":null, "devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphgpZcvqGdfOKaXbEbOZzNthc6btsoSXDj", "alloc_slices":0, "reattachable":0, "max_volumes_per_disk":10, "protected":0, "capacity":53687091200, "node_ref":1, "max_latency_sensitive_vols_per_disk":2, "pused":0, "pfree":0 }, { "spf":0.8, "state":"INIT", "type":"HDD", "zoneid":1596601846, "dev":"\/dev\/dm-2", "max_alloc_slices":32, "free_alloc_slices":32, "model":null, "allocated":0, "maintenance_mode":"DISABLED", "max_throughput_intensive_vols_per_disk":1, "role":"RootDisk", "wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-home", "status":"UNKNOWN", "make":null, "devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphQObDlS6eMUSpSxH5zsvyg9I5a0Gpuj5W", "alloc_slices":0, "reattachable":0, "max_volumes_per_disk":10, "protected":0, "capacity":44350570496, "node_ref":1, "max_latency_sensitive_vols_per_disk":2, "pused":0, "pfree":0 }, { "spf":0.8, "state":"INIT", "type":"HDD", "zoneid":1596601846, "dev":"\/dev\/sda", "max_alloc_slices":77, "free_alloc_slices":77, "model":null, "allocated":0, "maintenance_mode":"DISABLED", "max_throughput_intensive_vols_per_disk":1, "role":"RootDisk", "wwn":"0x600224801d3ac9b6650afd3280aa5898", "status":"UNKNOWN", "make":null, "devpath":"\/dev\/disk\/by-id\/scsi-3600224801d3ac9b6650afd3280aa5898", "alloc_slices":0, "reattachable":0, "max_volumes_per_disk":10, "protected":0, "capacity":107374182400, "node_ref":1, "max_latency_sensitive_vols_per_disk":2, "pused":0, "pfree":0 } ] } ], "total":1, "page_num":1, "nodes_count":1, "num_items":1, "page_size":1 } .. raw:: html
---------------------------------------
Show information about a specific host
---------------------------------------

.. tabs::

   .. tab:: CLI

      To display detailed information for a host, such as the storage allocation breakdown, discovered physical attributes with their utilization (NUMA configuration, network topology, etc.) and service details, issue the following command:

      .. code-block:: text

         # robin host info <hostname> --services --resources --config --consul --json

      ============================ ======================================================================================
      ``hostname``                 FQDN of host
      ``--services``               Show status information for the host
      ``--resources``              Show resource utilization for the host
      ``--config``                 Show config info
      ``--consul``                 Show consul cluster info
      ``--json``                   Output in JSON
      ============================ ======================================================================================

      **Example:**

      .. code-block:: text

         # robin host info poch01.robin.io

      .. raw:: html
Output .. code-block:: text Host: qct-07.robinsystems.com Zone Id: 1596566663 Host Id: 4 Type: physical Version: 5.3.0-172 Kernel Version: 3.10.0-1062.el7.x86_64 Boot Time: 04 Aug 2020 03:18:01 Resource pool: workers CPU: Total Cores: 80 Total Isolated Cores: 76 Total Non-Isolated Cores: 4 Non-Isolated CPUs allocated: 1 Shared Isolated CPUs allocated: 8 Dedicated Isolated CPUs allocated: 68 Provisioning Factor: 1 NUMA Topology: Node 0: Total Memory: 187G Total Isolated CPUs: 38 Total Non-Isolated CPUs: 2 Total Reserved CPUs: 0 Non-Isolated Pinned CPUs: 0 Isolated Shared Pinned CPUs: 0 Isolated Dedicated Pinned CPUs: 38 Total HugePages_1G: - Total HugePages_2M: - CPU List: 1-19,41-59 NIC List: enp94s0f0,enp59s0f0,enp59s0f1,enp94s0f1 Node 1: Total Memory: 188G Total Isolated CPUs: 38 Total Non-Isolated CPUs: 2 Total Reserved CPUs: 0 Non-Isolated Pinned CPUs: 0 Isolated Shared Pinned CPUs: 8 Isolated Dedicated Pinned CPUs: 30 Total HugePages_1G: - Total HugePages_2M: - CPU List: 21-39,61-79 NIC List: enp175s0f1,enp175s0f0 GPU: Total Cores: 0 Memory: System Total: 376G Allocatable Total: 376G Reserved: 6G Robin Manager services: - Robin Compute services: 4G Robin Storage services: 2G Memory allocated to instances: 88G Free Total: 292G HugePages_2M: Total: - Allocated for Robin apps: - HugePages_1G: Total: - Allocated for Robin apps: - POD Utilization: 83/110 Network: Bridge Interface: br0 Physical Interface: enp94s0f1 MTU: 1500 Product Info: 158B - Ethernet Controller XXV710 for 25GbE SFP28 (Ethernet Network Adapter XXV710) Vendor Info: 8086 - Intel Corporation NUMA Node: 0 H/W Info: 0000:5e:00.1 IP Addresses: 10.9.20.15/16 Interface: enp175s0f1 MTU: 1500 Product Info: 158B - Ethernet Controller XXV710 for 25GbE SFP28 (Ethernet Network Adapter XXV710) Vendor Info: 8086 - Intel Corporation NUMA Node: 1 H/W Info: 0000:af:00.1 Interface: enp175s0f0 MTU: 1500 Product Info: 158B - Ethernet Controller XXV710 for 25GbE SFP28 (Ethernet Network Adapter XXV710-2) Vendor Info: 
8086 - Intel Corporation NUMA Node: 1 H/W Info: 0000:af:00.0 Interface: enp94s0f0 MTU: 1500 Product Info: 158B - Ethernet Controller XXV710 for 25GbE SFP28 (Ethernet Network Adapter OCP XXV710-2) Vendor Info: 8086 - Intel Corporation NUMA Node: 0 H/W Info: 0000:5e:00.0 Interface: enp59s0f0 MTU: 1500 Product Info: 158B - Ethernet Controller XXV710 for 25GbE SFP28 (Ethernet Network Adapter XXV710-2) Vendor Info: 8086 - Intel Corporation NUMA Node: 0 H/W Info: 0000:3b:00.0 Interface: enp59s0f1 MTU: 1500 Product Info: 158B - Ethernet Controller XXV710 for 25GbE SFP28 (Ethernet Network Adapter XXV710) Vendor Info: 8086 - Intel Corporation NUMA Node: 0 H/W Info: 0000:3b:00.1 Public IP Address: 10.9.20.15 Public Hostname: qct-07.robinsystems.com Instances: 69 State: ONLINE Status: Ready K8s_Node_Status: Ready Maintenance Mode: DISABLED Consul state: UP Roles: STORAGE: ONLINE Status:READY COMPUTE: ONLINE Status:READY Storage: Type | Used (GB) | Robin Allocated (GB) | K8s Allocated (GB) | Total (GB) -----+-----------+----------------------+--------------------+------------ HDD | 38 | 538 | 16 | 893 SSD | - | - | - | - Services: Name | State | RoleTags | PID | Started -------------------+-------+----------+------+---------------------------- consul-client | UP | A | 351 | 2020-08-04 17:51:31.538657 consul-server | DOWN | M | 0 | 0 gui-cli | UP | - | 1132 | 2020-08-04 17:51:50.479143 httpd | UP | * | 1077 | 2020-08-04 17:51:50.116133 iomgr-server | UP | C S | 5391 | 2020-08-04 17:52:16.467809 monitor-server | UP | M C S | 539 | 2020-08-04 17:51:42.748944 postgresql-9.6 | DOWN | M | 0 | 0 robin-agent | UP | M C S | 600 | 2020-08-04 17:51:45.394012 robin-auth-server | DOWN | M* | 0 | 0 robin-event-server | DOWN | M* | 0 | 0 robin-file-server | DOWN | M* | 0 | 0 robin-node-monitor | DOWN | M* | 0 | 0 robin-server | DOWN | M* | 0 | 0 robin-watchdog | DOWN | M | 0 | 0 sherlock-server | DOWN | - | 0 | 0 stormgr-server | DOWN | M* | 0 | 0 Last updated (04 Aug 2020 18:40:14) UP: 
Running CRIT: Critical and Down DOWN: Not Running Root Disk storage info: Partition | Name | Size (GB) | Available (GB) --------------------------+-----------------+-----------+---------------- /var/log | RobinLog | 299 | 268 /var/lib/pgsql | Pgsql | 299 | 268 /var/crash | Crash | 299 | 268 /var/lib/robin | RobinLib | 299 | 268 /var/lib/[appropriateCRI] | ContainerImages | 15 | - Unused container images: 6G Image | Size (GB) --------------------------------------------+----------- k8s.gcr.io/kube-controller-manager:v1.18.6 | 0.15 k8s.gcr.io/kube-apiserver:v1.18.6 | 0.16 k8s.gcr.io/kube-scheduler:v1.18.6 | 0.09 robinsys/robinimg:5.2.7-18 | 3 quay.io/k8scsi/csi-provisioner:v1.6.0_robin | 0.04 k8s.gcr.io/kube-proxy:v1.17.5 | 0.11 k8s.gcr.io/kube-apiserver:v1.17.5 | 0.16 k8s.gcr.io/kube-controller-manager:v1.17.5 | 0.15 k8s.gcr.io/kube-scheduler:v1.17.5 | 0.09 quay.io/k8scsi/snapshot-controller:v2.1.0 | 0.04 k8s.gcr.io/pause:3.2 | 0.0 prom/prometheus:v2.16.0 | 0.12 quay.io/k8scsi/csi-attacher:v2.1.0 | 0.04 calico/typha:v3.11.1 | 0.05 calico/pod2daemon-flexvol:v3.11.1 | 0.1 calico/cni:v3.11.1 | 0.18 calico/kube-controllers:v3.11.1 | 0.05 quay.io/k8scsi/csi-provisioner:v1.4.0_robin | 0.05 k8s.gcr.io/kube-proxy:v1.16.3 | 0.08 k8s.gcr.io/kube-apiserver:v1.16.3 | 0.2 k8s.gcr.io/kube-controller-manager:v1.16.3 | 0.15 k8s.gcr.io/kube-scheduler:v1.16.3 | 0.08 k8s.gcr.io/coredns:1.6.5 | 0.04 metallb/controller:v0.8.2 | 0.04 metallb/speaker:v0.8.2 | 0.04 k8s.gcr.io/etcd:3.4.3-0 | 0.27 quay.io/k8scsi/csi-snapshotter:v1.2.2 | 0.04 k8s.gcr.io/etcd:3.3.15-0 | 0.23 quay.io/k8scsi/csi-attacher:v1.2.1 | 0.04 k8s.gcr.io/coredns:1.6.2 | 0.04 robinsys/genie-plugin:v3.0 | 0.02 quay.io/k8scsi/csi-provisioner:v1.0.0_robin | 0.04 quay.io/k8scsi/csi-provisioner:v0.4.1_robin | 0.04 robinsys/coredns:1.2.2 | 0.03 .. raw:: html
   .. tab:: API

      Returns detailed information for a host, such as the storage allocation breakdown, discovered physical attributes with their utilization (NUMA configuration, network topology, etc.) and service details.

      **End Point:** /api/v3/robin_server/hosts/

      **Method:** GET

      **URL Parameters:**

      - ``diskinfo=true`` : When this parameter is set, details of the disks attached to the specified host are returned.

      **Data Parameters:** None

      **Port:** RCM Port (default value is 29442)

      **Headers:**

      - ``Authorization: <token>`` : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

      **Success Response Code:** 200

      **Error Response Code:** 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error)

      **Example Response:**

      .. raw:: html
Output .. code-block:: text { "items":[ { "memory_used":2692743168, "hdd_lalloc":35433480192.0, "memory":33555709952, "lab":"default", "saas_mode":false, "zoneid":1596601846, "non_isol_total_with_prov":400, "zone_name":"default", "rack":"default", "ipaddresses":[ { "ip_address":"10.9.82.140", "mac_address":"00:15:5d:14:06:0e", "netmask":"255.255.0.0" } ], "public_ip":"10.9.82.140", "ssd_nonrobin_usage":0, "k8s_node_name":"cscale-82-140", "mem_for_storage":1073741824, "id":1, "ssd_for_storage":0, "rcm_ha_role":"MANAGER_MASTER", "ssd_robin_usage":0, "gpu_cores_allocated":0, "numa_map":{ "0":{ "memory_used":0, "hugepages_1g_used":0, "isol_total":0, "isol_shared_map":{ }, "cpu_reserved":0, "numa_id":0, "non_isol_cores_used":2, "cpu_ids":"", "cpu_used":0, "mem_used":1493172224, "non_isol_total":20, "hugepages_2m_used":0, "gpu_used":0, "isol_shared_cores_used":0, "hugepages_1g_total":0, "cpu_total":20, "hugepages_2m_total":0, "memory_total":16777626965, "isol_dedicated_cores_used":0 }, "1":{ "memory_used":0, "hugepages_1g_used":0, "isol_total":0, "isol_shared_map":{ }, "cpu_reserved":0, "numa_id":1, "non_isol_cores_used":0, "cpu_ids":"", "cpu_used":0, "mem_used":0, "non_isol_total":20, "hugepages_2m_used":0, "gpu_used":0, "isol_shared_cores_used":0, "hugepages_1g_total":0, "cpu_total":20, "hugepages_2m_total":0, "memory_total":16778082988, "isol_dedicated_cores_used":0 } }, "roles":[ [ "MANAGER", "ONLINE", "READY" ], [ "COMPUTE", "ONLINE", "READY" ], [ "STORAGE", "ONLINE", "READY" ] ], "ssd_max_alloc_slices":0, "cpu_prov_factor":10, "services":{ "update_time":1596761892.3700919151, "services":{ "consul_dns":true, "stormgr-server":{ "MainPID":2299, "Type":"simple", "ExecMainStartTimestamp":"2020-08-04 14:31:26.794916", "RoleTags":[ "M*" ], "Id":"stormgr-server", "State":"UP", "ActiveState":"active" }, "gui-cli":{ "MainPID":2647, "Type":"simple", "ExecMainStartTimestamp":"2020-08-04 14:31:32.291459", "RoleTags":[ "-" ], "Id":"gui-cli", "State":"UP", "ActiveState":"active" 
}, "consul-client":{ "MainPID":0, "Type":"simple", "ExecMainStartTimestamp":0, "RoleTags":[ "A" ], "Id":"consul-client", "State":"DOWN", "ActiveState":"inactive" }, "robin-server":{ "MainPID":64400, "Type":"simple", "ExecMainStartTimestamp":"2020-08-06 17:57:55.208502", "RoleTags":[ "M*" ], "Id":"robin-server", "State":"UP", "ActiveState":"active" }, "robin-node-monitor":{ "MainPID":1278, "Type":"simple", "ExecMainStartTimestamp":"2020-08-04 14:31:13.459063", "RoleTags":[ "M*" ], "Id":"robin-node-monitor", "State":"UP", "ActiveState":"active" }, "iomgr-server":{ "MainPID":7384, "Type":"simple", "ExecMainStartTimestamp":"2020-08-04 14:55:13.851492", "RoleTags":[ "C", "S" ], "Id":"iomgr-server", "State":"UP", "ActiveState":"active" }, "consul_members":[ { "DelegateMax":5, "ProtocolCur":2, "Port":29460, "Status":1, "ProtocolMax":5, "DelegateCur":4, "Tags":{ "dc":"consul", "role":"consul", "vsn":"2", "wan_join_port":"29461", "segment":"", "port":"29459", "raft_vsn":"2", "vsn_min":"2", "vsn_max":"3", "id":"9dbc13cd-bbb4-1bf1-9bcd-f3d7e0f0026f", "bootstrap":"1", "build":"0.9.4:40f243a+" }, "ProtocolMin":1, "Name":"cscale-82-140.robinsystems.com", "Addr":"10.9.82.140", "DelegateMin":2 } ], "robin-file-server":{ "MainPID":1071, "Type":"simple", "ExecMainStartTimestamp":"2020-08-04 14:30:55.801664", "RoleTags":[ "M*" ], "Id":"robin-file-server", "State":"UP", "ActiveState":"active" }, "robin-event-server":{ "MainPID":1039, "Type":"simple", "ExecMainStartTimestamp":"2020-08-04 14:30:55.322709", "RoleTags":[ "M*" ], "Id":"robin-event-server", "State":"UP", "ActiveState":"active" }, "sherlock-server":{ "MainPID":0, "Type":"simple", "ExecMainStartTimestamp":0, "RoleTags":[ "-" ], "Id":"sherlock-server", "State":"DOWN", "ActiveState":"inactive" }, "robin-agent":{ "MainPID":9186, "Type":"simple", "ExecMainStartTimestamp":"2020-08-04 14:59:23.165676", "RoleTags":[ "M", "C", "S" ], "Id":"robin-agent", "State":"UP", "ActiveState":"active" }, "postgresql-9.6":{ "MainPID":660, 
"Type":"simple", "ExecMainStartTimestamp":"2020-08-04 14:30:34.988682", "RoleTags":[ "M" ], "Id":"postgresql-9.6", "State":"UP", "ActiveState":"active" }, "consul-server":{ "MainPID":564, "Type":"simple", "ExecMainStartTimestamp":"2020-08-04 14:30:32.432940", "RoleTags":[ "M" ], "Id":"consul-server", "State":"UP", "ActiveState":"active" }, "httpd":{ "MainPID":2613, "Type":"simple", "ExecMainStartTimestamp":"2020-08-04 14:31:32.124472", "RoleTags":[ "*" ], "Id":"httpd", "State":"UP", "ActiveState":"active" }, "robin-auth-server":{ "MainPID":1010, "Type":"simple", "ExecMainStartTimestamp":"2020-08-04 14:30:50.844131", "RoleTags":[ "M*" ], "Id":"robin-auth-server", "State":"UP", "ActiveState":"active" }, "robin-watchdog":{ "MainPID":860, "Type":"simple", "ExecMainStartTimestamp":"2020-08-04 14:30:37.982382", "RoleTags":[ "M" ], "Id":"robin-watchdog", "State":"UP", "ActiveState":"active" }, "monitor-server":{ "MainPID":2687, "Type":"simple", "ExecMainStartTimestamp":"2020-08-04 14:31:32.455445", "RoleTags":[ "M", "C", "S" ], "Id":"monitor-server", "State":"UP", "ActiveState":"active" } } }, "non_isol_total":40, "gpu_cores":0, "hdd_robin_usage":35433480192, "visibledisks":[ "0x600224801d3ac9b6650afd3280aa5898", "0x600224801d3ac9b6650afd3280aa5898-centos-root", "0x600224801d3ac9b6650afd3280aa5898-centos-swap", "0x600224801d3ac9b6650afd3280aa5898-centos-home", "0x600224804c48fd7e16c608dea0919064", "0x600224803bcdafde95b1f5cd27ceb5fb" ], "ssd_faulted":0, "isol_shared_cores_used":0, "nic_details":{ "br0":{ "vfdrivers":[ ], "product_desc":null, "all_vlans_allowed":false, "mtu":1500, "bus":null, "vendor_id":null, "slot":null, "physical_nic":"eth0", "allowed_vlans":[ ], "num_vfs":0, "function":null, "ips":[ "10.9.82.140\/16" ], "product_id":null, "local_cpulist":null, "native_vlan":null, "vendor_desc":null, "domain":null, "untagged":false, "used_vfs":0, "numa_node":null } }, "ninstances":2, "non_isol_cores_used":3, "maintenance_mode":"DISABLED", "memory_allocated":0, 
"mem_for_management":1073741824.0, "rpool_id":1, "hdd_for_storage":214748364800, "datacenter":"default", "tags":{ "kubernetes.io\/os":[ "linux" ], "robin.io\/robinrpool":[ "default" ], "kubernetes.io\/arch":[ "amd64" ] }, "hugepages_1g":0, "hugepages_2m_allocated":0, "rpool":"default", "hdd_total":428414599168, "ssd_free_alloc_slices":0, "pods":110, "state":"ONLINE", "status":"Ready", "instances":[ { "state":"STARTED", "name":"rohan-app.nginx.03", "hostname":"rohan-app-nginx-03.t001-u000003.svc.cluster.local" }, { "state":"STARTED", "name":"test-RIC-1.server.01", "hostname":"test-ric-1-server-01.t001-u000003.svc.cluster.local" } ], "isol_dedicated_cores_used":0, "host_type":"physical", "ssd_pused":0, "isol_total":0, "primary_ip":"10.9.82.140", "isol_shared_map":{ }, "hugepages_2m":0, "pods_used":26, "hdd_lused":0, "hdd_faulted":0, "napps":2, "hdd_nonrobin_usage":0, "cpu_cores_present":40, "ssd_total":0, "cpu_cores_allocated":0, "is_master":true, "memory_reserved":6442450944.0, "cpu_cores":400, "config":{ "stormgr_rest_port":29454, "monitor_host_mem_lowmark":0.8, "monitor_host_root_volume_highmark":0.9, "rio_rest_port":29456, "stormgr_rest_listen_addr":"127.0.0.1", "kvm_enabled":true, "hard_reset_on_isolation":0, "monitor_host_cpu_lowmark":0.8, "monitor_host_var_crash_volume_highmark":0.9, "monitor_interval":1, "monitor_host_var_pgsql_volume_lowmark":0.5, "kubelet_restart_bursttime":25, "server_rest_port":29442, "kubelet_restart_burstlimit":2, "event_server_port":29449, "rio_rpc_port":29453, "rdvm_bmapcache_skip_all":0, "rdvm_mem_maxcap":25769803776, "rest_server":"cscale-82-140.robinsystems.com", "registration_timeout":10, "rdvm_rpc_port":29452, "node_exporter_port":29457, "stormgr_rpc_port":29451, "monitor_host_root_volume_lowmark":0.85, "database_port":29458, "rdvm_rest_listen_addr":"127.0.0.1", "https_port":29443, "metrics_grafana_details":"{\"url\": \"\", \"auth\": \":\"}", "monitor_host_var_volume_lowmark":0.85, "monitor_host_cpu_highmark":0.85, 
"rio_rest_listen_addr":"127.0.0.1", "monitor_host_var_robin_volume_highmark":0.9, "rdvm_mem_alloc":1073741824, "monitor_num_samples":3600, "monitor_host_swap_lowmark":0.75, "watchdog_loop_interval":3, "rdvm_rest_port":29455, "monitor_container_swap_highmark":0.8, "consul_serfwan_port":29461, "saas_mode":false, "file_object_cache":"\/var\/lib\/robin\/file_object_cache", "node_monitor_port":29467, "monitor_influx_details":"{\"url\": \"\", \"dbname\": \"robin\", \"auth\": \":\" }", "consul_http_port":29462, "monitor_container_swap_lowmark":0.75, "hostname":"cscale-82-140.robinsystems.com", "monitor_host_var_volume_highmark":0.9, "network_type":4, "suicide_threshold":50, "mem_for_compute":null, "mem_for_management":null, "sherlock_rest_port":29446, "nfs_mount_options":"nolock,rw,timeo=60", "rediscover_timeout":120, "kvm_emulatorpin_cpuset":"", "rdvm_bmapcache_invalidate_all":0, "consul_serflan_port":29460, "monitor_host_var_robin_volume_lowmark":0.85, "rest_port":29450, "monitor_report_interval":5, "host_type":"physical", "monitor_host_swap_highmark":0.8, "nodejs_port":29447, "monitor_push_interval":60, "ovs_enabled":true, "monitor_host_var_log_volume_highmark":0.9, "kubelet_restart_tolerance":15, "monitor_host_mem_highmark":0.85, "monitor_host_var_log_volume_lowmark":0.85, "log_level":10, "monitor_host_var_crash_volume_lowmark":0.85, "monitor_container_volume_highmark":0.9, "monitor_container_volume_lowmark":0.85, "consul_server_port":29459, "monitor_host_var_pgsql_volume_highmark":0.7 }, "ssd_lused":0, "hugepages_1g_allocated":0, "k8s_node_status":"Ready", "hdd_free_alloc_slices":293131517952.0, "ssd_lalloc":0, "hostname":"cscale-82-140.robinsystems.com", "public_hostname":"cscale-82-140.robinsystems.com", "hdd_max_alloc_slices":328564998144.0, "memory_total":33555709952, "mem_for_compute":4294967296, "sysinfo":{ "join_time":1596576678, "current_version":"5.3.0-171", "iqn":"iqn.1994-05.com.redhat:329b8568de1", "install_date":"Tue Mar 17 23:49:17 UTC 2020", "wwpns":[ 
], "distribution":"CentOS Linux", "version":"#1 SMP Tue Mar 17 23:49:17 UTC 2020", "uuid":"", "boot_time":1596576222, "robin_software":[ { "version":"5.3.0", "patch":"", "full_version":"5.3.0-171", "install_date":"2020-08-03", "patch_date":"", "release":"171", "build_info":"robin-c2edf85eaa83a42ced9512e7de9c7c2f1e4fa962:robin-ui:9ee33fd00273ba19861d4dc3ef8c6169d822d3e0:robingraph:cf0ceefe696ccac2dbd2eeb1d28b859955452843" } ], "release":"3.10.0-1062.18.1.el7.x86_64", "system":"Linux", "processor":"x86_64" }, "hdd_pused":1174405120, "disks":[ { "spf":0.8, "zoneid":1596601846, "dev":"\/dev\/sda", "aslices":0, "nodeid":1, "maintenance_mode":"OFF", "role":"RootDisk", "protected":0, "status":"UNKNOWN", "make":null, "reattachable_nodes":[ [ "cscale-82-140.robinsystems.com", "ONLINE" ] ], "capacity":107374182400, "max_latency_sensitive_vols_per_disk":2, "pfree":0, "node_hostname":"cscale-82-140.robinsystems.com", "tags":{ }, "pused":0, "type":"HDD", "nvols":0, "state":"INIT", "reattachpolicy":{ "restarts_done":0, "burst_count":0, "burst_start_time":0, "burst_interval":600, "id":1, "restart_limit":5 }, "max_alloc_slices":77, "stormgrid":0, "free_alloc_slices":77, "slices":0, "availability_zone":null, "max_throughput_intensive_vols_per_disk":1, "model":null, "lused_size":0, "devpath":"\/dev\/disk\/by-id\/scsi-3600224801d3ac9b6650afd3280aa5898", "alloc_slices":0, "reattachable":0, "max_volumes_per_disk":10, "wwn":"0x600224801d3ac9b6650afd3280aa5898", "allocations":[ ], "alloc_score":0, "node_ref":1, "preserved":0 }, { "spf":0.8, "zoneid":1596601846, "dev":"\/dev\/dm-0", "aslices":0, "nodeid":1, "maintenance_mode":"OFF", "role":"RootDisk", "protected":0, "status":"UNKNOWN", "make":null, "reattachable_nodes":[ [ "cscale-82-140.robinsystems.com", "ONLINE" ] ], "capacity":53687091200, "max_latency_sensitive_vols_per_disk":2, "pfree":0, "node_hostname":"cscale-82-140.robinsystems.com", "tags":{ }, "pused":0, "type":"HDD", "nvols":0, "state":"INIT", "reattachpolicy":{ 
"restarts_done":0, "burst_count":0, "burst_start_time":0, "burst_interval":600, "id":4, "restart_limit":5 }, "max_alloc_slices":38, "stormgrid":0, "free_alloc_slices":38, "slices":0, "availability_zone":null, "max_throughput_intensive_vols_per_disk":1, "model":null, "lused_size":0, "devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphgpZcvqGdfOKaXbEbOZzNthc6btsoSXDj", "alloc_slices":0, "reattachable":0, "max_volumes_per_disk":10, "wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-root", "allocations":[ ], "alloc_score":0, "node_ref":1, "preserved":0 }, { "spf":0.8, "zoneid":1596601846, "dev":"\/dev\/dm-1", "aslices":0, "nodeid":1, "maintenance_mode":"OFF", "role":"RootDisk", "protected":0, "status":"UNKNOWN", "make":null, "reattachable_nodes":[ [ "cscale-82-140.robinsystems.com", "ONLINE" ] ], "capacity":8254390272, "max_latency_sensitive_vols_per_disk":2, "pfree":0, "node_hostname":"cscale-82-140.robinsystems.com", "tags":{ }, "pused":0, "type":"HDD", "nvols":0, "state":"INIT", "reattachpolicy":{ "restarts_done":0, "burst_count":0, "burst_start_time":0, "burst_interval":600, "id":5, "restart_limit":5 }, "max_alloc_slices":5, "stormgrid":0, "free_alloc_slices":5, "slices":0, "availability_zone":null, "max_throughput_intensive_vols_per_disk":1, "model":null, "lused_size":0, "devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphaFy4aq3EUo1yluonS8FG0LF16ycBrdEw", "alloc_slices":0, "reattachable":0, "max_volumes_per_disk":10, "wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-swap", "allocations":[ ], "alloc_score":0, "node_ref":1, "preserved":0 }, { "spf":0.8, "zoneid":1596601846, "dev":"\/dev\/dm-2", "aslices":0, "nodeid":1, "maintenance_mode":"OFF", "role":"RootDisk", "protected":0, "status":"UNKNOWN", "make":null, "reattachable_nodes":[ [ "cscale-82-140.robinsystems.com", "ONLINE" ] ], "capacity":44350570496, "max_latency_sensitive_vols_per_disk":2, "pfree":0, "node_hostname":"cscale-82-140.robinsystems.com", "tags":{ }, 
"pused":0, "type":"HDD", "nvols":0, "state":"INIT", "reattachpolicy":{ "restarts_done":0, "burst_count":0, "burst_start_time":0, "burst_interval":600, "id":6, "restart_limit":5 }, "max_alloc_slices":32, "stormgrid":0, "free_alloc_slices":32, "slices":0, "availability_zone":null, "max_throughput_intensive_vols_per_disk":1, "model":null, "lused_size":0, "devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphQObDlS6eMUSpSxH5zsvyg9I5a0Gpuj5W", "alloc_slices":0, "reattachable":0, "max_volumes_per_disk":10, "wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-home", "allocations":[ ], "alloc_score":0, "node_ref":1, "preserved":0 }, { "spf":0.8, "zoneid":1596601846, "dev":"\/dev\/sdb", "aslices":7, "nodeid":1, "maintenance_mode":"OFF", "role":"Storage", "write_unit":4096, "status":"ONLINE", "make":null, "reattachable_nodes":[ [ "cscale-82-140.robinsystems.com", "ONLINE" ] ], "protected":0, "capacity":107374182400, "max_latency_sensitive_vols_per_disk":2, "pfree":104287174656, "node_hostname":"cscale-82-140.robinsystems.com", "tags":{ }, "pused":234881024, "type":"HDD", "nvols":3, "state":"READY", "reattachpolicy":{ "restarts_done":0, "burst_count":0, "burst_start_time":0, "burst_interval":600, "id":2, "restart_limit":5 }, "max_alloc_slices":77, "stormgrid":1, "free_alloc_slices":68, "slices":6390, "availability_zone":null, "max_throughput_intensive_vols_per_disk":1, "model":null, "lused_size":0, "devpath":"\/dev\/disk\/by-id\/scsi-3600224804c48fd7e16c608dea0919064", "alloc_slices":9, "reattachable":0, "max_volumes_per_disk":10, "wwn":"0x600224804c48fd7e16c608dea0919064", "allocations":[ { "vols":[ { "media":"HDD", "pused":167772160, "id":"1", "size":5368709120, "state":"ONLINE", "name":"file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53" } ], "volume_group":"file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53.72.1.673abece-0975-4234-9fc2-56a06bf54031", 
"name":"file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53.0.970a44c7-a15c-4612-ac57-9b4f15ae386e", "volume":{ "media":"HDD", "pused":167772160, "id":"1", "size":5368709120, "state":"ONLINE", "name":"file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53" }, "slices":5 }, { "vols":[ { "media":"HDD", "pused":67108864, "id":"8", "size":1073741824, "state":"ONLINE", "name":"test-RIC-1.server.01.data.1.382f1ad5-1294-4e24-8297-9c6025eacfe5" } ], "volume_group":"test-RIC-1.server.01.72.1.ea297971-f931-4787-99cc-6782e026b77c", "name":"test-RIC-1.server.01.72.1.ea297971-f931-4787-99cc-6782e026b77c.0.1e8feb41-fd42-4409-b8c1-751331febdc1", "volume":{ "media":"HDD", "pused":67108864, "id":"8", "size":1073741824, "state":"ONLINE", "name":"test-RIC-1.server.01.data.1.382f1ad5-1294-4e24-8297-9c6025eacfe5" }, "slices":2 }, { "vols":[ { "media":"HDD", "pused":0, "id":"9", "size":1073741824, "state":"ONLINE", "name":"test-RIC-1.server.01.block.1.1053eaeb-4542-42a5-a173-d69a76703ead" } ], "volume_group":"test-RIC-1.server.01.72.1.1e483fd0-2d5c-434c-aef0-91a87796977a", "name":"test-RIC-1.server.01.72.1.1e483fd0-2d5c-434c-aef0-91a87796977a.0.5bf728c4-ea26-4e0f-82d7-c584fcf0bd9a", "volume":{ "media":"HDD", "pused":0, "id":"9", "size":1073741824, "state":"ONLINE", "name":"test-RIC-1.server.01.block.1.1053eaeb-4542-42a5-a173-d69a76703ead" }, "slices":2 } ], "alloc_score":95, "node_ref":1, "preserved":0 }, { "spf":0.8, "zoneid":1596601846, "dev":"\/dev\/sdc", "aslices":20, "nodeid":1, "maintenance_mode":"OFF", "role":"Storage", "write_unit":4096, "status":"ONLINE", "make":null, "reattachable_nodes":[ [ "cscale-82-140.robinsystems.com", "ONLINE" ] ], "protected":0, "capacity":107374182400, "max_latency_sensitive_vols_per_disk":2, "pfree":103582531584, "node_hostname":"cscale-82-140.robinsystems.com", "tags":{ }, "pused":939524096, "type":"HDD", "nvols":1, "state":"READY", "reattachpolicy":{ "restarts_done":0, "burst_count":0, "burst_start_time":0, 
"burst_interval":600, "id":3, "restart_limit":5 }, "max_alloc_slices":77, "stormgrid":2, "free_alloc_slices":53, "slices":6390, "availability_zone":null, "max_throughput_intensive_vols_per_disk":1, "model":null, "lused_size":0, "devpath":"\/dev\/disk\/by-id\/scsi-3600224803bcdafde95b1f5cd27ceb5fb", "alloc_slices":24, "reattachable":0, "max_volumes_per_disk":10, "wwn":"0x600224803bcdafde95b1f5cd27ceb5fb", "allocations":[ { "vols":[ { "media":"HDD", "pused":939524096, "id":"16", "size":21474836480, "state":"ONLINE", "name":"rohan-app.nginx.03.data.1.83d03fbf-3bfe-4723-8abe-5cbd51014e0c" } ], "volume_group":"rohan-app.nginx.03.72.1.44251a23-0221-4fda-837e-db26bca3ccb8", "name":"rohan-app.nginx.03.72.1.44251a23-0221-4fda-837e-db26bca3ccb8.0.6e2afb65-5c4c-42e4-972e-161de3fb3856", "volume":{ "media":"HDD", "pused":939524096, "id":"16", "size":21474836480, "state":"ONLINE", "name":"rohan-app.nginx.03.data.1.83d03fbf-3bfe-4723-8abe-5cbd51014e0c" }, "slices":24 } ], "alloc_score":89, "node_ref":1, "preserved":0 } ] } ] } .. raw:: html
=================
Disabling a node
=================

In certain situations, a user might not want any resources for an application to be allocated from a particular host, either because the physical machine is malfunctioning or because the host is temporarily undergoing maintenance. Instead of requiring the user to remove the node from the cluster, Robin allows the host to be placed into maintenance mode. This isolates the host with regard to resource availability: none of the host's storage capacity can be used for future application deployments, regardless of the Robin roles assigned to the node. This mode can be toggled using the commands detailed below. For more granular control, see the section on disabling and enabling particular roles.

The following commands are described in this section:

================================== =================================================================================
``robin host set-maintenance``     Place a host into maintenance mode
``robin host unset-maintenance``   Place a host into non-maintenance (normal) mode
================================== =================================================================================

-------------------------------------
Placing a host into maintenance mode
-------------------------------------

.. tabs::

   .. tab:: CLI

      To put a host into maintenance mode, and thus temporarily suspend it from providing storage or compute resources for future application deployments, issue the following command:

      .. code-block:: text

         # robin host set-maintenance <hostname>

      ============================ ======================================================================================
      ``hostname``                 FQDN of host
      ============================ ======================================================================================

      **Example:**

      .. code-block:: text

         # robin host set-maintenance vnode36.robinsystems.com
         Host vnode36.robinsystems.com set in maintenance mode

   .. tab:: API

      Puts a host into maintenance mode, which in turn temporarily suspends it from providing storage and compute resources for application deployments.

      **End Point:** /api/v3/robin_server/hosts/

      **Method:** PUT

      **URL Parameters:** None

      **Data Parameters:**

      - ``action: set_maintenance`` - This mandatory field within the payload specifies that the set maintenance mode operation is to be performed.

      **Port:** RCM Port (default value is 29442)

      **Headers:**

      - ``Authorization: <token>`` : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

      **Success Response Code:** 200

      **Error Response Code:** 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)

      **Example Response:**

      .. raw:: html
Output .. code-block:: text { "message":"Maintenance mode set" } .. raw:: html
-----------------------------------------
Placing a host into non-maintenance mode
-----------------------------------------

.. tabs::

   .. tab:: CLI

      In order to return a host to its normal mode and thus allow it to provide resources for future application deployments, issue the following command:

      .. code-block:: text

         # robin host unset-maintenance <hostname>

      ============================ ======================================================================================
      ``hostname``                 FQDN of host
      ============================ ======================================================================================

      **Example:**

      .. code-block:: text

         # robin host unset-maintenance vnode36.robinsystems.com
         Host vnode36.robinsystems.com out of maintenance mode

   .. tab:: API

      Removes a host from maintenance mode, which in turn allows it to provide storage and compute resources for application deployments.

      **End Point:** /api/v3/robin_server/hosts/

      **Method:** PUT

      **URL Parameters:** None

      **Data Parameters:**

      - ``action: unset_maintenance`` - This mandatory field within the payload specifies that the unset maintenance mode operation is to be performed.

      **Port:** RCM Port (default value is 29442)

      **Headers:**

      - ``Authorization: `` : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

      **Success Response Code:** 200

      **Error Response Code:** 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid Api Usage Error)

      **Example Response:**

      .. raw:: html
Output .. code-block:: text { "message":"Maintenance mode unset" } .. raw:: html
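
As an illustration of the two API calls described above, the following Python sketch builds (but does not send) the ``PUT`` request that toggles maintenance mode. Only the endpoint, the method, the ``action`` values, the default port, and the ``Authorization`` header come from the descriptions above. The function name ``build_maintenance_request``, the ``https`` scheme, the JSON encoding of the payload, and the placement of the host name at the end of the endpoint path are assumptions made for illustration and should be checked against the cluster's actual API.

```python
import json
import urllib.request

ENDPOINT = "/api/v3/robin_server/hosts/"  # endpoint as documented above
DEFAULT_RCM_PORT = 29442                  # documented default RCM port


def build_maintenance_request(server, hostname, token, enable, port=DEFAULT_RCM_PORT):
    """Build a PUT request that sets or unsets maintenance mode for a host.

    `server` is the address the API is reached on, `hostname` the host to
    toggle, and `token` the value obtained from the login API.
    """
    action = "set_maintenance" if enable else "unset_maintenance"
    # Assumption: the target host is named by appending it to the endpoint path.
    url = f"https://{server}:{port}{ENDPOINT}{hostname}"
    # Assumption: the payload is sent as a JSON object.
    body = json.dumps({"action": action}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Authorization": token, "Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical values; nothing is sent over the network here.
    req = build_maintenance_request(
        "vnode36.robinsystems.com", "vnode36.robinsystems.com", "<token>", enable=True
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the returned object to ``urllib.request.urlopen`` would send the request. Per the descriptions above, a successful call returns status code 200 with a short JSON message.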
=======================
Removing Nodes
=======================

------------------
Remove Node on GKE
------------------

For Robin CNS installed on Google Kubernetes Engine (GKE), you must use the GKE UI to add or remove nodes. For more information, see `here `__.

**Node Removal Limitations**

When removing a node using the GKE UI, consider the following limitations:

- You must remove one node at a time. Although the GKE UI supports removing more than one node at once, Robin CNS supports removing only one node at a time.
- You cannot select a particular node for removal.

-----------------------------
Remove Node on Google Anthos
-----------------------------

You might need to remove a master or worker node if that node has technical issues. As part of node removal, you must first evacuate all volumes from the drives on the node that you want to remove.

Complete the following steps to remove a node from the cluster.

1. Run the following command to evacuate all volumes on all drives of the node that you are planning to remove.

   .. Note:: Use the ``robin drive list`` command to find the list of drives on the host.

   .. Important:: You need to repeat this step for all drives on the node.

   .. code-block:: text

      # robin drive evacuate

   **Example:**

   .. 
code-block:: text [root@hypervvm-72-42 ~]# kubectl get nodes NAME STATUS ROLES AGE VERSION hypervvm-72-42.robinsystems.com Ready control-plane,master 2d16h v1.25.5-gke.1001 hypervvm-72-43.robinsystems.com Ready control-plane,master 2d16h v1.25.5-gke.1001 hypervvm-72-44.robinsystems.com Ready control-plane,master 2d16h v1.25.5-gke.1001 hypervvm-72-45.robinsystems.com Ready worker 2d16h v1.25.5-gke.1001 hypervvm-72-46.robinsystems.com Ready worker 2d16h v1.25.5-gke.1001 [root@hypervvm-72-42 ~]# robin drive evacuate 0x60022480c3025c597a2fc0d7fc407ed4 --wait Are you sure you want to evacuate all the volumes on disk 0x60022480c3025c597a2fc0d7fc407ed4 [y/n] ?y Job: 362 Name: DiskEvacuate State: VALIDATED Error: 0 Job: 362 Name: DiskEvacuate State: WAITING Error: 0 Job: 362 Name: DiskEvacuate State: COMPLETED Error: 0 2. (Remove Worker node) Edit the **Anthos node-pool custom resource** file. Remove node IP details for a worker node from the file and save it. **Example:** .. code-block:: text [root@workstation nehacluster]# kubectl get nodepools.baremetal.cluster.gke.io -A --kubeconfig=nehacluster-kubeconfig NAMESPACE NAME READY RECONCILING STALLED UNDERMAINTENANCE UNKNOWN cluster-nehacluster nehacluster 3 0 0 0 0 cluster-nehacluster node-pool-1 2 0 0 0 0 kubectl edit nodepool --kubeconfig ./nehacluster-kubeconfig -n cluster-nehacluster node-pool-1 **After removing IP address details from the file**. [root@workstation nehacluster]# kubectl get nodes --kubeconfig=nehacluster-kubeconfig NAME STATUS ROLES AGE VERSION hypervvm-72-42.robinsystems.com Ready control-plane,master 2d16h v1.25.5-gke.1001 hypervvm-72-43.robinsystems.com Ready control-plane,master 2d16h v1.25.5-gke.1001 hypervvm-72-44.robinsystems.com Ready control-plane,master 2d16h v1.25.5-gke.1001 hypervvm-72-45.robinsystems.com Ready worker 2d16h v1.25.5-gke.1001 3. (Remove master node ) Edit the **nodepool spec in the cluster resource file**. Remove the master node IP details from the file and save it. 
**Example** .. code-block:: text [root@workstation nehacluster]# kubectl edit cluster nehacluster -n cluster-nehacluster --kubeconfig ./nehacluster-kubeconfig [root@workstation nehacluster]# kubectl get nodes --kubeconfig=nehacluster-kubeconfig NAME STATUS ROLES AGE VERSION hypervvm-72-42.robinsystems.com Ready control-plane,master 2d16h v1.25.5-gke.1001 hypervvm-72-43.robinsystems.com Ready,SchedulingDisabled control-plane,master 2d16h v1.25.5-gke.1001 hypervvm-72-44.robinsystems.com Ready control-plane,master 2d16h v1.25.5-gke.1001 hypervvm-72-45.robinsystems.com Ready worker 2d16h v1.25.5-gke.1001 hypervvm-72-46.robinsystems.com Ready worker 17m v1.25.5-gke.1001 **After removing IP address details from the file**. [root@workstation nehacluster]# kubectl get nodes --kubeconfig=nehacluster-kubeconfig NAME STATUS ROLES AGE VERSION hypervvm-72-42.robinsystems.com Ready control-plane,master 2d17h v1.25.5-gke.1001 hypervvm-72-44.robinsystems.com Ready control-plane,master 2d17h v1.25.5-gke.1001 hypervvm-72-45.robinsystems.com Ready worker 2d16h v1.25.5-gke.1001 hypervvm-72-46.robinsystems.com Ready worker 25m v1.25.5-gke.1001 ========================================= Managing a cluster via the remote client ========================================= In addition to the Robin CLI, which is available on all hosts where Robin is installed, a remote client is shipped with each cluster that is deployed. This client mirrors the functionality of the native CLI with regards to the commands available and hence it provides the management capabilities that are described throughout this document. One advantage of utilizing this client is that it can be used to manage a multitude of Robin clusters via the concept of ``contexts``. A ``context`` in this scenario refers to a Robin cluster and is identified by the server name or IP Address. 
In addition to this primary key, the following attributes can also be set within a ``context``: the port values for various Robin services (including the Robin Server, File Server, Event Server, Watchdog Server, and Metrics Server) and the logging level. These attributes are discussed in more detail in the following sections. After creating the appropriate ``context`` for a Robin cluster, set it as the current context to communicate with that cluster. The commands used to achieve this are described below.

The following commands are described in this section:

================================ =================================================================================
``robin client add-context``     Add a Robin cluster context
``robin client list-contexts``   List all registered Robin cluster contexts
``robin client set-current``     Set a Robin cluster context as the current context
``robin client update-context``  Update attributes for the current Robin cluster context
``robin client delete-context``  Delete a Robin cluster context
================================ =================================================================================

-------------------------------------
Downloading the Robin client
-------------------------------------

In order to download the Robin client from an existing Robin cluster, issue the following command:

.. code-block:: text

   # curl -k 'https://<manager ip>:<port>/api/v3/robin_server/download?file=robincli&os=<os>' -o robin

============================ ======================================================================================
``manager ip``               IP Address of the manager node or load balancer IP. Run ``robin manager list`` to get the IP.
``port``                     Port number for the Robin Server
``os``                       The operating system to download the client for. Supported operating systems include: Linux, MacOS.
============================ ====================================================================================== **Example**: .. code-block:: text # curl -k 'https://vnode42:29442/api/v3/robin_server/download?file=robincli&os=linux' -o robin % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 10.1M 100 10.1M 0 0 1421k 0 0:00:07 0:00:07 --:--:-- 1483k # ls -lart -rw-r--r-- 1 demo staff 10655536 Mar 26 14:12 robin ------------------------------------- Adding a Context ------------------------------------- .. tabs:: .. tab:: CLI A context is a construct that can be used to define a Robin cluster in a manner that the remote client can understand. In order to add a context, issue the following command: .. Note:: If a context already exists with the server specified, that context will be updated with the values supplied. .. code-block:: text # robin client add-context --port --file-port --event-port --metrics-port --log-level --product --set-current ============================ ====================================================================================== ``server`` FQDN/IP Address of the Master Node or VIP ``--port`` Port number for the Robin Server. Default value is 29442 ``--file-port`` Port number for the File Server. Default value is 29445 ``--event-port`` Port number for the Event Server. Default value is 29449 ``--metrics-port`` Port number for the Metrics Server. Default is 29446 ``--log-level`` Number indicating the verbosity of logs. Valid values are 10 (DEBUG), 20 (INFO), 40 (ERROR). Default value is 40. ``--product`` Type of ROBIN installation. Valid choices are 'platform' or 'storage'. Default value is 'platform'. ``--set-current`` Set context to be created as the current ============================ ====================================================================================== **Example**: .. 
code-block:: text # robin client add-context centos-60-214 --port 29443 Context robin-cluster-centos-60-214 created successfully ------------------------------------- Listing all available contexts ------------------------------------- .. tabs:: .. tab:: CLI In order to list all contexts that have already been registered with the client alongside additional details such as the port values specified or the log level, issue the following command: .. code-block:: text # robin client list-contexts --full ============================ ====================================================================================== ``--full`` Show additional details about all registered contexts ============================ ====================================================================================== **Example**: .. code-block:: text # robin client list-contexts --full | Server | Port | Version | Tenant | Last Login | Tenants | FPort | WPort | MPort | LogLevel ---+-----------------------------------+-------+------------+----------------+----------------------+----------------+-------+-------+-------+---------- | master.robin-server.service.robin | 29442 | - | - | - | | 29445 | 29444 | 29446 | ERR | centos-60-214 | 29443 | - | Administrators | - | | 29445 | 29444 | 29446 | ERR * | 172.19.174.194 | 29442 | 5.2.3-9842 | Administrators | 26 Mar 2020 16:10:58 | Administrators | 29445 | 29444 | 29446 | ERR .. Note:: The asterisk displayed above indicates the current context. ------------------------------------- Setting the current context ------------------------------------- .. tabs:: .. tab:: CLI In order to access a particular Robin cluster, its respective context needs to be set as the current context. To achieve this, issue the following command: .. 
code-block:: text # robin client set-current ============================ ====================================================================================== ``context`` The server attribute of the context to be set as current ============================ ====================================================================================== **Example**: .. code-block:: text # robin client set-current centos-60-214 Current context set to robin-cluster-centos-60-214 ------------------------------------- Updating the current context ------------------------------------- .. tabs:: .. tab:: CLI In certain situations, such as a reinstallation, the attributes of a context might be altered whilst retaining the same server IP Address or hostname. As a result, the context which refers to this cluster will have to be updated. In order to do so, issue the following command: .. Note:: The below command only updates the current context. .. code-block:: text # robin client update-context --port --file-port --event-port --metrics-port --log-level ============================ ====================================================================================== ``--port`` Updated port number for the Robin Server ``--file-port`` Updated port number for the File Server ``--event-port`` Updated port number for the Event Server ``--metrics-port`` Updated port number for the Metrics Server ``--log-level`` Updated number indicating the verbosity of logs. Valid values are 10 (DEBUG), 20 (INFO), 40 (ERROR) ============================ ====================================================================================== **Example**: .. code-block:: text # robin client update-context --port 29942 --file-port 29445 --watchdog-port 29444 --metrics-port 29446 Updating attributes for context robin-cluster-centos-60-214 Server: centos-60-214 Context config updated for robin-cluster-centos-60-214 ------------------------------------- Deleting a context ------------------------------------- .. 
tabs::

   .. tab:: CLI

      In order to remove a registered context, issue the following command:

      .. code-block:: text

         # robin client delete-context <context>

      ============================ ======================================================================================
      ``context``                  The server attribute of the context to be deleted
      ============================ ======================================================================================

      **Example**:

      .. code-block:: text

         # robin client delete-context centos-60-214
         Context centos-60-214 deleted
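
When several clusters need to be registered with the remote client, the ``robin client add-context`` invocations described in this section can be generated from a script. The following Python sketch only assembles the command line from the options documented above and does not run it. The helper name ``add_context_command`` is hypothetical, and treating ``--set-current`` as a flag that takes no value is an assumption based on the syntax shown earlier.

```python
import shlex


def add_context_command(server, port=None, file_port=None, event_port=None,
                        metrics_port=None, log_level=None, product=None,
                        set_current=False):
    """Return the argv list for `robin client add-context` using documented options.

    Options left as None are omitted so that the client's defaults apply.
    """
    argv = ["robin", "client", "add-context", server]
    options = [
        ("--port", port),
        ("--file-port", file_port),
        ("--event-port", event_port),
        ("--metrics-port", metrics_port),
        ("--log-level", log_level),
        ("--product", product),
    ]
    for flag, value in options:
        if value is not None:
            argv += [flag, str(value)]
    # Assumption: --set-current is a boolean switch with no argument.
    if set_current:
        argv.append("--set-current")
    return argv


if __name__ == "__main__":
    # Server names taken from the examples in this section.
    for server in ("centos-60-214", "172.19.174.194"):
        print(shlex.join(add_context_command(server, port=29442)))
```

The resulting list can be passed to ``subprocess.run`` on a machine where the Robin client is installed. Building an argument list rather than a single string avoids shell quoting issues with server names.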