5. Managing Nodes
5.1. Resource Discovery
As part of the Robin storage installation process, resource discovery runs on the node to collect details about its physical configuration, hardware limits and resource availability. The purpose of this is twofold. First, it gives Robin a better understanding of the storage resources the machine can provide for application deployment. Second, it allows Robin to better optimize the node's usage within the cluster.
The following properties of the node are discovered:
Disks
Details on what is captured for each of the above properties, and how it is captured, are described below.
5.1.1. Disk Discovery
Robin leverages multiple sources to discover the disks that are available to a node. Some of the commands and files used to obtain the details below are: lsblk, partprobe, pvs, blkid and /proc/mounts. The following details are captured for each disk (if present):
Devpath
Capacity
Physical Sector size
WWN (along with make and model)
Media type
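The attributes listed above correspond closely to columns that lsblk can report. The following is a minimal sketch, not Robin's actual discovery code, of how such details could be extracted from lsblk JSON output. The column selection, the summarize_disks helper, the rule that maps the ROTA flag to a media type, and the sample data (including its WWN) are all assumptions made for illustration.

```python
import json

# Hypothetical sample of the output of
#   lsblk --json --bytes -o NAME,TYPE,SIZE,PHY-SEC,WWN,ROTA
# Real output varies with the util-linux version.
SAMPLE = """
{"blockdevices": [
  {"name": "sda", "type": "disk", "size": 53687091200, "phy-sec": 512,
   "wwn": "0x600224801d3ac9b6", "rota": true,
   "children": [
     {"name": "sda1", "type": "part", "size": 1073741824, "phy-sec": 512,
      "wwn": "0x600224801d3ac9b6", "rota": true}
   ]},
  {"name": "nvme0n1", "type": "disk", "size": "107374182400",
   "phy-sec": "4096", "wwn": null, "rota": "0"}
]}
"""


def summarize_disks(lsblk_json):
    """Return one summary dict per whole disk found in lsblk JSON output."""
    disks = []
    for dev in json.loads(lsblk_json)["blockdevices"]:
        if dev.get("type") != "disk":
            continue  # ignore loop devices, ROMs and other top-level entries
        # Older util-linux releases emit every value as a string.
        rotational = dev.get("rota") in (True, "1")
        disks.append({
            "devpath": "/dev/" + dev["name"],
            "capacity_bytes": int(dev["size"]),
            "physical_sector_size": int(dev["phy-sec"]),
            "wwn": dev.get("wwn"),
            # Assumption: rotational devices are HDDs, everything else is SSD.
            "media_type": "HDD" if rotational else "SSD",
        })
    return disks


for disk in summarize_disks(SAMPLE):
    print(disk)
```

Partitions are nested under "children" in the JSON tree, so only the whole disks at the top level are summarized here.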
5.1.2. Disk Partitions (LVM) Discovery
Environments where resources are constrained, such as Edge servers, may not have dedicated data disks for Robin to consume and may instead contain only partitioned disks. By default, partitioned disks are discovered and marked as Reserved to avoid overwriting any user data. As a result, for the partition(s) to serve as data disks, they must be set up manually using the process described below.
First, ensure the target partitions are discovered appropriately by Robin by running the following commands:
# lsblk
sdb 8:16 0 50G 0 disk
├─sdb1 8:17 0 10G 0 part
└─sdb2 8:18 0 40G 0 part
└─vg-robinds 253:0 0 39G 0 lvm
# robin drive list --role=all
ID | WWN | Host | Path /dev/disk/by-id | Size(GB) | Movable | Type | Free/Max(GB) | Vols | Role | Status | LastOpr
---+------------------------------------------------------+--------------+------------------------------------------------------------------------------+----------+---------+------+--------------+------+----------+---------+---------
- | 0xQEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8 | vnode-89-142 | scsi-0QEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8 | 50 | N | HDD | 38/38 (100%) | 0/10 | RootDisk | UNKNOWN | INIT
- | 0xQEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8-vg-robinds | vnode-89-142 | dm-uuid-LVM-DOy7w9WSdGi2PcSERuceOHkfM7dotzr7nm5EuoAWsyAHkvbYGT02MaeWDro05F3R | 39 | N | HDD | 30/30 (100%) | 0/10 | Reserved | UNKNOWN | INIT
Note
Details on the robin drive list command can be found here.
To mark the partition as ready to be used as storage and confirm that its role has been updated accordingly, run the following commands:
# robin drive update 0xQEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8-vg-robinds --role storage --wait
Job: 5922 Name: DiskModify State: PROCESSED Error: 0
Job: 5922 Name: DiskModify State: COMPLETED Error: 0
# robin drive list
ID | WWN | Host | Path /dev/disk/by-id | Size(GB) | Movable | Type | Free/Max(GB) | Vols | Role | Status | LastOpr
---+------------------------------------------------------+--------------+------------------------------------------------------------------------------+----------+---------+------+--------------+------+----------+---------+---------
- | 0xQEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8-vg-robinds | vnode-89-142 | dm-uuid-LVM-DOy7w9WSdGi2PcSERuceOHkfM7dotzr7nm5EuoAWsyAHkvbYGT02MaeWDro05F3R | 39 | N | HDD | 30/30 (100%) | 0/10 | Storage | UNKNOWN | INIT
Note
Details on the robin drive update command can be found here.
Lastly, to initialize the partition and confirm that it is ready for use, retrieve the name of the host it is currently associated with and run the following commands:
# robin host add-role vnode-89-142 storage --wait
Job: 5923 Name: HostAddRoles State: PROCESSED Error: 0
Job: 5923 Name: HostAddRoles State: WAITING Error: 0
Job: 5923 Name: HostAddRoles State: COMPLETED Error: 0
# robin drive list
ID | WWN | Host | Path /dev/disk/by-id | Size(GB) | Movable | Type | Free/Max(GB) | Vols | Role | Status | LastOpr
---+------------------------------------------------------+--------------+------------------------------------------------------------------------------+----------+---------+------+--------------+------+----------+---------+---------
4 | 0xQEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8-vg-robinds | vnode-89-142 | dm-uuid-LVM-DOy7w9WSdGi2PcSERuceOHkfM7dotzr7nm5EuoAWsyAHkvbYGT02MaeWDro05F3R | 39 | N | HDD | 30/30 (100%) | 0/10 | Storage | ONLINE | READY
Note
Details on the robin host add-role command can be found here.
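When scripting the procedure above, the final check can be automated by parsing the pipe-delimited table that robin drive list prints. The sketch below is an illustration only: parse_table and drive_is_ready are hypothetical helpers, the sample text is the final listing from above with the separator row shortened, and treating Storage / ONLINE / READY as the "ready" state is an assumption based on that listing.

```python
SAMPLE = """
ID | WWN | Host | Path /dev/disk/by-id | Size(GB) | Movable | Type | Free/Max(GB) | Vols | Role | Status | LastOpr
---+-----+------+----------------------+----------+---------+------+--------------+------+------+--------+--------
4 | 0xQEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8-vg-robinds | vnode-89-142 | dm-uuid-LVM-DOy7w9WSdGi2PcSERuceOHkfM7dotzr7nm5EuoAWsyAHkvbYGT02MaeWDro05F3R | 39 | N | HDD | 30/30 (100%) | 0/10 | Storage | ONLINE | READY
"""


def parse_table(text):
    """Parse a pipe-delimited CLI table into a list of dicts keyed by header."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    headers = [h.strip() for h in lines[0].split("|")]
    rows = []
    for line in lines[1:]:
        if set(line.strip()) <= set("-+"):
            continue  # skip the separator row under the header
        cells = [c.strip() for c in line.split("|")]
        rows.append(dict(zip(headers, cells)))
    return rows


def drive_is_ready(rows, wwn):
    """Return True if the drive with this WWN is a Storage drive that is ONLINE and READY."""
    for row in rows:
        if row["WWN"] == wwn:
            state = (row["Role"], row["Status"], row["LastOpr"])
            return state == ("Storage", "ONLINE", "READY")
    return False  # drive not listed at all


rows = parse_table(SAMPLE)
print(drive_is_ready(rows, "0xQEMU_QEMU_HARDDISK_561637eb-07d0-4a0d-8-vg-robinds"))
```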
5.2. Robin Node Roles
In Robin CNS, Storage is the only role assigned to hosts. The operator assigns it automatically, and you can also assign it manually when required.
As its name indicates, the Storage role designates a node that is intended to provide storage for applications deployed on Robin. As a result, any volumes needed for deployed applications are created and mounted on devices on nodes with this role.
The following commands are described in this section:
robin host add-role | Add one or more role(s) to a host
robin host enable-role | Move one or more role(s) out of maintenance mode for a host
robin host disable-role | Move one or more role(s) into maintenance mode for a host
robin host remove-role | Remove one or more role(s) from a host
5.2.1. Adding role(s)
To add a role to a host that is already registered within the Robin cluster, issue the following command:
# robin host add-role [<hosts>] [<roles>]
--rpool <rpool>
--disks <disks>
<hosts> | Comma separated list of hosts to add role to.
<roles> | Comma separated list of roles. The only valid value is ‘storage’.
--rpool <rpool> | Assign a resource pool prior to adding a role.
--disks <disks> | Comma separated list of disk WWNs to add as part of storage role addition.
Example:
# robin host add-role centos-60-212,centos-60-214 storage --wait
Job: 18 Name: HostAddRoles State: VALIDATED Error: 0
Job: 18 Name: HostAddRoles State: WAITING Error: 0
Job: 18 Name: HostAddRoles State: COMPLETED Error: 0
Adds a role to a host that is already registered within the Robin cluster.
End Point: /api/v3/robin_server/hosts/<hostname>
Method: PUT
URL Parameters: None
Data Parameters:
action: add_roles - This mandatory field within the payload specifies that the add role operation is to be performed.
roles: <list_of_roles> - This mandatory field within the payload is a list of roles that should be added to the specified host. The only valid value is ‘storage’.
drives: <list_of_wwns> - Specifying a list of drive WWNs with this parameter adds the corresponding disks as part of the storage role addition.
rpool: <rpool_name> - Specifying a resource pool name with this parameter assigns that resource pool to the host prior to the addition of roles.
Port: RCM Port (default value is 29442)
Headers:
Authorization: <auth_token> - Authorization token to identify which user is sending the request. The token can be acquired from the login API.
Success Response Code: 202
Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)
Example Response:
Output
{
"jobid":20
}
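As an illustration of the endpoint described above, the following sketch builds (but does not send) the corresponding PUT request using only the Python standard library. Several details are assumptions not stated in this section: the https scheme, a JSON-encoded request body, the Content-Type header, and the server name robin-master.example.com. The helper build_add_roles_request is hypothetical.

```python
import json
import urllib.request

RCM_PORT = 29442  # default RCM port, per the endpoint description


def build_add_roles_request(server, hostname, token,
                            roles=("storage",), drives=None, rpool=None):
    """Build a PUT request that adds roles to a host (request is not sent)."""
    payload = {"action": "add_roles", "roles": list(roles)}
    if drives:
        payload["drives"] = list(drives)  # optional: WWNs of disks to add
    if rpool:
        payload["rpool"] = rpool          # optional: resource pool to assign first
    # Assumption: the API is served over https and accepts a JSON body.
    url = f"https://{server}:{RCM_PORT}/api/v3/robin_server/hosts/{hostname}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={"Authorization": token, "Content-Type": "application/json"},
    )


req = build_add_roles_request("robin-master.example.com", "centos-60-212",
                              "<auth_token>", rpool="default")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would submit it; a 202 response then carries the job ID shown in the example response.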
5.2.2. Enabling role(s)
To move a role that has already been added to a host out of maintenance mode, and thus enable it for use again, issue the following command:
# robin host enable-role [<host>] [<roles>]
<host> | Fully qualified hostname
<roles> | Valid values include: ‘storage’
Example:
# robin host enable-role centos-60-212.robinsystems.com Storage
Role(s) 'Storage' enabled on host centos-60-212.robinsystems.com
Enables a role that was previously disabled on a host.
End Point: /api/v3/robin_server/hosts/<hostname>
Method: PUT
URL Parameters: None
Data Parameters:
action: enable-role - This mandatory field within the payload specifies that the enable role operation is to be performed.
roles: <list_of_roles> - This mandatory field within the payload is a list of roles that should be enabled on the specified host.
Port: RCM Port (default value is 29442)
Headers:
Authorization: <auth_token> - Authorization token to identify which user is sending the request. The token can be acquired from the login API.
Success Response Code: 200
Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)
Example Response: On success the response is empty.
5.2.3. Disabling role(s)
Disabling a role in Robin puts that role into maintenance mode. For all intents and purposes, the host then does not have access to the role. This is useful for debugging and for temporarily reserving the host's resources. To move a role into maintenance mode and thus disable it for use, issue the following command:
# robin host disable-role [<host>] [<roles>]
<host> | Fully qualified hostname
<roles> | Valid values include: ‘storage’
Example:
# robin host disable-role centos-60-212.robinsystems.com Storage
Role(s) 'Storage' disabled on host centos-60-212.robinsystems.com
Disables a role for a host such that the host temporarily does not have access to it.
End Point: /api/v3/robin_server/hosts/<hostname>
Method: PUT
URL Parameters: None
Data Parameters:
action: disable-role - This mandatory field within the payload specifies that the disable role operation is to be performed.
roles: <list_of_roles> - This mandatory field within the payload is a list of roles that should be disabled on the specified host. Valid values include: ‘storage’.
Port: RCM Port (default value is 29442)
Headers:
Authorization: <auth_token> - Authorization token to identify which user is sending the request. The token can be acquired from the login API.
Success Response Code: 200
Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)
Example Response: On success the response is empty.
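The enable and disable operations share an endpoint and differ only in the action value, so a client can treat them as one toggle. The sketch below shows one possible way to build the payload and interpret the documented status codes. The helpers maintenance_payload and interpret_status are hypothetical, and lowercasing role names before sending is an assumption (the CLI examples use ‘Storage’ while the API lists ‘storage’ as the valid value).

```python
import json

VALID_ROLES = {"storage"}

# Error codes listed for the enable-role and disable-role endpoints.
ERRORS = {
    400: "invalid API usage",
    401: "unauthorized",
    404: "not found",
    500: "internal server error",
}


def maintenance_payload(roles, enable):
    """Build the JSON body that moves roles out of (enable) or into (disable) maintenance mode."""
    normalized = [role.lower() for role in roles]  # assumption: API expects lowercase
    unknown = sorted(set(normalized) - VALID_ROLES)
    if unknown:
        raise ValueError("unsupported role(s): " + ", ".join(unknown))
    action = "enable-role" if enable else "disable-role"
    return json.dumps({"action": action, "roles": normalized})


def interpret_status(status):
    """Map an HTTP status code to a short outcome; success is 200 with an empty body."""
    if status == 200:
        return "ok"
    return ERRORS.get(status, f"unexpected status {status}")


print(maintenance_payload(["Storage"], enable=False))
print(interpret_status(200), "/", interpret_status(404))
```

Note that these two actions are spelled with a hyphen (enable-role, disable-role), whereas the add and remove actions use an underscore (add_roles, remove_roles).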
5.2.4. Removing role(s)
To remove a role that had previously been assigned to a node, issue the following command:
# robin host remove-role [<hosts>] [<roles>]
--force
--yes
<hosts> | Comma separated list of hosts to remove role from.
<roles> | Comma separated list of roles. The only valid value is ‘storage’.
--force | Required if retrying a remove-role operation.
--yes | Do not prompt the user for confirmation of removal
Example:
# robin host remove-role centos-60-212 storage --wait
Job: 192 Name: HostRemoveRoles State: VALIDATED Error: 0
Job: 192 Name: HostRemoveRoles State: WAITING Error: 0
Job: 192 Name: HostRemoveRoles State: COMPLETED Error: 0
Removes a role that had previously been assigned to a host.
End Point: /api/v3/robin_server/hosts/<hostname>
Method: PUT
URL Parameters: None
Data Parameters:
action: remove_roles - This mandatory field within the payload specifies that the remove role operation is to be performed.
roles: <list_of_roles> - This mandatory field within the payload is a list of roles that should be removed from the specified host. The only valid value is ‘storage’.
force: true - This field is mandatory when retrying the role removal operation, otherwise it can be excluded.
Port: RCM Port (default value is 29442)
Headers:
Authorization: <auth_token> - Authorization token to identify which user is sending the request. The token can be acquired from the login API.
Success Response Code: 202
Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)
Example Response:
Output
{
"jobid":24
}
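Because role removal is asynchronous (202 with a job ID) and a retry requires the force field, client code typically needs two small pieces of logic. The following sketch illustrates both under the assumption that the payload is sent as JSON; remove_roles_payload and job_id_from_response are hypothetical helper names.

```python
import json


def remove_roles_payload(roles=("storage",), retry=False):
    """Build the remove_roles payload; 'force' is only included when retrying."""
    payload = {"action": "remove_roles", "roles": list(roles)}
    if retry:
        payload["force"] = True
    return payload


def job_id_from_response(status, body):
    """Extract the job ID from an accepted (202) response body."""
    if status != 202:
        raise RuntimeError(f"role removal was not accepted (HTTP {status})")
    return json.loads(body)["jobid"]


print(remove_roles_payload())
print(remove_roles_payload(retry=True))
print(job_id_from_response(202, '{"jobid":24}'))
```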
5.3. Gathering information on nodes
Robin exposes multiple endpoints that provide a means to obtain information about the hosts registered with a Robin cluster. The information returned is a combination of the physical attributes (obtained by the resource discovery process described in section 5.1), resource utilization and status of services for the host. This gives a user insight into the state of the cluster alongside granular details for each individual host, and enables application deployment planning.
The following commands are described in this section:
robin host list | View all hosts in a cluster
robin host info | Display detailed information about a host
5.3.1. List all hosts
In order to view all hosts within a cluster alongside information on their statuses (from Robin’s perspective), resource consumption, and roles within the cluster, issue the following command:
# robin host list --services
--resources
--network
--devices
--tags
--json
--services | Show status information for each host
--resources | Show resource utilization for each host
--network | Show network resource utilization info for each host
--devices | Show devices resource utilization info for each host
--tags | Show tag information for each host
--json | Output in JSON
Example 1 (Listing all hosts):
# robin host list
Id | Hostname | Version | Status | RPool | LastOpr | Roles | Isol Cores(SHR/DED/Total) | Non-Isol Cores | GPUs | Mem(Free/Alloc/Total) | HDD(#/Alloc/Total) | SSD(#/Alloc/Total) | Pod Usage | Joined Time
-------------+-------------------------------+-----------+--------+---------+---------+--------+---------------------------+----------------+------+-----------------------+--------------------+--------------------+-----------+----------------------
1596566663:1 | cscale-82-81.robinsystems.com | 5.3.0-172 | Ready | default | ONLINE | M*,S,C | 0/0/0 | 1/40 | 0/0 | 24G/6G/31G | 1/5G/100G | -/-/- | 11/110 | 04 Aug 2020 04:44:47
1596566663:2 | cscale-82-82.robinsystems.com | 5.3.0-172 | Ready | default | ONLINE | M,S,C | 0/0/0 | 2/40 | 0/0 | 24G/7G/31G | 2/40G/200G | -/-/- | 13/110 | 04 Aug 2020 04:51:52
1596566663:3 | cscale-82-83.robinsystems.com | 5.3.0-172 | Ready | default | ONLINE | M,S,C | 0/0/0 | 1/40 | 0/0 | 24G/6G/31G | 2/-/200G | -/-/- | 11/110 | 04 Aug 2020 04:58:25
1596566663:4 | qct-07.robinsystems.com | 5.3.0-172 | Ready | workers | ONLINE | S,C | 8/68/76 | 1/4 | 0/0 | 280G/95G/376G | 1/554G/893G | -/-/- | 83/110 | 04 Aug 2020 05:05:24
1596566663:5 | qct-08.robinsystems.com | 5.3.0-172 | Ready | workers | ONLINE | S,C | 0/76/76 | 1/4 | 0/0 | 284G/91G/376G | 1/547G/893G | -/-/- | 88/110 | 04 Aug 2020 05:12:06
1596566663:6 | qct-11.robinsystems.com | 5.3.0-172 | Ready | workers | ONLINE | S,C | 0/27/76 | 1/4 | 0/0 | 147G/40G/187G | 1/581G/893G | -/-/- | 34/110 | 04 Aug 2020 05:18:47
1596566663:7 | cscale-82-80.robinsystems.com | 5.3.0-172 | Ready | workers | ONLINE | C,S | 0/21/36 | 1/40 | 0/0 | 0.7G/30G/31G | 1/-/100G | -/-/- | 28/110 | 04 Aug 2020 20:21:40
Example 2 (Retrieving status information):
# robin host list --services
+-------------------------------+-------+-------+------+-------+-------+------+-------+------+------+------+-------+------+------+-------+---------+---------+
| Host | ConCl | ConSr | GCli | Httpd | Iomgr | RMon | Pgsql | RAgt | RAer | REvt | RFile | NMon | RSer | RWdog | Metrics | Stormgr |
+-------------------------------+-------+-------+------+-------+-------+------+-------+------+------+------+-------+------+------+-------+---------+---------+
| cscale-82-81.robinsystems.com | DOWN | UP | UP | UP | UP | UP | UP | UP | UP | UP | UP | UP | UP | UP | DOWN | UP |
| cscale-82-82.robinsystems.com | DOWN | UP | UP | UP | UP | UP | UP | UP | DOWN | DOWN | DOWN | DOWN | DOWN | UP | DOWN | DOWN |
| cscale-82-83.robinsystems.com | DOWN | UP | UP | UP | UP | UP | UP | UP | DOWN | DOWN | DOWN | DOWN | DOWN | UP | DOWN | DOWN |
| qct-07.robinsystems.com | UP | DOWN | UP | UP | UP | UP | DOWN | UP | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN |
| qct-08.robinsystems.com | UP | DOWN | UP | UP | UP | UP | DOWN | UP | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN |
| qct-11.robinsystems.com | UP | DOWN | UP | UP | UP | UP | DOWN | UP | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN |
| cscale-82-80.robinsystems.com | UP | DOWN | UP | UP | UP | UP | DOWN | UP | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN | DOWN |
+-------------------------------+-------+-------+------+-------+-------+------+-------+------+------+------+-------+------+------+-------+---------+---------+
UP: Running
CRIT: Critical and Down
DOWN: Not Running
Example 3 (Listing resource utilization for all hosts):
[root@cscale-82-73 ~]# robin host list --resources
Id | Hostname | Version | Status | RPool | Avail. Zone | Cores | GPUs | Mem | Hpages-2Mi | Hpages-1Gi | HDD(#/Alloc/Total) | SSD(#/Alloc/Total) | Pod Usage
-------------+-------------------------------+-----------+--------+---------+-------------+---------+-------+---------------+------------+------------+--------------------+--------------------+-----------
1637691944:1 | cscale-82-73.robinsystems.com | 5.3.5-214 | Ready | default | N/A | 35/5/40 | 0/0/0 | 22G/8G/31G | - | - | 2/-/200G | -/-/- | 73/27/100
1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | 5/75/80 | 0/0/0 | 300G/36G/336G | - | 33/6/40 | 1/-/8043G | -/-/- | 47/53/100
1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | 5/75/80 | 0/0/0 | 300G/36G/336G | - | 33/6/40 | 1/-/8043G | -/-/- | 48/52/100
* Note: all values indicated above in the format XX/XX/XX represent the Free/Allocated/Total values of the respective resource unless otherwise specified. In addition, allocated values for compute resources such as CPU, memory and pod usage include reserved values for the corresponding resource.
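To make the Free/Allocated/Total convention concrete, the sketch below splits such a cell and computes the allocated share. It is an illustration only: the helper names are hypothetical, and it assumes sizes carry at most a trailing G suffix, as in the sample output above, and that a dash means the value is not reported.

```python
def to_number(value):
    """Convert '35' or '22G' to a float; '-' (not reported) becomes None."""
    if value == "-":
        return None
    return float(value.rstrip("G"))


def parse_triple(cell):
    """Split a 'Free/Allocated/Total' cell into three numbers (or None)."""
    free, allocated, total = (to_number(part) for part in cell.split("/"))
    return free, allocated, total


def percent_allocated(cell):
    """Return the allocated share of the total in percent, or None if unknown."""
    _, allocated, total = parse_triple(cell)
    if allocated is None or not total:
        return None
    return round(100 * allocated / total, 1)


print(percent_allocated("35/5/40"))     # Cores column of the first host
print(percent_allocated("22G/8G/31G"))  # Mem column of the first host
print(percent_allocated("2/-/200G"))    # allocation not reported
```

Columns whose header states a different layout, such as HDD(#/Alloc/Total), still keep the allocated and total values in the second and third positions, so percent_allocated applies to them as well.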
Example 4 (Listing network resource utilization for all hosts):
[root@cscale-82-73 ~]# robin host list --network
Id | Hostname | Version | Status | RPool | Avail. Zone | NIC | State | PCIAddr | NUMA | VLANs | VFDrv | VFs
-------------+-------------------------------+-----------+--------+---------+-------------+------------+-------+--------------+------+-------+--------+----------
1637691944:1 | cscale-82-73.robinsystems.com | 5.3.5-214 | Ready | default | N/A | br0 | - | - | - | | - | -
1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f0 | - | 0000:af:00.0 | 1 | 20 | iavf | 16/16/32
1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f0 | - | 0000:af:00.0 | 1 | 20 | igbuio | 32/0/32
1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f1 | - | 0000:af:00.1 | 1 | 20 | iavf | 16/16/32
1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f1 | - | 0000:af:00.1 | 1 | 20 | igbuio | 32/0/32
1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f0 | - | 0000:3b:00.0 | 0 | 20 | iavf | 16/16/32
1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f0 | - | 0000:3b:00.0 | 0 | 20 | igbuio | 32/0/32
1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f1 | - | 0000:3b:00.1 | 0 | 20 | iavf | 16/16/32
1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f1 | - | 0000:3b:00.1 | 0 | 20 | igbuio | 32/0/32
1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | br0 | - | 0000:5e:00.1 | 0 | 10 | - | -
1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f0 | - | 0000:af:00.0 | 1 | 20 | iavf | 16/16/32
1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f0 | - | 0000:af:00.0 | 1 | 20 | igbuio | 32/0/32
1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f1 | - | 0000:af:00.1 | 1 | 20 | iavf | 16/16/32
1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp175s0f1 | - | 0000:af:00.1 | 1 | 20 | igbuio | 32/0/32
1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f0 | - | 0000:3b:00.0 | 0 | 20 | iavf | 16/16/32
1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f0 | - | 0000:3b:00.0 | 0 | 20 | igbuio | 32/0/32
1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f1 | - | 0000:3b:00.1 | 0 | 20 | iavf | 16/16/32
1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | enp59s0f1 | - | 0000:3b:00.1 | 0 | 20 | igbuio | 32/0/32
1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | br0 | - | 0000:5e:00.1 | 0 | 10 | - | -
* Note: all values indicated above in the format XX/XX/XX represent the Free/Allocated/Total values of the respective resource unless otherwise specified. In addition, allocated values for compute resources such as CPU, memory and pod usage include reserved values for the corresponding resource.
Example 5 (Listing device utilization for all hosts):
[root@qct-26 ~]# robin host list --devices
Id | Hostname | Version | Status | RPool | Avail. Zone | Type | Vendor | State | PCIAddr | NUMA | Devid | Driver | Count
-------------+-------------------------+-----------+-----------+---------+-------------+------+--------+-------+--------------+------+--------+---------+-------
1633606675:1 | qct-26.robinsystems.com | 5.3.9-286 | Ready | default | N/A | fpga | 0x8086 | - | 0000:1e:00.0 | 0 | 0x0d90 | vfiopci | 2/0/2
1633606675:2 | qct-28.robinsystems.com | 5.3.9-286 | Ready | default | N/A | fpga | 0x8086 | - | 0000:1e:00.0 | 0 | 0x0d90 | vfiopci | 0/2/2
1633606675:3 | qct-27.robinsystems.com | 5.3.9-286 | Ready | default | N/A | fpga | 0x8086 | - | 0000:1e:00.0 | 0 | 0x0d90 | vfiopci | 0/2/2
* Note: all values indicated above in the format XX/XX/XX represent the Free/Allocated/Total values of the respective resource unless otherwise specified. In addition, allocated values for compute resources such as CPU, memory and pod usage include reserved values for the corresponding resource.
Example 6 (Listing tags for all hosts):
[root@cscale-82-73 ~]# robin host list --tags
Id | Hostname | Version | Status | RPool | Avail. Zone | Rack | Lab | DC | Tags
-------------+-------------------------------+-----------+--------+---------+-------------+------+-----+----+-----------------------------------------------------------------------------------------------------------------------------------------------------
1637691944:1 | cscale-82-73.robinsystems.com | 5.3.5-214 | Ready | default | N/A | - | - | - | {'node-role.kubernetes.io/control-plane': [''], 'kubernetes.io/arch': ['amd64'], 'kubernetes.io/os': ['linux'], 'robin.io/robinrpool': ['default']}
1637691944:2 | qct-03.robinsystems.com | 5.3.5-214 | Ready | default | N/A | - | - | - | {'node-role.kubernetes.io/control-plane': [''], 'kubernetes.io/arch': ['amd64'], 'kubernetes.io/os': ['linux'], 'robin.io/robinrpool': ['default']}
1637691944:3 | qct-05.robinsystems.com | 5.3.5-214 | Ready | default | N/A | - | - | - | {'node-role.kubernetes.io/control-plane': [''], 'kubernetes.io/arch': ['amd64'], 'kubernetes.io/os': ['linux'], 'robin.io/robinrpool': ['default']}
* Note: all values indicated above in the format XX/XX/XX represent the Free/Allocated/Total values of the respective resource unless otherwise specified. In addition, allocated values for compute resources such as CPU, memory and pod usage include reserved values for the corresponding resource.
Returns information on all hosts within a cluster including details on their statuses (from Robin’s perspective), resource consumption, and roles within the cluster.
End Point: /api/v5/robin_server/hosts
Method: GET
URL Parameters:
details=tags - Using this parameter includes tag information for each host in the response payload.
Data Parameters: None
Port: RCM Port (default value is 29442)
Headers:
Authorization: <auth_token> - Authorization token to identify which user is sending the request. The token can be acquired from the login API.
Success Response Code: 200
Error Response Code: 500 (Internal Server Error)
Example Response:
Output
{
"items":[
{
"memory_used":2692743168,
"memory":33555709952,
"isol_shared_map":{
},
"zoneid":1596601846,
"non_isol_cores_used":3,
"pods_used":26,
"rack":"default",
"napps":2,
"non_isol_total":400,
"k8s_node_name":"cscale-82-140",
"mem_for_storage":1073741824,
"id":1,
"lab":"default",
"gpu_cores_allocated":0,
"isol_dedicated_cores_used":0,
"roles":[
[
"MANAGER",
"ONLINE",
"READY"
],
[
"COMPUTE",
"ONLINE",
"READY"
],
[
"STORAGE",
"ONLINE",
"READY"
]
],
"cpu_cores_used":0,
"cpu_prov_factor":10,
"services":"{\"update_time\":1596761892.3700919151,\"services\":{\"consul_dns\":true,\"stormgr-server\":{\"Id\":\"stormgr-server\",\"MainPID\":2299,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:31:26.794916\",\"ActiveState\":\"active\"},\"gui-cli\":{\"Id\":\"gui-cli\",\"MainPID\":2647,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:31:32.291459\",\"ActiveState\":\"active\"},\"consul-client\":{\"Id\":\"consul-client\",\"MainPID\":0,\"Type\":\"simple\",\"ExecMainStartTimestamp\":0,\"ActiveState\":\"inactive\"},\"httpd\":{\"Id\":\"httpd\",\"MainPID\":2613,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:31:32.124472\",\"ActiveState\":\"active\"},\"robin-node-monitor\":{\"Id\":\"robin-node-monitor\",\"MainPID\":1278,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:31:13.459063\",\"ActiveState\":\"active\"},\"iomgr-server\":{\"Id\":\"iomgr-server\",\"MainPID\":7384,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:55:13.851492\",\"ActiveState\":\"active\"},\"robin-event-server\":{\"Id\":\"robin-event-server\",\"MainPID\":1039,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:30:55.322709\",\"ActiveState\":\"active\"},\"consul-server\":{\"Id\":\"consul-server\",\"MainPID\":564,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 
14:30:32.432940\",\"ActiveState\":\"active\"},\"consul_members\":[{\"DelegateMax\":5,\"ProtocolMin\":1,\"Port\":29460,\"Status\":1,\"ProtocolMax\":5,\"DelegateCur\":4,\"ProtocolCur\":2,\"Name\":\"cscale-82-140.robinsystems.com\",\"Tags\":{\"dc\":\"consul\",\"role\":\"consul\",\"vsn\":\"2\",\"wan_join_port\":\"29461\",\"segment\":\"\",\"port\":\"29459\",\"raft_vsn\":\"2\",\"vsn_min\":\"2\",\"vsn_max\":\"3\",\"id\":\"9dbc13cd-bbb4-1bf1-9bcd-f3d7e0f0026f\",\"bootstrap\":\"1\",\"build\":\"0.9.4:40f243a+\"},\"Addr\":\"10.9.82.140\",\"DelegateMin\":2}],\"robin-file-server\":{\"Id\":\"robin-file-server\",\"MainPID\":1071,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:30:55.801664\",\"ActiveState\":\"active\"},\"robin-watchdog\":{\"Id\":\"robin-watchdog\",\"MainPID\":860,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:30:37.982382\",\"ActiveState\":\"active\"},\"sherlock-server\":{\"Id\":\"sherlock-server\",\"MainPID\":0,\"Type\":\"simple\",\"ExecMainStartTimestamp\":0,\"ActiveState\":\"inactive\"},\"robin-agent\":{\"Id\":\"robin-agent\",\"MainPID\":9186,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:59:23.165676\",\"ActiveState\":\"active\"},\"postgresql-9.6\":{\"Id\":\"postgresql-9.6\",\"MainPID\":660,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:30:34.988682\",\"ActiveState\":\"active\"},\"monitor-server\":{\"Id\":\"monitor-server\",\"MainPID\":2687,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:31:32.455445\",\"ActiveState\":\"active\"},\"robin-server\":{\"Id\":\"robin-server\",\"MainPID\":64400,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-06 17:57:55.208502\",\"ActiveState\":\"active\"},\"robin-auth-server\":{\"Id\":\"robin-auth-server\",\"MainPID\":1010,\"Type\":\"simple\",\"ExecMainStartTimestamp\":\"2020-08-04 14:30:50.844131\",\"ActiveState\":\"active\"}}}",
"gpu_cores":0,
"sysmem":[
33555709952,
2689277952,
7223099392,
0,
0,
0,
22075457536,
843776
],
"ssd_faulted":0,
"isol_shared_cores_used":0,
"hugepages_1g":0,
"ninstances":2,
"maintenance_mode":"DISABLED",
"memory_allocated":0,
"hostname":"cscale-82-140.robinsystems.com",
"datacenter":"default",
"tags":{
"kubernetes.io\/os":[
"linux"
],
"robin.io\/robinrpool":[
"default"
],
"kubernetes.io\/arch":[
"amd64"
]
},
"remove_taint":true,
"hugepages_2m_allocated":0,
"rpool":"default",
"pods":110,
"state":"ONLINE",
"name":"cscale-82-140",
"rcm_ha_role":"MANAGER_MASTER",
"isol_total":0,
"hugepages_2m":0,
"hdd_faulted":0,
"ipaddresses":[
{
"mac_address":"00:15:5d:14:06:0e",
"netmask":"255.255.0.0",
"ip_address":"10.9.82.140"
}
],
"cpu_cores_allocated":0,
"memory_reserved":6442450944.0,
"cpu_cores":40,
"hugepages_1g_allocated":0,
"k8s_node_status":"Ready",
"status":"Ready",
"nics":[
{
"allowed_vlans":[
],
"function":null,
"numa_node":null,
"vendor":null,
"mtu":1500,
"mac_address":"00:15:5d:14:06:0e",
"bus":null,
"name":"br0",
"physical_nic":"eth0",
"num_vfs":0,
"linkstate":"",
"all_vlans_allowed":false,
"used_vfs":0,
"native_vfdriver":null,
"native_vlan":null,
"vendor_desc":null,
"domain":null,
"untagged":false,
"slot":null,
"vfdrivers":[
]
}
],
"public_hostname":"cscale-82-140.robinsystems.com",
"cpu_cores_present":40,
"sysinfo":{
"join_time":1596576678,
"current_version":"5.3.0-171",
"iqn":"iqn.1994-05.com.redhat:329b8568de1",
"install_date":"Tue Mar 17 23:49:17 UTC 2020",
"wwpns":[
],
"distribution":"CentOS Linux",
"version":"#1 SMP Tue Mar 17 23:49:17 UTC 2020",
"uuid":"",
"boot_time":1596576222,
"robin_software":[
{
"version":"5.3.0",
"patch":"",
"full_version":"5.3.0-171",
"install_date":"2020-08-03",
"patch_date":"",
"release":"171",
"build_info":"robin-c2edf85eaa83a42ced9512e7de9c7c2f1e4fa962:robin-ui:9ee33fd00273ba19861d4dc3ef8c6169d822d3e0:robingraph:cf0ceefe696ccac2dbd2eeb1d28b859955452843"
}
],
"release":"3.10.0-1062.18.1.el7.x86_64",
"system":"Linux",
"processor":"x86_64"
},
"disks":[
{
"spf":0.8,
"state":"READY",
"type":"HDD",
"zoneid":1596601846,
"dev":"\/dev\/sdb",
"max_alloc_slices":77,
"free_alloc_slices":68,
"model":null,
"allocated":7,
"maintenance_mode":"DISABLED",
"max_throughput_intensive_vols_per_disk":1,
"role":"Storage",
"wwn":"0x600224804c48fd7e16c608dea0919064",
"status":"ONLINE",
"make":null,
"devpath":"\/dev\/disk\/by-id\/scsi-3600224804c48fd7e16c608dea0919064",
"alloc_slices":9,
"reattachable":0,
"max_volumes_per_disk":10,
"protected":0,
"capacity":107374182400,
"node_ref":1,
"max_latency_sensitive_vols_per_disk":2,
"pused":234881024,
"pfree":104287174656
},
{
"spf":0.8,
"state":"READY",
"type":"HDD",
"zoneid":1596601846,
"dev":"\/dev\/sdc",
"max_alloc_slices":77,
"free_alloc_slices":53,
"model":null,
"allocated":20,
"maintenance_mode":"DISABLED",
"max_throughput_intensive_vols_per_disk":1,
"role":"Storage",
"wwn":"0x600224803bcdafde95b1f5cd27ceb5fb",
"status":"ONLINE",
"make":null,
"devpath":"\/dev\/disk\/by-id\/scsi-3600224803bcdafde95b1f5cd27ceb5fb",
"alloc_slices":24,
"reattachable":0,
"max_volumes_per_disk":10,
"protected":0,
"capacity":107374182400,
"node_ref":1,
"max_latency_sensitive_vols_per_disk":2,
"pused":939524096,
"pfree":103582531584
},
{
"spf":0.8,
"state":"INIT",
"type":"HDD",
"zoneid":1596601846,
"dev":"\/dev\/dm-1",
"max_alloc_slices":5,
"free_alloc_slices":5,
"model":null,
"allocated":0,
"maintenance_mode":"DISABLED",
"max_throughput_intensive_vols_per_disk":1,
"role":"RootDisk",
"wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-swap",
"status":"UNKNOWN",
"make":null,
"devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphaFy4aq3EUo1yluonS8FG0LF16ycBrdEw",
"alloc_slices":0,
"reattachable":0,
"max_volumes_per_disk":10,
"protected":0,
"capacity":8254390272,
"node_ref":1,
"max_latency_sensitive_vols_per_disk":2,
"pused":0,
"pfree":0
},
{
"spf":0.8,
"state":"INIT",
"type":"HDD",
"zoneid":1596601846,
"dev":"\/dev\/dm-0",
"max_alloc_slices":38,
"free_alloc_slices":38,
"model":null,
"allocated":0,
"maintenance_mode":"DISABLED",
"max_throughput_intensive_vols_per_disk":1,
"role":"RootDisk",
"wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-root",
"status":"UNKNOWN",
"make":null,
"devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphgpZcvqGdfOKaXbEbOZzNthc6btsoSXDj",
"alloc_slices":0,
"reattachable":0,
"max_volumes_per_disk":10,
"protected":0,
"capacity":53687091200,
"node_ref":1,
"max_latency_sensitive_vols_per_disk":2,
"pused":0,
"pfree":0
},
{
"spf":0.8,
"state":"INIT",
"type":"HDD",
"zoneid":1596601846,
"dev":"\/dev\/dm-2",
"max_alloc_slices":32,
"free_alloc_slices":32,
"model":null,
"allocated":0,
"maintenance_mode":"DISABLED",
"max_throughput_intensive_vols_per_disk":1,
"role":"RootDisk",
"wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-home",
"status":"UNKNOWN",
"make":null,
"devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphQObDlS6eMUSpSxH5zsvyg9I5a0Gpuj5W",
"alloc_slices":0,
"reattachable":0,
"max_volumes_per_disk":10,
"protected":0,
"capacity":44350570496,
"node_ref":1,
"max_latency_sensitive_vols_per_disk":2,
"pused":0,
"pfree":0
},
{
"spf":0.8,
"state":"INIT",
"type":"HDD",
"zoneid":1596601846,
"dev":"\/dev\/sda",
"max_alloc_slices":77,
"free_alloc_slices":77,
"model":null,
"allocated":0,
"maintenance_mode":"DISABLED",
"max_throughput_intensive_vols_per_disk":1,
"role":"RootDisk",
"wwn":"0x600224801d3ac9b6650afd3280aa5898",
"status":"UNKNOWN",
"make":null,
"devpath":"\/dev\/disk\/by-id\/scsi-3600224801d3ac9b6650afd3280aa5898",
"alloc_slices":0,
"reattachable":0,
"max_volumes_per_disk":10,
"protected":0,
"capacity":107374182400,
"node_ref":1,
"max_latency_sensitive_vols_per_disk":2,
"pused":0,
"pfree":0
}
]
}
],
"total":1,
"page_num":1,
"nodes_count":1,
"num_items":1,
"page_size":1
}
5.3.2. Show information about a specific host¶
To display detailed information for a host, such as the storage allocation breakdown, the discovered physical attributes and their utilization (NUMA configuration, network topology, etc.), and service details, issue the following command:
# robin host info <hostname>
--services
--resources
--config
--consul
--json
<hostname> | FQDN of host
--services | Show status information for the host
--resources | Show resource utilization for the host
--config | Show config info
--consul | Show consul cluster info
--json | Output in JSON
Example:
# robin host info qct-07.robinsystems.com
Output
Host: qct-07.robinsystems.com
Zone Id: 1596566663
Host Id: 4
Type: physical
Version: 5.3.0-172
Kernel Version: 3.10.0-1062.el7.x86_64
Boot Time: 04 Aug 2020 03:18:01
Resource pool: workers
CPU:
Total Cores: 80
Total Isolated Cores: 76
Total Non-Isolated Cores: 4
Non-Isolated CPUs allocated: 1
Shared Isolated CPUs allocated: 8
Dedicated Isolated CPUs allocated: 68
Provisioning Factor: 1
NUMA Topology:
Node 0:
Total Memory: 187G
Total Isolated CPUs: 38
Total Non-Isolated CPUs: 2
Total Reserved CPUs: 0
Non-Isolated Pinned CPUs: 0
Isolated Shared Pinned CPUs: 0
Isolated Dedicated Pinned CPUs: 38
Total HugePages_1G: -
Total HugePages_2M: -
CPU List: 1-19,41-59
NIC List: enp94s0f0,enp59s0f0,enp59s0f1,enp94s0f1
Node 1:
Total Memory: 188G
Total Isolated CPUs: 38
Total Non-Isolated CPUs: 2
Total Reserved CPUs: 0
Non-Isolated Pinned CPUs: 0
Isolated Shared Pinned CPUs: 8
Isolated Dedicated Pinned CPUs: 30
Total HugePages_1G: -
Total HugePages_2M: -
CPU List: 21-39,61-79
NIC List: enp175s0f1,enp175s0f0
GPU:
Total Cores: 0
Memory:
System Total: 376G
Allocatable Total: 376G
Reserved: 6G
Robin Manager services: -
Robin Compute services: 4G
Robin Storage services: 2G
Memory allocated to instances: 88G
Free Total: 292G
HugePages_2M:
Total: -
Allocated for Robin apps: -
HugePages_1G:
Total: -
Allocated for Robin apps: -
POD Utilization: 83/110
Network:
Bridge Interface: br0
Physical Interface: enp94s0f1
MTU: 1500
Product Info: 158B - Ethernet Controller XXV710 for 25GbE SFP28 (Ethernet Network Adapter XXV710)
Vendor Info: 8086 - Intel Corporation
NUMA Node: 0
H/W Info: 0000:5e:00.1
IP Addresses: 10.9.20.15/16
Interface: enp175s0f1
MTU: 1500
Product Info: 158B - Ethernet Controller XXV710 for 25GbE SFP28 (Ethernet Network Adapter XXV710)
Vendor Info: 8086 - Intel Corporation
NUMA Node: 1
H/W Info: 0000:af:00.1
Interface: enp175s0f0
MTU: 1500
Product Info: 158B - Ethernet Controller XXV710 for 25GbE SFP28 (Ethernet Network Adapter XXV710-2)
Vendor Info: 8086 - Intel Corporation
NUMA Node: 1
H/W Info: 0000:af:00.0
Interface: enp94s0f0
MTU: 1500
Product Info: 158B - Ethernet Controller XXV710 for 25GbE SFP28 (Ethernet Network Adapter OCP XXV710-2)
Vendor Info: 8086 - Intel Corporation
NUMA Node: 0
H/W Info: 0000:5e:00.0
Interface: enp59s0f0
MTU: 1500
Product Info: 158B - Ethernet Controller XXV710 for 25GbE SFP28 (Ethernet Network Adapter XXV710-2)
Vendor Info: 8086 - Intel Corporation
NUMA Node: 0
H/W Info: 0000:3b:00.0
Interface: enp59s0f1
MTU: 1500
Product Info: 158B - Ethernet Controller XXV710 for 25GbE SFP28 (Ethernet Network Adapter XXV710)
Vendor Info: 8086 - Intel Corporation
NUMA Node: 0
H/W Info: 0000:3b:00.1
Public IP Address: 10.9.20.15
Public Hostname: qct-07.robinsystems.com
Instances: 69
State: ONLINE
Status: Ready
K8s_Node_Status: Ready
Maintenance Mode: DISABLED
Consul state: UP
Roles:
STORAGE: ONLINE Status:READY
COMPUTE: ONLINE Status:READY
Storage:
Type | Used (GB) | Robin Allocated (GB) | K8s Allocated (GB) | Total (GB)
-----+-----------+----------------------+--------------------+------------
HDD | 38 | 538 | 16 | 893
SSD | - | - | - | -
Services:
Name | State | RoleTags | PID | Started
-------------------+-------+----------+------+----------------------------
consul-client | UP | A | 351 | 2020-08-04 17:51:31.538657
consul-server | DOWN | M | 0 | 0
gui-cli | UP | - | 1132 | 2020-08-04 17:51:50.479143
httpd | UP | * | 1077 | 2020-08-04 17:51:50.116133
iomgr-server | UP | C S | 5391 | 2020-08-04 17:52:16.467809
monitor-server | UP | M C S | 539 | 2020-08-04 17:51:42.748944
postgresql-9.6 | DOWN | M | 0 | 0
robin-agent | UP | M C S | 600 | 2020-08-04 17:51:45.394012
robin-auth-server | DOWN | M* | 0 | 0
robin-event-server | DOWN | M* | 0 | 0
robin-file-server | DOWN | M* | 0 | 0
robin-node-monitor | DOWN | M* | 0 | 0
robin-server | DOWN | M* | 0 | 0
robin-watchdog | DOWN | M | 0 | 0
sherlock-server | DOWN | - | 0 | 0
stormgr-server | DOWN | M* | 0 | 0
Last updated (04 Aug 2020 18:40:14)
UP: Running
CRIT: Critical and Down
DOWN: Not Running
Root Disk storage info:
Partition | Name | Size (GB) | Available (GB)
--------------------------+-----------------+-----------+----------------
/var/log | RobinLog | 299 | 268
/var/lib/pgsql | Pgsql | 299 | 268
/var/crash | Crash | 299 | 268
/var/lib/robin | RobinLib | 299 | 268
/var/lib/[appropriateCRI] | ContainerImages | 15 | -
Unused container images: 6G
Image | Size (GB)
--------------------------------------------+-----------
k8s.gcr.io/kube-controller-manager:v1.18.6 | 0.15
k8s.gcr.io/kube-apiserver:v1.18.6 | 0.16
k8s.gcr.io/kube-scheduler:v1.18.6 | 0.09
robinsys/robinimg:5.2.7-18 | 3
quay.io/k8scsi/csi-provisioner:v1.6.0_robin | 0.04
k8s.gcr.io/kube-proxy:v1.17.5 | 0.11
k8s.gcr.io/kube-apiserver:v1.17.5 | 0.16
k8s.gcr.io/kube-controller-manager:v1.17.5 | 0.15
k8s.gcr.io/kube-scheduler:v1.17.5 | 0.09
quay.io/k8scsi/snapshot-controller:v2.1.0 | 0.04
k8s.gcr.io/pause:3.2 | 0.0
prom/prometheus:v2.16.0 | 0.12
quay.io/k8scsi/csi-attacher:v2.1.0 | 0.04
calico/typha:v3.11.1 | 0.05
calico/pod2daemon-flexvol:v3.11.1 | 0.1
calico/cni:v3.11.1 | 0.18
calico/kube-controllers:v3.11.1 | 0.05
quay.io/k8scsi/csi-provisioner:v1.4.0_robin | 0.05
k8s.gcr.io/kube-proxy:v1.16.3 | 0.08
k8s.gcr.io/kube-apiserver:v1.16.3 | 0.2
k8s.gcr.io/kube-controller-manager:v1.16.3 | 0.15
k8s.gcr.io/kube-scheduler:v1.16.3 | 0.08
k8s.gcr.io/coredns:1.6.5 | 0.04
metallb/controller:v0.8.2 | 0.04
metallb/speaker:v0.8.2 | 0.04
k8s.gcr.io/etcd:3.4.3-0 | 0.27
quay.io/k8scsi/csi-snapshotter:v1.2.2 | 0.04
k8s.gcr.io/etcd:3.3.15-0 | 0.23
quay.io/k8scsi/csi-attacher:v1.2.1 | 0.04
k8s.gcr.io/coredns:1.6.2 | 0.04
robinsys/genie-plugin:v3.0 | 0.02
quay.io/k8scsi/csi-provisioner:v1.0.0_robin | 0.04
quay.io/k8scsi/csi-provisioner:v0.4.1_robin | 0.04
robinsys/coredns:1.2.2 | 0.03
Returns detailed information for a host such as the storage allocation breakdown, discovered physical attributes with their utilization (NUMA configuration, network topology etc.) and service details.
End Point: /api/v3/robin_server/hosts/<hostname>
Method: GET
URL Parameters:
diskinfo=true
: Utilizing this parameter results in details of the disks attached to the specified host being returned.
Data Parameters: None
Port: RCM Port (default value is 29442)
Headers:
Authorization: <auth_token>
: Authorization token to identify which user is sending the request. The token can be acquired from the login API.
Success Response Code: 200
Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error)
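The request described above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the `https` scheme, the server name `robin-master.example.com` and the `<auth_token>` placeholder are assumptions, while the path, the `diskinfo=true` parameter, the `Authorization` header and the default port 29442 come from this section. The function only builds the request and does not send it.

```python
import urllib.parse
import urllib.request

RCM_PORT = 29442  # default RCM port stated in this section


def build_host_info_request(server, hostname, auth_token, diskinfo=False, scheme="https"):
    """Build (but do not send) a GET request for the host information endpoint.

    `server`, `scheme` and the token value are illustrative assumptions.
    """
    path = "/api/v3/robin_server/hosts/" + urllib.parse.quote(hostname, safe="")
    # diskinfo=true asks for details of the disks attached to the host.
    query = "?" + urllib.parse.urlencode({"diskinfo": "true"}) if diskinfo else ""
    url = f"{scheme}://{server}:{RCM_PORT}{path}{query}"
    return urllib.request.Request(url, headers={"Authorization": auth_token}, method="GET")


req = build_host_info_request(
    "robin-master.example.com",        # hypothetical server address
    "cscale-82-140.robinsystems.com",  # hostname taken from the example response
    "<auth_token>",                    # token acquired from the login API
    diskinfo=True,
)
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Depending on how the cluster's certificates are set up, an SSL context may need to be supplied as well.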
Example Response:
Output
{
"items":[
{
"memory_used":2692743168,
"hdd_lalloc":35433480192.0,
"memory":33555709952,
"lab":"default",
"saas_mode":false,
"zoneid":1596601846,
"non_isol_total_with_prov":400,
"zone_name":"default",
"rack":"default",
"ipaddresses":[
{
"ip_address":"10.9.82.140",
"mac_address":"00:15:5d:14:06:0e",
"netmask":"255.255.0.0"
}
],
"public_ip":"10.9.82.140",
"ssd_nonrobin_usage":0,
"k8s_node_name":"cscale-82-140",
"mem_for_storage":1073741824,
"id":1,
"ssd_for_storage":0,
"rcm_ha_role":"MANAGER_MASTER",
"ssd_robin_usage":0,
"gpu_cores_allocated":0,
"numa_map":{
"0":{
"memory_used":0,
"hugepages_1g_used":0,
"isol_total":0,
"isol_shared_map":{
},
"cpu_reserved":0,
"numa_id":0,
"non_isol_cores_used":2,
"cpu_ids":"",
"cpu_used":0,
"mem_used":1493172224,
"non_isol_total":20,
"hugepages_2m_used":0,
"gpu_used":0,
"isol_shared_cores_used":0,
"hugepages_1g_total":0,
"cpu_total":20,
"hugepages_2m_total":0,
"memory_total":16777626965,
"isol_dedicated_cores_used":0
},
"1":{
"memory_used":0,
"hugepages_1g_used":0,
"isol_total":0,
"isol_shared_map":{
},
"cpu_reserved":0,
"numa_id":1,
"non_isol_cores_used":0,
"cpu_ids":"",
"cpu_used":0,
"mem_used":0,
"non_isol_total":20,
"hugepages_2m_used":0,
"gpu_used":0,
"isol_shared_cores_used":0,
"hugepages_1g_total":0,
"cpu_total":20,
"hugepages_2m_total":0,
"memory_total":16778082988,
"isol_dedicated_cores_used":0
}
},
"roles":[
[
"MANAGER",
"ONLINE",
"READY"
],
[
"COMPUTE",
"ONLINE",
"READY"
],
[
"STORAGE",
"ONLINE",
"READY"
]
],
"ssd_max_alloc_slices":0,
"cpu_prov_factor":10,
"services":{
"update_time":1596761892.3700919151,
"services":{
"consul_dns":true,
"stormgr-server":{
"MainPID":2299,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-04 14:31:26.794916",
"RoleTags":[
"M*"
],
"Id":"stormgr-server",
"State":"UP",
"ActiveState":"active"
},
"gui-cli":{
"MainPID":2647,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-04 14:31:32.291459",
"RoleTags":[
"-"
],
"Id":"gui-cli",
"State":"UP",
"ActiveState":"active"
},
"consul-client":{
"MainPID":0,
"Type":"simple",
"ExecMainStartTimestamp":0,
"RoleTags":[
"A"
],
"Id":"consul-client",
"State":"DOWN",
"ActiveState":"inactive"
},
"robin-server":{
"MainPID":64400,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-06 17:57:55.208502",
"RoleTags":[
"M*"
],
"Id":"robin-server",
"State":"UP",
"ActiveState":"active"
},
"robin-node-monitor":{
"MainPID":1278,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-04 14:31:13.459063",
"RoleTags":[
"M*"
],
"Id":"robin-node-monitor",
"State":"UP",
"ActiveState":"active"
},
"iomgr-server":{
"MainPID":7384,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-04 14:55:13.851492",
"RoleTags":[
"C",
"S"
],
"Id":"iomgr-server",
"State":"UP",
"ActiveState":"active"
},
"consul_members":[
{
"DelegateMax":5,
"ProtocolCur":2,
"Port":29460,
"Status":1,
"ProtocolMax":5,
"DelegateCur":4,
"Tags":{
"dc":"consul",
"role":"consul",
"vsn":"2",
"wan_join_port":"29461",
"segment":"",
"port":"29459",
"raft_vsn":"2",
"vsn_min":"2",
"vsn_max":"3",
"id":"9dbc13cd-bbb4-1bf1-9bcd-f3d7e0f0026f",
"bootstrap":"1",
"build":"0.9.4:40f243a+"
},
"ProtocolMin":1,
"Name":"cscale-82-140.robinsystems.com",
"Addr":"10.9.82.140",
"DelegateMin":2
}
],
"robin-file-server":{
"MainPID":1071,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-04 14:30:55.801664",
"RoleTags":[
"M*"
],
"Id":"robin-file-server",
"State":"UP",
"ActiveState":"active"
},
"robin-event-server":{
"MainPID":1039,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-04 14:30:55.322709",
"RoleTags":[
"M*"
],
"Id":"robin-event-server",
"State":"UP",
"ActiveState":"active"
},
"sherlock-server":{
"MainPID":0,
"Type":"simple",
"ExecMainStartTimestamp":0,
"RoleTags":[
"-"
],
"Id":"sherlock-server",
"State":"DOWN",
"ActiveState":"inactive"
},
"robin-agent":{
"MainPID":9186,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-04 14:59:23.165676",
"RoleTags":[
"M",
"C",
"S"
],
"Id":"robin-agent",
"State":"UP",
"ActiveState":"active"
},
"postgresql-9.6":{
"MainPID":660,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-04 14:30:34.988682",
"RoleTags":[
"M"
],
"Id":"postgresql-9.6",
"State":"UP",
"ActiveState":"active"
},
"consul-server":{
"MainPID":564,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-04 14:30:32.432940",
"RoleTags":[
"M"
],
"Id":"consul-server",
"State":"UP",
"ActiveState":"active"
},
"httpd":{
"MainPID":2613,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-04 14:31:32.124472",
"RoleTags":[
"*"
],
"Id":"httpd",
"State":"UP",
"ActiveState":"active"
},
"robin-auth-server":{
"MainPID":1010,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-04 14:30:50.844131",
"RoleTags":[
"M*"
],
"Id":"robin-auth-server",
"State":"UP",
"ActiveState":"active"
},
"robin-watchdog":{
"MainPID":860,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-04 14:30:37.982382",
"RoleTags":[
"M"
],
"Id":"robin-watchdog",
"State":"UP",
"ActiveState":"active"
},
"monitor-server":{
"MainPID":2687,
"Type":"simple",
"ExecMainStartTimestamp":"2020-08-04 14:31:32.455445",
"RoleTags":[
"M",
"C",
"S"
],
"Id":"monitor-server",
"State":"UP",
"ActiveState":"active"
}
}
},
"non_isol_total":40,
"gpu_cores":0,
"hdd_robin_usage":35433480192,
"visibledisks":[
"0x600224801d3ac9b6650afd3280aa5898",
"0x600224801d3ac9b6650afd3280aa5898-centos-root",
"0x600224801d3ac9b6650afd3280aa5898-centos-swap",
"0x600224801d3ac9b6650afd3280aa5898-centos-home",
"0x600224804c48fd7e16c608dea0919064",
"0x600224803bcdafde95b1f5cd27ceb5fb"
],
"ssd_faulted":0,
"isol_shared_cores_used":0,
"nic_details":{
"br0":{
"vfdrivers":[
],
"product_desc":null,
"all_vlans_allowed":false,
"mtu":1500,
"bus":null,
"vendor_id":null,
"slot":null,
"physical_nic":"eth0",
"allowed_vlans":[
],
"num_vfs":0,
"function":null,
"ips":[
"10.9.82.140\/16"
],
"product_id":null,
"local_cpulist":null,
"native_vlan":null,
"vendor_desc":null,
"domain":null,
"untagged":false,
"used_vfs":0,
"numa_node":null
}
},
"ninstances":2,
"non_isol_cores_used":3,
"maintenance_mode":"DISABLED",
"memory_allocated":0,
"mem_for_management":1073741824.0,
"rpool_id":1,
"hdd_for_storage":214748364800,
"datacenter":"default",
"tags":{
"kubernetes.io\/os":[
"linux"
],
"robin.io\/robinrpool":[
"default"
],
"kubernetes.io\/arch":[
"amd64"
]
},
"hugepages_1g":0,
"hugepages_2m_allocated":0,
"rpool":"default",
"hdd_total":428414599168,
"ssd_free_alloc_slices":0,
"pods":110,
"state":"ONLINE",
"status":"Ready",
"instances":[
{
"state":"STARTED",
"name":"rohan-app.nginx.03",
"hostname":"rohan-app-nginx-03.t001-u000003.svc.cluster.local"
},
{
"state":"STARTED",
"name":"test-RIC-1.server.01",
"hostname":"test-ric-1-server-01.t001-u000003.svc.cluster.local"
}
],
"isol_dedicated_cores_used":0,
"host_type":"physical",
"ssd_pused":0,
"isol_total":0,
"primary_ip":"10.9.82.140",
"isol_shared_map":{
},
"hugepages_2m":0,
"pods_used":26,
"hdd_lused":0,
"hdd_faulted":0,
"napps":2,
"hdd_nonrobin_usage":0,
"cpu_cores_present":40,
"ssd_total":0,
"cpu_cores_allocated":0,
"is_master":true,
"memory_reserved":6442450944.0,
"cpu_cores":400,
"config":{
"stormgr_rest_port":29454,
"monitor_host_mem_lowmark":0.8,
"monitor_host_root_volume_highmark":0.9,
"rio_rest_port":29456,
"stormgr_rest_listen_addr":"127.0.0.1",
"kvm_enabled":true,
"hard_reset_on_isolation":0,
"monitor_host_cpu_lowmark":0.8,
"monitor_host_var_crash_volume_highmark":0.9,
"monitor_interval":1,
"monitor_host_var_pgsql_volume_lowmark":0.5,
"kubelet_restart_bursttime":25,
"server_rest_port":29442,
"kubelet_restart_burstlimit":2,
"event_server_port":29449,
"rio_rpc_port":29453,
"rdvm_bmapcache_skip_all":0,
"rdvm_mem_maxcap":25769803776,
"rest_server":"cscale-82-140.robinsystems.com",
"registration_timeout":10,
"rdvm_rpc_port":29452,
"node_exporter_port":29457,
"stormgr_rpc_port":29451,
"monitor_host_root_volume_lowmark":0.85,
"database_port":29458,
"rdvm_rest_listen_addr":"127.0.0.1",
"https_port":29443,
"metrics_grafana_details":"{\"url\": \"\", \"auth\": \":\"}",
"monitor_host_var_volume_lowmark":0.85,
"monitor_host_cpu_highmark":0.85,
"rio_rest_listen_addr":"127.0.0.1",
"monitor_host_var_robin_volume_highmark":0.9,
"rdvm_mem_alloc":1073741824,
"monitor_num_samples":3600,
"monitor_host_swap_lowmark":0.75,
"watchdog_loop_interval":3,
"rdvm_rest_port":29455,
"monitor_container_swap_highmark":0.8,
"consul_serfwan_port":29461,
"saas_mode":false,
"file_object_cache":"\/var\/lib\/robin\/file_object_cache",
"node_monitor_port":29467,
"monitor_influx_details":"{\"url\": \"\", \"dbname\": \"robin\", \"auth\": \":\" }",
"consul_http_port":29462,
"monitor_container_swap_lowmark":0.75,
"hostname":"cscale-82-140.robinsystems.com",
"monitor_host_var_volume_highmark":0.9,
"network_type":4,
"suicide_threshold":50,
"mem_for_compute":null,
"mem_for_management":null,
"sherlock_rest_port":29446,
"nfs_mount_options":"nolock,rw,timeo=60",
"rediscover_timeout":120,
"kvm_emulatorpin_cpuset":"",
"rdvm_bmapcache_invalidate_all":0,
"consul_serflan_port":29460,
"monitor_host_var_robin_volume_lowmark":0.85,
"rest_port":29450,
"monitor_report_interval":5,
"host_type":"physical",
"monitor_host_swap_highmark":0.8,
"nodejs_port":29447,
"monitor_push_interval":60,
"ovs_enabled":true,
"monitor_host_var_log_volume_highmark":0.9,
"kubelet_restart_tolerance":15,
"monitor_host_mem_highmark":0.85,
"monitor_host_var_log_volume_lowmark":0.85,
"log_level":10,
"monitor_host_var_crash_volume_lowmark":0.85,
"monitor_container_volume_highmark":0.9,
"monitor_container_volume_lowmark":0.85,
"consul_server_port":29459,
"monitor_host_var_pgsql_volume_highmark":0.7
},
"ssd_lused":0,
"hugepages_1g_allocated":0,
"k8s_node_status":"Ready",
"hdd_free_alloc_slices":293131517952.0,
"ssd_lalloc":0,
"hostname":"cscale-82-140.robinsystems.com",
"public_hostname":"cscale-82-140.robinsystems.com",
"hdd_max_alloc_slices":328564998144.0,
"memory_total":33555709952,
"mem_for_compute":4294967296,
"sysinfo":{
"join_time":1596576678,
"current_version":"5.3.0-171",
"iqn":"iqn.1994-05.com.redhat:329b8568de1",
"install_date":"Tue Mar 17 23:49:17 UTC 2020",
"wwpns":[
],
"distribution":"CentOS Linux",
"version":"#1 SMP Tue Mar 17 23:49:17 UTC 2020",
"uuid":"",
"boot_time":1596576222,
"robin_software":[
{
"version":"5.3.0",
"patch":"",
"full_version":"5.3.0-171",
"install_date":"2020-08-03",
"patch_date":"",
"release":"171",
"build_info":"robin-c2edf85eaa83a42ced9512e7de9c7c2f1e4fa962:robin-ui:9ee33fd00273ba19861d4dc3ef8c6169d822d3e0:robingraph:cf0ceefe696ccac2dbd2eeb1d28b859955452843"
}
],
"release":"3.10.0-1062.18.1.el7.x86_64",
"system":"Linux",
"processor":"x86_64"
},
"hdd_pused":1174405120,
"disks":[
{
"spf":0.8,
"zoneid":1596601846,
"dev":"\/dev\/sda",
"aslices":0,
"nodeid":1,
"maintenance_mode":"OFF",
"role":"RootDisk",
"protected":0,
"status":"UNKNOWN",
"make":null,
"reattachable_nodes":[
[
"cscale-82-140.robinsystems.com",
"ONLINE"
]
],
"capacity":107374182400,
"max_latency_sensitive_vols_per_disk":2,
"pfree":0,
"node_hostname":"cscale-82-140.robinsystems.com",
"tags":{
},
"pused":0,
"type":"HDD",
"nvols":0,
"state":"INIT",
"reattachpolicy":{
"restarts_done":0,
"burst_count":0,
"burst_start_time":0,
"burst_interval":600,
"id":1,
"restart_limit":5
},
"max_alloc_slices":77,
"stormgrid":0,
"free_alloc_slices":77,
"slices":0,
"availability_zone":null,
"max_throughput_intensive_vols_per_disk":1,
"model":null,
"lused_size":0,
"devpath":"\/dev\/disk\/by-id\/scsi-3600224801d3ac9b6650afd3280aa5898",
"alloc_slices":0,
"reattachable":0,
"max_volumes_per_disk":10,
"wwn":"0x600224801d3ac9b6650afd3280aa5898",
"allocations":[
],
"alloc_score":0,
"node_ref":1,
"preserved":0
},
{
"spf":0.8,
"zoneid":1596601846,
"dev":"\/dev\/dm-0",
"aslices":0,
"nodeid":1,
"maintenance_mode":"OFF",
"role":"RootDisk",
"protected":0,
"status":"UNKNOWN",
"make":null,
"reattachable_nodes":[
[
"cscale-82-140.robinsystems.com",
"ONLINE"
]
],
"capacity":53687091200,
"max_latency_sensitive_vols_per_disk":2,
"pfree":0,
"node_hostname":"cscale-82-140.robinsystems.com",
"tags":{
},
"pused":0,
"type":"HDD",
"nvols":0,
"state":"INIT",
"reattachpolicy":{
"restarts_done":0,
"burst_count":0,
"burst_start_time":0,
"burst_interval":600,
"id":4,
"restart_limit":5
},
"max_alloc_slices":38,
"stormgrid":0,
"free_alloc_slices":38,
"slices":0,
"availability_zone":null,
"max_throughput_intensive_vols_per_disk":1,
"model":null,
"lused_size":0,
"devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphgpZcvqGdfOKaXbEbOZzNthc6btsoSXDj",
"alloc_slices":0,
"reattachable":0,
"max_volumes_per_disk":10,
"wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-root",
"allocations":[
],
"alloc_score":0,
"node_ref":1,
"preserved":0
},
{
"spf":0.8,
"zoneid":1596601846,
"dev":"\/dev\/dm-1",
"aslices":0,
"nodeid":1,
"maintenance_mode":"OFF",
"role":"RootDisk",
"protected":0,
"status":"UNKNOWN",
"make":null,
"reattachable_nodes":[
[
"cscale-82-140.robinsystems.com",
"ONLINE"
]
],
"capacity":8254390272,
"max_latency_sensitive_vols_per_disk":2,
"pfree":0,
"node_hostname":"cscale-82-140.robinsystems.com",
"tags":{
},
"pused":0,
"type":"HDD",
"nvols":0,
"state":"INIT",
"reattachpolicy":{
"restarts_done":0,
"burst_count":0,
"burst_start_time":0,
"burst_interval":600,
"id":5,
"restart_limit":5
},
"max_alloc_slices":5,
"stormgrid":0,
"free_alloc_slices":5,
"slices":0,
"availability_zone":null,
"max_throughput_intensive_vols_per_disk":1,
"model":null,
"lused_size":0,
"devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphaFy4aq3EUo1yluonS8FG0LF16ycBrdEw",
"alloc_slices":0,
"reattachable":0,
"max_volumes_per_disk":10,
"wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-swap",
"allocations":[
],
"alloc_score":0,
"node_ref":1,
"preserved":0
},
{
"spf":0.8,
"zoneid":1596601846,
"dev":"\/dev\/dm-2",
"aslices":0,
"nodeid":1,
"maintenance_mode":"OFF",
"role":"RootDisk",
"protected":0,
"status":"UNKNOWN",
"make":null,
"reattachable_nodes":[
[
"cscale-82-140.robinsystems.com",
"ONLINE"
]
],
"capacity":44350570496,
"max_latency_sensitive_vols_per_disk":2,
"pfree":0,
"node_hostname":"cscale-82-140.robinsystems.com",
"tags":{
},
"pused":0,
"type":"HDD",
"nvols":0,
"state":"INIT",
"reattachpolicy":{
"restarts_done":0,
"burst_count":0,
"burst_start_time":0,
"burst_interval":600,
"id":6,
"restart_limit":5
},
"max_alloc_slices":32,
"stormgrid":0,
"free_alloc_slices":32,
"slices":0,
"availability_zone":null,
"max_throughput_intensive_vols_per_disk":1,
"model":null,
"lused_size":0,
"devpath":"\/dev\/disk\/by-id\/dm-uuid-LVM-vI83PDTxV3H0dWyAXfH5ef7rxTOuYyphQObDlS6eMUSpSxH5zsvyg9I5a0Gpuj5W",
"alloc_slices":0,
"reattachable":0,
"max_volumes_per_disk":10,
"wwn":"0x600224801d3ac9b6650afd3280aa5898-centos-home",
"allocations":[
],
"alloc_score":0,
"node_ref":1,
"preserved":0
},
{
"spf":0.8,
"zoneid":1596601846,
"dev":"\/dev\/sdb",
"aslices":7,
"nodeid":1,
"maintenance_mode":"OFF",
"role":"Storage",
"write_unit":4096,
"status":"ONLINE",
"make":null,
"reattachable_nodes":[
[
"cscale-82-140.robinsystems.com",
"ONLINE"
]
],
"protected":0,
"capacity":107374182400,
"max_latency_sensitive_vols_per_disk":2,
"pfree":104287174656,
"node_hostname":"cscale-82-140.robinsystems.com",
"tags":{
},
"pused":234881024,
"type":"HDD",
"nvols":3,
"state":"READY",
"reattachpolicy":{
"restarts_done":0,
"burst_count":0,
"burst_start_time":0,
"burst_interval":600,
"id":2,
"restart_limit":5
},
"max_alloc_slices":77,
"stormgrid":1,
"free_alloc_slices":68,
"slices":6390,
"availability_zone":null,
"max_throughput_intensive_vols_per_disk":1,
"model":null,
"lused_size":0,
"devpath":"\/dev\/disk\/by-id\/scsi-3600224804c48fd7e16c608dea0919064",
"alloc_slices":9,
"reattachable":0,
"max_volumes_per_disk":10,
"wwn":"0x600224804c48fd7e16c608dea0919064",
"allocations":[
{
"vols":[
{
"media":"HDD",
"pused":167772160,
"id":"1",
"size":5368709120,
"state":"ONLINE",
"name":"file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53"
}
],
"volume_group":"file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53.72.1.673abece-0975-4234-9fc2-56a06bf54031",
"name":"file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53.0.970a44c7-a15c-4612-ac57-9b4f15ae386e",
"volume":{
"media":"HDD",
"pused":167772160,
"id":"1",
"size":5368709120,
"state":"ONLINE",
"name":"file-collection-1596578146092.269f9b38-f828-48c2-a382-8921dd74ee53"
},
"slices":5
},
{
"vols":[
{
"media":"HDD",
"pused":67108864,
"id":"8",
"size":1073741824,
"state":"ONLINE",
"name":"test-RIC-1.server.01.data.1.382f1ad5-1294-4e24-8297-9c6025eacfe5"
}
],
"volume_group":"test-RIC-1.server.01.72.1.ea297971-f931-4787-99cc-6782e026b77c",
"name":"test-RIC-1.server.01.72.1.ea297971-f931-4787-99cc-6782e026b77c.0.1e8feb41-fd42-4409-b8c1-751331febdc1",
"volume":{
"media":"HDD",
"pused":67108864,
"id":"8",
"size":1073741824,
"state":"ONLINE",
"name":"test-RIC-1.server.01.data.1.382f1ad5-1294-4e24-8297-9c6025eacfe5"
},
"slices":2
},
{
"vols":[
{
"media":"HDD",
"pused":0,
"id":"9",
"size":1073741824,
"state":"ONLINE",
"name":"test-RIC-1.server.01.block.1.1053eaeb-4542-42a5-a173-d69a76703ead"
}
],
"volume_group":"test-RIC-1.server.01.72.1.1e483fd0-2d5c-434c-aef0-91a87796977a",
"name":"test-RIC-1.server.01.72.1.1e483fd0-2d5c-434c-aef0-91a87796977a.0.5bf728c4-ea26-4e0f-82d7-c584fcf0bd9a",
"volume":{
"media":"HDD",
"pused":0,
"id":"9",
"size":1073741824,
"state":"ONLINE",
"name":"test-RIC-1.server.01.block.1.1053eaeb-4542-42a5-a173-d69a76703ead"
},
"slices":2
}
],
"alloc_score":95,
"node_ref":1,
"preserved":0
},
{
"spf":0.8,
"zoneid":1596601846,
"dev":"\/dev\/sdc",
"aslices":20,
"nodeid":1,
"maintenance_mode":"OFF",
"role":"Storage",
"write_unit":4096,
"status":"ONLINE",
"make":null,
"reattachable_nodes":[
[
"cscale-82-140.robinsystems.com",
"ONLINE"
]
],
"protected":0,
"capacity":107374182400,
"max_latency_sensitive_vols_per_disk":2,
"pfree":103582531584,
"node_hostname":"cscale-82-140.robinsystems.com",
"tags":{
},
"pused":939524096,
"type":"HDD",
"nvols":1,
"state":"READY",
"reattachpolicy":{
"restarts_done":0,
"burst_count":0,
"burst_start_time":0,
"burst_interval":600,
"id":3,
"restart_limit":5
},
"max_alloc_slices":77,
"stormgrid":2,
"free_alloc_slices":53,
"slices":6390,
"availability_zone":null,
"max_throughput_intensive_vols_per_disk":1,
"model":null,
"lused_size":0,
"devpath":"\/dev\/disk\/by-id\/scsi-3600224803bcdafde95b1f5cd27ceb5fb",
"alloc_slices":24,
"reattachable":0,
"max_volumes_per_disk":10,
"wwn":"0x600224803bcdafde95b1f5cd27ceb5fb",
"allocations":[
{
"vols":[
{
"media":"HDD",
"pused":939524096,
"id":"16",
"size":21474836480,
"state":"ONLINE",
"name":"rohan-app.nginx.03.data.1.83d03fbf-3bfe-4723-8abe-5cbd51014e0c"
}
],
"volume_group":"rohan-app.nginx.03.72.1.44251a23-0221-4fda-837e-db26bca3ccb8",
"name":"rohan-app.nginx.03.72.1.44251a23-0221-4fda-837e-db26bca3ccb8.0.6e2afb65-5c4c-42e4-972e-161de3fb3856",
"volume":{
"media":"HDD",
"pused":939524096,
"id":"16",
"size":21474836480,
"state":"ONLINE",
"name":"rohan-app.nginx.03.data.1.83d03fbf-3bfe-4723-8abe-5cbd51014e0c"
},
"slices":24
}
],
"alloc_score":89,
"node_ref":1,
"preserved":0
}
]
}
]
}
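A response of this size is easier to work with once it is reduced to the fields of interest. The sketch below assumes the response has already been decoded into a Python dictionary with the shape shown above. It lists the services whose `State` is `DOWN` and, for each disk with the `Storage` role, its free and maximum allocation slices and its physical usage in MiB. The function name `summarize_host` and the trimmed `sample` dictionary are illustrative only; the field names and values are taken from the example response.

```python
def summarize_host(response):
    """Return the DOWN services and per-disk usage for the first host in a response."""
    host = response["items"][0]
    services = host["services"]["services"]
    # Entries such as "consul_dns" (bool) and "consul_members" (list) are not
    # service records, so only dict values are inspected.
    down = sorted(
        name for name, svc in services.items()
        if isinstance(svc, dict) and svc.get("State") == "DOWN"
    )
    disks = []
    # "disks" is read defensively in case disk details were not requested.
    for disk in host.get("disks", []):
        if disk["role"] != "Storage":
            continue  # skip root disks and their LVM volumes
        used_mib = disk["pused"] // 2**20
        disks.append((disk["dev"], disk["free_alloc_slices"], disk["max_alloc_slices"], used_mib))
    return down, disks


# Trimmed sample built from values in the example response.
sample = {
    "items": [{
        "services": {"services": {
            "consul_dns": True,
            "consul-client": {"State": "DOWN"},
            "robin-server": {"State": "UP"},
            "sherlock-server": {"State": "DOWN"},
        }},
        "disks": [
            {"dev": "/dev/sda", "role": "RootDisk", "pused": 0,
             "free_alloc_slices": 77, "max_alloc_slices": 77},
            {"dev": "/dev/sdb", "role": "Storage", "pused": 234881024,
             "free_alloc_slices": 68, "max_alloc_slices": 77},
            {"dev": "/dev/sdc", "role": "Storage", "pused": 939524096,
             "free_alloc_slices": 53, "max_alloc_slices": 77},
        ],
    }]
}

down, disks = summarize_host(sample)
print("DOWN services:", down)
for dev, free_slices, max_slices, used_mib in disks:
    print(dev, f"{free_slices}/{max_slices} slices free,", used_mib, "MiB used")
```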
5.4. Disabling a node¶
In certain situations, you might not want any resources for an application to be allocated from a particular host, either because the physical machine is malfunctioning or because the host is temporarily undergoing maintenance. Instead of requiring you to remove the node from the cluster, Robin lets you place the host into maintenance mode. This isolates the host in terms of resource availability: none of the host's storage capacity can be used for future application deployments, regardless of the Robin roles assigned to the node. The mode can be toggled using the commands detailed below. For more granular control, see the section on disabling and enabling particular roles.
The following commands are described in this section:
robin host set-maintenance | Place a host into maintenance mode
robin host unset-maintenance | Place a host into non-maintenance (normal) mode
5.4.1. Placing a host into maintenance mode¶
To put a host into maintenance mode, and thus temporarily suspend it from providing storage or compute resources for future application deployments, issue the following command:
# robin host set-maintenance <hostname>
<hostname> | FQDN of host
Example:
# robin host set-maintenance vnode36.robinsystems.com
Host vnode36.robinsystems.com set in maintenance mode
Puts a host into maintenance mode, which in turn temporarily suspends it from providing storage and compute resources for application deployments.
End Point: /api/v3/robin_server/hosts/<hostname>
Method: PUT
URL Parameters: None
Data Parameters:
action: set_maintenance
- This mandatory field within the payload specifies that the set maintenance mode operation is to be performed.
Port: RCM Port (default value is 29442)
Headers:
Authorization: <auth_token>
: Authorization token to identify which user is sending the request. The token can be acquired from the login API.
Success Response Code: 200
Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid Api Usage Error)
Example Response:
Output
{
"message":"Maintenance mode set"
}
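The PUT request for this endpoint could be built in Python as shown below. This is a hedged sketch, not a reference client: the `https` scheme, the server name `robin-master.example.com` and the JSON encoding of the payload are assumptions, whereas the path, the `action` field, the `Authorization` header and port 29442 are taken from this section. The request is constructed but not sent.

```python
import json
import urllib.parse
import urllib.request

RCM_PORT = 29442  # default RCM port stated in this section
VALID_ACTIONS = ("set_maintenance", "unset_maintenance")


def build_maintenance_request(server, hostname, auth_token,
                              action="set_maintenance", scheme="https"):
    """Build (but do not send) the PUT request that toggles maintenance mode."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    path = "/api/v3/robin_server/hosts/" + urllib.parse.quote(hostname, safe="")
    url = f"{scheme}://{server}:{RCM_PORT}{path}"
    # Assumption: the mandatory "action" field is sent as a JSON payload.
    body = json.dumps({"action": action}).encode("utf-8")
    headers = {"Authorization": auth_token, "Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


req = build_maintenance_request(
    "robin-master.example.com",   # hypothetical server address
    "vnode36.robinsystems.com",   # hostname from the CLI example
    "<auth_token>",               # token acquired from the login API
)
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```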
5.4.2. Placing a host into non-maintenance mode¶
To return a host to its normal mode, and thus allow it to provide resources for future application deployments, issue the following command:
# robin host unset-maintenance <hostname>
<hostname> | FQDN of host
Example:
# robin host unset-maintenance vnode36.robinsystems.com
Host vnode36.robinsystems.com out of maintenance mode
Removes a host from maintenance mode, which in turn allows it to provide storage and compute resources for application deployments.
End Point: /api/v3/robin_server/hosts/<hostname>
Method: PUT
URL Parameters: None
Data Parameters:
action: unset_maintenance
- This mandatory field within the payload specifies that the unset maintenance mode operation is to be performed.
Port: RCM Port (default value is 29442)
Headers:
Authorization: <auth_token>
: Authorization token to identify which user is sending the request. The token can be acquired from the login API.
Success Response Code: 200
Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid Api Usage Error)
Example Response:
Output
{
"message":"Maintenance mode unset"
}
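Because a host in maintenance mode provides no resources until the mode is unset, automation that sets the mode should make sure it is always unset afterwards, even if the maintenance work fails. One way to sketch this in Python is a context manager. The `send_action` callable here is hypothetical: it stands for whatever code issues the request with the given `action` value (`set_maintenance` or `unset_maintenance`). In the example it is replaced by a list's `append` method so that the call order is visible.

```python
from contextlib import contextmanager


@contextmanager
def maintenance_window(send_action):
    """Set maintenance mode on entry and always unset it on exit.

    `send_action` is a hypothetical callable that performs the request for
    the given action value; it is not part of the Robin tooling.
    """
    send_action("set_maintenance")
    try:
        yield
    finally:
        # Runs even if the body raises, so the host is not left isolated.
        send_action("unset_maintenance")


calls = []
with maintenance_window(calls.append):
    calls.append("do hardware work")  # placeholder for the real maintenance task
print(calls)
```

If the unset request itself fails, the host stays in maintenance mode, so a real script would likely also check the response and report the failure.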
5.5. Removing Nodes¶
5.5.1. Remove Node on GKE¶
For Robin CNS installed on Google Kubernetes Engine (GKE), you must use the GKE UI to add or remove the nodes. For more information, see here.
Node Removal Limitations
When removing a node using the GKE UI, you must consider the following limitations.
Although the GKE UI supports removing more than one node at once, you must remove one node at a time, because Robin CNS supports removing only one node at a time.
When removing a node, you have no control over which particular node is selected for removal.
5.5.2. Remove Node on Google Anthos¶
You might need to remove a master or worker node if that node has technical issues. To retain the data of the node being removed, you must first evacuate all volumes from the drives of the node you want to remove.
Note
If you want to remove a node without evacuating its volumes, you can enable Auto Volume Cleanup before removing the node.
Complete the following steps to remove a node from the cluster:
Run the following command to evacuate all volumes on all drives of the node that you are planning to remove:
Note
Use the robin disk list command to find the list of drives on the host.
Important
You need to repeat this step for all drives on the node.
# robin disk evacuate <wwn>
Example:
# kubectl get nodes
NAME                              STATUS   ROLES                  AGE     VERSION
hypervvm-72-42.robinsystems.com   Ready    control-plane,master   2d16h   v1.25.5-gke.1001
hypervvm-72-43.robinsystems.com   Ready    control-plane,master   2d16h   v1.25.5-gke.1001
hypervvm-72-44.robinsystems.com   Ready    control-plane,master   2d16h   v1.25.5-gke.1001
hypervvm-72-45.robinsystems.com   Ready    worker                 2d16h   v1.25.5-gke.1001
hypervvm-72-46.robinsystems.com   Ready    worker                 2d16h   v1.25.5-gke.1001
# robin drive evacuate 0x60022480c3025c597a2fc0d7fc407ed4 --wait
Are you sure you want to evacuate all the volumes on disk 0x60022480c3025c597a2fc0d7fc407ed4 [y/n] ?y
Job: 362 Name: DiskEvacuate State: VALIDATED Error: 0
Job: 362 Name: DiskEvacuate State: WAITING Error: 0
Job: 362 Name: DiskEvacuate State: COMPLETED Error: 0
(Remove Worker node) Edit the Anthos node-pool custom resource file. Remove node IP details for a worker node from the file and save it.
Example:
# kubectl get nodepools.baremetal.cluster.gke.io -A --kubeconfig=nehacluster-kubeconfig
NAMESPACE             NAME          READY   RECONCILING   STALLED   UNDERMAINTENANCE   UNKNOWN
cluster-nehacluster   nehacluster   3       0             0         0                  0
cluster-nehacluster   node-pool-1   2       0             0         0                  0
# kubectl edit nodepool --kubeconfig ./nehacluster-kubeconfig -n cluster-nehacluster node-pool-1
After removing the IP address details from the file:
# kubectl get nodes --kubeconfig=nehacluster-kubeconfig
NAME                              STATUS   ROLES                  AGE     VERSION
hypervvm-72-42.robinsystems.com   Ready    control-plane,master   2d16h   v1.25.5-gke.1001
hypervvm-72-43.robinsystems.com   Ready    control-plane,master   2d16h   v1.25.5-gke.1001
hypervvm-72-44.robinsystems.com   Ready    control-plane,master   2d16h   v1.25.5-gke.1001
hypervvm-72-45.robinsystems.com   Ready    worker                 2d16h   v1.25.5-gke.1001
(Remove master node) Edit the nodepool spec in the cluster resource file. Remove the master node IP details from the file and save it.
Example:
# kubectl edit cluster nehacluster -n cluster-nehacluster --kubeconfig ./nehacluster-kubeconfig
# kubectl get nodes --kubeconfig=nehacluster-kubeconfig
NAME                              STATUS                     ROLES                  AGE     VERSION
hypervvm-72-42.robinsystems.com   Ready                      control-plane,master   2d16h   v1.25.5-gke.1001
hypervvm-72-43.robinsystems.com   Ready,SchedulingDisabled   control-plane,master   2d16h   v1.25.5-gke.1001
hypervvm-72-44.robinsystems.com   Ready                      control-plane,master   2d16h   v1.25.5-gke.1001
hypervvm-72-45.robinsystems.com   Ready                      worker                 2d16h   v1.25.5-gke.1001
hypervvm-72-46.robinsystems.com   Ready                      worker                 17m     v1.25.5-gke.1001
After removing the IP address details from the file:
# kubectl get nodes --kubeconfig=nehacluster-kubeconfig
NAME                              STATUS   ROLES                  AGE     VERSION
hypervvm-72-42.robinsystems.com   Ready    control-plane,master   2d17h   v1.25.5-gke.1001
hypervvm-72-44.robinsystems.com   Ready    control-plane,master   2d17h   v1.25.5-gke.1001
hypervvm-72-45.robinsystems.com   Ready    worker                 2d16h   v1.25.5-gke.1001
hypervvm-72-46.robinsystems.com   Ready    worker                 25m     v1.25.5-gke.1001
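Because the evacuation step in the procedure above has to be repeated for every drive on the node, it can be wrapped in a small loop. The following is a minimal sketch, not part of the Robin tooling: the function name evacuate_drives and the ROBIN_BIN override variable are hypothetical, and the WWNs are assumed to have been collected by hand from the drive listing. Each call still shows the interactive confirmation prompt seen in the example.

```shell
#!/bin/sh
# Evacuate each drive whose WWN is passed as an argument, one drive at a time.
# ROBIN_BIN is a hypothetical override; it defaults to the robin CLI and can be
# set to "echo" to preview the commands without running them.
evacuate_drives() {
    for wwn in "$@"; do
        # Stop at the first failure so a failed evacuation is not overlooked.
        "${ROBIN_BIN:-robin}" drive evacuate "$wwn" --wait || return 1
    done
}

# Example (not run here):
# evacuate_drives 0x60022480c3025c597a2fc0d7fc407ed4
```

Setting ROBIN_BIN=echo before the call prints the commands that would be issued, which is a cheap way to check the WWN list before evacuating anything.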
5.6. Managing a cluster via the remote client¶
In addition to the Robin CLI, which is available on all hosts where Robin is installed, a remote client is shipped with each cluster that is deployed. This client mirrors the functionality of the native CLI with regards to the commands available and hence it provides the management capabilities that are described throughout this document. One advantage of utilizing this client is that it can be used to manage a multitude of Robin clusters via the concept of contexts.
A context in this scenario refers to a Robin cluster and is identified by the server name or IP Address. In addition to this primary key, the following attributes can also be set within a context: the port values for various Robin services (including the Robin Server, File Server, Event Server, Watchdog Server, and Metrics Server) along with the logging level. The attributes are discussed in more detail in the following sections. After creating the appropriate context for a Robin cluster, one can set it to be the current context and communicate with the respective cluster. The commands which can be used to achieve this are described below.
The following commands are described in this section:
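| robin client add-context | Add a Robin cluster context |
| robin client list-contexts | List all registered Robin cluster contexts |
| robin client set-current | Set a Robin cluster context as the current context |
| robin client update-context | Update attributes for the current Robin cluster context |
| robin client delete-context | Delete a Robin cluster context |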
5.6.1. Downloading the Robin client¶
In order to download the Robin client from an existing Robin cluster, issue the following command:
# curl -k 'https://<manager ip>:<port>/api/v3/robin_server/download?file=robincli&os=<os>' -o robin
| <manager ip> | IP Address of the manager node or load balancer IP. Run |
| <port> | Port number for the Robin Server |
| <os> | The operating system to download the client for. Supported operating systems include: Linux, MacOS. |
Example:
# curl -k 'https://vnode42:29442/api/v3/robin_server/download?file=robincli&os=linux' -o robin
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 10.1M 100 10.1M 0 0 1421k 0 0:00:07 0:00:07 --:--:-- 1483k
# ls -lart
-rw-r--r-- 1 demo staff 10655536 Mar 26 14:12 robin
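The download URL is assembled from three values, so a tiny helper can keep the quoting in one place. This is a sketch only: the function name robin_client_url is hypothetical, and the commented usage assumes the file needs its execute bit set before use, since the listing above shows it saved as -rw-r--r--.

```shell
#!/bin/sh
# Print the client download URL for a manager address ($1), Robin Server
# port ($2) and target operating system ($3).
robin_client_url() {
    printf 'https://%s:%s/api/v3/robin_server/download?file=robincli&os=%s\n' \
        "$1" "$2" "$3"
}

# Example (not run here): download the client, then make it executable.
# curl -k "$(robin_client_url vnode42 29442 linux)" -o robin && chmod +x robin
```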
5.6.2. Adding a Context¶
A context is a construct that can be used to define a Robin cluster in a manner that the remote client can understand. In order to add a context, issue the following command:
Note
If a context already exists with the server specified, that context will be updated with the values supplied.
# robin client add-context <server>
--port <port>
--file-port <file_port>
--event-port <event_port>
--metrics-port <metrics_port>
--log-level <log_level>
--product <product_type>
--set-current
| <server> | FQDN/IP Address of the Master Node or VIP |
| --port <port> | Port number for the Robin Server. Default value is 29442 |
| --file-port <file_port> | Port number for the File Server. Default value is 29445 |
| --event-port <event_port> | Port number for the Event Server. Default value is 29449 |
| --metrics-port <metrics_port> | Port number for the Metrics Server. Default value is 29446 |
| --log-level <log_level> | Number indicating the verbosity of logs. Valid values are 10 (DEBUG), 20 (INFO), 40 (ERROR). Default value is 40. |
| --product <product_type> | Type of ROBIN installation. Valid choices are ‘platform’ or ‘storage’. Default value is ‘platform’. |
| --set-current | Set the context being created as the current context |
Example:
# robin client add-context centos-60-214 --port 29443
Context robin-cluster-centos-60-214 created successfully
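Since the log level is passed as a number, a small lookup can make scripts easier to read. The sketch below maps the three level names listed above to their numeric values; the function name log_level_number is hypothetical and is not part of the Robin client.

```shell
#!/bin/sh
# Translate a log level name into the numeric value expected by --log-level.
# Only the three levels documented above are recognised.
log_level_number() {
    case "$1" in
        DEBUG|debug) echo 10 ;;
        INFO|info)   echo 20 ;;
        ERROR|error) echo 40 ;;
        *) echo "unknown log level: $1" >&2; return 1 ;;
    esac
}

# Example (not run here):
# robin client add-context centos-60-214 --log-level "$(log_level_number info)"
```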
5.6.3. Listing all available contexts¶
In order to list all contexts that have already been registered with the client alongside additional details such as the port values specified or the log level, issue the following command:
# robin client list-contexts
--full
| --full | Show additional details about all registered contexts |
Example:
# robin client list-contexts --full
| Server | Port | Version | Tenant | Last Login | Tenants | FPort | WPort | MPort | LogLevel
---+-----------------------------------+-------+------------+----------------+----------------------+----------------+-------+-------+-------+----------
| master.robin-server.service.robin | 29442 | - | - | - | | 29445 | 29444 | 29446 | ERR
| centos-60-214 | 29443 | - | Administrators | - | | 29445 | 29444 | 29446 | ERR
* | 172.19.174.194 | 29442 | 5.2.3-9842 | Administrators | 26 Mar 2020 16:10:58 | Administrators | 29445 | 29444 | 29446 | ERR
Note
The asterisk displayed above indicates the current context.
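In a script, the current context can be picked out of this listing by looking for the asterisk in the first column. The following is a minimal sketch that assumes the output layout shown in the example above (pipe-separated columns with the server name in the second field); the function name current_context is hypothetical.

```shell
#!/bin/sh
# Read `robin client list-contexts` output on stdin and print the server name
# of the row marked with an asterisk. Assumes the column layout shown above.
current_context() {
    awk -F'|' '$1 ~ /\*/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim padding around the server name
        print $2
    }'
}

# Example (not run here):
# robin client list-contexts --full | current_context
```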
5.6.4. Setting the current context¶
In order to access a particular Robin cluster, its respective context needs to be set as the current context. To achieve this, issue the following command:
# robin client set-current <context>
| <context> | The server attribute of the context to be set as current |
Example:
# robin client set-current centos-60-214
Current context set to robin-cluster-centos-60-214
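Switching contexts is what lets one client work against several clusters in turn. As an illustration, the sketch below sets each named context as current and then runs the same subcommand against it. The function name for_each_context and the ROBIN_BIN override are hypothetical, and the sketch assumes that any authentication each cluster requires has already been taken care of. Note that it leaves the last context in the list set as current.

```shell
#!/bin/sh
# Run one robin subcommand against each listed context in turn.
# $1 is the subcommand (for example "drive list"); remaining arguments are the
# server attributes of the contexts. ROBIN_BIN is a hypothetical override that
# can be set to "echo" for a dry run.
for_each_context() {
    subcommand=$1
    shift
    for server in "$@"; do
        "${ROBIN_BIN:-robin}" client set-current "$server" || return 1
        # Word splitting of $subcommand is intentional so "drive list"
        # becomes two arguments.
        "${ROBIN_BIN:-robin}" $subcommand || return 1
    done
}

# Example (not run here):
# for_each_context "drive list" centos-60-214 172.19.174.194
```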
5.6.5. Updating the current context¶
In certain situations, such as a reinstallation, the attributes of a context might be altered whilst retaining the same server IP Address or hostname. As a result, the context which refers to this cluster will have to be updated. In order to do so, issue the following command:
Note
The below command only updates the current context.
# robin client update-context
--port <port>
--file-port <file_port>
--event-port <event_port>
--metrics-port <metrics_port>
--log-level <log_level>
| --port <port> | Updated port number for the Robin Server |
| --file-port <file_port> | Updated port number for the File Server |
| --event-port <event_port> | Updated port number for the Event Server |
| --metrics-port <metrics_port> | Updated port number for the Metrics Server |
| --log-level <log_level> | Updated number indicating the verbosity of logs. Valid values are 10 (DEBUG), 20 (INFO), 40 (ERROR) |
Example:
# robin client update-context --port 29942 --file-port 29445 --watchdog-port 29444 --metrics-port 29446
Updating attributes for context robin-cluster-centos-60-214
Server: centos-60-214
Context config updated for robin-cluster-centos-60-214
5.6.6. Deleting a context¶
In order to remove a registered context, issue the following command:
# robin client delete-context <context>
| <context> | The server attribute of the context to be deleted |
Example:
# robin client delete-context centos-60-214
Context centos-60-214 deleted