16. Chargeback
Robin provides a built-in chargeback utility that tracks the usage and cost of the resources consumed by Robin Bundle applications on a deployed Robin cluster, regardless of environment. The tracked resources are CPU, GPU, multi-instance GPU (MIG), memory, and storage space. This visibility into cost and resource consumption encourages accountability among application owners and helps cluster administrators make better decisions about resource requirements and priorities for their Robin cluster, which in turn improves resource utilization on the cluster.
The chargeback counters for each of these resources are first started when an application is successfully created and its requested resources are allocated. In addition, the following situations can also result in a counter (re)starting:
Individual pod is created
Application or individual pod is (re)started
Failed pod is successfully redeployed (either manually or via Autopilot)
Conversely, the counters for an application and its associated resources are stopped when the application is deleted. In addition, the following situations can also result in a counter being stopped:
Individual pod is deleted or removed
Application or individual pod is stopped
Redeployment of pod fails (when it initially was in a good state)
The above scenarios cover all of the operations on an application that can change its resource utilization. These include scaling an application in or out horizontally, scaling an application up or down vertically, rolling an application back, and adding a new volume to an application.
Note
Separate counters are maintained for every combination of pod and resource tracked.
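The start and stop rules above can be pictured with a small model. The following is only an illustrative sketch, not Robin's implementation: the class names `UsageCounter` and `CounterTable`, the pod names, and the timestamps (in seconds) are all hypothetical. It keeps one counter per (pod, resource) pair, as described in the note, and accumulates time only while a counter is running.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class UsageCounter:
    """Accumulated running time for one (pod, resource) pair (hypothetical model)."""
    total_seconds: float = 0.0
    started_at: Optional[float] = None  # None means the counter is stopped

    def start(self, now: float) -> None:
        # Starting a counter that is already running changes nothing.
        if self.started_at is None:
            self.started_at = now

    def stop(self, now: float) -> None:
        # Stopping adds the elapsed interval to the running total.
        if self.started_at is not None:
            self.total_seconds += now - self.started_at
            self.started_at = None


@dataclass
class CounterTable:
    """One independent counter per (pod, resource) combination."""
    counters: Dict[Tuple[str, str], UsageCounter] = field(default_factory=dict)

    def start(self, pod: str, resources: Iterable[str], now: float) -> None:
        for resource in resources:
            self.counters.setdefault((pod, resource), UsageCounter()).start(now)

    def stop(self, pod: str, now: float) -> None:
        for (name, _resource), counter in self.counters.items():
            if name == pod:
                counter.stop(now)


table = CounterTable()
table.start("app.main.01", ["CPU", "MEMORY"], now=0.0)     # application created
table.stop("app.main.01", now=3600.0)                      # pod stopped
table.start("app.main.01", ["CPU", "MEMORY"], now=7200.0)  # pod restarted
table.start("app.main.02", ["CPU"], now=7200.0)            # scaled out: new pod
table.stop("app.main.01", now=9000.0)                      # application deleted
table.stop("app.main.02", now=9000.0)

for key, counter in sorted(table.counters.items()):
    print(key, counter.total_seconds)
```

In this model the hour during which the first pod was stopped is not billed, and the pod added by scaling out gets its own counter that is independent of the first pod's counters.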
Topics covered in this chapter:
robin chargeback price-sheet | View and manipulate the prices enforced by chargeback counters
robin chargeback report | View chargeback report
robin chargeback resource-list | View list of resources tracked as part of chargeback
robin chargeback resource-info | View information on a resource tracked as part of chargeback
16.1. Viewing and editing Chargeback pricing scheme
Upon installation, Robin creates a default price sheet (in USD) in which a price is associated with each tracked resource and applied to the whole cluster. The Robin Cluster Superadmin can update these prices to customize the price sheet for the deployment. Issue the following command to view or update the chargeback price sheet:
# robin chargeback price-sheet --price-per-cpu <cpu_price>
--price-per-gpu <gpu_price>
--price-per-mem <mem_price>
--price-per-hdd <hdd_price>
--price-per-ssd <ssd_price>
--currency <currency>
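--price-per-cpu <cpu_price> | Price of 1 CPU per day
--price-per-gpu <gpu_price> | Price of 1 GPU per day
--price-per-mem <mem_price> | Price of 1 GB of memory per day
--price-per-hdd <hdd_price> | Price of 1 GB of HDD storage per day
--price-per-ssd <ssd_price> | Price of 1 GB of SSD storage per day
--currency <currency> | Currency in 3-letter format as per the ISO 4217 standard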
Example 1: View price sheet
# robin chargeback price-sheet
Chargeback Price-sheet Currency: USD
Resource Type | Unit size | Price Per Hour | Price Per Day
------------------------------+---------------------------------+----------------+---------------
CPU | 1 CPU | 0.00500 | 0.12000
GPU | 1 GPU | 0.04167 | 1.00000
HDD | 1G | 0.00006 | 0.00150
MEMORY | 1G | 0.01042 | 0.25000
NVIDIA A100-SXM4-40GB | 1 NVIDIA A100-SXM4-40GB | 0.12500 | 3.00000
NVIDIA A100-SXM4-40GB-1g.5gb | 1 NVIDIA A100-SXM4-40GB-1g.5gb | 0.01786 | 0.42857
NVIDIA A100-SXM4-40GB-2g.10gb | 1 NVIDIA A100-SXM4-40GB-2g.10gb | 0.03571 | 0.85714
NVIDIA A100-SXM4-40GB-3g.20gb | 1 NVIDIA A100-SXM4-40GB-3g.20gb | 0.05357 | 1.28571
NVIDIA A100-SXM4-40GB-4g.20gb | 1 NVIDIA A100-SXM4-40GB-4g.20gb | 0.07143 | 1.71429
NVIDIA A100-SXM4-40GB-7g.40gb | 1 NVIDIA A100-SXM4-40GB-7g.40gb | 0.12500 | 3.00000
SSD | 1G | 0.00017 | 0.00420
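In the output above, each Price Per Hour value is consistent with the daily price divided by 24 and shown to five decimal places. The sketch below reproduces that arithmetic for a few rows; the function name `price_per_hour` is hypothetical, and the rounding rule is inferred from the sample output rather than taken from Robin's code.

```python
HOURS_PER_DAY = 24


def price_per_hour(price_per_day: float) -> float:
    """Derive an hourly price from a daily price (assumed: divide by 24, 5 decimals)."""
    return round(price_per_day / HOURS_PER_DAY, 5)


# Daily prices copied from the default price sheet shown above.
daily_prices = {
    "CPU": 0.12,
    "GPU": 1.0,
    "MEMORY": 0.25,
    "NVIDIA A100-SXM4-40GB": 3.0,
}

hourly_prices = {name: price_per_hour(price) for name, price in daily_prices.items()}

for name, price in hourly_prices.items():
    print(f"{name:<24} {price:.5f}")
```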
Example 2: Update price sheet
# robin chargeback price-sheet --type "CPU" --price 0.14000 --type "GPU" --price 0.14000 --type "MEMORY" --price 0.30000 --type "HDD" --price 0.00040 --type "SSD" --price 0.00150 --type "NVIDIA A100-SXM4-40GB" --price 4.00000 --type "NVIDIA A100-SXM4-40GB-1g.5gb" --price 0.50000 --type "NVIDIA A100-SXM4-40GB-2g.10gb" --price 0.90000 --type "NVIDIA A100-SXM4-40GB-3g.20gb" --price 1.50000 --type "NVIDIA A100-SXM4-40GB-4g.20gb" --price 2.00000 --type "NVIDIA A100-SXM4-40GB-7g.40gb" --price 4.00000 --currency "USD"
Successfully updated the chargeback price-sheet
# robin chargeback price-sheet
Chargeback Price-sheet Currency: USD
Resource Type | Unit size | Price Per Hour | Price Per Day
------------------------------+---------------------------------+----------------+---------------
CPU | 1 CPU | 0.00583 | 0.14000
GPU | 1 GPU | 0.00583 | 0.14000
HDD | 1G | 0.00002 | 0.00040
MEMORY | 1G | 0.01250 | 0.30000
NVIDIA A100-SXM4-40GB | 1 NVIDIA A100-SXM4-40GB | 0.16667 | 4.00000
NVIDIA A100-SXM4-40GB-1g.5gb | 1 NVIDIA A100-SXM4-40GB-1g.5gb | 0.02083 | 0.50000
NVIDIA A100-SXM4-40GB-2g.10gb | 1 NVIDIA A100-SXM4-40GB-2g.10gb | 0.03750 | 0.90000
NVIDIA A100-SXM4-40GB-3g.20gb | 1 NVIDIA A100-SXM4-40GB-3g.20gb | 0.06250 | 1.50000
NVIDIA A100-SXM4-40GB-4g.20gb | 1 NVIDIA A100-SXM4-40GB-4g.20gb | 0.08333 | 2.00000
NVIDIA A100-SXM4-40GB-7g.40gb | 1 NVIDIA A100-SXM4-40GB-7g.40gb | 0.16667 | 4.00000
SSD | 1G | 0.00006 | 0.00150
Returns the chargeback price sheet which contains details on the prices associated with each type of tracked resource (in the selected currency).
End Point: /api/v3/robin_server/chargeback
Method: GET
URL Parameters:
pricesheet=true
: This mandatory parameter specifies that details of the price sheet should be returned.
Data Parameters: None
Port: RCM Port (default value is 29442)
Headers:
Authorization: <auth_token>
: Authorization token to identify which user is sending the request. The token can be acquired from the login API.
Success Response Code: 200
Error Response Code: 500 (Internal Server Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)
Example Response:
Output
{
"items":{
"currency":"USD",
"resources":[
{
"type":"CPU",
"unit_price":0.12,
"unit_size":1
},
{
"type":"MEMORY",
"unit_price":0.25,
"unit_size":1073741824
},
{
"type":"HDD",
"unit_price":0.0015,
"unit_size":1073741824
},
{
"type":"SSD",
"unit_price":0.0042,
"unit_size":1073741824
},
{
"type":"GPU",
"unit_price":1.0,
"unit_size":1
},
{
"type":"NVIDIA A100-SXM4-40GB",
"unit_price":3.0,
"unit_size":1
},
{
"type":"NVIDIA A100-SXM4-40GB-1g.5gb",
"unit_price":0.4285714286,
"unit_size":1
},
{
"type":"NVIDIA A100-SXM4-40GB-2g.10gb",
"unit_price":0.8571428571,
"unit_size":1
},
{
"type":"NVIDIA A100-SXM4-40GB-3g.20gb",
"unit_price":1.2857142857,
"unit_size":1
},
{
"type":"NVIDIA A100-SXM4-40GB-4g.20gb",
"unit_price":1.7142857143,
"unit_size":1
},
{
"type":"NVIDIA A100-SXM4-40GB-7g.40gb",
"unit_price":3.0,
"unit_size":1
}
]
}
}
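As a rough illustration of how a client might call this endpoint, the sketch below builds the GET request and extracts the daily prices from a response body. The host name `robin-master.example.com`, the `https` scheme, and both function names are assumptions made for the example; only the endpoint path, the `pricesheet=true` parameter, the `Authorization` header, and the default port come from the description above. The request is constructed but not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

RCM_PORT = 29442  # default RCM port stated above


def build_price_sheet_request(host: str, token: str, port: int = RCM_PORT) -> Request:
    """Build (but do not send) the price sheet request. Scheme and host are assumed."""
    query = urlencode({"pricesheet": "true"})
    url = f"https://{host}:{port}/api/v3/robin_server/chargeback?{query}"
    return Request(url, headers={"Authorization": token}, method="GET")


def daily_prices(body: str) -> dict:
    """Map each resource type in a price sheet response to its unit price."""
    sheet = json.loads(body)["items"]
    return {entry["type"]: entry["unit_price"] for entry in sheet["resources"]}


request = build_price_sheet_request("robin-master.example.com", "<auth_token>")

# Abbreviated copy of the example response above.
sample_body = json.dumps({
    "items": {
        "currency": "USD",
        "resources": [
            {"type": "CPU", "unit_price": 0.12, "unit_size": 1},
            {"type": "MEMORY", "unit_price": 0.25, "unit_size": 1073741824},
        ],
    }
})
prices = daily_prices(sample_body)
print(request.full_url)
print(prices)
```

To send the request, pass it to `urllib.request.urlopen` together with whatever TLS settings the cluster requires.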
16.2. Generating Chargeback report
Robin generates a Chargeback report that tracks the cost of each application in the cluster. The report can be filtered by tenant, user, application, or date interval to provide more focused information. Issue the following command to generate the Chargeback report:
# robin chargeback report --app <app>
--tenant <tenant>
--user <user>
--starttime <start_time>
--endtime <end_time>
--interval <interval>
--details
Note
Chargeback reports for deleted applications are saved. This enables historical cost information to be viewed.
--app <app> | Application name to filter by
--tenant <tenant> | Tenant name to filter by
--user <user> | User name to filter by
--starttime <start_time> | Start time in format YYYY-MM-DDTHH:MM:SS or YYYY-MM-DD or YYYY-MM
--endtime <end_time> | End time in format YYYY-MM-DDTHH:MM:SS or YYYY-MM-DD or YYYY-MM
--interval <interval> | Interval of report. Options include: yearly or monthly
--details | Provide additional information for each application
Example:
# robin chargeback report
+----------+-------+----------------+--------------+-------------------------------+-------------+-------+
| App | User | Tenant | State/Status | Resource Type | Price (USD) | Total |
+----------+-------+----------------+--------------+-------------------------------+-------------+-------+
| gpu-2 | robin | Administrators | ONLINE/Ready | CPU | 0.51 | 3.79 |
| | | | | HDD | 0.04 | |
| | | | | MEMORY | 2.27 | |
| | | | | NVIDIA A100-SXM4-40GB-1g.5gb | 0.97 | |
| test | robin | Administrators | ONLINE/Ready | CPU | 0.43 | 3.38 |
| | | | | HDD | 0.04 | |
| | | | | MEMORY | 1.93 | |
| | | | | NVIDIA A100-SXM4-40GB-1g.5gb | 0.98 | |
| cl | robin | Administrators | ONLINE/Ready | CPU | 0.49 | 5.54 |
| | | | | HDD | 0.04 | |
| | | | | MEMORY | 2.18 | |
| | | | | NVIDIA A100-SXM4-40GB-3g.20gb | 2.83 | |
| tets-123 | robin | Administrators | ONLINE/Ready | CPU | 0.09 | 1.42 |
| | | | | HDD | 0.01 | |
| | | | | MEMORY | 0.37 | |
| | | | | NVIDIA A100-SXM4-40GB-3g.20gb | 0.95 | |
+----------+-------+----------------+--------------+-------------------------------+-------------+-------+
Generates a Chargeback report that tracks the cost of each application in the cluster. The report can be filtered by tenant, user, application, or date interval to provide more focused information.
End Point: /api/v3/robin_server/chargeback
Method: GET
URL Parameters:
appname=<app_name>
: Utilizing this parameter filters the results such that only applications whose name matches the specified application name are returned.
tenantname=<tenant_name>
: Utilizing this parameter filters the results such that only applications within the specified tenant are returned.
username=<user_name>
: Utilizing this parameter filters the results such that only applications created by the specified user are returned.
starttime=<start_time>
: Utilizing this parameter results in the price of each application being calculated from the specified start time. This field should be specified in one of the following formats: YYYY-MM-DDTHH:MM:SS, YYYY-MM-DD or YYYY-MM.
endtime=<end_time>
: Utilizing this parameter results in the price of each application being calculated until the specified end time. This field should be specified in one of the following formats: YYYY-MM-DDTHH:MM:SS, YYYY-MM-DD or YYYY-MM.
details=true
: Utilizing this parameter results in additional information for each application being returned.
interval=[1,2]
: Utilizing this parameter results in application information within the specified interval being returned. Options include 1, indicating a yearly interval, or 2, which indicates a monthly interval.
Data Parameters: None
Port: RCM Port (default value is 29442)
Headers:
Authorization: <auth_token>
: Authorization token to identify which user is sending the request. The token can be acquired from the login API.
Success Response Code: 200
Error Response Code: 500 (Internal Server Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)
Example Response:
Output
[
{
"appid":3,
"name":"gpu-2",
"username":"robin",
"tenantname":"Administrators",
"pods":{
"3":{
"name":"gpu-2.main.01",
"resource_entries":[
]
}
},
"resources":{
"4":{
"type":"MEMORY",
"unit_size":1048576,
"total_time":813080576,
"min_units":4096,
"max_units":4096,
"price":2.3
},
"5":{
"type":"HDD",
"unit_size":1048576,
"total_time":2439696384,
"min_units":12288,
"max_units":12288,
"price":0.04
},
"3":{
"type":"CPU",
"unit_size":1,
"total_time":397012,
"min_units":2,
"max_units":2,
"price":0.51
},
"9":{
"type":"NVIDIA A100-SXM4-40GB-1g.5gb",
"unit_size":1,
"total_time":198543,
"min_units":1,
"max_units":1,
"price":0.98
}
},
"total_price":3.83,
"state":"ONLINE",
"status":"Partial"
},
{
"appid":2,
"name":"test",
"username":"robin",
"tenantname":"Administrators",
"pods":{
"2":{
"name":"test.main.01",
"resource_entries":[
]
}
},
"resources":{
"3":{
"type":"CPU",
"unit_size":1,
"total_time":340764,
"min_units":2,
"max_units":2,
"price":0.43
},
"4":{
"type":"MEMORY",
"unit_size":1048576,
"total_time":697884672,
"min_units":4096,
"max_units":4096,
"price":1.96
},
"5":{
"type":"HDD",
"unit_size":1048576,
"total_time":2446712832,
"min_units":12288,
"max_units":12288,
"price":0.04
},
"9":{
"type":"NVIDIA A100-SXM4-40GB-1g.5gb",
"unit_size":1,
"total_time":199114,
"min_units":1,
"max_units":1,
"price":0.99
}
},
"total_price":3.42,
"state":"ONLINE",
"status":"Partial"
},
{
"appid":5,
"name":"cl",
"username":"robin",
"tenantname":"Administrators",
"pods":{
"4":{
"name":"cl.main.01",
"resource_entries":[
]
}
},
"resources":{
"3":{
"type":"CPU",
"unit_size":1,
"total_time":384950,
"min_units":2,
"max_units":2,
"price":0.49
},
"4":{
"type":"MEMORY",
"unit_size":1048576,
"total_time":788377600,
"min_units":4096,
"max_units":4096,
"price":2.21
},
"7":{
"type":"NVIDIA A100-SXM4-40GB-3g.20gb",
"unit_size":1,
"total_time":192476,
"min_units":1,
"max_units":1,
"price":2.86
},
"5":{
"type":"HDD",
"unit_size":1048576,
"total_time":2365145088,
"min_units":12288,
"max_units":12288,
"price":0.04
}
},
"total_price":5.6,
"state":"ONLINE",
"status":"Partial"
},
{
"appid":6,
"name":"tets-123",
"username":"robin",
"tenantname":"Administrators",
"pods":{
"5":{
"name":"tets-123.main.01",
"resource_entries":[
]
}
},
"resources":{
"3":{
"type":"CPU",
"unit_size":1,
"total_time":68738,
"min_units":2,
"max_units":2,
"price":0.09
},
"4":{
"type":"MEMORY",
"unit_size":1048576,
"total_time":140775424,
"min_units":4096,
"max_units":4096,
"price":0.4
},
"5":{
"type":"HDD",
"unit_size":1048576,
"total_time":422326272,
"min_units":12288,
"max_units":12288,
"price":0.01
},
"7":{
"type":"NVIDIA A100-SXM4-40GB-3g.20gb",
"unit_size":1,
"total_time":68738,
"min_units":2,
"max_units":2,
"price":1.02
}
},
"total_price":1.52,
"state":"ONLINE",
"status":"Partial"
}
]
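The report parameters and the response layout can be combined as in the following sketch. It assembles a query string from optional filters and checks that the per-resource prices of one application add up to its `total_price`, which holds for every entry in the example response above. The function names, the `INTERVAL_CODES` mapping, and the `2024-01` start time are illustrative assumptions; the parameter names and the numeric interval codes come from the list of URL parameters.

```python
from typing import Optional
from urllib.parse import urlencode

# Numeric interval codes from the URL parameter description: 1 = yearly, 2 = monthly.
INTERVAL_CODES = {"yearly": 1, "monthly": 2}


def report_query(app: Optional[str] = None, tenant: Optional[str] = None,
                 user: Optional[str] = None, start: Optional[str] = None,
                 end: Optional[str] = None, interval: Optional[str] = None,
                 details: bool = False) -> str:
    """Build the query string for a chargeback report, skipping unset filters."""
    params = {
        "appname": app,
        "tenantname": tenant,
        "username": user,
        "starttime": start,
        "endtime": end,
    }
    if interval is not None:
        params["interval"] = INTERVAL_CODES[interval]
    if details:
        params["details"] = "true"
    return urlencode({key: value for key, value in params.items() if value is not None})


def resource_total(app_entry: dict) -> float:
    """Sum the per-resource prices of one application entry, rounded to cents."""
    return round(sum(res["price"] for res in app_entry["resources"].values()), 2)


query = report_query(tenant="Administrators", start="2024-01",
                     interval="monthly", details=True)

# Prices copied from the "gpu-2" entry of the example response.
gpu2 = {
    "name": "gpu-2",
    "resources": {
        "4": {"type": "MEMORY", "price": 2.3},
        "5": {"type": "HDD", "price": 0.04},
        "3": {"type": "CPU", "price": 0.51},
        "9": {"type": "NVIDIA A100-SXM4-40GB-1g.5gb", "price": 0.98},
    },
    "total_price": 3.83,
}

print(query)
print(resource_total(gpu2), gpu2["total_price"])
```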
16.3. List all tracked resources
To view all physical resources that could be utilized by applications and consequently tracked by independent chargeback counters, issue the following command:
# robin chargeback resource-list --json
--json | Output in JSON
Example:
# robin chargeback resource-list
+----+-----------------------------------------------+--------+----------+
| ID | Name | Type | NumHosts |
+----+-----------------------------------------------+--------+----------+
| 1 | DEFAULT-HDD | HDD | 0 |
| 2 | DEFAULT-SSD | SSD | 0 |
| 3 | Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz | CPU | 7 |
| 4 | UNKNOWN-GENERIC-UNKNOWN | MEMORY | 7 |
| 5 | AWS-EBS-General Purpose-107374182400 | SSD | 5 |
| 6 | Tesla K80 | GPU | 6 |
| 7 | AWS-EBS-General Purpose-53687091200 | SSD | 2 |
+----+-----------------------------------------------+--------+----------+
Returns all physical resources that could be utilized by applications and consequently tracked by independent chargeback counters.
End Point: /api/v3/robin_server/chargeback
Method: GET
URL Parameters:
resourceonly=true
: This mandatory parameter specifies that only information on available resources should be returned.
Data Parameters: None
Port: RCM Port (default value is 29442)
Headers:
Authorization: <auth_token>
: Authorization token to identify which user is sending the request. The token can be acquired from the login API.
Success Response Code: 200
Error Response Code: 500 (Internal Server Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)
Example Response:
Output
{
"items":[
{
"name":"DEFAULT-HDD",
"id":1,
"hosts":[
],
"attributes":{
"default":true,
"type":"HDD"
},
"type":"HDD"
},
{
"name":"DEFAULT-SSD",
"id":2,
"hosts":[
],
"attributes":{
"default":true,
"type":"SSD"
},
"type":"SSD"
},
{
"name":"Intel(R) Xeon(R) Gold 5220 CPU @ 2.20GHz",
"id":3,
"hosts":[
"cscale-82-140.robinsystems.com",
"cscale-82-139.robinsystems.com"
],
"attributes":{
"cache_size_kb":25344,
"min_speed_mhz":null,
"model_name":"Intel(R) Xeon(R) Gold 5220 CPU @ 2.20GHz",
"max_speed_mhz":null,
"flags":"fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec spec_ctrl intel_stibp flush_l1d arch_capabilities",
"online":1,
"cur_speed_mhz":2194.841,
"cpu_index":0,
"vendor_id":"GenuineIntel",
"model":85,
"physical_id":0,
"core_id":0
},
"type":"CPU"
},
{
"name":"UNKNOWN-GENERIC-UNKNOWN",
"id":4,
"hosts":[
"cscale-82-140.robinsystems.com",
"cscale-82-139.robinsystems.com"
],
"attributes":{
},
"type":"MEMORY"
},
{
"name":"UNKNOWN-UNKNOWN-107374182400",
"id":5,
"hosts":[
"cscale-82-140.robinsystems.com",
"cscale-82-139.robinsystems.com"
],
"attributes":{
"0x600224804c48fd7e16c608dea0919064":{
"aws_path":null,
"availability_zone":null,
"devpath":"\/dev\/disk\/by-id\/scsi-3600224804c48fd7e16c608dea0919064",
"log_sec_size":512,
"capacity":107374182400,
"kernel_name":"\/dev\/sdb",
"smart_info":{
"enabled":false,
"available":false
},
"Valid":true,
"discard":{
"enabled":true,
"zeroes_data":0,
"granularity":2097152,
"max_bytes":4294966784
},
"make":null,
"type":"HDD",
"phy_sec_size":4096,
"model":null,
"wwn":"0x600224804c48fd7e16c608dea0919064"
},
"0x60022480940ed076551cfaf75612e24e":{
"aws_path":null,
"availability_zone":null,
"devpath":"\/dev\/disk\/by-id\/scsi-360022480940ed076551cfaf75612e24e",
"log_sec_size":512,
"capacity":107374182400,
"kernel_name":"\/dev\/sdb",
"smart_info":{
"enabled":false,
"available":false
},
"Valid":true,
"discard":{
"enabled":true,
"zeroes_data":0,
"granularity":2097152,
"max_bytes":4294966784
},
"make":null,
"type":"HDD",
"phy_sec_size":4096,
"model":null,
"wwn":"0x60022480940ed076551cfaf75612e24e"
},
"0x60022480ffcf3deb224fb37d78fe7767":{
"aws_path":null,
"availability_zone":null,
"devpath":"\/dev\/disk\/by-id\/scsi-360022480ffcf3deb224fb37d78fe7767",
"log_sec_size":512,
"capacity":107374182400,
"kernel_name":"\/dev\/sdc",
"smart_info":{
"enabled":false,
"available":false
},
"Valid":true,
"discard":{
"enabled":true,
"zeroes_data":0,
"granularity":2097152,
"max_bytes":4294966784
},
"make":null,
"type":"HDD",
"phy_sec_size":4096,
"model":null,
"wwn":"0x60022480ffcf3deb224fb37d78fe7767"
},
"0x600224803bcdafde95b1f5cd27ceb5fb":{
"aws_path":null,
"availability_zone":null,
"devpath":"\/dev\/disk\/by-id\/scsi-3600224803bcdafde95b1f5cd27ceb5fb",
"log_sec_size":512,
"capacity":107374182400,
"kernel_name":"\/dev\/sdc",
"smart_info":{
"enabled":false,
"available":false
},
"Valid":true,
"discard":{
"enabled":true,
"zeroes_data":0,
"granularity":2097152,
"max_bytes":4294966784
},
"make":null,
"type":"HDD",
"phy_sec_size":4096,
"model":null,
"wwn":"0x600224803bcdafde95b1f5cd27ceb5fb"
}
},
"type":"HDD"
}
]
}
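The JSON response carries the same information as the CLI table, with the host list in place of the NumHosts column. The sketch below reduces a response body to ID, name, type, and host count rows. The function name `summarize_resources` is hypothetical, and the sample body is an abbreviated copy of the example response with the `attributes` objects left empty.

```python
import json
from typing import List, Tuple


def summarize_resources(body: str) -> List[Tuple[int, str, str, int]]:
    """Return (id, name, type, number of hosts) for every tracked resource."""
    items = json.loads(body)["items"]
    return [(item["id"], item["name"], item["type"], len(item["hosts"]))
            for item in items]


# Abbreviated copy of the example response above.
sample_body = json.dumps({
    "items": [
        {"name": "DEFAULT-HDD", "id": 1, "hosts": [], "attributes": {}, "type": "HDD"},
        {"name": "UNKNOWN-GENERIC-UNKNOWN", "id": 4,
         "hosts": ["cscale-82-140.robinsystems.com", "cscale-82-139.robinsystems.com"],
         "attributes": {}, "type": "MEMORY"},
    ]
})

rows = summarize_resources(sample_body)
for resource_id, name, kind, num_hosts in rows:
    print(f"{resource_id:>2} | {name:<24} | {kind:<6} | {num_hosts}")
```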
16.4. Show information about a specific resource
To view detailed information about a particular resource, such as the host(s) on which it resides and resource specific attributes, issue the following command:
# robin chargeback resource-info [<resource_id>]
<resource_id> | ID of the resource, as shown in the ID column of the tracked resource list (see section 16.3)
Example:
# robin chargeback resource-info 1
ID: 1
Name: Intel Xeon E312xx (Sandy Bridge, IBRS update)
Type: CPU
Attributes:
vendor_id : GenuineIntel
online : 1
model_name : Intel Xeon E312xx (Sandy Bridge, IBRS update)
max_speed_mhz : None
cpu_index : 0
cur_speed_mhz : 1999.999
model : 42
core_id : 0
physical_id : 0
cache_size_kb : 4096
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
min_speed_mhz : None
Hosts:
vnode36.robinsystems.com
vnode89.robinsystems.com
Returns detailed information about a particular resource, such as the host(s) on which it resides and resource specific attributes.
End Point: /api/v3/robin_server/chargeback
Method: GET
URL Parameters:
resourceonly=true
: This mandatory parameter specifies that only information on available resources should be returned.
restypeid=<resource_id>
: This mandatory parameter specifies the ID of the resource for which information should be returned.
Data Parameters: None
Port: RCM Port (default value is 29442)
Headers:
Authorization: <auth_token>
: Authorization token to identify which user is sending the request. The token can be acquired from the login API.
Success Response Code: 200
Error Response Code: 500 (Internal Server Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error), 404 (Not Found Error)
Example Response:
Output
{
"items":[
{
"name":"Intel(R) Xeon(R) Gold 5220 CPU @ 2.20GHz",
"id":3,
"hosts":[
"cscale-82-140.robinsystems.com",
"cscale-82-139.robinsystems.com"
],
"attributes":{
"cache_size_kb":25344,
"min_speed_mhz":null,
"model_name":"Intel(R) Xeon(R) Gold 5220 CPU @ 2.20GHz",
"max_speed_mhz":null,
"flags":"fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec spec_ctrl intel_stibp flush_l1d arch_capabilities",
"online":1,
"cur_speed_mhz":2194.841,
"cpu_index":0,
"vendor_id":"GenuineIntel",
"model":85,
"physical_id":0,
"core_id":0
},
"type":"CPU"
}
]
}
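A client can turn this response into a layout similar to the CLI output. The following sketch builds the two mandatory query parameters and formats the first returned item; the function names are hypothetical, the exact CLI layout is only approximated, and the sample item is a shortened copy of the example response.

```python
from urllib.parse import urlencode


def resource_info_query(resource_id: int) -> str:
    """Query string with the two mandatory parameters for a single resource."""
    return urlencode({"resourceonly": "true", "restypeid": resource_id})


def format_resource(item: dict) -> str:
    """Render one resource item in a layout resembling the CLI output."""
    lines = [f"ID: {item['id']}", f"Name: {item['name']}", f"Type: {item['type']}",
             "Attributes:"]
    lines += [f"  {key} : {value}" for key, value in item["attributes"].items()]
    lines.append("Hosts:")
    lines += [f"  {host}" for host in item["hosts"]]
    return "\n".join(lines)


# Shortened copy of the example response above.
response = {
    "items": [
        {
            "name": "Intel(R) Xeon(R) Gold 5220 CPU @ 2.20GHz",
            "id": 3,
            "hosts": ["cscale-82-140.robinsystems.com", "cscale-82-139.robinsystems.com"],
            "attributes": {"vendor_id": "GenuineIntel", "model": 85, "cache_size_kb": 25344},
            "type": "CPU",
        }
    ]
}

query = resource_info_query(3)
text = format_resource(response["items"][0])
print(query)
print(text)
```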