17. Asynchronous Disaster Recovery

Starting from Robin CNP 5.4.1, Robin.io provides the snapshot-based Asynchronous Disaster Recovery (DR) feature.

The feature enables you to replicate your Kubernetes-based stateful applications along with their constructs (PVC, StatefulSet, Config maps, Secrets, Services, etc.) onto a remote secondary peer cluster (site). You can manually fail over to the secondary cluster in the event of a disaster or during maintenance activities.

The DR feature provides an encryption option that you can enable when transmitting data over the wire to a peer cluster. For more information, see Over-Wire Data Transfer Encryption.

The Robin Asynchronous Disaster Recovery feature allows you to bring your applications back online with minimal downtime by failing over to the secondary cluster (site) when a disaster occurs, and to fail back later.

You can configure the required Recovery Point Objective (RPO) for your applications.

As Robin CNP runs on the cloud and on-premise, you can use the Robin Replication feature on both platforms. For more information, see Set Up Robin Asynchronous Disaster Recovery.

17.1. Concepts

Described below are the ideas and constructs that are fundamental to the Asynchronous Disaster Recovery feature. It is highly recommended that you understand all of these concepts thoroughly before using the replication process. Each section summarizes a crucial component of the feature and must be taken into account when planning the recovery procedure in case of a failure.

17.1.1. Peer Clusters

A Peer Cluster in the context of the Disaster Recovery process represents a Robin cluster. As such, a DR pair requires two Peer Clusters that can communicate with each other and be synced. Each Peer Cluster hosts a Protection Group of its own and thus sends or receives snapshots of applications based on the role of the Protection Group it hosts.

In order for the replication process to begin, the two designated Robin clusters need to be paired together and become Peer Clusters to one another. Details on this pairing process can be found here.

17.1.1.1. Endpoint

An Endpoint is the IP address that a Peer Cluster hosting a Secondary Protection Group will use to contact the Robin cluster where the Primary Protection Group resides. As a result, it helps secure a connection between Peer Clusters.

By default, if no Endpoint is manually specified, the VIP of a highly available cluster is used. When this is not available, the IP address of the master node is used in its place. Information on the Endpoint to use is transferred to the Peer Cluster via the encoded blob generated on the pairing initiating cluster.
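The fallback order described above can be sketched as a small helper. This is an illustration only and not Robin code: the function name and parameters are hypothetical, and the only thing taken from the text is the order of preference (explicit Endpoint, then cluster VIP, then master node IP).

```python
from typing import Optional


def resolve_endpoint(explicit: Optional[str],
                     cluster_vip: Optional[str],
                     master_node_ip: str) -> str:
    """Pick the Endpoint using the documented order of preference.

    All names here are hypothetical; only the ordering comes from the text.
    """
    if explicit:            # an Endpoint passed by the administrator wins
        return explicit
    if cluster_vip:         # otherwise the VIP of an HA cluster is used
        return cluster_vip
    return master_node_ip   # last resort: the master node IP address
```

For example, `resolve_endpoint(None, None, "10.9.61.47")` falls through to the master node address.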

Note

For HA environments, it is recommended to use the VIP for the Endpoint. Details on how to update an Endpoint that is currently in-use can be found here.

17.1.2. Protection Groups

A Protection Group in the Robin Asynchronous DR feature is a logical construct that plays a key role in the disaster recovery setup and managing DR configuration on a Robin cluster. For more information, see Manage Protection Group.

You can have multiple Protection Groups in a Robin cluster as per your requirements.

The Protection Group comprises the following entities:

  • Applications

  • Peer cluster name

  • Asynchronous DR Role (Primary or Secondary)

  • Replication Policy

17.1.2.1. Roles

A role is assigned to each Protection Group when it is created. The following are the supported roles and their implications for applications:

  • Primary - Applications in a Protection Group with this role will be actively replicated and their data will be transferred to the associated Peer Cluster(s).

  • Secondary - Applications in a Protection Group with this role will receive updates from the cluster where the Primary Protection Group resides but their respective Pods will not be running. Protection Group(s) with this role are in place to take over the Primary role in the event of a failover.

For more information on role changes, see Failover or Failback a Cluster.

Note

You cannot have two active Protection Groups with the Primary role at the same time in a DR setup.

17.1.3. Replication Policy

A Replication Policy defines the frequency of data transfer between Peer Clusters. It needs to be attached to a primary Protection Group in order to take effect. This frequency applies to all the applications in the Protection Group. The replication of application metadata and data is primarily done through the snapshot mechanism, with snapshots being transferred between clusters. For more information on how to manage Replication Policies, see Manage Replication Policy.

17.1.3.1. Replication Snapshots

Replication Snapshots are snapshots that a Replication Policy automatically creates for the applications attached to a Protection Group, as per the cadence defined in its configuration. As they are system generated, they are transferred as part of the replication process by default. Moreover, all Replication Snapshots use the naming convention <appname>_<namespace>_<repl-policy-name>_<zoneid>_<creation-time> and carry the standard system_replicate:yes label.

Note

The creation time is displayed in the standard UNIX timestamp format within the snapshot name.
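As a rough illustration of the naming convention, the sketch below composes a name from its five fields and recovers the two trailing numeric fields. The helper names and the sample values are hypothetical. Only the trailing zone ID and creation time are split off, because application, namespace, and policy names may themselves contain underscores.

```python
from datetime import datetime, timezone


def build_snapshot_name(app, namespace, policy, zone_id, created_at):
    """Compose <appname>_<namespace>_<repl-policy-name>_<zoneid>_<creation-time>."""
    return f"{app}_{namespace}_{policy}_{zone_id}_{created_at}"


def split_trailing_fields(snapshot_name):
    """Return (prefix, zone_id, creation_time) for a Replication Snapshot name.

    The prefix is left unsplit: it holds app, namespace, and policy names,
    which cannot be separated reliably if they contain underscores.
    """
    prefix, zone_id, created_at = snapshot_name.rsplit("_", 2)
    created = datetime.fromtimestamp(int(created_at), tz=timezone.utc)
    return prefix, int(zone_id), created


# Hypothetical values, used only to exercise the helpers.
name = build_snapshot_name("mysqldb", "default", "minute_repl",
                           1639611848, 1639692409)
prefix, zone_id, created = split_trailing_fields(name)
```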

More details on how to configure the creation of these snapshots when first defining a Replication Policy can be found here. In addition, the Replication Snapshot feature can be enabled after the initial configuration of a Replication Policy using the robin replication-policy enable-repl-snaps command, details of which can be found here.

17.1.3.2. Utilizing Labels

Robin enables users to label snapshots in order to help categorize and identify them. These labels can also be specified within a Replication Policy, at the time of creation or afterwards, such that snapshots with the associated labels are replicated. This enables snapshots that are created manually or outside the replication process, via a snapshot schedule for instance, to be replicated to secondary Protection Groups as well.

For example, if a Replication Policy is associated with the label REPDEMO:Yes, application snapshots with the same label will be replicated at the frequency set for the Replication Policy alongside any replication snapshots if they are enabled. In addition, the number of snapshots with this label to retain is also configurable in case a given number need to be maintained. Details on how to associate labels with a Replication Policy can be found here.

Note

More than one label can be associated with a Replication Policy so as to allow a variety of normal snapshots to be replicated.
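The selection rule described in this section can be pictured with the following sketch. It is not Robin's implementation: the data shapes, the function name, and the assumption that a snapshot qualifies when it carries any one of the policy's labels are all illustrative.

```python
SYSTEM_LABEL = ("system_replicate", "yes")


def snapshots_to_replicate(snapshots, policy_labels, repl_snaps_enabled=True):
    """Return names of snapshots a policy would pick up (illustrative only).

    snapshots      -- list of dicts with 'name' and 'labels' (hypothetical shape)
    policy_labels  -- set of (key, value) pairs attached to the policy
    """
    selected = []
    for snap in snapshots:
        labels = set(snap["labels"].items())
        is_system = SYSTEM_LABEL in labels
        # Assumption: matching any one policy label is enough.
        matches_policy = bool(labels & policy_labels)
        if (is_system and repl_snaps_enabled) or matches_policy:
            selected.append(snap["name"])
    return selected


snaps = [
    {"name": "auto-1", "labels": {"system_replicate": "yes"}},
    {"name": "manual-1", "labels": {"REPDEMO": "Yes"}},
    {"name": "manual-2", "labels": {"team": "qa"}},
]
picked = snapshots_to_replicate(snaps, {("REPDEMO", "Yes")})
```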

17.1.4. Sync States

The robin protection-group info command, details for which can be found here, can be used to assess the sync state of applications associated with a Protection Group. The following are all the valid states an application can be in:

  • SYNC_SUCCESSFUL - This signifies a synchronization was successful with the latest synced snapshot also being displayed. This is a mirrored state so depending on the role of the Protection Group on the cluster where the state is checked either PRIMARY_SYNC_SUCCESSFUL or SECONDARY_SYNC_SUCCESSFUL will be displayed.

  • IN_PROGRESS - This signifies a synchronization is currently in progress. This is a mirrored state so depending on the role of the Protection Group on the cluster where the state is checked either PRIMARY_SYNC_IN_PROGRESS or SECONDARY_SYNC_IN_PROGRESS will be displayed.

  • SYNC_FAILED - This signifies a synchronization failed. This is a mirrored state so depending on the role of the Protection Group on the cluster where the state is checked either PRIMARY_SYNC_FAILED or SECONDARY_SYNC_FAILED will be displayed.

  • INITIAL_SYNC_PENDING - This signifies the very first synchronization still has not completed. This is a mirrored state so depending on the role of the Protection Group on the cluster where the state is checked either PRIMARY_INITIAL_SYNC_PENDING or SECONDARY_INITIAL_SYNC_PENDING will be displayed.

  • INITIAL_SYNC_IN_PROGRESS - This signifies the very first synchronization is currently running. This is a mirrored state so depending on the role of the Protection Group on the cluster where the state is checked either PRIMARY_INITIAL_SYNC_IN_PROGRESS or SECONDARY_INITIAL_SYNC_IN_PROGRESS will be displayed.

  • INITIAL_SYNC_FAILED - This signifies the very first synchronization has failed. This is a mirrored state so depending on the role of the Protection Group on the cluster where the state is checked either PRIMARY_INITIAL_SYNC_FAILED or SECONDARY_INITIAL_SYNC_FAILED will be displayed.

  • PRIMARY_NO_APP - This signifies a failover event has happened but the initial synchronization of the application within the primary Protection Group has not completed. This state will only be displayed when the command is run on the cluster where the new primary Protection Group is hosted.

  • SECONDARY_APP_WITH_NO_PRIMARY - This signifies a failover event has happened but the initial synchronization of the application on the new primary Protection Group did not complete. This state will only be displayed when the command is run on the cluster where the new secondary Protection Group is hosted.

  • PENDING_USER_ASSIGNMENT_AT_PEER - This signifies the application needs to be assigned to a user present on the cluster hosting the secondary Protection Group. This state will only be displayed when the command is run on the cluster where the primary Protection Group is hosted.

  • PENDING_USER_ASSIGNMENT - This signifies the application needs to be assigned to a user present on the cluster hosting the secondary Protection Group. This state will only be displayed when the command is run on the cluster where the secondary Protection Group is hosted.

  • UNDEFINED - This signifies that the application sync state does not meet any of the above states and will only appear in rare conditions.
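The relationship between the states listed above and the strings actually displayed can be summarized in a lookup. The sketch below is only a reading aid built from the list: the function and table names are hypothetical, and it does not reflect how Robin computes states internally.

```python
# Mirrored states: the displayed value is prefixed with the local role.
MIRRORED = {
    "SYNC_SUCCESSFUL": "SYNC_SUCCESSFUL",
    "IN_PROGRESS": "SYNC_IN_PROGRESS",
    "SYNC_FAILED": "SYNC_FAILED",
    "INITIAL_SYNC_PENDING": "INITIAL_SYNC_PENDING",
    "INITIAL_SYNC_IN_PROGRESS": "INITIAL_SYNC_IN_PROGRESS",
    "INITIAL_SYNC_FAILED": "INITIAL_SYNC_FAILED",
}

# States shown only on the cluster hosting a Protection Group of this role.
ROLE_SPECIFIC = {
    "PRIMARY_NO_APP": "PRIMARY",
    "SECONDARY_APP_WITH_NO_PRIMARY": "SECONDARY",
    "PENDING_USER_ASSIGNMENT_AT_PEER": "PRIMARY",
    "PENDING_USER_ASSIGNMENT": "SECONDARY",
}


def displayed_state(state: str, role: str) -> str:
    """Return the string one would expect to see for a state and local role."""
    role = role.upper()
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError(f"unknown role: {role}")
    if state in MIRRORED:
        return f"{role}_{MIRRORED[state]}"
    if state in ROLE_SPECIFIC:
        if ROLE_SPECIFIC[state] != role:
            raise ValueError(f"{state} is not shown on a {role} cluster")
        return state
    return "UNDEFINED"
```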

17.1.5. Role-based Access Control

Role-based Access Control (RBAC) for objects created as part of the replication process follows a similar model to that of the Robin platform; namely, it is based around tenants and the users contained within them. Specifically, Protection Groups must be associated with a tenant and only applications within the same tenant can be assigned to them. Consequently, users within the same tenant must own the applications that are to be associated with the given Protection Group.

This model holds true for secondary Protection Groups on Peer Clusters as well. However, there is some flexibility with regard to which tenants the secondary Protection Groups are assigned to, as well as which users own the respective attached applications on Peer Clusters. The determining factor for these associations is the Auto-Assignment Mode that is set on the Peer Cluster. Details on the different modes available and what each offers can be found here.

17.1.6. Auto-Assignment Modes

The Auto-Assignment Mode configured on the Peer Cluster hosting a secondary Protection Group determines which tenant that Protection Group is associated with. Furthermore, it also determines which of the tenant's users are assigned the applications attached to the Protection Group.

The following are the different modes available and how they influence assignments on the Peer Cluster:

MIRROR

This is the default mode. As its name suggests, it assigns the secondary Protection Group to the tenant on the Peer Cluster with the same name as the one associated with the primary Protection Group. This is also the case for the tenant users to which applications are assigned. As a result, when utilizing this mode, the configuration of tenants and users on the Peer Cluster must mirror that of the Robin cluster hosting the primary Protection Group. If this is not the case, the Peer Cluster cannot be registered.

DEFAULT

With this mode, the Protection Group is assigned to the default Administrators tenant on the Peer Cluster that is created as part of the cluster installation. In addition, applications are assigned to the super-admin user, of which there must be at least one within the tenant. This mode is useful when there are custom tenants/users on the Robin cluster hosting the primary Protection Group that cannot be recreated or replicated on the Peer Cluster.

MANUAL

With this mode, the super-admin user on the Peer Cluster has to manually assign the Protection Group to a tenant as well as assign ownership of the applications to users within the same tenant. These steps have to be completed for the replication process to proceed. If this is not the case, the following states will be displayed for the primary Protection Group and its applications respectively: PENDING_TENANT_ASSIGNMENT_AT_PEER and PENDING_USER_ASSIGNMENT_AT_PEER. Similarly on the Peer Cluster the following states will be seen: PENDING_TENANT_ASSIGNMENT and PENDING_USER_ASSIGNMENT.

As its name suggests, the auto_assign_mode config attribute controls the current setting for the Auto-Assignment Mode. More information on the variable can be found here. In order to update the mode in use, the robin config update command, detailed here, must be used.

Important

The desired Auto-Assignment Mode must be set on the Peer Cluster hosting the secondary Protection Group in order to take effect.
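To compare the three modes side by side, the sketch below models how a tenant and an application owner might be chosen on the Peer Cluster. It is a simplified illustration under stated assumptions: the function, its parameters, and the dictionary describing the peer's tenants are hypothetical, and raising an error stands in for the registration failure that MIRROR mode produces when the configurations do not match.

```python
from typing import Dict, Optional, Set, Tuple

DEFAULT_TENANT = "Administrators"


def resolve_assignment(mode: str,
                       primary_tenant: str,
                       primary_owner: str,
                       peer_tenants: Dict[str, Set[str]],
                       super_admin: str,
                       ) -> Tuple[Optional[str], Optional[str]]:
    """Return (tenant, user) on the Peer Cluster, or (None, None) if pending.

    peer_tenants maps tenant names on the peer to their user names
    (a hypothetical shape used only for this sketch).
    """
    mode = mode.upper()
    if mode == "MIRROR":
        users = peer_tenants.get(primary_tenant)
        if users is None or primary_owner not in users:
            raise LookupError("peer tenants/users do not mirror the primary")
        return primary_tenant, primary_owner
    if mode == "DEFAULT":
        return DEFAULT_TENANT, super_admin
    if mode == "MANUAL":
        # Nothing is assigned until the super-admin does it by hand.
        return None, None
    raise ValueError(f"unknown mode: {mode}")
```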

17.1.7. Metrics

Metrics for objects that are part of the replication process can be viewed using a provided Grafana dashboard. The following details are displayed as part of the dashboard for each registered Protection Group:

  • Protection Group role

  • Data transfer rate

  • Sync and pause states for associated applications

  • Creation time and ID of the last synced snapshot for associated applications

More information about enabling metrics can be found here.

17.1.8. Limitations

The following are the limitations of the asynchronous DR feature:

  • Snapshot restore to the original location from the backup is not supported.

  • Disaster recovery configuration with different media type clusters is not supported.

  • Multisite asynchronous DR is not supported.

  • Applications which are associated with a Protection Group cannot be scaled or have volumes added to them. In order to run these operations on an application linked to a Protection Group, it must be removed from the Protection Group first, then updated and finally re-associated.

  • Clones of applications cannot be associated with Protection Groups and thus are not valid for replication. This also holds true for applications with ephemeral volumes (AEVs).

  • Robin Bundle application with PDV and AEV is not supported for disaster recovery.

17.1.9. Failover

A failover is the term used to describe an event wherein the Robin cluster hosting the primary Protection Group goes offline. This shutdown might be due to planned maintenance or an unexpected disaster. In this scenario, the secondary Protection Group hosted on a Peer Cluster needs to become the primary Protection Group in order to continue serving applications and their data. In addition, the previous primary Protection Group will have to assume the secondary role. These role changes have to be performed manually using the robin protection-group change-role command, detailed here. An example can also be seen here.

Note

If the Robin cluster hosting the primary Protection Group(s) is accessible, the respective Protection Group(s) must have their roles changed to secondary before the original secondary Protection Group(s) can change their roles to primary. If the cluster hosting the primary Protection Group(s) cannot be reached, the original secondary Protection Group(s) will be allowed to assume the primary role. However, as soon as that cluster is accessible again, its Protection Group(s) must be changed to the secondary role, as there cannot be two active primary Protection Groups in a DR pair.
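The ordering rule in the note above can be expressed as a small planning helper that only produces the sequence of commands, without running anything. The command form follows the role change example in this chapter; the `Secondary` argument spelling is assumed by symmetry with `Primary`, and the function name and location strings are hypothetical.

```python
def failover_plan(pg_name: str, primary_reachable: bool):
    """Return (where, command) steps in the order the note requires.

    The 'Secondary' role argument is an assumption; only 'Primary'
    appears in the documented example.
    """
    demote = f"robin protection-group change-role {pg_name} Secondary"
    promote = f"robin protection-group change-role {pg_name} Primary"
    steps = []
    if primary_reachable:
        # Demote first so two primaries never coexist.
        steps.append(("original primary cluster", demote))
        steps.append(("peer cluster", promote))
    else:
        # Promote now, and demote the old primary once it is back.
        steps.append(("peer cluster", promote))
        steps.append(("original primary cluster, once reachable", demote))
    return steps
```

Calling `failover_plan("prod-pg", primary_reachable=True)` lists the demotion before the promotion, while the unreachable case reverses the order and defers the demotion.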

17.1.10. Failback

A failback is the term used to describe an event wherein the original primary Protection Group, which assumed the role of a secondary Protection Group due to a failover event, returns to being a primary Protection Group once again. The end goal of a failback operation is to make the original primary Protection Group the leader again, thus restoring the original configuration of the DR pair. However, before the operation is performed, it is highly recommended that users confirm the delta changes from the current primary Protection Group to the original have been replicated in order to minimize data loss. As with the failover operation, the role change needs to be performed manually using the robin protection-group change-role command, detailed here.

Important

Performing a failback operation is optional; it is solely up to the user's discretion and preference.

17.2. Disaster Recovery process

17.2.2. Changing roles for a Protection Group

During any disaster recovery process, whether as part of a failover or failback operation, the roles of the respective Protection Groups within the replication pair must be changed. This role change has to be performed manually, so it is mandatory that the robin protection-group change-role command, details for which can be found here, be run after logging into the respective cluster. Every role change operation aborts ongoing replication tasks and waits until all pending replication tasks are complete before it begins.

Detailed below are the two directions roles can be changed, and what occurs with each respective transition:

  • Primary to Secondary: When a Protection Group stops assuming the primary role, the applications associated with it are stopped and thus go offline. Any ongoing data transfers from the cluster hosting the primary Protection Group to the peer clusters are also halted.

  • Secondary to Primary: When a Protection Group assumes the primary role, all data transfer processes from the previous primary Protection Group are aborted and all associated applications are started. Each application assumes its state and configuration based on its most recent healthy snapshot.

Note

This transition occurs only when the cluster is in a healthy state (all hosts and volumes are in the Ready state) or the cluster was brought back after a certain time (e.g., reboot, power off, etc.).

17.2.3. Example Role Change

The following example shows how to change a Protection Group’s role from Secondary to Primary on a Peer Cluster.

# robin protection-group change-role prod-pg Primary
Job: 12695 Name: ProtectionGroupRoleChange State: VALIDATED Error: 0
Job: 12695 Name: ProtectionGroupRoleChange State: WAITING Error: 0
Job: 12695 Name: ProtectionGroupRoleChange State: COMPLETED Error: 0

# robin protection-group info prod-pg
Name: prod-pg
Tenant: Administrators
Role: PRIMARY
Peer Information:
=================

Peer Cluster Name: Sanfrancisco

Peer Cluster State: PAIRED
Peer Cluster ProtectionGroupState: PROTECTED
Replication Policy: minute_repl

+----------+---------------------------+-------------------------------------------+--------------------------+
| App Name | State                     | Latest Synced Snapshot                    | Latest Sycned Snap ctime |
+----------+---------------------------+-------------------------------------------+--------------------------+
| mysqldb  | SECONDARY_SYNC_SUCCESSFUL | mysqldb_minute_repl-1639611848-1639692409 | 2021-12-16 14:06:54      |
+----------+---------------------------+-------------------------------------------+--------------------------+
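When checking sync states after a role change from a script, the tabular part of the output can be turned into records. The sketch below assumes the table layout shown in the example above and copies its header text verbatim. The helper names are hypothetical, and the parsing is deliberately simple (it would not handle cell values containing `|`).

```python
SAMPLE = """
+----------+---------------------------+-------------------------------------------+--------------------------+
| App Name | State                     | Latest Synced Snapshot                    | Latest Sycned Snap ctime |
+----------+---------------------------+-------------------------------------------+--------------------------+
| mysqldb  | SECONDARY_SYNC_SUCCESSFUL | mysqldb_minute_repl-1639611848-1639692409 | 2021-12-16 14:06:54      |
+----------+---------------------------+-------------------------------------------+--------------------------+
"""


def parse_table(text):
    """Convert an ASCII table into a list of dicts keyed by column header."""
    header, rows = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def apps_not_in_sync(rows):
    """Return app names whose state does not end in SYNC_SUCCESSFUL."""
    return [r["App Name"] for r in rows
            if not r["State"].endswith("SYNC_SUCCESSFUL")]


rows = parse_table(SAMPLE)
```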

Note

Details on the robin protection-group change-role command can be found here.

17.3. Managing Peer Clusters

Topics covered in this section:

create

Create a Peer Cluster

pair

Pair a Peer Cluster

add-description

Add description for a Peer Cluster

list

List all Peer Clusters

encryption

Enable or disable encryption for a Peer Cluster

unpair

Unpair a Peer Cluster

update-endpoint

Update endpoint of a Peer Cluster

17.3.1. Create a Peer Cluster

As described previously two Robin CNP clusters are needed as part of a DR pair. One cluster is designated as the pairing initiating cluster whilst the other is referred to as the peer. The command described below enables the pairing process to begin by registering the peer cluster. When this is successful, an encoded blob is generated on the pairing initiating cluster. The encoded blob contains the required metadata for pairing. You must copy the blob from where it was generated and securely share it with the administrator at your peer cluster location. The blob must then be utilized by the robin peer-cluster pair command, details of which can be found here, in order to complete the pairing process.

Important

The encoded blob expires after ten minutes. You must use it within this time period on the peer cluster in order to complete the pairing process.

Note

For HA environments, it is recommended to use the VIP for the Endpoint.

To register a peer and generate an encoded blob for pairing, run the following command:

# robin peer-cluster create <name>
                            --description <description>
                            --endpoint <endpoint>

name

Name of the peer cluster

--description <description>

Description of the Peer Cluster

--endpoint <endpoint>

The IP Address at which the peer cluster will contact the pairing initiating cluster. If not provided, the cluster VIP or master node IP Address will be used

Example

# robin peer-cluster create test_peer_secondary

Provide this token to the secondary cluster:
eyJlbmRwb2ludCI6ICIxMC45LjYxLjQ3IiwgInBlZXJfYXV0aF90b2tlbiI6ICJleUowZVhBaU9pSktWMVFpTENKaGJHY2lPaUpJVXpJMU5pSjkuZXlKMWMyVnlYMmxrSWpveUxDSjBaVzVoYm5SZmFXUWlPakVzSW1WNGNDSTZNVFk1T0RRd01UY3pObjAuZkFveUM4S3FfQUdfZ2U5cTZKLXVpdC1mRjJtdUVsMzUtNFYzVklFeS1lWSIsICJwZWVyX3Rva2VuIjogIjNDUWVJNG1oQVRoV0VKb0FENTZVUFI1VSIsICJwZWVyX2NsdXN0ZXJfaWQiOiAiMWNkZmNiY2YtMTliYS00YThjLWEzMDMtNGNiMGFjZmU3N2YxIiwgInBlZXJfem9uZWlkIjogMTY0NjQ0NjI3OX0=
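The token in the example above appears to be Base64-encoded JSON (it begins with `eyJ`, the Base64 form of `{"`), and it carries fields such as `endpoint` along with authentication tokens. This is an observation about the sample and not a documented format, so treat it as an assumption. The sketch below round-trips a synthetic blob with a made-up address to show how the pairing metadata could be inspected. Because a real blob embeds credentials, it should be handled as a secret.

```python
import base64
import json


def decode_pairing_blob(blob: str) -> dict:
    """Decode a blob, assuming it is Base64-encoded JSON (not a documented format)."""
    return json.loads(base64.b64decode(blob))


# Synthetic blob with an illustrative address; not taken from a real cluster.
sample_blob = base64.b64encode(
    json.dumps({"endpoint": "192.0.2.10"}).encode("utf-8")
).decode("ascii")

metadata = decode_pairing_blob(sample_blob)
```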

17.3.2. Pair a Peer Cluster

As described previously, two Robin CNP clusters are needed as part of a DR pair. One cluster is designated as the pairing initiating cluster whilst the other is referred to as the peer. The command described below completes the pairing process on the peer cluster. It enables users to submit the encoded blob generated on the pairing initiating cluster, which results in an association of the two clusters by transmitting details of the peer back to the pairing initiating cluster. The blob can be generated on the pairing initiating cluster via the robin peer-cluster create command, details of which can be found here.

Important

The encoded blob expires after ten minutes. You must use it within this time period on the peer cluster in order to complete the pairing process.

To complete pairing on your peer cluster, run the following command:

# robin peer-cluster pair <name> <blob>
                                 --description <description>

name

Peer cluster name

blob

Encoded blob created on the pairing initiating cluster

--description <description>

Description of the Peer Cluster

Example

# robin peer-cluster pair test_peer_primary eyJlbmRwb2ludCI6ICIxMC45LjYxLjQ3IiwgInBlZXJfYXV0aF90b2tlbiI6ICJleUowZVhBaU9pSktWMVFpTENKaGJHY2lPaUpJVXpJMU5pSjkuZXlKMWMyVnlYMmxrSWpveUxDSjBaVzVoYm5SZmFXUWlPakVzSW1WNGNDSTZNVFk1T0RRd01UY3pObjAuZkFveUM4S3FfQUdfZ2U5cTZKLXVpdC1mRjJtdUVsMzUtNFYzVklFeS1lWSIsICJwZWVyX3Rva2VuIjogIjNDUWVJNG1oQVRoV0VKb0FENTZVUFI1VSIsICJwZWVyX2NsdXN0ZXJfaWQiOiAiMWNkZmNiY2YtMTliYS00YThjLWEzMDMtNGNiMGFjZmU3N2YxIiwgInBlZXJfem9uZWlkIjogMTY0NjQ0NjI3OX0= --wait
Job: 52 Name: PeerClusterPair State: VALIDATED Error: 0
Job: 52 Name: PeerClusterPair State: COMPLETED Error: 0

17.3.3. Add Description to a Peer Cluster

You can add or update the description for a peer cluster when required. In order to do so, run the following command:

# robin peer-cluster add-description <name> <description>

name

Name of the peer cluster

description

Description for the peer cluster

17.3.4. List all Peer Clusters

To list all peer clusters currently registered, run the following command:

# robin peer-cluster list --verbose

--verbose

Include additional information in the output

Example

# robin peer-cluster list
+----+---------+--------+------------+--------------+------------+
| ID | Name    | State  | IP-Address | Description  | Zoneid     |
+----+---------+--------+------------+--------------+------------+
| 2  | NewYork | PAIRED | 10.7.82.20 | Primary Site | 1639611100 |
+----+---------+--------+------------+--------------+------------+

17.3.5. Manage Over-Wire Data Transfer Encryption

You can enable encryption for a particular peer cluster in order to ensure the data transmitted to the peer is secure. By default, over-the-wire data transfer encryption is disabled and so must be enabled individually for each replication pair. Multiple encryption algorithms are supported, namely: AES128, AES256, and CHACHA20. To enable, disable, or change the encryption type for a peer cluster at any point in time, run the following command:

# robin peer-cluster encryption <name>
                                --algorithm <algorithm>
                                --enable
                                --disable

name

Name of the Peer Cluster to update the encryption status of

--algorithm <algorithm>

Encryption algorithm for the peer cluster. Options include: AES128, AES256, CHACHA20. Default is AES256

--enable

Enables on-wire encryption for the specified peer cluster

--disable

Disables on-wire encryption for the specified peer cluster

Note

At least one of the --enable or --disable parameters must be given, however both options cannot be specified at the same time.

Example

# robin peer-cluster encryption --enable --algorithm aes128 hyperv1
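The flag rules for this command (exactly one of `--enable` or `--disable`, plus an optional algorithm) can be mimicked for pre-validation in a wrapper script. The sketch below uses Python's `argparse` and is not the Robin CLI itself. Accepting lowercase algorithm names is inferred from the example above, and the program name is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented flag constraints."""
    parser = argparse.ArgumentParser(prog="peer-cluster-encryption-sketch")
    parser.add_argument("name", help="Name of the Peer Cluster")
    parser.add_argument(
        "--algorithm",
        type=str.upper,                      # normalize e.g. 'aes128'
        choices=["AES128", "AES256", "CHACHA20"],
        default="AES256",
    )
    # Exactly one of --enable / --disable must be supplied.
    toggle = parser.add_mutually_exclusive_group(required=True)
    toggle.add_argument("--enable", action="store_true")
    toggle.add_argument("--disable", action="store_true")
    return parser


args = build_parser().parse_args(
    ["--enable", "--algorithm", "aes128", "hyperv1"]
)
```

Passing both flags, or neither, makes `argparse` exit with a usage error, which matches the note above.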

17.3.6. Unpair a Cluster

In order to remove the relationship to a peer cluster and consequently unpair the clusters, run the following command:

Note

The peer cluster to be disassociated must be removed from all Protection Groups before it can be unpaired.

# robin peer-cluster unpair <peer>
                            --inform

peer

Name or zone ID of the Peer Cluster

--inform

Inform Peer Cluster to unpair as well

Note

If the --inform parameter is not utilized the unpair command must be run on both clusters in the replication pair.

Example

# robin peer-cluster unpair NewYork
Job: 53 Name: PeerClusterUnpair State: VALIDATED Error: 0
Job: 53 Name: PeerClusterUnpair State: COMPLETED Error: 0

17.3.7. Update endpoint of a Peer

The endpoint being used to contact a peer can be updated when there is a change to the associated IP Address. You must update the endpoint only when the network connection to the peer is available and the peer is not part of any Protection Groups. More information on the concept of endpoints can be found here.

Note

Contact Robin Support team before updating the endpoint.

To update an endpoint of a peer, run the following command:

# robin peer-cluster update-endpoint <name> <endpoint>
                                            --yes

name

Name of the peer cluster whose endpoint is being updated.

endpoint

New endpoint of the peer cluster.

--yes

Do not prompt the user for confirmation.

Example

# robin peer-cluster update-endpoint hyperv1 10.9.61.74

Please contact Robin before using this command as it will result in
serious repercussions. Do you wish to continue [y/n]?

17.4. Managing Protection Groups

Topics covered in this section:

create

Create a Protection Group

add-peer

Add peer to Protection Group

remove-peer

Remove peer from Protection Group

assign-tenant

Assign tenant to Protection Group

assign-app

Assign application in the Protection Group to a user

delete

Delete a Protection Group

add-app

Add an application to Protection Group

remove-app

Remove an application from a Protection Group

change-role

Change role of Protection Group

attach-repl-policy

Attach a replication policy to a Protection Group

detach-repl-policy

Detach a replication policy from a Protection Group

run-rpol

Run replication policy for a specific peer

info

Print info of protection group

list

List all Protection Groups

pause-replication

Pause replication for one or more applications in a Protection Group

resume-replication

Resume replication for one or more applications in a Protection Group

17.4.1. Create a Protection Group

In order to replicate applications across peer clusters, a Protection Group must be created on both clusters within the peer pair. The cluster on which the initial creation command for the Protection Group is run is set as the primary by default. In addition, a Protection Group of the same name is created on each of the specified peer clusters, if any, and is marked with the secondary role.

Note

Although any number of Protection Groups can be created, currently only one peer per Protection Group is supported.

To create a protection group, run the following command:

# robin protection-group create <name>
                                --peers <peers>

name

Name of the protection group

--peers <peers>

A comma separated list of Peers to use with this Protection Group

Note

You can also add peers after the fact using the robin protection-group add-peer command.

Example

# robin protection-group create prod-pg --peers NewYork
Job: 999 Name: ProtectionGroupCreateMultiPeer State: PROCESSED Error: 0
Job: 999 Name: ProtectionGroupCreateMultiPeer State: WAITING Error: 0
Job: 999 Name: ProtectionGroupCreateMultiPeer State: COMPLETED Error: 0

# robin protection-group list

+----+---------+---------+----------------+-------------+
| ID | Name    |  Role   |     Tenant     |    Peers    |
+----+---------+---------+----------------+-------------+
| 3  | prod-pg | PRIMARY | Administrators | ['NewYork'] |
+----+---------+---------+----------------+-------------+

Creates a primary Protection Group on the cluster from which it was initiated and replicates the configuration to any Peer Cluster(s) specified such that a secondary Protection Group with the same name is spawned on the aforementioned Peer Cluster(s).

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: POST

URL Parameters: None

Data Parameters:

  • action: pg-create - This mandatory field within the payload specifies the Protection Group create operation is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group to be created.

  • peers: <list_of_peers> - Specifying a comma-separated list of Peers with this parameter results in a secondary Protection Group being created on each of the Peer Clusters with a mirrored configuration.

  • add-peer: true - Utilizing this parameter within the payload, by specifying a boolean value, determines whether or not a Peer is added to the Protection Group to be created. This attribute is only valid when a list of Peers is specified via the peers attribute.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error), 409 (Duplicate Resource Error)

Example Response:

Output
{
   "jobid":275,
   "plan":{
      "kind":"robin",
      "quiesce":[
         "fs"
      ],
      "original_app_tenant":1,
      "snapname":"snap1",
      "action":"snapshot",
      "original_app_owner":3,
      "opcode":8,
      "current_user":{
         "session_expires":"2020-09-10T10:40:27",
         "tenant_id":1,
         "tenant":"Administrators",
         "user_capabilities":[
            "AllSuperAdminCapabilities"
         ],
         "user_contexts":[
            "robin"
         ],
         "username":"robin",
         "user_permissions":{

         },
         "tenant_role":"superadmin",
         "user_context":"robin",
         "tenants":[
            "Administrators"
         ],
         "user_id":3,
         "namespace":"t001-u000003",
         "ip_addr":"172.17.0.1"
      },
      "namespace":"t001-u000003",
      "name":"custom-labels-app"
   }
}
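The request fields listed above can be assembled as in the following minimal Python sketch, which builds the pg-create call but does not send it. The endpoint, port, method, header name, and field names come from this section; the host name, the https scheme, the JSON body encoding, sending the peer list as one comma-joined string, and the helper name build_pg_create_request are assumptions made for illustration.

```python
import json
import urllib.request

RCM_PORT = 29442  # default RCM port listed above
ENDPOINT = "/api/v3/robin_server/protection_groups?version=1&max_version=1"


def build_pg_create_request(host, token, pg_name, peers=None):
    """Build (but do not send) the pg-create POST request."""
    payload = {"action": "pg-create", "name": pg_name}
    if peers:
        # Assumption: the comma-separated list is sent as a single string.
        payload["peers"] = ",".join(peers)
        # add-peer is only valid when a list of peers is specified.
        payload["add-peer"] = True
    # Assumption: HTTPS transport and a JSON-encoded body.
    return urllib.request.Request(
        url=f"https://{host}:{RCM_PORT}{ENDPOINT}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host; the token placeholder must come from the login API.
req = build_pg_create_request("robin-master.example.com", "<auth_token>", "prod-pg", ["NewYork"])
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

Keeping the request construction separate from sending makes it easy to inspect the payload before it reaches a production cluster.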

17.4.2. Add a Peer Cluster to a Protection Group

You can add a Peer Cluster to an existing Protection Group in order to increase the number of clusters an application is replicated to. As part of this process, the configuration of the Protection Group is replicated to the Peer Cluster, resulting in an identical secondary Protection Group being created on that cluster. To add a Peer Cluster to an existing Protection Group, run the following command:

# robin protection-group add-peer <name> <peer>

name

Name of the Protection Group

peer

Name of the Peer Cluster

Example

# robin protection-group add-peer prod-pg NewYork
Submitted job '47'. Use 'robin job wait 47' to track the progress

# robin protection-group list
+----+---------+---------+----------------+-------------+
| ID | Name    | Role    | Tenant         | Peers       |
+----+---------+---------+----------------+-------------+
| 3  | prod-pg | PRIMARY | Administrators | ['NewYork'] |
+----+---------+---------+----------------+-------------+
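When scripting around this command, the job ID can be extracted from the submission message so that it can be passed to robin job wait. The following is a minimal sketch that assumes the message keeps the exact wording shown in the example above; the helper name parse_job_id is hypothetical.

```python
import re


def parse_job_id(cli_output):
    """Return the job ID from a 'Submitted job' line, or None if absent."""
    match = re.search(r"Submitted job '(\d+)'", cli_output)
    return int(match.group(1)) if match else None


line = "Submitted job '47'. Use 'robin job wait 47' to track the progress"
job_id = parse_job_id(line)
# Argument list for the follow-up command suggested by the CLI itself.
wait_cmd = ["robin", "job", "wait", str(job_id)]
print(job_id, wait_cmd)
```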

Associates a Peer Cluster with a pre-existing Protection Group and replicates the Protection Group configuration to that cluster, so that a secondary Protection Group with the same name is created on it.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: POST

URL Parameters: None

Data Parameters:

  • action: pg-create - This mandatory field within the payload specifies the Protection Group create operation is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group to add the specified Peer Cluster to.

  • add-peer: true - This mandatory boolean field within the payload indicates that a Peer is to be added to the Protection Group.

  • peer_name: <peer_name> - This mandatory field within the payload specifies the name of the Peer to add to the specified Protection Group.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error), 409 (Duplicate Resource Error)

Example Response:

Output

17.4.3. List all Protection Groups

To view the list of all Protection Groups currently registered with a cluster, run the following command:

# robin protection-group list --json

--json

Output in JSON

Example

# robin protection-group list
+----+---------+---------+----------------+-------+
| ID | Name    | Role    | Tenant         | Peers |
+----+---------+---------+----------------+-------+
| 2  | prod-pg | PRIMARY | Administrators | []    |
+----+---------+---------+----------------+-------+
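For scripting, the --json option is likely the better choice, but its schema is not shown in this section. As a fallback, the table layout shown in the example can be parsed as in the following minimal sketch, which assumes the column headers and the Python-style list in the Peers column appear exactly as above. The function name parse_pg_table is hypothetical.

```python
import ast


def parse_pg_table(text):
    """Turn the ASCII table printed by 'protection-group list' into dicts."""
    rows, header = [], None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders (+----+) and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first pipe-delimited line holds column names
            continue
        row = dict(zip(header, cells))
        # The Peers column is printed as a list literal, e.g. ['NewYork'].
        row["Peers"] = ast.literal_eval(row["Peers"])
        rows.append(row)
    return rows


sample = """\
+----+---------+---------+----------------+-------------+
| ID | Name    | Role    | Tenant         | Peers       |
+----+---------+---------+----------------+-------------+
| 3  | prod-pg | PRIMARY | Administrators | ['NewYork'] |
+----+---------+---------+----------------+-------------+
"""
for pg in parse_pg_table(sample):
    print(pg["Name"], pg["Role"], pg["Peers"])
```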

Lists all Protection Groups.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: GET

URL Parameters: None

Data Parameters: None

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 200

Error Response Code: 500 (Internal Server Error)

Example Response:

Output

17.4.4. View details of a Protection Group

Issue the following command to get detailed information such as the state, role, attached Replication Policy, associated Peer Cluster(s) and their state(s) with regard to a specific Protection Group:

# robin protection-group info <name>
                              --verbose

name

Protection Group name

--verbose

Show verbose information about the specified Protection Group

Example

# robin pg info prod-pg

Name: prod-pg
Tenant: Administrators
Role: PRIMARY
Peer Information:
=================

**Peer Cluster Name: NewYork**

Peer Cluster State: PAIRED
Peer Cluster ProtectionGroupState: PROTECTED
Replication Policy: No Replication Policy has been attached yet
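The text output above can be split into group-level and per-peer fields, for example to find peers that have no Replication Policy attached. The following is a minimal sketch that assumes the layout shown in the example; in particular, attributing every field after a Peer Cluster Name line to that peer is an assumption, and the function name parse_pg_info is hypothetical.

```python
def parse_pg_info(text):
    """Split 'protection-group info' text into group and per-peer fields."""
    group, peers, current = {}, [], None
    for raw in text.splitlines():
        line = raw.strip().strip("*").strip()
        # Skip blanks, separators, and application table rows.
        if line.startswith(("|", "+")) or ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "Peer Cluster Name":
            current = {"Peer Cluster Name": value}
            peers.append(current)
        elif current is not None:
            current[key] = value  # field belongs to the most recent peer
        elif value:
            group[key] = value  # group-level field such as Name or Role
    return group, peers


sample = """\
Name: prod-pg
Tenant: Administrators
Role: PRIMARY
Peer Information:
=================

**Peer Cluster Name: NewYork**

Peer Cluster State: PAIRED
Peer Cluster ProtectionGroupState: PROTECTED
Replication Policy: No Replication Policy has been attached yet
"""
group, peers = parse_pg_info(sample)
print(group["Role"], [p["Peer Cluster Name"] for p in peers])
```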

Returns details about a specific Protection Group such as its state, role, attached Replication Policy, associated Peer Cluster(s) and their state(s).

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: GET

URL Parameters: None

Data Parameters:

  • action: get-pg-info - This mandatory field within the payload specifies the Protection Group information retrieval operation is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group to fetch the details for.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error)

Example Response:

Output

17.4.5. Assign a tenant to a secondary Protection Group

If the primary Protection Group was created with the auto-assignment mode set to MANUAL, a tenant must be assigned to the associated secondary Protection Group(s) on the Peer Cluster(s) where said Protection Group(s) are hosted to facilitate the replication process.

To assign a tenant to a Protection Group on the secondary cluster with RBAC, there must be one Tenant admin on the secondary cluster with all the permissions for Protection Groups, such as create, update, and delete. However, the DeleteProtectioGroup is not part of the permissions of the AllTenantAdminCapabilities.

You can add the DeleteProtectioGroup in the following two scenarios:

  • When adding a tenant user

  • When updating tenant user capabilities

Use the following command when adding a user:

robin user add <username> <tenant> <tenant_role> [<user_capabilities>]

For more information, see Robin User Management.

Use the following command when updating the tenant user capabilities:

robin tenant update-user-capabilities <tenant_name> <username> <user_capabilities>

For more information, see Update user capabilities.

To assign a tenant to a secondary Protection Group, run the following command:

# robin protection-group assign-tenant <name> <tenant>

name

Protection Group name

tenant

Name of the tenant to assign to specified Protection Group

Note

More information on Role-based Access Control (RBAC) in Robin’s Asynchronous Disaster Recovery feature can be found here.

Assigns a tenant to a secondary Protection Group.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: PUT

URL Parameters: None

Data Parameters:

  • action: pg-assign-tenant - This mandatory field within the payload specifies the assign tenant operation for Protection Groups is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group to assign the tenant to.

  • tenant_name: <tenant_name> - This mandatory field within the payload specifies the name of the tenant to assign to the given Protection Group.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error), 409 (Duplicate Resource Error)

Example Response:

Output

17.4.6. Assign Application on Secondary Cluster

If the primary Protection Group was created with the auto-assignment mode set to MANUAL, any application(s) attached to the primary Protection Group must be assigned to a user within a tenant associated with the respective secondary Protection Group(s) on the Peer Cluster(s), where said Protection Group(s) are hosted, to facilitate the replication process. In order to meet this requirement, run the following command:

# robin protection-group assign-app <name> <app> <user> <namespace>

name

Protection Group name

app

Name of the application to assign

user

Name of user to assign specified application to

namespace

Namespace the specified application will be deployed in

Note

Before an application can be assigned, a tenant must be assigned to the secondary Protection Group(s) on the Peer Cluster(s) using the robin protection-group assign-tenant command, details of which can be found here.

Assigns an application to a user within the tenant associated with a secondary Protection Group.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: PUT

URL Parameters: None

Data Parameters:

  • action: pg-assign-app - This mandatory field within the payload specifies the assign application operation for Protection Groups is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group to assign the application to.

  • app_name: <app_name> - This mandatory field within the payload specifies the name of the application to assign to the given user within the aforementioned Protection Group.

  • user_name: <user_name> - This mandatory field within the payload specifies the name of the user, who is part of the tenant linked to aforementioned Protection Group, which the given application will be assigned to.

  • namespace: <namespace> - This mandatory field within the payload specifies the name of the namespace in which the application will be created.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error), 409 (Duplicate Resource Error)

Example Response:

Output

17.4.7. Add an application to a Protection Group

You must add an application to a Protection Group for it to be replicated to a Peer Cluster via snapshots. Each application can only be attached to one Protection Group. When an application is selected for replication, all application-dependent constructs, such as volumes and ConfigMaps, are replicated too.

Important

An application will only be replicated to a Peer if the Replication Policy associated with the Protection Group it is attached to has replication snapshots enabled.

All types of applications, including Robin Bundle applications, Helm applications, and FlexApps (with PVCs), can be associated with a Protection Group. However, this only holds true for parent applications, as cloned applications cannot be linked to a Protection Group. In addition, replication for applications with ephemeral volumes (AEVs) is not supported.

Note

If you are adding a Robin Bundle app to the Protection Group, you must add the same Robin Bundle to both peers.

When adding an app to the Protection Group, the auto_assign_mode and the association of users with the namespace play an important role. You must consider the following important points:

  • Ensure that the same users exist on both primary and the secondary clusters.

  • Namespace and user mapping must be the same on both primary and secondary clusters.

Note

The process to add the app to the Protection Group fails if the above-mentioned points are not met. In this scenario, you must change the auto_assign_mode to manual on the secondary cluster and assign the app a tenant user.

After the app addition succeeds with the auto_assign_mode set to manual on the secondary, the Protection Group shows the following states: PENDING_USER_ASSIGNMENT_AT_PEER on the primary and PENDING_USER_ASSIGNMENT on the secondary.
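The two prerequisites above (same users, same namespace-to-user mapping) can be checked before adding an app. The following is a minimal sketch; it assumes you have already collected a namespace-to-user mapping from each cluster by some other means, and both the input format and the function name auto_assign_blockers are hypothetical rather than part of the Robin CLI or API.

```python
def auto_assign_blockers(primary_ns_users, secondary_ns_users):
    """Return reasons why the two clusters' user mappings do not match."""
    problems = []
    primary_users = set(primary_ns_users.values())
    secondary_users = set(secondary_ns_users.values())
    # Rule 1: the same users must exist on both clusters.
    for user in sorted(primary_users - secondary_users):
        problems.append(f"user '{user}' is missing on the secondary cluster")
    # Rule 2: each namespace must map to the same user on both clusters.
    for namespace, user in sorted(primary_ns_users.items()):
        other = secondary_ns_users.get(namespace)
        if other != user:
            problems.append(
                f"namespace '{namespace}' maps to '{user}' on the primary "
                f"but to '{other}' on the secondary"
            )
    return problems


# Hypothetical mappings; 'alice' is an invented user name.
primary = {"t001-u000003": "robin"}
secondary = {"t001-u000003": "alice"}
for problem in auto_assign_blockers(primary, secondary):
    print(problem)
```

An empty result suggests the mappings match; any entry indicates that the manual assignment path described above may be needed.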

To add an application to the Protection Group, run the following command:

# robin protection-group add-app <name> <app_name>
                                        --peer <peer_name>
                                        --namespace <namespace>

name

Name of Protection Group

app_name

Name of application

--peer <peer_name>

Specific peer to replicate application to

--namespace <namespace>

Namespace in which application is registered. This option only needs to be used when there are multiple applications with the same name

Note

When an application is attached to a Protection Group, all regular operations can continue to be run on it except for horizontal scaling and volume addition.

Example

# robin protection-group add-app prod-pg mysqldb --wait
Job: 10245 Name: ProtectionGroupAddApps State: PROCESSED Error: 0
Job: 10245 Name: ProtectionGroupAddApps State: WAITING Error: 0
Job: 10245 Name: ProtectionGroupAddApps State: COMPLETED Error: 0

# robin pg info prod-pg

Name: prod-pg
Tenant: Administrators
Role: PRIMARY
Peer Information:
=================

**Peer Cluster Name: NewYork**

Peer Cluster State: PAIRED
Peer Cluster ProtectionGroupState: PROTECTED
Replication Policy: minute_repl

+----------+-------------------------+-------------------------------------------+--------------------------+
| App Name | State                   | Latest Synced Snapshot                    | Latest Sycned Snap ctime |
+----------+-------------------------+-------------------------------------------+--------------------------+
| mysqldb  | PRIMARY_SYNC_SUCCESSFUL | mysqldb_minute_repl-1639611848-1639688268 | 2021-12-16 12:57:52      |
+----------+-------------------------+-------------------------------------------+--------------------------+

Attaches an application to a Protection Group such that it can be replicated to Peer Clusters via snapshots.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: PUT

URL Parameters: None

Data Parameters:

  • action: pg-add-app - This mandatory field within the payload specifies the add application operation for Protection Groups is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group to add an application to.

  • app_name: <app_name> - This mandatory field within the payload specifies the name of the application to add to the aforementioned Protection Group.

  • peer_name: <peer_name> - Specifying the name of a Peer with this parameter results in the application only replicating to the specified Peer.

  • namespace: <namespace> - Specifying the name of a namespace with this parameter results in the application within the given namespace being replicated. This option only needs to be used when there are multiple applications with the same name.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error), 409 (Duplicate Resource Error)

Example Response:

Output

17.4.8. Remove an application from a Protection Group

In order to disable the replication of an application entirely, an application can be removed from its associated Protection Group. This operation should be only performed on the cluster linked to the primary Protection Group. Upon completion, the application will also be removed from the secondary Protection Group(s) and their respective clusters. However, the replication snapshots for the application will remain on all clusters linked to the Protection Group(s) regardless of their role.

Note

All replication of the application must be stopped before it can be removed from the Protection Group. To do so, either pause the replication via the robin protection-group pause-replication command, details of which can be found here, or ensure that there is no Replication Policy associated with the Protection Group.

To remove the application from the Protection Group, run the following command:

# robin protection-group remove-app <name> <app_name>
                                           --peer <peer>
                                           --namespace <namespace>

name

Name of the Protection Group

app_name

Name of the application

--peer <peer>

Only stop application replicating to the specified peer

--namespace <namespace>

Namespace in which application is registered. This option only needs to be used when there are multiple applications with the same name

Example

# robin protection-group remove-app prod-pg mysqldb --peer NewYork --wait
Job: 12693 Name: ProtectionGroupRemoveApps State: VALIDATED Error: 0
Job: 12693 Name: ProtectionGroupRemoveApps State: WAITING Error: 0
Job: 12693 Name: ProtectionGroupRemoveApps State: COMPLETED Error: 0

Detaches an application from a Protection Group such that it will no longer be replicated to Peer Clusters.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: PUT

URL Parameters: None

Data Parameters:

  • action: pg-remove-app - This mandatory field within the payload specifies the remove application operation for Protection Groups is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group to remove an application from.

  • app_name: <app_name> - This mandatory field within the payload specifies the name of the application to remove from the aforementioned Protection Group.

  • peer_name: <peer_name> - Specifying the name of a Peer with this parameter results in replication of the application only being stopped for the given Peer.

  • namespace: <namespace> - Specifying the name of a namespace with this parameter results in the application within the given namespace being removed from the Protection Group. This option only needs to be used when there are multiple applications with the same name.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)

Example Response:

Output

17.4.9. Change role

The role of a Protection Group can be changed in order to influence the direction of replication for a set of applications as well as to aid the failover process. By default, a role change aborts all current replication tasks and therefore requires all hosts and volumes within the cluster to be in a healthy state. To change the role of a Protection Group, run the following command:

Note

This command only alters the relevant Protection Group on the cluster it is run on.

# robin protection-group change-role <name> <role>
                                            --force
                                            --no-abort

name

Name of the Protection Group to change the role for

role

Role of the Protection Group. Options include: ‘Primary’ or ‘Secondary’

--force

Change the role of a Protection Group without checking Peers. Note that this skips the check for whether another Protection Group with the Primary role already exists on a Peer

--no-abort

Do not abort ongoing replication tasks

Note

If the --no-abort option is utilized, the operation will wait for all replication tasks to complete before continuing which could result in a severe time lag.

Important

When the role of a Protection Group is changed from primary to secondary, a new snapshot is taken for each application on the cluster where the role change occurred. These snapshots retain any data that was not replicated to the Peer Cluster before the role change, so that no data is lost. They follow the naming convention <appname>_snap_before_role_change_<timestamp> and must be cleaned up manually if and when needed.
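Because these snapshots must be cleaned up manually, it can help to pick them out of a list of snapshot names by their naming convention. The following is a minimal sketch; the timestamp format is not specified in this section, so it only matches the fixed middle part of the name. The function name role_change_snapshots and the first sample name are hypothetical.

```python
MARKER = "_snap_before_role_change_"


def role_change_snapshots(snapshot_names, app_name=None):
    """Return names of the form <appname>_snap_before_role_change_<timestamp>."""
    matches = []
    for name in snapshot_names:
        app, marker, timestamp = name.partition(MARKER)
        if not marker or not app or not timestamp:
            continue  # does not follow the naming convention
        if app_name is None or app == app_name:
            matches.append(name)
    return matches


names = [
    "mysqldb_snap_before_role_change_1639692409",  # hypothetical example name
    "mysqldb_minute_repl-1639611848-1639692409",   # scheduled replication snapshot
]
print(role_change_snapshots(names, app_name="mysqldb"))
```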

Example

# robin protection-group change-role prod-pg Primary
Job: 12695 Name: ProtectionGroupRoleChange State: VALIDATED Error: 0
Job: 12695 Name: ProtectionGroupRoleChange State: WAITING Error: 0
Job: 12695 Name: ProtectionGroupRoleChange State: COMPLETED Error: 0

# robin protection-group info prod-pg

Name: prod-pg
Tenant: Administrators
Role: PRIMARY
Peer Information:
=================

**Peer Cluster Name: Sanfrancisco**

Peer Cluster State: PAIRED
Peer Cluster ProtectionGroupState: PROTECTED
Replication Policy: minute_repl

+----------+---------------------------+-------------------------------------------+--------------------------+
| App Name | State                     | Latest Synced Snapshot                    | Latest Sycned Snap ctime |
+----------+---------------------------+-------------------------------------------+--------------------------+
| mysqldb  | SECONDARY_SYNC_SUCCESSFUL | mysqldb_minute_repl-1639611848-1639692409 | 2021-12-16 14:06:54      |
+----------+---------------------------+-------------------------------------------+--------------------------+

Changes the role of a Protection Group in order to influence the direction of replication for a set of applications as well as to aid the failover process.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: PUT

URL Parameters: None

Data Parameters:

  • action: change-role - This mandatory field within the payload specifies the change role operation is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group for which the role needs to be changed.

  • role: <role> - This mandatory field within the payload specifies the name of the new role to assign to the Protection Group. Valid choices for this field are ‘Primary’ and ‘Secondary’.

  • force: [true|false] - Utilizing this parameter within the payload, by specifying a boolean value, determines whether or not Peers associated with the Protection Group are checked before the role is changed. If set to true the associated Peers will not be checked.

  • abort: [true|false] - Utilizing this parameter within the payload, by specifying a boolean value, determines whether or not ongoing replication tasks are aborted before the role change.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)

Example Response:

Output
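The request payload for this operation can be sketched as follows, mirroring the CLI options described earlier in this section. Treating the CLI option --no-abort as equivalent to abort set to false is an assumption based on the parameter descriptions, as is the JSON encoding; the function name change_role_payload is hypothetical.

```python
import json

VALID_ROLES = ("Primary", "Secondary")


def change_role_payload(pg_name, role, force=False, no_abort=False):
    """Build the data parameters for the change-role operation."""
    if role not in VALID_ROLES:
        raise ValueError(f"role must be one of {VALID_ROLES}, got {role!r}")
    return {
        "action": "change-role",
        "name": pg_name,
        "role": role,
        "force": force,
        # Assumption: --no-abort on the CLI corresponds to abort=false here.
        "abort": not no_abort,
    }


print(json.dumps(change_role_payload("prod-pg", "Primary", no_abort=True)))
```

Validating the role locally avoids a round trip that would presumably end in a 400 (Invalid API Usage Error).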

17.4.10. Attach a Replication Policy to a Protection Group

In order to have applications within a Protection Group replicated on a schedule, a Replication Policy needs to be associated with that Protection Group. The configuration of the Replication Policy determines the frequency at which data is transferred from the primary cluster to the secondary Peer Cluster. Details on how to create a Replication Policy can be found here.

To attach the Replication Policy to a Protection Group, run the following command:

# robin protection-group attach-repl-policy <name> <policy_name>
                                                   --peer <peer>

name

Name of the Protection Group

policy_name

Name of the Replication Policy

--peer <peer>

Name of specific peer to replicate applications to based on specified Replication Policy configuration. If not specified, the Replication Policy will apply to all peers within the Protection Group

Example

# robin protection-group attach-repl-policy prod-pg minute_repl
Job: 9998 Name: ProtectionGroupAttachReplicationPolicy State: VALIDATED Error: 0
Job: 9998 Name: ProtectionGroupAttachReplicationPolicy State: WAITING Error: 0
Job: 9998 Name: ProtectionGroupAttachReplicationPolicy State: COMPLETED Error: 0

# robin protection-group info prod-pg

Name: prod-pg
Tenant: Administrators
Role: PRIMARY
Peer Information:
=================

**Peer Cluster Name: NewYork**

Peer Cluster State: PAIRED
Peer Cluster ProtectionGroupState: PROTECTED
Replication Policy: minute_repl

Attaches a Replication Policy to a Protection Group such that applications within a Protection Group can be replicated in a scheduled manner.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: PUT

URL Parameters: None

Data Parameters:

  • action: attach-replication-policy - This mandatory field within the payload specifies the attach Replication Policy operation is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group to which the given Replication Policy needs to be attached.

  • policy_name: <policy_name> - This mandatory field within the payload specifies the name of the Replication Policy that should be attached to the aforementioned Protection Group.

  • peer_name: <peer> - Specifying the name of a Peer as a string with this parameter results in applications only being replicated to the given Peer. If not specified, the Replication Policy will apply to all peers within the Protection Group.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error), 409 (Duplicate Resource Error)

Example Response:

Output

17.4.11. Detach a Replication Policy from a Protection Group

In order to update an attached Replication Policy or add a new one, the current Replication Policy needs to be detached. When you detach a Replication Policy from a Protection Group, ongoing replications might fail and the replication schedule for the Protection Group stops permanently.

To detach a Replication Policy from a Protection Group, run the following command:

# robin protection-group detach-repl-policy <name> <policy_name>
                                                   --peer <peer>

name

Name of the Protection Group

policy_name

Name of the Replication Policy

--peer <peer>

Name of the Peer Cluster to detach the Replication Policy for. If not specified, the policy is disassociated from all peers within the Protection Group

Example

# robin protection-group detach-repl-policy prod-pg minute_repl --peer NewYork --wait
Job: 12695 Name: ProtectionGroupUpdate State: VALIDATED Error: 0
Job: 12695 Name: ProtectionGroupUpdate State: WAITING Error: 0
Job: 12695 Name: ProtectionGroupUpdate State: COMPLETED Error: 0

# robin pg info prod-pg

Name: prod-pg
Tenant: Administrators
Role: PRIMARY
Peer Information:
=================

**Peer Cluster Name: NewYork**

Peer Cluster State: PAIRED
Peer Cluster ProtectionGroupState: PROTECTED
Replication Policy: No Replication Policy has been attached yet
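Since the current policy must be detached before a different one can be attached, replacing a policy is a two-step sequence. The following minimal sketch only assembles the two command lines in the order they would be run; it does not execute them. The policy name hourly_repl and the function name swap_policy_commands are hypothetical, and --wait is added only to the detach step, where the example above shows it.

```python
def swap_policy_commands(pg_name, old_policy, new_policy, peer=None):
    """Return the detach and attach command lines in execution order."""
    peer_args = ["--peer", peer] if peer else []
    detach = [
        "robin", "protection-group", "detach-repl-policy",
        pg_name, old_policy, *peer_args, "--wait",
    ]
    attach = [
        "robin", "protection-group", "attach-repl-policy",
        pg_name, new_policy, *peer_args,
    ]
    # The detach job must complete before the attach command is issued.
    return [detach, attach]


for cmd in swap_policy_commands("prod-pg", "minute_repl", "hourly_repl", peer="NewYork"):
    print(" ".join(cmd))
```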

Detaches a Replication Policy from a Protection Group such that the replication schedule for the Protection Group stops permanently and allows for the Replication Policy to be updated.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: PUT

URL Parameters: None

Data Parameters:

  • action: detach-replication-policy - This mandatory field within the payload specifies the detach Replication Policy operation is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group from which the given Replication Policy needs to be detached.

  • policy_name: <policy_name> - This mandatory field within the payload specifies the name of the Replication Policy that should be detached from the aforementioned Protection Group.

  • peer_name: <peer> - Specifying the name of a Peer as a string with this parameter results in the Replication Policy only being detached for the given Peer. If not specified, the policy is disassociated from all peers within the Protection Group.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error), 409 (Duplicate Resource Error)

Example Response:

Output

17.4.12. Start replication on demand

You can manually replicate applications in a Protection Group to a Peer site when required. In addition, Replication Policies that were previously created can be specified in order to create custom snapshots. To replicate applications within a Protection Group on demand, run the following command:

# robin protection-group run-rpol <name>
                                  --rpol <rpol>
                                  --peer <peer>
                                  --app <app_name>
                                  --namespace <namespace>

name

Name of the Protection Group

--rpol <rpol>

Name of the Replication Policy to use. Note any Replication Policy specified must be attached to the Protection Group beforehand

--peer <peer>

Name of Peer Cluster

--app <app_name>

Name of the specific application that should be replicated

--namespace <namespace>

Namespace of the application to sync

Note

This command should be run on the cluster linked to the primary Protection Group.

Example

# robin protection-group run-rpol test_pg --peer hyperv1 --wait
Job: 1463 Name: ProtectionGroupRunPolicy State: AGENT_WAIT Error: 0
Job: 1463 Name: ProtectionGroupRunPolicy State: FINALIZED Error: 0
Job: 1463 Name: ProtectionGroupRunPolicy State: COMPLETED Error: 0

Allows users to manually replicate application(s) in a Protection Group to a Peer site on demand.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: PUT

URL Parameters: None

Data Parameters:

  • action: force-rpol-run - This mandatory field within the payload specifies the run Replication Policy operation is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group that the replication should run for.

  • policy_name: <policy_name> - Utilizing this parameter within the payload, by specifying a string representing the name of a Replication Policy, results in the given Replication Policy being used to create and filter custom snapshots to be replicated. If not specified, the Replication Policy attached to the Protection Group is used.

  • peer_name: <peer> - Utilizing this parameter within the payload, by specifying a string representing the name of a Peer, results in application(s) only being replicated to the given Peer. If not specified, the application is replicated to all Peers within the Protection Group.

  • app_name: <app> - Specifying the name of an application as a string with this parameter results in only the given application being replicated. If not specified, all applications associated with the aforementioned Protection Group will be replicated.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)

Example Response:

Output
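As a rough illustration of the Data Parameters listed above, the following Python sketch assembles a payload for the force-rpol-run action. The helper name build_run_rpol_payload is hypothetical, and encoding the payload as JSON is an assumption; the reference above only names the fields.

```python
import json


def build_run_rpol_payload(pg_name, policy_name=None, peer_name=None, app_name=None):
    """Assemble the payload for an on-demand replication run (hypothetical helper)."""
    # Mandatory fields, as listed under Data Parameters.
    payload = {"action": "force-rpol-run", "name": pg_name}
    # Optional fields are left out entirely when not supplied, so the
    # documented defaults (attached policy, all Peers, all applications) apply.
    optional = {"policy_name": policy_name, "peer_name": peer_name, "app_name": app_name}
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


if __name__ == "__main__":
    # Mirrors the CLI example: replicate test_pg to the peer hyperv1 only.
    print(json.dumps(build_run_rpol_payload("test_pg", peer_name="hyperv1"), indent=2))
```

Omitting the optional keys, rather than sending empty values, keeps the request aligned with the "If not specified" behaviour described for each parameter.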

17.4.13. Remove a Peer from a Protection Group

In order to remove a peer cluster from a Protection Group, run the following command:

# robin protection-group remove-peer <name> <peer>
                                            --force

name

Name of the Protection Group

peer

Name of the Peer cluster

--force

Force delete the Peer even if the cluster cannot be informed or there is no connection to it

Note

When a peer is removed from the primary Protection Group by running the above command on the associated cluster, the secondary protection group on the specified peer is also removed.

Example

# robin protection-group remove-peer prod-pg NewYork --wait
Job: 12693 Name: ProtectionGroupUpdate State: VALIDATED Error: 0
Job: 12693 Name: ProtectionGroupUpdate State: WAITING Error: 0
Job: 12693 Name: ProtectionGroupUpdate State: COMPLETED Error: 0

# robin protection-group list
+----+---------+---------+----------------+-------+
| ID | Name    | Role    |    Tenant      | Peers |
+----+---------+---------+----------------+-------+
| 3  | prod-pg | PRIMARY | Administrators | []    |
+----+---------+---------+----------------+-------+

Removes a Peer Cluster from a Protection Group. If the Peer is removed from a primary Protection Group, the secondary Protection Group on the specified Peer is also removed.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: POST

URL Parameters: None

Data Parameters:

  • action: pg-remove-peer - This mandatory field within the payload specifies that the remove Peer operation for Protection Groups is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group from which to remove the given Peer Cluster.

  • peer_name: <peer_name> - This mandatory field within the payload specifies the name of the Peer to remove from the specified Protection Group.

  • force: [true|false] - This optional boolean parameter within the payload determines whether a Peer should be removed even though it cannot be reached. If set to true, the given Peer will be removed regardless of whether a connection can be made.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)

Example Response:

Output
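To show how the end point, port, header, and Data Parameters above fit together, here is a minimal Python sketch that builds, but does not send, the remove-peer request. The host name, the use of HTTPS, the JSON body encoding, and the Content-Type header are all assumptions made for illustration; build_remove_peer_request is a hypothetical helper.

```python
import json
import urllib.request

RCM_PORT = 29442  # default RCM port stated in the reference above


def build_remove_peer_request(host, auth_token, pg_name, peer_name, force=False):
    """Build (but do not send) the remove-peer request. Hypothetical helper."""
    # Assumption: the server is reached over HTTPS at <host>:<RCM port>.
    url = (
        f"https://{host}:{RCM_PORT}"
        "/api/v3/robin_server/protection_groups?version=1&max_version=1"
    )
    body = {
        "action": "pg-remove-peer",
        "name": pg_name,
        "peer_name": peer_name,
        "force": force,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),  # assumption: JSON-encoded body
        headers={"Authorization": auth_token, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_remove_peer_request("robin.example.com", "<auth_token>", "prod-pg", "NewYork")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the resulting object to urllib.request.urlopen would issue the call; a 202 response would then indicate that the request was accepted, per the success code listed above.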

17.4.14. Pause replication

Robin allows for the replication of one or more applications to be paused at any given time. This is particularly useful when an application needs to be removed from a Protection Group or if there are any infrastructure (network or storage) related issues.

Note

All ongoing replication jobs will still complete after the command is issued; however, no new replication jobs will be spawned.

To pause the replication of one or more applications, run the following command:

# robin protection-group pause-replication <name>
                                           --app <app>
                                           --peer <peer>
                                           --namespace <namespace>

name

Name of the Protection Group

--app <app>

Name of application for which replication should be paused. If not specified, replication for all applications in the Protection Group will be paused

--peer <peer>

Name of Peer Cluster to which replication should be paused. If not specified, replication to all peers within the Protection Group will be paused

--namespace <namespace>

Namespace of the application to be paused

Note

The replication status for an application can be found using the robin protection-group info command.

Pauses replication for one or more applications within a Protection Group, allowing them to be removed or any infrastructure-related issues hindering the replication process to be addressed.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: POST

URL Parameters: None

Data Parameters:

  • action: pg-pause-replication - This mandatory field within the payload specifies that the pause replication operation is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group for which replication should be paused.

  • apps: <list_of_applications> - Utilizing this parameter within the payload, by specifying a comma separated list of application names, results in only replication for the given application(s) being paused. If not specified, replication for all applications within the Protection Group will be paused.

  • peer_name: <peer_name> - Utilizing this parameter within the payload, by specifying a string representing the name of a Peer, results in replication to the given Peer being paused. If not specified, replication to all Peers within the Protection Group will be paused.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)

Example Response:

Output

17.4.15. Resume replication

To resume the replication of one or more applications that was previously paused, run the following command:

# robin protection-group resume-replication <name>
                                            --app <app>
                                            --peer <peer>
                                            --namespace <namespace>

name

Name of the Protection Group

--app <app>

Name of application for which replication should be resumed. If not specified, replication for all applications in the Protection Group will be resumed

--peer <peer>

Name of Peer Cluster to which replication should be resumed. If not specified, replication to all peers within the Protection Group will be resumed

--namespace <namespace>

Namespace of the application to be resumed

Resumes replication that was previously suspended for one or more applications within a Protection Group.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: POST

URL Parameters: None

Data Parameters:

  • action: pg-resume-replication - This mandatory field within the payload specifies that the resume replication operation is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group for which replication should be resumed.

  • apps: <list_of_applications> - Utilizing this parameter within the payload, by specifying a comma separated list of application names, results in only replication for the given application(s) being resumed. If not specified, replication for all applications within the Protection Group will be resumed.

  • peer_name: <peer_name> - Utilizing this parameter within the payload, by specifying a string representing the name of a Peer, results in replication to the given Peer being resumed. If not specified, replication to all Peers within the Protection Group will be resumed.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)

Example Response:

Output
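The pause and resume operations share the same payload shape and differ only in the action value. The Python sketch below illustrates this under the assumption that the apps field is sent as a single comma separated string, as its description suggests; the helper name and the operation keywords "pause" and "resume" are hypothetical.

```python
ACTIONS = {"pause": "pg-pause-replication", "resume": "pg-resume-replication"}


def build_replication_toggle(operation, pg_name, apps=None, peer_name=None):
    """Return the payload for pausing or resuming replication (hypothetical helper)."""
    if operation not in ACTIONS:
        raise ValueError(f"operation must be one of {sorted(ACTIONS)}")
    payload = {"action": ACTIONS[operation], "name": pg_name}
    if apps:
        # The reference describes 'apps' as a comma separated list of names.
        payload["apps"] = ",".join(apps)
    if peer_name:
        # Without this key, the operation applies to all Peers in the group.
        payload["peer_name"] = peer_name
    return payload


if __name__ == "__main__":
    print(build_replication_toggle("pause", "prod-pg", apps=["pgsql", "mysql"]))
    print(build_replication_toggle("resume", "prod-pg", peer_name="NewYork"))
```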

17.4.16. Delete a Protection Group

Deleting a primary Protection Group results in the replication relationship being broken. Before it can be deleted, all Peers, applications, and Replication Policies must be disassociated from the Protection Group. In order to properly remove the primary Protection Group, the delete operation must be run on the cluster associated with it.

Note

Secondary Protection Group(s) are automatically deleted on their associated cluster(s) when the Peer is removed from the Protection Group.

To delete a Protection Group, run the following command:

# robin protection-group delete <name>

name

Name of the Protection Group

Example

# robin protection-group delete prod-pg
Job: 12699 Name: ProtectionGroupDelete State: VALIDATED Error: 0
Job: 12699 Name: ProtectionGroupDelete State: WAITING Error: 0
Job: 12699 Name: ProtectionGroupDelete State: COMPLETED Error: 0

# robin protection-group list

+----+------+------+--------+-------+
| ID | Name | Role | Tenant | Peers |
+----+------+------+--------+-------+
+----+------+------+--------+-------+

Deletes a primary Protection Group such that the replication relationship is broken. Before it can be deleted, however, all Peers, applications, and Replication Policies must be disassociated from the Protection Group.

End Point: /api/v3/robin_server/protection_groups?version=1&max_version=1

Method: DELETE

URL Parameters: None

Data Parameters:

  • action: pg-delete - This mandatory field within the payload specifies that the delete operation for Protection Groups is to be performed.

  • name: <pg_name> - This mandatory field within the payload specifies the name of the Protection Group to be deleted.

Port: RCM Port (default value is 29442)

Headers:

  • Authorization: <auth_token> : Authorization token to identify which user is sending the request. The token can be acquired from the login API.

Success Response Code: 202

Error Response Code: 500 (Internal Server Error), 404 (Not Found Error), 401 (Unauthorized Error), 400 (Invalid API Usage Error)

Example Response:

Output

17.5. Managing Replication Policies

Topics covered in this section:

create

Create a Replication Policy

list

List Replication Policies

info

Get information about a Replication Policy

update-schedule

Update replication schedule for a Replication Policy

add-label

Associate snapshot label(s) to a Replication Policy

update-label

Update retention count for snapshot label(s) associated to a Replication Policy

remove-label

Disassociate snapshot label(s) from a Replication Policy

enable-repl-snaps

Enable application replication snapshots for a Replication Policy

update-repl-snaps

Update retention count for application replication snapshots for a Replication Policy

disable-repl-snaps

Disable application replication snapshots for a Replication Policy

delete

Delete a Replication Policy

17.5.1. Create a Replication Policy

A Replication Policy defines the frequency of data transfer between Peer Clusters. The policy must be attached to a Protection Group for it to become part of the replication relationship, and the frequency it defines then applies to all the applications in that Protection Group. Application metadata and data are replicated primarily through the snapshot mechanism: snapshots are transferred from the primary Protection Group to the secondary. As a result, a Replication Policy can utilize snapshots created via a snapshot schedule, details for which can be found here apps.html#manage-application-snapshot-schedules, or replication snapshots created by the policy itself. To use the former, specify the labels associated with the snapshot schedule when configuring the policy so that any snapshot with those labels is replicated; the latter simply requires the --create-repl-snapshots option.

Note

A Replication Policy can also replicate a combination of snapshots created manually or by a schedule and replication snapshots if both the --labels and --create-repl-snapshots parameters are utilized. This is because replication snapshots created by the policy are automatically transferred, whilst the additional label filter enables externally created snapshots to be transferred too.

Important

Regardless of the type of snapshot chosen for replication, only snapshots created from the application(s) attached to the same Protection Group that the Replication Policy will eventually be associated with will be transferred. That is, for externally created snapshots, only those that match the specified label filter and are based on attached applications will be replicated. If Replication Snapshots are enabled, on the other hand, the attached applications will be snapshotted at the cadence of the policy and the resulting snapshots will be replicated. The names of these snapshots will be in the following format: <appname>_<namespace>_<repl-policy-name>_<zoneid>_<creation-time>.

You must define the frequency of a Replication Policy, either in a JSON file or as a CRON expression, when creating the policy. The same format used for setting up application snapshot schedules, detailed here apps.html#manage-application-snapshot-schedules, applies in this case too.

JSON file example:

{
   "frequency": "minute",
   "minute": 1
}
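A schedule file of this shape can also be generated programmatically. The Python sketch below writes a minute-based schedule and computes the interval it implies. The policy info output elsewhere in this section reports Policy Interval: 60 for a one-minute schedule and 300 for a five-minute one, which suggests the interval is expressed in seconds; that mapping, both helper names, and the restriction to the "minute" frequency are assumptions of this sketch.

```python
import json
import tempfile
from pathlib import Path


def write_minute_schedule(path, every_minutes):
    """Write a schedule file in the shape of the JSON example above."""
    if every_minutes < 1:
        raise ValueError("every_minutes must be at least 1")
    schedule = {"frequency": "minute", "minute": every_minutes}
    Path(path).write_text(json.dumps(schedule, indent=3) + "\n")
    return schedule


def interval_seconds(schedule):
    """Interval implied by a minute-based schedule (assumed: minutes * 60)."""
    if schedule.get("frequency") != "minute":
        raise ValueError("only the 'minute' frequency is covered by this sketch")
    return schedule["minute"] * 60


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sched = write_minute_schedule(Path(tmp) / "5min_repl.json", 5)
        print(interval_seconds(sched))
```

The generated file can then be passed to the schedule option of the create command described below.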

To create a Replication Policy, run the following command:

# robin replication-policy create <name>
                                  --sched-json <sched_json>
                                  --sched-cron <sched_cron>
                                  --labels <labels>
                                  --create-repl-snapshots
                                  --retain-repl-snapshots

name

Replication Policy name

--sched-json <sched_json>

JSON file containing schedule information

--sched-cron <sched_cron>

CRON string specifying schedule interval

--labels <labels>

Snapshot labels to filter which snapshots need to be transferred. Should be provided as a comma separated list in the following format <key>:<value>:<retain> where the retain value refers to the number of snapshots to be maintained on peer clusters

--create-repl-snapshots

Enable replication snapshots

--retain-repl-snapshots

Number of replication snapshots to maintain on peer clusters
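The <key>:<value>:<retain> format accepted by --labels can be checked before it is handed to the CLI. The following Python sketch parses a comma separated label specification into structured entries. LabelFilter and parse_labels are hypothetical names, and the validation rules (exactly three non-empty parts, a non-negative whole number for retain) are assumptions rather than documented CLI behaviour.

```python
from typing import NamedTuple


class LabelFilter(NamedTuple):
    key: str
    value: str
    retain: int


def parse_labels(spec):
    """Parse '<key>:<value>:<retain>[,...]' into LabelFilter tuples."""
    filters = []
    for item in spec.split(","):
        parts = item.strip().split(":")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected <key>:<value>:<retain>, got {item!r}")
        key, value, retain = parts
        if not retain.isdecimal():
            raise ValueError(f"retain must be a non-negative integer, got {retain!r}")
        filters.append(LabelFilter(key, value, int(retain)))
    return filters


if __name__ == "__main__":
    for label in parse_labels("REPDEMO:Yes:5,SNAP:YES:10"):
        print(label)
```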

Example 1 (Creating a standard Replication Policy):

# robin replication-policy create --sched-json ~/minute.json first-replication-policy
Submitted job '9996'. Use 'robin job wait 9996' to track the progress

# robin replication-policy info first-replication-policy

Name: first-replication-policy
Policy Owner: admin
Policy Interval: 60
Create Snapshots: False
Snapshot Labels

+-----+-------+--------+
| Key | Value | Retain |
+-----+-------+--------+
+-----+-------+--------+

Example 2 (Creating a Replication Policy with custom labels):

# robin replication-policy create Rep-test --sched-json 5min_repl.json --labels REPDEMO:Yes:5
Submitted job '9998'. Use 'robin job wait 9998' to track the progress

# robin replication-policy info Rep-test

Name: Rep-test

Policy Owner: robin
Policy Interval: 300
Create Snapshots: True
Retain Snapshot Count: 9
Snapshot Labels

+---------+-------+--------+
|  Key    | Value | Retain |
+---------+-------+--------+
| REPDEMO | YES   |  5     |
+---------+-------+--------+

17.5.2. List Replication Policies

To view a list of Replication Policies, run the following command:

# robin replication-policy list --json

--json

Output in JSON

Example

# robin replication-policy list
+-------------+-------+----------+
|   Name      | Owner | Interval |
+-------------+-------+----------+
| minute_repl | admin |   60     |
+-------------+-------+----------+
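When the tabular output needs to be consumed by a script, it can be converted into records. The Python sketch below parses the table layout shown above into dictionaries keyed by column header. The --json option is likely the more robust route for scripting, but its output shape is not shown here, so this sketch works from the table instead; parse_cli_table is a hypothetical helper and assumes that cell values contain no "|" characters.

```python
def parse_cli_table(text):
    """Turn an ASCII table (as printed by the list commands) into dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the +----+ border lines.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        # Drop the outer pipes, then split the remaining text into cells.
        rows.append([cell.strip() for cell in line[1:-1].split("|")])
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


SAMPLE = """\
+-------------+-------+----------+
|   Name      | Owner | Interval |
+-------------+-------+----------+
| minute_repl | admin |   60     |
+-------------+-------+----------+
"""

if __name__ == "__main__":
    print(parse_cli_table(SAMPLE))
```

All values are returned as strings; converting Interval to an integer is left to the caller.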

17.5.3. Show information about a specific Replication Policy

To view details of a Replication Policy, run the following command:

# robin replication-policy info <name>

name

Replication Policy name

Example

# robin replication-policy info minute_repl

Name: minute_repl
Policy Owner: admin
Policy Interval: 60
Create Snapshots: True
Retain Snapshot Count: 10
Snapshot Labels

+-----+-------+--------+
| Key | Value | Retain |
+-----+-------+--------+
+-----+-------+--------+

17.5.4. Update replication schedule of a Replication Policy

In order to update a Replication Policy schedule such that the cadence at which an application is replicated is modified, run the following command:

Note

Any changes made to an existing Replication Policy are reflected on the mirror policy set on the associated Peer Clusters unless otherwise specified. This helps to maintain the desired RPO and RTO in case of Protection Group role change.

# robin replication-policy update-schedule <name>
                                           --sched-json <sched_json>
                                           --sched-cron <sched_cron>
                                           --skip-mirror

name

Replication Policy name

--sched-json <sched_json>

JSON file containing updated schedule information

--sched-cron <sched_cron>

Updated CRON string specifying schedule interval

--skip-mirror

Skip updating associated Peer Clusters hosting secondary Protection Groups

Example

# robin replication-policy update-schedule --sched-json minute.json pri_rpol
Job: 12705 Name: ReplicationPolicyUpdateSchedule State: VALIDATED Error: 0
Job: 12705 Name: ReplicationPolicyUpdateSchedule State: WAITING Error: 0
Job: 12705 Name: ReplicationPolicyUpdateSchedule State: COMPLETED Error: 0

# robin replication-policy list
+----------+-------+----------+
| Name     | Owner | Interval |
+----------+-------+----------+
| pri_rpol | admin | 60       |
+----------+-------+----------+

17.5.5. Add labels to a Replication Policy

After creating a Replication Policy, you can associate additional labels with it so that any snapshots carrying those labels are replicated to the attached Peer Clusters.

Note

You must provide the same labels in the pod and PVC of the application for successful replication.

To expand the list of snapshot label filters used by a Replication Policy, run the following command:

# robin replication-policy add-label <name> <labels>
                                            --skip-mirror

name

Replication Policy name

labels

Comma separated list in the following format <key>:<value>:<retain> where the retain value refers to the number of snapshots with this label to be maintained on Peer Clusters, not including the cluster on which the snapshot was created

--skip-mirror

Skip updating associated Peer Clusters hosting secondary Protection Groups

Important

Only snapshots that match the label filter being added and are created from application(s) attached to the Protection Group the relevant Replication Policy is associated to will be transferred.

Example

# robin replication-policy add-label minute_repl SNAP:YES:5

Submitted job '31'. Use 'robin job wait 31' to track the progress

# robin replication-policy info minute_repl

Name: minute_repl
Policy Owner: admin
Policy Interval: 60
Create Snapshots: False
Snapshot Labels

+------+-------+--------+
| Key  | Value | Retain |
+------+-------+--------+
| SNAP | YES   |   5    |
+------+-------+--------+

17.5.6. Update labels associated with a Replication Policy

You can update the number of snapshots that you want to retain on attached Peer Clusters for a given set of labels in a Replication Policy by running the following command:

# robin replication-policy update-label <name> <labels> <retain>
                                                        --skip-mirror

name

Replication Policy name

labels

Comma separated list of snapshot label filters to update, with each label in the format: <key>:<value>

retain

Number of snapshots with this label to be maintained on Peer Clusters, not including the cluster on which the snapshot was created

--skip-mirror

Skip updating associated Peer Clusters hosting secondary Protection Groups

Example

# robin replication-policy update-label 5min_auto SNAP:YES 10

# robin replication-policy info 5min_auto

Name: 5min_auto
Policy Owner: robin
Policy Interval: 300
Create Snapshots: True
Retain Snapshot Count: 5
Snapshot Labels

+------+-------+--------+
| Key  | Value | Retain |
+------+-------+--------+
| SNAP | YES   | 10     |
+------+-------+--------+

17.5.7. Remove labels from a Replication Policy

To remove a snapshot label filter from a Replication Policy, such that it no longer replicates snapshots with the given set of labels, run the following command:

# robin replication-policy remove-label <name> <labels>
                                               --skip-mirror

name

Replication Policy name

labels

Comma separated list of snapshot label filters to disassociate from the Replication Policy, with each label in the format: <key>:<value>

--skip-mirror

Skip updating associated Peer Clusters hosting secondary Protection Groups

Example

# robin replication-policy info 5min_auto
Name: 5min_auto
Policy Owner: robin
Policy Interval: 300
Create Snapshots: True
Retain Snapshot Count: 5
Snapshot Labels

+------+-------+--------+
| Key  | Value | Retain |
+------+-------+--------+
| SNAP | YES   |  10    |
+------+-------+--------+

# robin replication-policy remove-label 5min_auto SNAP:YES --wait
Job: 844 Name: ReplicationPolicyUpdatePeers State: VALIDATED Error: 0
Job: 844 Name: ReplicationPolicyUpdatePeers State: WAITING Error: 0
Job: 844 Name: ReplicationPolicyUpdatePeers State: COMPLETED Error: 0

# robin replication-policy info 5min_auto
Name: 5min_auto
Policy Owner: robin
Policy Interval: 300
Create Snapshots: True
Retain Snapshot Count: 5
Snapshot Labels

+-----+-------+--------+
| Key | Value | Retain |
+-----+-------+--------+
+-----+-------+--------+

17.5.8. Enable Replication Snapshots for a Replication Policy

Replication snapshots are system-generated snapshots of the applications attached to the Protection Group with which the Replication Policy is associated. They utilize the default system label of system_label:Yes and are created in cadence with the Replication Policy schedule. The names of the snapshots will be in the following format: <appname>_<namespace>_<repl-policy-name>_<zoneid>_<creation-time>. In addition, the number of these snapshots to maintain on Peer Clusters is configurable. The feature controlling the creation of these snapshots can be enabled after the Replication Policy is created, or re-enabled if it was previously disabled, by running the following command:

# robin replication-policy enable-repl-snaps <name> <retain_repl_snaps>
                                                    --skip-mirror

name

Replication Policy name

retain_repl_snaps

Number of snapshots to be maintained on all Peer Clusters including the one on which it was created

--skip-mirror

Skip updating associated Peer Clusters hosting secondary Protection Groups

Note

Whether or not the Replication Snapshot feature is enabled can be seen using the robin replication-policy info command.

Example:

# robin replication-policy enable-repl-snaps 5min_auto 6

Submitted job '14742'. Use 'robin job wait 14742' to track the progress

# robin replication-policy info 5min_auto
Name: 5min_auto
Policy Owner: robin
Policy Interval: 300
Create Snapshots: True
Retain Snapshot Count: 6
Snapshot Labels

+-----+-------+--------+
| Key | Value | Retain |
+-----+-------+--------+
+-----+-------+--------+

17.5.9. Update number of Replication Snapshots retained by a Replication Policy

The number of Replication Snapshots to be maintained for a given policy can be updated as needed. When the value is updated, the change will only be reflected during the next scheduled run to create a new Replication Snapshot. To update the number of Replication Snapshots to retain for a particular Replication Policy, run the following command:

# robin replication-policy update-repl-snaps <name> <retain_repl_snaps>
                                                   --skip-mirror

name

Replication Policy name

retain_repl_snaps

Number of snapshots to be maintained on all Peer Clusters including the one on which it was created

--skip-mirror

Skip updating associated Peer Clusters hosting secondary Protection Groups
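The effect of a retain count can be pictured as a split between the snapshots that stay and those that become eligible for removal. The Python sketch below models this with plain (name, creation_time) pairs. It assumes that the oldest snapshots are the ones dropped once the count is exceeded, which the text above does not state; the function name and the sample data are hypothetical.

```python
def split_by_retention(snapshots, retain):
    """Split snapshots into (kept, pruned), keeping the `retain` newest.

    `snapshots` is a list of (name, creation_time) pairs. Pruning the oldest
    first is an assumption of this sketch, not documented behaviour.
    """
    if retain < 0:
        raise ValueError("retain must not be negative")
    newest_first = sorted(snapshots, key=lambda snap: snap[1], reverse=True)
    return newest_first[:retain], newest_first[retain:]


if __name__ == "__main__":
    # Hypothetical snapshots taken five minutes (300 seconds) apart.
    snaps = [(f"snap-{i}", 1648205557 + 300 * i) for i in range(4)]
    kept, pruned = split_by_retention(snaps, retain=2)
    print("kept:  ", [name for name, _ in kept])
    print("pruned:", [name for name, _ in pruned])
```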

17.5.10. Disable Replication Snapshots for a Replication Policy

To disable the creation of system-generated Replication Snapshots for a particular Replication Policy, run the following command:

# robin replication-policy disable-repl-snaps <name>
                                              --skip-mirror

name

Replication Policy name

--skip-mirror

Skip updating associated Peer Clusters

Example:

# robin replication-policy disable-repl-snaps 5min_auto

Submitted job '14740'. Use 'robin job wait 14740' to track the progress

# robin replication-policy info 5min_auto
Name: 5min_auto
Policy Owner: robin
Policy Interval: 300
Create Snapshots: False
Snapshot Labels

+-----+-------+--------+
| Key | Value | Retain |
+-----+-------+--------+
+-----+-------+--------+

17.5.11. List Replication Snapshots

You can view the list of system-generated Replication Snapshots by using the robin snapshot list command, details of which can be found here, with the label filter set to the default system label of system_replicate:yes. An example of this is shown below:

Example

# robin snapshot list --labels system_replicate:yes
+----------------------------------+--------+---------------+----------+--------------------------------------------+
| Snapshot ID                      | State  |      App Name | App Kind |         Snapshot name                      |
+----------------------------------+--------+---------------+----------+--------------------------------------------+
| 1cb934a4ab3a11ec8b84b3d731b0b3bb | ONLINE | test-ns/pgsql | flexapp  | pgsql_test_5min_auto-1648050416-1648102657 |
| b26fa730ac2911ec9282312d4a1cadc1 | ONLINE | test-ns/pgsql | flexapp  | pgsql_test_5min_auto-1648050416-1648205557 |
| 6540b266ac2a11ecb401af83176cced7 | ONLINE | test-ns/pgsql | flexapp  | pgsql_test_5min_auto-1648050416-1648205857 |
| 18c4b84aac2b11ec952e47e3b463ab7d | ONLINE | test-ns/pgsql | flexapp  | pgsql_test_5min_auto-1648050416-1648206158 |
| c9b57c68ac2b11eca959d7db3dfb1472 | ONLINE | test-ns/pgsql | flexapp  | pgsql_test_5min_auto-1648050416-1648206456 |
| 7d5c34c0ac2c11ecb94a031211f1ef31 | ONLINE | test-ns/pgsql | flexapp  | pgsql_test_5min_auto-1648050416-1648206757 |
+----------------------------------+--------+---------------+----------+--------------------------------------------+

Note

The same command can be used with any label filter provided to the Replication Policy to showcase which externally generated snapshots will be replicated.

17.5.12. Delete a Replication Policy

To delete a Replication Policy, run the following command:

# robin replication-policy delete <name>

Note

In order to delete a Replication Policy, it must not be attached to any Protection Group(s).

name

Replication Policy name

Example

# robin replication-policy delete minute_repl --wait
Job: 12710 Name: ReplicationPolicyDelete State: VALIDATED Error: 0
Job: 12710 Name: ReplicationPolicyDelete State: WAITING Error: 0
Job: 12710 Name: ReplicationPolicyDelete State: COMPLETED Error: 0
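Job progress lines such as those above follow a regular pattern that a script can inspect. The Python sketch below extracts the last reported state from captured --wait output. The regular expression is inferred from the sample output in this document, and treating COMPLETED with Error: 0 as success is an assumption; final_job_state is a hypothetical helper.

```python
import re

JOB_LINE = re.compile(
    r"^Job:\s*(?P<id>\d+)\s+Name:\s*(?P<name>\S+)"
    r"\s+State:\s*(?P<state>\S+)\s+Error:\s*(?P<error>\d+)$"
)


def final_job_state(output):
    """Return (job_id, state, error) from the last 'Job:' line, or None."""
    last = None
    for line in output.splitlines():
        match = JOB_LINE.match(line.strip())
        if match:
            last = (int(match["id"]), match["state"], int(match["error"]))
    return last


SAMPLE = """\
Job: 12710 Name: ReplicationPolicyDelete State: VALIDATED Error: 0
Job: 12710 Name: ReplicationPolicyDelete State: WAITING Error: 0
Job: 12710 Name: ReplicationPolicyDelete State: COMPLETED Error: 0
"""

if __name__ == "__main__":
    job_id, state, error = final_job_state(SAMPLE)
    # Assumption: COMPLETED with a zero error code means the job succeeded.
    print(job_id, state, "ok" if state == "COMPLETED" and error == 0 else "check job")
```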

17.6. Asynchronous Disaster Recovery quickstart guide

17.6.1. Prerequisites

The following are the prerequisites for setting up asynchronous Disaster Recovery:

  • Two Robin clusters of the same version (5.4.1+) are needed to create a replication pair.

  • A user with administrator privileges for both clusters.

17.6.2. Overview of Steps

The following steps need to be completed in order to set up a functioning replication procedure to safeguard an application:

  1. Initiate Pairing on Primary Cluster

  2. Complete Pairing on Secondary Cluster

  3. Create a Protection Group

  4. Create a Replication Policy

  5. Enable Replication Policy Snapshots

  6. Attach Replication Policy to a Protection Group

  7. Attach an Application to a Protection Group

  8. Verify Data Transfer

Note

It is highly recommended that the general Disaster Recovery concepts, detailed here, are reviewed before starting the tutorial.

17.6.3. Step 1: Initiate Pairing

First, log in to the cluster from which the pairing is to be initiated and register the second cluster of the peer pair using the following command:

# robin peer-cluster create test_peer

Provide this token to the secondary cluster:
eyJlbmRwb2ludCI6ICIxMC45LjYxLjQ3IiwgInBlZXJfYXV0aF90b2tlbiI6ICJleUowZVhBaU9pSktWMVFpTENKaGJHY2lPaUpJVXpJMU5pSjkuZXlKMWMyVnlYMmxrSWpveUxDSjBaVzVoYm5SZmFXUWlPakVzSW1WNGNDSTZNVFk1T0RRd01UY3pObjAuZkFveUM4S3FfQUdfZ2U5cTZKLXVpdC1mRjJtdUVsMzUtNFYzVklFeS1lWSIsICJwZWVyX3Rva2VuIjogIjNDUWVJNG1oQVRoV0VKb0FENTZVUFI1VSIsICJwZWVyX2NsdXN0ZXJfaWQiOiAiMWNkZmNiY2YtMTliYS00YThjLWEzMDMtNGNiMGFjZmU3N2YxIiwgInBlZXJfem9uZWlkIjogMTY0NjQ0NjI3OX0=

Note

More details regarding the robin peer-cluster create command, such as additional parameters and an explanation of the command's use case, can be found here.

As displayed in the output of the command, the token provided must be set on the second cluster of the cluster pair. Details on how to achieve this are given in the second step of this walkthrough here.

Important

The encoded blob expires after ten minutes. You must use it within this time period on the peer cluster in order to complete the pairing process.

17.6.4. Step 2: Complete Pairing

To complete the peer pairing process, the blob created from the previous step must be registered on the second cluster within the cluster pair by running the following command on the aforementioned cluster:

Important

The encoded blob expires after ten minutes. You must use it within this time period on the peer cluster in order to complete the pairing process.

# robin peer-cluster pair test_peer_primary eyJlbmRwb2ludCI6ICIxMC45LjYxLjQ3IiwgInBlZXJfYXV0aF90b2tlbiI6ICJleUowZVhBaU9pSktWMVFpTENKaGJHY2lPaUpJVXpJMU5pSjkuZXlKMWMyVnlYMmxrSWpveUxDSjBaVzVoYm5SZmFXUWlPakVzSW1WNGNDSTZNVFk1T0RRd01UY3pObjAuZkFveUM4S3FfQUdfZ2U5cTZKLXVpdC1mRjJtdUVsMzUtNFYzVklFeS1lWSIsICJwZWVyX3Rva2VuIjogIjNDUWVJNG1oQVRoV0VKb0FENTZVUFI1VSIsICJwZWVyX2NsdXN0ZXJfaWQiOiAiMWNkZmNiY2YtMTliYS00YThjLWEzMDMtNGNiMGFjZmU3N2YxIiwgInBlZXJfem9uZWlkIjogMTY0NjQ0NjI3OX0= --wait
Job: 52 Name: PeerClusterPair State: VALIDATED Error: 0
Job: 52 Name: PeerClusterPair State: COMPLETED Error: 0

Note

More details regarding the robin peer-cluster pair command, such as additional parameters and an explanation of the command's functionality, can be found here.

17.6.5. Step 3: Create a Protection Group

After setting up the peers such that the two clusters now form a replication pair, a protection group must be created on each site in order to facilitate the replication process. In order to create a protection group, run the following command:

# robin protection-group create prod-pg --peers NewYork
Job: 999 Name: ProtectionGroupCreateMultiPeer State: PROCESSED Error: 0
Job: 999 Name: ProtectionGroupCreateMultiPeer State: WAITING Error: 0
Job: 999 Name: ProtectionGroupCreateMultiPeer State: COMPLETED Error: 0

Important

The protection group creation command only needs to be run on one cluster within the pair, as a mirror protection group will be spawned on the given peer(s) as part of the operation. By default, the protection group created on the cluster where the initial command was run will have the primary role, whilst the mirror protection groups will have the secondary role.

Note

More details regarding the robin protection-group create command, such as additional parameters and an explanation of the command's functionality, can be found here.

In order to ensure the protection group was successfully created, issue the following command:

# robin protection-group list
+----+---------+---------+----------------+-------------+
| ID | Name    |  Role   |     Tenant     |    Peers    |
+----+---------+---------+----------------+-------------+
| 3  | prod-pg | PRIMARY | Administrators | ['NewYork'] |
+----+---------+---------+----------------+-------------+

Note

More details regarding the robin protection-group list command, such as additional parameters and an explanation of the command's functionality, can be found here.

17.6.6. Step 4: Create a Replication Policy

A Replication Policy defines how often application snapshots are transferred. It also specifies, by means of a label filter, which application snapshots should be transferred as part of the replication process. As a result, it is the primary driving force in determining what data is transferred between peer clusters and how often the data is moved. To create a Replication Policy that runs every minute, replicates externally created application snapshots with the label SNAP:YES, and retains at least 5 of these snapshots on any relevant peer, run the following command:

# robin replication-policy create first-replication-policy --sched-cron "* * * * *" --labels SNAP:YES:5
Submitted job '9998'. Use 'robin job wait 9998' to track the progress

# robin replication-policy info first-replication-policy
Name: first-replication-policy
Policy Owner: admin
Policy Interval: 60
Create Snapshots: False
Snapshot Labels

+------+-------+--------+
| Key  | Value | Retain |
+------+-------+--------+
| SNAP | YES   |   5    |
+------+-------+--------+

Note

More details regarding the robin replication-policy create command, such as additional parameters and an explanation of the command's functionality, can be found here.
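The value passed to --labels in the example above, SNAP:YES:5, appears in the policy details as the Key, Value and Retain columns. The sketch below shows how such a value could be validated in Python before it is handed to the command. It assumes a KEY:VALUE:RETAIN layout, which is inferred from the example and not taken from a format specification; the names LabelFilter and parse_label_filter are hypothetical, and the rule that Retain must be at least 1 is also an assumption.

```python
from typing import NamedTuple


class LabelFilter(NamedTuple):
    key: str
    value: str
    retain: int


def parse_label_filter(spec):
    """Split an assumed KEY:VALUE:RETAIN string into its three parts."""
    parts = spec.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected KEY:VALUE:RETAIN, got {spec!r}")
    key, value, retain = parts
    if not key or not value:
        raise ValueError(f"key and value must be non-empty in {spec!r}")
    count = int(retain)  # raises ValueError if the count is not an integer
    if count < 1:
        # Assumption: retaining fewer than one snapshot is not meaningful.
        raise ValueError(f"retain count must be at least 1 in {spec!r}")
    return LabelFilter(key, value, count)


print(parse_label_filter("SNAP:YES:5"))
```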

17.6.7. Step 5: Enable Replication Policy Snapshots

In the previous step, a Replication Policy was created that transfers, every minute, snapshots with a given label that were created manually or by a snapshot schedule. In addition, snapshots can be created in cadence with the Replication Policy schedule and then transferred. These are known as Replication Snapshots, and more information about this type of snapshot can be found here. To enable this optional feature, run the following command:

# robin replication-policy enable-repl-snaps first-replication-policy 10
Submitted job '9997'. Use 'robin job wait 9997' to track the progress

# robin replication-policy info first-replication-policy
Name: first-replication-policy
Policy Owner: admin
Policy Interval: 60
Create Snapshots: True
Snapshot Labels

+------+-------+--------+
| Key  | Value | Retain |
+------+-------+--------+
| SNAP | YES   |   5    |
+------+-------+--------+

Note

More details regarding the robin replication-policy enable-repl-snaps command, such as additional parameters and an explanation of the command's functionality, can be found here.
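In the two robin replication-policy info outputs shown in Steps 4 and 5, the Create Snapshots field changes from False to True once Replication Snapshots are enabled. A script could confirm this by reading the "Name: value" lines of the output. The following Python sketch assumes the output layout shown above; parse_info_fields is a hypothetical helper, not a Robin API.

```python
def parse_info_fields(text):
    """Collect the 'Name: value' lines of an info output into a dictionary."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Table rows and borders, and lines without a colon, are not fields.
        if line.startswith(("+", "|")) or ":" not in line:
            continue
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


# Output copied from the example above.
sample = """\
Name: first-replication-policy
Policy Owner: admin
Policy Interval: 60
Create Snapshots: True
Snapshot Labels
"""

fields = parse_info_fields(sample)
repl_snaps_enabled = fields.get("Create Snapshots") == "True"
print(repl_snaps_enabled)
```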

17.6.8. Step 6: Attach Replication Policy to Protection Group

After the desired Replication Policy has been created, it needs to be attached to the previously created Protection Group. This is needed for the schedule defined within the policy to come into effect and, consequently, for the replication process to begin.

Note

Alongside the Replication Policy, any applications targeted for replication also need to be attached to the Protection Group before the replication process can begin. How to do this is described in the next step.

To attach the Replication Policy to a Protection Group, run the following command:

# robin protection-group attach-repl-policy prod-pg first-replication-policy
Submitted job '9998'. Use 'robin job wait 9998' to track the progress

# robin protection-group info prod-pg
Name: prod-pg
Tenant: Administrators
Role: PRIMARY
Peer Information:
=================

Peer Cluster Name: NewYork
Peer Cluster State: PAIRED
Peer Cluster ProtectionGroupState: PROTECTED
Replication Policy: first-replication-policy

Note

More details regarding the robin protection-group attach-repl-policy command, such as additional parameters and an explanation of the command's functionality, can be found here.

17.6.9. Step 7: Attach an Application to Protection Group

In the previous steps, Replication Snapshots and their creation via a Replication Policy were introduced. For these snapshots to actually be created, applications must first be attached to a Protection Group, as they are the subject of the aforementioned snapshots. Moreover, of the snapshots created by other means, only those that are associated with attached applications and match the label filter will be replicated. To attach an application to a Protection Group, run the following command:

# robin protection-group add-app prod-pg mysqldb --wait
Job: 10245 Name: ProtectionGroupAddApps State: PROCESSED Error: 0
Job: 10245 Name: ProtectionGroupAddApps State: WAITING Error: 0
Job: 10245 Name: ProtectionGroupAddApps State: COMPLETED Error: 0

# robin pg info prod-pg
Name: prod-pg
Tenant: Administrators
Role: PRIMARY
Peer Information:
=================

Peer Cluster Name: NewYork
Peer Cluster State: PAIRED
Peer Cluster ProtectionGroupState: PROTECTED
Replication Policy: first-replication-policy

+----------+-------------------------+--------------------------------------------------------+--------------------------+
| App Name | State                   | Latest Synced Snapshot                                 | Latest Sycned Snap ctime |
+----------+-------------------------+--------------------------------------------------------+--------------------------+
| mysqldb  | PRIMARY_SYNC_SUCCESSFUL | mysqldb_first-replication-policy-1639611848-1639688268 | 2021-12-16 12:57:52      |
+----------+-------------------------+--------------------------------------------------------+--------------------------+

Note

More details regarding the robin protection-group add-app command, such as additional parameters and an explanation of the command's functionality, can be found here.

17.6.10. Step 8: Verify Data Transfer

After completing all the previous steps, replication is set up for the applications associated with the given Protection Group. The application data transferred between peer clusters can be tracked with the previously seen robin protection-group info command. Specifically, as a snapshot is replicated successfully, its state changes in the following order:

  • SYNC Pending

  • SYNC In Progress

  • SYNC Successful

Note

Each of the above states is prefixed with the role of the Protection Group, either Primary or Secondary, depending on the cluster on which the aforementioned command is run (for example, PRIMARY_SYNC_SUCCESSFUL). More information on the given Sync states can be found here.
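The role prefix and the sync phase can be separated when the state is processed by a script. The Python sketch below is illustrative only. The spellings PRIMARY_SYNC_SUCCESSFUL and SECONDARY_SYNC_SUCCESSFUL come from the example outputs in this section, whereas the spellings assumed for the pending and in-progress phases (…_SYNC_PENDING and …_SYNC_IN_PROGRESS) are guesses that should be checked against the actual CLI output. The function names are hypothetical.

```python
# Phases in the order a snapshot is expected to pass through them.
# Only SUCCESSFUL is confirmed by the example outputs; the other two
# spellings are assumptions.
SYNC_PHASES = ("PENDING", "IN_PROGRESS", "SUCCESSFUL")
ROLES = ("PRIMARY", "SECONDARY")


def split_sync_state(state):
    """Split e.g. 'PRIMARY_SYNC_SUCCESSFUL' into ('PRIMARY', 'SUCCESSFUL')."""
    role, separator, phase = state.partition("_SYNC_")
    if not separator or role not in ROLES or phase not in SYNC_PHASES:
        raise ValueError(f"unrecognized sync state: {state!r}")
    return role, phase


def is_synced(state):
    """Return True only for the final phase of the sequence."""
    return split_sync_state(state)[1] == "SUCCESSFUL"


print(split_sync_state("SECONDARY_SYNC_SUCCESSFUL"))
print(is_synced("PRIMARY_SYNC_PENDING"))
```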

Shown below is an example output of the successful replication of a snapshot on the cluster hosting the primary Protection Group.

# robin protection-group info prod-pg

Name: prod-pg
Tenant: Administrators
Role: PRIMARY
Peer Information:
=================

Peer Cluster Name: NewYork
Peer Cluster State: PAIRED
Peer Cluster ProtectionGroupState: PROTECTED
Replication Policy: minute_repl

+----------+-------------------------+-------------------------------------------+--------------------------+
| App Name | State                   | Latest Synced Snapshot                    | Latest Sycned Snap ctime |
+----------+-------------------------+-------------------------------------------+--------------------------+
| mysqldb  | PRIMARY_SYNC_SUCCESSFUL | mysqldb_minute_repl-1639611848-1639688868 | 2021-12-16 13:07:53      |
+----------+-------------------------+-------------------------------------------+--------------------------+

On the other hand, below is an example output of the successful replication of a snapshot on the cluster hosting a secondary Protection Group.

# robin protection-group info prod-pg

Name: prod-pg
Tenant: Administrators
Role: SECONDARY
Peer Information:
=================

Peer Cluster Name: Sanfrancisco
Peer Cluster State: PAIRED
Peer Cluster ProtectionGroupState: PROTECTED
Replication Policy: minute_repl-1

+----------+---------------------------+-------------------------------------------+--------------------------+
| App Name | State                     | Latest Synced Snapshot                    | Latest Sycned Snap ctime |
+----------+---------------------------+-------------------------------------------+--------------------------+
| mysqldb  | SECONDARY_SYNC_SUCCESSFUL | mysqldb_minute_repl-1639611848-1639688989 | 2021-12-16 13:09:53      |
+----------+---------------------------+-------------------------------------------+--------------------------+
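The verification above can also be repeated automatically until every attached application reports a successful sync. The following Python sketch polls robin protection-group info and inspects the State column of the application table. It assumes the output layout shown in the examples above; the function names, the number of attempts and the delay are all hypothetical choices, and the fetch parameter exists so that the logic can be exercised with canned output on a machine without the robin command.

```python
import subprocess
import time


def fetch_pg_info(name):
    """Run 'robin protection-group info <name>' and return its text output."""
    result = subprocess.run(
        ["robin", "protection-group", "info", name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def app_states(info_text):
    """Map each application name to its State column value."""
    states = {}
    header = None
    for line in info_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # field lines and '+---+' borders are not table rows
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
            continue
        row = dict(zip(header, cells))
        states[row["App Name"]] = row["State"]
    return states


def wait_until_synced(name, fetch=fetch_pg_info, attempts=10, delay=60.0,
                      sleep=time.sleep):
    """Poll until all applications end in _SYNC_SUCCESSFUL, or give up."""
    for attempt in range(attempts):
        states = app_states(fetch(name))
        if states and all(s.endswith("_SYNC_SUCCESSFUL") for s in states.values()):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Table copied from the primary-cluster example above.
SAMPLE = """\
+----------+-------------------------+-------------------------------------------+--------------------------+
| App Name | State                   | Latest Synced Snapshot                    | Latest Sycned Snap ctime |
+----------+-------------------------+-------------------------------------------+--------------------------+
| mysqldb  | PRIMARY_SYNC_SUCCESSFUL | mysqldb_minute_repl-1639611848-1639688868 | 2021-12-16 13:07:53      |
+----------+-------------------------+-------------------------------------------+--------------------------+
"""

print(app_states(SAMPLE))
```

On a cluster, wait_until_synced("prod-pg") would use the real command. Note that a successful state only describes the latest synced snapshot listed in the table, so the snapshot name and ctime columns should still be compared against the snapshot that is expected to have been transferred.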