8. Managing External Repositories

In the previous chapter we looked at how ROBIN allows one to snapshot and clone the entire application stack. Snapshots are stored on the same disks from which the storage for the PersistentVolume is allocated. Over time, as more snapshots are taken, they consume more and more of the disk space on the primary storage device. To avoid this while preserving as many snapshots as desired, ROBIN allows one to attach an external secondary storage repository onto which older snapshots are copied. With this approach, one can set up a policy that retains, for example, only the last 2 snapshots on the primary storage device and copies the older snapshots to the secondary storage repository.

Recall that ROBIN snapshots are not just snapshots of the PersistentVolumeClaims; they capture all the relevant Kubernetes resources that make up the application. Hence, with ROBIN one can:

  • Back up Apps+Data to an external storage repository such as an S3 Object Store, Google Cloud Storage, etc.

  • Make Apps+Data portable across Kubernetes clusters spanning on-prem, public and hybrid cloud environments.

This is done by registering a common external Repository (repo for short) that resides outside the ROBIN cluster. This repository is typically an object storage system such as an Amazon S3 bucket, a Google Cloud Storage (GCS) bucket, Azure Blob Storage, or an NFS export.

The workflow to register and use a repo is as follows:

  1. Create/Register a Repo by passing access credentials to it

  2. Attach the Repo to an App

  3. Back up App+Data to the attached Repo

  4. Register the same repo to a different Kubernetes cluster

  5. Export an App backup from the first cluster

  6. Import the App backup into the second cluster

  7. Browse the contents of the repo on the second cluster, pick an app, and light it up

Topics covered in this chapter:

robin repo register

Register a new external storage repository (S3, GCS, …)

robin repo unregister

Unregister a repo

robin repo list

List all repos

robin repo info

Show details about a specific repo

robin repo contents

Show the contents of a repo

robin repo attach

Attach a repo to a specific app

robin repo detach

Detach a repo from a specific app

robin repo purge

Purge/delete entries in a repo

robin repo status

Show status of transfers to and from repo

robin repo share

Share a repo with one or more users

robin repo unshare

Stop sharing a repo with one or more users

8.1. Register a Repo

A repo is registered using the following command:

$ robin repo register <reponame> s3|gcs://bucket[/path/to/folder] <credentials> <readwrite|readonly>

reponame

Name to assign to the repo

s3|gcs

Type of the repo. It can be one of the following values:
- s3: AWS S3
- gcs: Google Cloud Storage

bucket[/path/to/folder]

S3/GCS bucket name and, optionally, a folder within that bucket into
which the application snapshots are pushed (backed up)

<credentials>

Path to a JSON file with credentials to access the repo. The format for each
repo type is shown below.

<readwrite | readonly>

Permissions with which the repo is registered. With the ‘readwrite’ permission,
the bucket and the folder hierarchy are created in the repo if they do not
already exist; this assumes that the supplied credentials have bucket creation
privileges. With the ‘readonly’ permission, the repo is registered in read-only
mode: it can only be used to pull application snapshots that were already
pushed to the repo, and new snapshots cannot be written to it.
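The argument rules above can be checked before the command is run. The following is a minimal sketch, not part of the ROBIN CLI: build_register_cmd is a hypothetical helper that validates the repo type and permission values described above and prints the resulting robin repo register command line without executing it. The repo name, bucket and file name in the sample call are illustrative.

```shell
# Hypothetical helper (not part of ROBIN): validate the arguments and print
# the "robin repo register" command line instead of running it.
build_register_cmd() {
    name="$1"         # repo name
    type="$2"         # s3 or gcs
    bucket_path="$3"  # bucket[/path/to/folder]
    creds="$4"        # path to the JSON credentials file
    mode="$5"         # readwrite or readonly

    case "$type" in
        s3|gcs) ;;
        *) echo "unsupported repo type: $type" >&2; return 1 ;;
    esac
    case "$mode" in
        readwrite|readonly) ;;
        *) echo "unsupported permission: $mode" >&2; return 1 ;;
    esac

    echo "robin repo register $name $type://$bucket_path $creds $mode"
}

# Print the command for review before running it by hand.
build_register_cmd mybackups s3 db-backup-bucket/mybackups aws.json readwrite
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without cluster access; the output can be reviewed and pasted once it looks right.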

8.1.1. Format of credential file for AWS S3 type repo

The format of the JSON file passed via the <credentials> parameter for AWS S3 type repos is as follows:

{
    "aws_access_key_id" : "AKIAIC9YAL8TKCSEO43A",
    "aws_secret_access_key" : "66HXkA3mjQEbmrUo1Aw02bcdYEWjsuSPHvLEMNfZ",
    "end_point": "s3.amazonaws.com"
}

Example:

$ robin repo register mybackups s3://db-backup-bucket/mybackups aws.json readwrite

The AWS Access Key ID and Secret Access Key can be obtained by following these instructions:

  1. Log in to your AWS Management Console.

  2. Click on your user name at the top right of the page.

  3. Click on the Security Credentials link from the drop-down menu.

  4. Find the Access Credentials section, and copy the latest Access Key ID.

  5. Click on the Show link in the same row, and copy the Secret Access Key.
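Before registering an S3 repo, it can help to confirm that the credentials file contains the expected keys. The sketch below is an illustration only: check_s3_creds is a hypothetical helper, and it performs a plain text search for the three key names shown in the format above rather than parsing the JSON.

```shell
# Hypothetical helper: confirm that a credentials file mentions every key
# used in the S3 credential format. This is a text search, not a JSON parser,
# so it will not catch syntax errors such as a trailing comma.
check_s3_creds() {
    creds_file="$1"
    for key in aws_access_key_id aws_secret_access_key end_point; do
        if ! grep -q "\"$key\"" "$creds_file"; then
            echo "missing key: $key" >&2
            return 1
        fi
    done
    echo "ok"
}

# Usage (aws.json is an illustrative file name):
#   chmod 600 aws.json && check_s3_creds aws.json
```

Restricting the file to its owner with chmod 600 is a general precaution for files holding secret keys; it is not a ROBIN requirement stated in this chapter.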

8.1.2. Format of credential file for Google Cloud Storage (GCS) type repo

The format of the JSON file passed via the <credentials> parameter for Google Cloud Storage (GCS) type repos is as follows:

 {
     "type": "service_account",
     "project_id": "rock-range-805623",
     "private_key_id": "c9fde8c819735439248147457629895ebbcc1f21",
     "private_key": "-----BEGIN PRIVATE KEY-----\nMIIEvgIBADANBgpqkkiG9w0BAQEFAASCBKgwggSkAgEAAoIBAQDXlyNg3997PJlA\nyS9doXSfX5lEpDgmk7nATyOrKQkl1D3/bWvoFMR372i0WKopj5FvLNY42jrcRzjF\nTlbJuwP4WAU652ss3qEvnUgd+mD7CcjOWd9bAA6vPJAJDo/TtTleZNQWd96y2WbT\nH/BqI9UCxdWNfsuYHQtpVViPdviizU3AxFtD6NTN/KmUX9mlM6RuF2RXAc5N/p4X\nhbeSV+rSfEgb9PA2U+fwobGzeR97V4SwK15btPpSN5twf0Wy49zGQLONFcmfwhRc\n+3r3gpPfqb+cb1xWiiD5fgO4yWYcWQXJJgUumVtVZxikc6k+9Vbko9zw20NNzRgp\nKfZdtm1jAgMBAAECggEABx+l2h4bntFWSQc3yu26UkfQ4y0/37pe4WVcCtxMwpS3\nUEbV+7Gv+mbYwVjKqpvlVNSY3YD4f+3UiOR5RIzK6UpTRep+ppoPzGh4iREMYk3k\n9PiQQkwCsDSil3IE65xp3F4die6FC3jWNFdSVNeBQtmxoD7H0GtpRJ4+0mK+fXbW\nr5f6O3WES4nOTNRonOdg9bIJJKklt3GSjtd1X5JWfGU53sbrksvy31+hL71pXCQn\nUZlkoilc3KYGnesd3KwIpxX9Pi5TldNUWuSnibvgXnDjM05PuQ0YI9VEfVR+eDfI\nBZAw2b/dmCWyU3QVAMtRbhwISmbxB33rZ133gTVN/QKBgQD5ZTZSnf7/b0YdKcAD\nP3bevN3/0s9EXO+D62MTRdrNeyb46hAcbNHmzAvgqtB74bQxiF5hs6RbmTmUpVpL\nSyCl6lqf3eMzP2gdECRLmkeZKpiWtfbPNbTefGVvbvWG6vi3E6+b/5cKPVHMLSV6\n4LIbqpVyWFWM905MXU9cgFgJDwKBgQDdTL5Yl3G/91HnyFieQ9pju6C9v1S3277k\nN10hcmldkJWludgT3WzDsNmbcAdCp+KOtQusf6KMKT5sd5ickMmAI3IZkzsgb7rS\n7AVBf3bpLG+3kOGJ4PVGKXfTEONw7hSo1UwLQxSZj1SMTtOd63pFcVea1hYQXifr\nADUrkwhObQKBgQCTb9pljTIrIEV7CCuTwChTv6QekSonaCnQ+19fDXUE9UFc9kMA\nCvUsVITRFSqbkhHXlp71c5Y+3J6x2e3/g/KRI7Lfv/WJXnrWc6yBZXveeOgscPaK\nQ13iCfiNndX5JQBXb+vo3iEU1Jt+3VGvCxc/3IjtSHuLEskfLCq2rMle0QKBgQCX\nsTJM0gGKT8KCCc/M9J/vez6Msmkk+mkYUGbzNVTKQQCDMCfQPhh+72vKY3lmlGP1\nBF7zKC5Iu0DB4xzmPU0SG/DBzS1bZ5r9V7GmmvPsk3wkrRgcheo65NPxBwOQdnIM\n5OCSW7H0LM58us/N0UG+Zhnx1cwb/h6ItIS90LSB5QKBgCwZPxhDhlKXGn3J8sLn\nuZ9WjFTBlabt8mksHpnE4xgj5BeOXee29CtaX+mPIcPEOO9ynsbizaSv7I2kBM+d\n9DZt9XiM6NcjrqtP7FCdgvZeQV0GIHjlqj2zH0UDxImGmENyHeXA1X2d0fZwzuJ+\n5dEZAfYh4QAnnPiAF5K9QFEi\n-----END PRIVATE KEY-----\n",
     "client_email": "jamesbond-cloud-repo@rock-range-805623.iam.gserviceaccount.com",
     "client_id": "1815195052297641083965",
     "auth_uri": "https://accounts.google.com/o/oauth2/auth",
     "token_uri": "https://oauth2.googleapis.com/token",
     "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
     "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/jamesbond-cloud-repo%40rock-range-805623.iam.gserviceaccount.com"
}

Example:

$ robin repo register datadump gcs://datadump-bucket/dumps gcs.json readwrite

The GCS key file can be downloaded from the Google Cloud Platform portal using the following instructions:

  1. Open the IAM & Admin page in the GCP Console.

  2. Select your project and click Continue.

  3. In the left nav, click Service accounts.

  4. Look for the service account for which you wish to create a key, click the More button in that row, and then click Create key.

  5. Select a Key type and click Create.

More instructions are available here: https://cloud.google.com/iam/docs/creating-managing-service-account-keys

Note

The ability to register a repo is subject to ROBIN Role Based Access Control (RBAC). By default, only Cluster Administrators (users having the clusteradmin role) have permission to do so. See the section on User Management for details on how ROBIN RBAC works.

Only one repo with a given bucket name and path can be registered in a ROBIN cluster.

To provide some users with readonly access to a repo, share the repo with them for the view operation only. For users that should have readwrite access to a repo, share the repo with them for the view and push operations. For this to work, the repo must have been registered with the readwrite permission. See Share a Repo for details.

8.2. Unregister a Repo

A registered repo can be unregistered using the following command:

$ robin repo unregister <reponame>

reponame

The name of the repo to remove

Note

The ability to unregister a repo is subject to ROBIN Role Based Access Control (RBAC). By default, only Cluster Administrators (users having the clusteradmin role) have permission to do so. See the section on User Management for details on how ROBIN RBAC works.

There are a number of conditions that will prevent a repo from being unregistered:

  • A repo cannot be unregistered if it has attached applications (see robin repo attach command). Use the robin repo detach command to detach each app before running the robin repo unregister command. See documentation for robin repo detach to understand what happens to an app’s data when it is detached from the repo.

  • A repo cannot be unregistered if there are any backup entries in the repo’s catalog. All backup entries must be purged from the repo’s catalog before running the robin repo unregister command. Backup entries can be purged from the repo catalog only, or they can be purged from the catalog and the external repository. See documentation for robin repo purge and robin backup delete for details.

  • A repo cannot be unregistered if it is shared with any users (see robin repo share command). Use the robin repo unshare command to remove any user shares before running the robin repo unregister command.
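The conditions above imply a clean-up sequence before a repo can be unregistered. The sketch below prints one such sequence using only the command forms shown in this chapter. plan_unregister is a hypothetical helper, the sample arguments are illustrative, and the sketch assumes a repo with a single attached app, a single backup entry, and a share with all tenant users. The relative order of the first three steps is one reasonable choice, not a documented requirement.

```shell
# Hypothetical helper: print (do not run) the commands that clear each
# condition blocking "robin repo unregister", followed by the unregister.
plan_unregister() {
    repo="$1"      # repo name
    app="$2"       # attached app to detach
    backupid="$3"  # backup entry to purge

    echo "robin repo detach $repo $app"
    echo "robin repo purge $repo $backupid"
    echo "robin repo unshare $repo --all-tenant-users"
    echo "robin repo unregister $repo"
}

plan_unregister datadump mydb 5eccc5285a5211e9a1c3417886f14cc5
```

For a repo with several apps, backups or user shares, the corresponding line would be repeated once per item.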

8.3. List Repos

Repos registered with a ROBIN cluster can be listed using the following command:

$ robin repo list

Example:

$ robin repo list

+----------+------+----------------------+--------------+-----------------+--------+-------------+
| Name     | Type | Owner/Tenant         | BackupTarget | Bucket          | Path   | Permissions |
+----------+------+----------------------+--------------+-----------------+--------+-------------+
| datadump | GCS  | robin/Administrators | 1            | datadump-bucket | dumps/ | readwrite   |
+----------+------+----------------------+--------------+-----------------+--------+-------------+

Note

The ability to view repos is subject to ROBIN Role Based Access Control (RBAC). Cluster Administrators (users having the clusteradmin role) are able to view all registered repos in a cluster. Cluster Users (users having the clusteruser role), on the other hand, are only able to view repos that have been explicitly shared with them or that have been shared with all cluster users (see Share a Repo for details). Cluster Users are also able to view any repos they register/create; however, they must first be assigned the capability to do so. See the section on User Management for details on how ROBIN RBAC works.

8.4. Get information about a specific Repo

Details of a specific repo can be obtained using the following command:

$ robin repo info <reponame>

reponame

The name of the repo registered with ROBIN

Example:

$ robin repo info datadump

Name                   : datadump
Type                   : GCS
Bucket                 : datadump-bucket
Path                   : dumps/
Permissions            : readwrite

Scan Details
-------
Scan State             : SUCCESS
Start Time             : 2019-04-08 16:05:35
End Time               : 2019-04-08 16:05:35
Time Taken             : 0s
Scan Error             : -
Scan Diff              : {'removed': 0, 'added': 0}

Apps attached: 1
        helm/mydb

Note

The ability to view information about a specific repo is subject to ROBIN Role Based Access Control (RBAC). Cluster Administrators (users having the clusteradmin role) are able to view information about all registered repos in a cluster. Cluster Users (users having the clusteruser role), on the other hand, are only able to view information about repos that have been explicitly shared with them or that have been shared with all cluster users (see Share a Repo for details). See the section on User Management for details on how ROBIN RBAC works.

8.5. List the contents of a Repo (Browse Backup Catalog)

To browse the contents of a repo (aka browsing the backup catalog) use the following command:

$ robin repo contents <reponame>                \
                      --refresh                 \
                      --zoneid <zoneid>         \
                      --backupid <backupid>     \
                      --managed

reponame

Name of the repo to browse

--refresh

When specified, the repo is scanned again to rebuild the catalog

--zoneid <zoneid>

The zoneid to use when scanning repo contents (disaster recovery only)

--backupid <backupid>

The backup ID to use when scanning repo contents (disaster recovery only)

--managed

Take ownership of an orphaned backup in an external storage repo (disaster recovery only)

Example:

$ robin repo contents datadump --refresh

+----------------------------------+------------+------+------------+
| BackupID                         | ZoneID     | App  | Snapshot   |
+----------------------------------+------------+------+------------+
| 5eccc5285a5211e9a1c3417886f14cc5 | 1554764031 | mydb | mydb_snap1 |
+----------------------------------+------------+------+------------+

Note

The ability to view the contents of a repo (view the repo catalog) is subject to ROBIN Role Based Access Control (RBAC). By default, only Cluster Administrators (users having the clusteradmin role) have permission to do so. Cluster Users can use the robin backup list command to view information about app backups that are stored in a repo, at least for those backups they have view access to. See the section on User Management for details on how ROBIN RBAC works.

In general, it should not be necessary to issue a robin repo contents command with the --refresh command line argument. All backups pushed to a repo from a cluster will automatically get added to the repo’s catalog.

To perform disaster recovery (recover orphaned backups from an external storage repo), issue the robin repo contents command with the --zoneid <zoneid> command line argument. This will allow all backups from the destroyed cluster to be added to the repo catalog in the local ROBIN cluster. If only a single backup is required, then include the --backupid <backupid> command line option as well. Note that recovered backups are functionally identical to backups that have been imported from another cluster. To take ownership of the recovered backups (have the ability to issue a robin repo purge command on them), include the --managed command line option.
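The disaster recovery options described above can be combined as in the following sketch. build_recovery_cmd is a hypothetical helper that only prints the robin repo contents command line; the flag spellings follow the synopsis at the top of this section, and the sample zone ID is taken from the example output above.

```shell
# Hypothetical helper: print a "robin repo contents" command line for
# disaster recovery. Omitting the backup ID recovers all backups of the zone.
build_recovery_cmd() {
    repo="$1"          # repo name
    zoneid="$2"        # zone ID of the destroyed cluster
    backupid="${3:-}"  # optional: restrict recovery to a single backup

    cmd="robin repo contents $repo --zoneid $zoneid"
    if [ -n "$backupid" ]; then
        cmd="$cmd --backupid $backupid"
    fi
    # --managed takes ownership so the recovered backups can later be purged
    echo "$cmd --managed"
}

build_recovery_cmd datadump 1554764031
build_recovery_cmd datadump 1554764031 5eccc5285a5211e9a1c3417886f14cc5
```

Dropping --managed from the printed command would recover the backups without taking ownership of them, which matches the behavior described for imported backups.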

8.6. Attach a Repo to an App

To push (backup) snapshots of an app, a repo must first be attached to it. This is done using the following command:

$ robin repo attach <reponame> <appname>

reponame

Name of the repo to attach

appname

Name of the app to which to attach the repo (this is obtained by running robin app list)

Note

The ability to attach an app to a repo is subject to ROBIN Role Based Access Control (RBAC). By default, only Cluster Administrators (users having the clusteradmin role) have permission to do so. A repo must be shared with a user for the push operation before they will be allowed to attach an app to it (see Share a Repo for details). See the section on User Management for details on how ROBIN RBAC works.

Attaching a repo to an application doesn’t automatically trigger snapshots for that application to be pushed (backed up) to the repo. Snapshots must either be pushed (backed up) manually using the robin snapshot push command or pushed automatically via scheduled snapshots, which are configured using the robin app config command. See Configuring Data Management Attributes of an Application.

Example:

$ robin repo attach datadump mydb

$ robin repo info datadump

Name                   : datadump
Type                   : GCS
Bucket                 : datadump-bucket
Path                   : dumps/
Permissions            : readwrite

Scan Details
-------
Scan State             : SUCCESS
Start Time             : 2019-04-08 16:05:35
End Time               : 2019-04-08 16:05:35
Time Taken             : 0s
Scan Error             : -
Scan Diff              : {'removed': 0, 'added': 0}

Apps attached: 1
        helm/mydb

8.7. Detaching a repo from an App

A repo can be detached from an app using the following command:

$ robin repo detach <reponame> <appname>

reponame

Name of the repo to detach

appname

Name of the app from which to detach the repo

Example:

$ robin repo detach datadump mydb

$ robin repo info datadump

Name                   : datadump
Type                   : GCS
Bucket                 : datadump-bucket
Path                   : dumps/
Permissions            : readwrite

Scan Details
-------
Scan State             : SUCCESS
Start Time             : 2019-04-08 17:12:35
End Time               : 2019-04-08 17:12:36
Time Taken             : 1s
Scan Error             : -
Scan Diff              : {'removed': 0, 'added': 0}

Apps attached: 0

Note

The ability to detach an app from a repo is subject to ROBIN Role Based Access Control (RBAC). By default, only Cluster Administrators (users having the clusteradmin role) have permission to do so. A repo must be shared with a user for the push operation before they will be allowed to detach an app from it (see Share a Repo for details). See the section on User Management for details on how ROBIN RBAC works.

8.8. Purging (Deleting) backups from a repo

A snapshot that has been pushed (backed up) to a repo can be purged (deleted from the repo) using the following command:

$ robin repo purge <reponame> <backupid>

reponame

Name of the repo to purge from

backupid

Unique ID of the backup (obtained from the output of robin backup list)

Example:

$ robin repo purge datadump <backupid>

8.9. Monitor Status of Repo

The status of transfers to and from a repo can be viewed using the following command:

$ robin repo status <reponame>

reponame

Name of the repo to monitor

Example:

$ robin repo status datadump
+---------+------------------------------------+-----------------+
| Name    | TransferStatus                     | Schedule/Manual |
+---------+------------------------------------+-----------------+
| backup1 | 20% [####................] 480.0MB | MANUAL          |
+---------+------------------------------------+-----------------+

After a few seconds, we can run the command again to see the updated progress:

$ robin repo status datadump
+---------+-------------------------------------+-----------------+
| Name    | TransferStatus                      | Schedule/Manual |
+---------+-------------------------------------+-----------------+
| backup1 | 100% [####################] 480.0MB | MANUAL          |
+---------+-------------------------------------+-----------------+
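
Rather than re-running the command by hand, the percentage can be extracted from the status table in a script. The sketch below is an assumption-laden illustration: transfer_percent is a hypothetical helper, and it assumes the table layout matches the example output above (backup name in the first column, a leading percentage in the second).

```shell
# Hypothetical helper: read a "robin repo status" table on stdin and print
# the transfer percentage (digits only) for the named backup.
transfer_percent() {
    awk -F'|' -v name="$1" '
        { n = $2; gsub(/^ +| +$/, "", n) }      # first column, trimmed
        n == name {
            s = $3                              # e.g. " 20% [####....] 480.0MB "
            sub(/^ +/, "", s)                   # drop leading spaces
            sub(/%.*/, "", s)                   # keep only the number before "%"
            print s
        }
    '
}

# Usage: poll every 5 seconds until the transfer of backup1 reaches 100%.
#   until [ "$(robin repo status datadump | transfer_percent backup1)" = "100" ]; do
#       sleep 5
#   done
```

The polling loop in the usage comment never exits if the named backup does not appear in the table, so a real script would also want a retry limit.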

8.10. Share a Repo

Share a storage repo with one or more users:

$ robin repo share <reponame> <user_list>                       \
                              --operations <operations>         \
                              --all-tenant-users

reponame

Name of the repo to share

user_list

List of users to share the repo with

--operations <operations>

List of operations the user will be allowed to perform or ALL_OPERATIONS (default is view)

--all-tenant-users

Share a repo (for specified operations) with all users

Example:

$ robin repo share testrepo --all-tenant-users

$ robin repo share testrepo user1 --operations push

$ robin repo list --full
+----------+------+------------------+--------------+-----------------------+--------------+-------------+
| Name     | Type | Owner/Tenant     | BackupTarget | Bucket                | Path         | Permissions |
+----------+------+------------------+--------------+-----------------------+--------------+-------------+
| testrepo | GCS  | robin/K8sCluster | 1            | testbucket-1029384756 | dev/testing/ | readwrite   |
+----------+------+------------------+--------------+-----------------------+--------------+-------------+
  User Shares:
    user1: push
    all_tenant_users: view

Note

The ability to share a repo is subject to ROBIN Role Based Access Control (RBAC). By default, only Cluster Administrators (users having the clusteradmin role) have permission to do so. See the section on User Management for details on how ROBIN RBAC works.

8.11. Unshare a Repo

Stop sharing a storage repo with one or more users:

$ robin repo unshare <reponame> <user_list>                       \
                                --operations <operations>         \
                                --all-tenant-users

reponame

Name of the repo to stop sharing

user_list

List of users to stop sharing the repo with

--operations <operations>

List of operations the user will no longer be allowed to perform or ALL_OPERATIONS (default is view)

--all-tenant-users

Stop sharing a repo (for specified operations) with all users

Example:

$ robin repo unshare testrepo --all-tenant-users

Note

The ability to unshare a repo is subject to ROBIN Role Based Access Control (RBAC). By default, only Cluster Administrators (users having the clusteradmin role) have permission to do so. See the section on User Management for details on how ROBIN RBAC works.