Backing up and restoring databases (version 4.0.x)
You can back up and restore databases in IBM Watson® Knowledge Studio for IBM Cloud Pak® for Data version 4.0.x by running scripts.
The all-backup-restore.sh script backs up or restores all the databases. It deactivates the pods to prevent access during the operation and then reactivates them. With the individual database scripts, however, you must deactivate and reactivate the pods yourself.
Restoring a backup from a different instance causes errors. Restore only a backup that was created from the same IBM Watson® Knowledge Studio instance.
Before you begin
- Download the scripts.
- Review information about the script's use of the MinIO client. The client is required for the MinIO commands.
  - The scripts download the client from the MinIO website if the client isn't installed.
  - If you want the script to use your installed version, verify that you can run the client by issuing the command mc on the command line.
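The check in the last item can be scripted. The following is a minimal sketch; the helper name client_available is hypothetical and is not part of the Knowledge Studio scripts.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Knowledge Studio scripts):
# succeeds if the named command can be found on the PATH.
client_available() {
  command -v "$1" >/dev/null 2>&1
}

# Report whether a command named mc is installed locally.
if client_available mc; then
  echo "mc found: $(command -v mc)"
else
  echo "mc not found; the scripts download it from the MinIO website"
fi
```

On some systems the name mc also belongs to the Midnight Commander file manager, so it is worth confirming that the command found is the MinIO client, for example by running mc --version.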
all-backup-restore script
The all-backup-restore.sh script backs up and restores the MongoDB, PostgreSQL, and MinIO databases, and the PVC data.
Unless you only need to back up a single database, it is recommended that you use the all-backup-restore script, which deactivates and reactivates Knowledge Studio.
The script backs up or restores the data in the following order:
- MongoDB
- PostgreSQL
- MinIO
- PVC
all-backup-restore.sh [backup|restore] [RELEASE_NAME] [BACKUP_DIR] [-n NAMESPACE]
Use either the backup or restore command.
- backup
  Backs up each database to a subdirectory of the BACKUP_DIR directory.
- restore
  Restores the data from each database in the BACKUP_DIR directory.
Arguments and options
- RELEASE_NAME
  Required. The release name that was specified when the Knowledge Studio Helm chart was installed in your cluster.
  You can find the release name as the prefix of the pod name, for example, {release_name}-ibm-watson-ks-yyy-xxx. For version 4.3.0, the value is always wks.
- BACKUP_DIR
  Required. The base directory where backups are stored to or restored from. Each database is stored in a subdirectory of the backup directory.
  For a backup, a new timestamped folder named wks-backup-yyyymmdd_hhmmss is created under [backupDir], for example: [backupDir]/wks-backup-yyyymmdd_hhmmss/mongodb
  For a restore, set [backupDir] to the wks-backup-yyyymmdd_hhmmss folder itself, so that the database subdirectories sit directly under it, for example: [backupDir]/mongodb
- -n NAMESPACE
  Namespace for the pods. The default value is zen.
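As an illustration of how the arguments fit together, the following sketch prints a backup invocation and then a restore invocation that points at the most recent timestamped folder. The base directory /tmp/wks-backups, the helper name latest_backup_dir, and the assumption that the script is in the current directory are all hypothetical. The sketch only prints the commands and does not run them.

```shell
#!/bin/sh
# Hypothetical helper: print the newest wks-backup-yyyymmdd_hhmmss directory
# under the given base directory. The timestamp format sorts chronologically
# when sorted as text, so the last glob match is the newest backup.
latest_backup_dir() {
  lbd_latest=""
  for lbd_dir in "$1"/wks-backup-*; do
    [ -d "$lbd_dir" ] && lbd_latest="$lbd_dir"
  done
  [ -n "$lbd_latest" ] && printf '%s\n' "$lbd_latest"
}

RELEASE_NAME="wks"            # assumed release name
NAMESPACE="zen"               # default namespace
BASE_DIR="/tmp/wks-backups"   # assumed base directory

# Backup: pass the base directory; the script creates the timestamped folder.
echo "./all-backup-restore.sh backup $RELEASE_NAME $BASE_DIR -n $NAMESPACE"

# Restore: pass the timestamped folder itself, not the base directory.
if RESTORE_DIR=$(latest_backup_dir "$BASE_DIR"); then
  echo "./all-backup-restore.sh restore $RELEASE_NAME $RESTORE_DIR -n $NAMESPACE"
else
  echo "no wks-backup-* directory found under $BASE_DIR"
fi
```

Picking the folder by name rather than by modification time keeps the choice stable if a backup folder is copied or touched later.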
Output
The script returns the following output, indicating either the backup or restore command:
[SUCCESS] MongoDB,PostgreSQL,Minio,PVC (backup|restore)
If the process fails, the following message is displayed, indicating either the backup or restore command:
[FAIL] MongoDB,PostgreSQL,Minio,PVC (backup|restore)
If the script fails, the data is corrupted. Do not use the corrupted data to restore.
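Because the final status line distinguishes success from failure, an automated run can check for it before a backup is kept. The following is a minimal sketch. It assumes that the status line is written to standard output and starts at the beginning of a line; the helper name run_succeeded is hypothetical, and the status text in the example is sample input rather than captured script output.

```shell
#!/bin/sh
# Hypothetical helper: read script output on stdin and succeed only if it
# contains a line starting with [SUCCESS] and no line starting with [FAIL].
run_succeeded() {
  rs_out=$(cat)
  printf '%s\n' "$rs_out" | grep -q '^\[FAIL\]' && return 1
  printf '%s\n' "$rs_out" | grep -q '^\[SUCCESS\]'
}

# Example with a sample status line.
if printf '%s\n' "[SUCCESS] MongoDB,PostgreSQL,Minio,PVC backup" | run_succeeded; then
  echo "backup completed"
else
  echo "backup failed; do not restore from this backup"
fi
```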
Database-specific scripts
If you need to back up or restore a single database, use one of the database-specific scripts. However, make sure that you deactivate the pods before you run the script, and reactivate the pods after the script completes successfully.
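The deactivate, run, reactivate sequence can be wrapped so that the pods are reactivated only after a successful run. The following is a sketch under stated assumptions: the names run, patch_wks, and run_single_db and the DRY_RUN variable are hypothetical, the script is assumed to be in the current directory and to exit with a non-zero status on failure, and the kubectl patch commands are the ones listed under Deactivate Knowledge Studio and Reactivate Knowledge Studio. The checks for running jobs and remaining pods are omitted for brevity. With DRY_RUN=1, which is the default here, the commands are printed instead of executed.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1 (the default); otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Apply a JSON merge patch to the wks resource.
patch_wks() {  # usage: patch_wks NAMESPACE JSON_PATCH
  run kubectl -n "$1" patch --type=merge wks wks -p "$2"
}

# Hypothetical wrapper.
# usage: run_single_db SCRIPT COMMAND RELEASE_NAME BACKUP_DIR NAMESPACE
run_single_db() {
  patch_wks "$5" '{"spec":{"global":{"quiesceMode":true}}}' || return 1
  patch_wks "$5" '{"spec":{"mma":{"replicas":0}}}' || return 1
  # Reactivate only if the database script succeeded.
  run "./$1" "$2" "$3" "$4" -n "$5" || {
    echo "script failed; Knowledge Studio left deactivated" >&2
    return 1
  }
  patch_wks "$5" '{"spec":{"global":{"quiesceMode":false}}}' || return 1
  patch_wks "$5" '{"spec":{"mma":{"replicas":2}}}'
}

run_single_db mongodb-backup-restore.sh backup wks /tmp/wks-backups zen
```

The dry-run output is meant for review only; it is not shell-quoted, so the JSON patches appear without their surrounding single quotes.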
MongoDB
Use this script instead of all-backup-restore.sh to back up or restore only the MongoDB database.
mongodb-backup-restore.sh backup|restore RELEASE_NAME BACKUP_DIR -n NAMESPACE
Use either the backup or restore command. For more information about the arguments and options, see the all-backup-restore script.
Backing up MongoDB
Back up your MongoDB data. Databases named WKSDATA, ENVDATA, and escloud_sbsep store data for Knowledge Studio.
- Deactivate Knowledge Studio.
- Run the mongodb-backup-restore.sh script with the backup command. The script runs the following operations:
  - Creates a remote temporary file under the MongoDB pod and extracts the following data: WKSDATA, ENVDATA, and escloud_sbsep.
  - Copies the WKSDATA, ENVDATA, and escloud_sbsep data to the BACKUP_DIR that you specify and deletes the temporary file.
- Reactivate Knowledge Studio.
Restoring MongoDB data
Restore the backed-up data to MongoDB.
- Deactivate Knowledge Studio.
- Run the mongodb-backup-restore.sh script with the restore command. The script runs the following operations:
  - Creates a remote temporary file under the MongoDB pod.
  - Copies the WKSDATA, ENVDATA, and escloud_sbsep data from the BACKUP_DIR that you specify to the remote temporary file.
  - Restores the data from the temporary file and deletes the temporary file.
- Reactivate Knowledge Studio.
PostgreSQL
Use this script instead of all-backup-restore.sh to back up or restore only the PostgreSQL database.
postgresql-backup-restore.sh backup|restore RELEASE_NAME BACKUP_DIR -n NAMESPACE
Use either the backup or restore command. For more information about the arguments and options, see the all-backup-restore script.
Backing up PostgreSQL
Back up your PostgreSQL data by getting a data dump.
- Deactivate Knowledge Studio.
- Run the postgresql-backup-restore.sh script with the backup command. The script runs the following operations:
  - Creates a job for the postgresql backup.
  - Dumps the databases. The filenames are the database names, such as jobq_{release_name_underscore}, model_management_api, model_management_api_v2, and awt, with the .custom extension appended.
  - Copies the dump files to the local [backupDir] that you specify.
  - Deletes the .pgpass file.
- Reactivate Knowledge Studio.
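After a PostgreSQL backup, it can be useful to confirm that a dump file exists for each database named above. The following sketch makes several assumptions: the helper names are hypothetical, {release_name_underscore} is taken to mean the release name with hyphens replaced by underscores, the four databases listed are treated as the full set even though the text introduces them with "such as", and the .custom files are assumed to sit directly in the directory that is passed in.

```shell
#!/bin/sh
# Hypothetical helper: print the dump file names expected for a release.
# Assumption: {release_name_underscore} is the release name with "-" -> "_".
expected_pg_dumps() {
  epd_rel=$(printf '%s' "$1" | sed 's/-/_/g')
  for epd_db in "jobq_$epd_rel" model_management_api model_management_api_v2 awt; do
    printf '%s.custom\n' "$epd_db"
  done
}

# Hypothetical helper: list expected dump files that are missing or empty.
missing_pg_dumps() {  # usage: missing_pg_dumps RELEASE_NAME DUMP_DIR
  expected_pg_dumps "$1" | while IFS= read -r mpd_file; do
    [ -s "$2/$mpd_file" ] || printf '%s\n' "$mpd_file"
  done
}

DUMP_DIR="/tmp/wks-backups/postgresql"   # assumed location of the dump files
missing=$(missing_pg_dumps wks "$DUMP_DIR")
if [ -z "$missing" ]; then
  echo "all expected dump files are present"
else
  echo "missing or empty dump files:"
  echo "$missing"
fi
```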
Restoring PostgreSQL data
Restore the backed-up data to PostgreSQL.
- Deactivate Knowledge Studio.
- Run the postgresql-backup-restore.sh script with the restore command. The script runs the following operations:
  - Creates a job for the postgresql restore.
  - Restores the databases (jobq_{release_name_underscore}, model_management_api, model_management_api_v2, and awt) by loading the .custom files from the [backupDir] that you specify.
  - Deletes the .pgpass file.
- Reactivate Knowledge Studio.
MinIO
Use this script instead of all-backup-restore.sh to back up or restore only the MinIO database.
minio-backup-restore.sh backup|restore RELEASE_NAME BACKUP_DIR -n NAMESPACE
Use either the backup or restore command. For more information about the arguments and options, see the all-backup-restore script.
Backing up MinIO
Back up your MinIO database by taking a snapshot of the data. A bucket named wks-icp stores data for Knowledge Studio.
- Deactivate Knowledge Studio.
- Run the minio-backup-restore.sh script with the backup command. The script runs the following operations:
  - Establishes a connection to the pod RELEASE_NAME-ibm-minio by running kubectl -n NAMESPACE port-forward.
  - Configures a MinIO alias named wks-minio.
  - Copies data from wks-minio/wks-icp to the BACKUP_DIR you specify.
  - Closes the port-forward connection.
- Reactivate Knowledge Studio.
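To make the four operations more concrete, the following sketch prints commands that are roughly equivalent to them. It is not taken from minio-backup-restore.sh: the helper name minio_backup_plan, the port 9000, the https endpoint, the ACCESS_KEY and SECRET_KEY placeholders, and the choice of mc mirror for the copy are all assumptions. The commands are printed only and are not executed.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) commands roughly equivalent to the
# four operations of a MinIO backup.
# Assumptions: port 9000, an https endpoint, placeholder credentials.
minio_backup_plan() {  # usage: minio_backup_plan RELEASE_NAME BACKUP_DIR NAMESPACE
  echo "kubectl -n $3 port-forward $1-ibm-minio 9000:9000 &"
  echo "mc alias set wks-minio https://localhost:9000 ACCESS_KEY SECRET_KEY"
  echo "mc mirror wks-minio/wks-icp $2"
  echo "kill %1   # close the port-forward connection"
}

minio_backup_plan wks /tmp/wks-backups/minio zen
```

A restore would reverse the source and target of the copy step, after the existing data in the bucket is removed.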
Restoring MinIO data
Restore the snapshot data to MinIO. The script deletes the existing data in the MinIO server, and then restores the backup data.
- Deactivate Knowledge Studio.
- Run the minio-backup-restore.sh script with the restore command. The script runs the following operations:
  - Establishes a connection to the pod RELEASE_NAME-ibm-minio by running kubectl -n NAMESPACE port-forward.
  - Configures a MinIO alias named wks-minio.
  - Copies data from the BACKUP_DIR you specify to wks-minio/wks-icp.
  - Closes the port-forward connection.
- Reactivate Knowledge Studio.
PVC
Use this script instead of all-backup-restore.sh to back up or restore only the persistent volume claim (PVC) data.
pvc-backup-restore.sh backup|restore RELEASE_NAME BACKUP_DIR DOCKERREGISTRY PVC_USER_ID -n NAMESPACE
Arguments for PVC
- DOCKERREGISTRY
  The same Docker registry as the RELEASE_NAME-ibm-watson-ks-aql-web-tooling pod.
- PVC_USER_ID
  The user ID for the running containers in the RELEASE_NAME-ibm-watson-ks-aql-web-tooling pod.
Use either the backup or restore command. For more information about the other arguments and options, see the all-backup-restore script.
Backing up PVC
- Identify the name of the Docker registry and the user ID before you deactivate Knowledge Studio.
- Deactivate Knowledge Studio.
- Run the pvc-backup-restore.sh script with the backup command. The script runs the following operations:
  - Creates a temporary pod at RELEASE_NAME-ibm-watson-ks-aql-web-tooling-backup. Compresses /opt/ibm/watson/aql-web-tooling/target/sandbox and saves it as sandbox.tgz to RELEASE_NAME-ibm-watson-ks-aql-web-tooling-backup.
  - Copies sandbox.tgz to the BACKUP_DIR that you specify.
  - Deletes the temporary pod.
- Reactivate Knowledge Studio.
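Because the PVC backup is a single sandbox.tgz archive, a quick integrity check is possible before the backup is relied on. The following is a minimal sketch; the helper name sandbox_archive_ok and the directory /tmp/wks-backups/pvc are hypothetical, and the archive is assumed to sit directly in the directory that is passed in.

```shell
#!/bin/sh
# Hypothetical helper: succeed if DIR/sandbox.tgz exists, is not empty,
# and tar can list its contents (a gzip-compressed tar archive).
sandbox_archive_ok() {
  [ -s "$1/sandbox.tgz" ] && tar -tzf "$1/sandbox.tgz" >/dev/null 2>&1
}

PVC_DIR="/tmp/wks-backups/pvc"   # assumed location of sandbox.tgz
if sandbox_archive_ok "$PVC_DIR"; then
  echo "sandbox.tgz is readable"
else
  echo "sandbox.tgz is missing or unreadable in $PVC_DIR"
fi
```

Listing the archive verifies only that it can be read; it does not confirm that the sandbox contents are complete.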
Restoring PVC data
Restore the backed-up data to the PVC. The script deletes the existing data in the sandbox, and then restores the backup data.
- Deactivate Knowledge Studio.
- Run the pvc-backup-restore.sh script with the restore command. The script runs the following operations:
  - Copies sandbox.tgz from the BACKUP_DIR that you specify to a temporary pod at RELEASE_NAME-ibm-watson-ks-aql-web-tooling-backup.
  - Deletes the data in /opt/ibm/watson/aql-web-tooling/target/sandbox.
  - Decompresses sandbox.tgz to /opt/ibm/watson/aql-web-tooling/target/sandbox.
  - Deletes the temporary pod.
- Reactivate Knowledge Studio.
Deactivate Knowledge Studio
You don't need to deactivate when you run the all-backup-restore.sh script because the script handles the process.
To ensure that users don't have access to Knowledge Studio when you back up or restore a single database, stop the Knowledge Studio front-end pods before you back up or restore data.
- Make sure that no training and evaluation processes are running. You can check job status with the following command:
  kubectl -n NAMESPACE get jobs
  Training jobs of Knowledge Studio are named in the format wks-train-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx, and evaluation jobs are named in the format wks-batch-apply-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx. If the COMPLETIONS column of a training job reads 0/1, that job is still running. Wait until all of the training jobs finish.
- Deactivate Knowledge Studio by using the following commands:
  kubectl -n NAMESPACE patch --type=merge wks wks -p '{"spec":{"global":{"quiesceMode":true}}}'
  kubectl -n NAMESPACE patch --type=merge wks wks -p '{"spec":{"mma":{"replicas":0}}}'
- Ensure that no Knowledge Studio pods exist, except datastore pods, by running the following command (this might take a few minutes):
  kubectl -n NAMESPACE get pod | grep -Ev 'minio|etcd|mongo|postgresql|gw-instance|Completed' | grep wks
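The two checks in this procedure can be expressed as filters over kubectl output. The following is a minimal sketch; the helper names running_wks_jobs and remaining_wks_pods are hypothetical, the job filter assumes that NAME and COMPLETIONS are the first two columns of the kubectl get jobs output, and it applies the 0/1 test to evaluation jobs as well as training jobs. The job names in the sample are invented.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get jobs` output on stdin and print the
# names of training or evaluation jobs whose COMPLETIONS column reads 0/1.
running_wks_jobs() {
  awk '$1 ~ /^wks-(train|batch-apply)-/ && $2 == "0/1" { print $1 }'
}

# Hypothetical helper: read `kubectl get pod` output on stdin and print the
# Knowledge Studio pods that remain, using the same filter as the last step.
remaining_wks_pods() {
  grep -Ev 'minio|etcd|mongo|postgresql|gw-instance|Completed' | grep wks
}

# Example with sample job output (invented names).
sample_jobs='NAME                 COMPLETIONS   DURATION   AGE
wks-train-1111       0/1           4m         4m
wks-train-2222       1/1           12m        2h
wks-batch-apply-33   0/1           1m         1m'

printf '%s\n' "$sample_jobs" | running_wks_jobs

# Live usage (requires cluster access):
#   kubectl -n NAMESPACE get jobs | running_wks_jobs
#   kubectl -n NAMESPACE get pod | remaining_wks_pods
```

Empty output from both filters suggests that it is safe to continue with the backup or restore.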
Reactivate Knowledge Studio
You don't need to reactivate when you run the all-backup-restore.sh script because the script handles the process.
To reactivate Knowledge Studio, use the following commands:
kubectl -n NAMESPACE patch --type=merge wks wks -p '{"spec":{"global":{"quiesceMode":false}}}'
kubectl -n NAMESPACE patch --type=merge wks wks -p '{"spec":{"mma":{"replicas":2}}}'