Set Up Cross-Region Replication
To set up cross-region replication (CRR), you set up Disaster Recovery (DR) clusters that sync with the primary cluster. Changes on the primary cluster are copied over to the DR cluster.
When necessary, you can fail over to a DR cluster, making it the new primary cluster.
This setup assumes that you are setting up a DR cluster for an existing primary cluster. If you are setting up both the primary cluster and the DR cluster from scratch, you only need to perform Step 2.4, Enable CRR on the DR cluster, after TigerGraph is installed on both clusters.
1. Before you begin
- Install the same version of TigerGraph, 3.2 or higher, on both the primary cluster and the DR cluster.
- Make sure that your DR cluster has the same number of partitions as the primary cluster.
- Make sure the username and password of the TigerGraph database user created on the DR cluster during installation match those of one of the users on the primary cluster who has the superuser role.
- If you choose to enable CRR and your DR cluster is in a different Virtual Private Cloud (VPC) than your primary cluster, make sure that TigerGraph is installed on your cluster with public IPs:
  - If you install interactively, make sure that you supply the public IPs of all nodes.
  - If you install non-interactively, make sure that the NodeList field of install_conf.json provides the public IPs of all nodes.
- Make sure TigerGraph is not installed with a local loopback IP such as 127.0.0.1. You can verify if you are using loopback IP with
2. Procedure
Complete the following steps to enable cross-region replication.
2.1. Back up primary data
Use GBAR to create a backup of the primary cluster. See Backup and Restore for how to create a backup.
If you are setting up both the primary cluster and the DR cluster from scratch, you can skip the backup, restore, and metadata deletion steps and only perform Step 2.4, Enable CRR on the DR cluster.
2.2. Restore on the DR cluster
Copy the backup files from every node of the primary cluster to every node of the DR cluster. Then restore the backup of the primary cluster on the DR cluster. See Backup and Restore for how to restore a backup.
2.3. Delete metadata records
This section applies only to specific sub-versions earlier than 3.5.2, 3.6.3, 3.7.1, 3.9.0, and 4.0. If you have the most up-to-date version of TigerGraph Server, you do not need to delete the metadata records.
After the restoration, but before enabling CRR on the DR cluster, delete the outdated Kafka metadata records.
Kafka provides a script called kafka-delete-records.sh.
It takes a JSON configuration file that specifies which records to delete: the topic, the partition, and the offset up to which records are removed.
If the offset is set to -1, the script deletes all records in the given partition.
The script operates at the Kafka partition level, so the configuration file must list every Kafka partition of the Metadata topic.
# Go to the Kafka bin directory on the DR cluster
$ cd $(gadmin config get System.AppRoot)/kafka/bin/
# Add Java to your PATH. TigerGraph already ships its own Java.
$ JAVA_HOME=$(dirname `find $(gadmin config get System.AppRoot)/.syspre -name java -type f`)
$ PATH=$PATH:$JAVA_HOME
# Run kafka-delete-records to delete the outdated records left over from the restore
$ bash kafka-delete-records.sh \
    --bootstrap-server $(gmyip):30002 \
    --offset-json-file <path to delete_config.json>
Here is an example of a delete_config.json file.
{
  "partitions": [
    {
      "topic": "<crr metadata topic name>",
      "partition": 0,
      "offset": -1
    }
  ],
  "version": 1
}
<crr metadata topic name> is the name of the Kafka topic from which CRR consumes metadata messages.
It is generated by concatenating the topic prefix and the string Metadata, separated by a period.
The topic prefix is set by the gadmin config entry System.CrossRegionReplication.TopicPrefix.
For example:
- If System.CrossRegionReplication.TopicPrefix is an empty string on the cluster, then <crr metadata topic name> is Metadata.
- If System.CrossRegionReplication.TopicPrefix is set to Primary, then <crr metadata topic name> is Primary.Metadata.
Make sure the topic name in the JSON config matches the prefix configured on your cluster.
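
Because the delete script needs one entry per partition, writing delete_config.json by hand gets tedious on topics with several partitions. The following is a minimal bash sketch, not an official TigerGraph tool, that applies the topic naming rule above and prints a config with offset -1 for every partition. The function names crr_metadata_topic and make_delete_config are hypothetical, and the partition count is an input you have to supply for your own cluster.

```shell
# Hypothetical helpers (not part of TigerGraph or Kafka). Requires bash.

# Derive the CRR metadata topic name from the topic prefix:
# "Metadata" for an empty prefix, "<prefix>.Metadata" otherwise.
crr_metadata_topic() {
  local prefix="$1"
  if [ -z "$prefix" ]; then
    printf 'Metadata\n'
  else
    printf '%s.Metadata\n' "$prefix"
  fi
}

# Print a delete config that removes all records (offset -1)
# from partitions 0..(count-1) of the given topic.
make_delete_config() {
  local topic="$1" count="$2" i sep=""
  printf '{\n  "partitions": [\n'
  for ((i = 0; i < count; i++)); do
    printf '%s    {"topic": "%s", "partition": %d, "offset": -1}' "$sep" "$topic" "$i"
    sep=$',\n'
  done
  printf '\n  ],\n  "version": 1\n}\n'
}

# Example usage (assumption: the Metadata topic has 3 partitions; check yours):
#   prefix=$(gadmin config get System.CrossRegionReplication.TopicPrefix)
#   make_delete_config "$(crr_metadata_topic "$prefix")" 3 > delete_config.json
```

The generated file can then be passed to kafka-delete-records.sh through --offset-json-file as shown earlier.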
2.4. Enable CRR on the DR cluster
Run the following commands on the DR cluster to enable CRR.
# Enable Kafka MirrorMaker
$ gadmin config set System.CrossRegionReplication.Enabled true
# IPs of the primary cluster for Kafka MirrorMaker, separated by ','
$ gadmin config set System.CrossRegionReplication.PrimaryKafkaIPs <PRIMARY_IP1,PRIMARY_IP2,PRIMARY_IP3>
# Kafka port of the primary cluster for Kafka MirrorMaker
$ gadmin config set System.CrossRegionReplication.PrimaryKafkaPort 30002
# Prefix of the GPE/GUI/GSQL Kafka topics; empty by default
$ gadmin config set System.CrossRegionReplication.TopicPrefix Primary
# Apply the config changes, init Kafka, and restart
$ gadmin config apply -y
$ gadmin init kafka -y
$ gadmin restart all -y
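
The PrimaryKafkaIPs value is a comma-separated list, and a loopback address there would point the DR cluster back at itself. As an illustration only, the bash sketch below joins a list of IPs with commas and rejects loopback entries before the value is handed to gadmin. The function name join_primary_ips is hypothetical and the addresses in the usage comment are placeholders.

```shell
# Hypothetical helper (not part of TigerGraph). Requires bash.
# Joins its arguments with ',' and refuses loopback addresses.
join_primary_ips() {
  local ip out=""
  if [ "$#" -eq 0 ]; then
    echo "error: no IPs given" >&2
    return 1
  fi
  for ip in "$@"; do
    case "$ip" in
      127.*|localhost)
        echo "error: loopback address not allowed: $ip" >&2
        return 1
        ;;
    esac
    # Append with a comma only when out is already non-empty.
    out="${out:+$out,}$ip"
  done
  printf '%s\n' "$out"
}

# Example usage (placeholder addresses):
#   ips=$(join_primary_ips 10.0.0.1 10.0.0.2 10.0.0.3) &&
#     gadmin config set System.CrossRegionReplication.PrimaryKafkaIPs "$ips"
```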
3. Restrictions on the DR cluster
After being set up, the DR cluster will be read-only and all data update operations will be blocked. This includes the following operations:
- All metadata operations
  - Schema changes
  - User access management operations
  - Query creation, installation, and dropping
  - User-defined function operations
- Data-loading operations
  - Loading job operations
  - RESTPP calls that modify graph data
- Queries that modify the graph
4. Sync an outdated DR cluster
When the primary cluster executes an IMPORT, DROP ALL, or CLEAR GRAPH STORE GSQL command, or the gsql --reset bash command, the services on the DR cluster stop syncing with the primary and become outdated.
To bring an outdated cluster back in sync, generate a fresh backup of the primary cluster and perform the setup steps detailed on this page again. However, you can skip Step 2.4, Enable CRR on the DR cluster, because CRR will already be enabled on the DR cluster.
5. Updating a CRR system
From time to time, you may want to update the TigerGraph software on a CRR system. To perform this correctly, follow this sequence of steps.
- Disable CRR on your DR cluster:
  $ gadmin config set System.CrossRegionReplication.Enabled false
  $ gadmin config apply -y
  $ gadmin restart all -y
- Upgrade both the primary cluster and the DR cluster.
- Enable CRR on the DR cluster.
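
The disable and re-enable steps differ only in the value of the Enabled flag, so they can be wrapped in one function. The bash sketch below is an assumption-laden illustration: set_crr_enabled and run are hypothetical names, and DRY_RUN defaults to 1 so that the commands are only printed. It reuses the three gadmin commands shown on this page. This page does not state whether gadmin init kafka is also needed when re-enabling after an upgrade, so the sketch leaves it out.

```shell
# Hypothetical wrapper around the gadmin commands shown on this page. Requires bash.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage: set_crr_enabled true|false
set_crr_enabled() {
  case "$1" in
    true|false) ;;
    *) echo "usage: set_crr_enabled true|false" >&2; return 1 ;;
  esac
  # Stop at the first failing command.
  run gadmin config set System.CrossRegionReplication.Enabled "$1" &&
    run gadmin config apply -y &&
    run gadmin restart all -y
}

# Example (run on the DR cluster):
#   DRY_RUN=0 set_crr_enabled false   # before the upgrade
#   DRY_RUN=0 set_crr_enabled true    # after both clusters are upgraded
```

Keeping the dry-run default makes it easy to review the exact commands before running them against a live DR cluster.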