One Command to Set Up a Migration Demo from Aurora to TiDB Cloud with the API

張春華
5 min read · Jul 3, 2023

In this post, I walk through a demo that migrates an AWS Aurora database to TiDB Cloud using the AWS SDK and the TiDB Cloud OpenAPI, with minimal manual work. It covers AWS resource creation, full data migration, and incremental replication.

  • AWS Aurora cluster setup: Create the Aurora cluster with the Go SDK according to the provided config file.
  • EC2 node generation for the workstation and DM cluster: Generate the EC2 nodes for the workstation and the DM cluster. The workstation runs TiUP to deploy the DM cluster and is also used for data comparison, table creation, and data generation. The DM cluster replicates data from AWS Aurora to TiDB Cloud.
  • TiDB Cloud setup: The TiDB Cloud API is used to create the TiDB Cloud cluster. See the OpenAPI interface and the TiDB Cloud Go SDK for reference.
  • Private link setup between TiDB Cloud and the workstation/DM cluster (interactive operation): Setting up the private link between the workstation/DM cluster and TiDB Cloud requires an interactive step, because TiDB Cloud does not yet provide an API for it.
  • Test table creation and test data generation: Create one table and insert a few rows for testing.
  • Binlog position capture from AWS Aurora: Record the binlog position before taking the snapshot. The DM replication task is created from this position.
  • AWS Aurora snapshot: Take an Aurora snapshot so that the data can be exported to S3.
  • Snapshot export to S3 as Parquet: Export the snapshot to S3 in Parquet format, which TiDB Cloud can import.
  • Data import to TiDB Cloud: TiDB Cloud provides a data import API, which makes data integration easier.
  • DM setup for replication: The TiDB Cloud API does not yet support DM. To keep the demo simple, the DM cluster is deployed on EC2 nodes in AWS.
  • Data comparison between TiDB Cloud and Aurora: sync-diff-inspector compares Aurora with TiDB Cloud to make sure that the data has been migrated successfully.

Demo execution

pi@local$ more /tmp/aurora2tidbcloud.yaml
workstation:
  cidr: 172.82.0.0/16
  instance_type: c5.2xlarge
  keyname: key-name # public key name
  keyfile: /home/pi/.ssh/private-key-name # private key name
  username: admin
  imageid: ami-07d02ee1eeb0c996c
  volumeSize: 100
  #shared: false
aurora:
  cidr: 172.84.0.0/16
  instance_type: db.r5.large
  db_parameter_family_group: aurora-mysql5.7
  engine: aurora-mysql
  engine_version: 5.7.mysql_aurora.2.11.2
  db_username: admin
  db_password: 1234Abcd
  public_accessible_flag: false
  s3backup_folder: s3://jay-data/aurora-export/ # S3 directory for data export
aws_topo_configs:
  general:
    # debian os
    imageid: ami-07d02ee1eeb0c996c # Default image id for EC2
    keyname: jay-us-east-01 # Public key to access the EC2 instance
    keyfile: /home/pi/.ssh/jay-us-east-01.pem # Private key to access the EC2 instance
    cidr: 172.83.0.0/16 # The CIDR for the VPC
    instance_type: m5.2xlarge # Default instance type
    tidb_version: v6.5.2 # TiDB version
    excluded_az: # The AZ to be excluded for the subnets
      - us-east-1e
    network_type: nat
  dm-master:
    instance_type: t2.small # Instance type for dm master
    count: 1 # Number of dm master nodes to be deployed
  dm-worker:
    instance_type: t2.small # Instance type for dm worker
    count: 1 # Number of dm worker nodes to be deployed
tidb_cloud:
  tidbcloud_project_id: 1111113089206752222 # The TiDB Cloud project in which the TiDB cluster is created
  cloud_provider: AWS
  cluster_type: DEDICATED
  region: us-east-1
  components:
    tidb:
      node_quantity: 1
      node_size: 2C8G
    tikv:
      node_quantity: 3
      node_size: 2C8G
      storage_size_gib: 200
  ip_access_list:
    cidr: 0.0.0.0/0
    description: Data migration from aurora to TiDB Test
  port: 4000
  user: root
  password: 1234Abcd
  databases:
    - test01
    - test02
pi@local$ ./bin/aws aurora2tidbcloud deploy aurora2tidbcloud /tmp/aurora2tidbcloud.yaml
Parallel Main step ... Echo: Create TransitGateway ... ...
... ...
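
The config above gives the workstation, the DM cluster, and Aurora each their own VPC CIDR block, and the deploy log shows a transit gateway being created, presumably to route between them. VPCs routed through a transit gateway need non-overlapping address ranges, so a pre-flight check like the following Go sketch can catch a bad config early. The `vpc` type and `findOverlap` function are hypothetical and not part of the demo tool; only the three CIDR values come from the config file.

```go
package main

import (
	"fmt"
	"net/netip"
)

// vpc pairs a label with the CIDR block configured for it (hypothetical type).
type vpc struct{ Name, CIDR string }

// findOverlap returns the names of the first pair of VPCs whose CIDR
// blocks overlap. found is false when all blocks are disjoint.
func findOverlap(vpcs []vpc) (a, b string, found bool, err error) {
	prefixes := make([]netip.Prefix, len(vpcs))
	for i, v := range vpcs {
		p, perr := netip.ParsePrefix(v.CIDR)
		if perr != nil {
			return "", "", false, fmt.Errorf("%s: %w", v.Name, perr)
		}
		prefixes[i] = p.Masked()
	}
	for i := range prefixes {
		for j := i + 1; j < len(prefixes); j++ {
			if prefixes[i].Overlaps(prefixes[j]) {
				return vpcs[i].Name, vpcs[j].Name, true, nil
			}
		}
	}
	return "", "", false, nil
}

func main() {
	vpcs := []vpc{
		{"workstation", "172.82.0.0/16"},
		{"dm cluster", "172.83.0.0/16"},
		{"aurora", "172.84.0.0/16"},
	}
	a, b, found, err := findOverlap(vpcs)
	switch {
	case err != nil:
		fmt.Println("invalid config:", err)
	case found:
		fmt.Printf("%s and %s overlap\n", a, b)
	default:
		fmt.Println("no overlapping CIDR blocks")
	}
}
```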

Private Link Setup between the Workstation and TiDB Cloud

Once the workstation and the TiDB Cloud cluster are set up, the script prompts for the endpoint service name, which is provided by TiDB Cloud. You have to switch to the TiDB Cloud console to get this information.

Go to the TiDB Cloud console to check that the TiDB cluster has been created.

The CLI command is already included in the script, so there is no need to run it manually.

Repeat the same process for the private link between TiDB Cloud and the DM cluster VPC.

Input the private link connection host

Execution summary

The execution time of each step is shown once the demo setup is complete.

Result verification

Confirm that the data has been copied to TiDB Cloud and that DM replication is working.
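
The demo relies on sync-diff-inspector for this check. To illustrate the general idea behind a checksum-based comparison, here is a simplified Go sketch that compares two row sets by row count and an order-independent checksum. It is not how sync-diff-inspector is implemented; `tableChecksum` and the sample rows are hypothetical, and the rows are held in memory rather than read from Aurora and TiDB Cloud.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"strings"
)

// tableChecksum returns the row count and the XOR of the CRC32 of every
// row. XOR is commutative, so the result does not depend on row order.
// The count is returned too, because pairs of identical rows cancel out
// in the XOR.
func tableChecksum(rows [][]string) (count int, sum uint32) {
	for _, row := range rows {
		// Join columns with a NUL byte so ("ab","c") differs from ("a","bc").
		sum ^= crc32.ChecksumIEEE([]byte(strings.Join(row, "\x00")))
	}
	return len(rows), sum
}

func main() {
	aurora := [][]string{{"1", "alice"}, {"2", "bob"}}
	tidb := [][]string{{"2", "bob"}, {"1", "alice"}} // same rows, different order
	ac, as := tableChecksum(aurora)
	tc, ts := tableChecksum(tidb)
	fmt.Println("match:", ac == tc && as == ts)
}
```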

Conclusion

What can we use this script for?

First, it helps set up the demo very quickly. Without much effort, the setup completes within 90 minutes.

Second, this demo can serve as the basis for a tool that migrates Aurora to TiDB Cloud. In the future, we should be able to complete the data migration with a single command.

What do I expect from the OpenAPI? With the APIs below, the process could be automated much more easily.

  • TiDB Cloud API to fetch private endpoint service
  • TiDB Cloud API to accept the private endpoint connection
  • TiDB Cloud API to fetch private endpoint service host name
  • TiDB Cloud API to support DM
  • TiDB Cloud API to support TiCDC

Originally published at https://github.com.
