Migrate MySQL-Compatible Databases to TiDB Cloud Using Data Migration
This document describes how to migrate data from a MySQL-compatible database on a cloud provider (Amazon Aurora MySQL or Amazon Relational Database Service (RDS)) or on-premises to TiDB Cloud using the Data Migration feature of the TiDB Cloud console.
This feature helps you migrate your database and its ongoing changes to TiDB Cloud (either in the same region or cross regions). Compared with solutions that require tools such as Dumpling and TiDB Lightning, this feature is easier to use. You do not need to manually dump data from the source database and then import it to TiDB Cloud. Instead, you can migrate data directly from the source database to TiDB Cloud in one go.
Limitations
The Data Migration feature is available only for Dedicated Tier clusters.
The Data Migration feature is only available to clusters in the projects that are created in the following regions after November 9, 2022. If your project was created before the date or if your cluster is in another region, this feature is not available to your cluster and the Data Migration tab will not be displayed on the cluster overview page in the TiDB Cloud console.
- AWS Oregon (us-west-2)
- AWS N. Virginia (us-east-1)
- AWS Mumbai (ap-south-1)
- AWS Singapore (ap-southeast-1)
- AWS Tokyo (ap-northeast-1)
- AWS Frankfurt (eu-central-1)
- AWS Seoul (ap-northeast-2)
You can create up to 200 migration jobs for each organization. To create more migration jobs, you need to file a support ticket.
The system databases will be filtered out and not migrated to TiDB Cloud even if you select all of the databases to migrate. That is,
mysql
,information_schema
,information_schema
, andsys
will not be migrated using this feature.During full data migration, if the table to be migrated already exists in the target database with duplicated keys, the duplicate keys will be replaced.
During incremental data migration, if the table to be migrated already exists in the target database with duplicated keys, an error is reported and the migration is interrupted. In this situation, you need to make sure whether the upstream data is accurate. If yes, click the "Restart" button of the migration job and the migration job will replace the downstream conflicting records with the upstream records.
When you delete a cluster in TiDB Cloud, all migration jobs in that cluster are automatically deleted and not recoverable.
During incremental replication (migrating ongoing changes to your cluster), if the migration job recovers from an abrupt error, it might open the safe mode for 60 seconds. During the safe mode,
INSERT
statements are replicated asREPLACE
,UPDATE
statements asDELETE
andREPLACE
, and then these transactions are replicated to the downstream cluster to make sure that all the data during the abrupt error has been migrated smoothly to the downstream cluster. For upstream tables without primary keys or not-null unique indexes, some data might be duplicated in the downstream cluster because the data might be inserted repeatedly to the downstream.When you use Data Migration, it is recommended to keep the size of your dataset smaller than 1 TiB. If the dataset size is larger than 1 TiB, the full data migration will take a long time due to limited specifications.
In the following scenarios, if the migration job takes longer than 24 hours, do not purge binlogs in the source database to ensure that Data Migration can get consecutive binlogs for incremental replication:
- During full data migration.
- After the full data migration is completed and when incremental data migration is started for the first time, the latency is not 0ms.
Prerequisites
Before performing the migration, you need to check the data sources, prepare privileges for upstream and downstream databases, and set up network connections.
Make sure your data source and version are supported
Data Migration supports the following data sources and versions:
- MySQL 5.6, 5.7, and 8.0 local instances or on a public cloud provider. Note that MySQL 8.0 is still experimental on TiDB Cloud and might have incompatibility issues.
- Amazon Aurora (MySQL 5.6 and 5.7)
- Amazon RDS (MySQL 5.7)
Grant required privileges to the upstream database
The username you use for the upstream database must have all the following privileges:
Privilege | Scope |
---|---|
SELECT | Tables |
LOCK | Tables |
REPLICATION SLAVE | Global |
REPLICATION CLIENT | Global |
For example, you can use the following GRANT
statement to grant corresponding privileges:
GRANT SELECT,LOCK TABLES,REPLICATION SLAVE,REPLICATION CLIENT ON *.* TO 'your_user'@'your_IP_address_of_host'
Grant required privileges to the downstream TiDB Cloud cluster
The username you use for the downstream TiDB Cloud cluster must have the following privileges:
Privilege | Scope |
---|---|
CREATE | Databases, Tables |
SELECT | Tables |
INSERT | Tables |
UPDATE | Tables |
DELETE | Tables |
ALTER | Tables |
DROP | Databases, Tables |
INDEX | Tables |
TRUNCATE | Tables |
For example, you can execute the following GRANT
statement to grant corresponding privileges:
GRANT CREATE,SELECT,INSERT,UPDATE,DELETE,ALTER,TRUNCATE,DROP,INDEX ON *.* TO 'your_user'@'your_IP_address_of_host'
To quickly test a migration job, you can use the root
account of the TiDB Cloud cluster.
Set up network connection
Before creating a migration job, set up the network connection according to your connection methods. See Connect to Your TiDB Cluster.
If you use public IP (this is, standard connection) for network connection, make sure that the upstream database can be connected through the public network.
If you use AWS PrivateLink, set it up according to Set Up Private Endpoint Connections.
If you use VPC Peering, see the following instructions to configure the network.
Set up VPC Peering
If your MySQL service is in an AWS VPC, take the following steps:
Set up a VPC peering connection between the VPC of the MySQL service and your TiDB cluster.
Modify the inbound rules of the security group that the MySQL service is associated with.
You must add the CIDR of the region where your TiDB Cloud cluster is located to the inbound rules. Doing so allows the traffic to flow from your TiDB cluster to the MySQL instance.
If the MySQL URL contains a DNS hostname, you need to allow TiDB Cloud to be able to resolve the hostname of the MySQL service.
- Follow the steps in Enable DNS resolution for a VPC peering connection.
- Enable the Accepter DNS resolution option.
Enable binlogs
To perform incremental data migration, make sure you have enabled binlogs of the upstream database, and the binlogs have been kept for more than 24 hours.
Step 1: Go to the Data Migration page
Log in to the TiDB Cloud console and navigate to the Clusters page of your project.
Click the name of your target cluster to go to its overview page, and then click Data Migration in the left navigation pane.
On the Data Migration page, click Create Migration Job in the upper-right corner. The Create Migration Job page is displayed.
Step 2: Configure the source and target connection
On the Create Migration Job page, configure the source and target connection.
Enter a job name, which must start with a letter and must be less than 60 characters. Letters (A-Z, a-z), numbers (0-9), underscores (_), and hyphens (-) are acceptable.
Fill in the source connection profile.
- Data source: the data source type.
- Region: the region of the data source, which is required for cloud databases only.
- Connectivity method: the connection method for the data source. Currently, you can choose public IP, VPC Peering, or Private Link according to your connection method.
- Hostname or IP address (for public IP and VPC Peering): the hostname or IP address of the data source.
- Service Name (for Private Link): the endpoint service name.
- Port: the port of the data source.
- Username: the username of the data source.
- Password: the password of the username.
- SSL/TLS: if you enable SSL/TLS, you need to upload the certificates of the data source, including any of the following:
- only the CA certificate
- the client certificate and client key
- the CA certificate, client certificate and client key
Fill in the target connection profile.
- Username: enter the username of the target cluster in TiDB Cloud.
- Password: enter the password of the TiDB Cloud username.
Click Validate Connection and Next to validate the information you have entered.
Take action according to the message you see:
- If you use Public IP or VPC Peering, you need to add the Data Migration service's IP addresses to the IP Access List of your source database and firewall (if any).
- If you use Private Link, you are prompted to accept the endpoint request. Go to the AWS VPC console, and click Endpoint services to accept the endpoint request.
Step 3: Choose the objects to be migrated
Choose full data migration, incremental data migration, or both by choosing the checkboxes.
On the Choose Objects to Migrate page, select the objects to be migrated. You can click All to select all objects, or click Customize and then click the checkbox next to the object name to select the object.
If you click All, the migration job will migrate the existing data from the whole source database instance to TiDB Cloud and replicate ongoing changes after the full migration. Note that it happens only if you have selected the Full data migration and Incremental data migration checkboxes in the previous step.
If you click Customize and select some databases, the migration job will migrate the existing data and replicate ongoing changes of the selected databases to TiDB Cloud. Note that it happens only if you have selected the Full data migration and Incremental data migration checkboxes in the previous step.
If you click Customize and select some tables under a dataset name, the migration job only will migrate the existing data and replicate ongoing changes of the selected tables. Tables created afterwards in the same database will not be migrated.
Click Next.
Step 4: Precheck
On the Precheck page, you can view the precheck results. If the precheck fails, you need to operate according to Failed or Warning details, and then click Check again to recheck.
If there are only warnings on some check items, you can evaluate the risk and consider whether to ignore the warnings. If all warnings are ignored, the migration job will automatically go on to the next step.
For more information about errors and solutions, see Precheck errors and solutions.
For more information about precheck items, see Migration Task Precheck.
If all check items show Pass, click Next.
Step 5: Choose a spec and start migration
On the Choose a Spec and Start Migration page, select an appropriate migration specification according to your performance requirements. For more information about the specifications, see Specifications for Data Migration.
After selecting the spec, click Create Job and Start to start the migration.
Step 6: View the migration progress
After the migration job is created, you can view the migration progress on the Migration Job Details page. The migration progress is displayed in the Stage and Status area.
You can pause or delete a migration job when it is running.
If a migration job has failed, you can restart it after solving the problem.
You can delete a migration job in any status.
If you encounter any problems during the migration, see Migration errors and solutions.