PostgreSQL, also known as “Postgres,” is an open-source relational database management system (RDBMS). It has seen a drastic growth in popularity in recent years, with many developers and companies migrating their data to Postgres from other database solutions.
The prospect of migrating a database can be intimidating, especially when migrating from one database management system to another. pgLoader is an open-source database migration tool that aims to simplify the process of migrating to PostgreSQL. It supports migrations from several file types and RDBMSs, including MySQL and SQLite, to PostgreSQL.
This tutorial provides instructions on how to install pgLoader and use it to migrate a remote MySQL database to PostgreSQL over an SSL connection. Near the end of the tutorial, we will also briefly touch on a few different migration scenarios where pgLoader may be useful.
To complete this tutorial, you’ll need the following:
Access to two servers, each running Ubuntu 18.04. Both servers should have a firewall and a non-root user with sudo privileges configured. To set these up, you can follow our Initial Server Setup guide for Ubuntu 18.04.
MySQL installed on one of the servers. To set this up, follow Steps 1, 2, and 3 of our guide on How To Install MySQL on Ubuntu 18.04. Please note that in order to complete all the prerequisite tutorials linked here, you will need to configure your root MySQL user to authenticate with a password, as described in Step 3 of the MySQL installation guide.
PostgreSQL installed on the other server. To set this up, complete Step 1 of our guide How To Install and Use PostgreSQL on Ubuntu 18.04.
Your MySQL server should also be configured to accept encrypted connections. To set this up, complete every step of our tutorial on How To Configure SSL/TLS for MySQL on Ubuntu 18.04, including the optional Step 6. As you follow this guide, be sure to use your PostgreSQL server as the MySQL client machine, as you will need to be able to connect to your MySQL server from your Postgres machine in order to migrate the data with pgLoader.
Please note that throughout this guide, the server on which you installed MySQL will be referred to as the “MySQL server”, and the other server will be referred to as the “PostgreSQL” or “Postgres” server. The instructions indicate which machine a command should be run on, so please keep this distinction in mind as you follow the tutorial to avoid any confusion.
This step describes the process of creating a test database and populating it with dummy data. We encourage you to practice using pgLoader with this test case, but if you already have a database you want to migrate, you can move on to the next step.
Start by opening up the MySQL prompt on your MySQL server:
After entering your root MySQL user’s password, you will see the MySQL prompt.
From there, create a new database by running the following command. You can name your database whatever you’d like, but in this guide we will name it source_db:
CREATE DATABASE source_db;
Then switch to this database with the USE command:
USE source_db;
Output
Database changed
Within this database, use the following command to create a sample table. Here, we will name this table sample_table, but feel free to give it another name:
CREATE TABLE sample_table (
Then populate this table with some sample employee data using the following command:
INSERT INTO sample_table (employee_id, first_name, last_name, start_date, salary)
Following this, you can close the MySQL prompt:
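Typing exit closes the prompt. The statements in this step can also be collected into a single SQL script and fed to the mysql client in one go. The following is a minimal sketch rather than the exact commands from this guide: the file name sample_data.sql and the column types are assumptions (the types are inferred from the sample output shown later in this tutorial), while the column names and row values match the sample data used here.

```shell
# Write the Step 1 statements to a script (the file name is hypothetical).
SQL_FILE="${TMPDIR:-/tmp}/sample_data.sql"

# The quoted 'EOF' keeps the shell from expanding the $ in the salary values.
cat > "$SQL_FILE" <<'EOF'
CREATE DATABASE IF NOT EXISTS source_db;
USE source_db;

-- Column types are assumptions; adjust them to suit your data.
CREATE TABLE sample_table (
    employee_id INT PRIMARY KEY,
    first_name  VARCHAR(50),
    last_name   VARCHAR(50),
    start_date  DATE,
    salary      VARCHAR(50)
);

INSERT INTO sample_table (employee_id, first_name, last_name, start_date, salary)
VALUES (1, 'Elizabeth', 'Cotten',      '2007-11-11', '$105433.18'),
       (2, 'Yanka',     'Dyagileva',   '2017-10-30', '$107540.67'),
       (3, 'Lee',       'Dorsey',      '2013-06-04', '$118024.04'),
       (4, 'Kasey',     'Chambers',    '2010-08-18', '$116456.98'),
       (5, 'Bram',      'Tchaikovsky', '2018-09-16', '$61989.50');
EOF

echo "Wrote $SQL_FILE"
# On the MySQL server, load it as the root MySQL user (prompts for the password):
#   mysql -u root -p < "$SQL_FILE"
```

Running mysql -u root -p without the redirect opens the interactive prompt instead, where the same statements can be entered one at a time.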
Now that you have a sample database loaded with dummy data, you can move on to the next step in which you will install pgLoader on your PostgreSQL server.
pgLoader is a program that can load data into a PostgreSQL database from a variety of different sources. It uses PostgreSQL’s COPY command to copy data from a source database or file, such as a comma-separated values (CSV) file, into a target PostgreSQL database.
pgLoader is available from the default Ubuntu APT repositories and you can install it using the apt command. However, in this guide we will take advantage of pgLoader’s useSSL option, a feature that allows for migrations from MySQL over an SSL connection. This feature is only available in the latest version of pgLoader which, as of this writing, can only be installed using the source code from its GitHub repository.
Before installing pgLoader, you will need to install its dependencies. If you haven’t done so recently, update your Postgres server’s package index:
Then install the following packages:
sbcl: A Common Lisp compiler
unzip: A de-archiver for .zip files
libsqlite3-dev: A collection of development files for SQLite 3
gawk: Short for “GNU awk”, a pattern scanning and processing language
curl: A command line tool for transferring data from a URL
make: A utility for managing package compilation
freetds-dev: A client library for MS SQL and Sybase databases
libzip-dev: A library for reading, creating, and modifying zip archives
Use the following command to install these dependencies:
When prompted, confirm that you want to install these packages by pressing ENTER.
Next, navigate to the pgLoader GitHub project’s Releases page and find the latest release. For this guide, we will use the latest release at the time of this writing: version 3.6.1. Scroll down to its Assets menu and copy the link for the tar.gz file labeled Source code. Then paste the link into the following wget command. This will download the tarball to your server:
wget https://github.com/dimitri/pgloader/archive/v3.6.1.tar.gz
Extract the tarball:
tar xvf v3.6.1.tar.gz
This will create a number of new directories and files on your server. Navigate into the new pgLoader parent directory:
cd pgloader-3.6.1/
Then use the make utility to compile the pgloader binary:
This command will take some time to build the pgloader binary.
Move the binary file into the /usr/local/bin directory, the location where Ubuntu looks for executable files:
You can test that pgLoader was installed correctly by checking its version, like so:
Output
pgloader version "3.6.1"
compiled with SBCL 1.4.5.debian
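Taken together, the installation in this step comes down to a short sequence of commands. The sketch below only prints that sequence so you can review it before running anything. The apt, make pgloader, mv, and pgloader --version lines are the usual way to build and check pgLoader 3.6.1 from source on Ubuntu and are given here as assumptions, and the function name print_build_steps is hypothetical.

```shell
PGLOADER_VERSION="3.6.1"
DEPS="sbcl unzip libsqlite3-dev gawk curl make freetds-dev libzip-dev"

# Print the build steps instead of running them, so they can be reviewed first.
print_build_steps() {
  cat <<EOF
sudo apt update
sudo apt install $DEPS
wget https://github.com/dimitri/pgloader/archive/v$PGLOADER_VERSION.tar.gz
tar xvf v$PGLOADER_VERSION.tar.gz
cd pgloader-$PGLOADER_VERSION/
make pgloader
sudo mv ./build/bin/pgloader /usr/local/bin/
pgloader --version
EOF
}

print_build_steps
```

To perform the installation, run the printed lines in order on the PostgreSQL server.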
pgLoader is now installed, but before you can begin your migration you’ll need to make some configuration changes to both your PostgreSQL and MySQL instances. We’ll focus on the PostgreSQL server first.
The pgloader command works by copying source data, either from a file or directly from a database, and inserting it into a PostgreSQL database. For this reason, you must either run pgLoader as a Linux user who has access to your Postgres database or you must specify a PostgreSQL role with the appropriate permissions in your load command.
PostgreSQL manages database access through the use of roles. Depending on how the role is configured, it can be thought of as either a database user or a group of database users. In most RDBMSs, you create a user with the CREATE USER SQL command. Postgres, however, comes installed with a handy script called createuser. This script serves as a wrapper for the CREATE USER SQL command that you can run directly from the command line.
Note: In PostgreSQL, you authenticate as a database user using the Identification Protocol, or ident, authentication method by default, rather than with a password. This involves PostgreSQL taking the client’s Ubuntu username and using it as the allowed Postgres database username. This allows for greater security in many cases, but it can also cause issues in instances where you’d like an outside program to connect to one of your databases.
pgLoader can load data into a Postgres database through a role that authenticates with the ident method as long as that role shares the same name as the Linux user profile issuing the pgloader command. However, to keep this process as clear as possible, this tutorial describes setting up a different PostgreSQL role that authenticates with a password rather than with the ident method.
Run the following command on your Postgres server to create a new role. Note the -P flag, which tells createuser to prompt you to enter a password for the new role:
You may first be prompted for your sudo password. The script will then prompt you to enter a name for the new role. In this guide, we’ll call this role pgloader_pg:
Output
Enter name of role to add: pgloader_pg
Following that, createuser will prompt you to enter and confirm a password for this role. Be sure to take note of this password, as you’ll need it to perform the migration in Step 5:
Output
Enter password for new role:
Enter it again:
Lastly, the script will ask you if the new role should be classified as a superuser. In PostgreSQL, connecting to the database with a superuser role allows you to circumvent all of the database’s permissions checks, except for the right to log in. Because of this, the superuser privilege should not be used lightly, and the PostgreSQL documentation recommends that you do most of your database work as a non-superuser role. However, because pgLoader needs broad privileges to access and load data into tables, you can safely grant this new role superuser privileges. Do so by typing y and then pressing ENTER:
Output
. . .
Shall the new role be a superuser? (y/n) y
PostgreSQL comes with another useful script that allows you to create a database from the command line. Since pgLoader also needs a target database into which it can load the source data, run the following command to create one. We’ll name this database new_db, but feel free to modify that if you like:
sudo -u postgres createdb new_db
If there aren’t any errors, this command will complete without any output.
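For reference, the role and database commands in this step can be summarized as below. This is a sketch under assumptions: the createuser invocation is the interactive form implied by the prompts described above (using its --interactive and -P options), and the final check with psql's \du meta-command is an optional addition that is not part of this guide. The snippet only prints the commands.

```shell
ROLE_NAME="pgloader_pg"   # role name used in this guide
TARGET_DB="new_db"        # target database used in this guide

# Interactive form: prompts for the role name, a password (-P), and superuser status.
CREATE_ROLE_CMD="sudo -u postgres createuser --interactive -P"
# Creates the empty target database.
CREATE_DB_CMD="sudo -u postgres createdb $TARGET_DB"
# Optional check that the role exists, using psql's \du meta-command.
CHECK_ROLE_CMD="sudo -u postgres psql -c '\\du $ROLE_NAME'"

printf '%s\n' "$CREATE_ROLE_CMD" "$CREATE_DB_CMD" "$CHECK_ROLE_CMD"
```

If the role was created as described, the \du listing should show pgloader_pg with the Superuser attribute.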
Now that you have a dedicated PostgreSQL user and an empty database into which you can load your MySQL data, there are just a few more changes you’ll need to make before performing a migration. You’ll need to create a dedicated MySQL user with access to your source database and add your client-side certificates to Ubuntu’s trusted certificate store.
Protecting data from snoopers is one of the most important parts of any database administrator’s job. Migrating data from one machine to another opens up an opportunity for malicious actors to sniff the packets traveling over the network connection if it isn’t encrypted. In this step, you will create a dedicated MySQL user which pgLoader will use to perform the migration over an SSL connection.
Begin by opening up your MySQL prompt:
From the MySQL prompt, use the following CREATE USER command to create a new MySQL user. We will name this user pgloader_my. Because this user will only access MySQL from your PostgreSQL server, be sure to replace your_postgres_server_ip with the public IP address of your PostgreSQL server. Additionally, replace password with a secure password or passphrase:
CREATE USER 'pgloader_my'@'your_postgres_server_ip' IDENTIFIED BY 'password' REQUIRE SSL;
Note the REQUIRE SSL clause at the end of this command. This will restrict the pgloader_my user to only access the database through a secure SSL connection.
Next, grant the pgloader_my user access to the source database and all of its tables. Here, we’ll specify the database we created in the optional Step 1, but if you have your own database you’d like to migrate, use its name in place of source_db:
GRANT ALL ON source_db.* TO 'pgloader_my'@'your_postgres_server_ip';
Then run the FLUSH PRIVILEGES command to reload the grant tables, enabling the privilege changes:
After this, you can close the MySQL prompt:
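Typing exit closes the prompt. As an alternative to entering the statements interactively, the MySQL-side setup from this step can be generated from a single pair of variables, which helps ensure that the CREATE USER and GRANT statements name exactly the same user and host. This is a sketch under assumptions: the IP address is a documentation placeholder, the password is a placeholder, and the file name create_pgloader_user.sql is hypothetical.

```shell
PG_SERVER_IP="203.0.113.10"        # placeholder: your PostgreSQL server's public IP
MYSQL_USER_PASSWORD="change_me"    # placeholder: use a strong passphrase
SQL_FILE="${TMPDIR:-/tmp}/create_pgloader_user.sql"   # hypothetical file name

# Both statements reference the same 'user'@'host' pair.
cat > "$SQL_FILE" <<EOF
CREATE USER 'pgloader_my'@'$PG_SERVER_IP' IDENTIFIED BY '$MYSQL_USER_PASSWORD' REQUIRE SSL;
GRANT ALL ON source_db.* TO 'pgloader_my'@'$PG_SERVER_IP';
FLUSH PRIVILEGES;
EOF

cat "$SQL_FILE"
# On the MySQL server:  mysql -u root -p < "$SQL_FILE"
```

Because the generated file contains the password in plain text, delete it once the statements have been applied.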
Now go back to your Postgres server terminal and attempt to log in to the MySQL server as the new pgloader_my user. If you followed the prerequisite guide on configuring SSL/TLS for MySQL then you will already have mysql-client installed on your PostgreSQL server and you should be able to connect with the following command:
mysql -u pgloader_my -p -h your_mysql_server_ip
If the command is successful, you will see the MySQL prompt:
After confirming that your pgloader_my user can successfully connect, go ahead and close the prompt:
At this point, you have a dedicated MySQL user that can access the source database from your Postgres machine. However, if you were to try to migrate your MySQL database using SSL the attempt would fail.
The reason for this is that pgLoader isn’t able to read MySQL’s configuration files, and thus doesn’t know where to look for the CA certificate or client certificate that you copied to your PostgreSQL server in the prerequisite SSL/TLS configuration guide. Rather than ignoring SSL requirements, though, pgLoader requires the use of trusted certificates in cases where SSL is needed to connect to MySQL. Accordingly, you can resolve this issue by adding the ca.pem and client-cert.pem files to Ubuntu’s trusted certificate store.
To do this, copy over the ca.pem and client-cert.pem files to the /usr/local/share/ca-certificates/ directory. Note that you must also rename these files so they have the .crt file extension. If you don’t rename them, your system will not be able to recognize that you’ve added these new certificates:
Following this, run the update-ca-certificates command. This program looks for certificates within /usr/local/share/ca-certificates, adds any new ones to the /etc/ssl/certs/ directory, and generates a list of trusted SSL certificates (ca-certificates.crt) based on the contents of the /etc/ssl/certs/ directory:
Output
Updating certificates in /etc/ssl/certs...
2 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
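The certificate steps above can be sketched as follows. The source directory is an assumption: CERT_SRC should point to wherever you placed ca.pem and client-cert.pem on the PostgreSQL server while following the prerequisite SSL/TLS guide, and the default used here (a client-ssl directory in your home directory) is only a guess. The function name print_cert_steps is hypothetical, and the snippet prints the commands rather than running them.

```shell
CERT_SRC="${CERT_SRC:-$HOME/client-ssl}"       # assumption: location of the copied .pem files
CERT_DST="/usr/local/share/ca-certificates"

# Print the copy commands; each file gains a .crt extension on the way,
# which update-ca-certificates needs in order to pick it up.
print_cert_steps() {
  for name in ca.pem client-cert.pem; do
    echo "sudo cp $CERT_SRC/$name $CERT_DST/$name.crt"
  done
  echo "sudo update-ca-certificates"
}

print_cert_steps
```

With both files copied, the update-ca-certificates output should report two added certificates, as in the sample output above.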
With that, you’re all set to migrate your MySQL database to PostgreSQL.
Now that you’ve configured remote access from your PostgreSQL server to your MySQL server, you’re ready to begin the migration.
Note: It’s important to back up your database before taking any action that could impact the integrity of your data. However, this isn’t necessary when performing a migration with pgLoader, since it doesn’t delete or transform data; it only copies it.
That said, if you’re feeling cautious and would like to back up your data before migrating it, you can do so with the mysqldump utility. See the official MySQL documentation for details.
pgLoader allows users to migrate an entire database with a single command. For a migration from a MySQL database to a PostgreSQL database on a separate server, the command would have the following syntax:
pgloader mysql://mysql_username:password@mysql_server_ip/source_database_name?option_1=value&option_n=value postgresql://postgresql_role_name:password@postgresql_server_ip/target_database_name?option_1=value&option_n=value
This includes the pgloader command and two connection strings, the first for the source database and the second for the target database. Both of these connection strings begin by declaring what type of DBMS the connection string points to, followed by the username and password that have access to the database (separated by a colon), the host address of the server where the database is installed, the name of the database pgLoader should target, and various options that affect pgLoader’s behavior.
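To make the anatomy of these connection strings concrete, the following sketch assembles both of them from their parts. The variable names are hypothetical and the passwords and MySQL host are placeholders; the role, user, and database names are the ones used in this guide.

```shell
# Source (MySQL) side
MYSQL_USER="pgloader_my"
MYSQL_PASSWORD="mysql_password"      # placeholder
MYSQL_HOST="mysql_server_ip"         # placeholder
MYSQL_DB="source_db"

# Target (PostgreSQL) side
PG_ROLE="pgloader_pg"
PG_PASSWORD="postgresql_password"    # placeholder
PG_HOST="localhost"
PG_DB="new_db"

# scheme://user:password@host/database?option=value
SOURCE_URI="mysql://${MYSQL_USER}:${MYSQL_PASSWORD}@${MYSQL_HOST}/${MYSQL_DB}?useSSL=true"
TARGET_URI="postgresql://${PG_ROLE}:${PG_PASSWORD}@${PG_HOST}/${PG_DB}"

echo "pgloader $SOURCE_URI $TARGET_URI"
```

When running the command from a script, it is safer to pass the strings quoted, as in pgloader "$SOURCE_URI" "$TARGET_URI", since characters such as ? and & have a special meaning to the shell.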
Using the parameters defined earlier in this tutorial, you can migrate your MySQL database using a command with the following structure. Be sure to replace any highlighted values to align with your own setup:
pgloader mysql://pgloader_my:mysql_password@mysql_server_ip/source_db?useSSL=true postgresql://pgloader_pg:postgresql_password@localhost/new_db
Note that this command includes the useSSL option in the MySQL connection string. By setting this option to true, pgLoader will connect to MySQL over SSL. This is necessary, as you’ve configured your MySQL server to only accept secure connections.
If this command is successful, you will see an output table describing how the migration went:
Output
             table name     errors       rows      bytes      total time
-----------------------  ---------  ---------  ---------  --------------
        fetch meta data          0          2                     0.111s
         Create Schemas          0          0                     0.001s
       Create SQL Types          0          0                     0.005s
          Create tables          0          2                     0.017s
         Set Table OIDs          0          1                     0.010s
-----------------------  ---------  ---------  ---------  --------------
 source_db.sample_table          0          5     0.2 kB          0.048s
-----------------------  ---------  ---------  ---------  --------------
COPY Threads Completion          0          4                     0.052s
 Index Build Completion          0          1                     0.011s
         Create Indexes          0          1                     0.006s
        Reset Sequences          0          0                     0.014s
           Primary Keys          0          1                     0.001s
    Create Foreign Keys          0          0                     0.000s
        Create Triggers          0          0                     0.000s
       Install Comments          0          0                     0.000s
-----------------------  ---------  ---------  ---------  --------------
      Total import time          ✓          5     0.2 kB          0.084s
To check that the data was migrated correctly, open up the PostgreSQL prompt:
From there, connect to the database into which you loaded the data:
\c new_db
Then run the following query to test whether the migrated data is stored in your PostgreSQL database:
SELECT * FROM source_db.sample_table;
Note: Notice the FROM clause in this query specifying the sample_table held within the source_db schema:
. . . FROM source_db.sample_table;
This is called a qualified name. You could go further and specify the fully qualified name by including the database’s name as well as those of the schema and table:
. . . FROM new_db.source_db.sample_table;
When you run queries in a PostgreSQL database, you don’t need to be this specific if the table is held within the default public schema. The reason you must do so here is that when pgLoader loads data into Postgres, it creates and targets a new schema named after the original database (in this case, source_db). This is pgLoader’s default behavior for MySQL to PostgreSQL migrations. However, you can use a load file to instruct pgLoader to change the table’s schema to public once it’s done loading data. See the next step for an example of how to do this.
If the data was indeed loaded correctly, you will see the following table in the query’s output:
Output
employee_id | first_name | last_name | start_date | salary
-------------+------------+-------------+------------+------------
1 | Elizabeth | Cotten | 2007-11-11 | $105433.18
2 | Yanka | Dyagileva | 2017-10-30 | $107540.67
3 | Lee | Dorsey | 2013-06-04 | $118024.04
4 | Kasey | Chambers | 2010-08-18 | $116456.98
5 | Bram | Tchaikovsky | 2018-09-16 | $61989.50
(5 rows)
To close the Postgres prompt, run the following command:
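In psql, the \q meta-command closes the prompt. The same check can also be run without an interactive session. The sketch below assumes the prompt in this step was opened as the postgres Linux user and uses psql's -d and -c options to select the database and pass the query; it only prints the resulting command.

```shell
TARGET_DB="new_db"
QUERY="SELECT * FROM source_db.sample_table;"

# Non-interactive equivalent of opening psql, connecting with \c, and running the query.
CHECK_CMD="sudo -u postgres psql -d $TARGET_DB -c \"$QUERY\""

echo "$CHECK_CMD"
```

Running the printed command on the PostgreSQL server should list the same five rows shown above.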
Now that we’ve gone over how to migrate a MySQL database over a network and load it into a PostgreSQL database, we will go over a few other common migration scenarios in which pgLoader can be useful.
pgLoader is a highly flexible tool that can be useful in a wide variety of situations. Here, we’ll take a quick look at a few other ways you can use pgLoader to migrate a MySQL database to PostgreSQL.
In the context of pgLoader, a load file, or command file, is a file that tells pgLoader how to perform a migration. This file can include commands and options that affect pgLoader’s behavior, giving you much finer control over how your data is loaded into PostgreSQL and allowing you to perform complex migrations.
pgLoader’s documentation provides comprehensive instructions on how to use and extend these files to support a number of migration types, so here we will work through a comparatively rudimentary example. We will perform the same migration we ran in Step 5, but will also include an ALTER SCHEMA command to change the new_db database’s schema from source_db to public.
To begin, create a new load file on the Postgres server using your preferred text editor:
Then add the following content, making sure to update the highlighted values to align with your own configuration:
LOAD DATABASE
FROM mysql://pgloader_my:mysql_password@mysql_server_ip/source_db?useSSL=true
INTO pgsql://pgloader_pg:postgresql_password@localhost/new_db
WITH include drop, create tables
ALTER SCHEMA 'source_db' RENAME TO 'public'
;
Here is what each of these clauses does:
LOAD DATABASE: This line instructs pgLoader to load data from a separate database, rather than a file or data archive.
FROM: This clause specifies the source database. In this case, it points to the connection string for the MySQL database we created in Step 1.
INTO: Likewise, this line specifies the PostgreSQL database into which pgLoader should load the data.
WITH: This clause allows you to define specific behaviors for pgLoader. You can find the full list of WITH options that are compatible with MySQL migrations in the pgLoader documentation. In this example we only include two options:
include drop: When this option is used, pgLoader will drop any tables in the target PostgreSQL database that also appear in the source MySQL database. If you use this option when migrating data to an existing PostgreSQL database, you should back up the entire database to avoid losing any data.
create tables: This option tells pgLoader to create new tables in the target PostgreSQL database based on the metadata held in the MySQL database. If the opposite option, create no tables, is used, then the target tables must already exist in the target Postgres database prior to the migration.
ALTER SCHEMA: Following the WITH clause, you can add specific SQL commands like this to instruct pgLoader to perform additional actions. Here, we instruct pgLoader to change the new Postgres database’s schema from source_db to public, but only after it has created the schema. Note that you can also nest such commands within other clauses, such as BEFORE LOAD DO, to instruct pgLoader to execute those commands at specific points in the migration process.
This is a demonstrative example of what you can include in a load file to modify pgLoader’s behavior. The complete list of clauses that one can add to a load file and what they do can be found in the official pgLoader documentation.
Save and close the load file after you’ve finished adding this content. To use it, include the name of the file as an argument to the pgloader command:
To test that the migration was successful, open up the Postgres prompt:
Then connect to the database:
\c new_db
And run the following query:
Output
employee_id | first_name | last_name | start_date | salary
-------------+------------+-------------+------------+------------
1 | Elizabeth | Cotten | 2007-11-11 | $105433.18
2 | Yanka | Dyagileva | 2017-10-30 | $107540.67
3 | Lee | Dorsey | 2013-06-04 | $118024.04
4 | Kasey | Chambers | 2010-08-18 | $116456.98
5 | Bram | Tchaikovsky | 2018-09-16 | $61989.50
(5 rows)
This output confirms that pgLoader migrated the data successfully, and also that the ALTER SCHEMA command we added to the load file worked as expected, since we didn’t need to specify the source_db schema in the query to view the data.
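The load-file workflow in this section can be sketched end to end as follows. The file name pgload_test.load is hypothetical, the passwords and MySQL host are placeholders, and the one-line psql check is an assumed non-interactive equivalent of the prompt session described above. The snippet writes the load file and prints the commands that would use it.

```shell
LOAD_FILE="${TMPDIR:-/tmp}/pgload_test.load"   # hypothetical file name

# Write the load file shown above; the quoted 'EOF' keeps the contents literal.
cat > "$LOAD_FILE" <<'EOF'
LOAD DATABASE
     FROM mysql://pgloader_my:mysql_password@mysql_server_ip/source_db?useSSL=true
     INTO pgsql://pgloader_pg:postgresql_password@localhost/new_db

 WITH include drop, create tables

ALTER SCHEMA 'source_db' RENAME TO 'public'
;
EOF

# Run the migration from the load file, then query the table without a schema prefix.
echo "pgloader $LOAD_FILE"
echo "sudo -u postgres psql -d new_db -c 'SELECT * FROM sample_table;'"
```

Since the schema has been renamed to public, the unqualified table name in the final query is enough to reach the migrated data.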
Note that if you plan to use a load file to migrate data held on one database to another located on a separate machine, you will still need to adjust any relevant networking and firewall rules in order for the migration to be successful.
You can use pgLoader to migrate a MySQL database to a PostgreSQL database housed on the same machine. All you need is to run the migration command from a Linux user profile with access to the root MySQL user:
pgloader mysql://root@localhost/source_db pgsql://sammy:postgresql_password@localhost/target_db
Performing a local migration like this means you don’t have to make any changes to MySQL’s default networking configuration or your system’s firewall rules.
You can also load a PostgreSQL database with data from a CSV file.
Assuming you have a CSV file of data named load.csv, the command to load it into a Postgres database might look like this:
pgloader load.csv pgsql://sammy:password@localhost/target_db
Because the CSV format is not fully standardized, there’s a chance that you will run into issues when loading data directly from a CSV file in this manner. Fortunately, you can correct for irregularities by passing options on pgLoader’s command line or by specifying them in a load file. See the pgLoader documentation on the subject for more details.
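To illustrate the command-line route, here is a hedged sketch that writes a tiny sample file and passes a few CSV options to pgLoader. The column names, the sample rows, and the target table `items` are made up for illustration, and the sketch assumes that table already exists in target_db.

```shell
# Create a small sample file; the columns and rows are invented for this example.
cat > load.csv <<'EOF'
id,name
1,alpha
2,beta
EOF

# Hypothetical target: the table "items" must already exist in target_db.
TARGET="pgsql://sammy:password@localhost/target_db?tablename=items"

# --type forces CSV parsing, --field names the columns in file order,
# and each --with passes one CSV option (skip the header row, comma delimiter).
if command -v pgloader >/dev/null 2>&1; then
    pgloader --type csv \
             --field id --field name \
             --with "skip header = 1" \
             --with "fields terminated by ','" \
             load.csv "$TARGET"
else
    echo "pgloader not found; command not run"
fi
```

The same options can be written once in a load file instead, which is easier to keep under version control when a migration needs many of them.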
It’s also possible to perform a migration from a self-managed database to a managed PostgreSQL database. To illustrate how this kind of migration could look, we will use the MySQL server and a DigitalOcean Managed PostgreSQL Database. We’ll also use the sample database we created in Step 1, but if you skipped that step and have your own database you’d like to migrate, you can point to that one instead.
Note: For instructions on how to set up a DigitalOcean Managed Database, please refer to our Managed Database Quickstart guide.
For this migration, we won’t need pgLoader’s useSSL option since it only works with remote MySQL databases and we will run this migration from a local MySQL database. However, we will use the sslmode=require option when we load and connect to the DigitalOcean Managed PostgreSQL database, which will ensure your data stays protected.
Because we’re not using the useSSL option this time around, you can use apt to install pgLoader along with the postgresql-client package, which will allow you to access the Managed PostgreSQL Database from your MySQL server:
sudo apt install pgloader postgresql-client
Following that, you can run the pgloader command to migrate the database. To do this, you’ll need the connection string for the Managed Database.
For DigitalOcean Managed Databases, you can copy the connection string from the Cloud Control Panel. First, click Databases in the left-hand sidebar menu and select the database to which you want to migrate the data. Then scroll down to the Connection Details section. Click on the drop-down menu and select Connection string. Then, click the Copy button to copy the string to your clipboard and paste it into the following migration command, replacing the example PostgreSQL connection string shown here. This will migrate your MySQL database into the defaultdb PostgreSQL database as the doadmin PostgreSQL role:
pgloader mysql://root:password@localhost/source_db postgres://doadmin:password@db_host/defaultdb?sslmode=require
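Since the pasted connection string is what enforces encryption here, it can help to check it before starting the migration. The following is a sketch only: the `PG_URI` and `SSL_OK` variable names are made up for this example, and the URI is the placeholder from the command above rather than a real endpoint.

```shell
# Paste the connection string copied from the control panel here.
# The value below is the tutorial's placeholder, not a real endpoint.
PG_URI="postgres://doadmin:password@db_host/defaultdb?sslmode=require"

# Check that the string asks for an encrypted connection.
case "$PG_URI" in
    *sslmode=require*) SSL_OK=yes ;;
    *)                 SSL_OK=no ;;
esac
echo "sslmode=require present: $SSL_OK"

# Start the migration only if the check passed and pgloader is installed.
if [ "$SSL_OK" = yes ] && command -v pgloader >/dev/null 2>&1; then
    pgloader mysql://root:password@localhost/source_db "$PG_URI"
fi
```

Keeping the string in a variable also means the same value can be reused for a later `psql` connection without pasting it a second time.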
Following this, you can use the same connection string as an argument to psql to connect to the managed PostgreSQL database and confirm that the migration was successful:
psql postgres://doadmin:password@db_host/defaultdb?sslmode=require
Then, run the following query to check that pgLoader correctly migrated the data:
Output
 employee_id | first_name |  last_name  | start_date |   salary
-------------+------------+-------------+------------+------------
           1 | Elizabeth  | Cotten      | 2007-11-11 | $105433.18
           2 | Yanka      | Dyagileva   | 2017-10-30 | $107540.67
           3 | Lee        | Dorsey      | 2013-06-04 | $118024.04
           4 | Kasey      | Chambers    | 2010-08-18 | $116456.98
           5 | Bram       | Tchaikovsky | 2018-09-16 |  $61989.50
(5 rows)
This confirms that pgLoader successfully migrated your MySQL database to your managed PostgreSQL instance.
pgLoader is a flexible tool that can perform a database migration in a single command. With a few configuration tweaks, it can migrate an entire database from one physical machine to another using a secure SSL/TLS connection. Our hope is that by following this tutorial, you will have gained a clearer understanding of pgLoader’s capabilities and potential use cases.
After migrating your data over to PostgreSQL, you may find the following tutorials to be of interest:
An Introduction to Queries in PostgreSQL
How To Install and Configure pgAdmin 4 in Server Mode
How To Audit a PostgreSQL Database with InSpec on Ubuntu 18.04
Translated from: https://www.digitalocean.com/community/tutorials/how-to-migrate-mysql-database-to-postgres-using-pgloader