https://omidraha.com/en/latest/src/postgresql/postgres-xl.html
Postgres-XL is composed of three major components called the GTM, Coordinator and Datanode.
The GTM is responsible to provide ACID property of transactions.
The Datanode stores table data and handle SQL statements locally.
The Coordinator handles each SQL statements from applications, determines which Datanode to go, and sends plans on to the appropriate Datanodes.
The Coordinators and Datanodes of Postgres-XL are essentially PostgreSQL database servers
You usually should run GTM on a separate server because GTM has to take care of transaction requirements from all the Coordinators and Datanodes. To group multiple requests and responses from Coordinator and Datanode processes running on the same server, you can configure GTM-Proxy. GTM-Proxy reduces the number of interactions and the amount of data to GTM. GTM-Proxy also helps handle GTM failures.
It is often good practice to run both Coordinator and Datanode on the same server because we don’t have to worry about workload balance between the two, and you can often get at data from replicated tables locally without sending an additional request out on the network. You can have any number of servers where these two components are running. Because both Coordinator and Datanode are essentially PostgreSQL instances, you should configure them to avoid resource conflict. It is very important to assign them different working directories and port numbers.
https://www.postgres-xl.org/documentation/tutorial-arch.html
CREATE TABLE DISTRIBUTE BY ...
REPLICATION
Each row of the table will be replicated to all the Datanode of the Postgres-XL database cluster.
ROUNDROBIN
Each row of the table will be placed in one of the Datanodes in a round-robin manner. The value of the row will not be needed to determine what Datanode to go.
HASH ( column_name )
Each row of the table will be placed based on the hash value of the specified column. Following type is allowed as distribution column: INT8, INT2, OID, INT4, BOOL, INT2VECTOR, OIDVECTOR, CHAR, NAME, TEXT, BPCHAR, BYTEA, VARCHAR, NUMERIC, MONEY, ABSTIME, RELTIME, DATE, TIME,TIMESTAMP, TIMESTAMPTZ, INTERVAL, and TIMETZ. Please note that floating point is not allowed as a basis of the distribution column.
MODULO ( column_name )
Each row of the table will be placed based on the modulo of the specified column. Following type is allowed as distribution column: INT8, INT2, INT4, BOOL, ABSTIME, RELTIME, DATE.
Please note that floating point is not allowed as a basis of the distribution column.
If DISTRIBUTE BY is not specified, columns with UNIQUE constraint will be chosen as the distribution key. If no such column is specified, distribution column is the first eligible column in the definition. If no such column is found, then the table will be distributed by ROUNDROBIN.
You could Alter a replicated table to make it a distributed table.
https://www.postgres-xl.org/documentation/sql-createtable.html
https://www.postgres-xl.org/documentation/tutorial-createcluster.html
In Postgres-XL, in distributed tables, UNIQUE constraints must include the distribution column of the table. This is because Postgres-XL currently only allows that it can push down to the Datanodes to be enforced locally. If we include the distribution column in unique constraints, it stands to reason that it can be enforced locally. If a table is distributed by ROUNDROBIN, we cannot enforce UNIQUE constraints because it does not have a distribution column; it is possible that the same value for a column exists on multiple nodes. There’s no restriction in UNIQUE constraint in replicated tables. When an expression is used on a UNIQUE constraint, this expression must contain the distribution column of its parent table. It cannot use other columns as well.
As mentioned when discussing UNIQUE constraint, the distribution column must be included in PRIMARY KEY. Other restrictions apply to the PRIMARY KEY as well. When an expression is used on a PRIMARY KEY constraint, this expression must contain the distribution column of its parent table. It cannot use other columns as well.
Please note that column with REFERENCES should be the distribution column. This limitation is introduced because constraints are enforced only locally in each Datanode.
In Postgres-XL, you cannot specify both PRIMARY KEY and REFERENCES key for different columns.
In Postgres-XL, you cannot omit the column name in REFERENCES clause.
In Postgres-XL, you cannot specify more than one foreign key constraints.
Postgres-XL does not support exclusion constraints.
Postgres-XL does not allow to modify the value of distribution column.
https://www.postgres-xl.org/documentation/ddl-constraints.html
https://www.postgres-xl.org/documentation/dml-update.html
https://stackoverflow.com/questions/28547437/migrating-from-postgresql-to-postgres-xl-distributed-tables-design
https://www.postgres-xl.org/documentation/upgrading.html
You can add slaves for each node analogous to PostgreSQL’s streaming replication. In addition, the cluster can be configured such that the Global Transaction Manager (GTM) can have a GTM Standby.
In terms of automatic failover, it is currently not part of the core project, but Corosync/Pacemaker has been used for this purpose.
https://www.postgres-xl.org/documentation/warm-standby-failover.html
https://www.postgres-xl.org/faq/
https://github.com/ClusterLabs/PAF
https://github.com/bitnine-oss/postgres-xl-ha
https://www.postgres-xl.org/download/
On each hosts:
Do the following commands:
# Install requirements
$ sudo apt-get upgrade
$ sudo apt-get install build-essential
$ sudo apt-get install libreadline-dev
$ apt-get install zlib1g-dev
$ apt-get install flex
# Download postgres-xl
$ wget https://www.postgres-xl.org/downloads/postgres-xl-9.5r1.6.tar.bz2
$ tar -xvjpf postgres-xl-9.5r1.6.tar.bz2
$ cd postgres-xl-9.5r1.6
# Install postgres-xl
$ ./configure
$ make
All of Postgres-XL successfully made. Ready to install.
$ sudo make install
Postgres-XL installation complete.
# Install pgxc_ctl
$ cd contrib
$ make
$ sudo make install
$ sudo adduser postgres
$ su postgres
$ vim /home/postgres/.bashrc
export PATH=/usr/local/pgsql/bin:$PATH
$ mkdir ~/.ssh
To fix these probable errors:
bash: gtm_ctl: command not found
bash: pg_ctl: command not found
initdb: invalid locale settings; check LANG and LC_* environment variables
Add these lines to /etc/environment
:
$ vim /etc/environment
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/local/pgsql/bin:"
export LANG=en_US.utf-8
export LC_ALL=en_US.utf-8
On every hosts:
# export dataDirRoot=$HOME/DATA/pgxl/nodes #临时设置,关闭shell失效
当前用户的全局设置
打开~/.bashrc,添加行:
export dataDirRoot=$HOME/DATA/pgxl/nodes
使生效:
source .bashrc
每台hosts都要设置$dataDirRoot环境变量,否则添加节点的时候会无法创建$dataDirRoot/cr_master.2文件
creating directory /dn_master.3 ... initdb: could not create directory "/dn_master.3": Permission denied
bash: /dn_master.3/postgresql.conf: No such file or directory
bash: /dn_master.3/pg_hba.conf: No such file or directory
On postgres-xl-gtm
host:
$ su postgres
$ ssh-keygen -t rsa
Enter file in which to save the key (/home/postgres/.ssh/id_rsa):
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ scp ~/.ssh/authorized_keys [email protected]:~/.ssh/
$ scp ~/.ssh/authorized_keys [email protected]:~/.ssh/
$ scp ~/.ssh/authorized_keys [email protected]:~/.ssh/
On every hosts:
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys
On postgres-xl-gtm
host check ssh connecting to other hosts:
$ ssh [email protected]
$ ssh [email protected]
$ ssh [email protected]
Configure pgxc_ctl.conf
on postgres-xl-gtm
host:
$ export dataDirRoot=$HOME/DATA/pgxl/nodes
$ mkdir $HOME/pgxc_ctl
$ pgxc_ctl
/bin/bash
Installing pgxc_ctl_bash script as /home/postgres/pgxc_ctl/pgxc_ctl_bash.
ERROR: File "/home/postgres/pgxc_ctl/pgxc_ctl.conf" not found or not a regular file. No such file or directory
Installing pgxc_ctl_bash script as /home/postgres/pgxc_ctl/pgxc_ctl_bash.
Reading configuration using /home/postgres/pgxc_ctl/pgxc_ctl_bash --home /home/postgres/pgxc_ctl --configuration /home/postgres/pgxc_ctl/pgxc_ctl.conf
Finished reading configuration.
******** PGXC_CTL START ***************
Current directory: /home/postgres/pgxc_ctl
Create empty configuration file, on the PGXC
console:
PGXC$ prepare config empty
PGXC$ exit
$ vim ~/pgxc_ctl/pgxc_ctl.conf
pgxcOwner=postgres
coordPgHbaEntries=(192.168.0.0/24)
datanodePgHbaEntries=(192.168.0.0/24)
Configure gtm master node:
$ pgxc_ctl
PGXC$ add gtm master gtm 192.168.0.140 20001 $dataDirRoot/gtm
PGXC$ monitor all
"""
Running: gtm master
"""
Configure coordinator nodes:
PGXC$ add coordinator master cr1 192.168.0.141 30001 30011 $dataDirRoot/cr_master.1 none none
"""
Success.
Starting coordinator master cr1
LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory 'pg_log'.
Done.
"""
PGXC$ monitor all
"""
Running: gtm master
Running: coordinator master cr1
"""
PGXC$ add coordinator master cr2 192.168.0.142 30002 30012 $dataDirRoot/cr_master.2 none none
"""
Success.
Starting coordinator master cr2
LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory 'pg_log'.
Done.
"""
PGXC$ monitor all
"""
Running: gtm master
Running: coordinator master cr1
Running: coordinator master cr2
"""
Configure data nodes:
PGXC$ add datanode master dn1 192.168.0.143 40001 40011 $dataDirRoot/dn_master.1 none none none
"""
Success.
Starting datanode master dn1.
LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory 'pg_log'.
Done.
"""
PGXC$ monitor all
"""
Running: gtm master
Running: coordinator master cr1
Running: coordinator master cr2
Running: datanode master dn1
"""
PGXC$ add datanode master dn2 192.168.0.144 40002 40012 $dataDirRoot/dn_master.2 none none none
"""
Success
Starting datanode master dn2.
LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory 'pg_log'.
Done.
"""
PGXC$ monitor all
"""
Running: gtm master
Running: coordinator master cr1
Running: coordinator master cr2
Running: datanode master dn1
Running: datanode master dn2
"""
# To stop
PGXC stop gtm master
PGXC stop coordinator master cr1
PGXC stop coordinator master cr2
PGXC stop datanode master dn1
PGXC stop datanode master dn2
# To start
PGXC start gtm master
PGXC start coordinator master cr1
PGXC start coordinator master cr2
PGXC start datanode master dn1
PGXC start datanode master dn2
https://stackoverflow.com/questions/29225743/installing-postgres-xl-in-linux-in-distributed-environment
https://ruihaijiang.wordpress.com/2015/09/17/postgres-xl-installation-example-on-linux/
https://github.com/tiredpixel/postgres-xl-docker
https://gitlab.com/ansible-postgres-xl/postgres-xl-cluster/tree/master
https://github.com/omidraha/django-postgres-xl-example
https://www.postgres-xl.org/faq/
https://github.com/bitnine-oss/postgres-xl-ha
https://github.com/systemapic/wu/wiki/Installing-Postgresql-XL
https://www.postgres-xl.org/documentation/admin.html
https://stackoverflow.com/questions/42431018/can-postgres-xl-shard-replicate-and-auto-balance-at-the-same-time