pg_shard PostgreSQL数据库分片

如果你的GCC版本第一4.6,那么首先要安装一个高版本的GCC,因为pg_shard里面用了gcc 4.6以后新加的特性。

Postgres2015全国用户大会将于11月20至21日在北京丽亭华苑酒店召开。本次大会嘉宾阵容强大,国内顶级PostgreSQL数据库专家将悉数到场,并特邀欧洲、俄罗斯、日本、美国等国家和地区的数据库方面专家助阵:

  • Postgres-XC项目的发起人铃木市一(SUZUKI Koichi)
  • Postgres-XL的项目发起人Mason Sharp
  • pgpool的作者石井达夫(Tatsuo Ishii)
  • PG-Strom的作者海外浩平(Kaigai Kohei)
  • Greenplum研发总监姚延栋
  • 周正中(德哥), PostgreSQL中国用户会创始人之一
  • 汪洋,平安科技数据库技术部经理
  • ……

 
  • 2015年度PG大象会报名地址:http://postgres2015.eventdove.com/
  • PostgreSQL中国社区: http://postgres.cn/
  • PostgreSQL专业1群: 3336901(已满)
  • PostgreSQL专业2群: 100910388
  • PostgreSQL专业3群: 150657323
  • # yum install -y gmp mpfr libmpc libmpc-devel

    # wget http://gcc.cybermirror.org/releases/gcc-4.9.3/gcc-4.9.3.tar.bz2
    # tar -jxvf gcc-4.9.3.tar.bz2 
    # cd gcc-4.9.3
    # ./configure --prefix=/opt/gcc4.9.3
    # make && make install
    # vi /etc/ld.so.conf
    /opt/gcc4.9.3/lib
    /opt/gcc4.9.3/lib64
    # ldconfig
    # ldconfig -p|grep gcc
    # vi /etc/profile
    export PATH=/opt/gcc4.9.3/bin:$PATH

    安装pg_shard

    # git clone https://github.com/citusdata/pg_shard.git
    # cd pg_shard/

    切换到master分支,使用1.2.2版本。

    # git checkout master
    commit 7e6103f79e3651eac0b32429f5fb103eb2a8ebdd
    Merge: 2b221d9 ac35076
    Author: Jason Petersen
    Date:   Fri Aug 28 19:12:16 2015 -0600

        Merge branch 'release-1.2.2'
    ......

    安装:

    # . /home/postgres/.bash_profile
    # make clean; make; make install

    假设我的环境中有5个数据库实例,其中一个master,4个worker。

    su - postgres
    cd $PGDATA

    在master实例的$PGDATA中编辑一个 pg_worker_list.conf 文件。

    postgres@digoal-> vi pg_worker_list.conf 
    localhost 1922
    localhost 1923
    localhost 1924
    localhost 1925

    同时确保master所在主机,连接work节点数据库不需要密码,或密码已经存放在.pgpass密码文件。

    postgres@digoal-> cat $PGDATA1/pg_hba.conf |grep ^local
    local   all             all                                     trust
    postgres@digoal-> cat $PGDATA2/pg_hba.conf |grep ^local
    local   all             all                                     trust
    postgres@digoal-> cat $PGDATA3/pg_hba.conf |grep ^local
    local   all             all                                     trust
    postgres@digoal-> cat $PGDATA4/pg_hba.conf |grep ^local
    local   all             all                                     trust

    在master节点配置pg_shard.

    vi $PGDATA/postgresql.conf
    shared_preload_libraries = 'pg_shard'
    pg_ctl restart -m fast
    psql

    为了确保主和worker一致,正常的流程是,在master和所有的worker节点创建一致的:

    role
    database
    schema

    在主节点, 连接到database,创建 pg_shard 扩展模块。

    postgres=# create extension pg_shard;

    在主节点创建测试表

    postgres=# CREATE TABLE customer_reviews                                             
    (
        customer_id TEXT NOT NULL,
        review_date DATE,
        review_rating INTEGER,
        review_votes INTEGER,
        review_helpful_votes INTEGER,
        product_id CHAR(10),
        product_title TEXT,
        product_sales_rank BIGINT,
        product_group TEXT,
        product_category TEXT,
        product_subcategory TEXT,
        similar_product_ids CHAR(10)[]
    );

    创建合适的约束,索引,建议在定义work table前都确定下来,否则以后添加索引要在所有节点手工添加。
    在主节点,调用以下函数,构造work table。表名,字段名为分布列。

    postgres=# SELECT master_create_distributed_table('customer_reviews', 'customer_id');

    在主节点,调用以下函数, 在子节点创建work table,16个分片,每个分片保存2份。

    postgres=# SELECT master_create_worker_shards('customer_reviews', 16, 2);

    在主节点,可以看到元数据。

    postgres=# \dt pgs_distribution_metadata.*
                           List of relations
              Schema           |      Name       | Type  |  Owner   
    ---------------------------+-----------------+-------+----------
     pgs_distribution_metadata | partition       | table | postgres
     pgs_distribution_metadata | shard           | table | postgres
     pgs_distribution_metadata | shard_placement | table | postgres
    (3 rows)

    postgres=# select * from pgs_distribution_metadata.partition ;
     relation_id | partition_method |     key     
    -------------+------------------+-------------
           42067 | h                | customer_id
    (1 row)
    postgres=# select 42067::regclass;
         regclass     
    ------------------
     customer_reviews
    (1 row)

    postgres=# select * from pgs_distribution_metadata.shard;
      id   | relation_id | storage |  min_value  |  max_value  
    -------+-------------+---------+-------------+-------------
     10000 |       42067 | t       | -2147483648 | -1879048194
     10001 |       42067 | t       | -1879048193 | -1610612739
     10002 |       42067 | t       | -1610612738 | -1342177284
     10003 |       42067 | t       | -1342177283 | -1073741829
     10004 |       42067 | t       | -1073741828 | -805306374
     10005 |       42067 | t       | -805306373  | -536870919
     10006 |       42067 | t       | -536870918  | -268435464
     10007 |       42067 | t       | -268435463  | -9
     10008 |       42067 | t       | -8          | 268435446
     10009 |       42067 | t       | 268435447   | 536870901
     10010 |       42067 | t       | 536870902   | 805306356
     10011 |       42067 | t       | 805306357   | 1073741811
     10012 |       42067 | t       | 1073741812  | 1342177266
     10013 |       42067 | t       | 1342177267  | 1610612721
     10014 |       42067 | t       | 1610612722  | 1879048176
     10015 |       42067 | t       | 1879048177  | 2147483647
    (16 rows)

    postgres=# select * from pgs_distribution_metadata.shard_placement;
     id | shard_id | shard_state | node_name | node_port 
    ----+----------+-------------+-----------+-----------
      1 |    10000 |           1 | localhost |      1922
      2 |    10000 |           1 | localhost |      1923
      3 |    10001 |           1 | localhost |      1923
      4 |    10001 |           1 | localhost |      1924
      5 |    10002 |           1 | localhost |      1924
      6 |    10002 |           1 | localhost |      1925
      7 |    10003 |           1 | localhost |      1925
      8 |    10003 |           1 | localhost |      1922
      9 |    10004 |           1 | localhost |      1922
     10 |    10004 |           1 | localhost |      1923
     11 |    10005 |           1 | localhost |      1923
     12 |    10005 |           1 | localhost |      1924
     13 |    10006 |           1 | localhost |      1924
     14 |    10006 |           1 | localhost |      1925
     15 |    10007 |           1 | localhost |      1925
     16 |    10007 |           1 | localhost |      1922
     17 |    10008 |           1 | localhost |      1922
     18 |    10008 |           1 | localhost |      1923
     19 |    10009 |           1 | localhost |      1923
     20 |    10009 |           1 | localhost |      1924
     21 |    10010 |           1 | localhost |      1924
     22 |    10010 |           1 | localhost |      1925
     23 |    10011 |           1 | localhost |      1925
     24 |    10011 |           1 | localhost |      1922
     25 |    10012 |           1 | localhost |      1922
     26 |    10012 |           1 | localhost |      1923
     27 |    10013 |           1 | localhost |      1923
     28 |    10013 |           1 | localhost |      1924
     29 |    10014 |           1 | localhost |      1924
     30 |    10014 |           1 | localhost |      1925
     31 |    10015 |           1 | localhost |      1925
     32 |    10015 |           1 | localhost |      1922
    (32 rows)

    可以看到,每个分片都有2个副本。就是我们前面创建work table时指定的副本数量。

    pg_shard的使用限制:
    1. 不能使用子查询。

    postgres=# insert into customer_reviews select generate_series(1,100);
    ERROR:  0A000: cannot perform distributed planning for the given query
    DETAIL:  Subqueries are not supported in distributed queries.
    LOCATION:  ErrorIfQueryNotSupported, pg_shard.c:567

    2. 不能使用变量

    postgres=# insert into customer_reviews values ('a',now());
    ERROR:  0A000: cannot plan sharded modification containing values which are not constants or constant expressions
    LOCATION:  ErrorIfQueryNotSupported, pg_shard.c:638

    3. 不能使用绑定变量

    postgres=# prepare a (text) as insert into customer_reviews values ($1);
    ERROR:  0A000: PREPARE commands on distributed tables are unsupported
    LOCATION:  PgShardProcessUtility, pg_shard.c:2098

    postgres@digoal-> vi test.sql
    \setrandom id 1 1000000
    insert into customer_reviews values (:id);

    postgres@digoal-> pgbench -M extended -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 1000
    Client 2 aborted in state 1: ERROR:  unrecognized node type: 2100
    Client 4 aborted in state 1: ERROR:  unrecognized node type: 2100
    Client 1 aborted in state 1: ERROR:  unrecognized node type: 2100
    Client 7 aborted in state 1: ERROR:  unrecognized node type: 2100
    Client 3 aborted in state 1: ERROR:  unrecognized node type: 2100
    Client 0 aborted in state 1: ERROR:  unrecognized node type: 2100
    Client 5 aborted in state 1: ERROR:  unrecognized node type: 2100
    Client 6 aborted in state 1: ERROR:  unrecognized node type: 2100
    transaction type: Custom query
    scaling factor: 1
    query mode: extended
    number of clients: 8
    number of threads: 8
    duration: 1000 s
    number of transactions actually processed: 0
    postgres@digoal-> pgbench -M simple -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 1000
    progress: 2.8 s, 0.7 tps, lat 2568.361 ms stddev 16.254
    progress: 3.2 s, 68.3 tps, lat 578.633 ms stddev 1043.300
    progress: 3.2 s, 264.5 tps, lat 8.263 ms stddev 3.007
    progress: 4.0 s, 1193.6 tps, lat 8.561 ms stddev 36.589
    progress: 5.0 s, 1255.6 tps, lat 6.376 ms stddev 5.437
    progress: 6.0 s, 1277.5 tps, lat 6.263 ms stddev 2.644

    还有 中期TODO,不支持分布式JOIN,不支持分布式事务,不支持非分布列的唯一约束,FK约束。
    短期TODO,不支持表结构修改,不支持删除表。

    另一个问题是,pg_shard需要一个master数据库,而且master没有办法做对等设备。所以master容易成为瓶颈,特别是网络瓶颈和CPU瓶颈。因此要突破这几个瓶颈和问题的话,在OLTP中使用就更加靠谱了。

    鉴于以上这些限制,你要考虑清楚pg_shard是否能满足你的需求。
    个人认为最靠谱的还是连接池代理,轻量,而且容易做对等设备,可以很好的解决性能和效率的问题。
    (但是同样很难实现分布式事务和分布式JOIN,以及分布式唯一和FK约束,但是你得考虑清楚,你是否真的需要这些?)

    另外,你可以使用9.4版本的jdbc,已经支持负载均衡和failover了。
    http://blog.163.com/digoal@126/blog/static/16387704020158241250463/
    如果你没有跨库 事务和分布式JOIN,以及分布式唯一和FK约束的需求。目前jdbc 9.4 + plproxy可以完美的实现真正性能线性增长的数据库分片。

    分片节点损坏了如何修复?
    关闭其中一个节点:

    postgres@digoal-> pg_ctl stop -m fast -D /data01/pg_root_1922
    waiting for server to shut down.... done
    server stopped
    postgres@digoal-> psql
    psql (9.4.4)
    Type "help" for help.
    postgres=# select count(*) from customer_reviews ;
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
     count 
    -------
      6296
    (1 row)

    当前shard状态

    postgres=# select * from pgs_distribution_metadata.shard_placement;
     id | shard_id | shard_state | node_name | node_port 
    ----+----------+-------------+-----------+-----------
      2 |    10000 |           1 | localhost |      1923
      3 |    10001 |           1 | localhost |      1923
      4 |    10001 |           1 | localhost |      1924
      5 |    10002 |           1 | localhost |      1924
      6 |    10002 |           1 | localhost |      1925
      7 |    10003 |           1 | localhost |      1925
     10 |    10004 |           1 | localhost |      1923
     11 |    10005 |           1 | localhost |      1923
     12 |    10005 |           1 | localhost |      1924
     13 |    10006 |           1 | localhost |      1924
     14 |    10006 |           1 | localhost |      1925
     15 |    10007 |           1 | localhost |      1925
     18 |    10008 |           1 | localhost |      1923
     19 |    10009 |           1 | localhost |      1923
     20 |    10009 |           1 | localhost |      1924
     21 |    10010 |           1 | localhost |      1924
     22 |    10010 |           1 | localhost |      1925
     23 |    10011 |           1 | localhost |      1925
     26 |    10012 |           1 | localhost |      1923
     27 |    10013 |           1 | localhost |      1923
     28 |    10013 |           1 | localhost |      1924
     29 |    10014 |           1 | localhost |      1924
     30 |    10014 |           1 | localhost |      1925
     31 |    10015 |           1 | localhost |      1925
      8 |    10003 |           3 | localhost |      1922
     32 |    10015 |           3 | localhost |      1922
      9 |    10004 |           3 | localhost |      1922
     25 |    10012 |           3 | localhost |      1922
     24 |    10011 |           3 | localhost |      1922
     17 |    10008 |           3 | localhost |      1922
      1 |    10000 |           3 | localhost |      1922
     16 |    10007 |           3 | localhost |      1922
    (32 rows)

    状态为3的shard是有问题需要修复的。

    使用pgbench产生一些数据变更,从而导致这些shard的数据和其他副本不一致。

    postgres@digoal-> pgbench -M simple -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 1000
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
    WARNING:  Connection failed to localhost:1922
    DETAIL:  Remote message: could not connect to server: Connection refused
            Is the server running on host "localhost" (::1) and accepting
            TCP/IP connections on port 1922?
    could not connect to server: Connection refused
            Is the server running on host "localhost" (127.0.0.1) and accepting
            TCP/IP connections on port 1922?
    progress: 1.0 s, 167.9 tps, lat 24.011 ms stddev 84.151
    progress: 2.0 s, 1306.0 tps, lat 8.279 ms stddev 40.771
    progress: 3.0 s, 1451.7 tps, lat 5.505 ms stddev 1.778
    progress: 4.0 s, 1469.8 tps, lat 5.445 ms stddev 1.529
    progress: 5.0 s, 1447.0 tps, lat 5.519 ms stddev 2.082
    progress: 6.0 s, 1439.4 tps, lat 5.554 ms stddev 2.656


    启动关闭的节点

    postgres@digoal-> pg_ctl start -D /data01/pg_root_1922
    server starting
    postgres@digoal->  0LOG:  00000: redirecting log output to logging collector process
     0HINT:  Future log output will appear in directory "pg_log".
     0LOCATION:  SysLogger_Start, syslogger.c:645

    postgres@digoal-> psql
    psql (9.4.4)
    Type "help" for help.
    postgres=# select count(*) from customer_reviews ;
     count 
    -------
     17729
    (1 row)

    修复前,在所有work节点,对应的database中创建pg_shard extension,因为需要用到一个修复函数。

    psql -h 127.0.0.1 -p 1922 -c "create extension pg_shard;"
    psql -h 127.0.0.1 -p 1923 -c "create extension pg_shard;"
    psql -h 127.0.0.1 -p 1924 -c "create extension pg_shard;"
    psql -h 127.0.0.1 -p 1925 -c "create extension pg_shard;"
    修复, 连接到主节点,执行:

    postgres=# select t1.shard_id,t1.node_name,t1.node_port,t2.node_name,t2.node_port from pgs_distribution_metadata.shard_placement t1 , pgs_distribution_metadata.shard_placement t2 where t1.shard_id=t2.shard_id and (t1.node_name||t1.node_port) <> (t2.node_name||t2.node_port) and t1.shard_state=3;
     shard_id | node_name | node_port | node_name | node_port 
    ----------+-----------+-----------+-----------+-----------
        10003 | localhost |      1922 | localhost |      1925
        10004 | localhost |      1922 | localhost |      1923
        10008 | localhost |      1922 | localhost |      1923
        10011 | localhost |      1922 | localhost |      1925
        10012 | localhost |      1922 | localhost |      1923
        10015 | localhost |      1922 | localhost |      1925
    (6 rows)

    用这个函数来修复,其实就是拷贝数据。源拷贝到目标。

    public | master_copy_shard_placement     | void             | shard_id bigint, source_node_name text, source_node_port integer, target_node_name text, target_node_port integer  

    参数。千万不要搞反了。(pg_shard有保护措施,搞反了会报错)

    postgres=# select master_copy_shard_placement(t1.shard_id,t1.node_name,t1.node_port,t2.node_name,t2.node_port) from pgs_distribution_metadata.shard_placement t1 , pgs_distribution_metadata.shard_placement t2 where t1.shard_id=t2.shard_id and (t1.node_name||t1.node_port) <> (t2.node_name||t2.node_port) and t1.shard_state=3;
    ERROR:  22023: source placement must be in finalized state
    LOCATION:  master_copy_shard_placement, repair_shards.c:109

    修复:

    postgres=# select master_copy_shard_placement(t1.shard_id,t2.node_name,t2.node_port,t1.node_name,t1.node_port) from pgs_distribution_metadata.shard_placement t1 , pgs_distribution_metadata.shard_placement t2 where t1.shard_id=t2.shard_id and (t1.node_name||t1.node_port) <> (t2.node_name||t2.node_port) and t1.shard_state=3;
     master_copy_shard_placement 
    -----------------------------
    (8 rows)

    拷贝完成后,状态变为1了。

    postgres=# select * from pgs_distribution_metadata.shard_placement;
     id | shard_id | shard_state | node_name | node_port 
    ----+----------+-------------+-----------+-----------
      2 |    10000 |           1 | localhost |      1923
      3 |    10001 |           1 | localhost |      1923
      4 |    10001 |           1 | localhost |      1924
      5 |    10002 |           1 | localhost |      1924
      6 |    10002 |           1 | localhost |      1925
      7 |    10003 |           1 | localhost |      1925
     10 |    10004 |           1 | localhost |      1923
     11 |    10005 |           1 | localhost |      1923
     12 |    10005 |           1 | localhost |      1924
     13 |    10006 |           1 | localhost |      1924
     14 |    10006 |           1 | localhost |      1925
     15 |    10007 |           1 | localhost |      1925
     18 |    10008 |           1 | localhost |      1923
     19 |    10009 |           1 | localhost |      1923
     20 |    10009 |           1 | localhost |      1924
     21 |    10010 |           1 | localhost |      1924
     22 |    10010 |           1 | localhost |      1925
     23 |    10011 |           1 | localhost |      1925
     26 |    10012 |           1 | localhost |      1923
     27 |    10013 |           1 | localhost |      1923
     28 |    10013 |           1 | localhost |      1924
     29 |    10014 |           1 | localhost |      1924
     30 |    10014 |           1 | localhost |      1925
     31 |    10015 |           1 | localhost |      1925
      1 |    10000 |           1 | localhost |      1922
     16 |    10007 |           1 | localhost |      1922
      8 |    10003 |           1 | localhost |      1922
      9 |    10004 |           1 | localhost |      1922
     17 |    10008 |           1 | localhost |      1922
     24 |    10011 |           1 | localhost |      1922
     25 |    10012 |           1 | localhost |      1922
     32 |    10015 |           1 | localhost |      1922
    (32 rows)

    分片数据修复后,查询结果和之前一致。 在修复前,pg_shard根据状态过滤了不健康的副本的查询。因此产生的结果是一致的。
        
        
        
        
    postgres=# select count(*) from customer_reviews;
     count 
    -------
     17729
    (1 row)


    [参考]
    1.  https://github.com/citusdata/pg_shard/tree/v1.2.2
    2.  https://www.citusdata.com/citus-products/pg-shard/pg-shard-quick-start-guide
    中期TODO,不支持分布式JOIN,不支持分布式事务,不支持非分布列的唯一约束,FK约束。
    短期TODO,不支持表结构修改,不支持删除表,不支持子查询。

    Limitations

    pg_shard is intentionally limited in scope during its first release, but is fully functional within that scope. We classify pg_shard's current limitations into two groups. In one group, we have features that we don't intend to support in the medium term due to architectural decisions we made:

    • Transactional semantics for queries that span across multiple shards - For example, you're a financial institution and you sharded your data based on customer_id. You'd now like to withdraw money from one customer's account and debit it to another one's account, in a single transaction block.
    • Unique constraints on columns other than the partition key, or foreign key constraints.
    • Distributed JOINs also aren't supported in pg_shard - If you'd like to run complex analytic queries, please consider upgrading to CitusDB.

    Another group of limitations are shorter-term but we're calling them out here to be clear about unsupported features:

    • Table alterations are not supported: customers who do need table alterations accomplish them by using a script that propagates such changes to all worker nodes.
    • DROP TABLE does not have any special semantics when used on a distributed table. An upcoming release will add a shard cleanup command to aid in removing shard objects from worker nodes.
    • Queries such as INSERT INTO foo SELECT bar, baz FROM qux are not supported.

    Besides these limitations, we have a list of features that we're looking to add. Instead of prioritizing this list ourselves, we decided to keep an open discussion on GitHub issues and hear what you have to say. So, if you have a favorite feature missing from pg_shard, please do get in touch!

    你可能感兴趣的:(转载,Shared,Nothing架构)