目录
1、postgresql简介
1.1、PG起源
1.2、PG特点
1.3、许可证
2、postgresql安装
2.1、yum安装
2.2、源码安装
2.3、交互式脚本安装
3、postgresql参数解释
3.2、temp_buffers:
3.3、work_mem
3.4、maintenance_work_mem
3.5、max_stack_depth:
4、 postgresql初始化与启动关闭
4.1、创建操作系统用户
4.2、initdb数据库
4.3、启动数据库
4.4、查看数据库状态
4.5、关闭数据库
5、元命令
6、数据类型
数字类型
字符类型
时间与日期
网络类型
7、postgresql高级特性
7.1、with查询
7.2、returning 返回更新后的数据
7.3、upsert
7.4、数据抽样
7.5、聚合函数
7.6、窗口函数
8、查看全部参数
博客主页:大家好我是poizxc2014的博客主页
✨欢迎关注点赞收藏⭐留言✒
个人主页:poizxc2014的博客_CSDN博客-数据库,mysql,java领域博主首发时间:2022年05月01日
你做三四月的事,八九月就会有答案,一起加油吧
如果觉得博主的文章还不错的话,请三连支持一下博主哦最后的话,在很多方面还做的不好的地方,欢迎大佬指正,一起学习哦,冲冲冲
postgresql使用postgresql licence生命,类似于BSD和MIT许可证
postgresql一般有yum安装和编译安装。现在为了安装方便,提供了脚本的傻瓜式安装。
查看安装源
# yum search postgresql
……
postgresql.i686 : PostgreSQL client programs
postgresql.x86_64 : PostgreSQL client programs
……
postgresql-contrib.x86_64 : Extension modules distributed with PostgreSQL
postgresql-debuginfo.x86_64 : Debug information for package postgresql
postgresql-devel.i686 : PostgreSQL development header files and libraries
postgresql-devel.x86_64 : PostgreSQL development header files and libraries
……
postgresql-libs.i686 : The shared libraries required for any PostgreSQL clients
postgresql-libs.x86_64 : The shared libraries required for any PostgreSQL clients
……
postgresql-server.x86_64 : The programs needed to create and run a PostgreSQL server
可以看到postgresql是客户端,postgresql-server是服务器,postgresql-contrib是相应的扩展
安装
# yum install postgresql postgresql-server postgresql-contrib -y
安装之后,postgres可执行文件位于
# which postgres
/usr/bin/postgres
下载
进入PostgreSQL: Downloadshttps://www.postgresql.org/download 官网下载安装包
查看编译选项
./configure --help
选项 | 说明 |
---|---|
--prefix= PREFIX | 指定安装目录,默认的安装目录为“/usr/ ocal/pgsql” |
--includedir=DIR | 指定C和C++的头文件目录,默认的安装目录为“ PREFIX/ Include。 |
--with- report= PORTNUM | 指定初始化数据目录时的默认端口,这个值可以在安装之后进行修改需要重启数据库),修改它只在自行制作RPM包时有用,其他时候意义并不大。 |
--with- blocksize= BLOCKSIZE | 指定数据文件的块大小,默认的是8kB,如果在OLAP场景下可以适当加这个值到32kB,以提高OLAP的性能,但在OLTP场景下建议使用&kB默认值。 |
--with- segsize= SEGSIZE | 指定单个数据文件的大小,默认是1GB。 |
--with-wal- blocksize= BLOCKSIZE | 指定WAL文件的块大小,默认是8kB。 |
--with- wal-segsize= SEGSIZE | 指定单个WAL文件的大小,默认是16MB。 |
安装
使用gmake 和gmake install
9.5之后提供了交互式的脚本安装,但11版本之后不再提供。参考链接:PostgreSQL: Linux downloads (Red Hat family)https://www.postgresql.org/download/linux/redhat/
# ./postgresql-10.13-1-linux-x64.run
----------------------------------------------------------------------------
Welcome to the PostgreSQL Setup Wizard.
----------------------------------------------------------------------------
Please specify the directory where PostgreSQL will be installed.
Installation Directory [/opt/PostgreSQL/10]:
----------------------------------------------------------------------------
Select the components you want to install; clear the components you do not want
to install. Click Next when you are ready to continue.
PostgreSQL Server [Y/n] :Y
pgAdmin 4 [Y/n] :Y
Stack Builder [Y/n] :Y
Command Line Tools [Y/n] :Y
Is the selection above correct? [Y/n]: Y
----------------------------------------------------------------------------
Please select a directory under which to store your data.
integer 类型,设置数据库服务器将使用的共享内存缓冲区数量,此缓冲区为缓冲数据块所用。此缓冲区是放在共享内存中的。每个缓冲区大小的典型值是 8K 字节,默认值通常是 4000,对于 8KB 的数据块则共享内存缓冲区大小为 400*8KB=32MB。这个数值必须大于 16,并且至少是 max_connections 数值的两倍。通常都会把此值设置的大一些,这样可以改进性能。一般设置为物理内存的 25%,若把 shared_buffers 设置的更大,如超过物理内存的 40%,就会发现缓冲的效果并不明显了,这是因为 PostgreSQL 是运行文件系统之上的,若文件系统也有缓存,将导致双缓存过多,造成负面影响。
integer 类型,设置每个数据库会话使用的临时缓冲区的最大数目。此本地缓冲区只用于访问临时表。临时缓冲区是在某个连接会话的服务进程中分配的,属于本地内存。临时缓冲区的大小也是按数据块大小分配的,默认是 1000,对于 8K 的数据块大小为 8MB。
integer 类型,声明内部排序操作和 Hash 表在开始使用临时磁盘文件之前可使用的内存数目。这个内存也是本地内存,默认是 1MB。请注意对于复杂的查询,可能会同时并发运行好几个排序或散列(hash)操作;每个排序或散列操作都会分配这个参数声明的内存来存储中间数据,只有存不下才会使用临时文件。同样,好几个正在运行的会话可能会同时进行排序操作,因此使用的总内存量可能是 work_mem 的好几倍。 ORDER BY、DISTINCT 和 MERGE JOINS 都要用到排序操作。Hash 表在以 Hash join、Hash 为基础的聚集、以 Hash 为基础的 IN 子查询处理中都要用到。
integer 类型,声明在维护性操作(比如 CACUUM、CREATE INDEX、ALTER TABLE ADD FOREIGN KEY等)中使用的最大内存数。默认是 16 MB。在一个数据库会话里,只有一个这样的操作可以执行行,并且一个数据库实例通常不会有太多这样的工作并发执行,把这个数值设置得比 work_mem 大一些通常是合适的。更大的设置可以提高上述操作的速度。
integer 类型,声明服务器执行堆栈的最大安全深度。默认值 2MB。如果发现不能运行复杂的函数,可以适当提高此配置的值,不过通常情况下保持默认值就够了。
把 max_stack_depth 参数设置得大于实际的操作系统内核限制值时,意味着一个正在运行的递归函数可能会导致 PostgreSQL 后台服务进程奔溃。在一些操作系统平台上,PG 能够检测出内核限制,这时它将不允许将其设置为一个不安全的值。但PG并不能在所有操作系统的平台都检测它的限制值,所以还是建议设置一个明确的值。
# groupadd -g 1000 postgres
# useradd -g 1000 -u 1000 postgres
# id postgres
uid=1000(postgres) gid=1000(postgres) groups=1000(postgres)
注:如果使用yum安装,用户和用户组就直接存在了,无需额外创建
一定要initdb,否则会报以下错误
May 01 17:26:12 postgresql-check-db-dir[13693]: "/var/lib/pgsql/data" is missing or empty.
May 01 17:26:12 postgresql-check-db-dir[13693]: Use "postgresql-setup initdb" to initialize the database cluster.
# initdb -D ./data/ -W
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
fixing permissions on existing directory ./data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 32MB
creating configuration files ... ok
creating template1 database in ./data/base/1 ... ok
initializing pg_authid ... ok
Enter new superuser password:
Enter it again:
setting password ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating collations ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
loading PL/pgSQL server-side language ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
postgres -D ./data
or
pg_ctl -D ./data -l logfile start
# service postgresql start
Redirecting to /bin/systemctl start postgresql.service
或者
# pg_ctl -D data/ start
server starting
# su - postgres
# pg_ctl -D data/ status
pg_ctl: server is running (PID: 13827)
/usr/bin/postgres "-D" "/var/lib/pgsql/data" "-p" "5432"
关闭数据库因为紧急程度,有多种模式
smart:等待活动事务结束,并且等待客户端主动断开连接后才关闭
fast:回滚所有活动的事务,并主动断开客户端的连接
immediate:立刻关闭数据库,下一次启动会进入到恢复模式
三种情况可以分别简写为:-ms -mf -mi
# pg_ctl -D data/ -ms stop
waiting for server to shut down................................................... done
server stopped
查看全部
postgres=# \?
General
\copyright show PostgreSQL usage and distribution terms
\g [FILE] or ; execute query (and send results to file or |pipe)
\h [NAME] help on syntax of SQL commands, * for all commands
\q quit psql
Query Buffer
\e [FILE] [LINE] edit the query buffer (or file) with external editor
\ef [FUNCNAME [LINE]] edit function definition with external editor
\p show the contents of the query buffer
\r reset (clear) the query buffer
\s [FILE] display history or save it to file
\w FILE write query buffer to file
Input/Output
\copy ... perform SQL COPY with data stream to the client host
\echo [STRING] write string to standard output
\i FILE execute commands from file
\ir FILE as \i, but relative to location of current script
\o [FILE] send all query results to file or |pipe
\qecho [STRING] write string to query output stream (see \o)
Informational
(options: S = show system objects, + = additional detail)
\d[S+] list tables, views, and sequences
\d[S+] NAME describe table, view, sequence, or index
\da[S] [PATTERN] list aggregates
\db[+] [PATTERN] list tablespaces
\dc[S+] [PATTERN] list conversions
\dC[+] [PATTERN] list casts
\dd[S] [PATTERN] show object descriptions not displayed elsewhere
\ddp [PATTERN] list default privileges
\dD[S+] [PATTERN] list domains
\det[+] [PATTERN] list foreign tables
\des[+] [PATTERN] list foreign servers
\deu[+] [PATTERN] list user mappings
\dew[+] [PATTERN] list foreign-data wrappers
\df[antw][S+] [PATRN] list [only agg/normal/trigger/window] functions
\dF[+] [PATTERN] list text search configurations
\dFd[+] [PATTERN] list text search dictionaries
\dFp[+] [PATTERN] list text search parsers
\dFt[+] [PATTERN] list text search templates
\dg[+] [PATTERN] list roles
\di[S+] [PATTERN] list indexes
\dl list large objects, same as \lo_list
\dL[S+] [PATTERN] list procedural languages
\dn[S+] [PATTERN] list schemas
\do[S] [PATTERN] list operators
\dO[S+] [PATTERN] list collations
\dp [PATTERN] list table, view, and sequence access privileges
\drds [PATRN1 [PATRN2]] list per-database role settings
\ds[S+] [PATTERN] list sequences
\dt[S+] [PATTERN] list tables
\dT[S+] [PATTERN] list data types
\du[+] [PATTERN] list roles
\dv[S+] [PATTERN] list views
\dE[S+] [PATTERN] list foreign tables
\dx[+] [PATTERN] list extensions
\l[+] list all databases
\sf[+] FUNCNAME show a function's definition
\z [PATTERN] same as \dp
Formatting
\a toggle between unaligned and aligned output mode
\C [STRING] set table title, or unset if none
\f [STRING] show or set field separator for unaligned query output
\H toggle HTML output mode (currently off)
\pset NAME [VALUE] set table output option
(NAME := {format|border|expanded|fieldsep|fieldsep_zero|footer|null|
numericlocale|recordsep|recordsep_zero|tuples_only|title|tableattr|pager})
\t [on|off] show only rows (currently off)
\T [STRING] set HTML tag attributes, or unset if none
\x [on|off|auto] toggle expanded output (currently off)
Connection
\c[onnect] {[DBNAME|- USER|- HOST|- PORT|-] | conninfo}
connect to new database (currently "postgres")
\encoding [ENCODING] show or set client encoding
\password [USERNAME] securely change the password for a user
\conninfo display information about current connection
Operating System
\cd [DIR] change the current working directory
\setenv NAME [VALUE] set or unset environment variable
\timing [on|off] toggle timing of commands (currently off)
\! [COMMAND] execute command in shell or start interactive shell
Variables
\prompt [TEXT] NAME prompt user to set internal variable
\set [NAME [VALUE]] set internal variable, or list all if no parameters
\unset NAME unset (delete) internal variable
Large Objects
\lo_export LOBOID FILE
\lo_import FILE [COMMENT]
\lo_list
\lo_unlink LOBOID large object operations
基本规则
- 使用\开头,后面接简写
- 简写命令,一般都可以加+,获得更加详细的信息
- \d命令指describ,是最重要的元命令。默认是显示表、视图和序列,但往往后面可以跟其他单字母,扩展命令的使用
- \d加S(大写),显示的是包括系统表的关系
- \d等命令后可以明确紧跟对象名,精确显示该对象
- 可以在psql里执行shell命令,改变当前目录等
查看数据库
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
(3 rows)
查看表空间
postgres=# \db
List of tablespaces
Name | Owner | Location
------------+----------+----------
pg_default | postgres |
pg_global | postgres |
(2 rows)
查看表等
postgres=# \d
List of relations
Schema | Name | Type | Owner
--------+--------+-------+----------
public | test | table | postgres
public | v_test | view | postgres
(2 rows)
postgres=# \dt
List of relations
Schema | Name | Type | Owner
--------+------+-------+----------
public | test | table | postgres
(1 row)
postgres=# \dv
List of relations
Schema | Name | Type | Owner
--------+--------+------+----------
public | v_test | view | postgres
(1 row)
postgres=# \d test
Table "public.test"
Column | Type | Modifiers
--------+---------+-----------
a | integer |
postgres=# \dv v_test
List of relations
Schema | Name | Type | Owner
--------+--------+------+----------
public | v_test | view | postgres
(1 row)
postgres=# \di test_i
List of relations
Schema | Name | Type | Owner | Table
--------+--------+-------+----------+-------
public | test_i | index | postgres | test
(1 row)
postgres=# \di+ test_i
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+--------+-------+----------+-------+-------+-------------
public | test_i | index | postgres | test | 16 kB |
(1 row)
查看函数定义
\sf
切换显示模式
类似于MySQL的\G
postgres=# \x
Expanded display is on.
postgres=# \dv v_test
List of relations
-[ RECORD 1 ]----
Schema | public
Name | v_test
Type | view
Owner | postgres
postgres=# \dv+ v_test
List of relations
-[ RECORD 1 ]---------
Schema | public
Name | v_test
Type | view
Owner | postgres
Size | 0 bytes
Description |
postgres=# \x
Expanded display is off.
获取元命令对应的SQL代码
[postgres@ ~]# psql -E
psql (9.2.24)
Type "help" for help.
postgres=#
postgres=#
postgres=# \d
********* QUERY **********
SELECT n.nspname as "Schema",
c.relname as "Name",
CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' END as "Type",
pg_catalog.pg_get_userbyid(c.relowner) as "Owner"
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r','v','S','f','')
AND n.nspname <> 'pg_catalog'
AND n.nspname <> 'information_schema'
AND n.nspname !~ '^pg_toast'
AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1,2;
**************************
List of relations
Schema | Name | Type | Owner
--------+--------+-------+----------
public | test | table | postgres
public | v_test | view | postgres
(2 rows)
导入导出数据
copy和\copy都可以导入
- COPY命令是SQL命令,\copy是元命令
- COPY命令必须具有 SUPERUSER超级权限(将数据通过 stdin、 stdout方式导入导出情况除外),而\copy元命令不需要 SUPERUSER权限。
- COPY命令读取或写入数据库服务端主机上的文件,而\copy元命令是从psql客户端
- 大数据量导出到文件或大文件数据导入数据库,COPY比\copy能高
6、数据类型
数字类型
类型
长度
说明
范围
与其他db比较
smallint
2字节
小范围整数类型
32768到+32767
integer
4字节
整数类型
2147483648到+2147483647
bigint
8字节
大范围整数类型
-9233203685477808到+9223203685477807
decimal
可变
用户指定
精度小数点前131072位;小数点后16383位
numeric
可变
用户指定
精度小数点前131072位;小数点后16383位
real
4字节
变长,不精确
6位十进制精度
double precision
8字节
变长,不精确
15位十进制精度
smallserial
2字节
smallint I自增序列
1到32767
serial
4字节
Integer自增序列
1到2147483647
bigserial
8字节
bigint自增序列
1到922372036854775807
money
8字节
金钱类型
运算以及函数介绍:
#select 3+5,3-5,3*5,5/3,5.0/3,5%3,mod(5,3);
?column? | ?column? | ?column? | ?column? | ?column? | ?column? | mod
----------+----------+----------+----------+--------------------+----------+-----
8 | -2 | 15 | 1 | 1.6666666666666667 | 2 | 2
(1 row)
#select round(10.1),round(10.9),ceil(4.3),ceil(4.9),ceil(-3.6),ceil(-3.4),floor(3.2),floor(-3.2);
round | round | ceil | ceil | ceil | ceil | floor | floor
-------+-------+------+------+------+------+-------+-------
10 | 11 | 5 | 5 | -3 | -3 | 3 | -4
(1 row)
字符类型
类型
长度
说明
范围
与其他db比较
character varying(n),varchar(n)
变长
字符最大数有限制
character(n), char(n)
定长
字符数没达到最大值则使用空白填充
text
变长
无长度限制
字符函数
char_length与octet_length
=#select char_length('test'),octet_length('test'),char_length('中国'),octet_length('中国');
char_length | octet_length | char_length | octet_length
-------------+--------------+-------------+--------------
4 | 4 | 2 | 6
(1 row)
substring与substr很相似,但substr不支持from……for这种用法
#select substring('test' from 3 for 5),substring('test',3,5),substring('test',3,4);
substring | substring | substring
-----------+-----------+-----------
st | st | st
(1 row)
#select substr('test',3,5),substr('test',3,4);
substr | substr
--------+--------
st | st
(1 row)
#select substr('test' from 3 for 4);
ERROR: syntax error at or near "from"
LINE 1: select substr('test' from 3 for 4);
^
position返回第一个匹配的位置,若没匹配则返回0
#select position('a' in 'abcda'),position('ab' in 'abcda'),position('ad' in 'abcda');
position | position | position
----------+----------+----------
1 | 1 | 0
(1 row)
split_part
#select split_part('abcdabcab','a',1),split_part('abcdabcab','a',2),split_part('bcdabcab','a',1);
split_part | split_part | split_part | split_part
------------+------------+------------+------------
| bcd | bcd | bc
(1 row)
时间与日期
类型
长度
说明
范围
与其他db比较
timestamp[(p)][without time zone]
8字节
包括日期和时间,不带时区,简写成 timestamp
timestamp[(p)] with time zone
8字节
包括日期和时间,带时区,简写成 timestamp
date
4字节
日期,但不包含一天中的时间
time[(p)][without time zone]
8字节
天中的时间,不包含日期,不带时区
time[(p)]with time zone
12字节
天中的时间,不包含日期,带时区
interval[fields][(p)]
16字节
时间间隔
类型转换
#select now(),now():: timestamp with time zone,now():: timestamp without time zone;
now | now | now
-------------------------------+-------------------------------+----------------------------
2022-05-01 14:26:34.770816+08 | 2022-05-01 14:26:34.770816+08 | 2022-05-01 14:26:34.770816
(1 row)
可以看到now是默认带时区的
#select now()::date,now():: time with time zone,now():: time without time zone,now():: time(3) without time zone;
now | now | now | now
------------+--------------------+-----------------+--------------
2022-05-01 | 14:56:28.395949+08 | 14:56:28.395949 | 14:56:28.396
(1 row)
可以看到精度的区别,默认和最大为6,超过6会显示warning而直接截断
#select date '2022-05-01'+ interval '1 day' - interval '2 hour',interval '1 month'/double precision '3';
?column? | ?column?
---------------------+----------
2022-05-01 22:00:00 | 10 days
(1 row)
时间函数
#select current_date,current_time;
date | timetz
------------+--------------------
2022-05-01 | 15:11:33.052683+08
(1 row)
#select extract(hour from current_time),extract(hour from current_date),extract(week from now());
date_part | date_part | date_part
-----------+-----------+-----------
15 | 0 | 21
(1 row)
网络类型
类型
长度
说明
范围
与其他db比较
cidr
7或19字节
IPV4和IPv6网络
inet
7或19字节
IPv4和IPv6网络
macaddr
6字节
MAC地址
macaddr8
8字节
MAC地址(EUI-64格式)
举例:
#select '192.168.1.1'::cidr,'192.168.1.100/24'::inet;
cidr | inet
----------------+------------------
192.168.1.1/32 | 192.168.1.100/24
(1 row)
可以看到inet不一定会转换成普通的点分十进制的形式
7、postgresql高级特性
7.1、with查询
with查询是postgresql的高级特性,oracle 很早就支持,mysql的新版本也是支持的。说明这种with查询的必要性,能极大提升sql的表现力
with t as (select * from test) select * from t;
+------+
| a |
+------+
| 1 |
+------+
1 row in set (0.00 sec)
多表中间结果
with t1 as ( select a from test), t2 as ( select a from b) select * from t1 , t2 where t1.a=t2.a;
+------+------+
| a | a |
+------+------+
| 1 | 1 |
+------+------+
1 row in set (0.01 sec)
递归使用with
with的一个重要特定就是可以递归使用,引用自己的结果。
# with recursive t (x) as ( select 1 union select x + 1 from t where x < 5) select sum(x) from t;
sum
-----
15
(1 row)
7.2、returning 返回更新后的数据
创建表test
# \d test
Table "public.test"
Column | Type | Modifiers
--------+--------------+-----------
a | integer |
b | character(8) |
Indexes:
"test_i" btree (a)
插入数据
insert into test values(1,'test1'),(2,'test2');
使用returning
# update test set a = 5 where a=1 returning * ;
a | b
---+----------
5 | test1
(1 row)
7.3、upsert
# insert into test_insert(a,b) values (1,'test1') on conflict(a) do nothing;
# insert into test_insert(a,b) values (1,'test1') on conflict(a) do update set a=EXCLUDED.a;
注意,这个EXCLUDED表示的是values里的数值,且必须大写
7.4、数据抽样
普通的数据抽样
# select * from test order by random() limit 1;
tablesample数据抽样。9.5以后才支持
# select * from test tablesample system(0.00001);
# select * from test tablesample bernoulli(0.0001);
system抽样方式是随机抽取表的数据块上的数据,bernoulli抽样是随机抽取表的数据行。这会导致system抽样,每次返回的行数可能不一样,因为每个块上的数据行数不一定一样。
7.5、聚合函数
string_agg函数
# create table city( country varchar(64),city varchar(64));
# insert into city values('中国','台北'),('中国','上海'),('中国','香港'),('日本','东京'),('日本','大阪');
# select * from city;
country | city
---------+------
中国 | 台北
中国 | 上海
中国 | 香港
日本 | 东京
日本 | 大阪
(5 rows)
# select country,string_agg(city,',') from city group by country;
country | string_agg
---------+----------------
中国 | 台北,上海,香港
日本 | 东京,大阪
(2 rows)
array_agg
与string_agg类似,但返回的是数组
# select country,array_agg(city) from city group by country;
country | array_agg
---------+------------------
中国 | {台北,上海,香港}
日本 | {东京,大阪}
(2 rows)
7.6、窗口函数
窗口函数包括内置的,比如(),rank(),lag(),以及聚合函数、自定义函数后接over属性
创建基本表并插入数据
# CREATE TABLE score (id serial primary key, subject character varying(32), stu_name character varying(32), score numeric(3,0));
# INSERT INTO score(subject, stu_name, score) VALUEs('Chinese','francs',70),('Chinese','matiler',70),('Chinese','tutu', 80),('English','matiler',75),('English','francs',90),('English','tutu', 60),('Math','francs',80),('Math','matiler', 99),('Math','tutu', 65);
avg() over
如果想查每个学生的成绩并显示课程的平均分,传统方法就是通过表连接,即先查询平均分,然后再与原表连接。如下:
# select id,a.subject,stu_name,score,avg from score as a join (select subject,avg(score) from score group by subject) as b on a.subject=b.subject ;
id | subject | stu_name | score | avg
----+---------+----------+-------+---------------------
1 | Chinese | francs | 70 | 73.3333333333333333
2 | Chinese | matiler | 70 | 73.3333333333333333
3 | Chinese | tutu | 80 | 73.3333333333333333
4 | English | matiler | 75 | 75.0000000000000000
5 | English | francs | 90 | 75.0000000000000000
6 | English | tutu | 60 | 75.0000000000000000
7 | Math | francs | 80 | 81.3333333333333333
8 | Math | matiler | 99 | 81.3333333333333333
9 | Math | tutu | 65 | 81.3333333333333333
(9 rows)
如果使用窗口函数,可以方便很多:
# select id,subject,stu_name,score,avg(score) over(partition by subject) from score;
id | subject | stu_name | score | avg
----+---------+----------+-------+---------------------
1 | Chinese | francs | 70 | 73.3333333333333333
2 | Chinese | matiler | 70 | 73.3333333333333333
3 | Chinese | tutu | 80 | 73.3333333333333333
4 | English | matiler | 75 | 75.0000000000000000
5 | English | francs | 90 | 75.0000000000000000
6 | English | tutu | 60 | 75.0000000000000000
7 | Math | francs | 80 | 81.3333333333333333
8 | Math | matiler | 99 | 81.3333333333333333
9 | Math | tutu | 65 | 81.3333333333333333
(9 rows)
查看查询计划
# explain analyze select id,a.subject,stu_name,score,avg from score as a join (select subject,avg(score) from score group by subject) as b on a.subject=b.subject ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=797.06..1643.18 rows=36864 width=53) (actual time=14.061..27.067 rows=36864 loops=1)
Hash Cond: ((a.subject)::text = (b.subject)::text)
-> Seq Scan on score a (cost=0.00..612.64 rows=36864 width=21) (actual time=0.013..3.193 rows=36864 loops=1)
-> Hash (cost=797.03..797.03 rows=3 width=38) (actual time=14.037..14.037 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Subquery Scan on b (cost=796.96..797.03 rows=3 width=38) (actual time=14.024..14.030 rows=3 loops=1)
-> HashAggregate (cost=796.96..797.00 rows=3 width=38) (actual time=14.023..14.027 rows=3 loops=1)
Group Key: score.subject
-> Seq Scan on score (cost=0.00..612.64 rows=36864 width=11) (actual time=0.003..3.231 rows=36864 loops=1)
Planning time: 0.219 ms
Execution time: 28.528 ms
(11 rows)
与窗口函数的查询计划
# explain analyze select id,subject,stu_name,score,avg(score) over(partition by subject) from score;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=3408.76..4053.88 rows=36864 width=53) (actual time=22.146..40.023 rows=36864 loops=1)
-> Sort (cost=3408.76..3500.92 rows=36864 width=21) (actual time=17.463..20.134 rows=36864 loops=1)
Sort Key: subject
Sort Method: quicksort Memory: 3826kB
-> Seq Scan on score (cost=0.00..612.64 rows=36864 width=21) (actual time=0.015..5.280 rows=36864 loops=1)
Planning time: 0.130 ms
Execution time: 42.077 ms
(7 rows)
虽然窗口函数方便阅读,但执行性能比表连接还是要差一些。而且随着数据量增大,这种趋势越发明显。
数据量
类型
执行时间
9
表连接
0.108ms
9
窗口函数
0.110ms
36864
表连接
28ms
36864
窗口函数
45ms
294912
表连接
180ms
294912
窗口函数
510ms
row_number
# select row_number() over(partition by subject order by score) ,* from score;
row_number | id | subject | stu_name | score
------------+----+---------+----------+-------
1 | 2 | Chinese | matiler | 70
2 | 1 | Chinese | francs | 70
3 | 3 | Chinese | tutu | 80
1 | 6 | English | tutu | 60
2 | 4 | English | matiler | 75
3 | 5 | English | francs | 90
1 | 9 | Math | tutu | 65
2 | 7 | Math | francs | 80
3 | 8 | Math | matiler | 99
(9 rows)
rank()与dense_rank()
两个函数很相似,都可以在组内排名,但rank在字段相同时会产生间隙gap,而dense_rank不会
# select rank() over(partition by subject order by score) ,* from score;
rank | id | subject | stu_name | score
------+----+---------+----------+-------
1 | 2 | Chinese | matiler | 70
1 | 1 | Chinese | francs | 70
3 | 3 | Chinese | tutu | 80
1 | 6 | English | tutu | 60
2 | 4 | English | matiler | 75
3 | 5 | English | francs | 90
1 | 9 | Math | tutu | 65
2 | 7 | Math | francs | 80
3 | 8 | Math | matiler | 99
(9 rows)
# select dense_rank() over(partition by subject order by score) ,* from score;
dense_rank | id | subject | stu_name | score
------------+----+---------+----------+-------
1 | 2 | Chinese | matiler | 70
1 | 1 | Chinese | francs | 70
2 | 3 | Chinese | tutu | 80
1 | 6 | English | tutu | 60
2 | 4 | English | matiler | 75
3 | 5 | English | francs | 90
1 | 9 | Math | tutu | 65
2 | 7 | Math | francs | 80
3 | 8 | Math | matiler | 99
(9 rows)
lag()
# select lag(id,1) over(partition by stu_name),* from score;
lag | id | subject | stu_name | score
-----+----+---------+----------+-------
| 1 | Chinese | francs | 70
1 | 5 | English | francs | 90
5 | 7 | Math | francs | 80
| 4 | English | matiler | 75
4 | 8 | Math | matiler | 99
8 | 2 | Chinese | matiler | 70
| 6 | English | tutu | 60
6 | 9 | Math | tutu | 65
9 | 3 | Chinese | tutu | 80
(9 rows)
first_value() 和last_value()
first_value获取组内的第一个值,last_value组内最后一个值
# select first_value(score) over(partition by subject ),* from score;
first_value | id | subject | stu_name | score
-------------+----+---------+----------+-------
70 | 1 | Chinese | francs | 70
70 | 2 | Chinese | matiler | 70
70 | 3 | Chinese | tutu | 80
75 | 4 | English | matiler | 75
75 | 5 | English | francs | 90
75 | 6 | English | tutu | 60
80 | 7 | Math | francs | 80
80 | 8 | Math | matiler | 99
80 | 9 | Math | tutu | 65
(9 rows)
# select last_value(id) over(partition by subject ),* from score;
last_value | id | subject | stu_name | score
------------+----+---------+----------+-------
3 | 1 | Chinese | francs | 70
3 | 2 | Chinese | matiler | 70
3 | 3 | Chinese | tutu | 80
6 | 4 | English | matiler | 75
6 | 5 | English | francs | 90
6 | 6 | English | tutu | 60
9 | 7 | Math | francs | 80
9 | 8 | Math | matiler | 99
9 | 9 | Math | tutu | 65
(9 rows)
但在这里有一个bug,就是就是如果在窗口函数里面加了order by,会出现诡异现象
# select last_value(id) over(partition by subject order by score ),* from score;
last_value | id | subject | stu_name | score
------------+----+---------+----------+-------
1 | 2 | Chinese | matiler | 70
1 | 1 | Chinese | francs | 70
3 | 3 | Chinese | tutu | 80
6 | 6 | English | tutu | 60
4 | 4 | English | matiler | 75
5 | 5 | English | francs | 90
9 | 9 | Math | tutu | 65
7 | 7 | Math | francs | 80
8 | 8 | Math | matiler | 99
(9 rows)
这个的结果与下面语句相同
# select last_value(id) over(partition by subject ,score ),* from score;
last_value | id | subject | stu_name | score
------------+----+---------+----------+-------
1 | 2 | Chinese | matiler | 70
1 | 1 | Chinese | francs | 70
3 | 3 | Chinese | tutu | 80
6 | 6 | English | tutu | 60
4 | 4 | English | matiler | 75
5 | 5 | English | francs | 90
9 | 9 | Math | tutu | 65
7 | 7 | Math | francs | 80
8 | 8 | Math | matiler | 99
(9 rows)
原因应该是分组的时候将order 里面的条件当成了partition的条件了,也就是将(subject,score)当分组条件
nth_value()
取结果集分组的指定行数据
# select nth_value(id,2) over(partition by subject ),* from score;
nth_value | id | subject | stu_name | score
-----------+----+---------+----------+-------
2 | 1 | Chinese | francs | 70
2 | 2 | Chinese | matiler | 70
2 | 3 | Chinese | tutu | 80
5 | 4 | English | matiler | 75
5 | 5 | English | francs | 90
5 | 6 | English | tutu | 60
8 | 7 | Math | francs | 80
8 | 8 | Math | matiler | 99
8 | 9 | Math | tutu | 65
(9 rows)
但下面的结果无法理解
# select nth_value(id,2) over(partition by subject order by id desc),* from score;
nth_value | id | subject | stu_name | score
-----------+----+---------+----------+-------
| 3 | Chinese | tutu | 80
2 | 2 | Chinese | matiler | 70
2 | 1 | Chinese | francs | 70
| 6 | English | tutu | 60
5 | 5 | English | francs | 90
5 | 4 | English | matiler | 75
| 9 | Math | tutu | 65
8 | 8 | Math | matiler | 99
8 | 7 | Math | francs | 80
使用subject字段来排序,却是正常的
# select nth_value(id,2) over(partition by subject order by subject desc ),* from score;
nth_value | id | subject | stu_name | score
-----------+----+---------+----------+-------
7 | 9 | Math | tutu | 65
7 | 7 | Math | francs | 80
7 | 8 | Math | matiler | 99
4 | 5 | English | francs | 90
4 | 4 | English | matiler | 75
4 | 6 | English | tutu | 60
3 | 1 | Chinese | francs | 70
3 | 3 | Chinese | tutu | 80
3 | 2 | Chinese | matiler | 70
(9 rows)
窗口函数别名
# select sum(score) over (r) ,avg(score) over (r) ,* from score window r as (partition by subject);
sum | avg | id | subject | stu_name | score
-----+---------------------+----+---------+----------+-------
220 | 73.3333333333333333 | 1 | Chinese | francs | 70
220 | 73.3333333333333333 | 2 | Chinese | matiler | 70
220 | 73.3333333333333333 | 3 | Chinese | tutu | 80
225 | 75.0000000000000000 | 4 | English | matiler | 75
225 | 75.0000000000000000 | 5 | English | francs | 90
225 | 75.0000000000000000 | 6 | English | tutu | 60
244 | 81.3333333333333333 | 7 | Math | francs | 80
244 | 81.3333333333333333 | 8 | Math | matiler | 99
244 | 81.3333333333333333 | 9 | Math | tutu | 65
(9 rows)
8、查看全部参数
[postgres@~]# psql --help
psql is the PostgreSQL interactive terminal.
Usage:
psql [OPTION]... [DBNAME [USERNAME]]
General options:
-c, --command=COMMAND run only single command (SQL or internal) and exit
-d, --dbname=DBNAME database name to connect to (default: "postgres")
-f, --file=FILENAME execute commands from file, then exit
-l, --list list available databases, then exit
-v, --set=, --variable=NAME=VALUE
set psql variable NAME to VALUE
-V, --version output version information, then exit
-X, --no-psqlrc do not read startup file (~/.psqlrc)
-1 ("one"), --single-transaction
execute command file as a single transaction
-?, --help show this help, then exit
Input and output options:
-a, --echo-all echo all input from script
-e, --echo-queries echo commands sent to server
-E, --echo-hidden display queries that internal commands generate
-L, --log-file=FILENAME send session log to file
-n, --no-readline disable enhanced command line editing (readline)
-o, --output=FILENAME send query results to file (or |pipe)
-q, --quiet run quietly (no messages, only query output)
-s, --single-step single-step mode (confirm each query)
-S, --single-line single-line mode (end of line terminates SQL command)
Output format options:
-A, --no-align unaligned table output mode
-F, --field-separator=STRING
set field separator (default: "|")
-H, --html HTML table output mode
-P, --pset=VAR[=ARG] set printing option VAR to ARG (see \pset command)
-R, --record-separator=STRING
set record separator (default: newline)
-t, --tuples-only print rows only
-T, --table-attr=TEXT set HTML table tag attributes (e.g., width, border)
-x, --expanded turn on expanded table output
-z, --field-separator-zero
set field separator to zero byte
-0, --record-separator-zero
set record separator to zero byte
Connection options:
-h, --host=HOSTNAME database server host or socket directory (default: "local socket")
-p, --port=PORT database server port (default: "5432")
-U, --username=USERNAME database user name (default: "postgres")
-w, --no-password never prompt for password
-W, --password force password prompt (should happen automatically)
For more information, type "\?" (for internal commands) or "\help" (for SQL
commands) from within psql, or consult the psql section in the PostgreSQL
documentation.
Report bugs to .
查看全部数据库
# psql -l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
(3 rows)
与在psql里执行\l是一样的
格式设置
-A显示非对齐模式
# psql -A -c "select * from test order by a "
a|b
1|test
2|2
3|test3
4|4
(4 rows)
# psql -c "select * from test order by a "
a | b
---+----------
1 | test
2 | 2
3 | test3
4 | 4
(4 rows)
-t 只显示数据
# psql -t -c "select * from test order by a "
1 | test
2 | 2
3 | test3
4 | 4
-q不显示输出信息
# psql -q -c "create table test2(a int) "
# psql -c "select * from test2 order by a "
a
---
(0 rows)
可以看到没有显示"create table"这种信息
-f执行sql文件
传递变量
- psql内部定义变量
postgres=# \set value 1
postgres=# select * from test where a=:value;
a | b
---+----------
1 | test
(1 row)
取消变量定义
postgres=# \set value
postgres=# select * from test where a=:value;
ERROR: syntax error at or near ";"
LINE 1: select * from test where a=;
^
- psql 命令行传递变量
# psql -v value=1 -f 1.sql
a | b
---+----------
1 | test
(1 row)
其中1.sql内容是
select * from test where a=:value;
通过传递变量的方式定义脚本
修改或者新建.psqlrc文件,写入如下内容(查询活动会话,记忆pg_stat_activity视图):
\set active_session 'select pid, usename, datname, query, client_addr from pg_stat_activity where pid <> pg_backend_pid() and state=\'active\' order by query;'
这样可以将定义的变量传递到psql中
postgres=# :active_session
pid | usename | datname | query | client_addr
-----+---------+---------+-------+-------------
(0 rows)
其实也可以在psql中直接定义,当前会话中可以一直使用
如下是第二个例子。修改或者新建.psqlrc:
\set wait_event 'select pid,usename,datname,query,client_addr from pg_stat_activity where pid <> pg_backend_pid() ;'
\set connections 'select datname, usename, client_addr, count(*) from pg_stat_activity where pid <> pg_backend_pid() group by 1,2,3 order by 1,2,4 desc;'
进入psql执行变量,即执行sql
postgres=# :wait_event
pid | usename | datname | query | client_addr
-----+---------+---------+-------+-------------
(0 rows)
postgres=# :connections
datname | usename | client_addr | count
----------+----------+-------------+-------
postgres | postgres | | 1
(1 row)
psql可以tab自动补全
这个类似于mysql的auto-hash,但比auto-hash更好用
客户端提示符定制
参数
解释
%M
数据库服务器别名,不是指主机名,显示的是psql的-h参数设置的值;
当连接建立在Unix域套字上时则是【local】
%>
数据库服务器的端口号。
%n
数据库会话的用户名,在数据库会话期间
这个值可能会因为命令SET SESSION UTHORIZATION的结果而改变。
%/
当前数据库名称。
%#
如果是超级用户则显示“#”,其他用户显示“>”,在数据库会话期间,
这个值可能会因为命令 SET SESSION AUTHORIZATION的结果而改变。
%p
当前数据库连接的后台进程号
%R
在 PROMPT1中通常显示“=”,如果进程被断开则显示“!”。
postgres=# \echo :PROMPT1
%/%R%#
postgres=# \set PROMPT1 %/%M%R%#
postgres[local]=#