greenplum 5.7 + create table + insert into

news/2024/7/4 13:06:41

os: centos 7.4
gp: gpdb-5.7.0

三台机器
node1 为master host
node2、node3为segment host

psql 登录 node1 master

$ psql -d peiybdb
peiybdb=# select current_database();
 current_database 
------------------
 peiybdb
(1 row)

create table tmp_t0(
c1 varchar(100),
c2 varchar(100),
c3 varchar(100)
);
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'c1' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE

peiybdb=# \d+ tmp_t0
                        Table "public.tmp_t0"
 Column |          Type          | Modifiers | Storage  | Description 
--------+------------------------+-----------+----------+-------------
 c1     | character varying(100) |           | extended | 
 c2     | character varying(100) |           | extended | 
 c3     | character varying(100) |           | extended | 
Has OIDs: no
Distributed by: (c1)

查看node1、node2、node3 当前的PGHOME目录大小

[gpadmin@node1 gpseg-1]$ pwd
/u01/greenplum-data/gpseg-1
[gpadmin@node1 gpseg-1]$ du -sh
109M	.

[gpadmin@node2 gpseg0]$ pwd
/u01/greenplum-data/gpseg0
[gpadmin@node2 gpseg0]$ du -sh
109M	.

[gpadmin@node3 gpseg1]$ pwd
/u01/greenplum-data/gpseg1
[gpadmin@node3 gpseg1]$ du -sh
109M	.

插入数据

peiybdb=# select gp_opt_version();
                gp_opt_version                 
-----------------------------------------------
 GPOPT version: 2.55.20, Xerces version: 3.1.2
(1 row)


peiybdb=# set optimizer=on;
set optimizer_enumerate_plans=on;
set optimizer_minidump=always;
set optimizer_enable_constant_expression_evaluation=off;
set client_min_messages='debug5';


peiybdb=# insert into tmp_t0
(c1,c2,c3)
select generate_series(1, 1000000) as c1,
       md5(random()::text) as c2 ,
       md5(random()::text) as c3
;

node1 master上查询pg_stat_activity

peiybdb=# \x
Expanded display is on.
peiybdb=# select * from pg_stat_activity;
-[ RECORD 1 ]----+-------------------------------------------
datid            | 16384
datname          | peiybdb
procpid          | 10904
sess_id          | 20
usesysid         | 10
usename          | gpadmin
current_query    | insert into tmp_t0
                 | (c1,c2,c3)
                 | select generate_series(1, 1000000) as c1, 
                 |        md5(random()::text) as c2 , 
                 |        md5(random()::text) as c3 
                 | 
waiting          | f
query_start      | 2018-05-02 07:20:33.325637+00
backend_start    | 2018-05-02 07:13:17.573451+00
client_addr      | 192.168.56.1
client_port      | 63217
application_name | 
xact_start       | 2018-05-02 07:20:33.3212+00
waiting_reason   | 
rsgid            | 0
rsgname          | 
rsgqueueduration | 
-[ RECORD 2 ]----+-------------------------------------------
datid            | 16384
datname          | peiybdb
procpid          | 10906
sess_id          | 21
usesysid         | 10
usename          | gpadmin
current_query    | <IDLE>
waiting          | f
query_start      | 2018-05-02 07:17:17.675179+00
backend_start    | 2018-05-02 07:13:17.580506+00
client_addr      | 192.168.56.1
client_port      | 63218
application_name | 
xact_start       | 
waiting_reason   | 
rsgid            | 0
rsgname          | 
rsgqueueduration | 
-[ RECORD 3 ]----+-------------------------------------------
datid            | 16384
datname          | peiybdb
procpid          | 11095
sess_id          | 22
usesysid         | 10
usename          | gpadmin
current_query    | select * from pg_stat_activity;
waiting          | f
query_start      | 2018-05-02 07:28:07.30111+00
backend_start    | 2018-05-02 07:27:54.501407+00
client_addr      | 
client_port      | -1
application_name | psql
xact_start       | 2018-05-02 07:28:07.30111+00
waiting_reason   | 
rsgid            | 0
rsgname          | 
rsgqueueduration | 

再次查看node1、node2、node3的文件目录大小

[gpadmin@node1 gpseg-1]$ du -sh
109M	.

[gpadmin@node2 gpseg0]$ du -sh
224M	.

[gpadmin@node3 gpseg1]$ du -sh
224M	.

可以观察到 master 节点node1的数据文件大小并没有发生变化,segment的node2、node3的数据文件增长了不少。
主要就是由于 master 节点是用来存储定义,segment是用来存储数据的。

查看tmp_t0的定义

peiybdb=# 
peiybdb=# \d
              List of relations
 Schema |  Name  | Type  |  Owner  | Storage 
--------+--------+-------+---------+---------
 public | tmp_t0 | table | gpadmin | heap
(1 row)

peiybdb=# \d+ tmp_t0;
                        Table "public.tmp_t0"
 Column |          Type          | Modifiers | Storage  | Description 
--------+------------------------+-----------+----------+-------------
 c1     | character varying(100) |           | extended | 
 c2     | character varying(100) |           | extended | 
 c3     | character varying(100) |           | extended | 
Has OIDs: no
Distributed by: (c1)

Distributed by 这个就是tmp_t0表的分布列,表的分布列一定要合理,能够降数据比较均匀的分布到各个segment节点上。
检索数据时能够在多个节点并发处理数据。

使用 limit 查询数据时,可以看到数据是随机到某个节点查询

postgres=# select * from tmp_t0 limit 4;
 c1 |                c2                |                c3                
----+----------------------------------+----------------------------------
 1  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
 3  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
 5  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
 7  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
(4 rows)

postgres=# select * from tmp_t0 limit 4;
 c1 |                c2                |                c3                
----+----------------------------------+----------------------------------
 2  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
 4  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
 6  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
 8  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
(4 rows)


http://www.niftyadmin.cn/n/2124343.html

相关文章

Solaris 下rac的安装

Unix下rac集群安装 1&#xff0c;前提 rac1 10.85.0.10 rac2 10.85.0.20 2&#xff0c;配置网络连接需要公网&#xff0c;私网各一个 192.168.15.10 prac1 192.168.15.20 vrac1 192.168.15.100 prac2 192.168.15.200 vrac2 3,设置网络连接可以再添加一块网卡&#xff0c;…

pgcrypto

os:centos 6.8 db:postgresql 9.6 pgcrypto模块为PostgreSQL提供了密码函数。 简单使用了一下。 创建pgcrypto mondb# create extension pgcrypto; CREATE EXTENSION加密 mondb# select encrypt(123我men,aa,aes);encrypt --------------------------------…

php反射

什么是php反射类&#xff0c;顾名思义&#xff0c;可以理解为一个类的映射。 1.自动生成文档 根据反射的分析类&#xff0c;接口&#xff0c;函数和方法的内部结构&#xff0c;方法和函数的参数&#xff0c;以及类的属性和方法&#xff0c;可以自动生成文档。 <?php cla…

rac中ssh的配置

rac中的ssh双机配置 1&#xff0c;前提条件 rac1 192.168.5.80 rac2 192.168.5.90 修改主机名称是 vi /etc/sysconfig/network 为使其立即生效在修改hostname xx 临时的ip是 ifconfig eth0 ip 2,修改静态主机表 vi /etc/hosts 将本机的主机名&#xff0c;删除 在后面追加 1…

dos命令

net user 用户名 密码 /add 添加用户 net user guest /active:yes 激活guest用户 net user 查看有哪些用户 net user 账户名 查看该账户的属性 net localgroup administrators 用户名 /add 将用户添加到管理员&#xff0c;具有管理员的权限 net start 查看开启了哪些服…

ElasticSearch从入门到精通,详解

1.ElasticSearch的简介 Elasticsearch的特点 1&#xff09;可以作为一个大型分布式集群&#xff08;数百台服务器&#xff09;技术&#xff0c;处理PB级数据&#xff0c;服务大公司&#xff1b;也可以运行在单机上&#xff0c;服务小公司 2&#xff09;Elasticsearch不是什么…

greenplum 5.7 + gp_toolkit

gp_toolkit 是 greenplum的一个功能schema。 包含了大量实用的函数。 $ psql -d peiybdb psql (8.3.23) Type "help" for help.peiybdb# \dnSList of schemasName | Owner | Access privileges | Description …

data guard 命令

1&#xff0c;修改oracle为最大的保护模式 alter database set standby database to maxsize protection; alter database open; 查看数据库是那种模式 select protection_mode from v$database; 2&#xff0c;在备机上 recover managed standby database disconnect from …