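Pigsty Documentation v3.4

1: Getting Started

1.1: Installation
1.2: Offline Installation
1.3: Slim Installation
1.4: Declarative Configuration
1.5: Preparation
1.6: Running Playbooks
1.7: Provisioning
1.8: Security Considerations
1.9: FAQ

2: About Pigsty

2.1: Features
2.2: Modules
2.3: Roadmap
2.4: History
2.5: Events & News
2.6: Community
2.7: Privacy Policy
2.8: License
2.9: Sponsor Us
2.10: Case Studies
2.11: Subscription

3: Concepts

3.1: Architecture
3.2: Cluster Model
3.3: Monitoring System
3.4: Local CA
3.5: Infrastructure as Code
3.6: High Availability
3.7: Point-in-Time Recovery
3.8: Service Access
3.9: Access Control

4: Configuration Templates

4.1: Overview
4.2: Single node: meta
4.3: Single node: rich
4.4: Single node: pitr
4.5: Single node: demo
4.6: Single node: supa
4.7: Single node: bare
4.8: Four nodes: full
4.9: Four nodes: safe
4.10: Four nodes: mssql
4.11: Four nodes: polar
4.12: Four nodes: ivory
4.13: Four nodes: mysql
4.14: Four nodes: oriole
4.15: Four nodes: minio
4.16: Two nodes: dual
4.17: Two nodes: slim
4.18: Three nodes: trio
4.19: Five nodes: oss
4.20: 36 nodes: simu

5: Reference

5.1: OS Compatibility
5.2: Parameters
5.3: Extensions
5.4: File Structure
5.5: Comparison
5.6: Cost Reference
5.7: Glossary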

6: PostgreSQL

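6.1: Concepts
6.2: Architecture
6.3: Users / Roles
6.4: Databases
6.5: Services / Access
6.6: Extensions
6.7: Authentication / HBA
6.8: Cluster Configuration
6.9: Parameters
6.10: Playbooks
6.11: Administration
6.12: Access Control
6.13: Backup & Recovery
6.14: Migration
6.15: Monitoring Integration
6.16: Dashboards
6.17: Metrics
6.18: FAQ

7: PG Kernel Forks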

7.1: Citus (Distributive)
7.2: Babelfish (MSSQL)
7.3: IvorySQL (Oracle)
7.4: OpenHalo (MySQL)
7.5: OrioleDB (OLTP)
7.6: PolarDB PG (RAC)
7.7: PolarDB O(racle)
7.8: PostgresML (AI/ML)
7.9: Supabase (Firebase)
7.10: Greenplum (MPP)
7.11: Cloudberry (MPP)
7.12: Neon (Serverless)

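8: Module: INFRA

8.1: Architecture
8.2: Cluster Configuration
8.3: Parameters
8.4: Playbooks
8.5: Administration
8.6: Monitoring & Alerting
8.7: Metrics
8.8: FAQ

9: Module: NODE

9.1: Concepts
9.2: Cluster Configuration
9.3: Parameters
9.4: Playbooks
9.5: Administration
9.6: Monitoring & Alerting
9.7: Metrics
9.8: FAQ

10: Module: ETCD

10.1: Cluster Configuration
10.2: Parameters
10.3: Playbooks
10.4: Administration
10.5: Monitoring & Alerting
10.6: Metrics
10.7: FAQ

11: Module: MINIO

11.1: Usage
11.2: Cluster Configuration
11.3: Parameters
11.4: Playbooks
11.5: Administration
11.6: Monitoring & Alerting
11.7: Metrics
11.8: FAQ

12: Module: REDIS

12.1: Cluster Configuration
12.2: Parameters
12.3: Playbooks
12.4: Administration
12.5: Monitoring & Alerting
12.6: Metrics
12.7: FAQ

13: Module: FERRET

13.1: Usage
13.2: Cluster Configuration
13.3: Administration
13.4: Metrics
13.5: FAQ

14: Module: DOCKER

14.1: Usage
14.2: Parameters
14.3: Playbooks
14.4: Metrics
14.5: FAQ

15: Modules: Pro / Pilot

15.1: Module: MySQL

15.2: Module: Kafka

15.3: Module: DuckDB

15.4: Module: TigerBeetle

15.5: Module: Kubernetes

15.6: Module: Consul

15.7: Module: Victoria

15.8: Module: Jupyter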

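16: Tutorials

16.1: DNS: Accessing Pigsty Web Services by Domain Name
16.2: Nginx: Exposing Web Services through a Proxy
16.3: Certbot: Requesting Public HTTPS Certificates
16.4: Docker: Enabling Containers and an Image Proxy
16.5: Using PostgreSQL as the Ansible Inventory and CMDB
16.6: Using PostgreSQL as the Grafana Backend Database
16.7: Storing Prometheus Time-Series Metrics with TimescaleDB + Promscale
16.8: Configuring an L2 VIP for Pigsty Node Clusters with Keepalived
16.9: Configuring an L2 VIP for PostgreSQL Clusters with VIP-Manager
16.10: HugePage: Enabling Huge Pages for the Database
16.11: Citus: Deploying a Native HA Cluster
16.12: HA Drill: What to Do When 2 of 3 Nodes Fail
16.13: Restic: File System Backup and Recovery
16.14: JuiceFS: Distributed File System
16.15: Cheap VPS

17: Application Templates

17.1: Dify: Self-Hosted AI Workflow Platform
17.2: Odoo: Self-Hosted Open-Source ERP
17.3: Self-Hosted Supabase: The Database of Choice for Startups Going Global

18: Software Tools

18.1: PGAdmin4: Managing PG Databases with a GUI
18.2: Kong: Enterprise-Grade Open-Source API Gateway
18.3: Jupyter: Data Analysis Notebook and AI IDE
18.4: Gitea: Simple Self-Hosted Code Hosting
18.5: Wiki.js: Build Your Own Wikipedia
18.6: Minio: Open-Source S3, Simple Object Storage Service
18.7: ByteBase: PG Schema Migration Tool
18.8: PostgREST: Auto-Generated REST API
18.9: SchemaSPY: PG Schema Visualization
18.10: PGWeb: Accessing PostgreSQL from the Browser
18.11: Discourse: Open-Source Tech Forum
18.12: GitLab: Enterprise-Grade Open-Source Code Hosting

19: Data Analysis

19.1: PGLOG: Built-in PG Log Analysis App
19.2: NOAA ISD Global Weather Station Historical Data Query
19.3: WHO COVID-19 Dashboard
19.4: AWS and Aliyun Server Prices
19.5: Solving the 24 Game with a Single SQL Query
19.6: DB-Engine Database Popularity Trends
19.7: StackOverflow Global Developer Survey

20: Release Notes
21: Bugs & Vulnerabilities
22: Cheap VPS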

“PostgreSQL In Great STYle”: Postgres, Infras, Graphics, Service, Toolbox, it’s all Yours.

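A batteries-included, local-first PostgreSQL distribution and an open-source alternative to RDS.

Repo | Demo | Blog | Forum | GPTs | WeChat Official Account | EN Docs

Get started with the latest version of Pigsty: curl -fsSL https://repo.pigsty.cc/get | bash

1 - Getting Started

Get started with Pigsty: plan according to your needs, prepare resources, provision servers, create an admin user, download the software, and complete the installation.

1.1 - Installation

Download, configure, and install Pigsty

Installing Pigsty takes four steps: prepare, download, configure, and install. If you have no Internet access, see Offline Installation as well.

Short Version

Prepare a Linux x86_64 / aarch64 node running a compatible operating system, log in as a user with passwordless sudo, and run the following command: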

curl -fsSL https://repo.pigsty.cc/get | bash; cd ~/pigsty;

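This command downloads the Pigsty source and extracts it into your home directory. Run the configure and install steps in order to finish the installation:

./bootstrap; ./configure; ./install.yml; # prepare dependencies, generate config, install Pigsty: three steps!

bootstrap: [optional] makes sure Ansible is installed; uses the offline package /tmp/pkg.tgz if it exists
configure: [optional] detects the environment and generates a recommended config file; skip it if you already know how to configure Pigsty
install.yml: runs the installation on the current node according to the generated config file

The whole installation takes 5 to 30 minutes depending on server specs and network conditions; offline installation is significantly faster.

Once the installation is complete, you can reach the web interface through Nginx by domain name or on ports 80/443, and the default PostgreSQL database service on port 5432. You can then use Pigsty to manage more nodes and deploy other modules.

If you find Pigsty's components too complex, consider the Slim Installation, which installs only the components required for a highly available PostgreSQL cluster.

Video example: online single-node installation (EL9)

Using the Command-Line Tool

Since v3.2, Pigsty provides a command-line tool, pig, which can install Pigsty, generate configuration, run deployments, and more.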

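curl https://repo.pigsty.cc/pig | bash  # Mainland China
curl https://repo.pigsty.io/pig | bash  # International

The commands above install the pig command-line tool (currently Linux amd64 / arm64 only). You can then use the pig sty subcommands to perform the steps above:

pig sty init     # install the embedded latest Pigsty version by default
pig sty boot     # run bootstrap, install Ansible
pig sty conf     # run configure, generate the config file
pig sty install  # run the install.yml playbook to complete deployment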

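Prepare

For complete details about preparation, see the Getting Started: Preparation section.

Pigsty runs on the Linux kernel with x86_64/aarch64 architectures, on physical or virtual machines, and requires a static IP address. The minimum spec is 1C1G; at least 2C4G is recommended. There is no upper limit, and parameters are tuned automatically to fit the hardware.

We strongly recommend deploying on a fresh node with a newly installed operating system to avoid needless installation problems. RockyLinux 8.9 or Ubuntu 22.04.3 is recommended; see Compatibility for the full list of supported operating systems.

On the admin node where Pigsty is installed, you need ssh login and sudo privileges. If your deployment involves multiple nodes, make sure the admin user on the admin node can log in to every managed node (including the admin node itself) via SSH public key without a password.

Avoid installing as root

Installing Pigsty as root works, but the security best practice is to use a dedicated admin user (e.g. dba) that is neither root nor the database superuser (postgres). By default, the Pigsty installation creates the optional admin user dba specified in the config file.

Pigsty relies on Ansible to run playbooks, so you need the ansible and jmespath packages installed before running the installation. The bootstrap script can take care of this, which is especially useful when you have no Internet access and need an offline installation.

./bootstrap   # install Ansible and Jmespath by whichever method is available; uses the offline package if present

You can also install the required Ansible and Jmespath packages directly with your operating system's package manager: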

sudo dnf install -y ansible python3.12-jmespath python3-cryptography   # EL 8 / 9

sudo yum install -y ansible   # EL7: no need to install Jmespath explicitly

sudo apt install -y ansible python3-jmespath   # Debian / Ubuntu

brew install ansible   # macOS

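Download

The following commands download the Pigsty source package and extract it to the ~/pigsty directory:

curl -fsSL https://repo.pigsty.cc/get | bash            # install the latest stable version
curl -fsSL https://repo.pigsty.cc/get | bash -s v3.3.0  # install a specific version, v3.3.0

Sample output of the one-line download script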

$ curl -fsSL https://repo.pigsty.cc/get | bash 

[v3.3.0] ===========================================
$ curl -fsSL https://repo.pigsty.cc/get | bash
[Site] https://pigsty.cc
[Demo] https://demo.pigsty.cc
[Repo] https://github.com/pgsty/pigsty
[Docs] https://pigsty.cc/docs/setup/install
[Download] ===========================================
[ OK ] version = v3.3.0 (from default)
curl -fSL https://repo.pigsty.cc/src/pigsty-v3.3.0.tgz -o /tmp/pigsty-v3.3.0.tgz
[WARN] tarball = /tmp/pigsty-v3.3.0.tgz exists, size = 1227379, use it
[ OK ] md5sums = xxxxxx  /tmp/pigsty-v3.3.0.tgz
[Install] ===========================================
[WARN] os user = root , it's recommended to install as a sudo-able admin
[WARN] pigsty already installed on '/root/pigsty', if you wish to overwrite:
sudo rm -rf /tmp/pigsty_bk; cp -r /root/pigsty /tmp/pigsty_bk; # backup old
sudo rm -rf /tmp/pigsty;    tar -xf /tmp/pigsty-v3.3.0.tgz -C /tmp/; # extract new
rsync -av --exclude='/pigsty.yml' --exclude='/files/pki/***' /tmp/pigsty/ /root/pigsty/; # rsync src
[TodoList] ===========================================
cd /root/pigsty
./bootstrap      # [OPTIONAL] install ansible & use offline package
./configure      # [OPTIONAL] preflight-check and config generation
./install.yml    # install pigsty modules according to your config.
[Complete] ===========================================

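You can also download the Pigsty source with git. Be sure to check out a specific version before using it.

git clone https://github.com/pgsty/pigsty; cd pigsty; git checkout v3.3.0

Check out a specific version

The main branch is under active development. When using git, always check out a specific release; see the Release Notes for available versions.

Configure

configure generates a recommended (single-node) pigsty.yml config file based on your current environment.

Tip: if you already know how to configure Pigsty, the configure step is optional and can be skipped. Pigsty ships many ready-to-use configuration templates for reference.

./configure # without arguments: recommend a config automatically and ask interactively
./configure [-i|--ip <ipaddr>]                     # primary IP address; if omitted, you are asked when multiple candidate IPs are detected
            [-c|--conf <conf>]                     # config template (name relative to conf/, without the .yml suffix); defaults to the single-node meta template
            [-v|--version <ver>]                   # PostgreSQL major version to install; not applicable to some templates
            [-r|--region <default|china|europe>]   # mirror region; set to china when behind the GFW
            [-n|--non-interactive]                 # skip the interactive wizard
            [-x|--proxy]                           # write proxy settings from environment variables into the proxy_env parameter

Sample output of configure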

$ ./configure
configure pigsty v3.3.0 begin
[ OK ] region = china
[ OK ] kernel = Linux
[ OK ] machine = x86_64
[ OK ] package = rpm,yum
[ OK ] vendor  = centos (CentOS Linux)
[ OK ] version = 7 (7)
[ OK ] sudo = vagrant ok
[ OK ] ssh = vagrant@127.0.0.1 ok
[WARN] Multiple IP address candidates found:
    (1) 192.168.121.110	    inet 192.168.121.110/24 brd 192.168.121.255 scope global noprefixroute dynamic eth0
    (2) 10.10.10.10	    inet 10.10.10.10/24 brd 10.10.10.255 scope global noprefixroute eth1
[ OK ] primary_ip = 10.10.10.10 (from demo)
[ OK ] admin = vagrant@10.10.10.10 ok
[WARN] mode = el7, CentOS 7.9 EOL @ 2024-06-30, deprecated, consider using el8 or el9 instead
[ OK ] configure pigsty done
proceed with ./install.yml

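-i|--ip: the primary internal IP address of the current host, used to replace the IP placeholder 10.10.10.10 in the config file.
-c|--conf: the configuration template to use, given as a name relative to the conf/ directory without the .yml suffix.
-v|--version: the PostgreSQL major version to install, e.g. 13, 14, 15, 16, 17; some templates do not support this option.
-r|--region: the upstream repo region, used to speed up downloads: (default|china|europe)
-n|--non-interactive: take the primary IP address from the command line and skip the interactive wizard.
-x|--proxy: populate the proxy_env variable from the current environment (affects http_proxy/HTTP_PROXY, HTTPS_PROXY, ALL_PROXY, NO_PROXY).

If the machine's network interfaces carry multiple IP addresses, specify the primary IP of the current node explicitly with -i|--ip <ipaddr>, or provide it in the interactive prompt. The address must be a static IP; do not use a public IP address.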
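Before choosing a value for -i, it can help to list the candidate addresses and see which ones look DHCP-assigned. The sketch below is not part of Pigsty: classify_ipv4 is a hypothetical helper that parses text in the format printed by `ip -o -4 addr show scope global` and treats the `dynamic` flag as a hint for a DHCP lease. The sample input is made up to mirror the configure output shown earlier.

```bash
#!/usr/bin/env bash
# classify_ipv4 (hypothetical helper): read `ip -o -4 addr show scope global`
# style lines on stdin and print "<address> static|dhcp" for each one.
classify_ipv4() {
  awk '$3 == "inet" {
    split($4, cidr, "/")                    # drop the /prefix length
    kind = "static"
    for (i = 5; i <= NF; i++)               # "dynamic" usually means a DHCP lease
      if ($i == "dynamic") kind = "dhcp"
    print cidr[1], kind
  }'
}

# Made-up sample input with one DHCP address and one static address
sample='2: eth0    inet 192.168.121.110/24 brd 192.168.121.255 scope global noprefixroute dynamic eth0
3: eth1    inet 10.10.10.10/24 brd 10.10.10.255 scope global noprefixroute eth1'

printf '%s\n' "$sample" | classify_ipv4
```

On a real node the same function can be fed live data with `ip -o -4 addr show scope global | classify_ipv4`; an address reported as static is the better candidate for ./configure -i.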

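The generated config file is located at ~/pigsty/pigsty.yml by default. You can review and customize it before installing.

Change the default passwords!

We strongly recommend changing the default passwords and credentials in the config file before installing; see Security Considerations for details.

Install

The install.yml playbook performs a standard single-node Pigsty installation on the current node by default.

./install.yml    # install on all nodes in one pass

Sample output of the installation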

[vagrant@meta pigsty]$ ./install.yml

PLAY [IDENTITY] ********************************************************************************************************************************

TASK [node_id : get node fact] *****************************************************************************************************************
changed: [10.10.10.10]
...
...
PLAY RECAP **************************************************************************************************************************************************************************
10.10.10.10                : ok=288  changed=215  unreachable=0    failed=0    skipped=64   rescued=0    ignored=0
localhost                  : ok=3    changed=0    unreachable=0    failed=0    skipped=4    rescued=0    ignored=0

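This is an Ansible playbook. The following options control its targets and tasks and pass extra arguments:

-l: limit the target hosts
-t: limit the tasks to run
-e: pass extra command-line variables
-i: use a config file other than pigsty.yml
…

Do not run the install playbook repeatedly!

Warning: running install.yml again on an initialized environment resets the entire environment, so be careful!

This playbook is for initial installation only. After installation, you can use rm install.yml or chmod a-x install.yml to prevent accidental execution.

User Interface

After installation, four core modules are installed on the current node: PGSQL, INFRA, NODE, and ETCD.

The PGSQL module on this node provides a ready-to-use single-node PostgreSQL instance, reachable by default with the following connection strings: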

psql postgres://dbuser_dba:DBUser.DBA@10.10.10.10/meta     # DBA / superuser (direct IP access)

psql postgres://dbuser_meta:DBUser.Meta@10.10.10.10/meta   # business admin user: read / write / DDL

psql postgres://dbuser_view:DBUser.View@pg-meta/meta       # read-only user (access via domain name)

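The INFRA module on this node provides the monitoring infrastructure. The default domain names and ports are:

Component	Port	Domain	Description	Demo URL
Nginx	80/443	`h.pigsty`	Web service portal, local YUM repo	`home.pigsty.cc`
AlertManager	9093	`a.pigsty`	Alert aggregation / silencing page	`a.pigsty.cc`
Grafana	3000	`g.pigsty`	Grafana dashboards	`demo.pigsty.cc`
Prometheus	9090	`p.pigsty`	Prometheus admin UI	`p.pigsty.cc`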

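The default username and password for the Grafana monitoring system (g.pigsty / port 3000) are admin / pigsty.

You can reach these services directly by IP address and port, but we recommend going through the Nginx proxy on ports 80/443 by domain name for all components. To access the Pigsty WebUI by domain name, configure DNS resolution or edit the local /etc/hosts static resolution file.

How do I access the Pigsty WebUI by domain name?

Clients can use domain names in several ways:

Resolve Internet domain names through a DNS provider; suitable for systems reachable from the public Internet.
Configure records on an internal DNS server for intranet name resolution.
Add static records to the local /etc/hosts file.

We recommend the third option for ordinary users. On the machine whose browser accesses the web system, edit /etc/hosts (requires sudo) or C:\Windows\System32\drivers\etc\hosts (Windows) and add the following record:

<your_public_ip_address>  h.pigsty a.pigsty p.pigsty g.pigsty

The IP address here is the externally reachable IP address of the node where Pigsty is installed.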
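Adding the record by hand is fine for one machine; for repeated setups a small idempotent helper avoids duplicate lines. This is a minimal sketch, not a Pigsty tool: add_hosts_entry is a hypothetical function, and 10.10.10.10 stands in for your own address. The demo writes to a scratch file rather than the real /etc/hosts.

```bash
#!/usr/bin/env bash
# add_hosts_entry FILE IP NAME... (hypothetical helper)
# Append "IP  NAME..." to FILE unless an identical line is already there.
add_hosts_entry() {
  local file="$1" ip="$2"
  shift 2
  local line="$ip  $*"
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demo on a scratch file; 10.10.10.10 is a placeholder address
hosts_file="$(mktemp)"
add_hosts_entry "$hosts_file" 10.10.10.10 h.pigsty a.pigsty p.pigsty g.pigsty
add_hosts_entry "$hosts_file" 10.10.10.10 h.pigsty a.pigsty p.pigsty g.pigsty   # second call is a no-op
cat "$hosts_file"
```

To apply it to the real file, the function would need to run with root privileges and /etc/hosts as the first argument.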

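How do I configure the domain names used on the server side?

Server-side domain names are configured through Nginx. To replace the default domains, fill in your own in the infra_portal parameter. When you open the Grafana home page at http://g.pigsty, you are actually reaching the Grafana WebUI through the Nginx proxy:

http://g.pigsty -> http://10.10.10.10:80 (nginx) -> http://10.10.10.10:3000 (grafana)

How do I access the Pigsty WebUI over HTTPS?

Pigsty enables SSL for Nginx by default using an automatically generated self-signed CA certificate. To access these pages over HTTPS without the "not secure" warning, you usually have three options:

Trust the Pigsty self-signed CA certificate in your browser or operating system: files/pki/ca/ca.crt
In Chrome, type thisisunsafe in the warning window to skip the prompt.
Use Let's Encrypt or another free CA service to issue a proper SSL certificate for Pigsty Nginx.

You can use Pigsty to deploy more clusters and manage more nodes, for example: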

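bin/node-add   pg-test      # bring the 3 nodes of cluster pg-test under Pigsty management
bin/pgsql-add  pg-test      # initialize a 3-node HA PostgreSQL cluster pg-test
bin/redis-add  redis-ms     # initialize the Redis cluster redis-ms

Most modules depend on the NODE module, so make sure a node is managed by Pigsty before adding other modules to it. See the module pages for details:

PGSQL, INFRA, NODE, ETCD, MINIO, REDIS, MONGO, DOCKER, ...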

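1.2 - Offline Installation

How do you install Pigsty without Internet access, and how do you build, use, and obtain offline packages?

Pigsty's default standard installation requires Internet access, but production database servers are usually isolated from the Internet.

To address this, Pigsty supports offline installation: with an offline package, you can install and deploy in an environment without Internet access.

An offline package is a snapshot and mirror of the local software repo. Using it avoids repeated downloads and traffic, speeds up installation significantly, and makes delivery more reliable and consistent.

Building the Local Software Repo

During installation, Pigsty downloads the required rpm/deb packages from upstream yum/apt repositories on the Internet and builds a local software repo (at /www/pigsty by default). The local repo is served by Nginx; afterwards, both this node and other nodes install from it by default and no longer need Internet access.

A local software repo has four main benefits:

It avoids repeated downloads and traffic, which speeds up installation significantly and makes it more reliable.
Building it takes a snapshot of the currently available software versions, so every node in the deployment installs the same versions.
The local snapshot protects against missing dependencies and installation failures caused by upstream changes: once the first node succeeds, other nodes in the same environment will succeed too.
The finished local repo can be packed as a whole and copied to an isolated environment running the same operating system for offline installation.

In Pigsty, the local software repo lives in /www/pigsty on the local node by default (configurable through the nginx_home and repo_name parameters). It is built with createrepo_c (EL) or dpkg-dev (Debian), and this node and other nodes reference it through the repo definition with module=local in repo_upstream.

You can run a standard installation on a node with exactly the same operating system. Once the repo is built, copy its directory by any means (scp/rsync/ftp/usb) to another node running exactly the same operating system and use it for offline installation.

The more standard and general approach is to pack the local repo on the installed node into an offline package and copy that to the isolated nodes in the same environment.

Making an Offline Package

Pigsty provides the cache.yml playbook for making offline packages. For example, the following command packs the local repo /www/pigsty on the infra node into an offline package and fetches it to the local dist/${version} directory.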

./cache.yml -l infra

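You can customize the output directory and name of the offline package with the cache_pkg_dir and cache_pkg_name parameters. For example, the following command writes the offline package to files/pkg.tgz.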

./cache.yml -l infra -e '{"cache_pkg_dir":"files","cache_pkg_name":"pkg.tgz"}'

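Using an Offline Package

An offline package is simply a tarball made with tar and gzip. To use it, extract it so that its contents end up under /www/pigsty:

sudo rm -rf /www/pigsty; sudo tar -xf /tmp/pkg.tgz -C /www

A simpler approach is to copy the offline package to /tmp/pkg.tgz on the node to be installed; Pigsty then unpacks it automatically during bootstrap and installs the required software from the local repo.

When building the local repo, Pigsty creates a marker file, repo_complete, which identifies the directory as a Pigsty local software repo. If Pigsty finds an existing local repo during installation, it enters offline installation mode: it skips downloading from the Internet and building the repo, and completes the whole installation from the local repo without Internet access.

Criterion for entering offline installation mode

The local repo is considered present when the marker file, located at /www/pigsty/repo_complete by default, exists.

The marker file is created automatically once downloads finish during a standard installation, indicating that the local repo is usable.

After you delete the repo_complete marker file, the next installation downloads the missing packages from upstream again.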
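The marker-file rule can be checked with a one-line test before you start an installation. The following is only an illustration of the criterion described above, not code taken from Pigsty: repo_mode is a hypothetical helper, and the demo uses a scratch directory in place of /www/pigsty.

```bash
#!/usr/bin/env bash
# repo_mode [REPO_DIR] (hypothetical helper)
# Print "offline" if REPO_DIR/repo_complete exists, otherwise "online".
repo_mode() {
  local repo_dir="${1:-/www/pigsty}"
  if [ -f "$repo_dir/repo_complete" ]; then
    echo offline
  else
    echo online
  fi
}

# Demo on a scratch directory standing in for /www/pigsty
demo_repo="$(mktemp -d)"
repo_mode "$demo_repo"               # marker absent
touch "$demo_repo/repo_complete"
repo_mode "$demo_repo"               # marker present
rm -f "$demo_repo/repo_complete"     # removing the marker re-enables upstream downloads
repo_mode "$demo_repo"
```

Calling repo_mode without arguments checks the default location /www/pigsty.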

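Offline Package Compatibility

The RPM/DEB packages in an offline package fall roughly into three categories:

INFRA packages: Prometheus, Grafana, and the various monitoring components; they usually run on any Linux distribution.
PGSQL packages: the PostgreSQL kernel and its extensions; they are usually tied to the major version of the Linux distribution.
NODE packages: shared libraries and host dependencies; they are usually tied to both the major and minor version of the Linux distribution.

Because an offline package contains all three categories, its applicability depends on both the major and the minor version of the operating system.

In general, an offline package can only be used when the OS major and minor versions match exactly. If the major version differs, INFRA packages usually install fine, while PGSQL and NODE packages almost certainly hit missing or conflicting dependencies. If only the minor version differs, INFRA and PGSQL packages usually install fine, while NODE packages may or may not succeed.

For example, an offline package built on RockyLinux 8.9 is quite likely to work as-is on RockyLinux 8.10, whereas one built on Ubuntu 22.04.3 is very likely to run into dependency conflicts on 22.04.4. (Yes: in Ubuntu version numbers the .3 is the minor version, and 22.04 as a whole is the jammy major version.)

If the OS minor version of the offline package does not match, you can take a compromise approach: after bootstrap, remove the /www/pigsty/repo_complete marker file so that Pigsty downloads the missing NODE packages and their dependencies from upstream during installation. This effectively resolves dependency conflicts while retaining the benefits of the offline package.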
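The compatibility rules above boil down to a comparison of major and minor versions. The sketch below encodes them as a rough expectation, nothing more: offline_pkg_fit is a hypothetical helper, and its messages paraphrase the prose rather than any Pigsty output.

```bash
#!/usr/bin/env bash
# offline_pkg_fit BUILD_MAJOR BUILD_MINOR TARGET_MAJOR TARGET_MINOR (hypothetical helper)
# Print what to expect when a package built on one OS version is used on another.
offline_pkg_fit() {
  if [ "$1" != "$3" ]; then
    echo "major mismatch: INFRA likely ok, PGSQL and NODE likely broken"
  elif [ "$2" != "$4" ]; then
    echo "minor mismatch: INFRA and PGSQL likely ok, NODE may conflict"
  else
    echo "exact match: package should be usable as-is"
  fi
}

offline_pkg_fit 8 9 8 10             # RockyLinux 8.9 package on 8.10
offline_pkg_fit 22.04 3 22.04 4      # Ubuntu: 22.04 is the major version, .3 / .4 the minor
offline_pkg_fit 8 10 9 4             # EL8 package on EL9
offline_pkg_fit 12 7 12 7            # same major and minor version
```

A minor mismatch is the case where removing the repo_complete marker after bootstrap is worth trying.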

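Downloading Offline Packages

Since Pigsty v3, pre-built offline packages are no longer publicly available, and online installation is the default. Online installation downloads INFRA / PGSQL packages from the official Pigsty repository and its mainland China CDN, and NODE packages from your operating system's official repositories and mirrors, which largely avoids minor-version dependency conflicts among NODE packages.

For the following exact operating system versions, however, we offer paid pre-built offline packages, which additionally include all available extensions, Docker, and other components.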

RockyLinux 8.10 / 8.10 (x86_64)
RockyLinux 9.4 / 9.4 (x86_64, aarch64)
Ubuntu 24.04.1 (x86_64, arm64)
Ubuntu 22.04.5 (x86_64, arm64)
Debian 12.7 (x86_64, arm64)

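All of Pigsty's integration tests run against the pre-built offline package snapshot taken before each release. Offline packages reduce the delivery risk caused by upstream dependency changes and save you trouble and waiting time. Buying one is also a way to support open source; the price is ¥199. Contact @Vonng for a download link.

For Pigsty Pro subscriptions, we provide offline packages that exactly match the OS major and minor versions you use and that have passed integration testing.

Bootstrap

Pigsty needs Ansible to run playbooks, so Ansible itself cannot sensibly be installed by a playbook. The bootstrap script solves this: it does its best, by various means, to make sure Ansible is installed on the node.

./bootstrap       # make sure ansible is installed (if an offline package exists, extract it and install from it first)

If you install with an offline package, bootstrap detects and processes the package at /tmp/pkg.tgz and installs Ansible from it first. If there is no offline package but Internet access is available, bootstrap adds the upstream repo for your operating system and region mirror and installs Ansible from there. If there is neither Internet access nor an offline package, bootstrap leaves the problem to you: you must make sure the repos configured on the node provide a usable Ansible package.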
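The three cases above form a simple decision order: offline package first, then Internet, then manual handling. The sketch below mirrors that order for illustration only. It is not the bootstrap script's actual code: bootstrap_plan is a hypothetical helper, and whether Internet access exists is passed in as a yes/no argument instead of being probed.

```bash
#!/usr/bin/env bash
# bootstrap_plan [PKG_PATH] [HAS_INTERNET yes|no] (hypothetical helper)
# Print which of the three bootstrap paths described above would apply.
bootstrap_plan() {
  local pkg="${1:-/tmp/pkg.tgz}" online="${2:-no}"
  if [ -f "$pkg" ]; then
    echo "offline: extract $pkg and install ansible from the local repo"
  elif [ "$online" = yes ]; then
    echo "online: add the upstream repo for this OS and region, then install ansible"
  else
    echo "manual: keep existing repos, the user must provide an ansible package source"
  fi
}

# Demo with a scratch file standing in for the offline package
demo_pkg="$(mktemp)"
bootstrap_plan "$demo_pkg" no        # package present
rm -f "$demo_pkg"
bootstrap_plan "$demo_pkg" yes       # no package, Internet available
bootstrap_plan "$demo_pkg" no        # neither
```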

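The bootstrap script takes a few optional arguments. Use -p|--path to point at an offline package somewhere other than /tmp/pkg.tgz, and -r|--region to choose a region so that, if Ansible has to be downloaded from the Internet, Pigsty adds the repo for that region (global, China, Europe).

./bootstrap
   [-r|--region <region>]  [default,china,europe]
   [-p|--path <path>]      specify another offline pkg path
   [-k|--keep]             keep existing upstream repo during bootstrap

Finally, bootstrap by default backs up (to /etc/yum.repos.d/backup or /etc/apt/sources.list.d/backup) and removes the repos currently configured on the node, to avoid repo conflicts as far as possible. If that is not what you want, or you have already configured a local repo, use -k|--keep to keep the existing repos.

Example: with an offline package

Installing Pigsty with an offline package on a node running RockyLinux 8.9: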

[vagrant@el8 pigsty]$ ls -alh /tmp/pkg.tgz
-rw-r--r--. 1 vagrant vagrant 1.4G Sep  1 10:20 /tmp/pkg.tgz
[vagrant@el8 pigsty]$ ./bootstrap
bootstrap pigsty v3.0.4 begin
[ OK ] region = china
[ OK ] kernel = Linux
[ OK ] machine = x86_64
[ OK ] package = rpm,dnf
[ OK ] vendor = rocky (Rocky Linux)
[ OK ] version = 8 (8.9)
[ OK ] sudo = vagrant ok
[ OK ] ssh = vagrant@127.0.0.1 ok
[ OK ] cache = /tmp/pkg.tgz exists
[ OK ] repo = extract from /tmp/pkg.tgz
[WARN] old repos = moved to /etc/yum.repos.d/backup
[ OK ] repo file = use /etc/yum.repos.d/pigsty-local.repo
[WARN] rpm cache = updating, may take a while
pigsty local 8 - x86_64                                                                                                                               49 MB/s | 1.3 MB     00:00
Metadata cache created.
[ OK ] repo cache = created
[ OK ] install el8 utils
Last metadata expiration check: 0:00:01 ago on Sun 01 Sep 2024 10:30:52 AM UTC.
Package wget-1.19.5-11.el8.x86_64 is already installed.
Package yum-utils-4.0.21-23.el8.noarch is already installed.
Dependencies resolved.
........
Installed:

  createrepo_c-0.17.7-6.el8.x86_64        createrepo_c-libs-0.17.7-6.el8.x86_64 drpm-0.4.1-3.el8.x86_64   modulemd-tools-0.7-8.el8.noarch python3-createrepo_c-0.17.7-6.el8.x86_64
  python3-libmodulemd-2.13.0-1.el8.x86_64 python3-pyyaml-3.12-12.el8.x86_64     sshpass-1.09-4.el8.x86_64 unzip-6.0-46.el8.x86_64
  ansible-9.2.0-1.el8.noarch                 ansible-core-2.16.3-2.el8.x86_64              git-core-2.43.5-1.el8_10.x86_64                 mpdecimal-2.5.1-3.el8.x86_64
  python3-cffi-1.11.5-6.el8.x86_64           python3-cryptography-3.2.1-7.el8_9.x86_64     python3-jmespath-0.9.0-11.el8.noarch            python3-pycparser-2.14-14.el8.noarch
  python3.12-3.12.3-2.el8_10.x86_64          python3.12-cffi-1.16.0-2.el8.x86_64           python3.12-cryptography-41.0.7-1.el8.x86_64     python3.12-jmespath-1.0.1-1.el8.noarch
  python3.12-libs-3.12.3-2.el8_10.x86_64     python3.12-pip-wheel-23.2.1-4.el8.noarch      python3.12-ply-3.11-2.el8.noarch                python3.12-pycparser-2.20-2.el8.noarch
  python3.12-pyyaml-6.0.1-2.el8.x86_64

Complete!
[ OK ] ansible = ansible [core 2.16.3]
[ OK ] boostrap pigsty complete
proceed with ./configure

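Example: no offline package, with Internet access (Debian 12)

On a Debian 12 node with Internet access and no offline package, bootstrap automatically adds the Debian 12 upstream repos and installs ansible and its dependencies: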

vagrant@d12:~/pigsty$ ./bootstrap
bootstrap pigsty v3.3.0 begin
[ OK ] region = china
[ OK ] kernel = Linux
[ OK ] machine = x86_64
[ OK ] package = deb,apt
[ OK ] vendor = debian (Debian GNU/Linux)
[ OK ] version = 12 (12)
[ OK ] sudo = vagrant ok
[ OK ] ssh = vagrant@127.0.0.1 ok
[WARN] old repos = moved to /etc/apt/backup
[ OK ] repo file = add debian bookworm china upstream
[WARN] apt cache = updating, may take a while
....... apt install output

[ OK ] ansible = ansible [core 2.14.16]
[ OK ] boostrap pigsty complete
proceed with ./configure

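Example: no offline package and no Internet access (Ubuntu 22.04)

Running bootstrap on an Ubuntu 22.04 node with neither Internet access nor an offline package.

We assume the user has already solved the package-source problem by some means, for example by configuring a local repo from a CD / filesystem path on the server, or by using a YUM / APT repository available on the intranet.

You can pass -k to keep the current repo configuration explicitly. If Pigsty detects that there is neither Internet access nor an offline package, it also keeps the current repo configuration by default.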

vagrant@u22:~/pigsty$ ./bootstrap
bootstrap pigsty v3.3.0 begin
[ OK ] region = china
[ OK ] kernel = Linux
[ OK ] machine = x86_64
[ OK ] package = deb,apt
[ OK ] vendor = ubuntu (Ubuntu)
[ OK ] version = 22 (22.04)
[ OK ] sudo = vagrant ok
[ OK ] ssh = vagrant@127.0.0.1 ok
[WARN] old repos = moved to /etc/apt/backup
[ OK ] repo file = add ubuntu jammy china upstream
[WARN] apt cache = updating, may take a while
[ OK ] repo cache = created

...(apt update/install output)

[ OK ] ansible = ansible 2.10.8
[ OK ] boostrap pigsty complete
proceed with ./configure

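1.3 - Slim Installation

How do you install only a highly available PostgreSQL cluster and its minimal dependencies, without the local software repo and infrastructure components?

Pigsty comes with a complete infrastructure stack built around highly available PostgreSQL, but all of it is optional: with the slim installation, you install only the components needed for an HA PostgreSQL cluster.

Overview

The slim installation focuses on a pure HA PostgreSQL cluster and installs only the essential components.

In slim mode there is no INFRA module, no monitoring, and no local repo.

Only part of the NODE module, plus the ETCD and PGSQL modules, is installed. You still get the full functionality of the PGSQL and ETCD modules.

The following systemd services are enabled on the nodes:

patroni: required, PostgreSQL high availability and control component
etcd: required, the configuration store (DCS) that Patroni HA depends on
pgbouncer: optional, database connection pool
vip-manager: optional, binds an optional L2 VIP to the primary
haproxy: optional, provides service access with automatic routing
chronyd: optional, synchronizes node time
tuned: optional, manages node tuning profiles and kernel parameters

patroni and etcd are required; the other components are optional. Even so, Pigsty does not recommend disabling them: you can simply leave them unused.

Configuration

A slim installation requires a few extra option switches, as shown in conf/slim.yml: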

all:
  children:
    infra: { hosts: { 10.10.10.10: { infra_seq: 1 } }, vars: { docker_enabled: true } }

    etcd:
      hosts:
        10.10.10.10: { etcd_seq:  1 }
        #10.10.10.11: { etcd_seq:  2 } # optional
        #10.10.10.12: { etcd_seq:  3 } # optional
      vars: { etcd_cluster: etcd }

    pg-meta:
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: primary }  # init one single-node pgsql cluster by default, with:
        #10.10.10.11: { pg_seq: 2, pg_role: replica } # optional replica : bin/pgsql-add pg-meta 10.10.10.11
        #10.10.10.12: { pg_seq: 3, pg_role: replica } # optional replica : bin/pgsql-add pg-meta 10.10.10.12
      vars:
        pg_cluster: pg-meta
        pg_users:         # define business users here: https://pigsty.cc/docs/pgsql/user/
          - { name: dbuser_meta ,password: DBUser.Meta ,pgbouncer: true ,roles: [ dbrole_admin ] ,comment: pigsty default user }
        pg_databases:     # define business databases here: https://pigsty.cc/docs/pgsql/db/
          - { name: meta ,comment: pigsty default database  }
        pg_hba_rules:     # define HBA rules here: https://pigsty.cc/docs/pgsql/hba/#define-hba
          - { user: dbuser_meta , db: all ,addr: world ,auth: pwd ,title: 'allow default user world access with password (not a good idea!)' }
        node_crontab:     # define backup policy with crontab (full|diff|incr)
          - '00 01 * * * postgres /pg/bin/pg-backup full'
        #pg_vip_address: 10.10.10.2/24  # optional l2 vip address and netmask
        pg_extensions:    # define pg extensions (345 available): https://pigsty.cc/ext/
          - postgis timescaledb pgvector

  vars:
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default|china|europe
    node_tune: tiny                   # use tiny template for NODE  in demo environment
    pg_conf: tiny.yml                 # use tiny template for PGSQL in demo environment

    # minimal installation setup
    node_repo_modules: node,infra,pgsql
    nginx_enabled: false
    dns_enabled: false
    prometheus_enabled: false
    grafana_enabled: false
    pg_exporter_enabled: false
    pgbouncer_exporter_enabled: false
    pg_vip_enabled: false

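For the initial installation, use the slim.yml playbook instead of the default install.yml.

./slim.yml

The minimal set of HA PostgreSQL cluster components is then installed on the nodes specified in the configuration.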

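1.4 - Declarative Configuration

Describe database clusters and infrastructure with declarative configuration

Pigsty treats infrastructure and databases as code: Database as Code & Infra as Code

You describe infrastructure and database clusters through a declarative interface / config file: state what you need in the inventory, then apply it with simple idempotent playbooks.

Inventory

Every Pigsty deployment has a corresponding inventory. It can be stored locally as YAML and managed with git, or generated dynamically from a CMDB or any other Ansible-compatible source.

By default Pigsty uses a single monolithic YAML file named pigsty.yml, located in the Pigsty source root, as its inventory. You can point to a different inventory with the -i command-line option.

The inventory has two parts: global variables and multiple group definitions. The former, all.vars, usually describes the infrastructure and sets global defaults for clusters. The latter, all.children, defines the clusters (PGSQL / Redis / MinIO / ETCD, and so on). At the top level, an inventory file looks roughly like this: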

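all:                  # top-level object: all
  vars: {...}         # global parameters
  children:           # group definitions
    infra:            # group definition: 'infra'
      hosts: {...}        # group members: 'infra'
      vars:  {...}        # group parameters: 'infra'
    etcd:    {...}    # group definition: 'etcd'
    pg-meta: {...}    # group definition: 'pg-meta'
    pg-test: {...}    # group definition: 'pg-test'
    redis-test: {...} # group definition: 'redis-test'
    # ...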

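Clusters

Each group definition usually represents a cluster: a node cluster, PostgreSQL cluster, Redis cluster, Etcd cluster, MinIO cluster, and so on. They all share the same format: hosts and vars. Define cluster members in all.children.<cls>.hosts and describe the cluster with cluster parameters in all.children.<cls>.vars. Here is the definition of a three-node HA PostgreSQL cluster named pg-test: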

pg-test:   # cluster name
  vars:    # cluster parameters
    pg_cluster: pg-test
  hosts:   # cluster members
    10.10.10.11: { pg_seq: 1, pg_role: primary } # instance 1 on 10.10.10.11, primary
    10.10.10.12: { pg_seq: 2, pg_role: replica } # instance 2 on 10.10.10.12, replica
    10.10.10.13: { pg_seq: 3, pg_role: offline } # instance 3 on 10.10.10.13, offline replica

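You can also define parameters for a specific host / instance, known as instance parameters. They override cluster and global parameters and are typically used to assign identity (instance number, role) to nodes and database instances.

Parameters

Global, group, and host variables are all dictionaries of key-value pairs. Each pair is a named parameter consisting of a string key and a value of one of five types: boolean, string, number, array, or object. See each module's parameter reference for detailed syntax and semantics.

Nearly all parameters have sensible defaults, with the exception of identity parameters: they serve as identifiers and must be set explicitly, for example pg_cluster, pg_role, and pg_seq.

A parameter can be overridden by a definition of the same name at a higher priority level. The precedence is:

command-line arguments > playbook variables > host variables (instance parameters) > group variables (cluster parameters) > global variables (global parameters) > defaults

For example:

Use the command-line argument -e pg_clean=true to force removal of an existing database.
Use the instance parameters pg_role and pg_seq to assign a role and sequence number to a database instance.
Use cluster variables to set defaults for a cluster, such as the cluster name pg_cluster and the database version pg_version.
Use global variables to set defaults for all PGSQL clusters, such as default parameters and the extension list.
If pg_version is not configured explicitly, the default version 16 is used as the final fallback.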
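The precedence chain can be read as "take the first level that defines the parameter". The sketch below imitates that lookup for a single parameter; it is a toy model, not how Ansible merges variables internally. resolve_param is a hypothetical helper whose arguments are the values at each level, highest priority first, with an empty string meaning "not defined at this level" (so a genuinely empty value cannot be represented here).

```bash
#!/usr/bin/env bash
# resolve_param CLI PLAYBOOK HOST GROUP GLOBAL DEFAULT (hypothetical helper)
# Print the first non-empty argument, i.e. the highest-priority definition.
resolve_param() {
  local value
  for value in "$@"; do
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  return 1    # not defined at any level
}

# pg_version: set to 15 at cluster (group) level, default is 16
resolve_param "" "" "" 15 "" 16

# pg_version: not set anywhere, so the default applies
resolve_param "" "" "" "" "" 16

# pg_clean: default false, overridden with -e pg_clean=true on the command line
resolve_param true "" "" "" "" false
```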

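Templates

The conf/ directory in the Pigsty source contains preset configuration templates for many different scenarios.

During configure, you can choose a template with the -c option; otherwise the single-node meta template is used by default.

For what each template does, see Configuration Templates.

Switching the Configuration Source

To use a different configuration template, copy its content into pigsty.yml in the Pigsty source directory and adjust it as needed.

You can also specify the config file explicitly with the -i option when running an Ansible playbook, for example:

./node.yml -i conf/full.yml    # initialize the target nodes from the full config file instead of the default pigsty.yml

To change the default config file name and location, edit the inventory parameter in ansible.cfg in the source root and point it to your config file. You can then run ansible-playbook without passing -i explicitly.

Pigsty also lets you use a database (CMDB) as a dynamic configuration source instead of a static config file. It provides three convenience scripts:

bin/inventory_load: load the content of the pigsty.yml config file into the local PostgreSQL database (meta.pigsty)
bin/inventory_cmdb: switch the configuration source to the local PostgreSQL database (meta.pigsty)
bin/inventory_conf: switch the configuration source back to the local static config file pigsty.yml

Reference

Pigsty has 280+ configuration parameters, organized into the following 32 parameter groups.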

Module	Group	Description	Count
`INFRA`	`META`	Pigsty metadata	4
`INFRA`	`CA`	Self-signed CA / public key infrastructure	3
`INFRA`	`INFRA_ID`	Infra portal, Nginx domain names	2
`INFRA`	`REPO`	Local software repo	9
`INFRA`	`INFRA_PACKAGE`	Infrastructure packages	2
`INFRA`	`NGINX`	Nginx web server	7
`INFRA`	`DNS`	DNSMASQ name server	3
`INFRA`	`PROMETHEUS`	Prometheus time-series database stack	18
`INFRA`	`GRAFANA`	Grafana observability stack	6
`INFRA`	`LOKI`	Loki logging service	4
`NODE`	`NODE_ID`	Node identity parameters	5
`NODE`	`NODE_DNS`	Node domain names & DNS resolution	6
`NODE`	`NODE_PACKAGE`	Node repos & packages to install	5
`NODE`	`NODE_TUNE`	Node tuning and kernel feature switches	10
`NODE`	`NODE_ADMIN`	Admin user and SSH credential management	7
`NODE`	`NODE_TIME`	Timezone, NTP service, and crontab	5
`NODE`	`NODE_VIP`	Optional L2 VIP for the node cluster	8
`NODE`	`HAPROXY`	Expose services with HAProxy	10
`NODE`	`NODE_EXPORTER`	Node monitoring and registration	3
`NODE`	`PROMTAIL`	Promtail log collection agent	4
`DOCKER`	`DOCKER`	Docker container service (optional)	4
`ETCD`	`ETCD`	Etcd DCS cluster	10
`MINIO`	`MINIO`	MinIO S3 object storage	15
`REDIS`	`REDIS`	Redis cache	20
`PGSQL`	`PG_ID`	PG identity parameters	11
`PGSQL`	`PG_BUSINESS`	PG business object definitions	12
`PGSQL`	`PG_INSTALL`	Install PG packages & extensions	10
`PGSQL`	`PG_BOOTSTRAP`	Initialize HA PG cluster with Patroni	39
`PGSQL`	`PG_PROVISION`	Create objects inside PG databases	9
`PGSQL`	`PG_BACKUP`	Set up backup repo with pgBackRest	5
`PGSQL`	`PG_SERVICE`	Expose services, bind VIP, DNS	9
`PGSQL`	`PG_EXPORTER`	PG monitoring, service registration	15

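1.5 - Preparation

How to prepare the nodes, operating system, admin user, ports, and privileges required to deploy Pigsty.

Pigsty deployment 101.

Node Preparation

Pigsty runs on the Linux kernel with x86_64/aarch64 architectures and works on any node.

A node is any resource that is reachable over ssh and provides a bare operating system environment: a physical machine, bare metal, a virtual machine, or an OS container with systemd and sshd enabled.

Deploying Pigsty requires at least one node. The minimum spec is 1C1G; at least 2C4G is recommended. There is no upper limit, and parameters are tuned automatically to fit the hardware.

A single node is enough for a demo, a personal site, or a development environment. For standalone monitoring infrastructure, 1-2 nodes are recommended; for a highly available PostgreSQL cluster, at least 3 nodes; for critical workloads, at least 4-5 nodes.

Let IaC tools handle the chores

Configuring a large production environment by hand is tedious and error-prone. We recommend making full use of Infra as Code tools for this.

With the Terraform and Vagrant templates shipped with Pigsty, you can create the required nodes in one step using IaC, including the network, operating system, admin user, privileges, and security groups.

Network Preparation

Pigsty requires nodes to use static IPv4 addresses: assign each node a fixed IP address explicitly rather than relying on DHCP.

The IP address should be the node's primary address for intranet communication; it also serves as the node's unique identifier.

If you want to use the optional Node VIP and PG VIP features, make sure all nodes sit in the same L2 network.

Your firewall policy must keep the required ports open between nodes. For the ports each module needs, see Node: Ports.

Which ports should be exposed?

How you expose ports depends on how your network security policy is implemented: security group rules on a public cloud, local iptables rules, firewall configuration, and so on. If you just want to try things out, are not concerned about security, and want to keep everything as simple as possible, you can open only port 5432 (PostgreSQL) and port 3000 (Grafana) to external users as needed.

Nginx on the infra node exposes ports 80/443 for web services by default and distinguishes services by domain name. These ports should be open to the office network (or the whole Internet).

Ports of serious production databases should normally not be exposed directly to the public Internet. If you really need to do so, review security best practices first and proceed with care.

Operating System Preparation

Pigsty supports many Linux server distributions. We recommend RockyLinux 9.4 or Ubuntu 22.04.5 as the OS for installing Pigsty.

Pigsty supports RHEL (7, 8, 9), Debian (11, 12), Ubuntu (20, 22, 24), and a range of compatible distributions; see Compatibility for the full list.

For multi-node deployments, we strongly recommend using the same distribution version and the same Linux kernel version on all nodes used for Pigsty.

We strongly recommend a clean, freshly installed, minimal server OS environment, with en_US as the primary language and the recommended / default locale settings.

PostgreSQL clusters deployed by Pigsty use the C locale by default; when the system supports C.utf8 or the PG version is 17 or later, the C.UTF-8 locale is preferred.

How do I install and enable the en_US locale?

If the system uses a different language pack, the following commands make sure the en_US locale is available: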

yum install -y glibc-locale-source glibc-langpack-en
localedef -i en_US -f UTF-8 en_US.UTF-8
localectl set-locale LANG=en_US.UTF-8

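Admin User Preparation

On the node where Pigsty is installed, you need an admin user with passwordless ssh login and passwordless sudo.

Passwordless sudo is required: it is used during installation to run privileged commands such as installing packages and setting system parameters.

How do I configure passwordless sudo for the admin user?

Assuming your admin username is vagrant, create the file /etc/sudoers.d/vagrant with the following entry: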

%vagrant ALL=(ALL) NOPASSWD: ALL

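The vagrant user can then run any command with sudo without a password. If your username is not vagrant, replace vagrant above with your username.

Avoid installing as root

Installing Pigsty as root works, but we do not recommend it.

The security best practice is to use a dedicated admin user (e.g. dba) that is neither root nor the database superuser (postgres).

Pigsty provides a dedicated playbook task that uses an existing admin user (for example root), with the ssh / sudo password entered interactively, to create a dedicated admin user.

SSH Access Preparation

Besides passwordless sudo, Pigsty also needs the admin user to have passwordless ssh login.

For a single-node installation, this means the admin user on the node can ssh to the node itself without a password. If your Pigsty deployment spans multiple nodes, the admin user on the admin node must be able to ssh without a password to every node managed by Pigsty (including the admin node itself) and run sudo there without a password.

During configure on a single-node installation, if the current admin user has no SSH key, Pigsty tries to fix this: it generates a new id_rsa key pair and adds it to the local ~/.ssh/authorized_keys file so that the admin user can ssh to the local node.

By default Pigsty creates an admin user dba (uid=88) on managed nodes. If that user already exists for another purpose, we recommend changing node_admin_username to a new username with a different uid, or disabling it with the node_admin_enabled parameter.

How do I configure passwordless ssh login for the admin user?

Assuming your admin username is vagrant, run the following command as vagrant to generate the key pair ~/.ssh/id_rsa[.pub] used for login. If a key pair already exists, there is no need to generate a new one.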

ssh-keygen -t rsa -b 2048 -N '' -f ~/.ssh/id_rsa -q

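The public key is written to /home/vagrant/.ssh/id_rsa.pub and the private key to /home/vagrant/.ssh/id_rsa by default. If your OS username is not vagrant, replace vagrant above with your username.

Append the public key file (id_rsa.pub) to the /home/vagrant/.ssh/authorized_keys file of the corresponding user on each machine you need to log in to. If you can already access the remote machine with a password, you can copy the public key with ssh-copy-id:

ssh-copy-id <ip>                        # enter the password to copy the public key
sshpass -p <password> ssh-copy-id <ip>  # embed the password in the command to avoid the interactive prompt (mind the security risk!)

Pigsty recommends handling admin user creation, privilege configuration, and key distribution during VM provisioning, as part of the standardized deliverable.

SSH Exceptions

If your SSH access has extra restrictions, for example a jump host or custom settings, so that a plain ssh <ip> does not work, you can use an ssh alias.

If the server is reachable through an alias defined in ~/.ssh/config, set the ansible_host parameter for the node in the inventory to that SSH alias: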

nodes:
  hosts:  # 10.10.10.10 cannot be reached by ssh directly, but is reachable via the ssh alias `meta`
    10.10.10.10: { ansible_host: meta }

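If ssh aliases are not enough, Ansible also offers a set of custom ssh connection parameters that give fine-grained control over SSH connection behavior.

Finally, if the following command succeeds when run by the admin user on the admin node, the admin user and privileges on the target node are set up correctly: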

ssh <ip|alias> 'sudo ls'

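Software Preparation

On the admin node, Pigsty uses Ansible to issue control commands. In a local single-node installation, the admin node and the managed node are the same machine, and Ansible must be installed on it. Ordinary managed nodes do not need Ansible.

During bootstrap, Pigsty does its best to install Ansible for you, but you can also install it manually. The procedure varies by distribution and major version (and usually involves the additional weak dependency jmespath):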

sudo dnf install -y ansible python3.12-jmespath python3-cryptography   # EL 8 / 9

sudo yum install -y ansible   # EL7: no need to install Jmespath explicitly

sudo apt install -y ansible python3-jmespath   # Debian / Ubuntu

brew install ansible   # macOS

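To install Pigsty you also need the Pigsty source package. Download a specific version from the GitHub Release page, or fetch the latest stable version with the following command: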

curl -fsSL https://repo.pigsty.cc/get | bash

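If your environment has no Internet access, you can also obtain an offline package pre-built for your operating system distribution, from the GitHub Release page or other channels.

1.6 - Running Playbooks

Pigsty implements installation, deployment, and management through Ansible playbooks, but you do not need to know the details just to use it.

In Pigsty, playbooks are used to install modules on nodes.

Playbooks can be run directly as executables, for example: ./install.yml

Playbooks

Pigsty ships with the following playbooks:

Playbook	Function
`install.yml`	Install Pigsty completely on the current node in one pass
`infra.yml`	Initialize Pigsty infrastructure on infra nodes
`infra-rm.yml`	Remove infrastructure components from infra nodes
`node.yml`	Bring nodes under management and adjust them to the desired state
`node-rm.yml`	Remove managed nodes from Pigsty
`pgsql.yml`	Initialize an HA PostgreSQL cluster or add a new replica instance
`pgsql-rm.yml`	Remove a PostgreSQL cluster or a replica instance
`pgsql-user.yml`	Add a new business user to an existing PostgreSQL cluster
`pgsql-db.yml`	Add a new business database to an existing PostgreSQL cluster
`pgsql-monitor.yml`	Monitor remote postgres instances
`pgsql-migration.yml`	Generate a migration manual and scripts for an existing PostgreSQL
`redis.yml`	Initialize a redis cluster / node / instance
`redis-rm.yml`	Remove a redis cluster / node / instance
`etcd.yml`	Initialize the etcd cluster (DCS required by Patroni HA)
`minio.yml`	Initialize a minio cluster (optional pgbackrest backup repo)
`docker.yml`	Install docker on nodes
`mongo.yml`	Install Mongo/FerretDB on nodes
`cert.yml`	Issue certificates with the Pigsty self-signed CA (e.g. for clients)
`cache.yml`	Make an offline package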

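One-Pass Install

The special playbook install.yml is actually a composite playbook that installs all of the following components in the current environment.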


  playbook  / command / group         infra           nodes    etcd     minio     pgsql
[infra.yml] ./infra.yml [-l infra]   [+infra][+node] 
[node.yml]  ./node.yml                               [+node]  [+node]  [+node]   [+node]
[etcd.yml]  ./etcd.yml  [-l etcd ]                            [+etcd]
[minio.yml] ./minio.yml [-l minio]                                     [+minio]
[pgsql.yml] ./pgsql.yml                                                          [+pgsql]

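Note that NODE and INFRA have a circular dependency: registering a NODE on INFRA requires INFRA to exist already, while the INFRA module depends on the NODE module on the INFRA nodes in order to work.

To resolve this, the INFRA module's installation playbook also installs the NODE module on the INFRA nodes. So make sure you initialize the INFRA nodes first.

If you do want to initialize all nodes, including INFRA, in a single pass, that is what the install.yml playbook is for: it handles the circular dependency correctly and initializes the whole environment at once.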

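Ansible

Running playbooks requires the ansible-playbook executable, which ships in the ansible rpm/deb package.

Pigsty makes a best effort to install ansible on the current node during the bootstrap stage.

You can also install Ansible yourself with yum / apt / brew install ansible; it is included in the default repositories of the major distributions.

Knowing ansible helps when using Pigsty, but it is not required. For basic use, you only need to know four parameters:

-i|--inventory <path> : explicitly specify the configuration file to use
-l|--limit <pattern> : restrict the playbook to specific groups/hosts/patterns (where)
-t|--tags <tags> : run only the tasks that carry specific tags (what)
-e|--extra-vars <vars> : pass extra command-line parameters (how)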
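The way the four options combine can be sketched with a small helper that only composes the command line and prints it, without executing anything. The function name play_cmd is illustrative and not part of Pigsty; the playbook, inventory, group, and tag names are taken from the examples in this section.

```shell
# Compose (without executing) a playbook command line from the four common options.
# Empty arguments are skipped. The function name play_cmd is illustrative only.
play_cmd() {
  playbook="$1"; inventory="$2"; limit="$3"; tags="$4"; extra="$5"
  cmd="./$playbook"
  if [ -n "$inventory" ]; then cmd="$cmd -i $inventory"; fi
  if [ -n "$limit" ];     then cmd="$cmd -l $limit"; fi
  if [ -n "$tags" ];      then cmd="$cmd -t $tags"; fi
  if [ -n "$extra" ];     then cmd="$cmd -e $extra"; fi
  printf '%s\n' "$cmd"
}

# Where: the pg-test group; what: the pg_user tasks; inventory: conf/full.yml.
play_cmd pgsql.yml conf/full.yml pg-test pg_user ""
```

Printing the command before running it is a cheap way to double-check the target and the task subset.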

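Specifying the Configuration File

Use the -i command-line option to specify the configuration file explicitly.

By default, Pigsty uses the configuration file named pigsty.yml in the root of the Pigsty source directory. You can override this with -i, for example:

./pgsql.yml -i conf/rich.yml            # use the rich config: single node with all extensions downloaded
./pgsql.yml -i conf/full.yml            # use the full config: four-node cluster
./pgsql.yml -i conf/app/supa.yml        # use the supa.yml config: single-node Supabase deployment

To change the default configuration file permanently, edit the inventory parameter in ansible.cfg in the source root directory and point it to your configuration file. You can then run ansible-playbook commands without passing -i explicitly.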
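As a rough sketch, the inventory line could be switched with sed. The example below works on a scratch file that stands in for ansible.cfg, and conf/full.yml is just an example target; point CFG at the real file only after checking the result.

```shell
# Work on a scratch copy; point CFG at the real ansible.cfg once you are satisfied.
CFG="$(mktemp)"
printf '[defaults]\ninventory = pigsty.yml\n' > "$CFG"

# Replace the inventory line; -i.bak keeps a backup and works with GNU and BSD sed.
sed -i.bak 's|^inventory *=.*|inventory = conf/full.yml|' "$CFG"
grep '^inventory' "$CFG"
```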

Specifying Targets

Use -l|--limit <selector> to restrict the targets a playbook runs against.

Omitting it can be dangerous, because most playbooks run against the all group, that is, every host. Use it with care.

Here are some examples of host limits:

./pgsql.yml                               # run on all hosts (very dangerous!)
./pgsql.yml    -l pg-test                 # run on the pg-test cluster
./pgsql.yml    -l 10.10.10.10             # run on the single host 10.10.10.10
./pgsql.yml    -l pg-*                    # run on hosts/groups matching the glob `pg-*`
./pgsql.yml    -l '10.10.10.11,&pg-test'  # run on 10.10.10.11 of group pg-test
./pgsql-rm.yml -l 'pg-test,!10.10.10.11'  # run on pg-test, except 10.10.10.11
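One way to guard against a forgotten limit is a thin wrapper that refuses to run anything without -l/--limit. The helper below is a hypothetical sketch, not something Pigsty ships; it only recognizes the separated -l <x>, --limit <x>, and --limit=<x> forms. In the demonstration, echo stands in for actual execution.

```shell
# Refuse to run a playbook command unless -l/--limit is present.
# safe_play is a hypothetical helper, not part of Pigsty.
safe_play() {
  case " $* " in
    *" -l "*|*" --limit "*|*" --limit="*) ;;
    *) echo "refusing to run without -l/--limit: $*" >&2; return 1 ;;
  esac
  "$@"
}

# Dry-run demonstration: echo prints the command instead of executing it.
safe_play echo ./pgsql.yml -l pg-test
safe_play echo ./pgsql.yml || echo "blocked"
```

In addition, ansible-playbook accepts --list-hosts, which prints the hosts a pattern matches without running any task.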

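Running a Subset of a Playbook

Use -t|--tags <tag> to run a subset of a playbook. You can give several tags as a comma-separated list, for example -t tag1,tag2.

When a task subset is specified, only the tasks with the given tags run instead of the whole playbook. Here are some examples of task limits:

./pgsql.yml -t pg_clean    # clean up existing postgres if necessary
./pgsql.yml -t pg_dbsu     # set up OS user sudo for the postgres dbsu
./pgsql.yml -t pg_install  # install postgres packages and extensions
./pgsql.yml -t pg_dir      # create postgres directories and set up the fhs
./pgsql.yml -t pg_util     # copy utility scripts, set up aliases and environment
./pgsql.yml -t patroni     # bootstrap postgres with patroni
./pgsql.yml -t pg_user     # provision postgres business users
./pgsql.yml -t pg_db       # provision postgres business databases
./pgsql.yml -t pg_backup   # initialize the pgbackrest repository and basebackup
./pgsql.yml -t pgbouncer   # deploy the pgbouncer sidecar alongside postgres
./pgsql.yml -t pg_vip      # bind the vip to the pgsql primary with vip-manager
./pgsql.yml -t pg_dns      # register dns names with infra dnsmasq
./pgsql.yml -t pg_service  # expose pgsql services with haproxy
./pgsql.yml -t pg_exporter # expose pgsql metrics with pg_exporter
./pgsql.yml -t pg_register # register postgres with the pigsty infrastructure

# run several tasks: reload the postgres and pgbouncer hba rules
./pgsql.yml -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload

# run several tasks: refresh the haproxy config and reload it
./node.yml -t haproxy_config,haproxy_reload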

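Passing Extra Parameters

Use -e|--extra-vars KEY=VALUE to pass extra command-line parameters.

Command-line parameters take precedence over everything else. Here are some examples:

./node.yml -e ansible_user=admin -k -K                  # run the playbook as another user (prompting for the admin SSH and sudo passwords)
./pgsql.yml -e pg_clean=true                            # force removal of existing postgres when initializing a pgsql instance
./pgsql-rm.yml -e pg_uninstall=true                     # explicitly uninstall the rpm after the postgres instance is removed
./redis.yml -l 10.10.10.11 -e redis_port=6379 -t redis  # initialize one specific redis instance: 10.10.10.11:6379
./redis-rm.yml -l 10.10.10.13 -e redis_port=6379        # remove one specific redis instance: 10.10.10.13:6379

You can also pass complex parameters such as arrays and objects as JSON:

# install duckdb on nodes by specifying the packages and repo modules
./node.yml -t node_repo,node_pkg  -e '{"node_repo_modules":"infra","node_default_packages":["duckdb"]}'

Most playbooks are idempotent, which means that some deployment playbooks may delete existing databases and create new ones when the protection options are not enabled.

Read the documentation carefully, double-check your commands, and proceed with caution. The author is not responsible for any database loss caused by misuse.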

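1.7 - Provisioning

This section introduces the standard four-node sandbox environment used in Pigsty demos, and how to provision the required virtual machines with Vagrant and Terraform.

Pigsty runs on nodes, which can be bare metal or virtual machines. You can provision them by hand, or use tools such as terraform and vagrant to provision them automatically in the cloud or locally.

Sandbox

Pigsty comes with a demo sandbox. A sandbox is an environment dedicated to demos and testing: IP addresses and other identifiers are fixed in advance, which makes demo cases easy to reproduce.

The default sandbox consists of 4 nodes; see full.yml for its configuration file.

The 4 sandbox nodes have fixed IP addresses: 10.10.10.10, 10.10.10.11, 10.10.10.12, 10.10.10.13.

The sandbox has a single-instance PostgreSQL cluster, pg-meta, on the meta node:

meta 10.10.10.10 pg-meta pg-meta-1

The sandbox also has a three-instance highly available PostgreSQL cluster, pg-test, deployed on the other three nodes:

node-1 10.10.10.11 pg-test.pg-test-1
node-2 10.10.10.12 pg-test.pg-test-2
node-3 10.10.10.13 pg-test.pg-test-3

Two optional L2 VIPs are bound to the primary instances of the pg-meta and pg-test clusters:

10.10.10.2 pg-meta
10.10.10.3 pg-test

The meta node also hosts a single-instance etcd "cluster" and a single-instance minio "cluster".

You can run the sandbox on local or cloud virtual machines. Pigsty provides a Vagrant-based local sandbox (local VMs started with Virtualbox/libvirt) and a Terraform-based cloud sandbox (VMs created through the cloud vendor's API).

The local sandbox runs on your Mac/PC for free. To run the full 4-node sandbox, your Mac/PC should have at least 4C/8G.
The cloud sandbox is easy to create and share, but requires a public cloud account. Cloud VMs can be created on demand and destroyed with one command, which is cheap and convenient for quick tests.

Pigsty also provides a 42-node production simulation sandbox, prod.yml.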

Vagrant

Vagrant creates local virtual machines declaratively. See the Vagrant template introduction for details.

Installation

Make sure Vagrant and Virtualbox are installed and usable on your operating system.

On macOS, you can install them with a single homebrew command. Note that you need to reboot after installing Virtualbox.

On Linux, you can use virtualbox, or consider KVM: vagrant-libvirt.

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install vagrant virtualbox ansible   # one-step install on macOS, but only on x86_64 Intel chips

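Configuration

vagrant/Vagrantfile is a Ruby script that describes the virtual machine nodes Vagrant will create. Pigsty provides some default configuration templates:

Template	Shortcut	Spec	Comment
meta.rb	`v1`	4C8G x 1	Single meta node
full.rb	`v4`	2C4G + 1C2G x 3	Full 4-node sandbox example
el7.rb	`v7`	2C4G + 1C2G x 3	EL7 3-node test environment
el8.rb	`v8`	2C4G + 1C2G x 3	EL8 3-node test environment
el9.rb	`v9`	2C4G + 1C2G x 3	EL9 3-node test environment
build.rb	`vb`	2C4G x 3	3-node EL7,8,9 build environment
check.rb	`vc`	2C4G x 30	30-node EL7-9, PG12-16 test environment
minio.rb	`vm`	2C4G x 3 + Disk	3-node MinIO/etcd test environment
prod.rb	`vp`	2C4G x 42	42-node production simulation environment

Each spec file contains a Specs variable that describes the virtual machine nodes. For example, full.rb contains the description of the 4-node sandbox: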

Specs = [
  {"name" => "meta",   "ip" => "10.10.10.10", "cpu" => "2",  "mem" => "4096", "image" => "generic/rocky9" },
  {"name" => "node-1", "ip" => "10.10.10.11", "cpu" => "1",  "mem" => "2048", "image" => "generic/rocky9" },
  {"name" => "node-2", "ip" => "10.10.10.12", "cpu" => "1",  "mem" => "2048", "image" => "generic/rocky9" },
  {"name" => "node-3", "ip" => "10.10.10.13", "cpu" => "1",  "mem" => "2048", "image" => "generic/rocky9" },
]

Use the vagrant/config script to switch the Vagrant configuration; it renders the final Vagrantfile from the spec and the virtualization software in use.

cd ~/pigsty
vagrant/config <spec>

vagrant/config meta     # singleton meta        | alias: `make v1`
vagrant/config full     # 4-node sandbox        | alias: `make v4`
vagrant/config el7      # 3-node el7 test       | alias: `make v7`
vagrant/config el8      # 3-node el8 test       | alias: `make v8`
vagrant/config el9      # 3-node el9 test       | alias: `make v9`
vagrant/config prod     # prod simulation       | alias: `make vp`
vagrant/config build    # building environment  | alias: `make vd`
vagrant/config minio    # 3-node minio env
vagrant/config check    # 30-node check env

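Virtual Machine Management

Once vagrant/Vagrantfile describes the virtual machines you need, create them with the vagrant up command.

Pigsty templates use your ~/.ssh/id_rsa[.pub] as the default ssh credentials for these virtual machines.

Before you start, make sure you have a valid ssh key pair. You can generate one with: ssh-keygen -t rsa -b 2048

There are also some makefile shortcuts that wrap vagrant commands; you can use them to manage the virtual machines.

make         # same as make start
make new     # destroy existing VMs and create new ones from the spec
make ssh     # write the SSH config into ~/.ssh/ (required after new VMs are brought up)
make dns     # write VM DNS records into /etc/hosts (if you want to reach VMs by name)
make start   # same as running up, then ssh
make up      # bring up VMs from the config, or start existing VMs
make halt    # shut down existing VMs (down,dw)
make clean   # destroy existing VMs (clean/del/destroy)
make status  # show VM status (st)
make pause   # pause VMs (suspend,pause)
make resume  # resume VMs (resume)
make nuke    # destroy all VMs with virsh (libvirt only)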

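Shortcuts

You can use the following Makefile shortcuts to bring up virtual machine environments with vagrant.

make meta     # single meta node
make full     # 4-node sandbox
make el7      # 3-node el7 test environment
make el8      # 3-node el8 test environment
make el9      # 3-node el9 test environment
make prod     # 42-node production simulation environment
make build    # 3-node EL7,8,9 build environment
make check    # 30-node build verification test environment
make minio    # 3-node MinIO test environment

make meta  install  # complete single-node installation
make full  install  # 4-node sandbox installation
make prod  install  # 42-node production simulation installation
make check install  # 30-node local test environment installation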
...

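Terraform

Terraform is an open-source tool for practicing "Infrastructure as Code": describe the cloud resources you want, then create them in one step.

Pigsty provides Terraform templates for AWS, Alibaba Cloud, and Tencent Cloud, which you can use to create virtual machines in the cloud in one step.

On macOS, Terraform can be installed with homebrew: brew install terraform. You need a cloud account and its AccessKey and AccessSecret credentials to continue.

The terraform/ directory contains two example templates, one for AWS and one for Alibaba Cloud. You can adjust them as needed, or use them as a reference for other cloud vendors. Taking Alibaba Cloud as the example:

cd terraform                         # enter the Terraform template directory
cp spec/alicloud.tf terraform.tf     # use the Alibaba Cloud Terraform template

Before running terraform apply to bring up the virtual machines, run terraform init once to install the plugins for your cloud vendor.

terraform init      # install the terraform cloud provider plugins, e.g. the default aliyun plugin (only needed the first time)
terraform apply     # generate the execution plan and show the cloud resources to be created: VMs, networks, security groups, and so on

After you run the apply subcommand and answer yes at the prompt, Terraform creates the virtual machines and other cloud resources (networks, security groups, and the rest).

When it finishes, the IP address of the admin node is printed; you can log in and proceed with the installation of Pigsty itself.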

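1.8 - Security Considerations

Security considerations for Pigsty deployments

Pigsty's default configuration, with its out-of-the-box authentication and access control model, is secure enough for the vast majority of scenarios.

If you want to harden the system further, consider the following suggestions:

Confidentiality

Important Files

Protect your pigsty.yml configuration file or CMDB

The pigsty.yml configuration file usually contains highly sensitive secrets; keep it safe.
Strictly control access to the admin node, limiting it to DBAs or infra administrators.
Strictly control access to the repository that holds pigsty.yml (if you manage it with git).

Protect your CA private key and other certificates; these files are very important.

By default, these files are generated under files/pki in the Pigsty source directory on the admin node.
Back them up regularly to a secure location.

Passwords

For production deployments, you must change these passwords; do not use the defaults!

grafana_admin_password : pigsty
pg_admin_password : DBUser.DBA
pg_monitor_password : DBUser.Monitor
pg_replication_password : DBUser.Replicator
patroni_password : Patroni.API
haproxy_admin_password : pigsty
minio_access_key : minioadmin
minio_secret_key : minioadmin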
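A simple way to produce replacement values is to draw random strings from /dev/urandom. The sketch below assumes a Unix-like system with tr and head; gen_pass is an illustrative helper name, and the 24-character alphanumeric format is an arbitrary choice rather than a Pigsty requirement.

```shell
# Generate a random alphanumeric password of the given length (default 24).
gen_pass() {
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "${1:-24}"
}

# Print one candidate value per credential; review them before editing pigsty.yml.
for key in grafana_admin_password pg_admin_password pg_monitor_password \
           pg_replication_password patroni_password haproxy_admin_password \
           minio_access_key minio_secret_key; do
  printf '%s: %s\n' "$key" "$(gen_pass 24)"
done
```

The output is freshly generated on every run, so store the chosen values somewhere safe before applying them.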

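If you use MinIO, change the default MinIO user passwords and the references to them in pgbackrest

Change the MinIO regular user password: minio_users.[pgbackrest].secret_key
Change the backup user password that pgbackrest uses for MinIO: pgbackrest_repo.minio.s3_key_secret

If you use a remote backup repository, be sure to enable backup encryption and set an encryption passphrase

Set pgbackrest_repo.*.cipher_type to aes-256-cbc
You can use ${pg_cluster} as part of the passphrase so that clusters do not all share the same one

Use a secure and reliable password encryption algorithm for PostgreSQL

Use the pg_pwd_enc default, scram-sha-256, instead of the legacy md5
This is the default behavior; unless you have a specific reason (such as supporting legacy clients), do not change it back to md5

Use the passwordcheck extension to enforce strong passwords.

Add $lib/passwordcheck to pg_libs to enforce the password policy.

Encrypt remote backups with an encryption algorithm

Use repo_cipher_type in the pgbackrest_repo backup repository definition to enable encryption

Configure automatic password expiration for business users

Set an automatic password expiration time for each business user to meet compliance requirements.

After configuring expiration, remember to rotate these passwords regularly during routine inspections.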

- { name: dbuser_meta , password: Pleas3-ChangeThisPwd ,expire_in: 7300 ,pgbouncer: true ,roles: [ dbrole_admin ]    ,comment: pigsty admin user }
- { name: dbuser_view , password: Make.3ure-Compl1ance  ,expire_in: 7300 ,pgbouncer: true ,roles: [ dbrole_readonly ] ,comment: read-only viewer for meta database }
- { name: postgres     ,superuser: true  ,expire_in: 7300                        ,comment: system superuser }
- { name: replicator ,replication: true  ,expire_in: 7300 ,roles: [pg_monitor, dbrole_readonly]   ,comment: system replicator }
- { name: dbuser_dba   ,superuser: true  ,expire_in: 7300 ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 , comment: pgsql admin user }
- { name: dbuser_monitor ,roles: [pg_monitor] ,expire_in: 7300 ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

Do not log password-changing statements to the postgres log or other logs

SET log_statement TO 'none';
ALTER USER "{{ user.name }}" PASSWORD '{{ user.password }}';
SET log_statement TO DEFAULT;

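IP Addresses

Bind postgres/pgbouncer/patroni to specific IP addresses rather than all addresses.

The default pg_listen address is 0.0.0.0, that is, all IPv4 addresses.
Consider pg_listen: '${ip},${vip},${lo}' to bind to a specific list of IP addresses for better security.

Do not expose any port directly on a public IP, except the ports used by the infrastructure's Nginx entry point (80/443 by default)

For convenience, components such as Prometheus/Grafana listen on all IP addresses by default and can be reached directly through public IP ports.
You can change their configuration files to listen on internal IP addresses only, so that they are reachable only by domain name through the Nginx portal. You can also enforce these restrictions with security groups or firewall rules.
For convenience, the Redis server listens on all IP addresses by default; set redis_bind_address to listen on internal IP addresses only.

Use HBA to restrict postgres client access

A security-hardened configuration template is available: security.yml

Restrict patroni admin access: only infra/admin nodes may call the control API

By default, this is restricted through restapi.allowlist.

Network Traffic

Use SSL and domain names to access infrastructure components through Nginx

Nginx SSL is controlled by nginx_sslmode, which defaults to enable.
Nginx domain names are specified by infra_portal.<component>.domain.

Protect the Patroni REST API with SSL

patroni_ssl_enabled is disabled by default, because enabling it affects health checks and API calls.
Note that this is a global option; you must decide before deployment.

Protect Pgbouncer client traffic with SSL

pgbouncer_sslmode defaults to disable
It has a significant performance impact on Pgbouncer, so it is turned off by default.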

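Integrity

Configure consistency-first mode for PostgreSQL clusters in critical scenarios (for example, databases that handle money)

For the pg_conf database tuning template, crit.yml trades some availability for the best data consistency.

Use the crit node tuning template for better consistency.

For the node_tune host tuning template, crit reduces the dirty page ratio and lowers the risk to data consistency.

Enable data checksums to detect silent data corruption.

pg_checksum defaults to off, but turning it on is recommended.
When the pg_conf = crit.yml database template is used, checksums are forcibly enabled.

Log connection establishment and termination

This is off by default, but enabled by default in the crit.yml configuration template.
You can also configure a cluster manually to enable the log_connections and log_disconnections parameters.

If you want to eliminate any possibility of split-brain during PG cluster failover, enable watchdog

If your traffic goes through the default, recommended HAProxy distribution, you will not run into split-brain even without watchdog.
If a machine hangs or Patroni is killed with kill -9, watchdog acts as a last resort: it shuts the machine down automatically on timeout.
It is best not to enable watchdog on infrastructure nodes.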

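Availability

For PostgreSQL clusters in critical scenarios, use enough nodes/instances

You need at least three nodes (which tolerates the failure of one node) for production-grade high availability.
With only two nodes, you can tolerate the failure of the specific standby node.
With only one node, use an external S3/MinIO for cold backups and WAL archive storage.

For PostgreSQL, trade off availability against consistency

pg_rpo : the trade-off between availability and consistency
pg_rto : the trade-off between failure probability and impact

Do not access the database directly through a fixed IP address; use VIP, DNS, HAProxy, or a combination of them

Use HAProxy for service access
During failover or switchover, HAProxy handles the switching of client traffic.

Use multiple infrastructure nodes in important production deployments (for example, 1 to 3)

Small deployments or scenarios with relaxed requirements can use a single infrastructure node / admin node.
Large production deployments should have at least two infrastructure nodes that back each other up.

Use enough etcd server instances, and use an odd number of them (1,3,5,7)

See ETCD administration for details.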
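The preference for odd member counts follows from majority-quorum arithmetic: a cluster of n members needs floor(n/2) + 1 of them to agree, so an even count adds a member without adding fault tolerance. The following sketch only does the arithmetic; etcd_tolerance is an illustrative helper name.

```shell
# Majority quorum arithmetic for an n-member consensus cluster.
etcd_tolerance() {
  n="$1"
  quorum=$(( n / 2 + 1 ))   # members needed to keep a majority
  echo $(( n - quorum ))    # members that may fail while a majority remains
}

for n in 1 2 3 4 5 7; do
  echo "members=$n tolerated_failures=$(etcd_tolerance "$n")"
done
```

Three members tolerate one failure and so do four, while five tolerate two.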

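1.9 - FAQ

Frequently asked questions about downloading, installing, and deploying Pigsty

This page lists problems that Pigsty users often run into when downloading, installing, and deploying. If you hit a problem you cannot solve, submit an Issue or contact us.

How do I get the Pigsty source package?

Install Pigsty with this one-line command: curl -fsSL https://repo.pigsty.cc/get | bash

The command downloads the latest stable pigsty.tgz and extracts it to the ~/pigsty directory. You can also manually download a specific version of the Pigsty source from the GitHub Release page.

If you need to install in an environment without Internet access, download it in advance where the network is available, then transfer it to the production server with scp/sftp or CDROM/USB.

How do I speed up RPM downloads from upstream repositories?

Consider using a local repository mirror. Mirrors are configured in the repo_upstream parameter, and you can choose a region to use a different mirror site.

For example, set region = china to use the URL keyed china in baseurl instead of default.

If a firewall or the GFW blocks some repositories, consider using proxy_env to get around it.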

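What should I do when installation fails?

Pigsty does not depend on Docker, so the expected environment is a freshly installed operating system. A system that already runs an assortment of services is more likely to hit obscure problems, and we advise against using one. When installation fails, troubleshoot along these lines:

Identify which module the failure occurred in; different modules fail for different reasons. Keep the Ansible playbook output for reference and find every Failed task marked in red.
If the configuration is not recognized and the playbook cannot start at all, check that your configuration file is valid YAML and that indentation, brackets, and so on are correct.
If the error relates to the local software repository or Nginx, check whether your system is a fresh install. Is Nginx or another component already running? Is a special firewall or security policy in effect?
If the error occurs while installing node packages, did you use an offline package that exactly matches the OS minor version? Is an upstream dependency missing or broken? See the content below for a fix.
If the error occurs during system configuration, is your operating system freshly installed? Or was it a minimal install so stripped down that not even a locale is configured?
If MinIO has a problem, did you use an SNMD or MNMD deployment without providing real disk mount points? Does your sss.pigsty static domain name point to any MinIO node or to the LB cluster?
If PGSQL installation has a problem, are the extensions you specified in pg_extension missing from repo_packages, or not yet covered?
If PGSQL startup has a problem, such as wait for patroni primary, see the PGSQL FAQ.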

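What should I do when package installation fails?

Note that Pigsty's prebuilt offline packages are built for a specific minor version of an operating system distribution. If your operating system version does not match exactly, we recommend against using the offline package; download the package versions that fit your actual operating system directly from upstream instead.

If online installation does not resolve package conflicts, first try changing the upstream repositories Pigsty uses. For example, on EL systems the default upstream repositories use the major-version placeholder $releasever, which resolves to a concrete major version such as 7, 8, or 9. Many distributions, however, provide a Vault that lets you use the package mirror of one specific version. You can therefore replace the front part of the BaseURL in the repo_upstream parameter with a specific Vault minor-version repository, for example:

https://mirrors.aliyun.com/rockylinux/$releasever/ (original BaseURL prefix, without vault)
https://mirrors.tuna.tsinghua.edu.cn/centos-vault/7.6.1810/ (use 7.6 instead of the default 7.9)
https://mirrors.aliyun.com/rockylinux-vault/8.6/ (use 8.6 instead of the default 8.9)
https://mirrors.aliyun.com/rockylinux-vault/9.2/ (use 9.2 instead of the default 9.3)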
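The prefix substitution can be sketched with sed, using one BaseURL from the repo_upstream list below in this FAQ and the 9.2 Vault prefix from the list above. The resulting full path is an assumption; check that it exists on the mirror before using it.

```shell
# Original BaseURL with the major-version placeholder (single quotes keep $releasever literal).
base='https://mirrors.aliyun.com/rockylinux/$releasever/BaseOS/$basearch/os/'
minor='9.2'

# Swap the "rockylinux/$releasever" prefix for the Vault prefix of the chosen minor version.
vault=$(printf '%s\n' "$base" | sed "s|/rockylinux/[\$]releasever/|/rockylinux-vault/${minor}/|")
printf '%s\n' "$vault"
```

The $basearch placeholder is left untouched, so the package manager still resolves the architecture itself.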

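Before replacing, check that the target repository path actually exists; EPEL, for example, does not provide minor-version-specific repositories. Upstream repositories that support this approach include: base, updates, extras, centos-sclo, centos-sclo-rh, baseos, appstream, extras, crb, powertools, pgdg-common, pgdg1*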

repo_upstream:
  - { name: pigsty-local   ,description: 'Pigsty Local'      ,module: local ,releases: [7,8,9] ,baseurl: { default: 'http://${admin_ip}/pigsty'  }} # used by intranet nodes
  - { name: pigsty-infra   ,description: 'Pigsty INFRA'      ,module: infra ,releases: [7,8,9] ,baseurl: { default: 'https://repo.pigsty.io/rpm/infra/$basearch' ,china: 'https://repo.pigsty.cc/rpm/infra/$basearch' }}
  - { name: pigsty-pgsql   ,description: 'Pigsty PGSQL'      ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://repo.pigsty.io/rpm/pgsql/el$releasever.$basearch' ,china: 'https://repo.pigsty.cc/rpm/pgsql/el$releasever.$basearch' }}
  - { name: nginx          ,description: 'Nginx Repo'        ,module: infra ,releases: [7,8,9] ,baseurl: { default: 'https://nginx.org/packages/centos/$releasever/$basearch/' }}
  - { name: docker-ce      ,description: 'Docker CE'         ,module: infra ,releases: [7,8,9] ,baseurl: { default: 'https://download.docker.com/linux/centos/$releasever/$basearch/stable'        ,china: 'https://mirrors.aliyun.com/docker-ce/linux/centos/$releasever/$basearch/stable'  ,europe: 'https://mirrors.xtom.de/docker-ce/linux/centos/$releasever/$basearch/stable' }}
  - { name: base           ,description: 'EL 7 Base'         ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/os/$basearch/'                    ,china: 'https://mirrors.tuna.tsinghua.edu.cn/centos/$releasever/os/$basearch/'           ,europe: 'https://mirrors.xtom.de/centos/$releasever/os/$basearch/'           }}
  - { name: updates        ,description: 'EL 7 Updates'      ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/updates/$basearch/'               ,china: 'https://mirrors.tuna.tsinghua.edu.cn/centos/$releasever/updates/$basearch/'      ,europe: 'https://mirrors.xtom.de/centos/$releasever/updates/$basearch/'      }}
  - { name: extras         ,description: 'EL 7 Extras'       ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/extras/$basearch/'                ,china: 'https://mirrors.tuna.tsinghua.edu.cn/centos/$releasever/extras/$basearch/'       ,europe: 'https://mirrors.xtom.de/centos/$releasever/extras/$basearch/'       }}
  - { name: epel           ,description: 'EL 7 EPEL'         ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://download.fedoraproject.org/pub/epel/$releasever/$basearch/'            ,china: 'https://mirrors.tuna.tsinghua.edu.cn/epel/$releasever/$basearch/'                ,europe: 'https://mirrors.xtom.de/epel/$releasever/$basearch/'                }}
  - { name: centos-sclo    ,description: 'EL 7 SCLo'         ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/sclo/$basearch/sclo/'             ,china: 'https://mirrors.aliyun.com/centos/$releasever/sclo/$basearch/sclo/'              ,europe: 'https://mirrors.xtom.de/centos/$releasever/sclo/$basearch/sclo/'    }}
  - { name: centos-sclo-rh ,description: 'EL 7 SCLo rh'      ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/sclo/$basearch/rh/'               ,china: 'https://mirrors.aliyun.com/centos/$releasever/sclo/$basearch/rh/'                ,europe: 'https://mirrors.xtom.de/centos/$releasever/sclo/$basearch/rh/'      }}
  - { name: baseos         ,description: 'EL 8+ BaseOS'      ,module: node  ,releases: [  8,9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/BaseOS/$basearch/os/'         ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/BaseOS/$basearch/os/'          ,europe: 'https://mirrors.xtom.de/rocky/$releasever/BaseOS/$basearch/os/'     }}
  - { name: appstream      ,description: 'EL 8+ AppStream'   ,module: node  ,releases: [  8,9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/AppStream/$basearch/os/'      ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/AppStream/$basearch/os/'       ,europe: 'https://mirrors.xtom.de/rocky/$releasever/AppStream/$basearch/os/'  }}
  - { name: extras         ,description: 'EL 8+ Extras'      ,module: node  ,releases: [  8,9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/extras/$basearch/os/'         ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/extras/$basearch/os/'          ,europe: 'https://mirrors.xtom.de/rocky/$releasever/extras/$basearch/os/'     }}
  - { name: crb            ,description: 'EL 9 CRB'          ,module: node  ,releases: [    9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/CRB/$basearch/os/'            ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/CRB/$basearch/os/'             ,europe: 'https://mirrors.xtom.de/rocky/$releasever/CRB/$basearch/os/'        }}
  - { name: powertools     ,description: 'EL 8 PowerTools'   ,module: node  ,releases: [  8  ] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/PowerTools/$basearch/os/'     ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/PowerTools/$basearch/os/'      ,europe: 'https://mirrors.xtom.de/rocky/$releasever/PowerTools/$basearch/os/' }}
  - { name: epel           ,description: 'EL 8+ EPEL'        ,module: node  ,releases: [  8,9] ,baseurl: { default: 'http://download.fedoraproject.org/pub/epel/$releasever/Everything/$basearch/' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/epel/$releasever/Everything/$basearch/'     ,europe: 'https://mirrors.xtom.de/epel/$releasever/Everything/$basearch/'     }}
  - { name: pgdg-common    ,description: 'PostgreSQL Common' ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/redhat/rhel-$releasever-$basearch' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/redhat/rhel-$releasever-$basearch' }}
  - { name: pgdg-extras    ,description: 'PostgreSQL Extra'  ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' }}
  - { name: pgdg-el8fix    ,description: 'PostgreSQL EL8FIX' ,module: pgsql ,releases: [  8  ] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' } }
  - { name: pgdg-el9fix    ,description: 'PostgreSQL EL9FIX' ,module: pgsql ,releases: [    9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/'  ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/' }}
  - { name: pgdg15         ,description: 'PostgreSQL 15'     ,module: pgsql ,releases: [7    ] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/15/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/15/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/15/redhat/rhel-$releasever-$basearch' }}
  - { name: pgdg16         ,description: 'PostgreSQL 16'     ,module: pgsql ,releases: [  8,9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/16/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/16/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/16/redhat/rhel-$releasever-$basearch' }}
  - { name: timescaledb    ,description: 'TimescaleDB'       ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://packagecloud.io/timescale/timescaledb/el/$releasever/$basearch'  }}

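After explicitly defining and overriding repo_upstream in the Pigsty configuration file, try the installation again (you may first remove the /www/pigsty/repo_complete marker). If neither the upstream repositories nor the mirrors solve the problem, consider replacing the repositories above with the ones that ship with the operating system and try installing directly from upstream again.

Finally, if none of the above works, consider removing the conflicting packages from node_packages, infra_packages, pg_packages, and pg_extensions, or removing or upgrading the conflicting packages on the existing system.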

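What does the bootstrap stage do?

It checks whether the environment is ready and makes sure, by whatever means are available, that ansible, the tool required by the rest of the installation, is installed correctly.

After downloading the Pigsty source, enter the directory and run the bootstrap script. It inspects your node environment and, if it finds no offline package, asks whether to download one from the Internet.

Answer yes (y) to install from the offline package, which is fast and stable. Answer no (n) to skip it and download the latest packages directly from upstream during installation, which greatly reduces the chance of RPM/DEB package conflicts.

With an offline package, bootstrap installs ansible directly from it; otherwise it downloads ansible from upstream and installs it. If you have no Internet access, no DVD, and no internal software repository, the offline package is the only option.

What does the configure stage do?

The configure stage inspects your node environment and generates a pigsty configuration file for you, pigsty.yml. By default it picks the single-node installation template that matches your operating system (EL 7/8/9).

All default configuration templates are in files/pigsty, and you can use -c to specify the template you want. If you already know how to configure Pigsty, you can skip this step entirely and edit the Pigsty configuration file directly.

What is the Pigsty configuration file for?

pigsty.yml in the Pigsty home directory is the default configuration file; it describes the environment of the whole deployment. The conf/ directory holds many configuration examples for reference, which are described in the Configuration Templates section of the documentation site.

When running a playbook, use -i <path> to select a configuration file at another location. For example, to install redis according to a dedicated configuration file redis.yml: ./redis.yml -i conf/demo/redis.yml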

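How do I use the CMDB as the inventory?

The default Ansible inventory is specified in ansible.cfg as: inventory = pigsty.yml

Use bin/inventory_cmdb to switch to the dynamic CMDB inventory, and bin/inventory_conf to return to the local configuration file. You also need bin/inventory_load to load the current configuration file inventory into the CMDB.

With the CMDB, you must edit the inventory in the database rather than in the configuration file. This approach suits integrating Pigsty with external systems.

What is the IP address placeholder in the configuration file for?

Pigsty uses 10.10.10.10 as the placeholder for the current node's IP; the configure stage replaces it with the primary IP address of the current node.

When configure detects that the current node has multiple NICs with multiple IPs, the wizard asks which primary IP to use, that is, the IP users use to reach the node from the internal network. This IP replaces the placeholder 10.10.10.10 in the configuration template.

Note that you should use a static IP address rather than a dynamic one assigned by DHCP, because Pigsty uses the static IPv4 address to identify a node uniquely.

Note: do not use a public IP as the primary IP, because Pigsty uses the primary IP to configure internal services such as Nginx, Prometheus, Grafana, Loki, AlertManager, Chronyd, and DNSMasq. Apart from Nginx, these services should not expose ports to the outside world.

I have no static IP. Can I install Pigsty?

If your server has no static IP, a single-node deployment can use the loopback address 127.0.0.1 as the IP address identity of its only node.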

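Which parameters in the configuration file need special attention?

Pigsty provides 280+ configuration parameters for fine-grained customization of the whole environment and of each module: infra / node / etcd / minio / pgsql.

In a single-node installation you usually do not need to adjust the generated configuration file at all. If you do, pay attention to these parameters:

When accessing web service components, the domain names are specified by infra_portal; some services can only be accessed by domain name through the Nginx proxy.
Pigsty assumes a /data directory exists to hold all data; if your data disk is mounted elsewhere, adjust these paths with node_data.
For production deployments, do not forget to change the passwords in the configuration file; see Security Considerations for details.

What exactly gets installed in a default single-node installation?

Running make install actually invokes the Ansible playbook install.yml, which installs the following according to the parameters in the configuration file:

INFRA module: provides the local software repository, the Nginx web entry point, a DNS server, an NTP server, and the Prometheus and Grafana observability stack.
NODE module: brings the current node under Pigsty management and deploys HAProxy and monitoring.
ETCD module: deploys a single-node etcd cluster as the DCS for PG high availability.
MINIO module: installed only if defined; it can serve as an optional backup repository for PG.
PGSQL module: a single-node PostgreSQL database instance.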

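What should I do about package conflicts during installation?

While installing node / infra / pgsql packages, there is a small chance of missing or conflicting package dependencies. Common causes include:

An upstream repository published mismatched package versions or is missing dependencies. You may have to wait for upstream to fix it, which can take weeks. An offline package built earlier, while dependencies were complete, prevents this problem.
The OS minor version on which the offline package was built does not match the minor version of the current OS. Use online installation, or rebuild the offline package on the same system.
Individual unimportant, optional packages can be bypassed quickly by removing them from the install list or from the local software repository.

How do I rebuild the local software repository?

I want to re-download packages from the upstream repositories, but both ./infra.yml and the repo subtask skip the download. What should I do?

Use the following shortcut command and playbook task to force a rebuild of the local software repository. Pigsty will re-check the upstream repositories and download the packages.

make repo-build      # ./infra.yml -t repo_build

How do I download the latest package versions after installation?

To get the latest package versions (RPM/DEB), you can manually download specific versions with apt/dnf into /www/pigsty. In that case, the following command updates only the metadata of the local software repository rather than rebuilding it entirely:

./infra.yml -t repo_create

Alternatively, remove the old package versions from /www/pigsty and run the repository rebuild command again:

make repo-build      # ./infra.yml -t repo_build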

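How do I create local virtual machines with Vagrant?

The first time you use Vagrant to bring up a particular operating system, it downloads the corresponding Box/Img image file. The Pigsty sandbox uses the generic/rocky9 image by default.

Using a proxy may speed up the download. A Box/Image only needs to be downloaded once and is reused when the sandbox is rebuilt.

RPM conflict specific to CentOS 7.9 on Alibaba Cloud

On Alibaba Cloud, the nscd package preinstalled on CentOS 7.9 may cause an RPM conflict: "Error: Package: nscd-2.17-307.el7.1.x86_64 (@base)"

If installation fails with this RPM conflict, there is no need to worry. nscd is a DNS caching tool, and uninstalling it solves the problem: sudo yum remove nscd. You can also remove nscd from all nodes in bulk with an ansible command:

ansible all -b -a 'yum remove -y nscd'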

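RPM conflict specific to Rocky 9.x on Tencent Cloud

Rocky 9.x on Tencent Cloud needs the additional annobin packages before the Pigsty installation can complete.

If installation fails with an RPM conflict, there is no need to worry: go to /www/pigsty and download these packages manually.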

./infra.yml -t repo_upstream      # add upstream repos
cd /www/pigsty;                   # download missing packages
repotrack annobin gcc-plugin-annobin libuser
./infra.yml -t repo_create        # create repo

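Ansible command timeout (Timeout waiting for xxx)

The default ssh timeout for Ansible commands is 10 seconds. Because of network latency or other reasons, some commands may need longer.

You can raise the timeout parameter in the ansible configuration file ansible.cfg: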

[defaults]
# the default is 10; raise it to 60, 120, or higher
timeout = 60
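If you would rather not edit ansible.cfg, the same setting can, to the best of our knowledge, be supplied through Ansible's ANSIBLE_TIMEOUT environment variable. The value 60 below is only an example.

```shell
# Raise the connection timeout (in seconds) for this shell session only.
export ANSIBLE_TIMEOUT=60
echo "ANSIBLE_TIMEOUT=$ANSIBLE_TIMEOUT"
```

Playbooks started from the same shell afterwards pick up the larger timeout; ansible-playbook also accepts -T/--timeout for a one-off override.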

If your SSH connections are very slow, DNS is usually the cause; check the sshd configuration and make sure it has UseDNS no.

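2 - About Pigsty

Learn about every aspect of the Pigsty project itself: features, history, open-source license, privacy policy, community events, and news.

2.1 - Features

Pigsty's value propositions and highlight features.

"PostgreSQL In Great STYle": Postgres, Infras, Graphics, Service, Toolbox, it's all Yours.

A battery-included, local-first PostgreSQL distribution and an open-source RDS alternative

Value Propositions

Extensibility: powerful extensions out of the box, with deep integration of PostGIS, TimescaleDB, PGVector, and 421 extensions in total, plus compatible kernels for MySQL / Oracle / MSSQL.
Reliability: quickly create highly available, self-healing PostgreSQL clusters, with automatically provisioned point-in-time recovery, access control, self-signed CA, and SSL to keep your data rock solid.
Observability: built on the modern Prometheus & Grafana observability stack, delivering stunning monitoring best practices. Modular design, usable standalone: Gallery & Demo.
Availability: delivers stable, reliable, high-performance database services with automatic routing, transaction pooling, and read/write splitting, and offers flexible access modes through HAProxy, Pgbouncer, and VIP.
Maintainability: easy to use, with Infrastructure as Code, administration SOPs, automatic tuning, a local software repository, a Vagrant sandbox and Terraform templates, and a zero-downtime migration scheme.
Composability: modular architecture, reusable Infra, and a variety of optional modules: Redis, MinIO, ETCD, FerretDB, DuckDB, Docker, Supabase.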

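Overview

Pigsty is a better local, open-source alternative to RDS for PostgreSQL:

Battery-included RDS: from kernel to RDS distribution, providing production-grade PG database services for versions 12-17 on EL/Debian/Ubuntu.
Rich extensions: an unmatched 420+ extensions, providing out-of-the-box distributed, time-series, geospatial, graph, full-text, vector, and multi-modal database capabilities.
Flexible modular architecture: compose and extend freely with Redis/Etcd/MinIO/Mongo; usable standalone to monitor existing RDS/hosts/databases.
Stunning observability: built on the modern Prometheus/Grafana observability stack, providing stunning, unmatched database observability.
Battle-tested reliability: a self-healing high-availability architecture with automatic failover on hardware failure and seamless traffic switching, plus automatically configured PITR as a safety net against accidental data deletion.
Easy to use and maintain: declarative API, GitOps ready, foolproof operation; Database/Infra-as-Code and administration SOPs encapsulate management complexity.
Solid security practices: encryption and backup are all included, with built-in basic ACL best practices. As long as your hardware and keys are secure, you need not worry about database security.
Broad application scenarios: low-code data application development, or use the preset Docker Compose templates to bring up a wide range of software that uses PostgreSQL in one step.
Open-source free software: own a better database service at less than 1/10 the cost of cloud databases, truly "own" your data, and stay autonomous and in control.

Pigsty integrates the tools and best practices of the PostgreSQL ecosystem:

An out-of-the-box PostgreSQL distribution that deeply integrates 400+ extensions for geospatial, time-series, distributed, graph, vector, search, AI, and more.
Runs directly on the bare operating system without container support, on mainstream operating systems: EL7/8/9, Ubuntu 20.04/22.04/24.04, and Debian 11/12.
Builds a self-healing high-availability architecture on patroni, haproxy, and etcd: automatic failover on hardware failure with seamless traffic switching.
Provides out-of-the-box PITR (point-in-time recovery) based on pgBackRest and an optional MinIO cluster, as a safety net against software defects and accidental data deletion.
Provides a declarative API based on Ansible to abstract away complexity, greatly simplifying daily operations in a Database-as-Code style.
Pigsty is versatile: it can serve as a complete application runtime for developing and demoing data/visualization applications, and a large amount of software that uses PG can be brought up with Docker templates in one step.
Provides a Vagrant-based local development and testing sandbox and a Terraform-based automated cloud deployment scheme, keeping development, testing, and production environments consistent.
Deploys and monitors dedicated Redis (primary-replica, sentinel, cluster), MinIO, Etcd, Haproxy, and MongoDB (FerretDB) clusters.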

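Battery-Included RDS

Get a production-grade PostgreSQL database service locally, right away!

PostgreSQL is a nearly perfect database kernel, but it needs more tools and systems around it to become a good enough database service (RDS). Pigsty helps PostgreSQL make that leap. Pigsty solves the problems you run into when using PostgreSQL: kernel extension installation, connection pooling, load balancing, service access, high availability / automatic failover, log collection, metrics monitoring, alerting, backup and recovery, PITR, access control, parameter tuning, security and encryption, certificate issuance, NTP, DNS, configuration management, CMDB, administration SOPs, and more. You no longer need to worry about these details!

Pigsty supports the PostgreSQL 12 to 17 mainline kernels and other compatible forks. It runs on EL / Debian / Ubuntu and compatible operating system distributions, is available on x86_64 and ARM64 architectures, and requires no container support. Besides the database kernel and a large number of out-of-the-box extensions, Pigsty also provides the complete infrastructure and runtime a database service needs, as well as automated deployment schemes for local sandboxes, production environments, and cloud IaaS.

Pigsty can bring up the whole environment from bare metal in one step, covering the last mile of software delivery. Ordinary developers and operations engineers can get started quickly and manage databases part-time, building an enterprise-grade RDS service without a database expert!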

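Rich Extensions

Hyper-converged and multi-modal: use PostgreSQL for everything, one PG to replace all databases!

The soul of PostgreSQL lies in its rich extension ecosystem, and Pigsty uniquely and deeply integrates 421 extensions from the PostgreSQL ecosystem to give you an out-of-the-box hyper-converged multi-modal database!

Extensions can work in synergy, producing an effect where 1+1 is far greater than 2. You can use PostGIS for geospatial data and TimescaleDB to analyze time-series/event-stream data, then use Citus to upgrade it in place into a distributed geospatial-temporal database. You can use PGVector to store and search AI embeddings, ParadeDB for ES-grade full-text search, and combine precise SQL, full-text search, and fuzzy vector search in hybrid retrieval. You can also use analytics extensions such as Hydra, duckdb_fdw, pg_analytics, and pg_duckdb to achieve the analytical performance of dedicated OLAP databases and data lakehouses.

Using PostgreSQL as a single component to replace MySQL, Kafka, ElasticSearch, MongoDB, and big data analytics stacks has become a best practice. Standardizing on a single database significantly reduces system complexity, greatly improves development efficiency and agility, and yields remarkable savings in software, hardware, and development/operations labor.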

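Flexible Modular Architecture

Compose freely, extend freely, support multiple databases, and monitor existing RDS/hosts/databases

Components in Pigsty are abstracted into independently deployable modules that can be combined freely to meet changing requirements. The INFRA module comes with a complete modern monitoring stack, while the NODE module tunes nodes to the specified state and brings them under management. Installing the PGSQL module on multiple nodes automatically forms a highly available database cluster based on primary-replica replication, and the ETCD module likewise provides consensus and metadata storage for database high availability.

Beyond these four core modules, Pigsty also provides a set of optional modules: the MINIO module provides local object storage and serves as a centralized database backup repository. The REDIS module assists databases in standalone primary-replica, sentinel, or native cluster mode. The DOCKER module can be used to bring up stateless application software.

In addition, Pigsty supports PG-compatible / derived kernels: you can use Babelfish for MS SQL Server compatibility, IvorySQL for Oracle compatibility, FerretDB for MongoDB compatibility, OpenHalo for MySQL compatibility, Supabase for Firebase compatibility, and PolarDB to meet domestic compliance requirements. More professional/pilot modules will keep being introduced into Pigsty, such as GPSQL, KAFKA, DUCKDB, VICTORIA, TIGERBEETLE, KUBERNETES, CONSUL, JUPYTER, GREENPLUM, CLOUDBERRY, MYSQL, …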

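Stunning Observability

Unparalleled monitoring best practices built on the modern open-source observability stack!

Pigsty provides monitoring best practices based on the open-source Grafana / Prometheus observability stack: Prometheus collects metrics, Grafana handles visualization, Loki collects and queries logs, and Alertmanager sends alert notifications. PushGateway monitors batch jobs, and Blackbox Exporter checks service availability. The whole system is likewise designed as the INFRA module, which comes up in one step and works out of the box.

Every component managed by Pigsty is automatically brought under monitoring, including host nodes, the HAProxy load balancer, the Postgres database, the Pgbouncer connection pool, the ETCD metadata store, the Redis KV cache, MinIO object storage, and the entire monitoring infrastructure itself. The many Grafana dashboards and preset alerting rules will qualitatively improve your ability to observe the system. This system can also be reused as your application monitoring infrastructure, or to monitor existing database instances or RDS.

Whether for failure analysis or slow query optimization, capacity assessment or resource planning, Pigsty provides comprehensive data support and makes operations truly data-driven. In Pigsty, more than three thousand types of monitoring metrics describe every aspect of the system; they are further processed, aggregated, analyzed, and distilled, then presented in intuitive visualizations. From the global overview down to the CRUD details of a single object (table, index, function) in one database instance, everything is visible at a glance. You can roll up, drill down, and jump across freely, browse the current state and historical trends of the system, and predict how it will evolve.

The monitoring part of Pigsty can also be used standalone, to monitor existing host nodes and database instances or RDS services in the cloud. With just a connection string and one command, you get the ultimate PostgreSQL observability experience.

Visit the Screenshot Gallery and the Online Demo for more details.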

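Proven Reliability

Out-of-the-box high availability and point-in-time recovery keep your database rock-solid!

For dropped tables or databases caused by software defects or human error, Pigsty provides out-of-the-box PITR (point-in-time recovery), enabled by default with no extra configuration. As long as there is enough storage, base backups and WAL archiving based on pgBackRest let you quickly go back to any point in the past. You can use a local directory / disk, or a dedicated MinIO cluster or S3 object storage service to keep a longer recovery window, as your budget allows.

More importantly, Pigsty makes high availability and self-healing standard for PostgreSQL clusters. The self-healing HA architecture built on patroni, etcd, and haproxy lets you handle hardware failures with ease: automatic failover on primary failure has an RTO < 30s (configurable), and consistency-first mode guarantees zero data loss, RPO = 0. As long as any instance in the cluster survives, the cluster can provide full service, and a client connected to any node in the cluster gets that full service.

Pigsty has a built-in HAProxy load balancer for automatic traffic switching, and offers DNS / VIP / LVS and other access methods for clients to choose from. Apart from brief connection blips, failover and switchover are almost imperceptible to the business side; applications do not need to change connection strings or restart. The minimal maintenance-window requirement brings great flexibility: you can perform rolling maintenance and upgrades of the whole cluster without application cooperation. Because hardware failures can wait until the next day to be dealt with, developers, operations engineers, and DBAs can all sleep well. Many large organizations and core institutions have run Pigsty in production for a long time. The largest deployment has 25K CPU cores and 200+ very large PostgreSQL instances; over four years it went through dozens of hardware failures and incidents of various kinds, yet maintained availability above 99.999%.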
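
To put the availability figure in perspective, the arithmetic can be sketched as a small shell helper. This is an illustrative calculation only, not a Pigsty tool: the function name `downtime_minutes_per_year` is hypothetical, and a 365-day year is assumed.

```shell
#!/bin/sh
# Hypothetical helper: convert an availability percentage into the
# allowed downtime, in minutes, per 365-day year.
downtime_minutes_per_year() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v a="$1" 'BEGIN { printf "%.2f\n", (100 - a) / 100 * 365 * 24 * 60 }'
}
```

For example, `downtime_minutes_per_year 99.999` prints 5.26, so availability above 99.999% corresponds to less than roughly five and a quarter minutes of downtime per year.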

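Simple, Easy to Use, and Maintainable

Infra as Code, Database as Code: a declarative API encapsulates the complexity of database management.

Pigsty exposes a declarative interface, raising the controllability of the system to a new level: users tell Pigsty "what kind of database cluster I want" through a configuration inventory, without worrying about how to get there. In effect this resembles CRDs and Operators in K8S, but Pigsty works for databases and infrastructure on any node: containers, virtual machines, or physical machines.

Whether you are creating / destroying clusters, adding / removing replicas, or adding databases / users / services / extensions / allow- and deny-list rules, you only need to modify the configuration inventory and run the idempotent playbooks Pigsty provides; Pigsty then brings the system to your desired state. Users do not need to worry about configuration details: Pigsty tunes automatically according to the machine's hardware. You only need to care about basics such as the cluster name, how many instances go on which machines, and which configuration template to use (transactional / analytical / critical / tiny), so developers can self-serve too. But if you are willing to go down the rabbit hole, Pigsty also offers rich, fine-grained control parameters to satisfy the most demanding DBA's customization needs.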
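
The key property of an idempotent playbook is that running it once or many times leaves the system in the same state. The following minimal shell sketch illustrates that property only. It is not how Pigsty implements its playbooks, and the function name `ensure_line` and the sample setting are hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration of idempotency: append a line to a file
# only if an identical line is not already present.
ensure_line() {
    file=$1 line=$2
    touch "$file"
    # -x: match the whole line, -F: fixed string, -q: quiet
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

Calling `ensure_line /tmp/demo.conf 'max_connections = 100'` twice leaves exactly one such line in the file, which is the behavior a declarative "desired state" workflow relies on.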

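Beyond that, installing Pigsty itself is a one-step, foolproof process: all dependencies are pre-packaged, so installation can proceed without Internet access. The machines required for installation can also be provisioned automatically through Vagrant or Terraform templates, letting you bring up a complete Pigsty deployment from scratch on a local laptop or cloud VMs in a dozen or so minutes. The local sandbox can run in a tiny 1-core, 2 GB VM and simulates the full functionality of a production environment, suitable for development, testing, demos, and learning.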

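Solid Security Practices

Encryption and backups are all included. As long as your hardware and keys are safe, you do not need to worry about database security.

Pigsty is designed for enterprise scenarios with high standards and strict requirements, and adopts industry-leading security best practices to protect your data (confidentiality / integrity / availability). Security under the default configuration is sufficient to meet compliance requirements in the vast majority of scenarios.

Pigsty creates a self-signed CA (or uses the CA you provide) to issue certificates and encrypt network traffic. Sensitive admin pages and API endpoints are password-protected. Database backups are encrypted with AES, database passwords are stored using scram-sha-256, and a plugin is provided to enforce password-strength policies. Pigsty provides an out-of-the-box, easy-to-use, easily extended ACL model that separates read / write / admin / ETL privileges, together with an HBA rule set following the principle of least privilege, ensuring confidentiality through multiple layers of protection.

Pigsty enables database checksums by default to prevent silent data corruption, with replicas as a fallback for bad blocks. It provides the CRIT zero-data-loss configuration template and uses watchdog as a fencing fallback for high availability. You can audit database operations with the audit plugin, and all system and database logs are collected for inspection to meet compliance requirements. A properly configured system passes China's MLPS Level 3 (等保三级) without issue; as long as you follow security best practices, deploy on an internal network, and configure security groups and firewalls sensibly, database security will no longer be a pain point.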

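Broad Application Scenarios

Use preset Docker templates to launch a wide range of software that uses PostgreSQL in one step!

In data-intensive applications, the database is often the trickiest part. For example, the core difference between the GitLab Enterprise and Community editions is monitoring and high availability for the underlying PostgreSQL database. If you already have a good-enough local PG RDS, you can decline to pay for the software's own homemade database components.

Pigsty provides the Docker module and many out-of-the-box Compose templates. You can use Pigsty-managed highly available PostgreSQL (as well as Redis and MinIO) as backend storage and launch this software in stateless mode in one step: Gitlab, Gitea, Wiki.js, NocoDB, Odoo, Jira, Confluence, Harbor, Mastodon, Discourse, KeyCloak, and so on. If your application needs a reliable PostgreSQL database, Pigsty may be the simplest way to get one.

Pigsty also provides application development tools closely tied to PostgreSQL: PGAdmin4, PGWeb, ByteBase, PostgREST, Kong, as well as EdgeDB, FerretDB, and Supabase, the "upper-layer databases" that use PostgreSQL as storage. Better still, you can quickly build an interactive data application in a low-code fashion on top of Pigsty's built-in Grafana and Postgres, and even use Pigsty's built-in ECharts panels to create more expressive interactive visualizations.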

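Open-Source Free Software

Pigsty is free software open-sourced under AGPLv3, nurtured with passion by community members who love PostgreSQL.

Pigsty is completely open-source and free software. It lets you run an enterprise-grade PostgreSQL database service at close to pure hardware cost, even without a database expert. By comparison, the "enterprise database services" of database vendors and the RDS of public cloud vendors charge a premium of several to more than ten times the underlying hardware cost as a "service fee".

Many users move to the cloud precisely because they cannot handle databases themselves; many use RDS because they have no other choice. We will break the cloud vendors' monopoly and offer users a cloud-neutral, better open-source alternative to RDS: Pigsty follows the PostgreSQL upstream mainline closely, with no vendor lock-in, no annoying "license fees", no node-count limits, and no collection of any of your data. All of your core assets, your data, stay under your own control.

Pigsty aims to replace a great deal of tedious manual database operations with database autopilot software, but even the best software cannot solve every problem. There will always be rare, obscure issues that need an expert. That is why we also offer a professional subscription service as a safety net for enterprise users of PostgreSQL who need it. A subscription fee of a few tens of thousands of yuan is a small fraction of a top DBA's annual salary; it frees you from worry and puts your budget where it counts. For community users, we also provide free support and everyday Q&A as a labor of love.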

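2.2 - Module List

This page lists the functional modules available in Pigsty and the modules planned for the future.

Core Modules

Pigsty provides four basic functional modules, which are indispensable for delivering a complete, highly available PostgreSQL service:

PGSQL: an autonomous PostgreSQL cluster with high availability, point-in-time recovery, IaC, SOPs, a monitoring system, and 421 extensions.
INFRA: local software repository, Prometheus, Grafana, Loki, AlertManager, PushGateway, Blackbox Exporter…
NODE: brings nodes to the desired state: name, time zone, NTP, ssh, sudo, haproxy, docker, promtail, keepalived.
ETCD: distributed key-value store used as the DCS for highly available Postgres clusters: consensus leader election / configuration management / service discovery.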

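Kernel Modules

Pigsty provides four kernel modules, optional in-place replacements for the PostgreSQL kernel that offer different flavors of database capability.

MSSQL: a PG kernel compatible with the Microsoft SQL Server wire protocol, from AWS, WiltonDB & Babelfish!
IVORY: an Oracle-compatible PostgreSQL 16 kernel, provided by the IvorySQL project open-sourced by HighGo (瀚高).
POLAR: the "cloud-native" PostgreSQL kernel open-sourced by Alibaba Cloud, an Aurora-flavored RAC PostgreSQL fork.
CITUS: distributed PostgreSQL clusters implemented as an extension (Azure Hyperscale), with native Patroni HA support!

Domestic (China) kernel support!

Pigsty Professional Edition provides support for domestic database kernels: PolarDB-O v2, an Oracle-compatible domestic database kernel based on PolarPG.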

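Extension Modules

Pigsty provides four extension modules. They are not required for core functionality, but can be used to enhance PostgreSQL:

MINIO: a simple S3-compatible object storage server that can serve as an optional PostgreSQL backup repository, with production deployment support and monitoring.
REDIS: Redis server, a high-performance data structure server, with production deployment in standalone primary-replica, sentinel, and cluster modes, plus full monitoring support.
MONGO: native deployment support for FerretDB, which adds MongoDB wire-protocol-level API compatibility to PostgreSQL!
DOCKER: the Docker daemon service, letting users launch containerized stateless software templates in one step to add all kinds of functionality to Pigsty!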

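Peripheral Modules

Pigsty also supports peripheral modules closely related to the PostgreSQL kernel (extensions, forks, derivatives, wrappers):

DUCKDB: a powerful embedded OLAP database. Pigsty provides the binary / shared library and related PG extensions: pg_duckdb, pg_lakehouse, and duckdb_fdw.
SUPABASE: Pigsty lets users run Supabase, the popular open-source Firebase alternative, on top of an existing highly available PostgreSQL cluster!
GREENPLUM: an MPP data warehouse based on the PostgreSQL 12 kernel; currently only monitoring and RPM installation are supported. (Beta)
CLOUDBERRY: an open-source fork built by the original developers after Greenplum went closed-source, based on the PG 14 kernel; currently only RPM installation is supported. (Beta)
NEON: a serverless PostgreSQL kernel with database branching (WIP)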

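Pilot Modules

Pigsty is adding support for some pilot modules related to the PostgreSQL ecosystem, which may become official Pigsty modules in the future:

KAFKA: deploy KRaft-powered Kafka message queues with Pigsty, with out-of-the-box monitoring (Beta)
MYSQL: deploy highly available MySQL 8.0 clusters with Pigsty, with out-of-the-box monitoring (for critique / migration evaluation) (Beta)
KUBE: out-of-the-box production-grade Kubernetes deployment and monitoring built with SealOS (Alpha)
VICTORIA: an Infra alternative based on VictoriaMetrics and VictoriaLogs, offering better performance and resource utilization (Alpha)
JUPYTER: an out-of-the-box Jupyter Notebook environment for data analysis, machine learning, and similar scenarios (Alpha)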

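Monitoring Other Databases

Pigsty's INFRA module can be used standalone as out-of-the-box monitoring infrastructure for other nodes or existing PostgreSQL databases:

Existing PostgreSQL services: Pigsty can monitor external PostgreSQL services not managed by Pigsty and still provide relatively complete monitoring.
RDS PG: PostgreSQL RDS services from cloud vendors can be monitored by treating them as standard external Postgres instances.
PolarDB: Alibaba Cloud's cloud-native database can be monitored by treating it as an external PostgreSQL 11 / 14 instance.
KingBase: the domestic (Xinchuang) database from Kingbase (人大金仓) can be monitored by treating it as an external PostgreSQL 12 instance.
Greenplum / YMatrixDB: currently monitored by treating them as horizontally sharded PostgreSQL clusters.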

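2.3 - Roadmap

Plans for future features, the release cadence for new features, and the to-do list.

Release Strategy

Pigsty uses semantic versioning, in the form <major>.<minor>.<patch>. Alpha / Beta / RC releases append a suffix and number to the version, such as -a1, -b1, -c1.

A major version update means fundamental changes and many new features; a minor version update usually brings new features, package version updates, and smaller API changes; a patch version update means bug fixes and documentation updates.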
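
The version scheme described above can be illustrated with a small parser. This is a minimal POSIX shell sketch, assuming version strings of the form `<major>.<minor>.<patch>` with an optional leading `v` and an optional pre-release suffix after a dash; the function name `parse_version` and the word `release` for "no suffix" are hypothetical choices, not part of Pigsty.

```shell
#!/bin/sh
# Hypothetical helper: split a version such as "v3.4.1" or "3.5.0-b1"
# into "major minor patch suffix" ("release" when there is no suffix).
parse_version() {
    v=${1#v}                      # strip an optional leading "v"
    suffix=release
    case $v in
        *-*) suffix=${v#*-}; v=${v%%-*} ;;
    esac
    case $v in
        *.*.*) ;;                 # need at least three dot-separated parts
        *) echo "invalid version: $1" >&2; return 1 ;;
    esac
    major=${v%%.*}; rest=${v#*.}
    minor=${rest%%.*}; patch=${rest#*.}
    for part in "$major" "$minor" "$patch"; do
        case $part in
            ''|*[!0-9]*) echo "invalid version: $1" >&2; return 1 ;;
        esac
    done
    echo "$major $minor $patch $suffix"
}
```

For instance, `parse_version v3.4.1` prints `3 4 1 release`, and `parse_version 3.5.0-b1` prints `3 5 0 b1`, separating the beta suffix from the numeric components.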

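Pigsty plans one major release per year. Minor releases usually follow the cadence of PostgreSQL minor releases, catching up within one month at most after a new PostgreSQL version is released, typically 4 to 6 minor releases per year. For the full release history, see the release notes.

Do not deploy from the main branch

Pigsty is developed on the main branch, so always use a specific released version. Unless you know what you are doing, do not use the main branch on GitHub.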

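Features Under Consideration

A sufficiently good command-line management tool
ARM architecture support for infrastructure components
More extensions for PostgreSQL
More preset scenario-specific configuration templates
Fully migrate the software repository and installation download source to Cloudflare
More Docker application templates
Deploy and monitor highly available Kubernetes clusters with SealOS!
Replace Prometheus with VictoriaMetrics for time-series storage
Monitor and deploy MySQL databases
Monitor databases in Kubernetes
PGLite and ElectricSQL support

Here are our active issues and roadmap.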

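Software Applications

Pigsty provides out-of-the-box templates that let users launch all kinds of PostgreSQL-based applications in one step.

Dify: AI workflow orchestration and LLMOps
Odoo: open-source enterprise ERP system
Supabase: an all-in-one PG-based backend service
Gel: formerly EdgeDB, provides a higher-level query language on top of PG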
NocoDB
Jupyter
Gitea
Wiki
GitLab
Mastodon
Keycloak
Harbor
Confluence
Jira
Zabbix
Grafana
Metabase
ByteBase
Kong
PostgREST
pgAdmin4
pgWeb

PostgreSQL Extensions

For the roadmap of PostgreSQL extension support, see the extension roadmap.

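2.4 - History

The origin and motivation of the Pigsty project, its development history, and its future goals and vision.

Origins

The Pigsty project began in 2018 to 2019 and originated at Tantan (探探). Tantan is an Internet dating app, China's Tinder, since acquired by Momo (陌陌). Tantan was a Nordic-style startup with a founding team of Swedish engineers.

Tantan had excellent technical taste, using PostgreSQL and Go as its core stack. The whole system architecture was modeled on Instagram, with everything designed around PostgreSQL. Up to a scale of several million daily active users, several million TPS, and several hundred TB of data, PostgreSQL was the only data component. Almost all business logic was implemented in PG stored procedures, even including a 100ms recommendation algorithm!

This atypical development model, with its deep use of PostgreSQL features, placed very high demands on engineers and DBAs. Pigsty is the open-source project we forged in this real-world scenario of large-scale, high-standard database clusters; it distills our experience and best practices as top PostgreSQL experts.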

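Development

At the very beginning, Pigsty did not have its current vision, goals, and scope. It was meant to be a PostgreSQL monitoring system for our own use. We surveyed every solution on the market, open-source, commercial, and cloud, datadog, pgwatch, ……, and none met our observability needs. So we decided to build one ourselves on Grafana and Prometheus, which became the predecessor and prototype of Pigsty. As a monitoring system Pigsty worked remarkably well and helped us solve countless management problems.

Later, developers wanted the same monitoring system on their local development machines, so we wrote provisioning playbooks with Ansible, turning the system from a one-off build into reusable, replicable software. The new functionality let users use Vagrant and Terraform to quickly bring up local DevBox machines or production servers in an Infra-as-Code fashion, and automatically deploy PostgreSQL and the monitoring system.

Next, we redesigned the production PostgreSQL architecture, introducing Patroni and pgBackRest to solve database high availability and point-in-time recovery. We developed a zero-downtime migration scheme based on logical replication and, through blue-green deployment, rolled two hundred production database clusters up to the latest major version. We then brought these capabilities into Pigsty.

Pigsty is software we built for ourselves. The greatest benefit of "eating our own dog food" is that we are both the developers and the users, so we know exactly what we need and do not cut corners on our own requirements.

We solved one problem after another and folded the solutions into Pigsty. Pigsty's positioning gradually grew from a monitoring system into an out-of-the-box PostgreSQL database distribution. At that stage we decided to open-source Pigsty and began a series of technical talks and promotion; external users from many industries started using Pigsty and giving feedback.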

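Going Full-Time

In 2022, the Pigsty project received seed funding from MiraclePlus (奇绩创坛), founded by Dr. Qi Lu (陆奇), which allowed me to work on it full-time.

As an open-source project, Pigsty has done quite well. In these two years of full-time work, Pigsty's GitHub stars multiplied from a few hundred to 3,700; it made the front page of HN, and growth began to snowball. In the OSSRank open-source rankings, Pigsty ranks 22nd among PostgreSQL ecosystem projects, the highest among projects led by Chinese developers.

Pigsty used to run only on CentOS 7; it now covers essentially all mainstream Linux distributions (EL, Debian, Ubuntu). Supported PG major versions span 12 to 17, and it maintains and integrates 420+ extensions from the PG ecosystem. I personally maintain more than half of these extensions and provide out-of-the-box RPM / DEB packages. Together with Pigsty itself, in the spirit of "built on open source, giving back to open source", this is my contribution to the PG ecosystem.

Pigsty's positioning has also kept evolving, from a PostgreSQL database distribution to an open-source alternative to cloud databases. Its real counterpart is the entire cloud database brand of a cloud vendor.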

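Rebel Against the Public Cloud

Public cloud vendors such as AWS, Azure, GCP, and Aliyun offer startups many conveniences, but they are closed-source and force users to rent basic resources at high prices.

We believe excellent database services, like excellent database kernels, should be within reach of every user, rather than something that must be rented from cyber lords at great cost.

The agility and elasticity of cloud computing are great, but it should be free, open-source, inclusive, and local-first. We believe the cloud computing universe needs a solution that represents open-source values and returns control of infrastructure to users without sacrificing the benefits of the cloud.

So we are also leading a cloud-exit movement, as rebels against the public cloud, to reshape the values of this industry.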

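Our Vision

I hope that in the future everyone will have the de facto right to use excellent services freely, instead of being penned in on the territory of a few public cloud giants, the cyber lords, as cyber tenants or even cyber serfs.

That is exactly what Pigsty sets out to do: a better, free and open-source alternative to RDS that lets users launch, anywhere (including cloud servers), a database service better than cloud RDS in one step.

Pigsty is a thorough completion of PostgreSQL and a sharp satire of cloud databases. Literally it means "pigsty", but it is also an acronym for Postgres In Great STYle, that is, "PostgreSQL at its peak".

Pigsty itself is completely free and open-source software; we sustain operations purely by providing consulting and services. A well-built system may run for years without hitting a problem that needs a "safety net", but once a database problem appears, it is never a small one. Often an expert's experience can turn things around with a single remark, and we offer that service to customers who need it. We believe this is a fairer, more reasonable, and more sustainable model.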

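About the Team

I am Feng Ruohang (冯若航), the author of Pigsty. The vast majority of Pigsty's code was developed by me alone, with some features contributed by the community.

Individual heroism still exists in software: only a unique individual can create a unique work, and I hope Pigsty can become such a work.

If you are interested in me, here is my personal homepage: https://vonng.com/

"Modb (墨天轮) Notable Figures Interview: Feng Ruohang"

"Post-90s, Quit His Job to Start a Company, Says He Will Out-Compete Cloud Databases"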

2.5 - Events & News

Events and news related to Pigsty and PostgreSQL, and previews of upcoming activities!

Releases

Pigsty release notes

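Version	Release date	Summary	Link
v3.4.1	2025-04-05	OpenHalo & OrioleDB, MySQL compatibility, pgAdmin improvements	v3.4.1
v3.4.0	2025-03-30	Backup improvements, automatic certificates, AGE, IvorySQL on all platforms, localization, architecture and parameter improvements	v3.4.0
v3.3.0	2025-02-24	404 extensions, extension catalog, App playbook, Nginx customization, DocumentDB support	v3.3.0
v3.2.2	2025-01-23	390 extensions, Omnigres support, Mooncake, Citus 13 and PG 17 support	v3.2.2
v3.2.1	2025-01-12	350 extensions, Ivory 4, Citus enhancements, Odoo template	v3.2.1
v3.2.0	2024-12-24	Extension management CLI, Grafana enhancements, ARM64 extensions completed	v3.2.0
v3.1.0	2024-11-22	PG 17 as the default major version, simplified configuration, Ubuntu 24 and ARM support, MinIO improvements	v3.1.0
v3.0.4	2024-10-30	PG 17 extensions, OLAP suite, pg_duckdb	v3.0.4
v3.0.3	2024-09-27	PostgreSQL 17, Etcd operations improvements, IvorySQL 3.4, PostGIS 3.5	v3.0.3
v3.0.2	2024-09-07	Slim installation mode, PolarDB 15 support, monitoring view updates	v3.0.2
v3.0.1	2024-08-31	Routine bug fixes, Patroni 4 support, Oracle compatibility improvements	v3.0.1
v3.0.0	2024-08-25	333 extensions, pluggable kernels, MSSQL, Oracle, PolarDB compatibility	v3.0.0
v2.7.0	2024-05-20	Extension explosion: 20+ powerful new extensions and several Docker applications	v2.7.0
v2.6.0	2024-02-28	PG 16 as the default major version, ParadeDB, DuckDB, and other extensions introduced	v2.6.0
v2.5.1	2023-12-01	Routine minor release, support for key PG 16 extensions	v2.5.1
v2.5.0	2023-09-24	Ubuntu / Debian support: bullseye, bookworm, jammy, focal	v2.5.0
v2.4.1	2023-09-24	Supabase / PostgresML support and various new extensions: graphql, jwt, pg_net, vault	v2.4.1
v2.4.0	2023-09-14	PG 16, RDS monitoring, service consulting support, new extensions: Chinese word segmentation full-text search / graph / HTTP / embeddings, etc.	v2.4.0
v2.3.1	2023-09-01	PGVector with HNSW, PG 16 RC1, documentation overhaul, Chinese documentation, routine bug fixes	v2.3.1
v2.3.0	2023-08-20	Host VIP, ferretdb, nocodb, MySQL stub, CVE fixes	v2.3.0
v2.2.0	2023-08-04	Dashboard & provisioning rework, UOS compatibility	v2.2.0
v2.1.0	2023-06-10	Support for PostgreSQL 12 ~ 16beta	v2.1.0
v2.0.2	2023-03-31	pgvector support added, MinIO CVE fixed	v2.0.2
v2.0.1	2023-03-21	v2 bug fixes, security enhancements, Grafana version upgrade	v2.0.1
v2.0.0	2023-02-28	Major architecture upgrade; compatibility, security, and maintainability significantly improved	v2.0.0
v1.5.1	2022-06-18	Grafana security fix	v1.5.1
v1.5.0	2022-05-31	Docker application support	v1.5.0
v1.4.1	2022-04-20	Bug fixes & complete English documentation translation	v1.4.1
v1.4.0	2022-03-31	MatrixDB support, INFRA / NODES / PGSQL / REDIS modules separated	v1.4.0
v1.3.0	2021-11-30	PGCAT overhaul & PGSQL enhancements & Redis beta support	v1.3.0
v1.2.0	2021-11-03	Default PGSQL version upgraded to 14	v1.2.0
v1.1.0	2021-10-12	Home page, JupyterLab, PGWEB, Pev2 & pgbadger	v1.1.0
v1.0.0	2021-07-26	v1 GA, monitoring system overhaul	v1.0.0
v0.9.0	2021-04-04	Pigsty GUI, CLI, log integration	v0.9.0
v0.8.0	2021-03-28	Service provisioning, customizable externally exposed database services	v0.8.0
v0.7.0	2021-03-01	Monitor-only deployment, monitoring existing PostgreSQL instances	v0.7.0
v0.6.0	2021-02-19	Architecture enhancements, PG decoupled from Consul	v0.6.0
v0.5.0	2021-01-07	Business databases / users can be defined in configuration	v0.5.0
v0.4.0	2020-12-14	PostgreSQL 13 support, official documentation added	v0.4.0
v0.3.0	2020-10-22	VM provisioning scheme finalized	v0.3.0
v0.2.0	2020-07-10	Sixth edition of the PG monitoring system released	v0.2.0
v0.1.0	2020-06-20	Validated in a production-simulation test environment	v0.1.0
v0.0.5	2020-08-19	Offline installation mode: delivery without Internet access	v0.0.5
v0.0.4	2020-07-27	Refactor playbooks into Ansible roles	v0.0.4
v0.0.3	2020-06-22	Interface design improvements	v0.0.3
v0.0.2	2020-04-30	First commit	v0.0.2
v0.0.1	2019-05-15	Proof of concept	v0.0.1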

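Conferences and Talks

Date	Type	Event	Topic
2025-05-16	Conference	PGCon.Dev 2025 (global PG developer conference)	Lightning talk: Extension Delivery - Make your PGEXT accessible to users
2025-05-12	Conference	PGCon.Dev 2025 Extension Summit	The missing package manager and extension repository for the PG ecosystem - pig
2025-04-19	Workshop	PostgreSQL Database Technology Summit	Launching PG ecosystem companions Dify, Odoo, and Supabase with Pigsty
2025-04-11	Livestream host	OSChina (开源中国): 数智漫谈	Is the wildly popular MCP hype or a real tool?
2025-01-15	Livestream talk	Open-Source Projects: Veterans and Newcomers, Episode 4	Are PostgreSQL extensions devouring the database world? The PG package manager pig and the self-hosted RDS distribution Pigsty
2025-01-09	Award ceremony	OSChina 2024 Outstanding Contribution Expert	OSChina 2024 Outstanding Contribution Expert
2025-01-06	Panel	China PostgreSQL Database Ecosystem Conference	PostgreSQL is devouring the database world through extensions
2024-11-23	Podcast	Podcast: 科技乱炖	Starting from the Linux Foundation: why has everyone been so keen on "chokeholds" in recent years?
2024-08-21	Media interview	Blue Tech Wave: Interview with Feng Ruohang, author of Pigsty	Simplifying PostgreSQL management and advancing the Chinese open-source community
2024-08-15	Conference	GOTC Global Open-source Technology Conference	PostgreSQL AI/ML/RAG extension ecosystem and best practices
2024-07-12	Keynote	13th PG China Technology Conference	The future of the database world: Extensions, Service, and Postgres
2024-05-31	Conference	PGCon.Dev 2024 Unconference	Built-in Prometheus Metrics Exporter
2024-05-28	Topic session	PGCon.Dev 2024 Extension Summit	Extension in Core & Binary Packing
2024-05-29	Topic discussion	13th PG China Technology Conference	The future of the database world: Extensions, Service, and Postgres
2024-05-10	Live conversation	明说三人行: Cloud Computing Mudslide series, Episode 3	Is the public cloud a pig-butchering scam?
2024-04-17	Live conversation	明说三人行: Cloud Computing Mudslide series, Episode 2	Are cloud databases an IQ tax?
2024-04-16	Panel	Cloudflare Immerse Shenzhen	Cyber Bodhisattva panel
2024-04-12	Conference	Data Technology Carnival 2024	Pigsty: solving PostgreSQL operations challenges
2024-03-31	Live conversation	明说三人行: Cloud Computing Mudslide series, Episode 1	Lao Luo sells cloud, shall we leave the cloud?
2024-01-24	Livestream host	OSChina: Open Source Talk, Episode 9	Will DBAs be made obsolete by the cloud?
2023-12-20	Livestream debate	OSChina: Open Source Talk, Episode 7	Cloud migration or cloud exit: fleecing users, or cutting costs and raising efficiency?
2023-11-24	Conference	Synced (机器之心): Vector Databases in the Era of Large Models	Panel: the new future of vector databases in the era of large models
2023-09-08	Profile interview	Modb (墨天轮) Notable Figures Interview	Feng Ruohang: a tech geek who does not want to be a jokester is not a good open-source founder
2023-08-16	Conference	DTCC 2023	DBA Night: open-source license issues of PostgreSQL vs MySQL
2023-08-09	Livestream debate	Open Source Talk, Episode 1	MySQL vs PostgreSQL: who is number one in the world?
2023-07-01	Conference	SACC 2023	Workshop 8: FinOps in practice: cloud cost management and optimization
2023-05-12	Offline event	PostgreSQL China Community Wenzhou meetup	PG With DB4AI: vector database PGVECTOR & AI4DB: database autopilot Pigsty
2023-04-08	Conference	Database Carnival 2023	A better open-source RDS alternative: Pigsty
2023-04-01	Conference	PostgreSQL China Community Xi'an meetup	Best practices for PG high availability and disaster recovery
2023-03-23	Public livestream	Bytebase x Pigsty	Best practices for managing PostgreSQL: Bytebase x Pigsty
2023-03-04	Conference	PostgreSQL China Technology Conference	Bombarding RDS: Pigsty v2.0 released
2023-02-01	Conference	DTCC 2022	Open-source RDS alternative: Pigsty, the out-of-the-box, self-driving database distribution
2022-07-21	Livestream debate	The cloud is devouring open source; can open source fight back?	The cloud is devouring open source; can open source fight back?
2022-07-04	Profile interview	Feature interview: 创造者说	Post-90s, quit his job to start a company, says he will out-compete cloud databases
2022-06-28	Public livestream	贝斯的圆桌趴 | Good news for DBAs	SQL review best practices
2022-06-12	Public demo day	MiraclePlus (奇绩创坛) S22 Demo Day	Pigsty, the easy-to-use, cost-saving database distribution
2022-06-05	Video livestream	PG Chinese community livestream	Pigsty v1.5 quick start, new features, and production cluster setup
2021-08-01	Conference	GOTC Global Open-source Technology Conference 2021	New shoots on an old tree: Pigsty, the out-of-the-box open-source PostgreSQL distribution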

GitHub Trends

Star History: pgsty/pigsty

OSSRank: PostgreSQL Ecosystem

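2.6 - Join the Community

Pigsty is a Build in Public project. We are very active on GitHub, and Chinese-speaking users are mainly active in WeChat groups.

GitHub

Our GitHub repository is at https://github.com/pgsty/pigsty . Feel free to give us a ⭐️.

Anyone is welcome to open a new Issue or create a Pull Request, propose features, and contribute to Pigsty.

Note that issues about the Pigsty documentation should be filed in the github.com/pgsty/pigsty.cc repository.

WeChat Groups

Chinese-speaking users are mainly active in WeChat groups. There are currently seven active groups; groups 1 to 4 are full, and joining the others requires adding the assistant's WeChat account to be invited.

To join the WeChat community, search for "Pigsty小助手" (WeChat ID pigsty-cc) and add a note or send the message "加群"; the assistant will add you to a group.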

International Community

Telegram: https://t.me/joinchat/gV9zfZraNPM3YjFh

Discord: https://discord.gg/j5pG8qfKxU

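You can also reach me by email: rh@vonng.com

Community Help

When you run into problems using Pigsty, you can ask the community for help. The more information you provide, the more likely you are to get help.

Please refer to the community help guide and provide as much information as possible so that community members can help you. Below is a template for asking for help:

What happened? (required)

Pigsty version and OS version (required)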

$ grep version pigsty.yml 

$ cat /etc/os-release

$ uname -a

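Some cloud vendors customize standard OS distributions; tell us which cloud vendor and which OS image you are using. If you customized or modified the environment after installing the OS, or have specific security rules and firewall settings on your LAN, please mention that too.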
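
The three commands in this step can be bundled into one helper so the output is easy to paste into a question. This is a minimal sketch, assuming it is run from the directory that contains pigsty.yml unless a path is passed; the function name `collect_env` is hypothetical and not part of Pigsty.

```shell
#!/bin/sh
# Hypothetical helper: print the Pigsty version line, OS release info,
# and kernel info in labeled sections. Missing files are reported
# instead of aborting the whole report.
collect_env() {
    config=${1:-pigsty.yml}
    echo "== pigsty version =="
    if [ -f "$config" ]; then
        grep version "$config" || echo "(no version line found in $config)"
    else
        echo "(config not found: $config)"
    fi
    echo "== os release =="
    if [ -r /etc/os-release ]; then
        cat /etc/os-release
    else
        echo "(no /etc/os-release on this system)"
    fi
    echo "== kernel =="
    uname -a
}
```

Running `collect_env ~/pigsty/pigsty.yml > env.txt` would gather everything into a single file to attach to the request.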

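Pigsty playbook output

Please provide the output of Pigsty playbook runs where possible, especially any warnings and errors.

Pigsty configuration file

Do not forget to redact any sensitive information: passwords, internal keys, sensitive settings, etc.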

cat ~/pigsty/pigsty.yml
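
One way to mask secrets before sharing the file is a small sed filter. This is a rough sketch under stated assumptions: it only masks values on lines of the form `key: value` whose key contains `password` or `secret` (a hypothetical naming pattern chosen for illustration), and it does not handle values nested inside inline `{ ... }` maps, so always review the output by hand. The function name `redact_config` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: mask the value of any "key: value" line whose
# key contains "password" or "secret". Reads the given files, or stdin.
redact_config() {
    sed -E 's/^([[:space:]]*[A-Za-z0-9_]*(password|secret)[A-Za-z0-9_]*:[[:space:]]*).*/\1******/' "$@"
}
```

Usage would look like `redact_config ~/pigsty/pigsty.yml > pigsty-redacted.yml`; lines that do not match the pattern pass through unchanged.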

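What did you expect to happen?

Describe what should happen under normal circumstances, and how the actual behavior deviates from what you expected.

How can the problem be reproduced?

Tell us the method and steps to reproduce the problem in as much detail as possible.

Monitoring screenshots

If you are using the monitoring system provided by Pigsty, you can provide relevant screenshots.

Error logs

Please provide logs related to the error where possible. Do not paste uninformative content such as "Failed to start xxx service".

You can query logs from Grafana / Loki, or get them from the following locations: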

Syslog: /var/log/messages (rhel) or /var/log/syslog (debian)
Postgres: /pg/log/postgres/*
Patroni: /pg/log/patroni/*
Pgbouncer: /pg/log/pgbouncer/*
Pgbackrest: /pg/log/pgbackrest/*

journalctl -u patroni
journalctl -u <service name>
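
To gather the tail of every log file under the directories listed above, a helper along these lines may be convenient. It is a minimal sketch: the function name `tail_logs` and the default of 50 lines are hypothetical choices, and reading files under /pg/log may require appropriate privileges on your system.

```shell
#!/bin/sh
# Hypothetical helper: print the last N lines (default 50) of each
# regular file in a log directory, with a header per file.
tail_logs() {
    dir=$1 n=${2:-50}
    [ -d "$dir" ] || { echo "skip: $dir (not a directory)"; return 0; }
    for f in "$dir"/*; do
        [ -f "$f" ] || continue      # skip subdirectories and empty globs
        echo "==> $f <=="
        tail -n "$n" "$f"
    done
}
```

For example, `for d in /pg/log/postgres /pg/log/patroni /pg/log/pgbouncer /pg/log/pgbackrest; do tail_logs "$d" 100; done > logs.txt` collects the most recent 100 lines of each component's logs into one file.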

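Have you checked the following?

Did you start the installation from a fresh node and OS environment?
Did you use the offline installation package, or install online?
Were there any errors or warnings during installation?
Does your network / security management impose additional restrictions?

Have you searched the Issues / website / FAQ?

The FAQ answers many common questions; please check it before asking.

You can also search for related problems in GitHub Issues and Discussions.

Is there anything else we need to know?

The more information and context you provide, the more likely we can help you solve the problem.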

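2.7 - Privacy Policy

What user data do the Pigsty software and website collect, and how do we handle your data and protect your privacy?

The Pigsty project team ("we") operates pigsty.io and pigsty.cc (the "website"). This page informs you of our policies on the collection, use, and disclosure of any personal information we receive from the website.

We use your personal information only to improve the website. By using the website, you agree to the collection and use of information in accordance with this policy.

Information Collection and Use

While you use our website, we may ask you to provide certain personally identifiable information that can be used to contact or identify you. Personally identifiable information may include, but is not limited to, your name and email address.

Log Data

Like many website operators, we collect information that your browser sends when you visit our website ("log data").

This log data may include information such as your computer's Internet Protocol ("IP") address, browser type, browser version, the pages of our website that you visit, the time and date of your visit, the time spent on those pages, and other statistics.

In addition, we use Google Analytics to collect, monitor, and analyze this information.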

Cookies

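This website uses Google Analytics, a web analytics service provided by Google, Inc. ("Google"). Google Analytics uses "cookies", text files placed on your computer, to help the website analyze how users use it.

The information generated by the cookie about your use of the website (including your IP address) will be transmitted to and stored by Google on servers in the United States. Google will use this information to evaluate your use of the website, compile reports on website activity for website operators, and provide other services relating to website activity and Internet usage. Google may also transfer this information to third parties where required by law, or where such third parties process the information on Google's behalf.

Google will not associate your IP address with any other data held by Google. You may refuse the use of cookies by selecting the appropriate settings in your browser, but note that if you do so you may not be able to use the full functionality of this website. By using this website, you consent to the processing of data about you by Google in the manner and for the purposes set out above.

Communications

We may use your email address to contact you with newsletters, marketing or promotional materials, and other information that helps you better understand and use our products and services. You can unsubscribe by clicking the link in the email or by updating your preferences on the website.

Security

The security of your personal information is important to us, but remember that no method of transmission over the Internet or method of electronic storage is 100% secure. We strive to use commercially acceptable means to protect your personal information, but we cannot guarantee its absolute security.

Changes to This Privacy Policy

This privacy policy is effective as of January 1, 2020, and will remain in effect except with respect to any future changes to its provisions, which take effect immediately after being posted on this website.

We reserve the right to update or change our privacy policy at any time, and you should check this privacy policy periodically. Your continued use of the service after we post any modifications to the privacy policy on this page constitutes your acknowledgment of the modifications and your consent to abide and be bound by the modified privacy policy.

If we make any material changes to this privacy policy, we will notify you through the email address you have provided to us, or by placing a prominent notice on our website.

If you have any questions or comments about this policy, or wish to request deletion of personal data, you can contact me by email at rh@vonng.com.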

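2.8 - License

The open-source license used by Pigsty is AGPLv3. What rights does it grant you, and what restrictions does it impose?

Project license: https://github.com/pgsty/pigsty/blob/main/LICENSE

License Summary

Pigsty uses the GNU Affero General Public License v3.0 (AGPLv3), a strong copyleft license.

If you distribute or host Pigsty over a network, or create derivative works of it, the GNU AGPLv3 requires you to distribute the complete corresponding source code of the combined work under the same GNU AGPL v3 license. This requirement applies whether or not you have modified Pigsty.

This license permits:

Commercial use
Modification
Distribution
Patent grant
Private use

This license does not provide:

Trademark rights
Liability and warranty

Conditions of this license:

Include this license and a prominent notice
Do not change its open-source status
Disclose source code
Use over a network counts as distribution
Release under the same license

Note that the Pigsty website (pigsty.io / pigsty.cc) itself uses the CC BY 4.0 license, a Creative Commons license that lets you freely share and adapt the site's content, provided you give appropriate credit, provide a link to the license, and indicate whether changes were made to the original content.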

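Exemption Terms

Pigsty adopts the AGPLv3 license, but for ordinary end users (that is, users other than public cloud vendors and database vendors), an equivalent of the Apache 2.0 license applies.

Ordinary end users can use Pigsty for commercial purposes, to provide services, and for secondary development without worrying about licensing. Even if an ordinary end user does secondary development on Pigsty and, contrary to the AGPL, does not open-source it, as long as the use is within reasonable bounds we will not pursue the matter, and we waive the user's related open-source obligations. Under a support subscription, we can provide a written commitment on the liability waiver.

We support and welcome PostgreSQL service providers who comply with AGPLv3 to deliver on top of Pigsty, offer their own paid consulting and support services, and contribute their secondary development back to the upstream community mainline. Note that PIGSTY® is a registered trademark; when providing services and consulting, you should respect the trademark and copyright of PIGSTY. We offer OEM partnership programs; see commercial support for details.

Why AGPLv3

The spirit of AGPLv3 is to secure users' software freedom and, through lawful discrimination against public cloud vendors, to redraw the boundary of the community and exclude them from community participation.

AGPLv3 does not affect ordinary use: because using is not a form of "release", you do not need to worry about whether business code that uses Pigsty must be open-sourced.

Only when you "release" Pigsty, or a modification of Pigsty, as all or part of a software product / service do you need to consider the constraints of AGPLv3.

Here are the official GNU answers to GPL-related questions:

GNU licenses FAQ

SBOM

The core open-source software this project depends on or relates to, and their licenses. For the licenses of the 421 PostgreSQL extensions, see the PG extension list.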

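Module	Software	License	Purpose and description	Necessity
PGSQL	PostgreSQL	PostgreSQL License	PostgreSQL kernel	Required
PGSQL	patroni	MIT License	Provides PostgreSQL high availability	Required
ETCD	etcd	Apache License 2.0	Provides HA consensus and distributed configuration storage	Required
INFRA	Ansible	GPLv3	Control tool: runs playbooks and issues management commands	Required
INFRA	Nginx	BSD-2	Exposes web UIs and serves the local software repository	Required
PGSQL	pgbackrest	MIT License	Provides PITR backup / restore management	Recommended
PGSQL	pgbouncer	ISC License	Provides PostgreSQL connection pooling	Recommended
PGSQL	vip-manager	BSD 2-Clause License	Automatically binds an L2 VIP to the PG cluster primary	Recommended
PGSQL	pg_exporter	Apache License 2.0	Monitors PostgreSQL and PgBouncer	Recommended
NODE	node_exporter	Apache License 2.0	Provides host node monitoring	Recommended
NODE	haproxy	HAPROXY’s License (GPLv2)	Provides load balancing and exposes services externally	Recommended
INFRA	Grafana	AGPLv3	Provides the database visualization platform	Recommended
INFRA	Prometheus Stack	Apache License 2.0	Provides time-series storage for monitoring, metrics collection, and alerting	Recommended
INFRA	Loki	AGPLv3	Provides the centralized log collection, storage, and query platform	Recommended
INFRA	DNSMASQ	GPLv2 / GPLv3	Provides DNS resolution and cluster-name lookup	Recommended
MINIO	MinIO	AGPLv3	Provides S3-compatible object storage	Optional
NODE	keepalived	MIT License	Provides a VIP bound to the node cluster	Optional
REDIS	Redis	Redis License (BSD-3)	Cache service used alongside PG, version pinned to 7.2.6	Optional
REDIS	Redis Exporter	MIT License	Provides Redis monitoring	Optional
MONGO	FerretDB	Apache License 2.0	Provides PG-based MongoDB compatibility	Optional
DOCKER	docker-ce	Apache License 2.0	Provides container management	Optional
CLOUD	SealOS	Apache License 2.0	Provides rapid deployment, replication, and packaging of K8S clusters	Optional
DUCKDB	DuckDB	MIT	Provides easy-to-use, high-performance analytics	Optional
External	Vagrant	Business Source License 1.1	Launches local test-environment VMs	As needed
External	Terraform	Business Source License 1.1	Requests cloud resources for deployment in one step	As needed
External	Virtualbox	GPLv2	Virtual machine management software	As needed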

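Necessity levels:

Required: provides Pigsty's critical core capabilities; there is no option to disable it
Recommended: components enabled by default in Pigsty; can be disabled through configuration
Optional: components Pigsty downloads by default but does not enable; can be enabled through configuration
As needed: tools used on demand; can be configured, downloaded, and enabled as required

License Text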

                    GNU AFFERO GENERAL PUBLIC LICENSE
                       Version 3, 19 November 2007

 Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

                            Preamble

  The GNU Affero General Public License is a free, copyleft license for
software and other kinds of works, specifically designed to ensure
cooperation with the community in the case of network server software.

  The licenses for most software and other practical works are designed
to take away your freedom to share and change the works.  By contrast,
our General Public Licenses are intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users.

  When we speak of free software, we are referring to freedom, not
price.  Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.

  Developers that use our General Public Licenses protect your rights
with two steps: (1) assert copyright on the software, and (2) offer
you this License which gives you legal permission to copy, distribute
and/or modify the software.

  A secondary benefit of defending all users' freedom is that
improvements made in alternate versions of the program, if they
receive widespread use, become available for other developers to
incorporate.  Many developers of free software are heartened and
encouraged by the resulting cooperation.  However, in the case of
software used on network servers, this result may fail to come about.
The GNU General Public License permits making a modified version and
letting the public access it on a server without ever releasing its
source code to the public.

  The GNU Affero General Public License is designed specifically to
ensure that, in such cases, the modified source code becomes available
to the community.  It requires the operator of a network server to
provide the source code of the modified version running there to the
users of that server.  Therefore, public use of a modified version, on
a publicly accessible server, gives the public access to the source
code of the modified version.

  An older license, called the Affero General Public License and
published by Affero, was designed to accomplish similar goals.  This is
a different license, not a version of the Affero GPL, but Affero has
released a new version of the Affero GPL which permits relicensing under
this license.

  The precise terms and conditions for copying, distribution and
modification follow.

                       TERMS AND CONDITIONS

  0. Definitions.

  "This License" refers to version 3 of the GNU Affero General Public License.

  "Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.

  "The Program" refers to any copyrightable work licensed under this
License.  Each licensee is addressed as "you".  "Licensees" and
"recipients" may be individuals or organizations.

  To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy.  The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.

  A "covered work" means either the unmodified Program or a work based
on the Program.

  To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy.  Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.

  To "convey" a work means any kind of propagation that enables other
parties to make or receive copies.  Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.

  An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License.  If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.

  1. Source Code.

  The "source code" for a work means the preferred form of the work
for making modifications to it.  "Object code" means any non-source
form of a work.

  A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.

  The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form.  A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.

  The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities.  However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work.  For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.

  The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.

  The Corresponding Source for a work in source code form is that
same work.

  2. Basic Permissions.

  All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met.  This License explicitly affirms your unlimited
permission to run the unmodified Program.  The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work.  This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.

  You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force.  You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright.  Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.

  Conveying under any other circumstances is permitted solely under
the conditions stated below.  Sublicensing is not allowed; section 10
makes it unnecessary.

  3. Protecting Users' Legal Rights From Anti-Circumvention Law.

  No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.

  When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.

  4. Conveying Verbatim Copies.

  You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.

  You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.

  5. Conveying Modified Source Versions.

  You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:

    a) The work must carry prominent notices stating that you modified
    it, and giving a relevant date.

    b) The work must carry prominent notices stating that it is
    released under this License and any conditions added under section
    7.  This requirement modifies the requirement in section 4 to
    "keep intact all notices".

    c) You must license the entire work, as a whole, under this
    License to anyone who comes into possession of a copy.  This
    License will therefore apply, along with any applicable section 7
    additional terms, to the whole of the work, and all its parts,
    regardless of how they are packaged.  This License gives no
    permission to license the work in any other way, but it does not
    invalidate such permission if you have separately received it.

    d) If the work has interactive user interfaces, each must display
    Appropriate Legal Notices; however, if the Program has interactive
    interfaces that do not display Appropriate Legal Notices, your
    work need not make them do so.

  A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit.  Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.

  6. Conveying Non-Source Forms.

  You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:

    a) Convey the object code in, or embodied in, a physical product
    (including a physical distribution medium), accompanied by the
    Corresponding Source fixed on a durable physical medium
    customarily used for software interchange.

    b) Convey the object code in, or embodied in, a physical product
    (including a physical distribution medium), accompanied by a
    written offer, valid for at least three years and valid for as
    long as you offer spare parts or customer support for that product
    model, to give anyone who possesses the object code either (1) a
    copy of the Corresponding Source for all the software in the
    product that is covered by this License, on a durable physical
    medium customarily used for software interchange, for a price no
    more than your reasonable cost of physically performing this
    conveying of source, or (2) access to copy the
    Corresponding Source from a network server at no charge.

    c) Convey individual copies of the object code with a copy of the
    written offer to provide the Corresponding Source.  This
    alternative is allowed only occasionally and noncommercially, and
    only if you received the object code with such an offer, in accord
    with subsection 6b.

    d) Convey the object code by offering access from a designated
    place (gratis or for a charge), and offer equivalent access to the
    Corresponding Source in the same way through the same place at no
    further charge.  You need not require recipients to copy the
    Corresponding Source along with the object code.  If the place to
    copy the object code is a network server, the Corresponding Source
    may be on a different server (operated by you or a third party)
    that supports equivalent copying facilities, provided you maintain
    clear directions next to the object code saying where to find the
    Corresponding Source.  Regardless of what server hosts the
    Corresponding Source, you remain obligated to ensure that it is
    available for as long as needed to satisfy these requirements.

    e) Convey the object code using peer-to-peer transmission, provided
    you inform other peers where the object code and Corresponding
    Source of the work are being offered to the general public at no
    charge under subsection 6d.

  A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.

  A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling.  In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage.  For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product.  A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.

  "Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source.  The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.

  If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information.  But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).

  The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed.  Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.

  Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.

  7. Additional Terms.

  "Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law.  If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.

  When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it.  (Additional permissions may be written to require their own
removal in certain cases when you modify the work.)  You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.

  Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:

    a) Disclaiming warranty or limiting liability differently from the
    terms of sections 15 and 16 of this License; or

    b) Requiring preservation of specified reasonable legal notices or
    author attributions in that material or in the Appropriate Legal
    Notices displayed by works containing it; or

    c) Prohibiting misrepresentation of the origin of that material, or
    requiring that modified versions of such material be marked in
    reasonable ways as different from the original version; or

    d) Limiting the use for publicity purposes of names of licensors or
    authors of the material; or

    e) Declining to grant rights under trademark law for use of some
    trade names, trademarks, or service marks; or

    f) Requiring indemnification of licensors and authors of that
    material by anyone who conveys the material (or modified versions of
    it) with contractual assumptions of liability to the recipient, for
    any liability that these contractual assumptions directly impose on
    those licensors and authors.

  All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10.  If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term.  If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.

  If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.

  Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.

  8. Termination.

  You may not propagate or modify a covered work except as expressly
provided under this License.  Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).

  However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.

  Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.

  Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License.  If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.

  9. Acceptance Not Required for Having Copies.

  You are not required to accept this License in order to receive or
run a copy of the Program.  Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance.  However,
nothing other than this License grants you permission to propagate or
modify any covered work.  These actions infringe copyright if you do
not accept this License.  Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.

  10. Automatic Licensing of Downstream Recipients.

  Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License.  You are not responsible
for enforcing compliance by third parties with this License.

  An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations.  If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.

  You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License.  For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.

  11. Patents.

  A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based.  The
work thus licensed is called the contributor's "contributor version".

  A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version.  For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.

  Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.

  In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement).  To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.

  If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients.  "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.

  If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.

  A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License.  You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.

  Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.

  12. No Surrender of Others' Freedom.

  If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License.  If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all.  For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.

  13. Remote Network Interaction; Use with the GNU General Public License.

  Notwithstanding any other provision of this License, if you modify the
Program, your modified version must prominently offer all users
interacting with it remotely through a computer network (if your version
supports such interaction) an opportunity to receive the Corresponding
Source of your version by providing access to the Corresponding Source
from a network server at no charge, through some standard or customary
means of facilitating copying of software.  This Corresponding Source
shall include the Corresponding Source for any work covered by version 3
of the GNU General Public License that is incorporated pursuant to the
following paragraph.

  Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU General Public License into a single
combined work, and to convey the resulting work.  The terms of this
License will continue to apply to the part which is the covered work,
but the work with which it is combined will remain governed by version
3 of the GNU General Public License.

  14. Revised Versions of this License.

  The Free Software Foundation may publish revised and/or new versions of
the GNU Affero General Public License from time to time.  Such new versions
will be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

  Each version is given a distinguishing version number.  If the
Program specifies that a certain numbered version of the GNU Affero General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation.  If the Program does not specify a version number of the
GNU Affero General Public License, you may choose any version ever published
by the Free Software Foundation.

  If the Program specifies that a proxy can decide which future
versions of the GNU Affero General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.

  Later license versions may give you additional or different
permissions.  However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.

  15. Disclaimer of Warranty.

  THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

  16. Limitation of Liability.

  IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.

  17. Interpretation of Sections 15 and 16.

  If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.

                     END OF TERMS AND CONDITIONS

            How to Apply These Terms to Your New Programs

  If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.

  To do so, attach the following notices to the program.  It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

    Copyright (C) 2018-2024  Ruohang Feng, Author of Pigsty

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU Affero General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU Affero General Public License for more details.

    You should have received a copy of the GNU Affero General Public License
    along with this program.  If not, see <https://www.gnu.org/licenses/>.

Also add information on how to contact you by electronic and paper mail.

  If your software can interact with users remotely through a computer
network, you should also make sure that it provides a way for users to
get its source.  For example, if your program is a web application, its
interface could display a "Source" link that leads users to an archive
of the code.  There are many ways you could offer source, and different
solutions will be better for different programs; see section 13 for the
specific requirements.

  You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU AGPL, see
<https://www.gnu.org/licenses/>.

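2.9 - Sponsor Us

Pigsty's sponsors and investors: thank you for supporting this project!

Sponsor Us

Pigsty is free and open-source software, built with passion by members of the PostgreSQL community. It aims to bring together the strengths of the PostgreSQL ecosystem and promote wider adoption of PostgreSQL. If our work has helped you, please consider sponsoring or supporting the project:

Sponsor us directly with a donation, the most direct and powerful way to show your support.
Consider purchasing our technical support services: we provide professional deployment and maintenance of highly available PostgreSQL clusters, so your budget is well spent.
Share your Pigsty use cases and experience through articles, talks, and videos.
Allow us to mention your organization under "Users of Pigsty".
Recommend our project and services to friends, colleagues, and customers who may need them.
Follow our WeChat official account and share relevant technical articles with your groups and on Moments.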

Angel Investors

Pigsty is a project funded by MiraclePlus (奇绩创坛, formerly YC China) in its S22 batch. We thank MiraclePlus and Dr. Qi Lu for their support!

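2.10 - Industry Cases

Pigsty customers and use cases across fields and industries

If you are using Pigsty and are willing to share your story with us, we will provide one free consulting session in return.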

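Internet

Tantan (探探): 200+ physical machines running PostgreSQL and Redis services
Bilibili (哔哩哔哩): supporting innovative PostgreSQL-based business

Finance

AirWallex: monitoring 200+ GCP PostgreSQL databases

Autonomous Driving

Momenta: autonomous driving; managing self-hosted PostgreSQL clusters

Media

MediaStorm (影视飓风): supporting DaVinci and other business software. See the article "影视飓风达芬奇千万级数据库演化及实践" (on the evolution and practice of MediaStorm's ten-million-scale DaVinci database).

Technology and Innovation

北京领雾科技: moving PostgreSQL off the cloud to self-hosting
Motphys: self-hosted PostgreSQL backing GitLab
赛陇生物科技: self-hosted Supabase
杭州零码科技: self-hosted PostgreSQL
A NASDAQ-listed company in Singapore

ISV

内蒙古豪德天沐科技有限公司
上海元芳
DSG

Manufacturing

华峰集团 (Huafon Group): uses Pigsty to deliver PostgreSQL clusters as a data warehouse.

Defense

机械工业研究所
航天一院
An undisclosed organization in Beijing
An undisclosed organization in Shanghai
电科36所

Cloud Vendors

OCI: uses Pigsty to deliver PostgreSQL clusters.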

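2.11 - Subscription

Pigsty Professional/Enterprise subscription: when you run into difficult problems with PostgreSQL or Pigsty, the subscription service has your back.

Pigsty aims to unite the strengths of the PG ecosystem and, through self-driving database management software, help users make the most of PostgreSQL, the world's most popular database.

Although Pigsty itself already solves many problems in running PG, expert support and backstop services from the original vendor are indispensable for true enterprise-grade service quality. We understand how important professional commercial support is to enterprise customers. Pigsty Enterprise therefore builds on the open-source edition with a set of value-added services that help users get more out of PostgreSQL and Pigsty, available to customers as needed.

If you have any of the following needs, consider a Pigsty subscription:

You run databases in critical scenarios and need a strict SLA as a backstop.
You want a backstop for difficult problems involving Pigsty and PostgreSQL.
You want guidance on production best practices for PostgreSQL / Pigsty.
You want experts to help interpret monitoring dashboards, locate performance bottlenecks and root causes of failures, and give recommendations.
You want to plan a database architecture that meets security, disaster recovery, and compliance requirements, based on your existing resources and business needs.
You need to migrate other databases to PostgreSQL, or migrate and refactor legacy instances.
You are building an observability stack, dashboards, or visualization applications on Prometheus / Grafana.
You need support for domestic (Xinchuang) operating systems or domestic ARM chip architectures, with Chinese / localized interface support.
You are moving off the cloud and looking for an open-source alternative to RDS for PostgreSQL: a cloud-neutral solution with no vendor lock-in.
You want professional support for Redis / ETCD / MinIO, and for extensions such as TimescaleDB / Citus.
You want to avoid the AGPL v3 requirement that derivative works be open-sourced under the same license, in order to do custom development or OEM rebranding.
You want to sell Pigsty as SaaS / PaaS / DBaaS, or to offer technical services, consulting, or cloud services based on this distribution.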

Subscription Plans

Pigsty offers three subscription plans: Standard, Professional, and Enterprise. Choose the one that fits your situation and needs.

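Pigsty Standard (STD)

Price: ¥50,000 / year

For lean teams that need outside help

License: commercial license

Scale: <= 5 nodes

PG support: 17

OS support: three release lines, specific major and minor versions

EL 9 compatible
Debian 12
Ubuntu 22
Latest stable minor version

Architecture: x86_64, Arm64

Standard consulting services:

Open-source community support, plus:
Bug fixes and security patches
Analysis and handling of difficult problems
Self-hosting guides for PG-related software
Ticket support via email

Support: one-time architecture consultation

Q&A: <= 90 minutes / month

SLA: 5x8, response within the same business day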

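Pigsty Professional (PRO)

Price: ¥150,000 / year

The default choice for most users

License: commercial license

Scale: <= 15 nodes

PG support: 17, 16, 15

OS support: five release lines, specific major and minor versions

EL 8 / 9 compatible
Debian 12
Ubuntu 22 / 24
Builds for a specified minor version

Architecture: x86_64, Arm64

Professional consulting services:

Everything in Standard, plus:
Upgrade path support and tutorials
Performance bottleneck analysis and tuning
HA / PITR planning and drills
Real-time support via IM

Support: 1 expert man-day per year

Q&A: <= 5 hours / month

SLA: 5x8 (< 4h)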

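Pigsty Enterprise (ENT)

Starting price: ¥400,000 / year

For critical scenarios with strict SLAs

License: commercial license

Scale: <= 50 nodes, extendable without limit

PG support: 17, 16, 15, 14, 13

OS support: seven release lines + custom on demand

EL 7 / 8 / 9
Debian 11 / 12
Ubuntu 20 / 22 / 24
Domestic (Xinchuang) operating systems

Architecture: x86_64, Arm64

Enterprise consulting services:

Everything in Professional, plus:
Custom PG kernels and extensions
RDS / DBaaS and OEM use cases
Database management framework and compliance
On-call phone support for emergencies

Support: 2 expert man-days per year

Q&A: <= 10 hours / month

SLA: 7x24 (< 30 minutes)

Xinchuang: PolarDB-O support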

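Pigsty Open Source

Open-source edition: free of charge, free as in freedom, no warranty

Pigsty is built on open source and gives back to open source: it is a gift to the PostgreSQL community and all its users. You get Pigsty's complete core functionality without paying anything. As is typical of open-source software, the open-source edition comes with no warranty and accepts no liability for the consequences of its use. If you need a warranty, please consider our subscription service.

Pigsty Open Source is released under the AGPLv3 license, a strict copyleft license. If you are an ordinary end user (that is, anyone other than public cloud vendors and database vendors), we will not pursue any claims over your custom development on Pigsty; in practice this is equivalent to the more permissive Apache 2.0 license.

If you find a defect in Pigsty, you are very welcome to open an issue on GitHub and help us improve. If you have questions, you can ask for help in the community forum.

For the open-source edition, we provide prebuilt standard offline packages for PostgreSQL 17 on the latest minor versions of three specific OS releases: EL 9.5, Debian 12.7, and Ubuntu 22.04.5 (as a gesture of support for open source, aarch64 offline packages are also provided for Debian 12).

With Pigsty Open Source, junior developers and operations engineers can gain 70%+ of the capability of a professional DBA and, even without a database expert, easily build a PostgreSQL cluster that is highly available, high-performance, easy to maintain, secure, and reliable. When self-hosting in the cloud, you immediately save the price difference between EC2/ESSD and RDS, cutting costs by up to an order of magnitude.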

Code	Distro major version	Minor version	`x86_64`	`aarch64`
EL9	RHEL 9 / Rocky9 / Alma9	`9.4`	`el9.x86_64`	`el9.aarch64`
D12	Debian 12 (`bookworm`)	`12.7`	`d12.x86_64`	`d12.aarch64`
U22	Ubuntu 22.04 (`jammy`)	`22.04.5`	`u22.x86_64`	`u22.aarch64`

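Legend (the status icons were lost in this print view): primary support with full service; not available; other PG versions can be installed yourself via online installation, but without subscription service.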

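Pigsty Standard

Standard subscription: ¥50,000 / year

The Pigsty Standard subscription gives small and medium-sized startups an affordable backstop: we provide warranty and backstop support for the Pigsty software itself. The Standard subscription uses a dedicated commercial license, with a written contractual commitment exempting you from the AGPLv3 open-source obligations for the Pigsty software itself.

Standard customers receive a one-time architecture consultation, in which we propose a database architecture suited to your actual environment and available resources. Whether you want to build business systems on PostgreSQL or self-host services such as Odoo, Dify, Supabase, and GitLab, we can provide end-to-end support to get you up and running (including offline installation and proxy setup for restricted networks).

The Standard subscription includes basic expert Q&A through tickets, and we commit to responding to your questions within business hours on working days. If you run into difficult problems that need more support, our expert man-day service is also available to you. Our experience can spare you many detours and pitfalls in PostgreSQL-related areas and save you time, effort, and money.

Pigsty Standard provides offline installation packages for the latest stable minor versions of mainstream open-source Linux distributions (EL9, Debian 12, Ubuntu 22.04), for both x86_64 and aarch64. They contain the PostgreSQL 17 kernel and all available extensions and have been smoke-tested, so installation is fast, stable, efficient, and version-consistent, unaffected by network conditions or changes in upstream software repositories.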

Pigsty Standard is priced at ¥50,000 / year, roughly the annual fee of a 4 vCPU highly available AWS RDS PG instance, or the annual salary of an intern paid ¥4,000 a month.

Code	Distro major version	Minor version	`x86_64`	`aarch64`
EL9	RHEL 9 / Rocky9 / Alma9	`9.4`	`el9.x86_64`	`el9.aarch64`
D12	Debian 12 (`bookworm`)	`12.7`	`d12.x86_64`	`d12.aarch64`
U22	Ubuntu 22.04 (`jammy`)	`22.04.5`	`u22.x86_64`	`u22.aarch64`

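Pigsty Professional

Professional subscription: starting at ¥150,000 / year

The Pigsty Professional subscription is the default choice for most users. In addition to the services in Pigsty Standard, we provide more advanced consulting. The Professional subscription covers analysis of difficult problems and tuning of performance bottlenecks, so that top-tier DBA expertise is available to you without friction when it matters.

We provide Professional customers with thorough architecture consulting: based on your business needs and resources, we plan the best database architecture and make sure it can actually be put into practice. We also help you carry out high availability and PITR drills, and provide training on Pigsty's monitoring system, configuration methods, and administration commands.

The Professional subscription offers a higher level of support: one expert man-day per year, plus DBA consulting and Q&A of, in principle, no more than 5 hours per month. It also comes with a stronger response SLA: for routine issues, we commit to responding within four hours during business hours (5x8).

Pigsty Professional supports a wider range of operating systems, adding the EL 8 and Ubuntu 24.04 major releases, with aarch64 support for all major releases. If you are not on the latest minor version, we can build custom offline packages for the minor version you specify. The offline packages also include all Pigsty modules, such as the PG fork kernels (IvorySQL, PolarDB, Babelfish) and all Pro/Beta modules.

Pigsty provides professional support for the three most recent PostgreSQL major versions (17, 16, 15), along with expert guidance on PG major version upgrades and Pigsty upgrades.

Pigsty Professional starts at ¥150,000 / year, roughly the annual fee of a 9 vCPU highly available AWS RDS PG instance, or the annual salary of a junior operations engineer paid ¥10,000 a month.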

Code	Distro major version	Minor version	`x86_64`	`aarch64`
EL9	RHEL 9 / Rocky9 / Alma9	`9.4`	`el9.x86_64`	`el9.aarch64`
D12	Debian 12 (`bookworm`)	`12.7`	`d12.x86_64`	`d12.aarch64`
U22	Ubuntu 22.04 (`jammy`)	`22.04.5`	`u22.x86_64`	`u22.aarch64`
U24	Ubuntu 24.04 (`noble`)	`24.04.5`	`u24.x86_64`	`u24.aarch64`
EL8	RHEL 8 / Rocky8 / Alma8	`8.10`	`el8.x86_64`	`el8.aarch64`

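Pigsty Enterprise

Enterprise subscription: starting at ¥400,000 / year

Pigsty Enterprise is intended for medium and large enterprises, or for critical scenarios with strict SLA requirements. In the Enterprise edition we provide our strongest level of support and stand ready to meet your database needs.

In the Enterprise subscription, we help you design and plan the best database architecture and make sure it is put into practice. Beyond database drills, stress tests, and performance benchmarks, we also provide consulting and training on management practices, helping you establish a sound database management framework that meets security and compliance requirements.

Pigsty Enterprise includes two expert man-days per year, plus up to 10 hours per month of DBA consulting and Q&A. For routine issues, we commit to responding within 30 minutes on a 7x24 basis, and we always handle your requests with priority and as promptly as possible.

Pigsty Enterprise offers the widest operating system support, adding out-of-support major releases such as EL7, Debian 11, and Ubuntu 20.04. Support can also be customized for "domestic operating systems" such as Euler, Anolis, UOS, and Kylin, and for "cloud operating systems" such as TencentOS, AliOS, and OpenCloudOS.

Pigsty Enterprise provides professional support for all PostgreSQL major versions within their lifecycle (13 - 17), so that you can upgrade smoothly between major versions without downtime. You can use the included man-days to upgrade in-scope PostgreSQL clusters to the latest major version through zero-downtime blue-green deployment.

Pigsty Enterprise supports "domestic Xinchuang" requirements: if your database is subject to localization requirements, we offer a Xinchuang solution based on PolarDB v2.0 jointly with Alibaba Cloud.

Pigsty Enterprise allows you to use Pigsty for DBaaS purposes within a specified scale, building a cloud database service for sale. It also permits OEM use: you may distribute Pigsty under your own logo, trademark, and brand within the agreed scale.

Pigsty Enterprise starts at ¥400,000 / year, roughly the annual fee of a 24 vCPU highly available AWS RDS instance, or the annual salary of an operations expert paid ¥30,000 a month.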

Code	Distro major version	Minor version	`x86_64`	`aarch64`
EL9	RHEL 9 / Rocky9 / Alma9	`9.4`	`el9.x86_64`	`el9.aarch64`
D12	Debian 12 (`bookworm`)	`12.7`	`d12.x86_64`	`d12.aarch64`
U22	Ubuntu 22.04 (`jammy`)	`22.04.5`	`u22.x86_64`	`u22.aarch64`
U24	Ubuntu 24.04 (`noble`)	`24.04.5`	`u24.x86_64`	`u24.aarch64`
EL8	RHEL 8 … / Anolis8	`8.10`	`el8.x86_64`	`el8.aarch64`
EL7	RHEL 7 … / UOS / Euler	`7.9`	`el7.x86_64`	`el7.aarch64`
D11	Debian 11 (`bullseye`)	`11.11`	`d11.x86_64`	`d11.aarch64`
U20	Ubuntu 20.04 (`focal`)	`20.04.6`	`u20.x86_64`	`u20.aarch64`

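Pricing

Pigsty subscriptions are billed annually. Once the contract is signed, the one-year term runs from the date agreed in the contract. If payment continues before the subscription contract expires, it is treated as an automatic renewal. Consecutive subscriptions are discounted: the first renewal (second year) gets 5% off, the second and later renewals get 10% off the subscription fee, and a one-off subscription of three years or more gets 15% off the total.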
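
The renewal discounts can be worked through as plain arithmetic. The Python sketch below is only an illustration, not an official price calculator: the function names are hypothetical, and it assumes that each discount applies to the list price of the plan and that "three years or more" includes exactly three years.

```python
def renewal_percent(year):
    """Percent of list price charged in a given year when renewing annually."""
    if year < 1:
        raise ValueError("subscription years start at 1")
    if year == 1:
        return 100  # first year: full price
    if year == 2:
        return 95   # first renewal: 5% off
    return 90       # second and later renewals: 10% off


def annual_renewal_total(list_price, years):
    """Total paid when the subscription is renewed one year at a time."""
    return sum(list_price * renewal_percent(y) // 100 for y in range(1, years + 1))


def prepaid_total(list_price, years):
    """Total paid when all years are bought up front (15% off from three years)."""
    percent = 85 if years >= 3 else 100
    return list_price * years * percent // 100


# Professional plan (150,000 per year) over three years.
print(annual_renewal_total(150_000, 3))  # 150,000 + 142,500 + 135,000 = 427,500
print(prepaid_total(150_000, 3))         # 450,000 * 0.85 = 382,500
```

Under these assumptions, paying three years up front costs less than renewing year by year, because the 15% discount applies to every year, including the first.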

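After an annual subscription contract ends, you may choose not to renew. Pigsty will then no longer provide software updates, technical support, or consulting, but you may keep using the already-installed version of the Pigsty Professional software. If you subscribed to Pigsty professional services and chose not to renew, you do not need to pay for the lapsed period when you re-subscribe, but all discounts and offers are reset.

Pigsty's pricing is designed to give users good value: you immediately get a top DBA's database architecture plan and management best practices, backed by their consulting, Q&A, and support, at a cost that is highly competitive with hiring a full-time database expert or using a cloud database.

For reference, here is market pricing for enterprise-grade professional database services:

AWS RDS for PostgreSQL, high availability: ¥1,160 ~ ¥1,582 / (vCPU·month), about ¥14K ~ ¥19K / year per vCPU
Alibaba Cloud RDS for PostgreSQL, high availability: ¥270 ~ ¥432 / (vCPU·month), about ¥3K ~ ¥5K / year per vCPU
EDB PostgreSQL cloud database, enterprise edition: $183.3 / (vCPU·month), about ¥16K / year per vCPU
Fujitsu Enterprise PostgreSQL on Kubernetes: $3,200 / (core·year), about ¥12K / year per vCPU
Oracle annual service fee: (Enterprise $47,500 + RAC $23,000) * 22% per year, about ¥28K / year per vCPU

A fair price for decent professional database service is ¥10,000 to ¥20,000 per year, billed per vCPU, that is, per CPU thread (1 Intel core = 2 vCPU threads). Pigsty provides top-tier PostgreSQL expert service in China and bills per node, which on today's common high-core-count server nodes delivers an unmatched cost reduction.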
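
To make the per-node versus per-vCPU comparison concrete, here is a small worked calculation in Python. It is a sketch under stated assumptions: the 64-vCPU node size is a hypothetical example, the node quota is assumed to be fully used, and the helper names are made up for illustration. Note that the subscription figure covers service only, whereas the RDS reference prices also include the compute itself.

```python
def per_vcpu_year(plan_price, nodes, vcpus_per_node):
    """Subscription cost per vCPU per year when billing is per node."""
    return plan_price / (nodes * vcpus_per_node)


def monthly_to_yearly(price_per_vcpu_month):
    """Convert a per-vCPU monthly price to a per-vCPU yearly price."""
    return price_per_vcpu_month * 12


# Professional plan: 150,000 per year for up to 15 nodes.
# Hypothetical node size: 64 vCPU each, quota fully used.
pro = per_vcpu_year(150_000, nodes=15, vcpus_per_node=64)
print(pro)  # 156.25 per vCPU-year

# Reference points from the list above, converted to yearly figures.
print(monthly_to_yearly(1_160))  # 13920, the low end of the AWS RDS range
print(monthly_to_yearly(270))    # 3240, the low end of the Alibaba Cloud RDS range
```

The per-vCPU figure falls as nodes get larger, which is the effect the per-node billing model relies on.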

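On-Demand Expert Services

Beyond subscriptions, Pigsty also offers on-demand Pigsty x PostgreSQL expert services: consultations with top database experts in the industry.

Expert advisor plan: price negotiable

Support for a full year as your expert advisor, including unlimited Q&A and multiple complex case engagements. The price depends on your needs, ranging from ¥100,000 to several hundred thousand yuan per year.

Expert man-day: ¥30,000 / person·day

On-site support from a top industry expert, suitable for extended engagements such as architecture consulting, failure analysis, troubleshooting, database health checks, monitoring interpretation, migration assessment, training, and advice on moving to or off the cloud.

Expert consultation: ¥3,000 / case

Ask about anything you want to understand regarding Pigsty, PostgreSQL, databases, or cloud computing. A veteran database and cloud practitioner shares top-level industry insight and judgment with you, or a leading expert helps you self-host applications such as Supabase, Odoo, Dify, and GitLab. Sessions are in principle limited to one hour.

Expert quick answer: ¥200 / question

A quick diagnosis and answer to one question about PostgreSQL / Pigsty / databases, limited to 5 minutes.

Contact

Please email rh@vonng.com. Users in mainland China are welcome to add the WeChat ID RuohangFeng.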

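3 - Concepts

Learn the key concepts of Pigsty: overall architecture, logical model, infrastructure, and the principles behind database HA, PITR, and service access.

3.1 - Architecture

Pigsty's modular architecture: compose modules declaratively

Pigsty uses a modular architecture and a declarative interface.

Pigsty describes the entire deployment environment in a configuration inventory and realizes it through ansible playbooks.
Pigsty runs on any node, whether bare metal or virtual machine, as long as the operating system is compatible.
Pigsty's behavior is controlled by configuration parameters; idempotent playbooks bring nodes to the state the configuration describes.
Pigsty is modular, and modules can be freely combined to suit different scenarios. Playbooks install modules onto the nodes specified in the configuration.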
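
The idea of a declarative configuration applied by idempotent playbooks can be illustrated with a toy reconcile function. This is a conceptual sketch in Python, not Pigsty or Ansible code: the `reconcile` function and the state keys (`timezone`, `ntp_enabled`) are hypothetical.

```python
def reconcile(desired, actual):
    """Move `actual` toward `desired` in place and return the changes made."""
    changes = []
    for key, want in desired.items():
        have = actual.get(key)
        if have != want:
            changes.append(f"{key}: {have!r} -> {want!r}")
            actual[key] = want
    return changes


# The configuration declares the target state; the node starts somewhere else.
desired = {"timezone": "Asia/Hong_Kong", "ntp_enabled": True}
node = {"timezone": "UTC"}

first_run = reconcile(desired, node)   # two changes are applied
second_run = reconcile(desired, node)  # nothing left to change

print(first_run)
print(second_run)  # []
```

Running the same step twice leaves the node unchanged the second time, which is what makes it safe to re-run a playbook against nodes that are already partly configured.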

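Modules

Pigsty is modular, with six main default modules: PGSQL, INFRA, NODE, ETCD, REDIS, and MINIO.

PGSQL: autonomous highly available Postgres clusters powered by Patroni, Pgbouncer, HAproxy, PgBackrest, and others.
INFRA: local software repository, Prometheus, Grafana, Loki, AlertManager, PushGateway, Blackbox Exporter…
NODE: brings nodes to the desired state: name, time zone, NTP, ssh, sudo, haproxy, docker, promtail, keepalived.
ETCD: distributed key-value store, used as the DCS for highly available Postgres clusters: consensus leader election, configuration management, service discovery.
REDIS: Redis servers in standalone primary-replica, sentinel, or cluster mode, with full monitoring support.
MINIO: a simple S3-compatible object storage server, usable as an optional destination for PG database backups.

You can combine them freely and declaratively. If you only want host monitoring, installing the INFRA module on infrastructure nodes and the NODE module on managed nodes is enough. The ETCD and PGSQL modules are used to build highly available PG clusters: installing them on multiple nodes automatically forms a highly available database cluster. You can reuse Pigsty's infrastructure to develop your own modules; REDIS and MINIO can serve as examples. More modules will follow; for example, preliminary support for Mongo and MySQL is already on the roadmap.

Note that every module has a hard dependency on the NODE module: in Pigsty, a node must first have the NODE module installed and be managed by Pigsty before other modules can be deployed on it. When nodes install from the local software repository (the default), the NODE module has a soft dependency on the INFRA module. The admin/infrastructure node that hosts the INFRA module therefore completes a bootstrap process in the [install.yml] playbook to resolve this circular dependency.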

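Singleton Installation

By default, Pigsty installs on a single node (physical or virtual machine). The install.yml playbook installs the INFRA, ETCD, PGSQL, and optionally MINIO modules on the current node. This gives you a full-featured observability stack (Prometheus, Grafana, Loki, AlertManager, PushGateway, BlackboxExporter, etc.) and a built-in standalone PostgreSQL instance that serves as the CMDB and is also ready to use out of the box (cluster name pg-meta, database name meta).

The node now has a complete self-monitoring system, a visualization toolkit, and a Postgres database with PITR configured automatically (HA is unavailable because you only have one node). You can use this node as a devbox, for testing, for running demos, and for data visualization and analysis. You can also use it as an admin node to deploy and manage more nodes.

Monitoring

The installed singleton meta node can serve as an admin node and monitoring center, bringing more nodes and database servers under its monitoring and control.

Pigsty's monitoring system can be used on its own. If you want to install the Prometheus / Grafana observability stack, Pigsty gives you a best-practice setup. It ships rich dashboards for host nodes and PostgreSQL databases. Whether or not these nodes or PostgreSQL servers are managed by Pigsty, with a little configuration you get a production-grade monitoring and alerting system and can bring existing hosts and PostgreSQL instances under monitoring.

HA PG Cluster

Pigsty helps you own your production-grade highly available PostgreSQL RDS service anywhere.

To create such a highly available PostgreSQL cluster / RDS service, describe it with a short config and run the playbook to create it: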

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }
  vars: { pg_cluster: pg-test }

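$ bin/pgsql-add pg-test  # init cluster 'pg-test'

In less than 10 minutes you will have a PostgreSQL database cluster with service access, monitoring, backup PITR, and HA fully configured.

Hardware failures are covered by the self-healing HA architecture provided by patroni, etcd, and haproxy. If the primary fails, automatic failover is performed within 30 seconds by default. Clients do not need to change configuration or restart applications: HAProxy uses patroni health checks to distribute traffic, so read-write requests are automatically routed to the new cluster primary while split-brain is avoided. The process is smooth: for example, on a replica failure or a planned switchover, clients only see a momentary interruption of in-flight queries.

Software failures, human errors, and datacenter-level disasters are covered by pgbackrest and an optional MinIO cluster. This gives you local/cloud PITR capability, plus cross-region replication and off-site disaster recovery in case a datacenter fails.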

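3.2 - Cluster Model

How Pigsty abstracts different kinds of functionality into modules, and the logical model of those modules.

In Pigsty, functional modules are organized as "clusters". Each cluster is an Ansible group that contains a number of node resources and defines instances.

PGSQL module overview: key concepts and architecture details

In production, the PGSQL module is organized as clusters: logical entities made up of a group of database instances associated by primary-replica relationships. Each database cluster is an autonomous business service unit consisting of at least one database (primary) instance.

Entity Diagram

Let's start with the ER diagram. Pigsty's PGSQL module has four core entities:

Cluster: an autonomous PostgreSQL business unit, used as the top-level namespace for other entities.
Service: a named abstraction of cluster capability that routes traffic and exposes postgres services using node ports.
Instance: a single postgres server, consisting of running processes and database files on a single node.
Node: an abstraction of hardware resources, which can be bare metal, a virtual machine, or even k8s pods.

Naming Conventions

Cluster names should be valid DNS names without any dots, matching the regex: [a-zA-Z0-9-]+
Service names should be prefixed with the cluster name and suffixed with a specific word: primary, replica, offline, delayed, joined by -.
Instance names are prefixed with the cluster name and suffixed with a positive integer instance number, joined by -, e.g. ${cluster}-${seq}.
Nodes are identified by their primary intranet IP address. Because databases and hosts are deployed 1:1 in the PGSQL module, the hostname is usually the same as the instance name.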
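
The naming conventions above can be sketched in a few lines of Python. This is a minimal illustration, not Pigsty's own code: the helper names `is_valid_cluster_name`, `instance_name`, and `service_name` are hypothetical, and only the regex and the `-` joining rule come from the text.

```python
import re

# Regex quoted in the naming conventions; fullmatch rejects dots and other characters.
CLUSTER_NAME = re.compile(r"[a-zA-Z0-9-]+")


def is_valid_cluster_name(name: str) -> bool:
    """Return True if the whole name matches the cluster naming regex."""
    return CLUSTER_NAME.fullmatch(name) is not None


def instance_name(cluster: str, seq: int) -> str:
    """Instance name: cluster name, '-', positive integer instance number."""
    if seq < 1:
        raise ValueError("the instance number must be a positive integer")
    return f"{cluster}-{seq}"


def service_name(cluster: str, suffix: str) -> str:
    """Service name: cluster name, '-', service suffix such as primary or replica."""
    return f"{cluster}-{suffix}"


print(is_valid_cluster_name("pg-test"))     # True
print(is_valid_cluster_name("pg.test"))     # False, dots are not allowed
print(instance_name("pg-test", 1))          # pg-test-1
print(service_name("pg-test", "primary"))   # pg-test-primary
```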

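Identity Parameters

Pigsty uses identity parameters to identify entities: PG_ID.

Besides the node IP address, the three parameters pg_cluster, pg_role, and pg_seq are the minimum set required to define a postgres cluster. Take the sandbox test cluster pg-test as an example: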

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }
  vars:
    pg_cluster: pg-test

The three members of the cluster are listed below:

Cluster	Seq	Role	Host / IP	Instance	Service	Node name
`pg-test`	`1`	`primary`	`10.10.10.11`	`pg-test-1`	`pg-test-primary`	`pg-test-1`
`pg-test`	`2`	`replica`	`10.10.10.12`	`pg-test-2`	`pg-test-replica`	`pg-test-2`
`pg-test`	`3`	`replica`	`10.10.10.13`	`pg-test-3`	`pg-test-replica`	`pg-test-3`

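This includes:

One cluster: the cluster is named pg-test.
Two roles: primary and replica.
Three instances: the cluster consists of three instances: pg-test-1, pg-test-2, pg-test-3.
Three nodes: the cluster is deployed on three nodes: 10.10.10.11, 10.10.10.12, and 10.10.10.13.
Four services:
- Read-write service: pg-test-primary
- Read-only service: pg-test-replica
- Directly connected admin service: pg-test-default
- Offline read service: pg-test-offline

In the monitoring system (Prometheus/Grafana/Loki), the corresponding metrics are labeled with these identity parameters: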

pg_up{cls="pg-meta", ins="pg-meta-1", ip="10.10.10.10", job="pgsql"}
pg_up{cls="pg-test", ins="pg-test-1", ip="10.10.10.11", job="pgsql"}
pg_up{cls="pg-test", ins="pg-test-2", ip="10.10.10.12", job="pgsql"}
pg_up{cls="pg-test", ins="pg-test-3", ip="10.10.10.13", job="pgsql"}
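
As a rough sketch of how such label sets follow from the identity parameters, the Python snippet below derives them from an inventory-shaped dictionary. The function names `identity_labels` and `render_metric` are hypothetical, and the snippet only mirrors the label layout shown above; it is not how Pigsty or Prometheus actually generates these series.

```python
def identity_labels(cluster: str, hosts: dict) -> list:
    """Build one label set per instance from pg_cluster, pg_seq, and the node IP."""
    labels = []
    for ip, params in hosts.items():
        labels.append({
            "cls": cluster,                          # cluster name
            "ins": f"{cluster}-{params['pg_seq']}",  # instance name
            "ip": ip,                                # node identity
            "job": "pgsql",
        })
    return labels


def render_metric(metric: str, labels: dict) -> str:
    """Render a metric name with its label set in exposition-like syntax."""
    body = ", ".join(f'{key}="{value}"' for key, value in labels.items())
    return f"{metric}{{{body}}}"


hosts = {
    "10.10.10.11": {"pg_seq": 1, "pg_role": "primary"},
    "10.10.10.12": {"pg_seq": 2, "pg_role": "replica"},
    "10.10.10.13": {"pg_seq": 3, "pg_role": "replica"},
}
for label_set in identity_labels("pg-test", hosts):
    print(render_metric("pg_up", label_set))
```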

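3.3 - Monitoring System

How Pigsty's monitoring system is architected and implemented, and how monitored targets are automatically brought under management.

3.4 - Local CA

Pigsty ships with a self-signed CA public key infrastructure for issuing SSL certificates and encrypting network traffic.

Pigsty deployments enable some security best practices by default: SSL encrypts network traffic and HTTPS encrypts the web interface.

To make this possible, Pigsty has a built-in local self-signed CA for issuing SSL certificates and encrypting network traffic.

By default, SSL and HTTPS are enabled but not enforced. For environments with higher security requirements, you can enforce SSL and HTTPS.

Local CA

During initialization, Pigsty by default generates a self-signed CA in the Pigsty source directory (~/pigsty) on the ADMIN node. You can use this CA when you need SSL, HTTPS, digital signatures, database client certificates, or other advanced security features.

Every Pigsty deployment therefore uses a unique CA, and the CAs of different Pigsty deployments do not trust each other.

The local CA consists of two files, placed in the files/pki/ca directory by default:

ca.crt: the self-signed CA root certificate; it should be distributed to and installed on all managed nodes for certificate verification.
ca.key: the CA private key, used to issue certificates and prove the CA's identity; keep it safe and never leak it!

Protect the CA private key file

Keep the CA private key file safe: do not lose it and do not leak it. We recommend making an encrypted backup of this file after finishing the Pigsty installation.

Using an Existing CA

If you already have your own CA public key infrastructure, Pigsty can be configured to use the existing CA.

Just place your CA public key and private key files in the files/pki/ca directory.

files/pki/ca/ca.key     # the core CA private key file; must exist, otherwise a new one is randomly generated by default
files/pki/ca/ca.crt     # if the certificate file is missing, Pigsty regenerates the root certificate from the CA private key

When Pigsty runs the install.yml and infra.yml playbooks, it uses the existing CA if it finds the ca.key private key file in the files/pki/ca directory. The ca.crt file can be derived from the ca.key private key, so if the certificate file is missing, Pigsty regenerates the root certificate from the CA private key automatically.

Note when using an existing CA

You can set the ca_method parameter to copy, so that Pigsty aborts with an error when it cannot find the local CA, instead of generating a new self-signed CA on its own.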
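
The decision logic described above can be summarized as a small Python function. This is a sketch of the documented behavior only, not the actual playbook code: the function name `plan_ca` and the returned strings are hypothetical, and `create` as the name of the default ca_method value is an assumption (the text only names `copy`).

```python
def plan_ca(has_key: bool, has_crt: bool, ca_method: str = "create") -> str:
    """Decide what to do with the local CA found in files/pki/ca.

    has_key / has_crt say whether ca.key / ca.crt exist.
    ca_method "copy" means: never generate a new CA, fail instead.
    The default value name "create" is an assumption for illustration.
    """
    if has_key:
        if has_crt:
            return "use existing CA"
        # ca.crt can be derived from ca.key, so only the certificate is rebuilt.
        return "regenerate ca.crt from ca.key"
    if ca_method == "copy":
        raise FileNotFoundError("files/pki/ca/ca.key not found and ca_method is copy")
    return "generate new self-signed CA"


print(plan_ca(has_key=True, has_crt=True))     # use existing CA
print(plan_ca(has_key=True, has_crt=False))    # regenerate ca.crt from ca.key
print(plan_ca(has_key=False, has_crt=False))   # generate new self-signed CA
```

With `ca_method="copy"` and no ca.key present, the function raises instead of silently creating a fresh CA, which is the safeguard the note above recommends when you bring your own CA.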

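Trusting the CA

During Pigsty installation, ca.crt is distributed to /etc/pki/ca.crt on all nodes by the node_ca task of the node.yml playbook.

EL-family and Debian-family operating systems trust CA root certificates from different default paths, so the distribution path and update method differ as well.

# EL family
rm -rf /etc/pki/ca-trust/source/anchors/ca.crt
ln -s /etc/pki/ca.crt /etc/pki/ca-trust/source/anchors/ca.crt
/bin/update-ca-trust

# Debian family
rm -rf /usr/local/share/ca-certificates/ca.crt
ln -s /etc/pki/ca.crt /usr/local/share/ca-certificates/ca.crt
/usr/sbin/update-ca-certificates

By default, Pigsty issues HTTPS certificates for the domain names used by the web systems on infrastructure nodes, so you can access Pigsty's web systems over HTTPS. If you do not want the browser on your client computer to show an "untrusted CA certificate" warning, distribute ca.crt to the trusted certificate directory on the client computer.

You can double-click the ca.crt file to add it to the system keychain. On macOS, for example, open "Keychain Access", search for pigsty-ca, and then "trust" this root certificate.

Viewing Certificate Contents

Use the following command to view the contents of the Pigsty CA certificate: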

openssl x509 -text -in /etc/pki/ca.crt

Sample local CA root certificate contents

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            50:29:e3:60:96:93:f4:85:14:fe:44:81:73:b5:e1:09:2a:a8:5c:0a
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: O=pigsty, OU=ca, CN=pigsty-ca
        Validity
            Not Before: Feb  7 00:56:27 2023 GMT
            Not After : Jan 14 00:56:27 2123 GMT
        Subject: O=pigsty, OU=ca, CN=pigsty-ca
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (4096 bit)
                Modulus:
                    00:c1:41:74:4f:28:c3:3c:2b:13:a2:37:05:87:31:
                    ....
                    e6:bd:69:a5:5b:e3:b4:c0:65:09:6e:84:14:e9:eb:
                    90:f7:61
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Subject Alternative Name: 
                DNS:pigsty-ca
            X509v3 Key Usage: 
                Digital Signature, Certificate Sign, CRL Sign
            X509v3 Basic Constraints: critical
                CA:TRUE, pathlen:1
            X509v3 Subject Key Identifier: 
                C5:F6:23:CE:BA:F3:96:F6:4B:48:A5:B1:CD:D4:FA:2B:BD:6F:A6:9C
    Signature Algorithm: sha256WithRSAEncryption
    Signature Value:
        89:9d:21:35:59:6b:2c:9b:c7:6d:26:5b:a9:49:80:93:81:18:
        ....
        9e:dd:87:88:0d:c4:29:9e
-----BEGIN CERTIFICATE-----
...
cXyWAYcvfPae3YeIDcQpng==
-----END CERTIFICATE-----

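Issuing Certificates

If you want to use client certificate authentication, you can use the local CA and the cert.yml playbook to issue PostgreSQL client certificates manually.

Just set the certificate's CN field to the database username: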

./cert.yml -e cn=dbuser_dba
./cert.yml -e cn=dbuser_monitor

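Issued certificates are generated under files/pki/misc/<cn>.{key,crt} by default.

3.5 - Infrastructure as Code

Infrastructure as Code, or IaC for short. Pigsty uses IaC to manage every component of the whole deployment environment.

Pigsty follows the IaC and GitOps philosophy: a Pigsty deployment is described by a declarative config inventory and implemented by idempotent playbooks.

Users describe the desired state declaratively through config parameters, and playbooks idempotently adjust the target nodes to reach that state. This is similar to Kubernetes CRDs & Operators; Pigsty implements the same capability on bare metal and virtual machines.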
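
To make "declarative" and "idempotent" concrete, here is a minimal Python sketch of a reconcile step. It is an illustration of the idea only, not Pigsty's or Ansible's implementation; the function `reconcile` and the sample keys `timezone` and `ntp` are hypothetical.

```python
def reconcile(actual: dict, desired: dict) -> list:
    """Move `actual` toward `desired` and return the list of changes applied.

    Running it again with the same desired state applies no changes,
    which is what idempotent means here.
    """
    changes = []
    for key, want in desired.items():
        have = actual.get(key)
        if have != want:
            changes.append(f"{key}: {have!r} -> {want!r}")
            actual[key] = want
    return changes


node = {"timezone": "UTC"}
desired = {"timezone": "Asia/Hong_Kong", "ntp": "enabled"}

first_run = reconcile(node, desired)
second_run = reconcile(node, desired)
print(first_run)    # two changes on the first run
print(second_run)   # [] on the second run, nothing left to do
```

The user only edits `desired`; how many times the reconcile step runs does not matter, because the end state is the same.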

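Declaring Modules

Take the default config snippet below as an example. It describes a node 10.10.10.10 with the INFRA, NODE, ETCD, MINIO, and PGSQL modules installed.

# infra cluster for monitoring, alerting, DNS, NTP, and other infrastructure...
infra: { hosts: { 10.10.10.10: { infra_seq: 1 } } }

# minio cluster, s3-compatible object storage
minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

# etcd cluster, used as the DCS required by PostgreSQL HA
etcd: { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }

# PGSQL sample cluster: pg-meta
pg-meta: { hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }, vars: { pg_cluster: pg-meta } }

To actually install these modules, run the following playbooks:

./infra.yml -l 10.10.10.10  # init the infra module on node 10.10.10.10
./etcd.yml  -l 10.10.10.10  # init the etcd module on node 10.10.10.10
./minio.yml -l 10.10.10.10  # init the minio module on node 10.10.10.10
./pgsql.yml -l 10.10.10.10  # init the pgsql module on node 10.10.10.10

Declaring Clusters

You can declare a PostgreSQL database cluster by installing the PGSQL module on multiple nodes and making them one service unit:

For example, to deploy a three-node highly available PostgreSQL cluster built with streaming replication on the following three nodes already managed by Pigsty, add the following definition to all.children in the pigsty.yml config file: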

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: offline }
  vars:  { pg_cluster: pg-test }

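Once defined, create the cluster with the playbook:

bin/pgsql-add pg-test   # create the pg-test cluster

You can use different instance roles, such as primary, replica, offline replica (offline), delayed replica (delayed), and sync standby; and different kinds of clusters, such as standby clusters, Citus clusters, and even Redis / MinIO / Etcd clusters.

Customizing Cluster Contents

Beyond defining clusters declaratively, you can also define the databases, users, services, HBA rules, and other contents of a cluster. For example, the config below deeply customizes the contents of the default single-node pg-meta database cluster:

It declares six business databases and seven business users, adds an extra standby service (sync standby, offering reads with no replication lag), and defines some extra pg_hba rules, an L2 VIP address pointing to the cluster primary, and a custom backup policy.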

pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary , pg_offline_query: true } }
  vars:
    pg_cluster: pg-meta
    pg_databases:                       # define business databases on this cluster, array of database definition
      - name: meta                      # REQUIRED, `name` is the only mandatory field of a database definition
        baseline: cmdb.sql              # optional, database sql baseline path, (relative path among ansible search path, e.g files/)
        pgbouncer: true                 # optional, add this database to pgbouncer database list? true by default
        schemas: [pigsty]               # optional, additional schemas to be created, array of schema names
        extensions:                     # optional, additional extensions to be installed: array of `{name[,schema]}`
          - { name: postgis , schema: public }
          - { name: timescaledb }
        comment: pigsty meta database   # optional, comment string for this database
        owner: postgres                # optional, database owner, postgres by default
        template: template1            # optional, which template to use, template1 by default
        encoding: UTF8                 # optional, database encoding, UTF8 by default. (MUST same as template database)
        locale: C                      # optional, database locale, C by default.  (MUST same as template database)
        lc_collate: C                  # optional, database collate, C by default. (MUST same as template database)
        lc_ctype: C                    # optional, database ctype, C by default.   (MUST same as template database)
        tablespace: pg_default         # optional, default tablespace, 'pg_default' by default.
        allowconn: true                # optional, allow connection, true by default. false will disable connect at all
        revokeconn: false              # optional, revoke public connection privilege. false by default. (leave connect with grant option to owner)
        register_datasource: true      # optional, register this database to grafana datasources? true by default
        connlimit: -1                  # optional, database connection limit, default -1 disable limit
        pool_auth_user: dbuser_meta    # optional, all connection to this pgbouncer database will be authenticated by this user
        pool_mode: transaction         # optional, pgbouncer pool mode at database level, default transaction
        pool_size: 64                  # optional, pgbouncer pool size at database level, default 64
        pool_size_reserve: 32          # optional, pgbouncer pool size reserve at database level, default 32
        pool_size_min: 0               # optional, pgbouncer pool size min at database level, default 0
        pool_max_db_conn: 100          # optional, max database connections at database level, default 100
      - { name: grafana  ,owner: dbuser_grafana  ,revokeconn: true ,comment: grafana primary database }
      - { name: bytebase ,owner: dbuser_bytebase ,revokeconn: true ,comment: bytebase primary database }
      - { name: kong     ,owner: dbuser_kong     ,revokeconn: true ,comment: kong the api gateway database }
      - { name: gitea    ,owner: dbuser_gitea    ,revokeconn: true ,comment: gitea meta database }
      - { name: wiki     ,owner: dbuser_wiki     ,revokeconn: true ,comment: wiki meta database }
    pg_users:                           # define business users/roles on this cluster, array of user definition
      - name: dbuser_meta               # REQUIRED, `name` is the only mandatory field of a user definition
        password: DBUser.Meta           # optional, password, can be a scram-sha-256 hash string or plain text
        login: true                     # optional, can log in, true by default  (new biz ROLE should be false)
        superuser: false                # optional, is superuser? false by default
        createdb: false                 # optional, can create database? false by default
        createrole: false               # optional, can create role? false by default
        inherit: true                   # optional, can this role use inherited privileges? true by default
        replication: false              # optional, can this role do replication? false by default
        bypassrls: false                # optional, can this role bypass row level security? false by default
        pgbouncer: true                 # optional, add this user to pgbouncer user-list? false by default (production user should be true explicitly)
        connlimit: -1                   # optional, user connection limit, default -1 disable limit
        expire_in: 3650                 # optional, now + n days when this role is expired (OVERWRITE expire_at)
        expire_at: '2030-12-31'         # optional, YYYY-MM-DD 'timestamp' when this role is expired  (OVERWRITTEN by expire_in)
        comment: pigsty admin user      # optional, comment string for this user/role
        roles: [dbrole_admin]           # optional, belonged roles. default roles are: dbrole_{admin,readonly,readwrite,offline}
        parameters: {}                  # optional, role level parameters with `ALTER ROLE SET`
        pool_mode: transaction          # optional, pgbouncer pool mode at user level, transaction by default
        pool_connlimit: -1              # optional, max database connections at user level, default -1 disable limit
      - {name: dbuser_view     ,password: DBUser.Viewer   ,pgbouncer: true ,roles: [dbrole_readonly], comment: read-only viewer for meta database}
      - {name: dbuser_grafana  ,password: DBUser.Grafana  ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for grafana database   }
      - {name: dbuser_bytebase ,password: DBUser.Bytebase ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for bytebase database  }
      - {name: dbuser_kong     ,password: DBUser.Kong     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for kong api gateway   }
      - {name: dbuser_gitea    ,password: DBUser.Gitea    ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for gitea service      }
      - {name: dbuser_wiki     ,password: DBUser.Wiki     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for wiki.js service    }
    pg_services:                        # extra services in addition to pg_default_services, array of service definition
      # standby service will route {ip|name}:5435 to sync replica's pgbouncer (5435->6432 standby)
      - name: standby                   # required, service name, the actual svc name will be prefixed with `pg_cluster`, e.g: pg-meta-standby
        port: 5435                      # required, service exposed port (work as kubernetes service node port mode)
        ip: "*"                         # optional, service bind ip address, `*` for all ip by default
        selector: "[]"                  # required, service member selector, use JMESPath to filter inventory
        dest: default                   # optional, destination port, default|postgres|pgbouncer|<port_number>, 'default' by default
        check: /sync                    # optional, health check url path, / by default
        backup: "[? pg_role == `primary`]"  # backup server selector
        maxconn: 3000                   # optional, max allowed front-end connection
        balance: roundrobin             # optional, haproxy load balance algorithm (roundrobin by default, other: leastconn)
        options: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'
    pg_hba_rules:
      - {user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes'}
    pg_vip_enabled: true
    pg_vip_address: 10.10.10.2/24
    pg_vip_interface: eth1
    node_crontab:  # make a full backup 1 am everyday
      - '00 01 * * * postgres /pg/bin/pg-backup full'

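Declaring Access Control

You can also deeply customize Pigsty's access control through declarative config. For example, the config below applies deep security customization to the pg-meta cluster:

It uses the three-node critical cluster template crit.yml, which puts data consistency first and guarantees zero data loss on failover. It enables an L2 VIP and restricts the listen addresses of the database and connection pool to three specific addresses: local loopback IP + intranet IP + VIP. The template enforces Patroni's SSL API and Pgbouncer SSL, and the HBA rules require SSL for access to the database cluster. It also enables the $libdir/passwordcheck extension in pg_libs to enforce a password strength policy.

Finally, a separate pg-meta-delay cluster is declared as a delayed replica that mirrors pg-meta as of one hour ago, for emergency recovery from accidental data deletion.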

pg-meta:      # 3 instance postgres cluster `pg-meta`
  hosts:
    10.10.10.10: { pg_seq: 1, pg_role: primary }
    10.10.10.11: { pg_seq: 2, pg_role: replica }
    10.10.10.12: { pg_seq: 3, pg_role: replica , pg_offline_query: true }
  vars:
    pg_cluster: pg-meta
    pg_conf: crit.yml
    pg_users:
      - { name: dbuser_meta , password: DBUser.Meta   , pgbouncer: true , roles: [ dbrole_admin ] , comment: pigsty admin user }
      - { name: dbuser_view , password: DBUser.Viewer , pgbouncer: true , roles: [ dbrole_readonly ] , comment: read-only viewer for meta database }
    pg_databases:
      - {name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty] ,extensions: [{name: postgis, schema: public}, {name: timescaledb}]}
    pg_default_service_dest: postgres
    pg_services:
      - { name: standby ,src_ip: "*" ,port: 5435 , dest: default ,selector: "[]" , backup: "[? pg_role == `primary`]" }
    pg_vip_enabled: true
    pg_vip_address: 10.10.10.2/24
    pg_vip_interface: eth1
    pg_listen: '${ip},${vip},${lo}'
    patroni_ssl_enabled: true
    pgbouncer_sslmode: require
    pgbackrest_method: minio
    pg_libs: 'timescaledb, $libdir/passwordcheck, pg_stat_statements, auto_explain' # add passwordcheck extension to enforce strong password
    pg_default_roles:                 # default roles and users in postgres cluster
      - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access     }
      - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
      - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly]               ,comment: role for global read-write access }
      - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite]  ,comment: role for object creation }
      - { name: postgres     ,superuser: true  ,expire_in: 7300                        ,comment: system superuser }
      - { name: replicator ,replication: true  ,expire_in: 7300 ,roles: [pg_monitor, dbrole_readonly]   ,comment: system replicator }
      - { name: dbuser_dba   ,superuser: true  ,expire_in: 7300 ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 , comment: pgsql admin user }
      - { name: dbuser_monitor ,roles: [pg_monitor] ,expire_in: 7300 ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }
    pg_default_hba_rules:             # postgres host-based auth rules by default
      - {user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'  }
      - {user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
      - {user: '${repl}'    ,db: replication ,addr: localhost ,auth: ssl   ,title: 'replicator replication from localhost'}
      - {user: '${repl}'    ,db: replication ,addr: intra     ,auth: ssl   ,title: 'replicator replication from intranet' }
      - {user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: ssl   ,title: 'replicator postgres db from intranet' }
      - {user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
      - {user: '${monitor}' ,db: all         ,addr: infra     ,auth: ssl   ,title: 'monitor from infra host with password'}
      - {user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'   }
      - {user: '${admin}'   ,db: all         ,addr: world     ,auth: cert  ,title: 'admin @ everywhere with ssl & cert'   }
      - {user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: ssl   ,title: 'pgbouncer read/write via local socket'}
      - {user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: ssl   ,title: 'read/write biz user via password'     }
      - {user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: ssl   ,title: 'allow etl offline tasks from intranet'}
    pgb_default_hba_rules:            # pgbouncer host-based authentication rules
      - {user: '${dbsu}'    ,db: pgbouncer   ,addr: local     ,auth: peer  ,title: 'dbsu local admin access with os ident'}
      - {user: 'all'        ,db: all         ,addr: localhost ,auth: pwd   ,title: 'allow all user local access with pwd' }
      - {user: '${monitor}' ,db: pgbouncer   ,addr: intra     ,auth: ssl   ,title: 'monitor access via intranet with pwd' }
      - {user: '${monitor}' ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other monitor access addr' }
      - {user: '${admin}'   ,db: all         ,addr: intra     ,auth: ssl   ,title: 'admin access via intranet with pwd'   }
      - {user: '${admin}'   ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other admin access addr'   }
      - {user: 'all'        ,db: all         ,addr: intra     ,auth: ssl   ,title: 'allow all user intra access with pwd' }

# OPTIONAL delayed cluster for pg-meta
pg-meta-delay:                    # delayed instance for pg-meta (1 hour ago)
  hosts: { 10.10.10.13: { pg_seq: 1, pg_role: primary, pg_upstream: 10.10.10.10, pg_delay: 1h } }
  vars: { pg_cluster: pg-meta-delay }

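Citus Distributed Cluster

Below is the declarative config for a Citus distributed cluster with four Citus nodes (one coordinator and three data nodes):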

all:
  children:
    pg-citus0: # citus coordinator, pg_group = 0
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus0 , pg_group: 0 }
    pg-citus1: # citus data node 1
      hosts: { 10.10.10.11: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus1 , pg_group: 1 }
    pg-citus2: # citus data node 2
      hosts: { 10.10.10.12: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus2 , pg_group: 2 }
    pg-citus3: # citus data node 3, with an extra replica
      hosts:
        10.10.10.13: { pg_seq: 1, pg_role: primary }
        10.10.10.14: { pg_seq: 2, pg_role: replica }
      vars: { pg_cluster: pg-citus3 , pg_group: 3 }
  vars:                               # global parameters for all citus clusters
    pg_mode: citus                    # pgsql cluster mode: citus
    pg_shard: pg-citus                # citus shard name: pg-citus
    patroni_citus_db: meta            # citus distributed database name
    pg_dbsu_password: DBUser.Postgres # all dbsu password access for citus cluster
    pg_users: [ { name: dbuser_meta ,password: DBUser.Meta ,pgbouncer: true ,roles: [ dbrole_admin ] } ]
    pg_databases: [ { name: meta ,extensions: [ { name: citus }, { name: postgis }, { name: timescaledb } ] } ]
    pg_hba_rules:
      - { user: 'all' ,db: all  ,addr: 127.0.0.1/32 ,auth: ssl ,title: 'all user ssl access from localhost' }
      - { user: 'all' ,db: all  ,addr: intra        ,auth: ssl ,title: 'all user ssl access from intranet'  }

Redis Clusters

Below are sample declarative configs for a Redis primary-replica cluster, a sentinel cluster, and a Redis Cluster:

redis-ms: # redis classic primary & replica
  hosts: { 10.10.10.10: { redis_node: 1 , redis_instances: { 6379: { }, 6380: { replica_of: '10.10.10.10 6379' } } } }
  vars: { redis_cluster: redis-ms ,redis_password: 'redis.ms' ,redis_max_memory: 64MB }

redis-meta: # redis sentinel x 3
  hosts: { 10.10.10.11: { redis_node: 1 , redis_instances: { 26379: { } ,26380: { } ,26381: { } } } }
  vars:
    redis_cluster: redis-meta
    redis_password: 'redis.meta'
    redis_mode: sentinel
    redis_max_memory: 16MB
    redis_sentinel_monitor: # primary list for redis sentinel, use cls as name, primary ip:port
      - { name: redis-ms, host: 10.10.10.10, port: 6379 ,password: redis.ms, quorum: 2 }

redis-test: # redis native cluster: 3m x 3s
  hosts:
    10.10.10.12: { redis_node: 1 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
    10.10.10.13: { redis_node: 2 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
  vars: { redis_cluster: redis-test ,redis_password: 'redis.test' ,redis_mode: cluster, redis_max_memory: 32MB }

ETCD Cluster

Below is a sample declarative config for a three-node Etcd cluster:

etcd: # dcs service for postgres/patroni ha consensus
  hosts:  # 1 node for testing, 3 or 5 for production
    10.10.10.10: { etcd_seq: 1 }  # etcd_seq required
    10.10.10.11: { etcd_seq: 2 }  # assign from 1 ~ n
    10.10.10.12: { etcd_seq: 3 }  # odd number please
  vars: # cluster level parameter override roles/etcd
    etcd_cluster: etcd  # mark etcd cluster name etcd
    etcd_safeguard: false # safeguard against purging
    etcd_clean: true # purge etcd during init process

MinIO Cluster

Below is a sample declarative config for a three-node MinIO cluster:

minio:
  hosts:
    10.10.10.10: { minio_seq: 1 }
    10.10.10.11: { minio_seq: 2 }
    10.10.10.12: { minio_seq: 3 }
  vars:
    minio_cluster: minio
    minio_data: '/data{1...2}'          # use two disks per node
    minio_node: '${minio_cluster}-${minio_seq}.pigsty' # node name pattern
    haproxy_services:
      - name: minio                     # [REQUIRED] service name, must be unique
        port: 9002                      # [REQUIRED] service port, must be unique
        options:
          - option httpchk
          - option http-keep-alive
          - http-check send meth OPTIONS uri /minio/health/live
          - http-check expect status 200
        servers:
          - { name: minio-1 ,ip: 10.10.10.10 , port: 9000 , options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-2 ,ip: 10.10.10.11 , port: 9000 , options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-3 ,ip: 10.10.10.12 , port: 9000 , options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }

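3.6 - Database High Availability

Pigsty uses Patroni to implement PostgreSQL high availability, ensuring automatic failover to a replica when the primary becomes unavailable.

Overview

Pigsty's PostgreSQL clusters come with an out-of-the-box HA solution powered by Patroni, Etcd, and HAProxy.

When your PostgreSQL cluster has two or more instances, you get self-healing from hardware failures with no extra configuration: as long as any instance in the cluster survives, the cluster can provide full service, and clients that connect to any node in the cluster get full service without caring about primary-replica topology changes.

With the default config, primary failure has a recovery time objective RTO ≈ 30s and a recovery point objective RPO < 1MB; replica failure has RPO = 0 and RTO ≈ 0 (a brief interruption). In consistency-first mode, failover is guaranteed to lose no data: RPO = 0. All of these can be tuned through parameters according to your actual hardware and reliability requirements.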

pigsty-ha

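Many large organizations and core institutions have run Pigsty in production for a long time. The largest deployment has 25K CPU cores and 220+ very large PostgreSQL instances (64c / 512g / 3TB NVMe SSD). Over five years this deployment went through dozens of hardware failures and various incidents while still maintaining overall availability above 99.999%.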
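
To put the 99.999% figure in perspective, the short Python calculation below converts an availability target into a yearly downtime budget and compares it with the default 30-second RTO. It is back-of-the-envelope arithmetic under two simplifying assumptions that are not from the source: a 365-day year, and every primary failover costing the full 30 seconds of unavailability.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a 365-day year


def downtime_budget_seconds(availability: float) -> float:
    """Seconds of allowed unavailability per year for a given availability."""
    return SECONDS_PER_YEAR * (1.0 - availability)


budget = downtime_budget_seconds(0.99999)
failovers = int(budget // 30)  # how many 30-second failovers fit in the budget

print(round(budget, 1))   # about 315.4 seconds, a little over 5 minutes per year
print(failovers)          # 10
```

In other words, at five nines a service can absorb roughly ten full-length automatic failovers per year, while a single manual recovery measured in tens of minutes would already exceed the budget.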

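What problems does high availability solve?

It raises availability, the A in data security's C/I/A, to a new level: RPO ≈ 0, RTO < 30s.
It enables seamless rolling maintenance and minimizes the need for maintenance windows.
Hardware failures self-heal immediately without human intervention, so operations DBAs can sleep well.
Replicas can serve read-only requests, offloading the primary and making full use of resources.

What does high availability cost?

Infrastructure dependency: HA relies on a DCS (etcd/zk/consul) for consensus.
Higher starting bar: a meaningful HA deployment needs at least three nodes.
Extra resource consumption: each new replica consumes another set of resources, which is not a big problem.
Significantly higher complexity: backup costs grow significantly, and tooling is needed to keep complexity in check.

Limitations of high availability

Because replication happens in real time, every change is applied to replicas immediately. HA based on streaming replication therefore cannot handle accidental deletion or modification caused by human error or software defects (for example DROP TABLE or DELETE). Such failures require a delayed cluster, or point-in-time recovery from an earlier base backup and the WAL archive.

Config strategy	RTO	RPO
Standalone + nothing	Data permanently lost, unrecoverable	All data lost
Standalone + base backup	Depends on backup size and bandwidth (hours)	Data since the last backup lost (hours to days)
Standalone + base backup + WAL archive	Depends on backup size and bandwidth (hours)	Last unarchived data lost (tens of MB)
Primary-replica + manual failover	Ten minutes	Data within replication lag lost (about 100 KB)
Primary-replica + automatic failover	Within one minute	Data within replication lag lost (about 100 KB)
Primary-replica + automatic failover + synchronous commit	Within one minute	No data loss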

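How It Works

In Pigsty, the HA architecture works as follows:

PostgreSQL uses standard streaming replication to build physical replicas; a replica takes over when the primary fails.
Patroni manages the PostgreSQL server processes and handles HA matters.
Etcd provides the distributed configuration store (DCS) and is used for leader election after a failure.
Patroni relies on Etcd to reach consensus on the cluster leader and exposes a health check API.
HAProxy exposes cluster services and uses the Patroni health check API to distribute traffic to healthy nodes automatically.
vip-manager provides an optional layer-2 VIP: it reads leader information from Etcd and binds the VIP to the node hosting the cluster primary.

When the primary fails, a new round of leader election is triggered. The healthiest replica in the cluster (the one with the highest LSN and therefore the least data loss) wins and is promoted to the new primary. Once the winner is promoted, read-write traffic is immediately routed to it. The impact of a primary failure is brief unavailability of the write service: from the primary failure until the new primary is promoted, write requests are blocked or fail outright. The outage typically lasts 15 to 30 seconds and usually does not exceed 1 minute.

When a replica fails, read-only traffic is routed to the other replicas; only if all replicas fail is read-only traffic finally served by the primary. The impact of a replica failure is a brief interruption of some read-only queries: queries running on that replica are aborted by connection resets, and other available replicas take over immediately.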
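
The read-only routing rule just described (replicas first, primary only as a last resort) can be sketched as follows. This is a simplified Python model, not HAProxy configuration or Pigsty code; the function `readonly_backends` and the `healthy` flag, which stands in for the result of a health check, are hypothetical.

```python
def readonly_backends(members: list) -> list:
    """Pick backends for read-only traffic.

    Healthy replicas are preferred; the primary is used only when
    no healthy replica is left.
    """
    replicas = [m["name"] for m in members
                if m["role"] == "replica" and m["healthy"]]
    if replicas:
        return replicas
    return [m["name"] for m in members
            if m["role"] == "primary" and m["healthy"]]


cluster = [
    {"name": "pg-test-1", "role": "primary", "healthy": True},
    {"name": "pg-test-2", "role": "replica", "healthy": True},
    {"name": "pg-test-3", "role": "replica", "healthy": False},
]
print(readonly_backends(cluster))   # ['pg-test-2']

cluster[1]["healthy"] = False       # the last healthy replica fails
print(readonly_backends(cluster))   # ['pg-test-1'], the primary takes over reads
```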

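Failure detection is done jointly by Patroni and Etcd. The cluster leader holds a lease; if the leader fails to renew it in time (10s) because of a failure, the lease is released, triggering a failover and a new round of cluster election.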
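
The lease mechanism can be modeled with a few lines of Python. This is a toy model for intuition only, not how Patroni or Etcd implement leases. The class `LeaderLease` is hypothetical, time is passed in explicitly as plain seconds, and the numbers in the example (a 30-second TTL renewed every 10 seconds) are illustrative assumptions rather than values taken from the text.

```python
class LeaderLease:
    """A leader lease that must be renewed before it expires."""

    def __init__(self, holder: str, ttl: int, now: int):
        self.holder = holder
        self.ttl = ttl
        self.expires_at = now + ttl

    def renew(self, now: int) -> bool:
        """Extend the lease if it is still valid; return False if it already expired."""
        if self.expired(now):
            return False
        self.expires_at = now + self.ttl
        return True

    def expired(self, now: int) -> bool:
        """An expired lease means the leader key is gone and a new election can start."""
        return now >= self.expires_at


lease = LeaderLease("pg-test-1", ttl=30, now=0)
print(lease.renew(now=10))    # True, healthy leader renews in time
print(lease.renew(now=20))    # True, lease now valid until t=50
print(lease.expired(now=49))  # False
print(lease.expired(now=50))  # True, leader stopped renewing, failover may begin
```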

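Even without any failure, you can proactively change the cluster primary with a switchover. In this case write queries on the primary are briefly interrupted and are immediately routed to the new primary. This operation is typically used for rolling maintenance or upgrades of database servers.

Trade-offs

The recovery time objective (RTO) and recovery point objective (RPO) are two parameters that need careful trade-offs when designing an HA cluster.

Pigsty's default RTO and RPO values meet the reliability requirements of the vast majority of scenarios; you can adjust them sensibly based on your hardware, network quality, and business needs.

Smaller RTO and RPO are not always better!

An RTO that is too small raises the false-positive rate; an RPO that is too small lowers the probability of a successful automatic failover.

The upper bound on unavailability during failover is controlled by the pg_rto parameter. The default RTO is 30s. Increasing it lengthens write unavailability during primary failover, while decreasing it raises the rate of false-positive failovers (for example, repeated failovers caused by brief network jitter).

The upper bound on potential data loss is controlled by the pg_rpo parameter, 1MB by default. Lowering it reduces the maximum data loss on failover, but also raises the probability that automatic failover is refused because no replica is healthy enough (too far behind).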
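
The trade-off behind pg_rpo can be illustrated with a simplified candidate-selection function in Python. This sketch assumes LSNs are plain byte positions and that a replica is eligible only if its lag behind the last known primary position is within the limit; the function `pick_new_primary` and its parameters are hypothetical and do not reproduce Patroni's actual election logic.

```python
def pick_new_primary(replicas: dict, last_primary_lsn: int,
                     max_lag_bytes: int = 1024 * 1024):
    """Return the eligible replica with the highest LSN, or None.

    replicas maps instance name -> replayed LSN in bytes.
    A replica lagging more than max_lag_bytes is not eligible, so a
    smaller limit means less potential data loss but more refusals.
    """
    eligible = {name: lsn for name, lsn in replicas.items()
                if last_primary_lsn - lsn <= max_lag_bytes}
    if not eligible:
        return None  # automatic failover is refused
    return max(eligible, key=eligible.get)


primary_lsn = 1_000_000_000
replicas = {
    "pg-test-2": primary_lsn - 50_000,      # about 50 KB behind
    "pg-test-3": primary_lsn - 5_000_000,   # about 5 MB behind
}
print(pick_new_primary(replicas, primary_lsn))                        # pg-test-2
print(pick_new_primary(replicas, primary_lsn, max_lag_bytes=10_000))  # None
```

With the 1 MB default the 50 KB-behind replica is promoted; tightening the limit to 10 KB leaves no eligible candidate, which is the "refused failover" case described above.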

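Pigsty uses availability-first mode by default: when the primary fails, it fails over as soon as possible, and data not yet replicated to a replica may be lost (on a regular 10GbE network, replication lag is usually between a few KB and 100 KB).

If you need to guarantee that no data is lost on failover, use the crit.yml template, at the cost of some performance.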

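3.7 - Point-in-Time Recovery

Pigsty uses pgBackRest to implement PostgreSQL point-in-time recovery, letting users roll back to any point in time within the range allowed by the backup policy.

Overview

You can restore and roll back a cluster to any moment in the past, avoiding data loss caused by software defects and human error.

Pigsty's PostgreSQL clusters come with an automatically configured point-in-time recovery (PITR) solution, based on the backup component pgBackRest and the optional object storage repository MinIO.

The HA solution handles hardware failures but is powerless against data deletion, overwrites, or dropped databases caused by software defects and human error. For these cases Pigsty offers out-of-the-box point-in-time recovery (PITR), enabled by default with no extra configuration.

Pigsty provides default configs for base backups and WAL archiving. You can store backups in a local directory or disk, or in a dedicated MinIO cluster or S3 object storage service for off-site disaster recovery. With local disk, the default keeps the ability to recover to any point within the past day. With MinIO or S3, the default keeps the ability to recover to any point within the past week. As long as storage allows, you can keep a recovery window as long as you like.

What problems does PITR solve?

Stronger disaster recovery: RPO drops from ∞ to a dozen or so MB, and RTO drops from ∞ to somewhere between a few quarter-hours and a few hours.
Data security, integrity in C/I/A: avoids data consistency problems caused by accidental deletion.
Data security, availability in C/I/A: provides a fallback for the disaster case of "permanent unavailability".

Single-instance config strategy	Event	RTO	RPO
Nothing	Crash	Permanently lost	All lost
Base backup	Crash	Depends on backup size and bandwidth (hours)	Data since the last backup lost (hours to days)
Base backup + WAL archive	Crash	Depends on backup size and bandwidth (hours)	Last unarchived data lost (tens of MB)

What does PITR cost?

Weakens the C in data security, confidentiality: backups create extra leak points and need additional protection.
Extra resource consumption: local storage or network traffic / bandwidth overhead, usually not a problem.
Higher complexity: users must pay the cost of backup management.

Limitations of PITR

If PITR is the only means of failure recovery, RTO and RPO are worse than with the HA solution; the two should usually be combined.

RTO: with only a standalone instance + PITR, recovery time depends on backup size and network/disk bandwidth, ranging from tens of minutes to hours or even days.
RPO: with only a standalone instance + PITR, a crash may lose a small amount of data: one or a few WAL segment files may not yet be archived, losing 16 MB to tens of MB of data.

Besides PITR, you can also use a delayed cluster in Pigsty to deal with accidental deletion or modification caused by human error or software defects.

How It Works

PITR lets you restore and roll back a cluster to "any moment" in the past, avoiding data loss caused by software defects and human error. This requires two preparations: base backups and WAL archiving. A base backup lets you restore the database to its state at backup time; a WAL archive starting from a base backup additionally lets you restore the database to any point in time after that base backup.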
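
The relationship between base backups, the WAL archive, and a recovery target can be sketched in Python. The function `pick_base_backup` is a hypothetical helper that only models the selection rule implied above: restore the latest base backup taken at or before the target, then replay archived WAL up to the target. It assumes the WAL archive is continuous from that backup onward and is not how pgBackRest is invoked.

```python
from datetime import datetime


def pick_base_backup(backups: list, target: datetime):
    """Return the latest base backup taken at or before the target time.

    Returns None when the target is older than every backup, in which
    case that point in time cannot be recovered.
    """
    candidates = [taken_at for taken_at in backups if taken_at <= target]
    return max(candidates) if candidates else None


backups = [
    datetime(2024, 1, 1, 1, 0),   # full backup at 01:00 on day 1
    datetime(2024, 1, 2, 1, 0),   # full backup at 01:00 on day 2
]

# A table was dropped by mistake at 15:30 on day 2; recover to 15:29.
target = datetime(2024, 1, 2, 15, 29)
print(pick_base_backup(backups, target))                        # 2024-01-02 01:00:00
print(pick_base_backup(backups, datetime(2023, 12, 31, 0, 0)))  # None
```

The WAL between the chosen backup and the target is what gets replayed, so the older the chosen backup, the longer the recovery takes.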

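For the detailed principles, see: Base Backup and Point-in-Time Recovery; for concrete operations, see PGSQL Admin: Backup & Recovery.

Base Backup

Pigsty uses pgBackRest to manage PostgreSQL backups. pgBackRest initializes an empty repository on every cluster instance but only actually uses the repository on the cluster primary.

pgBackRest supports three backup types: full, incremental, and differential; the first two are the most commonly used. A full backup takes a complete physical snapshot of the database cluster at the current moment. A differential backup records changes since the last full backup, while an incremental backup records changes since the most recent backup of any type.

Pigsty provides a wrapper command for backups: /pg/bin/pg-backup [full|incr]. You can schedule base backups as needed through crontab or any other job scheduler.

WAL Archive

By default Pigsty enables WAL archiving on the cluster primary and uses the pgbackrest command-line tool to continuously push WAL segment files to the backup repository.

pgBackRest automatically manages the required WAL files and, following the backup retention policy, promptly cleans up expired backups and their corresponding WAL archive files.

If you do not need PITR, you can turn off WAL archiving by configuring the cluster with archive_mode: off, and remove node_crontab to stop the scheduled backup jobs.

Implementation

By default Pigsty offers two preset backup policies. The default policy uses a local filesystem backup repository and takes one full backup per day, so users can always roll back to any point within the past day. The alternative policy uses a dedicated MinIO cluster or S3 storage for backups, with a full backup every Monday and a daily incremental backup, keeping two weeks of backups and WAL archives by default.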
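
Count-based retention, as used by the local repository policy, can be modeled roughly as follows. The Python function `expire_full_backups` is a hypothetical sketch of "keep the newest N full backups", not pgBackRest's actual expiration algorithm, and the backup labels are made up for the example.

```python
def expire_full_backups(backups: list, retention_full: int):
    """Split full backups into (kept, expired), keeping the newest ones.

    backups is a list of sortable labels or timestamps;
    retention_full is how many full backups to keep.
    """
    if retention_full < 1:
        raise ValueError("retention_full must be at least 1")
    ordered = sorted(backups)
    return ordered[-retention_full:], ordered[:-retention_full]


# Two full backups are retained; a third one has just been taken.
kept, expired = expire_full_backups(
    ["2024-01-01-full", "2024-01-02-full", "2024-01-03-full"],
    retention_full=2,
)
print(kept)     # ['2024-01-02-full', '2024-01-03-full']
print(expired)  # ['2024-01-01-full']
```

Between taking the new backup and expiring the oldest one, three full backups exist at the same time, so the repository needs room for one more backup than the retention count.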

Pigsty 使用 pgBackRest 管理备份，接收 WAL 归档，执行 PITR。备份仓库可以进行灵活配置（pgbackrest_repo）：默认使用主库本地文件系统（local），但也可以使用其他磁盘路径，或使用自带的可选 MinIO 服务（minio）与云上 S3 服务。

pgbackrest_enabled: true          # enable pgBackRest on pgsql hosts?
pgbackrest_clean: true            # remove pg backup data during init?
pgbackrest_log_dir: /pg/log/pgbackrest # pgbackrest log dir, `/pg/log/pgbackrest` by default
pgbackrest_method: local          # pgbackrest repo method: local, minio, [user-defined...]
pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
  local:                          # default pgbackrest repo on the local posix filesystem
    path: /pg/backup              # local backup directory, `/pg/backup` by default
    retention_full_type: count    # retain full backups by count
    retention_full: 2             # keep at least 2, at most 3 full backups when using the local fs repo
  minio:                          # optional minio repo for pgbackrest
    type: s3                      # minio is s3-compatible, so s3 is used
    s3_endpoint: sss.pigsty       # minio endpoint domain name, `sss.pigsty` by default
    s3_region: us-east-1          # minio region, us-east-1 by default, has no effect for minio
    s3_bucket: pgsql              # minio bucket name, `pgsql` by default
    s3_key: pgbackrest            # minio user access key for pgbackrest
    s3_key_secret: S3User.Backup  # minio user secret key for pgbackrest
    s3_uri_style: path            # use path-style uri for minio rather than host-style
    path: /pgbackrest             # minio backup path, `/pgbackrest` by default
    storage_port: 9000            # minio port, 9000 by default
    storage_ca_file: /etc/pki/ca.crt  # minio ca file path, `/etc/pki/ca.crt` by default
    bundle: y                     # bundle small files into a single file
    cipher_type: aes-256-cbc      # enable AES encryption for the remote backup repo
    cipher_pass: pgBackRest       # AES encryption password, 'pgBackRest' by default
    retention_full_type: time     # retain full backups by time on the minio repo
    retention_full: 14            # keep full backups from the last 14 days
  # you can add other optional backup repos here, such as S3, for off-site disaster recovery

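Each target repository in the Pigsty parameter pgbackrest_repo is translated into a repository definition in the /etc/pgbackrest/pgbackrest.conf configuration file. For example, if you define an S3 repository in the US West region for cold backups, you could use the following reference configuration.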

s3:    # ------> /etc/pgbackrest/pgbackrest.conf
  repo1-type: s3                                   # ----> repo1-type=s3
  repo1-s3-region: us-west-1                       # ----> repo1-s3-region=us-west-1
  repo1-s3-endpoint: s3-us-west-1.amazonaws.com    # ----> repo1-s3-endpoint=s3-us-west-1.amazonaws.com
  repo1-s3-key: '<your_access_key>'                # ----> repo1-s3-key=<your_access_key>
  repo1-s3-key-secret: '<your_secret_key>'         # ----> repo1-s3-key-secret=<your_secret_key>
  repo1-s3-bucket: pgsql                           # ----> repo1-s3-bucket=pgsql
  repo1-s3-uri-style: host                         # ----> repo1-s3-uri-style=host
  repo1-path: /pgbackrest                          # ----> repo1-path=/pgbackrest
  repo1-bundle: y                                  # ----> repo1-bundle=y
  repo1-cipher-type: aes-256-cbc                   # ----> repo1-cipher-type=aes-256-cbc
  repo1-cipher-pass: pgBackRest                    # ----> repo1-cipher-pass=pgBackRest
  repo1-retention-full-type: time                  # ----> repo1-retention-full-type=time
  repo1-retention-full: 90                         # ----> repo1-retention-full=90
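
To make the translation into pgbackrest.conf concrete, here is a simplified sketch that turns "key: value" lines into "repo1-key=value" lines. It is an illustration only and not the template Pigsty uses: the function name to_pgbackrest_conf is hypothetical, the underscore-to-hyphen rule is an assumption based on pgBackRest's hyphenated option names, and values containing a `#` character are not handled.

```shell
#!/bin/sh
# Convert "key: value  # comment" lines on stdin into "repo1-key=value" lines.
# Hypothetical helper for illustration; assumes a single repository (repo1).
to_pgbackrest_conf() {
    # 1) strip trailing comments, 2) drop empty lines, 3) rewrite "key: value" as "key=value"
    sed -e 's/[[:space:]]*#.*$//' \
        -e '/^[[:space:]]*$/d' \
        -e 's/^[[:space:]]*\([a-z0-9_]*\):[[:space:]]*\(.*\)$/\1=\2/' |
    while IFS='=' read -r key value; do
        # pgBackRest option names use hyphens, so replace underscores in the key
        printf 'repo1-%s=%s\n' "$(printf '%s' "$key" | tr '_' '-')" "$value"
    done
}

printf '%s\n' \
    'type: s3                      # minio is s3-compatible' \
    's3_uri_style: path' \
    'retention_full: 14' | to_pgbackrest_conf
```

The rendered lines can then be compared against the generated /etc/pgbackrest/pgbackrest.conf on a node to confirm what Pigsty actually wrote.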

Recovery

You can use the following wrapper commands for point-in-time recovery of a PostgreSQL database cluster.

By default Pigsty performs an incremental, differential, parallel restore, so that you reach the target point in time as fast as possible.

pg-pitr                                 # restore to the end of the WAL archive stream (e.g. after a whole datacenter failure)
pg-pitr -i                              # restore to the time the latest backup completed (rarely used)
pg-pitr --time="2022-12-30 14:44:44+08" # restore to a specific point in time (e.g. after dropping a database or table)
pg-pitr --name="my-restore-point"       # restore to a named restore point created with pg_create_restore_point
pg-pitr --lsn="0/7C82CB8" -X            # restore to right before the LSN
pg-pitr --xid="1234567" -X -P           # restore to right before the given transaction ID, then promote the cluster to primary
pg-pitr --backup=latest                 # restore to the latest backup set
pg-pitr --backup=20221108-105325F       # restore to a specific backup set, which can be listed with pgbackrest info

pg-pitr                                 # pgbackrest --stanza=pg-meta restore
pg-pitr -i                              # pgbackrest --stanza=pg-meta --type=immediate restore
pg-pitr -t "2022-12-30 14:44:44+08"     # pgbackrest --stanza=pg-meta --type=time --target="2022-12-30 14:44:44+08" restore
pg-pitr -n "my-restore-point"           # pgbackrest --stanza=pg-meta --type=name --target=my-restore-point restore
pg-pitr -b 20221108-105325F             # pgbackrest --stanza=pg-meta --type=name --set=20221108-105325F restore
pg-pitr -l "0/7C82CB8" -X               # pgbackrest --stanza=pg-meta --type=lsn --target="0/7C82CB8" --target-exclusive restore
pg-pitr -x 1234567 -X -P                # pgbackrest --stanza=pg-meta --type=xid --target="1234567" --target-exclusive --target-action=promote restore

While performing PITR, you can use the Pigsty monitoring system to watch the cluster's LSN position and judge whether it has been restored to the intended point in time, transaction, LSN, or other target.
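
When checking by hand whether replay has reached a target LSN, it helps to compare LSNs numerically. The sketch below converts an LSN of the form "hi/lo" (two hexadecimal numbers) into a byte position and compares two of them. The function names are hypothetical; the replayed position itself would come from a query such as SELECT pg_last_wal_replay_lsn() on the recovering instance.

```shell
#!/bin/sh
# Convert a PostgreSQL LSN such as "0/7C82CB8" into a byte position.
# The part before the slash is the high 32 bits, the part after is the low 32 bits.
# Assumes the high part stays below 0x80000000 (signed 64-bit shell arithmetic).
lsn_to_int() {
    hi=${1%%/*}
    lo=${1##*/}
    echo $(( 0x$hi * 4294967296 + 0x$lo ))
}

# Succeed (exit 0) when the replayed LSN ($1) has reached the target LSN ($2).
lsn_reached() {
    [ "$(lsn_to_int "$1")" -ge "$(lsn_to_int "$2")" ]
}

if lsn_reached "0/7C82CC0" "0/7C82CB8"; then
    echo "target LSN reached"
else
    echo "still replaying"
fi
```

A plain string comparison would give wrong answers here, because LSNs are not zero-padded.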


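3.8 - Service Access

Pigsty uses HAProxy to provide service access, with optional pgBouncer connection pooling and optional L2 VIP and DNS access.

Separate read and write operations, route traffic correctly, and deliver the capabilities of PostgreSQL clusters reliably.

A service is an abstraction: it is the form in which a database cluster exposes its capabilities to the outside, and it encapsulates the details of the underlying cluster.

Services are critical for stable access in production, and they show their value when a high-availability cluster fails over automatically. Single-node users usually do not need to worry about this concept.

Single-node users

The concept of a "service" is meant for production environments. Personal users and single-node clusters can skip it and access the database directly by instance name or IP address.

For example, the default single-node Pigsty database pg-meta.meta can be reached directly with the three different users below.

psql postgres://dbuser_dba:DBUser.DBA@10.10.10.10/meta     # connect directly as the DBA superuser
psql postgres://dbuser_meta:DBUser.Meta@10.10.10.10/meta   # connect as the default business admin user
psql postgres://dbuser_view:DBUser.Viewer@pg-meta/meta     # connect as the default read-only user via the instance domain name

Service overview

In real-world production environments we use replication-based primary-replica database clusters. Exactly one instance in the cluster acts as the leader (primary) and accepts writes. The other instances (replicas) continuously fetch the change log from the leader and stay consistent with it. Replicas can also serve read-only requests, which significantly offloads the primary in read-heavy workloads, so distinguishing write requests from read-only requests is a very common practice.

In addition, for production environments with high-frequency short-lived connections, we pool requests through a connection pool middleware (Pgbouncer) to reduce the overhead of creating connections and backend processes. For scenarios such as ETL and change execution, however, we need to bypass the pool and access the database directly. Meanwhile, a high-availability cluster fails over when something breaks, and a failover changes the cluster leader, so a highly available database solution must make write traffic follow leader changes automatically. These different access requirements (read-write separation, pooled versus direct connections, automatic adaptation to failover) are ultimately abstracted into the concept of a service.

Generally speaking, every database cluster must provide this most basic service:

Read-write service (primary): can read from and write to the database

A production database cluster should provide at least these two services:

Read-write service (primary): writes data; can only be served by the primary.
Read-only service (replica): reads data; can be served by replicas, or by the primary when there is no replica.

Depending on the business scenario, there may be other services as well, for example:

Default direct service (default): lets (admin) users bypass the connection pool and access the database directly
Offline replica service (offline): a dedicated replica that takes no online read-only traffic, used for ETL and analytical queries
Synchronous replica service (standby): a read-only service with no replication lag, with read-only queries handled by a synchronous standby or the primary
Delayed replica service (delayed): access to the same cluster's data as of some time ago, handled by a delayed replica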

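Accessing services

Pigsty's service delivery boundary stops at the cluster's HAProxy. Users can reach these load balancers in various ways.

The typical approach is to use DNS or a VIP, bound to all or any number of the cluster's load balancers.

You can use different host & port combinations, which provide PostgreSQL service in different ways.

Host

Type	Example	Description
Cluster domain name	`pg-test`	Access via the cluster domain name (resolved by dnsmasq @ infra nodes)
Cluster VIP address	`10.10.10.3`	Access via the L2 VIP address managed by `vip-manager`, bound to the primary node
Instance hostname	`pg-test-1`	Access via any instance hostname (resolved by dnsmasq @ infra nodes)
Instance IP address	`10.10.10.11`	Access via the IP address of any instance

Port

Pigsty uses different ports to distinguish PostgreSQL services:

Port	Service	Type	Description
5432	postgres	database	Direct access to the postgres server
6432	pgbouncer	middleware	Go through the connection pool middleware before reaching postgres
5433	primary	service	Access the primary pgbouncer (or postgres)
5434	replica	service	Access a replica pgbouncer (or postgres)
5436	default	service	Access the primary postgres
5438	offline	service	Access the offline postgres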

Combinations

# Access via the cluster domain name
postgres://test@pg-test:5432/test # DNS -> L2 VIP -> primary direct connection
postgres://test@pg-test:6432/test # DNS -> L2 VIP -> primary connection pool -> primary
postgres://test@pg-test:5433/test # DNS -> L2 VIP -> HAProxy -> primary connection pool -> primary
postgres://test@pg-test:5434/test # DNS -> L2 VIP -> HAProxy -> replica connection pool -> replica
postgres://dbuser_dba@pg-test:5436/test # DNS -> L2 VIP -> HAProxy -> primary direct connection (for admin)
postgres://dbuser_stats@pg-test:5438/test # DNS -> L2 VIP -> HAProxy -> offline direct connection (for ETL/personal queries)

# Access directly via the cluster VIP
postgres://test@10.10.10.3:5432/test # L2 VIP -> primary direct access
postgres://test@10.10.10.3:6432/test # L2 VIP -> primary connection pool -> primary
postgres://test@10.10.10.3:5433/test # L2 VIP -> HAProxy -> primary connection pool -> primary
postgres://test@10.10.10.3:5434/test # L2 VIP -> HAProxy -> replica connection pool -> replica
postgres://dbuser_dba@10.10.10.3:5436/test # L2 VIP -> HAProxy -> primary direct connection (for admin)
postgres://dbuser_stats@10.10.10.3:5438/test # L2 VIP -> HAProxy -> offline direct connection (for ETL/personal queries)

# Specify any cluster instance name directly
postgres://test@pg-test-1:5432/test # DNS -> direct connection to the database instance (singleton access)
postgres://test@pg-test-1:6432/test # DNS -> connection pool -> database
postgres://test@pg-test-1:5433/test # DNS -> HAProxy -> connection pool -> database read/write
postgres://test@pg-test-1:5434/test # DNS -> HAProxy -> connection pool -> database read-only
postgres://dbuser_dba@pg-test-1:5436/test # DNS -> HAProxy -> database direct connection
postgres://dbuser_stats@pg-test-1:5438/test # DNS -> HAProxy -> offline database read/write

# Specify any cluster instance IP directly
postgres://test@10.10.10.11:5432/test # direct connection to the database instance (instance specified directly, no automatic traffic distribution)
postgres://test@10.10.10.11:6432/test # connection pool -> database
postgres://test@10.10.10.11:5433/test # HAProxy -> connection pool -> database read/write
postgres://test@10.10.10.11:5434/test # HAProxy -> connection pool -> database read-only
postgres://dbuser_dba@10.10.10.11:5436/test # HAProxy -> database direct connection
postgres://dbuser_stats@10.10.10.11:5438/test # HAProxy -> offline database read/write

# Smart client: read-write separation via URL
postgres://test@10.10.10.11:6432,10.10.10.12:6432,10.10.10.13:6432/test?target_session_attrs=primary
postgres://test@10.10.10.11:6432,10.10.10.12:6432,10.10.10.13:6432/test?target_session_attrs=prefer-standby
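
The host and port combinations above follow a fixed pattern, which the sketch below captures: a service name is mapped to its port and combined with a host into a connection URL. The function names pg_service_port and pg_service_url are hypothetical helpers for illustration; the port numbers are the ones listed in the port table.

```shell
#!/bin/sh
# Map a service name to the port listed in the port table.
pg_service_port() {
    case "$1" in
        postgres)  echo 5432 ;;   # direct access to postgres
        pgbouncer) echo 6432 ;;   # connection pool in front of postgres
        primary)   echo 5433 ;;   # read-write service
        replica)   echo 5434 ;;   # read-only service
        default)   echo 5436 ;;   # direct access to the primary
        offline)   echo 5438 ;;   # offline instance for ETL and analytics
        *) echo "unknown service: $1" >&2; return 1 ;;
    esac
}

# Build a connection URL. Arguments: user, host, service, database.
pg_service_url() {
    port=$(pg_service_port "$3") || return 1
    printf 'postgres://%s@%s:%s/%s\n' "$1" "$2" "$port" "$4"
}

pg_service_url test pg-test primary test            # via cluster domain name
pg_service_url dbuser_stats 10.10.10.3 offline test # via cluster VIP
```

Keeping the mapping in one place means application configuration can refer to services by name rather than by hard-coded port.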

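3.9 - Access Control

Pigsty provides standard security practices: password and certificate authentication, an out-of-the-box privilege model, SSL-encrypted network traffic, encrypted remote cold backups, and more.

Pigsty provides an out-of-the-box access control model built on a role system and a privilege system.

Access control is important, yet many users get it wrong. Pigsty therefore ships a streamlined access control model that works out of the box and gives your cluster a security baseline.

Role system

The default role system in Pigsty consists of four default roles and four default users:

Role name	Attributes	Member of	Description
`dbrole_readonly`	`NOLOGIN`		Role: global read-only access
`dbrole_readwrite`	`NOLOGIN`	dbrole_readonly	Role: global read-write access
`dbrole_admin`	`NOLOGIN`	pg_monitor,dbrole_readwrite	Role: admin / object creation
`dbrole_offline`	`NOLOGIN`		Role: restricted read-only access
`postgres`	`SUPERUSER`		System superuser
`replicator`	`REPLICATION`	pg_monitor,dbrole_readonly	System replication user
`dbuser_dba`	`SUPERUSER`	dbrole_admin	pgsql admin user
`dbuser_monitor`		pg_monitor	pgsql monitor user

These roles and users are defined in detail as follows:

pg_default_roles:                 # global default roles and system users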
  - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access     }
  - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
  - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly] ,comment: role for global read-write access }
  - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite] ,comment: role for object creation }
  - { name: postgres     ,superuser: true  ,comment: system superuser }
  - { name: replicator ,replication: true  ,roles: [pg_monitor, dbrole_readonly] ,comment: system replicator }
  - { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 ,comment: pgsql admin user }
  - { name: dbuser_monitor ,roles: [pg_monitor] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

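Default roles

There are four default roles in Pigsty:

Business read-only (dbrole_readonly): a role for global read-only access. If another business needs read-only access to this database, it can use this role.
Business read-write (dbrole_readwrite): a role for global read-write access. The production account used by the primary business should have read-write privileges on the database.
Business admin (dbrole_admin): a role with DDL privileges, typically used by business administrators or in scenarios where applications need to create tables (such as various business software).
Offline read-only access (dbrole_offline): a restricted read-only role (it can only access offline instances, and is typically used for personal users and ETL tool accounts).

Default roles are defined in pg_default_roles. Unless you really know what you are doing, do not change the names of the default roles.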

- { name: dbrole_readonly  , login: false , comment: role for global read-only access  }                            # production read-only role
- { name: dbrole_offline ,   login: false , comment: role for restricted read-only access (offline instance) }      # restricted read-only role
- { name: dbrole_readwrite , login: false , roles: [dbrole_readonly], comment: role for global read-write access }  # production read-write role
- { name: dbrole_admin , login: false , roles: [pg_monitor, dbrole_readwrite] , comment: role for object creation } # production DDL change role

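Default users

Pigsty also has four default users (system users):

Superuser (postgres): the owner and creator of the cluster, with the same name as the operating system dbsu.
Replication user (replicator): the system user used for primary-replica replication.
Monitor user (dbuser_monitor): the user used to monitor database and connection pool metrics.
Admin user (dbuser_dba): the administrator user who performs routine operations and database changes.

The usernames and passwords of these four default users are defined by four pairs of dedicated parameters, which are referenced in many places:

pg_dbsu: operating system dbsu name, postgres by default; better not to change it.
pg_dbsu_password: dbsu password; the default empty string means no dbsu password is set, and it is better not to set one.
pg_replication_username: postgres replication username, replicator by default
pg_replication_password: postgres replication password, DBUser.Replicator by default
pg_admin_username: postgres admin username, dbuser_dba by default
pg_admin_password: postgres admin password in plain text, DBUser.DBA by default
pg_monitor_username: postgres monitor username, dbuser_monitor by default
pg_monitor_password: postgres monitor password, DBUser.Monitor by default

Remember to change these passwords in production deployments. Do not use the default values!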

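pg_dbsu: postgres                             # database superuser name; changing it is not recommended
pg_dbsu_password: ''                          # database superuser password; leave it empty to disable dbsu password login
pg_replication_username: replicator           # system replication username
pg_replication_password: DBUser.Replicator    # system replication password, be sure to change it!
pg_monitor_username: dbuser_monitor           # system monitor username
pg_monitor_password: DBUser.Monitor           # system monitor password, be sure to change it!
pg_admin_username: dbuser_dba                 # system admin username
pg_admin_password: DBUser.DBA                 # system admin password, be sure to change it!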

If you change the parameters of the default users, update the corresponding role definitions in pg_default_roles accordingly:

- { name: postgres     ,superuser: true                                          ,comment: system superuser }
- { name: replicator ,replication: true  ,roles: [pg_monitor, dbrole_readonly]   ,comment: system replicator }
- { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 , comment: pgsql admin user }
- { name: dbuser_monitor   ,roles: [pg_monitor, dbrole_readonly] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

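Privilege system

Pigsty has an out-of-the-box privilege model that works together with the default roles.

All users can access all schemas.
Read-only users (dbrole_readonly) can read from all tables. (SELECT, EXECUTE)
Read-write users (dbrole_readwrite) can write to all tables and run DML. (INSERT, UPDATE, DELETE)
Admin users (dbrole_admin) can create objects and run DDL. (CREATE, USAGE, TRUNCATE, REFERENCES, TRIGGER)
Offline users (dbrole_offline) are similar to read-only users but with restricted access: they may only access offline instances (pg_role = 'offline' or pg_offline_query = true).
Objects created by admin users will have the correct privileges.
Default privileges are configured on all databases, including template databases.
Database connect privileges are managed by the database definition.
The CREATE privilege on databases and on the public schema is revoked from PUBLIC by default.

Object privileges

The default privileges of newly created objects in a database are controlled by the parameter pg_default_privileges: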

- GRANT USAGE      ON SCHEMAS   TO dbrole_readonly
- GRANT SELECT     ON TABLES    TO dbrole_readonly
- GRANT SELECT     ON SEQUENCES TO dbrole_readonly
- GRANT EXECUTE    ON FUNCTIONS TO dbrole_readonly
- GRANT USAGE      ON SCHEMAS   TO dbrole_offline
- GRANT SELECT     ON TABLES    TO dbrole_offline
- GRANT SELECT     ON SEQUENCES TO dbrole_offline
- GRANT EXECUTE    ON FUNCTIONS TO dbrole_offline
- GRANT INSERT     ON TABLES    TO dbrole_readwrite
- GRANT UPDATE     ON TABLES    TO dbrole_readwrite
- GRANT DELETE     ON TABLES    TO dbrole_readwrite
- GRANT USAGE      ON SEQUENCES TO dbrole_readwrite
- GRANT UPDATE     ON SEQUENCES TO dbrole_readwrite
- GRANT TRUNCATE   ON TABLES    TO dbrole_admin
- GRANT REFERENCES ON TABLES    TO dbrole_admin
- GRANT TRIGGER    ON TABLES    TO dbrole_admin
- GRANT CREATE     ON SCHEMAS   TO dbrole_admin

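Objects newly created by an admin get the privileges above by default. Use \ddp+ to view these default privileges:

Type	Access privileges
function	=X
	dbrole_readonly=X
	dbrole_offline=X
	dbrole_admin=X
schema	dbrole_readonly=U
	dbrole_offline=U
	dbrole_admin=UC
sequence	dbrole_readonly=r
	dbrole_offline=r
	dbrole_readwrite=wU
	dbrole_admin=rwU
table	dbrole_readonly=r
	dbrole_offline=r
	dbrole_readwrite=awd
	dbrole_admin=arwdDxt

Default privileges

The SQL statement ALTER DEFAULT PRIVILEGES lets you set the privileges that will be applied to objects created in the future. It does not affect privileges on objects that already exist, nor objects created by non-admin users.

In Pigsty, default privileges are defined for three roles: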

{% for priv in pg_default_privileges %}
ALTER DEFAULT PRIVILEGES FOR ROLE {{ pg_dbsu }} {{ priv }};
{% endfor %}

{% for priv in pg_default_privileges %}
ALTER DEFAULT PRIVILEGES FOR ROLE {{ pg_admin_username }} {{ priv }};
{% endfor %}

-- Other business admins should run SET ROLE dbrole_admin before executing DDL, so that the corresponding default privileges apply.
{% for priv in pg_default_privileges %}
ALTER DEFAULT PRIVILEGES FOR ROLE "dbrole_admin" {{ priv }};
{% endfor %}

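This content is used by the PG cluster initialization template pg-init-template.sql, which is rendered and written to /pg/tmp/pg-init-template.sql during cluster initialization. The script is executed in the template1 and postgres databases, and newly created databases inherit these default privileges through the template1 template.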
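
To see what the template loops expand to, the following sketch renders ALTER DEFAULT PRIVILEGES statements for one role from a list of privilege clauses read on standard input. It only imitates the loop for illustration: the function name render_default_privileges is hypothetical, and the two clauses shown are a small subset of pg_default_privileges.

```shell
#!/bin/sh
# Read privilege clauses (one per line) from stdin and print one
# ALTER DEFAULT PRIVILEGES statement per clause for the role given as $1.
render_default_privileges() {
    role=$1
    while IFS= read -r priv; do
        [ -n "$priv" ] || continue   # skip empty lines
        printf 'ALTER DEFAULT PRIVILEGES FOR ROLE "%s" %s;\n' "$role" "$priv"
    done
}

printf '%s\n' \
    'GRANT USAGE      ON SCHEMAS   TO dbrole_readonly' \
    'GRANT SELECT     ON TABLES    TO dbrole_readonly' |
    render_default_privileges dbrole_admin
```

Running the same clause list once per role (the dbsu, the admin user, and dbrole_admin) gives the three groups of statements the template produces.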

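In other words, to maintain correct object privileges, you must run DDL as an admin user, which can be:

{{ pg_dbsu }}, postgres by default
{{ pg_admin_username }}, dbuser_dba by default
Business admin users granted the dbrole_admin role (switching to the dbrole_admin identity via SET ROLE).

Using postgres as the global object owner is a sensible choice. If you want to create objects as a business admin user, you must run SET ROLE dbrole_admin before creating them in order to maintain correct privileges.

Of course, you can also explicitly grant default privileges to a business admin in the database with ALTER DEFAULT PRIVILEGES FOR ROLE <some_biz_admin> XXX.

Database privileges

In Pigsty, database-level privileges are covered in the database definition.

A database has three levels of privileges: CONNECT, CREATE, and TEMP, plus a special "privilege": OWNERSHIP.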

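- name: meta         # required, `name` is the only mandatory field of a database definition
  owner: postgres    # optional, database owner, postgres by default
  allowconn: true    # optional, allow connections, true by default; explicitly setting false forbids connecting to this database entirely
  revokeconn: false  # optional, revoke public connect privilege, false by default; when true, CONNECT is revoked from users other than the owner and admins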

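If the owner parameter is present, it is used as the database owner instead of the default {{ pg_dbsu }} (which is usually postgres).
If revokeconn is false, all users have the CONNECT privilege on the database. This is the default behavior.
If revokeconn is explicitly set to true:
- The CONNECT privilege on the database is revoked from PUBLIC: ordinary users cannot connect to this database.
- The CONNECT privilege is explicitly granted to {{ pg_replication_username }}, {{ pg_monitor_username }}, and {{ pg_admin_username }}.
- The CONNECT privilege is granted to the database owner WITH GRANT OPTION, so the owner can grant the connect privilege to other users.
The revokeconn option can be used to isolate cross-database access within the same cluster: you can create a different business user as the owner of each database and set the revokeconn option on all of them.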
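
The effect of revokeconn: true can be pictured as a handful of SQL statements. The sketch below prints statements equivalent to the three rules listed above for a given database and owner. It is an approximation for illustration, not the SQL Pigsty actually executes: the function name render_revokeconn_sql is hypothetical, and the default usernames replicator, dbuser_monitor, and dbuser_dba are assumed.

```shell
#!/bin/sh
# Print SQL approximating revokeconn: true. Arguments: database name, owner name.
render_revokeconn_sql() {
    db=$1
    owner=$2
    # ordinary users lose the ability to connect
    printf 'REVOKE CONNECT ON DATABASE "%s" FROM PUBLIC;\n' "$db"
    # system users keep access (default usernames assumed)
    for role in replicator dbuser_monitor dbuser_dba; do
        printf 'GRANT CONNECT ON DATABASE "%s" TO "%s";\n' "$db" "$role"
    done
    # the owner may pass the privilege on to other users
    printf 'GRANT CONNECT ON DATABASE "%s" TO "%s" WITH GRANT OPTION;\n' "$db" "$owner"
}

render_revokeconn_sql gitlab dbuser_gitlab
```

Reading the output makes it clear why a business user of one database cannot connect to a sibling database unless its owner grants access explicitly.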

Example: database isolation

pg-infra:
  hosts:
    10.10.10.40: { pg_seq: 1, pg_role: primary }
    10.10.10.41: { pg_seq: 2, pg_role: replica , pg_offline_query: true }
  vars:
    pg_cluster: pg-infra
    pg_users:
      - { name: dbuser_confluence, password: mc2iohos , pgbouncer: true, roles: [ dbrole_admin ] }
      - { name: dbuser_gitlab, password: sdf23g22sfdd , pgbouncer: true, roles: [ dbrole_readwrite ] }
      - { name: dbuser_jira, password: sdpijfsfdsfdfs , pgbouncer: true, roles: [ dbrole_admin ] }
    pg_databases:
      - { name: confluence , revokeconn: true, owner: dbuser_confluence , connlimit: 100 }
      - { name: gitlab , revokeconn: true, owner: dbuser_gitlab, connlimit: 100 }
      - { name: jira , revokeconn: true, owner: dbuser_jira , connlimit: 100 }

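CREATE privilege

For security reasons, Pigsty revokes the CREATE privilege on databases from PUBLIC by default. Since PostgreSQL 15, PUBLIC also no longer has the CREATE privilege on the public schema by default.

The database owner can always adjust the CREATE privilege as actually needed.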

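4 - Configuration Templates

Out-of-the-box configuration templates, configuration examples for specific scenarios, and detailed explanations of the configuration files.

4.1 - Configuration Overview

Out-of-the-box configuration templates, configuration examples for specific scenarios, and detailed explanations of the configuration files.

Single node

meta: the default single-node installation template, with fairly complete explanations of key configuration parameters

rich: downloads all available PG extensions and Docker, and presets a series of databases ready for use by software

pitr: single node; a sample configuration that uses remote cloud object storage for continuous backup and PITR

demo: the configuration file used by the Pigsty demo site, serving a public domain name with certificates

supa: self-hosted single-node/four-node Supabase on PostgreSQL managed by Pigsty

bare: the most minimal single-node Pigsty configuration

Four nodes

full: the standard four-node sandbox demo environment, with two PG clusters plus MinIO, Etcd, Redis, and FerretDB sample clusters

safe: a security-hardened 3+1 node template that follows high-standard security best practices

mssql: replaces PostgreSQL with the Microsoft SQL Server compatible kernel WiltonDB / Babelfish

polar: replaces vanilla PostgreSQL with the Alibaba Cloud PolarDB for PostgreSQL kernel

ivory: replaces vanilla PostgreSQL with Highgo's IvorySQL (an Oracle-compatible kernel)

minio: installs a four-node, highly available multi-node multi-drive MinIO cluster that provides S3-compatible object storage

Multiple nodes

dual: two-node template that builds a PostgreSQL cluster with limited high availability based on primary-replica replication, tolerating the failure of one specific node.

slim: two-node template with a slim installation: no local software repo and no infrastructure, just a highly available PG cluster that depends only on etcd.

trio: three-node template, the standard high-availability architecture, tolerating the failure of any one of the three nodes.

oss: five-node template that builds offline software packages in batch on the five major operating system distributions supported by Pigsty.

ext: five-node template that prepares the environment, tools, and dependencies for building extensions on the five major operating system distributions supported by Pigsty

prod: a 36-node production simulation template, used to test different scenarios and multiple PG major versions at the same time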

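4.2 - Single Node: meta

The core configuration file: the single-node installation template Pigsty uses by default, with fairly complete explanations of key configuration parameters and a minimal usable feature set.

The meta template is the default template of Pigsty. Its goal is to deploy the core Pigsty functionality, PostgreSQL, on the current single node.

For the best compatibility, the meta template downloads and installs only the minimum required set of software, so that this goal can be achieved on all operating system distributions and chip architectures.

Overview

Config name: meta
Node count: single node, pigsty/vagrant/spec/meta.rb
Description: the single-node installation template Pigsty uses by default, with fairly complete explanations of key configuration parameters and a minimal usable feature set.
Supported OS: el8, el9, d12, u22, u24
Supported architectures: x86_64, aarch64 (packages missing for el8)
Related configs: rich, pitr, demo

Usage: this is the default Pigsty configuration template, so there is no need to pass -c meta explicitly when running configure:

./configure [-i <primary_ip>]

For example, if you want to install PG 16 instead of the default PostgreSQL 17, pass the -v argument to configure:

./configure -v 16   # or 15,14,13....

Content

Source file: pigsty/conf/meta.yml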


all:

  #==============================================================#
  # Clusters, Nodes, and Modules
  #==============================================================#
  children:

    # infra: monitor, alert, repo, etc..
    infra:
      hosts:
        10.10.10.10: { infra_seq: 1 }

    # etcd cluster for HA postgres DCS
    etcd:
      hosts:
        10.10.10.10: { etcd_seq: 1 }
      vars:
        etcd_cluster: etcd

    # minio (single node, used as backup repo)
    minio:
      hosts:
        10.10.10.10: { minio_seq: 1 }
      vars:
        minio_cluster: minio

    # postgres cluster: pg-meta
    pg-meta:
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: primary }
      vars:
        pg_cluster: pg-meta
        pg_users:
          - { name: dbuser_meta ,password: DBUser.Meta     ,pgbouncer: true ,roles: [ dbrole_admin ]    ,comment: pigsty admin user }
          - { name: dbuser_view ,password: DBUser.Viewer   ,pgbouncer: true ,roles: [ dbrole_readonly ] ,comment: read-only viewer for meta database }
        pg_databases:
          - { name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [ pigsty ] }
        pg_hba_rules:
          - { user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes' }
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.2/24
        pg_vip_interface: eth1

    # pgsql 3 node ha cluster: pg-test
    pg-test:
      hosts:
        10.10.10.11: { pg_seq: 1, pg_role: primary }   # primary instance, leader of cluster
        10.10.10.12: { pg_seq: 2, pg_role: replica }   # replica instance, follower of leader
        10.10.10.13: { pg_seq: 3, pg_role: replica, pg_offline_query: true } # replica with offline access
      vars:
        pg_cluster: pg-test           # define pgsql cluster name
        pg_users:  [{ name: test , password: test , pgbouncer: true , roles: [ dbrole_admin ] }]
        pg_databases: [{ name: test }]
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.3/24
        pg_vip_interface: eth1

    #----------------------------------#
    # redis ms, sentinel, native cluster
    #----------------------------------#
    redis-ms: # redis classic primary & replica
      hosts: { 10.10.10.10: { redis_node: 1 , redis_instances: { 6379: { }, 6380: { replica_of: '10.10.10.10 6379' } } } }
      vars: { redis_cluster: redis-ms ,redis_password: 'redis.ms' ,redis_max_memory: 64MB }

    redis-meta: # redis sentinel x 3
      hosts: { 10.10.10.11: { redis_node: 1 , redis_instances: { 26379: { } ,26380: { } ,26381: { } } } }
      vars:
        redis_cluster: redis-meta
        redis_password: 'redis.meta'
        redis_mode: sentinel
        redis_max_memory: 16MB
        redis_sentinel_monitor: # primary list for redis sentinel, use cls as name, primary ip:port
          - { name: redis-ms, host: 10.10.10.10, port: 6379 ,password: redis.ms, quorum: 2 }

    redis-test: # redis native cluster: 3m x 3s
      hosts:
        10.10.10.12: { redis_node: 1 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
        10.10.10.13: { redis_node: 2 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
      vars: { redis_cluster: redis-test ,redis_password: 'redis.test' ,redis_mode: cluster, redis_max_memory: 32MB }


  #==============================================================#
  # Global Parameters
  #==============================================================#
  vars:
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default|china|europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml
    proxy_env:                        # global proxy env when downloading packages
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.myqcloud.com,*.tsinghua.edu.cn"
      # http_proxy:  # set your proxy here: e.g http://user:pass@proxy.xxx.com
      # https_proxy: # set your proxy here: e.g http://user:pass@proxy.xxx.com
      # all_proxy:   # set your proxy here: e.g http://user:pass@proxy.xxx.com
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }
      minio        : { domain: m.pigsty ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }

    #----------------------------------#
    # MinIO Related Options
    #----------------------------------#
    #pgbackrest_method: minio          # if you want to use minio as backup repo instead of 'local' fs, uncomment this
    #minio_users:                      # and configure `pgbackrest_repo` & `minio_users` accordingly
    #  - { access_key: dba , secret_key: S3User.DBA, policy: consoleAdmin }
    #  - { access_key: pgbackrest , secret_key: S3User.Backup, policy: readwrite }
    #pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
    #  minio: ...                      # optional minio repo for pgbackrest ...
    #    s3_key: pgbackrest            # minio user access key for pgbackrest
    #    s3_key_secret: S3User.Backup  # minio user secret key for pgbackrest
    #    cipher_pass: pgBackRest       # AES encryption password, default is 'pgBackRest'
    # if you want to use minio as backup repo instead of 'local' fs, uncomment this, and configure `pgbackrest_repo`
    pgbackrest_method: minio
    node_etc_hosts: [ '10.10.10.10 h.pigsty a.pigsty p.pigsty g.pigsty sss.pigsty' ]

    #----------------------------------#
    # Credential: CHANGE THESE PASSWORDS
    #----------------------------------#
    #grafana_admin_username: admin
    grafana_admin_password: pigsty
    #pg_admin_username: dbuser_dba
    pg_admin_password: DBUser.DBA
    #pg_monitor_username: dbuser_monitor
    pg_monitor_password: DBUser.Monitor
    #pg_replication_username: replicator
    pg_replication_password: DBUser.Replicator
    #patroni_username: postgres
    patroni_password: Patroni.API
    #haproxy_admin_username: admin
    haproxy_admin_password: pigsty
    #minio_access_key: minioadmin
    minio_secret_key: minioadmin

    #----------------------------------#
    # Repo, Node, Packages
    #----------------------------------#
    repo_modules: infra,node,pgsql
    repo_remove: true                 # remove existing repo on admin node during repo bootstrap
    node_repo_modules: local          # install the local module in repo_upstream for all nodes
    node_repo_remove: true            # remove existing node repo for node managed by pigsty
    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-common ] #,docker
    repo_extra_packages: [ pg17-main ] #,pg17-core ,pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-olap ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl]
    pg_version: 17                    # default postgres version
    #pg_extensions: [pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl ] #,pg17-olap]

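Caveats

Note that in order to work on all operating system distributions and chip architectures, the meta template downloads and installs only the minimum required set of software. This is reflected in repo_packages and repo_extra_packages:

docker is not downloaded by default.
PG extensions other than pg_repack, wal2json, and pgvector are not downloaded by default.
Tools that belong to pgsql-utility but not to pgsql-common are not downloaded: pg_activity pg_timetable pgFormatter pg_filedump pgxnclient timescaledb-tools pgcopydb pgloader.

4.3 - Single Node: rich

On top of the single-node setup, downloads all available PG extensions and Docker, and presets a series of databases ready for software to use out of the box

The rich template is designed specifically for business software that uses PostgreSQL as its database. If you want to run business software backed by PG through Docker on a single machine, such as Odoo, Gitea, or Wiki.js, consider this template.

Overview

Config name: rich
Node count: single node
Description: on top of meta, downloads all available PG extensions and Docker, stores PG backups in MinIO, and presets a series of databases ready for software to use out of the box
Content: pigsty/conf/rich.yml
Supported OS: el8, el9, d12, u22, u24
Supported architectures: x86_64
Related configs: meta
Vagrant: pigsty/vagrant/spec/meta.rb

This template uses a single-node deployment and enhances the meta template as follows:

When building the local software repo, it downloads the Docker packages (docker-ce, docker-compose-plugin).
When building the local software repo, it downloads all extensions available for PostgreSQL 17 on the current x86_64 operating system distribution.
It uses an optional single-node MinIO instead of the local filesystem to store PostgreSQL backups.
It presets a series of PG business databases and business users, ready to use with the Docker software templates.
It adds two tiny Redis instances as a standalone primary-replica pair.

Usage: pass the -c rich argument during configure:

./configure -c rich [-i <primary_ip>]

Content

Source file: pigsty/conf/rich.yml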

all:

  #==============================================================#
  # Clusters, Nodes, and Modules
  #==============================================================#
  children:

    #----------------------------------#
    # infra: monitor, alert, repo, etc..
    #----------------------------------#
    infra:
      hosts:
        10.10.10.10: { infra_seq: 1 }
      vars:
        docker_enabled: true # enabled docker with ./docker.yml (also add docker to repo)
        #docker_registry_mirrors: ["https://docker.m.daocloud.io"]

    #----------------------------------#
    # etcd cluster for HA postgres DCS
    #----------------------------------#
    etcd:
      hosts:
        10.10.10.10: { etcd_seq: 1 }
      vars:
        etcd_cluster: etcd

    #----------------------------------#
    # minio (OPTIONAL backup repo)
    #----------------------------------#
    minio:
      hosts:
        10.10.10.10: { minio_seq: 1 }
      vars:
        minio_cluster: minio

    #----------------------------------#
    # pgsql (singleton on current node)
    #----------------------------------#
    # postgres cluster: pg-meta
    pg-meta:
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars:
        pg_cluster: pg-meta
        pg_users:
          - {name: dbuser_meta     ,password: DBUser.Meta     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: pigsty admin user }
          - {name: dbuser_view     ,password: DBUser.Viewer   ,pgbouncer: true ,roles: [dbrole_readonly] ,comment: read-only viewer for meta database }
          - {name: dbuser_grafana  ,password: DBUser.Grafana  ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for grafana database    }
          - {name: dbuser_bytebase ,password: DBUser.Bytebase ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for bytebase database   }
          - {name: dbuser_kong     ,password: DBUser.Kong     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for kong api gateway    }
          - {name: dbuser_gitea    ,password: DBUser.Gitea    ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for gitea service       }
          - {name: dbuser_wiki     ,password: DBUser.Wiki     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for wiki.js service     }
          - {name: dbuser_noco     ,password: DBUser.Noco     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for nocodb service      }
          - {name: dbuser_odoo     ,password: DBUser.Odoo     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for odoo service ,createdb: true} #,superuser: true}
        pg_databases:
          - {name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty] ,extensions: [{name: vector},{name: postgis},{name: timescaledb}]}
          - {name: grafana  ,owner: dbuser_grafana  ,revokeconn: true ,comment: grafana primary database  }
          - {name: bytebase ,owner: dbuser_bytebase ,revokeconn: true ,comment: bytebase primary database }
          - {name: kong     ,owner: dbuser_kong     ,revokeconn: true ,comment: kong api gateway database }
          - {name: gitea    ,owner: dbuser_gitea    ,revokeconn: true ,comment: gitea meta database }
          - {name: wiki     ,owner: dbuser_wiki     ,revokeconn: true ,comment: wiki meta database  }
          - {name: noco     ,owner: dbuser_noco     ,revokeconn: true ,comment: nocodb database     }
          #- {name: odoo     ,owner: dbuser_odoo     ,revokeconn: true ,comment: odoo main database  }
        pg_hba_rules:
          - {user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes'}
        pg_libs: 'timescaledb,pg_stat_statements, auto_explain'  # add timescaledb to shared_preload_libraries
        node_crontab:  # make one full backup 1 am everyday
          - '00 01 * * * postgres /pg/bin/pg-backup full'

    redis-ms: # redis classic primary & replica
      hosts: { 10.10.10.10: { redis_node: 1 , redis_instances: { 6379: { }, 6380: { replica_of: '10.10.10.10 6379' } } } }
      vars: { redis_cluster: redis-ms ,redis_password: 'redis.ms' ,redis_max_memory: 64MB }


  vars:                               # global variables
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default|china|europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml
    proxy_env:                        # global proxy env when downloading packages
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.myqcloud.com,*.tsinghua.edu.cn"
      # http_proxy:  # set your proxy here: e.g http://user:pass@proxy.xxx.com
      # https_proxy: # set your proxy here: e.g http://user:pass@proxy.xxx.com
      # all_proxy:   # set your proxy here: e.g http://user:pass@proxy.xxx.com
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }
      minio        : { domain: m.pigsty    ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }
      postgrest    : { domain: api.pigsty  ,endpoint: "127.0.0.1:8884" }
      pgadmin      : { domain: adm.pigsty  ,endpoint: "127.0.0.1:8885" }
      pgweb        : { domain: cli.pigsty  ,endpoint: "127.0.0.1:8886" }
      bytebase     : { domain: ddl.pigsty  ,endpoint: "127.0.0.1:8887" }
      jupyter      : { domain: lab.pigsty  ,endpoint: "127.0.0.1:8888", websocket: true }
      gitea        : { domain: git.pigsty  ,endpoint: "127.0.0.1:8889" }
      wiki         : { domain: wiki.pigsty ,endpoint: "127.0.0.1:9002" }
      noco         : { domain: noco.pigsty ,endpoint: "127.0.0.1:9003" }
      supa         : { domain: supa.pigsty ,endpoint: "10.10.10.10:8000", websocket: true }
      dify         : { domain: dify.pigsty ,endpoint: "10.10.10.10:8001", websocket: true }
      odoo         : { domain: odoo.pigsty, endpoint: "127.0.0.1:8069"  , websocket: true }
    nginx_navbar:                    # application nav links on home page
      - { name: PgAdmin4   , url : 'http://adm.pigsty'  , comment: 'PgAdmin4 for PostgreSQL'  }
      - { name: PGWeb      , url : 'http://cli.pigsty'  , comment: 'PGWEB Browser Client'     }
      - { name: ByteBase   , url : 'http://ddl.pigsty'  , comment: 'ByteBase Schema Migrator' }
      - { name: PostgREST  , url : 'http://api.pigsty'  , comment: 'Kong API Gateway'         }
      - { name: Gitea      , url : 'http://git.pigsty'  , comment: 'Gitea Git Service'        }
      - { name: Minio      , url : 'https://m.pigsty'   , comment: 'Minio Object Storage'     }
      - { name: Wiki       , url : 'http://wiki.pigsty' , comment: 'Local Wikipedia'          }
      - { name: Noco       , url : 'http://noco.pigsty' , comment: 'Nocodb Example'           }
      - { name: Odoo       , url : 'http://odoo.pigsty' , comment: 'Odoo - the OpenERP'       }
      - { name: Explain    , url : '/pigsty/pev.html'   , comment: 'pgsql explain visualizer' }
      - { name: Package    , url : '/pigsty'            , comment: 'local yum repo packages'  }
      - { name: PG Logs    , url : '/logs'              , comment: 'postgres raw csv logs'    }
      - { name: Schemas    , url : '/schema'            , comment: 'schemaspy summary report' }
      - { name: Reports    , url : '/report'            , comment: 'pgbadger summary report'  }

    #----------------------------------#
    # MinIO Related Options
    #----------------------------------#
    #minio_users:                      # and configure `pgbackrest_repo` & `minio_users` accordingly
    #  - { access_key: dba , secret_key: S3User.DBA, policy: consoleAdmin }
    #  - { access_key: pgbackrest , secret_key: S3User.Backup, policy: readwrite }
    #pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
    #  minio: ...                      # optional minio repo for pgbackrest ...
    #    s3_key: pgbackrest            # minio user access key for pgbackrest
    #    s3_key_secret: S3User.Backup  # minio user secret key for pgbackrest
    #    cipher_pass: pgBackRest       # AES encryption password, default is 'pgBackRest'
    # if you want to use minio as backup repo instead of 'local' fs, uncomment this, and configure `pgbackrest_repo`
    pgbackrest_method: minio          # use minio as backup repo instead of 'local'
    node_etc_hosts: [ "${admin_ip} sss.pigsty" ]
    dns_records: [ "${admin_ip} api.pigsty adm.pigsty cli.pigsty ddl.pigsty lab.pigsty git.pigsty wiki.pigsty noco.pigsty supa.pigsty dify.pigsty odoo.pigsty" ]

    #----------------------------------#
    # Credential: CHANGE THESE PASSWORDS
    #----------------------------------#
    #grafana_admin_username: admin
    grafana_admin_password: pigsty
    #pg_admin_username: dbuser_dba
    pg_admin_password: DBUser.DBA
    #pg_monitor_username: dbuser_monitor
    pg_monitor_password: DBUser.Monitor
    #pg_replication_username: replicator
    pg_replication_password: DBUser.Replicator
    #patroni_username: postgres
    patroni_password: Patroni.API
    #haproxy_admin_username: admin
    haproxy_admin_password: pigsty
    #minio_access_key: minioadmin
    minio_secret_key: minioadmin

    #----------------------------------#
    # Safe Guard
    #----------------------------------#
    # you can enable these flags after bootstrap, to prevent purging running etcd / pgsql instances
    etcd_safeguard: false             # prevent purging running etcd instance?
    pg_safeguard: false               # prevent purging running postgres instance? false by default

    #----------------------------------#
    # Repo, Node, Packages
    #----------------------------------#
    # if you wish to customize your own repo, change these settings:
    repo_modules: infra,node,pgsql,docker
    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-common ,docker]
    repo_extra_packages: [ pg17-main ,pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-olap ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl]
    pg_version: 17                    # default postgres version
    pg_extensions: [pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl ] #,pg17-olap]
...
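
The dns_records entry in the template above lists, by hand, the application domains declared in infra_portal. The following is a minimal Python sketch of how such a record could be derived from the portal definition. The helper name build_dns_record and the trimmed portal dictionary are illustrative assumptions and are not part of Pigsty.

```python
def build_dns_record(admin_ip: str, portal: dict) -> str:
    """Join every portal domain into one '<ip> <domain> ...' record."""
    # Entries without a `domain` key (e.g. blackbox, loki) are skipped.
    domains = [entry["domain"] for entry in portal.values() if "domain" in entry]
    return " ".join([admin_ip, *domains])


# A trimmed, hypothetical subset of the infra_portal entries shown above.
portal = {
    "postgrest": {"domain": "api.pigsty", "endpoint": "127.0.0.1:8884"},
    "pgadmin": {"domain": "adm.pigsty", "endpoint": "127.0.0.1:8885"},
    "blackbox": {"endpoint": "10.10.10.10:9115"},
}

record = build_dns_record("10.10.10.10", portal)
print(record)
```

Generating the record from the portal definition keeps the two lists from drifting apart when an application is added or removed.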

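Notes

Not all extensions are available on the aarch64 (arm64) architecture, so when deploying on ARM, add only the extensions you need, and do so with care.

To change the extension set, refer to the extension alias list and replace the wildcard package aliases such as pg17-core, pg17-time, and so on.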

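4.4 - Single Node: pitr

A single node that uses remote cloud object storage for continuous backup and PITR, ensuring a basic level of RTO/RPO.

The pitr template demonstrates how, with only a single EC2 / ECS server in the cloud, you can use object storage as a last-resort disaster recovery safeguard for the database.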

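Configuration Overview

Config name: pitr
Node count: single node, pigsty/vagrant/spec/meta.rb
Description: a single node that uses remote cloud object storage for continuous backup and PITR, ensuring a basic level of RTO/RPO.
Content: pigsty/conf/pitr.yml
OS distros: el8, el9, d12, u22, u24
Architectures: x86_64, aarch64
Related: meta
Terraform template (Aliyun): terraform/spec/aliyun-meta-s3.tf

Usage: pass the -c pitr argument during configure:

./configure -c pitr [-i <primary_ip>]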

Configuration Content

Source file: pigsty/conf/pitr.yml

# This 1-node template uses an external S3 (OSS) bucket as backup storage,
# which provides a basic level of RTO / RPO in case of a single point of failure
# terraform template: terraform/spec/aliyun-meta-s3.tf

all:

  #==============================================================#
  # Clusters, Nodes, and Modules
  #==============================================================#
  children:

    #----------------------------------#
    # infra: monitor, alert, repo, etc..
    #----------------------------------#
    infra:
      hosts:
        10.10.10.10: { infra_seq: 1 }

    #----------------------------------#
    # etcd cluster for HA postgres DCS
    #----------------------------------#
    etcd:
      hosts:
        10.10.10.10: { etcd_seq: 1 }
      vars:
        etcd_cluster: etcd

    #----------------------------------#
    # minio (OPTIONAL backup repo)
    #----------------------------------#
    #minio:
    #  hosts:
    #    10.10.10.10: { minio_seq: 1 }
    #  vars:
    #    minio_cluster: minio

    #----------------------------------#
    # pgsql (singleton on current node)
    #----------------------------------#
    # this is an example single-node postgres cluster with pgvector installed, with one biz database & two biz users
    pg-meta:
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: primary }
      vars:
        pg_cluster: pg-meta                 # required identity parameter, usually same as group name

        # define business databases here: https://pigsty.cc/docs/pgsql/db/
        pg_databases:                       # define business databases on this cluster, array of database definition
          - name: meta                      # REQUIRED, `name` is the only mandatory field of a database definition
            baseline: cmdb.sql              # optional, database sql baseline path, (relative path among ansible search path, e.g: files/)
            schemas: [ pigsty ]             # optional, additional schemas to be created, array of schema names
            extensions:                     # optional, additional extensions to be installed: array of `{name[,schema]}`
              - { name: vector }            # install pgvector extension on this database by default
            comment: pigsty meta database   # optional, comment string for this database
            #pgbouncer: true                # optional, add this database to pgbouncer database list? true by default
            #owner: postgres                # optional, database owner, postgres by default
            #template: template1            # optional, which template to use, template1 by default
            #encoding: UTF8                 # optional, database encoding, UTF8 by default. (MUST same as template database)
            #locale: C                      # optional, database locale, C by default.  (MUST same as template database)
            #lc_collate: C                  # optional, database collate, C by default. (MUST same as template database)
            #lc_ctype: C                    # optional, database ctype, C by default.   (MUST same as template database)
            #tablespace: pg_default         # optional, default tablespace, 'pg_default' by default.
            #allowconn: true                # optional, allow connection, true by default. false will disable connect at all
            #revokeconn: false              # optional, revoke public connection privilege. false by default. (leave connect with grant option to owner)
            #register_datasource: true      # optional, register this database to grafana datasources? true by default
            #connlimit: -1                  # optional, database connection limit, default -1 disable limit
            #pool_auth_user: dbuser_meta    # optional, all connection to this pgbouncer database will be authenticated by this user
            #pool_mode: transaction         # optional, pgbouncer pool mode at database level, default transaction
            #pool_size: 64                  # optional, pgbouncer pool size at database level, default 64
            #pool_size_reserve: 32          # optional, pgbouncer pool size reserve at database level, default 32
            #pool_size_min: 0               # optional, pgbouncer pool size min at database level, default 0
            #pool_max_db_conn: 100          # optional, max database connections at database level, default 100
          #- { name: grafana  ,owner: dbuser_grafana  ,revokeconn: true ,comment: grafana primary database }  # define another database

        # define business users here: https://pigsty.cc/docs/pgsql/user/
        pg_users:                           # define business users/roles on this cluster, array of user definition
          - name: dbuser_meta               # REQUIRED, `name` is the only mandatory field of a user definition
            password: DBUser.Meta           # optional, password, can be a scram-sha-256 hash string or plain text
            login: true                     # optional, can log in, true by default  (new biz ROLE should be false)
            superuser: false                # optional, is superuser? false by default
            createdb: false                 # optional, can create database? false by default
            createrole: false               # optional, can create role? false by default
            inherit: true                   # optional, can this role use inherited privileges? true by default
            replication: false              # optional, can this role do replication? false by default
            bypassrls: false                # optional, can this role bypass row level security? false by default
            pgbouncer: true                 # optional, add this user to pgbouncer user-list? false by default (production user should be true explicitly)
            connlimit: -1                   # optional, user connection limit, default -1 disable limit
            expire_in: 3650                 # optional, now + n days when this role is expired (OVERWRITE expire_at)
            expire_at: '2030-12-31'         # optional, YYYY-MM-DD 'timestamp' when this role is expired  (OVERWRITTEN by expire_in)
            comment: pigsty admin user      # optional, comment string for this user/role
            roles: [dbrole_admin]           # optional, belonged roles. default roles are: dbrole_{admin,readonly,readwrite,offline}
            parameters: {}                  # optional, role level parameters with `ALTER ROLE SET`
            pool_mode: transaction          # optional, pgbouncer pool mode at user level, transaction by default
            pool_connlimit: -1              # optional, max database connections at user level, default -1 disable limit
          - { name: dbuser_view ,password: DBUser.Viewer ,pgbouncer: true ,roles: [dbrole_readonly], comment: read-only viewer for meta database }

        # define pg extensions: https://pigsty.cc/ext/
        pg_libs: 'pg_stat_statements, auto_explain' # shared_preload_libraries (timescaledb is not preloaded in this template)
        pg_extensions: [ pgvector ] # available extensions: https://pigsty.cc/ext/list

        # define HBA rules here: https://pigsty.cc/docs/pgsql/hba/#define-hba
        pg_hba_rules:                       # example hba rules
          - {user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes'}

        node_crontab:  # make a full backup on monday 1am, and an incremental backup at 1am on the other days
          - '00 01 * * 1 postgres /pg/bin/pg-backup full'
          - '00 01 * * 2,3,4,5,6,7 postgres /pg/bin/pg-backup'


  #==============================================================#
  # Global Parameters
  #==============================================================#
  vars:

    #----------------------------------#
    # Meta Data
    #----------------------------------#
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default|china|europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml
    proxy_env:                        # global proxy env when downloading packages
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.myqcloud.com,*.tsinghua.edu.cn"
      # http_proxy:  # set your proxy here: e.g http://user:pass@proxy.xxx.com
      # https_proxy: # set your proxy here: e.g http://user:pass@proxy.xxx.com
      # all_proxy:   # set your proxy here: e.g http://user:pass@proxy.xxx.com
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }

    #----------------------------------#
    # MinIO Related Options
    #----------------------------------#
    # ADD YOUR AK/SK/REGION/ENDPOINT HERE
    pgbackrest_method: s3             # use the 's3' repo defined below instead of the 'local' fs
    pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
      s3:                             # aliyun oss (s3 compatible) object storage service
        type: s3                      # oss is s3-compatible
        s3_endpoint: oss-cn-beijing-internal.aliyuncs.com
        s3_region: oss-cn-beijing
        s3_bucket: <your_bucket_name>
        s3_key: <your_access_key>
        s3_key_secret: <your_secret_key>
        s3_uri_style: host
        path: /pgbackrest
        bundle: y                     # bundle small files into a single file
        cipher_type: aes-256-cbc      # enable AES encryption for remote backup repo
        cipher_pass: PG.${pg_cluster} # AES encryption password, default is 'pgBackRest'
        retention_full_type: time     # retain full backups by time on the s3 repo
        retention_full: 14            # keep full backup for last 14 days

    #----------------------------------#
    # Credential: CHANGE THESE PASSWORDS
    #----------------------------------#
    #grafana_admin_username: admin
    grafana_admin_password: pigsty
    #pg_admin_username: dbuser_dba
    pg_admin_password: DBUser.DBA
    #pg_monitor_username: dbuser_monitor
    pg_monitor_password: DBUser.Monitor
    #pg_replication_username: replicator
    pg_replication_password: DBUser.Replicator
    #patroni_username: postgres
    patroni_password: Patroni.API
    #haproxy_admin_username: admin
    haproxy_admin_password: pigsty

    #----------------------------------#
    # Safe Guard
    #----------------------------------#
    # you can enable these flags after bootstrap, to prevent purging running etcd / pgsql instances
    etcd_safeguard: false             # prevent purging running etcd instance?
    pg_safeguard: false               # prevent purging running postgres instance? false by default

    #----------------------------------#
    # Repo, Node, Packages
    #----------------------------------#
    # if you wish to customize your own repo, change these settings:
    repo_modules: infra,node,pgsql
    repo_remove: true                 # remove existing repo on admin node during repo bootstrap
    node_repo_modules: local          # install the local module in repo_upstream for all nodes
    node_repo_remove: true            # remove existing node repo for node managed by pigsty
    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-common ] #,docker]
    repo_extra_packages: [ pg17-main ] #,pg17-core ,pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-olap ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl]
    pg_version: 17                    # default postgres version
    #pg_extensions: [pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl ] #,pg17-olap]
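
The two node_crontab entries in the template above run at 01:00 every day: a full backup when the cron day-of-week field is 1 (Monday), and an incremental backup on days 2 through 7. The sketch below, in Python, maps a calendar date to the backup type that would run. It assumes the common cron convention in which 7 means Sunday, and backup_type is a hypothetical helper used only for illustration.

```python
from datetime import date


def backup_type(day: date) -> str:
    """Return which pg-backup run the pitr crontab triggers at 01:00 on `day`."""
    # isoweekday() numbers Monday..Sunday as 1..7, matching the cron entries above.
    return "full" if day.isoweekday() == 1 else "incremental"


# One sample week, Monday 2025-01-06 through Sunday 2025-01-12.
week = [date(2025, 1, 6 + offset) for offset in range(7)]
schedule = [backup_type(day) for day in week]
print(schedule)
```

With this schedule, a restore never needs more than the latest weekly full backup, the incremental backups taken since then, and the archived WAL.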

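Notes

You need to fill in the access information for your object storage bucket in pgbackrest_repo: the bucket name (s3_bucket), the access key (s3_key), and the secret key (s3_key_secret). Adjust s3_endpoint and s3_region as well if your bucket is not in the region shown in the template.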
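
Before running the playbooks, it can help to confirm that no placeholder is left in the repo definition. The following is a minimal Python sketch, assuming the repo definition has already been loaded into a dictionary. The unfilled_keys helper is hypothetical and is not a Pigsty command.

```python
import re

# Matches values that are still a template placeholder such as <your_bucket_name>.
PLACEHOLDER = re.compile(r"^<[^<>]+>$")


def unfilled_keys(repo: dict) -> list:
    """Return the keys whose value is still an <angle_bracket> placeholder."""
    return [
        key for key, value in repo.items()
        if isinstance(value, str) and PLACEHOLDER.match(value)
    ]


# Values copied from the pitr template, with the placeholders left untouched.
s3_repo = {
    "type": "s3",
    "s3_endpoint": "oss-cn-beijing-internal.aliyuncs.com",
    "s3_region": "oss-cn-beijing",
    "s3_bucket": "<your_bucket_name>",
    "s3_key": "<your_access_key>",
    "s3_key_secret": "<your_secret_key>",
}

missing = unfilled_keys(s3_repo)
print(missing)
```

An empty result means every placeholder has been replaced. It does not verify that the credentials themselves are valid.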

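4.5 - Single Node: demo

The configuration template used by the public Pigsty demo. It shows how to expose a website publicly, configure SSL certificates, and install all extensions.

The demo template is the sample configuration file used by the public Pigsty demo.

If you want to host your own website on a single cloud server, you can use this template as a reference. It shows how to expose a website publicly, configure SSL certificates, and install all extensions.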

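Configuration Overview

Config name: demo
Node count: single node, pigsty/vagrant/spec/meta.rb
Description: builds on meta by downloading all available PG extensions and Docker, using MinIO to store PG backups, and pre-provisioning a set of databases for applications so they work out of the box
Content: pigsty/conf/demo.yml
OS distros: el8, el9, d12, u22, u24
Architectures: x86_64
Related: meta, rich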

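This template deploys on a single node and extends the meta template as follows:

Downloads the Docker packages (docker-ce, docker-compose-plugin) when building the local software repo.
Downloads every PostgreSQL 17 extension available for the current x86_64 OS distribution when building the local software repo.
Installs all downloaded PostgreSQL 17 extensions in the default pg-meta cluster.
Explicitly sets the node time zone and uses NTP servers in the China region.
Deploys MinIO but does not use it, to avoid wasting storage in the demo environment.
Pre-provisions a set of PG business databases and business users that the Docker application templates can use out of the box.
Adds three tiny Redis instances in a standalone primary-replica setup (one primary, two replicas).
Adds a Mongo-compatible cluster based on FerretDB.
Adds a sample Kafka cluster.

Usage: pass the -c demo argument during configure: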

./configure -c demo [-i <primary_ip>]

Configuration Content

Source file: pigsty/conf/demo.yml


all:
  children:

    # infra cluster for proxy, monitor, alert, etc..
    infra:
      hosts: { 10.10.10.10: { infra_seq: 1 } }
      vars:
        nodename: pigsty.cc       # overwrite the default hostname
        node_id_from_pg: false    # do not use the pg identity as hostname
        docker_enabled: true      # enable docker on this node
        docker_registry_mirrors: ["https://mirror.ccs.tencentyun.com", "https://docker.m.daocloud.io"]
        # ./pgsql-monitor.yml -l infra     # monitor 'external' PostgreSQL instance
        pg_exporters:             # treat local postgres as RDS for demonstration purpose
          20001: { pg_cluster: pg-foo, pg_seq: 1, pg_host: 10.10.10.10 }
          #20002: { pg_cluster: pg-bar, pg_seq: 1, pg_host: 10.10.10.11 , pg_port: 5432 }
          #20003: { pg_cluster: pg-bar, pg_seq: 2, pg_host: 10.10.10.12 , pg_exporter_url: 'postgres://dbuser_monitor:DBUser.Monitor@10.10.10.12:5432/postgres?sslmode=disable' }
          #20004: { pg_cluster: pg-bar, pg_seq: 3, pg_host: 10.10.10.13 , pg_monitor_username: dbuser_monitor, pg_monitor_password: DBUser.Monitor }

    # etcd cluster for ha postgres
    etcd: { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }

    # minio cluster, s3 compatible object storage
    minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

    # postgres example cluster: pg-meta
    pg-meta:
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars:
        pg_cluster: pg-meta
        pg_users:
          - {name: dbuser_meta       ,password: DBUser.Meta       ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: pigsty admin user }
          - {name: dbuser_view       ,password: DBUser.Viewer     ,pgbouncer: true ,roles: [dbrole_readonly] ,comment: read-only viewer for meta database }
          - {name: dbuser_grafana    ,password: DBUser.Grafana    ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for grafana database    }
          - {name: dbuser_bytebase   ,password: DBUser.Bytebase   ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for bytebase database   }
          - {name: dbuser_kong       ,password: DBUser.Kong       ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for kong api gateway    }
          - {name: dbuser_gitea      ,password: DBUser.Gitea      ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for gitea service       }
          - {name: dbuser_wiki       ,password: DBUser.Wiki       ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for wiki.js service     }
          - {name: dbuser_noco       ,password: DBUser.Noco       ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for nocodb service      }
          - {name: dbuser_odoo       ,password: DBUser.Odoo       ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for odoo service ,createdb: true } #,superuser: true}
          - {name: dbuser_mattermost ,password: DBUser.MatterMost ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for mattermost ,createdb: true }
        pg_databases:
          - {name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty] ,extensions: [{name: vector},{name: postgis},{name: timescaledb}]}
          - {name: grafana  ,owner: dbuser_grafana  ,revokeconn: true ,comment: grafana primary database  }
          - {name: bytebase ,owner: dbuser_bytebase ,revokeconn: true ,comment: bytebase primary database }
          - {name: kong     ,owner: dbuser_kong     ,revokeconn: true ,comment: kong api gateway database }
          - {name: gitea    ,owner: dbuser_gitea    ,revokeconn: true ,comment: gitea meta database }
          - {name: wiki     ,owner: dbuser_wiki     ,revokeconn: true ,comment: wiki meta database  }
          - {name: noco     ,owner: dbuser_noco     ,revokeconn: true ,comment: nocodb database     }
          #- {name: odoo     ,owner: dbuser_odoo     ,revokeconn: true ,comment: odoo main database  }
          - {name: mattermost ,owner: dbuser_mattermost ,revokeconn: true ,comment: mattermost main database }
        pg_hba_rules:
          - {user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes'}
        pg_libs: 'timescaledb,pg_stat_statements, auto_explain'  # add timescaledb to shared_preload_libraries
        pg_extensions: # extensions to be installed on this cluster
          - timescaledb timescaledb_toolkit pg_timeseries periods temporal_tables emaj table_version pg_cron pg_task pg_later pg_background
          - postgis pgrouting pointcloud pg_h3 q3c ogr_fdw geoip pg_polyline pg_geohash #mobilitydb
          - pgvector vchord pgvectorscale pg_vectorize pg_similarity smlar pg_summarize pg_tiktoken pg4ml #pgml
          - pg_search pgroonga pg_bigm zhparser pg_bestmatch vchord_bm25 hunspell
          - citus hydra pg_analytics pg_duckdb pg_mooncake duckdb_fdw pg_parquet pg_fkpart pg_partman plproxy #pg_strom
          - age hll rum pg_graphql pg_jsonschema jsquery pg_hint_plan hypopg index_advisor pg_plan_filter imgsmlr pg_ivm pg_incremental pgmq pgq pg_cardano omnigres #rdkit
          - pg_tle plv8 pllua plprql pldebugger plpgsql_check plprofiler plsh pljava #plr #pgtap #faker #dbt2
          - pg_prefix pg_semver pgunit pgpdf pglite_fusion md5hash asn1oid roaringbitmap pgfaceting pgsphere pg_country pg_xenophile pg_currency pg_collection pgmp numeral pg_rational pguint pg_uint128 hashtypes ip4r pg_uri pgemailaddr pg_acl timestamp9 chkpass #pg_duration #debversion #pg_rrule
          - pg_gzip pg_bzip pg_zstd pg_http pg_net pg_curl pgjq pgjwt pg_smtp_client pg_html5_email_address url_encode pgsql_tweaks pg_extra_time pgpcre icu_ext pgqr pg_protobuf envvar floatfile pg_readme ddl_historization data_historization pg_schedoc pg_hashlib pg_xxhash shacrypt cryptint pg_ecdsa pgsparql
          - pg_idkit pg_uuidv7 permuteseq pg_hashids sequential_uuids topn quantile lower_quantile count_distinct omnisketch ddsketch vasco pgxicor tdigest first_last_agg extra_window_functions floatvec aggs_for_vecs aggs_for_arrays pg_arraymath pg_math pg_random pg_base36 pg_base62 pg_base58 pg_financial
          - pg_repack pg_squeeze pg_dirtyread pgfincore pg_cooldown pg_ddlx pg_prioritize pg_checksums pg_readonly pg_upless pg_permissions pgautofailover pg_catcheck preprepare pgcozy pg_orphaned pg_crash pg_cheat_funcs pg_fio pg_savior safeupdate pg_drop_events table_log #pgagent #pgpool
          - pg_profile pg_tracing pg_show_plans pg_stat_kcache pg_stat_monitor pg_qualstats pg_store_plans pg_track_settings pg_wait_sampling system_stats pg_meta pgnodemx pg_sqlog bgw_replstatus pgmeminfo toastinfo pg_explain_ui pg_relusage pagevis powa
          - passwordcheck supautils pgsodium pg_vault pg_session_jwt pg_anon pg_tde pgsmcrypto pgaudit pgauditlogtofile pg_auth_mon credcheck pgcryptokey pg_jobmon logerrors login_hook set_user pg_snakeoil pgextwlist pg_auditor sslutils pg_noset
          - wrappers multicorn odbc_fdw jdbc_fdw mysql_fdw tds_fdw sqlite_fdw pgbouncer_fdw mongo_fdw redis_fdw pg_redis_pubsub kafka_fdw hdfs_fdw firebird_fdw aws_s3 log_fdw #oracle_fdw #db2_fdw
          - documentdb orafce pgtt session_variable pg_statement_rollback pg_dbms_metadata pg_dbms_lock pgmemcache #pg_dbms_job #wiltondb
          - pglogical pglogical_ticker pgl_ddl_deploy pg_failover_slots db_migrator wal2json wal2mongo decoderbufs decoder_raw mimeo pg_fact_loader pg_bulkload #repmgr

    redis-ms: # redis classic primary & replica
      hosts: { 10.10.10.10: { redis_node: 1 , redis_instances: { 6379: { }, 6380: { replica_of: '10.10.10.10 6379' }, 6381: { replica_of: '10.10.10.10 6379' } } } }
      vars: { redis_cluster: redis-ms ,redis_password: 'redis.ms' ,redis_max_memory: 64MB }

    # ./mongo.yml -l pg-mongo
    pg-mongo:
      hosts: { 10.10.10.10: { mongo_seq: 1 } }
      vars:
        mongo_cluster: pg-mongo
        mongo_pgurl: 'postgres://dbuser_meta:DBUser.Meta@10.10.10.10:5432/grafana'

    # ./kafka.yml -l kf-main
    kf-main:
      hosts: { 10.10.10.10: { kafka_seq: 1, kafka_role: controller } }
      vars:
        kafka_cluster: kf-main
        kafka_peer_port: 29093 # 9093 is occupied by alertmanager


  vars:                               # global variables
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: china                     # upstream mirror region: default|china|europe
    infra_portal:                     # domain names and upstream servers
      home         : { domain: home.pigsty.cc }
      cc           : { domain: pigsty.cc      ,path:     "/www/pigsty.cc"   ,cert: /etc/cert/pigsty.cc.crt ,key: /etc/cert/pigsty.cc.key }
      grafana      : { domain: demo.pigsty.cc ,endpoint: "${admin_ip}:3000" ,websocket: true ,cert: /etc/cert/demo.pigsty.cc.crt ,key: /etc/cert/demo.pigsty.cc.key }
      prometheus   : { domain: p.pigsty.cc    ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty.cc    ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }
      minio        : { domain: m.pigsty.cc    ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }
      postgrest    : { domain: api.pigsty.cc  ,endpoint: "127.0.0.1:8884"   }
      pgadmin      : { domain: adm.pigsty.cc  ,endpoint: "127.0.0.1:8885"   }
      pgweb        : { domain: cli.pigsty.cc  ,endpoint: "127.0.0.1:8886"   }
      bytebase     : { domain: ddl.pigsty.cc  ,endpoint: "127.0.0.1:8887"   }
      jupyter      : { domain: lab.pigsty.cc  ,endpoint: "127.0.0.1:8888", websocket: true }
      gitea        : { domain: git.pigsty.cc  ,endpoint: "127.0.0.1:8889" }
      wiki         : { domain: wiki.pigsty.cc ,endpoint: "127.0.0.1:9002" }
      noco         : { domain: noco.pigsty.cc ,endpoint: "127.0.0.1:9003" }
      supa         : { domain: supa.pigsty.cc ,endpoint: "10.10.10.10:8000" ,websocket: true }
      dify         : { domain: dify.pigsty.cc ,endpoint: "10.10.10.10:8001" ,websocket: true }
      odoo         : { domain: odoo.pigsty.cc ,endpoint: "127.0.0.1:8069"   ,websocket: true }
      mm           : { domain: mm.pigsty.cc   ,endpoint: "10.10.10.10:8065" ,websocket: true }
    # scp -r ~/pgsty/cc/cert/*       pj:/etc/cert/       # copy https certs
    # scp -r ~/dev/pigsty.cc/public  pj:/www/pigsty.cc   # copy pigsty.cc website

    nginx_navbar:                     # application nav links on home page
      - { name: PgAdmin4   , url: 'http://adm.pigsty.cc'           , comment: 'PgAdmin4 for PostgreSQL'     }
      - { name: PGWeb      , url: 'http://cli.pigsty.cc'           , comment: 'PGWEB Browser Client'        }
      - { name: Jupyter    , url: 'http://lab.pigsty.cc'           , comment: 'Jupyter Notebook WebUI'      }
      - { name: ByteBase   , url: 'http://ddl.pigsty.cc'           , comment: 'ByteBase Schema Migrator'    }
      - { name: PostgREST  , url: 'http://api.pigsty.cc'           , comment: 'Kong API Gateway'            }
      - { name: Gitea      , url: 'http://git.pigsty.cc'           , comment: 'Gitea Git Service'           }
      - { name: Minio      , url: 'http://sss.pigsty.cc'           , comment: 'Minio Object Storage'        }
      - { name: Wiki       , url: 'http://wiki.pigsty.cc'          , comment: 'Local Wikipedia'             }
      - { name: Nocodb     , url: 'http://noco.pigsty.cc'          , comment: 'Nocodb Example'              }
      - { name: Odoo       , url: 'http://odoo.pigsty.cc'          , comment: 'Odoo - the OpenERP'          }
      - { name: Dify       , url: 'http://dify.pigsty.cc'          , comment: 'Dify - the LLM OPS'          }
      - { name: Explain    , url: '/pigsty/pev.html'               , comment: 'postgres explain visualizer' }
      - { name: Package    , url: '/pigsty'                        , comment: 'local yum repo packages'     }
      - { name: PG Logs    , url: '/logs'                          , comment: 'postgres raw csv logs'       }
      - { name: Schemas    , url: '/schema'                        , comment: 'schemaspy summary report'    }
      - { name: Reports    , url: '/report'                        , comment: 'pgbadger summary report'     }
      - { name: ISD        , url: '${grafana}/d/isd-overview'      , comment: 'noaa isd data visualization' }
      - { name: Covid      , url: '${grafana}/d/covid-overview'    , comment: 'covid data visualization'    }
      - { name: Worktime   , url: '${grafana}/d/worktime-overview' , comment: 'worktime query'              }
      - { name: DBTrend    , url: '${grafana}/d/dbeng-trending'    , comment: 'DB Engine Trending Graph'    }

    node_etc_hosts: [ "${admin_ip} sss.pigsty" ]
    node_timezone: Asia/Hong_Kong
    node_ntp_servers:
      - pool cn.pool.ntp.org iburst
      - pool ${admin_ip} iburst       # assume non-admin nodes do not have internet access
    pgbackrest_enabled: false         # do not take backups since this is disposable demo env
    #prometheus_options: '--storage.tsdb.retention.time=15d' # prometheus extra server options
    prometheus_options: '--storage.tsdb.retention.size=3GB' # keep 3GB data at most on demo env

    # download docker and pg17 extensions
    repo_modules: infra,node,pgsql,docker
    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-common ,docker]
    repo_extra_packages: [ pg17-main ] #,pg17-core ,pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-olap ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl]
    pg_version: 17                    # default postgres version
    pg_extensions: [pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl ] #,pg17-olap]

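Caveats

Not all extensions are available on the aarch64 (arm64) architecture, so when deploying on ARM, add only the extensions you need and confirm that each one is available.

To change the extension set, refer to the extension alias list and replace the wildcard package aliases such as pg17-core, pg17-time, and so on.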
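As a sketch of what such a replacement can look like, the fragment below narrows the wildcard alias list from the template to a few categories. The subset chosen here is purely illustrative and says nothing about which categories are actually available on aarch64; check the extension alias list for your platform before installing.

```yaml
all:
  vars:
    pg_version: 17
    repo_extra_packages: [ pg17-main ]
    # Hypothetical subset for illustration only: keep the alias categories you
    # need and drop the rest. Availability on aarch64 is an assumption to verify.
    pg_extensions: [ pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ]
```

Only the keys that differ from the template are shown; the rest of the global parameters stay as they are.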

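4.6 - Single Node: supa

Self-host single-node or four-node Supabase on Pigsty-managed PostgreSQL

The supa configuration template is a reference configuration for self-hosting Supabase.

A four-node template is also available. It shows how to use a three-node highly available PostgreSQL cluster as the underlying storage and launch two Supabase deployments, one read-write and one read-only.

For more details, see Kernel: Supabase self-hosting tutorial

Overview

Name: supa
Nodes: single node, pigsty/vagrant/spec/meta.rb
Description: Self-host single-node or four-node Supabase on Pigsty-managed PostgreSQL
Content: pigsty/conf/app/supa.yml
OS: el8, el9, d12, u22, u24
Architecture: x86_64, aarch64
Related: meta, rich

Usage: pass -c app/supa to configure: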

./configure -c app/supa [-i <primary_ip>]

You can also pass -v to choose the PostgreSQL major version to install. PostgreSQL 17, 16, or 15 is currently recommended.

Configuration

Source file: pigsty/conf/app/supa.yml

# supabase is available on el8/el9/u22/u24/d12 with pg15,16,17
# To install supabase on fresh node, run:
#
#  curl -fsSL https://repo.pigsty.cc/get | bash
# ./bootstrap               # prepare local repo & ansible
# ./configure -c app/supa   # IMPORTANT: CHANGE CREDENTIALS!!
# ./install.yml             # install pigsty & pgsql & minio
# ./docker.yml              # install docker & docker compose
# ./app.yml                 # launch supabase with docker compose

all:
  children:

    # the supabase stateless (default username & password: supabase/pigsty)
    supa:
      hosts:
        10.10.10.10: {}
      vars:
        app: supabase # specify app name (supa) to be installed (in the apps)
        apps:         # define all applications
          supabase:   # the definition of supabase app
            conf:     # override /opt/supabase/.env
              # IMPORTANT: CHANGE JWT_SECRET AND REGENERATE CREDENTIALS ACCORDINGLY!
              # https://supabase.com/docs/guides/self-hosting/docker#securing-your-services
              JWT_SECRET: your-super-secret-jwt-token-with-at-least-32-characters-long
              ANON_KEY: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyAgCiAgICAicm9sZSI6ICJhbm9uIiwKICAgICJpc3MiOiAic3VwYWJhc2UtZGVtbyIsCiAgICAiaWF0IjogMTY0MTc2OTIwMCwKICAgICJleHAiOiAxNzk5NTM1NjAwCn0.dc_X5iR_VP_qT0zsiyj_I_OZ2T9FtRU2BBNWN8Bu4GE
              SERVICE_ROLE_KEY: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyAgCiAgICAicm9sZSI6ICJzZXJ2aWNlX3JvbGUiLAogICAgImlzcyI6ICJzdXBhYmFzZS1kZW1vIiwKICAgICJpYXQiOiAxNjQxNzY5MjAwLAogICAgImV4cCI6IDE3OTk1MzU2MDAKfQ.DaYlNEoUrrEn2Ig7tqibS-PHK5vgusbcbo7X36XVt4Q
              DASHBOARD_USERNAME: supabase
              DASHBOARD_PASSWORD: pigsty

              # postgres connection string (use the correct ip and port)
              POSTGRES_HOST: 10.10.10.10
              POSTGRES_PORT: 5436             # access via the 'default' service, which always route to the primary postgres
              POSTGRES_DB: postgres
              POSTGRES_PASSWORD: DBUser.Supa  # password for supabase_admin and multiple supabase users

              # expose supabase via domain name
              SITE_URL: http://supa.pigsty                # <------- Change This to your external domain name
              API_EXTERNAL_URL: http://supa.pigsty        # <------- Otherwise the storage api may not work!
              SUPABASE_PUBLIC_URL: http://supa.pigsty     # <------- Do not forget!

              # if using s3/minio as file storage
              S3_BUCKET: supa
              S3_ENDPOINT: https://sss.pigsty:9000
              S3_ACCESS_KEY: supabase
              S3_SECRET_KEY: S3User.Supabase
              S3_FORCE_PATH_STYLE: true
              S3_PROTOCOL: https
              S3_REGION: stub
              MINIO_DOMAIN_IP: 10.10.10.10  # sss.pigsty domain name will resolve to this ip statically

              # if using SMTP (optional)
              #SMTP_ADMIN_EMAIL: admin@example.com
              #SMTP_HOST: supabase-mail
              #SMTP_PORT: 2500
              #SMTP_USER: fake_mail_user
              #SMTP_PASS: fake_mail_password
              #SMTP_SENDER_NAME: fake_sender
              #ENABLE_ANONYMOUS_USERS: false


    # infra cluster for proxy, monitor, alert, etc..
    infra: { hosts: { 10.10.10.10: { infra_seq: 1 } } }

    # etcd cluster for ha postgres
    etcd: { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }

    # minio cluster, s3 compatible object storage
    minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

    # pg-meta, the underlying postgres database for supabase
    pg-meta:
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars:
        pg_cluster: pg-meta
        pg_users:
          # supabase roles: anon, authenticated, dashboard_user
          - { name: anon           ,login: false }
          - { name: authenticated  ,login: false }
          - { name: dashboard_user ,login: false ,replication: true ,createdb: true ,createrole: true }
          - { name: service_role   ,login: false ,bypassrls: true }
          # supabase users: please use the same password
          - { name: supabase_admin             ,password: 'DBUser.Supa' ,pgbouncer: true ,inherit: true   ,roles: [ dbrole_admin ] ,superuser: true ,replication: true ,createdb: true ,createrole: true ,bypassrls: true }
          - { name: authenticator              ,password: 'DBUser.Supa' ,pgbouncer: true ,inherit: false  ,roles: [ dbrole_admin, authenticated ,anon ,service_role ] }
          - { name: supabase_auth_admin        ,password: 'DBUser.Supa' ,pgbouncer: true ,inherit: false  ,roles: [ dbrole_admin ] ,createrole: true }
          - { name: supabase_storage_admin     ,password: 'DBUser.Supa' ,pgbouncer: true ,inherit: false  ,roles: [ dbrole_admin, authenticated ,anon ,service_role ] ,createrole: true }
          - { name: supabase_functions_admin   ,password: 'DBUser.Supa' ,pgbouncer: true ,inherit: false  ,roles: [ dbrole_admin ] ,createrole: true }
          - { name: supabase_replication_admin ,password: 'DBUser.Supa' ,replication: true ,roles: [ dbrole_admin ]}
          - { name: supabase_read_only_user    ,password: 'DBUser.Supa' ,bypassrls: true ,roles: [ dbrole_readonly, pg_read_all_data ] }
        pg_databases:
          - name: postgres
            baseline: supabase.sql
            owner: supabase_admin
            comment: supabase postgres database
            schemas: [ extensions ,auth ,realtime ,storage ,graphql_public ,supabase_functions ,_analytics ,_realtime ]
            extensions:
              - { name: pgcrypto  ,schema: extensions } # cryptographic functions
              - { name: pg_net    ,schema: extensions } # async HTTP
              - { name: pgjwt     ,schema: extensions } # json web token API for postgres
              - { name: uuid-ossp ,schema: extensions } # generate universally unique identifiers (UUIDs)
              - { name: pgsodium        }               # pgsodium is a modern cryptography library for Postgres.
              - { name: supabase_vault  }               # Supabase Vault Extension
              - { name: pg_graphql      }               # pg_graphql: GraphQL support
              - { name: pg_jsonschema   }               # pg_jsonschema: Validate json schema
              - { name: wrappers        }               # wrappers: FDW collections
              - { name: http            }               # http: allows web page retrieval inside the database.
              - { name: pg_cron         }               # pg_cron: Job scheduler for PostgreSQL
              - { name: timescaledb     }               # timescaledb: Enables scalable inserts and complex queries for time-series data
              - { name: pg_tle          }               # pg_tle: Trusted Language Extensions for PostgreSQL
              - { name: vector          }               # pgvector: the vector similarity search
              - { name: pgmq            }               # pgmq: A lightweight message queue like AWS SQS and RSMQ
        # supabase required extensions
        pg_libs: 'timescaledb, plpgsql, plpgsql_check, pg_cron, pg_net, pg_stat_statements, auto_explain, pg_tle, plan_filter'
        pg_parameters:
          cron.database_name: postgres
          pgsodium.enable_event_trigger: off
        pg_hba_rules: # supabase hba rules, require access from docker network
          - { user: all ,db: postgres  ,addr: intra         ,auth: pwd ,title: 'allow supabase access from intranet'    }
          - { user: all ,db: postgres  ,addr: 172.17.0.0/16 ,auth: pwd ,title: 'allow access from local docker network' }
        node_crontab: [ '00 01 * * * postgres /pg/bin/pg-backup full' ] # make a full backup every 1am



  #==============================================================#
  # Global Parameters
  #==============================================================#
  vars:
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default|china|europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml
    docker_enabled: true              # enable docker on app group

    # WARNING: YOU MAY HAVE
    #docker_registry_mirrors: ["https://docker.m.daocloud.io"] # use dao cloud mirror in mainland china
    proxy_env:                        # global proxy env when downloading packages & pull docker images
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.tsinghua.edu.cn"
      #http_proxy:  127.0.0.1:12345 # add your proxy env here for downloading packages or pull images
      #https_proxy: 127.0.0.1:12345 # usually the proxy is format as http://user:pass@proxy.xxx.com
      #all_proxy:   127.0.0.1:12345

    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      minio        : { domain: m.pigsty ,endpoint: "10.10.10.10:9001", https: true, websocket: true }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }  # expose supa studio UI and API via nginx
      supa         : { domain: supa.pigsty ,endpoint: "10.10.10.10:8000", websocket: true }
      # certbot --nginx --agree-tos --email your@email.com -n -d supa.your.domain    # replace with your email & supa domain

    #----------------------------------#
    # Credential: CHANGE THESE PASSWORDS
    #----------------------------------#
    #grafana_admin_username: admin
    grafana_admin_password: pigsty
    #pg_admin_username: dbuser_dba
    pg_admin_password: DBUser.DBA
    #pg_monitor_username: dbuser_monitor
    pg_monitor_password: DBUser.Monitor
    #pg_replication_username: replicator
    pg_replication_password: DBUser.Replicator
    #patroni_username: postgres
    patroni_password: Patroni.API
    #haproxy_admin_username: admin
    haproxy_admin_password: pigsty
    #minio_access_key: minioadmin
    minio_secret_key: minioadmin      # minio root secret key, `minioadmin` by default

    # use minio as supabase file storage, single node single drive mode for demonstration purpose
    minio_buckets: [ { name: pgsql }, { name: supa } ]
    minio_users:
      - { access_key: dba , secret_key: S3User.DBA, policy: consoleAdmin }
      - { access_key: pgbackrest , secret_key: S3User.Backup,   policy: readwrite }
      - { access_key: supabase   , secret_key: S3User.Supabase, policy: readwrite }
    minio_endpoint: https://sss.pigsty:9000    # explicit overwrite minio endpoint with haproxy port
    node_etc_hosts: ["10.10.10.10 sss.pigsty"] # domain name to access minio from all nodes (required)

    # use minio as default backup repo for PostgreSQL
    pgbackrest_method: minio          # pgbackrest repo method: local,minio,[user-defined...]
    pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
      local:                          # default pgbackrest repo with local posix fs
        path: /pg/backup              # local backup directory, `/pg/backup` by default
        retention_full_type: count    # retention full backups by count
        retention_full: 2             # keep 2, at most 3 full backup when using local fs repo
      minio:                          # optional minio repo for pgbackrest
        type: s3                      # minio is s3-compatible, so s3 is used
        s3_endpoint: sss.pigsty       # minio endpoint domain name, `sss.pigsty` by default
        s3_region: us-east-1          # minio region, us-east-1 by default, useless for minio
        s3_bucket: pgsql              # minio bucket name, `pgsql` by default
        s3_key: pgbackrest            # minio user access key for pgbackrest
        s3_key_secret: S3User.Backup  # minio user secret key for pgbackrest
        s3_uri_style: path            # use path style uri for minio rather than host style
        path: /pgbackrest             # minio backup path, default is `/pgbackrest`
        storage_port: 9000            # minio port, 9000 by default
        storage_ca_file: /pg/cert/ca.crt  # minio ca file path, `/pg/cert/ca.crt` by default
        bundle: y                     # bundle small files into a single file
        cipher_type: aes-256-cbc      # enable AES encryption for remote backup repo
        cipher_pass: pgBackRest       # AES encryption password, default is 'pgBackRest'
        retention_full_type: time     # retention full backup by time on minio repo
        retention_full: 14            # keep full backup for last 14 days

    # download docker and all available extensions
    pg_version: 17
    repo_modules: node,pgsql,infra,docker
    repo_packages: [node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-utility, docker ]
    repo_extra_packages: [pg17-core ,pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-olap ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl ]
    pg_extensions:                  [pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-olap ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl ]

Caveats
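The inline comments in the template mark SITE_URL, API_EXTERNAL_URL, and SUPABASE_PUBLIC_URL as values to change to your external domain name, and warn that the storage API may not work otherwise. A minimal sketch of those overrides follows, assuming a hypothetical domain supa.example.com; the matching infra_portal entry is changed as well so that Nginx serves the same name.

```yaml
all:
  children:
    supa:
      vars:
        apps:
          supabase:
            conf:
              # supa.example.com is a placeholder, substitute your own domain.
              # The template uses http://; switching to https:// assumes a
              # certificate has been issued for the domain first.
              SITE_URL: http://supa.example.com
              API_EXTERNAL_URL: http://supa.example.com
              SUPABASE_PUBLIC_URL: http://supa.example.com
  vars:
    infra_portal:
      # only the changed entry is shown; keep the other portal entries as-is
      supa: { domain: supa.example.com ,endpoint: "10.10.10.10:8000", websocket: true }
```

The template also asks you to change JWT_SECRET, the keys derived from it, and the default passwords; those are omitted here because their values are yours to generate.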

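4.7 - Single Node: bare

The most minimal single-node configuration for Pigsty

The bare template is the minimal configuration that Pigsty requires.

A template with less configuration than this will not work properly.

Overview

Name: bare
Nodes: single node, pigsty/vagrant/spec/meta.rb
Description: The most minimal single-node configuration for Pigsty
Content: pigsty/conf/demo/bare.yml
OS: el8, el9, d12, u22, u24
Architecture: x86_64, aarch64
Related: meta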

./configure -c bare [-i <primary_ip>]

Configuration

Source file: pigsty/conf/demo/bare.yml

all:
  children:
    infra:   { hosts: { 10.10.10.10: { infra_seq: 1 } } }
    etcd:    { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }
    pg-meta: { hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }, vars: { pg_cluster: pg-meta } }
  vars:
    version: v3.3.0
    admin_ip: 10.10.10.10
    region: default

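4.8 - Four Nodes: full

Four-node standard sandbox demo environment with two PG clusters plus sample MinIO, Etcd, Redis, and FerretDB clusters

The full template is Pigsty's recommended sandbox template. It uses four nodes and deploys two PostgreSQL clusters, and it can be used to test and demonstrate the various capabilities of Pigsty.

Most Pigsty tutorials and examples are based on the sandbox environment provisioned from this template.

Overview

Name: full
Nodes: four nodes, pigsty/vagrant/spec/full.rb
Description: Four-node standard sandbox demo environment with two PG clusters plus sample MinIO, Etcd, Redis, and other clusters
OS: el8, el9, d12, u22, u24
Architecture: x86_64, aarch64
Related: rich, pitr, demo

Usage: pass -c full to configure: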

./configure -c full

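Note: this is a four-node template. After generating the configuration, adjust the IP addresses of the other three nodes if they differ from the defaults.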
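For example, if the three additional nodes lived at different addresses, the host entries of the pg-test cluster would change along the following lines. The 10.10.10.21 to 10.10.10.23 addresses are placeholders chosen for illustration, not values from the template.

```yaml
all:
  children:
    pg-test:
      hosts:
        # hypothetical replacement addresses, substitute your own
        10.10.10.21: { pg_seq: 1, pg_role: primary }
        10.10.10.22: { pg_seq: 2, pg_role: replica }
        10.10.10.23: { pg_seq: 3, pg_role: replica, pg_offline_query: true }
      vars:
        pg_cluster: pg-test
        # redis-meta and redis-test reference the same three nodes in the
        # template, so their host keys need the same substitution
```

Any other place that mentions the old addresses, such as the redis host entries, should be updated in the same pass so the inventory stays consistent.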

Configuration

Source file: pigsty/conf/full.yml


all:

  #==============================================================#
  # Clusters, Nodes, and Modules
  #==============================================================#
  children:

    # infra: monitor, alert, repo, etc..
    infra:
      hosts:
        10.10.10.10: { infra_seq: 1 }

    # etcd cluster for HA postgres DCS
    etcd:
      hosts:
        10.10.10.10: { etcd_seq: 1 }
      vars:
        etcd_cluster: etcd

    # minio (single node, used as backup repo)
    minio:
      hosts:
        10.10.10.10: { minio_seq: 1 }
      vars:
        minio_cluster: minio

    # postgres cluster: pg-meta
    pg-meta:
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: primary }
      vars:
        pg_cluster: pg-meta
        pg_users:
          - { name: dbuser_meta ,password: DBUser.Meta     ,pgbouncer: true ,roles: [ dbrole_admin ]    ,comment: pigsty admin user }
          - { name: dbuser_view ,password: DBUser.Viewer   ,pgbouncer: true ,roles: [ dbrole_readonly ] ,comment: read-only viewer for meta database }
        pg_databases:
          - { name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [ pigsty ] }
        pg_hba_rules:
          - { user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes' }
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.2/24
        pg_vip_interface: eth1

    # pgsql 3 node ha cluster: pg-test
    pg-test:
      hosts:
        10.10.10.11: { pg_seq: 1, pg_role: primary }   # primary instance, leader of cluster
        10.10.10.12: { pg_seq: 2, pg_role: replica }   # replica instance, follower of leader
        10.10.10.13: { pg_seq: 3, pg_role: replica, pg_offline_query: true } # replica with offline access
      vars:
        pg_cluster: pg-test           # define pgsql cluster name
        pg_users:  [{ name: test , password: test , pgbouncer: true , roles: [ dbrole_admin ] }]
        pg_databases: [{ name: test }]
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.3/24
        pg_vip_interface: eth1

    #----------------------------------#
    # redis ms, sentinel, native cluster
    #----------------------------------#
    redis-ms: # redis classic primary & replica
      hosts: { 10.10.10.10: { redis_node: 1 , redis_instances: { 6379: { }, 6380: { replica_of: '10.10.10.10 6379' } } } }
      vars: { redis_cluster: redis-ms ,redis_password: 'redis.ms' ,redis_max_memory: 64MB }

    redis-meta: # redis sentinel x 3
      hosts: { 10.10.10.11: { redis_node: 1 , redis_instances: { 26379: { } ,26380: { } ,26381: { } } } }
      vars:
        redis_cluster: redis-meta
        redis_password: 'redis.meta'
        redis_mode: sentinel
        redis_max_memory: 16MB
        redis_sentinel_monitor: # primary list for redis sentinel, use cls as name, primary ip:port
          - { name: redis-ms, host: 10.10.10.10, port: 6379 ,password: redis.ms, quorum: 2 }

    redis-test: # redis native cluster: 3m x 3s
      hosts:
        10.10.10.12: { redis_node: 1 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
        10.10.10.13: { redis_node: 2 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
      vars: { redis_cluster: redis-test ,redis_password: 'redis.test' ,redis_mode: cluster, redis_max_memory: 32MB }


  #==============================================================#
  # Global Parameters
  #==============================================================#
  vars:
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default|china|europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml
    proxy_env:                        # global proxy env when downloading packages
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.myqcloud.com,*.tsinghua.edu.cn"
      # http_proxy:  # set your proxy here: e.g http://user:pass@proxy.xxx.com
      # https_proxy: # set your proxy here: e.g http://user:pass@proxy.xxx.com
      # all_proxy:   # set your proxy here: e.g http://user:pass@proxy.xxx.com
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }
      minio        : { domain: m.pigsty ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }

    #----------------------------------#
    # MinIO Related Options
    #----------------------------------#
    #pgbackrest_method: minio          # if you want to use minio as backup repo instead of 'local' fs, uncomment this
    #minio_users:                      # and configure `pgbackrest_repo` & `minio_users` accordingly
    #  - { access_key: dba , secret_key: S3User.DBA, policy: consoleAdmin }
    #  - { access_key: pgbackrest , secret_key: S3User.Backup, policy: readwrite }
    #pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
    #  minio: ...                      # optional minio repo for pgbackrest ...
    #    s3_key: pgbackrest            # minio user access key for pgbackrest
    #    s3_key_secret: S3User.Backup  # minio user secret key for pgbackrest
    #    cipher_pass: pgBackRest       # AES encryption password, default is 'pgBackRest'
    # if you want to use minio as backup repo instead of 'local' fs, uncomment this, and configure `pgbackrest_repo`
    pgbackrest_method: minio
    node_etc_hosts: [ '10.10.10.10 h.pigsty a.pigsty p.pigsty g.pigsty sss.pigsty' ]

    #----------------------------------#
    # Credential: CHANGE THESE PASSWORDS
    #----------------------------------#
    #grafana_admin_username: admin
    grafana_admin_password: pigsty
    #pg_admin_username: dbuser_dba
    pg_admin_password: DBUser.DBA
    #pg_monitor_username: dbuser_monitor
    pg_monitor_password: DBUser.Monitor
    #pg_replication_username: replicator
    pg_replication_password: DBUser.Replicator
    #patroni_username: postgres
    patroni_password: Patroni.API
    #haproxy_admin_username: admin
    haproxy_admin_password: pigsty
    #minio_access_key: minioadmin
    minio_secret_key: minioadmin

    #----------------------------------#
    # Repo, Node, Packages
    #----------------------------------#
    repo_modules: infra,node,pgsql
    repo_remove: true                 # remove existing repo on admin node during repo bootstrap
    node_repo_modules: local          # install the local module in repo_upstream for all nodes
    node_repo_remove: true            # remove existing node repo for node managed by pigsty
    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-common ] #,docker
    repo_extra_packages: [ pg17-main ] #,pg17-core ,pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-olap ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl]
    pg_version: 17                    # default postgres version
    #pg_extensions: [pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl ] #,pg17-olap]

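4.9 - Four Nodes: safe

Security-hardened 3+1 node configuration template that follows high-standard security best practices

The safe template is derived from the full template. It is a dedicated security-hardened template that follows high-standard security best practices.

Overview

Name: safe
Nodes: four nodes, pigsty/vagrant/spec/full.rb
Description: Security-hardened 3+1 node configuration template that follows high-standard security best practices
OS: el7, el8, el9, u20, u22, u24
Architecture: x86_64
Related: full

Usage: pass -c safe to configure: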

./configure -c safe

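Note: this is a four-node template. After generating the configuration, adjust the IP addresses of the other three nodes if they differ from the defaults.

Configuration

Source file: pigsty/conf/safe.yml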

#===== SECURITY ENHANCEMENT CONFIG TEMPLATE WITH 3 NODES ======#
#   * 3 infra nodes, 3 etcd nodes, single minio node
#   * 3-instance pgsql cluster with an extra delayed instance
#   * crit.yml templates, no data loss, checksum enforced
#   * enforce ssl on postgres & pgbouncer, use postgres by default
#   * enforce an expiration date for all users (20 years by default)
#   * enforce strong password policy with passwordcheck extension
#   * enforce changing default password for all users
#   * log connections and disconnections
#   * restrict listen ip address for postgres/patroni/pgbouncer


all:
  children:

    infra: # infra cluster for proxy, monitor, alert, etc
      hosts: # 1 for common usage, 3 nodes for production
        10.10.10.10: { infra_seq: 1 } # identity required
        10.10.10.11: { infra_seq: 2, repo_enabled: false }
        10.10.10.12: { infra_seq: 3, repo_enabled: false }
      vars: { patroni_watchdog_mode: off }

    minio: # minio cluster, s3 compatible object storage
      hosts: { 10.10.10.10: { minio_seq: 1 } }
      vars: { minio_cluster: minio }

    etcd: # dcs service for postgres/patroni ha consensus
      hosts: # 1 node for testing, 3 or 5 for production
        10.10.10.10: { etcd_seq: 1 }  # etcd_seq required
        10.10.10.11: { etcd_seq: 2 }  # assign from 1 ~ n
        10.10.10.12: { etcd_seq: 3 }  # odd number please
      vars: # cluster level parameter override roles/etcd
        etcd_cluster: etcd  # mark etcd cluster name etcd
        etcd_safeguard: false # safeguard against purging
        etcd_clean: true # purge etcd during init process

    pg-meta: # 3 instance postgres cluster `pg-meta`
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: primary }
        10.10.10.11: { pg_seq: 2, pg_role: replica }
        10.10.10.12: { pg_seq: 3, pg_role: replica , pg_offline_query: true }
      vars:
        pg_cluster: pg-meta
        pg_conf: crit.yml
        pg_users:
          - { name: dbuser_meta , password: Pleas3-ChangeThisPwd ,expire_in: 7300 ,pgbouncer: true ,roles: [ dbrole_admin ]    ,comment: pigsty admin user }
          - { name: dbuser_view , password: Make.3ure-Compl1ance  ,expire_in: 7300 ,pgbouncer: true ,roles: [ dbrole_readonly ] ,comment: read-only viewer for meta database }
        pg_databases:
          - { name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [ pigsty ] ,extensions: [ { name: vector } ] }
        pg_services:
          - { name: standby , ip: "*" ,port: 5435 , dest: default ,selector: "[]" , backup: "[? pg_role == `primary`]" }
        pg_listen: '${ip},${vip},${lo}'
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.2/24
        pg_vip_interface: eth1

    # OPTIONAL delayed cluster for pg-meta
    pg-meta-delay: # delayed instance for pg-meta (1 hour ago)
      hosts: { 10.10.10.13: { pg_seq: 1, pg_role: primary, pg_upstream: 10.10.10.10, pg_delay: 1h } }
      vars: { pg_cluster: pg-meta-delay }


  ####################################################################
  #                          Parameters                              #
  ####################################################################
  vars: # global variables
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default|china|europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml
    patroni_ssl_enabled: true         # secure patroni RestAPI communications with SSL?
    pgbouncer_sslmode: require        # pgbouncer client ssl mode: disable|allow|prefer|require|verify-ca|verify-full, disable by default
    pg_default_service_dest: postgres # default service destination to postgres instead of pgbouncer
    pgbackrest_method: minio          # pgbackrest repo method: local,minio,[user-defined...]

    #----------------------------------#
    # Credentials
    #----------------------------------#
    #grafana_admin_username: admin
    grafana_admin_password: You.Have2Use-A_VeryStrongPassword
    #pg_admin_username: dbuser_dba
    pg_admin_password: PessWorb.Should8eStrong-eNough
    #pg_monitor_username: dbuser_monitor
    pg_monitor_password: MekeSuerYour.PassWordI5secured
    #pg_replication_username: replicator
    pg_replication_password: doNotUseThis-PasswordFor.AnythingElse
    #patroni_username: postgres
    patroni_password: don.t-forget-to-change-thEs3-password
    #haproxy_admin_username: admin
    haproxy_admin_password: GneratePasswordWith-pwgen-s-16-1

    #----------------------------------#
    # MinIO Related Options
    #----------------------------------#
    minio_users: # and configure `pgbackrest_repo` & `minio_users` accordingly
      - { access_key: dba , secret_key: S3User.DBA.Strong.Password, policy: consoleAdmin }
      - { access_key: pgbackrest , secret_key: Min10.bAckup ,policy: readwrite }
    pgbackrest_repo: # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
      local: # default pgbackrest repo with local posix fs
        path: /pg/backup              # local backup directory, `/pg/backup` by default
        retention_full_type: count    # retention full backups by count
        retention_full: 2             # keep 2, at most 3 full backup when using local fs repo
      minio: # optional minio repo for pgbackrest
        s3_key: pgbackrest            # <-------- CHANGE THIS, SAME AS `minio_users` access_key
        s3_key_secret: Min10.bAckup   # <-------- CHANGE THIS, SAME AS `minio_users` secret_key
        cipher_pass: 'pgBR.${pg_cluster}'  # <-------- CHANGE THIS, you can use cluster name as part of password
        type: s3                      # minio is s3-compatible, so s3 is used
        s3_endpoint: sss.pigsty       # minio endpoint domain name, `sss.pigsty` by default
        s3_region: us-east-1          # minio region, us-east-1 by default, useless for minio
        s3_bucket: pgsql              # minio bucket name, `pgsql` by default
        s3_uri_style: path            # use path style uri for minio rather than host style
        path: /pgbackrest             # minio backup path, default is `/pgbackrest`
        storage_port: 9000            # minio port, 9000 by default
        storage_ca_file: /etc/pki/ca.crt  # minio ca file path, `/etc/pki/ca.crt` by default
        bundle: y                     # bundle small files into a single file
        cipher_type: aes-256-cbc      # enable AES encryption for remote backup repo
        retention_full_type: time     # retention full backup by time on minio repo
        retention_full: 14            # keep full backup for last 14 days


    #----------------------------------#
    # Access Control
    #----------------------------------#
    # add passwordcheck extension to enforce strong password policy
    pg_libs: '$libdir/passwordcheck, pg_stat_statements, auto_explain'
    pg_extensions:
      - passwordcheck, supautils, pgsodium, pg_vault, pg_session_jwt, anonymizer, pgsmcrypto, pgauditlogtofile, pgaudit #, pgaudit17, pgaudit16, pgaudit15, pgaudit14
      - pg_auth_mon, credcheck, pgcryptokey, pg_jobmon, logerrors, login_hook, set_user, pgextwlist, pg_auditor, sslutils, noset #pg_tde #pg_snakeoil
    pg_default_roles: # default roles and users in postgres cluster
      - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access }
      - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
      - { name: dbrole_readwrite ,login: false ,roles: [ dbrole_readonly ]               ,comment: role for global read-write access }
      - { name: dbrole_admin     ,login: false ,roles: [ pg_monitor, dbrole_readwrite ]  ,comment: role for object creation }
      - { name: postgres     ,superuser: true  ,expire_in: 7300                        ,comment: system superuser }
      - { name: replicator ,replication: true  ,expire_in: 7300 ,roles: [ pg_monitor, dbrole_readonly ]   ,comment: system replicator }
      - { name: dbuser_dba   ,superuser: true  ,expire_in: 7300 ,roles: [ dbrole_admin ]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 , comment: pgsql admin user }
      - { name: dbuser_monitor ,roles: [ pg_monitor ] ,expire_in: 7300 ,pgbouncer: true ,parameters: { log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }
    pg_default_hba_rules: # postgres host-based auth rules by default
      - { user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'   }
      - { user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident'  }
      - { user: '${repl}'    ,db: replication ,addr: localhost ,auth: ssl   ,title: 'replicator replication from localhost' }
      - { user: '${repl}'    ,db: replication ,addr: intra     ,auth: ssl   ,title: 'replicator replication from intranet'  }
      - { user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: ssl   ,title: 'replicator postgres db from intranet'  }
      - { user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password'  }
      - { user: '${monitor}' ,db: all         ,addr: infra     ,auth: ssl   ,title: 'monitor from infra host with password' }
      - { user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'    }
      - { user: '${admin}'   ,db: all         ,addr: world     ,auth: cert  ,title: 'admin @ everywhere with ssl & cert'    }
      - { user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: ssl   ,title: 'pgbouncer read/write via local socket' }
      - { user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: ssl   ,title: 'read/write biz user via password'      }
      - { user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: ssl   ,title: 'allow etl offline tasks from intranet' }
    pgb_default_hba_rules: # pgbouncer host-based authentication rules
      - { user: '${dbsu}'    ,db: pgbouncer   ,addr: local     ,auth: peer  ,title: 'dbsu local admin access with os ident' }
      - { user: 'all'        ,db: all         ,addr: localhost ,auth: pwd   ,title: 'allow all user local access with pwd'  }
      - { user: '${monitor}' ,db: pgbouncer   ,addr: intra     ,auth: ssl   ,title: 'monitor access via intranet with pwd'  }
      - { user: '${monitor}' ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other monitor access addr'  }
      - { user: '${admin}'   ,db: all         ,addr: intra     ,auth: ssl   ,title: 'admin access via intranet with pwd'    }
      - { user: '${admin}'   ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other admin access addr'    }
      - { user: 'all'        ,db: all         ,addr: intra     ,auth: ssl   ,title: 'allow all user intra access with pwd'  }

    #----------------------------------#
    # Repo, Node, Packages
    #----------------------------------#
    # if you wish to customize your own repo, change these settings:
    repo_modules: infra,node,pgsql    # install upstream repo during repo bootstrap
    repo_remove: true                 # remove existing repo on admin node during repo bootstrap
    node_repo_modules: local          # install the local module in repo_upstream for all nodes
    node_repo_remove: true            # remove existing node repo for node managed by pigsty
    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-common ] #,docker]
    repo_extra_packages: [ pg17-main ,pg17-sec ] #,pg17-core ,pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-olap ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl]
    pg_version: 17                    # default postgres version
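
The comments on `s3_key` and `s3_key_secret` above require the MinIO repo credentials to be the same as one of the `minio_users` entries. A minimal sketch of that consistency check follows. It assumes the relevant YAML has already been loaded into plain dictionaries; `check_backup_credentials` is a hypothetical helper for illustration, not part of Pigsty.

```python
# Sketch: verify that S3-type pgBackRest repos use credentials of a MinIO user.
# The dictionaries mirror a subset of the YAML above; the helper is hypothetical.

minio_users = [
    {"access_key": "dba", "secret_key": "S3User.DBA.Strong.Password", "policy": "consoleAdmin"},
    {"access_key": "pgbackrest", "secret_key": "Min10.bAckup", "policy": "readwrite"},
]

pgbackrest_repo = {
    "local": {"path": "/pg/backup", "retention_full_type": "count", "retention_full": 2},
    "minio": {
        "type": "s3",
        "s3_key": "pgbackrest",
        "s3_key_secret": "Min10.bAckup",
        "s3_bucket": "pgsql",
        "retention_full_type": "time",
        "retention_full": 14,
    },
}


def check_backup_credentials(users, repos):
    """Return a list of problems found in S3-type repos; an empty list means consistent."""
    problems = []
    secrets = {user["access_key"]: user["secret_key"] for user in users}
    for name, repo in repos.items():
        if repo.get("type") != "s3":
            continue  # local posix repos carry no S3 credentials
        key = repo.get("s3_key")
        if key not in secrets:
            problems.append(f"{name}: s3_key {key!r} is not a minio_users access_key")
        elif secrets[key] != repo.get("s3_key_secret"):
            problems.append(f"{name}: s3_key_secret does not match minio user {key!r}")
    return problems


if __name__ == "__main__":
    print(check_backup_credentials(minio_users, pgbackrest_repo) or "credentials consistent")
```

Running a check like this after changing the default passwords can catch the case where only one of the two places was updated.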

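4.10 - Four-Node: mssql

Replace PostgreSQL with the Microsoft SQL Server compatible WiltonDB / Babelfish kernel

The mssql configuration template is based on the full template. It uses the WiltonDB / Babelfish database kernel in place of vanilla PostgreSQL and provides Microsoft SQL Server wire protocol and syntax compatibility.

For the complete tutorial, see: Babelfish (MSSQL) kernel usage guide

Overview

Config name: mssql
Node count: four nodes, pigsty/vagrant/spec/full.rb
Description: four-node Babelfish/WiltonDB template that provides Microsoft SQL Server compatibility
Supported OS: el7, el8, el9, u20, u22, u24
Supported arch: x86_64, aarch64
Related configs: full

To enable it, pass the -c mssql argument to configure:

./configure -c mssql

Note: this is a four-node template. After generating the configuration you may need to change the IP addresses of the other three nodes (optional).

Content

Source: pigsty/conf/mssql.yml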

all:
  children:

    #----------------------------------#
    # infra: monitor, alert, repo, etc..
    #----------------------------------#
    infra: { hosts: { 10.10.10.10: { infra_seq: 1 } } }

    #----------------------------------#
    # etcd cluster for HA postgres DCS
    #----------------------------------#
    etcd: { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }

    #----------------------------------#
    # pgsql (singleton on current node)
    #----------------------------------#
    # this is an example single-node postgres cluster
    pg-meta:
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: primary } # <---- primary instance with read-write capability
      vars:
        pg_cluster: pg-meta
        pg_users:                           # create MSSQL superuser
          - {name: dbuser_mssql ,password: DBUser.MSSQL ,superuser: true, pgbouncer: true ,roles: [dbrole_admin], comment: superuser & owner for babelfish  }
        pg_databases:
          - name: mssql
            baseline: mssql.sql             # init babelfish database & user
            extensions:
              - { name: uuid-ossp          }
              - { name: babelfishpg_common }
              - { name: babelfishpg_tsql   }
              - { name: babelfishpg_tds    }
              - { name: babelfishpg_money  }
              - { name: pg_hint_plan       }
              - { name: system_stats       }
              - { name: tds_fdw            }
            owner: dbuser_mssql
            parameters: { 'babelfishpg_tsql.migration_mode' : 'multi-db' }
            comment: babelfish cluster, a MSSQL compatible pg cluster

    #----------------------------------#
    # pgsql (3-node pgsql/mssql cluster)
    #----------------------------------#
    pg-test:
      hosts:
        10.10.10.11: { pg_seq: 1, pg_role: primary }
        10.10.10.12: { pg_seq: 2, pg_role: replica }
        10.10.10.13: { pg_seq: 3, pg_role: replica, pg_offline_query: true }
      vars:
        pg_cluster: pg-test
        pg_users:                           # create MSSQL superuser
          - {name: dbuser_mssql ,password: DBUser.MSSQL ,superuser: true, pgbouncer: true ,roles: [dbrole_admin], comment: superuser & owner for babelfish  }
        pg_primary_db: mssql                # use `mssql` as the primary sql server database
        pg_databases:
          - name: mssql
            baseline: mssql.sql             # init babelfish database & user
            extensions:
              - { name: uuid-ossp          }
              - { name: babelfishpg_common }
              - { name: babelfishpg_tsql   }
              - { name: babelfishpg_tds    }
              - { name: babelfishpg_money  }
              - { name: pg_hint_plan       }
              - { name: system_stats       }
              - { name: tds_fdw            }
            owner: dbuser_mssql
            parameters: { 'babelfishpg_tsql.migration_mode' : 'single-db' }
            comment: babelfish cluster, a MSSQL compatible pg cluster

  vars:

    #----------------------------------#
    # Meta Data
    #----------------------------------#
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default,china,europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }

    #----------------------------------#
    # NODE, PGSQL, MSSQL
    #----------------------------------#
    pg_version: 15                     # The current WiltonDB major version is 15
    pg_packages:                       # install forked version of postgresql with babelfishpg support
      - wiltondb patroni pgbouncer pgbackrest pg_exporter pgbadger vip-manager
    pg_extensions: [ ]                 # do not install any vanilla postgresql extensions
    pg_mode: mssql                     # Microsoft SQL Server Compatible Mode
    pg_libs: 'babelfishpg_tds, pg_stat_statements, auto_explain' # add babelfishpg_tds to shared_preload_libraries
    pg_default_hba_rules: # overwrite default HBA rules for babelfish cluster
      - { user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident' }
      - { user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
      - { user: '${repl}'    ,db: replication ,addr: localhost ,auth: pwd   ,title: 'replicator replication from localhost' }
      - { user: '${repl}'    ,db: replication ,addr: intra     ,auth: pwd   ,title: 'replicator replication from intranet' }
      - { user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: pwd   ,title: 'replicator postgres db from intranet' }
      - { user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
      - { user: '${monitor}' ,db: all         ,addr: infra     ,auth: pwd   ,title: 'monitor from infra host with password' }
      - { user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl' }
      - { user: '${admin}'   ,db: all         ,addr: world     ,auth: ssl   ,title: 'admin @ everywhere with ssl & pwd' }
      - { user: dbuser_mssql ,db: mssql       ,addr: intra     ,auth: md5   ,title: 'allow mssql dbsu intranet access' } # <--- use md5 auth method for mssql user
      - { user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: pwd   ,title: 'pgbouncer read/write via local socket' }
      - { user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: pwd   ,title: 'read/write biz user via password' }
      - { user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: pwd   ,title: 'allow etl offline tasks from intranet' }
    pg_default_services: # route primary & replica service to mssql port 1433
      - { name: primary ,port: 5433 ,dest: 1433  ,check: /primary   ,selector: "[]" }
      - { name: replica ,port: 5434 ,dest: 1433  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }
      - { name: default ,port: 5436 ,dest: postgres ,check: /primary   ,selector: "[]" }
      - { name: offline ,port: 5438 ,dest: postgres ,check: /replica   ,selector: "[? pg_role == `offline` || pg_offline_query ]" , backup: "[? pg_role == `replica` && !pg_offline_query]" }

    # download wiltondb instead of postgresql kernel
    repo_modules: node,pgsql,infra,mssql
    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-utility ]
    repo_extra_packages: [ wiltondb, sqlcmd ] # replace pgsql kernel with wiltondb/babelfish

Notes

Note that WiltonDB is available only on EL 7/8/9 and Ubuntu 20.04/22.04/24.04; Debian is not yet supported.

Note that the current WiltonDB LTS release is based on PostgreSQL 15.
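
The `pg_default_services` override in the mssql template sends the primary and replica services to the mssql port 1433, while default and offline still use the symbolic destination `postgres`. A small sketch of how those destinations resolve follows. It assumes `postgres` stands for port 5432 and `pgbouncer` for port 6432; both port numbers and the helper names are illustrative assumptions, not values taken from this template.

```python
# Sketch: resolve the backend port behind each service of the mssql template.
# Assumption: "postgres" -> 5432 and "pgbouncer" -> 6432 (illustrative defaults).

PG_PORT = 5432
PGBOUNCER_PORT = 6432

mssql_services = [
    {"name": "primary", "port": 5433, "dest": 1433},
    {"name": "replica", "port": 5434, "dest": 1433},
    {"name": "default", "port": 5436, "dest": "postgres"},
    {"name": "offline", "port": 5438, "dest": "postgres"},
]


def resolve_dest(dest):
    """Map a service `dest` value to a concrete backend port number."""
    if isinstance(dest, int):
        return dest  # explicit port, e.g. 1433 for the mssql listener
    symbolic = {"postgres": PG_PORT, "pgbouncer": PGBOUNCER_PORT}
    if dest not in symbolic:
        raise ValueError(f"unknown service destination: {dest!r}")
    return symbolic[dest]


def routing_table(services):
    """Return {service name: (listen port, backend port)}."""
    return {svc["name"]: (svc["port"], resolve_dest(svc["dest"])) for svc in services}


if __name__ == "__main__":
    for name, (listen, backend) in routing_table(mssql_services).items():
        print(f"{name:8s} {listen} -> {backend}")
```

Under these assumptions, SQL Server clients would connect to service ports 5433 and 5434, while ports 5436 and 5438 keep serving the PostgreSQL protocol.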

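4.11 - Four-Node: polar

Replace PostgreSQL with the PolarDB for PostgreSQL kernel

The polar configuration template is based on the full template.

It uses the PolarDB for PostgreSQL database kernel in place of vanilla PostgreSQL and offers a "cloud-native", Aurora-flavored PostgreSQL experience.

For the complete tutorial, see: PolarDB for PostgreSQL (POLAR) kernel usage guide

Overview

Config name: polar
Node count: four nodes, pigsty/vagrant/spec/full.rb
Description: replace vanilla PostgreSQL with the Alibaba Cloud PolarDB for PostgreSQL kernel
Supported OS: el8, el9, d12, u22, u24
Supported arch: x86_64, aarch64
Related configs: full

To enable it, pass the -c polar argument to configure:

./configure -c polar

Note: this is a four-node template. After generating the configuration you may need to change the IP addresses of the other three nodes (optional).

Content

Source: pigsty/conf/polar.yml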

all:
  children:

    # infra singleton for repo, monitoring,...
    infra:
      hosts:
        10.10.10.10: { infra_seq: 1 }

    # etcd singleton for HA postgres DCS
    etcd:
      hosts:
        10.10.10.10: { etcd_seq: 1 }
      vars:
        etcd_cluster: etcd

    # polardb singleton
    pg-meta:
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: primary }
      vars:
        pg_cluster: pg-meta
        pg_users:
          - {name: dbuser_meta ,password: DBUser.Meta   ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: pigsty admin user }
          - {name: dbuser_view ,password: DBUser.Viewer ,pgbouncer: true ,roles: [dbrole_readonly] ,comment: read-only viewer for meta database }
        pg_databases:
          - {name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty]}
        pg_hba_rules:
          - {user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes'}
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.2/24
        pg_vip_interface: eth1

    # polardb 3-node ha cluster: 10.10.10.3 ---> 10.10.10.1{1,2,3}
    pg-test:
      hosts:
        10.10.10.11: { pg_seq: 1, pg_role: primary }   # primary instance, leader of cluster
        10.10.10.12: { pg_seq: 2, pg_role: replica }   # replica instance, follower of leader
        10.10.10.13: { pg_seq: 3, pg_role: replica, pg_offline_query: true } # replica with offline access
      vars:
        pg_cluster: pg-test           # define pgsql cluster name
        pg_users:  [{ name: test , password: test , pgbouncer: true , roles: [ dbrole_admin ] }]
        pg_databases: [{ name: test }]
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.3/24
        pg_vip_interface: eth1

  vars:                               # global variables
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default,china,europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }

    #----------------------------------#
    # NODE, PGSQL, PolarDB
    #----------------------------------#
    # THIS SPEC REQUIRES AN AVAILABLE POLARDB KERNEL IN THE LOCAL REPO!
    pg_version: 15
    pg_packages: [ 'polardb patroni pgbouncer pgbackrest pg_exporter pgbadger vip-manager' ]
    pg_extensions: [ ]                # do not install any vanilla postgresql extensions
    pg_mode: polar                    # polardb compatible mode
    pg_exporter_exclude_database: 'template0,template1,postgres,polardb_admin'
    pg_default_roles:                 # default roles and users in postgres cluster
      - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access     }
      - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
      - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly] ,comment: role for global read-write access }
      - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite] ,comment: role for object creation }
      - { name: postgres     ,superuser: true  ,comment: system superuser }
      - { name: replicator   ,superuser: true  ,replication: true ,roles: [pg_monitor, dbrole_readonly] ,comment: system replicator } # <- superuser is required for replication
      - { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 ,comment: pgsql admin user }
      - { name: dbuser_monitor ,roles: [pg_monitor] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-utility ]
    repo_extra_packages: [ polardb ] # replace vanilla postgres kernel with polardb kernel

Notes

Note that the latest PolarDB release is currently equivalent to PostgreSQL 15.

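4.12 - Four-Node: ivory

Replace the default PostgreSQL with the Oracle-compatible IvorySQL kernel

The ivory configuration template is based on the full template. It uses HighGo's IvorySQL (an Oracle-compatible kernel) in place of the vanilla PostgreSQL kernel.

For the complete tutorial, see: IvorySQL (Oracle-compatible) kernel usage guide

Overview

Config name: ivory
Node count: four nodes, pigsty/vagrant/spec/full.rb
Description: four-node template that replaces vanilla PostgreSQL with the Oracle-compatible IvorySQL kernel
Supported OS: el7, el8, el9
Supported arch: x86_64, aarch64 (except el7)
Related configs: full

To enable it, pass the -c ivory argument to configure:

./configure -c ivory

Note: this is a four-node template. After generating the configuration you may need to change the IP addresses of the other three nodes (optional).

Content

Source: pigsty/conf/ivory.yml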

all:
  children:

    # infra singleton for repo, monitoring,...
    infra:
      hosts:
        10.10.10.10: { infra_seq: 1 }

    # etcd singleton for HA postgres DCS
    etcd:
      hosts:
        10.10.10.10: { etcd_seq: 1 }
      vars:
        etcd_cluster: etcd

    # ivorysql singleton
    pg-meta:
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: primary }
      vars:
        pg_cluster: pg-meta
        pg_users:
          - {name: dbuser_meta ,password: DBUser.Meta   ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: pigsty admin user }
          - {name: dbuser_view ,password: DBUser.Viewer ,pgbouncer: true ,roles: [dbrole_readonly] ,comment: read-only viewer for meta database }
        pg_databases:
          - {name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty]}
        pg_hba_rules:
          - {user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes'}
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.2/24
        pg_vip_interface: eth1

    # ivorysql 3-node ha cluster: 10.10.10.3 ---> 10.10.10.1{1,2,3}
    pg-test:
      hosts:
        10.10.10.11: { pg_seq: 1, pg_role: primary }   # primary instance, leader of cluster
        10.10.10.12: { pg_seq: 2, pg_role: replica }   # replica instance, follower of leader
        10.10.10.13: { pg_seq: 3, pg_role: replica, pg_offline_query: true } # replica with offline access
      vars:
        pg_cluster: pg-test           # define pgsql cluster name
        pg_users:  [{ name: test , password: test , pgbouncer: true , roles: [ dbrole_admin ] }]
        pg_databases: [{ name: test }]
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.3/24
        pg_vip_interface: eth1

  vars:                               # global variables
    version: v3.4.1                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default,china,europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml
    #docker_registry_mirrors: ["https://docker.1ms.run", "https://docker.m.daocloud.io"]
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }

    #----------------------------------#
    # Ivory SQL Configuration
    #----------------------------------#
    pg_mode: ivory                    # IvorySQL Oracle Compatible Mode
    pg_version: 17                    # The current IvorySQL compatible major version is 17
    pg_packages: [ ivorysql, pgsql-common ]
    pg_libs: 'liboracle_parser, pg_stat_statements, auto_explain'
    repo_extra_packages: [ ivorysql ] # replace default postgresql kernel with ivorysql packages

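Notes

IvorySQL currently ships generic RPM/DEB packages. They do not depend on the major version of the OS distribution, only on the distribution family (DEB/RPM) and the system architecture (x86_64/aarch64).

IvorySQL requires glibc >= 2.17, so make sure your system meets this requirement. All OS versions supported by Pigsty do:

CentOS 7 : 2.17
Debian 9 : 2.19
Ubuntu 14.04 : 2.19

On EL systems the IvorySQL package is named ivorysql4; on Debian/Ubuntu systems it is named ivorysql-4.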
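
The glibc requirement above can be checked on a target host before installation. The following is a minimal sketch using only the Python standard library; the helper names `parse_glibc` and `glibc_ok` are hypothetical, and the threshold of 2.17 is the one stated in the note.

```python
# Sketch: check whether a glibc version string satisfies the 2.17 minimum.
import platform
import re

MIN_GLIBC = (2, 17)


def parse_glibc(text):
    """Parse the leading 'major.minor' of a version string, or return None."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def glibc_ok(text, minimum=MIN_GLIBC):
    """True if the version string is at least `minimum` (compared numerically)."""
    version = parse_glibc(text)
    return version is not None and version >= minimum


if __name__ == "__main__":
    lib, version = platform.libc_ver()  # e.g. ('glibc', '2.28') on glibc-based Linux
    if lib == "glibc":
        print(f"glibc {version}: {'ok' if glibc_ok(version) else 'too old'}")
    else:
        print("glibc not detected on this host")
```

Versions are compared as integer tuples rather than strings, so a value such as 2.9 is correctly treated as older than 2.17.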

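4.13 - Four-Node: mysql

Replace the default PostgreSQL with the MySQL-compatible openHalo kernel

The mysql configuration template is based on the full template. It uses openHalo (a MySQL-compatible kernel), open-sourced by 易景科技, in place of the vanilla PostgreSQL kernel.

For the complete tutorial, see: openHalo (MySQL-compatible) kernel usage guide

Overview

Config name: mysql
Node count: four nodes, pigsty/vagrant/spec/full.rb
Description: four-node template that replaces vanilla PostgreSQL with the MySQL-compatible openHalo kernel
Supported OS: el8, el9
Supported arch: x86_64, aarch64
Related configs: full

This configuration was added in Pigsty v3.4.1.

To enable it, pass the -c mysql argument to configure:

./configure -c mysql

Note: this is a four-node template. After generating the configuration you may need to change the IP addresses of the other three nodes, or remove them (optional).

Content

Source: pigsty/conf/mysql.yml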

all:
  children:

    infra:
      hosts:
        10.10.10.10: { infra_seq: 1 }

    etcd:
      hosts:
        10.10.10.10: { etcd_seq: 1 }
      vars:
        etcd_cluster: etcd

    # openHaloDB singleton
    pg-meta:
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: primary }
      vars:
        pg_cluster: pg-meta
        pg_users:
          - {name: dbuser_meta ,password: DBUser.Meta   ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: pigsty admin user }
          - {name: dbuser_view ,password: DBUser.Viewer ,pgbouncer: true ,roles: [dbrole_readonly] ,comment: read-only viewer for meta database }
        pg_databases:
          - {name: postgres, extensions: [aux_mysql]} # the mysql compatible database
          - {name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty]}
        pg_hba_rules:
          - {user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes'}
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.2/24
        pg_vip_interface: eth1

    # halo 3-node ha cluster: 10.10.10.3 ---> 10.10.10.1{1,2,3}
    pg-test:
      hosts:
        10.10.10.11: { pg_seq: 1, pg_role: primary }   # primary instance, leader of cluster
        10.10.10.12: { pg_seq: 2, pg_role: replica }   # replica instance, follower of leader
        10.10.10.13: { pg_seq: 3, pg_role: replica, pg_offline_query: true } # replica with offline access
      vars:
        pg_cluster: pg-test           # define pgsql cluster name
        pg_users:  [{ name: test , password: test , pgbouncer: true , roles: [ dbrole_admin ] }]
        pg_databases: [{name: test}, { name: postgres, extensions: [aux_mysql] }]
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.3/24
        pg_vip_interface: eth1

  vars:                               # global variables
    version: v3.4.1                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default,china,europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml
    #docker_registry_mirrors: ["https://docker.1ms.run", "https://docker.m.daocloud.io"]
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }

    pg_mode: mysql                    # MySQL Compatible Mode by HaloDB
    pg_version: 14                    # The current HaloDB is compatible with PG Major Version 14
    pg_packages: [ openhalodb, pgsql-common, mysql ]   # also install mysql client shell
    repo_modules: node,pgsql,infra,mysql
    repo_extra_packages: [ openhalodb, mysql ] # replace default postgresql kernel with openhalo packages

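Notes

openHalo was open-sourced only recently and does not yet ship official RPMs. The RPM packages are currently built and provided by Pigsty.

Pigsty will add support for Debian / Ubuntu later.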

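4.14 - Four-Node: oriole

Replace the vanilla PostgreSQL kernel with OrioleDB from Supabase.

The oriole configuration template is based on the full template. It uses OrioleDB from Supabase in place of the vanilla PostgreSQL kernel.

For the complete tutorial, see: OrioleDB kernel usage guide

Overview

Config name: oriole
Node count: four nodes, pigsty/vagrant/spec/full.rb
Description: four-node template that replaces vanilla PostgreSQL with the OrioleDB kernel
Supported OS: el8, el9
Supported arch: x86_64, aarch64
Related configs: full

This configuration was added in Pigsty v3.4.1.

To enable it, pass the -c oriole argument to configure:

./configure -c oriole

Note: this is a four-node template. After generating the configuration you may need to change the IP addresses of the other three nodes, or remove them (optional).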

Content

Source: pigsty/conf/oriole.yml

all:
  children:

    # infra singleton for repo, monitoring,...
    infra:
      hosts:
        10.10.10.10: { infra_seq: 1 }

    # etcd singleton for HA postgres DCS
    etcd:
      hosts:
        10.10.10.10: { etcd_seq: 1 }
      vars:
        etcd_cluster: etcd

    # orioledb example one-node cluster
    pg-meta:
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: primary }
      vars:
        pg_cluster: pg-meta
        pg_users:
          - {name: dbuser_meta ,password: DBUser.Meta   ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: pigsty admin user }
          - {name: dbuser_view ,password: DBUser.Viewer ,pgbouncer: true ,roles: [dbrole_readonly] ,comment: read-only viewer for meta database }
        pg_databases:
          - {name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty], extensions: [orioledb]}
        pg_hba_rules:
          - {user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes'}
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.2/24
        pg_vip_interface: eth1

    # orioledb 3-node ha cluster: 10.10.10.3 ---> 10.10.10.1{1,2,3}
    pg-test:
      hosts:
        10.10.10.11: { pg_seq: 1, pg_role: primary }   # primary instance, leader of cluster
        10.10.10.12: { pg_seq: 2, pg_role: replica }   # replica instance, follower of leader
        10.10.10.13: { pg_seq: 3, pg_role: replica, pg_offline_query: true } # replica with offline access
      vars:
        pg_cluster: pg-test           # define pgsql cluster name
        pg_users:  [{ name: test , password: test , pgbouncer: true , roles: [ dbrole_admin ] }]
        pg_databases: [{ name: test, extensions: [orioledb] }]
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.3/24
        pg_vip_interface: eth1

  vars:                               # global variables
    version: v3.4.1                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default,china,europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml
    #docker_registry_mirrors: ["https://docker.1ms.run", "https://docker.m.daocloud.io"]
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }

    pg_mode: oriole                    # oriole compatible mode
    pg_version: 17                     # compatible with pg17
    pg_packages: [ orioledb, pgsql-common ]
    repo_extra_packages: [ orioledb ]  # download orioledb packages
    pg_libs: 'orioledb, pg_stat_statements, auto_explain'

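Notes

OrioleDB requires a patched PostgreSQL kernel. The patched PostgreSQL build is currently produced by Pigsty from the patches17_6 tag. The only change is that the version string is set to OrioleDB to distinguish it from vanilla PostgreSQL.

Pigsty currently supports OrioleDB on EL systems only. Debian / Ubuntu support is planned.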
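
The kernel templates in sections 4.10 to 4.14 each list their own supported operating systems. As a rough illustration, those lists can be collected into one lookup that answers which templates apply to a given OS code. The table below copies the values from the overview sections; the function name `templates_for` is hypothetical.

```python
# Sketch: look up which kernel templates list a given OS code as supported.
# Values are copied from the overview of each template in sections 4.10 to 4.14.

SUPPORTED_OS = {
    "mssql": ["el7", "el8", "el9", "u20", "u22", "u24"],
    "polar": ["el8", "el9", "d12", "u22", "u24"],
    "ivory": ["el7", "el8", "el9"],
    "mysql": ["el8", "el9"],
    "oriole": ["el8", "el9"],
}


def templates_for(os_code):
    """Return the sorted names of templates whose overview lists `os_code`."""
    return sorted(name for name, codes in SUPPORTED_OS.items() if os_code in codes)


if __name__ == "__main__":
    for code in ("el7", "el9", "d12", "u22"):
        print(code, templates_for(code))
```

A lookup like this makes it easy to see, for example, that only the polar template lists d12 among these five.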

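4.15 - Four-Node: minio

Deploy a four-node, highly available, multi-node multi-drive MinIO cluster that provides S3-compatible object storage

The minio configuration template is based on the full template.

This configuration defines a MinIO cluster of four nodes with four drives each, sixteen drives in total.

For more tutorials, see the MINIO module documentation.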
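
The drive count follows from the `minio_data: '/data{1...4}'` value in the template, which uses a `{start...end}` range, multiplied by the four hosts. A minimal sketch of that arithmetic follows; `expand_ellipsis` is a hypothetical helper that handles a single numeric range and is not MinIO's own implementation.

```python
# Sketch: expand a '{start...end}' range and count drives across the cluster.
import re


def expand_ellipsis(pattern):
    """Expand the first '{a...b}' numeric range in `pattern` into a list of paths."""
    match = re.search(r"\{(\d+)\.\.\.(\d+)\}", pattern)
    if match is None:
        return [pattern]  # no range present: a single path
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [f"{prefix}{i}{suffix}" for i in range(start, end + 1)]


nodes = ["10.10.10.10", "10.10.10.11", "10.10.10.12", "10.10.10.13"]
drives = expand_ellipsis("/data{1...4}")
total_drives = len(nodes) * len(drives)

if __name__ == "__main__":
    print(drives)
    print(f"{len(nodes)} nodes x {len(drives)} drives = {total_drives} drives")
```

Each of the four nodes is expected to provide all of the expanded paths, which gives the sixteen drives mentioned above.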

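Overview

Config name: minio
Node count: four nodes, pigsty/vagrant/spec/minio.rb
Description: deploy a four-node, highly available, multi-node multi-drive MinIO cluster that provides S3-compatible object storage
Supported OS: el7, el8, el9, d12, u20, u22, u24
Supported arch: x86_64, aarch64
Related configs: full

To enable it, pass the -c minio argument to configure:

./configure -c minio

Note: this is a four-node template. After generating the configuration you may need to change the IP addresses of the other three nodes (optional).

Content

Source: pigsty/conf/minio.yml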

all:
  children:

    # infra cluster for proxy, monitor, alert, etc..
    infra: { hosts: { 10.10.10.10: { infra_seq: 1 } } }

    # minio cluster with 4 nodes and 4 drives per node
    minio:
      hosts:
        10.10.10.10: { minio_seq: 1 , nodename: minio-1 }
        10.10.10.11: { minio_seq: 2 , nodename: minio-2 }
        10.10.10.12: { minio_seq: 3 , nodename: minio-3 }
        10.10.10.13: { minio_seq: 4 , nodename: minio-4 }
      vars:
        minio_cluster: minio
        minio_data: '/data{1...4}'
        minio_buckets: [ { name: pgsql }, { name: infra }, { name: redis } ]
        minio_users:
          - { access_key: dba , secret_key: S3User.DBA, policy: consoleAdmin }
          - { access_key: pgbackrest , secret_key: S3User.SomeNewPassWord , policy: readwrite }

        # bind a node l2 vip (10.10.10.9) to minio cluster (optional)
        node_cluster: minio
        vip_enabled: true
        vip_vrid: 128
        vip_address: 10.10.10.9
        vip_interface: eth1

        # expose minio service with haproxy on all nodes
        haproxy_services:
          - name: minio                    # [REQUIRED] service name, unique
            port: 9002                     # [REQUIRED] service port, unique
            balance: leastconn             # [OPTIONAL] load balancer algorithm
            options:                       # [OPTIONAL] minio health check
              - option httpchk
              - option http-keep-alive
              - http-check send meth OPTIONS uri /minio/health/live
              - http-check expect status 200
            servers:
              - { name: minio-1 ,ip: 10.10.10.10 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
              - { name: minio-2 ,ip: 10.10.10.11 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
              - { name: minio-3 ,ip: 10.10.10.12 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
              - { name: minio-4 ,ip: 10.10.10.13 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }

  vars:
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default|china|europe
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }

      # domain names to access minio web console via nginx web portal (optional)
      minio        : { domain: m.pigsty     ,endpoint: "10.10.10.10:9001" ,scheme: https ,websocket: true }
      minio10      : { domain: m10.pigsty   ,endpoint: "10.10.10.10:9001" ,scheme: https ,websocket: true }
      minio11      : { domain: m11.pigsty   ,endpoint: "10.10.10.11:9001" ,scheme: https ,websocket: true }
      minio12      : { domain: m12.pigsty   ,endpoint: "10.10.10.12:9001" ,scheme: https ,websocket: true }
      minio13      : { domain: m13.pigsty   ,endpoint: "10.10.10.13:9001" ,scheme: https ,websocket: true }

    minio_endpoint: https://sss.pigsty:9002   # explicit overwrite minio endpoint with haproxy port
    node_etc_hosts: ["10.10.10.9 sss.pigsty"] # domain name to access minio from all nodes (required)

Notes
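The template comment describes 4 nodes with 4 drives each, and minio_data uses MinIO's three-dot range notation ('/data{1...4}'), which is not the same as bash's two-dot brace expansion. As a rough sketch, the snippet below expands such a pattern and counts the drives the cluster would span. The helper name expand_range and the variable names are illustrative only and are not part of Pigsty or MinIO.

```bash
#!/usr/bin/env bash
# Sketch only: expand a MinIO-style '{a...b}' range and count drives.
# expand_range is a hypothetical helper, not a Pigsty or MinIO command.

expand_range() {
  local pattern=$1 prefix suffix range first last i
  prefix=${pattern%%\{*}     # text before '{'
  suffix=${pattern#*\}}      # text after '}'
  range=${pattern#*\{}       # drop everything up to '{'
  range=${range%%\}*}        # drop '}' and what follows, leaving 'a...b'
  first=${range%%...*}       # number before the three dots
  last=${range##*...}        # number after the three dots
  for ((i = first; i <= last; i++)); do
    printf '%s%s%s\n' "$prefix" "$i" "$suffix"
  done
}

nodes=4   # hosts listed in the minio group of this template
drives_per_node=$(expand_range '/data{1...4}' | wc -l | tr -d ' ')
total_drives=$(( nodes * drives_per_node ))

expand_range '/data{1...4}'   # prints one data directory per line
echo "drives per node: $drives_per_node, total: $total_drives"
```

For this template the count comes to 16 drives, so each of the four nodes needs /data1 through /data4 to exist before the cluster is initialized.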

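4.16 - Two-Node: dual

Two-node template for a limited high-availability deployment that tolerates the failure of one specific server.

This template deploys one primary and one replica across two nodes, giving a "semi-HA" setup. It is a reasonable choice when you only have two servers.

Overview

Template name: dual
Node count: two nodes
Description: two-node template for a limited high-availability deployment that tolerates the failure of one specific server.
Supported OS: el8, el9, d12, u22, u24
Supported architectures: x86_64, aarch64
Related templates:
Vagrant: two nodes, pigsty/vagrant/spec/dual.rb

Usage

To enable, pass -c dual to configure: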

./configure -c dual [-i <primary_ip>]

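After the configuration is generated, replace the placeholder IP address 10.10.10.11 with the IP address of your second node, which this template uses as the PostgreSQL primary.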
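One way to swap the placeholder is a scripted search-and-replace. The sketch below works on a throwaway sample file rather than your real pigsty.yml. The function name replace_placeholder and the address 192.168.1.21 are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: replace a placeholder IP in a config file.
# replace_placeholder and NEW_IP are hypothetical; adapt them to your setup.

replace_placeholder() {
  # $1 = config file, $2 = placeholder IP, $3 = real IP
  local conf=$1 old new=$3
  old=$(printf '%s' "$2" | sed 's/\./\\./g')   # escape dots so they match literally
  # First expression: placeholder followed by a non-digit (avoids hitting 10.10.10.110).
  # Second expression: placeholder at the very end of a line.
  sed -e "s/${old}\([^0-9]\)/${new}\1/g" -e "s/${old}\$/${new}/" "$conf" > "${conf}.tmp" \
    && mv "${conf}.tmp" "$conf"
}

# Sample fragment mirroring the pg-meta hosts of this template
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
pg-meta:
  hosts:
    10.10.10.10: { pg_seq: 1, pg_role: replica }
    10.10.10.11: { pg_seq: 2, pg_role: primary }
EOF

NEW_IP=192.168.1.21
replace_placeholder "$CONF" 10.10.10.11 "$NEW_IP"
cat "$CONF"
```

Review the result before running any playbook, since the admin node address 10.10.10.10 is normally already rewritten by configure and should be left alone.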

Configuration

Source file: pigsty/conf/dual.yml


# It is recommended to use at least three nodes in production deployment.
# But sometimes, there are only two nodes available, that's what dual.yml is for
#
# In this setup, we have two nodes, .10 (admin_node) and .11 (pgsql_primary):
#
# If .11 is down, .10 will take over since the dcs:etcd is still alive
# If .10 is down, .11 (pgsql primary) will still be functioning as a primary if:
#   - Only dcs:etcd is down
#   - Only pgsql is down
# if both etcd & pgsql are down (e.g. node down), the primary will still demote itself.


all:
  children:

    # infra cluster for proxy, monitor, alert, etc..
    infra: { hosts: { 10.10.10.10: { infra_seq: 1 } } }

    # etcd cluster for ha postgres
    etcd: { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }

    # minio cluster, optional backup repo for pgbackrest
    #minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

    # postgres cluster 'pg-meta' with two instances: replica on .10, primary on .11
    pg-meta:
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: replica }
        10.10.10.11: { pg_seq: 2, pg_role: primary }  # <----- use this as primary by default
      vars:
        pg_cluster: pg-meta
        pg_databases: [ { name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [ pigsty ] ,extensions: [ { name: vector }] } ]
        pg_users:
          - { name: dbuser_meta ,password: DBUser.Meta   ,pgbouncer: true ,roles: [ dbrole_admin ]    ,comment: pigsty admin user }
          - { name: dbuser_view ,password: DBUser.Viewer ,pgbouncer: true ,roles: [ dbrole_readonly ] ,comment: read-only viewer for meta database }
        node_crontab: [ '00 01 * * * postgres /pg/bin/pg-backup full' ] # make a full backup every 1am
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.2/24
        pg_vip_interface: eth1

  vars:                               # global parameters
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default,china,europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }
      #minio        : { domain: m.pigsty ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }

    # consider using local fs or external s3 service for cold backup storage in dual node configuration
    #pgbackrest_method: minio

    #----------------------------------#
    # Repo, Node, Packages
    #----------------------------------#
    # if you wish to customize your own repo, change these settings:
    repo_modules: infra,node,pgsql    # install upstream repo during repo bootstrap
    repo_remove: true                 # remove existing repo on admin node during repo bootstrap
    node_repo_modules: local          # install the local module in repo_upstream for all nodes
    node_repo_remove: true            # remove existing node repo for node managed by pigsty
    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-common ] #,docker]
    repo_extra_packages: [ pg17-main ] #,pg17-core ,pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-olap ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl]
    pg_version: 17                    # default postgres version
    #pg_extensions: [pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl ] #,pg17-olap]
...

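Notes

A complete high-availability deployment in production normally needs at least three nodes, so that the cluster keeps running when any single server goes down. Failure detection by the DCS (etcd) and Patroni relies on a majority of nodes, and two nodes cannot provide one.

Sometimes, however, only two servers are available, and the dual template is a workable choice in that case. Suppose you have two servers:

Node A, 10.10.10.10: the default admin node, running the Infra module, a single-node etcd, and the PGSQL replica.
Node B, 10.10.10.11: runs only the PGSQL primary.

With this layout, the two-node template tolerates a failure of node B and fails over to node A automatically. If node A fails as a whole (the entire node goes down), manual intervention is required. If node A stays online and only etcd or PostgreSQL on it has a problem, the system as a whole keeps running normally.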
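The failure cases described above can be written down as a small lookup. This only restates the behaviour described in this section, using hypothetical scenario labels. It is not something Pigsty ships.

```bash
#!/usr/bin/env bash
# Sketch only: the dual-template failure matrix as a lookup function.
# Scenario labels (node-a, node-b, ...) are hypothetical names for illustration.

dual_outcome() {
  case "$1" in
    node-b)                    echo "automatic failover to node A" ;;
    node-a)                    echo "manual intervention required" ;;
    node-a-etcd|node-a-pgsql)  echo "cluster keeps running" ;;
    *)                         echo "unknown scenario"; return 1 ;;
  esac
}

for scenario in node-b node-a node-a-etcd node-a-pgsql; do
  printf '%-14s -> %s\n' "$scenario" "$(dual_outcome "$scenario")"
done
```

The asymmetry comes from the single etcd instance living on node A: losing the whole of node A removes both the DCS and the failover candidate at once.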

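This template uses an L2 VIP for highly available access. If your network does not allow an L2 VIP (for example, in a restricted cloud environment or across switch broadcast domains), consider DNS resolution or another access method instead.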
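Because an L2 VIP has to move between hosts on the same layer-2 segment, a quick sanity check is whether the VIP falls inside the same subnet as each node. The sketch below does that arithmetic for the addresses used in this template. The helper names are hypothetical, and a passing check is necessary but not sufficient, since the network must still allow the address to move between hosts.

```bash
#!/usr/bin/env bash
# Sketch only: check that a VIP (CIDR form) and a node IP share a subnet.
# ip_to_int and same_subnet are hypothetical helpers, not Pigsty commands.

ip_to_int() {
  local IFS=.
  set -- $1                       # split dotted quad into four octets
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

same_subnet() {
  # $1 = VIP with prefix length, e.g. 10.10.10.2/24 ; $2 = node IP
  local vip=${1%/*} bits=${1#*/} node=$2 mask
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$vip") & mask )) -eq $(( $(ip_to_int "$node") & mask )) ]
}

for node in 10.10.10.10 10.10.10.11; do
  if same_subnet 10.10.10.2/24 "$node"; then
    echo "$node: same subnet as VIP"
  else
    echo "$node: different subnet, L2 VIP will not work"
  fi
done
```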

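4.17 - Two-Node: slim

A derived two-node template that skips the Infra module and the local software repo, and installs PostgreSQL directly from the Internet.

This template builds on the two-node layout and provides a slim installation: PostgreSQL is installed straight from the Internet without deploying the Infra module.

Consider the slim installation mode when you need the simplest usable database instance and do not want to deploy monitoring and its dependencies.

Overview

Template name: slim
Node count: two nodes, pigsty/vagrant/spec/dual.rb
Description: slim installation template
Supported OS: el8, el9, d12, u22, u24
Supported architectures: x86_64, aarch64
Related templates: dual

Usage

To enable, pass -c slim to configure: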

./configure -c slim [-i <primary_ip>]

After the configuration is generated, replace the placeholder IP address 10.10.10.11 with the IP address of your replica node.

Configuration

Source file: pigsty/conf/slim.yml

# This config file will perform a minimal installation on two nodes
# Directly from the Internet without any local repo or infrastructure
# ./configure -c slim
# ./slim.yml

all:
  children:

    # actually not used
    infra: { hosts: { 10.10.10.10: { infra_seq: 1 } } }

    #----------------------------------#
    # etcd cluster for HA postgres DCS
    #----------------------------------#
    etcd:
      hosts:
        10.10.10.10: { etcd_seq: 1 }
      vars:
        etcd_cluster: etcd

    # postgres cluster 'pg-meta' with 2 instances
    pg-meta:
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: primary }
        10.10.10.11: { pg_seq: 2, pg_role: replica }
      vars:
        pg_cluster: pg-meta
        pg_databases: [ { name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty] ,extensions: [{name: vector}]}]
        pg_users:
          - { name: dbuser_meta ,password: DBUser.Meta   ,pgbouncer: true ,roles: [ dbrole_admin ]    ,comment: pigsty admin user }
          - { name: dbuser_view ,password: DBUser.Viewer ,pgbouncer: true ,roles: [ dbrole_readonly ] ,comment: read-only viewer for meta database }
        node_crontab: [ '00 01 * * * postgres /pg/bin/pg-backup full' ] # make a full backup every 1am

  vars:                               # global parameters
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default,china,europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml

    # slim installation setup
    nginx_enabled: false              # nginx not exists
    dns_enabled: false                # dnsmasq not exists
    prometheus_enabled: false         # prometheus not exists
    grafana_enabled: false            # grafana not exists
    pg_exporter_enabled: false        # disable pg_exporter
    pgbouncer_exporter_enabled: false
    pg_vip_enabled: false

    #----------------------------------#
    # Repo, Node, Packages
    #----------------------------------#
    # if you wish to customize your own repo, change these settings:
    repo_modules: infra,node,pgsql
    repo_remove: true                 # remove existing repo on admin node during repo bootstrap
    node_repo_modules: local          # install the local module in repo_upstream for all nodes
    node_repo_remove: true            # remove existing node repo for node managed by pigsty
    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-common ] #,docker]
    repo_extra_packages: [ pg17-main ] #,pg17-core ,pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-olap ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl]
    pg_version: 17                    # default postgres version
    #pg_extensions: [pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl ] #,pg17-olap]

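Notes

Because the monitoring infrastructure provided by the Infra module is absent, the slim installation mode offers no database monitoring.

4.18 - Three-Node: trio

Three-node template with the standard HA architecture, tolerating the failure of one of the three servers.

Three nodes is the smallest size that delivers true high availability: at this size the DCS (etcd) can tolerate the loss of one server.

This template uses the standard three-node HA architecture. The three core modules INFRA, ETCD, and PGSQL are each deployed on all three nodes, so any one node may go down.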
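The "three nodes minimum" rule follows from majority quorum: a cluster of n voting members needs floor(n/2) + 1 of them alive, so it tolerates n minus that many failures. A minimal sketch of the arithmetic, with hypothetical function names:

```bash
#!/usr/bin/env bash
# Sketch only: majority-quorum arithmetic for a DCS such as etcd.
# quorum and tolerance are hypothetical helper names.

quorum()    { echo $(( $1 / 2 + 1 )); }            # members needed for a majority
tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }     # members that may fail

for n in 1 2 3 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerance "$n")"
done
```

One and two members both tolerate zero failures, which is why a two-node etcd buys nothing over a single node, while three members tolerate one failure and five tolerate two.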

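Overview

Template name: trio
Node count: three nodes, pigsty/vagrant/spec/trio.rb
Description: three-node template with the standard HA architecture, tolerating the failure of any one of the three servers.
Supported OS: el8, el9, d12, u22, u24
Supported architectures: x86_64
Related templates: safe

To enable, pass -c trio to configure: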

./configure -c trio

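Note: this is a three-node template. After generating the configuration, replace the IP addresses of the other two nodes with your own.

Configuration

Source file: pigsty/conf/trio.yml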

# 3 infra node, 3 etcd node, 3 pgsql node, and 1 minio node

all:

  #==============================================================#
  # Clusters, Nodes, and Modules
  #==============================================================#
  children:

    #----------------------------------#
    # infra: monitor, alert, repo, etc..
    #----------------------------------#
    infra: # infra cluster for proxy, monitor, alert, etc
      hosts: # 1 for common usage, 3 nodes for production
        10.10.10.10: { infra_seq: 1 } # identity required
        10.10.10.11: { infra_seq: 2, repo_enabled: false }
        10.10.10.12: { infra_seq: 3, repo_enabled: false }
      vars:
        patroni_watchdog_mode: off # do not fence infra nodes

    etcd: # dcs service for postgres/patroni ha consensus
      hosts: # 1 node for testing, 3 or 5 for production
        10.10.10.10: { etcd_seq: 1 }  # etcd_seq required
        10.10.10.11: { etcd_seq: 2 }  # assign from 1 ~ n
        10.10.10.12: { etcd_seq: 3 }  # odd number please
      vars: # cluster level parameter override roles/etcd
        etcd_cluster: etcd  # mark etcd cluster name etcd
        etcd_safeguard: false # safeguard against purging
        etcd_clean: true # purge etcd during init process

    minio: # minio cluster, s3 compatible object storage
      hosts: { 10.10.10.10: { minio_seq: 1 } }
      vars: { minio_cluster: minio }

    pg-meta: # 3 instance postgres cluster `pg-meta`
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: primary }
        10.10.10.11: { pg_seq: 2, pg_role: replica }
        10.10.10.12: { pg_seq: 3, pg_role: replica , pg_offline_query: true }
      vars:
        pg_cluster: pg-meta
        pg_users:
          - { name: dbuser_meta , password: DBUser.Meta ,pgbouncer: true ,roles: [ dbrole_admin ]    ,comment: pigsty admin user }
          - { name: dbuser_view , password: DBUser.View ,pgbouncer: true ,roles: [ dbrole_readonly ] ,comment: read-only viewer for meta database }
        pg_databases:
          - { name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [ pigsty ] ,extensions: [ { name: vector } ] }
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.2/24
        pg_vip_interface: eth1


  #==============================================================#
  # Global Parameters
  #==============================================================#
  vars:

    #----------------------------------#
    # Meta Data
    #----------------------------------#
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default|china|europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml
    proxy_env:                        # global proxy env when downloading packages
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.myqcloud.com,*.tsinghua.edu.cn"
      # http_proxy:  # set your proxy here: e.g http://user:pass@proxy.xxx.com
      # https_proxy: # set your proxy here: e.g http://user:pass@proxy.xxx.com
      # all_proxy:   # set your proxy here: e.g http://user:pass@proxy.xxx.com
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }
      #minio        : { domain: m.pigsty ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }

    #----------------------------------#
    # Repo, Node, Packages
    #----------------------------------#
    # if you wish to customize your own repo, change these settings:
    repo_modules: infra,node,pgsql    # install upstream repo during repo bootstrap
    repo_remove: true                 # remove existing repo on admin node during repo bootstrap
    node_repo_modules: local          # install the local module in repo_upstream for all nodes
    node_repo_remove: true            # remove existing node repo for node managed by pigsty
    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-common ] #,docker]
    repo_extra_packages: [ pg17-main ] #,pg17-core ,pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-olap ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl]
    pg_version: 17                    # default postgres version
    #pg_extensions: [pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl ] #,pg17-olap]

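Notes

4.19 - Five-Node: oss

Builds offline software packages in batch on the five operating system distributions that Pigsty supports.

The oss template is the configuration Pigsty uses to build offline packages locally. It is intended for local development only.

Overview

Template name: oss
Node count: five nodes
Description: builds offline software packages in batch on the five operating system distributions that Pigsty supports.
Supported OS: el8, el9, d12, u22, u24 (in one pass)
Supported architectures: x86_64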
Vagrant: five nodes, pigsty/vagrant/spec/oss.rb

To enable, copy oss.yml over the pigsty.yml configuration file:

cp conf/build/oss.yml pigsty.yml

Note: this is a build template with fixed IP addresses.

Configuration

Source file: pigsty/conf/build/oss.yml

all:
  vars:
    version: v3.3.0
    admin_ip: 10.10.10.9
    region: default
    etcd_clean: true
    proxy_env:
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.myqcloud.com,*.tsinghua.edu.cn,*.pigsty.cc"

    # building spec
    pg_version: 17
    cache_pkg_dir: 'dist/${version}/'
    repo_modules: infra,node,pgsql,docker #kube,mssql,ivory
    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-utility, docker ]
    repo_extra_packages: [ pg17-core ,pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-olap ,pg17-feat ,pg17-lang ,pg17-type ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl ,citus ]
    pg_extensions: [ pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-feat ,pg17-lang ,pg17-type ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl ]

  children:
    infra:
      hosts:
        10.10.10.9:  { infra_seq: 2, admin_ip: 10.10.10.9  ,ansible_host: el9 }
        10.10.10.12: { infra_seq: 3, admin_ip: 10.10.10.12 ,ansible_host: d12 }
        10.10.10.22: { infra_seq: 4, admin_ip: 10.10.10.22 ,ansible_host: u22 }
      vars: { node_conf: oltp }

    etcd: { hosts: { 10.10.10.9:  { etcd_seq: 1 }}, vars: {  etcd_cluster: etcd  } }

    el9:
      hosts: { 10.10.10.9: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-el9 }

    d12:
      hosts: { 10.10.10.12: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-d12 }

    u22:
      hosts: { 10.10.10.22: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-u22 }

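Notes

The build uses the local repo at http://10.10.10.1, which must be set up beforehand; otherwise the build fails. You can replace it with https://repo.pigsty.cc or https://repo.pigsty.io to complete the build.

4.20 - 36-Node: simu

A production simulation environment made up of 36 nodes. It needs a powerful host machine to run.

Overview

Template name: simu
Node count: 36 nodes, pigsty/vagrant/spec/simu.rb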
Description: a 36-node production simulation environment that needs a powerful host machine to run.
Supported OS: el8, el9, d12, u22, u24
Supported architectures: x86_64

cp -f conf/simu.yml pigsty.yml
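The figure of 36 nodes can be cross-checked against the address ranges used by the nodes group in the template below. The sketch simply adds up the last-octet ranges; the labels are descriptive only.

```bash
#!/usr/bin/env bash
# Sketch only: count hosts in the `nodes` group from the 10.10.10.x ranges it uses.
# Each input row is: first-octet last-octet label (labels are descriptive only).

total=0
while read -r first last label; do
  count=$(( last - first + 1 ))
  printf '%-16s 10.10.10.%s-%s  %2d\n' "$label" "$first" "$last" "$count"
  total=$(( total + count ))
done <<'EOF'
10 11 meta/infra
12 17 pg-v12..pg-v17
18 19 proxy
21 25 minio+etcd
40 59 node40..node59
88 88 test
EOF

echo "total nodes: $total"
```

Before bringing the environment up, it is worth confirming that the host can supply CPU and memory for that many virtual machines at once.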

Configuration

Source file: pigsty/conf/simu.yml

all:

  children:

    #==========================================================#
    # infra: 2 nodes
    #==========================================================#
    # ./infra.yml -l infra
    # ./docker.yml -l infra (optional)
    infra:
      hosts:
        10.10.10.10: {}
        10.10.10.11: {}
      vars:
        docker_enabled: true
        node_conf: oltp         # use oltp template for infra nodes
        pg_conf: oltp.yml       # use oltp template for infra pgsql
        pg_exporters:           # bin/pgmon-add pg-meta2/pg-test2/pg-src2/pg-dst2
          20001: {pg_cluster: pg-meta2   ,pg_seq: 1 ,pg_host: 10.10.10.10, pg_databases: [{ name: meta }]}
          20002: {pg_cluster: pg-meta2   ,pg_seq: 2 ,pg_host: 10.10.10.11, pg_databases: [{ name: meta }]}

          20003: {pg_cluster: pg-test2   ,pg_seq: 1 ,pg_host: 10.10.10.41, pg_databases: [{ name: test }]}
          20004: {pg_cluster: pg-test2   ,pg_seq: 2 ,pg_host: 10.10.10.42, pg_databases: [{ name: test }]}
          20005: {pg_cluster: pg-test2   ,pg_seq: 3 ,pg_host: 10.10.10.43, pg_databases: [{ name: test }]}
          20006: {pg_cluster: pg-test2   ,pg_seq: 4 ,pg_host: 10.10.10.44, pg_databases: [{ name: test }]}

          20007: {pg_cluster: pg-src2    ,pg_seq: 1 ,pg_host: 10.10.10.45, pg_databases: [{ name: src }]}
          20008: {pg_cluster: pg-src2    ,pg_seq: 2 ,pg_host: 10.10.10.46, pg_databases: [{ name: src }]}
          20009: {pg_cluster: pg-src2    ,pg_seq: 3 ,pg_host: 10.10.10.47, pg_databases: [{ name: src }]}

          20010: {pg_cluster: pg-dst2    ,pg_seq: 3 ,pg_host: 10.10.10.48, pg_databases: [{ name: dst }]}
          20011: {pg_cluster: pg-dst2    ,pg_seq: 4 ,pg_host: 10.10.10.49, pg_databases: [{ name: dst }]}


    #==========================================================#
    # nodes: 36 nodes
    #==========================================================#
    # ./node.yml
    nodes:
      hosts:
        10.10.10.10 : { nodename: meta1  ,node_cluster: meta   ,pg_cluster: pg_meta  ,pg_seq: 1 ,pg_role: primary, infra_seq: 1 }
        10.10.10.11 : { nodename: meta2  ,node_cluster: meta   ,pg_cluster: pg_meta  ,pg_seq: 2 ,pg_role: replica, infra_seq: 2 }
        10.10.10.12 : { nodename: pg12   ,node_cluster: pg12   ,pg_cluster: pg-v12   ,pg_seq: 1 ,pg_role: primary }
        10.10.10.13 : { nodename: pg13   ,node_cluster: pg13   ,pg_cluster: pg-v13   ,pg_seq: 1 ,pg_role: primary }
        10.10.10.14 : { nodename: pg14   ,node_cluster: pg14   ,pg_cluster: pg-v14   ,pg_seq: 1 ,pg_role: primary }
        10.10.10.15 : { nodename: pg15   ,node_cluster: pg15   ,pg_cluster: pg-v15   ,pg_seq: 1 ,pg_role: primary }
        10.10.10.16 : { nodename: pg16   ,node_cluster: pg16   ,pg_cluster: pg-v16   ,pg_seq: 1 ,pg_role: primary }
        10.10.10.17 : { nodename: pg17   ,node_cluster: pg17   ,pg_cluster: pg-v17   ,pg_seq: 1 ,pg_role: primary }
        10.10.10.18 : { nodename: proxy1 ,node_cluster: proxy  ,vip_address: 10.10.10.20 ,vip_vrid: 20 ,vip_interface: eth1 ,vip_role: master }
        10.10.10.19 : { nodename: proxy2 ,node_cluster: proxy  ,vip_address: 10.10.10.20 ,vip_vrid: 20 ,vip_interface: eth1 ,vip_role: backup }
        10.10.10.21 : { nodename: minio1 ,node_cluster: minio  ,minio_cluster: minio ,minio_seq: 1 ,etcd_cluster: etcd ,etcd_seq: 1}
        10.10.10.22 : { nodename: minio2 ,node_cluster: minio  ,minio_cluster: minio ,minio_seq: 2 ,etcd_cluster: etcd ,etcd_seq: 2}
        10.10.10.23 : { nodename: minio3 ,node_cluster: minio  ,minio_cluster: minio ,minio_seq: 3 ,etcd_cluster: etcd ,etcd_seq: 3}
        10.10.10.24 : { nodename: minio4 ,node_cluster: minio  ,minio_cluster: minio ,minio_seq: 4 ,etcd_cluster: etcd ,etcd_seq: 4}
        10.10.10.25 : { nodename: minio5 ,node_cluster: minio  ,minio_cluster: minio ,minio_seq: 5 ,etcd_cluster: etcd ,etcd_seq: 5}
        10.10.10.40 : { nodename: node40 ,node_id_from_pg: true }
        10.10.10.41 : { nodename: node41 ,node_id_from_pg: true }
        10.10.10.42 : { nodename: node42 ,node_id_from_pg: true }
        10.10.10.43 : { nodename: node43 ,node_id_from_pg: true }
        10.10.10.44 : { nodename: node44 ,node_id_from_pg: true }
        10.10.10.45 : { nodename: node45 ,node_id_from_pg: true }
        10.10.10.46 : { nodename: node46 ,node_id_from_pg: true }
        10.10.10.47 : { nodename: node47 ,node_id_from_pg: true }
        10.10.10.48 : { nodename: node48 ,node_id_from_pg: true }
        10.10.10.49 : { nodename: node49 ,node_id_from_pg: true }
        10.10.10.50 : { nodename: node50 ,node_id_from_pg: true }
        10.10.10.51 : { nodename: node51 ,node_id_from_pg: true }
        10.10.10.52 : { nodename: node52 ,node_id_from_pg: true }
        10.10.10.53 : { nodename: node53 ,node_id_from_pg: true }
        10.10.10.54 : { nodename: node54 ,node_id_from_pg: true }
        10.10.10.55 : { nodename: node55 ,node_id_from_pg: true }
        10.10.10.56 : { nodename: node56 ,node_id_from_pg: true }
        10.10.10.57 : { nodename: node57 ,node_id_from_pg: true }
        10.10.10.58 : { nodename: node58 ,node_id_from_pg: true }
        10.10.10.59 : { nodename: node59 ,node_id_from_pg: true }
        10.10.10.88 : { nodename: test   }

    #==========================================================#
    # etcd: 5 nodes, shared with the minio cluster
    #==========================================================#
    # ./etcd.yml -l etcd;
    etcd:
      hosts:
        10.10.10.21: {}
        10.10.10.22: {}
        10.10.10.23: {}
        10.10.10.24: {}
        10.10.10.25: {}
      vars: {}

    #==========================================================#
    # minio: 5 nodes used as dedicated minio cluster
    #==========================================================#
    # ./minio.yml -l minio;
    minio:
      hosts:
        10.10.10.21: {}
        10.10.10.22: {}
        10.10.10.23: {}
        10.10.10.24: {}
        10.10.10.25: {}
      vars:
        minio_data: '/data{1...4}' # 5 node x 4 disk

    #==========================================================#
    # proxy: 2 nodes used as dedicated haproxy server
    #==========================================================#
    # ./node.yml -l proxy
    proxy:
      hosts:
        10.10.10.18: {}
        10.10.10.19: {}
      vars:
        vip_enabled: true
        haproxy_services:      # expose minio service : sss.pigsty:9000
          - name: minio        # [REQUIRED] service name, unique
            port: 9000         # [REQUIRED] service port, unique
            balance: leastconn # Use leastconn algorithm and minio health check
            options: [ "option httpchk", "option http-keep-alive", "http-check send meth OPTIONS uri /minio/health/live", "http-check expect status 200" ]
            servers:           # reload service with ./node.yml -t haproxy_config,haproxy_reload
              - { name: minio-1 ,ip: 10.10.10.21 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
              - { name: minio-2 ,ip: 10.10.10.22 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
              - { name: minio-3 ,ip: 10.10.10.23 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
              - { name: minio-4 ,ip: 10.10.10.24 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
              - { name: minio-5 ,ip: 10.10.10.25 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }

    #==========================================================#
    # pg-meta: reuse infra node as meta cmdb
    #==========================================================#
    # ./pgsql.yml -l pg-meta
    pg-meta:
      hosts:
        10.10.10.10: { pg_seq: 1 , pg_role: primary }
        10.10.10.11: { pg_seq: 2 , pg_role: replica }
      vars:
        pg_cluster: pg-meta
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.2/24
        pg_vip_interface: eth1
        pg_users:
          - {name: dbuser_meta     ,password: DBUser.Meta     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: pigsty admin user }
          - {name: dbuser_view     ,password: DBUser.Viewer   ,pgbouncer: true ,roles: [dbrole_readonly] ,comment: read-only viewer for meta database }
          - {name: dbuser_grafana  ,password: DBUser.Grafana  ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for grafana database    }
          - {name: dbuser_bytebase ,password: DBUser.Bytebase ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for bytebase database   }
          - {name: dbuser_kong     ,password: DBUser.Kong     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for kong api gateway    }
          - {name: dbuser_gitea    ,password: DBUser.Gitea    ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for gitea service       }
          - {name: dbuser_wiki     ,password: DBUser.Wiki     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for wiki.js service     }
          - {name: dbuser_noco     ,password: DBUser.Noco     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for nocodb service      }
        pg_databases:
          - { name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty] ,extensions: [{name: vector}]}
          - { name: grafana  ,owner: dbuser_grafana  ,revokeconn: true ,comment: grafana primary database }
          - { name: bytebase ,owner: dbuser_bytebase ,revokeconn: true ,comment: bytebase primary database }
          - { name: kong     ,owner: dbuser_kong     ,revokeconn: true ,comment: kong the api gateway database }
          - { name: gitea    ,owner: dbuser_gitea    ,revokeconn: true ,comment: gitea meta database }
          - { name: wiki     ,owner: dbuser_wiki     ,revokeconn: true ,comment: wiki meta database }
          - { name: noco     ,owner: dbuser_noco     ,revokeconn: true ,comment: nocodb database }
        pg_hba_rules:
          - { user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes' }
        pg_libs: 'pg_stat_statements, auto_explain' # shared_preload_libraries; add timescaledb here if needed
        node_crontab:  # make a full backup on monday 1am, and an incremental backup on the other days
          - '00 01 * * 1 postgres /pg/bin/pg-backup full'
          - '00 01 * * 2,3,4,5,6,7 postgres /pg/bin/pg-backup'

    #==========================================================#
    # pg-v12 - v17
    #==========================================================#
    # ./pgsql.yml -l pg-v*
    pg-v12:
      hosts: { 10.10.10.12: {}}
      vars:
        pg_version: 13
        pg_service_provider: proxy       # use load balancer on group `proxy` with port 10012
        pg_default_services:  [{ name: primary ,port: 10012 ,dest: postgres  ,check: /primary   ,selector: "[]" }]

    pg-v13:
      hosts: { 10.10.10.13: {}}
      vars:
        pg_version: 13
        pg_service_provider: proxy       # use load balancer on group `proxy` with port 10013
        pg_default_services:  [{ name: primary ,port: 10013 ,dest: postgres  ,check: /primary   ,selector: "[]" }]

    pg-v14:
      hosts: { 10.10.10.14: {}}
      vars:
        pg_version: 14
        pg_service_provider: proxy       # use load balancer on group `proxy` with port 10014
        pg_default_services:  [{ name: primary ,port: 10014 ,dest: postgres  ,check: /primary   ,selector: "[]" }]

    pg-v15:
      hosts: { 10.10.10.15: {}}
      vars:
        pg_version: 15
        pg_service_provider: proxy       # use load balancer on group `proxy` with port 10015
        pg_default_services:  [{ name: primary ,port: 10015 ,dest: postgres  ,check: /primary   ,selector: "[]" }]

    pg-v16:
      hosts: { 10.10.10.16: {}}
      vars:
        pg_version: 16
        pg_service_provider: proxy       # use load balancer on group `proxy` with port 10016
        pg_default_services:  [{ name: primary ,port: 10016 ,dest: postgres  ,check: /primary   ,selector: "[]" }]

    pg-v17:
      hosts: { 10.10.10.17: {}}
      vars:
        pg_version: 17
        pg_service_provider: proxy       # use load balancer on group `proxy` with port 10017
        pg_default_services:  [{ name: primary ,port: 10017 ,dest: postgres  ,check: /primary   ,selector: "[]" }]

    #==========================================================#
    # pg-pitr: single node
    #==========================================================#
    # ./pgsql.yml -l pg-pitr
    pg-pitr:
      hosts:
        10.10.10.40: { pg_seq: 1 ,pg_role: primary }
      vars:
        pg_cluster: pg-pitr
        pg_databases: [{ name: test }]

    #==========================================================#
    # pg-test: dedicated 4 node testing cluster
    #==========================================================#
    # ./pgsql.yml -l pg-test
    pg-test:
      hosts:
        10.10.10.41: { pg_seq: 1 ,pg_role: primary }
        10.10.10.42: { pg_seq: 2 ,pg_role: replica }
        10.10.10.43: { pg_seq: 3 ,pg_role: replica }
        10.10.10.44: { pg_seq: 4 ,pg_role: replica }
      vars:
        pg_cluster: pg-test
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.3/24
        pg_vip_interface: eth1
        pg_users:  [{ name: test , password: test , pgbouncer: true , roles: [ dbrole_admin ] }]
        pg_databases: [{ name: test }]


    #==========================================================#
    # pg-src: dedicated 3 node testing cluster
    #==========================================================#
    # ./pgsql.yml -l pg-src
    pg-src:
      hosts:
        10.10.10.45: { pg_seq: 1 ,pg_role: primary }
        10.10.10.46: { pg_seq: 2 ,pg_role: replica }
        10.10.10.47: { pg_seq: 3 ,pg_role: replica }
      vars:
        pg_cluster: pg-src
        #pg_version: 14
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.4/24
        pg_vip_interface: eth1
        pg_users:  [{ name: test , password: test , pgbouncer: true , roles: [ dbrole_admin ] }]
        pg_databases: [{ name: src }]


    #==========================================================#
    # pg-dst: dedicated 2-node testing cluster
    #==========================================================#
    # ./pgsql.yml -l pg-dst
    pg-dst:
      hosts:
        10.10.10.48: { pg_seq: 1 ,pg_role: primary } # 8C 8G
        10.10.10.49: { pg_seq: 2 ,pg_role: replica } # 1C 2G
      vars:
        pg_cluster: pg-dst
        pg_vip_enabled: true
        pg_vip_address: 10.10.10.5/24
        pg_vip_interface: eth1
        node_hugepage_ratio: 0.3
        pg_users: [ { name: test , password: test , pgbouncer: true , roles: [ dbrole_admin ] } ]
        pg_databases: [ { name: dst } ]


    #==========================================================#
    # pg-citus: 10 node citus cluster (5 x primary-replica pair)
    #==========================================================#
    pg-citus: # citus group
      hosts:
        10.10.10.50: { pg_group: 0, pg_cluster: pg-citus0 ,pg_vip_address: 10.10.10.60/24 ,pg_seq: 0, pg_role: primary }
        10.10.10.51: { pg_group: 0, pg_cluster: pg-citus0 ,pg_vip_address: 10.10.10.60/24 ,pg_seq: 1, pg_role: replica }
        10.10.10.52: { pg_group: 1, pg_cluster: pg-citus1 ,pg_vip_address: 10.10.10.61/24 ,pg_seq: 0, pg_role: primary }
        10.10.10.53: { pg_group: 1, pg_cluster: pg-citus1 ,pg_vip_address: 10.10.10.61/24 ,pg_seq: 1, pg_role: replica }
        10.10.10.54: { pg_group: 2, pg_cluster: pg-citus2 ,pg_vip_address: 10.10.10.62/24 ,pg_seq: 0, pg_role: primary }
        10.10.10.55: { pg_group: 2, pg_cluster: pg-citus2 ,pg_vip_address: 10.10.10.62/24 ,pg_seq: 1, pg_role: replica }
        10.10.10.56: { pg_group: 3, pg_cluster: pg-citus3 ,pg_vip_address: 10.10.10.63/24 ,pg_seq: 0, pg_role: primary }
        10.10.10.57: { pg_group: 3, pg_cluster: pg-citus3 ,pg_vip_address: 10.10.10.63/24 ,pg_seq: 1, pg_role: replica }
        10.10.10.58: { pg_group: 4, pg_cluster: pg-citus4 ,pg_vip_address: 10.10.10.64/24 ,pg_seq: 0, pg_role: primary }
        10.10.10.59: { pg_group: 4, pg_cluster: pg-citus4 ,pg_vip_address: 10.10.10.64/24 ,pg_seq: 1, pg_role: replica }
      vars:
        pg_mode: citus                    # pgsql cluster mode: citus
        pg_version: 16                    # PostgreSQL major version used by this citus cluster
        pg_shard: pg-citus                # citus shard name: pg-citus
        pg_primary_db: test               # primary database used by citus
        pg_dbsu_password: DBUser.Postgres # all dbsu password access for citus cluster
        pg_vip_enabled: true
        pg_vip_interface: eth1
        pg_extensions: [ 'citus postgis pgvector' ]
        pg_libs: 'citus, pg_stat_statements, auto_explain' # citus will be added by patroni automatically
        pg_users: [ { name: test ,password: test ,pgbouncer: true ,roles: [ dbrole_admin ] } ]
        pg_databases: [ { name: test ,owner: test ,extensions: [ { name: citus }, { name: vector } ] } ]
        pg_hba_rules:
          - { user: 'all' ,db: all  ,addr: 10.10.10.0/24 ,auth: trust ,title: 'trust citus cluster members'        }
          - { user: 'all' ,db: all  ,addr: 127.0.0.1/32  ,auth: ssl   ,title: 'all user ssl access from localhost' }
          - { user: 'all' ,db: all  ,addr: intra         ,auth: ssl   ,title: 'all user ssl access from intranet'  }

    #==========================================================#
    # redis-meta: reuse the 5 etcd nodes as redis sentinel
    #==========================================================#
    # ./redis.yml -l redis-meta
    redis-meta:
      hosts:
        10.10.10.21: { redis_node: 1 , redis_instances: { 26379: {} } }
        10.10.10.22: { redis_node: 2 , redis_instances: { 26379: {} } }
        10.10.10.23: { redis_node: 3 , redis_instances: { 26379: {} } }
        10.10.10.24: { redis_node: 4 , redis_instances: { 26379: {} } }
        10.10.10.25: { redis_node: 5 , redis_instances: { 26379: {} } }
      vars:
        redis_cluster: redis-meta
        redis_password: 'redis.meta'
        redis_mode: sentinel
        redis_max_memory: 256MB
        redis_sentinel_monitor:  # primary list for redis sentinel, use cls as name, primary ip:port
          - { name: redis-src, host: 10.10.10.45, port: 6379 ,password: redis.src, quorum: 1 }
          - { name: redis-dst, host: 10.10.10.48, port: 6379 ,password: redis.dst, quorum: 1 }

    #==========================================================#
    # redis-test: redis native cluster in 4 nodes, 12 instances
    #==========================================================#
    # ./node.yml -l redis-test; ./redis.yml -l redis-test
    redis-test:
      hosts:
        10.10.10.41: { redis_node: 1 ,redis_instances: { 6379: {} ,6380: {} ,6381: {} } }
        10.10.10.42: { redis_node: 2 ,redis_instances: { 6379: {} ,6380: {} ,6381: {} } }
        10.10.10.43: { redis_node: 3 ,redis_instances: { 6379: {} ,6380: {} ,6381: {} } }
        10.10.10.44: { redis_node: 4 ,redis_instances: { 6379: {} ,6380: {} ,6381: {} } }
      vars:
        redis_cluster: redis-test
        redis_password: 'redis.test'
        redis_mode: cluster
        redis_max_memory: 64MB

    #==========================================================#
    # redis-src: reuse pg-src 3 nodes for redis
    #==========================================================#
    # ./redis.yml -l redis-src
    redis-src:
      hosts:
        10.10.10.45: { redis_node: 1 , redis_instances: {6379: {  } }}
        10.10.10.46: { redis_node: 2 , redis_instances: {6379: { replica_of: '10.10.10.45 6379' }, 6380: { replica_of: '10.10.10.46 6379' } }}
        10.10.10.47: { redis_node: 3 , redis_instances: {6379: { replica_of: '10.10.10.45 6379' }, 6380: { replica_of: '10.10.10.47 6379' } }}
      vars:
        redis_cluster: redis-src
        redis_password: 'redis.src'
        redis_max_memory: 64MB

    #==========================================================#
    # redis-dst: reuse pg-dst 2 nodes for redis
    #==========================================================#
    # ./redis.yml -l redis-dst
    redis-dst:
      hosts:
        10.10.10.48: { redis_node: 1 , redis_instances: {6379: {  }                               }}
        10.10.10.49: { redis_node: 2 , redis_instances: {6379: { replica_of: '10.10.10.48 6379' } }}
      vars:
        redis_cluster: redis-dst
        redis_password: 'redis.dst'
        redis_max_memory: 64MB

    #==========================================================#
    # ferret: reuse pg-src as mongo (ferretdb)
    #==========================================================#
    # ./mongo.yml -l ferret
    ferret:
      hosts:
        10.10.10.45: { mongo_seq: 1 }
        10.10.10.46: { mongo_seq: 2 }
        10.10.10.47: { mongo_seq: 3 }
      vars:
        mongo_cluster: ferret
        mongo_pgurl: 'postgres://test:test@10.10.10.45:5432/src'
        #mongo_pgurl: 'postgres://test:test@10.10.10.3:5436/test'


    #==========================================================#
    # test: running cli tools and test miscellaneous stuff
    #==========================================================#
    test:
      hosts: { 10.10.10.88: { nodename: test } }
      vars:
        node_cluster: test
        node_packages: [ 'etcd,logcli,mcli,redis' ]


  #============================================================#
  # Global Variables
  #============================================================#
  vars:

    #==========================================================#
    # INFRA
    #==========================================================#
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default|china|europe
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "10.10.10.10:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "10.10.10.10:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "10.10.10.10:9093" }
      blackbox     : { endpoint: "10.10.10.10:9115" }
      loki         : { endpoint: "10.10.10.10:3100" }
      minio        : { domain: m.pigsty    ,endpoint: "10.10.10.21:9001" ,scheme: https ,websocket: true }
      postgrest    : { domain: api.pigsty  ,endpoint: "127.0.0.1:8884" }
      pgadmin      : { domain: adm.pigsty  ,endpoint: "127.0.0.1:8885" }
      pgweb        : { domain: cli.pigsty  ,endpoint: "127.0.0.1:8886" }
      bytebase     : { domain: ddl.pigsty  ,endpoint: "127.0.0.1:8887" }
      jupyter      : { domain: lab.pigsty  ,endpoint: "127.0.0.1:8888"  , websocket: true }
      supa         : { domain: supa.pigsty ,endpoint: "10.10.10.10:8000", websocket: true }
    nginx_navbar: []
    dns_records:                      # dynamic dns records resolved by dnsmasq
      - 10.10.10.1 h.pigsty a.pigsty p.pigsty g.pigsty

    #==========================================================#
    # NODE
    #==========================================================#
    node_id_from_pg: false            # use nodename rather than pg identity as hostname
    node_conf: tiny                   # use small node template
    node_timezone: Asia/Hong_Kong     # use Asia/Hong_Kong Timezone
    node_dns_servers:                 # DNS servers in /etc/resolv.conf
      - 10.10.10.10
      - 10.10.10.11
    node_etc_hosts:
      - 10.10.10.10 h.pigsty a.pigsty p.pigsty g.pigsty
      - 10.10.10.20 sss.pigsty        # point minio service domain to the L2 VIP of proxy cluster
    node_ntp_servers:                 # NTP servers in /etc/chrony.conf
      - pool cn.pool.ntp.org iburst
      - pool 10.10.10.10 iburst
    node_admin_ssh_exchange: false    # exchange admin ssh key among node cluster

    #==========================================================#
    # PGSQL
    #==========================================================#
    pg_conf: tiny.yml
    pgbackrest_method: minio          # USE THE HA MINIO THROUGH A LOAD BALANCER
    pg_dbsu_ssh_exchange: false       # do not exchange dbsu ssh key among pgsql cluster
    pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
      local:                          # default pgbackrest repo with local posix fs
        path: /pg/backup              # local backup directory, `/pg/backup` by default
        retention_full_type: count    # retention full backups by count
        retention_full: 2             # keep 2, at most 3 full backup when using local fs repo
      minio:
        type: s3
        s3_endpoint: sss.pigsty       # s3_endpoint could be any load balancer: 10.10.10.1{0,1,2}, or domain names point to any of the 3 nodes
        s3_region: us-east-1          # you could use external domain name: sss.pigsty , which resolve to any members  (`minio_domain`)
        s3_bucket: pgsql              # instance & nodename can be used : minio-1.pigsty minio-2.pigsty minio-3.pigsty minio-1 minio-2 minio-3
        s3_key: pgbackrest            # Better using a new password for MinIO pgbackrest user
        s3_key_secret: S3User.Backup
        s3_uri_style: path
        path: /pgbackrest
        storage_port: 9000            # Use the load balancer port 9000
        storage_ca_file: /etc/pki/ca.crt
        bundle: y
        cipher_type: aes-256-cbc      # Better using a new cipher password for your production environment
        cipher_pass: pgBackRest.${pg_cluster}
        retention_full_type: time
        retention_full: 14

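Caveats

5 - Reference

Detailed reference information and lists: supported operating systems, modules, parameters, monitoring metrics, database extensions, comparisons with similar products, glossary, and more.

5.1 - OS Compatibility

The OS distributions, kernels, and architectures Pigsty is compatible with, the PostgreSQL major version support policy, and the feature differences between environments.

Overview

Pigsty recommends nodes running the Linux kernel on the amd64 architecture, with RockyLinux 9.5, Debian 12.10, or Ubuntu 22.04.5 as the operating system.

Kernel and architecture compatibility: Linux kernel, amd64/arm64 architectures (x86_64/aarch64)

EL distribution support: EL8, EL9 (RHEL, Rocky, CentOS, Alma, Oracle, Anolis, ...)

Debian-family distribution support: Ubuntu 24.04 noble, 22.04 jammy, Debian 12 bookworm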

Code	Distro	`x86_64`	`Arm64`
EL9	RHEL 9 / Rocky9 / Alma9	`el9.x86_64`	`el9.arm64`
D12	Debian 12 (bookworm)	`d12.x86_64`	`d12.arm64`
U22	Ubuntu 22.04 (jammy)	`u22.x86_64`	`u22.arm64`
EL8	RHEL 8 / Rocky8 / Alma8 / Anolis8	`el8.x86_64`	`el8.arm64`
U24	Ubuntu 24.04 (noble)	`u24.x86_64`	`u24.arm64`
D11	Debian 11 (bullseye)	`d11.x86_64`	`d11.arm64`
U20	Ubuntu 20.04 (focal)	`u20.x86_64`	`u20.arm64`
EL7	RHEL7 / CentOS7	`el7.x86_64`	`el7.arm64`

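Support falls into three tiers: first-class, second-class, and third-class.

Pigsty does not use any virtualization or containerization technology and runs directly on the bare operating system. Package names differ significantly between the EL and Debian families, and the PostgreSQL extensions available by default also differ slightly.

If you have advanced compatibility requirements, such as a specific OS distribution major or minor version or a specific PostgreSQL version, we also offer professional service and support options.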
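
The codes in the overview table follow a regular pattern: a family prefix (`el`, `d`, `u`), the major version, and the architecture. As a minimal sketch of that pattern, the hypothetical helper below (`os_code` is not part of Pigsty) derives a code from a distro ID, a version string, and a machine architecture. The distro IDs are assumed to match the `ID` field of `/etc/os-release`, and the `arm64` suffix follows the table's naming.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Pigsty: build a code such as
# "el9.x86_64" from a distro ID, a version string, and a machine arch.
os_code() {
    id="$1"; major="${2%%.*}"; arch="$3"   # "22.04" -> "22", "9.5" -> "9"
    case "$id" in
        rhel|rocky|almalinux|centos|ol|anolis) prefix="el${major}" ;;
        debian)                                prefix="d${major}"  ;;
        ubuntu)                                prefix="u${major}"  ;;
        *) echo "unsupported distro: $id" >&2; return 1 ;;
    esac
    case "$arch" in
        x86_64|amd64)  arch="x86_64" ;;
        aarch64|arm64) arch="arm64"  ;;
        *) echo "unsupported arch: $arch" >&2; return 1 ;;
    esac
    echo "${prefix}.${arch}"
}

os_code rocky  9.5   x86_64     # prints el9.x86_64
os_code ubuntu 22.04 aarch64    # prints u22.arm64
```

A code produced this way only names a platform; whether that platform is actually supported still has to be checked against the table.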

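Kernel and Architecture Compatibility

Pigsty currently supports the Linux kernel on the x86_64 / amd64 chip architecture.

macOS and Windows can run Pigsty through Linux virtual machines or containers.

We provide Vagrant local sandbox support: on other operating systems you can use Vagrant with virtualization software such as Virtualbox / Libvirtd / VMWare to bring up the deployment environment Pigsty needs with one command. You can also use Terraform to provision the resources needed to deploy Pigsty in a cloud environment with one command.

EL Distribution Support

EL-family operating systems are Pigsty's primary support target, including Red Hat Enterprise Linux, RockyLinux, CentOS, AlmaLinux, OracleLinux, Anolis, and other compatible distributions.

Pigsty supports the two most recent major versions: EL9 and EL8.

EL9: RHEL, RockyLinux, AlmaLinux (Rocky 9.4+ recommended)
EL8: RHEL, RockyLinux, AlmaLinux, Anolis (Rocky 8.10 recommended)
EL7: RHEL, CentOS 7.9 (CentOS 7.9 recommended; deprecated in the open-source edition!)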

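Code	Distro	Minor version	Limitations
EL9	RHEL 9 / Rocky9 / Alma9	9.4	Standard EL feature set
EL8	RHEL 8 / Rocky8 / Alma8 / Anolis8	8.10	Missing extensions such as `pljava` and `pg_duckdb`
EL7	RHEL7 / CentOS7	7.9	EOL; PG16/17 and most third-party extensions unavailable

RockyLinux 9.4 is recommended

Rocky 9.4+ strikes a good balance between system reliability/stability and the freshness/completeness of software versions. EL users are advised to use this OS version by default.

EL7 Deprecation Notice

Red Hat Enterprise Linux 7 reached end of maintenance in June 2024, and PGDG no longer provides EL7 binary packages for PostgreSQL 16.

The Pigsty professional subscription provides extended support for EL7.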

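Debian Distribution Support

Pigsty supports Ubuntu / Debian family operating systems and their compatible distributions. Currently supported:

D12: Debian 12 bookworm (12.7 recommended)
U24: Ubuntu 24.04 noble (24.04.1 LTS recommended)
U22: Ubuntu 22.04 jammy (22.04.5 LTS recommended)
U20: Ubuntu 20.04 focal (20.04.6 LTS recommended; support deprecated)
D11: Debian 11 bullseye (11.11 recommended; support deprecated)

Code	Debian-family distro	Minor version	Limitations
D12	Debian 12 (bookworm)	12.7	Standard Debian feature set
U22	Ubuntu 22.04 (jammy)	22.04.5	Standard Ubuntu feature set
U24	Ubuntu 24.04 (noble)	24.04.1	A few extension packages missing
D11	Debian 11 (bullseye)	11.8	Support deprecated
U20	Ubuntu 20.04 (focal)	20.04.6	Support deprecated

Debian 12.7 / Ubuntu 22.04.5 LTS is recommended

Debian 12 and Ubuntu 22.04 strike a good balance between system reliability/stability and the freshness/completeness of software versions.

Users are advised to choose Debian by default, and Ubuntu 22.04 if they need machine learning, desktop use, or similar.

Ubuntu 20.04 / Debian 11 Deprecation Notice

Debian 11 reached EOL in 2024-07, and Ubuntu 20.04 will reach EOL in 2025-04.

Pigsty will no longer provide new features or extension packaging for Debian 11 / Ubuntu 20.04.

The Pigsty professional subscription provides extended support for these two out-of-support OS major versions, Debian 11 and Ubuntu 20.04.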

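Vagrant Image Reference

When you deploy Pigsty in local virtual machines, consider using the following OS images in Vagrant. These are also the images used for Pigsty development, testing, and builds.

generic/centos7: CentOS 7.9
generic/rocky8: Rocky 8.9
generic/rocky9: Rocky 9.5
generic/debian11: Debian 11.11
generic/debian12: Debian 12.7
generic/ubuntu2004: Ubuntu 20.04.6
generic/ubuntu2204: Ubuntu 22.04.5
bento/ubuntu-24.04: Ubuntu 24.04.1

Terraform Image Reference

When you deploy Pigsty on cloud servers, consider using the following OS base images in Terraform, taking Alibaba Cloud as an example: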

CentOS 7.9 : centos_7_9_x64_20G_alibase_20240628.vhd
Rocky 8.10 : rockylinux_8_10_x64_20G_alibase_20240923.vhd
Rocky 9.4 : rockylinux_9_4_x64_20G_alibase_20240925.vhd
Ubuntu 20.04 : ubuntu_20_04_x64_20G_alibase_20240925.vhd
Ubuntu 22.04 : ubuntu_22_04_x64_20G_alibase_20240926.vhd
Ubuntu 24.04 : ubuntu_24_04_x64_20G_alibase_20240923.vhd
Debian 11.11 : debian_11_11_x64_20G_alibase_20240923.vhd
Debian 12.7 : debian_12_7_x64_20G_alibase_20240927.vhd
Anolis 8.9 : anolisos_8_9_x64_20G_rhck_alibase_20240724.vhd

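Further Reading

"Which EL-family OS distribution has the best compatibility?"

5.2 - Parameter List

Pigsty provides more than 280 configuration parameters that describe every aspect of the environment and of each module.

See the parameter list of each functional module for details.

5.3 - Extension List

This page lists the PostgreSQL extensions supported by Pigsty and their availability on different systems.

Pigsty offers 421 available extensions in total: 334 RPM extensions on EL and 326 DEB extensions on Debian/Ubuntu. These counts include the 70 Contrib extensions bundled with PostgreSQL.

5.4 - File Structure

How Pigsty's file system structure is designed and organized, and the directory layout used by each module.

Pigsty FHS

The Pigsty home directory is placed at ~/pigsty by default. The file structure under this directory is shown below: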

#------------------------------------------------------------------------------
# pigsty
#  ^-----@app                    # extra demo application resources
#  ^-----@bin                    # bin scripts
#  ^-----@docs                   # documentation (docsify-compatible)
#  ^-----@files                  # ansible file resources
#            ^-----@pigsty       # pigsty config template files
#            ^-----@prometheus   # prometheus rule definitions
#            ^-----@grafana      # grafana dashboards
#            ^-----@postgres     # /pg/bin/ scripts
#            ^-----@migration    # pgsql migration task definitions
#            ^-----@pki          # self-signed CA and certificates
#  ^-----@roles                  # ansible role implementations
#  ^-----@templates              # ansible template files
#  ^-----@vagrant                # Vagrant sandbox VM definition templates
#  ^-----@terraform              # Terraform cloud VM provisioning templates
#  ^-----configure               # configuration wizard script
#  ^-----ansible.cfg             # ansible default config file
#  ^-----pigsty.yml              # pigsty default config file
#  ^-----*.yml                   # ansible playbooks
#------------------------------------------------------------------------------
# /etc/pigsty/
#  ^-----@targets                # file-based service discovery target definitions
#  ^-----@dashboards             # grafana dashboards
#  ^-----@datasources            # grafana data sources
#  ^-----@playbooks              # ansible playbooks
#  ^-----@pgadmin                # pgadmin server list and passwords
#------------------------------------------------------------------------------

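CA FHS

Pigsty's self-signed CA lives in files/pki/ under the Pigsty home directory.

You must keep the CA key file safe: files/pki/ca/ca.key. This key is generated by the ca role in install.yml or infra.yml.

# pigsty/files/pki
#  ^-----@ca                      # self-signed CA key and certificate
#         ^-----@ca.key           # CRITICAL: keep it secret
#         ^-----@ca.crt           # CRITICAL: trusted everywhere
#  ^-----@csr                     # certificate signing requests (csr)
#  ^-----@misc                    # miscellaneous certificates, issued certs
#  ^-----@etcd                    # etcd server certificates
#  ^-----@minio                   # minio server certificates
#  ^-----@nginx                   # nginx SSL certificates
#  ^-----@infra                   # infra client certificates
#  ^-----@pgsql                   # pgsql server certificates
#  ^-----@mongo                   # mongodb/ferretdb server certificates
#  ^-----@mysql                   # mysql server certificates (placeholder)

Nodes managed by Pigsty will have the following certificate files installed:

/etc/pki/ca.crt                             # root certificate added to all nodes
/etc/pki/ca-trust/source/anchors/ca.crt     # symlink into the system trusted anchors

All infra nodes will have the following certificates:

/etc/pki/infra.crt                          # infra node certificate
/etc/pki/infra.key                          # infra node key

If your admin node fails, the files/pki directory and the pigsty.yml file should be available on the standby admin node. You can do this with rsync.

# run on meta-1, rsync to meta-2
cd ~/pigsty;
rsync -avz ./ meta-2:~/pigsty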

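NODE FHS

The node data directory is specified by the node_data parameter. It defaults to /data, is owned by root, and has mode 0777.

The default data directory of each component lives under this data directory, as shown below:

/data
#  ^-----@postgres                   # postgres database directory
#  ^-----@backups                    # postgres backup directory (when there is no dedicated backup disk)
#  ^-----@redis                      # redis data directory (shared by multiple instances)
#  ^-----@minio                      # minio data directory (single-node, single-drive mode)
#  ^-----@etcd                       # etcd main data directory
#  ^-----@prometheus                 # prometheus time-series data directory
#  ^-----@loki                       # Loki log data directory
#  ^-----@docker                     # Docker data directory
#  ^-----@...                        # data directories of other components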

Prometheus FHS

The main Prometheus configuration file is located at roles/infra/templates/prometheus/prometheus.yml.j2 and is rendered to /etc/prometheus/prometheus.yml on all infra nodes.

Prometheus scripts and rule definitions are placed in the files/prometheus/ directory under the pigsty home directory, and are copied to /etc/prometheus/ on all infra nodes.

# /etc/prometheus/
#  ^-----prometheus.yml              # Prometheus main config file
#  ^-----@bin                        # utility scripts: check config, show status, reload config, rebuild cluster
#  ^-----@rules                      # recording and alerting rule definitions
#            ^-----agent.yml         # agent rules and alerts
#            ^-----infra.yml         # infra rules and alerts
#            ^-----etcd.yml          # etcd  rules and alerts
#            ^-----node.yml          # node  rules and alerts
#            ^-----pgsql.yml         # pgsql rules and alerts
#            ^-----redis.yml         # redis rules and alerts
#            ^-----minio.yml         # minio rules and alerts
#            ^-----mysql.yml         # mysql rules and alerts (placeholder)
#  ^-----@targets                    # file-based service discovery target definitions
#            ^-----@infra            # infra static targets
#            ^-----@node             # node  static targets
#            ^-----@pgsql            # pgsql static targets
#            ^-----@pgrds            # pgsql remote RDS targets
#            ^-----@redis            # redis static targets
#            ^-----@minio            # minio static targets
#            ^-----@mongo            # mongo static targets
#            ^-----@mysql            # mysql static targets
#            ^-----@etcd             # etcd static targets
#            ^-----@ping             # ping static targets
#            ^-----@patroni          # patroni static targets (used when patroni has SSL enabled)
#            ^-----@.....            # other monitoring targets
# /etc/alertmanager.yml              # alertmanager main config file
# /etc/blackbox.yml                  # blackbox exporter main config file

Postgres FHS

The following parameters relate to the PostgreSQL database directory structure:

pg_dbsu_home: home directory of the default Postgres user, /var/lib/pgsql by default
pg_bin_dir: Postgres binary directory, /usr/pgsql/bin/ by default
pg_data: Postgres database directory, /pg/data by default
pg_fs_main: mount point of the Postgres main data disk, /data by default
pg_fs_bkup: mount point of the Postgres backup disk, /data/backups by default (a subdirectory on the main data disk)

#--------------------------------------------------------------#
# Working assumptions:
#   {{ pg_fs_main }} main data dir, default location: `/data`          [fast SSD]
#   {{ pg_fs_bkup }} backup data disk, default location: `/data/backups`  [cheap HDD]
#--------------------------------------------------------------#
# Default configuration:
#     pg_fs_main = /data             fast SSD
#     pg_fs_bkup = /data/backups     cheap HDD (optional)
#
#     /pg      -> /data/postgres/pg-test-17    (symlink)
#     /pg/data -> /data/postgres/pg-test-17/data
#--------------------------------------------------------------#
- name: create postgresql directories
  tags: pg_dir
  become: yes
  block:

    - name: make main and backup data dir
      file: path={{ item }} state=directory owner=root mode=0777
      with_items:
        - "{{ pg_fs_main }}"
        - "{{ pg_fs_bkup }}"

    # pg_cluster_dir:    "{{ pg_fs_main }}/postgres/{{ pg_cluster }}-{{ pg_version }}"
    - name: create postgres directories
      file: path={{ item }} state=directory owner={{ pg_dbsu }} group=postgres mode=0700
      with_items:
        - "{{ pg_fs_main }}/postgres"
        - "{{ pg_cluster_dir }}"
        - "{{ pg_cluster_dir }}/bin"
        - "{{ pg_cluster_dir }}/log"
        - "{{ pg_cluster_dir }}/tmp"
        - "{{ pg_cluster_dir }}/cert"
        - "{{ pg_cluster_dir }}/conf"
        - "{{ pg_cluster_dir }}/data"
        - "{{ pg_cluster_dir }}/meta"
        - "{{ pg_cluster_dir }}/stat"
        - "{{ pg_cluster_dir }}/spool"
        - "{{ pg_cluster_dir }}/change"
        - "{{ pg_backup_dir }}/backup"

Data File Structure

# Real directories
{{ pg_fs_main }}     /data                      # top-level data directory, usually a fast SSD mount point
{{ pg_dir_main }}    /data/postgres             # data directory holding all Postgres instances (possibly multiple instances/versions)
{{ pg_cluster_dir }} /data/postgres/pg-test-17  # data of the `pg-test` cluster (major version 17)
                     /data/postgres/pg-test-17/bin            # utility scripts for PostgreSQL
                     /data/postgres/pg-test-17/log            # logs: postgres/pgbouncer/patroni/pgbackrest
                     /data/postgres/pg-test-17/tmp            # temporary files, e.g. rendered SQL files
                     /data/postgres/pg-test-17/cert           # postgres server certificates
                     /data/postgres/pg-test-17/conf           # index of postgres-related config files
                     /data/postgres/pg-test-17/data           # postgres main data directory
                     /data/postgres/pg-test-17/meta           # postgres identity information
                     /data/postgres/pg-test-17/stat           # statistics, log reports, summaries
                     /data/postgres/pg-test-17/spool          # spool directory, pgBackRest temporary storage
                     /data/postgres/pg-test-17/change         # change records
                     /data/postgres/pg-test-17/backup         # symlink to the backup directory

{{ pg_fs_bkup }}     /data/backups                            # optional backup disk directory/mount point
                     /data/backups/postgres/pg-test-17/backup # actual storage location of cluster backups

# Symlinks
/pg             ->   /data/postgres/pg-test-17                # pg root symlink
/pg/data        ->   /data/postgres/pg-test-17/data           # pg data directory
/pg/backup      ->   /data/backups/postgres/pg-test-17/backup # pg backup directory
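
The layout above follows from the naming rule `pg_cluster_dir = {{ pg_fs_main }}/postgres/{{ pg_cluster }}-{{ pg_version }}`. As a rough sketch, the two hypothetical shell functions below (neither is part of Pigsty) apply that rule and print the subdirectory paths for one cluster. They only print paths and never create anything on disk.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Apply the naming rule: <pg_fs_main>/postgres/<pg_cluster>-<pg_version>
pg_cluster_dir() {
    # $1 = pg_fs_main, $2 = pg_cluster, $3 = pg_version
    echo "${1%/}/postgres/$2-$3"      # ${1%/} drops one trailing slash
}

# Print (do not create) the per-cluster subdirectories listed above.
pg_layout() {
    dir="$(pg_cluster_dir "$1" "$2" "$3")"
    for sub in bin log tmp cert conf data meta stat spool change; do
        echo "$dir/$sub"
    done
}

pg_cluster_dir /data pg-test 17       # prints /data/postgres/pg-test-17
pg_layout      /data pg-test 17
```

The `backup` entry is left out of `pg_layout` on purpose, because it is a symlink into the backup disk rather than a directory under the cluster directory.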

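Binary File Structure

On EL-compatible distributions (using yum), the default PostgreSQL installation location is:

/usr/pgsql-${pg_version}/

Pigsty creates a symlink named /usr/pgsql that points to the actual version specified by the pg_version parameter, for example:

/usr/pgsql -> /usr/pgsql-17

The default pg_bin_dir is therefore /usr/pgsql/bin/, and this path is added to the system PATH environment variable in the file /etc/profile.d/pgsql.sh:

export PATH="/usr/pgsql/bin:/pg/bin:$PATH"
export PGHOME=/usr/pgsql
export PGDATA=/pg/data

On Ubuntu/Debian, the default installation location of the PostgreSQL Deb packages is:

/usr/lib/postgresql/${pg_version}/bin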
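
The two packaging conventions can be summarized in a small lookup. This is a sketch only: `pg_bin_path` is a hypothetical helper, and the family labels `el` and `debian` are names chosen here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: return the packaged PostgreSQL binary directory
# for an OS family and a major version, per the two defaults above.
pg_bin_path() {
    # $1 = OS family (el | debian), $2 = PostgreSQL major version
    case "$1" in
        el)     echo "/usr/pgsql-$2/bin" ;;
        debian) echo "/usr/lib/postgresql/$2/bin" ;;
        *) echo "unknown OS family: $1" >&2; return 1 ;;
    esac
}

pg_bin_path el 17        # prints /usr/pgsql-17/bin
pg_bin_path debian 17    # prints /usr/lib/postgresql/17/bin
```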

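Pgbouncer FHS

Pgbouncer runs as the same user as {{ pg_dbsu }} (postgres by default), and its configuration files are located in /etc/pgbouncer.

pgbouncer.ini: main connection pool configuration file
database.txt: defines the databases in the connection pool
userlist.txt: defines the users in the connection pool
pgb_hba.conf: defines the access rules of the connection pool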

Redis FHS

Pigsty provides basic support for Redis deployment and monitoring.

Redis binaries are installed into /bin/ from RPM packages or by copying the binaries, including:

redis-server
redis-cli
redis-sentinel
redis-check-rdb
redis-check-aof
redis-benchmark
/usr/libexec/redis-shutdown

For a Redis instance named redis-test-1-6379, the related resources are as follows:

/usr/lib/systemd/system/redis-test-1-6379.service               # service unit (under /lib/systemd on Debian-family systems)
/etc/redis/redis-test-1-6379.conf                               # configuration
/data/redis/redis-test-1-6379                                   # data directory
/data/redis/redis-test-1-6379/redis-test-1-6379.rdb             # RDB file
/data/redis/redis-test-1-6379/redis-test-1-6379.aof             # AOF file
/var/log/redis/redis-test-1-6379.log                            # log
/var/run/redis/redis-test-1-6379.pid                            # PID

On Ubuntu / Debian, the default systemd unit directory is /lib/systemd/system/ rather than /usr/lib/systemd/system/.
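
The resource paths all derive from the instance name. The sketch below assumes the name is composed as `<redis_cluster>-<redis_node>-<port>`, which is inferred from the `redis-test-1-6379` example and is not a documented rule. `redis_instance_files` is a hypothetical helper that only prints paths.

```shell
#!/bin/sh
# Hypothetical helper: list the resources of one Redis instance.
# Assumption: instance name = <redis_cluster>-<redis_node>-<port>.
redis_instance_files() {
    # $1 = redis_cluster, $2 = redis_node, $3 = port
    # $4 = systemd unit directory (optional, defaults to the EL location)
    name="$1-$2-$3"
    unit_dir="${4:-/usr/lib/systemd/system}"
    echo "$unit_dir/$name.service"
    echo "/etc/redis/$name.conf"
    echo "/data/redis/$name"
    echo "/data/redis/$name/$name.rdb"
    echo "/data/redis/$name/$name.aof"
    echo "/var/log/redis/$name.log"
    echo "/var/run/redis/$name.pid"
}

redis_instance_files redis-test 1 6379
# Debian / Ubuntu: pass the unit directory explicitly
redis_instance_files redis-test 1 6379 /lib/systemd/system
```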

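5.5 - Comparison

This page lists products and projects that overlap with Pigsty's niche and compares their features.

Comparison with RDS

Pigsty is a local-first RDS alternative open-sourced under AGPLv3. It can be deployed on your own physical or virtual machines, or on cloud servers.

We therefore chose AWS RDS for PostgreSQL, from the cloud with the largest global market share, and Aliyun RDS for PostgreSQL, from the cloud with the largest market share in China, as reference points.

Aliyun RDS and AWS RDS are both closed-source cloud database services, offered only on public clouds under a rental model. The comparison below is based on the latest PostgreSQL 16 mainline version, as of February 2024.

Features

Item	Pigsty	Aliyun RDS	AWS RDS
Major versions	12 - 17	12 - 17	12 - 17
Read replicas	Any number of read-only replicas	Standby instances not exposed to users	Standby instances not exposed to users
Read/write splitting	Read and write traffic separated by port	Separately billed component	Separately billed component
Fast/slow separation	Offline ETL instances supported	No such feature found	No such feature found
Remote disaster recovery	Standby clusters supported	Multi-AZ deployment supported	Multi-AZ deployment supported
Delayed replicas	Delayed instances supported	No such feature found	No such feature found
Load balancing	HAProxy / LVS	Separately billed component	Separately billed component
Connection pooling	Pgbouncer	Separately billed component: RDS	Separately billed component: RDS Proxy
High availability	Patroni / etcd	Requires the high-availability edition	Requires the high-availability edition
Point-in-time recovery	pgBackRest / MinIO	Backup support provided	Backup support provided
Metrics monitoring	Prometheus / Exporter	Free basic tier / paid advanced tier	Free basic tier / paid advanced tier
Log collection	Loki / Promtail	Basic support	Basic support
Visualization	Grafana / Echarts	Basic monitoring provided	Basic monitoring provided
Alert aggregation and notification	Alertmanager	Basic support	Basic support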

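Important Extensions

Some important extensions are listed here. The comparison is based on the latest PostgreSQL 16 mainline version, as of 2024-02-28.

Extension	Pigsty RDS / PGDG official repo	Aliyun RDS	AWS RDS
Installing extra extensions	Freely installable	Not allowed	Not allowed
Geospatial	PostGIS 3.4.2	PostGIS 3.3.4 / Ganos 6.1	PostGIS 3.4.1
LiDAR point cloud	PG PointCloud 1.2.5	Ganos PointCloud 6.1
Vector embedding	PGVector 0.6.1 / Svector 0.5.6	pase 0.0.1	PGVector 0.6
Machine learning	PostgresML 2.8.1
Time series	TimescaleDB 2.14.2
Horizontal distribution	Citus 12.1
Columnar storage	Hydra 1.1.1
Full-text search	pg_bm25 0.5.6
Graph database	Apache AGE 1.5.0
GraphQL	PG GraphQL 1.5.0
OLAP	pg_analytics 0.5.6
Message queue	pgq 3.5.0
DuckDB	duckdb_fdw 1.1
Fuzzy tokenization	zhparser 1.1 / pg_bigm 1.2	zhparser 1.0 / pg_jieba	pg_bigm 1.2
CDC extraction	wal2json 2.5.3		wal2json 2.5
Bloat management	pg_repack 1.5.0	pg_repack 1.4.8	pg_repack 1.5.0

AWS RDS PG Available Extensions

Extensions available in AWS RDS for PostgreSQL 16 (excluding extensions bundled with PG)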

name	pg16	pg15	pg14	pg13	pg12	pg11	pg10
amcheck	1.3	1.3	1.3	1.2	1.2	yes	1
auto_explain	yes	yes	yes	yes	yes	yes	yes
autoinc	1	1	1	1	null	null	null
bloom	1	1	1	1	1	1	1
bool_plperl	1	1	1	1	null	null	null
btree_gin	1.3	1.3	1.3	1.3	1.3	1.3	1.2
btree_gist	1.7	1.7	1.6	1.5	1.5	1.5	1.5
citext	1.6	1.6	1.6	1.6	1.6	1.5	1.4
cube	1.5	1.5	1.5	1.4	1.4	1.4	1.2
dblink	1.2	1.2	1.2	1.2	1.2	1.2	1.2
dict_int	1	1	1	1	1	1	1
dict_xsyn	1	1	1	1	1	1	1
earthdistance	1.1	1.1	1.1	1.1	1.1	1.1	1.1
fuzzystrmatch	1.2	1.1	1.1	1.1	1.1	1.1	1.1
hstore	1.8	1.8	1.8	1.7	1.6	1.5	1.4
hstore_plperl	1	1	1	1	1	1	1
insert_username	1	1	1	1	null	null	null
intagg	1.1	1.1	1.1	1.1	1.1	1.1	1.1
intarray	1.5	1.5	1.5	1.3	1.2	1.2	1.2
isn	1.2	1.2	1.2	1.2	1.2	1.2	1.1
jsonb_plperl	1	1	1	1	1	null	null
lo	1.1	1.1	1.1	1.1	1.1	1.1	1.1
ltree	1.2	1.2	1.2	1.2	1.1	1.1	1.1
moddatetime	1	1	1	1	null	null	null
old_snapshot	1	1	1	null	null	null	null
pageinspect	1.12	1.11	1.9	1.8	1.7	1.7	1.6
pg_buffercache	1.4	1.3	1.3	1.3	1.3	1.3	1.3
pg_freespacemap	1.2	1.2	1.2	1.2	1.2	1.2	1.2
pg_prewarm	1.2	1.2	1.2	1.2	1.2	1.2	1.1
pg_stat_statements	1.10	1.10	1.9	1.8	1.7	1.6	1.6
pg_trgm	1.6	1.6	1.6	1.5	1.4	1.4	1.3
pg_visibility	1.2	1.2	1.2	1.2	1.2	1.2	1.2
pg_walinspect	1.1	1	null	null	null	null	null
pgcrypto	1.3	1.3	1.3	1.3	1.3	1.3	1.3
pgrowlocks	1.2	1.2	1.2	1.2	1.2	1.2	1.2
pgstattuple	1.5	1.5	1.5	1.5	1.5	1.5	1.5
plperl	1	1	1	1	1	1	1
plpgsql	1	1	1	1	1	1	1
pltcl	1	1	1	1	1	1	1
postgres_fdw	1.1	1.1	1.1	1	1	1	1
refint	1	1	1	1	null	null	null
seg	1.4	1.4	1.4	1.3	1.3	1.3	1.1
sslinfo	1.2	1.2	1.2	1.2	1.2	1.2	1.2
tablefunc	1	1	1	1	1	1	1
tcn	1	1	1	1	1	1	1
tsm_system_rows	1	1	1	1	1	1	1.1
tsm_system_time	1	1	1	1	1	1	1.1
unaccent	1.1	1.1	1.1	1.1	1.1	1.1	1.1
uuid-ossp	1.1	1.1	1.1	1.1	1.1	1.1	1.1

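Aliyun RDS PG Available Extensions

Extensions available in Aliyun RDS for PostgreSQL 16 (excluding extensions bundled with PG)

name	pg16	pg15	pg14	pg13	pg12	pg11	pg10	ali_desc
bloom	1	1	1	1	1	1	1	Provides an index access method based on Bloom filters.
btree_gin	1.3	1.3	1.3	1.3	1.3	1.3	1.2	Provides sample GIN operator classes that implement B-tree equivalent behavior for multiple data types and all enum types.
btree_gist	1.7	1.7	1.6	1.5	1.5	1.5	1.5	Provides sample GiST operator classes that implement B-tree equivalent behavior for multiple data types and all enum types.
citext	1.6	1.6	1.6	1.6	1.6	1.5	1.4	Provides a case-insensitive string type.
cube	1.5	1.5	1.5	1.4	1.4	1.4	1.2	Provides a data type for representing multidimensional cubes.
dblink	1.2	1.2	1.2	1.2	1.2	1.2	1.2	Cross-database table operations.
dict_int	1	1	1	1	1	1	1	An example of an add-on dictionary template for full-text search.
earthdistance	1.1	1.1	1.1	1.1	1.1	1.1	1.1	Provides two different approaches to calculating great-circle distances on the surface of the Earth.
fuzzystrmatch	1.2	1.1	1.1	1.1	1.1	1.1	1.1	Determines similarity and distance between strings.
hstore	1.8	1.8	1.8	1.7	1.6	1.5	1.4	Stores key-value pairs within a single PostgreSQL value.
intagg	1.1	1.1	1.1	1.1	1.1	1.1	1.1	Provides an integer aggregator and an enumerator.
intarray	1.5	1.5	1.5	1.3	1.2	1.2	1.2	Provides useful functions and operators for manipulating null-free integer arrays.
isn	1.2	1.2	1.2	1.2	1.2	1.2	1.1	Validates input against a hard-coded list of prefixes, which is also used to hyphenate numbers on output.
ltree	1.2	1.2	1.2	1.2	1.1	1.1	1.1	Represents labels of data stored in a hierarchical tree-like structure.
pg_buffercache	1.4	1.3	1.3	1.3	1.3	1.3	1.3	Provides a way to examine the shared buffer cache in real time.
pg_freespacemap	1.2	1.2	1.2	1.2	1.2	1.2	1.2	Examines the free space map (FSM).
pg_prewarm	1.2	1.2	1.2	1.2	1.2	1.2	1.1	Provides a convenient way to load data into the OS buffer cache or the PostgreSQL buffer cache.
pg_stat_statements	1.10	1.10	1.9	1.8	1.7	1.6	1.6	Provides a means of tracking execution statistics of all SQL statements executed by the server.
pg_trgm	1.6	1.6	1.6	1.5	1.4	1.4	1.3	Provides functions and operators for determining the similarity of alphanumeric text, plus index operator classes that support fast searching for similar strings.
pgcrypto	1.3	1.3	1.3	1.3	1.3	1.3	1.3	Provides cryptographic functions for PostgreSQL.
pgrowlocks	1.2	1.2	1.2	1.2	1.2	1.2	1.2	Provides a function to show row locking information for a specified table.
pgstattuple	1.5	1.5	1.5	1.5	1.5	1.5	1.5	Provides various functions to obtain tuple-level statistics.
plperl	1	1	1	1	1	1	1	Provides the Perl procedural language.
plpgsql	1	1	1	1	1	1	1	Provides the SQL procedural language.
pltcl	1	1	1	1	1	1	1	Provides the Tcl procedural language.
postgres_fdw	1.1	1.1	1.1	1	1	1	1	Cross-database table operations.
sslinfo	1.2	1.2	1.2	1.2	1.2	1.2	1.2	Provides information about the SSL certificate supplied by the current client.
tablefunc	1	1	1	1	1	1	1	Includes various functions that return tables.
tsm_system_rows	1	1	1	1	1	1	1	Provides the table sampling method SYSTEM_ROWS.
tsm_system_time	1	1	1	1	1	1	1	Provides the table sampling method SYSTEM_TIME.
unaccent	1.1	1.1	1.1	1.1	1.1	1.1	1.1	A text search dictionary that removes accents (diacritic signs) from lexemes.
uuid-ossp	1.1	1.1	1.1	1.1	1.1	1.1	1.1	Provides functions to generate universally unique identifiers (UUIDs) using one of several standard algorithms.
xml2	1.1	1.1	1.1	1.1	1.1	1.1	1.1	Provides XPath querying and XSLT functionality.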

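Performance Comparison

Item	Pigsty	Aliyun RDS	AWS RDS
Peak performance	PGTPC on NVME SSD benchmark, sysbench oltp_rw	RDS PG performance whitepaper: sysbench oltp scenario, 4000 ~ 8000 QPS per core
Storage: top-tier capacity	32TB / NVME SSD	32 TB / ESSD PL3	64 TB / io2 EBS Block Express
Storage: top-tier IOPS	4K random read: up to 3M; random write 2000~350K	4K random read: up to 1M	16K random IOPS: 256K
Storage: top-tier latency	4K random read: 75µs; random write 15µs	4K random read: 200µs	500µs / presumably 16K random IO
Storage: top-tier reliability	UBER < 1e-18, equivalent to 18 nines; MTBF: 2 million hours; 5 DWPD for three years	Reliability of 9 nines, equivalent to UBER 1e-9 (storage and data reliability)	Durability: 99.999%, 5 nines (0.001% annual failure rate), per the io2 description
Storage: top-tier cost	31.5 ¥/TB·month (amortized over a 5-year warranty / 3.2T / enterprise grade / MLC)	3200 ¥/TB·month (list price 6400¥, monthly subscription 4000¥); this price requires a 3-year prepayment at 50% off	1900 ¥/TB·month, using the largest spec 65536GB / 256K IOPS with the maximum discount

Observability

Pigsty provides nearly 3000 kinds of monitoring metrics and 50+ dashboards, covering database, host, connection pool, load balancer monitoring and more, giving users an unparalleled observability experience.

Pigsty provides 638 PostgreSQL-related monitoring metrics, while AWS RDS has only 99 and Aliyun RDS has only a single-digit number of metrics.

Some other projects also provide PostgreSQL monitoring capabilities, but they are relatively simple and basic:

pgwatch: 123 kinds of metrics
pgmonitor: 156 kinds of metrics
datadog: 69 kinds of metrics
pgDash
ClusterControl
pganalyze
Aliyun RDS: 8 kinds of metrics
AWS RDS: 99 kinds of metrics
Azure RDS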

可维护性

指标	Pigsty	Aliyun RDS	AWS RDS
系统易用性	简单	简单	简单
配置管理	配置文件 / CMDB 基于 Ansible Inventory	可使用 Terraform	可使用 Terraform
变更方式	幂等剧本基于 Ansible Playbook	控制台点击操作	控制台点击操作
参数调优	自动根据节点适配四种预置模板 OLTP, OLAP, TINY, CRIT
Infra as Code	原生支持	可使用 Terraform	可使用 Terraform
可定制参数点	Pigsty Parameters 283 个
服务与支持	提供商业订阅支持兜底	提供售后工单支持	提供售后工单支持
无互联网部署	可离线安装部署	N/A	N/A
数据库迁移	提供从现有v10+ PG实例基于逻辑复制不停机迁移至Pigsty托管实例的剧本	提供上云辅助迁移 Aliyun RDS 数据同步

成本

经验上看，软硬件资源的部分 RDS 单位成本是自建的 5 ～ 15 倍，租售比通常在一个月。详情请参考成本分析。

要素	指标	Pigsty	Aliyun RDS	AWS RDS
成本	软件授权/服务费用	免费，硬件约 20 - 40 ¥/核·月	200 ～ 400 ¥/核·月	400 ~ 1300 ¥/核·月
	服务支持费用	服务约 100 ¥/ 核·月	包含在 RDS 成本中

其他本地数据库管控软件

一些提供管理 PostgreSQL 能力的软件与供应商

Aiven：闭源商业云托管方案
Percona：商业咨询，简易PG发行版
ClusterControl：商业数据库管控软件

其他 Kubernetes Operator

Pigsty 拒绝在生产环境中使用 Kubernetes 管理数据库，因此与这些方案在生态位上存在差异。

PGO
StackGres
CloudNativePG
TemboOperator
PostgresOperator
PerconaOperator
Kubegres
KubeDB
KubeBlocks

5.6 - 成本参考

本文提供了一组成本数据，供您评估 Pigsty 自建，使用云数据库 RDS 所需的成本，以及常规的 DBA 薪酬参考。

总体概览

EC2	核·月	RDS	核·月
DHH 自建核月价格（192C 384G）	25.32	初级开源数据库DBA参考工资	15K/人·月
IDC自建机房（独占物理机: 64C384G）	19.53	中级开源数据库DBA参考工资	30K/人·月
IDC自建机房（容器，超卖500%）	7	高级开源数据库DBA参考工资	60K/人·月
UCloud 弹性虚拟机（8C16G，有超卖）	25	ORACLE 数据库授权	10000
阿里云弹性服务器 2x内存（独占无超卖）	107	阿里云 RDS PG 2x内存（独占）	260
阿里云弹性服务器 4x内存（独占无超卖）	138	阿里云 RDS PG 4x内存（独占）	320
阿里云弹性服务器 8x内存（独占无超卖）	180	阿里云 RDS PG 8x内存（独占）	410
AWS C5D.METAL 96C 200G (按月无预付)	100	AWS RDS PostgreSQL db.T2 (2x)	440
AWS C5D.METAL 96C 200G (预付三年)	80	AWS RDS PostgreSQL db.M5 (4x)	611
AWS C7A.METAL 192C 384G (预付三年)	104.8	AWS RDS PostgreSQL db.R6G (8x)	786

RDS成本参考

付费模式	价格	折合每年（万¥）
IDC自建（单物理机）	¥7.5w / 5年	1.5
IDC自建（2～3台组HA）	¥15w / 5年	3.0 ~ 4.5
阿里云 RDS 按需	¥87.36/时	76.5
阿里云 RDS 月付（基准）	¥4.2w / 月	50
阿里云 RDS 年付（85折）	¥425095 / 年	42.5
阿里云 RDS 3年付（5折）	¥750168 / 3年	25
AWS 按需	$25,817 / 月	217
AWS 1年不预付	$22,827 / 月	191.7
AWS 3年全预付	12w$ + 17.5k$/月	175
AWS 中国/宁夏按需	¥197,489 / 月	237
AWS 中国/宁夏1年不预付	¥143,176 / 月	171
AWS 中国/宁夏3年全预付	¥647k + 116k/月	160.6
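The "per year" column above can be reproduced with simple arithmetic. The following is a minimal sketch, assuming a 365-day year and using a hypothetical helper name `annual_cost_wan`; it only restates the Aliyun rows of the table and is not part of Pigsty.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, leap years ignored


def annual_cost_wan(price: float, unit: str) -> float:
    """Convert a listed price into an annual cost in units of 10,000 CNY."""
    per_year = {
        "hour": price * HOURS_PER_YEAR,  # on-demand, billed hourly
        "month": price * 12,             # monthly subscription
        "year": price,                   # one-year prepayment
        "3years": price / 3,             # three-year prepayment, averaged per year
    }[unit]
    return round(per_year / 10_000, 1)


print(annual_cost_wan(87.36, "hour"))      # Aliyun RDS on-demand
print(annual_cost_wan(42_000, "month"))    # Aliyun RDS monthly baseline
print(annual_cost_wan(425_095, "year"))    # Aliyun RDS yearly prepayment
print(annual_cost_wan(750_168, "3years"))  # Aliyun RDS three-year prepayment
```

The monthly row comes out as 50.4, which the table rounds to 50.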

我们可以对比一下自建与云数据库的成本差异：

方式	折合每年（万元）
IDC托管服务器 64C / 384G / 3.2TB NVME SSD 660K IOPS (2～3台)	3.0 ~ 4.5
阿里云 RDS PG 高可用版 pg.x4m.8xlarge.2c, 64C / 256GB / 3.2TB ESSD PL3	25 ～ 50
AWS RDS PG 高可用版 db.m5.16xlarge, 64C / 256GB / 3.2TB io1 x 80k IOPS	160 ～ 217

ECS 成本参考

排除 NVMe SSD / ESSD PL3 后的纯算力价格对比

以阿里云为例，纯算力包月模式的价格是自建基准的 5 ～ 7 倍，预付五年的价格是自建的 2 倍

付费模式	单价（¥/核·月）	相对于标准价格	自建溢价倍率
按量付费（1.5倍）	¥ 202	160 %	9.2 ~ 11.2
包月（标准价格）	¥ 126	100 %	5.7 ～ 7.0
预付一年（65折）	¥ 83.7	66 %	3.8 ～ 4.7
预付二年（55折）	¥ 70.6	56 %	3.2 ~ 3.9
预付三年（44折）	¥ 55.1	44 %	2.5 ~ 3.1
预付四年（35折）	¥ 45	35 %	2.0 ~ 2.5
预付五年（30折）	¥ 38.5	30 %	1.8 ~ 2.1

DHH @ 2023	¥ 22.0
探探 IDC 自建	¥ 18.0
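The premium column in the table above is the cloud unit price divided by each of the two self-hosted baselines. A minimal sketch of that calculation, where `premium_range` is a hypothetical helper name introduced only for illustration:

```python
from typing import Tuple

# Self-hosted baselines from the table, in CNY per core-month:
# DHH @ 2023 and the Tantan IDC deployment.
SELF_HOSTED = (22.0, 18.0)


def premium_range(cloud_price: float, baselines: Tuple[float, float]) -> Tuple[float, float]:
    """Return the (low, high) ratio of a cloud unit price to the baselines."""
    ratios = sorted(cloud_price / b for b in baselines)
    return round(ratios[0], 1), round(ratios[1], 1)


print(premium_range(126, SELF_HOSTED))  # monthly subscription (standard price)
print(premium_range(202, SELF_HOSTED))  # pay-as-you-go
```

The same function applies to the storage-inclusive table further down if the baselines are swapped for 25.3 and 19.5.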

含 NVMe SSD / ESSD PL3 情况下的等效价格对比

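Once a typical NVMe SSD allocation is included, the monthly subscription price is 11 to 14 times the self-hosted baseline, and even the five-year prepaid price is still roughly 8 to 10 times the self-hosted cost.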

付费模式	单价（¥/核·月）	+ 40GB ESSD PL3	自建溢价比例
按量付费（1.5倍）	¥ 202	¥ 362	14.3 ～ 18.6
包月（标准价格）	¥ 126	¥ 286	11.3 ～ 14.7
预付一年（65折）	¥ 83.7	¥ 244	9.6 ～ 12.5
预付二年（55折）	¥ 70.6	¥ 230	9.1 ～ 11.8
预付三年（44折）	¥ 55.1	¥ 215	8.5 ～ 11.0
预付四年（35折）	¥ 45	¥ 205	8.1 ～ 10.5
预付五年（30折）	¥ 38.5	¥ 199	7.9 ～ 10.2

DHH @ 2023	¥ 25.3
探探 IDC 自建	¥ 19.5

DHH案例：192核配12.8TB Gen4 SSD (1c:66)；探探案例： 64核配3.2T Gen3 MLC SSD (1c:50)。

云上价格每核配比40GB ESSD PL3（1核:4x内存:40x磁盘）计算。

EBS成本参考

评估因素	本地 PCI-E NVME SSD	Aliyun ESSD PL3	AWS io2 Block Express
容量	32TB	32 TB	64 TB
IOPS	4K随机读：600K ~ 1.1M 4K随机写 200K ~ 350K	4K随机读：最大 1M	16K随机IOPS： 256K
延迟	4K随机读：75µs 4K随机写：15µs	4K 随机读： 200µs	随机IO：500µs 上下文推断为16K
可靠性	UBER < 1e-18，折合18个9 MTBF: 200万小时 5DWPD，持续三年	数据可靠性 9个9 存储与数据可靠性	持久性：99.999%，5个9 （0.001% 年故障率） io2 说明
成本	16 ¥/TB·月 ( 5年均摊 / 3.2T MLC ) 5 年质保，¥3000 零售	3200¥/TB·月（原价 6400¥，包月4000¥） 3年预付整体打5折才有此价格	1900 ¥/TB·月使用最大规格 65536GB 256K IOPS 最优惠状态
SLA	5年质保出问题直接换新	Aliyun RDS SLA 可用性 99.99%: 月费 15% 99%: 月费 30% 95%: 月费 100%	Amazon RDS SLA 可用性 99.95%: 月费 15% 99%: 月费 25% 95%: 月费 100%

S3成本参考

Date	$/GB·月	¥/TB·5年	HDD ¥/TB	SSD ¥/TB
2006.03	0.150	63000	2800
2010.11	0.140	58800	1680
2012.12	0.095	39900	420	15400
2014.04	0.030	12600	371	9051
2016.12	0.023	9660	245	3766
2023.12	0.023	9660	105	280

其他参考价	高性能存储	顶配底折价	与采购NVMe SSD	价格参考
S3 Express	0.160	67200	DHH 12T	1400
EBS io2	0.125 + IOPS	114000	Shannon 3.2T	900

下云合集

曾几何时，“上云“近乎成为技术圈的政治正确，整整一代应用开发者的视野被云遮蔽。就让我们用实打实的数据分析与亲身经历，讲清楚公有云租赁模式的价值与陷阱 —— 在这个降本增效的时代中，供您借鉴与参考 —— 请看《云计算泥石流：合订本》

云基础资源篇

云商业模式篇

下云奥德赛篇

云故障复盘篇

RDS翻车篇

云厂商画像篇

5.7 - 术语列表

本文列出了文档中使用的技术术语，以及它们的释义与说明。

6 - PostgreSQL

如何使用 Pigsty 部署并管理世界上最先进的开源关系型数据库 —— PostgreSQL，按需定制，开箱即用！

世界上最先进的开源关系型数据库！

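Pigsty brings it to its full potential: out-of-the-box, reliable, observable, maintainable, and scalable!

Configuration | Administration | Playbooks | Monitoring | Parameters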

概览

了解关于 PostgreSQL 的重要主题与概念。

配置

描述你想要的 PostgreSQL 集群

身份参数：定义PostgreSQL集群的身份参数
读写主库：创建由单一主库构成的单实例“集群“
只读从库：创建一主一从的两节点基础高可用集群
离线从库：创建专用于OLAP/ETL/交互式查询的特殊只读实例
同步备库：启用同步提交，以确保没有数据丢失
法定人数：使用法定人数同步提交以获得更高的一致性级别
备份集群：克隆现有集群，并保持同步（异地灾备集群）
延迟集群：克隆现有集群，并延迟重放，用于紧急数据恢复
Citus集群：定义并创建 Citus 水平分布式数据库集群
大版本切换：使用不同的PostgreSQL大版本部署集群

管理

管理您所创建的 PostgreSQL 集群。

剧本

使用幂等的剧本，将您的描述变为现实。

pgsql.yml ：初始化PostgreSQL集群或添加新的从库。
pgsql-rm.yml ：移除PostgreSQL集群，或移除某个实例
pgsql-user.yml ：在现有的PostgreSQL集群中添加新的业务用户
pgsql-db.yml ：在现有的PostgreSQL集群中添加新的业务数据库
pgsql-monitor.yml ：将远程postgres实例纳入监控中
pgsql-migration.yml ：为现有的PostgreSQL集群生成迁移手册和脚本

样例：安装 PGSQL 模块

样例：移除 PGSQL 模块

监控

在 Grafana 仪表盘中查阅 PostgreSQL 的详情状态。

在 Pigsty 中共有 26 个与 PostgreSQL 相关的监控面板：

总览	集群	实例	数据库
PGSQL Overview	PGSQL Cluster	PGSQL Instance	PGSQL Database
PGSQL Alert	PGRDS Cluster	PGRDS Instance	PGCAT Database
PGSQL Shard	PGSQL Activity	PGCAT Instance	PGSQL Tables
	PGSQL Replication	PGSQL Persist	PGSQL Table
	PGSQL Service	PGSQL Proxy	PGCAT Table
	PGSQL Databases	PGSQL Pgbouncer	PGSQL Query
	PGSQL Patroni	PGSQL Session	PGCAT Query
	PGSQL PITR	PGSQL Xacts	PGCAT Locks
		PGSQL Exporter	PGCAT Schema

参数

PGSQL 模块的配置参数列表

PG_ID : 计算和校验 PostgreSQL 实例身份
PG_BUSINESS : PostgreSQL业务对象定义
PG_INSTALL : 安装 PostgreSQL 内核，支持软件包与扩展插件
PG_BOOTSTRAP : 使用 Patroni 初始化高可用 PostgreSQL 集群
PG_PROVISION : 创建 PostgreSQL 用户、数据库和其他数据库内对象
PG_BACKUP : 使用 pgbackrest 设置备份仓库
PG_SERVICE : 暴露 PostgreSQL 服务，绑定 VIP （可选），以及注册DNS
PG_EXPORTER : 为 PostgreSQL 实例添加监控，并注册至基础设施中。

教程

一些使用/管理 Pigsty中 PostgreSQL 数据库的教程。

克隆一套现有的 PostgreSQL 集群
创建一套现有 PostgreSQL 集群的在线备份集群。
创建一套现有 PostgreSQL 集群的延迟备份集群
监控一个已有的 postgres 实例？
使用逻辑复制从外部 PostgreSQL 迁移至 Pigsty 托管的 PostgreSQL 实例？
使用 MinIO 作为集中的 pgBackRest 备份仓库。
使用专门的 etcd 集群作为 PostgreSQL / Patroni 的 DCS ？
Use a dedicated haproxy load balancer cluster to expose PostgreSQL services.
使用 pg-meta CMDB 替代 pigsty.yml 作为配置清单源。
使用 PostgreSQL 作为 Grafana 的后端存储数据库？
使用 PostgreSQL 作为 Prometheus 后端存储数据库？

6.1 - 核心概念

Introduces the key concepts involved in PostgreSQL clusters.

PGSQL 模块总览：关键概念与架构细节

实体概念图

让我们从ER图开始。在Pigsty的PGSQL模块中，有四种核心实体：

集群（Cluster）：自治的PostgreSQL业务单元，用作其他实体的顶级命名空间。
服务（Service）：集群能力的命名抽象，路由流量，并使用节点端口暴露postgres服务。
实例（Instance）：一个在单个节点上的运行进程和数据库文件组成的单一postgres服务器。
节点（Node）：硬件资源的抽象，可以是裸金属、虚拟机或甚至是k8s pods。

命名约定

集群名应为有效的 DNS 域名，不包含任何点号，正则表达式为：[a-zA-Z0-9-]+
服务名应以集群名为前缀，并以特定单词作为后缀：primary、replica、offline、delayed，中间用-连接。
实例名以集群名为前缀，以正整数实例号为后缀，用-连接，例如${cluster}-${seq}。
节点由其首要内网IP地址标识，因为PGSQL模块中数据库与主机1:1部署，所以主机名通常与实例名相同。
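The naming conventions above can be expressed as a few small helpers. This is only a sketch: the function names are hypothetical, and the only rules encoded are the ones stated in the list (the cluster name regular expression and the `-` separator).

```python
import re

# Cluster names: a valid DNS label without dots, per the regex quoted above.
CLUSTER_NAME = re.compile(r"[a-zA-Z0-9-]+")


def is_valid_cluster_name(name: str) -> bool:
    """Check a cluster name against [a-zA-Z0-9-]+ (the whole string must match)."""
    return CLUSTER_NAME.fullmatch(name) is not None


def instance_name(cluster: str, seq: int) -> str:
    """Instance name: cluster name plus a positive integer sequence number."""
    if not is_valid_cluster_name(cluster):
        raise ValueError(f"invalid cluster name: {cluster!r}")
    if seq <= 0:
        raise ValueError("instance sequence number must be a positive integer")
    return f"{cluster}-{seq}"


def service_name(cluster: str, suffix: str) -> str:
    """Service name: cluster name plus a suffix such as primary or replica."""
    return f"{cluster}-{suffix}"


print(instance_name("pg-test", 1))
print(service_name("pg-test", "replica"))
```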

身份参数

Pigsty使用身份参数来识别实体：PG_ID。

除了节点IP地址，pg_cluster、pg_role和pg_seq三个参数是定义postgres集群所必需的最小参数集。以沙箱环境测试集群pg-test为例：

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }
  vars:
    pg_cluster: pg-test

集群的三个成员如下所示：

集群	序号	角色	主机 / IP	实例	服务	节点名
`pg-test`	`1`	`primary`	`10.10.10.11`	`pg-test-1`	`pg-test-primary`	`pg-test-1`
`pg-test`	`2`	`replica`	`10.10.10.12`	`pg-test-2`	`pg-test-replica`	`pg-test-2`
`pg-test`	`3`	`replica`	`10.10.10.13`	`pg-test-3`	`pg-test-replica`	`pg-test-3`

这里包含了：

一个集群：该集群命名为pg-test。
两种角色：primary和replica。
三个实例：集群由三个实例组成：pg-test-1、pg-test-2、pg-test-3。
三个节点：集群部署在三个节点上：10.10.10.11、10.10.10.12和10.10.10.13。
四个服务：
- 读写服务：pg-test-primary
- 只读服务：pg-test-replica
- 直接连接的管理服务：pg-test-default
- 离线读服务：pg-test-offline

在监控系统（Prometheus/Grafana/Loki）中，相应的指标将会使用这些身份参数进行标记：

pg_up{cls="pg-meta", ins="pg-meta-1", ip="10.10.10.10", job="pgsql"}
pg_up{cls="pg-test", ins="pg-test-1", ip="10.10.10.11", job="pgsql"}
pg_up{cls="pg-test", ins="pg-test-2", ip="10.10.10.12", job="pgsql"}
pg_up{cls="pg-test", ins="pg-test-3", ip="10.10.10.13", job="pgsql"}
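To make the link between the identity parameters and the metric labels concrete, the sketch below derives the label sets shown above from the pg-test inventory entry. The dictionary layout mirrors the YAML example; `pg_up_series` is a hypothetical helper, not something Pigsty ships.

```python
# The pg-test cluster definition from the example above, as a Python dict.
inventory = {
    "pg-test": {
        "hosts": {
            "10.10.10.11": {"pg_seq": 1, "pg_role": "primary"},
            "10.10.10.12": {"pg_seq": 2, "pg_role": "replica"},
            "10.10.10.13": {"pg_seq": 3, "pg_role": "replica"},
        },
        "vars": {"pg_cluster": "pg-test"},
    }
}


def pg_up_series(group: dict) -> list:
    """Build one pg_up series selector per instance from identity parameters."""
    cls = group["vars"]["pg_cluster"]
    series = []
    for ip, host in group["hosts"].items():
        ins = f"{cls}-{host['pg_seq']}"  # instance name = cluster name + sequence
        series.append(f'pg_up{{cls="{cls}", ins="{ins}", ip="{ip}", job="pgsql"}}')
    return series


for line in pg_up_series(inventory["pg-test"]):
    print(line)
```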

6.2 - 系统架构

介绍 PostgreSQL 集群的整体架构与实现细节。

PGSQL 模块总览：关键概念与架构细节

组件概览

以下是 PostgreSQL 模块组件及其相互作用的详细描述，从上至下分别为：

集群 DNS 由 infra 节点上的 DNSMASQ 负责解析
集群 VIP 由 vip-manager 组件管理，它负责将 pg_vip_address 绑定到集群主库节点上。
- vip-manager 从 etcd 集群获取由 patroni 写入的集群领导者信息
集群服务由节点上的 Haproxy 对外暴露，不同服务通过节点的不同端口（543x）区分。
- Haproxy 端口 9101：监控指标 & 统计 & 管理页面
- Haproxy 端口 5433：默认路由至主 pgbouncer：读写服务
- Haproxy 端口 5434：默认路由至从库 pgbouncer：只读服务
- Haproxy 端口 5436：默认路由至主 postgres：默认服务
- Haproxy 端口 5438：默认路由至离线 postgres：离线服务
- HAProxy 将根据 patroni 提供的健康检查信息路由流量。
Pgbouncer 是一个连接池中间件，默认监听6432端口，可以缓冲连接、暴露额外的指标，并提供额外的灵活性。
- Pgbouncer 是无状态的，并通过本地 Unix 套接字以 1:1 的方式与 Postgres 服务器部署。
- 生产流量（主/从）将默认通过 pgbouncer（可以通过pg_default_service_dest指定跳过）
- 默认/离线服务将始终绕过 pgbouncer ，并直接连接到目标 Postgres。
PostgreSQL 监听5432端口，提供关系型数据库服务
- 在多个节点上安装 PGSQL 模块，并使用同一集群名，将自动基于流式复制组成高可用集群
- PostgreSQL 进程默认由 patroni 管理。
Patroni 默认监听端口 8008，监管着 PostgreSQL 服务器进程
- Patroni 将 Postgres 服务器作为子进程启动
- Patroni 使用 etcd 作为 DCS：存储配置、故障检测和领导者选举。
- Patroni 通过健康检查提供 Postgres 信息（比如主/从），HAProxy 通过健康检查使用该信息分发服务流量
- Patroni 指标将被 infra 节点上的 Prometheus 抓取
PG Exporter exposes postgres monitoring metrics on port 9630
- PostgreSQL 指标将被 infra 节点上的 Prometheus 抓取
Pgbouncer Exporter 在端口 9631 暴露 pgbouncer 指标
- Pgbouncer 指标将被 infra 节点上的 Prometheus 抓取
pgBackRest uses a local backup repository by default (pgbackrest_method = local)
- 如果使用 local（默认）作为备份仓库，pgBackRest 将在主库节点的pg_fs_bkup 下创建本地仓库
- 如果使用 minio 作为备份仓库，pgBackRest 将在专用的 MinIO 集群上创建备份仓库：pgbackrest_repo.minio
Postgres 相关日志（postgres, pgbouncer, patroni, pgbackrest）由 promtail 负责收集
- Promtail 监听 9080 端口，也对 infra 节点上的 Prometheus 暴露自身的监控指标
- Promtail 将日志发送至 infra 节点上的 Loki

高可用

主库故障恢复时间目标 (RTO) ≈ 30s，数据恢复点目标 (RPO) < 1MB，从库故障 RTO ≈ 0 (重置当前连接)

Pigsty 的 PostgreSQL 集群带有开箱即用的高可用方案，由 patroni、etcd 和 haproxy 强力驱动。

关于高可用的详细介绍，请参考高可用概念

时间点恢复

您可以将集群恢复回滚至过去任意时刻，避免软件缺陷与人为失误导致的数据损失。

Pigsty 的 PostgreSQL 集群带有自动配置的时间点恢复（PITR）方案，基于 pgBackRest 与可选的 MinIO。

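High availability protects against hardware failure, but it is powerless against data deletion or overwrites caused by software defects and human error, because such changes are replicated to the replicas and applied immediately. Point-in-Time Recovery (PITR) addresses this problem. When you run only a single instance, PITR can also stand in for high availability as a last line of defense.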

关于时间点恢复（PITR）的详细介绍，请参考 PITR 概念

6.3 - 用户/角色

用户/角色指的是使用 SQL 命令 CREATE USER/ROLE 创建的，数据库集簇内的逻辑对象。

在PostgreSQL中，用户默认直接隶属于数据库集簇，而非某个具体的数据库。

因此在创建业务数据库和业务用户时，应当遵循 “先用户，后数据库” 的原则。

定义用户

Pigsty通过两个配置参数定义数据库集群中的角色与用户，二者形式相同，均为用户对象的数组：

pg_default_roles：定义全局统一使用的角色和用户
pg_users：在数据库集群层面定义业务用户和角色

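The former defines roles and users shared across the entire environment; the latter defines business roles and users specific to a single cluster.

You can define multiple users/roles. They are created in order: global ones first (pg_default_roles), then cluster-level ones (pg_users).

Within each array, users are created in the order they are defined, so a later user can be a member of a role defined earlier.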

下面是 Pigsty 演示环境中默认集群 pg-meta 中的业务用户定义：

pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-meta
    pg_users:
      - {name: dbuser_meta     ,password: DBUser.Meta     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: pigsty admin user }
      - {name: dbuser_view     ,password: DBUser.Viewer   ,pgbouncer: true ,roles: [dbrole_readonly] ,comment: read-only viewer for meta database }
      - {name: dbuser_grafana  ,password: DBUser.Grafana  ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for grafana database    }
      - {name: dbuser_bytebase ,password: DBUser.Bytebase ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for bytebase database   }
      - {name: dbuser_kong     ,password: DBUser.Kong     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for kong api gateway    }
      - {name: dbuser_gitea    ,password: DBUser.Gitea    ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for gitea service       }
      - {name: dbuser_wiki     ,password: DBUser.Wiki     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for wiki.js service     }
      - {name: dbuser_noco     ,password: DBUser.Noco     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for nocodb service      }

每个用户/角色定义都是一个 object，可能包括以下字段，以 dbuser_meta 用户为例：

- name: dbuser_meta               # 必需，`name` 是用户定义的唯一必选字段
  password: DBUser.Meta           # 可选，密码，可以是 scram-sha-256 哈希字符串或明文
  login: true                     # 可选，默认情况下可以登录
  superuser: false                # 可选，默认为 false，是超级用户吗？
  createdb: false                 # 可选，默认为 false，可以创建数据库吗？
  createrole: false               # 可选，默认为 false，可以创建角色吗？
  inherit: true                   # 可选，默认情况下，此角色可以使用继承的权限吗？
  replication: false              # 可选，默认为 false，此角色可以进行复制吗？
  bypassrls: false                # 可选，默认为 false，此角色可以绕过行级安全吗？
  pgbouncer: true                 # 可选，默认为 false，将此用户添加到 pgbouncer 用户列表吗？（使用连接池的生产用户应该显式定义为 true）
  connlimit: -1                   # 可选，用户连接限制，默认 -1 禁用限制
  expire_in: 3650                 # 可选，此角色过期时间：从创建时 + n天计算（优先级比 expire_at 更高）
  expire_at: '2030-12-31'         # 可选，此角色过期的时间点，使用 YYYY-MM-DD 格式的字符串指定一个特定日期（优先级没 expire_in 高）
  comment: pigsty admin user      # 可选，此用户/角色的说明与备注字符串
  roles: [dbrole_admin]           # 可选，默认角色为：dbrole_{admin,readonly,readwrite,offline}
  parameters: {}                  # 可选，使用 `ALTER ROLE SET` 针对这个角色，配置角色级的数据库参数
  pool_mode: transaction          # 可选，默认为 transaction 的 pgbouncer 池模式，用户级别
  pool_connlimit: -1              # 可选，用户级别的最大数据库连接数，默认 -1 禁用限制
  search_path: public             # 可选，根据 postgresql 文档的键值配置参数（例如：使用 pigsty 作为默认 search_path）

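The only required field is name, which must be a valid and unique username within the PostgreSQL cluster.
Roles do not need a password, but login-capable business users should usually have one.
password can be plaintext or a scram-sha-256 / md5 hash string; avoid plaintext passwords where possible.
Users/roles are created one by one in array order, so make sure a role/group is defined before its members.
login, superuser, createdb, createrole, inherit, replication, and bypassrls are boolean flags.
pgbouncer is disabled by default: to add a business user to the pgbouncer connection pool user list, set it to true explicitly.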
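Because users are created in array order, a role must appear before any user that lists it under roles. The check below is a minimal sketch of that rule, assuming the four default roles always exist; `undefined_roles` and the sample user names are hypothetical.

```python
# The four default ACL roles; assumed to exist before any business user.
DEFAULT_ROLES = {"dbrole_readonly", "dbrole_readwrite", "dbrole_admin", "dbrole_offline"}


def undefined_roles(pg_users: list, known: set = DEFAULT_ROLES) -> list:
    """Return (user, role) pairs where the role is not defined before the user."""
    seen = set(known)  # copy, so the default set is never mutated
    problems = []
    for user in pg_users:
        for role in user.get("roles", []):
            if role not in seen:
                problems.append((user["name"], role))
        seen.add(user["name"])  # this entry can now be referenced by later ones
    return problems


# Wrong order: the member is defined before its group.
bad = [{"name": "dbuser_app", "roles": ["app_group"]}, {"name": "app_group", "login": False}]
# Right order: the group first, then the member.
good = list(reversed(bad))

print(undefined_roles(bad))
print(undefined_roles(good))
```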

ACL系统

Pigsty 具有一套内置的，开箱即用的访问控制 / ACL 系统，您只需将以下四个默认角色分配给业务用户即可轻松使用：

dbrole_readwrite：全局读写访问的角色（主属业务使用的生产账号应当具有数据库读写权限）
dbrole_readonly：全局只读访问的角色（如果别的业务想要只读访问，可以使用此角色）
dbrole_admin：拥有DDL权限的角色（业务管理员，需要在应用中建表的场景）
dbrole_offline：受限的只读访问角色（只能访问 offline 实例，通常是个人用户）

如果您希望重新设计您自己的 ACL 系统，可以考虑定制以下参数和模板：

pg_default_roles：系统范围的角色和全局用户
pg_default_privileges：新建对象的默认权限
roles/pgsql/templates/pg-init-role.sql：角色创建 SQL 模板
roles/pgsql/templates/pg-init-template.sql：权限 SQL 模板

创建用户

在 pg_default_roles 和 pg_users 中定义的用户和角色，将在集群初始化阶段中自动逐一创建。

如果您希望在现有的集群上 创建用户，可以使用 bin/pgsql-user 命令行或直接使用 pgsql-user.yml 剧本。

Add the new user/role definition to all.children.<cls>.pg_users, then create the user with one of the following methods:

bin/pgsql-user <cls> <username>                        # 使用命令行
./pgsql-user.yml -l <cls> -e username=<username>       # 使用剧本

不同于数据库，创建用户的剧本总是幂等的。当目标用户已经存在时，Pigsty 会修改目标用户的属性使其符合配置。

请使用剧本创建用户

我们不建议您手工创建新的业务用户，特别当您想要创建的用户使用默认的 pgbouncer 连接池时：除非您愿意手工负责维护 Pgbouncer 中的用户列表并与 PostgreSQL 保持一致。

When you create a new user with the bin/pgsql-user tool or the pgsql-user.yml playbook, the user is also added to the Pgbouncer user list.

修改用户

修改 PostgreSQL 用户的属性的方式与 创建用户 相同。

首先，调整您的用户定义，修改需要调整的属性，然后执行以下命令应用：

bin/pgsql-user <cls> <username>                        # 使用命令行
./pgsql-user.yml -l <cls> -e username=<username>       # 使用剧本

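Note that modifying a user does not drop and recreate it; its attributes are changed with the ALTER USER command. Existing privileges and group memberships are not revoked, and newly listed roles are granted with the GRANT command.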

删除用户

出于安全考虑，Pigsty 不会自动删除用户，即使您在配置中删除了用户定义，Pigsty 也不会删除现有的用户。

您需要使用 SQL 命令 DROP USER 手动删除用户

DROP USER "<username>";

如果您要删除的角色是一个组（有其他用户属于该组），您需要先将其他用户从该组中移除，然后再删除该组。

REVOKE "<rolename>" FROM "<other_user>";

如果你要删除的用户拥有数据库对象，你需要先将这些对象的所有者更改为其他用户，然后再删除该用户。

REASSIGN OWNED BY "<username>" TO "<another_user>";

Pgbouncer用户

Pigsty 在默认情况下会安装并启用 Pgbouncer 链接池，并管理链接池中的用户。

Pigsty 默认会将 pg_users 中显式带有 pgbouncer: true 标志的用户添加到 pgbouncer 用户列表中。

配置文件

Pgbouncer 连接池中的用户在 /etc/pgbouncer/userlist.txt 中列出：

"postgres" ""
"dbuser_wiki" "SCRAM-SHA-256$4096:+77dyhrPeFDT/TptHs7/7Q==$KeatuohpKIYzHPCt/tqBu85vI11o9mar/by0hHYM2W8=:X9gig4JtjoS8Y/o1vQsIX/gY1Fns8ynTXkbWOjUfbRQ="
"dbuser_view" "SCRAM-SHA-256$4096:DFoZHU/DXsHL8MJ8regdEw==$gx9sUGgpVpdSM4o6A2R9PKAUkAsRPLhLoBDLBUYtKS0=:MujSgKe6rxcIUMv4GnyXJmV0YNbf39uFRZv724+X1FE="
"dbuser_monitor" "SCRAM-SHA-256$4096:fwU97ZMO/KR0ScHO5+UuBg==$CrNsmGrx1DkIGrtrD1Wjexb/aygzqQdirTO1oBZROPY=:L8+dJ+fqlMQh7y4PmVR/gbAOvYWOr+KINjeMZ8LlFww="
"dbuser_meta" "SCRAM-SHA-256$4096:leB2RQPcw1OIiRnPnOMUEg==$eyC+NIMKeoTxshJu314+BmbMFpCcspzI3UFZ1RYfNyU=:fJgXcykVPvOfro2MWNkl5q38oz21nSl1dTtM65uYR1Q="
"dbuser_kong" "SCRAM-SHA-256$4096:bK8sLXIieMwFDz67/0dqXQ==$P/tCRgyKx9MC9LH3ErnKsnlOqgNd/nn2RyvThyiK6e4=:CDM8QZNHBdPf97ztusgnE7olaKDNHBN0WeAbP/nzu5A="
"dbuser_grafana" "SCRAM-SHA-256$4096:HjLdGaGmeIAGdWyn2gDt/Q==$jgoyOB8ugoce+Wqjr0EwFf8NaIEMtiTuQTg1iEJs9BM=:ed4HUFqLyB4YpRr+y25FBT7KnlFDnan6JPVT9imxzA4="
"dbuser_gitea" "SCRAM-SHA-256$4096:l1DBGCc4dtircZ8O8Fbzkw==$tpmGwgLuWPDog8IEKdsaDGtiPAxD16z09slvu+rHE74=:pYuFOSDuWSofpD9OZhG7oWvyAR0PQjJBffgHZLpLHds="
"dbuser_dba" "SCRAM-SHA-256$4096:zH8niABU7xmtblVUo2QFew==$Zj7/pq+ICZx7fDcXikiN7GLqkKFA+X5NsvAX6CMshF0=:pqevR2WpizjRecPIQjMZOm+Ap+x0kgPL2Iv5zHZs0+g="
"dbuser_bytebase" "SCRAM-SHA-256$4096:OMoTM9Zf8QcCCMD0svK5gg==$kMchqbf4iLK1U67pVOfGrERa/fY818AwqfBPhsTShNQ=:6HqWteN+AadrUnrgC0byr5A72noqnPugItQjOLFw0Wk="

而用户级别的连接池参数则是使用另一个单独的文件： /etc/pgbouncer/useropts.txt 进行维护，比如：

dbuser_dba                  = pool_mode=session max_user_connections=16
dbuser_monitor              = pool_mode=session max_user_connections=8

连接池用户配置文件 userlist.txt 与 useropts.txt 会在您创建用户时自动刷新，并通过在线重载配置的方式生效，正常不会影响现有的连接。

当您创建数据库时，Pgbouncer 的数据库列表定义文件将会被刷新，并通过在线重载配置的方式生效，不会影响现有的连接。
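For auditing which users the connection pool knows about, userlist.txt is easy to parse: each line holds a quoted user name followed by a quoted password or SCRAM secret. The parser below is a minimal sketch with a hypothetical function name; the sample secret is shortened and made up.

```python
import re

# Two double-quoted fields; a literal double quote inside a field is written as "".
ENTRY = re.compile(r'^"((?:[^"]|"")*)"\s+"((?:[^"]|"")*)"')


def parse_userlist(text: str) -> dict:
    """Map user name -> stored secret for every well-formed line."""
    users = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            name, secret = (part.replace('""', '"') for part in match.groups())
            users[name] = secret
    return users


sample = '"postgres" ""\n"dbuser_dba" "SCRAM-SHA-256$4096:c2FsdA==$c3RvcmVk:c2VydmVy"\n'
for user, secret in parse_userlist(sample).items():
    print(user, "(no password)" if not secret else "(secret set)")
```

Comparing the parsed names against the users defined with pgbouncer: true is one way to spot drift after manual changes.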

管理用户

Pgbouncer 使用和 PostgreSQL 同样的 dbsu 运行，默认为 postgres 操作系统用户，您可以使用 pgb 别名，使用 dbsu 访问 pgbouncer 管理功能。

sudo su - postgres
pgb   # 使用管理用户登录 pgbouncer 命令行控制界面

Pigsty also provides a utility function, pgb-route, which quickly switches pgbouncer database traffic to another node in the cluster for zero-downtime migration.

删除用户

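For safety, Pigsty does not provide a command for deleting database or connection pool users by default.

To remove a user from the pgbouncer connection pool, delete the corresponding line from the configuration file and reload pgbouncer.

The connection pool user list is managed by full refresh and overwrite. If you are sure that every database user was created through Pigsty playbooks or command-line tools, you can refresh the entire pgbouncer user list with the following command: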

./pgsql.yml -t pgbouncer_user,pgbouncer_reload -e pg_reload=true

动态用户认证

请注意，pgbouncer_auth_query 参数允许你使用动态查询来完成连接池用户认证，当您懒得管理连接池中的用户时，这是一种折中的方案。

6.4 - 数据库

数据库指的是使用 SQL 命令 CREATE DATABASE 创建的，数据库集簇内的逻辑对象。

一组 PostgreSQL 服务器可以同时服务于多个 数据库 （Database）。在 Pigsty 中，你可以在集群配置中定义好所需的数据库。

Pigsty会对默认模板数据库template1进行修改与定制，创建默认模式，安装默认扩展，配置默认权限，新创建的数据库默认会从template1继承这些设置。

默认情况下，所有业务数据库都会被1:1添加到 Pgbouncer 连接池中；pg_exporter 默认会通过 自动发现 机制查找所有业务数据库并进行库内对象监控。

定义数据库

业务数据库定义在数据库集群参数 pg_databases 中，这是一个数据库定义构成的对象数组。数组内的数据库按照定义顺序依次创建，因此后面定义的数据库可以使用先前定义的数据库作为模板。

下面是 Pigsty 演示环境中默认集群 pg-meta 中的数据库定义：

pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-meta
    pg_databases:
      - { name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty] ,extensions: [{name: postgis, schema: public}, {name: timescaledb}]}
      - { name: grafana  ,owner: dbuser_grafana  ,revokeconn: true ,comment: grafana primary database }
      - { name: bytebase ,owner: dbuser_bytebase ,revokeconn: true ,comment: bytebase primary database }
      - { name: kong     ,owner: dbuser_kong     ,revokeconn: true ,comment: kong the api gateway database }
      - { name: gitea    ,owner: dbuser_gitea    ,revokeconn: true ,comment: gitea meta database }
      - { name: wiki     ,owner: dbuser_wiki     ,revokeconn: true ,comment: wiki meta database }
      - { name: noco     ,owner: dbuser_noco     ,revokeconn: true ,comment: nocodb database }

每个数据库定义都是一个 object，可能包括以下字段，以 meta 数据库为例：

- name: meta                      # 必选，`name` 是数据库定义的唯一必选字段
  baseline: cmdb.sql              # 可选，数据库 sql 的基线定义文件路径（ansible 搜索路径中的相对路径，如 files/）
  pgbouncer: true                 # 可选，是否将此数据库添加到 pgbouncer 数据库列表？默认为 true
  schemas: [pigsty]               # 可选，要创建的附加模式，由模式名称字符串组成的数组
  extensions:                     # 可选，要安装的附加扩展： 扩展对象的数组
    - { name: postgis , schema: public }  # 可以指定将扩展安装到某个模式中，也可以不指定（不指定则安装到 search_path 首位模式中）
    - { name: timescaledb }               # 例如有的扩展会创建并使用固定的模式，就不需要指定模式。
  comment: pigsty meta database   # 可选，数据库的说明与备注信息
  owner: postgres                 # 可选，数据库所有者，默认为 postgres
  template: template1             # 可选，要使用的模板，默认为 template1，目标必须是一个模板数据库
  encoding: UTF8                  # 可选，数据库编码，默认为 UTF8（必须与模板数据库相同）
  locale: C                       # 可选，数据库地区设置，默认为 C（必须与模板数据库相同）
  lc_collate: C                   # optional, database collation, C by default (must match the template database); do not change it without a good reason
  lc_ctype: C                     # 可选，数据库 ctype 字符集，默认为 C（必须与模板数据库相同）
  tablespace: pg_default          # 可选，默认表空间，默认为 'pg_default'
  allowconn: true                 # 可选，是否允许连接，默认为 true。显式设置 false 将完全禁止连接到此数据库
  revokeconn: false               # 可选，撤销公共连接权限。默认为 false，设置为 true 时，属主和管理员之外用户的 CONNECT 权限会被回收
  register_datasource: true       # 可选，是否将此数据库注册到 grafana 数据源？默认为 true，显式设置为 false 会跳过注册
  connlimit: -1                   # 可选，数据库连接限制，默认为 -1 ，不限制，设置为正整数则会限制连接数。
  pool_auth_user: dbuser_meta     # 可选，连接到此 pgbouncer 数据库的所有连接都将使用此用户进行验证（启用 pgbouncer_auth_query 才有用）
  pool_mode: transaction          # 可选，数据库级别的 pgbouncer 池化模式，默认为 transaction
  pool_size: 64                   # 可选，数据库级别的 pgbouncer 默认池子大小，默认为 64
  pool_size_reserve: 32           # 可选，数据库级别的 pgbouncer 池子保留空间，默认为 32，当默认池子不够用时，最多再申请这么多条突发连接。
  pool_size_min: 0                # 可选，数据库级别的 pgbouncer 池的最小大小，默认为 0
  pool_max_db_conn: 100           # 可选，数据库级别的最大数据库连接数，默认为 100

唯一必选的字段是 name，它应该是当前 PostgreSQL 集群中有效且唯一的数据库名称，其他参数都有合理的默认值。

name：数据库名称，必选项。
baseline：SQL文件路径（Ansible搜索路径，通常位于files），用于初始化数据库内容。
owner：数据库属主，默认为postgres
template：数据库创建时使用的模板，默认为template1
encoding：数据库默认字符编码，默认为UTF8，默认与实例保持一致。建议不要配置与修改。
locale：数据库默认的本地化规则，默认为C，建议不要配置，与实例保持一致。
lc_collate：数据库默认的本地化字符串排序规则，默认与实例设置相同，建议不要修改，必须与模板数据库一致。强烈建议不要配置，或配置为C。
lc_ctype：数据库默认的LOCALE，默认与实例设置相同，建议不要修改或设置，必须与模板数据库一致。建议配置为C或en_US.UTF8。
allowconn：是否允许连接至数据库，默认为true，不建议修改。
revokeconn：是否回收连接至数据库的权限？默认为false。如果为true，则数据库上的PUBLIC CONNECT权限会被回收。只有默认用户（dbsu|monitor|admin|replicator|owner）可以连接。此外，admin|owner 会拥有GRANT OPTION，可以赋予其他用户连接权限。
tablespace：数据库关联的表空间，默认为pg_default。
connlimit：数据库连接数限制，默认为-1，即没有限制。
extensions：对象数组，每一个对象定义了一个数据库中的扩展，以及其安装的模式。
parameters：KV对象，每一个KV定义了一个需要针对数据库通过ALTER DATABASE修改的参数。
pgbouncer：布尔选项，是否将该数据库加入到Pgbouncer中。所有数据库都会加入至Pgbouncer列表，除非显式指定pgbouncer: false。
comment：数据库备注信息。
pool_auth_user：启用 pgbouncer_auth_query 时，连接到此 pgbouncer 数据库的所有连接都将使用这里指定的用户执行认证查询。你需要使用一个具有访问 pg_shadow 表权限的用户。
pool_mode: database-level pgbouncer pooling mode, transaction (transaction pooling) by default. If left empty, the value of the pgbouncer_poolmode parameter is used as the default.
pool_size：数据库级别的 pgbouncer 默认池子大小，默认为 64
pool_size_reserve：数据库级别的 pgbouncer 池子保留空间，默认为 32，当默认池子不够用时，最多再申请这么多条突发连接。
pool_size_min：数据库级别的 pgbouncer 池的最小大小，默认为 0
pool_max_db_conn：数据库级别的 pgbouncer 连接池最大数据库连接数，默认为 100

新创建的数据库默认会从 template1 数据库 Fork 出来，这个模版数据库会在 PG_PROVISION 阶段进行定制修改：配置好扩展，模式以及默认权限，因此新创建的数据库也会继承这些配置，除非您显式使用一个其他的数据库作为模板。

关于数据库的访问权限，请参考 ACL：数据库权限一节。

创建数据库

在 pg_databases 中定义的数据库将在集群初始化时自动创建。如果您希望在现有集群上创建数据库，可以使用 bin/pgsql-db 包装脚本。将新的数据库定义添加到 all.children.<cls>.pg_databases 中，并使用以下命令创建该数据库：

bin/pgsql-db <cls> <dbname>    # pgsql-db.yml -l <cls> -e dbname=<dbname>

下面是新建数据库时的一些注意事项：

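The database creation playbook is idempotent by default, but this no longer holds when you use a baseline script. In that case, re-running it against an existing database is generally not recommended unless you are sure the baseline SQL you provide is itself idempotent.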

我们不建议您手工创建新的数据库，特别当您使用默认的 pgbouncer 连接池时：除非您愿意手工负责维护 Pgbouncer 中的数据库列表并与 PostgreSQL 保持一致。使用 pgsql-db 工具或 pgsql-db.yml 剧本创建新数据库时，会将此数据库一并添加到 Pgbouncer 数据库列表中。

如果您的数据库定义有一个非常规 owner（默认为 dbsu postgres），那么请确保在创建该数据库前，属主用户已经存在。最佳实践永远是在创建数据库之前创建用户。

Pgbouncer数据库

Pigsty 会默认为 PostgreSQL 实例 1:1 配置启用一个 Pgbouncer 连接池，使用 /var/run/postgresql Unix Socket 通信。

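A connection pool improves short-lived connection performance, reduces concurrency contention, keeps an excessive number of connections from overwhelming the database, and gives you extra room to maneuver during database migration.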

Pigsty 默认将 pg_databases 中的所有数据库都添加到 pgbouncer 的数据库列表中。您可以通过在数据库定义中显式设置 pgbouncer: false 来禁用特定数据库的 pgbouncer 连接池支持。

Pgbouncer数据库列表在 /etc/pgbouncer/database.txt 中定义，数据库定义中关于连接池的参数会体现在这里：

meta                        = host=/var/run/postgresql mode=session
grafana                     = host=/var/run/postgresql mode=transaction
bytebase                    = host=/var/run/postgresql auth_user=dbuser_meta
kong                        = host=/var/run/postgresql pool_size=32 reserve_pool=64
gitea                       = host=/var/run/postgresql min_pool_size=10
wiki                        = host=/var/run/postgresql
noco                        = host=/var/run/postgresql
mongo                       = host=/var/run/postgresql

当您创建数据库时，Pgbouncer 的数据库列表定义文件将会被刷新，并通过在线重载配置的方式生效，正常不会影响现有的连接。
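The sample file above suggests how the pool-related fields of a database definition end up as pgbouncer options. The sketch below reproduces that rendering under explicit assumptions: the field-to-option mapping is inferred from the sample lines only, not taken from the Pigsty templates, and `render_entry` is a hypothetical name.

```python
from typing import Optional

# ASSUMPTION: mapping inferred from the sample database.txt shown above.
OPTION_MAP = {
    "pool_auth_user": "auth_user",
    "pool_size": "pool_size",
    "pool_size_reserve": "reserve_pool",
    "pool_size_min": "min_pool_size",
}


def render_entry(db: dict, host: str = "/var/run/postgresql") -> Optional[str]:
    """Render one database line, or None when pgbouncer: false is set."""
    if not db.get("pgbouncer", True):  # databases are included unless opted out
        return None
    options = [f"host={host}"]
    for field, option in OPTION_MAP.items():
        if field in db:
            options.append(f"{option}={db[field]}")
    return f"{db['name']:<27} = {' '.join(options)}"


print(render_entry({"name": "kong", "pool_size": 32, "pool_size_reserve": 64}))
print(render_entry({"name": "wiki"}))
print(render_entry({"name": "hidden", "pgbouncer": False}))
```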

Pgbouncer 使用和 PostgreSQL 同样的 dbsu 运行，默认为 postgres 操作系统用户，您可以使用 pgb 别名，使用 dbsu 访问 pgbouncer 管理功能。

Pigsty 还提供了一个实用函数 pgb-route ，可以将 pgbouncer 数据库流量快速切换至集群中的其他节点，用于零停机迁移：

# route pgbouncer traffic to another cluster member
function pgb-route(){
  local ip=${1-'\/var\/run\/postgresql'}
  sed -ie "s/host=[^[:space:]]\+/host=${ip}/g" /etc/pgbouncer/pgbouncer.ini
  cat /etc/pgbouncer/pgbouncer.ini
}

6.5 - 服务/接入

分离读写操作，正确路由流量，稳定可靠地交付 PostgreSQL 集群提供的能力。

服务是一种抽象：它是数据库集群对外提供能力的形式，并封装了底层集群的细节。

服务对于生产环境中的稳定接入至关重要，在高可用集群自动故障时方显其价值，单机用户通常不需要操心这个概念。

单机用户

“服务” 的概念是给生产环境用的，个人用户/单机集群可以不折腾，直接拿实例名/IP地址访问数据库。

例如，Pigsty 默认的单节点 pg-meta.meta 数据库，就可以直接用下面三个不同的用户连接上去。

psql postgres://dbuser_dba:DBUser.DBA@10.10.10.10/meta     # connect directly as the DBA superuser
psql postgres://dbuser_meta:DBUser.Meta@10.10.10.10/meta   # connect as the default business admin user
psql postgres://dbuser_view:DBUser.Viewer@pg-meta/meta     # connect as the default read-only user via the instance domain name

服务概述

通常来说，数据库集群都必须提供这种最基础的服务：

读写服务（primary） ：可以读写数据库

对于生产数据库集群，至少应当提供这两种服务：

读写服务（primary） ：写入数据：只能由主库所承载。
只读服务（replica） ：读取数据：可以由从库承载，没有从库时也可由主库承载

此外，根据具体的业务场景，可能还会有其他的服务，例如：

默认直连服务（default） ：允许（管理）用户，绕过连接池直接访问数据库的服务
离线从库服务（offline） ：不承接线上只读流量的专用从库，用于ETL与分析查询
同步从库服务（standby） ：没有复制延迟的只读服务，由同步备库/主库处理只读查询
延迟从库服务（delayed） ：访问同一个集群在一段时间之前的旧数据，由延迟从库来处理

默认服务

Pigsty默认为每个 PostgreSQL 数据库集群提供四种不同的服务，以下是默认服务及其定义：

服务	端口	描述
primary	5433	生产读写，连接到主库连接池（6432）
replica	5434	生产只读，连接到备库连接池（6432）
default	5436	管理，ETL写入，直接访问主库（5432）
offline	5438	OLAP、ETL、个人用户、交互式查询

以默认的 pg-meta 集群为例，它提供四种默认服务：

psql postgres://dbuser_meta:DBUser.Meta@pg-meta:5433/meta   # pg-meta-primary : production read-write via the primary's pgbouncer (6432)
psql postgres://dbuser_meta:DBUser.Meta@pg-meta:5434/meta   # pg-meta-replica : production read-only via a replica's pgbouncer (6432)
psql postgres://dbuser_dba:DBUser.DBA@pg-meta:5436/meta     # pg-meta-default : direct connection to the primary's postgres (5432)
psql postgres://dbuser_stats:DBUser.Stats@pg-meta:5438/meta # pg-meta-offline : direct connection to the offline instance's postgres (5432)
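Since the four default services differ only by port, a connection string can be assembled from the service name. A minimal sketch, assuming the default port assignments from the table above; `service_url` is a hypothetical helper.

```python
from urllib.parse import quote

# Default service ports as listed in the table above.
DEFAULT_SERVICE_PORTS = {"primary": 5433, "replica": 5434, "default": 5436, "offline": 5438}


def service_url(user: str, password: str, host: str, service: str, database: str) -> str:
    """Build a postgres:// URL that targets one of the default services."""
    port = DEFAULT_SERVICE_PORTS[service]
    # Percent-encode credentials so special characters cannot break the URL.
    return f"postgres://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}/{database}"


print(service_url("dbuser_meta", "DBUser.Meta", "pg-meta", "primary", "meta"))
print(service_url("dbuser_meta", "DBUser.Meta", "pg-meta", "replica", "meta"))
```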

从示例集群架构图上可以看出这四种服务的工作方式：

注意在这里pg-meta 域名指向了集群的 L2 VIP，进而指向集群主库上的 haproxy 负载均衡器，它负责将流量路由到不同的实例上，详见服务接入

服务实现

在 Pigsty 中，服务使用节点上的 haproxy 来实现，通过主机节点上的不同端口进行区分。

Pigsty 所纳管的每个节点上都默认启用了 Haproxy 以对外暴露服务，而数据库节点也不例外。集群中的节点尽管从数据库的视角来看有主从之分，但从服务的视角来看，每个节点都是相同的：这意味着即使您访问的是从库节点，只要使用正确的服务端口，就依然可以使用到主库读写的服务。这样的设计可以屏蔽复杂度：所以您只要可以访问 PostgreSQL 集群上的任意一个实例，就可以完整的访问到所有服务。

这样的设计类似于 Kubernetes 中的 NodePort 服务，同样在 Pigsty 中，每一个服务都包括以下两个核心要素：

通过 NodePort 暴露的访问端点（端口号，从哪访问？）
通过 Selectors 选择的目标实例（实例列表，谁来承载？）

Pigsty的服务交付边界止步于集群的HAProxy，用户可以用各种手段访问这些负载均衡器，请参考接入服务。

所有的服务都通过配置文件进行声明，例如，PostgreSQL 默认服务就是由 pg_default_services 参数所定义的：

pg_default_services:
- { name: primary ,port: 5433 ,dest: default  ,check: /primary   ,selector: "[]" }
- { name: replica ,port: 5434 ,dest: default  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }
- { name: default ,port: 5436 ,dest: postgres ,check: /primary   ,selector: "[]" }
- { name: offline ,port: 5438 ,dest: postgres ,check: /replica   ,selector: "[? pg_role == `offline` || pg_offline_query ]" , backup: "[? pg_role == `replica` && !pg_offline_query]"}

您也可以在 pg_services 中定义额外的服务，参数 pg_default_services 与 pg_services 都是由服务定义对象组成的数组。

定义服务

Pigsty 允许您定义自己的服务：

pg_default_services：所有 PostgreSQL 集群统一对外暴露的服务，默认有四个。
pg_services：额外的 PostgreSQL 服务，可以视需求在全局或集群级别定义。
haproxy_services: customize HAProxy service content directly; can be used to expose other components

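For PostgreSQL clusters, you usually only need to care about the first two. Each service definition generates a new configuration file in the configuration directory of every related HAProxy instance: /etc/haproxy/<svcname>.cfg

Below is a custom service example, standby. When you want to offer a read-only service without replication lag, add this record to pg_services: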

- name: standby                   # 必选，服务名称，最终的 svc 名称会使用 `pg_cluster` 作为前缀，例如：pg-meta-standby
  port: 5435                      # 必选，暴露的服务端口（作为 kubernetes 服务节点端口模式）
  ip: "*"                         # 可选，服务绑定的 IP 地址，默认情况下为所有 IP 地址
  selector: "[]"                  # 必选，服务成员选择器，使用 JMESPath 来筛选配置清单
  backup: "[? pg_role == `primary`]"  # 可选，服务成员选择器（备份），也就是当默认选择器选中的实例都宕机后，服务才会由这里选中的实例成员来承载
  dest: default                   # 可选，目标端口，default|postgres|pgbouncer|<port_number>，默认为 'default'，Default的意思就是使用 pg_default_service_dest 的取值来最终决定
  check: /sync                    # 可选，健康检查 URL 路径，默认为 /，这里使用 Patroni API：/sync ，只有同步备库和主库才会返回 200 健康状态码 
  maxconn: 5000                   # 可选，允许的前端连接最大数，默认为5000
  balance: roundrobin             # 可选，haproxy 负载均衡算法（默认为 roundrobin，其他选项：leastconn）
  options: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'

On the sample three-node pg-test cluster, the service definition above is rendered into the haproxy configuration file /etc/haproxy/pg-test-standby.cfg:

#---------------------------------------------------------------------
# service: pg-test-standby @ 10.10.10.11:5435
#---------------------------------------------------------------------
# service instances 10.10.10.11, 10.10.10.13, 10.10.10.12
# service backups   10.10.10.11
listen pg-test-standby
    bind *:5435            # <--- 绑定了所有IP地址上的 5435 端口
    mode tcp               # <--- 负载均衡器工作在 TCP 协议上
    maxconn 5000           # <--- 最大连接数为 5000，可按需调大
    balance roundrobin     # <--- 负载均衡算法为 rr 轮询，还可以使用 leastconn 
    option httpchk         # <--- 启用 HTTP 健康检查
    option http-keep-alive # <--- 保持HTTP连接
    http-check send meth OPTIONS uri /sync   # <---- 这里使用 /sync ，Patroni 健康检查 API ，只有同步备库和主库才会返回 200 健康状态码。 
    http-check expect status 200             # <---- 健康检查返回代码 200 代表正常
    default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100
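    # servers: all three pg-test instances are matched by selector: "[]", which has no filter, so all of them become backend servers of the pg-test-standby service. Because of the /sync health check, only the primary and the sync standby can actually serve requests.
    server pg-test-1 10.10.10.11:6432 check port 8008 weight 100 backup  # <----- only the primary satisfies pg_role == `primary` and is picked by the backup selector,
    server pg-test-3 10.10.10.13:6432 check port 8008 weight 100         #        so it is the fallback: it serves read-only requests only after all other replicas are down, which keeps read-only traffic from affecting the read-write service
    server pg-test-2 10.10.10.12:6432 check port 8008 weight 100         #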

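Here, all three instances of the pg-test cluster are matched by selector: "[]" and rendered into the backend server list of the pg-test-standby service. Because of the /sync health check, however, the Patroni REST API returns the healthy HTTP 200 status code only on the primary and the sync standby, so only those can actually serve requests. In addition, the primary satisfies pg_role == primary, is picked by the backup selector, and is marked as a backup server: it takes over only when no other instance (that is, no sync standby) can serve the request.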
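The split between regular and backup servers can be sketched in a few lines. The real selectors are JMESPath expressions; to stay dependency-free, this illustration substitutes a plain Python predicate for the backup selector, and `server_lines` is a hypothetical name.

```python
# Members of the sample pg-test cluster, in the order used by the config above.
members = [
    {"name": "pg-test-1", "ip": "10.10.10.11", "pg_role": "primary"},
    {"name": "pg-test-3", "ip": "10.10.10.13", "pg_role": "replica"},
    {"name": "pg-test-2", "ip": "10.10.10.12", "pg_role": "replica"},
]


def server_lines(members, is_backup, port=6432, check_port=8008):
    """Render one haproxy server line per member, tagging backup members."""
    lines = []
    for m in members:
        line = f"server {m['name']} {m['ip']}:{port} check port {check_port} weight 100"
        if is_backup(m):  # stands in for the backup selector
            line += " backup"
        lines.append(line)
    return lines


# Equivalent of selector "[]" (everyone) plus backup "[? pg_role == `primary`]".
lines = server_lines(members, lambda m: m["pg_role"] == "primary")
print("\n".join(lines))
```

HAProxy sends traffic to a backup server only when no non-backup server passes its health check, which is what makes the primary a fallback here.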

Primary服务

Primary服务可能是生产环境中最关键的服务，它在 5433 端口提供对数据库集群的读写能力，服务定义如下：

- { name: primary ,port: 5433 ,dest: default  ,check: /primary   ,selector: "[]" }

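The selector parameter selector: "[]" means all cluster members are included in the Primary service.
However, only the primary passes the health check (check: /primary) and actually carries Primary service traffic.
The destination parameter dest: default means the Primary service destination is governed by the pg_default_service_dest parameter.
The dest value default is replaced by the value of pg_default_service_dest, which defaults to pgbouncer.
By default, the Primary service destination is therefore the connection pool on the primary, that is, the port given by pgbouncer_port (6432 by default).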

如果 pg_default_service_dest 的值为 postgres，那么 primary 服务的目的地就会绕过连接池，直接使用 PostgreSQL 数据库的端口（pg_port，默认值 5432），对于一些不希望使用连接池的场景，这个参数非常实用。
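The destination resolution described above amounts to a small lookup. A minimal sketch, using the parameter names from the text as keyword arguments; `resolve_dest_port` is a hypothetical function, not Pigsty code.

```python
def resolve_dest_port(dest="default", pg_default_service_dest="pgbouncer",
                      pg_port=5432, pgbouncer_port=6432) -> int:
    """Resolve a service's dest value to the backend port traffic is sent to."""
    if dest == "default":
        dest = pg_default_service_dest  # 'default' defers to the global parameter
    if dest == "postgres":
        return pg_port                  # bypass the pool, go straight to PostgreSQL
    if dest == "pgbouncer":
        return pgbouncer_port           # go through the connection pool
    return int(dest)                    # an explicit port number


print(resolve_dest_port())                                    # primary/replica with defaults
print(resolve_dest_port("postgres"))                          # default/offline services
print(resolve_dest_port(pg_default_service_dest="postgres"))  # pool bypassed globally
```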

示例：pg-test-primary 的 haproxy 配置

listen pg-test-primary
    bind *:5433         # <--- primary 服务默认使用 5433 端口
    mode tcp
    maxconn 5000
    balance roundrobin
    option httpchk
    option http-keep-alive
    http-check send meth OPTIONS uri /primary # <--- primary 服务默认使用 Patroni RestAPI /primary 健康检查
    http-check expect status 200
    default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100
    # servers
    server pg-test-1 10.10.10.11:6432 check port 8008 weight 100
    server pg-test-3 10.10.10.13:6432 check port 8008 weight 100
    server pg-test-2 10.10.10.12:6432 check port 8008 weight 100

Patroni 的高可用机制确保任何时候最多只会有一个实例的 /primary 健康检查为真，因此Primary服务将始终将流量路由到主实例。

使用 Primary 服务而不是直连数据库的一个好处是，如果集群因为某种情况出现了双主（比如在没有watchdog的情况下kill -9杀死主库 Patroni），Haproxy在这种情况下仍然可以避免脑裂，因为它只会在 Patroni 存活且返回主库状态时才会分发流量。

Replica服务

Replica服务在生产环境中的重要性仅次于Primary服务，它在 5434 端口提供对数据库集群的只读能力，服务定义如下：

- { name: replica ,port: 5434 ,dest: default  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }

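The selector parameter selector: "[]" means that all cluster members are included in the Replica service.
Every instance can pass the health check (check: /read-only) and carry Replica service traffic.
The backup selector [? pg_role == `primary` || pg_role == `offline` ] marks the primary and offline replicas as backup servers.
Only when all ordinary replicas are down is the Replica service carried by the primary or an offline replica.
The destination parameter dest: default means that the destination of the Replica service is also governed by the pg_default_service_dest parameter.
The dest value default is replaced with the value of pg_default_service_dest, which defaults to pgbouncer, just as for the Primary service.
By default, therefore, the Replica service targets the connection pool on the replicas, that is, the port specified by pgbouncer_port (6432 by default).

Example: haproxy configuration for pg-test-replica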

listen pg-test-replica
    bind *:5434
    mode tcp
    maxconn 5000
    balance roundrobin
    option httpchk
    option http-keep-alive
    http-check send meth OPTIONS uri /read-only
    http-check expect status 200
    default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100
    # servers
    server pg-test-1 10.10.10.11:6432 check port 8008 weight 100 backup
    server pg-test-3 10.10.10.13:6432 check port 8008 weight 100
    server pg-test-2 10.10.10.12:6432 check port 8008 weight 100

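The Replica service is flexible: if dedicated Replica instances are alive, it prefers them for read-only requests, and only when all replicas are down does the primary serve as a fallback. For the common two-node, one-primary one-replica cluster this means: use the replica while it is alive, and fall back to the primary when it is not.

In addition, unless all dedicated read-only instances are down, the Replica service does not use dedicated Offline instances either. This keeps fast online queries and slow offline queries from mixing and interfering with each other.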
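
The fallback behavior above can be illustrated with a small simulation. It is a simplified model of HAProxy's backup semantics, not HAProxy itself; the function name serving_backends is hypothetical, and the server list mirrors the pg-test-replica example.

```python
# Sketch: which backends carry traffic, given health-check results.
# Simplified model: backup servers are considered only when no healthy
# non-backup server remains. (HAProxy would, by default, use only the first
# healthy backup unless `option allbackups` is set.)
def serving_backends(servers, healthy):
    """servers: list of (name, is_backup); healthy: set of names passing the check."""
    active = [name for name, is_backup in servers
              if not is_backup and name in healthy]
    if active:
        return active
    return [name for name, is_backup in servers
            if is_backup and name in healthy]

# Mirrors the pg-test-replica example: the primary pg-test-1 is the backup server.
replica_svc = [("pg-test-1", True), ("pg-test-3", False), ("pg-test-2", False)]
```

With all three instances healthy, only pg-test-3 and pg-test-2 serve requests; once both replicas fail the check, pg-test-1 takes over.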

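Default Service

The Default service listens on port 5436 and is a variant of the Primary service.

The Default service always bypasses the connection pool and connects directly to PostgreSQL on the primary. This is useful for administrative connections, ETL writes, CDC (change data capture), and similar tasks.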

- { name: default ,port: 5436 ,dest: postgres ,check: /primary   ,selector: "[]" }

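If pg_default_service_dest is changed to postgres, the Default service is equivalent to the Primary service in everything except port and name. In that case, consider removing Default from the default services.

Example: haproxy configuration for pg-test-default

listen pg-test-default
    bind *:5436         # <--- apart from the listen port, target port, and service name, identical to the primary service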
    mode tcp
    maxconn 5000
    balance roundrobin
    option httpchk
    option http-keep-alive
    http-check send meth OPTIONS uri /primary
    http-check expect status 200
    default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100
    # servers
    server pg-test-1 10.10.10.11:5432 check port 8008 weight 100
    server pg-test-3 10.10.10.13:5432 check port 8008 weight 100
    server pg-test-2 10.10.10.12:5432 check port 8008 weight 100

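Offline Service

The Offline service listens on port 5438. It also bypasses the connection pool and accesses PostgreSQL directly. It is typically used for slow queries, analytical queries, ETL reads, and interactive queries by individual users. It is defined as follows: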

- { name: offline ,port: 5438 ,dest: postgres ,check: /replica   ,selector: "[? pg_role == `offline` || pg_offline_query ]" , backup: "[? pg_role == `replica` && !pg_offline_query]"}

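The Offline service routes traffic directly to dedicated offline replicas, or to ordinary read-only instances flagged with pg_offline_query.

The selector picks two kinds of instances from the cluster: offline replicas with pg_role = offline, and ordinary read-only instances flagged with pg_offline_query = true.
The main difference between a dedicated offline replica and a flagged ordinary replica is that the former does not carry Replica service requests by default, which keeps fast and slow requests apart, while the latter does.
The backup selector picks one kind of instance: ordinary replicas without the offline flag. If the offline instances or flagged replicas go down, other ordinary replicas can carry the Offline service.
The /replica health check returns 200 only on replicas and an error on the primary, so the Offline service never dispatches traffic to the primary, even if the primary is the only instance left in the cluster.
The primary is also matched by neither the selector nor the backup selector, so it never carries the Offline service. The Offline service therefore always keeps users off the primary and avoids any impact on it.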
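
As a rough illustration of how the two selectors partition the cluster, the sketch below evaluates them in plain Python (standing in for the JMESPath expressions) and renders the server lines. The member list, the function names, and the ordering of the output (selected instances first, then backups) are assumptions made for this example.

```python
# Sketch: evaluate the Offline service selectors against a hypothetical pg-test cluster.
members = [
    {"name": "pg-test-1", "ip": "10.10.10.11", "pg_role": "primary", "pg_offline_query": False},
    {"name": "pg-test-2", "ip": "10.10.10.12", "pg_role": "replica", "pg_offline_query": False},
    {"name": "pg-test-3", "ip": "10.10.10.13", "pg_role": "offline", "pg_offline_query": False},
]

def is_selected(m):
    # selector: [? pg_role == `offline` || pg_offline_query ]
    return m["pg_role"] == "offline" or m["pg_offline_query"]

def is_backup(m):
    # backup: [? pg_role == `replica` && !pg_offline_query ]
    return m["pg_role"] == "replica" and not m["pg_offline_query"]

def render_servers(members, port=5432):
    line = "server {name} {ip}:{port} check port 8008 weight 100"
    selected = [line.format(port=port, **m) for m in members if is_selected(m)]
    backups = [line.format(port=port, **m) + " backup"
               for m in members if is_backup(m) and not is_selected(m)]
    return selected + backups
```

For the three-node cluster above, render_servers(members) produces one regular server line for pg-test-3 and one backup line for pg-test-2; the primary pg-test-1 appears in neither.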

Example: haproxy configuration for pg-test-offline

listen pg-test-offline
    bind *:5438
    mode tcp
    maxconn 5000
    balance roundrobin
    option httpchk
    option http-keep-alive
    http-check send meth OPTIONS uri /replica
    http-check expect status 200
    default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100
    # servers
    server pg-test-3 10.10.10.13:5432 check port 8008 weight 100
    server pg-test-2 10.10.10.12:5432 check port 8008 weight 100 backup

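The Offline service provides restricted read-only access, typically for two kinds of queries: interactive queries (individual users) and slow queries or long transactions (analytics/ETL).

The Offline service needs extra care. When a switchover or automatic failover occurs, instance roles change, but the HAProxy configuration does not change automatically. For clusters with several replicas this is usually not a problem. But in a minimal one-primary, one-replica cluster whose replica serves Offline queries, a switchover turns the replica into the primary (which then fails the health check) and the old primary into a replica (which is not in the Offline backend list). No instance can carry the Offline service, so you must reload the service manually for the change to take effect.

If your workload is simple, consider removing the Default and Offline services and pointing the Primary and Replica services directly at the database.

Reload Service

When cluster membership changes, for example when replicas are added or removed, a switchover occurs, or relative weights are adjusted, you need to reload the service for the change to take effect.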

bin/pgsql-svc <cls> [ip...]         # reload services for an lb cluster or lb instance
# ./pgsql.yml -t pg_service         # the actual ansible task that reloads services

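Access Service

Pigsty's service delivery boundary stops at the cluster's HAProxy. Users can reach these load balancers in various ways.

The typical approach is to use DNS or a VIP, bound to all or any number of the cluster's load balancers.

You can use different host and port combinations, each of which provides PostgreSQL service in a different way.

Host

Type	Example	Description
Cluster domain name	`pg-test`	Access via the cluster domain name (resolved by dnsmasq @ infra nodes)
Cluster VIP address	`10.10.10.3`	Access via the L2 VIP address managed by `vip-manager`, bound to the primary node
Instance hostname	`pg-test-1`	Access via any instance hostname (resolved by dnsmasq @ infra nodes)
Instance IP address	`10.10.10.11`	Access via any instance IP address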

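Port

Pigsty uses different ports to distinguish pg services.

Port	Service	Type	Description
5432	postgres	Database	Direct access to the postgres server
6432	pgbouncer	Middleware	Access postgres through the connection pool middleware
5433	primary	Service	Access the primary pgbouncer (or postgres)
5434	replica	Service	Access the replica pgbouncer (or postgres)
5436	default	Service	Access the primary postgres
5438	offline	Service	Access the offline postgres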

Combinations

# Access via the cluster domain name
postgres://test@pg-test:5432/test # DNS -> L2 VIP -> primary, direct connection
postgres://test@pg-test:6432/test # DNS -> L2 VIP -> primary connection pool -> primary
postgres://test@pg-test:5433/test # DNS -> L2 VIP -> HAProxy -> primary connection pool -> primary
postgres://test@pg-test:5434/test # DNS -> L2 VIP -> HAProxy -> replica connection pool -> replica
postgres://dbuser_dba@pg-test:5436/test # DNS -> L2 VIP -> HAProxy -> primary, direct connection (for admins)
postgres://dbuser_stats@pg-test:5438/test # DNS -> L2 VIP -> HAProxy -> offline instance, direct connection (for ETL / personal queries)

# Access via the cluster VIP directly
postgres://test@10.10.10.3:5432/test # L2 VIP -> primary, direct connection
postgres://test@10.10.10.3:6432/test # L2 VIP -> primary connection pool -> primary
postgres://test@10.10.10.3:5433/test # L2 VIP -> HAProxy -> primary connection pool -> primary
postgres://test@10.10.10.3:5434/test # L2 VIP -> HAProxy -> replica connection pool -> replica
postgres://dbuser_dba@10.10.10.3:5436/test # L2 VIP -> HAProxy -> primary, direct connection (for admins)
postgres://dbuser_stats@10.10.10.3:5438/test # L2 VIP -> HAProxy -> offline instance, direct connection (for ETL / personal queries)

# Specify any cluster instance name directly
postgres://test@pg-test-1:5432/test # DNS -> database instance, direct connection (single-instance access)
postgres://test@pg-test-1:6432/test # DNS -> connection pool -> database
postgres://test@pg-test-1:5433/test # DNS -> HAProxy -> connection pool -> database, read/write
postgres://test@pg-test-1:5434/test # DNS -> HAProxy -> connection pool -> database, read-only
postgres://dbuser_dba@pg-test-1:5436/test # DNS -> HAProxy -> database, direct connection
postgres://dbuser_stats@pg-test-1:5438/test # DNS -> HAProxy -> database, offline read-only

# Specify any cluster instance IP directly
postgres://test@10.10.10.11:5432/test # database instance, direct connection (explicit instance, no automatic traffic distribution)
postgres://test@10.10.10.11:6432/test # connection pool -> database
postgres://test@10.10.10.11:5433/test # HAProxy -> connection pool -> database, read/write
postgres://test@10.10.10.11:5434/test # HAProxy -> connection pool -> database, read-only
postgres://dbuser_dba@10.10.10.11:5436/test # HAProxy -> database, direct connection
postgres://dbuser_stats@10.10.10.11:5438/test # HAProxy -> database, offline read-only

# Smart client: automatic read/write splitting
postgres://test@10.10.10.11:6432,10.10.10.12:6432,10.10.10.13:6432/test?target_session_attrs=primary
postgres://test@10.10.10.11:6432,10.10.10.12:6432,10.10.10.13:6432/test?target_session_attrs=prefer-standby
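
The smart-client URIs above follow libpq's multi-host syntax. As a minimal sketch, a helper like the following (the name multi_host_uri is hypothetical) could assemble such a URI and reject values that libpq does not accept for target_session_attrs:

```python
# Sketch: build a libpq multi-host connection URI.
# The set below lists the target_session_attrs values accepted by libpq (PostgreSQL 14+).
TARGET_SESSION_ATTRS = {
    "any", "read-write", "read-only", "primary", "standby", "prefer-standby",
}

def multi_host_uri(user, dbname, hosts, port=6432, target="primary"):
    if target not in TARGET_SESSION_ATTRS:
        raise ValueError(f"invalid target_session_attrs: {target!r}")
    hostlist = ",".join(f"{host}:{port}" for host in hosts)
    return f"postgres://{user}@{hostlist}/{dbname}?target_session_attrs={target}"
```

The client tries the listed hosts in order and keeps the first connection that matches the requested session attribute, so failover handling moves from the load balancer into the driver.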

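Override Service

You can override the default service configuration in several ways. A common requirement is to have the Primary and Replica services bypass the Pgbouncer connection pool and access PostgreSQL directly.

To do so, change pg_default_service_dest to postgres. Every service whose definition has svc.dest='default' then uses postgres instead of the default pgbouncer as its destination.

If you have already pointed the Primary service at PostgreSQL, the Default service becomes redundant and can be removed.

If you do not need to separate personal interactive queries from analytical/ETL slow queries, consider removing the Offline service from the default service list pg_default_services.

If you do not need read-only replicas to share online read-only traffic, you can also remove the Replica service from the default service list.

Delegate Service

Pigsty exposes PostgreSQL services through haproxy on the nodes. All haproxy instances in the cluster are configured with the same service definitions.

However, you can delegate pg services to a specific node group (for example, a dedicated haproxy load balancer cluster) instead of the haproxy on the PostgreSQL cluster members.

To do so, override the default service definitions with pg_default_services and set pg_service_provider to the name of the proxy group.

For example, this configuration exposes the pg cluster's primary service on port 10013 of the proxy haproxy node group.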

pg_service_provider: proxy       # use the load balancers of the `proxy` group, on port 10013
pg_default_services:  [{ name: primary ,port: 10013 ,dest: postgres  ,check: /primary   ,selector: "[]" }]

You must make sure that the port of every delegated service is unique within the proxy cluster.
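
A quick pre-flight check for this uniqueness requirement might look like the sketch below. It is not part of Pigsty; the function name port_conflicts and the (cluster, service, port) input shape are assumptions for illustration.

```python
# Sketch: detect delegated services that would collide on the same proxy port.
from collections import defaultdict

def port_conflicts(delegated):
    """delegated: iterable of (cluster, service_name, port) tuples."""
    by_port = defaultdict(list)
    for cluster, service, port in delegated:
        by_port[port].append(f"{cluster}-{service}")
    # keep only ports claimed by more than one service
    return {port: names for port, names in by_port.items() if len(names) > 1}
```

An empty result means every delegated service has its own port on the proxy group.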

An example that uses a dedicated load balancer cluster is provided in the 43-node production simulation sandbox: prod.yml

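6.6 - Extensions

Define, create, install, and enable PostgreSQL extensions.

Extensions are the soul of PostgreSQL. Pigsty bundles 421 precompiled, packaged, ready-to-use PostgreSQL extensions.

For details on extensions, see Extension Usage.

6.7 - Authentication / HBA

A detailed look at Host-Based Authentication (HBA) in Pigsty.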

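Authentication is the cornerstone of access control and the privilege system. PostgreSQL offers many authentication methods.

This section focuses on HBA (Host-Based Authentication). HBA rules define which users can access which databases, from where, and by what method.

Client Authentication

To connect to a PostgreSQL database, a user must first be authenticated (by password, by default).

You can supply the password in the connection string (insecure), or pass it through the PGPASSWORD environment variable or a .pgpass file. See the psql documentation and the PostgreSQL connection string documentation for details.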

psql 'host=<host> port=<port> dbname=<dbname> user=<username> password=<password>'
psql postgres://<username>:<password>@<host>:<port>/<dbname>
PGPASSWORD=<password> psql -U <username> -h <host> -p <port> -d <dbname>

For example, to connect to Pigsty's default meta database, you can use the following connection strings:

psql 'host=10.10.10.10 port=5432 dbname=meta user=dbuser_dba password=DBUser.DBA'
psql postgres://dbuser_dba:DBUser.DBA@10.10.10.10:5432/meta
PGPASSWORD=DBUser.DBA psql -U dbuser_dba -h 10.10.10.10 -p 5432 -d meta
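
For the .pgpass option mentioned above, each line has the form hostname:port:database:username:password, with literal colons and backslashes escaped by a backslash. A minimal sketch for producing such a line follows; the helper name pgpass_line is hypothetical.

```python
# Sketch: format one ~/.pgpass entry (hostname:port:database:username:password).
def pgpass_line(host, port, dbname, user, password):
    def esc(field):
        # escape backslashes first, then colons, as the .pgpass format requires
        return str(field).replace("\\", "\\\\").replace(":", "\\:")
    return ":".join(esc(f) for f in (host, port, dbname, user, password))
```

On Unix, libpq ignores the file unless its permissions are 0600 or stricter, so write it with restrictive permissions before relying on it.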

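By default, Pigsty enables server-side SSL encryption but does not verify client SSL certificates. To connect with a client SSL certificate, supply the certificate and key through the PGSSLCERT and PGSSLKEY environment variables, or through the sslcert and sslkey connection parameters.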

psql 'postgres://dbuser_dba:DBUser.DBA@10.10.10.10:5432/meta?sslkey=/path/to/dbuser_dba.key&sslcert=/path/to/dbuser_dba.crt'

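Client certificates (CN = username) can be issued with the local CA and the cert.yml playbook.

Define HBA

Pigsty has four parameters related to HBA rules:

pg_hba_rules: postgres HBA rules
pg_default_hba_rules: postgres global default HBA rules
pgb_hba_rules: pgbouncer HBA rules
pgb_default_hba_rules: pgbouncer global default HBA rules

Each of these is an array of HBA rule objects, and each rule object takes one of the following two forms:

1. Raw Form

The raw form is almost identical to the PostgreSQL pg_hba.conf format: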

- title: allow intranet password access
  role: common
  rules:
    - host   all  all  10.0.0.0/8      md5
    - host   all  all  172.16.0.0/12   md5
    - host   all  all  192.168.0.0/16  md5

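In this form, the rules field is an array of strings, each of which is one raw HBA rule. The title field is rendered as a comment explaining what the rules below it do.

The role field states which instance roles the rule applies to. When an instance's pg_role matches role, the rule is added to that instance's HBA.

Rules with role: common are added to all instances.
Rules with role: primary are added only to the primary instance.
Rules with role: replica are added only to replica instances.
Rules with role: offline are added to offline instances (pg_role = offline or pg_offline_query = true).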
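
The role-matching rule above can be summarized as a small predicate. This is a sketch of the behavior as described here, not Pigsty's template code; rule_applies is a hypothetical name.

```python
# Sketch: decide whether an HBA rule with a given `role` lands on an instance.
def rule_applies(rule_role, pg_role, pg_offline_query=False):
    if rule_role == "common":
        return True  # added to every instance
    if rule_role == "offline":
        # dedicated offline instances, or replicas flagged for offline queries
        return pg_role == "offline" or pg_offline_query
    return rule_role == pg_role
```

Because the outcome depends on pg_role, a switchover changes which rules apply to which instance, which is why role-specific rules call for an HBA reload after membership changes.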

2. Alias Form

The alias form lets you maintain HBA rules in a simpler, clearer way: it replaces rules with the addr, auth, user, and db fields. The title and role fields still apply.

- addr: 'intra'    # world|intra|infra|admin|local|localhost|cluster|<cidr>
  auth: 'pwd'      # trust|pwd|ssl|cert|deny|<official auth method>
  user: 'all'      # all|${dbsu}|${repl}|${admin}|${monitor}|<user>|<group>
  db: 'all'        # all|replication|....
  rules: []        # raw HBA strings; when present, they take precedence over all fields above
  title: allow intranet password access

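addr: where. Which IP address ranges does this rule affect?
- world: all IP addresses
- intra: all intranet IP ranges: '10.0.0.0/8', '172.16.0.0/12', '192.168.0.0/16'
- infra: IP addresses of the Infra nodes
- admin: IP address of the admin node, admin_ip
- local: local Unix socket
- localhost: local Unix socket plus the TCP loopback address 127.0.0.1/32
- cluster: IP addresses of all members of the same PostgreSQL cluster
- <cidr>: a specific CIDR block or IP address
auth: how. Which authentication method does this rule specify?
- deny: reject access
- trust: trust unconditionally, no authentication required
- pwd: password authentication, using md5 or scram-sha-256 according to the pg_pwd_enc parameter
- sha/scram-sha-256: force scram-sha-256 password authentication
- md5: md5 password authentication; also compatible with scram-sha-256, not recommended
- ssl: password authentication (pwd) with SSL required
- ssl-md5: md5 password authentication with SSL required
- ssl-sha: sha password authentication with SSL required
- os/ident: ident authentication using the operating system user's identity
- peer: peer authentication, similar to os ident
- cert: client SSL certificate authentication, with the certificate CN as the username
user: who. Which users does this rule affect?
- all: all users
- ${dbsu}: the default database superuser, pg_dbsu
- ${repl}: the default replication user, pg_replication_username
- ${admin}: the default admin user, pg_admin_username
- ${monitor}: the default monitoring user, pg_monitor_username
- any other specific user or role
db: which. Which databases does this rule affect?
- all: all databases
- replication: allow replication connections (no specific database)
- a specific database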
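3. Where to Define

Global HBA rules are usually defined in all.vars. To change the global default HBA rules, copy them from the full.yml template into all.vars and edit them there.

pg_default_hba_rules: postgres global default HBA rules
pgb_default_hba_rules: pgbouncer global default HBA rules

Cluster-specific HBA rules are defined in the cluster-level configuration of the database cluster:

pg_hba_rules: postgres HBA rules
pgb_hba_rules: pgbouncer HBA rules

Here are some examples of cluster HBA rule definitions: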

pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-meta
    pg_hba_rules:
      - { user: dbuser_view ,db: all    ,addr: infra        ,auth: pwd  ,title: 'allow dbuser_view password access to all databases from infra nodes' }
      - { user: all         ,db: all    ,addr: 100.0.0.0/8  ,auth: pwd  ,title: 'allow all users password access to all databases from the K8S network' }
      - { user: '${admin}'  ,db: all    ,addr: 0.0.0.0/0    ,auth: cert ,title: 'allow admin login from anywhere with a client certificate' }

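Reload HBA

HBA is a static rule configuration file; changes take effect only after a reload. The default HBA rule set does not involve roles or cluster members, so it usually does not need reloading.

If your HBA rules are restricted to specific instance roles or to cluster members, then whenever cluster membership changes (instances added, removed, or switched over), the conditions or scope of some rules change, and you usually need to reload HBA to reflect the latest state.

To reload the postgres/pgbouncer hba rules: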

bin/pgsql-hba <cls>                 # reload hba rules for cluster `<cls>`
bin/pgsql-hba <cls> ip1 ip2...      # reload hba rules for specific instances

The underlying Ansible playbook commands are:

./pgsql.yml -l <cls> -e pg_reload=true -t pg_hba,pg_reload
./pgsql.yml -l <cls> -e pg_reload=true -t pgbouncer_hba,pgbouncer_reload

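Default HBA

Pigsty ships with a default set of HBA rules that is secure enough for the vast majority of scenarios. The rules use the alias form, so they are largely self-explanatory.

pg_default_hba_rules:             # global default HBA rules for postgres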
  - {user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'  }
  - {user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
  - {user: '${repl}'    ,db: replication ,addr: localhost ,auth: pwd   ,title: 'replicator replication from localhost'}
  - {user: '${repl}'    ,db: replication ,addr: intra     ,auth: pwd   ,title: 'replicator replication from intranet' }
  - {user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: pwd   ,title: 'replicator postgres db from intranet' }
  - {user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
  - {user: '${monitor}' ,db: all         ,addr: infra     ,auth: pwd   ,title: 'monitor from infra host with password'}
  - {user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: ssl   ,title: 'admin @ everywhere with ssl & pwd'   }
  - {user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: pwd   ,title: 'pgbouncer read/write via local socket'}
  - {user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: pwd   ,title: 'read/write biz user via password'     }
  - {user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: pwd   ,title: 'allow etl offline tasks from intranet'}
pgb_default_hba_rules:            # global default HBA rules for pgbouncer
  - {user: '${dbsu}'    ,db: pgbouncer   ,addr: local     ,auth: peer  ,title: 'dbsu local admin access with os ident'}
  - {user: 'all'        ,db: all         ,addr: localhost ,auth: pwd   ,title: 'allow all user local access with pwd' }
  - {user: '${monitor}' ,db: pgbouncer   ,addr: intra     ,auth: pwd   ,title: 'monitor access via intranet with pwd' }
  - {user: '${monitor}' ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other monitor access addr' }
  - {user: '${admin}'   ,db: all         ,addr: intra     ,auth: pwd   ,title: 'admin access via intranet with pwd'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other admin access addr'   }
  - {user: 'all'        ,db: all         ,addr: intra     ,auth: pwd   ,title: 'allow all user intra access with pwd' }

Example: rendered pg_hba.conf

#==============================================================#
# File      :   pg_hba.conf
# Desc      :   Postgres HBA Rules for pg-meta-1 [primary]
# Time      :   2023-01-11 15:19
# Host      :   pg-meta-1 @ 10.10.10.10:5432
# Path      :   /pg/data/pg_hba.conf
# Note      :   ANSIBLE MANAGED, DO NOT CHANGE!
# Author    :   Ruohang Feng (rh@vonng.com)
# License   :   AGPLv3
#==============================================================#

# addr alias
# local     : /var/run/postgresql
# admin     : 10.10.10.10
# infra     : 10.10.10.10
# intra     : 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16

# user alias
# dbsu    :  postgres
# repl    :  replicator
# monitor :  dbuser_monitor
# admin   :  dbuser_dba

# dbsu access via local os user ident [default]
local    all                postgres                              ident

# dbsu replication from local os ident [default]
local    replication        postgres                              ident

# replicator replication from localhost [default]
local    replication        replicator                            scram-sha-256
host     replication        replicator         127.0.0.1/32       scram-sha-256

# replicator replication from intranet [default]
host     replication        replicator         10.0.0.0/8         scram-sha-256
host     replication        replicator         172.16.0.0/12      scram-sha-256
host     replication        replicator         192.168.0.0/16     scram-sha-256

# replicator postgres db from intranet [default]
host     postgres           replicator         10.0.0.0/8         scram-sha-256
host     postgres           replicator         172.16.0.0/12      scram-sha-256
host     postgres           replicator         192.168.0.0/16     scram-sha-256

# monitor from localhost with password [default]
local    all                dbuser_monitor                        scram-sha-256
host     all                dbuser_monitor     127.0.0.1/32       scram-sha-256

# monitor from infra host with password [default]
host     all                dbuser_monitor     10.10.10.10/32     scram-sha-256

# admin @ infra nodes with pwd & ssl [default]
hostssl  all                dbuser_dba         10.10.10.10/32     scram-sha-256

# admin @ everywhere with ssl & pwd [default]
hostssl  all                dbuser_dba         0.0.0.0/0          scram-sha-256

# pgbouncer read/write via local socket [default]
local    all                +dbrole_readonly                      scram-sha-256
host     all                +dbrole_readonly   127.0.0.1/32       scram-sha-256

# read/write biz user via password [default]
host     all                +dbrole_readonly   10.0.0.0/8         scram-sha-256
host     all                +dbrole_readonly   172.16.0.0/12      scram-sha-256
host     all                +dbrole_readonly   192.168.0.0/16     scram-sha-256

# allow etl offline tasks from intranet [default]
host     all                +dbrole_offline    10.0.0.0/8         scram-sha-256
host     all                +dbrole_offline    172.16.0.0/12      scram-sha-256
host     all                +dbrole_offline    192.168.0.0/16     scram-sha-256

# allow application database intranet access [common] [DISABLED]
#host    kong            dbuser_kong         10.0.0.0/8          md5
#host    bytebase        dbuser_bytebase     10.0.0.0/8          md5
#host    grafana         dbuser_grafana      10.0.0.0/8          md5

Example: rendered pgb_hba.conf

#==============================================================#
# File      :   pgb_hba.conf
# Desc      :   Pgbouncer HBA Rules for pg-meta-1 [primary]
# Time      :   2023-01-11 15:28
# Host      :   pg-meta-1 @ 10.10.10.10:5432
# Path      :   /etc/pgbouncer/pgb_hba.conf
# Note      :   ANSIBLE MANAGED, DO NOT CHANGE!
# Author    :   Ruohang Feng (rh@vonng.com)
# License   :   AGPLv3
#==============================================================#

# PGBOUNCER HBA RULES FOR pg-meta-1 @ 10.10.10.10:6432
# ansible managed: 2023-01-11 14:30:58

# addr alias
# local     : /var/run/postgresql
# admin     : 10.10.10.10
# infra     : 10.10.10.10
# intra     : 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16

# user alias
# dbsu    :  postgres
# repl    :  replicator
# monitor :  dbuser_monitor
# admin   :  dbuser_dba

# dbsu local admin access with os ident [default]
local    pgbouncer          postgres                              peer

# allow all user local access with pwd [default]
local    all                all                                   scram-sha-256
host     all                all                127.0.0.1/32       scram-sha-256

# monitor access via intranet with pwd [default]
host     pgbouncer          dbuser_monitor     10.0.0.0/8         scram-sha-256
host     pgbouncer          dbuser_monitor     172.16.0.0/12      scram-sha-256
host     pgbouncer          dbuser_monitor     192.168.0.0/16     scram-sha-256

# reject all other monitor access addr [default]
host     all                dbuser_monitor     0.0.0.0/0          reject

# admin access via intranet with pwd [default]
host     all                dbuser_dba         10.0.0.0/8         scram-sha-256
host     all                dbuser_dba         172.16.0.0/12      scram-sha-256
host     all                dbuser_dba         192.168.0.0/16     scram-sha-256

# reject all other admin access addr [default]
host     all                dbuser_dba         0.0.0.0/0          reject

# allow all user intra access with pwd [default]
host     all                all                10.0.0.0/8         scram-sha-256
host     all                all                172.16.0.0/12      scram-sha-256
host     all                all                192.168.0.0/16     scram-sha-256

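Security Hardening

For environments that require stronger security, we provide a hardened configuration template, security.yml, which uses the following default HBA rule set: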

pg_default_hba_rules:             # postgres host-based auth rules by default
  - {user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'  }
  - {user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
  - {user: '${repl}'    ,db: replication ,addr: localhost ,auth: ssl   ,title: 'replicator replication from localhost'}
  - {user: '${repl}'    ,db: replication ,addr: intra     ,auth: ssl   ,title: 'replicator replication from intranet' }
  - {user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: ssl   ,title: 'replicator postgres db from intranet' }
  - {user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
  - {user: '${monitor}' ,db: all         ,addr: infra     ,auth: ssl   ,title: 'monitor from infra host with password'}
  - {user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: cert  ,title: 'admin @ everywhere with ssl & cert'   }
  - {user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: ssl   ,title: 'pgbouncer read/write via local socket'}
  - {user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: ssl   ,title: 'read/write biz user via password'     }
  - {user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: ssl   ,title: 'allow etl offline tasks from intranet'}
pgb_default_hba_rules:            # pgbouncer host-based authentication rules
  - {user: '${dbsu}'    ,db: pgbouncer   ,addr: local     ,auth: peer  ,title: 'dbsu local admin access with os ident'}
  - {user: 'all'        ,db: all         ,addr: localhost ,auth: pwd   ,title: 'allow all user local access with pwd' }
  - {user: '${monitor}' ,db: pgbouncer   ,addr: intra     ,auth: ssl   ,title: 'monitor access via intranet with pwd' }
  - {user: '${monitor}' ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other monitor access addr' }
  - {user: '${admin}'   ,db: all         ,addr: intra     ,auth: ssl   ,title: 'admin access via intranet with pwd'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other admin access addr'   }
  - {user: 'all'        ,db: all         ,addr: intra     ,auth: ssl   ,title: 'allow all user intra access with pwd' }

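For more information, see the Security Hardening section.

6.8 - Cluster Configuration

Choose the instance and cluster types that fit your scenario, and configure a PostgreSQL database cluster that meets your needs.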

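You can define different types of instances and clusters. Below are the common PostgreSQL instance/cluster types in Pigsty:

Primary: define a single-instance cluster.
Replica: define a basic HA cluster with one primary and one replica.
Offline: define an instance dedicated to OLAP/ETL/interactive queries.
Sync Standby: enable synchronous commit to ensure no data loss.
Quorum Commit: use quorum-based synchronous commit for a higher consistency level.
Standby Cluster: clone an existing cluster and follow it.
Delayed Cluster: clone an existing cluster for emergency data recovery.
Citus Cluster: define a Citus distributed database cluster.
Major Version: use a different PostgreSQL major version.

Primary

We start with the simplest case: a single-instance cluster consisting of one primary: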

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
  vars:
    pg_cluster: pg-test

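This configuration is concise and consists of identity parameters only.

Use the following command to create a primary instance on node 10.10.10.11:

bin/pgsql-add pg-test

For demos, development and testing, temporary workloads, or non-critical computation and analysis, a single database instance may be perfectly adequate. Such a single-node cluster has no high availability, however: when hardware fails, you must rely on PITR or other recovery methods to meet the cluster's RTO / RPO. For this reason, consider adding one or more read-only replicas to the cluster.

Replica

To add a read-only replica instance, add a new node to pg-test and set its pg_role to replica.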

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }  # <--- newly added replica
  vars:
    pg_cluster: pg-test

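If the cluster does not exist yet, you can create the complete cluster directly. If the primary has already been initialized, you can add a replica to the existing cluster:

bin/pgsql-add pg-test               # initialize the whole cluster in one pass
bin/pgsql-add pg-test 10.10.10.12   # add a replica to the existing cluster

When the primary fails, a read-only instance (Replica) can take over its work with the help of the high-availability system. Replicas can also serve read-only queries: many workloads have far more reads than writes, and most read-only query load can be handled by replica instances.

Offline

An Offline instance is a dedicated read-only replica for slow queries, ETL, OLAP traffic, interactive queries, and the like. Slow queries and long transactions hurt the performance and stability of online workloads, so it is best to isolate them.

To add an offline instance, assign it a new instance and set pg_role to offline.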

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: offline }  # <--- newly added offline replica
  vars:
    pg_cluster: pg-test

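A dedicated offline instance works much like an ordinary replica, but it acts as a backup server in the pg-test-replica service. That is, the offline instance and the primary provide this read-only service only when all replica instances are down.

In many cases database resources are limited, and dedicating a whole server to offline queries is not economical. As a compromise, you can pick an existing replica and set the pg_offline_query flag on it, marking it as an instance that can carry offline queries. That replica then handles both online read-only requests and offline-type queries. You can use pg_default_hba_rules and pg_hba_rules to apply additional access control to offline instances.

Sync Standby

When Sync Standby is enabled, PostgreSQL selects one replica as the synchronous standby and treats all other replicas as candidates. The primary waits for the standby to flush to disk before confirming a commit, so the standby always has the latest data with no replication lag, and a switchover to the synchronous standby loses no data.

PostgreSQL uses asynchronous streaming replication by default, which may introduce a small replication lag (on the order of 10KB / 10ms). When the primary fails, there may be a small window of data loss (controllable with pg_rpo), which is acceptable for most scenarios.

In some critical scenarios (for example, financial transactions), however, data loss is completely unacceptable, or replication lag on reads is unacceptable. In such cases you can use synchronous commit. To enable sync standby mode, simply set pg_conf to the crit.yml template.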

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }
  vars:
    pg_cluster: pg-test
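    pg_conf: crit.yml   # <--- use the crit template

To enable sync standby on an existing cluster, configure the cluster and turn on synchronous_mode:

$ pg edit-config pg-test    # run on the admin node as the admin user
+++
-synchronous_mode: false    # <--- old value
+synchronous_mode: true     # <--- new value
 synchronous_mode_strict: false

Apply these changes? [y/N]: y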

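In this case, the PostgreSQL parameter synchronous_standby_names is managed automatically by Patroni. One replica is elected as the synchronous standby, and its application_name is written into the primary's PostgreSQL configuration and applied.

Quorum Commit

Quorum Commit offers finer control than Sync Standby. Especially when you have several replicas, you can set the criterion for a successful commit and choose a higher or lower consistency level (trading it off against availability).

To require at least two replicas to confirm a commit, configure the cluster through Patroni, adjust the synchronous_node_count parameter, and apply the change.

synchronous_mode: true          # make sure synchronous commit is enabled
synchronous_node_count: 2       # minimum number of replicas that must confirm before a commit succeeds

To use more synchronous replicas, change the value of synchronous_node_count. When the size of the cluster changes, make sure this setting is still valid, otherwise the service may become unavailable.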

In this case, the PostgreSQL parameter synchronous_standby_names is managed automatically by Patroni.

synchronous_standby_names = '2 ("pg-test-3","pg-test-2")'

Example: using multiple synchronous standbys

$ pg edit-config pg-test
---
+synchronous_node_count: 2

Apply these changes? [y/N]: y

After the configuration is applied, two synchronous standbys appear.

+ Cluster: pg-test (7080814403632534854) +---------+----+-----------+-----------------+
| Member    | Host        | Role         | State   | TL | Lag in MB | Tags            |
+-----------+-------------+--------------+---------+----+-----------+-----------------+
| pg-test-1 | 10.10.10.10 | Leader       | running |  1 |           | clonefrom: true |
| pg-test-2 | 10.10.10.11 | Sync Standby | running |  1 |         0 | clonefrom: true |
| pg-test-3 | 10.10.10.12 | Sync Standby | running |  1 |         0 | clonefrom: true |
+-----------+-------------+--------------+---------+----+-----------+-----------------+

Another scenario is to have any n replicas confirm a commit. The configuration is slightly different here. For example, suppose any one replica confirming is enough:

synchronous_mode: quorum        # use quorum commit
postgresql:
  parameters:                   # set the PostgreSQL parameter synchronous_standby_names using the `ANY n ()` syntax
    synchronous_standby_names: 'ANY 1 (*)'  # list specific replicas, or use * to match all replicas

Example: enabling ANY quorum commit

$ pg edit-config pg-test

+    synchronous_standby_names: 'ANY 1 (*)' # required in ANY mode
- synchronous_node_count: 2  # not needed in ANY mode

Apply these changes? [y/N]: y

Once applied, the configuration takes effect and all standbys appear as ordinary replicas in Patroni. In pg_stat_replication, however, sync_state shows quorum.
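
The two forms of synchronous_standby_names seen in this section (a priority list such as '2 ("pg-test-3","pg-test-2")' and a quorum expression such as 'ANY 1 (*)') can be generated and sanity-checked with a short helper. This is an illustrative sketch; in Pigsty the value is normally managed by Patroni, and the function name sync_standby_names is hypothetical.

```python
# Sketch: compose a synchronous_standby_names value and check it against cluster size.
def sync_standby_names(standbys, count, quorum=False):
    """standbys: list of application names, or ["*"] to match every standby."""
    if count < 1:
        raise ValueError("count must be at least 1")
    wildcard = standbys == ["*"]
    if not wildcard and count > len(standbys):
        # commits would wait forever for confirmations that cannot arrive
        raise ValueError("count exceeds the number of listed standbys")
    members = "*" if wildcard else ",".join(f'"{name}"' for name in standbys)
    prefix = f"ANY {count}" if quorum else str(count)
    return f"{prefix} ({members})"
```

The size check reflects the warning in this section: when replicas are removed, a count that was valid before may no longer be satisfiable.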

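Standby Cluster

You can clone an existing cluster to create a Standby Cluster, for data migration, horizontal splitting, multi-region deployment, or disaster recovery.

Under normal conditions, the standby cluster follows the upstream cluster and stays in sync with it. You can promote the standby cluster to make it a truly independent cluster.

A standby cluster is defined in much the same way as a normal cluster, except that pg_upstream is additionally defined on its primary. The primary of a standby cluster is called the Standby Leader.

For example, the inventory below defines a pg-test cluster and its standby cluster pg-test2:

# pg-test is the original cluster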
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
  vars: { pg_cluster: pg-test }

# pg-test2 is the standby cluster of pg-test
pg-test2:
  hosts:
    10.10.10.12: { pg_seq: 1, pg_role: primary , pg_upstream: 10.10.10.11 } # <--- pg_upstream is defined here
    10.10.10.13: { pg_seq: 2, pg_role: replica }
  vars: { pg_cluster: pg-test2 }

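The primary node of pg-test2, pg-test2-1, is a downstream replica of pg-test and acts as the Standby Leader within the pg-test2 cluster.

Just make sure pg_upstream is configured on the standby cluster's primary node, so that it automatically pulls a backup from the original upstream.

bin/pgsql-add pg-test     # create the original cluster
bin/pgsql-add pg-test2    # create the standby cluster

Example: changing the replication upstream

When necessary (for example, after a switchover or failover upstream), you can change the standby cluster's replication upstream by configuring the cluster.

To do so, change standby_cluster.host to the new upstream IP address and apply.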

$ pg edit-config pg-test2

 standby_cluster:
   create_replica_methods:
   - basebackup
-  host: 10.10.10.13     # <--- old upstream
+  host: 10.10.10.12     # <--- new upstream
   port: 5432

 Apply these changes? [y/N]: y

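Example: promoting a standby cluster

You can promote a standby cluster to an independent cluster at any time. The cluster can then accept writes on its own and diverges from the original cluster.

To do so, configure the cluster, remove the standby_cluster section entirely, and apply.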

$ pg edit-config pg-test2
-standby_cluster:
-  create_replica_methods:
-  - basebackup
-  host: 10.10.10.11
-  port: 5432

Apply these changes? [y/N]: y

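Example: cascade replication

If you specify pg_upstream on a replica rather than on the primary, you configure Cascade Replication for the cluster.

When configuring cascade replication, the parameter value must be the IP address of an instance in the cluster; otherwise initialization fails. The replica then streams from that specific instance instead of from the primary.

The instance that relays WAL is called a Bridge Instance. A bridge instance takes WAL-sending load off the primary. When you have dozens of replicas, cascading through bridge instances is a good idea.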

pg-test:
  hosts: # pg-test-1 ---> pg-test-2 ---> pg-test-3
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica } # <--- bridge instance
    10.10.10.13: { pg_seq: 3, pg_role: replica, pg_upstream: 10.10.10.12 }
    # ^--- replicates from pg-test-2 (the bridge) instead of pg-test-1 (the primary)
  vars: { pg_cluster: pg-test }
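
The constraint on pg_upstream for cascade replication can be checked before running the playbook. The sketch below is a hypothetical pre-flight check, not a Pigsty feature; it resolves each replica's upstream and rejects values that do not belong to the cluster. It deliberately ignores pg_upstream on the primary, which is the standby-cluster case described earlier.

```python
# Sketch: derive the replication topology of one cluster and validate pg_upstream.
def replication_chain(hosts):
    """hosts: {ip: {"pg_role": ..., "pg_upstream": optional ip}} -> {replica ip: upstream ip}"""
    primaries = [ip for ip, h in hosts.items() if h["pg_role"] == "primary"]
    if len(primaries) != 1:
        raise ValueError("expected exactly one primary")
    chain = {}
    for ip, h in hosts.items():
        if h["pg_role"] == "primary":
            continue
        upstream = h.get("pg_upstream", primaries[0])  # default: stream from the primary
        if upstream not in hosts or upstream == ip:
            raise ValueError(f"pg_upstream of {ip} must be another instance in the cluster")
        chain[ip] = upstream
    return chain
```

Applied to the pg-test example above, the result maps 10.10.10.12 to the primary and 10.10.10.13 to the bridge instance 10.10.10.12.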

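Delayed Cluster

A Delayed Cluster is a special kind of standby cluster, used to recover accidentally deleted data as quickly as possible.

For example, suppose you want a cluster named pg-testdelay whose data matches the pg-test cluster as of one hour ago:

# pg-test is the original cluster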
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
  vars: { pg_cluster: pg-test }

# pg-testdelay is the delayed cluster of pg-test
pg-testdelay:
  hosts:
    10.10.10.12: { pg_seq: 1, pg_role: primary , pg_upstream: 10.10.10.11, pg_delay: 1h }
    10.10.10.13: { pg_seq: 2, pg_role: replica }
  vars: { pg_cluster: pg-testdelay }

You can also configure a replication delay on an existing standby cluster.

$ pg edit-config pg-testdelay
 standby_cluster:
   create_replica_methods:
   - basebackup
   host: 10.10.10.11
   port: 5432
+  recovery_min_apply_delay: 1h    # <--- add the delay here, e.g. 1 hour

Apply these changes? [y/N]: y

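When tuples or tables are deleted by accident, you can adjust this parameter to advance the delayed cluster to a suitable point in time, read the data from it, and quickly repair the original cluster.

A delayed cluster requires extra resources, but it is much faster than PITR and has much less impact on the system. For very critical clusters, consider setting one up.

Citus Cluster

Pigsty supports Citus natively. See files/pigsty/citus.yml and prod.yml for examples.

To define a citus cluster, you need to specify the following parameters:

pg_mode must be set to citus instead of the default pgsql.
The shard name pg_shard and the shard number pg_group must be defined on every shard cluster.
pg_primary_db must be defined to specify the database managed by Patroni.
If you want to run administrative commands as the pg_dbsu user postgres rather than the default pg_admin_username, pg_dbsu_password must be set to a non-empty plaintext password.

In addition, extra hba rules are required to allow SSL access from localhost and from other data nodes, as shown below: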

all:
  children:
    pg-citus0: # citus shard 0
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus0 , pg_group: 0 }
    pg-citus1: # citus shard 1
      hosts: { 10.10.10.11: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus1 , pg_group: 1 }
    pg-citus2: # citus shard 2
      hosts: { 10.10.10.12: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus2 , pg_group: 2 }
    pg-citus3: # citus shard 3
      hosts:
        10.10.10.13: { pg_seq: 1, pg_role: primary }
        10.10.10.14: { pg_seq: 2, pg_role: replica }
      vars: { pg_cluster: pg-citus3 , pg_group: 3 }
  vars:                               # global parameters for all Citus clusters
    pg_mode: citus                    # pgsql cluster mode must be set to: citus
    pg_shard: pg-citus                # citus horizontal shard name: pg-citus
    pg_primary_db: meta               # citus database name: meta
    pg_dbsu_password: DBUser.Postgres # if using dbsu, a password must be configured for it
    pg_users: [ { name: dbuser_meta ,password: DBUser.Meta ,pgbouncer: true ,roles: [ dbrole_admin ] } ]
    pg_databases: [ { name: meta ,extensions: [ { name: citus }, { name: postgis }, { name: timescaledb } ] } ]
    pg_hba_rules:
      - { user: 'all' ,db: all  ,addr: 127.0.0.1/32 ,auth: ssl ,title: 'all user ssl access from localhost' }
      - { user: 'all' ,db: all  ,addr: intra        ,auth: ssl ,title: 'all user ssl access from intranet'  }

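On the coordinator node, you can create distributed tables and reference tables and query them from any data node. Starting with Citus 11.2, any Citus database node can act as the coordinator.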

SELECT create_distributed_table('pgbench_accounts', 'aid'); SELECT truncate_local_data_after_distributing_table($$public.pgbench_accounts$$);
SELECT create_reference_table('pgbench_branches')         ; SELECT truncate_local_data_after_distributing_table($$public.pgbench_branches$$);
SELECT create_reference_table('pgbench_history')          ; SELECT truncate_local_data_after_distributing_table($$public.pgbench_history$$);
SELECT create_reference_table('pgbench_tellers')          ; SELECT truncate_local_data_after_distributing_table($$public.pgbench_tellers$$);

大版本切换

Pigsty 从 PostgreSQL 10 开始提供支持，不过目前预打包的离线软件包中仅包含 12 - 16 版本。

Pigsty 对不同大版本的支持力度不同，如下表所示：

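Version	Description	Package support level
16	Newly released version, supports the important extensions	Core, L1, L2
15	Stable major version, supports all extensions (default)	Core, L1, L2, L3
14	Older stable major version, supports L1 extensions only	Core, L1
13	Older major version, supports L1 extensions only	Core, L1
12	Older major version, supports L1 extensions only	Core, L1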

内核: postgresql*，提供 12 - 16 支持
1类扩展: wal2json，pg_repack，passwordcheck_cracklib (在 PG 12 - 16 中提供)
2类扩展: postgis， citus， timescaledb， pgvector (在 PG 15,16 中提供)
3类扩展: 其他扩展 (目前只在 PG 15 提供)

除了 PG15 之外，其他大版本上可能会有一些扩展不可用，您可能需要更改 pg_extensions 和 pg_libs 以满足您的需求。

如果您确实希望在较老的大版本上使用这些扩展，可以参考添加软件和安装扩展的说明，手工从PGDG源下载并安装。

这里有一些不同大版本集群的配置样例：

pg-v12:
  hosts: { 10.10.10.12: { pg_seq: 1 ,pg_role: primary } }
  vars:
    pg_cluster: pg-v12
    pg_version: 12
    pg_libs: 'pg_stat_statements, auto_explain'
    pg_extensions: [ 'wal2json_12* pg_repack_12* passwordcheck_cracklib_12*' ]

pg-v13:
  hosts: { 10.10.10.13: { pg_seq: 1 ,pg_role: primary } }
  vars:
    pg_cluster: pg-v13
    pg_version: 13
    pg_libs: 'pg_stat_statements, auto_explain'
    pg_extensions: [ 'wal2json_13* pg_repack_13* passwordcheck_cracklib_13*' ]

pg-v14:
  hosts: { 10.10.10.14: { pg_seq: 1 ,pg_role: primary } }
  vars:
    pg_cluster: pg-v14
    pg_version: 14

pg-v15:
  hosts: { 10.10.10.15: { pg_seq: 1 ,pg_role: primary } }
  vars:
    pg_cluster: pg-v15
    pg_version: 15

pg-v16:
  hosts: { 10.10.10.16: { pg_seq: 1 ,pg_role: primary } }
  vars:
    pg_cluster: pg-v16
    pg_version: 16

6.9 - 参数列表

The PostgreSQL module provides 115 configuration parameters for customizing database clusters and describing every aspect of the PostgreSQL environment.

Parameter	Group	Type	Level	Description
`pg_mode`	`PG_ID`	enum	C	pgsql cluster mode: pgsql,citus,mssql,polar,ivory,oracle,gpsql
`pg_cluster`	`PG_ID`	string	C	pgsql cluster name, required identity parameter
`pg_seq`	`PG_ID`	int	I	pgsql instance sequence number, required identity parameter
`pg_role`	`PG_ID`	enum	I	pgsql instance role, required identity parameter; one of primary, replica, offline
`pg_instances`	`PG_ID`	dict	I	define multiple pg instances on one node, in `{port:ins_vars}` format
`pg_upstream`	`PG_ID`	ip	I	IP address of the replication upstream node, for a cascading replica or a standby cluster
`pg_shard`	`PG_ID`	string	C	pgsql shard name, required identity parameter for horizontally sharded clusters such as citus and gpsql
`pg_group`	`PG_ID`	int	C	pgsql shard index, a non-negative integer, required identity parameter for horizontally sharded clusters such as citus and gpsql
`gp_role`	`PG_ID`	enum	C	greenplum role of this cluster, either master or segment
`pg_exporters`	`PG_ID`	dict	C	additional pg_exporters set up on this node to monitor remote postgres instances
`pg_offline_query`	`PG_ID`	bool	I	set to true to mark this read-only instance as a special offline replica that carries the Offline service and allows offline queries
`pg_users`	`PG_BUSINESS`	user[]	C	postgres business users
`pg_databases`	`PG_BUSINESS`	database[]	C	postgres business databases
`pg_services`	`PG_BUSINESS`	service[]	C	postgres business services
`pg_hba_rules`	`PG_BUSINESS`	hba[]	C	business hba rules for postgres
`pgb_hba_rules`	`PG_BUSINESS`	hba[]	C	business hba rules for pgbouncer
`pg_replication_username`	`PG_BUSINESS`	username	G	postgres replication username, `replicator` by default
`pg_replication_password`	`PG_BUSINESS`	password	G	postgres replication password, `DBUser.Replicator` by default
`pg_admin_username`	`PG_BUSINESS`	username	G	postgres admin username, `dbuser_dba` by default
`pg_admin_password`	`PG_BUSINESS`	password	G	postgres admin password in plain text, `DBUser.DBA` by default
`pg_monitor_username`	`PG_BUSINESS`	username	G	postgres monitor username, `dbuser_monitor` by default
`pg_monitor_password`	`PG_BUSINESS`	password	G	postgres monitor password, `DBUser.Monitor` by default
`pg_dbsu_password`	`PG_BUSINESS`	password	G/C	dbsu password; the default empty string means no dbsu password is set, which is the recommended choice
`pg_dbsu`	`PG_INSTALL`	username	C	os dbsu name, postgres by default; better not to change it
`pg_dbsu_uid`	`PG_INSTALL`	int	C	os dbsu uid and gid, 26 for the default postgres user and group
`pg_dbsu_sudo`	`PG_INSTALL`	enum	C	dbsu sudo privilege: none,limit,all,nopass; limit (limited sudo) by default
`pg_dbsu_home`	`PG_INSTALL`	path	C	postgresql home directory, `/var/lib/pgsql` by default
`pg_dbsu_ssh_exchange`	`PG_INSTALL`	bool	C	exchange postgres dbsu ssh keys within the same pgsql cluster
`pg_version`	`PG_INSTALL`	enum	C	postgres major version to install, 17 by default
`pg_bin_dir`	`PG_INSTALL`	path	C	postgres binary directory, `/usr/pgsql/bin` by default
`pg_log_dir`	`PG_INSTALL`	path	C	postgres log directory, `/pg/log/postgres` by default
`pg_packages`	`PG_INSTALL`	string[]	C	pg packages to install; `${pg_version}` is replaced with the actual major version
`pg_extensions`	`PG_INSTALL`	string[]	C	pg extensions to install; `${pg_version}` is replaced with the actual major version
`pg_safeguard`	`PG_BOOTSTRAP`	bool	G/C/A	safeguard against accidental removal: forbid purging running postgres instances? false by default
`pg_clean`	`PG_BOOTSTRAP`	bool	G/C/A	purge existing postgres during pgsql init? true by default
`pg_data`	`PG_BOOTSTRAP`	path	C	postgres data directory, `/pg/data` by default
`pg_fs_main`	`PG_BOOTSTRAP`	path	C	mount point/path for postgres main data, `/data` by default
`pg_fs_bkup`	`PG_BOOTSTRAP`	path	C	mount point/path for pg backup data, `/data/backups` by default
`pg_storage_type`	`PG_BOOTSTRAP`	enum	C	storage type for pg main data, SSD or HDD, SSD by default; affects auto-tuned parameters
`pg_dummy_filesize`	`PG_BOOTSTRAP`	size	C	size of `/pg/dummy`; 64MiB of disk space is reserved by default for emergency use
`pg_listen`	`PG_BOOTSTRAP`	ip(s)	C/I	listen addresses of postgres/pgbouncer, a comma-separated IP list, `0.0.0.0` by default
`pg_port`	`PG_BOOTSTRAP`	port	C	postgres listen port, 5432 by default
`pg_localhost`	`PG_BOOTSTRAP`	path	C	unix socket directory of postgres, used for local connections
`pg_namespace`	`PG_BOOTSTRAP`	path	C	top-level key namespace in etcd, used by patroni & vip for HA management
`patroni_enabled`	`PG_BOOTSTRAP`	bool	C	if disabled, no postgres cluster is created during init
`patroni_mode`	`PG_BOOTSTRAP`	enum	C	patroni working mode: default,pause,remove
`patroni_port`	`PG_BOOTSTRAP`	port	C	patroni listen port, 8008 by default
`patroni_log_dir`	`PG_BOOTSTRAP`	path	C	patroni log directory, `/pg/log/patroni` by default
`patroni_ssl_enabled`	`PG_BOOTSTRAP`	bool	G	secure patroni RestAPI communication with SSL?
`patroni_watchdog_mode`	`PG_BOOTSTRAP`	enum	C	patroni watchdog mode: automatic,required,off; off by default
`patroni_username`	`PG_BOOTSTRAP`	username	C	patroni restapi username, `postgres` by default
`patroni_password`	`PG_BOOTSTRAP`	password	C	patroni restapi password, `Patroni.API` by default
`pg_primary_db`	`PG_BOOTSTRAP`	string	C	name of the primary database in the cluster, used by Citus and similar modes, `postgres` by default
`pg_parameters`	`PG_BOOTSTRAP`	dict	C	overwrite PostgreSQL parameters in postgresql.auto.conf
`pg_files`	`PG_BOOTSTRAP`	path[]	C	extra files to copy into the PGDATA directory (e.g. license files)
`pg_conf`	`PG_BOOTSTRAP`	enum	C	config template: oltp,olap,crit,tiny; `oltp.yml` by default
`pg_max_conn`	`PG_BOOTSTRAP`	int	C	postgres max connections; `auto` uses the recommended value
`pg_shared_buffer_ratio`	`PG_BOOTSTRAP`	float	C	postgres shared buffer memory ratio, 0.25 by default, range 0.1~0.4
`pg_rto`	`PG_BOOTSTRAP`	int	C	recovery time objective in seconds, `30s` by default
`pg_rpo`	`PG_BOOTSTRAP`	int	C	recovery point objective in bytes, `1MiB` by default
`pg_libs`	`PG_BOOTSTRAP`	string	C	preloaded libraries, `pg_stat_statements,auto_explain` by default
`pg_delay`	`PG_BOOTSTRAP`	interval	I	WAL replay apply delay on the primary of a standby cluster, used to build a delayed replica
`pg_checksum`	`PG_BOOTSTRAP`	bool	C	enable data checksums for the postgres cluster?
`pg_pwd_enc`	`PG_BOOTSTRAP`	enum	C	password encryption algorithm: md5,scram-sha-256
`pg_encoding`	`PG_BOOTSTRAP`	enum	C	database cluster encoding, `UTF8` by default
`pg_locale`	`PG_BOOTSTRAP`	enum	C	database cluster locale, `C` by default
`pg_lc_collate`	`PG_BOOTSTRAP`	enum	C	database cluster collation, `C` by default
`pg_lc_ctype`	`PG_BOOTSTRAP`	enum	C	database character type, `C` by default
`pgbouncer_enabled`	`PG_BOOTSTRAP`	bool	C	if disabled, the pgbouncer connection pool is not configured
`pgbouncer_port`	`PG_BOOTSTRAP`	port	C	pgbouncer listen port, 6432 by default
`pgbouncer_log_dir`	`PG_BOOTSTRAP`	path	C	pgbouncer log directory, `/pg/log/pgbouncer` by default
`pgbouncer_auth_query`	`PG_BOOTSTRAP`	bool	C	use AuthQuery to fetch unlisted business users from postgres?
`pgbouncer_poolmode`	`PG_BOOTSTRAP`	enum	C	pooling mode: transaction,session,statement; transaction by default
`pgbouncer_sslmode`	`PG_BOOTSTRAP`	enum	C	pgbouncer client SSL mode, disabled by default
`pg_provision`	`PG_PROVISION`	bool	C	provision business objects inside the postgres cluster after bootstrap?
`pg_init`	`PG_PROVISION`	string	G/C	init script for the cluster template, `pg-init` by default
`pg_default_roles`	`PG_PROVISION`	role[]	G/C	default predefined roles and system users in the postgres cluster
`pg_default_privileges`	`PG_PROVISION`	string[]	G/C	default privileges applied when admin users create objects in a database
`pg_default_schemas`	`PG_PROVISION`	string[]	G/C	list of default schemas to create
`pg_default_extensions`	`PG_PROVISION`	extension[]	G/C	list of default extensions to create
`pg_reload`	`PG_PROVISION`	bool	A	reload the postgres config immediately after HBA changes?
`pg_default_hba_rules`	`PG_PROVISION`	hba[]	G/C	postgres host-based authentication rules, the global PG default HBA
`pgb_default_hba_rules`	`PG_PROVISION`	hba[]	G/C	pgbouncer default host-based authentication rules, the global PGB default HBA
`pgbackrest_enabled`	`PG_BACKUP`	bool	C	enable pgbackrest on pgsql hosts?
`pgbackrest_clean`	`PG_BACKUP`	bool	C	remove previous pg backup data during init?
`pgbackrest_log_dir`	`PG_BACKUP`	path	C	pgbackrest log directory, `/pg/log/pgbackrest` by default
`pgbackrest_method`	`PG_BACKUP`	enum	C	repository used by pgbackrest: local, minio, etc.
`pgbackrest_repo`	`PG_BACKUP`	dict	G/C	pgbackrest repository definitions
`pg_weight`	`PG_SERVICE`	int	I	relative load-balancing weight in services, 100 by default, range 0-255
`pg_service_provider`	`PG_SERVICE`	string	G/C	name of a dedicated haproxy node group; the default empty string uses haproxy on the local node
`pg_default_service_dest`	`PG_SERVICE`	enum	G/C	where the default service points when svc.dest='default': postgres or pgbouncer; pgbouncer by default
`pg_default_services`	`PG_SERVICE`	service[]	G/C	list of default postgres service definitions, shared globally
`pg_vip_enabled`	`PG_SERVICE`	bool	C	enable an L2 VIP for the pgsql primary? disabled by default
`pg_vip_address`	`PG_SERVICE`	cidr4	C	vip address in `<ipv4>/<mask>` format, required when vip is enabled
`pg_vip_interface`	`PG_SERVICE`	string	C/I	network interface the vip listens on, eth0 by default
`pg_dns_suffix`	`PG_SERVICE`	string	C	pgsql dns suffix, empty by default
`pg_dns_target`	`PG_SERVICE`	enum	C	where PG DNS resolves to: auto, primary, vip, none, or a specific IP address
`pg_exporter_enabled`	`PG_EXPORTER`	bool	C	enable pg_exporter on pgsql hosts?
`pg_exporter_config`	`PG_EXPORTER`	string	C	pg_exporter config file / template name
`pg_exporter_cache_ttls`	`PG_EXPORTER`	string	C	tiered TTL config for pg_exporter collectors, four comma-separated numbers of seconds, '1,10,60,300' by default
`pg_exporter_port`	`PG_EXPORTER`	port	C	pg_exporter listen port, 9630 by default
`pg_exporter_params`	`PG_EXPORTER`	string	C	extra URL parameters passed in the pg_exporter dsn
`pg_exporter_url`	`PG_EXPORTER`	pgurl	C	if specified, overrides the auto-generated postgres DSN connection string
`pg_exporter_auto_discovery`	`PG_EXPORTER`	bool	C	enable automatic database discovery for monitoring? enabled by default
`pg_exporter_exclude_database`	`PG_EXPORTER`	string	C	comma-separated list of database names excluded when auto-discovery is enabled
`pg_exporter_include_database`	`PG_EXPORTER`	string	C	comma-separated list of database names; when auto-discovery is enabled, only these databases are monitored
`pg_exporter_connect_timeout`	`PG_EXPORTER`	int	C	pg_exporter connect timeout in milliseconds, 200 by default
`pg_exporter_options`	`PG_EXPORTER`	arg	C	extra command-line options for pg_exporter
`pgbouncer_exporter_enabled`	`PG_EXPORTER`	bool	C	enable pgbouncer_exporter on pgsql hosts?
`pgbouncer_exporter_port`	`PG_EXPORTER`	port	C	pgbouncer_exporter listen port, 9631 by default
`pgbouncer_exporter_url`	`PG_EXPORTER`	pgurl	C	if specified, overrides the auto-generated pgbouncer dsn connection string
`pgbouncer_exporter_options`	`PG_EXPORTER`	arg	C	extra command-line options for pgbouncer_exporter
`pgbackrest_exporter_enabled`	`PG_EXPORTER`	bool	C	enable pgbackrest_exporter on pgsql hosts?
`pgbackrest_exporter_port`	`PG_EXPORTER`	port	C	pgbackrest_exporter listen port, 9854 by default
`pgbackrest_exporter_options`	`PG_EXPORTER`	arg	C	extra command-line options for pgbackrest_exporter

`PGSQL`

PGSQL 模块需要在 Pigsty 管理的节点上安装（即节点已经配置了 NODE 模块），同时还要求您的部署中有一套可用的 ETCD 集群来存储集群元数据。

在单个节点上安装 PGSQL 模块将创建一个独立的 PGSQL 服务器/实例，即主实例。在额外节点上安装将创建只读副本，可以作为备用实例，并用于承载分担只读请求。您还可以创建用于 ETL/OLAP/交互式查询的离线实例，使用同步备库和法定人数提交来提高数据一致性，甚至搭建备份集群和延迟集群以快速应对人为失误与软件缺陷导致的数据损失。

您可以定义多个 PGSQL 集群并进一步组建一个水平分片集群： Pigsty 支持原生的 citus 集群组，可以将您的标准 PGSQL 集群原地升级为一个分布式的数据库集群。

`PG_ID`

以下是一些常用的参数，用于标识 PGSQL 模块中的实体：集群、实例、服务等…

# pg_cluster:           #CLUSTER  # pgsql 集群名称，必需的标识参数
# pg_seq: 0             #INSTANCE # pgsql 实例序列号，必需的标识参数
# pg_role: replica      #INSTANCE # pgsql 角色，必需的，可以是 primary,replica,offline
# pg_instances: {}      #INSTANCE # 在节点上定义多个 pg 实例，使用 `{port:ins_vars}` 格式
# pg_upstream:          #INSTANCE # 备用集群或级联副本的 repl 上游 ip 地址
# pg_shard:             #CLUSTER  # pgsql 分片名称，分片集群的可选标识
# pg_group: 0           #CLUSTER  # pgsql 分片索引号，分片集群的可选标识
# gp_role: master       #CLUSTER  # 此集群的 greenplum 角色，可以是 master 或 segment
pg_offline_query: false #INSTANCE # 设置为 true 以在此实例上启用离线查询

您必须显式指定这些身份参数，它们没有默认值：

名称	类型	级别	扩展说明
`pg_cluster`	`string`	C	PG 数据库集群名称
`pg_seq`	`number`	I	PG 数据库实例 ID
`pg_role`	`enum`	I	PG 数据库实例角色
`pg_shard`	`string`	C	数据库分片名称
`pg_group`	`number`	C	数据库分片序号

pg_cluster: 它标识集群的名称，该名称在集群级别配置。
pg_role: set at the instance level, identifies the role of the instance. Only the primary role is handled specially; the other roles are replica and the special offline role.
pg_seq: 用于在集群内标识 ins，通常是从 0 或 1 递增的整数，一旦分配就不会更改。
{{ pg_cluster }}-{{ pg_seq }} 用于唯一标识 ins，即 pg_instance。
{{ pg_cluster }}-{{ pg_role }} 用于标识集群内的服务，即 pg_service。
pg_shard 和 pg_group 用于水平分片集群，仅用于 citus、greenplum 和 matrixdb。

pg_cluster、pg_role、pg_seq 是核心标识参数，对于任何 Postgres 集群都是必选的，并且必须显式指定。以下是一个示例：

pg-test:
  hosts:
    10.10.10.11: {pg_seq: 1, pg_role: replica}
    10.10.10.12: {pg_seq: 2, pg_role: primary}
    10.10.10.13: {pg_seq: 3, pg_role: replica}
  vars:
    pg_cluster: pg-test

所有其他参数都可以从全局配置或默认配置继承，但标识参数必须明确指定和手动分配。

`pg_mode`

参数名称： pg_mode，类型： enum，层次：C

PostgreSQL 集群模式，默认值为 pgsql，即标准的 PostgreSQL 集群。

可用的模式选项包括：

pgsql：标准的 PostgreSQL 集群
citus：Citus 分布式数据库集群
mssql：Babelfish MSSQL 线缆协议兼容内核
ivory：IvorySQL Oracle 兼容内核
polar：PolarDB for PostgreSQL 内核
oracle：PolarDB for Oracle 内核
gpsql：Greenplum 并行数据库集群（监控）

如果 pg_mode 设置为 citus 或 gpsql，则需要两个额外的必选身份参数 pg_shard 和 pg_group 来定义水平分片集群的身份。

在这两种情况下，每一个 PostgreSQL 集群都是一组更大的业务单元的一部分。

`pg_cluster`

参数名称： pg_cluster，类型： string，层次：C

PostgreSQL 集群名称，必选的身份标识参数,没有默认值

集群名将用作资源的命名空间。

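Cluster names must follow the naming pattern [a-z][a-z0-9-]*, that is, only lowercase letters, digits, and hyphens, starting with a lowercase letter, so that the name satisfies the constraints of the different places where it is used as an identifier.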

`pg_seq`

参数名称： pg_seq，类型： int，层次：I

PostgreSQL 实例序列号，必选的身份标识参数，无默认值。

此实例的序号，在其集群内是唯一分配的，通常使用自然数，从0或1开始分配，通常不会回收重用。

`pg_role`

参数名称： pg_role，类型： enum，层次：I

PostgreSQL instance role, a required identity parameter with no default value. It can be one of: primary, replica, offline.

primary: 主实例，在集群中有且仅有一个。
replica: 用于承载在线只读流量的副本，高负载下可能会有轻微复制延迟（10ms~100ms, 100KB）。
offline: 用于处理离线只读流量的离线副本，如统计分析/ETL/个人查询等。
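
As a minimal sketch of how the three roles fit together, the inventory below defines one primary, one replica, and one offline instance in a single cluster. The cluster name pg-test and the IP addresses are illustrative values only.

```yaml
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }   # the one and only primary of the cluster
    10.10.10.12: { pg_seq: 2, pg_role: replica }   # carries online read-only traffic
    10.10.10.13: { pg_seq: 3, pg_role: offline }   # dedicated to ETL / analytics / personal queries
  vars:
    pg_cluster: pg-test
```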

`pg_instances`

参数名称： pg_instances，类型： dict，层次：I

使用 {port:ins_vars} 的形式在一台主机上定义多个 PostgreSQL 实例。

此参数是为在单个节点上的多实例部署保留的参数，Pigsty 尚未实现此功能，并强烈建议独占节点部署。

`pg_upstream`

参数名称： pg_upstream，类型： ip，层次：I

备份集群或级联从库的上游实例 IP 地址。

在集群的 primary 实例上设置 pg_upstream ，表示此集群是一个备份集群，该实例将作为 standby leader，从上游集群接收并应用更改。

对非 primary 实例设置 pg_upstream 参数将指定一个具体实例作为物理复制的上游，如果与主实例 ip 地址不同，此实例将成为 级联副本 。确保上游 IP 地址是同一集群中的另一个实例是用户的责任。
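
The two uses of pg_upstream described above can be sketched as follows. The cluster names and IP addresses are illustrative assumptions: pg-test contains a cascading replica, and pg-test2 is a standby cluster that follows pg-test.

```yaml
# cascading replica: 10.10.10.13 replicates from the replica 10.10.10.12,
# which is another instance of the same cluster, instead of from the primary
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica, pg_upstream: 10.10.10.12 }
  vars:
    pg_cluster: pg-test

# standby cluster: pg_upstream on the primary turns it into a standby leader
# that receives and applies changes from the upstream cluster pg-test
pg-test2:
  hosts:
    10.10.10.14: { pg_seq: 1, pg_role: primary, pg_upstream: 10.10.10.11 }
    10.10.10.15: { pg_seq: 2, pg_role: replica }
  vars:
    pg_cluster: pg-test2
```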

`pg_shard`

参数名称： pg_shard，类型： string，层次：C

PostgreSQL horizontal shard name. For sharded clusters (for example, citus clusters), this is a required identity parameter.

当多个标准的 PostgreSQL 集群一起以水平分片方式为同一业务提供服务时，Pigsty 将此组集群标记为 水平分片集群。

pg_shard 是分片组名称。它通常是 pg_cluster 的前缀。

例如，如果我们有一个分片组 pg-citus，并且其中有4个集群，它们的标识参数将是：

cls pg_shard: pg-citus
cls pg_group = 0:   pg-citus0
cls pg_group = 1:   pg-citus1
cls pg_group = 2:   pg-citus2
cls pg_group = 3:   pg-citus3

`pg_group`

参数名称： pg_group，类型： int，层次：C

Shard index of a PostgreSQL horizontally sharded cluster. For sharded clusters (for example, citus clusters), this is a required identity parameter.

此参数与 pg_shard 配对使用，通常可以使用非负整数作为索引号。

`gp_role`

参数名称： gp_role，类型： enum，层次：C

PostgreSQL 集群的 Greenplum/Matrixdb 角色，可以是 master 或 segment。

master: marks the postgres cluster as the greenplum master (coordinator node); this is the default value.
segment: marks the postgres cluster as a greenplum segment cluster (data node).

此参数仅用于 Greenplum/MatrixDB 数据库（pg_mode 为 gpsql），对于普通的 PostgreSQL 集群没有意义。

`pg_exporters`

参数名称： pg_exporters，类型： dict，层次：C

额外用于监控远程 PostgreSQL 实例的 Exporter 定义，默认值：{}

如果您希望监控远程 PostgreSQL 实例，请在监控系统所在节点（Infra节点）集群上的 pg_exporters 参数中定义它们，并使用 pgsql-monitor.yml 剧本来完成部署。

pg_exporters: # list all remote instances here, alloc a unique unused local port as the key
    20001: { pg_cluster: pg-foo, pg_seq: 1, pg_host: 10.10.10.10 }
    20004: { pg_cluster: pg-foo, pg_seq: 2, pg_host: 10.10.10.11 }
    20002: { pg_cluster: pg-bar, pg_seq: 1, pg_host: 10.10.10.12 }
    20003: { pg_cluster: pg-bar, pg_seq: 2, pg_host: 10.10.10.13 }  # pg_seq must be unique within a cluster

`pg_offline_query`

参数名称： pg_offline_query，类型： bool，层次：I

设置为 true 以在此实例上启用离线查询，默认为 false。

当某个 PostgreSQL 实例启用此参数时，属于 dbrole_offline 分组的用户可以直接连接到该 PostgreSQL 实例上执行离线查询（慢查询，交互式查询，ETL/分析类查询）。

带有此标记的实例在效果上类似于为实例设置 pg_role = offline ，唯一的区别在于 offline 实例默认不会承载 replica 服务的请求，是作为专用的离线/分析从库实例而存在的。

如果您没有富余的实例可以专门用于此目的，则可以挑选一台普通的从库，在实例层次启用此参数，以便在需要时承载离线查询。
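
A minimal sketch of that last case, assuming a two-node cluster with illustrative names and addresses: the only replica keeps serving the replica service and is additionally flagged to accept offline queries.

```yaml
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    # ordinary replica that also accepts connections from dbrole_offline users
    10.10.10.12: { pg_seq: 2, pg_role: replica, pg_offline_query: true }
  vars:
    pg_cluster: pg-test
```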

`PG_BUSINESS`

定制集群模板：用户，数据库，服务，权限规则。

用户需重点关注此部分参数，因为这里是业务声明自己所需数据库对象的地方。

业务用户定义： pg_users
业务数据库定义： pg_databases
集群专有服务定义： pg_services （全局定义：pg_default_services）
HBA rules specific to a PostgreSQL cluster/instance: pg_hba_rules
Pgbouncer连接池特定HBA规则： pgb_hba_rules

默认的数据库用户及其凭据，强烈建议在生产环境中修改这些用户的密码。

PG管理员用户：pg_admin_username / pg_admin_password
PG复制用户： pg_replication_username / pg_replication_password
PG监控用户：pg_monitor_username / pg_monitor_password

# postgres business object definition, overwrite in group vars
pg_users: []                      # postgres business users
pg_databases: []                  # postgres business databases
pg_services: []                   # postgres business services
pg_hba_rules: []                  # business hba rules for postgres
pgb_hba_rules: []                 # business hba rules for pgbouncer
# global credentials, overwrite in global vars
pg_dbsu_password: ''              # dbsu password, empty string means no dbsu password by default
pg_replication_username: replicator
pg_replication_password: DBUser.Replicator
pg_admin_username: dbuser_dba
pg_admin_password: DBUser.DBA
pg_monitor_username: dbuser_monitor
pg_monitor_password: DBUser.Monitor

`pg_users`

参数名称： pg_users，类型： user[]，层次：C

PostgreSQL 业务用户列表，需要在 PG 集群层面进行定义。默认值为：[] 空列表。

每一个数组元素都是一个用户/角色定义，例如：

- name: dbuser_meta               # 必需，`name` 是用户定义的唯一必选字段
  password: DBUser.Meta           # 可选，密码，可以是 scram-sha-256 哈希字符串或明文
  login: true                     # 可选，默认情况下可以登录
  superuser: false                # 可选，默认为 false，是超级用户吗？
  createdb: false                 # 可选，默认为 false，可以创建数据库吗？
  createrole: false               # 可选，默认为 false，可以创建角色吗？
  inherit: true                   # 可选，默认情况下，此角色可以使用继承的权限吗？
  replication: false              # 可选，默认为 false，此角色可以进行复制吗？
  bypassrls: false                # 可选，默认为 false，此角色可以绕过行级安全吗？
  pgbouncer: true                 # 可选，默认为 false，将此用户添加到 pgbouncer 用户列表吗？（使用连接池的生产用户应该显式定义为 true）
  connlimit: -1                   # 可选，用户连接限制，默认 -1 禁用限制
  expire_in: 3650                 # 可选，此角色过期时间：从创建时 + n天计算（优先级比 expire_at 更高）
  expire_at: '2030-12-31'         # 可选，此角色过期的时间点，使用 YYYY-MM-DD 格式的字符串指定一个特定日期（优先级没 expire_in 高）
  comment: pigsty admin user      # 可选，此用户/角色的说明与备注字符串
  roles: [dbrole_admin]           # 可选，默认角色为：dbrole_{admin,readonly,readwrite,offline}
  parameters: {}                  # 可选，使用 `ALTER ROLE SET` 针对这个角色，配置角色级的数据库参数
  pool_mode: transaction          # 可选，默认为 transaction 的 pgbouncer 池模式，用户级别
  pool_connlimit: -1              # 可选，用户级别的最大数据库连接数，默认 -1 禁用限制
  search_path: public             # 可选，根据 postgresql 文档的键值配置参数（例如：使用 pigsty 作为默认 search_path）

`pg_databases`

参数名称： pg_databases，类型： database[]，层次：C

PostgreSQL 业务数据库列表，需要在 PG 集群层面进行定义。默认值为：[] 空列表。

每一个数组元素都是一个业务数据库定义，例如：

- name: meta                      # 必选，`name` 是数据库定义的唯一必选字段
  baseline: cmdb.sql              # 可选，数据库 sql 的基线定义文件路径（ansible 搜索路径中的相对路径，如 files/）
  pgbouncer: true                 # 可选，是否将此数据库添加到 pgbouncer 数据库列表？默认为 true
  schemas: [pigsty]               # 可选，要创建的附加模式，由模式名称字符串组成的数组
  extensions:                     # 可选，要安装的附加扩展： 扩展对象的数组
    - { name: postgis , schema: public }  # 可以指定将扩展安装到某个模式中，也可以不指定（不指定则安装到 search_path 首位模式中）
    - { name: timescaledb }               # 例如有的扩展会创建并使用固定的模式，就不需要指定模式。
    - vector                              # 你也可以直接使用字符串指定扩展名称
  comment: pigsty meta database   # 可选，数据库的说明与备注信息
  owner: postgres                 # 可选，数据库所有者，默认为 postgres
  template: template1             # 可选，要使用的模板，默认为 template1，目标必须是一个模板数据库
  encoding: UTF8                  # 可选，数据库编码，默认为 UTF8（必须与模板数据库相同）
  locale: C                       # 可选，数据库地区设置，默认为 C（必须与模板数据库相同）
  lc_collate: C                   # optional, database collation, default C (must match the template database); do not change it without a good reason
  lc_ctype: C                     # 可选，数据库 ctype 字符集，默认为 C（必须与模板数据库相同）
  tablespace: pg_default          # 可选，默认表空间，默认为 'pg_default'
  allowconn: true                 # 可选，是否允许连接，默认为 true。显式设置 false 将完全禁止连接到此数据库
  revokeconn: false               # 可选，撤销公共连接权限。默认为 false，设置为 true 时，属主和管理员之外用户的 CONNECT 权限会被回收
  register_datasource: true       # 可选，是否将此数据库注册到 grafana 数据源？默认为 true，显式设置为 false 会跳过注册
  connlimit: -1                   # 可选，数据库连接限制，默认为 -1 ，不限制，设置为正整数则会限制连接数。
  pool_auth_user: dbuser_meta     # 可选，连接到此 pgbouncer 数据库的所有连接都将使用此用户进行验证（启用 pgbouncer_auth_query 才有用）
  pool_mode: transaction          # 可选，数据库级别的 pgbouncer 池化模式，默认为 transaction
  pool_size: 64                   # 可选，数据库级别的 pgbouncer 默认池子大小，默认为 64
  pool_size_reserve: 32           # 可选，数据库级别的 pgbouncer 池子保留空间，默认为 32，当默认池子不够用时，最多再申请这么多条突发连接。
  pool_size_min: 0                # 可选，数据库级别的 pgbouncer 池的最小大小，默认为 0
  pool_max_db_conn: 100           # 可选，数据库级别的最大数据库连接数，默认为 100

在每个数据库定义对象中，只有 name 是必选字段，其他的字段都是可选项。

`pg_services`

参数名称： pg_services，类型： service[]，层次：C

PostgreSQL 服务列表，需要在 PG 集群层面进行定义。默认值为：[] ，空列表。

用于在数据库集群层面定义额外的服务，数组中的每一个对象定义了一个服务，一个完整的服务定义样例如下：

- name: standby                   # 必选，服务名称，最终的 svc 名称会使用 `pg_cluster` 作为前缀，例如：pg-meta-standby
  port: 5435                      # 必选，暴露的服务端口（作为 kubernetes 服务节点端口模式）
  ip: "*"                         # 可选，服务绑定的 IP 地址，默认情况下为所有 IP 地址
  selector: "[]"                  # 必选，服务成员选择器，使用 JMESPath 来筛选配置清单
  backup: "[? pg_role == `primary`]"  # 可选，服务成员选择器（备份），也就是当默认选择器选中的实例都宕机后，服务才会由这里选中的实例成员来承载
  dest: default                   # 可选，目标端口，default|postgres|pgbouncer|<port_number>，默认为 'default'，Default的意思就是使用 pg_default_service_dest 的取值来最终决定
  check: /sync                    # 可选，健康检查 URL 路径，默认为 /，这里使用 Patroni API：/sync ，只有同步备库和主库才会返回 200 健康状态码 
  maxconn: 5000                   # 可选，允许的前端连接最大数，默认为5000
  balance: roundrobin             # 可选，haproxy 负载均衡算法（默认为 roundrobin，其他选项：leastconn）
  options: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'

请注意，本参数用于在集群层面添加额外的服务。如果您想在全局定义所有 PostgreSQL 数据库都要提供的服务，可以使用 pg_default_services 参数。

`pg_hba_rules`

参数名称： pg_hba_rules，类型： hba[]，层次：C

数据库集群/实例的客户端IP黑白名单规则。默认为：[] 空列表。

对象数组，每一个对象都代表一条规则， hba 规则对象的定义形式如下：

- title: allow intranet password access
  role: common
  rules:
    - host   all  all  10.0.0.0/8      md5
    - host   all  all  172.16.0.0/12   md5
    - host   all  all  192.168.0.0/16  md5

title：规则的标题名称，会被渲染为 HBA 文件中的注释。
rules：规则数组，每个元素是一条标准的 HBA 规则字符串。
role：规则的应用范围，哪些实例角色会启用这条规则？
- common：对于所有实例生效
- primary, replica,offline：只针对特定的角色 pg_role 实例生效。
- 特例：role: 'offline' 的规则除了会应用在 pg_role : offline 的实例上，对于带有 pg_offline_query 标记的实例也生效。

除了上面这种原生 HBA 规则定义形式，Pigsty 还提供了另外一种更为简便的别名形式：

- addr: 'intra'    # world|intra|infra|admin|local|localhost|cluster|<cidr>
  auth: 'pwd'      # trust|pwd|ssl|cert|deny|<official auth method>
  user: 'all'      # all|${dbsu}|${repl}|${admin}|${monitor}|<user>|<group>
  db: 'all'        # all|replication|....
  rules: []        # raw hba string precedence over above all
  title: allow intranet password access

pg_default_hba_rules 与本参数基本类似，但它是用于定义全局的 HBA 规则，而本参数通常用于定制某个集群/实例的 HBA 规则。
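
As a hedged illustration of the alias form at the cluster level, the sketch below attaches two rules to a single cluster. The cluster name, user, database, and subnet are assumptions chosen for the example; the addr and auth values come from the option lists shown above.

```yaml
pg-test:
  hosts: { 10.10.10.11: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-test
    pg_hba_rules:
      # password access for one business user to one database, from the intranet
      - { user: dbuser_meta ,db: meta ,addr: intra         ,auth: pwd ,title: 'allow dbuser_meta password access from intranet' }
      # ssl access for all users from one specific subnet, given as a CIDR block
      - { user: all         ,db: all  ,addr: 10.10.10.0/24 ,auth: ssl ,title: 'allow ssl access from the 10.10.10.0/24 subnet' }
```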

`pgb_hba_rules`

参数名称： pgb_hba_rules，类型： hba[]，层次：C

Pgbouncer 业务HBA规则，默认值为： []，空数组。

此参数与 pg_hba_rules 基本类似，都是 hba 规则对象的数组，区别在于本参数是为 Pgbouncer 准备的。

pgb_default_hba_rules 与本参数基本类似，但它是用于定义全局连接池 HBA 规则，而本参数通常用于定制某个连接池集群/实例的 HBA 规则。

`pg_replication_username`

参数名称： pg_replication_username，类型： username，层次：G

PostgreSQL 物理复制用户名，默认使用 replicator，不建议修改此参数。

`pg_replication_password`

参数名称： pg_replication_password，类型： password，层次：G

PostgreSQL 物理复制用户密码，默认值为：DBUser.Replicator。

警告：请在生产环境中修改此密码！

`pg_admin_username`

参数名称： pg_admin_username，类型： username，层次：G

PostgreSQL / Pgbouncer 管理员名称，默认为：dbuser_dba。

这是全局使用的数据库管理员，具有数据库的 Superuser 权限与连接池的流量管理权限，请务必控制使用范围。

`pg_admin_password`

参数名称： pg_admin_password，类型： password，层次：G

PostgreSQL / Pgbouncer 管理员密码，默认为： DBUser.DBA。

警告：请在生产环境中修改此密码！

`pg_monitor_username`

参数名称： pg_monitor_username，类型： username，层次：G

PostgreSQL/Pgbouncer 监控用户名，默认为：dbuser_monitor。

这是一个用于监控的数据库/连接池用户，不建议修改此用户名。

但如果您的现有数据库使用了不同的监控用户，可以在指定监控目标时使用此参数传入使用的监控用户名。

`pg_monitor_password`

参数名称： pg_monitor_password，类型： password，层次：G

PostgreSQL/Pgbouncer 监控用户使用的密码，默认为：DBUser.Monitor。

请尽可能不要在密码中使用 @:/ 这些容易与 URL 分隔符混淆的字符，减少不必要的麻烦。

警告：请在生产环境中修改此密码！
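
Since these credentials are global-level parameters, one way to follow the warnings above is to override all of them together in the global vars. This is a sketch only: the password values are hypothetical placeholders that you must replace with your own, and they deliberately avoid the @:/ characters.

```yaml
all:
  vars:
    pg_admin_password: Change.Me.DBA                 # placeholder, replace before deployment
    pg_replication_password: Change.Me.Replicator    # placeholder, replace before deployment
    pg_monitor_password: Change.Me.Monitor           # placeholder, replace before deployment
```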

`pg_dbsu_password`

参数名称： pg_dbsu_password，类型： password，层次：G/C

PostgreSQL pg_dbsu 超级用户密码，默认是空字符串，即不为其设置密码。

我们不建议为 dbsu 配置密码登陆，这会增大攻击面。例外情况是：pg_mode = citus，这时候需要为每个分片集群的 dbsu 配置密码，以便在分片集群内部进行连接。

`PG_INSTALL`

本节负责安装 PostgreSQL 及其扩展。如果您希望安装不同大版本与扩展插件，修改 pg_version 与 pg_extensions 即可，不过请注意，并不是所有扩展都在所有大版本可用。

pg_dbsu: postgres                 # os 数据库超级用户名称，默认为 postgres，最好不要更改
pg_dbsu_uid: 26                   # os 数据库超级用户 uid 和 gid，默认为 26，适用于默认的 postgres 用户和组
pg_dbsu_sudo: limit               # 数据库超级用户 sudo 权限，可选 none,limit,all,nopass。默认为 limit
pg_dbsu_home: /var/lib/pgsql      # postgresql 主目录，默认为 `/var/lib/pgsql`
pg_dbsu_ssh_exchange: true        # 是否在相同的 pgsql 集群中交换 postgres 数据库超级用户的 ssh 密钥
pg_version: 17                    # postgres major version to install, default 17
pg_bin_dir: /usr/pgsql/bin        # postgres 二进制目录，默认为 `/usr/pgsql/bin`
pg_log_dir: /pg/log/postgres      # postgres 日志目录，默认为 `/pg/log/postgres`
pg_packages:                      # 待安装的软件包列表，可以使用别名
  - pgsql-main pgsql-common
pg_extensions: []                 # 待安装的扩展列表，可以使用别名

`pg_dbsu`

参数名称： pg_dbsu，类型： username，层次：C

PostgreSQL 使用的操作系统 dbsu 用户名，默认为 postgres，改这个用户名是不太明智的。

不过在特定情况下，您可能会使用到不同于 postgres 的用户名，例如在安装配置 Greenplum / MatrixDB 时，需要使用 gpadmin / mxadmin 作为相应的操作系统超级用户。

`pg_dbsu_uid`

参数名称： pg_dbsu_uid，类型： int，层次：C

操作系统数据库超级用户的 uid 和 gid，26 是 PGDG RPM 默认的 postgres 用户 UID/GID。

Debian/Ubuntu systems have no such default, and uid 26 is often already taken. When Pigsty detects a Debian-family system and the configured uid is 26, it automatically substitutes pg_dbsu_uid = 543.

`pg_dbsu_sudo`

参数名称： pg_dbsu_sudo，类型： enum，层次：C

数据库超级用户的 sudo 权限，可以是 none、limit、all 或 nopass。默认为 limit

none: 无 Sudo 权限
limit: 有限的 sudo 权限，用于执行与数据库相关的组件的 systemctl 命令（默认选项）。
all: 完全的 sudo 权限，需要密码。
nopass: 不需要密码的完全 sudo 权限（不推荐）。
默认值为 limit，只允许执行 sudo systemctl <start|stop|reload> <postgres|patroni|pgbouncer|...> 。

可以使用 sudo 管理的服务列表：

patroni
pgbouncer
postgres
pg_exporter
pgbackrest
pgbouncer_exporter
pgbackrest_exporter
vip-manager
haproxy (仅限reload)

`pg_dbsu_home`

参数名称： pg_dbsu_home，类型： path，层次：C

postgresql 主目录，默认为 /var/lib/pgsql，与官方的 pgdg RPM 保持一致。

`pg_dbsu_ssh_exchange`

参数名称： pg_dbsu_ssh_exchange，类型： bool，层次：C

Exchange the ssh keys of the operating system dbsu user within the pgsql cluster?

默认值为 true，意味着数据库超级用户可以互相 ssh 访问。

对于严格限制 ssh 访问的场景，您可以将其设置为 false。

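Note that the SSH key exchange happens among the instances that run the playbook together. If you run the pgsql role against one PostgreSQL cluster, keys are exchanged among all instances of that cluster. If you run it against all PostgreSQL clusters, keys are exchanged among all instances, and on large deployments the O(n²) pairwise exchange can become very expensive. If any participating instance lacks the pg_dbsu user, the exchanges involving that instance fail, but the exchanges among the other instances are not affected.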

`pg_version`

参数名称： pg_version，类型： enum，层次：C

要安装的 postgres 主版本，默认为 17。

请注意，PostgreSQL 的物理流复制不能跨主要版本，因此最好不要在实例级别上配置此项。

您可以使用 pg_packages 和 pg_extensions 中的参数来为特定的 PG 大版本安装不同的软件包与扩展。

`pg_bin_dir`

参数名称： pg_bin_dir，类型： path，层次：C

PostgreSQL 二进制程序目录，默认为 /usr/pgsql/bin。

默认值是在安装过程中手动创建的软链接，指向安装的特定的 Postgres 版本目录。

For example, /usr/pgsql -> /usr/pgsql-17. On Ubuntu/Debian it points to /usr/lib/postgresql/17/bin instead.

更多详细信息，请查看 PGSQL 文件结构。

`pg_log_dir`

参数名称： pg_log_dir，类型： path，层次：C

PostgreSQL 日志目录，默认为：/pg/log/postgres，Promtail 会使用此变量收集 PostgreSQL 日志。

请注意，如果日志目录 pg_log_dir 以数据库目录 pg_data 作为前缀，则不会显式创建（数据库目录初始化时自动创建）。

`pg_packages`

参数名称： pg_packages，类型： string[]，层次：C

要安装的 PostgreSQL 软件包（rpm/deb），这是一个由软件包名组成的数组，每个元素都是逗号或空格分割的 PG 软件包名或别名（Alias）。

默认值为：[ pgsql-main pgsql-common ]

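The default value consists of two aliases. Alias translation maps them, respectively, to the main RPM/DEB package names for the current PG major version and to the common components that do not depend on the PG version (such as Patroni and pgBackRest).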

从 Pigsty v3 开始，您可以在本参数中使用roles/node_id/vars 中系统对应配置指定的别名列表。

使用包别名的好处是，您无需操心 PostgreSQL 相关软件包在不同系统平台上的包名，架构，以及大版本号，从而屏蔽了不同 OS 之间的区别：

定义在这里的软件包会首先经过 package_map 的翻译，然后经过 PG 大版本号的替换，最后安装实际的 RPM/DEB 包。

您也可以直接指定最终安装的 RPM/DEB 包名称，包名中的 ${pg_version} 或 $v 版本号占位符将被替换为具体的大版本号 pg_version 。
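
The two styles can be sketched side by side. The cluster names and addresses are illustrative. The explicit name postgresql${pg_version}* is an assumed EL-style kernel package pattern, and combining an explicit name with an alias in the same list is also an assumption, so check both against the package names available in your repository.

```yaml
# style 1: aliases only (the default value)
pg-alias:
  hosts: { 10.10.10.21: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-alias
    pg_packages: [ 'pgsql-main pgsql-common' ]

# style 2: an explicit package name with a version placeholder;
# ${pg_version} is replaced with 16 here before installation
pg-explicit:
  hosts: { 10.10.10.22: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-explicit
    pg_version: 16
    pg_packages: [ 'postgresql${pg_version}*', 'pgsql-common' ]
```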

`pg_extensions`

参数名称： pg_extensions，类型： string[]，层次：G/C

要安装的 PostgreSQL 扩展包（rpm/deb），这是一个由扩展包名组成的数组，每个元素都是逗号或空格分割的 PG 扩展包名。

本参数在形式上与 pg_packages 一致，但是通常用于指定需要安装的扩展插件，而且在这里指定的软件包会升级到可用的最新版本。

pg_extensions: []

完整可用的扩展列表，已经在 Pigsty 默认生成的配置文件中给出，用户按需使用即可。

完整列表请参考：roles/node_id/vars 与 Pigsty 扩展目录
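
As a small sketch, the cluster below installs two extensions by package name, reusing the EL-style names from the major-version examples earlier in this document with the version placeholder in place of a hard-coded number. The cluster name and address are illustrative.

```yaml
pg-v16:
  hosts: { 10.10.10.16: { pg_seq: 1 ,pg_role: primary } }
  vars:
    pg_cluster: pg-v16
    pg_version: 16
    # ${pg_version} expands to 16, giving wal2json_16* and pg_repack_16*
    pg_extensions: [ 'wal2json_${pg_version}* pg_repack_${pg_version}*' ]
```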

`PG_BOOTSTRAP`

使用 Patroni 引导拉起 PostgreSQL 集群，并设置 1:1 对应的 Pgbouncer 连接池。

它还会使用 PG_PROVISION 中定义的默认角色、用户、权限、模式、扩展来初始化数据库集群

pg_safeguard: false               # 保护正在运行的 postgres 实例不被清除？默认值为 `false`
pg_clean: true                    # 在 pgsql 初始化期间清除现有的 PG 实例？默认值为 `true`
pg_data: /pg/data                 # postgres 数据目录，默认值为 `/pg/data`
pg_fs_main: /data                 # postgres 主数据盘挂载点/路径，默认值为 `/data`
pg_fs_bkup: /data/backups         # postgres backup disk mount point/path, default `/data/backups`
pg_storage_type: SSD              # postgres 主数据盘存储介质类型，默认值为 `SSD`
pg_dummy_filesize: 64MiB          # 紧急情况下占位符文件 `/pg/dummy` 的大小，默认值为 `64MiB`
pg_listen: '0.0.0.0'              # postgres/pgbouncer 监听地址，默认值为 `0.0.0.0`
pg_port: 5432                     # postgres 监听端口，默认值为 `5432`
pg_localhost: /var/run/postgresql # postgres 本地连接的 Unix 套接字目录，默认值为 `/var/run/postgresql`
patroni_enabled: true             # 如果禁用，在初始化期间将不会创建 postgres 集群
patroni_mode: default             # patroni 工作模式：default,pause,remove
pg_namespace: /pg                 # etcd 中的顶级键命名空间，由 patroni 和 vip 使用
patroni_port: 8008                # patroni 监听端口，默认为 8008
patroni_log_dir: /pg/log/patroni  # patroni 日志目录，默认为 `/pg/log/patroni`
patroni_ssl_enabled: false        # 是否使用 SSL 保护 patroni RestAPI 通信？
patroni_watchdog_mode: off        # patroni 看门狗模式：automatic,required,off。默认为 off
patroni_username: postgres        # patroni restapi 用户名，默认为 `postgres`
patroni_password: Patroni.API     # patroni restapi 密码，默认为 `Patroni.API`
pg_primary_db: postgres           # 主数据库名称，用于 citus 等，默认为 postgres
pg_parameters: {}                 # postgresql.auto.conf 中的额外参数
pg_files: []                      # 要复制到 postgres 数据目录的额外文件（例如许可证）
pg_conf: oltp.yml                 # 配置模板：oltp,olap,crit,tiny。默认为 `oltp.yml`
pg_max_conn: auto                 # postgres 最大连接数，`auto` 将使用推荐值
pg_shared_buffer_ratio: 0.25      # postgres 共享缓冲区比例，默认为 0.25，范围 0.1~0.4
pg_rto: 30                        # 恢复时间目标（秒），默认为 `30s`
pg_rpo: 1048576                   # 恢复点目标（字节），默认最多 `1MiB`
pg_libs: 'pg_stat_statements, auto_explain'  # 预加载库，默认为 `pg_stat_statements,auto_explain`
pg_delay: 0                       # 备用集群领导者的复制应用延迟
pg_checksum: true                 # 为 postgres 集群启用数据校验和？
pg_pwd_enc: scram-sha-256         # 密码加密算法：md5,scram-sha-256
pg_encoding: UTF8                 # 数据库集群编码，默认为 `UTF8`
pg_locale: C                      # 数据库集群区域设置，默认为 `C`
pg_lc_collate: C                  # 数据库集群排序规则，默认为 `C`
pg_lc_ctype: C                    # 数据库字符类型，默认为 `C`
pgbouncer_enabled: true           # 如果禁用，将不会在 pgsql 主机上启动 pgbouncer
pgbouncer_port: 6432              # pgbouncer 监听端口，默认为 6432
pgbouncer_log_dir: /pg/log/pgbouncer  # pgbouncer 日志目录，默认为 `/pg/log/pgbouncer`
pgbouncer_auth_query: false       # 查询 postgres 以检索未列出的业务用户？
pgbouncer_poolmode: transaction   # 连接池模式：transaction,session,statement，默认为 transaction
pgbouncer_sslmode: disable        # pgbouncer client ssl mode, disable by default

`pg_safeguard`

参数名称： pg_safeguard，类型： bool，层次：G/C/A

是否防止清除正在运行的Postgres实例？默认为：false。

如果启用，pgsql.yml 和 pgsql-rm.yml 在检测到任何正在运行的postgres实例时将立即中止。

`pg_clean`

参数名称： pg_clean，类型： bool，层次：G/C/A

在 PostgreSQL 初始化期间清除现有的 PG 实例吗？默认为：true。

默认值为true，在 pgsql.yml 初始化期间它将清除现有的postgres实例，这使得playbook具有幂等性。

如果设置为 false，pgsql.yml 会在遇到正在运行的 PostgreSQL 实例时中止。而 pgsql-rm.yml 将不会删除 PostgreSQL 的数据目录（只会停止服务器）。
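
Putting pg_safeguard and pg_clean together, a conservative setup for a cluster that must never be wiped by accident might look like the sketch below. The cluster name and address are illustrative assumptions.

```yaml
pg-prod:
  hosts: { 10.10.10.21: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-prod
    pg_safeguard: true    # pgsql.yml and pgsql-rm.yml abort when a running postgres instance is detected
    pg_clean: false       # pgsql.yml does not purge an existing instance; pgsql-rm.yml keeps the data directory
```

The trade-off is that pgsql.yml is no longer idempotent on such nodes, because it stops instead of rebuilding an instance that is already running.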

`pg_data`

参数名称： pg_data，类型： path，层次：C

Postgres 数据目录，默认为 /pg/data。

这是一个指向底层实际数据目录的符号链接，在多处被使用，请不要修改它。参阅 PGSQL文件结构获取详细信息。

`pg_fs_main`

参数名称： pg_fs_main，类型： path，层次：C

PostgreSQL 主数据盘的挂载点/文件系统路径，默认为/data。

默认值：/data，它将被用作 PostgreSQL 主数据目录（/data/postgres）的父目录。

建议使用 NVME SSD 作为 PostgreSQL 主数据存储，Pigsty默认为SSD存储进行了优化，但是也支持HDD。

您可以更改pg_storage_type为HDD以针对HDD存储进行优化。

`pg_fs_bkup`

参数名称： pg_fs_bkup，类型： path，层次：C

Mount point / file system path of the PostgreSQL backup data disk, /data/backups by default.

如果您使用的是默认的 pgbackrest_method = local，建议为备份存储使用一个单独的磁盘。

备份磁盘应足够大，以容纳所有的备份，至少足以容纳3个基础备份+2天的WAL归档。通常容量不是什么大问题，因为您可以使用便宜且大的机械硬盘作为备份盘。

建议为备份存储使用一个单独的磁盘，否则 Pigsty 将回退到主数据磁盘，并占用主数据盘的容量与IO。

`pg_storage_type`

参数名称： pg_storage_type，类型： enum，层次：C

PostgreSQL 数据存储介质的类型：SSD或HDD，默认为SSD。

默认值：SSD，它会影响一些调优参数，如 random_page_cost 和 effective_io_concurrency 。
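
A minimal sketch of the storage-related parameters for a cluster on spinning disks, assuming an illustrative cluster name and address. The paths are simply the default values written out explicitly.

```yaml
pg-archive:
  hosts: { 10.10.10.31: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-archive
    pg_fs_main: /data            # parent of the main data directory /data/postgres
    pg_fs_bkup: /data/backups    # ideally a mount point on a separate backup disk
    pg_storage_type: HDD         # tune parameters such as random_page_cost for HDD
```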

`pg_dummy_filesize`

参数名称： pg_dummy_filesize，类型： size，层次：C

Size of /pg/dummy, 64MiB by default: a placeholder file that reserves this much disk space for emergency use.

当磁盘已满时，删除占位符文件可以为紧急使用释放一些空间，建议生产使用至少8GiB。

`pg_listen`

Parameter name: pg_listen, type: ip(s), level: C/I

PostgreSQL / Pgbouncer 的监听地址，默认为0.0.0.0（所有ipv4地址）。

您可以在此变量中使用占位符，例如：'${ip},${lo}'或'${ip},${vip},${lo}'：

${ip}：转换为 inventory_hostname，它是配置清单中定义的首要内网IP地址。
${vip}：如果启用了pg_vip_enabled，将使用pg_vip_address的主机部分。
${lo}：将替换为127.0.0.1

对于高安全性要求的生产环境，建议限制监听的IP地址。
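
For instance, a cluster that should listen only on its intranet address, its VIP, and the loopback address could be sketched as follows. The cluster name, host address, VIP address, and interface are illustrative assumptions.

```yaml
pg-test:
  hosts: { 10.10.10.11: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-test
    pg_listen: '${ip},${vip},${lo}'   # intranet IP, host part of the VIP, and 127.0.0.1
    pg_vip_enabled: true              # needed so that ${vip} can be resolved
    pg_vip_address: 10.10.10.3/24     # ${vip} uses the host part, 10.10.10.3
    pg_vip_interface: eth0
```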

`pg_port`

参数名称： pg_port，类型： port，层次：C

PostgreSQL 服务器监听的端口，默认为 5432。

`pg_localhost`

参数名称： pg_localhost，类型： path，层次：C

本地主机连接 PostgreSQL 使用的 Unix套接字目录，默认值为/var/run/postgresql。

PostgreSQL 和 Pgbouncer 本地连接的Unix套接字目录，pg_exporter 和 patroni 都会优先使用 Unix 套接字访问 PostgreSQL。

`pg_namespace`

参数名称： pg_namespace，类型： path，层次：C

在 etcd 中使用的顶级命名空间，由 patroni 和 vip-manager 使用，默认值是：/pg，不建议更改。

`patroni_enabled`

参数名称： patroni_enabled，类型： bool，层次：C

是否启用 Patroni ？默认值为：true。

如果禁用，则在初始化期间不会创建Postgres集群。Pigsty将跳过拉起 patroni的任务，当试图向现有的postgres实例添加一些组件时，可以使用此参数。

`patroni_mode`

参数名称： patroni_mode，类型： enum，层次：C

Patroni 工作模式：default，pause，remove。默认值：default。

default：正常使用 Patroni 引导 PostgreSQL 集群
pause：与default相似，但在引导后进入维护模式
remove：使用Patroni初始化集群，然后删除Patroni并使用原始 PostgreSQL。

`patroni_port`

参数名称： patroni_port，类型： port，层次：C

patroni监听端口，默认为8008，不建议更改。

Patroni API服务器在此端口上监听健康检查和API请求。

`patroni_log_dir`

参数名称： patroni_log_dir，类型： path，层次：C

patroni日志目录，默认为/pg/log/patroni，由promtail收集。

`patroni_ssl_enabled`

参数名称： patroni_ssl_enabled，类型： bool，层次：G

使用SSL保护patroni RestAPI通信吗？默认值为false。

此参数是一个全局标志，只能在部署之前预先设置。因为如果为 patroni 启用了SSL，您将必须使用 HTTPS 而不是 HTTP 执行健康检查、获取指标，调用API。

`patroni_watchdog_mode`

参数名称： patroni_watchdog_mode，类型： string，层次：C

patroni看门狗模式：automatic，required，off，默认值为 off。

在主库故障的情况下，Patroni 可以使用看门狗来强制关机旧主库节点以避免脑裂。

off：不使用看门狗。完全不进行 Fencing （默认行为）
automatic：如果内核启用了softdog模块并且看门狗属于dbsu，则启用 watchdog。
required：强制启用 watchdog，如果softdog不可用则拒绝启动 Patroni/PostgreSQL。

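The default is off. Do not enable the watchdog on Infra nodes. For critical systems where data consistency takes priority over availability, especially business clusters that handle money, consider turning this option on.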

请注意，如果您的所有访问流量都使用 HAproxy 健康检查服务接入，正常是不存在脑裂风险的。

`patroni_username`

参数名称： patroni_username，类型： username，层次：C

Patroni REST API 用户名，默认为postgres，与patroni_password 配对使用。

Patroni的危险 REST API （比如重启集群）由额外的用户名/密码保护，查看配置集群和Patroni RESTAPI以获取详细信息。

`patroni_password`

参数名称： patroni_password，类型： password，层次：C

Patroni REST API 密码，默认为Patroni.API。

Warning: always change this parameter in production environments!

`pg_primary_db`

参数名称： pg_primary_db，类型： string，层次：C

指定集群中的主数据库名称，用于 citus 等业务数据库，默认为 postgres。

例如，在使用 Patroni 管理高可用的 Citus 集群时，您必须选择一个 “主数据库”。

此外，在这里指定的数据库名称，将在 PGSQL 模块安装完成后，显示在打印的连接串中。

`pg_parameters`

参数名称： pg_parameters，类型： dict，层次：G/C/I

可用于指定并管理 postgresql.auto.conf 中的配置参数。

当集群所有实例完成初始化后，pg_param 任务将会把本字典中的 key / value 键值对依次覆盖写入 /pg/data/postgresql.auto.conf 中。

注意：请不要手工修改该配置文件，或通过 ALTER SYSTEM 修改集群配置参数，修改会在下一次配置同步时被覆盖。

该变量的优先级大于 Patroni / DCS 中的集群配置（即优先级高于集群配置，由 Patroni edit-config 编辑的配置），因此通常可以在实例级别覆盖集群默认参数。

当您的集群成员有着不同的规格（不推荐的行为！）时，您可以通过本参数对每个实例的配置进行精细化管理。

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary , pg_parameters: { shared_buffers: '5GB' } }
    10.10.10.12: { pg_seq: 2, pg_role: replica , pg_parameters: { shared_buffers: '4GB' } }
    10.10.10.13: { pg_seq: 3, pg_role: replica , pg_parameters: { shared_buffers: '3GB' } }

请注意，一些重要的集群参数（对主从库参数值有要求）是 Patroni 直接通过命令行参数管理的，具有最高优先级，无法通过此方式覆盖，对于这些参数，您必须使用 Patroni edit-config 进行管理与配置。

在主从上必须保持一致的 PostgreSQL 参数（不一致会导致从库无法启动！）：

wal_level
max_connections
max_locks_per_transaction
max_worker_processes
max_prepared_transactions
track_commit_timestamp

在主从上最好保持一致的参数（考虑到主从切换的可能性）：

listen_addresses
port
cluster_name
hot_standby
wal_log_hints
max_wal_senders
max_replication_slots
wal_keep_segments
wal_keep_size

您可以设置不存在的参数（例如来自扩展的 GUC，从而配置 ALTER SYSTEM 无法修改的“尚未存在”的参数），但将现有配置修改为非法值可能会导致 PostgreSQL 无法启动，请谨慎配置！

`pg_files`

参数名称： pg_files，类型： path[]，层次：C

用于指定需要拷贝至PGDATA目录的文件列表，默认为空数组：[]

在本参数中指定的文件将会被拷贝至 {{ pg_data }} 目录下，这主要用于下发特殊商业版本 PostgreSQL 内核要求的 License 文件。

目前仅有 PolarDB （Oracle兼容）内核需要许可证文件，例如，您可以将 license.lic 文件放置在 files/ 目录下，并在 pg_files 中指定：

pg_files: [ license.lic ]

`pg_conf`

参数名称： pg_conf，类型： enum，层次：C

配置模板：{oltp,olap,crit,tiny}.yml，默认为oltp.yml。

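tiny.yml: optimized for small nodes, virtual machines, and small demos (1-8 cores, 1-16GB)
oltp.yml: optimized for OLTP workloads and latency-sensitive applications (4C8GB+) (default template)
olap.yml: optimized for OLAP workloads and throughput (4C8GB+)
crit.yml: optimized for data consistency and critical applications (4C8GB+)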

默认值：oltp.yml，但是配置程序将在当前节点为小节点时将此值设置为 tiny.yml。

您可以拥有自己的模板，只需将其放在templates/<mode>.yml下，并将此值设置为模板名称即可使用。

`pg_max_conn`

参数名称： pg_max_conn，类型： int，层次：C

PostgreSQL 服务器最大连接数。你可以选择一个介于 50 到 5000 之间的值，或使用 auto 选择推荐值。

默认值为 auto，会根据 pg_conf 和 pg_default_service_dest 来设定最大连接数。

tiny: 250
olap: 500
crit: 500 (pgbouncer) / 1000 (postgres)
oltp: 500 (pgbouncer) / 1000 (postgres)

不建议将此值设定为超过 5000，否则你还需要手动增加 haproxy 服务的连接限制。

Pgbouncer 的事务池可以缓解过多的 OLTP 连接问题，因此默认情况下不建议设置很大的连接数。

对于 OLAP 场景， pg_default_service_dest 修改为 postgres 可以绕过连接池。
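
The `auto` rule can be read as a lookup on the template name and `pg_default_service_dest`. The sketch below encodes the table above under that reading. The function `resolve_max_conn` and the way the "pgbouncer / postgres" pairs are keyed are assumptions for illustration, not Pigsty's actual implementation.

```python
# Auto values from the table above, keyed by template and pg_default_service_dest.
AUTO_MAX_CONN = {
    "tiny": {"pgbouncer": 250, "postgres": 250},
    "olap": {"pgbouncer": 500, "postgres": 500},
    "crit": {"pgbouncer": 500, "postgres": 1000},
    "oltp": {"pgbouncer": 500, "postgres": 1000},
}


def resolve_max_conn(pg_max_conn, pg_conf="oltp.yml", service_dest="pgbouncer"):
    """Return the effective max connection count for a pg_max_conn setting."""
    if pg_max_conn == "auto":
        template = pg_conf[:-4] if pg_conf.endswith(".yml") else pg_conf
        return AUTO_MAX_CONN[template][service_dest]
    value = int(pg_max_conn)
    if not 50 <= value <= 5000:
        raise ValueError("pg_max_conn should be between 50 and 5000, or 'auto'")
    return value


print(resolve_max_conn("auto", "oltp.yml", "postgres"))
```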

`pg_shared_buffer_ratio`

参数名称： pg_shared_buffer_ratio，类型： float，层次：C

Postgres 共享缓冲区内存比例，默认为 0.25，正常范围在 0.1~0.4 之间。

Default: 0.25, meaning 25% of node memory is used as PostgreSQL shared buffers. If you want to enable huge pages for PostgreSQL, this value should be somewhat smaller than node_hugepage_ratio.

将此值设定为大于 0.4（40%）通常不是好主意，但在极端情况下可能有用。

注意，共享缓冲区只是 PostgreSQL 中共享内存的一部分，要计算总共享内存，使用 show shared_memory_size_in_huge_pages;。
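
As a rough illustration of the ratio, the sketch below derives a shared buffer size from node memory. The helper name and the simple multiplication are assumptions; the real template may round or cap the value differently.

```python
def shared_buffers_mb(node_mem_mb: int, ratio: float = 0.25) -> int:
    """Approximate shared_buffers (MB) as a fraction of node memory."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be between 0 and 1")
    return int(node_mem_mb * ratio)


# A hypothetical 64 GiB node with the default ratio of 0.25.
print(shared_buffers_mb(65536))
```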

`pg_rto`

参数名称： pg_rto，类型： int，层次：C

以秒为单位的恢复时间目标（RTO）。这将用于计算 Patroni 的 TTL 值，默认为 30 秒。

如果主实例在这么长时间内失踪，将触发新的领导者选举，此值并非越低越好，它涉及到利弊权衡：

减小这个值可以减少集群故障转移期间的不可用时间（无法写入），但会使集群对短期网络抖动更加敏感，从而增加误报触发故障转移的几率。

您需要根据网络状况和业务约束来配置这个值，在故障几率和故障影响之间做出权衡，默认值是 30s，它将影响以下的 Patroni 参数：

# 获取领导者租约的 TTL（以秒为单位）。将其视为启动自动故障转移过程之前的时间长度。默认值：30
ttl: {{ pg_rto }}

# 循环将休眠的秒数。默认值：10，这是 patroni 检查循环间隔
loop_wait: {{ (pg_rto / 3)|round(0, 'ceil')|int }}

# DCS 和 PostgreSQL 操作重试的超时时间（以秒为单位）。比这短的 DCS 或网络问题不会导致 Patroni 降级领导。默认值：10
retry_timeout: {{ (pg_rto / 3)|round(0, 'ceil')|int }}

# 主实例在触发故障转移之前允许从故障中恢复的时间（以秒为单位），最大 RTO：2 倍循环等待 + primary_start_timeout
primary_start_timeout: {{ (pg_rto / 3)|round(0, 'ceil')|int }}
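
The Jinja expressions above all reduce to "one third of pg_rto, rounded up". A minimal Python sketch of the same arithmetic follows; the function name `patroni_timing` is hypothetical.

```python
import math


def patroni_timing(pg_rto: int = 30) -> dict:
    """Mirror the template: ttl = pg_rto, the other three values = ceil(pg_rto / 3)."""
    third = math.ceil(pg_rto / 3)
    return {
        "ttl": pg_rto,
        "loop_wait": third,
        "retry_timeout": third,
        "primary_start_timeout": third,
    }


print(patroni_timing(30))
print(patroni_timing(20))
```

With the default of 30 seconds each derived value is 10. A pg_rto that is not a multiple of three, such as 20, rounds up to 7.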

`pg_rpo`

参数名称： pg_rpo，类型： int，层次：C

以字节为单位的恢复点目标（RPO），默认值：1048576。

默认为 1MiB，这意味着在故障转移期间最多可以容忍 1MiB 的数据丢失。

当主节点宕机并且所有副本都滞后时，你必须做出一个艰难的选择，在可用性和一致性之间进行权衡：

提升一个从库成为新的主库，并尽快将系统恢复服务，但要付出可接受的数据丢失代价（例如，少于 1MB）。
等待主库重新上线（可能永远不会），或人工干预以避免任何数据丢失。

你可以使用 crit.yml conf 模板来确保在故障转移期间没有数据丢失，但这会牺牲一些性能。

`pg_libs`

参数名称： pg_libs，类型： string，层次：C

预加载的动态共享库，默认为 pg_stat_statements,auto_explain，这是两个 PostgreSQL 自带的扩展，强烈建议启用。

对于现有集群，您可以直接配置集群的 shared_preload_libraries 参数并应用生效。

如果您想使用 TimescaleDB 或 Citus 扩展，您需要将 timescaledb 或 citus 添加到此列表中。timescaledb 和 citus 应当放在这个列表的最前面，例如：

citus,timescaledb,pg_stat_statements,auto_explain

Other extensions that must be preloaded, such as pg_cron or pgml, can also be added to this list, after citus and timescaledb.

`pg_delay`

参数名称： pg_delay，类型： interval，层次：I

延迟备库复制延迟，默认值：0。

如果此值被设置为一个正值，备用集群主库在应用 WAL 变更之前将被延迟这个时间。设置为 1h 意味着该集群中的数据将始终滞后原集群一个小时。

查看延迟备用集群以获取详细信息。

`pg_checksum`

参数名称： pg_checksum，类型： bool，层次：C

为 PostgreSQL 集群启用数据校验和吗？默认值是 false，不启用。

这个参数只能在 PGSQL 部署之前设置（但你可以稍后手动启用它）。

如果使用 pg_conf crit.yml 模板，无论此参数如何，都会始终启用数据校验和，以确保数据完整性。

`pg_pwd_enc`

参数名称： pg_pwd_enc，类型： enum，层次：C

密码加密算法：md5 或 scram-sha-256，默认值：scram-sha-256。

前者已经不再安全，如果你与旧客户端有兼容性问题，你可以将其设置为 md5。

`pg_encoding`

参数名称： pg_encoding，类型： enum，层次：C

数据库集群编码，默认为 UTF8。

除非你非常清楚自己在做什么，否则不建议使用其他非 UTF8 的编码。

`pg_locale`

参数名称： pg_locale，类型： enum，层次：C

PostgreSQL 使用的本地化规则集，默认为 C。会在数据库初始化时作为参数传递给 initdb 命令。

当 configure 检测到当前 PG 版本大于等于 17，或者当前系统明确支持 C.utf8 时，会自动配置此参数为 C.UTF-8。

When the PostgreSQL version is 17 or higher, the C and C.UTF-8 settings use PostgreSQL's built-in locale provider.

除非你非常清楚自己在做什么，否则强烈建议您使用默认的 C 或 C.UTF-8 配置。

通常应当与 pg_lc_collate 和 pg_lc_ctype 配置保持一致。

`pg_lc_collate`

参数名称： pg_lc_collate，类型： enum，层次：C

PostgreSQL 使用的本地化排序规则集，默认为 C，一旦确定，无法在集群层面修改。

当 configure 检测到当前 PG 版本大于等于 17，或者当前系统明确支持 C.utf8 时，会自动配置此参数为 C.UTF-8。

配置规则与 pg_locale 一致，但针对排序规则。

`pg_lc_ctype`

参数名称： pg_lc_ctype，类型： enum，层次：C

PostgreSQL 使用的本地化字符集定义 CTYPE，默认为 C，一旦确定，无法在集群层面修改。

当 configure 检测到当前 PG 版本大于等于 17，或者当前系统明确支持 C.utf8 时，会自动配置此参数为 C.UTF-8。

配置规则与 pg_locale 一致，但针对字符类型。

`pgbouncer_enabled`

参数名称： pgbouncer_enabled，类型： bool，层次：C

默认值为 true，如果禁用，将不会在 PGSQL节点上配置连接池 Pgbouncer。

`pgbouncer_port`

参数名称： pgbouncer_port，类型： port，层次：C

Pgbouncer 监听端口，默认为 6432。

`pgbouncer_log_dir`

参数名称： pgbouncer_log_dir，类型： path，层次：C

Pgbouncer 日志目录，默认为 /pg/log/pgbouncer，日志代理 promtail 会根据此参数收集 Pgbouncer 日志。

`pgbouncer_auth_query`

参数名称： pgbouncer_auth_query，类型： bool，层次：C

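Whether to let Pgbouncer query PostgreSQL so that users not listed explicitly can still access PostgreSQL through the connection pool. The default is false.

If enabled, Pgbouncer authenticates such users by running SELECT username, password FROM monitor.pgbouncer_auth($1) against the postgres database. Otherwise, only business users marked with pgbouncer: true are allowed to connect to the Pgbouncer connection pool.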

`pgbouncer_poolmode`

参数名称： pgbouncer_poolmode，类型： enum，层次：C

Pgbouncer 连接池池化模式：transaction,session,statement，默认为 transaction。

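session: session-level pooling, with the best feature compatibility.
transaction: transaction-level pooling, with better performance (many small connections); it may break some session-level features such as NOTIFY/LISTEN.
statement: statement-level pooling, for simple read-only queries.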

如果您的应用出现功能兼容性问题，可以考虑修改此参数为 session。

`pgbouncer_sslmode`

参数名称： pgbouncer_sslmode，类型： enum，层次：C

Pgbouncer 客户端 ssl 模式，默认为 disable。

注意，启用 SSL 可能会对你的 pgbouncer 产生巨大的性能影响。

disable：如果客户端请求 TLS 则忽略（默认）
allow：如果客户端请求 TLS 则使用。如果没有则使用纯TCP。不验证客户端证书。
prefer：与 allow 相同。
require：客户端必须使用 TLS。如果没有则拒绝客户端连接。不验证客户端证书。
verify-ca：客户端必须使用有效的客户端证书的TLS。
verify-full：与 verify-ca 相同。

`PG_PROVISION`

如果说 PG_BOOTSTRAP 是创建一个新的集群，那么 PG_PROVISION 就是在集群中创建默认的对象，包括：

默认角色
默认用户
默认权限
默认HBA规则
默认模式
默认扩展

pg_provision: true                # provision postgres cluster after bootstrap
pg_init: pg-init                  # provision init script for cluster template, `pg-init` by default
pg_default_roles:                 # default roles and users in postgres cluster
  - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access     }
  - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
  - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly]               ,comment: role for global read-write access }
  - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite]  ,comment: role for object creation }
  - { name: postgres     ,superuser: true                                          ,comment: system superuser }
  - { name: replicator ,replication: true  ,roles: [pg_monitor, dbrole_readonly]   ,comment: system replicator }
  - { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 , comment: pgsql admin user }
  - { name: dbuser_monitor   ,roles: [pg_monitor, dbrole_readonly] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }
pg_default_privileges:            # 管理员用户创建时的默认权限
  - GRANT USAGE      ON SCHEMAS   TO dbrole_readonly
  - GRANT SELECT     ON TABLES    TO dbrole_readonly
  - GRANT SELECT     ON SEQUENCES TO dbrole_readonly
  - GRANT EXECUTE    ON FUNCTIONS TO dbrole_readonly
  - GRANT USAGE      ON SCHEMAS   TO dbrole_offline
  - GRANT SELECT     ON TABLES    TO dbrole_offline
  - GRANT SELECT     ON SEQUENCES TO dbrole_offline
  - GRANT EXECUTE    ON FUNCTIONS TO dbrole_offline
  - GRANT INSERT     ON TABLES    TO dbrole_readwrite
  - GRANT UPDATE     ON TABLES    TO dbrole_readwrite
  - GRANT DELETE     ON TABLES    TO dbrole_readwrite
  - GRANT USAGE      ON SEQUENCES TO dbrole_readwrite
  - GRANT UPDATE     ON SEQUENCES TO dbrole_readwrite
  - GRANT TRUNCATE   ON TABLES    TO dbrole_admin
  - GRANT REFERENCES ON TABLES    TO dbrole_admin
  - GRANT TRIGGER    ON TABLES    TO dbrole_admin
  - GRANT CREATE     ON SCHEMAS   TO dbrole_admin
pg_default_schemas: [ monitor ]   # 默认模式
pg_default_extensions:            # 默认扩展
  - { name: pg_stat_statements ,schema: monitor }
  - { name: pgstattuple        ,schema: monitor }
  - { name: pg_buffercache     ,schema: monitor }
  - { name: pageinspect        ,schema: monitor }
  - { name: pg_prewarm         ,schema: monitor }
  - { name: pg_visibility      ,schema: monitor }
  - { name: pg_freespacemap    ,schema: monitor }
  - { name: postgres_fdw       ,schema: public  }
  - { name: file_fdw           ,schema: public  }
  - { name: btree_gist         ,schema: public  }
  - { name: btree_gin          ,schema: public  }
  - { name: pg_trgm            ,schema: public  }
  - { name: intagg             ,schema: public  }
  - { name: intarray           ,schema: public  }
  - { name: pg_repack }
pg_reload: true                   # HBA变化后是否重载配置？
pg_default_hba_rules:             # postgres 默认 HBA 规则集
  - {user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'  }
  - {user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
  - {user: '${repl}'    ,db: replication ,addr: localhost ,auth: pwd   ,title: 'replicator replication from localhost'}
  - {user: '${repl}'    ,db: replication ,addr: intra     ,auth: pwd   ,title: 'replicator replication from intranet' }
  - {user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: pwd   ,title: 'replicator postgres db from intranet' }
  - {user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
  - {user: '${monitor}' ,db: all         ,addr: infra     ,auth: pwd   ,title: 'monitor from infra host with password'}
  - {user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: ssl   ,title: 'admin @ everywhere with ssl & pwd'    }
  - {user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: pwd   ,title: 'pgbouncer read/write via local socket'}
  - {user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: pwd   ,title: 'read/write biz user via password'     }
  - {user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: pwd   ,title: 'allow etl offline tasks from intranet'}
pgb_default_hba_rules:            # pgbouncer 默认 HBA 规则集
  - {user: '${dbsu}'    ,db: pgbouncer   ,addr: local     ,auth: peer  ,title: 'dbsu local admin access with os ident'}
  - {user: 'all'        ,db: all         ,addr: localhost ,auth: pwd   ,title: 'allow all user local access with pwd' }
  - {user: '${monitor}' ,db: pgbouncer   ,addr: intra     ,auth: pwd   ,title: 'monitor access via intranet with pwd' }
  - {user: '${monitor}' ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other monitor access addr' }
  - {user: '${admin}'   ,db: all         ,addr: intra     ,auth: pwd   ,title: 'admin access via intranet with pwd'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other admin access addr'   }
  - {user: 'all'        ,db: all         ,addr: intra     ,auth: pwd   ,title: 'allow all user intra access with pwd' }

`pg_provision`

参数名称： pg_provision，类型： bool，层次：C

After the cluster is bootstrapped, carry out the PostgreSQL cluster provisioning defined in this section. The default is true.

如果禁用，不会置备 PostgreSQL 集群。对于一些特殊的 “PostgreSQL” 集群，比如 Greenplum，可以关闭此选项跳过置备阶段。

`pg_init`

参数名称： pg_init，类型： string，层次：G/C

用于初始化数据库模板的Shell脚本位置，默认为 pg-init，该脚本会被拷贝至/pg/bin/pg-init后执行。

该脚本位于 roles/pgsql/templates/pg-init

你可以在该脚本中添加自己的逻辑，或者提供一个新的脚本放置在 templates/ 目录下，并将 pg_init 设置为新的脚本名称。使用自定义脚本时请保留现有的初始化逻辑。

`pg_default_roles`

参数名称： pg_default_roles，类型： role[]，层次：G/C

Postgres 集群中的默认角色和用户。

Pigsty有一个内置的角色系统，请查看PGSQL访问控制：角色系统了解详情。

pg_default_roles:                 # default roles and users in postgres cluster
  - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access     }
  - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
  - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly]               ,comment: role for global read-write access }
  - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite]  ,comment: role for object creation }
  - { name: postgres     ,superuser: true                                          ,comment: system superuser }
  - { name: replicator ,replication: true  ,roles: [pg_monitor, dbrole_readonly]   ,comment: system replicator }
  - { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 , comment: pgsql admin user }
  - { name: dbuser_monitor   ,roles: [pg_monitor, dbrole_readonly] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

`pg_default_privileges`

参数名称： pg_default_privileges，类型： string[]，层次：G/C

每个数据库中的默认权限（DEFAULT PRIVILEGE）设置：

pg_default_privileges:            # 管理员用户创建时的默认权限
  - GRANT USAGE      ON SCHEMAS   TO dbrole_readonly
  - GRANT SELECT     ON TABLES    TO dbrole_readonly
  - GRANT SELECT     ON SEQUENCES TO dbrole_readonly
  - GRANT EXECUTE    ON FUNCTIONS TO dbrole_readonly
  - GRANT USAGE      ON SCHEMAS   TO dbrole_offline
  - GRANT SELECT     ON TABLES    TO dbrole_offline
  - GRANT SELECT     ON SEQUENCES TO dbrole_offline
  - GRANT EXECUTE    ON FUNCTIONS TO dbrole_offline
  - GRANT INSERT     ON TABLES    TO dbrole_readwrite
  - GRANT UPDATE     ON TABLES    TO dbrole_readwrite
  - GRANT DELETE     ON TABLES    TO dbrole_readwrite
  - GRANT USAGE      ON SEQUENCES TO dbrole_readwrite
  - GRANT UPDATE     ON SEQUENCES TO dbrole_readwrite
  - GRANT TRUNCATE   ON TABLES    TO dbrole_admin
  - GRANT REFERENCES ON TABLES    TO dbrole_admin
  - GRANT TRIGGER    ON TABLES    TO dbrole_admin
  - GRANT CREATE     ON SCHEMAS   TO dbrole_admin

Pigsty 基于默认角色系统提供了相应的默认权限设置，请查看PGSQL访问控制：权限了解详情。

`pg_default_schemas`

参数名称： pg_default_schemas，类型： string[]，层次：G/C

要创建的默认模式，默认值为：[ monitor ]，这将在所有数据库上创建一个monitor模式，用于放置各种监控扩展、表、视图、函数。

`pg_default_extensions`

参数名称： pg_default_extensions，类型： extension[]，层次：G/C

要在所有数据库中默认创建启用的扩展列表，默认值：

pg_default_extensions: # default extensions to be created
  - { name: pg_stat_statements ,schema: monitor }
  - { name: pgstattuple        ,schema: monitor }
  - { name: pg_buffercache     ,schema: monitor }
  - { name: pageinspect        ,schema: monitor }
  - { name: pg_prewarm         ,schema: monitor }
  - { name: pg_visibility      ,schema: monitor }
  - { name: pg_freespacemap    ,schema: monitor }
  - { name: postgres_fdw       ,schema: public  }
  - { name: file_fdw           ,schema: public  }
  - { name: btree_gist         ,schema: public  }
  - { name: btree_gin          ,schema: public  }
  - { name: pg_trgm            ,schema: public  }
  - { name: intagg             ,schema: public  }
  - { name: intarray           ,schema: public  }
  - { name: pg_repack }

唯一的三方扩展是 pg_repack，这对于数据库维护很重要，所有其他扩展都是内置的 PostgreSQL Contrib 扩展插件。

监控相关的扩展默认安装在 monitor 模式中，该模式由pg_default_schemas创建。

`pg_reload`

参数名称： pg_reload，类型： bool，层次：A

在hba更改后重新加载 PostgreSQL，默认值为true

当您想在应用HBA更改之前进行检查时，将其设置为false以禁用自动重新加载配置。

`pg_default_hba_rules`

参数名称： pg_default_hba_rules，类型： hba[]，层次：G/C

PostgreSQL 基于主机的认证规则，全局默认规则定义。默认值为：

pg_default_hba_rules:             # postgres default host-based authentication rules
  - {user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'  }
  - {user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
  - {user: '${repl}'    ,db: replication ,addr: localhost ,auth: pwd   ,title: 'replicator replication from localhost'}
  - {user: '${repl}'    ,db: replication ,addr: intra     ,auth: pwd   ,title: 'replicator replication from intranet' }
  - {user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: pwd   ,title: 'replicator postgres db from intranet' }
  - {user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
  - {user: '${monitor}' ,db: all         ,addr: infra     ,auth: pwd   ,title: 'monitor from infra host with password'}
  - {user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: ssl   ,title: 'admin @ everywhere with ssl & pwd'    }
  - {user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: pwd   ,title: 'pgbouncer read/write via local socket'}
  - {user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: pwd   ,title: 'read/write biz user via password'     }
  - {user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: pwd   ,title: 'allow etl offline tasks from intranet'}

默认值为常见场景提供了足够的安全级别，请查看PGSQL身份验证了解详情。

本参数为 HBA规则对象组成的数组，在形式上与 pg_hba_rules 完全一致。建议在全局配置统一的 pg_default_hba_rules，针对特定集群使用 pg_hba_rules 进行额外定制。两个参数中的规则都会依次应用，后者优先级更高。

`pgb_default_hba_rules`

参数名称： pgb_default_hba_rules，类型： hba[]，层次：G/C

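Default host-based authentication rules for Pgbouncer, given as an array of hba rule objects.

The default value provides a sufficient security level for common scenarios. See PGSQL Authentication for details.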

pgb_default_hba_rules:            # pgbouncer default host-based authentication rules
  - {user: '${dbsu}'    ,db: pgbouncer   ,addr: local     ,auth: peer  ,title: 'dbsu local admin access with os ident'}
  - {user: 'all'        ,db: all         ,addr: localhost ,auth: pwd   ,title: 'allow all user local access with pwd' }
  - {user: '${monitor}' ,db: pgbouncer   ,addr: intra     ,auth: pwd   ,title: 'monitor access via intranet with pwd' }
  - {user: '${monitor}' ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other monitor access addr' }
  - {user: '${admin}'   ,db: all         ,addr: intra     ,auth: pwd   ,title: 'admin access via intranet with pwd'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other admin access addr'   }
  - {user: 'all'        ,db: all         ,addr: intra     ,auth: pwd   ,title: 'allow all user intra access with pwd' }

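The default Pgbouncer HBA rules are simple:

Allow password login from localhost
Allow password login from the intranet CIDR range

You can customize them to fit your own needs.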

本参数在形式上与 pgb_hba_rules 完全一致，建议在全局配置统一的 pgb_default_hba_rules，针对特定集群使用 pgb_hba_rules 进行额外定制。两个参数中的规则都会依次应用，后者优先级更高。

`PG_BACKUP`

本节定义了用于 pgBackRest 的变量，它被用于 PGSQL 时间点恢复 PITR 。

查看 PGSQL 备份 & PITR 以获取详细信息。

pgbackrest_enabled: true          # 在 pgsql 主机上启用 pgBackRest 吗？
pgbackrest_clean: true            # 初始化时删除 pg 备份数据？
pgbackrest_log_dir: /pg/log/pgbackrest # pgbackrest 日志目录，默认为 `/pg/log/pgbackrest`
pgbackrest_method: local          # pgbackrest 仓库方法：local, minio, [用户定义...]
pgbackrest_repo:                  # pgbackrest 仓库：https://pgbackrest.org/configuration.html#section-repository
  local:                          # 默认使用本地 posix 文件系统的 pgbackrest 仓库
    path: /pg/backup              # 本地备份目录，默认为 `/pg/backup`
    retention_full_type: count    # 按计数保留完整备份
    retention_full: 2             # 使用本地文件系统仓库时，最多保留 3 个完整备份，至少保留 2 个
  minio:                          # pgbackrest 的可选 minio 仓库
    type: s3                      # minio 是与 s3 兼容的，所以使用 s3
    s3_endpoint: sss.pigsty       # minio 端点域名，默认为 `sss.pigsty`
    s3_region: us-east-1          # minio 区域，默认为 us-east-1，对 minio 无效
    s3_bucket: pgsql              # minio 桶名称，默认为 `pgsql`
    s3_key: pgbackrest            # pgbackrest 的 minio 用户访问密钥
    s3_key_secret: S3User.Backup  # pgbackrest 的 minio 用户秘密密钥
    s3_uri_style: path            # 对 minio 使用路径风格的 uri，而不是主机风格
    path: /pgbackrest             # minio 备份路径，默认为 `/pgbackrest`
    storage_port: 9000            # minio 端口，默认为 9000
    storage_ca_file: /etc/pki/ca.crt  # minio ca 文件路径，默认为 `/etc/pki/ca.crt`
    block: y                      # 启用块增量备份
    bundle: y                     # 将小文件捆绑在一起
    bundle_limit: 20MiB           # 文件捆绑限制，20MiB 用于对象存储
    bundle_size: 128MiB           # 文件捆绑目标大小，128MiB 用于对象存储
    cipher_type: aes-256-cbc      # 为远程备份仓库启用 AES 加密
    cipher_pass: pgBackRest       # AES 加密密码，默认为 'pgBackRest'
    retention_full_type: time     # 在 minio 仓库上按时间保留完整备份
    retention_full: 14            # 保留过去 14 天的完整备份

`pgbackrest_enabled`

参数名称： pgbackrest_enabled，类型： bool，层次：C

是否在 PGSQL 节点上启用 pgBackRest？默认值为： true

在使用本地文件系统备份仓库（local）时，只有集群主库才会真正启用 pgbackrest。其他实例只会初始化一个空仓库。

`pgbackrest_clean`

参数名称： pgbackrest_clean，类型： bool，层次：C

初始化时删除 PostgreSQL 备份数据吗？默认值为 true。

`pgbackrest_log_dir`

参数名称： pgbackrest_log_dir，类型： path，层次：C

pgBackRest 日志目录，默认为 /pg/log/pgbackrest，promtail 日志代理会引用此参数收集日志。

`pgbackrest_method`

参数名称： pgbackrest_method，类型： enum，层次：C

pgBackRest 仓库方法：默认可选项为：local、minio 或其他用户定义的方法，默认为 local。

此参数用于确定用于 pgBackRest 的仓库，所有可用的仓库方法都在 pgbackrest_repo 中定义。

Pigsty 默认使用 local 备份仓库，这将在主实例的 /pg/backup 目录上创建一个备份仓库。底层存储路径由 pg_fs_bkup 指定。

`pgbackrest_repo`

参数名称： pgbackrest_repo，类型： dict，层次：G/C

pgBackRest 仓库文档：https://pgbackrest.org/configuration.html#section-repository

默认值包括两种仓库方法：local 和 minio，定义如下：

pgbackrest_repo:                  # pgbackrest 仓库：https://pgbackrest.org/configuration.html#section-repository
  local:                          # 默认使用本地 posix 文件系统的 pgbackrest 仓库
    path: /pg/backup              # 本地备份目录，默认为 `/pg/backup`
    retention_full_type: count    # 按计数保留完整备份
    retention_full: 2             # 使用本地文件系统仓库时，最多保留 3 个完整备份，至少保留 2 个
  minio:                          # pgbackrest 的可选 minio 仓库
    type: s3                      # minio 是与 s3 兼容的，所以使用 s3
    s3_endpoint: sss.pigsty       # minio 端点域名，默认为 `sss.pigsty`
    s3_region: us-east-1          # minio 区域，默认为 us-east-1，对 minio 无效
    s3_bucket: pgsql              # minio 桶名称，默认为 `pgsql`
    s3_key: pgbackrest            # pgbackrest 的 minio 用户访问密钥
    s3_key_secret: S3User.Backup  # pgbackrest 的 minio 用户秘密密钥
    s3_uri_style: path            # 对 minio 使用路径风格的 uri，而不是主机风格
    path: /pgbackrest             # minio 备份路径，默认为 `/pgbackrest`
    storage_port: 9000            # minio 端口，默认为 9000
    storage_ca_file: /etc/pki/ca.crt  # minio ca 文件路径，默认为 `/etc/pki/ca.crt`
    block: y                      # 启用块增量备份
    bundle: y                     # 将小文件捆绑在一起
    bundle_limit: 20MiB           # 文件捆绑限制，20MiB 用于对象存储
    bundle_size: 128MiB           # 文件捆绑目标大小，128MiB 用于对象存储
    cipher_type: aes-256-cbc      # 为远程备份仓库启用 AES 加密
    cipher_pass: pgBackRest       # AES 加密密码，默认为 'pgBackRest'
    retention_full_type: time     # 在 minio 仓库上按时间保留完整备份
    retention_full: 14            # 保留过去 14 天的完整备份

您可以定义新的备份仓库，例如使用 AWS S3，GCP 或其他云供应商的 S3 兼容存储服务。

在备份仓库定义参数中，你可以使用 ${pg_cluster} 变量来引用集群名称，例如作为备份路径或加密密钥的一部分。但如果你有跨集群 PITR 的需求，则应该保持备份仓库路径与加密密钥相同。

`PG_SERVICE`

本节介绍如何将PostgreSQL服务暴露给外部世界，包括：

使用haproxy在不同的端口上暴露不同的PostgreSQL服务
使用vip-manager将可选的L2 VIP绑定到主实例
在基础设施节点上使用dnsmasq注册集群/实例DNS记录

pg_weight: 100                    #INSTANCE # relative load-balancing weight in a service, 100 by default, range 0-255
pg_default_service_dest: pgbouncer # 如果svc.dest='default'，则此为默认服务目的地
pg_default_services:              # postgres默认服务定义
  - { name: primary ,port: 5433 ,dest: default  ,check: /primary   ,selector: "[]" }
  - { name: replica ,port: 5434 ,dest: default  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }
  - { name: default ,port: 5436 ,dest: postgres ,check: /primary   ,selector: "[]" }
  - { name: offline ,port: 5438 ,dest: postgres ,check: /replica   ,selector: "[? pg_role == `offline` || pg_offline_query ]" , backup: "[? pg_role == `replica` && !pg_offline_query]"}
pg_vip_enabled: false             # 为pgsql主要实例启用l2 vip吗? 默认为false
pg_vip_address: 127.0.0.1/24      # `<ipv4>/<mask>`格式的vip地址，如果启用vip则需要
pg_vip_interface: eth0            # vip网络接口监听，默认为eth0
pg_dns_suffix: ''                 # pgsql dns后缀，默认为空
pg_dns_target: auto               # auto、primary、vip、none或特定的ip

`pg_weight`

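Parameter name: pg_weight, type: int, level: I

Relative load-balancing weight of an instance within a service, 100 by default, range 0-255.

Define it in instance variables, then reload the service for the change to take effect.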

`pg_service_provider`

参数名称： pg_service_provider，类型： string，层次：G/C

专用的haproxy节点组名，或默认为本地节点的空字符串。

如果指定，PostgreSQL服务将注册到专用的haproxy节点组，而不是当下的 PGSQL 集群节点。

请记住为每个服务在专用的 haproxy 节点上分配唯一的端口！

例如，如果我们在3节点的 pg-test 集群上定义以下参数：

pg_service_provider: infra       # use load balancer on group `infra`
pg_default_services:             # alloc port 10001 and 10002 for pg-test primary/replica service  
  - { name: primary ,port: 10001 ,dest: postgres  ,check: /primary   ,selector: "[]" }
  - { name: replica ,port: 10002 ,dest: postgres  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }

`pg_default_service_dest`

参数名称： pg_default_service_dest，类型： enum，层次：G/C

当定义一个服务时，如果 svc.dest='default'，此参数将用作默认值。

默认值： pgbouncer，意味着 5433 读写服务和 5434 只读服务将默认将流量路由到 pgbouncer。

如果您不想使用 pgbouncer，将其设置为 postgres。流量将直接路由到 postgres。

`pg_default_services`

参数名称： pg_default_services，类型： service[]，层次：G/C

postgres默认服务定义

默认值是四个默认服务定义，如PGSQL Service所述

pg_default_services:               # postgres default service definitions
  - { name: primary ,port: 5433 ,dest: default  ,check: /primary   ,selector: "[]" }
  - { name: replica ,port: 5434 ,dest: default  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }
  - { name: default ,port: 5436 ,dest: postgres ,check: /primary   ,selector: "[]" }
  - { name: offline ,port: 5438 ,dest: postgres ,check: /replica   ,selector: "[? pg_role == `offline` || pg_offline_query ]" , backup: "[? pg_role == `replica` && !pg_offline_query]"}
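
The `selector` and `backup` fields are filter expressions over the cluster's instances. The plain-Python sketch below imitates what the `offline` service filters express, without evaluating the expressions themselves. The instance names and the helper `offline_service_members` are hypothetical.

```python
def offline_service_members(instances):
    """Split instances into regular and backup members of the offline service."""
    # selector: pg_role == offline, or the pg_offline_query flag is set
    selected = [i["name"] for i in instances
                if i["pg_role"] == "offline" or i.get("pg_offline_query", False)]
    # backup: plain replicas without the pg_offline_query flag
    backup = [i["name"] for i in instances
              if i["pg_role"] == "replica" and not i.get("pg_offline_query", False)]
    return selected, backup


cluster = [
    {"name": "pg-test-1", "pg_role": "primary"},
    {"name": "pg-test-2", "pg_role": "replica"},
    {"name": "pg-test-3", "pg_role": "replica", "pg_offline_query": True},
]
print(offline_service_members(cluster))
```

In this made-up cluster, pg-test-3 serves offline queries and pg-test-2 is only a backup member, while the primary is in neither group.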

`pg_vip_enabled`

参数名称： pg_vip_enabled，类型： bool，层次：C

为 PGSQL 集群启用 L2 VIP吗？默认值是false，表示不创建 L2 VIP。

启用 L2 VIP 后，会有一个 VIP 绑定在集群主实例节点上，由 vip-manager 管理，根据 etcd 中的数据进行判断。

L2 VIP只能在相同的L2网络中使用，这可能会对您的网络拓扑产生额外的限制。

`pg_vip_address`

参数名称： pg_vip_address，类型： cidr4，层次：C

如果启用vip，则需要<ipv4>/<mask>格式的vip地址。

默认值： 127.0.0.1/24。这个值由两部分组成：ipv4和mask，用/分隔。
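
To see how the two parts relate, the standard library's `ipaddress` module can split a `<ipv4>/<mask>` value into the VIP itself and the L2 network it belongs to. This is only an illustration; the helper `split_vip` is hypothetical and 10.10.10.3/24 is a sample value.

```python
import ipaddress


def split_vip(pg_vip_address: str):
    """Return (vip, network) for a '<ipv4>/<mask>' value."""
    iface = ipaddress.ip_interface(pg_vip_address)
    if iface.version != 4:
        raise ValueError("pg_vip_address must be an IPv4 CIDR")
    return str(iface.ip), str(iface.network)


print(split_vip("10.10.10.3/24"))
```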

`pg_vip_interface`

参数名称： pg_vip_interface，类型： string，层次：C/I

The network interface that the L2 VIP listens on, eth0 by default.

它应该是您节点的首要网卡名，即您在配置清单中使用的IP地址。

如果您的节点有多块名称不同的网卡，您可以在实例变量上进行覆盖：

pg-test:
    hosts:
        10.10.10.11: {pg_seq: 1, pg_role: replica ,pg_vip_interface: eth0 }
        10.10.10.12: {pg_seq: 2, pg_role: primary ,pg_vip_interface: eth1 }
        10.10.10.13: {pg_seq: 3, pg_role: replica ,pg_vip_interface: eth2 }
    vars:
      pg_vip_enabled: true          # 为这个集群启用L2 VIP，默认绑定到主实例
      pg_vip_address: 10.10.10.3/24 # L2网络CIDR: 10.10.10.0/24, vip地址: 10.10.10.3
      # pg_vip_interface: eth1      # 如果您的节点有统一的接口，您可以在这里定义它

`pg_dns_suffix`

参数名称： pg_dns_suffix，类型： string，层次：C

PostgreSQL DNS 名称后缀，默认为空字符串。

By default, the PostgreSQL cluster name is registered as a DNS name in dnsmasq on the Infra nodes, which resolves it for clients.

您可以通过本参数指定一个域名后缀，这样会使用 {{ pg_cluster }}{{ pg_dns_suffix }} 作为集群 DNS 名称。

例如，如果您将 pg_dns_suffix 设置为 .db.vip.company.tld，那么 pg-test 的集群 DNS 名称将是 pg-test.db.vip.company.tld

`pg_dns_target`

参数名称： pg_dns_target，类型： enum，层次：C

可以是：auto、primary、vip、none或一个特定的IP地址，它将是集群DNS记录的解析目标IP地址。

默认值： auto，如果pg_vip_enabled，将绑定到pg_vip_address，否则会回退到集群主实例的 IP 地址。

vip：绑定到pg_vip_address
primary：解析为集群主实例IP地址
auto：如果 pg_vip_enabled，解析为 pg_vip_address，或回退到集群主实例ip地址。
none：不绑定到任何ip地址
<ipv4>：绑定到指定的IP地址
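
The resolution rules listed above can be sketched as a small function. `resolve_dns_target` is a hypothetical helper that follows the five cases literally; it is not the code Pigsty runs.

```python
def resolve_dns_target(pg_dns_target, primary_ip, pg_vip_enabled=False, pg_vip_address=""):
    """Return the IP the cluster DNS record should point to, or None."""
    vip = pg_vip_address.split("/")[0]  # host part of '<ipv4>/<mask>'
    if pg_dns_target == "none":
        return None
    if pg_dns_target == "primary":
        return primary_ip
    if pg_dns_target == "vip":
        return vip
    if pg_dns_target == "auto":
        return vip if pg_vip_enabled else primary_ip
    return pg_dns_target  # anything else is treated as a specific IPv4 address


print(resolve_dns_target("auto", "10.10.10.11", True, "10.10.10.3/24"))
```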

`PG_EXPORTER`

PG Exporter 用于监控 PostgreSQL 数据库与 Pgbouncer 连接池的状态。

pg_exporter_enabled: true              # 在 pgsql 主机上启用 pg_exporter 吗？
pg_exporter_config: pg_exporter.yml    # pg_exporter 配置文件名
pg_exporter_cache_ttls: '1,10,60,300'  # pg_exporter 收集器 ttl 阶段（秒），默认为 '1,10,60,300'
pg_exporter_port: 9630                 # pg_exporter 监听端口，默认为 9630
pg_exporter_params: 'sslmode=disable'  # pg_exporter dsn 的额外 url 参数
pg_exporter_url: ''                    # 如果指定，将覆盖自动生成的 pg dsn
pg_exporter_auto_discovery: true       # 启用自动数据库发现？默认启用
pg_exporter_exclude_database: 'template0,template1,postgres' # 在自动发现过程中不会被监控的数据库的 csv 列表
pg_exporter_include_database: ''       # 在自动发现过程中将被监控的数据库的 csv 列表
pg_exporter_connect_timeout: 200       # pg_exporter 连接超时（毫秒），默认为 200
pg_exporter_options: ''                # 覆盖 pg_exporter 的额外选项
pgbouncer_exporter_enabled: true       # 在 pgsql 主机上启用 pgbouncer_exporter 吗？
pgbouncer_exporter_port: 9631          # pgbouncer_exporter 监听端口，默认为 9631
pgbouncer_exporter_url: ''             # 如果指定，将覆盖自动生成的 pgbouncer dsn
pgbouncer_exporter_options: ''         # 覆盖 pgbouncer_exporter 的额外选项
pgbackrest_exporter_enabled: true      # 在 pgsql 主机上启用 pgbackrest_exporter 吗？
pgbackrest_exporter_port: 9854         # pgbackrest_exporter 监听端口，默认为 9854
pgbackrest_exporter_options: ''        # 覆盖 pgbackrest_exporter 的额外选项

`pg_exporter_enabled`

参数名称： pg_exporter_enabled，类型： bool，层次：C

是否在 PGSQL 节点上启用 pg_exporter？默认值为：true。

PG Exporter 用于监控 PostgreSQL 数据库实例，如果不想安装 pg_exporter 可以设置为 false。

`pg_exporter_config`

参数名称： pg_exporter_config，类型： string，层次：C

pg_exporter 配置文件名，PG Exporter 和 PGBouncer Exporter 都会使用这个配置文件。默认值：pg_exporter.yml。

如果你想使用自定义配置文件，你可以在这里定义它。你的自定义配置文件应当放置于 files/<name>.yml。

例如，当您希望监控一个远程的 PolarDB 数据库实例时，可以使用样例配置：files/polar_exporter.yml。

`pg_exporter_cache_ttls`

参数名称： pg_exporter_cache_ttls，类型： string，层次：C

pg_exporter collector TTL stages in seconds, '1,10,60,300' by default.

The four comma-separated values set the cache TTL for four classes of metric collectors: 1s, 10s, 60s, and 300s.

PG Exporter 内置了缓存机制，避免多个 Prometheus 重复抓取对数据库产生不当影响，所有指标收集器按 TTL 分为四类：

ttl_fast: "{{ pg_exporter_cache_ttls.split(',')[0]|int }}"         # critical queries
ttl_norm: "{{ pg_exporter_cache_ttls.split(',')[1]|int }}"         # common queries
ttl_slow: "{{ pg_exporter_cache_ttls.split(',')[2]|int }}"         # slow queries (e.g table size)
ttl_slowest: "{{ pg_exporter_cache_ttls.split(',')[3]|int }}"      # very slow queries (e.g bloat)

例如，在默认配置下，存活类指标默认最多缓存 1s，大部分普通指标会缓存 10s（应当与 prometheus_scrape_interval 相同）。少量变化缓慢的查询会有 60s 的TTL，极个别大开销监控查询会有 300s 的TTL。
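The four-way split performed by the template expressions above can be reproduced in plain shell. This is a minimal sketch for illustration only: `split_cache_ttls` is a hypothetical helper, not part of Pigsty or pg_exporter, and it assumes the value always holds exactly four comma-separated integers.

```shell
#!/bin/sh
# Hypothetical helper: split a pg_exporter_cache_ttls value into its four tiers,
# mirroring the Jinja expressions split(',')[0..3]|int shown above.
split_cache_ttls() {
  old_ifs=$IFS
  IFS=','
  set -- $1            # word-split the CSV value into $1..$4
  IFS=$old_ifs
  printf 'ttl_fast=%d ttl_norm=%d ttl_slow=%d ttl_slowest=%d\n' "$1" "$2" "$3" "$4"
}

split_cache_ttls '1,10,60,300'
```

Passing a different ladder, such as `5,15,90,600`, shows how each position maps to one collector class.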

`pg_exporter_port`

参数名称： pg_exporter_port，类型： port，层次：C

The port pg_exporter listens on. Default: 9630. (9631 is the default port of pgbouncer_exporter, which runs on the same node.)

`pg_exporter_params`

参数名称： pg_exporter_params，类型： string，层次：C

Extra URL query parameters appended to the DSN that pg_exporter uses.

默认值：sslmode=disable，它将禁用用于监控连接的 SSL（因为默认使用本地 unix 套接字）。

`pg_exporter_url`

参数名称： pg_exporter_url，类型： pgurl，层次：C

如果指定了本参数，将会覆盖自动生成的 PostgreSQL DSN，使用指定的 DSN 连接 PostgreSQL 。默认值为空字符串。

如果没有指定此参数，PG Exporter 默认会使用以下的连接串访问 PostgreSQL ：

postgres://{{ pg_monitor_username }}:{{ pg_monitor_password }}@{{ pg_host }}:{{ pg_port }}/postgres{% if pg_exporter_params != '' %}?{{ pg_exporter_params }}{% endif %}

当您想监控一个远程的 PostgreSQL 实例时，或者需要使用不同的监控用户/密码，配置选项时，可以使用这个参数。
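The default connection string above is plain string assembly: the query part is appended only when `pg_exporter_params` is non-empty. A minimal sketch in shell follows. `build_pg_exporter_dsn` is a hypothetical helper written for this illustration, and it assumes the user name and password contain no characters that need URL encoding.

```shell
#!/bin/sh
# Hypothetical helper: assemble the DSN the same way the template does.
# Arguments: <user> <password> <host> <port> <params>
build_pg_exporter_dsn() {
  dsn="postgres://$1:$2@$3:$4/postgres"
  if [ -n "$5" ]; then
    dsn="$dsn?$5"      # append extra URL parameters only when provided
  fi
  printf '%s\n' "$dsn"
}

build_pg_exporter_dsn dbuser_monitor DBUser.Monitor 10.10.10.11 5432 'sslmode=disable'
```

A value built this way could be placed in `pg_exporter_url` when the monitoring target or credentials differ from the generated defaults.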

`pg_exporter_auto_discovery`

参数名称： pg_exporter_auto_discovery，类型： bool，层次：C

启用自动数据库发现吗？默认启用：true。

PG Exporter 默认会连接到 DSN 中指定的数据库（默认为管理数据库 postgres）收集全局指标，如果您希望收集所有业务数据库的指标，可以开启此选项。 PG Exporter 会自动发现目标 PostgreSQL 实例中的所有数据库，并在这些数据库中收集 库级监控指标。

`pg_exporter_exclude_database`

参数名称： pg_exporter_exclude_database，类型： string，层次：C

如果启用了数据库自动发现（默认启用），在这个参数指定的列表中的数据库将不会被监控。默认值为： template0,template1,postgres，即管理数据库 postgres 与模板数据库会被排除在自动监控的数据库之外。

作为例外，DSN 中指定的数据库不受此参数影响，例如，PG Exporter 如果连接的是 postgres 数据库，那么即使 postgres 在此列表中，也会被监控。

`pg_exporter_include_database`

参数名称： pg_exporter_include_database，类型： string，层次：C

如果启用了数据库自动发现（默认启用），在这个参数指定的列表中的数据库才会被监控。默认值为空字符串，即不启用此功能。

参数的形式是由逗号分隔的数据库名称列表，例如：db1,db2,db3。

此参数相对于 [pg_exporter_exclude_database] 有更高的优先级，相当于白名单模式。如果您只希望监控特定的数据库，可以使用此参数。
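The interplay of the two lists can be sketched as a small decision function. This only models the rule as described in this section (a non-empty include list acts as a whitelist and takes priority; otherwise the exclude list applies). It is not pg_exporter's actual code, and `csv_contains` and `is_db_monitored` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helpers modelling the include/exclude rule described above.
csv_contains() {       # csv_contains <csv> <item>: succeed if item is in the list
  case ",$1," in
    *",$2,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

is_db_monitored() {    # is_db_monitored <dbname> <include_csv> <exclude_csv>
  if [ -n "$2" ]; then
    csv_contains "$2" "$1"       # whitelist mode: only listed databases
  else
    ! csv_contains "$3" "$1"     # blacklist mode: everything not excluded
  fi
}

for db in meta template1; do
  if is_db_monitored "$db" '' 'template0,template1,postgres'; then
    echo "$db: monitored"
  else
    echo "$db: skipped"
  fi
done
```

The sketch leaves out the exception for the database named in the DSN, which is monitored regardless of these lists.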

`pg_exporter_connect_timeout`

参数名称： pg_exporter_connect_timeout，类型： int，层次：C

Connection timeout for pg_exporter, in milliseconds. Default: 200.

当 PG Exporter 尝试连接到 PostgreSQL 数据库时，最多会等待多长时间？超过这个时间，PG Exporter 将会放弃连接并报错。

默认值 200毫秒对于绝大多数场景（例如：同可用区监控）都是足够的，但是如果您监控的远程 PostgreSQL 位于另一个大洲，您可能需要增加此值以避免连接超时。

`pg_exporter_options`

参数名称： pg_exporter_options，类型： arg，层次：C

传给 PG Exporter 的命令行参数，默认值为："" 空字符串。

当使用空字符串时，会使用默认的命令参数：

{% if pg_exporter_options != '' %}
PG_EXPORTER_OPTS='--web.listen-address=:{{ pg_exporter_port }} {{ pg_exporter_options }}'
{% else %}
PG_EXPORTER_OPTS='--web.listen-address=:{{ pg_exporter_port }} --log.level=info'
{% endif %}

注意，请不要在本参数中覆盖 pg_exporter_port 的端口配置。
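The branching on `pg_exporter_options` (custom options when set, `--log.level=info` otherwise) can be sketched in shell as follows. `render_exporter_opts` is a hypothetical helper for illustration, not a Pigsty script. The listen address is always derived from the port, which is why the port should not be overridden through the options string.

```shell
#!/bin/sh
# Hypothetical helper: render the exporter option string.
# Arguments: <port> <extra_options>
render_exporter_opts() {
  if [ -n "$2" ]; then
    printf '%s\n' "--web.listen-address=:$1 $2"
  else
    printf '%s\n' "--web.listen-address=:$1 --log.level=info"
  fi
}

render_exporter_opts 9630 ''
render_exporter_opts 9630 '--log.level=debug'
```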

`pgbouncer_exporter_enabled`

参数名称： pgbouncer_exporter_enabled，类型： bool，层次：C

在 PGSQL 节点上，是否启用 pgbouncer_exporter ？默认值为：true。

`pgbouncer_exporter_port`

参数名称： pgbouncer_exporter_port，类型： port，层次：C

pgbouncer_exporter 监听端口号，默认值为：9631

`pgbouncer_exporter_url`

参数名称： pgbouncer_exporter_url，类型： pgurl，层次：C

如果指定了本参数，将会覆盖自动生成的 pgbouncer DSN，使用指定的 DSN 连接 pgbouncer。默认值为空字符串。

如果没有指定此参数，Pgbouncer Exporter 默认会使用以下的连接串访问 Pgbouncer：

postgres://{{ pg_monitor_username }}:{{ pg_monitor_password }}@:{{ pgbouncer_port }}/pgbouncer?host={{ pg_localhost }}&sslmode=disable

当您想监控一个远程的 Pgbouncer 实例时，或者需要使用不同的监控用户/密码，配置选项时，可以使用这个参数。

`pgbouncer_exporter_options`

参数名称： pgbouncer_exporter_options，类型： arg，层次：C

传给 Pgbouncer Exporter 的命令行参数，默认值为："" 空字符串。

当使用空字符串时，会使用默认的命令参数：

{% if pgbouncer_exporter_options != '' %}
PG_EXPORTER_OPTS='--web.listen-address=:{{ pgbouncer_exporter_port }} {{ pgbouncer_exporter_options }}'
{% else %}
PG_EXPORTER_OPTS='--web.listen-address=:{{ pgbouncer_exporter_port }} --log.level=info'
{% endif %}

注意，请不要在本参数中覆盖 pgbouncer_exporter_port 的端口配置。

`pgbackrest_exporter_enabled`

参数名称： pgbackrest_exporter_enabled，类型： bool，层次：C

在 PGSQL 节点上，是否启用 pgbackrest_exporter ？默认值为：true。

如果 pgbackrest_enabled 为 false，则本参数因短路无效。

`pgbackrest_exporter_port`

参数名称： pgbackrest_exporter_port，类型： port，层次：C

pgbackrest_exporter 监听端口号，默认值为：9854

`pgbackrest_exporter_options`

参数名称： pgbackrest_exporter_options，类型： arg，层次：C

Command-line arguments passed to pgBackRest Exporter. Default: "" (an empty string).

6.10 - 预置剧本

如何使用 ansible 剧本来管理 PostgreSQL 集群

PostgreSQL 剧本

Pigsty提供了一系列剧本，用于集群上下线扩缩容，用户/数据库管理，监控或迁移已有实例。

pgsql.yml ：初始化PostgreSQL集群或添加新的从库。
pgsql-rm.yml ：移除PostgreSQL集群，或移除某个实例
pgsql-user.yml ：在现有的PostgreSQL集群中添加新的业务用户
pgsql-db.yml ：在现有的PostgreSQL集群中添加新的业务数据库
pgsql-monitor.yml ：将远程postgres实例纳入监控中
pgsql-migration.yml ：为现有的PostgreSQL集群生成迁移手册和脚本

保护机制

使用 PGSQL 剧本时需要特别注意，剧本 pgsql.yml 与 pgsql-rm.yml 使用不当会有误删数据库的风险！

在使用pgsql.yml时，请再三检查--tags|-t 与 --limit|-l 参数是否正确。
强烈建议在执行时添加-l参数，限制命令执行的对象范围，并确保自己在正确的目标上执行正确的任务。
限制范围通常以一个数据库集群为宜，使用不带参数的pgsql.yml在生产环境中是一个高危操作，务必三思而后行。

出于防止误删的目的，Pigsty 的 PGSQL 模块提供了防误删保险，由以下两个参数控制：

pg_safeguard 默认为 false，不打开。
pg_clean 默认为 true，默认清理已有实例。

对初始化剧本的影响

当 pgsql.yml 剧本执行中遭遇配置相同的运行中现存实例时，会有以下行为表现：

`pg_safeguard` / `pg_clean`	`pg_clean=true`	`pg_clean=false`
`pg_safeguard=false`	抹除实例	中止执行
`pg_safeguard=true`	中止执行	中止执行

如果 pg_safeguard 启用，那么该剧本会中止执行，避免误删。
如果没有启用，那么会进一步根据 pg_clean 的取值，来决定是否移除现有的实例。
- 如果 pg_clean 为 true，该剧本会直接清理现有实例，为新实例腾出空间。这是默认行为。
- 如果 pg_clean 为 false，该剧本会中止执行，这需要显式配置。

对下线剧本的影响

当 pgsql-rm.yml 剧本执行中遭遇配置相同的运行中现存实例时，会有以下行为表现：

`pg_safeguard` / `pg_clean`	`pg_clean=true`	`pg_clean=false`
`pg_safeguard=false`	抹除实例与数据	抹除实例
`pg_safeguard=true`	中止执行	中止执行

如果 pg_safeguard 启用，那么该剧本会中止执行，避免误删。
如果没有启用，那么会继续抹除实例，同时 pg_clean 在本剧本中会被解释为：是否移除数据目录。
- 如果 pg_clean 为 true，该剧本会直接一并清理 PostgreSQL 数据目录，即所谓“删库”，这是默认行为。
- 如果 pg_clean 为 false，该剧本保留数据目录，继续完成其他清理工作，这需要显式配置。
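The two decision tables above (for pgsql.yml and pgsql-rm.yml) can be condensed into one lookup. This is a sketch of the documented behavior only. `safeguard_action` is a hypothetical function, and the action strings are labels chosen for this example rather than playbook output.

```shell
#!/bin/sh
# Hypothetical helper: look up what a playbook does to an existing instance.
# Arguments: <playbook> <pg_safeguard> <pg_clean>
safeguard_action() {
  if [ "$2" = "true" ]; then
    echo "abort"                 # the safeguard always wins
    return 0
  fi
  case "$1:$3" in
    pgsql.yml:true)     echo "purge instance" ;;
    pgsql.yml:false)    echo "abort" ;;
    pgsql-rm.yml:true)  echo "remove instance and data" ;;
    pgsql-rm.yml:false) echo "remove instance, keep data" ;;
    *)                  echo "unknown"; return 1 ;;
  esac
}

safeguard_action pgsql.yml    false true
safeguard_action pgsql-rm.yml false false
safeguard_action pgsql-rm.yml true  true
```

With the defaults (`pg_safeguard=false`, `pg_clean=true`), both playbooks take the destructive branch, so enabling `pg_safeguard` on production clusters is the conservative choice.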

`pgsql.yml`

剧本 pgsql.yml 用于初始化PostgreSQL集群或添加新的从库。

本剧本包含以下子任务：

# pg_clean      : 清理现有的 postgres（如有必要）
# pg_dbsu       : 为 postgres dbsu 设置操作系统用户sudo
# pg_install    : 安装 postgres 包和扩展
#   - pg_pkg              : 安装 postgres 相关包
#   - pg_extension        : 仅安装 postgres 扩展
#   - pg_path             : 将 pgsql 版本 bin 链接到 /usr/pgsql
#   - pg_env              : 将 pgsql bin 添加到系统路径
# pg_dir        : 创建 postgres 目录并设置 fhs
# pg_util       : 复制工具脚本，设置别名和环境
#   - pg_bin              : 同步 postgres 工具脚本 /pg/bin
#   - pg_alias            : 写入 /etc/profile.d/pg-alias.sh
#   - pg_psql             : 为 psql 创建 psqlrc 文件
#   - pg_dummy            : 创建 dummy 占位文件
# patroni       : 使用 patroni 引导 postgres
#   - pg_config           : 生成 postgres 配置
#     - pg_conf           : 生成 patroni 配置
#     - pg_systemd        : 生成 patroni systemd 配置
#     - pgbackrest_config : 生成 pgbackrest 配置
#   - pg_cert             : 为 postgres 签发证书
#   - pg_launch           : 启动 postgres 主服务器和副本
#     - pg_watchdog       : 授予 postgres watchdog 权限
#     - pg_primary        : 启动 patroni/postgres 主服务器
#     - pg_init           : 使用角色/模板初始化 pg 集群
#     - pg_pass           : 将 .pgpass 文件写入 pg 主目录
#     - pg_replica        : 启动 patroni/postgres 副本
#     - pg_hba            : 生成 pg HBA 规则
#     - patroni_reload    : 重新加载 patroni 配置
#     - pg_patroni        : 必要时暂停或删除 patroni
# pg_user       : 配置 postgres 业务用户
#   - pg_user_config      : 渲染创建用户的 sql
#   - pg_user_create      : 在 postgres 上创建用户
# pg_db         : 配置 postgres 业务数据库
#   - pg_db_config        : 渲染创建数据库的 sql
#   - pg_db_create        : 在 postgres 上创建数据库
# pg_backup               : 初始化 pgbackrest 仓库和基础备份
#   - pgbackrest_init     : 初始化 pgbackrest 仓库
#   - pgbackrest_backup   : 引导后进行初始备份
# pgbouncer     : 与 postgres 一起部署 pgbouncer 边车
#   - pgbouncer_clean     : 清理现有的 pgbouncer
#   - pgbouncer_dir       : 创建 pgbouncer 目录
#   - pgbouncer_config    : 生成 pgbouncer 配置
#       -  pgbouncer_svc    : 生成 pgbouncer systemd 配置
#       -  pgbouncer_ini    : 生成 pgbouncer 主配置
#       -  pgbouncer_hba    : 生成 pgbouncer hba 配置
#       -  pgbouncer_db     : 生成 pgbouncer 数据库配置
#       -  pgbouncer_user   : 生成 pgbouncer 用户配置
#   -  pgbouncer_launch   : 启动 pgbouncer 池化服务
#   -  pgbouncer_reload   : 重新加载 pgbouncer 配置
# pg_vip        : 使用 vip-manager 将 vip 绑定到 pgsql 主服务器
#   - pg_vip_config       : 为 vip-manager 生成配置
#   - pg_vip_launch       : 启动 vip-manager 绑定 vip
# pg_dns        : 将 dns 名称注册到 infra dnsmasq
#   - pg_dns_ins          : 注册 pg 实例名称
#   - pg_dns_cls          : 注册 pg 集群名称
# pg_service    : 使用 haproxy 公开 pgsql 服务
#   - pg_service_config   : 为 pg 服务生成本地 haproxy 配置
#   - pg_service_reload   : 使用 haproxy 公开 postgres 服务
# pg_exporter   : 部署 pg_exporter 与 pgbouncer_exporter 监控组件
#   - pg_exporter_config  : 配置 pg_exporter 和 pgbouncer_exporter
#   - pg_exporter_launch  : 启动 pg_exporter
#   - pgbouncer_exporter_launch : 启动 pgbouncer 导出器
# pg_register   : 将 postgres 注册到 pigsty 基础设施
#   - register_prometheus : 将 pg 注册为 prometheus 监控目标
#   - register_grafana    : 将 pg 数据库注册为 grafana 数据源

以下管理任务使用到了此剧本

一些关于本剧本的注意事项

单独针对某一集群从库执行此剧本时，用户应当确保 集群主库已经完成初始化！

扩容完成后，您需要重载服务与重载HBA，包装脚本 pgsql-add 会完成这些任务。
详情请参考管理 SOP：添加实例

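When scaling out a cluster, the Ansible playbook may abort on timeout if Patroni takes too long to bring up the replica.

The typical symptom is that the task `wait for postgres/patroni replica` runs for a long time and then aborts.
The process that builds the replica keeps running, however (for example, when building the replica takes more than a day). For follow-up handling, see FAQ: replica creation failed.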

`pgsql-rm.yml`

剧本 pgsql-rm.yml 用于移除PostgreSQL集群，或移除某个实例。

本剧本包含以下子任务：

# register       : 在 prometheus、grafana、nginx 中移除注册
#   - prometheus : 从 prometheus 移除监控目标
#   - grafana    : 从 grafana 移除数据源
# dns            : 移除 INFRA节点上 DNSMASQ 的 pg dns 记录
# vip            : 移除 vip-manager，与绑定在集群主库上的 VIP
# pg_service     : 从 haproxy 上移除 PostgreSQL 服务定义并重载生效
# pg_exporter    : 移除 pg_exporter 和 pgbouncer_exporter 监控组件
# pgbouncer      : 移除 pgbouncer 连接池中间件
# postgres       : 移除 postgres 数据库实例
#   - pg_replica : 移除所有从库
#   - pg_primary : 最后移除主库
#   - dcs        : 从 dcs:etcd 移除元数据
# pg_data        : 移除 PostgreSQL 数据目录（使用 `pg_clean=false` 禁用）
# pgbackrest     : 移除主实例时，一并移除 PostgreSQL 备份（使用 `pgbackrest_clean=false` 禁用）
# pg_pkg         : 移除 PostgreSQL 软件包（使用 `pg_uninstall=true` 启用）

本剧本可以使用一些命令行参数影响其行为：

./pgsql-rm.yml -l pg-test     # 移除集群 `pg-test`
    -e pg_clean=true          # 是否一并移除 PostgreSQL 数据库目录？默认移除数据目录。
    -e pgbackrest_clean=true  # 是否一并移除 PostgreSQL 备份？（只针对主库执行时生效），默认移除备份数据。
    -e pg_uninstall=false     # 默认不会卸载 PostgreSQL 软件包，需要显式指定此参数才会卸载。
    -e pg_safeguard=false     # 防误删保险默认不打开，如果打开，可以在这里用命令行参数强行覆盖。

以下管理任务使用到了此剧本

一些关于本剧本的注意事项

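Do not run this playbook against the primary alone while the cluster still has replicas

Otherwise, once the primary is wiped, the remaining replicas will trigger an automatic high-availability failover.
Always take all replicas offline first, then the primary. This is not a concern when you take the whole cluster offline at once.

Refresh the cluster services after an instance is taken offline

When you take a replica offline from a cluster, it still remains in the load balancer's configuration file.
Because it can no longer pass any health check, the removed instance has no effect on the cluster.
You should still reload the services at a suitable time to keep the production environment consistent with the inventory.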

`pgsql-user.yml`

剧本 pgsql-user.yml 用于在现有的PostgreSQL集群中添加新的业务用户

详情请参考：管理SOP：创建用户

`pgsql-db.yml`

剧本 pgsql-db.yml 用于在现有的PostgreSQL集群中添加新的业务数据库

详情请参考：管理SOP：创建数据库

`pgsql-monitor.yml`

剧本 pgsql-monitor.yml 用于将远程postgres实例纳入监控中

详情请参考：管理SOP：监控现有PG

`pgsql-migration.yml`

剧本 pgsql-migration.yml 用于为现有的PostgreSQL集群生成迁移手册和脚本

详情请参考：管理SOP：迁移数据库集群

6.11 - 管理预案

Pigsty 中常用的 PostgreSQL 管理预案，用于维护生产环境中的数据库集群。

这里是一些常见 PostgreSQL 管理任务的 SOP 预案：

案例1：创建集群
案例2：创建用户
案例3：创建数据库
案例4：重载服务
案例5：重载HBA
案例6：配置集群
案例7：添加实例
案例8：移除实例
案例9：下线集群
案例10：主动切换
案例11：备份集群
案例12：恢复集群
案例13：添加软件
案例14：安装扩展
案例15：小版本升级
案例16：大版本升级

命令速查

PGSQL 剧本与快捷方式：

bin/pgsql-add   <cls>                   # 创建 pgsql 集群 <cls>
bin/pgsql-user  <cls> <username>        # 在 <cls> 上创建 pg 用户 <username>
bin/pgsql-db    <cls> <dbname>          # 在 <cls> 上创建 pg 数据库 <dbname>
bin/pgsql-svc   <cls> [...ip]           # 重新加载集群 <cls> 的 pg 服务
bin/pgsql-hba   <cls> [...ip]           # 重新加载集群 <cls> 的 postgres/pgbouncer HBA 规则
bin/pgsql-add   <cls> [...ip]           # 为集群 <cls> 添加从库副本
bin/pgsql-rm    <cls> [...ip]           # 从集群 <cls> 移除实例
bin/pgsql-rm    <cls>                   # 删除 pgsql 集群 <cls>

Patroni 管理命令与快捷方式：

pg list        <cls>                    # 打印集群信息
pg edit-config <cls>                    # 编辑集群配置
pg reload      <cls> [ins]              # 重新加载集群配置
pg restart     <cls> [ins]              # 重启 PostgreSQL 集群
pg reinit      <cls> [ins]              # 重新初始化集群成员
pg pause       <cls>                    # 进入维护模式（自动故障转移暂停）
pg resume      <cls>                    # 退出维护模式
pg switchover  <cls>                    # 在集群 <cls> 上进行主动主从切换（主库健康）
pg failover    <cls>                    # 在集群 <cls> 上进行故障转移（主库故障）

pgBackRest 备份/恢复命令与快捷方式：

pb info                                 # 打印 pgbackrest 备份仓库信息
pg-backup                               # 进行备份，默认进行增量备份，如果没有完整备份过就做全量备份
pg-backup full                          # 进行全量备份
pg-backup diff                          # 进行差异备份
pg-backup incr                          # 进行增量备份
pg-pitr -i                              # 恢复到最近备份完成的时间（不常用）
pg-pitr --time="2022-12-30 14:44:44+08" # 恢复到特定时间点（如在删除数据库或表的情况下）
pg-pitr --name="my-restore-point"       # 恢复到由 pg_create_restore_point 创建的命名还原点
pg-pitr --lsn="0/7C82CB8" -X            # 恢复到 LSN 之前
pg-pitr --xid="1234567" -X -P           # 恢复到特定的事务ID之前，然后将其提升为主库
pg-pitr --backup=latest                 # 恢复到最新的备份集
pg-pitr --backup=20221108-105325        # 恢复到特定的备份集，使用名称指定，可以使用 pgbackrest info 进行检查

使用 Systemd 管理系统组件的命令：

systemctl stop patroni                  # 启动 停止 重启 重载
systemctl stop pgbouncer                # 启动 停止 重启 重载
systemctl stop pg_exporter              # 启动 停止 重启 重载
systemctl stop pgbouncer_exporter       # 启动 停止 重启 重载
systemctl stop node_exporter            # 启动 停止 重启
systemctl stop haproxy                  # 启动 停止 重启 重载
systemctl stop vip-manager              # 启动 停止 重启 重载
systemctl stop postgres                 # 仅当 patroni_mode == 'remove' 时使用这个服务

创建集群

要创建一个新的Postgres集群，请首先在配置清单中定义，然后进行初始化：

bin/node-add <cls>                # 为集群 <cls> 初始化节点                  # ./node.yml  -l <cls> 
bin/pgsql-add <cls>               # 初始化集群 <cls> 的pgsql实例             # ./pgsql.yml -l <cls>

请注意，PGSQL 模块需要在 Pigsty 纳管的节点上安装，请先使用 bin/node-add 纳管节点。

示例：创建集群

创建用户

要在现有的Postgres集群上创建一个新的业务用户，请将用户定义添加到 all.children.<cls>.pg_users，然后使用以下命令将其创建：

bin/pgsql-user <cls> <username>   # ./pgsql-user.yml -l <cls> -e username=<username>

示例：创建业务用户

创建数据库

To create a new business database on an existing Postgres cluster, add the database definition to all.children.<cls>.pg_databases, then create it as follows:

bin/pgsql-db <cls> <dbname>       # ./pgsql-db.yml -l <cls> -e dbname=<dbname>

注意：如果数据库指定了一个非默认的属主，该属主用户应当已存在，否则您必须先创建用户。

示例：创建业务数据库

重载服务

服务是 PostgreSQL 对外提供能力的访问点（PGURL可达），由主机节点上的 HAProxy 对外暴露。

当集群成员发生变化时使用此任务，例如：添加／移除副本，主从切换／故障转移 / 暴露新服务，或更新现有服务的配置（例如，LB权重）

要在整个代理集群，或特定实例上创建新服务或重新加载现有服务：

bin/pgsql-svc <cls>               # pgsql.yml -l <cls> -t pg_service -e pg_reload=true
bin/pgsql-svc <cls> [ip...]       # pgsql.yml -l ip... -t pg_service -e pg_reload=true

示例：重载PG服务以踢除一个实例

重载HBA

当您的 Postgres/Pgbouncer HBA 规则发生更改时，您可能需要重载 HBA 以应用更改。

如果您有任何特定于角色的 HBA 规则，或者在IP地址段中引用了集群成员的别名，那么当主从切换/集群扩缩容后也可能需要重载HBA。

要在整个集群或特定实例上重新加载 postgres 和 pgbouncer 的 HBA 规则：

bin/pgsql-hba <cls>               # pgsql.yml -l <cls> -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload -e pg_reload=true
bin/pgsql-hba <cls> [ip...]       # pgsql.yml -l ip... -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload -e pg_reload=true

示例：重载集群 HBA 规则

配置集群

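To change the configuration of an existing Postgres cluster, issue the control command from the admin node as the admin user (the user that installed Pigsty, with passwordless ssh/sudo).

Alternatively, you can run the command as dbsu (postgres by default) on any node of the database cluster, but you can then manage only that cluster.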

pg edit-config <cls>              # interactive config a cluster with patronictl

更改 patroni 参数和 postgresql.parameters，根据提示保存并应用更改即可。

示例：非交互式方式配置集群

您可以跳过交互模式，并使用 -p 选项覆盖 postgres 参数，例如：

pg edit-config -p log_min_duration_statement=1000 pg-test
pg edit-config --force -p shared_preload_libraries='timescaledb, pg_cron, pg_stat_statements, auto_explain'

示例：使用 Patroni REST API 更改集群配置

您还可以使用 Patroni REST API 以非交互式方式更改配置，例如：

$ curl -s 10.10.10.11:8008/config | jq .  # get current config
$ curl -u 'postgres:Patroni.API' \
        -d '{"postgresql":{"parameters": {"log_min_duration_statement":200}}}' \
        -s -X PATCH http://10.10.10.11:8008/config | jq .

注意：Patroni 敏感API（例如重启等）访问仅限于从基础设施/管理节点发起，并且有 HTTP 基本认证（用户名/密码）以及可选的 HTTPS 保护。

示例：使用 patronictl 配置集群

添加实例

若要将新从库添加到现有的 PostgreSQL 集群中，您需要将其定义添加到配置清单：all.children.<cls>.hosts 中，然后：

bin/node-add <ip>                 # 将节点 <ip> 纳入 Pigsty 管理                
bin/pgsql-add <cls> <ip>          # 初始化 <ip> ，作为集群 <cls> 的新从库

这将会把节点 <ip> 添加到 pigsty 并将其初始化为集群 <cls> 的一个副本。

集群服务将会重新加载以接纳新成员。

示例：为 pg-test 添加从库

例如，如果您想将 pg-test-3 / 10.10.10.13 添加到现有的集群 pg-test，您首先需要更新配置清单：

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary } # 已存在的成员
    10.10.10.12: { pg_seq: 2, pg_role: replica } # 已存在的成员
    10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- 新成员
  vars: { pg_cluster: pg-test }

然后按如下方式应用更改：

bin/node-add          10.10.10.13   # 将节点添加到 pigsty
bin/pgsql-add pg-test 10.10.10.13   # 在 10.10.10.13 上为集群 pg-test 初始化新的副本

这与集群初始化相似，但只在单个实例上工作：

[ OK ] 初始化实例  10.10.10.11 到 pgsql 集群 'pg-test' 中:
[WARN]   提醒：先将节点添加到 pigsty 中，然后再安装模块 'pgsql'
[HINT]     $ bin/node-add  10.10.10.11  # 除 infra 节点外，先运行此命令
[WARN]   从集群初始化实例：
[ OK ]     $ ./pgsql.yml -l '10.10.10.11,&pg-test'
[WARN]   重新加载现有实例上的 pg_service：
[ OK ]     $ ./pgsql.yml -l 'pg-test,!10.10.10.11' -t pg_service

移除实例

若要从现有的 PostgreSQL 集群中移除副本：

bin/pgsql-rm <cls> <ip...>        # ./pgsql-rm.yml -l <ip>

这将从集群 <cls> 中移除实例 <ip>。集群服务将会重新加载以从负载均衡器中踢除已移除的实例。

示例：从 pg-test 移除从库

例如，如果您想从现有的集群 pg-test 中移除 pg-test-3 / 10.10.10.13：

bin/pgsql-rm pg-test 10.10.10.13  # 从 pg-test 中移除 pgsql 实例 10.10.10.13
bin/node-rm  10.10.10.13          # 从 pigsty 中移除该节点（可选）
vi pigsty.yml                     # 从配置清单中移除实例定义
bin/pgsql-svc pg-test             # 刷新现有实例上的 pg_service，以从负载均衡器中踢除已移除的实例

[ OK ] 从 'pg-test' 移除 10.10.10.13 的 pgsql 实例：
[WARN]   从集群中移除实例：
[ OK ]     $ ./pgsql-rm.yml -l '10.10.10.13,&pg-test'

并从配置清单中移除实例定义：

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- 执行后移除此行
  vars: { pg_cluster: pg-test }

最后，您可以重载PG服务并从负载均衡器中踢除已移除的实例：

bin/pgsql-svc pg-test             # 重载 pg-test 上的服务

下线集群

要移除整个 Postgres 集群，只需运行：

bin/pgsql-rm <cls>                # ./pgsql-rm.yml -l <cls>

示例：移除集群

示例：强制移除集群

注意：如果为这个集群配置了pg_safeguard（或全局设置为 true），pgsql-rm.yml 将中止，以避免意外移除集群。

您可以使用 playbook 命令行参数明确地覆盖它，以强制执行清除：

./pgsql-rm.yml -l pg-meta -e pg_safeguard=false    # 强制移除 pg 集群 pg-meta

主动切换

您可以使用 patroni 命令行工具执行 PostgreSQL 集群的切换操作。

pg switchover <cls>   # 交互模式，您可以使用下面的参数组合直接跳过此交互向导
pg switchover --leader pg-test-1 --candidate=pg-test-2 --scheduled=now --force pg-test

示例：pg-test 主从切换

$ pg switchover pg-test
Master [pg-test-1]:
Candidate ['pg-test-2', 'pg-test-3'] []: pg-test-2
When should the switchover take place (e.g. 2022-12-26T07:39 )  [now]: now
Current cluster topology
+ Cluster: pg-test (7181325041648035869) -----+----+-----------+-----------------+
| Member    | Host        | Role    | State   | TL | Lag in MB | Tags            |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-1 | 10.10.10.11 | Leader  | running |  1 |           | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-2 | 10.10.10.12 | Replica | running |  1 |         0 | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-3 | 10.10.10.13 | Replica | running |  1 |         0 | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
Are you sure you want to switchover cluster pg-test, demoting current master pg-test-1? [y/N]: y
2022-12-26 06:39:58.02468 Successfully switched over to "pg-test-2"
+ Cluster: pg-test (7181325041648035869) -----+----+-----------+-----------------+
| Member    | Host        | Role    | State   | TL | Lag in MB | Tags            |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-1 | 10.10.10.11 | Replica | stopped |    |   unknown | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-2 | 10.10.10.12 | Leader  | running |  1 |           | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-3 | 10.10.10.13 | Replica | running |  1 |         0 | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+

要通过 Patroni API 来执行此操作（例如，在指定时间将主库从 2号实例切换到 1号实例）

curl -u 'postgres:Patroni.API' \
  -d '{"leader":"pg-test-2", "candidate": "pg-test-1","scheduled_at":"2022-12-26T14:47+08"}' \
  -s -X POST http://10.10.10.11:8008/switchover
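The JSON body of the switchover request can be generated rather than written by hand. The sketch below is a hypothetical helper (`switchover_payload` is not part of Pigsty or Patroni). It assumes member names contain no characters that need JSON escaping, and it omits `scheduled_at` when no time is given, so that the switchover is requested immediately.

```shell
#!/bin/sh
# Hypothetical helper: build the request body for POST /switchover.
# Arguments: <leader> <candidate> [scheduled_at]
switchover_payload() {
  if [ -n "$3" ]; then
    printf '{"leader":"%s","candidate":"%s","scheduled_at":"%s"}\n' "$1" "$2" "$3"
  else
    printf '{"leader":"%s","candidate":"%s"}\n' "$1" "$2"
  fi
}

switchover_payload pg-test-2 pg-test-1 '2022-12-26T14:47+08'
```

The result can be passed to the request shown above, for example with `-d "$(switchover_payload pg-test-2 pg-test-1)"`.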

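After either a switchover or a failover, cluster membership has changed, so you need to refresh the services and HBA rules. Do this promptly after the change (for example, within a few hours or a day):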

bin/pgsql-svc <cls>
bin/pgsql-hba <cls>

备份集群

使用 pgBackRest 创建备份，需要以本地 dbsu （默认为 postgres）的身份运行以下命令：

pg-backup       # 执行备份，如有必要，执行增量或全量备份
pg-backup full  # 执行全量备份
pg-backup diff  # 执行差异备份
pg-backup incr  # 执行增量备份
pb info         # 打印备份信息 （pgbackrest info）

参阅备份恢复获取更多信息。

示例：创建备份

示例：创建定时备份任务

您可以将 crontab 添加到 node_crontab 以指定您的备份策略。

# 每天凌晨1点做一次全备份
- '00 01 * * * postgres /pg/bin/pg-backup full'

# 周一凌晨1点进行全量备份，其他日子进行增量备份
- '00 01 * * 1 postgres /pg/bin/pg-backup full'
- '00 01 * * 2,3,4,5,6,7 postgres /pg/bin/pg-backup'

恢复集群

要将集群恢复到先前的时间点 (PITR)，请以本地 dbsu 用户（默认为postgres）运行 Pigsty 提供的辅助脚本 pg-pitr

pg-pitr -i                              # 恢复到最近备份完成的时间（不常用）
pg-pitr --time="2022-12-30 14:44:44+08" # 恢复到指定的时间点（在删除数据库或表的情况下使用）
pg-pitr --name="my-restore-point"       # 恢复到使用 pg_create_restore_point 创建的命名恢复点
pg-pitr --lsn="0/7C82CB8" -X            # 在LSN之前立即恢复
pg-pitr --xid="1234567" -X -P           # 在指定的事务ID之前立即恢复，然后将集群直接提升为主库
pg-pitr --backup=latest                 # 恢复到最新的备份集
pg-pitr --backup=20221108-105325        # 恢复到特定备份集，备份集可以使用 pgbackrest info 列出

该命令会输出操作手册，请按照说明进行操作。查看备份恢复-PITR获取详细信息。

示例：使用原始pgBackRest命令进行 PITR

# 恢复到最新可用的点（例如硬件故障）
pgbackrest --stanza=pg-meta restore

# PITR 到特定的时间点（例如意外删除表）
pgbackrest --stanza=pg-meta --type=time --target="2022-11-08 10:58:48" \
   --target-action=promote restore

# 恢复特定的备份点，然后提升（或暂停|关闭）
pgbackrest --stanza=pg-meta --type=immediate --target-action=promote \
  --set=20221108-105325F_20221108-105938I restore
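The three raw pgBackRest invocations above differ only in their restore target. The sketch below prints the matching command line without executing anything. `pitr_command` and its mode names (`latest`, `time`, `backup`) are hypothetical and chosen for this example. Only the pgBackRest options already shown above are used.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a pgbackrest restore command.
# Arguments: <stanza> <mode> [value]
pitr_command() {
  case "$2" in
    latest) printf '%s\n' "pgbackrest --stanza=$1 restore" ;;
    time)   printf '%s\n' "pgbackrest --stanza=$1 --type=time --target=\"$3\" --target-action=promote restore" ;;
    backup) printf '%s\n' "pgbackrest --stanza=$1 --type=immediate --target-action=promote --set=$3 restore" ;;
    *)      echo "unknown mode: $2" >&2; return 1 ;;
  esac
}

pitr_command pg-meta latest
pitr_command pg-meta time '2022-11-08 10:58:48'
pitr_command pg-meta backup 20221108-105325F_20221108-105938I
```

Printing the command first makes it easy to review the target before running a restore, which overwrites the data directory.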

添加软件

要添加新版本的 RPM 包，你需要将它们加入到 repo_packages 和 repo_url_packages 中。

使用 ./infra.yml -t repo_build 子任务在 Infra 节点上重新构建本地软件仓库。然后，你可以使用 ansible 的 package 模块安装这些包：

ansible pg-test -b -m package -a "name=pg_cron_15,topn_15,pg_stat_monitor_15*"  # 使用 ansible 安装一些包

Example: manually update packages in the local software repository

# 在基础设施/管理节点上添加上游软件仓库，然后手工下载所需的软件包
cd ~/pigsty; ./infra.yml -t repo_upstream,repo_cache # 添加上游仓库（互联网）
cd /www/pigsty;  repotrack "some_new_package_name"   # 下载最新的 RPM 包

# 更新本地软件仓库元数据
cd ~/pigsty; ./infra.yml -t repo_create              # 重新创建本地软件仓库
./node.yml -t node_repo                              # 刷新所有节点上的 YUM/APT 缓存

# 也可以使用 Ansible 手工刷新节点上的 YUM/APT 缓存
ansible all -b -a 'yum clean all'                    # 清理节点软件仓库缓存
ansible all -b -a 'yum makecache'                    # 从新的仓库重建yum/apt缓存
ansible all -b -a 'apt clean'                        # 清理 APT 缓存（Ubuntu/Debian）
ansible all -b -a 'apt update'                       # 重建 APT 缓存（Ubuntu/Debian）

例如，你可以使用以下方式安装或升级包：

ansible pg-test -b -m package -a "name=postgresql15* state=latest"

安装扩展

如果你想在 PostgreSQL 集群上安装扩展，请将它们加入到 pg_extensions 中，并执行：

./pgsql.yml -t pg_extension     # 安装扩展

一部分扩展需要在 shared_preload_libraries 中加载后才能生效。你可以将它们加入到 pg_libs 中，或者配置一个已有的集群。

最后，在集群的主库上执行 CREATE EXTENSION <extname>; 来完成扩展的安装。

示例：在 pg-test 集群上安装 pg_cron 扩展

ansible pg-test -b -m package -a "name=pg_cron_15"          # 在所有节点上安装 pg_cron 包
# 将 pg_cron 添加到 shared_preload_libraries 中
pg edit-config --force -p shared_preload_libraries='timescaledb, pg_cron, pg_stat_statements, auto_explain' pg-test
pg restart --force pg-test                                  # 重新启动集群
psql -h pg-test -d postgres -c 'CREATE EXTENSION pg_cron;'  # 在主库上安装 pg_cron

更多细节，请参考PGSQL扩展安装。

小版本升级

要执行小版本的服务器升级/降级，您首先需要在本地软件仓库中添加软件：最新的PG小版本 RPM/DEB。

首先对所有从库执行滚动升级/降级，然后执行集群主从切换以升级/降级主库。

ansible <cls> -b -a "yum upgrade/downgrade -y <pkg>"    # 升级/降级软件包
pg restart --force <cls>                                # 重启集群

示例：将PostgreSQL 15.2降级到15.1

将15.1的包添加到软件仓库并刷新节点的 yum/apt 缓存：

cd ~/pigsty; ./infra.yml -t repo_upstream               # 添加上游仓库
cd /www/pigsty; repotrack postgresql15-*-15.1           # 将15.1的包添加到yum仓库
cd ~/pigsty; ./infra.yml -t repo_create                 # 重建仓库元数据
ansible pg-test -b -a 'yum clean all'                   # 清理节点仓库缓存
ansible pg-test -b -a 'yum makecache'                   # 从新仓库重新生成yum缓存

# 对于 Ubuntu/Debian 用户，使用 apt 替换 yum
ansible pg-test -b -a 'apt clean'                       # 清理节点仓库缓存
ansible pg-test -b -a 'apt update'                      # 从新仓库重新生成apt缓存

执行降级并重启集群：

ansible pg-test -b -a "yum downgrade -y postgresql15*"  # 降级软件包
pg restart --force pg-test                              # 重启整个集群以完成降级

示例：将PostgreSQL 15.1升级回15.2

这次我们采用滚动方式升级：

ansible pg-test -b -a "yum upgrade -y postgresql15*"    # 升级软件包（或 apt upgrade）
ansible pg-test -b -a '/usr/pgsql/bin/pg_ctl --version' # 检查二进制版本是否为15.2
pg restart --role replica --force pg-test               # 重启从库
pg switchover --leader pg-test-1 --candidate=pg-test-2 --scheduled=now --force pg-test    # 切换主从
pg restart --role primary --force pg-test               # 重启主库
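The rolling sequence above can be parameterized and reviewed as a dry run before anyone executes it. The sketch below only prints the commands, in the same order as the example. `rolling_upgrade_plan` is a hypothetical helper, and the yum-based upgrade line assumes an EL-family system.

```shell
#!/bin/sh
# Hypothetical helper: print the rolling-upgrade steps without running them.
# Arguments: <cluster> <package_glob> <current_leader> <candidate>
rolling_upgrade_plan() {
  printf '%s\n' \
    "ansible $1 -b -a \"yum upgrade -y $2\"" \
    "pg restart --role replica --force $1" \
    "pg switchover --leader $3 --candidate=$4 --scheduled=now --force $1" \
    "pg restart --role primary --force $1"
}

rolling_upgrade_plan pg-test 'postgresql15*' pg-test-1 pg-test-2
```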

大版本升级

实现大版本升级的最简单办法是：创建一个使用新版本的新集群，然后通过逻辑复制，蓝绿部署，并进行在线迁移。

您也可以进行原地大版本升级，当您只使用数据库内核本身时，这并不复杂，使用 PostgreSQL 自带的 pg_upgrade 即可：

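Suppose you want to upgrade the PostgreSQL major version from 14 to 15. First add the packages to the repository, and make sure the core extensions installed for both major versions have the same version numbers.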

./pgsql.yml -t pg_pkg -e pg_version=15                         # 安装pg 15的包
sudo su - postgres; mkdir -p /data/postgres/pg-meta-15/data/   # 为15准备目录
pg_upgrade -b /usr/pgsql-14/bin/ -B /usr/pgsql-15/bin/ -d /data/postgres/pg-meta-14/data/ -D /data/postgres/pg-meta-15/data/ -v -c # 预检
pg_upgrade -b /usr/pgsql-14/bin/ -B /usr/pgsql-15/bin/ -d /data/postgres/pg-meta-14/data/ -D /data/postgres/pg-meta-15/data/ --link -j8 -v # 执行升级（去掉 -c，硬链接模式，8 个并行任务）
rm -rf /usr/pgsql; ln -s /usr/pgsql-15 /usr/pgsql;             # 修复二进制链接
mv /data/postgres/pg-meta-14 /data/postgres/pg-meta-15         # 重命名数据目录
rm -rf /pg; ln -s /data/postgres/pg-meta-15 /pg                # 修复数据目录链接

6.12 - 访问控制

Pigsty 提供的默认角色系统与权限模型

Pigsty 提供了一套开箱即用的，基于角色系统和权限系统的访问控制模型。

权限控制很重要，但很多用户做不好。因此 Pigsty 提供了一套开箱即用的精简访问控制模型，为您的集群安全性提供一个兜底。

角色系统

Pigsty 默认的角色系统包含四个默认角色和四个默认用户：

角色名称	属性	所属	描述
`dbrole_readonly`	`NOLOGIN`		角色：全局只读访问
`dbrole_readwrite`	`NOLOGIN`	dbrole_readonly	角色：全局读写访问
`dbrole_admin`	`NOLOGIN`	pg_monitor,dbrole_readwrite	角色：管理员/对象创建
`dbrole_offline`	`NOLOGIN`		角色：受限的只读访问
`postgres`	`SUPERUSER`		系统超级用户
`replicator`	`REPLICATION`	pg_monitor,dbrole_readonly	系统复制用户
`dbuser_dba`	`SUPERUSER`	dbrole_admin	pgsql 管理用户
`dbuser_monitor`		pg_monitor	pgsql 监控用户

这些角色与用户的详细定义如下所示：

pg_default_roles:                 # 全局默认的角色与系统用户
  - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access     }
  - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
  - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly] ,comment: role for global read-write access }
  - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite] ,comment: role for object creation }
  - { name: postgres     ,superuser: true  ,comment: system superuser }
  - { name: replicator ,replication: true  ,roles: [pg_monitor, dbrole_readonly] ,comment: system replicator }
  - { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 ,comment: pgsql admin user }
  - { name: dbuser_monitor ,roles: [pg_monitor] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

默认角色

Pigsty 中有四个默认角色：

业务只读 (dbrole_readonly): 用于全局只读访问的角色。如果别的业务想要此库只读访问权限，可以使用此角色。
业务读写 (dbrole_readwrite): 用于全局读写访问的角色，主属业务使用的生产账号应当具有数据库读写权限
业务管理员 (dbrole_admin): 拥有DDL权限的角色，通常用于业务管理员，或者需要在应用中建表的场景（比如各种业务软件）
离线只读访问 (dbrole_offline): 受限的只读访问角色（只能访问 offline 实例，通常是个人用户，ETL工具账号）

默认角色在 pg_default_roles 中定义，除非您确实知道自己在干什么，建议不要更改默认角色的名称。

- { name: dbrole_readonly  , login: false , comment: role for global read-only access  }                            # 生产环境的只读角色
- { name: dbrole_offline ,   login: false , comment: role for restricted read-only access (offline instance) }      # 受限的只读角色
- { name: dbrole_readwrite , login: false , roles: [dbrole_readonly], comment: role for global read-write access }  # 生产环境的读写角色
- { name: dbrole_admin , login: false , roles: [pg_monitor, dbrole_readwrite] , comment: role for object creation } # 生产环境的 DDL 更改角色

默认用户

Pigsty 也有四个默认用户（系统用户）：

超级用户 (postgres)，集群的所有者和创建者，与操作系统 dbsu 名称相同。
复制用户 (replicator)，用于主-从复制的系统用户。
监控用户 (dbuser_monitor)，用于监控数据库和连接池指标的用户。
管理用户 (dbuser_dba)，执行日常操作和数据库更改的管理员用户。

这4个默认用户的用户名/密码通过4对专用参数进行定义，并在很多地方引用：

pg_dbsu：操作系统 dbsu 名称，默认为 postgres，最好不要更改它
pg_dbsu_password：dbsu 密码，默认为空字符串意味着不设置 dbsu 密码，最好不要设置。
pg_replication_username：postgres 复制用户名，默认为 replicator
pg_replication_password：postgres 复制密码，默认为 DBUser.Replicator
pg_admin_username：postgres 管理员用户名，默认为 dbuser_dba
pg_admin_password：postgres 管理员密码的明文，默认为 DBUser.DBA
pg_monitor_username：postgres 监控用户名，默认为 dbuser_monitor
pg_monitor_password：postgres 监控密码，默认为 DBUser.Monitor

在生产部署中记得更改这些密码，不要使用默认值！

pg_dbsu: postgres                             # 数据库超级用户名，这个用户名建议不要修改。
pg_dbsu_password: ''                          # 数据库超级用户密码，这个密码建议留空！禁止dbsu密码登陆。
pg_replication_username: replicator           # 系统复制用户名
pg_replication_password: DBUser.Replicator    # 系统复制密码，请务必修改此密码！
pg_monitor_username: dbuser_monitor           # 系统监控用户名
pg_monitor_password: DBUser.Monitor           # 系统监控密码，请务必修改此密码！
pg_admin_username: dbuser_dba                 # 系统管理用户名
pg_admin_password: DBUser.DBA                 # 系统管理密码，请务必修改此密码！

If you change the parameters of the default users, update the corresponding role definitions in pg_default_roles as well:

- { name: postgres     ,superuser: true                                          ,comment: system superuser }
- { name: replicator ,replication: true  ,roles: [pg_monitor, dbrole_readonly]   ,comment: system replicator }
- { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 , comment: pgsql admin user }
- { name: dbuser_monitor   ,roles: [pg_monitor] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

权限系统

Pigsty 拥有一套开箱即用的权限模型，该模型与默认角色一起配合工作。

所有用户都可以访问所有模式。
只读用户（dbrole_readonly）可以从所有表中读取数据。（SELECT，EXECUTE）
读写用户（dbrole_readwrite）可以向所有表中写入数据并运行 DML。（INSERT，UPDATE，DELETE）。
管理员用户（dbrole_admin）可以创建对象并运行 DDL（CREATE，USAGE，TRUNCATE，REFERENCES，TRIGGER）。
离线用户（dbrole_offline）类似只读用户，但访问受到限制，只允许访问离线实例（pg_role = 'offline' 或 pg_offline_query = true）
由管理员用户创建的对象将具有正确的权限。
所有数据库上都配置了默认权限，包括模板数据库。
数据库连接权限由数据库定义管理。
默认撤销PUBLIC在数据库和public模式下的CREATE权限。

对象权限

数据库中新建对象的默认权限由参数 pg_default_privileges 所控制：

- GRANT USAGE      ON SCHEMAS   TO dbrole_readonly
- GRANT SELECT     ON TABLES    TO dbrole_readonly
- GRANT SELECT     ON SEQUENCES TO dbrole_readonly
- GRANT EXECUTE    ON FUNCTIONS TO dbrole_readonly
- GRANT USAGE      ON SCHEMAS   TO dbrole_offline
- GRANT SELECT     ON TABLES    TO dbrole_offline
- GRANT SELECT     ON SEQUENCES TO dbrole_offline
- GRANT EXECUTE    ON FUNCTIONS TO dbrole_offline
- GRANT INSERT     ON TABLES    TO dbrole_readwrite
- GRANT UPDATE     ON TABLES    TO dbrole_readwrite
- GRANT DELETE     ON TABLES    TO dbrole_readwrite
- GRANT USAGE      ON SEQUENCES TO dbrole_readwrite
- GRANT UPDATE     ON SEQUENCES TO dbrole_readwrite
- GRANT TRUNCATE   ON TABLES    TO dbrole_admin
- GRANT REFERENCES ON TABLES    TO dbrole_admin
- GRANT TRIGGER    ON TABLES    TO dbrole_admin
- GRANT CREATE     ON SCHEMAS   TO dbrole_admin

Objects newly created by an admin user carry the privileges above by default. Use \ddp+ to inspect these default privileges:

Type	Access privileges
function	=X
	dbrole_readonly=X
	dbrole_offline=X
	dbrole_admin=X
schema	dbrole_readonly=U
	dbrole_offline=U
	dbrole_admin=UC
sequence	dbrole_readonly=r
	dbrole_offline=r
	dbrole_readwrite=wU
	dbrole_admin=rwU
table	dbrole_readonly=r
	dbrole_offline=r
	dbrole_readwrite=awd
	dbrole_admin=arwdDxt

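Default Privileges

ALTER DEFAULT PRIVILEGES lets you set the privileges applied to objects created in the future. It does not affect objects that already exist, nor objects created by non-admin users.

In Pigsty, default privileges are defined for three roles: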

{% for priv in pg_default_privileges %}
ALTER DEFAULT PRIVILEGES FOR ROLE {{ pg_dbsu }} {{ priv }};
{% endfor %}

{% for priv in pg_default_privileges %}
ALTER DEFAULT PRIVILEGES FOR ROLE {{ pg_admin_username }} {{ priv }};
{% endfor %}

-- Other business admins should run SET ROLE dbrole_admin before executing DDL, so that these default privileges apply.
{% for priv in pg_default_privileges %}
ALTER DEFAULT PRIVILEGES FOR ROLE "dbrole_admin" {{ priv }};
{% endfor %}

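In other words, to maintain correct object privileges, you must run DDL as an admin user, which can be:

{{ pg_dbsu }}, postgres by default
{{ pg_admin_username }}, dbuser_dba by default
A business admin user granted the dbrole_admin role (switching identity with SET ROLE dbrole_admin).

Using postgres as the global object owner is a sensible choice. If you want to create objects as a business admin user, you must run SET ROLE dbrole_admin beforehand to keep the privileges correct.

Alternatively, you can grant default privileges to a business admin explicitly in the database with ALTER DEFAULT PRIVILEGES FOR ROLE <some_biz_admin> XXX.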
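
As a minimal sketch of the explicit approach, the shell function below prints one ALTER DEFAULT PRIVILEGES statement per privilege clause for a single business admin role, following the same pattern as the template loops above. The function name gen_default_privs and the role dbuser_app are hypothetical, and the two clauses are only a subset of pg_default_privileges.

```shell
# Hypothetical helper (not part of Pigsty): print ALTER DEFAULT PRIVILEGES
# statements for one role. Usage: gen_default_privs <role> <clause>...
gen_default_privs() {
  gdp_role="$1"
  shift
  for gdp_priv in "$@"; do
    printf 'ALTER DEFAULT PRIVILEGES FOR ROLE "%s" %s;\n' "$gdp_role" "$gdp_priv"
  done
}

# Example with a subset of the pg_default_privileges entries
gen_default_privs dbuser_app \
  'GRANT USAGE ON SCHEMAS TO dbrole_readonly' \
  'GRANT SELECT ON TABLES TO dbrole_readonly'
```

The output can be reviewed and then piped into psql against the target database.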

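Database Privileges

In Pigsty, database-level privileges are covered by the database definition.

A database has three privileges, CONNECT, CREATE, and TEMP, plus one special 'privilege': OWNERSHIP.

- name: meta         # required, `name` is the only mandatory field in a database definition
  owner: postgres    # optional, database owner, postgres by default
  allowconn: true    # optional, allow connections, true by default; explicitly setting false forbids all connections to this database
  revokeconn: false  # optional, revoke the public CONNECT privilege, false by default; when true, CONNECT is revoked from everyone except the owner and admin users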

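If the owner parameter is set, that user becomes the database owner instead of the default {{ pg_dbsu }} (usually postgres).
If revokeconn is false, all users have the CONNECT privilege on the database. This is the default behavior.
If revokeconn is explicitly set to true:
- The CONNECT privilege on the database is revoked from PUBLIC: ordinary users cannot connect to it
- The CONNECT privilege is explicitly granted to {{ pg_replication_username }}, {{ pg_monitor_username }}, and {{ pg_admin_username }}
- The CONNECT privilege is granted to the database owner WITH GRANT OPTION, so the owner can grant the connect privilege to other users.
The revokeconn option can be used to isolate cross-database access within one cluster: create a different business user as the owner of each database and set revokeconn on each of them.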
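
The three revokeconn rules can be read as three SQL statements. The sketch below prints them for one database; it illustrates what the rules amount to and is not the SQL Pigsty actually runs, which this section does not show. The function name gen_revokeconn_sql is hypothetical.

```shell
# Hypothetical helper: print SQL equivalent to the three revokeconn rules.
# Arguments: <database> <owner> <replication user> <monitor user> <admin user>
gen_revokeconn_sql() {
  rc_db="$1"; rc_owner="$2"; rc_repl="$3"; rc_mon="$4"; rc_admin="$5"
  # Rule 1: ordinary users lose the right to connect
  printf 'REVOKE CONNECT ON DATABASE "%s" FROM PUBLIC;\n' "$rc_db"
  # Rule 2: the system users keep access
  printf 'GRANT CONNECT ON DATABASE "%s" TO "%s", "%s", "%s";\n' \
    "$rc_db" "$rc_repl" "$rc_mon" "$rc_admin"
  # Rule 3: the owner may pass the privilege on to other users
  printf 'GRANT CONNECT ON DATABASE "%s" TO "%s" WITH GRANT OPTION;\n' "$rc_db" "$rc_owner"
}

gen_revokeconn_sql gitlab dbuser_gitlab replicator dbuser_monitor dbuser_dba
```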

Example: Database Isolation

pg-infra:
  hosts:
    10.10.10.40: { pg_seq: 1, pg_role: primary }
    10.10.10.41: { pg_seq: 2, pg_role: replica , pg_offline_query: true }
  vars:
    pg_cluster: pg-infra
    pg_users:
      - { name: dbuser_confluence, password: mc2iohos , pgbouncer: true, roles: [ dbrole_admin ] }
      - { name: dbuser_gitlab, password: sdf23g22sfdd , pgbouncer: true, roles: [ dbrole_readwrite ] }
      - { name: dbuser_jira, password: sdpijfsfdsfdfs , pgbouncer: true, roles: [ dbrole_admin ] }
    pg_databases:
      - { name: confluence , revokeconn: true, owner: dbuser_confluence , connlimit: 100 }
      - { name: gitlab , revokeconn: true, owner: dbuser_gitlab, connlimit: 100 }
      - { name: jira , revokeconn: true, owner: dbuser_jira , connlimit: 100 }

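CREATE Privilege

For security reasons, Pigsty revokes the CREATE privilege on databases from PUBLIC by default. Since PostgreSQL 15, PostgreSQL itself likewise no longer grants CREATE on the public schema to PUBLIC by default.

The database owner can always adjust the CREATE privilege as needed.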

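6.13 - Backup & Restore

How to back up, restore, and run PITR on PostgreSQL clusters with pgBackRest

Pigsty uses pgBackRest for PITR backup and recovery.

For hardware failures, high-availability failover based on physical replication is usually the best choice. For data corruption, whether caused by machine or human error, point-in-time recovery (PITR) is the better fit: it is the safety net for the worst case.

Backup

Use the following commands to back up a PostgreSQL cluster: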

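# The stanza is the identifier pgbackrest uses to tell clusters apart within one repository; the default stanza name = {{ pg_cluster }}
pgbackrest --stanza=${stanza} --type=full|diff|incr backup

# In Pigsty you can also run the following as the dbsu (/pg/bin/pg-backup) to take a backup
pg-backup       # take a backup: incremental if possible, full if necessary
pg-backup full  # take a full backup
pg-backup diff  # take a differential backup
pg-backup incr  # take an incremental backup

Use the following command to print backup info:

pb info   # pgbackrest info: print backup info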

Example backup info

$ pb info
stanza: pg-meta
    status: ok
    cipher: none

    db (current)
        wal archive min/max (14): 000000010000000000000001/000000010000000000000023

        full backup: 20221108-105325F
            timestamp start/stop: 2022-11-08 10:53:25 / 2022-11-08 10:53:29
            wal start/stop: 000000010000000000000004 / 000000010000000000000004
            database size: 96.6MB, database backup size: 96.6MB
            repo1: backup set size: 18.9MB, backup size: 18.9MB

        incr backup: 20221108-105325F_20221108-105938I
            timestamp start/stop: 2022-11-08 10:59:38 / 2022-11-08 10:59:41
            wal start/stop: 00000001000000000000000F / 00000001000000000000000F
            database size: 246.7MB, database backup size: 167.3MB
            repo1: backup set size: 35.4MB, backup size: 20.4MB
            backup reference list: 20221108-105325F

You can also check backup info in the monitoring system: PGCAT Instance - Backup

Restore

The following commands can be used to restore a PostgreSQL cluster:

pg-pitr                                 # restore to the end of the WAL archive stream (e.g. after a whole data center failure)
pg-pitr -i                              # restore to the time the latest backup completed (rarely used)
pg-pitr --time="2022-12-30 14:44:44+08" # restore to a specific point in time (e.g. after dropping a database or table)
pg-pitr --name="my-restore-point"       # restore to a named restore point created with pg_create_restore_point
pg-pitr --lsn="0/7C82CB8" -X            # restore to right before the LSN
pg-pitr --xid="1234567" -X -P           # restore to right before the given transaction ID, then promote the cluster to primary
pg-pitr --backup=latest                 # restore to the latest backup set
pg-pitr --backup=20221108-105325F       # restore to a specific backup set; backup sets can be listed with pgbackrest info

pg-pitr                                 # pgbackrest --stanza=pg-meta restore
pg-pitr -i                              # pgbackrest --stanza=pg-meta --type=immediate restore
pg-pitr -t "2022-12-30 14:44:44+08"     # pgbackrest --stanza=pg-meta --type=time --target="2022-12-30 14:44:44+08" restore
pg-pitr -n "my-restore-point"           # pgbackrest --stanza=pg-meta --type=name --target=my-restore-point restore
pg-pitr -b 20221108-105325F             # pgbackrest --stanza=pg-meta --type=name --set=20221108-105325F restore
pg-pitr -l "0/7C82CB8" -X               # pgbackrest --stanza=pg-meta --type=lsn --target="0/7C82CB8" --target-exclusive restore
pg-pitr -x 1234567 -X -P                # pgbackrest --stanza=pg-meta --type=xid --target="1234567" --target-exclusive --target-action=promote restore

The pg-pitr script shipped with Pigsty generates the PITR instructions for you. For example, to roll the current cluster back to "2023-02-07 12:38:00+08":

$ pg-pitr -t "2023-02-07 12:38:00+08"
pgbackrest --stanza=pg-meta --type=time --target='2023-02-07 12:38:00+08' restore
Perform time PITR on pg-meta
[1. Stop PostgreSQL] ===========================================
   1.1 Pause Patroni (if there are any replicas)
       $ pg pause <cls>  # pause patroni auto-failover
   1.2 Shut down Patroni
       $ pt-stop         # sudo systemctl stop patroni
   1.3 Shut down Postgres
       $ pg-stop         # pg_ctl -D /pg/data stop -m fast

[2. Perform PITR] ===========================================
   2.1 Restore the backup
       $ pgbackrest --stanza=pg-meta --type=time --target='2023-02-07 12:38:00+08' restore
   2.2 Start PG to replay WAL
       $ pg-start        # pg_ctl -D /pg/data start
   2.3 Validate and promote
     - If the database content is correct, promote it to finish recovery; otherwise go back to 2.1
       $ pg-promote      # pg_ctl -D /pg/data promote

[3. Restart Patroni] ===========================================
   3.1 Start Patroni
       $ pt-start;        # sudo systemctl start patroni
   3.2 Enable archiving again
       $ psql -c 'ALTER SYSTEM SET archive_mode = on; SELECT pg_reload_conf();'
   3.3 Restart Patroni
       $ pt-restart      # sudo systemctl restart patroni

[4. Restore Cluster] ===========================================
   4.1 Re-initialize all replicas (if there are any)
       $ pg reinit <cls> <ins>
   4.2 Resume Patroni
       $ pg resume <cls> # resume patroni auto-failover
   4.3 Take a full backup (optional)
       $ pg-backup full  # pgbackrest --stanza=pg-meta backup --type=full

Follow the instructions step by step to complete the cluster recovery.

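Backup Policy

You can customize the backup policy with node_crontab and pgbackrest_repo.

Use node_crontab to schedule backup jobs
Use pgbackrest_repo to set the backup retention policy

Local Backup Repository

For example, the default pg-meta cluster takes a full backup every day at 1 a.m.

node_crontab:  # take a full backup every day at 1 a.m.
  - '00 01 * * * postgres /pg/bin/pg-backup full'

With the default local repository retention policy, at most two full backups are kept, and a third is temporarily allowed to exist while a backup is running.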

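pgbackrest_repo:                  # pgbackrest repository definition: https://pgbackrest.org/configuration.html#section-repository
  local:                          # default pgbackrest repository on the local file system
    path: /pg/backup              # local backup directory, `/pg/backup` by default
    retention_full_type: count    # retain full backups by count
    retention_full: 2             # keep at most 2 full backups on the local repo, temporarily 3 during a backup

Your backup disk should be able to hold at least the three most recent full backups, plus the WAL archive produced over that period (3 days).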
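
The sizing rule above is simple arithmetic. The sketch below applies it under the assumption that you already know the size of one full backup and the WAL archive volume per day, both in GB; the function name estimate_backup_gb and the sample figures are hypothetical.

```shell
# Hypothetical estimate: room for 3 full backups plus 3 days of WAL archive.
# Arguments: <full backup size in GB> <WAL archive volume in GB per day>
estimate_backup_gb() {
  eb_full_gb="$1"
  eb_wal_gb_per_day="$2"
  echo $(( 3 * eb_full_gb + 3 * eb_wal_gb_per_day ))
}

# Example: 100 GB per full backup, 20 GB of WAL per day
estimate_backup_gb 100 20
```

Use the backup sizes reported by pb info (the repo1 backup set size) rather than the raw database size, since the two can differ considerably.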

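MinIO Backup Repository

With MinIO, storage capacity is usually not a concern, so you can retain backups as needed. For example, the default pg-test sample cluster takes a full backup on Monday and incremental backups on the other days of the week.

node_crontab:  # full backup at 1 a.m. on Monday, incremental backups on the other days
  - '00 01 * * 1 postgres /pg/bin/pg-backup full'
  - '00 01 * * 2,3,4,5,6,7 postgres /pg/bin/pg-backup'

The MinIO repository can use a 14-day time-based retention policy, which keeps the backups from the last two weeks.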

pgbackrest_repo:                  # pgbackrest repository: https://pgbackrest.org/configuration.html#section-repository
  minio:                          # optional minio repository for pgbackrest
    type: s3                      # minio is s3-compatible, so use s3
    s3_endpoint: sss.pigsty       # minio endpoint domain, `sss.pigsty` by default
    s3_region: us-east-1          # minio region, us-east-1 by default, meaningless for minio
    s3_bucket: pgsql              # minio bucket name, `pgsql` by default
    s3_key: pgbackrest            # minio user access key for pgbackrest
    s3_key_secret: S3User.Backup  # minio user secret key for pgbackrest; set your own and avoid the default
    s3_uri_style: path            # use path-style URIs rather than host-style URIs
    path: /pgbackrest             # minio backup path, `/pgbackrest` by default
    storage_port: 9000            # minio port, 9000 by default
    storage_ca_file: /etc/pki/ca.crt  # minio CA file path, `/etc/pki/ca.crt` by default
    bundle: y                     # bundle small files into a single file
    cipher_type: aes-256-cbc      # enable AES encryption for the remote backup repository
    cipher_pass: pgBackRest       # AES encryption passphrase, 'pgBackRest' by default; change it as needed
    retention_full_type: time     # retain full backups by time on the minio repo
    retention_full: 14            # keep full backups from the last 14 days

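6.14 - Migration

How do you migrate an existing PostgreSQL cluster to a new, Pigsty-managed PostgreSQL cluster with minimal downtime?

Pigsty ships a built-in playbook, pgsql-migration.yml, that implements online database migration based on logical replication.

With the pre-generated automation scripts, application downtime can be cut to a few seconds. Note that logical replication requires PostgreSQL 10 or later.

If you have a generous downtime budget, you can always fall back to an offline migration with pg_dump | psql.

Define a Migration Task

To use the online migration playbook, create a definition file that describes the details of the migration task.

See the sample task definition file for reference: files/migration/pg-meta.yml.

This task migrates pg-meta.meta online to pg-test.test. The former is called the source cluster (SRC) and the latter the destination cluster (DST).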

pg-meta-1	10.10.10.10  --> pg-test-1	10.10.10.11 (10.10.10.12,10.10.10.13)

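Migration based on logical replication works per database. You need to specify the name of the database to migrate, the IP addresses of the source and destination primaries, and the superuser connection info.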

---
#-----------------------------------------------------------------
# PG_MIGRATION
#-----------------------------------------------------------------
context_dir: ~/migration  # directory for the migration manual & scripts
#-----------------------------------------------------------------
# SRC Cluster (the old cluster)
#-----------------------------------------------------------------
src_cls: pg-meta      # src cluster name         <REQUIRED>
src_db: meta          # src database name        <REQUIRED>
src_ip: 10.10.10.10   # src cluster primary IP   <REQUIRED>
#src_pg: ''            # if defined, use this as the src dbsu pgurl instead of:
#                      # postgres://{{ pg_admin_username }}@{{ src_ip }}/{{ src_db }}
#                      # e.g. 'postgres://dbuser_dba:DBUser.DBA@10.10.10.10:5432/meta'
#sub_conn: ''          # if defined, use this as the subscription connection string instead of:
#                      # host={{ src_ip }} dbname={{ src_db }} user={{ pg_replication_username }}
#                      # e.g. 'host=10.10.10.10 dbname=meta user=replicator password=DBUser.Replicator'
#-----------------------------------------------------------------
# DST Cluster (the new cluster)
#-----------------------------------------------------------------
dst_cls: pg-test      # dst cluster name         <REQUIRED>
dst_db: test          # dst database name        <REQUIRED>
dst_ip: 10.10.10.11   # dst cluster primary IP   <REQUIRED>
#dst_pg: ''            # if defined, use this as the dst dbsu pgurl instead of:
#                      # postgres://{{ pg_admin_username }}@{{ dst_ip }}/{{ dst_db }}
#                      # e.g. 'postgres://dbuser_dba:DBUser.DBA@10.10.10.11:5432/test'
#-----------------------------------------------------------------
# PGSQL
#-----------------------------------------------------------------
pg_dbsu: postgres
pg_replication_username: replicator
pg_replication_password: DBUser.Replicator
pg_admin_username: dbuser_dba
pg_admin_password: DBUser.DBA
pg_monitor_username: dbuser_monitor
pg_monitor_password: DBUser.Monitor
#-----------------------------------------------------------------
...

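By default, the superuser connection strings on both sides are assembled from the global admin user and the IP address of each primary, but you can always override them with the src_pg and dst_pg parameters. Likewise, you can override the default subscription connection string with sub_conn.

Generate a Migration Plan

This playbook does not perform the migration itself. It generates the manual and the automation scripts that the migration requires.

By default, you will find the migration context directory at ~/migration/pg-meta.meta. Follow the README.md and run the scripts in order to complete the database migration.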

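# Activate the migration context: load the related environment variables
. ~/migration/pg-meta.meta/activate

# These scripts check the src cluster status and help generate the new cluster definition in pigsty
./check-user     # check src users
./check-db       # check src databases
./check-hba      # check src hba rules
./check-repl     # check src replication identities
./check-misc     # check src special objects

# These scripts set up logical replication between the existing src cluster and the pigsty-managed dst cluster; data except sequences is synced in real time
./copy-schema    # copy the schema to the destination
./create-pub     # create the publication on src
./create-sub     # create the subscription on dst
./copy-progress  # print logical replication progress
./copy-diff      # quickly compare src and dst by counting tables

# This script runs during the online switchover, which stops the src cluster and copies sequence values (logical replication does not replicate sequences!)
./copy-seq [n]   # sync sequence values; if n is given, an extra offset is applied

# You must switch application traffic to the new cluster according to your access method (dns, vip, haproxy, pgbouncer, etc.)!
#./disable-src   # restrict src cluster access to admin nodes and the new cluster (your implementation)
#./re-routing    # re-route application traffic from SRC to DST! (your implementation)

# Then clean up by dropping the subscription and publication
./drop-sub       # drop the subscription on dst after migration
./drop-pub       # drop the publication on src after migration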

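Caveats

If you are worried about primary key conflicts when copying sequences, you can advance all sequences by some distance during the copy, for example +1000, by passing 1000 as the argument to ./copy-seq.

You must implement your own ./re-routing script to move application traffic from src to dst, because we do not know how your traffic is routed (dns, VIP, haproxy, pgbouncer, and so on). You can also do this step by hand.

You may implement a ./disable-src script to restrict application access to the src cluster. This is optional: if you can be sure all application traffic is switched cleanly in ./re-routing, you do not need it.

If, however, there is access from unknown sources that you cannot untangle, use a more thorough method: change the HBA rules and reload (recommended), or simply shut down the postgres, pgbouncer, or haproxy process on the source primary.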

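6.15 - Monitoring

An overview of the Pigsty monitoring architecture, and how to monitor existing PostgreSQL instances.

This document covers the Pigsty monitoring system architecture, including metrics, logs, and target management, as well as how to monitor existing PG clusters and remote RDS services.

Monitoring Overview

Pigsty monitors PostgreSQL with a modern observability stack:

Grafana for metrics visualization, with PostgreSQL registered as a data source.
Prometheus to scrape metrics from PostgreSQL / Pgbouncer / Patroni / HAProxy / Node
Loki to collect logs from PostgreSQL / Pgbouncer / Patroni / pgBackRest and host components
Pigsty ships out-of-the-box Grafana dashboards that cover every aspect of PostgreSQL.

Metrics

PostgreSQL's own metrics are defined entirely by the pg_exporter config file, pg_exporter.yml. They are further processed by Prometheus recording rules and alerting rules: files/prometheus/rules/pgsql.yml.

Pigsty uses three identity labels, cls, ins, and ip, which are attached to all metrics and logs. Pigsty also uses the metrics of Pgbouncer, host nodes (NODE), and load balancers, and applies the same labels wherever possible to make correlation easier.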

{ cls: pg-meta, ins: pg-meta-1, ip: 10.10.10.10 }
{ cls: pg-test, ins: pg-test-1, ip: 10.10.10.11 }
{ cls: pg-test, ins: pg-test-2, ip: 10.10.10.12 }
{ cls: pg-test, ins: pg-test-3, ip: 10.10.10.13 }

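Logs

PostgreSQL-related logs are collected by promtail and sent to the Loki log storage/query service on the infra nodes.

pg_log_dir : postgres log directory, /pg/log/postgres by default
pgbouncer_log_dir : pgbouncer log directory, /pg/log/pgbouncer by default
patroni_log_dir : patroni log directory, /pg/log/patroni by default
pgbackrest_log_dir : pgbackrest log directory, /pg/log/pgbackrest by default

Target Management

Prometheus monitoring targets are defined in static files under /etc/prometheus/targets/pgsql/, one file per instance. Take pg-meta-1 as an example: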

# pg-meta-1 [primary] @ 10.10.10.10
- labels: { cls: pg-meta, ins: pg-meta-1, ip: 10.10.10.10 }
  targets:
    - 10.10.10.10:9630    # <--- pg_exporter for PostgreSQL metrics
    - 10.10.10.10:9631    # <--- pg_exporter for pgbouncer metrics
    - 10.10.10.10:8008    # <--- patroni metrics (when API SSL is not enabled)
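
To make the file layout concrete, the sketch below renders a target definition of the same shape for any instance. It is an illustration only: Pigsty generates these files itself, and the function name gen_pgsql_target is hypothetical. The ports are the three shown in the example above.

```shell
# Hypothetical helper: print a Prometheus static target definition for one instance.
# Arguments: <cluster> <instance seq> <ip> <role>
gen_pgsql_target() {
  gt_cls="$1"; gt_seq="$2"; gt_ip="$3"; gt_role="$4"
  gt_ins="${gt_cls}-${gt_seq}"      # instance name = <cluster>-<seq>
  cat <<EOF
# ${gt_ins} [${gt_role}] @ ${gt_ip}
- labels: { cls: ${gt_cls}, ins: ${gt_ins}, ip: ${gt_ip} }
  targets:
    - ${gt_ip}:9630
    - ${gt_ip}:9631
    - ${gt_ip}:8008
EOF
}

gen_pgsql_target pg-test 1 10.10.10.11 primary
```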

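When the global flag patroni_ssl_enabled is set, the patroni target is moved to a separate file, /etc/prometheus/targets/patroni/<ins>.yml, because it is then scraped over https. When you monitor RDS instances, the targets are placed separately under /etc/prometheus/targets/pgrds/ and managed per cluster.

When a cluster is removed with bin/pgsql-rm or pgsql-rm.yml, its Prometheus targets are removed too. You can also remove them manually, or with the subtask in the playbook:

bin/pgmon-rm <cls|ins>    # remove prometheus monitoring targets from all infra nodes

Remote RDS targets are placed in /etc/prometheus/targets/pgrds/<cls>.yml. They are created by the pgsql-monitor.yml playbook or the bin/pgmon-add script.

Monitoring Modes

Pigsty provides three monitoring modes to fit different monitoring needs.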

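Item \ Level	L1	L2	L3
Name	Basic	Managed	Standard
Alias	RDS	MANAGED	FULL
Scenario	Connection string only, e.g. RDS	DB exists, nodes are manageable	Instances created by Pigsty
PGCAT features	✅ Fully available	✅ Fully available	✅ Fully available
PGSQL features	✅ PG metrics only	✅ PG and node metrics only	✅ Full features
Connection pool metrics	❌ Not available	⚠️ Optional	✅ Pre-installed
Load balancer metrics	❌ Not available	⚠️ Optional	✅ Pre-installed
PGLOG features	❌ Not available	⚠️ Optional	✅ Pre-installed
PG Exporter	⚠️ Deployed on Infra nodes	✅ Deployed on DB nodes	✅ Deployed on DB nodes
Node Exporter	❌ Not deployed	✅ Deployed on DB nodes	✅ Deployed on DB nodes
Intrusion into DB nodes	✅ None	⚠️ Installs exporters	⚠️ Fully managed by Pigsty
Monitor existing instances	✅ Supported	✅ Supported	❌ Pigsty-managed instances only
Monitor user and views	Created manually	Created manually	Created by Pigsty
Deployment playbook	`bin/pgmon-add <cls>`	Partial run of `pgsql.yml`/`node.yml`	`pgsql.yml`
Required privileges	PGURL reachable from Infra nodes	ssh and sudo on DB nodes	ssh and sudo on DB nodes
Summary	PGCAT + PGRDS	Most features	Full features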

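Databases fully managed by Pigsty are monitored automatically with the best support, and usually need no configuration. For an existing PostgreSQL cluster or RDS service, if the target DB nodes can be managed by Pigsty (ssh reachable, sudo available), consider the managed deployment, which gives a monitoring experience largely similar to native Pigsty. If you can only reach the target database through a PGURL (database connection string), as with a remote RDS service, consider the basic (RDS) mode.

Monitor an Existing Cluster

If the target DB nodes can be managed by Pigsty (ssh reachable and sudo available), you can use the pg_exporter task of the pgsql.yml playbook to deploy the monitoring component, PG Exporter, on the target nodes in the same way as a standard deployment. You can also use the pgbouncer and pgbouncer_exporter tasks of that playbook to deploy a connection pool and its monitoring on existing instance nodes. In addition, the node_exporter, haproxy, and promtail tasks in node.yml deploy host monitoring, load balancing, and log collection, giving you the same experience as a native Pigsty database instance.

An existing cluster is defined exactly like a Pigsty-managed cluster. The only difference is that you run selected tasks of the pgsql.yml playbook rather than the whole playbook.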

./node.yml  -l <cls> -t node_repo,node_pkg           # add the INFRA node's YUM repo on the host nodes and install packages
./node.yml  -l <cls> -t node_exporter,node_register  # set up host monitoring and register with Prometheus
./node.yml  -l <cls> -t promtail                     # set up host log collection and ship logs to Loki
./pgsql.yml -l <cls> -t pg_exporter,pg_register      # set up PostgreSQL monitoring and register with Prometheus/Grafana

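Because the target cluster already exists, you need to create the monitor user, schema, and extensions on it by hand.

Monitor RDS

If you can only reach the target database through a PGURL (database connection string), configure it as described here. In this mode, Pigsty deploys the corresponding PG Exporter on the INFRA nodes and scrapes metrics from the remote database, as shown below: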

------ infra ------
|                 |
|   prometheus    |            v---- pg-foo-1 ----v
|       ^         |  metrics   |         ^        |
|   pg_exporter <-|------------|----  postgres    |
|   (port: 20001) |            | 10.10.10.10:5432 |
|       ^         |            ^------------------^
|       ^         |                      ^
|       ^         |            v---- pg-foo-2 ----v
|       ^         |  metrics   |         ^        |
|   pg_exporter <-|------------|----  postgres    |
|   (port: 20002) |            | 10.10.10.11:5433 |
-------------------            ^------------------^

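In this mode, the monitoring system has no metrics for hosts, connection pools, load balancers, or high-availability components, but the database itself and the real-time status from the system catalog remain available. Pigsty provides two dedicated dashboards focused on PostgreSQL's own metrics, PGRDS Cluster and PGRDS Instance, while the overview and in-database monitoring reuse the existing dashboards. Because Pigsty cannot manage your RDS, you need to set up the monitoring objects on the target database in advance.

Limitations when monitoring external Postgres instances

Pgbouncer connection pool metrics are not available
Patroni high-availability component metrics are not available
Host node metrics are not available, nor are node HAProxy and Keepalived metrics.
Log collection and log-derived metrics are not available

The sandbox environment serves as the example below: assume the pg-meta cluster is an RDS instance pg-foo-1 to be monitored, and the pg-test cluster is an RDS cluster pg-bar to be monitored:

Create the monitoring schema, user, and privileges on the target. See Monitoring Object Configuration for details.

Declare the clusters in the inventory. For example, suppose we want to monitor the "remote" pg-meta & pg-test clusters: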

infra:            # infra cluster for proxy, monitoring, alerting, etc.
  hosts: { 10.10.10.10: { infra_seq: 1 } }
  vars:           # install pg_exporter on group 'infra' for remote postgres RDS
    pg_exporters: # list all remote instances here; each key is a unique, unused local port
      20001: { pg_cluster: pg-foo, pg_seq: 1, pg_host: 10.10.10.10 , pg_databases: [{ name: meta }] } # register the meta database as a Grafana data source

      20002: { pg_cluster: pg-bar, pg_seq: 1, pg_host: 10.10.10.11 , pg_port: 5432 } # several ways to assemble the connection string
      20003: { pg_cluster: pg-bar, pg_seq: 2, pg_host: 10.10.10.12 , pg_exporter_url: 'postgres://dbuser_monitor:DBUser.Monitor@10.10.10.12:5432/postgres?sslmode=disable'}
      20004: { pg_cluster: pg-bar, pg_seq: 3, pg_host: 10.10.10.13 , pg_monitor_username: dbuser_monitor, pg_monitor_password: DBUser.Monitor }

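The databases listed in the pg_databases field are registered in Grafana as PostgreSQL data sources, which back the PGCAT dashboards. If you do not want to use PGCAT or register databases in Grafana, set pg_databases to an empty array or leave it unset.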
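
The pg_exporter_url form used for port 20003 above is an ordinary PostgreSQL connection URL. As a minimal sketch, the function below assembles one from its parts; the name build_pgurl is hypothetical, the defaults (port 5432, database postgres, sslmode=disable) simply follow that example, and the sketch does not URL-encode special characters in the password.

```shell
# Hypothetical helper: assemble a monitoring PGURL from its parts.
# Arguments: <user> <password> <host> [port, default 5432] [database, default postgres]
build_pgurl() {
  bp_user="$1"; bp_pass="$2"; bp_host="$3"
  bp_port="${4:-5432}"
  bp_db="${5:-postgres}"
  printf 'postgres://%s:%s@%s:%s/%s?sslmode=disable\n' \
    "$bp_user" "$bp_pass" "$bp_host" "$bp_port" "$bp_db"
}

build_pgurl dbuser_monitor DBUser.Monitor 10.10.10.12
```

Running `psql` with the resulting URL from an infra node is a quick way to confirm that the exporter will be able to connect.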

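Run the add-monitoring command: bin/pgmon-add <clsname>

bin/pgmon-add pg-foo  # bring the pg-foo cluster under monitoring
bin/pgmon-add pg-bar  # bring the pg-bar cluster under monitoring

To remove the monitoring targets of a remote cluster, use bin/pgmon-rm <clsname>

bin/pgmon-rm pg-foo  # remove pg-foo from Pigsty monitoring
bin/pgmon-rm pg-bar  # remove pg-bar from Pigsty monitoring

You can use more parameters to override the default pg_exporter options. Below is a sample configuration that monitors Aliyun RDS and PolarDB with Pigsty:

Example: Monitoring Aliyun RDS for PostgreSQL and PolarDB

For details, see remote.yml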

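infra:            # infra cluster for proxy, monitoring, alerting, etc.
  hosts: { 10.10.10.10: { infra_seq: 1 } }
  vars:
    pg_exporters:   # list all remote RDS PG instances to monitor here

      20001:        # a unique, unused local port for the local monitoring agent; this one is a PolarDB primary
        pg_cluster: pg-polar                  # RDS cluster name (identity parameter, assigned by hand as the name inside the monitoring system)
        pg_seq: 1                             # RDS instance number (identity parameter, assigned by hand)
        pg_host: pc-2ze379wb1d4irc18x.polardbpg.rds.aliyuncs.com # RDS host address
        pg_port: 1921                         # RDS port (from the console connection info)
        pg_exporter_auto_discovery: true      # database auto-discovery switch (set to true here)
        pg_exporter_include_database: 'test'  # only monitor the databases in this list (comma-separated)
        pg_monitor_username: dbuser_monitor   # monitor username, overrides the global config
        pg_monitor_password: DBUser_Monitor   # monitor password, overrides the global config
        pg_databases: [{ name: test }]        # databases to enable PGCAT for; only the name field is needed, set register_datasource to false to skip registration

      20002:       # this is a PolarDB replica
        pg_cluster: pg-polar                  # RDS cluster name (identity parameter, assigned by hand)
        pg_seq: 2                             # RDS instance number (identity parameter, assigned by hand)
        pg_host: pe-2ze7tg620e317ufj4.polarpgmxs.rds.aliyuncs.com # RDS host address
        pg_port: 1521                         # RDS port (from the console connection info)
        pg_exporter_auto_discovery: true      # database auto-discovery switch (set to true here)
        pg_exporter_include_database: 'test,postgres'  # only monitor the databases in this list (comma-separated)
        pg_monitor_username: dbuser_monitor   # monitor username
        pg_monitor_password: DBUser_Monitor   # monitor password
        pg_databases: [ { name: test } ]        # databases to enable PGCAT for; only the name field is needed, set register_datasource to false to skip registration

      20004: # this is a basic single-node RDS for PostgreSQL instance
        pg_cluster: pg-rds                    # RDS cluster name (identity parameter, assigned by hand)
        pg_seq: 1                             # RDS instance number (identity parameter, assigned by hand)
        pg_host: pgm-2zern3d323fe9ewk.pg.rds.aliyuncs.com  # RDS host address
        pg_port: 5432                         # RDS port (from the console connection info)
        pg_exporter_auto_discovery: true      # database auto-discovery switch (set to true here)
        pg_exporter_include_database: 'rds'   # only monitor the databases in this list (comma-separated)
        pg_monitor_username: dbuser_monitor   # monitor username
        pg_monitor_password: DBUser_Monitor   # monitor password
        pg_databases: [ { name: rds } ]       # databases to enable PGCAT for; only the name field is needed, set register_datasource to false to skip registration

      20005: # this is the primary of a high-availability RDS for PostgreSQL cluster
        pg_cluster: pg-rdsha                  # RDS cluster name (identity parameter, assigned by hand)
        pg_seq: 1                             # RDS instance number (identity parameter, assigned by hand)
        pg_host: pgm-2ze3d35d27bq08wu.pg.rds.aliyuncs.com  # RDS host address
        pg_port: 5432                         # RDS port (from the console connection info)
        pg_exporter_include_database: 'rds'   # only monitor the databases in this list (comma-separated)
        pg_databases: [ { name: rds }, {name : test} ]  # put these two databases under PGCAT and register them as Grafana data sources

      20006: # this is a read-only instance (replica) of a high-availability RDS for PostgreSQL cluster
        pg_cluster: pg-rdsha                  # RDS cluster name (identity parameter, assigned by hand)
        pg_seq: 2                             # RDS instance number (identity parameter, assigned by hand)
        pg_host: pgr-2zexqxalk7d37edt.pg.rds.aliyuncs.com  # RDS host address
        pg_port: 5432                         # RDS port (from the console connection info)
        pg_exporter_include_database: 'rds'   # only monitor the databases in this list (comma-separated)
        pg_databases: [ { name: rds }, {name : test} ]  # put these two databases under PGCAT and register them as Grafana data sources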

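Monitoring Object Configuration

To monitor an existing instance, whether RDS or a self-hosted PostgreSQL instance, you need to configure a few things on the target database so that Pigsty can access it.

To bring an external PostgreSQL instance under monitoring, you need a connection string that can reach the instance or cluster. Any reachable connection string (business user or superuser) works, but we recommend a dedicated monitor user to avoid leaking privileges.

Monitor user: the default username is dbuser_monitor. The user should belong to the pg_monitor role group, or otherwise have access to the relevant views.
Monitor authentication: password access by default. Make sure the HBA policy lets the monitor user reach the database from the admin node or locally from the DB node.
Monitor schema: always named monitor, used to install additional monitoring views and extensions. Optional but recommended.
Monitor extension: enabling pg_stat_statements, the monitoring extension bundled with PG, is strongly recommended.
Monitor views: optional, but they provide additional metrics.

Monitor User

Taking the default Pigsty monitor user dbuser_monitor as an example, create the following user on the target cluster.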

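CREATE USER dbuser_monitor;                                       -- create the monitor user
COMMENT ON ROLE dbuser_monitor IS 'system monitor user';          -- comment on the monitor user
GRANT pg_monitor TO dbuser_monitor;                               -- grant pg_monitor to the monitor user, otherwise some metrics cannot be collected

ALTER USER dbuser_monitor PASSWORD 'DBUser.Monitor';              -- set the monitor password (changing it is strongly recommended, but keep it in sync with the Pigsty config)
ALTER USER dbuser_monitor SET log_min_duration_statement = 1000;  -- recommended, so monitoring queries do not flood the slow query log
ALTER USER dbuser_monitor SET search_path = monitor,public;       -- recommended, so the pg_stat_statements extension can be found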

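Note that the monitor user and password created here must match pg_monitor_username and pg_monitor_password.

Monitor Authentication

Edit the database pg_hba.conf file and add the following rules so the monitor user can access all databases with a password, both locally and from the admin node.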

# allow local role monitor with password
local   all  dbuser_monitor                    md5
host    all  dbuser_monitor  127.0.0.1/32      md5
host    all  dbuser_monitor  <admin-node-ip>/32  md5
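
If you maintain these rules for several clusters, a small generator keeps them consistent. The sketch below prints the same three rules for a given monitor user and admin node address; the function name gen_monitor_hba is hypothetical, and md5 is kept as the auth method only because the rules above use it.

```shell
# Hypothetical helper: print pg_hba.conf rules for the monitor user.
# Arguments: <monitor user> <admin node IP>
gen_monitor_hba() {
  mh_user="$1"
  mh_admin_ip="$2"
  printf 'local   all  %s                    md5\n' "$mh_user"
  printf 'host    all  %s  127.0.0.1/32      md5\n' "$mh_user"
  printf 'host    all  %s  %s/32  md5\n' "$mh_user" "$mh_admin_ip"
}

gen_monitor_hba dbuser_monitor 10.10.10.10
```

Append the output to pg_hba.conf and reload PostgreSQL for the rules to take effect.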

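If your RDS does not support custom HBA rules, add the private IP address of the machine running Pigsty to its allowlist instead.

Monitor Schema

The monitor schema is optional. The core of the Pigsty monitoring system works without it, but we strongly recommend creating it.

CREATE SCHEMA IF NOT EXISTS monitor;               -- create the dedicated monitor schema
GRANT USAGE ON SCHEMA monitor TO dbuser_monitor;   -- allow the monitor user to use it

Monitor Extension

The monitor extension is optional, but we strongly recommend enabling pg_stat_statements, which provides essential data about query performance.

Note: the extension only takes effect when it is listed in the shared_preload_libraries parameter, and changing that parameter requires a database restart.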

CREATE EXTENSION IF NOT EXISTS "pg_stat_statements" WITH SCHEMA "monitor";

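Note that you should install this extension in the default admin database, postgres. Sometimes an RDS does not let you create the monitor schema in the postgres database. In that case, install pg_stat_statements into the default public schema, and make sure the monitor user's search_path is configured as above so the pg_stat_statements view can be found.

CREATE EXTENSION IF NOT EXISTS "pg_stat_statements";
ALTER USER dbuser_monitor SET search_path = monitor,public; -- recommended, so the pg_stat_statements extension can be found

Monitor Views

Monitor views provide several commonly used pre-processed results and wrap metrics that require elevated privileges (such as shared memory allocation), making them easier to query. Creating them in every database that needs monitoring is strongly recommended.

Monitor schema and monitor view definitions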

----------------------------------------------------------------------
-- Table bloat estimate : monitor.pg_table_bloat
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_table_bloat CASCADE;
CREATE OR REPLACE VIEW monitor.pg_table_bloat AS
SELECT CURRENT_CATALOG AS datname, nspname, relname , tblid , bs * tblpages AS size,
       CASE WHEN tblpages - est_tblpages_ff > 0 THEN (tblpages - est_tblpages_ff)/tblpages::FLOAT ELSE 0 END AS ratio
FROM (
         SELECT ceil( reltuples / ( (bs-page_hdr)*fillfactor/(tpl_size*100) ) ) + ceil( toasttuples / 4 ) AS est_tblpages_ff,
                tblpages, fillfactor, bs, tblid, nspname, relname, is_na
         FROM (
                  SELECT
                      ( 4 + tpl_hdr_size + tpl_data_size + (2 * ma)
                          - CASE WHEN tpl_hdr_size % ma = 0 THEN ma ELSE tpl_hdr_size % ma END
                          - CASE WHEN ceil(tpl_data_size)::INT % ma = 0 THEN ma ELSE ceil(tpl_data_size)::INT % ma END
                          ) AS tpl_size, (heappages + toastpages) AS tblpages, heappages,
                      toastpages, reltuples, toasttuples, bs, page_hdr, tblid, nspname, relname, fillfactor, is_na
                  FROM (
                           SELECT
                               tbl.oid AS tblid, ns.nspname , tbl.relname, tbl.reltuples,
                               tbl.relpages AS heappages, coalesce(toast.relpages, 0) AS toastpages,
                               coalesce(toast.reltuples, 0) AS toasttuples,
                               coalesce(substring(array_to_string(tbl.reloptions, ' ') FROM 'fillfactor=([0-9]+)')::smallint, 100) AS fillfactor,
                               current_setting('block_size')::numeric AS bs,
                               CASE WHEN version()~'mingw32' OR version()~'64-bit|x86_64|ppc64|ia64|amd64' THEN 8 ELSE 4 END AS ma,
                               24 AS page_hdr,
                               23 + CASE WHEN MAX(coalesce(s.null_frac,0)) > 0 THEN ( 7 + count(s.attname) ) / 8 ELSE 0::int END
                                   + CASE WHEN bool_or(att.attname = 'oid' and att.attnum < 0) THEN 4 ELSE 0 END AS tpl_hdr_size,
                               sum( (1-coalesce(s.null_frac, 0)) * coalesce(s.avg_width, 0) ) AS tpl_data_size,
                               bool_or(att.atttypid = 'pg_catalog.name'::regtype)
                                   OR sum(CASE WHEN att.attnum > 0 THEN 1 ELSE 0 END) <> count(s.attname) AS is_na
                           FROM pg_attribute AS att
                                    JOIN pg_class AS tbl ON att.attrelid = tbl.oid
                                    JOIN pg_namespace AS ns ON ns.oid = tbl.relnamespace
                                    LEFT JOIN pg_stats AS s ON s.schemaname=ns.nspname AND s.tablename = tbl.relname AND s.inherited=false AND s.attname=att.attname
                                    LEFT JOIN pg_class AS toast ON tbl.reltoastrelid = toast.oid
                           WHERE NOT att.attisdropped AND tbl.relkind = 'r' AND nspname NOT IN ('pg_catalog','information_schema')
                           GROUP BY 1,2,3,4,5,6,7,8,9,10
                       ) AS s
              ) AS s2
     ) AS s3
WHERE NOT is_na;
COMMENT ON VIEW monitor.pg_table_bloat IS 'postgres table bloat estimate';

GRANT SELECT ON monitor.pg_table_bloat TO pg_monitor;

----------------------------------------------------------------------
-- Index bloat estimate : monitor.pg_index_bloat
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_index_bloat CASCADE;
CREATE OR REPLACE VIEW monitor.pg_index_bloat AS
SELECT CURRENT_CATALOG AS datname, nspname, idxname AS relname, tblid, idxid, relpages::BIGINT * bs AS size,
       COALESCE((relpages - ( reltuples * (6 + ma - (CASE WHEN index_tuple_hdr % ma = 0 THEN ma ELSE index_tuple_hdr % ma END)
                                               + nulldatawidth + ma - (CASE WHEN nulldatawidth % ma = 0 THEN ma ELSE nulldatawidth % ma END))
                                  / (bs - pagehdr)::FLOAT  + 1 )), 0) / relpages::FLOAT AS ratio
FROM (
         SELECT nspname,idxname,indrelid AS tblid,indexrelid AS idxid,
                reltuples,relpages,
                current_setting('block_size')::INTEGER                                                               AS bs,
                (CASE WHEN version() ~ 'mingw32' OR version() ~ '64-bit|x86_64|ppc64|ia64|amd64' THEN 8 ELSE 4 END)  AS ma,
                24                                                                                                   AS pagehdr,
                (CASE WHEN max(COALESCE(pg_stats.null_frac, 0)) = 0 THEN 2 ELSE 6 END)                               AS index_tuple_hdr,
                sum((1.0 - COALESCE(pg_stats.null_frac, 0.0)) *
                    COALESCE(pg_stats.avg_width, 1024))::INTEGER                                                     AS nulldatawidth
         FROM pg_attribute
                  JOIN (
             SELECT pg_namespace.nspname,
                    ic.relname                                                   AS idxname,
                    ic.reltuples,
                    ic.relpages,
                    pg_index.indrelid,
                    pg_index.indexrelid,
                    tc.relname                                                   AS tablename,
                    regexp_split_to_table(pg_index.indkey::TEXT, ' ') :: INTEGER AS attnum,
                    pg_index.indexrelid                                          AS index_oid
             FROM pg_index
                      JOIN pg_class ic ON pg_index.indexrelid = ic.oid
                      JOIN pg_class tc ON pg_index.indrelid = tc.oid
                      JOIN pg_namespace ON pg_namespace.oid = ic.relnamespace
                      JOIN pg_am ON ic.relam = pg_am.oid
             WHERE pg_am.amname = 'btree' AND ic.relpages > 0 AND nspname NOT IN ('pg_catalog', 'information_schema')
         ) ind_atts ON pg_attribute.attrelid = ind_atts.indexrelid AND pg_attribute.attnum = ind_atts.attnum
                  JOIN pg_stats ON pg_stats.schemaname = ind_atts.nspname
             AND ((pg_stats.tablename = ind_atts.tablename AND pg_stats.attname = pg_get_indexdef(pg_attribute.attrelid, pg_attribute.attnum, TRUE))
                 OR (pg_stats.tablename = ind_atts.idxname AND pg_stats.attname = pg_attribute.attname))
         WHERE pg_attribute.attnum > 0
         GROUP BY 1, 2, 3, 4, 5, 6
     ) est;
COMMENT ON VIEW monitor.pg_index_bloat IS 'postgres index bloat estimate (btree-only)';

GRANT SELECT ON monitor.pg_index_bloat TO pg_monitor;

----------------------------------------------------------------------
-- Relation Bloat : monitor.pg_bloat
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_bloat CASCADE;
CREATE OR REPLACE VIEW monitor.pg_bloat AS
SELECT coalesce(ib.datname, tb.datname)                                                   AS datname,
       coalesce(ib.nspname, tb.nspname)                                                   AS nspname,
       coalesce(ib.tblid, tb.tblid)                                                       AS tblid,
       coalesce(tb.nspname || '.' || tb.relname, ib.nspname || '.' || ib.tblid::RegClass) AS tblname,
       tb.size                                                                            AS tbl_size,
       CASE WHEN tb.ratio < 0 THEN 0 ELSE round(tb.ratio::NUMERIC, 6) END                 AS tbl_ratio,
       (tb.size * (CASE WHEN tb.ratio < 0 THEN 0 ELSE tb.ratio::NUMERIC END)) ::BIGINT    AS tbl_wasted,
       ib.idxid,
       ib.nspname || '.' || ib.relname                                                    AS idxname,
       ib.size                                                                            AS idx_size,
       CASE WHEN ib.ratio < 0 THEN 0 ELSE round(ib.ratio::NUMERIC, 5) END                 AS idx_ratio,
       (ib.size * (CASE WHEN ib.ratio < 0 THEN 0 ELSE ib.ratio::NUMERIC END)) ::BIGINT    AS idx_wasted
FROM monitor.pg_index_bloat ib
         FULL OUTER JOIN monitor.pg_table_bloat tb ON ib.tblid = tb.tblid;

COMMENT ON VIEW monitor.pg_bloat IS 'postgres relation bloat detail';
GRANT SELECT ON monitor.pg_bloat TO pg_monitor;

----------------------------------------------------------------------
-- monitor.pg_index_bloat_human
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_index_bloat_human CASCADE;
CREATE OR REPLACE VIEW monitor.pg_index_bloat_human AS
SELECT idxname                            AS name,
       tblname,
       idx_wasted                         AS wasted,
       pg_size_pretty(idx_size)           AS idx_size,
       round(100 * idx_ratio::NUMERIC, 2) AS idx_ratio,
       pg_size_pretty(idx_wasted)         AS idx_wasted,
       pg_size_pretty(tbl_size)           AS tbl_size,
       round(100 * tbl_ratio::NUMERIC, 2) AS tbl_ratio,
       pg_size_pretty(tbl_wasted)         AS tbl_wasted
FROM monitor.pg_bloat
WHERE idxname IS NOT NULL;
COMMENT ON VIEW monitor.pg_index_bloat_human IS 'postgres index bloat info in human-readable format';
GRANT SELECT ON monitor.pg_index_bloat_human TO pg_monitor;


----------------------------------------------------------------------
-- monitor.pg_table_bloat_human
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_table_bloat_human CASCADE;
CREATE OR REPLACE VIEW monitor.pg_table_bloat_human AS
SELECT tblname                                          AS name,
       idx_wasted + tbl_wasted                          AS wasted,
       pg_size_pretty(idx_wasted + tbl_wasted)          AS all_wasted,
       pg_size_pretty(tbl_wasted)                       AS tbl_wasted,
       pg_size_pretty(tbl_size)                         AS tbl_size,
       tbl_ratio,
       pg_size_pretty(idx_wasted)                       AS idx_wasted,
       pg_size_pretty(idx_size)                         AS idx_size,
       round(idx_wasted::NUMERIC * 100.0 / idx_size, 2) AS idx_ratio
FROM (SELECT datname,
             nspname,
             tblname,
             coalesce(max(tbl_wasted), 0)                         AS tbl_wasted,
             coalesce(max(tbl_size), 1)                           AS tbl_size,
             round(100 * coalesce(max(tbl_ratio), 0)::NUMERIC, 2) AS tbl_ratio,
             coalesce(sum(idx_wasted), 0)                         AS idx_wasted,
             coalesce(sum(idx_size), 1)                           AS idx_size
      FROM monitor.pg_bloat
      WHERE tblname IS NOT NULL
      GROUP BY 1, 2, 3
     ) d;
COMMENT ON VIEW monitor.pg_table_bloat_human IS 'postgres table bloat info in human-readable format';
GRANT SELECT ON monitor.pg_table_bloat_human TO pg_monitor;


----------------------------------------------------------------------
-- Activity Overview: monitor.pg_session
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_session CASCADE;
CREATE OR REPLACE VIEW monitor.pg_session AS
SELECT coalesce(datname, 'all') AS datname, numbackends, active, idle, ixact, max_duration, max_tx_duration, max_conn_duration
FROM (
         SELECT datname,
                count(*)                                         AS numbackends,
                count(*) FILTER ( WHERE state = 'active' )       AS active,
                count(*) FILTER ( WHERE state = 'idle' )         AS idle,
                count(*) FILTER ( WHERE state = 'idle in transaction'
                    OR state = 'idle in transaction (aborted)' ) AS ixact,
                max(extract(epoch from now() - state_change))
                FILTER ( WHERE state = 'active' )                AS max_duration,
                max(extract(epoch from now() - xact_start))      AS max_tx_duration,
                max(extract(epoch from now() - backend_start))   AS max_conn_duration
         FROM pg_stat_activity
         WHERE backend_type = 'client backend'
           AND pid <> pg_backend_pid()
         GROUP BY ROLLUP (1)
         ORDER BY 1 NULLS FIRST
     ) t;
COMMENT ON VIEW monitor.pg_session IS 'postgres activity group by session';
GRANT SELECT ON monitor.pg_session TO pg_monitor;


----------------------------------------------------------------------
-- Sequential Scan: monitor.pg_seq_scan
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_seq_scan CASCADE;
CREATE OR REPLACE VIEW monitor.pg_seq_scan AS
SELECT schemaname                                                        AS nspname,
       relname,
       seq_scan,
       seq_tup_read,
       seq_tup_read / seq_scan                                           AS seq_tup_avg,
       idx_scan,
       n_live_tup + n_dead_tup                                           AS tuples,
       round(n_live_tup * 100.0::NUMERIC / (n_live_tup + n_dead_tup), 2) AS live_ratio
FROM pg_stat_user_tables
WHERE seq_scan > 0
  AND (n_live_tup + n_dead_tup) > 0
ORDER BY seq_scan DESC;
COMMENT ON VIEW monitor.pg_seq_scan IS 'tables that have sequential scans';
GRANT SELECT ON monitor.pg_seq_scan TO pg_monitor;
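
The views defined above can be queried like ordinary tables. The following is a minimal sketch, assuming the script has been applied to the current database and the connecting role is a member of `pg_monitor`. The `LIMIT` values and the 50% threshold are arbitrary illustrations, not recommendations from Pigsty, and the bloat figures are estimates derived from planner statistics.

```sql
-- Top 10 tables by estimated wasted bytes (table plus its indexes).
-- "wasted" is the raw byte count, so it sorts correctly;
-- the pg_size_pretty() columns are meant for reading only.
SELECT name, all_wasted, tbl_size, tbl_ratio, idx_size, idx_ratio
FROM monitor.pg_table_bloat_human
ORDER BY wasted DESC
LIMIT 10;

-- Indexes whose estimated bloat exceeds 50% (idx_ratio is a percentage in this view).
SELECT name, tblname, idx_size, idx_ratio, idx_wasted
FROM monitor.pg_index_bloat_human
WHERE idx_ratio > 50
ORDER BY wasted DESC;

-- Per-database session summary; the row with datname = 'all' is the ROLLUP total.
SELECT datname, numbackends, active, idle, ixact, max_tx_duration
FROM monitor.pg_session;

-- Tables with the most sequential scans and the average tuples read per scan.
SELECT nspname, relname, seq_scan, seq_tup_avg, idx_scan
FROM monitor.pg_seq_scan
ORDER BY seq_scan DESC
LIMIT 20;
```

Sorting on the numeric `wasted` column rather than on the pretty-printed sizes avoids ordering text values such as `'9 MB'` and `'10 GB'` lexicographically.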

Function for viewing shared memory allocations (available on PG 13 and above)

DROP FUNCTION IF EXISTS monitor.pg_shmem() CASCADE;
CREATE OR REPLACE FUNCTION monitor.pg_shmem() RETURNS SETOF
    pg_shmem_allocations AS $$ SELECT * FROM pg_shmem_allocations;$$ LANGUAGE SQL SECURITY DEFINER;
COMMENT ON FUNCTION monitor.pg_shmem() IS 'security wrapper for system view pg_shmem_allocations';
REVOKE ALL ON FUNCTION monitor.pg_shmem() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION monitor.pg_shmem() TO pg_monitor;
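
Because the wrapper is declared `SECURITY DEFINER`, it runs with the privileges of its owner, so members of `pg_monitor` can read shared memory allocations without direct access to the underlying system view. A minimal usage sketch follows, assuming PostgreSQL 13 or later and a role that has been granted `EXECUTE` as above. The `LIMIT` is an arbitrary choice for illustration.

```sql
-- Largest shared memory allocations, biggest first.
-- name is NULL for unused memory and '<anonymous>' for anonymous allocations.
SELECT name,
       pg_size_pretty(allocated_size) AS allocated
FROM monitor.pg_shmem()
ORDER BY allocated_size DESC
LIMIT 10;
```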

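6.16 - Dashboards

Pigsty provides many out-of-the-box Grafana monitoring dashboards for PostgreSQL

Pigsty provides many out-of-the-box Grafana monitoring dashboards for PostgreSQL: Demo & Gallery.

Pigsty has 26 PostgreSQL-related dashboards in total. By level they fall into four categories: overview, cluster, instance, and database. By data source they fall into three categories: PGSQL, PGCAT, and PGLOG.

Overview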

Overview	Cluster	Instance	Database
PGSQL Overview	PGSQL Cluster	PGSQL Instance	PGSQL Database
PGSQL Alert	PGRDS Cluster	PGRDS Instance	PGCAT Database
PGSQL Shard	PGSQL Activity	PGCAT Instance	PGSQL Tables
	PGSQL Replication	PGSQL Persist	PGSQL Table
	PGSQL Service	PGSQL Proxy	PGCAT Table
	PGSQL Databases	PGSQL Pgbouncer	PGSQL Query
	PGSQL Patroni	PGSQL Session	PGCAT Query
	PGSQL PITR	PGSQL Xacts	PGCAT Locks
		PGSQL Exporter	PGCAT Schema

Overview

pgsql-overview: The main dashboard for the PGSQL module
pgsql-alert: Global key metrics and alert events for PGSQL
pgsql-shard: Overview of a horizontally sharded PGSQL cluster, e.g. a citus / gpsql cluster

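Cluster

pgsql-cluster: The main dashboard for a PGSQL cluster
pgrds-cluster: The RDS version of PGSQL Cluster, focusing on the metrics of PostgreSQL itself
pgsql-activity: Focuses on sessions, load, QPS, TPS, and locks of a PGSQL cluster
pgsql-replication: Focuses on replication, slots, and publications/subscriptions of a PGSQL cluster
pgsql-service: Focuses on services, proxies, routing, and load balancing of a PGSQL cluster
pgsql-databases: Focuses on database CRUD, slow queries, and table statistics across all instances
pgsql-patroni: Focuses on cluster high-availability status and Patroni component status
pgsql-pitr: Focuses on the context of the cluster's PITR process, to assist point-in-time recovery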

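Instance

pgsql-instance: The main dashboard for a single PGSQL instance
pgrds-instance: The RDS version of PGSQL Instance, focusing on the metrics of PostgreSQL itself
pgcat-instance: Instance information obtained directly from the database catalog
pgsql-proxy: Detailed metrics of a single haproxy load balancer
pgsql-pgbouncer: Metrics overview of a single Pgbouncer connection pool instance
pgsql-persist: Persistence metrics: WAL, XID, checkpoints, archiving, IO
pgsql-session: Metrics on sessions and active/idle time in a single instance
pgsql-xacts: Metrics on transactions, locks, and TPS/QPS
pgsql-exporter: Self-monitoring metrics of the Postgres and Pgbouncer monitoring components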

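Database

pgsql-database: The main dashboard for a single PGSQL database
pgcat-database: Database information obtained directly from the database catalog
pgsql-tables: Table/index access metrics within a single database
pgsql-table: Details of a single table (QPS/RT/indexes/sequences…)
pgcat-table: Details of a single table obtained directly from the database catalog (statistics/bloat…)
pgsql-query: Details of a single query (QPS/RT)
pgcat-query: Details of a single query obtained directly from the database catalog (SQL/statistics)
pgcat-schema: Schema information obtained directly from the database catalog (tables/indexes/sequences…)
pgcat-locks: Information on activity and lock waits obtained directly from the database catalog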

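Overview

PGSQL Overview: The main dashboard for the PGSQL module

PGSQL Overview

PGSQL Alert: Overview of global core PGSQL metrics and alert events

PGSQL Alert

PGSQL Shard: Side-by-side comparison of metrics within a horizontally sharded PGSQL cluster, e.g. a CITUS / GPSQL cluster.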

PGSQL Shard

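Cluster

PGSQL Cluster: The main dashboard for a PGSQL cluster

PGSQL Cluster

PGRDS Cluster: The RDS version of PGSQL Cluster, focusing on the metrics of PostgreSQL itself

PGRDS Cluster

PGSQL Service: Focuses on services, proxies, routing, and load balancing of a PGSQL cluster.

PGSQL Service

PGSQL Activity: Focuses on sessions, load, QPS, TPS, and locks of a PGSQL cluster

PGSQL Activity

PGSQL Replication: Focuses on replication, slots, and publications/subscriptions of a PGSQL cluster.

PGSQL Replication

PGSQL Databases: Focuses on database CRUD, slow queries, and table statistics across all instances.

PGSQL Databases

PGSQL Patroni: Focuses on cluster high-availability status and Patroni component status

PGSQL Patroni

PGSQL PITR: Focuses on the context of the cluster's PITR process, to assist point-in-time recovery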

PGSQL PITR

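Instance

PGSQL Instance: The main dashboard for a single PGSQL instance

PGSQL Instance

PGRDS Instance: The RDS version of PGSQL Instance, focusing on the metrics of PostgreSQL itself

PGRDS Instance

PGSQL Proxy: Detailed metrics of a single haproxy load balancer

PGSQL Proxy

PGSQL Pgbouncer: Metrics overview of a single Pgbouncer connection pool instance

PGSQL Pgbouncer

PGSQL Persist: Persistence metrics: WAL, XID, checkpoints, archiving, IO

PGSQL Persist

PGSQL Xacts: Metrics on transactions, locks, and TPS/QPS

PGSQL Xacts

PGSQL Session: Metrics on sessions and active/idle time in a single instance

PGSQL Session

PGSQL Exporter: Self-monitoring metrics of the Postgres/Pgbouncer monitoring components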

PGSQL Exporter

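Database

PGSQL Database: The main dashboard for a single PGSQL database

PGSQL Database

PGSQL Tables: Table/index access metrics within a single database

PGSQL Tables

PGSQL Table: Details of a single table (QPS/RT/indexes/sequences…)

PGSQL Table

PGSQL Query: Details of a single kind of query (QPS/RT)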

PGSQL Query

PGCAT

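PGCAT Instance: Instance information obtained directly from the database catalog

PGCAT Instance

PGCAT Database: Database information obtained directly from the database catalog

PGCAT Database

PGCAT Schema: Schema information obtained directly from the database catalog (tables/indexes/sequences…)

PGCAT Schema

PGCAT Table: Details of a single table obtained directly from the database catalog (statistics/bloat…)

PGCAT Table

PGCAT Query: Details of a single kind of query obtained directly from the database catalog (SQL/statistics)

PGCAT Query

PGCAT Locks: Information on activity and lock waits obtained directly from the database catalog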

PGCAT Locks

PGLOG

PGLOG Overview: Overview of the CSV log samples in the Pigsty CMDB

PGLOG Overview

PGLOG Session: Log details of a single session in the CSV log samples in the Pigsty CMDB

PGLOG Session

Gallery

For details, see pigsty/wiki/gallery.

PGSQL Overview

PGSQL Shard

PGSQL Cluster

PGSQL Service

PGSQL Activity

PGSQL Replication

PGSQL Databases

PGSQL Instance

PGSQL Proxy

PGSQL Pgbouncer

PGSQL Session

PGSQL Xacts

PGSQL Persist

PGSQL Database

PGSQL Tables

PGSQL Table

PGSQL Query

PGCAT Instance

PGCAT Database

PGCAT Schema

PGCAT Table

PGCAT Locks

PGCAT Query

PGLOG Overview

PGLOG Session

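6.17 - Metric List

The complete list of monitoring metrics provided by the Pigsty PGSQL module, with explanations

The PGSQL module provides 638 types of available monitoring metrics.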

Metric Name	Type	Labels	Description
ALERTS	Unknown	`category`, `job`, `level`, `ins`, `severity`, `ip`, `alertname`, `alertstate`, `instance`, `cls`	N/A
ALERTS_FOR_STATE	Unknown	`category`, `job`, `level`, `ins`, `severity`, `ip`, `alertname`, `instance`, `cls`	N/A
cls:pressure1	Unknown	`job`, `cls`	N/A
cls:pressure15	Unknown	`job`, `cls`	N/A
cls:pressure5	Unknown	`job`, `cls`	N/A
go_gc_duration_seconds	summary	`job`, `ins`, `ip`, `instance`, `quantile`, `cls`	A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
go_gc_duration_seconds_sum	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
go_goroutines	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of goroutines that currently exist.
go_info	gauge	`version`, `job`, `ins`, `ip`, `instance`, `cls`	Information about the Go environment.
go_memstats_alloc_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total	counter	`job`, `ins`, `ip`, `instance`, `cls`	Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total	counter	`job`, `ins`, `ip`, `instance`, `cls`	Total number of frees.
go_memstats_gc_sys_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of heap bytes that are in use.
go_memstats_heap_objects	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of allocated objects.
go_memstats_heap_released_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of heap bytes released to OS.
go_memstats_heap_sys_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total	counter	`job`, `ins`, `ip`, `instance`, `cls`	Total number of pointer lookups.
go_memstats_mallocs_total	counter	`job`, `ins`, `ip`, `instance`, `cls`	Total number of mallocs.
go_memstats_mcache_inuse_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of bytes obtained from system.
go_threads	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of OS threads created.
ins:pressure1	Unknown	`job`, `ins`, `ip`, `cls`	N/A
ins:pressure15	Unknown	`job`, `ins`, `ip`, `cls`	N/A
ins:pressure5	Unknown	`job`, `ins`, `ip`, `cls`	N/A
patroni_cluster_unlocked	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Value is 1 if the cluster is unlocked, 0 if locked.
patroni_dcs_last_seen	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Epoch timestamp when DCS was last contacted successfully by Patroni.
patroni_failsafe_mode_is_active	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Value is 1 if failsafe mode is active, 0 if inactive.
patroni_is_paused	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Value is 1 if auto failover is disabled, 0 otherwise.
patroni_master	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Value is 1 if this node is the leader, 0 otherwise.
patroni_pending_restart	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Value is 1 if the node needs a restart, 0 otherwise.
patroni_postgres_in_archive_recovery	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Value is 1 if Postgres is replicating from archive, 0 otherwise.
patroni_postgres_running	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Value is 1 if Postgres is running, 0 otherwise.
patroni_postgres_server_version	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Version of Postgres (if running), 0 otherwise.
patroni_postgres_streaming	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Value is 1 if Postgres is streaming, 0 otherwise.
patroni_postgres_timeline	counter	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Postgres timeline of this node (if running), 0 otherwise.
patroni_postmaster_start_time	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Epoch seconds since Postgres started.
patroni_primary	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Value is 1 if this node is the leader, 0 otherwise.
patroni_replica	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Value is 1 if this node is a replica, 0 otherwise.
patroni_standby_leader	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Value is 1 if this node is the standby_leader, 0 otherwise.
patroni_sync_standby	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Value is 1 if this node is a sync standby replica, 0 otherwise.
patroni_up	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
patroni_version	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Patroni semver without periods.
patroni_xlog_location	counter	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Current location of the Postgres transaction log, 0 if this node is not the leader.
patroni_xlog_paused	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Value is 1 if the Postgres xlog is paused, 0 otherwise.
patroni_xlog_received_location	counter	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Current location of the received Postgres transaction log, 0 if this node is not a replica.
patroni_xlog_replayed_location	counter	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Current location of the replayed Postgres transaction log, 0 if this node is not a replica.
patroni_xlog_replayed_timestamp	gauge	`job`, `ins`, `ip`, `instance`, `cls`, `scope`	Current timestamp of the replayed Postgres transaction log, 0 if null.
pg:cls:active_backends	Unknown	`job`, `cls`	N/A
pg:cls:active_time_rate15m	Unknown	`job`, `cls`	N/A
pg:cls:active_time_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:active_time_rate5m	Unknown	`job`, `cls`	N/A
pg:cls:age	Unknown	`job`, `cls`	N/A
pg:cls:buf_alloc_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:buf_clean_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:buf_flush_backend_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:buf_flush_checkpoint_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:cpu_count	Unknown	`job`, `cls`	N/A
pg:cls:cpu_usage	Unknown	`job`, `cls`	N/A
pg:cls:cpu_usage_15m	Unknown	`job`, `cls`	N/A
pg:cls:cpu_usage_1m	Unknown	`job`, `cls`	N/A
pg:cls:cpu_usage_5m	Unknown	`job`, `cls`	N/A
pg:cls:db_size	Unknown	`job`, `cls`	N/A
pg:cls:file_size	Unknown	`job`, `cls`	N/A
pg:cls:ixact_backends	Unknown	`job`, `cls`	N/A
pg:cls:ixact_time_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:lag_bytes	Unknown	`job`, `cls`	N/A
pg:cls:lag_seconds	Unknown	`job`, `cls`	N/A
pg:cls:leader	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:cls:load1	Unknown	`job`, `cls`	N/A
pg:cls:load15	Unknown	`job`, `cls`	N/A
pg:cls:load5	Unknown	`job`, `cls`	N/A
pg:cls:lock_count	Unknown	`job`, `cls`	N/A
pg:cls:locks	Unknown	`job`, `cls`, `mode`	N/A
pg:cls:log_size	Unknown	`job`, `cls`	N/A
pg:cls:lsn_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:members	Unknown	`job`, `ins`, `ip`, `cls`	N/A
pg:cls:num_backends	Unknown	`job`, `cls`	N/A
pg:cls:partition	Unknown	`job`, `cls`	N/A
pg:cls:receiver	Unknown	`state`, `slot_name`, `job`, `appname`, `ip`, `cls`, `sender_host`, `sender_port`	N/A
pg:cls:rlock_count	Unknown	`job`, `cls`	N/A
pg:cls:saturation1	Unknown	`job`, `cls`	N/A
pg:cls:saturation15	Unknown	`job`, `cls`	N/A
pg:cls:saturation5	Unknown	`job`, `cls`	N/A
pg:cls:sender	Unknown	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `cls`	N/A
pg:cls:session_time_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:size	Unknown	`job`, `cls`	N/A
pg:cls:slot_count	Unknown	`job`, `cls`	N/A
pg:cls:slot_retained_bytes	Unknown	`job`, `cls`	N/A
pg:cls:standby_count	Unknown	`job`, `cls`	N/A
pg:cls:sync_state	Unknown	`job`, `cls`	N/A
pg:cls:timeline	Unknown	`job`, `cls`	N/A
pg:cls:tup_deleted_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:tup_fetched_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:tup_inserted_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:tup_modified_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:tup_returned_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:wal_size	Unknown	`job`, `cls`	N/A
pg:cls:xact_commit_rate15m	Unknown	`job`, `cls`	N/A
pg:cls:xact_commit_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:xact_commit_rate5m	Unknown	`job`, `cls`	N/A
pg:cls:xact_rollback_rate15m	Unknown	`job`, `cls`	N/A
pg:cls:xact_rollback_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:xact_rollback_rate5m	Unknown	`job`, `cls`	N/A
pg:cls:xact_total_rate15m	Unknown	`job`, `cls`	N/A
pg:cls:xact_total_rate1m	Unknown	`job`, `cls`	N/A
pg:cls:xact_total_sigma15m	Unknown	`job`, `cls`	N/A
pg:cls:xlock_count	Unknown	`job`, `cls`	N/A
pg:db:active_backends	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:active_time_rate15m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:active_time_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:active_time_rate5m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:age	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:age_deriv1h	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:age_exhaust	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:blk_io_time_seconds_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:blk_read_time_seconds_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:blk_write_time_seconds_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:blks_access_1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:blks_hit_1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:blks_hit_ratio1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:blks_read_1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:conn_limit	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:conn_usage	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:db_size	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:ixact_backends	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:ixact_time_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:lock_count	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:num_backends	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:rlock_count	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:session_time_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:temp_bytes_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:temp_files_1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:tup_deleted_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:tup_fetched_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:tup_inserted_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:tup_modified_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:tup_returned_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:wlock_count	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:xact_commit_rate15m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:xact_commit_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:xact_commit_rate5m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:xact_rollback_rate15m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:xact_rollback_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:xact_rollback_rate5m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:xact_total_rate15m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:xact_total_rate1m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:xact_total_rate5m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:xact_total_sigma15m	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:db:xlock_count	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:env:active_backends	Unknown	`job`	N/A
pg:env:active_time_rate15m	Unknown	`job`	N/A
pg:env:active_time_rate1m	Unknown	`job`	N/A
pg:env:active_time_rate5m	Unknown	`job`	N/A
pg:env:age	Unknown	`job`	N/A
pg:env:cpu_count	Unknown	`job`	N/A
pg:env:cpu_usage	Unknown	`job`	N/A
pg:env:cpu_usage_15m	Unknown	`job`	N/A
pg:env:cpu_usage_1m	Unknown	`job`	N/A
pg:env:cpu_usage_5m	Unknown	`job`	N/A
pg:env:ixact_backends	Unknown	`job`	N/A
pg:env:ixact_time_rate1m	Unknown	`job`	N/A
pg:env:lag_bytes	Unknown	`job`	N/A
pg:env:lag_seconds	Unknown	`job`	N/A
pg:env:lsn_rate1m	Unknown	`job`	N/A
pg:env:session_time_rate1m	Unknown	`job`	N/A
pg:env:tup_deleted_rate1m	Unknown	`job`	N/A
pg:env:tup_fetched_rate1m	Unknown	`job`	N/A
pg:env:tup_inserted_rate1m	Unknown	`job`	N/A
pg:env:tup_modified_rate1m	Unknown	`job`	N/A
pg:env:tup_returned_rate1m	Unknown	`job`	N/A
pg:env:xact_commit_rate15m	Unknown	`job`	N/A
pg:env:xact_commit_rate1m	Unknown	`job`	N/A
pg:env:xact_commit_rate5m	Unknown	`job`	N/A
pg:env:xact_rollback_rate15m	Unknown	`job`	N/A
pg:env:xact_rollback_rate1m	Unknown	`job`	N/A
pg:env:xact_rollback_rate5m	Unknown	`job`	N/A
pg:env:xact_total_rate15m	Unknown	`job`	N/A
pg:env:xact_total_rate1m	Unknown	`job`	N/A
pg:env:xact_total_sigma15m	Unknown	`job`	N/A
pg:ins:active_backends	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:active_time_rate15m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:active_time_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:active_time_rate5m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:age	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:blks_hit_ratio1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:buf_alloc_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:buf_clean_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:buf_flush_backend_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:buf_flush_checkpoint_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:ckpt_1h	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:ckpt_req_1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:ckpt_timed_1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:conn_limit	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:conn_usage	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:cpu_count	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:cpu_usage	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:cpu_usage_15m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:cpu_usage_1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:cpu_usage_5m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:db_size	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:file_size	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:fs_size	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:is_leader	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:ixact_backends	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:ixact_time_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:lag_bytes	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:lag_seconds	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:load1	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:load15	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:load5	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:lock_count	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:locks	Unknown	`job`, `ins`, `ip`, `mode`, `instance`, `cls`	N/A
pg:ins:log_size	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:lsn_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:mem_size	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:num_backends	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:rlock_count	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:saturation1	Unknown	`job`, `ins`, `ip`, `cls`	N/A
pg:ins:saturation15	Unknown	`job`, `ins`, `ip`, `cls`	N/A
pg:ins:saturation5	Unknown	`job`, `ins`, `ip`, `cls`	N/A
pg:ins:session_time_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:slot_retained_bytes	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:space_usage	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:status	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:sync_state	Unknown	`job`, `ins`, `instance`, `cls`	N/A
pg:ins:target_count	Unknown	`job`, `cls`, `ins`	N/A
pg:ins:timeline	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:tup_deleted_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:tup_fetched_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:tup_inserted_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:tup_modified_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:tup_returned_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:wal_size	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:wlock_count	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:xact_commit_rate15m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:xact_commit_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:xact_commit_rate5m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:xact_rollback_rate15m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:xact_rollback_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:xact_rollback_rate5m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:xact_total_rate15m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:xact_total_rate1m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:xact_total_rate5m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:xact_total_sigma15m	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:ins:xlock_count	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:query:call_rate1m	Unknown	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:query:rt_1m	Unknown	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg:table:scan_rate1m	Unknown	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg_activity_count	gauge	`datname`, `state`, `job`, `ins`, `ip`, `instance`, `cls`	Count of connections among (datname, state)
pg_activity_max_conn_duration	gauge	`datname`, `state`, `job`, `ins`, `ip`, `instance`, `cls`	Max backend session duration since state change among (datname, state)
pg_activity_max_duration	gauge	`datname`, `state`, `job`, `ins`, `ip`, `instance`, `cls`	Max duration since last state change among (datname, state)
pg_activity_max_tx_duration	gauge	`datname`, `state`, `job`, `ins`, `ip`, `instance`, `cls`	Max transaction duration since state change among (datname, state)
pg_archiver_failed_count	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of failed attempts for archiving WAL files
pg_archiver_finish_count	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of WAL files that have been successfully archived
pg_archiver_last_failed_time	counter	`job`, `ins`, `ip`, `instance`, `cls`	Time of the last failed archival operation
pg_archiver_last_finish_time	counter	`job`, `ins`, `ip`, `instance`, `cls`	Time of the last successful archive operation
pg_archiver_reset_time	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Time at which archive statistics were last reset
pg_backend_count	gauge	`type`, `job`, `ins`, `ip`, `instance`, `cls`	Database backend process count by backend_type
pg_bgwriter_buffers_alloc	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of buffers allocated
pg_bgwriter_buffers_backend	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of buffers written directly by a backend
pg_bgwriter_buffers_backend_fsync	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of times a backend had to execute its own fsync call
pg_bgwriter_buffers_checkpoint	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of buffers written during checkpoints
pg_bgwriter_buffers_clean	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of buffers written by the background writer
pg_bgwriter_checkpoint_sync_time	counter	`job`, `ins`, `ip`, `instance`, `cls`	Total amount of time that has been spent in the portion of checkpoint processing where files are synchronized to disk, in seconds
pg_bgwriter_checkpoint_write_time	counter	`job`, `ins`, `ip`, `instance`, `cls`	Total amount of time that has been spent in the portion of checkpoint processing where files are written to disk, in seconds
pg_bgwriter_checkpoints_req	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of requested checkpoints that have been performed
pg_bgwriter_checkpoints_timed	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of scheduled checkpoints that have been performed
pg_bgwriter_maxwritten_clean	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of times the background writer stopped a cleaning scan because it had written too many buffers
pg_bgwriter_reset_time	counter	`job`, `ins`, `ip`, `instance`, `cls`	Time at which bgwriter statistics were last reset
pg_boot_time	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Unix timestamp when the postmaster booted
pg_checkpoint_checkpoint_lsn	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint location
pg_checkpoint_elapse	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Time elapsed since the latest checkpoint, in seconds
pg_checkpoint_full_page_writes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s full_page_writes enabled
pg_checkpoint_newest_commit_ts_xid	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s newestCommitTsXid
pg_checkpoint_next_multi_offset	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s NextMultiOffset
pg_checkpoint_next_multixact_id	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s NextMultiXactId
pg_checkpoint_next_oid	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s NextOID
pg_checkpoint_next_xid	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s NextXID xid
pg_checkpoint_next_xid_epoch	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s NextXID epoch
pg_checkpoint_oldest_active_xid	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s oldestActiveXID
pg_checkpoint_oldest_commit_ts_xid	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s oldestCommitTsXid
pg_checkpoint_oldest_multi_dbid	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s oldestMulti’s DB OID
pg_checkpoint_oldest_multi_xid	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s oldestMultiXid
pg_checkpoint_oldest_xid	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s oldestXID
pg_checkpoint_oldest_xid_dbid	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s oldestXID’s DB OID
pg_checkpoint_prev_tli	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s PrevTimeLineID
pg_checkpoint_redo_lsn	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s REDO location
pg_checkpoint_time	counter	`job`, `ins`, `ip`, `instance`, `cls`	Time of latest checkpoint
pg_checkpoint_tli	counter	`job`, `ins`, `ip`, `instance`, `cls`	Latest checkpoint’s TimeLineID
pg_conf_reload_time	gauge	`job`, `ins`, `ip`, `instance`, `cls`	seconds since last configuration reload
pg_db_active_time	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Time spent executing SQL statements in this database, in seconds
pg_db_age	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Age of database calculated from datfrozenxid
pg_db_allow_conn	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	If false(0) then no one can connect to this database.
pg_db_blk_read_time	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Time spent reading data file blocks by backends in this database, in seconds
pg_db_blk_write_time	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Time spent writing data file blocks by backends in this database, in seconds
pg_db_blks_access	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of disk block accesses (read + hit)
pg_db_blks_hit	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of times disk blocks were found already in the buffer cache
pg_db_blks_read	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of disk blocks read in this database
pg_db_cks_fail_time	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Time at which the last data page checksum failure was detected in this database
pg_db_cks_fails	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of data page checksum failures detected in this database, -1 for not enabled
pg_db_confl_confl_bufferpin	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of queries in this database that have been canceled due to pinned buffers
pg_db_confl_confl_deadlock	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of queries in this database that have been canceled due to deadlocks
pg_db_confl_confl_lock	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of queries in this database that have been canceled due to lock timeouts
pg_db_confl_confl_snapshot	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of queries in this database that have been canceled due to old snapshots
pg_db_confl_confl_tablespace	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of queries in this database that have been canceled due to dropped tablespaces
pg_db_conflicts	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of queries canceled due to conflicts with recovery in this database
pg_db_conn_limit	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Sets maximum number of concurrent connections that can be made to this database. -1 means no limit.
pg_db_datid	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	OID of the database
pg_db_deadlocks	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of deadlocks detected in this database
pg_db_frozen_xid	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	All transaction IDs before this one have been frozen
pg_db_is_template	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	If true(1), then this database can be cloned by any user with CREATEDB privileges
pg_db_ixact_time	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Time spent idling while in a transaction in this database, in seconds
pg_db_numbackends	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of backends currently connected to this database
pg_db_reset_time	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Time at which database statistics were last reset
pg_db_session_time	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Time spent by database sessions in this database, in seconds
pg_db_sessions	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Total number of sessions established to this database
pg_db_sessions_abandoned	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of database sessions to this database that were terminated because connection to the client was lost
pg_db_sessions_fatal	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of database sessions to this database that were terminated by fatal errors
pg_db_sessions_killed	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of database sessions to this database that were terminated by operator intervention
pg_db_temp_bytes	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Total amount of data written to temporary files by queries in this database.
pg_db_temp_files	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of temporary files created by queries in this database
pg_db_tup_deleted	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of rows deleted by queries in this database
pg_db_tup_fetched	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of rows fetched by queries in this database
pg_db_tup_inserted	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of rows inserted by queries in this database
pg_db_tup_modified	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of rows modified by queries in this database
pg_db_tup_returned	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of rows returned by queries in this database
pg_db_tup_updated	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of rows updated by queries in this database
pg_db_xact_commit	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of transactions in this database that have been committed
pg_db_xact_rollback	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of transactions in this database that have been rolled back
pg_db_xact_total	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of transactions in this database
pg_downstream_count	gauge	`state`, `job`, `ins`, `ip`, `instance`, `cls`	Count of corresponding state
pg_exporter_agent_up	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pg_exporter_last_scrape_time	gauge	`job`, `ins`, `ip`, `instance`, `cls`	seconds the exporter spent on scraping
pg_exporter_query_cache_ttl	gauge	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	time to live of the query cache
pg_exporter_query_scrape_duration	gauge	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	seconds the query spent on scraping
pg_exporter_query_scrape_error_count	gauge	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	times the query failed
pg_exporter_query_scrape_hit_count	gauge	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	number of times this query was scraped
pg_exporter_query_scrape_metric_count	gauge	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	number of metrics scraped from this query
pg_exporter_query_scrape_total_count	gauge	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	times exporter server was scraped for metrics
pg_exporter_scrape_duration	gauge	`job`, `ins`, `ip`, `instance`, `cls`	seconds the exporter spent on scraping
pg_exporter_scrape_error_count	counter	`job`, `ins`, `ip`, `instance`, `cls`	times exporter was scraped for metrics and failed
pg_exporter_scrape_total_count	counter	`job`, `ins`, `ip`, `instance`, `cls`	times exporter was scraped for metrics
pg_exporter_server_scrape_duration	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	seconds the exporter server spent on scraping
pg_exporter_server_scrape_error_count	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	N/A
pg_exporter_server_scrape_total_count	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	times exporter server was scraped for metrics
pg_exporter_server_scrape_total_seconds	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	seconds the exporter server spent on scraping
pg_exporter_up	gauge	`job`, `ins`, `ip`, `instance`, `cls`	always 1 if you can retrieve metrics
pg_exporter_uptime	gauge	`job`, `ins`, `ip`, `instance`, `cls`	seconds since the exporter's primary server was initialized
pg_flush_lsn	counter	`job`, `ins`, `ip`, `instance`, `cls`	primary only, location of current wal syncing
pg_func_calls	counter	`datname`, `funcname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of times this function has been called
pg_func_self_time	counter	`datname`, `funcname`, `job`, `ins`, `ip`, `instance`, `cls`	Total time spent in this function itself, not including other functions called by it, in ms
pg_func_total_time	counter	`datname`, `funcname`, `job`, `ins`, `ip`, `instance`, `cls`	Total time spent in this function and all other functions called by it, in ms
pg_in_recovery	gauge	`job`, `ins`, `ip`, `instance`, `cls`	server is in recovery mode? 1 for yes 0 for no
pg_index_idx_blks_hit	counter	`datname`, `relname`, `job`, `ins`, `relid`, `ip`, `instance`, `cls`, `idxname`	Number of buffer hits in this index
pg_index_idx_blks_read	counter	`datname`, `relname`, `job`, `ins`, `relid`, `ip`, `instance`, `cls`, `idxname`	Number of disk blocks read from this index
pg_index_idx_scan	counter	`datname`, `relname`, `job`, `ins`, `relid`, `ip`, `instance`, `cls`, `idxname`	Number of index scans initiated on this index
pg_index_idx_tup_fetch	counter	`datname`, `relname`, `job`, `ins`, `relid`, `ip`, `instance`, `cls`, `idxname`	Number of live table rows fetched by simple index scans using this index
pg_index_idx_tup_read	counter	`datname`, `relname`, `job`, `ins`, `relid`, `ip`, `instance`, `cls`, `idxname`	Number of index entries returned by scans on this index
pg_index_relpages	gauge	`datname`, `relname`, `job`, `ins`, `relid`, `ip`, `instance`, `cls`, `idxname`	Size of the on-disk representation of this index in pages
pg_index_reltuples	gauge	`datname`, `relname`, `job`, `ins`, `relid`, `ip`, `instance`, `cls`, `idxname`	Estimate relation tuples
pg_insert_lsn	counter	`job`, `ins`, `ip`, `instance`, `cls`	primary only, location of current wal inserting
pg_io_evictions	counter	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	Number of times a block has been written out from a shared or local buffer
pg_io_extend_time	counter	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	Time spent in extend operations in seconds
pg_io_extends	counter	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	Number of relation extend operations, each of the size specified in op_bytes.
pg_io_fsync_time	counter	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	Time spent in fsync operations in seconds
pg_io_fsyncs	counter	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	Number of fsync calls. These are only tracked in context normal
pg_io_hits	counter	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	The number of times a desired block was found in a shared buffer.
pg_io_op_bytes	gauge	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	The number of bytes per unit of I/O read, written, or extended. 8192 by default
pg_io_read_time	counter	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	Time spent in read operations in seconds
pg_io_reads	counter	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	Number of read operations, each of the size specified in op_bytes.
pg_io_reset_time	gauge	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	Timestamp at which these statistics were last reset
pg_io_reuses	counter	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	The number of times an existing buffer is reused
pg_io_write_time	counter	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	Time spent in write operations in seconds
pg_io_writeback_time	counter	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	Time spent in writeback operations in seconds
pg_io_writebacks	counter	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	Number of units of size op_bytes which the process requested the kernel write out to permanent storage.
pg_io_writes	counter	`type`, `job`, `ins`, `object`, `ip`, `context`, `instance`, `cls`	Number of write operations, each of the size specified in op_bytes.
pg_is_in_recovery	gauge	`job`, `ins`, `ip`, `instance`, `cls`	1 if in recovery mode
pg_is_wal_replay_paused	gauge	`job`, `ins`, `ip`, `instance`, `cls`	1 if WAL replay is paused
pg_lag	gauge	`job`, `ins`, `ip`, `instance`, `cls`	replica only, replication lag in seconds
pg_last_replay_time	gauge	`job`, `ins`, `ip`, `instance`, `cls`	time when the last transaction was replayed
pg_lock_count	gauge	`datname`, `job`, `ins`, `ip`, `mode`, `instance`, `cls`	Number of locks of corresponding mode and database
pg_lsn	counter	`job`, `ins`, `ip`, `instance`, `cls`	log sequence number, current write location
pg_meta_info	gauge	`cls`, `extensions`, `version`, `job`, `ins`, `primary_conninfo`, `conf_path`, `hba_path`, `ip`, `cluster_id`, `instance`, `listen_port`, `wal_level`, `ver_num`, `cluster_name`, `data_dir`	constant 1
pg_query_calls	counter	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	Number of times the statement was executed
pg_query_exec_time	counter	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	Total time spent executing the statement, in seconds
pg_query_io_time	counter	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	Total time the statement spent reading and writing blocks, in seconds
pg_query_rows	counter	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	Total number of rows retrieved or affected by the statement
pg_query_sblk_dirtied	counter	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	Total number of shared blocks dirtied by the statement
pg_query_sblk_hit	counter	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	Total number of shared block cache hits by the statement
pg_query_sblk_read	counter	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	Total number of shared blocks read by the statement
pg_query_sblk_written	counter	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	Total number of shared blocks written by the statement
pg_query_wal_bytes	counter	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	Total amount of WAL bytes generated by the statement
pg_receive_lsn	counter	`job`, `ins`, `ip`, `instance`, `cls`	replica only, location of wal synced to disk
pg_recovery_backup_end_lsn	counter	`job`, `ins`, `ip`, `instance`, `cls`	Backup end location
pg_recovery_backup_start_lsn	counter	`job`, `ins`, `ip`, `instance`, `cls`	Backup start location
pg_recovery_min_lsn	counter	`job`, `ins`, `ip`, `instance`, `cls`	Minimum recovery ending location
pg_recovery_min_timeline	counter	`job`, `ins`, `ip`, `instance`, `cls`	Min recovery ending loc’s timeline
pg_recovery_prefetch_block_distance	gauge	`job`, `ins`, `ip`, `instance`, `cls`	How many blocks ahead the prefetcher is looking
pg_recovery_prefetch_hit	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of blocks not prefetched because they were already in the buffer pool
pg_recovery_prefetch_io_depth	gauge	`job`, `ins`, `ip`, `instance`, `cls`	How many prefetches have been initiated but are not yet known to have completed
pg_recovery_prefetch_prefetch	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of blocks prefetched because they were not in the buffer pool
pg_recovery_prefetch_reset_time	counter	`job`, `ins`, `ip`, `instance`, `cls`	Time at which these recovery prefetch statistics were last reset
pg_recovery_prefetch_skip_fpw	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of blocks not prefetched because a full page image was included in the WAL
pg_recovery_prefetch_skip_init	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of blocks not prefetched because they would be zero-initialized
pg_recovery_prefetch_skip_new	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of blocks not prefetched because they didn’t exist yet
pg_recovery_prefetch_skip_rep	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of blocks not prefetched because they were already recently prefetched
pg_recovery_prefetch_wal_distance	gauge	`job`, `ins`, `ip`, `instance`, `cls`	How many bytes ahead the prefetcher is looking
pg_recovery_require_record	gauge	`job`, `ins`, `ip`, `instance`, `cls`	End-of-backup record required
pg_recv_flush_lsn	counter	`state`, `slot_name`, `job`, `ins`, `ip`, `instance`, `cls`, `sender_host`, `sender_port`	Last write-ahead log location already received and flushed to disk
pg_recv_flush_tli	counter	`state`, `slot_name`, `job`, `ins`, `ip`, `instance`, `cls`, `sender_host`, `sender_port`	Timeline number of last write-ahead log location received and flushed to disk
pg_recv_init_lsn	counter	`state`, `slot_name`, `job`, `ins`, `ip`, `instance`, `cls`, `sender_host`, `sender_port`	First write-ahead log location used when WAL receiver is started
pg_recv_init_tli	counter	`state`, `slot_name`, `job`, `ins`, `ip`, `instance`, `cls`, `sender_host`, `sender_port`	First timeline number used when WAL receiver is started
pg_recv_msg_recv_time	gauge	`state`, `slot_name`, `job`, `ins`, `ip`, `instance`, `cls`, `sender_host`, `sender_port`	Receipt time of last message received from origin WAL sender
pg_recv_msg_send_time	gauge	`state`, `slot_name`, `job`, `ins`, `ip`, `instance`, `cls`, `sender_host`, `sender_port`	Send time of last message received from origin WAL sender
pg_recv_pid	gauge	`state`, `slot_name`, `job`, `ins`, `ip`, `instance`, `cls`, `sender_host`, `sender_port`	Process ID of the WAL receiver process
pg_recv_reported_lsn	counter	`state`, `slot_name`, `job`, `ins`, `ip`, `instance`, `cls`, `sender_host`, `sender_port`	Last write-ahead log location reported to origin WAL sender
pg_recv_reported_time	gauge	`state`, `slot_name`, `job`, `ins`, `ip`, `instance`, `cls`, `sender_host`, `sender_port`	Time of last write-ahead log location reported to origin WAL sender
pg_recv_time	gauge	`state`, `slot_name`, `job`, `ins`, `ip`, `instance`, `cls`, `sender_host`, `sender_port`	Time of current snapshot
pg_recv_write_lsn	counter	`state`, `slot_name`, `job`, `ins`, `ip`, `instance`, `cls`, `sender_host`, `sender_port`	Last write-ahead log location already received and written to disk, but not flushed.
pg_relkind_count	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`, `relkind`	Number of relations of corresponding relkind
pg_repl_backend_xmin	counter	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	This standby’s xmin horizon reported by hot_standby_feedback.
pg_repl_client_port	gauge	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	TCP port number that the client is using for communication with this WAL sender, or -1 if a Unix socket is used
pg_repl_flush_diff	gauge	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Last log position flushed to disk by this standby server diff with current lsn
pg_repl_flush_lag	gauge	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Time elapsed between flushing recent WAL locally and receiving notification that this standby server has written and flushed it
pg_repl_flush_lsn	counter	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Last write-ahead log location flushed to disk by this standby server
pg_repl_launch_time	counter	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Time when this process was started, i.e., when the client connected to this WAL sender
pg_repl_lsn	counter	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Current log position on this server
pg_repl_replay_diff	gauge	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Last log position replayed into the database on this standby server diff with current lsn
pg_repl_replay_lag	gauge	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Time elapsed between flushing recent WAL locally and receiving notification that this standby server has written, flushed and applied it
pg_repl_replay_lsn	counter	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Last write-ahead log location replayed into the database on this standby server
pg_repl_reply_time	gauge	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Send time of last reply message received from standby server
pg_repl_sent_diff	gauge	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Last log position sent to this standby server diff with current lsn
pg_repl_sent_lsn	counter	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Last write-ahead log location sent on this connection
pg_repl_state	gauge	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Encoded current WAL sender state: 0-4 for streaming, startup, catchup, backup, stopping
pg_repl_sync_priority	gauge	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Priority of this standby server for being chosen as the synchronous standby
pg_repl_sync_state	gauge	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Encoded synchronous state of this standby server: 0-3 for async, potential, sync, quorum
pg_repl_time	counter	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Current timestamp in unix epoch
pg_repl_write_diff	gauge	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Last log position written to disk by this standby server diff with current lsn
pg_repl_write_lag	gauge	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Time elapsed between flushing recent WAL locally and receiving notification that this standby server has written it
pg_repl_write_lsn	counter	`pid`, `usename`, `address`, `job`, `ins`, `appname`, `ip`, `instance`, `cls`	Last write-ahead log location written to disk by this standby server
pg_replay_lsn	counter	`job`, `ins`, `ip`, `instance`, `cls`	replica only, location of wal applied
pg_seq_blks_hit	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`, `seqname`	Number of buffer hits in this sequence
pg_seq_blks_read	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`, `seqname`	Number of disk blocks read from this sequence
pg_seq_last_value	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`, `seqname`	The last sequence value written to disk
pg_setting_block_size	gauge	`job`, `ins`, `ip`, `instance`, `cls`	pg page block size, 8192 by default
pg_setting_data_checksums	gauge	`job`, `ins`, `ip`, `instance`, `cls`	whether data checksum is enabled, 1 enabled 0 disabled
pg_setting_max_connections	gauge	`job`, `ins`, `ip`, `instance`, `cls`	number of concurrent connections to the database server
pg_setting_max_locks_per_transaction	gauge	`job`, `ins`, `ip`, `instance`, `cls`	no more than this many distinct objects can be locked at any one time
pg_setting_max_prepared_transactions	gauge	`job`, `ins`, `ip`, `instance`, `cls`	maximum number of transactions that can be in the prepared state simultaneously
pg_setting_max_replication_slots	gauge	`job`, `ins`, `ip`, `instance`, `cls`	maximum number of replication slots
pg_setting_max_wal_senders	gauge	`job`, `ins`, `ip`, `instance`, `cls`	maximum number of concurrent connections from standby servers
pg_setting_max_worker_processes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	maximum number of background processes that the system can support
pg_setting_wal_log_hints	gauge	`job`, `ins`, `ip`, `instance`, `cls`	whether wal_log_hints is enabled, 1 enabled 0 disabled
pg_size_bytes	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	File size in bytes
pg_slot_active	gauge	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	True(1) if this slot is currently actively being used
pg_slot_catalog_xmin	counter	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	The oldest transaction affecting the system catalogs that this slot needs the database to retain.
pg_slot_confirm_lsn	counter	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	The address (LSN) up to which the logical slot’s consumer has confirmed receiving data.
pg_slot_reset_time	counter	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	When statistics were last reset
pg_slot_restart_lsn	counter	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	The address (LSN) of oldest WAL which still might be required by the consumer of this slot
pg_slot_retained_bytes	gauge	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	Number of bytes retained for this slot
pg_slot_safe_wal_size	gauge	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	Number of bytes that can be written to WAL without putting this slot into the lost state
pg_slot_spill_bytes	counter	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	Bytes spilled to disk because logical decoding memory was exceeded
pg_slot_spill_count	counter	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	Number of times xacts were spilled to disk because logical decoding memory was exceeded (a xact can be spilled multiple times)
pg_slot_spill_txns	counter	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	Xacts spilled to disk because logical decoding memory was exceeded (subtransactions included)
pg_slot_stream_bytes	counter	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	Bytes streamed to the decoding output plugin after logical decoding memory was exceeded
pg_slot_stream_count	counter	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	Number of times xacts were streamed to the decoding output plugin after logical decoding memory was exceeded (a xact can be streamed multiple times)
pg_slot_stream_txns	counter	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	Xacts streamed to the decoding output plugin after logical decoding memory was exceeded
pg_slot_temporary	gauge	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	True(1) if this is a temporary replication slot.
pg_slot_total_bytes	counter	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	Number of decoded bytes sent to the decoding output plugin for this slot
pg_slot_total_txns	counter	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	Number of decoded xacts sent to the decoding output plugin for this slot
pg_slot_wal_status	gauge	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	WAL reserve status: 0-3 for reserved, extended, unreserved, lost; -1 means other
pg_slot_xmin	counter	`slot_name`, `job`, `ins`, `ip`, `instance`, `cls`	The oldest transaction that this slot needs the database to retain.
pg_slru_blks_exists	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of blocks checked for existence for this SLRU
pg_slru_blks_hit	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of times disk blocks were found already in the SLRU, so that a read was not necessary
pg_slru_blks_read	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of disk blocks read for this SLRU
pg_slru_blks_written	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of disk blocks written for this SLRU
pg_slru_blks_zeroed	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of blocks zeroed during initializations
pg_slru_flushes	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of flushes of dirty data for this SLRU
pg_slru_reset_time	counter	`job`, `ins`, `ip`, `instance`, `cls`	Time at which these statistics were last reset
pg_slru_truncates	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of truncates for this SLRU
pg_ssl_disabled	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of client connections that do not use SSL
pg_ssl_enabled	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of client connections that use SSL
pg_sync_standby_enabled	gauge	`job`, `ins`, `ip`, `names`, `instance`, `cls`	Synchronous commit enabled, 1 if enabled, 0 if disabled
pg_table_age	gauge	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Age of this table in vacuum cycles
pg_table_analyze_count	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of times this table has been manually analyzed
pg_table_autoanalyze_count	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of times this table has been analyzed by the autovacuum daemon
pg_table_autovacuum_count	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of times this table has been vacuumed by the autovacuum daemon
pg_table_frozenxid	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	All txid before this have been frozen on this table
pg_table_heap_blks_hit	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of buffer hits in this table
pg_table_heap_blks_read	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of disk blocks read from this table
pg_table_idx_blks_hit	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of buffer hits in all indexes on this table
pg_table_idx_blks_read	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of disk blocks read from all indexes on this table
pg_table_idx_scan	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of index scans initiated on this table
pg_table_idx_tup_fetch	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of live rows fetched by index scans
pg_table_kind	gauge	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Relation kind r/table/114
pg_table_n_dead_tup	gauge	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Estimated number of dead rows
pg_table_n_ins_since_vacuum	gauge	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Estimated number of rows inserted since this table was last vacuumed
pg_table_n_live_tup	gauge	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Estimated number of live rows
pg_table_n_mod_since_analyze	gauge	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Estimated number of rows modified since this table was last analyzed
pg_table_n_tup_del	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of rows deleted
pg_table_n_tup_hot_upd	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of rows HOT updated (i.e with no separate index update required)
pg_table_n_tup_ins	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of rows inserted
pg_table_n_tup_mod	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of rows modified (insert + update + delete)
pg_table_n_tup_newpage_upd	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of rows updated where the successor version goes onto a new heap page
pg_table_n_tup_upd	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of rows updated (includes HOT updated rows)
pg_table_ncols	gauge	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of columns in the table
pg_table_pages	gauge	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Size of the on-disk representation of this table in pages
pg_table_relid	gauge	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Relation oid of this table
pg_table_seq_scan	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of sequential scans initiated on this table
pg_table_seq_tup_read	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of live rows fetched by sequential scans
pg_table_size_bytes	gauge	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Total bytes of this table (including toast, index, toast index)
pg_table_size_indexsize	gauge	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Bytes of all related indexes of this table
pg_table_size_relsize	gauge	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Bytes of this table itself (main, vm, fsm)
pg_table_size_toastsize	gauge	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Bytes of toast tables of this table
pg_table_tbl_scan	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of scans initiated on this table
pg_table_tup_read	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of live rows fetched by scans
pg_table_tuples	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Estimated number of tuples in this table
pg_table_vacuum_count	counter	`datname`, `relname`, `job`, `ins`, `ip`, `instance`, `cls`	Number of times this table has been manually vacuumed (not counting VACUUM FULL)
pg_timestamp	gauge	`job`, `ins`, `ip`, `instance`, `cls`	database current timestamp
pg_up	gauge	`job`, `ins`, `ip`, `instance`, `cls`	last scrape was able to connect to the server: 1 for yes, 0 for no
pg_uptime	gauge	`job`, `ins`, `ip`, `instance`, `cls`	seconds since postmaster start
pg_version	gauge	`job`, `ins`, `ip`, `instance`, `cls`	server version number
pg_wait_count	gauge	`datname`, `job`, `ins`, `event`, `ip`, `instance`, `cls`	Count of WaitEvent on target database
pg_wal_buffers_full	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of times WAL data was written to disk because WAL buffers became full
pg_wal_bytes	counter	`job`, `ins`, `ip`, `instance`, `cls`	Total amount of WAL generated in bytes
pg_wal_fpi	counter	`job`, `ins`, `ip`, `instance`, `cls`	Total number of WAL full page images generated
pg_wal_records	counter	`job`, `ins`, `ip`, `instance`, `cls`	Total number of WAL records generated
pg_wal_reset_time	counter	`job`, `ins`, `ip`, `instance`, `cls`	When statistics were last reset
pg_wal_sync	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of times WAL files were synced to disk via issue_xlog_fsync request
pg_wal_sync_time	counter	`job`, `ins`, `ip`, `instance`, `cls`	Total amount of time spent syncing WAL files to disk via issue_xlog_fsync request, in seconds
pg_wal_write	counter	`job`, `ins`, `ip`, `instance`, `cls`	Number of times WAL buffers were written out to disk via XLogWrite request.
pg_wal_write_time	counter	`job`, `ins`, `ip`, `instance`, `cls`	Total amount of time spent writing WAL buffers to disk via XLogWrite request in seconds
pg_write_lsn	counter	`job`, `ins`, `ip`, `instance`, `cls`	primary only, location of current wal writing
pg_xact_xmax	counter	`job`, `ins`, `ip`, `instance`, `cls`	First as-yet-unassigned txid. txid >= this are invisible.
pg_xact_xmin	counter	`job`, `ins`, `ip`, `instance`, `cls`	Earliest txid that is still active
pg_xact_xnum	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Current active transaction count
pgbouncer:cls:load1	Unknown	`job`, `cls`	N/A
pgbouncer:cls:load15	Unknown	`job`, `cls`	N/A
pgbouncer:cls:load5	Unknown	`job`, `cls`	N/A
pgbouncer:db:conn_usage	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	N/A
pgbouncer:db:conn_usage_reserve	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	N/A
pgbouncer:db:pool_current_conn	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	N/A
pgbouncer:db:pool_disabled	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	N/A
pgbouncer:db:pool_max_conn	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	N/A
pgbouncer:db:pool_paused	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	N/A
pgbouncer:db:pool_reserve_size	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	N/A
pgbouncer:db:pool_size	Unknown	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	N/A
pgbouncer:ins:free_clients	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pgbouncer:ins:free_servers	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pgbouncer:ins:load1	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pgbouncer:ins:load15	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pgbouncer:ins:load5	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pgbouncer:ins:login_clients	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pgbouncer:ins:pool_databases	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pgbouncer:ins:pool_users	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pgbouncer:ins:pools	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pgbouncer:ins:used_clients	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pgbouncer_database_current_connections	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	Current number of connections for this database
pgbouncer_database_disabled	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	True(1) if this database is currently disabled, else 0
pgbouncer_database_max_connections	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	Maximum number of allowed connections for this database
pgbouncer_database_min_pool_size	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	Minimum number of server connections
pgbouncer_database_paused	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	True(1) if this database is currently paused, else 0
pgbouncer_database_pool_size	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	Maximum number of server connections
pgbouncer_database_reserve_pool	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `host`, `cls`, `real_datname`, `port`	Maximum number of additional connections for this database
pgbouncer_exporter_agent_up	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
pgbouncer_exporter_last_scrape_time	gauge	`job`, `ins`, `ip`, `instance`, `cls`	seconds exporter spent on scraping
pgbouncer_exporter_query_cache_ttl	gauge	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	time to live of query cache
pgbouncer_exporter_query_scrape_duration	gauge	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	seconds query spent on scraping
pgbouncer_exporter_query_scrape_error_count	gauge	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	times the query failed
pgbouncer_exporter_query_scrape_hit_count	gauge	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	numbers been scraped from this query
pgbouncer_exporter_query_scrape_metric_count	gauge	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	number of metrics scraped from this query
pgbouncer_exporter_query_scrape_total_count	gauge	`datname`, `query`, `job`, `ins`, `ip`, `instance`, `cls`	times exporter server was scraped for metrics
pgbouncer_exporter_scrape_duration	gauge	`job`, `ins`, `ip`, `instance`, `cls`	seconds exporter spent on scraping
pgbouncer_exporter_scrape_error_count	counter	`job`, `ins`, `ip`, `instance`, `cls`	times exporter was scraped for metrics and failed
pgbouncer_exporter_scrape_total_count	counter	`job`, `ins`, `ip`, `instance`, `cls`	times exporter was scraped for metrics
pgbouncer_exporter_server_scrape_duration	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	seconds exporter server spent on scraping
pgbouncer_exporter_server_scrape_total_count	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	times exporter server was scraped for metrics
pgbouncer_exporter_server_scrape_total_seconds	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	seconds exporter server spent on scraping
pgbouncer_exporter_up	gauge	`job`, `ins`, `ip`, `instance`, `cls`	always 1 if you can retrieve metrics
pgbouncer_exporter_uptime	gauge	`job`, `ins`, `ip`, `instance`, `cls`	seconds since exporter primary server initialized
pgbouncer_in_recovery	gauge	`job`, `ins`, `ip`, `instance`, `cls`	server is in recovery mode? 1 for yes 0 for no
pgbouncer_list_items	gauge	`job`, `ins`, `ip`, `instance`, `list`, `cls`	Number of corresponding pgbouncer object
pgbouncer_pool_active_cancel_clients	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `user`, `cls`, `pool_mode`	Client connections that have forwarded query cancellations to the server and are waiting for the server response.
pgbouncer_pool_active_cancel_servers	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `user`, `cls`, `pool_mode`	Server connections that are currently forwarding a cancel request
pgbouncer_pool_active_clients	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `user`, `cls`, `pool_mode`	Client connections that are linked to server connection and can process queries
pgbouncer_pool_active_servers	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `user`, `cls`, `pool_mode`	Server connections that are linked to a client
pgbouncer_pool_cancel_clients	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `user`, `cls`, `pool_mode`	Client connections that have not forwarded query cancellations to the server yet.
pgbouncer_pool_cancel_servers	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `user`, `cls`, `pool_mode`	cancel requests have completed that were sent to cancel a query on this server
pgbouncer_pool_idle_servers	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `user`, `cls`, `pool_mode`	Server connections that are unused and immediately usable for client queries
pgbouncer_pool_login_servers	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `user`, `cls`, `pool_mode`	Server connections currently in the process of logging in
pgbouncer_pool_maxwait	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `user`, `cls`, `pool_mode`	How long the first(oldest) client in the queue has waited, in seconds, key metric
pgbouncer_pool_maxwait_us	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `user`, `cls`, `pool_mode`	Microsecond part of the maximum waiting time.
pgbouncer_pool_tested_servers	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `user`, `cls`, `pool_mode`	Server connections that are currently running reset or check query
pgbouncer_pool_used_servers	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `user`, `cls`, `pool_mode`	Server connections that have been idle for more than server_check_delay (means have to run check query)
pgbouncer_pool_waiting_clients	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `user`, `cls`, `pool_mode`	Client connections that have sent queries but have not yet got a server connection
pgbouncer_stat_avg_query_count	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Average queries per second in last stat period
pgbouncer_stat_avg_query_time	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Average query duration, in seconds
pgbouncer_stat_avg_recv	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Average received (from clients) bytes per second
pgbouncer_stat_avg_sent	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Average sent (to clients) bytes per second
pgbouncer_stat_avg_wait_time	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Time spent by clients waiting for a server, in seconds (average per second).
pgbouncer_stat_avg_xact_count	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Average transactions per second in last stat period
pgbouncer_stat_avg_xact_time	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Average transaction duration, in seconds
pgbouncer_stat_total_query_count	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Total number of SQL queries pooled by pgbouncer
pgbouncer_stat_total_query_time	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Total number of seconds spent when executing queries
pgbouncer_stat_total_received	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Total volume in bytes of network traffic received by pgbouncer
pgbouncer_stat_total_sent	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Total volume in bytes of network traffic sent by pgbouncer
pgbouncer_stat_total_wait_time	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Time spent by clients waiting for a server, in seconds
pgbouncer_stat_total_xact_count	gauge	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Total number of SQL transactions pooled by pgbouncer
pgbouncer_stat_total_xact_time	counter	`datname`, `job`, `ins`, `ip`, `instance`, `cls`	Total number of seconds spent when in a transaction
pgbouncer_up	gauge	`job`, `ins`, `ip`, `instance`, `cls`	last scrape was able to connect to the server: 1 for yes, 0 for no
pgbouncer_version	gauge	`job`, `ins`, `ip`, `instance`, `cls`	server version number
process_cpu_seconds_total	counter	`job`, `ins`, `ip`, `instance`, `cls`	Total user and system CPU time spent in seconds.
process_max_fds	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Maximum number of open file descriptors.
process_open_fds	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Number of open file descriptors.
process_resident_memory_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Resident memory size in bytes.
process_start_time_seconds	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Virtual memory size in bytes.
process_virtual_memory_max_bytes	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Maximum amount of virtual memory available in bytes.
promhttp_metric_handler_requests_in_flight	gauge	`job`, `ins`, `ip`, `instance`, `cls`	Current number of scrapes being served.
promhttp_metric_handler_requests_total	counter	`code`, `job`, `ins`, `ip`, `instance`, `cls`	Total number of scrapes by HTTP status code.
scrape_duration_seconds	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
scrape_samples_post_metric_relabeling	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
scrape_samples_scraped	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
scrape_series_added	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A
up	Unknown	`job`, `ins`, `ip`, `instance`, `cls`	N/A

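6.18 - FAQ

Frequently asked questions about PostgreSQL

PGSQL init failed: ABORT due to install on unmanaged node

This error occurs when you install the PGSQL module directly on a node that Pigsty does not yet manage. (The check: /etc/pki/ca.crt does not exist. This file is the Pigsty self-signed CA that the NODE module adds.)

To fix it, first bring the node or node cluster under management with ./node.yml, then install the PGSQL module.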
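
The check described above can be reproduced by hand before running the playbook. The following is a minimal sketch: the path /etc/pki/ca.crt comes from the text, while the function name `is_managed_node` is hypothetical and not part of Pigsty.

```shell
#!/usr/bin/env bash
# Sketch: guess whether a node is managed by Pigsty by testing for the CA file.
# is_managed_node is a hypothetical helper, not a Pigsty command.
is_managed_node() {
  local ca_file="${1:-/etc/pki/ca.crt}"   # path can be overridden for testing
  [ -f "$ca_file" ]
}

if is_managed_node; then
  echo "node looks managed: PGSQL can be installed"
else
  echo "node looks unmanaged: run ./node.yml first"
fi
```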

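PGSQL init failed: ABORT due to postgres exists

This means the PostgreSQL instance being initialized already exists. Set pg_clean to true and pg_safeguard to false to force pgsql.yml to clean up the existing instance during execution.

If pg_clean is true (and pg_safeguard is false), the pgsql.yml playbook removes the existing pgsql data and re-initializes the instance, which makes the playbook truly idempotent.

You can also force a purge of existing PostgreSQL data with the special task tag pg_purge. This task ignores the pg_clean and pg_safeguard settings, so it is very dangerous.

./pgsql.yml -t pg_clean      # honors pg_clean and pg_safeguard
./pgsql.yml -t pg_purge      # ignores pg_clean and pg_safeguard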
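
The interaction of the two switches can be summarized as a small decision function. This is only a sketch of the behavior described above, not the playbook's actual code: the function name `pg_init_action` is hypothetical, and checking pg_safeguard before pg_clean is an assumption.

```shell
#!/usr/bin/env bash
# pg_init_action <exists> <pg_clean> <pg_safeguard>   (each argument: true|false)
# Hypothetical helper that mirrors the rules in the text; the order of the
# two abort checks is an assumption.
pg_init_action() {
  local exists="$1" clean="$2" safeguard="$3"
  if [ "$exists" != "true" ]; then
    echo "init"                          # nothing to clean, plain initialization
  elif [ "$safeguard" = "true" ]; then
    echo "abort: pg_safeguard enabled"   # the safeguard always wins
  elif [ "$clean" = "true" ]; then
    echo "purge and re-init"             # pg_clean=true and pg_safeguard=false
  else
    echo "abort: postgres exists"        # instance exists and pg_clean=false
  fi
}

pg_init_action true true false
```

The pg_purge tag sits outside this table: according to the text it removes data regardless of both settings.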

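PGSQL init failed: ABORT due to pg_safeguard enabled

This means the PostgreSQL instance you are about to clean up has the safeguard against accidental deletion turned on. Disable pg_safeguard to remove the Postgres instance.

While pg_safeguard is enabled, you cannot remove a running PGSQL instance with bin/pgsql-rm or the pgsql-rm.yml playbook.

To disable pg_safeguard, set pg_safeguard to false in the configuration inventory, or pass the command-line argument -e pg_safeguard=false when running the playbook.

./pgsql-rm.yml -e pg_safeguard=false -l <cls_to_remove>    # force override of pg_safeguard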


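PGSQL init failed: Fail to wait for postgres/patroni primary

This error can have several causes. Check the Ansible, Systemd, Patroni, and PostgreSQL logs to find the real one.

Possibility 1: The cluster is misconfigured. Find the faulty configuration entry, fix it, and apply the change.
Possibility 2: A cluster with the same name exists in the deployment, or the primary of a previous cluster with the same name was removed improperly.
Possibility 3: Stale metadata from a cluster with the same name remains in the DCS because decommissioning did not finish properly. You can delete the leftovers manually with etcdctl del --prefix /pg/<cls> (be careful).
Possibility 4: The PostgreSQL or node-related RPM packages were not installed successfully.
Possibility 5: The Watchdog kernel module is not enabled or loaded correctly.
Possibility 6: The locale specified for database initialization does not exist (for example, en_US.UTF8 is used, but the English language pack or locale support is not installed).
If you run into another cause, feel free to submit an Issue or ask the community for help.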
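
Because etcdctl del --prefix is destructive, it may help to build the key prefix with a guard and to list the keys before deleting them. The sketch below assumes that cluster metadata lives under /pg/<cls>/ (note the trailing slash, which keeps a prefix such as /pg/pg-test from also matching /pg/pg-test2). The helper name `pg_dcs_prefix` is hypothetical.

```shell
#!/usr/bin/env bash
# pg_dcs_prefix <cls> [namespace]: print the etcd key prefix for one cluster.
# Hypothetical helper; the /pg namespace and trailing slash are assumptions.
pg_dcs_prefix() {
  local cls="$1" ns="${2:-/pg}"
  if [ -z "$cls" ]; then
    echo "cluster name is required" >&2   # an empty name would target every cluster
    return 1
  fi
  echo "${ns}/${cls}/"
}

pg_dcs_prefix pg-test
# Inspect first, and delete only if the listing is what you expect:
#   etcdctl get --prefix --keys-only "$(pg_dcs_prefix pg-test)"
#   etcdctl del --prefix "$(pg_dcs_prefix pg-test)"
```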

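PGSQL init failed: Fail to wait for postgres/patroni replica

There are several possible causes:

Immediate failure: usually caused by misconfiguration, network problems, corrupted DCS metadata, and the like. Check /pg/log to find the actual cause.

Failure after a while: this may be caused by corrupted data on the source instance. See the PGSQL FAQ entry on creating a replica when data is corrupted.

Timeout after a long time: if the wait for postgres replica task runs for 30 minutes or longer and then fails with a timeout, this is common for large clusters (for example, 1TB+, where creating a replica can take hours).

In this case, the underlying replica creation is still in progress. Use pg list <cls> to check the cluster status and wait for the replica to catch up with the primary. Then run the following command to resume the remaining tasks and complete the replica initialization: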

./pgsql.yml -t pg_hba,pg_reload,pg_backup,pgbouncer,pg_vip,pg_dns,pg_service,pg_exporter,pg_register -l <problematic_replica>

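How to install other PostgreSQL major versions: 12 - 15, and 16 beta

To install PostgreSQL 12 - 15, set pg_version to 12, 13, 14, or 15 in the configuration inventory. This parameter is usually configured at the cluster level. The same parameter selects 16 beta, as in this example:

pg_version: 16                    # install pg 16 in this template
pg_libs: 'pg_stat_statements, auto_explain' # timescaledb removed: not available for pg 16 beta
pg_extensions: []                 # pg16 extensions are missing for now

The prod.yml 43-node production simulation template includes examples of installing clusters with major versions 12 - 16.

For details, see PGSQL Configuration: Switching Major Versions.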

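How to enable HugePage for PostgreSQL?

Use node_hugepage_count and node_hugepage_ratio, or /pg/bin/pg-tune-hugepage.

If you plan to enable huge pages (HugePage), consider using node_hugepage_count and node_hugepage_ratio, and apply them with ./node.yml -t node_tune.

Huge pages have pros and cons for a database. The upside is that the memory is managed exclusively and cannot be taken by other uses, which lowers the risk of the database being OOM-killed. The downside is a possible negative performance impact in some scenarios.

Before PostgreSQL starts, you need to allocate enough huge pages. Any surplus can be reclaimed with the pg-tune-hugepage script, although the script works only with PostgreSQL 15+.

If your PostgreSQL is already running, you can enable huge pages as follows (PG15+ only):

sync; echo 3 > /proc/sys/vm/drop_caches   # flush dirty pages and drop the system cache (expect a hit to database performance)
sudo /pg/bin/pg-tune-hugepage             # write nr_hugepages to /etc/sysctl.d/hugepage.conf
pg restart <cls>                          # restart postgres to use hugepage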
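
To get a feel for what "enough" huge pages means, the required page count is the shared memory size divided by the huge page size, rounded up. The sketch below shows only this arithmetic and is not the logic of pg-tune-hugepage. The function name `hugepages_needed` is hypothetical, and the default 2048 kB page size is an assumption (check Hugepagesize in /proc/meminfo). PostgreSQL 15 added the read-only parameter shared_memory_size_in_huge_pages, which reports this number directly and may be why the script requires 15+, though that link is an assumption.

```shell
#!/usr/bin/env bash
# hugepages_needed <shared_memory_bytes> [hugepage_size_kb]
# Hypothetical helper: ceil(shared memory / huge page size).
hugepages_needed() {
  local bytes="$1" page_kb="${2:-2048}"     # 2048 kB is a common default, assumed here
  local page_bytes=$(( page_kb * 1024 ))
  echo $(( (bytes + page_bytes - 1) / page_bytes ))   # integer division, rounded up
}

# 8 GiB of shared memory with 2 MiB pages
hugepages_needed $(( 8 * 1024 * 1024 * 1024 ))
```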

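How to make sure no data is lost during failover?

Use the crit.yml parameter template, set pg_rpo to 0, or configure the cluster for synchronous commit mode.

Consider using synchronous standbys and quorum commit to guarantee zero data loss during failover.

For more details, see the Availability part of Security Considerations.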

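How to rescue a node when the disk is full?

If the disk is so full that even shell commands fail, rm -rf /pg/dummy can free some emergency space.

By default, pg_dummy_filesize is set to 64MB. In production, it is recommended to increase it to 8GB or more.

The file is placed at /pg/dummy on the PGSQL main data disk. Deleting it frees some emergency space: at least enough to run a few shell scripts on that node and reclaim more space.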
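
The idea behind the placeholder file can be illustrated with a few lines of shell. This is a sketch of the technique, not how Pigsty itself creates /pg/dummy; the function name `make_dummy` is hypothetical. It writes real zero blocks with dd, because a sparse file (for example one made with truncate) would not reserve any space on most filesystems.

```shell
#!/usr/bin/env bash
# make_dummy <path> <size_mb>: write a fully allocated placeholder file.
# Hypothetical helper for illustration only.
make_dummy() {
  local path="$1" size_mb="$2"
  dd if=/dev/zero of="$path" bs=1048576 count="$size_mb" 2>/dev/null
}

# Demo in a scratch directory rather than on /pg/dummy
demo_dir="$(mktemp -d)"
make_dummy "$demo_dir/dummy" 4     # reserve 4 MiB
wc -c < "$demo_dir/dummy"          # size of the placeholder in bytes
rm -f "$demo_dir/dummy"            # the emergency step: release the space
rmdir "$demo_dir"
```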

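How to create a replica when cluster data is already corrupted?

Pigsty sets the clonefrom: true tag in the patroni configuration of every instance, marking the instance as usable for creating replicas.

If an instance has corrupted data files that cause new replica creation to fail, you can set clonefrom: false to avoid pulling data from the corrupted instance. The procedure is as follows: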

$ vi /pg/bin/patroni.yml

tags:
  nofailover: false
  clonefrom: true      # ----------> change to false
  noloadbalance: false
  nosync: false
  version:  '15'
  spec: '4C.8G.50G'
  conf: 'oltp.yml'
  
$ systemctl reload patroni    # reload the Patroni configuration
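
If you prefer a scripted change over editing the file by hand, the tag can be flipped with sed. The following is a sketch under the assumption that the tag sits on a line of its own, as in the listing above. The function name `set_clonefrom` is hypothetical, and the demo runs against a scratch file rather than the real Patroni configuration.

```shell
#!/usr/bin/env bash
# set_clonefrom <patroni_config> <true|false>
# Hypothetical helper: rewrites only the value after "clonefrom:".
set_clonefrom() {
  local conf="$1" value="$2" tmp rc=0
  tmp="$(mktemp)" || return 1
  # Keep indentation and trailing comments; overwrite via cat to preserve owner and mode
  sed "s/^\([[:space:]]*clonefrom:[[:space:]]*\)[a-z]*/\1${value}/" "$conf" > "$tmp" \
    && cat "$tmp" > "$conf" || rc=1
  rm -f "$tmp"
  return "$rc"
}

# Demo on a scratch file
demo="$(mktemp)"
printf 'tags:\n  nofailover: false\n  clonefrom: true\n' > "$demo"
set_clonefrom "$demo" false
grep clonefrom "$demo"      # shows the rewritten line
rm -f "$demo"
```

On a real node you would point the function at the Patroni configuration file and then reload Patroni as shown above.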

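What is the performance overhead of PostgreSQL monitoring?

Scraping a regular PostgreSQL instance takes about 200ms. The default scrape interval is 10 seconds, so the cost is almost negligible for a multi-core production database instance.

Note that Pigsty enables in-database object monitoring by default, so if your database holds hundreds of thousands of tables or indexes, a scrape may take several seconds.

You can change the Prometheus scrape frequency, but make sure of one thing: the scrape interval should be significantly longer than the duration of a single scrape.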
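
The rule of thumb above can be turned into a quick check against values you read from the scrape_duration_seconds metric. This is a sketch only: the function name `scrape_headroom_ok` is hypothetical, and the default factor of 10 is an illustrative assumption for "significantly longer", not a Pigsty setting.

```shell
#!/usr/bin/env bash
# scrape_headroom_ok <scrape_duration_s> <scrape_interval_s> [min_ratio]
# Succeeds when interval >= min_ratio * duration.
# Hypothetical helper; the default ratio of 10 is an assumption.
scrape_headroom_ok() {
  awk -v d="$1" -v i="$2" -v r="${3:-10}" 'BEGIN { exit !(i >= r * d) }'
}

# 200 ms scrape against the default 10 s interval
if scrape_headroom_ok 0.2 10; then echo "ok"; else echo "too tight"; fi
```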

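How to monitor an existing PostgreSQL instance?

PGSQL Monitor provides detailed instructions for configuring monitoring.

How to manually remove PostgreSQL monitoring targets from Prometheus?

./pgsql-rm.yml -t prometheus -l <cls>     # remove all instances of cluster 'cls' from prometheus

bin/pgmon-rm <ins>     # remove the monitoring target of a single instance 'ins' from Prometheus; especially useful for removing external instances that were added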

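7 - PG Kernel Forks

How to use other PostgreSQL kernel forks in Pigsty, such as Citus, Babelfish/WiltonDB, IvorySQL, and PolarDB.

In Pigsty, you can replace the vanilla PG kernel with different "flavors" of PostgreSQL forks to obtain special features and effects.

7.1 - Citus (Distributive)

Deploy a natively highly available Citus horizontally sharded cluster with Pigsty, scaling PostgreSQL seamlessly across multiple shards and accelerating OLTP/OLAP queries.

Pigsty supports Citus natively. Citus is a distributed horizontal scaling extension built on the vanilla PostgreSQL kernel.

To build a horizontally scalable, highly available PostgreSQL cluster with Citus, see: Citus Tutorial: Deploying a Highly Available Citus Cluster

Installation

Citus is a PostgreSQL extension. It can be installed and enabled on a vanilla PostgreSQL cluster by following the standard extension installation procedure.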

./pgsql.yml -t pg_extension -e '{"pg_extensions":["citus"]}'

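Configuration

To define a citus cluster, you need to specify the following parameters:

pg_mode must be set to citus instead of the default pgsql
The shard name pg_shard and the shard number pg_group must be defined on every shard cluster
patroni_citus_db must be defined to specify the database managed by Patroni.
If you want to run administrative commands as the pg_dbsu user postgres rather than the default pg_admin_username, pg_dbsu_password must be set to a non-empty plaintext password

In addition, extra hba rules are required to allow SSL access from localhost and from the other data nodes.

You can define each Citus cluster as a separate group, just like a standard PostgreSQL cluster, as shown in conf/dbms/citus.yml: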

all:
  children:
    pg-citus0: # citus shard 0
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus0 , pg_group: 0 }
    pg-citus1: # citus shard 1
      hosts: { 10.10.10.11: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus1 , pg_group: 1 }
    pg-citus2: # citus shard 2
      hosts: { 10.10.10.12: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus2 , pg_group: 2 }
    pg-citus3: # citus shard 3
      hosts:
        10.10.10.13: { pg_seq: 1, pg_role: primary }
        10.10.10.14: { pg_seq: 2, pg_role: replica }
      vars: { pg_cluster: pg-citus3 , pg_group: 3 }
  vars:                               # global parameters for all Citus clusters
    pg_mode: citus                    # pgsql cluster mode must be set to: citus
    pg_shard: pg-citus                # citus horizontal shard name: pg-citus
    patroni_citus_db: meta            # citus database name: meta
    pg_dbsu_password: DBUser.Postgres # if you use dbsu, it needs a password
    pg_users: [ { name: dbuser_meta ,password: DBUser.Meta ,pgbouncer: true ,roles: [ dbrole_admin ] } ]
    pg_databases: [ { name: meta ,extensions: [ { name: citus }, { name: postgis }, { name: timescaledb } ] } ]
    pg_hba_rules:
      - { user: 'all' ,db: all  ,addr: 127.0.0.1/32 ,auth: ssl ,title: 'all user ssl access from localhost' }
      - { user: 'all' ,db: all  ,addr: intra        ,auth: ssl ,title: 'all user ssl access from intranet'  }

You can also specify the identity parameters of all Citus cluster members within a single group, as shown in prod.yml:

#==========================================================#
# pg-citus: 10 node citus cluster (5 x primary-replica pair)
#==========================================================#
pg-citus: # citus group
  hosts:
    10.10.10.50: { pg_group: 0, pg_cluster: pg-citus0 ,pg_vip_address: 10.10.10.60/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.51: { pg_group: 0, pg_cluster: pg-citus0 ,pg_vip_address: 10.10.10.60/24 ,pg_seq: 1, pg_role: replica }
    10.10.10.52: { pg_group: 1, pg_cluster: pg-citus1 ,pg_vip_address: 10.10.10.61/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.53: { pg_group: 1, pg_cluster: pg-citus1 ,pg_vip_address: 10.10.10.61/24 ,pg_seq: 1, pg_role: replica }
    10.10.10.54: { pg_group: 2, pg_cluster: pg-citus2 ,pg_vip_address: 10.10.10.62/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.55: { pg_group: 2, pg_cluster: pg-citus2 ,pg_vip_address: 10.10.10.62/24 ,pg_seq: 1, pg_role: replica }
    10.10.10.56: { pg_group: 3, pg_cluster: pg-citus3 ,pg_vip_address: 10.10.10.63/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.57: { pg_group: 3, pg_cluster: pg-citus3 ,pg_vip_address: 10.10.10.63/24 ,pg_seq: 1, pg_role: replica }
    10.10.10.58: { pg_group: 4, pg_cluster: pg-citus4 ,pg_vip_address: 10.10.10.64/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.59: { pg_group: 4, pg_cluster: pg-citus4 ,pg_vip_address: 10.10.10.64/24 ,pg_seq: 1, pg_role: replica }
  vars:
    pg_mode: citus                    # pgsql cluster mode: citus
    pg_shard: pg-citus                # citus shard name: pg-citus
    pg_primary_db: test               # primary database used by citus
    pg_dbsu_password: DBUser.Postgres # all dbsu password access for citus cluster
    pg_vip_enabled: true
    pg_vip_interface: eth1
    pg_extensions: [ 'citus postgis timescaledb pgvector' ]
    pg_libs: 'citus, timescaledb, pg_stat_statements, auto_explain' # citus will be added by patroni automatically
    pg_users: [ { name: test ,password: test ,pgbouncer: true ,roles: [ dbrole_admin ] } ]
    pg_databases: [ { name: test ,owner: test ,extensions: [ { name: citus }, { name: postgis } ] } ]
    pg_hba_rules:
      - { user: 'all' ,db: all  ,addr: 10.10.10.0/24 ,auth: trust ,title: 'trust citus cluster members'        }
      - { user: 'all' ,db: all  ,addr: 127.0.0.1/32  ,auth: ssl   ,title: 'all user ssl access from localhost' }
      - { user: 'all' ,db: all  ,addr: intra         ,auth: ssl   ,title: 'all user ssl access from intranet'  }

Usage

You can access any node just as you would access a regular cluster:

pgbench -i postgres://test:test@pg-citus0/test
pgbench -nv -P1 -T1000 -c 2 postgres://test:test@pg-citus0/test

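By default, changes you make on one shard happen only on that cluster and are not synchronized to the other shards.

If you want writes to be distributed across all shards, use the API functions provided by Citus to mark a table as one of the following:

Distributed table (automatically partitioned; requires a distribution key)
Reference table (fully replicated; no distribution key required)

Starting with Citus 11.2, any Citus database node can act as the coordinator, which means any primary node can accept writes: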

psql -h pg-citus0 -d test -c "SELECT create_distributed_table('pgbench_accounts', 'aid'); SELECT truncate_local_data_after_distributing_table('public.pgbench_accounts');"
psql -h pg-citus0 -d test -c "SELECT create_reference_table('pgbench_branches')         ; SELECT truncate_local_data_after_distributing_table('public.pgbench_branches');"
psql -h pg-citus0 -d test -c "SELECT create_reference_table('pgbench_history')          ; SELECT truncate_local_data_after_distributing_table('public.pgbench_history');"
psql -h pg-citus0 -d test -c "SELECT create_reference_table('pgbench_tellers')          ; SELECT truncate_local_data_after_distributing_table('public.pgbench_tellers');"
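
When several tables need to be distributed, the statements can be generated rather than typed by hand. The sketch below only builds the SQL text using the two Citus functions from the commands above; the helper name `citus_distribute_sql` is hypothetical, and it does not emit the truncate_local_data_after_distributing_table call.

```shell
#!/usr/bin/env bash
# citus_distribute_sql <table> [distribution_column]
# Hypothetical helper: with a column it emits a distributed-table statement,
# without one it emits a reference-table statement.
citus_distribute_sql() {
  local table="$1" column="${2:-}"
  if [ -n "$column" ]; then
    echo "SELECT create_distributed_table('${table}', '${column}');"
  else
    echo "SELECT create_reference_table('${table}');"
  fi
}

citus_distribute_sql pgbench_accounts aid
citus_distribute_sql pgbench_branches
# To apply a statement: citus_distribute_sql pgbench_tellers | psql -h pg-citus0 -d test
```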

Once the tables are distributed, you can access them from other nodes as well:

psql -h pg-citus1 -d test -c '\dt+'

For example, a full table scan shows that the execution plan has become a distributed plan:

vagrant@meta-1:~$ psql -h pg-citus3 -d test -c 'explain select * from pgbench_accounts'
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=100000 width=352)
   Task Count: 32
   Tasks Shown: One of 32
   ->  Task
         Node: host=10.10.10.52 port=5432 dbname=test
         ->  Seq Scan on pgbench_accounts_102008 pgbench_accounts  (cost=0.00..81.66 rows=3066 width=97)
(6 rows)

You can issue writes from several different primary nodes:

pgbench -nv -P1 -T1000 -c 2 postgres://test:test@pg-citus1/test
pgbench -nv -P1 -T1000 -c 2 postgres://test:test@pg-citus2/test
pgbench -nv -P1 -T1000 -c 2 postgres://test:test@pg-citus3/test
pgbench -nv -P1 -T1000 -c 2 postgres://test:test@pg-citus4/test

When a node fails, the native high availability provided by Patroni promotes the standby, which takes over automatically.

test=# select * from  pg_dist_node;
 nodeid | groupid |  nodename   | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster | metadatasynced | shouldhaveshards
--------+---------+-------------+----------+----------+-------------+----------+----------+-------------+----------------+------------------
      1 |       0 | 10.10.10.51 |     5432 | default  | t           | t        | primary  | default     | t              | f
      2 |       2 | 10.10.10.54 |     5432 | default  | t           | t        | primary  | default     | t              | t
      5 |       1 | 10.10.10.52 |     5432 | default  | t           | t        | primary  | default     | t              | t
      3 |       4 | 10.10.10.58 |     5432 | default  | t           | t        | primary  | default     | t              | t
      4 |       3 | 10.10.10.56 |     5432 | default  | t           | t        | primary  | default     | t              | t

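7.2 - Babelfish (MSSQL)

Create Microsoft SQL Server compatible PostgreSQL clusters with WiltonDB and Babelfish! (Wire-protocol-level emulation)

Babelfish is a PostgreSQL-based compatibility solution for MSSQL (Microsoft SQL Server), open-sourced by AWS.

Overview

Pigsty lets users create Microsoft SQL Server compatible PostgreSQL clusters with Babelfish and WiltonDB!

Babelfish: an MSSQL (Microsoft SQL Server) compatibility extension open-sourced by AWS
WiltonDB: a PostgreSQL kernel distribution focused on integrating Babelfish

Babelfish is a PostgreSQL extension, but it works only on a slightly modified fork of the PostgreSQL kernel. WiltonDB provides compiled binaries of the forked kernel and the extension as packages for EL and Ubuntu systems.

Pigsty can use WiltonDB in place of the vanilla PostgreSQL kernel to provide an out-of-the-box MSSQL compatible cluster. Using and managing an MSSQL cluster is no different from a standard PostgreSQL 15 cluster, and you can use every feature Pigsty provides, such as high availability, backup, and monitoring.

WiltonDB ships with several extensions, including Babelfish, but cannot use vanilla PostgreSQL extensions.

After startup, an MSSQL compatible cluster listens on the default PostgreSQL port and also on the default MSSQL port 1433, where it serves MSSQL over the TDS wire protocol. You can connect to the MSSQL service provided by Pigsty with any MSSQL client, such as SQL Server Management Studio or the sqlcmd command-line tool.

Installation

WiltonDB conflicts with the vanilla PostgreSQL kernel, so only one of the two kernels can be installed on a node. Use the following command to install the WiltonDB kernel online.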

./node.yml -t node_install -e '{"node_repo_modules":"local,mssql","node_packages":["wiltondb"]}'

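Note that WiltonDB is available only on EL and Ubuntu systems; Debian is not yet supported.

Pigsty Professional Edition provides an offline installation package for WiltonDB, which allows installing WiltonDB from a local software repository.

Configuration

Pay special attention to the following points when installing and deploying the MSSQL module:

WiltonDB is available on EL (7/8/9) and Ubuntu (20.04/22.04), and is not available on Debian.
WiltonDB is currently built on PostgreSQL 15, so pg_version: 15 must be specified.
On EL systems, the wiltondb binaries are installed in /usr/bin/ by default, while on Ubuntu systems they are installed in /usr/lib/postgresql/15/bin/. Both differ from where the official PostgreSQL binaries are placed.
In WiltonDB compatibility mode, HBA password authentication rules must use md5 rather than scram-sha-256. You therefore need to override Pigsty's default HBA rule set and insert the md5 rule required by SQL Server before the dbrole_readonly wildcard rule.
WiltonDB can be enabled for only one primary database, and one user should be designated as the Babelfish superuser so that Babelfish can create databases and users. The defaults are mssql and dbuser_mssql; if you change them, also update the user in files/mssql.sql.
The WiltonDB TDS wire protocol compatibility extension babelfishpg_tds must be enabled in shared_preload_libraries.
Once enabled, the WiltonDB extension listens on MSSQL port 1433 by default. You can override Pigsty's default service definitions so that the primary and replica services point to port 1433 instead of ports 5432 / 6432.

The following parameters need to be configured for an MSSQL database cluster: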

#----------------------------------#
# PGSQL & MSSQL (Babelfish & Wilton)
#----------------------------------#
# PG Installation
node_repo_modules: local,node,mssql # add mssql and os upstream repos
pg_mode: mssql                      # Microsoft SQL Server Compatible Mode
pg_libs: 'babelfishpg_tds, pg_stat_statements, auto_explain' # add babelfishpg_tds to shared_preload_libraries
pg_version: 15                      # The current WiltonDB major version is 15
pg_packages:
  - wiltondb                        # install forked version of postgresql with babelfishpg support
  - patroni pgbouncer pgbackrest pg_exporter pgbadger vip-manager
pg_extensions: []                   # do not install any vanilla postgresql extensions

# PG Provision
pg_default_hba_rules:               # overwrite default HBA rules for babelfish cluster
- {user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'  }
- {user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
- {user: '${repl}'    ,db: replication ,addr: localhost ,auth: pwd   ,title: 'replicator replication from localhost'}
- {user: '${repl}'    ,db: replication ,addr: intra     ,auth: pwd   ,title: 'replicator replication from intranet' }
- {user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: pwd   ,title: 'replicator postgres db from intranet' }
- {user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
- {user: '${monitor}' ,db: all         ,addr: infra     ,auth: pwd   ,title: 'monitor from infra host with password'}
- {user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'   }
- {user: '${admin}'   ,db: all         ,addr: world     ,auth: ssl   ,title: 'admin @ everywhere with ssl & pwd'    }
- {user: dbuser_mssql ,db: mssql       ,addr: intra     ,auth: md5   ,title: 'allow mssql dbsu intranet access'     } # <--- use md5 auth method for mssql user
- {user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: pwd   ,title: 'pgbouncer read/write via local socket'}
- {user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: pwd   ,title: 'read/write biz user via password'     }
- {user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: pwd   ,title: 'allow etl offline tasks from intranet'}
pg_default_services:                # route primary & replica service to mssql port 1433
- { name: primary ,port: 5433 ,dest: 1433  ,check: /primary   ,selector: "[]" }
- { name: replica ,port: 5434 ,dest: 1433  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }
- { name: default ,port: 5436 ,dest: postgres ,check: /primary   ,selector: "[]" }
- { name: offline ,port: 5438 ,dest: postgres ,check: /replica   ,selector: "[? pg_role == `offline` || pg_offline_query ]" , backup: "[? pg_role == `replica` && !pg_offline_query]"}

You can define the MSSQL business database and business user:

#----------------------------------#
# pgsql (singleton on current node)
#----------------------------------#
# this is an example single-node babelfish cluster with one MSSQL database & one superuser
pg-meta:
  hosts:
    10.10.10.10: { pg_seq: 1, pg_role: primary } # <---- primary instance with read-write capability
  vars:
    pg_cluster: pg-meta
    pg_users:                           # create MSSQL superuser
      - {name: dbuser_mssql ,password: DBUser.MSSQL ,superuser: true, pgbouncer: true ,roles: [dbrole_admin], comment: superuser & owner for babelfish  }
    pg_primary_db: mssql                # use `mssql` as the primary sql server database
    pg_databases:
      - name: mssql
        baseline: mssql.sql             # init babelfish database & user
        extensions:
          - { name: uuid-ossp          }
          - { name: babelfishpg_common }
          - { name: babelfishpg_tsql   }
          - { name: babelfishpg_tds    }
          - { name: babelfishpg_money  }
          - { name: pg_hint_plan       }
          - { name: system_stats       }
          - { name: tds_fdw            }
        owner: dbuser_mssql
        parameters: { 'babelfishpg_tsql.migration_mode' : 'multi-db' }
        comment: babelfish cluster, a MSSQL compatible pg cluster

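Access

You can use any SQL Server-compatible client tool to access this database cluster.

Microsoft provides sqlcmd as its official command-line tool.

Microsoft also provides a Go version of the command-line tool, go-sqlcmd.

Install go-sqlcmd: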

curl -LO https://github.com/microsoft/go-sqlcmd/releases/download/v1.4.0/sqlcmd-v1.4.0-linux-amd64.tar.bz2
tar xjvf sqlcmd-v1.4.0-linux-amd64.tar.bz2
sudo mv sqlcmd* /usr/bin/

Getting started with go-sqlcmd:

$ sqlcmd -S 10.10.10.10,1433 -U dbuser_mssql -P DBUser.MSSQL
1> select @@version
2> go
version                                                                                                                                                                                                                                                         
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Babelfish for PostgreSQL with SQL Server Compatibility - 12.0.2000.8
Oct 22 2023 17:48:32
Copyright (c) Amazon Web Services
PostgreSQL 15.4 (EL 1:15.4.wiltondb3.3_2-2.el8) on x86_64-redhat-linux-gnu (Babelfish 3.3.0)                                        

(1 row affected)

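With the service mechanism provided by Pigsty, ports 5433 / 5434 always route to port 1433 on the primary / replicas.

# Port 5433 on any cluster member routes to MSSQL port 1433 on the primary
sqlcmd -S 10.10.10.11,5433 -U dbuser_mssql -P DBUser.MSSQL

# Port 5434 on any cluster member routes to MSSQL port 1433 on any readable instance
sqlcmd -S 10.10.10.11,5434 -U dbuser_mssql -P DBUser.MSSQL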
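
When such connections are scripted, it can help to keep the service-to-port mapping in one place. The following is a small Python sketch under that assumption; `sqlcmd_args` and `MSSQL_SERVICE_PORTS` are hypothetical names, and the ports and credentials are simply the ones used in this section.

```python
import shlex

# Service ports taken from the pg_default_services definition in this section.
MSSQL_SERVICE_PORTS = {"primary": 5433, "replica": 5434}


def sqlcmd_args(host, service, user="dbuser_mssql", password="DBUser.MSSQL"):
    """Build the sqlcmd argument list for a Pigsty service (hypothetical helper)."""
    port = MSSQL_SERVICE_PORTS[service]
    # sqlcmd expects the server as "host,port"
    return ["sqlcmd", "-S", f"{host},{port}", "-U", user, "-P", password]


# Print the command line that reaches the primary through any cluster member.
print(shlex.join(sqlcmd_args("10.10.10.11", "primary")))
```

The returned list can be passed to `subprocess.run` as is, which avoids shell quoting issues with passwords.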

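Extensions

Most extensions of the PGSQL module (other than pure-SQL ones) cannot be used directly on the WiltonDB kernel of the MSSQL module; they need to be recompiled.

WiltonDB currently ships with the following extensions. Besides the PostgreSQL Contrib extensions and the four core BabelfishPG extensions, it also provides three third-party extensions: pg_hint_plan, tds_fdw, and system_stats.

Extension	Version	Description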
dblink	1.2	connect to other PostgreSQL databases from within a database
adminpack	2.1	administrative functions for PostgreSQL
dict_int	1.0	text search dictionary template for integers
intagg	1.1	integer aggregator and enumerator (obsolete)
dict_xsyn	1.0	text search dictionary template for extended synonym processing
amcheck	1.3	functions for verifying relation integrity
autoinc	1.0	functions for autoincrementing fields
bloom	1.0	bloom access method - signature file based index
fuzzystrmatch	1.1	determine similarities and distance between strings
intarray	1.5	functions, operators, and index support for 1-D arrays of integers
btree_gin	1.3	support for indexing common datatypes in GIN
btree_gist	1.7	support for indexing common datatypes in GiST
hstore	1.8	data type for storing sets of (key, value) pairs
hstore_plperl	1.0	transform between hstore and plperl
isn	1.2	data types for international product numbering standards
hstore_plperlu	1.0	transform between hstore and plperlu
jsonb_plperl	1.0	transform between jsonb and plperl
citext	1.6	data type for case-insensitive character strings
jsonb_plperlu	1.0	transform between jsonb and plperlu
jsonb_plpython3u	1.0	transform between jsonb and plpython3u
cube	1.5	data type for multidimensional cubes
hstore_plpython3u	1.0	transform between hstore and plpython3u
earthdistance	1.1	calculate great-circle distances on the surface of the Earth
lo	1.1	Large Object maintenance
file_fdw	1.0	foreign-data wrapper for flat file access
insert_username	1.0	functions for tracking who changed a table
ltree	1.2	data type for hierarchical tree-like structures
ltree_plpython3u	1.0	transform between ltree and plpython3u
pg_walinspect	1.0	functions to inspect contents of PostgreSQL Write-Ahead Log
moddatetime	1.0	functions for tracking last modification time
old_snapshot	1.0	utilities in support of old_snapshot_threshold
pgcrypto	1.3	cryptographic functions
pgrowlocks	1.2	show row-level locking information
pageinspect	1.11	inspect the contents of database pages at a low level
pg_surgery	1.0	extension to perform surgery on a damaged relation
seg	1.4	data type for representing line segments or floating-point intervals
pgstattuple	1.5	show tuple-level statistics
pg_buffercache	1.3	examine the shared buffer cache
pg_freespacemap	1.2	examine the free space map (FSM)
postgres_fdw	1.1	foreign-data wrapper for remote PostgreSQL servers
pg_prewarm	1.2	prewarm relation data
tcn	1.0	Triggered change notifications
pg_trgm	1.6	text similarity measurement and index searching based on trigrams
xml2	1.1	XPath querying and XSLT
refint	1.0	functions for implementing referential integrity (obsolete)
pg_visibility	1.2	examine the visibility map (VM) and page-level visibility info
pg_stat_statements	1.10	track planning and execution statistics of all SQL statements executed
sslinfo	1.2	information about SSL certificates
tablefunc	1.0	functions that manipulate whole tables, including crosstab
tsm_system_rows	1.0	TABLESAMPLE method which accepts number of rows as a limit
tsm_system_time	1.0	TABLESAMPLE method which accepts time in milliseconds as a limit
unaccent	1.1	text search dictionary that removes accents
uuid-ossp	1.1	generate universally unique identifiers (UUIDs)
plpgsql	1.0	PL/pgSQL procedural language
babelfishpg_money	1.1.0	babelfishpg_money
system_stats	2.0	EnterpriseDB system statistics for PostgreSQL
tds_fdw	2.0.3	Foreign data wrapper for querying a TDS database (Sybase or Microsoft SQL Server)
babelfishpg_common	3.3.3	Transact SQL Datatype Support
babelfishpg_tds	1.0.0	TDS protocol extension
pg_hint_plan	1.5.1
babelfishpg_tsql	3.3.1	Transact SQL compatibility

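Pigsty Professional Edition supports offline installation of the MSSQL compatibility module.
Pigsty Professional Edition also offers an optional porting and customization service for MSSQL-compatible kernel extensions, which can port extensions available in the PGSQL module to MSSQL clusters.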

7.3 - IvorySQL (Oracle)

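Use the IvorySQL kernel, open-sourced by HighGo, to provide Oracle syntax and PL/SQL compatibility on PostgreSQL clusters.

IvorySQL is an open-source "Oracle-compatible" PostgreSQL kernel developed by HighGo and released under the Apache 2.0 license.

Note that Oracle compatibility here covers PL/SQL, syntax, built-in functions, data types, system views, MERGE, and GUC parameters. It is not wire protocol compatibility of the kind offered by Babelfish, openHalo, or FerretDB, where client drivers can stay unchanged. Users therefore still access IvorySQL with PostgreSQL client tools, but can use Oracle-compatible syntax.

The latest IvorySQL release, 4.4, is compatible with the latest PostgreSQL minor release, 17.4, and binary RPM/DEB packages are available for mainstream Linux distributions. Pigsty offers the option to replace vanilla PostgreSQL with the IvorySQL kernel in PG RDS.

Getting Started

Install Pigsty using the standard procedure, with the ivory configuration template: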

curl -fsSL https://repo.pigsty.cc/get | bash; cd ~/pigsty
./bootstrap              # install Pigsty dependencies
./configure -c ivory     # use the IvorySQL configuration template
./install.yml            # run the playbook to deploy

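For production deployments, edit the auto-generated pigsty.yml configuration file and change passwords and other parameters before running ./install.yml.

The latest IvorySQL 4.4 is equivalent to PostgreSQL 17; any client tool compatible with the PostgreSQL wire protocol can access an IvorySQL cluster.

In addition, by default you can connect with a PostgreSQL client on a separate port, 1521, in which case Oracle-compatible mode is used by default.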

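Configuration

To use the IvorySQL kernel in Pigsty, change the following four configuration parameters:

pg_mode: use the ivory compatibility mode
repo_extra_packages: download the ivorysql packages
pg_packages: install the ivorysql packages
pg_libs: load the Oracle syntax compatibility extension

That is all it takes: add these four lines to the global variables in the configuration file, and Pigsty will replace the vanilla PostgreSQL kernel with IvorySQL.

pg_mode: ivory                           # IvorySQL compatibility mode, use the IvorySQL binaries
pg_packages: [ ivorysql, pgsql-common ]  # install ivorysql in place of the pgsql-main kernel
pg_libs: 'liboracle_parser, pg_stat_statements, auto_explain'  # load the Oracle compatibility extension
repo_extra_packages: [ ivorysql ]        # download the ivorysql packages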
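
Since all four settings have to change together, a quick consistency check can catch a forgotten one. The following Python sketch assumes the global variables have already been loaded into a dict (for example from pigsty.yml); `check_ivory_config` is a hypothetical helper and not part of Pigsty.

```python
def check_ivory_config(cfg):
    """Return a list of problems found in a dict of global vars (hypothetical helper)."""
    problems = []
    if cfg.get("pg_mode") != "ivory":
        problems.append("pg_mode should be 'ivory'")
    if "ivorysql" not in cfg.get("pg_packages", []):
        problems.append("pg_packages should include 'ivorysql'")
    if "ivorysql" not in cfg.get("repo_extra_packages", []):
        problems.append("repo_extra_packages should include 'ivorysql'")
    # pg_libs is a comma-separated string, as in the configuration above
    libs = [lib.strip() for lib in cfg.get("pg_libs", "").split(",")]
    if "liboracle_parser" not in libs:
        problems.append("pg_libs should load 'liboracle_parser'")
    return problems


cfg = {
    "pg_mode": "ivory",
    "pg_packages": ["ivorysql", "pgsql-common"],
    "pg_libs": "liboracle_parser, pg_stat_statements, auto_explain",
    "repo_extra_packages": ["ivorysql"],
}
print(check_ivory_config(cfg))                   # []
print(len(check_ivory_config({"pg_mode": "pgsql"})))  # 4
```

An empty list means the four overrides described above are all present.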

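IvorySQL also provides a number of additional GUC parameters, which you can set in pg_parameters.

Extension List

Most extensions of the PGSQL module (other than pure-SQL ones) cannot be used directly on the IvorySQL kernel. If you need them, recompile and install them from source against the new kernel.

The IvorySQL kernel currently ships with the following 109 extensions.

Extensions available in IvorySQL

Extension	Version	Description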
amcheck	1.4	functions for verifying relation integrity
autoinc	1.0	functions for autoincrementing fields
bloom	1.0	bloom access method - signature file based index
bool_plperl	1.0	transform between bool and plperl
bool_plperlu	1.0	transform between bool and plperlu
btree_gin	1.3	support for indexing common datatypes in GIN
btree_gist	1.7	support for indexing common datatypes in GiST
citext	1.6	data type for case-insensitive character strings
cube	1.5	data type for multidimensional cubes
dblink	1.2	connect to other PostgreSQL databases from within a database
dict_int	1.0	text search dictionary template for integers
dict_xsyn	1.0	text search dictionary template for extended synonym processing
dummy_index_am	1.0	dummy_index_am - index access method template
dummy_seclabel	1.0	Test code for SECURITY LABEL feature
earthdistance	1.2	calculate great-circle distances on the surface of the Earth
file_fdw	1.0	foreign-data wrapper for flat file access
fuzzystrmatch	1.2	determine similarities and distance between strings
hstore	1.8	data type for storing sets of (key, value) pairs
hstore_plperl	1.0	transform between hstore and plperl
hstore_plperlu	1.0	transform between hstore and plperlu
hstore_plpython3u	1.0	transform between hstore and plpython3u
injection_points	1.0	Test code for injection points
insert_username	1.0	functions for tracking who changed a table
intagg	1.1	integer aggregator and enumerator (obsolete)
intarray	1.5	functions, operators, and index support for 1-D arrays of integers
isn	1.2	data types for international product numbering standards
ivorysql_ora	1.0	Oracle Compatible extenison on Postgres Database
jsonb_plperl	1.0	transform between jsonb and plperl
jsonb_plperlu	1.0	transform between jsonb and plperlu
jsonb_plpython3u	1.0	transform between jsonb and plpython3u
lo	1.1	Large Object maintenance
ltree	1.3	data type for hierarchical tree-like structures
ltree_plpython3u	1.0	transform between ltree and plpython3u
moddatetime	1.0	functions for tracking last modification time
ora_btree_gin	1.0	support for indexing oracle datatypes in GIN
ora_btree_gist	1.0	support for oracle indexing common datatypes in GiST
pageinspect	1.12	inspect the contents of database pages at a low level
pg_buffercache	1.5	examine the shared buffer cache
pg_freespacemap	1.2	examine the free space map (FSM)
pg_get_functiondef	1.0	Get function’s definition
pg_prewarm	1.2	prewarm relation data
pg_stat_statements	1.11	track planning and execution statistics of all SQL statements executed
pg_surgery	1.0	extension to perform surgery on a damaged relation
pg_trgm	1.6	text similarity measurement and index searching based on trigrams
pg_visibility	1.2	examine the visibility map (VM) and page-level visibility info
pg_walinspect	1.1	functions to inspect contents of PostgreSQL Write-Ahead Log
pgcrypto	1.3	cryptographic functions
pgrowlocks	1.2	show row-level locking information
pgstattuple	1.5	show tuple-level statistics
plisql	1.0	PL/iSQL procedural language
plperl	1.0	PL/Perl procedural language
plperlu	1.0	PL/PerlU untrusted procedural language
plpgsql	1.0	PL/pgSQL procedural language
plpython3u	1.0	PL/Python3U untrusted procedural language
plsample	1.0	PL/Sample
pltcl	1.0	PL/Tcl procedural language
pltclu	1.0	PL/TclU untrusted procedural language
postgres_fdw	1.1	foreign-data wrapper for remote PostgreSQL servers
refint	1.0	functions for implementing referential integrity (obsolete)
seg	1.4	data type for representing line segments or floating-point intervals
spgist_name_ops	1.0	Test opclass for SP-GiST
sslinfo	1.2	information about SSL certificates
tablefunc	1.0	functions that manipulate whole tables, including crosstab
tcn	1.0	Triggered change notifications
test_bloomfilter	1.0	Test code for Bloom filter library
test_copy_callbacks	1.0	Test code for COPY callbacks
test_custom_rmgrs	1.0	Test code for custom WAL resource managers
test_ddl_deparse	1.0	Test code for DDL deparse feature
test_dsa	1.0	Test code for dynamic shared memory areas
test_dsm_registry	1.0	Test code for the DSM registry
test_ext1	1.0	Test extension 1
test_ext2	1.0	Test extension 2
test_ext3	1.0	Test extension 3
test_ext4	1.0	Test extension 4
test_ext5	1.0	Test extension 5
test_ext6	1.0	test_ext6
test_ext7	1.0	Test extension 7
test_ext8	1.0	Test extension 8
test_ext9	1.0	test_ext9
test_ext_cine	1.0	Test extension using CREATE IF NOT EXISTS
test_ext_cor	1.0	Test extension using CREATE OR REPLACE
test_ext_cyclic1	1.0	Test extension cyclic 1
test_ext_cyclic2	1.0	Test extension cyclic 2
test_ext_evttrig	1.0	Test extension - event trigger
test_ext_extschema	1.0	test @extschema@
test_ext_req_schema1	1.0	Required extension to be referenced
test_ext_req_schema2	1.0	Test schema referencing of required extensions
test_ext_req_schema3	1.0	Test schema referencing of 2 required extensions
test_ext_set_schema	1.0	Test ALTER EXTENSION SET SCHEMA
test_ginpostinglist	1.0	Test code for ginpostinglist.c
test_integerset	1.0	Test code for integerset
test_lfind	1.0	Test code for optimized linear search functions
test_parser	1.0	example of a custom parser for full-text search
test_pg_dump	1.0	Test pg_dump with an extension
test_predtest	1.0	Test code for optimizer/util/predtest.c
test_radixtree	1.0	Test code for radix tree
test_rbtree	1.0	Test code for red-black tree library
test_regex	1.0	Test code for backend/regex/
test_resowner	1.0	Test code for ResourceOwners
test_shm_mq	1.0	Test code for shared memory message queues
test_slru	1.0	Test code for SLRU
test_tidstore	1.0	Test code for tidstore
tsm_system_rows	1.0	TABLESAMPLE method which accepts number of rows as a limit
tsm_system_time	1.0	TABLESAMPLE method which accepts time in milliseconds as a limit
unaccent	1.1	text search dictionary that removes accents
uuid-ossp	1.1	generate universally unique identifiers (UUIDs)
worker_spi	1.0	Sample background worker
xid_wraparound	1.0	Tests for XID wraparound
xml2	1.1	XPath querying and XSLT

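Notes

IvorySQL packages currently live in the pigsty-infra repository, not in the pigsty-pgsql or pigsty-ivory repository.
The default FHS layout changed in IvorySQL 4.4; users upgrading from older versions should take note.
IvorySQL 4.4 only requires glibc >= 2.17, which every OS version currently supported by Pigsty satisfies.
The last IvorySQL release that supports EL7 is 3.3, corresponding to PostgreSQL 16.3; IvorySQL 4.x no longer supports EL7.
Pigsty provides no warranty for use of the IvorySQL kernel; for any issues or requests concerning this kernel, please contact the original vendor.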

7.4 - OpenHalo (MySQL)

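Access PostgreSQL databases with MySQL clients and the MySQL protocol! The OpenHalo kernel, open-sourced by HaloTech, is a MySQL-compatible PG kernel fork.

OpenHalo is an open-source PG kernel that provides MySQL wire protocol compatibility.

OpenHalo is based on the PostgreSQL 14.10 kernel and provides wire-protocol-level compatibility with MySQL (5.7.32-log / 8.0).

Pigsty currently supports OpenHalo deployment on EL 8/9; support for Debian / Ubuntu will be provided in a later release.

Getting Started

Follow the standard Pigsty installation procedure with the mysql configuration template.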

curl -fsSL https://repo.pigsty.cc/get | bash; cd ~/pigsty
./bootstrap              # prepare Pigsty dependencies
./configure -c mysql     # use the MySQL (openHalo) configuration template
./install.yml            # install; for production, change the passwords in pigsty.yml first

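For production deployments, be sure to change the password parameters in the pigsty.yml configuration file before running the install playbook.

Usage

When you access MySQL, the connection actually uses the postgres database. Note that a "database" in MySQL corresponds to a "schema" in PostgreSQL, so use mysql actually selects the mysql schema inside the postgres database.

MySQL uses the same usernames and passwords as PostgreSQL. You can manage users and privileges the standard PostgreSQL way.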
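
To make the database-to-schema mapping concrete, here is a minimal Python sketch that turns a MySQL-style database and table pair into the schema-qualified name you would use from the PostgreSQL side. Both function names and the table name `t1` are hypothetical; only the rule that a MySQL database maps to a schema inside the postgres database comes from the description above.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def pg_qualified_name(mysql_database, table):
    """Map a MySQL database/table pair to a schema-qualified PostgreSQL name.

    Hypothetical helper: the MySQL "database" becomes the PostgreSQL schema,
    and the object lives inside the `postgres` database.
    """
    return f"{quote_ident(mysql_database)}.{quote_ident(table)}"


# `t1` is a made-up table name used only for illustration.
print(pg_qualified_name("mysql", "t1"))  # "mysql"."t1"
```

So a table reached via `use mysql` on port 3306 would be addressed as a table in the `mysql` schema when connected to the postgres database with a PostgreSQL client.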

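Client Access

openHalo provides MySQL wire protocol compatibility and listens on port 3306 by default, so MySQL clients and drivers can connect directly.

Pigsty's conf/mysql configuration installs the mysql client tool by default.

You can access MySQL with the following command: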

mysql -h 127.0.0.1 -u dbuser_dba

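OpenHalo has officially verified that Navicat can access this MySQL port, but connections from JetBrains DataGrip currently report errors.

Modifications

The OpenHalo kernel installed by Pigsty is lightly modified from the HaloTech-Co-Ltd/openHalo kernel:

The default database name is changed from halo0root back to postgres
The 1.0. prefix is removed from the default version number, restoring it to 14.10
The default configuration file is changed to enable MySQL compatibility and listen on port 3306 by default

Note that Pigsty provides no warranty for use of the OpenHalo kernel; for any issues or requests concerning this kernel, please contact the original vendor.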

7.5 - OrioleDB (OLTP)

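A PostgreSQL storage engine heavily optimized for OLTP workloads

OrioleDB is a PostgreSQL storage engine extension that claims to deliver better OLTP performance and throughput.

The latest OrioleDB release is built on a patched fork of the PostgreSQL 17.0 kernel, with the extension developed on top of it.

Pigsty currently supports OrioleDB deployment on EL 8/9; support for Debian / Ubuntu will be provided in a later release.

Getting Started

Follow the standard Pigsty installation procedure with the oriole configuration template.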

curl -fsSL https://repo.pigsty.cc/get | bash; cd ~/pigsty
./bootstrap              # prepare Pigsty dependencies
./configure -c oriole    # use the OrioleDB configuration template
./install.yml            # install; for production, change the passwords in pigsty.yml first

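For production deployments, be sure to change the password parameters in the pigsty.yml configuration file before running the install playbook.

Configuration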

all:
  children:
    pg-orio:
      vars:
        pg_databases:
        - {name: meta ,extensions: [orioledb]}
  vars:
    pg_mode: oriole
    pg_version: 17
    pg_packages: [ orioledb, pgsql-common  ]
    pg_libs: 'orioledb.so, pg_stat_statements, auto_explain'
    repo_extra_packages: [ orioledb ]

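Usage

To use OrioleDB, install the orioledb_17 and oriolepg_17 packages (currently available as RPM only).

Use pgbench to initialize TPC-B-like tables at scale factor 100, then run read-only and read-write benchmarks: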

pgbench -is 100 meta
pgbench -nv -P1 -c10 -S -T1000 meta
pgbench -nv -P1 -c50 -S -T1000 meta
pgbench -nv -P1 -c10    -T1000 meta
pgbench -nv -P1 -c50    -T1000 meta

Next, rebuild these tables with the orioledb storage engine and compare the performance:

-- Create the OrioleDB tables
CREATE TABLE pgbench_accounts_o (LIKE pgbench_accounts INCLUDING ALL) USING orioledb;
CREATE TABLE pgbench_branches_o (LIKE pgbench_branches INCLUDING ALL) USING orioledb;
CREATE TABLE pgbench_history_o (LIKE pgbench_history INCLUDING ALL) USING orioledb;
CREATE TABLE pgbench_tellers_o (LIKE pgbench_tellers INCLUDING ALL) USING orioledb;

-- Copy data from the regular tables into the OrioleDB tables
INSERT INTO pgbench_accounts_o SELECT * FROM pgbench_accounts;
INSERT INTO pgbench_branches_o SELECT * FROM pgbench_branches;
INSERT INTO pgbench_history_o SELECT  * FROM pgbench_history;
INSERT INTO pgbench_tellers_o SELECT * FROM pgbench_tellers;

-- Drop the original tables and rename the OrioleDB tables
DROP TABLE pgbench_accounts, pgbench_branches, pgbench_history, pgbench_tellers;
ALTER TABLE pgbench_accounts_o RENAME TO pgbench_accounts;
ALTER TABLE pgbench_branches_o RENAME TO pgbench_branches;
ALTER TABLE pgbench_history_o RENAME TO pgbench_history;
ALTER TABLE pgbench_tellers_o RENAME TO pgbench_tellers;
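
The statements above follow the same create, copy, drop, rename pattern for every table. As a rough sketch, the Python function below generates that sequence for an arbitrary list of tables. `orioledb_migration_sql` is a hypothetical helper, and it assumes plain, unquoted table names and that a temporary `_o` suffix does not collide with existing tables.

```python
def orioledb_migration_sql(tables, suffix="_o"):
    """Generate SQL that rebuilds each table with the orioledb access method.

    Hypothetical helper; assumes simple table names that need no quoting.
    """
    stmts = []
    # 1. Create an empty OrioleDB copy of each table
    for t in tables:
        stmts.append(f"CREATE TABLE {t}{suffix} (LIKE {t} INCLUDING ALL) USING orioledb;")
    # 2. Copy the data over
    for t in tables:
        stmts.append(f"INSERT INTO {t}{suffix} SELECT * FROM {t};")
    # 3. Drop the originals in one statement, then rename the copies
    stmts.append(f"DROP TABLE {', '.join(tables)};")
    for t in tables:
        stmts.append(f"ALTER TABLE {t}{suffix} RENAME TO {t};")
    return stmts


tables = ["pgbench_accounts", "pgbench_branches", "pgbench_history", "pgbench_tellers"]
print("\n".join(orioledb_migration_sql(tables)))
```

After the rename, you can confirm which access method each table uses by joining pg_class.relam to pg_am, then rerun the same pgbench commands to compare results.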

7.6 - PolarDB PG (RAC)

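Use Alibaba Cloud's open-source PolarDB for PostgreSQL kernel to obtain domestic (Xinchuang) compliance qualification and an experience similar to Oracle RAC.

Overview

Pigsty allows you to create PostgreSQL clusters with "domestic Xinchuang qualification" using PolarDB!

PolarDB for PostgreSQL is essentially equivalent to PostgreSQL 15; any client tool compatible with the PostgreSQL wire protocol can access a PolarDB cluster.

Pigsty's PGSQL repository provides the open-source PolarDB PG packages, but they are not downloaded to the local software repository during Pigsty installation.

Getting Started

Install Pigsty using the standard procedure, with the polar configuration template: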

curl -fsSL https://repo.pigsty.cc/get | bash; cd ~/pigsty
./bootstrap              # install Pigsty dependencies
./configure -c polar     # use the PolarDB configuration template
./install.yml            # run the playbook to deploy

Configuration

The following parameters need special configuration for PolarDB database clusters:

pg_version: 15
pg_packages: [ 'polardb patroni pgbouncer pgbackrest pg_exporter pgbadger vip-manager' ]
pg_extensions: [ ]                # do not install any vanilla postgresql extensions
pg_mode: polar                    # polardb compatible mode
pg_exporter_exclude_database: 'template0,template1,postgres,polardb_admin'
pg_default_roles:                 # default roles and users in postgres cluster
  - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access     }
  - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
  - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly] ,comment: role for global read-write access }
  - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite] ,comment: role for object creation }
  - { name: postgres     ,superuser: true  ,comment: system superuser }
  - { name: replicator   ,superuser: true  ,replication: true ,roles: [pg_monitor, dbrole_readonly] ,comment: system replicator } # <- superuser is required for replication
  - { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 ,comment: pgsql admin user }
  - { name: dbuser_monitor ,roles: [pg_monitor] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-utility ]
repo_extra_packages: [ polardb ] # replace vanilla postgres kernel with polardb kernel

Note in particular that PolarDB PG requires the replicator replication user to be a superuser, unlike vanilla PG.
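
Because this requirement is easy to lose when copying role definitions from a vanilla PostgreSQL configuration, a small check can help. The Python sketch below assumes the pg_default_roles entries have been loaded as a list of dicts; `replicator_is_superuser` is a hypothetical helper, not part of Pigsty.

```python
def replicator_is_superuser(roles, name="replicator"):
    """Return True if the named role is declared with both superuser and replication.

    Hypothetical helper; `roles` mirrors the pg_default_roles entries.
    """
    for role in roles:
        if role.get("name") == name:
            return bool(role.get("superuser")) and bool(role.get("replication"))
    return False  # the role is not declared at all


roles = [
    {"name": "postgres", "superuser": True},
    {"name": "replicator", "superuser": True, "replication": True},
]
print(replicator_is_superuser(roles))  # True
```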

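Extension List

Most extensions of the PGSQL module (other than pure-SQL ones) cannot be used directly on the PolarDB kernel. If you need them, recompile and install them from source against the new kernel.

The PolarDB kernel currently ships with the following 61 extensions. Beyond the Contrib extensions, the additional extensions include: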

polar_csn 1.0 : polar_csn
polar_monitor 1.2 : examine the polardb information
polar_monitor_preload 1.1 : examine the polardb information
polar_parameter_check 1.0 : kernel extension for parameter validation
polar_px 1.0 : Parallel Execution extension
polar_stat_env 1.0 : env stat functions for PolarDB
polar_stat_sql 1.3 : Kernel statistics gathering, and sql plan nodes information gathering
polar_tde_utils 1.0 : Internal extension for TDE
polar_vfs 1.0 : polar_vfs
polar_worker 1.0 : polar_worker
timetravel 1.0 : functions for implementing time travel
vector 0.5.1 : vector data type and ivfflat and hnsw access methods
smlar 1.0 : compute similary of any one-dimensional arrays

Full list of extensions available in PolarDB:

name	version	comment
hstore_plpython2u	1.0	transform between hstore and plpython2u
dict_int	1.0	text search dictionary template for integers
adminpack	2.0	administrative functions for PostgreSQL
hstore_plpython3u	1.0	transform between hstore and plpython3u
amcheck	1.1	functions for verifying relation integrity
hstore_plpythonu	1.0	transform between hstore and plpythonu
autoinc	1.0	functions for autoincrementing fields
insert_username	1.0	functions for tracking who changed a table
bloom	1.0	bloom access method - signature file based index
file_fdw	1.0	foreign-data wrapper for flat file access
dblink	1.2	connect to other PostgreSQL databases from within a database
btree_gin	1.3	support for indexing common datatypes in GIN
fuzzystrmatch	1.1	determine similarities and distance between strings
lo	1.1	Large Object maintenance
intagg	1.1	integer aggregator and enumerator (obsolete)
btree_gist	1.5	support for indexing common datatypes in GiST
hstore	1.5	data type for storing sets of (key, value) pairs
intarray	1.2	functions, operators, and index support for 1-D arrays of integers
citext	1.5	data type for case-insensitive character strings
cube	1.4	data type for multidimensional cubes
hstore_plperl	1.0	transform between hstore and plperl
isn	1.2	data types for international product numbering standards
jsonb_plperl	1.0	transform between jsonb and plperl
dict_xsyn	1.0	text search dictionary template for extended synonym processing
hstore_plperlu	1.0	transform between hstore and plperlu
earthdistance	1.1	calculate great-circle distances on the surface of the Earth
pg_prewarm	1.2	prewarm relation data
jsonb_plperlu	1.0	transform between jsonb and plperlu
pg_stat_statements	1.6	track execution statistics of all SQL statements executed
jsonb_plpython2u	1.0	transform between jsonb and plpython2u
jsonb_plpython3u	1.0	transform between jsonb and plpython3u
jsonb_plpythonu	1.0	transform between jsonb and plpythonu
pg_trgm	1.4	text similarity measurement and index searching based on trigrams
pgstattuple	1.5	show tuple-level statistics
ltree	1.1	data type for hierarchical tree-like structures
ltree_plpython2u	1.0	transform between ltree and plpython2u
pg_visibility	1.2	examine the visibility map (VM) and page-level visibility info
ltree_plpython3u	1.0	transform between ltree and plpython3u
ltree_plpythonu	1.0	transform between ltree and plpythonu
seg	1.3	data type for representing line segments or floating-point intervals
moddatetime	1.0	functions for tracking last modification time
pgcrypto	1.3	cryptographic functions
pgrowlocks	1.2	show row-level locking information
pageinspect	1.7	inspect the contents of database pages at a low level
pg_buffercache	1.3	examine the shared buffer cache
pg_freespacemap	1.2	examine the free space map (FSM)
tcn	1.0	Triggered change notifications
plperl	1.0	PL/Perl procedural language
uuid-ossp	1.1	generate universally unique identifiers (UUIDs)
plperlu	1.0	PL/PerlU untrusted procedural language
refint	1.0	functions for implementing referential integrity (obsolete)
xml2	1.1	XPath querying and XSLT
plpgsql	1.0	PL/pgSQL procedural language
plpython3u	1.0	PL/Python3U untrusted procedural language
pltcl	1.0	PL/Tcl procedural language
pltclu	1.0	PL/TclU untrusted procedural language
polar_csn	1.0	polar_csn
sslinfo	1.2	information about SSL certificates
polar_monitor	1.2	examine the polardb information
polar_monitor_preload	1.1	examine the polardb information
polar_parameter_check	1.0	kernel extension for parameter validation
polar_px	1.0	Parallel Execution extension
tablefunc	1.0	functions that manipulate whole tables, including crosstab
polar_stat_env	1.0	env stat functions for PolarDB
smlar	1.0	compute similary of any one-dimensional arrays
timetravel	1.0	functions for implementing time travel
tsm_system_rows	1.0	TABLESAMPLE method which accepts number of rows as a limit
polar_stat_sql	1.3	Kernel statistics gathering, and sql plan nodes information gathering
tsm_system_time	1.0	TABLESAMPLE method which accepts time in milliseconds as a limit
polar_tde_utils	1.0	Internal extension for TDE
polar_vfs	1.0	polar_vfs
polar_worker	1.0	polar_worker
unaccent	1.1	text search dictionary that removes accents
postgres_fdw	1.0	foreign-data wrapper for remote PostgreSQL servers

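Pigsty Professional Edition provides offline installation support for PolarDB, extension compilation support, and monitoring and management specifically adapted to PolarDB clusters.
Pigsty partners with the Alibaba Cloud kernel team and can offer paid kernel-level backstop support.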

7.7 - PolarDB O(racle)

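Use Alibaba Cloud's commercial PolarDB for Oracle kernel (closed source, PG14, available only in special enterprise edition customizations).

Pigsty allows you to create PolarDB for Oracle clusters with "domestic Xinchuang qualification" using PolarDB!

According to the [Security and Reliability Evaluation Results Announcement (No. 1, 2023)], Appendix Table 3 (Centralized Databases), PolarDB v2.0 is an independently controllable, secure and reliable domestic Xinchuang database.

PolarDB for Oracle is an Oracle-compatible edition developed on top of PolarDB for PostgreSQL. The two share the same kernel and are distinguished by the --compatibility-mode parameter.

In cooperation with the Alibaba Cloud kernel team, we provide a complete database solution based on the PolarDB v2.0 kernel and Pigsty v3.0 RDS. Please contact sales for details, or purchase it directly on Alibaba Cloud Marketplace.

The PolarDB for Oracle kernel is currently available only on EL systems.

Extensions

The PolarDB 2.0 (Oracle-compatible) kernel currently ships with the following 188 extensions: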

name	default_version	comment
cube	1.5	data type for multidimensional cubes
ip4r	2.4	NULL
adminpack	2.1	administrative functions for PostgreSQL
dict_xsyn	1.0	text search dictionary template for extended synonym processing
amcheck	1.4	functions for verifying relation integrity
autoinc	1.0	functions for autoincrementing fields
hstore	1.8	data type for storing sets of (key, value) pairs
bloom	1.0	bloom access method - signature file based index
earthdistance	1.1	calculate great-circle distances on the surface of the Earth
hstore_plperl	1.0	transform between hstore and plperl
bool_plperl	1.0	transform between bool and plperl
file_fdw	1.0	foreign-data wrapper for flat file access
bool_plperlu	1.0	transform between bool and plperlu
fuzzystrmatch	1.1	determine similarities and distance between strings
hstore_plperlu	1.0	transform between hstore and plperlu
btree_gin	1.3	support for indexing common datatypes in GIN
hstore_plpython2u	1.0	transform between hstore and plpython2u
btree_gist	1.6	support for indexing common datatypes in GiST
hll	2.17	type for storing hyperloglog data
hstore_plpython3u	1.0	transform between hstore and plpython3u
citext	1.6	data type for case-insensitive character strings
hstore_plpythonu	1.0	transform between hstore and plpythonu
hypopg	1.3.1	Hypothetical indexes for PostgreSQL
insert_username	1.0	functions for tracking who changed a table
dblink	1.2	connect to other PostgreSQL databases from within a database
decoderbufs	0.1.0	Logical decoding plugin that delivers WAL stream changes using a Protocol Buffer format
intagg	1.1	integer aggregator and enumerator (obsolete)
dict_int	1.0	text search dictionary template for integers
intarray	1.5	functions, operators, and index support for 1-D arrays of integers
isn	1.2	data types for international product numbering standards
jsonb_plperl	1.0	transform between jsonb and plperl
jsonb_plperlu	1.0	transform between jsonb and plperlu
jsonb_plpython2u	1.0	transform between jsonb and plpython2u
jsonb_plpython3u	1.0	transform between jsonb and plpython3u
jsonb_plpythonu	1.0	transform between jsonb and plpythonu
lo	1.1	Large Object maintenance
log_fdw	1.0	foreign-data wrapper for csvlog
ltree	1.2	data type for hierarchical tree-like structures
ltree_plpython2u	1.0	transform between ltree and plpython2u
ltree_plpython3u	1.0	transform between ltree and plpython3u
ltree_plpythonu	1.0	transform between ltree and plpythonu
moddatetime	1.0	functions for tracking last modification time
old_snapshot	1.0	utilities in support of old_snapshot_threshold
oracle_fdw	1.2	foreign data wrapper for Oracle access
oss_fdw	1.1	foreign-data wrapper for OSS access
pageinspect	2.1	inspect the contents of database pages at a low level
pase	0.0.1	ant ai similarity search
pg_bigm	1.2	text similarity measurement and index searching based on bigrams
pg_freespacemap	1.2	examine the free space map (FSM)
pg_hint_plan	1.4	controls execution plan with hinting phrases in comment of special form
pg_buffercache	1.5	examine the shared buffer cache
pg_prewarm	1.2	prewarm relation data
pg_repack	1.4.8-1	Reorganize tables in PostgreSQL databases with minimal locks
pg_sphere	1.0	spherical objects with useful functions, operators and index support
pg_cron	1.5	Job scheduler for PostgreSQL
pg_jieba	1.1.0	a parser for full-text search of Chinese
pg_stat_kcache	2.2.1	Kernel statistics gathering
pg_stat_statements	1.9	track planning and execution statistics of all SQL statements executed
pg_surgery	1.0	extension to perform surgery on a damaged relation
pg_trgm	1.6	text similarity measurement and index searching based on trigrams
pg_visibility	1.2	examine the visibility map (VM) and page-level visibility info
pg_wait_sampling	1.1	sampling based statistics of wait events
pgaudit	1.6.2	provides auditing functionality
pgcrypto	1.3	cryptographic functions
pgrowlocks	1.2	show row-level locking information
pgstattuple	1.5	show tuple-level statistics
pgtap	1.2.0	Unit testing for PostgreSQL
pldbgapi	1.1	server-side support for debugging PL/pgSQL functions
plperl	1.0	PL/Perl procedural language
plperlu	1.0	PL/PerlU untrusted procedural language
plpgsql	1.0	PL/pgSQL procedural language
plpython2u	1.0	PL/Python2U untrusted procedural language
plpythonu	1.0	PL/PythonU untrusted procedural language
plsql	1.0	Oracle compatible PL/SQL procedural language
pltcl	1.0	PL/Tcl procedural language
pltclu	1.0	PL/TclU untrusted procedural language
polar_bfile	1.0	The BFILE data type enables access to binary file LOBs that are stored in file systems outside Database
polar_bpe	1.0	polar_bpe
polar_builtin_cast	1.1	Internal extension for builtin casts
polar_builtin_funcs	2.0	implement polar builtin functions
polar_builtin_type	1.5	polar_builtin_type for PolarDB
polar_builtin_view	1.5	polar_builtin_view
polar_catalog	1.2	polardb pg extend catalog
polar_channel	1.0	polar_channel
polar_constraint	1.0	polar_constraint
polar_csn	1.0	polar_csn
polar_dba_views	1.0	polar_dba_views
polar_dbms_alert	1.2	implement polar_dbms_alert - supports asynchronous notification of database events.
polar_dbms_application_info	1.0	implement polar_dbms_application_info - record names of executing modules or transactions in the database.
polar_dbms_pipe	1.1	implements polar_dbms_pipe - package lets two or more sessions in the same instance communicate.
polar_dbms_aq	1.2	implement dbms_aq - provides an interface to Advanced Queuing.
polar_dbms_lob	1.3	implement dbms_lob - provides subprograms to operate on BLOBs, CLOBs, and NCLOBs.
polar_dbms_output	1.2	implement polar_dbms_output - enables you to send messages from stored procedures.
polar_dbms_lock	1.0	implement polar_dbms_lock - provides an interface to Oracle Lock Management services.
polar_dbms_aqadm	1.3	polar_dbms_aqadm - procedures to manage Advanced Queuing configuration and administration information.
polar_dbms_assert	1.0	implement polar_dbms_assert - provide an interface to validate properties of the input value.
polar_dbms_metadata	1.0	implement polar_dbms_metadata - provides a way for you to retrieve metadata from the database dictionary.
polar_dbms_random	1.0	implement polar_dbms_random - a built-in random number generator, not intended for cryptography
polar_dbms_crypto	1.1	implement dbms_crypto - provides an interface to encrypt and decrypt stored data.
polar_dbms_redact	1.0	implement polar_dbms_redact - provides an interface to mask data from queries by an application.
polar_dbms_debug	1.1	server-side support for debugging PL/SQL functions
polar_dbms_job	1.0	polar_dbms_job
polar_dbms_mview	1.1	implement polar_dbms_mview - enables to refresh materialized views.
polar_dbms_job_preload	1.0	polar_dbms_job_preload
polar_dbms_obfuscation_toolkit	1.1	implement polar_dbms_obfuscation_toolkit - enables an application to get data md5.
polar_dbms_rls	1.1	implement polar_dbms_rls - a fine-grained access control administrative built-in package
polar_multi_toast_utils	1.0	polar_multi_toast_utils
polar_dbms_session	1.2	implement polar_dbms_session - support to set preferences and security levels.
polar_odciconst	1.0	implement ODCIConst - Provide some built-in constants in Oracle.
polar_dbms_sql	1.2	implement polar_dbms_sql - provides an interface to execute dynamic SQL.
polar_osfs_toolkit	1.0	osfs library tools and functions extension
polar_dbms_stats	14.0	stabilize plans by fixing statistics
polar_monitor	1.5	monitor functions for PolarDB
polar_osfs_utils	1.0	osfs library utils extension
polar_dbms_utility	1.3	implement polar_dbms_utility - provides various utility subprograms.
polar_parameter_check	1.0	kernel extension for parameter validation
polar_dbms_xmldom	1.0	implement dbms_xmldom and dbms_xmlparser - support standard DOM interface and xml parser object
polar_parameter_manager	1.1	Extension to select parameters for manager.
polar_faults	1.0.0	simulate some database faults for end user or testing system.
polar_monitor_preload	1.1	examine the polardb information
polar_proxy_utils	1.0	Extension to provide operations about proxy.
polar_feature_utils	1.2	PolarDB feature utilization
polar_global_awr	1.0	PolarDB Global AWR Report
polar_publication	1.0	support polardb pg logical replication
polar_global_cache	1.0	polar_global_cache
polar_px	1.0	Parallel Execution extension
polar_serverless	1.0	polar serverless extension
polar_resource_manager	1.0	a background process that forcibly frees user session process memory
polar_sys_context	1.1	implement polar_sys_context - returns the value of parameter associated with the context namespace at the current instant.
polar_gpc	1.3	polar_gpc
polar_tde_utils	1.0	Internal extension for TDE
polar_gtt	1.1	polar_gtt
polar_utl_encode	1.2	implement polar_utl_encode - provides functions that encode RAW data into a standard encoded format
polar_htap	1.1	extension for PolarDB HTAP
polar_htap_db	1.0	extension for PolarDB HTAP database level operation
polar_io_stat	1.0	polar io stat in multi dimension
polar_utl_file	1.0	implement utl_file - support PL/SQL programs can read and write operating system text files
polar_ivm	1.0	polar_ivm
polar_sql_mapping	1.2	Record error sqls and mapping them to correct one
polar_stat_sql	1.0	Kernel statistics gathering, and sql plan nodes information gathering
tds_fdw	2.0.2	Foreign data wrapper for querying a TDS database (Sybase or Microsoft SQL Server)
xml2	1.1	XPath querying and XSLT
polar_upgrade_catalogs	1.1	Upgrade catalogs for old version instance
polar_utl_i18n	1.1	polar_utl_i18n
polar_utl_raw	1.0	implement utl_raw - provides SQL functions for manipulating RAW datatypes.
timescaledb	2.9.2	Enables scalable inserts and complex queries for time-series data
polar_vfs	1.0	polar virtual file system for different storage
polar_worker	1.0	polar_worker
postgres_fdw	1.1	foreign-data wrapper for remote PostgreSQL servers
refint	1.0	functions for implementing referential integrity (obsolete)
roaringbitmap	0.5	support for Roaring Bitmaps
tsm_system_time	1.0	TABLESAMPLE method which accepts time in milliseconds as a limit
vector	0.5.0	vector data type and ivfflat and hnsw access methods
rum	1.3	RUM index access method
unaccent	1.1	text search dictionary that removes accents
seg	1.4	data type for representing line segments or floating-point intervals
sequential_uuids	1.0.2	generator of sequential UUIDs
uuid-ossp	1.1	generate universally unique identifiers (UUIDs)
smlar	1.0	compute similarity of any one-dimensional arrays
varbitx	1.1	varbit functions pack
sslinfo	1.2	information about SSL certificates
tablefunc	1.0	functions that manipulate whole tables, including crosstab
tcn	1.0	Triggered change notifications
zhparser	1.0	a parser for full-text search of Chinese
address_standardizer	3.3.2	Ganos PostGIS address standardizer
address_standardizer_data_us	3.3.2	Ganos PostGIS address standardizer data us
ganos_fdw	6.0	Ganos Spatial FDW extension for POLARDB
ganos_geometry	6.0	Ganos geometry lite extension for POLARDB
ganos_geometry_pyramid	6.0	Ganos Geometry Pyramid extension for POLARDB
ganos_geometry_sfcgal	6.0	Ganos geometry lite sfcgal extension for POLARDB
ganos_geomgrid	6.0	Ganos geometry grid extension for POLARDB
ganos_importer	6.0	Ganos Spatial importer extension for POLARDB
ganos_networking	6.0	Ganos networking
ganos_pointcloud	6.0	Ganos pointcloud extension For POLARDB
ganos_pointcloud_geometry	6.0	Ganos_pointcloud LIDAR data and ganos_geometry data for POLARDB
ganos_raster	6.0	Ganos raster extension for POLARDB
ganos_scene	6.0	Ganos scene extension for POLARDB
ganos_sfmesh	6.0	Ganos surface mesh extension for POLARDB
ganos_spatialref	6.0	Ganos spatial reference extension for POLARDB
ganos_trajectory	6.0	Ganos trajectory extension for POLARDB
ganos_vomesh	6.0	Ganos volume mesh extension for POLARDB
postgis_tiger_geocoder	3.3.2	Ganos PostGIS tiger geocoder
postgis_topology	3.3.2	Ganos PostGIS topology

7.8 - PostgresML (AI/ML)

How to launch PostgresML with Pigsty and run machine learning inside the database: model training, inference, embeddings, and RAG.

PostgresML is a PostgreSQL extension that supports the latest LLMs, vector operations, classical machine learning, and good old Postgres application workloads.

PostgresML (pgml) is a PostgreSQL extension written in Rust. You can run the standalone Docker images, but this page is not an introduction to a docker-compose template; it is for documentation purposes only.

PostgresML is officially supported on Ubuntu 22.04, but we also maintain an RPM version for EL 8/9 if you don’t need CUDA and NVIDIA support.

You’ll need Internet access on the database nodes to download Python dependencies from PyPI and models from HuggingFace.

Configuration

PostgresML is a Rust extension with official Ubuntu support. Pigsty maintains an RPM version of PostgresML for EL 8 and EL 9.

Launch new Cluster

PostgresML 2.10.0 is available for PostgreSQL 15 on Ubuntu 22.04 (Official), Debian 12 and EL 8/9 (Pigsty). To enable pgml, you have to install the extension first:

pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-meta
    pg_users:
      - {name: dbuser_meta     ,password: DBUser.Meta     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: pigsty admin user }
      - {name: dbuser_view     ,password: DBUser.Viewer   ,pgbouncer: true ,roles: [dbrole_readonly] ,comment: read-only viewer for meta database }
    pg_databases:
      - { name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty] ,extensions: [{name: postgis, schema: public}, {name: timescaledb}]}
    pg_hba_rules:
      - {user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes'}
    pg_libs: 'pgml, pg_stat_statements, auto_explain'
    pg_extensions: [ 'pgml_15 pgvector_15 wal2json_15 repack_15' ]  # EL 8/9
    #pg_extensions: [ 'postgresql-pgml-15 postgresql-15-pgvector postgresql-15-wal2json postgresql-15-repack' ]  # ubuntu

On EL 8/9 the package name is pgml_15; the corresponding name on Ubuntu/Debian is postgresql-pgml-15. Also add pgml to pg_libs.
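
The naming difference between the two package families can be captured in a small shell helper. This is only a sketch: the function name `pgml_package_name` and the `el` / `deb` family labels are made up for illustration and are not part of Pigsty.

```shell
# Hypothetical helper: print the PostgresML package name for a distro family.
pgml_package_name() {
    # $1: distro family, "el" (EL 8/9) or "deb" (Ubuntu/Debian)
    # $2: PostgreSQL major version, e.g. 15
    case "$1" in
        el)  printf 'pgml_%s\n' "$2" ;;
        deb) printf 'postgresql-pgml-%s\n' "$2" ;;
        *)   printf 'unsupported distro family: %s\n' "$1" >&2; return 1 ;;
    esac
}
```

For example, `ansible pg-meta -m package -b -a "name=$(pgml_package_name el 15)"` would install the EL package on the pg-meta cluster.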

Enable on Existing Cluster

To enable pgml on an existing cluster, install the package with the ansible package module:

ansible pg-meta -m package -b -a 'name=pgml_15'
# ansible el8,el9 -m package -b -a 'name=pgml_15'           # EL 8/9
# ansible u22 -m package -b -a 'name=postgresql-pgml-15'    # Ubuntu 22.04 jammy

Python Dependencies

You also have to install python dependencies for PostgresML on cluster nodes. Official tutorial: installation

Install Python & PIP

Make sure python3, pip, and venv are installed:

# ubuntu 22.04 (python3.10), you have to install pip & venv with apt
sudo apt install -y python3 python3-pip python3-venv

For EL 8 / EL 9 and compatible distros, you can use python3.11:

# el 8/9, you can upgrade default pip & virtualenv if applicable
sudo yum install -y python3.11 python3.11-pip       # install latest python3.11
python3.11 -m pip install --upgrade pip virtualenv  # use python3.11 on el8 / el9

Using pypi mirrors

For users in mainland China, consider using the Tsinghua PyPI mirror.

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple    # setup global mirror (recommended)
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple some-package        # one-time install

Install Requirements

Create a python virtualenv and install requirements from requirements.txt and requirements-xformers.txt with pip.

If you are using EL 8/9, replace python3 with python3.11 in the following commands.

su - postgres;                          # create venv with dbsu
mkdir -p /data/pgml; cd /data/pgml;     # make a venv directory
python3    -m venv /data/pgml           # create virtualenv dir (ubuntu 22.04)
source /data/pgml/bin/activate          # activate virtual env

# write down python dependencies and install with pip
cat > /data/pgml/requirements.txt <<EOF
accelerate==0.22.0
auto-gptq==0.4.2
bitsandbytes==0.41.1
catboost==1.2
ctransformers==0.2.27
datasets==2.14.5
deepspeed==0.10.3
huggingface-hub==0.17.1
InstructorEmbedding==1.0.1
lightgbm==4.1.0
orjson==3.9.7
pandas==2.1.0
rich==13.5.2
rouge==1.0.1
sacrebleu==2.3.1
sacremoses==0.0.53
scikit-learn==1.3.0
sentencepiece==0.1.99
sentence-transformers==2.2.2
tokenizers==0.13.3
torch==2.0.1
torchaudio==2.0.2
torchvision==0.15.2
tqdm==4.66.1
transformers==4.33.1
xgboost==2.0.0
langchain==0.0.287
einops==0.6.1
pynvml==11.5.0
EOF

# install requirements with pip inside virtualenv
python3 -m pip install -r /data/pgml/requirements.txt
python3 -m pip install xformers==0.0.21 --no-dependencies

# besides, 3 python packages need to be installed globally with sudo!
sudo python3 -m pip install xgboost lightgbm scikit-learn

Enable PostgresML

After installing the pgml extension and the Python dependencies on all cluster nodes, you can enable pgml on the PostgreSQL cluster.

Edit the cluster configuration with patronictl: add pgml to shared_preload_libraries and set pgml.venv to your virtualenv directory:

shared_preload_libraries: pgml, timescaledb, pg_stat_statements, auto_explain
pgml.venv: '/data/pgml'
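
One non-interactive way to apply these two settings is the `-p` / `--pg` option of `patronictl edit-config`. The sketch below assumes patronictl can already locate its configuration (for example through a `-c` flag or a shell alias, not shown here); the function name `enable_pgml` and the `DRY_RUN` switch are illustrative only.

```shell
# Sketch: set the PostgresML parameters through patronictl.
enable_pgml() {
    cluster="$1"               # Patroni cluster name, e.g. pg-meta
    venv="${2:-/data/pgml}"    # virtualenv directory used for pgml.venv
    libs='pgml, timescaledb, pg_stat_statements, auto_explain'

    # With DRY_RUN set to a non-empty value, print the command instead of running it.
    ${DRY_RUN:+echo} patronictl edit-config "$cluster" --force \
        -p "shared_preload_libraries=$libs" \
        -p "pgml.venv=$venv"
}
```

Running `DRY_RUN=1; enable_pgml pg-meta` shows the command that would be executed. Changing shared_preload_libraries only takes effect after a restart.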

After that, restart the database cluster and create the extension with SQL:

CREATE EXTENSION vector;        -- nice to have pgvector installed too!
CREATE EXTENSION pgml;          -- create PostgresML in current database
SELECT pgml.version();          -- print PostgresML version string

If it works, you should see something like:

# create extension pgml;
INFO:  Python version: 3.11.2 (main, Oct  5 2023, 16:06:03) [GCC 8.5.0 20210514 (Red Hat 8.5.0-18)]
INFO:  Scikit-learn 1.3.0, XGBoost 2.0.0, LightGBM 4.1.0, NumPy 1.26.1
CREATE EXTENSION

# SELECT pgml.version(); -- print PostgresML version string
 version
---------
 2.7.8

You are all set! Check PostgresML for more details: https://postgresml.org/docs/guides/use-cases/

7.9 - Supabase (Firebase)

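How to self-host Supabase with Pigsty: launch the open-source Firebase alternative, a complete full-stack backend suite, with a single command.

Supabase: Build in a weekend, scale to millions

Supabase is an open-source Firebase alternative. It wraps PostgreSQL and adds authentication, out-of-the-box APIs, edge functions, realtime subscriptions, object storage, and vector embedding capabilities. It is a low-code, all-in-one backend platform that lets you skip most backend development work: knowing database design and frontend development is enough to ship quickly.

Supabase's slogan is "Build in a weekend, scale to millions". At small scale (4c8g), Supabase is indeed extremely cost-effective, practically a cyber bodhisattva. But once you really grow to millions of users, you should seriously consider self-hosting Supabase, whether for functionality, performance, or cost.

Pigsty provides a complete one-click self-hosting solution for Supabase. A self-hosted Supabase gets full PostgreSQL monitoring, IaC, PITR, and high availability. Compared with the Supabase cloud service, it also offers up to 421 out-of-the-box PostgreSQL extensions and makes fuller use of the performance and cost advantages of modern hardware.

For the complete self-hosting tutorial, see the Supabase Self-Hosting Manual.

Quick Start

The supa.yml configuration template shipped with Pigsty defines a single-node Supabase deployment.

First, follow the standard Pigsty installation procedure to install the MinIO and PostgreSQL instances that Supabase needs: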

curl -fsSL https://repo.pigsty.cc/get | bash; cd ~/pigsty
./bootstrap               # prepare Pigsty dependencies
./configure -c app/supa   # use the Supabase application template
./install.yml             # install Pigsty and the databases

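Before deploying Supabase, adjust the Supabase parameters in the pigsty.yml configuration file to match your environment (mainly the passwords!).

Then run docker.yml and app.yml to finish the remaining work and launch the Supabase containers: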

./docker.yml         # install Docker and Docker Compose
./app.yml            # launch the stateless part of Supabase with Docker Compose

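Note for users in China: configure a suitable Docker registry mirror or a proxy server to get around the GFW so that DockerHub images can be pulled.

For Pro subscribers, we offer the ability to install Pigsty and Supabase offline, without Internet access.

By default, Pigsty exposes web services through Nginx on the admin/INFRA node. Add a local DNS record that points supa.pigsty to that node, then open https://supa.pigsty in a browser to reach the Supabase Studio admin interface.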
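
On a workstation, the simplest form of local name resolution is a static hosts entry. The helper below is a minimal sketch; `add_host_entry` is a hypothetical name, not a Pigsty command, and 10.10.10.10 is the placeholder address used throughout these examples.

```shell
# Sketch: add a static hosts entry unless the name is already mapped.
add_host_entry() {
    ip="$1"                         # address of the node running Nginx
    name="$2"                       # domain name to resolve, e.g. supa.pigsty
    hosts_file="${3:-/etc/hosts}"   # target file, overridable for testing

    # Return early if any non-comment line already lists the name.
    if awk -v n="$name" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$hosts_file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}
```

Writing to /etc/hosts requires root privileges, for example by running `add_host_entry 10.10.10.10 supa.pigsty` in a root shell.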

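Default username and password: supabase / pigsty

Architecture Overview

Pigsty takes the Docker Compose template provided by Supabase as a blueprint and keeps only its stateless part, which is run by Docker Compose. The stateful database and object storage containers are replaced by an external PostgreSQL cluster and MinIO service managed by Pigsty.

Supabase: self-hosting with Docker

After this change, Supabase itself is stateless, so you can start and stop it freely, or even run several stateless Supabase containers against the same PGSQL/MINIO backend to scale out.

By default, Pigsty uses the single-node PostgreSQL instance on the local machine as the core backend database for Supabase. For serious production deployments, we recommend using Pigsty to deploy a highly available PG cluster with at least three nodes, or at least using external object storage as the PITR backup repository as a safety net.

By default, Pigsty uses the SNSD (single-node, single-drive) MinIO service on the local machine for file storage. For serious production deployments, you can use an external S3-compatible object storage service, or a separate multi-node multi-drive MinIO cluster deployed by Pigsty.

Configuration Details

When self-hosting Supabase, the directory app/supabase, which contains the resources needed by Docker Compose, is copied in full to /opt/supabase on the target nodes (the supabase group by default) and started in the background with docker compose up -d.

All configuration parameters are defined in the .env file and the docker-compose.yml template. You usually do not need to edit these two templates directly: specify the .env parameters in supa_config, and they are automatically overridden in or appended to the final /opt/supabase/.env configuration file.

The most critical parameters here are jwt_secret and the corresponding anon_key and service_role_key. For serious production use, be sure to follow the instructions and tools described in the Supabase Self-Hosting Manual.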
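
The two keys are JWTs signed with HS256 using jwt_secret, which is why they must be regenerated whenever the secret changes. The sketch below shows the mechanics with openssl; it is not the official tooling. The function names are illustrative, the payload fields (role, iss, iat, exp) mirror the demo keys shipped in the template, and the `iss` value is an assumption. Note that passing the secret as an argument exposes it to other local processes.

```shell
# Sketch: sign an HS256 JWT with openssl. Function names are illustrative.
b64url() {
    # base64url without padding: reads stdin, writes a single line
    openssl base64 -A | tr '+/' '-_' | tr -d '='
}

make_jwt() {
    secret="$1"   # the JWT_SECRET value
    role="$2"     # anon or service_role
    iat="$3"      # issued-at time, Unix seconds
    exp="$4"      # expiry time, Unix seconds
    header='{"alg":"HS256","typ":"JWT"}'
    payload=$(printf '{"role":"%s","iss":"supabase","iat":%s,"exp":%s}' "$role" "$iat" "$exp")
    # header and payload are encoded separately and joined with a dot
    unsigned="$(printf '%s' "$header" | b64url).$(printf '%s' "$payload" | b64url)"
    # the signature is HMAC-SHA256 over the unsigned token, base64url-encoded
    signature=$(printf '%s' "$unsigned" | openssl dgst -sha256 -hmac "$secret" -binary | b64url)
    printf '%s.%s\n' "$unsigned" "$signature"
}
```

For example, `make_jwt "$JWT_SECRET" anon 1641769200 1799535600` prints a token that could be used as ANON_KEY; call it again with `service_role` for SERVICE_ROLE_KEY.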

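If you want to serve Supabase under a domain name, specify your domain (the one used to access Supabase from outside) in site_url, api_external_url, and supabase_public_url. If these are misconfigured, some management features of Supabase Studio, such as object storage management, may not work properly.

Pigsty uses the local MinIO by default. If you want to use S3 or another MinIO deployment for file storage, configure parameters such as s3_bucket, s3_endpoint, s3_access_key, and s3_secret_key.

You usually also need an external SMTP service to send email. Self-hosting the mail service is not recommended; consider a mature third-party service such as Mailchimp or Aliyun DirectMail.

For users in mainland China, we recommend configuring docker_registry_mirrors, or using proxy_env to specify a working proxy server to bypass the firewall; otherwise pulling images from DockerHub may fail or be extremely slow.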

all:
  children:

    # the supabase stateless (default username & password: supabase/pigsty)
    supa:
      hosts:
        10.10.10.10: {}
      vars:
        app: supabase # specify app name (supa) to be installed (in the apps)
        apps:         # define all applications
          supabase:   # the definition of supabase app
            conf:     # override /opt/supabase/.env
              # IMPORTANT: CHANGE JWT_SECRET AND REGENERATE CREDENTIALS ACCORDINGLY!
              # https://supabase.com/docs/guides/self-hosting/docker#securing-your-services
              JWT_SECRET: your-super-secret-jwt-token-with-at-least-32-characters-long
              ANON_KEY: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyAgCiAgICAicm9sZSI6ICJhbm9uIiwKICAgICJpc3MiOiAic3VwYWJhc2UtZGVtbyIsCiAgICAiaWF0IjogMTY0MTc2OTIwMCwKICAgICJleHAiOiAxNzk5NTM1NjAwCn0.dc_X5iR_VP_qT0zsiyj_I_OZ2T9FtRU2BBNWN8Bu4GE
              SERVICE_ROLE_KEY: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyAgCiAgICAicm9sZSI6ICJzZXJ2aWNlX3JvbGUiLAogICAgImlzcyI6ICJzdXBhYmFzZS1kZW1vIiwKICAgICJpYXQiOiAxNjQxNzY5MjAwLAogICAgImV4cCI6IDE3OTk1MzU2MDAKfQ.DaYlNEoUrrEn2Ig7tqibS-PHK5vgusbcbo7X36XVt4Q
              DASHBOARD_USERNAME: supabase
              DASHBOARD_PASSWORD: pigsty

              # postgres connection string (use the correct ip and port)
              POSTGRES_HOST: 10.10.10.10
              POSTGRES_PORT: 5436             # access via the 'default' service, which always route to the primary postgres
              POSTGRES_DB: postgres
              POSTGRES_PASSWORD: DBUser.Supa  # password for supabase_admin and multiple supabase users

              # expose supabase via domain name
              SITE_URL: http://supa.pigsty                # <------- Change This to your external domain name
              API_EXTERNAL_URL: http://supa.pigsty        # <------- Otherwise the storage api may not work!
              SUPABASE_PUBLIC_URL: http://supa.pigsty     # <------- Do not forget!

              # if using s3/minio as file storage
              S3_BUCKET: supa
              S3_ENDPOINT: https://sss.pigsty:9000
              S3_ACCESS_KEY: supabase
              S3_SECRET_KEY: S3User.Supabase
              S3_FORCE_PATH_STYLE: true
              S3_PROTOCOL: https
              S3_REGION: stub
              MINIO_DOMAIN_IP: 10.10.10.10  # sss.pigsty domain name will resolve to this ip statically

              # if using SMTP (optional)
              #SMTP_ADMIN_EMAIL: admin@example.com
              #SMTP_HOST: supabase-mail
              #SMTP_PORT: 2500
              #SMTP_USER: fake_mail_user
              #SMTP_PASS: fake_mail_password
              #SMTP_SENDER_NAME: fake_sender
              #ENABLE_ANONYMOUS_USERS: false

7.10 - Greenplum (MPP)

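Deploy and monitor Greenplum clusters with Pigsty to build a massively parallel processing (MPP) PostgreSQL data warehouse cluster.

Pigsty supports deploying Greenplum clusters and the derived distribution YMatrixDB, and can bring existing Greenplum deployments under Pigsty monitoring.

Overview

Greenplum / YMatrix cluster deployment is available only in the Professional and Enterprise editions and is not currently open source.

Installation

Pigsty provides installation packages for Greenplum 6 (@el7) and Greenplum 7 (@el8); users of the open-source edition can install and configure them on their own.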

# EL 7 Only (Greenplum6)
./node.yml -t node_install  -e '{"node_repo_modules":"pgsql","node_packages":["open-source-greenplum-db-6"]}'

# EL 8 Only (Greenplum7)
./node.yml -t node_install  -e '{"node_repo_modules":"pgsql","node_packages":["open-source-greenplum-db-7"]}'

Configuration

To define a Greenplum cluster, set pg_mode = gpsql and use the additional identity parameters pg_shard and gp_role.

#================================================================#
#                        GPSQL Clusters                          #
#================================================================#

#----------------------------------#
# cluster: mx-mdw (gp master)
#----------------------------------#
mx-mdw:
  hosts:
    10.10.10.10: { pg_seq: 1, pg_role: primary , nodename: mx-mdw-1 }
  vars:
    gp_role: master          # this cluster is used as greenplum master
    pg_shard: mx             # pgsql sharding name & gpsql deployment name
    pg_cluster: mx-mdw       # this master cluster name is mx-mdw
    pg_databases:
      - { name: matrixmgr , extensions: [ { name: matrixdbts } ] }
      - { name: meta }
    pg_users:
      - { name: meta , password: DBUser.Meta , pgbouncer: true }
      - { name: dbuser_monitor , password: DBUser.Monitor , roles: [ dbrole_readonly ], superuser: true }

    pgbouncer_enabled: true                # enable pgbouncer for greenplum master
    pgbouncer_exporter_enabled: false      # disable pgbouncer_exporter for greenplum master
    pg_exporter_params: 'host=127.0.0.1&sslmode=disable'  # use 127.0.0.1 as local monitor host

#----------------------------------#
# cluster: mx-sdw (gp segments)
#----------------------------------#
mx-sdw:
  hosts:
    10.10.10.11:
      nodename: mx-sdw-1        # greenplum segment node
      pg_instances:             # greenplum segment instances
        6000: { pg_cluster: mx-seg1, pg_seq: 1, pg_role: primary , pg_exporter_port: 9633 }
        6001: { pg_cluster: mx-seg2, pg_seq: 2, pg_role: replica , pg_exporter_port: 9634 }
    10.10.10.12:
      nodename: mx-sdw-2
      pg_instances:
        6000: { pg_cluster: mx-seg2, pg_seq: 1, pg_role: primary , pg_exporter_port: 9633  }
        6001: { pg_cluster: mx-seg3, pg_seq: 2, pg_role: replica , pg_exporter_port: 9634  }
    10.10.10.13:
      nodename: mx-sdw-3
      pg_instances:
        6000: { pg_cluster: mx-seg3, pg_seq: 1, pg_role: primary , pg_exporter_port: 9633 }
        6001: { pg_cluster: mx-seg1, pg_seq: 2, pg_role: replica , pg_exporter_port: 9634 }
  vars:
    gp_role: segment               # these are nodes for gp segments
    pg_shard: mx                   # pgsql sharding name & gpsql deployment name
    pg_cluster: mx-sdw             # these segment clusters name is mx-sdw
    pg_preflight_skip: true        # skip preflight check (since pg_seq & pg_role & pg_cluster not exists)
    pg_exporter_config: pg_exporter_basic.yml                             # use basic config to avoid segment server crash
    pg_exporter_params: 'options=-c%20gp_role%3Dutility&sslmode=disable'  # use gp_role = utility to connect to segments

In addition, PG Exporter needs extra connection parameters to connect to Greenplum segment instances and collect monitoring metrics.
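
The pg_exporter_params value used for segments is a URL query string in which the libpq `options` value `-c gp_role=utility` is percent-encoded. The following sketch shows how such a string can be produced; `urlencode` and `segment_exporter_params` are hypothetical helper names, and the encoder only handles ASCII input.

```shell
# Sketch: percent-encode a string, keeping only RFC 3986 unreserved characters.
urlencode() {
    s="$1"
    out=""
    while [ -n "$s" ]; do
        rest="${s#?}"          # everything after the first character
        c="${s%"$rest"}"       # the first character itself
        case "$c" in
            [A-Za-z0-9._~-]) out="$out$c" ;;
            *) out="$out$(printf '%%%02X' "'$c")" ;;
        esac
        s="$rest"
    done
    printf '%s\n' "$out"
}

# Build the query string used to connect to segments in utility mode.
segment_exporter_params() {
    printf 'options=%s&sslmode=disable\n' "$(urlencode '-c gp_role=utility')"
}
```

Calling `segment_exporter_params` reproduces the value shown in the segment cluster definition.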

7.11 - Cloudberry (MPP)

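Deploy and monitor Cloudberry clusters with Pigsty: an MPP data warehouse cluster forked from Greenplum.

Installation

Pigsty provides installation packages for Cloudberry (package name cloudberrydb); users of the open-source edition can install and configure them on their own.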

# EL 7 Only (Greenplum6)
./node.yml -t node_install  -e '{"node_repo_modules":"pgsql","node_packages":["cloudberrydb"]}'

# EL 8 Only (Greenplum7)
./node.yml -t node_install  -e '{"node_repo_modules":"pgsql","node_packages":["cloudberrydb"]}'

7.12 - Neon (Serverless)

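Use Neon, the open-source serverless PostgreSQL kernel, to self-host a PG service with flexible scaling, scale to zero, and flexible branching.

Neon separates storage from compute and offers unique capabilities such as smooth autoscaling, scale to zero, and database branching.

Neon website: https://neon.tech/

Neon's compiled binaries are too large to distribute to open-source edition users for now. The module is currently in the pilot stage; contact Pigsty sales if you need it.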

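8 - Module: INFRA

A standalone module that provides infrastructure services for Pigsty: NTP, DNS, and the Prometheus & Grafana observability stack.

8.1 - Architecture

An introduction to the overall architecture, functional components, and division of responsibilities of the INFRA module in Pigsty.

A standard Pigsty deployment comes with one INFRA module, which serves the managed nodes and database clusters:

- Nginx: serves the local software repository as a web server, and fronts the other Web UI services as a reverse proxy.
- Grafana: visualization platform that presents monitoring metrics and dashboards, and supports data analysis and visualization.
  - Loki: collects and stores logs centrally so they can be queried from Grafana.
- Prometheus: monitoring time-series database that scrapes metrics, stores monitoring data, and evaluates alerting rules.
  - AlertManager: aggregates alert events, dispatches alert notifications, and handles silencing and management.
  - PushGateway: collects metrics from one-off and batch jobs.
  - BlackboxExporter: probes the reachability of node IP and VIP addresses.
- DNSMASQ: provides DNS resolution for the domain names used inside Pigsty.
- Chronyd: provides NTP time synchronization so that all nodes keep consistent time.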

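The INFRA module is not mandatory for highly available PostgreSQL; for example, it is not installed in Slim Install mode.

However, the INFRA module provides the supporting services needed to run production-grade highly available PostgreSQL clusters, so installing and enabling it is strongly recommended.

If you already have your own infrastructure (Nginx, local repository, monitoring system, DNS, NTP), you can disable the INFRA module and modify the configuration to use the existing infrastructure instead.

Overview

The Infra module includes the following components by default, with the following default ports and domain names:

Component	Port	Default Domain	Description
Nginx	`80/443`	`h.pigsty`	Web service portal (local software repository)
Grafana	`3000`	`g.pigsty`	Visualization platform
Prometheus	`9090`	`p.pigsty`	Time-series database (collects and stores monitoring metrics)
AlertManager	`9093`	`a.pigsty`	Alert aggregation and dispatch
Loki	`3100`	-	Log collection server
PushGateway	`9091`	-	Accepts metrics from one-off jobs
BlackboxExporter	`9115`	-	Blackbox monitoring probes
DNSMasq	`53`	-	DNS server
Chronyd	`123`	-	NTP time server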

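With the full Pigsty feature set installed on a single node, the components on the node look roughly as shown in the figure below:

By default, a failure of the INFRA module usually does not affect the normal operation of existing PostgreSQL database clusters.

In Pigsty, the PGSQL module uses several services provided by the INFRA module, specifically:

- Domain names of database clusters and host nodes are resolved by DNSMASQ on the INFRA nodes.
  - Pigsty itself does not use these domain names; it connects directly by IP address to avoid depending on DNS.
- Installing software on database nodes uses the local yum/apt repository served by Nginx in the INFRA module.
  - Users can set repo_upstream and node_repo_modules to download and install software directly from Internet upstreams or other local repositories.
- Monitoring metrics of database clusters and nodes are scraped by Prometheus on the INFRA nodes.
  - When prometheus_enabled is false, no metrics are collected.
- Logs on database nodes are collected by Promtail and sent to Loki on the INFRA nodes (only to the endpoint defined in infra_portal).
  - If loki_enabled is false, no logs are collected.
- Database nodes synchronize time from the NTP/Chronyd server on the INFRA/ADMIN nodes by default.
  - Infra nodes are configured to use public NTP servers by default.
  - Other nodes synchronize time from the NTP/Chronyd server on the INFRA/ADMIN nodes.
  - If you have dedicated NTP servers, configure node_ntp_servers to use them.
- If there is no dedicated cluster, the high-availability component Patroni uses etcd on the INFRA nodes as its HA DCS.
- If there is no dedicated cluster, the backup component pgbackrest uses minio on the INFRA nodes as an optional centralized backup repository.
- Users run Ansible or other tools from the Infra/Admin nodes to manage database nodes:
  - Create clusters, scale out and in, and decommission instances or clusters.
  - Create business users and databases, and modify services and HBA rules.
  - Run log collection, garbage cleanup, backups, inspections, and so on.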

Nginx

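Nginx is the access entry point for all WebUI services in Pigsty. By default it serves HTTP / HTTPS on ports 80 / 443.

Infrastructure components with a WebUI can be exposed uniformly through Nginx, for example Grafana, Prometheus, AlertManager, and the HAProxy console. Static file resources such as the local yum/apt repository are also served internally through Nginx.

Nginx configures local web servers or reverse proxy servers according to the definition of infra_portal. The default configuration is: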

infra_portal:
  home         : { domain: h.pigsty }
  grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" ,websocket: true }
  prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
  alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }
  #minio        : { domain: sss.pigsty  ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }

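The endpoints recorded here by default are referenced by other services in the environment. For example, logs are sent to the endpoint of loki, Grafana data sources are registered to the endpoint of grafana, and alerts are sent to the endpoint of alertmanager.

Pigsty allows extensive customization of Nginx: it can act as a local file server or as a reverse proxy server, with either self-signed or real HTTPS certificates.

Sample Nginx configuration of the Pigsty demo site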

infra_portal:                     # domain names and upstream servers
  home         : { domain: home.pigsty.cc                                                 ,certbot: pigsty.demo }
  grafana      : { domain: demo.pigsty.cc ,endpoint: "${admin_ip}:3000", websocket: true  ,certbot: pigsty.demo }
  prometheus   : { domain: p.pigsty.cc    ,endpoint: "${admin_ip}:9090"                   ,certbot: pigsty.demo }
  alertmanager : { domain: a.pigsty.cc    ,endpoint: "${admin_ip}:9093"                   ,certbot: pigsty.demo }
  blackbox     : { endpoint: "${admin_ip}:9115"                                                               }
  loki         : { endpoint: "${admin_ip}:3100"                                                               }
  postgrest    : { domain: api.pigsty.cc  ,endpoint: "127.0.0.1:8884"                                         }
  pgadmin      : { domain: adm.pigsty.cc  ,endpoint: "127.0.0.1:8885"                                         }
  pgweb        : { domain: cli.pigsty.cc  ,endpoint: "127.0.0.1:8886"                                         }
  bytebase     : { domain: ddl.pigsty.cc  ,endpoint: "127.0.0.1:8887"                                         }
  jupyter      : { domain: lab.pigsty.cc  ,endpoint: "127.0.0.1:8888"   ,websocket: true                      }
  gitea        : { domain: git.pigsty.cc  ,endpoint: "127.0.0.1:8889"                     ,certbot: pigsty.cc }
  wiki         : { domain: wiki.pigsty.cc ,endpoint: "127.0.0.1:9002"                     ,certbot: pigsty.cc }
  noco         : { domain: noco.pigsty.cc ,endpoint: "127.0.0.1:9003"                     ,certbot: pigsty.cc }
  supa         : { domain: supa.pigsty.cc ,endpoint: "10.2.82.163:8000" ,websocket: true  ,certbot: pigsty.cc }
  dify         : { domain: dify.pigsty.cc ,endpoint: "10.2.82.163:8001" ,websocket: true  ,certbot: pigsty.cc }
  odoo         : { domain: odoo.pigsty.cc ,endpoint: "127.0.0.1:8069"   ,websocket: true  ,certbot: pigsty.cc }
  mm           : { domain: mm.pigsty.cc   ,endpoint: "10.2.82.163:8065" ,websocket: true                      }
  web.io:
    domain: en.pigsty.cc
    path: "/www/web.io"
    certbot: pigsty.doc
    enforce_https: true
    config: |
      # rewrite /zh/ to /
          location /zh/ {
              rewrite ^/zh/(.*)$ /$1 permanent;
          }      
  web.cc:
    domain: pigsty.cc
    path: "/www/web.cc"
    domains: [ zh.pigsty.cc ]
    certbot: pigsty.doc
    config: |
      # rewrite /zh/ to /
          location /zh/ {
              rewrite ^/zh/(.*)$ /$1 permanent;
          }      
  repo:
    domain: pro.pigsty.cc
    path: "/www/repo"
    index: true
    certbot: pigsty.doc

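Local Software Repository

During installation, Pigsty creates a local software repository on the Infra node by default to speed up subsequent software installation.

The repository is located in /www/pigsty by default, is served by Nginx, and can be accessed at http://h.pigsty/pigsty.

Pigsty's offline package is simply a tarball of an already-built repository directory. When Pigsty tries to build the local repo and finds that the directory /www/pigsty already exists and contains the marker file /www/pigsty/repo_complete, it treats the local repo as complete and skips downloading software from the original upstream, which removes the dependency on Internet access.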
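
The marker-file check can be sketched in a few lines of shell. This only illustrates the logic described in the paragraph above, not Pigsty's actual playbook code; `repo_is_complete` is a hypothetical name.

```shell
# Sketch: the repo counts as complete when the directory and its marker file both exist.
repo_is_complete() {
    repo_dir="${1:-/www/pigsty}"
    [ -d "$repo_dir" ] && [ -f "$repo_dir/repo_complete" ]
}

if repo_is_complete; then
    echo "local repo is complete, skipping upstream download"
else
    echo "local repo is missing or incomplete, building from upstream"
fi
```

Removing the marker file is therefore enough to make the check fail on the next run.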

The repo definition file is located at /www/pigsty.repo and can be fetched from http://${admin_ip}/pigsty.repo by default:

curl -L http://h.pigsty/pigsty.repo -o /etc/yum.repos.d/pigsty.repo

You can also use the local file repo directly without Nginx:

[pigsty-local]
name=Pigsty local $releasever - $basearch
baseurl=file:///www/pigsty/
enabled=1
gpgcheck=0

Configuration parameters for the local software repository are documented in: Configuration: INFRA - REPO

Prometheus

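Prometheus is the time-series database used for monitoring. It listens on port 9090 by default and can be accessed directly at IP:9090 or through the domain http://p.pigsty.

Prometheus provides the following functions:

- By default, Prometheus discovers monitoring targets through local static file service discovery and attaches identity information to them.
- Prometheus scrapes metrics from exporters, pre-processes them, and stores them in its own TSDB.
- Prometheus evaluates alerting rules and sends alert events to Alertmanager for handling.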
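
To make the first point concrete, file-based service discovery reads target files that pair a list of scrape addresses with a set of labels. The sketch below renders one such entry; the function name `render_sd_target`, the label names `cls`, `ins`, and `ip`, and the port 9630 are assumptions made for illustration and may differ from the files Pigsty actually generates.

```shell
# Sketch: render one Prometheus file_sd target entry (YAML) with identity labels.
render_sd_target() {
    cls="$1"    # cluster name, e.g. pg-meta
    ins="$2"    # instance name, e.g. pg-meta-1
    ip="$3"     # node IP address
    port="$4"   # exporter port to scrape
    cat <<EOF
- labels: { cls: $cls, ins: $ins, ip: $ip }
  targets:
    - $ip:$port
EOF
}
```

For example, `render_sd_target pg-meta pg-meta-1 10.10.10.10 9630` prints an entry that could be saved under a directory watched by a `file_sd_configs` job.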

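AlertManager is the alerting platform that accompanies Prometheus. It listens on port 9093 by default and can be accessed directly at IP:9093 or through the domain http://a.pigsty. Alert events from Prometheus are sent to AlertManager, but to process them further you need to configure it, for example by providing SMTP settings to send alert emails.

Configuration parameters for Prometheus, AlertManager, PushGateway, and BlackboxExporter are documented in: Configuration: INFRA - PROMETHEUS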

Grafana

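Grafana is an open-source visualization and monitoring platform and the core of Pigsty's WebUI. It listens on port 3000 by default and can be accessed directly at IP:3000 or through the domain http://g.pigsty.

Pigsty's monitoring system is built on dashboards that are connected and navigated by URL. You can drill down and roll up quickly in the monitoring system to locate faults and problems.

Grafana can also serve as a general-purpose low-code frontend and backend platform for building interactive data visualization applications. For this reason, the Grafana shipped with Pigsty includes some extra visualization plugins, such as the ECharts panel.

Loki is the log database used for log collection. It listens on port 3100 by default; Promtail on each node pushes logs to Loki on the meta (INFRA) node.

Configuration parameters for Grafana and Loki are documented in: Configuration: INFRA - GRAFANA, Configuration: INFRA - Loki

Ansible

Pigsty installs Ansible on the meta (INFRA) node by default. Ansible is a popular operations tool; its declarative configuration style and idempotent playbook design can greatly reduce the complexity of system maintenance.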

DNSMASQ

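DNSMASQ provides DNS resolution within the environment. Domain names of other modules are registered with the DNSMASQ service on the INFRA nodes.

DNS records are placed in the /etc/hosts.d/ directory on all INFRA nodes by default.

Configuration parameters for DNSMASQ are documented in: Configuration: INFRA - DNS

Chronyd

The NTP service synchronizes time across all nodes in the environment (optional).

Configuration parameters for NTP are documented in: Configuration: NODES - NTP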

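8.2 - Configuration

How do you configure Infra nodes? How do you customize the Nginx server configuration and the contents of the local software repository? This page covers how to configure DNS, NTP, and the monitoring components.

Configuration Guide

INFRA mainly provides the monitoring infrastructure and is optional for PostgreSQL databases.

Unless you have manually configured a dependency on the DNS / NTP services of the INFRA nodes somewhere, a failure of the INFRA module usually does not affect the normal operation of PostgreSQL database clusters.

In most cases a single INFRA node is enough. For production environments with higher requirements, 2 to 3 INFRA nodes are recommended for high availability.

To improve resource utilization, the ETCD module that PostgreSQL high availability depends on can share nodes with the INFRA module.

Using more than 3 INFRA nodes brings little benefit, but you can use more ETCD nodes (for example 5) to improve the availability and reliability of the DCS service.

Configuration Examples

To install the INFRA module on a node, first add the node IP to the infra group in the inventory and assign it an Infra instance number, infra_seq.

By default, a single INFRA node is enough for most scenarios, and all configuration templates include a definition of the infra group: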

all:
  children:
    infra: { hosts: { 10.10.10.10: { infra_seq: 1 } }}

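By default, the 10.10.10.10 placeholder IP in the infra group is replaced with the primary IP address of the current node during configuration, so the INFRA module is installed on the current node.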
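
The placeholder substitution amounts to a text replacement over the inventory file. The sketch below prints the rewritten inventory to standard output; `render_inventory` is a hypothetical helper shown only to illustrate the idea, since Pigsty's configure step performs the replacement for you.

```shell
# Sketch: replace the placeholder IP in an inventory file and print the result.
render_inventory() {
    inventory="$1"     # path to the inventory file, e.g. pigsty.yml
    primary_ip="$2"    # primary IP address of the current node
    # Plain substring replacement: it would also rewrite addresses that merely
    # start with the placeholder, such as 10.10.10.101.
    sed "s/10\.10\.10\.10/$primary_ip/g" "$inventory"
}
```

For example, `render_inventory pigsty.yml 192.168.1.5 > pigsty.new.yml` writes a copy with the placeholder replaced, leaving the original file untouched.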

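Then use the infra.yml playbook to initialize the INFRA module on the node.

Infra High Availability

Most components in the Infra module are stateless or share the same state. For such components, high availability only requires taking care of load balancing.

Load balancing for Infra components can be implemented in two ways: a Keepalived L2 VIP, or HAProxy layer-4 load balancing.

If your network has layer-2 connectivity, you can use a Keepalived L2 VIP for high availability.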

infra:
  hosts:
    10.10.10.10: { infra_seq: 1 }
    10.10.10.11: { infra_seq: 2 }
    10.10.10.12: { infra_seq: 3 }
  vars:
    vip_enabled: true
    vip_vrid: 128
    vip_address: 10.10.10.8
    vip_interface: eth1

    infra_portal:
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "10.10.10.8:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "10.10.10.8:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "10.10.10.8:9093" }
      blackbox     : { endpoint: "10.10.10.8:9115" }
      loki         : { endpoint: "10.10.10.8:3100" }

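Besides setting vip_address and the other VIP parameters, you also need to change the endpoints of the Infra services in infra_portal.

Nginx Configuration

Local Repository Configuration

DNS Configuration

NTP Configuration

8.3 - Parameters

The Infra module provides 60 configuration parameters in 10 groups for customizing every aspect of the local software repository, Nginx, DNS, NTP, and the observability stack.

The 10 parameter groups of the INFRA module are:

META: Pigsty metadata, version number, admin node
CA: self-signed public key infrastructure / CA
INFRA_ID: infrastructure portal, Nginx / domain configuration
REPO: local software repository: YUM/APT
INFRA_PACKAGE: infrastructure packages
NGINX: Nginx web server and Certbot certificates
DNS: DNSMASQ name server
PROMETHEUS: Prometheus time-series database suite
GRAFANA: Grafana observability suite
LOKI: Loki logging service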

Parameter	Group	Type	Level	Description
`version`	`META`	`string`	G	pigsty version string
`admin_ip`	`META`	`ip`	G	admin node IP address
`region`	`META`	`enum`	G	upstream mirror region: default,china,europe
`proxy_env`	`META`	`dict`	G	global proxy environment variables used when downloading packages
`ca_method`	`CA`	`enum`	G	CA handling method: create,recreate,copy; by default, create if absent
`ca_cn`	`CA`	`string`	G	CA CN name, fixed to pigsty-ca
`cert_validity`	`CA`	`interval`	G	certificate validity, 20 years by default
`infra_seq`	`INFRA_ID`	`int`	I	infra node sequence number, required identity parameter
`infra_portal`	`INFRA_ID`	`dict`	G	list of infrastructure services exposed through the Nginx portal
`repo_enabled`	`REPO`	`bool`	G/I	create a software repository on this infra node?
`repo_home`	`REPO`	`path`	G	repository home directory, `/www` by default
`repo_name`	`REPO`	`string`	G	repository name, pigsty by default
`repo_endpoint`	`REPO`	`url`	G	repository access point: domain name or `ip:port`
`repo_remove`	`REPO`	`bool`	G/A	remove existing upstream repo definition files when building the local repo?
`repo_modules`	`REPO`	`string`	G/A	enabled upstream repo modules, comma-separated
`repo_upstream`	`REPO`	`upstream[]`	G	upstream repo definitions: where to download upstream packages
`repo_packages`	`REPO`	`string[]`	G	which packages to download from upstream repos
`repo_extra_packages`	`REPO`	`string[]`	G/C/I	which extra packages to download from upstream repos
`repo_url_packages`	`REPO`	`string[]`	G	extra packages downloaded by URL
`infra_packages`	`INFRA_PACKAGE`	`string[]`	G	packages to install on infra nodes
`infra_packages_pip`	`INFRA_PACKAGE`	`string`	G	packages to install with pip on infra nodes
`nginx_enabled`	`NGINX`	`bool`	G/I	enable nginx on this infra node?
`nginx_exporter_enabled`	`NGINX`	`bool`	G/I	enable nginx_exporter on this infra node?
`nginx_sslmode`	`NGINX`	`enum`	G	nginx SSL mode: disable,enable,enforce
`nginx_home`	`NGINX`	`path`	G	nginx content directory, `/www` by default, usually the same as the repo directory
`nginx_port`	`NGINX`	`port`	G	nginx listen port, 80 by default
`nginx_ssl_port`	`NGINX`	`port`	G	nginx SSL listen port, 443 by default
`nginx_navbar`	`NGINX`	`index[]`	G	list of navigation links on the nginx home page
`certbot_sign`	`NGINX`	`bool`	G/A	use certbot to request certificates automatically? `false` by default
`certbot_email`	`NGINX`	`string`	G/A	email used when requesting certificates; receives expiration reminders
`certbot_option`	`NGINX`	`string`	G/A	extra options passed when requesting certificates
`dns_enabled`	`DNS`	`bool`	G/I	set up dnsmasq on this infra node?
`dns_port`	`DNS`	`port`	G	DNS server listen port, 53 by default
`dns_records`	`DNS`	`string[]`	G	dynamic DNS records resolved by dnsmasq
`prometheus_enabled`	`PROMETHEUS`	`bool`	G/I	enable prometheus on this infra node?
`prometheus_clean`	`PROMETHEUS`	`bool`	G/A	remove existing data when initializing Prometheus?
`prometheus_data`	`PROMETHEUS`	`path`	G	Prometheus data directory, `/data/prometheus` by default
`prometheus_sd_dir`	`PROMETHEUS`	`path`	G	Prometheus service discovery target file directory
`prometheus_sd_interval`	`PROMETHEUS`	`interval`	G	Prometheus target refresh interval, 5s by default
`prometheus_scrape_interval`	`PROMETHEUS`	`interval`	G	Prometheus 抓取 & 评估间隔，默认为 10s
`prometheus_scrape_timeout`	`PROMETHEUS`	`interval`	G	Prometheus 全局抓取超时，默认为 8s
`prometheus_options`	`PROMETHEUS`	`arg`	G	Prometheus 额外的命令行参数选项
`pushgateway_enabled`	`PROMETHEUS`	`bool`	G/I	在此基础设施节点上设置 pushgateway？
`pushgateway_options`	`PROMETHEUS`	`arg`	G	pushgateway 额外的命令行参数选项
`blackbox_enabled`	`PROMETHEUS`	`bool`	G/I	在此基础设施节点上设置 blackbox_exporter？
`blackbox_options`	`PROMETHEUS`	`arg`	G	blackbox_exporter 额外的命令行参数选项
`alertmanager_enabled`	`PROMETHEUS`	`bool`	G/I	在此基础设施节点上设置 alertmanager？
`alertmanager_port`	`PROMETHEUS`	`arg`	G	alertmanager 监听端口号，默认为 `9093`
`alertmanager_options`	`PROMETHEUS`	`arg`	G	alertmanager 额外的命令行参数选项
`exporter_metrics_path`	`PROMETHEUS`	`path`	G	exporter 指标路径，默认为 /metrics
`exporter_install`	`PROMETHEUS`	`enum`	G	如何安装 exporter？none,yum,binary
`exporter_repo_url`	`PROMETHEUS`	`url`	G	通过 yum 安装exporter时使用的yum仓库文件地址
`grafana_enabled`	`GRAFANA`	`bool`	G/I	在此基础设施节点上启用 Grafana？
`grafana_clean`	`GRAFANA`	`bool`	G/A	初始化Grafana期间清除数据？
`grafana_admin_username`	`GRAFANA`	`username`	G	Grafana 管理员用户名，默认为 `admin`
`grafana_admin_password`	`GRAFANA`	`password`	G	Grafana 管理员密码，默认为 `pigsty`
`loki_enabled`	`LOKI`	`bool`	G/I	在此基础设施节点上启用 loki？
`loki_clean`	`LOKI`	`bool`	G/A	是否删除现有的 loki 数据？
`loki_data`	`LOKI`	`path`	G	loki 数据目录，默认为 `/data/loki`
`loki_retention`	`LOKI`	`interval`	G	loki 日志保留期，默认为 15d

`META`

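This section defines the metadata of a Pigsty deployment: the version string, the admin node IP address, the upstream mirror region, and the http(s) proxy used when downloading packages.

version: v3.4.0                   # pigsty version string
admin_ip: 10.10.10.10             # admin node IP address
region: default                   # upstream mirror region: default,china,europe
proxy_env:                        # global proxy env used when downloading and installing packages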
  no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.myqcloud.com,*.tsinghua.edu.cn"
  # http_proxy:  # set your proxy here: e.g http://user:pass@proxy.xxx.com
  # https_proxy: # set your proxy here: e.g http://user:pass@proxy.xxx.com
  # all_proxy:   # set your proxy here: e.g http://user:pass@proxy.xxx.com

`version`

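Parameter name: version, type: string, level: G

Pigsty version string. The default is the current version: v3.4.0.

Pigsty uses the version string internally for feature control and content rendering.

Pigsty uses semantic versioning, and the version string usually starts with the character v.

`admin_ip`

Parameter name: admin_ip, type: ip, level: G

IP address of the admin node. The default is the placeholder IP address 10.10.10.10.

The node designated by this parameter is treated as the admin node. It usually points to the first node on which Pigsty is installed, i.e. the control node.

The default value 10.10.10.10 is a placeholder that is replaced with the actual admin node IP address during configure.

Many parameters reference this one, for example infra_portal, repo_endpoint, and repo_upstream.

In these parameters, the string ${admin_ip} is replaced with the actual value of admin_ip. With this mechanism, you can designate different admin nodes for different nodes.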
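The placeholder substitution described above can be sketched as follows. This is a minimal illustration that assumes a plain recursive string replacement; the helper name render_admin_ip is hypothetical and not part of Pigsty, whose actual implementation may differ.

```python
def render_admin_ip(value, admin_ip):
    """Recursively replace the literal ${admin_ip} placeholder in strings."""
    if isinstance(value, str):
        return value.replace("${admin_ip}", admin_ip)
    if isinstance(value, dict):
        return {k: render_admin_ip(v, admin_ip) for k, v in value.items()}
    if isinstance(value, list):
        return [render_admin_ip(v, admin_ip) for v in value]
    return value  # booleans, numbers, None are returned unchanged


# Two entries copied from the default infra_portal definition
portal = {
    "grafana": {"domain": "g.pigsty", "endpoint": "${admin_ip}:3000", "websocket": True},
    "blackbox": {"endpoint": "${admin_ip}:9115"},
}

rendered = render_admin_ip(portal, "10.10.10.10")
print(rendered["grafana"]["endpoint"])
print(rendered["blackbox"]["endpoint"])
```

Non-string values such as the websocket flag pass through untouched, so the same helper can be applied to a whole parameter tree.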

`region`

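Parameter name: region, type: enum, level: G

Region of the upstream mirrors. Available values are default, china, and europe; the default is default.

If a region other than default is set and an entry in repo_upstream has a baseurl for that region, that baseurl is used instead of the default one.

For example, if region is set to china, Pigsty tries to use upstream mirror sites in China to speed up downloads. If an upstream repo has no mirror for China, the default upstream site is used instead. In addition, URLs defined in repo_url_packages are rewritten from repo.pigsty.io to repo.pigsty.cc to use the mirror in China.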
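The region fallback rule can be illustrated with a short sketch. The function names pick_baseurl and localize_url are hypothetical, and the package URL in the last call is a made-up example; only the baseurl dictionaries are taken from the default repo_upstream entries.

```python
def pick_baseurl(baseurl, region="default"):
    """Return the region-specific baseurl, falling back to the default one."""
    return baseurl.get(region, baseurl["default"])


def localize_url(url, region="default"):
    """Rewrite repo.pigsty.io to repo.pigsty.cc when the region is china."""
    if region == "china":
        return url.replace("repo.pigsty.io", "repo.pigsty.cc")
    return url


# pigsty-infra has a china mirror; nginx only defines a default baseurl
infra = {
    "default": "https://repo.pigsty.io/yum/infra/$basearch",
    "china": "https://repo.pigsty.cc/yum/infra/$basearch",
}
nginx = {"default": "https://nginx.org/packages/rhel/$releasever/$basearch/"}

print(pick_baseurl(infra, "china"))
print(pick_baseurl(nginx, "china"))
# Hypothetical URL, used only to show the host rewrite for repo_url_packages
print(localize_url("https://repo.pigsty.io/etc/example.tgz", "china"))
```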

`proxy_env`

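Parameter name: proxy_env, type: dict, level: G

Global proxy environment variables used when downloading packages. The default value specifies only no_proxy, the list of addresses that bypass the proxy: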

proxy_env:
  no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.aliyuncs.com,mirrors.tuna.tsinghua.edu.cn,mirrors.zju.edu.cn"
  #http_proxy: 'http://username:password@proxy.address.com'
  #https_proxy: 'http://username:password@proxy.address.com'
  #all_proxy: 'http://username:password@proxy.address.com'

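When installing from upstream repos over the Internet in mainland China, certain packages may be blocked; you can use a proxy to work around this.

Note that if the Docker module is used, the proxy configuration here is also written to the Docker daemon configuration file.

Note that if the -x option is passed to ./configure, the proxy settings in the current environment are automatically filled into the generated pigsty.yml file.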

`CA`

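Self-signed CA certificate used by Pigsty to support advanced security features.

ca_method: create                 # CA handling method: create,recreate,copy; create if missing by default
ca_cn: pigsty-ca                  # CA CN name, fixed as pigsty-ca
cert_validity: 7300d              # certificate validity, 20 years by default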

`ca_method`

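Parameter name: ca_method, type: enum, level: G

CA handling method: create, recreate, or copy.

The default is create: a new CA is created only if none exists.

create: if no CA exists in files/pki/ca, create a brand-new CA key pair; otherwise use the existing CA key pair.
recreate: always create a new CA key pair and overwrite the existing one. Note that this is a dangerous operation.
copy: assume a CA key pair already exists in files/pki/ca and use it. If it does not exist, an error is raised.

If you already have a CA key pair, copy it to the files/pki/ca directory and set ca_method to copy; Pigsty will then use the existing CA key pair instead of creating a new one.

Be sure to keep and back up the CA private key file newly generated for a deployment.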
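The three ca_method modes amount to a small decision table. The sketch below encodes it; plan_ca_action and its return values "generate" and "reuse" are hypothetical names chosen for illustration, not Pigsty internals.

```python
def plan_ca_action(ca_method, ca_exists):
    """Decide what to do with the CA key pair under files/pki/ca.

    ca_exists tells whether a CA key pair is already present.
    Returns "generate" (create a new key pair) or "reuse" (keep the existing one).
    """
    if ca_method == "create":
        return "reuse" if ca_exists else "generate"
    if ca_method == "recreate":
        return "generate"  # dangerous: overwrites any existing key pair
    if ca_method == "copy":
        if not ca_exists:
            raise FileNotFoundError("ca_method is copy but no CA found in files/pki/ca")
        return "reuse"
    raise ValueError(f"unknown ca_method: {ca_method!r}")


for method, exists in [("create", False), ("create", True), ("recreate", True), ("copy", True)]:
    print(method, exists, plan_ca_action(method, exists))
```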

`ca_cn`

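Parameter name: ca_cn, type: string, level: G

CA CN name, fixed as pigsty-ca. Changing it is not recommended.

You can inspect the Pigsty CA certificate on a node with the following command: openssl x509 -text -in /etc/pki/ca.crt

`cert_validity`

Parameter name: cert_validity, type: interval, level: G

Validity period of issued certificates. The default is 7300d (20 years), which is enough for the vast majority of scenarios.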
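As a quick check of the arithmetic, 7300 days corresponds to 20 years of 365 days each. The sketch below assumes the interval is written as an integer followed by the suffix d; parse_days is a hypothetical helper and other interval units are not handled.

```python
from datetime import timedelta


def parse_days(interval):
    """Parse an interval such as '7300d' into a timedelta (days only)."""
    if not interval.endswith("d"):
        raise ValueError(f"expected a value like '7300d', got {interval!r}")
    return timedelta(days=int(interval[:-1]))


validity = parse_days("7300d")
years = validity.days / 365  # ignores leap days
print(validity.days, years)
```

Because leap days are ignored, a certificate issued with 7300d expires a few days short of 20 calendar years.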

`INFRA_ID`

Infrastructure identity and portal definition.

#infra_seq: 1                     # infra node identity, explicitly required
infra_portal:                     # infra services exposed via portal
  home         : { domain: h.pigsty }
  grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" ,websocket: true }
  prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
  alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }

`infra_seq`

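Parameter name: infra_seq, type: int, level: I

Infra node sequence number. It is a required identity parameter, so no default value is provided; it must be specified explicitly on infra nodes.

`infra_portal`

Parameter name: infra_portal, type: dict, level: G

List of infra services exposed through the Nginx portal. By default, Pigsty exposes the following services through Nginx: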

infra_portal:
  home         : { domain: h.pigsty }
  grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" ,websocket: true }
  prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
  alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }

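Each record consists of a key and a value dictionary. The key is the name of the component, and the value is an object that accepts the following fields:

name: required, the name of the Nginx server
- The default records home, grafana, prometheus, alertmanager, blackbox, and loki have fixed names; do not change them.
- Used as part of the Nginx config file name; the corresponding file is /etc/nginx/conf.d/<name>.conf
- An Nginx server without a domain field gets no config file and serves only as a reference.
domain: optional in general, but required when the service is exposed through Nginx; specifies the domain name to use
- In the Pigsty self-signed Nginx HTTPS certificate, the domain is added to the SAN field
- Cross-references between Pigsty web pages use the default domain names defined here
endpoint: usually mutually exclusive with path; specifies the upstream server address. Setting endpoint marks the entry as a reverse proxy server
- ${admin_ip} can be used as a placeholder in the config and is replaced with admin_ip at deployment time
- Reverse proxy servers use endpoint.conf as the config template by default
- Reverse proxy servers can also set the websocket and schema fields
path: usually mutually exclusive with endpoint; specifies a local file server path. Setting path marks the entry as a local web server
- Local web servers use path.conf as the config template by default
- Local web servers can also set the index field to enable the file index page
certbot: Certbot certificate name; if set, the certificate is requested with Certbot
- If several servers specify the same certbot name, Pigsty merges them into one request, and the final certificate is named after that certbot value
cert: path to the Nginx certificate file; if set, overrides the default certificate path
key: path to the Nginx certificate key file; if set, overrides the default key path
websocket: whether to enable WebSocket support
- Only reverse proxy servers can set this field; when enabled, the upstream may use WebSocket connections
schema: protocol used by the upstream server; if set, overrides the default protocol
- The default is http; setting https forces HTTPS connections to the upstream server
index: whether to enable the file index page
- Only local web servers can set this field; when enabled, autoindex is turned on and index pages are generated for directories automatically
log: Nginx log file path
- If specified, access logs are written to this file; otherwise the default log file for the server type is used
- Reverse proxy servers use /var/log/nginx/<name>.log by default
- Local web servers use the default access log
conf: Nginx config template
- Explicitly specifies the config template file to use, located in roles/infra/templates/nginx or templates/nginx
- When not specified, the default config template from the same directories is used
config: Nginx config snippet
- Config text injected directly into the Nginx server block
enforce_https: redirect the HTTP server to the HTTPS server
- This can be set globally with nginx_sslmode: enforce
- This setting does not affect the default home server, which always listens on both ports 80 and 443 for compatibility.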
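The field rules above can be summarized in a small classifier. This is a sketch under stated assumptions: classify_portal_entry and the kind labels are hypothetical, and entries that have a domain but neither endpoint nor path (such as home) are reported as "unspecified" because the text does not say which template they use.

```python
def classify_portal_entry(name, entry):
    """Classify one infra_portal record according to the documented field rules."""
    if "domain" not in entry:
        # No domain: no Nginx config file is generated, reference only
        return {"kind": "reference-only", "conf": None}
    if "endpoint" in entry:
        kind = "reverse-proxy"   # rendered from endpoint.conf by default
    elif "path" in entry:
        kind = "local-web"       # rendered from path.conf by default
    else:
        kind = "unspecified"     # e.g. home; template not stated in the text
    return {"kind": kind, "conf": f"/etc/nginx/conf.d/{name}.conf"}


portal = {
    "home": {"domain": "h.pigsty"},
    "grafana": {"domain": "g.pigsty", "endpoint": "${admin_ip}:3000", "websocket": True},
    "loki": {"endpoint": "${admin_ip}:3100"},
}

for name, entry in portal.items():
    print(name, classify_portal_entry(name, entry))
```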

`REPO`

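This section covers the local software repository. By default, Pigsty enables a local software repository (APT / YUM) on infra nodes.

During initialization, Pigsty downloads all packages (specified by repo_packages) and their dependencies from the Internet upstream repos (specified by repo_upstream) into {{ nginx_home }} / {{ repo_name }} (/www/pigsty by default). The total size of all packages and dependencies is about 1 GB.

When creating the local repo, if the repo already exists (detected by a marker file named repo_complete in the repo directory), Pigsty considers the repo complete, skips the download phase, and uses the existing repo directly.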
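The marker-file check can be sketched as below. The function repo_is_complete is a hypothetical helper; it assumes the marker lives at <nginx_home>/<repo_name>/repo_complete, and the example uses a temporary directory instead of /www so that it runs anywhere.

```python
import tempfile
from pathlib import Path


def repo_is_complete(nginx_home="/www", repo_name="pigsty"):
    """Return True if the repo_complete marker file exists in the repo directory."""
    return (Path(nginx_home) / repo_name / "repo_complete").is_file()


with tempfile.TemporaryDirectory() as home:
    repo_dir = Path(home) / "pigsty"
    repo_dir.mkdir()
    before = repo_is_complete(home)            # marker not created yet
    (repo_dir / "repo_complete").touch()
    after = repo_is_complete(home)             # marker now present

print(before, after)
```

Removing the marker file is therefore one way to make the download phase run again on the next initialization, assuming no other state is consulted.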

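If some packages download too slowly, you can set a download proxy with proxy_env to complete the first download, or download the pre-built offline package directly. An offline package is essentially a local repo built on the same operating system.

repo_enabled: true                # enable the local repo on this infra node?
repo_home: /www                   # repo home directory, `/www` by default
repo_name: pigsty                 # repo name, pigsty by default
repo_endpoint: http://${admin_ip}:80 # endpoint for accessing this repo: domain name or ip:port
repo_remove: true                 # remove existing upstream repo definitions
repo_modules: infra,node,pgsql    # upstream repo modules to enable when bootstrapping the repo
#repo_upstream: []                # where to download packages from
#repo_packages: []                # which packages to download
#repo_extra_packages: []          # extra packages to download
repo_url_packages: []             # extra packages downloaded by URL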

`repo_enabled`

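Parameter name: repo_enabled, type: bool, level: G/I

Enable the local software repo on the current infra node? The default is true, i.e. every infra node sets up a local repo.

If you have multiple infra nodes, you can keep only 1 or 2 of them as software repos and set this parameter to false on the others to avoid downloading and building the repo repeatedly.

`repo_home`

Parameter name: repo_home, type: path, level: G

Home directory of the local software repo. The default is the Nginx root directory, /www. Changing this directory is not recommended; if you do change it, keep it consistent with nginx_home.

`repo_name`

Parameter name: repo_name, type: string, level: G

Name of the local repo, pigsty by default. Changing the repo name is not advisable.

`repo_endpoint`

Parameter name: repo_endpoint, type: url, level: G

Endpoint that other nodes use to access this repo. The default is http://${admin_ip}:80.

By default, Pigsty starts Nginx on ports 80/443 of infra nodes to serve the local repo (static files).

If you changed nginx_port or nginx_ssl_port, or use an infra node other than the control node, adjust this parameter accordingly.

If you use a domain name, you can add a resolution record in node_default_etc_hosts, node_etc_hosts, or dns_records.

`repo_remove`

Parameter name: repo_remove, type: bool, level: G/A

Remove existing upstream repo definitions when building the local repo? The default is true.

When enabled, all existing repo files in /etc/yum.repos.d are moved to /etc/yum.repos.d/backup as a backup. On Debian-based systems, /etc/apt/sources.list and /etc/apt/sources.list.d are removed and backed up to /etc/apt/backup.

Because the content of the repos that ship with the operating system is not under your control, using upstream repos verified by Pigsty improves the success rate and speed of downloading packages from the Internet.

In some cases (for example, your operating system is an EL/Deb compatible distribution where many packages come from its own private repos), you may need to keep the existing upstream repo definitions; set this parameter to false in that case.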

`repo_modules`

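Parameter name: repo_modules, type: string, level: G/A

Which upstream repo modules are added to the local repo; the default is infra,node,pgsql.

When Pigsty adds upstream repos, it filters the entries in repo_upstream by this parameter: only entries whose module field matches one of the listed modules are added.

Modules are comma-separated; see the definitions in repo_upstream for the available modules.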
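The module filter can be sketched as a simple membership test. The helper select_upstream is hypothetical, and the entries are trimmed to the name and module fields of a few default repo_upstream records; any further filtering by release or architecture is left out here.

```python
def select_upstream(upstream, repo_modules):
    """Keep the names of upstream entries whose module is listed in repo_modules."""
    modules = {m.strip() for m in repo_modules.split(",") if m.strip()}
    return [repo["name"] for repo in upstream if repo["module"] in modules]


# A few entries reduced to the fields needed for the filter
upstream = [
    {"name": "pigsty-infra", "module": "infra"},
    {"name": "baseos", "module": "node"},
    {"name": "pgdg17", "module": "pgsql"},
    {"name": "timescaledb", "module": "extra"},
    {"name": "mysql", "module": "mysql"},
]

selected = select_upstream(upstream, "infra,node,pgsql")
print(selected)
```

With the default value infra,node,pgsql, entries tagged extra or mysql are skipped; adding a module name to the comma-separated list would include them.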

`repo_upstream`

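Parameter name: repo_upstream, type: upstream[], level: G

Where to download upstream packages from when building the local repo. This parameter has no default value: if it is not set explicitly in the config file, it is loaded from the repo_upstream_default variable defined in roles/node_id/vars according to the OS family of the current node.

For EL (7, 8, 9) systems, the default upstream repos are: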

- { name: pigsty-local   ,description: 'Pigsty Local'       ,module: local   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'http://${admin_ip}/pigsty'  }} # used by intranet nodes
- { name: pigsty-infra   ,description: 'Pigsty INFRA'       ,module: infra   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://repo.pigsty.io/yum/infra/$basearch' ,china: 'https://repo.pigsty.cc/yum/infra/$basearch' }}
- { name: pigsty-pgsql   ,description: 'Pigsty PGSQL'       ,module: pgsql   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://repo.pigsty.io/yum/pgsql/el$releasever.$basearch' ,china: 'https://repo.pigsty.cc/yum/pgsql/el$releasever.$basearch' }}
- { name: nginx          ,description: 'Nginx Repo'         ,module: infra   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://nginx.org/packages/rhel/$releasever/$basearch/' }}
- { name: docker-ce      ,description: 'Docker CE'          ,module: infra   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://download.docker.com/linux/centos/$releasever/$basearch/stable'        ,china: 'https://mirrors.aliyun.com/docker-ce/linux/centos/$releasever/$basearch/stable' ,europe: 'https://mirrors.xtom.de/docker-ce/linux/centos/$releasever/$basearch/stable' }}
- { name: baseos         ,description: 'EL 8+ BaseOS'       ,module: node    ,releases: [  8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/BaseOS/$basearch/os/'         ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/BaseOS/$basearch/os/'         ,europe: 'https://mirrors.xtom.de/rocky/$releasever/BaseOS/$basearch/os/'     }}
- { name: appstream      ,description: 'EL 8+ AppStream'    ,module: node    ,releases: [  8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/AppStream/$basearch/os/'      ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/AppStream/$basearch/os/'      ,europe: 'https://mirrors.xtom.de/rocky/$releasever/AppStream/$basearch/os/'  }}
- { name: extras         ,description: 'EL 8+ Extras'       ,module: node    ,releases: [  8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/extras/$basearch/os/'         ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/extras/$basearch/os/'         ,europe: 'https://mirrors.xtom.de/rocky/$releasever/extras/$basearch/os/'     }}
- { name: powertools     ,description: 'EL 8 PowerTools'    ,module: node    ,releases: [  8  ] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/PowerTools/$basearch/os/'     ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/PowerTools/$basearch/os/'     ,europe: 'https://mirrors.xtom.de/rocky/$releasever/PowerTools/$basearch/os/' }}
- { name: crb            ,description: 'EL 9 CRB'           ,module: node    ,releases: [    9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/CRB/$basearch/os/'            ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/CRB/$basearch/os/'            ,europe: 'https://mirrors.xtom.de/rocky/$releasever/CRB/$basearch/os/'        }}
- { name: epel           ,description: 'EL 8+ EPEL'         ,module: node    ,releases: [  8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'http://download.fedoraproject.org/pub/epel/$releasever/Everything/$basearch/' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/epel/$releasever/Everything/$basearch/'    ,europe: 'https://mirrors.xtom.de/epel/$releasever/Everything/$basearch/'     }}
- { name: pgdg-common    ,description: 'PostgreSQL Common'  ,module: pgsql   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/redhat/rhel-$releasever-$basearch' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/redhat/rhel-$releasever-$basearch' }}
- { name: pgdg-el8fix    ,description: 'PostgreSQL EL8FIX'  ,module: pgsql   ,releases: [  8  ] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' } }
- { name: pgdg-el9fix    ,description: 'PostgreSQL EL9FIX'  ,module: pgsql   ,releases: [    9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/'  ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/' }}
- { name: pgdg13         ,description: 'PostgreSQL 13'      ,module: pgsql   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/13/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/13/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/13/redhat/rhel-$releasever-$basearch' }}
- { name: pgdg14         ,description: 'PostgreSQL 14'      ,module: pgsql   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/14/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/14/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/14/redhat/rhel-$releasever-$basearch' }}
- { name: pgdg15         ,description: 'PostgreSQL 15'      ,module: pgsql   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/15/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/15/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/15/redhat/rhel-$releasever-$basearch' }}
- { name: pgdg16         ,description: 'PostgreSQL 16'      ,module: pgsql   ,releases: [  8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/16/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/16/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/16/redhat/rhel-$releasever-$basearch' }}
- { name: pgdg17         ,description: 'PostgreSQL 17'      ,module: pgsql   ,releases: [  8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/17/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/17/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/17/redhat/rhel-$releasever-$basearch' }}
- { name: pgdg-extras    ,description: 'PostgreSQL Extra'   ,module: extra   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' }}
- { name: pgdg13-nonfree ,description: 'PostgreSQL 13+'     ,module: extra   ,releases: [7,8,9] ,arch: [x86_64         ] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/non-free/13/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/non-free/13/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/non-free/13/redhat/rhel-$releasever-$basearch' }}
- { name: pgdg14-nonfree ,description: 'PostgreSQL 14+'     ,module: extra   ,releases: [7,8,9] ,arch: [x86_64         ] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/non-free/14/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/non-free/14/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/non-free/14/redhat/rhel-$releasever-$basearch' }}
- { name: pgdg15-nonfree ,description: 'PostgreSQL 15+'     ,module: extra   ,releases: [7,8,9] ,arch: [x86_64         ] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/non-free/15/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/non-free/15/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/non-free/15/redhat/rhel-$releasever-$basearch' }}
- { name: pgdg16-nonfree ,description: 'PostgreSQL 16+'     ,module: extra   ,releases: [  8,9] ,arch: [x86_64         ] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/non-free/16/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/non-free/16/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/non-free/16/redhat/rhel-$releasever-$basearch' }}
- { name: pgdg17-nonfree ,description: 'PostgreSQL 17+'     ,module: extra   ,releases: [  8,9] ,arch: [x86_64         ] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/non-free/17/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/non-free/17/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/non-free/17/redhat/rhel-$releasever-$basearch' }}
- { name: timescaledb    ,description: 'TimescaleDB'        ,module: extra   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://packagecloud.io/timescale/timescaledb/el/$releasever/$basearch'  }}
- { name: wiltondb       ,description: 'WiltonDB'           ,module: mssql   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://repo.pigsty.io/yum/mssql/el$releasever.$basearch', china: 'https://repo.pigsty.cc/yum/mssql/el$releasever.$basearch' , origin: 'https://download.copr.fedorainfracloud.org/results/wiltondb/wiltondb/epel-$releasever-$basearch/' }}
- { name: ivorysql       ,description: 'IvorySQL'           ,module: ivory   ,releases: [7,8,9] ,arch: [x86_64         ] ,baseurl: { default: 'https://repo.pigsty.io/yum/ivory/el$releasever.$basearch', china: 'https://repo.pigsty.cc/yum/ivory/el$releasever.$basearch' }}
- { name: groonga        ,description: 'Groonga'            ,module: groonga ,releases: [  8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://packages.groonga.org/almalinux/$releasever/$basearch/' }}
- { name: mysql          ,description: 'MySQL'              ,module: mysql   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://repo.mysql.com/yum/mysql-8.0-community/el/$releasever/$basearch/', china: 'https://mirrors.tuna.tsinghua.edu.cn/mysql/yum/mysql-8.0-community-el7-$basearch/'}}
- { name: mongo          ,description: 'MongoDB'            ,module: mongo   ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/8.0/$basearch/' ,china: 'https://mirrors.aliyun.com/mongodb/yum/redhat/$releasever/mongodb-org/8.0/$basearch/' }}
- { name: redis          ,description: 'Redis'              ,module: redis   ,releases: [7    ] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://rpmfind.net/linux/remi/enterprise/$releasever/remi/$basearch/' }}
- { name: redis          ,description: 'Redis'              ,module: redis   ,releases: [  8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://rpmfind.net/linux/remi/enterprise/$releasever/redis72/$basearch/' }}
- { name: grafana        ,description: 'Grafana'            ,module: grafana ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://rpm.grafana.com' }}
- { name: kubernetes     ,description: 'Kubernetes'         ,module: kube    ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://pkgs.k8s.io/core:/stable:/v1.31/rpm/', china: 'https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/rpm/' }}
- { name: gitlab         ,description: 'Gitlab'             ,module: gitlab  ,releases: [  8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://packages.gitlab.com/gitlab/gitlab-ee/el/$releasever/$basearch' }}

For Debian (11, 12) or Ubuntu (20.04, 22.04, 24.04), the default upstream repos are:

- { name: pigsty-local   ,description: 'Pigsty Local'       ,module: local   ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'http://${admin_ip}/pigsty ./' }}
- { name: pigsty-pgsql   ,description: 'Pigsty PgSQL'       ,module: pgsql   ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://repo.pigsty.io/apt/pgsql/${distro_codename} ${distro_codename} main', china: 'https://repo.pigsty.cc/apt/pgsql/${distro_codename} ${distro_codename} main' }}
- { name: pigsty-infra   ,description: 'Pigsty Infra'       ,module: infra   ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://repo.pigsty.io/apt/infra/ generic main' ,china: 'https://repo.pigsty.cc/apt/infra/ generic main' }}
- { name: nginx          ,description: 'Nginx'              ,module: infra   ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'http://nginx.org/packages/${distro_name} ${distro_codename} nginx' }}
- { name: docker-ce      ,description: 'Docker'             ,module: infra   ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://download.docker.com/linux/${distro_name} ${distro_codename} stable'                               ,china: 'https://mirrors.tuna.tsinghua.edu.cn/docker-ce/linux/${distro_name} ${distro_codename} stable' }}
- { name: base           ,description: 'Debian Basic'       ,module: node    ,releases: [11,12         ] ,arch: [x86_64, aarch64] ,baseurl: { default: 'http://deb.debian.org/debian/ ${distro_codename} main non-free-firmware'                                  ,china: 'https://mirrors.aliyun.com/debian/ ${distro_codename} main restricted universe multiverse' }}
- { name: updates        ,description: 'Debian Updates'     ,module: node    ,releases: [11,12         ] ,arch: [x86_64, aarch64] ,baseurl: { default: 'http://deb.debian.org/debian/ ${distro_codename}-updates main non-free-firmware'                          ,china: 'https://mirrors.aliyun.com/debian/ ${distro_codename}-updates main restricted universe multiverse' }}
- { name: security       ,description: 'Debian Security'    ,module: node    ,releases: [11,12         ] ,arch: [x86_64, aarch64] ,baseurl: { default: 'http://security.debian.org/debian-security ${distro_codename}-security main non-free-firmware'            ,china: 'https://mirrors.aliyun.com/debian-security/ ${distro_codename}-security main non-free-firmware' }}
- { name: base           ,description: 'Ubuntu Basic'       ,module: node    ,releases: [      20,22,24] ,arch: [x86_64         ] ,baseurl: { default: 'https://mirrors.edge.kernel.org/ubuntu/ ${distro_codename}           main universe multiverse restricted' ,china: 'https://mirrors.aliyun.com/ubuntu/ ${distro_codename}           main restricted universe multiverse' }}
- { name: updates        ,description: 'Ubuntu Updates'     ,module: node    ,releases: [      20,22,24] ,arch: [x86_64         ] ,baseurl: { default: 'https://mirrors.edge.kernel.org/ubuntu/ ${distro_codename}-updates   main restricted universe multiverse' ,china: 'https://mirrors.aliyun.com/ubuntu/ ${distro_codename}-updates   main restricted universe multiverse' }}
- { name: backports      ,description: 'Ubuntu Backports'   ,module: node    ,releases: [      20,22,24] ,arch: [x86_64         ] ,baseurl: { default: 'https://mirrors.edge.kernel.org/ubuntu/ ${distro_codename}-backports main restricted universe multiverse' ,china: 'https://mirrors.aliyun.com/ubuntu/ ${distro_codename}-backports main restricted universe multiverse' }}
- { name: security       ,description: 'Ubuntu Security'    ,module: node    ,releases: [      20,22,24] ,arch: [x86_64         ] ,baseurl: { default: 'https://mirrors.edge.kernel.org/ubuntu/ ${distro_codename}-security  main restricted universe multiverse' ,china: 'https://mirrors.aliyun.com/ubuntu/ ${distro_codename}-security  main restricted universe multiverse' }}
- { name: base           ,description: 'Ubuntu Basic'       ,module: node    ,releases: [      20,22,24] ,arch: [        aarch64] ,baseurl: { default: 'http://ports.ubuntu.com/ubuntu-ports/ ${distro_codename}           main universe multiverse restricted'   ,china: 'https://mirrors.aliyun.com/ubuntu-ports/ ${distro_codename}           main restricted universe multiverse' }}
- { name: updates        ,description: 'Ubuntu Updates'     ,module: node    ,releases: [      20,22,24] ,arch: [        aarch64] ,baseurl: { default: 'http://ports.ubuntu.com/ubuntu-ports/ ${distro_codename}-updates   main restricted universe multiverse'   ,china: 'https://mirrors.aliyun.com/ubuntu-ports/ ${distro_codename}-updates   main restricted universe multiverse' }}
- { name: backports      ,description: 'Ubuntu Backports'   ,module: node    ,releases: [      20,22,24] ,arch: [        aarch64] ,baseurl: { default: 'http://ports.ubuntu.com/ubuntu-ports/ ${distro_codename}-backports main restricted universe multiverse'   ,china: 'https://mirrors.aliyun.com/ubuntu-ports/ ${distro_codename}-backports main restricted universe multiverse' }}
- { name: security       ,description: 'Ubuntu Security'    ,module: node    ,releases: [      20,22,24] ,arch: [        aarch64] ,baseurl: { default: 'http://ports.ubuntu.com/ubuntu-ports/ ${distro_codename}-security  main restricted universe multiverse'   ,china: 'https://mirrors.aliyun.com/ubuntu-ports/ ${distro_codename}-security  main restricted universe multiverse' }}
- { name: pgdg           ,description: 'PGDG'               ,module: pgsql   ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'http://apt.postgresql.org/pub/repos/apt/ ${distro_codename}-pgdg main' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/apt/ ${distro_codename}-pgdg main' }}
- { name: timescaledb    ,description: 'Timescaledb'        ,module: extra   ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://packagecloud.io/timescale/timescaledb/${distro_name}/ ${distro_codename} main' }}
- { name: citus          ,description: 'Citus'              ,module: extra   ,releases: [11,12,20,22   ] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://packagecloud.io/citusdata/community/${distro_name}/ ${distro_codename} main' } }
- { name: pgml           ,description: 'PostgresML'         ,module: pgml    ,releases: [         22   ] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://apt.postgresml.org ${distro_codename} main'  }}
- { name: wiltondb       ,description: 'WiltonDB'           ,module: mssql   ,releases: [      20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://repo.pigsty.io/apt/mssql/ ${distro_codename} main', china: 'https://repo.pigsty.cc/apt/mssql/ ${distro_codename} main' , origin: 'https://ppa.launchpadcontent.net/wiltondb/wiltondb/ubuntu/ ${distro_codename} main'  }}
- { name: groonga        ,description: 'Groonga Debian'     ,module: groonga ,releases: [11,12         ] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://packages.groonga.org/debian/ ${distro_codename} main' }}
- { name: groonga        ,description: 'Groonga Ubuntu'     ,module: groonga ,releases: [      20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://ppa.launchpadcontent.net/groonga/ppa/ubuntu/ ${distro_codename} main' }}
- { name: mysql          ,description: 'MySQL'              ,module: mysql   ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://repo.mysql.com/apt/${distro_name} ${distro_codename} mysql-8.0 mysql-tools', china: 'https://mirrors.tuna.tsinghua.edu.cn/mysql/apt/${distro_name} ${distro_codename} mysql-8.0 mysql-tools' }}
- { name: mongo          ,description: 'MongoDB'            ,module: mongo   ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://repo.mongodb.org/apt/${distro_name} ${distro_codename}/mongodb-org/8.0 multiverse', china: 'https://mirrors.aliyun.com/mongodb/apt/${distro_name} ${distro_codename}/mongodb-org/8.0 multiverse' }}
- { name: redis          ,description: 'Redis'              ,module: redis   ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://packages.redis.io/deb ${distro_codename} main' }}
- { name: haproxyd       ,description: 'Haproxy Debian'     ,module: haproxy ,releases: [11,12         ] ,arch: [x86_64, aarch64] ,baseurl: { default: 'http://haproxy.debian.net/ ${distro_codename}-backports-3.1 main' }}
- { name: haproxyu       ,description: 'Haproxy Ubuntu'     ,module: haproxy ,releases: [      20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://ppa.launchpadcontent.net/vbernat/haproxy-3.1/ubuntu/ ${distro_codename} main' }}
- { name: grafana        ,description: 'Grafana'            ,module: grafana ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://apt.grafana.com stable main' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/grafana/apt/ stable main' }}
- { name: kubernetes     ,description: 'Kubernetes'         ,module: kube    ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://pkgs.k8s.io/core:/stable:/v1.31/deb/ /', china: 'https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/deb/ /' }}
- { name: gitlab         ,description: 'Gitlab'             ,module: gitlab  ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://packages.gitlab.com/gitlab/gitlab-ee/${distro_name}/ ${distro_codename} main' }}

`repo_packages`

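Parameter: repo_packages, type: string[], level: G

A string array in which each element is a space-separated list of packages. It specifies the packages (and their dependencies) to download into the local repo with repotrack or apt download.

This parameter has no default value, i.e. it is undefined by default. If it is not explicitly defined, Pigsty loads the default from the repo_packages_default variable defined in roles/node_id/vars:

[ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-utility, extra-modules ]

Each element is translated through the package_map defined in the same file, according to the major version of the operating system distribution. On EL systems, for example, the aliases translate to: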

node-bootstrap:          "ansible python3 python3-pip python3-virtualenv python3-requests python3-jmespath python3-cryptography dnf-utils modulemd-tools createrepo_c sshpass"
infra-package:           "nginx dnsmasq etcd haproxy vip-manager node_exporter keepalived_exporter pg_exporter pgbackrest_exporter redis_exporter redis minio mcli pig"
infra-addons:            "grafana grafana-plugins loki logcli promtail prometheus alertmanager pushgateway blackbox_exporter nginx_exporter pev2 certbot python3-certbot-nginx"
extra-modules:           "docker-ce docker-compose-plugin ferretdb2 duckdb restic juicefs vray grafana-infinity-ds"
node-package1:           "lz4 unzip bzip2 zlib yum pv jq git ncdu make patch bash lsof wget uuid tuned nvme-cli numactl grubby sysstat iotop htop rsync tcpdump perf flamegraph chkconfig"
node-package2:           "netcat socat ftp lrzsz net-tools ipvsadm bind-utils telnet audit ca-certificates readline vim-minimal keepalived chrony openssl openssh-server openssh-clients"
pgsql-utility:           "patroni patroni-etcd pgbouncer pgbackrest pgbadger pg_activity pg_timetable pgFormatter pg_filedump pgxnclient timescaledb-tools timescaledb-event-streamer pgcopydb"

On Debian systems they translate to the corresponding DEB package names:

node-bootstrap:          "ansible python3 python3-pip python3-venv python3-jmespath dpkg-dev sshpass ftp linux-tools-generic"
infra-package:           "nginx dnsmasq etcd haproxy vip-manager node-exporter keepalived-exporter pg-exporter pgbackrest-exporter redis-exporter redis minio mcli pig"
infra-addons:            "grafana grafana-plugins loki logcli promtail prometheus alertmanager pushgateway blackbox-exporter nginx-exporter pev2 certbot python3-certbot-nginx"
extra-modules:           "docker-ce docker-compose-plugin ferretdb2 duckdb restic juicefs vray grafana-infinity-ds"
node-package1:           "lz4 unzip bzip2 zlib1g pv jq git ncdu make patch bash lsof wget uuid tuned nvme-cli numactl sysstat iotop htop rsync tcpdump acl chrony"
node-package2:           "netcat-openbsd socat lrzsz net-tools ipvsadm dnsutils telnet ca-certificates libreadline-dev vim-tiny keepalived openssl openssh-server openssh-client"
pgsql-utility:           "patroni pgbouncer pgbackrest pgbadger pg-activity pg-timetable pgformatter postgresql-filedump pgxnclient timescaledb-tools timescaledb-event-streamer pgcopydb pgloader"

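By convention, repo_packages lists the packages that do not depend on the PostgreSQL major version (for example the Infra, Node, and PGDG Common parts), while packages tied to a PostgreSQL major version (kernel, extensions) are usually listed in repo_extra_packages, which makes it easier to switch PG major versions.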
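
As a minimal sketch of this convention, the fragment below sets repo_packages explicitly in the global vars of a configuration file. The aliases come from the default list above; leaving out extra-modules is only an illustrative assumption for a site that does not need Docker, FerretDB, or DuckDB in its local repo.

```yaml
all:
  vars:
    # version-independent packages only; each alias is resolved through package_map
    # extra-modules is omitted here purely as an example of trimming the default list
    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-utility ]
```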

`repo_extra_packages`

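Parameter: repo_extra_packages, type: string[], level: G/C/I

Specifies additional packages to download without modifying repo_packages (usually packages tied to a PG major version). The built-in default is an empty list.

If the parameter is not explicitly defined, Pigsty falls back to the repo_extra_packages_default variable defined in roles/node_id/vars, whose value is:

[ pgsql-main ]

Each element is translated through the package_map defined in the same file, according to the major version of the operating system distribution. On EL systems, for example, pgsql-main translates to: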

postgresql$v postgresql$v-server postgresql$v-libs postgresql$v-contrib postgresql$v-plperl postgresql$v-plpython3 postgresql$v-pltcl postgresql$v-llvmjit pg_repack_$v* wal2json_$v* pgvector_$v*

On Debian systems it translates to the corresponding DEB package names:

postgresql-$v postgresql-client-$v postgresql-plpython3-$v postgresql-plperl-$v postgresql-pltcl-$v postgresql-$v-repack postgresql-$v-wal2json postgresql-$v-pgvector

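Here $v is replaced by pg_version, the current PG major version (17 by default). You can list packages tied to a PostgreSQL major version here without affecting the version-independent packages defined in repo_packages.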
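
A hedged sketch of switching the downloaded PG major version: only pg_version changes, and the pgsql-main alias is left to expand through package_map. The version number 16 is an arbitrary example, not a recommendation.

```yaml
all:
  vars:
    pg_version: 16                        # example major version; $v in package_map is replaced by this value
    repo_extra_packages: [ pgsql-main ]   # alias expands to the kernel packages listed above
```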

`repo_url_packages`

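Parameter: repo_url_packages, type: object[] | string[], level: G

Packages downloaded directly from the internet by URL. The default is an empty array: []

Array elements can be plain URL strings, or they can use the object form introduced in Pigsty v3, which states the URL and the file name explicitly.

Note that this parameter is affected by the region variable: if you are in mainland China, Pigsty automatically switches the URL to the domestic mirror by replacing repo.pigsty.io in the URL with repo.pigsty.cc.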
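
The two element forms might look like the sketch below. The URL and file name are hypothetical placeholders, and the object keys name and url are an assumption, since this section does not spell out the field names; check them against your Pigsty version before relying on them.

```yaml
# object form (assumed keys: name, url); the target file is hypothetical
repo_url_packages:
  - { name: 'example.tar.gz', url: 'https://repo.pigsty.io/etc/example-1.0.tar.gz' }

# equivalent plain string form, shown commented out so the fragment stays valid YAML
# repo_url_packages:
#   - https://repo.pigsty.io/etc/example-1.0.tar.gz
```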

`INFRA_PACKAGE`

These packages are installed on INFRA nodes only. They include regular RPM/DEB packages as well as PIP packages.

`infra_packages`

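Parameter: infra_packages, type: string[], level: G

A string array in which each element is a list of package names, specifying the packages to install on Infra nodes.

This parameter has no default value, i.e. it is undefined by default. If it is not explicitly set in the configuration file, Pigsty loads the default from the infra_packages_default variable defined in roles/node_id/vars, according to the operating system family of the current node.

Default value (EL family):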

infra_packages:                   # 将在基础设施节点上安装的软件包列表
  - grafana,loki,logcli,promtail,prometheus,alertmanager,pushgateway,grafana-plugins,restic,certbot,python3-certbot-nginx
  - node_exporter,blackbox_exporter,nginx_exporter,pg_exporter,pev2,nginx,dnsmasq,ansible,etcd,python3-requests,redis,mcli

默认值（Debian/Ubuntu）：

infra_packages:                   # 将在基础设施节点上安装的软件包列表
  - grafana,grafana-plugins,loki,logcli,promtail,prometheus,alertmanager,pushgateway,restic,certbot,python3-certbot-nginx
  - node-exporter,blackbox-exporter,nginx-exporter,pg-exporter,pev2,nginx,dnsmasq,ansible,etcd,python3-requests,redis,mcli

`infra_packages_pip`

参数名称： infra_packages_pip，类型： string，层次：G

Infra 节点上要使用 pip 额外安装的软件包，包名使用逗号分隔，默认值是空字符串，即不安装任何额外的 python 包。

`NGINX`

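Pigsty proxies all web service access through Nginx: the Home Page, Grafana, Prometheus, AlertManager, and so on, along with optional tools such as PGWeb, Jupyter Lab, Pgadmin, and Bytebase, and static resources and reports such as pev, schemaspy, and pgbadger.

Most importantly, Nginx also serves as the web server for the local software repo (Yum/Apt), storing and distributing Pigsty's packages. In addition, Pigsty can use Certbot to request free SSL certificates for Nginx automatically, so that services can be exposed to the public internet securely over HTTPS with a real domain name.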

nginx_enabled: true               # 在当前基础设施节点上启用 Nginx？
nginx_exporter_enabled: true      # 在当前基础设施节点上启用 nginx_exporter？
nginx_sslmode: enable             # Nginx 的 SSL 工作模式？disable,enable,enforce
nginx_home: /www                  # Nginx 静态文件目录，默认为：`/www`
nginx_port: 80                    # Nginx 默认监听的端口（提供HTTP服务），默认为 `80`
nginx_ssl_port: 443               # Nginx SSL 默认监听的端口，默认为 `443`
nginx_navbar:                     # Nginx 首页上的导航栏内容
  - { name: CA Cert ,url: '/ca.crt'   ,desc: 'pigsty self-signed ca.crt'   }
  - { name: Package ,url: '/pigsty'   ,desc: 'local yum repo packages'     }
  - { name: PG Logs ,url: '/logs'     ,desc: 'postgres raw csv logs'       }
  - { name: Reports ,url: '/report'   ,desc: 'pgbadger summary report'     }
  - { name: Explain ,url: '/pigsty/pev.html' ,desc: 'postgres explain visualizer' }
certbot_sign: false               # 使用 certbot 自动申请 Nginx SSL 证书？
certbot_email: your@email.com     # certbot 邮箱地址，用于接收证书过期提醒邮件
certbot_options: ''               # certbot 额外选项

`nginx_enabled`

参数名称： nginx_enabled，类型： bool，层次：G/I

是否在当前的 Infra 节点上启用 Nginx？默认值为： true。

`nginx_exporter_enabled`

参数名称： nginx_exporter_enabled，类型： bool，层次：G/I

在此基础设施节点上启用 nginx_exporter ？默认值为： true。

Disabling this option also disables the /nginx health-check stub. Consider turning it off when the Nginx version you have installed does not support that feature.

`nginx_sslmode`

参数名称： nginx_sslmode，类型： enum，层次：G

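The SSL mode of Nginx. There are three options: disable, enable, and enforce. The default is enable, which turns SSL on without forcing it.

disable: listen only on the port given by nginx_port and serve HTTP requests.
enable: additionally listen on the port given by nginx_ssl_port and serve HTTPS requests.
enforce: all links are rendered with https:// by default.
- In addition, every server in infra_portal other than the default server redirects port 80 to port 443 automatically.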
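
As a small illustration of the three modes, the fragment below forces HTTPS while keeping the default ports. It is a sketch of one possible setting, not a recommended default.

```yaml
nginx_sslmode: enforce   # render links as https://; non-default servers redirect 80 -> 443
nginx_port: 80           # HTTP port (default)
nginx_ssl_port: 443      # HTTPS port (default)
```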

`nginx_home`

参数名称： nginx_home，类型： path，层次：G

Nginx服务器静态文件目录，默认为： /www

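This is the root directory of the Nginx server and holds static assets and the software repo files. Avoid changing it casually; if you do change it, keep it consistent with the repo_home parameter.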

`nginx_port`

参数名称： nginx_port，类型： port，层次：G

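The port Nginx listens on for HTTP, 80 by default. It is best left unchanged.

If port 80 on your server is already taken, you can change this parameter, but you must also change the port used in repo_endpoint and in node_repo_local_urls so that they match the value set here.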
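
A sketch of moving Nginx off port 80 while keeping the dependent parameters in step. Port 8080 is an arbitrary example, and the URL shapes shown for repo_endpoint and node_repo_local_urls are assumptions about their usual format; compare them with the values in your own configuration before changing anything.

```yaml
nginx_port: 8080                                                  # example replacement for the occupied port 80
repo_endpoint: http://${admin_ip}:8080                            # assumed format: must carry the same port
node_repo_local_urls: [ 'http://${admin_ip}:8080/pigsty.repo' ]   # assumed format: must carry the same port
```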

`nginx_ssl_port`

参数名称： nginx_ssl_port，类型： port，层次：G

Nginx SSL 默认监听的端口，默认为 443，最好不要修改这个参数。

`nginx_navbar`

参数名称： nginx_navbar，类型： index[]，层次：G

Nginx 首页上的导航栏内容，默认值：

nginx_navbar:                     # Nginx 首页上的导航栏内容
  - { name: CA Cert ,url: '/ca.crt'   ,desc: 'pigsty self-signed ca.crt'   }
  - { name: Package ,url: '/pigsty'   ,desc: 'local yum repo packages'     }
  - { name: PG Logs ,url: '/logs'     ,desc: 'postgres raw csv logs'       }
  - { name: Reports ,url: '/report'   ,desc: 'pgbadger summary report'     }
  - { name: Explain ,url: '/pigsty/pev.html' ,desc: 'postgres explain visualizer' }

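Each record is rendered as a navigation link in the App drop-down menu on the Pigsty home page. All apps are optional and are mounted under the default Pigsty server at http://pigsty/ by default.

The url field gives the URL path of the app. If the URL contains the string ${grafana}, it is replaced automatically with the Grafana domain defined in infra_portal.

This lets you mount Grafana-based data applications on the navigation bar of the Pigsty home page.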
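
For instance, a Grafana-based data app could be appended to the default entries as sketched below. The entry name and the dashboard path are hypothetical, and placing ${grafana} at the start of the URL is an assumption about how the substitution is meant to be used.

```yaml
nginx_navbar:
  - { name: CA Cert ,url: '/ca.crt'   ,desc: 'pigsty self-signed ca.crt'   }
  - { name: Package ,url: '/pigsty'   ,desc: 'local yum repo packages'     }
  - { name: PG Logs ,url: '/logs'     ,desc: 'postgres raw csv logs'       }
  - { name: Reports ,url: '/report'   ,desc: 'pgbadger summary report'     }
  - { name: Explain ,url: '/pigsty/pev.html' ,desc: 'postgres explain visualizer' }
  # hypothetical extra entry: ${grafana} is replaced by the Grafana domain from infra_portal
  - { name: My App  ,url: '${grafana}/d/my-dashboard' ,desc: 'example grafana data app' }
```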

`certbot_sign`

参数名称： certbot_sign，类型： bool，层次：G/A

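Whether to request certificates automatically with certbot. The default is false.

When set to true, Pigsty uses certbot to request a free SSL certificate from Let’s Encrypt during the nginx role of the infra.yml and install.yml playbooks.

For a domain defined in infra_portal, if its certbot field is set, Pigsty uses certbot to request a certificate for the domain given in domain, and names the certificate after the value of the certbot field. If several servers/domains share the same certbot value, Pigsty requests a single combined certificate for them and names it after that value.

Enabling this option requires that:

The current node is reachable through its public domain name, with DNS resolution already pointing at the node's public IP
The current node can reach the Let’s Encrypt API

The option is off by default. After installation you can run make cert to request certificates manually; it invokes the rendered /etc/nginx/sign-cert script, which uses certbot to renew or request certificates.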
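
A rough sketch of two portal entries sharing one certificate. The domains and email are placeholders, and the overall shape of infra_portal (a map of named entries, with an endpoint field for proxied services) is an assumption here; only the domain and certbot fields are described in this section.

```yaml
certbot_sign: true                   # request certificates during the nginx role
certbot_email: admin@example.com     # placeholder address for expiry notices
infra_portal:                        # assumed layout: one named entry per server
  home:    { domain: example.com   ,certbot: example.com }
  grafana: { domain: g.example.com ,endpoint: "${admin_ip}:3000" ,certbot: example.com }
  # both entries use the same certbot value, so one combined certificate named example.com is requested
```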

`certbot_email`

参数名称： certbot_email，类型： string，层次：G/A

申请证书时使用的 email 地址，用于接收证书过期提醒邮件。默认值为占位邮件地址：your@email.com。

当 certbot_sign 设置为 true 时，建议提供此参数。Let’s Encrypt 会在证书即将过期时向此邮箱发送提醒邮件。

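`certbot_options`

Parameter: certbot_options, type: string, level: G/A

Extra options passed to certbot when requesting certificates. The default is an empty string.

Use this parameter to pass additional command-line options to certbot. With --dry-run, for example, certbot does not actually request a certificate and only performs a test run.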
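
A minimal sketch of a test run, using the parameter name as spelled in the NGINX parameter overview (certbot_options). The email address is a placeholder.

```yaml
certbot_sign: true                 # enable certbot during the nginx role
certbot_email: admin@example.com   # placeholder address
certbot_options: '--dry-run'       # test against Let's Encrypt without issuing a real certificate
```

Once the dry run succeeds, resetting certbot_options to an empty string restores the normal request behaviour.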

`DNS`

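By default Pigsty enables the DNSMASQ service on Infra nodes to resolve auxiliary domain names such as h.pigsty, a.pigsty, p.pigsty, and g.pigsty, plus sss.pigsty for the optional MinIO.

Resolution records are written to /etc/hosts.d/default on the Infra node. To use this DNS server, nameserver <ip> must be added to /etc/resolv.conf; the node_dns_servers parameter takes care of this.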

dns_enabled: true                 # 在当前基础设施节点上启用 DNSMASQ 服务？
dns_port: 53                      # DNS 服务器监听端口，默认为 `53`
dns_records:                      # 由 dnsmasq 解析的动态 DNS 记录
  - "${admin_ip} h.pigsty a.pigsty p.pigsty g.pigsty"
  - "${admin_ip} api.pigsty adm.pigsty cli.pigsty ddl.pigsty lab.pigsty git.pigsty sss.pigsty wiki.pigsty"

`dns_enabled`

参数名称： dns_enabled，类型： bool，层次：G/I

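Whether to enable the DNSMASQ service on this Infra node. The default is true.

If you do not want to use the built-in DNS server (for example, you already have an external DNS server, or your provider does not allow you to run one), set this to false to disable it, and use the static records in node_default_etc_hosts and node_etc_hosts instead.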
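
A sketch of the static alternative, assuming node_etc_hosts takes hosts-file style records ("ip name ..."). The IP address and the extra app.pigsty name are illustrative only.

```yaml
dns_enabled: false                 # do not run dnsmasq on this infra node
node_etc_hosts:                    # assumed format: "<ip> <name> [<name> ...]" per element
  - "10.10.10.10 app.pigsty"       # hypothetical extra static record
```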

`dns_port`

参数名称： dns_port，类型： port，层次：G

DNSMASQ 的默认监听端口，默认是 53，不建议修改 DNS 服务默认端口。

`dns_records`

参数名称： dns_records，类型： string[]，层次：G

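Dynamic DNS records resolved by dnsmasq, generally used to resolve auxiliary domain names locally, such as h.pigsty, a.pigsty, p.pigsty, and g.pigsty. These records are written to /etc/hosts.d/default on the infrastructure node.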

dns_records:                      # 由 dnsmasq 解析的动态 DNS 记录
  - "${admin_ip} h.pigsty a.pigsty p.pigsty g.pigsty"
  - "${admin_ip} api.pigsty adm.pigsty cli.pigsty ddl.pigsty lab.pigsty git.pigsty sss.pigsty wiki.pigsty"
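
To add a name of your own, one option is to append a record to the default list, as sketched here. The address 10.10.10.20 and the name app.pigsty are hypothetical.

```yaml
dns_records:
  - "${admin_ip} h.pigsty a.pigsty p.pigsty g.pigsty"
  - "${admin_ip} api.pigsty adm.pigsty cli.pigsty ddl.pigsty lab.pigsty git.pigsty sss.pigsty wiki.pigsty"
  - "10.10.10.20 app.pigsty"       # hypothetical extra record pointing at another host
```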

`PROMETHEUS`

Prometheus 被用作时序数据库，用于存储和分析监控指标数据，进行指标预计算，评估告警规则。

prometheus_enabled: true          # 在当前基础设施节点上启用 Prometheus？
prometheus_clean: true            # 在初始化 Prometheus 的时候清除现有数据？
prometheus_data: /data/prometheus # Prometheus 数据目录，默认为 `/data/prometheus`
prometheus_sd_dir: /etc/prometheus/targets # Prometheus 静态文件服务发现目录
prometheus_sd_interval: 5s        # Prometheus 目标刷新间隔，默认为 `5s`
prometheus_scrape_interval: 10s   # Prometheus 抓取 & 评估间隔，默认为 `10s`
prometheus_scrape_timeout: 8s     # Prometheus 全局抓取超时，默认为 `8s`
prometheus_options: '--storage.tsdb.retention.time=15d' # Prometheus 额外的命令行参数选项
pushgateway_enabled: true         # 在当前基础设施节点上启用 PushGateway？
pushgateway_options: '--persistence.interval=1m' # PushGateway 额外的命令行参数选项
blackbox_enabled: true            # 在当前基础设施节点上启用 Blackbox_Exporter？
blackbox_options: ''              # Blackbox_Exporter 额外的命令行参数选项
alertmanager_enabled: true        # 在当前基础设施节点上启用 Alertmanager？
alertmanager_port: 9093           # Alertmanager 监听端口，默认为 `9093`
alertmanager_options: ''          # Alertmanager 额外的命令行参数选项
exporter_metrics_path: /metrics   # Exporter 指标路径，默认为 `/metrics`
exporter_install: none            # 如何安装 Exporter？none,yum,binary
exporter_repo_url: ''             # 如果通过 yum 安装 Exporter，则指定 yum 仓库文件地址

`prometheus_enabled`

参数名称： prometheus_enabled，类型： bool，层次：G/I

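Whether to enable Prometheus on the current Infra node. The default is true, so every infrastructure node installs and runs Prometheus by default.

For example, if you have several Infra nodes, Pigsty deploys Prometheus on all of them by default. If you would rather use one node for Prometheus metrics collection and another for Loki log collection, set this parameter to false at the instance level on the other Infra nodes.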
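
The split described above could be expressed at the instance level roughly as follows. The two IP addresses are the placeholder addresses used elsewhere in this document.

```yaml
all:
  children:
    infra:
      hosts:
        10.10.10.10: { infra_seq: 1 ,loki_enabled: false }        # metrics node: Prometheus only
        10.10.10.11: { infra_seq: 2 ,prometheus_enabled: false }  # logging node: Loki only
```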

`prometheus_clean`

参数名称： prometheus_clean，类型： bool，层次：G/A

是否在执行 Prometheus 初始化的时候清除现有 Prometheus 数据？默认值为 true。

`prometheus_data`

参数名称： prometheus_data，类型： path，层次：G

Prometheus数据库目录, 默认位置为 /data/prometheus。

`prometheus_sd_dir`

参数名称： prometheus_sd_dir，类型： path，层次：G

The directory holding the target definitions for Prometheus file-based service discovery. The default is /etc/prometheus/targets.

`prometheus_sd_interval`

参数名称： prometheus_sd_interval，类型： interval，层次：G

Prometheus 静态文件服务发现的刷新周期，默认值为 5s。

这意味着 Prometheus 每隔这样长的时间就会重新扫描一次 prometheus_sd_dir （默认为：/etc/prometheus/targets 目录），以发现新的监控对象。

`prometheus_scrape_interval`

参数名称： prometheus_scrape_interval，类型： interval，层次：G

The global Prometheus scrape interval, 10s by default. In production, 10 to 30 seconds is a reasonable range. Adjust this parameter if you need finer-grained monitoring data.

`prometheus_scrape_timeout`

参数名称： prometheus_scrape_timeout，类型： interval，层次：G

Prometheus 全局抓取超时，默认为 8s。

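Setting a scrape timeout helps prevent monitoring queries from triggering an avalanche. The rule of thumb is that this value must be smaller than, but close to, prometheus_scrape_interval, so that a single scrape never takes longer than the scrape period.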
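
For example, if the scrape interval is relaxed to 30 seconds, the timeout can follow it while staying just below. The values are illustrative.

```yaml
prometheus_scrape_interval: 30s   # scrape and evaluate every 30 seconds
prometheus_scrape_timeout: 25s    # smaller than, but close to, the interval
```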

`prometheus_options`

参数名称： prometheus_options，类型： arg，层次：G

Prometheus 的额外的命令行参数，默认值：--storage.tsdb.retention.time=15d

默认的参数会为 Prometheus 配置一个 15 天的保留期限来限制磁盘使用量。
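
A sketch of extending the retention period by overriding the same command-line flag; 30d is an arbitrary example, and a longer retention needs correspondingly more disk space under prometheus_data.

```yaml
prometheus_options: '--storage.tsdb.retention.time=30d'   # keep 30 days of metrics instead of 15
```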

`pushgateway_enabled`

参数名称： pushgateway_enabled，类型： bool，层次：G/I

是否在当前 Infra 节点上启用 PushGateway？默认值为 true，即所有基础设施节点默认都会安装启用 PushGateway。

`pushgateway_options`

参数名称： pushgateway_options，类型： arg，层次：G

PushGateway 的额外的命令行参数，默认值：--persistence.interval=1m，即每分钟进行一次持久化操作。

`blackbox_enabled`

参数名称： blackbox_enabled，类型： bool，层次：G/I

是否在当前 Infra 节点上启用 BlackboxExporter ？默认值为 true，即所有基础设施节点默认都会安装启用 BlackboxExporter 。

BlackboxExporter 会向节点 IP 地址， VIP 地址，PostgreSQL VIP 地址发送 ICMP 报文测试网络连通性。

`blackbox_options`

参数名称： blackbox_options，类型： arg，层次：G

BlackboxExporter 的额外的命令行参数，默认值：空字符串。

`alertmanager_enabled`

参数名称： alertmanager_enabled，类型： bool，层次：G/I

是否在当前 Infra 节点上启用 AlertManager ？默认值为 true，即所有基础设施节点默认都会安装启用 AlertManager 。

`alertmanager_port`

参数名称： alertmanager_port，类型： port，层次：G

AlertManager 的监听端口，默认值为 9093。

之所以允许特殊设置 AlertManager 的端口号，是因为 Kafka 的默认端口用到了 9093，容易出现冲突。

`alertmanager_options`

参数名称： alertmanager_options，类型： arg，层次：G

AlertManager 的额外的命令行参数，默认值：空字符串。

`exporter_metrics_path`

参数名称： exporter_metrics_path，类型： path，层次：G

监控 exporter 暴露指标的 HTTP 端点路径，默认为： /metrics ，不建议修改此参数。

`exporter_install`

参数名称： exporter_install，类型： enum，层次：G

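(Deprecated) How monitoring components are installed. There are three options: none, yum, binary

The installation method for exporters:

none: do not install (the default; exporters have already been installed by the node_pkg task)
yum: install with yum (apt); when enabled, node_exporter and pg_exporter are installed with yum before the exporters are deployed
binary: install by copying binaries (node_exporter and pg_exporter are copied directly from the admin node; not recommended)

With yum installation, if exporter_repo_url is set (not empty), the REPO file at that URL is first installed into /etc/yum.repos.d. This makes it possible to install exporters in an environment where node infrastructure initialization has not been run. Binary installation is not recommended for regular users; it is normally reserved for emergency repairs and temporary fixes.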

`exporter_repo_url`

参数名称： exporter_repo_url，类型： url，层次：G

（弃用参数）监控组件的 Yum Repo URL

默认为空，当 exporter_install 为 yum 时，该参数指定的Repo会被添加至节点源列表中。

`GRAFANA`

Pigsty 使用 Grafana 作为监控系统前端。它也可以做为数据分析与可视化平台，或者用于低代码数据应用开发，制作数据应用原型等目的。

grafana_enabled: true             # enable grafana on this infra node?
grafana_clean: true               # clean grafana data during init?
grafana_admin_username: admin     # grafana admin username, `admin` by default
grafana_admin_password: pigsty    # grafana admin password, `pigsty` by default
loki_enabled: true                # enable loki on this infra node?
loki_clean: false                 # whether remove existing loki data?
loki_data: /data/loki             # loki data dir, `/data/loki` by default
loki_retention: 15d               # loki log retention period, 15d by default

`grafana_enabled`

参数名称： grafana_enabled，类型： bool，层次：G/I

是否在Infra节点上启用Grafana？默认值为： true，即所有基础设施节点默认都会安装启用 Grafana。

`grafana_clean`

参数名称： grafana_clean，类型： bool，层次：G/A

是否在初始化 Grafana 时一并清理其数据文件？默认为：true。

该操作会移除 /var/lib/grafana/grafana.db，确保 Grafana 全新安装。

`grafana_admin_username`

参数名称： grafana_admin_username，类型： username，层次：G

Grafana admin username. The default is admin.

`grafana_admin_password`

Parameter: grafana_admin_password, type: password, level: G

Grafana admin password. The default is pigsty.

Note: be sure to change this password in production deployments!
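
A minimal sketch of overriding the credentials in the global vars. The password shown is a placeholder, not a suggested value.

```yaml
all:
  vars:
    grafana_admin_username: admin
    grafana_admin_password: 'Replace-With-A-Strong-Secret'   # placeholder; never keep the default in production
```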

`LOKI`

Loki is the lightweight log collection and query platform from Grafana. It provides a central place to query server and database logs.

`loki_enabled`

参数名称： loki_enabled，类型： bool，层次：G/I

是否在当前 Infra 节点上启用 Loki ？默认值为 true，即所有基础设施节点默认都会安装启用 Loki 。

`loki_clean`

参数名称： loki_clean，类型： bool，层次：G/A

是否在安装Loki时清理数据库目录？默认值： false，现有日志数据在初始化时会保留。

`loki_data`

参数名称： loki_data，类型： path，层次：G

Loki的数据目录，默认值为： /data/loki

`loki_retention`

参数名称： loki_retention，类型： interval，层次：G

Loki日志默认保留天数，默认保留 15d 。

8.4 - 预置剧本

如何使用预置的 ansible 剧本来管理 INFRA 集群，常用管理命令速查。

Pigsty 提供了三个与 INFRA 模块相关的剧本：

infra.yml ：在 infra 节点上初始化 pigsty 基础设施
infra-rm.yml：从 infra 节点移除基础设施组件
install.yml：在当前节点上一次性完整安装 Pigsty

`infra.yml`

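The INFRA module playbook infra.yml initializes the infrastructure module on the Infra nodes defined by the infra group in the configuration file.

Running the playbook performs the following tasks:

Configure directories and environment variables on the Infra node
Download packages and build the local software repo to speed up later installation (skipped when an offline package is used or an existing local repo is detected)
Bring the current Infra node under Pigsty management as a regular node
Deploy the infrastructure components, including Prometheus, Grafana, Loki, Alertmanager, PushGateway, and Blackbox Exporter

The playbook runs on the infra group by default:

Pigsty installs the INFRA module on the group with the fixed name infra in the configuration file
During configure, Pigsty marks the current installation node as the Infra node by default and replaces the placeholder IP address 10.10.10.10 in the configuration template with the primary IP address of the current node.
Apart from being able to initiate management and hosting the infrastructure, this node is no different from a regular managed node.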

剧本注意事项

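This playbook is idempotent, but running it again re-initializes the infrastructure components on the Infra node:
- Prometheus time-series metrics are lost unless prometheus_clean is set to false.
- Loki log data is lost unless loki_clean is set to false.
- Grafana dashboards and configuration changes are lost unless grafana_clean is set to false.
When the local repo marker /www/pigsty/repo_complete exists, the playbook skips the tasks that download software from the internet.
- A full run takes about 1 to 3 minutes, depending on machine specs and network conditions.
- Without an offline package, downloading directly from the original upstream on the internet may take 5 to 10 minutes, depending on your network conditions.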
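
To keep existing data when the playbook has to be run again on a live Infra node, the three clean switches named above can be turned off together, as in this sketch.

```yaml
all:
  vars:
    prometheus_clean: false   # keep existing Prometheus time-series data
    grafana_clean: false      # keep /var/lib/grafana/grafana.db (dashboards, settings)
    loki_clean: false         # keep existing Loki log data
```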

可用任务

以下为 infra.yml 剧本中可用的任务列表：

#--------------------------------------------------------------#
# Tasks
#--------------------------------------------------------------#
# ca            : create self-signed CA on localhost files/pki
#   - ca_dir        : create CA directory
#   - ca_private    : generate ca private key: files/pki/ca/ca.key
#   - ca_cert       : signing ca cert: files/pki/ca/ca.crt
#
# id            : generate node identity
#
# repo          : bootstrap a local yum repo from internet or offline packages
#   - repo_dir      : create repo directory
#   - repo_check    : check repo exists
#   - repo_prepare  : use existing repo if exists
#   - repo_build    : build repo from upstream if not exists
#     - repo_upstream    : handle upstream repo files in /etc/yum.repos.d
#       - repo_remove    : remove existing repo file if repo_remove == true
#       - repo_add       : add upstream repo files to /etc/yum.repos.d
#     - repo_url_pkg     : download packages from internet defined by repo_url_packages
#     - repo_cache       : make upstream yum cache with yum makecache
#     - repo_boot_pkg    : install bootstrap pkg such as createrepo_c,yum-utils,...
#     - repo_pkg         : download packages & dependencies from upstream repo
#     - repo_create      : create a local yum repo with createrepo_c & modifyrepo_c
#     - repo_use         : add newly built repo into /etc/yum.repos.d
#   - repo_nginx    : launch a nginx for repo if no nginx is serving
#
# node/haproxy/docker/monitor : setup infra node as a common node (check node.yml)
#   - node_name, node_hosts, node_resolv, node_firewall, node_ca, node_repo, node_pkg
#   - node_feature, node_kernel, node_tune, node_sysctl, node_profile, node_ulimit
#   - node_data, node_admin, node_timezone, node_ntp, node_crontab, node_vip
#   - haproxy_install, haproxy_config, haproxy_launch, haproxy_reload
#   - docker_install, docker_admin, docker_config, docker_launch, docker_image
#   - haproxy_register, node_exporter, node_register, promtail
#
# infra         : setup infra components
#   - infra_env      : env_dir, env_pg, env_pgadmin, env_var
#   - infra_pkg      : infra_pkg_yum, infra_pkg_pip
#   - infra_user     : setup infra os user group
#   - infra_cert     : issue cert for infra components
#   - dns            : dns_config, dns_record, dns_launch
#   - nginx          : nginx_config, nginx_cert, nginx_static, nginx_launch, nginx_certbot, nginx_reload, nginx_exporter
#   - prometheus     : prometheus_clean, prometheus_dir, prometheus_config, prometheus_launch, prometheus_reload
#   - alertmanager   : alertmanager_config, alertmanager_launch
#   - pushgateway    : pushgateway_config, pushgateway_launch
#   - blackbox       : blackbox_config, blackbox_launch
#   - grafana        : grafana_clean, grafana_config, grafana_launch, grafana_provision
#   - loki           : loki_clean, loki_dir, loki_config, loki_launch
#   - infra_register : register infra components to prometheus
#--------------------------------------------------------------#

`infra-rm.yml`

INFRA模块剧本 infra-rm.yml 用于从配置文件 infra 分组定义的 Infra节点上移除 Pigsty 基础设施

常用子任务包括：

./infra-rm.yml               # 移除 INFRA 模块
./infra-rm.yml -t service    # 停止 INFRA 上的基础设施服务
./infra-rm.yml -t data       # 移除 INFRA 上的存留数据
./infra-rm.yml -t package    # 卸载 INFRA 上安装的软件包

`install.yml`

INFRA模块剧本 install.yml用于在 所有节点 上一次性完整安装 Pigsty。

该剧本在剧本：一次性安装中有更详细的介绍。

8.5 - 管理预案

Infra 集群管理 SOP：创建，销毁，扩容，缩容，监控对象管理

下面是与 INFRA 模块相关的一些管理任务：

安装Infra模块

使用 infra.yml 剧本在 Infra 节点上安装 INFRA 模块：

./infra.yml     # 在 infra 分组上安装 INFRA 模块

卸载Infra模块

使用 infra-rm.yml 剧本从 Infra 节点上卸载 INFRA 模块：

./infra-rm.yml  # 从 infra 分组上卸载 INFRA 模块

扩容 Infra 模块

想要扩容现有 Infra 部署，首先修改 infra 分组，添加新的节点 IP，并为其分配不重复的 Infra 实例号 infra_seq。

all:
  children:
    infra:
      hosts:
        10.10.10.10: { infra_seq: 1 } # 原有的 1 号节点
        10.10.10.11: { infra_seq: 2 } # 新的 2 号节点

然后使用 infra.yml 剧本在新的节点上安装 INFRA 模块：

./infra.yml -l 10.10.10.11    # 在新节点上安装 INFRA 模块

管理本地软件仓库

您可以使用以下剧本子任务，管理 Infra节点上的本地软件仓库（YUM/APT）：

./infra.yml -t repo              #从互联网或离线包中创建本地软件仓库

./infra.yml -t repo_dir          # 创建本地软件仓库
./infra.yml -t repo_check        # 检查本地软件仓库是否已经存在？
./infra.yml -t repo_prepare      # 如果存在，直接使用已有的本地软件仓库
./infra.yml -t repo_build        # 如果不存在，从上游构建本地软件仓库
./infra.yml     -t repo_upstream     # 添加上游仓库 repo/list 文件
./infra.yml     -t repo_remove       # 如果 repo_remove == true，则删除现有的仓库文件
./infra.yml     -t repo_add          # 将上游仓库文件添加到 /etc/yum.repos.d （或 /etc/apt/sources.list.d）
./infra.yml     -t repo_url_pkg      # 从由 repo_url_packages 定义的互联网下载包
./infra.yml     -t repo_cache        # 使用 yum makecache / apt update 创建上游软件源元数据缓存
./infra.yml     -t repo_boot_pkg     # 安装如 createrepo_c、yum-utils 等的引导包...（或 dpkg-）
./infra.yml     -t repo_pkg          # 从上游仓库下载包 & 依赖项
./infra.yml     -t repo_create       # 使用 createrepo_c & modifyrepo_c / dpkg-dev 创建本地软件仓库
./infra.yml     -t repo_use          # 将新建的仓库添加到 /etc/yum.repos.d | /etc/apt/sources.list.d
./infra.yml -t repo_nginx        # 如果 nginx 没有运行，启动 nginx 作为文件服务器

其中常用的命令为：

./infra.yml     -t repo_upstream     # 向 INFRA 节点添加 repo_upstream 中定义的上游软件仓库
./infra.yml     -t repo_pkg          # 从上游软件仓库下载包及其依赖项。
./infra.yml     -t repo_create       # 创建/更新本地 yum/apt 仓库

管理Nginx

./infra.yml -t nginx                       # 重置 Nginx 组件
./infra.yml -t nginx_index                 # 重新渲染 Nginx 首页内容
./infra.yml -t nginx_config,nginx_reload   # 重新渲染 Nginx 配置，对外暴露新的上游服务。

如果用户在 infra_portal 列表中使用了 certbot 字段填入了证书名称，则可以使用以下命令使用 certbot 申请免费 HTTPS 证书：

# 使用 certbot 申请真实域名的免费 HTTPS 证书
./infra.yml -t nginx_certbot,nginx_reload -e certbot_sign=true

管理基础设施组件

您可以使用以下剧本子任务，管理 Infra节点上的各个基础设施组件

./infra.yml -t infra           # 配置基础设施
./infra.yml -t infra_env       # 配置管理节点上的环境变量：env_dir, env_pg, env_pgadmin, env_var
./infra.yml -t infra_pkg       # 安装INFRA所需的软件包：infra_pkg_yum, infra_pkg_pip
./infra.yml -t infra_user      # 设置 infra 操作系统用户组
./infra.yml -t infra_cert      # 为 infra 组件颁发证书
./infra.yml -t dns             # 配置 DNSMasq：dns_config, dns_record, dns_launch
./infra.yml -t nginx           # 配置 Nginx：nginx_config, nginx_cert, nginx_static, nginx_launch, nginx_exporter
./infra.yml -t prometheus      # 配置 Prometheus：prometheus_clean, prometheus_dir, prometheus_config, prometheus_launch, prometheus_reload
./infra.yml -t alertmanager    # 配置 AlertManager：alertmanager_config, alertmanager_launch
./infra.yml -t pushgateway     # 配置 PushGateway：pushgateway_config, pushgateway_launch
./infra.yml -t blackbox        # configure Blackbox Exporter: blackbox_config, blackbox_launch
./infra.yml -t grafana         # 配置 Grafana：grafana_clean, grafana_config, grafana_plugin, grafana_launch, grafana_provision
./infra.yml -t loki            # 配置 Loki：loki_clean, loki_dir, loki_config, loki_launch
./infra.yml -t infra_register  # 将 infra 组件注册到 prometheus

其他常用的任务包括：

./infra.yml -t nginx_index                        # 重新渲染 Nginx 首页内容
./infra.yml -t nginx_config,nginx_reload          # 重新渲染 Nginx 配置，对外暴露新的上游服务。
./infra.yml -t prometheus_conf,prometheus_reload  # 重新生成 Prometheus 主配置文件，并重载配置
./infra.yml -t prometheus_rule,prometheus_reload  # 重新拷贝 Prometheus 规则 & 告警，并重载配置
./infra.yml -t grafana_plugin                     # 从互联网上下载 Grafana 插件，通常需要科学上网

8.6 - 监控告警

如何在 Pigsty 中对基础设施进行自监控？

监控面板

Pigsty provides the following dashboards for the INFRA module:

Pigsty Home: home page of the Pigsty monitoring system
INFRA Overview: self-monitoring overview of the Pigsty infrastructure
Nginx Overview: Nginx metrics and logs
Grafana Overview: Grafana metrics and logs
Prometheus Overview: Prometheus metrics and logs
Loki Overview: Loki metrics and logs
Logs Instance: logs of a single node
Logs Overview: global log overview
CMDB Overview: CMDB visualization

告警规则

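Pigsty provides the following two alert rules for the INFRA module:

InfraDown: an infrastructure component is down
AgentDown: a monitoring agent is down

You can modify these rules or add new infrastructure alert rules in files/prometheus/rules/infra.yml as needed.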

################################################################
#                Infrastructure Alert Rules                    #
################################################################
- name: infra-alert
  rules:

    #==============================================================#
    #                       Infra Aliveness                        #
    #==============================================================#
    # infra components (prometheus,grafana) down for 1m triggers a P1 alert
    - alert: InfraDown
      expr: infra_up < 1
      for: 1m
      labels: { level: 0, severity: CRIT, category: infra }
      annotations:
        summary: "CRIT InfraDown {{ $labels.type }}@{{ $labels.instance }}"
        description: |
          infra_up[type={{ $labels.type }}, instance={{ $labels.instance }}] = {{ $value  | printf "%.2f" }} < 1          

    #==============================================================#
    #                       Agent Aliveness                        #
    #==============================================================#

    # agent aliveness are determined directly by exporter aliveness
    # including: node_exporter, pg_exporter, pgbouncer_exporter, haproxy_exporter
    - alert: AgentDown
      expr: agent_up < 1
      for: 1m
      labels: { level: 0, severity: CRIT, category: infra }
      annotations:
        summary: 'CRIT AgentDown {{ $labels.ins }}@{{ $labels.instance }}'
        description: |
          agent_up[ins={{ $labels.ins }}, instance={{ $labels.instance }}] = {{ $value  | printf "%.2f" }} < 1

8.7 - 指标列表

Pigsty INFRA 模块提供的完整监控指标列表与释义

INFRA 指标

The INFRA module exposes 964 kinds of monitoring metrics.

Metric Name	Type	Labels	Description
alertmanager_alerts	gauge	`ins`, `instance`, `ip`, `job`, `cls`, `state`	How many alerts by state.
alertmanager_alerts_invalid_total	counter	`version`, `ins`, `instance`, `ip`, `job`, `cls`	The total number of received alerts that were invalid.
alertmanager_alerts_received_total	counter	`version`, `ins`, `instance`, `ip`, `status`, `job`, `cls`	The total number of received alerts.
alertmanager_build_info	gauge	`revision`, `version`, `ins`, `instance`, `ip`, `tags`, `goarch`, `goversion`, `job`, `cls`, `branch`, `goos`	A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which alertmanager was built, and the goos and goarch for the build.
alertmanager_cluster_alive_messages_total	counter	`ins`, `instance`, `ip`, `peer`, `job`, `cls`	Total number of received alive messages.
alertmanager_cluster_enabled	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Indicates whether the clustering is enabled or not.
alertmanager_cluster_failed_peers	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number indicating the current number of failed peers in the cluster.
alertmanager_cluster_health_score	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Health score of the cluster. Lower values are better and zero means ’totally healthy’.
alertmanager_cluster_members	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number indicating current number of members in cluster.
alertmanager_cluster_messages_pruned_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of cluster messages pruned.
alertmanager_cluster_messages_queued	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of cluster messages which are queued.
alertmanager_cluster_messages_received_size_total	counter	`ins`, `instance`, `ip`, `msg_type`, `job`, `cls`	Total size of cluster messages received.
alertmanager_cluster_messages_received_total	counter	`ins`, `instance`, `ip`, `msg_type`, `job`, `cls`	Total number of cluster messages received.
alertmanager_cluster_messages_sent_size_total	counter	`ins`, `instance`, `ip`, `msg_type`, `job`, `cls`	Total size of cluster messages sent.
alertmanager_cluster_messages_sent_total	counter	`ins`, `instance`, `ip`, `msg_type`, `job`, `cls`	Total number of cluster messages sent.
alertmanager_cluster_peer_info	gauge	`ins`, `instance`, `ip`, `peer`, `job`, `cls`	A metric with a constant ‘1’ value labeled by peer name.
alertmanager_cluster_peers_joined_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	A counter of the number of peers that have joined.
alertmanager_cluster_peers_left_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	A counter of the number of peers that have left.
alertmanager_cluster_peers_update_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	A counter of the number of peers that have updated metadata.
alertmanager_cluster_reconnections_failed_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	A counter of the number of failed cluster peer reconnection attempts.
alertmanager_cluster_reconnections_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	A counter of the number of cluster peer reconnections.
alertmanager_cluster_refresh_join_failed_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	A counter of the number of failed cluster peer joined attempts via refresh.
alertmanager_cluster_refresh_join_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	A counter of the number of cluster peer joined via refresh.
alertmanager_config_hash	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Hash of the currently loaded alertmanager configuration.
alertmanager_config_last_reload_success_timestamp_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Timestamp of the last successful configuration reload.
alertmanager_config_last_reload_successful	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Whether the last configuration reload attempt was successful.
alertmanager_dispatcher_aggregation_groups	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of active aggregation groups
alertmanager_dispatcher_alert_processing_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_dispatcher_alert_processing_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_http_concurrency_limit_exceeded_total	counter	`ins`, `instance`, `method`, `ip`, `job`, `cls`	Total number of times an HTTP request failed because the concurrency limit was reached.
alertmanager_http_request_duration_seconds_bucket	Unknown	`ins`, `instance`, `method`, `ip`, `le`, `job`, `cls`, `handler`	N/A
alertmanager_http_request_duration_seconds_count	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`, `handler`	N/A
alertmanager_http_request_duration_seconds_sum	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`, `handler`	N/A
alertmanager_http_requests_in_flight	gauge	`ins`, `instance`, `method`, `ip`, `job`, `cls`	Current number of HTTP requests being processed.
alertmanager_http_response_size_bytes_bucket	Unknown	`ins`, `instance`, `method`, `ip`, `le`, `job`, `cls`, `handler`	N/A
alertmanager_http_response_size_bytes_count	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`, `handler`	N/A
alertmanager_http_response_size_bytes_sum	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`, `handler`	N/A
alertmanager_integrations	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of configured integrations.
alertmanager_marked_alerts	gauge	`ins`, `instance`, `ip`, `job`, `cls`, `state`	How many alerts by state are currently marked in the Alertmanager regardless of their expiry.
alertmanager_nflog_gc_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_nflog_gc_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_nflog_gossip_messages_propagated_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Number of received gossip messages that have been further gossiped.
alertmanager_nflog_maintenance_errors_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	How many maintenances were executed for the notification log that failed.
alertmanager_nflog_maintenance_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	How many maintenances were executed for the notification log.
alertmanager_nflog_queries_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Number of notification log queries received.
alertmanager_nflog_query_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
alertmanager_nflog_query_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_nflog_query_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_nflog_query_errors_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Number of received notification log queries that failed.
alertmanager_nflog_snapshot_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_nflog_snapshot_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_nflog_snapshot_size_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Size of the last notification log snapshot in bytes.
alertmanager_notification_latency_seconds_bucket	Unknown	`integration`, `ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
alertmanager_notification_latency_seconds_count	Unknown	`integration`, `ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_notification_latency_seconds_sum	Unknown	`integration`, `ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_notification_requests_failed_total	counter	`integration`, `ins`, `instance`, `ip`, `job`, `cls`	The total number of failed notification requests.
alertmanager_notification_requests_total	counter	`integration`, `ins`, `instance`, `ip`, `job`, `cls`	The total number of attempted notification requests.
alertmanager_notifications_failed_total	counter	`integration`, `ins`, `instance`, `ip`, `reason`, `job`, `cls`	The total number of failed notifications.
alertmanager_notifications_total	counter	`integration`, `ins`, `instance`, `ip`, `job`, `cls`	The total number of attempted notifications.
alertmanager_oversize_gossip_message_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `key`, `job`, `cls`	N/A
alertmanager_oversize_gossip_message_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `key`, `job`, `cls`	N/A
alertmanager_oversize_gossip_message_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `key`, `job`, `cls`	N/A
alertmanager_oversized_gossip_message_dropped_total	counter	`ins`, `instance`, `ip`, `key`, `job`, `cls`	Number of oversized gossip messages that were dropped due to a full message queue.
alertmanager_oversized_gossip_message_failure_total	counter	`ins`, `instance`, `ip`, `key`, `job`, `cls`	Number of oversized gossip message sends that failed.
alertmanager_oversized_gossip_message_sent_total	counter	`ins`, `instance`, `ip`, `key`, `job`, `cls`	Number of oversized gossip messages sent.
alertmanager_peer_position	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Position the Alertmanager instance believes it’s in. The position determines a peer’s behavior in the cluster.
alertmanager_receivers	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of configured receivers.
alertmanager_silences	gauge	`ins`, `instance`, `ip`, `job`, `cls`, `state`	How many silences by state.
alertmanager_silences_gc_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_silences_gc_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_silences_gossip_messages_propagated_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Number of received gossip messages that have been further gossiped.
alertmanager_silences_maintenance_errors_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	How many maintenances were executed for silences that failed.
alertmanager_silences_maintenance_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	How many maintenances were executed for silences.
alertmanager_silences_queries_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	How many silence queries were received.
alertmanager_silences_query_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
alertmanager_silences_query_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_silences_query_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_silences_query_errors_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	How many received silence queries did not succeed.
alertmanager_silences_snapshot_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_silences_snapshot_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
alertmanager_silences_snapshot_size_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Size of the last silence snapshot in bytes.
blackbox_exporter_build_info	gauge	`revision`, `version`, `ins`, `instance`, `ip`, `tags`, `goarch`, `goversion`, `job`, `cls`, `branch`, `goos`	A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which blackbox_exporter was built, and the goos and goarch for the build.
blackbox_exporter_config_last_reload_success_timestamp_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Timestamp of the last successful configuration reload.
blackbox_exporter_config_last_reload_successful	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Blackbox exporter config loaded successfully.
blackbox_module_unknown_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Count of unknown modules requested by probes
cortex_distributor_ingester_clients	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The current number of ingester clients.
cortex_dns_failures_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
cortex_dns_lookups_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
cortex_frontend_query_range_duration_seconds_bucket	Unknown	`ins`, `instance`, `method`, `ip`, `le`, `job`, `cls`, `status_code`	N/A
cortex_frontend_query_range_duration_seconds_count	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`, `status_code`	N/A
cortex_frontend_query_range_duration_seconds_sum	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`, `status_code`	N/A
cortex_ingester_flush_queue_length	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The total number of series pending in the flush queue.
cortex_kv_request_duration_seconds_bucket	Unknown	`ins`, `instance`, `role`, `ip`, `le`, `kv_name`, `type`, `operation`, `job`, `cls`, `status_code`	N/A
cortex_kv_request_duration_seconds_count	Unknown	`ins`, `instance`, `role`, `ip`, `kv_name`, `type`, `operation`, `job`, `cls`, `status_code`	N/A
cortex_kv_request_duration_seconds_sum	Unknown	`ins`, `instance`, `role`, `ip`, `kv_name`, `type`, `operation`, `job`, `cls`, `status_code`	N/A
cortex_member_consul_heartbeats_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
cortex_prometheus_notifications_alertmanagers_discovered	gauge	`ins`, `instance`, `ip`, `user`, `job`, `cls`	The number of alertmanagers discovered and active.
cortex_prometheus_notifications_dropped_total	Unknown	`ins`, `instance`, `ip`, `user`, `job`, `cls`	N/A
cortex_prometheus_notifications_queue_capacity	gauge	`ins`, `instance`, `ip`, `user`, `job`, `cls`	The capacity of the alert notifications queue.
cortex_prometheus_notifications_queue_length	gauge	`ins`, `instance`, `ip`, `user`, `job`, `cls`	The number of alert notifications in the queue.
cortex_prometheus_rule_evaluation_duration_seconds	summary	`ins`, `instance`, `ip`, `user`, `job`, `cls`, `quantile`	The duration for a rule to execute.
cortex_prometheus_rule_evaluation_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `user`, `job`, `cls`	N/A
cortex_prometheus_rule_evaluation_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `user`, `job`, `cls`	N/A
cortex_prometheus_rule_group_duration_seconds	summary	`ins`, `instance`, `ip`, `user`, `job`, `cls`, `quantile`	The duration of rule group evaluations.
cortex_prometheus_rule_group_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `user`, `job`, `cls`	N/A
cortex_prometheus_rule_group_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `user`, `job`, `cls`	N/A
cortex_query_frontend_connected_schedulers	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of schedulers this frontend is connected to.
cortex_query_frontend_queries_in_progress	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of queries in progress handled by this frontend.
cortex_query_frontend_retries_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
cortex_query_frontend_retries_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
cortex_query_frontend_retries_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
cortex_query_scheduler_connected_frontend_clients	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of query-frontend worker clients currently connected to the query-scheduler.
cortex_query_scheduler_connected_querier_clients	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of querier worker clients currently connected to the query-scheduler.
cortex_query_scheduler_inflight_requests	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	Number of inflight requests (either queued or processing) sampled at a regular interval. Quantile buckets keep track of inflight requests over the last 60s.
cortex_query_scheduler_inflight_requests_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
cortex_query_scheduler_inflight_requests_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
cortex_query_scheduler_queue_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
cortex_query_scheduler_queue_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
cortex_query_scheduler_queue_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
cortex_query_scheduler_queue_length	Unknown	`ins`, `instance`, `ip`, `user`, `job`, `cls`	N/A
cortex_query_scheduler_running	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Value will be 1 if the scheduler is in the ReplicationSet and actively receiving/processing requests
cortex_ring_member_heartbeats_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
cortex_ring_member_tokens_owned	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of tokens owned in the ring.
cortex_ring_member_tokens_to_own	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of tokens to own in the ring.
cortex_ring_members	gauge	`ins`, `instance`, `ip`, `job`, `cls`, `state`	Number of members in the ring
cortex_ring_oldest_member_timestamp	gauge	`ins`, `instance`, `ip`, `job`, `cls`, `state`	Timestamp of the oldest member in the ring.
cortex_ring_tokens_total	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of tokens in the ring
cortex_ruler_clients	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The current number of ruler clients in the pool.
cortex_ruler_config_last_reload_successful	gauge	`ins`, `instance`, `ip`, `user`, `job`, `cls`	Boolean set to 1 whenever the last configuration reload attempt was successful.
cortex_ruler_config_last_reload_successful_seconds	gauge	`ins`, `instance`, `ip`, `user`, `job`, `cls`	Timestamp of the last successful configuration reload.
cortex_ruler_config_updates_total	Unknown	`ins`, `instance`, `ip`, `user`, `job`, `cls`	N/A
cortex_ruler_managers_total	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Total number of managers registered and running in the ruler
cortex_ruler_ring_check_errors_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
cortex_ruler_sync_rules_total	Unknown	`ins`, `instance`, `ip`, `reason`, `job`, `cls`	N/A
deprecated_flags_inuse_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_cgo_go_to_c_calls_calls_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_cpu_classes_gc_mark_assist_cpu_seconds_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_cpu_classes_gc_mark_dedicated_cpu_seconds_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_cpu_classes_gc_mark_idle_cpu_seconds_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_cpu_classes_gc_pause_cpu_seconds_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_cpu_classes_gc_total_cpu_seconds_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_cpu_classes_idle_cpu_seconds_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_cpu_classes_scavenge_assist_cpu_seconds_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_cpu_classes_scavenge_background_cpu_seconds_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_cpu_classes_scavenge_total_cpu_seconds_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_cpu_classes_total_cpu_seconds_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_cpu_classes_user_cpu_seconds_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_cycles_automatic_gc_cycles_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_cycles_forced_gc_cycles_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_cycles_total_gc_cycles_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_duration_seconds	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_gogc_percent	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Heap size target percentage configured by the user, otherwise 100. This value is set by the GOGC environment variable, and the runtime/debug.SetGCPercent function.
go_gc_gomemlimit_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Go runtime memory limit configured by the user, otherwise math.MaxInt64. This value is set by the GOMEMLIMIT environment variable, and the runtime/debug.SetMemoryLimit function.
go_gc_heap_allocs_by_size_bytes_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
go_gc_heap_allocs_by_size_bytes_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_heap_allocs_by_size_bytes_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_heap_allocs_bytes_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_heap_allocs_objects_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_heap_frees_by_size_bytes_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
go_gc_heap_frees_by_size_bytes_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_heap_frees_by_size_bytes_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_heap_frees_bytes_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_heap_frees_objects_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_heap_goal_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Heap size target for the end of the GC cycle.
go_gc_heap_live_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Heap memory occupied by live objects that were marked by the previous GC.
go_gc_heap_objects_objects	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of objects, live or unswept, occupying heap memory.
go_gc_heap_tiny_allocs_objects_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_limiter_last_enabled_gc_cycle	gauge	`ins`, `instance`, `ip`, `job`, `cls`	GC cycle the last time the GC CPU limiter was enabled. This metric is useful for diagnosing the root cause of an out-of-memory error, because the limiter trades memory for CPU time when the GC’s CPU time gets too high. This is most likely to occur with use of SetMemoryLimit. The first GC cycle is cycle 1, so a value of 0 indicates that it was never enabled.
go_gc_pauses_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
go_gc_pauses_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_pauses_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_gc_scan_globals_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The total amount of global variable space that is scannable.
go_gc_scan_heap_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The total amount of heap space that is scannable.
go_gc_scan_stack_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of bytes of stack that were scanned last GC cycle.
go_gc_scan_total_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The total amount of space that is scannable. Sum of all metrics in /gc/scan.
go_gc_stack_starting_size_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The stack size of new goroutines.
go_godebug_non_default_behavior_execerrdot_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_gocachehash_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_gocachetest_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_gocacheverify_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_http2client_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_http2server_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_installgoroot_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_jstmpllitinterp_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_multipartmaxheaders_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_multipartmaxparts_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_multipathtcp_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_panicnil_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_randautoseed_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_tarinsecurepath_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_tlsmaxrsasize_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_x509sha1_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_x509usefallbackroots_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_godebug_non_default_behavior_zipinsecurepath_events_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_goroutines	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of goroutines that currently exist.
go_info	gauge	`version`, `ins`, `instance`, `ip`, `job`, `cls`	Information about the Go environment.
go_memory_classes_heap_free_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Memory that is completely free and eligible to be returned to the underlying system, but has not been. This metric is the runtime’s estimate of free address space that is backed by physical memory.
go_memory_classes_heap_objects_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Memory occupied by live objects and dead objects that have not yet been marked free by the garbage collector.
go_memory_classes_heap_released_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Memory that is completely free and has been returned to the underlying system. This metric is the runtime’s estimate of free address space that is still mapped into the process, but is not backed by physical memory.
go_memory_classes_heap_stacks_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Memory allocated from the heap that is reserved for stack space, whether or not it is currently in-use. Currently, this represents all stack memory for goroutines. It also includes all OS thread stacks in non-cgo programs. Note that stacks may be allocated differently in the future, and this may change.
go_memory_classes_heap_unused_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Memory that is reserved for heap objects but is not currently used to hold heap objects.
go_memory_classes_metadata_mcache_free_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Memory that is reserved for runtime mcache structures, but not in-use.
go_memory_classes_metadata_mcache_inuse_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Memory that is occupied by runtime mcache structures that are currently being used.
go_memory_classes_metadata_mspan_free_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Memory that is reserved for runtime mspan structures, but not in-use.
go_memory_classes_metadata_mspan_inuse_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Memory that is occupied by runtime mspan structures that are currently being used.
go_memory_classes_metadata_other_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Memory that is reserved for or used to hold runtime metadata.
go_memory_classes_os_stacks_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Stack memory allocated by the underlying operating system. In non-cgo programs this metric is currently zero. This may change in the future. In cgo programs this metric includes OS thread stacks allocated directly from the OS. Currently, this only accounts for one stack in c-shared and c-archive build modes, and other sources of stacks from the OS are not measured. This too may change in the future.
go_memory_classes_other_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Memory used by execution trace buffers, structures for debugging the runtime, finalizer and profiler specials, and more.
go_memory_classes_profiling_buckets_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Memory that is used by the stack trace hash map used for profiling.
go_memory_classes_total_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	All memory mapped by the Go runtime into the current process as read-write. Note that this does not include memory mapped by code called via cgo or via the syscall package. Sum of all metrics in /memory/classes.
go_memstats_alloc_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of frees.
go_memstats_gc_sys_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of heap bytes that are in use.
go_memstats_heap_objects	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of allocated objects.
go_memstats_heap_released_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of heap bytes released to OS.
go_memstats_heap_sys_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of pointer lookups.
go_memstats_mallocs_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of mallocs.
go_memstats_mcache_inuse_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of bytes obtained from system.
go_sched_gomaxprocs_threads	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The current runtime.GOMAXPROCS setting, or the number of operating system threads that can execute user-level Go code simultaneously.
go_sched_goroutines_goroutines	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Count of live goroutines.
go_sched_latencies_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
go_sched_latencies_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_sched_latencies_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_sql_stats_connections_blocked_seconds	unknown	`ins`, `instance`, `db_name`, `ip`, `job`, `cls`	The total time blocked waiting for a new connection.
go_sql_stats_connections_closed_max_idle	unknown	`ins`, `instance`, `db_name`, `ip`, `job`, `cls`	The total number of connections closed due to SetMaxIdleConns.
go_sql_stats_connections_closed_max_idle_time	unknown	`ins`, `instance`, `db_name`, `ip`, `job`, `cls`	The total number of connections closed due to SetConnMaxIdleTime.
go_sql_stats_connections_closed_max_lifetime	unknown	`ins`, `instance`, `db_name`, `ip`, `job`, `cls`	The total number of connections closed due to SetConnMaxLifetime.
go_sql_stats_connections_idle	gauge	`ins`, `instance`, `db_name`, `ip`, `job`, `cls`	The number of idle connections.
go_sql_stats_connections_in_use	gauge	`ins`, `instance`, `db_name`, `ip`, `job`, `cls`	The number of connections currently in use.
go_sql_stats_connections_max_open	gauge	`ins`, `instance`, `db_name`, `ip`, `job`, `cls`	Maximum number of open connections to the database.
go_sql_stats_connections_open	gauge	`ins`, `instance`, `db_name`, `ip`, `job`, `cls`	The number of established connections both in use and idle.
go_sql_stats_connections_waited_for	unknown	`ins`, `instance`, `db_name`, `ip`, `job`, `cls`	The total number of connections waited for.
go_sync_mutex_wait_total_seconds_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
go_threads	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of OS threads created.
grafana_access_evaluation_count	unknown	`ins`, `instance`, `ip`, `job`, `cls`	Number of evaluation calls.
grafana_access_evaluation_duration_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_access_evaluation_duration_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_access_evaluation_duration_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_access_permissions_duration_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_access_permissions_duration_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_access_permissions_duration_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_aggregator_discovery_aggregation_count_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_active_alerts	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of active alerts.
grafana_alerting_active_configurations	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of active Alertmanager configurations.
grafana_alerting_alertmanager_config_match	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The total number of match
grafana_alerting_alertmanager_config_match_re	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The total number of matchRE
grafana_alerting_alertmanager_config_matchers	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The total number of matchers
grafana_alerting_alertmanager_config_object_matchers	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The total number of object_matchers
grafana_alerting_discovered_configurations	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of organizations we’ve discovered that require an Alertmanager configuration.
grafana_alerting_dispatcher_aggregation_groups	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of active aggregation groups
grafana_alerting_dispatcher_alert_processing_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_dispatcher_alert_processing_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_execution_time_milliseconds	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	Summary of alert execution duration.
grafana_alerting_execution_time_milliseconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_execution_time_milliseconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_nflog_gc_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_nflog_gc_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_nflog_gossip_messages_propagated_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_nflog_queries_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_nflog_query_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_alerting_nflog_query_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_nflog_query_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_nflog_query_errors_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_nflog_snapshot_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_nflog_snapshot_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_nflog_snapshot_size_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Size of the last notification log snapshot in bytes.
grafana_alerting_notification_latency_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_alerting_notification_latency_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_notification_latency_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_schedule_alert_rules	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of alert rules that could be considered for evaluation at the next tick.
grafana_alerting_schedule_alert_rules_hash	gauge	`ins`, `instance`, `ip`, `job`, `cls`	A hash of the alert rules that could be considered for evaluation at the next tick.
grafana_alerting_schedule_periodic_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_alerting_schedule_periodic_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_schedule_periodic_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_schedule_query_alert_rules_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_alerting_schedule_query_alert_rules_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_schedule_query_alert_rules_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_scheduler_behind_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The total number of seconds the scheduler is behind.
grafana_alerting_silences_gc_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_silences_gc_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_silences_gossip_messages_propagated_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_silences_queries_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_silences_query_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_alerting_silences_query_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_silences_query_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_silences_query_errors_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_silences_snapshot_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_silences_snapshot_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_silences_snapshot_size_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Size of the last silence snapshot in bytes.
grafana_alerting_state_calculation_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_alerting_state_calculation_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_state_calculation_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_state_history_writes_bytes_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_alerting_ticker_interval_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Interval at which the ticker is meant to tick.
grafana_alerting_ticker_last_consumed_tick_timestamp_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Timestamp of the last consumed tick in seconds.
grafana_alerting_ticker_next_tick_timestamp_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Timestamp of the next tick in seconds before it is consumed.
grafana_api_admin_user_created_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_dashboard_get_milliseconds	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	summary for dashboard get duration
grafana_api_dashboard_get_milliseconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_dashboard_get_milliseconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_dashboard_save_milliseconds	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	summary for dashboard save duration
grafana_api_dashboard_save_milliseconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_dashboard_save_milliseconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_dashboard_search_milliseconds	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	summary for dashboard search duration
grafana_api_dashboard_search_milliseconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_dashboard_search_milliseconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_dashboard_snapshot_create_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_dashboard_snapshot_external_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_dashboard_snapshot_get_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_dataproxy_request_all_milliseconds	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	summary for dataproxy request duration
grafana_api_dataproxy_request_all_milliseconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_dataproxy_request_all_milliseconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_login_oauth_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_login_post_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_login_saml_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_models_dashboard_insert_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_org_create_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_response_status_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `code`	N/A
grafana_api_user_signup_completed_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_user_signup_invite_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_api_user_signup_started_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_apiserver_audit_event_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_apiserver_audit_requests_rejected_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_apiserver_client_certificate_expiration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_apiserver_client_certificate_expiration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_apiserver_client_certificate_expiration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_apiserver_envelope_encryption_dek_cache_fill_percent	gauge	`ins`, `instance`, `ip`, `job`, `cls`	[ALPHA] Percent of the cache slots currently occupied by cached DEKs.
grafana_apiserver_flowcontrol_seat_fair_frac	gauge	`ins`, `instance`, `ip`, `job`, `cls`	[ALPHA] Fair fraction of server’s concurrency to allocate to each priority level that can use it
grafana_apiserver_storage_data_key_generation_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_apiserver_storage_data_key_generation_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_apiserver_storage_data_key_generation_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_apiserver_storage_data_key_generation_failures_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_apiserver_storage_envelope_transformation_cache_misses_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_apiserver_tls_handshake_errors_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_apiserver_webhooks_x509_insecure_sha1_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_apiserver_webhooks_x509_missing_san_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_authn_authn_failed_authentication_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_authn_authn_successful_authentication_total	Unknown	`ins`, `instance`, `ip`, `client`, `job`, `cls`	N/A
grafana_authn_authn_successful_login_total	Unknown	`ins`, `instance`, `ip`, `client`, `job`, `cls`	N/A
grafana_aws_cloudwatch_get_metric_data_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_aws_cloudwatch_get_metric_statistics_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_aws_cloudwatch_list_metrics_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_build_info	gauge	`revision`, `version`, `ins`, `instance`, `edition`, `ip`, `goversion`, `job`, `cls`, `branch`	A metric with a constant ‘1’ value labeled by version, revision, branch, and goversion from which Grafana was built
grafana_build_timestamp	gauge	`revision`, `version`, `ins`, `instance`, `edition`, `ip`, `goversion`, `job`, `cls`, `branch`	A metric exposing when the binary was built in epoch
grafana_cardinality_enforcement_unexpected_categorizations_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_database_conn_idle	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of idle connections
grafana_database_conn_in_use	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of connections currently in use
grafana_database_conn_max_idle_closed_seconds	unknown	`ins`, `instance`, `ip`, `job`, `cls`	The total number of connections closed due to SetConnMaxIdleTime
grafana_database_conn_max_idle_closed_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_database_conn_max_lifetime_closed_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_database_conn_max_open	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Maximum number of open connections to the database
grafana_database_conn_open	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of established connections both in use and idle
grafana_database_conn_wait_count_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_database_conn_wait_duration_seconds	unknown	`ins`, `instance`, `ip`, `job`, `cls`	The total time blocked waiting for a new connection
grafana_datasource_request_duration_seconds_bucket	Unknown	`datasource`, `ins`, `instance`, `method`, `ip`, `le`, `datasource_type`, `job`, `cls`, `code`	N/A
grafana_datasource_request_duration_seconds_count	Unknown	`datasource`, `ins`, `instance`, `method`, `ip`, `datasource_type`, `job`, `cls`, `code`	N/A
grafana_datasource_request_duration_seconds_sum	Unknown	`datasource`, `ins`, `instance`, `method`, `ip`, `datasource_type`, `job`, `cls`, `code`	N/A
grafana_datasource_request_in_flight	gauge	`datasource`, `ins`, `instance`, `ip`, `datasource_type`, `job`, `cls`	A gauge of outgoing data source requests currently being sent by Grafana
grafana_datasource_request_total	Unknown	`datasource`, `ins`, `instance`, `method`, `ip`, `datasource_type`, `job`, `cls`, `code`	N/A
grafana_datasource_response_size_bytes_bucket	Unknown	`datasource`, `ins`, `instance`, `ip`, `le`, `datasource_type`, `job`, `cls`	N/A
grafana_datasource_response_size_bytes_count	Unknown	`datasource`, `ins`, `instance`, `ip`, `datasource_type`, `job`, `cls`	N/A
grafana_datasource_response_size_bytes_sum	Unknown	`datasource`, `ins`, `instance`, `ip`, `datasource_type`, `job`, `cls`	N/A
grafana_db_datasource_query_by_id_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_disabled_metrics_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_emails_sent_failed	unknown	`ins`, `instance`, `ip`, `job`, `cls`	Number of emails Grafana failed to send
grafana_emails_sent_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_encryption_cache_reads_total	Unknown	`ins`, `instance`, `method`, `ip`, `hit`, `job`, `cls`	N/A
grafana_encryption_ops_total	Unknown	`ins`, `instance`, `ip`, `success`, `operation`, `job`, `cls`	N/A
grafana_environment_info	gauge	`version`, `ins`, `instance`, `ip`, `job`, `cls`, `commit`	A metric with a constant ‘1’ value labeled by environment information about the running instance.
grafana_feature_toggles_info	gauge	`ins`, `instance`, `ip`, `job`, `cls`	info metric that exposes what feature toggles are enabled or not
grafana_frontend_boot_css_time_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_frontend_boot_css_time_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_frontend_boot_css_time_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_frontend_boot_first_contentful_paint_time_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_frontend_boot_first_contentful_paint_time_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_frontend_boot_first_contentful_paint_time_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_frontend_boot_first_paint_time_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_frontend_boot_first_paint_time_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_frontend_boot_first_paint_time_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_frontend_boot_js_done_time_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_frontend_boot_js_done_time_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_frontend_boot_js_done_time_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_frontend_boot_load_time_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_frontend_boot_load_time_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_frontend_boot_load_time_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_frontend_plugins_preload_ms_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_frontend_plugins_preload_ms_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_frontend_plugins_preload_ms_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_hidden_metrics_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_http_request_duration_seconds_bucket	Unknown	`ins`, `instance`, `method`, `ip`, `le`, `job`, `cls`, `status_code`, `handler`	N/A
grafana_http_request_duration_seconds_count	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`, `status_code`, `handler`	N/A
grafana_http_request_duration_seconds_sum	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`, `status_code`, `handler`	N/A
grafana_http_request_in_flight	gauge	`ins`, `instance`, `ip`, `job`, `cls`	A gauge of requests currently being served by Grafana.
grafana_idforwarding_idforwarding_failed_token_signing_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_idforwarding_idforwarding_token_signing_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_idforwarding_idforwarding_token_signing_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_idforwarding_idforwarding_token_signing_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_idforwarding_idforwarding_token_signing_from_cache_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_idforwarding_idforwarding_token_signing_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_instance_start_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_ldap_users_sync_execution_time	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	summary for LDAP users sync execution duration
grafana_ldap_users_sync_execution_time_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_ldap_users_sync_execution_time_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_live_client_command_duration_seconds	summary	`ins`, `instance`, `method`, `ip`, `job`, `cls`, `quantile`	Client command duration summary.
grafana_live_client_command_duration_seconds_count	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`	N/A
grafana_live_client_command_duration_seconds_sum	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`	N/A
grafana_live_client_num_reply_errors	unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`, `code`	Number of errors in replies sent to clients.
grafana_live_client_num_server_disconnects	unknown	`ins`, `instance`, `ip`, `job`, `cls`, `code`	Number of server initiated disconnects.
grafana_live_client_recover	unknown	`ins`, `instance`, `ip`, `recovered`, `job`, `cls`	Count of recover operations.
grafana_live_node_action_count	unknown	`action`, `ins`, `instance`, `ip`, `job`, `cls`	Number of node actions called.
grafana_live_node_build	gauge	`version`, `ins`, `instance`, `ip`, `job`, `cls`	Node build info.
grafana_live_node_messages_received_count	unknown	`ins`, `instance`, `ip`, `type`, `job`, `cls`	Number of messages received.
grafana_live_node_messages_sent_count	unknown	`ins`, `instance`, `ip`, `type`, `job`, `cls`	Number of messages sent.
grafana_live_node_num_channels	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of channels with one or more subscribers.
grafana_live_node_num_clients	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of clients connected.
grafana_live_node_num_nodes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of nodes in cluster.
grafana_live_node_num_subscriptions	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of subscriptions.
grafana_live_node_num_users	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of unique users connected.
grafana_live_transport_connect_count	unknown	`ins`, `instance`, `ip`, `transport`, `job`, `cls`	Number of connections to specific transport.
grafana_live_transport_messages_sent	unknown	`ins`, `instance`, `ip`, `transport`, `job`, `cls`	Number of messages sent over specific transport.
grafana_loki_plugin_parse_response_duration_seconds_bucket	Unknown	`endpoint`, `ins`, `instance`, `ip`, `le`, `status`, `job`, `cls`	N/A
grafana_loki_plugin_parse_response_duration_seconds_count	Unknown	`endpoint`, `ins`, `instance`, `ip`, `status`, `job`, `cls`	N/A
grafana_loki_plugin_parse_response_duration_seconds_sum	Unknown	`endpoint`, `ins`, `instance`, `ip`, `status`, `job`, `cls`	N/A
grafana_page_response_status_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `code`	N/A
grafana_plugin_build_info	gauge	`version`, `signature_status`, `ins`, `instance`, `plugin_type`, `ip`, `plugin_id`, `job`, `cls`	A metric with a constant ‘1’ value labeled by pluginId, pluginType and version from which Grafana plugin was built
grafana_plugin_request_duration_milliseconds_bucket	Unknown	`endpoint`, `ins`, `instance`, `target`, `ip`, `le`, `plugin_id`, `job`, `cls`	N/A
grafana_plugin_request_duration_milliseconds_count	Unknown	`endpoint`, `ins`, `instance`, `target`, `ip`, `plugin_id`, `job`, `cls`	N/A
grafana_plugin_request_duration_milliseconds_sum	Unknown	`endpoint`, `ins`, `instance`, `target`, `ip`, `plugin_id`, `job`, `cls`	N/A
grafana_plugin_request_duration_seconds_bucket	Unknown	`endpoint`, `ins`, `instance`, `target`, `ip`, `le`, `status`, `plugin_id`, `source`, `job`, `cls`	N/A
grafana_plugin_request_duration_seconds_count	Unknown	`endpoint`, `ins`, `instance`, `target`, `ip`, `status`, `plugin_id`, `source`, `job`, `cls`	N/A
grafana_plugin_request_duration_seconds_sum	Unknown	`endpoint`, `ins`, `instance`, `target`, `ip`, `status`, `plugin_id`, `source`, `job`, `cls`	N/A
grafana_plugin_request_size_bytes_bucket	Unknown	`endpoint`, `ins`, `instance`, `target`, `ip`, `le`, `plugin_id`, `source`, `job`, `cls`	N/A
grafana_plugin_request_size_bytes_count	Unknown	`endpoint`, `ins`, `instance`, `target`, `ip`, `plugin_id`, `source`, `job`, `cls`	N/A
grafana_plugin_request_size_bytes_sum	Unknown	`endpoint`, `ins`, `instance`, `target`, `ip`, `plugin_id`, `source`, `job`, `cls`	N/A
grafana_plugin_request_total	Unknown	`endpoint`, `ins`, `instance`, `target`, `ip`, `status`, `plugin_id`, `job`, `cls`	N/A
grafana_process_cpu_seconds_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_process_max_fds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Maximum number of open file descriptors.
grafana_process_open_fds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of open file descriptors.
grafana_process_resident_memory_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Resident memory size in bytes.
grafana_process_start_time_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Start time of the process since unix epoch in seconds.
grafana_process_virtual_memory_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Virtual memory size in bytes.
grafana_process_virtual_memory_max_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Maximum amount of virtual memory available in bytes.
grafana_prometheus_plugin_backend_request_count	unknown	`endpoint`, `ins`, `instance`, `ip`, `status`, `errorSource`, `job`, `cls`	The total amount of prometheus backend plugin requests
grafana_proxy_response_status_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `code`	N/A
grafana_public_dashboard_request_count	unknown	`ins`, `instance`, `ip`, `job`, `cls`	counter for public dashboards requests
grafana_registered_metrics_total	Unknown	`ins`, `instance`, `ip`, `stability_level`, `deprecated_version`, `job`, `cls`	N/A
grafana_rendering_queue_size	gauge	`ins`, `instance`, `ip`, `job`, `cls`	size of rendering queue
grafana_search_dashboard_search_failures_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_search_dashboard_search_failures_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_search_dashboard_search_failures_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_search_dashboard_search_successes_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
grafana_search_dashboard_search_successes_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_search_dashboard_search_successes_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
grafana_stat_active_users	gauge	`ins`, `instance`, `ip`, `job`, `cls`	number of active users
grafana_stat_total_orgs	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of orgs
grafana_stat_total_playlists	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of playlists
grafana_stat_total_service_account_tokens	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of service account tokens
grafana_stat_total_service_accounts	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of service accounts
grafana_stat_total_service_accounts_role_none	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of service accounts with no role
grafana_stat_total_teams	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of teams
grafana_stat_total_users	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of users
grafana_stat_totals_active_admins	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of active admins
grafana_stat_totals_active_editors	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of active editors
grafana_stat_totals_active_viewers	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of active viewers
grafana_stat_totals_admins	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of admins
grafana_stat_totals_alert_rules	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of alert rules in the database
grafana_stat_totals_annotations	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of annotations in the database
grafana_stat_totals_correlations	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of correlations
grafana_stat_totals_dashboard	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of dashboards
grafana_stat_totals_dashboard_versions	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of dashboard versions in the database
grafana_stat_totals_data_keys	gauge	`ins`, `instance`, `ip`, `job`, `cls`, `active`	total amount of data keys in the database
grafana_stat_totals_datasource	gauge	`ins`, `instance`, `ip`, `plugin_id`, `job`, `cls`	total number of defined datasources, labeled by pluginId
grafana_stat_totals_editors	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of editors
grafana_stat_totals_folder	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of folders
grafana_stat_totals_library_panels	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of library panels in the database
grafana_stat_totals_library_variables	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of library variables in the database
grafana_stat_totals_public_dashboard	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of public dashboards
grafana_stat_totals_rule_groups	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of alert rule groups in the database
grafana_stat_totals_viewers	gauge	`ins`, `instance`, `ip`, `job`, `cls`	total amount of viewers
infra_up	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
jaeger_tracer_baggage_restrictions_updates_total	Unknown	`result`, `ins`, `instance`, `ip`, `job`, `cls`	N/A
jaeger_tracer_baggage_truncations_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
jaeger_tracer_baggage_updates_total	Unknown	`result`, `ins`, `instance`, `ip`, `job`, `cls`	N/A
jaeger_tracer_finished_spans_total	Unknown	`ins`, `instance`, `ip`, `sampled`, `job`, `cls`	N/A
jaeger_tracer_reporter_queue_length	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Current number of spans in the reporter queue
jaeger_tracer_reporter_spans_total	Unknown	`result`, `ins`, `instance`, `ip`, `job`, `cls`	N/A
jaeger_tracer_sampler_queries_total	Unknown	`result`, `ins`, `instance`, `ip`, `job`, `cls`	N/A
jaeger_tracer_sampler_updates_total	Unknown	`result`, `ins`, `instance`, `ip`, `job`, `cls`	N/A
jaeger_tracer_span_context_decoding_errors_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
jaeger_tracer_started_spans_total	Unknown	`ins`, `instance`, `ip`, `sampled`, `job`, `cls`	N/A
jaeger_tracer_throttled_debug_spans_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
jaeger_tracer_throttler_updates_total	Unknown	`result`, `ins`, `instance`, `ip`, `job`, `cls`	N/A
jaeger_tracer_traces_total	Unknown	`ins`, `instance`, `ip`, `sampled`, `job`, `cls`, `state`	N/A
kv_request_duration_seconds_bucket	Unknown	`ins`, `instance`, `role`, `ip`, `le`, `kv_name`, `type`, `operation`, `job`, `cls`, `status_code`	N/A
kv_request_duration_seconds_count	Unknown	`ins`, `instance`, `role`, `ip`, `kv_name`, `type`, `operation`, `job`, `cls`, `status_code`	N/A
kv_request_duration_seconds_sum	Unknown	`ins`, `instance`, `role`, `ip`, `kv_name`, `type`, `operation`, `job`, `cls`, `status_code`	N/A
legacy_grafana_alerting_ticker_interval_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Interval at which the ticker is meant to tick.
legacy_grafana_alerting_ticker_last_consumed_tick_timestamp_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Timestamp of the last consumed tick in seconds.
legacy_grafana_alerting_ticker_next_tick_timestamp_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Timestamp of the next tick in seconds before it is consumed.
logql_query_duration_seconds_bucket	Unknown	`ins`, `instance`, `query_type`, `ip`, `le`, `job`, `cls`	N/A
logql_query_duration_seconds_count	Unknown	`ins`, `instance`, `query_type`, `ip`, `job`, `cls`	N/A
logql_query_duration_seconds_sum	Unknown	`ins`, `instance`, `query_type`, `ip`, `job`, `cls`	N/A
loki_azure_blob_egress_bytes_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_boltdb_shipper_apply_retention_last_successful_run_timestamp_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Unix timestamp of the last successful retention run
loki_boltdb_shipper_compact_tables_operation_duration_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Time (in seconds) spent in compacting all the tables
loki_boltdb_shipper_compact_tables_operation_last_successful_run_timestamp_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Unix timestamp of the last successful compaction run
loki_boltdb_shipper_compact_tables_operation_total	Unknown	`ins`, `instance`, `ip`, `status`, `job`, `cls`	N/A
loki_boltdb_shipper_compactor_running	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Value will be 1 if compactor is currently running on this instance
loki_boltdb_shipper_open_existing_file_failures_total	Unknown	`ins`, `instance`, `ip`, `component`, `job`, `cls`	N/A
loki_boltdb_shipper_query_time_table_download_duration_seconds	unknown	`ins`, `instance`, `ip`, `component`, `job`, `cls`, `table`	Time (in seconds) spent in downloading of files per table at query time
loki_boltdb_shipper_request_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `component`, `operation`, `job`, `cls`, `status_code`	N/A
loki_boltdb_shipper_request_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `component`, `operation`, `job`, `cls`, `status_code`	N/A
loki_boltdb_shipper_request_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `component`, `operation`, `job`, `cls`, `status_code`	N/A
loki_boltdb_shipper_tables_download_operation_duration_seconds	gauge	`ins`, `instance`, `ip`, `component`, `job`, `cls`	Time (in seconds) spent in downloading updated files for all the tables
loki_boltdb_shipper_tables_sync_operation_total	Unknown	`ins`, `instance`, `ip`, `status`, `component`, `job`, `cls`	N/A
loki_boltdb_shipper_tables_upload_operation_total	Unknown	`ins`, `instance`, `ip`, `status`, `component`, `job`, `cls`	N/A
loki_build_info	gauge	`revision`, `version`, `ins`, `instance`, `ip`, `tags`, `goarch`, `goversion`, `job`, `cls`, `branch`, `goos`	A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which loki was built, and the goos and goarch for the build.
loki_bytes_per_line_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_bytes_per_line_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_bytes_per_line_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_cache_corrupt_chunks_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_cache_fetched_keys	unknown	`ins`, `instance`, `ip`, `job`, `cls`	Total count of keys requested from cache.
loki_cache_hits	unknown	`ins`, `instance`, `ip`, `job`, `cls`	Total count of keys found in cache.
loki_cache_request_duration_seconds_bucket	Unknown	`ins`, `instance`, `method`, `ip`, `le`, `job`, `cls`, `status_code`	N/A
loki_cache_request_duration_seconds_count	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`, `status_code`	N/A
loki_cache_request_duration_seconds_sum	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`, `status_code`	N/A
loki_cache_value_size_bytes_bucket	Unknown	`ins`, `instance`, `method`, `ip`, `le`, `job`, `cls`	N/A
loki_cache_value_size_bytes_count	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`	N/A
loki_cache_value_size_bytes_sum	Unknown	`ins`, `instance`, `method`, `ip`, `job`, `cls`	N/A
loki_chunk_fetcher_cache_dequeued_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_fetcher_cache_enqueued_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_fetcher_cache_skipped_buffer_full_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_fetcher_fetched_size_bytes_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `source`, `job`, `cls`	N/A
loki_chunk_fetcher_fetched_size_bytes_count	Unknown	`ins`, `instance`, `ip`, `source`, `job`, `cls`	N/A
loki_chunk_fetcher_fetched_size_bytes_sum	Unknown	`ins`, `instance`, `ip`, `source`, `job`, `cls`	N/A
loki_chunk_store_chunks_per_query_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_chunk_store_chunks_per_query_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_store_chunks_per_query_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_store_deduped_bytes_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_store_deduped_chunks_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_store_fetched_chunk_bytes_total	Unknown	`ins`, `instance`, `ip`, `user`, `job`, `cls`	N/A
loki_chunk_store_fetched_chunks_total	Unknown	`ins`, `instance`, `ip`, `user`, `job`, `cls`	N/A
loki_chunk_store_index_entries_per_chunk_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_chunk_store_index_entries_per_chunk_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_store_index_entries_per_chunk_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_store_index_lookups_per_query_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_chunk_store_index_lookups_per_query_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_store_index_lookups_per_query_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_store_series_post_intersection_per_query_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_chunk_store_series_post_intersection_per_query_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_store_series_post_intersection_per_query_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_store_series_pre_intersection_per_query_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_chunk_store_series_pre_intersection_per_query_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_store_series_pre_intersection_per_query_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_chunk_store_stored_chunk_bytes_total	Unknown	`ins`, `instance`, `ip`, `user`, `job`, `cls`	N/A
loki_chunk_store_stored_chunks_total	Unknown	`ins`, `instance`, `ip`, `user`, `job`, `cls`	N/A
loki_consul_request_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `kv_name`, `operation`, `job`, `cls`, `status_code`	N/A
loki_consul_request_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `kv_name`, `operation`, `job`, `cls`, `status_code`	N/A
loki_consul_request_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `kv_name`, `operation`, `job`, `cls`, `status_code`	N/A
loki_delete_request_lookups_failed_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_delete_request_lookups_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_discarded_bytes_total	Unknown	`ins`, `instance`, `ip`, `reason`, `job`, `cls`, `tenant`	N/A
loki_discarded_samples_total	Unknown	`ins`, `instance`, `ip`, `reason`, `job`, `cls`, `tenant`	N/A
loki_distributor_bytes_received_total	Unknown	`ins`, `instance`, `retention_hours`, `ip`, `job`, `cls`, `tenant`	N/A
loki_distributor_ingester_appends_total	Unknown	`ins`, `instance`, `ip`, `ingester`, `job`, `cls`	N/A
loki_distributor_lines_received_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `tenant`	N/A
loki_distributor_replication_factor	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The configured replication factor.
loki_distributor_structured_metadata_bytes_received_total	Unknown	`ins`, `instance`, `retention_hours`, `ip`, `job`, `cls`, `tenant`	N/A
loki_experimental_features_in_use_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_index_chunk_refs_total	Unknown	`ins`, `instance`, `ip`, `status`, `job`, `cls`	N/A
loki_index_request_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `component`, `operation`, `job`, `cls`, `status_code`	N/A
loki_index_request_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `component`, `operation`, `job`, `cls`, `status_code`	N/A
loki_index_request_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `component`, `operation`, `job`, `cls`, `status_code`	N/A
loki_inflight_requests	gauge	`ins`, `instance`, `method`, `ip`, `route`, `job`, `cls`	Current number of inflight requests.
loki_ingester_autoforget_unhealthy_ingesters_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_blocks_per_chunk_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_ingester_blocks_per_chunk_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_blocks_per_chunk_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_checkpoint_creations_failed_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_checkpoint_creations_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_checkpoint_deletions_failed_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_checkpoint_deletions_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_checkpoint_duration_seconds	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	Time taken to create a checkpoint.
loki_ingester_checkpoint_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_checkpoint_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_checkpoint_logged_bytes_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_age_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_ingester_chunk_age_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_age_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_bounds_hours_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_ingester_chunk_bounds_hours_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_bounds_hours_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_compression_ratio_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_ingester_chunk_compression_ratio_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_compression_ratio_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_encode_time_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_ingester_chunk_encode_time_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_encode_time_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_entries_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_ingester_chunk_entries_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_entries_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_size_bytes_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_ingester_chunk_size_bytes_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_size_bytes_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_stored_bytes_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `tenant`	N/A
loki_ingester_chunk_utilization_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_ingester_chunk_utilization_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunk_utilization_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunks_created_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_chunks_flushed_total	Unknown	`ins`, `instance`, `ip`, `reason`, `job`, `cls`	N/A
loki_ingester_chunks_stored_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `tenant`	N/A
loki_ingester_client_request_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `operation`, `job`, `cls`, `status_code`	N/A
loki_ingester_client_request_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `operation`, `job`, `cls`, `status_code`	N/A
loki_ingester_client_request_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `operation`, `job`, `cls`, `status_code`	N/A
loki_ingester_limiter_enabled	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Whether the ingester’s limiter is enabled
loki_ingester_memory_chunks	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The total number of chunks in memory.
loki_ingester_memory_streams	gauge	`ins`, `instance`, `ip`, `job`, `cls`, `tenant`	The total number of streams in memory per tenant.
loki_ingester_memory_streams_labels_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Total bytes of labels of the streams in memory.
loki_ingester_received_chunks	unknown	`ins`, `instance`, `ip`, `job`, `cls`	The total number of chunks received by this ingester whilst joining.
loki_ingester_samples_per_chunk_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_ingester_samples_per_chunk_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_samples_per_chunk_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_sent_chunks	unknown	`ins`, `instance`, `ip`, `job`, `cls`	The total number of chunks sent by this ingester whilst leaving.
loki_ingester_shutdown_marker	gauge	`ins`, `instance`, `ip`, `job`, `cls`	1 if prepare shutdown has been called, 0 otherwise
loki_ingester_streams_created_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `tenant`	N/A
loki_ingester_streams_removed_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `tenant`	N/A
loki_ingester_wal_bytes_in_use	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Total number of bytes in use by the WAL recovery process.
loki_ingester_wal_disk_full_failures_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_wal_duplicate_entries_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_wal_logged_bytes_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_wal_records_logged_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_wal_recovered_bytes_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_wal_recovered_chunks_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_wal_recovered_entries_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_wal_recovered_streams_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_ingester_wal_replay_active	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Whether the WAL is replaying
loki_ingester_wal_replay_duration_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Time taken to replay the checkpoint and the WAL.
loki_ingester_wal_replay_flushing	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Whether the WAL replay is in a flushing phase due to backpressure
loki_internal_log_messages_total	Unknown	`ins`, `instance`, `ip`, `level`, `job`, `cls`	N/A
loki_kv_request_duration_seconds_bucket	Unknown	`ins`, `instance`, `role`, `ip`, `le`, `kv_name`, `type`, `operation`, `job`, `cls`, `status_code`	N/A
loki_kv_request_duration_seconds_count	Unknown	`ins`, `instance`, `role`, `ip`, `kv_name`, `type`, `operation`, `job`, `cls`, `status_code`	N/A
loki_kv_request_duration_seconds_sum	Unknown	`ins`, `instance`, `role`, `ip`, `kv_name`, `type`, `operation`, `job`, `cls`, `status_code`	N/A
loki_log_flushes_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_log_flushes_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_log_flushes_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_log_messages_total	Unknown	`ins`, `instance`, `ip`, `level`, `job`, `cls`	N/A
loki_logql_querystats_bytes_processed_per_seconds_bucket	Unknown	`ins`, `instance`, `range`, `ip`, `le`, `sharded`, `type`, `job`, `cls`, `status_code`, `latency_type`	N/A
loki_logql_querystats_bytes_processed_per_seconds_count	Unknown	`ins`, `instance`, `range`, `ip`, `sharded`, `type`, `job`, `cls`, `status_code`, `latency_type`	N/A
loki_logql_querystats_bytes_processed_per_seconds_sum	Unknown	`ins`, `instance`, `range`, `ip`, `sharded`, `type`, `job`, `cls`, `status_code`, `latency_type`	N/A
loki_logql_querystats_chunk_download_latency_seconds_bucket	Unknown	`ins`, `instance`, `range`, `ip`, `le`, `type`, `job`, `cls`, `status_code`	N/A
loki_logql_querystats_chunk_download_latency_seconds_count	Unknown	`ins`, `instance`, `range`, `ip`, `type`, `job`, `cls`, `status_code`	N/A
loki_logql_querystats_chunk_download_latency_seconds_sum	Unknown	`ins`, `instance`, `range`, `ip`, `type`, `job`, `cls`, `status_code`	N/A
loki_logql_querystats_downloaded_chunk_total	Unknown	`ins`, `instance`, `range`, `ip`, `type`, `job`, `cls`, `status_code`	N/A
loki_logql_querystats_duplicates_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_logql_querystats_ingester_sent_lines_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_logql_querystats_latency_seconds_bucket	Unknown	`ins`, `instance`, `range`, `ip`, `le`, `type`, `job`, `cls`, `status_code`	N/A
loki_logql_querystats_latency_seconds_count	Unknown	`ins`, `instance`, `range`, `ip`, `type`, `job`, `cls`, `status_code`	N/A
loki_logql_querystats_latency_seconds_sum	Unknown	`ins`, `instance`, `range`, `ip`, `type`, `job`, `cls`, `status_code`	N/A
loki_panic_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_querier_index_cache_corruptions_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_querier_index_cache_encode_errors_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_querier_index_cache_gets_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_querier_index_cache_hits_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_querier_index_cache_puts_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_querier_query_frontend_clients	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The current number of clients connected to query-frontend.
loki_querier_query_frontend_request_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `operation`, `job`, `cls`, `status_code`	N/A
loki_querier_query_frontend_request_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `operation`, `job`, `cls`, `status_code`	N/A
loki_querier_query_frontend_request_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `operation`, `job`, `cls`, `status_code`	N/A
loki_querier_tail_active	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of active tailers
loki_querier_tail_active_streams	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of active streams being tailed
loki_querier_tail_bytes_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_querier_worker_concurrency	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of concurrent querier workers
loki_querier_worker_inflight_queries	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of queries being processed by the querier workers
loki_query_frontend_log_result_cache_hit_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_query_frontend_log_result_cache_miss_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_query_frontend_partitions_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_query_frontend_partitions_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_query_frontend_partitions_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_query_frontend_shard_factor_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `mapper`, `job`, `cls`	N/A
loki_query_frontend_shard_factor_count	Unknown	`ins`, `instance`, `ip`, `mapper`, `job`, `cls`	N/A
loki_query_frontend_shard_factor_sum	Unknown	`ins`, `instance`, `ip`, `mapper`, `job`, `cls`	N/A
loki_query_scheduler_enqueue_count	Unknown	`ins`, `instance`, `ip`, `level`, `user`, `job`, `cls`	N/A
loki_rate_store_expired_streams_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_rate_store_max_stream_rate_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The maximum stream rate for any stream reported by ingesters during a sync operation. Sharded Streams are combined.
loki_rate_store_max_stream_shards	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of shards for a single stream reported by ingesters during a sync operation.
loki_rate_store_max_unique_stream_rate_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The maximum stream rate for any stream reported by ingesters during a sync operation. Sharded Streams are considered separate.
loki_rate_store_stream_rate_bytes_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_rate_store_stream_rate_bytes_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_rate_store_stream_rate_bytes_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_rate_store_stream_shards_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
loki_rate_store_stream_shards_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_rate_store_stream_shards_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_rate_store_streams	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of unique streams reported by all ingesters. Sharded streams are combined
loki_request_duration_seconds_bucket	Unknown	`ins`, `instance`, `method`, `ip`, `le`, `ws`, `route`, `job`, `cls`, `status_code`	N/A
loki_request_duration_seconds_count	Unknown	`ins`, `instance`, `method`, `ip`, `ws`, `route`, `job`, `cls`, `status_code`	N/A
loki_request_duration_seconds_sum	Unknown	`ins`, `instance`, `method`, `ip`, `ws`, `route`, `job`, `cls`, `status_code`	N/A
loki_request_message_bytes_bucket	Unknown	`ins`, `instance`, `method`, `ip`, `le`, `route`, `job`, `cls`	N/A
loki_request_message_bytes_count	Unknown	`ins`, `instance`, `method`, `ip`, `route`, `job`, `cls`	N/A
loki_request_message_bytes_sum	Unknown	`ins`, `instance`, `method`, `ip`, `route`, `job`, `cls`	N/A
loki_response_message_bytes_bucket	Unknown	`ins`, `instance`, `method`, `ip`, `le`, `route`, `job`, `cls`	N/A
loki_response_message_bytes_count	Unknown	`ins`, `instance`, `method`, `ip`, `route`, `job`, `cls`	N/A
loki_response_message_bytes_sum	Unknown	`ins`, `instance`, `method`, `ip`, `route`, `job`, `cls`	N/A
loki_results_cache_version_comparisons_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
loki_store_chunks_downloaded_total	Unknown	`ins`, `instance`, `ip`, `status`, `job`, `cls`	N/A
loki_store_chunks_per_batch_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `status`, `job`, `cls`	N/A
loki_store_chunks_per_batch_count	Unknown	`ins`, `instance`, `ip`, `status`, `job`, `cls`	N/A
loki_store_chunks_per_batch_sum	Unknown	`ins`, `instance`, `ip`, `status`, `job`, `cls`	N/A
loki_store_series_total	Unknown	`ins`, `instance`, `ip`, `status`, `job`, `cls`	N/A
loki_stream_sharding_count	unknown	`ins`, `instance`, `ip`, `job`, `cls`	Total number of times the distributor has sharded streams
loki_tcp_connections	gauge	`ins`, `instance`, `ip`, `protocol`, `job`, `cls`	Current number of accepted TCP connections.
loki_tcp_connections_limit	gauge	`ins`, `instance`, `ip`, `protocol`, `job`, `cls`	The max number of TCP connections that can be accepted (0 means no limit).
net_conntrack_dialer_conn_attempted_total	counter	`ins`, `instance`, `ip`, `dialer_name`, `job`, `cls`	Total number of connections attempted by the dialer of a given name.
net_conntrack_dialer_conn_closed_total	counter	`ins`, `instance`, `ip`, `dialer_name`, `job`, `cls`	Total number of connections closed which originated from the dialer of a given name.
net_conntrack_dialer_conn_established_total	counter	`ins`, `instance`, `ip`, `dialer_name`, `job`, `cls`	Total number of connections successfully established by the dialer of a given name.
net_conntrack_dialer_conn_failed_total	counter	`ins`, `instance`, `ip`, `dialer_name`, `reason`, `job`, `cls`	Total number of connections that the dialer of a given name failed to establish.
net_conntrack_listener_conn_accepted_total	counter	`ins`, `instance`, `ip`, `listener_name`, `job`, `cls`	Total number of connections opened to the listener of a given name.
net_conntrack_listener_conn_closed_total	counter	`ins`, `instance`, `ip`, `listener_name`, `job`, `cls`	Total number of connections closed that were made to the listener of a given name.
nginx_connections_accepted	counter	`ins`, `instance`, `ip`, `job`, `cls`	Accepted client connections
nginx_connections_active	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Active client connections
nginx_connections_handled	counter	`ins`, `instance`, `ip`, `job`, `cls`	Handled client connections
nginx_connections_reading	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Connections where NGINX is reading the request header
nginx_connections_waiting	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Idle client connections
nginx_connections_writing	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Connections where NGINX is writing the response back to the client
nginx_exporter_build_info	gauge	`revision`, `version`, `ins`, `instance`, `ip`, `tags`, `goarch`, `goversion`, `job`, `cls`, `branch`, `goos`	A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which nginx_exporter was built, and the goos and goarch for the build.
nginx_http_requests_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total http requests
nginx_up	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Status of the last metric scrape
plugins_active_instances	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of active plugin instances
plugins_datasource_instances_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
process_cpu_seconds_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total user and system CPU time spent in seconds.
process_max_fds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Maximum number of open file descriptors.
process_open_fds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of open file descriptors.
process_resident_memory_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Resident memory size in bytes.
process_start_time_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Virtual memory size in bytes.
process_virtual_memory_max_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Maximum amount of virtual memory available in bytes.
prometheus_api_remote_read_queries	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The current number of remote read queries being executed or waiting.
prometheus_build_info	gauge	`revision`, `version`, `ins`, `instance`, `ip`, `tags`, `goarch`, `goversion`, `job`, `cls`, `branch`, `goos`	A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which prometheus was built, and the goos and goarch for the build.
prometheus_config_last_reload_success_timestamp_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Timestamp of the last successful configuration reload.
prometheus_config_last_reload_successful	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Whether the last configuration reload attempt was successful.
prometheus_engine_queries	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The current number of queries being executed or waiting.
prometheus_engine_queries_concurrent_max	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The max number of concurrent queries.
prometheus_engine_query_duration_seconds	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`, `slice`	Query timings
prometheus_engine_query_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `slice`	N/A
prometheus_engine_query_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `slice`	N/A
prometheus_engine_query_log_enabled	gauge	`ins`, `instance`, `ip`, `job`, `cls`	State of the query log.
prometheus_engine_query_log_failures_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The number of query log failures.
prometheus_engine_query_samples_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The total number of samples loaded by all queries.
prometheus_http_request_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`, `handler`	N/A
prometheus_http_request_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `handler`	N/A
prometheus_http_request_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `handler`	N/A
prometheus_http_requests_total	counter	`ins`, `instance`, `ip`, `job`, `cls`, `code`, `handler`	Counter of HTTP requests.
prometheus_http_response_size_bytes_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`, `handler`	N/A
prometheus_http_response_size_bytes_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `handler`	N/A
prometheus_http_response_size_bytes_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`, `handler`	N/A
prometheus_notifications_alertmanagers_discovered	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of alertmanagers discovered and active.
prometheus_notifications_dropped_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of alerts dropped due to errors when sending to Alertmanager.
prometheus_notifications_errors_total	counter	`ins`, `instance`, `ip`, `alertmanager`, `job`, `cls`	Total number of errors sending alert notifications.
prometheus_notifications_latency_seconds	summary	`ins`, `instance`, `ip`, `alertmanager`, `job`, `cls`, `quantile`	Latency quantiles for sending alert notifications.
prometheus_notifications_latency_seconds_count	Unknown	`ins`, `instance`, `ip`, `alertmanager`, `job`, `cls`	N/A
prometheus_notifications_latency_seconds_sum	Unknown	`ins`, `instance`, `ip`, `alertmanager`, `job`, `cls`	N/A
prometheus_notifications_queue_capacity	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The capacity of the alert notifications queue.
prometheus_notifications_queue_length	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of alert notifications in the queue.
prometheus_notifications_sent_total	counter	`ins`, `instance`, `ip`, `alertmanager`, `job`, `cls`	Total number of alerts sent.
prometheus_ready	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Whether Prometheus startup was fully completed and the server is ready for normal operation.
prometheus_remote_storage_exemplars_in_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Exemplars in to remote storage, compare to exemplars out for queue managers.
prometheus_remote_storage_highest_timestamp_in_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Highest timestamp that has come into the remote storage via the Appender interface, in seconds since epoch.
prometheus_remote_storage_histograms_in_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	HistogramSamples in to remote storage, compare to histograms out for queue managers.
prometheus_remote_storage_samples_in_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Samples in to remote storage, compare to samples out for queue managers.
prometheus_remote_storage_string_interner_zero_reference_releases_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The number of times release has been called for strings that are not interned.
prometheus_rule_evaluation_duration_seconds	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	The duration for a rule to execute.
prometheus_rule_evaluation_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_rule_evaluation_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_rule_evaluation_failures_total	counter	`ins`, `instance`, `ip`, `job`, `cls`, `rule_group`	The total number of rule evaluation failures.
prometheus_rule_evaluations_total	counter	`ins`, `instance`, `ip`, `job`, `cls`, `rule_group`	The total number of rule evaluations.
prometheus_rule_group_duration_seconds	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	The duration of rule group evaluations.
prometheus_rule_group_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_rule_group_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_rule_group_interval_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`, `rule_group`	The interval of a rule group.
prometheus_rule_group_iterations_missed_total	counter	`ins`, `instance`, `ip`, `job`, `cls`, `rule_group`	The total number of rule group evaluations missed due to slow rule group evaluation.
prometheus_rule_group_iterations_total	counter	`ins`, `instance`, `ip`, `job`, `cls`, `rule_group`	The total number of scheduled rule group evaluations, whether executed or missed.
prometheus_rule_group_last_duration_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`, `rule_group`	The duration of the last rule group evaluation.
prometheus_rule_group_last_evaluation_samples	gauge	`ins`, `instance`, `ip`, `job`, `cls`, `rule_group`	The number of samples returned during the last rule group evaluation.
prometheus_rule_group_last_evaluation_timestamp_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`, `rule_group`	The timestamp of the last rule group evaluation in seconds.
prometheus_rule_group_rules	gauge	`ins`, `instance`, `ip`, `job`, `cls`, `rule_group`	The number of rules.
prometheus_sd_azure_cache_hit_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Number of cache hits during refresh.
prometheus_sd_azure_failures_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Number of Azure service discovery refresh failures.
prometheus_sd_consul_rpc_duration_seconds	summary	`endpoint`, `ins`, `instance`, `ip`, `job`, `cls`, `call`, `quantile`	The duration of a Consul RPC call in seconds.
prometheus_sd_consul_rpc_duration_seconds_count	Unknown	`endpoint`, `ins`, `instance`, `ip`, `job`, `cls`, `call`	N/A
prometheus_sd_consul_rpc_duration_seconds_sum	Unknown	`endpoint`, `ins`, `instance`, `ip`, `job`, `cls`, `call`	N/A
prometheus_sd_consul_rpc_failures_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The number of Consul RPC call failures.
prometheus_sd_discovered_targets	gauge	`ins`, `instance`, `ip`, `config`, `job`, `cls`	Current number of discovered targets.
prometheus_sd_dns_lookup_failures_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The number of DNS-SD lookup failures.
prometheus_sd_dns_lookups_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The number of DNS-SD lookups.
prometheus_sd_failed_configs	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Current number of service discovery configurations that failed to load.
prometheus_sd_file_mtime_seconds	gauge	`ins`, `instance`, `ip`, `filename`, `job`, `cls`	Timestamp (mtime) of files read by FileSD. Timestamp is set at read time.
prometheus_sd_file_read_errors_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The number of File-SD read errors.
prometheus_sd_file_scan_duration_seconds	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	The duration of the File-SD scan in seconds.
prometheus_sd_file_scan_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_sd_file_scan_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_sd_file_watcher_errors_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The number of File-SD errors caused by filesystem watch failures.
prometheus_sd_http_failures_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Number of HTTP service discovery refresh failures.
prometheus_sd_kubernetes_events_total	counter	`event`, `ins`, `instance`, `role`, `ip`, `job`, `cls`	The number of Kubernetes events handled.
prometheus_sd_kuma_fetch_duration_seconds	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	The duration of a Kuma MADS fetch call.
prometheus_sd_kuma_fetch_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_sd_kuma_fetch_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_sd_kuma_fetch_failures_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The number of Kuma MADS fetch call failures.
prometheus_sd_kuma_fetch_skipped_updates_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The number of Kuma MADS fetch calls that result in no updates to the targets.
prometheus_sd_linode_failures_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Number of Linode service discovery refresh failures.
prometheus_sd_nomad_failures_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Number of Nomad service discovery refresh failures.
prometheus_sd_received_updates_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of update events received from the SD providers.
prometheus_sd_updates_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of update events sent to the SD consumers.
prometheus_target_interval_length_seconds	summary	`ins`, `instance`, `interval`, `ip`, `job`, `cls`, `quantile`	Actual intervals between scrapes.
prometheus_target_interval_length_seconds_count	Unknown	`ins`, `instance`, `interval`, `ip`, `job`, `cls`	N/A
prometheus_target_interval_length_seconds_sum	Unknown	`ins`, `instance`, `interval`, `ip`, `job`, `cls`	N/A
prometheus_target_metadata_cache_bytes	gauge	`ins`, `instance`, `ip`, `scrape_job`, `job`, `cls`	The number of bytes that are currently used for storing metric metadata in the cache
prometheus_target_metadata_cache_entries	gauge	`ins`, `instance`, `ip`, `scrape_job`, `job`, `cls`	Total number of metric metadata entries in the cache
prometheus_target_scrape_pool_exceeded_label_limits_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of times scrape pools hit the label limits, during sync or config reload.
prometheus_target_scrape_pool_exceeded_target_limit_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of times scrape pools hit the target limit, during sync or config reload.
prometheus_target_scrape_pool_reloads_failed_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of failed scrape pool reloads.
prometheus_target_scrape_pool_reloads_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of scrape pool reloads.
prometheus_target_scrape_pool_sync_total	counter	`ins`, `instance`, `ip`, `scrape_job`, `job`, `cls`	Total number of syncs that were executed on a scrape pool.
prometheus_target_scrape_pool_target_limit	gauge	`ins`, `instance`, `ip`, `scrape_job`, `job`, `cls`	Maximum number of targets allowed in this scrape pool.
prometheus_target_scrape_pool_targets	gauge	`ins`, `instance`, `ip`, `scrape_job`, `job`, `cls`	Current number of targets in this scrape pool.
prometheus_target_scrape_pools_failed_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of scrape pool creations that failed.
prometheus_target_scrape_pools_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of scrape pool creation attempts.
prometheus_target_scrapes_cache_flush_forced_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	How many times a scrape cache was flushed due to getting big while scrapes are failing.
prometheus_target_scrapes_exceeded_body_size_limit_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of scrapes that hit the body size limit
prometheus_target_scrapes_exceeded_native_histogram_bucket_limit_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of scrapes that hit the native histogram bucket limit and were rejected.
prometheus_target_scrapes_exceeded_sample_limit_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of scrapes that hit the sample limit and were rejected.
prometheus_target_scrapes_exemplar_out_of_order_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of exemplar rejected due to not being out of the expected order.
prometheus_target_scrapes_sample_duplicate_timestamp_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of samples rejected due to duplicate timestamps but different values.
prometheus_target_scrapes_sample_out_of_bounds_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of samples rejected due to timestamp falling outside of the time bounds.
prometheus_target_scrapes_sample_out_of_order_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of samples rejected due to not being out of the expected order.
prometheus_target_sync_failed_total	counter	`ins`, `instance`, `ip`, `scrape_job`, `job`, `cls`	Total number of target sync failures.
prometheus_target_sync_length_seconds	summary	`ins`, `instance`, `ip`, `scrape_job`, `job`, `cls`, `quantile`	Actual interval to sync the scrape pool.
prometheus_target_sync_length_seconds_count	Unknown	`ins`, `instance`, `ip`, `scrape_job`, `job`, `cls`	N/A
prometheus_target_sync_length_seconds_sum	Unknown	`ins`, `instance`, `ip`, `scrape_job`, `job`, `cls`	N/A
prometheus_template_text_expansion_failures_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The total number of template text expansion failures.
prometheus_template_text_expansions_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The total number of template text expansions.
prometheus_treecache_watcher_goroutines	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The current number of watcher goroutines.
prometheus_treecache_zookeeper_failures_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The total number of ZooKeeper failures.
prometheus_tsdb_blocks_loaded	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of currently loaded data blocks
prometheus_tsdb_checkpoint_creations_failed_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of checkpoint creations that failed.
prometheus_tsdb_checkpoint_creations_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of checkpoint creations attempted.
prometheus_tsdb_checkpoint_deletions_failed_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of checkpoint deletions that failed.
prometheus_tsdb_checkpoint_deletions_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of checkpoint deletions attempted.
prometheus_tsdb_clean_start	gauge	`ins`, `instance`, `ip`, `job`, `cls`	-1: lockfile is disabled. 0: a lockfile from a previous execution was replaced. 1: lockfile creation was clean
prometheus_tsdb_compaction_chunk_range_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
prometheus_tsdb_compaction_chunk_range_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_compaction_chunk_range_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_compaction_chunk_samples_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
prometheus_tsdb_compaction_chunk_samples_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_compaction_chunk_samples_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_compaction_chunk_size_bytes_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
prometheus_tsdb_compaction_chunk_size_bytes_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_compaction_chunk_size_bytes_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_compaction_duration_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
prometheus_tsdb_compaction_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_compaction_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_compaction_populating_block	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Set to 1 when a block is currently being written to the disk.
prometheus_tsdb_compactions_failed_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of compactions that failed for the partition.
prometheus_tsdb_compactions_skipped_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of skipped compactions due to disabled auto compaction.
prometheus_tsdb_compactions_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of compactions that were executed for the partition.
prometheus_tsdb_compactions_triggered_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of triggered compactions for the partition.
prometheus_tsdb_data_replay_duration_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Time taken to replay the data on disk.
prometheus_tsdb_exemplar_exemplars_appended_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of appended exemplars.
prometheus_tsdb_exemplar_exemplars_in_storage	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of exemplars currently in circular storage.
prometheus_tsdb_exemplar_last_exemplars_timestamp_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The timestamp of the oldest exemplar stored in circular storage. Useful to check for what time range the current exemplar buffer limit allows. This usually means the last timestamp for all exemplars for a typical setup. This is not true though if one of the series timestamps is in the future compared to the rest of the series.
prometheus_tsdb_exemplar_max_exemplars	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Total number of exemplars the exemplar storage can store, resizeable.
prometheus_tsdb_exemplar_out_of_order_exemplars_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of out of order exemplar ingestion failed attempts.
prometheus_tsdb_exemplar_series_with_exemplars_in_storage	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of series with exemplars currently in circular storage.
prometheus_tsdb_head_active_appenders	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Number of currently active appender transactions
prometheus_tsdb_head_chunks	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Total number of chunks in the head block.
prometheus_tsdb_head_chunks_created_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of chunks created in the head
prometheus_tsdb_head_chunks_removed_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of chunks removed in the head
prometheus_tsdb_head_chunks_storage_size_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Size of the chunks_head directory.
prometheus_tsdb_head_gc_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_head_gc_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_head_max_time	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Maximum timestamp of the head block. The unit is decided by the library consumer.
prometheus_tsdb_head_max_time_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Maximum timestamp of the head block.
prometheus_tsdb_head_min_time	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Minimum time bound of the head block. The unit is decided by the library consumer.
prometheus_tsdb_head_min_time_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Minimum time bound of the head block.
prometheus_tsdb_head_out_of_order_samples_appended_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of appended out of order samples.
prometheus_tsdb_head_samples_appended_total	counter	`ins`, `instance`, `ip`, `type`, `job`, `cls`	Total number of appended samples.
prometheus_tsdb_head_series	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Total number of series in the head block.
prometheus_tsdb_head_series_created_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of series created in the head
prometheus_tsdb_head_series_not_found_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of requests for series that were not found.
prometheus_tsdb_head_series_removed_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of series removed in the head
prometheus_tsdb_head_truncations_failed_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of head truncations that failed.
prometheus_tsdb_head_truncations_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of head truncations attempted.
prometheus_tsdb_isolation_high_watermark	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The highest TSDB append ID that has been given out.
prometheus_tsdb_isolation_low_watermark	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The lowest TSDB append ID that is still referenced.
prometheus_tsdb_lowest_timestamp	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Lowest timestamp value stored in the database. The unit is decided by the library consumer.
prometheus_tsdb_lowest_timestamp_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Lowest timestamp value stored in the database.
prometheus_tsdb_mmap_chunk_corruptions_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of memory-mapped chunk corruptions.
prometheus_tsdb_mmap_chunks_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of chunks that were memory-mapped.
prometheus_tsdb_out_of_bound_samples_total	counter	`ins`, `instance`, `ip`, `type`, `job`, `cls`	Total number of out of bound samples ingestion failed attempts with out of order support disabled.
prometheus_tsdb_out_of_order_samples_total	counter	`ins`, `instance`, `ip`, `type`, `job`, `cls`	Total number of out of order samples ingestion failed attempts due to out of order being disabled.
prometheus_tsdb_reloads_failures_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Number of times the database failed to reloadBlocks block data from disk.
prometheus_tsdb_reloads_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Number of times the database reloaded block data from disk.
prometheus_tsdb_retention_limit_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Max number of bytes to be retained in the tsdb blocks, configured 0 means disabled
prometheus_tsdb_retention_limit_seconds	gauge	`ins`, `instance`, `ip`, `job`, `cls`	How long to retain samples in storage.
prometheus_tsdb_size_retentions_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The number of times that blocks were deleted because the maximum number of bytes was exceeded.
prometheus_tsdb_snapshot_replay_error_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number snapshot replays that failed.
prometheus_tsdb_storage_blocks_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of bytes that are currently used for local storage by all blocks.
prometheus_tsdb_symbol_table_size_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Size of symbol table in memory for loaded blocks
prometheus_tsdb_time_retentions_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	The number of times that blocks were deleted because the maximum time limit was exceeded.
prometheus_tsdb_tombstone_cleanup_seconds_bucket	Unknown	`ins`, `instance`, `ip`, `le`, `job`, `cls`	N/A
prometheus_tsdb_tombstone_cleanup_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_tombstone_cleanup_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_too_old_samples_total	counter	`ins`, `instance`, `ip`, `type`, `job`, `cls`	Total number of out of order samples ingestion failed attempts with out of support enabled, but sample outside of time window.
prometheus_tsdb_vertical_compactions_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of compactions done on overlapping blocks.
prometheus_tsdb_wal_completed_pages_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of completed pages.
prometheus_tsdb_wal_corruptions_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of WAL corruptions.
prometheus_tsdb_wal_fsync_duration_seconds	summary	`ins`, `instance`, `ip`, `job`, `cls`, `quantile`	Duration of write log fsync.
prometheus_tsdb_wal_fsync_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_wal_fsync_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_wal_page_flushes_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of page flushes.
prometheus_tsdb_wal_segment_current	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Write log segment index that TSDB is currently writing to.
prometheus_tsdb_wal_storage_size_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Size of the write log directory.
prometheus_tsdb_wal_truncate_duration_seconds_count	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_wal_truncate_duration_seconds_sum	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
prometheus_tsdb_wal_truncations_failed_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of write log truncations that failed.
prometheus_tsdb_wal_truncations_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of write log truncations attempted.
prometheus_tsdb_wal_writes_failed_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of write log writes that failed.
prometheus_web_federation_errors_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of errors that occurred while sending federation responses.
prometheus_web_federation_warnings_total	counter	`ins`, `instance`, `ip`, `job`, `cls`	Total number of warnings that occurred while sending federation responses.
promhttp_metric_handler_requests_in_flight	gauge	`ins`, `instance`, `ip`, `job`, `cls`	Current number of scrapes being served.
promhttp_metric_handler_requests_total	counter	`ins`, `instance`, `ip`, `job`, `cls`, `code`	Total number of scrapes by HTTP status code.
pushgateway_build_info	gauge	`revision`, `version`, `ins`, `instance`, `ip`, `tags`, `goarch`, `goversion`, `job`, `cls`, `branch`, `goos`	A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which pushgateway was built, and the goos and goarch for the build.
pushgateway_http_requests_total	counter	`ins`, `instance`, `method`, `ip`, `job`, `cls`, `code`, `handler`	Total HTTP requests processed by the Pushgateway, excluding scrapes.
querier_cache_added_new_total	Unknown	`ins`, `instance`, `ip`, `job`, `cache`, `cls`	N/A
querier_cache_added_total	Unknown	`ins`, `instance`, `ip`, `job`, `cache`, `cls`	N/A
querier_cache_entries	gauge	`ins`, `instance`, `ip`, `job`, `cache`, `cls`	The total number of entries
querier_cache_evicted_total	Unknown	`ins`, `instance`, `ip`, `job`, `reason`, `cache`, `cls`	N/A
querier_cache_gets_total	Unknown	`ins`, `instance`, `ip`, `job`, `cache`, `cls`	N/A
querier_cache_memory_bytes	gauge	`ins`, `instance`, `ip`, `job`, `cache`, `cls`	The current cache size in bytes
querier_cache_misses_total	Unknown	`ins`, `instance`, `ip`, `job`, `cache`, `cls`	N/A
querier_cache_stale_gets_total	Unknown	`ins`, `instance`, `ip`, `job`, `cache`, `cls`	N/A
ring_member_heartbeats_total	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
ring_member_tokens_owned	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of tokens owned in the ring.
ring_member_tokens_to_own	gauge	`ins`, `instance`, `ip`, `job`, `cls`	The number of tokens to own in the ring.
scrape_duration_seconds	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
scrape_samples_post_metric_relabeling	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
scrape_samples_scraped	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
scrape_series_added	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A
up	Unknown	`ins`, `instance`, `ip`, `job`, `cls`	N/A

PING Metrics

The PING job exposes 54 metrics, provided by blackbox_exporter.

Metric Name	Type	Labels	Description
agent_up	Unknown	`ins`, `ip`, `job`, `instance`, `cls`	N/A
probe_dns_lookup_time_seconds	gauge	`ins`, `ip`, `job`, `instance`, `cls`	Returns the time taken for probe dns lookup in seconds
probe_duration_seconds	gauge	`ins`, `ip`, `job`, `instance`, `cls`	Returns how long the probe took to complete in seconds
probe_icmp_duration_seconds	gauge	`ins`, `ip`, `job`, `phase`, `instance`, `cls`	Duration of icmp request by phase
probe_icmp_reply_hop_limit	gauge	`ins`, `ip`, `job`, `instance`, `cls`	Replied packet hop limit (TTL for ipv4)
probe_ip_addr_hash	gauge	`ins`, `ip`, `job`, `instance`, `cls`	Specifies the hash of IP address. It’s useful to detect if the IP address changes.
probe_ip_protocol	gauge	`ins`, `ip`, `job`, `instance`, `cls`	Specifies whether probe ip protocol is IP4 or IP6
probe_success	gauge	`ins`, `ip`, `job`, `instance`, `cls`	Displays whether or not the probe was a success
scrape_duration_seconds	Unknown	`ins`, `ip`, `job`, `instance`, `cls`	N/A
scrape_samples_post_metric_relabeling	Unknown	`ins`, `ip`, `job`, `instance`, `cls`	N/A
scrape_samples_scraped	Unknown	`ins`, `ip`, `job`, `instance`, `cls`	N/A
scrape_series_added	Unknown	`ins`, `ip`, `job`, `instance`, `cls`	N/A
up	Unknown	`ins`, `ip`, `job`, `instance`, `cls`	N/A
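
A probe result such as probe_success is easiest to act on once it is reduced to a list of failing targets. The following Python sketch parses a piece of text in the Prometheus exposition format and returns the label sets whose probe_success value is 0. The sample payload and its label values (pg-test-1, 10.10.10.11, and so on) are made up for illustration; only the metric names and label keys are taken from the table above.

```python
def parse_sample(line):
    """Split one exposition line into (metric name, label text, value)."""
    if "}" in line:
        # Everything after the last "}" is "value [timestamp]".
        head, _, tail = line.rpartition("}")
        name, _, labels = head.partition("{")
    else:
        name, _, tail = line.partition(" ")
        labels = ""
    value = float(tail.split()[0])  # ignore an optional trailing timestamp
    return name.strip(), labels, value


def failed_probes(exposition):
    """Return the label text of every probe_success series whose value is 0."""
    failed = []
    for raw in exposition.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, labels, value = parse_sample(line)
        if name == "probe_success" and value == 0:
            failed.append(labels)
    return failed


# Hypothetical sample data, not real output from a Pigsty deployment.
SAMPLE = '''
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success{cls="pg-test",ins="pg-test-1",instance="10.10.10.11",ip="10.10.10.11",job="ping"} 1
probe_success{cls="pg-test",ins="pg-test-2",instance="10.10.10.12",ip="10.10.10.12",job="ping"} 0
probe_duration_seconds{cls="pg-test",ins="pg-test-2",instance="10.10.10.12",ip="10.10.10.12",job="ping"} 0.004
'''
```

Calling failed_probes(SAMPLE) yields the label set of the second target only, since it is the only probe_success series with value 0.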

PUSH Metrics

PushGateway exposes 44 metrics.

Metric Name	Type	Labels	Description
agent_up	Unknown	`job`, `cls`, `instance`, `ins`, `ip`	N/A
go_gc_duration_seconds	summary	`job`, `cls`, `instance`, `ins`, `quantile`, `ip`	A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count	Unknown	`job`, `cls`, `instance`, `ins`, `ip`	N/A
go_gc_duration_seconds_sum	Unknown	`job`, `cls`, `instance`, `ins`, `ip`	N/A
go_goroutines	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of goroutines that currently exist.
go_info	gauge	`job`, `cls`, `instance`, `ins`, `ip`, `version`	Information about the Go environment.
go_memstats_alloc_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total	counter	`job`, `cls`, `instance`, `ins`, `ip`	Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total	counter	`job`, `cls`, `instance`, `ins`, `ip`	Total number of frees.
go_memstats_gc_sys_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of heap bytes that are in use.
go_memstats_heap_objects	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of allocated objects.
go_memstats_heap_released_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of heap bytes released to OS.
go_memstats_heap_sys_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total	counter	`job`, `cls`, `instance`, `ins`, `ip`	Total number of pointer lookups.
go_memstats_mallocs_total	counter	`job`, `cls`, `instance`, `ins`, `ip`	Total number of mallocs.
go_memstats_mcache_inuse_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of bytes obtained from system.
go_threads	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of OS threads created.
process_cpu_seconds_total	counter	`job`, `cls`, `instance`, `ins`, `ip`	Total user and system CPU time spent in seconds.
process_max_fds	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Maximum number of open file descriptors.
process_open_fds	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Number of open file descriptors.
process_resident_memory_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Resident memory size in bytes.
process_start_time_seconds	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Virtual memory size in bytes.
process_virtual_memory_max_bytes	gauge	`job`, `cls`, `instance`, `ins`, `ip`	Maximum amount of virtual memory available in bytes.
pushgateway_build_info	gauge	`job`, `goversion`, `cls`, `branch`, `instance`, `tags`, `revision`, `goarch`, `ins`, `ip`, `version`, `goos`	A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which pushgateway was built, and the goos and goarch for the build.
pushgateway_http_requests_total	counter	`job`, `cls`, `method`, `code`, `handler`, `instance`, `ins`, `ip`	Total HTTP requests processed by the Pushgateway, excluding scrapes.
scrape_duration_seconds	Unknown	`job`, `cls`, `instance`, `ins`, `ip`	N/A
scrape_samples_post_metric_relabeling	Unknown	`job`, `cls`, `instance`, `ins`, `ip`	N/A
scrape_samples_scraped	Unknown	`job`, `cls`, `instance`, `ins`, `ip`	N/A
scrape_series_added	Unknown	`job`, `cls`, `instance`, `ins`, `ip`	N/A
up	Unknown	`job`, `cls`, `instance`, `ins`, `ip`	N/A

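8.8 - FAQ

Frequently asked questions about the Pigsty INFRA module.

Which components are included in the INFRA module?

Ansible for automation, deployment, and administration;
Nginx for exposing the various WebUI services and serving a local software repository;
Self-signed CA for SSL/TLS certificates;
Prometheus for collecting and storing monitoring metrics;
Grafana for monitoring and visualization;
Loki for collecting, storing, and querying logs;
AlertManager for alert aggregation;
Chronyd for NTP time synchronization;
DNSMasq for DNS registration and resolution;
PostgreSQL on the admin node as a CMDB (optional);
Docker for stateless applications and tools (optional).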

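How do I re-register monitoring targets with Prometheus?

If you accidentally deleted the Prometheus target directory (/etc/prometheus/target) on the infra nodes, you can register the monitoring targets with Prometheus again using the following commands:

./infra.yml -t register_prometheus  # register all infra targets with prometheus on the infra nodes
./node.yml  -t register_prometheus  # register all node  targets with prometheus on the infra nodes
./etcd.yml  -t register_prometheus  # register all etcd  targets with prometheus on the infra nodes
./minio.yml -t register_prometheus  # register all minio targets with prometheus on the infra nodes
./pgsql.yml -t register_prometheus  # register all pgsql targets with prometheus on the infra nodes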

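How do I re-register PostgreSQL data sources with Grafana?

PGSQL databases defined in pg_databases are registered as Grafana data sources by default (for use by the PGCAT application).

If you accidentally deleted the postgres data sources registered in Grafana, you can register them again using the following command:

# register all pgsql databases (defined in pg_databases) as grafana data sources
./pgsql.yml -t register_grafana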

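How do I re-register the nodes' HAProxy admin pages with Nginx?

If you accidentally deleted the registered haproxy proxy settings in /etc/nginx/conf.d/haproxy, you can restore them using the following command:

./node.yml -t register_nginx     # register the proxy settings for all haproxy admin pages with nginx on the infra nodes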

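How do I restore the domain name records registered in DNSMASQ?

PGSQL cluster and instance domain names are registered by default in /etc/hosts.d/<name> on the infra nodes. You can restore them using the following command:

./pgsql.yml -t pg_dns    # register pg DNS names with dnsmasq on the infra nodes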

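How do I expose a new upstream service through Nginx?

Although you can reach services directly via IP:Port, we still recommend consolidating the entry points: use domain names and access every service that has a web interface through the Nginx proxy. This funnels access through a single entry point, reduces the number of exposed ports, and makes access control and auditing easier.

To expose a new WebUI service through the Nginx portal, add its definition to the infra_portal parameter. For example, here is the infra portal configuration used by the official Pigsty demo, which exposes several additional services: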

infra_portal:
  home         : { domain: home.pigsty.cc }
  grafana      : { domain: demo.pigsty.cc ,endpoint: "${admin_ip}:3000" ,websocket: true }
  prometheus   : { domain: p.pigsty.cc ,endpoint: "${admin_ip}:9090" }
  alertmanager : { domain: a.pigsty.cc ,endpoint: "${admin_ip}:9093" }
  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }
  # newly added web portals
  minio        : { domain: sss.pigsty  ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }
  postgrest    : { domain: api.pigsty.cc  ,endpoint: "127.0.0.1:8884"   }
  pgadmin      : { domain: adm.pigsty.cc  ,endpoint: "127.0.0.1:8885"   }
  pgweb        : { domain: cli.pigsty.cc  ,endpoint: "127.0.0.1:8886"   }
  bytebase     : { domain: ddl.pigsty.cc  ,endpoint: "127.0.0.1:8887"   }
  gitea        : { domain: git.pigsty.cc  ,endpoint: "127.0.0.1:8889"   }
  wiki         : { domain: wiki.pigsty.cc ,endpoint: "127.0.0.1:9002"   }
  noco         : { domain: noco.pigsty.cc ,endpoint: "127.0.0.1:9003"   }
  supa         : { domain: supa.pigsty.cc ,endpoint: "127.0.0.1:8000", websocket: true }
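
Before regenerating the Nginx configuration, it can help to preview which domain maps to which endpoint. The Python sketch below mirrors a few infra_portal entries as a plain dict and substitutes the ${admin_ip} placeholder by simple string replacement. This only illustrates the shape of the data and is not how Pigsty itself renders its templates; the helper names resolve_portal and missing_domain and the address 10.10.10.10 are assumptions made for this example.

```python
def resolve_portal(portal, admin_ip):
    """Return {name: (domain, endpoint)} with ${admin_ip} substituted."""
    resolved = {}
    for name, spec in portal.items():
        endpoint = spec.get("endpoint")
        if endpoint is not None:
            endpoint = endpoint.replace("${admin_ip}", admin_ip)
        resolved[name] = (spec.get("domain"), endpoint)
    return resolved


def missing_domain(portal):
    """Names of entries that define an endpoint but no domain."""
    return [name for name, spec in portal.items()
            if "endpoint" in spec and "domain" not in spec]


# A subset of the portal definition above, rewritten as a Python dict.
portal = {
    "home":     {"domain": "home.pigsty.cc"},
    "grafana":  {"domain": "demo.pigsty.cc", "endpoint": "${admin_ip}:3000", "websocket": True},
    "blackbox": {"endpoint": "${admin_ip}:9115"},
    "pgadmin":  {"domain": "adm.pigsty.cc", "endpoint": "127.0.0.1:8885"},
}

# 10.10.10.10 is a hypothetical admin_ip used only for this preview.
preview = resolve_portal(portal, "10.10.10.10")
```

Entries reported by missing_domain, such as blackbox here, have an upstream endpoint but no domain name of their own.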

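After defining the Nginx upstream services, run the following commands to register the new services with Nginx.

./infra.yml -t nginx_config           # regenerate the Nginx configuration files
./infra.yml -t nginx_launch           # update and apply the Nginx configuration

# You can also reload the Nginx configuration manually with Ansible
ansible infra -b -a 'nginx -s reload'  # reload the Nginx configuration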

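If you want to access the services over HTTPS, you must delete files/pki/csr/pigsty.csr and files/pki/nginx/pigsty.{key,crt} to force the Nginx SSL/TLS certificate to be regenerated so that it covers the new upstream domains. If you prefer a certificate issued by a public certificate authority rather than one issued by the Pigsty self-signed CA, place it in the /etc/nginx/conf.d/cert/ directory and adjust the corresponding configuration file: /etc/nginx/conf.d/<name>.conf.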

How do I manually add upstream repo files to a node?

Pigsty ships a built-in wrapper script, bin/repo-add, which invokes the Ansible playbook node.yml to add repo files to the target nodes.

bin/repo-add <selector> [modules]
bin/repo-add 10.10.10.10           # add the node repo to node 10.10.10.10
bin/repo-add infra   node,infra    # add the node and infra repos to the infra group
bin/repo-add infra   node,local    # add the node repo and the local pigsty repo to the infra group
bin/repo-add pg-test node,pgsql    # add the node and pgsql repos to the pg-test group

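9 - Module: NODE

Configure target servers, bring host nodes under management, and tune them to the described state. This also covers the VIP, HAProxy, and monitoring components on the nodes.

9.1 - Core Concepts

Key concepts involved in node clusters.

A node is an abstraction of hardware resources. It can be a bare-metal machine, a virtual machine, a container, or a Kubernetes pod: anything that runs an operating system and can use CPU, memory, disk, and network resources qualifies.

Pigsty distinguishes several types of nodes, which differ mainly in the modules installed on them:

Common node: an ordinary node managed by Pigsty
Admin node: the node that issues management commands through Ansible
Infra node: a node with the INFRA module installed
PGSQL node: a node with the PGSQL module installed
Nodes with other modules installed...

In a single-node installation, the current node is treated as the admin node, an infra node, and a PGSQL node all at once; it is, of course, also a common node.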

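Common Node

You can use Pigsty to manage nodes and install modules on them. The node.yml playbook brings a node to the desired state. The following services are added to all nodes by default, except those marked as on demand:

Component	Port	Description	Status
Node Exporter	9100	Node metrics exporter	Enabled by default
HAProxy Admin	9101	HAProxy admin page	Enabled by default
Promtail	9080	Log collection agent	Enabled by default
Docker Daemon	9323	Container support	Enabled on demand
Keepalived	-	Manages the L2 VIP of the host cluster	Enabled on demand
Keepalived Exporter	9650	Monitors Keepalived status	Enabled on demand

Docker and Keepalived (together with its monitoring component, keepalived exporter) are optional add-ons for a node; both are disabled by default.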
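
One quick way to confirm that the default components are up on a node is to test whether their ports accept TCP connections. The Python sketch below does this with the standard socket module. The port numbers come from the table above; the function names is_open and probe_node and the dictionary keys are assumptions made for this example, and an open port only shows that something is listening, not that the component is healthy.

```python
import socket

# Default-enabled components and their ports, as listed in the table above.
NODE_PORTS = {
    "node_exporter": 9100,
    "haproxy_admin": 9101,
    "promtail": 9080,
}


def is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe_node(host, ports=NODE_PORTS, timeout=1.0):
    """Return {component: reachable} for every port in the mapping."""
    return {name: is_open(host, port, timeout) for name, port in ports.items()}
```

For example, probe_node("10.10.10.11") would return a dict with one boolean per component for that (hypothetical) node address.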

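Admin Node

A Pigsty deployment has exactly one admin node, specified by admin_ip. During the configuration step of a single-node installation, it is set to the primary IP address of that machine.

This node has ssh/sudo access to all other nodes. The security of the admin node is critical: make sure access to it is strictly controlled.

The admin node usually coincides with an infrastructure node (infra node). If there are multiple infra nodes, the admin node is usually the first of them, and the others serve as backups for the admin node.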

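Infra Node

A Pigsty deployment may have one or more infrastructure nodes (infra nodes); a large production environment may have two or three.

The infra group in the inventory specifies which nodes are infra nodes. These nodes have the INFRA module installed (DNS, Nginx, Prometheus, Grafana, and so on).

The admin node is usually the first node in the infra group; the other infra nodes can serve as standby admin nodes.

Component	Port	Domain	Description
Nginx	80	`h.pigsty`	Web service portal (also serves the yum/apt repo)
AlertManager	9093	`a.pigsty`	Alert aggregation and dispatch
Prometheus	9090	`p.pigsty`	Time-series database (collects and stores metrics)
Grafana	3000	`g.pigsty`	Visualization platform
Loki	3100	-	Log collection server
PushGateway	9091	-	Accepts metrics from one-off jobs
BlackboxExporter	9115	-	Blackbox monitoring probes
DNSMASQ	53	-	DNS server
Chronyd	123	-	NTP time server
PostgreSQL	5432	-	Pigsty CMDB and default database
Ansible	-	-	Runs playbooks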

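PGSQL Node

A node with the PGSQL module installed is called a PGSQL node. Nodes and PostgreSQL instances are deployed 1:1.

In that case, a PGSQL node can borrow its identity from the corresponding PostgreSQL instance; the node_id_from_pg parameter controls this behavior.

Component	Port	Description	Status
Postgres	5432	PostgreSQL database	Enabled by default
Pgbouncer	6432	Pgbouncer connection pool service	Enabled by default
Patroni	8008	Patroni high-availability component	Enabled by default
Haproxy Primary	5433	Primary via connection pool: read/write service	Enabled by default
Haproxy Replica	5434	Replica via connection pool: read-only service	Enabled by default
Haproxy Default	5436	Primary direct-connection service	Enabled by default
Haproxy Offline	5438	Offline direct connection: offline read service	Enabled by default
Haproxy `service`	543x	Custom PostgreSQL services	Customized on demand
Haproxy Admin	9101	Metrics and traffic management	Enabled by default
PG Exporter	9630	PG metrics exporter	Enabled by default
PGBouncer Exporter	9631	PGBouncer metrics exporter	Enabled by default
Node Exporter	9100	Node metrics exporter	Enabled by default
Promtail	9080	Collects database component and host logs	Enabled by default
vip-manager	-	Binds a VIP to the primary node	Enabled on demand
Docker Daemon	9323	Docker daemon	Enabled on demand
keepalived	-	Binds an L2 VIP to the whole cluster	Enabled on demand
Keepalived Exporter	9650	Keepalived metrics exporter	Enabled on demand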

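9.2 - Cluster Configuration

Choose the node deployment type that fits your scenario and provide reliable access.

Pigsty uses the IP address as the unique identifier of a node. This should be the internal IP address on which the database instance listens and serves requests.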

node-test:
  hosts:
    10.10.10.11: { nodename: node-test-1 }
    10.10.10.12: { nodename: node-test-2 }
    10.10.10.13: { nodename: node-test-3 }
  vars:
    node_cluster: node-test

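This IP address must be the one on which the database instance listens and serves requests, and it should not be a public IP address. That said, users do not have to connect to the database through this address: operating and managing the target node indirectly, through an SSH tunnel or a jump host, is also feasible. When identifying a database node, however, the primary IPv4 address remains the node's core identifier. This is important, and you should make sure it holds when writing the configuration.

The IP address is the inventory_hostname of the host in the inventory, which appears as the key in the <cluster>.hosts object. In addition, each node has two more identity parameters:

Name	Type	Level	Necessity	Description
`inventory_hostname`	`ip`	-	Required	Node IP address
`nodename`	`string`	I	Optional	Node name
`node_cluster`	`string`	C	Optional	Node cluster name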

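The nodename and node_cluster parameters are optional. If they are not provided, the node's existing hostname and the fixed value nodes are used as the respective defaults. In Pigsty's monitoring system, they serve as the node's instance identifier (ins) and cluster identifier (cls), respectively.

For PGSQL nodes, Pigsty deploys PostgreSQL and nodes 1:1 by default, with each instance having a node to itself. The node_id_from_pg parameter therefore lets a node borrow the identity parameters of its PostgreSQL instance (pg_cluster and pg_seq) for its ins and cls labels. Database and node metrics then carry the same labels, which makes cross-analysis easier.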

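#nodename:                # [INSTANCE] # node instance identity, uses the existing hostname if absent; optional, no default
node_cluster: nodes       # [CLUSTER] # node cluster identity, uses the default 'nodes' if absent; optional
nodename_overwrite: true          # overwrite the node's hostname with nodename?
nodename_exchange: false          # exchange nodename among the play hosts?
node_id_from_pg: true             # borrow the postgres identity as the node identity if applicable?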
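
The fallback rules for the node identity labels can be summarized in a few lines of Python. This is a minimal sketch of the rules described in this section, not Pigsty's actual implementation. Two details are assumptions: that a borrowed instance name takes the form <pg_cluster>-<pg_seq>, and that a borrowed PostgreSQL identity wins over an explicitly configured nodename. The function name node_identity is likewise hypothetical.

```python
def node_identity(hostname, nodename=None, node_cluster=None,
                  node_id_from_pg=True, pg_cluster=None, pg_seq=None):
    """Return the cls/ins labels a node would carry in the monitoring system."""
    # Borrow the PostgreSQL identity when allowed and when one is defined.
    # Assumption: the instance name is "<pg_cluster>-<pg_seq>".
    if node_id_from_pg and pg_cluster is not None and pg_seq is not None:
        return {"cls": pg_cluster, "ins": "{}-{}".format(pg_cluster, pg_seq)}
    # Otherwise fall back to the node parameters and their defaults:
    # the fixed value 'nodes' for the cluster, the hostname for the instance.
    return {"cls": node_cluster or "nodes", "ins": nodename or hostname}
```

With no parameters set, node_identity("host-1") falls back to the cluster nodes and the instance host-1; with pg_cluster="pg-test" and pg_seq=1 it returns the borrowed labels instead.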

您还可以为主机集群配置丰富的功能参数，例如，使用节点集群上的 HAProxy 对外提供负载均衡，暴露服务，或者为集群绑定一个 L2 VIP。

9.3 - Parameter List

The NODE module provides 65 configuration parameters for customizing node clusters.

Parameters

The NODE module has 10 parameter groups (the VIP group is optional), 65 parameters in total:

NODE_ID : node identity parameters
NODE_DNS : node domain names & DNS resolution
NODE_PACKAGE : node repositories & package installation
NODE_TUNE : node tuning and kernel feature switches
NODE_ADMIN : admin user and SSH credential management
NODE_TIME : time zone, NTP service, and scheduled tasks
NODE_VIP : optional L2 VIP for a node cluster
HAPROXY : expose services with HAProxy
NODE_EXPORTER : node monitoring and registration
PROMTAIL : Promtail log collection agent

Parameter List

Parameter	Group	Type	Level	Description
`nodename`	`NODE_ID`	`string`	I	node instance identity, uses the hostname if absent, optional
`node_cluster`	`NODE_ID`	`string`	C	node cluster identity, uses the default 'nodes' if absent, optional
`nodename_overwrite`	`NODE_ID`	`bool`	C	overwrite the node's hostname with nodename?
`nodename_exchange`	`NODE_ID`	`bool`	C	exchange nodename among play hosts?
`node_id_from_pg`	`NODE_ID`	`bool`	C	borrow postgres identity as node identity if applicable?
`node_write_etc_hosts`	`NODE_DNS`	`bool`	G/C/I	modify `/etc/hosts` on the target node?
`node_default_etc_hosts`	`NODE_DNS`	`string[]`	G	static DNS records in /etc/hosts
`node_etc_hosts`	`NODE_DNS`	`string[]`	C	extra static DNS records in /etc/hosts
`node_dns_method`	`NODE_DNS`	`enum`	C	how to handle existing DNS servers: add,none,overwrite
`node_dns_servers`	`NODE_DNS`	`string[]`	C	dynamic nameserver list in /etc/resolv.conf
`node_dns_options`	`NODE_DNS`	`string[]`	C	DNS resolver options in /etc/resolv.conf
`node_repo_modules`	`NODE_PACKAGE`	`enum`	C	which repo modules to enable on the node? `local` (the local repo) by default
`node_repo_remove`	`NODE_PACKAGE`	`bool`	C	remove existing repos on the node when configuring node repos?
`node_packages`	`NODE_PACKAGE`	`string[]`	C	packages to be installed on the current node
`node_default_packages`	`NODE_PACKAGE`	`string[]`	G	default packages to be installed on all nodes
`node_disable_firewall`	`NODE_TUNE`	`bool`	C	disable node firewall? `true` by default
`node_disable_selinux`	`NODE_TUNE`	`bool`	C	disable node selinux? `true` by default
`node_disable_numa`	`NODE_TUNE`	`bool`	C	disable node numa, reboot required
`node_disable_swap`	`NODE_TUNE`	`bool`	C	disable node swap, use with caution
`node_static_network`	`NODE_TUNE`	`bool`	C	preserve DNS resolver settings after reboot (static network), enabled by default
`node_disk_prefetch`	`NODE_TUNE`	`bool`	C	set up disk prefetch on HDD to improve performance
`node_kernel_modules`	`NODE_TUNE`	`string[]`	C	kernel modules to be enabled on this node
`node_hugepage_count`	`NODE_TUNE`	`int`	C	number of 2MB hugepages allocated on the node, takes precedence over the ratio
`node_hugepage_ratio`	`NODE_TUNE`	`float`	C	ratio of node memory allocated as hugepages, 0 (default) disables it
`node_overcommit_ratio`	`NODE_TUNE`	`float`	C	node memory overcommit ratio (50-100), 0 (default) disables it
`node_tune`	`NODE_TUNE`	`enum`	C	node tuned profile: none,oltp,olap,crit,tiny
`node_sysctl_params`	`NODE_TUNE`	`dict`	C	extra sysctl parameters in k:v format
`node_data`	`NODE_ADMIN`	`path`	C	node main data directory, `/data` by default
`node_admin_enabled`	`NODE_ADMIN`	`bool`	C	create an admin user on the target node?
`node_admin_uid`	`NODE_ADMIN`	`int`	C	uid and gid of the node admin user
`node_admin_username`	`NODE_ADMIN`	`username`	C	name of the node admin user, `dba` by default
`node_admin_ssh_exchange`	`NODE_ADMIN`	`bool`	C	exchange admin ssh keys among the node cluster?
`node_admin_pk_current`	`NODE_ADMIN`	`bool`	C	add the current user's ssh public key to the admin's authorized_keys?
`node_admin_pk_list`	`NODE_ADMIN`	`string[]`	C	ssh public keys to be added to the admin user
`node_aliases`	`NODE_ADMIN`	`dict`	C	command aliases to be added to the node
`node_timezone`	`NODE_TIME`	`string`	C	set the node time zone, empty string to skip
`node_ntp_enabled`	`NODE_TIME`	`bool`	C	enable the chronyd time sync service?
`node_ntp_servers`	`NODE_TIME`	`string[]`	C	NTP server list in /etc/chrony.conf
`node_crontab_overwrite`	`NODE_TIME`	`bool`	C	when writing /etc/crontab, append or overwrite everything?
`node_crontab`	`NODE_TIME`	`string[]`	C	crontab entries in /etc/crontab
`vip_enabled`	`NODE_VIP`	`bool`	C	enable an L2 vip on this node cluster?
`vip_address`	`NODE_VIP`	`ip`	C	node vip address in ipv4 format, required if vip is enabled
`vip_vrid`	`NODE_VIP`	`int`	C	required integer, 1-254, should be unique within the same VLAN
`vip_role`	`NODE_VIP`	`enum`	I	optional, master/backup, backup by default, used as the initial role
`vip_preempt`	`NODE_VIP`	`bool`	C/I	optional, true/false, false by default, enable vip preemption
`vip_interface`	`NODE_VIP`	`string`	C/I	node vip network interface to listen on, eth0 by default
`vip_dns_suffix`	`NODE_VIP`	`string`	C	node vip DNS name suffix, empty string by default
`vip_exporter_port`	`NODE_VIP`	`port`	C	keepalived exporter listen port, 9650 by default
`haproxy_enabled`	`HAPROXY`	`bool`	C	enable haproxy on this node?
`haproxy_clean`	`HAPROXY`	`bool`	G/C/A	clean up all existing haproxy config?
`haproxy_reload`	`HAPROXY`	`bool`	A	reload haproxy after config?
`haproxy_auth_enabled`	`HAPROXY`	`bool`	G	enable authentication for the haproxy admin page?
`haproxy_admin_username`	`HAPROXY`	`username`	G	haproxy admin username, `admin` by default
`haproxy_admin_password`	`HAPROXY`	`password`	G	haproxy admin password, `pigsty` by default
`haproxy_exporter_port`	`HAPROXY`	`port`	C	haproxy exporter port, 9101 by default
`haproxy_client_timeout`	`HAPROXY`	`interval`	C	haproxy client connection timeout, 24h by default
`haproxy_server_timeout`	`HAPROXY`	`interval`	C	haproxy server-side connection timeout, 24h by default
`haproxy_services`	`HAPROXY`	`service[]`	C	haproxy services to be exposed on the node
`node_exporter_enabled`	`NODE_EXPORTER`	`bool`	C	set up node_exporter on this node?
`node_exporter_port`	`NODE_EXPORTER`	`port`	C	node exporter listen port, 9100 by default
`node_exporter_options`	`NODE_EXPORTER`	`arg`	C	extra server options for node_exporter
`promtail_enabled`	`PROMTAIL`	`bool`	C	enable the promtail log collector?
`promtail_clean`	`PROMTAIL`	`bool`	G/A	purge the existing promtail status file during init?
`promtail_port`	`PROMTAIL`	`port`	C	promtail listen port, 9080 by default
`promtail_positions`	`PROMTAIL`	`path`	C	promtail position status file path

`NODE_ID`

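Each node has identity parameters, configured through the relevant parameters in <cluster>.hosts and <cluster>.vars.

Pigsty uses the IP address as the unique identifier of a database node. This IP address must be the one on which the database instance listens and serves requests, and it should not be a public IP address. That said, users do not have to connect to the database through this address: managing the target node indirectly through an SSH tunnel or a jump host also works. When identifying a database node, however, the primary IPv4 address remains the node's core identifier, so make sure it is set correctly in the configuration. The IP address is the inventory_hostname of the host in the inventory, that is, the key in the <cluster>.hosts object.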

node-test:
  hosts:
    10.10.10.11: { nodename: node-test-1 }
    10.10.10.12: { nodename: node-test-2 }
    10.10.10.13: { nodename: node-test-3 }
  vars:
    node_cluster: node-test

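In addition, nodes have two important identity parameters in the Pigsty monitoring system: nodename and node_cluster, which are used as the node's instance identity (ins) and cluster identity (cls), respectively.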

node_load1{cls="pg-meta", ins="pg-meta-1", ip="10.10.10.10", job="nodes"}
node_load1{cls="pg-test", ins="pg-test-1", ip="10.10.10.11", job="nodes"}
node_load1{cls="pg-test", ins="pg-test-2", ip="10.10.10.12", job="nodes"}
node_load1{cls="pg-test", ins="pg-test-3", ip="10.10.10.13", job="nodes"}

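In a default PostgreSQL deployment, Pigsty deploys instances 1:1 on dedicated nodes, so the node_id_from_pg parameter can borrow the database instance's identity parameters (pg_cluster and pg_seq) for the node's cls and ins labels.

Name	Type	Level	Necessity	Description
`inventory_hostname`	`ip`	-	Required	Node IP address
`nodename`	`string`	I	Optional	Node name
`node_cluster`	`string`	C	Optional	Node cluster name

#nodename:                # [INSTANCE] # node instance identity, uses the existing hostname if absent, optional, no default
node_cluster: nodes       # [CLUSTER] # node cluster identity, uses the default 'nodes' if absent, optional
nodename_overwrite: true          # overwrite the node's hostname with nodename?
nodename_exchange: false          # exchange nodename among play hosts?
node_id_from_pg: true             # borrow postgres identity as node identity if applicable?

`nodename`

Name: nodename, type: string, level: I

The identity parameter of the host node. If not set explicitly, the existing hostname is used as the node name. Although this is an identity parameter, it is optional because it has a sensible default.

If node_id_from_pg is enabled (the default) and nodename is not specified explicitly, nodename tries to use ${pg_cluster}-${pg_seq} as the instance identity. If the cluster has no PGSQL module defined, it falls back to the default, which is the node's HOSTNAME.

`node_cluster`

Name: node_cluster, type: string, level: C

This option explicitly assigns a cluster name to the node, and is usually only meaningful when defined at the node cluster level. Leaving it empty uses the fixed value nodes as the node cluster identity.

If node_id_from_pg is enabled (the default) and node_cluster is not specified explicitly, node_cluster tries to use ${pg_cluster} as the cluster identity. If the cluster has no PGSQL module defined, it falls back to the default value nodes.

`nodename_overwrite`

Name: nodename_overwrite, type: bool, level: C

Overwrite the hostname with nodename? The default is true: if you set a non-empty nodename, it is used as the HOSTNAME of the current host.

When nodename is empty and node_id_from_pg is true (the default), Pigsty tries to borrow the identity of the PostgreSQL instance defined 1:1 on the node as the node name, i.e. {{ pg_cluster }}-{{ pg_seq }}. If the PGSQL module is not installed on the node, it falls back to doing nothing.

Therefore, if you leave nodename empty and do not enable node_id_from_pg, Pigsty does not modify the existing hostname.

`nodename_exchange`

Name: nodename_exchange, type: bool, level: C

Exchange hostnames among play hosts? Default: false.

When enabled, the nodes running the node.yml playbook in the same batch exchange node names with each other and write them to /etc/hosts.

`node_id_from_pg`

Name: node_id_from_pg, type: bool, level: C

Borrow identity parameters from the PostgreSQL instance/cluster deployed 1:1 on the node? Default: true.

PostgreSQL instances and nodes are deployed 1:1 by default in Pigsty, so you can "borrow" identity parameters from the database instance. This parameter is enabled by default, which means that unless configured otherwise, the identity parameters of the host node cluster and instance default to those of the PostgreSQL cluster. This makes troubleshooting and monitoring data processing more convenient.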
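The fallback order for the node identity can be modeled as a small function. This is a minimal sketch of the rules described in this section, not Pigsty's actual Ansible implementation; the function name resolve_node_identity and its arguments are hypothetical. It assumes the borrowed cluster identity is pg_cluster, which matches the sample labels cls="pg-test", ins="pg-test-1" shown earlier.

```python
from typing import Optional, Tuple


def resolve_node_identity(
    hostname: str,
    nodename: Optional[str] = None,
    node_cluster: Optional[str] = None,
    node_id_from_pg: bool = True,
    pg_cluster: Optional[str] = None,
    pg_seq: Optional[int] = None,
) -> Tuple[str, str]:
    """Return the (ins, cls) labels following the documented fallback order."""
    # Borrowing is only possible when a PGSQL identity is defined on the node.
    has_pg = pg_cluster is not None and pg_seq is not None
    borrow = node_id_from_pg and has_pg

    # Instance identity: explicit nodename > borrowed PG identity > hostname
    if nodename:
        ins = nodename
    elif borrow:
        ins = f"{pg_cluster}-{pg_seq}"
    else:
        ins = hostname

    # Cluster identity: explicit node_cluster > borrowed pg_cluster > 'nodes'
    if node_cluster:
        cls = node_cluster
    elif borrow:
        cls = str(pg_cluster)
    else:
        cls = "nodes"
    return ins, cls
```

For example, resolve_node_identity("host1", pg_cluster="pg-test", pg_seq=1) yields the same ins/cls pair as the first pg-test sample metric above, while a node without PGSQL keeps its hostname and the cluster name nodes.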

`NODE_DNS`

Pigsty configures static DNS records and dynamic DNS servers for nodes.

If your node provider has already configured DNS servers for you, you can set node_dns_method to none to skip DNS setup.

node_write_etc_hosts: true        # modify `/etc/hosts` on target node?
node_default_etc_hosts:           # static dns records in `/etc/hosts`
  - "${admin_ip} h.pigsty a.pigsty p.pigsty g.pigsty"
node_etc_hosts: []                # extra static dns records in `/etc/hosts`
node_dns_method: add              # how to handle dns servers: add,none,overwrite
node_dns_servers: ['${admin_ip}'] # dynamic nameserver in `/etc/resolv.conf`
node_dns_options:                 # dns resolv options in `/etc/resolv.conf`
  - options single-request-reopen timeout:1

`node_write_etc_hosts`

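Name: node_write_etc_hosts, type: bool, level: G|C|I

Modify /etc/hosts on the target node? In container environments, for example, modifying this file is usually not allowed.

`node_default_etc_hosts`

Name: node_default_etc_hosts, type: string[], level: G

Static DNS records written to /etc/hosts on all nodes by default. The default value is:

["${admin_ip} h.pigsty a.pigsty p.pigsty g.pigsty"]

node_default_etc_hosts is an array. Each element is one DNS record in the format <ip> <name>, and you can list several domain names separated by spaces.

This parameter configures global static DNS records. To configure specific static DNS records for a single cluster or instance, use the node_etc_hosts parameter.

`node_etc_hosts`

Name: node_etc_hosts, type: string[], level: C

Extra static DNS records written to the node's /etc/hosts. Default: [] (empty array).

This parameter has the same form as node_default_etc_hosts but serves a different purpose: it is meant to be set at the cluster/instance level.

`node_dns_method`

Name: node_dns_method, type: enum, level: C

How to configure DNS servers? There are three options: add, none, overwrite. The default is add.

add: append the records in node_dns_servers to /etc/resolv.conf and keep the existing DNS servers. (default)
overwrite: overwrite /etc/resolv.conf with the records in node_dns_servers
none: skip DNS server configuration. If DNS servers are already configured in your environment, you can skip DNS setup entirely.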
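The three node_dns_method modes can be illustrated with a small function that computes the resulting nameserver list. This is a sketch of the behavior as described above, not Pigsty's implementation: the function name plan_nameservers is hypothetical, and placing new servers after the existing ones in add mode is an assumption based on the word "append".

```python
from typing import List


def plan_nameservers(existing: List[str], servers: List[str], method: str = "add") -> List[str]:
    """Return the nameserver list expected in /etc/resolv.conf for a given method."""
    if method == "none":
        # Leave the existing DNS configuration untouched.
        return list(existing)
    if method == "overwrite":
        # Replace whatever is there with node_dns_servers.
        return list(servers)
    if method == "add":
        # Keep existing servers and add the new ones, skipping duplicates.
        merged = list(existing)
        for server in servers:
            if server not in merged:
                merged.append(server)
        return merged
    raise ValueError(f"unknown node_dns_method: {method!r}")
```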

`node_dns_servers`

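Name: node_dns_servers, type: string[], level: C

The dynamic DNS server list in /etc/resolv.conf. Default: ["${admin_ip}"], i.e. the admin node is used as the primary DNS server.

`node_dns_options`

Name: node_dns_options, type: string[], level: C

DNS resolver options in /etc/resolv.conf. The default value is:

- "options single-request-reopen timeout:1"

If node_dns_method is set to add or overwrite, the records in this parameter are written to /etc/resolv.conf first. For the exact format, see the Linux documentation for /etc/resolv.conf.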

`NODE_PACKAGE`

Pigsty configures software repositories (Yum on EL, APT on Debian/Ubuntu) on managed nodes and installs packages.

node_repo_modules: local          # upstream repo to be added on node, local by default.
node_repo_remove: true            # remove existing repo on node?
node_packages: [ ]                # packages to be installed current nodes
#node_default_packages:           # default packages to be installed on all nodes

`node_repo_modules`

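Name: node_repo_modules, type: string, level: C/A

The repo modules to add on the node, in the same form as repo_modules. The default is local, i.e. the local software repo specified by the local entries in repo_upstream.

When Pigsty takes over a node, it filters the entries in repo_upstream by this parameter: only entries whose module field matches this value are added to the node's repositories.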
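The filtering step can be sketched as follows. This is an illustration only: the function name select_repos is hypothetical, the sample repo_upstream entries are made up, and treating node_repo_modules as a comma-separated list of module names is an assumption about its format.

```python
from typing import Dict, List


def select_repos(repo_upstream: List[Dict[str, str]], node_repo_modules: str = "local") -> List[Dict[str, str]]:
    """Keep only the upstream entries whose module matches node_repo_modules."""
    # Assumption: several modules may be given as a comma-separated string.
    wanted = {m.strip() for m in node_repo_modules.split(",") if m.strip()}
    return [repo for repo in repo_upstream if repo.get("module") in wanted]


# Hypothetical entries, for illustration only
repo_upstream = [
    {"name": "example-local", "module": "local"},
    {"name": "example-node", "module": "node"},
    {"name": "example-pgsql", "module": "pgsql"},
]
```

With the default value local, only the example-local entry would be selected from this sample list.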

`node_repo_remove`

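Name: node_repo_remove, type: bool, level: C/A

Remove the existing repository definitions on the node? Default: true.

If enabled, Pigsty removes the existing configuration files in /etc/yum.repos.d on the node and backs them up to /etc/yum.repos.d/backup. On Debian/Ubuntu, /etc/apt/sources.list(.d) is backed up to /etc/apt/backup.

`node_packages`

Name: node_packages, type: string[], level: C

Packages to install and upgrade on the current node. Default: [openssh-server], which upgrades sshd to the latest version during installation (to avoid security vulnerabilities).

Each array element is a string of comma-separated package names, in the same form as node_default_packages. This parameter is usually used to specify extra packages at the node/cluster level.

Packages listed in this parameter are upgraded to the latest available version. If you need to keep the existing package versions on a node unchanged (presence is enough), use the node_default_packages parameter.

`node_default_packages`

Name: node_default_packages, type: string[], level: G

Packages installed on all nodes by default. The value is an array of strings, each a comma-separated list of package names.

Packages listed in this variable only need to be present, not up to date. To install the latest versions, use the node_packages parameter.

This parameter has no default value, i.e. it is undefined by default. If you do not set it explicitly in the configuration file, Pigsty loads the default from the node_packages_default variable defined in roles/node_id/vars, according to the operating system family of the node. On EL systems this is an RPM package list common to EL 7/8/9.

Default (EL systems):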

- lz4,unzip,bzip2,pv,jq,git,ncdu,make,patch,bash,lsof,wget,uuid,tuned,nvme-cli,numactl,sysstat,iotop,htop,rsync,tcpdump
- python3,python3-pip,socat,lrzsz,net-tools,ipvsadm,telnet,ca-certificates,openssl,keepalived,etcd,haproxy,chrony
- zlib,yum,audit,bind-utils,readline,vim-minimal,node_exporter,grubby,openssh-server,openssh-clients

Default (Debian/Ubuntu):

- lz4,unzip,bzip2,pv,jq,git,ncdu,make,patch,bash,lsof,wget,uuid,tuned,nvme-cli,numactl,sysstat,iotop,htop,rsync,tcpdump
- python3,python3-pip,socat,lrzsz,net-tools,ipvsadm,telnet,ca-certificates,openssl,keepalived,etcd,haproxy,chrony
- zlib1g,acl,dnsutils,libreadline-dev,vim-tiny,node-exporter,openssh-server,openssh-client

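This parameter has the same form as node_packages, but it is usually set at the global level to specify the default packages that every node must have installed.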

`NODE_TUNE`

Host node features, kernel modules, and tuning templates.

node_disable_firewall: true       # disable node firewall? true by default
node_disable_selinux: true        # disable node selinux? true by default
node_disable_numa: false          # disable node numa, reboot required
node_disable_swap: false          # disable node swap, use with caution
node_static_network: true         # preserve dns resolver settings after reboot
node_disk_prefetch: false         # setup disk prefetch on HDD to increase performance
node_kernel_modules: [ softdog, br_netfilter, ip_vs, ip_vs_rr, ip_vs_wrr, ip_vs_sh ]
node_hugepage_count: 0            # number of 2MB hugepage, take precedence over ratio
node_hugepage_ratio: 0            # node mem hugepage ratio, 0 disable it by default
node_overcommit_ratio: 0          # node mem overcommit ratio, 0 disable it by default
node_tune: oltp                   # node tuned profile: none,oltp,olap,crit,tiny
node_sysctl_params: { }           # sysctl parameters in k:v format in addition to tuned

`node_disable_firewall`

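Name: node_disable_firewall, type: bool, level: C

Disable the node firewall? Default: true (the firewall is disabled).

If you deploy in a trusted intranet, you can turn the firewall off. This is the firewalld service on EL and the ufw service on Ubuntu.

`node_disable_selinux`

Name: node_disable_selinux, type: bool, level: C

Disable SELinux on the node? Default: true (SELinux is disabled).

If you have no OS/security expert on hand, disable SELinux.

Also disable SELinux when using the Kubernetes module.

`node_disable_numa`

Name: node_disable_numa, type: bool, level: C

Disable NUMA? Default: false (NUMA stays enabled).

Note that disabling NUMA only takes effect after a reboot! If you are not familiar with CPU pinning, disabling NUMA is recommended when running databases in production.

`node_disable_swap`

Name: node_disable_swap, type: bool, level: C

Disable swap? Default: false (swap stays enabled).

Disabling swap is generally not recommended. There are two exceptions: if you have enough memory for a dedicated PostgreSQL deployment, you can disable swap to improve performance; and if the node is used for the Kubernetes module, swap should be disabled.

`node_static_network`

Name: node_static_network, type: bool, level: C

Use static DNS servers? Default: true (enabled).

Enabling static network means your DNS resolver configuration is not overwritten by machine reboots or NIC changes. Enabling it is recommended, unless a network engineer manages this configuration.

`node_disk_prefetch`

Name: node_disk_prefetch, type: bool, level: C

Enable disk prefetch? Default: false (disabled).

It can improve performance for instances deployed on HDDs, and is recommended when using spinning disks.

`node_kernel_modules`

Name: node_kernel_modules, type: string[], level: C

Which kernel modules to enable? The following kernel modules are enabled by default:

node_kernel_modules: [ softdog, br_netfilter, ip_vs, ip_vs_rr, ip_vs_wrr, ip_vs_sh ]

The value is an array of kernel module names, declaring the kernel modules to be enabled on the node.

`node_hugepage_count`

Name: node_hugepage_count, type: int, level: C

Number of 2MB hugepages to allocate on the node. Default: 0. The related parameter is node_hugepage_ratio.

If both node_hugepage_count and node_hugepage_ratio are 0 (the default), hugepages are disabled entirely. This parameter takes precedence over node_hugepage_ratio because it is more precise.

If a non-zero value is set, it is written to /etc/sysctl.d/hugepage.conf and applied. Negative values have no effect, and values above 90% of node memory are capped at 90% of node memory.

If non-zero, it should be slightly larger than the amount corresponding to pg_shared_buffer_ratio, so that PostgreSQL can actually use hugepages.

`node_hugepage_ratio`

Name: node_hugepage_ratio, type: float, level: C

Ratio of node memory allocated as hugepages. Default: 0, valid range: 0 ~ 0.40.

This share of memory is allocated as hugepages and reserved for PostgreSQL. node_hugepage_count is the higher-priority, more precise counterpart of this parameter.

Default: 0, which sets vm.nr_hugepages=0 and uses no hugepages at all.

If non-zero, this parameter should be equal to or slightly larger than pg_shared_buffer_ratio.

For example, if you give 25% of memory to Postgres shared buffers (the default), you can set this value to 0.27 ~ 0.30, and run /pg/bin/pg-tune-hugepage after initialization to reclaim the wasted hugepages precisely.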
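The interaction between node_hugepage_count and node_hugepage_ratio can be worked through with a short calculation. This is a sketch under stated assumptions, not Pigsty's implementation: the function name plan_hugepages is hypothetical, results are rounded down to whole pages, and a negative count is treated as "not set" so that the ratio applies.

```python
import math

HUGEPAGE_MB = 2  # one hugepage is 2MB, as described above


def plan_hugepages(mem_mb: int, count: int = 0, ratio: float = 0.0) -> int:
    """Return the number of 2MB hugepages to allocate (0 means disabled)."""
    # An explicit count may not exceed 90% of node memory.
    limit = math.floor(mem_mb * 0.9 / HUGEPAGE_MB)
    if count > 0:
        # The explicit count takes precedence over the ratio.
        return min(count, limit)
    if not 0 <= ratio <= 0.40:
        raise ValueError("node_hugepage_ratio must be within 0 ~ 0.40")
    # Otherwise derive the page count from the memory ratio.
    return math.floor(mem_mb * ratio / HUGEPAGE_MB)
```

On a node with 64 GiB of memory (65536 MB), a ratio of 0.30 gives 9830 pages, while shared buffers at 25% of memory correspond to 8192 pages, which leaves some headroom as the text recommends.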

`node_overcommit_ratio`

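Name: node_overcommit_ratio, type: int, level: C

Node memory overcommit ratio. Default: 0. This is an integer from 0 to 100+.

Default: 0, which sets vm.overcommit_memory=0. Any other value sets vm.overcommit_memory=2 and uses this value as vm.overcommit_ratio.

Setting vm.overcommit_ratio is recommended on dedicated pgsql nodes to avoid memory overcommit.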
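The mapping from node_overcommit_ratio to kernel settings is simple enough to state as code. A minimal sketch of the rule above; the function name overcommit_sysctl is hypothetical.

```python
from typing import Dict


def overcommit_sysctl(node_overcommit_ratio: int = 0) -> Dict[str, int]:
    """Translate node_overcommit_ratio into the sysctl values described above."""
    if node_overcommit_ratio == 0:
        # 0 keeps the kernel's heuristic overcommit mode.
        return {"vm.overcommit_memory": 0}
    # Any other value switches to strict accounting with the given ratio.
    return {"vm.overcommit_memory": 2, "vm.overcommit_ratio": node_overcommit_ratio}
```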

`node_tune`

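Name: node_tune, type: enum, level: C

Prebuilt tuning profiles for the machine, provided through tuned. Four profiles are available:

tiny: micro virtual machines
oltp: regular OLTP template, optimizes latency (default)
olap: regular OLAP template, optimizes throughput
crit: core financial business template, optimizes the number of dirty pages

Usually, the database tuning template pg_conf should match the machine tuning template.

`node_sysctl_params`

Name: node_sysctl_params, type: dict, level: C

Extra sysctl kernel parameters in K:V form, added to the tuned profile. Default: {} (empty object).

This is a dictionary in which each key is a kernel sysctl parameter name and each value is the parameter value. You can also define extra sysctl parameters directly in the tuned templates under roles/node/templates.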

`NODE_ADMIN`

This section covers the admin user on host nodes: who can log in, and how.

node_data: /data                  # node main data directory, `/data` by default
node_admin_enabled: true          # create a admin user on target node?
node_admin_uid: 88                # uid and gid for node admin user
node_admin_username: dba          # name of node admin user, `dba` by default
node_admin_ssh_exchange: true     # exchange admin ssh key among node cluster
node_admin_pk_current: true       # add current user's ssh pk to admin authorized_keys
node_admin_pk_list: []            # ssh public keys to be added to admin user
node_aliases: {}                  # extra shell aliases to be added, k:v dict

`node_data`

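Name: node_data, type: path, level: C

Main data directory of the node. Default: /data.

If the directory does not exist, it is created. It should be owned by root with 777 permissions.

`node_admin_enabled`

Name: node_admin_enabled, type: bool, level: C

Create a dedicated admin user on this node? Default: true.

By default, Pigsty creates an admin user with passwordless sudo and ssh privileges on every node. The default admin user, dba (uid=88), can reach the other nodes in the environment from the meta node over passwordless SSH and run passwordless sudo.

`node_admin_uid`

Name: node_admin_uid, type: int, level: C

UID of the admin user. Default: 88.

Keep the UID identical across all nodes whenever possible to avoid needless permission issues.

If the default UID 88 is already taken, choose another UID, and watch out for UID namespace conflicts when assigning it manually.

`node_admin_username`

Name: node_admin_username, type: username, level: C

Name of the admin user. Default: dba.

`node_admin_ssh_exchange`

Name: node_admin_ssh_exchange, type: bool, level: C

Exchange the node admin's SSH keys among the nodes of a cluster? Default: true.

When enabled, Pigsty exchanges SSH public keys among the members while running the playbook, so that the admin user node_admin_username can access the nodes from one another.

`node_admin_pk_current`

Name: node_admin_pk_current, type: bool, level: C

Add the public key of the current node & user to the admin account? Default: true.

When enabled, the SSH public key (~/.ssh/id_rsa.pub) of the admin user running this playbook on the current node is copied to the authorized_keys of the target node's admin user.

Pay close attention to this parameter in production deployments: it installs the default public key of the user running the command on the admin user of every machine.

`node_admin_pk_list`

Name: node_admin_pk_list, type: string[], level: C

Public keys allowed to log in as the admin. Default: [] (empty array).

Each array element is a string holding a public key that is written to the admin user's ~/.ssh/authorized_keys. Whoever holds the matching private key can log in as the admin.

Pay close attention to this parameter in production deployments, and add only trusted keys to this list.

`node_aliases`

Name: node_aliases, type: dict, level: C/I

Command aliases to add to the node. Default: {} (empty dict).

You can add your own frequently used shortcuts to this parameter, and Pigsty writes them to /etc/profile.d/node.alias.sh on the node. For example: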

node_aliases:
  g:   git
  d:   docker

generates:

alias g="git"
alias d="docker"
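The rendering from the node_aliases dict to alias lines can be sketched in a few lines. This only mirrors the input/output pair shown above and is not the template Pigsty uses; the function name render_aliases is hypothetical, and commands containing double quotes are not escaped in this sketch.

```python
from typing import Dict


def render_aliases(node_aliases: Dict[str, str]) -> str:
    """Render one `alias name="command"` line per entry, in insertion order."""
    return "\n".join(f'alias {name}="{command}"' for name, command in node_aliases.items())
```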

`NODE_TIME`

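Configuration for host time, time zone, NTP, and scheduled tasks.

Time synchronization is critical for database services. Make sure the chronyd time service is running properly.

node_timezone: ''                 # set the node time zone, empty string to skip
node_ntp_enabled: true            # enable the chronyd time sync service?
node_ntp_servers:                 # ntp servers in `/etc/chrony.conf`
  - pool pool.ntp.org iburst
node_crontab_overwrite: true      # overwrite or append to `/etc/crontab`?
node_crontab: [ ]                 # crontab entries in `/etc/crontab`

`node_timezone`

Name: node_timezone, type: string, level: C

Sets the node time zone; an empty string skips this step. The default is an empty string, so the node's existing time zone is left unchanged (typically UTC).

For use in China, Asia/Hong_Kong is recommended.

`node_ntp_enabled`

Name: node_ntp_enabled, type: bool, level: C

Enable the chronyd time sync service? Default: true.

When enabled, Pigsty overwrites the node's /etc/chrony.conf with the NTP server list given in node_ntp_servers.

If NTP servers are already configured on your node, you can set this parameter to false to skip time sync configuration.

`node_ntp_servers`

Name: node_ntp_servers, type: string[], level: C

NTP server list used in /etc/chrony.conf. Default: ["pool pool.ntp.org iburst"]

This parameter is an array in which each element is a string representing one line of NTP server configuration. It only takes effect when node_ntp_enabled is enabled.

Pigsty uses the global NTP pool pool.ntp.org by default. You can change this parameter to fit your network environment, for example pool cn.pool.ntp.org iburst, or an internal time service.

You can also use the ${admin_ip} placeholder in the configuration to use the time server on the admin node.

node_ntp_servers: [ 'pool ${admin_ip} iburst' ]

`node_crontab_overwrite`

Name: node_crontab_overwrite, type: bool, level: C

When handling the scheduled tasks in node_crontab, append or overwrite? Default: true, i.e. overwrite.

If you want to append scheduled tasks on the node, set this parameter to false, and Pigsty appends to the node's crontab instead of overwriting all scheduled tasks.

`node_crontab`

Name: node_crontab, type: string[], level: C

Scheduled tasks defined in the node's /etc/crontab. Default: [] (empty array).

Each array element is a string representing one scheduled task line, in standard cron format.

For example, the following configuration runs a full backup as the postgres user at 1 a.m. every day.

node_crontab:
  - '00 01 * * * postgres /pg/bin/pg-backup full' # make a full backup every 1am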
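Because these entries go into /etc/crontab, each line carries a user field between the five time fields and the command. The sketch below splits an entry into those parts so the layout is explicit; the names CrontabEntry and parse_crontab_entry are hypothetical helpers for illustration, not part of Pigsty.

```python
from typing import NamedTuple


class CrontabEntry(NamedTuple):
    schedule: str  # the five time fields: minute hour day month weekday
    user: str      # the account the command runs as
    command: str   # everything after the user field


def parse_crontab_entry(line: str) -> CrontabEntry:
    """Split a system crontab line into schedule, user, and command."""
    parts = line.split(None, 6)  # 5 time fields + user + the rest as command
    if len(parts) < 7:
        raise ValueError(f"expected '<5 time fields> <user> <command>': {line!r}")
    return CrontabEntry(" ".join(parts[:5]), parts[5], parts[6])
```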

`NODE_VIP`

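You can bind an optional L2 VIP to a node cluster. This feature is disabled by default. An L2 VIP is only meaningful for a group of nodes in a cluster: the VIP moves between the nodes of the cluster according to the configured priority, keeping node services highly available.

Note that an L2 VIP only works within the same L2 network segment, which may put extra constraints on your network topology. If you want to avoid this limitation, consider DNS LB or HAProxy for similar functionality.

When enabling this feature, you must explicitly assign an available vip_address and vip_vrid to the L2 VIP, and make sure both are unique within the same network segment.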

vip_enabled: false                # enable vip on this node cluster?
# vip_address:         [IDENTITY] # node vip address in ipv4 format, required if vip is enabled
# vip_vrid:            [IDENTITY] # required, integer, 1-254, should be unique among same VLAN
vip_role: backup                  # optional, `master/backup`, backup by default, use as init role
vip_preempt: false                # optional, `true/false`, false by default, enable vip preemption
vip_interface: eth0               # node vip network interface to listen, `eth0` by default
vip_dns_suffix: ''                # node vip dns name suffix, empty string by default
vip_exporter_port: 9650           # keepalived exporter listen port, 9650 by default

`vip_enabled`

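Name: vip_enabled, type: bool, level: C

Configure a Keepalived-managed L2 VIP on this node cluster? Default: false.

`vip_address`

Name: vip_address, type: ip, level: C

Node VIP address in IPv4 format (without a CIDR suffix). This parameter is required when vip_enabled is set on the node.

This parameter has no default value, which means you must explicitly assign a unique VIP address to the node cluster.

`vip_vrid`

Name: vip_vrid, type: int, level: C

The VRID is a positive integer from 1 to 254 that identifies a VIP within a network. This parameter is required when vip_enabled is set on the node.

This parameter has no default value, which means you must explicitly assign the node cluster an ID that is unique within the network segment.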
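The constraints on vip_address and vip_vrid can be checked before running the playbook. This is a hypothetical pre-flight check written for illustration, not something Pigsty ships; the function name check_vip_config is an assumption, and uniqueness within the network segment cannot be verified from a single cluster's config and is left out.

```python
import ipaddress
from typing import Any, Dict, List


def check_vip_config(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a cluster's VIP parameters."""
    errors: List[str] = []
    if not cfg.get("vip_enabled", False):
        return errors  # nothing to check when the VIP is disabled

    address = cfg.get("vip_address")
    if address is None:
        errors.append("vip_address is required when vip_enabled is true")
    else:
        try:
            # IPv4Address rejects CIDR suffixes such as 10.10.10.99/24
            ipaddress.IPv4Address(address)
        except ValueError:
            errors.append(f"vip_address is not a plain IPv4 address: {address!r}")

    vrid = cfg.get("vip_vrid")
    if not isinstance(vrid, int) or isinstance(vrid, bool) or not 1 <= vrid <= 254:
        errors.append("vip_vrid must be an integer between 1 and 254")
    return errors
```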

`vip_role`

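Name: vip_role, type: enum, level: I

Node VIP role, either master or backup. Default: backup.

The value of this parameter is used as the initial state of keepalived.

`vip_preempt`

Name: vip_preempt, type: bool, level: C/I

Enable VIP preemption? Optional, default false, i.e. no VIP preemption.

Preemption determines whether a backup node whose priority is higher than that of the currently alive and working master node takes over its VIP.

`vip_interface`

Name: vip_interface, type: string, level: C/I

Network interface the node VIP listens on. Default: eth0.

Use the same interface name as the one carrying the node's primary IP address (the IP address you put in the inventory).

If your nodes have different interface names, you can override this at the instance/node level.

`vip_dns_suffix`

Name: vip_dns_suffix, type: string, level: C/I

DNS name suffix used for the node cluster L2 VIP. The default is an empty string, i.e. the cluster name itself is used as the DNS name.

`vip_exporter_port`

Name: vip_exporter_port, type: port, level: C/I

keepalived exporter listen port. Default: 9650.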

`HAPROXY`

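HAProxy is installed and enabled on all nodes by default, and exposes services in a way similar to Kubernetes NodePort.

The PGSQL module uses HAProxy to expose its services.

haproxy_enabled: true             # enable haproxy on this node?
haproxy_clean: false              # clean up all existing haproxy config?
haproxy_reload: true              # reload haproxy after config?
haproxy_auth_enabled: true        # enable authentication for the haproxy admin page
haproxy_admin_username: admin     # haproxy admin username, `admin` by default
haproxy_admin_password: pigsty    # haproxy admin password, `pigsty` by default
haproxy_exporter_port: 9101       # haproxy admin/exporter port, 9101 by default
haproxy_client_timeout: 24h       # client connection timeout, 24h by default
haproxy_server_timeout: 24h       # server-side connection timeout, 24h by default
haproxy_services: []              # haproxy services to be exposed on the node

`haproxy_enabled`

Name: haproxy_enabled, type: bool, level: C

Enable haproxy on this node? Default: true.

`haproxy_clean`

Name: haproxy_clean, type: bool, level: G/C/A

Clean up all existing haproxy configuration? Default: false.

`haproxy_reload`

Name: haproxy_reload, type: bool, level: A

Reload haproxy after configuration? Default: true, so haproxy is reloaded after config changes.

If you want to check the configuration manually before applying it, you can turn this option off with a command-line argument and apply it after checking.

`haproxy_auth_enabled`

Name: haproxy_auth_enabled, type: bool, level: G

Enable authentication for the haproxy admin page. Default: true, which requires HTTP basic authentication for the admin page.

Disabling authentication is not recommended, since your traffic control page would be exposed, which is risky.

`haproxy_admin_username`

Name: haproxy_admin_username, type: username, level: G

haproxy admin username. Default: admin.

`haproxy_admin_password`

Name: haproxy_admin_password, type: password, level: G

haproxy admin password. Default: pigsty.

Be sure to change this password in production!

`haproxy_exporter_port`

Name: haproxy_exporter_port, type: port, level: C

Port on which haproxy exposes traffic management and metrics. Default: 9101.

`haproxy_client_timeout`

Name: haproxy_client_timeout, type: interval, level: C

Client connection timeout. Default: 24h.

Setting a timeout avoids overly long connections that are hard to clean up. If you really need long-lived connections, set it to a longer duration.

`haproxy_server_timeout`

Name: haproxy_server_timeout, type: interval, level: C

Server-side connection timeout. Default: 24h.

Setting a timeout avoids overly long connections that are hard to clean up. If you really need long-lived connections, set it to a longer duration.

`haproxy_services`

Name: haproxy_services, type: service[], level: C

Services to expose through HAProxy on this node. Default: [] (empty array).

Each array element is a service definition. Here is an example: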

haproxy_services:                   # list of haproxy service

  # expose pg-test read only replicas
  - name: pg-test-ro                # [REQUIRED] service name, unique
    port: 5440                      # [REQUIRED] service port, unique
    ip: "*"                         # [OPTIONAL] service listen addr, "*" by default
    protocol: tcp                   # [OPTIONAL] service protocol, 'tcp' by default
    balance: leastconn              # [OPTIONAL] load balance algorithm, roundrobin by default (or leastconn)
    maxconn: 20000                  # [OPTIONAL] max allowed front-end connection, 20000 by default
    default: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'
    options:
      - option httpchk
      - option http-keep-alive
      - http-check send meth OPTIONS uri /read-only
      - http-check expect status 200
    servers:
      - { name: pg-test-1 ,ip: 10.10.10.11 , port: 5432 , options: check port 8008 , backup: true }
      - { name: pg-test-2 ,ip: 10.10.10.12 , port: 5432 , options: check port 8008 }
      - { name: pg-test-3 ,ip: 10.10.10.13 , port: 5432 , options: check port 8008 }

Each service definition is rendered to the config file /etc/haproxy/<service.name>.cfg and takes effect after HAProxy is reloaded.
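Since name and port are both marked as required and unique in the example, a quick check over haproxy_services can catch clashes before a reload. This is a hypothetical helper for illustration, not part of Pigsty; the function name check_haproxy_services is an assumption, and the returned paths simply follow the /etc/haproxy/<service.name>.cfg pattern stated above.

```python
from collections import Counter
from typing import Any, Dict, List


def check_haproxy_services(services: List[Dict[str, Any]]) -> Dict[str, str]:
    """Validate required/unique fields and map each service to its config file."""
    for svc in services:
        for key in ("name", "port"):
            if key not in svc:
                raise ValueError(f"service is missing required field: {key}")
    for key in ("name", "port"):
        # Collect values that appear more than once across all services.
        dupes = [v for v, n in Counter(s[key] for s in services).items() if n > 1]
        if dupes:
            raise ValueError(f"duplicate service {key}: {dupes}")
    return {s["name"]: f"/etc/haproxy/{s['name']}.cfg" for s in services}
```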

`NODE_EXPORTER`

node_exporter_enabled: true       # setup node_exporter on this node?
node_exporter_port: 9100          # node exporter listen port, 9100 by default
node_exporter_options: '--no-collector.softnet --no-collector.nvme --collector.tcpstat --collector.processes'

`node_exporter_enabled`

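Name: node_exporter_enabled, type: bool, level: C

Enable the node metrics exporter on the current node? Default: true (enabled).

`node_exporter_port`

Name: node_exporter_port, type: port, level: C

Port used to expose node metrics. Default: 9100.

`node_exporter_options`

Name: node_exporter_options, type: arg, level: C

Command-line arguments for the node metrics exporter. The default value is:

--no-collector.softnet --no-collector.nvme --collector.tcpstat --collector.processes

This option enables or disables certain metric collectors. Adjust it to your needs.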

`PROMTAIL`

Promtail is the log collection agent that pairs with Loki. It collects the logs produced by each module and sends them to the Loki service on the infra nodes.

INFRA: logs of infrastructure components, collected on infra nodes only.
- nginx-access: /var/log/nginx/access.log
- nginx-error: /var/log/nginx/error.log
- grafana: /var/log/grafana/grafana.log
NODES: host-related logs, collected on all nodes.
- syslog: /var/log/messages (/var/log/syslog on Debian)
- dmesg: /var/log/dmesg
- cron: /var/log/cron
PGSQL: PostgreSQL-related logs, collected only when the PGSQL module is configured on the node.
- postgres: /pg/log/postgres/*
- patroni: /pg/log/patroni.log
- pgbouncer: /pg/log/pgbouncer/pgbouncer.log
- pgbackrest: /pg/log/pgbackrest/*.log
REDIS: Redis-related logs, collected only when the REDIS module is configured on the node.
- redis: /var/log/redis/*.log

The log directories are adjusted automatically according to these parameters: pg_log_dir, patroni_log_dir, pgbouncer_log_dir, pgbackrest_log_dir

promtail_enabled: true            # enable promtail logging collector?
promtail_clean: false             # purge existing promtail status file during init?
promtail_port: 9080               # promtail listen port, 9080 by default
promtail_positions: /var/log/positions.yaml # promtail position status file path

`promtail_enabled`

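Name: promtail_enabled, type: bool, level: C

Enable the Promtail log collection service? Default: true.

`promtail_clean`

Name: promtail_clean, type: bool, level: G/A

Remove existing state when installing Promtail? Default: false.

Nothing is cleaned by default. If you choose to clean, Pigsty removes the existing state file promtail_positions when deploying Promtail, which means Promtail re-collects all logs on the current node and sends them to Loki again.

`promtail_port`

Name: promtail_port, type: port, level: C

Port Promtail listens on. Default: 9080.

`promtail_positions`

Name: promtail_positions, type: path, level: C

Path of the Promtail state file. Default: /var/log/positions.yaml.

Promtail records the consumption offsets of all logs and periodically writes them to the file specified by this parameter.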

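9.4 - Playbooks

How to manage node clusters with the built-in ansible playbooks, plus a cheat sheet of common administration commands.

Pigsty provides two playbooks for the NODE module, one for adding nodes and one for removing them.

node.yml: bring a node under management and adjust it to the desired state
node-rm.yml: remove a managed node from pigsty

In addition, Pigsty provides two wrapper commands, node-add and node-rm, for invoking these playbooks quickly.

`node.yml`

The node.yml playbook, which adds nodes to Pigsty, consists of the following subtasks:

node-id       : generate the node identity
node_name     : set the hostname
node_hosts    : configure /etc/hosts records
node_resolv   : configure the DNS resolver /etc/resolv.conf
node_firewall : set up the firewall & selinux
node_ca       : add and trust the CA certificate
node_repo     : add upstream software repositories
node_pkg      : install rpm/deb packages
node_feature  : configure numa, grub, static network, and other features
node_kernel   : configure OS kernel modules
node_tune     : configure the tuned tuning profile
node_sysctl   : set extra sysctl parameters
node_profile  : write /etc/profile.d/node.sh
node_ulimit   : configure resource limits
node_data     : configure the data directory
node_admin    : configure the admin user and ssh keys
node_timezone : configure the time zone
node_ntp      : configure the NTP server/client
node_crontab  : add/overwrite crontab scheduled tasks
node_vip      : set up an optional L2 VIP for the node cluster
haproxy       : set up haproxy on the node to expose services
monitor       : configure node monitoring: node_exporter & promtail

Example: initialize a node cluster with node.yml

`node-rm.yml`

The node-rm.yml playbook, which removes nodes from Pigsty, consists of the following subtasks:

register       : remove node registration from prometheus & nginx
  - prometheus : remove the registered prometheus monitoring target
  - nginx      : remove the nginx proxy record for the haproxy admin UI
vip            : remove the node's keepalived and L2 VIP (if VIP is enabled)
haproxy        : remove the haproxy load balancer
node_exporter  : remove node monitoring: Node Exporter
vip_exporter   : remove keepalived_exporter (if VIP is enabled)
promtail       : remove the loki log agent promtail
profile        : remove the /etc/profile.d/node.sh environment profile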

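9.5 - Administration

Node cluster management SOPs: create, destroy, scale out, scale in, and handle node and disk failures.

The sections below cover common administration tasks in the NODE module.

For more questions, see FAQ: NODE

Add Node

To add a node to Pigsty, you need passwordless ssh/sudo access to that node.

You can also add an entire cluster at once, or use wildcards to match the nodes in the inventory to be added to Pigsty.

# ./node.yml -l <cls|ip|group>        # the actual playbook that adds nodes to Pigsty
# bin/node-add <selector|ip...>       # add nodes to Pigsty
bin/node-add node-test                # initialize node cluster 'node-test'
bin/node-add 10.10.10.10              # initialize node '10.10.10.10'

Remove Node

To remove a node from Pigsty, use the following commands:

# ./node-rm.yml -l <cls|ip|group>    # the actual playbook that removes nodes from pigsty
# bin/node-rm <cls|ip|selector> ...  # remove nodes from pigsty
bin/node-rm node-test                # remove node cluster 'node-test'
bin/node-rm 10.10.10.10              # remove node '10.10.10.10'

You can also remove an entire cluster at once, or use wildcards to match the nodes in the inventory to be removed from Pigsty.

Create Admin

If the current user does not have passwordless ssh/sudo access to the node, you can use another admin user to bootstrap it:

./node.yml -t node_admin -k -K -e ansible_user=<another_admin>   # enter the ssh/sudo password of the other admin to complete this task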

Bind VIP

You can bind an optional L2 VIP to a node cluster with the vip_enabled parameter.

proxy:
  hosts:
    10.10.10.29: { nodename: proxy-1 } # you can explicitly set the initial VIP role: MASTER / BACKUP
    10.10.10.30: { nodename: proxy-2 } # , vip_role: master }
  vars:
    node_cluster: proxy
    vip_enabled: true
    vip_vrid: 128
    vip_address: 10.10.10.99
    vip_interface: eth1

./node.yml -l proxy -t node_vip     # enable the VIP for the first time
./node.yml -l proxy -t vip_refresh  # refresh the vip config (e.g. to designate the master)

Add Node Monitoring

Sometimes you only want to bring a node under Pigsty monitoring without any other features. You can run a subset of the node.yml playbook to do so:

# install the monitoring software on the node: node_exporter for metrics, promtail for logs
./node.yml -t node_repo,node_pkg -e '{"node_packages_default":[],"node_packages":["node_exporter", "promtail"]}'
./node.yml -t node_exporter,node_register  # configure node_exporter and register it with Prometheus
./node.yml -t promtail                     # additionally run this task if you need to collect node logs

Other Common Tasks

# Play
./node.yml -t node                            # complete the main node initialization (excluding haproxy and monitoring)
./node.yml -t haproxy                         # set up haproxy on the node
./node.yml -t monitor                         # configure node monitoring: node_exporter & promtail (and the optional keepalived_exporter)
./node.yml -t node_vip                        # install, configure, and enable the L2 VIP on a cluster that never had one
./node.yml -t vip_config,vip_reload           # refresh the node L2 VIP config
./node.yml -t haproxy_config,haproxy_reload   # refresh the service definitions on the node
./node.yml -t register_prometheus             # re-register the node with Prometheus
./node.yml -t register_nginx                  # re-register the node's haproxy admin UI with Nginx

# Task
./node.yml -t node-id        # generate the node identity
./node.yml -t node_name      # set the hostname
./node.yml -t node_hosts     # configure the node's /etc/hosts records
./node.yml -t node_resolv    # configure the node's DNS resolver /etc/resolv.conf
./node.yml -t node_firewall  # configure the firewall & selinux
./node.yml -t node_ca        # configure the node's CA certificate
./node.yml -t node_repo      # configure the node's upstream software repositories
./node.yml -t node_pkg       # install rpm/deb packages on the node
./node.yml -t node_feature   # configure numa, grub, static network, and other features
./node.yml -t node_kernel    # configure OS kernel modules
./node.yml -t node_tune      # configure the tuned tuning profile
./node.yml -t node_sysctl    # set extra sysctl parameters
./node.yml -t node_profile   # configure node environment variables: /etc/profile.d/node.sh
./node.yml -t node_ulimit    # configure node resource limits
./node.yml -t node_data      # configure the node's main data directory
./node.yml -t node_admin     # configure the admin user and ssh keys
./node.yml -t node_timezone  # configure the node time zone
./node.yml -t node_ntp       # configure the node NTP server/client
./node.yml -t node_crontab   # add/overwrite crontab scheduled tasks
./node.yml -t node_vip       # set up an optional L2 VIP for the node cluster

9.6 - Monitoring and Alerting

How to monitor nodes in Pigsty? How to use the node admin dashboards? Which alerting rules deserve attention?

Monitoring

The NODE module in Pigsty provides 6 monitoring dashboards:

NODE Overview: overview of all host nodes in the current environment
NODE Cluster: detailed monitoring of a single host cluster
Node Instance: detailed monitoring of a single host node
NODE Alert: alerts for all host nodes in the current environment
NODE VIP: detailed monitoring of a single host L2 VIP
Node Haproxy: detailed monitoring of a single HAProxy load balancer

Alerting Rules

Pigsty provides the following alerting rules for nodes:

################################################################
#                         Node Alert                           #
################################################################
- name: node-alert
  rules:

    #==============================================================#
    #                          Aliveness                           #
    #==============================================================#
    # node exporter is dead indicate node is down
    - alert: NodeDown
      expr: node_up < 1
      for: 1m
      labels: { level: 0, severity: CRIT, category: node }
      annotations:
        summary: "CRIT NodeDown {{ $labels.ins }}@{{ $labels.instance }}"
        description: |
          node_up[ins={{ $labels.ins }}, instance={{ $labels.instance }}] = {{ $value }} < 1
          http://g.pigsty/d/node-instance?var-ins={{ $labels.ins }}          

    # haproxy the load balancer
    - alert: HaproxyDown
      expr: haproxy_up < 1
      for: 1m
      labels: { level: 0, severity: CRIT, category: node }
      annotations:
        summary: "CRIT HaproxyDown {{ $labels.ins }}@{{ $labels.instance }}"
        description: |
          haproxy_up[ins={{ $labels.ins }}, instance={{ $labels.instance }}] = {{ $value }} < 1
          http://g.pigsty/d/node-haproxy?var-ins={{ $labels.ins }}          

    # promtail the logging agent
    - alert: PromtailDown
      expr: promtail_up < 1
      for: 1m
      labels: { level: 1, severity: WARN, category: node }
      annotations:
        summary: "WARN PromtailDown {{ $labels.ins }}@{{ $labels.instance }}"
        description: |
          promtail_up[ins={{ $labels.ins }}, instance={{ $labels.instance }}] = {{ $value }} < 1
          http://g.pigsty/d/node-instance?var-ins={{ $labels.ins }}          

    # docker the container engine
    - alert: DockerDown
      expr: docker_up < 1
      for: 1m
      labels: { level: 1, severity: WARN, category: node }
      annotations:
        summary: "WARN DockerDown {{ $labels.ins }}@{{ $labels.instance }}"
        description: |
          docker_up[ins={{ $labels.ins }}, instance={{ $labels.instance }}] = {{ $value }} < 1
          http://g.pigsty/d/node-instance?var-ins={{ $labels.ins }}          

    # keepalived daemon
    - alert: KeepalivedDown
      expr: keepalived_up < 1
      for: 1m
      labels: { level: 1, severity: WARN, category: node }
      annotations:
        summary: "WARN KeepalivedDown {{ $labels.ins }}@{{ $labels.instance }}"
        description: |
          keepalived_up[ins={{ $labels.ins }}, instance={{ $labels.instance }}] = {{ $value }} < 1
          http://g.pigsty/d/node-instance?var-ins={{ $labels.ins }}          



    #==============================================================#
    #                          Node : CPU                          #
    #==============================================================#
    # cpu usage high : 1m avg cpu usage > 70% for 1m
    - alert: NodeCpuHigh
      expr: node:ins:cpu_usage_1m > 0.70
      for: 1m
      labels: { level: 1, severity: WARN, category: node }
      annotations:
        summary: 'WARN NodeCpuHigh {{ $labels.ins }}@{{ $labels.instance }} {{ $value  | printf "%.2f" }}'
        description: |
          node:ins:cpu_usage_1m[ins={{ $labels.ins }}] = {{ $value  | printf "%.2f" }} > 70%

    # OPTIONAL: one core high
    # OPTIONAL: throttled
    # OPTIONAL: frequency
    # OPTIONAL: steal

    #==============================================================#
    #                       Node : Schedule                        #
    #==============================================================#
    # node load high : 1m avg standard load > 100% for 1m
    - alert: NodeLoadHigh
      expr: node:ins:stdload1 > 1
      for: 1m
      labels: { level: 1, severity: WARN, category: node }
      annotations:
        summary: 'WARN NodeLoadHigh {{ $labels.ins }}@{{ $labels.instance }} {{ $value  | printf "%.2f" }}'
        description: |
          node:ins:stdload1[ins={{ $labels.ins }}] = {{ $value  | printf "%.2f" }} > 100%          


    #==============================================================#
    #                        Node : Memory                         #
    #==============================================================#
    # available memory < 10%
    - alert: NodeOutOfMem
      expr: node:ins:mem_avail < 0.10
      for: 1m
      labels: { level: 1, severity: WARN, category: node }
      annotations:
        summary: 'WARN NodeOutOfMem {{ $labels.ins }}@{{ $labels.instance }} {{ $value  | printf "%.2f" }}'
        description: |
          node:ins:mem_avail[ins={{ $labels.ins }}] = {{ $value  | printf "%.2f" }} < 10%          

    # commit ratio > 90%
    #- alert: NodeMemCommitRatioHigh
    #  expr: node:ins:mem_commit_ratio > 0.90
    #  for: 1m
    #  labels: { level: 1, severity: WARN, category: node }
    #  annotations:
    #    summary: 'WARN NodeMemCommitRatioHigh {{ $labels.ins }}@{{ $labels.instance }} {{ $value  | printf "%.2f" }}'
    #    description: |
    #      node:ins:mem_commit_ratio[ins={{ $labels.ins }}] = {{ $value  | printf "%.2f" }} > 90%

    # OPTIONAL: EDAC Errors

    #==============================================================#
    #                        Node : Swap                           #
    #==============================================================#
    # swap usage > 1%
    - alert: NodeMemSwapped
      expr: node:ins:swap_usage > 0.01
      for: 5m
      labels: { level: 1, severity: WARN, category: node }
      annotations:
        summary: 'WARN NodeMemSwapped {{ $labels.ins }}@{{ $labels.instance }} {{ $value  | printf "%.2f" }}'
        description: |
          node:ins:swap_usage[ins={{ $labels.ins }}] = {{ $value  | printf "%.2f" }} > 1%          

    #==============================================================#
    #                     Node : File System                       #
    #==============================================================#

    # filesystem usage > 90%
    - alert: NodeFsSpaceFull
      expr: node:fs:space_usage > 0.90
      for: 1m
      labels: { level: 1, severity: WARN, category: node }
      annotations:
        summary: 'WARN NodeFsSpaceFull {{ $labels.ins }}@{{ $labels.instance }} {{ $value  | printf "%.2f" }}'
        description: |
          node:fs:space_usage[ins={{ $labels.ins }}] = {{ $value  | printf "%.2f" }} > 90%          

    # inode usage > 90%
    - alert: NodeFsFilesFull
      expr: node:fs:inode_usage > 0.90
      for: 1m
      labels: { level: 1, severity: WARN, category: node }
      annotations:
        summary: 'WARN NodeFsFilesFull {{ $labels.ins }}@{{ $labels.instance }} {{ $value  | printf "%.2f" }}'
        description: |
          node:fs:inode_usage[ins={{ $labels.ins }}] = {{ $value  | printf "%.2f" }} > 90%          

    # file descriptor usage > 90%
    - alert: NodeFdFull
      expr: node:ins:fd_usage > 0.90
      for: 1m
      labels: { level: 1, severity: WARN, category: node }
      annotations:
        summary: 'WARN NodeFdFull {{ $labels.ins }}@{{ $labels.instance }} {{ $value  | printf "%.2f" }}'
        description: |
          node:ins:fd_usage[ins={{ $labels.ins }}] = {{ $value  | printf "%.2f" }} > 90%          

    # OPTIONAL: space predict 1d
    # OPTIONAL: filesystem read-only
    # OPTIONAL: fast release on disk space

    #==============================================================#
    #                          Node : Disk                         #
    #==============================================================#
    # read or write latency > 32ms (typical on pci-e ssd: 100µs)
    - alert: NodeDiskSlow
      expr: node:dev:disk_read_rt_1m{device="dfa"} > 0.032 or node:dev:disk_write_rt_1m{device="dfa"} > 0.032
      for: 1m
      labels: { level: 2, severity: INFO, category: node }
      annotations:
        summary: 'INFO NodeDiskSlow {{ $labels.ins }}@{{ $labels.instance }} {{ $value  | printf "%.6f" }}'
        description: |
          node:dev:disk_{read,write}_rt_1m[ins={{ $labels.ins }}] = {{ $value  | printf "%.6f" }} > 32ms

    # OPTIONAL: raid card failure
    # OPTIONAL: read/write traffic high
    # OPTIONAL: read/write latency high

    #==============================================================#
    #                        Node : Network                        #
    #==============================================================#
    # OPTIONAL: unusual network traffic
    # OPTIONAL: interface saturation high

    #==============================================================#
    #                        Node : Protocol                       #
    #==============================================================#

    # rate(node:ins:tcp_error[1m]) > 1
    - alert: NodeTcpErrHigh
      expr: rate(node:ins:tcp_error[1m]) > 1
      for: 1m
      labels: { level: 1, severity: WARN, category: node }
      annotations:
        summary: 'WARN NodeTcpErrHigh {{ $labels.ins }}@{{ $labels.instance }} {{ $value  | printf "%.2f" }}'
        description: |
          rate(node:ins:tcp_error{ins={{ $labels.ins }}}[1m]) = {{ $value  | printf "%.2f" }} > 1          

    # node:ins:tcp_retrans_ratio1m > 1e-2
    - alert: NodeTcpRetransHigh
      expr: node:ins:tcp_retrans_ratio1m > 1e-2
      for: 1m
      labels: { level: 2, severity: INFO, category: node }
      annotations:
        summary: 'INFO NodeTcpRetransHigh {{ $labels.ins }}@{{ $labels.instance }} {{ $value  | printf "%.6f" }}'
        description: |
          node:ins:tcp_retrans_ratio1m[ins={{ $labels.ins }}] = {{ $value  | printf "%.6f" }} > 1%          

    # OPTIONAL: tcp conn high
    # OPTIONAL: udp traffic high
    # OPTIONAL: conn track

    #==============================================================#
    #                          Node : Time                         #
    #==============================================================#

    - alert: NodeTimeDrift
      expr: node_timex_sync_status != 1
      for: 1m
      labels: { level: 1, severity: WARN, category: node }
      annotations:
        summary: 'WARN NodeTimeDrift {{ $labels.ins }}@{{ $labels.instance }}'
        description: |
          node_timex_sync_status[ins={{ $labels.ins }}] = {{ $value | printf "%.6f" }} != 1


    # time drift > 64ms
    # - alert: NodeTimeDrift
    #   expr: node:ins:time_drift > 0.064
    #   for: 1m
    #   labels: { level: 1, severity: WARN, category: node }
    #   annotations:
    #     summary: 'WARN NodeTimeDrift {{ $labels.ins }}@{{ $labels.instance }}'
    #     description: |
    #       abs(node_timex_offset_seconds)[ins={{ $labels.ins }}] = {{ $value | printf "%.6f" }} > 64ms
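
Every rule above carries the labels `category: node` and a `severity`, so firing alerts can be selected from the `ALERTS` series on those labels. The following is a minimal sketch under stated assumptions: `node_alert_selector`, `node_alerts_firing`, and the `PROM_URL` variable are hypothetical names introduced here, and the default address `http://127.0.0.1:9090` is a placeholder for wherever your Prometheus server listens. The query itself uses the standard Prometheus instant-query HTTP API (`/api/v1/query`).

```shell
# Build a PromQL selector for firing NODE alerts, optionally filtered by severity.
# node_alert_selector is a hypothetical helper name used for illustration.
node_alert_selector() {
  _sev="${1:-}"
  _sel='ALERTS{category="node",alertstate="firing"'
  if [ -n "$_sev" ]; then
    # Add a severity matcher, e.g. CRIT, WARN, or INFO as used in the rules above
    _sel="$_sel,severity=\"$_sev\""
  fi
  printf '%s}\n' "$_sel"
}

# Send the selector to a Prometheus server's instant-query endpoint.
# PROM_URL is an assumption: set it to your Prometheus address.
node_alerts_firing() {
  curl -sG "${PROM_URL:-http://127.0.0.1:9090}/api/v1/query" \
       --data-urlencode "query=$(node_alert_selector "${1:-}")"
}
```

Calling `node_alert_selector CRIT` prints the selector for critical alerts only, and `node_alerts_firing WARN` returns the JSON response from the server for firing warnings. Keeping the selector in its own function makes it easy to paste the same expression into a dashboard or the Prometheus expression browser.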

9.7 - Metrics

The complete list of monitoring metrics provided by the Pigsty NODE module, with descriptions.

The NODE module has 747 available monitoring metrics.

Metric Name	Type	Labels	Description
ALERTS	Unknown	`alertname`, `ip`, `level`, `severity`, `ins`, `job`, `alertstate`, `category`, `instance`, `cls`	N/A
ALERTS_FOR_STATE	Unknown	`alertname`, `ip`, `level`, `severity`, `ins`, `job`, `category`, `instance`, `cls`	N/A
deprecated_flags_inuse_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
go_gc_duration_seconds	summary	`quantile`, `instance`, `ins`, `job`, `ip`, `cls`	A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
go_gc_duration_seconds_sum	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
go_goroutines	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of goroutines that currently exist.
go_info	gauge	`version`, `instance`, `ins`, `job`, `ip`, `cls`	Information about the Go environment.
go_memstats_alloc_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of frees.
go_memstats_gc_sys_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of heap bytes that are in use.
go_memstats_heap_objects	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of allocated objects.
go_memstats_heap_released_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of heap bytes released to OS.
go_memstats_heap_sys_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of pointer lookups.
go_memstats_mallocs_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of mallocs.
go_memstats_mcache_inuse_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes obtained from system.
go_threads	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of OS threads created.
haproxy:cls:usage	Unknown	`job`, `cls`	N/A
haproxy:ins:uptime	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
haproxy:ins:usage	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
haproxy_backend_active_servers	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of active UP servers with a non-zero weight
haproxy_backend_agg_check_status	gauge	`state`, `proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Backend’s aggregated gauge of servers’ state check status
haproxy_backend_agg_server_check_status	gauge	`state`, `proxy`, `instance`, `ins`, `job`, `ip`, `cls`	[DEPRECATED] Backend’s aggregated gauge of servers’ status
haproxy_backend_agg_server_status	gauge	`state`, `proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Backend’s aggregated gauge of servers’ status
haproxy_backend_backup_servers	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of backup UP servers with a non-zero weight
haproxy_backend_bytes_in_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of request bytes since process started
haproxy_backend_bytes_out_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of response bytes since process started
haproxy_backend_check_last_change_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	How long ago the last server state changed, in seconds
haproxy_backend_check_up_down_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of failed checks causing UP to DOWN server transitions, per server/backend, since the worker process started
haproxy_backend_client_aborts_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of requests or connections aborted by the client since the worker process started
haproxy_backend_connect_time_average_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Avg. connect time for last 1024 successful connections.
haproxy_backend_connection_attempts_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of outgoing connection attempts on this backend/server since the worker process started
haproxy_backend_connection_errors_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of failed connections to server since the worker process started
haproxy_backend_connection_reuses_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of reused connection on this backend/server since the worker process started
haproxy_backend_current_queue	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Number of current queued connections
haproxy_backend_current_sessions	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Number of current sessions on the frontend, backend or server
haproxy_backend_downtime_seconds_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total time spent in DOWN state, for server or backend
haproxy_backend_failed_header_rewriting_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of failed HTTP header rewrites since the worker process started
haproxy_backend_http_cache_hits_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of HTTP requests found in the cache on this frontend/backend since the worker process started
haproxy_backend_http_cache_lookups_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of HTTP requests looked up in the cache on this frontend/backend since the worker process started
haproxy_backend_http_comp_bytes_bypassed_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of bytes that bypassed HTTP compression for this object since the worker process started (CPU/memory/bandwidth limitation)
haproxy_backend_http_comp_bytes_in_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of bytes submitted to the HTTP compressor for this object since the worker process started
haproxy_backend_http_comp_bytes_out_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of bytes emitted by the HTTP compressor for this object since the worker process started
haproxy_backend_http_comp_responses_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of HTTP responses that were compressed for this object since the worker process started
haproxy_backend_http_requests_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of HTTP requests processed by this object since the worker process started
haproxy_backend_http_responses_total	counter	`ip`, `proxy`, `ins`, `code`, `job`, `instance`, `cls`	Total number of HTTP responses returned by this object since the worker process started, by status class (`code` label)
haproxy_backend_internal_errors_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of internal errors since process started
haproxy_backend_last_session_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	How long ago some traffic was seen on this object on this worker process, in seconds
haproxy_backend_limit_sessions	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Frontend/listener/server’s maxconn, backend’s fullconn
haproxy_backend_loadbalanced_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of requests routed by load balancing since the worker process started (ignores queue pop and stickiness)
haproxy_backend_max_connect_time_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Maximum observed time spent waiting for a connection to complete
haproxy_backend_max_queue	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Highest value of queued connections encountered since process started
haproxy_backend_max_queue_time_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Maximum observed time spent in the queue
haproxy_backend_max_response_time_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Maximum observed time spent waiting for a server response
haproxy_backend_max_session_rate	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Highest value of sessions per second observed since the worker process started
haproxy_backend_max_sessions	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Highest value of current sessions encountered since process started
haproxy_backend_max_total_time_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Maximum observed total request+response time (request+queue+connect+response+processing)
haproxy_backend_queue_time_average_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Avg. queue time for last 1024 successful connections.
haproxy_backend_redispatch_warnings_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of server redispatches due to connection failures since the worker process started
haproxy_backend_requests_denied_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of denied requests since process started
haproxy_backend_response_errors_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of invalid responses since the worker process started
haproxy_backend_response_time_average_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Avg. response time for last 1024 successful connections.
haproxy_backend_responses_denied_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of denied responses since process started
haproxy_backend_retry_warnings_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of server connection retries since the worker process started
haproxy_backend_server_aborts_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of requests or connections aborted by the server since the worker process started
haproxy_backend_sessions_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of sessions since process started
haproxy_backend_status	gauge	`state`, `proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Current status of the service, per state label value.
haproxy_backend_total_time_average_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Avg. total time for last 1024 successful connections.
haproxy_backend_uweight	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Server’s user weight, or sum of active servers’ user weights for a backend
haproxy_backend_weight	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Server’s effective weight, or sum of active servers’ effective weights for a backend
haproxy_frontend_bytes_in_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of request bytes since process started
haproxy_frontend_bytes_out_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of response bytes since process started
haproxy_frontend_connections_rate_max	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Highest value of connections per second observed since the worker process started
haproxy_frontend_connections_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of new connections accepted on this frontend since the worker process started
haproxy_frontend_current_sessions	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Number of current sessions on the frontend, backend or server
haproxy_frontend_denied_connections_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of incoming connections blocked on a listener/frontend by a tcp-request connection rule since the worker process started
haproxy_frontend_denied_sessions_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of incoming sessions blocked on a listener/frontend by a tcp-request connection rule since the worker process started
haproxy_frontend_failed_header_rewriting_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of failed HTTP header rewrites since the worker process started
haproxy_frontend_http_cache_hits_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of HTTP requests found in the cache on this frontend/backend since the worker process started
haproxy_frontend_http_cache_lookups_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of HTTP requests looked up in the cache on this frontend/backend since the worker process started
haproxy_frontend_http_comp_bytes_bypassed_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of bytes that bypassed HTTP compression for this object since the worker process started (CPU/memory/bandwidth limitation)
haproxy_frontend_http_comp_bytes_in_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of bytes submitted to the HTTP compressor for this object since the worker process started
haproxy_frontend_http_comp_bytes_out_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of bytes emitted by the HTTP compressor for this object since the worker process started
haproxy_frontend_http_comp_responses_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of HTTP responses that were compressed for this object since the worker process started
haproxy_frontend_http_requests_rate_max	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Highest value of http requests observed since the worker process started
haproxy_frontend_http_requests_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of HTTP requests processed by this object since the worker process started
haproxy_frontend_http_responses_total	counter	`ip`, `proxy`, `ins`, `code`, `job`, `instance`, `cls`	Total number of HTTP responses returned by this object since the worker process started, by status class (`code` label)
haproxy_frontend_intercepted_requests_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of HTTP requests intercepted on the frontend (redirects/stats/services) since the worker process started
haproxy_frontend_internal_errors_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of internal errors since process started
haproxy_frontend_limit_session_rate	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Limit on the number of sessions accepted in a second (frontend only, ‘rate-limit sessions’ setting)
haproxy_frontend_limit_sessions	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Frontend/listener/server’s maxconn, backend’s fullconn
haproxy_frontend_max_session_rate	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Highest value of sessions per second observed since the worker process started
haproxy_frontend_max_sessions	gauge	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Highest value of current sessions encountered since process started
haproxy_frontend_request_errors_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of invalid requests since process started
haproxy_frontend_requests_denied_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of denied requests since process started
haproxy_frontend_responses_denied_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of denied responses since process started
haproxy_frontend_sessions_total	counter	`proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Total number of sessions since process started
haproxy_frontend_status	gauge	`state`, `proxy`, `instance`, `ins`, `job`, `ip`, `cls`	Current status of the service, per state label value.
haproxy_process_active_peers	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Current number of verified active peers connections on the current worker process
haproxy_process_build_info	gauge	`version`, `instance`, `ins`, `job`, `ip`, `cls`	Build info
haproxy_process_busy_polling_enabled	gauge	`instance`, `ins`, `job`, `ip`, `cls`	1 if busy-polling is currently in use on the worker process, otherwise zero (config.busy-polling)
haproxy_process_bytes_out_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes emitted by current worker process over the last second
haproxy_process_bytes_out_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of bytes emitted by current worker process since started
haproxy_process_connected_peers	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Current number of peers having passed the connection step on the current worker process
haproxy_process_connections_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of connections on this worker process since started
haproxy_process_current_backend_ssl_key_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of SSL keys created on backends in this worker process over the last second
haproxy_process_current_connection_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of front connections created on this worker process over the last second
haproxy_process_current_connections	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Current number of connections on this worker process
haproxy_process_current_frontend_ssl_key_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of SSL keys created on frontends in this worker process over the last second
haproxy_process_current_run_queue	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Total number of active tasks+tasklets in the current worker process
haproxy_process_current_session_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of sessions created on this worker process over the last second
haproxy_process_current_ssl_connections	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Current number of SSL endpoints on this worker process (front+back)
haproxy_process_current_ssl_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of SSL connections created on this worker process over the last second
haproxy_process_current_tasks	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Total number of tasks in the current worker process (active + sleeping)
haproxy_process_current_zlib_memory	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Amount of memory currently used by HTTP compression on the current worker process (in bytes)
haproxy_process_dropped_logs_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of dropped logs for current worker process since started
haproxy_process_failed_resolutions	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of failed DNS resolutions in current worker process since started
haproxy_process_frontend_ssl_reuse	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Percent of frontend SSL connections which did not require a new key
haproxy_process_hard_max_connections	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Hard limit on the number of per-process connections (imposed by Memmax_MB or Ulimit-n)
haproxy_process_http_comp_bytes_in_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes submitted to the HTTP compressor in this worker process over the last second
haproxy_process_http_comp_bytes_out_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Number of bytes emitted by the HTTP compressor in this worker process over the last second
haproxy_process_idle_time_percent	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Percentage of last second spent waiting in the current worker thread
haproxy_process_jobs	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Current number of active jobs on the current worker process (frontend connections, master connections, listeners)
haproxy_process_limit_connection_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Hard limit for ConnRate (global.maxconnrate)
haproxy_process_limit_http_comp	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Limit of CompressBpsOut beyond which HTTP compression is automatically disabled
haproxy_process_limit_session_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Hard limit for SessRate (global.maxsessrate)
haproxy_process_limit_ssl_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Hard limit for SslRate (global.maxsslrate)
haproxy_process_listeners	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Current number of active listeners on the current worker process
haproxy_process_max_backend_ssl_key_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Highest SslBackendKeyRate reached on this worker process since started (in SSL keys per second)
haproxy_process_max_connection_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Highest ConnRate reached on this worker process since started (in connections per second)
haproxy_process_max_connections	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Hard limit on the number of per-process connections (configured or imposed by Ulimit-n)
haproxy_process_max_fds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Hard limit on the number of per-process file descriptors
haproxy_process_max_frontend_ssl_key_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Highest SslFrontendKeyRate reached on this worker process since started (in SSL keys per second)
haproxy_process_max_memory_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Worker process’s hard limit on memory usage in bytes (-m on command line)
haproxy_process_max_pipes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Hard limit on the number of pipes for splicing, 0=unlimited
haproxy_process_max_session_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Highest SessRate reached on this worker process since started (in sessions per second)
haproxy_process_max_sockets	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Hard limit on the number of per-process sockets
haproxy_process_max_ssl_connections	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Hard limit on the number of per-process SSL endpoints (front+back), 0=unlimited
haproxy_process_max_ssl_rate	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Highest SslRate reached on this worker process since started (in connections per second)
haproxy_process_max_zlib_memory	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Limit on the amount of memory used by HTTP compression above which it is automatically disabled (in bytes, see global.maxzlibmem)
haproxy_process_nbproc	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of started worker processes (historical, always 1)
haproxy_process_nbthread	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of started threads (global.nbthread)
haproxy_process_pipes_free_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Current number of allocated and available pipes in this worker process
haproxy_process_pipes_used_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Current number of pipes in use in this worker process
haproxy_process_pool_allocated_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Amount of memory allocated in pools (in bytes)
haproxy_process_pool_failures_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Number of failed pool allocations since this worker was started
haproxy_process_pool_used_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Amount of pool memory currently used (in bytes)
haproxy_process_recv_logs_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of log messages received by log-forwarding listeners on this worker process since started
haproxy_process_relative_process_id	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Relative worker process number (1)
haproxy_process_requests_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of requests on this worker process since started
haproxy_process_spliced_bytes_out_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of bytes emitted by current worker process through a kernel pipe since started
haproxy_process_ssl_cache_lookups_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of SSL session ID lookups in the SSL session cache on this worker since started
haproxy_process_ssl_cache_misses_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of SSL session ID lookups that didn’t find a session in the SSL session cache on this worker since started
haproxy_process_ssl_connections_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of SSL endpoints on this worker process since started (front+back)
haproxy_process_start_time_seconds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Start time in seconds
haproxy_process_stopping	gauge	`instance`, `ins`, `job`, `ip`, `cls`	1 if the worker process is currently stopping, otherwise zero
haproxy_process_unstoppable_jobs	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Current number of unstoppable jobs on the current worker process (master connections)
haproxy_process_uptime_seconds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	How long ago this worker process was started (seconds)
haproxy_server_bytes_in_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of request bytes since process started
haproxy_server_bytes_out_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of response bytes since process started
haproxy_server_check_code	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Layer 5-7 code of the last health check, if available.
haproxy_server_check_duration_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total duration of the latest server health check, in seconds.
haproxy_server_check_failures_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of failed individual health checks per server/backend, since the worker process started
haproxy_server_check_last_change_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	How long ago the last server state changed, in seconds
haproxy_server_check_status	gauge	`state`, `proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Status of last health check, per state label value.
haproxy_server_check_up_down_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of failed checks causing UP to DOWN server transitions, per server/backend, since the worker process started
haproxy_server_client_aborts_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of requests or connections aborted by the client since the worker process started
haproxy_server_connect_time_average_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Avg. connect time for last 1024 successful connections.
haproxy_server_connection_attempts_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of outgoing connection attempts on this backend/server since the worker process started
haproxy_server_connection_errors_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of failed connections to server since the worker process started
haproxy_server_connection_reuses_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of reused connections on this backend/server since the worker process started
haproxy_server_current_queue	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Number of current queued connections
haproxy_server_current_sessions	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Number of current sessions on the frontend, backend or server
haproxy_server_current_throttle	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Throttling ratio applied to a server’s maxconn and weight during the slowstart period (0 to 100%)
haproxy_server_downtime_seconds_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total time spent in DOWN state, for server or backend
haproxy_server_failed_header_rewriting_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of failed HTTP header rewrites since the worker process started
haproxy_server_idle_connections_current	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Current number of idle connections available for reuse on this server
haproxy_server_idle_connections_limit	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Limit on the number of available idle connections on this server (server ‘pool_max_conn’ directive)
haproxy_server_internal_errors_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of internal errors since process started
haproxy_server_last_session_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	How long ago some traffic was seen on this object on this worker process, in seconds
haproxy_server_limit_sessions	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Frontend/listener/server’s maxconn, backend’s fullconn
haproxy_server_loadbalanced_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of requests routed by load balancing since the worker process started (ignores queue pop and stickiness)
haproxy_server_max_connect_time_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Maximum observed time spent waiting for a connection to complete
haproxy_server_max_queue	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Highest value of queued connections encountered since process started
haproxy_server_max_queue_time_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Maximum observed time spent in the queue
haproxy_server_max_response_time_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Maximum observed time spent waiting for a server response
haproxy_server_max_session_rate	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Highest value of sessions per second observed since the worker process started
haproxy_server_max_sessions	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Highest value of current sessions encountered since process started
haproxy_server_max_total_time_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Maximum observed total request+response time (request+queue+connect+response+processing)
haproxy_server_need_connections_current	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Estimated needed number of connections
haproxy_server_queue_limit	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Limit on the number of connections in queue, for servers only (maxqueue argument)
haproxy_server_queue_time_average_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Avg. queue time for last 1024 successful connections.
haproxy_server_redispatch_warnings_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of server redispatches due to connection failures since the worker process started
haproxy_server_response_errors_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of invalid responses since the worker process started
haproxy_server_response_time_average_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Avg. response time for last 1024 successful connections.
haproxy_server_responses_denied_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of denied responses since process started
haproxy_server_retry_warnings_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of server connection retries since the worker process started
haproxy_server_safe_idle_connections_current	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Current number of safe idle connections
haproxy_server_server_aborts_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of requests or connections aborted by the server since the worker process started
haproxy_server_sessions_total	counter	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Total number of sessions since process started
haproxy_server_status	gauge	`state`, `proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Current status of the service, per state label value.
haproxy_server_total_time_average_seconds	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Avg. total time for last 1024 successful connections.
haproxy_server_unsafe_idle_connections_current	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Current number of unsafe idle connections
haproxy_server_used_connections_current	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Current number of connections in use
haproxy_server_uweight	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Server’s user weight, or sum of active servers’ user weights for a backend
haproxy_server_weight	gauge	`proxy`, `instance`, `ins`, `job`, `server`, `ip`, `cls`	Server’s effective weight, or sum of active servers’ effective weights for a backend
haproxy_up	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
inflight_requests	gauge	`instance`, `ins`, `job`, `route`, `ip`, `cls`, `method`	Current number of inflight requests.
jaeger_tracer_baggage_restrictions_updates_total	Unknown	`instance`, `ins`, `job`, `result`, `ip`, `cls`	N/A
jaeger_tracer_baggage_truncations_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
jaeger_tracer_baggage_updates_total	Unknown	`instance`, `ins`, `job`, `result`, `ip`, `cls`	N/A
jaeger_tracer_finished_spans_total	Unknown	`instance`, `ins`, `job`, `sampled`, `ip`, `cls`	N/A
jaeger_tracer_reporter_queue_length	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Current number of spans in the reporter queue
jaeger_tracer_reporter_spans_total	Unknown	`instance`, `ins`, `job`, `result`, `ip`, `cls`	N/A
jaeger_tracer_sampler_queries_total	Unknown	`instance`, `ins`, `job`, `result`, `ip`, `cls`	N/A
jaeger_tracer_sampler_updates_total	Unknown	`instance`, `ins`, `job`, `result`, `ip`, `cls`	N/A
jaeger_tracer_span_context_decoding_errors_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
jaeger_tracer_started_spans_total	Unknown	`instance`, `ins`, `job`, `sampled`, `ip`, `cls`	N/A
jaeger_tracer_throttled_debug_spans_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
jaeger_tracer_throttler_updates_total	Unknown	`instance`, `ins`, `job`, `result`, `ip`, `cls`	N/A
jaeger_tracer_traces_total	Unknown	`state`, `instance`, `ins`, `job`, `sampled`, `ip`, `cls`	N/A
loki_experimental_features_in_use_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
loki_internal_log_messages_total	Unknown	`level`, `instance`, `ins`, `job`, `ip`, `cls`	N/A
loki_log_flushes_bucket	Unknown	`instance`, `ins`, `job`, `le`, `ip`, `cls`	N/A
loki_log_flushes_count	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
loki_log_flushes_sum	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
loki_log_messages_total	Unknown	`level`, `instance`, `ins`, `job`, `ip`, `cls`	N/A
loki_logql_querystats_duplicates_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
loki_logql_querystats_ingester_sent_lines_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
loki_querier_index_cache_corruptions_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
loki_querier_index_cache_encode_errors_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
loki_querier_index_cache_gets_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
loki_querier_index_cache_hits_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
loki_querier_index_cache_puts_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
net_conntrack_dialer_conn_attempted_total	counter	`ip`, `ins`, `job`, `instance`, `cls`, `dialer_name`	Total number of connections attempted by the dialer of a given name.
net_conntrack_dialer_conn_closed_total	counter	`ip`, `ins`, `job`, `instance`, `cls`, `dialer_name`	Total number of connections closed which originated from the dialer of a given name.
net_conntrack_dialer_conn_established_total	counter	`ip`, `ins`, `job`, `instance`, `cls`, `dialer_name`	Total number of connections successfully established by the dialer of a given name.
net_conntrack_dialer_conn_failed_total	counter	`ip`, `ins`, `job`, `reason`, `instance`, `cls`, `dialer_name`	Total number of connections that the dialer of a given name failed to establish.
node:cls:avail_bytes	Unknown	`job`, `cls`	N/A
node:cls:cpu_count	Unknown	`job`, `cls`	N/A
node:cls:cpu_usage	Unknown	`job`, `cls`	N/A
node:cls:cpu_usage_15m	Unknown	`job`, `cls`	N/A
node:cls:cpu_usage_1m	Unknown	`job`, `cls`	N/A
node:cls:cpu_usage_5m	Unknown	`job`, `cls`	N/A
node:cls:disk_io_bytes_rate1m	Unknown	`job`, `cls`	N/A
node:cls:disk_iops_1m	Unknown	`job`, `cls`	N/A
node:cls:disk_mreads_rate1m	Unknown	`job`, `cls`	N/A
node:cls:disk_mreads_ratio1m	Unknown	`job`, `cls`	N/A
node:cls:disk_mwrites_rate1m	Unknown	`job`, `cls`	N/A
node:cls:disk_mwrites_ratio1m	Unknown	`job`, `cls`	N/A
node:cls:disk_read_bytes_rate1m	Unknown	`job`, `cls`	N/A
node:cls:disk_reads_rate1m	Unknown	`job`, `cls`	N/A
node:cls:disk_write_bytes_rate1m	Unknown	`job`, `cls`	N/A
node:cls:disk_writes_rate1m	Unknown	`job`, `cls`	N/A
node:cls:free_bytes	Unknown	`job`, `cls`	N/A
node:cls:mem_usage	Unknown	`job`, `cls`	N/A
node:cls:network_io_bytes_rate1m	Unknown	`job`, `cls`	N/A
node:cls:network_rx_bytes_rate1m	Unknown	`job`, `cls`	N/A
node:cls:network_rx_pps1m	Unknown	`job`, `cls`	N/A
node:cls:network_tx_bytes_rate1m	Unknown	`job`, `cls`	N/A
node:cls:network_tx_pps1m	Unknown	`job`, `cls`	N/A
node:cls:size_bytes	Unknown	`job`, `cls`	N/A
node:cls:space_usage	Unknown	`job`, `cls`	N/A
node:cls:space_usage_max	Unknown	`job`, `cls`	N/A
node:cls:stdload1	Unknown	`job`, `cls`	N/A
node:cls:stdload15	Unknown	`job`, `cls`	N/A
node:cls:stdload5	Unknown	`job`, `cls`	N/A
node:cls:time_drift_max	Unknown	`job`, `cls`	N/A
node:cpu:idle_time_irate1m	Unknown	`ip`, `ins`, `job`, `cpu`, `instance`, `cls`	N/A
node:cpu:sched_timeslices_rate1m	Unknown	`ip`, `ins`, `job`, `cpu`, `instance`, `cls`	N/A
node:cpu:sched_wait_rate1m	Unknown	`ip`, `ins`, `job`, `cpu`, `instance`, `cls`	N/A
node:cpu:time_irate1m	Unknown	`ip`, `mode`, `ins`, `job`, `cpu`, `instance`, `cls`	N/A
node:cpu:total_time_irate1m	Unknown	`ip`, `ins`, `job`, `cpu`, `instance`, `cls`	N/A
node:cpu:usage	Unknown	`ip`, `ins`, `job`, `cpu`, `instance`, `cls`	N/A
node:cpu:usage_avg15m	Unknown	`ip`, `ins`, `job`, `cpu`, `instance`, `cls`	N/A
node:cpu:usage_avg1m	Unknown	`ip`, `ins`, `job`, `cpu`, `instance`, `cls`	N/A
node:cpu:usage_avg5m	Unknown	`ip`, `ins`, `job`, `cpu`, `instance`, `cls`	N/A
node:dev:disk_avg_queue_size	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_io_batch_1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_io_bytes_rate1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_io_rt_1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_io_time_rate1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_iops_1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_mreads_rate1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_mreads_ratio1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_mwrites_rate1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_mwrites_ratio1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_read_batch_1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_read_bytes_rate1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_read_rt_1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_read_time_rate1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_reads_rate1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_util_1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_write_batch_1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_write_bytes_rate1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_write_rt_1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_write_time_rate1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:disk_writes_rate1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:network_io_bytes_rate1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:network_rx_bytes_rate1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:network_rx_pps1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:network_tx_bytes_rate1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:dev:network_tx_pps1m	Unknown	`ip`, `device`, `ins`, `job`, `instance`, `cls`	N/A
node:env:avail_bytes	Unknown	`job`	N/A
node:env:cpu_count	Unknown	`job`	N/A
node:env:cpu_usage	Unknown	`job`	N/A
node:env:cpu_usage_15m	Unknown	`job`	N/A
node:env:cpu_usage_1m	Unknown	`job`	N/A
node:env:cpu_usage_5m	Unknown	`job`	N/A
node:env:device_space_usage_max	Unknown	`device`, `mountpoint`, `job`, `fstype`	N/A
node:env:free_bytes	Unknown	`job`	N/A
node:env:mem_avail	Unknown	`job`	N/A
node:env:mem_total	Unknown	`job`	N/A
node:env:mem_usage	Unknown	`job`	N/A
node:env:size_bytes	Unknown	`job`	N/A
node:env:space_usage	Unknown	`job`	N/A
node:env:stdload1	Unknown	`job`	N/A
node:env:stdload15	Unknown	`job`	N/A
node:env:stdload5	Unknown	`job`	N/A
node:fs:avail_bytes	Unknown	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	N/A
node:fs:free_bytes	Unknown	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	N/A
node:fs:inode_free	Unknown	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	N/A
node:fs:inode_total	Unknown	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	N/A
node:fs:inode_usage	Unknown	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	N/A
node:fs:inode_used	Unknown	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	N/A
node:fs:size_bytes	Unknown	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	N/A
node:fs:space_deriv1h	Unknown	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	N/A
node:fs:space_exhaust	Unknown	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	N/A
node:fs:space_predict_1d	Unknown	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	N/A
node:fs:space_usage	Unknown	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	N/A
node:ins	Unknown	`id`, `ip`, `ins`, `job`, `nodename`, `instance`, `cls`	N/A
node:ins:avail_bytes	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:cpu_count	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:cpu_usage	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:cpu_usage_15m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:cpu_usage_1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:cpu_usage_5m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:ctx_switch_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:disk_io_bytes_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:disk_iops_1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:disk_mreads_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:disk_mreads_ratio1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:disk_mwrites_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:disk_mwrites_ratio1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:disk_read_bytes_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:disk_reads_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:disk_write_bytes_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:disk_writes_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:fd_alloc_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:fd_usage	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:forks_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:free_bytes	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:inode_usage	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:interrupt_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:mem_avail	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:mem_commit_ratio	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:mem_kernel	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:mem_rss	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:mem_usage	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:network_io_bytes_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:network_rx_bytes_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:network_rx_pps1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:network_tx_bytes_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:network_tx_pps1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:pagefault_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:pagein_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:pageout_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:pgmajfault_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:sched_wait_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:size_bytes	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:space_usage_max	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:stdload1	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:stdload15	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:stdload5	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:swap_usage	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:swapin_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:swapout_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:tcp_active_opens_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:tcp_dropped_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:tcp_error	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:tcp_error_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:tcp_insegs_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:tcp_outsegs_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:tcp_overflow_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:tcp_passive_opens_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:tcp_retrans_ratio1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:tcp_retranssegs_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:tcp_segs_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:time_drift	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:udp_in_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:udp_out_rate1m	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node:ins:uptime	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node_arp_entries	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	ARP entries by device
node_boot_time_seconds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Node boot time, in unixtime.
node_context_switches_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of context switches.
node_cooling_device_cur_state	gauge	`instance`, `ins`, `job`, `type`, `ip`, `cls`	Current throttle state of the cooling device
node_cooling_device_max_state	gauge	`instance`, `ins`, `job`, `type`, `ip`, `cls`	Maximum throttle state of the cooling device
node_cpu_guest_seconds_total	counter	`ip`, `mode`, `ins`, `job`, `cpu`, `instance`, `cls`	Seconds the CPUs spent in guests (VMs) for each mode.
node_cpu_seconds_total	counter	`ip`, `mode`, `ins`, `job`, `cpu`, `instance`, `cls`	Seconds the CPUs spent in each mode.
node_disk_discard_time_seconds_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	This is the total number of seconds spent by all discards.
node_disk_discarded_sectors_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	The total number of sectors discarded successfully.
node_disk_discards_completed_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	The total number of discards completed successfully.
node_disk_discards_merged_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	The total number of discards merged.
node_disk_filesystem_info	gauge	`ip`, `usage`, `version`, `device`, `uuid`, `ins`, `type`, `job`, `instance`, `cls`	Info about disk filesystem.
node_disk_info	gauge	`minor`, `ip`, `major`, `revision`, `device`, `model`, `serial`, `path`, `ins`, `job`, `instance`, `cls`	Info of /sys/block/<block_device>.
node_disk_io_now	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	The number of I/Os currently in progress.
node_disk_io_time_seconds_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Total seconds spent doing I/Os.
node_disk_io_time_weighted_seconds_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	The weighted number of seconds spent doing I/Os.
node_disk_read_bytes_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	The total number of bytes read successfully.
node_disk_read_time_seconds_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	The total number of seconds spent by all reads.
node_disk_reads_completed_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	The total number of reads completed successfully.
node_disk_reads_merged_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	The total number of reads merged.
node_disk_write_time_seconds_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	This is the total number of seconds spent by all writes.
node_disk_writes_completed_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	The total number of writes completed successfully.
node_disk_writes_merged_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	The number of writes merged.
node_disk_written_bytes_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	The total number of bytes written successfully.
node_dmi_info	gauge	`bios_vendor`, `ip`, `product_family`, `product_version`, `product_uuid`, `system_vendor`, `bios_version`, `ins`, `bios_date`, `cls`, `job`, `product_name`, `instance`, `chassis_version`, `chassis_vendor`, `product_serial`	A metric with a constant ‘1’ value labeled by bios_date, bios_release, bios_vendor, bios_version, board_asset_tag, board_name, board_serial, board_vendor, board_version, chassis_asset_tag, chassis_serial, chassis_vendor, chassis_version, product_family, product_name, product_serial, product_sku, product_uuid, product_version, system_vendor if provided by DMI.
node_entropy_available_bits	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Bits of available entropy.
node_entropy_pool_size_bits	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Bits of entropy pool.
node_exporter_build_info	gauge	`ip`, `version`, `revision`, `goversion`, `branch`, `ins`, `goarch`, `job`, `tags`, `instance`, `cls`, `goos`	A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which node_exporter was built, and the goos and goarch for the build.
node_filefd_allocated	gauge	`instance`, `ins`, `job`, `ip`, `cls`	File descriptor statistics: allocated.
node_filefd_maximum	gauge	`instance`, `ins`, `job`, `ip`, `cls`	File descriptor statistics: maximum.
node_filesystem_avail_bytes	gauge	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	Filesystem space available to non-root users in bytes.
node_filesystem_device_error	gauge	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	Whether an error occurred while getting statistics for the given device.
node_filesystem_files	gauge	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	Filesystem total file nodes.
node_filesystem_files_free	gauge	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	Filesystem total free file nodes.
node_filesystem_free_bytes	gauge	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	Filesystem free space in bytes.
node_filesystem_readonly	gauge	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	Filesystem read-only status.
node_filesystem_size_bytes	gauge	`ip`, `device`, `mountpoint`, `ins`, `cls`, `job`, `instance`, `fstype`	Filesystem size in bytes.
node_forks_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of forks.
node_hwmon_chip_names	gauge	`chip_name`, `ip`, `ins`, `chip`, `job`, `instance`, `cls`	Annotation metric for human-readable chip names
node_hwmon_energy_joule_total	counter	`sensor`, `ip`, `ins`, `chip`, `job`, `instance`, `cls`	Hardware monitor for joules used so far (input)
node_hwmon_sensor_label	gauge	`sensor`, `ip`, `ins`, `chip`, `job`, `label`, `instance`, `cls`	Label for given chip and sensor
node_intr_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of interrupts serviced.
node_ipvs_connections_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The total number of connections made.
node_ipvs_incoming_bytes_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The total amount of incoming data.
node_ipvs_incoming_packets_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The total number of incoming packets.
node_ipvs_outgoing_bytes_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The total amount of outgoing data.
node_ipvs_outgoing_packets_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The total number of outgoing packets.
node_load1	gauge	`instance`, `ins`, `job`, `ip`, `cls`	1m load average.
node_load15	gauge	`instance`, `ins`, `job`, `ip`, `cls`	15m load average.
node_load5	gauge	`instance`, `ins`, `job`, `ip`, `cls`	5m load average.
node_memory_Active_anon_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Active_anon_bytes.
node_memory_Active_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Active_bytes.
node_memory_Active_file_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Active_file_bytes.
node_memory_AnonHugePages_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field AnonHugePages_bytes.
node_memory_AnonPages_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field AnonPages_bytes.
node_memory_Bounce_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Bounce_bytes.
node_memory_Buffers_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Buffers_bytes.
node_memory_Cached_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Cached_bytes.
node_memory_CommitLimit_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field CommitLimit_bytes.
node_memory_Committed_AS_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Committed_AS_bytes.
node_memory_DirectMap1G_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field DirectMap1G_bytes.
node_memory_DirectMap2M_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field DirectMap2M_bytes.
node_memory_DirectMap4k_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field DirectMap4k_bytes.
node_memory_Dirty_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Dirty_bytes.
node_memory_FileHugePages_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field FileHugePages_bytes.
node_memory_FilePmdMapped_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field FilePmdMapped_bytes.
node_memory_HardwareCorrupted_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field HardwareCorrupted_bytes.
node_memory_HugePages_Free	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field HugePages_Free.
node_memory_HugePages_Rsvd	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field HugePages_Rsvd.
node_memory_HugePages_Surp	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field HugePages_Surp.
node_memory_HugePages_Total	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field HugePages_Total.
node_memory_Hugepagesize_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Hugepagesize_bytes.
node_memory_Hugetlb_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Hugetlb_bytes.
node_memory_Inactive_anon_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Inactive_anon_bytes.
node_memory_Inactive_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Inactive_bytes.
node_memory_Inactive_file_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Inactive_file_bytes.
node_memory_KReclaimable_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field KReclaimable_bytes.
node_memory_KernelStack_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field KernelStack_bytes.
node_memory_Mapped_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Mapped_bytes.
node_memory_MemAvailable_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field MemAvailable_bytes.
node_memory_MemFree_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field MemFree_bytes.
node_memory_MemTotal_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field MemTotal_bytes.
node_memory_Mlocked_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Mlocked_bytes.
node_memory_NFS_Unstable_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field NFS_Unstable_bytes.
node_memory_PageTables_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field PageTables_bytes.
node_memory_Percpu_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Percpu_bytes.
node_memory_SReclaimable_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field SReclaimable_bytes.
node_memory_SUnreclaim_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field SUnreclaim_bytes.
node_memory_ShmemHugePages_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field ShmemHugePages_bytes.
node_memory_ShmemPmdMapped_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field ShmemPmdMapped_bytes.
node_memory_Shmem_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Shmem_bytes.
node_memory_Slab_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Slab_bytes.
node_memory_SwapCached_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field SwapCached_bytes.
node_memory_SwapFree_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field SwapFree_bytes.
node_memory_SwapTotal_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field SwapTotal_bytes.
node_memory_Unevictable_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Unevictable_bytes.
node_memory_VmallocChunk_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field VmallocChunk_bytes.
node_memory_VmallocTotal_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field VmallocTotal_bytes.
node_memory_VmallocUsed_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field VmallocUsed_bytes.
node_memory_WritebackTmp_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field WritebackTmp_bytes.
node_memory_Writeback_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Memory information field Writeback_bytes.
node_netstat_Icmp6_InErrors	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic Icmp6InErrors.
node_netstat_Icmp6_InMsgs	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic Icmp6InMsgs.
node_netstat_Icmp6_OutMsgs	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic Icmp6OutMsgs.
node_netstat_Icmp_InErrors	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic IcmpInErrors.
node_netstat_Icmp_InMsgs	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic IcmpInMsgs.
node_netstat_Icmp_OutMsgs	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic IcmpOutMsgs.
node_netstat_Ip6_InOctets	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic Ip6InOctets.
node_netstat_Ip6_OutOctets	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic Ip6OutOctets.
node_netstat_IpExt_InOctets	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic IpExtInOctets.
node_netstat_IpExt_OutOctets	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic IpExtOutOctets.
node_netstat_Ip_Forwarding	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic IpForwarding.
node_netstat_TcpExt_ListenDrops	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpExtListenDrops.
node_netstat_TcpExt_ListenOverflows	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpExtListenOverflows.
node_netstat_TcpExt_SyncookiesFailed	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpExtSyncookiesFailed.
node_netstat_TcpExt_SyncookiesRecv	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpExtSyncookiesRecv.
node_netstat_TcpExt_SyncookiesSent	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpExtSyncookiesSent.
node_netstat_TcpExt_TCPSynRetrans	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpExtTCPSynRetrans.
node_netstat_TcpExt_TCPTimeouts	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpExtTCPTimeouts.
node_netstat_Tcp_ActiveOpens	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpActiveOpens.
node_netstat_Tcp_CurrEstab	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpCurrEstab.
node_netstat_Tcp_InErrs	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpInErrs.
node_netstat_Tcp_InSegs	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpInSegs.
node_netstat_Tcp_OutRsts	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpOutRsts.
node_netstat_Tcp_OutSegs	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpOutSegs.
node_netstat_Tcp_PassiveOpens	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpPassiveOpens.
node_netstat_Tcp_RetransSegs	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic TcpRetransSegs.
node_netstat_Udp6_InDatagrams	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic Udp6InDatagrams.
node_netstat_Udp6_InErrors	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic Udp6InErrors.
node_netstat_Udp6_NoPorts	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic Udp6NoPorts.
node_netstat_Udp6_OutDatagrams	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic Udp6OutDatagrams.
node_netstat_Udp6_RcvbufErrors	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic Udp6RcvbufErrors.
node_netstat_Udp6_SndbufErrors	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic Udp6SndbufErrors.
node_netstat_UdpLite6_InErrors	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic UdpLite6InErrors.
node_netstat_UdpLite_InErrors	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic UdpLiteInErrors.
node_netstat_Udp_InDatagrams	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic UdpInDatagrams.
node_netstat_Udp_InErrors	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic UdpInErrors.
node_netstat_Udp_NoPorts	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic UdpNoPorts.
node_netstat_Udp_OutDatagrams	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic UdpOutDatagrams.
node_netstat_Udp_RcvbufErrors	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic UdpRcvbufErrors.
node_netstat_Udp_SndbufErrors	unknown	`instance`, `ins`, `job`, `ip`, `cls`	Statistic UdpSndbufErrors.
node_network_address_assign_type	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: address_assign_type
node_network_carrier	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: carrier
node_network_carrier_changes_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: carrier_changes_total
node_network_carrier_down_changes_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: carrier_down_changes_total
node_network_carrier_up_changes_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: carrier_up_changes_total
node_network_device_id	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: device_id
node_network_dormant	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: dormant
node_network_flags	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: flags
node_network_iface_id	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: iface_id
node_network_iface_link	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: iface_link
node_network_iface_link_mode	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: iface_link_mode
node_network_info	gauge	`broadcast`, `ip`, `device`, `operstate`, `ins`, `job`, `adminstate`, `duplex`, `address`, `instance`, `cls`	Non-numeric data from /sys/class/net/, value is always 1.
node_network_mtu_bytes	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: mtu_bytes
node_network_name_assign_type	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: name_assign_type
node_network_net_dev_group	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: net_dev_group
node_network_protocol_type	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: protocol_type
node_network_receive_bytes_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic receive_bytes.
node_network_receive_compressed_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic receive_compressed.
node_network_receive_drop_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic receive_drop.
node_network_receive_errs_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic receive_errs.
node_network_receive_fifo_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic receive_fifo.
node_network_receive_frame_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic receive_frame.
node_network_receive_multicast_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic receive_multicast.
node_network_receive_nohandler_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic receive_nohandler.
node_network_receive_packets_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic receive_packets.
node_network_speed_bytes	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: speed_bytes
node_network_transmit_bytes_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic transmit_bytes.
node_network_transmit_carrier_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic transmit_carrier.
node_network_transmit_colls_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic transmit_colls.
node_network_transmit_compressed_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic transmit_compressed.
node_network_transmit_drop_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic transmit_drop.
node_network_transmit_errs_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic transmit_errs.
node_network_transmit_fifo_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic transmit_fifo.
node_network_transmit_packets_total	counter	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device statistic transmit_packets.
node_network_transmit_queue_length	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Network device property: transmit_queue_length
node_network_up	gauge	`ip`, `device`, `ins`, `job`, `instance`, `cls`	Value is 1 if operstate is ‘up’, 0 otherwise.
node_nf_conntrack_entries	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of currently allocated flow entries for connection tracking.
node_nf_conntrack_entries_limit	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Maximum size of connection tracking table.
node_nf_conntrack_stat_drop	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of packets dropped due to conntrack failure.
node_nf_conntrack_stat_early_drop	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of dropped conntrack entries to make room for new ones, if maximum table size was reached.
node_nf_conntrack_stat_found	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of searched entries which were successful.
node_nf_conntrack_stat_ignore	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of packets seen which are already connected to a conntrack entry.
node_nf_conntrack_stat_insert	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of entries inserted into the list.
node_nf_conntrack_stat_insert_failed	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of entries for which list insertion was attempted but failed.
node_nf_conntrack_stat_invalid	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of packets seen which can not be tracked.
node_nf_conntrack_stat_search_restart	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of conntrack table lookups which had to be restarted due to hashtable resizes.
node_os_info	gauge	`id`, `ip`, `version`, `version_id`, `ins`, `instance`, `job`, `pretty_name`, `id_like`, `cls`	A metric with a constant ‘1’ value labeled by build_id, id, id_like, image_id, image_version, name, pretty_name, variant, variant_id, version, version_codename, version_id.
node_os_version	gauge	`id`, `ip`, `ins`, `instance`, `job`, `id_like`, `cls`	Metric containing the major.minor part of the OS version.
node_processes_max_processes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Maximum number of PIDs allowed (PID limit).
node_processes_max_threads	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Limit of threads in the system
node_processes_pids	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of PIDs
node_processes_state	gauge	`state`, `instance`, `ins`, `job`, `ip`, `cls`	Number of processes in each state.
node_processes_threads	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Allocated threads in system
node_processes_threads_state	gauge	`instance`, `ins`, `job`, `thread_state`, `ip`, `cls`	Number of threads in each state.
node_procs_blocked	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of processes blocked waiting for I/O to complete.
node_procs_running	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of processes in runnable state.
node_schedstat_running_seconds_total	counter	`ip`, `ins`, `job`, `cpu`, `instance`, `cls`	Number of seconds CPU spent running a process.
node_schedstat_timeslices_total	counter	`ip`, `ins`, `job`, `cpu`, `instance`, `cls`	Number of timeslices executed by CPU.
node_schedstat_waiting_seconds_total	counter	`ip`, `ins`, `job`, `cpu`, `instance`, `cls`	Number of seconds spent by processes waiting for this CPU.
node_scrape_collector_duration_seconds	gauge	`ip`, `collector`, `ins`, `job`, `instance`, `cls`	node_exporter: Duration of a collector scrape.
node_scrape_collector_success	gauge	`ip`, `collector`, `ins`, `job`, `instance`, `cls`	node_exporter: Whether a collector succeeded.
node_selinux_enabled	gauge	`instance`, `ins`, `job`, `ip`, `cls`	SELinux is enabled, 1 is true, 0 is false
node_sockstat_FRAG6_inuse	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of FRAG6 sockets in state inuse.
node_sockstat_FRAG6_memory	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of FRAG6 sockets in state memory.
node_sockstat_FRAG_inuse	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of FRAG sockets in state inuse.
node_sockstat_FRAG_memory	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of FRAG sockets in state memory.
node_sockstat_RAW6_inuse	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of RAW6 sockets in state inuse.
node_sockstat_RAW_inuse	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of RAW sockets in state inuse.
node_sockstat_TCP6_inuse	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of TCP6 sockets in state inuse.
node_sockstat_TCP_alloc	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of TCP sockets in state alloc.
node_sockstat_TCP_inuse	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of TCP sockets in state inuse.
node_sockstat_TCP_mem	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of TCP sockets in state mem.
node_sockstat_TCP_mem_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of TCP sockets in state mem_bytes.
node_sockstat_TCP_orphan	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of TCP sockets in state orphan.
node_sockstat_TCP_tw	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of TCP sockets in state tw.
node_sockstat_UDP6_inuse	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of UDP6 sockets in state inuse.
node_sockstat_UDPLITE6_inuse	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of UDPLITE6 sockets in state inuse.
node_sockstat_UDPLITE_inuse	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of UDPLITE sockets in state inuse.
node_sockstat_UDP_inuse	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of UDP sockets in state inuse.
node_sockstat_UDP_mem	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of UDP sockets in state mem.
node_sockstat_UDP_mem_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of UDP sockets in state mem_bytes.
node_sockstat_sockets_used	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of IPv4 sockets in use.
node_tcp_connection_states	gauge	`state`, `instance`, `ins`, `job`, `ip`, `cls`	Number of connection states.
node_textfile_scrape_error	gauge	`instance`, `ins`, `job`, `ip`, `cls`	1 if there was an error opening or reading a file, 0 otherwise
node_time_clocksource_available_info	gauge	`ip`, `device`, `ins`, `clocksource`, `job`, `instance`, `cls`	Available clocksources read from ‘/sys/devices/system/clocksource’.
node_time_clocksource_current_info	gauge	`ip`, `device`, `ins`, `clocksource`, `job`, `instance`, `cls`	Current clocksource read from ‘/sys/devices/system/clocksource’.
node_time_seconds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	System time in seconds since epoch (1970).
node_time_zone_offset_seconds	gauge	`instance`, `ins`, `job`, `time_zone`, `ip`, `cls`	System time zone offset in seconds.
node_timex_estimated_error_seconds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Estimated error in seconds.
node_timex_frequency_adjustment_ratio	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Local clock frequency adjustment.
node_timex_loop_time_constant	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Phase-locked loop time constant.
node_timex_maxerror_seconds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Maximum error in seconds.
node_timex_offset_seconds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Time offset between the local system and the reference clock.
node_timex_pps_calibration_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Pulse per second count of calibration intervals.
node_timex_pps_error_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Pulse per second count of calibration errors.
node_timex_pps_frequency_hertz	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Pulse per second frequency.
node_timex_pps_jitter_seconds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Pulse per second jitter.
node_timex_pps_jitter_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Pulse per second count of jitter limit exceeded events.
node_timex_pps_shift_seconds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Pulse per second interval duration.
node_timex_pps_stability_exceeded_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Pulse per second count of stability limit exceeded events.
node_timex_pps_stability_hertz	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Pulse per second stability, average of recent frequency changes.
node_timex_status	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Value of the status array bits.
node_timex_sync_status	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Is clock synchronized to a reliable server (1 = yes, 0 = no).
node_timex_tai_offset_seconds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	International Atomic Time (TAI) offset.
node_timex_tick_seconds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Seconds between clock ticks.
node_udp_queues	gauge	`ip`, `queue`, `ins`, `job`, `exported_ip`, `instance`, `cls`	Amount of memory allocated in the kernel for UDP datagrams, in bytes.
node_uname_info	gauge	`ip`, `sysname`, `version`, `domainname`, `release`, `ins`, `job`, `nodename`, `instance`, `cls`, `machine`	Labeled system information as provided by the uname system call.
node_up	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
node_vmstat_oom_kill	unknown	`instance`, `ins`, `job`, `ip`, `cls`	/proc/vmstat information field oom_kill.
node_vmstat_pgfault	unknown	`instance`, `ins`, `job`, `ip`, `cls`	/proc/vmstat information field pgfault.
node_vmstat_pgmajfault	unknown	`instance`, `ins`, `job`, `ip`, `cls`	/proc/vmstat information field pgmajfault.
node_vmstat_pgpgin	unknown	`instance`, `ins`, `job`, `ip`, `cls`	/proc/vmstat information field pgpgin.
node_vmstat_pgpgout	unknown	`instance`, `ins`, `job`, `ip`, `cls`	/proc/vmstat information field pgpgout.
node_vmstat_pswpin	unknown	`instance`, `ins`, `job`, `ip`, `cls`	/proc/vmstat information field pswpin.
node_vmstat_pswpout	unknown	`instance`, `ins`, `job`, `ip`, `cls`	/proc/vmstat information field pswpout.
process_cpu_seconds_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total user and system CPU time spent in seconds.
process_max_fds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Maximum number of open file descriptors.
process_open_fds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of open file descriptors.
process_resident_memory_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Resident memory size in bytes.
process_start_time_seconds	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Virtual memory size in bytes.
process_virtual_memory_max_bytes	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Maximum amount of virtual memory available in bytes.
prometheus_remote_storage_exemplars_in_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Exemplars in to remote storage, compare to exemplars out for queue managers.
prometheus_remote_storage_histograms_in_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	HistogramSamples in to remote storage, compare to histograms out for queue managers.
prometheus_remote_storage_samples_in_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Samples in to remote storage, compare to samples out for queue managers.
prometheus_remote_storage_string_interner_zero_reference_releases_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The number of times release has been called for strings that are not interned.
prometheus_sd_azure_failures_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Number of Azure service discovery refresh failures.
prometheus_sd_consul_rpc_duration_seconds	summary	`ip`, `call`, `quantile`, `ins`, `job`, `instance`, `cls`, `endpoint`	The duration of a Consul RPC call in seconds.
prometheus_sd_consul_rpc_duration_seconds_count	Unknown	`ip`, `call`, `ins`, `job`, `instance`, `cls`, `endpoint`	N/A
prometheus_sd_consul_rpc_duration_seconds_sum	Unknown	`ip`, `call`, `ins`, `job`, `instance`, `cls`, `endpoint`	N/A
prometheus_sd_consul_rpc_failures_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The number of Consul RPC call failures.
prometheus_sd_consulagent_rpc_duration_seconds	summary	`ip`, `call`, `quantile`, `ins`, `job`, `instance`, `cls`, `endpoint`	The duration of a Consul Agent RPC call in seconds.
prometheus_sd_consulagent_rpc_duration_seconds_count	Unknown	`ip`, `call`, `ins`, `job`, `instance`, `cls`, `endpoint`	N/A
prometheus_sd_consulagent_rpc_duration_seconds_sum	Unknown	`ip`, `call`, `ins`, `job`, `instance`, `cls`, `endpoint`	N/A
prometheus_sd_consulagent_rpc_failures_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
prometheus_sd_dns_lookup_failures_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The number of DNS-SD lookup failures.
prometheus_sd_dns_lookups_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The number of DNS-SD lookups.
prometheus_sd_file_read_errors_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The number of File-SD read errors.
prometheus_sd_file_scan_duration_seconds	summary	`quantile`, `instance`, `ins`, `job`, `ip`, `cls`	The duration of the File-SD scan in seconds.
prometheus_sd_file_scan_duration_seconds_count	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
prometheus_sd_file_scan_duration_seconds_sum	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
prometheus_sd_file_watcher_errors_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The number of File-SD errors caused by filesystem watch failures.
prometheus_sd_kubernetes_events_total	counter	`ip`, `event`, `ins`, `job`, `role`, `instance`, `cls`	The number of Kubernetes events handled.
prometheus_target_scrape_pool_exceeded_label_limits_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of times scrape pools hit the label limits, during sync or config reload.
prometheus_target_scrape_pool_exceeded_target_limit_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of times scrape pools hit the target limit, during sync or config reload.
prometheus_target_scrape_pool_reloads_failed_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of failed scrape pool reloads.
prometheus_target_scrape_pool_reloads_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of scrape pool reloads.
prometheus_target_scrape_pools_failed_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of scrape pool creations that failed.
prometheus_target_scrape_pools_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of scrape pool creation attempts.
prometheus_target_scrapes_cache_flush_forced_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	How many times a scrape cache was flushed due to getting big while scrapes are failing.
prometheus_target_scrapes_exceeded_body_size_limit_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of scrapes that hit the body size limit
prometheus_target_scrapes_exceeded_sample_limit_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of scrapes that hit the sample limit and were rejected.
prometheus_target_scrapes_exemplar_out_of_order_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of exemplars rejected due to being out of the expected order.
prometheus_target_scrapes_sample_duplicate_timestamp_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of samples rejected due to duplicate timestamps but different values.
prometheus_target_scrapes_sample_out_of_bounds_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of samples rejected due to timestamp falling outside of the time bounds.
prometheus_target_scrapes_sample_out_of_order_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	Total number of samples rejected due to being out of the expected order.
prometheus_template_text_expansion_failures_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The total number of template text expansion failures.
prometheus_template_text_expansions_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The total number of template text expansions.
prometheus_treecache_watcher_goroutines	gauge	`instance`, `ins`, `job`, `ip`, `cls`	The current number of watcher goroutines.
prometheus_treecache_zookeeper_failures_total	counter	`instance`, `ins`, `job`, `ip`, `cls`	The total number of ZooKeeper failures.
promhttp_metric_handler_errors_total	counter	`ip`, `cause`, `ins`, `job`, `instance`, `cls`	Total number of internal errors encountered by the promhttp metric handler.
promhttp_metric_handler_requests_in_flight	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Current number of scrapes being served.
promhttp_metric_handler_requests_total	counter	`ip`, `ins`, `code`, `job`, `instance`, `cls`	Total number of scrapes by HTTP status code.
promtail_batch_retries_total	Unknown	`host`, `ip`, `ins`, `job`, `instance`, `cls`	N/A
promtail_build_info	gauge	`ip`, `version`, `revision`, `goversion`, `branch`, `ins`, `goarch`, `job`, `tags`, `instance`, `cls`, `goos`	A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which promtail was built, and the goos and goarch for the build.
promtail_config_reload_fail_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
promtail_config_reload_success_total	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
promtail_dropped_bytes_total	Unknown	`host`, `ip`, `ins`, `job`, `reason`, `instance`, `cls`	N/A
promtail_dropped_entries_total	Unknown	`host`, `ip`, `ins`, `job`, `reason`, `instance`, `cls`	N/A
promtail_encoded_bytes_total	Unknown	`host`, `ip`, `ins`, `job`, `instance`, `cls`	N/A
promtail_file_bytes_total	gauge	`path`, `instance`, `ins`, `job`, `ip`, `cls`	Number of bytes total.
promtail_files_active_total	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of active files.
promtail_mutated_bytes_total	Unknown	`host`, `ip`, `ins`, `job`, `reason`, `instance`, `cls`	N/A
promtail_mutated_entries_total	Unknown	`host`, `ip`, `ins`, `job`, `reason`, `instance`, `cls`	N/A
promtail_read_bytes_total	gauge	`path`, `instance`, `ins`, `job`, `ip`, `cls`	Number of bytes read.
promtail_read_lines_total	Unknown	`path`, `instance`, `ins`, `job`, `ip`, `cls`	N/A
promtail_request_duration_seconds_bucket	Unknown	`host`, `ip`, `ins`, `job`, `status_code`, `le`, `instance`, `cls`	N/A
promtail_request_duration_seconds_count	Unknown	`host`, `ip`, `ins`, `job`, `status_code`, `instance`, `cls`	N/A
promtail_request_duration_seconds_sum	Unknown	`host`, `ip`, `ins`, `job`, `status_code`, `instance`, `cls`	N/A
promtail_sent_bytes_total	Unknown	`host`, `ip`, `ins`, `job`, `instance`, `cls`	N/A
promtail_sent_entries_total	Unknown	`host`, `ip`, `ins`, `job`, `instance`, `cls`	N/A
promtail_targets_active_total	gauge	`instance`, `ins`, `job`, `ip`, `cls`	Number of active total.
promtail_up	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
request_duration_seconds_bucket	Unknown	`instance`, `ins`, `job`, `status_code`, `route`, `ws`, `le`, `ip`, `cls`, `method`	N/A
request_duration_seconds_count	Unknown	`instance`, `ins`, `job`, `status_code`, `route`, `ws`, `ip`, `cls`, `method`	N/A
request_duration_seconds_sum	Unknown	`instance`, `ins`, `job`, `status_code`, `route`, `ws`, `ip`, `cls`, `method`	N/A
request_message_bytes_bucket	Unknown	`instance`, `ins`, `job`, `route`, `le`, `ip`, `cls`, `method`	N/A
request_message_bytes_count	Unknown	`instance`, `ins`, `job`, `route`, `ip`, `cls`, `method`	N/A
request_message_bytes_sum	Unknown	`instance`, `ins`, `job`, `route`, `ip`, `cls`, `method`	N/A
response_message_bytes_bucket	Unknown	`instance`, `ins`, `job`, `route`, `le`, `ip`, `cls`, `method`	N/A
response_message_bytes_count	Unknown	`instance`, `ins`, `job`, `route`, `ip`, `cls`, `method`	N/A
response_message_bytes_sum	Unknown	`instance`, `ins`, `job`, `route`, `ip`, `cls`, `method`	N/A
scrape_duration_seconds	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
scrape_samples_post_metric_relabeling	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
scrape_samples_scraped	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
scrape_series_added	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A
tcp_connections	gauge	`instance`, `ins`, `job`, `protocol`, `ip`, `cls`	Current number of accepted TCP connections.
tcp_connections_limit	gauge	`instance`, `ins`, `job`, `protocol`, `ip`, `cls`	The max number of TCP connections that can be accepted (0 means no limit).
up	Unknown	`instance`, `ins`, `job`, `ip`, `cls`	N/A

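9.8 - FAQ

Frequently asked questions about the Pigsty NODE module.

How do I configure NTP on the nodes?

Accurate time matters for every service in a production environment. If NTP is not configured yet, you can use a public NTP service, or use Chronyd on the admin node as the time reference.

If your nodes already have NTP configured, set node_ntp_enabled to false to keep the existing configuration untouched.

Otherwise, if the nodes have Internet access, you can use a public NTP service such as pool.ntp.org.

If the nodes have no Internet access, use the configuration below so that every node in the environment at least stays in sync with the admin node, or point to another NTP service on your intranet.

node_ntp_servers:                 # NTP server list in /etc/chrony.conf
  - pool cn.pool.ntp.org iburst
  - pool ${admin_ip} iburst       # if the other nodes have no Internet access, at least keep them in sync with the admin node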

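How do I force time synchronization on the nodes?

Time is synchronized with chronyc, which requires the NTP service to be configured first.

ansible all -b -a 'chronyc -a makestep'     # step the clock to sync time

Replace all with any group name or host IP address to limit the scope of execution.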

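What if a remote node cannot be reached via SSH?

If the target machine sits behind an SSH jump host, or has been customized so that a plain ssh ip does not work, use Ansible connection parameters such as ansible_port or ansible_host to specify the SSH connection details, as shown below: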

pg-test:
  vars: { pg_cluster: pg-test }
  hosts:
    10.10.10.11: {pg_seq: 1, pg_role: primary, ansible_host: node-1 }
    10.10.10.12: {pg_seq: 2, pg_role: replica, ansible_port: 22223, ansible_user: admin }
    10.10.10.13: {pg_seq: 3, pg_role: offline, ansible_port: 22224 }

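What if SSH and sudo on the remote nodes require a password?

The admin user used for deployment and changes must have ssh and sudo privileges on all nodes. Passwordless access is not strictly required.

You can pass the ssh and sudo passwords with the -k and -K options when running a playbook, and you can even run the playbook as a different user with -e ansible_user=<another_user>.

However, Pigsty strongly recommends configuring passwordless SSH login and passwordless sudo for the admin user.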

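How do I create a dedicated admin user with an existing admin user?

Run the following command as an existing admin user on the node to create the new standard admin user defined by node_admin_username.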

./node.yml -k -K -e ansible_user=<another_admin> -t node_admin

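How do I expose services with HAProxy on the nodes?

Define the services with haproxy_services in the configuration, then run node.yml -t haproxy_config,haproxy_reload to apply the change.

For an example that exposes a MinIO service this way, see: Expose MinIO Service.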

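Why are all my `/etc/yum.repos.d/*` files gone?

Pigsty builds a local software repo on the infra node that contains all dependencies. Regular nodes then reference that local repo on the infra node, following the default value local of node_repo_modules.

This design avoids Internet access during installation and makes it more stable and reliable. All original repo definition files are moved to the /etc/yum.repos.d/backup directory; copy back whatever you need.

To keep the original repo definition files while installing regular nodes, set node_repo_remove to false.

To keep the original repo definition files while building the local repo on the infra node, set repo_remove to false.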

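Why did my shell prompt change, and how do I restore it?

The shell prompt used by Pigsty is set through the PS1 environment variable, defined in /etc/profile.d/node.sh.

If you do not like it and want to change or restore it, remove this file and log in again.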

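Why did my hostname change?

Pigsty changes a node's hostname in two cases:

nodename is explicitly defined (empty by default)
the PGSQL module is declared on the node and node_id_from_pg is enabled (true by default)

If you do not want the hostname to be changed, set nodename_overwrite to false at the global, cluster, or instance level (the default is true).

See the NODE_ID section for details.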

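What compatibility issues does Tencent Cloud's OpenCloudOS have?

The softdog kernel module is not available on OpenCloudOS and must be removed from node_kernel_modules. Add the following entry to the global variables in the config file to override the default: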

node_kernel_modules: [ br_netfilter, ip_vs, ip_vs_rr, ip_vs_wrr, ip_vs_sh ]

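What are the default pitfalls on Debian?

Debian systems tend to have a few extra rough edges. For example, if the locale is reported as undefined, fix it with the following commands:

localedef -i en_US -f UTF-8 en_US.UTF-8
localectl set-locale LANG=en_US.UTF-8

In addition, many Debian images do not ship rsync by default; install it yourself with apt-get install rsync.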

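10 - Module: ETCD

Pigsty has built-in etcd support: a reliable distributed configuration store that serves as the DCS behind PostgreSQL high availability.

ETCD is a distributed, reliable key-value store for the most critical configuration data of a system.

Pigsty uses etcd as the DCS: the distributed configuration store (also called the distributed consensus service). It is essential for PostgreSQL high availability and automatic failover.

The NODE module must be installed first to bring the nodes under management before installing the ETCD module. In addition, unless you decide to use an existing external etcd cluster, you must install the ETCD module before deploying any PGSQL cluster, because patroni and vip-manager rely on etcd for high availability and L2 VIP binding.

10.1 - Cluster Configuration

Choose an etcd cluster size that fits your scenario, and provide reliable access to it.

Before deploying etcd, define an etcd cluster in the config inventory. The usual choices are:

Single node: no high availability. Suitable for development, testing, and demos, or for standalone deployments without HA that rely on external S3 backups for PITR.
Three nodes: basic high availability, tolerating one node failure. Suitable for small and medium production environments.
Five nodes: better high availability, tolerating two node failures. Suitable for large production environments.

An etcd cluster with an even number of nodes tolerates no more failures than the next smaller odd-sized cluster, and clusters with more than five nodes are uncommon, so the usual sizes are one, three, and five nodes.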
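
The sizing rule above follows from majority-quorum arithmetic. The following is a minimal Python sketch of that arithmetic, not part of Pigsty; the function names quorum and fault_tolerance are made up for illustration.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    # Compare cluster sizes 1 through 6 side by side.
    for n in range(1, 7):
        print(f"{n} members: quorum={quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")
```

Going from three to four members raises the quorum from 2 to 3 while the number of tolerated failures stays at 1, which is why an even-sized cluster adds cost without adding fault tolerance.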

Single Node

Defining a singleton etcd instance in Pigsty takes only one line of configuration:

etcd: { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }

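Every single-node config template shipped with Pigsty contains this entry. The placeholder IP address 10.10.10.10 is replaced with the IP of the current admin node by default.

Besides the IP address, the only required parameters are etcd_seq and etcd_cluster, which together uniquely identify each etcd instance.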

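Three Nodes

The three-node etcd cluster is the most common setup. It tolerates one node failure and suits small and medium production environments.

For example, Pigsty's trio (three-node) and safe (four-node) templates both use a three-node etcd cluster, as shown below:

etcd:
  hosts:
    10.10.10.10: { etcd_seq: 1 }  # etcd_seq (the etcd instance number) is a required identity parameter
    10.10.10.11: { etcd_seq: 2 }  # instance numbers are integers, usually assigned sequentially starting from 0 or 1
    10.10.10.12: { etcd_seq: 3 }  # an instance number should never change, and is never reused once assigned
  vars: # cluster-level parameters
    etcd_cluster: etcd    # the etcd cluster is named etcd by default; do not change it unless you deploy multiple etcd clusters
    etcd_safeguard: false # enable the etcd safeguard against accidental purge? consider turning it on once production is initialized
    etcd_clean: true      # force-remove existing etcd instances during initialization? useful for testing, as it makes the playbook truly idempotent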

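Five Nodes

A five-node etcd cluster tolerates two node failures and suits large production environments.

For example, Pigsty's production simulation template prod uses a five-node etcd cluster: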

etcd:
  hosts:
    10.10.10.21 : { etcd_seq: 1 }
    10.10.10.22 : { etcd_seq: 2 }
    10.10.10.23 : { etcd_seq: 3 }
    10.10.10.24 : { etcd_seq: 4 }
    10.10.10.25 : { etcd_seq: 5 }
  vars: { etcd_cluster: etcd    }

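Services That Use etcd

The services that currently use etcd are:

patroni: provides PostgreSQL high availability; the etcd endpoints are written into the Patroni config file.
vip-manager: binds an optional L2 VIP to a PostgreSQL cluster; it reads the cluster leader from etcd.

When the membership of the etcd cluster changes permanently, reload the configuration of these services so that they can still reach the etcd cluster.

To update the etcd endpoint references in patroni:

./pgsql.yml -t pg_conf                            # regenerate the patroni config
ansible all -f 1 -b -a 'systemctl reload patroni' # reload the patroni config

To update the etcd endpoint references in vip-manager (only needed if you use a PGSQL L2 VIP):

./pgsql.yml -t pg_vip_config                           # regenerate the vip-manager config
ansible all -f 1 -b -a 'systemctl restart vip-manager' # restart vip-manager to pick up the new config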

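10.2 - Parameters

The etcd module provides 10 configuration parameters for customizing the etcd cluster you need.

ETCD is a distributed, reliable key-value store for the most critical data of a system. In Pigsty, etcd is the DCS used by the HA component Patroni, which makes it essential for PostgreSQL high availability.

Pigsty uses a hard-coded default cluster group name, etcd, for the etcd cluster. It can be an existing external etcd cluster, or a new etcd cluster that Pigsty deploys with the etcd.yml playbook by default.

Parameter List

The ETCD module has 10 parameters:

Parameter	Type	Level	Comment
`etcd_seq`	int	I	etcd instance identifier, required
`etcd_cluster`	string	C	etcd cluster name, fixed to etcd by default
`etcd_safeguard`	bool	G/C/A	etcd safeguard: prevent purging a running etcd instance?
`etcd_clean`	bool	G/C/A	etcd purge switch: purge existing etcd instances during initialization?
`etcd_data`	path	C	etcd data directory, /data/etcd by default
`etcd_port`	port	C	etcd client port, 2379 by default
`etcd_peer_port`	port	C	etcd peer port, 2380 by default
`etcd_init`	enum	C	etcd initial cluster state, new or existing
`etcd_election_timeout`	int	C	etcd election timeout, 1000ms by default
`etcd_heartbeat_interval`	int	C	etcd heartbeat interval, 100ms by default

Default Parameters

The default parameters of the etcd module are defined in roles/etcd/defaults/main.yml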

#-----------------------------------------------------------------
# etcd
#-----------------------------------------------------------------
#etcd_seq: 1                      # etcd instance identifier, explicitly required
etcd_cluster: etcd                # etcd cluster & group name, etcd by default
etcd_safeguard: false             # prevent purging running etcd instance?
etcd_clean: true                  # purging existing etcd during initialization?
etcd_data: /data/etcd             # etcd data directory, /data/etcd by default
etcd_port: 2379                   # etcd client port, 2379 by default
etcd_peer_port: 2380              # etcd peer port, 2380 by default
etcd_init: new                    # etcd initial cluster state, new or existing
etcd_election_timeout: 1000       # etcd election timeout, 1000ms by default
etcd_heartbeat_interval: 100      # etcd heartbeat interval, 100ms by default

`etcd_seq`

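Name: etcd_seq, Type: int, Level: I

The etcd instance number. This parameter is required: each etcd instance must be given a unique number.

Below is an example of a 3-node etcd cluster, with numbers 1 through 3 assigned.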

etcd: # dcs service for postgres/patroni ha consensus
  hosts:  # 1 node for testing, 3 or 5 for production
    10.10.10.10: { etcd_seq: 1 }  # etcd_seq required
    10.10.10.11: { etcd_seq: 2 }  # assign from 1 ~ n
    10.10.10.12: { etcd_seq: 3 }  # odd number please
  vars: # cluster level parameter override roles/etcd
    etcd_cluster: etcd  # mark etcd cluster name etcd
    etcd_safeguard: false # safeguard against purging
    etcd_clean: true # purge etcd during init process

`etcd_cluster`

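Name: etcd_cluster, Type: string, Level: C

The etcd cluster and group name. The default is the hard-coded value etcd.

Change this parameter to use a different cluster name when you want to deploy an additional, standby etcd cluster.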

`etcd_safeguard`

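Name: etcd_safeguard, Type: bool, Level: G/C/A

Safeguard parameter: prevent purging a running etcd instance? The default is false.

When the safeguard is enabled, the etcd.yml playbook will not purge a running etcd instance.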

`etcd_clean`

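Name: etcd_clean, Type: bool, Level: G/C/A

Purge existing etcd instances during initialization? The default is true.

When enabled, the etcd.yml playbook purges running etcd instances, which makes it a truly idempotent playbook (it always wipes the existing cluster).

However, if etcd_safeguard is enabled, the playbook still aborts when it meets a running etcd instance, even with this parameter set, to avoid accidental deletion.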

`etcd_data`

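Name: etcd_data, Type: path, Level: C

The etcd data directory, /data/etcd by default.

`etcd_port`

Name: etcd_port, Type: port, Level: C

The etcd client port, 2379 by default.

`etcd_peer_port`

Name: etcd_peer_port, Type: port, Level: C

The etcd peer port, 2380 by default.

`etcd_init`

Name: etcd_init, Type: enum, Level: C

The etcd initial cluster state, either new or existing. The default is new.

By default a new standalone etcd cluster is created. Use existing when adding a new member to an existing etcd cluster.

`etcd_election_timeout`

Name: etcd_election_timeout, Type: int, Level: C

The etcd election timeout, 1000 (milliseconds) by default, that is, 1 second.

`etcd_heartbeat_interval`

Name: etcd_heartbeat_interval, Type: int, Level: C

The etcd heartbeat interval, 100 (milliseconds) by default.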

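10.3 - Playbooks

How to manage etcd clusters with the built-in ansible playbook, plus a cheat sheet of common admin commands.

The etcd module provides one default playbook, etcd.yml, for installing an etcd cluster.

`etcd.yml`

Playbook source file: etcd.yml

Running this playbook installs and configures the etcd cluster on the hard-coded group etcd, then starts the etcd service.

etcd.yml provides the following task subsets:

etcd_assert: generate the etcd identity
etcd_install: install the etcd rpm packages
etcd_clean: clean up existing etcd instances
- etcd_check: check whether an etcd instance is running
- etcd_purge: remove running etcd instances and their data
etcd_dir: create the etcd data and config directories
etcd_config: generate the etcd config
- etcd_conf: generate the main etcd config
- etcd_cert: generate the etcd SSL certificates
etcd_launch: start the etcd service
etcd_register: register etcd with prometheus

The etcd module has no dedicated removal playbook. To uninstall etcd, use the etcd_clean subtask of this playbook; see the Safeguard section for details.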

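Cheat Sheet

etcd playbook commands and shortcuts:

./etcd.yml                                      # initialize the etcd cluster
./etcd.yml -t etcd_launch                       # restart the whole etcd cluster
./etcd.yml -t etcd_clean                        # remove the whole cluster; checks for existing instances and honors the safeguard
./etcd.yml -t etcd_purge                        # force-remove the whole cluster, regardless of the safeguard
./etcd.yml -t etcd_conf                         # refresh /etc/etcd/etcd.conf with the latest state
./etcd.yml -l 10.10.10.12 -e etcd_init=existing # add a node: etcd_init=existing is mandatory, via command line or config file
./etcd.yml -l 10.10.10.12 -t etcd_purge         # remove a node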

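Safeguard

To prevent accidental deletion, the Pigsty ETCD module provides a safeguard controlled by the following parameters:

etcd_clean: true by default, that is, existing instances are purged during initialization by default.
etcd_safeguard: false by default, that is, the safeguard is off by default.

The defaults let you reset the state of an etcd cluster with the playbook, which is useful for development, testing, and emergency rebuilds of etcd clusters in production.

If these defaults have been changed and you want to purge existing instances during initialization, explicitly turn the safeguard off in the config file, or override on the command line with -e etcd_clean=true.

If you only want to purge existing instances without installing new ones, run the etcd_clean subtask directly:

./etcd.yml -l <cls> -e etcd_clean=true -t etcd_clean

If you are certain you want to destroy the etcd cluster, the blunter and more direct way is:

./etcd.yml -l <cls> -t etcd_purge

Unless you know exactly what you are doing, we do not recommend purging an etcd cluster this way.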

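10.4 - Administration

SOPs for etcd cluster management: detailed instructions for creating, destroying, scaling out, and scaling in a cluster.

Here are some common etcd admin task SOPs:

Create Cluster: how do I initialize an etcd cluster?
Destroy Cluster: how do I destroy an etcd cluster?
Environment Variables: how do I configure the etcd client to access the etcd server cluster?
Reload Config: how do I update the etcd server member list used by clients?
Add Member: how do I add a new member to an existing etcd cluster?
Remove Member: how do I remove an old member from an etcd cluster?

For more questions, see FAQ: ETCD.

Create Cluster

To create a cluster, first define the etcd cluster in the config inventory: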

etcd:
  hosts:
    10.10.10.10: { etcd_seq: 1 }
    10.10.10.11: { etcd_seq: 2 }
    10.10.10.12: { etcd_seq: 3 }
  vars: { etcd_cluster: etcd }

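Then run the etcd.yml playbook.

./etcd.yml  # initialize the etcd cluster

Note that Pigsty's etcd module provides a safeguard against accidental deletion. With the default configuration, etcd_clean is on and etcd_safeguard is off, so the playbook forcibly removes any live etcd instance it encounters; in this case the etcd.yml playbook is truly idempotent. This setup is useful for development, testing, and emergency forced rebuilds of etcd clusters in production.

For an etcd cluster that is already initialized in production, you can turn the safeguard on to avoid deleting existing etcd instances by mistake. The playbook then aborts as soon as it detects a live etcd instance. You can override this behavior with command-line parameters.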

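Destroy Cluster

To destroy an etcd cluster, use the etcd_clean subtask of the etcd.yml playbook. Think twice before running this command!

./etcd.yml -t etcd_clean  # remove the whole cluster; checks for existing instances and honors the safeguard
./etcd.yml -t etcd_purge  # force-remove the whole cluster, regardless of the safeguard

The etcd_clean subtask honors the etcd_safeguard setting, whereas the etcd_purge subtask ignores everything and wipes the existing etcd cluster.

Environment Variables

Pigsty uses the etcd v3 API by default. Below is an example of the environment variables that configure the etcd client.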

alias e="etcdctl"
alias em="etcdctl member"
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://10.10.10.10:2379
export ETCDCTL_CACERT=/etc/pki/ca.crt
export ETCDCTL_CERT=/etc/etcd/server.crt
export ETCDCTL_KEY=/etc/etcd/server.key

With the client environment variables in place, you can run etcd CRUD operations with the following commands:

e put a 10 ; e get a; e del a ; # V3 API
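
For a multi-member cluster, etcdctl accepts a comma-separated list of endpoints in ETCDCTL_ENDPOINTS. The following Python sketch shows one way to derive that value from the hosts section of the inventory. It is an illustration only: the etcd_endpoints helper and the HOSTS dictionary are hypothetical and not part of Pigsty, while the https scheme and client port 2379 are taken from the examples in this section.

```python
def etcd_endpoints(hosts: dict, port: int = 2379, scheme: str = "https") -> str:
    """Join the member addresses into a comma-separated endpoint list, ordered by etcd_seq."""
    ordered = sorted(hosts.items(), key=lambda item: item[1]["etcd_seq"])
    return ",".join(f"{scheme}://{ip}:{port}" for ip, _ in ordered)


# Hypothetical in-memory copy of the hosts section of a three-node inventory.
HOSTS = {
    "10.10.10.10": {"etcd_seq": 1},
    "10.10.10.11": {"etcd_seq": 2},
    "10.10.10.12": {"etcd_seq": 3},
}

if __name__ == "__main__":
    # Print a line that can be pasted into a shell profile.
    print(f"export ETCDCTL_ENDPOINTS={etcd_endpoints(HOSTS)}")
```

Listing every member lets the client keep working when a single endpoint is down.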

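Reload Config

If the membership of the etcd cluster changes, the references to the etcd service endpoints must be refreshed. Pigsty currently has four places that reference etcd:

the config files of the existing etcd members
the etcdctl client environment variables (on infra nodes)
the patroni DCS endpoint config (on pgsql nodes)
the vip-manager DCS endpoint config (optional)

To refresh the etcd config file /etc/etcd/etcd.conf on existing etcd members:

./etcd.yml -t etcd_conf                           # refresh /etc/etcd/etcd.conf with the latest state
ansible etcd -f 1 -b -a 'systemctl restart etcd'  # optional: restart etcd

To refresh the etcdctl client environment variables:

./etcd.yml -t etcd_env                            # refresh /etc/profile.d/etcdctl.sh (on the admin node)

To update the etcd endpoint references in patroni:

./pgsql.yml -t pg_conf                            # regenerate the patroni config
ansible all -f 1 -b -a 'systemctl reload patroni' # reload the patroni config

To update the etcd endpoint references in vip-manager (only needed if you use a PGSQL L2 VIP):

./pgsql.yml -t pg_vip_config                           # regenerate the vip-manager config
ansible all -f 1 -b -a 'systemctl restart vip-manager' # restart vip-manager to pick up the new config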

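Add Member

ETCD reference: Add a member

Adding a new member to an existing etcd cluster usually takes five steps:

Short Version

Run etcdctl member add to tell the existing cluster that a new member is about to join (in learner mode).
Update the config inventory: write the new instance into the etcd group.
Initialize the new etcd instance with etcd_init=existing, so that it joins the existing cluster instead of creating a new one (very important).
Promote the new member from learner to follower, making it a full voting member of the cluster.
Reload the config to update the etcd service endpoints used by clients.

etcdctl member add <etcd-?> --learner=true --peer-urls=https://<new_ins_ip>:2380  # announce to the cluster
./etcd.yml -l <new_ins_ip> -e etcd_init=existing                                  # initialize the new instance
etcdctl member promote <new_ins_server_id>                                        # promote the instance to follower

Detailed steps: add a member to the etcd cluster

The details are as follows. Let's start from a single-instance etcd cluster:

etcd:
  hosts:
    10.10.10.10: { etcd_seq: 1 } # <--- the only instance originally in the cluster
    10.10.10.11: { etcd_seq: 2 } # <--- add this new member definition to the inventory
  vars: { etcd_cluster: etcd }

Use etcdctl member add to announce to the existing etcd cluster that a new learner instance, etcd-2, is coming: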

$ etcdctl member add etcd-2 --learner=true --peer-urls=https://10.10.10.11:2380
Member 33631ba6ced84cf8 added to cluster 6646fbcf5debc68f

ETCD_NAME="etcd-2"
ETCD_INITIAL_CLUSTER="etcd-2=https://10.10.10.11:2380,etcd-1=https://10.10.10.10:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.10.11:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"

Check the member list with etcdctl member list (or em list); it now shows one unstarted new member:

33631ba6ced84cf8, unstarted, , https://10.10.10.11:2380, , true       # an unstarted new member shows up here
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false

Next, initialize the new etcd instance etcd-2 with the etcd.yml playbook. Once it finishes, the new member shows as started:

$ ./etcd.yml -l 10.10.10.11 -e etcd_init=existing    # etcd_init=existing is mandatory, via command line or config file
...
33631ba6ced84cf8, started, etcd-2, https://10.10.10.11:2380, https://10.10.10.11:2379, true
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false

Once the new member is initialized and running stably, promote it from learner to follower:

$ etcdctl member promote 33631ba6ced84cf8   # promote the learner to follower; this takes the etcd member ID
Member 33631ba6ced84cf8 promoted in cluster 6646fbcf5debc68f

$ em list                # check again, the new member is started
33631ba6ced84cf8, started, etcd-2, https://10.10.10.11:2380, https://10.10.10.11:2379, false
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false

The new member is now added. Do not forget to reload the config so that all clients learn about it.

Repeat the steps above to add more members. Remember to use at least 3 members in production.
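
When scripting the procedure above, it helps to check the member list before promoting a learner. The following Python sketch parses the six-field output shown in this section (ID, status, name, peer URL, client URL, is_learner) and returns the IDs of learners that have already started. It is an illustration only: the Member type and both helper functions are hypothetical, and the parser assumes one peer URL and one client URL per member, as in the examples here.

```python
from typing import NamedTuple


class Member(NamedTuple):
    member_id: str
    status: str
    name: str
    peer_url: str
    client_url: str
    is_learner: bool


def parse_member_list(text: str) -> list:
    """Parse lines of the form 'ID, status, name, peer URL, client URL, is_learner'."""
    members = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comments and blank lines
        if not line:
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 6:
            continue                            # not a member line, skip it
        members.append(Member(*fields[:5], fields[5] == "true"))
    return members


def promotable(members: list) -> list:
    """IDs of learners that have started and can be passed to 'etcdctl member promote'."""
    return [m.member_id for m in members if m.is_learner and m.status == "started"]


# Member list copied from the walkthrough above, right after the new instance was initialized.
SAMPLE = """\
33631ba6ced84cf8, started, etcd-2, https://10.10.10.11:2380, https://10.10.10.11:2379, true
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false
"""

if __name__ == "__main__":
    print(promotable(parse_member_list(SAMPLE)))
```

An unstarted learner is deliberately excluded, since promotion only makes sense once the new instance is running.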

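Remove Member

Removing a member instance from an etcd cluster usually takes three steps:

Comment out, mask, or delete the instance in the config inventory, then reload the config so that clients stop using it.
Kick it out of the cluster with etcdctl member remove <server_id>.
Temporarily add the instance back to the inventory, decommission it completely with the playbook, then delete it from the config for good.

Detailed steps: remove a member from the etcd cluster

Let's take a 3-node etcd cluster as an example and remove instance 3 from it.

To refresh the config, comment out the member to be removed, then reload the config so that no client uses this instance any more.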

etcd:
  hosts:
    10.10.10.10: { etcd_seq: 1 }
    10.10.10.11: { etcd_seq: 2 }
    10.10.10.12: { etcd_seq: 3 }   # <---- comment out this member first, then reload the config
  vars: { etcd_cluster: etcd }

Then use etcdctl member remove to kick it out of the cluster:

$ etcdctl member list 
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false
33631ba6ced84cf8, started, etcd-2, https://10.10.10.11:2380, https://10.10.10.11:2379, false
93fcf23b220473fb, started, etcd-3, https://10.10.10.12:2380, https://10.10.10.12:2379, false  # <--- remove this one

$ etcdctl member remove 93fcf23b220473fb  # kick it from cluster
Member 93fcf23b220473fb removed from cluster 6646fbcf5debc68f

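Finally, temporarily add the member back to the config inventory so that the decommission task can run and shut the instance down for good.

./etcd.yml -t etcd_purge -l 10.10.10.12   # decommission the instance (note: its definition must still be in the inventory for this command to work)

Once that completes, delete it from the inventory permanently. The member removal is now done.

Repeat the steps above to remove more members. Combined with adding members, this lets you perform a rolling upgrade or migration of the etcd cluster.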

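10.5 - Monitoring and Alerting

How do I monitor etcd, and which alert rules deserve attention?

Dashboards

The ETCD module provides one monitoring dashboard: Etcd Overview.

ETCD Overview: overview of the ETCD cluster

This dashboard shows the key information about ETCD status. The most notable panel is ETCD Aliveness, which shows the overall service status of the ETCD cluster.

Red bands mark the periods when an instance was unavailable, and the blue-gray band underneath marks the periods when the whole cluster was unavailable.

Alert Rules

Pigsty ships the following five alert rules for etcd, defined in files/prometheus/rules/etcd.yml

EtcdServerDown: an etcd node is down, critical alert
EtcdNoLeader: the etcd cluster has no leader, critical alert
EtcdQuotaFull: etcd quota usage exceeds 90%, warning
EtcdNetworkPeerRTSlow: etcd peer network round-trip time is slow, notice
EtcdWalFsyncSlow: etcd WAL fsync to disk is slow, notice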

#==============================================================#
#                         Aliveness                            #
#==============================================================#
# etcd server instance down
- alert: EtcdServerDown
  expr: etcd_up < 1
  for: 1m
  labels: { level: 0, severity: CRIT, category: etcd }
  annotations:
    summary: "CRIT EtcdServerDown {{ $labels.ins }}@{{ $labels.instance }}"
    description: |
      etcd_up[ins={{ $labels.ins }}, instance={{ $labels.instance }}] = {{ $value }} < 1
      http://g.pigsty/d/etcd-overview      

#==============================================================#
#                         Error                                #
#==============================================================#
# Etcd no Leader triggers a P0 alert immediately
# if dcs_failsafe mode is not enabled, this may lead to global outage
- alert: EtcdNoLeader
  expr: min(etcd_server_has_leader) by (cls) < 1
  for: 15s
  labels: { level: 0, severity: CRIT, category: etcd }
  annotations:
    summary: "CRIT EtcdNoLeader: {{ $labels.cls }} {{ $value }}"
    description: |
      etcd_server_has_leader[cls={{ $labels.cls }}] = {{ $value }} < 1
      http://g.pigsty/d/etcd-overview?from=now-5m&to=now&var-cls={{$labels.cls}}      

#==============================================================#
#                        Saturation                            #
#==============================================================#
- alert: EtcdQuotaFull
  expr: etcd:cls:quota_usage > 0.90
  for: 1m
  labels: { level: 1, severity: WARN, category: etcd }
  annotations:
    summary: "WARN EtcdQuotaFull: {{ $labels.cls }}"
    description: |
      etcd:cls:quota_usage[cls={{ $labels.cls }}] = {{ $value | printf "%.3f" }} > 90%      

#==============================================================#
#                         Latency                              #
#==============================================================#
# etcd network peer rt p95 > 200ms for 1m
- alert: EtcdNetworkPeerRTSlow
  expr: etcd:ins:network_peer_rt_p95_5m > 0.200
  for: 1m
  labels: { level: 2, severity: INFO, category: etcd }
  annotations:
    summary: "INFO EtcdNetworkPeerRTSlow: {{ $labels.cls }} {{ $labels.ins }}"
    description: |
      etcd:ins:network_peer_rt_p95_5m[cls={{ $labels.cls }}, ins={{ $labels.ins }}] = {{ $value }} > 200ms
      http://g.pigsty/d/etcd-instance?from=now-10m&to=now&var-cls={{ $labels.cls }}      

# Etcd wal fsync rt p95 > 50ms
- alert: EtcdWalFsyncSlow
  expr: etcd:ins:wal_fsync_rt_p95_5m > 0.050
  for: 1m
  labels: { level: 2, severity: INFO, category: etcd }
  annotations:
    summary: "INFO EtcdWalFsyncSlow: {{ $labels.cls }} {{ $labels.ins }}"
    description: |
      etcd:ins:wal_fsync_rt_p95_5m[cls={{ $labels.cls }}, ins={{ $labels.ins }}] = {{ $value }} > 50ms
      http://g.pigsty/d/etcd-instance?from=now-10m&to=now&var-cls={{ $labels.cls }}
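
To make the thresholds in the rules above easier to compare at a glance, the following Python sketch restates them as a lookup table and checks a set of sample values against it. It is an illustration only, not how Prometheus evaluates rules: the RULES table and the breached function are hypothetical, the `for` durations are ignored, and the EtcdNoLeader input is assumed to be the already aggregated per-cluster minimum.

```python
import operator

# alert name -> (severity, metric or recording rule, comparison, threshold)
RULES = {
    "EtcdServerDown":        ("CRIT", "etcd_up",                         operator.lt, 1),
    "EtcdNoLeader":          ("CRIT", "etcd_server_has_leader",          operator.lt, 1),
    "EtcdQuotaFull":         ("WARN", "etcd:cls:quota_usage",            operator.gt, 0.90),
    "EtcdNetworkPeerRTSlow": ("INFO", "etcd:ins:network_peer_rt_p95_5m", operator.gt, 0.200),
    "EtcdWalFsyncSlow":      ("INFO", "etcd:ins:wal_fsync_rt_p95_5m",    operator.gt, 0.050),
}


def breached(samples: dict) -> list:
    """Return (severity, alert) pairs whose threshold is crossed by the given sample values."""
    hits = []
    for alert, (severity, metric, compare, threshold) in RULES.items():
        value = samples.get(metric)
        if value is not None and compare(value, threshold):
            hits.append((severity, alert))
    return hits


if __name__ == "__main__":
    # A healthy instance whose cluster is at 95% of its storage quota.
    print(breached({
        "etcd_up": 1,
        "etcd_server_has_leader": 1,
        "etcd:cls:quota_usage": 0.95,
        "etcd:ins:wal_fsync_rt_p95_5m": 0.010,
    }))
```

Latency thresholds are expressed in seconds, so 0.200 and 0.050 correspond to the 200ms and 50ms limits named in the rule comments.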

10.6 - Metrics

The complete list of monitoring metrics provided by the Pigsty ETCD module, with descriptions.

The ETCD module has 177 available monitoring metrics.

Metric Name	Type	Labels	Description
etcd:ins:backend_commit_rt_p99_5m	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd:ins:disk_fsync_rt_p99_5m	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd:ins:network_peer_rt_p99_1m	Unknown	`cls`, `To`, `ins`, `instance`, `job`, `ip`	N/A
etcd_cluster_version	gauge	`cls`, `cluster_version`, `ins`, `instance`, `job`, `ip`	Which version is running. 1 for ‘cluster_version’ label with current cluster version
etcd_debugging_auth_revision	gauge	`cls`, `ins`, `instance`, `job`, `ip`	The current revision of auth store.
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_disk_backend_commit_spill_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_debugging_disk_backend_commit_spill_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_disk_backend_commit_spill_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_disk_backend_commit_write_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_debugging_disk_backend_commit_write_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_disk_backend_commit_write_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_lease_granted_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	The total number of granted leases.
etcd_debugging_lease_renewed_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	The number of renewed leases seen by the leader.
etcd_debugging_lease_revoked_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	The total number of revoked leases.
etcd_debugging_lease_ttl_total_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_debugging_lease_ttl_total_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_lease_ttl_total_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_mvcc_compact_revision	gauge	`cls`, `ins`, `instance`, `job`, `ip`	The revision of the last compaction in store.
etcd_debugging_mvcc_current_revision	gauge	`cls`, `ins`, `instance`, `job`, `ip`	The current revision of store.
etcd_debugging_mvcc_db_compaction_keys_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of db keys compacted.
etcd_debugging_mvcc_db_compaction_last	gauge	`cls`, `ins`, `instance`, `job`, `ip`	The unix time of the last db compaction. Resets to 0 on start.
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_mvcc_events_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of events sent by this member.
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_mvcc_keys_total	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Total number of keys.
etcd_debugging_mvcc_pending_events_total	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Total number of pending events to be sent.
etcd_debugging_mvcc_range_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of ranges seen by this member.
etcd_debugging_mvcc_slow_watcher_total	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Total number of unsynced slow watchers.
etcd_debugging_mvcc_total_put_size_in_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	The total size of put kv pairs seen by this member.
etcd_debugging_mvcc_watch_stream_total	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Total number of watch streams.
etcd_debugging_mvcc_watcher_total	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Total number of watchers.
etcd_debugging_server_lease_expired_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	The total number of expired leases.
etcd_debugging_snap_save_marshalling_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_debugging_snap_save_marshalling_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_snap_save_marshalling_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_snap_save_total_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_debugging_snap_save_total_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_snap_save_total_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_debugging_store_expires_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of expired keys.
etcd_debugging_store_reads_total	counter	`cls`, `action`, `ins`, `instance`, `job`, `ip`	Total number of reads action by (get/getRecursive), local to this member.
etcd_debugging_store_watch_requests_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of incoming watch requests (new or reestablished).
etcd_debugging_store_watchers	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Count of currently active watchers.
etcd_debugging_store_writes_total	counter	`cls`, `action`, `ins`, `instance`, `job`, `ip`	Total number of writes (e.g. set/compareAndDelete) seen by this member.
etcd_disk_backend_commit_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_disk_backend_commit_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_disk_backend_commit_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_disk_backend_defrag_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_disk_backend_defrag_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_disk_backend_defrag_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_disk_backend_snapshot_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_disk_backend_snapshot_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_disk_backend_snapshot_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_disk_defrag_inflight	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Whether or not defrag is active on the member. 1 means active, 0 means not.
etcd_disk_wal_fsync_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_disk_wal_fsync_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_disk_wal_fsync_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_disk_wal_write_bytes_total	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Total number of bytes written in WAL.
etcd_grpc_proxy_cache_hits_total	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Total number of cache hits
etcd_grpc_proxy_cache_keys_total	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Total number of keys/ranges cached
etcd_grpc_proxy_cache_misses_total	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Total number of cache misses
etcd_grpc_proxy_events_coalescing_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of events coalescing
etcd_grpc_proxy_watchers_coalescing_total	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Total number of current watchers coalescing
etcd_mvcc_db_open_read_transactions	gauge	`cls`, `ins`, `instance`, `job`, `ip`	The number of currently open read transactions
etcd_mvcc_db_total_size_in_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Total size of the underlying database physically allocated in bytes.
etcd_mvcc_db_total_size_in_use_in_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Total size of the underlying database logically in use in bytes.
etcd_mvcc_delete_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of deletes seen by this member.
etcd_mvcc_hash_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_mvcc_hash_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_mvcc_hash_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_mvcc_hash_rev_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_mvcc_hash_rev_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_mvcc_hash_rev_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_mvcc_put_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of puts seen by this member.
etcd_mvcc_range_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of ranges seen by this member.
etcd_mvcc_txn_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of txns seen by this member.
etcd_network_active_peers	gauge	`cls`, `ins`, `Local`, `instance`, `job`, `ip`, `Remote`	The current number of active peer connections.
etcd_network_client_grpc_received_bytes_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	The total number of bytes received from grpc clients.
etcd_network_client_grpc_sent_bytes_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	The total number of bytes sent to grpc clients.
etcd_network_peer_received_bytes_total	counter	`cls`, `ins`, `instance`, `job`, `ip`, `From`	The total number of bytes received from peers.
etcd_network_peer_round_trip_time_seconds_bucket	Unknown	`cls`, `To`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_network_peer_round_trip_time_seconds_count	Unknown	`cls`, `To`, `ins`, `instance`, `job`, `ip`	N/A
etcd_network_peer_round_trip_time_seconds_sum	Unknown	`cls`, `To`, `ins`, `instance`, `job`, `ip`	N/A
etcd_network_peer_sent_bytes_total	counter	`cls`, `To`, `ins`, `instance`, `job`, `ip`	The total number of bytes sent to peers.
etcd_server_apply_duration_seconds_bucket	Unknown	`cls`, `version`, `ins`, `instance`, `job`, `le`, `success`, `ip`, `op`	N/A
etcd_server_apply_duration_seconds_count	Unknown	`cls`, `version`, `ins`, `instance`, `job`, `success`, `ip`, `op`	N/A
etcd_server_apply_duration_seconds_sum	Unknown	`cls`, `version`, `ins`, `instance`, `job`, `success`, `ip`, `op`	N/A
etcd_server_client_requests_total	counter	`client_api_version`, `cls`, `ins`, `instance`, `type`, `job`, `ip`	The total number of client requests per client version.
etcd_server_go_version	gauge	`cls`, `ins`, `instance`, `job`, `server_go_version`, `ip`	Which Go version server is running with. 1 for ‘server_go_version’ label with current version.
etcd_server_has_leader	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Whether or not a leader exists. 1 is existence, 0 is not.
etcd_server_health_failures	counter	`cls`, `ins`, `instance`, `job`, `ip`	The total number of failed health checks
etcd_server_health_success	counter	`cls`, `ins`, `instance`, `job`, `ip`	The total number of successful health checks
etcd_server_heartbeat_send_failures_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	The total number of leader heartbeat send failures (likely overloaded from slow disk).
etcd_server_id	gauge	`cls`, `ins`, `instance`, `job`, `server_id`, `ip`	Server or member ID in hexadecimal format. 1 for ‘server_id’ label with current ID.
etcd_server_is_leader	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Whether or not this member is a leader. 1 if is, 0 otherwise.
etcd_server_is_learner	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Whether or not this member is a learner. 1 if is, 0 otherwise.
etcd_server_leader_changes_seen_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	The number of leader changes seen.
etcd_server_learner_promote_successes	counter	`cls`, `ins`, `instance`, `job`, `ip`	The total number of successful learner promotions while this member is leader.
etcd_server_proposals_applied_total	gauge	`cls`, `ins`, `instance`, `job`, `ip`	The total number of consensus proposals applied.
etcd_server_proposals_committed_total	gauge	`cls`, `ins`, `instance`, `job`, `ip`	The total number of consensus proposals committed.
etcd_server_proposals_failed_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	The total number of failed proposals seen.
etcd_server_proposals_pending	gauge	`cls`, `ins`, `instance`, `job`, `ip`	The current number of pending proposals to commit.
etcd_server_quota_backend_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Current backend storage quota size in bytes.
etcd_server_read_indexes_failed_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	The total number of failed read indexes seen.
etcd_server_slow_apply_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	The total number of slow apply requests (likely overloaded from slow disk).
etcd_server_slow_read_indexes_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	The total number of pending read indexes not in sync with leader’s or timed out read index requests.
etcd_server_snapshot_apply_in_progress_total	gauge	`cls`, `ins`, `instance`, `job`, `ip`	1 if the server is applying the incoming snapshot. 0 if none.
etcd_server_version	gauge	`cls`, `server_version`, `ins`, `instance`, `job`, `ip`	Which version is running. 1 for ‘server_version’ label with current version.
etcd_snap_db_fsync_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_snap_db_fsync_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_snap_db_fsync_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_snap_db_save_total_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_snap_db_save_total_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_snap_db_save_total_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_snap_fsync_duration_seconds_bucket	Unknown	`cls`, `ins`, `instance`, `job`, `le`, `ip`	N/A
etcd_snap_fsync_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_snap_fsync_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
etcd_up	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
go_gc_duration_seconds	summary	`cls`, `ins`, `instance`, `quantile`, `job`, `ip`	A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
go_gc_duration_seconds_sum	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
go_goroutines	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of goroutines that currently exist.
go_info	gauge	`cls`, `version`, `ins`, `instance`, `job`, `ip`	Information about the Go environment.
go_memstats_alloc_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of frees.
go_memstats_gc_cpu_fraction	gauge	`cls`, `ins`, `instance`, `job`, `ip`	The fraction of this program’s available CPU time used by the GC since the program started.
go_memstats_gc_sys_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of heap bytes that are in use.
go_memstats_heap_objects	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of allocated objects.
go_memstats_heap_released_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of heap bytes released to OS.
go_memstats_heap_sys_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of pointer lookups.
go_memstats_mallocs_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total number of mallocs.
go_memstats_mcache_inuse_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of bytes obtained from system.
go_threads	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of OS threads created.
grpc_server_handled_total	counter	`cls`, `ins`, `instance`, `grpc_code`, `job`, `grpc_method`, `grpc_type`, `ip`, `grpc_service`	Total number of RPCs completed on the server, regardless of success or failure.
grpc_server_msg_received_total	counter	`cls`, `ins`, `instance`, `job`, `grpc_type`, `grpc_method`, `ip`, `grpc_service`	Total number of RPC stream messages received on the server.
grpc_server_msg_sent_total	counter	`cls`, `ins`, `instance`, `job`, `grpc_type`, `grpc_method`, `ip`, `grpc_service`	Total number of gRPC stream messages sent by the server.
grpc_server_started_total	counter	`cls`, `ins`, `instance`, `job`, `grpc_type`, `grpc_method`, `ip`, `grpc_service`	Total number of RPCs started on the server.
os_fd_limit	gauge	`cls`, `ins`, `instance`, `job`, `ip`	The file descriptor limit.
os_fd_used	gauge	`cls`, `ins`, `instance`, `job`, `ip`	The number of used file descriptors.
process_cpu_seconds_total	counter	`cls`, `ins`, `instance`, `job`, `ip`	Total user and system CPU time spent in seconds.
process_max_fds	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Maximum number of open file descriptors.
process_open_fds	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Number of open file descriptors.
process_resident_memory_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Resident memory size in bytes.
process_start_time_seconds	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Virtual memory size in bytes.
process_virtual_memory_max_bytes	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Maximum amount of virtual memory available in bytes.
promhttp_metric_handler_requests_in_flight	gauge	`cls`, `ins`, `instance`, `job`, `ip`	Current number of scrapes being served.
promhttp_metric_handler_requests_total	counter	`cls`, `ins`, `instance`, `job`, `ip`, `code`	Total number of scrapes by HTTP status code.
scrape_duration_seconds	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
scrape_samples_post_metric_relabeling	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
scrape_samples_scraped	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
scrape_series_added	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A
up	Unknown	`cls`, `ins`, `instance`, `job`, `ip`	N/A

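10.7 - FAQ

Frequently asked questions about the Pigsty etcd module.

What is the role of the etcd cluster?

etcd is a distributed, reliable key-value store for the most critical data in a system. Pigsty uses etcd as the DCS (Distributed Configuration Store) for Patroni, where it holds the high-availability state of PostgreSQL clusters.

Through etcd, Patroni implements cluster failure detection, automatic failover, primary-replica switchover, and cluster configuration management.

etcd is critical to the high availability of PostgreSQL clusters, and the availability and disaster tolerance of etcd itself are ensured by running multiple distributed nodes.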

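What size of etcd cluster is appropriate?

If half or more of the members of an etcd cluster are unavailable, the cluster becomes unavailable and refuses to serve requests.

For example, a 3-node etcd cluster tolerates the failure of at most one node while the other two keep working, and a 5-node etcd cluster tolerates the failure of two nodes.

Note that learner instances in an etcd cluster do not count toward the member count. In a 3-node etcd cluster with one learner instance, the effective member count is therefore 2, and the cluster cannot tolerate the failure of any node.

In production, we recommend an odd number of etcd instances: a 3-node or 5-node etcd cluster provides sufficient reliability.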
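
The sizing rule above comes down to simple majority arithmetic. The following is a minimal Python sketch of that arithmetic, not code from Pigsty or etcd; the function names `quorum` and `fault_tolerance` are made up for illustration.

```python
def quorum(voting_members: int) -> int:
    """Smallest number of voting members that forms a strict majority."""
    return voting_members // 2 + 1


def fault_tolerance(total_instances: int, learners: int = 0) -> int:
    """Number of voting members that may fail while a majority survives.

    Learners are subtracted first because they do not count as members
    for the purpose of the majority rule described above.
    """
    voting = total_instances - learners
    if voting < 1:
        raise ValueError("a cluster needs at least one voting member")
    return voting - quorum(voting)


if __name__ == "__main__":
    for size in (1, 2, 3, 4, 5):
        print(f"{size} members: quorum={quorum(size)}, tolerates={fault_tolerance(size)}")
    # 3 instances, one of which is still a learner
    print("3 instances with 1 learner tolerate:", fault_tolerance(3, learners=1))
```

The loop also shows why odd sizes are preferred: a 4-member cluster tolerates one failure, the same as a 3-member cluster, so the extra node adds no fault tolerance.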

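What happens if the etcd cluster is unavailable?

If the etcd cluster is unavailable, the PostgreSQL control plane is affected but the data plane is not: existing PostgreSQL clusters keep running, but management operations through Patroni cannot be performed.

During an etcd outage, PostgreSQL high availability cannot perform automatic failover, and you cannot use patronictl to manage PostgreSQL clusters, for example to change configuration or perform a manual failover. Management commands issued through Ansible are not affected by the etcd outage: creating databases, creating users, and refreshing HBA and Service configuration still work, because they operate on the PostgreSQL cluster directly.

Note that the behavior described above applies only to newer versions of Patroni (>= 3.0, corresponding to Pigsty >= 2.0). With older versions of Patroni (< 3.0, corresponding to Pigsty 1.x), an etcd / consul failure has a severe global impact: every PostgreSQL cluster is demoted, primaries become replicas and reject writes, and the etcd failure is amplified into a global PostgreSQL failure. The DCS Failsafe feature introduced in Patroni 3.0 significantly improves this situation.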

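What data is stored in the etcd cluster?

In Pigsty, etcd is used only for PostgreSQL high availability and does not store any other configuration or state data.

Patroni, the PG high-availability component, generates and manages the data in etcd automatically, and rebuilds it automatically if it is lost from etcd.

By default, etcd in Pigsty can therefore be treated as a "stateless service" that can be destroyed and rebuilt, which greatly simplifies maintenance.

If you use etcd for other purposes, for example as the metadata store for Kubernetes or to store your own data, you need to back up the etcd data yourself and restore it after the etcd cluster recovers.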

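How do I recover from an etcd failure?

Because etcd in Pigsty is used only for PostgreSQL high availability and is essentially a destroyable, rebuildable "stateless service", you can stop the bleeding quickly by restarting or resetting it when a failure occurs.

To restart the etcd cluster, use the following Ansible command:

./etcd.yml -t etcd_launch

To reset the etcd cluster, run the following playbook directly, which wipes and reinstalls it:

./etcd.yml

If you store other data in etcd yourself, you usually need to back up the etcd data and restore it after the etcd cluster recovers.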

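What should I watch out for when maintaining etcd?

The short version: do not let etcd fill up.

etcd sets a default database size quota of 2 GiB. Once the etcd database exceeds this limit, etcd rejects write requests, which can break the PostgreSQL high-availability mechanism that depends on it. At the same time, etcd's data model creates a new version on every write, so if your etcd cluster is written to frequently, even on only a handful of keys, the database can keep growing until it hits the quota and fails.

You can keep etcd from filling up by enabling auto compaction, compacting manually, defragmenting, or raising the quota. For details, see the maintenance guide in the official etcd documentation.

Pigsty enables etcd auto compaction by default since v2.6, so you normally do not need to worry about filling up etcd. For versions before v2.6, we strongly recommend enabling etcd auto compaction in production.

Filling up etcd can make the etcd cluster unavailable and break PostgreSQL high availability!

Users of Pigsty v2.0 - v2.5 should upgrade to a newer version as soon as possible, or enable etcd auto compaction as described below.

How do I enable etcd auto compaction?

If you run an earlier version of Pigsty (v2.0 - v2.5), we strongly recommend enabling etcd auto compaction in production with the following steps, to avoid an etcd outage caused by exhausting the storage quota.

In the Pigsty source directory, edit the etcd configuration template roles/etcd/templates/etcd.conf.j2 and add the following three settings: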

auto-compaction-mode: periodic
auto-compaction-retention: "24h"
quota-backend-bytes: 17179869184

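Then put all related PostgreSQL clusters into maintenance mode and redeploy the etcd cluster with ./etcd.yml.

This configuration raises etcd's storage quota from the default 2 GiB to 16 GiB and keeps only the last 24 hours of write history, which prevents the etcd database from growing without bound.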
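
The quota value is just 16 GiB written out in bytes. As a rough illustration, the Python sketch below reproduces that number and compares a database size against the quota. The helper names and the 80% warning threshold are assumptions for this example, not Pigsty defaults; the two inputs correspond to the etcd_mvcc_db_total_size_in_bytes and etcd_server_quota_backend_bytes metrics listed earlier.

```python
GIB = 1024 ** 3  # bytes in one GiB


def gib_to_bytes(gib: int) -> int:
    """Convert a size in GiB to bytes, as expected by quota-backend-bytes."""
    return gib * GIB


def quota_usage(db_size_bytes: int, quota_bytes: int) -> float:
    """Fraction of the backend quota currently allocated by the database."""
    if quota_bytes <= 0:
        raise ValueError("quota must be positive")
    return db_size_bytes / quota_bytes


def is_near_quota(db_size_bytes: int, quota_bytes: int, threshold: float = 0.8) -> bool:
    """True when usage reaches the (hypothetical) warning threshold."""
    return quota_usage(db_size_bytes, quota_bytes) >= threshold


if __name__ == "__main__":
    print("default quota:", gib_to_bytes(2))
    print("raised quota: ", gib_to_bytes(16))
    print("1 GiB used of 2 GiB:", quota_usage(gib_to_bytes(1), gib_to_bytes(2)))
```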

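Where is the PostgreSQL high-availability data stored in etcd?

By default, Patroni uses the prefix specified by pg_namespace (/pg by default) for all metadata keys, followed by the PostgreSQL cluster name. For example, the metadata keys of a PG cluster named pg-meta are stored under /pg/pg-meta.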

etcdctl get /pg/pg-meta --prefix

A sample of the data looks like this:

/pg/pg-meta/config
{"ttl":30,"loop_wait":10,"retry_timeout":10,"primary_start_timeout":10,"maximum_lag_on_failover":1048576,"maximum_lag_on_syncnode":-1,"primary_stop_timeout":30,"synchronous_mode":false,"synchronous_mode_strict":false,"failsafe_mode":true,"pg_version":16,"pg_cluster":"pg-meta","pg_shard":"pg-meta","pg_group":0,"postgresql":{"use_slots":true,"use_pg_rewind":true,"remove_data_directory_on_rewind_failure":true,"parameters":{"max_connections":100,"superuser_reserved_connections":10,"max_locks_per_transaction":200,"max_prepared_transactions":0,"track_commit_timestamp":"on","wal_level":"logical","wal_log_hints":"on","max_worker_processes":16,"max_wal_senders":50,"max_replication_slots":50,"password_encryption":"scram-sha-256","ssl":"on","ssl_cert_file":"/pg/cert/server.crt","ssl_key_file":"/pg/cert/server.key","ssl_ca_file":"/pg/cert/ca.crt","shared_buffers":"7969MB","maintenance_work_mem":"1993MB","work_mem":"79MB","max_parallel_workers":8,"max_parallel_maintenance_workers":2,"max_parallel_workers_per_gather":0,"hash_mem_multiplier":8.0,"huge_pages":"try","temp_file_limit":"7GB","vacuum_cost_delay":"20ms","vacuum_cost_limit":2000,"bgwriter_delay":"10ms","bgwriter_lru_maxpages":800,"bgwriter_lru_multiplier":5.0,"min_wal_size":"7GB","max_wal_size":"28GB","max_slot_wal_keep_size":"42GB","wal_buffers":"16MB","wal_writer_delay":"20ms","wal_writer_flush_after":"1MB","commit_delay":20,"commit_siblings":10,"checkpoint_timeout":"15min","checkpoint_completion_target":0.8,"archive_mode":"on","archive_timeout":300,"archive_command":"pgbackrest --stanza=pg-meta archive-push %p","max_standby_archive_delay":"10min","max_standby_streaming_delay":"3min","wal_receiver_status_interval":"1s","hot_standby_feedback":"on","wal_receiver_timeout":"60s","max_logical_replication_workers":8,"max_sync_workers_per_subscription":6,"random_page_cost":1.1,"effective_io_concurrency":1000,"effective_cache_size":"23907MB","default_statistics_target":200,"log_destination":"csvlog","logging_collector":"on","log_directory":"/pg/log/postgres","log_filename":"postgresql-%Y-%m-%d.log","log_checkpoints":"on","log_lock_waits":"on","log_replication_commands":"on","log_statement":"ddl","log_min_duration_statement":100,"track_io_timing":"on","track_functions":"all","track_activity_query_size":8192,"log_autovacuum_min_duration":"1s","autovacuum_max_workers":2,"autovacuum_naptime":"1min","autovacuum_vacuum_cost_delay":-1,"autovacuum_vacuum_cost_limit":-1,"autovacuum_freeze_max_age":1000000000,"deadlock_timeout":"50ms","idle_in_transaction_session_timeout":"10min","shared_preload_libraries":"timescaledb, pg_stat_statements, auto_explain","auto_explain.log_min_duration":"1s","auto_explain.log_analyze":"on","auto_explain.log_verbose":"on","auto_explain.log_timing":"on","auto_explain.log_nested_statements":true,"pg_stat_statements.max":5000,"pg_stat_statements.track":"all","pg_stat_statements.track_utility":"off","pg_stat_statements.track_planning":"off","timescaledb.telemetry_level":"off","timescaledb.max_background_workers":8,"citus.node_conninfo":"sslmode=prefer"}}}
/pg/pg-meta/failsafe
{"pg-meta-2":"http://10.10.10.11:8008/patroni","pg-meta-1":"http://10.10.10.10:8008/patroni"}
/pg/pg-meta/initialize
7418384210787662172
/pg/pg-meta/leader
pg-meta-1
/pg/pg-meta/members/pg-meta-1
{"conn_url":"postgres://10.10.10.10:5432/postgres","api_url":"http://10.10.10.10:8008/patroni","state":"running","role":"primary","version":"4.0.1","tags":{"clonefrom":true,"version":"16","spec":"8C.32G.125G","conf":"tiny.yml"},"xlog_location":184549376,"timeline":1}
/pg/pg-meta/members/pg-meta-2
{"conn_url":"postgres://10.10.10.11:5432/postgres","api_url":"http://10.10.10.11:8008/patroni","state":"running","role":"replica","version":"4.0.1","tags":{"clonefrom":true,"version":"16","spec":"8C.32G.125G","conf":"tiny.yml"},"xlog_location":184549376,"replication_state":"streaming","timeline":1}
/pg/pg-meta/status
{"optime":184549376,"slots":{"pg_meta_2":184549376,"pg_meta_1":184549376},"retain_slots":["pg_meta_1","pg_meta_2"]}
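
To make the key layout concrete, here is a minimal Python sketch that reads such a dump and reports the leader and the role of each member. It assumes the plain `etcdctl get --prefix` output alternates key lines and value lines and that every value fits on one line; the function names are hypothetical, and the embedded sample is a shortened version of the data above.

```python
import json


def parse_etcdctl_output(text):
    """Pair up alternating key / value lines into a dict."""
    lines = [line for line in text.splitlines() if line.strip()]
    return dict(zip(lines[0::2], lines[1::2]))


def summarize_cluster(kv, cluster="pg-meta", namespace="/pg"):
    """Return (leader_name, {member_name: (role, state)}) for one cluster."""
    prefix = f"{namespace}/{cluster}"
    members_prefix = f"{prefix}/members/"
    members = {}
    for key, value in kv.items():
        if key.startswith(members_prefix):
            info = json.loads(value)  # member records are JSON documents
            members[key[len(members_prefix):]] = (info.get("role"), info.get("state"))
    return kv.get(f"{prefix}/leader"), members


# Shortened sample, using the same keys as the output shown above
sample_output = """\
/pg/pg-meta/leader
pg-meta-1
/pg/pg-meta/members/pg-meta-1
{"state":"running","role":"primary","timeline":1}
/pg/pg-meta/members/pg-meta-2
{"state":"running","role":"replica","replication_state":"streaming","timeline":1}
"""

if __name__ == "__main__":
    leader, members = summarize_cluster(parse_etcdctl_output(sample_output))
    print("leader:", leader)
    for name, (role, state) in sorted(members.items()):
        print(name, role, state)
```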

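How do I use an existing external etcd cluster?

The inventory hard-codes etcd as the name of the group used for etcd, and the members of this group serve as the DCS servers for PGSQL. You can initialize them with etcd.yml, or simply treat the group as an existing external etcd cluster.

To use an existing external etcd cluster, define its members as usual and skip the etcd.yml playbook, because the cluster already exists and does not need to be deployed.

However, you must make sure that the certificates of the existing etcd cluster are issued by the same CA that Pigsty uses. Otherwise, clients cannot access the external etcd cluster with certificates issued by Pigsty's self-signed CA.

How do I add a new member to an existing etcd cluster?

For the detailed procedure, see Adding a Member to the etcd Cluster.

etcdctl member add <etcd-?> --learner=true --peer-urls=https://<new_ins_ip>:2380 # announce the new member on the admin node
./etcd.yml -l <new_ins_ip> -e etcd_init=existing                                 # actually initialize the new etcd member
etcdctl member promote <new_ins_server_id>                                       # promote the new member to a full member on the admin node

How do I remove a member from an existing etcd cluster?

For the detailed procedure, see Removing a Member from the etcd Cluster.

etcdctl member remove <etcd_server_id>   # kick the member out of the cluster on the admin node
./etcd.yml -l <ins_ip> -t etcd_purge     # actually purge the decommissioned etcd instance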

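11 - Module: MINIO

Pigsty has built-in support for MinIO, an open-source, on-premises alternative to S3 object storage that can serve as cold backup storage for the PGSQL module.

MinIO is AWS S3-compatible multi-cloud object storage software, open-sourced under the AGPLv3 license.

MinIO can store documents, images, videos, and backups. Pigsty natively supports deploying various kinds of MinIO clusters that are easy to scale, secure, and ready to use out of the box.

11.1 - Usage

Getting started: how do you start using MinIO? How do you access MinIO reliably? How do you use the mc / rclone client tools?

After you configure and run the playbook to deploy a MinIO cluster, follow the instructions here to start using and accessing it.

Deploy a Cluster

Deploying an out-of-the-box single-node single-drive MinIO instance in Pigsty is simple. First, define a MinIO cluster in the inventory: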

minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

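Then run the minio.yml playbook provided by Pigsty against the defined group (minio here):

./minio.yml -l minio

Note that install.yml automatically creates any MinIO cluster that is already defined, so you do not need to run the minio.yml playbook again.

If you plan to deploy a production-grade, large-scale multi-node MinIO cluster, we strongly recommend reading the Pigsty MinIO configuration documentation and the official MinIO documentation first.

Access the Cluster

Note: the MinIO service must be accessed through a domain name over HTTPS, so make sure that the MinIO service domain (sss.pigsty by default) points to the MinIO server node, in one of the following ways:

Add a static resolution record in node_etc_hosts, or edit the /etc/hosts file manually.
Add a record on your internal DNS server, if you already have a DNS service.
Add a record in dns_records, if you have enabled the DNS server on the Infra node.

For production access to MinIO, we generally recommend the first option, static DNS records, so that MinIO has no extra dependency on DNS.

Point the MinIO service domain to the IP address and service port of the MinIO server node, or to the IP address and service port of the load balancer. The default MinIO service domain in Pigsty is sss.pigsty, which in a single-node deployment points to the local node and serves on port 9000.

In some examples, HAProxy instances are also deployed on the MinIO cluster to expose the service; in that case, 9002 is the service port used in the templates.

Add an Alias

To access the MinIO server cluster with the mcli client, first configure an alias for the server: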

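mcli alias ls  # list minio aliases (sss is used by default)
mcli alias set sss https://sss.pigsty:9000 minioadmin minioadmin              # root user
mcli alias set sss https://sss.pigsty:9002 minioadmin minioadmin              # root user, via the load balancer on port 9002

mcli alias set pgbackrest https://sss.pigsty:9000 pgbackrest S3User.Backup    # backup user

The admin user on the admin node already has a MinIO alias named sss configured by default, which you can use directly.

For the full reference of the MinIO client tool mcli, see the documentation: MinIO Client.

User Management

You can manage business users in MinIO with mcli. For example, the following commands create two business users:

mcli admin user list sss     # list all users on sss
set +o history # keep the passwords out of the shell history while creating minio users
mcli admin user add sss dba S3User.DBA
mcli admin user add sss pgbackrest S3User.Backup
set -o history

Bucket Management

You can create, list, and delete buckets in MinIO:

mcli ls sss/                         # list all buckets under the alias 'sss'
mcli mb --ignore-existing sss/hello  # create a bucket named 'hello'
mcli rb --force sss/hello            # force-delete the 'hello' bucket

Object Management

You can also upload, download, list, and delete objects in a bucket. For details, see the official documentation: Object Management.

mcli cp /www/pigsty/* sss/infra/     # upload the contents of the local software repo to the infra bucket in MinIO
mcli cp sss/infra/plugins.tgz /tmp/  # download a file from minio to the local disk
mcli ls sss/infra                    # list all files in the infra bucket
mcli rm sss/infra/plugins.tgz        # delete a specific file from the infra bucket
mcli cat sss/infra/repo_complete     # show the content of a file in the infra bucket

Using rclone

The Pigsty repository provides rclone, a convenient multi-cloud object storage client that you can use to access the MinIO service.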

yum install rclone; # el compatible
apt install rclone; # debian/ubuntu

mkdir -p ~/.config/rclone/;
tee ~/.config/rclone/rclone.conf > /dev/null <<EOF
[sss]
type = s3
access_key_id = minioadmin
secret_access_key = minioadmin
endpoint = sss.pigsty:9000
EOF

rclone ls sss:/

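Configure a Backup Repository

In Pigsty, the default use case for MinIO is as the backup repository for pgBackRest. When you set pgbackrest_method to minio, the PGSQL module automatically switches the backup repository to MinIO.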

pgbackrest_method: local          # pgbackrest repo method: local,minio,[user-defined...]
pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
  local:                          # default pgbackrest repo with local posix fs
    path: /pg/backup              # local backup directory, `/pg/backup` by default
    retention_full_type: count    # retention full backups by count
    retention_full: 2             # keep 2, at most 3 full backup when using local fs repo
  minio:                          # optional minio repo for pgbackrest
    type: s3                      # minio is s3-compatible, so s3 is used
    s3_endpoint: sss.pigsty       # minio endpoint domain name, `sss.pigsty` by default
    s3_region: us-east-1          # minio region, us-east-1 by default, useless for minio
    s3_bucket: pgsql              # minio bucket name, `pgsql` by default
    s3_key: pgbackrest            # minio user access key for pgbackrest
    s3_key_secret: S3User.Backup  # minio user secret key for pgbackrest
    s3_uri_style: path            # use path style uri for minio rather than host style
    path: /pgbackrest             # minio backup path, default is `/pgbackrest`
    storage_port: 9000            # minio port, 9000 by default
    storage_ca_file: /pg/cert/ca.crt  # minio ca file path, `/pg/cert/ca.crt` by default
    bundle: y                     # bundle small files into a single file
    cipher_type: aes-256-cbc      # enable AES encryption for remote backup repo
    cipher_pass: pgBackRest       # AES encryption password, default is 'pgBackRest'
    retention_full_type: time     # retention full backup by time on minio repo
    retention_full: 14            # keep full backup for last 14 days

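Note that if you use a multi-node MinIO cluster and expose it through a load balancer, you need to adjust the s3_endpoint and storage_port parameters here accordingly.

11.2 - Configuration

Choose the MinIO deployment mode that fits your scenario, and provide reliable access to it.

Before deploying MinIO, you need to define a MinIO cluster in the inventory. MinIO has three classic deployment modes:

Single-Node Single-Drive (SNSD): any directory can serve as the data drive; for development, testing, and demos only.
Single-Node Multi-Drive (SNMD): a compromise that uses multiple drives (>= 2) on a single server; use only when resources are extremely limited.
Multi-Node Multi-Drive (MNMD): the standard production deployment with the best reliability, but it requires multiple servers.

We generally recommend the SNSD and MNMD modes, the former for development and testing and the latter for production. Use SNMD only when resources are limited (a single server).

In addition, you can use a multi-pool deployment to expand an existing MinIO cluster, or deploy multiple clusters directly.

With a multi-node MinIO cluster, any node can serve requests, so the best practice is to put a load balancer with a highly available access mechanism in front of the MinIO cluster.

Core Parameters

In a MinIO deployment, MINIO_VOLUMES is the core configuration parameter that determines the deployment mode. Pigsty provides convenience parameters that generate MINIO_VOLUMES and other settings from the inventory automatically, but you can also specify them directly.

Single-Node Single-Drive: MINIO_VOLUMES points to a regular directory on the local node, specified by minio_data, which defaults to /data/minio.
Single-Node Multi-Drive: MINIO_VOLUMES points to a series of mount points on the local node, also specified by minio_data, which must be overridden explicitly with a special syntax that names the real mount points, for example /data{1...4}.
Multi-Node Multi-Drive: MINIO_VOLUMES points to a series of mount points across multiple servers, and is generated automatically from two parts:
- minio_data specifies the series of drive mount points on each cluster member, for example /data{1...4}.
- minio_node specifies the node naming pattern, for example ${minio_cluster}-${minio_seq}.pigsty.
Multi-Pool: you need to specify the minio_volumes parameter explicitly to assign nodes to each storage pool, which is how the cluster is expanded.

Single-Node Single-Drive

SNSD mode; reference deployment tutorial: MinIO Single-Node Single-Drive deployment.

Defining a singleton MinIO instance in Pigsty is simple:

# 1 node, 1 drive (default)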
minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

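In single-node mode, the only required parameters are minio_seq and minio_cluster, which together uniquely identify each MinIO instance.

Single-node single-drive mode is for development only, so you can use a regular directory as the data directory. The directory is specified by the minio_data parameter, which defaults to /data/minio.

When using MinIO, we strongly recommend accessing it through a statically resolved domain name. For example, if the internal service domain set by minio_domain uses the default sss.pigsty, you can add a static resolution record on all nodes so that other nodes can reach the service.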

node_etc_hosts: ["10.10.10.10 sss.pigsty"] # domain name to access minio from all nodes (required)

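SNSD is for development and testing only

Single-node mode should only be used for development, testing, and demos, because it cannot tolerate hardware failures and does not benefit from multi-drive performance.

Single-Node Multi-Drive

SNMD mode; reference deployment tutorial: MinIO Single-Node Multi-Drive deployment.

Using multiple drives on a single node works much like single-node single-drive mode, but you need to specify minio_data in the special {{ prefix }}{x...y} format, which defines a series of drive mount points.

minio:
  hosts: { 10.10.10.10: { minio_seq: 1 } }
  vars:
    minio_cluster: minio         # minio cluster name, minio by default
    minio_data: '/data{1...4}'   # minio data directories, use the {x...y} notation to specify multiple drives

Use real drive mount points

Note that SNMD mode does not support regular directories as data directories. If you start MinIO in SNMD mode and the data directories are not valid drive mount points, MinIO refuses to start.

For example, the Vagrant MinIO sandbox defines a single-node MinIO cluster with four drives: /data1, /data2, /data3, and /data4. Before starting MinIO, you need to mount them properly (be sure to format the drives with xfs):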

mkfs.xfs /dev/vdb; mkdir /data1; mount -t xfs /dev/vdb /data1;   # mount the 1st drive
mkfs.xfs /dev/vdc; mkdir /data2; mount -t xfs /dev/vdc /data2;   # mount the 2nd drive
mkfs.xfs /dev/vdd; mkdir /data3; mount -t xfs /dev/vdd /data3;   # mount the 3rd drive
mkfs.xfs /dev/vde; mkdir /data4; mount -t xfs /dev/vde /data4;   # mount the 4th drive

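Mounting drives is part of server provisioning and is beyond the scope of Pigsty. Mounted drives should also be written to /etc/fstab so that they are mounted automatically after a server reboot.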

/dev/vdb /data1 xfs defaults,noatime,nodiratime 0 0
/dev/vdc /data2 xfs defaults,noatime,nodiratime 0 0
/dev/vdd /data3 xfs defaults,noatime,nodiratime 0 0
/dev/vde /data4 xfs defaults,noatime,nodiratime 0 0

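SNMD mode can use multiple drives on a single machine to provide higher performance and capacity, and it tolerates the failure of some drives. However, single-node mode cannot tolerate the failure of the whole node, and you cannot add new nodes at runtime, so we do not recommend SNMD mode in production unless you have a specific reason.

Multi-Node Multi-Drive

MNMD mode; reference deployment tutorial: MinIO Multi-Node Multi-Drive deployment.

In addition to the minio_data parameter used in single-node multi-drive mode to specify the drives, a multi-node MinIO deployment requires an additional minio_node parameter.

For example, the following configuration defines a MinIO cluster with four nodes and four drives per node:

minio:
  hosts:
    10.10.10.10: { minio_seq: 1 }  # actual node name: minio-1.pigsty
    10.10.10.11: { minio_seq: 2 }  # actual node name: minio-2.pigsty
    10.10.10.12: { minio_seq: 3 }  # actual node name: minio-3.pigsty
    10.10.10.13: { minio_seq: 4 }  # actual node name: minio-4.pigsty
  vars:
    minio_cluster: minio
    minio_data: '/data{1...4}'                         # each node uses four drives
    minio_node: '${minio_cluster}-${minio_seq}.pigsty' # minio node naming pattern

The minio_node parameter specifies the naming pattern for MinIO nodes and is used to generate a unique name for each node. By default, the node name is ${minio_cluster}-${minio_seq}.pigsty, where ${minio_cluster} is the cluster name and ${minio_seq} is the node sequence number. MinIO instance names matter: they are written automatically into /etc/hosts on the MinIO nodes for static resolution, and MinIO relies on these names to identify and reach the other nodes in the cluster.

In this case, MINIO_VOLUMES is set to https://minio-{1...4}.pigsty/data{1...4} to identify the four drives on each of the four nodes. You can specify the minio_volumes parameter directly in the MinIO cluster to override the automatically generated value, but this is usually unnecessary because Pigsty generates it from the inventory.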
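
The composition rule described above can be sketched in a few lines of Python. This is an illustration of the rule as stated in this section, not Pigsty's actual template code; the function name `render_minio_volumes` and the optional `port` argument are assumptions made for the example.

```python
def render_minio_volumes(cluster, seqs, data,
                         node_pattern="${minio_cluster}-${minio_seq}.pigsty",
                         port=None):
    """Compose a MINIO_VOLUMES string from cluster name, node seqs and drive spec.

    seqs must be consecutive integers, because the {x...y} notation can only
    describe a contiguous series.
    """
    first, last = min(seqs), max(seqs)
    if sorted(seqs) != list(range(first, last + 1)):
        raise ValueError("minio_seq values must be consecutive")
    seq_range = f"{{{first}...{last}}}"          # e.g. {1...4}
    host = (node_pattern
            .replace("${minio_cluster}", cluster)
            .replace("${minio_seq}", seq_range))
    authority = host if port is None else f"{host}:{port}"
    return f"https://{authority}{data}"


if __name__ == "__main__":
    # Four nodes with four drives each, as in the example above
    print(render_minio_volumes("minio", [1, 2, 3, 4], "/data{1...4}"))
    # The same rule with an explicit port, as used when listing storage pools
    print(render_minio_volumes("minio", [5, 6, 7, 8], "/data{1...4}", port=9000))
```

The first call yields the value quoted in the paragraph above; passing a port produces the form used when minio_volumes is written out by hand.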

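Multi-Pool

MinIO's architecture allows a cluster to be expanded by adding new storage pools. In Pigsty, you expand a cluster by explicitly specifying the minio_volumes parameter to assign nodes to each storage pool.

For example, suppose you have already created the MinIO cluster defined in the multi-node multi-drive example, and you now want to add a new storage pool that also consists of four nodes.

In that case, you need to override the minio_volumes parameter directly: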

minio:
  hosts:
    10.10.10.10: { minio_seq: 1 }
    10.10.10.11: { minio_seq: 2 }
    10.10.10.12: { minio_seq: 3 }
    10.10.10.13: { minio_seq: 4 }
    
    10.10.10.14: { minio_seq: 5 }
    10.10.10.15: { minio_seq: 6 }
    10.10.10.16: { minio_seq: 7 }
    10.10.10.17: { minio_seq: 8 }
  vars:
    minio_cluster: minio
    minio_data: "/data{1...4}"
    minio_node: '${minio_cluster}-${minio_seq}.pigsty' # minio node naming pattern
    minio_volumes: 'https://minio-{1...4}.pigsty:9000/data{1...4} https://minio-{5...8}.pigsty:9000/data{1...4}'

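Here, the two space-separated arguments represent two storage pools, each with four nodes and four drives per node. For more information about storage pools, see the administration guide: MinIO Cluster Expansion.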
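
To see what a minio_volumes value covers, the {x...y} notation can be expanded by hand. The Python sketch below does this for illustration only; it is not how MinIO or Pigsty parse the value, and the helper names `expand` and `expand_pools` are hypothetical.

```python
import itertools
import re

# Matches one {x...y} range with non-negative integer bounds
_RANGE = re.compile(r"\{(\d+)\.\.\.(\d+)\}")


def expand(pattern):
    """Expand every {x...y} range in a pattern into the concrete strings."""
    # With two capture groups, split() yields: text, lo, hi, text, lo, hi, ..., text
    parts = _RANGE.split(pattern)
    literals = parts[0::3]
    ranges = [range(int(lo), int(hi) + 1)
              for lo, hi in zip(parts[1::3], parts[2::3])]
    results = []
    for combo in itertools.product(*ranges):
        pieces = [literals[0]]
        for number, literal in zip(combo, literals[1:]):
            pieces.append(str(number))
            pieces.append(literal)
        results.append("".join(pieces))
    return results


def expand_pools(minio_volumes):
    """Split a space-separated minio_volumes value into per-pool drive lists."""
    return [expand(pool) for pool in minio_volumes.split()]


if __name__ == "__main__":
    volumes = ("https://minio-{1...4}.pigsty:9000/data{1...4} "
               "https://minio-{5...8}.pigsty:9000/data{1...4}")
    for index, pool in enumerate(expand_pools(volumes), start=1):
        print(f"pool {index}: {len(pool)} drives, first={pool[0]}, last={pool[-1]}")
```

Each pool in the example expands to 16 drive endpoints (four nodes times four drives), which matches the description above.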

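Multiple Clusters

You can deploy new MinIO nodes as a completely new MinIO cluster by defining a new group with a different cluster name. The following configuration declares two independent MinIO clusters: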

minio1:
  hosts:
    10.10.10.10: { minio_seq: 1 }
    10.10.10.11: { minio_seq: 2 }
    10.10.10.12: { minio_seq: 3 }
    10.10.10.13: { minio_seq: 4 }
  vars:
    minio_cluster: minio1
    minio_data: "/data{1...4}"

minio2:
  hosts:    
    10.10.10.14: { minio_seq: 5 }
    10.10.10.15: { minio_seq: 6 }
    10.10.10.16: { minio_seq: 7 }
    10.10.10.17: { minio_seq: 8 }
  vars:
    minio_cluster: minio2
    minio_data: "/data{1...4}"
    minio_alias: sss2
    minio_domain: sss2.pigsty
    minio_endpoint: sss2.pigsty:9000

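Note that Pigsty assumes a single MinIO cluster per deployment by default. If you need to deploy multiple MinIO clusters, some parameters that have default values must be set explicitly and cannot be omitted, otherwise naming conflicts occur, as shown above.

Service Access

MinIO serves on port 9000 by default. A multi-node MinIO cluster can be accessed through any one of its nodes.

Service access falls within the scope of the NODE module, so only a basic introduction is given here.

Highly available access to a multi-node MinIO cluster can be implemented with an L2 VIP or HAProxy. For example, you can use keepalived to bind an L2 VIP to the MinIO cluster, or use the haproxy component provided by the NODE module to expose the MinIO service through a load balancer.

# minio cluster with 4 nodes and 4 drives per node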
minio:
  hosts:
    10.10.10.10: { minio_seq: 1 , nodename: minio-1 }
    10.10.10.11: { minio_seq: 2 , nodename: minio-2 }
    10.10.10.12: { minio_seq: 3 , nodename: minio-3 }
    10.10.10.13: { minio_seq: 4 , nodename: minio-4 }
  vars:
    minio_cluster: minio
    minio_data: '/data{1...4}'
    minio_buckets: [ { name: pgsql }, { name: infra }, { name: redis } ]
    minio_users:
      - { access_key: dba , secret_key: S3User.DBA, policy: consoleAdmin }
      - { access_key: pgbackrest , secret_key: S3User.SomeNewPassWord , policy: readwrite }

    # bind a node l2 vip (10.10.10.9) to minio cluster (optional)
    node_cluster: minio
    vip_enabled: true
    vip_vrid: 128
    vip_address: 10.10.10.9
    vip_interface: eth1

    # expose minio service with haproxy on all nodes
    haproxy_services:
      - name: minio                    # [REQUIRED] service name, unique
        port: 9002                     # [REQUIRED] service port, unique
        balance: leastconn             # [OPTIONAL] load balancer algorithm
        options:                       # [OPTIONAL] minio health check
          - option httpchk
          - option http-keep-alive
          - http-check send meth OPTIONS uri /minio/health/live
          - http-check expect status 200
        servers:
          - { name: minio-1 ,ip: 10.10.10.10 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-2 ,ip: 10.10.10.11 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-3 ,ip: 10.10.10.12 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-4 ,ip: 10.10.10.13 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }

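For example, the configuration block above enables HAProxy on all nodes of the MinIO cluster, exposes the MinIO service on port 9002, and binds a layer-2 VIP to the cluster. Clients should resolve the sss.pigsty domain to the VIP address 10.10.10.9 and access the MinIO service on port 9002. If any node fails, the VIP moves to another node automatically, which keeps the service highly available.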
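
The servers list in such a haproxy_services entry is repetitive, so it can be generated rather than typed. Below is a minimal Python sketch that builds the same entries from a mapping of IP address to minio_seq; the function name `minio_backend_servers` is hypothetical, and the option string simply mirrors the one used in the configuration above.

```python
def minio_backend_servers(cluster, hosts, port=9000, ca_file="/etc/pki/ca.crt"):
    """Build haproxy backend server entries for a MinIO cluster.

    hosts maps each node IP address to its minio_seq number.
    """
    options = f"check-ssl ca-file {ca_file} check port {port}"
    return [
        {"name": f"{cluster}-{seq}", "ip": ip, "port": port, "options": options}
        for ip, seq in sorted(hosts.items(), key=lambda item: item[1])
    ]


if __name__ == "__main__":
    hosts = {
        "10.10.10.10": 1,
        "10.10.10.11": 2,
        "10.10.10.12": 3,
        "10.10.10.13": 4,
    }
    for server in minio_backend_servers("minio", hosts):
        print(server)
```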

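In this case, you usually also need to change the target of the domain name resolution globally, along with the minio_endpoint parameter, which sets the endpoint written into the MinIO alias on the admin node:

minio_endpoint: https://sss.pigsty:9002   # override the default: https://sss.pigsty:9000
node_etc_hosts: ["10.10.10.9 sss.pigsty"] # other nodes use the sss.pigsty domain name to access MinIO

Dedicated Load Balancer

Pigsty lets you run the VIP and HAProxy on a dedicated group of load balancer servers instead of on the cluster itself. The prod template, for example, uses this approach.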

proxy:
  hosts:
    10.10.10.18 : { nodename: proxy1 ,node_cluster: proxy ,vip_interface: eth1 ,vip_role: master }
    10.10.10.19 : { nodename: proxy2 ,node_cluster: proxy ,vip_interface: eth1 ,vip_role: backup }
  vars:
    vip_enabled: true
    vip_address: 10.10.10.20
    vip_vrid: 20
    
    haproxy_services:      # expose minio service : sss.pigsty:9000
      - name: minio        # [REQUIRED] service name, unique
        port: 9000         # [REQUIRED] service port, unique
        balance: leastconn # Use leastconn algorithm and minio health check
        options: [ "option httpchk", "option http-keep-alive", "http-check send meth OPTIONS uri /minio/health/live", "http-check expect status 200" ]
        servers:           # reload service with ./node.yml -t haproxy_config,haproxy_reload
          - { name: minio-1 ,ip: 10.10.10.21 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-2 ,ip: 10.10.10.22 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-3 ,ip: 10.10.10.23 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-4 ,ip: 10.10.10.24 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-5 ,ip: 10.10.10.25 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }

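In this case you usually also need to change the global resolution of the MinIO domain so that sss.pigsty points to the load balancer address, and review the minio_endpoint parameter, which sets the endpoint written into the MinIO alias on the admin node: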

minio_endpoint: https://sss.pigsty:9000    # the proxy group above exposes MinIO on port 9000, same as the default endpoint
node_etc_hosts: ["10.10.10.20 sss.pigsty"] # domain name to access minio from all nodes (required)

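Accessing the Service

To access MinIO exposed through HAProxy as described above, take the PGSQL backup configuration as an example: edit pgbackrest_repo and add a new backup repository definition:

# Newly added HA MinIO repo definition; use it in place of the earlier single-node MinIO configuration
minio_ha:
  type: s3
  s3_endpoint: minio-1.pigsty   # s3_endpoint can be any load balancer (10.10.10.1{0,1,2}) or a domain name pointing to any of the 3 nodes
  s3_region: us-east-1          # you can use the external domain sss.pigsty, which points to any member (`minio_domain`)
  s3_bucket: pgsql              # you can use instance names and node names: minio-1.pigsty minio-2.pigsty minio-3.pigsty minio-1 minio-2 minio-3
  s3_key: pgbackrest            # preferably use a dedicated password for the MinIO pgbackrest user
  s3_key_secret: S3User.SomeNewPassWord
  s3_uri_style: path
  path: /pgbackrest
  storage_port: 9002            # use load balancer port 9002 instead of the default 9000 (direct access)
  storage_ca_file: /etc/pki/ca.crt
  bundle: y
  cipher_type: aes-256-cbc      # preferably use a new cipher password in production; the cluster name can be part of it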
  cipher_pass: pgBackRest.With.Some.Extra.PassWord.And.Salt.${pg_cluster}
  retention_full_type: time
  retention_full: 14

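Exposing the Console

By default, MinIO serves its web console on port 9001 (set by the minio_admin_port parameter).

Exposing the admin console to the outside world can be a security risk. If you still want to do so, add MinIO to infra_portal and refresh the Nginx configuration.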

# ./infra.yml -t nginx
infra_portal:
  home         : { domain: h.pigsty }
  grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
  prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
  alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }

  # The MinIO console needs HTTPS and WebSocket to work
  minio        : { domain: m.pigsty     ,endpoint: "10.10.10.10:9001" ,scheme: https ,websocket: true }
  minio10      : { domain: m10.pigsty   ,endpoint: "10.10.10.10:9001" ,scheme: https ,websocket: true }
  minio11      : { domain: m11.pigsty   ,endpoint: "10.10.10.11:9001" ,scheme: https ,websocket: true }
  minio12      : { domain: m12.pigsty   ,endpoint: "10.10.10.12:9001" ,scheme: https ,websocket: true }
  minio13      : { domain: m13.pigsty   ,endpoint: "10.10.10.13:9001" ,scheme: https ,websocket: true }

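Note that the MinIO console requires HTTPS. Do not expose an unencrypted MinIO console in production.

This also means you usually need to add a resolution record for m.pigsty to your DNS server or local /etc/hosts before you can reach the MinIO console.

In addition, if you use the self-signed Pigsty CA rather than a regular public CA, you usually need to trust that CA or certificate manually to get past the browser's "not secure" warning.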

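11.3 - Parameters

The MinIO module provides 17 configuration parameters for customizing a MinIO cluster.

MinIO is an S3-compatible object storage service. Pigsty uses it as an optional centralized backup repository for PostgreSQL.

You can also use it for other purposes, such as storing files, documents, images, and videos, or as a data lake.

Parameter List

The MinIO module has 17 parameters:

Parameter	Type	Level	Description
`minio_seq`	int	I	minio instance identifier, required
`minio_cluster`	string	C	minio cluster name, `minio` by default
`minio_clean`	bool	G/C/A	clean up minio during init? false by default
`minio_user`	username	C	minio OS user, `minio` by default
`minio_node`	string	C	minio node name pattern
`minio_data`	path	C	minio data directory, use `{x...y}` to specify multiple drives
`minio_volumes`	string	C	minio core parameter that specifies member nodes and drives, unset by default
`minio_domain`	string	G	minio external domain name, `sss.pigsty` by default
`minio_port`	port	C	minio service port, 9000 by default
`minio_admin_port`	port	C	minio console port, 9001 by default
`minio_access_key`	username	C	root access key, `minioadmin` by default
`minio_secret_key`	password	C	root secret key, `minioadmin` by default
`minio_extra_vars`	string	C	extra environment variables for the minio server
`minio_alias`	string	C	client alias for the minio deployment
`minio_endpoint`	string	C	endpoint that the client alias points to
`minio_buckets`	bucket[]	C	list of minio buckets to create
`minio_users`	user[]	C	list of minio users to create

Among these, minio_volumes and minio_endpoint are generated automatically, but you can override both explicitly.

Default Parameters

The default parameters of the MinIO module are: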

#-----------------------------------------------------------------
# MINIO
#-----------------------------------------------------------------
#minio_seq: 1                     # minio instance identifier, REQUIRED
minio_cluster: minio              # minio cluster name, `minio` by default
minio_clean: false                # clean up minio during init? false by default
minio_user: minio                 # minio os user, `minio` by default
minio_node: '${minio_cluster}-${minio_seq}.pigsty' # minio node name pattern
minio_data: '/data/minio'         # minio data dir(s), use `{x...y}` to specify multiple drives
#minio_volumes:                   # minio core parameter; if not specified, a default is assembled from other parameters
minio_domain: sss.pigsty          # minio external domain name, `sss.pigsty` by default
minio_port: 9000                  # minio service port, 9000 by default
minio_admin_port: 9001            # minio console port, 9001 by default
minio_access_key: minioadmin      # root access key, `minioadmin` by default
minio_secret_key: minioadmin      # root secret key, `minioadmin` by default
minio_extra_vars: ''              # extra environment variables for the minio server
minio_alias: sss                  # client alias for the minio deployment
#minio_endpoint: https://sss.pigsty:9000 # endpoint for the alias; if not specified, a default is assembled from other parameters
minio_buckets: [ { name: pgsql }, { name: infra },  { name: redis } ] # list of minio buckets to create
minio_users:                      # list of minio users to create
  - { access_key: dba , secret_key: S3User.DBA, policy: consoleAdmin }
  - { access_key: pgbackrest , secret_key: S3User.Backup, policy: readwrite }

`minio_seq`

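Name: minio_seq, Type: int, Level: I

MinIO instance identifier, a required identity parameter. It has no default value, so you must assign these sequence numbers manually.

The usual best practice is to start from 1, increment by 1, and never reuse a sequence number that has already been assigned.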
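The allocation rule above can be sketched in a few lines of Python. This is only an illustration: the helper name next_minio_seq is hypothetical and not part of Pigsty, and it assumes you keep a record of every number ever assigned, including those of retired nodes.

```python
def next_minio_seq(assigned):
    """Return the next sequence number: one past the highest number ever assigned.

    `assigned` must include numbers of retired instances, so that a number
    is never handed out twice.
    """
    return max(assigned, default=0) + 1


if __name__ == "__main__":
    print(next_minio_seq([]))            # first instance of a new cluster
    print(next_minio_seq([1, 2, 3, 4]))  # next number after a four-node cluster
```

Taking the maximum rather than the first free gap is what keeps a retired number (for example 3 in {1, 2, 4}) from being reused.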

`minio_cluster`

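Name: minio_cluster, Type: string, Level: C

MinIO cluster name, minio by default. Use this parameter to tell clusters apart when you deploy more than one MinIO cluster.

`minio_clean`

Name: minio_clean, Type: bool, Level: G/C/A

Whether to clean up MinIO during initialization. The default is false, which leaves existing data untouched.

Set this parameter to true if you want to wipe the MinIO data directory during removal or installation. This is a dangerous operation because it deletes all MinIO data!

`minio_user`

Name: minio_user, Type: username, Level: C

MinIO operating system user name, minio by default. The certificates used by MinIO are stored in this user's home directory.

`minio_node`

Name: minio_node, Type: string, Level: C

MinIO node name pattern, used for multi-node deployments.

The default is ${minio_cluster}-${minio_seq}.pigsty, that is, the instance name plus the .pigsty suffix.

The domain name pattern given here is used to generate node names, which are written to /etc/hosts on all MinIO nodes.

`minio_data`

Name: minio_data, Type: path, Level: C

MinIO data directory (or directories). The default is /data/minio, a common choice for single-node deployments.

For multi-node multi-drive and single-node multi-drive deployments, use the {x...y} notation to specify multiple drives.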
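To make the {x...y} notation concrete, the sketch below expands such a pattern into the individual paths it stands for. It is an illustration only: MinIO performs this expansion itself, the function name expand_ellipsis is hypothetical, and the sketch handles plain decimal ranges without covering every form MinIO may accept.

```python
import re


def expand_ellipsis(pattern):
    """Expand every {x...y} range in `pattern` into a list of concrete strings."""
    match = re.search(r"\{(\d+)\.\.\.(\d+)\}", pattern)
    if match is None:
        return [pattern]  # nothing left to expand
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for i in range(start, end + 1):
        # Recurse so that a second range in the suffix is expanded as well.
        expanded.extend(expand_ellipsis(f"{prefix}{i}{suffix}"))
    return expanded


if __name__ == "__main__":
    for path in expand_ellipsis("/data{1...4}"):
        print(path)
```

A value such as '/data{1...4}' therefore describes four drives per node, mounted at /data1 through /data4.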

`minio_volumes`

Name: minio_volumes, Type: string, Level: C

MinIO core parameter. It is unset by default; when left empty, its value is assembled automatically by the following rule:

minio_volumes: "{% if minio_cluster_size|int > 1 %}https://{{ minio_node|replace('${minio_cluster}', minio_cluster)|replace('${minio_seq}',minio_seq_range) }}:{{ minio_port|default(9000) }}{% endif %}{{ minio_data }}"

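In single-node mode (single-drive or multi-drive), minio_volumes simply takes the value of minio_data.
In multi-node mode, minio_volumes is built from the minio_node, minio_port, and minio_data parameters to produce the multi-node address.
In multi-pool mode, you usually need to set minio_volumes explicitly to list the addresses of every node pool.

When you set this parameter yourself, make sure the value is consistent with minio_node, minio_port, and minio_data.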
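The template rule for the single-node and multi-node cases can be mirrored in Python to show what string comes out. This is a sketch, not Pigsty's implementation: the function name render_minio_volumes is hypothetical, and the `{1...N}` form used for minio_seq_range is an assumption inferred from the multi-pool example in this document.

```python
def render_minio_volumes(cluster, cluster_size, data,
                         node_pattern="${minio_cluster}-${minio_seq}.pigsty",
                         port=9000):
    """Mirror the default minio_volumes rule for single-node and multi-node setups."""
    if cluster_size <= 1:
        # Single node: the volumes string is just the data path(s).
        return data
    # Assumption: minio_seq_range looks like "{1...N}" for an N-node cluster.
    seq_range = "{1...%d}" % cluster_size
    host = (node_pattern
            .replace("${minio_cluster}", cluster)
            .replace("${minio_seq}", seq_range))
    return "https://" + host + ":" + str(port) + data


if __name__ == "__main__":
    print(render_minio_volumes("minio", 1, "/data/minio"))
    print(render_minio_volumes("minio", 4, "/data{1...4}"))
```

Multi-pool layouts are not covered by this rule, which is why they require an explicit minio_volumes value.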

`minio_domain`

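Name: minio_domain, Type: string, Level: G

MinIO service domain name, sss.pigsty by default.

Clients can reach the MinIO S3 service through this domain name. The name is registered in the local DNSMASQ and included in the SSL certificate fields.

`minio_port`

Name: minio_port, Type: port, Level: C

MinIO service port, 9000 by default.

`minio_admin_port`

Name: minio_admin_port, Type: port, Level: C

MinIO console port, 9001 by default.

`minio_access_key`

Name: minio_access_key, Type: username, Level: C

Root user name (access key), minioadmin by default.

`minio_secret_key`

Name: minio_secret_key, Type: password, Level: C

Root password (secret key), minioadmin by default.

Using the default root password is highly risky! Be sure to change this parameter in production deployments!

`minio_extra_vars`

Name: minio_extra_vars, Type: string, Level: C

Extra environment variables for the MinIO server. See the MinIO Server documentation for the full list.

The default is an empty string. Use a multi-line string to pass several environment variables.

`minio_alias`

Name: minio_alias, Type: string, Level: C

MinIO alias for the local MinIO cluster, sss by default. It is written to the client alias configuration of the admin user on the infra nodes.

`minio_endpoint`

Name: minio_endpoint, Type: string, Level: C

Endpoint that the deployed client alias points to. If set, the minio_endpoint value (for example https://sss.pigsty:9002) replaces the default as the target endpoint written into the MinIO alias on the admin node.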

mcli alias set {{ minio_alias }} {% if minio_endpoint is defined and minio_endpoint != '' %}{{ minio_endpoint }}{% else %}https://{{ minio_domain }}:{{ minio_port }}{% endif %} {{ minio_access_key }} {{ minio_secret_key }}

The MinIO alias command above is run on the admin node as the default admin user.
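The fallback in the alias template can be restated in Python to show which endpoint ends up in the command. This is a sketch under stated assumptions: the function names resolve_minio_endpoint and alias_command are hypothetical, and the defaults are the parameter defaults listed in this section.

```python
def resolve_minio_endpoint(minio_domain="sss.pigsty", minio_port=9000,
                           minio_endpoint=None):
    """Use minio_endpoint when it is defined and non-empty, else domain:port."""
    if minio_endpoint:
        return minio_endpoint
    return f"https://{minio_domain}:{minio_port}"


def alias_command(alias, endpoint, access_key, secret_key):
    """Assemble the mcli command line that the template renders."""
    return f"mcli alias set {alias} {endpoint} {access_key} {secret_key}"


if __name__ == "__main__":
    default_endpoint = resolve_minio_endpoint()
    print(alias_command("sss", default_endpoint, "minioadmin", "minioadmin"))
    lb_endpoint = resolve_minio_endpoint(minio_endpoint="https://sss.pigsty:9002")
    print(alias_command("sss", lb_endpoint, "minioadmin", "minioadmin"))
```

An empty string behaves like an undefined value here, which matches the `!= ''` check in the template.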

`minio_buckets`

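Name: minio_buckets, Type: bucket[], Level: C

List of MinIO buckets created by default:

minio_buckets: [ { name: pgsql }, { name: infra },  { name: redis } ]

Three default buckets are created, one each for the PGSQL, INFRA, and REDIS modules.

Of these three buckets, only pgsql is currently used by default, as the pgBackRest backup repository.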

`minio_users`

Name: minio_users, Type: user[], Level: C

List of MinIO users to create. The default is:

minio_users:
  - { access_key: dba , secret_key: S3User.DBA, policy: consoleAdmin }
  - { access_key: pgbackrest , secret_key: S3User.Backup, policy: readwrite }

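The default configuration creates two users, dba and pgbackrest, for the PostgreSQL DBA and pgBackRest respectively.

Using the default passwords is highly risky! Be sure to change these credentials in your deployment!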

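11.4 - Playbooks

How to manage MinIO clusters with the built-in Ansible playbooks, plus a cheat sheet of common commands.

The MinIO module ships one default playbook, minio.yml, for installing MinIO clusters.

`minio.yml`

The minio.yml playbook installs the MinIO module on nodes.

minio-id : generate / validate minio identity parameters
minio_install : install minio
- minio_os_user : create the minio OS user
- minio_pkg : install the minio/mcli packages
- minio_clean : remove the minio data directory (not removed by default)
- minio_dir : create minio directories
minio_config : generate minio configuration
- minio_conf : minio main config file
- minio_cert : issue minio SSL certificates
- minio_dns : insert minio DNS records
minio_launch : launch the minio service
minio_register : register minio with monitoring
minio_provision : create minio aliases / buckets / business users
- minio_alias : create the minio client alias (on the admin node)
- minio_bucket : create minio buckets
- minio_user : create minio business users

Before running the playbook, finish configuring the MinIO cluster in the inventory.

Command Cheat Sheet

MINIO playbooks and shortcuts:

./minio.yml -l <cls>                      # install the MINIO module on group <cls>
./minio.yml -l minio -e minio_clean=true  # install the MINIO module and wipe the existing data directory during install (DANGEROUS!)
./minio.yml -l minio -e minio_clean=true  -t minio_clean # stop MinIO and wipe the MinIO data directory (DANGEROUS!)
./minio.yml -l minio -t minio_install     # install the MinIO service on nodes and prepare data directories, without starting it
./minio.yml -l minio -t minio_config      # reconfigure the MinIO cluster
./minio.yml -l minio -t minio_launch      # restart the MinIO cluster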

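Safeguard

To prevent accidental deletion, the MINIO module in Pigsty provides a safeguard controlled by the following parameter:

minio_clean is false by default, so existing instances are not cleaned up by default.

To clean up an existing instance during initialization, either edit the configuration file to turn this safeguard off explicitly, or override it on the command line with -e minio_clean=true.

If you only want to clean up an existing instance without installing a new one, run the minio_clean subtask directly: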

./minio.yml -l <cls> -e minio_clean=true -t minio_clean


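11.5 - Administration

MinIO cluster administration SOPs: creating, destroying, scaling out, and scaling in clusters, and handling node and disk failures.

Create a Cluster

To create a cluster, define it in the inventory and then run the minio.yml playbook.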

minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

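For example, the configuration above defines an SNSD (single-node, single-drive) MinIO cluster. Create it with the following command:

./minio.yml -l minio  # install the MinIO module on the minio group

Destroy a Cluster

To destroy a cluster, run the minio_clean subtask of the minio.yml playbook:

./minio.yml -l minio -t minio_clean -e minio_clean=true   # stop MinIO and wipe its data directory

To remove MinIO from the Prometheus monitoring system as well, run:

ansible infra -b -a 'rm -rf /etc/prometheus/targets/minio/minio-1.yml'  # remove the MinIO monitoring target minio-1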

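Scale Out a Cluster

Cluster scale-out tutorial

MinIO cannot be scaled out at the node or drive level, but it can be scaled out at the storage pool level (a pool spans multiple nodes).

Suppose you have the four-node MinIO cluster below and want to double its capacity by adding a new four-node storage pool.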

minio:
  hosts:
    10.10.10.10: { minio_seq: 1 , nodename: minio-1 }
    10.10.10.11: { minio_seq: 2 , nodename: minio-2 }
    10.10.10.12: { minio_seq: 3 , nodename: minio-3 }
    10.10.10.13: { minio_seq: 4 , nodename: minio-4 }
  vars:
    minio_cluster: minio
    minio_data: '/data{1...4}'
    minio_buckets: [ { name: pgsql }, { name: infra }, { name: redis } ]
    minio_users:
      - { access_key: dba , secret_key: S3User.DBA, policy: consoleAdmin }
      - { access_key: pgbackrest , secret_key: S3User.SomeNewPassWord , policy: readwrite }

    # bind a node l2 vip (10.10.10.9) to minio cluster (optional)
    node_cluster: minio
    vip_enabled: true
    vip_vrid: 128
    vip_address: 10.10.10.9
    vip_interface: eth1

    # expose minio service with haproxy on all nodes
    haproxy_services:
      - name: minio                    # [REQUIRED] service name, unique
        port: 9002                     # [REQUIRED] service port, unique
        balance: leastconn             # [OPTIONAL] load balancer algorithm
        options:                       # [OPTIONAL] minio health check
          - option httpchk
          - option http-keep-alive
          - http-check send meth OPTIONS uri /minio/health/live
          - http-check expect status 200
        servers:
          - { name: minio-1 ,ip: 10.10.10.10 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-2 ,ip: 10.10.10.11 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-3 ,ip: 10.10.10.12 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-4 ,ip: 10.10.10.13 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }

First, edit the MinIO cluster definition: add four nodes and assign them sequence numbers 5 through 8 in order. The key step here is to change the minio_volumes parameter so that the four new nodes form a new storage pool.

minio:
  hosts:
    10.10.10.10: { minio_seq: 1 , nodename: minio-1 }
    10.10.10.11: { minio_seq: 2 , nodename: minio-2 }
    10.10.10.12: { minio_seq: 3 , nodename: minio-3 }
    10.10.10.13: { minio_seq: 4 , nodename: minio-4 }
    # the four new nodes
    10.10.10.14: { minio_seq: 5 , nodename: minio-5 }
    10.10.10.15: { minio_seq: 6 , nodename: minio-6 }
    10.10.10.16: { minio_seq: 7 , nodename: minio-7 }
    10.10.10.17: { minio_seq: 8 , nodename: minio-8 }

  vars:
    minio_cluster: minio
    minio_data: '/data{1...4}'
    minio_volumes: 'https://minio-{1...4}.pigsty:9000/data{1...4} https://minio-{5...8}.pigsty:9000/data{1...4}'  # new layout: one entry per storage pool
    # ... other settings omitted
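The multi-pool minio_volumes string is simply one address pattern per pool, joined by spaces. The Python sketch below builds it from a list of sequence ranges; the function names pool_volume and multi_pool_volumes are hypothetical helpers for illustration, and the .pigsty suffix and port 9000 are the defaults used in this document.

```python
def pool_volume(cluster, first_seq, last_seq, data, port=9000, suffix="pigsty"):
    """Address pattern for one storage pool covering nodes first_seq..last_seq."""
    seq_range = "{%d...%d}" % (first_seq, last_seq)
    return "https://" + cluster + "-" + seq_range + "." + suffix + ":" + str(port) + data


def multi_pool_volumes(cluster, pools, data, port=9000):
    """Join one pattern per pool with spaces, in pool order."""
    return " ".join(pool_volume(cluster, first, last, data, port)
                    for first, last in pools)


if __name__ == "__main__":
    # Existing pool (nodes 1-4) plus the new pool (nodes 5-8).
    print(multi_pool_volumes("minio", [(1, 4), (5, 8)], "/data{1...4}"))
```

Keeping the existing pool's pattern unchanged and appending the new pool is what lets MinIO treat nodes 5 through 8 as an additional pool rather than a changed one.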

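Step 2: bring these nodes under Pigsty management:

./node.yml -l 10.10.10.14,10.10.10.15,10.10.10.16,10.10.10.17

Step 3: on the new nodes, install and prepare the MinIO software with the Ansible playbook:

./minio.yml -l 10.10.10.14,10.10.10.15,10.10.10.16,10.10.10.17 -t minio_install

Step 4: on the entire cluster, reconfigure MinIO with the Ansible playbook:

./minio.yml -l minio -t minio_config

This step updates the MINIO_VOLUMES configuration on the four existing nodes.

Step 5: restart the entire MinIO cluster at once (do not perform a rolling restart!):

./minio.yml -l minio -t minio_launch -f 10   # 10 forks cover all 8 nodes, so they restart at the same time

Step 6 (optional): if you use a load balancer, make sure its configuration is updated as well. For example, add the four new nodes to the load balancer configuration: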

# expose minio service with haproxy on all nodes
haproxy_services:
  - name: minio                    # [REQUIRED] service name, unique
    port: 9002                     # [REQUIRED] service port, unique
    balance: leastconn             # [OPTIONAL] load balancer algorithm
    options:                       # [OPTIONAL] minio health check
      - option httpchk
      - option http-keep-alive
      - http-check send meth OPTIONS uri /minio/health/live
      - http-check expect status 200
    servers:
      - { name: minio-1 ,ip: 10.10.10.10 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
      - { name: minio-2 ,ip: 10.10.10.11 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
      - { name: minio-3 ,ip: 10.10.10.12 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
      - { name: minio-4 ,ip: 10.10.10.13 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }

      - { name: minio-5 ,ip: 10.10.10.14 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
      - { name: minio-6 ,ip: 10.10.10.15 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
      - { name: minio-7 ,ip: 10.10.10.16 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
      - { name: minio-8 ,ip: 10.10.10.17 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
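Writing eight near-identical server entries by hand is error-prone, so a small generator can help when preparing the inventory. The Python sketch below is illustrative only: the function name haproxy_servers is hypothetical, and it assumes that IP order matches minio_seq order, as in the example above.

```python
def haproxy_servers(ips, port=9000, ca_file="/etc/pki/ca.crt"):
    """Build one HAProxy server entry per MinIO node, numbered from 1."""
    options = f"check-ssl ca-file {ca_file} check port {port}"
    return [
        {"name": f"minio-{seq}", "ip": ip, "port": port, "options": options}
        for seq, ip in enumerate(ips, start=1)
    ]


if __name__ == "__main__":
    ips = [f"10.10.10.{n}" for n in range(10, 18)]  # 10.10.10.10 .. 10.10.10.17
    for server in haproxy_servers(ips):
        print(server)
```

The resulting dictionaries carry the same name, ip, port, and options keys as the servers list in the configuration above.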

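Then run the haproxy subtasks of the node.yml playbook to update the load balancer configuration:

./node.yml -l minio -t haproxy_config,haproxy_reload   # update the load balancer configuration and reload it online

If you use an L2 VIP for reliable access to the load balancer, also add the new nodes (if any) to the existing NODE VIP group:

./node.yml -l minio -t node_vip  # refresh the cluster L2 VIP configuration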

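Scale In a Cluster

MinIO cannot be scaled in at the node or drive level, but a storage pool (multiple nodes) can be decommissioned: add a new storage pool, drain the old pool so its data migrates to the new one, and then retire the old pool.

Cluster scale-in tutorial

Upgrade a Cluster

Cluster upgrade tutorial

First, download the new MinIO packages into the local software repo on the INFRA node, then rebuild the repo index: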

minio:
- amd64: https://dl.min.io/server/minio/release/linux-amd64/
- arm64: https://dl.min.io/server/minio/release/linux-arm64/
mcli:
- amd64: https://dl.min.io/client/mc/release/linux-amd64/
- arm64: https://dl.min.io/client/mc/release/linux-arm64/

./infra.yml -t repo_create

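Next, upgrade the MinIO packages in bulk with Ansible:

ansible minio -m package -b -a 'name=minio state=latest'  # upgrade the MinIO server package
ansible minio -m package -b -a 'name=mcli state=latest'   # upgrade the MinIO client package

Finally, tell the MinIO cluster to restart with the mc command-line tool (packaged as mcli in Pigsty):

mcli admin service restart sss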

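Replace a Failed Node

Node failure tutorial

# 1. remove the failed node from the cluster
bin/node-rm <your_old_node_ip>

# 2. replace the failed node, reusing the original node name (if the IP changed, update the MinIO cluster definition)
bin/node-add <your_new_node_ip>

# 3. install and configure MinIO on the new node
./minio.yml -l <your_new_node_ip>

# 4. instruct MinIO to run the healing process (sss is the default minio_alias)
mcli admin heal sss

Replace a Failed Disk

Disk failure tutorial

# 1. unmount the failed disk to take it out of the cluster
umount /dev/<your_disk_device>

# 2. replace the failed disk and format the new one with xfs
mkfs.xfs /dev/sdb -L DRIVE1

# 3. do not forget to set up automatic mounting at boot
vi /etc/fstab
# LABEL=DRIVE1     /mnt/drive1    xfs     defaults,noatime  0       2

# 4. remount
mount -a

# 5. instruct MinIO to run the healing process (sss is the default minio_alias)
mcli admin heal sss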

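11.6 - Monitoring and Alerting

How do you monitor MinIO in Pigsty? How do you use MinIO's own console? Which alerting rules are worth watching?

Built-in Console

MinIO ships with a capable built-in console. By default you can reach it over HTTPS on the admin port (minio_admin_port, 9001 by default) of any MinIO instance.

In most configuration templates that provide MinIO, the console is exposed as a custom service under m.pigsty. Once domain name resolution is configured, you can open the console at https://m.pigsty.

Pigsty Monitoring

Pigsty provides two monitoring dashboards for the MINIO module:

MinIO Overview shows overall metrics for the MinIO cluster.

MinIO Instance shows detailed metrics for a single MinIO instance.

Pigsty Alerting

Pigsty provides the following three alerting rules for MinIO:

MinioServerDown: a MinIO server is down
MinioNodeOffline: a MinIO node is offline
MinioDiskOffline: a MinIO disk is offline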

#==============================================================#
#                         Aliveness                            #
#==============================================================#
# MinIO server instance down
- alert: MinioServerDown
  expr: minio_up < 1
  for: 1m
  labels: { level: 0, severity: CRIT, category: minio }
  annotations:
    summary: "CRIT MinioServerDown {{ $labels.ins }}@{{ $labels.instance }}"
    description: |
      minio_up[ins={{ $labels.ins }}, instance={{ $labels.instance }}] = {{ $value }} < 1
      http://g.pigsty/d/minio-overview      

#==============================================================#
#                         Error                                #
#==============================================================#
# MinIO node offline triggers a p1 alert
- alert: MinioNodeOffline
  expr: avg_over_time(minio_cluster_nodes_offline_total{job="minio"}[5m]) > 0
  for: 3m
  labels: { level: 1, severity: WARN, category: minio }
  annotations:
    summary: "WARN MinioNodeOffline: {{ $labels.cls }} {{ $value }}"
    description: |
      minio_cluster_nodes_offline_total[cls={{ $labels.cls }}] = {{ $value }} > 0
      http://g.pigsty/d/minio-overview?from=now-5m&to=now&var-cls={{$labels.cls}}      

# MinIO disk offline triggers a p1 alert
- alert: MinioDiskOffline
  expr: avg_over_time(minio_cluster_drive_offline_total{job="minio"}[5m]) > 0
  for: 3m
  labels: { level: 1, severity: WARN, category: minio }
  annotations:
    summary: "WARN MinioDiskOffline: {{ $labels.cls }} {{ $value }}"
    description: |
      minio_cluster_drive_offline_total[cls={{ $labels.cls }}] = {{ $value }} > 0
      http://g.pigsty/d/minio-overview?from=now-5m&to=now&var-cls={{$labels.cls}}

11.7 - Metrics

Complete list of monitoring metrics provided by the Pigsty MINIO module, with explanations.

The MINIO module exposes 79 kinds of monitoring metrics.

Metric Name	Type	Labels	Description
minio_audit_failed_messages	counter	`ip`, `job`, `target_id`, `cls`, `instance`, `server`, `ins`	Total number of messages that failed to send since start
minio_audit_target_queue_length	gauge	`ip`, `job`, `target_id`, `cls`, `instance`, `server`, `ins`	Number of unsent messages in queue for target
minio_audit_total_messages	counter	`ip`, `job`, `target_id`, `cls`, `instance`, `server`, `ins`	Total number of messages sent since start
minio_cluster_bucket_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of buckets in the cluster
minio_cluster_capacity_raw_free_bytes	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total free capacity online in the cluster
minio_cluster_capacity_raw_total_bytes	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total capacity online in the cluster
minio_cluster_capacity_usable_free_bytes	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total free usable capacity online in the cluster
minio_cluster_capacity_usable_total_bytes	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total usable capacity online in the cluster
minio_cluster_drive_offline_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total drives offline in this cluster
minio_cluster_drive_online_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total drives online in this cluster
minio_cluster_drive_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total drives in this cluster
minio_cluster_health_erasure_set_healing_drives	gauge	`pool`, `ip`, `job`, `cls`, `set`, `instance`, `server`, `ins`	Get the count of healing drives of this erasure set
minio_cluster_health_erasure_set_online_drives	gauge	`pool`, `ip`, `job`, `cls`, `set`, `instance`, `server`, `ins`	Get the count of the online drives in this erasure set
minio_cluster_health_erasure_set_read_quorum	gauge	`pool`, `ip`, `job`, `cls`, `set`, `instance`, `server`, `ins`	Get the read quorum for this erasure set
minio_cluster_health_erasure_set_status	gauge	`pool`, `ip`, `job`, `cls`, `set`, `instance`, `server`, `ins`	Get current health status for this erasure set
minio_cluster_health_erasure_set_write_quorum	gauge	`pool`, `ip`, `job`, `cls`, `set`, `instance`, `server`, `ins`	Get the write quorum for this erasure set
minio_cluster_health_status	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Get current cluster health status
minio_cluster_nodes_offline_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of MinIO nodes offline
minio_cluster_nodes_online_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of MinIO nodes online
minio_cluster_objects_size_distribution	gauge	`ip`, `range`, `job`, `cls`, `instance`, `server`, `ins`	Distribution of object sizes across a cluster
minio_cluster_objects_version_distribution	gauge	`ip`, `range`, `job`, `cls`, `instance`, `server`, `ins`	Distribution of object versions across a cluster
minio_cluster_usage_deletemarker_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of delete markers in a cluster
minio_cluster_usage_object_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of objects in a cluster
minio_cluster_usage_total_bytes	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total cluster usage in bytes
minio_cluster_usage_version_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of versions (includes delete marker) in a cluster
minio_cluster_webhook_failed_messages	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Number of messages that failed to send
minio_cluster_webhook_online	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Is the webhook online?
minio_cluster_webhook_queue_length	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Webhook queue length
minio_cluster_webhook_total_messages	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of messages sent to this target
minio_cluster_write_quorum	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Maximum write quorum across all pools and sets
minio_node_file_descriptor_limit_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Limit on total number of open file descriptors for the MinIO Server process
minio_node_file_descriptor_open_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of open file descriptors by the MinIO Server process
minio_node_go_routine_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of go routines running
minio_node_ilm_expiry_pending_tasks	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Number of pending ILM expiry tasks in the queue
minio_node_ilm_transition_active_tasks	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Number of active ILM transition tasks
minio_node_ilm_transition_missed_immediate_tasks	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Number of missed immediate ILM transition tasks
minio_node_ilm_transition_pending_tasks	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Number of pending ILM transition tasks in the queue
minio_node_ilm_versions_scanned	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of object versions checked for ilm actions since server start
minio_node_io_rchar_bytes	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total bytes read by the process from the underlying storage system including cache, /proc/[pid]/io rchar
minio_node_io_read_bytes	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total bytes read by the process from the underlying storage system, /proc/[pid]/io read_bytes
minio_node_io_wchar_bytes	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total bytes written by the process to the underlying storage system including page cache, /proc/[pid]/io wchar
minio_node_io_write_bytes	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total bytes written by the process to the underlying storage system, /proc/[pid]/io write_bytes
minio_node_process_cpu_total_seconds	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total user and system CPU time spent in seconds
minio_node_process_resident_memory_bytes	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Resident memory size in bytes
minio_node_process_starttime_seconds	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Start time for MinIO process per node, time in seconds since Unix epoch
minio_node_process_uptime_seconds	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Uptime for MinIO process per node in seconds
minio_node_scanner_bucket_scans_finished	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of bucket scans finished since server start
minio_node_scanner_bucket_scans_started	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of bucket scans started since server start
minio_node_scanner_directories_scanned	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of directories scanned since server start
minio_node_scanner_objects_scanned	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of unique objects scanned since server start
minio_node_scanner_versions_scanned	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of object versions scanned since server start
minio_node_syscall_read_total	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total read SysCalls to the kernel. /proc/[pid]/io syscr
minio_node_syscall_write_total	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total write SysCalls to the kernel. /proc/[pid]/io syscw
minio_notify_current_send_in_progress	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Number of concurrent async Send calls active to all targets (deprecated, please use ‘minio_notify_target_current_send_in_progress’ instead)
minio_notify_events_errors_total	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Events that were failed to be sent to the targets (deprecated, please use ‘minio_notify_target_failed_events’ instead)
minio_notify_events_sent_total	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of events sent to the targets (deprecated, please use ‘minio_notify_target_total_events’ instead)
minio_notify_events_skipped_total	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Events that were skipped to be sent to the targets due to the in-memory queue being full
minio_s3_requests_4xx_errors_total	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`, `api`	Total number of S3 requests with (4xx) errors
minio_s3_requests_errors_total	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`, `api`	Total number of S3 requests with (4xx and 5xx) errors
minio_s3_requests_incoming_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of incoming S3 requests
minio_s3_requests_inflight_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`, `api`	Total number of S3 requests currently in flight
minio_s3_requests_rejected_auth_total	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of S3 requests rejected for auth failure
minio_s3_requests_rejected_header_total	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of S3 requests rejected for invalid header
minio_s3_requests_rejected_invalid_total	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of invalid S3 requests
minio_s3_requests_rejected_timestamp_total	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of S3 requests rejected for invalid timestamp
minio_s3_requests_total	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`, `api`	Total number of S3 requests
minio_s3_requests_ttfb_seconds_distribution	gauge	`ip`, `job`, `cls`, `le`, `instance`, `server`, `ins`, `api`	Distribution of time to first byte across API calls
minio_s3_requests_waiting_total	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of S3 requests in the waiting queue
minio_s3_traffic_received_bytes	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of s3 bytes received
minio_s3_traffic_sent_bytes	counter	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Total number of s3 bytes sent
minio_software_commit_info	gauge	`ip`, `job`, `cls`, `instance`, `commit`, `server`, `ins`	Git commit hash for the MinIO release
minio_software_version_info	gauge	`ip`, `job`, `cls`, `instance`, `version`, `server`, `ins`	MinIO Release tag for the server
minio_up	Unknown	`ip`, `job`, `cls`, `instance`, `ins`	N/A
minio_usage_last_activity_nano_seconds	gauge	`ip`, `job`, `cls`, `instance`, `server`, `ins`	Time elapsed (in nano seconds) since last scan activity.
scrape_duration_seconds	Unknown	`ip`, `job`, `cls`, `instance`, `ins`	N/A
scrape_samples_post_metric_relabeling	Unknown	`ip`, `job`, `cls`, `instance`, `ins`	N/A
scrape_samples_scraped	Unknown	`ip`, `job`, `cls`, `instance`, `ins`	N/A
scrape_series_added	Unknown	`ip`, `job`, `cls`, `instance`, `ins`	N/A
up	Unknown	`ip`, `job`, `cls`, `instance`, `ins`	N/A

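11.8 - FAQ

Frequently asked questions about the Pigsty MINIO object storage module.

What if a multi-node / multi-drive MinIO cluster fails to start?

In single-node multi-drive or multi-node multi-drive mode, MinIO refuses to start if a data directory is not a valid disk mount point.

Use mounted disks, not ordinary directories, as MinIO data directories. An ordinary directory can serve as the data directory only in single-node single-drive mode, for development and testing.

How do I add new members to an existing MinIO cluster?

Plan the capacity of your MinIO cluster before deployment, because adding members requires a global restart.

You can scale MinIO out by adding new server nodes to the existing cluster as a new storage pool.

Note that once MinIO is deployed, you cannot change the number of nodes or drives of what is already deployed; capacity grows only by adding storage pools.

See: Expand a MinIO deployment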

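12 - Module: REDIS

Pigsty has built-in support for Redis, the open-source high-performance cache. It complements PostgreSQL and supports three modes: primary-replica, cluster, and sentinel.

Redis is a widely used open-source "high-performance" in-memory data structure server and a good companion to PostgreSQL. The Redis version in Pigsty is pinned to 7.2.6, the last version released under the BSD license.

12.1 - Configuration

Choose the Redis mode that fits your scenario and express your requirements through the inventory.

Concepts

The entity model of Redis is almost the same as that of PostgreSQL and likewise includes the concepts of Cluster and Instance. Note that Cluster here does not refer to the cluster in the native Redis Cluster solution.

The core difference between the REDIS and PGSQL modules is that Redis usually runs multiple instances per node, rather than the 1:1 deployment of PostgreSQL: one physical or virtual node typically hosts several Redis instances to make full use of a multi-core CPU. Redis instances are therefore configured and managed slightly differently from PGSQL.

In Redis managed by Pigsty, a node belongs entirely to one cluster: you currently cannot deploy Redis instances of two different clusters on the same node. This does not stop you from deploying several independent Redis primary-replica instances on one node. There are some limitations, though; for example, you cannot assign different passwords to different instances on the same node.

Identity Parameters

Redis identity parameters are the information you must provide when defining a Redis cluster:

Name	Attribute	Description	Example
`redis_cluster`	REQUIRED, cluster level	cluster name	`redis-test`
`redis_node`	REQUIRED, node level	node number	`1`,`2`
`redis_instances`	REQUIRED, node level	instance definition	`{ 6001 : {} ,6002 : {}}`

redis_cluster: Redis cluster name, the top-level namespace for cluster resources.
redis_node: Redis node number, an integer unique within the cluster, used to tell nodes apart.
redis_instances: a JSON object whose keys are instance port numbers and whose values are JSON objects holding the other settings of each instance.

Working Modes

Redis has three working modes, selected by the redis_mode parameter:

standalone: the default, independent primary-replica mode
cluster: the native Redis distributed cluster mode
sentinel: sentinel mode, which provides high availability for primary-replica Redis

Below are sample definitions of three Redis clusters:

A 1-node Redis Standalone cluster with one primary and one replica: redis-ms
A 1-node, 3-instance Redis Sentinel cluster: redis-meta
A 2-node, 6-instance Redis Cluster: redis-test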

redis-ms: # redis 经典主从集群
  hosts: { 10.10.10.10: { redis_node: 1 , redis_instances: { 6379: { }, 6380: { replica_of: '10.10.10.10 6379' } } } }
  vars: { redis_cluster: redis-ms ,redis_password: 'redis.ms' ,redis_max_memory: 64MB }

redis-meta: # redis sentinel x 3
  hosts: { 10.10.10.11: { redis_node: 1 , redis_instances: { 26379: { } ,26380: { } ,26381: { } } } }
  vars:
    redis_cluster: redis-meta
    redis_password: 'redis.meta'
    redis_mode: sentinel
    redis_max_memory: 16MB
    redis_sentinel_monitor: # primary list for redis sentinel, use cls as name, primary ip:port
      - { name: redis-ms, host: 10.10.10.10, port: 6379 ,password: redis.ms, quorum: 2 }

redis-test: # redis native cluster: 3 primaries x 3 replicas
  hosts:
    10.10.10.12: { redis_node: 1 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
    10.10.10.13: { redis_node: 2 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
  vars: { redis_cluster: redis-test ,redis_password: 'redis.test' ,redis_mode: cluster, redis_max_memory: 32MB }

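Limitations

A node can belong to only one Redis cluster, so you cannot assign one node to two different Redis clusters at the same time.
On each Redis node, every Redis instance needs a unique port number to avoid port conflicts.
A Redis cluster normally uses a single password, and the Redis instances on one node cannot be given different passwords (because redis_exporter accepts only one password).
Redis Cluster has built-in high availability, whereas high availability for Redis primary-replica requires extra manual configuration in Sentinel, because Pigsty cannot know whether you will deploy Sentinel.
Fortunately, setting up high availability for primary-replica instances through Sentinel is simple; see Administration: Set Up Redis Primary-Replica HA.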
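The one-cluster-per-node rule above can be checked before running any playbook. The following is a minimal sketch, assuming the inventory has already been parsed from YAML into plain Python dicts; the function name `find_shared_nodes` and the cluster `redis-bad` are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def find_shared_nodes(clusters):
    """Return {ip: sorted cluster names} for nodes assigned to more than one Redis cluster."""
    owners = defaultdict(set)
    for name, cluster in clusters.items():
        for ip in cluster.get("hosts", {}):
            owners[ip].add(name)
    return {ip: sorted(names) for ip, names in owners.items() if len(names) > 1}


# Hypothetical inventory excerpt; 'redis-bad' deliberately reuses node 10.10.10.10
inventory = {
    "redis-ms": {"hosts": {"10.10.10.10": {"redis_node": 1, "redis_instances": {6379: {}, 6380: {}}}}},
    "redis-meta": {"hosts": {"10.10.10.11": {"redis_node": 1, "redis_instances": {26379: {}}}}},
    "redis-bad": {"hosts": {"10.10.10.10": {"redis_node": 1, "redis_instances": {6381: {}}}}},
}

print(find_shared_nodes(inventory))  # {'10.10.10.10': ['redis-bad', 'redis-ms']}
```

An empty result means every node is owned by exactly one Redis cluster. Port uniqueness within a node needs no separate check here, because the ports are dictionary keys.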

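12.2 - Parameters

The Redis module provides 21 configuration parameters for customizing the Redis clusters you need:

Parameter	Type	Level	Comment
`redis_cluster`	string	C	Redis cluster name, required identity parameter
`redis_instances`	dict	I	Redis instance definitions on this node
`redis_node`	int	I	Redis node number, positive integer, unique within the cluster, required identity parameter
`redis_fs_main`	path	C	Redis main data directory, `/data` by default
`redis_exporter_enabled`	bool	C	Enable Redis Exporter?
`redis_exporter_port`	port	C	Redis Exporter listen port
`redis_exporter_options`	string	C/I	Redis Exporter command-line options
`redis_safeguard`	bool	G/C/A	Prevent existing Redis instances from being erased
`redis_clean`	bool	G/C/A	Erase existing instances when initializing Redis?
`redis_rmdata`	bool	G/C/A	Remove data as well when removing a Redis instance?
`redis_mode`	enum	C	Redis cluster mode: sentinel, cluster, standalone
`redis_conf`	string	C	Redis config file template, except for sentinel
`redis_bind_address`	ip	C	Redis listen address; an empty string binds the host IP
`redis_max_memory`	size	C/I	Maximum memory available to Redis
`redis_mem_policy`	enum	C	Redis memory eviction policy
`redis_password`	password	C	Redis password; an empty string (the default) disables the password
`redis_rdb_save`	string[]	C	Redis RDB save directives, a list of strings; an empty list disables RDB
`redis_aof_enabled`	bool	C	Enable Redis AOF?
`redis_rename_commands`	dict	C	Rename list for dangerous Redis commands
`redis_cluster_replicas`	int	C	How many replicas per primary in a Redis native cluster?
`redis_sentinel_monitor`	master[]	C	List of primaries monitored by Redis Sentinel, used only on sentinel clusters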

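Default Parameters

The default parameters of the Redis module are defined in roles/redis/defaults/main.yml

#redis_cluster:        <CLUSTER> # Redis cluster name, required identity parameter
#redis_node: 1            <NODE> # Redis node number, positive integer, unique within the cluster, required identity parameter
#redis_instances: {}      <NODE> # Redis instance definitions on this node
redis_fs_main: /data             # Redis main data directory, `/data` by default
redis_exporter_enabled: true     # Enable Redis Exporter?
redis_exporter_port: 9121        # Redis Exporter listen port
redis_exporter_options: ''       # Redis Exporter command-line options
redis_safeguard: false           # Prevent existing Redis instances from being erased
redis_clean: true                # Erase existing instances when initializing Redis?
redis_rmdata: true               # Remove data as well when removing a Redis instance?
redis_mode: standalone           # Redis cluster mode: sentinel, cluster, standalone
redis_conf: redis.conf           # Redis config file template, except for sentinel
redis_bind_address: '0.0.0.0'    # Redis listen address; an empty string binds the host IP
redis_max_memory: 1GB            # Maximum memory available to Redis
redis_mem_policy: allkeys-lru    # Redis memory eviction policy
redis_password: ''               # Redis password; an empty string (the default) disables the password
redis_rdb_save: ['1200 1']       # Redis RDB save directives, a list of strings; an empty list disables RDB
redis_aof_enabled: false         # Enable Redis AOF?
redis_rename_commands: {}        # Rename list for dangerous Redis commands
redis_cluster_replicas: 1        # How many replicas per primary in a Redis native cluster?
redis_sentinel_monitor: []       # List of primaries monitored by Redis Sentinel, used only on sentinel clusters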

`redis_cluster`

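Parameter: redis_cluster, type: string, level: C

Identity parameter, required. It must be set explicitly at the cluster level and serves as the namespace for resources within the cluster.

The name must match the pattern [a-z][a-z0-9-]* so that it satisfies the constraints different systems place on identifiers. Using redis- as the cluster name prefix is recommended.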
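The naming rule can be checked mechanically. A minimal sketch, assuming the pattern must match the whole name; the helper `is_valid_cluster_name` is hypothetical and not part of Pigsty.

```python
import re

# The naming rule quoted above: a lowercase letter, then lowercase letters, digits, or hyphens
CLUSTER_NAME = re.compile(r"[a-z][a-z0-9-]*")


def is_valid_cluster_name(name: str) -> bool:
    """True if the whole name matches the rule (fullmatch anchors both ends)."""
    return CLUSTER_NAME.fullmatch(name) is not None


for candidate in ("redis-test", "Redis_Test", "1redis"):
    print(candidate, is_valid_cluster_name(candidate))
```

Using `fullmatch` rather than `match` matters here: `match` would accept `redis_test`, because the prefix `redis` alone satisfies the pattern.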

`redis_node`

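Parameter: redis_node, type: int, level: I

Redis node sequence number. Identity parameter, required; it must be set explicitly at the node (host) level.

A natural number that should be unique within the cluster, used to distinguish and identify the nodes of the cluster. Assign it starting from 0 or 1.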

`redis_instances`

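Parameter: redis_instances, type: dict, level: I

Definition of the Redis instances on the current Redis node. Required; it must be set explicitly at the node (host) level.

The value is a JSON key-value object. Each key is a numeric port number, and each value is a JSON object with the settings specific to that instance.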

redis-test: # redis native cluster: 3m x 3s
  hosts:
    10.10.10.12: { redis_node: 1 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
    10.10.10.13: { redis_node: 2 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
  vars: { redis_cluster: redis-test ,redis_password: 'redis.test' ,redis_mode: cluster, redis_max_memory: 32MB }

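Each Redis instance listens on a unique port on its node. The replica_of entry in the instance settings sets the address of the instance's upstream primary and thereby builds the replication relationship.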

redis_instances:
    6379: {}
    6380: { replica_of: '10.10.10.13 6379' }
    6381: { replica_of: '10.10.10.13 6379' }
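To make the role of replica_of concrete, the sketch below splits an instance definition into primaries and replicas. It assumes the definition has been loaded into a Python dict and that replica_of is a string of the form 'host port', as in the example above; `split_roles` is a hypothetical helper, not Pigsty code.

```python
def split_roles(redis_instances):
    """Return (primary ports, {replica port: (upstream host, upstream port)})."""
    primaries, replicas = [], {}
    for port, conf in redis_instances.items():
        upstream = (conf or {}).get("replica_of")
        if upstream:
            host, upstream_port = upstream.split()
            replicas[port] = (host, int(upstream_port))
        else:
            # An instance without replica_of is treated as a primary
            primaries.append(port)
    return primaries, replicas


instances = {
    6379: {},
    6380: {"replica_of": "10.10.10.13 6379"},
    6381: {"replica_of": "10.10.10.13 6379"},
}
primaries, replicas = split_roles(instances)
print(primaries)  # [6379]
print(replicas)   # {6380: ('10.10.10.13', 6379), 6381: ('10.10.10.13', 6379)}
```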

`redis_fs_main`

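Parameter: redis_fs_main, type: path, level: C

Mount point of the main data disk used by Redis, /data by default. Pigsty creates a redis directory under it to hold Redis data.

The data is therefore actually stored in /data/redis, which is owned by the operating system user redis. For the internal layout, see FHS: Redis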

`redis_exporter_enabled`

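Parameter: redis_exporter_enabled, type: bool, level: C

Enable the Redis monitoring component, Redis Exporter?

Enabled by default. One exporter is deployed on each Redis node and listens on redis_exporter_port (9121 by default). It scrapes the metrics of all Redis instances on that node.

`redis_exporter_port`

Parameter: redis_exporter_port, type: port, level: C

Redis Exporter listen port, default: 9121

`redis_exporter_options`

Parameter: redis_exporter_options, type: string, level: C/I

Extra command-line arguments passed to Redis Exporter, rendered into /etc/default/redis_exporter. Defaults to an empty string.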

`redis_safeguard`

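Parameter: redis_safeguard, type: bool, level: G/C/A

Safeguard against accidental deletion of Redis: when it is on, playbooks cannot erase running Redis instances.

The default is false. If set to true, a playbook that encounters a running Redis instance aborts the initialization or removal to avoid accidental deletion.

`redis_clean`

Parameter: redis_clean, type: bool, level: G/C/A

Redis cleanup switch: erase running Redis instances during initialization? Default: true.

When executed, the redis.yml playbook erases existing Redis instances that have the same definition, which keeps the playbook idempotent.

If you do not want redis.yml to do this, set this parameter to false. A playbook that encounters a running Redis instance then aborts the initialization or removal to avoid accidental deletion.

If the safeguard parameter redis_safeguard is on, it takes precedence over this parameter.

`redis_rmdata`

Parameter: redis_rmdata, type: bool, level: G/C/A

Remove the Redis data directory as well when removing a Redis instance? Default: true.

The data directory contains the RDB and AOF files of Redis. If they are not erased, a newly started Redis instance loads its data from these backup files.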

`redis_mode`

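Parameter: redis_mode, type: enum, level: C

Working mode of the Redis cluster. There are three options: standalone, cluster, and sentinel; the default is standalone.

standalone: default, independent Redis primary-replica mode
cluster: Redis native cluster mode
sentinel: the Redis high-availability component, Sentinel

In standalone mode, Pigsty sets up Redis replication according to the replica_of parameter.

In cluster mode, Pigsty creates a native Redis cluster from all defined instances according to the redis_cluster_replicas parameter.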

`redis_conf`

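Parameter: redis_conf, type: string, level: C

Path of the Redis configuration template, except for Sentinel.

Default: redis.conf, a template file located at roles/redis/templates/redis.conf.

To use your own Redis configuration template, place it in the templates/ directory and set this parameter to the template file name.

Note: Redis Sentinel uses a different template file, roles/redis/templates/redis-sentinel.conf.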

`redis_bind_address`

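Parameter: redis_bind_address, type: ip, level: C

IP address the Redis server binds to. An empty string uses the hostname defined in the configuration inventory.

Default: 0.0.0.0, which binds to all available IPv4 addresses on the host.

For security reasons, binding only to the internal network IP is recommended in production, that is, setting this value to the empty string ''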

`redis_max_memory`

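Parameter: redis_max_memory, type: size, level: C/I

Maximum memory each Redis instance may use, default: 1GB.

`redis_mem_policy`

Parameter: redis_mem_policy, type: enum, level: C

Redis memory eviction policy, default: allkeys-lru.

noeviction: new values are not saved when the memory limit is reached; with replication, this applies only to the primary
allkeys-lru: keeps the most recently used keys; removes the least recently used (LRU) keys
allkeys-lfu: keeps frequently used keys; removes the least frequently used (LFU) keys
volatile-lru: removes the least recently used keys that have an expiration set
volatile-lfu: removes the least frequently used keys that have an expiration set
allkeys-random: removes keys at random to make room for newly added data
volatile-random: removes keys that have an expiration set at random
volatile-ttl: removes keys that have an expiration set and the shortest remaining time to live (TTL)

For details, see the Redis eviction policy documentation.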

`redis_password`

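Parameter: redis_password, type: password, level: C/N

Redis password. An empty string disables the password, which is the default behavior.

Note that because of an implementation limitation of redis_exporter, you can set only one redis_password per node. This is usually not a problem, because Pigsty does not allow two different Redis clusters on the same node.

Use a strong password in production environments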

`redis_rdb_save`

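Parameter: redis_rdb_save, type: string[], level: C

Redis RDB save directives. An empty list disables RDB.

The default is ["1200 1"]: dump the dataset to disk if at least 1 key changed within the last 20 minutes.

For details, see Redis persistence.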
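Each save directive is a pair "seconds changes". The sketch below spells out that reading for the default value; `describe_save_rule` is a hypothetical helper written only for this illustration.

```python
def describe_save_rule(rule: str) -> str:
    """Explain one RDB save directive of the form '<seconds> <changes>'."""
    seconds, changes = (int(part) for part in rule.split())
    return f"snapshot if at least {changes} key change(s) occurred within {seconds} seconds"


redis_rdb_save = ["1200 1"]  # the default value quoted above
for rule in redis_rdb_save:
    print(describe_save_rule(rule))
```

With the default, 1200 seconds is the 20-minute window mentioned in the text. A list with several directives would snapshot whenever any one of them is satisfied.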

`redis_aof_enabled`

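Parameter: redis_aof_enabled, type: bool, level: C

Enable Redis AOF? The default is false, meaning AOF is not used.

`redis_rename_commands`

Parameter: redis_rename_commands, type: dict, level: C

Renames dangerous Redis commands. This is a k:v dictionary of the form old: new, where old is the name of the command to rename and new is its name after renaming.

Default: {}. You can set this value to hide dangerous commands such as FLUSHDB and FLUSHALL. Here is an example: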

{
  "keys": "op_keys",
  "flushdb": "op_flushdb",
  "flushall": "op_flushall",
  "config": "op_config"  
}
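Redis itself renames commands through the rename-command directive in its configuration file. How the Pigsty template renders this dictionary is not shown in this section, so the following is only a sketch that assumes one rename-command line per entry; `render_rename_commands` is a hypothetical helper.

```python
def render_rename_commands(rename):
    """Turn an {old: new} mapping into redis.conf 'rename-command' lines (assumed rendering)."""
    return [f"rename-command {old} {new}" for old, new in rename.items()]


redis_rename_commands = {
    "keys": "op_keys",
    "flushdb": "op_flushdb",
    "flushall": "op_flushall",
    "config": "op_config",
}
for line in render_rename_commands(redis_rename_commands):
    print(line)
```

After such a rename, clients have to call the new name (for example op_flushall), and the original command name is no longer recognized.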

`redis_cluster_replicas`

Parameter: redis_cluster_replicas, type: int, level: C

In a Redis native cluster, how many replicas should each master/primary instance have? Default: 1, meaning one replica per primary.
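The replica count determines how a fixed set of instances is divided between primaries and replicas. The sketch below works through that arithmetic, assuming the usual redis-cli behavior of requiring at least three primaries when creating a cluster; `cluster_layout` is a hypothetical helper, not Pigsty code.

```python
def cluster_layout(instance_count: int, replicas_per_primary: int = 1):
    """Return (primaries, replicas) for a native cluster built from all instances."""
    primaries = instance_count // (replicas_per_primary + 1)
    if primaries < 3:
        # redis-cli refuses to create a cluster with fewer than 3 primaries
        raise ValueError("a Redis native cluster needs at least 3 primaries")
    return primaries, instance_count - primaries


# The redis-test example: 2 nodes x 3 instances, one replica per primary
print(cluster_layout(6, 1))  # (3, 3)
```

With the default of one replica per primary, six instances are therefore the smallest layout that yields a replicated native cluster.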

`redis_sentinel_monitor`

Parameter: redis_sentinel_monitor, type: master[], level: C

List of primaries monitored by Redis Sentinel, used only on sentinel clusters. Each primary to be managed is defined as follows:

redis_sentinel_monitor:  # primary list for redis sentinel, use cls as name, primary ip:port
  - { name: redis-src, host: 10.10.10.45, port: 6379 ,password: redis.src, quorum: 1 }
  - { name: redis-dst, host: 10.10.10.48, port: 6379 ,password: redis.dst, quorum: 1 }

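name and host are required; port, password, and quorum are optional. quorum sets the number of sentinels that must agree before a primary is considered failed. It is usually set to more than half of the sentinel instances (default: 1).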
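The "more than half" guideline for quorum is simple integer arithmetic. A minimal sketch; `majority_quorum` is a hypothetical helper name.

```python
def majority_quorum(sentinel_count: int) -> int:
    """Smallest number of sentinels that is strictly more than half of them."""
    if sentinel_count < 1:
        raise ValueError("at least one sentinel is required")
    return sentinel_count // 2 + 1


for n in (1, 3, 5):
    print(n, majority_quorum(n))
```

For the three-sentinel redis-meta example this gives 2, which matches the quorum used in that cluster definition.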

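12.3 - Playbooks

How to manage Redis clusters with the built-in Ansible playbooks, plus a quick reference of common administration commands.

The REDIS module provides two playbooks for bringing up and tearing down conventional primary-replica Redis clusters, nodes, and instances:

redis.yml: initialize a Redis cluster/node/instance
redis-rm.yml: remove a Redis cluster/node/instance

`redis.yml`

The redis.yml playbook, which initializes Redis, contains the following subtasks:

redis_node        : initialize redis node
  - redis_install : install redis & redis_exporter
  - redis_user    : create the operating system user redis
  - redis_dir     : set up the redis FHS directory structure
redis_exporter    : set up redis_exporter monitoring
  - redis_exporter_config  : generate redis_exporter config
  - redis_exporter_launch  : launch redis_exporter
redis_instance    : initialize redis cluster/node/instance
  - redis_check   : check whether the redis instance exists
  - redis_clean   : clean up existing redis instances
  - redis_config  : generate redis instance config
  - redis_launch  : launch redis instance
redis_register    : register redis with the infrastructure
redis_ha          : configure redis sentinel
redis_join        : join redis cluster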


`redis-rm.yml`

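The redis-rm.yml playbook, which removes Redis, contains the following subtasks:

register       : remove monitoring targets from prometheus
redis_exporter : stop and disable redis_exporter
redis          : stop and disable redis cluster/node/instance
redis_data     : remove redis data (rdb, aof)
redis_pkg      : uninstall the redis & redis_exporter packages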

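12.4 - Administration

Redis cluster administration SOPs: detailed instructions for creating, removing, scaling out, and scaling in

Below are SOPs for some common Redis administration tasks:

For more questions, see FAQ: REDIS.

Initialize Redis

You can use the redis.yml playbook to initialize a Redis cluster, node, or instance:

# Initialize all Redis instances in the cluster
./redis.yml -l <cluster>      # init redis cluster

# Initialize all Redis instances on a specific node
./redis.yml -l 10.10.10.10    # init redis node

# Initialize a specific Redis instance: 10.10.10.11:6379
./redis.yml -l 10.10.10.11 -e redis_port=6379 -t redis

You can also initialize them with the wrapper script:

bin/redis-add redis-ms          # init redis cluster 'redis-ms'
bin/redis-add 10.10.10.10       # init redis node '10.10.10.10'
bin/redis-add 10.10.10.10 6379  # init redis instance '10.10.10.10:6379'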

Remove Redis

You can use the redis-rm.yml playbook to remove a Redis cluster, node, or instance:

# Remove Redis cluster `redis-test`
./redis-rm.yml -l redis-test

# Remove Redis cluster `redis-test` and uninstall the Redis packages
./redis-rm.yml -l redis-test -e redis_uninstall=true

# Remove all instances on Redis node 10.10.10.13
./redis-rm.yml -l 10.10.10.13

# Remove a specific Redis instance 10.10.10.13:6379
./redis-rm.yml -l 10.10.10.13 -e redis_port=6379

You can also remove a Redis cluster/node/instance with the wrapper script:

bin/redis-rm redis-ms          # remove redis cluster 'redis-ms'
bin/redis-rm 10.10.10.10       # remove redis node '10.10.10.10'
bin/redis-rm 10.10.10.10 6379  # remove redis instance '10.10.10.10:6379'

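Reconfigure Redis

You can run part of the redis.yml playbook to reconfigure a Redis cluster, node, or instance:

./redis.yml -l <cluster> -t redis_config,redis_launch

Note that Redis cannot reload its configuration online; the new configuration takes effect only after a restart through the launch task.

Use the Redis Client

Access a Redis instance with redis-cli:

$ redis-cli -h 10.10.10.10 -p 6379 # <--- connect to the Redis instance by host and port
10.10.10.10:6379> auth redis.ms    # <--- authenticate with the password
OK
10.10.10.10:6379> set a 10         # <--- set a key
OK
10.10.10.10:6379> get a            # <--- get the value of the key
"10"

Redis ships with the redis-benchmark tool, which can be used to evaluate Redis performance or to generate load for testing.

redis-benchmark -h 10.10.10.13 -p 6379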

Manually Set Up a Redis Replica

https://redis.io/commands/replicaof/

# Promote a Redis instance to primary
> REPLICAOF NO ONE
"OK"

# Make a Redis instance a replica of another instance
> REPLICAOF 127.0.0.1 6799
"OK"

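Set Up Redis Primary-Replica HA

A standalone Redis primary-replica cluster can get automatic high availability from a Redis Sentinel cluster. For detailed usage, see the official Sentinel documentation

Taking the four-node sandbox as an example, one Redis Sentinel cluster, redis-meta, can manage many independent Redis primary-replica clusters.

Taking redis-ms, an ordinary primary-replica cluster with one primary and one replica, as an example: on every Sentinel instance, add the target with SENTINEL MONITOR and supply the password with SENTINEL SET. High availability is then configured.

# For each sentinel (26379, 26380, 26381), put the redis primary under sentinel management:
$ redis-cli -h 10.10.10.11 -p 26379 -a redis.meta
10.10.10.11:26379> SENTINEL MONITOR redis-ms 10.10.10.10 6379 1
10.10.10.11:26379> SENTINEL SET redis-ms auth-pass redis.ms      # required if authentication is enabled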
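Because the same two commands must be issued on every sentinel, it can help to generate them from one monitor entry. The sketch below only builds the command strings and does not connect to Redis; `sentinel_commands` is a hypothetical helper, the default quorum of 1 follows the parameter description, and the default port of 6379 is an assumption made for this example.

```python
def sentinel_commands(monitor, default_port=6379, default_quorum=1):
    """Build the SENTINEL commands that register one primary on a sentinel."""
    name, host = monitor["name"], monitor["host"]  # required fields
    port = monitor.get("port", default_port)       # assumed default, see lead-in
    quorum = monitor.get("quorum", default_quorum)
    commands = [f"SENTINEL MONITOR {name} {host} {port} {quorum}"]
    if monitor.get("password"):
        # auth-pass is only needed when the monitored primary requires a password
        commands.append(f"SENTINEL SET {name} auth-pass {monitor['password']}")
    return commands


entry = {"name": "redis-ms", "host": "10.10.10.10", "port": 6379, "password": "redis.ms", "quorum": 1}
for sentinel_port in (26379, 26380, 26381):
    for command in sentinel_commands(entry):
        print(f"10.10.10.11:{sentinel_port}> {command}")
```

The printed lines mirror the interactive session above, repeated once per sentinel port.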

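To remove a Redis primary-replica cluster from Sentinel management, use SENTINEL REMOVE <name>.

You can use the redis_sentinel_monitor parameter, defined on the Sentinel cluster, to configure the list of primaries monitored by the sentinels automatically.

redis_sentinel_monitor:  # primaries to be monitored; port, password, and quorum (should be more than half of the sentinels) are optional
  - { name: redis-src, host: 10.10.10.45, port: 6379 ,password: redis.src, quorum: 1 }
  - { name: redis-dst, host: 10.10.10.48, port: 6379 ,password: redis.dst, quorum: 1 }

Refresh the list of managed primaries on the Redis Sentinel cluster with the following command:

./redis.yml -l redis-meta -t redis-ha   # replace redis-meta here if your Sentinel cluster has a different name.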

12.5 - Monitoring and Alerting

How is Redis monitored? Which alerting rules deserve attention?

Dashboards

The REDIS module provides 3 monitoring dashboards:

Redis Overview: overview of all Redis clusters and instances
Redis Cluster: details of a single Redis cluster
Redis Instance: details of a single Redis instance

Alerting Rules

Pigsty ships the following six predefined alerting rules for Redis, defined in files/prometheus/rules/redis.yml

RedisDown: the redis instance is unavailable
RedisRejectConn: the redis instance rejects connections
RedisRTHigh: the redis instance response time is too high
RedisCPUHigh: the redis instance CPU usage is too high
RedisMemHigh: the redis instance memory usage is too high
RedisQPSHigh: the redis instance QPS is too high

#==============================================================#
#                         Error                                #
#==============================================================#
# redis down triggers a P0 alert
- alert: RedisDown
  expr: redis_up < 1
  for: 1m
  labels: { level: 0, severity: CRIT, category: redis }
  annotations:
    summary: "CRIT RedisDown: {{ $labels.ins }} {{ $labels.instance }} {{ $value }}"
    description: |
      redis_up[ins={{ $labels.ins }}, instance={{ $labels.instance }}] = {{ $value }} == 0
      http://g.pigsty/d/redis-instance?from=now-5m&to=now&var-ins={{$labels.ins}}      

# redis reject connection in last 5m
- alert: RedisRejectConn
  expr: redis:ins:conn_reject > 0
  labels: { level: 0, severity: CRIT, category: redis }
  annotations:
    summary: "CRIT RedisRejectConn: {{ $labels.ins }} {{ $labels.instance }} {{ $value }}"
    description: |
      redis:ins:conn_reject[cls={{ $labels.cls }}, ins={{ $labels.ins }}][5m] = {{ $value }} > 0
      http://g.pigsty/d/redis-instance?from=now-10m&to=now&viewPanel=88&fullscreen&var-ins={{ $labels.ins }}      



#==============================================================#
#                         Latency                              #
#==============================================================#
# redis avg query response time > 160 µs
- alert: RedisRTHigh
  expr: redis:ins:rt > 0.00016
  for: 1m
  labels: { level: 1, severity: WARN, category: redis }
  annotations:
    summary: "WARN RedisRTHigh: {{ $labels.cls }} {{ $labels.ins }}"
    description: |
      redis:ins:rt[cls={{ $labels.cls }}, ins={{ $labels.ins }}] = {{ $value }} > 160µs
      http://g.pigsty/d/redis-instance?from=now-10m&to=now&viewPanel=97&fullscreen&var-ins={{ $labels.ins }}      



#==============================================================#
#                        Saturation                            #
#==============================================================#
# redis cpu usage more than 70% for 1m
- alert: RedisCPUHigh
  expr: redis:ins:cpu_usage > 0.70
  for: 1m
  labels: { level: 1, severity: WARN, category: redis }
  annotations:
    summary: "WARN RedisCPUHigh: {{ $labels.cls }} {{ $labels.ins }}"
    description: |
      redis:ins:cpu_usage[cls={{ $labels.cls }}, ins={{ $labels.ins }}] = {{ $value }} > 70%
      http://g.pigsty/d/redis-instance?from=now-10m&to=now&viewPanel=43&fullscreen&var-ins={{ $labels.ins }}      

# redis mem usage more than 70% for 1m
- alert: RedisMemHigh
  expr: redis:ins:mem_usage > 0.70
  for: 1m
  labels: { level: 1, severity: WARN, category: redis }
  annotations:
    summary: "WARN RedisMemHigh: {{ $labels.cls }} {{ $labels.ins }}"
    description: |
      redis:ins:mem_usage[cls={{ $labels.cls }}, ins={{ $labels.ins }}] = {{ $value }} > 70%
      http://g.pigsty/d/redis-instance?from=now-10m&to=now&viewPanel=7&fullscreen&var-ins={{ $labels.ins }}      

#==============================================================#
#                         Traffic                              #
#==============================================================#
# redis qps more than 32000 for 5m
- alert: RedisQPSHigh
  expr: redis:ins:qps > 32000
  for: 5m
  labels: { level: 2, severity: INFO, category: redis }
  annotations:
    summary: "INFO RedisQPSHigh: {{ $labels.cls }} {{ $labels.ins }}"
    description: |
      redis:ins:qps[cls={{ $labels.cls }}, ins={{ $labels.ins }}] = {{ $value }} > 32000
      http://g.pigsty/d/redis-instance?from=now-10m&to=now&viewPanel=96&fullscreen&var-ins={{ $labels.ins }}

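12.6 - Metrics

Complete list and description of the monitoring metrics provided by the Pigsty REDIS module

The REDIS module has 275 kinds of available monitoring metrics.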

Metric Name	Type	Labels	Description
ALERTS	Unknown	`cls`, `ip`, `level`, `severity`, `instance`, `category`, `ins`, `alertname`, `job`, `alertstate`	N/A
ALERTS_FOR_STATE	Unknown	`cls`, `ip`, `level`, `severity`, `instance`, `category`, `ins`, `alertname`, `job`	N/A
redis:cls:aof_rewrite_time	Unknown	`cls`, `job`	N/A
redis:cls:blocked_clients	Unknown	`cls`, `job`	N/A
redis:cls:clients	Unknown	`cls`, `job`	N/A
redis:cls:cmd_qps	Unknown	`cls`, `cmd`, `job`	N/A
redis:cls:cmd_rt	Unknown	`cls`, `cmd`, `job`	N/A
redis:cls:cmd_time	Unknown	`cls`, `cmd`, `job`	N/A
redis:cls:conn_rate	Unknown	`cls`, `job`	N/A
redis:cls:conn_reject	Unknown	`cls`, `job`	N/A
redis:cls:cpu_sys	Unknown	`cls`, `job`	N/A
redis:cls:cpu_sys_child	Unknown	`cls`, `job`	N/A
redis:cls:cpu_usage	Unknown	`cls`, `job`	N/A
redis:cls:cpu_usage_child	Unknown	`cls`, `job`	N/A
redis:cls:cpu_user	Unknown	`cls`, `job`	N/A
redis:cls:cpu_user_child	Unknown	`cls`, `job`	N/A
redis:cls:fork_time	Unknown	`cls`, `job`	N/A
redis:cls:key_evict	Unknown	`cls`, `job`	N/A
redis:cls:key_expire	Unknown	`cls`, `job`	N/A
redis:cls:key_hit	Unknown	`cls`, `job`	N/A
redis:cls:key_hit_rate	Unknown	`cls`, `job`	N/A
redis:cls:key_miss	Unknown	`cls`, `job`	N/A
redis:cls:mem_max	Unknown	`cls`, `job`	N/A
redis:cls:mem_usage	Unknown	`cls`, `job`	N/A
redis:cls:mem_usage_max	Unknown	`cls`, `job`	N/A
redis:cls:mem_used	Unknown	`cls`, `job`	N/A
redis:cls:net_traffic	Unknown	`cls`, `job`	N/A
redis:cls:qps	Unknown	`cls`, `job`	N/A
redis:cls:qps_mu	Unknown	`cls`, `job`	N/A
redis:cls:qps_realtime	Unknown	`cls`, `job`	N/A
redis:cls:qps_sigma	Unknown	`cls`, `job`	N/A
redis:cls:rt	Unknown	`cls`, `job`	N/A
redis:cls:rt_mu	Unknown	`cls`, `job`	N/A
redis:cls:rt_sigma	Unknown	`cls`, `job`	N/A
redis:cls:rx	Unknown	`cls`, `job`	N/A
redis:cls:size	Unknown	`cls`, `job`	N/A
redis:cls:tx	Unknown	`cls`, `job`	N/A
redis:env:blocked_clients	Unknown	`job`	N/A
redis:env:clients	Unknown	`job`	N/A
redis:env:cmd_qps	Unknown	`cmd`, `job`	N/A
redis:env:cmd_rt	Unknown	`cmd`, `job`	N/A
redis:env:cmd_time	Unknown	`cmd`, `job`	N/A
redis:env:conn_rate	Unknown	`job`	N/A
redis:env:conn_reject	Unknown	`job`	N/A
redis:env:cpu_usage	Unknown	`job`	N/A
redis:env:cpu_usage_child	Unknown	`job`	N/A
redis:env:key_evict	Unknown	`job`	N/A
redis:env:key_expire	Unknown	`job`	N/A
redis:env:key_hit	Unknown	`job`	N/A
redis:env:key_hit_rate	Unknown	`job`	N/A
redis:env:key_miss	Unknown	`job`	N/A
redis:env:mem_usage	Unknown	`job`	N/A
redis:env:net_traffic	Unknown	`job`	N/A
redis:env:qps	Unknown	`job`	N/A
redis:env:qps_mu	Unknown	`job`	N/A
redis:env:qps_realtime	Unknown	`job`	N/A
redis:env:qps_sigma	Unknown	`job`	N/A
redis:env:rt	Unknown	`job`	N/A
redis:env:rt_mu	Unknown	`job`	N/A
redis:env:rt_sigma	Unknown	`job`	N/A
redis:env:rx	Unknown	`job`	N/A
redis:env:tx	Unknown	`job`	N/A
redis:ins	Unknown	`cls`, `id`, `instance`, `ins`, `job`	N/A
redis:ins:blocked_clients	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:clients	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:cmd_qps	Unknown	`cls`, `cmd`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:cmd_rt	Unknown	`cls`, `cmd`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:cmd_time	Unknown	`cls`, `cmd`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:conn_rate	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:conn_reject	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:cpu_sys	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:cpu_sys_child	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:cpu_usage	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:cpu_usage_child	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:cpu_user	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:cpu_user_child	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:key_evict	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:key_expire	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:key_hit	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:key_hit_rate	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:key_miss	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:lsn_rate	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:mem_usage	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:net_traffic	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:qps	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:qps_mu	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:qps_realtime	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:qps_sigma	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:rt	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:rt_mu	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:rt_sigma	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:rx	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:ins:tx	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:node:ip	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis:node:mem_alloc	Unknown	`cls`, `ip`, `job`	N/A
redis:node:mem_total	Unknown	`cls`, `ip`, `job`	N/A
redis:node:mem_used	Unknown	`cls`, `ip`, `job`	N/A
redis:node:qps	Unknown	`cls`, `ip`, `job`	N/A
redis_active_defrag_running	gauge	`cls`, `ip`, `instance`, `ins`, `job`	active_defrag_running metric
redis_allocator_active_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	allocator_active_bytes metric
redis_allocator_allocated_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	allocator_allocated_bytes metric
redis_allocator_frag_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	allocator_frag_bytes metric
redis_allocator_frag_ratio	gauge	`cls`, `ip`, `instance`, `ins`, `job`	allocator_frag_ratio metric
redis_allocator_resident_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	allocator_resident_bytes metric
redis_allocator_rss_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	allocator_rss_bytes metric
redis_allocator_rss_ratio	gauge	`cls`, `ip`, `instance`, `ins`, `job`	allocator_rss_ratio metric
redis_aof_current_rewrite_duration_sec	gauge	`cls`, `ip`, `instance`, `ins`, `job`	aof_current_rewrite_duration_sec metric
redis_aof_enabled	gauge	`cls`, `ip`, `instance`, `ins`, `job`	aof_enabled metric
redis_aof_last_bgrewrite_status	gauge	`cls`, `ip`, `instance`, `ins`, `job`	aof_last_bgrewrite_status metric
redis_aof_last_cow_size_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	aof_last_cow_size_bytes metric
redis_aof_last_rewrite_duration_sec	gauge	`cls`, `ip`, `instance`, `ins`, `job`	aof_last_rewrite_duration_sec metric
redis_aof_last_write_status	gauge	`cls`, `ip`, `instance`, `ins`, `job`	aof_last_write_status metric
redis_aof_rewrite_in_progress	gauge	`cls`, `ip`, `instance`, `ins`, `job`	aof_rewrite_in_progress metric
redis_aof_rewrite_scheduled	gauge	`cls`, `ip`, `instance`, `ins`, `job`	aof_rewrite_scheduled metric
redis_blocked_clients	gauge	`cls`, `ip`, `instance`, `ins`, `job`	blocked_clients metric
redis_client_recent_max_input_buffer_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	client_recent_max_input_buffer_bytes metric
redis_client_recent_max_output_buffer_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	client_recent_max_output_buffer_bytes metric
redis_clients_in_timeout_table	gauge	`cls`, `ip`, `instance`, `ins`, `job`	clients_in_timeout_table metric
redis_cluster_connections	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_connections metric
redis_cluster_current_epoch	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_current_epoch metric
redis_cluster_enabled	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_enabled metric
redis_cluster_known_nodes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_known_nodes metric
redis_cluster_messages_received_total	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_messages_received_total metric
redis_cluster_messages_sent_total	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_messages_sent_total metric
redis_cluster_my_epoch	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_my_epoch metric
redis_cluster_size	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_size metric
redis_cluster_slots_assigned	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_slots_assigned metric
redis_cluster_slots_fail	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_slots_fail metric
redis_cluster_slots_ok	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_slots_ok metric
redis_cluster_slots_pfail	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_slots_pfail metric
redis_cluster_state	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_state metric
redis_cluster_stats_messages_meet_received	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_stats_messages_meet_received metric
redis_cluster_stats_messages_meet_sent	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_stats_messages_meet_sent metric
redis_cluster_stats_messages_ping_received	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_stats_messages_ping_received metric
redis_cluster_stats_messages_ping_sent	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_stats_messages_ping_sent metric
redis_cluster_stats_messages_pong_received	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_stats_messages_pong_received metric
redis_cluster_stats_messages_pong_sent	gauge	`cls`, `ip`, `instance`, `ins`, `job`	cluster_stats_messages_pong_sent metric
redis_commands_duration_seconds_total	counter	`cls`, `cmd`, `ip`, `instance`, `ins`, `job`	Total amount of time in seconds spent per command
redis_commands_failed_calls_total	counter	`cls`, `cmd`, `ip`, `instance`, `ins`, `job`	Total number of errors prior command execution per command
redis_commands_latencies_usec_bucket	Unknown	`cls`, `cmd`, `ip`, `le`, `instance`, `ins`, `job`	N/A
redis_commands_latencies_usec_count	Unknown	`cls`, `cmd`, `ip`, `instance`, `ins`, `job`	N/A
redis_commands_latencies_usec_sum	Unknown	`cls`, `cmd`, `ip`, `instance`, `ins`, `job`	N/A
redis_commands_processed_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	commands_processed_total metric
redis_commands_rejected_calls_total	counter	`cls`, `cmd`, `ip`, `instance`, `ins`, `job`	Total number of errors within command execution per command
redis_commands_total	counter	`cls`, `cmd`, `ip`, `instance`, `ins`, `job`	Total number of calls per command
redis_config_io_threads	gauge	`cls`, `ip`, `instance`, `ins`, `job`	config_io_threads metric
redis_config_maxclients	gauge	`cls`, `ip`, `instance`, `ins`, `job`	config_maxclients metric
redis_config_maxmemory	gauge	`cls`, `ip`, `instance`, `ins`, `job`	config_maxmemory metric
redis_connected_clients	gauge	`cls`, `ip`, `instance`, `ins`, `job`	connected_clients metric
redis_connected_slave_lag_seconds	gauge	`cls`, `ip`, `slave_ip`, `instance`, `slave_state`, `ins`, `slave_port`, `job`	Lag of connected slave
redis_connected_slave_offset_bytes	gauge	`cls`, `ip`, `slave_ip`, `instance`, `slave_state`, `ins`, `slave_port`, `job`	Offset of connected slave
redis_connected_slaves	gauge	`cls`, `ip`, `instance`, `ins`, `job`	connected_slaves metric
redis_connections_received_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	connections_received_total metric
redis_cpu_sys_children_seconds_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	cpu_sys_children_seconds_total metric
redis_cpu_sys_main_thread_seconds_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	cpu_sys_main_thread_seconds_total metric
redis_cpu_sys_seconds_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	cpu_sys_seconds_total metric
redis_cpu_user_children_seconds_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	cpu_user_children_seconds_total metric
redis_cpu_user_main_thread_seconds_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	cpu_user_main_thread_seconds_total metric
redis_cpu_user_seconds_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	cpu_user_seconds_total metric
redis_db_keys	gauge	`cls`, `ip`, `instance`, `ins`, `db`, `job`	Total number of keys by DB
redis_db_keys_expiring	gauge	`cls`, `ip`, `instance`, `ins`, `db`, `job`	Total number of expiring keys by DB
redis_defrag_hits	gauge	`cls`, `ip`, `instance`, `ins`, `job`	defrag_hits metric
redis_defrag_key_hits	gauge	`cls`, `ip`, `instance`, `ins`, `job`	defrag_key_hits metric
redis_defrag_key_misses	gauge	`cls`, `ip`, `instance`, `ins`, `job`	defrag_key_misses metric
redis_defrag_misses	gauge	`cls`, `ip`, `instance`, `ins`, `job`	defrag_misses metric
redis_dump_payload_sanitizations	counter	`cls`, `ip`, `instance`, `ins`, `job`	dump_payload_sanitizations metric
redis_errors_total	counter	`cls`, `ip`, `err`, `instance`, `ins`, `job`	Total number of errors per error type
redis_evicted_keys_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	evicted_keys_total metric
redis_expired_keys_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	expired_keys_total metric
redis_expired_stale_percentage	gauge	`cls`, `ip`, `instance`, `ins`, `job`	expired_stale_percentage metric
redis_expired_time_cap_reached_total	gauge	`cls`, `ip`, `instance`, `ins`, `job`	expired_time_cap_reached_total metric
redis_exporter_build_info	gauge	`cls`, `golang_version`, `ip`, `commit_sha`, `instance`, `version`, `ins`, `job`, `build_date`	redis exporter build_info
redis_exporter_last_scrape_connect_time_seconds	gauge	`cls`, `ip`, `instance`, `ins`, `job`	exporter_last_scrape_connect_time_seconds metric
redis_exporter_last_scrape_duration_seconds	gauge	`cls`, `ip`, `instance`, `ins`, `job`	exporter_last_scrape_duration_seconds metric
redis_exporter_last_scrape_error	gauge	`cls`, `ip`, `instance`, `ins`, `job`	The last scrape error status.
redis_exporter_scrape_duration_seconds_count	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis_exporter_scrape_duration_seconds_sum	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
redis_exporter_scrapes_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	Current total redis scrapes.
redis_instance_info	gauge	`cls`, `ip`, `os`, `role`, `instance`, `run_id`, `redis_version`, `tcp_port`, `process_id`, `ins`, `redis_mode`, `maxmemory_policy`, `redis_build_id`, `job`	Information about the Redis instance
redis_io_threaded_reads_processed	counter	`cls`, `ip`, `instance`, `ins`, `job`	io_threaded_reads_processed metric
redis_io_threaded_writes_processed	counter	`cls`, `ip`, `instance`, `ins`, `job`	io_threaded_writes_processed metric
redis_io_threads_active	gauge	`cls`, `ip`, `instance`, `ins`, `job`	io_threads_active metric
redis_keyspace_hits_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	keyspace_hits_total metric
redis_keyspace_misses_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	keyspace_misses_total metric
redis_last_key_groups_scrape_duration_milliseconds	gauge	`cls`, `ip`, `instance`, `ins`, `job`	Duration of the last key group metrics scrape in milliseconds
redis_last_slow_execution_duration_seconds	gauge	`cls`, `ip`, `instance`, `ins`, `job`	The amount of time needed for last slow execution, in seconds
redis_latency_percentiles_usec	summary	`cls`, `cmd`, `ip`, `instance`, `quantile`, `ins`, `job`	A summary of latency percentile distribution per command
redis_latency_percentiles_usec_count	Unknown	`cls`, `cmd`, `ip`, `instance`, `ins`, `job`	N/A
redis_latency_percentiles_usec_sum	Unknown	`cls`, `cmd`, `ip`, `instance`, `ins`, `job`	N/A
redis_latest_fork_seconds	gauge	`cls`, `ip`, `instance`, `ins`, `job`	latest_fork_seconds metric
redis_lazyfree_pending_objects	gauge	`cls`, `ip`, `instance`, `ins`, `job`	lazyfree_pending_objects metric
redis_loading_dump_file	gauge	`cls`, `ip`, `instance`, `ins`, `job`	loading_dump_file metric
redis_master_last_io_seconds_ago	gauge	`cls`, `ip`, `master_host`, `instance`, `ins`, `job`, `master_port`	Master last io seconds ago
redis_master_link_up	gauge	`cls`, `ip`, `master_host`, `instance`, `ins`, `job`, `master_port`	Master link status on Redis slave
redis_master_repl_offset	gauge	`cls`, `ip`, `instance`, `ins`, `job`	master_repl_offset metric
redis_master_sync_in_progress	gauge	`cls`, `ip`, `master_host`, `instance`, `ins`, `job`, `master_port`	Master sync in progress
redis_mem_clients_normal	gauge	`cls`, `ip`, `instance`, `ins`, `job`	mem_clients_normal metric
redis_mem_clients_slaves	gauge	`cls`, `ip`, `instance`, `ins`, `job`	mem_clients_slaves metric
redis_mem_fragmentation_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	mem_fragmentation_bytes metric
redis_mem_fragmentation_ratio	gauge	`cls`, `ip`, `instance`, `ins`, `job`	mem_fragmentation_ratio metric
redis_mem_not_counted_for_eviction_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	mem_not_counted_for_eviction_bytes metric
redis_memory_max_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	memory_max_bytes metric
redis_memory_used_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	memory_used_bytes metric
redis_memory_used_dataset_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	memory_used_dataset_bytes metric
redis_memory_used_lua_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	memory_used_lua_bytes metric
redis_memory_used_overhead_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	memory_used_overhead_bytes metric
redis_memory_used_peak_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	memory_used_peak_bytes metric
redis_memory_used_rss_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	memory_used_rss_bytes metric
redis_memory_used_scripts_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	memory_used_scripts_bytes metric
redis_memory_used_startup_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	memory_used_startup_bytes metric
redis_migrate_cached_sockets_total	gauge	`cls`, `ip`, `instance`, `ins`, `job`	migrate_cached_sockets_total metric
redis_module_fork_in_progress	gauge	`cls`, `ip`, `instance`, `ins`, `job`	module_fork_in_progress metric
redis_module_fork_last_cow_size	gauge	`cls`, `ip`, `instance`, `ins`, `job`	module_fork_last_cow_size metric
redis_net_input_bytes_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	net_input_bytes_total metric
redis_net_output_bytes_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	net_output_bytes_total metric
redis_number_of_cached_scripts	gauge	`cls`, `ip`, `instance`, `ins`, `job`	number_of_cached_scripts metric
redis_process_id	gauge	`cls`, `ip`, `instance`, `ins`, `job`	process_id metric
redis_pubsub_channels	gauge	`cls`, `ip`, `instance`, `ins`, `job`	pubsub_channels metric
redis_pubsub_patterns	gauge	`cls`, `ip`, `instance`, `ins`, `job`	pubsub_patterns metric
redis_pubsubshard_channels	gauge	`cls`, `ip`, `instance`, `ins`, `job`	pubsubshard_channels metric
redis_rdb_bgsave_in_progress	gauge	`cls`, `ip`, `instance`, `ins`, `job`	rdb_bgsave_in_progress metric
redis_rdb_changes_since_last_save	gauge	`cls`, `ip`, `instance`, `ins`, `job`	rdb_changes_since_last_save metric
redis_rdb_current_bgsave_duration_sec	gauge	`cls`, `ip`, `instance`, `ins`, `job`	rdb_current_bgsave_duration_sec metric
redis_rdb_last_bgsave_duration_sec	gauge	`cls`, `ip`, `instance`, `ins`, `job`	rdb_last_bgsave_duration_sec metric
redis_rdb_last_bgsave_status	gauge	`cls`, `ip`, `instance`, `ins`, `job`	rdb_last_bgsave_status metric
redis_rdb_last_cow_size_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	rdb_last_cow_size_bytes metric
redis_rdb_last_save_timestamp_seconds	gauge	`cls`, `ip`, `instance`, `ins`, `job`	rdb_last_save_timestamp_seconds metric
redis_rejected_connections_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	rejected_connections_total metric
redis_repl_backlog_first_byte_offset	gauge	`cls`, `ip`, `instance`, `ins`, `job`	repl_backlog_first_byte_offset metric
redis_repl_backlog_history_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	repl_backlog_history_bytes metric
redis_repl_backlog_is_active	gauge	`cls`, `ip`, `instance`, `ins`, `job`	repl_backlog_is_active metric
redis_replica_partial_resync_accepted	gauge	`cls`, `ip`, `instance`, `ins`, `job`	replica_partial_resync_accepted metric
redis_replica_partial_resync_denied	gauge	`cls`, `ip`, `instance`, `ins`, `job`	replica_partial_resync_denied metric
redis_replica_resyncs_full	gauge	`cls`, `ip`, `instance`, `ins`, `job`	replica_resyncs_full metric
redis_replication_backlog_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	replication_backlog_bytes metric
redis_second_repl_offset	gauge	`cls`, `ip`, `instance`, `ins`, `job`	second_repl_offset metric
redis_sentinel_master_ckquorum_status	gauge	`cls`, `ip`, `message`, `instance`, `ins`, `master_name`, `job`	Master ckquorum status
redis_sentinel_master_ok_sentinels	gauge	`cls`, `ip`, `instance`, `ins`, `master_address`, `master_name`, `job`	The number of okay sentinels monitoring this master
redis_sentinel_master_ok_slaves	gauge	`cls`, `ip`, `instance`, `ins`, `master_address`, `master_name`, `job`	The number of okay slaves of the master
redis_sentinel_master_sentinels	gauge	`cls`, `ip`, `instance`, `ins`, `master_address`, `master_name`, `job`	The number of sentinels monitoring this master
redis_sentinel_master_setting_ckquorum	gauge	`cls`, `ip`, `instance`, `ins`, `master_address`, `master_name`, `job`	Show the current ckquorum config for each master
redis_sentinel_master_setting_down_after_milliseconds	gauge	`cls`, `ip`, `instance`, `ins`, `master_address`, `master_name`, `job`	Show the current down-after-milliseconds config for each master
redis_sentinel_master_setting_failover_timeout	gauge	`cls`, `ip`, `instance`, `ins`, `master_address`, `master_name`, `job`	Show the current failover-timeout config for each master
redis_sentinel_master_setting_parallel_syncs	gauge	`cls`, `ip`, `instance`, `ins`, `master_address`, `master_name`, `job`	Show the current parallel-syncs config for each master
redis_sentinel_master_slaves	gauge	`cls`, `ip`, `instance`, `ins`, `master_address`, `master_name`, `job`	The number of slaves of the master
redis_sentinel_master_status	gauge	`cls`, `ip`, `master_status`, `instance`, `ins`, `master_address`, `master_name`, `job`	Master status on Sentinel
redis_sentinel_masters	gauge	`cls`, `ip`, `instance`, `ins`, `job`	The number of masters this sentinel is watching
redis_sentinel_running_scripts	gauge	`cls`, `ip`, `instance`, `ins`, `job`	Number of scripts in execution right now
redis_sentinel_scripts_queue_length	gauge	`cls`, `ip`, `instance`, `ins`, `job`	Queue of user scripts to execute
redis_sentinel_simulate_failure_flags	gauge	`cls`, `ip`, `instance`, `ins`, `job`	Failures simulations
redis_sentinel_tilt	gauge	`cls`, `ip`, `instance`, `ins`, `job`	Sentinel is in TILT mode
redis_slave_expires_tracked_keys	gauge	`cls`, `ip`, `instance`, `ins`, `job`	slave_expires_tracked_keys metric
redis_slave_info	gauge	`cls`, `ip`, `master_host`, `instance`, `read_only`, `ins`, `job`, `master_port`	Information about the Redis slave
redis_slave_priority	gauge	`cls`, `ip`, `instance`, `ins`, `job`	slave_priority metric
redis_slave_repl_offset	gauge	`cls`, `ip`, `master_host`, `instance`, `ins`, `job`, `master_port`	Slave replication offset
redis_slowlog_last_id	gauge	`cls`, `ip`, `instance`, `ins`, `job`	Last id of slowlog
redis_slowlog_length	gauge	`cls`, `ip`, `instance`, `ins`, `job`	Total slowlog
redis_start_time_seconds	gauge	`cls`, `ip`, `instance`, `ins`, `job`	Start time of the Redis instance since unix epoch in seconds.
redis_target_scrape_request_errors_total	counter	`cls`, `ip`, `instance`, `ins`, `job`	Errors in requests to the exporter
redis_total_error_replies	counter	`cls`, `ip`, `instance`, `ins`, `job`	total_error_replies metric
redis_total_reads_processed	counter	`cls`, `ip`, `instance`, `ins`, `job`	total_reads_processed metric
redis_total_system_memory_bytes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	total_system_memory_bytes metric
redis_total_writes_processed	counter	`cls`, `ip`, `instance`, `ins`, `job`	total_writes_processed metric
redis_tracking_clients	gauge	`cls`, `ip`, `instance`, `ins`, `job`	tracking_clients metric
redis_tracking_total_items	gauge	`cls`, `ip`, `instance`, `ins`, `job`	tracking_total_items metric
redis_tracking_total_keys	gauge	`cls`, `ip`, `instance`, `ins`, `job`	tracking_total_keys metric
redis_tracking_total_prefixes	gauge	`cls`, `ip`, `instance`, `ins`, `job`	tracking_total_prefixes metric
redis_unexpected_error_replies	counter	`cls`, `ip`, `instance`, `ins`, `job`	unexpected_error_replies metric
redis_up	gauge	`cls`, `ip`, `instance`, `ins`, `job`	Information about the Redis instance
redis_uptime_in_seconds	gauge	`cls`, `ip`, `instance`, `ins`, `job`	uptime_in_seconds metric
scrape_duration_seconds	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
scrape_samples_post_metric_relabeling	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
scrape_samples_scraped	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
scrape_series_added	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
up	Unknown	`cls`, `ip`, `instance`, `ins`, `job`	N/A
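
One common use of the counters above is the keyspace hit ratio. The sketch below is illustrative only: the function name and the two-sample input are assumptions, not part of Pigsty or redis_exporter. It derives the ratio from two successive scrapes of redis_keyspace_hits_total and redis_keyspace_misses_total.

```python
from typing import Optional

def keyspace_hit_ratio(hits_prev: float, misses_prev: float,
                       hits_curr: float, misses_curr: float) -> Optional[float]:
    """Hit ratio over the interval between two scrapes of the counters."""
    d_hits = hits_curr - hits_prev
    d_misses = misses_curr - misses_prev
    # A negative delta means the counter was reset (e.g. Redis restarted).
    if d_hits < 0 or d_misses < 0 or d_hits + d_misses == 0:
        return None
    return d_hits / (d_hits + d_misses)
```

Returning None instead of 0 keeps "no lookups in this interval" distinguishable from "every lookup missed".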

12.7 - FAQ

Frequently asked questions about the Pigsty REDIS module.

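Redis init fails: ABORT due to existing redis instance

This means the Redis instance being initialized already exists. Set redis_clean = true and redis_safeguard = false to force removal of the existing Redis data.

This happens when you run redis.yml against a Redis instance that is already running while redis_clean is set to false.

If redis_clean is set to true (and redis_safeguard is set to false), the redis.yml playbook removes the existing Redis instance and re-initializes it as a new one, which makes the redis.yml playbook fully idempotent.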

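Redis init fails: ABORT due to redis_safeguard enabled

This means the Redis instance about to be cleaned up has its accidental-deletion safeguard turned on: the error appears when you try to remove a Redis instance while redis_safeguard is set to true.

You can disable redis_safeguard to remove the Redis instance. Blocking accidental removal is exactly what redis_safeguard is for.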
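
The interaction of the two flags can be summarized as a small decision function. This is a sketch of the behavior described in the two answers above, not the playbook's actual code: the function name is hypothetical, and the order in which the two checks run when redis_clean is false and redis_safeguard is true is an assumption.

```python
def redis_init_action(instance_exists: bool, redis_clean: bool, redis_safeguard: bool) -> str:
    """Outcome of running redis.yml against one instance (illustrative only)."""
    if not instance_exists:
        return "init"
    if not redis_clean:
        # An existing instance is never touched unless cleaning is requested.
        return "ABORT due to existing redis instance"
    if redis_safeguard:
        # Cleaning was requested, but the safeguard forbids removal.
        return "ABORT due to redis_safeguard enabled"
    return "purge and re-init"
```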

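How do I add a new Redis instance on a node?

Use bin/redis-add <ip> <port> to deploy a new Redis instance on the node.

How do I remove a specific instance from a node?

Use bin/redis-rm <ip> <port> to remove a single Redis instance from the node.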

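13 - Module: FERRET

Pigsty has built-in native deployment support for FerretDB, which adds MongoDB wire-protocol-level API compatibility to PostgreSQL.

Use FerretDB to add MongoDB-compatible protocol support to PostgreSQL. Configuration | Administration | Playbook | Monitoring | Parameters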

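MongoDB was once an amazing technology that freed developers from the "schema constraints" of relational databases and let them build applications quickly. Over time, however, MongoDB abandoned its open-source nature and changed its license to SSPL, which made it unusable for many open-source projects and early-stage commercial projects. Most MongoDB users do not actually need the advanced features MongoDB offers, but they do need an easy-to-use open-source document database. FerretDB was created to fill this gap.

PostgreSQL's JSON support is already quite complete: binary JSONB storage, GIN indexes on arbitrary fields, a wide range of JSON processing functions, JSON PATH, and JSON Schema. It has long been a full-featured, high-performance document database. But offering alternative functionality is not the same as direct emulation. FerretDB gives applications that use MongoDB drivers a smooth transition path to PostgreSQL.

Pigsty offered a Docker-based FerretDB template back in 1.x and added native deployment support in v2.3. As an optional component, it greatly enriches the PostgreSQL ecosystem. The Pigsty community has partnered with the FerretDB community, and deeper cooperation and adaptation work will follow.

FerretDB 2.0 is a new version built on Microsoft's DocumentDB extension, and it is officially supported as of Pigsty v3.3.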

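13.1 - Usage

Getting started: how to start using FerretDB, how to connect to it reliably, and how to use the mongosh client.

Install the client tool

You can use MongoSH, the MongoDB command-line shell, to access FerretDB.

The pig CLI can add the MongoDB repository, after which you can install mongosh with yum or apt: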

pig repo add mongo -u              # add the MongoDB repository
yum install mongodb-mongosh        # on EL systems
apt install mongodb-mongosh        # on Debian/Ubuntu systems

Connect to FerretDB

You can use a MongoDB connection string to access FerretDB with a MongoDB driver in any language. Here the mongosh CLI installed above serves as the example:

$ mongosh
Current Mongosh Log ID:	67ba8c1fe551f042bf51e943
Connecting to:		mongodb://127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+2.4.0
Using MongoDB:		7.0.77
Using Mongosh:		2.4.0

For mongosh info see: https://www.mongodb.com/docs/mongodb-shell/

test>

Authentication

You can log in as other users; see FerretDB: Authentication for details.

mongosh 'mongodb://dbuser_meta:DBUser.Meta@10.10.10.10:27017/meta'      # business admin user
mongosh 'mongodb://dbuser_view:DBUser.Viewer@10.10.10.10:27017/meta'    # read-only user
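
When a password contains characters such as @, : or /, it must be percent-encoded before it goes into a MongoDB connection string. A minimal sketch using only the Python standard library; the helper name mongo_uri is hypothetical and the default port is the one used in this document:

```python
from urllib.parse import quote

def mongo_uri(user: str, password: str, host: str, port: int = 27017, db: str = "") -> str:
    """Build a MongoDB connection string, percent-encoding the credentials."""
    # safe="" makes quote() escape every reserved character, including "/" and "@".
    cred = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"mongodb://{cred}@{host}:{port}/{db}"

print(mongo_uri("dbuser_meta", "DBUser.Meta", "10.10.10.10", 27017, "meta"))
```

The resulting string can be passed to mongosh or to any MongoDB driver.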

Quick start

You can connect to FerretDB and treat it as if it were a MongoDB cluster.

$ mongosh 'mongodb://dbuser_meta:DBUser.Meta@10.10.10.10:27017/meta'

MongoDB commands are translated into SQL commands and executed in the underlying PostgreSQL:

use test                            // CREATE SCHEMA test;
db.dropDatabase();                  // DROP SCHEMA test;
db.createCollection('posts');       // CREATE TABLE posts(_data JSONB,...)
db.posts.insertOne({                // INSERT INTO posts VALUES(...);
    title: 'Post One',body: 'Body of post one',category: 'News',tags: ['news', 'events'],
    user: {name: 'John Doe',status: 'author'},date: Date()}
);
db.posts.find().limit(2).pretty();  // SELECT * FROM posts LIMIT 2;
db.posts.createIndex({ title: 1 })  // CREATE INDEX ON posts(_data->>'title');

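If you are not very familiar with MongoDB, here is a quick-start tutorial that applies to FerretDB as well: Perform CRUD Operations with MongoDB Shell

If you want to generate some sample load, you can run the following simple test script with mongosh: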

cat > benchmark.js <<'EOF'
const coll = "testColl";
const numDocs = 1000;

for (let i = 0; i < numDocs; i++) {  // insert
  db.getCollection(coll).insertOne({ num: i, name: "MongoDB Benchmark Test" });
}

for (let i = 0; i < numDocs; i++) {  // select
  db.getCollection(coll).find({ num: i }).toArray();  // toArray() drains the cursor so the query actually runs
}

for (let i = 0; i < numDocs; i++) {  // update
  db.getCollection(coll).updateOne({ num: i }, { $set: { name: "Updated" } });
}

for (let i = 0; i < numDocs; i++) {  // delete
  db.getCollection(coll).deleteOne({ num: i });
}
EOF

mongosh 'mongodb://dbuser_meta:DBUser.Meta@10.10.10.10:27017' benchmark.js

You can consult the list of MongoDB commands supported by FerretDB. There are also some known differences, which are usually not a big problem for basic usage.

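13.2 - Configuration

Configure the FerretDB cluster and the PostgreSQL cluster it requires.

Configuration

Before deploying a Mongo (FerretDB) cluster, you need to define it in the inventory using the relevant parameters.

The following example uses the meta database of the default single-node pg-meta cluster as the underlying storage for FerretDB: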

all:
  children:

    #----------------------------------#
    # ferretdb for mongodb on postgresql
    #----------------------------------#
    # ./mongo.yml -l ferret
    ferret:
      hosts:
        10.10.10.10: { mongo_seq: 1 }
      vars:
        mongo_cluster: ferret
        mongo_pgurl: 'postgres://mongod:DBUser.Mongo@10.10.10.10:5432/meta'

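Here mongo_cluster and mongo_seq are the mandatory identity parameters. FerretDB additionally requires mongo_pgurl, which specifies the location of the underlying PostgreSQL.

Note that mongo_pgurl requires a PostgreSQL superuser. This example defines a dedicated mongod superuser for FerretDB.

Note that FerretDB authentication is based entirely on PostgreSQL. You can create other regular users through either FerretDB or PostgreSQL.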
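
Before running the playbook it can help to confirm that mongo_pgurl points at the intended user, host, and database. A minimal sketch with the Python standard library; the helper name parse_pgurl and the fallback port 5432 are illustrative assumptions, not Pigsty code:

```python
from urllib.parse import urlsplit

def parse_pgurl(pgurl: str) -> dict:
    """Split a postgres:// URL into its connection components."""
    parts = urlsplit(pgurl)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"not a PostgreSQL URL: {pgurl!r}")
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port or 5432,          # assumed default when the URL omits it
        "database": parts.path.lstrip("/"),
    }

print(parse_pgurl("postgres://mongod:DBUser.Mongo@10.10.10.10:5432/meta"))
```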

PostgreSQL cluster

FerretDB 2.0+ requires the DocumentDB extension, which in turn depends on several other extensions. Below is a template for the PostgreSQL cluster that FerretDB needs:

all:
  children:

    #----------------------------------#
    # pgsql (singleton on current node)
    #----------------------------------#
    # postgres cluster: pg-meta
    pg-meta:
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars:
        pg_cluster: pg-meta
        pg_users:
          - { name: mongod      ,password: DBUser.Mongo  ,pgbouncer: true ,roles: [dbrole_admin ] ,superuser: true ,comment: ferretdb super user }
          - { name: dbuser_meta ,password: DBUser.Meta   ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: pigsty admin user }
          - { name: dbuser_view ,password: DBUser.Viewer ,pgbouncer: true ,roles: [dbrole_readonly] ,comment: read-only viewer for meta database }
        pg_databases:
          - {name: meta, owner: mongod ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty] ,extensions: [ documentdb, postgis, vector, pg_cron, rum ]}
        pg_hba_rules:
          - { user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes' }
          - { user: mongod      , db: all ,addr: world ,auth: pwd ,title: 'mongodb password access from everywhere' }
        pg_extensions:
          - documentdb, citus, postgis, pgvector, pg_cron, rum
        pg_parameters:
          cron.database_name: meta
        pg_libs: 'pg_documentdb, pg_documentdb_core, pg_cron, pg_stat_statements, auto_explain'  # libraries added to shared_preload_libraries

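High availability

You can use a service to connect to a highly available PostgreSQL cluster, deploy multiple FerretDB instance replicas, and bind an L2 VIP to make the FerretDB layer itself highly available.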

ferret:
  hosts:
    10.10.10.45: { mongo_seq: 1 }
    10.10.10.46: { mongo_seq: 2 }
    10.10.10.47: { mongo_seq: 3 }
  vars:
    mongo_cluster: ferret
    mongo_pgurl: 'postgres://mongod:DBUser.Mongo@10.10.10.3:5436/test'
    vip_enabled: true
    vip_vrid: 128
    vip_address: 10.10.10.99
    vip_interface: eth1

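Parameters

The FERRET module provides 9 configuration parameters, as listed in the table below:

Parameter	Type	Level	Comment
`mongo_seq`	int	I	mongo instance number, required identity parameter
`mongo_cluster`	string	C	mongo cluster name, required identity parameter
`mongo_pgurl`	pgurl	C/I	PGURL connection string of the PostgreSQL underlying mongo/ferretdb, required
`mongo_ssl_enabled`	bool	C	enable SSL for mongo/ferretdb? `false` by default
`mongo_listen`	ip	C	mongo listen address; empty by default, which listens on all addresses
`mongo_port`	port	C	mongo service port, 27017 by default
`mongo_ssl_port`	port	C	mongo TLS listen port, 27018 by default
`mongo_exporter_port`	port	C	mongo exporter port, 9216 by default
`mongo_extra_vars`	string	C	extra environment variables for the MONGO server, empty string by default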

# mongo_cluster:        #CLUSTER  # mongo cluster name, required identity parameter
# mongo_seq: 0          #INSTANCE # mongo instance seq number, required identity parameter
# mongo_pgurl: 'postgres:///'     # mongo/ferretdb underlying postgresql url, required
mongo_ssl_enabled: false          # mongo/ferretdb ssl enabled, false by default
mongo_listen: ''                  # mongo/ferretdb listen address, '' for all addr
mongo_port: 27017                 # mongo/ferretdb listen port, 27017 by default
mongo_ssl_port: 27018             # mongo/ferretdb tls listen port, 27018 by default
mongo_exporter_port: 9216         # mongo/ferretdb exporter port, 9216 by default
mongo_extra_vars: ''              # extra environment variables for mongo/ferretdb
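
The three required parameters can be checked before the playbook runs. The sketch below is a hypothetical pre-flight check, not what the playbook itself does: the function name, the input shape (cluster vars plus a host-to-vars mapping), and the rule that mongo_seq must be unique within a group are all assumptions made for illustration.

```python
def check_mongo_identity(cluster_vars: dict, hosts: dict) -> list:
    """Return a list of problems found in one FerretDB group definition."""
    problems = []
    if not cluster_vars.get("mongo_cluster"):
        problems.append("missing mongo_cluster")
    seen = {}  # mongo_seq -> first host that used it
    for ip, host_vars in hosts.items():
        # mongo_pgurl may be set per cluster or per instance (level C/I).
        if not (host_vars.get("mongo_pgurl") or cluster_vars.get("mongo_pgurl")):
            problems.append(f"{ip}: missing mongo_pgurl")
        seq = host_vars.get("mongo_seq")
        if not isinstance(seq, int):
            problems.append(f"{ip}: mongo_seq must be an integer")
        elif seq in seen:
            problems.append(f"{ip}: mongo_seq {seq} already used by {seen[seq]}")
        else:
            seen[seq] = ip
    return problems
```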

13.3 - Administration

FerretDB administration playbook, monitoring dashboard, and SOPs.

Playbook

Pigsty provides a built-in playbook, mongo.yml, for installing a FerretDB cluster on nodes.

`mongo.yml`

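The playbook consists of the following subtasks:

mongo_check : check the mongo identity parameters
mongo_dbsu : create the operating system user mongod
mongo_install : install the mongo/ferretdb RPM package
mongo_purge : purge the existing mongo/ferretdb cluster (not run by default)
mongo_config : configure mongo/ferretdb
- mongo_cert : issue the mongo/ferretdb SSL certificate
mongo_launch : start the mongo/ferretdb service
mongo_register : register mongo/ferretdb with Prometheus monitoring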

Monitoring

The MONGO module provides a simple monitoring dashboard: Mongo Overview

Mongo Overview

Mongo Overview: Mongo/FerretDB cluster overview

This dashboard provides basic monitoring metrics for FerretDB. Because FerretDB uses PostgreSQL underneath, refer to PostgreSQL monitoring for more metrics.

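Create a Mongo cluster

After defining the MONGO cluster in the inventory, you can install it with the following command.

./mongo.yml -l ferret   # install "MongoDB/FerretDB" on the ferret group

Because FerretDB uses PostgreSQL as its underlying storage, re-running this playbook is usually harmless.

Remove a Mongo cluster

To remove a Mongo/FerretDB cluster, run the mongo_purge subtask of the mongo.yml playbook and pass the mongo_purge command-line parameter: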

./mongo.yml -e mongo_purge=true -t mongo_purge

13.4 - Metrics

Complete list of monitoring metrics provided by the Pigsty MONGO module, with explanations.

The MONGO module has 54 available metric types.

Metric Name	Type	Labels	Description
ferretdb_client_accepts_total	Unknown	`error`, `cls`, `ip`, `ins`, `instance`, `job`	N/A
ferretdb_client_duration_seconds_bucket	Unknown	`error`, `le`, `cls`, `ip`, `ins`, `instance`, `job`	N/A
ferretdb_client_duration_seconds_count	Unknown	`error`, `cls`, `ip`, `ins`, `instance`, `job`	N/A
ferretdb_client_duration_seconds_sum	Unknown	`error`, `cls`, `ip`, `ins`, `instance`, `job`	N/A
ferretdb_client_requests_total	Unknown	`cls`, `ip`, `ins`, `opcode`, `instance`, `command`, `job`	N/A
ferretdb_client_responses_total	Unknown	`result`, `argument`, `cls`, `ip`, `ins`, `opcode`, `instance`, `command`, `job`	N/A
ferretdb_postgresql_metadata_databases	gauge	`cls`, `ip`, `ins`, `instance`, `job`	The current number of database in the registry.
ferretdb_postgresql_pool_size	gauge	`cls`, `ip`, `ins`, `instance`, `job`	The current number of pools.
ferretdb_up	gauge	`cls`, `version`, `commit`, `ip`, `ins`, `dirty`, `telemetry`, `package`, `update_available`, `uuid`, `instance`, `job`, `branch`, `debug`	FerretDB instance state.
go_gc_duration_seconds	summary	`cls`, `ip`, `ins`, `instance`, `quantile`, `job`	A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count	Unknown	`cls`, `ip`, `ins`, `instance`, `job`	N/A
go_gc_duration_seconds_sum	Unknown	`cls`, `ip`, `ins`, `instance`, `job`	N/A
go_goroutines	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of goroutines that currently exist.
go_info	gauge	`cls`, `version`, `ip`, `ins`, `instance`, `job`	Information about the Go environment.
go_memstats_alloc_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total	counter	`cls`, `ip`, `ins`, `instance`, `job`	Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total	counter	`cls`, `ip`, `ins`, `instance`, `job`	Total number of frees.
go_memstats_gc_sys_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of heap bytes that are in use.
go_memstats_heap_objects	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of allocated objects.
go_memstats_heap_released_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of heap bytes released to OS.
go_memstats_heap_sys_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total	counter	`cls`, `ip`, `ins`, `instance`, `job`	Total number of pointer lookups.
go_memstats_mallocs_total	counter	`cls`, `ip`, `ins`, `instance`, `job`	Total number of mallocs.
go_memstats_mcache_inuse_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of bytes obtained from system.
go_threads	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of OS threads created.
mongo_up	Unknown	`cls`, `ip`, `ins`, `instance`, `job`	N/A
process_cpu_seconds_total	counter	`cls`, `ip`, `ins`, `instance`, `job`	Total user and system CPU time spent in seconds.
process_max_fds	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Maximum number of open file descriptors.
process_open_fds	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Number of open file descriptors.
process_resident_memory_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Resident memory size in bytes.
process_start_time_seconds	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Virtual memory size in bytes.
process_virtual_memory_max_bytes	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Maximum amount of virtual memory available in bytes.
promhttp_metric_handler_errors_total	counter	`job`, `cls`, `ip`, `ins`, `instance`, `cause`	Total number of internal errors encountered by the promhttp metric handler.
promhttp_metric_handler_requests_in_flight	gauge	`cls`, `ip`, `ins`, `instance`, `job`	Current number of scrapes being served.
promhttp_metric_handler_requests_total	counter	`job`, `cls`, `ip`, `ins`, `instance`, `code`	Total number of scrapes by HTTP status code.
scrape_duration_seconds	Unknown	`cls`, `ip`, `ins`, `instance`, `job`	N/A
scrape_samples_post_metric_relabeling	Unknown	`cls`, `ip`, `ins`, `instance`, `job`	N/A
scrape_samples_scraped	Unknown	`cls`, `ip`, `ins`, `instance`, `job`	N/A
scrape_series_added	Unknown	`cls`, `ip`, `ins`, `instance`, `job`	N/A
up	Unknown	`cls`, `ip`, `ins`, `instance`, `job`	N/A
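
The _sum and _count series of ferretdb_client_duration_seconds can be combined into an average client duration. The sketch below assumes these series behave like a standard Prometheus histogram (cumulative sum and count); the function name and the two-sample input are illustrative, not part of Pigsty or FerretDB.

```python
from typing import Optional

def mean_client_duration(sum_prev: float, count_prev: float,
                         sum_curr: float, count_curr: float) -> Optional[float]:
    """Average duration in seconds of the requests observed between two scrapes."""
    d_count = count_curr - count_prev
    d_sum = sum_curr - sum_prev
    if d_count <= 0 or d_sum < 0:   # no new observations, or a counter reset
        return None
    return d_sum / d_count
```

For example, if the sum grows from 2.0 to 14.0 seconds while the count grows from 10 to 14, the four new requests averaged 3.0 seconds each.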

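13.5 - FAQ

Frequently asked questions about the Pigsty FerretDB module.

14 - Module: DOCKER

The Docker daemon service lets users bring up containerized stateless software templates with one command and add all kinds of functionality.

Docker is the most popular containerization platform and provides standardized software delivery.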

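14.1 - Usage

Docker module quick start: installation, removal, download, repository, mirrors, proxy, and image pulling, plus what you need to know about Docker.

Pigsty has built-in Docker support, which you can use to deploy containerized applications quickly.

Getting started

Docker is an optional module, and most Pigsty configuration templates do not enable it by default. You therefore need to download and configure it explicitly before using Docker in Pigsty.

For example, the default meta template does not download or install Docker, whereas the rich single-node template does.

The key difference between these two configurations lies in two parameters: repo_modules and repo_packages.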

repo_modules: infra,node,pgsql,docker  # <--- enable the Docker repository
repo_packages:
  - node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-common, docker   # <--- download Docker

Once Docker has been downloaded, set the docker_enabled: true flag on the nodes where Docker should be installed, and configure the other parameters as needed.

infra:
  hosts:
    10.10.10.10: { infra_seq: 1 ,nodename: infra-1 }
    10.10.10.11: { infra_seq: 2 ,nodename: infra-2 }
  vars:
    docker_enabled: true  # install Docker on this group!

Finally, use the docker.yml playbook to install it on the nodes:

./docker.yml -l infra    # install Docker on the infra group

Install

If you only want to install Docker on certain nodes temporarily, directly from the Internet, consider the following command:

./node.yml -e '{"node_repo_modules":"node,docker","node_packages":["docker-ce,docker-compose-plugin"]}' -t node_repo,node_pkg -l <select_group_ip>

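This command first enables the upstream repositories for the node and docker modules on the target nodes, then installs the docker-ce and docker-compose-plugin packages (same names on EL and Debian).

If you would rather have the Docker packages downloaded automatically when Pigsty is initialized, see the instructions below.

Uninstall

Because removal is so simple, Pigsty does not provide an uninstall playbook for the Docker module. You can remove Docker directly with an Ansible command:

ansible infra -m package -b -a 'name=docker-ce state=absent'  # uninstall docker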

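Download

To download Docker during Pigsty installation, modify the repo_modules parameter in the inventory to enable the Docker repository, then list the Docker packages in the repo_packages or repo_extra_packages parameter.

repo_modules: infra,node,pgsql,docker  # <--- enable the Docker repository
repo_packages:
  - node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-common, docker   # <--- download Docker
repo_extra_packages:
  - pgsql-main docker # <--- can also be specified here

The docker entry here (which actually maps to the docker-ce and docker-compose-plugin packages) is downloaded automatically into the local repository during the default install.yml run. Once downloaded, the Docker packages are available to all nodes through the local repository.

If you have already finished installing Pigsty and the local repository has been initialized, you can run ./infra.yml -t repo_build after changing the configuration to re-download packages and rebuild the offline repository.

Installing Docker requires Docker's YUM/APT repository. Pigsty includes this repository by default but does not enable it; add docker to repo_modules to enable it before installing.

Repository

Downloading Docker requires the upstream Internet repository, which is already defined in the default repo_upstream under the module name docker: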

- { name: docker-ce ,description: 'Docker CE' ,module: docker  ,releases: [7,8,9] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://download.docker.com/linux/centos/$releasever/$basearch/stable'    ,china: 'https://mirrors.aliyun.com/docker-ce/linux/centos/$releasever/$basearch/stable'  ,europe: 'https://mirrors.xtom.de/docker-ce/linux/centos/$releasever/$basearch/stable' }}
- { name: docker-ce ,description: 'Docker CE' ,module: docker  ,releases: [11,12,20,22,24] ,arch: [x86_64, aarch64] ,baseurl: { default: 'https://download.docker.com/linux/${distro_name} ${distro_codename} stable' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/docker-ce/linux//${distro_name} ${distro_codename} stable' }}

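You can reference this repository by the docker module name in both the repo_modules and node_repo_modules parameters.

Note that Docker's official repository is blocked by default in mainland China, so you need a mirror site in China to complete the download.

If you are in mainland China and Docker itself fails to download, check whether region is set to default in your inventory. The automatically configured region: china normally solves this problem.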
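
The effect of region on the repository definition can be pictured as a lookup with a fallback. This is only a sketch of the idea: the function name is hypothetical, and falling back to the default entry when a region has no mirror is an assumption about how the selection works. The URLs are the EL entries from the repo_upstream definition above.

```python
def pick_baseurl(baseurl: dict, region: str = "default") -> str:
    """Choose the repository URL for a region, falling back to 'default'."""
    return baseurl.get(region) or baseurl["default"]

docker_ce_el = {
    "default": "https://download.docker.com/linux/centos/$releasever/$basearch/stable",
    "china": "https://mirrors.aliyun.com/docker-ce/linux/centos/$releasever/$basearch/stable",
    "europe": "https://mirrors.xtom.de/docker-ce/linux/centos/$releasever/$basearch/stable",
}
print(pick_baseurl(docker_ce_el, "china"))
```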

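Proxy

If your network requires a proxy server to reach the Internet, you can set the proxy_env parameter in the Pigsty inventory. This parameter is written into the proxy-related settings of the Docker configuration file.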

proxy_env:
  no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.aliyuncs.com,mirrors.tuna.tsinghua.edu.cn"
  #http_proxy: 'http://username:password@proxy.address.com'
  #https_proxy: 'http://username:password@proxy.address.com'
  #all_proxy: 'http://username:password@proxy.address.com'

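If you pass the -x flag when running configure, the proxy settings from the current environment are written automatically into all.vars.proxy_env in the Pigsty configuration file. If you want to configure the proxy only for the nodes that run Docker, define proxy_env at the cluster or host level instead of as a global variable.

For example, if your proxy server listens on 127.0.0.1:12345, you can use it with the following configuration:

all:
  vars:
    proxy_env:
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.tsinghua.edu.cn"
      #http_proxy:  127.0.0.1:12345 # put your proxy address here; it is used to download packages and pull images
      #https_proxy: 127.0.0.1:12345 # if your proxy runs on the internal network, you can reach it directly by IP and port
      #all_proxy:   127.0.0.1:12345 # the full form 'http://username:password@proxy.address.com' works too

You can verify that the proxy works with curl; being able to reach Google, for example, usually means the proxy is working. In addition, if you use a proxy to get around the block, make sure you do not also use mirror sites in mainland China, because the two together may well make things unreachable again.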
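
The same reachability check can be scripted without curl. A minimal sketch using only the Python standard library; the function names, the test URL, and the timeout are illustrative assumptions:

```python
import urllib.request

def make_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Build an opener that sends HTTP and HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)

def proxy_works(proxy_url: str, test_url: str = "https://www.google.com",
                timeout: float = 5.0) -> bool:
    """Return True if test_url can be fetched through the proxy."""
    try:
        with make_proxy_opener(proxy_url).open(test_url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:  # URLError, HTTPError, and timeouts all derive from OSError
        return False
```

Calling proxy_works("http://127.0.0.1:12345") would perform the actual network test against the example proxy address used above.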

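Besides using a proxy server, you can also work around the block by configuring Docker registry mirrors. For professional edition users, we provide an out-of-the-box container pulling service that spares you this hassle.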

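Registry mirrors

You can use the docker_registry_mirrors parameter to set Docker's Registry Mirrors option and use mirror sites that are not blocked.

Users outside the firewall can, besides the official default DockerHub, also consider the quay.io registry. If your internal network already has mature image infrastructure, you can use an internal Docker mirror to avoid depending on external mirror sites and to speed up downloads.

Users of public cloud services can consider the free internal-network Docker mirrors. For example, on Alibaba Cloud you can use its internal Docker mirror site (login required):

["https://registry.cn-hangzhou.aliyuncs.com"]   # Alibaba Cloud mirror site, explicit login required

On Tencent Cloud, you can use its internal Docker mirror site (internal network required):

["https://ccr.ccs.tencentyun.com"]   # Tencent Cloud mirror site, internal network only

In addition, you can use CF-Workers-docker.io to quickly set up your own Docker image proxy, or consider free Docker proxy mirrors (at your own risk!).

Other resources for reference: the list of DockerHub acceleration mirrors in China. For professional edition users, we provide an out-of-the-box Docker proxy solution that spares you this hassle.

Pulling images

The docker_image and docker_image_cache parameters specify the images to pull or load when Docker is installed.

With this feature, Docker comes with the specified images right after installation (provided they can be pulled successfully; if this task fails it is ignored and skipped).

For example, you can list the images to pull in the inventory: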

infra:
  hosts:
    10.10.10.10: { infra_seq: 1 }
  vars:
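    docker_enabled: true  # install Docker on this group!
    docker_image:
      - redis:latest      # pull the latest Redis image

Another way to preload images is to use locally saved tgz archives: if you have exported Docker images beforehand with docker save xxx | gzip -c > /tmp/docker/xxx.tgz, these exported image files are loaded automatically through the glob given by the docker_image_cache parameter. The default location is /tmp/docker/*.tgz.

This means you can place images in the /tmp/docker directory in advance, and running docker.yml will load these image archives automatically after installing Docker.

For example, the Supabase self-hosting tutorial uses this technique: before bringing up Supabase and installing Docker, it copies the *.tgz image archives from the local /tmp/supabase directory to /tmp/docker on the target node.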

- name: copy local docker images
  copy: src="{{ item }}" dest="/tmp/docker/"
  with_fileglob: "{{ supa_images }}"
  vars: # you can override this with -e cli args
    supa_images: /tmp/supabase/*.tgz

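Applications

Pigsty provides a series of out-of-the-box software templates based on Docker Compose. You can use them to bring up business software with one command, backed by external database clusters managed by Pigsty.

14.2 - Parameters

The Docker module provides 8 configuration parameters.

Parameter list

The Docker module has 8 parameters, as listed in the table below:

Parameter	Type	Level	Comment
`docker_enabled`	`bool`	G/C/I	enable Docker on the current node? disabled by default
`docker_data`	`path`	G/C/I	Docker main data directory, `/var/lib/docker` by default
`docker_storage_driver`	`enum`	G/C/I	Docker storage driver, overlay2 by default
`docker_cgroups_driver`	`enum`	G/C/I	Docker cgroup filesystem driver: cgroupfs,systemd
`docker_registry_mirrors`	`string[]`	G/C/I	list of Docker registry mirrors
`docker_exporter_port`	`port`	G	Docker metrics exporter port, `9323` by default
`docker_image`	`string[]`	G/C/I	list of Docker images to pull, empty list by default
`docker_image_cache`	`path[]`	G/C/I	path of Docker image archives to import, `/tmp/docker/*.tgz` by default

Default parameters

The default parameters of the Docker module are defined in roles/docker/defaults/main.yml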

docker_enabled: false             # enable docker on this node?
docker_data: /var/lib/docker      # docker data directory, /var/lib/docker by default
docker_storage_driver: overlay2   # docker storage driver, can be zfs, btrfs
docker_cgroups_driver: systemd    # docker cgroup fs driver: cgroupfs,systemd
docker_registry_mirrors: []       # docker registry mirror list
docker_exporter_port: 9323        # docker metrics exporter port, 9323 by default
docker_image: []                  # docker image to be pulled after bootstrap
docker_image_cache: /tmp/docker/*.tgz # docker image cache glob pattern

`docker_enabled`

Name: docker_enabled, type: bool, level: C

Enable Docker on the current node? Defaults to false, i.e. disabled.

`docker_data`

Name: docker_data, type: path, level: C

Docker main data directory, /var/lib/docker by default.

`docker_storage_driver`

Name: docker_storage_driver, type: enum, level: C

Docker storage driver, overlay2 by default.

For the available values, see: https://docs.docker.com/engine/storage/drivers/select-storage-driver/

overlay2
fuse-overlayfs
btrfs
zfs
vfs

`docker_cgroups_driver`

参数名称： docker_cgroups_driver，类型： enum，层次：C

Docker使用的 CGroup FS 驱动，可以是 cgroupfs 或 systemd，默认值为： systemd

`docker_registry_mirrors`

参数名称： docker_registry_mirrors，类型： string[]，层次：C

Docker使用的镜像仓库地址，默认值为：[] 空数组。

您可以使用Docker镜像站点加速镜像拉取，下面是一些例子：

使用国内厂商的 Docker 镜像站点：

docker_registry_mirrors:
 - https://docker.m.daocloud.io
 - https://dockerproxy.com
 - https://docker.mirrors.ustc.edu.cn
 - https://docker.nju.edu.cn

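Using the internal-network mirrors of cloud vendors:

["https://docker.m.daocloud.io"]                # DaoCloud mirror site (mainland China)
["https://docker.1ms.run"]                      # 1ms mirror site (mainland China)
["https://mirror.ccs.tencentyun.com"]           # Tencent Cloud internal mirror site
["https://registry.cn-hangzhou.aliyuncs.com"]   # Alibaba Cloud mirror site, login required

You can also use a Cloudflare Worker to self-host a free Docker relay proxy.

If pulls are still too slow, consider logging in to another registry, for example with docker login quay.io.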

`docker_exporter_port`

Name: docker_exporter_port, type: port, level: G

The port on which Docker exposes its metrics, 9323 by default.

`docker_image_cache`

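Name: docker_image_cache, type: path, level: C

Path of the local offline Docker image cache tarballs, /tmp/docker/*.tgz by default.

Every file matching this glob pattern is treated as a Docker image archive compressed with tar + gzip, and is loaded through docker load: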

cat *.tgz | gzip -d -c - | docker load
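The glob expansion can also be made explicit, one archive per command. The sketch below is not what the playbook runs; it only builds the command strings, relying on the fact that `docker load -i` accepts gzip-compressed tarballs directly. The helper name `docker_load_commands` is hypothetical.

```python
import glob
import shlex

def docker_load_commands(pattern: str = "/tmp/docker/*.tgz") -> list:
    """Return one `docker load` command per file matching the glob pattern."""
    commands = []
    for path in sorted(glob.glob(pattern)):   # sorted for a deterministic order
        # `docker load -i` reads gzip-compressed tarballs directly
        commands.append("docker load -i " + shlex.quote(path))
    return commands

if __name__ == "__main__":
    for cmd in docker_load_commands():
        print(cmd)   # print only; run the commands yourself after review
```

Loading files one at a time makes it easier to see which archive failed if one of them is corrupt.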

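14.3 - Playbooks

How to manage Docker with the built-in ansible playbook, plus a quick reference of common commands.

The Docker module provides one default playbook, docker.yml, which installs the Docker daemon and Docker Compose.

`docker.yml`

Playbook source file: docker.yml

Running this playbook installs docker-ce and docker-compose-plugin on target nodes marked with docker_enabled: true, and enables the dockerd service.

The following subtasks are available in the docker.yml playbook:

docker_install : install the Docker and Docker Compose packages on the node
docker_admin : add the specified users to the Docker admin group
docker_config : generate the Docker daemon configuration file
docker_launch : start the Docker daemon service
docker_register : register the Docker daemon as a Prometheus monitoring target
docker_image : try to load pre-cached image tarballs from /tmp/docker/*.tgz, if any exist

The Docker module does not provide a dedicated uninstall playbook. To remove Docker, stop the service manually and then uninstall the packages:

systemctl stop docker                        # stop the Docker daemon service
yum remove docker-ce docker-compose-plugin   # uninstall Docker on EL systems
apt remove docker-ce docker-compose-plugin   # uninstall Docker on Debian systems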

14.4 - Metrics

The complete list of monitoring metrics provided by the Pigsty Docker module, with descriptions

The DOCKER module has 123 available metrics.

Metric Name	Type	Labels	Description
builder_builds_failed_total	counter	`ip`, `cls`, `reason`, `ins`, `job`, `instance`	Number of failed image builds
builder_builds_triggered_total	counter	`ip`, `cls`, `ins`, `job`, `instance`	Number of triggered image builds
docker_up	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
engine_daemon_container_actions_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`, `action`	N/A
engine_daemon_container_actions_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `action`	N/A
engine_daemon_container_actions_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `action`	N/A
engine_daemon_container_states_containers	gauge	`ip`, `cls`, `ins`, `job`, `instance`, `state`	The count of containers in various states
engine_daemon_engine_cpus_cpus	gauge	`ip`, `cls`, `ins`, `job`, `instance`	The number of cpus that the host system of the engine has
engine_daemon_engine_info	gauge	`ip`, `cls`, `architecture`, `ins`, `job`, `instance`, `os_version`, `kernel`, `version`, `graphdriver`, `os`, `daemon_id`, `commit`, `os_type`	The information related to the engine and the OS it is running on
engine_daemon_engine_memory_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	The number of bytes of memory that the host system of the engine has
engine_daemon_events_subscribers_total	gauge	`ip`, `cls`, `ins`, `job`, `instance`	The number of current subscribers to events
engine_daemon_events_total	counter	`ip`, `cls`, `ins`, `job`, `instance`	The number of events logged
engine_daemon_health_checks_failed_total	counter	`ip`, `cls`, `ins`, `job`, `instance`	The total number of failed health checks
engine_daemon_health_check_start_duration_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
engine_daemon_health_check_start_duration_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
engine_daemon_health_check_start_duration_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
engine_daemon_health_checks_total	counter	`ip`, `cls`, `ins`, `job`, `instance`	The total number of health checks
engine_daemon_host_info_functions_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`, `function`	N/A
engine_daemon_host_info_functions_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `function`	N/A
engine_daemon_host_info_functions_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `function`	N/A
engine_daemon_image_actions_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`, `action`	N/A
engine_daemon_image_actions_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `action`	N/A
engine_daemon_image_actions_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `action`	N/A
engine_daemon_network_actions_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`, `action`	N/A
engine_daemon_network_actions_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `action`	N/A
engine_daemon_network_actions_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `action`	N/A
etcd_debugging_snap_save_marshalling_duration_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
etcd_debugging_snap_save_marshalling_duration_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
etcd_debugging_snap_save_marshalling_duration_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
etcd_debugging_snap_save_total_duration_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
etcd_debugging_snap_save_total_duration_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
etcd_debugging_snap_save_total_duration_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
etcd_disk_wal_fsync_duration_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
etcd_disk_wal_fsync_duration_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
etcd_disk_wal_fsync_duration_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
etcd_disk_wal_write_bytes_total	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Total number of bytes written in WAL.
etcd_snap_db_fsync_duration_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
etcd_snap_db_fsync_duration_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
etcd_snap_db_fsync_duration_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
etcd_snap_db_save_total_duration_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
etcd_snap_db_save_total_duration_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
etcd_snap_db_save_total_duration_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
etcd_snap_fsync_duration_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
etcd_snap_fsync_duration_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
etcd_snap_fsync_duration_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
go_gc_duration_seconds	summary	`ip`, `cls`, `ins`, `job`, `instance`, `quantile`	A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
go_gc_duration_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
go_goroutines	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of goroutines that currently exist.
go_info	gauge	`ip`, `cls`, `ins`, `job`, `version`, `instance`	Information about the Go environment.
go_memstats_alloc_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total	counter	`ip`, `cls`, `ins`, `job`, `instance`	Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total	counter	`ip`, `cls`, `ins`, `job`, `instance`	Total number of frees.
go_memstats_gc_sys_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of heap bytes that are in use.
go_memstats_heap_objects	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of allocated objects.
go_memstats_heap_released_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of heap bytes released to OS.
go_memstats_heap_sys_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total	counter	`ip`, `cls`, `ins`, `job`, `instance`	Total number of pointer lookups.
go_memstats_mallocs_total	counter	`ip`, `cls`, `ins`, `job`, `instance`	Total number of mallocs.
go_memstats_mcache_inuse_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of bytes obtained from system.
go_threads	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of OS threads created.
logger_log_entries_size_greater_than_buffer_total	counter	`ip`, `cls`, `ins`, `job`, `instance`	Number of log entries which are larger than the log buffer
logger_log_read_operations_failed_total	counter	`ip`, `cls`, `ins`, `job`, `instance`	Number of log reads from container stdio that failed
logger_log_write_operations_failed_total	counter	`ip`, `cls`, `ins`, `job`, `instance`	Number of log write operations that failed
process_cpu_seconds_total	counter	`ip`, `cls`, `ins`, `job`, `instance`	Total user and system CPU time spent in seconds.
process_max_fds	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Maximum number of open file descriptors.
process_open_fds	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Number of open file descriptors.
process_resident_memory_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Resident memory size in bytes.
process_start_time_seconds	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Virtual memory size in bytes.
process_virtual_memory_max_bytes	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Maximum amount of virtual memory available in bytes.
promhttp_metric_handler_requests_in_flight	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Current number of scrapes being served.
promhttp_metric_handler_requests_total	counter	`ip`, `cls`, `ins`, `job`, `instance`, `code`	Total number of scrapes by HTTP status code.
scrape_duration_seconds	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
scrape_samples_post_metric_relabeling	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
scrape_samples_scraped	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
scrape_series_added	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_dispatcher_scheduling_delay_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
swarm_dispatcher_scheduling_delay_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_dispatcher_scheduling_delay_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_manager_configs_total	gauge	`ip`, `cls`, `ins`, `job`, `instance`	The number of configs in the cluster object store
swarm_manager_leader	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Indicates if this manager node is a leader
swarm_manager_networks_total	gauge	`ip`, `cls`, `ins`, `job`, `instance`	The number of networks in the cluster object store
swarm_manager_nodes	gauge	`ip`, `cls`, `ins`, `job`, `instance`, `state`	The number of nodes
swarm_manager_secrets_total	gauge	`ip`, `cls`, `ins`, `job`, `instance`	The number of secrets in the cluster object store
swarm_manager_services_total	gauge	`ip`, `cls`, `ins`, `job`, `instance`	The number of services in the cluster object store
swarm_manager_tasks_total	gauge	`ip`, `cls`, `ins`, `job`, `instance`, `state`	The number of tasks in the cluster object store
swarm_node_manager	gauge	`ip`, `cls`, `ins`, `job`, `instance`	Whether this node is a manager or not
swarm_raft_snapshot_latency_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
swarm_raft_snapshot_latency_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_raft_snapshot_latency_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_raft_transaction_latency_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
swarm_raft_transaction_latency_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_raft_transaction_latency_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_store_batch_latency_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
swarm_store_batch_latency_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_store_batch_latency_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_store_lookup_latency_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
swarm_store_lookup_latency_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_store_lookup_latency_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_store_memory_store_lock_duration_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
swarm_store_memory_store_lock_duration_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_store_memory_store_lock_duration_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_store_read_tx_latency_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
swarm_store_read_tx_latency_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_store_read_tx_latency_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_store_write_tx_latency_seconds_bucket	Unknown	`ip`, `cls`, `ins`, `job`, `instance`, `le`	N/A
swarm_store_write_tx_latency_seconds_count	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
swarm_store_write_tx_latency_seconds_sum	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
up	Unknown	`ip`, `cls`, `ins`, `job`, `instance`	N/A
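These metrics are served in the Prometheus text exposition format on the exporter port. The sketch below parses such text into tuples so that a single metric can be inspected by hand. The sample values in the usage example are made up, and the raw endpoint presumably carries only metric-specific labels such as `state`; labels like `ip`, `cls`, and `ins` would be attached at scrape time.

```python
import re

# one sample line: metric name, optional {labels}, then the value
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

def parse_metrics(text: str) -> list:
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples

if __name__ == "__main__":
    sample = (
        "# TYPE engine_daemon_container_states_containers gauge\n"
        'engine_daemon_container_states_containers{state="running"} 3\n'
        "engine_daemon_engine_cpus_cpus 8\n"
    )
    for row in parse_metrics(sample):
        print(row)
```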

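14.5 - FAQ

Frequently asked questions about the Pigsty Docker module

Who can run Docker commands?

By default, Pigsty adds two users to the docker operating system group: the admin user that runs the playbook on the remote node (the user you ssh in as on the target node), and the admin user specified by the node_admin_username parameter. Every user in this group (docker) can manage Docker with the docker CLI.

To let another user run Docker commands, add that operating system user to the docker group: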

usermod -aG docker <username>

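Using a proxy server

During Docker installation, if the proxy_env parameter is set, the HTTP proxy configuration it contains is written to the /etc/docker/daemon.json configuration file.

Docker uses this proxy server when pulling images from upstream registries.

Tip: passing the -x flag to configure writes the proxy settings from the current environment into proxy_env.

Using mirror sites

If you are in mainland China and affected by the Great Firewall, consider a Docker mirror site or alternative registry that is reachable from inside the firewall, such as quay.io:

docker login quay.io    # enter username and password to log in

Update 2024-06: all previously usable Docker mirror sites in mainland China have been blocked. Use a proxy server to reach the registry and pull images.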

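Monitoring Docker

Docker is registered as a monitoring target as part of the Docker module installation.

To register a node separately, run the monitoring registration subtask docker_register or register_prometheus against it: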

./docker.yml -l <your-node-selector> -t register_prometheus

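Using software templates

Pigsty provides a series of software templates that are launched with Docker Compose and work out of the box.

They require the Docker module to be installed first.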

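15 - Modules: Pro / Pilot

Modules available only in the professional edition

15.1 - Module: MySQL

Use Pigsty to bring up a cluster of MySQL, a database past its prime, for testing, migration, performance evaluation, and similar purposes.

MySQL used to be "the world's most popular open-source relational database". Installation | Configuration | Administration | Playbook | Monitoring | Parameters

Overview

The MySQL module is currently available only as a Beta preview in the Pigsty professional edition. Do not use this MySQL deployment in production.

Installation

On nodes managed by Pigsty, you can install MySQL 8.0 directly from the official software repository with one of the following commands, depending on the OS family: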

# el 7,8,9
./node.yml -t node_install -e '{"node_repo_modules":"node,mysql","node_packages":["mysql-community-server,mysql-community-client"]}'

# debian / ubuntu
./node.yml -t node_install -e '{"node_repo_modules":"node,mysql","node_packages":["mysql-server"]}'

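Alternatively, add the MySQL packages to the local software repository and then deploy with the MySQL playbook mysql.yml.

Configuration

The following configuration snippet defines a single-node MySQL instance, together with its databases and users.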

my-test:
  hosts: { 10.10.10.10: { mysql_seq: 1, mysql_role: primary } }
  vars:
    mysql_cluster: my-test
    mysql_databases:
      - { name: meta }
    mysql_users:
      - { name: dbuser_meta    ,host: '%' ,password: 'dbuser_meta'    ,priv: { "*.*": "SELECT, UPDATE, DELETE, INSERT" } }
      - { name: dbuser_dba     ,host: '%' ,password: 'DBUser.DBA'     ,priv: { "*.*": "ALL PRIVILEGES" } }
      - { name: dbuser_monitor ,host: '%' ,password: 'DBUser.Monitor' ,priv: { "*.*": "SELECT, PROCESS, REPLICATION CLIENT" } ,connlimit: 3 }
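To show what a `mysql_users` entry amounts to, the sketch below renders one entry into plain MySQL 8 statements. This is an illustration, not the playbook's implementation: the helper name `user_statements` is hypothetical, mapping `connlimit` to `MAX_USER_CONNECTIONS` is an assumption, and values are interpolated without escaping, so it is only suitable for trusted inventory data.

```python
def user_statements(user: dict) -> list:
    """Render CREATE USER / GRANT statements for one mysql_users entry (illustrative)."""
    account = "'%s'@'%s'" % (user["name"], user.get("host", "%"))
    create = "CREATE USER IF NOT EXISTS %s IDENTIFIED BY '%s'" % (account, user["password"])
    if "connlimit" in user:
        # assumption: connlimit corresponds to MySQL's per-account connection limit
        create += " WITH MAX_USER_CONNECTIONS %d" % user["connlimit"]
    statements = [create + ";"]
    for scope, privileges in user.get("priv", {}).items():
        statements.append("GRANT %s ON %s TO %s;" % (privileges, scope, account))
    return statements

if __name__ == "__main__":
    monitor = {
        "name": "dbuser_monitor", "host": "%", "password": "DBUser.Monitor",
        "priv": {"*.*": "SELECT, PROCESS, REPLICATION CLIENT"}, "connlimit": 3,
    }
    print("\n".join(user_statements(monitor)))
```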

Administration

Basic MySQL cluster administration operations:

Create a MySQL cluster with mysql.yml:

./mysql.yml -l my-test

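Playbook

Pigsty provides one playbook for the MYSQL module, used to deploy MySQL clusters:

mysql.yml: deploy MySQL according to the inventory

`mysql.yml`

The mysql.yml playbook, which deploys a MySQL cluster, contains the following subtasks: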

mysql-id       : generate mysql instance identity
mysql_clean    : remove existing mysql instance (DANGEROUS)
mysql_dbsu     : create os user mysql
mysql_install  : install mysql rpm/deb packages
mysql_dir      : create mysql data & conf dir
mysql_config   : generate mysql config file
mysql_boot     : bootstrap mysql cluster
mysql_launch   : launch mysql service
mysql_pass     : write mysql password
mysql_db       : create mysql biz database
mysql_user     : create mysql biz user
mysql_exporter : launch mysql exporter
mysql_register : register mysql service to prometheus

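Monitoring

Pigsty provides two monitoring dashboards for the MYSQL module:

MYSQL Overview shows the overall metrics of a MySQL cluster.

MYSQL Instance shows the detailed metrics of a single MySQL instance.

Parameters

Available MySQL configuration options: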

#-----------------------------------------------------------------
# MYSQL_IDENTITY
#-----------------------------------------------------------------
# mysql_cluster:           #CLUSTER  # mysql cluster name, required identity parameter
# mysql_role: replica      #INSTANCE # mysql role, required, could be primary,replica
# mysql_seq: 0             #INSTANCE # mysql instance seq number, required identity parameter

#-----------------------------------------------------------------
# MYSQL_BUSINESS
#-----------------------------------------------------------------
# mysql business object definition, overwrite in group vars
mysql_users: []                      # mysql business users
mysql_databases: []                  # mysql business databases
mysql_services: []                   # mysql business services

# global credentials, overwrite in global vars
mysql_root_username: root
mysql_root_password: DBUser.Root
mysql_replication_username: replicator
mysql_replication_password: DBUser.Replicator
mysql_admin_username: dbuser_dba
mysql_admin_password: DBUser.DBA
mysql_monitor_username: dbuser_monitor
mysql_monitor_password: DBUser.Monitor

#-----------------------------------------------------------------
# MYSQL_INSTALL
#-----------------------------------------------------------------
# - install - #
mysql_dbsu: mysql                    # os dbsu name, mysql by default, better not change it
mysql_dbsu_uid: 27                   # os dbsu uid and gid, 27 for default mysql users and groups
mysql_dbsu_home: /var/lib/mysql      # mysql home directory, `/var/lib/mysql` by default
mysql_dbsu_ssh_exchange: true        # exchange mysql dbsu ssh key among same mysql cluster
mysql_packages:                      # mysql packages to be installed, `mysql-community*` by default
  - mysql-community*
  - mysqld_exporter

# - bootstrap - #
mysql_data: /data/mysql              # mysql data directory, `/data/mysql` by default
mysql_listen: '0.0.0.0'              # mysql listen addresses, comma separated IP list
mysql_port: 3306                     # mysql listen port, 3306 by default
mysql_sock: /var/lib/mysql/mysql.sock # mysql socket dir, `/var/lib/mysql/mysql.sock` by default
mysql_pid: /var/run/mysqld/mysqld.pid # mysql pid file, `/var/run/mysqld/mysqld.pid` by default
mysql_conf: /etc/my.cnf              # mysql config file, `/etc/my.cnf` by default
mysql_log_dir: /var/log              # mysql log dir, `/var/log` by default

mysql_exporter_port: 9104            # mysqld_exporter listen port, 9104 by default

mysql_parameters: {}                 # extra parameters for mysqld
mysql_default_parameters:            # default parameters for mysqld

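15.2 - Module: Kafka

Use Pigsty to bring up a Kafka KRaft cluster, an open-source distributed stream processing platform.

Kafka is an open-source distributed stream processing platform: Installation | Configuration | Administration | Playbook | Monitoring | Parameters | Resources

Overview

The Kafka module is currently available only as a Beta preview in the Pigsty professional edition.

Installation

If you use the open-source edition of Pigsty, you can install Kafka and its Java dependency on the target nodes with the commands below.

Pigsty provides Kafka 3.8.0 RPM and DEB packages in its official Infra repository, which can be downloaded and installed directly.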

./node.yml -t node_install  -e '{"node_repo_modules":"infra","node_packages":["kafka"]}'

Kafka depends on a Java runtime, so a usable JDK must be installed along with it (OpenJDK 17 is the default, but other JDKs and versions such as 8 or 11 also work):

# EL7 (no JDK 17 support)
./node.yml -t node_install  -e '{"node_repo_modules":"node","node_packages":["java-11-openjdk-headless"]}'

# EL8 / EL9 (OpenJDK 17)
./node.yml -t node_install  -e '{"node_repo_modules":"node","node_packages":["java-17-openjdk-headless"]}'

# Debian / Ubuntu (OpenJDK 17)
./node.yml -t node_install  -e '{"node_repo_modules":"node","node_packages":["openjdk-17-jdk"]}'
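The install commands in this section all follow the same pattern: a repo module list and a package list passed to `node.yml` as JSON extra vars. A small helper can compose that command line and take care of the shell quoting. This is a convenience sketch; the function name `node_install_command` is hypothetical and not part of Pigsty.

```python
import json
import shlex

def node_install_command(repo_modules: list, packages: list) -> str:
    """Compose the `./node.yml -t node_install -e '<json>'` command used in this section."""
    extra_vars = {
        "node_repo_modules": ",".join(repo_modules),
        "node_packages": list(packages),
    }
    # compact separators keep the JSON payload as one short shell token
    payload = json.dumps(extra_vars, separators=(",", ":"))
    return "./node.yml -t node_install -e " + shlex.quote(payload)

if __name__ == "__main__":
    print(node_install_command(["infra"], ["kafka"]))
    print(node_install_command(["node"], ["java-17-openjdk-headless"]))
```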

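Configuration

A single-node Kafka configuration example. Note that in a single-node Pigsty deployment, port 9093 on the admin node is already occupied by AlertManager by default.

When installing Kafka on the admin node, use a different peer port, for example 9095.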

kf-main:
  hosts:
    10.10.10.10: { kafka_seq: 1, kafka_role: controller }
  vars:
    kafka_cluster: kf-main
    kafka_data: /data/kafka
    kafka_peer_port: 9095     # 9093 is already held by alertmanager

A three-node Kafka cluster in KRaft mode:

kf-test:
  hosts:
    10.10.10.11: { kafka_seq: 1, kafka_role: controller   }
    10.10.10.12: { kafka_seq: 2, kafka_role: controller   }
    10.10.10.13: { kafka_seq: 3, kafka_role: controller   }
  vars:
    kafka_cluster: kf-test
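In KRaft mode every controller has to know the full voter set, which Kafka expresses in the `controller.quorum.voters` property as `id@host:port` entries. The sketch below derives that value from an inventory-style host map. It assumes that `kafka_seq` is used as the Kafka node id and that all controllers share one peer port; the playbook's actual template may differ.

```python
def quorum_voters(hosts: dict, peer_port: int = 9093) -> str:
    """Build a KRaft `controller.quorum.voters` value from an inventory-style host map."""
    voters = []
    for ip, attrs in hosts.items():
        if "controller" not in attrs.get("kafka_role", "controller"):
            continue  # plain brokers do not vote
        # assumption: kafka_seq is used as the Kafka node id
        voters.append((attrs["kafka_seq"], ip))
    return ",".join("%d@%s:%d" % (seq, ip, peer_port) for seq, ip in sorted(voters))

if __name__ == "__main__":
    kf_test = {
        "10.10.10.11": {"kafka_seq": 1, "kafka_role": "controller"},
        "10.10.10.12": {"kafka_seq": 2, "kafka_role": "controller"},
        "10.10.10.13": {"kafka_seq": 3, "kafka_role": "controller"},
    }
    print(quorum_voters(kf_test))
```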

Administration

Basic Kafka cluster administration operations:

Create a Kafka cluster with kafka.yml:

./kafka.yml -l kf-main
./kafka.yml -l kf-test

Create a topic named test:

kafka-topics.sh --create --topic test --partitions 1 --replication-factor 1 --bootstrap-server localhost:9092

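Here --replication-factor 1 means each partition is kept as a single copy with no additional replicas, and --partitions 1 means only one partition is created.

List the topics in Kafka with the following command: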

kafka-topics.sh --bootstrap-server localhost:9092 --list

Send messages to the test topic with the console producer bundled with Kafka:

kafka-console-producer.sh --topic test --bootstrap-server localhost:9092
>haha
>xixi
>hoho
>hello
>world
> ^D

Read messages from the test topic with the console consumer bundled with Kafka:

kafka-console-consumer.sh --topic test --from-beginning --bootstrap-server localhost:9092

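Playbook

Pigsty provides the kafka.yml playbook for deploying Kafka clusters. The underlying nodes are managed with the two NODE playbooks, which add and remove managed nodes:

node.yml: manage a node and bring it to the desired state
node-rm.yml: remove a managed node from pigsty

Pigsty also provides two wrapper commands, node-add and node-rm, for invoking these playbooks quickly.

`kafka.yml`

The kafka.yml playbook, which deploys a Kafka cluster in KRaft mode, contains the following subtasks: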

kafka-id       : generate kafka instance identity
kafka_clean    : remove existing kafka instance (DANGEROUS)
kafka_user     : create os user kafka
kafka_pkg      : install kafka rpm/deb packages
kafka_link     : create symlink to /usr/kafka
kafka_path     : add kafka bin path to /etc/profile.d
kafka_svc      : install kafka systemd service
kafka_dir      : create kafka data & conf dir
kafka_config   : generate kafka config file
kafka_boot     : bootstrap kafka cluster
kafka_launch   : launch kafka service
kafka_exporter : launch kafka exporter
kafka_register : register kafka service to prometheus

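Monitoring

Pigsty provides two monitoring dashboards for the KAFKA module:

KAFKA Overview shows the overall metrics of a Kafka cluster.

KAFKA Instance shows the detailed metrics of a single Kafka instance.

Parameters

Available Kafka configuration options: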

#kafka_cluster:           #CLUSTER  # kafka cluster name, required identity parameter
#kafka_role: controller   #INSTANCE # kafka role, controller, broker, or controller-only
#kafka_seq: 0             #INSTANCE # kafka instance seq number, required identity parameter
kafka_clean: false                  # cleanup kafka during init? false by default
kafka_data: /data/kafka             # kafka data directory, `/data/kafka` by default
kafka_version: 3.8.0                # kafka version string
scala_version: 2.13                 # kafka binary scala version
kafka_port: 9092                    # kafka broker listen port
kafka_peer_port: 9093               # kafka broker peer listen port, 9093 by default (conflict with alertmanager)
kafka_exporter_port: 9308           # kafka exporter listen port, 9308 by default
kafka_parameters:                   # kafka parameters to be added to server.properties
  num.network.threads: 3
  num.io.threads: 8
  socket.send.buffer.bytes: 102400
  socket.receive.buffer.bytes: 102400
  socket.request.max.bytes: 104857600
  num.partitions: 1
  num.recovery.threads.per.data.dir: 1
  offsets.topic.replication.factor: 1
  transaction.state.log.replication.factor: 1
  transaction.state.log.min.isr: 1
  log.retention.hours: 168
  log.segment.bytes: 1073741824
  log.retention.check.interval.ms: 300000
  #log.retention.bytes: 1073741824
  #log.flush.interval.ms: 1000
  #log.flush.interval.messages: 10000

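Resources

Pigsty provides several Kafka-related extensions for PostgreSQL:

kafka_fdw: an interesting FDW that lets users read and write Kafka topic data directly from PostgreSQL
wal2json: logically decodes the PostgreSQL WAL and emits change data in JSON format
wal2mongo: logically decodes the PostgreSQL WAL and emits change data in BSON format
decoder_raw: logically decodes the PostgreSQL WAL and emits change data in SQL format
test_decoding: logically decodes the PostgreSQL WAL and emits change data in RAW format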

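15.3 - Module: DuckDB

Use Pigsty to install DuckDB, a high-performance embedded analytical database.

DuckDB is a high-performance embedded analytical database: Installation | Resources

Overview

DuckDB is an embedded database, so it needs no deployment or service setup. Installing the DuckDB package on a node is enough to use it.

Installation

Pigsty already provides DuckDB packages (RPM / DEB) in the Infra software repository. Install them with the following command: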

./node.yml -t node_install  -e '{"node_repo_modules":"infra","node_packages":["duckdb"]}'

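15.4 - Module: TigerBeetle

Use Pigsty to deploy TigerBeetle, a database purpose-built for financial accounting transactions.

TigerBeetle is a database purpose-built for financial accounting transactions, offering extreme performance and reliability.

Overview

The TigerBeetle module is currently available only as a Beta preview in the Pigsty professional edition.

Installation

The Pigsty Infra repository provides TigerBeetle RPM / DEB packages. Install them with the following command: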

./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["tigerbeetle"]}'

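After installation, follow the official documentation for configuration: https://github.com/tigerbeetle/tigerbeetle

TigerBeetle requires Linux kernel 5.5 or later!

Note that TigerBeetle only supports Linux kernel 5.5 or later, so it does not work on EL7 (3.10) or EL8 (4.18) systems by default.

Use EL9 (5.14), Ubuntu 22.04 (5.15), Debian 12 (6.1), Debian 11 (5.10), or another supported system to install TigerBeetle.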
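The kernel requirement can be checked before installing. The sketch below compares the major and minor numbers of a kernel release string against 5.5. The helper name `kernel_at_least` is hypothetical, and only the leading `major.minor` part of the string is considered.

```python
import platform
import re

def kernel_at_least(release: str, minimum=(5, 5)) -> bool:
    """Return True if a kernel release string such as '5.14.0-362.el9' meets the minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)

if __name__ == "__main__":
    release = platform.release()   # on Linux, the same string as `uname -r`
    print(release, "ok" if kernel_at_least(release) else "too old for TigerBeetle")
```

Comparing integer tuples avoids the classic string-comparison pitfall where "5.10" would sort before "5.5".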

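15.5 - Module: Kubernetes

Use Pigsty to install Kubernetes, a production-grade private cloud platform for scheduling and orchestrating stateless containers.

Kubernetes is a production-grade private cloud platform for scheduling and orchestrating stateless containers.

Pigsty has native support for ETCD clusters, which Kubernetes can use, so the professional edition also provides a KUBE module for deploying production-grade Kubernetes clusters.

The Kubernetes module is currently available only as a Beta preview in the Pigsty Pro edition and is not available in the open-source edition.

However, you can still point Pigsty at the node repository to install the Kubernetes packages, and use Pigsty to tune the environment and provision nodes for a K8S deployment, which solves the last-mile delivery problem.

SealOS

SealOS is a Kubernetes distribution that can package an entire Kubernetes cluster into an image for use elsewhere.

Pigsty provides SealOS 5.0 RPM and DEB packages in the Infra repository. They can be downloaded and installed directly, and SealOS can then be used to manage clusters.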

./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["sealos"]}'

Kubernetes

If you prefer to deploy Kubernetes with the classic Kubeadm, see the KUBE module reference below.

./node.yml -t node_install -e '{"node_repo_modules":"kube","node_packages":["kubeadm,kubelet,kubectl"]}'

Kubernetes supports several container runtimes. To use the Containerd runtime, make sure the Containerd package is installed on the node.

./node.yml -t node_install -e '{"node_repo_modules":"node,docker","node_packages":["containerd.io"]}'

To use Docker as the container runtime, install Docker and bridge it with the cri-dockerd project (not yet available on EL9 / D11 / U20):

./node.yml -t node_install -e '{"node_repo_modules":"node,infra,docker","node_packages":["docker-ce,docker-compose-plugin,cri-dockerd"]}'

Playbook

kube.yml playbook

Monitoring

TBD

Parameters

The Kubernetes module supports the following configuration parameters:

#kube_cluster:                                          #IDENTITY# # define kubernetes cluster name 
kube_role: node                                                    # default kubernetes role (master|node)
kube_version: 1.31.0                                               # kubernetes version
kube_registry: registry.aliyuncs.com/google_containers             # kubernetes image registry, aliyun k8s mirror repository
kube_pod_cidr: "10.11.0.0/16"                                      # kubernetes pod network cidr
kube_service_cidr: "10.12.0.0/16"                                  # kubernetes service network cidr
kube_dashboard_admin_user: dashboard-admin-sa                      # kubernetes dashboard admin user name

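15.6 - Module: Consul

Use Pigsty to install and deploy Consul, an alternative to Etcd.

Consul is a distributed component that combines DCS, KV, DNS, and service registration/discovery.

In older Pigsty versions (1.x), Consul was the default DCS for high availability. That support has since been removed, but it will return later as a standalone module.

Configuration

To deploy Consul, add the IP addresses and hostnames of all nodes to the consul group.

At least one node must have consul_role set to server. For all other nodes, consul_role defaults to node.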

consul:
  hosts:
    10.10.10.10: { nodename: meta , consul_role: server }
    10.10.10.11: { nodename: node-1 }
    10.10.10.12: { nodename: node-2 }
    10.10.10.13: { nodename: node-3 }

We recommend an odd number of Consul servers for serious production deployments, ideally three.
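The preference for an odd server count follows from majority quorum arithmetic: Consul servers use Raft, which needs more than half of the servers to agree. A minimal sketch of the numbers:

```python
def quorum(servers: int) -> int:
    """Smallest majority of a server set."""
    return servers // 2 + 1

def fault_tolerance(servers: int) -> int:
    """How many servers can fail while a majority still remains."""
    return servers - quorum(servers)

if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(n, "servers: quorum", quorum(n), "tolerates", fault_tolerance(n), "failure(s)")
```

Four servers tolerate no more failures than three, so the extra even-numbered server adds cost without adding resilience.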

Parameters

#-----------------------------------------------------------------
# CONSUL
#-----------------------------------------------------------------
consul_role: node                 # consul role, node or server, node by default
consul_dc: pigsty                 # consul data center name, `pigsty` by default
consul_data: /data/consul         # consul data dir, `/data/consul`
consul_clean: true                # consul purge flag, if true, clean consul during init
consul_ui: false                  # enable consul ui, the default value for consul server is true

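15.7 - Module: Victoria

Use Pigsty to bring up VictoriaMetrics and VictoriaLogs, drop-in upgrades for Prometheus and Loki.

VictoriaMetrics is a drop-in replacement for Prometheus, offering better performance and a better compression ratio.

Overview

The Victoria module is currently available only as a Beta preview in the Pigsty professional edition. It covers deployment and management of the VictoriaMetrics and VictoriaLogs components.

Installation

The Pigsty Infra repository provides VictoriaMetrics RPM / DEB packages. Install them with the following commands: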

./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["victoria-metrics"]}'
./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["victoria-metrics-cluster"]}'
./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["victoria-metrics-utils"]}'
./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["victoria-logs"]}'

Most users only need the single-node VictoriaMetrics. For a cluster deployment, install the victoria-metrics-cluster package.

15.8 - Module: Jupyter

Use Pigsty to bring up Jupyter Notebook and build an out-of-the-box data analysis environment.

To run Jupyter Notebook with docker, you have to:

1. change the default password in .env: JUPYTER_TOKEN
2. create the data dir with proper permissions: make dir, owned by 1000:100
3. run make up to pull up jupyter with docker compose

cd ~/pigsty/app/jupyter ; make dir up

Visit http://lab.pigsty or http://10.10.10.10:8888, the default password is pigsty

http://lab.pigsty?token=pigsty

Prepare

Create a data directory /data/jupyter, with the default uid & gid 1000:100:

make dir   # mkdir -p /data/jupyter; chown -R 1000:100 /data/jupyter

Connect to Postgres

Use the jupyter terminal to install psycopg2-binary & psycopg2 package.

pip install psycopg2-binary psycopg2

# install with a mirror
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple psycopg2-binary psycopg2

pip install --upgrade pip
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

Or installation with conda:

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/

then use the driver in your notebook

import psycopg2

conn = psycopg2.connect('postgres://dbuser_dba:DBUser.DBA@10.10.10.10:5432/meta')
cursor = conn.cursor()
cursor.execute('SELECT * FROM pg_stat_activity')
for i in cursor.fetchall():
    print(i)

Alias

make up         # pull up jupyter with docker compose
make dir        # create required /data/jupyter and set owner
make run        # launch jupyter with docker
make view       # print jupyter access point
make log        # tail -f jupyter logs
make info       # introspect jupyter with jq
make stop       # stop jupyter container
make clean      # remove jupyter container
make pull       # pull latest jupyter image
make rmi        # remove jupyter image
make save       # save jupyter image to /tmp/docker/jupyter.tgz
make load       # load jupyter image from /tmp/docker/jupyter.tgz

16 - 任务教程

Pigsty 中常见的任务列表与一些主题案例教学，例如：证书签发，故障演练，性能调优，数据迁移，使用 CMDB 等。

16.1 - DNS：使用域名访问 Pigsty 中的 Web 服务

如何配置DNS解析记录，从而使用域名访问 Pigsty 的 Web 系统？

安装完 Pigsty 之后，用户可以通过 IP + Port 的方式访问大部分 Infra组件提供的 Web 界面。

假设你的节点使用的内网 IP 地址是 10.10.10.10，那么默认情况下：

http://10.10.10.10:3000 是 Grafana 监控面板（日常使用的主要入口）
http://10.10.10.10:9090 是 Prometheus 时序数据库控制台
http://10.10.10.10:9093 是 AlertManager 告警控制台
http://10.10.10.10 是 Nginx HTTP 服务的入口（默认 80 端口）

对于开发测试来说，IP + 端口无需配置即可使用，非常方便，然而对于更严肃的部署而言，我强烈建议您通过域名访问这些服务。

使用域名 有诸多优点，并不需要额外花钱或者引入额外的依赖，只需简单一行配置即可。

太长不看

向本机 /etc/hosts (Linux/MacOS) / C:\Windows\System32\drivers\etc\hosts (Windows) 文件中添加以下静态解析记录：

sudo tee -a /etc/hosts <<EOF
10.10.10.10 h.pigsty g.pigsty p.pigsty a.pigsty
EOF

将占位符 IP 地址 10.10.10.10 替换为 Pigsty 安装节点的 IP 地址（公网/内网不限，可达即可）。

如果你修改了 infra_portal 中的默认域名，请将默认域名替换为你自己的域名。

为什么要用域名？

Pigsty 强烈建议 使用域名访问 Web 系统，而不是直接通过 IP+Port 直连的方式，原因如下：

域名更容易记忆，使用方便
域名更加灵活，可以通过修改解析指向不同地址
通过域名访问，可以将所有服务收拢至 Nginx 统一出口，便于管理，审计，减小攻击面。
使用域名，可以申请 HTTPS 证书，加密流量，避免信息被窃听，提高安全性。
在国内，HTTP 访问未备案的域名，会被运营商劫持到特殊页面，HTTPS 访问证书域名则不会
一些无法直连的服务，例如监听 127.0.0.1 地址或者内部 Docker 网段的服务，可以通过 NGINX 代理访问

Pigsty 默认使用 内网静态域名，只需要在您本地服务器添加解析即可，无需在 DNS 供应商处申请真实域名。

如果您的 Pigsty 部署于互联网公网环境，您也可以考虑使用真正的互联网域名，并申请免费的 HTTPS 证书。

域名的工作原理

如果您并不熟悉 HTTP / DNS 协议，以下是一个简短的原理介绍：为什么 Nginx 只使用一个 80 端口（+ HTTPS 443）就可以为多个域名提供服务？

DNS协议

当客户端（例如浏览器）要访问 https://a.pigsty.cc 时，首先会通过 DNS 请求来解析 a.pigsty.cc 这个域名对应的 IP 地址。
负责解析的，可以是 本地的静态文件，也可以是 内网DNS服务器，或者是互联网上的 公网 DNS 解析。
DNS 返回的 IP 地址可能是同一台服务器，但对客户端来说，它只需要知道：这个域名解析到哪个 IP 即可。
多条域名（例如 a.pigsty.cc、p.pigsty.cc）可以在 DNS 解析时指向同一个 IP，浏览器最终都会把请求发到相同的 IP 地址上。

HTTP协议

在浏览器请求 HTTP 资源时（HTTP/1.1 及以上），请求头里会包含一个 Host 字段，内容为请求的域名。
这个 Host 字段是用于区分的关键，HTTP/1.1 规范规定，客户端必须包含这个首部。
当 Nginx 在 80 端口上收到请求后，会根据请求中的 Host，来匹配对应的服务器，进而决定要返回的站点内容。
因此通过 Host 不同取值的区分，Nginx 就可以在同一个端口上为不同域名提供不同的内容。
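
The Host-based dispatch described above can be sketched in a few lines of Python. This is only an illustration of name-based virtual hosting, not Nginx's actual matching logic; the routing table reuses the default Pigsty domains, and `VHOSTS`, `DEFAULT_SERVER`, and `pick_server` are hypothetical names.

```python
from typing import Dict, Optional

# Hypothetical routing table: Host header value -> upstream.
# Domains and ports follow the Pigsty defaults; the table itself is illustrative.
VHOSTS: Dict[str, str] = {
    "h.pigsty": "local:/www",
    "g.pigsty": "10.10.10.10:3000",
    "p.pigsty": "10.10.10.10:9090",
    "a.pigsty": "10.10.10.10:9093",
}
DEFAULT_SERVER = "h.pigsty"


def pick_server(host_header: Optional[str]) -> str:
    """Choose an upstream using only the request's Host header."""
    if not host_header:
        return VHOSTS[DEFAULT_SERVER]
    # The header may carry a port ("g.pigsty:80") and is case-insensitive.
    # IPv6 literals are not handled in this sketch.
    name = host_header.split(":", 1)[0].strip().lower()
    # Unknown names (including bare IP addresses) fall back to the default server.
    return VHOSTS.get(name, VHOSTS[DEFAULT_SERVER])
```

All four names can share one IP and one port because the decision is made after the TCP connection is established, from the header alone. A request addressed to the bare IP carries no known name, so it lands on the default server.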

Pigsty默认域名

Pigsty 默认为四个核心组件配置使用以下四个 内网域名：

域名	名称	端口	组件	说明
`h.pigsty`	`home`	80/443	Nginx	默认服务器，本地软件仓库
`g.pigsty`	`grafana`	3000	Grafana	Grafana 监控大屏与可视化平台
`p.pigsty`	`prometheus`	9090	Prometheus	Prometheus 监控时序数据库
`a.pigsty`	`alertmanager`	9093	AlertManager	告警聚合/屏蔽/管理/消息发送

因为这四个域名不使用任何顶级域，所以你需要 本地静态解析 或 内网动态解析 才可以使用。

这并不是什么复杂的配置，只需要在您的客户端机器上添加一行配置即可。

本地静态解析

假设 Pigsty 安装的内网 IP 地址为 10.10.10.10，那么在您的客户端机器上，添加以下静态解析记录即可：

# Pigsty 核心组件与默认域名
10.10.10.10 h.pigsty g.pigsty p.pigsty a.pigsty

如何添加解析

所谓客户端机器，指的是您在本地需要访问 Pigsty 服务的机器，通常也是你使用浏览器的环境，比如笔记本，台式机，虚拟机。

对于 Linux / MacOS 用户，您可以直接 sudo nano /etc/hosts 来编辑 hosts 文件，添加上述记录。

对于 Windows 用户，您需要使用管理员权限执行 notepad C:\Windows\System32\drivers\etc\hosts 来编辑 hosts 文件，添加上述记录。

添加记录后，您就可以通过这几个域名访问 Pigsty 的 Web 服务了。
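
Before editing the hosts file, it can help to check which names are already present. The following is a minimal sketch that parses hosts-file text and returns the record that is still missing; `resolved_names` and `missing_record` are hypothetical helper names, and the parsing covers only the plain `IP name name ...` form.

```python
from typing import Iterable, Set


def resolved_names(hosts_text: str) -> Set[str]:
    """Collect every host name that appears in hosts-file text."""
    names: Set[str] = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) >= 2:                  # first field is the IP address
            names.update(f.lower() for f in fields[1:])
    return names


def missing_record(hosts_text: str, ip: str, domains: Iterable[str]) -> str:
    """Return the hosts line that still needs to be added, or '' if none."""
    have = resolved_names(hosts_text)
    missing = [d for d in domains if d.lower() not in have]
    return f"{ip} {' '.join(missing)}" if missing else ""
```

For example, `missing_record(open("/etc/hosts").read(), "10.10.10.10", ["h.pigsty", "g.pigsty", "p.pigsty", "a.pigsty"])` yields the line to append, or an empty string when all four names are already defined.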

解析到其他域名

如果您不喜欢 Pigsty 默认的域名，您也可以将解析记录修改为其他域名，只是务必在安装前相应修改 infra_portal 中的 domain 参数。

例如，您希望使用 *.pigsty.xxx 作为默认域名，那么您需要修改 infra_portal 中的 domain 参数为：

infra_portal:
  home         : { domain: h.pigsty.xxx }
  grafana      : { domain: g.pigsty.xxx ,endpoint: "${admin_ip}:3000" ,websocket: true }
  prometheus   : { domain: p.pigsty.xxx ,endpoint: "${admin_ip}:9090" }
  alertmanager : { domain: a.pigsty.xxx ,endpoint: "${admin_ip}:9093" }
  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }

而相应的解析记录则需要修改为：

10.10.10.10 h.pigsty.xxx g.pigsty.xxx p.pigsty.xxx a.pigsty.xxx

这个域名可以是真的域名，也可以是随便什么你自己喜欢的名字，只要你确保域名可以在本机，内网，或公网上被解析到 Pigsty 的 IP 地址即可。

其他解析记录

如果您需要运行其他一系列 Pigsty 扩展工具集，你可以一并添加其他解析记录：

# Pigsty 扩展工具与使用的默认域名
10.10.10.10 adm.pigsty   # pgadmin GUI 管理工具
10.10.10.10 ddl.pigsty   # bytebase 数据库DDL管理工具
10.10.10.10 cli.pigsty   # pig CLI 预留域名
10.10.10.10 api.pigsty   # pigsty API 预留域名
10.10.10.10 lab.pigsty   # jupyterlab 预留域名
10.10.10.10 git.pigsty   # gitea 预留域名
10.10.10.10 wiki.pigsty  # wiki.js 预留域名
10.10.10.10 noco.pigsty  # nocodb 预留域名
10.10.10.10 supa.pigsty  # supabase 预留域名
10.10.10.10 dify.pigsty  # dify 预留域名
10.10.10.10 odoo.pigsty  # odoo 预留域名
10.10.10.10 mm.pigsty    # minio 预留域名

解析到公网地址

如果您的 Pigsty 部署在公网环境，比如云服务器上，那么您不能将本地解析记录解析到 内网 IP 地址，而必须将其解析到您的 公网 IP 地址。

您的云服务器如果可以访问互联网，那么通常会有两块虚拟网卡，一块连接互联网，一块连接内网，同时分别有对应的公网 IP 地址与内网 IP 地址。

假设您的云服务器公网 IP 地址为 1.2.3.4，内网 VPC 地址为 10.10.10.10，那么您需要将解析记录相应配置为：

# 公网/云服务器部署，请解析至公网 IP 地址！修改记录的前半段即可：
1.2.3.4 h.pigsty g.pigsty p.pigsty a.pigsty

内网动态解析

假如您希望在办公网络中的其他同事也可以通过域名访问 Pigsty 的 Web 服务，那么您可以考虑使用 内网动态解析。

最简单的办法是要求您的网络管理员，为您在内网 DNS 服务器上添加上面的解析记录。

使用内网DNS

假设您的内网 DNS 服务器地址为 192.168.1.1，那么在 Linux 与 MacOS 上，您可以编辑 /etc/resolv.conf 文件，添加以下记录：

nameserver 192.168.1.1

On Windows, you usually open the network adapter under "Network & Internet settings", then open the TCP/IPv4 properties and change the DNS configuration there.

您可以通过以下命令来测试内网 DNS 解析是否生效：

dig h.pigsty @192.168.1.1

使用Pigsty自带的DNS

Pigsty Infra 模块自带 DNS 解析服务器，您也可以通过 53 端口使用该 DNS 服务器。

但公网部署时请务必注意，中国大陆有特殊国情，通常 不允许 用户在公网服务器上启动 DNS 服务（53 端口）！

本地HTTPS访问

当您在本地通过 HTTP 访问 Pigsty 的 Web 服务时，默认会提示 “不安全”，因为这是明文传输的 HTTP 协议，容易被窃听，篡改，冒充。

在默认情况下，Pigsty 会使用本地自签名的 CA 证书为 Nginx 中所有带域名的服务器签发 “自签名证书”，并使用这些证书为 HTTP 服务启用 SSL。

如果你使用 HTTPS 访问 Pigsty 的 Web 服务，默认会提示 “证书错误”，因为这是 自签名的证书，而非权威机构签发的证书。

你可以选择：

不管他，回头接着用 HTTP 甚至是 IP+Port 访问，反正这是内网，怎么方便怎么来。
使用 HTTPS 访问，点击 “高级 - 我知道这里的风险，继续访问”
使用 Chrome 浏览器，可以在提示不安全的窗口键入 thisisunsafe 表示你知道这是“不安全”的自签名证书。
使用 HTTPS 访问，并且信任自签名证书，需要将 Pigsty 的 CA 证书添加到您的浏览器或操作系统中。
申请一个 CA 证书，并给 Pigsty 使用，这样 Pigsty 签发的证书就都是有效真证书。
使用正儿八经的真域名，并申请真正的 HTTPS 证书使用。

通常对于内网访问，如果您需要 HTTPS，又懒得每次跳过安全提醒，可以考虑直接在系统中信任 Pigsty 自动生成的 自签名CA。

对于严肃的生产环境，我们建议使用真正的 公网域名 并使用诸如 certbot 这样的工具申请免费的 HTTPS 证书使用。

信任自签名CA

Pigsty 默认会在初始化时，在安装节点本机 Pigsty 源码目录（~/pigsty）中生成一个自签名的 CA，并使用此 CA 签发内网各项服务所需的证书。

如果你想通过 HTTPS 加密访问 Pigsty 提供的 Web 服务，需要将 Pigsty 的 CA 证书分发至客户端电脑的信任证书目录中（或用真的 CA ，很贵！）。

Linux nodes managed by Pigsty already trust the Pigsty self-signed CA automatically. On Linux, the CA at /etc/pki/ca.crt can be trusted as follows:

# EL-family systems (RHEL, Rocky, Alma, ...)
rm -rf /etc/pki/ca-trust/source/anchors/ca.crt
ln -s /etc/pki/ca.crt /etc/pki/ca-trust/source/anchors/ca.crt
/bin/update-ca-trust

# Debian / Ubuntu
rm -rf /usr/local/share/ca-certificates/ca.crt
ln -s /etc/pki/ca.crt /usr/local/share/ca-certificates/ca.crt
/usr/sbin/update-ca-certificates

在 MacOS 上，您需要双击 ca.crt 文件将其加入系统钥匙串，然后在钥匙串应用中，搜索 pigsty-ca ，打开然后选择 “信任” 此根证书。

在 Windows 上，您需要将 ca.crt 文件添加到 “受信任的根证书颁发机构” 中。

信任 Pigsty CA 后，访问由该 CA 签发的自签名证书网站时，就不会再弹出 “不受信任证书” 之类的信息了。

公网域名解析

您也可以使用类似 Cloudflare，Godaddy，阿里云万网，腾讯云 DNSPod 等 DNS 服务商提供的域名解析服务。

这通常需要您购买一个域名，普通域名一年十几块钱，很便宜。

通常您需要通过云 DNS 服务商提供的控制台或 API 接口，将域名解析到 Pigsty 部署服务器的 公网 IP 地址。

假设你购买的域名名为 pigsty.xxx，您可以选择添加一条 * 通配符A记录，将所有子域名解析到 Pigsty 部署服务器的 公网 IP 地址。

或者单独为每个组件使用的域名添加一条 A 记录，将其指向 公网IP地址 即可：

h.pigsty.xxx 1.2.3.4
a.pigsty.xxx 1.2.3.4
p.pigsty.xxx 1.2.3.4
g.pigsty.xxx 1.2.3.4

Pigsty 内置了 Certbot 支持，如果使用公网域名，可以一键申请免费的 HTTPS 证书供 Nginx 使用（需三个月一更新）。

16.2 - Nginx：向外代理暴露Web服务

如何配置 Pigsty 中的 Nginx，对外代理并暴露内部 Web 服务，提供本地软件仓库服务。

Pigsty 的 Infra 模块默认会在节点上安装 Nginx，这是一个高性能的 Web 服务器。 Pigsty 使用 Nginx 作为所有本地 WebUI 服务的统一入口，并将其用作本地软件仓库，向内网其他节点提供服务。

当然，用户可以根据需求，调整配置，将 Nginx 用作标准的 Web 服务器，对外提供服务。无论是作为反向代理，还是直接作为网站服务器，都可以通过适当的配置实现。 Pigsty 本身的文档站与仓库也是通过 Pigsty 自建的 Nginx 对外提供的。

配置概览

Nginx 服务器配置由 infra_portal 参数指定。

用户在这里声明所有需要通过 Nginx 代理的域名，以及对应的上游服务器端点（endpoint）或本地目录路径（path）。

例如，默认情况下，Pigsty 会这样配置 Nginx，下面的配置会使用 Nginx 对外暴露 Home，Grafana，Prometheus，Alertmanager 四项服务：

infra_portal:  # domain names and upstream servers
  home         : { domain: h.pigsty }
  grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
  prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
  alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }

当安装 Pigsty 时，Pigsty 会自动根据以上配置生成 Nginx 的配置文件。

/etc/nginx/conf.d/haproxy/           # <--- 存放着 HAPROXY 管理界面的位置定义
/etc/nginx/conf.d/home.conf          # <--- Pigsty 默认服务器定义（本地软件源，HAPROXY转发）
/etc/nginx/conf.d/grafana.conf       # <--- 代理访问内网 Grafana 服务器
/etc/nginx/conf.d/prometheus.conf    # <--- 代理访问内网 Prometheus 服务器
/etc/nginx/conf.d/alertmanager.conf  # <--- 代理访问内网 Alertmanager 服务器

Nginx 默认服务于 80/443 端口，home 服务器是本地软件源，同时也是默认的 Nginx 服务器。如果你想通过 Nginx 访问其他服务，只需要在 infra_portal 中添加相应的配置即可，任何带有 domain 参数的配置都会被 Nginx 自动代理。

配置剧本

当安装 Pigsty 时，这些配置会在默认的 install.yml 剧本，或者 infra.yml 剧本中自动生效。但是用户也可以在 Pigsty 部署后使用 infra.yml 剧本中的 nginx 子任务重新初始化 Nginx 配置。

./infra.yml -t nginx           # 重新配置 Nginx
./infra.yml -t nginx_config    # 重新生成 Nginx 配置
./infra.yml -t nginx_launch    # 重新启动 Nginx 服务

这意味着如果您想要调整 Nginx 服务器的配置，只需要修改 pigsty.yml 配置文件，并执行上面的 nginx 任务即可生效。

当然，你也可以选择先使用 nginx_config 子任务重新生成 Nginx 配置文件，人工检查后使用 nginx -s reload 重新在线加载配置。

配置详情

配置变量 infra_portal 通常定义在全局变量 all.vars 中，默认值如下所示：

all:
  vars:
    infra_portal:  # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }

默认配置意味着，用户默认可以通过：

h.pigsty 访问 home 服务器，这是默认服务器，对应 Pigsty 的文件系统首页与本地软件源，通常指向本机上的 /www 目录。
g.pigsty 访问 grafana 服务器，这是默认的 Grafana 服务，通常指向管理节点（admin_ip）上的 3000 端口。
p.pigsty 访问 prometheus 服务器，这是默认的 Prometheus 服务，通常指向管理节点（admin_ip）上的 9090 端口。
a.pigsty 访问 alertmanager 服务器，这是默认的 Alertmanager 服务，通常指向管理节点（admin_ip）上的 9093 端口。

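Note that blackbox and loki have no domain parameter, so they are not added to the Nginx configuration. That does not mean you can remove these two entries: other services in the intranet may reference them (for example, the log agent uses the Loki endpoint address to ship logs).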

用户可以通过丰富的配置参数，为不同的服务配置不同的配置，如下所示：

服务器参数

每一条服务器记录都有一个独一无二的 name 作为 key，一个配置字典作为 value。在配置字典中，目前有以下几个可用配置项：

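domain: optional. The domain name to proxy. If omitted, Nginx does not expose this service.
- Services whose endpoint address must be known but which should not be exposed (for example Loki and Blackbox Exporter) can omit this parameter.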
endpoint：可选，指定上游服务的地址，可以是 IP:PORT 或者 DOMAIN:PORT。
- 当此服务器为上游服务时，可以指定此参数，Pigsty 会生成一个标准的反向代理配置，并将请求转发给上游的 endpoint 地址。
- 此参数与 path 参数是互斥的，不能同时存在：如果一个服务器是反向代理服务器，那么它不能同时是本地网页服务器。
- 在此参数值中，可以使用 ${admin_ip} 占位符，Pigsty会填入 admin_ip 的值。
- 如果指定了此参数，则 Pigsty 默认会使用 endpoint.conf 配置模板，这是反向代理的标准模板
- 如果同时指定了 conf 参数，则 conf 参数指定的模板有更高的优先级。
- 如果上游强制要求 HTTPS 访问，你可以额外设置 scheme: https 参数。
path：可选，指定本地 Web 服务器的根目录，可以是绝对路径或者相对路径。
- 当此服务器为本地 Web 服务器时，可以指定此参数，Pigsty 会生成一个标准的本地 Web 服务器配置，并将请求转发给本地的 path 目录。
- 此参数与 endpoint 参数是互斥的，不能同时存在，如果一个服务器是本地网页服务器，那么它不能同时是上游代理服务器。
- 如果指定了此参数，则 Pigsty 默认会使用 path.conf 配置模板，这是本地 Web 服务器的标准模板
- 如果同时指定了 conf 参数，则 conf 参数指定的模板有更高的优先级。
- 如果你希望 Nginx 自动生成文件列表索引，可以设置 index: true 参数，默认是不打开的。
conf: 可选，如果指定，则将会使用 templates/nginx/ 中定义的配置模板。
- 当你想要任意定制 Nginx 配置时可以指定此参数，指定一个存在于 templates/nginx 目录中的模板文件名。
- 如果没有指定此参数，Pigsty 将根据服务器的类型（Home, Proxy, Path）自动应用相应的默认模板：
  - home.conf：默认服务器的模板（用于 home）。
  - endpoint.conf：上游服务代理的模板（例如：用于Grafana，Prometheus 等）
  - path.conf：本地 Web 服务器的模板（默认没有使用，但 home 是特殊的本地服务器）
certbot: 可选，指定此服务器的 certbot 证书名称，Pigsty 会自动使用 certbot 生成的证书。
- 如果您的证书是使用 certbot 生成的，你可以指定此参数为 certbot 生成的证书名称，Pigsty 会自动使用此证书。
- 您应当填入 certbot 生成证书的域名部分。例如 certbot: pigsty.cc 会自动使用
  - 证书：/etc/letsencrypt/live/pigsty.cc/fullchain.pem，但可以被显式指定的 cert 参数（完整路径）覆盖。
  - 私钥：/etc/letsencrypt/live/pigsty.cc/privkey.pem，但可以被显式指定的 key 参数（完整路径）覆盖。
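- The certbot certificate name is usually the same as the domain parameter; for example, the certificate for g.pigsty.cc is named g.pigsty.cc. There is one exception: when you request certificates for several domains at once, certbot produces a single bundled certificate named after the first domain in the request list.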
cert: 可选，指定此服务器的 SSL 证书文件名，需要给出完整路径。
key: 可选，指定此服务器的 SSL 私钥文件名，需要给出完整路径。
domains: 可选，除了默认的 domain 域名，您还可以为此服务器指定多个额外的域名
scheme: 可选，指定此服务器的协议（http/https），留空则默认使用 http，通常用于强制要求 HTTPS 访问的上游 Web 服务。
index: 可选，如果设置为 true， Nginx 会为目录自动生成文件列表索引页面，方便浏览文件，对于软件仓库类通常可以打开，对于网站通常应当关闭。
log: 可选，如果指定，则日志将打印到 <value>.log ，而非默认的 access.log 中。
- 作为特例，上游代理服务器的日志总是会打印到 <name>.log 中。
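
The rules in the parameter list above can be summarized in a short Python sketch. It is not Pigsty's implementation (the real logic lives in its Ansible templates); `classify_portal` is a hypothetical helper, and treating the entry named `home` as the default server is an assumption based on the description of home.conf.

```python
from typing import Any, Dict


def classify_portal(name: str, entry: Dict[str, Any], admin_ip: str) -> Dict[str, Any]:
    """Derive how one infra_portal entry would be handled, per the documented rules."""
    if "endpoint" in entry and "path" in entry:
        # endpoint (reverse proxy) and path (local web root) are mutually exclusive
        raise ValueError(f"{name}: endpoint and path are mutually exclusive")

    endpoint = entry.get("endpoint")
    if endpoint is not None:
        # ${admin_ip} is a placeholder that is filled with the admin node address
        endpoint = endpoint.replace("${admin_ip}", admin_ip)

    # Default template by server type; an explicit conf always wins.
    if name == "home":                 # assumption: home is identified by its key
        default_conf = "home.conf"
    elif endpoint is not None:
        default_conf = "endpoint.conf"
    elif "path" in entry:
        default_conf = "path.conf"
    else:
        default_conf = None

    return {
        "exposed": "domain" in entry,  # no domain -> not served by Nginx
        "endpoint": endpoint,
        "conf": entry.get("conf", default_conf),
    }
```

With the default configuration, `grafana` comes out as an exposed reverse proxy to `10.10.10.10:3000`, while `loki` keeps a resolved endpoint but is not exposed.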

复杂的配置样例

Pigsty 自带的配置模板 conf/demo.yml 有一个更详细的案例，给出了 Pigsty 文档站的配置样例。

是的，Pigsty 的文档站也是在一台普通云服务器上使用 Pigsty 本身建设的，其配置如下所示：

infra_portal:
  home         : { domain: home.pigsty.cc }
  grafana      : { domain: demo.pigsty.cc ,endpoint: "${admin_ip}:3000" ,websocket: true ,cert: /etc/cert/demo.pigsty.cc.crt ,key: /etc/cert/demo.pigsty.cc.key }
  prometheus   : { domain: p.pigsty.cc    ,endpoint: "${admin_ip}:9090" }
  alertmanager : { domain: a.pigsty.cc    ,endpoint: "${admin_ip}:9093" }
  cc           : { domain: pigsty.cc      ,path:     "/www/pigsty.cc"   ,cert: /etc/cert/pigsty.cc.crt ,key: /etc/cert/pigsty.cc.key }

  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }
  minio        : { domain: m.pigsty.cc    ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }
  postgrest    : { domain: api.pigsty.cc  ,endpoint: "127.0.0.1:8884"   }
  pgadmin      : { domain: adm.pigsty.cc  ,endpoint: "127.0.0.1:8885"   }
  pgweb        : { domain: cli.pigsty.cc  ,endpoint: "127.0.0.1:8886"   }
  bytebase     : { domain: ddl.pigsty.cc  ,endpoint: "127.0.0.1:8887"   }
  jupyter      : { domain: lab.pigsty.cc  ,endpoint: "127.0.0.1:8888", websocket: true }
  gitea        : { domain: git.pigsty.cc  ,endpoint: "127.0.0.1:8889" }
  wiki         : { domain: wiki.pigsty.cc ,endpoint: "127.0.0.1:9002" }
  noco         : { domain: noco.pigsty.cc ,endpoint: "127.0.0.1:9003" }
  supa         : { domain: supa.pigsty.cc ,endpoint: "10.10.10.10:8000" ,websocket: true }
  dify         : { domain: dify.pigsty.cc ,endpoint: "10.10.10.10:8001" ,websocket: true }
  odoo         : { domain: odoo.pigsty.cc ,endpoint: "127.0.0.1:8069"   ,websocket: true }
  mm           : { domain: mm.pigsty.cc   ,endpoint: "10.10.10.10:8065" ,websocket: true }

这个文档站的 Nginx 配置要比默认的配置复杂一些：

home 服务器使用了一个真实的公网域名 home.pigsty.cc
grafana 服务器使用了一个真实的公网域名 demo.pigsty.cc，并配置了 websocket: true 以支持 WebSocket 连接。
cc 服务器是 Pigsty 的文档站，它使用了真实公网域名 pigsty.cc，并指向了本地的 /www/pigsty.cc 目录。
下面还定义了一系列 Docker App 服务器，将这些应用的 Web 界面通过域名对外暴露。
cc 与 grafana 直接通过 cert 与 key 参数指定了 HTTPS 证书。

配置域名

您可以通过 IP:Port 直接访问特定服务，例如 IP:3000 访问 Grafana，IP:9090 访问 Prometheus。但这样的行为通常并不可取，常规安全最佳实践要求您通过域名访问服务，而不是通过 IP:Port 直接访问。通过域名访问意味着你只需要对外暴露一个 Nginx 服务，减小攻击面，并便于统一添加访问控制！

小知识：Nginx 是如何区分不同服务器的？

Nginx uses the domain name in the Host header sent by the browser to tell the different services apart.

所以如果您有多种服务，尽管都使用同一个 IP 地址的 80/443 端口，但 Nginx 仍然可以区分它们。但限制条件就是，您必须通过域名访问服务，而不是通过 IP:Port 直接访问。如果您直接用 IP 来访问 80/443 端口，那么您只能访问到默认的 home 服务器。

使用域名访问 Pigsty WebUI 时，您需要配置 DNS 解析，有以下几种方式：

使用真域名，通过云厂商/域名厂商的 DNS 解析服务，将公网域名指向你的服务器公网IP
使用内网域名，在内网 DNS 服务器上添加内网域名，并指向你服务器的内网IP地址
使用本机域名，在你浏览器所在主机的（/etc/hosts）添加一条静态解析记录

通过本机访问

如果你是唯一的用户，那么可以直接修改本地 /etc/hosts 文件（Linux/MacOS），在 Windows 系统上，您需要修改 C:\Windows\System32\drivers\etc\hosts 文件。无论是什么系统，修改此文件通常需要管理员权限。

你可以添加以下静态解析记录，将 Pigsty 的默认域名指向你的服务器IP地址：

<ip_address>  h.pigsty a.pigsty p.pigsty g.pigsty

如果你会用到其他服务，也可以添加其他服务对应的域名解析记录：

10.10.10.10 h.pigsty a.pigsty p.pigsty g.pigsty
10.10.10.10 api.pigsty ddl.pigsty adm.pigsty cli.pigsty lab.pigsty
10.10.10.10 supa.pigsty noco.pigsty odoo.pigsty dify.pigsty

通过办公网访问

如果您的服务需要在办公网共享访问，例如让所有同事都可以通过域名访问，那么除了让同事都在自己的电脑上添加上面的静态解析记录之外，更正规的做法是使用内网DNS服务器。

您可以要求网络管理员在公司内部 DNS 服务器中添加相应的解析记录，将其指向 Nginx 服务器所在的 IP 地址。

当然还有另一种选项，Pigsty 默认安装也会在 53 端口提供一个 DNS 服务器，您可以通过配置 /etc/resolv.conf 文件或图形化界面配置 DNS 服务器，来使用内网域名访问部署在办公网中的服务。

通过互联网访问

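If your service needs to be exposed directly on the Internet, you usually need to resolve an Internet domain name through a DNS provider (Cloudflare, Aliyun DNS, etc.). You can still use a local DNS server or local static records to reach the service, but then other users on the Internet cannot access your services (apart from the default home server).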

您可以将所有需要暴露的域名都解析到 Nginx 服务器所在的 IP 地址。或者更简单的将 @ 和 * A 记录都解析到 Nginx 服务器所在的 IP 地址。当你申请好新域名并将其指向你的服务器公网IP后，你还需要修改 Pigsty 的 infra_portal ，将域名填入各个服务器条目的 domain 字段中。

通过公网访问的域名，最佳实践是申请 HTTPS 证书，并始终使用 HTTPS 访问。我们将在下一节介绍这个主题。

配置HTTPS

HTTPS 是当代 Web 服务的主流配置，然而并非所有用户都熟悉 HTTPS 的配置方法。因此 Pigsty 默认为用户启用 HTTPS 支持。

如果你的 Nginx 只是对内网，办公网提供服务，那么 HTTPS 是一个 可选项；如果你的 Nginx 需要对 互联网 提供服务，那么我们 强烈建议 您使用真实的域名与真正的 HTTPS 证书。使用 HTTPS 不仅能够加密您的网络流量避免非法窥探篡改，而且能够避免访问 “未备案” 域名时恼人的体验。

本地域名与自签名证书

因为 Pigsty 默认使用的域名都是本地域名（x.pigsty），无法申请真正的域名 HTTPS 证书，所以 Pigsty 默认使用自签名证书。

Pigsty 会使用自签名的 CA 为所有的 infra_portal 中的域名签发证书。当然此证书并非权威证书，在浏览器中会提示证书不可信。你可以选择：

“我知道不安全，继续访问”
In Chrome, you can also type thisisunsafe on the warning page to bypass certificate verification.
将 Pigsty 自动生成的 pigsty-ca CA 证书加入浏览器所在电脑的信任的根 CA 列表。
回退到 HTTP 或者 IP:Port 访问，不使用 HTTPS （不推荐）
不使用本地域名与自签名证书，而是使用真正的域名与真正的 HTTPS 证书。

默认生成的自签名 CA 公钥和私钥位于 pigsty 本地目录的：files/pki/ca/ca.crt 和 files/pki/ca/ca.key。

真实域名与真证书

HTTPS 证书通常是一项收费服务，但是您可以使用诸如 certbot 这类工具申请免费的 Let’s Encrypt 证书。

使用 Certbot 申请真正 HTTPS 证书的教程将在下一篇 Certbot教程：申请免费HTTPS证书中详细介绍。

16.3 - Certbot：申请公网HTTPS证书

如何使用 Certbot 申请免费的公网HTTPS证书？

Pigsty 自带了 Certbot 工具，并默认于 Infra 节点上安装启用。

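This means you can use the certbot command-line tool directly to request real, free Let's Encrypt HTTPS certificates for your Nginx server and public domain names, instead of using the HTTPS certificates self-signed by Pigsty.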

为了做到这一点，你需要：

确定哪些域名需要证书
将这些域名指向您的服务器
使用 Certbot 申请证书
将Nginx配置文件纳入管理
配置更新证书的定时任务
申请证书的一些注意事项

以下是如何去做的详细说明：

确定哪些域名需要证书

首先，您需要决定哪些 “上游服务” 需要真正的公网证书

infra_portal:
  home         : { domain: h.pigsty.cc }
  grafana      : { domain: g.pigsty.cc ,endpoint: "${admin_ip}:3000" ,websocket: true  }
  prometheus   : { domain: p.pigsty.cc ,endpoint: "${admin_ip}:9090" }
  alertmanager : { domain: a.pigsty.cc ,endpoint: "${admin_ip}:9093" }
  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }
  minio        : { domain: m.pigsty.cc    ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }
  web          : { domain: pigsty.cc      ,path: "/www/web.cc" }
  repo         : { domain: repo.pigsty.cc ,path: "/www/repo"   }

例如在 infra_portal 中，假设我们要对外暴露以下五项服务：

Grafana 可视化监控面板的 g.pigsty.cc 域名
Prometheus 时序数据库的 p.pigsty.cc 域名
AlertManager 告警面板的 a.pigsty.cc 域名
Pigsty 文档站的 pigsty.cc 域名，指向本地文档目录
Pigsty 软件仓库的 repo.pigsty.cc 域名，指向软件仓库

这里的例子里特意没有选择为 home 主页申请真的 Let’s Encrypt 证书，原因见最后一节。

将这些域名指向您的服务器

接下来，您需要将上面选定的域名指向您服务器的 公网IP地址。例如，Pigsty CC 站点的 IP 地址是 47.83.172.23，则可在域名注册商（如阿里云DNS控制台）上设置以下域名解析 A 记录：

47.83.172.23 pigsty.cc
47.83.172.23 g.pigsty.cc
47.83.172.23 p.pigsty.cc
47.83.172.23 a.pigsty.cc
47.83.172.23 repo.pigsty.cc

修改完之后，可以使用

使用 Certbot 申请证书

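The first time you request a certificate, certbot asks for your email address and whether you agree to the terms of service; just follow the prompts.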

$ certbot --nginx -d pigsty.cc -d repo.pigsty.cc -d g.pigsty.cc -d p.pigsty.cc -d a.pigsty.cc
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Enter email address (used for urgent renewal and security notices)
 (Enter 'c' to cancel): rh@vonng.com

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Please read the Terms of Service at
https://letsencrypt.org/documents/LE-SA-v1.4-April-3-2024.pdf. You must agree in
order to register with the ACME server. Do you agree?
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(Y)es/(N)o: Y

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Would you be willing, once your first certificate is successfully issued, to
share your email address with the Electronic Frontier Foundation, a founding
partner of the Let's Encrypt project and the non-profit organization that
develops Certbot? We'd like to send you email about our work encrypting the web,
EFF news, campaigns, and ways to support digital freedom.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(Y)es/(N)o: N
Account registered.
Requesting a certificate for pigsty.cc and 4 more domains

Successfully received certificate.
Certificate is saved at: /etc/letsencrypt/live/pigsty.cc/fullchain.pem
Key is saved at:         /etc/letsencrypt/live/pigsty.cc/privkey.pem
This certificate expires on 2025-05-18.
These files will be updated when the certificate renews.
Certbot has set up a scheduled task to automatically renew this certificate in the background.

Deploying certificate
Successfully deployed certificate for pigsty.cc to /etc/nginx/conf.d/web.conf
Successfully deployed certificate for repo.pigsty.cc to /etc/nginx/conf.d/repo.conf
Successfully deployed certificate for g.pigsty.cc to /etc/nginx/conf.d/grafana.conf
Successfully deployed certificate for p.pigsty.cc to /etc/nginx/conf.d/prometheus.conf
Successfully deployed certificate for a.pigsty.cc to /etc/nginx/conf.d/alertmanager.conf
Congratulations! You have successfully enabled HTTPS on https://pigsty.cc, https://repo.pigsty.cc, https://g.pigsty.cc, https://p.pigsty.cc, and https://a.pigsty.cc

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
If you like Certbot, please consider supporting our work by:
 * Donating to ISRG / Let's Encrypt:   https://letsencrypt.org/donate
 * Donating to EFF:                    https://eff.org/donate-le
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

首次申请之后，以后申请就可以省略这些步骤，直接使用这些命令了。

更简单的办法是直接以非交互式的方法调用，直接使用以下命令，传入你的邮箱和要注册的域名。

certbot --nginx --agree-tos --email rh@vonng.com -n -d supa.pigsty.cc

将Nginx配置文件纳入管理

使用 certbot 签发证书后，默认会修改 Nginx 的配置文件，将 HTTP 服务器重定向到 HTTPS 服务器，而这可能并非你想要的。

你可以通过修改 Pigsty 配置文件中的 infra_portal 参数，将 Certbot 已经成功签发证书的域名配置到 Nginx 的配置文件中。

infra_portal:
  home         : { domain: h.pigsty.cc }
  grafana      : { domain: g.pigsty.cc ,endpoint: "${admin_ip}:3000" ,websocket: true , certbot: pigsty.cc }
  prometheus   : { domain: p.pigsty.cc ,endpoint: "${admin_ip}:9090"                  , certbot: pigsty.cc }
  alertmanager : { domain: a.pigsty.cc ,endpoint: "${admin_ip}:9093"                  , certbot: pigsty.cc }
  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }
  minio        : { domain: m.pigsty.cc    ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true    }
  web          : { domain: pigsty.cc      ,path: "/www/web.cc"                        , certbot: pigsty.cc }
  repo         : { domain: repo.pigsty.cc ,path: "/www/repo"                          , certbot: pigsty.cc }

这里，修改签发证书的服务器定义项，添加 certbot: <domain-name> ，这里的 <domain-name> 指的是 certbot 签发的文件名。通常与 domain 一样，但如果你同时申请多个域名证书，certbot 会将其合并为一个证书，比如这里合并为两个文件:

Certificate is saved at: /etc/letsencrypt/live/pigsty.cc/fullchain.pem
Key is saved at:         /etc/letsencrypt/live/pigsty.cc/privkey.pem

因此将证书中间的 pigsty.cc 抽出来填入 certbot，然后重新运行：

./infra.yml -t nginx_config,nginx_launch

即可让 Pigsty 重新生成 Nginx 配置文件，回退 Certbot 对配置进行的其他修改，只保留申请的证书。以后需要续期更新证书的时候就不需要重复这个过程了，直接使用 certbot renew 即可。
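
The mapping from the `certbot` parameter to certificate files can be sketched as follows. This is an illustration of the documented rule (certbot name plus optional `cert` / `key` overrides), not the actual template logic; `tls_files` is a hypothetical helper, and returning `None` simply means "no explicit certificate configured" in this sketch.

```python
from typing import Dict, Optional


def tls_files(entry: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Resolve certificate and key paths for one infra_portal entry."""
    name = entry.get("certbot")
    cert, key = entry.get("cert"), entry.get("key")
    if name:
        live = f"/etc/letsencrypt/live/{name}"
        # Explicit cert / key (full paths) take precedence over the certbot defaults.
        cert = cert or f"{live}/fullchain.pem"
        key = key or f"{live}/privkey.pem"
    if cert and key:
        return {"cert": cert, "key": key}
    return None  # nothing configured explicitly
```

For the bundled certificate above, every entry with `certbot: pigsty.cc` resolves to the same pair of files under /etc/letsencrypt/live/pigsty.cc/, regardless of its own domain.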

配置更新证书的定时任务

By default the issued certificate is valid for three months, so you should renew it with certbot renew before it expires.

如果你需要更新证书，执行以下命令即可。

certbot renew

在真正执行之前，你可以使用 DryRun 模式来测试续期是否正常：

certbot renew --dry-run

如果你修改过 Nginx 配置文件，请务必确保 certbot 的修改不会影响你的配置文件。

你可以将这个命令配置为 crontab ，在每个月的第一天凌晨执行续期并打印日志。
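
To decide whether a renewal is due, you can compute the remaining lifetime from the certificate's expiry timestamp. The sketch below uses the standard library's `ssl.cert_time_to_seconds`, which parses the `notAfter` format found in certificates; `days_until_expiry` and `renewal_due` are hypothetical helpers, and the 30-day threshold is an assumption you can adjust.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between `now` and a certificate notAfter string (GMT)."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).days


def renewal_due(not_after: str, now: datetime, threshold_days: int = 30) -> bool:
    """True when the certificate expires within `threshold_days` (assumed default)."""
    return days_until_expiry(not_after, now) <= threshold_days
```

A certificate that expires on 2025-05-18, like the one issued above, would be flagged for renewal from mid-April onward with the 30-day threshold.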

16.4 - Docker：启用容器与镜像代理

如何在 Pigsty 启用 Docker 容器支持？Docker 的安装部署配置，以及如何解决DockerHub被“墙”的问题

Pigsty 提供了 DOCKER 模块，但默认并不安装。

您可以使用 docker.yml 剧本在指定节点上安装并启用 Docker。

./docker.yml -l <ip|group|cls>   # 在指定的节点、分组、集群上安装并启用 Docker

How to configure a proxy server?

本文不会介绍如何“翻墙”，而是假设你已经有了一个可用的 HTTP(s) 代理服务器，应该如何配置，让 Docker 可以通过代理服务器，访问 docker hub 或 quay.io 等镜像站点：

你的代理服务器软件应该会提供一个形如：

http://<ip|domain>:<port> 或者 https://[user]:[pass]@<ip|domain>:<port> 的代理地址

For example, if your proxy server listens on 192.168.0.106:8118, you can use it through the following environment variables:

export ALL_PROXY=http://192.168.0.106:8118
export HTTP_PROXY=http://192.168.0.106:8118
export HTTPS_PROXY=http://192.168.0.106:8118
export NO_PROXY="localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.myqcloud.com,*.tsinghua.edu.cn"

您可以使用 curl 命令，检验代理服务器是否可以正常工作，例如成功可以访问 Google，通常说明代理服务器工作正常。

curl -x http://192.168.0.106:8118 -I http://www.google.com

如何为Docker Daemon配置代理服务器？

如果您希望 Docker 在 Pull 镜像时使用代理服务器，那么应当在 pigsty.yml 配置文件的全局变量中，指定 proxy_env 参数：

all:
  vars:
    proxy_env:                        # global proxy env when downloading packages
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.myqcloud.com,*.tsinghua.edu.cn"
      http_proxy: http://192.168.0.106:8118
      all_proxy: http://192.168.0.106:8118
      https_proxy: http://192.168.0.106:8118

When the docker playbook runs, these settings are rendered into the proxy configuration in /etc/docker/daemon.json:

{
  "proxies": {
    "http-proxy": "{{ proxy_env['http_proxy'] }}",
    "https-proxy": "{{ proxy_env['http_proxy'] }}",
    "no-proxy": "{{ proxy_env['no_proxy'] }}"
  }
}

请注意，Docker Daemon 不使用 all_proxy 参数
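
If you maintain daemon.json by hand, the same structure can be produced from a `proxy_env` dictionary. This is a sketch rather than the playbook's template: `docker_proxies` is a hypothetical helper, and unlike the template shown above (which reuses `http_proxy` for both entries) it assumes `https_proxy` should be preferred when it is set.

```python
import json
from typing import Dict


def docker_proxies(proxy_env: Dict[str, str]) -> str:
    """Render the "proxies" section of /etc/docker/daemon.json as JSON text."""
    http = proxy_env.get("http_proxy", "")
    proxies = {
        "http-proxy": http,
        # Assumption: fall back to http_proxy when https_proxy is not given.
        "https-proxy": proxy_env.get("https_proxy", http),
        "no-proxy": proxy_env.get("no_proxy", ""),
    }
    # all_proxy is intentionally ignored: the Docker daemon does not use it.
    return json.dumps({"proxies": proxies}, indent=2)
```

The output contains only the `proxies` object; if daemon.json already holds other settings, merge this object into the existing file rather than overwriting it.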

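If you prefer to set the proxy by hand, you can either edit the proxies section of /etc/docker/daemon.json directly, or edit the service definition at /lib/systemd/system/docker.service (Debian/Ubuntu) or /usr/lib/systemd/system/docker.service (EL) and add environment variables to the [Service] section: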

[Service]
Environment="HTTP_PROXY=http://192.168.0.106:8118"
Environment="HTTPS_PROXY=http://192.168.0.106:8118"
Environment="NO_PROXY=localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.myqcloud.com,*.tsinghua.edu.cn"

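Reload systemd (required after editing a unit file) and restart Docker for the change to take effect:

systemctl daemon-reload
systemctl restart docker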

如何使用其他镜像站点？

您可以在 docker_registry_mirrors 参数中指定其他镜像站点，例如阿里云、腾讯云、清华大学等镜像站点：

[ "https://mirror.ccs.tencentyun.com" ]         # tencent cloud mirror, intranet only
["https://registry.cn-hangzhou.aliyuncs.com"]   # aliyun cloud mirror, login required

不过目前来看，所有位于中国大陆的 DockerHub 公有镜像站都已经被封禁了，建议使用代理服务器直接访问 Docker Hub

如果您需要使用其他镜像站，例如 quay.io，可以首先执行：

docker login quay.io
username #> # input your username
password #> # input your password

16.5 - 使用 PostgreSQL 作为 Ansible 的配置清单与 CMDB

使用 PostgreSQL ，而不是静态 YAML 配置文件作为 Ansible 的配置源，从而更好地与外部系统集成整合。

您可以使用 PostgreSQL 作为 Pigsty 的配置源，替代静态 YAML 配置文件。

使用 CMDB 作为 Ansible 的动态配置清单具有一些优点：元数据以高度结构化的方式以数据表的形式呈现，并通过数据库约束确保一致性。同时，使用 CMDB 允许您使用第三方的工具来编辑管理 Pigsty 元数据，便于与外部系统相互集成。

Ansible配置原理

Pigsty 默认的配置文件路径在 ansible.cfg 中指定为：inventory = pigsty.yml

修改该参数，可以更改默认使用的配置文件路径。如果您将其指向一个可执行的脚本文件，那么 Ansible 会使用动态 Inventory 机制，执行该脚本，并期待该脚本返回一份配置文件。

修改配置源实质上是编辑Pigsty目录下的 ansible.cfg 实现的：

---
inventory = pigsty.yml
+++
inventory = inventory.sh

而 inventory.sh 则是一个从 PostgreSQL CMDB 的记录中，生成等效 YAML/JSON 配置文件的简单脚本。
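
For orientation, here is a minimal sketch of the JSON structure that Ansible expects from a dynamic inventory script invoked with `--list`. It is not the actual inventory.sh; `build_inventory` and the row shape `(cluster, ip, host vars)` are hypothetical, standing in for whatever a CMDB query would return.

```python
from typing import Any, Dict, Iterable, Tuple

# Hypothetical row shape for illustration: (cluster name, host IP, host variables)
Row = Tuple[str, str, Dict[str, Any]]


def build_inventory(rows: Iterable[Row]) -> Dict[str, Any]:
    """Assemble an Ansible dynamic-inventory dictionary from CMDB-like rows."""
    inventory: Dict[str, Any] = {"_meta": {"hostvars": {}}, "all": {"children": []}}
    for cluster, ip, hostvars in rows:
        group = inventory.setdefault(cluster, {"hosts": []})
        group["hosts"].append(ip)
        # Host variables go under _meta.hostvars so Ansible needs only one call.
        inventory["_meta"]["hostvars"][ip] = dict(hostvars)
        if cluster not in inventory["all"]["children"]:
            inventory["all"]["children"].append(cluster)
    return inventory
```

A script that prints `json.dumps(build_inventory(rows))` to standard output and is marked executable can be referenced from the `inventory` setting in ansible.cfg, which is the mechanism the switch above relies on.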

加载配置

Pigsty CMDB的模式会在pg-meta元数据库初始化时自动创建（files/cmdb.sql），位于meta数据库的pigsty 模式中。使用bin/inventory_load可以将静态配置文件加载至CMDB中。

必须在元节点完整执行 infra.yml 安装完毕后，方可使用 CMDB

usage: inventory_load [-h] [-p PATH] [-d CMDB_URL]

load config arguments

optional arguments:
  -h, --help            show this help message and exit
  -p PATH, --path PATH  config path, ${PIGSTY_HOME}/pigsty.yml by default
  -d DATA, --data DATA  postgres cmdb pgurl, ${METADB_URL} by default

By default, running the script without arguments loads the contents of $PIGSTY_HOME/pigsty.yml into the default CMDB.

bin/inventory_load
bin/inventory_load -p conf/demo.yml
bin/inventory_load -p conf/prod.yml -d postgresql://dbuser_meta:DBUser.Meta@10.10.10.10:5432/meta

当原有配置文件加载至 CMDB 作为初始数据后，即可配置 Ansible 使用 CMDB 作为配置源：

bin/inventory_cmdb

您可以切换回静态配置文件：

bin/inventory_conf

16.6 - 使用 PostgreSQL 作为 Grafana 后端数据库

使用 PostgreSQL 而不是 SQLite 作为 Grafana 后端使用的远程存储数据库，获取更好的性能与可用性。

您可以使用 PostgreSQL 作为 Grafana 后端使用的数据库。

这是了解Pigsty部署系统使用方式的好机会，完成此教程，您会了解：

如何创建新数据库集群
如何在已有数据库集群中创建新业务用户
如何在已有数据库集群中创建新业务数据库
如何访问Pigsty所创建的数据库
如何管理Grafana中的监控面板
如何管理Grafana中的PostgreSQL数据源
如何一步到位完成Grafana数据库升级

太长不看

vi pigsty.yml # 取消注释DB/User定义：dbuser_grafana  grafana 
bin/pgsql-user  pg-meta  dbuser_grafana
bin/pgsql-db    pg-meta  grafana

psql postgres://dbuser_grafana:DBUser.Grafana@meta:5436/grafana -c \
  'CREATE TABLE t(); DROP TABLE t;' # 检查连接串可用性
  
vi /etc/grafana/grafana.ini # 修改 [database] type url
systemctl restart grafana-server

创建数据库集群

我们可以在pg-meta上定义一个新的数据库grafana，也可以在新的机器节点上创建一个专用于Grafana的数据库集群：pg-grafana

定义集群

如果需要创建新的专用数据库集群pg-grafana，部署在10.10.10.11，10.10.10.12两台机器上，可以使用以下配置文件：

pg-grafana: 
  hosts: 
    10.10.10.11: {pg_seq: 1, pg_role: primary}
    10.10.10.12: {pg_seq: 2, pg_role: replica}
  vars:
    pg_cluster: pg-grafana
    pg_databases:
      - name: grafana
        owner: dbuser_grafana
        revokeconn: true
        comment: grafana primary database
    pg_users:
      - name: dbuser_grafana
        password: DBUser.Grafana
        pgbouncer: true
        roles: [dbrole_admin]
        comment: admin user for grafana database

创建集群

Create the pg-grafana database cluster with the following command:

bin/createpg pg-grafana    # 初始化pg-grafana集群

该命令实际上调用了Ansible Playbook pgsql.yml 创建数据库集群。

./pgsql.yml -l pg-grafana  # 实际执行的等效Ansible剧本命令

定义在 pg_users 与 pg_databases 中的业务用户与业务数据库会在集群初始化时自动创建，因此使用该配置时，集群创建完毕后，（在没有DNS支持的情况下）您可以使用以下连接串访问数据库（任一即可）：

postgres://dbuser_grafana:DBUser.Grafana@10.10.10.11:5432/grafana # 主库直连
postgres://dbuser_grafana:DBUser.Grafana@10.10.10.11:5436/grafana # 直连default服务
postgres://dbuser_grafana:DBUser.Grafana@10.10.10.11:5433/grafana # 连接串读写服务

postgres://dbuser_grafana:DBUser.Grafana@10.10.10.12:5432/grafana # direct connection to the replica instance (read-only)
postgres://dbuser_grafana:DBUser.Grafana@10.10.10.12:5436/grafana # 直连default服务
postgres://dbuser_grafana:DBUser.Grafana@10.10.10.12:5433/grafana # 连接串读写服务
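
When connection strings are assembled in scripts, it is safer to URL-encode the credentials. The sketch below builds the same kind of URL as the list above; `pg_url` and `SERVICE_PORTS` are hypothetical names, and the port mapping only covers the three access paths mentioned in this section.

```python
from urllib.parse import quote

# Ports as used in this section: direct instance access, primary service, default service
SERVICE_PORTS = {"direct": 5432, "primary": 5433, "default": 5436}


def pg_url(user: str, password: str, host: str, dbname: str, service: str = "default") -> str:
    """Build a postgres:// URL, percent-encoding user, password and database name."""
    port = SERVICE_PORTS[service]
    return (
        f"postgres://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}"
    )
```

For instance, `pg_url("dbuser_grafana", "DBUser.Grafana", "10.10.10.11", "grafana")` reproduces the default-service URL shown above, while a password containing `@` or `/` is encoded instead of breaking the URL.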

因为默认情况下Pigsty安装在单个元节点上，接下来的步骤我们会在已有的pg-meta数据库集群上创建Grafana所需的用户与数据库，而并非使用这里创建的pg-grafana集群。

创建Grafana业务用户

通常业务对象管理的惯例是：先创建用户，再创建数据库。因为如果为数据库配置了owner，数据库对相应的用户存在依赖。

定义用户

要在pg-meta集群上创建用户dbuser_grafana，首先将以下用户定义添加至pg-meta的集群定义中：

添加位置：all.children.pg-meta.vars.pg_users

- name: dbuser_grafana
  password: DBUser.Grafana
  comment: admin user for grafana database
  pgbouncer: true
  roles: [ dbrole_admin ]

如果您在这里定义了不同的密码，请在后续步骤中将相应参数替换为新密码

创建用户

使用以下命令完成dbuser_grafana用户的创建（任一均可）。

bin/pgsql-user pg-meta dbuser_grafana # 在pg-meta集群上创建`dbuser_grafana`用户

This actually invokes the Ansible playbook pgsql-user.yml to create the user:

./pgsql-user.yml -l pg-meta -e pg_user=dbuser_grafana  # Ansible

dbrole_admin 角色具有在数据库中执行DDL变更的权限，这正是Grafana所需要的。

创建Grafana业务数据库

定义数据库

创建业务数据库的方式与业务用户一致，首先在pg-meta的集群定义中添加新数据库grafana的定义。

添加位置：all.children.pg-meta.vars.pg_databases

- { name: grafana, owner: dbuser_grafana, revokeconn: true }

创建数据库

使用以下命令完成grafana数据库的创建（任一均可）。

bin/pgsql-db pg-meta grafana # 在`pg-meta`集群上创建`grafana`数据库

This actually invokes the Ansible playbook pgsql-db.yml to create the database:

./pgsql-db.yml -l pg-meta -e pg_database=grafana # 实际执行的Ansible剧本

使用Grafana业务数据库

检查连接串可达性

您可以使用不同的服务或接入方式访问数据库，例如：

postgres://dbuser_grafana:DBUser.Grafana@meta:5432/grafana # 直连
postgres://dbuser_grafana:DBUser.Grafana@meta:5436/grafana # default服务
postgres://dbuser_grafana:DBUser.Grafana@meta:5433/grafana # primary服务

这里，我们将使用通过负载均衡器直接访问主库的 Default服务访问数据库。

首先检查连接串是否可达，以及是否有权限执行DDL命令。

psql postgres://dbuser_grafana:DBUser.Grafana@meta:5436/grafana -c \
  'CREATE TABLE t(); DROP TABLE t;'

直接修改Grafana配置

为了让Grafana使用 Postgres 数据源，您需要编辑 /etc/grafana/grafana.ini，并修改配置项：

[database]
;type = sqlite3
;host = 127.0.0.1:3306
;name = grafana
;user = root
# If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
;password =
;url =

将默认的配置项修改为：

[database]
type = postgres
url = postgres://dbuser_grafana:DBUser.Grafana@meta:5436/grafana

随后重启Grafana即可：

systemctl restart grafana-server
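
If you would rather script the edit than change grafana.ini by hand, the standard library's `configparser` can rewrite the `[database]` section. This is a rough sketch under clear limitations: `set_grafana_database` is a hypothetical helper, it assumes every active key sits inside a section, and `configparser` drops all comments when writing, so keep a backup of the original file.

```python
import configparser
import io


def set_grafana_database(ini_text: str, url: str) -> str:
    """Return ini text whose [database] section points Grafana at PostgreSQL."""
    # interpolation=None keeps '%' in passwords from being treated specially
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(ini_text)          # lines starting with ';' or '#' are comments
    if not cfg.has_section("database"):
        cfg.add_section("database")
    cfg["database"]["type"] = "postgres"
    cfg["database"]["url"] = url
    out = io.StringIO()
    cfg.write(out)                     # note: comments are not preserved
    return out.getvalue()
```

After writing the result back to /etc/grafana/grafana.ini, restart grafana-server as shown above.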

从监控系统中看到新增的 grafana 数据库已经开始有活动，则说明Grafana已经开始使用Postgres作为首要后端数据库了。但一个新的问题是，Grafana中原有的Dashboards与Datasources都消失了！这里需要重新导入监控面板与Postgres数据源

管理Grafana监控面板

您可以使用管理用户前往 Pigsty 目录下的files/ui目录，执行grafana.py init重新加载Pigsty监控面板。

cd ~/pigsty/files/ui
./grafana.py init    # 使用当前目录下的Dashboards初始化Grafana监控面板

执行结果：

vagrant@meta:~/pigsty/files/ui
$ ./grafana.py init
Grafana API: admin:pigsty @ http://10.10.10.10:3000
init dashboard : home.json
init folder pgcat
init dashboard: pgcat / pgcat-table.json
init dashboard: pgcat / pgcat-bloat.json
init dashboard: pgcat / pgcat-query.json
init folder pgsql
init dashboard: pgsql / pgsql-replication.json
init dashboard: pgsql / pgsql-table.json
init dashboard: pgsql / pgsql-activity.json
init dashboard: pgsql / pgsql-cluster.json
init dashboard: pgsql / pgsql-node.json
init dashboard: pgsql / pgsql-database.json
init dashboard: pgsql / pgsql-xacts.json
init dashboard: pgsql / pgsql-overview.json
init dashboard: pgsql / pgsql-session.json
init dashboard: pgsql / pgsql-tables.json
init dashboard: pgsql / pgsql-instance.json
init dashboard: pgsql / pgsql-queries.json
init dashboard: pgsql / pgsql-alert.json
init dashboard: pgsql / pgsql-service.json
init dashboard: pgsql / pgsql-persist.json
init dashboard: pgsql / pgsql-proxy.json
init dashboard: pgsql / pgsql-query.json
init folder pglog
init dashboard: pglog / pglog-instance.json
init dashboard: pglog / pglog-analysis.json
init dashboard: pglog / pglog-session.json

该脚本会侦测当前的环境（安装时定义于~/pigsty），获取Grafana的访问信息，并将监控面板中的URL连接占位符域名（*.pigsty）替换为真实使用的域名。

export GRAFANA_ENDPOINT=http://10.10.10.10:3000
export GRAFANA_USERNAME=admin
export GRAFANA_PASSWORD=pigsty

export NGINX_UPSTREAM_YUMREPO=yum.pigsty
export NGINX_UPSTREAM_CONSUL=c.pigsty
export NGINX_UPSTREAM_PROMETHEUS=p.pigsty
export NGINX_UPSTREAM_ALERTMANAGER=a.pigsty
export NGINX_UPSTREAM_GRAFANA=g.pigsty
export NGINX_UPSTREAM_HAPROXY=h.pigsty

题外话，使用grafana.py clean会清空目标监控面板，使用grafana.py load会加载当前目录下所有监控面板，当Pigsty的监控面板发生变更，可以使用这两个命令升级所有的监控面板。

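Managing Postgres datasources

When you create a new PostgreSQL cluster with pgsql.yml or a new business database with pgsql-createdb.yml, Pigsty registers a new PostgreSQL datasource in Grafana, and you can access the target database instance directly from Grafana with the default monitoring user. Most features of the pgcat application depend on this.

To register Postgres databases, use the register_grafana task in pgsql.yml: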

./pgsql.yml -t register_grafana             # re-register all Postgres datasources in the current environment
./pgsql.yml -t register_grafana -l pg-test  # re-register all databases in the pg-test cluster

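Updating Grafana in one step

You can also change the backend database used by Grafana directly in the Pigsty configuration file and complete the switch in one step. Edit the grafana_database and grafana_pgurl parameters in pigsty.yml as follows: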

grafana_database: postgres
grafana_pgurl: postgres://dbuser_grafana:DBUser.Grafana@meta:5436/grafana

Then rerun the grafana task in infra.yml to complete the Grafana upgrade:

./infra.yml -t grafana

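16.7 - Storing Prometheus time-series metrics with TimescaleDB + Promscale

You can use Promscale to persist Prometheus metrics in TimescaleDB.

This is not a recommended practice, but it is a good opportunity to learn how the Pigsty deployment system is used.

Note that storing Prometheus metrics with Promscale takes roughly 4 times the storage space of Prometheus itself; in exchange, you can query and analyze the metrics with SQL.

Preparing the Postgres database

vi pigsty.yml # uncomment the DB/User definitions: dbuser_prometheus  prometheus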

pg_databases:                           # define business databases on this cluster, array of database definition
  - { name: prometheus, owner: dbuser_prometheus , revokeconn: true, comment: prometheus primary database }
pg_users:                           # define business users/roles on this cluster, array of user definition
  - {name: dbuser_prometheus , password: DBUser.Prometheus ,pgbouncer: true , createrole: true,  roles: [dbrole_admin], comment: admin user for prometheus database }

Create the Prometheus business user and business database:

bin/createuser  pg-meta  dbuser_prometheus
bin/createdb    pg-meta  prometheus

Check database availability and create the extension:

psql postgres://dbuser_prometheus:DBUser.Prometheus@10.10.10.10:5432/prometheus -c 'CREATE EXTENSION timescaledb;'

Configuring Promscale

Run the following command on the meta node to install promscale:

yum install -y promscale

If it is not in the default packages, you can download it directly:

wget https://github.com/timescale/promscale/releases/download/0.6.1/promscale_0.6.1_Linux_x86_64.rpm
sudo rpm -ivh promscale_0.6.1_Linux_x86_64.rpm

Edit the promscale configuration file /etc/sysconfig/promscale.conf:

PROMSCALE_DB_HOST="127.0.0.1"
PROMSCALE_DB_NAME="prometheus"
PROMSCALE_DB_PASSWORD="DBUser.Prometheus"
PROMSCALE_DB_PORT="5432"
PROMSCALE_DB_SSL_MODE="disable"
PROMSCALE_DB_USER="dbuser_prometheus"

Finally, start promscale. It connects to the database instance with timescaledb installed and creates the schema it needs.

# launch
cat /usr/lib/systemd/system/promscale.service
systemctl start promscale && systemctl status promscale

Configuring Prometheus

Prometheus can use Postgres as remote storage through Promscale, via Remote Write / Remote Read.

Edit the Prometheus configuration file:

vi /etc/prometheus/prometheus.yml

Add the following entries:

remote_write:
  - url: "http://127.0.0.1:9201/write"
remote_read:
  - url: "http://127.0.0.1:9201/read"

After Prometheus restarts, monitoring data is written to Postgres.

systemctl restart prometheus

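16.8 - Configuring an L2 VIP for a Pigsty node cluster with Keepalived

How do you bind a layer-2 VIP to a node cluster in Pigsty? When does it not work, and what can you do about it?

You can bind an optional L2 VIP to a node cluster, provided that all nodes in the cluster are on the same layer-2 network.

Set the vip_enabled parameter on a node cluster (any Ansible group, including a database cluster definition, can be treated as a node cluster) to enable Keepalived on it and bind an L2 VIP.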

proxy:
  hosts:
    10.10.10.29: { nodename: proxy-1 } # you can explicitly specify the initial VIP role: MASTER / BACKUP
    10.10.10.30: { nodename: proxy-2 } # , vip_role: master }
  vars:
    node_cluster: proxy
    vip_enabled: true
    vip_vrid: 128
    vip_address: 10.10.10.99
    vip_interface: eth1

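Use the following commands to refresh the Keepalived configuration on the nodes and apply it:

./node.yml -l proxy -t node_vip     # enable the VIP for the first time
./node.yml -l proxy -t vip_refresh  # refresh the VIP configuration (e.g. to designate the master)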

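Dedicated scenarios

For PostgreSQL high availability, Pigsty provides an L2 VIP solution based on vip-manager.

vip-manager is a standalone component that reads the PostgreSQL cluster leader from etcd and binds an L2 VIP on the node where the leader runs.

We therefore recommend using vip-manager rather than Keepalived for PostgreSQL high availability. See PGSQL VIP for details.

Unsupported scenarios

Cloud environments such as AWS and Alibaba Cloud usually do not support L2 VIPs.

In that case, we recommend using a layer-4 load balancer to achieve a similar effect.

For example, Pigsty provides configuration support for HAProxy.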

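16.9 - Configuring an L2 VIP for a PostgreSQL cluster with VIP-Manager

How do you bind a layer-2 VIP to a PostgreSQL cluster in Pigsty?

You can bind an optional L2 VIP to a PostgreSQL cluster, provided that all nodes in the cluster are on the same layer-2 network.

This L2 VIP always works in Master-Backup mode, and the Master always points to the node hosting the cluster's primary instance.

The VIP is managed by the VIP-Manager component, which reads the Leader Key written by Patroni directly from the DCS (etcd) to determine whether it is the Master.

Enabling the VIP

Set the pg_vip_enabled parameter to true on a PostgreSQL cluster to enable the VIP component for it. You can also enable this option in the global configuration.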

# pgsql 3 node ha cluster: pg-test
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }   # primary instance, leader of cluster
    10.10.10.12: { pg_seq: 2, pg_role: replica }   # replica instance, follower of leader
    10.10.10.13: { pg_seq: 3, pg_role: replica, pg_offline_query: true } # replica with offline access
  vars:
    pg_cluster: pg-test           # define pgsql cluster name
    pg_users:  [{ name: test , password: test , pgbouncer: true , roles: [ dbrole_admin ] }]
    pg_databases: [{ name: test }]

    # enable L2 VIP
    pg_vip_enabled: true
    pg_vip_address: 10.10.10.3/24
    pg_vip_interface: eth1

Note that pg_vip_address must be a valid IP address with a network prefix (CIDR notation), and it must be available on the current layer-2 network.
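
As a rough illustration of the "valid IP address with a network prefix" requirement, the POSIX shell sketch below checks that a value has the form a.b.c.d/prefix. The function name is_valid_vip_address is hypothetical and not part of Pigsty. It only checks syntax and cannot tell whether the address is actually free on your layer-2 network.

```shell
# is_valid_vip_address VALUE
# Hypothetical helper (not part of Pigsty): succeed only if VALUE looks like
# an IPv4 address with a prefix length, e.g. 10.10.10.3/24.
is_valid_vip_address() {
    addr="${1%/*}"      # part before the last slash
    prefix="${1#*/}"    # part after the first slash
    # Rejoining must reproduce the input: exactly one slash is present
    [ "$addr/$prefix" = "$1" ] || return 1
    # Prefix length must be a number between 0 and 32
    case "$prefix" in ''|*[!0-9]*) return 1 ;; esac
    [ "$prefix" -le 32 ] || return 1
    # Split the address on dots and check each of the four octets
    old_ifs="$IFS"; IFS=.; set -- $addr; IFS="$old_ifs"
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        case "$octet" in ''|*[!0-9]*) return 1 ;; esac
        [ "$octet" -le 255 ] || return 1
    done
    return 0
}
```

For example, is_valid_vip_address 10.10.10.3/24 succeeds, while 10.10.10.3 (no prefix) and 10.10.10.300/24 (octet out of range) are rejected.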

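Note that pg_vip_interface must be a valid network interface name, and it should be the interface that carries the IPv4 address used in the inventory. If the interface names differ among cluster members, specify pg_vip_interface explicitly for each instance, for example: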

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary , pg_vip_interface: eth0  }
    10.10.10.12: { pg_seq: 2, pg_role: replica , pg_vip_interface: eth1  }
    10.10.10.13: { pg_seq: 3, pg_role: replica , pg_vip_interface: ens33 }
  vars:
    pg_cluster: pg-test           # define pgsql cluster name
    pg_users:  [{ name: test , password: test , pgbouncer: true , roles: [ dbrole_admin ] }]
    pg_databases: [{ name: test }]

    # enable L2 VIP
    pg_vip_enabled: true
    pg_vip_address: 10.10.10.3/24
    #pg_vip_interface: eth1

Use the following command to refresh the vip-manager configuration for PG and restart it to take effect:

./pgsql.yml -t pg_vip

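16.10 - HugePage: enabling huge page support for the database

How do you allocate exactly the right number of huge pages for a PostgreSQL cluster?

Pros and cons of huge pages

Enabling huge pages has both benefits and drawbacks for a database. The benefits:

Significant performance gains in OLAP scenarios: large scans and batch computation
A more controllable memory allocation model: the required memory is "locked" at startup
More efficient memory access with fewer TLB misses
Lower kernel page-table maintenance overhead

But there are also some drawbacks:

Additional configuration and maintenance complexity
Huge page memory is locked, which reduces flexibility in environments that need elastic use of system resources
Limited benefit with small amounts of memory, where it can even be counterproductive

Note that HugePage and Transparent HugePage are two different concepts. Pigsty forcibly disables Transparent HugePage to follow database best practices.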

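When to enable huge pages?

We recommend enabling huge pages if your scenario meets the following conditions:

OLAP analytical workloads
More than a few dozen GB of memory
PostgreSQL 15+
Linux kernel version > 3.10 (> EL7, > Ubuntu 16)

Pigsty does not enable huge pages by default, but you can enable them with simple configuration and dedicate them to PostgreSQL.

Allocating huge pages on nodes

To enable huge pages on a node, use one of the following two parameters:

node_hugepage_count: the exact number of 2MB huge pages to allocate
node_hugepage_ratio: the fraction of memory to allocate as huge pages

Use one or the other: either specify the number of (2MB) huge pages directly, or specify the fraction of memory to allocate as huge pages (0.00 - 0.90). The former takes precedence.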

node_hugepage_count: 0            # exact number of 2MB huge pages, takes precedence over node_hugepage_ratio
node_hugepage_ratio: 0            # fraction of memory allocated as 2MB huge pages, lower priority than node_hugepage_count

Apply the change:

./node.yml -t node_tune

Under the hood, this writes the vm.nr_hugepages value to /etc/sysctl.d/hugepage.conf and runs sysctl -p to apply it.

./node.yml -t node_tune -e node_hugepage_count=3000    # allocate exactly 3000 2MB huge pages (6GB)
./node.yml -t node_tune -e node_hugepage_ratio=0.30    # allocate 30% of memory as huge pages
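
To get a feel for how a page count and a memory ratio relate, the arithmetic can be sketched in shell. Both helper names are hypothetical and not part of Pigsty. The sketch assumes 2MB pages and rounds down, which may differ from how Pigsty itself rounds.

```shell
# Hypothetical helpers (not part of Pigsty) for 2MB huge page arithmetic.

# hugepages_to_mb COUNT: memory covered by COUNT 2MB huge pages, in MB
hugepages_to_mb() {
    echo $(( $1 * 2 ))
}

# hugepages_for_ratio MEM_KB RATIO: number of 2MB (2048 kB) pages that cover
# RATIO of MEM_KB, rounded down (the rounding rule is an assumption)
hugepages_for_ratio() {
    awk -v mem_kb="$1" -v ratio="$2" \
        'BEGIN { printf "%d\n", int(mem_kb * ratio / 2048) }'
}

# Example: 30% of this machine's memory, if /proc/meminfo is readable
if [ -r /proc/meminfo ]; then
    mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    echo "30% of ${mem_kb} kB = $(hugepages_for_ratio "$mem_kb" 0.30) huge pages"
fi
```

With these helpers, hugepages_to_mb 3000 prints 6000, which is the 6GB figure used in the comment above.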

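Note that these parameters enable huge pages for the whole node; PostgreSQL is not the only process that can use them.

By default, the PostgreSQL server tries to use huge pages at startup. If the system does not have enough huge pages available, PostgreSQL falls back to regular pages and starts anyway.

If you reduce the number of huge pages, only huge pages that are neither used nor reserved (Free) are released. Huge pages already in use are released after the owning process exits.

Pigsty allows at most 90% of memory to be allocated as huge pages, but for a PostgreSQL database a reasonable range is usually 25% - 40% of memory.

We recommend setting node_hugepage_ratio=0.30 and then adjusting the number of huge pages as needed after PostgreSQL starts.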

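Checking huge page status

The most intuitive way is to use the Pigsty monitoring system. The dashboard panel below shows an example timeline while huge pages are being adjusted:

Node Instance - Memory - HugePages Allocation

Default state
Huge pages enabled but unused
PG restarted, some huge pages used / reserved
PG used further, more huge pages in use
Huge page count reduced, unused huge pages reclaimed
PG restarted, reserved huge pages fully released

You can also run cat /proc/meminfo | grep Huge to check huge page status.

$ cat /proc/meminfo  | grep Huge

By default, huge pages are not enabled and the number of huge pages (Total) is 0: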

AnonHugePages:      8192 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB

With huge pages enabled, there are 6015 huge pages in total, all free:

AnonHugePages:      8192 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:    6015
HugePages_Free:     6015
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        12318720 kB

If you restart PostgreSQL now (it tries to use huge pages by default):

sudo su - postgres
pg-restart

PostgreSQL then reserves (Rsvd, Reserved) the huge pages it needs for shared buffers. Here, for example, 5040 pages are reserved.

AnonHugePages:      8192 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:    6015
HugePages_Free:     5887
HugePages_Rsvd:     5040
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        12318720 kB

If we put some load on PostgreSQL, for example pgbench -is10, PostgreSQL starts using more huge pages (Alloc = Total - Free).
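
The Alloc = Total - Free relation can be read straight from /proc/meminfo. The helper below is a minimal sketch with a hypothetical name. It accepts an optional file argument so it can also be tried on saved meminfo output.

```shell
# hugepages_in_use [FILE]
# Hypothetical helper: print HugePages_Total - HugePages_Free.
# FILE defaults to /proc/meminfo.
hugepages_in_use() {
    awk '/^HugePages_Total:/ { total = $2 }
         /^HugePages_Free:/  { free = $2 }
         END { print total - free }' "${1:-/proc/meminfo}"
}
```

Applied to the sample above (Total 6015, Free 5887), it prints 128.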

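Note that once huge pages have been allocated or reserved, they are retained until they are no longer in use, even if you lower the system's vm.nr_hugepages parameter. To actually reclaim them, you need to restart the PostgreSQL service.

./node.yml -t node_tune -e node_hugepage_count=3000    # allocate 3000 huge pages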

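Allocating huge pages precisely

You must allocate enough huge pages before PostgreSQL starts; otherwise PostgreSQL cannot use them.

In Pigsty, the default SharedBuffer does not exceed 25% of memory, so you can allocate 26% ~ 27% of memory as huge pages to make sure PostgreSQL can use them.

node_hugepage_ratio: 0.27  # allocate 27% of memory as huge pages first, certainly enough for PG

If you do not mind wasting a small amount of resources, you can simply allocate about 27% of memory as huge pages.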

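Reclaiming huge pages

After PG starts, the following SQL returns the number of huge pages PostgreSQL actually requires:

SHOW shared_memory_size_in_huge_pages;

Finally, you can specify the exact number of huge pages required:

node_hugepage_count: 3000   # allocate exactly 3000 2MB huge pages (6GB)

However, the exact number of huge pages required can usually only be determined after the PostgreSQL server has started.

So the compromise is to over-allocate huge pages in advance, start PostgreSQL, query the exact number required from PG, and then set the huge page count precisely.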
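
The over-allocate-then-shrink workflow can be sketched as two small shell helpers. Both names are hypothetical and not part of Pigsty. The first assumes it runs as the dbsu on the database node with local psql access to a running PostgreSQL 15+. The second only prints the playbook command so you can review it before running it on the admin node.

```shell
# Hypothetical helpers sketching the over-allocate-then-shrink workflow.

# Ask a running PostgreSQL 15+ how many huge pages it needs.
# -A unaligned, -X skip psqlrc, -t tuples only, -c run one command.
pg_required_hugepages() {
    psql -AXtc 'SHOW shared_memory_size_in_huge_pages;'
}

# Print (do not run) the playbook command that sets the exact huge page count.
hugepage_tune_command() {
    echo "./node.yml -t node_tune -e node_hugepage_count=$1"
}
```

A possible use is hugepage_tune_command "$(pg_required_hugepages)", followed by running the printed command from the Pigsty directory.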

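Dedicating huge pages to PG

By default, all processes can use huge pages. To allow only the PostgreSQL database to use them, modify the vm.hugetlb_shm_group kernel parameter.

You can adjust the node_sysctl_params parameter and fill in the GID of PostgreSQL.

node_sysctl_params:
  vm.hugetlb_shm_group: 26    # EL

node_sysctl_params:
  vm.hugetlb_shm_group: 543   # Debian

Note that the default PostgreSQL UID/GID differs between EL and Debian: 26 and 543 respectively (you can change it explicitly with pg_dbsu_uid).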
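
Rather than relying on the default values, you can look up the GID on the node itself. This is a minimal sketch. dbsu_gid is a hypothetical helper name, and it assumes the database superuser is the OS user postgres unless another user name is passed.

```shell
# dbsu_gid [USER]
# Hypothetical helper: print the primary group ID of USER (default: postgres),
# which is the value to put into vm.hugetlb_shm_group.
dbsu_gid() {
    id -g "${1:-postgres}"
}
```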

To remove this change:

sysctl -p /etc/sysctl.d/hugepage.conf

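Quick tuning script

Wasted huge pages can be reclaimed with the pg-tune-hugepage script, which only works with PostgreSQL 15+.

If your PostgreSQL is already running, you can enable huge pages as follows (PG15+ only):

sync; echo 3 > /proc/sys/vm/drop_caches   # flush to disk and drop system caches (be prepared for a database performance impact)
sudo /pg/bin/pg-tune-hugepage             # write nr_hugepages to /etc/sysctl.d/hugepage.conf
pg restart <cls>                          # restart postgres to use hugepages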

Sample output of pg-tune-hugepage:

$ /pg/bin/pg-tune-hugepage
[INFO] Querying PostgreSQL for hugepage requirements...
[INFO] Added safety margin of 0 hugepages (5168 → 5168)
[INFO] ==================================
PostgreSQL user: postgres
PostgreSQL group ID: 26
Required hugepages: 5168
Configuration file: /etc/sysctl.d/hugepage.conf
[BEFORE] ================================
Current memory information:
AnonHugePages:      8192 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:   10025
HugePages_Free:     9896
HugePages_Rsvd:     5039
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        20531200 kB
Current sysctl settings:
vm.hugetlb_shm_group = 26
vm.nr_hugepages = 10025
vm.nr_hugepages_mempolicy = 10025
[EXECUTE] ===============================
Writing new hugepage configuration...
Applying new settings...
vm.nr_hugepages = 5168
vm.hugetlb_shm_group = 26
[AFTER] =================================
Updated memory information:
AnonHugePages:      8192 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:    5168
HugePages_Free:     5039
HugePages_Rsvd:     5039
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        10584064 kB
Updated sysctl settings:
vm.hugetlb_shm_group = 26
vm.nr_hugepages = 5168
vm.nr_hugepages_mempolicy = 5168
[DONE] ==================================
PostgreSQL hugepage configuration complete.

Consider adding the following to your inventory file:
node_hugepage_count: 5168
node_sysctl_params: {vm.hugetlb_shm_group: 26}

References

https://www.cybertec-postgresql.com/en/huge-pages-postgresql/

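16.11 - Citus: deploying a natively highly available cluster

How do you deploy a highly available distributed Citus cluster?

Citus is a PostgreSQL extension that turns PostgreSQL into a distributed database in place and scales it horizontally across multiple nodes to handle large volumes of data and queries.

Since v3.0, Patroni has provided native high-availability support for Citus, which simplifies setting up Citus clusters. Pigsty supports this natively as well.

Citus cluster

Pigsty supports Citus natively. See conf/citus.yml, as well as the more complex ten-node cluster.

The example below uses the Pigsty four-node sandbox to define a Citus cluster pg-citus, which consists of a two-node coordinator cluster pg-citus0 and two worker clusters pg-citus1 and pg-citus2.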

pg-citus:
  hosts:
    10.10.10.10: { pg_group: 0, pg_cluster: pg-citus0 ,pg_vip_address: 10.10.10.2/24 ,pg_seq: 1, pg_role: primary }
    10.10.10.11: { pg_group: 0, pg_cluster: pg-citus0 ,pg_vip_address: 10.10.10.2/24 ,pg_seq: 2, pg_role: replica }
    10.10.10.12: { pg_group: 1, pg_cluster: pg-citus1 ,pg_vip_address: 10.10.10.3/24 ,pg_seq: 1, pg_role: primary }
    10.10.10.13: { pg_group: 2, pg_cluster: pg-citus2 ,pg_vip_address: 10.10.10.4/24 ,pg_seq: 1, pg_role: primary }
  vars:
    pg_mode: citus                            # pgsql cluster mode: citus
    pg_version: 16                            # PostgreSQL major version used by the citus cluster
    pg_shard: pg-citus                        # citus shard name: pg-citus
    pg_primary_db: citus                      # primary database used by citus
    pg_vip_enabled: true                      # enable vip for citus cluster
    pg_vip_interface: eth1                    # vip interface for all members
    pg_dbsu_password: DBUser.Postgres         # enable dbsu password access for citus cluster
    pg_extensions: [ citus, postgis, pgvector, topn, pg_cron, hll ]  # install these extensions
    pg_libs: 'citus, pg_cron, pg_stat_statements' # citus will be added by patroni automatically
    pg_users: [{ name: dbuser_citus ,password: DBUser.Citus ,pgbouncer: true ,roles: [ dbrole_admin ]    }]
    pg_databases: [{ name: citus ,owner: dbuser_citus ,extensions: [ citus, vector, topn, pg_cron, hll ] }]
    pg_parameters:
      cron.database_name: citus
      citus.node_conninfo: 'sslrootcert=/pg/cert/ca.crt sslmode=verify-full'
    pg_hba_rules:
      - { user: 'all' ,db: all  ,addr: 127.0.0.1/32  ,auth: ssl   ,title: 'all user ssl access from localhost' }
      - { user: 'all' ,db: all  ,addr: intra         ,auth: ssl   ,title: 'all user ssl access from intranet'  }

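Compared with a standard PostgreSQL cluster, the configuration of a Citus cluster has a few special requirements.

First, make sure the Citus extension is downloaded, installed, loaded, and enabled. This involves the following four parameters:

repo_packages: must include the citus extension, or you need to use a PostgreSQL offline package that contains the Citus extension.
pg_extensions: must include the citus extension, i.e. citus must be installed on every node.
pg_libs: must include the citus extension, and citus must come first, although Patroni now takes care of this automatically.
pg_databases: defines a primary database, which must have the citus extension installed.

Second, make sure the Citus cluster itself is configured correctly:

pg_mode: must be set to citus, which tells Patroni to use Citus mode.
pg_primary_db: must specify the name of a primary database that has the citus extension installed; here it is citus.
pg_shard: must specify a common name (a string) used as the cluster name prefix for all horizontally sharded PG clusters; here it is pg-citus.
pg_group: must specify a shard number, an integer assigned sequentially from zero. Group 0 always denotes the coordinator cluster; the others are worker clusters.
pg_cluster: must match the combination of pg_shard and pg_group.
pg_dbsu_password: must be set to a non-empty plain-text password, otherwise Citus does not work properly.
pg_parameters: setting citus.node_conninfo is recommended, to enforce SSL access and require client certificate verification between nodes.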
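
The relation between pg_cluster, pg_shard, and pg_group can be illustrated with a small shell sketch. The convention shown here (pg_shard immediately followed by pg_group, as in pg-citus + 0 = pg-citus0) is inferred from the example inventory above, and both function names are hypothetical.

```shell
# citus_cluster_name SHARD GROUP
# Hypothetical helper: expected pg_cluster for a given pg_shard and pg_group,
# following the naming seen in the example inventory (pg-citus + 0 -> pg-citus0).
citus_cluster_name() {
    printf '%s%s\n' "$1" "$2"
}

# check_citus_member SHARD GROUP CLUSTER
# Succeed only if CLUSTER matches the expected name for SHARD and GROUP.
check_citus_member() {
    [ "$(citus_cluster_name "$1" "$2")" = "$3" ]
}
```

For instance, check_citus_member pg-citus 2 pg-citus2 succeeds, while check_citus_member pg-citus 1 pg-citus2 fails and would flag a mismatched host entry.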

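Once configured, you can deploy the Citus cluster with the pgsql.yml playbook, just like a regular PostgreSQL cluster.

Managing the Citus cluster

After the Citus cluster is defined, deploy it with the same pgsql.yml playbook:

./pgsql.yml -l pg-citus    # deploy the Citus cluster pg-citus

As the DBSU (postgres) on any member, you can list the status of the Citus cluster with patronictl (alias: pg):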

$ pg list
+ Citus cluster: pg-citus ----------+---------+-----------+----+-----------+--------------------+
| Group | Member      | Host        | Role    | State     | TL | Lag in MB | Tags               |
+-------+-------------+-------------+---------+-----------+----+-----------+--------------------+
|     0 | pg-citus0-1 | 10.10.10.10 | Leader  | running   |  1 |           | clonefrom: true    |
|       |             |             |         |           |    |           | conf: tiny.yml     |
|       |             |             |         |           |    |           | spec: 20C.40G.125G |
|       |             |             |         |           |    |           | version: '16'      |
+-------+-------------+-------------+---------+-----------+----+-----------+--------------------+
|     1 | pg-citus1-1 | 10.10.10.11 | Leader  | running   |  1 |           | clonefrom: true    |
|       |             |             |         |           |    |           | conf: tiny.yml     |
|       |             |             |         |           |    |           | spec: 10C.20G.125G |
|       |             |             |         |           |    |           | version: '16'      |
+-------+-------------+-------------+---------+-----------+----+-----------+--------------------+
|     2 | pg-citus2-1 | 10.10.10.12 | Leader  | running   |  1 |           | clonefrom: true    |
|       |             |             |         |           |    |           | conf: tiny.yml     |
|       |             |             |         |           |    |           | spec: 10C.20G.125G |
|       |             |             |         |           |    |           | version: '16'      |
+-------+-------------+-------------+---------+-----------+----+-----------+--------------------+
|     2 | pg-citus2-2 | 10.10.10.13 | Replica | streaming |  1 |         0 | clonefrom: true    |
|       |             |             |         |           |    |           | conf: tiny.yml     |
|       |             |             |         |           |    |           | spec: 10C.20G.125G |
|       |             |             |         |           |    |           | version: '16'      |
+-------+-------------+-------------+---------+-----------+----+-----------+--------------------+

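You can treat each horizontal shard cluster as an independent PGSQL cluster and manage it with the pg (patronictl) command. Note, however, that when you manage a Citus cluster with the pg command, you need the additional --group parameter to specify the shard number:

pg list pg-citus --group 0   # use --group 0 to specify the shard number

Citus has a system table named pg_dist_node that records the nodes of the Citus cluster. Patroni maintains this table automatically.

PGURL=postgres://postgres:DBUser.Postgres@10.10.10.10/citus

psql $PGURL -c 'SELECT * FROM pg_dist_node;'       # show node information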
 nodeid | groupid |  nodename   | nodeport | noderack | hasmetadata | isactive | noderole  | nodecluster | metadatasynced | shouldhaveshards
--------+---------+-------------+----------+----------+-------------+----------+-----------+-------------+----------------+------------------
      1 |       0 | 10.10.10.10 |     5432 | default  | t           | t        | primary   | default     | t              | f
      4 |       1 | 10.10.10.12 |     5432 | default  | t           | t        | primary   | default     | t              | t
      5 |       2 | 10.10.10.13 |     5432 | default  | t           | t        | primary   | default     | t              | t
      6 |       0 | 10.10.10.11 |     5432 | default  | t           | t        | secondary | default     | t              | f

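You can also view user authentication information (superuser access only):

$ psql $PGURL -c 'SELECT * FROM pg_dist_authinfo;'   # show node authentication info (superuser access only)

Then you can access the Citus cluster as a regular business user (for example dbuser_citus, which has DDL privileges):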

psql postgres://dbuser_citus:DBUser.Citus@10.10.10.10/citus -c 'SELECT * FROM pg_dist_node;'

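Using the Citus cluster

Before using a Citus cluster, we strongly recommend reading the official Citus documentation to understand its architecture and core concepts.

The key is to understand the five table types in Citus, along with their characteristics and use cases:

Distributed Table
Reference Table
Local Table
Local Managed Table
Schema Table

On the coordinator node, you can create distributed tables and reference tables and query them from any data node. Since 11.2, any Citus database node can act as the coordinator.

We can use pgbench to create some tables, distribute the main table (pgbench_accounts) across the nodes, and turn the other small tables into reference tables: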

PGURL=postgres://dbuser_citus:DBUser.Citus@10.10.10.10/citus
pgbench -i $PGURL

psql $PGURL <<-EOF
SELECT create_distributed_table('pgbench_accounts', 'aid'); SELECT truncate_local_data_after_distributing_table('public.pgbench_accounts');
SELECT create_reference_table('pgbench_branches')         ; SELECT truncate_local_data_after_distributing_table('public.pgbench_branches');
SELECT create_reference_table('pgbench_history')          ; SELECT truncate_local_data_after_distributing_table('public.pgbench_history');
SELECT create_reference_table('pgbench_tellers')          ; SELECT truncate_local_data_after_distributing_table('public.pgbench_tellers');
EOF

Run read/write tests:

pgbench -nv -P1 -c10 -T500 postgres://dbuser_citus:DBUser.Citus@10.10.10.10/citus      # connect directly to the coordinator on port 5432
pgbench -nv -P1 -c10 -T500 postgres://dbuser_citus:DBUser.Citus@10.10.10.10:6432/citus # via the connection pool: reduces client connection pressure and can improve overall throughput
pgbench -nv -P1 -c10 -T500 postgres://dbuser_citus:DBUser.Citus@10.10.10.13/citus      # any primary node can act as the coordinator
pgbench --select-only -nv -P1 -c10 -T500 postgres://dbuser_citus:DBUser.Citus@10.10.10.11/citus # read-only queries are possible

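A more serious production deployment

To use Citus in production, you usually need to set up streaming-replication physical replicas for the coordinator and each worker cluster.

For example, simu.yml defines a 10-node Citus cluster.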

pg-citus: # citus group
  hosts:
    10.10.10.50: { pg_group: 0, pg_cluster: pg-citus0 ,pg_vip_address: 10.10.10.60/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.51: { pg_group: 0, pg_cluster: pg-citus0 ,pg_vip_address: 10.10.10.60/24 ,pg_seq: 1, pg_role: replica }
    10.10.10.52: { pg_group: 1, pg_cluster: pg-citus1 ,pg_vip_address: 10.10.10.61/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.53: { pg_group: 1, pg_cluster: pg-citus1 ,pg_vip_address: 10.10.10.61/24 ,pg_seq: 1, pg_role: replica }
    10.10.10.54: { pg_group: 2, pg_cluster: pg-citus2 ,pg_vip_address: 10.10.10.62/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.55: { pg_group: 2, pg_cluster: pg-citus2 ,pg_vip_address: 10.10.10.62/24 ,pg_seq: 1, pg_role: replica }
    10.10.10.56: { pg_group: 3, pg_cluster: pg-citus3 ,pg_vip_address: 10.10.10.63/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.57: { pg_group: 3, pg_cluster: pg-citus3 ,pg_vip_address: 10.10.10.63/24 ,pg_seq: 1, pg_role: replica }
    10.10.10.58: { pg_group: 4, pg_cluster: pg-citus4 ,pg_vip_address: 10.10.10.64/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.59: { pg_group: 4, pg_cluster: pg-citus4 ,pg_vip_address: 10.10.10.64/24 ,pg_seq: 1, pg_role: replica }
  vars:
    pg_mode: citus                            # pgsql cluster mode: citus
    pg_version: 16                            # PostgreSQL major version used by the citus cluster
    pg_shard: pg-citus                        # citus shard name: pg-citus
    pg_primary_db: citus                      # primary database used by citus
    pg_vip_enabled: true                      # enable vip for citus cluster
    pg_vip_interface: eth1                    # vip interface for all members
    pg_dbsu_password: DBUser.Postgres         # enable dbsu password access for citus
    pg_extensions: [ citus, postgis, pgvector, topn, pg_cron, hll ]  # install these extensions
    pg_libs: 'citus, pg_cron, pg_stat_statements' # citus will be added by patroni automatically
    pg_users: [{ name: dbuser_citus ,password: DBUser.Citus ,pgbouncer: true ,roles: [ dbrole_admin ]    }]
    pg_databases: [{ name: citus ,owner: dbuser_citus ,extensions: [ citus, vector, topn, pg_cron, hll ] }]
    pg_parameters:
      cron.database_name: citus
      citus.node_conninfo: 'sslrootcert=/pg/cert/ca.crt sslmode=verify-full'
    pg_hba_rules:
      - { user: 'all' ,db: all  ,addr: 127.0.0.1/32  ,auth: ssl   ,title: 'all user ssl access from localhost' }
      - { user: 'all' ,db: all  ,addr: intra         ,auth: ssl   ,title: 'all user ssl access from intranet'  }

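We will cover a series of advanced Citus topics in follow-up tutorials:

Read/write splitting
Failure handling
Consistent backup and recovery
Advanced monitoring and troubleshooting
Connection pooling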

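16.12 - HA drill: handling the loss of 2 out of 3 nodes

A contingency plan for a typical high-availability scenario: two of three nodes have failed and high availability no longer works. How do you recover from this emergency?

If two nodes (a majority) of a classic 3-node HA deployment fail at the same time, the system usually cannot complete failover automatically and manual intervention is required.

First, assess the state of the other two servers. If they can be brought back up quickly, prefer doing that. Otherwise, enter the emergency stop-the-bleeding procedure.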
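
The reason automatic failover stops is majority quorum: a consensus store such as etcd needs more than half of its members to be alive. The sketch below spells out that arithmetic. The function names are hypothetical and the code does not talk to etcd or Patroni.

```shell
# quorum_size TOTAL
# Hypothetical helper: smallest majority of a cluster with TOTAL members.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum TOTAL ALIVE
# Succeed if ALIVE members still form a majority of TOTAL.
has_quorum() {
    [ "$2" -ge "$(quorum_size "$1")" ]
}
```

For a 3-node deployment, quorum_size 3 prints 2, so has_quorum 3 2 succeeds (one node lost) while has_quorum 3 1 fails, which is the situation this section deals with.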

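The emergency procedure assumes that your admin node has failed and only a single regular database node survives. In this situation, the fastest recovery procedure is:

Adjust the HAProxy configuration to direct traffic to the primary.
Stop Patroni and manually promote the PostgreSQL replica to primary.

Adjusting the HAProxy configuration

If you access the cluster by some means that bypasses HAProxy, you can skip this step. If you access the database cluster through HAProxy, you need to adjust the load balancer configuration and manually direct read/write traffic to the primary.

Edit the /etc/haproxy/<pg_cluster>-primary.cfg configuration file, where <pg_cluster> is the name of your PostgreSQL cluster, for example pg-meta.
Comment out the health check options to stop health checking.
In the server list, comment out the two failed machines and keep only the current primary server.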

listen pg-meta-primary
    bind *:5433
    mode tcp
    maxconn 5000
    balance roundrobin

    # comment out the following four health check lines
    #option httpchk                               # <---- remove this
    #option http-keep-alive                       # <---- remove this
    #http-check send meth OPTIONS uri /primary    # <---- remove this
    #http-check expect status 200                 # <---- remove this

    default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100
    server pg-meta-1 10.10.10.10:6432 check port 8008 weight 100

    # comment out the other two failed machines
    #server pg-meta-2 10.10.10.11:6432 check port 8008 weight 100 <---- comment this
    #server pg-meta-3 10.10.10.12:6432 check port 8008 weight 100 <---- comment this

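After adjusting the configuration, do not run systemctl reload haproxy yet; wait until the primary has been promoted and do both together. The effect of this configuration is that HAProxy no longer performs primary health checks (which use Patroni by default) and sends write traffic directly to the current primary.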
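
If you prefer not to edit the file by hand under pressure, the health-check part of the change can be sketched as a sed filter. The function name is hypothetical. It reads a configuration on standard input and writes the modified version to standard output, so review the result before replacing the real file. It does not touch the server lines.

```shell
# disable_health_checks
# Hypothetical helper: comment out the HTTP health check directives
# (option httpchk, option http-keep-alive, http-check ...) in an HAProxy
# listen section read from stdin. Already commented lines are left alone.
disable_health_checks() {
    sed -E 's/^([[:space:]]*)(option (httpchk|http-keep-alive)|http-check )/\1#\2/'
}
```

One way to use it is to write the output to a temporary file, compare it with the original using diff, and only then copy it into place.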

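Manually promoting the replica

Log in to the target server, switch to the dbsu user, run CHECKPOINT to flush to disk, stop Patroni, restart PostgreSQL, and run promote.

sudo su - postgres                     # switch to the database dbsu user
psql -c 'checkpoint; checkpoint;'      # two checkpoints to flush dirty pages, so the PG restart does not take too long
sudo systemctl stop patroni            # stop Patroni
pg-restart                             # bring PostgreSQL back up
pg-promote                             # promote the PostgreSQL replica to primary
psql -c 'SELECT pg_is_in_recovery();'  # a result of f means it has been promoted to primary

If you adjusted the HAProxy configuration above, you can now run systemctl reload haproxy to reload it and direct traffic to the new primary.

systemctl reload haproxy                # reload the HAProxy configuration and direct write traffic to this instance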

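Avoiding split-brain

After the emergency fix, the second priority is avoiding split-brain. You should prevent the other two servers from coming back online and forming a split-brain with the current primary, which would lead to data inconsistency.

A simple approach is:

Power off or disconnect the other two servers from the network, so that they cannot come back online uncontrolled.
Adjust the database connection string used by applications so that its HOST points directly to the primary on the only surviving server.

Then decide on the next step depending on the situation:

A: The two servers failed temporarily (e.g. network or power outage) and can be repaired in place and return to service.
B: The two servers failed permanently (e.g. hardware damage) and will be removed and decommissioned.

Recovery after a temporary failure

If the other two servers failed only temporarily and can return to service after repair, follow these steps to repair and rebuild:

Handle one failed server at a time, starting with the admin / INFRA node.
Start the failed server and stop Patroni once it is up.

Once the ETCD cluster regains quorum, it resumes working. At that point you can start Patroni on the surviving server (the current primary), which takes over the existing PostgreSQL and reacquires cluster leadership. After Patroni starts, put the cluster into maintenance mode.

systemctl restart patroni
pg pause <pg_cluster>

On the other two instances, as the postgres user, create the /pg/data/standby.signal marker file with touch to mark them as replicas, then start Patroni:

systemctl restart patroni

After confirming that the Patroni cluster identities/roles are correct, exit maintenance mode:

pg resume <pg_cluster>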

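Recovery after a permanent failure

After a permanent failure, first recover the ~/pigsty directory of the admin node. The two essential files are pigsty.yml and files/pki/ca/ca.key.

If you cannot retrieve these two files and have no backup of them, you can deploy a new Pigsty and migrate the existing cluster into the new deployment by means of a standby cluster.

Back up the pigsty directory regularly (for example, by keeping it under Git version control). Take this as a lesson and avoid the same mistake next time.

Repairing the configuration

You can use the surviving node as the new admin node: copy the ~/pigsty directory to it, then start adjusting the configuration. For example, replace the original default admin node 10.10.10.10 with the surviving node 10.10.10.12: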

all:
  vars:
    admin_ip: 10.10.10.12               # use the new admin node address
    node_etc_hosts: [10.10.10.12 h.pigsty a.pigsty p.pigsty g.pigsty sss.pigsty]
    infra_portal: {}                    # also update other settings that reference the old admin node IP (10.10.10.10)

  children:

    infra:                              # adjust the Infra cluster
      hosts:
        # 10.10.10.10: { infra_seq: 1 } # old Infra node
        10.10.10.12: { infra_seq: 3 }   # new Infra node

    etcd:                               # adjust the ETCD cluster
      hosts:
        #10.10.10.10: { etcd_seq: 1 }   # comment out this failed node
        #10.10.10.11: { etcd_seq: 2 }   # comment out this failed node
        10.10.10.12: { etcd_seq: 3 }    # keep the surviving node
      vars:
        etcd_cluster: etcd

    pg-meta:                            # adjust the PGSQL cluster configuration
      hosts:
        #10.10.10.10: { pg_seq: 1, pg_role: primary }
        #10.10.10.11: { pg_seq: 2, pg_role: replica }
        #10.10.10.12: { pg_seq: 3, pg_role: replica , pg_offline_query: true }
        10.10.10.12: { pg_seq: 3, pg_role: primary , pg_offline_query: true }
      vars:
        pg_cluster: pg-meta

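Repairing ETCD

Then run the following command to reset ETCD to a single-node cluster:

./etcd.yml -e etcd_safeguard=false -e etcd_clean=true

Follow the instructions in ETCD Reload Config to adjust the references to the ETCD endpoints.

Repairing INFRA

If the surviving node has no INFRA module, configure and install a new INFRA module on it. Run the following command to deploy the INFRA module to the surviving node:

./infra.yml -l 10.10.10.12

Repair monitoring on the current node:

./node.yml -t node_monitor

Repairing PGSQL

./pgsql.yml -t pg_conf                            # regenerate the PG configuration files
systemctl reload patroni                          # reload the Patroni configuration on the surviving node

After each module is repaired, you can follow the standard scale-out procedure to add new nodes to the cluster and restore its high availability.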

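16.13 - Restic: file system backup and recovery

How to use Restic to back up and restore file system data on a regular basis

Pigsty already takes care of backup and recovery for the PostgreSQL database itself, but what about ordinary files and directories?

For business software that uses PostgreSQL (for example Odoo or GitLab), you can consider using Restic to back up the file system part of the data regularly (for example /data/odoo).

Restic is a very handy open-source backup tool. It supports snapshots, incremental backups, encryption, compression, and more, and it can use many services, including S3/MinIO, as the backup repository. See the restic documentation for details.

Quick start

The Pigsty Infra repository provides the latest Restic RPM/DEB packages out of the box, and the official repositories of most Linux distributions also provide older versions.

yum install -y restic
apt install -y restic

For demonstration purposes, use a local directory as the backup repository. The repository only needs to be initialized once.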

mkdir -p /data/backups/restic
export RESTIC_REPOSITORY=/data/backups/restic
export RESTIC_PASSWORD=some-strong-password
restic init

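Next, you can back up, list snapshots, restore files, and so on:

restic backup /www/web.cc               # back up the /www/web.cc directory to the repository
restic snapshots                        # list backup snapshots
restic restore -t /tmp/web.cc 0b11f778  # restore snapshot 0b11f778 to /tmp/web.cc
restic check                            # check repository integrity regularly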
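
For regular backups, the commands above are typically wrapped in a small job and scheduled, for example with cron. The sketch below is one possible shape, not something shipped with Pigsty. The function name, the RESTIC override variable, and the default retention of 7 daily snapshots are all assumptions. It relies on RESTIC_REPOSITORY and RESTIC_PASSWORD being exported as shown earlier.

```shell
# backup_and_prune PATH [KEEP_DAILY]
# Hypothetical wrapper: back up PATH, then drop old snapshots, keeping the
# last KEEP_DAILY daily snapshots (default 7, an arbitrary choice).
# Set RESTIC=echo to print the restic arguments instead of running them.
backup_and_prune() {
    restic_bin="${RESTIC:-restic}"
    "$restic_bin" backup "$1" || return 1
    "$restic_bin" forget --keep-daily "${2:-7}" --prune
}
```

Running it once with RESTIC=echo is a cheap way to check which restic commands a scheduled job would issue before letting it touch the repository.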

Full command output

$ restic backup /www/web.cc
repository fcd37256 opened (repository version 2) successfully, password is correct
created new cache in /root/.cache/restic
no parent snapshot found, will read all files

Files:        5934 new,     0 changed,     0 unmodified
Dirs:         1622 new,     0 changed,     0 unmodified
Added to the repository: 1.570 GiB (1.167 GiB stored)

processed 5934 files, 1.694 GiB in 0:20
snapshot 0b11f778 saved

$ restic snapshots               # list backup snapshots
repository fcd37256 opened (repository version 2) successfully, password is correct
ID        Time                 Host        Tags        Paths
------------------------------------------------------------------
0b11f778  2025-03-19 15:25:21  pigsty.cc               /www/web.cc
------------------------------------------------------------------
1 snapshots

$ restic backup /www/web.cc
repository fcd37256 opened (repository version 2) successfully, password is correct
using parent snapshot 0b11f778

Files:           0 new,     0 changed,  5934 unmodified
Dirs:            0 new,     0 changed,  1622 unmodified
Added to the repository: 0 B   (0 B   stored)

processed 5934 files, 1.694 GiB in 0:00
snapshot 06cd9b5c saved

[03-19 15:25:59] root@pigsty.cc:/data/backups
$ restic snapshots               # list backup snapshots
repository fcd37256 opened (repository version 2) successfully, password is correct
ID        Time                 Host        Tags        Paths
------------------------------------------------------------------
0b11f778  2025-03-19 15:25:21  pigsty.cc               /www/web.cc
06cd9b5c  2025-03-19 15:25:58  pigsty.cc               /www/web.cc
------------------------------------------------------------------
2 snapshots

$ restic restore -t /www/web.cc 0b11f778
repository fcd37256 opened (repository version 2) successfully, password is correct
restoring <Snapshot 0b11f778 of [/www/web.cc] at 2025-03-19 15:25:21.514089814 +0800 HKT by root@pigsty.cc> to /www/web.cc

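Using object storage

There are many ways to store Restic backup data. This section shows how to use the MinIO that ships with Pigsty as the backup repository.

export AWS_ACCESS_KEY_ID=minioadmin              # default MinIO user
export AWS_SECRET_ACCESS_KEY=minioadmin          # default MinIO password
restic -r s3:http://sss.pigsty:9000/infra init   # use the default infra bucket as the backup destination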

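16.14 - JuiceFS: distributed file system

How to build JuiceFS, a cloud-native distributed file system, with the PostgreSQL and MinIO provided by Pigsty.

JuiceFS is a high-performance, cloud-native distributed file system.

This article describes how to use the PostgreSQL provided by Pigsty as the JuiceFS metadata engine and MinIO as the JuiceFS object storage engine to build a production-grade JuiceFS cluster.

Quick start

Create a four-node sandbox in full mode.

./configure -c full
./install.yml

Install JuiceFS and use object storage: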

JFSNAME=jfs
METAURL=postgres://dbuser_meta:DBUser.Meta@10.10.10.10:5432/meta
DATAURL=(
  --storage minio
  --bucket https://sss.pigsty:9000/infra
  --access-key minioadmin
  --secret-key minioadmin
)

juicefs format "${DATAURL[@]}" ${METAURL} ${JFSNAME}    # format the file system
juicefs mount ${METAURL} ~/jfs -d                # mount in the background
juicefs umount ~/jfs                             # unmount

A wilder option: PGFS, using the database as the file system

JFSNAME=jfs
METAURL=postgres://dbuser_meta:DBUser.Meta@10.10.10.10:5432/meta
DATAURL=(
  --storage postgres
  --bucket 10.10.10.10:5432/meta
  --access-key dbuser_meta
  --secret-key DBUser.Meta
)

juicefs format "${DATAURL[@]}" ${METAURL} ${JFSNAME}
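juicefs mount ${METAURL} ~/jfs -d                # mount in the background
juicefs umount ~/jfs                             # unmount

Standalone mode

The Pigsty Infra repository provides the latest JuiceFS RPM/DEB packages; just install them with the package manager.

The following command creates a local JuiceFS file system using local SQLite and the file system (/var/jfs):

juicefs format sqlite3:///tmp/jfs.db myjfs

Format output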

$ juicefs format sqlite3:///jfs.db myjfs
2025/03/19 12:07:56.956222 juicefs[62924] <INFO>: Meta address: sqlite3:///jfs.db [interface.go:504]
2025/03/19 12:07:56.958055 juicefs[62924] <INFO>: Data use file:///var/jfs/myjfs/ [format.go:484]
2025/03/19 12:07:56.966150 juicefs[62924] <INFO>: Volume is formatted as {
  "Name": "myjfs",
  "UUID": "1568ee2a-dc4c-4a0e-9788-be0490776dda",
  "Storage": "file",
  "Bucket": "/var/jfs/",
  "BlockSize": 4096,
  "Compression": "none",
  "EncryptAlgo": "aes256gcm-rsa",
  "TrashDays": 1,
  "MetaVersion": 1,
  "MinClientVersion": "1.1.0-A",
  "DirStats": true,
  "EnableACL": false
} [format.go:521]

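Then mount it locally with the following commands:

juicefs mount sqlite3:///tmp/jfs.db ~/jfs      # foreground mount, unmounted automatically on exit
juicefs mount sqlite3:///tmp/jfs.db ~/jfs -d   # daemon mount, must be unmounted manually
juicefs umount ~/jfs                           # unmount and exit the process

Deleting data

Use the following statement to clear the JuiceFS metadata in PostgreSQL: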

DROP TABLE IF EXISTS jfs_acl,jfs_chunk,jfs_chunk_ref,jfs_counter,jfs_delfile,jfs_delslices,jfs_detached_node,jfs_dir_quota,jfs_dir_stats,jfs_edge,jfs_flock,jfs_node,jfs_plock,jfs_session2,jfs_setting,jfs_sustained,jfs_symlink,jfs_xattr CASCADE;

Use the following command to empty the object storage bucket:

mcli rm --recursive --force infra/jfs

PGFS performance summary

Benchmark results on a second-hand physical machine:

METAURL=postgres://dbuser_meta:DBUser.Meta@:5432/meta
OPTIONS=(
  --storage postgres
  --bucket :5432/meta
  --access-key dbuser_meta
  --secret-key DBUser.Meta
  ${METAURL}
  jfs
)

juicefs format "${OPTIONS[@]}"
juicefs mount "${METAURL}" ~/jfs -d  # mount in the background
juicefs bench ~/jfs                  # run the benchmark
juicefs umount ~/jfs                 # unmount

$ juicefs bench ~/jfs                # run the benchmark
  Write big blocks: 1024/1024 [==============================================================]  178.5/s  used: 5.73782533s
   Read big blocks: 1024/1024 [==============================================================]  31.7/s   used: 32.314547037s
Write small blocks: 100/100 [==============================================================]  149.2/s  used: 670.127171ms
 Read small blocks: 100/100 [==============================================================]  543.4/s  used: 184.109596ms
  Stat small files: 100/100 [==============================================================]  1723.4/s used: 58.087752ms
Benchmark finished!
BlockSize: 1.0 MiB, BigFileSize: 1.0 GiB, SmallFileSize: 128 KiB, SmallFileCount: 100, NumThreads: 1
Time used: 42.2 s, CPU: 687.2%, Memory: 179.4 MiB
+------------------+------------------+---------------+
|       ITEM       |       VALUE      |      COST     |
+------------------+------------------+---------------+
|   Write big file |     178.51 MiB/s |   5.74 s/file |
|    Read big file |      31.69 MiB/s |  32.31 s/file |
| Write small file |    149.4 files/s |  6.70 ms/file |
|  Read small file |    545.2 files/s |  1.83 ms/file |
|        Stat file |   1749.7 files/s |  0.57 ms/file |
|   FUSE operation | 17869 operations |    3.82 ms/op |
|      Update meta |  1164 operations |    1.09 ms/op |
|       Put object |   356 operations |  303.01 ms/op |
|       Get object |   256 operations | 1072.82 ms/op |
|    Delete object |     0 operations |    0.00 ms/op |
| Write into cache |   356 operations |    2.18 ms/op |
|  Read from cache |   100 operations |    0.11 ms/op |
+------------------+------------------+---------------+

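16.15 - Cheap VPS

Pigsty hosts its documentation site on an inexpensive ClawCloud server.

As you can see, this site is hosted on Claw Cloud (阿爪云), a Singapore-based provider that is effectively a "lite edition" of Alibaba Cloud.

Rumor has it that this is a front company Alibaba Cloud set up in Singapore.

I use a China-optimized cloud server with 4 cores, 8 GB of RAM, a 200 GB disk, 1 Gbps bandwidth, and 2 TB of monthly traffic for 18 USD per month, hosted in the HK availability zone.

That is far cheaper than the EC2-style instances sold by Alibaba Cloud, Tencent Cloud, or AWS, especially for traffic. Charging 0.8 CNY per GB of traffic in mainland China is daylight robbery.

Access from mainland China is fairly fast, and ping to the Hong Kong region is around 50 ms, so I use it to host this site and also run the Pigsty demo on it.

If you need a server for a website or a TZ, this is worth considering. The referral link below saves you 10%, and I earn a small commission that helps cover server costs:

Of course, if you want to support the project, there is a more direct way:

Scan the Alipay QR code. Thank you for your support 🙏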

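17 - Application Templates

Use out-of-the-box configurations to launch business applications that rely on PostgreSQL as their core database.

PostgreSQL is one of the world's most popular databases, and countless well-known applications are built on it. Pigsty provides ready-made provisioning templates for them.

Pigsty's templates use external PostgreSQL, MinIO, and Etcd services created and managed by Pigsty, with backup and recovery, high availability, monitoring, logging and alerting, IaC, connection pooling, and load balancing included. They also take care of "last mile" issues such as infrastructure, Nginx proxying, and certificate requests. Compared with the "toy mode" of launching the whole stack, database included, in Docker, this delivers the capabilities an enterprise deployment needs.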

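17.1 - Dify: Self-Hosted AI Workflow Platform

How to self-host Dify, an AI workflow and LLMOps platform, with Pigsty, using external PostgreSQL and PGVector for storage.

Dify is a generative AI application engine and an open-source LLM application development platform. It covers agent building, AI workflow orchestration, RAG retrieval, and model management, helping users build and operate generative AI applications.

Pigsty supports self-hosted Dify: you can launch Dify with a single command, keep its critical state in an external PostgreSQL managed by Pigsty, and use pgvector in the same PostgreSQL as the vector database, which further simplifies deployment.

Pigsty v3.4 currently supports Dify v1.1.3.

Quick start

On a fresh Linux x86 / ARM server running a compatible distribution, run: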

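curl -fsSL https://repo.pigsty.cc/get | bash; cd ~/pigsty
./bootstrap                # install Pigsty dependencies
./configure -c app/dify    # use the Dify config template
vi pigsty.yml              # change passwords, domain names, keys, and other parameters
./install.yml              # install Pigsty
./docker.yml               # install the Docker module
./app.yml                  # launch Dify

Dify listens on port 5001 by default. Open http://<ip>:5001 in a browser, set the initial user and password, and log in.

Once Dify is up, install any plugins you need and configure the system model, and it is ready to use.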

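Why self-host

There are many reasons to self-host Dify, but the main one is data security. The Docker Compose template that Dify provides uses a bare default database image and lacks the high availability, disaster recovery, monitoring, IaC, and PITR capabilities that enterprise applications require.

Pigsty solves these problems for Dify: it launches every component from a single configuration file and uses mirror sites to work around network restrictions in mainland China, which makes deploying and delivering Dify smooth. In one pass it provides the PostgreSQL primary database and PGVector vector database, MinIO object storage, Redis, Prometheus monitoring with Grafana dashboards, the Nginx reverse proxy, and free HTTPS certificates.

Pigsty ensures that all Dify state lives in externally managed services, including the metadata in PostgreSQL and the remaining data on the file system. The Dify instance launched with Docker Compose is therefore a simple stateless application that can be destroyed and rebuilt at any time, which greatly simplifies operations.

Single-node installation

Let's start with a single-node Dify deployment. High-availability deployment for production is covered later.

First, use the standard Pigsty installation procedure to install the PostgreSQL instance that Dify needs: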

curl -fsSL https://repo.pigsty.cc/get | bash; cd ~/pigsty
./bootstrap               # prepare Pigsty dependencies
./configure -c app/dify   # use the Dify application template
vi pigsty.yml             # edit the config file, change domain names and passwords
./install.yml             # install Pigsty and the databases

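When you run ./configure -c app/dify, Pigsty generates the Pigsty configuration file from the conf/app/dify.yml template and your current environment. Adjust the passwords, domain names, and related parameters in the generated pigsty.yml to your needs, then run ./install.yml to perform the standard installation.

Next, run docker.yml to install Docker and Docker Compose, then use the app.yml playbook to deploy Dify: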

./docker.yml              # install Docker and Docker Compose
./app.yml                 # launch the stateless Dify components with Docker

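You can now reach the Dify web console on your local network at http://<your_ip_address>:5001.

You will be prompted to set the default username, email, and password on first login.

You can also use the locally resolved placeholder domain dify.pigsty, or follow the configuration below to use a real domain name with an HTTPS certificate.

The installation itself is simple: a few commands and about ten minutes. The real work is getting the configuration parameters right, which is covered in detail below.

Configuration details

When you run ./configure -c app/dify, Pigsty generates the Pigsty configuration file from the conf/app/dify.yml template and your current environment. The default configuration file is explained below: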

all:
  children:

    dify:
      hosts: { 10.10.10.10: {} }
      vars:
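        app: dify   # name of the app to install (must be defined in apps)
        apps:       # define all applications
          dify:     # app name, should have a corresponding ~/pigsty/app/dify folder

            file:   # data directories to create: /data/dify stores plugins and other files
              - { path: /data/dify ,state: directory ,mode: 0755 }

            conf:   # override the /opt/dify/.env config file

              # Domain name used by Dify. Replace it with your real domain; if you keep this default, add a local/intranet DNS record yourself
              NGINX_SERVER_NAME: dify.pigsty
              # Key used for signing and encryption, generate one with `openssl rand -base64 42` (CHANGE THIS KEY!)
              SECRET_KEY: sk-9f73s3ljTXVcMT3Blb3ljTqtsKiGHXVcMT3BlbkFJLK7U
              # Expose the Dify nginx service on port 5001 by default
              DIFY_PORT: 5001
              # Where Dify stores its files. The default is ./volume; here we use the /data/dify directory created above
              DIFY_DATA: /data/dify

              # Proxy and mirror settings. In mainland China you can use the Tsinghua PIP mirror to speed up downloads
              #PIP_MIRROR_URL: https://pypi.tuna.tsinghua.edu.cn/simple
              #SANDBOX_HTTP_PROXY: http://10.10.10.10:12345
              #SANDBOX_HTTPS_PROXY: http://10.10.10.10:12345

              # Database credentials. PGVECTOR and PostgreSQL share one database here, using the dify user in the pg-meta cluster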
              DB_USERNAME: dify
              DB_PASSWORD: difyai123456
              DB_HOST: 10.10.10.10
              DB_PORT: 5432
              DB_DATABASE: dify
              VECTOR_STORE: pgvector
              PGVECTOR_HOST: 10.10.10.10
              PGVECTOR_PORT: 5432
              PGVECTOR_USER: dify
              PGVECTOR_PASSWORD: difyai123456
              PGVECTOR_DATABASE: dify
              PGVECTOR_MIN_CONNECTION: 2
              PGVECTOR_MAX_CONNECTION: 10

    pg-meta:
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars:
        pg_cluster: pg-meta
        pg_users:
          - { name: dify ,password: difyai123456 ,pgbouncer: true ,roles: [ dbrole_admin ] ,superuser: true ,comment: dify superuser }
        pg_databases:
          - { name: dify    ,owner: dify ,revokeconn: true ,comment: dify main database  }
          - { name: dify_fs ,owner: dify ,revokeconn: true ,comment: dify file system database }
        pg_hba_rules:
          - { user: dify ,db: all ,addr: 172.17.0.0/16  ,auth: pwd ,title: 'allow dify access from local docker network' }
        node_crontab: [ '00 01 * * * postgres /pg/bin/pg-backup full' ] # take a full backup every day at 1 AM

    infra: { hosts: { 10.10.10.10: { infra_seq: 1 } } }
    etcd:  { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }
    #minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

  vars:                               # global variables
    version: v3.4.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node IP address
    region: default                   # upstream mirror region: default|china|europe
    node_tune: oltp                   # node tuning profile: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # PostgreSQL tuning profile: {oltp,olap,tiny,crit}.yml

    docker_enabled: true              # enable docker on the app group
    #docker_registry_mirrors: ["https://docker.1ms.run"] # use a registry mirror in mainland China, otherwise configure proxy_env

    proxy_env:                        # global proxy env for downloading packages and pulling docker images
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.tsinghua.edu.cn"
      #http_proxy:  127.0.0.1:12345 # add your proxy here for downloading packages or pulling images
      #https_proxy: 127.0.0.1:12345 # the usual proxy format is http://user:pass@proxy.xxx.com
      #all_proxy:   127.0.0.1:12345

    infra_portal: # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }
      #minio        : { domain: m.pigsty    ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }
      
      dify:                            # nginx server config for dify
        domain: dify.pigsty            # REPLACE WITH YOUR OWN DOMAIN!
        endpoint: "10.10.10.10:5001"   # dify service endpoint: IP:PORT
        websocket: true                # add websocket support
        certbot: dify.pigsty           # certbot cert name, request it with `make cert`

    #----------------------------------#
    # CHANGE THE DEFAULT PASSWORDS HERE!
    #----------------------------------#
    #grafana_admin_username: admin
    grafana_admin_password: pigsty
    #pg_admin_username: dbuser_dba
    pg_admin_password: DBUser.DBA
    #pg_monitor_username: dbuser_monitor
    pg_monitor_password: DBUser.Monitor
    #pg_replication_username: replicator
    pg_replication_password: DBUser.Replicator
    #patroni_username: postgres
    patroni_password: Patroni.API
    #haproxy_admin_username: admin
    haproxy_admin_password: pigsty
    #minio_access_key: minioadmin
    minio_secret_key: minioadmin      # minio root secret key, `minioadmin` by default

    repo_extra_packages: [ pg17-main ]
    pg_version: 17

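Checklist

The configuration items you need to review are listed below:

Hardware/software: prepare one Linux x86_64/arm64 server with a freshly installed mainstream Linux distribution
Network/permissions: passwordless ssh login, and the user has passwordless sudo
Make sure the machine has a static internal IPv4 address and can reach the internet.
If you access it over the public internet, make sure you own a usable domain name that points to the node's public IP address
Make sure you use the app/dify config template and adjust its parameters as needed
- Run configure -c app/dify and enter the node's primary internal IP address, or pass it with the -i <primary_ip> command-line option
- Have you changed all password-related parameters? [optional]
  - grafana_admin_password: pigsty, Grafana admin password
  - pg_admin_password: DBUser.DBA, PostgreSQL superuser password
  - pg_monitor_password: DBUser.Monitor, PostgreSQL monitoring user password
  - pg_replication_password: DBUser.Replicator, PostgreSQL replication user password
  - patroni_password: Patroni.API, Patroni HA component password
  - haproxy_admin_password: pigsty, load balancer admin password
  - minio_access_key: minioadmin, MinIO root user name
  - minio_secret_key: minioadmin, MinIO root user secret key
- Have you changed the password of the PostgreSQL business user, and the app configuration that uses it?
  - The user dify and password difyai123456 are the defaults Pigsty generates for Dify; change them to suit your environment
  - In the Dify config block, change DB_USERNAME, DB_PASSWORD, PGVECTOR_USER, PGVECTOR_PASSWORD, and related parameters to match
- Have you changed Dify's default encryption key?
  - Generate a random string with openssl rand -base64 42 and put it in the SECRET_KEY parameter
- Have you changed the domain name Dify uses?
  - Replace the placeholder domain dify.pigsty with your real domain, for example dify.pigsty.cc
  - You can change it with sed -i -e 's/dify\.pigsty/dify.pigsty.cc/g' pigsty.yml (run it only once, since the new name also contains the old one)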
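
The password items in this checklist can be partly automated. The sketch below scans a config file for default values copied from the app/dify template and reports any that are still present. The function name `check_default_secrets` is hypothetical, and the list is deliberately incomplete: very short defaults such as pigsty and minioadmin are left out because a plain string search would match them in unrelated places.

```shell
# Hypothetical helper: report default credentials that still appear in a
# Pigsty config file. The values are copied from the app/dify template.
check_default_secrets() {
  conf="${1:-pigsty.yml}"
  found=0
  for secret in 'difyai123456' 'DBUser.DBA' 'DBUser.Monitor' \
                'DBUser.Replicator' 'Patroni.API' \
                'sk-9f73s3ljTXVcMT3Blb3ljTqtsKiGHXVcMT3BlbkFJLK7U'; do
    # -F: match a fixed string, -q: no output, only the exit status
    if grep -Fq -- "$secret" "$conf"; then
      echo "default value still present: $secret"
      found=1
    fi
  done
  return "$found"   # 0 when nothing was found, 1 otherwise
}
```

Running `check_default_secrets pigsty.yml` before `./install.yml` gives a non-zero exit status while any of the listed defaults remain.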

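Domain and certificate

To use a real domain name with an HTTPS certificate, change the following in the pigsty.yml configuration file:

The dify domain in the infra_portal parameter
Preferably set certbot_email, the address that receives certificate expiry notices
The NGINX_SERVER_NAME parameter of dify, set to your real domain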

all:
  children:                            # cluster definitions
    dify:                              # Dify group
      vars:                            # Dify group variables
        apps:                          # application configs
          dify:                        # Dify app definition
            conf:                      # Dify app config
              NGINX_SERVER_NAME: dify.pigsty

  vars:                                # global parameters
    #certbot_sign: true                # request a free HTTPS certificate with Certbot
    certbot_email: your@email.com      # email used for the certificate request, receives expiry notices, optional
    infra_portal:                      # Nginx server configuration
      dify:                            # Dify server definition
        domain: dify.pigsty            # replace with your own domain here!
        endpoint: "10.10.10.10:5001"   # Dify IP and port (configured automatically by default)
        websocket: true                # Dify requires websocket
        certbot: dify.pigsty           # Certbot certificate name

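Request the Nginx certificate with the following command:

# request the certificate; you can also run the /etc/nginx/sign-cert script manually
make cert

# the Makefile shortcut above runs the following playbook tasks:
./infra.yml -t nginx_certbot,nginx_reload -e certbot_sign=true

Run the app.yml playbook to relaunch Dify so that the NGINX_SERVER_NAME setting takes effect.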

./app.yml

File backup

You can back up the Dify file system with restic. Dify's data files live under /data/dify. First initialize a backup repository:

export RESTIC_REPOSITORY=/data/backups/dify   # dify backup repository
export RESTIC_PASSWORD=some-strong-password   # backup encryption password
mkdir -p ${RESTIC_REPOSITORY}                 # create the dify backup directory
restic init

Once the restic repository exists, back up Dify with the following commands:

export RESTIC_REPOSITORY=/data/backups/dify   # dify backup repository
export RESTIC_PASSWORD=some-strong-password   # backup encryption password

restic backup /data/dify                      # back up the /data/dify data directory to the repository
restic snapshots                              # list backup snapshots
restic restore -t /data/dify 0b11f778         # restore snapshot 0b11f778 to /data/dify
restic check                                  # check repository integrity periodically
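
For recurring backups, the commands above can be wrapped in a small function. This is a minimal sketch, not something Pigsty ships: the function name `dify_backup`, the `DRY_RUN` switch, and the retention values (7 daily, 4 weekly snapshots) are all assumptions chosen for illustration. `RESTIC_PASSWORD` must be set by the caller.

```shell
# Hypothetical wrapper around the restic commands above.
# Set DRY_RUN=1 to print the commands instead of running them.
dify_backup() {
  run=""
  [ "${DRY_RUN:-0}" = "1" ] && run="echo"
  export RESTIC_REPOSITORY="${RESTIC_REPOSITORY:-/data/backups/dify}"
  # back up, then expire old snapshots, then verify the repository
  $run restic backup /data/dify &&
  $run restic forget --keep-daily 7 --keep-weekly 4 --prune &&
  $run restic check
}
```

Calling `DRY_RUN=1 dify_backup` shows what would be executed; the same function could then be invoked from cron once the output looks right.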

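A more reliable alternative is to mount MinIO object storage at /data/dify with JuiceFS, so that file state is kept in MinIO/S3.

If you want to keep all data in PostgreSQL, JuiceFS can also store the file system data in PostgreSQL.

For example, you can create a separate dify_fs database and use it for both JuiceFS metadata and data: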

METAURL=postgres://dify:difyai123456@:5432/dify_fs
OPTIONS=(
  --storage postgres
  --bucket :5432/dify_fs
  --access-key dify
  --secret-key difyai123456
  ${METAURL}
  jfs
)
juicefs format "${OPTIONS[@]}"           # create a PostgreSQL-backed file system
juicefs mount "${METAURL}" /data/dify -d # mount at /data/dify in the background
juicefs bench /data/dify                 # run the benchmark
juicefs umount /data/dify                # unmount

Further reading

Dify self-hosting FAQ

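17.2 - Odoo: Self-Hosted Open-Source ERP

How to launch Odoo, an out-of-the-box enterprise application suite, and manage its backend PostgreSQL database with Pigsty.

Odoo is an open-source enterprise ERP system covering CRM, sales, purchasing, inventory, manufacturing, accounting, and other business management functions. It is also a typical web application that stores its data in PostgreSQL.

Bring all of your business into one platform: simple, efficient, and economical. Your own ERP!

Quick start

With a good network connection, the following commands launch an Odoo instance backed by an external PostgreSQL database managed by Pigsty: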

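curl -fsSL https://repo.pigsty.cc/get | bash; cd ~/pigsty
pig sty init               # install Pigsty
./bootstrap                # install Pigsty dependencies
./configure -c app/odoo    # use the Odoo config template (change the passwords in the generated pigsty.yml at this step!)
./install.yml              # install Pigsty
./docker.yml               # install the Docker module
./app.yml                  # launch Odoo

Odoo listens on port 8069 by default. Open http://<ip>:8069 in a browser. The default username and password are both admin.

Note that the stateless part of Odoo is launched with Docker, and DockerHub is blocked in mainland China, so you may need to configure a registry mirror or a proxy server to complete the last step. The Pigsty commercial edition can take care of this for you.

Configuration file

The template conf/app/odoo.yml defines the resources needed for a single-node Odoo.

After running configure, change the password parameters to suit your needs. Passwords must be changed consistently: if you change the password of the odoo database user in pg_users, you must also change the all.children.odoo.vars.apps.<odoo>.conf.PG_PASSWORD parameter, otherwise Odoo cannot connect to PostgreSQL.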

all:
  children:

    # the odoo application (default username & password: admin/admin)
    odoo:
      hosts: { 10.10.10.10: {} }
      vars:
        app: odoo   # specify app name to be installed (in the apps)
        apps:       # define all applications
          odoo:     # app name, should have corresponding ~/app/odoo folder
            file:   # optional directory to be created
              - { path: /data/odoo         ,state: directory, owner: 100, group: 101 }
              - { path: /data/odoo/webdata ,state: directory, owner: 100, group: 101 }
              - { path: /data/odoo/addons  ,state: directory, owner: 100, group: 101 }
            conf:   # override /opt/<app>/.env config file
              PG_HOST: 10.10.10.10            # postgres host
              PG_PORT: 5432                   # postgres port
              PG_USERNAME: odoo               # postgres user
              PG_PASSWORD: DBUser.Odoo        # postgres password
              ODOO_PORT: 8069                 # odoo app port
              ODOO_DATA: /data/odoo/webdata   # odoo webdata
              ODOO_ADDONS: /data/odoo/addons  # odoo plugins
              ODOO_DBNAME: odoo               # odoo database name
              ODOO_VERSION: 18.0              # odoo image version

    # the odoo database
    pg-odoo:
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars:
        pg_cluster: pg-odoo
        pg_users:
          - { name: odoo    ,password: DBUser.Odoo ,pgbouncer: true ,roles: [ dbrole_admin ] ,createdb: true ,comment: admin user for odoo service }
          - { name: odoo_ro ,password: DBUser.Odoo ,pgbouncer: true ,roles: [ dbrole_readonly ]  ,comment: read only user for odoo service  }
          - { name: odoo_rw ,password: DBUser.Odoo ,pgbouncer: true ,roles: [ dbrole_readwrite ] ,comment: read write user for odoo service }
        pg_databases:
          - { name: odoo ,owner: odoo ,revokeconn: true ,comment: odoo main database  }
        pg_hba_rules:
          - { user: all ,db: all ,addr: 172.17.0.0/16  ,auth: pwd ,title: 'allow access from local docker network' }
          - { user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes' }

    infra: { hosts: { 10.10.10.10: { infra_seq: 1 } } }
    etcd:  { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }
    #minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

  vars:                               # global variables
    version: v3.3.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: default                   # upstream mirror region: default|china|europe
    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml

    docker_enabled: true              # enable docker on app group
    #docker_registry_mirrors: ["https://docker.m.daocloud.io"] # use dao cloud mirror in mainland china
    proxy_env:                        # global proxy env when downloading packages & pull docker images
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.tsinghua.edu.cn"
      #http_proxy:  127.0.0.1:12345 # add your proxy env here for downloading packages or pull images
      #https_proxy: 127.0.0.1:12345 # usually the proxy is format as http://user:pass@proxy.xxx.com
      #all_proxy:   127.0.0.1:12345

    infra_portal: # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }
      minio        : { domain: m.pigsty    ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }
      odoo         : { domain: odoo.pigsty, endpoint: "127.0.0.1:8069"   ,websocket: true }  #cert: /path/to/crt ,key: /path/to/key
      # setup your own domain name here ^^^, or use default domain name, or ip + 8069 port direct access
      # certbot --nginx --agree-tos --email your@email.com -n -d odoo.your.domain    # replace with your email & odoo domain

    #----------------------------------#
    # Credential: CHANGE THESE PASSWORDS
    #----------------------------------#
    #grafana_admin_username: admin
    grafana_admin_password: pigsty
    #pg_admin_username: dbuser_dba
    pg_admin_password: DBUser.DBA
    #pg_monitor_username: dbuser_monitor
    pg_monitor_password: DBUser.Monitor
    #pg_replication_username: replicator
    pg_replication_password: DBUser.Replicator
    #patroni_username: postgres
    patroni_password: Patroni.API
    #haproxy_admin_username: admin
    haproxy_admin_password: pigsty

    repo_modules: infra,node,pgsql,docker
    repo_packages: [ node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-utility, docker ]
    repo_extra_packages: [ pg17-main ]
    pg_version: 17
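
Because the Odoo password appears in two places in this file, a quick consistency check can catch a mismatch before deployment. The sketch below is an illustration only: the function name `check_odoo_password` is hypothetical, and the `sed` patterns assume the exact single-line layout of the template above with unquoted passwords. A real YAML parser would be more robust.

```shell
# Hypothetical check: compare apps.odoo.conf.PG_PASSWORD with the password
# of the odoo user in pg_users, based on the template layout shown above.
check_odoo_password() {
  conf="${1:-pigsty.yml}"
  # value after "PG_PASSWORD:", up to the first space or comment
  app_pw=$(sed -n 's/^ *PG_PASSWORD: *\([^ #]*\).*/\1/p' "$conf" | head -n 1)
  # password of the entry "name: odoo" (odoo_ro / odoo_rw do not match)
  db_pw=$(sed -n 's/.*name: odoo *,password: *\([^ ,]*\).*/\1/p' "$conf" | head -n 1)
  if [ -n "$app_pw" ] && [ "$app_pw" = "$db_pw" ]; then
    echo "ok: PG_PASSWORD matches the odoo database user"
  else
    echo "mismatch: PG_PASSWORD='$app_pw' pg_users.odoo='$db_pw'"
    return 1
  fi
}
```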

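Odoo add-ons

The community offers many Odoo modules. To install one, download it and place it in the addons folder.

In the configuration file above, the addons directory defaults to /data/odoo/addons. Putting an add-on zip package into that directory "installs" it.

To enable these modules, first switch Odoo to developer mode: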

Settings -> Generic Settings -> Developer Tools -> Activate the developer Mode

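Then go to Apps -> Update Apps List. The extra modules now appear in the panel and can be installed from there.

For commonly used free modules, see the linked reference, including the Accounting Kit module that most users need.

Exposing the service

You can reach the Odoo web interface directly at port 8069 of the target server's IP address, but that is not appropriate for serious use. The following explains how to access Odoo through a domain name and how to configure it in Pigsty.

The configuration file above already defines a reverse proxy for Odoo on the Infra Nginx, so you can reach the Odoo web interface through the odoo.pigsty domain.

infra_portal: # Nginx server definitions
  # ...
  odoo : { domain: odoo.pigsty, endpoint: "127.0.0.1:8069" ,websocket: true }  #cert: /path/to/crt ,key: /path/to/key

To use a different domain, change the domain parameter. If Odoo runs on another server, change the endpoint parameter. Then run ./infra.yml -t nginx_config,nginx_reload to apply the change.

Accessing Odoo by domain name always requires DNS resolution. There are three typical ways to set it up:

Use a real domain and point it to your server's public IP through your cloud or DNS provider
Use internal DNS and add a record pointing to your server's internal IP on your intranet DNS service
Use local static DNS and add a static record to /etc/hosts on the machine where your browser runs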
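
For the third option, the static record is a single line of the form `<ip> <domain>`. The sketch below adds such a line only if the domain is not already present. The function name `add_host_record` is hypothetical, and the IP and domain in the usage note are the placeholder values from the template. Writing to /etc/hosts normally requires root.

```shell
# Hypothetical helper: append "<ip> <domain>" to a hosts file unless some
# line already lists exactly that domain as a host name.
add_host_record() {
  ip="$1"; domain="$2"; hosts="${3:-/etc/hosts}"
  if awk -v d="$domain" \
       '{ for (i = 2; i <= NF; i++) if ($i == d) found = 1 } END { exit !found }' \
       "$hosts"; then
    echo "record for $domain already present"
  else
    printf '%s %s\n' "$ip" "$domain" >> "$hosts"
  fi
}
```

For example, `add_host_record 10.10.10.10 odoo.pigsty` run as root on the client machine would make the placeholder domain resolve to the template's node address.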

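HTTPS certificates

To access Odoo over HTTPS, you need an HTTPS certificate.

By default Pigsty issues a certificate for your Odoo service from its self-signed CA. Browsers do not trust this certificate and will show a security warning. Your options are:

Click through the "I understand the risk, continue" prompt
In Chrome, type thisisunsafe on the warning page to bypass certificate verification
Add the pigsty-ca created by Pigsty to your trusted root CA list
Pay for a commercial HTTPS certificate
Request a free HTTPS certificate with certbot (legitimate and recommended!)

If you already have an HTTPS certificate, specify cert and key in infra_portal: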

infra_portal:
  # ...
  odoo : { domain: odoo.pigsty.cc, endpoint: "127.0.0.1:8069" ,websocket: true ,cert: /etc/cert/odoo.pigsty.cc.crt   ,key: /etc/cert/odoo.pigsty.cc.key  }

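Then run ./infra.yml -t nginx_config,nginx_launch to update the server configuration and apply it.

Free HTTPS certificates

If you would rather not pay for an HTTPS certificate, the simplest option is a free certificate from Let's Encrypt.

Pigsty integrates certbot by default. A detailed tutorial is available, but the core is this single command:

certbot --nginx --agree-tos --email your@email.com -n -d odoo.pigsty.cc

Replace the email address and the domain with your own, then follow the prompts. The certificate is requested and configured automatically.

Note that requesting a free HTTPS certificate with certbot requires:

Your server has network access and is reachable from the public internet on ports 80/443.
Your domain resolves to the server's public IP address, that is, a correct A record is configured at your DNS provider.

After issuing a certificate, Certbot by default modifies the Nginx configuration to redirect the HTTP server to HTTPS, which may not be what you want. Instead, you can register the domain that Certbot has already signed in the infra_portal parameter of the Pigsty configuration file, so that Pigsty renders it into the Nginx configuration.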

infra_portal:
  # ...
  odoo : { domain: odoo.pigsty.cc, endpoint: "127.0.0.1:8069" ,websocket: true ,cert: /etc/cert/odoo.pigsty.cc.crt ,key: /etc/cert/odoo.pigsty.cc.key  }

Certificate is saved at: /etc/letsencrypt/live/pigsty.cc/fullchain.pem
Key is saved at:         /etc/letsencrypt/live/pigsty.cc/privkey.pem

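Certbot prints where the certificate and key were saved, as shown above. Take the certificate name from the middle of that path (pigsty.cc here), set it as the certbot field of the odoo entry in infra_portal, and rerun: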

./infra.yml -t nginx_config,nginx_launch
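
The certificate name is simply the directory under /etc/letsencrypt/live/ in the paths Certbot prints. As a small illustration, it can be extracted like this; the function name `certbot_name_from_path` is hypothetical.

```shell
# Hypothetical helper: derive the Certbot certificate name from a path such
# as /etc/letsencrypt/live/<name>/fullchain.pem
certbot_name_from_path() {
  basename "$(dirname "$1")"
}

certbot_name_from_path /etc/letsencrypt/live/pigsty.cc/fullchain.pem   # prints pigsty.cc
```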

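17.3 - Self-Hosted Supabase: The Go-To Database for Startups Going Global

How to self-host an enterprise-grade Supabase with Pigsty on local or cloud physical machines, bare metal, or virtual machines.

Supabase is great, and owning your own Supabase is even better. This article describes how to self-host an enterprise-grade Supabase on local or cloud physical machines, bare metal, or virtual machines.

Short version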

curl -fsSL https://repo.pigsty.cc/get | bash; cd ~/pigsty
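./bootstrap               # prepare Pigsty dependencies
./configure -c app/supa   # use the Supabase application template
vi pigsty.yml             # edit the config file, change domain names and passwords
./install.yml             # install Pigsty and the databases
./docker.yml              # install Docker and Docker Compose
./app.yml                 # launch the stateless Supabase components with Docker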

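What is Supabase?

Supabase is a BaaS (Backend as a Service) and an open-source alternative to Firebase. It wraps PostgreSQL and adds authentication, messaging, edge functions, and object storage, and it generates REST and GraphQL APIs automatically from the PostgreSQL schema.

Supabase aims to give developers a one-stop backend, reducing the complexity of building and maintaining backend infrastructure so they can focus on the frontend and the user experience. In plain terms: developers can skip most backend work and ship quickly knowing only database design and frontend development.

Supabase is currently the most popular open-source project in the PostgreSQL ecosystem, with around 80,000 stars on GitHub. Together with Neon and Cloudflare it is nicknamed a "cyber bodhisattva" because all three offer generous free cloud plans. Supabase and Neon have become the database of choice for many startups: convenient to use and free to start with.

Why self-host Supabase?

At small scale (within 4c8g), the Supabase cloud service is excellent value. So if the cloud service is this good, why self-host?

The most obvious reason is the one discussed in "Are Cloud Databases an IQ Tax?": once your scale grows beyond the range where cloud computing fits, costs can easily explode. Today, sufficiently reliable local enterprise NVMe SSDs have a cost-performance advantage of three to four orders of magnitude over cloud storage, and self-hosting makes better use of that.

Another important reason is functionality. The Supabase cloud service is limited for the same reason RDS is: many powerful PG extensions cannot be offered as a cloud service because of multi-tenant security challenges and licensing. Although extensions are a core selling point of Supabase, only 64 are available in the cloud service, while Pigsty provides 421 out-of-the-box PG extensions.

In addition, although Supabase aims to be an open-source Google Firebase alternative without vendor lock-in, the bar for self-hosting an enterprise-grade Supabase is not low. Supabase bundles a number of PG extensions that it develops and maintains itself, and these are not available in the official PGDG repository. This is a form of implicit vendor lock-in that keeps users from self-hosting with anything other than the supabase/postgres Docker image.

Pigsty solves these problems. We package the 10 missing extensions that Supabase develops or depends on as ready-to-use RPM/DEB packages, available on all mainstream Linux distributions:

pg_graphql: GraphQL support inside PostgreSQL, Rust extension, provided by PIGSTY
pg_jsonschema: JSON Schema validation, Rust extension, provided by PIGSTY
wrappers: Supabase's bundle of foreign data wrappers, Rust extension, provided by PIGSTY
index_advisor: query index advisor, SQL extension, provided by PIGSTY
pg_net: asynchronous non-blocking HTTP/HTTPS requests from SQL (supabase), C extension, provided by PIGSTY
vault: store encrypted credentials in Vault (supabase), C extension, provided by PIGSTY
pgjwt: JSON Web Token API implemented in PostgreSQL (supabase), SQL extension, provided by PIGSTY
pgsodium: encrypted table data storage (TDE), extension, provided by PIGSTY
supautils: secures database clusters in cloud environments, C extension, provided by PIGSTY
pg_plan_filter: blocks specific queries by filtering on execution plan cost, C extension, provided by PIGSTY

Most of these extensions are installed by default in the self-hosted Supabase deployment. See the list of available extensions to enable others as needed.

Pigsty also takes care of building the underlying highly available PostgreSQL cluster and highly available MinIO object storage cluster, as well as deploying the Docker runtime, configuring Nginx domains, and issuing HTTPS certificates. In the end you can launch any number of stateless Supabase containers with Docker Compose and back them with enterprise-grade PostgreSQL and MinIO managed by Pigsty. Even the Nginx reverse proxy is already configured for you.

With this self-hosted architecture you gain the freedom to choose the kernel version (PG 15-17), to add any of the 421 extensions, to scale Supabase/Postgres/MinIO, to be relieved of database operations, and to leave vendor lock-in behind. Compared with using the Supabase cloud service, the extra cost is only preparing one (or a few) physical or virtual machines, typing a few commands, and waiting ten-odd minutes.

Single-node quick start

Let's start with a single-node Supabase deployment. Multi-node high-availability deployment is covered later.

First, use the standard Pigsty installation procedure to install the MinIO and PostgreSQL instances that Supabase needs. Then run docker.yml and app.yml to finish the job by launching the stateless Supabase containers. Supabase is then ready to use (default ports 8000/8443).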

curl -fsSL https://repo.pigsty.cc/get | bash; cd ~/pigsty
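./bootstrap               # prepare Pigsty dependencies
./configure -c app/supa   # use the Supabase application template
vi pigsty.yml             # edit the config file, change domain names and passwords
./install.yml             # install Pigsty and the databases
./docker.yml              # install Docker and Docker Compose
./app.yml                 # launch the stateless Supabase components with Docker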

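Before deploying Supabase, adjust the parameters in the generated pigsty.yml to your environment (above all, the passwords). If you only need a local development or test setup, you can skip this for now; customizing Supabase through the configuration file is covered later.

If the configuration is correct, after about 10 minutes you can reach the Supabase Studio console on your local network at http://<your_ip_address>:8000. The default username and password are supabase and pigsty.

Checklist

Hardware/software: prepare one Linux x86_64/arm64 server with a freshly installed mainstream Linux distribution
Network/permissions: passwordless ssh login, and the user has passwordless sudo
Make sure the machine has a static internal IPv4 address and can reach the internet.
- During configure, enter the node's primary internal IP address, or pass it with the -i <primary_ip> command-line option
- If your network cannot reach DockerHub, use a registry mirror via docker_registry_mirrors or a proxy via proxy_env.
Make sure you use the app/supa config template and adjust its parameters as needed
- Have you changed all password-related parameters? [optional]
- Do you need an external SMTP server? Have you set the SMTP parameters in apps.<supabase>.conf? [optional]
DockerHub network access
- For users in mainland China, where DockerHub is blocked, Pigsty uses docker.1ms.run as the Docker registry mirror by default. You can specify other mirrors in docker_registry_mirrors, or configure a proxy server with proxy_env to reach DockerHub directly.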
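
Among the credentials in the template's conf block, JWT_SECRET is special: ANON_KEY and SERVICE_ROLE_KEY are JWTs signed with it, so they have to be regenerated whenever the secret changes. The sketch below shows one way to sign such tokens with HS256 using openssl. It is an illustration under assumptions, not the official Supabase generator: the function names, the `iss` value, and the five-year expiry are all hypothetical choices, and the payload fields follow what the demo keys in the template decode to (role, iss, iat, exp).

```shell
# base64url-encode stdin without padding, as required by the JWT format
b64url() { openssl base64 -A | tr '+/' '-_' | tr -d '='; }

# Hypothetical helper: sign an HS256 JWT for a role with the given secret.
# args: jwt_secret role
supa_jwt() {
  secret="$1"; role="$2"
  iat=$(date +%s)
  exp=$((iat + 5 * 365 * 24 * 3600))   # about 5 years, an arbitrary choice
  header=$(printf '%s' '{"alg":"HS256","typ":"JWT"}' | b64url)
  payload=$(printf '{"role":"%s","iss":"supabase","iat":%s,"exp":%s}' \
              "$role" "$iat" "$exp" | b64url)
  sig=$(printf '%s.%s' "$header" "$payload" |
          openssl dgst -sha256 -hmac "$secret" -binary | b64url)
  printf '%s.%s.%s\n' "$header" "$payload" "$sig"
}
```

With a new secret in `JWT_SECRET`, `supa_jwt "$JWT_SECRET" anon` and `supa_jwt "$JWT_SECRET" service_role` would produce candidate values for ANON_KEY and SERVICE_ROLE_KEY; verify them against the Supabase self-hosting guide before relying on them.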

After your changes, the configuration file should look like this:

Modifications to the default generated configuration file

---
#==============================================================#
# File      :   supa.yml
# Desc      :   Pigsty configuration for self-hosting supabase
# Ctime     :   2023-09-19
# Mtime     :   2025-03-30
# Docs      :   https://pigsty.io/docs/app/supabase/
# License   :   AGPLv3 @ https://pigsty.io/docs/about/license
# Copyright :   2018-2025  Ruohang Feng / Vonng (rh@vonng.com)
#==============================================================#

# supabase is available on el8/el9/u22/u24/d12 with pg15,16,17
# To install supabase on fresh node, run:
#
#  curl -fsSL https://repo.pigsty.io/get | bash
# ./bootstrap               # prepare local repo & ansible
# ./configure -c app/supa   # use this supabase conf template
# vi pigsty.yml             # IMPORTANT: CHANGE CREDENTIALS!!
# ./install.yml             # install pigsty & pgsql & minio
# ./docker.yml              # install docker & docker compose
# ./app.yml                 # launch supabase with docker compose

all:
  children:

    # the supabase stateless (default username & password: supabase/pigsty)
    supa:
      hosts:
        10.10.10.10: {}
      vars:
        app: supabase # specify app name (supa) to be installed (in the apps)
        apps:         # define all applications
          supabase:   # the definition of supabase app
            conf:     # override /opt/supabase/.env
              # IMPORTANT: CHANGE JWT_SECRET AND REGENERATE CREDENTIAL ACCORDING!!!!!!!!!!!
              # https://supabase.com/docs/guides/self-hosting/docker#securing-your-services
              JWT_SECRET: your-super-secret-jwt-token-with-at-least-32-characters-long
              ANON_KEY: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyAgCiAgICAicm9sZSI6ICJhbm9uIiwKICAgICJpc3MiOiAic3VwYWJhc2UtZGVtbyIsCiAgICAiaWF0IjogMTY0MTc2OTIwMCwKICAgICJleHAiOiAxNzk5NTM1NjAwCn0.dc_X5iR_VP_qT0zsiyj_I_OZ2T9FtRU2BBNWN8Bu4GE
              SERVICE_ROLE_KEY: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyAgCiAgICAicm9sZSI6ICJzZXJ2aWNlX3JvbGUiLAogICAgImlzcyI6ICJzdXBhYmFzZS1kZW1vIiwKICAgICJpYXQiOiAxNjQxNzY5MjAwLAogICAgImV4cCI6IDE3OTk1MzU2MDAKfQ.DaYlNEoUrrEn2Ig7tqibS-PHK5vgusbcbo7X36XVt4Q
              DASHBOARD_USERNAME: supabase
              DASHBOARD_PASSWORD: pigsty

              # postgres connection string (use the correct ip and port)
              POSTGRES_HOST: 10.10.10.10      # point to the local postgres node
              POSTGRES_PORT: 5436             # access via the 'default' service, which always route to the primary postgres
              POSTGRES_DB: postgres           # the supabase underlying database
              POSTGRES_PASSWORD: DBUser.Supa  # password for supabase_admin and multiple supabase users

              # expose supabase via domain name
              SITE_URL: https://supa.pigsty                # <------- Change This to your external domain name
              API_EXTERNAL_URL: https://supa.pigsty        # <------- Otherwise the storage api may not work!
              SUPABASE_PUBLIC_URL: https://supa.pigsty     # <------- DO NOT FORGET TO PUT IT IN infra_portal!

              # if using s3/minio as file storage
              S3_BUCKET: supa
              S3_ENDPOINT: https://sss.pigsty:9000
              S3_ACCESS_KEY: supabase
              S3_SECRET_KEY: S3User.Supabase
              S3_FORCE_PATH_STYLE: true
              S3_PROTOCOL: https
              S3_REGION: stub
              MINIO_DOMAIN_IP: 10.10.10.10  # sss.pigsty domain name will resolve to this ip statically

              # if using SMTP (optional)
              #SMTP_ADMIN_EMAIL: admin@example.com
              #SMTP_HOST: supabase-mail
              #SMTP_PORT: 2500
              #SMTP_USER: fake_mail_user
              #SMTP_PASS: fake_mail_password
              #SMTP_SENDER_NAME: fake_sender
              #ENABLE_ANONYMOUS_USERS: false


    # infra cluster for proxy, monitor, alert, etc..
    infra: { hosts: { 10.10.10.10: { infra_seq: 1 } } }

    # etcd cluster for ha postgres
    etcd: { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }

    # minio cluster, s3 compatible object storage
    minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

    # pg-meta, the underlying postgres database for supabase
    pg-meta:
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars:
        pg_cluster: pg-meta
        pg_users:
          # supabase roles: anon, authenticated, dashboard_user
          - { name: anon           ,login: false }
          - { name: authenticated  ,login: false }
          - { name: dashboard_user ,login: false ,replication: true ,createdb: true ,createrole: true }
          - { name: service_role   ,login: false ,bypassrls: true }
          # supabase users: please use the same password
          - { name: supabase_admin             ,password: 'DBUser.Supa' ,pgbouncer: true ,inherit: true   ,roles: [ dbrole_admin ] ,superuser: true ,replication: true ,createdb: true ,createrole: true ,bypassrls: true }
          - { name: authenticator              ,password: 'DBUser.Supa' ,pgbouncer: true ,inherit: false  ,roles: [ dbrole_admin, authenticated ,anon ,service_role ] }
          - { name: supabase_auth_admin        ,password: 'DBUser.Supa' ,pgbouncer: true ,inherit: false  ,roles: [ dbrole_admin ] ,createrole: true }
          - { name: supabase_storage_admin     ,password: 'DBUser.Supa' ,pgbouncer: true ,inherit: false  ,roles: [ dbrole_admin, authenticated ,anon ,service_role ] ,createrole: true }
          - { name: supabase_functions_admin   ,password: 'DBUser.Supa' ,pgbouncer: true ,inherit: false  ,roles: [ dbrole_admin ] ,createrole: true }
          - { name: supabase_replication_admin ,password: 'DBUser.Supa' ,replication: true ,roles: [ dbrole_admin ]}
          - { name: supabase_read_only_user    ,password: 'DBUser.Supa' ,bypassrls: true ,roles: [ dbrole_readonly, pg_read_all_data ] }
        pg_databases:
          - name: postgres
            baseline: supabase.sql
            owner: supabase_admin
            comment: supabase postgres database
            schemas: [ extensions ,auth ,realtime ,storage ,graphql_public ,supabase_functions ,_analytics ,_realtime ]
            extensions:
              - { name: pgcrypto  ,schema: extensions } # cryptographic functions
              - { name: pg_net    ,schema: extensions } # async HTTP
              - { name: pgjwt     ,schema: extensions } # json web token API for postgres
              - { name: uuid-ossp ,schema: extensions } # generate universally unique identifiers (UUIDs)
              - { name: pgsodium        }               # pgsodium is a modern cryptography library for Postgres.
              - { name: supabase_vault  }               # Supabase Vault Extension
              - { name: pg_graphql      }               # pg_graphql: GraphQL support
              - { name: pg_jsonschema   }               # pg_jsonschema: Validate json schema
              - { name: wrappers        }               # wrappers: FDW collections
              - { name: http            }               # http: allows web page retrieval inside the database.
              - { name: pg_cron         }               # pg_cron: Job scheduler for PostgreSQL
              - { name: timescaledb     }               # timescaledb: Enables scalable inserts and complex queries for time-series data
              - { name: pg_tle          }               # pg_tle: Trusted Language Extensions for PostgreSQL
              - { name: vector          }               # pgvector: the vector similarity search
              - { name: pgmq            }               # pgmq: A lightweight message queue like AWS SQS and RSMQ
        # supabase required extensions
        pg_libs: 'timescaledb, plpgsql, plpgsql_check, pg_cron, pg_net, pg_stat_statements, auto_explain, pg_tle, plan_filter'
        pg_parameters:
          cron.database_name: postgres
          pgsodium.enable_event_trigger: off
        pg_hba_rules: # supabase hba rules, require access from docker network
          - { user: all ,db: postgres  ,addr: intra         ,auth: pwd ,title: 'allow supabase access from intranet'    }
          - { user: all ,db: postgres  ,addr: 172.17.0.0/16 ,auth: pwd ,title: 'allow access from local docker network' }
        node_crontab: [ '00 01 * * * postgres /pg/bin/pg-backup full' ] # make a full backup every 1am


  #==============================================================#
  # Global Parameters
  #==============================================================#
  vars:
    version: v3.4.0                   # pigsty version string
    admin_ip: 10.10.10.10             # admin node ip address
    region: china                     # upstream mirror region: default|china|europe
    pg_locale: C.UTF-8                # overwrite default C locale
    pg_lc_collate: C.UTF-8            # overwrite default C lc_collate
    pg_lc_ctype: C.UTF-8              # overwrite default C lc_ctype

    node_tune: oltp                   # node tuning specs: oltp,olap,tiny,crit
    pg_conf: oltp.yml                 # pgsql tuning specs: {oltp,olap,tiny,crit}.yml

    docker_enabled: true              # enable docker on app group
    docker_registry_mirrors: ["https://docker.1ms.run"] # use mirror in mainland china

    proxy_env:                        # global proxy env when downloading packages & pull docker images
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.tsinghua.edu.cn"
      #http_proxy:  127.0.0.1:12345 # add your proxy env here for downloading packages or pull images
      #https_proxy: 127.0.0.1:12345 # usually the proxy is format as http://user:pass@proxy.xxx.com
      #all_proxy:   127.0.0.1:12345

    certbot_email: your@email.com     # your email address for applying free let's encrypt ssl certs
    infra_portal:                     # domain names and upstream servers
      home         : { domain: h.pigsty }
      grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
      prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
      alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
      minio        : { domain: m.pigsty ,endpoint: "10.10.10.10:9001", https: true, websocket: true }
      blackbox     : { endpoint: "${admin_ip}:9115" }
      loki         : { endpoint: "${admin_ip}:3100" }
      supa :                          # nginx server config for supabase: expose supa studio UI and API via nginx
        domain: supa.pigsty           # REPLACE WITH YOUR OWN DOMAIN!
        endpoint: "10.10.10.10:8000"  # supabase service endpoint: IP:PORT
        websocket: true               # add websocket support
        certbot: supa.pigsty          # certbot cert name, apply with `make cert`

    #----------------------------------#
    # Credential: CHANGE THESE PASSWORDS
    #----------------------------------#
    #grafana_admin_username: admin
    grafana_admin_password: pigsty
    #pg_admin_username: dbuser_dba
    pg_admin_password: DBUser.DBA
    #pg_monitor_username: dbuser_monitor
    pg_monitor_password: DBUser.Monitor
    #pg_replication_username: replicator
    pg_replication_password: DBUser.Replicator
    #patroni_username: postgres
    patroni_password: Patroni.API
    #haproxy_admin_username: admin
    haproxy_admin_password: pigsty
    #minio_access_key: minioadmin
    minio_secret_key: minioadmin      # minio root secret key, `minioadmin` by default, also change pgbackrest_repo.minio.s3_key_secret

    # use minio as supabase file storage, single node single drive mode for demonstration purpose
    minio_buckets: [ { name: pgsql }, { name: supa } ]
    minio_users:
      - { access_key: dba , secret_key: S3User.DBA, policy: consoleAdmin }
      - { access_key: pgbackrest , secret_key: S3User.Backup,   policy: readwrite }
      - { access_key: supabase   , secret_key: S3User.Supabase, policy: readwrite }
    minio_endpoint: https://sss.pigsty:9000    # explicit overwrite minio endpoint with haproxy port
    node_etc_hosts: ["10.10.10.10 sss.pigsty"] # domain name to access minio from all nodes (required)

    # use minio as default backup repo for PostgreSQL
    pgbackrest_method: minio          # pgbackrest repo method: local,minio,[user-defined...]
    pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
      local:                          # default pgbackrest repo with local posix fs
        path: /pg/backup              # local backup directory, `/pg/backup` by default
        retention_full_type: count    # retention full backups by count
        retention_full: 2             # keep 2, at most 3 full backup when using local fs repo
      minio:                          # optional minio repo for pgbackrest
        type: s3                      # minio is s3-compatible, so s3 is used
        s3_endpoint: sss.pigsty       # minio endpoint domain name, `sss.pigsty` by default
        s3_region: us-east-1          # minio region, us-east-1 by default, useless for minio
        s3_bucket: pgsql              # minio bucket name, `pgsql` by default
        s3_key: pgbackrest            # minio user access key for pgbackrest
        s3_key_secret: S3User.Backup  # minio user secret key for pgbackrest <------------------ HEY, DID YOU CHANGE THIS?
        s3_uri_style: path            # use path style uri for minio rather than host style
        path: /pgbackrest             # minio backup path, default is `/pgbackrest`
        storage_port: 9000            # minio port, 9000 by default
        storage_ca_file: /etc/pki/ca.crt  # minio ca file path, `/etc/pki/ca.crt` by default
        block: y                      # Enable block incremental backup
        bundle: y                     # bundle small files into a single file
        bundle_limit: 20MiB           # Limit for file bundles, 20MiB for object storage
        bundle_size: 128MiB           # Target size for file bundles, 128MiB for object storage
        cipher_type: aes-256-cbc      # enable AES encryption for remote backup repo
        cipher_pass: pgBackRest       # AES encryption password, default is 'pgBackRest'  <----- HEY, DID YOU CHANGE THIS?
        retention_full_type: time     # retention full backup by time on minio repo
        retention_full: 14            # keep full backup for last 14 days

    pg_version: 17
    repo_extra_packages: [pg17-core ,pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-olap ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl ]
    pg_extensions: [ pg17-time ,pg17-gis ,pg17-rag ,pg17-fts ,pg17-feat ,pg17-lang ,pg17-type ,pg17-util ,pg17-func ,pg17-admin ,pg17-stat ,pg17-sec ,pg17-fdw ,pg17-sim ,pg17-etl, pg_mooncake, pg_analytics, pg_parquet ] #,pg17-olap]
...

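Key Technical Decisions for Self-Hosting

Here are some key technical decisions involved in self-hosting Supabase, for your reference:

The default single-node deployment of Supabase cannot benefit from the high-availability capabilities of PostgreSQL / MinIO. Even so, a single-node deployment still has significant advantages over the official pure Docker Compose setup: an out-of-the-box monitoring system, the freedom to install extensions, the ability to scale each component, and point-in-time recovery for the database.

If you have only one server, Pigsty recommends using an external S3 service as object storage to hold PostgreSQL backups and to back the Supabase Storage service. Such a deployment gives you a baseline disaster-recovery guarantee on failure: an RTO on the order of hours and an RPO on the order of megabytes of data loss. Likewise, if you self-host in the cloud, we recommend using S3 directly rather than the default local MinIO: layering MinIO on top of local EBS merely as a forwarding layer is convenient for development and testing but adds nothing of practical value in production.

For serious production deployments, Pigsty recommends at least 3 to 4 nodes, so that both MinIO and PostgreSQL run as multi-node deployments that meet enterprise high-availability requirements. In that case you need to prepare more nodes and disks, adjust the cluster definitions in the pigsty.yml inventory accordingly, and update the access information in the supabase cluster configuration.

Some Supabase features need to send email and therefore require SMTP. Unless the deployment is purely for intranet use, we recommend an external SMTP service for serious production deployments. Mail sent from a self-hosted mail server may be rejected by the recipient's mail server or flagged as spam.

If your service is exposed directly to the public Internet, we recommend using Nginx as a reverse proxy with a real domain name and HTTPS certificate, and distinguishing multiple instances by domain name.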

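Next, we will go through these topics in turn:

Advanced topic: security hardening
Deploying and accessing a highly available PostgreSQL cluster
Deploying and accessing a highly available MinIO cluster
Using an S3 service in place of MinIO
Sending email through an external SMTP service
Using a real domain name and certificate behind an Nginx reverse proxy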

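Advanced Topic: Security Hardening

Pigsty Base Components

For serious production deployments, we strongly recommend changing the passwords of the Pigsty base components. The defaults are public and well known, so going to production without changing them is like running with no protection at all:

grafana_admin_password: pigsty, Grafana admin password
pg_admin_password: DBUser.DBA, PostgreSQL superuser password
pg_monitor_password: DBUser.Monitor, PostgreSQL monitoring user password
pg_replication_password: DBUser.Replicator, PostgreSQL replication user password
patroni_password: Patroni.API, password for the Patroni high-availability component
haproxy_admin_password: pigsty, load balancer admin password
minio_access_key: minioadmin, MinIO root user name
minio_secret_key: minioadmin, MinIO root user secret key

In addition, we strongly recommend changing the password of the PostgreSQL business users used by Supabase, which defaults to DBUser.Supa.

These are the passwords of the Pigsty component modules; we strongly recommend setting them before installation and deployment.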
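
As a minimal sketch of one way to prepare these overrides, the Python snippet below generates a random value for each password parameter listed above, using only the standard library. The 24-character length and the alphanumeric alphabet are assumptions chosen for illustration, and the printed lines are meant to be pasted into the vars section of the inventory by hand. minio_access_key is left out because it is a user name rather than a secret.

```python
import secrets
import string

# Parameter names come from the list above; the generated values are random.
PASSWORD_PARAMS = [
    "grafana_admin_password",
    "pg_admin_password",
    "pg_monitor_password",
    "pg_replication_password",
    "patroni_password",
    "haproxy_admin_password",
    "minio_secret_key",
]

# Letters and digits only, so the values need no escaping in YAML or URLs.
ALPHABET = string.ascii_letters + string.digits


def random_password(length: int = 24) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def generate_overrides(params=PASSWORD_PARAMS, length: int = 24) -> dict:
    """Map every parameter name to a freshly generated password."""
    return {name: random_password(length) for name in params}


if __name__ == "__main__":
    for name, value in generate_overrides().items():
        # Quote the value so YAML always reads it as a string.
        print(f"{name}: '{value}'")
```

Remember that the comment on minio_secret_key in the template asks you to keep the pgBackRest repository credentials consistent when these values change.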

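Supabase Keys

Besides the passwords of the Pigsty components, you also need to change the Supabase keys, including:

JWT_SECRET:
ANON_KEY:
SERVICE_ROLE_KEY:
DASHBOARD_USERNAME: default user name for the Supabase Studio web UI, supabase by default
DASHBOARD_PASSWORD: default password for the Supabase Studio web UI, pigsty by default

Be sure to follow the instructions in the Supabase tutorial "Securing your services":

Generate a JWT_SECRET longer than 40 characters, and use the tool in the tutorial to issue two JWTs: ANON_KEY and SERVICE_ROLE_KEY.
Using the tool provided in the tutorial, generate the ANON_KEY JWT from JWT_SECRET and attributes such as the expiry time. This is the credential for anonymous users.
Using the tool provided in the tutorial, generate the SERVICE_ROLE_KEY from JWT_SECRET and attributes such as the expiry time. This is the credential for the more privileged service role.
If the PostgreSQL business users use a password other than the default, update POSTGRES_PASSWORD accordingly.
If your object storage uses credentials other than the defaults, update S3_ACCESS_KEY (https://github.com/pgsty/pigsty/blob/main/conf/app/supa.yml#L57) and S3_SECRET_KEY accordingly.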
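
The tool in the Supabase tutorial remains the reference for issuing these keys. Purely to illustrate what the two keys are, the sketch below signs HS256 JWTs with the Python standard library. The payload fields (role, iss, iat, exp), the issuer value supabase, and the five-year lifetime are assumptions modelled on typical Supabase keys and are not specified by this document, so check them against the tutorial before relying on the output.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time

FIVE_YEARS = 5 * 365 * 24 * 3600  # assumed key lifetime in seconds


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as required by the JWT format."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_jwt(payload: dict, secret: str) -> str:
    """Return a compact JWT signed with HMAC-SHA256 (alg HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url(json.dumps(header, separators=(",", ":")).encode()),
        b64url(json.dumps(payload, separators=(",", ":")).encode()),
    ]
    signing_input = ".".join(parts)
    signature = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


def make_keys(jwt_secret: str, issued_at: int, ttl_seconds: int = FIVE_YEARS) -> dict:
    """Build ANON_KEY and SERVICE_ROLE_KEY for the anon and service_role roles."""
    keys = {}
    for env_name, role in (("ANON_KEY", "anon"), ("SERVICE_ROLE_KEY", "service_role")):
        payload = {
            "role": role,
            "iss": "supabase",  # assumed issuer value
            "iat": issued_at,
            "exp": issued_at + ttl_seconds,
        }
        keys[env_name] = sign_jwt(payload, jwt_secret)
    return keys


if __name__ == "__main__":
    jwt_secret = secrets.token_urlsafe(48)  # 64 characters, above the 40-character minimum
    print(f"JWT_SECRET: {jwt_secret}")
    for name, token in make_keys(jwt_secret, int(time.time())).items():
        print(f"{name}: {token}")
```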

After changing the Supabase credentials, you can restart the Docker Compose containers to apply the new configuration:

./app.yml -t app_config,app_launch

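Advanced Topic: Domain Access

If you use Supabase on the local machine or within a LAN, you can connect directly via IP:Port to the HTTP port 8000 exposed by Kong.

You can use a statically resolved intranet domain name, but for serious production deployments we recommend accessing Supabase through a real domain name with HTTPS. In that case you usually need the following:

Your server should have a public IP address.
Purchase a domain name and use the DNS service of a cloud / DNS / CDN provider to point it at the public IP of the installation node (fallback: a local /etc/hosts entry).
Obtain a certificate: use a free HTTPS certificate issued by a certificate authority such as Let's Encrypt to encrypt traffic (fallback: the default self-signed certificate, trusted manually).

You can follow the certbot tutorial to request a free HTTPS certificate. Assuming your custom domain is supa.pigsty.cc, modify the supa domain in infra_portal as follows: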

all:
  vars:
    infra_portal:
      supa :
        domain: supa.pigsty.cc        # replace with your own domain!
        endpoint: "10.10.10.10:8000"
        websocket: true
        certbot: supa.pigsty.cc       # certificate name, usually the same as the domain

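If the domain already resolves to your server's public IP, running the following command in the Pigsty directory automatically completes the certificate request and installation: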

make cert

You also need to update the domain-related settings in the Supabase configuration.

Set them to your custom domain, for example supa.pigsty.cc, then re-apply the configuration:

./app.yml -t app_config,app_launch

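As a fallback, you can use a local domain name to access Supabase. In that case, configure the resolution of supa.pigsty in /etc/hosts on the machine running the browser, or in your LAN DNS, and point it at the externally reachable IP address of the installation node. Nginx on the Pigsty admin node issues a self-signed certificate for this domain (the browser will show "Not Secure") and forwards requests to Kong on port 8000, where Supabase handles them.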

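Advanced Topic: External Object Storage

You can use S3 or an S3-compatible service as the object storage for PGSQL backups and for Supabase. Here we use Alibaba Cloud OSS as an example.

Pigsty provides a terraform/spec/aliyun-meta-s3.tf template that brings up one server and one OSS bucket on Alibaba Cloud.

First, modify the S3-related settings in all.children.supa.vars.apps.[supabase].conf so that they point at the Alibaba Cloud OSS bucket. The entries below show the template defaults, which target the local MinIO; replace the bucket, endpoint, and credentials with those of your OSS bucket: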

# if using s3/minio as file storage
S3_BUCKET: supa
S3_ENDPOINT: https://sss.pigsty:9000
S3_ACCESS_KEY: supabase
S3_SECRET_KEY: S3User.Supabase
S3_FORCE_PATH_STYLE: true
S3_PROTOCOL: https
S3_REGION: stub
MINIO_DOMAIN_IP: 10.10.10.10  # sss.pigsty domain name will resolve to this ip statically

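Again, reload the Supabase configuration with the following command:

./app.yml -t app_config,app_launch

You can likewise use S3 as the backup repository for PostgreSQL. Add an aliyun backup repo definition under all.vars.pgbackrest_repo: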

all:
  vars:
    pgbackrest_method: aliyun          # pgbackrest repo method: local,minio,[other user-defined repos...]; here backups go to Alibaba Cloud OSS
    pgbackrest_repo:                   # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
      aliyun:                          # define a new backup repo named aliyun
        type: s3                       # Alibaba Cloud OSS is S3-compatible object storage
        s3_endpoint: oss-cn-beijing-internal.aliyuncs.com
        s3_region: oss-cn-beijing
        s3_bucket: pigsty-oss
        s3_key: xxxxxxxxxxxxxx
        s3_key_secret: xxxxxxxx
        s3_uri_style: host

        path: /pgbackrest
        bundle: y
        cipher_type: aes-256-cbc
        cipher_pass: PG.${pg_cluster}   # set an encryption password bound to the cluster name
        retention_full_type: time
        retention_full: 14

Then make sure all.vars.pgbackrest_method is set to the aliyun backup repo, and reset the pgBackRest backup:

./pgsql.yml -t pgbackrest

Pigsty will switch the backup repo to the external object storage.
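
A misspelled key or a repo name that does not match is easy to miss in a long inventory. The sketch below is a hypothetical pre-flight check, not part of Pigsty: given the global vars as a Python dict (for example, loaded from the inventory by a YAML parser of your choice), it confirms that pgbackrest_method names a repo defined in pgbackrest_repo. The list of keys expected on an s3 repo is an assumption taken from the example above, and the sketch treats a missing pgbackrest_method as an error even though Pigsty itself may fall back to a default.

```python
# Keys this sketch expects on an s3-type repo; an assumption based on the
# example above, not an official schema.
S3_REPO_KEYS = ("s3_endpoint", "s3_region", "s3_bucket", "s3_key", "s3_key_secret")


def check_backup_repo(global_vars: dict) -> dict:
    """Return the repo selected by pgbackrest_method, or raise if the config is inconsistent."""
    if "pgbackrest_method" not in global_vars:
        raise KeyError("pgbackrest_method is not set (check the spelling of the key)")
    method = global_vars["pgbackrest_method"]
    repos = global_vars.get("pgbackrest_repo", {})
    if method not in repos:
        raise KeyError(f"pgbackrest_method={method!r} is not defined in pgbackrest_repo")
    repo = repos[method]
    if repo.get("type") == "s3":
        missing = [key for key in S3_REPO_KEYS if key not in repo]
        if missing:
            raise ValueError(f"repo {method!r} is missing keys: {missing}")
    return repo
```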

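Advanced Topic: Backup Strategy

In the supabase template, Pigsty already uses a default backup strategy of one full backup every day at 1 AM. You can refer to the Backup / Restore documentation to modify the backup strategy.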

all:
  children:
    pg-meta:
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars:
        pg_cluster: pg-meta
        node_crontab: [ '00 01 * * * postgres /pg/bin/pg-backup full' ]  # make a full backup at 1 AM every day

Then run the following command to apply the crontab configuration to the node:

./node.yml -t node_crontab

For more on backup strategies, see Backup Policy.
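
Each node_crontab entry in the example has five schedule fields followed by the OS user and the command. As a small illustration, and assuming that seven-field layout holds for every entry, the sketch below splits an entry into named parts so that a modified schedule can be sanity-checked before it is applied. The function name is hypothetical.

```python
def parse_node_crontab(entry: str) -> dict:
    """Split 'minute hour day month weekday user command...' into named fields."""
    fields = entry.split(None, 6)  # at most 7 parts; the command keeps its spaces
    if len(fields) != 7:
        raise ValueError(f"expected 5 schedule fields, a user and a command: {entry!r}")
    minute, hour, day_of_month, month, day_of_week, user, command = fields
    return {
        "minute": minute,
        "hour": hour,
        "day_of_month": day_of_month,
        "month": month,
        "day_of_week": day_of_week,
        "user": user,
        "command": command,
    }


if __name__ == "__main__":
    job = parse_node_crontab("00 01 * * * postgres /pg/bin/pg-backup full")
    print(f"{job['user']} runs {job['command']!r} at {job['hour']}:{job['minute']}")
```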

Advanced Topic: Using SMTP

You can use SMTP to send email. Modify the supabase application configuration and add the SMTP settings:

all:
  children:
    supa:            # supa group
      vars:          # supa group vars
        apps:        # supa group app list
          supabase:  # the supabase app
            conf:    # the supabase app conf entries
              SMTP_HOST: smtpdm.aliyun.com
              SMTP_PORT: 80
              SMTP_USER: no_reply@mail.your.domain.com
              SMTP_PASS: your_email_user_password
              SMTP_SENDER_NAME: MySupabase
              SMTP_ADMIN_EMAIL: adminxxx@mail.your.domain.com
              ENABLE_ANONYMOUS_USERS: false

Don't forget to reload the configuration with ./app.yml -t app_config,app_launch.
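
Before pointing Supabase at an SMTP account, it can help to confirm that the account can actually send mail. The sketch below uses Python's standard smtplib and email modules. It assumes a plain SMTP connection with login and no STARTTLS, which may not match your provider, and the function names are hypothetical.

```python
import smtplib
from email.message import EmailMessage
from email.utils import formataddr


def build_test_message(sender: str, sender_name: str, recipient: str) -> EmailMessage:
    """Compose a short plain-text test message."""
    msg = EmailMessage()
    msg["From"] = formataddr((sender_name, sender))
    msg["To"] = recipient
    msg["Subject"] = "Supabase SMTP test"
    msg.set_content("If you can read this, the SMTP settings work.")
    return msg


def send_test_message(host: str, port: int, user: str, password: str,
                      msg: EmailMessage, timeout: float = 10.0) -> None:
    """Log in and send one message; raises smtplib.SMTPException on failure."""
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        # Assumption: the provider accepts login without STARTTLS on this port.
        smtp.login(user, password)
        smtp.send_message(msg)
```

A call would pass the same values as SMTP_HOST, SMTP_PORT, SMTP_USER and SMTP_PASS, with SMTP_ADMIN_EMAIL as the recipient.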

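Advanced Topic: True High Availability

With the configuration above, you now have a Supabase with a public domain name, an HTTPS certificate, an SMTP mail service, and backups.

If this node fails, at least the backups are kept in external S3 storage, so you can redeploy Supabase on a new node and restore from backup. Such a deployment gives a baseline disaster-recovery guarantee on failure: an RTO on the order of hours and an RPO on the order of megabytes of data loss.

But if you want RTO < 30s and zero data loss, you need a multi-node high-availability cluster. Multi-node deployment has several dimensions: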

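ETCD: the DCS needs three or more nodes to tolerate the failure of one node.
PGSQL: for synchronous commit with no data loss, at least three nodes are recommended.
INFRA: a monitoring infrastructure failure has less impact, but we recommend three replicas in production.
Supabase itself can also run as multiple replicas to achieve high availability.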
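
The three-node minimum for the DCS follows from majority-quorum arithmetic, which consensus-based stores such as etcd rely on. A minimal sketch of that arithmetic:

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given number of members."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can fail while a majority is still available."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} members: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

One or two members tolerate no failure, three tolerate one, and five tolerate two, which is why even member counts add cost without adding fault tolerance.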

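We recommend following the cluster configurations in the trio and safe templates to upgrade your cluster to three nodes or more.

In that case, you also need to change the access points for PostgreSQL and MinIO to highly available ones such as DNS, an L2 VIP, or HAProxy.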

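18 - Software Tools

Run out-of-the-box software and tools with Docker

PostgreSQL is one of the most popular databases in the world, and countless software products are built on top of PostgreSQL, around the PostgreSQL ecosystem, or in service of PostgreSQL itself, for example:

"Application software" that uses PostgreSQL as its preferred database
"Tool software" that serves PostgreSQL development and administration
"Database software" built on PostgreSQL, such as upper-layer databases and compatible forks

Pigsty provides a series of Docker Compose templates to help users bring up these applications with one command, ready to use out of the box: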

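Name	Website	Type	Status	Port	Default Domain	Description
Supabase	Supabase	Database	GA	8000	supa.pigsty	Open-source Firebase alternative based on PG, backend platform
PolarDB	PolarDB	Database	GA	5532		Open-source RAC-style PostgreSQL, a stand-in for domestic-compliance (Xinchuang) requirements
FerretDB	FerretDB	Database	GA	27017		Open-source MongoDB alternative based on PG
MinIO	MinIO	Database	GA	9000	sss.pigsty	Open-source S3 object storage alternative
EdgeDB	EdgeDB	Database	TBD			Graph database built on PostgreSQL
NocoDB	NocoDB	Application	GA	8080	noco.pigsty	Open-source Airtable alternative
Odoo	Odoo	Application	GA	8069	odoo.pigsty	Open-source enterprise ERP system
Dify	Dify	Application	GA	8001	dify.pigsty	AI workflow orchestration platform, LLMOps
Jupyter	Jupyter	Application	GA		lab.pigsty	Python development and data analysis notebook
Gitea	Gitea	Application	GA	8889	git.pigsty	Private, reliable and efficient DevOps code hosting platform
Wiki	Wiki.js	Application	GA	9002	wiki.pigsty	Open-source, extensible wiki software
GitLab	GitLab	Application	TBD			Open-source GitHub alternative, enterprise code hosting platform
Mastodon	Mastodon	Application	TBD			Open-source decentralized social network
Keycloak	Keycloak	Application	TBD			Open-source identity and access management component
Harbor	Harbor	Application	TBD			Enterprise Docker/K8S image registry
Confluence	Confluence	Application	TBD			Enterprise knowledge management
Jira	Jira	Application	TBD			Enterprise project management tool
Zabbix	Zabbix 7	Application	TBD			Enterprise all-in-one monitoring platform
Grafana	Grafana	Application	TBD			Enterprise data visualization and dashboard platform
Metabase	Metabase	Application	GA	9004	mtbs.pigsty	Quick analysis of data across multiple data sources
ByteBase	ByteBase	Application	GA	8887	ddl.pigsty	Database schema change tool
Kong	Kong	Dev Toolkit	GA	8000	api.pigsty	API gateway based on Nginx/OpenResty
PostgREST	PostgREST	Dev Toolkit	GA	8884	api.pigsty	Generates a REST API from the PG schema automatically
pgAdmin4	pgAdmin4	PG Tool	GA	8885	adm.pigsty	PostgreSQL GUI administration tool
pgWeb	pgWeb	PG Tool	GA	8886	cli.pigsty	PostgreSQL web client tool
SchemaSpy	SchemaSpy	PG Tool	TBD			Generates PostgreSQL schema diagrams
pgBadger	pgBadger	PG Tool	TBD			Analyzes PostgreSQL logs
pg_exporter	pg_exporter	PG Tool	GA	9630		Exposes PostgreSQL and Pgbouncer monitoring metrics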

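How to use these software templates?

Using these templates requires the DOCKER module to be installed on the node. Users in mainland China may also need to configure a DockerHub proxy; see the tutorial for details.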

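18.1 - pgAdmin4: Manage PG Databases with a GUI

Launch pgAdmin4 with Docker and load the Pigsty server list

pgAdmin4 is a handy GUI administration tool for PostgreSQL, and Pigsty has built-in support for it.

Quick Start

Pigsty ships a Docker application template for pgAdmin, which can be brought up with a single playbook run.

./docker.yml               # install Docker & Docker Compose
./app.yml -e app=pgadmin   # launch the pgAdmin application with Docker

Port 8885 is assigned by default. Access it via the domain http://adm.pigsty. Demo: http://adm.pigsty.cc.

Default user name: admin@pigsty.cc, password: pigsty. The language can be chosen on the login page.

Demo

Public demo: http://adm.pigsty.cc

Default user name and password: admin@pigsty.cc / pigsty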

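TL;DR

cd ~/pigsty/app/pgadmin   # enter the application directory
make up                   # launch the pgadmin container
make conf view            # load the Pigsty server list into the pgadmin container, then print the access point

The Pigsty pgAdmin application template uses port 8885 by default. You can access it at:

http://adm.pigsty or http://10.10.10.10:8885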


make up         # pull up pgadmin with docker-compose
make run        # launch pgadmin with docker
make view       # print pgadmin access point
make log        # tail -f pgadmin logs
make info       # introspect pgadmin with jq
make stop       # stop pgadmin container
make clean      # remove pgadmin container
make conf       # provision pgadmin with pigsty pg servers list 
make dump       # dump servers.json from pgadmin container
make pull       # pull latest pgadmin image
make rmi        # remove pgadmin image
make save       # save pgadmin image to /tmp/pgadmin.tgz
make load       # load pgadmin image from /tmp

18.2 - Kong: Enterprise-Grade Open-Source API Gateway

Launch a powerful open-source API gateway based on Nginx and OpenResty, using PostgreSQL and Redis as its backend state store

TL;DR

cd app/kong ; docker-compose up -d

make up         # pull up kong with docker-compose
make ui         # run swagger ui container
make log        # tail -f kong logs
make info       # introspect kong with jq
make stop       # stop kong container
make clean      # remove kong container
make rmui       # remove swagger ui container
make pull       # pull latest kong image
make rmi        # remove kong image
make save       # save kong image to /tmp/kong.tgz
make load       # load kong image from /tmp

Scripts

Default Port: 8000
Default SSL Port: 8443
Default Admin Port: 8001
Default Postgres Database: postgres://dbuser_kong:DBUser.Kong@10.10.10.10:5432/kong

# postgres://dbuser_kong:DBUser.Kong@10.10.10.10:5432/kong
- { name: kong, owner: dbuser_kong, revokeconn: true , comment: kong the api gateway database }
- { name: dbuser_kong, password: DBUser.Kong , pgbouncer: true , roles: [ dbrole_admin ] }

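18.3 - Jupyter: Data Analysis Notebook and AI IDE

Use Jupyter Lab to access PostgreSQL databases and combine the power of SQL and Python for data analysis.

This article needs updating.

Jupyter Lab is a complete data science environment based on IPython Notebook, suitable for data analysis and visualization.

Because JupyterLab provides a web terminal, it is not enabled in the default installation; you have to deploy it on the meta node explicitly with infra-jupyter.yml.

Data Analysis Environment: Jupyter

Jupyter Lab is an all-in-one data analysis environment. The following command starts a Jupyter server on port 8888.

docker run -it --restart always --detach --name jupyter -p 8888:8888 -v "${PWD}":/tmp/notebook jupyter/scipy-notebook
docker logs jupyter # print the logs to obtain the login token

Visit http://10.10.10.10:8888/ to use JupyterLab (you need to enter the auto-generated token).

You can also use infra-jupyter.yml to enable Jupyter Notebook on the bare metal of the admin node.

TL;DR

./infra-jupyter.yml # install Jupyter Lab on the admin node, using port 8888, OS user jupyter, default password pigsty
./infra-jupyter.yml -e jupyter_domain=lab.pigsty.cc  # use another domain (lab.pigsty by default)
./infra-jupyter.yml -e jupyter_port=8887             # use another port (8888 by default)
./infra-jupyter.yml -e jupyter_username=osuser_jupyter -e jupyter_password=pigsty2 # use a different OS user and password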

Jupyter Configuration

Name	Type	Level	Comment
`jupyter_port`	integer	G	Jupyter port
`jupyter_domain`	string	G	Jupyter domain name
`jupyter_username`	string	G	OS user that runs Jupyter
`jupyter_password`	string	G	Jupyter Lab password

Defaults

jupyter_username: jupyter       # os user name, special names: default|root (dangerous!)
jupyter_password: pigsty        # default password for jupyter lab (important!)
jupyter_port: 8888              # default port for jupyter lab
jupyter_domain: lab.pigsty      # domain name used to distinguish jupyter

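`jupyter_username`

OS user that runs Jupyter. Type: string, level: G, default: "jupyter".

Other user names work the same way, but the special user name default runs Jupyter Lab as the user performing the installation (usually the admin), which is more convenient but also more dangerous.

`jupyter_password`

Jupyter Lab password. Type: string, level: G, default: "pigsty".

If Jupyter is enabled, changing this password is strongly recommended. The salted, hashed password is written to ~jupyter/.jupyter/jupyter_server_config.json by default.

`jupyter_port`

Jupyter listening port. Type: int, level: G, default: 8888.

When JupyterLab is enabled, Pigsty runs the local notebook server as the user specified by jupyter_username. In addition, make sure the node_packages_meta_pip parameter includes the default value 'jupyterlab'. Jupyter Lab can be reached from the Pigsty home page navigation or via the default domain lab.pigsty, and listens on port 8888 by default.

`jupyter_domain`

Jupyter domain name. Type: string, level: G, default: lab.pigsty.

This domain name is written to /etc/nginx/conf.d/jupyter.conf as the server name of the Jupyter service.

Jupyter Playbook

`infra-jupyter`

The infra-jupyter.yml playbook installs the Jupyter Lab service on the meta node.

Jupyter Lab is a very handy Python data analysis environment, but it ships with a web shell and is therefore risky, so it has to be installed explicitly with a dedicated playbook.

Usage: adjust the inventory as described in Jupyter Configuration, then run this playbook.

If you enable Jupyter in production, be sure to change the Jupyter password.

Accessing PostgreSQL from Jupyter

You can use the psycopg2 driver to access PostgreSQL databases directly: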

import psycopg2

# The host part of the URL is left empty here, so libpq uses its default host
conn = psycopg2.connect('postgres://dbuser_meta:DBUser.Meta@:5432/meta')
cursor = conn.cursor()
cursor.execute("""SELECT date, new_cases FROM covid.country_history WHERE country_code = 'CN';""")
data = cursor.fetchall()  # list of (date, new_cases) tuples

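18.4 - Gitea: Self-Hosted Lightweight Code Hosting

Launch Gitea with Docker, using a Pigsty-managed PG as the external metadata database

Public demo: http://git.pigsty.cc

TL;DR

cd ~/pigsty/app/gitea; make up

In this example, Gitea uses port 8889 by default. You can access it at:

http://git.pigsty or http://10.10.10.10:8889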

make up      # pull up gitea with docker-compose in minimal mode
make run     # launch gitea with docker , local data dir and external PostgreSQL
make view    # print gitea access point
make log     # tail -f gitea logs
make info    # introspect gitea with jq
make stop    # stop gitea container
make clean   # remove gitea container
make pull    # pull latest gitea image
make rmi     # remove gitea image
make save    # save gitea image to /tmp/gitea.tgz
make load    # load gitea image from /tmp

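Using an External PostgreSQL

By default, the Pigsty template uses SQLite inside the container as Gitea's metadata store. You can have Gitea use an external PostgreSQL through connection-string environment variables.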

# postgres://dbuser_gitea:DBUser.gitea@10.10.10.10:5432/gitea
db:   { name: gitea, owner: dbuser_gitea, comment: gitea primary database }
user: { name: dbuser_gitea , password: DBUser.gitea, roles: [ dbrole_admin ] }

18.5 - Wiki.js: Build Your Own Wiki

How to build your own open-source wiki with Wiki.js, using a Pigsty-managed PG as persistent data storage

Public demo: http://wiki.pigsty.cc

TL;DR

cd app/wiki ; docker-compose up -d

Prepare the Database

# postgres://dbuser_wiki:DBUser.Wiki@10.10.10.10:5432/wiki
- { name: wiki, owner: dbuser_wiki, revokeconn: true , comment: wiki database }
- { name: dbuser_wiki, password: DBUser.Wiki , pgbouncer: true , roles: [ dbrole_admin ] }

bin/createuser pg-meta dbuser_wiki
bin/createdb   pg-meta wiki

Container Configuration

version: "3"
services:
  wiki:
    container_name: wiki
    image: requarks/wiki:2
    environment:
      DB_TYPE: postgres
      DB_HOST: 10.10.10.10
      DB_PORT: 5432
      DB_USER: dbuser_wiki
      DB_PASS: DBUser.Wiki
      DB_NAME: wiki
    restart: unless-stopped
    ports:
      - "9002:3000"

Access

Default Port for wiki: 9002

# add to nginx_upstream
- { name: wiki  , domain: wiki.pigsty.cc , endpoint: "127.0.0.1:9002"   }

./infra.yml -t nginx_config
ansible all -b -a 'nginx -s reload'

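18.6 - MinIO: Open-Source S3, Simple Object Storage Service

Launch MinIO with Docker and get your own object storage service right away.

Public demo: http://sss.pigsty.cc

Default user name and password: admin / pigsty.minio

TL;DR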

Launch minio (s3) service on 9000 & 9001

cd ~/pigsty/app/minio ; docker-compose up -d

docker run -p 9000:9000 -p 9001:9001 \
  -e "MINIO_ROOT_USER=admin" \
  -e "MINIO_ROOT_PASSWORD=pigsty.minio" \
  minio/minio server /data --console-address ":9001"

visit http://10.10.10.10:9000 with user admin and password pigsty.minio

make up         # pull up minio with docker-compose
make run        # launch minio with docker
make view       # print minio access point
make log        # tail -f minio logs
make info       # introspect minio with jq
make stop       # stop minio container
make clean      # remove minio container
make pull       # pull latest minio image
make rmi        # remove minio image
make save       # save minio image to /tmp/minio.tgz
make load       # load minio image from /tmp

18.7 - ByteBase: PG Schema Migration Tool

Launch Bytebase with Docker to version-control PG schemas

ByteBase

ByteBase is a tool for database schema changes. The following commands start a ByteBase instance on port 8887 of the meta node.

mkdir -p /data/bytebase/data;
docker run --init --name bytebase --restart always --detach --publish 8887:8887 --volume /data/bytebase/data:/var/opt/bytebase \
    bytebase/bytebase:1.0.4 --data /var/opt/bytebase --host http://ddl.pigsty --port 8887

Visit http://10.10.10.10:8887/ or http://ddl.pigsty to use ByteBase. Create a project, environment, instance, and database in turn, and you can start making schema changes.

Public demo: http://ddl.pigsty.cc

Default user name and password: admin / pigsty

Bytebase Overview

Schema Migrator for PostgreSQL

cd app/bytebase; make up

Visit http://ddl.pigsty or http://10.10.10.10:8887

make up         # pull up bytebase with docker-compose in minimal mode
make run        # launch bytebase with docker , local data dir and external PostgreSQL
make view       # print bytebase access point
make log        # tail -f bytebase logs
make info       # introspect bytebase with jq
make stop       # stop bytebase container
make clean      # remove bytebase container
make pull       # pull latest bytebase image
make rmi        # remove bytebase image
make save       # save bytebase image to /tmp/bytebase.tgz
make load       # load bytebase image from /tmp

Using an External PostgreSQL

Bytebase uses its internal PostgreSQL database by default. You can use an external PostgreSQL for higher durability.

# postgres://dbuser_bytebase:DBUser.Bytebase@10.10.10.10:5432/bytebase
db:   { name: bytebase, owner: dbuser_bytebase, comment: bytebase primary database }
user: { name: dbuser_bytebase , password: DBUser.Bytebase, roles: [ dbrole_admin ] }

If you wish to use an external PostgreSQL, drop the monitor extensions and views, as well as pg_repack:

DROP SCHEMA monitor CASCADE;
DROP EXTENSION pg_repack;

After Bytebase has initialized, you can recreate them with /pg/tmp/pg-init-template.sql:

psql bytebase < /pg/tmp/pg-init-template.sql

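18.8 - PostgREST: Auto-Generated REST API

Launch PostgREST with Docker to generate a backend REST API from the PostgreSQL schema automatically

PostgREST

PostgREST is a binary component that automatically generates a REST API from a PostgreSQL database schema.

For example, the following command launches postgrest with docker (local port 8884, using the default admin user, exposing the Pigsty CMDB schema):

docker run --init --name postgrest --restart always --detach --publish 8884:8081 postgrest/postgrest

Visiting http://10.10.10.10:8884 shows the definitions of all auto-generated APIs, and the API documentation is exposed automatically through Swagger Editor.

If you want to perform create, read, update, and delete operations or design finer-grained access control, refer to Tutorial 1 - The Golden Key and generate a signed JWT.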
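
For read-only queries, PostgREST accepts filters in the URL query string, for example column=eq.value for equality and select=a,b to pick columns. The sketch below only builds such a URL with the Python standard library. The table and column names in the usage example are hypothetical and depend on which schema your PostgREST instance exposes.

```python
from urllib.parse import urlencode


def build_postgrest_url(base_url: str, table: str, equals: dict, columns=None) -> str:
    """Build a PostgREST read URL with equality filters and an optional column list."""
    query = {}
    if columns:
        query["select"] = ",".join(columns)
    for column, value in equals.items():
        query[column] = f"eq.{value}"
    # Keep commas unescaped so the select list stays readable.
    return f"{base_url.rstrip('/')}/{table}?{urlencode(query, safe=',')}"


if __name__ == "__main__":
    # Hypothetical table and columns, used only to show the URL shape.
    print(build_postgrest_url(
        "http://10.10.10.10:8884",
        "country_history",
        {"country_code": "CN"},
        columns=["date", "new_cases"],
    ))
```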

This is an example of creating pigsty cmdb API with PostgREST

cd ~/pigsty/app/postgrest ; docker-compose up -d

http://10.10.10.10:8884 is the default endpoint for PostgREST

http://10.10.10.10:8883 is the default api docs for PostgREST

make up         # pull up postgrest with docker-compose
make run        # launch postgrest with docker
make ui         # run swagger ui container
make view       # print postgrest access point
make log        # tail -f postgrest logs
make info       # introspect postgrest with jq
make stop       # stop postgrest container
make clean      # remove postgrest container
make rmui       # remove swagger ui container
make pull       # pull latest postgrest image
make rmi        # remove postgrest image
make save       # save postgrest image to /tmp/postgrest.tgz
make load       # load postgrest image from /tmp

Swagger UI

Launch a swagger OpenAPI UI and visualize PostgREST API on 8883 with:

docker run --init --name swagger -p 8883:8080 -e API_URL=http://10.10.10.10:8884 swaggerapi/swagger-ui
# docker run -d -e API_URL=http://10.10.10.10:8884 -p 8883:8080 swaggerapi/swagger-editor # swagger editor

Check http://10.10.10.10:8883/

18.9 - SchemaSPY: PG Schema Visualization

Use the SchemaSPY image to parse a PostgreSQL database schema and generate a visual report

Use the following docker command to generate a database schema report, taking the CMDB as an example:

docker run -v /www/schema/pg-meta/meta/pigsty:/output andrewjones/schemaspy-postgres:latest -host 10.10.10.10 -port 5432 -u dbuser_dba -p DBUser.DBA -db meta -s pigsty

Then visit http://h.pigsty/schema/pg-meta/meta/pigsty to view the schema report.

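18.10 - PGWeb: Access PostgreSQL from the Browser

Launch PGWEB with Docker to run small ad-hoc online queries from the browser

PGWeb Client Tool

PGWeb is a browser-based PG client tool. Use the following command to launch the PGWEB service on the meta node, on host port 8886 by default. It is reachable via the domain http://cli.pigsty. Public demo: http://cli.pigsty.cc.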

# docker stop pgweb; docker rm pgweb
docker run --init --name pgweb --restart always --detach --publish 8886:8081 sosedoff/pgweb

Users need to fill in the database connection string themselves, for example the connection string of the default CMDB:

postgres://dbuser_dba:DBUser.DBA@10.10.10.10:5432/meta?sslmode=disable

Launch the PGWEB container with Docker Compose:

cd ~/pigsty/app/pgweb ; docker-compose up -d

Next, visit port 8886 on your host to see the PGWEB UI: http://10.10.10.10:8886

You can try the URL connection strings below to connect to database instances through PGWEB and explore them.

postgres://dbuser_meta:DBUser.Meta@10.10.10.10:5432/meta?sslmode=disable
postgres://test:test@10.10.10.11:5432/test?sslmode=disable
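
These connection strings follow the usual URL layout of user, password, host, port, database, and options. As an illustration, the sketch below takes one apart with Python's urllib.parse, which can help when a connection fails and you want to check each component. The fallback to port 5432 when the URL omits the port is an assumption of this sketch.

```python
from urllib.parse import parse_qs, urlsplit


def parse_pg_url(url: str) -> dict:
    """Split a postgres:// URL into its components."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port or 5432,  # assumed default when the URL omits the port
        "database": parts.path.lstrip("/"),
        "options": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


if __name__ == "__main__":
    info = parse_pg_url("postgres://dbuser_meta:DBUser.Meta@10.10.10.10:5432/meta?sslmode=disable")
    for key, value in info.items():
        print(f"{key}: {value}")
```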

Shortcuts

make up         # pull up pgweb with docker-compose
make run        # launch pgweb with docker
make view       # print pgweb access point
make log        # tail -f pgweb logs
make info       # introspect pgweb with jq
make stop       # stop pgweb container
make clean      # remove pgweb container
make pull       # pull latest pgweb image
make rmi        # remove pgweb image
make save       # save pgweb image to /tmp/pgweb.tgz
make load       # load pgweb image from /tmp

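18.11 - Discourse: Open-Source Tech Forum

How to set up the open-source forum software Discourse, using a Pigsty-managed PG as the backend database?

Setting up the open-source forum Discourse requires adjusting the app.yml configuration, mainly the SMTP section.

Discourse Configuration Example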

templates:
  - "templates/web.china.template.yml"
  - "templates/postgres.template.yml"
  - "templates/redis.template.yml"
  - "templates/web.template.yml"
  - "templates/web.ratelimited.template.yml"
## Uncomment these two lines if you wish to add Lets Encrypt (https)
# - "templates/web.ssl.template.yml"
# - "templates/web.letsencrypt.ssl.template.yml"
expose:
  - "80:80"   # http
  - "443:443" # https
params:
  db_default_text_search_config: "pg_catalog.english"
  db_shared_buffers: "768MB"
env:
  LC_ALL: en_US.UTF-8
  LANG: en_US.UTF-8
  LANGUAGE: en_US.UTF-8
  EMBER_CLI_PROD_ASSETS: 1
  UNICORN_WORKERS: 4
  DISCOURSE_HOSTNAME: forum.pigsty
  DISCOURSE_DEVELOPER_EMAILS: 'fengruohang@outlook.com,rh@vonng.com'
  DISCOURSE_SMTP_ENABLE_START_TLS: false
  DISCOURSE_SMTP_AUTHENTICATION: login
  DISCOURSE_SMTP_OPENSSL_VERIFY_MODE: none
  DISCOURSE_SMTP_ADDRESS: smtpdm.server.address
  DISCOURSE_SMTP_PORT: 80
  DISCOURSE_SMTP_USER_NAME: no_reply@mail.pigsty.cc
  DISCOURSE_SMTP_PASSWORD: "<password>"
  DISCOURSE_SMTP_DOMAIN: mail.pigsty.cc
volumes:
  - volume:
      host: /var/discourse/shared/standalone
      guest: /shared
  - volume:
      host: /var/discourse/shared/standalone/log/var-log
      guest: /var/log

hooks:
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - git clone https://github.com/discourse/docker_manager.git
run:
  - exec: echo "Beginning of custom commands"
  # - exec: rails r "SiteSetting.notification_email='no_reply@mail.pigsty.cc'"
  - exec: echo "End of custom commands"

Then run the following command to launch Discourse.

./launcher rebuild app

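18.12 - GitLab: Enterprise-Grade Open-Source Code Hosting Platform

How do you self-host GitLab, the enterprise-grade open-source code hosting platform, and use a Pigsty-managed PostgreSQL instance as its backend database?

Installing self-managed GitLab

Example: open-source code repository with GitLab

Follow the GitLab Docker deployment documentation to complete the Docker deployment.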

export GITLAB_HOME=/data/gitlab

sudo docker run --detach \
  --hostname gitlab.example.com \
  --publish 443:443 --publish 80:80 --publish 23:22 \
  --name gitlab \
  --restart always \
  --volume $GITLAB_HOME/config:/etc/gitlab \
  --volume $GITLAB_HOME/logs:/var/log/gitlab \
  --volume $GITLAB_HOME/data:/var/opt/gitlab \
  --shm-size 256m \
  gitlab/gitlab-ee:latest
  
sudo docker exec -it gitlab grep 'Password:' /etc/gitlab/initial_root_password

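19 - Data Analysis

Data analysis and visualization with the Pigsty Grafana & ECharts toolbox

Structure of an Applet

An Applet is a small, self-contained data application that runs on the Pigsty infrastructure.

A Pigsty application typically includes at least one, and possibly all, of the following:

User interface (Grafana dashboard definitions), placed in the ui directory
Data definitions (PostgreSQL DDL files), placed in the sql directory
Data files (assorted resources and files that need to be downloaded), placed in the data directory
Logic scripts (which carry out the application's logic), placed in the bin directory

Pigsty ships with several sample applications:

pglog: analyzes PostgreSQL CSV log samples.
covid: visualizes WHO COVID-19 data so you can review pandemic figures by country.
isd: NOAA ISD, which lets you query observation records from 30,000 global surface weather stations dating back to 1901.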

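Structure of an application

A Pigsty application provides an installation script in its root directory: install or an equivalent shortcut. Run it as the admin user on the admin node. The script inspects the current environment (reading METADB_URL, PIGSTY_HOME, GRAFANA_ENDPOINT, and similar settings) to carry out the installation.

Dashboards tagged APP are usually listed in the App drop-down menu of the navigation on the Pigsty Grafana home page, and dashboards tagged with both APP and Overview are also listed in the home page dashboard navigation.

You can download applications bundled with base data from https://github.com/pgsty/pigsty/releases/download/v1.5.1/app.tgz and install them.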

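19.1 - PGLOG: Built-in PostgreSQL Log Analysis Application

A sample Applet that ships with Pigsty for analyzing PostgreSQL CSV log samples

PGLOG is a sample application bundled with Pigsty. It always uses the pglog.sample table in MetaDB as its data source: load logs into that table, then open the related dashboards.

Pigsty provides some handy commands for fetching CSV logs and loading them into the sample table. The following shortcuts are available by default on the meta node: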

catlog  [node=localhost]  [date=today]   # print CSV logs to stdout
pglog                                    # load CSVLOG from stdin
pglog12                                  # load CSVLOG in PG12 format
pglog13                                  # load CSVLOG in PG13 format
pglog14                                  # load CSVLOG in PG14 format (=pglog)

catlog | pglog                       # analyze today's logs from the current node
catlog node-1 '2021-07-15' | pglog   # analyze the csvlog of node-1 on 2021-07-15

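Next, visit the following links to view the sample log analysis dashboards.

PGLOG Overview: presents the details of the entire CSV log sample, aggregated along multiple dimensions.

PGLOG Session: presents the details of one specific connection in the log sample.

The catlog command fetches the CSV database log of a given node for a given date and writes it to stdout.

By default, catlog fetches today's log from the current node; you can pass arguments to specify the node and date.

Combining pglog with catlog lets you quickly fetch database CSV logs for analysis.

catlog | pglog                       # analyze today's logs from the current node
catlog node-1 '2021-07-15' | pglog   # analyze the csvlog of node-1 on 2021-07-15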
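
To get a feel for the kind of data that flows through this pipeline, the following minimal Python sketch tallies log entries by severity. It is not part of Pigsty: the two sample lines are made up for illustration, and the only assumption about the format is PostgreSQL's standard csvlog column order, in which error_severity is the twelfth column.

```python
import csv
import io
from collections import Counter

# Zero-based position of error_severity in PostgreSQL's csvlog column order.
SEVERITY_COLUMN = 11

# Two made-up log lines for illustration only (not real Pigsty output).
SAMPLE_CSVLOG = (
    '2021-07-15 10:00:00.000 UTC,"dbuser_dba","meta",1234,"127.0.0.1:5000",'
    '60f0.4d2,1,"SELECT",2021-07-15 09:59:00 UTC,3/10,0,LOG,00000,'
    '"statement: SELECT 1"\n'
    '2021-07-15 10:00:01.000 UTC,"dbuser_dba","meta",1234,"127.0.0.1:5000",'
    '60f0.4d2,2,"SELECT",2021-07-15 09:59:00 UTC,3/11,0,ERROR,42P01,'
    '"relation ""foo"" does not exist"\n'
)


def count_by_severity(csv_text):
    """Count csvlog rows per severity level (LOG, ERROR, ...)."""
    # newline="" lets the csv module handle quoted multi-line messages itself.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    return Counter(
        row[SEVERITY_COLUMN] for row in reader if len(row) > SEVERITY_COLUMN
    )


severity_counts = count_by_severity(SAMPLE_CSVLOG)
print(severity_counts)
```

The same function could read the output of catlog from standard input by passing sys.stdin.read() instead of the sample string.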

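19.2 - NOAA ISD: Querying Historical Data from Global Weather Stations

Using the ISD dataset as an example, this shows how to import data into a database

If you have a database and are not sure what to do with it, try this open-source project: Vonng/isd

You can reuse the monitoring system's Grafana directly to interactively browse sub-hourly weather data from nearly 30,000 surface weather stations over the past 120 years.

This is a fully functional data application that can query observation records from 30,000 global surface weather stations dating back to 1901.

Project: https://github.com/Vonng/isd

Online demo: https://demo.pigsty.cc/d/isd-overview

Quick Start

Clone the repository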

git clone https://github.com/Vonng/isd.git; cd isd;

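Prepare a PostgreSQL instance

The PostgreSQL instance should have the PostGIS extension enabled. Pass the database connection information through the PGURL environment variable:

# Pigsty's default admin account is dbuser_dba with password DBUser.DBA
export PGURL='postgres://dbuser_dba:DBUser.DBA@127.0.0.1:5432/meta?sslmode=disable'
psql "${PGURL}" -c 'SELECT 1'  # check that the connection works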
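
As a quick sanity check on what a PGURL encodes, this small Python sketch splits the connection URL into its parts using only the standard library. The helper name describe_pgurl is hypothetical and is not part of the isd project.

```python
from urllib.parse import parse_qs, urlsplit


def describe_pgurl(pgurl):
    """Break a postgres:// URL into user, host, port, database, and options."""
    parts = urlsplit(pgurl)
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "dbname": parts.path.lstrip("/"),
        # parse_qs returns lists; keep the first value of each option.
        "options": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


info = describe_pgurl(
    "postgres://dbuser_dba:DBUser.DBA@127.0.0.1:5432/meta?sslmode=disable"
)
print(info)
```

Seeing the pieces separately makes it easier to spot a wrong port or database name before running the make targets that follow.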

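Fetch and import ISD station metadata

This is a daily-updated set of station metadata containing each station's longitude, latitude, elevation, name, country, province, and so on. Download and import it with the following command.

make reload-station   # downloads the latest station data and then loads it: get-station + load-station

Fetch and import the latest isd.daily data

isd.daily is a daily-updated dataset containing daily observation summaries from weather stations worldwide. Download and import it with the commands below. Note that raw data downloaded directly from the NOAA website must be parsed before it can be loaded, so you need to download or build an ISD data parser.

make get-parser       # download the parser binary from GitHub; alternatively, build it with Go via make build
make reload-daily     # download this year's latest isd.daily data and import it into the database

Load the pre-parsed CSV dataset

The ISD Daily dataset contains some dirty and duplicate data. If you would rather not parse and clean it by hand, a pre-parsed, stable CSV dataset is also provided.

This dataset contains isd.daily data up to 2023-06-24. You can download it and import it into PostgreSQL directly, with no parser needed.

make get-stable       # fetch the stable isd.daily historical dataset from GitHub
make load-stable      # load the downloaded stable historical dataset into the PostgreSQL database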

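Data

Dataset overview

ISD provides four datasets: sub-hourly raw observations, daily summaries, monthly summaries, and yearly summaries.

Dataset	Notes
ISD Hourly	Sub-hourly observation records
ISD Daily	Daily summary statistics
ISD Monthly	Not used, because it can be computed from `isd.daily`
ISD Yearly	Not used, because it can be computed from `isd.daily`

Daily summary dataset

Compressed archive size: 2.8 GB (as of 2023-06-24)
Table size 24 GB, index size 6 GB, about 30 GB in total in PostgreSQL
With timescaledb compression enabled, the total size can be reduced to 4.5 GB.

Sub-hourly observation dataset

Total compressed archive size: 117 GB
After loading into the database: table size 1 TB+, index size 600 GB+, about 1.6 TB in total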

Database schema

Station metadata table

CREATE TABLE isd.station
(
    station    VARCHAR(12) PRIMARY KEY,
    usaf       VARCHAR(6) GENERATED ALWAYS AS (substring(station, 1, 6)) STORED,
    wban       VARCHAR(5) GENERATED ALWAYS AS (substring(station, 7, 5)) STORED,
    name       VARCHAR(32),
    country    VARCHAR(2),
    province   VARCHAR(2),
    icao       VARCHAR(4),
    location   GEOMETRY(POINT),
    longitude  NUMERIC GENERATED ALWAYS AS (Round(ST_X(location)::NUMERIC, 6)) STORED,
    latitude   NUMERIC GENERATED ALWAYS AS (Round(ST_Y(location)::NUMERIC, 6)) STORED,
    elevation  NUMERIC,
    period     daterange,
    begin_date DATE GENERATED ALWAYS AS (lower(period)) STORED,
    end_date   DATE GENERATED ALWAYS AS (upper(period)) STORED
);
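
The two generated columns usaf and wban are simply fixed-width slices of the station ID. The Python sketch below mirrors the substring(station, 1, 6) and substring(station, 7, 5) expressions from the table definition; the function name and the sample ID are illustrative assumptions.

```python
def split_station(station):
    """Split a station ID into its 6-character USAF and 5-character WBAN parts."""
    usaf = station[0:6]    # SQL substring(station, 1, 6): characters 1..6
    wban = station[6:11]   # SQL substring(station, 7, 5): characters 7..11
    return usaf, wban


# Illustrative ID in the 6USAF+5WBAN layout described by the schema comments.
usaf, wban = split_station("54511099999")
print(usaf, wban)
```

Because SQL substring positions are 1-based and Python slices are 0-based, the offsets differ by one even though both select the same characters.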

Daily summary table

CREATE TABLE IF NOT EXISTS isd.daily
(
    station     VARCHAR(12) NOT NULL, -- station number 6USAF+5WBAN
    ts          DATE        NOT NULL, -- observation date
    -- temperature & dew point
    temp_mean   NUMERIC(3, 1),        -- mean temperature ℃
    temp_min    NUMERIC(3, 1),        -- min temperature ℃
    temp_max    NUMERIC(3, 1),        -- max temperature ℃
    dewp_mean   NUMERIC(3, 1),        -- mean dew point ℃
    -- pressure
    slp_mean    NUMERIC(5, 1),        -- sea level pressure (hPa)
    stp_mean    NUMERIC(5, 1),        -- station pressure (hPa)
    -- visibility
    vis_mean    NUMERIC(6),           -- visible distance (m)
    -- wind speed
    wdsp_mean   NUMERIC(4, 1),        -- average wind speed (m/s)
    wdsp_max    NUMERIC(4, 1),        -- max wind speed (m/s)
    gust        NUMERIC(4, 1),        -- max wind gust (m/s) 
    -- precipitation / snow depth
    prcp_mean   NUMERIC(5, 1),        -- precipitation (mm)
    prcp        NUMERIC(5, 1),        -- rectified precipitation (mm)
    sndp        NUMERIC(5, 1),        -- snow depth (mm)
    -- FRSHTT (Fog/Rain/Snow/Hail/Thunder/Tornado)
    is_foggy    BOOLEAN,              -- (F)og
    is_rainy    BOOLEAN,              -- (R)ain or Drizzle
    is_snowy    BOOLEAN,              -- (S)now or pellets
    is_hail     BOOLEAN,              -- (H)ail
    is_thunder  BOOLEAN,              -- (T)hunder
    is_tornado  BOOLEAN,              -- (T)ornado or Funnel Cloud
    -- number of records used in each aggregate
    temp_count  SMALLINT,             -- record count for temp
    dewp_count  SMALLINT,             -- record count for dew point
    slp_count   SMALLINT,             -- record count for sea level pressure
    stp_count   SMALLINT,             -- record count for station pressure
    wdsp_count  SMALLINT,             -- record count for wind speed
    visib_count SMALLINT,             -- record count for visible distance
    -- temperature flags
    temp_min_f  BOOLEAN,              -- aggregate min temperature
    temp_max_f  BOOLEAN,              -- aggregate max temperature
    prcp_flag   CHAR,                 -- precipitation flag: ABCDEFGHI
    PRIMARY KEY (station, ts)
); -- PARTITION BY RANGE (ts);

Sub-hourly raw observation table

ISD Hourly

CREATE TABLE IF NOT EXISTS isd.hourly
(
    station    VARCHAR(12) NOT NULL, -- station id
    ts         TIMESTAMP   NOT NULL, -- timestamp
    -- air
    temp       NUMERIC(3, 1),        -- [-93.2,+61.8]
    dewp       NUMERIC(3, 1),        -- [-98.2,+36.8]
    slp        NUMERIC(5, 1),        -- [8600,10900]
    stp        NUMERIC(5, 1),        -- [4500,10900]
    vis        NUMERIC(6),           -- [0,160000]
    -- wind
    wd_angle   NUMERIC(3),           -- [1,360]
    wd_speed   NUMERIC(4, 1),        -- [0,90]
    wd_gust    NUMERIC(4, 1),        -- [0,110]
    wd_code    VARCHAR(1),           -- code that denotes the character of the WIND-OBSERVATION.
    -- cloud
    cld_height NUMERIC(5),           -- [0,22000]
    cld_code   VARCHAR(2),           -- cloud code
    -- water
    sndp       NUMERIC(5, 1),        -- mm snow
    prcp       NUMERIC(5, 1),        -- mm precipitation
    prcp_hour  NUMERIC(2),           -- precipitation duration in hour
    prcp_code  VARCHAR(1),           -- precipitation type code
    -- sky
    mw_code    VARCHAR(2),           -- manual weather observation code
    aw_code    VARCHAR(2),           -- auto weather observation code
    pw_code    VARCHAR(1),           -- weather code of past period of time
    pw_hour    NUMERIC(2),           -- duration of pw_code period
    -- misc
    -- remark     TEXT,
    -- eqd        TEXT,
    data       JSONB                 -- extra data
) PARTITION BY RANGE (ts);

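Parser

The raw data provided by NOAA ISD is in a highly compressed proprietary format and must be processed by a parser before it can be converted into database table format.

For the Daily and Hourly datasets, two parsers are provided: isdd and isdh. Both take a yearly data archive as input and produce CSV as output, working as a pipeline, as shown below: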

NAME
        isd -- Integrated Surface Dataset Parser

SYNOPSIS
        isd daily   [-i <input|stdin>] [-o <output|stdout>] [-v]
        isd hourly  [-i <input|stdin>] [-o <output|stdout>] [-v] [-d raw|ts-first|hour-first]

DESCRIPTION
        The isd program takes noaa isd daily/hourly raw tarball data as input.
        and generate parsed data in csv format as output. Works in pipe mode

        cat data/daily/2023.tar.gz | bin/isd daily -v | psql ${PGURL} -AXtwqc "COPY isd.daily FROM STDIN CSV;" 

        isd daily  -v -i data/daily/2023.tar.gz  | psql ${PGURL} -AXtwqc "COPY isd.daily FROM STDIN CSV;"
        isd hourly -v -i data/hourly/2023.tar.gz | psql ${PGURL} -AXtwqc "COPY isd.hourly FROM STDIN CSV;"

OPTIONS
        -i  <input>     input file, stdin by default
        -o  <output>    output file, stdout by default
        -p  <profpath>  pprof file path, enable if specified
        -d              de-duplicate rows for hourly dataset (raw, ts-first, hour-first)
        -v              verbose mode
        -h              print help

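User interface

Several Grafana dashboards are provided for exploring the ISD dataset and querying weather stations and historical weather data.

ISD Overview

Global overview with aggregate metrics and weather station navigation.

ISD Country

Shows all weather stations within a single country or region.

ISD Station

Shows detailed information for a single weather station: metadata plus daily, monthly, and yearly summary metrics.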

ISD Detail

Shows the raw sub-hourly observation metrics for a weather station; requires the isd.hourly dataset.

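19.3 - WHO COVID-19 Pandemic Dashboard

A sample Applet that ships with Pigsty for presenting the World Health Organization's official pandemic data

Covid is a sample Applet bundled with Pigsty that presents a dashboard of the World Health Organization's official pandemic data.

You can look up COVID-19 infection and death cases for each country and region, as well as global pandemic trends.

Overview

GitHub repository: https://github.com/pgsty/pigsty-app/tree/master/covid

Online demo: https://demo.pigsty.cc/d/covid

Installation

On the admin node, enter the application directory and run make to complete the installation.

make            # complete all setup steps

Some other subtasks: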

make ui         # install grafana dashboards
make sql        # install database schemas
make download   # download latest data
make load       # load downloaded data into database
make reload     # download latest data and pour it into database

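19.4 - AWS and Alibaba Cloud Server Pricing

Analyzing the price of compute and storage on Alibaba Cloud / AWS (ECS/ESSD)

Overview

GitHub repository: https://github.com/pgsty/pigsty-app/tree/master/cloud

Online demo: https://demo.pigsty.cc/d/ecs

Article: 《剖析算力成本：阿里云真降价了吗？》 (Dissecting Compute Costs: Did Alibaba Cloud Really Cut Prices?)

Data sources

Raw CSV data for Aliyun ECS prices can be obtained from Price Calculator - Pricing Details - Price Download.

Schema

Download the Alibaba Cloud price details and import them for analysis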

CREATE EXTENSION file_fdw;
CREATE SERVER fs FOREIGN DATA WRAPPER file_fdw;

DROP FOREIGN TABLE IF EXISTS aliyun_ecs CASCADE;
CREATE FOREIGN TABLE aliyun_ecs
    (
        "region" text,
        "system" text,
        "network" text,
        "isIO" bool,
        "instanceId" text,
        "hourlyPrice" numeric,
        "weeklyPrice" numeric,
        "standard" numeric,
        "monthlyPrice" numeric,
        "yearlyPrice" numeric,
        "2yearPrice" numeric,
        "3yearPrice" numeric,
        "4yearPrice" numeric,
        "5yearPrice" numeric,
        "id" text,
        "instanceLabel" text,
        "familyId" text,
        "serverType" text,
        "cpu" text,
        "localStorage" text,
        "NvmeSupport" text,
        "InstanceFamilyLevel" text,
        "EniTrunkSupported" text,
        "InstancePpsRx" text,
        "GPUSpec" text,
        "CpuTurboFrequency" text,
        "InstancePpsTx" text,
        "InstanceTypeId" text,
        "GPUAmount" text,
        "InstanceTypeFamily" text,
        "SecondaryEniQueueNumber" text,
        "EniQuantity" text,
        "EniPrivateIpAddressQuantity" text,
        "DiskQuantity" text,
        "EniIpv6AddressQuantity" text,
        "InstanceCategory" text,
        "CpuArchitecture" text,
        "EriQuantity" text,
        "MemorySize" numeric,
        "EniTotalQuantity" numeric,
        "PhysicalProcessorModel" text,
        "InstanceBandwidthRx" numeric,
        "CpuCoreCount" numeric,
        "Generation" text,
        "CpuSpeedFrequency" numeric,
        "PrimaryEniQueueNumber" text,
        "LocalStorageCategory" text,
        "InstanceBandwidthTx" text,
        "TotalEniQueueQuantity" text
        ) SERVER fs OPTIONS ( filename '/tmp/aliyun-ecs.csv', format 'csv',header 'true');

The same applies to AWS EC2; the price list can be downloaded from Vantage:


DROP FOREIGN TABLE IF EXISTS aws_ec2 CASCADE;
CREATE FOREIGN TABLE aws_ec2
    (
        "name" TEXT,
        "id" TEXT,
        "Memory" TEXT,
        "vCPUs" TEXT,
        "GPUs" TEXT,
        "ClockSpeed" TEXT,
        "InstanceStorage" TEXT,
        "NetworkPerformance" TEXT,
        "ondemand" TEXT,
        "reserve" TEXT,
        "spot" TEXT
        ) SERVER fs OPTIONS ( filename '/tmp/aws-ec2.csv', format 'csv',header 'true');



DROP VIEW IF EXISTS ecs;
CREATE VIEW ecs AS
SELECT "region"                                       AS region,
       "id"                                           AS id,
       "instanceLabel"                                AS name,
       "familyId"                                     AS family,
       "CpuCoreCount"                                 AS cpu,
       "MemorySize"                                   AS mem,
       round("5yearPrice" / "CpuCoreCount" / 60, 2)   AS ycm5, -- ¥ / (core·month)
       round("4yearPrice" / "CpuCoreCount" / 48, 2)   AS ycm4, -- ¥ / (core·month)
       round("3yearPrice" / "CpuCoreCount" / 36, 2)   AS ycm3, -- ¥ / (core·month)
       round("2yearPrice" / "CpuCoreCount" / 24, 2)   AS ycm2, -- ¥ / (core·month)
       round("yearlyPrice" / "CpuCoreCount" / 12, 2)  AS ycm1, -- ¥ / (core·month)
       round("standard" / "CpuCoreCount", 2)          AS ycmm, -- ¥ / (core·month)
       round("hourlyPrice" / "CpuCoreCount" * 720, 2) AS ycmh, -- ¥ / (core·month)
       "CpuSpeedFrequency"::NUMERIC                   AS freq,
       "CpuTurboFrequency"::NUMERIC                   AS freq_turbo,
       "Generation"                                   AS generation
FROM aliyun_ecs
WHERE system = 'linux';

DROP VIEW IF EXISTS ec2;
CREATE VIEW ec2 AS
SELECT id,
       name,
       split_part(id, '.', 1)                                                               as family,
       split_part(id, '.', 2)                                                               as spec,
       (regexp_match(split_part(id, '.', 1), '^[a-zA-Z]+(\d)[a-z0-9]*'))[1]                 as gen,
       regexp_substr("vCPUs", '^[0-9]+')::int                                               as cpu,
       regexp_substr("Memory", '^[0-9]+')::int                                              as mem,
       CASE spot
           WHEN 'unavailable' THEN NULL
           ELSE round((regexp_substr("spot", '([0-9]+.[0-9]+)')::NUMERIC * 7.2), 2) END     AS spot,
       CASE ondemand
           WHEN 'unavailable' THEN NULL
           ELSE round((regexp_substr("ondemand", '([0-9]+.[0-9]+)')::NUMERIC * 7.2), 2) END AS ondemand,
       CASE reserve
           WHEN 'unavailable' THEN NULL
           ELSE round((regexp_substr("reserve", '([0-9]+.[0-9]+)')::NUMERIC * 7.2), 2) END  AS reserve,
       "ClockSpeed"                                                                         AS freq
FROM aws_ec2;
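
The arithmetic inside the two views can be sketched in Python to make the units explicit. This is an illustration under stated assumptions, not code from the pigsty-app repository: the function names are hypothetical, the 7.2 factor is assumed to be a USD to CNY exchange rate, and the "$0.1920 hourly" price string is an assumed example of what the Vantage CSV might contain.

```python
import re

USD_TO_CNY = 7.2        # assumption: the 7.2 factor in the ec2 view is an exchange rate
HOURS_PER_MONTH = 720   # 24 hours * 30 days, as used by the ycmh column


def core_month_from_term(total_price, cores, months):
    """Price per core per month for a prepaid term, e.g. 5yearPrice / cores / 60."""
    return round(total_price / cores / months, 2)


def core_month_from_hourly(hourly_price, cores):
    """Price per core per month when paying by the hour."""
    return round(hourly_price / cores * HOURS_PER_MONTH, 2)


def parse_ec2(instance_id, price_text):
    """Mimic the ec2 view: split the instance id and convert the price text."""
    parts = instance_id.split(".")
    family = parts[0]
    spec = parts[1] if len(parts) > 1 else ""
    gen_match = re.match(r"^[a-zA-Z]+(\d)[a-z0-9]*", family)
    price_match = re.search(r"[0-9]+\.[0-9]+", price_text)  # no match for 'unavailable'
    return {
        "family": family,
        "spec": spec,
        "gen": gen_match.group(1) if gen_match else None,
        "price_cny": round(float(price_match.group(0)) * USD_TO_CNY, 2)
        if price_match else None,
    }


print(core_month_from_term(6000, 4, 60))
print(core_month_from_hourly(0.5, 2))
print(parse_ec2("m5.xlarge", "$0.1920 hourly"))
```

Normalizing every price to a per-core, per-month figure is what makes instances of different sizes and billing terms directly comparable.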

Visualization

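19.5 - Solving the 24 Game with a Single SQL Statement

Produce the expression that solves the 24 card game with a single SQL statement: a PostgreSQL solution.

Problem

The problem is as follows: 《数据库编程大赛：一条SQL计算扑克牌24点》 (Database Programming Contest: Solve the 24 Card Game with One SQL Statement)

There is a table cards whose id is an auto-increment numeric primary key, plus four fields c1, c2, c3, c4, each holding an integer chosen at random from 1 to 10. Contestants must use a single SQL statement to produce a formula that evaluates to 24. The original problem statement illustrates the expected output with a figure.

The result field holds the computed expression. Only one solution needs to be returned; if there is no solution, result is null.

Rules of the 24 game: only the four arithmetic operations (addition, subtraction, multiplication, division) may be used; factorials, exponents, and other operators are not allowed; each number must be used exactly once; parentheses may be used to change precedence.
Only a single SQL statement may be used. Built-in database functions are allowed, but stored procedures, user-defined functions, and code blocks are not.
Contestants verify SQL correctness themselves on the NineData platform's demo database or on their own database. The committee's evaluation server has a 4-core CPU and 32 GB of memory.
Contestants take part in good faith and may not submit other people's code. If similar code is found, the working group treats the first submission as the valid entry.
Each contestant may submit code at most 3 times.
The submitted SQL must not exceed 10 KB in size.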
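
To make the game rules concrete, here is a minimal brute-force solver in Python. It is only a sketch of the rules stated above (four operations, each number used exactly once, any parenthesization) and is not the contest solution; the name solve24 is hypothetical. Exact fractions are used so that hands such as 3, 3, 8, 8, which need a fractional intermediate value, are handled correctly.

```python
from fractions import Fraction


def solve24(cards, target=24):
    """Return one expression over the cards that equals target, or None."""
    items = [(Fraction(card), str(card)) for card in cards]
    return _search(items, Fraction(target))


def _search(items, target):
    # Base case: one value left, check whether it hits the target.
    if len(items) == 1:
        value, expr = items[0]
        return expr if value == target else None
    # Pick an ordered pair, combine it with each operator, and recurse.
    for i in range(len(items)):
        for j in range(len(items)):
            if i == j:
                continue
            (a, expr_a), (b, expr_b) = items[i], items[j]
            rest = [items[k] for k in range(len(items)) if k not in (i, j)]
            candidates = [
                (a + b, f"({expr_a}+{expr_b})"),
                (a - b, f"({expr_a}-{expr_b})"),
                (a * b, f"({expr_a}*{expr_b})"),
            ]
            if b != 0:  # skip division by zero
                candidates.append((a / b, f"({expr_a}/{expr_b})"))
            for value, expr in candidates:
                found = _search(rest + [(value, expr)], target)
                if found is not None:
                    return found
    return None


print(solve24([1, 1, 1, 8]))
print(solve24([3, 3, 8, 8]))
print(solve24([1, 1, 1, 1]))
```

A search like this is cheap for one hand, but running it per row inside a single SQL statement is impractical, which is why a precomputed lookup table is attractive for the contest.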

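Run by MySQL veterans, this NineData contest quietly tilts the playing field far in MySQL's favor. Why say that?

Because the 10 KB size limit is rather sneaky: the fastest solutions all use a prime-keyed lookup table, and the concatenated text of all solutions in that approach is about 10,018 characters. To squeeze that table under 10 KB, some compression tricks are required.

MySQL comes with COMPRESS and UNCOMPRESS functions, whereas PostgreSQL does not have them natively and needs the pgsql-gzip extension, which is not available on the NineData contest platform.

Below is a solution using PostgreSQL:

Create a table of random test data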

CREATE SCHEMA poker24;
DROP TABLE IF EXISTS poker24.cards;
CREATE TABLE poker24.cards AS
SELECT i                   AS id,
       ceil(random() * 10) AS c1,
       ceil(random() * 10) AS c2,
       ceil(random() * 10) AS c3,
       ceil(random() * 10) AS c4
FROM generate_series(1, 1000000) i;

ALTER TABLE poker24.cards ADD PRIMARY KEY (id);

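Solution

The basic idea is prime encoding: each card value maps to a distinct prime, and the product of the four primes assigns every possible hand a unique key, so the 24-point expression can be looked up quickly: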

EXPLAIN ANALYZE
WITH a(i, result) AS (
    SELECT (split_part(kv, ':', 1))::INTEGER AS i, split_part(kv, ':', 2) AS result
    FROM regexp_split_to_table('152:((1+1)+1)*8,156:(6*2)*(1+1),204:(7+1)*(2+1),228:((1*1)+2)*8,276:(9-1)*(2+1),348:(10+2)*(1+1),140:(4*3)*(1+1),220:(5+1)*(3+1),260:((1+1)+6)*3,340:((1*1)+7)*3,380:(8*3)+(1-1),460:(9+3)*(1+1),580:(10-(1+1))*3,196:((1+1)+4)*4,308:((1*1)+5)*4,364:(6*4)+(1-1),476:(7-(1*1))*4,532:(8+4)*(1+1),644:(9-1)*(4-1),812:((1+1)*10)+4,484:(5*5)-(1*1),572:(5-(1*1))*6,748:(7+5)*(1+1),836:(5-(1+1))*8,676:(6+6)*(1+1),988:(8*6)/(1+1),1196:((1+1)*9)+6,1972:((1+1)*7)+10,1444:((1+1)*8)+8,126:(4*2)*(2+1),198:(2+2)*(5+1),234:(6+2)*(2+1),306:(2+2)*(7-1),342:((2-1)+2)*8,414:((2+1)+9)*2,522:(10-2)*(2+1),150:(3*2)*(3+1),210:((2+1)+3)*4,330:(5+3)*(2+1),390:((2-1)+3)*6,510:(7*3)+(2+1),570:(8*3)*(2-1),690:(9*3)-(2+1),870:(10-(2*1))*3,294:(4+4)*(2+1),462:((2-1)+5)*4,546:(6*4)*(2-1),714:(7-(2-1))*4,798:(4-(2-1))*8,966:(9-(2+1))*4,1218:((2*1)*10)+4,726:(5*5)-(2-1),858:(5-(2-1))*6,1122:(7+5)*(2*1),1254:(5-(2*1))*8,1518:((2+1)*5)+9,1914:(10*2)+(5-1),1014:((2+1)*6)+6,1326:(7-(2+1))*6,1482:(6-(2+1))*8,1794:((2*1)*9)+6,2262:((2+1)*10)-6,1734:((7*7)-1)/2,1938:(8*2)+(7+1),2346:(9*2)+(7-1),2958:((2*1)*7)+10,2166:((2*1)*8)+8,2622:(9*8)/(2+1),3306:((8-1)*2)+10,250:(3+3)*(3+1),350:((3+1)+4)*3,550:(5+3)*(3*1),650:((3-1)+6)*3,850:(7*3)+(3*1),950:((8+1)*3)-3,1150:(9-3)*(3+1),1450:(10-(3-1))*3,490:((3-1)+4)*4,770:(5*4)+(3+1),910:6/(1-(3/4)),1190:(7*4)-(3+1),1330:((3+1)*4)+8,1610:(9-(3*1))*4,2030:(10-4)*(3+1),1430:(6*3)+(5+1),1870:(7+5)*(3-1),2090:(5-(3-1))*8,2530:((3*1)*5)+9,3190:(10*3)-(5+1),1690:(6+6)*(3-1),2210:(7-(3*1))*6,2470:(8-(3+1))*6,2990:((3-1)*9)+6,3770:((3*1)*10)-6,2890:(7-3)*(7-1),3230:(7-(3+1))*8,3910:(9/3)*(7+1),4930:((3-1)*7)+10,3610:((3+1)*8)-8,4370:(9*8)/(3*1),5510:(8/3)*(10-1),5290:(9/3)*(9-1),6670:((10+1)*3)-9,8410:(10+10)+(3+1),686:((4+1)*4)+4,1078:(5*4)+(4*1),1274:((6+1)*4)-4,1666:(7*4)-(4*1),1862:((4*1)*4)+8,2254:(9-(4-1))*4,2842:(10-4)*(4*1),1694:(5*4)+(5-1),2002:6/((5/4)-1),2618:(7*4)-(5-1),2926:(8-4)*(5+1),3542:((4-1)*5)+9,4466:(10-4)*(5-1),2366:((4+1)*6)-6
,3094:(7-(4-1))*6,3458:(6-(4-1))*8,4186:(9-(4+1))*6,5278:((4-1)*10)-6,4046:(7-4)*(7+1),4522:(7-(4*1))*8,5474:(7-4)*(9-1),5054:(8-(4+1))*8,6118:(9*8)/(4-1),9338:(10+9)+(4+1),11774:(10+10)+(4*1),2662:(5-(1/5))*5,3146:(6*5)-(5+1),5566:(9-5)*(5+1),7018:((10-5)*5)-1,3718:((5*1)*6)-6,4862:(6*5)-(7-1),5434:(8-(5-1))*6,6578:(9-(5*1))*6,8294:(10-6)*(5+1),7106:(7-(5-1))*8,8602:(9-5)*(7-1),10846:(7*5)-(10+1),7942:((5-1)*8)-8,9614:(9-(5+1))*8,12122:(10+8)+(5+1),11638:(9+9)+(5+1),14674:(10+9)+(5*1),18502:(10+10)+(5-1),4394:((6-1)*6)-6,6422:6/(1-(6/8)),7774:(9-(6-1))*6,9802:(10-6)*(6*1),10166:(9-6)*(7+1),12818:(10+7)+(6+1),9386:(8-(6-1))*8,11362:(9+8)+(6+1),14326:(10-(6+1))*8,13754:(9+9)+(6*1),17342:(10+9)+(6-1),13294:(9+7)+(7+1),16762:(10-7)*(7+1),12274:(8+8)+(7+1),14858:(9-(7-1))*8,18734:(10+8)+(7-1),17986:(9+9)+(7-1),22678:(10-7)*(9-1),13718:(8+8)+(8*1),16606:(9+8)+(8-1),20938:(10-(8-1))*8,135:(3*2)*(2+2),189:(4+2)*(2+2),297:((5*2)+2)*2,459:((7*2)-2)*2,513:(8-2)*(2+2),621:((9+2)*2)+2,783:(10*2)+(2+2),225:(3+3)*(2+2),315:((2+2)+4)*3,495:((5*2)-2)*3,585:((2/2)+3)*6,765:((2/2)+7)*3,855:(8*3)+(2-2),1035:(9-3)*(2+2),1305:((10+3)*2)-2,441:((4*2)-2)*4,693:(5*4)+(2+2),819:(6*4)+(2-2),1071:(7*4)-(2+2),1197:((2+2)*4)+8,1449:(9*2)+(4+2),1827:(10-4)*(2+2),1089:(5*5)-(2/2),1287:(5-(2/2))*6,1683:(7*2)+(5*2),1881:((8+5)*2)-2,2277:((5-2)+9)*2,2871:((5+2)*2)+10,1521:(6/2)*(6+2),1989:((7+2)*2)+6,2223:(8-(2+2))*6,2691:((6/2)+9)*2,3393:(10*2)+(6-2),2601:((7-2)+7)*2,2907:(7-(2+2))*8,4437:((10/2)+7)*2,3249:((2+2)*8)-8,3933:(9*2)+(8-2),4959:(10-2)+(8*2),6003:((9-2)*2)+10,7569:(10+10)+(2+2),375:((3+2)+3)*3,825:((5+2)*3)+3,975:((3-2)+3)*6,1275:((3-2)+7)*3,1425:(8*3)*(3-2),1725:((3+2)*3)+9,2175:(10*3)-(3*2),735:((3+2)*4)+4,1155:((3-2)+5)*4,1365:(6*4)*(3-2),1785:(7-(3-2))*4,1995:(4-(3-2))*8,2415:(9*4)/(3/2),3045:(10*3)-(4+2),1815:(5*5)-(3-2),2145:(5-(3-2))*6,2805:(7*3)+(5-2),3135:(5+3)+(8*2),3795:(9-5)*(3*2),4785:(5-3)*(10+2),2535:((3+2)*6)-6,3315:(7*3)+(6/2),3705:((8+2)*3)-6,4485:(9-(3+2))*6,5655:(10-6
)*(3*2),4335:(7+3)+(7*2),4845:(8/3)*(7+2),5865:(9+7)*(3/2),7395:(7-3)+(10*2),5415:(8-(3+2))*8,6555:(9-(3*2))*8,8265:(10+8)+(3*2),7935:(9+9)+(3*2),10005:(10+9)+(3+2),12615:((10-3)*2)+10,1029:((4-2)+4)*4,1617:((5+2)*4)-4,1911:((4*2)-4)*6,2499:(7-4)*(4*2),2793:(8-4)*(4+2),3381:((9-2)*4)-4,4263:((4-2)*10)+4,2541:((5+5)*2)+4,3003:(6*5)-(4+2),3927:(7+5)*(4-2),4389:(5-(4-2))*8,5313:(9-5)*(4+2),6699:(10+4)+(5*2),3549:(6+6)*(4-2),4641:(7-4)*(6+2),5187:(8*6)/(4-2),6279:((4-2)*9)+6,7917:(10-6)*(4+2),6069:((7+7)*2)-4,6783:((7*2)-8)*4,8211:(9+7)+(4*2),10353:((4-2)*7)+10,7581:((4-2)*8)+8,9177:(9-(4+2))*8,11571:(10+8)+(4+2),11109:(9+9)+(4+2),14007:(10-4)+(9*2),17661:((4/10)+2)*10,6171:(5+5)+(7*2),6897:((5/5)+2)*8,8349:((5-2)*5)+9,10527:(5-(2/10))*5,5577:((5-2)*6)+6,7293:(7-(5-2))*6,8151:(6-(5-2))*8,9867:((5/2)*6)+9,12441:((5-2)*10)-6,9537:(7+7)+(5*2),10659:((5*2)-7)*8,12903:(7*5)-(9+2),16269:(10+7)+(5+2),11913:(8*5)-(8*2),14421:(9+8)+(5+2),18183:(10-(5+2))*8,22011:(9-5)+(10*2),27753:(10/5)*(10+2),6591:(6+6)+(6*2),8619:(7-(6/2))*6,9633:(8-(6-2))*6,11661:(9-6)*(6+2),14703:(10+6)+(6+2),12597:(7-(6-2))*8,15249:(9+7)+(6+2),19227:(10-7)*(6+2),14079:(8+8)+(6+2),17043:((6*2)-9)*8,21489:(10-8)*(6*2),20631:((9-6)+9)*2,26013:(9-6)*(10-2),32799:(10+10)+(6-2),16473:(8+7)+(7+2),25143:((10/7)+2)*7,18411:(8-(7-2))*8,22287:((9+7)*2)-8,34017:(10+9)+(7-2),42891:(10-7)*(10-2),20577:((8/2)*8)-8,24909:(9-(8-2))*8,31407:(10+8)+(8-2),30153:(9+9)+(8-2),38019:(10-(9-2))*8,47937:(10+10)+(8/2),58029:(10+9)+(10/2),625:((3*3)*3)-3,875:((3*3)-3)*4,1375:(5*3)+(3*3),1625:(6*3)+(3+3),2125:(7-3)*(3+3),2375:((3+3)-3)*8,2875:(9-(3/3))*3,3625:(10*3)-(3+3),1225:(4*3)+(4*3),1925:((3/3)+5)*4,2275:(6*4)+(3-3),2975:(7-(3/3))*4,3325:(8-4)*(3+3),4025:(9-(4-3))*3,3025:(5*5)-(3/3),3575:(6*5)-(3+3),4675:((5*3)-7)*3,6325:(9-5)*(3+3),7975:(10+5)+(3*3),4225:((6/3)+6)*3,5525:(7*3)+(6-3),6175:((3*3)-6)*8,7475:(9+6)+(3*3),9425:(10-6)*(3+3),7225:((3/7)+3)*7,8075:(8+7)+(3*3),9775:(9-3)*(7-3),9025:8/(3-(8/3)),10925:(9-(3+3))*8,13775:(10
+8)+(3+3),13225:(9+9)+(3+3),16675:(10*3)-(9-3),1715:((4+3)*4)-4,2695:((4-3)+5)*4,3185:(6*4)*(4-3),4165:(7-(4-3))*4,4655:((4+3)-4)*8,5635:(9*4)-(4*3),7105:((10-3)*4)-4,4235:(5*5)-(4-3),5005:(5-(4-3))*6,6545:(7+5)+(4*3),7315:(8*4)-(5+3),8855:((5*3)-9)*4,11165:(10/5)*(4*3),5915:(6+6)+(4*3),8645:(8-6)*(4*3),10465:(9-(6-3))*4,13195:(10-4)+(6*3),10115:(7*4)-(7-3),11305:((7-3)*4)+8,13685:(9-7)*(4*3),17255:(10+7)+(4+3),15295:(9+8)+(4+3),19285:(10-(4+3))*8,18515:(9+9)*(4/3),29435:(10*3)-(10-4),7865:((5+5)*3)-6,10285:(7+5)*(5-3),11495:(8-5)*(5+3),13915:(9-(5/5))*3,9295:(6+6)*(5-3),12155:(7+5)*(6/3),13585:(8*6)/(5-3),16445:(9-6)*(5+3),20735:(10+6)+(5+3),17765:(8-5)+(7*3),21505:(9+7)+(5+3),27115:(10-7)*(5+3),19855:(8+8)+(5+3),24035:(9*3)-(8-5),29095:((5/3)*9)+9,36685:(10/5)*(9+3),46255:(10-(10/5))*3,10985:((6-3)*6)+6,14365:(7-(6-3))*6,16055:((6+3)-6)*8,19435:(9+6)+(6+3),24505:((6-3)*10)-6,18785:((7-6)+7)*3,20995:(8+7)+(6+3),25415:(9-6)+(7*3),32045:((6/3)*7)+10,23465:((6/3)*8)+8,28405:(9*8)/(6-3),35815:(10-(8-6))*3,34385:(9*3)-(9-6),43355:(10-6)*(9-3),54665:(3-(6/10))*10,24565:(7+7)+(7+3),27455:((7+3)-7)*8,33235:(9-(7/7))*3,41905:(10-7)+(7*3),30685:((7-3)*8)-8,37145:(9-(8-7))*3,44965:(9-7)*(9+3),56695:(9*3)-(10-7),71485:(10+10)+(7-3),34295:((8+3)-8)*8,41515:(9-8)*(8*3),52345:((10*8)-8)/3,50255:(9-9)+(8*3),63365:(10+9)+(8-3),79895:(10-10)+(8*3),60835:(9+9)+(9-3),76705:((9+9)-10)*3,96715:(9-(10/10))*3,2401:(4*4)+(4+4),3773:((4/4)+5)*4,4459:((4+4)-4)*6,5831:(7-4)*(4+4),6517:(8*4)-(4+4),7889:((9-4)*4)+4,9947:(10*4)-(4*4),5929:(5*5)-(4/4),7007:(5-(4/4))*6,9163:(7-(5-4))*4,10241:(8-5)*(4+4),15631:((10-5)*4)+4,12103:(8+4)*(6-4),14651:(9-6)*(4+4),18473:(10+6)+(4+4),14161:(4-(4/7))*7,15827:(7*4)-(8-4),19159:(9+7)+(4+4),24157:(10-7)*(4+4),17689:(8+8)+(4+4),21413:(9*4)-(8+4),26999:(10-4)*(8-4),41209:((10*10)-4)/4,9317:(5*5)-(5-4),11011:((5+4)-5)*6,14399:(7-(5/5))*4,16093:(4-(5/5))*8,19481:(9-5)+(5*4),24563:(10+5)+(5+4),13013:(6-5)*(6*4),17017:(7+5)*(6-4),19019:((5+4)-6)*8,23023:(9+6)+(5+4)
,29029:(10-6)+(5*4),22253:(7*5)-(7+4),24871:(8+7)+(5+4),30107:((7-4)*5)+9,37961:((7-5)*10)+4,27797:(5-(8/4))*8,33649:(9-(8-5))*4,42427:(10/5)*(8+4),40733:((9/9)+5)*4,51359:(9-5)*(10-4),64757:((10/5)*10)+4,15379:((6+4)-6)*6,20111:(7-6)*(6*4),22477:(8+6)+(6+4),27209:((6-4)*9)+6,34307:(10+6)*(6/4),26299:(7+7)+(6+4),29393:((6+4)-7)*8,35581:(9+7)*(6/4),44863:((6-4)*7)+10,32851:((6-4)*8)+8,39767:(9-8)*(6*4),50141:((8-6)*10)+4,48139:(9-9)+(6*4),60697:(10-9)*(6*4),76531:(10-10)+(6*4),34391:(7-(7/7))*4,38437:((7+7)-8)*4,42959:((7+4)-8)*8,52003:(9*8)/(7-4),65569:((7/4)*8)+10,62951:(7-(9/9))*4,79373:(10*4)-(9+7),100079:(7-(10/10))*4,48013:((8-4)*8)-8,58121:((8+4)-9)*8,73283:(10-8)*(8+4),70357:(4-(9/9))*8,88711:((9+4)-10)*8,111853:(10+10)+(8-4),107387:(10+9)+(9-4),14641:(5*5)-(5/5),17303:(5*5)-(6-5),30613:(9+5)+(5+5),20449:((5+5)-6)*6,26741:(5*5)-(7-6),29887:(8+6)+(5+5),34969:(7+7)+(5+5),39083:((5+5)-7)*8,59653:(10/5)*(7+5),43681:(5*5)-(8/8),52877:(5*5)-(9-8),66671:(10+5)*(8/5),64009:(5*5)-(9/9),80707:(5*5)-(10-9),101761:(5*5)-(10/10),24167:(5-(6/6))*6,31603:(7+6)+(6+5),35321:((8-5)*6)+6,42757:(9*6)-(6*5),53911:((10-5)*6)-6,41327:(5-(7/7))*6,46189:(8-6)*(7+5),55913:((7-5)*9)+6,51623:((6+5)-8)*8,62491:((8+5)-9)*6,78793:(6*5)/(10/8),75647:((9-6)*5)+9,95381:((9+5)-10)*6,120263:(10+10)*(6/5),73117:(9-7)*(7+5),92191:((7-5)*7)+10,67507:((7-5)*8)+8,81719:((7+5)-9)*8,103037:(10-8)*(7+5),124729:((10-7)*5)+9,157267:((7/5)*10)+10,75449:(8*5)-(8+8),91333:(9*8)/(8-5),115159:((8+5)-10)*8,212773:(10+10)+(9-5),28561:(6+6)+(6+6),41743:(8-6)*(6+6),50531:(6*6)/(9/6),63713:(10*6)-(6*6),66079:(9-7)*(6+6),83317:((10-7)*6)+6,61009:(8*6)/(8-6),73853:((6+6)-9)*8,93119:(10-8)*(6+6),112723:((9-6)*10)-6,108953:((7+7)-10)*6,96577:(8*6)/(9-7),121771:((7+6)-10)*8,116909:(7*6)-(9+9),185861:((10-7)*10)-6,89167:((8-6)*8)+8,107939:(9*8)-(8*6),136097:(8*6)/(10-8),130663:(9+9)*(8/6),164749:((10-8)*9)+6,199433:((9/6)*10)+9,317057:(10+10)+(10-6),192763:((9-7)*7)+10,141151:((9-7)*8)+8,177973:(10*8)-(8*7),215441:(9*8)
/(10-7),271643:((10-8)*7)+10,198911:((10-8)*8)+8', ',') AS kv
)
SELECT c.id, c1, c2, c3, c4, result
FROM poker24.cards c LEFT JOIN a a ON a.i =
( CASE c1 WHEN 1 THEN 2 WHEN 2 THEN 3 WHEN 3 THEN 5 WHEN 4 THEN 7 WHEN 5 THEN 11 WHEN 6 THEN 13 WHEN 7 THEN 17 WHEN 8 THEN 19 WHEN 9 THEN 23 WHEN 10 THEN 29 END
* CASE c2 WHEN 1 THEN 2 WHEN 2 THEN 3 WHEN 3 THEN 5 WHEN 4 THEN 7 WHEN 5 THEN 11 WHEN 6 THEN 13 WHEN 7 THEN 17 WHEN 8 THEN 19 WHEN 9 THEN 23 WHEN 10 THEN 29 END
* CASE c3 WHEN 1 THEN 2 WHEN 2 THEN 3 WHEN 3 THEN 5 WHEN 4 THEN 7 WHEN 5 THEN 11 WHEN 6 THEN 13 WHEN 7 THEN 17 WHEN 8 THEN 19 WHEN 9 THEN 23 WHEN 10 THEN 29 END
* CASE c4 WHEN 1 THEN 2 WHEN 2 THEN 3 WHEN 3 THEN 5 WHEN 4 THEN 7 WHEN 5 THEN 11 WHEN 6 THEN 13 WHEN 7 THEN 17 WHEN 8 THEN 19 WHEN 9 THEN 23 WHEN 10 THEN 29 END);
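
The prime-encoding trick in the query above can be sketched in Python. The prime mapping and the three lookup entries are copied from the SQL; the names PRIMES, LOOKUP, hand_key, and lookup are illustrative. Because multiplication is commutative and prime factorizations are unique, every ordering of the same four cards yields the same key, and different hands never collide.

```python
# Card value -> prime, exactly as in the CASE expressions of the SQL query.
PRIMES = {1: 2, 2: 3, 3: 5, 4: 7, 5: 11, 6: 13, 7: 17, 8: 19, 9: 23, 10: 29}

# A tiny excerpt of the key -> expression table embedded in the SQL string.
LOOKUP = {
    152: "((1+1)+1)*8",
    156: "(6*2)*(1+1)",
    9025: "8/(3-(8/3))",
}


def hand_key(c1, c2, c3, c4):
    """Order-independent key for a hand: the product of the cards' primes."""
    return PRIMES[c1] * PRIMES[c2] * PRIMES[c3] * PRIMES[c4]


def lookup(c1, c2, c3, c4):
    """Return the stored expression for a hand, or None (like the LEFT JOIN)."""
    return LOOKUP.get(hand_key(c1, c2, c3, c4))


print(hand_key(1, 1, 1, 8), lookup(8, 1, 1, 1))
print(lookup(3, 8, 3, 8))
print(lookup(1, 1, 1, 1))
```

Sorting the four cards and concatenating them would also give an order-independent key, but a single integer product is compact and can be computed inline in the join condition.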

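Of course, the SQL statement above is longer than 10,000 characters: 10,896 in total. There are ways to shrink it, such as turning the very long CASE expression into an inline function and switching the keys from decimal literals to hexadecimal, which would already bring the length under 10 KB. However, the rules forbid stored procedures, so another approach is needed. The main question is how to compress the long string in the middle.

Compression optimization

Because the statement exceeds the limit at 10,896 characters, extra compression is needed to meet the requirement. Pigsty provides the pgsql-gzip extension out of the box: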

CREATE EXTENSION IF NOT EXISTS gzip;

Then we compress the lookup table above: its 10,018 characters shrink to 7,796, giving a total length of 8,796, which satisfies the requirement.
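
The compress-then-hex-encode idea can be illustrated with Python's standard gzip module. This is a sketch of the general technique, not the procedure used to build the literal in this document: the variable names are hypothetical and the input is only the first few entries of the lookup table.

```python
import gzip

# First few entries of the lookup table, used here as a stand-in for the full text.
table_text = "152:((1+1)+1)*8,156:(6*2)*(1+1),204:(7+1)*(2+1)"

# Compress, then write the bytes as a PostgreSQL-style hex bytea literal.
packed = gzip.compress(table_text.encode("ascii"))
hex_literal = "\\x" + packed.hex().upper()

# Reverse the process: strip the \x prefix, decode hex, decompress.
restored = gzip.decompress(bytes.fromhex(hex_literal[2:])).decode("ascii")

print(len(table_text), len(packed), len(hex_literal))
print(restored == table_text)
```

Hex encoding spends two characters per compressed byte, so the approach only pays off when gzip more than halves the text; a short sample like this one will not shrink, whereas the full, highly repetitive table does. The statement below wraps such a hex literal in gunzip to recover the table at query time.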

WITH a(i, result) AS (SELECT (split_part(kv, ':', 1))::INTEGER AS i, split_part(kv, ':', 2) AS result
FROM regexp_split_to_table(encode(gunzip('\x1F8B08000000000000034D5A5B722B2B0CDC4EE278CABC24E0EE7F6157DD2DF0A9CA47860109F468B518576BFFFDFCD4BFFA1B7FAFF5AEE6FFFDF8ABFDBE38F86E65FCF733F1EEA7F1B92DCC7FC5FC86F96DC6FCFDDCF77DC4FB5AFEAE803ACA7F3FE3D5AFC016CF46819DCF5ECE06FCF7D54340390A269F573CAF58FFF75343CD7B60FEFEBBF20CEF6B79F8840575FB11387E5FE3DDCBDDB1F1D9074E38AE409C603E9C81F7D6C3220B6BA5C0C738271C98BFEAB1D8AB96D0F11E2B26D8CB7E25E36D3326D811E8EF09934C2897C0D55DEFB9E1F5766CC0717ABDDF6BE1C4FEFB490B7E4FF4DA61A538E1BC5B98E1B71246C62635B27EFFC28DCD61F576DC5277C86CF48AD1EA1D46F8BBEF7BF1F37E3E742334B4E7B87954C8C7D4BFFDFB6A6F6B8D56FF2AB07043A742B9B596B3A0D3EA9D6EEF57E12E474187910CF327DDCCF736D3ED2F4E7A3BE6EF787EF47ECD747B7BC9ED6DC70E07DDC609C3EF09E8761B2EB7A7C0891385DBF180F713161AE779BDB733B0290CEF6BAB8823A84BBF4FD8587EA7C4658B7E958470538591E4782C0B113634E3251DD524136EB3B06CB809BBAA25ECF81713B1A65CCB4744C0F9BD295EB5B118182BD4F81908A9738FB353C64B6BB24586EC136B26FC1FF69EBFA1E4D3427167D041EFCC00C1F935808DB46DF7FC0ABA5661228D30E8424DC39A15912B2733AC7E1692A7690DC38461C030E978E6BFF05C7F9BDD30E93099EBFD73D061D90D13BEDF7CBF70B2088D487EC6E17EAE823A2C03A53F0A94B1AF48E2C3442419F1802B7644A247EAC58ACFF865FA51E7083F4B246399FF635518DC2B95724B10D94A97D2F1DD06469C1B67025606B082A3D3BE056AECEC33AC6952F33AC1D1B991080E248184302B041D12C2B49B6727E1FAC13CD2CE39B0EFF1151C9DE7971A05475B3C306D2830683DA56684F5CD037F3883C9B6FB95AAE0E8B4898CB47E9F40903ECB090EBACE98F28B42C2541869FB8ADD4C7AE7DEA29CC8BFFBBD46A50DFE908232AD2FC4D8486F44A296B98E4387D26E22D85D339E98E1085C795433161304FFCBA38D991A1E1D890E6D8D763DAA35BEC75163726069089C1F8BB0E18023BBA54633365277518629FA09B35022178F819DA51AADE97E8FE7F04E2F5BC0351266FA4062FA1900562F41D7489F5B8345A4462E1E651044C67520017DCA1E1062638E3383BEB00293AC2335CA56C5F1E45016C6DDBB6AFF86E555B9E61C5F77D16ECD3DCBE3C7428E45580B99ED44B599A0D78E9966214C86590C767A6A042D47EC75AC32E8410961CCDAE8DAAEA599DC6084B08A656A2C568C10EA574F2D82564B4B2E2FEDEC640A8D170DA7625FB868D38758A240DF5E153B76F0
B85555CBBF75B3BF3A6CB5692A8D0C4F5371485169A57DADC770189DD8EECF39B88FD612AEFCB302AE264D1EEA3D0FBE97A4F09CFE524D9185FDB8BFB655E5BBC85E664A78733158534E1CA376D878F3149EA0D614AE7CE6A43E99393C8594CDAED4D110ADD869FA4D65D21F1C489B9CDF2D316D17D56964B0C26E7998DA16EB585A561E8A3AEE67032A5CCDE7BAB2B736C0F891ECA56CF6E2E7702BF158E1FCF05987B3C371409542FD06E5B8CF6D4F066523696A91549B45B6FD3E7CB6DA61D13BDF5B8DF79B9363C97BAE7E8BBF04363BD592CFBD1AEB78CB6A39B6A542088DEAB9F8FED392544DBFCF53D5D30E976E0F0E507022554B9DA81713E0F65F4A0D44AA4446A918C1C3FA413DAE58751F369D22673DA02791955621B754B51C631F663164C6362FE8694D8165935A7D1AE3738A39C513498FC356533CE94521AB9209586E3CC287DE884D89B28608CCB0343758B3C101FE81439C7AF7A2C7720A9853A3CBB82D964FDF10823592DAFBFE3ACD6181686830653EB27A28DE6526636B02E8A885B4F2E74CE90D369191082221B51F232162E0EA9D8CFB8F3CEDEDA57484CF73CF33CDF7172F1431D3588515111101CD8E0D220AFA7BEBFD7322266AE51D60C8D4D1EC10F14E07CF764442C40E1A8825494B901DEFD9EF0C55E46A5728B97820891D329E4211B960184F13DBDE08ED7106A2220FC4FE8E25411F1012BD8CAFDA8C234C51D4506AAB98624708980DC25BF41181110985AD826FA651FBDC76109F6719DC993D62294C4AFB1E4F15996929A9CEAD4D66D19289509DC63211C40C2373B38BC9D2D32175722793030B9B5FC9B11A1A5D180DA0F9920566DF269EF6A7008CA2879DACA3276AB4592A7E696035B70B9872D62606102F39504B297601BBD3B24165840B4FBFC953DA26A968C9A38305CF135BA259BB5EEC1822A37B1F4E81D1779BBB1F4244170683A827A6296334EFA925BBAE60664A6326FA1FFA7BE4814ABF846CE089A8F560EE74C2C9C327929B0E249697B9C47D2B73C6C193A066FB506B0971E8D5E60916D1BBCDD3A77386C771CE5E49ADE7AEF37A597A8A0B6026512AE09498AF1AB160C5D5603495C6F14A8CBE269899E7B41247D878859E998CAF65AD36805DFA59D9516BD9C7D11A19A51CE0FD23D6441EBA53F407C6A6CD83E74114EC9D91E94B752EF89B2E0756277A21A3B28D2DD60E5E8720B03CB30BC7EA636783EF49B6941391BD953CD6D24B51C8A5474B426C5335B28C86C8AC6D9DBE9EB70E1467D565519CA25FBBF4C3D9360FEE2D8192CBB24AB13873129120CA54AB871158E28B0AF4C367A2522B74D7633707A3EC18677DEC42466CA92A98FE78B716C4B2321388172469DE7BB2AD2C70959E104753718A568E82254679695BA5C5D3
6651D2585D93C7B1A6B52CAFF32BA92052573239FABD0C041936F76C9E2CD8960ACE226DC4C98A7765A79F921AA5AE9F4DB23645259BFB9F22C48A587D4C9C2EF91E31B4525F58693288665877C094EB61E5947159F505798D5571146554B23BA46574ABF51E4F7B6845C1B63EA79C865118FC0F8B295B5818E166084B6C2F159E53866864952A23109258BB03B2E6F77C50812BC8B6EFB658D6030C582450377931B1E6797EBA4AE064B1D24D466750DAB92100E50B0F343B6DB8064E31978C054693E81E558277A594714A31D65452C841A9836AB636162B548B1B2BBE185C7F3A596CD6624AC5D51D29405E66C48C519A65779C7A3990953756057A4AA89D7D447723AADA9995FDEDBD7D0B2D66CC2D1A419CA14506F71E29D2F3F2C7AC7D0B2DB6EAF56B558745E6A045982194B147FBA7D0524F4B034C529EF95E056B149C5A33E765C530FF7BE3782B78C7C37A0C90D9ED14F47EFA9EF94F61A5E93B356565E588FB3F54090A22F15858072F495110029938F01CFFF4BA2E57C268B4F76E790120FF0C9209CA808FA2BC794FAEF4C8E9D1D87ECB77D6D57E3D46A9C6A26F472AFAE561AAA21939933467E93A03C7596C27E4D3CD98AE55EC82D0C745017C76908F03CB496B5412199065B865C3AAF3D45EB7DDBAE49A54C5B106FB7B8C64AB327524F415DDC5B2E6151D54D52ECE0FBAC01A095ED64525C492363EABADB49A9E0B491FE6C4E85FCF7167D1ADB91D224296178C68D9211EA64DB2435B7995C1A0A041703BC0EB8F60E0DC9088861635D2658941F0A3EF5C76A886E6F818768097825B99DA214D2D5D93FDDF7A54B9092956EC54072D9B346CC2A7966DB58959F83069A84FE4E1210E2D8D5A4FD0D3CDCB49F7F5F5FD56CEA7F91F0DB39D289B3D2A7C9DF7D9A3673C7B465EF5C2C0F2BF93D655E6DF59F9B8259E4472F24E7B91AA8724CFDE255AF87D535BCBC49059C16492DED847D0D0E75EBB332235A48BED356837DE751179C2236937C6B23E5CF575AD040DE4F45FF461BEDB70C8EEA8FC6446D0378C16C8F248AF0C5A000F223181C16AD57F6600177BFFBACBF1DC394BF17573426DE4AC8A93D8652E1BDB6F96D04DE6849C7D427BF2DB889CA922C784EB83811AD6ECA4AAB867549A90212C667B984E4043F5BF9FC0ECD2D482B0A86252301DFFF617EB21F6AFCC7815554E2BEBDB98D076D3D557610813913C3E339D22C268CF8E68AD2879BCFF0D428F1B6E12E8CF48481DBA98C14B3526B6FAE5F65CE256E7C13A0ECCC59B818D29EC69F71EA40109B20350D7EEA50574BD27E9B5E98924AFFAA1BC434857DAA8071FA8A79A3896EE3AD53DB75AFAF922E9409E1AF179B9A1962D32ACCC7E0D459DA86CA10723261896F1A24520BA2828C0E8B245AE429BFD65
8B12347D5DB6A84921BB9F02B33812FDD3BE7738943D6A03E58289909FD1B787D13ACC2A13193750C99F03664294E96B565793980089BEB2A05318678470B02EEBC65D1453A85FF660DC76273775DA16E51324B7DEC65086DCE477524FA4694165FA411ACA09A813B92366485B14F6DB514C998D974B2B8175904CC2FB0A2A7DBE99DB753164B7979D732B4416430479EEE310551D7FB421FE4E60A5B583B9765EFD7CF6F9B81925621F3AA5F2149CDBF296E9EA0B7ECB16D5F3BCD19287FD19FA7EACD4DA00795E89B51899F2244C96DF8C464FF2CC651F4640A3DF126B69385E8D499940CCD8B8EA0A83ABC658ECEF293A3F1C4511AD6788E8DBF7F47970867BB452D90892469C8FF0515A0FCE70127A6D85F23EEBA21EF6FAC5198EC559362D90C81A8C6BE97E0E6751530EE853DF3E12FBACF1D6411561D2DE66EAED3FDA373AE75826D1F0943E3277E529530786E075CB54C41F0CC36918BC22DD44F2B05CD305E7C80E6D86A5FAEDD01818D120C2E7E3288CD63CE25297CC439889BB81E037FD9F1686991021B5BEADD54E988195335D238A70975FFA194166A1E4100A32EF8C3F18D16D403C648CF9FCCA41A44568AC75638CAB7A94A51B3E1AD985572394C3F0B1A85CDFCE1A691C15D6D715BD3E0B2568CD8B310899B707EBAE890D61279CC3472917AB612A3401E52E63C880734EAFDF31F806F8E8CA58FFB83EBF053E010C325FB073EB7215110DF912190CBF6CDC17B22B7A5BD7E598705E9FB0A261906805628C38BF30ACFC4E8365C66B0A610853D1A26F54965986A647B37BAEC211291EC58BF76C50FCC141C228D37CCCECE5054FDBF2EE0DCB1029B80C2ECD6FA420658D6D409D87407053BBD57D814D491C7D4EA29F6512AFE874944296F11B55ADA895660053540DF069AA1A90AFCB249B8D1741F30019AFC01865795F13A5298A6BEF372349522BF8C93E9650F047533DE73FC1BFC96697F9F77EE60FCCADCED18FE539127413D0E1E4E03B7C1F3C6656E5B2BC8A21672AEFBC6A71FCD687252FCFC360F0CAE0139B8786302913922B649B2894F59FDB1748AAB1F3D68FCB92F296E04DFD40959C164932EFBD2476020231F9E903317A41C0792332B979302A743DCBEBDDAB34ACCD789725F4CBA238229116B84435E8BCCABE3AB969945FF77E9AA80583E11A68A473D7FD2953507BD5B28472FCCE2178DE3F772C2CBDE8D3A6EBFCF3FBB3A7CA4B438D697B28A9728BF637D9F6F0E250B1218A1B8D8FE715143693F282879EB45C12F83F3248B10120270000'), 'escape'), ',') AS kv)
SELECT c.id, c1, c2, c3, c4, result FROM poker24.cards c LEFT JOIN a a ON a.i =
(CASE c1 WHEN 1 THEN 2 WHEN 2 THEN 3 WHEN 3 THEN 5 WHEN 4 THEN 7 WHEN 5 THEN 11 WHEN 6 THEN 13 WHEN 7 THEN 17 WHEN 8 THEN 19 WHEN 9 THEN 23 WHEN 10 THEN 29 END
*CASE c2 WHEN 1 THEN 2 WHEN 2 THEN 3 WHEN 3 THEN 5 WHEN 4 THEN 7 WHEN 5 THEN 11 WHEN 6 THEN 13 WHEN 7 THEN 17 WHEN 8 THEN 19 WHEN 9 THEN 23 WHEN 10 THEN 29 END
*CASE c3 WHEN 1 THEN 2 WHEN 2 THEN 3 WHEN 3 THEN 5 WHEN 4 THEN 7 WHEN 5 THEN 11 WHEN 6 THEN 13 WHEN 7 THEN 17 WHEN 8 THEN 19 WHEN 9 THEN 23 WHEN 10 THEN 29 END
*CASE c4 WHEN 1 THEN 2 WHEN 2 THEN 3 WHEN 3 THEN 5 WHEN 4 THEN 7 WHEN 5 THEN 11 WHEN 6 THEN 13 WHEN 7 THEN 17 WHEN 8 THEN 19 WHEN 9 THEN 23 WHEN 10 THEN 29 END);

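Results

On a local M1 MacBook Pro, single-core execution takes about 0.58 seconds, slightly faster than the first-place entry's 0.67 s.

That said, the PostgreSQL instance on NineData does not have the gzip extension, so I did not submit a result on their platform (4c 32G).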

 Merge Right Join  (cost=118104.17..768224.17 rows=5000000 width=68) (actual time=457.485..555.265 rows=1000000 loops=1)
   Merge Cond: (((split_part(kv.kv, ':'::text, 1))::integer) = ((((CASE c.c1 WHEN '1'::double precision THEN 2 WHEN '2'::double precision THEN 3 WHEN '3'::double precision THEN 5 WHEN '4'::double precision THEN 7 WHEN '5'::double precision THEN 11 WHEN '6'::double precision THEN 13 WHEN '7'::double precision THEN 17 WHEN '8'::double precision THEN 19 WHEN '9'::double precision THEN 23 WHEN '10'::double precision THEN 29 ELSE NULL::integer END * CASE c.c2 WHEN '1'::double precision THEN 2 WHEN '2'::double precision THEN 3 WHEN '3'::double precision THEN 5 WHEN '4'::double precision THEN 7 WHEN '5'::double precision THEN 11 WHEN '6'::double precision THEN 13 WHEN '7'::double precision THEN 17 WHEN '8'::double precision THEN 19 WHEN '9'::double precision THEN 23 WHEN '10'::double precision THEN 29 ELSE NULL::integer END) * CASE c.c3 WHEN '1'::double precision THEN 2 WHEN '2'::double precision THEN 3 WHEN '3'::double precision THEN 5 WHEN '4'::double precision THEN 7 WHEN '5'::double precision THEN 11 WHEN '6'::double precision THEN 13 WHEN '7'::double precision THEN 17 WHEN '8'::double precision THEN 19 WHEN '9'::double precision THEN 23 WHEN '10'::double precision THEN 29 ELSE NULL::integer END) * CASE c.c4 WHEN '1'::double precision THEN 2 WHEN '2'::double precision THEN 3 WHEN '3'::double precision THEN 5 WHEN '4'::double precision THEN 7 WHEN '5'::double precision THEN 11 WHEN '6'::double precision THEN 13 WHEN '7'::double precision THEN 17 WHEN '8'::double precision THEN 19 WHEN '9'::double precision THEN 23 WHEN '10'::double precision THEN 29 ELSE NULL::integer END)))
   ->  Sort  (cost=62.33..64.83 rows=1000 width=64) (actual time=0.851..0.872 rows=566 loops=1)
         Sort Key: ((split_part(kv.kv, ':'::text, 1))::integer)
         Sort Method: quicksort  Memory: 59kB
         ->  Function Scan on regexp_split_to_table kv  (cost=0.00..12.50 rows=1000 width=64) (actual time=0.491..0.654 rows=566 loops=1)
   ->  Sort  (cost=118041.84..120541.84 rows=1000000 width=36) (actual time=456.629..494.693 rows=1000000 loops=1)
         Sort Key: ((((CASE c.c1 WHEN '1'::double precision THEN 2 WHEN '2'::double precision THEN 3 WHEN '3'::double precision THEN 5 WHEN '4'::double precision THEN 7 WHEN '5'::double precision THEN 11 WHEN '6'::double precision THEN 13 WHEN '7'::double precision THEN 17 WHEN '8'::double precision THEN 19 WHEN '9'::double precision THEN 23 WHEN '10'::double precision THEN 29 ELSE NULL::integer END * CASE c.c2 WHEN '1'::double precision THEN 2 WHEN '2'::double precision THEN 3 WHEN '3'::double precision THEN 5 WHEN '4'::double precision THEN 7 WHEN '5'::double precision THEN 11 WHEN '6'::double precision THEN 13 WHEN '7'::double precision THEN 17 WHEN '8'::double precision THEN 19 WHEN '9'::double precision THEN 23 WHEN '10'::double precision THEN 29 ELSE NULL::integer END) * CASE c.c3 WHEN '1'::double precision THEN 2 WHEN '2'::double precision THEN 3 WHEN '3'::double precision THEN 5 WHEN '4'::double precision THEN 7 WHEN '5'::double precision THEN 11 WHEN '6'::double precision THEN 13 WHEN '7'::double precision THEN 17 WHEN '8'::double precision THEN 19 WHEN '9'::double precision THEN 23 WHEN '10'::double precision THEN 29 ELSE NULL::integer END) * CASE c.c4 WHEN '1'::double precision THEN 2 WHEN '2'::double precision THEN 3 WHEN '3'::double precision THEN 5 WHEN '4'::double precision THEN 7 WHEN '5'::double precision THEN 11 WHEN '6'::double precision THEN 13 WHEN '7'::double precision THEN 17 WHEN '8'::double precision THEN 19 WHEN '9'::double precision THEN 23 WHEN '10'::double precision THEN 29 ELSE NULL::integer END))
         Sort Method: external sort  Disk: 56760kB
         ->  Seq Scan on cards c  (cost=0.00..18384.00 rows=1000000 width=36) (actual time=0.028..213.760 rows=1000000 loops=1)
 Planning Time: 0.363 ms
 Execution Time: 581.782 ms

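That is how to solve the 24-point poker puzzle with a single SQL statement in PostgreSQL.

Parallel query might make it faster still. PostgreSQL also offers an approach that other databases cannot: wrap the table lookup in an extension and expose it to SQL as a stored procedure written in C. That would push the computation to its limit, though we did not bother to go that far.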
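The join key in the query above maps each card value from 1 to 10 to one of the first ten primes and multiplies the four primes together, so every ordering of the same hand yields the same integer. The following Python sketch mirrors that idea outside the database. The `key:expr` layout of the lookup text is inferred from the `split_part(kv, ':', 1)` call in the plan, and the two sample entries are hypothetical stand-ins for the compressed blob.

```python
from itertools import combinations_with_replacement
from math import prod

# Same mapping as the CASE expressions in the query: card values 1..10
# map to the first ten primes.
PRIMES = {1: 2, 2: 3, 3: 5, 4: 7, 5: 11, 6: 13, 7: 17, 8: 19, 9: 23, 10: 29}


def hand_key(c1, c2, c3, c4):
    """Product of the four primes: identical for every ordering of a hand."""
    return prod(PRIMES[c] for c in (c1, c2, c3, c4))


def parse_lookup(text):
    """Parse 'key:expr,key:expr' text into a dict (assumed layout of the kv rows)."""
    table = {}
    for kv in text.split(","):
        key, _, expr = kv.partition(":")
        table[int(key)] = expr
    return table


# Hypothetical two-entry sample; the real table is the compressed blob in the query.
SAMPLE = "210:1*2*3*4,28561:6+6+6+6"
table = parse_lookup(SAMPLE)

# Unique factorization: distinct multisets of cards never share a key.
hands = list(combinations_with_replacement(range(1, 11), 4))
keys = {hand_key(*h) for h in hands}

print(len(hands) == len(keys))            # one key per multiset of four cards
print(table.get(hand_key(4, 3, 2, 1)))    # same key as (1, 2, 3, 4)
print(table.get(hand_key(1, 1, 1, 1)))    # no entry, like a NULL from the LEFT JOIN
```

Because the largest possible key is 29 to the fourth power (707281), the product fits comfortably in a 32-bit integer, which is why the plan can cast the key to `integer`.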

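19.6 - DB-Engine Database Popularity Trend Analysis

Analyze the database management systems tracked by DB-Engine and review how their popularity has changed over time.

Overview

GitHub repository: https://github.com/pgsty/pigsty-app/tree/master/db

Online demo: https://demo.pigsty.cc/d/db-engine

19.7 - StackOverflow Global Developer Survey

Analyze the database-related portion of the StackOverflow global developer survey data from the last seven years.

Overview

GitHub repository: https://github.com/pgsty/pigsty-app/tree/master/db

Online demo: https://demo.pigsty.cc/d/sf-survey

20 - Release Notes

Release notes for historical Pigsty versions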

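Version	Release Date	Summary	Link
v3.5.0	2025-05-31	PG18 beta, 421 extensions, monitoring upgrade, code refactoring	v3.5.0
v3.4.1	2025-04-05	OpenHalo & OrioleDB, MySQL compatibility, pgAdmin improvements	v3.4.1
v3.4.0	2025-03-30	Backup improvements, automatic certificates, AGE, IvorySQL on all platforms, locale, architecture and parameter improvements	v3.4.0
v3.3.0	2025-02-24	404 extensions, extension catalog, app playbook, Nginx customization, DocumentDB support	v3.3.0
v3.2.2	2025-01-23	390 extensions, Omnigres support, Mooncake, Citus 13 and PG17 support	v3.2.2
v3.2.1	2025-01-12	350 extensions, Ivory4, Citus enhancements, Odoo template	v3.2.1
v3.2.0	2024-12-24	Extension management CLI, Grafana enhancements, ARM64 extensions completed	v3.2.0
v3.1.0	2024-11-24	PG 17 as the default major version, simplified configuration, Ubuntu 24 and ARM support, Supabase, MinIO improvements	v3.1.0
v3.0.4	2024-10-30	PG 17 extensions, OLAP suite, pg_duckdb	v3.0.4
v3.0.3	2024-09-27	PostgreSQL 17, etcd operations improvements, IvorySQL 3.4, PostGIS 3.5	v3.0.3
v3.0.2	2024-09-07	Slim installation mode, PolarDB 15 support, monitoring view updates	v3.0.2
v3.0.1	2024-08-31	Routine bug fixes, Patroni 4 support, Oracle compatibility improvements	v3.0.1
v3.0.0	2024-08-25	333 extensions, pluggable kernels, MSSQL, Oracle, and PolarDB compatibility	v3.0.0
v2.7.0	2024-05-20	Extension explosion: 20+ powerful new extensions and several Docker applications	v2.7.0
v2.6.0	2024-02-28	PG 16 as the default major version; ParadeDB, DuckDB, and other extensions introduced	v2.6.0
v2.5.1	2023-12-01	Routine minor release, support for important PG16 extensions	v2.5.1
v2.5.0	2023-09-24	Ubuntu/Debian support: bullseye, bookworm, jammy, focal	v2.5.0
v2.4.1	2023-09-24	Supabase/PostgresML support and various new extensions: graphql, jwt, pg_net, vault	v2.4.1
v2.4.0	2023-09-14	PG16, RDS monitoring, service consulting support, new extensions: Chinese word-segmentation full-text search, graph, HTTP, embedding, etc.	v2.4.0
v2.3.1	2023-09-01	PGVector with HNSW, PG 16 RC1, documentation overhaul, Chinese documentation, routine bug fixes	v2.3.1
v2.3.0	2023-08-20	Host VIP, ferretdb, nocodb, MySQL stub, CVE fixes	v2.3.0
v2.2.0	2023-08-04	Dashboard & provisioning rework, UOS compatibility	v2.2.0
v2.1.0	2023-06-10	Support for PostgreSQL 12 ~ 16beta	v2.1.0
v2.0.2	2023-03-31	Added pgvector support, fixed MinIO CVE	v2.0.2
v2.0.1	2023-03-21	v2 bug fixes, security enhancements, Grafana upgrade	v2.0.1
v2.0.0	2023-02-28	Major architecture upgrade; significantly improved compatibility, security, and maintainability	v2.0.0
v1.5.1	2022-06-18	Grafana security fix	v1.5.1
v1.5.0	2022-05-31	Docker application support	v1.5.0
v1.4.1	2022-04-20	Bug fixes & complete English documentation translation	v1.4.1
v1.4.0	2022-03-31	MatrixDB support, separated INFRA/NODES/PGSQL/REDIS modules	v1.4.0
v1.3.0	2021-11-30	PGCAT overhaul & PGSQL enhancements & Redis beta support	v1.3.0
v1.2.0	2021-11-03	Default PGSQL version upgraded to 14	v1.2.0
v1.1.0	2021-10-12	Home page, JupyterLab, PGWEB, Pev2 & pgbadger	v1.1.0
v1.0.0	2021-07-26	v1 GA, monitoring system overhaul	v1.0.0
v0.9.0	2021-04-04	Pigsty GUI, CLI, log integration	v0.9.0
v0.8.0	2021-03-28	Service provisioning, customizable externally exposed database services	v0.8.0
v0.7.0	2021-03-01	Monitor-only deployment, monitoring existing PostgreSQL instances	v0.7.0
v0.6.0	2021-02-19	Architecture enhancement, decoupling PG from Consul	v0.6.0
v0.5.0	2021-01-07	Support for defining business databases/users in configuration	v0.5.0
v0.4.0	2020-12-14	PostgreSQL 13 support, official documentation added	v0.4.0
v0.3.0	2020-10-22	Virtual machine provisioning scheme finalized	v0.3.0
v0.2.0	2020-07-10	Sixth edition of the PG monitoring system officially released	v0.2.0
v0.1.0	2020-06-20	Validated in a production-simulation test environment	v0.1.0
v0.0.5	2020-08-19	Offline installation mode: deliverable without Internet access	v0.0.5
v0.0.4	2020-07-27	Refactored Ansible playbooks into roles	v0.0.4
v0.0.3	2020-06-22	Interface design improvements	v0.0.3
v0.0.2	2020-04-30	First commit	v0.0.2
v0.0.1	2019-05-15	Concept prototype	v0.0.1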

v3.5.0 (Beta2)

Beta2 : 2025-05-28

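Highlights

Support for PG 18 (Beta); extensions updated, bringing the total to 421
OrioleDB and OpenHalo kernels available on all platforms
The pig do subcommand can be used in place of the bin scripts
Supabase self-hosting strengthened, resolving several long-standing issues such as replication lag and key distribution
Code refactoring and architecture optimization; tuned default parameters for Postgres and Pgbouncer
Updated to Grafana 12, pg_exporter 1.0, and related plugins; dashboards overhauled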

curl https://repo.pigsty.cc/get | bash -s v3.5.0

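PostgreSQL 18 support
- PG18 monitoring metrics supported via pg_exporter 1.0.0
- PG18 installation alias supported via pig 0.4.1
- pg18 configuration template provided
Refactored the pgsql module
- PGSQL refactoring: PG monitoring split out into a separate pg_monitor role; clean logic removed
- Removed redundant and duplicated tasks, merged similar items, and streamlined configuration; removed the dir/utils task blocks
- All extensions are installed into the extensions schema by default (consistent with Supabase security practice)
- Renamed template files, removing all .j2 suffixes
- Added a SET command to all monitor functions in the templates to clear search_path, following Supabase security best practices
- Adjusted pgbouncer default parameters: larger default connection pool size and a connection pool reset query
- New parameter pgbouncer_ignore_param, which configures the list of parameters that pgbouncer ignores
- New task pg_key, which generates the server-side key required by pgsodium
- sync_replication_slots enabled by default for PG 17
- Subtask tags rearranged to better match the way the configuration is split into sections
Refactored the pg_remove module
- Renamed parameters: pg_rm_data, pg_rm_bkup, and pg_rm_pkg control what gets removed
- Restructured the role code, using clearer tags to divide it
New pg_monitor module
- pgbouncer_exporter no longer shares a configuration file with pg_exporter
- Added monitoring metrics for TimescaleDB, Citus, and pg_wait_event
- Uses pg_exporter 1.0.0, with updated metrics for PG16/17/18
- Uses a more compact, newly designed metrics collector configuration file
Supabase enhancements (thanks to @lawso017 for the contribution!)
- Updated Supabase container images and database schema to the latest versions
- pgsodium server-side key loading is now supported by default
- The supa-kick scheduled job works around logflare failing to update replication progress in time
- Added a set search_path clause to functions in the monitor schema to follow security best practices
CLI: new pig do command, which lets the command-line tool replace the shell scripts in bin/
Monitoring system updates
- Grafana major version updated to 12.0.0, along with related plugin and datasource packages
- Updated the uid naming of Postgres datasources (to fit the new uid length and character restrictions)
- Added a Static Datasource
- Updated existing dashboards and fixed several long-standing issues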

Infrastructure package updates

pig 0.4.2
duckdb 1.3.0
etcd 3.6.0
vector 0.47.0
minio 20250422221226
mcli 20250416181326
pev 1.5.0
rclone 1.69.3
mtail 3.0.8 (new)

Observability package updates

grafana 12.0.0
grafana-victorialogs-ds 0.16.3
grafana-victoriametrics-ds 0.15.1
grafana-infinity-ds 3.2.1
grafana_plugins 12.0.0
prometheus 3.4.0
pushgateway 1.11.1
nginx_exporter 1.4.2
pg_exporter 1.0.0
pgbackrest_exporter 0.20.0
redis_exporter 1.72.1
keepalived_exporter 1.6.2
victoriametrics 1.117.1
victoria_logs 1.22.2

Database package updates

PostgreSQL 17.5, 16.9, 15.13, 14.18, 13.21
PostgreSQL 18beta1 support
pgbouncer 1.24.1
pgbackrest 2.55
pgbadger 13.1

Postgres extension package updates

spat 0.1.0a4 new extension
pgsentinel 1.1.0 new extension
pgdd 0.6.0 (pgrx 0.14.1) new extension add back
convert 0.0.4 (pgrx 0.14.1) new extension
pg_tokenizer.rs 0.1.0 (pgrx 0.13.1)
pg_render 0.1.2 (pgrx 0.12.8)
pgx_ulid 0.2.0 (pgrx 0.12.7)
pg_idkit 0.3.0 (pgrx 0.14.1)
pg_ivm 1.11.0
orioledb 1.4.0 beta11 rpm & add debian/ubuntu support
openhalo 14.10 add debian/ubuntu support
omnigres 20250507 (miss on d12/u22)
citus 12.0.3
timescaledb 2.20.0 (DROP PG14 support)
supautils 2.9.2
pg_envvar 1.0.1
pgcollection 1.0.0
aggs_for_vecs 1.4.0
pg_tracing 0.1.3
pgmq 1.5.1
tzf-pg 0.2.0 (pgrx 0.14.1)
pg_search 0.15.18 (pgrx 0.14.1)
anon 2.1.1 (pgrx 0.14.1)
pg_parquet 0.4.0 (pgrx 0.14.1)
pg_cardano 1.0.5 (pgrx 0.12) -> 0.14.1
pglite_fusion 0.0.5 (pgrx 0.12.8) -> 0.14.1
vchord_bm25 0.2.1 (pgrx 0.13.1)
vchord 0.3.0 (pgrx 0.13.1)
pg_vectorize 0.22.1 (pgrx 0.13.1)
wrappers 0.4.6 (pgrx 0.12.9)
timescaledb-toolkit 1.21.0 (pgrx 0.12.9)
pgvectorscale 0.7.1 (pgrx 0.12.9)
pg_session_jwt 0.3.1 (pgrx 0.12.6) -> 0.12.9
pg_timetable 5.13.0
ferretdb 2.2.0
documentdb 0.103.0 (+aarch64 support)
pgml 2.10.0 (pgrx 0.12.9)
sqlite_fdw 2.5.0 (fix pg17 deb)
tzf 0.2.2 (pgrx 0.14.1) (rename src)
pg_vectorize 0.22.2 (pgrx 0.13.1)
wrappers 0.5.0 (pgrx 0.12.9)

Checksums

df30f2599a6416eea11acfd0f05ee14b  pigsty-v3.5.0.tgz
4c9fabc2d1f0ed733145af2b6aff2f48  pigsty-pkg-v3.5.0.d12.x86_64.tgz
796d47de12673b2eb9882e527c3b6ba0  pigsty-pkg-v3.5.0.el8.x86_64.tgz
a53ef2cede1363f11e9faaaa43718fdc  pigsty-pkg-v3.5.0.el9.x86_64.tgz
36da28f97a845fdc0b7bbde2d3812a67  pigsty-pkg-v3.5.0.u22.x86_64.tgz
8551b3e04b38af382163e6857778437d  pigsty-pkg-v3.5.0.u24.x86_64.tgz
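A downloaded package can be compared against the list above before it is installed. The sketch below is one way to do that in Python. It assumes the digests are MD5 (they are 32 hexadecimal digits, which is consistent with MD5, but the notes do not name the algorithm), and the helper names and file paths are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs are not loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksums(text):
    """Turn 'digest  filename' lines into a {filename: digest} mapping."""
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            expected[parts[1]] = parts[0].lower()
    return expected


def verify(path, name, expected):
    """True only when the file's digest matches the listed one for that name."""
    return name in expected and md5_of_file(path) == expected[name]


# Two lines copied from the checksum list above.
LISTED = """\
df30f2599a6416eea11acfd0f05ee14b  pigsty-v3.5.0.tgz
a53ef2cede1363f11e9faaaa43718fdc  pigsty-pkg-v3.5.0.el9.x86_64.tgz
"""
expected = parse_checksums(LISTED)
```

With a local copy in place, `verify("/tmp/pigsty-v3.5.0.tgz", "pigsty-v3.5.0.tgz", expected)` reports whether the download matches; the path is only an example.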

v3.4.1

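GitHub release page: v3.4.1

Support for openHalo, a MySQL wire-compatible PostgreSQL kernel, on EL systems
Support for orioledb, an OLTP-enhanced PostgreSQL kernel, on EL systems
pgAdmin 9.2 application template improved: the server list is updated automatically and pgpass passwords are filled in
Increased the default PG maximum connections to 250, 500, 1000
Removed the mysql_fdw extension from EL8 because of a dependency error

Infrastructure package upgrades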

pig 0.3.4
etcd 3.5.21
restic 0.18.0
ferretdb 2.1.0
tigerbeetle 0.16.34
pg_exporter 0.8.1
node_exporter 1.9.1
grafana 11.6.0
zfs_exporter 3.8.1
mongodb_exporter 0.44.0
victoriametrics 1.114.0
minio 20250403145628
mcli 20250403170756

PG extension upgrades

Updated pg_search to 0.15.13
Updated citus to 13.0.3
Updated timescaledb to 2.19.1
Updated pgcollection RPM to 1.0.0
Updated pg_vectorize RPM to 0.22.1
Updated pglite_fusion RPM to 0.0.4
Updated aggs_for_vecs RPM to 1.4.0
Updated pg_tracing RPM to 0.1.3
Updated pgmq RPM to 1.5.1

Checksums

471c82e5f050510bd3cc04d61f098560  pigsty-v3.4.1.tgz
4ce17cc1b549cf8bd22686646b1c33d2  pigsty-pkg-v3.4.1.d12.aarch64.tgz
c80391c6f93c9f4cad8079698e910972  pigsty-pkg-v3.4.1.d12.x86_64.tgz
811bf89d1087512a4f8801242ca8bed5  pigsty-pkg-v3.4.1.el9.x86_64.tgz
9fe2e6482b14a3e60863eeae64a78945  pigsty-pkg-v3.4.1.u22.x86_64.tgz

v3.4.0

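GitHub release page: v3.4.0

New features

Added pgbackrest backup monitoring metrics and dashboards
Enriched the Nginx server configuration options, with support for automated Certbot certificate requests
PostgreSQL's built-in C and C.UTF-8 locale rules are now preferred
IvorySQL 4.4 supported on all platforms (RPM/DEB x x86/ARM)
Newly available packages: Juicefs, Restic, TimescaleDB EventStreamer
The graph database extension Apache AGE now has full support for PG 13-17 on EL
Improved the app.yml playbook experience: ordinary Docker applications can now be launched without configuration
Improved the one-click self-hosting templates for Supabase, Dify, and Odoo, and updated them to the latest versions
Added the electric application template, a local-first PostgreSQL frontend/backend sync engine

Infrastructure packages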

+restic 0.17.3
+juicefs 1.2.3
+timescaledb-event-streamer 0.12.0
Prometheus 3.2.1
AlertManager 0.28.1
blackbox_exporter 0.26.0
node_exporter 1.9.0
mysqld_exporter 0.17.2
kafka_exporter 1.9.0
redis_exporter 1.69.0
pgbackrest_exporter 0.19.0-2
DuckDB 1.2.1
etcd 3.5.20
FerretDB 2.0.0
tigerbeetle 0.16.31
vector 0.45.0
VictoriaMetrics 1.113.0
VictoriaLogs 1.17.0
rclone 1.69.1
pev2 1.14.0
grafana-victorialogs-ds 0.16.0
grafana-victoriametrics-ds 0.14.0
grafana-infinity-ds 3.0.0

PostgreSQL and modules

Patroni 4.0.5
PolarDB 15.12.3.0-e1e6d85b
IvorySQL 4.4
pgbackrest 2.54.2
pev2 1.14
WiltonDB 13.17

PostgreSQL extension packages

pgspider_ext 1.3.0 (new extension)
apache age 13 - 17 el rpm (1.5.0)
timescaledb 2.18.2 -> 2.19.0
citus 13.0.1 -> 13.0.2
documentdb 1.101-0 -> 1.102-0
pg_analytics: 0.3.4 -> 0.3.7
pg_search: 0.15.2 -> 0.15.8
pg_ivm 1.9 -> 1.10
emaj 4.4.0 -> 4.6.0
pgsql_tweaks 0.10.0 -> 0.11.0
pgvectorscale 0.4.0 -> 0.6.0 (pgrx 0.12.5)
pg_session_jwt 0.1.2 -> 0.2.0 (pgrx 0.12.6)
wrappers 0.4.4 -> 0.4.5 (pgrx 0.12.9)
pg_parquet 0.2.0 -> 0.3.1 (pgrx 0.13.1)
vchord 0.2.1 -> 0.2.2 (pgrx 0.13.1)
pg_tle 1.2.0 -> 1.5.0
supautils 2.5.0 -> 2.6.0
sslutils 1.3 -> 1.4
pg_profile 4.7 -> 4.8
pg_snakeoil 1.3 -> 1.4
pg_jsonschema 0.3.2 -> 0.3.3
pg_incremental: 1.1.1 -> 1.2.0
pg_stat_monitor 2.1.0 -> 2.1.1
fix ddl_historization ver 0.7 -> 0.0.7
fix pg_sqlog 3.1.7 -> 1.6
fix pg_random remove dev suffix
asn1oid 1.5 -> 1.6
table_log 0.6.1 -> 0.6.4

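Interface changes

New Docker parameters: docker_data and docker_storage_driver, #521 by @waitingsong (https://github.com/waitingsong)
New Infra parameter: alertmanager_port, which sets the AlertManager port
New Infra parameter: certbot_sign, whether to request a Certbot certificate automatically during Nginx initialization; disabled by default
New Infra parameter: certbot_email, the email address used when requesting certificates with Certbot
New Infra parameter: certbot_options, extra options used when requesting certificates with Certbot
Starting with IvorySQL 4.4, the default IvorySQL binary path is /usr/ivory-4
The default value of pg_lc_ctype and the other locale-related parameters changed from en_US.UTF-8 to C
For PG 17, when UTF8 encoding is used with the C/C.UTF-8 locale, PostgreSQL's built-in locale rules are preferred
configure now sets locale-related options automatically, based on the PG version and whether the environment supports C.utf8
The default value of pg_packages changed to: pgsql-main patroni pgbouncer pgbackrest pg_exporter pgbadger vip-manager
The default value of repo_packages changed to: [node-bootstrap, infra-package, infra-addons, node-package1, node-package2, pgsql-utility, extra-modules]
Removed the LANG and LC_ALL environment variable settings from /etc/profile.d/node.sh
bento/rockylinux-8 and bento/rockylinux-9 are now used as the Vagrant box images for EL
New alias: extra_modules, which contains additional optional modules
Adjusted PGSQL aliases: postgresql, pgsql-main, pgsql-core, pgsql-full
Repo: the Gitlab repo is now in the list of available modules
Repo: the Docker module has been merged into the Infra module
The node.yml playbook has a new node_pip task that writes the pip mirror into the node's pip configuration
The pgsql.yml playbook has a new pgbackrest_exporter task that collects backup monitoring metrics
The META/PKG environment variables can now be used in the Makefile
Added the /pg/spool directory for pgBackRest temporary file storage
pgBackRest's link-all option is now off by default
For MinIO backup repositories, block incremental backup mode is now enabled by default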
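The locale rule in the list above (PG 17, UTF8 encoding, and a C or C.UTF-8 locale lead to PostgreSQL's own locale rules) can be read as a small decision function. The sketch below is only an illustration of that rule, not Pigsty's configure logic: the function name is hypothetical, treating versions newer than 17 the same way is an assumption, and the return values borrow the `builtin` and `libc` provider names that PostgreSQL 17's initdb accepts.

```python
def choose_locale_provider(pg_version, encoding="UTF8", locale="C"):
    """Hypothetical helper: pick 'builtin' when the rule's three conditions hold."""
    # Accept both spellings, e.g. 'C.utf8' and 'C.UTF-8'.
    normalized = locale.upper().replace("UTF8", "UTF-8")
    is_c_like = normalized in {"C", "C.UTF-8"}
    is_utf8 = encoding.upper().replace("-", "") == "UTF8"
    # Assumption: versions after 17 behave like 17.
    if pg_version >= 17 and is_utf8 and is_c_like:
        return "builtin"
    return "libc"


for args in [(17, "UTF8", "C.UTF-8"), (16, "UTF8", "C"), (17, "UTF8", "en_US.UTF-8")]:
    print(args, "->", choose_locale_provider(*args))
```

Keeping the three conditions as separate named booleans makes it easy to see which one sends a given cluster back to the operating system's locale support.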

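Bug fixes

Fixed the pg-backup return status code: #532 by @waitingsong (https://github.com/waitingsong)
pg-tune-hugepage now restricts huge pages to PG only: #527 by @waitingsong (https://github.com/waitingsong)
Fixed faulty logic in pg-role
Fixed a type conversion issue with the hugepage configuration parameters
Fixed the default value of node_repo_modules in the slim template

Checksums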

768bea3bfc5d492f4c033cb019a81d3a  pigsty-v3.4.0.tgz
7c3d47ef488a9c7961ca6579dc9543d6  pigsty-pkg-v3.4.0.d12.aarch64.tgz
b5d76aefb1e1caa7890b3a37f6a14ea5  pigsty-pkg-v3.4.0.d12.x86_64.tgz
42dacf2f544ca9a02148aeea91f3153a  pigsty-pkg-v3.4.0.el8.aarch64.tgz
d0a694f6cd6a7f2111b0971a60c49ad0  pigsty-pkg-v3.4.0.el8.x86_64.tgz
7caa82254c1b0750e89f78a54bf065f8  pigsty-pkg-v3.4.0.el9.aarch64.tgz
8f817e5fad708b20ee217eb2e12b99cb  pigsty-pkg-v3.4.0.el9.x86_64.tgz
8b2fcaa6ef6fd8d2726f6eafbb488aaf  pigsty-pkg-v3.4.0.u22.aarch64.tgz
83291db7871557566ab6524beb792636  pigsty-pkg-v3.4.0.u22.x86_64.tgz
c927238f0343cde82a4a9ab230ecd2ac  pigsty-pkg-v3.4.0.u24.aarch64.tgz
14cbcb90693ed5de8116648a1f2c3e34  pigsty-pkg-v3.4.0.u24.x86_64.tgz

v3.3.0

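The total number of available extensions rises to 404!
PostgreSQL February minor releases: 17.4, 16.8, 15.12, 14.17, 13.20
New feature: the app.yml playbook can install applications such as Odoo, Supabase, and Dify automatically.
New feature: support for DocumentDB and FerretDB 2.0 to provide MongoDB-compatible PG.
New feature: Nginx configuration files can be further customized in infra_portal.
New feature: certbot support added, letting users quickly request free HTTPS certificates
New feature: plain-text extension lists are allowed in pg_default_extensions.
New feature: mongo, redis, groonga, haproxy, and others added to the default list of available repos.
New parameter: node_aliases, which adds command aliases on nodes
Fix: corrected the default address of the international EPEL repo in the Bootstrap playbook
Improvement: added an Aliyun mirror (mainland China) for the Debian Security repo
Improvement: pgBackRest backup support for the IvorySQL kernel
Improvement: ARM64 and Debian/Ubuntu support for PolarDB
pg_exporter 0.8.0, with support for the new metrics in pgbouncer 1.24
New feature: auto-completion for the common commands git, docker, systemctl #506 #507 by @waitingsong
Improvement: the ignore_startup_parameters setting in the pgbouncer configuration template now ignores several timeout parameters #488 by @waitingsong
New home page: the Pigsty website now uses a brand-new design.
Extension catalog: Pigsty now provides detailed information and download links for extension RPM/DEB binary packages.
Extension building: the pig command-line tool can now set up a PostgreSQL extension build environment automatically.

New extensions

Added 12 PostgreSQL extensions, bringing the available total to exactly 404.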

documentdb 0.101-0
VectorChord-bm25 (vchord_bm25) 0.1.0
pg_tracing 0.1.2
pg_curl 2.4
pgxicor 0.1.0
pgsparql 1.0
pgjq 0.1.0
hashtypes 0.1.5
db_migrator 1.0.0
pg_cooldown 0.1
pgcollection 0.9.1
pg_bzip 1.0.0

Updated extensions

citus 13.0.0 -> 13.0.1
pg_duckdb 0.2.0 -> 0.3.1
pg_mooncake 0.1.0 -> 0.1.2
timescaledb 2.17.2 -> 2.18.2
supautils 2.5.0 -> 2.6.0
supabase_vault 0.3.1 (become C)
VectorChord 0.1.0 -> 0.2.1
pg_bulkload 3.1.22 (+pg17)
pg_store_plan 1.8 (+pg17)
pg_search 0.14 -> 0.15.2
pg_analytics 0.3.0 -> 0.3.4
pgroonga 3.2.5 -> 4.0.0
zhparser 2.2 -> 2.3
pg_vectorize 0.20.0 -> 0.21.1
pg_net 0.14.0
pg_curl 2.4.2
table_version 1.10.3 -> 1.11.0
pg_duration 1.0.2
pg_graphql 1.5.9 -> 1.5.11
vchord 0.1.1 -> 0.2.1 (+13)
vchord_bm25 0.1.0 -> 0.1.1
pg_mooncake 0.1.1 -> 0.1.2
pgddl 0.29
pgsql_tweaks 0.11.0

Infrastructure package updates

pig 0.1.3 -> 0.3.0
pushgateway 1.10.0 -> 1.11.0
alertmanager 0.27.0 -> 0.28.0
nginx_exporter 1.4.0 -> 1.4.1
pgbackrest_exporter 0.18.0 -> 0.19.0
redis_exporter 1.66.0 -> 1.67.0
mongodb_exporter 0.43.0 -> 0.43.1
VictoriaMetrics 1.107.0 -> 1.111.0
VictoriaLogs v1.3.2 -> 1.9.1
DuckDB 1.1.3 -> 1.2.0
Etcd 3.5.17 -> 3.5.18
pg_timetable 5.10.0 -> 5.11.0
FerretDB 1.24.0 -> 2.0.0-rc
tigerbeetle 0.16.13 -> 0.16.27
grafana 11.4.0 -> 11.5.2
vector 0.43.1 -> 0.44.0
minio 20241218131544 -> 20250218162555
mcli 20241121172154 -> 20250215103616
rclone 1.68.2 -> 1.69.0
vray 5.23 -> 5.28

v3.2.2

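New extension bundle: Omnigres, 33 extensions that turn postgres into an application development platform
New extension: pg_mooncake: duckdb in postgres
New extension: pg_xxhash
New extension: timescaledb_toolkit
New extension: pg_xenophile
New extension: pg_drop_events
New extension: pg_incremental
Upgraded citus to 13.0.0, with PostgreSQL 17 support
Upgraded pgml to 2.10.0
Upgraded pg_extra_time to 2.0.0
Upgraded pg_vectorize to 0.20.0

Changes

Upgraded IvorySQL to 4.2 (based on PostgreSQL 17.2)
Added Arm64 and Debian support for the PolarDB kernel
Added certbot and certbot-nginx to the default infra_packages
Raised pgbouncer's max_prepared_statements parameter to 256
Removed the pgxxx-citus package alias
The pgxxx-olap category is hidden by default in pg_extensions (because it contains two pairs of conflicting extensions)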

v3.2.1

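Highlights

The number of PG extensions rises to 350, including the powerful new Rust extension anon.
IvorySQL support updated to the PG17-compatible 4.0 release
Citus, TimescaleDB, and PGroonga are now built by Pigsty.
Added a one-click Odoo self-hosting template and the new app.yml playbook

13 new extensions:

Added pg_anon 2.0.0
Added omnisketch 1.0.2
Added ddsketch 1.0.1
Added pg_duration 1.0.1
Added ddl_historization 0.0.7
Added data_historization 1.1.0
Added schedoc 0.0.1
Added floatfile 1.3.1
Added pg_upless 0.0.3
Added pg_task 1.0.0
Added pg_readme 0.7.0
Added vasco 0.1.0
Added pg_xxhash 0.0.1

Updated extension versions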

lower_quantile 1.0.3
quantile 1.1.8
sequential_uuids 1.0.3
pgmq 1.5.0 (subdir)
floatvec 1.1.1
pg_parquet 0.2.0
wrappers 0.4.4
pg_later 0.3.0
topn fix for deb.arm64
add age 17 on debian
powa + pg17, 5.0.1
h3 + pg17
ogr_fdw + pg17
age + pg17 1.5 on debian
pgtap + pg17 1.3.3
repmgr
topn + pg17
pg_partman 5.2.4
credcheck 3.0
ogr_fdw 1.1.5
ddlx 0.29
postgis 3.5.1
tdigest 1.4.3
pg_repack 1.5.2

v3.2.0

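Highlights

Pigsty command-line tool: pig 0.2.0, which can manage extensions.
ARM64 support for 390 extensions across the five major distributions
Updated to the latest Supabase Launch Week release; self-hosting works on all distributions.
Grafana updated to 11.4, with the new infinity datasource.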

Package changes

New extensions
- Added timescaledb, timescaledb-loader, timescaledb-toolkit, timescaledb-tool to the PIGSTY repo
- Added pg_timescaledb, rebuilt for EL
- Added pgroonga, rebuilt for all EL releases
- Added vchord 0.1.0
- Added pg_bestmatch.rs 0.0.1
- Added pglite_fusion 0.0.3
- Added pgpdf 0.1.0
Updated extensions
- pgvectorscale 0.4.0 -> 0.5.1
- pg_parquet 0.1.0 -> 0.1.1
- pg_polyline 0.0.1
- pg_cardano 1.0.2 -> 1.0.3
- pg_vectorize 0.20.0
- pg_duckdb 0.1.0 -> 0.2.0
- pg_search 0.13.0 -> 0.13.1
- aggs_for_vecs 1.3.1 -> 1.3.2
- pgoutput is now listed as a new PostgreSQL Contrib extension
Infrastructure
- Added promscale 0.17.0
- Added grafana-plugins 11.4
- Added grafana-infinity-plugins
- Added grafana-victoriametrics-ds
- Added grafana-victorialogs-ds
- vip-manager 2.8.0 -> 3.0.0
- vector 0.42.0 -> 0.43.0
- grafana 11.3 -> 11.4
- prometheus 3.0.0 -> 3.0.1 (package renamed from prometheus2 to prometheus)
- nginx_exporter 1.3.0 -> 1.4.0
- mongodb_exporter 0.41.2 -> 0.43.0
- VictoriaMetrics 1.106.1 -> 1.107.0
- VictoriaLogs 1.0.0 -> 1.3.2
- pg_timetable 5.9.0 -> 5.10.0
- tigerbeetle 0.16.13 -> 0.16.17
- pg_exporter 0.7.0 -> 0.7.1
Bug fixes
- el8.aarch64: added python3-cdiff to fix a missing patroni dependency
- el9.aarch64: added timescaledb-tools, which is missing from the official repo
- el9.aarch64: added pg_filedump, which is missing from the official repo
Removed extensions
- pg_mooncake was removed because it conflicts with pg_duckdb.
- pg_top was dropped for quality reasons: packages were missing for too many versions.
- hunspell_pt_pt was dropped because it conflicts with the official PG dictionary files.
- pg_timeit was dropped because it cannot be used on the AARCH64 architecture.
- pgdd is marked as deprecated because it is unmaintained and its PG 17 and pgrx support is outdated.
- old_snapshot and adminpack are marked as unavailable on PG 17.
- pgml is no longer downloaded or installed by default.

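API changes

The repo_url_packages parameter now defaults to an empty array, because all packages are now installed through the OS package manager.
The grafana_plugin_cache parameter is deprecated; Grafana plugins are now installed through the OS package manager
The grafana_plugin_list parameter is deprecated; Grafana plugins are now installed through the OS package manager
The 36-node simulation template formerly named prod is now renamed simu.
The per-distribution configuration generated under node_id/vars is now generated for aarch64 as well.
The command-line management tool pig is added to infra_packages by default
The configure command also updates the version numbers of the pgsql-xxx aliases in the auto-generated configuration file.
adminpack was removed in PG 17, so it is removed from Pigsty's default extensions.

Bug fixes

Fixed the pgbouncer dashboard selector issue #474
pg-pitr: added --arg value argument parsing support by @waitingsong
Fixed a typo in Redis log messages by @waitingsong

Package checksums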

8fdc6a60820909b0a2464b0e2b90a3a6  pigsty-v3.2.0.tgz
d2b85676235c9b9f2f8a0ad96c5b15fd  pigsty-pkg-v3.2.0.el9.aarch64.tgz
649f79e1d94ec1845931c73f663ae545  pigsty-pkg-v3.2.0.el9.x86_64.tgz
c42da231067f25104b71a065b4a50e68  pigsty-pkg-v3.2.0.d12.aarch64.tgz
ebb818f98f058f932b57d093d310f5c2  pigsty-pkg-v3.2.0.d12.x86_64.tgz
24c0be1d8436f3c64627c12f82665a17  pigsty-pkg-v3.2.0.u22.aarch64.tgz
0b9be0e137661e440cd4f171226d321d  pigsty-pkg-v3.2.0.u22.x86_64.tgz

v3.1.0

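Highlights

PostgreSQL 17 is now the default major version (17.2)
Ubuntu 24.04 support
ARM architecture support: EL9, Debian 12, Ubuntu 22.04
One-click Supabase self-hosting with the new supabase.yml playbook
MinIO best-practice improvements, with configuration templates and Vagrant templates
A set of out-of-the-box configuration templates with accompanying documentation.
configure accepts -v|--version to choose the PG major version.
Adjusted the default PG extension policy: the three key extensions pg_repack, wal2json, and pgvector are installed by default.
Greatly simplified the repo_packages logic for building the local software repo; package group aliases are allowed in repo_packages
Repo mirrors provided for WiltonDB, IvorySQL, and PolarDB to simplify installing all three.
Database checksums enabled by default.
Fixed the ETCD and MINIO log dashboards

Software upgrades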

PostgreSQL 17.2, 16.6, 15.10, 14.15, 13.18, 12.22
For PostgreSQL extension version changes, see: https://ext.pigsty.io
Patroni 4.0.4
MinIO 20241107 / MCLI 20241117
Rclone 1.68.2
Prometheus: 2.54.0 -> 3.0.0
VictoriaMetrics 1.102.1 -> 1.106.1
VictoriaLogs v0.28.0 -> 1.0.0
vslogcli 1.0.0
MySQL Exporter 0.15.1 -> 0.16.0
Redis Exporter 1.62.0 -> 1.66.0
MongoDB Exporter 0.41.2 -> 0.42.0
Keepalived Exporter 1.3.3 -> 1.4.0
DuckDB 1.1.2 -> 1.1.3
etcd 3.5.16 -> 3.5.17
tigerbeetle 0.16.8 -> 0.16.13

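API changes

repo_upstream: default values are now generated for each specific OS distribution: roles/node_id/vars
repo_packages: aliases defined in package_map are allowed.
repo_extra_packages: now has a default value when unspecified, and allows aliases defined in package_map.
pg_checksum: default changed to true (on by default).
pg_packages: default changed to: postgresql, wal2json pg_repack pgvector, patroni pgbouncer pgbackrest pg_exporter pgbadger vip-manager
pg_extensions: default changed to an empty array [].
infra_portal: a path can be specified for the home server, replacing the default local repo path nginx_home (/www)

Checksums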

e62f9ce9f89a58958609da7b234bf2f2  pigsty-v3.1.0.tgz

v3.0.4

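Features

All supported Pigsty extensions are built for PostgreSQL 17
New OLAP extension support: pg_duckdb and pg_parquet
Simplified and improved the self-hosting procedure for the latest Supabase
New parameter docker_image, which pulls images automatically after Docker is installed.

Extensions

See our latest PostgreSQL extension catalog: https://ext.pigsty.io

Item	Total	PGDG	PIGSTY	MISC	MISS	PG17	PG16	PG15	PG14	PG13	PG12
EL extensions	338	134	130	4	7	298	334	336	328	319	310
Deb extensions	326	109	143	74	19	290	322	324	316	307	300
RPM packages	313	122	129	4	6	275	309	311	303	294	285
DEB packages	298	93	142	64	19	264	294	296	288	279	272

Version upgrades

New PGSQL extensions
Upgraded and updated PG extensions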
- pg_search 0.11.0
- pg_analytics 0.2.0
- plv8 3.2.3
- supautils 2.5.0
- icu_ext 1.9.0
- redis_fdw 17
- pg_failover_slots 1.1.0
- pg_later 0.1.3
- plprql 1.0.0
- pg_vectorize 0.18.3
- unit 7.7 -> 7.9
- log_fdw 1.4
- pg_duckdb 0.1.0
- pg_graphql 1.5.9 (+17)
- pg_jsonschema 0.3.2 (+17)
- pgvectorscale 0.4.0 (+17)
- wrappers 0.4.3 +pg17
- pg_ivm 1.9
- pg_timeseries 0.1.6
- pgmq 1.4.4
- pg_protobuf 16 17
- pg_uuidv7 1.6
- pg_readonly
- pgddl 0.28
- pg_safeupdate
- pg_stat_monitor 2.1
- pg_profile 4.7
- system_stats 3.2
- pg_auth_mon 3.0
- login_hook 1.6
- logerrors 2.1.3
- pg-orphaned
- pgnodemx 1.7
- sslutils 1.4 (deb+pg16,17)
- timestamp9 (deb)
Fixed extensions that lacked PG16/17 support
- pg_mon
- pg_uri
- aggs_for_vecs
- quantile
- lower_quantile
- pg_protobuf
- acl
- pg_emailaddr
- pg_zstd
- smlar
- geohash
- pgsmcrypto (+17)
- pg_tiktoken (+17)
- pg_idkit (+17)
Infrastructure packages
- Grafana 11.3
- duckdb 1.1.2
- etcd 3.5.16
- ferretdb 1.24.0
- minio 20241013133411
- mcli 2024101313411
- pushgateway 1.10
- tigerbeetle 0.16.8
- mongodb_exporter 0.41.2
- redis_exporter 1.64.1
- vector 0.41.1
- vip-manager 2.7
- sealos 5.0.1

v3.0.3

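Features

Support for the newly released PostgreSQL 17.
Improved etcd configuration, monitoring, and alerting rules
Support for (Oracle-compatible) IvorySQL 3.4, in sync with PostgreSQL 16.4

Version upgrades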

PostGIS 3.5
Grafana 11.2
duckdb 1.1
pg_search 0.10.2
pg_analytics 0.1.4

v3.0.2

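Features

Slim installation mode: use slim.yml for the most minimal HA PGSQL deployment.
Native support for PolarDB PG 15.
Improved monitor.pg_table_bloat and monitor.pg_index_bloat: security-definer wrapper functions work around PolarDB statistics view permission issues.
During each module's monitoring registration phase, the prometheus_enabled and grafana_enabled options are respected; nothing is registered when they are off.
Added the PGDATABASE and PGPORT environment variables to /etc/profile.d/pgsql.sh, set to pg_primary_db (default: postgres)

Changes

Removed the PolarDB 11 and CloudberryDB 1.5.4 RPM/APT packages from the Pigsty PGSQL repo.
PolarDB 15 and CloudberryDB 1.6.0 RPM/APT packages are now distributed from a dedicated repo.

Bug fixes

Fixed the wrong file name for Redis under /etc/tmp.files.d.
The PGHOST and PGPORT environment variables are now set when managing pgbouncer users.
Temporarily removed pg_snakeoil extension support, because the EL8 upstream clamav package has a missing dependency.
Removed the Notify / Handler from the pgsql role for compatibility with the older Ansible 2.9.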

v3.0.1

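Feature improvements

PolarDB Oracle compatibility mode support (requires a third-party commercial closed-source kernel)
Monitoring views and related SQL statements rewritten in Oracle-compatible SQL syntax
Patroni 4 support and adaptation
New extension pg_analytics, which adds analytics capability to PG through duckdb
New extensions odbc_fdw and jdbc_fdw, which provide generic connectivity to external data sources
New kernel in the repo: cloudberrydb (an open-source fork by the original Greenplum developers)
New tool in the repo: walminer, which extracts the original SQL from WAL (at the replica level). Advanced features require purchasing a license.
Updated the execution plan visualization tool Pev2 to 1.12.1
New Grafana plugin: volkovlabs-rss-datasource
Added installed and available extensions to the PGCAT databases dashboard
After initialization, the PGSQL primary restarts once so that pg_param & pg_files take effect, so Supabase PG / PolarDB clusters no longer need a restart after provisioning.

Bug fixes

Fixed Grafana 11.1.4 panel plugins not loading by default
Fixed the BlackBox Exporter ping probe failing on certain operating systems
The /var/run/postgresql and /var/run/redis temporary directories are now always created automatically after a reboot
Fixed the cache.yml playbook not removing the old patroni 3.0.4 RPM package correctly
Fixed incorrect descriptions in a few alerting rules
Removed obsolete Bootstrap User/HBA parameters from the Patroni configuration file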

v3.0.0

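Highlights

Extension explosion:

Pigsty v3 provides an unprecedented 333 available extensions. These include 121 extension RPM packages and 133 DEB packages, more than the total number of extensions provided by the official PGDG repo (135 RPM / 109 DEB). Pigsty has also ported the PG extensions unique to the EL and Debian ecosystems to each other, aligning the extension ecosystems of the two major distribution families.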

- timescaledb periods temporal_tables emaj table_version pg_cron pg_later pg_background pg_timetable
- postgis pgrouting pointcloud pg_h3 q3c ogr_fdw geoip #pg_geohash #mobilitydb
- pgvector pgvectorscale pg_vectorize pg_similarity pg_tiktoken pgml #smlar
- pg_search pg_bigm zhparser hunspell
- hydra pg_lakehouse pg_duckdb duckdb_fdw pg_fkpart pg_partman plproxy #pg_strom citus
- pg_hint_plan age hll rum pg_graphql pg_jsonschema jsquery index_advisor hypopg imgsmlr pg_ivm pgmq pgq #rdkit
- pg_tle plv8 pllua plprql pldebugger plpgsql_check plprofiler plsh #pljava plr pgtap faker dbt2
- prefix semver pgunit md5hash asn1oid roaringbitmap pgfaceting pgsphere pg_country pg_currency pgmp numeral pg_rational pguint ip4r timestamp9 chkpass #pg_uri #pgemailaddr #acl #debversion #pg_rrule
- topn pg_gzip pg_http pg_net pg_html5_email_address pgsql_tweaks pg_extra_time pg_timeit count_distinct extra_window_functions first_last_agg tdigest aggs_for_arrays pg_arraymath pg_idkit pg_uuidv7 permuteseq pg_hashids
- sequential_uuids pg_math pg_random pg_base36 pg_base62 floatvec pg_financial pgjwt pg_hashlib shacrypt cryptint pg_ecdsa pgpcre icu_ext envvar url_encode #pg_zstd #aggs_for_vecs #quantile #lower_quantile #pgqr #pg_protobuf
- pg_repack pg_squeeze pg_dirtyread pgfincore pgdd ddlx pg_prioritize pg_checksums pg_readonly safeupdate pg_permissions pgautofailover pg_catcheck preprepare pgcozy pg_orphaned pg_crash pg_cheat_funcs pg_savior table_log pg_fio #pgpool pgagent
- pg_profile pg_show_plans pg_stat_kcache pg_stat_monitor pg_qualstats pg_store_plans pg_track_settings pg_wait_sampling system_stats pg_meta pgnodemx pg_sqlog bgw_replstatus pgmeminfo toastinfo pagevis powa pg_top #pg_statviz #pgexporter_ext #pg_mon
- passwordcheck supautils pgsodium pg_vault anonymizer pg_tde pgsmcrypto pgaudit pgauditlogtofile pg_auth_mon credcheck pgcryptokey pg_jobmon logerrors login_hook set_user pg_snakeoil pgextwlist pg_auditor noset #sslutils
- wrappers multicorn mysql_fdw tds_fdw sqlite_fdw pgbouncer_fdw mongo_fdw redis_fdw pg_redis_pubsub kafka_fdw hdfs_fdw firebird_fdw aws_s3 log_fdw #oracle_fdw #db2_fdw
- orafce pgtt session_variable pg_statement_rollback pg_dbms_metadata pg_dbms_lock pgmemcache #pg_dbms_job #wiltondb
- pglogical pgl_ddl_deploy pg_failover_slots wal2json wal2mongo decoderbufs decoder_raw mimeo pgcopydb pgloader pg_fact_loader pg_bulkload pg_comparator pgimportdoc pgexportdoc #repmgr #slony
- gis-stack rag-stack fdw-stack fts-stack etl-stack feat-stack olap-stack supa-stack stat-stack json-stack

可插拔内核：

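Pigsty v3 lets you swap the PostgreSQL kernel. It currently supports the SQL Server-compatible Babelfish (wire-protocol-level emulation), the Oracle-compatible IvorySQL, and PolarDB, the PG flavor of RAC; in addition, self-hosted Supabase is now available on Debian systems as well. You can have a production-grade PostgreSQL cluster in Pigsty, complete with HA, IaC, PITR, and monitoring, emulate MSSQL (via WiltonDB), Oracle (via IvorySQL), Oracle RAC (via PolarDB), MongoDB (via FerretDB), and Firebase (via Supabase).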

专业级服务：

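We now offer Pigsty Pro, a professional edition that adds value-added services on top of the open-source version. It provides extra modules (MSSQL, Oracle, Mongo, K8S, Victoria, Kafka, TigerBeetle, and others) and broader support for PG major versions, operating systems, and chip architectures. It also provides offline installation packages tailored to the exact minor version of every supported OS, plus support for end-of-life systems such as EL7, Debian 11, and Ubuntu 20.04. In addition, the professional edition offers a pluggable-kernel customization service, as well as native deployment, monitoring, and management support for PolarDB PG to meet "domestic database" (国产化) requirements.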

使用以下命令快速安装体验：

curl -fsSL https://repo.pigsty.cc/get | bash
cd ~/pigsty; ./bootstrap; ./configure; ./install.yml

重大变更

本次 Pigsty 发布调整大版本号，从 2.x 升级到 3.0，带有一些重大变更：

首要支持操作系统调整为：EL 8 / EL 9 / Debian 12 / Ubuntu 22.04
- EL7 / Debian 11 / Ubuntu 20.04 等系统进入弃用阶段，不再提供支持
- 有在这些系统上运行需求的用户请考虑我们的订阅服务
默认使用在线安装，不再提供离线软件包，从而解决操作系统小版本兼容性问题。
- bootstrap 过程现在不再询问是否下载离线安装包，但如果 /tmp/pkg.tgz 存在，仍然会自动使用离线安装包。
- 有离线安装需求请自行制作离线软件包或考虑我们的订阅服务
Pigsty 使用的上游软件仓库进行统一调整，地址变更，并对所有软件包进行 GPG 签名与校验
- 标准仓库： https://repo.pigsty.io/{apt/yum}
- 国内镜像： https://repo.pigsty.cc/{apt/yum}
API 参数变更与配置模板变更
- EL 系与 Debian 系配置模板现在收拢统一，有差异的参数统一放置于 roles/node_id/vars/ 目录进行管理。
- 配置目录变更，所有配置文件模板统一放置在 conf 目录下，并分为 default, dbms, demo, build 四大类。
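
The repository change above lists a standard repo (repo.pigsty.io) and a China mirror (repo.pigsty.cc), and these release notes also say that repo.pigsty.io in repo_url_packages is swapped for repo.pigsty.cc when the region is China. A minimal Python sketch of that substitution follows. The function name is hypothetical and this is not Pigsty's actual implementation; only the two host names and the `china` region value come from this document.

```python
from urllib.parse import urlsplit, urlunsplit

# Host names taken from the release notes; everything else is illustrative.
STANDARD_HOST = "repo.pigsty.io"
CHINA_MIRROR_HOST = "repo.pigsty.cc"


def localize_repo_url(url: str, region: str = "default") -> str:
    """Swap the standard repo host for the China mirror when region is 'china'."""
    parts = urlsplit(url)
    if region == "china" and parts.netloc == STANDARD_HOST:
        # SplitResult is a namedtuple, so _replace returns a modified copy.
        parts = parts._replace(netloc=CHINA_MIRROR_HOST)
    return urlunsplit(parts)


print(localize_repo_url("https://repo.pigsty.io/yum/infra", region="china"))
```

URLs that point at any other host are returned unchanged, so third-party download links are not affected by the region setting in this sketch.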

其他新特性

PG OLAP 分析能力史诗级加强：DuckDB 1.0.0，DuckDB FDW，以及 PG Lakehouse，Hydra 移植至 Deb 系统中。
PG 向量检索与全文检索能力加强：Vectorscale 提供 DiskANN 向量索引，Hunspell 分词字典支持，pg_search 0.8.6。
帮助 ParadeDB 解决了软件包构建问题，现在我们在 Debian/Ubuntu 上也能提供这一扩展。
Supabase 所需的扩展在 Debian/Ubuntu 上全部可用，Supabase 现在可在全OS上自托管。
提供了场景化预置扩展堆栈的能力，如果您不知道安装哪些扩展，我们准备了针对特定应用场景的扩展推荐包（Stack）。
针对所有 PostgreSQL 生态的扩展，制作了元数据表格、文档、索引、名称映射，针对 EL与Deb 进行对齐，确保扩展可用性。
为了解决 DockerHub 被 Ban 的问题，我们加强了 proxy_env 参数的功能并简化其配置方式。
建设了一个专用的新软件仓库，提供了 12-17 版本的全部扩展插件，其中，PG16的扩展仓库会在 Pigsty 默认的版本中实装。
现有软件仓库升级改造，使用标准的签名与校验机制，确保软件包的完整性与安全性。APT 仓库采用新的标准布局通过 reprepro 构建。
提供了 1,2,3,4,43 节点的沙箱环境：meta, dual, trio, full, prod，以及针对 7 大 OS Distro 的快捷配置模板。
PG Exporter 新增了 PostgreSQL 17 与 pgBouncer 1.23 新监控指标收集器的定义，与使用这些指标的 Grafana Panel
监控面板修缮，修复了各种问题，为 PGSQL Pgbouncer 与 PGSQL Patroni 监控面板添加了日志仪表盘。
使用全新的 cache.yml Ansible 剧本，替换了原有制作离线软件包的 bin/cache 与 bin/release-pkg 脚本。

API变更

新参数选项： pg_mode 现在支持的模式有 pgsql, citus, gpsql, mssql, ivory, polar，用于指定 PostgreSQL 集群的模式
- pgsql：标准 PostgreSQL 高可用集群
- citus： Citus 水平分布式 PostgreSQL 原生高可用集群
- gpsql：用于 Greenplum 与 GP 兼容数据库的监控（专业版）
- mssql：安装 WiltonDB / Babelfish，提供 Microsoft SQL Server 兼容性模式的标准 PostgreSQL 高可用集群，线缆协议级支持，扩展不可用
- ivory：安装 IvorySQL 提供的 Oracle 兼容性 PostgreSQL 高可用集群，Oracle语法/数据类型/函数/存储过程兼容，扩展不可用（专业版）
- polar：安装 PolarDB for PostgreSQL （PG RAC）开源版本，提供国产化数据库能力支持，扩展不可用。（专业版）
新参数： pg_parameters，用于在实例级别指定 postgresql.auto.conf 中的参数，覆盖集群配置，实现不同实例成员的个性化配置。
新参数： pg_files，用于将额外的文件拷贝到PGDATA数据目录，针对需要License文件的商业版PostgreSQL分叉内核设计。
新参数： repo_extra_packages，用于额外指定需要下载的软件包，与 repo_packages 共同使用，便于指定OS版本独有的扩展列表。
参数重命名： patroni_citus_db 重命名为 pg_primary_db，用于指定集群中的主要数据库（在 Citus 模式中使用）
参数强化：proxy_env 中的代理服务器配置会写入 Docker Daemon，解决科学上网问题，configure -x 选项会自动在配置中写入当前环境中的代理服务器配置。
参数强化：infra_portal 参数现在支持指定 path 选项，对外暴露本机上的目录，提供web服务。
参数强化：repo_url_packages 中的 repo.pigsty.io 会在区域为中国时自动替换为 repo.pigsty.cc，解决科学上网问题，此外，现在可以指定下载后的文件名称。
参数强化：pg_databases.extensions 中的 extension 字段现在可以支持字典与扩展名字符串两种模式，字典模式提供 version 支持，允许安装特定版本的扩展。
参数强化：repo_upstream 参数如果没有显式覆盖定义，将从 rpm.yml 或 deb.yml 中定义的 repo_upstream_default 提取对应系统的默认值。
参数强化：repo_packages 参数如果没有显式覆盖定义，将从 rpm.yml 或 deb.yml 中定义的 repo_packages_default 提取对应系统的默认值。
参数强化：infra_packages 参数如果没有显式覆盖定义，将从 rpm.yml 或 deb.yml 中定义的 infra_packages_default 提取对应系统的默认值。
参数强化：node_default_packages 参数如果没有显式覆盖定义，将从 rpm.yml 或 deb.yml 中定义的 node_packages_default 提取对应系统的默认值。
参数强化：pg_packages 与 pg_extensions 中的扩展现在都会从 rpm.yml 或 deb.yml 中定义的 pg_package_map 执行一次查找与翻译。
参数强化：node_packages 与 pg_extensions 参数中指定的软件包在安装时会升级至最新版本， node_packages 中现在默认值变为 [openssh-server]，帮助修复 OpenSSH CVE
参数强化：pg_dbsu_uid 会自动根据操作系统类型调整为 26 （EL）或 543 （Debian），避免了手工调整。
Set pgbouncer defaults: max_prepared_statements = 128 enables prepared statement support in transaction pooling mode, and server_lifetime is set to 600.
修改了 patroni 模板默认参数，统一增大 max_worker_processes +8 可用后端进程，提高 max_wal_senders 与 max_replication_slots 至 50，并增大 OLAP 模板临时文件的大小限制为主磁盘的 1/5
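
Several entries above describe the same fallback rule: when a parameter such as repo_packages is not explicitly overridden, its value is taken from an OS-specific default defined in rpm.yml or deb.yml. The sketch below illustrates that lookup in Python under stated assumptions. The `OS_DEFAULTS` table, its placeholder package names, and the `resolve_param` helper are hypothetical; only the parameter names and the pg_dbsu_uid values (26 for EL, 543 for Debian) come from these notes.

```python
# Hypothetical stand-in for the per-OS defaults kept in rpm.yml / deb.yml.
# The package names are placeholders, not Pigsty's real default lists.
OS_DEFAULTS = {
    "rpm": {
        "repo_packages_default": ["example-rpm-package"],
        "node_packages_default": ["example-rpm-node-package"],
        "pg_dbsu_uid_default": 26,   # EL value stated in the notes
    },
    "deb": {
        "repo_packages_default": ["example-deb-package"],
        "node_packages_default": ["example-deb-node-package"],
        "pg_dbsu_uid_default": 543,  # Debian value stated in the notes
    },
}


def resolve_param(name, user_config, os_family, default_key=None):
    """Return the explicit value if the user set one, else the OS default."""
    if name in user_config:
        return user_config[name]
    key = default_key or f"{name}_default"
    return OS_DEFAULTS[os_family][key]


# node_default_packages falls back to a differently named default key.
print(resolve_param("node_default_packages", {}, "deb",
                    default_key="node_packages_default"))
print(resolve_param("pg_dbsu_uid", {}, "rpm"))
```

An explicit value always wins, which matches the "if not explicitly overridden" wording used for each of these parameters.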

版本升级

截止至发布时刻，Pigsty 主要组件的版本升级如下：

PostgreSQL 16.4, 15.8, 14.13, 13.16, 12.20
pg_exporter : 0.7.0
Patroni: 3.3.2
pgBouncer: 1.23.1
pgBackRest: 2.53.1
duckdb : 1.0.0
etcd : 3.5.15
pg_timetable: 5.9.0
ferretdb: 1.23.1
vip-manager: 2.6.0
minio: 20240817012454
mcli: 20240817113350
grafana : 11.1.4
loki : 3.1.1
promtail : 3.0.0
prometheus : 2.54.0
pushgateway : 1.9.0
alertmanager : 0.27.0
blackbox_exporter : 0.25.0
nginx_exporter : 1.3.0
node_exporter : 1.8.2
keepalived_exporter : 0.7.0
pgbackrest_exporter : 0.18.0
mysqld_exporter : 0.15.1
redis_exporter : v1.62.0
kafka_exporter : 1.8.0
mongodb_exporter : 0.40.0
VictoriaMetrics : 1.102.1
VictoriaLogs : v0.28.0
sealos: 5.0.0
vector : 0.40.0

Pigsty has recompiled all PostgreSQL extensions. For the latest versions of the 333 available extensions, see the extension list (扩展列表).

新应用

Pigsty 现在提供开箱即用的 Dify 与 Odoo 两款使用 PostgreSQL 软件的 Docker Compose 模板：

Dify： AI智能体工作流编排与 LLMOps，使用 PostgreSQL 作为元数据库，PGVector 作为向量存储。
Odoo：企业级开源 ERP 系统，使用 PostgreSQL 作为底层数据库。

Pigsty 专业版现在提供试点的 Kubernetes 部署支持与 Kafka KRaft 集群部署与监控支持

KUBE：使用 cri-dockerd 或 containerd 部署由 Pigsty 托管的 Kubernetes 集群
KAFKA: deploys a highly available Kafka cluster backed by the KRaft protocol

问题修复

修复了 Ubuntu / Debian 系统中，节点重启后可能出现的 postgresql-common 服务自动启动替代默认数据库集群的缺陷
通过 node_packages 中的默认值 [openssh-server]，CVE-2024-6387 可以在 Pigsty 安装过程中被自动修复。
修复了 Loki 解析 Nginx 日志标签基数过大导致的内存消耗问题。
修复了 EL8 系统中上游 Ansible 依赖变化导致的 bootstrap 失效问题（python3.11-jmespath 升级至 python3.12-jmespath）

v2.7.0

亮点特性

新增了大量强力扩展插件，特别是一些使用 rust 与 pgrx 进行开发的强力扩展：

pg_search v0.7.0：使用 BM25 算法对 SQL 表进行全文搜索
pg_lakehouse v0.7.0：在对象存储（如 S3）和表格式（如 DeltaLake）上进行查询的引擎
pg_analytics v0.6.1：加速 PostgreSQL 内部的分析查询处理
pg_graphql v1.5.4：为 PostgreSQL 数据库提供 GraphQL 支持
pg_jsonschema v0.3.1：提供 JSON Schema 校验的 PostgreSQL 扩展
wrappers v0.3.1：由 Supabase 提供的 PostgreSQL 外部数据封装器集合
pgmq v1.5.2：轻量级消息队列，类似于 AWS SQS 和 RSMQ
pg_tier v0.0.3: supports tiering cold data to AWS S3
pg_vectorize v0.15.0: 在 PG 中实现 RAG 向量检索的封装
pg_later v0.1.0：现在执行 SQL，并在稍后获取结果
pg_idkit v0.2.3：生成多种流行类型的标识符（UUID）
plprql v0.1.0：在 PostgreSQL 中使用 PRQL 查询语言
pgsmcrypto v0.1.0：PostgreSQL 的国密 SM 算法扩展
pg_tiktoken v0.0.1：计算 OpenAI 使用的 Token 数量
pgdd v0.5.2：通过纯 SQL 接口，访问数据目录的元数据

当然，也有一些使用原生 C 和 C++ 开发的强力扩展：

parquet_s3_fdw 1.1.0：从 S3 存取 Parquet 格式文件，作为湖仓之用
plv8 3.2.2：使用 V8 引擎，允许在 PostgreSQL 中使用 Javascript 语言编写存储过程
md5hash 1.0.1：用于存储原生MD5哈希数据类型，而非文本。
pg_tde 1.0 alpha：PostgreSQL 的实验性加密存储引擎。
pg_dirtyread 2.6：从 PostgreSQL 表中读取未清理的死元组，用于脏读
新的 deb PGDG 扩展：pg_roaringbitmap, pgfaceting, mobilitydb, pgsql-http, pg_hint_plan, pg_statviz, pg_rrule
新的 rpm PGDG 扩展：pg_profile, pg_show_plans, 使用 PGDG 的 pgsql_http, pgsql_gzip, pg_net, pg_bigm 替代 Pigsty 维护的 RPM。

新特性

允许 Pigsty 在特定 Docker 虚拟机镜像中运行。
针对 Ubuntu 与 EL 系操作系统发行版准备了 INFRA & PGSQL 模块的 arm64 软件包
新安装脚本，可从 cloudflare 下载软件，可以指定版本，提供更完善的提示信息。
新增的 PGSQL PITR 监控面板，用于在 PITR 过程中提供更好的可观测性
针对在 Docker 虚拟机镜像中运行 Pigsty 进行了一系列铺垫与准备。
新增了防呆设计，避免在非 Pigsty 纳管的节点上运行 pgsql.yml 剧本（AdamYLK）
针对每个支持的发行版大版本配置了独立的配置文件：el7, el8, el9, debian11, debian12, ubuntu20, ubuntu22

软件版本升级

PostgreSQL 16.3
Patroni 3.3.0
pgBackRest 2.51
VIP-Manager v2.5.0
Haproxy 2.9.7
Grafana 10.4.2
Prometheus 2.51
Loki & Promtail: 3.0.0 (警告：大版本非兼容性变更！)
Alertmanager 0.27.0
BlackBox Exporter 0.25.0
Node Exporter 1.8.0
pgBackrest Exporter 0.17.0
duckdb 0.10.2
etcd 3.5.13
minio-20240510014138 / mcli-20240509170424
pev2 v1.8.0 -> v1.11.0
pgvector 0.6.1 -> 0.7.0
pg_tle: v1.3.4 -> v1.4.0
hydra: v1.1.1 -> v1.1.2
duckdb_fdw: v1.1.0 重新针对 libduckdb 0.10.2 进行编译
pg_bm25 0.5.6 -> pg_search 0.7.0
pg_analytics: 0.5.6 -> 0.6.1
pg_graphql: 1.5.0 -> 1.5.4
pg_net 0.8.0 -> 0.9.1
pg_sparse (deprecated)

Docker应用模板

Odoo：开源 ERP 软件与插件
Jupyter：使用容器运行 Jupyter Notebook
PolarDB：运行“国产数据库” PolarDB，应付信创检查！
supabase：更新至最近的 GA 版本
bytebase：使用 latest 标签替代特定版本号。
pg_exporter：更新了 Docker 镜像的例子。

缺陷修复

修复了 pg_exporters 角色中的变量空白问题。
修复了 minio_cluster 变量没有在全局配置中注释掉的问题
修复了 EL7 模板中的 postgis34 插件名称问题，应该使用 postgis33
修复了 EL8 python3.11-cryptography 依赖名的问题，上游现在变更为 python3-cryptography。
修复了 /pg/bin/pg-role 无法在非交互式 Shell 模式下获取操作系统用户名的问题
修复了 /pg/bin/pg-pitr 无法正确提示 -X -P 选项的问题

API变更

新参数 node_write_etc_hosts，用于控制是否向目标节点的 /etc/hosts 文件写入静态 DNS 解析记录
新增了 prometheus_sd_dir 参数，用于指定 Prometheus 静态服务发现的目标文件目录
configure 脚本新增了 -x|--proxy 参数，用于将当前环境的代理信息写入配置文件 by @waitingsong in https://github.com/pgsty/pigsty/pull/405
不再使用 Promtail & Loki 解析 Infra 节点上的 Nginx 日志细节标签，因为这样会导致标签基数爆炸。
在 Prometheus 配置中使用 alertmanager API v2 替代 v1
在 PGSQL 模块中，使用 /pg/cert/ca.crt 代替 /etc/pki/ca.crt，降低对节点根证书的依赖。

新的贡献者

@NeroSong made their first contribution in https://github.com/pgsty/pigsty/pull/373
@waitingsong made their first contribution in https://github.com/pgsty/pigsty/pull/405

完整的变更日志: https://github.com/pgsty/pigsty/compar

离线软件包校验和

ec271a1d34b2b1360f78bfa635986c3a  pigsty-pkg-v2.7.0.el8.x86_64.tgz
f3304bfd896b7e3234d81d8ff4b83577  pigsty-pkg-v2.7.0.debian12.x86_64.tgz
5b071c2a651e8d1e68fc02e7e922f2b3  pigsty-pkg-v2.7.0.ubuntu22.x86_64.tgz
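
The checksum lists in these release notes appear in two layouts: the md5sum layout used above (`<hash>  <file>`) and the BSD layout used in older releases (`MD5 (<file>) = <hash>`). The following Python sketch parses either layout and checks a downloaded file against the expected value. The helper names are illustrative and are not part of Pigsty.

```python
import hashlib
import re

# md5sum layout: "<hash>  <file>"      BSD layout: "MD5 (<file>) = <hash>"
_GNU = re.compile(r"^([0-9a-f]{32})\s+(\S+)")
_BSD = re.compile(r"^MD5 \((.+)\) = ([0-9a-f]{32})$")


def parse_checksum_line(line):
    """Return (filename, md5hex) for either checksum layout, or None."""
    line = line.strip()
    m = _GNU.match(line)
    if m:
        return m.group(2), m.group(1)
    m = _BSD.match(line)
    if m:
        return m.group(1), m.group(2)
    return None


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True when the file's MD5 matches the expected hex digest."""
    return md5_of_file(path) == expected_md5.lower()
```

A typical use would be to call `parse_checksum_line` on the line for your platform and then `verify("/tmp/pkg.tgz", md5hex)` before installing from the offline package.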

v2.6.0

亮点特性

现已将 PostgreSQL 16 作为默认主要版本（16.2）
新增 ParadeDB 扩展插件：pg_analytics, pg_bm25, and pg_sparse
新增 DuckDB 与 duckdb_fdw 插件支持
全球 Cloudflare CDN https://repo.pigsty.io 与中国大陆CDN https://repo.pigsty.cc

软件配置变更

使用 node_repo_modules 替换 node_repo_method 参数，并移除 node_repo_local_urls 参数。
暂时关闭 Grafana 统一告警功能，避免 “Database Locked” 错误。
新增 node_repo_modules 参数，用于指定在节点上添加的上游仓库源。
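Removed node_repo_local_urls; its function is replaced by node_repo_modules & repo_upstream.
Removed the node_repo_method parameter; its function is replaced by node_repo_modules.
Added a new local source to repo_upstream, used through node_repo_modules, replacing the function of node_repo_local_urls.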
重排 node_default_packages，infra_packages，pg_packages，pg_extensions 参数默认值。
在 repo_upstream 中替换 repo_upstream.baseurl 时，如果 EL8/9 PGDG小版本特定的仓库可用，使用 major.minor 而不是 major 替换 $releasever，提高小版本兼容性。
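
The last item describes how $releasever in a repo baseurl is filled in: with major.minor when a minor-version-specific PGDG repo is available on EL8/9, and with the major version otherwise. A minimal Python sketch of that rule follows. The function, its parameters, and the example URL are hypothetical; how Pigsty detects whether a minor-specific repo exists is not described here and is reduced to a boolean flag.

```python
def render_baseurl(baseurl, os_major, os_minor=None, minor_repo_available=False):
    """Replace $releasever with major.minor when a minor-specific repo exists."""
    if minor_repo_available and os_minor is not None:
        releasever = f"{os_major}.{os_minor}"
    else:
        releasever = str(os_major)
    return baseurl.replace("$releasever", releasever)


# Placeholder URL for illustration only.
template = "https://example.com/pgdg/rhel-$releasever-x86_64"
print(render_baseurl(template, 8, 9, minor_repo_available=True))
print(render_baseurl(template, 8, 9, minor_repo_available=False))
```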

软件版本升级

Grafana 10.3
Prometheus 2.47
node_exporter 1.7.0
HAProxy 2.9.5
Loki / Promtail 2.9.4
minio-20240216110548 / mcli-20240217011557
etcd 3.5.11
Redis 7.2.4
Bytebase 2.13.2
DuckDB 0.10.0
FerretDB 1.19
Metabase：新Docker应用模板

PostgreSQL扩展插件

PostgreSQL 小版本升级： 16.2, 15.6, 14.11, 13.14, 12.18
PostgreSQL 16：现在被提升为默认主版本
pg_exporter 0.6.1：安全修复
Patroni 3.2.2
pgBadger 12.4
pgBackRest 2.50
vip-manager 2.3.0
PostGIS 3.4.2
TimescaleDB 2.14.1
向量扩展 PGVector 0.6.0：新增并行创建 HNSW 索引功能
New extension duckdb_fdw, for reading and writing DuckDB data, v1.1
新增扩展插件 pgsql-gzip ，用于支持 Gzip 压缩解压缩 v1.0.0
新增扩展插件 pg_sparse，高效处理稀疏向量（ParadeDB） v0.5.6
新增扩展插件 pg_bm25，用于支持高质量全文检索 BM25 算法的插件（ParadeDB） v0.5.6
新增扩展插件 pg_analytics，支持 SIMD 与列式存储的PG分析插件（ParadeDB） v0.5.6
升级AIML插件 pgml 至 v2.8.1，新增 PG 16 支持。
升级列式存储插件 hydra 版本至 v1.1.1，新增 PG 16 支持。
升级图扩展插件 age 至 v1.5.0，新增 PG 16 支持。
升级GraphQL插件 pg_graphql 版本至 v1.5.0 ，支持 Supabase。

330e9bc16a2f65d57264965bf98174ff  pigsty-v2.6.0.tgz
81abcd0ced798e1198740ab13317c29a  pigsty-pkg-v2.6.0.debian11.x86_64.tgz
7304f4458c9abd3a14245eaf72f4eeb4  pigsty-pkg-v2.6.0.debian12.x86_64.tgz
f914fbb12f90dffc4e29f183753736bb  pigsty-pkg-v2.6.0.el7.x86_64.tgz
fc23d122d0743d1c1cb871ca686449c0  pigsty-pkg-v2.6.0.el8.x86_64.tgz
9d258dbcecefd232f3a18bcce512b75e  pigsty-pkg-v2.6.0.el9.x86_64.tgz
901ee668621682f99799de8932fb716c  pigsty-pkg-v2.6.0.ubuntu20.x86_64.tgz
39872cf774c1fe22697c428be2fc2c22  pigsty-pkg-v2.6.0.ubuntu22.x86_64.tgz

v2.5.1

跟进 PostgreSQL v16.1, v15.5, 14.10, 13.13, 12.17, 11.22 小版本例行更新。

现在 PostgreSQL 16 的所有重要扩展已经就位（新增 pg_repack 与 timescaledb 支持）

软件更新：
- PostgreSQL to v16.1, v15.5, 14.10, 13.13, 12.17, 11.22
- Patroni v3.2.0
- PgBackrest v2.49
- Citus 12.1
- TimescaleDB 2.13
- Grafana v10.2.0
- FerretDB 1.15
- SealOS 4.3.7
- Bytebase 2.11.1

Removed the monitor schema prefix from the queries in the PGCAT dashboards (allowing users to install the pg_stat_statements extension elsewhere).
新的配置模板 wool.yml，为阿里云免费99 ECS 单机针对设计。
为 EL9 新增 python3-jmespath 软件包，解决 Ansible 依赖更新后 bootstrap 缺少 jmespath 的问题

31ee48df1007151009c060e0edbd74de  pigsty-pkg-v2.5.1.el7.x86_64.tgz
a40f1b864ae8a19d9431bcd8e74fa116  pigsty-pkg-v2.5.1.el8.x86_64.tgz
c976cd4431fc70367124fda4e2eac0a7  pigsty-pkg-v2.5.1.el9.x86_64.tgz
7fc1b5bdd3afa267a5fc1d7cb1f3c9a7  pigsty-pkg-v2.5.1.debian11.x86_64.tgz
add0731dc7ed37f134d3cb5b6646624e  pigsty-pkg-v2.5.1.debian12.x86_64.tgz
99048d09fa75ccb8db8e22e2a3b41f28  pigsty-pkg-v2.5.1.ubuntu20.x86_64.tgz
431668425f8ce19388d38e5bfa3a948c  pigsty-pkg-v2.5.1.ubuntu22.x86_64.tgz

v2.5.0

curl https://get.pigsty.cc/latest | bash

亮点特性

Ubuntu / Debian 支持： bullseye, bookworm, jammy, focal
使用CDN repo.pigsty.cc 软件源，提供 rpm/deb 软件包下载。
Anolis 操作系统支持（兼容 EL 8.8 ）。
使用 PostgreSQL 16 替代 PostgreSQL 14 作为备选主要支持版本
新增了 PGSQL Exporter / PGSQL Patroni 监控面板，重做 PGSQL Query 面板
扩展更新：
- PostGIS 版本至 3.4（ EL8/EL9 ），EL7 仍使用 PostGIS 3.3
- 移除 pg_embedding，因为开发者不再对其进行维护，建议使用 pgvector 替换。
- 新扩展（EL）：点云插件 pointcloud 支持，Ubuntu原生带有此扩展。
- 新扩展（EL）： imgsmlr， pg_similarity，pg_bigm 用于搜索。
- Recompiled pg_filedump as a package independent of the PG major version.
- 新收纳 hydra 列存储扩展，不再默认安装 citus 扩展。
软件更新：
- Grafana 更新至 v10.1.5
- Prometheus 更新至 v2.47
- Promtail/Loki 更新至 v2.9.1
- Node Exporter 更新至 v1.6.1
- Bytebase 更新至 v2.10.0
- patroni 更新至 v3.1.2
- pgbouncer 更新至 v1.21.0
- pg_exporter 更新至 v0.6.0
- pgbackrest 更新至 v2.48.0
- pgbadger 更新至 v12.2
- pg_graphql 更新至 v1.4.0
- pg_net 更新至 v0.7.3
- ferretdb updated to v1.12.1
- sealos 更新至 4.3.5
- Supabase 支持更新至 20231013070755

Ubuntu 支持说明

Pigsty 支持了 Ubuntu 22.04 (jammy) 与 20.04 (focal) 两个 LTS 版本，并提供相应的离线软件安装包。

相比 EL 系操作系统，一些参数的默认值需要显式指定调整，详情请参考 ubuntu.yml

repo_upstream：按照 Ubuntu/Debian 的包名进行了调整
repo_packages：按照 Ubuntu/Debian 的包名进行了调整
node_repo_local_urls：默认值为 ['deb [trusted=yes] http://${admin_ip}/pigsty ./']
node_default_packages ：
- zlib -> zlib1g, readline -> libreadline-dev
- vim-minimal -> vim-tiny, bind-utils -> dnsutils, perf -> linux-tools-generic,
- 新增软件包 acl，确保 Ansible 权限设置正常工作
infra_packages：所有含 _ 的包要替换为 - 版本，此外 postgresql-client-16 用于替换 postgresql16
pg_packages：Ubuntu 下惯用 - 替代 _，不需要手工安装 patroni-etcd 包。
pg_extensions：扩展名称与EL系不太一样，Ubuntu下缺少 passwordcheck_cracklib 扩展。
pg_dbsu_uid： Ubuntu 下 Deb 包不显式指定uid，需要手动指定，Pigsty 默认分配为 543
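
The package renames listed above can be captured as a small lookup table plus the underscore-to-hyphen rule. The Python sketch below is illustrative only: the mapping contains just the renames named in these notes, and the `to_deb_name` helper is hypothetical, so real package lists should still be checked against ubuntu.yml.

```python
# Renames listed in the notes above (EL name -> Ubuntu/Debian name).
EL_TO_DEB = {
    "zlib": "zlib1g",
    "readline": "libreadline-dev",
    "vim-minimal": "vim-tiny",
    "bind-utils": "dnsutils",
    "perf": "linux-tools-generic",
    "postgresql16": "postgresql-client-16",
}


def to_deb_name(el_name):
    """Map an EL package name to its Debian-style name (illustrative only)."""
    if el_name in EL_TO_DEB:
        return EL_TO_DEB[el_name]
    # The notes say names containing "_" use "-" on Ubuntu.
    return el_name.replace("_", "-")


print([to_deb_name(p) for p in ["zlib", "node_exporter", "postgresql16", "jq"]])
```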

API变更

默认值变化：

repo_modules 现在的默认值为 infra,node,pgsql,redis,minio，启用所有上游源
repo_upstream 发生变化，现在添加了 Pigsty Infra/MinIO/Redis/PGSQL 模块化软件源
repo_packages 发生变化，移除未使用的 karma,mtail,dellhw_exporter，移除了 PG14 主要扩展，新增了 PG16 主要扩展，添加了 virtualenv 包。
node_default_packages 发生变化，默认安装 python3-pip 组件。
pg_libs: timescaledb 从 shared_preload_libraries 中移除，现在默认不自动启用。

pg_extensions 发生变化，不再默认安装 Citus 扩展，默认安装 passwordcheck_cracklib 扩展，EL8,9 PostGIS 默认版本升级至 3.4

- pg_repack_${pg_version}* wal2json_${pg_version}* passwordcheck_cracklib_${pg_version}*
- postgis34_${pg_version}* timescaledb-2-postgresql-${pg_version}* pgvector_${pg_version}*

Patroni 所有模板默认移除 wal_keep_size 参数，避免触发 Patroni 3.1.1 的错误，其功能由 min_wal_size 覆盖。

87e0be2edc35b18709d7722976e305b0  pigsty-pkg-v2.5.0.el7.x86_64.tgz
e71304d6f53ea6c0f8e2231f238e8204  pigsty-pkg-v2.5.0.el8.x86_64.tgz
39728496c134e4352436d69b02226ee8  pigsty-pkg-v2.5.0.el9.x86_64.tgz
e3f548a6c7961af6107ffeee3eabc9a7  pigsty-pkg-v2.5.0.debian11.x86_64.tgz
1e469cc86a19702e48d7c1a37e2f14f9  pigsty-pkg-v2.5.0.debian12.x86_64.tgz
cc3af3b7c12f98969d3c6962f7c4bd8f  pigsty-pkg-v2.5.0.ubuntu20.x86_64.tgz
c5b2b1a4867eee624e57aed58ac65a80  pigsty-pkg-v2.5.0.ubuntu22.x86_64.tgz

v2.4.1

Supabase 支持：开源的 Firebase 替代，现可使用 Pigsty 本地托管的 PostgreSQL 实例作为数据存储。
PostgresML支持：使用SQL完成经典机器学习算法，训练、微调、调用大语言模型（hugging face）。
FerretDB v1.10 支持，在 PostgreSQL 上提供 MongoDB API与协议兼容能力。
GraphQL扩展: pg_graphql：从现有模式中反射出 GraphQL 模式，提供库内 GraphQL 查询能力。
JWT支持扩展：pgjwt 允许您使用 SQL 验证签发 JWT (JSON Web Tokens)。
Key storage extension: vault provides a vault for securely storing encryption keys.
数据恢复扩展：pg_filedump：可用于快速从PostgreSQL二进制文件中恢复数据
图数据库扩展：Apache age，为 PostgreSQL 添加 OpenCypher 查询支持，类似 Neo4J
中文分词扩展：zhparser，为中文全文检索提供分词能力，类似 ElasticSearch。
高效位图扩展：pg_roaringbitmap，在 PostgreSQL 中提供 roaring bitmap 的支持，高效计数聚合统计。
向量嵌入替代：pg_embedding，提供了不同于 pgvector 的另一种 HNSW 替代实现。
可信语言扩展：pg_tle，由 AWS 出品的，允许您打包分发管理由可信存储过程语言编写的函数。
HTTP客户端扩展：pgsql-http：使用 SQL 接口，curl API，发起 HTTP 请求，与各类系统交互。
异步HTTP扩展： pg_net 允许您使用 SQL 发起非阻塞的 HTTP/HTTPS 请求。
列式存储引擎：hydra 针对分析场景打造的向量化列存储引擎，原地替代 Citus 列存插件。
其他PGDG扩展：新收录8个由PGDG维护的扩展插件，Pigsty支持的插件总数达到 150+ 。
PostgreSQL 16 内核支持，监控云端 RDS / PolarDB for PostgreSQL。

亮点特性

Supabase 支持：开源的 Firebase 替代，现可使用 Pigsty 托管的 PostgreSQL 实例存储数据。
PostgresML支持：在 PostgreSQL 运行各类模型（hugging face），向量操作，经典机器学习算法。
GraphQL支持扩展: pg_graphql：从现有模式中反射出 GraphQL 模式，提供库内 GraphQL 查询能力。
异步HTTP客户端扩展： pg_net 允许您使用 SQL 发起非阻塞的 HTTP/HTTPS 请求
JWT支持扩展：pgjwt 允许您使用 SQL 验证签发 JWT (JSON Web Tokens)
密钥存储扩展: vault 可以在保险柜里存储加密密钥
将FerretDB 版本升级至 v1.10
新增组件：pg_filedump：可用于快速从PostgreSQL二进制文件中恢复数据
减少 EL9 离线软件包的大小，移除非必须依赖项 proj-data*
修复了 Patroni 3.1.1 的错误

efabe7632d8994f3fb58f9838b8f9d7d  pigsty-pkg-v2.5.0.el7.x86_64.tgz # 1.1G
ea78957e8c8434b120d1c8c43d769b56  pigsty-pkg-v2.5.0.el8.x86_64.tgz # 1.4G
4ef280a7d28872814e34521978b851bb  pigsty-pkg-v2.5.0.el9.x86_64.tgz # 1.3G

v2.4.0

使用 bash -c "$(curl -fsSL https://get.pigsty.cc/latest)" 快速上手。

最新特性

PostgreSQL 16 正式发布，Pigsty提供支持。
可以监控云数据库，RDS for PostgreSQL，以及 PolarDB，提供全新的 PGRDS 监控面板
正式提供商业支持与咨询服务。并发布首个 LTS 版本，为订阅客户提供最长5年的支持。
新扩展插件: Apache AGE, openCypher graph query engine on PostgreSQL
新扩展插件: zhparser, full text search for Chinese language
新扩展插件: pg_roaringbitmap, roaring bitmap for PostgreSQL
新扩展插件: pg_embedding, hnsw alternative to pgvector
新扩展插件: pg_tle, admin / manage stored procedure extensions
新扩展插件: pgsql-http, issue http request with SQL interface
新增插件： pg_auth_mon pg_checksums pg_failover_slots pg_readonly postgresql-unit pg_store_plans pg_uuidv7 set_user
Redis改进：支持 Redis 哨兵监控，配置主从集群的自动高可用。

API变化

新增参数，REDIS.redis_sentinel_monitor，用于指定 Sentinel 集群监控的主库列表

问题修复

修复 Grafana 10.1 注册数据源时缺少 uid 的问题

MD5 (pigsty-pkg-v2.4.0.el7.x86_64.tgz) = 257443e3c171439914cbfad8e9f72b17
MD5 (pigsty-pkg-v2.4.0.el8.x86_64.tgz) = 41ad8007ffbfe7d5e8ba5c4b51ff2adc
MD5 (pigsty-pkg-v2.4.0.el9.x86_64.tgz) = 9a950aed77a6df90b0265a6fa6029250

v2.3.1

使用 bash -c "$(curl -fsSL https://get.pigsty.cc/latest)" 快速开始。

最新特性

pgvector 更新至 0.5，添加 hnsw 算法支持。
支持 PostgreSQL 16 RC1 (el8/el9)
默认包中添加了 SealOS 用于快速部署Kubernetes集群。

问题修复

修复了 infra.repo.repo_pkg 任务：当 repo_packages 中包名包含 * 时，下载可能会受到 /www/pigsty 现有内容的影响。
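Changed the default value of vip_dns_suffix from .vip to an empty string, so the cluster name itself is used by default as the DNS name of the node cluster's L2 VIP.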
modprobe watchdog and chown watchdog if patroni_watchdog_mode is required
当 pg_dbsu_sudo = limit and patroni_watchdog_mode = required 时，授予数据库 dbsu 以下命令的 sudo 执行权限
- /usr/bin/sudo /sbin/modprobe softdog：在启动 Patroni 服务时确保 softdog 内核模块启用
- /usr/bin/sudo /bin/chown {{ pg_dbsu }} /dev/watchdog: 在启动 Patroni 服务时，确保 watchdog 属主正确

文档更新

向英文文档中添加了更新内容。
添加了简体中文版本的内置文档，修复了 pigsty.cc 文档站的中文文档。

软件更新

PostgreSQL 16 RC1 for EL8/EL9
PGVector 0.5.0，支持 hnsw 索引
TimescaleDB 2.11.2
grafana 10.1.0
loki & promtail 2.8.4
redis-stack 7.2 on el7/8
mcli-20230829225506 / minio-20230829230735
ferretdb 1.9
sealos 4.3.3
pgbadger 12.2

ce69791eb622fa87c543096cdf11f970  pigsty-pkg-v2.3.1.el7.x86_64.tgz
495aba9d6d18ce1ebed6271e6c96b63a  pigsty-pkg-v2.3.1.el8.x86_64.tgz
38b45582cbc337ff363144980d0d7b64  pigsty-pkg-v2.3.1.el9.x86_64.tgz

v2.3.0

相关文章：《Pigsty v2.3 发布：应用生态丰富》

发布注记：https://github.com/pgsty/pigsty/releases/tag/v2.3.0

使用 bash -c "$(curl -fsSL https://get.pigsty.cc/latest)" 快速开始。

亮点特性

INFRA: 添加了对 NODE/PGSQL VIP 的监控支持
PGSQL: 通过小版本升级修复了 PostgreSQL CVE-2023-39417： 15.4, 14.9, 13.12, 12.16，以及 Patroni v3.1.0
NODE: 允许用户使用 keepalived 为一个节点集群绑定 L2 VIP
REPO: Pigsty 专用 yum 源优化精简，全站默认使用 HTTPS： get.pigsty.cc 与 demo.pigsty.cc
APP: 升级 app/bytebase 版本至 v2.6.0， app/ferretdb 版本至 v1.8；添加新的应用模板：nocodb，开源的 Airtable。
REDIS: 升级版本至 v7.2，并重制了 Redis 监控面板。
MONGO: 添加基于 FerretDB 1.8 实现的基本支持。
MYSQL: 添加了 Prometheus / Grafana / CA 中的代码存根，便于后续纳管。

API变化

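Added a new parameter group NODE.NODE_VIP with 8 new parameters:

NODE.VIP.vip_enabled: enable a VIP on this node cluster?
NODE.VIP.vip_address: node VIP address in IPv4 format, required if the VIP is enabled
NODE.VIP.vip_vrid: required, an integer from 1 to 255 that should be unique within the same VLAN
NODE.VIP.vip_role: master/backup, backup by default, used as the initial role
NODE.VIP.vip_preempt: optional, true/false, false by default, enables VIP preemption
NODE.VIP.vip_interface: network interface the node VIP listens on, eth0 by default
NODE.VIP.vip_dns_suffix: DNS name suffix for the node VIP, empty string by default
NODE.VIP.vip_exporter_port: keepalived exporter listen port, 9650 by default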
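
The constraints stated for the VIP parameters (address required when enabled, VRID between 1 and 255, role either master or backup) lend themselves to a quick pre-flight check. The Python sketch below is a hypothetical validator written for illustration; it is not part of Pigsty, and it assumes the parameters are available as a plain dictionary.

```python
import ipaddress


def validate_node_vip(cfg):
    """Return a list of problems found in a node VIP config dict."""
    errors = []
    if not cfg.get("vip_enabled", False):
        return errors  # nothing to check when the VIP is disabled

    address = cfg.get("vip_address")
    if not address:
        errors.append("vip_address is required when vip_enabled is true")
    else:
        try:
            ipaddress.IPv4Address(address)
        except ValueError:
            errors.append(f"vip_address is not a valid IPv4 address: {address}")

    vrid = cfg.get("vip_vrid")
    if isinstance(vrid, bool) or not isinstance(vrid, int) or not 1 <= vrid <= 255:
        errors.append("vip_vrid must be an integer in the range 1-255")

    if cfg.get("vip_role", "backup") not in ("master", "backup"):
        errors.append("vip_role must be 'master' or 'backup'")
    return errors


print(validate_node_vip({"vip_enabled": True, "vip_address": "10.10.10.99", "vip_vrid": 128}))
```

The check cannot confirm that the VRID is unique within the VLAN; that still has to be verified against the other keepalived groups on the network.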

MD5 (pigsty-pkg-v2.3.0.el7.x86_64.tgz) = 81db95f1c591008725175d280ad23615
MD5 (pigsty-pkg-v2.3.0.el8.x86_64.tgz) = 6f4d169b36f6ec4aa33bfd5901c9abbe
MD5 (pigsty-pkg-v2.3.0.el9.x86_64.tgz) = 4bc9ae920e7de6dd8988ca7ee681459d

v2.2.0

相关文章：《Pigsty v2.2 发布 —— 监控系统大升级》

发布注记：https://github.com/pgsty/pigsty/releases/tag/v2.2.0

快速开始： bash -c "$(curl -fsSL https://get.pigsty.cc/latest)"

亮点特性

监控面板重做: https://demo.pigsty.cc
Vagrant沙箱重做: 支持 libvirt 与新的配置模板
Pigsty EL Yum 仓库: 统一收纳零碎 RPM，简化安装构建流程。
操作系统兼容性: 新增信创操作系统 UOS-v20-1050e 支持
新的配置模板：42 节点的生产仿真配置
统一使用官方 PGDG citus 软件包（el7）

软件升级

PostgreSQL 16 beta2
Citus 12 / PostGIS 3.3.3 / TimescaleDB 2.11.1 / PGVector 0.4.4
patroni 3.0.4 / pgbackrest 2.47 / pgbouncer 1.20
grafana 10.0.3 / loki/promtail/logcli 2.8.3
etcd 3.5.9 / haproxy v2.8.1 / redis v7.0.12
minio 20230711212934 / mcli 20230711233044

Bug修复

修复了 Docker 组权限的问题 29434bd
将 infra 操作系统用户组作为额外的组，而不是首要用户组。
修复了 Redis Sentinel Systemd 服务的自动启用状态 5c96feb
放宽了 bootstrap & configure 的检查，特别是当 /etc/redhat-release 不存在的时候。
升级到 Grafana 10，修复了 Grafana 9.x CVE-2023-1410
在 CMDB pglog 模式中添加了 PG 14 - 16 的 command tags 与错误代码。

API变化

新增1个变量

INFRA.NGINX.nginx_exporter_enabled: 现在用户可以通过设置这个参数来禁用 nginx_exporter 。

默认值变化:

repo_modules: node,pgsql,infra : redis 现在由 pigsty-el 仓库提供，不再需要 redis 模块。
repo_upstream:
- 新增 pigsty-el: 与具体EL版本无关的RPM: 例如 grafana, minio, pg_exporter, 等等……
- 新增 pigsty-misc: 与具体EL版本有关的RPM: 例如 redis, prometheus 全家桶，等等……
- 移除 citus: 现在 PGDG 中有完整的 EL7 - EL9 citus 12 支持
- 移除 remi: redis 现在由 pigsty-el 仓库提供，不再需要 redis 模块。
repo_packages:
- ansible python3 python3-pip python3-requests python3-jmespath python3.11-jmespath dnf-utils modulemd-tools # el7: python36-requests python36-idna yum-utils
- grafana loki logcli promtail prometheus2 alertmanager karma pushgateway node_exporter blackbox_exporter nginx_exporter redis_exporter
- redis etcd minio mcli haproxy vip-manager pg_exporter nginx createrepo_c sshpass chrony dnsmasq docker-ce docker-compose-plugin flamegraph
- lz4 unzip bzip2 zlib yum pv jq git ncdu make patch bash lsof wget uuid tuned perf nvme-cli numactl grubby sysstat iotop htop rsync tcpdump
- netcat socat ftp lrzsz net-tools ipvsadm bind-utils telnet audit ca-certificates openssl openssh-clients readline vim-minimal
- postgresql13* wal2json_13* pg_repack_13* passwordcheck_cracklib_13* postgresql12* wal2json_12* pg_repack_12* passwordcheck_cracklib_12* timescaledb-tools
- postgresql15 postgresql15* citus_15* pglogical_15* wal2json_15* pg_repack_15* pgvector_15* timescaledb-2-postgresql-15* postgis33_15* passwordcheck_cracklib_15* pg_cron_15*
- postgresql14 postgresql14* citus_14* pglogical_14* wal2json_14* pg_repack_14* pgvector_14* timescaledb-2-postgresql-14* postgis33_14* passwordcheck_cracklib_14* pg_cron_14*
- postgresql16* wal2json_16* pgvector_16* pg_squeeze_16* postgis34_16* passwordcheck_cracklib_16* pg_cron_16*
- patroni patroni-etcd pgbouncer pgbadger pgbackrest pgloader pg_activity pg_partman_15 pg_permissions_15 pgaudit17_15 pgexportdoc_15 pgimportdoc_15 pg_statement_rollback_15*
- orafce_15* mysqlcompat_15 mongo_fdw_15* tds_fdw_15* mysql_fdw_15 hdfs_fdw_15 sqlite_fdw_15 pgbouncer_fdw_15 multicorn2_15* powa_15* pg_stat_kcache_15* pg_stat_monitor_15* pg_qualstats_15 pg_track_settings_15 pg_wait_sampling_15 system_stats_15
- plprofiler_15* plproxy_15 plsh_15* pldebugger_15 plpgsql_check_15* pgtt_15 pgq_15* pgsql_tweaks_15 count_distinct_15 hypopg_15 timestamp9_15* semver_15* prefix_15* rum_15 geoip_15 periods_15 ip4r_15 tdigest_15 hll_15 pgmp_15 extra_window_functions_15 topn_15
- pg_background_15 e-maj_15 pg_catcheck_15 pg_prioritize_15 pgcopydb_15 pg_filedump_15 pgcryptokey_15 logerrors_15 pg_top_15 pg_comparator_15 pg_ivm_15* pgsodium_15* pgfincore_15* ddlx_15 credcheck_15 safeupdate_15 pg_squeeze_15* pg_fkpart_15 pg_jobmon_15
repo_url_packages:
- https://get.pigsty.cc/rpm/pev.html
- https://get.pigsty.cc/rpm/chart.tgz
node_default_packages:
- lz4,unzip,bzip2,zlib,yum,pv,jq,git,ncdu,make,patch,bash,lsof,wget,uuid,tuned,nvme-cli,numactl,grubby,sysstat,iotop,htop,rsync,tcpdump
- netcat,socat,ftp,lrzsz,net-tools,ipvsadm,bind-utils,telnet,audit,ca-certificates,openssl,readline,vim-minimal,node_exporter,etcd,haproxy,python3,python3-pip
infra_packages
- grafana,loki,logcli,promtail,prometheus2,alertmanager,karma,pushgateway
- node_exporter,blackbox_exporter,nginx_exporter,redis_exporter,pg_exporter
- nginx,dnsmasq,ansible,postgresql15,redis,mcli,python3-requests
PGSERVICE in .pigsty has been removed and replaced by PGDATABASE=postgres, so that users need only an IP address to reach a specific instance from the admin node.

目录结构变化:

bin/dns and bin/ssh have been moved to the vagrant/ directory.

MD5 (pigsty-pkg-v2.2.0.el7.x86_64.tgz) = 5fb6a449a234e36c0d895a35c76add3c
MD5 (pigsty-pkg-v2.2.0.el8.x86_64.tgz) = c7211730998d3b32671234e91f529fd0
MD5 (pigsty-pkg-v2.2.0.el9.x86_64.tgz) = 385432fe86ee0f8cbccbbc9454472fdd

v2.1.0

相关文章：Pigsty v2.1 发布：向量扩展 / PG12-16 支持

发布注记：https://github.com/pgsty/pigsty/releases/tag/v2.1.0

Highlight

PostgreSQL 16 beta 支持, 以及 12 ~ 15 的支持.
为 PG 12 - 15 新增了 PGVector 扩展支持，用于存储 AI 嵌入。
为 Grafana 添加了额外6个默认的扩展面板/数据源插件。
添加 bin/profile 脚本用于执行远程 Profiling ，生成火焰图。
添加 bin/validate 用于校验 pigsty.yml 配置文件合法性。
添加 bin/repo-add 用于快速向节点添加软件源定义。
PostgreSQL 16 可观测性：添加了 pg_stat_io 支持与相关监控面板

软件升级

PostgreSQL 15.3 , 14.8, 13.11, 12.15, 11.20, and 16 beta1
pgBackRest 2.46 / pgbouncer 1.19
Redis 7.0.11
Grafana v9.5.3
Loki / Promtail / Logcli 2.8.2
Prometheus 2.44
TimescaleDB 2.11.0
minio-20230518000536 / mcli-20230518165900
Bytebase v2.2.0

改进增强

当添加本地用户的公钥时，所有的 id*.pub 都会被添加到远程机器上（例如椭圆曲线算法生成的密钥文件）

v2.0.2

https://github.com/pgsty/pigsty/releases/tag/v2.0.2

亮点

使用开箱即用的 pgvector 存储 AI Embedding、索引、检索向量。

新扩展 pgvector
MinIO CVE-2023-28432 问题修复

变更

新扩展插件 pgvector 用于存储 AI 嵌入，并执行向量相似度搜索。
修复 MinIO CVE-2023-28432 ，使用 20230324 新提供的 policy API.
为 DNSMASQ systemd 服务添加动态重载命令
更新 PEV 版本至 v1.8
更新 grafana 版本至 v9.4.7
更新 MinIO 与 MCLI 版本至 20230324
更新 bytebase 版本至 v1.15.0
更新监控面板并修复死链接
更新了阿里云 Terraform 模板，默认使用 RockyLinux 9
使用 Grafana v9.4 的 Provisioning API
为众多管理任务添加了 asciinema 视频
修复了 EL8 PostgreSQL 的破损依赖：移除 anonymizer_15 faker_15 pgloader

MD5 (pigsty-pkg-v2.0.2.el7.x86_64.tgz) = d46440a115d741386d29d6de646acfe2
MD5 (pigsty-pkg-v2.0.2.el8.x86_64.tgz) = 5fa268b5545ac96b40c444210157e1e1
MD5 (pigsty-pkg-v2.0.2.el9.x86_64.tgz) = c8b113d57c769ee86a22579fc98e8345

v2.0.1

https://github.com/pgsty/pigsty/releases/tag/v2.0.1

安全性改进，与对 v2.0.0 的 BUG 修复。

改进

更换猪头 logo 以符合 PostgreSQL 商标政策。
将 grafana 版本升级至 v9.4，界面更佳且修复了 bug。
将 patroni 版本升级至 v3.0.1，其中包含了一些 bug 修复。
修改：将 grafana systemd 服务文件回滚到 rpm 默认的版本。
使用缓慢的 copy 代替 rsync 来复制 grafana 仪表板，更加可靠。
增强：bootstrap 执行后会添加回默认 repo 文件。
添加 asciinema 视频，用于各种管理任务。
安全增强模式：限制监控用户权限。
新的配置模板：dual.yml，用于双节点部署。
在 crit.yml 模板中启用 log_connections 和 log_disconnections。
在 crit.yml 模板中的 pg_libs 中启用 $lib/passwordcheck。
明确授予 pg_monitor 角色监视视图权限。
从 dbuser_monitor 中移除默认的 dbrole_readonly 以限制监控用户的权限
现在 patroni 监听在 {{ inventory_hostname }} 而不是 0.0.0.0
现在你可以使用 pg_listen 控制 postgres/pgbouncer 监听的地址
现在你可以在 pg_listen 中使用 ${ip}, ${lo}, ${vip} 占位符
将 Aliyun terraform 镜像从 centos 7.9 提升到 rocky Linux 9
将 bytebase 版本升级到 v1.14.0
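
The pg_listen entries above mention three placeholders: ${ip}, ${lo}, and ${vip}. The Python sketch below shows one way such placeholders could be expanded. It rests on several assumptions that are not stated in these notes: that pg_listen is a comma-separated string, that ${lo} stands for 127.0.0.1, and that ${vip} is skipped when no VIP is defined. The function name is hypothetical.

```python
def expand_pg_listen(pg_listen, ip, vip=None, lo="127.0.0.1"):
    """Expand ${ip}, ${lo} and ${vip} in a comma-separated listen string."""
    mapping = {"${ip}": ip, "${lo}": lo, "${vip}": vip}
    addresses = []
    for item in pg_listen.split(","):
        item = item.strip()
        if item in mapping:
            item = mapping[item]
            if item is None:  # placeholder used but no value available
                continue
        if item:
            addresses.append(item)
    return ",".join(addresses)


print(expand_pg_listen("${ip},${vip},${lo}", ip="10.10.10.11", vip="10.10.10.2"))
```

Literal addresses such as 0.0.0.0 pass through untouched, so existing configurations keep working in this sketch.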

BUG修复

为 alertmanager 添加缺失的 advertise 地址。
解决使用 bin/pgsql-user 创建数据库用户时，pg_mode 变量缺失问题。
在 redis.yml 中为 Redis 集群加入任务添加 -a password 选项。
Added the missing default value to the remove infra data task in infra-rm.yml.
修复 prometheus 监控对象定义文件的属主为 prometheus 用户。
使用管理员用户而不是 root 去删除 DCS 中的元数据。
修复了由 grafana 9.4 bug 导致的问题：Meta数据源缺失。

注意事项

EL8 pgdg 上游官方源处于依赖破损状态，请小心使用。涉及到的软件包: postgis33_15, pgloader, postgresql_anonymizer_15*, postgresql_faker_15

如何升级？

cd ~/pigsty; tar -zcf /tmp/files.tgz files; rm -rf ~/pigsty    # backup files dir and remove
cd ~; bash -c "$(curl -fsSL https://get.pigsty.cc/latest)"      # get latest pigsty source
cd ~/pigsty; rm -rf files; tar -xf /tmp/files.tgz -C ~/pigsty  # restore files dir

Checksums

MD5 (pigsty-pkg-v2.0.1.el7.x86_64.tgz) = 5cfbe98fd9706b9e0f15c1065971b3f6
MD5 (pigsty-pkg-v2.0.1.el8.x86_64.tgz) = c34aa460925ae7548866bf51b8b8759c
MD5 (pigsty-pkg-v2.0.1.el9.x86_64.tgz) = 055057cebd93c473a67fb63bcde22d33

特别感谢 @cocoonkid 提供的反馈。

v2.0.0

Pigsty v2.0.0 正式发布！

Starting with v2.0.0, PIGSTY is the acronym of "PostgreSQL In Great STYle", that is, PostgreSQL at its peak ("全盛状态的PostgreSQL").

curl -fsSL https://get.pigsty.cc/latest | bash

Download directly from GitHub Release

bash -c "$(curl -fsSL https://raw.githubusercontent.com/pgsty/pigsty/master/bin/get)"

# or download tarball directly with curl (EL9)
curl -L https://github.com/pgsty/pigsty/releases/download/v2.0.0/pigsty-v2.0.0.tgz -o ~/pigsty.tgz
curl -L https://github.com/pgsty/pigsty/releases/download/v2.0.0/pigsty-pkg-v2.0.0.el9.x86_64.tgz  -o /tmp/pkg.tgz
# EL7: https://github.com/pgsty/pigsty/releases/download/v2.0.0/pigsty-pkg-v2.0.0.el7.x86_64.tgz
# EL8: https://github.com/pgsty/pigsty/releases/download/v2.0.0/pigsty-pkg-v2.0.0.el8.x86_64.tgz

亮点

完美整合 PostgreSQL 15, PostGIS 3.3, Citus 11.2, TimescaleDB 2.10，分布式地理时序超融合数据库。
OS兼容性大幅增强：支持 EL7，8，9，以及 RHEL, CentOS, Rocky, OracleLinux, AlmaLinux等兼容发行版。
安全性改进：自签名CA，全局网络流量SSL加密，密码scram-sha-256认证，备份采用AES加密，重制的HBA规则系统。
Patroni升级至3.0，提供原生的高可用 Citus 分布式集群支持，默认启用FailSafe模式，无惧DCS故障致全局主库瘫痪。
提供基于 pgBackRest 的开箱即用的时间点恢复 PITR 支持，默认支持本地文件系统与专用MinIO/S3集群备份。
新模块 ETCD，可独立部署，简易扩缩容，自带监控高可用，彻底取代 Consul 作为高可用 PG 的 DCS。
新模块 MINIO，可独立部署，支持多盘多节点部署，用作S3本地替代，亦用于集中式 PostgreSQL 备份仓库。
大幅精简配置文件参数，无需默认值即可使用；模板自动根据机器规格调整主机与PG参数，HBA/服务的定义更简洁泛用。
受 Grafana 与 MinIO 影响，软件协议由 Apache License 2.0 变更为 AGPL 3.0

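Compatibility

Supports the three major versions EL7, EL8, and EL9, with an offline package for each; the default development and testing environment moves from EL7 to EL9.
Supports more EL-compatible Linux distributions: RHEL, CentOS, RockyLinux, AlmaLinux, OracleLinux, and others.
The naming rules for source packages and offline packages have changed: the version, OS major version, and architecture now all appear in the package name.
PGSQL: PostgreSQL 15.2, PostGIS 3.3, Citus 11.2, and TimescaleDB 2.10 can now be used together.
PGSQL: Patroni is upgraded to 3.0 as the high-availability component of PGSQL.
- ETCD is the default DCS, replacing Consul and removing the Consul agent as a point of failure.
- vip-manager is upgraded to 2.1 and uses the ETCDv3 API, so the ETCDv2 API is dropped entirely; the same applies to Patroni.
- Native support for highly available Citus distributed clusters, using Citus 11.2 with all features fully open source.
- FailSafe mode is enabled by default, so a DCS failure no longer brings down every primary.
PGSQL: pgBackRest v2.44 is introduced to provide out-of-the-box PostgreSQL point-in-time recovery (PITR).
- By default, the backup repository is created in the backup directory on the primary, with a rolling two-day recovery window.
- The default alternative repository is a dedicated MinIO/S3 cluster with a rolling two-week recovery window; local use requires enabling the MinIO module.
ETCD is now an independently deployed module, with complete scale-out/scale-in procedures and monitoring.
MINIO is now an independently deployed module that supports multi-disk multi-node deployment, serves as a local S3 replacement, and can also be used as a centralized backup repository.
The NODE module now includes the haproxy, docker, node_exporter, and promtail components.
- chronyd replaces ntpd as the default NTP service on all nodes.
- HAPROXY is now part of NODE rather than exclusive to PGSQL, and can expose services in NodePort fashion.
- The PGSQL module can now use a dedicated, centralized HAPROXY cluster to expose services.
The INFRA module now includes dnsmasq, nginx, prometheus, grafana, loki, and other components.
- The DNSMASQ server in the INFRA module is enabled by default and added as one of the default DNS servers on all nodes.
- Added blackbox_exporter for host PING probes and pushgateway for batch job metrics.
- loki and promtail now use the default Grafana packages, along with the official Grafana Echarts panel plugin.
- Monitoring support for the new observability points in PostgreSQL 15, plus Patroni monitoring.
Software upgrades
- PostgreSQL 15.2 / PostGIS 3.3 / TimescaleDB 2.10 / Citus 11.2
- Patroni 3.0 / Pgbouncer 1.18 / pgBackRest 2.44 / vip-manager 2.1
- HAProxy 2.7 / Etcd 3.5 / MinIO 20230131022419 / mcli 20230128202938
- Prometheus 2.42 / Grafana 9.3 / Loki & Promtail 2.7 / Node Exporter 1.5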

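Security

A complete local self-signed CA, pigsty-ca, is enabled to issue the certificates used by intranet components.
Creating users or changing passwords no longer leaves traces in log files.
Nginx enables SSL by default (for HTTPS, trust pigsty-ca on your system, or type thisisunsafe in Chrome).
ETCD fully enables SSL encryption for client and server peer communication.
PostgreSQL adds SSL support, enabled by default; administrative connections all use SSL by default.
Pgbouncer adds SSL support, disabled by default for performance reasons.
Patroni adds SSL support, and by default its management API is reachable only from localhost and admin nodes, with password authentication.
The default PostgreSQL password authentication method changes from md5 to scram-sha-256.
Pgbouncer adds auth query support, so connection pool users can be managed dynamically.
pgBackRest encrypts backup data with AES-256-CBC by default when a remote centralized backup repository is used.
A high-security configuration template is provided: it enforces SSL everywhere and requires an admin certificate to log in.
All default HBA rules are now defined explicitly in the configuration file.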

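Maintainability

Existing configuration templates now tune themselves to the machine specs (CPU / memory / storage).
Log directories for Postgres / Pgbouncer / Patroni / pgBackRest can now be configured dynamically; the default is /pg/log/<type>/
The IP address placeholder 10.10.10.10 is replaced by a dedicated variable, ${admin_ip}, which can be referenced in multiple places and makes it easier to switch to a standby admin node.
You can specify region to use upstream mirrors in different regions and speed up package downloads.
Finer-grained upstream repo definitions are now allowed: you can use different upstreams by EL version, architecture, and region.
Terraform templates for Alibaba Cloud and AWS China are provided to launch the required EC2 virtual machines in one step.
Vagrant sandbox templates of several sizes are provided: meta, full, el7/8/9, minio, build, citus
New dedicated playbook pgsql-monitor.yml for monitoring existing Postgres instances or RDS.
New dedicated playbook pgsql-migration.yml for migrating existing instances seamlessly to Pigsty-managed clusters via logical replication.
A set of dedicated shell utilities wraps common operational tasks.
All Ansible roles are reworked to be simpler, more readable, and easier to maintain, and can be used without default parameters.
Additional Pgbouncer parameters can be defined at the business database / user level.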
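
The ${admin_ip} placeholder described above can be pictured as plain template substitution. The following is a minimal Python sketch, not Pigsty's actual implementation; the helper name expand_admin_ip and the sample endpoint values are hypothetical and chosen only for illustration.

```python
from string import Template


def expand_admin_ip(value: str, admin_ip: str) -> str:
    """Replace every ${admin_ip} placeholder in a config string."""
    # safe_substitute leaves unrelated ${...} placeholders untouched
    # instead of raising KeyError.
    return Template(value).safe_substitute(admin_ip=admin_ip)


# Hypothetical config values that reference the admin node.
endpoints = {
    "repo": "http://${admin_ip}:80",
    "loki": "http://${admin_ip}:3100",
}

# Switching to a standby admin node only requires a different admin_ip.
expanded = {key: expand_admin_ip(val, "10.10.10.10") for key, val in endpoints.items()}
print(expanded)
```

Because every reference goes through one variable, pointing the deployment at a standby admin node is a one-value change rather than a search-and-replace over the config file.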

API changes

Pigsty v2.0 introduces many changes: 64 parameters are added, 13 removed, and 17 renamed.

New parameters

INFRA.META.admin_ip : primary meta node IP address
INFRA.META.region : upstream mirror region: default|china|europe
INFRA.META.os_version : enterprise Linux release version: 7,8,9
INFRA.CA.ca_cn : CA common name, pigsty-ca by default
INFRA.CA.cert_validity : certificate validity, 20 years by default
INFRA.REPO.repo_enabled : build a local yum repo on the infra node?
INFRA.REPO.repo_upstream : list of upstream yum repo definitions
INFRA.REPO.repo_home : home directory of the local yum repo, usually the same as nginx_home '/www'
INFRA.NGINX.nginx_ssl_port : https listen port
INFRA.NGINX.nginx_ssl_enabled : enable nginx https?
INFRA.PROMETHEUS.alertmanager_endpoint : alertmanager endpoint, in (ip|domain):port format
NODE.NODE_TUNE.node_hugepage_ratio : memory hugepage ratio, 0 (disabled) by default
NODE.HAPROXY.haproxy_service : list of haproxy services to expose
PGSQL.PG_ID.pg_mode : pgsql cluster mode: pgsql,citus,gpsql
PGSQL.PG_BUSINESS.pg_dbsu_password : dbsu password; the default empty string means no dbsu password
PGSQL.PG_INSTALL.pg_log_dir : postgres log directory, /pg/data/log by default
PGSQL.PG_BOOTSTRAP.pg_storage_type : SSD|HDD, SSD by default
PGSQL.PG_BOOTSTRAP.patroni_log_dir : patroni log directory, /pg/log by default
PGSQL.PG_BOOTSTRAP.patroni_ssl_enabled : secure patroni REST API communication with SSL?
PGSQL.PG_BOOTSTRAP.patroni_username : patroni REST API username
PGSQL.PG_BOOTSTRAP.patroni_password : patroni REST API password (important: change this password)
PGSQL.PG_BOOTSTRAP.patroni_citus_db : citus database managed by patroni, postgres by default
PGSQL.PG_BOOTSTRAP.pg_max_conn : postgres max connections; auto uses the recommended value
PGSQL.PG_BOOTSTRAP.pg_shmem_ratio : postgres shared memory ratio, 0.25 by default, range 0.1~0.4
PGSQL.PG_BOOTSTRAP.pg_rto : recovery time objective, the TTL for failover, 30s by default
PGSQL.PG_BOOTSTRAP.pg_rpo : recovery point objective, at most 1MB of data loss by default
PGSQL.PG_BOOTSTRAP.pg_pwd_enc : password encryption algorithm: md5|scram-sha-256
PGSQL.PG_BOOTSTRAP.pgbouncer_log_dir : pgbouncer log directory, /var/log/pgbouncer by default
PGSQL.PG_BOOTSTRAP.pgbouncer_auth_query : if enabled, query the pg_authid table to retrieve biz users instead of populating a user list
PGSQL.PG_BOOTSTRAP.pgbouncer_sslmode : SSL for pgbouncer clients: disable|allow|prefer|require|verify-ca|verify-full
PGSQL.PG_BOOTSTRAP.pg_service_provider : name of a dedicated haproxy node group, or an empty string (default) for the local node
PGSQL.PG_BOOTSTRAP.pg_default_service_dest : default service destination when svc.dest='default'
PGSQL.PG_BACKUP.pgbackrest_enabled : enable pgbackrest?
PGSQL.PG_BACKUP.pgbackrest_clean : remove pgbackrest data during initialization?
PGSQL.PG_BACKUP.pgbackrest_log_dir : pgbackrest log directory, /pg/log by default
PGSQL.PG_BACKUP.pgbackrest_method : pgbackrest repo method, local or minio
PGSQL.PG_BACKUP.pgbackrest_repo : pgbackrest repo configuration
PGSQL.PG_DNS.pg_dns_suffix : pgsql dns suffix, empty string by default
PGSQL.PG_DNS.pg_dns_target : auto, primary, vip, none, or an ad hoc ip
ETCD.etcd_seq : etcd instance identifier, required
ETCD.etcd_cluster : etcd cluster and group name, etcd by default
ETCD.etcd_safeguard : prevent purging a running etcd instance?
ETCD.etcd_clean : purge existing etcd during initialization?
ETCD.etcd_data : etcd data directory, /data/etcd by default
ETCD.etcd_port : etcd client port, 2379 by default
ETCD.etcd_peer_port : etcd peer port, 2380 by default
ETCD.etcd_init : etcd initial cluster state, new or existing
ETCD.etcd_election_timeout : etcd election timeout, 1000ms by default
ETCD.etcd_heartbeat_interval : etcd heartbeat interval, 100ms by default
MINIO.minio_seq : minio instance identifier, required
MINIO.minio_cluster : minio cluster name, minio by default
MINIO.minio_clean : purge minio during initialization? false by default
MINIO.minio_user : minio OS user, minio by default
MINIO.minio_node : minio node name pattern
MINIO.minio_data : minio data directory, use {x...y} to specify multiple drives
MINIO.minio_domain : minio external domain name, sss.pigsty by default
MINIO.minio_port : minio service port, 9000 by default
MINIO.minio_admin_port : minio console port, 9001 by default
MINIO.minio_access_key : root access key, minioadmin by default
MINIO.minio_secret_key : root secret key, minioadmin by default
MINIO.minio_extra_vars : extra environment variables for the minio server
MINIO.minio_alias : alias for the local minio deployment
MINIO.minio_buckets : list of minio buckets to create
MINIO.minio_users : list of minio users to create

Removed parameters

INFRA.CA.ca_homedir : CA home directory, now fixed as /etc/pki/
INFRA.CA.ca_cert : CA certificate file name, now fixed as ca.crt
INFRA.CA.ca_key : CA key file name, now fixed as ca.key
INFRA.REPO.repo_upstreams : replaced by repo_upstream
PGSQL.PG_INSTALL.pgdg_repo : now handled by the node playbooks
PGSQL.PG_INSTALL.pg_add_repo : now handled by the node playbooks
PGSQL.PG_IDENTITY.pg_backup : unused, and conflicts with a section name
PGSQL.PG_IDENTITY.pg_preflight_skip : no longer used, replaced by pg_id
DCS.dcs_name : removed due to the use of etcd
DCS.dcs_servers : replaced by the ad hoc group etcd
DCS.dcs_registry : removed due to the use of etcd
DCS.dcs_safeguard : replaced by etcd_safeguard
DCS.dcs_clean : replaced by etcd_clean

Renamed parameters

nginx_upstream -> infra_portal
repo_address -> repo_endpoint
pg_hostname -> node_id_from_pg
pg_sindex -> pg_group
pg_services -> pg_default_services
pg_services_extra -> pg_services
pg_hba_rules_extra -> pg_hba_rules
pg_hba_rules -> pg_default_hba_rules
pgbouncer_hba_rules_extra -> pgb_hba_rules
pgbouncer_hba_rules -> pgb_default_hba_rules
vip_mode -> pg_vip_enabled
vip_address -> pg_vip_address
vip_interface -> pg_vip_interface
node_packages_default -> node_default_packages
node_packages_meta -> infra_packages
node_packages_meta_pip -> infra_packages_pip
node_data_dir -> node_data
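
When an old configuration is migrated by script, the renames above need some care: pg_services becomes pg_default_services while pg_services_extra becomes pg_services, so applying the renames one after another could rename a key twice. The following Python sketch applies the whole mapping in a single pass. The mapping itself comes from the list above; the function name migrate_config is hypothetical, and values are copied unchanged (whether any value needs converting is outside this sketch).

```python
# Old name -> new name, taken from the renamed-parameter list above.
RENAMED = {
    "nginx_upstream": "infra_portal",
    "repo_address": "repo_endpoint",
    "pg_hostname": "node_id_from_pg",
    "pg_sindex": "pg_group",
    "pg_services": "pg_default_services",
    "pg_services_extra": "pg_services",
    "pg_hba_rules_extra": "pg_hba_rules",
    "pg_hba_rules": "pg_default_hba_rules",
    "pgbouncer_hba_rules_extra": "pgb_hba_rules",
    "pgbouncer_hba_rules": "pgb_default_hba_rules",
    "vip_mode": "pg_vip_enabled",
    "vip_address": "pg_vip_address",
    "vip_interface": "pg_vip_interface",
    "node_packages_default": "node_default_packages",
    "node_packages_meta": "infra_packages",
    "node_packages_meta_pip": "infra_packages_pip",
    "node_data_dir": "node_data",
}


def migrate_config(old: dict) -> dict:
    """Return a copy of `old` with pre-2.0 keys renamed to their v2.0 names."""
    new = {}
    for key, value in old.items():
        # Each key is looked up exactly once, so chained renames cannot cascade.
        target = RENAMED.get(key, key)
        if target in new:
            raise ValueError(f"duplicate key after rename: {target}")
        new[target] = value
    return new


migrated = migrate_config({
    "pg_cluster": "pg-test",          # not in the mapping, kept as-is
    "pg_services": ["primary"],       # becomes pg_default_services
    "pg_services_extra": ["custom"],  # becomes pg_services
})
print(migrated)
```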

Checksums

MD5 (pigsty-pkg-v2.0.0-rc1.el7.x86_64.tgz) = af4b5db9dc38c860de609956a8f1f0d3
MD5 (pigsty-pkg-v2.0.0-rc1.el8.x86_64.tgz) = 5b7152e142df3e3cbc06de30bd70e433
MD5 (pigsty-pkg-v2.0.0-rc1.el9.x86_64.tgz) = 1362e2a5680fc1a3a014cc4f304100bd
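
A downloaded package can be checked against the digests listed above. The following is a minimal Python sketch using only the standard library; the helper names md5sum and verify are illustrative, and the file path passed in is whatever location you downloaded the package to.

```python
import hashlib


def md5sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str) -> bool:
    """Compare a file's MD5 digest with the published value, ignoring case."""
    return md5sum(path) == expected.strip().lower()
```

For example, verify("pigsty-pkg-v2.0.0-rc1.el9.x86_64.tgz", "1362e2a5680fc1a3a014cc4f304100bd") should return True for an intact copy of the EL9 package named in the list above. MD5 detects accidental corruption only; it is not a defense against deliberate tampering.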

Special thanks to Italian user @alemacci for contributions to SSL encryption, backups, multi-OS distribution support, and adaptive parameter templates!

v1.5.1

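Highlights

Important: fixes the issue in PG 14.0-14.3 where CREATE INDEX|REINDEX CONCURRENTLY could corrupt index data.

Pigsty v1.5.1 upgrades the default PostgreSQL version to 14.4. Updating as soon as possible is strongly recommended.

Software upgrades

postgres upgraded to 14.4
haproxy upgraded to 2.6.0
grafana upgraded to 9.0.0
prometheus upgraded to 2.36.0
patroni upgraded to 2.1.4

Bug fixes

Fixed a typo in pgsql-migration.yml
Removed the PID option from the HAProxy config file
Removed i686 packages from the default packages
All systemd Redis services are enabled by default
All systemd Patroni services are enabled by default

API changes

grafana_database and grafana_pgurl are marked as deprecated APIs and will be removed in a later release

New Apps

wiki.js : build a local wiki with Postgres
FerretDB : provide a MongoDB API on top of Postgres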

v1.5.0

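Highlights

Complete Docker support: enabled by default on admin nodes, with many ready-to-use software templates: bytebase, pgadmin, pgweb, postgrest, minio, and more.
Infrastructure self-monitoring: Nginx, ETCD, Consul, Prometheus, Grafana, and Loki monitor themselves.
CMDB upgrade: compatibility improvements, support for Redis cluster / Greenplum cluster metadata, and config file visualization.
Service discovery improvements: Consul can be used to discover all monitoring targets automatically and add them to Prometheus.
Better cold backup support: default scheduled backup jobs, the pg_probackup backup tool, and one-step creation of delayed replicas.
ETCD can now serve as the DCS for PostgreSQL/Patroni, as an alternative to Consul.
Redis playbook/role improvements: a single Redis instance, rather than a whole Redis node, can now be initialized or removed.

Detailed changelog

Dashboards

CMDB Overview: visualizes the Pigsty CMDB inventory.
DCS Overview: shows monitoring metrics for Consul and ETCD clusters.
Nginx Overview: shows Pigsty web access metrics and access logs.
Grafana Overview: Grafana self-monitoring
Prometheus Overview: Prometheus self-monitoring
The INFRA Dashboard is reworked to reflect the overall state of the infrastructure

Monitoring architecture

Consul can now be used for service discovery (when all services are registered to Consul)
All Infra components now enable self-monitoring and are registered to Prometheus and Consul via the infra_register role.
The metrics collector pg_exporter is updated to v0.5.0, adding the scale and default features, which let you set a multiplication factor and a default value for a metric.
Time-related metrics in pg_bgwriter, pg_wal, pg_query, pg_db, and pgbouncer_stat are scaled uniformly from their default milliseconds or microseconds to seconds.
Counter metrics in pg_table now have a default value of 0, replacing the former NaN.
The pg_class collector is removed by default; its metrics are added to the pg_table and pg_index collectors.
The pg_table_size collector is now enabled by default, with a default cache time of 300 seconds.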
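
The effect of the scale and default options mentioned above can be illustrated with a small Python sketch. This is not pg_exporter's code, and the function name normalize is hypothetical; in particular, returning the default unscaled is an assumption of this sketch, not documented behavior.

```python
import math


def normalize(raw, scale=None, default=None):
    """Apply an optional default for missing values, then an optional scale factor."""
    missing = raw is None or (isinstance(raw, float) and math.isnan(raw))
    if missing:
        # Assumption of this sketch: the default is returned as-is, not scaled.
        return float("nan") if default is None else default
    return raw * scale if scale is not None else raw


# A duration reported in milliseconds, exposed in seconds via scale=1e-3.
seconds = normalize(1500, scale=1e-3)

# A counter with no value yet, exposed as 0 instead of NaN via default=0.
counter = normalize(None, default=0)
```

With a scale of 1e-3 a millisecond value is exposed in seconds, and with a default of 0 an absent counter no longer shows up as NaN, which matches the two changes described for the time metrics and the pg_table counters.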

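Deployment

New optional package docker.tgz, with images for common applications: Pgadmin, Pgweb, Postgrest, ByteBase, Kong, Minio, and others.
New role ETCD, which deploys the ETCD service automatically on the nodes specified by DCS Servers and adds it to monitoring.
pg_dcs_type selects the DCS used for PG high availability: Consul (default) or ETCD (alternative).
The node_crontab parameter configures scheduled jobs for nodes, such as database backups, VACUUM, and statistics collection.
New pg_checksum option: when enabled, the database cluster turns on data checksums (previously only the crit template enabled them by default).
New pg_delay option: when the instance is a Standby Cluster Leader, this parameter can be used to configure a delayed replica.
New package pg_probackup; the default role replicator is now granted the privileges required by backup-related functions.
Redis deployment is now split into two parts, Redis nodes and Redis instances; the redis_port parameter targets one specific instance precisely.
Loki and Promtail are now installed from RPM packages built with frpm.
The DCS3 config template now uses a 3-node pg-meta cluster and a single-node delayed replica.

Software upgrades

PostgreSQL upgraded to 14.3
Redis upgraded to 6.2.7
PG Exporter upgraded to 0.5.0
Consul upgraded to 1.12.0
vip-manager upgraded to v1.0.2
Grafana upgraded to v8.5.2
Loki & Promtail upgraded to v2.5.0, packaged with frpm.

Bug fixes

Fixed the default config file names of Loki and Promtail
Fixed environment variables in Loki and Promtail not being expanded correctly
The English documentation was fully translated and revised; the JS resources the docs depend on are now served locally, with no Internet access required.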

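API changes

New parameters

node_data_dir : main data mount path, created if it does not exist.
node_crontab_overwrite : overwrite /etc/crontab rather than append to it.
node_crontab: node crontab content to append or overwrite.
nameserver_enabled: enable nameserver on this infra node?
prometheus_enabled: enable prometheus on this infra node?
grafana_enabled: enable grafana on this infra node?
loki_enabled: enable loki on this infra node?
docker_enable: enable docker on this infra node?
consul_enable: enable consul server/agent?
etcd_enable: enable etcd server/client?
pg_checksum: enable data checksums for the pg cluster?
pg_delay: apply delay for replication replay on the standby cluster leader.

Reworked parameters

*_clean is now a boolean parameter, used to purge existing instances during initialization.

*_safeguard is also a boolean parameter, used to avoid purging running instances when any playbook is executed.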

pg_exists_action -> pg_clean
pg_disable_purge -> pg_safeguard
dcs_exists_action -> dcs_clean
dcs_disable_purge -> dcs_safeguard
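
One way to read the interaction between the two booleans is sketched below in Python. The function name plan_init and the returned labels are hypothetical, and the behavior when an instance exists with both flags off (stopping without touching it) is an assumption of this sketch rather than something stated above.

```python
def plan_init(instance_exists: bool, clean: bool, safeguard: bool) -> str:
    """Decide what an init playbook would do, given *_clean and *_safeguard."""
    if not instance_exists:
        return "init"             # nothing to protect or purge
    if safeguard:
        return "abort"            # safeguard wins: never purge a running instance
    if clean:
        return "purge-then-init"  # clean allows wiping the existing instance
    return "abort"                # assumption: leave the existing instance alone


# All combinations for an existing instance, as (clean, safeguard) -> decision.
decisions = {
    (clean, safeguard): plan_init(True, clean, safeguard)
    for clean in (False, True)
    for safeguard in (False, True)
}
print(decisions)
```

The point of the sketch is the precedence: a safeguard setting of true overrides a clean setting of true, so an accidental playbook run cannot purge a protected instance.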

Renamed parameters

node_ntp_config -> node_ntp_enabled
node_admin_setup -> node_admin_enabled
node_admin_pks -> node_admin_pk_list
node_dns_hosts -> node_etc_hosts_default
node_dns_hosts_extra -> node_etc_hosts
node_dns_server -> node_dns_method
node_local_repo_url -> node_repo_local_urls
node_packages -> node_packages_default
node_extra_packages -> node_packages
node_packages_meta -> node_packages_meta
node_meta_pip_install -> node_packages_meta_pip
node_sysctl_params -> node_tune_params
app_list -> nginx_indexes
grafana_plugin -> grafana_plugin_method
grafana_cache -> grafana_plugin_cache
grafana_plugins -> grafana_plugin_list
grafana_git_plugin_git -> grafana_plugin_git
haproxy_admin_auth_enabled -> haproxy_auth_enabled
pg_shared_libraries -> pg_libs
dcs_type -> pg_dcs_type

v1.4.1

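Routine bug fixes / Docker support / English documentation

Docker is now enabled on meta nodes by default. You can use it to launch a wide range of software.

English documentation is now available.

Bug fixes

Fixed promtail & loki config variable issues
Fixed grafana legacy alerts.
nameserver is disabled by default
Renamed pg-alias.sh for patroni shortcuts
Disabled exemplars queries for all dashboards
Fixed the loki data directory issue https://github.com/pgsty/pigsty/issues/100
Changed autovacuum_freeze_max_age from 100000000 to 1000000000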

v1.4.0

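Architecture

The system is decoupled into 4 major categories: INFRA, NODES, PGSQL, and REDIS, which makes pigsty clearer and easier to extend.
Single-node deployment = INFRA + NODES + PGSQL
Deploying a pgsql cluster = NODES + PGSQL
Deploying a redis cluster = NODES + REDIS
Deploying other databases = NODES + xxx (e.g. MONGO, KAFKA… TBD)

Accessibility

A CDN is provided for mainland China.
Use bash -c "$(curl -fsSL http://get.pigsty.cc/latest)" to get the latest source code.
Use the new download script to download and extract packages.

Monitoring enhancements

The monitoring system is divided into 5 major categories: INFRA, NODES, REDIS, PGSQL, and APP
Logging is enabled by default
- loki and promtail are now enabled by default, with a prebuilt loki-rpm.
Models and labels
- A hidden ds prometheus datasource variable is added to all dashboards, so you can simply select a new datasource instead of modifying Grafana datasources and dashboards.
- An ip label is added to all metrics and used as the join key between database metrics and node metrics.
INFRA monitoring
- Infra home dashboard: INFRA overview
- Added a logs dashboard: logs instance
- PGLOG analysis and PGLOG session are now treated as an example Pigsty APP.
NODES monitoring application
- If you do not care about databases at all, Pigsty can now be used on its own as host monitoring software!
- Includes 4 core dashboards: node overview & node cluster & node instance & node alerts
- New identity variables for nodes: node_cluster and nodename
- The variable pg_hostname now means setting the hostname to the postgres instance name, to stay backward compatible
- The variable nodename_overwrite controls whether the node's hostname is overwritten with nodename
- The variable nodename_exchange writes nodenames into each other's /etc/hosts
- All node metric references are revised to join by ip
- Node monitoring targets are managed separately under /etc/prometheus/targets/nodes
PGSQL monitoring enhancements
- A completely new PGSQL cluster dashboard, simplified and focused on what matters in a cluster.
- The new PGSQL database dashboard monitors cluster-level objects, such as tables and queries across the whole cluster rather than on a single instance.
- The PGSQL alerts dashboard now focuses only on pgsql alerts.
- PGSQL Shard is added to PGSQL.
Redis monitoring enhancements
- Node monitoring is added to all redis dashboards.

MatrixDB support

MatrixDB (Greenplum 7) can be deployed via the pigsty-matrix.yml playbook
MatrixDB monitoring dashboard: PGSQL MatrixDB
Added example config: pigsty-mxdb.yml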

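Provisioning improvements

The pigsty workflow is now as follows:

 infra.yml ---> install pigsty on a single meta node
      |          then bring more nodes under pigsty's management
      |
 nodes.yml ---> prepare nodes for pigsty (node setup, dcs, node_exporter, promtail)
      |          then pick a playbook to deploy database clusters on those nodes
      |
      ^--> pgsql.yml   install postgres on prepared nodes
      ^--> redis.yml   install redis on prepared nodes

infra-demo.yml = 
           infra.yml -l meta     +
           nodes.yml -l pg-test  +
           pgsql.yml -l pg-test +
           infra-loki.yml + infra-jupyter.yml + infra-pgweb.yml

nodes.yml: sets up and prepares nodes for pigsty,
configuring node, node_exporter, and consul agent on each node
node-remove.yml deregisters nodes
pgsql.yml: now works only on prepared nodes
pgsql-remove now handles only postgres itself (dcs and node monitoring are handled by node.yml)
Added a set of new options to reuse the postgres role for greenplum/matrixdb
redis.yml: now works on prepared nodes
and redis-remove.yml now removes redis from nodes.
pgsql-matrix.yml now installs matrixdb (Greenplum 7) on prepared nodes.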

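Software upgrades

PostgreSQL 14.2
PostGIS 3.2
TimescaleDB 2.6
Patroni 2.1.3 (Prometheus metrics + failover slots)
HAProxy 2.5.5 (fixes stats errors, more metrics)
PG Exporter 0.4.1 (timeout parameters, etc.)
Grafana 8.4.4
Prometheus 2.33.4
Greenplum 6.19.4 / MatrixDB 4.4.0
Loki is now shipped as an rpm package instead of a zip archive.

Bug fixes

Removed patroni's consul dependency, which makes it easier to migrate to a new consul cluster
Fixed the default data directory path in the prometheus bin/new script: changed from /export/prometheus to /data/prometheus
Added a restart delay (in seconds) to the vip-manager systemd service
Fixed typos and tasks

API changes

New variables

node_cluster: identity variable for the node cluster
nodename_overwrite: if set, nodename is used as the node's hostname
nodename_exchange: exchange node hostnames among play hosts (in /etc/hosts)
node_dns_hosts_extra: extra static dns records that can be easily overridden per instance/cluster
patroni_enabled: if disabled, the postgres & patroni bootstrap is not performed during the postgres role
pgbouncer_enabled: if disabled, pgbouncer is not started during the postgres role
pg_exporter_params: extra url parameters for pg_exporter when generating monitoring target urls.
pg_provision: boolean, whether to run the provisioning part of the postgres role (templates, databases, users)
no_cmdb: for the infra.yml and infra-demo.yml playbooks; the cmdb is not created on the meta node.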

MD5 (app.tgz) = f887313767982b31a2b094e5589a75ea
MD5 (matrix.tgz) = 3d063437c482d94bd7e35df1a08bbc84
MD5 (pigsty.tgz) = e143b88ebea1474f9ebaffddc6072c49
MD5 (pkg.tgz) = 73e8f5ce995b1f1760cb63c1904fb91b

v1.3.1

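Monitoring

PGSQL & PGCAT dashboard improvements
Optimized the layout of pgcat instance & pgcat database
Added key metric panels to the pgsql instance dashboard, consistent with pgsql cluster
Added table/index bloat panels to pgcat database, and removed the pgcat bloat dashboard
Added index information to the pgcat database dashboard
Fixed panels broken in grafana 8.3
Added a redis index to the nginx home page

Deployment

New infra-demo.yml playbook for one-pass bootstrap
Use the infra-jupyter.yml playbook to deploy an optional jupyter lab server
Use the infra-pgweb.yml playbook to deploy an optional pgweb server
New pg alias on meta nodes, which can launch postgres clusters as the admin user (in addition to postgres)
Adjusted max_locks_per_transaction in all patroni config templates as advised by timescaledb-tune
Added citus.node_conninfo: 'sslmode=prefer' to config templates so citus can be used without SSL
Added all extensions (except pgrouting) to the pgdg14 package list
Upgraded node_exporter to v1.3.1
Added PostgREST v9.0.0 to the package list. It generates APIs from postgres schemas.

Bug fixes

Fixed a Grafana security vulnerability (by upgrading to v8.3.1)
Fixed pg_instance & pg_service issues in the register role when starting from the middle of a playbook
Fixed nginx home page rendering on hosts without the pg_cluster variable
Fixed style issues after upgrading to grafana 8.3.1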

v1.3.0

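[ENHANCEMENT] Redis deployment (cluster, sentinel, primary-replica)
[ENHANCEMENT] Redis monitoring
- Redis overview dashboard
- Redis cluster dashboard
- Redis instance dashboard
[ENHANCEMENT] Monitoring: PGCAT overhaul
- New dashboard: PGCAT instance
- New dashboard: PGCAT database
- Reworked dashboard: PGCAT table
[ENHANCEMENT] Monitoring: PGSQL enhancements
- New panels: PGSQL cluster, with 10 key metric panels added (toggled by default)
- New panels: PGSQL instance, with 10 key metric panels added (toggled by default)
- Simplified & redesigned: PGSQL service
- Added cross-references between PGCAT & PGSQL dashboards
[ENHANCEMENT] Monitoring deployment
- grafana datasources are now registered automatically during monitor-only deployment
[ENHANCEMENT] Software upgrades
- Added PostgreSQL 13 to the default package list
- Upgraded to PostgreSQL 14.1 by default
- Added greenplum rpm and dependencies
- Added redis rpm & source packages
- Added perf as a default package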

v1.2.0

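[ENHANCEMENT] PostgreSQL 14 is used by default
[ENHANCEMENT] The TimescaleDB 2.5 extension is used by default
- timescaledb and postgis are now enabled in cmdb by default
[ENHANCEMENT] New monitor-only mode:
- With just a connectable URL, you can monitor existing pg instances with pigsty
- pg_exporter is deployed locally on the meta node
- New dashboard PGSQL Cluster Monly for remote clusters
[ENHANCEMENT] Software upgrades
- grafana upgraded to 8.2.2
- pev2 upgraded to v0.11.9
- promscale upgraded to 0.6.2
- pgweb upgraded to 0.11.9
- New extensions: pglogical, pg_stat_monitor, orafce
[ENHANCEMENT] Machine specs are detected automatically and the appropriate node_tune and pg_conf templates are used
[ENHANCEMENT] Bloat-related views are reworked and now expose more information
[ENHANCEMENT] Removed internal monitoring for timescale and citus
[ENHANCEMENT] New playbook pgsql-audit.yml for creating audit reports
[BUG FIX] The pgbouncer_exporter resource owner is now {{ pg_dbsu }} instead of postgres
[BUG FIX] Fixed duplicate pg_exporter metrics on pg_table and pg_index while REINDEX TABLE CONCURRENTLY is running
[ENHANCEMENT] All config templates are now reduced to two: auto and demo. (Removed: pub4, pg14, demo4, tiny, oltp)
- pigsty-demo is configured if vagrant is the default user; otherwise pigsty-auto is used.

How to upgrade from v1.1.1

There are no API changes in 1.2.0. You can still use your old pigsty.yml config file (PG13). For the infrastructure part, re-running repo does most of the work.

As for databases, you can keep using existing PG13 instances. In-place upgrades are very tricky when extensions such as PostGIS and Timescale are involved. I strongly recommend migrating databases with logical replication. The new playbook pgsql-migration.yml makes this much easier: it generates a series of scripts that help you migrate your cluster with near-zero downtime.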

v1.1.1

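[ENHANCEMENT] Replaced the apache version of timescaledb with the timescale version
[ENHANCEMENT] Upgraded prometheus to 2.30
[BUG FIX] The pg_exporter config directory is now owned by {{ pg_dbsu }} rather than prometheus

How to upgrade from v1.1.0?

The main change in this release is TimescaleDB: the official version under the TimescaleDB License (TSL) replaces the Apache License v2 version from the PGDG repository.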

stop/pause postgres instance with timescaledb
yum remove -y timescaledb_13

[timescale_timescaledb]
name=timescale_timescaledb
baseurl=https://packagecloud.io/timescale/timescaledb/el/7/$basearch
repo_gpgcheck=0
gpgcheck=0
enabled=1

yum install timescaledb-2-postgresql13

v1.1.0

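[ENHANCEMENT] Added pg_dummy_filesize to create a filesystem space placeholder
[ENHANCEMENT] Major home page redesign
[ENHANCEMENT] Added Jupyter Lab integration
[ENHANCEMENT] Added pgweb console integration
[ENHANCEMENT] Added pgbadger support
[ENHANCEMENT] Added pev2 support, an explain visualizer
[ENHANCEMENT] Added pglog utilities
[ENHANCEMENT] Updated the default pkg.tgz software versions:
- PostgreSQL upgraded to v13.4 (with official pg14 support)
- pgbouncer upgraded to v1.16 (metric definitions updated)
- Grafana upgraded to v8.1.4
- Prometheus upgraded to v2.29
- node_exporter upgraded to v1.2.2
- haproxy upgraded to v2.1.1
- consul upgraded to v1.10.2
- vip-manager upgraded to v1.0.1

API changes

nginx_upstream now holds a different structure. (incompatible)
New config entry: app_list, navigation entries rendered on the home page
New config entry: docs_enabled, set up local docs on the default server
New config entry: pev2_enabled, set up the local pev2 utility
New config entry: pgbadger_enabled, create the log summary/report directory
New config entry: jupyter_enabled, enable the Jupyter Lab server on the meta node
New config entry: jupyter_username, the user that runs Jupyter Lab
New config entry: jupyter_password, the default password for Jupyter Lab
New config entry: pgweb_enabled, enable the pgweb server on the meta node
New config entry: pgweb_username, the user that runs pgweb
Renamed the internal flag repo_exist to repo_exists
repo_address now defaults to pigsty instead of yum.pigsty
The haproxy access point is now http://pigsty instead of http://h.pigsty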

v1.0.1

2021-09-14

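Documentation update
- Chinese documentation is now available
- Machine-translated English documentation is now available
Bug fix: pgsql-remove did not remove the primary instance
Bug fix: replaced pg_instance with pg_cluster + pg_seq
- Start-At-Task could fail because pg_instance was undefined
Bug fix: removed citus from the default shared preload libraries
- citus forces max_prepared_transactions to a non-zero value
Bug fix: ssh sudo check in configure:
- now uses ssh -t sudo -n ls for the privilege check
Typo fix: typo in the pg-backup script
Alert tweak: removed the NTP sanity check alert (it duplicates ClockSkew)
Exporter tweak: removed collector.systemd to reduce overhead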

v1.0.0

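v1 GA release, with a full overhaul of the monitoring system

Highlights

Monitoring system overhaul
- New dashboards on Grafana 8.0
- New metric definitions, with PG14 support
- Simplified label system: static label set (job, cls, ins)
- New alerting rules and derived metrics
- Monitoring of multiple databases at the same time
- Real-time log search & csvlog analysis
- Link-rich dashboards: click graphic elements to drill down or roll up
Architecture changes
- citus and timescaledb are added to the default installation
- Support for PostgreSQL 14beta2
- Simplified haproxy admin page index
- Infra and pgsql are decoupled by adding a new role, register
- New roles loki and promtail for logging
- New role environ to set up the environment for the admin user on admin nodes
- static service discovery is used for prometheus by default (instead of consul)
- New role remove to remove clusters and instances gracefully
- Upgraded the prometheus and grafana provisioning logic
- Upgraded to vip-manager 1.0, node_exporter 1.2, pg_exporter 0.4, grafana 8.0
- Every database on every instance can now be registered as a grafana datasource automatically
- consul registration tasks are moved to the register role, and consul service labels are changed
- Added cmdb.sql as the pg-meta baseline definition (CMDB & PGLOG)
Application framework
- Extensible framework for new features
- Core app: PostgreSQL monitoring system: pgsql
- Core app: PostgreSQL catalog explorer: pgcat
- Core app: PostgreSQL csvlog analyzer: pglog
- Added example app covid for visualizing covid-19 data
- Added example app isd for visualizing isd data
Others
- Added jupyterlab, a complete python environment for data science
- Added vonng-echarts-panel to bring back Echarts support
- Added wrapper scripts createpg, createdb, createuser
- Added cmdb dynamic inventory scripts: load_conf.py, inventory_cmdb, inventory_conf
- Removed obsolete playbooks: pgsql-monitor, pgsql-service, node-remove, etc.

API changes

New variable: node_meta_pip_install
New variable: grafana_admin_username
New variable: grafana_database
New variable: grafana_pgurl
New variable: pg_shared_libraries
New variable: pg_exporter_auto_discovery
New variable: pg_exporter_exclude_database
New variable: pg_exporter_include_database
Variable renamed: grafana_url to grafana_endpoint

Bug fixes

Fixed the default timezone Asia/Shanghai (CST) issue
Fixed nofile limits for pgbouncer & patroni
The pgbouncer user list and database list are now generated when the pgbouncer tag is executed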

v0.9.0

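v0.9 greatly simplifies installation, makes many log-related improvements, adds a command-line tool (beta), and fixes a series of issues.

New features

One-line installation:

/bin/bash -c "$(curl -fsSL https://pigsty.cc/install)"

The command-line tool pigsty-cli wraps common Ansible commands; pigsty-cli is currently in beta
Log collection with Loki and Promtail:
- Postgres, Pgbouncer, and Patroni logs are collected by default
- New deployment scripts infra-loki.yml and pgsql-promtail.yml
- Log-based monitoring metrics are defined
- Log-related visualization panels are built with Grafana.
Monitoring components can be installed from binaries; use files/get_bin.sh to download them.
Ascension mode: once the cluster's meta node is initialized, bin/upgrade switches to a dynamic inventory, using the database on pg-meta instead of the YAML config file.

Bug fixes

Log-related fixes:
- Fixed HAProxy health checks producing large numbers of connection reset by peer entries in PG logs.
- Fixed HAProxy health checks producing large numbers of Connect Reset Exception entries in Patroni logs
- Fixed the Patroni log timestamp format: milliseconds are dropped and full timezone information is appended.
- Set log_min_duration_statement to 1 second for dbuser_monitor, to keep monitoring queries out of the logs.
Grafana role refactoring
- The Grafana role is refactored with its API unchanged.
- Prepackaged Grafana plugins are downloaded from a CDN to speed up plugin downloads
Other fixes
- Fixed pgbouncer-create-user not handling md5 passwords correctly.
- Improved the empty-parameter checks in the database and user creation SQL templates.
- Fixed DNS configuration possibly breaking when NODE DNS configuration is interrupted manually.
- Fixed typos in the Makefile shortcuts

Parameter changes

node_disable_swap defaults to False; SWAP is not disabled by default.
node_sysctl_params no longer modifies any system parameters by default.
grafana_plugin: the default value install now means downloading from the CDN when the plugin cache does not exist.
repo_url_packages now downloads extra RPM packages from the Pigsty CDN, working around access problems from within mainland China.
proxy_env.no_proxy now adds the Pigsty CDN to the NOPROXY list.
grafana_customize now defaults to false; enabling it means installing the Pigsty Pro UI (not open source by default, so do not enable it)
node_admin_pk_current: new option; when enabled, the current user's ~/.ssh/id_rsa.pub is added to the admin's keys
loki_clean: new option, whether to purge existing data when installing Loki
loki_data_dir: new option, the data directory used when installing Loki
promtail_enabled: enable the Promtail log collection service?
promtail_clean: remove existing state when installing promtail?
promtail_port: the default port used by promtail, 9080 by default
promtail_status_file: location of the file that holds Promtail state
promtail_send_url: endpoint of the loki service that receives logs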

v0.8.0

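v0.8 thoroughly reworks the Service access layer. In addition to the default primary and replica services, users can now define new services of their own. The service interface supports multiple implementations; for example, an L4 DPKG VIP can be integrated with Pigsty as a replacement for Haproxy. This release also addresses a batch of issues reported by users.

Changes

v0.8 finalizes the provisioning design; from now on the provisioning API will remain stable.

API changes

All config entries of the former vip and haproxy roles are now moved to the service role.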

#------------------------------------------------------------------------------
# SERVICE PROVISION
#------------------------------------------------------------------------------
pg_weight: 100              # default load balance weight (instance level)

# - service - #
pg_services:                                  # how to expose postgres service in cluster?
  # primary service will route {ip|name}:5433 to primary pgbouncer (5433->6432 rw)
  - name: primary           # service name {{ pg_cluster }}_primary
    src_ip: "*"
    src_port: 5433
    dst_port: pgbouncer     # 5433 route to pgbouncer
    check_url: /primary     # primary health check, success when instance is primary
    selector: "[]"          # select all instance as primary service candidate

  # replica service will route {ip|name}:5434 to replica pgbouncer (5434->6432 ro)
  - name: replica           # service name {{ pg_cluster }}_replica
    src_ip: "*"
    src_port: 5434
    dst_port: pgbouncer
    check_url: /read-only   # read-only health check. (including primary)
    selector: "[]"          # select all instance as replica service candidate
    selector_backup: "[? pg_role == `primary`]"   # primary are used as backup server in replica service

  # default service will route {ip|name}:5436 to primary postgres (5436->5432 primary)
  - name: default           # service's actual name is {{ pg_cluster }}-{{ service.name }}
    src_ip: "*"             # service bind ip address, * for all, vip for cluster virtual ip address
    src_port: 5436          # bind port, mandatory
    dst_port: postgres      # target port: postgres|pgbouncer|port_number , pgbouncer(6432) by default
    check_method: http      # health check method: only http is available for now
    check_port: patroni     # health check port:  patroni|pg_exporter|port_number , patroni by default
    check_url: /primary     # health check url path, / as default
    check_code: 200         # health check http code, 200 as default
    selector: "[]"          # instance selector
    haproxy:                # haproxy specific fields
      maxconn: 3000         # default front-end connection
      balance: roundrobin   # load balance algorithm (roundrobin by default)
      default_server_options: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'

  # offline service will route {ip|name}:5438 to offline postgres (5438->5432 offline)
  - name: offline           # service name {{ pg_cluster }}_offline
    src_ip: "*"
    src_port: 5438
    dst_port: postgres
    check_url: /replica     # offline MUST be a replica
    selector: "[? pg_role == `offline` || pg_offline_query ]"         # instances with pg_role == 'offline' or instance marked with 'pg_offline_query == true'
    selector_backup: "[? pg_role == `replica` && !pg_offline_query]"  # replica are used as backup server in offline service

pg_services_extra: []        # extra services to be added

# - haproxy - #
haproxy_enabled: true                         # enable haproxy among every cluster members
haproxy_reload: true                          # reload haproxy after config
haproxy_policy: roundrobin                    # roundrobin, leastconn
haproxy_admin_auth_enabled: false             # enable authentication for haproxy admin?
haproxy_admin_username: admin                 # default haproxy admin username
haproxy_admin_password: admin                 # default haproxy admin password
haproxy_exporter_port: 9101                   # default admin/exporter port
haproxy_client_timeout: 3h                    # client side connection timeout
haproxy_server_timeout: 3h                    # server side connection timeout

# - vip - #
vip_mode: none                                # none | l2 | l4
vip_reload: true                              # whether reload service after config
# vip_address: 127.0.0.1                      # virtual ip address ip (l2 or l4)
# vip_cidrmask: 24                            # virtual ip address cidr mask (l2 only)
# vip_interface: eth0                         # virtual ip network interface (l2 only)

New options

# - localization - #
pg_encoding: UTF8                             # default to UTF8
pg_locale: C                                  # default to C
pg_lc_collate: C                              # default to C
pg_lc_ctype: en_US.UTF8                       # default to en_US.UTF8

pg_reload: true                               # reload postgres after hba changes
vip_mode: none                                # none | l2 | l4
vip_reload: true                              # whether reload service after config

Removed options

haproxy_check_port                            # haproxy parameters are now covered by the service definition
haproxy_primary_port
haproxy_replica_port
haproxy_backend_port
haproxy_weight
haproxy_weight_fallback
vip_enabled                                   # vip_enabled is superseded by vip_mode

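Service management

pg_services and pg_services_extra define the services in a cluster. Each service definition is structured as in the example below:

A service must specify the following:

Name: the full service name is prefixed with the database cluster name and suffixed with service.name, joined by -. For example, the service with name=primary in the pg-test cluster has the full name pg-test-primary.
Port: in Pigsty, services are exposed in NodePort fashion by default, so the exposed port is mandatory. If you use an external load balancer for access, however, you can distinguish services in other ways.
Selector: the selector specifies the members of the service as a JMESPath expression that filters the cluster's instance members by their variables. The default [] selector picks all cluster members.

In addition, selector_backup selects or marks the list of instances used as backup (they take over the service only when all other members of the cluster have failed)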

  # default service will route {ip|name}:5436 to primary postgres (5436->5432 primary)
  - name: default           # service's actual name is {{ pg_cluster }}-{{ service.name }}
    src_ip: "*"             # service bind ip address, * for all, vip for cluster virtual ip address
    src_port: 5436          # bind port, mandatory
    dst_port: postgres      # target port: postgres|pgbouncer|port_number , pgbouncer(6432) by default
    check_method: http      # health check method: only http is available for now
    check_port: patroni     # health check port:  patroni|pg_exporter|port_number , patroni by default
    check_url: /primary     # health check url path, / as default
    check_code: 200         # health check http code, 200 as default
    selector: "[]"          # instance selector
    haproxy:                # haproxy specific fields
      maxconn: 3000         # default front-end connection
      balance: roundrobin   # load balance algorithm (roundrobin by default)
      default_server_options: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'
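
The naming rule and the selectors of the offline service can be mimicked in plain Python to see which instances end up where. This is a minimal sketch that uses list comprehensions in place of a JMESPath engine; the sample instances and the function names are hypothetical, and only the selector expressions quoted in the comments come from the definitions above.

```python
def service_name(cluster: str, name: str) -> str:
    """Full service name: cluster name and service name joined by '-'."""
    return f"{cluster}-{name}"


# Hypothetical members of a pg-test cluster.
instances = [
    {"name": "pg-test-1", "pg_role": "primary"},
    {"name": "pg-test-2", "pg_role": "replica"},
    {"name": "pg-test-3", "pg_role": "offline"},
    {"name": "pg-test-4", "pg_role": "replica", "pg_offline_query": True},
]


def select_offline(members):
    # Mirrors: [? pg_role == `offline` || pg_offline_query ]
    return [m for m in members
            if m.get("pg_role") == "offline" or m.get("pg_offline_query")]


def select_offline_backup(members):
    # Mirrors: [? pg_role == `replica` && !pg_offline_query]
    return [m for m in members
            if m.get("pg_role") == "replica" and not m.get("pg_offline_query")]


primary_service = service_name("pg-test", "primary")
offline_members = [m["name"] for m in select_offline(instances)]
backup_members = [m["name"] for m in select_offline_backup(instances)]
```

In this sample, the dedicated offline instance and the replica flagged with pg_offline_query serve offline traffic, while the remaining replica is only a backup and the primary is never selected.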

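Database management

Databases can now set the fine-grained locale options lc_ctype and lc_collate separately. The main reason for this feature is that the PG extension pg_trgm needs an lc_ctype!=C environment to support Chinese properly.

Old interface definition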

pg_databases:
  - name: meta                      # name is the only required field for a database
    owner: postgres                 # optional, database owner
    template: template1             # optional, template1 by default
    encoding: UTF8                  # optional, UTF8 by default
    locale: C                       # optional, C by default
    allowconn: true                 # optional, true by default, false disable connect at all
    revokeconn: false               # optional, false by default, true revoke connect from public # (only default user and owner have connect privilege on database)
    tablespace: pg_default          # optional, 'pg_default' is the default tablespace
    connlimit: -1                   # optional, connection limit, -1 or none disable limit (default)
    extensions:                     # optional, extension name and where to create
      - {name: postgis, schema: public}
    parameters:                     # optional, extra parameters with ALTER DATABASE
      enable_partitionwise_join: true
    pgbouncer: true                 # optional, add this database to pgbouncer list? true by default
    comment: pigsty meta database   # optional, comment string for database

New interface definition

pg_databases:
  - name: meta                      # name is the only required field for a database
    # owner: postgres                 # optional, database owner
    # template: template1             # optional, template1 by default
    # encoding: UTF8                # optional, UTF8 by default , must same as template database, leave blank to set to db default
    # locale: C                     # optional, C by default , must same as template database, leave blank to set to db default
    # lc_collate: C                 # optional, C by default , must same as template database, leave blank to set to db default
    # lc_ctype: C                   # optional, C by default , must same as template database, leave blank to set to db default
    allowconn: true                 # optional, true by default, false disable connect at all
    revokeconn: false               # optional, false by default, true revoke connect from public # (only default user and owner have connect privilege on database)
    # tablespace: pg_default          # optional, 'pg_default' is the default tablespace
    connlimit: -1                   # optional, connection limit, -1 or none disable limit (default)
    extensions:                     # optional, extension name and where to create
      - {name: postgis, schema: public}
    parameters:                     # optional, extra parameters with ALTER DATABASE
      enable_partitionwise_join: true
    pgbouncer: true                 # optional, add this database to pgbouncer list? true by default
    comment: pigsty meta database   # optional, comment string for database
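
To show how the separate lc_collate and lc_ctype fields could map onto SQL, here is a minimal Python sketch that renders a CREATE DATABASE statement from one pg_databases entry. It is an illustration only, not the SQL template Pigsty actually uses; the function name create_database_sql is hypothetical, only a few fields are handled, and values are not escaped.

```python
def create_database_sql(db: dict) -> str:
    """Build a CREATE DATABASE statement from a pg_databases-style entry."""
    clauses = []
    # Identifier-valued options are double-quoted.
    for key, keyword in (("owner", "OWNER"),
                         ("template", "TEMPLATE"),
                         ("tablespace", "TABLESPACE")):
        if db.get(key):
            clauses.append(f'{keyword} "{db[key]}"')
    # String-valued options are single-quoted.
    for key, keyword in (("encoding", "ENCODING"),
                         ("lc_collate", "LC_COLLATE"),
                         ("lc_ctype", "LC_CTYPE")):
        if db.get(key):
            clauses.append(f"{keyword} '{db[key]}'")
    sql = f'CREATE DATABASE "{db["name"]}"'
    if clauses:
        sql += " WITH " + " ".join(clauses)
    return sql + ";"


# Collation stays C while ctype is a UTF8 locale, the pg_trgm case described above.
statement = create_database_sql(
    {"name": "meta", "lc_collate": "C", "lc_ctype": "en_US.UTF8"}
)
print(statement)
```

As the comments in the definition note, these settings must match the template database; in PostgreSQL, choosing a different encoding or locale generally means creating the database from template0.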

v0.7.0

v0.7 针对接入已有数据库实例进行了改进，现在用户可以采用 仅监控部署（Monly Deployment） 模式使用Pigsty。同时新增了专用于管理数据库与用户、以及单独部署监控的剧本，并对数据库与用户的定义进行改进。

Features

Bug Fix

API变更

新增选项

prometheus_sd_target: batch                   # batch|single    监控目标定义文件采用单体还是每个实例一个
exporter_install: none                        # none|yum|binary 监控Exporter的安装模式
exporter_repo_url: ''                         # 如果设置，这里的REPO连接会加入目标的Yum源中
node_exporter_options: '--no-collector.softnet --collector.systemd --collector.ntp --collector.tcpstat --collector.processes'                          # Node Exporter默认的命令行选项
pg_exporter_url: ''                           # 可选，PG Exporter监控对象的URL
pgbouncer_exporter_url: ''                    # 可选，PGBOUNCER EXPORTER监控对象的URL

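Removed options

exporter_binary_install: false                 # superseded by exporter_install

Changed definition structures

pg_default_roles                               # see User management for details
pg_users                                       # see User management for details
pg_databases                                   # see Database management for details

Renamed options

pg_default_privilegs -> pg_default_privileges # the old name was plainly a typo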

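Monitor-only mode

Sometimes users do not want Pigsty's provisioning solution and only want to use the Pigsty monitoring system to manage existing PostgreSQL instances.

Pigsty offers a monitor-only deployment mode (monly, short for monitor-only) that strips out the provisioning part and can be used to monitor existing PostgreSQL clusters.

The deployment procedure for monitor-only mode is largely the same as the standard one, but skips many steps:

Infrastructure initialization on the meta node is unchanged and is still done with ./infra.yml.
Infrastructure initialization on database nodes is not required.
Most database initialization tasks on database nodes are not required; the monitoring system alone is deployed with the dedicated ./pgsql-monitor.yml playbook.
Far fewer configuration options are actually used: only infrastructure-related variables and a few monitoring-related variables remain.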
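
As a rough illustration of the monitor-only workflow, the fragment below sketches the handful of variables such a deployment might set. It uses only options listed for this release; the host address, credentials, and connection URL are hypothetical placeholders, and the exact set of variables needed in practice may differ.

```yaml
# Sketch only: monitor-only (monly) settings; all values are illustrative assumptions
exporter_install: binary            # none|yum|binary: here, install exporters from binaries
prometheus_sd_target: batch         # batch|single: one target file for all instances
# Hypothetical connection URL for an existing PostgreSQL instance (assumed host and credentials)
pg_exporter_url: 'postgres://dbuser_monitor:DBUser.Monitor@10.10.10.11:5432/postgres?sslmode=disable'
```

With such a configuration in place, the flow would be ./infra.yml on the meta node followed by ./pgsql-monitor.yml for the monitored nodes.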

Database management

Database provisioning interface enhancement #33

Old interface definition

pg_databases:                       # create a business database 'meta'
  - name: meta
    schemas: [meta]                 # create extra schema named 'meta'
    extensions: [{name: postgis}]   # create extra extension postgis
    parameters:                     # overwrite database meta's default search_path
      search_path: public, monitor

New interface definition

pg_databases:
  - name: meta                      # name is the only required field for a database
    owner: postgres                 # optional, database owner
    template: template1             # optional, template1 by default
    encoding: UTF8                  # optional, UTF8 by default
    locale: C                       # optional, C by default
    allowconn: true                 # optional, true by default, false disable connect at all
    revokeconn: false               # optional, false by default, true revoke connect from public # (only default user and owner have connect privilege on database)
    tablespace: pg_default          # optional, 'pg_default' is the default tablespace
    connlimit: -1                   # optional, connection limit, -1 or none disable limit (default)
    extensions:                     # optional, extension name and where to create
      - {name: postgis, schema: public}
    parameters:                     # optional, extra parameters with ALTER DATABASE
      enable_partitionwise_join: true
    pgbouncer: true                 # optional, add this database to pgbouncer list? true by default
    comment: pigsty meta database   # optional, comment string for database

Interface changes

Add new options: template, encoding, locale, allowconn, tablespace, connlimit
Add new option revokeconn, which revokes connect privileges from public for this database
Add comment field for database

Database changes

To create a new database in a running cluster, define it in the configuration and then run the pgsql-createdb.yml playbook:

./pgsql-createdb.yml -e pg_database=<your_new_database_name>

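Pass the name of the database to create with -e pg_database=, and that database is created (or modified). The exact commands executed are written to /pg/tmp/pg-db-{{ database.name}}.sql on the cluster primary.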
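
For instance, a new database could first be declared in the cluster configuration using the fields of the new interface. The database name app and the owner dbuser_app below are hypothetical and only illustrate the shape of the definition.

```yaml
pg_databases:
  - name: meta                      # existing database, left unchanged
  - name: app                       # hypothetical new database to create
    owner: dbuser_app               # hypothetical owner, assumed to be defined in pg_users
    revokeconn: true                # revoke connect privilege from public
    connlimit: 100                  # illustrative connection limit
    comment: example application database
```

The playbook would then be invoked with -e pg_database=app.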

User management

User provisioning interface enhancement #34

Old interface definition

pg_users:
  - username: test                  # example production user with read-write access
    password: test                  # example user's password
    options: LOGIN                  # extra options
    groups: [ dbrole_readwrite ]    # dbrole_admin|dbrole_readwrite|dbrole_readonly
    comment: default test user for production usage
    pgbouncer: true                 # add to pgbouncer

New interface definition

pg_users:
  # complete example of user/role definition for production user
  - name: dbuser_meta               # example production user have read-write access
    password: DBUser.Meta           # example user's password, can be encrypted
    login: true                     # can login, true by default (should be false for role)
    superuser: false                # is superuser? false by default
    createdb: false                 # can create database? false by default
    createrole: false               # can create role? false by default
    inherit: true                   # can this role use inherited privileges?
    replication: false              # can this role do replication? false by default
    bypassrls: false                # can this role bypass row level security? false by default
    connlimit: -1                   # connection limit, -1 disable limit
    expire_at: '2030-12-31'         # 'timestamp' when this role is expired
    expire_in: 365                  # now + n days when this role is expired (OVERWRITE expire_at)
    roles: [dbrole_readwrite]       # dbrole_admin|dbrole_readwrite|dbrole_readonly
    pgbouncer: true                 # add this user to pgbouncer? false by default (true for production user)
    parameters:                     # user's default search path
      search_path: public
    comment: test user

Interface changes

username field renamed to name
groups field renamed to roles
options is now split into separate configuration entries: login, superuser, createdb, createrole, inherit, replication, bypassrls, connlimit
Add expire_at and expire_in options
pgbouncer option for user is now false by default
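
To make the renames concrete, here is a sketch of how the old-style test user might look once migrated to the new interface. It is an illustration derived from the two definitions in this section, not an official migration recipe.

```yaml
pg_users:
  - name: test                      # was: username
    password: test
    login: true                     # was: options: LOGIN
    roles: [dbrole_readwrite]       # was: groups
    pgbouncer: true                 # now false by default, so set it explicitly to keep the old behavior
    comment: default test user for production usage
```

Since the pgbouncer default changed, omitting that line would leave the migrated user out of the pgbouncer user list.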

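User changes

To create a new user in a running cluster, define it in the configuration and then run the pgsql-createuser.yml playbook:

./pgsql-createuser.yml -e pg_user=<your_new_user_name>

Pass the name of the user to create with -e pg_user=, and that user is created (or modified). The exact commands executed are written to /pg/tmp/pg-user-{{ user.name}}.sql on the cluster primary.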

v0.6.0

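v0.6 revises and adjusts the database provisioning solution, adding a series of practical features and fixes based on user feedback. It also improves the portability of the monitoring system so that it integrates more easily with other external database provisioning solutions.

Bug fixes

Fixed newer Patroni versions resetting PG HBA after a restart
Fixed a typo in the PG Overview Dashboard title
Fixed the default primary of the sandbox cluster pg-test: it was pg-test-2 and should be pg-test-1
Fixed outdated code comments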

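Improvements

Reworked Prometheus and monitoring provisioning
- Monitoring can be deployed for existing PG clusters without the infrastructure, which makes it easier to integrate the monitoring system with other provisioning solutions. #11
- A static list of all monitoring targets is rendered from the inventory for static service discovery. #11
- Prometheus gains a static target mode that replaces dynamic service discovery and centralizes identity management. #11
- Monitoring exporters gain a service_registry option; Consul service registration is now optional. #13
- Exporters can now be installed directly by copying binaries: exporter_binary_install. #14
- Exporters now have xxx_enabled options that control whether each component is enabled.
HAProxy provisioning refactored and improved #8
- Added a global HAProxy admin page navigation, with default domain h.pigsty
- The primary can be added to the read-only service set so that it automatically takes over read traffic when all replicas in the cluster are down. #8
- Authentication can be enabled for the HAProxy instance admin page with haproxy_admin_auth_enabled
- The traffic weight of each service backend can be adjusted through configuration. #10
Access control model improvements #7
- Added the default role dbrole_offline for slow queries, ETL, and interactive queries.
- Changed the default HBA rules to allow users in the dbrole_offline group to access instances with pg_role == 'offline' as well as instances with pg_offline_query == true.
Software updates, Release v0.6
- PostgreSQL 13.2
- Prometheus 2.25
- PG Exporter 0.3.2
- Node Exporter 1.1
- Consul 1.9.3
- Updated the default PG repo: PostgreSQL now uses the Zhejiang University mirror by default for faster download and installation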

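Interface changes

New options

service_registry: consul                      # service registration mechanism: none | consul | etcd | both
prometheus_options: '--storage.tsdb.retention=30d'  # Prometheus command-line options
prometheus_sd_method: consul                  # service discovery mechanism used by Prometheus: static|consul
prometheus_sd_interval: 2s                    # Prometheus service discovery refresh interval
pg_offline_query: false                       # if set, the dbrole_offline role may connect to and query this instance
node_exporter_enabled: true                   # if set, install and configure Node Exporter
pg_exporter_enabled: true                     # if set, install and configure PG Exporter
pgbouncer_exporter_enabled: true              # if set, install and configure Pgbouncer Exporter
dcs_disable_purge: false                      # safeguard: force dcs_exists_action = abort to avoid removing a DCS instance by mistake
pg_disable_purge: false                       # safeguard: force pg_exists_action = abort to avoid removing a database instance by mistake
haproxy_weight: 100                           # relative load-balancing weight of the instance
haproxy_weight_fallback: 1                    # relative weight of the cluster primary in the read-only service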
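
To make the weight and offline options more concrete, here is a minimal inventory sketch for a three-node cluster. The IP addresses are hypothetical, and the pg_cluster and pg_seq identity variables are assumed from typical Pigsty inventories rather than taken from this release note.

```yaml
pg-test:
  hosts:
    10.10.10.11: {pg_seq: 1, pg_role: primary}
    10.10.10.12: {pg_seq: 2, pg_role: replica, haproxy_weight: 50}      # half the default weight of 100
    10.10.10.13: {pg_seq: 3, pg_role: replica, pg_offline_query: true}  # also accepts dbrole_offline users
  vars:
    pg_cluster: pg-test
    haproxy_weight_fallback: 1      # relative weight of the primary in the read-only service
```

In this sketch the second replica receives roughly half the read traffic of a default-weight instance, and the third additionally serves users in the dbrole_offline group.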

Removed options

prometheus_metrics_path                       # duplicates exporter_metrics_path
prometheus_retention                          # superseded by prometheus_options

v0.5.0

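Pigsty now has an official website: pigsty.cc 🎉

Highlights

The official Pigsty documentation site is now live!
Added support for customizing database templates: users can define the required in-database objects through the configuration file.
Improved the default access control model
Refactored HBA management: Pigsty now generates HBA directly instead of Patroni
Changed Grafana provisioning from sqlite to static provisioning with JSON files
Added the pg-cluster-replication dashboard to the free open-source Pigsty edition.
Latest tested offline installation package: pkg.tgz (v0.5)

Custom databases

Have you ever been troubled by multi-tenancy on a single instance? For example, there are always developers who treat PostgreSQL like MySQL: where a schema would do, they insist on creating a new database, and end up with dozens of databases in one instance. Don't worry. Pigsty now provides provisioning for in-database objects, so you can easily specify the objects you need in the configuration file, including: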

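Roles
- User/role name
- Password
- User attributes
- User comment
- Permission groups the user belongs to
Databases
- Owner
- Extra schemas
- Extra extensions
- Database-level custom configuration parameters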
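Default privileges
- By default, the privileges configured here apply to all objects created by superusers and admin users.
Default extensions
- All newly created business databases have these default extensions installed
Default schemas
- All newly created business databases have these default schemas created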

Configuration example

# Variables usually configured per DB cluster
pg_users:
  - username: test
    password: test
    comment: default test user
    groups: [ dbrole_readwrite ]    # dbrole_admin|dbrole_readwrite|dbrole_readonly
pg_databases:                       # create a business database 'test'
  - name: test
    extensions: [{name: postgis}]   # create extra extension postgis
    parameters:                     # overwrite database test's default search_path
      search_path: public,monitor

# Global variables usually configured uniformly for the whole environment
# - system roles - #
pg_replication_username: replicator           # system replication user
pg_replication_password: DBUser.Replicator    # system replication password
pg_monitor_username: dbuser_monitor           # system monitor user
pg_monitor_password: DBUser.Monitor           # system monitor password
pg_admin_username: dbuser_admin               # system admin user
pg_admin_password: DBUser.Admin               # system admin password

# - default roles - #
pg_default_roles:
  - username: dbrole_readonly                 # sample user:
    options: NOLOGIN                          # role can not login
    comment: role for readonly access         # comment string

  - username: dbrole_readwrite                # sample user: one object for each user
    options: NOLOGIN
    comment: role for read-write access
    groups: [ dbrole_readonly ]               # read-write includes read-only access

  - username: dbrole_admin                    # sample user: one object for each user
    options: NOLOGIN BYPASSRLS                # admin can bypass row level security
    comment: role for object creation
    groups: [dbrole_readwrite,pg_monitor,pg_signal_backend]

  # NOTE: replicator, monitor, admin password are overwritten by separated config entry
  - username: postgres                        # reset dbsu password to NULL (if dbsu is not postgres)
    options: SUPERUSER LOGIN
    comment: system superuser

  - username: replicator
    options: REPLICATION LOGIN
    groups: [pg_monitor, dbrole_readonly]
    comment: system replicator

  - username: dbuser_monitor
    options: LOGIN CONNECTION LIMIT 10
    comment: system monitor user
    groups: [pg_monitor, dbrole_readonly]

  - username: dbuser_admin
    options: LOGIN BYPASSRLS
    comment: system admin user
    groups: [dbrole_admin]

  - username: dbuser_stats
    password: DBUser.Stats
    options: LOGIN
    comment: business read-only user for statistics
    groups: [dbrole_readonly]


# object created by dbsu and admin will have their privileges properly set
pg_default_privilegs:
  - GRANT USAGE                         ON SCHEMAS   TO dbrole_readonly
  - GRANT SELECT                        ON TABLES    TO dbrole_readonly
  - GRANT SELECT                        ON SEQUENCES TO dbrole_readonly
  - GRANT EXECUTE                       ON FUNCTIONS TO dbrole_readonly
  - GRANT INSERT, UPDATE, DELETE        ON TABLES    TO dbrole_readwrite
  - GRANT USAGE,  UPDATE                ON SEQUENCES TO dbrole_readwrite
  - GRANT TRUNCATE, REFERENCES, TRIGGER ON TABLES    TO dbrole_admin
  - GRANT CREATE                        ON SCHEMAS   TO dbrole_admin
  - GRANT USAGE                         ON TYPES     TO dbrole_admin

# schemas
pg_default_schemas: [monitor]

# extension
pg_default_extensions:
  - { name: 'pg_stat_statements',  schema: 'monitor' }
  - { name: 'pgstattuple',         schema: 'monitor' }
  - { name: 'pg_qualstats',        schema: 'monitor' }
  - { name: 'pg_buffercache',      schema: 'monitor' }
  - { name: 'pageinspect',         schema: 'monitor' }
  - { name: 'pg_prewarm',          schema: 'monitor' }
  - { name: 'pg_visibility',       schema: 'monitor' }
  - { name: 'pg_freespacemap',     schema: 'monitor' }
  - { name: 'pg_repack',           schema: 'monitor' }
  - name: postgres_fdw
  - name: file_fdw
  - name: btree_gist
  - name: btree_gin
  - name: pg_trgm
  - name: intagg
  - name: intarray

# postgres host-based authentication rules
pg_hba_rules:
  - title: allow meta node password access
    role: common
    rules:
      - host    all     all                         10.10.10.10/32      md5

  - title: allow intranet admin password access
    role: common
    rules:
      - host    all     +dbrole_admin               10.0.0.0/8          md5
      - host    all     +dbrole_admin               172.16.0.0/12       md5
      - host    all     +dbrole_admin               192.168.0.0/16      md5

  - title: allow intranet password access
    role: common
    rules:
      - host    all             all                 10.0.0.0/8          md5
      - host    all             all                 172.16.0.0/12       md5
      - host    all             all                 192.168.0.0/16      md5

  - title: allow local read-write access (local production user via pgbouncer)
    role: common
    rules:
      - local   all     +dbrole_readwrite                               md5
      - host    all     +dbrole_readwrite           127.0.0.1/32        md5

  - title: allow read-only user (stats, personal) password directly access
    role: replica
    rules:
      - local   all     +dbrole_readonly                               md5
      - host    all     +dbrole_readonly           127.0.0.1/32        md5
pg_hba_rules_extra: []

# pgbouncer host-based authentication rules
pgbouncer_hba_rules:
  - title: local password access
    role: common
    rules:
      - local  all          all                                     md5
      - host   all          all                     127.0.0.1/32    md5

  - title: intranet password access
    role: common
    rules:
      - host   all          all                     10.0.0.0/8      md5
      - host   all          all                     172.16.0.0/12   md5
      - host   all          all                     192.168.0.0/16  md5
pgbouncer_hba_rules_extra: []

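Database templates

pg-init-template.sql: script template used to initialize the template1 database
pg-init-business.sql: script template used to initialize other business databases

Privilege model

v0.5 improves the default privilege model, mainly optimizing for single-instance multi-tenant scenarios and tightening access control.

Revoked the default CONNECT privilege of ordinary business users on databases they do not own
Revoked the default CREATE privilege of non-admin users on databases they own
Revoked the default create privilege of all users in the public schema.

Provisioning

Previously, Pigsty initialized the monitoring system by copying Grafana's own grafana.db file directly. This approach is simple, crude, and effective, but does not lend itself to fine-grained version control. In v0.5, Pigsty provisions the monitoring dashboards through the Grafana API. All you need to do is fill in grafana_url with a Grafana URL that carries the username and password. As a result, the monitoring system can easily be added to an existing Grafana.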
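
A minimal sketch of that setting follows; the address and the admin:admin credentials are placeholders, not values taken from this document.

```yaml
# Hypothetical Grafana endpoint with credentials embedded in the URL
grafana_url: http://admin:admin@10.10.10.10:3000
```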

v0.4.0

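The second public beta, v0.4, is now officially released!

Monitoring system

Pigsty v0.4 overhauls the monitoring system and selects 10 dashboards as the standard open-source Pigsty content. It also includes extensive adaptation work for the incompatible Grafana 7.3 upgrade. The upgraded pg_exporter v0.3.1 is used as the default metrics exporter, and the dashboard links in the alerting rules have been adjusted.

Pigsty open-source edition

The Pigsty open-source edition selects the following 10 dashboards as open-source content. The other dashboards are offered as optional commercial-support content.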

PG Overview
PG Cluster
PG Service
PG Instance
PG Database
PG Query
PG Table
PG Table Catalog
PG Table Detail
Node

Although slightly trimmed down, these 10 dashboards still outclass all comparable software in what they cover.

Software upgrades

Pigsty v0.4 includes extensive software adaptation work, including:

Upgrade to PostgreSQL 13.1, Patroni 2.0.1-4, add citus to repo.
Upgrade to pg_exporter 0.3.1
Upgrade to Grafana 7.3, tons of compatibility work
Upgrade to prometheus 2.23, with new UI as default
Upgrade to consul 1.9

Other improvements

Update prometheus alert rules
Fix alertmanager info links
Fix bugs and typos.
Add a simple backup script

Offline installation package

The v0.4 offline installation package (CentOS 7.8) can now be downloaded from Github: pkg.tgz

v0.3.0

The first public beta of Pigsty is now released!

Monitoring system

Pigsty v0.3 includes the following 8 dashboards as open-source content:

PG Overview
PG Cluster
PG Service
PG Instance
PG Database
PG Table Overview
PG Table Catalog
Node

Offline installation package

The v0.3 offline installation package (CentOS 7.8) can now be downloaded from Github: pkg.tgz

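21 - Vulnerabilities and defects

Security vulnerabilities, bugs, and fix announcements

PIGSTY-20231201

Issue ID: PIGSTY-20231201, etcd filling up makes PG HA unavailable

Severity: critical, please schedule a fix immediately

Affected versions: Pigsty v2.0.0 - v2.5.1, fixed in Pigsty v2.6.0

Description: etcd sets a default database size limit of 2 GB. If your etcd database exceeds this limit, etcd rejects write requests, which can prevent the PostgreSQL high availability mechanism that depends on etcd from working properly. In addition, etcd's data model creates a new version on every write, so if your etcd cluster is written to frequently, even with only a handful of keys, the etcd database can keep growing and fail once it reaches the size limit.

Fix: upgrade Pigsty to v2.6.0 or later, or update the code under roles/etcd and re-run ./etcd.yml to forcibly reset the etcd cluster.

Key configuration update: roles/etcd/templates/etcd.conf.j2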
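
The template contents are not reproduced here. As a rough sketch of the kind of settings that address this class of problem, etcd's configuration file accepts a backend quota and automatic compaction options. The values below are illustrative assumptions, not necessarily those shipped in Pigsty v2.6.0.

```yaml
# Sketch of etcd configuration-file settings; values are assumptions for illustration
quota-backend-bytes: 8589934592     # raise the backend size limit from the 2 GB default to 8 GiB
auto-compaction-mode: periodic      # compact old key revisions on a time basis
auto-compaction-retention: "24h"    # keep roughly the last 24 hours of revisions
```

Compaction discards superseded revisions, which is what keeps frequent writes to a few keys from growing the database without bound. Reclaiming the freed space on disk additionally requires defragmentation (etcdctl defrag).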

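22 - Cheap VPS

Pigsty's documentation site is hosted on a cheap ClawCloud server

As you can see, this site is hosted on "Claw Cloud", a Singapore-based "youth edition" of Alibaba Cloud.

Rumor has it that this is a front that Alibaba Cloud set up in Singapore.

I use a China-optimized cloud server with 4c8g, a 200 GB disk, 1 Gbps bandwidth, and 2 TB of traffic per month, for $18 a month, hosted in the HK availability zone.

That is much cheaper than the EC2-style instances sold by Alibaba Cloud, Tencent Cloud, or AWS, especially for traffic. Charging 0.8 CNY per GB of traffic in mainland China is simply robbery.

Access from mainland China is fairly fast, with a ping of about 50 ms to the Hong Kong region, so I use it for this site and also run the Pigsty demo on it.

If you want a server to build a site or set up a TZ, this one is worth considering. The referral link below saves you 10%, and I earn a commission that helps cover server costs:

Of course, if you want to support the development of this project, you can also choose a more direct way:

Scan the Alipay QR code. Thank you for your support 🙏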