ETCD is a distributed, reliable key-value store for the most critical data of a distributed system.
Configuration | Administration | Playbook | Dashboard | Parameter
Pigsty uses etcd as the DCS: Distributed Configuration Store (or distributed consensus service), which is critical to PostgreSQL high availability & auto-failover.
You have to install the ETCD module before any PGSQL modules, since patroni & vip-manager rely on etcd to work, unless you are using an external etcd cluster.
You don't need the NODE module to install ETCD, but it requires a valid CA under your local files/pki/ca. Check the ETCD Administration SOP for more details.
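For a quick sanity check, the CA files should already exist on the admin node (assuming the defaults from Pigsty's bootstrap, where the CA cert & key are named ca.crt / ca.key):

ls files/pki/ca/             # check the local CA directory
# expected: ca.crt  ca.key   # the self-signed CA used to sign etcd certs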
You have to define an etcd cluster before deploying it. There are some parameters about etcd. It is recommended to have at least 3 instances in a serious production environment.
Define a group etcd in the inventory; it will create a singleton etcd instance.
# etcd cluster for ha postgres
etcd: { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }
This is good enough for development, testing & demonstration, but not recommended in a serious production environment.
You can also define an etcd cluster with multiple nodes.
etcd: # dcs service for postgres/patroni ha consensus
hosts: # 1 node for testing, 3 or 5 for production
10.10.10.10: { etcd_seq: 1 } # etcd_seq required
10.10.10.11: { etcd_seq: 2 } # assign from 1 ~ n
10.10.10.12: { etcd_seq: 3 } # odd number please
vars: # cluster level parameter override roles/etcd
etcd_cluster: etcd # mark etcd cluster name etcd
etcd_safeguard: false # safeguard against purging
etcd_clean: true # purge etcd during init process
You can use more nodes in a production environment, but 3 or 5 nodes are recommended. Remember to use an odd number for the cluster size.
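For example, a 5-node cluster just extends the same pattern with consecutive etcd_seq numbers (a sketch only; the two extra IP addresses are placeholders):

etcd:
  hosts:
    10.10.10.10: { etcd_seq: 1 }
    10.10.10.11: { etcd_seq: 2 }
    10.10.10.12: { etcd_seq: 3 }
    10.10.10.13: { etcd_seq: 4 }   # placeholder IP
    10.10.10.14: { etcd_seq: 5 }   # placeholder IP
  vars: { etcd_cluster: etcd }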
Here are some useful administration tasks for etcd: creating and destroying clusters, setting up the client environment, reloading config, and adding or removing members.
To create a new etcd cluster, define it in the inventory and run the etcd.yml playbook against it. If etcd_safeguard is true, or etcd_clean is false, the playbook will abort if any running etcd instance exists, to prevent purging etcd by accident (an override example follows the snippet below).
etcd:
hosts:
10.10.10.10: { etcd_seq: 1 }
10.10.10.11: { etcd_seq: 2 }
10.10.10.12: { etcd_seq: 3 }
vars: { etcd_cluster: etcd }
./etcd.yml # init etcd module on group 'etcd'
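If a running instance is present and you really do intend to re-initialize it, both parameters can be overridden at runtime with Ansible extra vars, which take precedence over inventory values (a sketch; double-check the target hosts before running):

./etcd.yml -e etcd_clean=true -e etcd_safeguard=false   # force re-init despite a running instance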
To destroy an etcd cluster, use the etcd_clean subtask of etcd.yml. Do think before you type:
./etcd.yml -t etcd_clean # remove the entire cluster, honors the etcd_safeguard
./etcd.yml -t etcd_purge # purge with brute force, ignores the etcd_safeguard
Here's an example of the client environment config. Pigsty uses the etcd v3 API by default.
alias e="etcdctl"
alias em="etcdctl member"
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://10.10.10.10:2379
export ETCDCTL_CACERT=/etc/pki/ca.crt
export ETCDCTL_CERT=/etc/etcd/server.crt
export ETCDCTL_KEY=/etc/etcd/server.key
CRUD
You can do CRUD with the following commands.
e put a 10 ; e get a; e del a ; # V3 API
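Beyond basic key operations, a few read-only commands are handy for checking cluster health; these are standard etcdctl v3 subcommands, assuming the environment variables above are in place:

e endpoint health              # check whether each configured endpoint is healthy
e endpoint status -w table     # show leader, raft term & db size per endpoint
em list -w table               # list cluster members in table form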
If etcd cluster membership changes, we need to refresh etcd endpoint references.
To refresh the etcd config file /etc/etcd/etcd.conf on existing members:
./etcd.yml -t etcd_conf # refresh /etc/etcd/etcd.conf with latest status
ansible etcd -f 1 -b -a 'systemctl restart etcd' # optional: restart etcd
To refresh etcdctl client environment variables:
$ ./etcd.yml -t etcd_env # refresh /etc/profile.d/etcdctl.sh
To update the etcd endpoints reference on patroni:
./pgsql.yml -t pg_conf # regenerate patroni config
ansible all -f 1 -b -a 'systemctl reload patroni' # reload patroni config
To update the etcd endpoints reference on vip-manager (optional, if you are using an L2 VIP):
./pgsql.yml -t pg_vip_config # regenerate vip-manager config
ansible all -f 1 -b -a 'systemctl restart vip-manager' # restart vip-manager to use new config
See the official ETCD reference on adding a member for more details.
You can add new members to an existing etcd cluster in 5 steps:

1. Issue an etcdctl member add command to tell the existing cluster that a new member is coming (use learner mode)
2. Add the new instance to the etcd group in the inventory
3. Init the new instance with etcd_init=existing, so it joins the existing cluster rather than creating a new one (VERY IMPORTANT)
4. Promote the new member from learner to follower
5. Update etcd endpoint references on patroni / vip-manager and reload their config

Short version:
etcdctl member add <etcd-?> --learner=true --peer-urls=https://<new_ins_ip>:2380
./etcd.yml -l <new_ins_ip> -e etcd_init=existing
etcdctl member promote <new_ins_server_id>
Here are the details. Let's start from a single etcd instance:
etcd:
hosts:
10.10.10.10: { etcd_seq: 1 } # <--- this is the existing instance
10.10.10.11: { etcd_seq: 2 } # <--- add this new member definition to inventory
vars: { etcd_cluster: etcd }
Add a learner instance etcd-2 to the cluster with etcdctl member add:
# tell the existing cluster that a new member etcd-2 is coming
$ etcdctl member add etcd-2 --learner=true --peer-urls=https://10.10.10.11:2380
Member 33631ba6ced84cf8 added to cluster 6646fbcf5debc68f
ETCD_NAME="etcd-2"
ETCD_INITIAL_CLUSTER="etcd-2=https://10.10.10.11:2380,etcd-1=https://10.10.10.10:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.10.11:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
Check the member list with etcdctl member list (or em list); we can see an unstarted member:
33631ba6ced84cf8, unstarted, , https://10.10.10.11:2380, , true
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false
Init the new etcd instance etcd-2 with the etcd.yml playbook; after that, we can see the new member is started:
$ ./etcd.yml -l 10.10.10.11 -e etcd_init=existing # etcd_init=existing must be set
...
33631ba6ced84cf8, started, etcd-2, https://10.10.10.11:2380, https://10.10.10.11:2379, true
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false
Promote the new member from learner to follower:
$ etcdctl member promote 33631ba6ced84cf8 # promote the new learner
Member 33631ba6ced84cf8 promoted in cluster 6646fbcf5debc68f
$ em list # check again, the new member is started
33631ba6ced84cf8, started, etcd-2, https://10.10.10.11:2380, https://10.10.10.11:2379, false
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false
The new member is added; don't forget to reload the config (recapped below).
Repeat the steps above to add more members. Remember to use at least 3 members for production.
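After membership changes, refresh the endpoint references as described in the reload-config section above; a minimal recap:

./etcd.yml -t etcd_conf                              # refresh /etc/etcd/etcd.conf on existing members
ansible etcd -f 1 -b -a 'systemctl restart etcd'     # optional: rolling restart of etcd
./pgsql.yml -t pg_conf                               # regenerate patroni config
ansible all -f 1 -b -a 'systemctl reload patroni'    # reload patroni config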
To remove a member from an existing etcd cluster, it usually takes 3 steps:

1. Comment it out from the inventory and reload the config
2. Issue the etcdctl member remove <server_id> command to kick it out of the cluster
3. Temporarily restore it in the inventory, purge the instance with the etcd.yml playbook, then remove it from the inventory permanently

Here are the details. Let's start from a 3-instance etcd cluster:
etcd:
hosts:
10.10.10.10: { etcd_seq: 1 }
10.10.10.11: { etcd_seq: 2 }
10.10.10.12: { etcd_seq: 3 } # <---- comment this line, then reload-config
vars: { etcd_cluster: etcd }
Then, you'll have to actually kick it out of the cluster with the etcdctl member remove command:
$ etcdctl member list
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false
33631ba6ced84cf8, started, etcd-2, https://10.10.10.11:2380, https://10.10.10.11:2379, false
93fcf23b220473fb, started, etcd-3, https://10.10.10.12:2380, https://10.10.10.12:2379, false # <--- remove this
$ etcdctl member remove 93fcf23b220473fb # kick it from cluster
Member 93fcf23b220473fb removed from cluster 6646fbcf5debc68f
Finally, you have to shut down the instance and purge it from the node. Uncomment the member in the inventory temporarily, then purge it with the etcd.yml playbook:
./etcd.yml -t etcd_purge -l 10.10.10.12 # purge it (the member is in inventory again)
After that, remove the member from the inventory permanently. All clear!
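As a final check, the removed instance should no longer show up in the member list (member IDs here are from the example above):

$ etcdctl member list
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false
33631ba6ced84cf8, started, etcd-2, https://10.10.10.11:2380, https://10.10.10.11:2379, false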
There's a built-in playbook, etcd.yml, for installing the etcd cluster. But you have to define the cluster in the inventory first.
./etcd.yml # install etcd cluster on group 'etcd'
Here are the available subtasks:

- etcd_assert: generate etcd identity
- etcd_install: install etcd rpm packages
- etcd_clean: cleanup existing etcd
  - etcd_check: check etcd instance is running
  - etcd_purge: remove running etcd instance & data
- etcd_dir: create etcd data & conf dir
- etcd_config: generate etcd config
  - etcd_conf: generate etcd main config
  - etcd_cert: generate etcd ssl cert
- etcd_launch: launch etcd service
- etcd_register: register etcd to prometheus

If etcd_safeguard is true, or etcd_clean is false, the playbook will abort if any running etcd instance exists, to prevent purging etcd by accident.
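Subtasks can be combined with a host limit for fine-grained operations; for example, a sketch of regenerating the cert and config on a single member (the IP is from the examples above), then restarting it:

./etcd.yml -t etcd_cert,etcd_conf -l 10.10.10.11      # regenerate ssl cert & main config on one member
ansible 10.10.10.11 -b -a 'systemctl restart etcd'    # restart that member to pick up the new config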
There is one dashboard for the ETCD module:
ETCD Overview: overview of the ETCD cluster
There are 10 parameters for the ETCD module.
| Parameter | Type | Level | Comment |
|---|---|---|---|
| etcd_seq | int | I | etcd instance identifier, REQUIRED |
| etcd_cluster | string | C | etcd cluster & group name, etcd by default |
| etcd_safeguard | bool | G/C/A | prevent purging running etcd instance? |
| etcd_clean | bool | G/C/A | purge existing etcd during initialization? |
| etcd_data | path | C | etcd data directory, /data/etcd by default |
| etcd_port | port | C | etcd client port, 2379 by default |
| etcd_peer_port | port | C | etcd peer port, 2380 by default |
| etcd_init | enum | C | etcd initial cluster state, new or existing |
| etcd_election_timeout | int | C | etcd election timeout, 1000ms by default |
| etcd_heartbeat_interval | int | C | etcd heartbeat interval, 100ms by default |
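Most of these are cluster-level (C) parameters, so they are usually overridden under the etcd group vars; a sketch with the defaults spelled out explicitly (values are illustrative, not a tuning recommendation):

etcd:
  hosts:
    10.10.10.10: { etcd_seq: 1 }
  vars:
    etcd_cluster: etcd               # cluster & group name
    etcd_data: /data/etcd            # data directory
    etcd_port: 2379                  # client port
    etcd_peer_port: 2380             # peer port
    etcd_election_timeout: 1000      # election timeout (ms)
    etcd_heartbeat_interval: 100     # heartbeat interval (ms)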