This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

模块:DOCKER

Docker Daemon 服务,允许用户一键拉起容器化的无状态软件工具模板,加装各种功能。

Docker 是流行的容器平台,提供了标准化的软件交付能力。 安装 | 下载 | 配置 | 管理 | 剧本 | 参数


概览

尽管我们不赞成将 Docker 用于有状态的重要数据库上,但它对于无状态的应用软件而言是一个相当优雅的解决方案。

请注意,Docker 并不是 Pigsty 的 核心功能模块,默认也并不会在 Pigsty 安装过程中从互联网下载至本地。

您可以选择在有互联网环境的节点直接使用官方仓库进行在线安装,或者在没有互联网环境的节点上,通过 Pigsty 下载 Docker 软件包至本地软件源后使用。


安装

对于有互联网环境的节点,执行以下命令,可以在 Pigsty 托管的节点上添加 Docker 仓库并在线安装 Docker。

./node.yml -t node_install -e '{"node_repo_modules":"node,infra,docker","node_packages":["docker-ce,docker-compose-plugin"]}'

在这种情况下,您需要自行配置 Docker 的软件源,以及 Docker 的软件包名称。当然您也可以选择使用 Pigsty docker.yml 剧本对安装好的 Docker 进行进一步的配置。


下载

因此如果您的环境没有互联网访问,或者节点需要通过本地软件源来统一安装的 Docker 版本并加速部署,您需要将 Docker 的软件包添加到本地软件源中,以下命令会将 Docker 下载到本地软件源中:

./infra.yml -t repo_build -e '{"repo_modules":"node,docker","repo_packages":["docker-ce docker-compose-plugin"],"repo_extra_packages":[],"repo_url_packages":[]}'

配置

关于 Docker 模块,用户通常只需要关心一个参数 docker_enabled: true,决定是否在节点上安装 Docker。

带有此标记的节点,会在执行 docker.yml 剧本时,安装 Docker 软件包并启动 Docker 服务。

infra:
  hosts:
    10.10.10.10: { infra_seq: 1 ,nodename: infra-1 }
    10.10.10.11: { infra_seq: 2 ,nodename: infra-2 }
  vars:
    docker_enabled: true
    node_id_from_pg: false
    node_cluster: infra
    node_conf: oltp

管理


剧本

Pigsty 提供了一个用于安装 Docker 模块的剧本

docker.yml

在节点上安装 Docker 的任务 docker.yml 包含了以下子任务:

docker_install   : 在节点上安装 Docker,Docker Compose 软件包
docker_admin     : 将指定的用户加入 Docker 管理员用户组中
docker_config    : 生成 Docker 守护进程服务配置文件
docker_launch    : 启动 Docker 守护进程服务
docker_image     : 尝试从 /tmp/docker/*.tgz 加载镜像(如果存在)

参数

docker_enabled: false             # enable docker on this node?
docker_cgroups_driver: systemd    # docker cgroup fs driver: cgroupfs,systemd
docker_registry_mirrors: []       # docker registry mirror list
docker_exporter_port: 9323        # docker metrics exporter port, 9323 by default
docker_image_cache: /tmp/docker   # docker image cache dir, `/tmp/docker` by default

docker_enabled

参数名称: docker_enabled, 类型: bool, 层次:C

是否在当前节点启用Docker?默认为: false,即不启用。

docker_cgroups_driver

参数名称: docker_cgroups_driver, 类型: enum, 层次:C

Docker使用的 CGroup FS 驱动,可以是 cgroupfssystemd,默认值为: systemd

docker_registry_mirrors

参数名称: docker_registry_mirrors, 类型: string[], 层次:C

Docker使用的镜像仓库地址,默认值为:[] 空数组。

您可以使用Docker镜像站点加速镜像拉取,下面是一些例子:

["https://mirror.ccs.tencentyun.com"]           # 腾讯云内网的镜像站点
["https://registry.cn-hangzhou.aliyuncs.com"]   # 阿里云镜像站点,需要登陆

如果拉取速度太慢,您也可以考虑:docker login quay.io 使用其他的 Registry。

docker_exporter_port

参数名称: docker_exporter_port, 类型: port, 层次:G

Docker 暴露指标使用的端口,默认为 9323

docker_image_cache

参数名称: docker_image_cache, 类型: path, 层次:C

本地的Docker镜像离线缓存包路径, 默认为 /tmp/docker

在该路径下以 tgz 结尾的文件会被逐个 load 到 Docker 中:

cat {{ docker_image_cache }}/*.tgz | gzip -d -c - | docker load


1 - 指标

Pigsty Docker 模块提供的完整监控指标列表与释义

DOCKER 模块包含有 123 类可用监控指标。

Metric Name Type Labels Description
builder_builds_failed_total counter ip, cls, reason, ins, job, instance Number of failed image builds
builder_builds_triggered_total counter ip, cls, ins, job, instance Number of triggered image builds
docker_up Unknown ip, cls, ins, job, instance N/A
engine_daemon_container_actions_seconds_bucket Unknown ip, cls, ins, job, instance, le, action N/A
engine_daemon_container_actions_seconds_count Unknown ip, cls, ins, job, instance, action N/A
engine_daemon_container_actions_seconds_sum Unknown ip, cls, ins, job, instance, action N/A
engine_daemon_container_states_containers gauge ip, cls, ins, job, instance, state The count of containers in various states
engine_daemon_engine_cpus_cpus gauge ip, cls, ins, job, instance The number of cpus that the host system of the engine has
engine_daemon_engine_info gauge ip, cls, architecture, ins, job, instance, os_version, kernel, version, graphdriver, os, daemon_id, commit, os_type The information related to the engine and the OS it is running on
engine_daemon_engine_memory_bytes gauge ip, cls, ins, job, instance The number of bytes of memory that the host system of the engine has
engine_daemon_events_subscribers_total gauge ip, cls, ins, job, instance The number of current subscribers to events
engine_daemon_events_total counter ip, cls, ins, job, instance The number of events logged
engine_daemon_health_checks_failed_total counter ip, cls, ins, job, instance The total number of failed health checks
engine_daemon_health_check_start_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
engine_daemon_health_check_start_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
engine_daemon_health_check_start_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
engine_daemon_health_checks_total counter ip, cls, ins, job, instance The total number of health checks
engine_daemon_host_info_functions_seconds_bucket Unknown ip, cls, ins, job, instance, le, function N/A
engine_daemon_host_info_functions_seconds_count Unknown ip, cls, ins, job, instance, function N/A
engine_daemon_host_info_functions_seconds_sum Unknown ip, cls, ins, job, instance, function N/A
engine_daemon_image_actions_seconds_bucket Unknown ip, cls, ins, job, instance, le, action N/A
engine_daemon_image_actions_seconds_count Unknown ip, cls, ins, job, instance, action N/A
engine_daemon_image_actions_seconds_sum Unknown ip, cls, ins, job, instance, action N/A
engine_daemon_network_actions_seconds_bucket Unknown ip, cls, ins, job, instance, le, action N/A
engine_daemon_network_actions_seconds_count Unknown ip, cls, ins, job, instance, action N/A
engine_daemon_network_actions_seconds_sum Unknown ip, cls, ins, job, instance, action N/A
etcd_debugging_snap_save_marshalling_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
etcd_debugging_snap_save_marshalling_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
etcd_debugging_snap_save_marshalling_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
etcd_debugging_snap_save_total_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
etcd_debugging_snap_save_total_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
etcd_debugging_snap_save_total_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
etcd_disk_wal_fsync_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
etcd_disk_wal_fsync_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
etcd_disk_wal_fsync_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
etcd_disk_wal_write_bytes_total gauge ip, cls, ins, job, instance Total number of bytes written in WAL.
etcd_snap_db_fsync_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
etcd_snap_db_fsync_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
etcd_snap_db_fsync_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
etcd_snap_db_save_total_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
etcd_snap_db_save_total_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
etcd_snap_db_save_total_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
etcd_snap_fsync_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
etcd_snap_fsync_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
etcd_snap_fsync_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
go_gc_duration_seconds summary ip, cls, ins, job, instance, quantile A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
go_gc_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
go_goroutines gauge ip, cls, ins, job, instance Number of goroutines that currently exist.
go_info gauge ip, cls, ins, job, version, instance Information about the Go environment.
go_memstats_alloc_bytes counter ip, cls, ins, job, instance Total number of bytes allocated, even if freed.
go_memstats_alloc_bytes_total counter ip, cls, ins, job, instance Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes gauge ip, cls, ins, job, instance Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total counter ip, cls, ins, job, instance Total number of frees.
go_memstats_gc_sys_bytes gauge ip, cls, ins, job, instance Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes gauge ip, cls, ins, job, instance Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes gauge ip, cls, ins, job, instance Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes gauge ip, cls, ins, job, instance Number of heap bytes that are in use.
go_memstats_heap_objects gauge ip, cls, ins, job, instance Number of allocated objects.
go_memstats_heap_released_bytes gauge ip, cls, ins, job, instance Number of heap bytes released to OS.
go_memstats_heap_sys_bytes gauge ip, cls, ins, job, instance Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds gauge ip, cls, ins, job, instance Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total counter ip, cls, ins, job, instance Total number of pointer lookups.
go_memstats_mallocs_total counter ip, cls, ins, job, instance Total number of mallocs.
go_memstats_mcache_inuse_bytes gauge ip, cls, ins, job, instance Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes gauge ip, cls, ins, job, instance Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes gauge ip, cls, ins, job, instance Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes gauge ip, cls, ins, job, instance Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes gauge ip, cls, ins, job, instance Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes gauge ip, cls, ins, job, instance Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes gauge ip, cls, ins, job, instance Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes gauge ip, cls, ins, job, instance Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes gauge ip, cls, ins, job, instance Number of bytes obtained from system.
go_threads gauge ip, cls, ins, job, instance Number of OS threads created.
logger_log_entries_size_greater_than_buffer_total counter ip, cls, ins, job, instance Number of log entries which are larger than the log buffer
logger_log_read_operations_failed_total counter ip, cls, ins, job, instance Number of log reads from container stdio that failed
logger_log_write_operations_failed_total counter ip, cls, ins, job, instance Number of log write operations that failed
process_cpu_seconds_total counter ip, cls, ins, job, instance Total user and system CPU time spent in seconds.
process_max_fds gauge ip, cls, ins, job, instance Maximum number of open file descriptors.
process_open_fds gauge ip, cls, ins, job, instance Number of open file descriptors.
process_resident_memory_bytes gauge ip, cls, ins, job, instance Resident memory size in bytes.
process_start_time_seconds gauge ip, cls, ins, job, instance Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes gauge ip, cls, ins, job, instance Virtual memory size in bytes.
process_virtual_memory_max_bytes gauge ip, cls, ins, job, instance Maximum amount of virtual memory available in bytes.
promhttp_metric_handler_requests_in_flight gauge ip, cls, ins, job, instance Current number of scrapes being served.
promhttp_metric_handler_requests_total counter ip, cls, ins, job, instance, code Total number of scrapes by HTTP status code.
scrape_duration_seconds Unknown ip, cls, ins, job, instance N/A
scrape_samples_post_metric_relabeling Unknown ip, cls, ins, job, instance N/A
scrape_samples_scraped Unknown ip, cls, ins, job, instance N/A
scrape_series_added Unknown ip, cls, ins, job, instance N/A
swarm_dispatcher_scheduling_delay_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_dispatcher_scheduling_delay_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_dispatcher_scheduling_delay_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_manager_configs_total gauge ip, cls, ins, job, instance The number of configs in the cluster object store
swarm_manager_leader gauge ip, cls, ins, job, instance Indicates if this manager node is a leader
swarm_manager_networks_total gauge ip, cls, ins, job, instance The number of networks in the cluster object store
swarm_manager_nodes gauge ip, cls, ins, job, instance, state The number of nodes
swarm_manager_secrets_total gauge ip, cls, ins, job, instance The number of secrets in the cluster object store
swarm_manager_services_total gauge ip, cls, ins, job, instance The number of services in the cluster object store
swarm_manager_tasks_total gauge ip, cls, ins, job, instance, state The number of tasks in the cluster object store
swarm_node_manager gauge ip, cls, ins, job, instance Whether this node is a manager or not
swarm_raft_snapshot_latency_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_raft_snapshot_latency_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_raft_snapshot_latency_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_raft_transaction_latency_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_raft_transaction_latency_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_raft_transaction_latency_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_store_batch_latency_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_store_batch_latency_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_store_batch_latency_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_store_lookup_latency_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_store_lookup_latency_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_store_lookup_latency_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_store_memory_store_lock_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_store_memory_store_lock_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_store_memory_store_lock_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_store_read_tx_latency_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_store_read_tx_latency_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_store_read_tx_latency_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_store_write_tx_latency_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_store_write_tx_latency_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_store_write_tx_latency_seconds_sum Unknown ip, cls, ins, job, instance N/A
up Unknown ip, cls, ins, job, instance N/A

2 - 常见问题

Pigsty Docker 模块常见问题答疑

使用代理服务器

在 Docker 安装过程中,如果 proxy_env 参数存在, 这里的 HTTP 代理服务器配置会被写入到 /etc/docker/daemon.json 配置文件中。

Docker 在从上游 Registry 拉取镜像时,会使用此代理服务器。

小提示,在执行 configure 过程中使用 -x 参数会将当前环境中的代理服务器配置写入到 proxy_env 中。


使用镜像站点

如果您在中国大陆受到功夫网影响,可以考虑使用墙内可用的 Docker 镜像站点,例如 quay.io:

docker login quay.io    # 输入用户名密码,完成登陆

2024-06 更新,国内所有可用 Docker 镜像站点均已被墙,请使用代理服务器访问并拉取。


将Docker纳入监控

在 Docker 模块安装过程中

针对节点单独执行监控目标注册子任务 docker_registerregister_prometheus 即可:

./docker.yml -l <your-node-selector> -t register_prometheus

使用软件模板

Pigsty 提供了一系列使用 Docker Compose 拉起的软件工具模板,可以开箱即用。

但需要首先安装 Docker 模块。