Ansible Best Practices: Optimizing Playbook Execution Speed

Had I not seen the Sun, I could have borne the shade. - Emily Dickinson

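Preface

  • Today I'd like to share some notes on optimizing Playbook execution speed in Ansible
  • The post covers eight optimizations that make sensible use of configurable resources to speed up Playbook execution
  • Prerequisites
    • Basic knowledge of Ansible
    • Familiarity with writing Ansible playbooks
  • If I have misunderstood anything, corrections are welcome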



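Optimizing Playbook Execution

The main optimizations are:

  • Optimize the infrastructure
  • Disable fact gathering
  • Increase task parallelism
  • Avoid loops with package manager modules
  • Copy files efficiently
  • Use templates instead of lineinfile
  • Optimize SSH connections
  • Enable pipelining

Let's look at each of them in turn.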

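Optimizing the Infrastructure

Running the latest version of Ansible can improve the performance of playbooks that use the Ansible core modules. Also keep the control node as close to the managed nodes as possible, because Ansible relies heavily on network communication and data transfer.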

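Disabling Fact Gathering

Setting the gather_facts directive to false skips fact gathering. This is safe only when the playbook does not depend on variables derived from the collected host information, for example a playbook that just installs packages and uses none of the gathered system facts. Skipping the step can save a lot of time, especially once the number of managed hosts grows large.

Let's look at a real run. If a play does not explicitly disable fact gathering, and no gathering policy is set explicitly in the configuration, the play gathers facts by default.

---
- name: do not become
  hosts: all
  tasks:
    - name: sleep 2
      shell: sleep 2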

The playbook above gathers facts by default, and TASK [Gathering Facts] shows up in the output:

$ time ansible-playbook  fact.yaml
PLAY [do not become] ***********************************************************************************************
TASK [Gathering Facts] *********************************************************************************************
ok: [servera]
ok: [serverb]
ok: [serverc]
.............
real 0m10.204s
user 0m1.874s
sys 0m1.610s

The run took 10.204s. Now disable fact gathering in the playbook with gather_facts: false and compare:

---
- name: do not become
  hosts: all
  gather_facts: false
  tasks:
    - name: sleep 2
      shell: sleep 2

The run now takes 6.928s, roughly 3.3 seconds faster:

$ vim +3 fact.yaml
$ time ansible-playbook fact.yaml

PLAY [do not become] ***********************************************************************************************
TASK [sleep 2] *****************************************************************************************************
changed: [serverd]
changed: [serverc]
changed: [serverb]
.......................
real 0m6.928s
user 0m1.329s
sys 0m0.581s

Fact gathering can also be controlled through a setting in the configuration file, where it applies globally:

$ cat /etc/ansible/ansible.cfg | grep -i  gather
# plays will gather facts by default, which contain information about
# smart - gather by default, but don't regather if already gathered
# implicit - gather by default, turn off with gather_facts: False
# explicit - do not gather by default, must say gather_facts: True
#gathering = implicit
......

Setting gathering=explicit disables fact gathering globally:

$ cat ansible.cfg
[defaults]
inventory=inventory
remote_user=devops
roles_path=roles
gathering=explicit

[privilege_escalation]
become=True
become_method=sudo
become_user=root
become_ask_pass=False

With the gather_facts line removed from the playbook, the run time is about the same as before:

$ sed '4d' fact.yaml -i
$ time ansible-playbook fact.yaml
PLAY [do not become] ***********************************************************************************************
TASK [sleep 2] *****************************************************************************************************
changed: [servere]
changed: [serverd]
.......
real 0m7.323s
user 0m0.939s
sys 0m1.124s
$
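
When a play needs only a few facts, a middle ground is to keep the implicit gathering disabled and call the setup module explicitly with a restricted subset. The following is a minimal sketch of that idea, not something taken from the original playbooks; the play and task names are made up for illustration.

```yaml
---
- name: Gather only the facts a play needs
  hosts: all
  gather_facts: false            # skip the implicit, full fact gathering
  tasks:
    - name: Collect the minimal fact subset on demand
      setup:
        gather_subset:
          - "!all"               # with !all, only the minimal subset is collected

    - name: Use one of the gathered facts
      debug:
        msg: "{{ inventory_hostname }} runs {{ ansible_facts['distribution'] }}"
```

This still costs one extra task per host, so it only pays off when the full fact set is noticeably slower to collect than the minimal one.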

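Increasing Parallelism

Increasing parallelism means controlling how many managed hosts a command is dispatched to at once. This is governed by the forks parameter; more precisely, it is the number of connections Ansible can have active at the same time. By default it is set to 5.

$ ansible-config  dump | grep -i fork
DEFAULT_FORKS(default) = 5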

The value can be set in the Ansible configuration file or passed to the ansible-playbook command with the -f option.

In the configuration file:

$ cat ansible.cfg
[defaults]
inventory=inventory
remote_user=devops
forks=10

On the command line:

ansible-playbook fact.yaml -f 10

$ time ansible-playbook  fact.yaml  -f 10
PLAY [do not become] ***********************************************************************************************
TASK [sleep 2] *****************************************************************************************************
changed: [serverf]
changed: [servere]
changed: [serverd]
changed: [serverb]
.....
real 0m4.163s
user 0m1.013s
sys 0m0.471s
$

With fact gathering already disabled (gather_facts: false), setting forks=10 brings the run time down to about 4 seconds, compared with about 7 seconds at the default forks value and about 10 seconds for the original playbook.
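
To confirm which forks value a run actually uses, one option is to print it from inside a play. The sketch below assumes the Ansible version in use provides the ansible_forks special variable; the play and task names are illustrative.

```yaml
---
- name: Show the effective forks value
  hosts: all
  gather_facts: false
  tasks:
    - name: Print the maximum number of forks for this run
      debug:
        msg: "forks = {{ ansible_forks }}"
      run_once: true             # one line of output is enough
```

Running it once with and once without -f 10 should show whether the configuration file or the command line option is taking effect.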

Avoiding Loops with Package Manager Modules

Some modules accept a list of items to process, so there is no need for a loop. The module is then called once instead of many times. For example, installing packages with the yum module:

- name: Install the packages on the web servers
  hosts: all

  tasks:
    - name: Ensure the packages are installed
      yum:
        name:
          - httpd
          - mod_ssl
          - httpd-tools
          - mariadb-server
          - mariadb
          - php
          - php-mysqlnd
        state: present

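This is equivalent to:

$ yum -y install httpd mod_ssl httpd-tools \
    mariadb-server mariadb php php-mysqlnd

With a loop, the module is invoked once per item, so every package gets its own module execution and its own yum run, each with its own startup cost. The list form above needs only a single invocation.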

- name: Install the packages on the web servers
  hosts: all

  tasks:
    - name: Ensure the packages are installed
      yum:
        name: "{{ item }}"
        state: present
      loop:
        - httpd
        - mod_ssl
        - httpd-tools
        - mariadb-server
        - mariadb
        - php
        - php-mysqlnd

This is equivalent to:

$ yum install httpd
$ yum install mod_ssl
$ yum install httpd-tools
$ yum install mariadb-server
$ yum install mariadb
$ yum install php
$ yum install php-mysqlnd

Note: not every module accepts a list for its name parameter.
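
The list does not have to be written inline. As a small sketch, the package names can live in a variable and be handed to the module in one go; web_packages is a hypothetical variable name introduced only for this example.

```yaml
---
- name: Install the packages on the web servers
  hosts: all
  vars:
    # Hypothetical variable; it could equally come from group_vars or a role default
    web_packages:
      - httpd
      - mod_ssl
      - php
  tasks:
    - name: Ensure the packages are installed in a single module call
      yum:
        name: "{{ web_packages }}"   # the whole list is passed at once, no loop
        state: present
```

Keeping the list in a variable makes it easy to vary the package set per host group while still calling yum only once per host.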

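Copying Files to Managed Hosts Efficiently

When copying a large number of files to managed hosts, the synchronize module is usually more efficient. It uses rsync to synchronize the files, and by default rsync compares file size and modification time and skips anything that is already up to date, which is very fast. In most cases this makes synchronize faster than the copy module. The copy module also detects unchanged files, but it does so by computing a checksum of each file, and it transfers files through the SSH connection (SFTP or SCP), which is slower for large files and large directory trees.

Create a 1 GiB file for testing:

$ dd if=/dev/zero of=bigfile1 bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.431348 s, 2.5 GB/s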

With the file not yet present on the managed hosts, copy it with synchronize:

---
- name: Deploy the web content on the web servers
  hosts: all
  become: True
  gather_facts: False
  tasks:
    - name: copy demo
      synchronize:
        src: bigfile1
        dest: /tmp/

The run takes 26.146s:

$ time ansible-playbook  copy_task.yaml
PLAY [Deploy the web content on the web servers] *******************************************************************
TASK [copy demo] ***************************************************************************************************
changed: [servere]
changed: [serverf]
changed: [serverb]
changed: [servera]
changed: [serverd]
changed: [serverc]
PLAY RECAP *********************************************************************************************************
servera : ok=1 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverb : ok=1 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverc : ok=1 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverd : ok=1 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
servere : ok=1 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverf : ok=1 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

real 0m26.146s
user 0m50.033s
sys 0m2.382s

Now let's try the copy module:

---
- name: Deploy the web content on the web servers
  hosts: all
  become: True
  gather_facts: False
  tasks:
    - name: copy demo
      #synchronize:
      copy:
        src: bigfile1
        dest: /tmp/

Running it:

$ time ansible-playbook  copy_task.yaml
PLAY [Deploy the web content on the web servers] *******************************************************************
TASK [copy demo] ***************************************************************************************************
ok: [serverc]
ok: [serverd]
ok: [serverf]
ok: [servera]
ok: [serverb]
ok: [servere]

PLAY RECAP *********************************************************************************************************
servera : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverb : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverc : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverd : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
servere : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverf : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

real 0m14.868s
user 0m12.273s
sys 0m5.125s

The copy module takes 14.868s. Note that the file was already on the managed hosts from the previous run, so every host reports ok and nothing is transferred; most of the time presumably goes into checksumming the 1 GiB file. This run is therefore not a like-for-like comparison with the first synchronize run. Now switch back to synchronize and run it again:

$ cat copy_task.yaml
---
- name: Deploy the web content on the web servers
  hosts: all
  become: True
  gather_facts: False
  tasks:
    - name: copy demo
      synchronize:
      #copy:
        src: bigfile1
        dest: /tmp/

$ time ansible-playbook  copy_task.yaml
PLAY [Deploy the web content on the web servers] *******************************************************************
TASK [copy demo] ***************************************************************************************************
ok: [servere]
ok: [serverb]
ok: [serverd]
ok: [servera]
ok: [serverf]
ok: [serverc]

PLAY RECAP *********************************************************************************************************
servera : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverb : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverc : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverd : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
servere : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverf : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

real 0m2.022s
user 0m1.757s
sys 0m0.568s

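This time the run takes only about 2 seconds, because rsync sees that the file is already in place. So choose according to the situation: the copy module is fine for a file that is known to be new, while synchronize is the better choice when the files may already exist on the managed hosts.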
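
By default rsync decides whether a file is up to date from its size and modification time. If content comparison matters more than speed, the synchronize module has a checksum option for that. The following is a sketch under the assumption that rsync is installed on both the control node and the managed hosts; the play and task names are illustrative.

```yaml
---
- name: Deploy a large file with rsync
  hosts: all
  become: true
  gather_facts: false
  tasks:
    - name: Copy bigfile1, comparing file content rather than size and mtime
      synchronize:
        src: bigfile1
        dest: /tmp/
        checksum: true           # slower, since both sides must read the whole file
```

For the 1 GiB test file above, enabling checksum would trade away part of the speed advantage seen in the last run, so it is best kept for cases where timestamps cannot be trusted.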

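Using Templates

The lineinfile module inserts or removes lines in a file, and it is not very efficient when combined with a loop; use the template module instead. In short, lineinfile suits a small number of changes to a configuration file, such as turning off swap or changing the SELinux setting. For configuration files such as Nginx's, a template file is more efficient.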
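
As a rough sketch of the template approach, a single task can render the whole file instead of looping lineinfile over individual settings. The template path, the variable, and its value are hypothetical and only illustrate the shape of such a task.

```yaml
---
- name: Render a whole configuration file from a template
  hosts: all
  become: true
  gather_facts: false
  vars:
    worker_connections: 1024               # hypothetical value used inside the template
  tasks:
    - name: Deploy nginx.conf in a single task
      template:
        src: templates/nginx.conf.j2       # hypothetical template file on the control node
        dest: /etc/nginx/nginx.conf
        mode: "0644"
```

One template task means one module execution per host, however many settings the file contains.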

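Optimizing SSH Connections

Establishing an SSH connection can be a slow process for Ansible. To mitigate this, Ansible relies on two features provided by SSH:

  • ControlMaster allows multiple simultaneous SSH sessions to the same remote host to share a single network connection. The first SSH session establishes the connection, and other sessions to the same host reuse it, bypassing the slow initial handshake. SSH destroys the shared connection as soon as the last session closes.
  • ControlPersist keeps the connection open in the background instead of destroying it after the last session. This allows later SSH sessions to reuse the connection. ControlPersist specifies how long SSH should keep an idle connection open, and each new session resets this idle timer.

Ansible enables ControlMaster and ControlPersist through the ssh_args directive in the [ssh_connection] section of the Ansible configuration file, and both are on by default.

The default value of ssh_args:

$ ansible-config  dump | grep -i master
ANSIBLE_SSH_ARGS(default) = -C -o ControlMaster=auto -o ControlPersist=60s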

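Setting it explicitly:

[ssh_connection]
ssh_args=-o ControlMaster=auto -o ControlPersist=60s

If the forks value or the ControlPersist setting is large, the control node may hold more concurrent connections. Make sure the control node has enough file handles available to support many active network connections.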
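
The same SSH arguments can also be supplied as a variable rather than in ansible.cfg, which is handy for trying out a longer ControlPersist on one play. This sketch assumes the ssh connection plugin's ansible_ssh_args variable; the 120s value and the task names are arbitrary choices for illustration.

```yaml
---
- name: Keep shared SSH connections open longer for this play
  hosts: all
  gather_facts: false
  vars:
    # Overrides ssh_args for this play only; 120s is an arbitrary example value
    ansible_ssh_args: "-C -o ControlMaster=auto -o ControlPersist=120s"
  tasks:
    - name: First task opens the shared connection
      ping:

    - name: Later tasks reuse it
      ping:
```

Because the variable replaces the default arguments entirely, ControlMaster has to be repeated alongside ControlPersist.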

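Enabling Pipelining

To run a task on a remote node, Ansible performs several SSH operations: it copies the module and all of its data to the remote node and then executes the module. To improve playbook performance, you can enable pipelining, and Ansible will then establish fewer SSH connections. To enable it, set pipelining to True in the [ssh_connection] section of the Ansible configuration file:

[ssh_connection]
pipelining = True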

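This feature is not enabled by default because it requires the requiretty sudo option to be disabled on the managed hosts. When requiretty is set, sudo can be used only from a session with a tty; with it disabled (!requiretty), sudo also works without an interactive shell or session.

To disable it, the managed hosts need the following configuration:

[root@servera student]# cat /etc/sudoers | grep -C 4 visiblepw

#
# Refuse to run if unable to disable echo on the tty.
#
Defaults !visiblepw
Defaults !requiretty
#
# Preserving HOME has security implications since many programs
# use it when searching for configuration files. Note that HOME
[root@servera student]#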
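
The sudoers change can itself be rolled out with Ansible before pipelining is switched on. The sketch below assumes that the managed hosts include /etc/sudoers.d from their main sudoers file and that visudo lives at /usr/sbin/visudo; the drop-in file name is hypothetical.

```yaml
---
- name: Prepare managed hosts for pipelining
  hosts: all
  become: true
  gather_facts: false
  vars:
    ansible_pipelining: false    # requiretty may still be active during this play
  tasks:
    - name: Ensure sudo does not require a tty
      lineinfile:
        path: /etc/sudoers.d/ansible_pipelining   # hypothetical drop-in file name
        line: "Defaults !requiretty"
        create: true
        mode: "0440"
        validate: /usr/sbin/visudo -cf %s         # reject the file if the syntax is invalid
```

Validating with visudo before the file is put in place avoids locking sudo out with a malformed rule.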

Hands-on Example

The starting playbook:

[student@workstation task-speed]$ cat deploy_webservers.yml
---
- name: Deploy the web servers
  hosts: web_servers
  become: True

  tasks:
    - name: Ensure required packages are installed
      yum:
        name: "{{ item }}"
        state: present
      loop:
        - httpd
        - mod_ssl
        - httpd-tools
        - mariadb-server
        - mariadb
        - php
        - php-mysqlnd

    - name: Ensure the services are enabled
      service:
        name: "{{ item }}"
        state: started
        enabled: True
      loop:
        - httpd
        - mariadb

    - name: Ensure the web content is installed
      copy:
        src: web_content/
        dest: /var/www/html

The configuration file enables the timer and profile_tasks callbacks so that the duration of each task is reported:

[defaults]
inventory=inventory.yml
remote_user=devops
callback_whitelist=timer,profile_tasks

First run, before optimization:

[student@workstation task-speed]$ ansible-playbook  deploy_webservers.yml

PLAY [Deploy the web servers] ****************************************************************************

TASK [Gathering Facts] ***********************************************************************************
Tuesday 16 August 2022 00:08:41 +0800 (0:00:00.033) 0:00:00.033 ********
ok: [serverb.lab.example.com]
ok: [serverc.lab.example.com]

TASK [Ensure required packages are installed] ************************************************************
Tuesday 16 August 2022 00:08:42 +0800 (0:00:01.165) 0:00:01.198 ********
changed: [serverc.lab.example.com] => (item=httpd)
changed: [serverb.lab.example.com] => (item=httpd)
changed: [serverc.lab.example.com] => (item=mod_ssl)
changed: [serverb.lab.example.com] => (item=mod_ssl)
ok: [serverc.lab.example.com] => (item=httpd-tools)
ok: [serverb.lab.example.com] => (item=httpd-tools)
changed: [serverc.lab.example.com] => (item=mariadb-server)
changed: [serverb.lab.example.com] => (item=mariadb-server)
ok: [serverc.lab.example.com] => (item=mariadb)
ok: [serverb.lab.example.com] => (item=mariadb)
changed: [serverc.lab.example.com] => (item=php)
changed: [serverb.lab.example.com] => (item=php)
changed: [serverc.lab.example.com] => (item=php-mysqlnd)
changed: [serverb.lab.example.com] => (item=php-mysqlnd)

TASK [Ensure the services are enabled] *******************************************************************
Tuesday 16 August 2022 00:09:00 +0800 (0:00:18.639) 0:00:19.838 ********
changed: [serverc.lab.example.com] => (item=httpd)
changed: [serverb.lab.example.com] => (item=httpd)
changed: [serverc.lab.example.com] => (item=mariadb)
changed: [serverb.lab.example.com] => (item=mariadb)

TASK [Ensure the web content is installed] ***************************************************************
Tuesday 16 August 2022 00:09:03 +0800 (0:00:02.720) 0:00:22.558 ********
changed: [serverc.lab.example.com]
changed: [serverb.lab.example.com]

PLAY RECAP ***********************************************************************************************
serverb.lab.example.com : ok=4 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverc.lab.example.com : ok=4 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

Tuesday 16 August 2022 00:09:38 +0800 (0:00:34.566) 0:00:57.124 ********
===============================================================================
Ensure the web content is installed -------------------------------------------------------------- 34.57s
Ensure required packages are installed ----------------------------------------------------------- 18.64s
Ensure the services are enabled ------------------------------------------------------------------- 2.72s
Gathering Facts ----------------------------------------------------------------------------------- 1.17s
Playbook run took 0 days, 0 hours, 0 minutes, 57 seconds
[student@workstation task-speed]$

Second run, after optimization. There is no Gathering Facts task any more, the packages are installed in a single module call per host, and the total time drops from 57 to 33 seconds:

[student@workstation task-speed]$ ansible-playbook deploy_webservers.yml

PLAY [Deploy the web servers] ****************************************************************************

TASK [Ensure required packages are installed] ************************************************************
Tuesday 16 August 2022 00:16:20 +0800 (0:00:00.035) 0:00:00.035 ********
changed: [serverb.lab.example.com]
changed: [serverc.lab.example.com]

TASK [Ensure the services are enabled] *******************************************************************
Tuesday 16 August 2022 00:16:30 +0800 (0:00:10.005) 0:00:10.041 ********
changed: [serverc.lab.example.com] => (item=httpd)
changed: [serverb.lab.example.com] => (item=httpd)
changed: [serverb.lab.example.com] => (item=mariadb)
changed: [serverc.lab.example.com] => (item=mariadb)

TASK [Ensure the web content is installed] ***************************************************************
Tuesday 16 August 2022 00:16:33 +0800 (0:00:03.142) 0:00:13.184 ********
changed: [serverc.lab.example.com]
changed: [serverb.lab.example.com]

PLAY RECAP ***********************************************************************************************
serverb.lab.example.com : ok=3 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
serverc.lab.example.com : ok=3 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

Tuesday 16 August 2022 00:16:54 +0800 (0:00:20.718) 0:00:33.902 ********
===============================================================================
Ensure the web content is installed -------------------------------------------------------------- 20.72s
Ensure required packages are installed ----------------------------------------------------------- 10.01s
Ensure the services are enabled ------------------------------------------------------------------- 3.14s
Playbook run took 0 days, 0 hours, 0 minutes, 33 seconds
[student@workstation task-speed]$
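
The revised playbook behind the second run is not shown in the post. A version consistent with that output (no fact gathering, one yum call per host, services still looped) might look like the sketch below. The switch from copy to synchronize for the web content is an assumption that follows the advice in this post; the output alone does not confirm which module was used.

```yaml
---
- name: Deploy the web servers
  hosts: web_servers
  become: true
  gather_facts: false            # no task here uses facts

  tasks:
    - name: Ensure required packages are installed
      yum:
        name:                    # a list instead of a loop: one module call per host
          - httpd
          - mod_ssl
          - httpd-tools
          - mariadb-server
          - mariadb
          - php
          - php-mysqlnd
        state: present

    - name: Ensure the services are enabled
      service:
        name: "{{ item }}"
        state: started
        enabled: true
      loop:
        - httpd
        - mariadb

    - name: Ensure the web content is installed
      synchronize:               # assumption: rsync-based copy in place of the copy module
        src: web_content/
        dest: /var/www/html
```

The service task keeps its loop because the service module handles one service per call.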

References


《Red Hat Ansible Engine 2.8 DO447》


That's all I have to share on optimizing Playbook execution speed in Ansible. Keep going ^_^

Published: 2022-05-14

Updated: 2023-06-21
