Commit c5d488059d9407f1b9b96552159ffc298c8dc547 is invalidating sshd_config

Bug #1442239 reported by Bjoern
This bug affects 5 people
Affects            Status        Importance  Assigned to    Milestone
openstack-ansible  Invalid       Undecided   Unassigned
Juno               Fix Released  Medium      Matt Thompson
Trunk              Invalid       Undecided   Unassigned

Bug Description

If sshd_config does not end with a newline, this commit simply appends MaxStartups or MaxSessions to the existing last line, which invalidates sshd_config and causes all sshd processes to die.

Please fix regex:

diff --git a/rpc_deployment/roles/common/tasks/ssh_config.yml b/rpc_deployment/roles/common/tasks/ssh_config.yml
index 49c8791..89a4670 100644
--- a/rpc_deployment/roles/common/tasks/ssh_config.yml
+++ b/rpc_deployment/roles/common/tasks/ssh_config.yml
@@ -16,14 +16,14 @@
 - name: set max sessions
   lineinfile:
     dest: /etc/ssh/sshd_config
-    regexp: 'MaxSessions'
+    regexp: '^MaxSessions.*'
     line: "MaxSessions 500"
   notify:
     - restart ssh
 - name: set max startups
   lineinfile:
     dest: /etc/ssh/sshd_config
-    regexp: 'MaxStartups'
+    regexp: '^MaxStartups.*'
     line: "MaxStartups 500"
   notify:
     - restart ssh

Andy McCrae (andrew-mccrae) wrote :

Can you show us your ssh file before that ran?
I don't see how the original would break anything (also it'd really break testing hard if this was crashing ssh).

The lineinfile module replaces whole lines (and also only replaces 1 line, the last occurrence) - but it would replace a full line, so it shouldn't append anything to a line.

The regex before is doing a search so it'd replace the last occurrence of a line with "MaxSessions" in it with a line that is "MaxSessions 500". (Same for MaxStartups) - replacing the whole line, and includes os.linesep.

We need to determine that this isn't a larger bug related to the lineinfile module itself or some other issue.
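
The replace behavior described in this comment can be sketched in a few lines of Python. This is a simplified model for illustration only, not Ansible's source; the function name `replace_last_match` and the sample lines are hypothetical, and the case where nothing matches is deliberately left out.

```python
import os
import re


def replace_last_match(lines, pattern, new_line):
    """Replace the last line that matches `pattern` with `new_line`.

    Simplified model of the behavior described in the comment, not
    Ansible's code. Returns a new list; if no line matches, the input
    is returned unchanged (the no-match case is not modeled here).
    """
    regex = re.compile(pattern)
    last_index = None
    for index, current in enumerate(lines):
        # re.search matches anywhere inside the line, not only at the start.
        if regex.search(current.rstrip("\r\n")):
            last_index = index
    result = list(lines)
    if last_index is not None:
        # The whole line is swapped out, including its line ending.
        result[last_index] = new_line + os.linesep
    return result


# Hypothetical sample: a commented-out default plus one real setting.
config = ["#MaxSessions 10\n", "PasswordAuthentication yes\n"]
updated = replace_last_match(config, "MaxSessions", "MaxSessions 500")
```

In this model the unanchored pattern picks up the commented-out line and replaces it wholesale; nothing is ever appended to the end of an existing line.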

Bjoern (bjoern-t) wrote :

My sshd_config had just

PasswordAuthentication yes

with an ending newline and looked like

PasswordAuthentication yesMaxSessions 500

after running the host-setup.
I did test my statement already. Nonetheless, the regex that was defined in 10.1.2 is not good.
If you intend to replace a whole line, you should have a regex reflecting this.

Bjoern (bjoern-t) wrote :

update :

My sshd_config had just

PasswordAuthentication yes

without an ending newline and looked like

PasswordAuthentication yesMaxSessions 500

after running the host-setup.
I did test my statement already. Nonetheless, the regex that was defined in 10.1.2 is not good.
If you intend to replace a whole line, you should have a regex reflecting this.
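
The symptom is consistent with a new line being appended to content that lacks a trailing newline. lineinfile is documented to add the line (at the end of the file by default) when the regexp matches nothing, which is the case for a file containing only PasswordAuthentication. The sketch below shows how a naive append produces the fused line; whether Ansible 1.6.10 does exactly this internally is an assumption, and both function names are hypothetical.

```python
def append_line_naively(content, new_line):
    """Append `new_line` without checking how `content` ends.

    Hypothetical illustration of the failure mode, not Ansible's code.
    """
    return content + new_line + "\n"


def append_line_safely(content, new_line):
    """Append `new_line`, first terminating an unterminated last line."""
    if content and not content.endswith("\n"):
        content += "\n"
    return content + new_line + "\n"


# File content as reported: a single setting with no ending newline.
without_newline = "PasswordAuthentication yes"

fused = append_line_naively(without_newline, "MaxSessions 500")
fixed = append_line_safely(without_newline, "MaxSessions 500")
```

Here `fused` reproduces the reported `PasswordAuthentication yesMaxSessions 500` line, while `fixed` keeps the two settings on separate lines.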

Andy McCrae (andrew-mccrae) wrote :

The lineinfile module replaces the whole line - that's what the module does.
So if that's not happening then the module itself isn't working - which is my point.

The regex means "search for the line to replace"; the line directive is the text that replaces the line found by the regexp. E.g. if the regex isn't matched nothing should happen.
It's also using the Python regex search function, which matches if it finds the specified string anywhere inside a string - in the case of this module it pulls out each line individually and searches for the string inside each line.
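
The difference between the original pattern and the proposed anchored one under Python's `re.search` can be checked directly. The sample lines below are hypothetical; note that neither pattern matches a PasswordAuthentication line, so for a file like the one reported both patterns behave the same way.

```python
import re

# Hypothetical sshd_config lines, already stripped of line endings.
lines = ["#MaxSessions 10", "MaxSessions 10", "PasswordAuthentication yes"]

unanchored = re.compile("MaxSessions")     # pattern from the original task
anchored = re.compile("^MaxSessions.*")    # pattern proposed in the diff

# search() scans the whole string, so the unanchored pattern also hits
# the commented-out line; the leading ^ restricts the match to the start.
unanchored_hits = [line for line in lines if unanchored.search(line)]
anchored_hits = [line for line in lines if anchored.search(line)]
```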

Bjoern (bjoern-t) wrote :

Yeah, it looks like it just added this string in this case.
I removed the MaxSessions addition and reran it with my regex and it worked correctly: it made a newline and added the string. So the module is working differently depending on the regex. I can only advise fixing this regex; a leading ^ would have prevented this mess.

Melvin Hillsman (mrhillsman) wrote :

It does not appear that the regular expression is the issue. I believed the issue to be related to the actual /etc/ssh/sshd_config file, as the last byte differs between the two test environments I set up, one of which shows this bug while the other does not. The one that shows the bug does not have the EOF [0a] byte at the end:

I rebuilt nodes in the failing environment and only ran the common-setup playbook which resulted in getting this bug again:
BEFORE RUNNING THE PLAYBOOK:
root@melv7301-infra1:/opt/os-ansible-deployment/rpc_deployment# xxd /etc/ssh/sshd_config
...
00009d0: 0a50 6173 7377 6f72 6441 7574 6865 6e74  .PasswordAuthent
00009e0: 6963 6174 696f 6e20 7965 73               ication yes
root@melv7301-infra1:/opt/os-ansible-deployment/rpc_deployment#
root@melv7301-infra1:/opt/os-ansible-deployment/rpc_deployment# ssh n02
Welcome to Ubuntu 14.04.2 LTS (GNU/Linux 3.13.0-46-generic x86_64)

 * Documentation: https://help.ubuntu.com/
Last login: Mon Apr 13 23:36:26 2015 from ...
root@melv7301-infra2:~#

RUNNING THE PLAYBOOK
root@melv7301-infra1:/opt/os-ansible-deployment/rpc_deployment# ansible-playbook -vvvv -e @/etc/rpc_deploy/user_variables.yml playbooks/setup/setup-common.yml
...
PLAY RECAP ********************************************************************
compute1 : ok=51 changed=36 unreachable=0 failed=0
compute2 : ok=51 changed=36 unreachable=0 failed=0
infra1 : ok=51 changed=36 unreachable=0 failed=0
infra2 : ok=51 changed=36 unreachable=0 failed=0
infra3 : ok=51 changed=36 unreachable=0 failed=0
logger1 : ok=51 changed=36 unreachable=0 failed=0
storage1 : ok=51 changed=36 unreachable=0 failed=0

root@melv7301-infra1:/opt/os-ansible-deployment/rpc_deployment# ssh n02
ssh: connect to host n02 port 22: Connection refused
root@melv7301-infra1:/opt/os-ansible-deployment/rpc_deployment#

root@melv7301-infra1:~# xxd /etc/ssh/sshd_config
...
00009e0: 6e20 7965 734d 6178 5365 7373 696f 6e73  n yesMaxSessions
00009f0: 2035 3030 0a                              500.
root@melv7301-infra1:~#

Again, to test, I accessed each server via its console, resolved the SSH login issue, and removed the last byte [EOF] along with the MaxSessions 500 addition from two of them. I then ran the same playbook again [common-setup.yml]; the two nodes without the EOF byte both reproduced this bug while the others did not.

[I used xxd, wc -c, and truncate to ensure I removed the 0a byte from the files on the two mentioned nodes]
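
For anyone reproducing this without xxd and truncate, the same check and byte removal can be sketched in Python. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and the demo works on a temporary file rather than the real /etc/ssh/sshd_config.

```python
import os
import tempfile


def ends_with_line_feed(path):
    """Return True if the last byte of the file is 0x0a."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        if handle.tell() == 0:
            return False  # an empty file has no last byte
        handle.seek(-1, os.SEEK_END)
        return handle.read(1) == b"\n"


def strip_final_line_feed(path):
    """Drop a single trailing 0x0a byte, if present."""
    if ends_with_line_feed(path):
        os.truncate(path, os.path.getsize(path) - 1)


# Demo on a scratch copy, not on the live configuration file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sshd_config")
    with open(sample, "wb") as handle:
        handle.write(b"PasswordAuthentication yes\n")
    before = ends_with_line_feed(sample)
    strip_final_line_feed(sample)
    after = ends_with_line_feed(sample)
    remaining = os.path.getsize(sample)
```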

So finally, I noticed that the versions of Ansible in my two test environments were different. This bug is related to the version of Ansible being used [1.6.10] in the requirements.txt file, and adding ^ to the beginning of the regular expression did not resolve it. I have not yet determined how the code mishandles a file that is missing the 0a byte at the end; however, Ansible v1.9.0.1 does not have this problem.

Unless there is a reason to keep 1.6.10 in the requirements.txt file then...


Shane Cunningham (appprod0) wrote :

I've run across this bug but only on public cloud servers. I think it has to do with the sshd_config file being used in the public cloud image of 14.04. Following Melvin's debugging.

Public Cloud Server - Ubuntu Server 14.04 LTS PVHVM

root@test:~# xxd /etc/ssh/sshd_config
...

00009d0: 0a50 6173 7377 6f72 6441 7574 6865 6e74  .PasswordAuthent
00009e0: 6963 6174 696f 6e20 7965 73               ication yes

14.04 ISO from Canonical

root@111111-infra01:~# xxd /etc/ssh/sshd_config
...
00009d0: 7469 6361 7469 6f6e 2074 6f20 276e 6f27  tication to 'no'
00009e0: 2e0a 5573 6550 414d 2079 6573 0a         ..UsePAM yes.

The 14.04 ISO contains the EOF byte, the Rackspace public cloud 14.04 image does not.

Bjoern (bjoern-t) wrote :

As I already said, the missing line feed (0a) is triggering this issue. Personally I don't care how it gets fixed, but upgrading Ansible probably requires a full deployment test and, in the worst case, updates to playbooks.

Shane Cunningham (appprod0) wrote :

Just clarifying: 0a is line feed, so the issue seems to be that the lineinfile module, or the regex being used in ssh_config.yml, does not find a line feed character at the end of sshd_config.

Melvin Hillsman (mrhillsman) wrote :

Apologies, I noticed only the regex in Bjoern's comments and missed his mention of the line feed. But it is definitely an issue with the version of Ansible.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-ansible-deployment (juno)

Fix proposed to branch: juno
Review: https://review.openstack.org/176758

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-ansible-deployment (juno)

Reviewed: https://review.openstack.org/176758
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=390763b587c2908f6f5727f7c4c3df048bbac3d0
Submitter: Jenkins
Branch: juno

commit 390763b587c2908f6f5727f7c4c3df048bbac3d0
Author: Matt Thompson <email address hidden>
Date: Thu Apr 23 15:14:34 2015 +0100

    Add newline to /etc/ssh/sshd_config if necessary

    On some deploys we see /etc/ssh/sshd_config missing a trailing newline,
    which breaks the sshd_config file when the lineinfile tasks are run.
    This has been traced back to the version of ansible being deployed in
    juno as we do not have this issue in later branches using newer
    versions of ansible.

    Alternatively, we could upgrade ansible in juno but I would prefer to
    make the smallest change possible since juno is now widely deployed.

    Closes-Bug: #1442239

    Change-Id: If0f66753f40b378f6f2c8d256e1eb5b4d59fd64e
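
The idea in the commit message (add a newline only if one is missing) can be sketched as follows. This is not the change merged in the review above, whose contents are not shown in this thread; the function name `ensure_trailing_newline` is hypothetical and the demo runs against a temporary file.

```python
import os
import tempfile


def ensure_trailing_newline(path):
    """Append a line feed if a non-empty file does not end with one.

    Returns True if the file was modified, False otherwise, so running
    it a second time is a no-op.
    """
    with open(path, "rb+") as handle:
        handle.seek(0, os.SEEK_END)
        if handle.tell() == 0:
            return False  # nothing to terminate in an empty file
        handle.seek(-1, os.SEEK_END)
        if handle.read(1) == b"\n":
            return False
        handle.seek(0, os.SEEK_END)  # reposition explicitly before writing
        handle.write(b"\n")
        return True


# Demo on a scratch file that mimics the reported public cloud image.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sshd_config")
    with open(sample, "wb") as handle:
        handle.write(b"PasswordAuthentication yes")
    first_run = ensure_trailing_newline(sample)
    second_run = ensure_trailing_newline(sample)
    with open(sample, "rb") as handle:
        final_content = handle.read()
```

Running such a step before the lineinfile tasks keeps the change small, which matches the stated preference not to upgrade Ansible in juno.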
