ec2: zesty tempfile sandbox dhclient.pid file can't be created
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
cloud-init | Fix Released | High | Chad Smith |
cloud-init (Ubuntu) | Fix Released | Medium | Unassigned |
Xenial | Fix Released | Medium | Unassigned |
Zesty | Fix Released | Medium | Unassigned |
Artful | Fix Released | Medium | Unassigned |
Bionic | Fix Released | Medium | Unassigned |
Bug Description
=== Begin SRU Template ===
[Impact]
EC2 instances could hit a race condition with tempdir removal in which dhclient does not write its pidfile and DataSourceEc2Local hits a traceback trying to read the nonexistent pidfile. The traceback causes the instance to fall back and be discovered in the init-network stage as DataSourceEc2. This thrashing costs instances a couple of extra seconds of boot time while the datasource is rediscovered in a different stage.
[Test Case]
# Launch instance under test
$ for release in xenial zesty artful; do
echo "Handling $release";
launch-ec2 --series $release;
ssh ubuntu@
ssh ubuntu@
ssh ubuntu@
ssh ubuntu@
ssh ubuntu@
# Show upgrade without restart doesn't break
ssh ubuntu@
# Show clean install doesn't break
ssh ubuntu@
ssh ubuntu@
ssh ubuntu@
# Assert no intermittent tracebacks from dhcp_discovery and no leaked dhclient processes;
ssh ubuntu@
sudo ps -afe |grep dhclient;
done
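The final assertion in the test case (no tracebacks from dhcp_discovery) can also be checked programmatically once the log has been fetched. The following is a minimal sketch, assuming standard Python traceback formatting in the log (indented frame lines followed by an unindented exception line); the function name `find_tracebacks` and its `needle` parameter are hypothetical and not part of cloud-init.

```python
def find_tracebacks(log_text, needle="dhcp_discovery"):
    """Return the traceback blocks in log_text that mention needle.

    Hypothetical helper: assumes each traceback starts with the usual
    'Traceback (most recent call last):' header, continues with indented
    frame lines, and ends at the first unindented line (the exception).
    """
    blocks = []
    current = None
    for line in log_text.splitlines():
        if line.startswith("Traceback (most recent call last):"):
            if current is not None:
                blocks.append(current)
            current = [line]
        elif current is not None:
            current.append(line)
            if not line.startswith(" "):
                # Unindented line: the exception summary closes the block.
                blocks.append(current)
                current = None
    if current is not None:
        # Log ended in the middle of a traceback; keep what was collected.
        blocks.append(current)
    joined = ["\n".join(block) for block in blocks]
    return [text for text in joined if needle in text]
```

An empty result for the contents of the instance's cloud-init log would correspond to the "no tracebacks" condition above.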
[Regression Potential]
A regression would still result in tracebacks in DataSourceEc2Local, which would cause cloud-init to fall back to DataSourceEc2 in the init-network stage.
[Other Info]
Upstream commit at
https:/
=== End SRU Template ===
=== Original Description ===
Saw an issue once on EC2 zesty image with 17.1.41 during SRU testing.
It looks like dhclient was unable to create the pid file (from syslog):
#### syslog
Nov 30 04:20:35 ip-10-0-20-176 cloud-init[440]: Cloud-init v. 17.1 running 'init-local' at Thu, 30 Nov 2017 04:20:32 +0000. Up 7.16 seconds.
Nov 30 04:20:35 ip-10-0-20-176 cloud-init[440]: 2017-11-30 04:20:32,768 - util.py[WARNING]: Getting data from <class 'cloudinit.
Nov 30 04:20:35 ip-10-0-20-176 dhclient[669]: Can't create /var/tmp/
#### end syslog
cloud-init then hits a traceback when trying to read the temporary pid file that our dhclient run was expected to create during Ec2Local setup. Maybe we exited the dhcp run before we could read the pid file?
...
2017-11-30 04:20:32,738 - util.py[DEBUG]: Running command ['ip', 'link', 'set', 'dev', 'eth0', 'up'] with allowed return codes [0] (shell=False, capture=True)
2017-11-30 04:20:32,744 - util.py[DEBUG]: Running command ['/var/
2017-11-30 04:20:32,768 - util.py[DEBUG]: Reading from /var/tmp/
2017-11-30 04:20:32,768 - handlers.py[DEBUG]: finish: init-local/
2017-11-30 04:20:32,768 - util.py[WARNING]: Getting data from <class 'cloudinit.
2017-11-30 04:20:32,768 - util.py[DEBUG]: Getting data from <class 'cloudinit.
Traceback (most recent call last):
File "/usr/lib/
if s.get_data():
File "/usr/lib/
return super(DataSourc
File "/usr/lib/
self.
File "/usr/lib/
return dhcp_discovery(
File "/usr/lib/
pid = int(util.
File "/usr/lib/
with open(fname, 'rb') as ifh:
FileNotFoundError: [Errno 2] No such file or directory: '/var/tmp/
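The traceback above comes from reading the pid file immediately after dhclient is started. One way to tolerate that race, sketched here under assumptions, is to poll for the file with a timeout before parsing it. The helper name `read_pid_when_ready`, its parameters, and the default timings are hypothetical and are not taken from cloud-init; the upstream change may work differently.

```python
import time


def read_pid_when_ready(pid_file, timeout=5.0, poll_interval=0.1):
    """Return the integer PID stored in pid_file, or None on timeout.

    Hypothetical helper: retries while the file is missing, empty, or
    only partially written, instead of failing on the first read.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(pid_file, "rb") as ifh:
                content = ifh.read().strip()
            if content:
                return int(content)
        except FileNotFoundError:
            pass  # The daemon has not created the file yet.
        except ValueError:
            pass  # The file exists but does not hold a complete number yet.
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)
```

A caller would treat a `None` result as "no pid file was written" and skip any step that depends on the PID, rather than letting `FileNotFoundError` propagate out of datasource discovery.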
Related branches
- Server Team CI bot: Approve (continuous-integration)
- Scott Moser: Pending requested
Diff: 667 lines (+221/-77), 15 files modified
cloudinit/net/dhcp.py (+29/-15)
cloudinit/net/network_state.py (+8/-0)
cloudinit/net/sysconfig.py (+15/-0)
cloudinit/net/tests/test_dhcp.py (+61/-5)
cloudinit/sources/DataSourceAzure.py (+3/-26)
cloudinit/util.py (+22/-0)
debian/changelog (+13/-0)
tests/cloud_tests/images/nocloudkvm.py (+15/-7)
tests/cloud_tests/instances/nocloudkvm.py (+5/-3)
tests/cloud_tests/platforms/nocloudkvm.py (+11/-10)
tests/cloud_tests/releases.yaml (+16/-0)
tests/cloud_tests/setup_image.py (+3/-3)
tests/cloud_tests/snapshots/nocloudkvm.py (+11/-6)
tests/unittests/test_datasource/test_azure.py (+3/-2)
tests/unittests/test_net.py (+6/-0)
- Server Team CI bot: Approve (continuous-integration)
- Scott Moser: Approve
Diff: 320 lines (+118/-48), 5 files modified
cloudinit/net/dhcp.py (+29/-15)
cloudinit/net/tests/test_dhcp.py (+61/-5)
cloudinit/sources/DataSourceAzure.py (+3/-26)
cloudinit/util.py (+22/-0)
tests/unittests/test_datasource/test_azure.py (+3/-2)
description: | updated |
summary: |
- ec2: temp dhclient.pid file issues
+ ec2: zesty tempfile sandbox dhclient.pid file can't be created |
Changed in cloud-init: | |
assignee: | nobody → Chad Smith (chad.smith) |
status: | Triaged → In Progress |
Changed in cloud-init: | |
status: | In Progress → Fix Committed |
Changed in cloud-init (Ubuntu Xenial): | |
status: | New → Confirmed |
Changed in cloud-init (Ubuntu Zesty): | |
status: | New → Confirmed |
Changed in cloud-init (Ubuntu Artful): | |
status: | New → Confirmed |
Changed in cloud-init (Ubuntu Xenial): | |
importance: | Undecided → Medium |
Changed in cloud-init (Ubuntu Zesty): | |
importance: | Undecided → Medium |
Changed in cloud-init (Ubuntu Artful): | |
importance: | Undecided → Medium |
Changed in cloud-init (Ubuntu Bionic): | |
importance: | Undecided → Medium |
Changed in cloud-init (Ubuntu Xenial): | |
status: | Confirmed → Fix Committed |
cloud-init log from Ec2Local failure to read dhclient.pid file