test_004_can_access_metadata_over_public_ip fails intermittently

Bug #821242 reported by Dan Prince
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Low
Assigned to: Dan Prince

Bug Description

The test_004_can_access_metadata_over_public_ip smoketest fails intermittently (1 out of 4 times) when run on a set of freshly configured servers. When it fails, we get the following error:

Traceback (most recent call last):
  File "/usr/lib/python2.6/unittest.py", line 279, in run
    testMethod()
  File "/root/nova_source/smoketests/test_netadmin.py", line 157, in test_004_can_access_metadata_over_public_ip
    while not self.__public_instance_is_accessible():
  File "/root/nova_source/smoketests/test_netadmin.py", line 118, in __public_instance_is_accessible
    raise Exception("Wrong instance id")
Exception: Wrong instance id

Vish Ishaya (vishvananda) wrote : Re: [Bug 821242] [NEW] test_004_can_access_metadata_over_public_ip fails intermittently

It would be useful to know what it is getting for the instance id. Is it getting another id, meaning that the metadata server is somehow not getting the IP correctly, or is it just failing to get a valid response from the metadata server? It would be great if we could change it to log what it actually got via the request. It may be a timing issue where there is a moment during the association where it is not working, or there is some intermittent state where the metadata server is failing, for example the occasional error we have been seeing regarding lazy load of networks. Can you try putting some more logging in there for what the actual response is, and also see if there is any error logged in nova-api?
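A minimal sketch of the kind of logging asked for here, assuming a hypothetical helper named `check_instance_id` (not the actual code in test_netadmin.py). It puts both the expected ID and the raw response into the failure message, so a wrong ID can be told apart from an error page or an empty reply.

```python
def check_instance_id(expected_id, response_body):
    """Compare a metadata response against the expected instance ID.

    Hypothetical helper for illustration only. On a mismatch it raises
    with both values in the message so the failure is diagnosable.
    """
    actual = response_body.strip()
    if actual != expected_id:
        raise Exception(
            "Wrong instance id. Expected: %s, Got: %s" % (expected_id, actual)
        )
    return True
```

With a message like this, a test report shows at a glance whether the server returned a different instance ID or something that is not an ID at all.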

Vish

On Aug 4, 2011, at 7:58 PM, Dan Prince wrote:

> Public bug reported:
>
> The test_004_can_access_metadata_over_public_ip fails intermittently (1
> out of 4 times) when run on a set of freshly configured servers. When it
> fails we get the following error:
>
> Traceback (most recent call last):
> File "/usr/lib/python2.6/unittest.py", line 279, in run
> testMethod()
> File "/root/nova_source/smoketests/test_netadmin.py", line 157, in test_004_can_access_metadata_over_public_ip
> while not self.__public_instance_is_accessible():
> File "/root/nova_source/smoketests/test_netadmin.py", line 118, in __public_instance_is_accessible
> raise Exception("Wrong instance id")
> Exception: Wrong instance id

Dan Prince (dan-prince)
Changed in nova:
status: New → In Progress
assignee: nobody → Dan Prince (dan-prince)
importance: Undecided → Low

Dan Prince (dan-prince) wrote :

Hey Vish. Just pushed a branch to add some extra logging for the failure. I can't get it to happen for me locally. It seems to happen more often in our CI jobs up on Jenkins. Here is a job log that contains the failure along with some extra API logs.

https://jenkins.openstack.org/job/nova-vpc/163/testReport/junit/smoketests/test_netadmin/SecurityGroupTests_test_004_can_access_metadata_over_public_ip/

Nothing jumped out at me but maybe this logging will tell us something.

Vish Ishaya (vishvananda) wrote : Re: [Bug 821242] Re: test_004_can_access_metadata_over_public_ip fails intermittently

Hmm, are there multiple builds going on at once in CI on Jenkins? If the IP is used by multiple machines in multiple builds, it could cause issues.

Vish

On Aug 5, 2011, at 8:22 AM, Dan Prince wrote:

> Hey Vish. Just pushed a branch to add some extra logging for the
> failure. I can't get it to happen for me locally. It seems to happen
> more often in our CI jobs up on Jenkins. Here is a job log that contains
> the failure along with some extra API logs.
>
> https://jenkins.openstack.org/job/nova-vpc/163/testReport/junit/smoketests/test_netadmin/SecurityGroupTests_test_004_can_access_metadata_over_public_ip/
>
> Nothing jumped out at me but maybe this logging will tell us something.
>
> ** Branch linked: lp:~rackspace-titan/nova/test_004_metadata_logging

Dan Prince (dan-prince) wrote :

Hey Vish,

The failures I'm seeing occur in isolated dev clouds. I don't think multiple simultaneous builds are the issue here.

--

After the logging fix went in we got the following failure in Jenkins this afternoon:

Exception: Wrong instance id. Expected: i-00000004, Got: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /latest/meta-data/instance-id was not found on this server.</p>
<hr>
<address>Apache/2.2.16 (Ubuntu) Server at 172.20.0.0 Port 80</address>
</body></html>
--

Here is a link to the test failure.

https://jenkins.openstack.org/job/nova-vpc/171/testReport/smoketests/test_netadmin/SecurityGroupTests_test_004_can_access_metadata_over_public_ip/

I'm looking at how we use __public_instance_is_accessible in test_004_can_access_metadata_over_public_ip. Instead of raising an exception when the instance IDs don't match, perhaps we should just return False? That would at least give us a chance to retry several times until the timeout is reached.
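The retry idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the fix that was committed: `wait_until`, `instance_id_matches`, and the timeout and interval defaults are hypothetical names and values, and the metadata request is replaced by a plain callable so the example runs without a server.

```python
import time


def wait_until(check, timeout=60.0, interval=1.0):
    """Poll check() until it returns True or the timeout expires.

    Returns True on success and False on timeout. The name and the
    default values are illustrative assumptions.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if check():
            return True
        time.sleep(interval)
    return False


def instance_id_matches(expected_id, fetch):
    """Return False instead of raising when the reply is not the expected ID.

    fetch is any callable that returns the response body as a string,
    or raises IOError if the request itself fails.
    """
    try:
        body = fetch()
    except IOError:
        return False
    return body.strip() == expected_id


# Simulated replies: an error page first, then the correct instance ID.
replies = iter(["<h1>Not Found</h1>", "i-00000004"])
result = wait_until(
    lambda: instance_id_matches("i-00000004", lambda: next(replies)),
    timeout=5.0,
    interval=0.01,
)
print(result)
```

Returning False on a mismatch lets a transient error page, such as the 404 above, be retried, while a persistent mismatch still surfaces as a timeout.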

Dan Prince (dan-prince)
Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → diablo-4
Thierry Carrez (ttx)
Changed in nova:
milestone: diablo-4 → 2011.3
status: Fix Committed → Fix Released