Comment 56 for bug 1253896

Salvatore Orlando (salvatore-orlando) wrote :

Bug criticality has now decreased a little, since it is no longer ranked #1.
However, this bug is still on the podium, sitting at #3.

It is perhaps a good idea to have a detailed look at the failures to understand whether this bug is still actually critical, and what we should do about it.

[1] is a score-by-branch for the past 24 hours.
About 20% of the failures occur on stable/havana. Considering that jobs targeting stable/havana are probably less than 20% of all jobs, this means that we've probably improved the resiliency to this bug for icehouse. However, the stable/havana failures all appear to be related to neutron, and the likely reason is that the improvements for neutron have not been backported. Frankly, I'm not sure they could be considered backportable at all, as some of the patches are large.

[2] is a score-by-job for master branch in the past 24 hours.

check-tempest-dvsm-heat-neutron-slow, currently non-voting, accounts for over 50% of the failures. Without this job, there would have been "only" 25 failures on the master branch in the past 24 hours. I have not yet filed a tempest bug for this job, but the problem appears to be that it is trying to ssh into an instance over a private network, which I don't think can work with neutron.

There are 16 failures with nova-network enabled (a little less than 30% of master failures). However, 7 of them occurred on grenade jobs. I have not yet looked into them. I hope someone from the nova team can help on this matter.

There are also 9 neutron failures. 6 of them occurred in the same job, and the root cause is that the metadata service did not start. Currently devstack does not return an error when some neutron agents fail to start. Bug 128182 has been filed to address this.

For the remaining 3 neutron failures, 2 are caused by a regex parsing error. This problem is being tracked by bug 1280827, for which there is a patch under review.

The third failure is being investigated.
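
As a quick sanity check on the numbers above, here is a small tally. It assumes that the 25 failures left after excluding the heat job are exactly the 16 nova-network failures plus the 9 neutron ones; the comment does not say so explicitly, and the variable names are only illustrative.

```python
# Counts quoted above for the master branch over the past 24 hours.
nova_network_failures = 16
grenade_failures = 7          # subset of the nova-network failures
neutron_failures = 9

# Breakdown of the neutron failures by cause, as described above.
neutron_breakdown = {
    "metadata service did not start": 6,
    "regex parsing error (bug 1280827)": 2,
    "still under investigation": 1,
}

# The per-cause counts should add up to the neutron total.
neutron_accounted = sum(neutron_breakdown.values())

# Assumption: failures outside the heat job = nova-network + neutron.
failures_without_heat_job = nova_network_failures + neutron_failures

print("neutron failures accounted for:", neutron_accounted)
print("failures without the heat job:", failures_without_heat_job)
print("grenade share of nova-network failures:",
      round(grenade_failures / nova_network_failures, 2))
```

Under that assumption the figures are consistent with each other: every neutron failure is attributed to a cause or is under investigation, and the two groups together match the 25 quoted earlier.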

[1] http://logstash.openstack.org/#eyJmaWVsZHMiOltdLCJzZWFyY2giOiJtZXNzYWdlOlwiU1NIVGltZW91dDogQ29ubmVjdGlvbiB0byB0aGVcIiBBTkQgbWVzc2FnZTpcInZpYSBTU0ggdGltZWQgb3V0LlwiIEFORCBmaWxlbmFtZTpcImNvbnNvbGUuaHRtbFwiIiwidGltZWZyYW1lIjoiODY0MDAiLCJncmFwaG1vZGUiOiJjb3VudCIsIm9mZnNldCI6MCwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjEzOTI2NTQ5Mjk4NzIsIm1vZGUiOiJzY29yZSIsImFuYWx5emVfZmllbGQiOiJidWlsZF9icmFuY2gifQ==

[2] http://logstash.openstack.org/#eyJmaWVsZHMiOltdLCJzZWFyY2giOiJtZXNzYWdlOlwiU1NIVGltZW91dDogQ29ubmVjdGlvbiB0byB0aGVcIiBBTkQgbWVzc2FnZTpcInZpYSBTU0ggdGltZWQgb3V0LlwiIEFORCBmaWxlbmFtZTpcImNvbnNvbGUuaHRtbFwiIEFORCBidWlsZF9icmFuY2g6XCJtYXN0ZXJcIiIsInRpbWVmcmFtZSI6Ijg2NDAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJvZmZzZXQiOjAsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sInN0YW1wIjoxMzkyNjU0OTI5ODcyLCJtb2RlIjoic2NvcmUiLCJhbmFseXplX2ZpZWxkIjoiYnVpbGRfbmFtZSJ9
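
The part after the # in links [1] and [2] appears to be base64-encoded JSON describing the logstash query, so it can be read without opening the dashboard. The sketch below shows one way to decode it. The function name is hypothetical, and the sample URL is a shortened stand-in built inside the example (same style of keys as the real links), not one of the links above.

```python
import base64
import json
from urllib.parse import urlsplit


def decode_logstash_fragment(url: str) -> dict:
    """Decode the base64-encoded JSON carried in the URL fragment."""
    fragment = urlsplit(url).fragment
    # Restore any '=' padding that may have been stripped from the link.
    padded = fragment + "=" * (-len(fragment) % 4)
    return json.loads(base64.b64decode(padded))


# Shortened stand-in for the real links, built here so the example
# is self-contained.
sample_state = {
    "search": 'message:"via SSH timed out." AND filename:"console.html"',
    "timeframe": "86400",
    "mode": "score",
    "analyze_field": "build_branch",
}
sample_url = (
    "http://logstash.openstack.org/#"
    + base64.b64encode(json.dumps(sample_state).encode()).decode()
)

state = decode_logstash_fragment(sample_url)
print(state["search"])
print(state["timeframe"], state["analyze_field"])
```

Applied to [1], this should give the search string message:"SSHTimeout: Connection to the" AND message:"via SSH timed out." AND filename:"console.html", a timeframe of 86400 seconds, and scoring on build_branch. [2] uses the same search restricted to build_branch:"master" and scores on build_name.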