Services not starting on new side of Grenade

Bug #1331274 reported by Joe Gordon
This bug affects 5 people
Affects   Status   Importance  Assigned to  Milestone
grenade   Invalid  Undecided   Unassigned

Bug Description

http://logs.openstack.org/72/99772/3/check/check-grenade-dsvm-partial-ncpu/20d91e2/logs/new/screen-n-api.txt.gz

says
"Couldn't find log for n-cond at /opt/stack/new/screen-logs/screen-n-cond.log"

But the process list includes nova-conductor, and it was started around the same time as nova-api, whose new-side service did start successfully.

So it looks like we are missing some log files one way or another.
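
One way to tell the two cases apart (a service that is running but has no log, versus a service that never started) is to check both conditions per service. A minimal Python sketch, assuming the screen-<service>.log naming seen in the message above; the function name `classify_service`, the status labels, and passing the running services in as a set are illustrative assumptions, not what grenade's check-sanity actually does:

```python
import os


def classify_service(service, running_services, log_dir):
    """Return a coarse status string for one service.

    service: short screen name such as "n-cond" (hypothetical input).
    running_services: set of short names known to have a live process.
    log_dir: directory expected to hold screen-<service>.log files.
    """
    log_path = os.path.join(log_dir, "screen-%s.log" % service)
    has_log = os.path.isfile(log_path)
    is_running = service in running_services

    if is_running and has_log:
        return "ok"
    if is_running:
        # The case described in this report: process is up, log is missing.
        return "running-without-log"
    if has_log:
        return "log-without-process"
    return "not-started"
```

Called with the screen-logs directory and whatever set of service names the process list yields, this separates the "missing log" case from a real startup failure.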

logstash query: message:"Couldn't find log for " AND tags:"console.html"

This is only hitting grenade jobs.

Revision history for this message
Sean Dague (sdague) wrote :

I don't think this is a new bug; it is an existing set of issues with extremely slow nodes that has moved to a new place. Realistically, once you filter out the neutron parts of this failure you get only a handful of hits (6 jobs in the last 24 hrs).

query: >
  message:"Expected running services not running:"
  AND NOT message:"q-fwaas"
  AND tags:"console.html"

All but one of the hits are on hpcloud. They are split pretty evenly across grenade and grenade-partial (I had wondered if partial was more susceptible; it's not).

In most of these cases it seems like the services are actually running, but the logs don't exist. This exposes another long-term mystery we were seeing, where logs were not created. So I think we need to move forward here to figure out why and get to the root cause.
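
The query in this comment can be approximated against a locally downloaded console log. A rough Python sketch; `matches_query` and `filter_console` are hypothetical helpers, plain substring matching only approximates how logstash evaluates the query, and the tags:"console.html" clause is assumed to be covered by choosing which file to feed in:

```python
def matches_query(line):
    """Approximate the query with substring checks on a single console line."""
    return (
        "Expected running services not running:" in line
        and "q-fwaas" not in line
    )


def filter_console(lines):
    """Keep only the console lines that the query would be expected to hit."""
    return [line for line in lines if matches_query(line)]
```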

Sean Dague (sdague)
summary: - Couldn't find log for * at /opt/stack/new/screen-logs/*
+ Logs are not getting created for services in Grenade
Revision history for this message
Matt Riedemann (mriedem) wrote : Re: Logs are not getting created for services in Grenade

Hit it here also: the n-api log isn't showing up and the service isn't running:

http://logs.openstack.org/30/102130/2/check/check-grenade-dsvm/54a7b60/logs/grenade.sh.txt

2014-06-24 14:22:04.780 | Couldn't find log for n-api at /opt/stack/new/screen-logs/screen-n-api.log
2014-06-24 14:22:04.781 | rabbit does not look like a valid service, skipping log check
2014-06-24 14:22:04.782 | tempest does not look like a valid service, skipping log check
2014-06-24 14:22:04.782 | Expected running services not running: ,n-api
2014-06-24 14:22:04.782 | + die 330 'Failure in check-sanity'
2014-06-24 14:22:04.782 | + local exitcode=1
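
For triage, the failing services can be pulled out of a grenade.sh.txt excerpt like the one above. A small Python sketch; `missing_services` is a hypothetical helper, and skipping the empty entry before the leading comma is an assumption based only on the output shown here:

```python
import re

# Matches the check-sanity summary line, with or without the timestamp prefix.
_NOT_RUNNING = re.compile(r"Expected running services not running:\s*(.*)$")


def missing_services(log_text):
    """Return names listed on 'Expected running services not running:' lines."""
    found = []
    for line in log_text.splitlines():
        match = _NOT_RUNNING.search(line)
        if match:
            # The list is comma-separated and may begin with an empty
            # entry (for example ",n-api"), so blank names are dropped.
            names = match.group(1).split(",")
            found.extend(name.strip() for name in names if name.strip())
    return found
```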

Sean Dague (sdague)
summary: - Logs are not getting created for services in Grenade
+ Services not starting on new side of Grenade
Revision history for this message
Nathan Kinder (nkinder) wrote :

I just hit this with n-obj:

http://logs.openstack.org/88/111088/1/check/check-grenade-dsvm/9fc0daa/logs/grenade.sh.txt.gz

2014-08-02 00:21:26.202 | Couldn't find log for n-obj at /opt/stack/new/screen-logs/screen-n-obj.log
2014-08-02 00:21:26.202 | rabbit does not look like a valid service, skipping log check
2014-08-02 00:21:26.202 | tempest does not look like a valid service, skipping log check
2014-08-02 00:21:26.202 | Expected running services not running: ,n-obj
2014-08-02 00:21:26.203 | + die 330 'Failure in check-sanity'
2014-08-02 00:21:26.203 | + local exitcode=1
2014-08-02 00:21:26.203 | [Call Trace]
2014-08-02 00:21:26.203 | ./grenade.sh:330:die
2014-08-02 00:21:26.206 | [ERROR] ./grenade.sh:330 Failure in check-sanity
2014-08-02 00:21:26.206 | Exit code: 1
2014-08-02 00:21:26.206 |

Revision history for this message
Victor A. Ying (victor-ying) wrote :

I had the same issue occur as Nathan Kinder:

http://logs.openstack.org/82/114382/1/check/check-grenade-dsvm/fa7a491/logs/grenade.sh.txt.gz

2014-08-18 17:44:18.195 | Couldn't find log for n-obj at /opt/stack/new/screen-logs/screen-n-obj.log
2014-08-18 17:44:18.195 | rabbit does not look like a valid service, skipping log check
2014-08-18 17:44:18.195 | tempest does not look like a valid service, skipping log check
2014-08-18 17:44:18.195 | Expected running services not running: ,n-obj
2014-08-18 17:44:18.196 | + die 348 'Failure in check-sanity'
2014-08-18 17:44:18.196 | + local exitcode=1
2014-08-18 17:44:18.196 | [Call Trace]
2014-08-18 17:44:18.196 | ./grenade.sh:348:die
2014-08-18 17:44:18.199 | [ERROR] ./grenade.sh:348 Failure in check-sanity
2014-08-18 17:44:18.200 | /opt/stack/new/grenade/functions: line 133: /opt/stack/logs/screen/error.log: Permission denied
2014-08-18 17:44:18.200 | Exit code: 1

Revision history for this message
Doug Hellmann (doug-hellmann) wrote :

I hit this with n-cpu:

http://logs.openstack.org/03/114403/6/gate/gate-grenade-dsvm/a6bc3d8/logs/grenade.sh.txt.gz#_2014-09-16_07_10_14_500

2014-09-16 07:10:14.193 | *********************************************************************
2014-09-16 07:10:14.193 | Begin /opt/stack/new/grenade/check-sanity
2014-09-16 07:10:14.193 | *********************************************************************
2014-09-16 07:10:14.497 | Ceilometer not yet supported, skipping check for ceilometer-acentral
2014-09-16 07:10:14.497 | Ceilometer not yet supported, skipping check for ceilometer-acompute
2014-09-16 07:10:14.498 | Ceilometer not yet supported, skipping check for ceilometer-alarm-evaluator
2014-09-16 07:10:14.498 | Ceilometer not yet supported, skipping check for ceilometer-alarm-notifier
2014-09-16 07:10:14.498 | Ceilometer not yet supported, skipping check for ceilometer-anotification
2014-09-16 07:10:14.498 | Ceilometer not yet supported, skipping check for ceilometer-api
2014-09-16 07:10:14.498 | Ceilometer not yet supported, skipping check for ceilometer-collector
2014-09-16 07:10:14.498 | cinder does not look like a valid service, skipping log check
2014-09-16 07:10:14.498 | dstat does not look like a valid service, skipping log check
2014-09-16 07:10:14.499 | Heat not yet supported, skipping check for h-api
2014-09-16 07:10:14.499 | Heat not yet supported, skipping check for h-api-cfn
2014-09-16 07:10:14.499 | Heat not yet supported, skipping check for h-api-cw
2014-09-16 07:10:14.499 | Heat not yet supported, skipping check for h-eng
2014-09-16 07:10:14.499 | heat does not look like a valid service, skipping log check
2014-09-16 07:10:14.499 | horizon does not look like a valid service, skipping log check
2014-09-16 07:10:14.499 | key does not look like a valid service, skipping log check
2014-09-16 07:10:14.500 | mysql does not look like a valid service, skipping log check
2014-09-16 07:10:14.500 | Couldn't find log for n-cpu at /opt/stack/new/screen-logs/screen-n-cpu.log
2014-09-16 07:10:14.500 | rabbit does not look like a valid service, skipping log check
2014-09-16 07:10:14.501 | tempest does not look like a valid service, skipping log check
2014-09-16 07:10:14.501 | Expected running services not running: ,n-cpu
2014-09-16 07:10:14.502 | + die 357 'Failure in check-sanity'
2014-09-16 07:10:14.502 | + local exitcode=1
2014-09-16 07:10:14.502 | [Call Trace]
2014-09-16 07:10:14.502 | ./grenade.sh:357:die
2014-09-16 07:10:14.504 | [ERROR] ./grenade.sh:357 Failure in check-sanity
2014-09-16 07:10:14.505 | /opt/stack/new/grenade/functions: line 133: /opt/stack/logs/screen/error.log: Permission denied
2014-09-16 07:10:14.505 | Exit code: 1

Revision history for this message
Sean Dague (sdague) wrote :

This grenade bug was last updated over 180 days ago. As grenade
is a fast-moving project and we'd like to get the tracker down to
currently actionable bugs, this is getting marked as Invalid. If the
issue still exists, please feel free to reopen it.

Changed in grenade:
status: New → Invalid