Configs in nova and cinder:
rabbit_hosts="mq:5671,mq1:5671,mq2:5671,mq3:5671"
rabbit_ha_queues=True
rabbit_userid=nova
rabbit_password=XXXXXXX
rabbit_durable_queues=True
rabbit_use_ssl=True
Some related package versions:
apt-cache policy librabbitmq1
librabbitmq1:
  Installed: 0.4.1-1
  Candidate: 0.4.1-1
apt-cache policy python-amqp
python-amqp:
  Installed: 1.3.3-1ubuntu1
  Candidate: 1.3.3-1ubuntu1
apt-cache policy python-amqplib
python-amqplib:
  Installed: 1.0.2-1
  Candidate: 1.0.2-1
apt-cache policy python-kombu
python-kombu:
  Installed: 3.0.7-1ubuntu1
  Candidate: 3.0.7-1ubuntu1
The reply_xxxx queue in RabbitMQ that has unacked messages does have a consumer. I traced that consumer back to the offending API host using the port number and found that it's a nova-api process with an established connection to rabbit. I've then been running strace on that process to see what it's doing. All it shows is:
gettimeofday({1409785170, 966963}, NULL) = 0
gettimeofday({1409785170, 967066}, NULL) = 0
epoll_wait(6, {{EPOLLIN, {u32=7, u64=39432335262744583}}}, 1023, 482100) = 1
epoll_ctl(6, EPOLL_CTL_DEL, 7, {EPOLLWRNORM|EPOLLERR|EPOLLONESHOT|0xe0a1800, {u32=32681, u64=22396489217114025}}) = 0
accept(7, 0x7fffdf911350, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(6, EPOLL_CTL_ADD, 7, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=7, u64=39432335262744583}}) = 0
gettimeofday({1409785188, 431971}, NULL) = 0
gettimeofday({1409785188, 432035}, NULL) = 0
epoll_wait(6, {{EPOLLIN, {u32=7, u64=39432335262744583}}}, 1023, 464636) = 1
epoll_ctl(6, EPOLL_CTL_DEL, 7, {EPOLLWRNORM|EPOLLERR|EPOLLONESHOT|0xe0a1800, {u32=32681, u64=22396489217114025}}) = 0
accept(7, 0x7fffdf911350, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(6, EPOLL_CTL_ADD, 7, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=7, u64=39432335262744583}}) = 0
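In case anyone wants to look for the same symptom on their own broker, here is a rough Python sketch of picking out the reply queues that have unacked messages. It assumes the listing comes from `rabbitmqctl list_queues name messages_unacknowledged consumers` saved as tab-separated text; the function name `stuck_reply_queues` and the sample data are made up for illustration and are not output from our cluster.

```python
def stuck_reply_queues(listing):
    """Return (name, unacked, consumers) for reply_* queues holding unacked messages.

    `listing` is assumed to be the text output of
    `rabbitmqctl list_queues name messages_unacknowledged consumers`.
    """
    stuck = []
    for line in listing.splitlines():
        fields = line.split("\t")
        # Skip banner lines such as "Listing queues ..." and "...done."
        if len(fields) != 3:
            continue
        name, unacked, consumers = fields
        # Skip anything whose counters are not plain integers.
        if not (unacked.isdigit() and consumers.isdigit()):
            continue
        if name.startswith("reply_") and int(unacked) > 0:
            stuck.append((name, int(unacked), int(consumers)))
    return stuck


# Hypothetical sample listing, for illustration only.
SAMPLE = (
    "Listing queues ...\n"
    "reply_aaaa\t0\t1\n"
    "reply_bbbb\t3\t1\n"
    "compute\t0\t4\n"
    "...done.\n"
)

if __name__ == "__main__":
    for name, unacked, consumers in stuck_reply_queues(SAMPLE):
        print(name, unacked, consumers)
```

A queue that shows up here with a non-zero consumer count matches the situation described above: the messages were delivered to a consumer that never acknowledged them.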
The API host isn't under any load, and neither are the rabbit and DB hosts.
This is happening for us multiple times a day so it would be easy for me to get more information. I just don't know how to debug further.