Modification of rpc_response_timeout in cinder.conf doesn't take effect after restarting all Cinder and Nova services

Bug #1641849 reported by yin
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Invalid
Importance: Undecided
Assigned to: yin
Milestone: 9.2

Bug Description

Hi, dear Mirantis engineers,

    I am testing Mirantis Fuel 9.0 with the EMC VMAX V3 iSCSI Cinder driver.

    We hit a timeout issue (the Nova log shows a timeout) when detaching the last volume from a compute node in the Mirantis Fuel 9.0 environment. The timeout might be caused by a slow Cinder network, so we changed rpc_response_timeout in the Cinder configuration file from the default value of 60 seconds to 240 seconds. We then restarted the Cinder services (cinder-api, cinder-scheduler, and cinder-volume) and the Nova services (nova-conductor and nova-scheduler on the controller node, and nova-compute on the compute node) in order to make the change take effect.
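
    For reference, the change was roughly the following (a minimal sketch, assuming the standard [DEFAULT] section placement for this oslo.messaging option):

# /etc/cinder/cinder.conf
[DEFAULT]
# raise the RPC reply timeout from the default 60 seconds
rpc_response_timeout = 240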

    However, the timeout issue still exists after restarting the services. According to the log, the request timed out after exactly 1 minute (the default value of rpc_response_timeout), which means the modification did not take effect, as shown below. For more details please refer to the attached file.

2016-11-09 15:20:06.046 24952 INFO nova.compute.manager […]Detach volume 21f63308-954f-413e-b3de-a27a28fbb28c from mountpoint /dev/vdd

2016-11-09 15:21:07.031 24952 ERROR oslo_messaging.rpc.dispatcher […] Exception during message handling: Gateway Time-out (HTTP 504)

Revision history for this message
Oleksiy Molchanov (omolchanov) wrote :

Cinder team, can you check this?

Changed in fuel:
assignee: nobody → MOS Cinder (mos-cinder)
tags: added: area-cinder
Changed in fuel:
milestone: none → 9.2
Revision history for this message
Ivan Kolodyazhny (e0ne) wrote :

Oslo team, please, check if oslo.messaging works OK with setting this param.

Changed in fuel:
assignee: MOS Cinder (mos-cinder) → MOS Oslo (mos-oslo)
tags: added: area-oslo
Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

vincentliuyin, please make sure you have updated the timeout in haproxy, because it looks like the 504 is caused by the HTTP server closing the connection.

Changed in fuel:
status: New → Incomplete
assignee: MOS Oslo (mos-oslo) → yin (vincentliuyin)
Revision history for this message
yin (vincentliuyin) wrote :

Hi, Sedelnik

  Would you please specify where to modify the timeout in haproxy (which node and which file), and how to make it take effect?

Revision history for this message
Dmitry Mescheryakov (dmitrymex) wrote :

Yin,

On all controller nodes, edit the 'timeout server' parameter in /etc/haproxy/haproxy.cfg and raise it to 10 minutes, as in this diff - http://paste.openstack.org/show/589563/
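
A rough sketch of the resulting fragment (illustrative only; keep the rest of your defaults section unchanged):

# /etc/haproxy/haproxy.cfg
defaults
  # existing settings stay as they are; only this value is raised
  timeout server 10m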

After that, on one controller run 'crm resource restart p_haproxy'; that will restart haproxy on all controllers.

After you finish your experiment, if you hit the problem again, post your logs again.

Revision history for this message
yin (vincentliuyin) wrote :

Hi, Meschery

    After modifying 'timeout server' to 10 minutes, I didn't get a timeout error when detaching the last volume (nova volume-detach vincent4 de8b3388-19a1-4d6b-b098-78534d46c3b6) from the compute node, and the volume was indeed detached successfully from the instance.

    But the weird thing is that, while the Cinder log shows the detach completed successfully, the nova-compute log, although it displays no error, has no subsequent message showing that the request (req-12654102-9362-4e77-82fd-7ae965ccd03b 263860edeee9402b9d2148cda8011a4c) was successfully completed.

Log details as below

Nova log

2016-11-19 03:42:49.179 4408 INFO nova.compute.manager [req-12654102-9362-4e77-82fd-7ae965ccd03b 263860edeee9402b9d2148cda8011a4c 89131bda0c084ff193e815dc78978c61 - - -] [instance: 4af75534-9273-4a14-8d20-9c0683d426df] Detach volume de8b3388-19a1-4d6b-b098-78534d46c3b6 from mountpoint /dev/vdb
2016-11-19 03:43:29.644 4408 INFO nova.compute.resource_tracker [req-875977c3-ef98-4a6a-9a0e-5fa49069ecdc - - - - -] Auditing locally available compute resources for node node-16.domain.tld
2016-11-19 03:43:30.435 4408 INFO nova.compute.resource_tracker [req-875977c3-ef98-4a6a-9a0e-5fa49069ecdc - - - - -] Total usable vcpus: 24, total allocated vcpus: 1
2016-11-19 03:43:30.436 4408 INFO nova.compute.resource_tracker [req-875977c3-ef98-4a6a-9a0e-5fa49069ecdc - - - - -] Final resource view: name=node-16.domain.tld phys_ram=32112MB used_ram=2560MB phys_disk=36GB used_disk=20GB total_vcpus=24 used_vcpus=1 pci_stats=[]
2016-11-19 03:43:30.462 4408 INFO nova.compute.resource_tracker [req-875977c3-ef98-4a6a-9a0e-5fa49069ecdc - - - - -] Compute_service record updated for node-16.domain.tld:node-16.domain.tld

Cinder log

pacity stats for SRP pool DEFAULT_SRP on array 000196701147 total_capacity_gb=32730, free_capacity_gb=30074
2016-11-12 11:02:09.553 25131 WARNING cinder.volume.drivers.emc.emc_vmax_provision_v3 [req-8f50c7e1-7eea-4d6b-84f9-743932955f19 - - - - -] Remaining capacity 30074 GBs is determined from SRP pool capacity and not the SLO capacity. Performance may not be what you expect.
2016-11-12 11:02:09.554 25131 INFO cinder.volume.drivers.emc.emc_vmax_common [req-8f50c7e1-7eea-4d6b-84f9-743932955f19 - - - - -] Capacity stats for SRP pool DEFAULT_SRP on array 000196701147 total_capacity_gb=32730, free_capacity_gb=30074
2016-11-12 11:03:09.458 25131 WARNING cinder.volume.drivers.emc.emc_vmax_provision_v3 [req-8f50c7e1-7eea-4d6b-84f9-743932955f19 - - - - -] Remaining capacity 30074 GBs is determined from SRP pool capacity and not the SLO capacity. Performance may not be what you expect.
2016-11-12 11:03:09.458 25131 INFO cinder.volume.drivers.emc.emc_vmax_common [req-8f50c7e1-7eea-4d6b-84f9-743932955f19 - - - - -] Capacity stats for SRP pool DEFAULT_SRP on array 000196701147 total_capacity_gb=32730, free_capacity_gb=30074
2016-11-12 11:04:09.491 25131 WARNING cinder.volume.drivers.emc.emc_vmax_provision_v3 [req-8f50c7e1-7eea-4d6b-84f9-743932955f19 - - - - -] Remaining capacity 30074 GBs is determined from SRP pool capacity and not the SLO capacity. Performance may not be what you expect.
2016-11-12 11:04:09.492 25131 INFO cinder.vol...


Revision history for this message
Dmitry Mescheryakov (dmitrymex) wrote :

Yin, I would suggest enabling debug logging in Nova and retrying - it could be that Nova prints only debug-level logs on success.
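
For example, a minimal sketch (debug is a standard [DEFAULT] option in nova.conf; restart the Nova services after changing it):

# /etc/nova/nova.conf
[DEFAULT]
debug = True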

Revision history for this message
Oleksiy Molchanov (omolchanov) wrote :

Marking as Invalid because there has been no activity for more than a month.

Changed in fuel:
status: Incomplete → Invalid