[horizon] Timeout when reading response headers from daemon process 'horizon'

Bug #1566276 reported by Artem Savinov
This bug affects 3 people
Affects             Status     Importance  Assigned to  Milestone
Mirantis OpenStack  Won't Fix  Medium      Unassigned

Bug Description

Detailed bug description:
After adding an internal interface to a router in Horizon, I got this error:
"Danger: There was an error submitting the form. Please try again."
The message appears after about 3 minutes of waiting. (For Neutron, the timeout in the HAProxy configuration file is set to 600 seconds, and the task completed successfully after the error had already appeared in Horizon.)
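
The mismatch described here follows from a general rule: a request that crosses several layers fails at whichever layer has the shortest timeout, however generous the others are. A minimal Python sketch of that reasoning, under stated assumptions: the 600-second value is the HAProxy setting named in this report, while the 180-second web-tier value is only a guess inferred from the roughly 3-minute wait, not a setting read from any configuration. The layer names and the helper function are hypothetical.

```python
def first_layer_to_time_out(timeouts, duration):
    """Return the name of the layer that gives up first, or None.

    timeouts: mapping of layer name -> timeout in seconds.
    duration: how long the backend operation actually takes, in seconds.
    """
    # The layer with the smallest timeout is the one that can fail first.
    name, limit = min(timeouts.items(), key=lambda item: item[1])
    return name if duration > limit else None


# Illustrative values only. 600 s is the HAProxy setting named in this
# report; 180 s is an assumption inferred from the ~3 minute wait.
layers = {
    "haproxy_neutron": 600,
    "web_tier_wsgi": 180,  # hypothetical, not read from any config
}

print(first_layer_to_time_out(layers, duration=200))
print(first_layer_to_time_out(layers, duration=100))
```

With these assumed numbers, an operation that needs 200 seconds is cut off at the web tier even though the HAProxy limit is never reached, which matches the symptom of the task finishing successfully after the UI has already reported an error.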

Steps to reproduce:
1) Install Fuel 8 with fuel-plugin-nsxv (for NSXv, set "Enable HA for NSX Edges" to true).
2) From the CLI, add a distributed router:
neutron router-create --distributed True drouter
3) From Horizon, add an external gateway to the router created in the previous step.
4) In Horizon, create a custom internal network.
5) In Horizon, add an interface from the internal network created in the previous step to the distributed router, and wait.

Expected results:
The interface is added and no error is shown.

Actual result:
The web UI shows the error:
Danger: There was an error submitting the form. Please try again.
horizon_error.log contains the error:
[Tue Apr 05 10:17:11.754729 2016] [wsgi:error] [pid 6383:tid 140340631611136] [client 192.168.0.2:37000] Timeout when reading response headers from daemon process 'horizon': /usr/share/openstack-dashboard/openstack_dashboard/wsgi/django.wsgi, referer: http://172.16.0.3/horizon/project/routers/8328527d-1aaf-41d6-b380-2a6b09e64e60/
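
When searching a large horizon_error.log for this symptom, it can help to pull the fields out of each matching line. The following is a minimal Python sketch, assuming the log lines follow the same layout as the one quoted above; the pattern and the function name are illustrative and not part of Apache, mod_wsgi, or Horizon.

```python
import re

# Matches the prefix of the error line quoted in this report.
LOG_PATTERN = re.compile(
    r"\[(?P<timestamp>[^\]]+)\] "
    r"\[(?P<module>[^:\]]+):(?P<level>[^\]]+)\] "
    r"\[pid (?P<pid>\d+):tid (?P<tid>\d+)\] "
    r"\[client (?P<client>[^\]]+)\] "
    r"Timeout when reading response headers from daemon process "
    r"'(?P<daemon>[^']+)'"
)


def parse_wsgi_timeout(line):
    """Return a dict of fields for a daemon timeout line, else None."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    fields["tid"] = int(fields["tid"])
    return fields


# Sample taken from the log line in this report (tail omitted).
sample = (
    "[Tue Apr 05 10:17:11.754729 2016] [wsgi:error] "
    "[pid 6383:tid 140340631611136] [client 192.168.0.2:37000] "
    "Timeout when reading response headers from daemon process 'horizon'"
)
parsed = parse_wsgi_timeout(sample)
print(parsed)
```

Grouping the parsed entries by `pid` or by `client` would show whether the timeouts cluster on one daemon process or one caller.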

Reproducibility:
The problem is not always reproduced.

Description of the environment:
Fuel 8.0
fuel-plugin-nsxv from 8.0 branch

Additional information:
Browser: Firefox 45.0.1 on Linux.
Reproducing this requires fuel-plugin-nsxv from the 8.0 branch, not the released version (the branch carries additional patches that increase timeouts).
I cannot attach a diagnostic snapshot because Launchpad shows a timeout error; the snapshot is placed on tpi98:/home/asavinov/shared-logs/uel-snapshot-2016-04-05_10-35-56.tar.xz (Sergey Chepiga has access to tpi98).

Tags: enhancement
Changed in mos:
assignee: nobody → MOS Horizon (mos-horizon)
importance: Undecided → Medium
Timur Sufiev (tsufiev-x) wrote :

Artem, do you regard 3-minute waits in the UI as a normal situation?

Artem Savinov (asavinov) wrote :

Timur, yes. The router is created in the VMware SDN with HA: two VMs on ESXi. The official recommendation from VMware for the Neutron timeout in HAProxy is 600 seconds.

Sergei Chipiga (schipiga) wrote :

I found some discussions of similar situations:
https://groups.google.com/forum/#!topic/modwsgi/cuZlSO9vN18
https://groups.google.com/forum/#!topic/modwsgi/H7qPoqYNJdI
https://groups.google.com/forum/#!topic/modwsgi/pxx9kuTxc48

It looks like a server tuning question about how long the web server waits for the WSGI daemon. For example:
"
That is, an Apache child worker process will wait at most 60 seconds by default when using mod_wsgi-express, for any data to be received related to a response from the mod_wsgi daemon process. This also applies where an Apache child worker process is reading or writing data with a HTTP client.

If you have very long running requests, you would need to set this higher, but increasing it carries dangers and needing to set it higher generally indicates what you are trying to do is probably a bad way of doing it.

In general, what you are better off doing is not doing such long running work inside of the actual web request. You should offload the processing to a backed task system queue. The web request would therefore return immediately once queued and the task system would pick up the job. The web UI can then poll or use some other mechanism to determine when the task is complete and the data available.
"
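
The pattern recommended in the quote (queue the work, return at once, let the caller check back later) can be sketched in a few lines. This is a minimal, in-process Python illustration using only the standard library, not how Horizon or Neutron implement anything; the function names, the status strings, and the `slow_operation` stand-in are all hypothetical.

```python
import queue
import threading
import time
import uuid

tasks = queue.Queue()
status = {}  # task id -> "PENDING", "DONE" or "FAILED"
status_lock = threading.Lock()


def submit(work, *args):
    """Queue the work and return a task id immediately."""
    task_id = str(uuid.uuid4())
    with status_lock:
        status[task_id] = "PENDING"
    tasks.put((task_id, work, args))
    return task_id


def get_status(task_id):
    """Report the current state of a previously submitted task."""
    with status_lock:
        return status[task_id]


def worker():
    """Run queued jobs one by one, outside the request path."""
    while True:
        task_id, work, args = tasks.get()
        try:
            work(*args)
            result = "DONE"
        except Exception:
            result = "FAILED"
        with status_lock:
            status[task_id] = result
        tasks.task_done()


threading.Thread(target=worker, daemon=True).start()


def slow_operation(seconds):
    """Stand-in for a long backend call (hypothetical)."""
    time.sleep(seconds)


task_id = submit(slow_operation, 0.3)  # returns without waiting
first_status = get_status(task_id)     # checked right after submitting
tasks.join()                           # demo only: wait for the worker
final_status = get_status(task_id)
print(first_status, final_status)
```

In a real deployment the queue would be an external task system rather than a thread in the web process, so that queued work survives a restart of the web server.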

Sergei Chipiga (schipiga) wrote :

I agree with Timur Sufiev that waiting more than 3 minutes for a response isn't normal. It would be better to delegate response notification to a message queue.

Timur Sufiev (tsufiev-x) wrote :

It may be tolerable from the API / CLI point of view, but not from the UI point of view. If it's really normal for the action to take 3+ minutes, it should be asynchronous, the way the 'Create Volume' (Cinder) action is implemented, for example.

Does Neutron provide different states of router creation? I am specifically interested in something like a 'Pending' state.

Artem Savinov (asavinov) wrote :

>"Does Neutron provide different states of router creation?"
Timur, could you please tell me how I can test it?

NSXv is connected to Neutron as a core plugin and is developed by VMware.

Timur Sufiev (tsufiev-x) wrote :

Artem, sorry, this wasn't a question for you, but rather for our Neutron team. If Neutron doesn't give Horizon a way to poll long-running actions, we can't do that nicely in the UI. I have reassigned the bug to them.
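
For readers unfamiliar with what polling a long-running action means on the client side, here is a minimal Python sketch. It assumes the backend exposes some status call that reports a pending state until the work settles; whether Neutron offers that for this operation is exactly the open question in this thread. `fetch_status`, the helper name, and the state names are placeholders, not confirmed Neutron or Horizon APIs.

```python
import time


def wait_until_settled(fetch_status, pending_states, interval=1.0, timeout=600.0):
    """Poll fetch_status() until it leaves pending_states or time runs out.

    fetch_status is any zero-argument callable returning a status string;
    it stands in for a hypothetical "get resource status" API call.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = fetch_status()
        if current not in pending_states:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError("still %s after %.0f s" % (current, timeout))
        time.sleep(interval)


# Fake backend for demonstration: reports PENDING twice, then ACTIVE.
# The state names are placeholders, not confirmed Neutron router states.
answers = iter(["PENDING", "PENDING", "ACTIVE"])
final = wait_until_settled(lambda: next(answers), {"PENDING"}, interval=0.01)
print(final)
```

Each poll is a short request, so no single HTTP call has to stay open for the full duration of the backend operation.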

Changed in mos:
assignee: MOS Horizon (mos-horizon) → MOS Neutron (mos-neutron)
milestone: none → 10.0
status: New → Confirmed
Changed in mos:
assignee: MOS Neutron (mos-neutron) → Igor Zinovik (izinovik)
Thiago Martins (martinx) wrote :

I am seeing this problem:

Timeout when reading response headers from daemon process 'horizon': /usr/share/openstack-dashboard/openstack_dashboard/wsgi/django.wsgi

On a freshly installed Mitaka on top of Ubuntu 16.04.

Igor Zinovik (izinovik) wrote :

It seems that this issue cannot be solved nicely from the NSXv plugin side.

A large timeout value causes UI problems.

If we decrease it, we will get problems when we create a large number of routers.

Dayaanaand Ghule (dbghule) wrote :

Is there any workaround for Ubuntu 16.04?
I see this on a freshly installed Mitaka on top of Ubuntu 16.04.

Igor Zinovik (izinovik)
Changed in mos:
assignee: Igor Zinovik (izinovik) → nobody
Denis Meltsaykin (dmeltsaykin) wrote :

Neutron team, could you please elaborate on possible API enhancements? Can we make this API request asynchronous?
Artem, please create a corresponding bug report in the upstream Neutron project.

tags: added: enhancement
Changed in mos:
status: Confirmed → Won't Fix