Create new instance failed via horizon

Bug #1887589 reported by Yvonne Ding
This bug affects 2 people
Affects Status Importance Assigned to Milestone
StarlingX
Fix Released
High
Lin Shuicheng

Bug Description

Brief Description
-----------------
Creating a new instance via Horizon fails.

Severity
--------
Major

Steps to Reproduce
------------------
1. Login as Tenant
2. Go to Project > Compute > Instance
3. Create a new instance

TC-name:
test_horizon_create_delete_instance

Expected Behavior
-----------------
New instance is created as active

Actual Behavior
----------------
The instance fails to be created

Reproducibility
---------------
reproducible

System Configuration
--------------------
Regular standard 2+2 with stx-openstack installed

Lab-name:
wcp-7-10

Branch/Pull Time/Commit
-----------------------
BUILD_ID="20200709T013419Z"

Timestamp/Logs
--------------
[2020-07-10 15:34:29,904] 61 DEBUG MainThread conftest.update_results:: ***Failure at test call: /usr/local/lib/python3.4/site-packages/selenium/webdriver/remote/errorhandler.py:242: selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: a.close
***Details: instances_pg = <utils.horizon.pages.project.compute.instancespage.InstancesPage object at 0x7f7de8888908>
.....

> assert not instances_pg.find_message_and_dismiss(messages.ERROR)

testcases/functional/horizon/test_instances.py:65:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
utils/horizon/pages/basepage.py:56: in find_message_and_dismiss
    message.close()
utils/horizon/regions/messages.py:45: in close
    self._get_element(*self._close_locator).click()
utils/horizon/regions/baseregion.py:62: in _get_element
    return self.src_elem.find_element(*locator)
/usr/local/lib/python3.4/site-packages/selenium/webdriver/remote/webelement.py:659: in find_element
    {"using": by, "value": value})['value']
/usr/local/lib/python3.4/site-packages/selenium/webdriver/remote/webelement.py:633: in _execute
    return self._parent.execute(command, params)
/usr/local/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py:321: in execute
    self.error_handler.check_response(response)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <selenium.webdriver.remote.errorhandler.ErrorHandler object at 0x7f7de8ca3470>
response = {'status': 404, 'value': '{"value":{"error":"no such element","message":"Unable to locate element: a.close","stacktrac...ror@chrome://marionette/content/error.js:387:5\\nelement.find/</<@chrome://marionette/content/element.js:330:16\\n"}}'}

......

> raise exception_class(message, screen, stacktrace)
E selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: a.close

/usr/local/lib/python3.4/site-packages/selenium/webdriver/remote/errorhandler.py:242: NoSuchElementException
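
The traceback above shows find_message_and_dismiss raising NoSuchElementException because it assumes the message's a.close link is always present. The following is a stdlib-only sketch of a more defensive version of that pattern; it is not the project's actual helper, and the MessageRegion class and exception name are stand-ins for the Selenium objects in the trace.

```python
# Stdlib-only sketch (hypothetical names): dismiss a UI message without
# letting a missing close button abort the whole test.

class NoSuchElementError(Exception):
    """Stand-in for selenium.common.exceptions.NoSuchElementException."""

class MessageRegion:
    def __init__(self, level, has_close_button):
        self.level = level
        self._has_close = has_close_button
        self.closed = False

    def close(self):
        # Mirrors the close() in messages.py, which clicks a.close and
        # raises when that element is absent from the DOM.
        if not self._has_close:
            raise NoSuchElementError("Unable to locate element: a.close")
        self.closed = True

def find_message_and_dismiss(message, level):
    """Return True if a message of the given level was found and dismissed.

    Unlike the helper in the traceback, a vanished close button is treated
    as "message already gone" rather than as a test failure.
    """
    if message is None or message.level != level:
        return False
    try:
        message.close()
    except NoSuchElementError:
        return False
    return True
```

With this shape, the test's `assert not instances_pg.find_message_and_dismiss(messages.ERROR)` would still fail when an error toast is present, but would not crash with NoSuchElementException when the toast disappears before the click.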

logs of .tar and video
https://files.starlingx.kube.cengn.ca/launchpad/1887589

Test Activity
-------------

Yvonne Ding (yding)
description: updated
Revision history for this message
Ghada Khalil (gkhalil) wrote :

stx.4.0 / high - looks to be an OpenStack regression. Marking as stx.4.0 gating since this is a basic operation.

Changed in starlingx:
importance: Undecided → High
status: New → Triaged
assignee: nobody → yong hu (yhu6)
tags: added: stx.4.0 stx.distro.openstack
yong hu (yhu6)
Changed in starlingx:
assignee: yong hu (yhu6) → Lin Shuicheng (shuicheng)
Revision history for this message
Lin Shuicheng (shuicheng) wrote :

Hi Yvonne,
I cannot reproduce the issue.
I tried to create a VM with the "Launch Instance" button on the Instances page.
I tried creating the VM as the admin user and as another user account created in the Identity->Users page.
I tried the 7/12 and 7/14 images I have.
Could you confirm whether the issue still exists with the latest build? If yes, please share detailed reproduction steps.
BTW, the video in the attached log shows only a black screen.

BTW, could you share the log file name and path for your attached log? I cannot use grep to find it, since grep seems to get stuck in a loop when I use it. Thanks.

Revision history for this message
zhipeng liu (zhipengs) wrote :

Creating a VM from Horizon on virtual duplex works on my side.

Changed in starlingx:
status: Triaged → Incomplete
Revision history for this message
Yvonne Ding (yding) wrote : Re: [Bug 1887589] Re: Create new instance failed via horizon

Hi,

I am not the person who set up the lab and ran the Horizon test suites. I was asked to create LPs based on the test report. All logs are attached to the LPs; please check the URL for the video and "collect all" logs. Furthermore, I do not own the lab.

Thanks,

Yvonne


Revision history for this message
Lin Shuicheng (shuicheng) wrote :

Hi Yvonne,
From the log, OpenStack fails to schedule the VM because it cannot communicate with mariadb. There is a patch, https://review.opendev.org/739046, to fix a mariadb runtime issue. That code merged on 7/10, which is after your test build from 7/9.
Please help confirm whether the issue still exists with the latest build.

Here is the error message in the nova conductor:
2020-07-10T12:23:58.541000978Z stdout F 2020-07-10 12:23:58.538 1 WARNING oslo_db.sqlalchemy.exc_filters [-] DBAPIError exception wrapped.: pymysql.err.InternalError: (1047, 'WSREP has not yet prepared node for application use')
2020-07-10T12:23:58.546514318Z stdout F 2020-07-10 12:23:58.542 1 ERROR nova.servicegroup.drivers.db [-] Unexpected error while reporting service status: oslo_db.exception.DBError: (pymysql.err.InternalError) (1047, 'WSREP has not yet prepared node for application use')
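
The WSREP error above is transient: the Galera node is briefly not ready for application use, and (as noted later in this thread) nova's scheduler retries until the node recovers. A hedged, self-contained sketch of that retry-with-backoff pattern, using illustrative names rather than nova's actual code:

```python
import time

class TransientDBError(Exception):
    """Stand-in for the oslo_db DBError wrapping WSREP error 1047."""

def call_with_retry(op, attempts=5, base_delay=0.01):
    # Retry a DB operation that may fail while the Galera/WSREP node
    # has not yet been prepared for application use (error 1047).
    for i in range(attempts):
        try:
            return op()
        except TransientDBError:
            if i == attempts - 1:
                raise  # give up after the final attempt
            time.sleep(base_delay * 2 ** i)  # exponential backoff

# Simulate a mariadb node that becomes ready after two failed calls.
state = {"calls": 0}
def report_service_status():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TransientDBError(
            "(1047, 'WSREP has not yet prepared node for application use')")
    return "ok"
```

Under this model, a brief WSREP outage produces exactly the WARNING/ERROR pairs seen in the conductor log without a permanent failure, which is why these messages alone do not prove the VM creation bug.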

Revision history for this message
Yvonne Ding (yding) wrote :

Hi Shuicheng,

The issue can be reproduced with load 20200717T143848Z, which is after Jul 10.

[2020-07-21 15:58:50,674] 61 DEBUG MainThread conftest.update_results:: ***Failure at test call: /usr/local/lib/python3.6/dist-packages/selenium/webdriver/remote/errorhandler.py:242: selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: a.close
***Details: instances_pg = <utils.horizon.pages.project.compute.instancespage.InstancesPage object at 0x7f6497342c50>

Thanks

Yvonne


Revision history for this message
Lin Shuicheng (shuicheng) wrote :

Could you please attach the collect log for the new ISO?

Revision history for this message
Lin Shuicheng (shuicheng) wrote :

Also, could you share the VM image and the detailed steps for the VM creation?
Then I could try to reproduce it in my environment again.
Thanks.

Revision history for this message
Yvonne Ding (yding) wrote :

BUILD_ID="r/stx.4.0"
BUILD_DATE="2020-07-17 14:38:48 +0000"

Logs of .tar, video, and automation log,
https://files.starlingx.kube.cengn.ca/launchpad/1887589

You should be able to reproduce the issue in your environment as described in the description.

description: updated
Changed in starlingx:
status: Incomplete → New
Revision history for this message
Lin Shuicheng (shuicheng) wrote :

Hi,
I still cannot reproduce the issue with the latest build.
To rule out a problem in the test script itself, could you help reproduce the issue manually?
In your description, the issue occurs when using a tenant account; could you confirm whether it also occurs with the admin account?
I tried both tenant and admin accounts; the tenant account needs to be created with the admin role, otherwise the tenant cannot create a VM due to lack of permission. Could you confirm what the error message in Horizon is? Thanks.

Revision history for this message
Yvonne Ding (yding) wrote :

Hi @Shuicheng Lin,

The issue can be reproduced via the automation regression test case "test_horizon_create_delete_instance".

If you need a manual test instead, please talk to the manager to coordinate with someone else.

yong hu (yhu6)
Changed in starlingx:
status: New → Triaged
Revision history for this message
Lin Shuicheng (shuicheng) wrote :

Waiting for the submitter to help reproduce the issue manually and provide more info about it.
BTW, the mariadb error log message also shows up in normal logs. The reason is that during a host-swact or controller reboot, mariadb takes some time to switch between nodes, making the mariadb service temporarily unavailable. In my test, the nova scheduler retries and always succeeds in creating the VM in the end.
So I cannot yet determine from the log files what caused the VM failure.

Revision history for this message
Yvonne Ding (yding) wrote :

Please note that all logs are already provided, including the tarball, video, and automation log.

The issue can be 100% reproduced with OpenStack on STX master; however, it passes on the WR internal OpenStack master. It looks to me like an STX master problem.

Last but not least, not every automation issue can be reproduced by a manual test, though most can. It looks to me like you assume they are equal.

I am trying a manual test. It would use the same steps, extracted from TIS_AUTOMATION.log. Please let me know if you've tried the same steps.

Thanks,

Revision history for this message
Lin Shuicheng (shuicheng) wrote :

I already checked the tarball for the log and cannot figure out what caused the issue.
The video is black.
The automation log is from a WR-specific test script, and I don't have the code to understand it. The code in the script is only partial.

There are many conditions that could lead to a VM failing to be created, and I cannot reproduce the issue locally based on the simple steps provided in the description; that is why I asked you to reproduce it manually and share more detailed steps.

I don't assume automation is the same as manual. It is only because I don't have the automation test script that I asked you to run it manually.

Changed in starlingx:
status: Triaged → Incomplete
Revision history for this message
Lin Shuicheng (shuicheng) wrote :

@Ghada, it seems to be an environment-specific issue, and I cannot reproduce it locally. Could you help find someone from WR to debug it? Thanks.

Revision history for this message
George Postolache (gpostola) wrote :

I don't know if this is of any help; I tried to reproduce manually, and for this I created a new project named "tenant1" and two users: a member named "tenant1" and an admin named "tenant2".
Logging in with tenant2, the availability zone nova is present and I can launch the instance.
Logging in with tenant1, there is no availability zone, just like in the attached video, and I can't press Launch Instance ("Forbidden. Insufficient permissions for the requested operation" is visible on that page).

Do I need to do anything extra for the normal user?

Revision history for this message
Nicolae Jascanu (njascanu-intel) wrote :

From the error listed, "Message: Unable to locate element: a.close", it seems that Selenium cannot find an HTML element. Maybe the test case should be adapted to the Horizon changes.

Revision history for this message
Yvonne Ding (yding) wrote :

The steps in the LP description are as below,

  Steps to Reproduce
  ------------------
  1. Login as Tenant
  2. Go to Project > Compute > Instance
  3. Create a new instance

Tenant could be admin/tenant1/tenant2 etc., depending on the OpenStack user config.

Thanks,
Yvonne

On 2020-07-29 12:19 p.m., Liu, Yang (YOW) wrote:
> Please note that this happens with a non-admin user.
> I believe someone also mentioned this in the community call today.
>
> -----Original Message-----
> From: Ding, Yuhong (Yvonne)
> Sent: July-29-20 11:56 AM
> To: Lin, Shuicheng; Hu, Yong; Liu, Yang (YOW); Khalil, Ghada; Miller, Frank
> Subject: Re: [Bug 1887589] Re: Create new instance failed via horizon
>
> Hi Yong/Shuicheng,
>
> The video in the attached LP can be opened successfully by one of my WR colleagues. Please double-check the tool you are using.
>
> The issue can be reproduced manually in the WR lab. Attached is the snapshot.
>
> Thanks,
>
> Yvonne
>

Changed in starlingx:
status: Incomplete → New
yong hu (yhu6)
Changed in starlingx:
status: New → Confirmed
Revision history for this message
yong hu (yhu6) wrote :

The issue is confirmed with tenant users on Horizon, so we need to address it in stx.4.0 maintenance releases.

Revision history for this message
Lin Shuicheng (shuicheng) wrote :

It should be a regression caused by upstream OpenStack.
I have opened a new issue in the nova project:
https://bugs.launchpad.net/nova/+bug/1890019

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-armada-app (master)

Fix proposed to branch: master
Review: https://review.opendev.org/744414

Changed in starlingx:
status: Confirmed → In Progress
Revision history for this message
yong hu (yhu6) wrote :

As commented in https://bugs.launchpad.net/nova/+bug/1890019, the issue in upstream Nova was fixed in
https://bugs.launchpad.net/nova/+bug/1869543, so the fix should be applied to openstack-helm instead of Nova.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-armada-app (master)

Reviewed: https://review.opendev.org/744414
Committed: https://git.openstack.org/cgit/starlingx/openstack-armada-app/commit/?id=c6eaec81a6f2e7b50d6d8009767984611c918cf1
Submitter: Zuul
Branch: master

commit c6eaec81a6f2e7b50d6d8009767984611c918cf1
Author: Shuicheng Lin <email address hidden>
Date: Mon Aug 3 15:45:32 2020 +0800

    Fix vm cannot be created with non-admin user in horizon

    VM cannot be created due to non-admin user cannot retrieve resource
    limits info. The reason is nova code since Ussuri has changed limits/
    os-availability-zone's policy to any user. But the policy config in
    openstack-helm is not updated yet, and cause the mismatch between
    code and config.
    Overwrite nova's policy config to align with the code.
    Here is upstream's patch for this policy change:
    limits: 4d37ffc111ae8bb43bd33fe995bc3686b065131b
    os-availability-zone: b8c2de86ed46caf7768027e82519c2418989c36b
    Patch is uploaded in openstack-helm also, and we could abandon this
    overwrite later when we upgrade openstack-helm to include the fix:
    https://review.opendev.org/744392

    Closes-Bug: 1887589

    Change-Id: If637c40fb6b887cdc017aa70c4c5ba145eb5bec3
    Signed-off-by: Shuicheng Lin <email address hidden>
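
The mechanism behind this commit is an oslo.policy override: the chart's stale policy config still gated the limits and availability-zone APIs on admin rules, while Ussuri's in-code defaults open them to any user, and the fix overwrites the chart config to match. A minimal sketch of how such an override changes the effective policy; the exact rule names and old values here are illustrative assumptions, not copied from the patch:

```python
# Sketch of the policy-override behind the fix. In oslo.policy syntax,
# "@" means "always allowed". Rule names/old values are illustrative.

# Stale defaults as shipped by the (pre-fix) openstack-helm config:
chart_policy = {
    "os_compute_api:limits": "rule:admin_or_owner",
    "os_compute_api:os-availability-zone:list": "rule:admin_api",
}

# Override applied by the openstack-armada-app fix, aligning with the
# relaxed in-code defaults introduced in Ussuri:
override = {
    "os_compute_api:limits": "@",
    "os_compute_api:os-availability-zone:list": "@",
}

effective_policy = {**chart_policy, **override}

def is_allowed(policy, action, is_admin):
    """Toy check: '@' allows everyone; other rules require admin here."""
    return policy[action] == "@" or is_admin

# With the override, a non-admin tenant can read limits again, so the
# Launch Instance dialog in Horizon renders instead of failing.
```

This also explains why the overwrite can be dropped later: once openstack-helm itself ships the updated policy (https://review.opendev.org/744392), the chart defaults and the nova code agree again.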

Changed in starlingx:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-armada-app (r/stx.4.0)

Fix proposed to branch: r/stx.4.0
Review: https://review.opendev.org/744713

Ghada Khalil (gkhalil)
tags: added: not-yet-in-r-stx40
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-armada-app (r/stx.4.0)

Reviewed: https://review.opendev.org/744713
Committed: https://git.openstack.org/cgit/starlingx/openstack-armada-app/commit/?id=e903603a0a0836bb22845b6b42a6dd1df914d85a
Submitter: Zuul
Branch: r/stx.4.0

commit e903603a0a0836bb22845b6b42a6dd1df914d85a
Author: Shuicheng Lin <email address hidden>
Date: Mon Aug 3 15:45:32 2020 +0800

    Fix vm cannot be created with non-admin user in horizon

    VM cannot be created due to non-admin user cannot retrieve resource
    limits info. The reason is nova code since Ussuri has changed limits/
    os-availability-zone's policy to any user. But the policy config in
    openstack-helm is not updated yet, and cause the mismatch between
    code and config.
    Overwrite nova's policy config to align with the code.
    Here is upstream's patch for this policy change:
    limits: 4d37ffc111ae8bb43bd33fe995bc3686b065131b
    os-availability-zone: b8c2de86ed46caf7768027e82519c2418989c36b
    Patch is uploaded in openstack-helm also, and we could abandon this
    overwrite later when we upgrade openstack-helm to include the fix:
    https://review.opendev.org/744392

    Closes-Bug: 1887589

    Change-Id: If637c40fb6b887cdc017aa70c4c5ba145eb5bec3
    Signed-off-by: Shuicheng Lin <email address hidden>
    (cherry picked from commit c6eaec81a6f2e7b50d6d8009767984611c918cf1)

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Merged in r/stx.4.0, but not yet picked up in an mtce release
