Rally test delete-subnets fails at higher concurrency

Bug #1821963 reported by Sridar Kandaswamy on 2019-03-27
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Undecided
Unassigned

Bug Description

We have been noticing that the rally test delete-subnets fails when concurrency is higher. We first noticed issues with a gbpservice based deployment on neutron with stable/newton. Have tried to triage this further with the community reference implementation from stable/ocata, stable/queens as well as master before submitting this defect.

We notice failures at concurrency level of 6 and higher in older community reference implementation releases. It is better with newer releases and reference implementation master (where we notice failure at concurrency of 17 or higher).

Sometimes we have noticed tracebacks with IP allocation cleanup with a FlushError. But consistently we see that the failure seems to be triggered with a DELETE for subnet (which has already been deleted). For some reason rally is prompted to send another DELETE for a subnet resulting in a 404 or 409.

It is not clear if this is a rally issue or if the retry is prompted due to high latency seen with the DELETE.

Notes below:

1) subnet deletes showing latency as well the few 404.
2) Sample traceback we notice in out gbpservice implementation which I have also seen on community stable/ocata.

1)
Mar 26 12:21:21 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-0b1112a5-be40-4b46-839e-4942328a0280 c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_dCdKhYP8] 10.30.10.147 "DELETE /v2.0/subnets/e02dd6bc-94cd-40e8-941d-1d8d4c2640a5 HTTP/1.1" status: 404 len: 342 time: 13.3713560
Mar 26 12:21:21 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-36220d55-ec76-4fb7-8721-75638ce9e84b c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_dCdKhYP8] 10.30.10.147 "DELETE /v2.0/subnets/e02dd6bc-94cd-40e8-941d-1d8d4c2640a5 HTTP/1.1" status: 204 len: 173 time: 10.7947230
Mar 26 12:21:22 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-8e40b03f-1c1f-4e99-a707-b02b95e95e8f c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_BWCrXuCS] 10.30.10.147 "DELETE /v2.0/subnets/ab5c7118-59c2-4367-83da-8fb1ccf3d33f HTTP/1.1" status: 404 len: 342 time: 14.2172780
Mar 26 12:21:24 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-050fef8e-4342-4dce-a63e-ba14e4ee8890 c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_c0s9ZE8D] 10.30.10.147 "DELETE /v2.0/subnets/b86002a0-731a-474b-b9df-159107b8212e HTTP/1.1" status: 204 len: 173 time: 14.7142339
Mar 26 12:21:24 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-189e4af7-8419-4894-833f-fabbd1d0ad14 c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_eFa0pM1j] 10.30.10.147 "DELETE /v2.0/subnets/27c6e490-61af-4ea0-8723-99b1efdec072 HTTP/1.1" status: 204 len: 173 time: 15.0186520
Mar 26 12:21:26 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-0f7af610-724c-4ac2-bba9-11ef60e07b25 c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_BWCrXuCS] 10.30.10.147 "DELETE /v2.0/subnets/ab5c7118-59c2-4367-83da-8fb1ccf3d33f HTTP/1.1" status: 204 len: 173 time: 15.3438430
Mar 26 12:21:27 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-0e945b06-7d79-4981-8cca-f42419f7d5f2 c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_FaZYbsDZ] 10.30.10.147 "DELETE /v2.0/subnets/1cc7929d-f151-4dc3-a7de-5b7c1ad731e0 HTTP/1.1" status: 204 len: 173 time: 17.7165470
Mar 26 12:21:28 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-f101d5ee-54c5-4919-b5e7-5fd570ffc252 c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_4MdtgRt1] 10.30.10.147 "DELETE /v2.0/subnets/70513aed-2ab3-4ae5-8e70-15180a662a2c HTTP/1.1" status: 204 len: 173 time: 17.8223970
Mar 26 12:21:28 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-623419d7-b9b7-4c83-aa62-1723b2a3964e c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_LwIpTIRg] 10.30.10.147 "DELETE /v2.0/subnets/951b111d-49ce-4e75-ae4f-7359bc11f9da HTTP/1.1" status: 204 len: 173 time: 19.8612812
Mar 26 12:21:30 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-631dab60-c4a3-4a40-8a1e-f972180c4195 c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_Th4lV5RJ] 10.30.10.147 "DELETE /v2.0/subnets/008a7d25-0792-4829-b799-fa22f8bbfd0a HTTP/1.1" status: 204 len: 173 time: 20.2435060
Mar 26 12:21:30 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-9f6273e7-98ef-45d3-a543-f58614381ea6 c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_LQdIj5w1] 10.30.10.147 "DELETE /v2.0/subnets/b9c0d9b5-8f9d-4875-a11c-f9146d7dc7c8 HTTP/1.1" status: 204 len: 173 time: 22.4103460
Mar 26 12:21:31 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-12779a8f-ccde-445b-90bc-908efe5310cb c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_sVGS0nRm] 10.30.10.147 "DELETE /v2.0/subnets/383cb32c-748d-4135-93cd-1a2d25665627 HTTP/1.1" status: 204 len: 173 time: 22.9507060
Mar 26 12:21:33 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-d1c1c2e5-4f90-4eab-a307-51e61b2f9446 c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_e2YPDxKL] 10.30.10.147 "DELETE /v2.0/subnets/af44f539-13ce-47e7-8d0f-84ae8b7bece4 HTTP/1.1" status: 204 len: 173 time: 23.0341308
Mar 26 12:21:33 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-28312ae2-45c1-4aad-b794-2854879aae76 c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_wn9sU6rI] 10.30.10.147 "DELETE /v2.0/subnets/8b3175f5-caa9-4a63-a22e-0704410ab30c HTTP/1.1" status: 204 len: 173 time: 22.4846880
Mar 26 12:21:34 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-1c82f70d-597a-43db-8e8d-a35ff7f60e3f c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_aoKEAwwL] 10.30.10.147 "DELETE /v2.0/subnets/5fdfafb4-1a9a-4e06-b57c-66b74134d023 HTTP/1.1" status: 204 len: 173 time: 23.9002321
Mar 26 12:21:34 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-c4870c76-fd36-45a9-8b37-7275a17a9e2e c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_AnBD8r7f] 10.30.10.147 "DELETE /v2.0/subnets/0e17c0f4-bd9b-4491-860a-1e06bcc94bc2 HTTP/1.1" status: 204 len: 173 time: 23.9638250
Mar 26 12:21:36 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-e8a4faa8-4d3e-4006-8d30-c11939855d93 c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_AnBD8r7f] 10.30.10.147 "DELETE /v2.0/subnets/0e17c0f4-bd9b-4491-860a-1e06bcc94bc2 HTTP/1.1" status: 404 len: 342 time: 27.7147441
Mar 26 12:21:37 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-3fd38d94-2f23-4505-a07a-a0ac0bd7d838 c_rally_2c3dc3ea_v5nfLkdh c_rally_2c3dc3ea_xAOAVRKv] 10.30.10.147 "DELETE /v2.0/subnets/1b367757-2cfc-46af-a142-b6933756f193 HTTP/1.1" status: 204 len: 173 time: 28.8396449
Mar 26 12:21:42 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-63f7d5b3-4eeb-4f7e-b599-17679da7bc79 admin admin] 10.30.10.147 "DELETE /v2.0/subnets/1e2efa29-e509-40d5-8dab-fa70161d621d HTTP/1.1" status: 204 len: 173 time: 0.5462992
Mar 26 12:21:42 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-b584040a-6c20-4f13-99a2-4745afa2cc9a admin admin] 10.30.10.147 "DELETE /v2.0/subnets/44a86ec4-0826-4719-8caa-95523d5c7439 HTTP/1.1" status: 204 len: 173 time: 0.5719988
Mar 26 12:21:43 ubuntu neutron-server[2832]: INFO neutron.wsgi [None req-acc74331-ff42-40c3-aa59-bb3143a4799a admin admin] 10.30.10.147 "DELETE /v2.0/subnets/e9642f17-bda5-4c0f-b0de-0935e8e1e535 HTTP/1.1" status: 204 len: 173 time: 0.6732390

2)
Abridged traceback:

2019-03-20 17:04:57.164 ERROR neutron.api.v2.resource [^[[01;36mreq-e941652b-c1dd-4bda-b810-097e7c0fca33 ^[[00;36mc_rally_78dd9f81_sioGOy9C c_rally_78dd9f81_7zEhCfnG] ^[[01;35mdelete failed: No details.^[[00m
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00mTraceback (most recent call last):
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m File "/opt/stack/neutron/neutron/api/v2/resource.py", line 93, in resource
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m result = method(request=request, **args)
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m File "/opt/stack/neutron/neutron/api/v2/base.py", line 562, in delete
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m return self._delete(request, id, **kwargs)
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m File "/opt/stack/neutron/neutron/db/api.py", line 95, in wrapped
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m setattr(e, '_RETRY_EXCEEDED', True)
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m self.force_reraise()
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m six.reraise(self.type_, self.value, self.tb)
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m File "/opt/stack/neutron/neutron/db/api.py", line 91, in wrapped
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m return f(*args, **kwargs)
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 151, in wrapper
.
.
.

2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m flush_context.execute()
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 373, in execute
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m rec.execute(self)
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 532, in execute
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m uow
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 153, in save_obj
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m base_mapper, states, uowtransaction
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 305, in _organize_states_for_save
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m state_str(existing)))
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00mFlushError: New instance <IPAllocation at 0x7f8f85e94c10> with identity key (<class 'neutron.db.models_v2.IPAllocation'>, ('10.2.2.2', u'c4d25e66-3ef5-42df-b33c-3b1d31fc13b4', u'efa53109-592c-494f-aff3-555f61a26fc8')) conflicts with persistent instance <IPAllocation at 0x7f8f866bd7d0>
2019-03-20 17:04:57.164 TRACE neutron.api.v2.resource ^[[01;35m^[[00m
2019-03-20 17:04:57.188 INFO neutron.wsgi [^[[01;36mreq-e941652b-c1dd-4bda-b810-097e7c0fca33 ^[[00;36mc_rally_78dd9f81_sioGOy9C c_rally_78dd9f81_7zEhCfnG] ^[[01;35m10.30.10.143 "DELETE /v2.0/subnets/7732a064-1e7f-4588-9ad1-9009f310a2a5 HTTP/1.1" status: 500 len: 368 time: 2.8037841^[[00m
2019-03-20 17:04:57.217 ERROR neutron.api.v2.resource [^[[01;36mreq-cd76189b-c8db-49d1-befc-feae5b1c30b7 ^[[00;36mc_rally_78dd9f81_sioGOy9C c_rally_78dd9f81_YSgC0gT2] ^[[01;35mdelete failed: No details.^[[00m

Bence Romsics (bence-romsics) wrote :

Thank you for your bug report. I'm trying to reproduce the problem. Is this the rally scenario you're running?

https://opendev.org/openstack/rally-openstack/src/commit/bb9a58becd88ba5d5d5431844ed52709af222b3d/rally_openstack/scenarios/neutron/network.py#L209

This is what I'm doing (in devstack built on current master just a bit after rc1):

/opt/stack/neutron $ cat rally-jobs/bug-1821963.yaml
{% set floating_network = floating_network or "public" %}

---
  version: 2
  title: Rally Task for OpenStack Neutron CI
  description: >
    The task contains various scenarios to prevent concurrency issues
  subtasks:
    -
      title: Subnet related workloads.
      workloads:
        -
          scenario:
            NeutronNetworks.create_and_delete_subnets:
              network_create_args: {}
              subnet_create_args: {}
              subnet_cidr_start: "1.1.0.0/30"
              subnets_per_network: 2
          runner:
            constant:
              times: 200
              concurrency: 40
          contexts:
            users:
              tenants: 1
              users_per_tenant: 1
            quotas:
              neutron:
                network: -1
                subnet: -1
/opt/stack/neutron $ rally task start rally-jobs/bug-1821963.yaml

But with this I was not able to reproduce the problem yet even at concurrency=40.

Can you please share more detail on how to reproduce this problem?

In your point 2) are you saying you're seeing the same problem in the stable/ocata gate jobs? If yes, could you please link to logs?

Changed in neutron:
status: New → Incomplete
Sridar Kandaswamy (skandasw) wrote :

Many thanks Bence for taking a look at the issue right away. The create_and_delete_subnets scenario has been working well for us as well. The test where I saw the issue was:

{
    "NeutronSubnets.delete_subnets": [
        {
            "runner": {
                "type": "constant",
                "times": 18,
                "concurrency": 18
            },
            "context": {
                "users": {
                    "tenants": 1,
                    "users_per_tenant": 15,
                    "user_choice_method": "round_robin"
                },
                "network": {
                    "subnets_per_network": 18,
                    "dualstack": true,
                    "router": {}
                }
            },
            "sla": {
                "failure_rate": {
                    "max": 0
                }
            }
        }
    ]
}

Pls let me know if I am missing something as well and I can get more data. Many thanks.

Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers