tap device will disappear after manila-share node restart

Bug #1688155 reported by yankee on 2017-05-04
This bug affects 2 people
Affects: Manila
Importance: Undecided
Assigned to: haobing1

Bug Description

With the generic driver, the tap device disappears after the node is restarted, so manila cannot manage shares that were created by this manila-share service.

yankee (yankeefu) on 2017-05-04
Changed in manila:
assignee: nobody → yankee (yankeefu)

Fix proposed to branch: master
Review: https://review.openstack.org/462438

Changed in manila:
status: New → In Progress
Tom Barron (tpb) wrote :

In what version or versions of OpenStack have you reproduced this issue?

You don't share the generic driver configuration so I can't tell if you have DHSS=False or DHSS=True.

Are the reproduction instructions to restart the node or to restart manila-share? You talk here about restarting the node and in review 462438 about restarting the share service (which would happen if you restart the node but can also be done on its own).

Also, what services are running on the same node as manila-share? Is the L3-agent running on that node? What about the hypervisor/nova services for the SVM running the share service, is it on the same node as manila-share?

I am inclined to think that the manila service instance module that is used by the generic driver is rather fragile as one moves out of an all-in-one environment like devstack, so that is part of why I am asking whether all the relevant services are deployed on one node or whether you are running in a more distributed (more genuinely cloud-like) deployment.

Also, I'd like to check, are you running only one L3 agent or more than one? Are you doing L3 HA? DVR?

yankee (yankeefu) wrote :

log_dir = /var/log/manila
rpc_backend = rabbit

control_exchange = openstack
nova_admin_auth_url=http://192.168.210.2:5000/v2.0
notification_driver=messaging
nova_admin_tenant_name=services
nova_admin_username=nova
nova_admin_password=KKKNEYGG
nova_catalog_info=compute:nova:publicURL
nova_api_insecure=False
neutron_api_insecure=False
neutron_auth_strategy=keystone
neutron_admin_tenant_name=services
neutron_url=http://192.168.210.2:9696
neutron_region_name=RegionOne
neutron_admin_password=HzVzg6hf
cinder_catalog_info=volume:cinder:publicURL
cinder_admin_username=cinder
cinder_admin_password=Yzco2ybf
cinder_cross_az_attach=True
cinder_api_insecure=False
cinder_admin_auth_url=http://192.168.210.2:5000/v2.0
cinder_http_retries=3
cinder_admin_tenant_name=services
neutron_admin_username=neutron
neutron_admin_auth_url=http://192.168.210.2:5000/v2.0
nova_catalog_admin_info=compute:nova:adminURL
neutron_url_timeout=30

[cinder]

[cors]
[cors.subdomain]

[database]
connection = mysql+pymysql://manila:SOt9jSYh@192.168.220.2/manila

[keystone_authtoken]

auth_uri = http://192.168.220.2:35357/v3
auth_version = v3.0
signing_dir = /tmp/keystone-signing-manila
admin_user=manila
admin_tenant_name=services
auth_port=35357
auth_protocol=http
admin_password=EjKglsW5
auth_host=192.168.220.2

[matchmaker_redis]
[neutron]
[nova]
[oslo_concurrency]
lock_path = /tmp/manila/manila_locks
[oslo_messaging_amqp]
server_request_prefix = exclusive
broadcast_prefix = broadcast
group_request_prefix = unicast
container_name = guest
idle_timeout = 0
trace = False
allow_insecure_clients = False

[oslo_messaging_notifications]
[oslo_messaging_rabbit]
amqp_durable_queues = False
rabbit_hosts = 192.168.220.2:5672

rabbit_use_ssl = False

rabbit_userid = nova

rabbit_password = BQtzlGCj
rabbit_virtual_host = /
rabbit_ha_queues = False

[oslo_middleware]
[oslo_policy]

[london]
share_mount_path=/shares
max_time_to_attach=120
automatic_share_server_cleanup=True
delete_share_server_with_last_share=False
share_helpers=CIFS=manila.share.drivers.helpers.CIFSHelperIPAccess,NFS=manila.share.drivers.helpers.NFSHelper
smb_template_config_path=$state_path/smb.conf
share_volume_fstype=ext4
unmanage_remove_access_rules=False
share_backend_name=london
volume_name_template=manila-share-%s
driver_handles_share_servers=True
max_time_to_create_volume=180
share_driver=manila.share.drivers.generic.GenericShareDriver
service_instance_smb_config_path=$share_mount_path/smb.conf
volume_snapshot_name_template=manila-snapshot-%s
manila_service_keypair_name=manila-service
max_time_to_build_instance=300
service_instance_name_template=manila_service_instance_%s
interface_driver=manila.network.linux.interface.OVSInterfaceDriver
service_network_cidr=10.254.0.0/16
path_to_public_key=/root/.ssh/id_rsa.pub
service_network_name=manila_service_network
path_to_private_key=/root/.ssh/id_rsa
service_instance_user=manila
connect_share_server_to_tenant_network=False
service_instance_network_helper_type=neutron
service_instance_security_group=manila-service
service_instance_flavor_id=1
service_instance_password=manila
service_image_name=manila-service-image
service_network_division_mask=28
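
For readers comparing their own setup with the configuration above, here is a minimal sketch that reads a manila.conf-style text and reports, for each backend section, whether driver_handles_share_servers is enabled and which interface driver is set. The helper name backend_summary and the shortened sample text are assumptions made for illustration; the sample is an excerpt of the [london] section, since the pasted configuration starts without a section header and would not parse as-is.

```python
import configparser

# Excerpt of the [london] backend section from the configuration above.
SAMPLE = """
[london]
share_backend_name=london
driver_handles_share_servers=True
share_driver=manila.share.drivers.generic.GenericShareDriver
interface_driver=manila.network.linux.interface.OVSInterfaceDriver
volume_name_template=manila-share-%s
"""


def backend_summary(text):
    """Map each section that defines share_driver to its DHSS mode and interface driver."""
    # interpolation=None keeps values such as "manila-share-%s" literal.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    summary = {}
    for section in parser.sections():
        if not parser.has_option(section, "share_driver"):
            continue  # not a share backend section
        summary[section] = {
            "dhss": parser.getboolean(
                section, "driver_handles_share_servers", fallback=False
            ),
            "interface_driver": parser.get(
                section, "interface_driver", fallback=None
            ),
        }
    return summary


if __name__ == "__main__":
    print(backend_summary(SAMPLE))
```

For the excerpt this reports DHSS=True with the OVS interface driver, which is the setup discussed in this report.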

yankee (yankeefu) wrote :

Hi Tom, the manila-share service is restarted when the node restarts, but the tap device is gone after the node restart. The neutron services and all manila services run on the control node.

yankee (yankeefu) wrote :

Neutron services running on the control node:
neutron-server
neutron-openvswitch-agent
neutron-lbaas-agent
neutron-l3-agent
neutron-metadata-agent

Tom Barron (tpb) wrote :

Thanks for the above information. Do you know if this issue will happen with even just a single controller node? The reason I ask is this. If you have only a single node and this issue still occurs, it is because OVS doesn't persist its configuration through node restart, which I would think is an OVS bug. If on the other hand it only happens with multiple nodes, it may be because the manila service comes up on a new Controller node (as it would normally if you have pacemaker control of Controller node services like manila share) and the integration bridge on the new node was never set up by manila share in the first place.

yankee (yankeefu) wrote :

Thank you, Tom, but I do not have a single-controller-node environment. Only the tap device created by manila on the manila-share service node disappears; other devices survive the node restart. I don't know how this tap device differs from the others, but it would be good if we could check this tap device and the service port before restarting the manila-share service.

I considered whether we could transfer the tap device to another controller node when one node goes down, but I don't think that is a safe action.
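
A check of the kind suggested above could look roughly like the following sketch, which reads a device's MAC address from sysfs on Linux and returns None when the device does not exist. The function name read_device_mac and the device name in the usage note are hypothetical; this is not manila code, only an illustration of how the presence of the device could be verified before the service relies on it.

```python
import os


def read_device_mac(name, sysfs_root="/sys/class/net"):
    """Return the lower-cased MAC address of a network device, or None if it is absent."""
    # On Linux every network device has a directory under /sys/class/net
    # that contains an "address" file holding its MAC address.
    path = os.path.join(sysfs_root, name, "address")
    try:
        with open(path) as handle:
            return handle.read().strip().lower()
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    # "tap-example" is a placeholder; substitute the device name in question.
    print(read_device_mac("tap-example"))
```

A None result after a node restart would correspond to the symptom described in this report. The sysfs_root parameter exists only so the function can be exercised against a temporary directory.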

Change abandoned by yankee (yankeefu@163.com) on branch: master
Review: https://review.openstack.org/462438

Mikhail Kebich (mkebich) wrote :

Hello yankee. Why did you abandon your change? Did you find another solution to the problem? Thanks.

haobing1 (haobing1) on 2018-01-26
Changed in manila:
assignee: yankee (yankeefu) → haobing1 (haobing1)
haobing1 (haobing1) wrote :

I also hit this bug in my devstack environment.
Steps to reproduce:
1. Configure manila.share.drivers.generic.GenericShareDriver as the share_driver.
2. Create a share:
[stack@localhost root]$ manila list
+--------------------------------------+----------+------+-------------+-----------+-----------+-----------------+------+-------------------+
| ID | Name | Size | Share Proto | Status | Is Public | Share Type Name | Host | Availability Zone |
+--------------------------------------+----------+------+-------------+-----------+-----------+-----------------+------+-------------------+
| 05b5903b-ae59-47d6-8450-1ff7f361bcb6 | hb_share | 1 | NFS | available | False | default | | nova |
+--------------------------------------+----------+------+-------------+-----------+-----------+-----------------+------+-------------------+
3. Add an access rule with access-allow:
[stack@localhost root]$ manila access-allow hb_share ip 10.2.0.5

Steps 1, 2 and 3 above all succeed.

4. Get the ifconfig output:
[stack@localhost share]$ ifconfig
enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.1.120 netmask 255.255.255.0 broadcast 192.168.1.255
        inet6 fe80::a00:27ff:fe05:3ed6 prefixlen 64 scopeid 0x20<link>
        ether 08:00:27:05:3e:d6 txqueuelen 1000 (Ethernet)
        RX packets 78484 bytes 6323737 (6.0 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 18494 bytes 9016904 (8.5 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 0 (Local Loopback)
        RX packets 3158923 bytes 1132293474 (1.0 GiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 3158923 bytes 1132293474 (1.0 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

qbrab35157a-08: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
        ether 86:eb:56:7c:68:46 txqueuelen 0 (Ethernet)
        RX packets 254 bytes 25908 (25.3 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

qbrd2952fa6-2c: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
        ether 96:ce:dc:00:f3:2a txqueuelen 0 (Ethernet)
        RX packets 124 bytes 22260 (21.7 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

qvbab35157a-08: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST> mtu 1450
        inet6 fe80::84eb:56ff:fe7c:6846 prefixlen 64 scopeid 0x20<link>
        ether 86:eb:56:7c:68:46 txqueuelen 1000 (Ethernet)
        RX packets 998 bytes 63864 (62.3 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 8738 bytes 730295 (713.1 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

qvbd2952fa6-2c: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST> mtu 1450
        inet6 fe80::94ce:dcff:fe00:f32a pr...
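
Since the listing above is cut off, here is a small sketch of how such ifconfig output could be scanned for a device with a given name prefix. The sample text is a shortened excerpt of the output above, and the helper names and the default "tap" prefix (taken from the wording of this report) are assumptions; adjust the prefix to whatever name the interface driver actually assigns.

```python
import re

# Shortened excerpt of the ifconfig output shown in this comment.
IFCONFIG_SAMPLE = """\
enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.1.120 netmask 255.255.255.0 broadcast 192.168.1.255

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0

qbrab35157a-08: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
        ether 86:eb:56:7c:68:46 txqueuelen 0 (Ethernet)
"""

# An interface block starts at column 0 with "<name>: flags=".
HEADER = re.compile(r"^(\S+): flags=", re.MULTILINE)


def interface_names(ifconfig_output):
    """Return the interface names found in ifconfig output, in order."""
    return HEADER.findall(ifconfig_output)


def has_device_with_prefix(ifconfig_output, prefix="tap"):
    """Tell whether any listed interface name starts with the given prefix."""
    return any(name.startswith(prefix) for name in interface_names(ifconfig_output))


if __name__ == "__main__":
    print(interface_names(IFCONFIG_SAMPLE))
    print(has_device_with_prefix(IFCONFIG_SAMPLE))
```

Running the same check before and after a node restart would show whether the device in question is still listed.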

Reviewed: https://review.openstack.org/539924
Committed: https://git.openstack.org/cgit/openstack/manila/commit/?id=7422aef799dd3f03bf01d40a60fb9a3c5abc8432
Submitter: Zuul
Branch: master

commit 7422aef799dd3f03bf01d40a60fb9a3c5abc8432
Author: yanjun.fu <email address hidden>
Date: Thu May 4 16:41:25 2017 +0800

    Fix tap device disappear after node restart

    When a driver_handles_share_servers driver is used, the tap device
    goes down and its MAC address changes after the node is restarted,
    so manila cannot manage shares that were created by this service.
    This patch fixes the issue. When the manila-share service restarts,
    it calls setup_connectivity_with_service_instances() to create the
    host port and check the MAC address.

    Change-Id: Ibcdd4f58f15a53c69d35db06bc42283859349758
    Closes-Bug:#1688155
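
The idea in the commit message (recreate the host port if it is missing and verify its MAC address when the service starts) can be illustrated with the following sketch. It is not the code from the commit: the function plan_host_port_action and its return values are hypothetical, and only setup_connectivity_with_service_instances() is a name taken from the commit message.

```python
from typing import Optional


def plan_host_port_action(expected_mac: str, actual_mac: Optional[str]) -> str:
    """Decide what to do with the host-side device at service start.

    expected_mac: MAC address recorded for the service port.
    actual_mac:   MAC address currently on the device, or None if the
                  device does not exist (for example after a node restart).
    """
    if actual_mac is None:
        return "plug"      # device is gone, so it has to be created again
    if actual_mac.lower() != expected_mac.lower():
        return "set_mac"   # device exists but carries the wrong MAC address
    return "ok"            # nothing to do


if __name__ == "__main__":
    print(plan_host_port_action("fa:16:3e:00:00:01", None))
    print(plan_host_port_action("fa:16:3e:00:00:01", "FA:16:3E:00:00:01"))
```

Keeping the decision separate from the commands that act on it makes the check easy to test without touching real network devices.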

Changed in manila:
status: In Progress → Fix Released

Reviewed: https://review.openstack.org/554569
Committed: https://git.openstack.org/cgit/openstack/manila/commit/?id=886577e4b0cb3d827409d21125a099824c5ffde8
Submitter: Zuul
Branch: stable/queens

commit 886577e4b0cb3d827409d21125a099824c5ffde8
Author: yanjun.fu <email address hidden>
Date: Thu May 4 16:41:25 2017 +0800

    Fix tap device disappear after node restart

    When a driver_handles_share_servers driver is used, the tap device
    goes down and its MAC address changes after the node is restarted,
    so manila cannot manage shares that were created by this service.
    This patch fixes the issue. When the manila-share service restarts,
    it calls setup_connectivity_with_service_instances() to create the
    host port and check the MAC address.

    Change-Id: Ibcdd4f58f15a53c69d35db06bc42283859349758
    Closes-Bug:#1688155
    (cherry picked from commit 7422aef799dd3f03bf01d40a60fb9a3c5abc8432)

tags: added: in-stable-queens

This issue was fixed in the openstack/manila 7.0.0.0b1 development milestone.

This issue was fixed in the openstack/manila 6.0.1 release.
