/var/lib/nova/instances goes into read-only mode without a Nagios alert

Bug #1883576 reported by Steven Parker
This bug affects 1 person
Affects: OpenStack Nova Compute Charm
Status: Triaged
Importance: Wishlist
Assigned to: Unassigned

Bug Description

An NVMe drive (bcache) failed on one of our production nodes.
The backing file system then went into read-only mode.
The Nagios health checks did not detect this.

If the charm does not already include a check for a read-only file system, we should add one.
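
As a rough illustration of what such a check could look like, here is a minimal sketch in Python. It is not taken from the charm: the function names `find_mount` and `check` are hypothetical, and it assumes a Linux host where a file system that was remounted read-only shows the `ro` option in /proc/mounts. The return codes follow the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN).

```python
import os

# Conventional Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def find_mount(path, mounts_text):
    """Return (mount_point, options) for the mount containing `path`.

    `mounts_text` is the content of /proc/mounts and `path` is expected to
    be absolute. The longest matching mount point wins; for identical mount
    points the later entry wins, because later mounts shadow earlier ones.
    """
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        # /proc/mounts escapes spaces in mount points as \040.
        mount_point = fields[1].replace("\\040", " ")
        options = fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, options)
    return best


def check(path, mounts_file="/proc/mounts"):
    """Return (exit_code, message) describing whether `path` is on a read-only mount."""
    try:
        with open(mounts_file) as handle:
            mount = find_mount(os.path.realpath(path), handle.read())
    except OSError as exc:
        return UNKNOWN, "UNKNOWN: cannot read %s: %s" % (mounts_file, exc)
    if mount is None:
        return UNKNOWN, "UNKNOWN: no mount found for %s" % path
    mount_point, options = mount
    if "ro" in options:
        return CRITICAL, "CRITICAL: %s (mount %s) is read-only" % (path, mount_point)
    return OK, "OK: %s (mount %s) is writable" % (path, mount_point)
```

A wrapper script would call `check("/var/lib/nova/instances")`, print the message, and exit with the returned code. How that script would be registered with the charm's Nagios integration is left open here.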

This is one of the errors from the logs under /var/log/nova:

2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task [req-9b336247-0968-469a-8a5c-32bcf090a329 - - - - -] Error during ComputeManager._run_image_cache_manager_pass: OSError: [Errno 30] Read-only file system: '/var/lib/nova/instances/locks/nova-storage-registry-lock'
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task Traceback (most recent call last):
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/oslo_service/periodic_task.py", line 222, in run_periodic_tasks
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task task(self, context)
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/nova/compute/manager.py", line 8708, in _run_image_cache_manager_pass
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task storage_users.register_storage_use(CONF.instances_path, CONF.host)
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/nova/virt/storage_users.py", line 72, in register_storage_use
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task return do_register_storage_use(storage_path, hostname)
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/oslo_concurrency/lockutils.py", line 321, in inner
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task fair=fair):
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task return next(self.gen)
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/oslo_concurrency/lockutils.py", line 269, in lock
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task ext_lock.acquire(delay=delay)
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/fasteners/process_lock.py", line 147, in acquire
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task self._do_open()
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/fasteners/process_lock.py", line 119, in _do_open
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task self.lockfile = open(self.path, 'a')
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task OSError: [Errno 30] Read-only file system: '/var/lib/nova/instances/locks/nova-storage-registry-lock'
2020-06-15 16:18:28.380 782491 ERROR oslo_service.periodic_task
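
The traceback shows the failure surfacing when a file is opened for writing and the kernel returns Errno 30 (EROFS). A health check could probe for the same condition directly by attempting a small write in the instances directory. The following is only a sketch under that assumption: the helper name `is_read_only` and the `.ro-probe-` prefix are made up for illustration, and the probe needs write permission on the directory.

```python
import errno
import tempfile


def is_read_only(directory):
    """Return True if creating a file in `directory` fails with EROFS."""
    try:
        # Create a scratch file; it is deleted again when the block exits.
        with tempfile.NamedTemporaryFile(dir=directory, prefix=".ro-probe-"):
            pass
    except OSError as exc:
        if exc.errno == errno.EROFS:
            return True
        # Other failures (permissions, missing directory) are a different
        # problem, so they are not reported as "read-only".
        raise
    return False
```

For example, `is_read_only("/var/lib/nova/instances")` would return True in the situation reported here. Unlike reading mount options, this probe writes to the file system, so it may be better suited to a low check frequency.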

Changed in charm-nova-compute:
importance: Undecided → Wishlist
status: New → Triaged