resizing filesystems not completed if configured and actual size are close

Bug #1822422 reported by Peng Peng on 2019-03-30
This bug affects 1 person
Affects: StarlingX
Importance: Low
Assigned to: Abraham Arce

Bug Description

Brief Description
-----------------
"resizing filesystems not completed" is printed in sysinv.log every minute.

Severity
--------
Major

Steps to Reproduce
------------------
Check sysinv.log.

Expected Behaviour
------------------
No "resizing filesystems not completed" message is printed.

Actual Behaviour
----------------
"resizing filesystems not completed" is printed.

Reproducibility
---------------
Reproducible
100%

System Configuration
--------------------
Two node system

Branch/Pull Time/Commit
-----------------------
master as of 20190330T013000Z

Timestamp/Logs
--------------
2019-03-30 14:04:14.192 204195 INFO sysinv.conductor.manager [-] _controller_config_active_apply about to resize the filesystem
2019-03-30 14:04:14.192 204195 WARNING sysinv.conductor.manager [-] resizing filesystems
2019-03-30 14:04:14.252 204195 INFO sysinv.conductor.manager [-] resizing filesystems drbd connected
2019-03-30 14:04:14.337 204195 INFO sysinv.conductor.manager [-] drbd-overview: pgsql-75.0, cgcs-9.9, extension-0.96875, patch-vault-0, etcd-4.8, dockerdistribution-16.0
2019-03-30 14:04:14.338 204195 INFO sysinv.conductor.manager [-] lvdisplay: pgsql-76.0, cgcs-10.0, extension-1.0, patch-vault-0, etcd-5.0, dockerdistribution-16.0
2019-03-30 14:04:20.773 204195 WARNING sysinv.conductor.manager [-] resizing filesystems not completed

Ghada Khalil (gkhalil) wrote :

From Wei Zhou:

The problem is that when the resizing started, drbd-overview somehow showed "Connected" instead of "SyncSource" or "PausedSyncS" in its output; therefore the "resize2fs drbd0" command didn't run. See here:

  0:drbd-pgsql/0 Connected Primary/Secondary UpToDate/UpToDate C r----- /var/lib/postgresql ext4 99G 102M 95G 1%
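
To make the check described above concrete, here is a minimal sketch that pulls the connection state out of a drbd-overview line like the one quoted. It is not the sysinv code: the helper names and the assumption that the state is the second whitespace-separated field are mine, based only on the sample line in this comment.

```python
# Hypothetical helpers for illustration; not taken from sysinv.
# States that, per the comment above, were expected before resize2fs runs.
SYNCING_STATES = {"SyncSource", "PausedSyncS"}


def connection_state(overview_line):
    """Return the connection-state field of one drbd-overview line.

    Assumes the layout of the sample line: the first field is
    "<minor>:<resource>/<volume>" and the second is the state.
    """
    fields = overview_line.split()
    return fields[1]


def is_syncing(overview_line):
    """True if the line reports one of the syncing states."""
    return connection_state(overview_line) in SYNCING_STATES


line = ("  0:drbd-pgsql/0 Connected Primary/Secondary UpToDate/UpToDate "
        "C r----- /var/lib/postgresql ext4 99G 102M 95G 1%")
print(connection_state(line), is_syncing(line))
```

With the quoted line, the state is "Connected", so a check that only accepts the syncing states would skip the resize2fs step.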

This issue has happened before. When it did, though, there was no indication that resize2fs hadn't run, and that caused problems later on when the file system was in use. My commit https://git.starlingx.io/cgit/stx-config/commit/?id=0bc78c4c44492f588d43e1f77f750b06e4381d80 added an error indication, like the one you saw in sysinv.log, that resizing did not complete.

I resized the database fs again, and this time it passed.

[root@controller-1 conductor(keystone_admin)]# system controllerfs-list
+--------------------------------------+---------------------+------+-----------------------+------------+-----------+
| UUID | FS Name | Size | Logical Volume | Replicated | State |
| | | in | | | |
| | | GiB | | | |
+--------------------------------------+---------------------+------+-----------------------+------------+-----------+
| 14d87794-3528-43f5-bd9b-ee2f4100a1c6 | backup | 96 | backup-lv | False | available |
| 13b6ac56-9948-4785-b0a9-60039a812343 | database | 38 | pgsql-lv | True | available |
| a548b0da-fa40-403b-9aa8-517609b73487 | docker | 30 | docker-lv | False | available |
| 837ccfcd-98a0-433f-bb3b-e51026a16f90 | docker-distribution | 16 | dockerdistribution-lv | True | available |
| 845735ec-b46c-48cc-ab46-083e60ae804d | etcd | 5 | etcd-lv | True | available |
| afb9ac97-7567-4695-9130-d288bfdb2cfd | extension | 1 | extension-lv | True | available |
| 9bd1fbc4-5042-4201-830e-0590e00f909e | glance | 10 | cgcs-lv | True | available |
| 508281f7-4e23-40d8-b720-794bda97240a | gnocchi | 5 | gnocchi-lv | False | available |
| 1ff1aa08-7c10-4fd4-9a10-59dbba87add0 | img-conversions | 38 | img-conversions-lv | False | available |
| 310035f7-523a-4343-9073-67065ae525cf | scratch | 8 | scratch-lv | False | available |

2019-03-29 08:56:49.051 93401 INFO sysinv.conductor.manager [-] _controller_config_active_apply about to resize the filesystem
2019-03-29 08:56:49.054 93401 WARNING sysinv.conductor.manager [-] resizing filesystems
2019-03-29 08:56:49.275 93401 INFO sysinv.conductor.manager [-] resizing filesystems drbd connected
2019-03-29 08:56:49.370 93401 INFO sysinv.openstack.common.rpc.common [req-e2b6f39f-366a-49fc-b638-d1f3a35709b8 None None] Connected to AMQP server on 192.168.204.2:5672
2019-03-29 08:56:49.499 93401 INFO sysinv.conductor.manager [-] Performed drbdadm resize all
2019-03-29 08:56:49.883 93401 INFO sysinv.conductor.manager [-] drbd-overview: pgsql-75.0, cgcs-9.9, extension-0.96875, patch-vault-0, etcd-4.8, dockerdistribution-16.0
2019-03-29 08:56:49.884 93401 INFO sysinv.conductor.manager [-] lvdisplay: pgsql-76.0, cgcs-10.0, extension-1.0, patch-vault-0, etcd-5.0, dockerdistribution-16.0
2019-03-29 08:56:57.342 93401 WARNING sysinv.conductor.manager [-] resizing filesystems not completed
2019-03-29 08:57:45.283 93401 INFO sysinv.conductor.manager [-] _controller_config_active_apply about ...


Ghada Khalil (gkhalil) wrote :

What is triggering the filesystem resize?

Changed in starlingx:
status: New → Incomplete
Ghada Khalil (gkhalil) wrote :

Need more information from the reporter

Changed in starlingx:
assignee: nobody → Peng Peng (ppeng)
Peng Peng (ppeng) wrote :

The test lab configures the database size as 38, which seems to make

2019-03-29 08:56:49.883 93401 INFO sysinv.conductor.manager [-] drbd-overview: pgsql-75.0, cgcs-9.9, extension-0.96875, patch-vault-0, etcd-4.8, dockerdistribution-16.0
2019-03-29 08:56:49.884 93401 INFO sysinv.conductor.manager [-] lvdisplay: pgsql-76.0, cgcs-10.0, extension-1.0, patch-vault-0, etcd-5.0, dockerdistribution-16.0

75 GiB and 76 GiB too close. drbd keeps requesting a resize, and it never completes.
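
As a rough illustration of how close the two sets of numbers are, the sketch below parses the "name-size" lists from the two log lines above and prints the per-filesystem gap. The function name is hypothetical, and treating the values as GiB is an assumption taken from this comment, not from the sysinv source.

```python
def parse_sizes(summary):
    """Parse 'pgsql-75.0, cgcs-9.9' into {'pgsql': 75.0, 'cgcs': 9.9}.

    Splits on the last '-' so names such as 'patch-vault' stay intact.
    """
    sizes = {}
    for item in summary.split(","):
        name, _, value = item.strip().rpartition("-")
        sizes[name] = float(value)
    return sizes


# Values copied from the drbd-overview and lvdisplay log lines above.
drbd = parse_sizes("pgsql-75.0, cgcs-9.9, extension-0.96875, "
                   "patch-vault-0, etcd-4.8, dockerdistribution-16.0")
lvm = parse_sizes("pgsql-76.0, cgcs-10.0, extension-1.0, "
                  "patch-vault-0, etcd-5.0, dockerdistribution-16.0")

# Gap between the logical volume size and what drbd reports (assumed GiB).
gaps = {name: round(lvm[name] - drbd[name], 5) for name in drbd}
for name, gap in gaps.items():
    print(name, gap)
```

For pgsql the gap is only 1, compared with 5 in the later run where lvdisplay reports pgsql-80.0 and the resize completes.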

We changed the database size to 40 for this lab:

[wrsroot@controller-0 ~(keystone_admin)]$ system controllerfs-modify backup=100 img-conversions=40 database=40
+--------------------------------------+---------------------+------+-----------------------+------------+------------------------------+
| UUID | FS Name | Size | Logical Volume | Replicated | State |
| | | in | | | |
| | | GiB | | | |
+--------------------------------------+---------------------+------+-----------------------+------------+------------------------------+
| c8027a3a-76b6-4240-b9f7-5e04163df776 | backup | 100 | backup-lv | False | available |
| 3e5a36ee-da68-4da3-bc93-94ec54d6308b | database | 40 | pgsql-lv | True | drbd_fs_resizing_in_progress |
| d94efba9-7fe9-40c7-8db7-8ce8a6c23b75 | docker | 30 | docker-lv | False | available |
| fdb6dfda-b064-451e-b741-d1c529dea65e | docker-distribution | 16 | dockerdistribution-lv | True | available |
| 6202cb7c-a2b0-4c98-8c1f-73264e33ef81 | etcd | 5 | etcd-lv | True | available |
| ef8280a4-2dc2-4cd4-9f10-1f041d7dd59e | extension | 1 | extension-lv | True | available |
| c7405b50-f2ca-4814-a91b-7279c9534ba3 | glance | 10 | cgcs-lv | True | available |
| e266ae1c-0261-4ca0-a738-b072ed348c27 | gnocchi | 5 | gnocchi-lv | False | available |
| 9ddfb952-3643-4390-b7e7-09d23a5a88e3 | img-conversions | 40 | img-conversions-lv | False | available |
| dff08790-7496-4466-9017-cdd88669af87 | scratch | 8 | scratch-lv | False | available |
+--------------------------------------+---------------------+------+-----------------------+------------+------------------------------+

After that, the issue no longer appeared.

2019-04-10 18:17:08.371 106353 WARNING sysinv.conductor.manager [-] resizing filesystems
2019-04-10 18:17:08.430 106353 INFO sysinv.conductor.manager [-] resizing filesystems drbd connected
2019-04-10 18:17:08.469 106353 INFO sysinv.conductor.manager [-] Performed drbdadm resize all
2019-04-10 18:17:08.561 106353 INFO sysinv.conductor.manager [-] drbd-overview: pgsql-75.0, cgcs-9.9, extension-0.96875, patch-vault-0, etcd-4.8, dockerdistribution-16.0
2019-04-10 18:17:08.561 106353 INFO sysinv.conductor.manager [-] lvdisplay: pgsql-80.0, cgcs-10.0, extension-1.0, patch-vault-0, etcd-5.0, dockerdistribution-16.0
2019-04-10 18:17:09.729 106353 INFO sysinv.conductor.manager [-] Performed resize2fs drbd0
2019-04-10 18:17:09.730 106353 INFO sysinv.conductor.manager [-] resizing filesystems completed
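
For anyone checking sysinv.log for this, a small sketch that picks out the two outcome messages seen in this thread may help. The regular expression is an assumption fitted to the log lines pasted here, and the function name is hypothetical.

```python
import re

# Pattern fitted to the sysinv.log lines quoted in this bug (an assumption,
# not an official log format definition).
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \d+ "
    r"(?P<level>[A-Z]+) (?P<logger>\S+) \[[^\]]*\] (?P<msg>.*)$"
)


def resize_outcomes(lines):
    """Yield (timestamp, completed) for each resize outcome message."""
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if not match:
            continue
        msg = match.group("msg")
        if msg == "resizing filesystems not completed":
            yield match.group("ts"), False
        elif msg == "resizing filesystems completed":
            yield match.group("ts"), True


sample = [
    "2019-03-30 14:04:14.192 204195 WARNING sysinv.conductor.manager [-] resizing filesystems",
    "2019-03-30 14:04:20.773 204195 WARNING sysinv.conductor.manager [-] resizing filesystems not completed",
    "2019-04-10 18:17:09.730 106353 INFO sysinv.conductor.manager [-] resizing filesystems completed",
]
outcomes = list(resize_outcomes(sample))
for ts, completed in outcomes:
    print(ts, "completed" if completed else "not completed")
```

The same generator can be fed an open file object for sysinv.log, since it only iterates over lines.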

Ghada Khalil (gkhalil) wrote :

Not release gating; marking as low priority.
This appears to be an issue where, if the configured filesystem size is close to the actual filesystem size, the resize doesn't succeed. This is likely a day 1 issue and can be addressed by configuring a larger fs size.

tags: added: stx.config stx.retestneeded
Changed in starlingx:
status: Incomplete → Triaged
importance: Undecided → Low
assignee: Peng Peng (ppeng) → nobody
tags: added: stx.helpwanted
Ghada Khalil (gkhalil) on 2019-04-15
summary: - Containers: resizing filesystems not completed
+ resizing filesystems not completed if configured and actual size are
+ close
Abraham Arce (xe1gyq) on 2019-06-26
Changed in starlingx:
assignee: nobody → Abraham Arce (xe1gyq)