MicroCeph step is not idempotent - Error: Microceph Adding disks /dev/vdc failed: {'return-code': 0}

Bug #2065649 reported by Nobuto Murata
This bug affects 1 person
Affects: OpenStack Snap
Status: In Progress
Importance: High
Assigned to: Hemanth Nakkina

Bug Description

$ snap list openstack
Name Version Rev Tracking Publisher Notes
openstack 2024.1 503 2024.1/edge canonical✓ -

With the manifest file below, re-running sunbeam bootstrap fails after the initial openstack model deployment failed and the model was cleaned up.

$ sunbeam cluster bootstrap --manifest deployment_manifest.yaml --role control --role compute --role storage
⠧ Deploying OpenStack Control Plane to Kubernetes (this may take a while) ... waiting for services to come online (23/25)Timed out while waiting for model 'openstack' to be ready
Error: Timed out while waiting for model 'openstack' to be ready

$ juju destroy-model --no-wait --force openstack

$ sunbeam cluster bootstrap --manifest deployment_manifest.yaml --role control --role compute --role storage
Error: Microceph Adding disks /dev/vdc failed: {'return-code': 0}

[manifest file]

$ head -n 20 deployment_manifest.yaml
deployment:
  proxy:
    proxy_required: false
  bootstrap:
    management_cidr: 192.168.123.0/24
  addons:
    metallb: 192.168.123.81-192.168.123.90
  microceph_config:
    sunbeam-1:
      osd_devices: /dev/vdc
software:
  juju:
    bootstrap_args:
      - --debug
      - --model-default=test-mode=true
      - --model-default=disable-telemetry=true
      - --model-default=logging-config=<root>=INFO;unit=DEBUG
  charms:
    aodh-k8s:
      channel: 2024.1/edge

Hemanth Nakkina (hemanth-n) wrote :

@nobuto, can you attach the logs from $HOME/snap/openstack/common/logs/*.log? One of the log files should correspond to the failure.

Nobuto Murata (nobuto) wrote :

There you go.

Hemanth Nakkina (hemanth-n) wrote :

From the logs, it looks like MicroCeph sees no disks:

07:08:15,931 sunbeam.commands.microceph DEBUG Result after running action list-disks on 'microceph/0': {'osds': '[]', 'return-code': 0, 'stdout': "{'osds': [], 'unpartitioned-disks': []}\n", 'unpartitioned-disks': '[]'}
07:08:15,931 sunbeam.commands.microceph DEBUG Unpartitioned disks: []
07:08:15,932 sunbeam.commands.microceph DEBUG OSD disks: []

Can you run the `sudo microceph disk list` command?
Is the /dev/vdc device smaller than 2 GB? MicroCeph seems to ignore devices smaller than 2 GB [1].

[1] https://github.com/canonical/microceph/blob/450240f5dd0d24853354a5a3306a5bd85b429c9a/microceph/cmd/microceph/disk_list.go#L183-L186
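For reference, the `list-disks` result in the log above carries its lists as strings rather than as real lists. A minimal sketch of decoding such a result, assuming the field names and string encoding seen in that single log line (the helper name `parse_list_disks` is hypothetical and is not sunbeam's actual code):

```python
import ast


def parse_list_disks(result):
    """Decode the string-encoded lists in a list-disks action result.

    Assumption: 'osds' and 'unpartitioned-disks' are Python-literal strings,
    as in the logged result. Missing keys are treated as empty lists.
    """
    osds = ast.literal_eval(result.get("osds", "[]"))
    unpartitioned = ast.literal_eval(result.get("unpartitioned-disks", "[]"))
    return osds, unpartitioned


# The result dictionary copied from the DEBUG log line above.
logged_result = {
    "osds": "[]",
    "return-code": 0,
    "stdout": "{'osds': [], 'unpartitioned-disks': []}\n",
    "unpartitioned-disks": "[]",
}

osds, unpartitioned = parse_list_disks(logged_result)
print("OSD disks:", osds)
print("Unpartitioned disks:", unpartitioned)
```

Both lists come back empty for the logged result, which matches the two DEBUG lines that follow it in the log.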

James Page (james-page) wrote :

@hemanth-n I think the issue here is that when the bootstrap is run again, the step that adds disks fails (as the disk has already been added).
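One way to make such a step idempotent is to skip devices that are already OSDs before attempting to add them. The sketch below is illustrative only: `add_disks_idempotently`, `list_osd_paths` and `add_disk` are hypothetical names rather than sunbeam or MicroCeph APIs, and it assumes the existing OSD paths are reported accurately (the logs earlier in this report show an empty OSD list, so that assumption may not hold on every channel).

```python
def disks_to_add(requested, existing_osd_paths):
    """Return the requested devices that are not already OSDs, in order."""
    existing = set(existing_osd_paths)
    return [dev for dev in requested if dev not in existing]


def add_disks_idempotently(requested, list_osd_paths, add_disk):
    """Add only the devices that are missing; a re-run becomes a no-op.

    list_osd_paths: hypothetical callable returning paths of current OSDs.
    add_disk: hypothetical callable that adds a single device.
    """
    pending = disks_to_add(requested, list_osd_paths())
    for dev in pending:
        add_disk(dev)
    return pending


# First run: nothing is an OSD yet, so /dev/vdc gets added.
added = []
add_disks_idempotently(["/dev/vdc"], lambda: [], added.append)

# Second run: /dev/vdc is already an OSD, so nothing is attempted.
second = add_disks_idempotently(["/dev/vdc"], lambda: ["/dev/vdc"], added.append)
print(added, second)
```

With this shape, re-running the step after a failed deployment would skip the add for /dev/vdc instead of raising an error.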

Nobuto Murata (nobuto) wrote :

> 07:08:15,931 sunbeam.commands.microceph DEBUG Result after running action list-disks on 'microceph/0': {'osds': '[]', 'return-code': 0, 'stdout': "{'osds': [], 'unpartitioned-disks': []}\n", 'unpartitioned-disks': '[]'}

I think this part specifically is from reef/edge as per:
https://bugs.launchpad.net/snap-openstack/+bug/2065470/comments/5

I might be able to try once again with latest/edge later.

Nobuto Murata (nobuto) wrote :

I can confirm this happens with microceph latest/edge 47 too.

James Page (james-page)
Changed in snap-openstack:
status: New → In Progress
importance: Undecided → High
assignee: nobody → Hemanth Nakkina (hemanth-n)