Node deployment failed due to WWN [tinyipa]

Bug #2066711 reported by Nilesh
This bug affects 1 person
Affects: Ironic
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

### Describe the bug

* I am looking for a way to speed up my deployment process.
* If I use a custom IPA image built with [1], which produces the images in [2], I am able to provision the baremetal node.
* In that case I do not see any WWN issue [deploy.write_image].
* The only problems I see are:
  A) ipa.initramfs is almost 1G in size.
  B) Deployment is slow [it scans all hardware and then reports back, which slows things down].
  C) DHCP runs on all networks.
* Here is the <>.json file that I am using for enrolment. [3]

[1]
~~~
ironic-python-agent-builder -o ironic-python-agent ironic-python-agent-ramdisk --release jammy ubuntu
~~~

[2]
~~~
()[root@spare httpboot]# ls -lhrt ubu_ipa/
total 983M
-rw-r--r-- 1 root root 971M May 9 05:52 ipa.initramfs
-rw-r--r-- 1 ironic ironic 12M May 9 05:52 ipa.kernel
-rw-r--r-- 1 root root 94 May 9 07:27 ipa.kernel.sha256
-rw-r--r-- 1 root root 96 May 9 07:28 ipa.initramfs.sha256
()[root@spare httpboot]#
~~~

[3]
~~~
{
    "baremetal1": {
      "name": "baremetal1",
      "driver": "ipmi",
      "driver_info": {
        "ipmi_address": "172.29.xxx.30",
        "ipmi_port": "623",
        "ipmi_username": "xxxxx",
        "ipmi_password": "xxxxx"
      },
      "ipv4_address": "172.xx.xxx.15",
      "ipv4_subnet_mask": "255.255.255.0",
      "ipv4_gateway": "172.xx.xxx.1",
      "ipv4_nameserver": "172.xx.xx.47",
      "inventory_dhcp": true,
      "nics": [
        {
          "mac": "14:23:f2:79:2b:xx"
        }
      ],
      "properties": {
        "cpu_arch": "x86_64",
        "root_device": {"wwn": "0x600062b2169e51002d719a7648740109"}
      },
      "instance_info": {
        "image_source": "http://172.29.xx.30:8080/mini-jammy.qcow2",
        "image_checksum": "eb1b8264b4d0403d6e276eebe9b6c014",
        "configdrive": {
          "meta_data": {
            "hostname": "baremetal_001.xxxxx.lan"
          }
        }
      }
    }
}
~~~

* If I use TinyIPA [1], deployment fails [2]. Note that there is no config change; only the IPA images changed.
* TinyIPA is not able to find a device matching the WWN root device hint.

[1]
~~~
()[root@spare httpboot]# ls -lhrt tiny/
total 205M
-rw-r--r-- 1 root root 5.3M May 6 11:36 ipa.kernel
-rw-r--r-- 1 root root 200M May 6 11:36 ipa.initramfs
()[root@spare httpboot]#
~~~

[2]
~~~
May 17 15:32:17 spare.clear-trail.lan ironic[17844]: 2024-05-17 15:32:17.831 17844 ERROR ironic.conductor.utils [-] Deploy step deploy.write_image failed on node 105394ba-a5ca-404c-a11b-8d0509afcc30. No suitable device was found for deployment using these hints {'wwn': '0x600062b2169e51002d719a7648740109'}
~~~
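One way to narrow this down is to compare the hint with what the block device tools inside the ramdisk actually report. The sketch below is only an illustration, not how Ironic or IPA match hints internally: `find_by_wwn` is a hypothetical helper that reads `NAME WWN` pairs (for example from `lsblk -dn -o NAME,WWN`, assuming `lsblk` is available in the ramdisk and reports a WWN column) and prints the devices whose WWN equals the hint, ignoring case and a leading `0x`.

```shell
#!/bin/sh
# Hypothetical helper, not part of Ironic or IPA.
# Reads "NAME WWN" lines on stdin and prints the names whose WWN matches $1.
find_by_wwn() {
    # Normalise the hint: lower-case it and drop a leading "0x".
    want=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    want=${want#0x}
    awk -v want="$want" '{
        wwn = tolower($2)
        sub(/^0x/, "", wwn)
        if (wwn != "" && wwn == want) print $1
    }'
}

# Example (assumes lsblk exists in the ramdisk):
#   lsblk -dn -o NAME,WWN | find_by_wwn 0x600062b2169e51002d719a7648740109
```

If this prints nothing inside the ramdisk, the tools there do not expose the WWN in the form the hint uses, which would be consistent with the error above.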

#### Expectation

* We know Tiny images are not suitable for production, but they should work for testing the functionality.
* Looking for confirmation on whether we can use tiny images to test baremetal provisioning as described above.

#### Blocker:

* If I use the custom IPA image, total deployment time is around 12 min.

Thanks,
cNilesh.

Julia Kreger (juliaashleykreger) wrote :

Greetings!

The difference is in large part down to the block device utilities we use to collect that data: in TinyIPA they have a number of limitations, which can show up in ways such as this. TinyIPA is also only intended for CI testing of ironic, and should never be used for physical baremetal deployment.

You *really* should be building an IPA image or using a non-tinyipa built image. For example, https://tarballs.opendev.org/openstack/ironic-python-agent/dib/ipa-centos9-master.tar.gz is CentOS 9 based and should work for you.

When you say that a custom-built image takes 12 minutes to deploy, can you provide us some details? We've seen issues in the past on hosts with numerous network interfaces, where it can take a long time to establish network connectivity. If you can break down what is taking so long on the host overall, we might be able to help provide a solution.
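
A rough way to get that breakdown is to pull the conductor log lines for the node and read the timestamps. The sketch below assumes log lines shaped like the `deploy.write_image` error in the bug description (a syslog prefix, then `date time pid LEVEL message`); `node_events` is a hypothetical helper, and the log file name in the example is an assumption that depends on how ironic is deployed.

```shell
#!/bin/sh
# Hypothetical helper: print "timestamp LEVEL message" for every log line
# on stdin that mentions the node UUID given as $1.
node_events() {
    grep -F -- "$1" |
        sed -E 's/^.*([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:.]+) [0-9]+ ([A-Z]+) /\1 \2 /'
}

# Example (the log file name is an assumption):
#   node_events 105394ba-a5ca-404c-a11b-8d0509afcc30 < ironic-conductor.log
```

Large gaps between consecutive timestamps show which step is consuming the time.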

-Julia

Nilesh (cnilesh) wrote :

Hey, thank you so much Julia. Let's eliminate the TinyIPA part; it is of no use for baremetal.

The other major part is that deployment takes around 12 min.

As I said in the bug description:

* If I use a custom IPA image built with [1],
* which produces the images in [2],
* I am able to provision the baremetal node.

* The only problems I see are:
  A) ipa.initramfs is almost 1G in size.
  B) Deployment is slow [it scans all hardware and then reports back, which slows things down].
  C) DHCP runs on all networks.
* Here is the <>.json file that I am using for enrolment. [3]

* My observation is that during deployment it scans all interfaces and takes a long time to resolve network connectivity. I even modified the image and changed the interfaces file to DHCP on only one network, but because the initramfs does its scan before the qcow2 directsync, it hung there for almost 2 to 3 min.

Nilesh (cnilesh) wrote :

@ironic team, any update please? Let me know if you need logs from ironic.

Nilesh (cnilesh) wrote :

I just tweaked this script inside the ramdisk:

~~~
./usr/local/sbin/dhcp-all-interfaces.sh
~~~

~~~
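#!/bin/bash
# Run DHCP on the first non-loopback interface only, instead of on every interface.
# -x matches whole lines, so only "lo" itself is excluded (not names that merely contain "lo").
interfaces=$(ls /sys/class/net/ | grep -vx lo)
first_interface=$(echo "$interfaces" | head -n1)
dhclient -v "$first_interface"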
~~~

~~~
find . | cpio -o -H newc | gzip > ipa.initramfs
~~~

With this change I am able to perform introspection, and it saved a lot of time.
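
A slightly more defensive variant of the same idea is sketched below: instead of taking whichever interface name is listed first, pick the first interface that reports link. `first_linked_interface` is a hypothetical helper, and it assumes the interfaces have already been brought up, since the kernel only reports `carrier` for interfaces that are administratively up.

```shell
#!/bin/sh
# Hypothetical helper: print the first non-loopback interface whose
# "carrier" file reads 1. $1 is the sysfs directory (default /sys/class/net).
first_linked_interface() {
    net_dir=${1:-/sys/class/net}
    for path in "$net_dir"/*; do
        name=${path##*/}
        [ "$name" = "lo" ] && continue
        # carrier is unreadable while an interface is administratively down
        carrier=$(cat "$path/carrier" 2>/dev/null) || continue
        if [ "$carrier" = "1" ]; then
            echo "$name"
            return 0
        fi
    done
    return 1
}

# Example: dhclient -v "$(first_linked_interface)"
```

On a host where the first listed NIC is not cabled to the provisioning network, this avoids waiting on a DHCP request that can never be answered.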

Thanks,
