compute unlock restart loop

Bug #1808158 reported by Michal
This bug affects 1 person
Affects: StarlingX
Status: Invalid
Importance: Medium
Assigned to: Steven Webster
Milestone: —

Bug Description

Title
-----
compute unlock restart loop

Brief Description
-----------------

After compute bootstrap, switching the node from Locked to Unlocked sends it into a restart loop.

Severity
--------
Critical: System/Feature is not usable after the defect

Steps to Reproduce
------------------
It fails on lab hardware for every compute installation attempt.

Expected Behavior
------------------
Compute should unlock successfully.

Actual Behavior
----------------
Compute cannot be unlocked.

Reproducibility
---------------
100%

System Configuration
--------------------
One controller + one compute

Branch/Pull Time/Commit
-----------------------
master, Nov 27 14:41

Timestamp/Logs
--------------

Here is the console output:

[ OK ] Started OpenStack Neutron Layer 3 Agent.
         Starting OpenStack Neutron Layer 3 Agent...
         Starting Neutron networking agent...
[ OK ] Started Neutron networking agent.
         Starting OpenStack Neutron SR-IOV NIC Agent...
[ OK ] Started OpenStack Neutron SR-IOV NIC Agent.
[ 147.926656] compute_config[1819]: [WARNING]
[ 147.933158] compute_config[1819]: Warnings found. See /var/log/puppet/2018-12-12-13-01-26_compute/puppet.log for details
[ 147.970718] compute_config[1819]: *****************************************************
[ 147.979149] compute_config[1819]: *****************************************************
[ 147.988438] compute_config[1819]: Failed to run the puppet manifest (RC:1)
[ 147.996089] compute_config[1819]: *****************************************************
[ 148.005081] compute_config[1819]: *****************************************************
[ 148.014173] compute_config[1819]: Pausing for 5 seconds...
[ OK ] Started General TIS config gate.
         Starting Titanium Cloud Maintenance Resource Monitor...
         Starting Titanium Cloud Maintenance Filesystem Monitor...
         Starting Titanium Cloud Maintenance Alarm Handler Client...
         Starting Titanium Cloud Maintenance Guest Heartbeat Monitor Server...
         Starting Titanium Cloud Maintenance Goenable Ready...
         Starting Titanium Cloud Maintenance Logger...
         Starting Titanium Cloud Maintenance Heartbeat Client...
         Starting Titanium Cloud Host Guest Messaging Agent...
[ OK ] Started TIS compute config gate.
[ OK ] Started Titanium Cloud Maintenance Resource Monitor.
[ OK ] Started Titanium Cloud Maintenance Alarm Handler Client.
[ OK ] Started Titanium Cloud Maintenance Guest Heartbeat Monitor Server.
[ OK ] Started Titanium Cloud Maintenance Goenable Ready.
[ OK ] Started Titanium Cloud Maintenance Logger.
[ OK ] Started Titanium Cloud Maintenance Heartbeat Client.
[ OK ] Started Titanium Cloud Host Guest Messaging Agent.
         Starting Titanium Cloud Maintenance Command Handler Client...
[ OK ] Started Getty on tty1.
         Starting Getty on tty1...
[ OK ] Started Serial Getty on ttyS0.
         Starting Serial Getty on ttyS0...
[ OK ] Reached target Login Prompts.
         Starting OpenStack Nova Compute Server Pre-Startup...
         Starting Titanium Cloud Nova Init...
[ OK ] Started Titanium Cloud Maintenance Command Handler Client.
[ OK ] Started OpenStack Nova Compute Server Pre-Startup.
[FAILED] Failed to start Titanium Cloud Nova Init.
See 'systemctl status e_nova-init.service' for details.
[ OK ] Started KVM Timer Advance Setup.
         Starting KVM Timer Advance Setup...
         Starting Titanium Cloud Maintenance Compute Goenable Ready...
[ OK ] Started Titanium Cloud Maintenance Compute Goenable Ready.
[ OK ] Started Titanium Cloud Maintenance Filesystem Monitor.
         Starting Titanium Cloud Maintenance Process Monitor...
[ OK ] Started Titanium Cloud Maintenance Process Monitor.
         Starting Titanium Cloud Maintenance Host Watchdog...
[ OK ] Started Titanium Cloud Maintenance Host Watchdog.
[ OK ] Reached target Multi-User System.
         Starting Update UTMP about System Runlevel Changes...
[ OK ] Started Update UTMP about System Runlevel Changes.
         Stopping Availability of block devices...
         Stopping OpenStack ceilometer polling agent...
[ OK ] Removed slice system-systemd\x2dfsck.slice.
[ OK ] Stopped target Timers.
[ OK ] Stopped target Multi-User System.
         Stopping Dynamic System Tuning Daemon...
         Stopping Titanium Cloud Maintenance Compute Goenable Ready...
         Stopping nfscheck...
[ OK ] Stopped Resets System Activity Logs.
         Stopping Resets System Activity Logs...
         Stopping Titanium Cloud Filesystem Initialization...
         Stopping Collectd statistics daemon and extension services...
         Stopping OpenStack Neutron Layer 3 Agen[ OK ] Unmounted /var/lib/nova/instances.
[ OK ] Stopped Authorization Manager.
[ OK ] Stopped Self Monitoring and Reporting Technology (SMART) Daemon.
[ OK ] Stopped Command Scheduler.
[ OK ] Stopped nfscheck.
[ OK ] Stopped LLDP daemon.
[ OK ] Stopped Naming services LDAP client daemon..
[ OK ] Stopped Getty on tty1.
[ OK ] Stopped Serial Getty on ttyS0.
[ OK ] Stopped Availability of block devices.
[ OK ] Stopped OpenStack ceilometer polling agent.
[ OK ] Stopped Titanium Cloud Maintenance Compute Goenable Ready.
[ OK ] Stopped Titanium Cloud Filesystem Initialization.
[ OK ] Failed unmounting RPC Pipe File System.
[ OK ] Stopped Collectd statistics daemon and extension services.
         Stopping StarlingX Cloud Filesystem Auto-mounter...
         Stopping KVM Timer Advance Setup...
[ OK ] Removed slice system-serial\x2dgetty.slice.
[ OK ] Removed slice system-getty.slice.
         Stopping Permit User Sessions...
[ OK ] Stopped target System Time Synchronized.
[ OK ] Stopped OpenStack Neutron Open vSwitch Agent.
[ OK ] Stopped KVM Timer Advance Setup.
[ OK ] Stopped StarlingX Cloud Filesystem Auto-mounter.
[ OK ] Stopped Permit User Sessions.
[ OK ] Stopped OpenStack Nova Compute Server Pre-Startup.
         Stopping OpenStack Nova Compute Server Pre-Startup...
[ OK ] Stopped TIS compute config gate.
         Stopping TIS compute config gate...
[ OK ] Stopped fast remote file copy program daemon.
[ OK ] Stopped Dynamic System Tuning Daemon.
[ OK ] Stopped Titanium Cloud Maintenance Host Watchdog.
         Stopping Titanium Cloud Maintenance Process Monitor...
[ OK ] Stopped Titanium Cloud Maintenance Process Monitor.
         Stopping Titanium Cloud Host Guest Messaging Agent...
         Stopping OpenSSH server daemon...
         Stopping Network Time Service...
         Stopping Service Management Event Recorder Unit...
         Stopping TIS Patching Agent...
         Stopping Titanium Cloud Maintenance Resource Monitor...
         Stopping Titanium Cloud Maintenance Guest Heartbeat Monitor Server...
         Stopping Titanium Cloud Maintenance Filesystem Monitor...
         Stopping OpenStack Neutron SR-IOV NIC Agent...
         Stopping Titanium Cloud Maintenance Goenable Ready...
         Stopping Neutron networking agent...
         Stopping Titanium Cloud Maintenance Logger...
         Stopping ACPI Event Daemon...
         Stopping Titanium Cloud Maintenance Alarm Handler Client...
         Stopping Virtualization daemon...
         Stopping Neutron networking agent...
         Stopping Titanium Cloud Maintenance Command Handler Client...
[ OK ] Stopped Network Time Service.
[ OK ] Stopped Virtualization daemon.
[ OK ] Stopped ACPI Event Daemon.
[ OK ] Stopped Titanium Cloud Host Guest Messaging Agent.
[ OK ] Stopped OpenSSH server daemon.
[ OK ] Stopped TIS Patching Agent.
[ OK ] Stopped OpenStack Neutron SR-IOV NIC Agent.
[ OK ] Stopped Titanium Cloud Maintenance Goenable Ready.
[ OK ] Stopped Titanium Cloud Maintenance Logger.
[ OK ] Stopped Neutron networking agent.
[ OK ] Stopped Titanium Cloud Maintenance Alarm Handler Client.
         Stopping Login Service...
         Stopping Virtual machine lock manager...
         Stopping Titanium Cloud libvirt QEMU cleanup...
[ OK ] Stopped Set time via NTP.
         Stopping Set time via NTP...
         Stopping StarlingX Filesystem Server...
[ OK ] Stopped Virtual machine lock manager.
[ OK ] Stopped Login Service.
[ OK ] Stopped Neutron networking agent.
[ OK ] Stopped Titanium Cloud libvirt QEMU cleanup.
[ OK ] Stopped Service Management Event Recorder Unit.
[ OK ] Stopped Titanium Cloud Maintenance Resource Monitor.
[ OK ] Stopped Titanium Cloud Maintenance Guest Heartbeat Monitor Server.
[ OK ] Stopped Titanium Cloud Maintenance Filesystem Monitor.
[ OK ] Stopped OpenStack Neutron Layer 3 Agent.
[ OK ] Stopped StarlingX Filesystem Server.
[ OK ] Stopped Titanium Cloud Maintenance Command Handler Client.
         Stopping Titanium Cloud Maintenance Heartbeat Client...
[ OK ] Stopped Titanium Cloud Maintenance Heartbeat Client.
[ OK ] Stopped General TIS config gate.
         Stopping General TIS config gate...
         Stopping Titanium Cloud Log Management...
         Stopping computeconfig service...
[ OK ] Stopped Titanium Cloud Log Management.
[ OK ] Stopped computeconfig service.
         Stopping Titanium Cloud Affine Platform...
[ OK ] Stopped target Remote File Systems.
         Unmounting /opt/platform...
         Stopping Titanium Cloud opt-platform mounter...
[ 159.253562] umount[27010]: umount: /opt/platform: not mounted
         Stopping Titanium Cloud System Inventory Agent...
[ OK ] Stopped Clean nova-local thinpool.
         Stopping Clean nova-local thinpool...
[ OK ] Stopped Titanium Cloud Affine Platform.
[ OK ] Unmounted /opt/platform.
[ OK ] Stopped Titanium Cloud opt-platform mounter.
[ OK ] Stopped Titanium Cloud System Inventory Agent.
         Stopping StarlingX Filesystem Common...

/var/log/puppet/2018-12-12-13-01-26_compute/puppet.log
2018-12-12T13:02:31.469 Debug: 2018-12-12 13:02:31 +0000 Executing: 'ovs-ofctl add-flow br-phy0 dl_dst=01:80:c2:00:00:0e,dl_type=0x88cc,hard_timeout=0,idle_timeout=0,in_port=eth0,actions=output:lldpbee5d55d-2d'
2018-12-12T13:02:31.476 Notice: 2018-12-12 13:02:31 +0000 /Stage[main]/Platform::Vswitch::Ovs/Platform::Vswitch::Ovs::Flow[eth0]/Exec[ovs-add-flow: eth0]/returns: ovs-ofctl: eth0: invalid or unknown port for in_port
2018-12-12T13:02:31.478 Error: 2018-12-12 13:02:31 +0000 ovs-ofctl add-flow br-phy0 dl_dst=01:80:c2:00:00:0e,dl_type=0x88cc,hard_timeout=0,idle_timeout=0,in_port=eth0,actions=output:lldpbee5d55d-2d returned 1 instead of one of [0]
2018-12-12T13:02:31.479 /usr/share/ruby/vendor_ruby/puppet/util/errors.rb:106:in `fail'
2018-12-12T13:02:31.481 /usr/share/ruby/vendor_ruby/puppet/type/exec.rb:160:in `sync'
2018-12-12T13:02:31.485 /usr/share/ruby/vendor_ruby/puppet/transaction/resource_harness.rb:236:in `sync'
2018-12-12T13:02:31.486 /usr/share/ruby/vendor_ruby/puppet/transaction/resource_harness.rb:134:in `sync_if_needed'
2018-12-12T13:02:31.489 /usr/share/ruby/vendor_ruby/puppet/transaction/resource_harness.rb:88:in `block in perform_changes'
2018-12-12T13:02:31.491 /usr/share/ruby/vendor_ruby/puppet/transaction/resource_harness.rb:87:in `each'
2018-12-12T13:02:31.493 /usr/share/ruby/vendor_ruby/puppet/transaction/resource_harness.rb:87:in `perform_changes'
2018-12-12T13:02:31.494 /usr/share/ruby/vendor_ruby/puppet/transaction/resource_harness.rb:21:in `evaluate'
2018-12-12T13:02:31.496 /usr/share/ruby/vendor_ruby/puppet/transaction.rb:230:in `apply'
2018-12-12T13:02:31.497 /usr/share/ruby/vendor_ruby/puppet/transaction.rb:246:in `eval_resource'
2018-12-12T13:02:31.499 /usr/share/ruby/vendor_ruby/puppet/transaction.rb:163:in `call'
2018-12-12T13:02:31.500 /usr/share/ruby/vendor_ruby/puppet/transaction.rb:163:in `block (2 levels) in evaluate'
2018-12-12T13:02:31.502 /usr/share/ruby/vendor_ruby/puppet/util.rb:386:in `block in thinmark'
2018-12-12T13:02:31.503 /usr/share/ruby/benchmark.rb:296:in `realtime'
2018-12-12T13:02:31.504 /usr/share/ruby/vendor_ruby/puppet/util.rb:385:in `thinmark'
2018-12-12T13:02:31.506 /usr/share/ruby/vendor_ruby/puppet/transaction.rb:163:in `block in evaluate'
2018-12-12T13:02:31.507 /usr/share/ruby/vendor_ruby/puppet/graph/relationship_graph.rb:118:in `traverse'
2018-12-12T13:02:31.508 /usr/share/ruby/vendor_ruby/puppet/transaction.rb:154:in `evaluate'
2018-12-12T13:02:31.510 /usr/share/ruby/vendor_ruby/puppet/resource/catalog.rb:222:in `block in apply'
2018-12-12T13:02:31.511 /usr/share/ruby/vendor_ruby/puppet/util/log.rb:155:in `with_destination'
2018-12-12T13:02:31.512 /usr/share/ruby/vendor_ruby/puppet/transaction/report.rb:142:in `as_logging_destination'
2018-12-12T13:02:31.514 /usr/share/ruby/vendor_ruby/puppet/resource/catalog.rb:221:in `apply'
2018-12-12T13:02:31.515 /usr/share/ruby/vendor_ruby/puppet/configurer.rb:171:in `block in apply_catalog'
2018-12-12T13:02:31.516 /usr/share/ruby/vendor_ruby/puppet/util.rb:223:in `block in benchmark'
2018-12-12T13:02:31.518 /usr/share/ruby/benchmark.rb:296:in `realtime'
2018-12-12T13:02:31.519 /usr/share/ruby/vendor_ruby/puppet/util.rb:222:in `benchmark'
2018-12-12T13:02:31.520 /usr/share/ruby/vendor_ruby/puppet/configurer.rb:170:in `apply_catalog'
2018-12-12T13:02:31.522 /usr/share/ruby/vendor_ruby/puppet/configurer.rb:343:in `run_internal'
2018-12-12T13:02:31.523 /usr/share/ruby/vendor_ruby/puppet/configurer.rb:221:in `block in run'
2018-12-12T13:02:31.524 /usr/share/ruby/vendor_ruby/puppet/context.rb:65:in `override'
2018-12-12T13:02:31.525 /usr/share/ruby/vendor_ruby/puppet.rb:241:in `override'
2018-12-12T13:02:31.527 /usr/share/ruby/vendor_ruby/puppet/configurer.rb:195:in `run'
2018-12-12T13:02:31.528 /usr/share/ruby/vendor_ruby/puppet/application/apply.rb:350:in `apply_catalog'
2018-12-12T13:02:31.529 /usr/share/ruby/vendor_ruby/puppet/application/apply.rb:274:in `block in main'
2018-12-12T13:02:31.531 /usr/share/ruby/vendor_ruby/puppet/context.rb:65:in `override'
2018-12-12T13:02:31.532 /usr/share/ruby/vendor_ruby/puppet.rb:241:in `override'
2018-12-12T13:02:31.533 /usr/share/ruby/vendor_ruby/puppet/application/apply.rb:225:in `main'
2018-12-12T13:02:31.535 /usr/share/ruby/vendor_ruby/puppet/application/apply.rb:170:in `run_command'
2018-12-12T13:02:31.536 /usr/share/ruby/vendor_ruby/puppet/application.rb:344:in `block in run'
2018-12-12T13:02:31.538 /usr/share/ruby/vendor_ruby/puppet/util.rb:540:in `exit_on_fail'
2018-12-12T13:02:31.539 /usr/share/ruby/vendor_ruby/puppet/application.rb:344:in `run'
2018-12-12T13:02:31.540 /usr/share/ruby/vendor_ruby/puppet/util/command_line.rb:132:in `run'
2018-12-12T13:02:31.542 /usr/share/ruby/vendor_ruby/puppet/util/command_line.rb:72:in `execute'
2018-12-12T13:02:31.543 /usr/bin/puppet:5:in `<main>'
2018-12-12T13:02:31.544 Error: 2018-12-12 13:02:31 +0000 /Stage[main]/Platform::Vswitch::Ovs/Platform::Vswitch::Ovs::Flow[eth0]/Exec[ovs-add-flow: eth0]/returns: change from notrun to 0 failed: ovs-ofctl add-flow br-phy0 dl_dst=01:80:c2:00:00:0e,dl_type=0x88cc,hard_timeout=0,idle_timeout=0,in_port=eth0,actions=output:lldpbee5d55d-2d returned 1 instead of one of [0]
2018-12-12T13:02:31.546 Debug: 2018-12-12 13:02:31 +0000 Platform::Vswitch::Ovs::Flow[eth0]: Resource is being skipped, unscheduling all events
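
For reference, the set of ports OVS actually recognizes on br-phy0 can be listed with the standard OVS tools (a quick diagnostic sketch; it assumes ovs-vsctl and ovs-ofctl are available on the node):

# list ports attached to the bridge, then show their OpenFlow port numbers
ovs-vsctl list-ports br-phy0
ovs-ofctl show br-phy0

If eth0 does not appear in that output, ovs-ofctl has no port to resolve in_port=eth0 against, which is exactly the "invalid or unknown port for in_port" error above.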

Ghada Khalil (gkhalil) wrote:

Michal, Can you confirm that you have VT-d enabled in the BIOS? What type of NIC are you using for the data port?

Michal (michalkr) wrote:

Yes, VT-d is enabled in BIOS.

Here is what I see in the operating system:
virt-host-validate
QEMU: Checking for hardware virtualization : PASS
lscpu |grep Virt
Virtualization: VT-x

I have two NICs, one for mgmt and one for data:
Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)

Steven Webster (swebster-wr) wrote:

This is a similar symptom to https://bugs.launchpad.net/starlingx/+bug/1796380, which to me points to a memory issue, or rather a lack of memory.

The /var/log/openvswitch/ovs-vswitchd.log around the time the puppet manifest is being applied should have some more information to confirm this. Are there any "ERR" messages?
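
For example, a quick filter over that log (assuming the default log path above):

grep 'ERR' /var/log/openvswitch/ovs-vswitchd.log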

It would also be helpful to have the output of the following commands from the compute node. You may want to lock the node first to get out of the reboot loop.

cat /proc/cmdline

ls /sys/devices/system/node/node0/hugepages/
ls /sys/devices/system/node/node1/hugepages/

cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/free_hugepages
cat /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
cat /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/free_hugepages

sudo ovs-vsctl get Open_vSwitch . other_config
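
Equivalently, a small loop (a convenience sketch, assuming the standard sysfs hugepage layout) collects the same per-node counters in one pass:

for d in /sys/devices/system/node/node*/hugepages/hugepages-*; do
    echo "$d: $(cat "$d/nr_hugepages") total, $(cat "$d/free_hugepages") free"
done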

Michal (michalkr) wrote:

ovs-vswitchd.log:
2018-12-12T13:02:31.233Z|00009|dpdk|INFO|DPDK Enabled - initializing...
2018-12-12T13:02:31.233Z|00010|dpdk|INFO|No vhost-sock-dir provided - defaulting to /var/run/openvswitch
2018-12-12T13:02:31.233Z|00011|dpdk|INFO|IOMMU support for vhost-user-client disabled.
2018-12-12T13:02:31.233Z|00012|dpdk|INFO|EAL ARGS: ovs-vswitchd -c 3 --huge-dir /mnt/huge-1048576kB --socket-mem 1024 -n 4
2018-12-12T13:02:31.233Z|00013|dpdk|INFO|EAL: Detected 8 lcore(s)
2018-12-12T13:02:31.234Z|00014|dpdk|WARN|EAL: Some devices want iova as va but pa will be used because..
2018-12-12T13:02:31.234Z|00015|dpdk|WARN|EAL: IOMMU does not support IOVA as VA
2018-12-12T13:02:31.235Z|00016|dpdk|INFO|EAL: 10174 hugepages of size 2097152 reserved, but no mounted hugetlbfs found for that size
2018-12-12T13:02:31.235Z|00017|dpdk|INFO|EAL: Probing VFIO support...
2018-12-12T13:02:31.235Z|00018|dpdk|INFO|EAL: VFIO support initialized
2018-12-12T13:02:31.384Z|00019|dpdk|INFO|EAL: PCI device 0000:02:00.0 on NUMA socket -1
2018-12-12T13:02:31.384Z|00020|dpdk|WARN|EAL: Invalid NUMA socket, default to 0
2018-12-12T13:02:31.384Z|00021|dpdk|INFO|EAL: probe driver: 8086:10fb net_ixgbe
2018-12-12T13:02:31.384Z|00022|dpdk|ERR|EAL: 0000:02:00.0 VFIO group is not viable!
2018-12-12T13:02:31.384Z|00023|dpdk|ERR|EAL: Requested device 0000:02:00.0 cannot be used
2018-12-12T13:02:31.384Z|00024|dpdk|INFO|EAL: PCI device 0000:02:00.1 on NUMA socket -1
2018-12-12T13:02:31.384Z|00025|dpdk|WARN|EAL: Invalid NUMA socket, default to 0
2018-12-12T13:02:31.384Z|00026|dpdk|INFO|EAL: probe driver: 8086:10fb net_ixgbe
2018-12-12T13:02:31.385Z|00027|dpdk|INFO|DPDK Enabled - initialized
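
The "VFIO group is not viable" error normally means that not every device in the NIC's IOMMU group is bound to a VFIO-compatible driver. The group membership can be inspected through sysfs (a diagnostic sketch, using the PCI address from the log above):

# which IOMMU group does the NIC belong to?
readlink /sys/bus/pci/devices/0000:02:00.0/iommu_group
# every device in that group must be bound to vfio-pci for the group to be viable
ls /sys/bus/pci/devices/0000:02:00.0/iommu_group/devices/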

cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-862.11.6.el7.36.tis.x86_64 root=UUID=979017c6-c426-44c8-94c2-94e65951ea5c ro security_profile=standard module_blacklist=integrity,ima audit=0 tboot=false crashkernel=auto biosdevname=0 console=ttyS0,115200 iommu=pt usbcore.autosuspend=-1 hugepagesz=1G hugepages=1 hugepagesz=2M hugepages=0 default_hugepagesz=2M isolcpus=1,5 rcu_nocbs=1-3,5-7 kthread_cpus=0,4 irqaffinity=0,4 selinux=0 enforcing=0 nmi_watchdog=panic,1 softlockup_panic=1 intel_iommu=on user_namespace.enable=1 nopti nospectre_v2

ls /sys/devices/system/node/node0/hugepages/
hugepages-1048576kB hugepages-2048kB

ls /sys/devices/system/node/node1/hugepages/ -> node1 does not exist

cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
1

cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/free_hugepages
0

sudo ovs-vsctl get Open_vSwitch . other_config
{dpdk-extra="-n 4", dpdk-hugepage-dir="/mnt/huge-1048576kB", dpdk-init="true", dpdk-lcore-mask="3", dpdk-socket-mem="1024", pmd-cpu-mask="2"}
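
Reading this together with the kernel command line above: the single reserved 1G hugepage (hugepagesz=1G hugepages=1) is exactly the 1024 MB that dpdk-socket-mem requests, and free_hugepages is already 0, so there is no hugepage headroom left. The overall hugepage accounting can be cross-checked with standard queries (a quick sanity check):

grep -i huge /proc/meminfo
mount | grep -i huge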

Ghada Khalil (gkhalil) wrote:

Marking as Release Gating until further investigation.
Note: this issue has not been reported by anyone else and could be hardware-specific.

tags: added: stx.networking
Changed in starlingx:
importance: Undecided → High
status: New → Triaged
assignee: nobody → Steven Webster (swebster-wr)
tags: added: stx.2019.03
Ghada Khalil (gkhalil) wrote:

Steve will look at the new data after he is back from holidays in January 2019.

Ken Young (kenyis)
tags: added: stx.2019.05
removed: stx.2019.03
Ghada Khalil (gkhalil) wrote:

Hi Michal, our apologies for not following up on this report earlier. Is this still an issue?

There have been many changes in starlingx master since this report was filed, including a change in the deployment to enable container support, as well as support for OVS in a container.

The new installation guides are available at:
https://wiki.openstack.org/wiki/StarlingX/Containers#Installing_and_Configuring_StarlingX_with_Containers

You can get the latest pre-built ISO at:
http://mirror.starlingx.cengn.ca/mirror/starlingx/master/centos/

Sanity reports are sent regularly to <email address hidden>

Please try to deploy your system again with a recent load.

Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05
Ghada Khalil (gkhalil)
Changed in starlingx:
status: Triaged → Incomplete
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: High → Medium
Ghada Khalil (gkhalil) wrote:

From Michal:
Hi,

I was trying to install the latest build, but for some reason it is not booting properly.

On VirtualBox it works as expected (I can see a menu with the available install options), but when I try the same on bare metal it drops me to a bootloader prompt (“grub>”).
Can you advise on this, please?

Thank You
Michal

Ghada Khalil (gkhalil) wrote:

@Michal,
For your bare-metal system, is the install issue you are seeing on controller-0? Are you installing from USB? Is the BIOS set up properly to boot from USB? Or is it perhaps going to disk and finding some other grub config there? It looks like your server is not even getting to the StarlingX installer.

What are the details of the bare-metal server that you are using?

Michal (michalkr) wrote:

It's supposed to be controller-0.
I am installing from a USB stick emulating a CD-ROM drive, as CD-ROM is the only boot method allowed in the BIOS. This worked fine for the initial StarlingX release, and works fine for the TitaniumCloud installer and any other Linux ISO images.

Ghada Khalil (gkhalil) wrote:

The only thing we can think of is that the USB was not burned properly. How did you burn it? I suggest that you re-burn the USB or try another stick. It may also be good to run a sync before ejecting.

The following thread suggests some ways to check whether a USB was written correctly; perhaps you can try some of the suggestions there. Please note that we have not tried these ourselves:
https://unix.stackexchange.com/questions/75483/how-to-check-if-the-iso-was-written-to-my-usb-stick-without-errors
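
For reference, one common way to write the ISO to a stick (an illustrative sketch only; /dev/sdX is a placeholder for the actual USB device, and writing to the wrong device will destroy its contents):

sudo dd if=bootimage.iso of=/dev/sdX bs=4M oflag=direct status=progress
sync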

Don Penney (dpenney) wrote:

I tested this method for verifying USB content, comparing the USB in one of our labs against the installation source:

$ stat -c '%s' bootimage.iso
1881145344
$ md5sum bootimage.iso
8c95d3cf2461b049b459729d922d70bf bootimage.iso

controller-0:/home/wrsroot# head -c 1881145344 /dev/sdc | md5sum
8c95d3cf2461b049b459729d922d70bf -
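
The same check can be written without hard-coding the image size (a small variation on the above, using the same bootimage.iso and /dev/sdc):

head -c "$(stat -c '%s' bootimage.iso)" /dev/sdc | md5sum
md5sum bootimage.iso

If the two digests match, the USB content is byte-identical to the ISO.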

Ghada Khalil (gkhalil) wrote:

Closing this bug as Invalid due to lack of activity from the reporter.
If you still encounter issues, please open a new bug with fresh data.

Changed in starlingx:
status: Incomplete → Invalid