Full cluster reinstallation failed after provisioning (reboot task) according to node get irq 34: nobody cared (try booting with the "irqpoll" option)

Bug #1498330 reported by Tatyanka
This bug affects 1 person
Affects             Status   Importance  Assigned to  Milestone
Fuel for OpenStack  Invalid  Undecided   Tatyanka

Bug Description

There are no defined steps to reproduce; the issue appeared during nightly test runs:
https://product-ci.infra.mirantis.net/view/7.0_swarm/job/7.0.system_test.ubuntu.full_cluster_reinstallation/31/testReport/junit/%28root%29/full_cluster_reinstallation/full_cluster_reinstallation/

At first glance, the test failed for this reason: http://paste.openstack.org/show/473420/

However, this node was not able to work properly after the reboot; see the error from its kernel.log (node-1):
2015-09-22T00:52:59.351780+00:00 err: [ 4631.183946] irq 34: nobody cared (try booting with the "irqpoll" option)
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OX 3.13.0-64-generic #104-Ubuntu
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] ffff8800bc0078a4 ffff8800bfc03d30 ffffffff81723f70 ffff8800bc007800
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] ffff8800bfc03d58 ffffffff810c2282 ffff8800bc007800 0000000000000022
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] 0000000000000000 ffff8800bfc03d98 ffffffff810c27bc 00000000bfc03d78
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] Call Trace:
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] <IRQ> [<ffffffff81723f70>] dump_stack+0x45/0x56
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff810c2282>] __report_bad_irq+0x32/0xd0
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff810c27bc>] note_interrupt+0x24c/0x2a0
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff810bffe9>] handle_irq_event_percpu+0xd9/0x1d0
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff810c011d>] handle_irq_event+0x3d/0x60
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff810c333a>] handle_fasteoi_irq+0x5a/0x100
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff81015e3e>] handle_irq+0x1e/0x30
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff81736e5d>] do_IRQ+0x4d/0xc0
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff8172c46d>] common_interrupt+0x6d/0x6d
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff8106cc70>] ? __do_softirq+0x90/0x2c0
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff8106d215>] irq_exit+0x105/0x110
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff81736e66>] do_IRQ+0x56/0xc0
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff8172c46d>] common_interrupt+0x6d/0x6d
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] <EOI> [<ffffffff8104f596>] ? native_safe_halt+0x6/0x10
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff8101cb2f>] default_idle+0x1f/0xc0
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff8101d406>] arch_cpu_idle+0x26/0x30
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff810bf3a5>] cpu_startup_entry+0xc5/0x290
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff81712177>] rest_init+0x77/0x80
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff81d34f70>] start_kernel+0x438/0x443
2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] [<ffffffff81d34941>] ? repair_env_string+0x5c/0x5c
2015-09-22T00:52:59.351936+00:00 warning: [ 4631.184018] [<ffffffff81d34120>] ? early_idt_handler_array+0x120/0x120
2015-09-22T00:52:59.351936+00:00 warning: [ 4631.184018] [<ffffffff81d345ee>] x86_64_start_reservations+0x2a/0x2c
2015-09-22T00:52:59.351936+00:00 warning: [ 4631.184018] [<ffffffff81d34733>] x86_64_start_kernel+0x143/0x152
2015-09-22T00:52:59.351936+00:00 err: [ 4631.184018] handlers:
2015-09-22T00:52:59.351936+00:00 err: [ 4631.184018] [<ffffffffa00084a0>] e1000_intr [e1000]
2015-09-22T00:52:59.351936+00:00 err: [ 4631.184018] [<ffffffffa00084a0>] e1000_intr [e1000]
2015-09-22T00:52:59.351936+00:00 emerg: [ 4631.184018] Disabling IRQ #34
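
To triage similar failures in other nightly runs, the key facts in a trace like the one above (which IRQ was reported, which handlers were registered on it, and whether the kernel disabled it) can be pulled out of kernel.log automatically. The following is only a minimal sketch: the function name `summarize_bad_irqs` is made up for illustration, and the regular expressions assume the exact message layout shown in this log.

```python
import re

# Patterns assume the message layout seen in the kernel.log excerpt above.
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
HANDLER = re.compile(r"\[<[0-9a-f]+>\]\s+(\S+)\s+\[(\S+)\]")
DISABLING = re.compile(r"Disabling IRQ #(\d+)")


def summarize_bad_irqs(lines):
    """Map each reported IRQ to its registered handlers and disabled state.

    Hypothetical helper: returns {irq: {"handlers": [(func, module), ...],
    "disabled": bool}} for every "nobody cared" report found in `lines`.
    """
    reports = {}
    current = None        # IRQ number of the report being read
    in_handlers = False   # True once the "handlers:" line has been seen
    for line in lines:
        match = NOBODY_CARED.search(line)
        if match:
            current = int(match.group(1))
            reports.setdefault(current, {"handlers": [], "disabled": False})
            in_handlers = False
            continue
        if current is None:
            continue
        if line.rstrip().endswith("handlers:"):
            in_handlers = True
            continue
        match = DISABLING.search(line)
        if match:
            irq = int(match.group(1))
            reports.setdefault(irq, {"handlers": [], "disabled": False})
            reports[irq]["disabled"] = True
            current, in_handlers = None, False
            continue
        if in_handlers:
            match = HANDLER.search(line)
            if match:
                reports[current]["handlers"].append(
                    (match.group(1), match.group(2))
                )
    return reports
```

Fed the lines of the excerpt above, this would report IRQ 34 with two `e1000_intr` handlers from the `e1000` module and mark it as disabled.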

The scenario in which the issue appears:

            1. Create a cluster
            2. Add 3 nodes with controller roles
            3. Add a node with compute and cinder roles
            4. Add a node with mongo role
            5. Deploy the cluster
            6. Create an empty sample file on each node to check that it is not
               available after cluster reinstallation
            7. Reinstall all cluster nodes
            8. Verify that all nodes are reinstalled (not just rebooted),
               i.e. there is no sample file on a node
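
Steps 6 and 8 rely on a marker file that should disappear when a node is really reinstalled. A rough sketch of that check follows. It is not the actual system test code: the marker path and the `run(node, command)` callable (assumed to execute a shell command on a node and return its exit status) are hypothetical placeholders.

```python
# Hypothetical marker path; the real test may use a different file.
MARKER = "/root/reinstall_marker"


def create_markers(nodes, run):
    """Step 6: create an empty sample file on every node."""
    for node in nodes:
        status = run(node, f"touch {MARKER}")
        if status != 0:
            raise RuntimeError(f"could not create marker on {node}")


def nodes_not_reinstalled(nodes, run):
    """Step 8: return the nodes where the marker file still exists.

    `test -e` exits with 0 when the file is present, so a 0 status means
    the node was only rebooted and not reinstalled.
    """
    return [node for node in nodes if run(node, f"test -e {MARKER}") == 0]
```

An empty list from `nodes_not_reinstalled` after step 7 would mean every node lost its marker, i.e. every node was reinstalled.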

Actual result:
After the reboot task during installation (the re-provisioning task), the node hit the "irq 34: nobody cared" error, IRQ #34 was disabled, and the node became inoperable. As a result, the provisioning task failed with the message at http://paste.openstack.org/show/473420/

root@node-1:~# cat /proc/interrupts
           CPU0
  0: 7 IO-APIC-edge timer
  1: 10 IO-APIC-edge i8042
  2: 0 XT-PIC-XT-PIC cascade
  8: 0 IO-APIC-edge rtc0
 12: 144 IO-APIC-edge i8042
 14: 0 IO-APIC-edge ata_piix
 15: 0 IO-APIC-edge ata_piix
 34: 61959 IO-APIC-fasteoi eth2, eth3
 35: 15851 IO-APIC-fasteoi virtio3, eth0, eth4, eth1
 40: 0 PCI-MSI-edge virtio0-config
 41: 18175 PCI-MSI-edge virtio0-requests
 42: 0 PCI-MSI-edge virtio1-config
 43: 1442 PCI-MSI-edge virtio1-requests
 44: 0 PCI-MSI-edge virtio2-config
 45: 1283 PCI-MSI-edge virtio2-requests
 46: 1 PCI-MSI-edge xhci_hcd
 47: 0 PCI-MSI-edge xhci_hcd
NMI: 0 Non-maskable interrupts
LOC: 106613 Local timer interrupts
SPU: 0 Spurious interrupts
PMI: 0 Performance monitoring interrupts
IWI: 19863 IRQ work interrupts
RTR: 0 APIC ICR read retries
RES: 0 Rescheduling interrupts
CAL: 0 Function call interrupts
TLB: 0 TLB shootdowns
TRM: 0 Thermal event interrupts
THR: 0 Threshold APIC interrupts
MCE: 0 Machine check exceptions
MCP: 1 Machine check polls
ERR: 0
MIS: 0
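
In the output above, IRQ 34 is shared by eth2 and eth3, which matches the two e1000_intr handlers listed in the kernel trace. A small sketch for spotting such shared lines is given below. The function name `shared_irqs` is made up for illustration, and the parser assumes the column layout shown above (one count per CPU, a single controller/type column, then a comma-separated device list); other kernel versions may lay the columns out differently.

```python
def shared_irqs(text):
    """Return {irq: (total_count, [devices])} for IRQs with several devices.

    Assumes the /proc/interrupts layout shown above: a header row naming
    the CPUs, then per-IRQ rows of counts, one type column, and devices.
    """
    lines = text.strip("\n").splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue  # skip summary rows such as NMI, LOC, ERR
        fields = rest.split()
        counts = [int(value) for value in fields[:ncpus]]
        # fields[ncpus] is the controller/type column; devices follow it.
        device_text = " ".join(fields[ncpus + 1:])
        devices = [d.strip() for d in device_text.split(",") if d.strip()]
        if len(devices) > 1:
            shared[int(label)] = (sum(counts), devices)
    return shared
```

On node-1 the input could be obtained with `open("/proc/interrupts").read()`; for the listing above the result would contain IRQs 34 and 35.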

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "7.0"
  openstack_version: "2015.1.0-7.0"
  api: "1.0"
  build_number: "300"
  build_id: "300"
  nailgun_sha: "52ed7e621cffe8b448b28d3d98a5ec722b75a3e1"
  python-fuelclient_sha: "486bde57cda1badb68f915f66c61b544108606f3"
  fuel-agent_sha: "50e90af6e3d560e9085ff71d2950cfbcca91af67"
  fuel-nailgun-agent_sha: "d7027952870a35db8dc52f185bb1158cdd3d1ebd"
  astute_sha: "6c5b73f93e24cc781c809db9159927655ced5012"
  fuel-library_sha: "6afdcd342a9e8e2670e496de4160e5897d5d6be0"
  fuel-ostf_sha: "2cd967dccd66cfc3a0abd6af9f31e5b4d150a11c"
  fuelmain_sha: "628bfb1ba8413dc69a83b8514d02ecac8a07a7ca"

Tags: area-qa
Tatyanka (tatyana-leontovich) wrote :
description: updated
Changed in fuel:
importance: Medium → Undecided
summary: - Full cluster reinstallation failed after provisioning rebbot task
+ Full cluster reinstallation failed after provisioning (reboot task)
according to node get irq 34: nobody cared (try booting with the
"irqpoll" option)
Changed in fuel:
assignee: nobody → MOS Linux (mos-linux)
Alexei Sheplyakov (asheplyakov) wrote :

> 2015-09-22T00:52:59.351780+00:00 warning: [ 4631.184018] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011

Please post the exact qemu command line (or libvirt XML describing the VM)

Changed in fuel:
status: New → Incomplete
assignee: MOS Linux (mos-linux) → Tatyanka (tatyana-leontovich)
Alexei Sheplyakov (asheplyakov) wrote :

Too little data for any meaningful conclusion (the host system being overloaded, a qemu bug, a kernel bug,
cosmic rays, aliens), and no clear steps to reproduce the problem. Setting as Incomplete.

description: updated
Changed in fuel:
milestone: 7.0 → 8.0
Dmitry Pyzhov (dpyzhov)
tags: added: area-qa
Tatyanka (tatyana-leontovich) wrote :

Moving to Invalid, since this behaviour has not been reproduced.

Changed in fuel:
status: Incomplete → Invalid