One high-load node on the cluster with 20 nodes
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Mirantis OpenStack | Fix Committed | Medium | Sergey Galkin | |
| 5.0.x | Won't Fix | Medium | MOS Cinder | |
| 5.1.x | Won't Fix | Medium | MOS Cinder | |
| 6.0.x | Fix Committed | Medium | MOS Cinder | |
| 6.1.x | Fix Committed | Medium | Sergey Galkin | |
Bug Description

```
api: '1.0'
astute_sha: f5fbd89d1e0e1f2
auth_required: true
build_id: 2014-10-13_00-01-06
build_number: '27'
feature_groups:
- mirantis
fuellib_sha: 46ad455514614ec
fuelmain_sha: 431350ba204146f
nailgun_sha: 88a94a11426d356
ostf_sha: 64cb59c681658a7
production: docker
release: 5.1.1
```
Steps to reproduce:
1. Create a cluster with 20 HW nodes
2. Run Rally tests

Sometimes one node becomes heavily loaded. In my case:
```
14-10-2014 10:13:58  Node 'Untitled (92:9e)' is back online
14-10-2014 10:10:33  Node 'Untitled (92:9e)' has gone away
14-10-2014 10:06:59  Node 'Untitled (92:9e)' is back online
14-10-2014 10:05:03  Node 'Untitled (92:9e)' has gone away
14-10-2014 09:58:31  Node 'Untitled (92:9e)' is back online
14-10-2014 09:55:32  Node 'Untitled (92:9e)' has gone away
14-10-2014 08:52:21  Node 'Untitled (92:9e)' is back online
14-10-2014 08:50:27  Node 'Untitled (92:9e)' has gone away
14-10-2014 08:46:59  Node 'Untitled (92:9e)' is back online
14-10-2014 08:44:57  Node 'Untitled (92:9e)' has gone away
14-10-2014 08:41:34  Node 'Untitled (92:9e)' is back online
14-10-2014 08:39:26  Node 'Untitled (92:9e)' has gone away
```
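The flapping above can be quantified straight from those timestamps. A minimal sketch (assuming the `DD-MM-YYYY HH:MM:SS` format shown in the log) that pairs each "has gone away" with the following "is back online" and reports the offline durations:

```python
from datetime import datetime

# A few of the events as reported (newest first); format taken from the log above.
LOG = [
    "14-10-2014 10:13:58 Node 'Untitled (92:9e)' is back online",
    "14-10-2014 10:10:33 Node 'Untitled (92:9e)' has gone away",
    "14-10-2014 10:06:59 Node 'Untitled (92:9e)' is back online",
    "14-10-2014 10:05:03 Node 'Untitled (92:9e)' has gone away",
    "14-10-2014 09:58:31 Node 'Untitled (92:9e)' is back online",
    "14-10-2014 09:55:32 Node 'Untitled (92:9e)' has gone away",
]

def offline_intervals(lines):
    """Pair each 'has gone away' with the next 'is back online'
    (chronologically) and return the offline durations in seconds."""
    events = sorted(
        (datetime.strptime(line[:19], "%d-%m-%Y %H:%M:%S"),
         "down" if "gone away" in line else "up")
        for line in lines
    )
    durations, down_at = [], None
    for ts, kind in events:
        if kind == "down":
            down_at = ts
        elif down_at is not None:
            durations.append((ts - down_at).total_seconds())
            down_at = None
    return durations

print(offline_intervals(LOG))  # outages of roughly 2-3.5 minutes each
```

For the full log this shows the node dropping out repeatedly for a couple of minutes at a time, consistent with it being too loaded to answer health checks.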
`top` on this node:

```
top - 09:21:17 up 3:36, 2 users, load average: 12.59, 11.67, 9.37
Tasks: 293 total, 1 running, 285 sleeping, 0 stopped, 7 zombie
Cpu(s): 0.9%us, 0.5%sy, 0.0%ni, 54.1%id, 44.5%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 32911996k total, 14377740k used, 18534256k free, 13047952k buffers
Swap: 15999996k total, 0k used, 15999996k free, 147072k cached

  PID USER  PR NI  VIRT  RES SHR S %CPU %MEM   TIME+ COMMAND
23550 root  20  0 12476 1716 608 D    3  0.0 0:03.80 dd
23776 root  20  0 12476 1716 608 D    3  0.0 0:03.21 dd
```
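The `Cpu(s)` line is the tell here: 44.5% iowait against under 2% of user+system time means the load average of ~12 is driven by processes stuck in uninterruptible I/O (the `dd` processes in state `D`), not by CPU work. A quick sketch for pulling the fields out of that line:

```python
import re

def parse_top_cpu(line):
    """Split top's Cpu(s) summary line into a {field: percent} dict."""
    return {k: float(v) for v, k in re.findall(r"([\d.]+)%(\w+)", line)}

cpu = parse_top_cpu(
    "Cpu(s): 0.9%us, 0.5%sy, 0.0%ni, 54.1%id, 44.5%wa, 0.0%hi, 0.0%si, 0.0%st"
)
# iowait dwarfs user+system time: the node is I/O-bound, not CPU-bound
assert cpu["wa"] > cpu["us"] + cpu["sy"]
```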
Part of dmesg:

```
[12685.908490] bio: create slab <bio-1> at 1
[12926.610890] scsi12 : iSCSI Initiator over TCP/IP
[12927.116097] scsi 12:0:0:0: RAID IET Controller 0001 PQ: 0 ANSI: 5
[12927.116236] scsi 12:0:0:0: Attached scsi generic sg2 type 12
[12927.116614] scsi 12:0:0:1: Direct-Access IET VIRTUAL-DISK 0001 PQ: 0 ANSI: 5
[12927.116747] sd 12:0:0:1: Attached scsi generic sg3 type 0
[12927.116963] sd 12:0:0:1: [sdc] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
[12927.117711] sd 12:0:0:1: [sdc] Write Protect is off
[12927.117715] sd 12:0:0:1: [sdc] Mode Sense: 69 00 00 08
[12927.117884] sd 12:0:0:1: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[12927.119533] sdc: sdc1
[12927.120499] sd 12:0:0:1: [sdc] Attached SCSI disk
[12958.600566] sd 12:0:0:1: [sdc] Synchronizing SCSI cache
[12958.874873] connection7:0: detected conn error (1020)
[12986.390070] scsi13 : iSCSI Initiator over TCP/IP
[12986.895534] scsi 13:0:0:0: RAID IET Controller 0001 PQ: 0 ANSI: 5
[12986.895693] scsi 13:0:0:0: Attached scsi generic sg2 type 12
[12986.896150] scsi 13:0:0:1: Direct-Access IET VIRTUAL-DISK 0001 PQ: 0 ANSI: 5
[12986.896262] sd 13:0:0:1: Attached scsi generic sg3 type 0
[12986.896747] sd 13:0:0:1: [sdc] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
[12986.897224] sd 13:0:0:1: [sdc] Write Protect is off
[12986.897228] sd 13:0:0:1: [sdc] Mode Sense: 69 00 00 08
[12986.897401] sd 13:0:0:1: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[12987.102184] sdc: sdc1
[12987.103406] sd 13:0:0:1: [sdc] Attached SCSI disk
[13000.669725] sd 13:0:0:1: [sdc] Synchronizing SCSI cache
[13001.068262] connection8:0: detected conn error (1020)
```
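The repeated `detected conn error (1020)` lines mark iSCSI session drops that coincide with the volume attach/detach churn from the Rally runs. A small sketch for pulling their timestamps (seconds since boot) out of a dmesg capture, to correlate with the node-flapping times:

```python
import re

def conn_error_times(dmesg_lines):
    """Return the seconds-since-boot timestamps of iSCSI
    'detected conn error' events in dmesg output."""
    pat = re.compile(r"\[\s*([\d.]+)\]\s+connection\d+:\d+: detected conn error")
    return [float(m.group(1)) for m in map(pat.match, dmesg_lines) if m]

sample = [
    "[12958.874873] connection7:0: detected conn error (1020)",
    "[12986.390070] scsi13 : iSCSI Initiator over TCP/IP",
    "[13001.068262] connection8:0: detected conn error (1020)",
]
print(conn_error_times(sample))  # -> [12958.874873, 13001.068262]
```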
Changed in mos:
milestone: none → 6.0

no longer affects: fuel

Changed in mos:
status: Confirmed → Won't Fix
Logs from node