cannot set hmcdrv cachesize to 1GB

Bug #1806115 reported by Dimitri John Ledkov
This bug affects 1 person
Affects                   Status     Importance  Assigned to  Milestone
Ubuntu on IBM z Systems   Won't Fix  Undecided   bugproxy
linux (Ubuntu)            Won't Fix  Undecided   Unassigned

Bug Description

On an LPAR, execute Customize/Delete Activation Profile. Customize profile. In security section, tick "Input/Output (I/O) configuration control". Save. Ok. Close.

Execute Deactivate task.
Execute Activate task.

ssh into lpar.

Try to modprobe the hmcdrv module with cachesize set to 1GiB, then step down until a value is accepted.

First, observe that suffixes do not appear to be accepted, despite the documentation stating that they should be supported:

ubuntu@s1lp7:~$ sudo modprobe hmcdrv cachesize=1G
modprobe: ERROR: could not insert 'hmcdrv': Invalid argument

ubuntu@s1lp7:~$ sudo modprobe hmcdrv cachesize=512M
modprobe: ERROR: could not insert 'hmcdrv': Invalid argument
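Since the suffixed forms are rejected, the byte counts have to be spelled out by hand. A minimal sketch of that conversion follows; `to_bytes` is a hypothetical helper, not part of hmcdrv or modprobe, and it assumes the suffixes are binary multiples (K = 2^10, M = 2^20, G = 2^30).

```python
# Hypothetical helper: convert a size such as "512M" into a plain byte count
# that can be passed as cachesize=<bytes>. Assumes binary multiples.
SUFFIXES = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def to_bytes(text: str) -> int:
    """Return the number of bytes described by `text` (e.g. "1G", "4096")."""
    text = text.strip()
    factor = SUFFIXES.get(text[-1:].upper())
    if factor is None:
        # No recognised suffix: treat the whole string as a byte count.
        return int(text)
    return int(text[:-1]) * factor


for value in ("1G", "512M", "1048576"):
    print(f"cachesize={to_bytes(value)}")
```

With this, `1G` becomes 1073741824 and `512M` becomes 536870912, which are the next values in the halving sequence above 134217728.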

Next try to request largish amounts of cache, and note how those calls fail:
$ sudo modprobe hmcdrv cachesize=134217728
modprobe: ERROR: could not insert 'hmcdrv': Cannot allocate memory

$ sudo modprobe hmcdrv cachesize=67108864
modprobe: ERROR: could not insert 'hmcdrv': Cannot allocate memory

$ sudo modprobe hmcdrv cachesize=33554432
modprobe: ERROR: could not insert 'hmcdrv': Cannot allocate memory

$ sudo modprobe hmcdrv cachesize=16777216
modprobe: ERROR: could not insert 'hmcdrv': Cannot allocate memory

$ sudo modprobe hmcdrv cachesize=8388608
modprobe: ERROR: could not insert 'hmcdrv': Cannot allocate memory

$ sudo modprobe hmcdrv cachesize=4194304
modprobe: ERROR: could not insert 'hmcdrv': Cannot allocate memory

$ sudo modprobe hmcdrv cachesize=2097152
modprobe: ERROR: could not insert 'hmcdrv': Cannot allocate memory

$ sudo modprobe hmcdrv cachesize=2097152
modprobe: ERROR: could not insert 'hmcdrv': Cannot allocate memory

# 1MiB + 2048 bytes
$ sudo modprobe hmcdrv cachesize=1050624
modprobe: ERROR: could not insert 'hmcdrv': Cannot allocate memory

$ sudo modprobe hmcdrv cachesize=1048576
$ echo $?
0

The LPAR in question is configured with 2000MB of storage (RAM/memory), so imho it should be able to handle a cachesize of more than 1MiB when more than 900MiB of storage is free:

$ free -h
              total        used        free      shared  buff/cache   available
Mem:          1.7Gi       348Mi       913Mi       0.0Ki       459Mi       1.3Gi
Swap:         1.9Gi          0B       1.9Gi

Is 1MiB some kind of cachesize limit in the hmcdrv module? Can it be fixed and/or raised, or made unlimited? I'm looking forward to being able to use a 1GiB cachesize.
---
ProblemType: Bug
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access '/dev/snd/': No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.10-0ubuntu14
Architecture: s390x
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
DistroRelease: Ubuntu 19.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb: Error: command ['lsusb'] failed with exit code 1:
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=C
 SHELL=/bin/bash
ProcFB: Error: [Errno 2] No such file or directory: '/proc/fb'
ProcKernelCmdLine: root=UUID=0972d5df-b2f2-499e-a787-9fb90b7ebf00 crashkernel=256M break=bottom BOOT_IMAGE=1
ProcVersionSignature: Ubuntu 4.18.0-11.12-generic 4.18.12
RelatedPackageVersions:
 linux-restricted-modules-4.18.0-11-generic N/A
 linux-backports-modules-4.18.0-11-generic N/A
 linux-firmware 1.176
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
Tags: disco
Uname: Linux 4.18.0-11-generic s390x
UpgradeStatus: Upgraded to disco on 2018-11-20 (10 days ago)
UserGroups: adm cdrom cpacfstats dip kvm libvirt lpadmin lxd plugdev sambashare sbuild sudo
_MarkForUpload: True

Dimitri John Ledkov (xnox) wrote : AudioDevicesInUse.txt

apport information

tags: added: apport-collected disco
description: updated

Dimitri John Ledkov (xnox) wrote : CurrentDmesg.txt

apport information

Dimitri John Ledkov (xnox) wrote : Lspci.txt

apport information

Dimitri John Ledkov (xnox) wrote : ProcCpuinfo.txt

apport information

Dimitri John Ledkov (xnox) wrote : ProcCpuinfoMinimal.txt

apport information

Dimitri John Ledkov (xnox) wrote : ProcInterrupts.txt

apport information

Dimitri John Ledkov (xnox) wrote : ProcModules.txt

apport information

Dimitri John Ledkov (xnox) wrote : UdevDb.txt

apport information

Dimitri John Ledkov (xnox) wrote : WifiSyslog.txt

apport information

Dimitri John Ledkov (xnox) wrote :

dmesg output:

[ 61.741644] Hardware name: IBM 2964 N63 400 (LPAR)
[ 61.741645] Krnl PSW : 00000000d4fd3f9b 000000008aabe5b8 (__alloc_pages_nodemask+0xe96/0x1068)
[ 61.741649] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[ 61.741650] Krnl GPRS: 0000000000000000 00000000006000c1 00000000006000c1 0000000000000000
[ 61.741651] 0000000000000000 000000006fff3700 0000000000000000 000003ff8117c800
[ 61.741652] 0000000000b39000 000003ff8117ca50 000003ff8117f000 0000000000000000
[ 61.741653] 0000000000000000 0000000000901b00 00000000002e56e8 000000005c18f9b8
[ 61.741661] Krnl Code: 00000000002e645a: a7110200 tmll %r1,512
                          00000000002e645e: a774f94a brc 7,2e56f2
                         #00000000002e6462: a7f40001 brc 15,2e6464
                         >00000000002e6466: a7f4f946 brc 15,2e56f2
                          00000000002e646a: e31003380004 lg %r1,824
                          00000000002e6470: d503f15819c0 clc 344(4,%r15),2496(%r1)
                          00000000002e6476: a784febc brc 8,2e61ee
                          00000000002e647a: a7f4fa73 brc 15,2e5960
[ 61.741674] Call Trace:
[ 61.741676] ([<00000000002e56e8>] __alloc_pages_nodemask+0x118/0x1068)
[ 61.741677] [<00000000002e0bca>] __get_free_pages+0x2a/0x60
[ 61.741680] [<000003ff8117a2e0>] hmcdrv_cache_startup+0x68/0xd0 [hmcdrv]
[ 61.741682] [<000003ff8117f03e>] hmcdrv_mod_init+0x3e/0x1000 [hmcdrv]
[ 61.741684] [<0000000000100878>] do_one_initcall+0x48/0x1d0
[ 61.741687] [<0000000000216f52>] do_init_module+0x7a/0x260
[ 61.741688] [<00000000002196e6>] load_module+0x16ee/0x1968
[ 61.741689] [<0000000000219bce>] sys_finit_module+0xb6/0xf8
[ 61.741693] [<00000000008f0dc0>] system_call+0xdc/0x2c8
[ 61.741693] Last Breaking-Event-Address:
[ 61.741695] [<00000000002e6462>] __alloc_pages_nodemask+0xe92/0x1068
[ 61.741696] ---[ end trace 3870bf3953e53424 ]---
[ 61.741697] hmcdrv.a3150c: Allocating the requested cache size of 134217728 bytes failed
[ 108.285532] hmcdrv.a3150c: Allocating the requested cache size of 67108864 bytes failed
[ 120.229043] hmcdrv.a3150c: Allocating the requested cache size of 33554432 bytes failed
[ 130.020224] hmcdrv.a3150c: Allocating the requested cache size of 16777216 bytes failed
[ 143.255667] hmcdrv.a3150c: Allocating the requested cache size of 8388608 bytes failed
[ 155.345222] hmcdrv.a3150c: Allocating the requested cache size of 4194304 bytes failed
[ 165.291445] hmcdrv.a3150c: Allocating the requested cache size of 2097152 bytes failed
[ 173.740144] hmcdrv.a3150c: Allocating the requested cache size of 2097152 bytes failed
[ 279.595926] hmcdrv: `1G' invalid for parameter `cachesize'
[ 285.847067] hmcdrv: `512M' invalid for parameter `cachesize'
[ 708.461429] hmcdrv.a3150c: Allocating the requested cache size of 1050624 bytes failed

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed

Dimitri John Ledkov (xnox) wrote :

Looking at the source code, it seems that the __get_free_pages call is failing. I don't know the significance of that call, though, or why it fails above the 1MiB limit.

/**
 * hmcdrv_cache_startup() - startup of HMC drive cache
 * @cachesize: cache size
 *
 * Return: 0 on success, else a (negative) error code
 */
int hmcdrv_cache_startup(size_t cachesize)
{
 if (cachesize > 0) { /* perform caching ? */
  hmcdrv_cache_order = get_order(cachesize);
  hmcdrv_cache_file.content =
   (void *) __get_free_pages(GFP_KERNEL | GFP_DMA,
        hmcdrv_cache_order);

  if (!hmcdrv_cache_file.content) {
   pr_err("Allocating the requested cache size of %zu bytes failed\n",
          cachesize);
   return -ENOMEM;
  }

  pr_debug("content cache enabled, size is %zu bytes\n",
    cachesize);
 }

 hmcdrv_cache_file.len = cachesize;
 return 0;
}

Frank Heimes (fheimes) wrote :

According to the device drivers guide
'Linux on Z and LinuxONE - Device Drivers, Features, and Commands - Development stream (Kernel 4.18)' (page 482)
http://public.dhe.ibm.com/software/dw/linux390/docu/l418dd35.pdf
no limitation is mentioned, so it should just work, even with a cachesize of several GB.
Hence looping in IBM and reverse mirroring this bug.

tags: added: reverse-proxy-bugzilla s390x
Changed in ubuntu-z-systems:
assignee: nobody → bugproxy (bugproxy)
bugproxy (bugproxy)
tags: added: architecture-s39064 bugnameltc-173729 severity-high targetmilestone-inin1804

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-12-03 06:34 EDT-------
> Looking at the source code, it seems like the __get_free_pages call is
> failing. But I don't know what is the significance of said call, and why it
> fails beyond the 1MiB limit.

__get_free_pages uses the standard page allocator which has a max order.
For s390 CONFIG_FORCE_MAX_ZONEORDER is 9 and there is this check
in mm/page_alloc.c:

if (unlikely(order >= MAX_ORDER)) {
    WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));
    return NULL;
}

The largest order that works is therefore 8, i.e. 2^8 pages of 4KiB each, or 1MiB.
If a larger cache size is really needed we might be able to use the CMA
allocator.
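The arithmetic behind that limit can be reproduced in user space. The sketch below is only an illustration, not kernel code: it assumes 4KiB pages (PAGE_SHIFT = 12) and takes MAX_ORDER = 9 from the comment above; `get_order` mirrors the rounding of the kernel helper of the same name, and `allocation_fits` is a hypothetical name for the `order >= MAX_ORDER` check.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages
MAX_ORDER = 9    # value quoted above for CONFIG_FORCE_MAX_ZONEORDER on s390


def get_order(size: int) -> int:
    """Smallest order such that (1 << order) pages can hold `size` bytes."""
    return ((size - 1) >> PAGE_SHIFT).bit_length()


def allocation_fits(size: int) -> bool:
    """True if the page allocator check (order < MAX_ORDER) would pass."""
    return get_order(size) < MAX_ORDER


# Sizes taken from the modprobe attempts in the bug description.
for size in (1048576, 1050624, 2097152, 134217728):
    print(size, get_order(size), allocation_fits(size))
```

Under these assumptions 1048576 bytes maps to order 8 and passes the check, while 1050624 bytes (1MiB + 2048) already rounds up to order 9 and is rejected, which matches the behaviour observed with modprobe.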

tags: added: kernel-da-key

Dimitri John Ledkov (xnox) wrote :

So. I'm working on the new server installer for s390x, and it already works quite well on qemu/kvm.

(Whilst most of the UX improvements are not visible in the daily images, it does boot and complete the install in qemu/kvm. We have color support, and partitioning changes/improvements staged in git master.)

http://cdimage.ubuntu.com/ubuntu-server/daily-live/current/disco-live-server-s390x.iso

Performing ISO/USB load, and giving access to the LPAR (it needs it), also results in the ability to boot the new installer. But it is painfully slow, and often hits OOM.

I was hoping that hmcdrvfs is performant enough for read-only access as the underlay of the overlayfs, for live ubuntu server experience, and the source of installation. At the moment, it is painfully slow. And I'm not sure if we can tune vfs or hmcdrv to make this installation type viable.

I almost feel like starting to make boot-in-place images, and requesting HMC to have ability to accept an upload of a qcow2 image, blast it onto a zfcp LUN disk, and IPL that as a read-only disk, for extremely fast booting of the new installer, with extremely fast install speeds.

Dimitri John Ledkov (xnox) wrote :

Well, we discussed the performance. It seems like this interface is not meant to be used as a backing storage for rootfs.

Will try other options instead.

Changed in linux (Ubuntu):
status: Confirmed → Won't Fix
Changed in ubuntu-z-systems:
status: New → Won't Fix

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-12-05 10:45 EDT-------
@Xnox: Does that mean we can close it, since alternative solutions are being worked on?

Many thanks

Frank Heimes (fheimes) wrote :

Yes. Since hmcdrv and hmcdrvfs were designed for a different purpose and thus do not offer what's needed (regarding latency and bandwidth), this can be closed.
Alternatives are under investigation.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-12-06 02:45 EDT-------
IBM bugzilla status -> closed, Will not be fixed. Alternate solution under investigation
