internal error: End of file from monitor

Bug #1582990 reported by Seth Arnold
42
This bug affects 8 people
Affects Status Importance Assigned to Milestone
libvirt (Ubuntu)
Fix Released
Medium
Unassigned

Bug Description

I am trying to use the security team's front end to libvirt, uvt, to create new virtual machines. I found these error messages in the journalctl -u libvirt-bin.service output:

May 17 17:37:51 hunt libvirtd[15696]: libvirt version: 1.3.1, package: 1ubuntu10 (William Grant <email address hidden> Fri, 15 Apr 2016 12:08:21 +1000)
May 17 17:37:51 hunt libvirtd[15696]: hostname: hunt
May 17 17:37:51 hunt libvirtd[15696]: internal error: End of file from monitor
May 17 17:37:59 hunt dnsmasq-dhcp[2466]: DHCPDISCOVER(virbr0) 52:54:00:c2:e0:73
May 17 17:37:59 hunt dnsmasq-dhcp[2466]: DHCPOFFER(virbr0) 192.168.122.141 52:54:00:c2:e0:73
May 17 17:38:00 hunt dnsmasq-dhcp[2466]: DHCPREQUEST(virbr0) 192.168.122.141 52:54:00:c2:e0:73
May 17 17:38:00 hunt dnsmasq-dhcp[2466]: DHCPACK(virbr0) 192.168.122.141 52:54:00:c2:e0:73 sec-xenial-i386
May 17 17:43:07 hunt libvirtd[15696]: internal error: End of file from monitor
May 17 17:43:13 hunt dnsmasq-dhcp[2466]: DHCPDISCOVER(virbr0) 52:54:00:f8:13:c9
May 17 17:43:13 hunt dnsmasq-dhcp[2466]: DHCPOFFER(virbr0) 192.168.122.221 52:54:00:f8:13:c9
May 17 17:43:13 hunt dnsmasq-dhcp[2466]: DHCPREQUEST(virbr0) 192.168.122.221 52:54:00:f8:13:c9
May 17 17:43:13 hunt dnsmasq-dhcp[2466]: DHCPACK(virbr0) 192.168.122.221 52:54:00:f8:13:c9 sec-xenial-amd64
May 17 18:04:52 hunt dnsmasq-dhcp[2466]: DHCPREQUEST(virbr0) 192.168.122.141 52:54:00:c2:e0:73
May 17 18:04:52 hunt dnsmasq-dhcp[2466]: DHCPACK(virbr0) 192.168.122.141 52:54:00:c2:e0:73 sec-xenial-i386
May 17 18:07:26 hunt dnsmasq-dhcp[2466]: DHCPREQUEST(virbr0) 192.168.122.221 52:54:00:f8:13:c9

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: libvirt-bin 1.3.1-1ubuntu10
ProcVersionSignature: Ubuntu 4.4.0-21.37-generic 4.4.6
Uname: Linux 4.4.0-21-generic x86_64
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
ApportVersion: 2.20.1-0ubuntu2
Architecture: amd64
CurrentDesktop: Unity
Date: Tue May 17 21:51:06 2016
InstallationDate: Installed on 2012-10-18 (1307 days ago)
InstallationMedia: Ubuntu 12.04.1 LTS "Precise Pangolin" - Release amd64 (20120823.1)
SourcePackage: libvirt
UpgradeStatus: Upgraded to xenial on 2016-04-30 (17 days ago)
modified.conffile..etc.libvirt.qemu.conf: [inaccessible: [Errno 13] Permission denied: '/etc/libvirt/qemu.conf']
modified.conffile..etc.libvirt.qemu.networks.default.xml: [inaccessible: [Errno 13] Permission denied: '/etc/libvirt/qemu/networks/default.xml']

Revision history for this message
Seth Arnold (seth-arnold) wrote :
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in libvirt (Ubuntu):
status: New → Confirmed
Changed in libvirt (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Laurent Salle (lsalle) wrote :

See:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=785011
Debian Bug report logs - #785011 libvirtd: log error message after successfully terminating machine

Nish Aravamudan (nacc)
Changed in libvirt (Ubuntu):
status: Confirmed → Triaged
Revision history for this message
Seth Arnold (seth-arnold) wrote :

This was still happening as of mid-february 2017:

Feb 24 20:34:45 hunt libvirtd[3871]: internal error: End of file from monitor

$ dpkg -l libvirt-bin qemu-kvm | cat
...
ii libvirt-bin 1.3.1-1ubuntu10.8 amd64 programs for the libvirt library
ii qemu-kvm 1:2.5+dfsg-5ubuntu10.8 amd64 QEMU Full virtualization

Thanks

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hmm, this seems to be a non Fatal error.
After explicitly looking for it I see this up to (including) zesty.
Slight change in the Text: "End of file from qemu monitor"
Which is as before +qemu

This seems to be part of a normal non-error guest shutdown phase:
dnsmasq-dhcp[5221]: DHCPRELEASE(virbr0) 192.168.122.235 52:54:00:b6:c3:65
kernel: virbr0: port 2(vnet0) entered disabled state
kernel: device vnet0 left promiscuous mode
kernel: virbr0: port 2(vnet0) entered disabled state
libvirtd[18333]: internal error: End of file from qemu monitor
virtlogd[18234]: End of file while reading data: Input/output error

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Not an excuse at all, but a statement of overall status - I checked a few other distros (Redhat, IBMKVM, Suse) as far as I could see they all have the same

I'm analyzing the case at the moment (will post on that later) and I think I have found a place where once could tweak, but it seems to be an upstream bug/patch which then need to tickle down.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Download full text (7.5 KiB)

On ther per guest qemu log one always can see:
2017-03-07 08:59:35.748+0000: shutting down
Then libvirt throws
Mär 07 09:59:35 lap libvirtd[3023]: internal error: End of file from monitor

It seems even on a clean shutdown I see before a kill signal like:
qemu-system-x86_64: terminating on signal 15 from pid 3023

So might that be "normal" if stopped like this.
There are several stop options for "virsh shutdown"
Shutdown methods:
"acpi", "agent", "initctl", "signal" and "paravirt"
And then one can also just shutdown from inside the guest.

Notation is "error log msg / kill signal", so Y/Y means both happened.

acpi: Y/Y
agent: Y/Y
shutdown -h 0 inside guest: Y/Y
initctl not-supported for qemu
signal: not-supported for qemu
paravirt: not-supported for qemu

So the "mode" doesn't matter - which at least tells us it is code a bit outside the actual shutdown.

The signal comes from "/usr/sbin/libvirtd" itself.
And since it does so independent to the shutdown method it seems to be some sort of generic "cleanup".
Lets find that in the code.

There are three kills in an strace of the shutdown action:
(some context to find these later)
     0.000020 write(22, "<domstatus state='shutdown' reas"..., 10272) = 10272 <0.000011>
     0.000032 fsync(22) = 0 <0.000006>
     0.000018 close(22) = 0 <0.000007>
     0.000020 rename("/var/run/libvirt/qemu/zesty-test-log.xml.new", "/var/run/libvirt/qemu/zesty-test-log.xml") = 0 <0.000024
     0.000039 unlink("/var/run/libvirt/qemu/zesty-test-log.xml.new") = -1 ENOENT (No such file or directory) <0.000009>
XX 0.000027 kill(11836, SIGTERM) = 0 <0.000010>
     0.000031 poll([{fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, even
     0.000113 read(26, "{\"timestamp\": {\"seconds\": 148887"..., 1023) = 81 <0.000012>

     0.000035 close(22) = 0 <0.000044>
     0.000060 unlink("/var/lib/libvirt/qemu/channel/target/domain-zesty-test-log/org.qemu.guest_agent.0") = -1 ENOENT (No such
XX 0.000032 kill(11836, SIGTERM) = 0 <0.000013>
     0.000029 nanosleep({0, 200000000}, NULL) = 0 <0.200086>
XX 0.200115 kill(11836, SIG_0) = -1 ESRCH (No such process) <0.000007>
     0.000049 open("/dev/null", O_RDWR|O_CLOEXEC) = 22 <0.000020>
     0.000045 pipe2([23, 24], O_CLOEXEC) = 0 <0.000010>
     0.000027 pipe2([25, 26], O_CLOEXEC) = 0 <0.000009>
     0.000021 rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [ALRM], 8) = 0 <0.000007>

Install libvirt0-dbg and break on those when called from the daemon.
Found the "kill" that does this at with a backtrace like:
#0 kill () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007f3511d3014a in virProcessKill (pid=<optimized out>, sig=<optimized out>) at ../../../src/util/virprocess.c:333
#2 0x00007f34ea693928 in qemuProcessKill (vm=vm@entry=0x7f34dc004f60, flags=flags@entry=2) at ../../../src/qemu/qemu_process.c:5213
#3 0x00007f34ea693a0d in ...

Read more...

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Reproducable with current upstream (3.2 in development)

Note that on newer versions with full virtlogd there also is a similar message from that:
virtlogd[19517]: End of file while reading data: Input/output error

There is some side effect that makes this slightly less straight forward.
If the virReportError call is missing later on it reports "Error while processing monitor IO".
I guess the report sets err and makes virGetLastError return something later on (what we still see is the generic catch all error message).

It is like it is insisting on an error :-)
If I don't get any from above I set the generic one down here.
Yet just skipping those might have side effects I currently can't see.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The following patch (so far RFC level) would avoid the error for all qemu shutdown modes supported (acpi and agent).

I checked that:
- no error message on acpi shutdown
- no error message on agent shutdown
- still reports the error when the guest is killed from "other sources" (we want that).
- still reports "other errors" on shutdown

It is just not adding the EOF issue or the fallback "there has to be an error" in case qemuMonitorEmitShutdown was passed setting mon->willhangup = 1.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Do you think it makes sense suggesting that upstream?

tags: added: patch
Revision history for this message
Seth Arnold (seth-arnold) wrote :

Christian, many thanks for tracking this down. Is the bool getting the correct initial value due to a 'memset 0' when the struct is initialized? Or does this also need an explicit initializer?

Thanks

Revision history for this message
Christian Ehrhardt  (paelzer) wrote : Re: [Bug 1582990] Re: internal error: End of file from monitor

On Wed, Mar 8, 2017 at 2:01 AM, Seth Arnold <email address hidden>
wrote:

> Is the bool getting the
> correct initial value due to a 'memset 0' when the struct is
> initialized? Or does this also need an explicit initializer?
>

Hi,
as always in libvirt things are a bit more abstracted.
After a trip through somewhat object oriented code in C (which I don't want
to bother you) I can confirm that eventually is is a calloc that gets the
structs memory (src/util/viralloc.c:560).
Also an indicator - the other struct vars have no initiaizers either.

So I think it does not need an explicit initializer.

Revision history for this message
Seth Arnold (seth-arnold) wrote :

Great; then it feels like it's ready for upstream review.

Thanks

Revision history for this message
zongyimin (yanpeifei) wrote :

I met the same bug.
Aug 10 08:01:41 controller libvirtd[12421]: internal error: End of file from monitor
Aug 10 08:23:12 controller libvirtd[12421]: internal error: End of file from monitor
Aug 10 08:24:06 controller libvirtd[12421]: internal error: End of file from monitor
Aug 10 08:25:54 controller libvirtd[12421]: internal error: End of file from monitor
Aug 10 08:28:03 controller libvirtd[12421]: internal error: End of file from monitor
Aug 10 08:36:49 controller libvirtd[12421]: internal error: End of file from monitor
Aug 10 08:37:00 controller libvirtd[12421]: internal error: End of file from monitor
Aug 10 08:44:33 controller libvirtd[12421]: internal error: End of file from monitor
Aug 10 08:46:12 controller libvirtd[12421]: internal error: End of file from monitor

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

One has to be clear - any case where qemu stops executing in an unexpected way make libvirt go "huh ?" and this is that message.
Please be careful that having the same "message" does not imply the same error other than "qemu went away" which can be for various reasons.

There was some activity to improve the messages where possible, but for some libvirt just can't know.

That "it will still happen anyway" made me forget about this.
So thanks for the ping still @yanpeifei

I checked and my fix went upstream in 3.4 which we have (newer) in Artful.
I don't consider it SRU worth thou - so setting fix released to match reality.

Changed in libvirt (Ubuntu):
status: Triaged → Fix Released
Revision history for this message
s10 (vlad-esten) wrote :
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi s10,
I know, but we never hit the deadlock they had fixed by the revert in Artful AFAIK.
We followed the revert in the code for Bionic, but not yet intended to SRU the revert to Artful.

Revision history for this message
torel (torehl) wrote :

Ubuntu 18.04.6LTS w/5.4.0-137-generic. Group of Twin2 cabinets with InfiniBand and PCIe NTB networks. Migration is pretty much instant. When migrating VMs, they come up paused, but VMs are running.

Jan 24 09:25:47 v003 libvirtd[9066]: 2023-01-24 08:25:47.500+0000: 9066: error : qemuMonitorIO:719 : internal error: End of file from qemu monitor

Been trying to track down why that is. Can anyone shed some light on this?

Revision history for this message
torel (torehl) wrote :

With test.sh from Bugzilla bug, when the problem occurs I get

error: error: Domain is already active
Domain is already active

Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

Hello torel,

As Christian mentioned on comment #15, having the "End of file from qemu monitor" message does not necessarily mean that you're experiencing the same problem described in this bug. Also, it's been a long time since this bug was investigated and things have likely changed quite significantly in qemu since then. I'd like to ask that you open a new bug and provide as much information about the issue as possible, so that we can take a better look at it. Feel free to mention this bug in the new one, of course.

Thank you.

Revision history for this message
torel (torehl) wrote :

Hi Sergio,

> error: error: Domain is already active
> Domain is already active

    After some investigations, it seems to be the same uuid on all the nodes in the Twin2 cabinet. 'dmidecode -s system-uuid' is the same on all nodes. Could explain the above.

  There's also an issue with /dev/kvm not being in group kvm. There are several udev rules which might be tripping over each other, as well as the startup script qemu-kvm.service that sets it.

  Lastly, 'virsh nodeinfo' returns the wrong socket count (compared to actual number returned by numactl -H). Maybe three new bugs are in order? Do you still want bugs against bionic?

Revision history for this message
Lucas Kanashiro (lucaskanashiro) wrote :

Bionic is still supported at the moment, so feel free to file bugs.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.