snmpd dies after requests with snmpwalk

Bug #426813 reported by gdowle on 2009-09-09
This bug affects 2 people
Affects             Status         Importance   Assigned to   Milestone
net-snmp (Ubuntu)   Fix Released   Medium       Unassigned
Hardy               Won't Fix      Medium       Unassigned

Bug Description

Hi,

I'm using Ubuntu 8.04 (up-to-date) and have the snmpd service running on it. The problem is that when I snmpwalk the server repeatedly (e.g. in a while loop, every 5 seconds), snmpd dies after at most 5 minutes. The package version is 5.4.1~dfsg-4ubuntu4.2.

/var/log/messages says:
==================
... kernel: [ 4167.266281] snmpd[6485] trap divide error rip:7f2ed9c32c2c rsp:7fffe25cd3d0 error:0

I'm using snmpwalk as follows:
=======================
snmpwalk -v2c -c test_server 162.23.61.160

My /etc/default/snmpd is:
==================
# This file controls the activity of snmpd and snmptrapd
# MIB directories. /usr/share/snmp/mibs is the default, but
# including it here avoids some strange problems.
export MIBDIRS=/usr/share/snmp/mibs
# snmpd control (yes means start daemon).
SNMPDRUN=yes
# snmpd options (use syslog, close stdin/out/err).
SNMPDOPTS='-LS 0-4 d -Lf /dev/null -u snmp -p /var/run/snmpd.pid'
# snmptrapd control (yes means start daemon). As of net-snmp version
# 5.0, master agentx support must be enabled in snmpd before snmptrapd
# can be run. See snmpd.conf(5) for how to do this.
TRAPDRUN=no
# snmptrapd options (use syslog).
TRAPDOPTS='-Lsd -p /var/run/snmptrapd.pid'
# create symlink on Debian legacy location to official RFC path
SNMPDCOMPAT=yes

My /etc/snmp/snmpd.conf:
====================
view all included .1 80
view system included .iso.org.dod.internet.mgmt.mib-2.system
syslocation ServerRoot (configure /etc/snmp/snmpd.local.conf)
syscontact Root <root@localhost> (configure /etc/snmp/snmpd.local.conf)
rocommunity public 127.0.0.1
rocommunity test_server 192.168.44.0/23
rocommunity test_server 162.23.61.0/24

On the client machine, where I run a script that repeats snmpwalk every 5 seconds, the error (when it occurs) always occurs at the same place:
========================================================================
...
HOST-RESOURCES-MIB::hrDeviceErrors.1028 = Counter32: 0
HOST-RESOURCES-MIB::hrDeviceErrors.1029 = Counter32: 0
HOST-RESOURCES-MIB::hrProcessorFrwID.768 = OID: SNMPv2-SMI::zeroDotZero
HOST-RESOURCES-MIB::hrProcessorFrwID.769 = OID: SNMPv2-SMI::zeroDotZero
HOST-RESOURCES-MIB::hrProcessorFrwID.770 = OID: SNMPv2-SMI::zeroDotZero
HOST-RESOURCES-MIB::hrProcessorFrwID.771 = OID: SNMPv2-SMI::zeroDotZero
HOST-RESOURCES-MIB::hrProcessorFrwID.772 = OID: SNMPv2-SMI::zeroDotZero
HOST-RESOURCES-MIB::hrProcessorFrwID.773 = OID: SNMPv2-SMI::zeroDotZero
HOST-RESOURCES-MIB::hrProcessorFrwID.774 = OID: SNMPv2-SMI::zeroDotZero
HOST-RESOURCES-MIB::hrProcessorFrwID.775 = OID: SNMPv2-SMI::zeroDotZero
HOST-RESOURCES-MIB::hrProcessorLoad.768 = INTEGER: 0
HOST-RESOURCES-MIB::hrProcessorLoad.769 = INTEGER: 0
Timeout: No Response from 162.23.61.160
Timeout: No Response from 162.23.61.160
Timeout: No Response from 162.23.61.160

gdowle (garb-dowle) on 2009-09-09
visibility: private → public
gdowle (garb-dowle) wrote :

Could somebody please help? This really looks like a bug, because I'm using almost the default configuration for snmpd.

Thank you

Kees Cook (kees) wrote :

I would recommend installing snmpd-dbgsym and libsnmp15-dbgsym (if possible), and then running snmpd in gdb to catch the crash, so you can see a backtrace. This should allow developers to see what is happening on your system. For more details, see https://wiki.ubuntu.com/DebuggingProgramCrash

Changed in net-snmp (Ubuntu):
status: New → Incomplete
gdowle (garb-dowle) wrote :

Thank you for this hint!
I installed snmpd-dbgsym and libsnmp15-dbgsym and generated a backtrace with gdb and put the output in the attachment. Snmpd died after 4 or 5 runs of snmpwalk.

Kees Cook (kees) wrote :

agent/mibgroup/host/hr_proc.c, around line 183 from the trace-back shows:

    case HRPROC_LOAD:
        cpu = netsnmp_cpu_get_byIdx( proc_idx & HRDEV_TYPE_MASK, 0 );
        if ( !cpu || !cpu->history || !cpu->history[0].total_hist )
            return NULL;

        long_return = (cpu->idle_ticks - cpu->history[0].idle_hist)*100;
        long_return /= (cpu->total_ticks - cpu->history[0].total_hist);

So if cpu->total_ticks - cpu->history[0].total_hist == 0, it'll explode.
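
For illustration only, here is a minimal sketch of how the division above could be guarded. It is not the contents of 99-fix-ubuntu-div0.patch (which are not shown in this report), and the struct and function names (cpu_sample, cpu_hist, idle_share_percent) are hypothetical simplifications, not net-snmp's real types. It keeps the same expression as the quoted code but refuses to divide when no ticks have elapsed between the two samples:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the CPU bookkeeping used in
 * hr_proc.c; the real net-snmp structures have more fields. */
struct cpu_hist {
    long idle_hist;   /* idle ticks at the previous sample */
    long total_hist;  /* total ticks at the previous sample */
};

struct cpu_sample {
    long idle_ticks;            /* idle ticks now */
    long total_ticks;           /* total ticks now */
    struct cpu_hist history[1]; /* most recent saved sample */
};

/* Computes (idle delta * 100) / (total delta), the same expression as the
 * quoted code. Returns 0 and stores the result in *out on success, or -1
 * when the total delta is not positive, which is the case that would
 * otherwise raise a divide error. */
static int idle_share_percent(const struct cpu_sample *cpu, long *out)
{
    assert(cpu != NULL && out != NULL);

    long idle_delta  = cpu->idle_ticks  - cpu->history[0].idle_hist;
    long total_delta = cpu->total_ticks - cpu->history[0].total_hist;

    if (total_delta <= 0)
        return -1;              /* no ticks elapsed: nothing to divide by */

    *out = (idle_delta * 100) / total_delta;
    return 0;
}
```

A caller would treat the -1 result the same way the quoted code treats a missing history entry, i.e. return no value for that row instead of dividing.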

Kees Cook (kees) wrote :

total_hist is set equal to total_ticks in _cpu_update_stats() in agent/mibgroup/hardware/cpu/cpu.c, though total_ticks is updated there too.

Changed in net-snmp (Ubuntu):
status: Incomplete → Triaged
importance: Undecided → Medium
security vulnerability: yes → no
Chuck Short (zulcss) wrote :

Could you attach your script as well?

Thanks
chuck

gdowle (garb-dowle) wrote :

It's only a simple testing script:

#! /bin/bash
while [ 1 ]
do
  snmpwalk -v2c -c test_server 162.23.61.160
  sleep 5
done

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package net-snmp - 5.4.1~dfsg-12ubuntu7

---------------
net-snmp (5.4.1~dfsg-12ubuntu7) karmic; urgency=low

  * debian/patches/99-fix-ubuntu-div0.patch: Fix division by zero.
    (LP: #426813).

 -- Chuck Short <email address hidden> Mon, 28 Sep 2009 14:02:16 -0400

Changed in net-snmp (Ubuntu):
status: Triaged → Fix Released
johe (johe-stephan) wrote :

This bug is not fixed in LTS (Hardy)

Mathias Gug (mathiaz) on 2009-11-25
Changed in net-snmp (Ubuntu Hardy):
importance: Undecided → Medium
status: New → Triaged

Brendan P. Caulfield wrote :

As far as I can tell this is still not fixed in Hardy. Obviously this issue is now known, as a patch has been released for Karmic. Unfortunately, I don't have the luxury of upgrading all my servers, and rolling my own packages for this is something I would like to avoid. Will this be fixed in Hardy (LTS)? I assume this would have been the priority over Karmic, no?

Thanks,

-brendan

Several tests still need to be done.
As far as I can tell today, it does not seem to be a general SNMP problem.

I spent last week observing some of our servers which die that way. It
seems to be a problem with some OIDs, or with how fast they
are queried. I found out that a hand-written Perl script kills our servers.
This Perl script sends 5 to 6 queries to the server. We slowed it down by
waiting 1 second after each query, and the server is still running.

So it could be that snmpwalk just sends too many queries, which the server
can't handle.

2010/1/14 Brendan P. Caulfield <email address hidden>

> As far as I can tell this is still not fixed in Hardy. Obviously this
> issue is now known, as a patch has been released for Karmic.
> Unfortunately, I don't have the luxury of upgrading all my servers and
> rolling my own packages for this is something I would like to avoid.
> Will this be fixed in Hardy (LTS)? I assume this would have been the
> priority over Karmic, no?
>
> Thanks,
>
> -brendan

Mikael Löfstrand (mikaelld) wrote :

We still have these problems on two of our servers. Both of these run MySQL 5 (packaged with Hardy). A bunch of other servers running MySQL don't have this problem, though, nor do our other servers.

Any ideas on a workaround except "Build your own package"?

Thanks,
Micke

Motin (motin) wrote :

Such a shame! I was trying to set up Zenoss monitoring and couldn't understand why the snmp daemon would stop responding a couple of minutes after a daemon restart...

Running Hardy LTS because I thought it was stable enough to be included in a data center / under monitoring.

No workaround or fix for Hardy still?

Motin (motin) wrote :

Although it is not noted in this bug report, it may be fixed for Hardy. I hadn't enabled hardy-updates; now, with the latest packages, the service has been running for 2 hours without shutting down.

chimaster (spike-queenstown) wrote :

I'm still experiencing this error. Any further thoughts?

Rolf Leggewie (r0lf) wrote :

Hardy has seen the end of its life and is no longer receiving any updates. Marking the Hardy task for this ticket as "Won't Fix".

Changed in net-snmp (Ubuntu Hardy):
status: Triaged → Won't Fix