6b34 regression: jps as root does not show classname anymore but instead 'process information unavailable' for processes running as non-root user

Bug #1417962 reported by Stefan Huehner on 2015-02-04
This bug affects 11 people
Affects               Importance   Assigned to
openjdk-6 (Ubuntu)    Undecided    Unassigned
openjdk-7 (Ubuntu)    Undecided    Unassigned

Bug Description

After updating from 6b33-1.13.5-1ubuntu0.12.04 to 6b34-1.13.6-1ubuntu0.12.04.1 on an Ubuntu 12.04 64-bit system, I noticed the following behavior change in the jps command-line tool.

The starting point is a Java process for Apache Tomcat running as a non-root user (username openbravo in the example below) with pid 1462.

In 6b33, running jps -l as root correctly identified the class name of the running processes, as shown here:
luna686:~# jps -l
1462 org.apache.catalina.startup.Bootstrap
11610 sun.tools.jps.Jps

However, after the update, the same jps call as root now shows:
luna686:~# jps -l
12056 sun.tools.jps.Jps
1462 -- process information unavailable

This breaks some of our custom monitoring, which tries to find the Tomcat process via its class name.

Note: The problem only occurs when jps runs as root and the process in question runs as a non-root user. When jps runs as the same user as the Tomcat process, both 6b33 and 6b34 work as expected.
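
To illustrate the kind of monitoring affected, here is a minimal sketch of a class-name lookup over jps -l output. It is not the reporter's actual script: the function names pids_for_class and unavailable_pids are hypothetical, and the sketch assumes the "pid classname" line format shown in the sessions above.

```sh
#!/bin/sh
# Hypothetical helpers; both read `jps -l` output on stdin.

# Print the PID of every JVM whose main class (second column)
# equals the class name given as $1.
pids_for_class() {
    awk -v cls="$1" '$2 == cls { print $1 }'
}

# Print the PIDs that jps listed but could not describe.
unavailable_pids() {
    awk '/process information unavailable/ { print $1 }'
}
```

Usage would be along the lines of `jps -l | pids_for_class org.apache.catalina.startup.Bootstrap`. With the 6b34 output above the lookup prints nothing, because the class-name column is replaced by the "process information unavailable" text.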

Stefan Huehner (stefan-huehner) wrote :

The same regression can also be observed in openjdk-7-jdk package in trusty/14.04

Good version:
7u71-2.5.3-0ubuntu0.14.04.1

Bad version:
7u75-2.5.4-1~trusty1

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in openjdk-6 (Ubuntu):
status: New → Confirmed
Changed in openjdk-7 (Ubuntu):
status: New → Confirmed
Erik Forsberg (forsberg) wrote :

I see this as well, although on Oracle JVM 7u75, which is no surprise given that it originates from the same codebase as openjdk-7.

Doing a bit of strace on the jps program shows some interesting data:

# strace -f jps -v 2>&1|grep /tmp/hsperfdata
[pid 9064] open("/tmp/hsperfdata_root", O_RDONLY|O_NOFOLLOW) = 3
[pid 9064] open("/tmp/hsperfdata_root", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 4
[pid 9064] mkdir("/tmp/hsperfdata_root", 0755) = -1 EEXIST (File exists)
[pid 9064] lstat("/tmp/hsperfdata_root", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 9064] open("/tmp/hsperfdata_root", O_RDONLY|O_NOFOLLOW) = 3
[pid 9064] open("/tmp/hsperfdata_root", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 4
[pid 9064] stat("/tmp/hsperfdata_root", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 9064] open("/tmp/hsperfdata_root", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
[pid 9064] stat("/tmp/hsperfdata_root/9063", {st_mode=S_IFREG|0600, st_size=32768, ...}) = 0
[pid 9064] access("/tmp/hsperfdata_root/9063", R_OK) = 0
[pid 9064] stat("/tmp/hsperfdata_cassandra", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
[pid 9064] open("/tmp/hsperfdata_cassandra", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
[pid 9064] stat("/tmp/hsperfdata_cassandra/13926", {st_mode=S_IFREG|0600, st_size=32768, ...}) = 0
[pid 9064] access("/tmp/hsperfdata_cassandra/13926", R_OK) = 0
[pid 9064] open("/tmp/hsperfdata_root", O_RDONLY|O_NOFOLLOW) = 6
[pid 9064] open("/tmp/hsperfdata_root", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 7
[pid 9064] lstat("/tmp/hsperfdata_root", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 9064] open("/tmp/hsperfdata_cassandra", O_RDONLY|O_NOFOLLOW <unfinished ...>
[pid 9064] open("/tmp/hsperfdata_13926", O_RDONLY <unfinished ...>
[pid 9064] unlink("/tmp/hsperfdata_root/9063") = 0

I have two processes here:

9063, the jps program, running as root. Shows up as it should.
13926, a cassandra server, running as cassandra user. Does not show up as it should.

Note how jps is first finding the hsperfdata file in /tmp/hsperfdata_cassandra, then tries to open /tmp/hsperfdata_13926, which doesn't exist.
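
The mismatch seen in the trace can also be checked without strace. The following is a rough sketch, assuming the layout visible above (one hsperfdata_<user> directory per user, containing one file per PID); check_hsperfdata is a hypothetical helper name, not part of any JDK tool. For each perf data file it reports whether the per-PID path from the trace (hsperfdata_<pid>) exists.

```sh
#!/bin/sh
# Hypothetical helper: for every <tmp>/hsperfdata_<user>/<pid> file, print
# "<pid> <user> present|missing", where the last field says whether
# <tmp>/hsperfdata_<pid> exists. <tmp> defaults to /tmp.
check_hsperfdata() {
    tmp=${1:-/tmp}
    for f in "$tmp"/hsperfdata_*/*; do
        [ -f "$f" ] || continue          # skip unmatched globs and non-files
        pid=${f##*/}                     # file name is the PID
        dir=${f%/*}
        base=${dir##*/}
        user=${base#hsperfdata_}         # directory suffix is the user name
        if [ -e "$tmp/hsperfdata_$pid" ]; then
            state=present
        else
            state=missing
        fi
        printf '%s %s %s\n' "$pid" "$user" "$state"
    done
}
```

Run as root with no arguments, this inspects /tmp. Reading another user's directory requires sufficient privileges, since those directories are mode 0700 in the trace.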

Erik Forsberg (forsberg) wrote :

Workaround, in an example where the java process you want to monitor is running as the "example" user and as pid 999:

ln -s /tmp/hsperfdata_example/999 /tmp/hsperfdata_999

jps, jstat etc now work as before.

The above link can be created in the startup script of the service.
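
For a startup script, the link creation could be wrapped roughly as follows. This is only a sketch of the workaround described above: link_hsperfdata is a hypothetical function name, and it assumes the perf data file already exists at <tmp>/hsperfdata_<user>/<pid> when it is called.

```sh
#!/bin/sh
# Hypothetical helper: find <tmp>/hsperfdata_<user>/<pid> for the given PID
# and link it to <tmp>/hsperfdata_<pid>. <tmp> defaults to /tmp.
# Returns 1 if no perf data file exists for that PID.
link_hsperfdata() {
    pid=$1
    tmp=${2:-/tmp}
    for f in "$tmp"/hsperfdata_*/"$pid"; do
        [ -f "$f" ] || continue          # skip an unmatched glob
        ln -sf "$f" "$tmp/hsperfdata_$pid"
        return 0
    done
    echo "no hsperfdata file found for pid $pid" >&2
    return 1
}
```

A service script might call `link_hsperfdata "$pid"` shortly after the JVM has started, and remove the link again when the service stops so that no stale link is left behind for a reused PID.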

amir sanjar (asanjar) wrote :

Urgent, Big Data hdp-hadoop charms and related big data bundles in charmstore are broken due to the same issue.

amir sanjar (asanjar) wrote :

any update?

Antonio Rosales (arosales) wrote :

To expand on comment 6 from @amir, what is currently broken for Big Data clusters is the following.

Most Big Data workloads, such as setting up a Hadoop cluster, require Java at many levels. Juju[0] has encapsulated the deployment of Big Data workloads, like deploying a Hadoop cluster, into charms. In those charms jps is used in the deployment of Hadoop. A user recently posted to the Juju mailing list that the charms failed to deploy a 10-node Hadoop cluster[1].

This is a good example of the failure we are seeing. The effect is severe in that anybody deploying a Hadoop cluster using the current Ubuntu OpenJDK from the archive will hit this failure. The workaround isn't trivial, as it requires manual intervention in the connections between services.

If there is any information we can provide to help in the debug of this issue please let us know.

[0] juju.ubuntu.com/
[1] https://lists.ubuntu.com/archives/juju/2015-February/004941.html

-thanks,
Antonio

amir sanjar (asanjar) wrote :

Please reinstate OpenJDK 1.7.0_51/52 until this issue has been resolved. It is affecting our Big Data customers and charm development effort.

Stefan Huehner (stefan-huehner) wrote :

Antonio, Amir:
As another affected user, I think a downgrade would not be a good idea, as it would reopen all the fixed security issues.

But as I wrote in my initial description, there is a very easy workaround which may even make sense to keep.

The problem with jps only happens if
a.) a Java program is running as a non-root user
b.) you run jps as root and want data about the running process from a.)

Running jps not as root but as the same user as the process is enough to avoid triggering the bug.

I did not check any of your charms, but at least in our app it was very easy to change our script so that it does not trigger the issue.

Maybe you can incorporate the same change into your scripts/charms as well.
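
One way to apply this in a script that currently calls jps as root is sketched below. It is an illustration, not the change made in the application mentioned above. The function names hsperf_users and jps_per_user are hypothetical, and the sketch assumes that sudo is installed, that jps is on the PATH of each target user, and that every hsperfdata_<user> directory under /tmp belongs to a user who runs a JVM.

```sh
#!/bin/sh
# Hypothetical helper: print the user name encoded in each
# <tmp>/hsperfdata_<user> directory. <tmp> defaults to /tmp.
hsperf_users() {
    tmp=${1:-/tmp}
    for d in "$tmp"/hsperfdata_*; do
        [ -d "$d" ] || continue          # skip unmatched globs and plain files
        base=${d##*/}
        printf '%s\n' "${base#hsperfdata_}"
    done
}

# Hypothetical helper: run `jps -l` once per owning user instead of once as root.
jps_per_user() {
    for user in $(hsperf_users "$1"); do
        sudo -u "$user" jps -l
    done
}
```

Calling `jps_per_user` as root then prints one jps listing per user; each listing also contains the short-lived jps process itself.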

Stefan

amir sanjar (asanjar) wrote :

The issue is not the workaround, we have already implemented similar
changes (ugly, there are many jvm services and users in hadoop) in our new
charm.
The real issue is modifying the existing charms and bundles in charm-store.
Big Data Bundles were required to lock to a specific charm build (a well
tested build - i.e. hdp-hadoop-4). Any changes to a charm will have a
domino effect on all bundles including the charm.


--
Regards
Amir Sanjar
Big Data Solution Lead
Canonical
cell 512-507-5537

David Harrigan (dharrigan) wrote :

Hi,

Not sure where I should post this, but I've just hit this bug with Java 8

java version "1.8.0_31"
Java(TM) SE Runtime Environment (build 1.8.0_31-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.31-b07, mixed mode)

# update-alternatives --config java
There is 1 choice for the alternative java (providing /usr/bin/java).

  Selection Path Priority Status
------------------------------------------------------------
  0 /usr/lib/jvm/java-8-oracle/jre/bin/java 20012 auto mode
* 1 /usr/lib/jvm/java-8-oracle/jre/bin/java 20012 manual mode

# uname -a
Linux java-test-1 3.13.0-45-generic #74-Ubuntu SMP Tue Jan 13 19:36:28 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

I had to do the workaround (by creating the symlink) in order for VisualVM to see the data coming from the running JVM.

I would be happy to move/report this issue if someone could suggest a more appropriate area to post in.

Thank you.

-=david=-

GMan (pokstar) wrote :

The bug can be reproduced in:
- openjdk-6-jdk_6b33-1.13.5-1ubuntu

But I was not able to reproduce it in:
- openjdk-6-jdk_6b32-1.13.4-4ubuntu

Test was done using 12.04 LTS

amir sanjar (asanjar) wrote :

Any updates on this issue? The lack of a functional jps in a multi-user environment has had a negative effect on Java development and debugging. Also, the workaround suggested above has been inconsistent.

amir sanjar (asanjar) wrote :

correction: not b13 but b24.. going to verify
