64-bit BOINC crashes with every NVidia driver

Bug #1434570 reported by Roland Hughes
This bug affects 2 people
Affects          Status    Importance   Assigned to   Milestone
Linux Mint       Expired   Undecided    Unassigned
boinc (Ubuntu)   Expired   Undecided    Unassigned

Bug Description

Linux Mint 17 'Qiana' KDE 64-bit

BOINC cannot identify CUDA with any of the NVidia drivers unless one manually installs the backported nvidia-modprobe. The 32-bit version does not have this problem; in fact, the 32-bit versions of things work flawlessly. The 64-bit version appears never to have been tested.

Install BOINC.
Install the Nvidia 331 driver.
Reboot.
Launch BOINC Manager, select the advanced view, then Advanced->Event Log.
Scroll back to the top and you will see a line stating something like "no useable CUDA found".

Click on the little blue start button in the lower left corner.

In the little search bar at the top type "sour" and you should see an option for "Software Sources" appear. Click it.

Click the "Additional repositories" button on left.

Click the "Add a new repository" button at bottom.

paste in the following:

deb http://archive.ubuntu.com/ubuntu trusty-backports main restricted universe multiverse

Click "Ok"

Click the "update" or "refresh" button on the upper right side of the screen. (This will be greyed out with "No action required" until you actually add something, and I don't remember what the text changes to.)

Exit out of all windows.

Click on the little blue start button in the lower left corner.

Applications->System->Synaptic Package Manager

Enter password when prompted.

After the index finishes updating type "modpro" in the "Quick filter" field

Mark nvidia-modprobe for installation, then click "Apply changes"

Exit everything and reboot
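
For anyone who prefers a terminal, the GUI steps above can probably be reduced to a few commands. This is only a sketch under assumptions: the function name `add_backports` and the file name `trusty-backports.list` are made up for illustration, and the only thing taken from the report is the `deb` line itself.

```shell
#!/bin/sh
# Sketch only: add_backports and the list file name are illustrative,
# not something the report or the distribution prescribes.
BACKPORTS_LINE='deb http://archive.ubuntu.com/ubuntu trusty-backports main restricted universe multiverse'

# Append the backports line to the given sources file unless it is already there.
add_backports() {
    list_file=$1
    touch "$list_file" || return 1
    if ! grep -qxF "$BACKPORTS_LINE" "$list_file"; then
        printf '%s\n' "$BACKPORTS_LINE" >> "$list_file"
    fi
}

# On a real system, as root, the sequence would be roughly:
#   add_backports /etc/apt/sources.list.d/trusty-backports.list
#   apt-get update
#   apt-get install nvidia-modprobe
#   reboot
```

The function is idempotent, so running it twice does not duplicate the repository entry. The apt-get calls are left as comments because they change the system and need root.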

Open your BOINC manager.

Click Advanced->Event Log

Scroll to the top of the log (it always shows the last entry when it opens). You should see something like:

Tue 10 Mar 2015 06:35:09 AM CDT | | CUDA: NVIDIA GPU 0: GeForce GT 630 (driver version unknown, CUDA version 6.0, compute capability 3.5, 2047MB, 1973MB available, 692 GFLOPS peak)

near the beginning.
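
The same check can be sketched from a terminal. Two assumptions here that are not from the report: that the client also writes its messages to a log file (on Ubuntu packages this is usually `/var/lib/boinc-client/stdoutdae.txt`, where the timestamp format differs from the Event Log), and that a missing CUDA device goes together with missing `/dev/nvidia*` device nodes, which nvidia-modprobe is meant to create. The function names are hypothetical.

```shell
#!/bin/sh
# Print the lines of a BOINC log file that report a detected CUDA device.
cuda_lines() {
    grep -E 'CUDA: NVIDIA GPU [0-9]+:' "$1"
}

# Count NVIDIA device nodes (nvidia0, nvidiactl, ...) in a directory, normally /dev.
count_nvidia_nodes() {
    ls "$1" 2>/dev/null | grep -c '^nvidia'
}

# Possible use on the affected machine (paths are assumptions):
#   cuda_lines /var/lib/boinc-client/stdoutdae.txt
#   count_nvidia_nodes /dev
```

If `cuda_lines` prints nothing and the node count is 0 before nvidia-modprobe is installed, that would be consistent with the behaviour described here, though it does not prove the cause.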

Now, be sure to subscribe to Einstein@Home (other projects do use CUDA, but I know this one uses it heavily and makes the problem happen).

With the base 331 driver you can go for quite a few hours before the machine starts getting slow. If it is left overnight with the manager up, most mornings it will be in an "NVidia Crash" state. This shows up in one of two ways.

1) Mouse will move on screen but nothing will respond. SSH in from another machine and terminal applications seem to run.

2) You will wiggle the mouse and notice the screens don't come out of their blanked state. Looking down at the keyboard you will notice the 3 indicator lights turning on and off in sequence.

With the 331-updates driver the computer will operate for less than half a day before crashing.

Installing any of the 34x drivers out of edgers PPA will have you crashing many times per day.

Note: with the 32-bit version you simply install BOINC, install 331 driver, reboot, and everything is flawless. This bug is 64-bit specific. If you visit this thread:
http://boinc.berkeley.edu/dev/forum_thread.php?id=10046&postid=61102#61102
you will see the desktop does not matter. Whether beautiful KDE or one of the Gicky-Gnasty-Gnome desktops is used one achieves the same results.

Tags: boinc nvidia
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in boinc (Ubuntu):
status: New → Confirmed
David Anderson (davea42) wrote :

Xubuntu 14.04. This really bites as of 3.13.0-88 #135, x86_64.
Installing nvidia drivers via the additional drivers menu works and X works,
so I have a desktop.

But no matter which nvidia driver I select, boinc cannot see my pair
of GTX-760 GPUs.

If I install nvidia-modprobe then X no longer comes up.
ssh from another machine works.

What I have had to do after every recent kernel update (because
X would not come up after reboot) was

sudo apt-get purge 'nvidia*'
reboot (now in nouveau),
if necessary twice to get clean shutdown.

Using additional drivers, select one nvidia or another.
reboot (now in nvidia),
if necessary twice to get clean shutdown.

Boinc would not see the GPUs.
sudo apt-get install nvidia-modprobe
(which installs a 340 modprobe, rather old)
reboot.

But now, as of 3.13.0-88 #135 (x86_64),
nvidia-modprobe kills X. No desktop.

David Anderson (davea42) wrote :

Should have said:
Xubuntu shows available drivers (nvidia) as 352.63 and 340.96.

nvidia-modprobe is shown (by synaptic) as 340.24-1

David Anderson (davea42) wrote :

Another machine, with xfce and a single GTX750, runs 352.63 on x86_64:
 3.19.0-61-generic #69~14.04.1-Ubuntu SMP Thu Jun 9 09:09:13 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
It runs the 340.24-1 nvidia-modprobe and X works fine. Boinc works.

Software update on each says the software is up to date, but
they are not on the same kernel. I have
no idea why.

David Anderson (davea42) wrote :

q3 (nvidia with nvidia-modprobe works)
Linux q3 3.19.0-61-generic #69~14.04.1-Ubuntu SMP Thu Jun 9 09:09:13 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

seti3 (nvidia with nvidia-modprobe does not work at all)
Linux seti3 3.13.0-88-generic #135-Ubuntu SMP Wed Jun 8 21:10:42 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

The above is the output of uname -a
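
To make the comparison between the two hosts less manual, something like the following could be run on each. It is only a sketch: `kernel_release` and `host_report` are names invented here, and the report simply lists the kernel plus whatever installed packages mention nvidia.

```shell
#!/bin/sh
# Extract the kernel release (third field) from one line of `uname -a` output.
kernel_release() {
    printf '%s\n' "$1" | awk '{print $3}'
}

# Print the facts worth comparing between a working and a failing host.
host_report() {
    echo "kernel: $(uname -r)"
    # Installed packages whose name mentions nvidia, with their versions.
    dpkg-query -W -f='${Package} ${Version}\n' 2>/dev/null | grep -i nvidia
}

# Example with the q3 line quoted above:
#   kernel_release 'Linux q3 3.19.0-61-generic #69~14.04.1-Ubuntu SMP ...'
```

Running `host_report` on q3 and seti3 and diffing the two outputs should show at a glance whether only the kernel differs or the driver and nvidia-modprobe packages differ as well.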

Gianfranco Costamagna (costamagnagianfranco) wrote :

David, 3.19 is part of linux-lts-* stack
maybe you can try to use linux-lts-xenial stack, the 4.4 kernel
https://launchpad.net/ubuntu/+source/linux-lts-xenial

BTW there is not much we can do for X crashing with official nvidia drivers, basically they are binaries installed in the system :(

BTW with Ubuntu 16.04 LTS you should have a more stable situation; at least I haven't experienced this kind of issue anymore

David Anderson (davea42) wrote :

Thanks, LocutusOfBorg. I have been waiting for 16.04 to be officially marked as LTS,
which I gather will be with 16.04.1, and will install it then when
Software Update gives me the choice.
I have the impression that might be in July.

Gianfranco Costamagna (costamagnagianfranco) wrote :

Can anybody please test 16.04, 16.10 or similar? We should be less crash prone now!

Changed in boinc (Ubuntu):
status: Confirmed → Incomplete
Changed in linuxmint:
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for boinc (Ubuntu) because there has been no activity for 60 days.]

Changed in boinc (Ubuntu):
status: Incomplete → Expired
Launchpad Janitor (janitor) wrote :

[Expired for Linux Mint because there has been no activity for 60 days.]

Changed in linuxmint:
status: Incomplete → Expired