crashes opening network file, leaves lock file

Bug #578402 reported by Bryan on 2010-05-10
This bug affects 10 people
Affects: openoffice.org (Ubuntu)
Importance: Low
Assigned to: Unassigned

Bug Description

Binary package hint: openoffice.org

When trying to access most openoffice files via nautilus on a NAS with newly installed Lucid, either AMD64 or i386 (2 separate machines), OO crashes on launch. Message says "due to unexpected error" and no files are listed to recover. Document permissions show 100755, owner and group are current user, lock file is left after crash. Deleting lock file does not change things. Can copy file to local home directory, edit, save, and copy back to NAS as workaround for now.

Some files can be edited with oowriter, and I can't see anything that sets them apart other than that they work: I open them from Nautilus, oowriter comes up, and editing and saving go OK. (E.g. the invoice crashes, the envelope doesn't.)

spreadsheet does the same thing: crashes on opening an odt from Nautilus.

Files are accessed on the NAS using autofs with uid and gid options, user login credential file, and auto.smb from Ubuntu 9.10. Mount reports '//nas/share on /lan/nas/share type cifs (rw,mand)'

The problem is in openoffice.org 3.2 OOO320m12 (build 9843) in Ubuntu 10.04 LTS; it did not happen with 9.10 systems. This bug does seem similar to one encountered in an earlier version, though.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: openoffice.org-writer 1:3.2.0-7ubuntu4
ProcVersionSignature: Ubuntu 2.6.32-22.33-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-22-generic x86_64
NonfreeKernelModules: fglrx
Architecture: amd64
Date: Mon May 10 10:50:33 2010
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release amd64 (20100429)
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: openoffice.org

Piotr Kujawski (elektrownia) wrote :

I had exactly the same issue. 10.04LTS. One is a fresh install, second is an upgrade from 8.04LTS.

Mount says //ks-eisfair/mew on /home/piotr/mew type cifs (rw,mand)
File permissions are 100755. Copying the file to the Desktop and back to the NAS solves the issue.

Changed in openoffice.org (Ubuntu):
status: New → Confirmed
Teun Vink (teunvink) wrote :

Several people in our office report similar problems.

Mark Schouten (mark-prevented) wrote :

Given that Lucid is an LTS release, also targeted at 'the enterprises' I think this bug should be marked 'important'.

Eelbuntu (eelcoaartsen) wrote :

Same problem here. Changing from autofs to manual mounting did not solve the problem.

Chris Cheney (ccheney) wrote :

I can't manage to reproduce this problem with samba 2:3.4.7~dfsg-1ubuntu3 in Lucid for the server and client. I tried using nautilus (gvfs-fuse) for the client connection and also did a client test using mount.cifs and both worked fine for me. My test was saving to both an ods and odt file on the server. It may be that there is something specific to certain files that is causing this problem and if so I will likely need an example file (non-corrupted) that I can use to attempt saving to see if it also works for me.

Also, if someone can get it to crash again can they please try to obtain a backtrace manually following the instructions at http://wiki.ubuntu.com/DebuggingProgramCrash and upload the backtrace (as an attachment) to the bug report. This will greatly help us in tracking down the problem.

Thanks,

Chris Cheney

Changed in openoffice.org (Ubuntu):
importance: Undecided → Low
status: Confirmed → Incomplete
Bryan (k1cd) wrote :

It appears that this is a well known interaction between OOo 3.0 and Samba related to file locking. There appears to be a bug here for an earlier distribution and OOo 3.1 and also issues listed in the OOo forums and buglists. It can cause a crash on load or it can cause unspecific 'general IO errors' when trying to save an edited file. I had this happen with an HTML file which I was able to load then 'save as' but not save when changed.

Some of the workarounds suggested have been to reset some configuration scripts or disable file locking in the soffice script or to use the 'nounix' option in the CIFS mount or to regress to an smbfs mount. (I tried the script refresh and it didn't do the job so I am now in the process of testing the 'nounix' option)

It does appear that mounting the share via 'Nautilus->places->network' may not express the problem. That mounts as "gvfs-fuse-daemon on /home/username/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev,user=username)" rather than "//server/share on /lan/server/share type cifs (rw,mand)". I don't know enough (yet) about how the gvfs-fuse-daemon mounts things to understand how its mounts of SMB shares differ from CIFS, but that difference may yield some insight into this problem. (I might try some of those gvfs options in autofs as well, to see if I can find one that will make a difference.)

I'll see if I can get a backtrace - that process looks to be a pain but I understand the need; may take a while.

In regard to the test indicated, I think the key is opening, editing, and saving an existing file rather than just creating a new one. It does appear to be a multi faceted issue and I don't think it is anything as simple as file specific. That makes it rather tough to track down.

Chris Cheney (ccheney) wrote :

When I did my testing I did an open->edit->save trip with the two methods I tried which worked.

I wrote the support to have OOo use gvfs-fuse instead of regular gvfs and during that time noticed a lot of bugs in gvfs-fuse, most of which related to OOo calling ftruncate (non 0) from what I recall. I'm not sure if older versions of samba, likely to be the ones in the NAS boxes, have a similar problem or not, but it should be fairly trivial to write a test case just doing various posix file i/o calls on it and see if they fail.
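
A test case of the kind described above might look like the following minimal sketch. It is not taken from this report: the function name `probe_posix_io` and the choice of steps (a write, a non-zero `ftruncate`, a byte-range lock, a read under the lock, an unlock, an `fsync`) are assumptions made for illustration. It overwrites the file at `path`, so it should only be pointed at a scratch file on the share.

```python
import fcntl
import os


def probe_posix_io(path):
    """Run a few POSIX file calls on `path` (a scratch file, it is overwritten).

    Returns a dict mapping each step name to None on success or to the
    error text if the call raised OSError.
    """
    results = {}

    def step(name, func):
        try:
            func()
            results[name] = None
        except OSError as exc:
            results[name] = f"{type(exc).__name__}: {exc}"

    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        step("write", lambda: os.write(fd, b"x" * 4096))
        # Shrink to a non-zero length, the call mentioned in the comment above.
        step("ftruncate_nonzero", lambda: os.ftruncate(fd, 1024))
        # Exclusive, non-blocking lock on the first 16 bytes of the file.
        step("lock_range",
             lambda: fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 16, 0, os.SEEK_SET))
        step("read_locked", lambda: os.pread(fd, 16, 0))
        step("unlock_range",
             lambda: fcntl.lockf(fd, fcntl.LOCK_UN, 16, 0, os.SEEK_SET))
        step("fsync", lambda: os.fsync(fd))
    finally:
        os.close(fd)
    return results
```

Running it once against a local file and once against a file on the CIFS mount, then comparing the two result dicts, would show which calls (if any) behave differently on the share.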

Chris Cheney (ccheney) wrote :

This bug looks like it might be a duplicate of bug 486443 which has an upstream bug attached to it already. Does it seem related to you?

Bryan (k1cd) wrote :

That bug was with 9.10 and 3.1.1, which worked OK here on the LAN. It does have some similar characteristics, though. Interesting that it showed with KDE but not Gnome. Maybe my using Gnome was why I didn't see it. (potential experiment suggested in that, I wonder what the difference is).

I see all the discussion about ftruncate and whatnot but that does not seem to fill the bill. If the file call was a problem, other apps would show similar behavior, it seems to me. This one smells a whole lot to me like OOo is getting some return value from handling its lock file that it does not recognize and does not handle gracefully. In any case, that 'general file io error' is indicative of something trapped that OOo doesn't like and isn't reported in any useful fashion. That's something to fix in any case, as even cryptic error codes are better than having to insert debug backtracing to figure out the stimulus for a crash.

I also don't think suspicion on the NAS devices is worthwhile. Mine is a Netgear ReadyNAS Duo and the other bug used a different one. It appears some of the reporters here used other servers. Another key is that OOo was working with LAN files, and is working, on the 9.10 machine I have on the LAN but not the new 10.04 installs. If the NAS was the problem, it'd be the problem in all cases.

I am aware of the conflicts between SMB and Unix permissions. These have been a royal pain and the current hassle is autofs needing explicit specific user settings. (don't know but maybe its fixed now??)

At any rate, it appears that the problem consistently shows in some environments and not in others. OOo is one common factor. Since it is LAN related, knowing about that topology is probably going to be needed but it does appear that OOo is doing something different than most other applications as it seems to be aware of local versus LAN based files and modifies its behavior accordingly.

Need more data, always need more data. I'll see what I can do to collect some.

Mark Schouten (mark-prevented) wrote :

I'm currently testing/debugging this.

I can write, edit and open small files, it seems. I'm testing with:

Client:
 Ubuntu 10.04 AMD64, Desktop

Server:
 Debian Sarge (yes, I know :)), Samba 3.0.14a-3sarge

I'll report later with more testing/debugging.

Mark Schouten (mark-prevented) wrote :

Right. Some more testing:

Test 1: Open an existing .odt from a CIFS share [1]
Result: OpenOffice crashes

Test 2: Open the same file from local disk
Result: OpenOffice works

Test 3: Create, Edit and Save a new file on a Samba-share
Result: OpenOffice works

Test 4: Open a file created with OO 3.1 from a Samba-share
Result: OpenOffice works

Test 5: Unzip a (non-working) odt-file, and zip it again, open it from a Samba-share
Result: OpenOffice works
<howto>
mkdir repair
cd repair
unzip ../broken.odt
zip -r ../working.odt ./*
</howto>
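
The same unzip-and-zip round trip can be sketched in Python. This is a variation on the howto above, not an exact equivalent: the function name `repack_odf` is hypothetical, and the sketch additionally writes the `mimetype` entry first and uncompressed, which is the layout ODF packages are expected to have and which `zip -r` does not guarantee.

```python
import zipfile


def repack_odf(src, dst):
    """Copy every entry of the archive `src` into a freshly written `dst`."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        names = zin.namelist()
        if "mimetype" in names:
            # Write the mimetype entry first and stored (uncompressed).
            names.remove("mimetype")
            zout.writestr("mimetype", zin.read("mimetype"),
                          compress_type=zipfile.ZIP_STORED)
        for name in names:
            zout.writestr(name, zin.read(name),
                          compress_type=zipfile.ZIP_DEFLATED)
```

For example, `repack_odf("broken.odt", "working.odt")` mirrors the file names used in the howto.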

[1]: /proc/mounts entry:
//fileserver.office.bit.nl/work /mnt/samba/work cifs rw,mand,relatime,unc=\\fileserver.office.bit.nl\work,username=marks,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.0.1,serverino,acl,rsize=16384,wsize=57344 0 0

So it seems that somehow, if a file is zipped again, stuff works.
(Now dazed and confused, but trying to continue :))

Mark Schouten (mark-prevented) wrote :

Sorry, but this bug can hardly be a duplicate of 486443. That bug is about being unable to save files to a NAS. This bug is about OOo *crashing* while opening a file from a Samba share.

BTW: I don't have a crashing Ooo now..

Chris Cheney (ccheney) on 2010-05-19
Changed in openoffice.org (Ubuntu):
status: Incomplete → Fix Released
Mark Schouten (mark-prevented) wrote :

Huh? Where's the fix then?

Chris Cheney (ccheney) wrote :

Perhaps you confused me? You said this bug was different because it was causing OOo to crash and then went on to say:

"BTW: I don't have a crashing Ooo now.."

So what exactly are you meaning?

hilaire (hilaire-drouineau) wrote :

Hi,
I can confirm that the workaround proposed by Bryan (i.e. mounting the share via 'Nautilus->places->network') works for me. I am still not able to make it work when mounting via fstab or with sudo mount -t: it seems that the options nomand and nounix are not taken into account.
If I mount a partition as follows
hilaire@hilaire-desktop:~$ sudo mount -t cifs //machine/partition /media/Sauvegarde -o nounix,nomanduid=1000,gid=100,file_mode=0640,dir_mode=0750,iocharset=utf8,credentials=/root/.smbcredentials

the command mount returns

hilaire@hilaire-desktop:~$ mount
 //machine/partition on /media/Sauvegarde type cifs (rw,mand)

Moreover, the solution proposed by Mark, i.e. unzip and zip back problematic files works only for some files.

Cheers


Mark Schouten (mark-prevented) wrote :

Sorry for the confusion.

I meant that I could open the file now, but couldn't before. Which doesn't mean the problem is gone, it's just even more weird.

hilaire (hilaire-drouineau) wrote :

Hi again,

Great news: I solved my problem! The nomand option is not valid for cifs, you have to use nobrl

in fstab
//machine/partition /media/Sauvegarde cifs nobrl,credentials=/root/.smbcredentials,uid=1000,gid=100,rw,iocharset=utf8 0 0

with sudo mount

hilaire@hilaire-desktop:~$ sudo mount -t cifs //machine/partition /media/Sauvegarde -o nobrl,uid=1000,gid=100,file_mode=0640,dir_mode=0750,iocharset=utf8,credentials=/root/.smbcredentials

Hope that helps!
Cheers
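
Since plain `mount` output earlier in this thread showed only `(rw,mand)`, one way to check whether `nobrl` actually took effect is to read the fuller option list in `/proc/mounts`. The sketch below is illustrative only: the helper names `cifs_mounts` and `has_nobrl` are hypothetical, and it assumes the running kernel lists `nobrl` among the options in `/proc/mounts`, which has not been verified for the 2.6.32 kernel in this report.

```python
def cifs_mounts(mounts_text):
    """Return {mount_point: set_of_options} for each cifs entry.

    `mounts_text` is the content of /proc/mounts, whose fields are:
    device, mount point, filesystem type, options, dump, pass.
    """
    found = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[2] == "cifs":
            found[fields[1]] = set(fields[3].split(","))
    return found


def has_nobrl(mounts_text, mount_point):
    """True if `mount_point` is a cifs mount whose options include nobrl."""
    return "nobrl" in cifs_mounts(mounts_text).get(mount_point, set())
```

A typical call would be `has_nobrl(open("/proc/mounts").read(), "/media/Sauvegarde")`.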

Chris Cheney (ccheney) wrote :

hilaire,

So nobrl fixes the problem for you? If so, does it also help you, Mark Schouten?

Chris

Changed in openoffice.org (Ubuntu):
status: Fix Released → Incomplete
hilaire (hilaire-drouineau) wrote :

Hi Chris,
I have to make further tests to confirm that all files work, but presently it seems so.

Hilaire

Mark Schouten (mark-prevented) wrote :

First tests look promising. I've informed my co-workers of this fix. I'll keep you updated on issues found.

BTW: The man-page of mount.cifs says this about nobrl:
    Do not send byte range lock requests to the server. This is
    necessary for certain applications that break with cifs style
    mandatory byte range locks (and most cifs servers do not yet
    support requesting advisory byte range locks).

It seems that this BRL-function is rather experimental? Maybe nobrl should be a default option then?

Chris Cheney (ccheney) wrote :

If this does solve the problem, it's most likely a duplicate of the other bug, but I will keep them separate for now. If it can't be fixed in OOo upstream I will ping the samba maintainer.

Piotr Kujawski (elektrownia) wrote :

I can confirm that the "nobrl" option at mount time is a workaround. With this option I have no problems opening older office files on a CIFS share.

Chris Cheney (ccheney) on 2010-05-21
Changed in openoffice.org (Ubuntu):
status: Incomplete → Triaged