look(1) can't open big files

Bug #510613 reported by Ralph Corderoy on 2010-01-21
This bug affects 4 people
Affects                 Status   Importance   Assigned to   Milestone
bsdmainutils (Ubuntu)            Undecided    Unassigned

Bug Description

Binary package hint: bsdmainutils

look(1) isn't passing O_LARGEFILE to open(2).

    $ dpkg -S /usr/bin/look
    bsdmainutils: /usr/bin/look
    $ dpkg-query -W bsdmainutils
    bsdmainutils 6.1.10ubuntu4
    $
    $ echo | socat -u - file:bigfile,create,largefile,seek=100000000000
    $ ls -l bigfile
    -rw-r--r-- 1 kiosk kiosk 100000000001 2010-01-21 11:17 bigfile
    $ look foo bigfile
    look: bigfile: Value too large for defined data type
    $ strace look foo bigfile 2>&1 | grep EOVERFLOW
    open("bigfile", O_RDONLY) = -1 EOVERFLOW (Value too large for defined data type)
    $

I have an 11.5GB sorted file, and look's binary search would be ideal
for it, but I can't use it. Once look manages to open the file, it also
needs to cope with the large offsets it may be seeking to.
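
For illustration, here is a minimal sketch of what passing O_LARGEFILE to open(2) could look like. It is not taken from look.c; the helper name open_big_readonly is hypothetical, and the fallback definition of O_LARGEFILE is an assumption for toolchains that do not expose the flag (on 64-bit Linux the flag is a no-op anyway).

```c
#define _LARGEFILE64_SOURCE   /* asks glibc's <fcntl.h> to expose O_LARGEFILE */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

#ifndef O_LARGEFILE
#define O_LARGEFILE 0         /* assumption: flag not exposed, treat as a no-op */
#endif

/* Hypothetical helper, not part of look.c: open a file read-only and
 * tell the kernel that this program can handle sizes beyond 2 GiB.
 * Returns a file descriptor, or -1 with errno set, exactly like open(2). */
int open_big_readonly(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDONLY | O_LARGEFILE);
}
```

Opening the file is only the first step. On a 32-bit build the later fstat and seek calls would also need 64-bit variants (or a 64-bit off_t), otherwise they run into the same limit.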

Mark Nieweglowski (lp-0) wrote :

Attached is a patch so "look" will accept files larger than INT_MAX bytes on 64-bit architectures.

The original debian/patches/look.diff allows look.c to compile on Debian by defining SIZE_T_MAX as INT_MAX.

But it should have defined SIZE_T_MAX as SIZE_MAX.

SIZE_MAX is defined in stdint.h, which wasn't included before, so this patch adds an #include <stdint.h>.

stdint.h should be available on Debian since it is provided by package libc6-dev.
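
As a rough sketch of the idea (not the actual patch), the size check can be written against SIZE_MAX instead of INT_MAX. The helper name size_fits_in_size_t is hypothetical, and the sketch assumes a 64-bit off_t, which the static_assert enforces at compile time.

```c
#define _FILE_OFFSET_BITS 64   /* ask for a 64-bit off_t; must precede system headers */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>      /* SIZE_MAX, uintmax_t */
#include <sys/types.h>   /* off_t */

static_assert(sizeof(off_t) >= 8, "this sketch assumes a 64-bit off_t");

/* Hypothetical helper: true if a file of `size` bytes can be described
 * by a size_t length, e.g. the length argument of mmap(2).
 * Comparing against INT_MAX instead would reject every file of 2 GiB
 * or more, even on machines where size_t is 64 bits wide. */
bool size_fits_in_size_t(off_t size)
{
    return size >= 0 && (uintmax_t)size <= (uintmax_t)SIZE_MAX;
}
```

With a 64-bit size_t this accepts any non-negative file size; with a 32-bit size_t it still rejects files that cannot be mapped in one piece.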

I just learned about "look" last week, as a utility that does a binary search on sorted files.

I tried it at work to look through a 33G log file.

Instead of

    grep '^2011-05-27 11:32' some.log # waiting minutes

I ran

    look '2011-05-27 11:32' some.log # hoped it would be faster than grep, instead I got
    look: some.log: File too large # :(

I found out about look via

http://www.reddit.com/r/blog/comments/fjgit

and "look -b" sounds like the appropriate tool, but it fails unreasonably on 64-bit architectures.
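
To show why a binary search beats a linear scan here, below is a simplified sketch of a prefix search over a sorted, newline-separated buffer (for example an mmap'd file). It is not look's source code: the function names are hypothetical, comparison is plain byte order, and case folding and dictionary order are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Back up from p to the start of the line that contains it. */
static const char *line_start(const char *front, const char *p)
{
    while (p > front && p[-1] != '\n')
        p--;
    return p;
}

/* Compare key with the first keylen bytes of the line starting at s.
 * Returns <0, 0 or >0 if key sorts before, matches, or sorts after. */
static int compare_prefix(const char *key, size_t keylen,
                          const char *s, const char *back)
{
    for (size_t i = 0; i < keylen; i++) {
        if (s + i >= back || s[i] == '\n')
            return 1;                 /* line is shorter than key */
        if ((unsigned char)key[i] != (unsigned char)s[i])
            return (unsigned char)key[i] < (unsigned char)s[i] ? -1 : 1;
    }
    return 0;
}

/* Return the first line in [front, back) that starts with key,
 * or NULL if there is none. The lines must be sorted in byte order. */
const char *find_first_match(const char *front, const char *back,
                             const char *key)
{
    assert(front <= back);
    size_t keylen = strlen(key);
    const char *lo = front, *hi = back;   /* both always at line starts */

    while (lo < hi) {
        const char *mid = line_start(front, lo + (hi - lo) / 2);
        if (compare_prefix(key, keylen, mid, back) > 0) {
            /* key sorts after this line: continue with the next line */
            const char *nl = memchr(mid, '\n', (size_t)(back - mid));
            lo = nl ? nl + 1 : back;
        } else {
            hi = mid;                     /* this line is >= key */
        }
    }
    if (lo < back && compare_prefix(key, keylen, lo, back) == 0)
        return lo;
    return NULL;
}
```

Each step halves the remaining range, so even a 33G file needs only a few dozen probes. All offsets are pointer differences, which is why the search only works on big files when the process can address the whole mapping.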

Mark Nieweglowski (lp-0) wrote :

BTW: The original bug mentioned EOVERFLOW and in hindsight I see that my patch doesn't address it, but EOVERFLOW is not the bug I'm seeing. I'm seeing the max size check fail.
--- look.c:145
                if (sb.st_size > SIZE_T_MAX)
                        errx(2, "%s: %s", file, strerror(EFBIG));
---

$ strace look foo the.player.4.mpg 2>&1 | tail -n 7
open("the.player.4.mpg", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=4045771844, ...}) = 0
write(2, "look: "..., 6look: ) = 6
write(2, "the.player.4.mpg: File too large"..., 32the.player.4.mpg: File too large) = 32
write(2, "\n"..., 1
) = 1
exit_group(2) = ?

Hi Mark, perhaps you're on a 64-bit machine so the behaviour differs? Anyway, could you please change this bug's status from New to Confirmed; until then, no one looks at it and it's frowned upon to Confirm one's own bugs no matter how simple and obvious the test case. :-(

Mark Nieweglowski (lp-0) on 2011-05-28
Changed in bsdmainutils (Ubuntu):
status: New → Confirmed
tags: added: patch

Mark Nieweglowski (lp-0) wrote :

This patch defines _LARGEFILE64_SOURCE and _FILE_OFFSET_BITS so fstat and mmap become fstat64 and mmap64.

It replaces the previous patch and is meant to replace the package's debian/patches/look.diff.
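
A minimal sketch of the approach, assuming glibc on Linux: with _FILE_OFFSET_BITS set to 64 before any system header, off_t becomes 64 bits wide and the plain fstat and mmap calls resolve to their 64-bit variants. The helper map_whole_file is hypothetical and is not the code from the attached patch.

```c
#define _FILE_OFFSET_BITS 64   /* must come before every system header */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map a whole file read-only.
 * On success returns the mapping and stores its length in *len_out.
 * Returns NULL if the file cannot be opened, is empty (mmap rejects a
 * zero length), or is too large to be described by a size_t. */
const char *map_whole_file(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat sb;
    if (fstat(fd, &sb) != 0 || sb.st_size <= 0 ||
        (uintmax_t)sb.st_size > (uintmax_t)SIZE_MAX) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                 /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;

    *len_out = (size_t)sb.st_size;
    return p;
}
```

On a 32-bit build this lets open and fstat succeed on big files, but the mapping is still limited by the address space, so the SIZE_MAX check remains necessary there.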

Mark Nieweglowski (lp-0) wrote :

Sorry about the noise. I attached the wrong patch last time.

jugmac00 (jugmac00) wrote :

Is there any chance this gets fixed?

8 years later I still get the bug on Ubuntu 18.04 / WSL.

jugmac00@DESKTOP-jugmac00:/mnt/d/Projects/pw-hash$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04 LTS
Release: 18.04
Codename: bionic

jugmac00@DESKTOP-jugmac00:/mnt/d/Projects/pw-hash$ look -f $(echo -n xxx| sha1sum | awk '{print $1}') pwned-passwords-sha1-ordered-by-hash-v4.txt
look: pwned-passwords-sha1-ordered-by-hash-v4.txt: File too large

Stuart Taylor (stuartraetaylor) wrote :

It would be great to see this fixed in an upcoming release of Ubuntu.

I had the same problem and applied Mark Nieweglowski's fix here - https://github.com/stuartraetaylor/bsdmainutils-look

$ ./look -bf $(echo -n "P@ssw0rd" | sha1sum | cut -d' ' -f1) pwned-passwords-sha1-ordered-by-hash-v4.txt
21BD12DC183F740EE76F27B78EB39C8AD972A757:51259

There's also an alternative approach here - https://www.tablix.org/~avian/blog/archives/2018/10/checking_passwords_against_haveibeenpwned_com/
