Comment 35 for bug 209483

Benjamin Delagoutte (benjamin-delagoutte) wrote : Re: Regression: Sony Walkman NWZ-S618F doesn't mount in Hardy

I finally found a possible cause of the bug we are running into.

First of all, the size of the volume reported by the kernel through sysfs hasn't changed since Gutsy, and neither have the permissions:
gutsy:~$ ls -l /sys/block/sdb/size /sys/block/sdb/sdb1/size
-r--r--r-- 1 root root 4096 2008-05-06 18:05 /sys/block/sdb/sdb1/size
-r--r--r-- 1 root root 4096 2008-05-06 18:05 /sys/block/sdb/size
gutsy:~$ cat /sys/block/sdb/size /sys/block/sdb/sdb1/size
15319040
4294967292

hardy:~$ ls -l /sys/block/sdb/size /sys/block/sdb/sdb1/size
-r--r--r-- 1 root root 4096 2008-05-06 18:21 /sys/block/sdb/sdb1/size
-r--r--r-- 1 root root 4096 2008-05-06 18:22 /sys/block/sdb/size
hardy:~$ cat /sys/block/sdb/size /sys/block/sdb/sdb1/size
15319040
4294967292

I then looked for changes in the hal code between Gutsy (0.5.9.1) and Hardy (0.5.11-rc2). Here's what I found:
- the "Ignoring hotplug event - cannot read 'size'" message is emitted by hotplug_event_begin_add_blockdev() in hald/linux/blockdev.c, because the call to hal_util_set_int_from_file (d, "volume.num_blocks", sysfs_path_real, "size", 0) returned FALSE
- hal_util_set_int_from_file(), in hald/util.c, relies on hal_util_get_int_from_file() from the same source file.

In hal_util_get_int_from_file(), there is a major change between 0.5.9.1 and 0.5.11-rc2: the error that strtol() reports through errno is now checked.

0.5.9.1:
/* TODO: handle error condition */
*result = strtol (buf, NULL, base);
ret = TRUE;

0.5.11-rc2:
errno = 0;
_result = strtol (buf, NULL, base);
if (errno == 0) {
  ret = TRUE;
  *result = _result;
}

strtol() converts a string (here, the size read from sysfs) to a long int. If the value in the string exceeds LONG_MAX (2147483647 where long is 32 bits), *which is true in our case*, strtol() returns LONG_MAX and sets errno to ERANGE.

To sum up:

In Gutsy, the size exposed by sysfs already exceeded the limit but, since hal did not check errno, the volume size was silently set to 2147483647. This can be confirmed by running hal-device under Gutsy:
gutsy:~$ hal-device
0: udi = '/org/freedesktop/Hal/devices/volume_uuid_47EA_53F8'
...
  volume.num_blocks = 2147483647 (0x7fffffff) (int)
...

In Hardy, errno is handled correctly, but as a side effect hal no longer detects the volume on the device.

A possible fix would be to remove the errno check, or to read the size with strtoul() into an unsigned long int, since 4294967292 fits within ULONG_MAX (4294967295 where long is 32 bits). Besides, a size is never negative, so why use a signed long int?

Finally, I want to stress that even though the regression (from the user's point of view) comes from a change in hal's source code, the hal developers did the right thing by checking errno.

The real question is: why does the kernel report such a funny size through sysfs? Is it related to how the device implements the mass-storage protocol, or to the kernel's USB mass-storage implementation?