reading only 1st char of link in "long string" ($) syntax fails in read error

Bug #907761 reported by Dirk Zimoch
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
EPICS Base
Fix Released
Medium
Andrew Johnson

Bug Description

> caget -# 1 record.INP$
CA.Client.Exception...............................................
    Warning: "Channel read request failed"
    Context: "op=0, channel=record.INP$, type=DBR_TIME_CHAR, count=1, ctx="read failed""
    Source File: ../getCopy.cpp line 82
    Current Time: Thu Dec 22 2011 14:00:26.382257828
..................................................................

This fails for INP$, SDIS$, and probably for any link field.

However DESC$, NAME$ and probably other string fields are fine.

Reading 2 or more characters succeeds.

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

The above bug happened when using a 3.14.8 client with a 3.14.12 ioc. Using a 3.14.12 client succeeds. So it seems the bug had been fixed. I thought the problem was in the server but it seemed to be a client problem.

Revision history for this message
Andrew Johnson (anj) wrote :

Confirmed fixed in 3.14.12.

Changed in epics-base:
status: New → Fix Released
Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

I just found the problem was gone in 3.14.12.1 but re-appeared in 3.14.12.2. Still investigating...

Revision history for this message
Ralph Lange (ralph-lange) wrote :

As the client does not know if a field is a link or not - I would actually suspect the server side.

Or a specific client behavior (maybe connected to the variable length array feature?) that triggers something in the server. Note that "array of length 1" is pretty much how CA and the Database internally tag a scalar.

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

With the 3.14.12.1 version of catools/caget.c, the problem does not appear. However this version requests the full array length and then cuts the output to the number of requested elements. The version of caget.c in 3.14.12.2 requests only the number of elements as specified by the user, which is more correct. This fails.

So far I see the problem is:

ca_array_get(DBR_<any>_CHAR, elements, chid, &buffer) fails if elements is 1 and the channel is connected to a link field.

I tried a different client with both, the 3.14.12.1 and3.14.12.2 client libraries. Both fail.

So I guess the bug is indeed in the server, because it only appears when accessing links fields as char arrays, not string fields. It had not been fixed in 3.14.12.1, only been hidden by the caget.c implementation.

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

The same problem as for links exists for SPC_ATTRIBUTE fields like .RNAM.

The problem is caused by multiple bugs in string length calculation and termination in dbGetField, dbGet.

The algorithm wrongly assumed that CHAR arrays need to be 0 terminated and then did the length calculation wrong by decreasing the length twice. The patch attached fixes this so that access to links and attributes behaves the same as access to string fields.

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :
Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

Of course I meant ".RTYP", not ".RNAM" in comment #6.

Revision history for this message
Andrew Johnson (anj) wrote :

When long string support was added, the code was modified to 0-terminate all char arrays retrieved using the long string feature so that clients don't need to do that for themselves. With this design it makes little sense to fetch a single-element char array as the result will always be a single 0 character (i.e. an empty string), which is why it was rejecting those requests.

I don't mind changing the code so that a 1-char array fetch will succeed, but I don't want to remove the 0-termination functionality, so it should always return an empty string. I agree that the double subtraction is an error, which I will also fix.

I think that just leaves the issue of 0-terminating a char-array fetched from a DBF_STRING field, which I would like to see happen in order to make everything behave the same, so I'll regard that as the other bug here. I need to look at the code more closely to work out how best to implement it though.

Changed in epics-base:
status: Fix Released → In Progress
assignee: nobody → Andrew Johnson (anj)
importance: Undecided → Low
Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

I agree that the behavior should be the same for links, attributes and strings. When I saw that strings are not 0 terminated when read partially (and worked) I considered that the "correct" behavior and changed links and attributes accordingly.

The actual client with that I run into this problem was the Tcl interface. When it connects to a channel, it first gets a DBR_CTRL_* structure with element count 1 in order to get EGU etc. After that, it uses DBR_STS_* with the actual element count for any further communication. This failed for link fields. Maybe other clients behave similar. In this situation, the client is not really interested in the value byte.

Nevertheless, I find it questionable to "fake" a 0 into the data. If a client requests less bytes than the full string, it should not expect a 0 termination. Compare to strncpy. Also consider extensions to CA that allow the client to read the array piecewise (-> Ralph's server-side-plugins). In this case the client would have to do extra work to get rid of the additional 0 bytes. Of course from a human's point of view, there is a difference between arrays of 8 bit numbers and strings. But for a computer that's the same. And it adds extra difficulty to programs to handle char arrays differently dependent on their interpretation by a human.

I would add 0 bytes only at the end of the full string, but not at partially read strings.

Revision history for this message
Andrew Johnson (anj) wrote :

strncpy() is regarded as a major source of bugs and security problems in C code when a programmer forgets to force the terminating zero byte; I want to avoid repeating that mistake. In all the cases we're talking about we know that the data we are looking at is a character string and not just an array of bytes, so we can remove a possible source of bugs in client code by nil-terminating the result. This isn't "faking" anything, it's trying to maintain the invariant that C strings are (supposed to be) nil-terminated.

I fully admit that long strings were a string-and-sealing-wax addition to Base, but that's no reason not to file off any unnecessary sharp edges so our users don't get hurt unnecessarily.

I'm attaching a patch which allows single-element fetches, fixes the double subtraction issues, and adds nil-termination to DBF_STRING fields being read as long-string char arrays. If you can try it out I would appreciate another set of eyes on the result before I commit it and add it to the Known Problems page.

Thanks,

- Andrew

Changed in epics-base:
importance: Low → Medium
Andrew Johnson (anj)
Changed in epics-base:
status: In Progress → Fix Committed
Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote : Re: [Bug 907761] Re: reading only 1st char of link in "long string" ($) syntax fails in read error

Hi Andrew,

I found time to test your patch today. It looks fine for me. Strings,
links and attributes behave the same now. All are terminated with a
single 0 in all cases, resulting in only the 0 bytes when requesting 1 byte.

If I understood the "dynamic arrays" feature correctly, shouldn't the
char array consist only of strlen()+1 bytes?

 > caget -S DZ:CO.RTYP$
DZ:CO.RTYP$ calcout

 > caget DZ:CO.RTYP$
DZ:CO.RTYP$ 40 99 97 108 99 111 117 116 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

 > caget -#2 DZ:CO.RTYP$
DZ:CO.RTYP$ 2 99 0

 > caget -#1 DZ:CO.RTYP$
DZ:CO.RTYP$ 1 0

 > caget -#0 DZ:CO.RTYP$
DZ:CO.RTYP$ 40 99 97 108 99 111 117 116 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

In the last case, I had expected
DZ:CO.RTYP$ 8 99 97 108 99 111 117 116 0

Or do I misunderstand the "dynamic arrays" feature? The
RELEASE_NOTES.html says:

"It permits a CA client to fetch only the currently valid elements of an
array by specifying a COUNT of zero [...]"

I had expected that the valid elements would be strlen()+1 bytes?

Dirk

Andrew Johnson wrote:
> strncpy() is regarded as a major source of bugs and security problems in
> C code when a programmer forgets to force the terminating zero byte; I
> want to avoid repeating that mistake. In all the cases we're talking
> about we know that the data we are looking at is a character string and
> not just an array of bytes, so we can remove a possible source of bugs
> in client code by nil-terminating the result. This isn't "faking"
> anything, it's trying to maintain the invariant that C strings are
> (supposed to be) nil-terminated.
>
> I fully admit that long strings were a string-and-sealing-wax addition
> to Base, but that's no reason not to file off any unnecessary sharp
> edges so our users don't get hurt unnecessarily.
>
> I'm attaching a patch which allows single-element fetches, fixes the
> double subtraction issues, and adds nil-termination to DBF_STRING fields
> being read as long-string char arrays. If you can try it out I would
> appreciate another set of eyes on the result before I commit it and add
> it to the Known Problems page.
>
> Thanks,
>
> - Andrew
>
> ** Patch added: "fix-907761.patch"
> https://bugs.launchpad.net/epics-base/+bug/907761/+attachment/2660256/+files/fix-907761.patch
>
> ** Changed in: epics-base
> Importance: Low => Medium
>

Revision history for this message
Andrew Johnson (anj) wrote :

Hi Dirk,

Thanks for looking at this, I merged it yesterday.

Note that long strings != dynamic arrays.

The original aim for dynamic arrays was so CA monitors of large waveforms would only send NORD instead of NELM elements — the current tech-talk conversations about pyepics are discussing exactly this, where they have a large waveform that stores images which can vary in size and they only want CA to distribute the much smaller Region-Of-Interest.

The long strings implementation doesn't currently call strlen() on the source field, although it could probably be made to do so and update the request length appropriately. That's trivial for SPC_ATTRIBUTE and DBF_*LINK fields (done here but not committed), but it's a harder problem for DBF_STRING fields since the routines in dbConvert.c can't update the request length. I'll probably move this work into the 3.15 tree — it's really a new feature that might need more extensive changes to implement fully, and there have already been other changes in this area.

BTW you have to use camonitor or "caget -c" to use dynamic arrays, they only work when the data is provided through a CA callback which caget only uses when given the -c flag.

- Andrew

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

Hi Andrew,

I found a problem with fix-907761.patch today. In src/db/dbConvert.c, functions getCharChar and getCharUchar the code now evaluates paddr->pfldDes->field_type but it does not check if the fields are set. There is code in base that calls this cunfion without setting paddr->pfldDes. Thus dereferencing this field crashes the IOC.

It is called like this for example from dbCaGetLink through the aConvert pointer.

Dirk

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

I suggest to change
if (paddr->pfldDes->field_type == DBF_STRING)
to
if (paddr->pfldDes && paddr->pfldDes->field_type == DBF_STRING)

Revision history for this message
Andrew Johnson (anj) wrote :

Apply Dirk's suggestion.

Revision history for this message
Andrew Johnson (anj) wrote :

Fix committed in 3.14.12.3 release.

Changed in epics-base:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.