cvt_d_st and cvt_f_st do not check target field size

Bug #1753675 reported by Dirk Zimoch
Affects:     EPICS Base
Status:      New
Importance:  Undecided
Assigned to: Unassigned

Bug Description

The functions to convert float and double values to string do not check the size of the target field. These functions are entries in the dbFastPutConvertRoutine table used by dbPut.
This can cause buffer overruns when a double or float value is written into a short string field such as EGU while the precision is large.
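
For illustration, a minimal sketch of the hazard (the field size, the value and the sprintf-based body are my assumptions, not the actual dbFast code):

  #include <stdio.h>

  #define EGU_SIZE 16              /* a short string field, e.g. EGU */

  int main(void)
  {
      char egu[EGU_SIZE];
      double value = 1.0 / 3.0;
      int prec = 17;               /* a large PREC makes the text too long */

      /* An unchecked converter effectively does this, writing 19
       * characters plus the NUL past the end of a 16-char buffer: */
      sprintf(egu, "%.*g", prec, value);       /* buffer overrun */

      /* A bounded conversion would honour the field size instead: */
      snprintf(egu, sizeof(egu), "%.*g", prec, value);

      printf("%s\n", egu);
      return 0;
  }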

BTW: putDoubleString and putFloatString, which are used for arrays in the similar dbPutConvertRoutine table, have a similar but less severe problem. They use the target size to step to the next array element but do not check whether the target element is actually long enough to hold the double converted to string. In practice this matters less because all string arrays are currently 40 characters long and cvtDoubleToString artificially limits the precision to 17.
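
A sketch of that array converter's shape (simplified, not the actual dbConvert.c code):

  #include <stdio.h>

  /* 'size' is used only as the stride to the next element; the
   * conversion itself is never bounded by it. */
  static void put_double_string_sketch(const double *pfrom, char *pto,
                                       long nelem, long size, int prec)
  {
      long i;
      for (i = 0; i < nelem; i++) {
          sprintf(pto, "%.*g", prec, pfrom[i]); /* may exceed 'size' */
          pto += size;                          /* stride is correct ... */
      }                                         /* ... element length is unchecked */
  }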

cvt_d_st and cvt_f_st should check the target field size.
Maybe cvtDoubleToString needs a buffersize argument.

Revision history for this message
Andrew Johnson (anj) wrote :

There is no way to add arguments to individual conversion functions; they must all present exactly the same calling signature, which is a major problem for the cvt_d_st() and cvt_f_st() routines with a "too-small" output buffer.

Oh, and since both log(2**63-1)+1 > 19 and log(2**64-1) > 19, the new cvt_q_st() and cvt_uq_st() routines on 3.16 and later can also overflow a 20-character output buffer (INT64_MIN and UINT64_MAX both print as 20 characters, leaving no room for the terminating NUL). How should we indicate that this has happened? My inclination would be to replace the result with a string like "<overflow>", any objections to that? What about when the above real number conversions don't fit, should we limit the precision we feed into them, or just return "<overflow>" again?
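
As a sketch of the sentinel idea (the function and its signature are hypothetical; only snprintf's return-value contract is relied on):

  #include <stdio.h>
  #include <string.h>

  /* Hypothetical: convert an int64 and replace the result with a
   * marker string when it does not fit the target buffer. */
  static void cvt_int64_sketch(long long val, char *pdest, size_t size)
  {
      int n = snprintf(pdest, size, "%lld", val);
      if (n < 0 || (size_t)n >= size) {         /* text was truncated */
          strncpy(pdest, "<overflow>", size);
          pdest[size - 1] = '\0';
      }
  }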

One difference between the "fast" conversions and the dbConvert.c code is that the latter has separate get and put operations:
  long get<1><2>(const dbAddr *paddr, void *pto, long nRequest, long no_elements, long offset);
  long put<1><2>(dbAddr *paddr, const void *pfrom, long nRequest, long no_elements, long offset);
For put operations the dbAddr argument provides the size of the target field, so the put<real>String() converters do know it and can be made to use it.
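
A sketch of how a put converter could use that (dbAddr's pfield and field_size members are real; the snprintf-based body is my assumption, not the actual code):

  #include <dbAccess.h>   /* dbAddr */
  #include <stdio.h>

  /* Sketch: bound every element's text by the target field size
   * taken from the dbAddr, truncating rather than overrunning. */
  static long putDoubleStringSketch(dbAddr *paddr, const double *pfrom,
                                    long nRequest)
  {
      char *pto = (char *)paddr->pfield;
      long size = paddr->field_size;
      long i;

      for (i = 0; i < nRequest; i++) {
          snprintf(pto, (size_t)size, "%g", pfrom[i]);
          pto += size;
      }
      return 0;
  }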

Unfortunately for get operations the const dbAddr argument describes the source buffer, so the get<real>String() routines still have to assume that the target buffer is MAX_STRING_SIZE characters. That may be less of a problem though; I can't think of any input links that are fetched into a small string buffer.

Thus my preferred solution is currently to drop the "fast" convert routines and only use the dbConvert.h API (in dbPut() at least). Then add cvt<type>ToString() routines that can't overflow their buffers, and change dbConvert.c to use them. I think this should fix the worst of the problem.
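
For instance, a bounded replacement might take the buffer size explicitly (a hypothetical signature; the existing cvtDoubleToString() has no such argument):

  #include <stdio.h>

  /* Hypothetical bounded variant: returns the length the full text
   * would have needed, so a return value >= size tells the caller
   * the conversion was truncated. */
  static int cvtDoubleToBoundedString(double val, char *pdest,
                                      size_t size, int prec)
  {
      return snprintf(pdest, size, "%.*g", prec, val);
  }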

Comments?

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

Indeed, the design decisions made when the conversion API was created cause problems that are hard to fix. The fact that we have two sets of conversion functions leads me to assume that there had been previous attempts to fix shortcomings (missing array support, I guess). What I do not quite understand is why the scalar version was not abandoned in favour of an array[1] interpretation. Is it that much slower?

The major problem now is that proper meta-information is given about only one half of the conversion, and the convert functions cannot easily find out which half (sender or receiver) that is. To fix this properly, a complete re-design would be necessary, potentially breaking a lot of existing software. In particular the dbPut* and dbGet* functions, which are in use in many records and drivers (3rd-party ones as well), cannot easily be extended without breaking a lot of existing code.
The problem in conversion to string is that some information is needed from the source (precision, enum table) and some from the target (buffer size).

One potential way to go:
1. Change the convert functions to take a source and a target dbAddr* instead of only one dbAddr* (see the signature sketch after this list).
2. Links therefore need to store the owner's dbAddr in addition to the target dbAddr. This is tricky because a link does not know which field of its owner it is connected to, but at least it can know the record.
3. In dbPut and dbGet, guess the field from the provided buffer pointer (a nasty hack) to fill the dbAddr structure correctly, or assume VAL if the buffer does not match a field. I would try to cache this information in the link the same way dbGetLinkValue caches the convert function.
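
As a signature sketch of step 1 (entirely hypothetical, and Andrew argues below against passing two dbAddr objects):

  /* Hypothetical two-sided converter: both ends described by a dbAddr,
   * so string conversions can read precision from the source and the
   * buffer size from the target. */
  typedef long (*twoSidedConvert)(const dbAddr *pfrom, dbAddr *pto,
                                  long nRequest, long no_elements,
                                  long offset);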

For the future I would suggest providing a new set of get and put functions with the necessary interfaces. In particular, links should probably have an idea of the local field they are connected to, not only of the target field. Most links only ever read or write one local field, or at least that field does not typically change at run-time. So instead of dbPutLink working on a source buffer, I would create the link already connected to a local field and call the new dbPutLink with only one argument: the link. I would keep the old functions for backward compatibility only. Of course this is far beyond a bug fix and needs some investigation and conceptual design first.

Concerning insufficient buffer space I see two possibilities:
Either consider the conversion a two-step process: first convert the number into the correct, unlimited string, then truncate it to the target size. This may change the value drastically, but it is what snprintf would do.
Or fill the string with an error message as a replacement. However, "<overflow>" itself may be subject to truncation. What about "##########" (filling the buffer)?
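
A sketch of that fill-the-buffer alternative (hypothetical helper):

  #include <string.h>

  /* Hypothetical: mark a failed conversion by filling the whole
   * target field with '#', which fits any buffer size. */
  static void markOverflow(char *pdest, size_t size)
  {
      memset(pdest, '#', size - 1);
      pdest[size - 1] = '\0';
  }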

Independent of the question whether this problem can be fixed, I think it is still a bug.

Revision history for this message
Andrew Johnson (anj) wrote :

I suspect your understanding may be a little off. My knowledge of the history is that the scalar versions were added later, which is why they are called "fast" converts: they were an attempt to improve the performance of links that only transport a scalar, and comparing the speeds of the convert routines by themselves, the "fast" ones do seem to be about 20-30% faster. However I suspect that the extra overhead of working out which one to call removes most of that speed advantage. I'm still measuring the effect of doing that, but it does seem to slow down dbGet() a little.

Please note that the older dbConvert.h routines have a different API for put vs. get, and there the convert functions *do* know which side's metadata they have access to. The put routines always provide the destination as a dbAddr, while the get routines always provide the source as a dbAddr. This matches the APIs of dbPut() and dbGet().

I don't think providing a dbAddr for both source and destination would be a good idea; it's actually a relatively heavy-weight object.

Still working on this...

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

I had not looked up the older history of the conversions. I simply guessed from the complexity.

Any new implementation should of course be at least as efficient as the existing one. This makes any re-design a challenge.
