Crash due to a too small buffer being provided in dbContextReadNotifyCache
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
EPICS Base | Fix Released | Medium | mdavidsaver |
3.14 | Fix Released | Medium | mdavidsaver |
3.15 | Fix Released | Medium | mdavidsaver |
3.16 | Fix Released | Medium | mdavidsaver |
Bug Description
Hello,
We have been investigating an IOC crash for some time (originally on VxWorks) and we have finally managed to track it down when using a Linux build of the IOC compiled with Address Sanitizer. Here is a partial backtrace which indicates that a buffer provided by the dbContextReadNo
==18348== ERROR: AddressSanitizer: heap-buffer-
WRITE of size 8 at 0x600c005e7f98 thread T578
<output omitted>
#0 0x7f567aa42460 in getDoubleDouble /home/user/
#1 0x7f567aa355d5 in dbGet /home/user/
#2 0x7f567aa3733e in dbGetField /home/user/
#3 0x7f567aa6b907 in db_get_
#4 0x7f567aa6f762 in db_get_field /home/user/
#5 0x7f567aa84f80 in dbContextReadNo
#6 0x7f567aa8191f in dbChannelIO:
#7 0x7f567a377494 in ca_array_
#8 0x7f567bbd5c55 in caVariable:
#9 0x7f567bdeda19 in seq_pvGet /home/user/
<output omitted>
0x600c005e7f98 is located 0 bytes to the right of 56-byte region [0x600c005e7f60
allocated by thread T509 here:
#0 0x7f567dac61d9 (/lib64/
#1 0x7f567aa84f36 in dbContextReadNo
<output omitted>
SUMMARY: AddressSanitizer: heap-buffer-
We should note that reproducing this bug turned out to be tricky: if the IOC got through a certain action without crashing, it would not crash when the action was repeated. The IOC then had to be restarted to have any chance of the crash occurring.
While reviewing this buffer caching code, we found a bug. The following is a summary of our understanding of it, along with a suggested fix (attached) that has resolved the crash for us. From what we can tell, the bug has been present for a long time and currently exists in both 3.14 and 3.15.
The dbContextReadNo
also be of this same size until the next bumping.
The bug is that when a used buffer is released, it is unconditionally inserted into the free list - even if the size was bumped in between! So when a new allocation request comes in, we may return this old buffer which is shorter than _readNotifyCach
We've fixed it by keeping the allocation size along with each buffer, and checking in the free function whether this is still equal to _readNotifyCach
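The idea behind the fix can be sketched roughly as follows. This is a minimal illustration, not the attached patch: `SizedBufferCache`, `Header`, `currentSize_` and `freeCount` are hypothetical names, and dropping the free list on a size bump is an assumption about how the cache behaves. The point is only that each buffer remembers the size it was allocated with, and the release path refuses to cache a buffer whose size no longer matches the current one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the read-notify buffer cache. All names here are
// illustrative and are not taken from EPICS Base or the attached patch.
class SizedBufferCache {
    // Header stored in front of every payload; aligned so the payload that
    // follows it is suitably aligned for any type (e.g. double).
    struct alignas(std::max_align_t) Header {
        Header *next;      // link used while the buffer sits on the free list
        std::size_t size;  // payload size this buffer was allocated with
    };

public:
    SizedBufferCache() : currentSize_(0), freeList_(nullptr) {}
    SizedBufferCache(const SizedBufferCache &) = delete;
    SizedBufferCache &operator=(const SizedBufferCache &) = delete;
    ~SizedBufferCache() { dropFreeList(); }

    // Hand out a buffer with room for at least `size` bytes.
    void *allocate(std::size_t size) {
        if (size > currentSize_) {
            // Size bump (assumed behaviour): cached buffers are too small now.
            dropFreeList();
            currentSize_ = size;
        }
        Header *h = freeList_;
        if (h) {
            freeList_ = h->next;
        } else {
            h = static_cast<Header *>(std::malloc(sizeof(Header) + currentSize_));
            if (!h) return nullptr;
            h->size = currentSize_;  // remember the size it was made with
        }
        return h + 1;  // payload starts right after the header
    }

    // Give a buffer back. Only buffers of the current size are cached.
    void release(void *p) {
        assert(p != nullptr);
        Header *h = static_cast<Header *>(p) - 1;
        if (h->size == currentSize_) {
            h->next = freeList_;
            freeList_ = h;
        } else {
            std::free(h);  // allocated before a bump: too small to reuse
        }
    }

    // Number of cached buffers (for inspection only).
    std::size_t freeCount() const {
        std::size_t n = 0;
        for (const Header *h = freeList_; h; h = h->next) ++n;
        return n;
    }

private:
    void dropFreeList() {
        while (freeList_) {
            Header *next = freeList_->next;
            std::free(freeList_);
            freeList_ = next;
        }
    }

    std::size_t currentSize_;  // size every cached buffer must have
    Header *freeList_;         // singly linked list of reusable buffers
};
```

With this check in place, a 56-byte buffer that was handed out before a bump to 128 bytes is freed when it is released, so it can never be returned to a later caller that expects 128 bytes.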
I agree with your analysis.