Constant arrary input links of size 1 don't work

Bug #1882520 reported by Dirk Zimoch on 2020-06-08
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
EPICS Base
Undecided
Dirk Zimoch

Bug Description

Constant input links in array form with 1 element don't work and return an empty array.

Example:
field (INP, [42])

As soon as a comma is present, it works:
field (INP, [42,])

Related branches

Changed in epics-base:
assignee: nobody → Dirk Zimoch (dirk.zimoch)
status: New → In Progress
summary: - Constant arrar input links of size 1 don't work
+ Constant arrary input links of size 1 don't work
Dirk Zimoch (dirk.zimoch) wrote :

The problem is in dbStaticLib.c line 2334 (dbParseLink):

    /* Link may be an array constant */
    if (pstr[0] == '[' && pstr[len-1] == ']' &&
        (strchr(pstr, ',') || strchr(pstr, '"'))) {
        pinfo->ltype = CONSTANT;
        return 0;
    }

This demands that arrays contain at least one comma or quote. So 1 element string arrays are ok but one element numeric arrays not? Why?

[42] and [] sould be are valid arrays.

Dirk Zimoch (dirk.zimoch) wrote :

See merge request 385269.

Andrew Johnson (anj) wrote :

I am probably going to mark this bug as Invalid and your merge request as Rejected.

Why do you think that code might have been included in dbStaticLib in the first place? It isn't very old, and the strchr() calls in it are very explicit.

I'll give you a hint: https://epics.anl.gov/base/R3-15/6-docs/AppDevGuide/DatabaseDefinition.html#x7-2520006.4.11

Dirk Zimoch (dirk.zimoch) wrote :

What do you mean? I don't get it.

Andrew Johnson (anj) wrote :

record(aai, "[]") {
  field(INP, {const:[]})
  field(NELM, 10)
}
record(aai, "[0]") {
  field(INP, {const:[0]})
  field(NELM, 10)
}
record(aai, "[1]") {
  field(INP, {const:[1]})
  field(NELM, 10)
}
record(aai, "empty") {
  field(INP, [])
  field(PINI, YES)
  field(NELM, 10)
}
record(aai, "zero") {
  field(INP, [0])
  field(PINI, YES)
  field(NELM, 10)
}
record(aai, "one") {
  field(INP, [1])
  field(PINI, YES)
  field(NELM, 10)
}

mdavidsaver (mdavidsaver) wrote :

@anj I take it you're referring to commit d3a9ba7701e45d520e540930df4d8fa23c030f5e with the message 'Fix issues related to const array initialization'. (imo. the appdev guide makes for a poor hint)

Andrew Johnson (anj) wrote :

No, the previous version of that code at 1f726c876027894c78c1a18fe2f2de9eb04c288c still had strchr(pstr, ',') in the condition.

My point is that both [] and [42] are completely legal record names according to our documentation, so this merge would change the meaning of them when used as an INP field value. Neither the double-quote nor comma characters are legal in record names as documented, and the two strchr() tests were to ensure that we only changed the meaning in cases that weren't previously legal.

The commit you pointed to was made to also allow constant strings to be specified using the DB-file syntax ["..."].

We can have a discussion over whether we want to change the requirements for legal record names, but we should agree about doing that first.

Dirk Zimoch (dirk.zimoch) wrote :

Hmmm... Is this a real-life problem? Does anyone name their records []? I must admit that I had not thought of such a case.

Dirk Zimoch (dirk.zimoch) wrote :

I have seen databases using { } or # in their record names. So the rules have changed at some point already. Also the documentation does not mention a minimum length, thus an empty record name is legal. I‘ll test that tomorrow...

mdavidsaver (mdavidsaver) wrote :

record(ai, "a b") {}

record(ai, "x.y") {}

Both result in records being created. "a b" is even accessible through CA.
"x.y" isn't. Neither is linkable.

"legal record name" reads to me like a sign over the barn door. "no equine exit"

Still, maybe this is the time to finally get serious about defining what a valid record name is.

mdavidsaver (mdavidsaver) wrote :

> thus an empty record name is legal

> Error: dbRecordHead: Record name can't be empty
> at or before ")" in path "." file "strange.db" line 7
> Segmentation fault

Which gives a clear indication of how well tested this
error path is.

Andrew Johnson (anj) wrote :

tux% cat bad.db
record(ai, "") {
  field(DESC, "Test")
}
tux% softIoc -d bad.db dbLoadDatabase("/home/phoebus4/ANJ/epics/base/7.0/bin/linux-x86_64/../../dbd/softIoc.dbd")
softIoc_registerRecordDeviceDriver(pdbbase)
Error: dbRecordHead: Record name can't be empty
 at or before ")" in path "." file "bad.db" line 1
Error: syntax error
dbLoadRecords: failed to load 'bad.db'
Error: Failed to load: bad.db

tux% cat bad.db
record(ai, "a b") {
  field(DESC, "Test")
}
record(ai, "a.b") {
  field(DESC, "Test")
}
record(ai, "a$b") {
  field(DESC, "Test")
}
record(ai, "a\"b") {
  field(DESC, "Test")
}
tux% softIoc -d bad.db
dbLoadDatabase("/home/phoebus4/ANJ/epics/base/7.0/bin/linux-x86_64/../../dbd/softIoc.dbd")
softIoc_registerRecordDeviceDriver(pdbbase)
Bad character ' ' in record name "a b"
Bad character '.' in record name "a.b"
Bad character '$' in record name "a$b"
Bad character '"' in record name "a\"b"
dbLoadRecords("bad.db")
iocInit()
Starting iocInit
############################################################################
## EPICS R7.0.3.2-DEV
## Rev. R7.0.3.1-294-g75a3442669af831d8a6b
############################################################################
iocRun: All initialization complete
epics> dbl
a b
a.b
a$b
a\"b

The fact that the messages are printed before the command that caused them is a feature of the new version of softIoc, but all those record names do actually generate an error message when you load them. I did that instead of erroring out because we didn't used to reject such names, and as you have noticed some of those characters do work in some situations.

Note that if I had been strict about checking names the BNL naming convention that includes {} characters inside record names would have been prohibited. As it is since we added JSON link types records cannot be named "{something}", and the brace characters have not been added to the documented list of legal record name characters.

I'm not in favor of our getting too strict about record names too quickly.

Andrew Johnson (anj) wrote :

@mad what did strange.db contain and how did you load it? I don't get a segfault with an empty name.

mdavidsaver (mdavidsaver) wrote :

cat <<EOF > empty.db
record(ai, "") {}
EOF

gdb --args ./bin/linux-x86_64-debug/softIoc -d empty.db
...
Error: dbRecordHead: Record name can't be empty
 at or before ")" in path "." file "empty.db" line 1

Thread 1 "softIoc" received signal SIGSEGV, Segmentation fault.
0x00007ffff7f19e74 in popFirstTemp () at ../dbStatic/dbLexRoutines.c:142
142 ptemp = ptempListNode->item;
(gdb) p ptempListNode
$1 = (tempListNode *) 0x0
(gdb) bt
#0 0x00007ffff7f19e74 in popFirstTemp () at ../dbStatic/dbLexRoutines.c:142
#1 0x00007ffff7f1d17c in dbRecordBody () at ../dbStatic/dbLexRoutines.c:1230
...

Neither function is testing for NULL.

mdavidsaver (mdavidsaver) wrote :

> Note that if I had been strict about checking names the BNL naming convention that includes {}
> characters inside record names would have been prohibited.

If you (or others before you) had done so prior to 2011, then we would have selected a different separator.

Of course 2011 was 9 years ago, so at this point I'm as guilty as anyone for continuing to let this slide. Still, we can only change the future.

You've seen my request on tech-talk. I'd like to start by trying to establish a positive set of allowed characters based on what has actually been used (have you ever seen '+' or ';' in a name?). If, and likely when, this proves impossible I think we can enforce a negative set to blacklist certain characters in certain positions.

Andrew Johnson (anj) wrote :

';' was used by the original VxWorks Global Symbol support to terminate the symbol name to be looked up from later parts of the PV name; it's still documented as being supported by the current version: https://github.com/epics-modules/symb/blob/e44a1f10ae72e56d89fbe22cd8ca8f9e42229304/symbApp/devSymbFind.c#L56

We do have a few records at the APS which have a ';' in their names, and we were recently discussing using it in the APS-Upgrade naming convention to introduce optional naming components, to simplify or speed up parsing, so that usage here may grow.

I have some vague recollection that + has been used for something else (not here at APS though), but I don't now remember what it was, it might even have been in one of my training lectures.

In an old tech-talk post there was a suggestion that starting PV names with a '-' was a bad idea because the caget/caput/camonitor programs mistake the PV name for a flag. They do support the -- flag though, so I think that should just be a suggestion, not a hard rule.

In general I would not be in favor of making any name illegal that follows the documented legal character set; people have asked many times over the years what characters were allowed in record names and I've pointed them at the AppDevGuide. If they actually followed that advice I wouldn't want to break the naming conventions they might have developed.

Using characters that are not in the documented set though opens up names that might work now to being denied in the future, such as the braces {}, use of which is now limited by the simple JSON detection in dbParseLink(). I didn't want to start writing down exactly what all the rules were when I added JSON support though, so I didn't add them to that list.

One thing we probably should do though is disallow any characters that are not strictly 7-bit ASCII. I really don't want to have to check or enforce UTF-8 rules. Interestingly nobody has ever asked me about i18n or l10n of strings in EPICS.

Dirk Zimoch (dirk.zimoch) wrote :

I have tested with 3.14.12 and found that "" is fine as a record name, but segfaults when linking to it (or rather to ".VAL" as linking to "" itself does not work of course).
Also " " and "a b" and even "\"" work as record names but print warnings. Linking to them is impossible though.
I have no problem at all with the record names "1,3" or "[1,3]" or linking to them in 3.14.12.
But in EPICS 7 of course instead of linking to [1,3] I get a constant array.

Also using TAB as the record name works fine in 3.14.7 but I cant link to it any more in 7.

That means EPICS 7 already broke my perfectly working database. Of course only because I used silly record names.

I don't seen why we should care more about compatibility with records [] and [42] than with [1,3]?

I am very fond of the principle of least surprise. So what would be the bigger surprise?
a) [42] and [] are not valid record names
b) [42,43] and [42,] are valid arrays but [42] and [] are not

To me, b) was a big surprise.

What about a rule like:
1. No dot (.), no quotes (" and '), no chars below ASCII/Unicode 0x21.
2. First char is none of: {[<(
3. Otherwise any unicode char (e.g. !äßáñ絚录)

Andrew Johnson (anj) wrote :

I think a bigger surprise would be a site that developed a record naming convention that carefully followed our published naming rules and then found out that they can't upgrade EPICS because we changed those rules. Some EPICS sites don't have developers who have time to read every tech-talk message, and that possibility troubles me.

If you don't like surprises, don't use the [] constant array syntax at all, use {const:[...]} instead as the JSON parser does follow the spec. The current parser (without your change) gives these occasionally surprising results:

field(INP, [1]) INP : DB_LINK [1] NPP NMS
field(INP, [1,2]) INP : CONSTANT [1,2]
field(INP, "[1]") INP : DB_LINK [1] NPP NMS
field(INP, "[1,2]") INP : CONSTANT [1,2]
field(INP, [a]) INP : CONSTANT ["a"]
field(INP, [a,b]) INP : CONSTANT ["a","b"]
field(INP, "[a]") INP : DB_LINK [a] NPP NMS
field(INP, "[a,b]") INP : CONSTANT [a,b] <-- This generates a parse error at iocInit

The INP: part on each line is from a dbpr of the record and shows plink->type.

Applying your patch converts the 3 DB_LINKs into CONSTANTs, and adds a second parse error:

field(INP, [1]) INP : CONSTANT [1]
field(INP, [1,2]) INP : CONSTANT [1,2]
field(INP, "[1]") INP : CONSTANT [1]
field(INP, "[1,2]") INP : CONSTANT [1,2]
field(INP, [a]) INP : CONSTANT ["a"]
field(INP, [a,b]) INP : CONSTANT ["a","b"]
field(INP, "[a]") INP : CONSTANT [a] <-- Parse error at iocInit
field(INP, "[a,b]") INP : CONSTANT [a,b] <-- Parse error at iocInit

Links that give parse errors do not initialize their record at all. I know exactly why they happen so don't waste time investigating them, I have a fix in the works when we introduce the JSON-5 version of the YAJL parser, although putting array constants inside quotes isn't really correct anyway.

FWIW, I am with Dirk here. I would go even further: I do /not/ expect
any arbitrary existing database from 3.x to work with 7 without changes.
To the contrary, experience tells me that I /always/ need to tweak some
things when porting an application to a new (major) base version. For
instance some fields of standard records have gotten new types in EPICS
7 (or perhaps that was 3.14->3.15). And renaming a few records with
funny characters in them really isn't that much effort IMO.

This is why I think that for EPICS 7 we should define which characters
are allowed and that set should be as limited as acceptable by the
community in order to make future enhancements easier.

mdavidsaver (mdavidsaver) wrote :

I'm also in agreement with Dirk, though I won't go as far as Ben. "a few records" doesn't capture the potential effects. At the time, these are 'potential' effects. Based on lackluster response to my tech-talk mail, I think the reasonable course of action is to add new restrictions as warnings. These would remain warnings for some years to give sites an opportunity to notice them and request changes. I would then lean towards loosening restrictions to avoid orphaning whole sites from future updates.

So I begin with:

https://github.com/epics-base/epics-base/pull/78

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers