SQL persistence fails to support MySQL

Bug #529735 reported by Matt Chapman
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenCog
Won't Fix
Wishlist
linas

Bug Description

persist/sql/README implies that MySQL is supported, but it is impossible to run atom.sql against MySQL because of the usee of the postgre-specific 'array' data type in the Atoms table:

atom.sql line 31:

outgoing INT[]

I'd suggest serializing the arrays to a string in the code to preserve database portability.

Revision history for this message
Matt Chapman (matt-ninjitsuweb) wrote :

Additional changes required for MySQL support:

(2) Atoms table index 'nodeidx' needs a limit on the length of the name index, e.g.:

-CREATE INDEX nodeidx ON Atoms(type, name);
+CREATE INDEX nodeidx ON Atoms(type, name(512));

(3) TypeCodes table needs a limit on size of typename index, e.g.:

- typename TEXT UNIQUE
+ typename VARCHAR(999) UNIQUE

(There's probably another way to do that where only the index is limited, but perhaps not for Unique. In any case, typenames should never be extraordinarily long, so I think it's perfectly reasonable to limit the size of this field.)

Revision history for this message
linas (linasvepstas) wrote :

Short-term solution would be to create different "atom.sql" files for each DB type. (I'd rather not try to use just one definition -- hard-coding length restrictions seems awfully hacky).

Yes, these *could* be put into the C code, but some additional work is needed to determine, at runtime, which kind of DB one is connected to, so as to use the appropriate table formats.

Changed in opencog:
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → linas (linasvepstas)
Revision history for this message
Matt Chapman (matt-ninjitsuweb) wrote :

I think maybe you misunderstood my suggestion about what to put in the C code.

You need to not use the int[] (array) type, because it is specific to PostgreSQL. So instead, you need to serialize your arrays in the code and write a string to the database in that column, so that an ODBC driver will work.

Revision history for this message
Matt Chapman (matt-ninjitsuweb) wrote :

%s/an ODBC/any ODBC/g

Revision history for this message
linas (linasvepstas) wrote :

If I understand you, then mysql has no support for arrays.

This is tricky -- simply converting the array into a string is not enough, since one would need to be able to perform searches on substrings. Does mysql support regex expressions on varchar fields? If so, then I guess that one could use regex, although that would seriously hurt performance. If there's no regex support, then one would have to implement a part of the search on the client side, which could add significant complexity. My knee-jerk reaction is to not support mysql.

Anyway, there is a real need to find all links that contain some particular atom in their outgoing set; future plans call for more complex joins as well.

Rather than focusing energy on mysql, it is probably(?) better to aim at the long-term solution of using hypertable, cassandra or hadoop.

Revision history for this message
Matt Chapman (matt-ninjitsuweb) wrote :

What about normalizing the array to a separate table? Is that an option? (I haven't studied the C code at all yet myself.)

Also, I would not urge focusing effort on MySQL specifically. I would suggest effort toward full ANSI SQL compatibility, and I'm willing to help toward that effort in my very limited time. (Maybe a couple hours each weekend.)

Revision history for this message
linas (linasvepstas) wrote :

> What about normalizing the array to a separate table? Is that an option? (I haven't studied the C code at all yet myself.)

Yes, that's an option, and its partly implemented. There's some #ifdef's surrounding the code that does this, called "inline links" or "out-of-line links" or something like that. That code used to compile & work long long ago, but is surely broken now. I moved away from it because a) it was significantly more complex, and b) it had 1/2 or 1/3'rd the performance of the current, in-line version.

linas (linasvepstas)
Changed in opencog:
importance: Medium → Wishlist
linas (linasvepstas)
Changed in opencog:
status: Confirmed → Won't Fix
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.