Comment 19 for bug 622194

fisharebest (fisharebest) wrote :

Mark - the double \n\n is normal.

We need to
a) convert MSDOS/MAC line endings to UNIX
b) ignore blank lines

So, I just do a replace of \r with \n as the first import step.

I tried replacing \r\n with \n, but when the search/replace strings are different lengths, this can be quite slow. Ignoring blank lines later on is quite quick, so I just replace \r characters.
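As a rough illustration of the two steps above (not the actual import code, which is PHP), here is a minimal Python sketch. The function name normalise_lines is made up for this example, and "blank" is assumed to mean a completely empty line.

```python
from typing import List


def normalise_lines(data: str) -> List[str]:
    """Hypothetical helper: convert line endings, then drop blank lines."""
    # Step (a): map every \r to \n. A DOS \r\n pair becomes \n\n and a
    # lone Mac \r becomes \n, so the string length never changes.
    unix = data.replace("\r", "\n")
    # Step (b): skip the empty lines left behind by the doubled \n\n.
    return [line for line in unix.split("\n") if line != ""]


if __name__ == "__main__":
    sample = "0 HEAD\r\n1 SOUR X\r0 TRLR\n"
    print(normalise_lines(sample))
```

Because the replacement is the same length as the search string, the doubled \n\n is an expected side effect, and the blank-line filter cleans it up afterwards.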

I've had another look at the code (editgedcoms.php, import_gedcom_file(), line ~75, if you want to take a look).

In MySQL, max_allowed_packet is (unfortunately) used for many different things. It is documented as being the size of the largest query that can be received over the network (to prevent DoS attacks). Depending on the version of MySQL, it also seems to be used as the largest size of a buffer to use when manipulating strings.

So, the current logic tries to load the file in chunks of 75% of max_allowed_packet.
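To put numbers on that, here is a small Python sketch of the arithmetic. The function name chunk_plan and the sample values (a 5,000,000 byte file, a 1 MiB max_allowed_packet) are illustrative assumptions, not figures from the real code.

```python
import math
from typing import Tuple


def chunk_plan(file_size: int, max_allowed_packet: int,
               fraction: float = 0.75) -> Tuple[int, int]:
    """Hypothetical helper: return (chunk size in bytes, number of chunks)."""
    # Each chunk is a fixed fraction of the packet limit, leaving headroom
    # for the rest of the query.
    chunk_size = int(max_allowed_packet * fraction)
    # Round up so the final, partial chunk is counted too.
    chunks = math.ceil(file_size / chunk_size)
    return chunk_size, chunks


if __name__ == "__main__":
    # Assumed example values: 5,000,000 byte file, 1 MiB packet limit.
    print(chunk_plan(5_000_000, 1_048_576))
```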

On my MySQL 5.0 system, this truncates the entire result when it gets too large. On yours, it seems to just keep part of the expression.

Stephen - this issue is very dependent on the exact version of MySQL. While it may work for you, it isn't working for us. Mark and I both get the error, although with different symptoms.

The solution, as stated elsewhere, is to break the file into chunks in PHP, and store them in a "gedcom_chunk" table.

I resisted that solution initially, because the current system is very fast (when it works!), and means we don't need to handle timeout issues.
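For reference, the chunking approach could look roughly like the sketch below. It is written in Python with an in-memory SQLite database standing in for PHP and MySQL. Only the table name "gedcom_chunk" comes from the discussion above: the column names, the function names, and the choice to cut chunks on line boundaries are all assumptions made for the example.

```python
import sqlite3
from typing import Iterator


def split_into_chunks(data: str, max_chars: int) -> Iterator[str]:
    """Yield pieces of at most max_chars characters (not bytes),
    cutting just after a newline where one is available."""
    pos = 0
    while pos < len(data):
        end = min(pos + max_chars, len(data))
        if end < len(data):
            # Assumption: prefer not to split a line across two chunks.
            cut = data.rfind("\n", pos, end)
            if cut != -1:
                end = cut + 1
        yield data[pos:end]
        pos = end


def store_chunks(conn: sqlite3.Connection, gedcom_id: int,
                 data: str, max_chars: int) -> int:
    """Store each chunk as one row; the column layout is hypothetical."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS gedcom_chunk "
        "(gedcom_id INTEGER, chunk_no INTEGER, chunk_data TEXT)"
    )
    count = 0
    for count, chunk in enumerate(split_into_chunks(data, max_chars), start=1):
        conn.execute(
            "INSERT INTO gedcom_chunk VALUES (?, ?, ?)",
            (gedcom_id, count, chunk),
        )
    conn.commit()
    return count


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(store_chunks(conn, 1, "0 HEAD\n1 SOUR X\n0 TRLR\n", 10))
```

With this layout each INSERT stays well under the packet limit regardless of file size, at the cost of the import now running as many small statements, which is where the timeout handling mentioned above would come in.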