file module to allow for arbitrary encodings

Bug #942171 reported by Matthias Brantner
This bug affects 1 person
Affects: Zorba
Status: Fix Committed
Importance: Medium
Assigned to: Paul J. Lucas

Bug Description

The file module's read and write functions need to be able to read and write arbitrary encodings.

Currently, only reading arbitrary encodings is supported. The write part requires API-incompatible changes. Specifically, the file:write function should be removed and split into two functions, file:write-text and file:write-binary. The former accepts the encoding as a parameter. None of the write functions should implicitly call fn:serialize.

Chris Hillery (ceejatec)
Changed in zorba:
milestone: none → 3.0
Chris Hillery (ceejatec)
tags: added: incompatible-change
Changed in zorba:
assignee: Matthias Brantner (matthias-brantner) → Paul J. Lucas (paul-lucas)
Revision history for this message
Paul J. Lucas (paul-lucas) wrote :

But doesn't the file module conform to an external specification? If so, how can you just change it at will?

Revision history for this message
Matthias Brantner (matthias-brantner) wrote : Re: [Bug 942171] Re: file module to allow for arbitrary encodings

The external specification allows for encodings.

On Jun 20, 2013, at 8:44 AM, "Paul J. Lucas" <email address hidden> wrote:

> But doesn't the file module conform to an external specification? If so,
> how can you just change it at will?
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/942171
>
> Title:
> file module to allow for arbitrary encodings
>
> Status in Zorba - The XQuery Processor:
> New
>
> Bug description:
> The file module's read and write functions need to be able to read and
> write arbitrary encodings.
>
> Currently, only reading arbitrary encodings is supported. The write
> part requires API-incompatible changes. Specifically, the file:write
> function should be removed and split into two functions,
> file:write-text and file:write-binary. The former accepts the encoding
> as a parameter. None of the write functions should implicitly call
> fn:serialize.
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/zorba/+bug/942171/+subscriptions

Revision history for this message
Paul J. Lucas (paul-lucas) wrote :

If the serializer isn't to be used, then what should be used in its place?

Changed in zorba:
status: New → In Progress
Revision history for this message
Matthias Brantner (matthias-brantner) wrote :

There are only two functions to write:

file:write-binary and file:write-text.

The latter accepts a string. Hence, no serialization is needed. The only thing that may still need to be done is to encode the given string in the requested encoding.
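
As a rough illustration of that encoding step (not Zorba's actual implementation), the sketch below takes a string that is assumed to be UTF-8 and writes it in a requested encoding. The names utf8_to_latin1 and write_text are hypothetical, and only UTF-8 and ISO-8859-1 are handled; a real implementation would more likely delegate to a transcoding library.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Zorba): transcode UTF-8 to ISO-8859-1.
std::string utf8_to_latin1(const std::string& utf8) {
  std::string out;
  for (std::size_t i = 0; i < utf8.size(); ++i) {
    const unsigned char b = static_cast<unsigned char>(utf8[i]);
    if (b < 0x80) {
      // ASCII is identical in both encodings.
      out.push_back(static_cast<char>(b));
    } else if (b == 0xC2 || b == 0xC3) {
      // Two-byte sequences with these lead bytes cover U+0080..U+00FF.
      if (i + 1 >= utf8.size())
        throw std::invalid_argument("truncated UTF-8 sequence");
      const unsigned char c = static_cast<unsigned char>(utf8[++i]);
      if ((c & 0xC0) != 0x80)
        throw std::invalid_argument("malformed UTF-8 sequence");
      out.push_back(static_cast<char>(((b & 0x03) << 6) | (c & 0x3F)));
    } else {
      // Anything else is either above U+00FF or not valid UTF-8.
      throw std::range_error("character not representable in ISO-8859-1");
    }
  }
  assert(out.size() <= utf8.size());  // Latin-1 output is never longer
  return out;
}

// Hypothetical counterpart of file:write-text: the caller passes a string,
// so nothing is serialized here; the string is only encoded and written.
void write_text(std::ostream& out, const std::string& utf8,
                const std::string& encoding = "UTF-8") {
  if (encoding == "UTF-8")
    out << utf8;
  else if (encoding == "ISO-8859-1")
    out << utf8_to_latin1(utf8);
  else
    throw std::invalid_argument("unsupported encoding: " + encoding);
}
```

Passing any std::ostream (for example a std::ofstream opened in binary mode) keeps the encoding logic independent of how the file is opened.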

Revision history for this message
Paul J. Lucas (paul-lucas) wrote :

The existing function file:write() calls write-text() after passing $content through the serializer. So that's still OK?

Revision history for this message
Paul J. Lucas (paul-lucas) wrote :

Also, the WriteBinary::evaluate has code in it like:

  while ( it->next( item ) ) {
    if ( item.isStreamable() && !item.isEncoded() )
      ofs << item.getStream().rdbuf();
    else {
      Zorba_SerializerOptions options;
      options.ser_method = ZORBA_SERIALIZATION_METHOD_BINARY;
      Serializer_t serializer( Serializer::createSerializer( options ) );
      SingletonItemSequence seq( item );
      serializer->serialize( &seq, ofs );
    }
  }

Is it OK to keep the serializer there? If not, what should be substituted?

Revision history for this message
Matthias Brantner (matthias-brantner) wrote :

The file:write function should be removed.

Matthias

On Jul 29, 2013, at 3:05 PM, "Paul J. Lucas" <email address hidden> wrote:

> The existing function file:write() calls write-text() after passing
> $content through the serializer. So that's still OK?

Revision history for this message
Matthias Brantner (matthias-brantner) wrote :

No, that doesn't look good. The binary serialization method should be removed.

if ( item.isStreamable() && !item.isEncoded() )
{
  ofs << item.getStream().rdbuf();
}
else if ( item.isStreamable() && item.isEncoded() )
{
  // put base64 decoding streambuffer in place
  // and output
}
else if ( !item.isStreamable() && !item.isEncoded() )
{
  size_t size;
  const char* c = item.getBase64BinaryValue( size );
  ofs.write( c, size );
}
else if ( !item.isStreamable() && item.isEncoded() )
{
  size_t size;
  const char* c = item.getBase64BinaryValue( size );
  // base64 decode c and output
}
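
The four cases above can be sketched as self-contained C++ without the serializer. This is only an illustration under stated assumptions: BinaryItem is a hypothetical stand-in for Zorba's item class, base64_decode is a minimal hand-written decoder, and the streamable-and-encoded case buffers the whole stream before decoding instead of installing a decoding stream buffer as suggested above.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a base64Binary item (Zorba's real class differs).
struct BinaryItem {
  std::istream* stream;  // non-null if the item is streamable
  std::string value;     // used when the item is not streamable
  bool encoded;          // true if the content is still base64 text
};

// Minimal base64 decoder (standard alphabet); '=' padding ends the input.
std::string base64_decode(const std::string& in) {
  std::string out;
  std::uint32_t buf = 0;
  int bits = 0;
  for (char ch : in) {
    int v;
    if (ch >= 'A' && ch <= 'Z') v = ch - 'A';
    else if (ch >= 'a' && ch <= 'z') v = ch - 'a' + 26;
    else if (ch >= '0' && ch <= '9') v = ch - '0' + 52;
    else if (ch == '+') v = 62;
    else if (ch == '/') v = 63;
    else if (ch == '=') break;
    else throw std::invalid_argument("invalid base64 character");
    buf = (buf << 6) | static_cast<std::uint32_t>(v);
    bits += 6;
    if (bits >= 8) {  // a full byte is available: emit it
      bits -= 8;
      out.push_back(static_cast<char>((buf >> bits) & 0xFF));
    }
    assert(bits < 8);
  }
  return out;
}

// Write one item as raw bytes, decoding base64 first where needed.
void write_binary(std::ostream& out, const BinaryItem& item) {
  if (item.stream && !item.encoded) {
    out << item.stream->rdbuf();  // copy the raw stream through unchanged
    return;
  }
  std::string data = item.value;
  if (item.stream) {              // streamable and encoded: buffer it first
    std::ostringstream tmp;
    tmp << item.stream->rdbuf();
    data = tmp.str();
  }
  if (item.encoded)
    data = base64_decode(data);
  out.write(data.data(), static_cast<std::streamsize>(data.size()));
}
```

Note that copying an empty source with `out << rdbuf()` sets failbit on the output stream, so a production version would need to handle empty streamable items separately.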

On Jul 29, 2013, at 3:09 PM, "Paul J. Lucas" <email address hidden> wrote:

> Also, the WriteBinary::evaluate has code in it like:
>
> while ( it->next( item ) ) {
>   if ( item.isStreamable() && !item.isEncoded() )
>     ofs << item.getStream().rdbuf();
>   else {
>     Zorba_SerializerOptions options;
>     options.ser_method = ZORBA_SERIALIZATION_METHOD_BINARY;
>     Serializer_t serializer( Serializer::createSerializer( options ) );
>     SingletonItemSequence seq( item );
>     serializer->serialize( &seq, ofs );
>   }
> }
>
> It is OK to keep the serializer there? If not, what should be
> substituted?

Revision history for this message
Paul J. Lucas (paul-lucas) wrote :

With the changes, no write* function takes serialization-parameters any more, yet the append() function still does. It's the only function left that does. Should it be eliminated?

Revision history for this message
Matthias Brantner (matthias-brantner) wrote :

On Aug 1, 2013, at 8:56 PM, "Paul J. Lucas" <email address hidden> wrote:

> With the changes, no write* function has serialization-parameters any
> more; yet there is still the append() function that takes a
> serialization-parameters. It's the only function left that does. Should
> it be eliminated?
Yes


Revision history for this message
Paul J. Lucas (paul-lucas) wrote :

The file test/rbkt/Queries/zorba/file/common.xqlib contains the function:

declare %ann:sequential function commons:testWriteSerializeXml($path as xs:string, $xml as item()) as xs:string* {
  file:write(
    $path,
    $xml,
    <output:serialization-parameters>
      <output:method value="xml"/>
    </output:serialization-parameters>);

  "SUCCESS";
};

How can this be rewritten without (A) file:write() and (B) any write function that allows the specification of serialization parameters?

Revision history for this message
Matthias Brantner (matthias-brantner) wrote :

declare %ann:sequential function commons:testWriteSerializeXml($path as xs:string, $xml as item()) as xs:string* {
  file:write-text(
    $path,
    fn:serialize(
      $xml,
      <output:serialization-parameters>
        <output:method value="xml"/>
      </output:serialization-parameters>));

  "SUCCESS";
};

On Aug 1, 2013, at 9:17 PM, "Paul J. Lucas" <email address hidden> wrote:

> The file test/rbkt/Queries/zorba/file/common.xqlib contains the
> function:
>
> declare %ann:sequential function commons:testWriteSerializeXml($path as xs:string, $xml as item()) as xs:string* {
>   file:write(
>     $path,
>     $xml,
>     <output:serialization-parameters>
>       <output:method value="xml"/>
>     </output:serialization-parameters>);
>
>   "SUCCESS";
> };
>
> How can this be rewritten without (A) file:write() and (B) without any
> write function that allows the specification of serialization
> parameters?

Revision history for this message
Paul J. Lucas (paul-lucas) wrote :

Ignore previous comment. However, now when I run the query, I get:

870: http://www.w3.org/2005/xqt-errors:FODC0006: http://www.w3.org/2005/xqt-errors:FODC0006invalid content passed to fn:parse-xml(): "/Users/pjl/src/flwor/zorba/repo/bug-942171/zorba/build/test/rbkt/Queries/tmpCreateWriteReadDeleteSeries/file.xml":1,1: loader parsing error: empty XML document ('<' expected)[line 48][column 8][file file:///Users/pjl/src/flwor/zorba/repo/bug-942171/zorba/test/rbkt/Queries/zorba/file/common.xqlib]

file.xml now only contains:

  value

Revision history for this message
Matthias Brantner (matthias-brantner) wrote :

But fn:serialize does take such options.

On Aug 1, 2013, at 10:59 PM, "Paul J. Lucas" <email address hidden> wrote:

> That doesn't work because write-text() no longer has serialization
> options. Per your request, there is now only an encoding.

Revision history for this message
Matthias Brantner (matthias-brantner) wrote : Re: [Bug 942171] file module to allow for arbitrary encodings

I fixed the failing test. Here are some more comments on the merge proposal:

- append-text($file as xs:string, $value as xs:string, $encoding as xs:string) is missing.
- append-binary should accept only one base64Binary.
- Could you also add the functions append-text-lines and write-text-lines from http://expath.org/spec/file?
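
To make the write/append distinction for the line-oriented functions concrete, here is a minimal C++ sketch. It assumes each string is written followed by a newline (which is my reading of the EXPath spec), omits the encoding parameter, and uses hypothetical names (put_text_lines, write_text_lines, append_text_lines) that are not Zorba's.

```cpp
#include <cassert>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical shared helper: `append` selects whether existing content is
// kept (std::ios::app) or discarded (std::ios::trunc).
void put_text_lines(const std::string& path,
                    const std::vector<std::string>& lines, bool append) {
  assert(!path.empty());
  std::ofstream ofs(path,
                    std::ios::out | (append ? std::ios::app : std::ios::trunc));
  if (!ofs)
    throw std::runtime_error("cannot open file: " + path);
  // In text mode, '\n' is translated to the platform's line ending.
  for (const std::string& line : lines)
    ofs << line << '\n';
}

// Replace the file's content with the given lines.
void write_text_lines(const std::string& path,
                      const std::vector<std::string>& lines) {
  put_text_lines(path, lines, false);
}

// Add the given lines after the file's existing content.
void append_text_lines(const std::string& path,
                       const std::vector<std::string>& lines) {
  put_text_lines(path, lines, true);
}
```

Sharing one helper keeps the two functions identical except for the open mode, which is the only difference the write and append variants need.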

On Aug 2, 2013, at 7:23 AM, Paul J. Lucas <email address hidden> wrote:

> Ignore previous comment. However, now when I run the query, I get:
>
> 870: http://www.w3.org/2005/xqt-errors:FODC0006: http://www.w3.org/2005
> /xqt-errors:FODC0006invalid content passed to fn:parse-xml():
> "/Users/pjl/src/flwor/zorba/repo/bug-942171/zorba/build/test/rbkt/Queries/tmpCreateWriteReadDeleteSeries/file.xml":1,1:
> loader parsing error: empty XML document ('<' expected)[line 48][column
> 8][file
> file:///Users/pjl/src/flwor/zorba/repo/bug-942171/zorba/test/rbkt/Queries/zorba/file/common.xqlib]
>
> file.xml now only contains:
>
> value

Revision history for this message
Paul J. Lucas (paul-lucas) wrote :

The following files/lines still use file:write(), and it's not obvious how to change the code so that file:write() is no longer used:

doc/zorba/xqdoc/src/xqdoc-html.xq, line 44
modules/com/zorba-xquery/www/modules/xqdoc/batch.xq, lines 100, 144
test/fots_driver/reporting.xq, line 83

Revision history for this message
Matthias Brantner (matthias-brantner) wrote :

Same as the previous one. For example,

file:write($output-file, batch:xqdoc($page), ());
=>
file:write-text($output-file, serialize(batch:xqdoc($page)));

On Aug 5, 2013, at 4:00 PM, Paul J. Lucas <email address hidden> wrote:

> The following files/lines still use file:write() and it's not obvious as
> to how to change the code such that file:write() is no longer used:
>
> doc/zorba/xqdoc/src/xqdoc-html.xq, line 44
> modules/com/zorba-xquery/www/modules/xqdoc/batch.xq, lines 100, 144
> test/fots_driver/reporting.xq, line 83

Changed in zorba:
status: In Progress → Fix Committed