need a little more information in the GraphML
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
collatex | Fix Committed | Undecided | Unassigned | | |
Bug Description
There are a couple of things I would like in the GraphML output which would be difficult to calculate myself, and which I think CollateX already calculates anyway.
At the moment, the GraphML output uncouples transpositions to keep the graph unidirectional. This is fine, but I would like an added key on the duplicated nodes to indicate that they were in fact matched together. For example, for 'march of drought' vs. 'drought of march':
...
<key attr.name="number" attr.type="int" for="node" id="d1"/>
<key attr.name="token" attr.type="string" for="node" id="d0"/>
+ <key attr.name=
...
<node id="14">
<data key="d0"
<data key="d1">14</data>
+ <data key="d2">18</data>
</node>
<node id="15">
<data key="d0"
<data key="d1">15</data>
+ <data key="d2">17</data>
</node>
<node id="16">
<data key="d0">of</data>
<data key="d1">16</data>
</node>
<node id="17">
<data key="d0"
<data key="d1">17</data>
+ <data key="d2">15</data>
</node>
<node id="18">
<data key="d0"
<data key="d1">18</data>
+ <data key="d2">14</data>
</node>
...
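To show how I would consume this on my side, here is a minimal Python sketch that reads the proposed key back out and recovers the matched pairs. The key name `transposition` is only a placeholder of mine (the real name is whatever the branch ends up using), and the sample graph is a cut-down illustration that keeps only the node numbers and partner IDs from the excerpt above.

```python
import xml.etree.ElementTree as ET

NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Illustrative sample only. The key name "transposition" is an assumption,
# and the tokens of the transposed nodes are left out on purpose.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key attr.name="token" attr.type="string" for="node" id="d0"/>
  <key attr.name="number" attr.type="int" for="node" id="d1"/>
  <key attr.name="transposition" attr.type="int" for="node" id="d2"/>
  <graph edgedefault="directed">
    <node id="14"><data key="d1">14</data><data key="d2">18</data></node>
    <node id="15"><data key="d1">15</data><data key="d2">17</data></node>
    <node id="16"><data key="d0">of</data><data key="d1">16</data></node>
    <node id="17"><data key="d1">17</data><data key="d2">15</data></node>
    <node id="18"><data key="d1">18</data><data key="d2">14</data></node>
  </graph>
</graphml>"""


def transposition_pairs(graphml_text, key_name="transposition"):
    """Return the set of (low_id, high_id) node pairs linked by the key."""
    root = ET.fromstring(graphml_text)

    # Look the key id up by its attr.name instead of hard-coding "d2".
    key_id = None
    for key in root.findall("g:key", NS):
        if key.get("attr.name") == key_name and key.get("for") == "node":
            key_id = key.get("id")
    if key_id is None:
        return set()

    pairs = set()
    for node in root.iterfind(".//g:node", NS):
        for data in node.findall("g:data", NS):
            if data.get("key") == key_id:
                a, b = int(node.get("id")), int(data.text)
                # Each pair is recorded on both nodes; normalise the order
                # so the set keeps it only once.
                pairs.add((min(a, b), max(a, b)))
    return pairs


print(sorted(transposition_pairs(SAMPLE)))  # [(14, 18), (15, 17)]
```

Since the link is written on both duplicated nodes, a consumer can start from either side; node 16 ('of') has no such key and is simply skipped.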
Second, it is easy to see visually which nodes belong together in 'columns'. I know CollateX can put the graph into an alignment table, so I wonder whether this alignment information could be included in the GraphML as well? A fourth data key called 'column' or some such, with a numeric ID for each column, would do the trick.
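With such a key in place, rebuilding the columns on my side would be trivial. A rough Python sketch, assuming the key is named `column` with id `d3` (both are my guesses, not anything CollateX defines today) and using an invented four-node graph:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Invented sample: the "column" key and its id "d3" are assumptions.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key attr.name="number" attr.type="int" for="node" id="d1"/>
  <key attr.name="column" attr.type="int" for="node" id="d3"/>
  <graph edgedefault="directed">
    <node id="1"><data key="d1">1</data><data key="d3">1</data></node>
    <node id="2"><data key="d1">2</data><data key="d3">2</data></node>
    <node id="3"><data key="d1">3</data><data key="d3">2</data></node>
    <node id="4"><data key="d1">4</data><data key="d3">3</data></node>
  </graph>
</graphml>"""


def nodes_by_column(graphml_text, key_name="column"):
    """Group node ids by the numeric column ID stored on each node."""
    root = ET.fromstring(graphml_text)

    # Resolve which <key> id(s) carry the column information.
    key_ids = {
        key.get("id")
        for key in root.findall("g:key", NS)
        if key.get("attr.name") == key_name and key.get("for") == "node"
    }

    columns = defaultdict(list)
    for node in root.iterfind(".//g:node", NS):
        for data in node.findall("g:data", NS):
            if data.get("key") in key_ids:
                columns[int(data.text)].append(node.get("id"))
    return dict(columns)


print(nodes_by_column(SAMPLE))  # {1: ['1'], 2: ['2', '3'], 3: ['4']}
```

Nodes 2 and 3 share column 2 here, which is the case where two readings sit side by side in the alignment table.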
Fixed in branch ~tla/collatex/graphml (off of the 1.0release branch).