remove more attributes with default values

Bug #714731 reported by Jan Thor
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Inkscape
Fix Released
Low
jazzynico
Scour
Fix Released
Medium
Unassigned
scour (Ubuntu)
Fix Released
Undecided
Unassigned

Bug Description

Within removeDefaultAttributeValues, scour removes some attributes if they have default values and are therefore unnecessary, but this list of default values is rather short and conservative. It would be nice if scour could also get rid of default attribute values like fill-rule:nonzero and stroke-linecap:butt and a lot of other stuff Inkscape loves to include.

There is a problem, though: if an element changes such a value, and a child element sets the value back to the default value, we can’t delete this attribute in the child element, since the default value is now meaningful. Therefore, it is necessary to keep track whether some attribute has been changed somewhere among the ancestors, in which case an attribute pointing to a default value should be left untouched.

I appended a code fragment where I made the necessary changes to the removeDefaultAttributeValues function. The rest of the code is unaffected, although my changes could be used to make repairStyle shorter (and remove a known problem of the way repairStyle deals with cases of opacity=1 attributes and their relatives).

Related branches

Revision history for this message
Jan Thor (jan-janthor) wrote :
Revision history for this message
Louis Simard (louis-simard-deactivatedaccount) wrote :

That code gives me deja vu for some reason, as if I had read it somewhere else already...

Anyway, on to the comments:

1. Are you in an environment where you can't make unified diffs, or perhaps you don't know how to make them yet? Unified diffs are an excellent way to make patches, where the diff itself shows what was removed and added, yet any contributor only needs to keep the old or new version of a file to apply or revert a patch. On Unix or BSD, you can copy the Bazaar repository version of the file to scour.py.old and issue "diff -u scour.py.old scour.py", after editing scour.py, to make a patch. On Windows you can probably get a tool to make unified diffs too, like a gnuwin32 port of 'diff'.

2. _getStyle and _setStyle could probably be used elsewhere when the code needs to read and write style attributes; well done on making the accessor code into functions.

3. Using str(), float() and convertColor() as canonicalising functions before checking the values is a nice touch, but it may make Scour a bit inefficient. It makes #000 equivalent to black, etc., but the time taken to ensure that both colors/numbers are canonical even after colors and number precision were optimised seems like a bit of a waste.

Overall your code looks sound, but I would hugely prefer having a patch in unified diff format to work with.

Changed in scour:
importance: Undecided → Medium
status: New → Confirmed
Revision history for this message
Jan Thor (jan-janthor) wrote :

Thanks, I'm aware of the existence of diffs, but don't use them regularly. I will try to make one within the next days.

Since reducePrecision() happens after removeDefaultAttributeValues() (although I guess that could be swapped without ill effects), I guess I can't assume that floats in attribute values are always in canonical form? But anyway, a value of four will, in most cases, be either "4" or "4.0" (and not "4.00000" or "0.4e1").

I'm a bit unsure what attributes-with-defaults should be included; the most thorough approach, I guess, would be to include everything in SVG1.2 and CSS3, but then, there isn't much point in checking each and every element for stuff none of the current vector editors uses just to slow scour down. Since I'm exclusively using Inkscape, I have no idea what Adobe or Corel do (perhaps they are less zealous to include defaults?).

Revision history for this message
Louis Simard (louis-simard-deactivatedaccount) wrote :

In order to retain Scour's speed, your code could be edited (by you or by another developer) to reverse the checks you make. Instead of iterating over defaultAttributes and checking for the existence of the attribute, you could iterate over the attributes of the element and check in defaultAttributes.

In most cases that would reduce the number of iterations from 38 to something around 5... or 10 for especially verbsoe elements :) But you could add the rest of the attribute defaults from the SVG specification without sacrificing much speed in Scour.

reducePrecision() does get called after removeDefaultAttributeValues(), whoops. After reversing those calls, the canonicalisation to float could be removed. Is there a built-in function in Python that does nothing but return its first argument? :)

Revision history for this message
Jan Thor (jan-janthor) wrote :

I installed bazaar today, loaded the current trunc and started to code a bit (I'll have to reverse the checks, as you suggested, and it's currently not working as intended, and I still have to clean up the list of default attributes, and now I'm tired and ripe for sleeping...).

While doing that, I noticed that both reducePrecision() and convertColors() only work on attributes, not on styles. But since the conversion of stuff in styles to attributes is the default and has to be explicitly disabled, perhaps it's no big deal, and fixing it not worth the trouble. Otherwise, this could be fixed in one go.

Revision history for this message
Louis Simard (louis-simard-deactivatedaccount) wrote :

There's an open bug related to an issue caused by --disable-style-to-xml, bug 702423. It makes me believe that Ubuntu needs --disable-style-to-xml in its optimisations of SVG files.

In my private testing, the default behavior ("enable style to XML", even if that's not a real option) is often responsible for expansions of about 200 bytes on small documents. Thus that may have been a wise choice for them.

For correctness' sake, and for Ubuntu's sake, it would be best if your fix also covered --disable-style-to-xml. :)

Revision history for this message
Jan Thor (jan-janthor) wrote :

I added a patch for revision 201 of scour.py.

I restricted the list of attributes to SVG 1.1 presentation attributes and SVG 1.2 tiny properties; the other attributes are different for different elements, so the right thing to do would have been a dictionary of dictionaries, with tagnames mapping to dictionaries mapoping attribute names to default values, and that sounds like a lot of work for little gain. Otherwise, I could try to rely on certain attributes only making sense only for certain tags (like the attribute 'k1' only appearing in a certain filter), but that assumption might break on later revisions of SVG.

I also excluded those attributes with a default value of 'auto', since I think it's rare that an editor (or a human author) will include those.

My patch also interacts with removeUnusedAttributesOnParent(), but, as far as I can see, only in a harmless way.

I removed some checks for default values which became obsolete (removing two TODO's doing that), and I also made changes to the canonicalization of colors and floats to apply them to styles.

I implemented the list of 'tainted' attributes as a set; sets have been introduced in Python 2.4, but as far as I'm aware, Scour doesn't maintain backwards compatibility further back, so that should be okay, I hope.

I did break a test in testscour.py; more on that in the next comment.

Revision history for this message
Jan Thor (jan-janthor) wrote :

The test case ConvertFillRuleOpacityPropertyToAttr tests whether a style 'fill-rule:nonzero;' gets converted to an attribute. Since I optimize this attribute away, the test now breaks. I added a new version of the corresponding file fill-none.svg, where I swapped 'nonzero' and 'evenodd'. Since the file is so small, the attached file is the complete version, not a diff.

Revision history for this message
Jan Thor (jan-janthor) wrote :

Obviously, ConvertFillRuleOpacityPropertyToAttr stills needs to be changed to check for the presence of an attribute with the value 'evenodd' instead of 'nonzero'. I attached a patch that contains this change. It also contains a few more tests for the new functionality.

Revision history for this message
Jan Thor (jan-janthor) wrote :

And finally, a new svg file for unittest, which my new tests refer to.

I noticed that test files are usually quite short and minimalistic; this one includes some stuff I end up not testing anymore (I had to check that my code doesn't interfere with removieUnusedAttributesOnParent(), but I didn't end up writing a test case for it), I hope it is usable as a starting point.

Changed in scour:
status: Confirmed → In Progress
Revision history for this message
Louis Simard (louis-simard-deactivatedaccount) wrote :

Thanks for the submission. :)

The code looks sound, and correctly replaces the removed code in repairStyle. The tainting of attributes is a good way to avoid running repairStyle on children of an element for properties that shouldn't be checked anymore.

The unit tests look sound, correctly testing for both the removal and non-removal of default and non-default values on parents and children.

I'll run the full tests, make sure they render right with and without --disable-style-to-xml and that they don't crash or hang Scour; if so, your patches are good to go as is.

Revision history for this message
Louis Simard (louis-simard-deactivatedaccount) wrote :

inkscape.svg renders incorrectly with --disable-style-to-xml, but that's not as a result of your patch; trunk r201 does it too. It's probably another file affected by bug 702423. See this post's attachment for renders in librsvg and Gecko (Eye of Gnome and Firefox).

Revision history for this message
Louis Simard (louis-simard-deactivatedaccount) wrote :

For future reference, you can use Bazaar to make a patch of all files changed from the last revision. Issue the command 'bzr diff > nextrevision.patch'. The total patch is attached to this comment; it will help packagers of Scour like Inkscape and Ubuntu.

Revision history for this message
Louis Simard (louis-simard-deactivatedaccount) wrote :

The changes are in trunk as revision 202.

Changed in scour:
status: In Progress → Fix Committed
Revision history for this message
Jan Thor (jan-janthor) wrote :

Thanks for the quick update and the patient explanation of the miracles of professional revision systems. I'll give it a try with my next bug. ;)

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package scour - 0.25+bzr207-1

---------------
scour (0.25+bzr207-1) unstable; urgency=low

  * New upstream bzr snapshot:
    - Add the option --no-renderer-workaround.
    - Fix gradient corruption. (LP: #702423)
    - Add option to only remove autogenerated id's. (LP: #492277)
    - Fix Windows line ending handling. (LP: #717826)
    - Fix occasional production of empty defs. (LP: #717254)
    - Remove more attributes with default values. (LP: #714731)
    - Fix wrong handling of file:// references. (LP: #708515)
    - Delete text attributes from groups with only non-text elements.
      (LP: #714727)
    - Remove unnecessary text-align properties from non-text elements.
      (LP: #714720)
    - Fix wrong optimization of 0-length Bézier curves. (LP: #714717)
    - Fix decimal number processing in rule_elliptical_arc(). (LP: #638764)
    - Fix transform matrix order. (LP: #722544)
  * Drop 01-bzr195.patch, upstream now.
  * debian/cmpsvg: Fix computation of difference to add up the absolute
    difference of each pixel.
  * debian/cmpsvg: Show percentages with three digits after comma.
  * debian/cmpsvg: Lower default treshold to 0.05%.
 -- Martin Pitt <email address hidden> Tue, 15 Mar 2011 09:28:45 +0100

Changed in scour (Ubuntu):
status: New → Fix Released
Revision history for this message
jazzynico (jazzynico) wrote :

Fix committed in the trunk, revision 10180.

Changed in inkscape:
assignee: nobody → JazzyNico (jazzynico)
importance: Undecided → Low
milestone: none → 0.49
status: New → Fix Committed
Revision history for this message
Louis Simard (louis-simard-deactivatedaccount) wrote :

This bug is fixed in release 0.26 of Scour.

Changed in scour:
status: Fix Committed → Fix Released
jazzynico (jazzynico)
Changed in inkscape:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.