SVG is a pretty flexible format. Vector graphic editors often inject a lot of extraneous data to make their job easier. Unfortunately, browsers just ignore this excess data, so from their perspective it is nothing but bloat. In my opinion, output from editors such as Inkscape or Adobe Illustrator is not quite ready for the web, especially for purposes like clip-art. It's not at all rare to find files up on Open ClipArt with large chunks that are never even used (extraneous gradients, filters, etc). I had an itch to learn a little Python, so I used this as an excuse to create a script that would tidy up SVG files: Scour.
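To make the "unused gradients" case concrete, here is a minimal sketch of that kind of cleanup. It is not Scour's actual code: the function name `remove_unused_gradients` is made up for illustration, it uses Python 3 and the standard-library `xml.etree.ElementTree`, and it assumes references only appear in attributes (as `url(#id)` or as an `href` / `xlink:href`), so anything referenced from a `<style>` element would be missed.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
XLINK_HREF = f"{{{XLINK_NS}}}href"
GRADIENT_TAGS = {
    f"{{{SVG_NS}}}linearGradient",
    f"{{{SVG_NS}}}radialGradient",
}


def remove_unused_gradients(svg_text):
    """Return the SVG source with unreferenced gradients removed.

    Hypothetical helper for illustration only; not taken from Scour.
    """
    # Keep the output free of generated ns0:/ns1: prefixes.
    ET.register_namespace("", SVG_NS)
    ET.register_namespace("xlink", XLINK_NS)
    root = ET.fromstring(svg_text)

    # Repeat until nothing changes: removing one gradient can orphan
    # another gradient that it pointed to via xlink:href.
    while True:
        referenced = set()
        for elem in root.iter():
            for name, value in elem.attrib.items():
                # fill="url(#id)", style="fill:url(#id)", and so on
                referenced.update(re.findall(r"url\(#([^)]+)\)", value))
                if name in ("href", XLINK_HREF) and value.startswith("#"):
                    referenced.add(value[1:])

        removed = False
        # Snapshot the element list so removal does not disturb iteration.
        for parent in list(root.iter()):
            for child in list(parent):
                if child.tag in GRADIENT_TAGS and child.get("id") not in referenced:
                    parent.remove(child)
                    removed = True
        if not removed:
            break

    return ET.tostring(root, encoding="unicode")
```

Calling `remove_unused_gradients(open("drawing.svg").read())` on a file (the filename is just a placeholder) returns the cleaned markup as a string. A real tool has to handle many more cases than this, which is the whole reason for having a dedicated script.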

The SVG Posse, Doug Schepers, attended the Libre Graphics Meeting this month in Montreal and gave a quick 5-minute talk on the script, which was kind of fun for me to watch.

Note that the purpose of this script is to make the SVG documents ready for web browsers - it shouldn't be used on files that have already been manually edited. The script strives to maintain a visually identical image, but it does not preserve all semantics (for instance, it might collapse groups). I provide switches to turn off some features, but not all. In addition, it doesn't handle external CSS or <style> elements yet.

Anyway, time to start using the script. Download it. Ask questions. Open bugs. Request features. Send me patches.

§536 · May 17, 2009 · Software, SVG, Technology, Web

Leave a Comment to “Scrubs”

  1. anon says:

    Nice. I’ve written several one-off scripts to clean up SVG from various places, having a general tool seems good.

    A couple of teeny notes from looking at the source for five minutes:

    * Print informative stuff to sys.stderr to avoid caring if you’re using stdout as a pipe.

    * Don’t use the string module, most of the functions are now string methods or otherwise available (yes, this isn’t all that clear from the docs).

    ** string.atof(value) -> float(value)

    ** string.join(list, s) -> s.join(list)

    * Various other not-yet-familiar with idiomatic python things

    …actually, I’ll do a nitpickers patch for you to accept if you wish.
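For readers following along, the suggestions in the comment above can be sketched roughly as follows. This is an illustration only, written for Python 3 (where `string.atof` and `string.join` no longer exist); the helper names `log` and `scale_coords` are hypothetical and do not come from Scour's source.

```python
import sys


def log(message):
    """Write informative output to stderr so stdout stays clean for piping."""
    print(message, file=sys.stderr)


def scale_coords(values, factor):
    """Scale a list of numeric strings and join them back with spaces."""
    # float(value) replaces the old string.atof(value)
    numbers = [float(v) for v in values]
    # separator.join(items) replaces the old string.join(items, separator)
    return " ".join(str(n * factor) for n in numbers)


if __name__ == "__main__":
    log("scaling coordinates...")
    print(scale_coords(["1.5", "2"], 2))
```

Because the status message goes to stderr, the script's real output can be redirected or piped without the log line getting mixed into it.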

  2. Sounds good anon. Btw, what is the benefit of using float(value) vs string.atof()? Quicker? Safer? Cleaner code?

  3. anon says:

    It’s just the modern spelling.

    Emailed you a bundle that’ll probably be more informative than practical, largely different ways of writing what you had already. Stopped before doing anything of use as much of the code is string handling, which is difficult and annoying to write well in python.

    (Your captcha should be SVG! Then at least I’d have a shot at answering it without having to start X by looking at d attributes in the source.)

  4. Jeff says:

    Thanks Martin, I’ve merged in your patch and will adapt my code to be more PC (python-correct) over time.

    Any thoughts on how an SVG captcha would look? I would think any inline SVG with <text> elements in it would be trivial to hack… Not as many people could look at d attributes on path elements and try to decipher the image.

  5. anon says:

    I think Jeff Atwood’s captcha shows that anything not used on thousands of other sites is pretty robot-proof by virtue of not being worth the time of a script-kiddie. Even using <svg:text/> would probably be an equivalent “hack” difficulty to your image with PWNtcha.

    But a captcha with <svg:path/> would be fun just for its own sake… and it’d keep out IE users. 😉

  6. Rob says:

    This website withholds its CSS from IE8 even though that has the most extensive CSS 2.1 support around.

    Ugly shit.

  7. Actually Rob, I don’t withhold my CSS from IE8 at all. You can see that in my source. However, since IE does not support SVG I cannot duplicate the look/feel of this website to IE users, unfortunately. Let’s hope for a better IE9.

    Btw, your comment has prompted me to add a ‘todo’ to my list to look at this site again on a Windows VM real soon now.

  8. C3PO says:

    Nice tool, I had the same idea one million times.

    I’m downloading it, thanks.

  9. That was an inspiring post,

    Nice information about SVG


  10. Adrian says:

    Hey Jeff,

    Just used scour on the 2348 files in the IAN Image Library and it went from 220MB to 131MB – nice work – should result in better performance when using them with svg-edit too!

    Now I just need to do some testing to make sure none have changed visually 🙂