codedread

Scour - an SVG scrubber

Introduction

Scour is an open-source Python script that aggressively cleans SVG files, removing much of the 'cruft' that certain tools or authors embed in their documents. The goal of Scour is to produce an identically rendered image (i.e. a scoured document should have no discernible visual differences from the original file).

WARNING: Scour is intended to be run on files that have been edited in vector graphics editors such as Inkscape or Adobe Illustrator. Scour attempts to optimize the file and, as a result, will change the file's structure and (possibly) its semantics. If you have hand-edited your SVG files, you will probably not be happy with the output of Scour. NEVER USE SCOUR TO OVERWRITE YOUR ORIGINAL FILE!

Documentation

Installation and Usage

Command Line Script

To run Scour on the command line, first download and install Python, then download Scour. The basic usage is:

$ python scour.py -i input.svg -o output.svg

In addition, you can use compressed SVG (.svgz) for the input and output, and Scour will decompress/compress automatically.
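The invocation above can also be driven from another Python script. The following is only a minimal sketch: build_scour_command and read_svg_text are hypothetical helper names, not part of Scour. The first builds the argument list for the command shown above and refuses to write the output over the input, in line with the warning in the introduction. The second reads an .svgz file by relying on the fact that .svgz is gzip-compressed SVG.

```python
import gzip
import os
import sys


def build_scour_command(input_path, output_path, script="scour.py"):
    """Return the argument list for: python scour.py -i INPUT -o OUTPUT.

    Hypothetical helper (not part of Scour). It raises ValueError when the
    output path is the same file as the input path.
    """
    if os.path.abspath(input_path) == os.path.abspath(output_path):
        raise ValueError("refusing to overwrite the original file")
    return [sys.executable, script, "-i", input_path, "-o", output_path]


def read_svg_text(path):
    """Read an SVG document as text, decompressing gzip data for .svgz files."""
    opener = gzip.open if path.lower().endswith(".svgz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return handle.read()
```

The returned list can be passed to subprocess.run(..., check=True) from a directory that contains scour.py.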

Inkscape Extension

To use Scour from within Inkscape, download this zipfile and unzip it into your Inkscape's extensions/ folder. When you choose Save As, you will then see an option for "Optimized SVG (*.svg)". This extension is included with Inkscape 0.47.

Configurable Options

Scour performs many operations automatically. In addition to these automatic operations, scour also provides some configurable options:

--create-groups
Creates <g> elements for runs of elements with identical attributes (New in 0.25)
--disable-embed-rasters
Prevents conversion of external rasters to embedded base-64 data
--disable-group-collapsing
Prevents collapsing of group elements
--disable-simplify-colors
Prevents conversion of colors to #RRGGBB format
--disable-style-to-xml
Prevents conversion of style properties into XML attributes
--enable-comment-stripping
Removes all <!-- --> comments (New in 0.25)
--enable-id-stripping
Removes unreferenced id attributes
--enable-viewboxing
Changes document width/height to 100%/100% and creates viewBox coordinates
--indent=TYPE
Determines how XML will be indented: space, tab, none (defaults to space)
--keep-editor-data
Keeps all Inkscape/Sodipodi/Adobe elements and attributes
--quiet, -q
Suppresses non-error output (New in 0.25)
--remove-metadata
Removes <metadata> elements (which may contain license metadata etc.) (New in 0.25)
--renderer-workaround
Works around various renderer bugs (currently only librsvg) (New in 0.25)
--set-precision=N
Sets the number of significant digits that scour will keep (defaults to 5)
--shorten-ids
Shortens all ID attributes to the fewest letters possible (New in 0.25)
--strip-xml-prolog
Removes the <?xml ?> prolog
--protect-ids-noninkscape
Don't change IDs not ending with a digit
--protect-ids-list=<list>
Don't change IDs given in a comma-separated list
--protect-ids-prefix=<prefix>
Don't change IDs starting with the given prefix
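To make a few of these options concrete, here is a rough illustration of what they describe. It is not Scour's actual implementation, and all three function names are hypothetical. to_significant_digits mirrors the idea behind --set-precision, rgb_to_hex shows the kind of conversion that --disable-simplify-colors turns off (only the integer rgb(r, g, b) form is handled), and is_protected_id combines the three --protect-ids-* rules as they are worded above.

```python
import re


def to_significant_digits(value, digits=5):
    """Format a number with at most `digits` significant digits."""
    return format(float(value), ".{}g".format(digits))


def rgb_to_hex(color):
    """Convert 'rgb(r, g, b)' with integer channels to '#rrggbb'.

    Any other color syntax is returned unchanged (assumption for this sketch).
    """
    match = re.fullmatch(
        r"\s*rgb\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*", color
    )
    if match is None:
        return color
    # Clamp each channel to the valid 0-255 range before formatting.
    channels = [min(int(part), 255) for part in match.groups()]
    return "#{:02x}{:02x}{:02x}".format(*channels)


def is_protected_id(id_value, protect_list=(), protect_prefix=None,
                    protect_noninkscape=False):
    """Return True if an ID should be left unchanged under the protect rules."""
    if id_value in protect_list:
        return True  # listed explicitly (--protect-ids-list)
    if protect_prefix and id_value.startswith(protect_prefix):
        return True  # starts with the prefix (--protect-ids-prefix)
    if protect_noninkscape and not id_value[-1:].isdigit():
        return True  # does not end with a digit (--protect-ids-noninkscape)
    return False
```

For example, with five significant digits 3.14159265 is written as 3.1416, and rgb(255, 0, 0) becomes #ff0000.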

In case you're curious, I am maintaining a sample set of real files that I've found and recording statistics on versions of scour. You can help me build this into a decent cross-section of real-world SVG documents by sending me files. Over time I hope this will give a decent sense of what scour can do. My current stats show a Median Reduction Factor of 48.19% and a Mean Reduction Factor of 48.53% over 25 sample files for version 0.19 of scour, before gzip compression.
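For readers who want to compute similar statistics on their own files, the sketch below shows one plausible way to do it. It assumes that "reduction factor" means the percentage of the original file size removed by scouring, which is not defined above. The file sizes are made-up placeholders, not the sample set described here.

```python
from statistics import mean, median


def reduction_factor(original_bytes, scoured_bytes):
    """Percentage of the original size removed (assumed definition)."""
    return 100.0 * (original_bytes - scoured_bytes) / original_bytes


# Hypothetical (original size, scoured size) pairs in bytes.
sizes = [(10000, 5000), (20000, 12000), (8000, 2000)]
factors = [reduction_factor(original, scoured) for original, scoured in sizes]

print("Median reduction: {:.2f}%".format(median(factors)))
print("Mean reduction: {:.2f}%".format(mean(factors)))
```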

Development

Scour is open source under the Apache 2 License. It is maintained on Launchpad. Visit Launchpad to ask a question, report a bug, make a feature request or submit patches.

Downloads

See the Release Notes for what has changed.

Command-Line Script

Inkscape Extension
