Lots of talk these days about allowing SVG inline with text/html content. I thought I'd try and put some thoughts down.

Start with Doug's excellent post on this topic. I don't have any opinion on the aria-specific elements of the debate. I'm fine with either adding a namespace to these attributes when used in XML or letting those attributes attach themselves to the SVG language without a namespace if it simplifies things. I don't see a need to update the SVG specification for this, though. I also don't see a need to reinvent or create a new namespacing mechanism using underscores or dashes; this seems silly and/or dangerous to me.

But there's something I'm not getting about the recent discussion of allowing inline SVG in text/html (or HTML5). Anne seems to be of the opinion that it would be a good opportunity to simplify the SVG language - maybe eliminate namespaces, allow upper-case SVG elements. Kind of an "if you want to play in the HTML playground, you have to wear the right kind of sneakers" attitude. This is kind of like HTML abusing its monopoly, isn't it?

I don't think this is a good idea. If you allow <CIRCLE CX=40> to be the same thing as <circle cx="40"/>, eventually we'll start to see people producing non-compliant SVG in the wild. Then we'll have people creating inline SVG for HTML that won't work in the many SVG tools and viewers that are already out there, and we'll just have frustrated authors. Then some tools might feel forced to accommodate the lax HTML style of SVG, just like the mess we have now with browsers trying to understand as much content as they can in order to compete. Then we'll have to rewrite the SVG spec so that SVG has two serializations (like we're having to do with HTML5/XHTML5). It just seems to be going at it backwards, since SVG was designed from the ground up as an XML technology. Must we rewrite all XML specifications into HTML5-style languages in order to get inlining? I don't think so.
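
To make the gap concrete, here is a minimal sketch of how an existing XML-based tool would treat the two spellings. It assumes Python's standard-library parser as a stand-in for "the many SVG tools and viewers"; the helper name `is_well_formed` is made up for the illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if a conforming XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


strict = '<circle cx="40"/>'
lax = '<CIRCLE CX=40>'  # unquoted attribute value, element never closed

print(is_well_formed(strict))  # True
print(is_well_formed(lax))     # False
```

Even a quoted and closed <CIRCLE CX="40"/> would only get past the parser; XML names are case-sensitive, so an SVG tool would still not recognize it as a circle.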

I guess I don't fully understand why the HTML5 parser can't just have the ability to hand off the character stream to another parser when it encounters some "special" elements like <svg> or <math>. Why does everything have to be in the hands of the HTML5 parser? In other words, in the document:

<!doctype html>
<html>
<title>SVG in text/html</title>
<p>
A green circle:
<svg xmlns="http://www.w3.org/2000/svg" >
  <circle r="50" cx="50" cy="50" fill="green"/>
</svg>
</p>
</html>

Upon encountering the characters "<svg ", the parser should back up five characters and send the character stream to the browser's parser that handles content with the MIME type image/svg+xml. When that parser is done, those elements can be injected into the DOM in the proper namespace and the HTML5 parser can pick up after </svg>.
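
The hand-off described above can be sketched roughly as follows. This is only an illustration under stated assumptions: Python's expat and ElementTree stand in for the browser's XML/SVG parser, the function names `find_island_end` and `hand_off` are hypothetical, and the "HTML parser" is reduced to plain string slicing.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


def find_island_end(rest: bytes) -> int:
    """Return the index just past the XML element that starts at rest[0]."""
    parser = expat.ParserCreate()
    depth = 0
    end_tag_at = -1

    def on_start(name, attrs):
        nonlocal depth
        depth += 1

    def on_end(name):
        nonlocal depth, end_tag_at
        depth -= 1
        if depth == 0:
            # Byte position where the tag that closes the root begins.
            end_tag_at = parser.CurrentByteIndex

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    try:
        parser.Parse(rest, True)
    except expat.ExpatError:
        # Expected once the root has closed: the trailing HTML is not XML.
        if end_tag_at < 0:
            raise  # the island itself was malformed
    # Simplification: assumes no '>' inside an attribute of a self-closing root.
    return rest.index(b">", end_tag_at) + 1


def hand_off(document: str):
    """Split a document into (html_before, svg_root, html_after)."""
    data = document.encode("utf-8")
    start = data.find(b"<svg")  # simplified trigger for the hand-off
    if start < 0:
        return document, None, ""
    end = start + find_island_end(data[start:])
    svg_root = ET.fromstring(data[start:end])
    return data[:start].decode("utf-8"), svg_root, data[end:].decode("utf-8")


doc = """<!doctype html>
<html>
<title>SVG in text/html</title>
<p>
A green circle:
<svg xmlns="http://www.w3.org/2000/svg" >
  <circle r="50" cx="50" cy="50" fill="green"/>
</svg>
</p>
</html>
"""

before, svg_root, after = hand_off(doc)
print(svg_root.tag)      # {http://www.w3.org/2000/svg}svg
print(svg_root[0].tag)   # {http://www.w3.org/2000/svg}circle
print(after.strip().splitlines()[0])  # </p>
```

The elements come back already in the SVG namespace because the xmlns declaration travels with the fragment, and the remainder of the stream starts right after </svg>, which is where the HTML parser would resume.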

I see some problems, none of which seem insurmountable to me:

  1. If the browser has no SVG parser, then what should happen? My proposal here is that all HTML5 browsers must also include a bog-standard XML parser to handle inline content that is known to be XML. This should be pretty straightforward, since XML parsing is actually much simpler than HTML5 parsing. In the case of a browser not grokking SVG, it throws the character stream to the bog-standard XML parser and waits for the character stream to return to it. Whether those elements are injected into the DOM is up to that browser (a browser could inject the foreign elements into the DOM even if it doesn't know how to render them).
  2. I hear that some browsers don't properly handle namespaced content or colons or something. Can someone clarify which browsers? Can someone further clarify what exactly the problems are? And if those browsers fix themselves by, say, next year, would we be good to go?
  3. Maybe the biggest problem with this idea is defining what happens in error scenarios - i.e. when the SVG is malformed, at what point does the SVG parser return the character stream back to the HTML parser? In other words, maybe the challenge here would be in defining how parsers need to behave towards each other when mixing MIME types. Anybody have a suggestion here? Is this the deal-breaker?
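
For the error scenario in the last item, one possible starting point (purely an assumption, not a proposal anyone has specified) is that the XML parser reports where it gave up, and the HTML parser decides what to do from that position. A small sketch using Python's expat as the stand-in XML parser, with a hypothetical helper `parse_island`:

```python
from xml.parsers import expat


def parse_island(fragment: bytes):
    """Return (ok, message, byte_index) for one XML island."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(fragment, True)
    except expat.ExpatError as err:
        # ErrorByteIndex is where the XML parser stopped making sense of the input.
        return False, expat.ErrorString(err.code), parser.ErrorByteIndex
    return True, "", len(fragment)


# The <circle> element is never closed, so the fragment is malformed.
bad_svg = b'<svg xmlns="http://www.w3.org/2000/svg"><circle r="50"></svg>'
ok, message, index = parse_island(bad_svg)
print(ok, message, index)
```

Whether the HTML parser should then resume at that index, rewind to the start of the island, or drop the island entirely is exactly the open question.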

As for namespace removal - why? Seriously, just because it's hard to remember? If we're trying to get to a "cut-and-paste" environment for some web authors, then they can just cut and paste the whole thing (namespace declarations in the <svg> element and all). Maybe it's because I'm used to writing SVG, but I really don't have a problem with the concept of mixed namespace content. Sam's off-the-cuff solution seems to favor even skipping the <svg> element, which would seem to me to cause a mess of problems. Where would you define the viewBox? Where would you define the version of the SVG language? You couldn't, for example, cut and paste his "inlined" SVG content into a standalone SVG document for editing and be guaranteed it will even display properly, even if you wrapped it in a simple <svg/> element.
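
A quick sketch of why the wrapper alone is not enough, again assuming Python's standard-library XML parser as a stand-in for an SVG tool: without the namespace declaration, the pasted elements do not end up in the SVG namespace at all.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Content wrapped in a bare <svg> element, with no namespace declaration.
bare = ET.fromstring('<svg><circle r="50"/></svg>')
# The same content with the declaration cut and pasted along with it.
declared = ET.fromstring(f'<svg xmlns="{SVG_NS}"><circle r="50"/></svg>')

print(bare[0].tag)      # circle
print(declared[0].tag)  # {http://www.w3.org/2000/svg}circle
```

Only the second form yields elements an XML-based SVG tool can identify as SVG.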

There seems to be a belief that XHTML being a failure is a reflection on XML-on-the-web in general. In fact, all browsers except for IE can handle the application/xhtml+xml MIME type these days, so it really seems to me that the verdict's still out on whether XHTML is a good technology or not. Some people out there still think that XML has a place on the web. People like Shelley, who also shares her thoughts on SVG in text/html here.

I think we should explore relaxing the draconian error handling of XML on the web, but I don't agree with changing XML languages into HTML-style languages one after another.

§400 · October 19, 2007 · Questions, Software, SVG, Technology, Web

4 Comments to “Yet More on SVG in text/html”

  1. The language is not reinvented. It would be identical on the DOM-level. It’s just the syntax that produces the DOM that would be different.

  2. Ok Anne, I agree that “re-inventing” was too strong of a term. I’ve updated the post.

  3. bernstein says:

    well i completely agree with your point !

    svg is working so well nowadays (well if one uses inkscape, adobe illustrator and firefox 2 in conjunction … other combinations probably too…) so why cripple it?

    I am currently converting my website to “text/xml”, because i don’t just want code that works ( document/webapp to look right) because of current browser quirks. I expect my written code to need minimal (none?) adjustments to look right in “a yet to be invented editor/viewer” supporting the standard i’ve written in. And i want it to look right in 50years. 🙂 So the more rigid the Standard the better.

    It is all about being standards compliant, because compliance of editors/viewers IS and WILL increase.

    i just don’t see what’s the difference in producing XML compliant code nowadays… just take a compliant xhtml editor, wiki, weblog, svg editor … they ARE all here. sure, if you write xml/xhtml/etc. by hand and make something wrong you get a nice “XML error” instead of your page but well it is kinda like grammar… no one likes it but having none would make things so much worse… and XHTML is fairly easy to master….

    ps: i don’t really understand what all the fuss is about (X)HTML5… i’ve gone through the spec and i see many unclarities and nothing that hasn’t been done better in the working drafts of XHTML2.

  4. Jeff says:

    bernstein,

    Please submit all your “unclarities” (sic) to the W3C HTML Working Group.

    Thanks!

    Jeff