Is HTML5 making XSS worse?

Mark Pilgrim responds to Noah Mendelsohn’s notes on HTML 5 with this remark: “Draconian error handling enforced at runtime does not scale to the complexities of modern-day web applications. Ensuring well formedness becomes increasingly difficult when content is dynamically cobbled together from multiple sources, some of which are beyond your control (user generated content, third-party ad servers, and so on).”

 To paraphrase: “Web application development is incapable of delivering valid XML. Therefore, we need a more lenient (and more complex) parser. Forget about enforcing syntax.”

 Now, the class of bugs in Web applications that Mark describes is precisely what leads to cross-site scripting attacks all over the place. The more lenient (and complex, and informally specified) the parsing rules, the harder it seems likely to become for Web application developers to avoid cross-site scripting bugs, and the harder to write code that (e.g.) filters user-supplied HTML to some safe subset.
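
 To make the filtering problem concrete, here is a minimal sketch of an allowlist filter, using only Python's standard library. The names `ALLOWED_TAGS`, `AllowlistFilter` and `filter_html` are made up for illustration, and the choice of allowed tags is an assumption. Note that `html.parser` is not an HTML5-conformant parser, so this is an illustration of the idea rather than a production sanitizer: the filter is only as safe as the agreement between its parser and the browser's, which is exactly the worry above.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist: a handful of formatting tags, no attributes at all.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}


class AllowlistFilter(HTMLParser):
    """Re-serializes input, keeping only allowlisted tags and escaped text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Attributes are dropped entirely, so onclick=, href=, style= never survive.
        if tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is always re-escaped, including the contents of dropped elements.
        self.out.append(escape(data))


def filter_html(untrusted):
    """Return a filtered copy of `untrusted`; does not balance or nest tags."""
    parser = AllowlistFilter()
    parser.feed(untrusted)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


if __name__ == "__main__":
    print(filter_html('<b onclick="x()">hi</b><script>alert(1)</script>'))
```

 The sketch never passes input markup through: it emits only tags it generates itself and escapes everything else. Even so, it inherits every disagreement between the parser it uses and the one in the browser.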

 I guess the redeeming point here is that my argument uses XHTML as a baseline, and that HTML5 – with its defined error handling – improves predictability over the concoctions that parse HTML today.