A hitchhiker’s guide to the HTML5 + EME maze

W3C’s work on HTML5 and the Encrypted Media Extensions (EME) specification keeps drawing criticism and controversy. I spent today attending Amelia Andersdotter’s event at the European Parliament in Brussels about HTML5 and DRM, as an interested individual member of the W3C community who doesn’t speak for anyone but myself.

The topic is fraught with controversy: The W3C Director found “Content Protection” to be in scope for the HTML Working Group; the deliverable that the group is working on under this heading is EME. The specification itself defines a reasonably simple JavaScript API that permits a Web application to hand key material to a Content Decryption Module (CDM), the actual DRM black box. The general API leaves the nature of the key material unspecified; in the general case, that’s likely to be key material that is itself encrypted, and not accessible to the browser. The EME spec defines one very simple CDM, Clear Key, which assumes that key material is accessible to the Web application and the browser (and therefore to the user); this is the sort of not-really-DRM that will later on permit the HTML WG to demonstrate interoperability of the API without having to dive into proprietary CDMs.
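To make the Clear Key case concrete, here is a minimal sketch of the key material a Web application would hand to that CDM. It assumes the JSON license format and the API names (requestMediaKeySystemAccess, createSession, update) from the version of EME that was later standardized; drafts differ in detail. The helper names base64url and buildClearKeyLicense are hypothetical and chosen only for illustration.

```typescript
// Sketch only: builds a Clear Key "license" as a JWK set, assuming the JSON
// format from the later-standardized EME spec. Helper names are hypothetical.

const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Encode bytes as base64url without padding.
function base64url(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    out += ALPHABET[b0 >> 2];
    out += ALPHABET[((b0 & 0x03) << 4) | (b1 >> 4)];
    if (i + 1 < bytes.length) out += ALPHABET[((b1 & 0x0f) << 2) | (b2 >> 6)];
    if (i + 2 < bytes.length) out += ALPHABET[b2 & 0x3f];
  }
  return out;
}

interface ClearKeyLicense {
  keys: { kty: "oct"; kid: string; k: string }[];
  type: "temporary";
}

// The key is plainly visible to the application, the browser, and the user:
// this is exactly why Clear Key is "not-really-DRM".
function buildClearKeyLicense(keyId: Uint8Array, key: Uint8Array): ClearKeyLicense {
  return {
    keys: [{ kty: "oct", kid: base64url(keyId), k: base64url(key) }],
    type: "temporary",
  };
}

// In a browser, the application would then do roughly the following
// (not runnable outside a browser, hence left as a comment):
//
//   const access = await navigator.requestMediaKeySystemAccess("org.w3.clearkey", configs);
//   const mediaKeys = await access.createMediaKeys();
//   await video.setMediaKeys(mediaKeys);
//   const session = mediaKeys.createSession();
//   session.addEventListener("message", async () => {
//     const license = buildClearKeyLicense(keyId, key);
//     await session.update(new TextEncoder().encode(JSON.stringify(license)));
//   });
//   await session.generateRequest(initDataType, initData);
```

With a proprietary CDM the shape of the exchange is the same, but the bytes passed to update() are an opaque, encrypted license that neither the application nor the browser can read.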

As far as is discernible today, EME has significant implementer interest; the motivation there is, of course, to use it as an interface to connect proprietary DRM systems with the Web. As with any controversy, there are plenty of confusing points to go around.

On fundamentals, some argue that content protection is basically the same thing as password protection for content that you pay for, or a paywall, or perhaps encryption of confidential material online. That’s a false equivalence: The commercial drivers for standardization of EME are existing DRM systems, the proprietary CDMs that I mentioned above. The attacker against whom content is protected is the user (and the browser code, which could be under the user’s control); the attack is use of content in a way that isn’t explicitly authorized by the rights holder.

The DRM systems used in this context cannot be implemented in open source, they are typically patent-encumbered, and they arguably are corrosive to the notion of putting general-purpose, modifiable computing into users’ hands. And while it is conceivable to build a watermarking-based system on top of EME, that sounds like a pretty awkward approach, and it isn’t why implementers are interested in EME.

All of that, however, doesn’t mean that EME (the interface) can’t be implemented in open source: EME, together with the Clear Key CDM that’s part of the specification, should be implementable in open source software, without royalty, just fine. It just doesn’t provide the protection that rights holders are after; the real deployment of EME is as an interface toward proprietary CDMs that are implemented in closed source software, and partially in hardware.

Some proponents of EME try to make it palatable by pointing out that, just maybe, it could help users protect the privacy of their personal information online — we heard that argument today. That doesn’t sound like it’s very plausible: EME is a pass-through API for browser implementation, tightly coupled to inline media elements in HTML. The basic model is actually very simple. Now, it is true that some in the privacy community have looked at policy enforcement using trusted computing mechanisms. But it doesn’t look like EME specifically, or the CDMs it interfaces with, are even in the same ballpark. I respectfully suggest that we just drop that part of the conversation and focus on the actual reasons for deploying EME.

Another argument that is frequently made is that, because EME is made part of a core Web technology (HTML5), “browsers do not have a choice.” That isn’t exactly true, either: EME is a separate spec from HTML5. The two documents can go to W3C Recommendation (or not) independently of each other. Just because somebody says they implement HTML5, that doesn’t mean they have to implement EME. That debate, however, is ultimately a debate about words, not about substance: The deployment driver is the desire to provide playback of DRMed video content, not the exact nature of the API spec, and how it is split across different documents.

The real focus of the discussion, then, ought to be on the merits (or not) of what EME actually is: A carefully scoped interoperability layer on top of existing, proprietary DRM systems, to enable the designers of Web applications (think YouTube, think Netflix) to pass key material to these CDMs in a way that’s interoperable across multiple browsers. That abstraction layer doesn’t “do” DRM; it can probably be implemented in open source software without royalty; but it isn’t very useful unless we end up in a world with a few widely implemented CDMs that ship with browsers across different platforms, and for which “protected” content on the Web is encoded.

Some of the questions to ask in this context: If EME is successfully standardized by W3C and broadly deployed by browsers — is that, by itself, an improvement over a future in which either of these (standardization, deployment) doesn’t happen? What would other plausible futures for EME or, more generally, for DRMed content sold today even look like? By what criteria would we evaluate those? What’s the impact of these futures on large content providers, small content providers, browser vendors, and innovation for the network?

How does that reasoning change if we assume that EME is the end of DRM integration into the Web platform, or that it is only the beginning? And which of these is more likely?

What is the weight that we might assign to “goodies” that could come with EME? For example, open APIs further down in the stack (between CDM and browser), or additional transparency into the DRM that gets deployed on the Web? And what is the weight that we might assign to side effects of DRM deployment through EME, such as, perhaps, additional privacy concerns and serious accessibility issues?

Finally, what does this entire discussion say about the governance model that we collectively want to apply to Web standards — how do we collectively reconcile between W3C as a member-driven organization, its accountability to the broader public, and its stewardship role for the Web?