Aaron Swartz’s unfinished book

…or one of them, anyway, was released today. You can find it here; or use this, a direct link to the PDF.

It really is unfinished — chapters end abruptly, discussions are alluded to that were clearly never written — and the whole thing is written (but not formatted) in Markdown. But it contains an extremely high percentage of Correct Opinions About The Web. Here’s one bit I particularly enjoyed about the Semantic Web (seemingly since relabeled as Linked Data, basically in order to escape this well-earned critique):

I have to say, however, the idea’s proponents do not escape culpability for these utopian perceptions. Many of them have gone around talking about the “Semantic Web” in which our computers would finally be capable of “machine understanding.” Such a framing (among other factors) has attracted refugees from the struggling world of artificial intelligence, who have taken it as another opportunity to promote their life’s work.

Instead of the “let’s just build something that works” attitude that made the Web (and the Internet) such a roaring success, they brought the formalizing mindset of mathematicians and the institutional structures of academics and defense contractors. They formed committees to form working groups to write drafts of ontologies that carefully listed (in 100-page Word documents) all possible things in the universe and the various properties they could have, and they spent hours in Talmudic debates over whether a washing machine was a kitchen appliance or a household cleaning device.

With them has come academic research and government grants and corporate R&D and the whole apparatus of people and institutions that scream “pipedream.” And instead of spending time building things, they’ve convinced people interested in these ideas that the first thing we need to do is write standards. (To engineers, this is absurd from the start—standards are things you write after you’ve got something working, not before!)

And so the “Semantic Web Activity” at the Worldwide Web Consortium (W3C) has spent its time writing standard upon standard: the Extensible Markup Language (XML), the Resource Description Framework (RDF), the Web Ontology Language (OWL), tools for Gleaning Resource Descriptions from Dialects of Languages (GRDDL), the Simple Protocol And RDF Query Language (SPARQL) (as created by the RDF Data Access Working Group (DAWG)).

Few have received any widespread use and those that have (XML) are uniformly scourges on the planet, offenses against hardworking programmers that have pushed out sensible formats (like JSON) in favor of overly-complicated hairballs with no basis in reality (I’m not done yet!—more on this in chapter 5).

Instead of getting existing systems to talk to each other and writing up the best practices,these self-appointed guarantors of the Semantic Web have spent their time creating their own little universe, complete with Semantic Web databases and programming languages. But databases and programming languages, while far from perfect, are largely solved problems. People already have their favorites, which have been tested and hacked to work in all sorts of unusual environments, and folks are not particularly inclined to learn a new one, especially for no good reason. It’s hard enough getting people to share data as it is, harder to get them to share it in a particular format, and completely impossible to get them to store it and manage it in a completely new system.

And yet this is what Semantic Webheads are spending their time on. It’s as if to get people to use the Web, they started writing a new operating system that had the Web built-in right at the core. Sure, we might end up there someday, but insisting that people do that from the start would have doomed the Web to obscurity from the beginning.

All of which has led “web engineers” (as this series’ title so cutely calls them) to tune out and go back to doing real work, not wanting to waste their time with things that don’t exist and, in all likelihood, never will. And it’s led many who have been working on the Semantic Web, in the vain hope of actually building a world where software can communicate, to burnout and tune out and find more productive avenues for their attentions.

Entertaining, brutal and accurate. Swartz does more or less succumb to the dream of the Semantic Web again in his conclusion. But who knows, maybe one day someone will deliver on it.

Leave a Reply