2020-11-09
HTML Is For All of Us
The World Wide Web has a problem.
A major topic in US political discourse over the last four years has been “undemocracy”: the undemocratic turn in national governance, and the undemocratic nature of long-standing political institutions. The internet, where power is consolidated within a small handful of American tech firms, has faced similar scrutiny. CEOs are summoned to Washington for Congressional antitrust hearings on a seemingly routine basis, and calls to regulate or break up Big Tech are a staple of progressive political tickets. Yet though its symptoms may be amenable to relief through policy and legislation, the problem with the Web is more fundamental.
The internet has become— more so than ever, given our present era of virus-imposed social distancing— the predominant medium through which we work, play, and learn. You likely visit dozens of websites over the course of a day. Given this familiarity, however, how confident are you that you could explain “websites” to someone unacquainted with the notion? If you wanted to create something and put it up on the Web right now, do you have a sense of how to do so? I believe that most people don’t have good answers to these questions. More significantly, it seems that even most people who are adept with the relevant technology consider it unimportant whether the average person has good answers. Yet having answers is precisely what is necessary for saving the Web and restoring its role as an equitable and democratic commons.
Ironically, part of the issue is that explaining why understanding Web technology is so vital itself requires one’s audience to have a basic grasp of Web technology. Fortunately for us, the core concept of how websites work is simple: let us use this very page as an example. Once we have developed an intuition for the mechanics, we can then explore why that knowledge must be the basis for a free and open internet.
HTML, CSS, and Javascript
Don’t worry, the change you see is intentional— we’ve stripped the page back to its bare HTML. HTML, the “hypertext markup language,” is a specification for annotating text with additional semantic structure such as headings, paragraphs, or lists. Every page on the Web is, at its heart, a simple string of text representing an HTML document. Think of it as a file: in the same way you open up .doc files with a word processor, or .mp3 files with a music player, you open up .html files with a Web browser. The .html file representing this page looks something like the following:
<html>
  <head>
    [ meta stuff ... ]
  </head>

  <body>
    <h1>HTML Is For All of Us</h1>
    <p>The World Wide Web has a problem.</p>
    <p>A major topic in US political discourse over the [...]</p>
    [ ... ]
    <h2>HTML, CSS, and Javascript</h2>
    <p>Don't worry, the change you see is intentional [...]</p>
    [ ...etc.]
  </body>
</html>
The names are a bit funny, but hopefully it’s easy to see the correspondence between the markup and how it gets rendered: (p)aragraphs here are inside <p> tags in the source, (h)eadings are inside <h*> tags, and so on. And that’s it! Fire up a text editor (such as Notepad or TextEdit), open some documentation about the available HTML elements, start typing, and you’re already halfway to your own website.
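To make that concrete, here is a complete page you could type out right now (the title and text are just placeholders). Save it as a file ending in .html and open it in your browser:

<html>
  <head>
    <title>My First Page</title>
  </head>
  <body>
    <h1>Hello, Web!</h1>
    <p>This is a paragraph on my very first webpage.</p>
  </body>
</html>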
Most sites today don’t look quite as plain as this one does right now. Just as HTML was becoming, in the early ’90s, the standard way to encode semantic information about a document on the network, a separate technology emerged for specifying aesthetic information. Called “CSS,” the new specification introduced the notion of a separate “stylesheet” containing structured language about how a webpage should be styled and presented to the viewer.
For example, the now-enabled stylesheet for the present page contains the following snippet:
p {
    font-family: serif;
    max-width: 34.5em;
}
This is read as a “rule” stating that text inside an HTML <p> tag should use a serif-styled font, and that it should span a width no greater than that of 34.5 uppercase-“M” characters.
In full, CSS stands for “Cascading Style Sheets.” The “cascading” part describes the model for how precedence is resolved when multiple rules in a stylesheet overlap. For instance, the * {...} rule uses the “*” selector, which applies to all elements on a page and would thus overlap with the p {...} rule in our example.
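To see the cascade in action, imagine a stylesheet containing both of the following rules (the sans-serif choice is purely for illustration). The * rule applies everywhere, but paragraphs end up serif, because the more specific p selector takes precedence:

* {
    font-family: sans-serif;   /* a default for every element on the page */
}

p {
    font-family: serif;        /* overrides the default, but only for paragraphs */
}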
The above code barely scratches the surface of what is possible: even though every webpage is at heart just another boring HTML file, the vast differences in layout and appearance between sites are all accomplished through variations in CSS rules.
Alongside HTML and CSS, the final major component of a modern website is Javascript. If HTML specifies semantic metadata, and CSS specifies aesthetic directives, then JS facilitates general computation. It is neither the only, nor the first, tool for this purpose—Adobe Flash was, in its heyday, an immensely popular way to run code on a webpage—but it was until very recently the only one that was a widely accepted open standard.
Javascript is typically employed to add functionality that isn’t available through standard HTML or CSS. For instance, I’ve used it on this site to implement the light/dark theme toggle button.
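The essence of such a toggle fits in a few lines of Javascript: listen for clicks on a button, and flip a class on the page so that a different set of CSS rules takes effect. A rough sketch (the element id and class name here are illustrative):

// Find the toggle button, and whenever it is clicked,
// switch a "dark-theme" class on the page's body element;
// CSS rules scoped to that class then restyle the page.
const toggle = document.querySelector("#theme-toggle");
toggle.addEventListener("click", () => {
  document.body.classList.toggle("dark-theme");
});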
The “click to enable” buttons in this post are another example of where Javascript can be useful.
There we have it: HTML markup, CSS stylesheets, and Javascript code. These three components are the core of the Web as we know and use it in daily life. Further, within this triumvirate, neither CSS nor JS is strictly necessary for building a functioning webpage. Even so, I mentioned above that writing a .html file would only get you halfway to the Web: there’s still one critical missing piece.
URLs and Servers
Our definition of a website as “just a simple HTML file,” perhaps with optional scripts or stylesheets attached, isn’t quite complete. To be more specific, a website is an HTML file with a unique global address. This comes in the form of a URL. (It’s the reason you have a dedicated address bar at the top of your browser window.)
A URL ("uniform resource locator") is a composition of several parts, including protocol, domain, and path. If you look at your address bar, you will see that this page uses "https:
," the "secure hypertext transfer protocol;" that its domain is "www.nathanael0x4c.com
;" and that its path is "/blog/html-is-for-all-of-us
." This indicates, approximately, that the file representing this webpage is in a folder named "blog" on a computer named "www.nathanael0x4c.com," which communicates with your browser over HTTPS.
Navigating to a webpage on the internet, either by clicking a link or by typing the address into your browser, is the act of finding the file located at that specific URL and fetching it from across the network. But where does that file live until someone asks for it? Let us say that you’ve written an HTML file and saved it on your personal computer. You could set up a URL so that it points to the copy of the file on your machine; this would be enough to turn it into a simple website. Suppose a friend wants to visit your website: they type your URL into their browser, which reaches across the network to your machine, which retrieves its saved copy of the file and sends it back over the network to be displayed on your friend’s screen. Success! However, what happens if you lose internet connection, or—perhaps your computer is a laptop—the battery dies? The address where your HTML is supposed to live becomes a dead end. To prevent this, websites are typically stored on dedicated networked computers called servers that are kept running 24/7 to listen for requests and serve site files.
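To make “listen for requests and serve site files” concrete, here is a sketch of roughly the smallest server one could write, in Javascript running under Node.js (the file name index.html and the port number are placeholders):

// A bare-bones server: wait for requests on port 8080 and answer
// each one with the contents of a single HTML file.
const http = require("http");
const fs = require("fs");

http.createServer((request, response) => {
  response.writeHead(200, { "Content-Type": "text/html" });
  response.end(fs.readFileSync("index.html"));
}).listen(8080);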
We now, with any luck, have a complete high-level understanding of how a website works: create some files, give them a unique URL, put them on a server. One question remains: if a dedicated server computer is necessary for operating a reliable website, where does one find such a machine?
Fridges and “2.0”
In 2007, Microsoft sponsored the creation of a short children’s book titled “Mommy, Why is There a Server in the House?” The company projected that “the server” was on its way to becoming a common household appliance, akin to a refrigerator or a water heater, with which people would be able to independently publish their own files to the Web. One significant consequence of the “server as appliance” idea is that it supports the interpretation of internet access as just another “dumb” utility, like water or electricity. While this may seem like a minor conceptual quibble, it is the core principle of “net neutrality,” which has been the subject of intense legal debate in the US over the past decade. Of course— and unfortunately for Microsoft— sales of the Windows Home Server operating system never took off, and the entire market segment ended up being a major flop. Though numerous explanations exist for why the home server movement failed to gain traction, a significant factor in its premature demise was a different movement in the tech industry that was unfolding at the same time: “Web 2.0.”
Silicon Valley “thought leader” Tim O’Reilly, whose media empire popularized the term, defines Web 2.0 as:
[T]he network as platform … delivering software as a continually-updated service … creating network effects through an “architecture of participation,” and going beyond the page metaphor of Web 1.0 to deliver rich user experiences.
This “new” version of the Web manifested itself through the rapid growth of the platforms that dominate the internet today, including Facebook, Amazon, Reddit, and Twitter. The sites built by these companies replaced the simple “webpage as file” formulation with complex applications that twist the basic technologies of the Web beyond their intended use, subsuming an intrinsically open architecture into privatized systems governed by more restrictive logics. Let’s unwind the opaque jargon of O’Reilly’s definition. “Network as platform” shifts the basic utility of the internet from a medium for connecting independent nodes into a delivery system for a handful of “continually-updated,” Javascript-heavy pages. Those pages “go beyond the page metaphor” by serving their private networks of content via a single portal; take Facebook’s “newsfeed” as an example, where no matter how many posts you scroll through, you never leave the facebook.com home page. “Network effects” are “created” by using that portal to mediate interactions between users, capturing relations that might otherwise happen directly via the intrinsic network architecture of the Web. The resulting “architecture of participation” is used to track and monitor user behavior, producing data that can be sold for profit to advertisers. In short, Web 2.0 was simply a playbook for consolidating power on a platform long celebrated for its democratic potential.
If the “techlash” of the last several years is any indication, Web 2.0 has perhaps not been the greatest thing for society. Many of its drawbacks did not become apparent until long after current platforms had “disrupted” the internet, but one may nonetheless find it curious how the paradigm was able to achieve hegemony in place of the competing home server movement. Web 2.0’s strategic edge can be found in the final phrase of O’Reilly’s definition: “rich user experiences.” A key feature of the Web’s present menagerie of apps and social media sites is the variety of text-input boxes or photo-editing widgets that allow users to create and update content. By enabling people to participate in the Web through friendly graphical interfaces, these sites reinforce a narrative of making the internet more accessible to a wider audience. Web technology is complicated, the story goes, so why struggle with markup and servers when you can simply sign up here to share content in a few easy clicks? Leave the tricky business of displaying and disseminating that content to the experts in Silicon Valley. This narrative contains a clever trick: it strips agency from users by placing Web technologies out of reach— complex entities suitable only for management by experts— while at the same time enabling those “experts” to build specialized infrastructure for providing the experience of a few easy clicks. In other words, if the fundamental technologies of the modern Web are indeed too complex to be accessible to the average user, then that complexity only exists because a professionalized class of programmers (or, perhaps more significantly, the companies for which they work) has stretched the Web to include an ever-expanding set of features. Among the current set of accepted Web standards are mechanisms for: 3D graphics, real-time push notifications, embedding a webpage inside a different webpage, and running low-level assembly code. In turn, these features are rationalized through their utility in providing “solutions” for end-users to overcome the Web’s complexity. Web 2.0 thus dominated prior formations of the Web through a process of recursive self-legitimation.
A compelling false logic was not the only advantage today’s major Web platforms held in their favor. Social media accounts are free, after all, while servers cost money. The massive datacenters of today, where thousands of computers are packed together for easy maintenance and shared climate control, are also less costly and more efficient than any equivalent network consisting of millions of independent machines would be. The economies of scale associated with datacenters, however, do not preclude the underlying dream manifested through the home server movement. Acquiring a server is still possible on the modern Web, closer now to the experience of renting a storage unit than to that of purchasing a household appliance: the machine is intangible, an abstract CPU process in a distant location. More specifically, modern servers often take the form of a commoditized quantity of memory, persistent storage, and processing power, purchased from a company that operates one or more datacenters. From a user interface and operating system perspective, this is not terribly different from a physical device sitting in your living room, but in real terms it is an emulation (a “virtual computer”) running alongside numerous other virtual computers on a much more powerful real computer stacked alongside many other real computers in an industrial park somewhere. This is what is known as “The Cloud.” Systems of this form are targeted towards use by the Silicon Valley professional class, but one can imagine some more accessible analog as an alternative to a device physically sitting in one’s living room. Put differently, the physicality of the computer does not matter; rather, core to the “home server” movement is the notion that any individual could run a home server, and have the experience be no different than if they had rented one in a remote computing facility. Going further, the fact that social media was free to use at its inception is itself remarkable, since fledgling internet platforms possessed no revenue streams with which to pay for their own servers. Instead, they relied on an influx of investment capital. Hidden behind its careful “just-so” story, Web 2.0 was effectively a venture-backed coup, without any intrinsic economic advantages over alternative paradigms and with a much less democratic structure.
“For whom, and to what end?”
The history of home servers and Web 2.0 illustrates how society and technology are mutually interpellated, and therefore why it is important to have a technical understanding of the systems we value. Knowledge of how a technology functions illuminates the boundaries of power at play in its use; it allows us to better trace a system’s beneficiaries (for whom), as well as its ultimate purpose (to what end). Though the Web has grown over the last two decades into a vast and complex technological apparatus, this essay (hopefully!) demonstrates that its basic ins and outs remain accessible to the layperson. Furthermore, the dynamics by which the complexity of the Web has grown suggest the possibility of a utopian inversion in how we produce future technologies: if naming the internet as a complex subject willed that complexity into existence, then perhaps constituting systems as democratic and widely knowable can likewise transform them in self-fulfilling ways. Both on the Web and in other arenas, these processes of contestation rely on the collective participation of all stakeholders. In short, your knowledge matters: understanding technological systems determines not only how we as individuals use those systems, but also how those systems serve and shape humanity.
All words (and errors) are my own, but writing this would not have been possible without the ideas and support of many others. I am deeply grateful to all of you.