Digging among the fossilised remains of Web 1.0.
On 14 December 2020, Yahoo! Groups, the world’s largest collection of online discussion boards, was permanently and irrevocably deleted by the company itself. Established in 2001 from late 1990s email-list technologies, Yahoo! Groups peaked in the mid-2000s with around 113 million users interacting across nine million different groups. Like many comparable sites of that decade, Groups’ popularity dwindled in the 2010s, with the number of active groups and users dropping off sharply and not recovering. The internet and its people had moved on for good, and Yahoo! Groups had become a dead weight with an unfavourable ratio of interesting and insightful discussion between passionate and knowledgeable enthusiasts to spam, porn and trolls.
At least, this was the conclusion drawn by Yahoo! and Verizon Media, the multinational tech company that had bought Yahoo! in 2017 (and which has itself just been bought by private equity firm Apollo Global Management and, confusingly, renamed Yahoo in a kind of tedious corporate ouroboros). After announcing that Groups would be erased, Yahoo! gave its users three months to download any content that they had uploaded and wanted to keep, or contact Yahoo! directly to request a zip file of data (excluding pictures and attachments) from any group that they were a member of. The realisation that Yahoo! would not be undertaking the work of preservation itself provoked a massive archiving effort from Archive Team, a self-described ‘loose collective of rogue archivists, programmers, writers and loudmouths dedicated to saving our digital heritage’ affiliated with (but separate from) the Internet Archive, in which hundreds of volunteers worked to save huge quantities of data. Archive Team were rewarded for their labour with Verizon purposefully blocking their efforts, citing a violation of its terms of service and preventing them from gaining access to the majority of Groups’ content. Once the destruction date came around, anything that hadn’t yet been downloaded was gone forever.
If you find yourself wondering why this matters, you are not alone. Outside of tech circles, the media response to Yahoo!’s decision was subdued. However, no single online activity – apart from shopping – narrates the history of human engagement with the internet with as much comprehensiveness and clarity as how people talk to each other and what they talk about. On a granular level, an online forum offers a unique record of the specific community collated around it; taken as a whole, online discussion boards have considerable evidential value in understanding the social history and impact of the internet itself.
The question of how to preserve this history has bugged archivists, historians and academics for decades. Web pages constitute a part of what is termed ‘digital heritage’, along with other computer-based materials like digital photographs, text files, databases and any physical carriers such as floppy disks or CD-ROMs. Keeping this material legible is difficult, because as computing technology becomes more sophisticated the digital resources that were once considered state-of-the-art become obsolete, and are often unreadable by modern machines.
Many working in the library and archive fields in the 1990s predicted that the rapid speed with which technology was becoming obsolete would set a precedent for a constant ‘new-ness’ that diverted attention and resources away from preservation and onto endless upgrading. This cycle of obsolescence, they warned, could cause a massive, irreversible loss of information and of digital history: what was dubbed a ‘digital dark age’. A literal dark age sounds a bit alarmist, but you probably encounter some signs of this every day through links that lead nowhere, 404s, and expired web domains. These losses may feel small, but many small cuts can form a large wound.
One of the most frustrating aspects of the fate of Yahoo! Groups is that Yahoo! placed the burden of preservation on its users, who were never going to be able to gain the access to the platform that they needed in order to do this work effectively. You can view the Yahoo! Groups archive created by Archive Team’s work online, but all the digital architecture that made Groups actually usable has been lost. I tried to download a file of archived messages from a group called METAL666 but I couldn’t; I didn’t really know what I was doing and, after half an hour or so of downloading various apps in which I could supposedly open file formats I’d never even heard of, I gave up. Archive Team can preserve a board’s raw data, but it cannot replicate that board’s shape and structure, or what the experience of browsing and posting was like; nor can it make it accessible to the casual web user in the way that made the boards so popular in the first place. Yahoo! could have done so, but to keep Groups browsable for the casual web user, even in a fossilised state where no new content could be created, would have cost money that it didn’t want to spend on an inert platform.
Yahoo! and Verizon’s actions present a pertinent example of how modern corporate behaviour is destroying digital heritage, as predicted by the concerned archivists of the 1990s. Once a platform starts to lose its value as an economic asset, the company that owns it looks to kill it off. ‘New-ness’ in technology and trends will generate revenue for a company in a way that technologies deemed outdated obviously do not, and no profit-driven company will stay competitive by sinking resources into archiving the contents of its obsolete acquisitions. Put simply, corporations will act destructively against preservation when there is even the slightest possibility of financial gain.
This problem is widespread and ongoing. In 2017, IMDb, an Amazon subsidiary, deleted its extensive message boards; Verizon made another howler in 2018 when it nuked all adult content from Tumblr in order to get it reinstated in Apple’s App Store; Myspace lost all content uploaded prior to 2016 in a deeply suspect ‘migration accident’; Google and Yahoo! erased, respectively, Google Plus and GeoCities after failing to turn them into profitable ventures. As any custodian of heritage will wearily assure you, ‘old versus new’ isn’t an all-or-nothing choice. We can, and should, build for the future without obliterating the past. In the digital realm, however, the highly profit-driven culture of tech is making this very difficult indeed, and we find ourselves with yet another reason to be anxious about the monopoly on ownership of our online data by a few corporate giants.