University of Waterloo historian Ian Milligan is a passionate advocate for greater awareness among fellow historians of the ways in which digital technologies disrupt and reshape the historical profession. His recent book, History in the Age of Abundance? How the Web Is Transforming Historical Research, provides readers with an in-depth introduction to one of the main issues on the ever-expanding list of digital disruptions: the role of the Web as a historical source. Milligan’s clear and accessible writing style and use of real-life examples have helped him produce an excellent introduction to the topic, which is also meant to serve as “a wake-up call for historians of the twenty-first century” (as quoted from the book’s back cover). Given that one of the book’s main topics is the long-term preservation of web content, the book can and should appeal to archivists, just as it does to historians.
The book consists of six logically organized chapters. The first chapter charts the history and unprecedented growth of the Web as a communication medium and poses some core questions on how future historians might be able to contend with the vast scope of hyperlinked, screen-mediated, open-ended texts. The second and third chapters are dedicated to web archives and will probably be of most interest to archivists: Milligan introduces the relevant terminology (crawls, seeds, etc.) and provides some basic explanations about the Web’s informational infrastructure. He reminds his readers that archival preservation entails many interventions and obstacles – an endeavour further complicated when trying to archive inherently unstable entities such as websites. The fourth and fifth chapters turn to more specific case studies as a way of exploring some of the above-mentioned challenges in more detail. Chapter four presents the Archives Unleashed project and discusses issues of search and retrieval, and the fifth chapter builds on the example of GeoCities – the massive social media site that closed down in 2009 – to contemplate potential ways of working with new kinds of historical sources. The sixth and final chapter suggests some practical skills that historians could and should learn, given the ever-expanding quantity of primary source material to be found and consumed on the Web.
Milligan’s overall objective is to persuade his readers that web-based information resources – the majority of items in today’s informational diet – will serve as crucial primary sources for future historians. Consequently, they require proper care and proactive interventions to ensure both long-term preservation and subsequent access. The book might have benefited from some more grounding by way of comparative discussions of earlier media revolutions or perhaps the incorporation of some media theory. However, such additions might only have obscured this important message rather than supported it. Overall, beyond suggesting solutions, Milligan calls for increased awareness – advocating for self-education on the importance of generating stable and accessible archival copies from the Web’s transitory, decentralized, and nonhierarchical informational ecosystem as a way to help future historians avoid a potential digital dark age. As this review is being written at a time when the future of Twitter, and of our ability to access its vast repository of content, seems uncertain, this call seems timely.
While historians deal with the above-mentioned challenges as users, archivists approach them as crucial, almost existential questions that should be further explored from an archival theory perspective. Milligan’s book can thus help us to better articulate some of these questions. For example, are web pages to be treated as publications or as records? If a website is a record, where does it exist: in the entire domain, in each individual page or URL – or in the back-end HTML and CSS code? Is the website record an envelope that contains any external embedded documents (for example, YouTube videos), or are these separate entities? Are our crawls copies, or are they new items altogether? There are also many more practical questions about usage, access, and privacy: At what frequency should we crawl sites, and what should we do in the quite common instance when – by the time the creation of a copy is completed – changes to the original have already taken place? What options are there for bringing organizational websites under a consistent recordkeeping and scheduling regime, and how do we best describe and connect web pages with specific creators? Do we notify site owners when their sites have been copied and archived, and do such notifications fall under traditional donor-relations management? What level of intervention should archivists impose on web content that contains fake or misleading information – and what tools do we actually have to make such determinations? The list goes on and on.
These and other related questions will take time to ponder, especially as they require levels of technical proficiency that have yet to make their way into the archival curriculum. After all, throughout history, the introduction of new technologies has challenged the archival profession and required the acquisition of new skill sets as a way to develop viable solutions. Nevertheless, one very promising aspect of web archiving is that, at its core, it deals with born-digital, highly networkable records. These can be shared and accessed in a true postcustodial fashion, allowing new and exciting opportunities for participatory community archiving and for the sharing of expertise, data budgets, access, hosting services, and even quality assurance tasks across various memory organizations. Web archiving might also, as Milligan contemplates in his introduction, prove to be an opportunity to build new bridges and reinforce the deep relationships between archivists and historians.
What might be some of the ways forward? Building a vision and recommended approaches toward web archiving is definitely something archivists need to grapple with, individually and collectively. Archival professional associations and others from the galleries, libraries, archives, and museums (GLAM) sector are well placed to take the lead on these discussions. Presenting and sharing institution-specific policies and procedures at conferences, in webinars, and on listservs is another way forward. The same is true for practical training on the leading software and tools such as Archive-It (the benchmark web archiving software developed by the Internet Archive), Webrecorder, HTTrack, and others. Of course, developing academic courses on web archiving, either as stand-alone seminars or as part of broader digital-preservation training modules, would further contribute to both practical and theoretical progress.
In the meantime, Milligan’s book helps set the stage for such potential future developments and is a crucial addition to the evolving library of digital history. My personal takeaway is that, rather than leave web archiving to those with a better technical background, it is our professional duty as archivists to educate ourselves on the Web’s informational ecosystem and actively incorporate it into our professional thinking. Milligan shows that web archiving is another crucial building block for upholding our ongoing professional commitment to provide future generations with access that is as accurate and reliable as possible to the records of our time.