Going Digital: The Process of Adding a New Volume to our Digital Edition

TOPICS: Alexander Hamilton, Digital Humanities, Documentary Editing

by Katie Blizzard, Communications Specialist
August 2, 2019

An example of our digital edition’s XML coding, which shows a segment of the numerous entries for “John Smith” in our cumulative index.

We publish every volume that The Washington Papers produces in both print and digital formats. Our subscription-based digital edition, which is published by the University of Virginia Press’s electronic imprint Rotunda, is one of several online resources that result from our work.1 Used by thousands of people every year,2 the Papers of George Washington Digital Edition (PGWDE) is an especially valuable tool since it includes a cumulative index, which spans all volumes and series. The PGWDE also links readers directly to any document referenced in an annotation or editorial note, making it easy for users to discover other pertinent information. Digital publication furthermore enables authors (or, in our case, documentary editors) to emend material after publication. It is, of course, unfortunate that errors can still occur even after our multiple volume reviews, but it is helpful (and heartening) to be able to visibly correct those errors in our digital edition.

The addition of a new volume to our digital edition, however, is not automatic. While much of the source material can be transformed by computer into XML, the markup language needed to integrate the volume into the PGWDE, some tasks must be done by hand. These tasks include reference tagging (the linking of in-text references to the relevant document or annotation) and the input of the new volume’s index into the PGWDE’s cumulative index. The latter is a cumbersome task. To keep our volumes and their indexes accurate, editors must confirm that the new volume’s index entries are added under the appropriate heading of the cumulative index (if such a heading already exists in the PGWDE). If a corresponding entry or index heading does not yet exist, editors must ensure that the new entry is placed in the correct spot of the cumulative index, a sometimes challenging task given the PGWDE’s unique alphabetization system (to be discussed later in this post) and the thousands of lines of code that include entries, subentries, sub-subentries, and more.
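
For readers who think in code, the merge step described above might be sketched roughly as follows. This is only an illustration in Python, not the PGWDE's actual tooling: the function name `merge_index`, the dictionary layout, and the page locators in the demo are all hypothetical, and the real cumulative index is XML with entries nested several levels deep.

```python
def merge_index(cumulative, new_volume):
    """Merge a new volume's index into the cumulative index, in place.

    Both arguments map an index heading to a list of locators (hypothetical
    strings such as "4:93"). Returns the headings that did not yet exist,
    since those still need to be placed by hand in the correct spot.
    """
    created = []
    for heading, locators in new_volume.items():
        if heading not in cumulative:
            cumulative[heading] = []
            created.append(heading)
        for locator in locators:
            # Skip locators that are already recorded under this heading.
            if locator not in cumulative[heading]:
                cumulative[heading].append(locator)
    return created


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    cumulative = {"Hamilton, Alexander (1757-1804)": ["4:93"]}
    new_volume = {
        "Hamilton, Alexander (1757-1804)": ["22:10"],
        "Smith, John": ["22:51"],
    }
    print(merge_index(cumulative, new_volume))
```

The returned list is the interesting part: a heading that matches exactly can be merged mechanically, while every newly created heading is a prompt for an editor to check whether it really is new or simply spelled differently from an existing one.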

Matching two entries may seem straightforward, since many of the entries for individuals or places indexed in a new volume include a reference to an identification (I.D.) published in that volume or in another volume from the same series. An entry for Alexander Hamilton (1757-1804) in the index of a print volume from our Revolutionary War Series, for example, would include “(see 4:93)” after his name to indicate that an I.D. can be found in volume 4 of the Revolutionary War Series on page 93. But this is where our first problem arises. While an I.D. of a certain person or place may already exist in our cumulative index, that I.D. may have been published in a different series (such as Colonial, Presidential, or Retirement) than the one to which the volume being added to the PGWDE belongs. Given our project’s rule that each series provide its own I.D. for persons or places referenced, the volume being added likely has an I.D. distinct from the one already published in the digital edition under another series. So, the digital editor must confirm that the entry from the volume being added truly belongs under the one already existing in the PGWDE index. For instance, an entry for “John Smith” would require the digital editor to browse the 43 unique entries for John Smith3 in the cumulative index (which include five different John Smiths of Virginia) to determine whether one of them is the John Smith being added. In many cases, this may be easily resolved by comparing the I.D.’s provided in the different series. But what happens when there is no existing I.D., or the existing I.D. does not provide enough information to confirm a match? And what happens when the existing I.D. matches the new one in all but one small way, such as a difference of only a year in the birth or death date of an individual? In those cases, additional research must be performed. Where two I.D.’s conflict, research may conclude that one of them needs to be corrected (the process for which will be discussed later in this post).
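
The comparison of two I.D.’s can be pictured as a small decision procedure. The Python sketch below is a simplification under stated assumptions: it reduces an I.D. to a name plus optional birth and death years, and the names `Identification` and `compare_ids` and the one-year tolerance are illustrative choices, not project rules. In practice an I.D. is a prose note, and the judgment belongs to an editor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Identification:
    """A drastically simplified I.D.: a name and optional life dates."""
    name: str
    born: Optional[int] = None
    died: Optional[int] = None


def compare_ids(existing, new, tolerance=1):
    """Return "match", "needs research", or "different"."""
    if existing.name.casefold() != new.name.casefold():
        return "different"
    verdict = "match"
    pairs = ((existing.born, new.born), (existing.died, new.died))
    for old_year, new_year in pairs:
        if old_year is None or new_year is None:
            # Not enough information to confirm the match.
            verdict = "needs research"
        elif old_year == new_year:
            continue
        elif abs(old_year - new_year) <= tolerance:
            # Off by a year or so: possibly the same person, possibly an error.
            verdict = "needs research"
        else:
            return "different"
    return verdict


if __name__ == "__main__":
    hamilton = Identification("Alexander Hamilton", 1757, 1804)
    print(compare_ids(hamilton, Identification("Alexander Hamilton", 1757, 1804)))
    print(compare_ids(Identification("John Smith"), Identification("John Smith")))
```

Note that two undated John Smiths come back as "needs research" rather than "match". A shared name alone is never enough, which is exactly why the 43 John Smith entries have to be checked by hand.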

Aside from confirming that entries match, the digital editor may encounter issues that would make it harder for users to find the content they need. This may mean consolidating subentries of similar meaning, like “Martha Washington at” versus “Martha Washington in” a certain town or city. Bigger problems, though, arise when editors reference (and thus index) individuals or places by different names. An example I recently encountered during work on the digital edition was a subentry titled “[letters] from Wilhelm Knyphausen” rather than “[letters] from Wilhelm von Knyphausen” (emphasis added).4 The correct name would be “von Knyphausen,” but before consolidating these two subentries, care must be taken to ensure that “von Knyphausen” replaces “Knyphausen” wherever else it might be mentioned. A more notable example, which affects print and digital editors alike, is the indexing of Lewis Nicola, whose name changed from Nicola to Nicolas after 1789. Ultimately, the index heading for this individual became “Nicola, Lewis (after 1789, Nicolas; 1717-1807)” to reduce confusion and prevent the unnecessary addition of an entry for “Nicolas, Lewis (1717-1807).”

Our concern for accessibility and discoverability leads the digital editor to make some changes without hesitation, like the rearrangement of indexed terms. Most notably, this can be seen in the differences in alphabetization and formatting between our print and digital editions. Where our print volume indexes disregard leading prepositions (after, during, following, in, of) or conjunctions (and, or) when alphabetizing entries or subentries, the digital edition does not. And where our print volume indexes list the subentries “letters from” and “letters to” at the end of every entry, our digital edition lists them (along with the subentry “id.”) directly after the entry’s general references. Doing so ensures that digital users can quickly locate an individual’s identity and connection to George Washington before delving deeper into other subentries (if there are any).
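
The two ordering schemes can be expressed as two sort routines. The Python sketch below is illustrative only: the function names and the sample subentries are hypothetical, the list of ignored words is limited to the ones named above (the print rules may cover more), and the relative order of “id.”, “letters from”, and “letters to” in the digital edition is an assumption.

```python
# Leading words the print indexes skip when alphabetizing (per the post).
IGNORED_LEADING_WORDS = {"after", "during", "following", "in", "of", "and", "or"}
LETTERS = ("letters from", "letters to")
# Assumed order for the digital edition's pinned subentries.
PINNED = ("id.",) + LETTERS


def print_sort_key(subentry):
    """Alphabetize while disregarding leading prepositions and conjunctions."""
    words = subentry.casefold().split()
    while words and words[0] in IGNORED_LEADING_WORDS:
        words.pop(0)
    return " ".join(words)


def print_order(subentries):
    """Print style: "letters from" and "letters to" go last.

    The post does not say where print indexes place "id."; here it is
    simply sorted like any other subentry.
    """
    rest = sorted((s for s in subentries if s not in LETTERS), key=print_sort_key)
    return rest + [s for s in LETTERS if s in subentries]


def digital_order(subentries):
    """Digital style: pinned subentries first, then plain alphabetical order."""
    rest = sorted((s for s in subentries if s not in PINNED), key=str.casefold)
    return [s for s in PINNED if s in subentries] + rest


if __name__ == "__main__":
    sample = ["in New York", "letters to", "during the war", "letters from", "and supplies"]
    print(print_order(sample))
    print(digital_order(sample))
```

With this made-up sample, the print ordering files “in New York” under N and “and supplies” under S, while the digital ordering files them under I and A, after the pinned “letters” subentries.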

All in all, digital editors have some control and responsibility over the presentation of a new volume. As such, they can introduce errors just as much as print editors can. The ability to contribute to the digital publication of our volumes enables us to actively respond to any errors in our volumes, no matter their origin. This process involves confirming the necessity and accuracy of the proposed change, recording the change in our log of digital edition corrections, and then updating the XML in such a way as to provide the new information without erasing access to the old, so that users know a change was made from the original print publication and what that change was.
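
One common way to record a correction in XML without discarding the original reading is to keep both side by side. The sketch below uses Python's standard `xml.etree.ElementTree` module and borrows the `choice`, `sic`, and `corr` element names from the TEI Guidelines. Whether the PGWDE's markup uses these elements is an assumption, since this post does not describe the edition's schema, and the `subentry` element and the log structure are hypothetical.

```python
import xml.etree.ElementTree as ET


def record_correction(parent, original, corrected, log):
    """Append a <choice> that keeps both the original and corrected readings.

    Also adds a line to `log`, a plain list standing in for a log of
    digital edition corrections.
    """
    choice = ET.SubElement(parent, "choice")
    ET.SubElement(choice, "sic").text = original    # reading as printed
    ET.SubElement(choice, "corr").text = corrected  # corrected reading
    log.append({"original": original, "corrected": corrected})
    return choice


if __name__ == "__main__":
    corrections_log = []
    entry = ET.Element("subentry")  # hypothetical element name
    entry.text = "letters from "
    record_correction(entry, "Wilhelm Knyphausen", "Wilhelm von Knyphausen", corrections_log)
    print(ET.tostring(entry, encoding="unicode"))
```

Because both readings stay in the file, the display layer can show the corrected text by default while still letting users see what the print volume originally said.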

It is our hope that these efforts have improved the accessibility and discoverability of the papers of George Washington without sacrificing the accuracy of our content. We are always glad to answer readers’ questions about Washington and his correspondents, and we welcome any suggested corrections to the digital edition.

1. In addition to the Papers of George Washington Digital Edition (PGWDE), George Washington’s papers can be accessed on Founders Online. This free-to-use database, which is sponsored by the National Archives, differs from the PGWDE in that it does not include an index.
2. There are roughly 1,000 unique visitors to the PGWDE every month.
3. This count does not include the 8 other John Smiths for whom we have recorded a middle initial, a middle name, or a suffix (Jr.).
4. This problem underscores the benefit of a project style guide, which creates naming conventions for editors to use when referring to individuals—especially those individuals who acquire several different names throughout the course of their lives.