Demonstrating the Scholarship of Encoding



This post is the second installment of CIT’s new blog series, “Demonstrating the Scholarship of…,” in which we provide arguments and strategies for individual scholars to link their own particular suite of non-traditional means to traditional scholarly ends. In this post, we will reflect on the mutual interdependence of encoding and research, share ideas for making this feedback loop more rewarding and efficient, and offer arguments for the significance of encoding as a rigorous and productive form of scholarship.

By “encoding,” I really mean textual encoding, which is the practice of adding markup to an electronic text—in other words, carefully marking strings of text with sets of tags that indicate various traits or objects. Encoding can involve adding metadata (information about the source and provenance of the text), indicating hierarchies (organizational units that comprise a text’s structure), providing annotations (such as definitions of obscure terms or identifications of allusions or quotations), specifying classes of words (such as people, places, and other texts mentioned in the base text), or recording variant readings.

Although it is common to associate markup language tags with design—the appearance the text will have when viewed in a web browser or printed out—specialists do not privilege the encoded text’s final appearance as the necessary end product of encoding practices. A later post will examine how to demonstrate the scholarship of design (and you can learn more about this distinction between descriptive and procedural markup here), but this post refers to using markup language to identify important features in a text and make them machine-readable. This means that the encoder decides what a human reader should find (literally) remarkable in a text, or, in other words, what should be retained for producing a new edition of the text and/or for submitting it to further layers of computer processing (e.g., for creating visualizations or for text mining). Tags allow computer software to recognize these features, which would otherwise, as undifferentiated strings of characters, be unrecognizable to a computer.
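A minimal sketch can make this concrete. The fragment below uses element names that follow TEI conventions (`persName`, `placeName`, `date`), but the passage itself is a hypothetical illustration, not drawn from any particular edition; once the features are tagged, even a few lines of code can retrieve them, whereas untagged they would remain undifferentiated character strings:

```python
import xml.etree.ElementTree as ET

# A small, hypothetical TEI-style encoded passage (illustrative only).
encoded = """
<p>
  In <date when="1818">1818</date>, <persName>Mary Shelley</persName>
  published <title>Frankenstein</title> in <placeName>London</placeName>.
</p>
"""

root = ET.fromstring(encoded)

# Because the features are marked up, software can recognize and
# extract them directly, without any natural-language guesswork.
people = [el.text for el in root.iter("persName")]
places = [el.text for el in root.iter("placeName")]
dates = [el.get("when") for el in root.iter("date")]

print(people)  # ['Mary Shelley']
print(places)  # ['London']
print(dates)   # ['1818']
```

The same extracted lists could then feed further processing—a visualization of places mentioned, say, or a dataset for text mining—which is precisely the sense in which encoding makes a text machine-readable.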

Textual encoding is typically championed in terms of what else it allows scholars to do: to conduct research, to teach a class, to create a network visualization. Jerome McGann’s A New Republic of Letters is a vivid example of such arguments. In it, McGann explains that we “need to migrate our cultural heritage to a digital condition” in order to “make it usable for the present and the future” (22). Because, according to McGann, “no one any longer thinks that scholarship—our ongoing research and professional communication—can be organized and sustained through print resources” (132), textual encoding is indispensable for humanities scholarship. N. Katherine Hayles and Jessica Pressman, in their collection, Comparative Textual Media, provide a very different defense that can be used to explain the significance of textual encoding. They argue that the recent proliferation of digital texts and cultural forms has not made print obsolete, but rather has made it evident that “print itself is a medium” (xxiii), thus enabling scholars who are attentive to the digital mediality of contemporary texts to make new arguments about past and present print cultures. As they explain, it is “possible once again to see print in a comparative context with other textual media, including the scroll, manuscript codex, early print codex, the variations of book forms produced by changes from letterpress to offset to digital publishing machines, and born-digital forms such as electronic literature and computer games” (vii). Textual encoders are, I argue, the best examples of such a critical renewal, for they engage precisely with this transition from print to digital culture.

Another set of arguments, the publications of the Women Writers Project, can provide inspiration for explaining the scholarship of encoding. One important example acknowledges that “text encoding is typically taught (particularly in a workshop setting) as a technical skill and as something having primarily to do with the use of computers,” so that it seems that there are “few opportunities for humanists to learn about text encoding in a way that emphasizes its theoretical and methodological significance for humanities research and teaching.” It is crucial, then, to “use the markup process as a way of thinking about the text—thus helping to illustrate how much the process has to do with expressing the scholar’s own interests.”

To end this literature review in miniature, it is worth spending some time reading and reflecting on Ryan Cordell’s inspiring blog post, “On Ignoring Encoding.” Cordell explains how “encoding inherited the stigma of scholarly editing, which has in English Departments long been treated as a lesser activity than critique—though critique depends on careful scholarly editing.” For Cordell, the solution is certainly not to privilege coding (or tool-building) over encoding, but to explain that textual encoding can be used to create a more robust model of DH that can withstand such critiques. Warning us against pro-digital humanities arguments that “ignore the field-constitutive work of scholars such as Julia Flanders, Bethany Nowviskie, and Susan Schreibman,” Cordell argues that we cannot represent the field as “all Graphs, Maps, Trees and no Women Writers Project.” This is unfair not only because it neglects much significant digital labor and many ground-breaking projects, but also because, ironically, “far more financial support has gone into encoding and archival projects than into data analysis over the past decades of DH history.”


Non-Traditional Means, Traditional Ends

Textual encoding is not simply a service to the profession (although it is an important one). It requires and develops the same research skills needed to fashion a scholarly edition, but, beyond these skills, it creates a strategic, focused context for reflecting on the nature of digital textuality. As James Cummings has written, “If the study of literature is increasingly to become digital then we have an academic duty to ensure as much as possible that this is based on truly scholarly electronic editions which not only uphold the quality and reliability expected from such editions, but simultaneously capitalize upon the advantages that publication in a more flexible media affords.” I would add that encoders or readers should also recognize and elucidate the disadvantages or challenges of encoding texts. Doing so can avoid the feeling of clubbiness that is sometimes assumed to hover around textual encoding communities. It can also ensure that you develop a critical awareness of encoding that approaches the theoretical sophistication of other schools of literary criticism and humanist scholarship.

Such a critical approach is possible because textual encoding—or even hypothetically approaching the problem of encoding a particular text—requires the encoder to confront simultaneously practical and theoretical questions about textuality. Work on encoding a text can quickly inspire research in textual studies or bibliography, not only because of the subject knowledge required for responsible encoding, but also because of the peculiar, double-sided nature of encoding itself. As the Women Writers Project’s introduction to encoding argues, “We might liken the encoder to an anthropologist […] creating a thick, contextualized, interpretative description of the text, or to a critical editor who produces an analytical representation of the text which provides systematic, expert knowledge about it.” These two approaches are not exclusive, of course; an encoder likely invokes each method at different points in the encoding process. Analysis and description are two familiar poles of humanist scholarship, ensuring that the critic (on the one pole) chronicles, elaborates, and forges connections and (on the other pole) theorizes, categorizes, and boils down complex materials into an argument.

These are, of course, indispensable skills for writing scholarly articles and books, but encoding does more than sharpen these skills: it is uniquely situated to provoke research projects that plumb the digital futures of textuality, recover networks of influence, theorize genres, or analyze narrative structures. Anyone approaching a text with encoding in mind analyzes the structure of the text, evaluates what is important about the text’s content, decides what kinds of information to retain about the text’s provenance and materiality, and chooses a particular set of scholarly apparatuses. To quote from the WWP’s introduction to encoding again, encoding represents “a way of formalizing and externalizing the structures in a text; a way of adding further information to the text that interests us; a meta-text that comments on, interprets, or extends the meaning of a text.” Making these decisions explicit can allow you to use them as the basis for research about textual encoding, scholarly editions, or your disciplinary specializations.

Actively reshape conversations and debates about which encoding schema is best, or how best to apply it, into reflections on the broader significance and implications of encoding. Critiques of contemporary encoding practices—such as David Schloen and Sandra Schloen’s “Beyond Gutenberg: Transcending the Document Paradigm in Digital Humanities” and Desmond Schmidt’s “The inadequacy of embedded markup for cultural heritage texts”—can therefore be seen as positive indicators that there are theoretical research questions inhering in encoding, that there are arguments to be made and positions to weigh. These are not simply turf wars; disagreements reflect profound, salient, or topical differences in the models of textuality being explicitly or implicitly invoked in these debates. Your own reflections on the matter should influence not only how you encode, but also how you execute other types of research in your subfield, or perhaps in book or media history. Bringing these insights into discussions of theoretical approaches to the humanities, or bringing theory into your approach to encoding, will ensure that you do not separate the “encoding” part of your scholarly career from the rest of it.



As discussed in “Demonstrating the Scholarship of Pedagogy,” it is crucial to take every chance to transform experience with encoding into a different type of scholarly activity, even if your particular encoding project is not finished. Submitting your encoding work to journals like DHQ, DLS, and the Journal of Digital and Media Literacy, or to relevant critical aggregation networks, such as NINES or ModNets, is one way to solicit peer review for your particular encoding project. Although publishing in the Journal of the TEI immediately springs to mind as an appropriate venue for applying the fruits of your encoding work, this transformation does not necessarily require your experiences in encoding to have produced a scholarly edition that is completely finished. Many journals accept projects still under development: DHCommons and Digital Scholarship in the Humanities invite accounts of projects underway but not yet finished.

If your encoding project has participated in the successful creation of a “live” scholarly edition (particularly if it is open-access), gather evidence of your impact by tracking user experience through numbers (analytics, many of which can be passively tracked) and through description (which you will probably have to solicit from users). These editions can have a substantial impact through the number of users interacting with the text. Attracting grants will also prove the impact of your encoding labors. Some grants target the early phases of a project: internal “seed” grants designed as precursors to external applications, for example, or the NEH’s Digital Humanities Start-Up Grants, which are specifically for early-phase projects. Presenting at conferences and poster sessions can not only provide a line on your CV, but also expose you to the perspectives of conference-goers who may be able to illuminate a particular strength of your work that you yourself have not considered.

But what if you do not have a scholarly edition or other “deliverable” to point to? In that case, the impact of your encoding activities must be pitched within a narrative of your own scholarly development. Consider writing (or simply brainstorming) an encoding philosophy: like a statement of teaching philosophy, such a 1-2 page statement would summarize your approach to encoding, creating something like a manifesto by finding a central thesis or theme that runs through your encoding. It would reflect the values and priorities you use when encoding, as well as provide a chance for you to think about why you’ve chosen encoding as part of your research endeavors. Just as a teaching philosophy clarifies your motivations and goals for teaching and explains how this fundamental service (which we are all expected to do) for our institutions matters to you particularly, an encoding philosophy should reveal the larger stakes of your encoding, which is also often regarded as merely a service.


Combating Counterarguments

  1. Textual encoding is service, not research. Challenge the binary pitting research against service, showing how they are mutually interdependent. Keep ready in your mind at all times a particular anecdote about how an encoding experience spurred a research question. Point to the grants, publications, presentations, and workshops that your encoding has led to. Ask your interlocutors if they have encountered edited texts whose underlying assumptions and argument either made the edition indispensable or infuriating for them, then ask why this matters so much to them.
  2. Textual encoding is repetitive, unending busywork. Point out how many research activities are repetitive: we reread texts, revise our writing, synthesize overlapping or similar scholarly texts, and revisit classic scholarship in our fields. These activities are not dismissed as “mere” repetition, so why should encoding be? Discuss how, in the digital humanities, pattern recognition is a major source of insights and theses. Or explain that repetition is frequently repetition with a difference—you learn something with each recurring turn—and that when it is not, repetition can be soothing and renewing because it can propel you into a different mental space than other scholarly activities.
  3. Textual encoding can or should be done by more efficient, dedicated specialists or businesses. Concede that it may not be “efficient” for every humanist scholar to learn how to encode, or concede that your encoding may not go as quickly as that of someone who is not trying to maintain precisely the kind of research record you wish to maintain. Then, challenge the value of efficiency, explaining that the other benefits encoding brings are worth going “slower.” Explain that more efficient encoding is not typically better encoding, or that the inefficiencies of your encoding (going back and changing a schema halfway through, for example) help to provoke new ideas or cement nascent or previously undeveloped ideas you had about textuality or this text in particular. With regard to businesses, if you are politically inclined, you might critique the use of transcription farms or for-profit businesses by exposing any unjust labor practices; if you are not so inclined, you can argue that humanities scholars need to participate in (or at least fully understand) the processes lurking beneath the resources they use in their scholarship.
  4. Textual encoding is just the same ol’ textual studies or bibliography in new clothes. Either justify the values of traditional textual scholarship, or point to the advances in such scholarship that have been spurred by new digital platforms, markup vocabularies, and communities.


Verbs and Phrases

To create a statement of encoding philosophy (or even to hold a serious brainstorming session for a hypothetical statement), you will need some stock verbs and phrases that you can easily remember when conversing or writing about your encoding activities. Jerome McGann’s A New Republic of Letters provides a wealth of manifesto-worthy lines, while more even-tempered lines can be found in Susan Hockey’s Electronic Texts in the Humanities and in Digital Critical Editions, edited by Daniel Apollon, Claire Bélisle, and Philippe Régnier.

Digital scholarly editions

Produce new models of textuality

Creative markup / theoretical markup

Textual structures or structural hierarchies

New methods of textual studies / bibliography

“Encoding neutrality?” (Nigel Lepianka)

“All coding and no encoding” (Ryan Cordell)

Beyond the “document paradigm” (Schloen and Schloen)

Documentary (or paradigmatic) editions (see Elena Pierazzo)

Encoding communities as research communities (Allen H. Renear)

Engagement with encoding as “different types of reading” (Jacob Heil)

“Knowledge society” and “the digital turn” (Apollon, Bélisle, and Régnier)

Realist versus antirealist trends in textual transcription (Kathryn Sutherland)

Encoding as a hermeneutic or semiotic act (Fiormonte, Martiradonna, and Schmidt)

“intellectual problem of representing textual data in electronic form” (Ide and Véronis)

