Digital Humanities: Glossary to Digital Scholarship

Digital Scholarship Vocabulary

Glossary for Digital Scholarship

Born Digital: Materials that were created on computers and therefore do not need to be digitized.

Controlled vocabulary: A standardized list of terms used in metadata so that like things appear together in searches. In practice, specialists (art historians, anthropologists, librarians, engineers, etc.) develop thesauri of defined terms so that searches work better and return the most relevant results.
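As a rough illustration of how a controlled vocabulary brings like things together, the sketch below maps variant subject terms to one preferred term before searching. The tiny thesaurus, the record titles, and the function name `normalize_subject` are all invented for this example and do not come from any real vocabulary.

```python
# Hypothetical mini-thesaurus: each variant term maps to one preferred term.
PREFERRED_TERMS = {
    "movies": "Motion pictures",
    "films": "Motion pictures",
    "cinema": "Motion pictures",
    "motion pictures": "Motion pictures",
}


def normalize_subject(term: str) -> str:
    """Return the preferred term for a variant, or the input unchanged if unlisted."""
    return PREFERRED_TERMS.get(term.strip().lower(), term)


# Three invented records whose catalogers used different words for the same subject.
records = [
    {"title": "Record A", "subject": "Films"},
    {"title": "Record B", "subject": "movies"},
    {"title": "Record C", "subject": "Cinema"},
]

for record in records:
    record["subject"] = normalize_subject(record["subject"])

# A single search on the preferred term now finds all three records.
matches = [r["title"] for r in records if r["subject"] == "Motion pictures"]
print(matches)
```

Without the normalization step, a search for any one of the three variants would have found only one record.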

Digital archiving and preservation: The long-term storage and preservation of digital information, and the provision of continued access to it.

Digital humanities (e-humanities): The intersection of computing and scholarly endeavors (teaching, studying, etc.) in the humanities. A means to promote new ways of analyzing texts. See this guide:

Digital identifiers: Persistent digital identifiers given to people. Much as DOIs (digital object identifiers) are assigned to journal articles, the following initiatives assign identifiers to people in order to resolve authorship confusion: ORCID iD and ResearcherID (Web of Knowledge). These identifiers are important for the Semantic Web, which is often described as the next stage of the web.
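To show how such an identifier can be checked by software, here is a minimal sketch that validates the final check character of an ORCID iD using the ISO 7064 MOD 11-2 checksum. The function names are chosen for this example only, and the sketch checks format and checksum alone; it does not confirm that an identifier is actually registered to a person.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the identifier has 16 characters and a matching check character."""
    compact = orcid.replace("-", "").upper()
    base, check = compact[:15], compact[15:]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_character(base) == check


print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1825-0098"))
```

The first call uses a well-known sample identifier; the second changes only the last character, so the checksum no longer matches.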

Digital Repository: A system for the digital collection and preservation of a body of materials, and sometimes for access to it. Examples include ScholarWorks, Selected Works, Google Scholar, digital collections of various institutions (museums, libraries, etc.), and any other collection of digital materials that is stable and enduring.

Digitizing/Image capture: The process of photographing or scanning physical objects (either two- or three-dimensional) into digital formats, most often TIFF or JPEG 2000.

GIS/Spatial Analysis: Geographic information system, a system (usually a type of software) for storing and manipulating geographical information on a computer.

Institutional Repository: A service for collecting, preserving, and giving access to the administrative and sometimes intellectual works of a particular institution.

Markup Languages (HTML or XML): Languages that annotate text with tags so that software can structure and display it. HTML is the older of the two and is the language used in building web pages. Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
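As a small illustration of what "a set of rules" means for XML, the sketch below uses Python's standard `xml.etree.ElementTree` module to test whether a snippet is well-formed and to read a value from it. The sample snippets and the helper name `is_well_formed` are invented for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False if it breaks XML's syntax rules."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Readable by a person, and every opening tag has a matching closing tag.
good = "<book><title>Sample Title</title></book>"
# The <title> element is never closed, so a parser rejects the whole snippet.
bad = "<book><title>Sample Title</book>"

print(is_well_formed(good))
print(is_well_formed(bad))

# Because the first snippet follows the rules, software can pull out its parts.
title = ET.fromstring(good).find("title").text
print(title)
```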

Metadata: A set of data that describes and gives information about other data. A variety of standard schemas have been designed for different uses; many are based upon Dublin Core, an element set of 15 “core” fields, and add specific fields that pertain to certain user needs or to the type of material being described.
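For readers who want to see what a Dublin Core description can look like, here is a minimal sketch that builds a small XML record from the 15 core elements. The wrapper element `record`, the helper name `build_record`, and the sample values are assumptions made for this example; real repositories wrap Dublin Core fields in their own record formats.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 core elements.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def build_record(fields: dict) -> ET.Element:
    """Build a simple XML record, rejecting any field outside the 15 core elements."""
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {sorted(unknown)}")
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Invented sample values describing a scanned photograph.
record = build_record({
    "title": "Sample photograph",
    "creator": "Unknown",
    "type": "Image",
    "format": "image/tiff",
})
print(ET.tostring(record, encoding="unicode"))
```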

Open Access (OA): Open-access literature is digital, online, free of charge to read, and free of most copyright and licensing restrictions. It can be peer-reviewed; some journals expect the author to pay a fee for open-access publication.

Scholarly Communication: Traditionally the term scholarly communication was narrowly defined as the system for disseminating scholarly work, primarily through journals. More recently the definition has been broadened to include the creation, transformation, dissemination and preservation of knowledge. It encompasses the entire process by which academics, scholars and researchers share and publish their findings within and beyond the academic community, and the entire gamut of publication types, from traditional journal articles, books and conference papers to sound and video recordings and interactive multimedia. (Carnegie Mellon University)

Semantic Web: The Semantic Web is the extension of the World Wide Web that enables people to share content beyond the boundaries of applications and websites. It has been described in rather different ways: as a utopic vision, as a web of data, or merely as a natural paradigm shift in our daily use of the Web. Most of all, the Semantic Web has inspired and engaged many people to create innovative semantic technologies and applications.

TEI (Text Encoding Initiative): “TEI allows texts to be marked up semantically at any level of granularity, or mixture of granularities. For example this paragraph (p) has been marked up into sentences (s) and clauses (cl)” ("17 Simple Analytic Mechanisms", TEI P5: Guidelines for Electronic Text Encoding and Interchange). TEI is a way of “unpacking” or “decoding” texts.
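The paragraph, sentence, and clause markup described in the quotation can be sketched as follows. The sample text is invented for this example, and the fragment is far simpler than a complete TEI document, which would also carry a header and other required structure.

```python
import xml.etree.ElementTree as ET

# Prefix map for the TEI namespace, used in the searches below.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# An invented paragraph (p) marked up into sentences (s) and clauses (cl).
SAMPLE = """
<p xmlns="http://www.tei-c.org/ns/1.0">
  <s><cl>The scholar opened the archive,</cl> <cl>and the letters were still there.</cl></s>
  <s><cl>She began to read.</cl></s>
</p>
""".strip()

paragraph = ET.fromstring(SAMPLE)
sentences = paragraph.findall("tei:s", TEI_NS)

# For each sentence, collect the text of its clauses.
clauses_per_sentence = [
    [cl.text for cl in s.findall("tei:cl", TEI_NS)] for s in sentences
]

for number, clauses in enumerate(clauses_per_sentence, start=1):
    print(f"Sentence {number}: {len(clauses)} clause(s)")
```

Once a text is encoded this way, a program can count or compare sentences and clauses without guessing where they begin and end.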

Text Analysis Tools: Digital tools that allow for closer readings of texts. Examples include the TAPoR tools and MONK.

Subject Guide

Kate Langan
1st Floor Waldo Library, 1053