Glossary


A

Algorithm A set of formal instructions for performing an analysis or solving a problem.
ANSI The American National Standards Institute, the body that accredits standards organizations in the United States and serves as the official U.S. member of ISO.
Application programming interface, API A set of specifications that provides a formal interface through which software programs can communicate with one another in standard ways.
Apparatus criticus The critical or editorial apparatus that accompanies an edition of a text, typically consisting of information about variant readings, textual history and witnesses, and editorial commentary.

B

BSI The British Standards Institution, the national standards body of the United Kingdom and the British counterpart of ANSI.
Byte A unit of information in a digital system, typically consisting of 8 bits. (A bit is the smallest unit into which digital information can be divided, consisting of a single true-or-false, 0-or-1 value.)
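
As a brief illustration of the byte entry above, the following Python sketch shows how many distinct values 8 bits can represent and prints the individual bits of a single byte. The variable names are illustrative only.

```python
# A byte of 8 bits can represent 2**8 distinct values.
BITS_PER_BYTE = 8
distinct_values = 2 ** BITS_PER_BYTE

# Show the individual bits of one byte value (here, the ASCII code for "A").
value = ord("A")
bits = format(value, "08b")  # zero-padded to 8 binary digits

print(distinct_values)
print(bits)
```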

C

D

Data curation The active and on-going management of data through its lifecycle of interest and usefulness to scholarly and educational activities across the sciences, social sciences, and the humanities. Data curation activities enable data discovery and retrieval, maintain data quality, add value, and provide for re-use over time. This new field includes representation, archiving, authentication, management, preservation, retrieval, and use. See the introduction for more detail.
Data format A specific convention for data representation: i.e. the way that information is encoded and stored for use in a computer system, possibly constrained by a formal data type or set of standards.
Digital object identifier, DOI A permanent identifier associated with a digital object that permits it to be referenced reliably even if its location and metadata undergo change over time.

E

EAD The Encoded Archival Description, a metadata standard for representing archival finding aids in digital form.

F

Fedora An open-source digital repository system.

G

H

I

ICT Information and communication technology.
IMLS The Institute of Museum and Library Services, a US federal funding agency focusing on programs and research in museum and library science.
IETF The Internet Engineering Task Force, a standards organization focused on the development of the standards that underlie the overall operation of the Internet.
ISO The International Organization for Standardization, an international body made up of national standards bodies and the largest developer of voluntary international standards.

J

JTC Joint Technical Committee.

K

L

M

METS The Metadata Encoding and Transmission Standard, an XML language for representing metadata to describe the structure, content, and management of digital objects within a digital library or digital repository.
MODS The Metadata Object Description Schema, a specification for representing bibliographic data in digital form.

N

Natural language processing, NLP A research domain concerned with the automated processing and recognition of human (i.e. natural rather than machine) language.
NEH The National Endowment for the Humanities, a US federal funding agency focused on supporting research and education in the humanities.
NISO The National Information Standards Organization, a member-based standards organization that has been accredited by ANSI.
NSF The National Science Foundation, a US federal funding agency focused on supporting research and education in the sciences.

O

OAI-PMH The Open Archives Initiative Protocol for Metadata Harvesting, a protocol for collecting metadata records from participating archives, which expose their metadata in a standard format.
OCR, optical character recognition An automated process of transforming image data into text data by identifying shapes and mapping them onto alphanumeric characters. OCR technology is commonly used as an early step in the digitization process, to produce searchable text from digitized page images.
Ontology A formal representation of a system of knowledge as a set of explicitly defined concepts and relationships between them. In digital form, ontologies are used as a way of providing a public, shared semantics for the key concepts being used for research or analysis.
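
To make the OAI-PMH entry above more concrete: a harvester issues ordinary HTTP requests to an archive's base URL, naming a protocol verb and a metadata format. The Python sketch below only builds such a request URL and does not contact any server. The base URL is a hypothetical placeholder; `ListRecords` and `oai_dc` are the standard verb and Dublin Core metadata prefix defined by the protocol.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used here for illustration only.
BASE_URL = "https://repository.example.org/oai"

def build_harvest_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL for the given endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"

harvest_url = build_harvest_url(BASE_URL)
print(harvest_url)
```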

P

Provenance Information about the origin, location, and ownership history of an object or resource.

Q

R

Resources In the context of the web, resources are addressable units of information (addressed through Uniform Resource Identifiers or URIs).

S

SDO Standards Developing Organization, a term often used for formal standards organizations such as ISO.
Service-oriented architecture, SOA An approach to software engineering in which applications are designed as sets of interoperable, reusable services that communicate through clearly defined interfaces and protocols, rather than as monolithic systems.
Stack, software stack, technology stack A set of software components designed as separable layers, each one handling a distinct set of characteristic functions and interacting with its adjacent layers through a set of standard inputs and outputs. By extension, in phrases like “the XML stack” or “the publication stack”, the term is used more informally to describe a layered set of components in which data flows “upward” through successive stages of processing, each of which passes its output on to the next stage in a specified way. Ideally, the components in a stack are cleanly separable so that individual components may be replaced with functionally equivalent pieces (taking the same kind of input and delivering the same kind of output) without requiring alteration of the rest of the stack.
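
The idea of cleanly separable layers in the stack entry above can be sketched in a few lines of Python. This is a toy illustration under simple assumptions, not a model of any real publication stack: each stage is a hypothetical function that takes a string and returns a string, so any stage can be swapped for a functionally equivalent one without changing the rest.

```python
from typing import Callable, List

# Every stage accepts text and returns text, which keeps the layers separable.
Stage = Callable[[str], str]

def run_stack(stages: List[Stage], data: str) -> str:
    """Pass data upward through each stage in order."""
    for stage in stages:
        data = stage(data)
    return data

def normalize(text: str) -> str:
    # Collapse runs of whitespace into single spaces.
    return " ".join(text.split())

def to_upper(text: str) -> str:
    return text.upper()

def to_title(text: str) -> str:
    return text.title()

def wrap(text: str) -> str:
    # Final presentation layer: wrap the text in a paragraph element.
    return f"<p>{text}</p>"

original = run_stack([normalize, to_upper, wrap], "  hello   world ")
# Replace the middle layer only; the other layers are untouched.
swapped = run_stack([normalize, to_title, wrap], "  hello   world ")

print(original)
print(swapped)
```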

T

TEI, TEI P5 The Text Encoding Initiative, an international consortium that develops and maintains a community standard, the TEI Guidelines for Electronic Text Encoding and Interchange. The TEI Guidelines define a markup language, expressed in XML, for representing humanities textual data in digital form. The P5 version, released in 2007, is the current version of the Guidelines.

U

Unicode An international character encoding standard, fully synchronized with the ISO standard ISO/IEC 10646.
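
As a small illustration of the Unicode entry above, the Python sketch below looks up the code point and standard name of one character and shows how the UTF-8 encoding represents it as bytes. The choice of character is arbitrary.

```python
import unicodedata

char = "\u00e9"  # the letter e with an acute accent

code_point = ord(char)               # the character's numeric code point
name = unicodedata.name(char)        # its official Unicode character name
utf8_bytes = char.encode("utf-8")    # its byte representation in UTF-8

print(f"U+{code_point:04X}", name, utf8_bytes)
```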

V

Versioning, version control A formal representation of the sequence of changes within a digital file; a system for tracking and managing such changes explicitly so as to avoid accidentally replacing a current file with an obsolete previous version, and so as to permit comparison of different versions, reversion to an earlier state of the file, and similar actions.
Virtual research environment, VRE An online environment that brings together tools, data, and services to support researchers in carrying out their work, typically collaboratively and across institutions.

W

W3C World Wide Web Consortium, an international standards organization focused on the development of standards for the data and technologies underlying the World Wide Web.

X

XML The Extensible Markup Language, a metalanguage that is a profile or subset of the Standard Generalized Markup Language (SGML). It provides a standards-based method for defining markup languages (such as XHTML, TEI, METS).
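
To illustrate the XML entry above, the Python sketch below parses a tiny document with the standard library and reads back its elements and attributes. The element names (`poem`, `title`, `line`) are invented for this example and are not taken from TEI, METS, or any other real markup language.

```python
import xml.etree.ElementTree as ET

# A minimal, made-up XML document: nested elements with one attribute.
doc = (
    "<poem>"
    "<title>Example</title>"
    "<line n='1'>First line</line>"
    "<line n='2'>Second line</line>"
    "</poem>"
)

root = ET.fromstring(doc)

# Collect each line's number attribute together with its text content.
lines = [(line.get("n"), line.text) for line in root.findall("line")]

print(root.tag, root.findtext("title"))
print(lines)
```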

Y

Z