Digitization of Existing Images



A digital image is a picture represented electronically as bits or bytes. It is an electronic snapshot taken of a scene or scanned from existing documents such as photographs, manuscripts, printed texts, and artwork.


The following pages provide guidance on preserving and presenting digital images. They discuss digitization quality, archival and presentation formats, and image metadata. It may help to first familiarize yourself with some of the terminology used.

Presentation or Archival Format

The first decision in image capture concerns the purpose of the images being created. Keep the end users, the required equipment, and the available storage in mind. Are the images simply for web delivery? See presentation format. Or are there preservation issues to consider? Then see archival format. Best practice for language documentation is to scan a high-quality archival image and then create smaller presentation and thumbnail versions from that image. The higher the quality required of the image, the higher the scanner settings (such as resolution) must be, and the more storage each image will need.
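
To make the trade-off between scan settings and storage concrete, the following sketch estimates the pixel dimensions and uncompressed size of a scan. The function name `scan_size` and the example values (an 8 x 10 inch photograph, 300 and 600 dpi, 24-bit color) are illustrative assumptions, not settings recommended by this page.

```python
def scan_size(width_in, height_in, dpi, bits_per_pixel=24):
    """Return (width_px, height_px, size_bytes) for an uncompressed scan.

    All names and defaults are hypothetical, chosen for illustration only.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each pixel takes bits_per_pixel bits; 8 bits make one byte.
    size_bytes = width_px * height_px * bits_per_pixel // 8
    return width_px, height_px, size_bytes


# Example: an 8 x 10 inch photograph scanned at two resolutions.
for dpi in (300, 600):
    w, h, size = scan_size(8, 10, dpi)
    print(f"{dpi} dpi: {w} x {h} pixels, about {size / 1_000_000:.1f} MB uncompressed")
```

Doubling the resolution quadruples the number of pixels, so storage needs grow much faster than the dpi figure suggests.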

Modern digital cameras make it comparatively easy to create digital images. If you wish to archive these images, be aware of which format the camera uses. Some cameras store images only in a compressed format, while others store both compressed and uncompressed versions; the uncompressed versions are considerably better for archiving.
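
One rough way to check what a camera has actually written is to look at the first bytes of the file, which identify the container format. The sketch below is a minimal example that assumes only three common formats matter; the function names are hypothetical. Note that it identifies the container only: a TIFF file can itself hold compressed data, and many camera raw formats are TIFF-based, so a "TIFF" result is a starting point rather than proof that the image is uncompressed.

```python
def detect_format(header):
    """Guess an image container format from the leading bytes of a file."""
    if header.startswith(b"\xff\xd8\xff"):
        return "JPEG"  # lossy compressed in typical camera output
    if header[:4] in (b"II*\x00", b"MM\x00*"):
        return "TIFF"  # little-endian or big-endian TIFF container
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"   # losslessly compressed
    return "unknown"


def detect_file_format(path):
    """Read the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as f:
        return detect_format(f.read(8))
```

For example, `detect_file_format("IMG_0001.JPG")` (a hypothetical file name) would report the container the camera used, regardless of the file extension.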

More on Presentation Format

More on Archival Format


If you wish to store your existing photographs, film, or paper materials on a computer, these resources need to be digitized. The process of digitization involves scanning, adjusting the image, and uploading.

More on Scanning

OCR or Keyboard?

Digitization of language documentation often involves conversion of paper records, such as lexicons, grammars, or narratives, to an electronic format. Scanning a document creates an image of the page, but in order to create a textual file that can be edited, OCR (Optical Character Recognition) software is needed.

More on OCR

Image Metadata

Metadata is information about resources. In this context, it is information about language resources: lexicons, audiotapes, transcribed texts, language descriptions, etc. It is analogous to card catalog information about library resources, in that it enables the discovery and retrieval of resources through standardized, machine-readable information. Metadata is becoming very important to the linguistics community, for it gives us the ability to find language resources in the vast and rapidly expanding realm of the Internet.
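
As an illustration of what "standardized, machine-readable information" can look like, the sketch below builds a small XML record for a scanned image. It assumes Dublin Core element names, which this page does not itself specify; the `record` wrapper element, the `build_record` helper, and all field values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set (an assumed schema).
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(fields):
    """Build a flat metadata record from (element_name, value) pairs."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Hypothetical description of one scanned page.
record = build_record([
    ("title", "Field notebook, page 12"),
    ("type", "Image"),
    ("format", "image/tiff"),
    ("language", "en"),
])
print(ET.tostring(record, encoding="unicode"))
```

Because every record uses the same element names, a catalog or search service can index many such records and retrieve resources by title, type, format, or language.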

More on Metadata

More information on image digitization is available in the school's reading room.

The content of this page was developed following the recommendations of the E-MELD working groups and the Electronic Text Center.
