Preservation of Word-Processing Documents
Authors
Barnes, Ian
Publisher
Australia: Australian Partnership for Sustainable Repositories (APSR)
Abstract
Word-processing documents are a major problem for digital repositories. As I will explain below, they are not
suitable for long-term storage, so they need to be converted into an archival format for preservation. In this
report I will address the following questions:
• What file formats are suitable for long-term storage of word-processed text documents?
• How can we convert documents into a suitable archival format?
I also address the related non-technical question:
• How can we get authors to convert and deposit their work?
While the vast majority of material generated by universities is text, most research on digital preservation
concentrates on images, sound recordings, video and multimedia. You could be forgiven for thinking that this is
because text is simple, but unfortunately that’s not so. Even relatively short text documents (like this one) have
a complex structure consisting of sections (parts, chapters, subsections, etc.) and also of indented structures like
lists and blockquotes. A significant part of the meaning is lost if that structure is ignored (for example, by saving
as plain text).
Keywords
Digital Preservation, Digital Curation, Digital Stewardship, Digital Sustainability, Data Sharing, Data Preservation, Digitisation, Digitization, DocBook, TEI, Text Encoding Initiative
Type
Working/Technical Paper
Date created
2006-07
Access Statement
Open Access