Larry Medina

For the Record

Conversion is NOT preservation!

There have been many articles that improperly indicate that the scanning of paper documents represents "preservation" of information; one of the most recent is here. While there is some truth to the claim that conversion, scanning, image capture, or however it's described is PART of a "preservation strategy", it does not on its own represent preservation.

Paper, while not exactly a search-friendly medium for storing information, has a rather strong history as a means of preserving it. Properly indexed and supported by sufficient finding aids, it satisfies the need to access the information contained on it. Stored in proper environmental conditions and protected appropriately from fire, flood, and other hazards, it has been known to stand the test of time, lasting in many cases more than 300 years and remaining both viable and legible. One notable case of a major conversion effort was the Domesday Project in the UK, where a historical work dating back to 1089 was converted to electronic format and, in the process, a large portion of what was converted was lost and had to be recaptured. A read of this page gives users a decent understanding of the requirements to successfully capture the complete text for a 1:1 conversion.

What this conversion can successfully do is provide access to the content of the paper to multiple users simultaneously, even if they are in disparate locations. In some cases there is value in that; however, a needs analysis and a cost-benefit analysis of the effort should be performed first. If the information is infrequently accessed, or it has a short required retention period, it may not be cost-effective to convert it at all. Conversion isn't a matter of simply plopping the paper into a sheet feeder, setting a machine to automatically image the sheets, and walking away. Quality control must be performed to ensure the images are captured in a complete and legible fashion, each sheet (or batch) must be indexed and supported by sufficient metadata to allow searching for the images, and there must be sufficient funding available for periodic conversion and migration to avoid technological obsolescence and media degradation.

Scanning paper can have its benefits, but it is NOT on its own preservation of information.

What People Are Saying

I've heard something like

I've heard something like that somewhere else... and... there's no solvation... ;)

Guys, if honesty... there's

Guys, if honesty... there's no way to protect information from anyone or anything...

Data on paper or on

Data on paper or on an electronic device is only useful if we can use it for decision making.

Very nice site. Thanks for

Very nice site. Thanks for the help. I will tell my friends in Kansas City about you.

And what should we do in

And what should we do in this situation? How do we save information, or is it impossible in general?!

Nothing is impossible...

Nothing is impossible... some things are impractical, others are simply illogical, and others yet are just expensive... but organizations need to budget for them if they make this "leap of faith".

The intent of this post wasn't to suggest an organization "DO NOTHING", but rather to ensure they take into account the required retention for information being considered for conversion, and the risks of storing it electronically because of media degradation and format instability. If the retention is regulated and there is a legal requirement to provide persistent access to the content for decades, or even for a period greater than 100 years, then sufficient steps need to be taken to ensure this happens.

When most organizations undertake a conversion project that involves scanning thousands of source documents, they don't consider that images required for longer periods of time need to be handled differently from those that have relatively short-term value. Some administrative documents may only have a required retention of 2 years, and very little business value beyond that. Maintaining the ability to access these is not an issue; the same is true for records that have a retention period of up to, say, 10 years.

However, other images with extended retention requirements, or research value that goes out for much longer periods of time need to be handled in a substantially different manner. This can involve different steps in the original capture and storage, production of multiple sets and storing one in an environmentally safe manner to ensure longevity, or establishing a schedule for review, conversion and migration to avoid obsolescence (or degradation) of format, media, or the applications used to access them.
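As a rough illustration of the distinction drawn above, the decision could be sketched in Python along these lines. The names `RecordSeries` and `handling_plan`, and the 10-year cut-off, are assumptions made up for this sketch (the cut-off is borrowed loosely from the examples given here); an actual retention schedule would set its own rules.

```python
from dataclasses import dataclass

# Assumed cut-off, taken loosely from the examples above: records kept for
# up to about 10 years are treated as short term. This is not a standard.
SHORT_TERM_MAX_YEARS = 10


@dataclass
class RecordSeries:
    """Hypothetical description of a group of scanned records."""
    name: str
    retention_years: int            # required retention period, in years
    has_research_value: bool = False


def handling_plan(series: RecordSeries) -> list[str]:
    """Return illustrative handling steps for a scanned record series."""
    short_term = series.retention_years <= SHORT_TERM_MAX_YEARS
    if short_term and not series.has_research_value:
        # Short-lived images: maintaining access is not expected to be an issue.
        return ["standard capture and storage"]
    # Long-lived images: the additional steps named in the paragraph above.
    return [
        "adjust the original capture and storage steps for longevity",
        "produce multiple sets and store one in an environmentally safe manner",
        "schedule periodic review, conversion and migration",
    ]


if __name__ == "__main__":
    for s in (RecordSeries("administrative memos", 2),
              RecordSeries("regulated records", 100)):
        print(s.name, "->", handling_plan(s))
```

The point of the sketch is only that the retention period and research value should be known before scanning starts, so the long-term steps can be budgeted up front.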

The Domesday Project dates

The Domesday Project dates from 1986 not 1089, was created in electronic format, has been partially converted to web format and no data was ever lost in the process.

Domesday Project

Sorry Adrian, but your facts are severely lacking. The original Domesday Book was created during the reign of William the Conqueror to find out how much England had after he had conquered it. It was created in 1086. Let's just say it was an inventory.

The Domesday Project was an attempt to create a modern digital version. It was a BBC project for the 900th anniversary of the original Domesday Book. This link will take you to the website of the project.

http://atsfcouk.demonweb.co.uk/dottext/domesday.html

In 2002 the English newspaper The Observer reported: "It was meant to be a showcase for Britain's electronic prowess - a computer-based, multimedia version of the Domesday Book. But 16 years after it was created, the £2.5 million BBC Domesday Project has achieved an unexpected and unwelcome status: it is now unreadable.

The special computers developed to play the 12in video discs of text, photographs, maps and archive footage of British life are - quite simply - obsolete."
http://www.guardian.co.uk/uk/2002/mar/03/research.elearning

This website documents the efforts taken to recover the information
http://www.ariadne.ac.uk/issue36/tna/

Well, not exactly Adrian.

Well, not exactly Adrian. The Domesday Book that was converted dates from 1089, and that was mentioned for the purpose of indicating the viability of paper as a storage medium not requiring any conversion or migration, and providing persistent access for nearly 1000 years.

And you're right about the dates of the project, but not about there being no data losses during the project. Initial scans were stored and the media they were on degraded prior to conversion to a web-based resource, and they were required to rescan a portion of the content at that time... had they accepted the "logic" of the IT folks involved, they would have considered discarding the source materials once they were scanned. Similar advice is routinely provided now to organizations scanning records that have a required retention of 100+ years. These organizations have not budgeted to cover the costs of periodic conversion or migration of the digital images, so what happens to them?

agree it!

agree it!