Super conference 2007 – William Wueppelmann, Electronic Systems Specialist, Canadiana.org
Overview of issues to consider prior to beginning a digitization project
I/Outsourcing vs Insourcing
Outsourcing
-equipment not required
-technical details are handled for you
-smaller financial commitment
-give up some control of final project
-still stuck with doing extra logistics work re metadata and storage
Insourcing
-requires an up-front commitment
-economical in the long run
-allows hands-on quality control and flexibility
-allows you to go back and redo
Don’t be afraid
-building a production workflow is challenging at first
-work iteratively
-break projects into series and steps
-be creative, original and flexible–think of the future
-useful to have someone else do a small batch of images to experiment with
Elements of Digitization
-scanning
-image processing and proofing
-OCR scanning and perhaps cleanup
also consider: 1) metadata 2) storage and preservation 3) content delivery
Scanning
-can be expensive; not difficult, but there is an art to it; don't be obsessive
-employ realistic standards based on requirements – no such thing as a completely accurate scan
-consider the nature of the source materials and what you want to do
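One way to make "realistic standards" concrete is to estimate what each choice costs in storage. The sketch below is only an illustration: the page size, resolutions and bit depths are assumed example values, not recommendations from the session, and the figures are for uncompressed images.

```python
def uncompressed_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the uncompressed size of one scanned page in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed example: an 8.5 x 11 inch page at three hypothetical settings.
settings = [(300, 1, "bitonal"), (300, 8, "greyscale"), (600, 24, "colour")]
for dpi, bits, label in settings:
    size = uncompressed_scan_bytes(8.5, 11, dpi, bits)
    print(f"{dpi} dpi {label}: {size / 1_000_000:.1f} MB per page")
```

Multiplying the per-page figure by the number of pages in the project gives a rough idea of how much storage a given standard commits you to.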
OCR
-needed for keyword searching
-manual proofing and correction is a major investment
-OCR is not difficult — lots of packages available
-consider your requirements – what input and output formats are needed
-can redo later when technology improves
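Before committing to manual proofing, it can help to measure how far raw OCR output is from a corrected version on a small sample. The sketch below uses Python's standard difflib module for a rough similarity score. The sample strings are invented, and the score is a general text similarity ratio, not a formal OCR accuracy metric.

```python
import difflib


def ocr_similarity(ocr_text, proofed_text):
    """Return a 0..1 similarity ratio between raw OCR text and a proofed version."""
    matcher = difflib.SequenceMatcher(None, ocr_text, proofed_text, autojunk=False)
    return matcher.ratio()


# Invented sample: raw OCR output next to the hand-corrected text.
raw = "Tbe quick brovvn fox"
proofed = "The quick brown fox"
print(f"similarity: {ocr_similarity(raw, proofed):.2f}")
```

If the score on a representative sample is already high enough for keyword searching, full manual correction may not be worth the investment.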
Metadata
-must have!!!
-easier to capture initially than add later
-biggest problem is putting it all together
-use a flexible format that can be extended or transformed as needed
-some metadata pertains to individual documents, some to group
-master format for preservation, second format for production
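As a rough illustration of group-level versus document-level metadata, the sketch below keeps shared values in one record and merges them into each document's record when needed. The field names and values are hypothetical, and JSON is used only as one example of a format that is easy to extend or transform.

```python
import json

# Hypothetical group-level record shared by every document in a series.
series_metadata = {
    "collection": "Example Pamphlets",
    "language": "en",
    "rights": "public domain",
}

# Hypothetical document-level record: only what is specific to this item.
document_metadata = {
    "id": "doc0001",
    "title": "An Example Title",
    "language": "fr",  # overrides the series default
    "pages": 24,
}


def merged_record(group, document):
    """Combine group and document metadata; document values win on conflict."""
    record = dict(group)
    record.update(document)
    return record


record = merged_record(series_metadata, document_metadata)
print(json.dumps(record, indent=2, sort_keys=True))
```

Keeping the two levels separate in the master copy means a correction to the series record only has to be made once.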
Storage & Preservation
-need two copies
-data must be validated periodically
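Periodic validation is commonly done with checksums: record a checksum for every file once, then recompute and compare on a schedule. The following is a minimal sketch using only the Python standard library. The function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the relative paths that are missing or whose checksum changed."""
    root = Path(root)
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

A typical use would be to save the manifest alongside the master images, run verify against each stored copy from time to time, and restore any file it reports from the other copy.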
Content delivery
-traditional method is website or database
-typically presents derived images
-don't be more elaborate than necessary
-don't tie your images or metadata to one specific application – can migrate to a more elaborate system later
Workflow
-logistics is one of the biggest challenges
-sketch out main steps and develop iteratively
-scripting/programming to automate parts of the process is useful – reduces error rate
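As one small example of the kind of scripting meant here, the sketch below checks a batch of scanned files for gaps in the page sequence, a mistake that is easy to make by hand and tedious to spot by eye. The file naming scheme (document id, underscore, four-digit page number, .tif) is a hypothetical convention assumed for the example.

```python
import re

# Assumed naming convention: <docid>_<4-digit page>.tif, e.g. doc0001_0003.tif
PAGE_PATTERN = re.compile(r"^(?P<doc>[A-Za-z0-9]+)_(?P<page>\d{4})\.tif$")


def missing_pages(filenames):
    """Return {doc_id: [missing page numbers]} for gaps in each page sequence."""
    pages = {}
    for name in filenames:
        match = PAGE_PATTERN.match(name)
        if match:
            pages.setdefault(match["doc"], set()).add(int(match["page"]))
    gaps = {}
    for doc, found in pages.items():
        expected = set(range(1, max(found) + 1))
        missing = sorted(expected - found)
        if missing:
            gaps[doc] = missing
    return gaps


batch = [
    "doc0001_0001.tif",
    "doc0001_0002.tif",
    "doc0001_0004.tif",
    "doc0002_0001.tif",
]
print(missing_pages(batch))
```

Note that this check cannot detect pages missing from the end of a document. That would require the expected page count, for instance from the metadata.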
Final observations
-don't be afraid to experiment
-don't assume it's too hard
-when all else fails – do something