Digitization of Medieval and Modern History of the British Isles for the Institute of Historical Research, University of London, UK
Project: XML Conversion – TEI XML Conforming to the TEI P5 Standard
Industry: University Institute
Client: Institute of Historical Research, University of London, UK
British History Online (http://www.british-history.ac.uk/) digitises rare and valuable printed primary and secondary sources on the medieval and modern history of the British Isles. Like many digitisation projects (including Google's), the client understood that these manuscripts could not be digitized with OCR, because their typefaces and fonts were not readable by modern OCR software. A more difficult and labour-intensive approach to conversion therefore had to be adopted.
The project, accredited and funded by the Institute of Historical Research, University of London, UK, was to digitize rare and valuable printed documents on the medieval and modern history of the British Isles and convert them into XML format.
SunTec was involved in the third phase of the British History digitization, covering 300 Calendars of State Papers over a 12-month period in 2008. British History Online proposed sending us 7 titles from the series per month for 12 months. Each title was approximately 680 pages long.
- Considering the typeface, font, and layout of these rare titles, text conversion using OCR was not a feasible option.
- The challenge was to deliver a very high-quality transcription (99.995% accuracy) by applying a double-keying approach to the digitization of rare manuscripts.
- Scan, Control, and Notes files for each publication were uploaded to our FTP server. The documents were scanned at 400 dpi and delivered as either greyscale or bitmap images.
- A client-specified DTD, containing detailed instructions for handling the individual typography of the book, had to be followed.
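The double-keying approach mentioned above can be illustrated with a minimal Python sketch. It assumes that two operators key the same page independently and that any disagreement between the two passes is flagged for review; the function names, the sample strings, and the reading of 99.995% as at most 5 wrong characters per 100,000 are illustrative assumptions, not a description of the actual production workflow.

```python
from difflib import SequenceMatcher

# Assumption: a 99.995% target means at most 5 wrong characters per 100,000.
ERRORS_PER_100K = 5


def keying_discrepancies(first_pass: str, second_pass: str):
    """Return the spans where two independent transcriptions disagree."""
    matcher = SequenceMatcher(None, first_pass, second_pass, autojunk=False)
    return [
        (tag, first_pass[i1:i2], second_pass[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only the mismatching spans
    ]


def max_errors_allowed(char_count: int) -> int:
    """Largest number of wrong characters that still meets the target."""
    return char_count * ERRORS_PER_100K // 100_000


if __name__ == "__main__":
    # Hypothetical sample: the second operator mistyped one character.
    pass_a = "Calendar of State Papers"
    pass_b = "Calender of State Papers"
    for tag, left, right in keying_discrepancies(pass_a, pass_b):
        print(f"{tag}: {left!r} vs {right!r}")
    print(max_errors_allowed(20_000))
```

Two operators are unlikely to make the same mistake at the same position, so comparing the passes surfaces most keying errors without anyone proofreading every character against the scan.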