HT Ingest Service

From MPublishing

Revision as of 14:11, 24 October 2012 by Taftman

What is the Projects2HT service?

In order to ingest digitized books (or book-like volumes) into HathiTrust, each included image must have proper OCR, pageturner, and preservation metadata. Although books digitized by Google, the Internet Archive, and other vendors already have this metadata, locally digitized images may not. MPublishing has developed software and workflows to produce this metadata and to begin the ingest process on behalf of HathiTrust partners.


Frequently Asked Questions

1. How is pricing structured?

Fees are structured in two tiers:

1-100 books: $5.00/book

101+ books: $4.50/book


(The first 100 books incur a slightly higher fee to account for one-time setup tasks.)
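As a rough illustration of the arithmetic, the sketch below estimates a project total from the two tiers. It assumes the tiers are marginal, meaning the first 100 books are billed at $5.00 each and every book after that at $4.50; the page does not state this explicitly, so confirm with MPublishing before relying on it. The function and constant names are hypothetical, not part of any MPublishing tool.

```python
# Hypothetical fee estimator; names and the marginal-tier reading are assumptions.
TIER_ONE_LIMIT = 100   # number of books billed at the higher rate
TIER_ONE_CENTS = 500   # $5.00 per book, in cents to avoid float rounding
TIER_TWO_CENTS = 450   # $4.50 per book


def total_fee_cents(book_count: int) -> int:
    """Return the estimated total fee in cents for a project of book_count books."""
    if book_count < 0:
        raise ValueError("book_count must be non-negative")
    first_tier_books = min(book_count, TIER_ONE_LIMIT)
    second_tier_books = max(book_count - TIER_ONE_LIMIT, 0)
    return first_tier_books * TIER_ONE_CENTS + second_tier_books * TIER_TWO_CENTS


def format_dollars(cents: int) -> str:
    """Format a non-negative cent amount as a dollar string."""
    return f"${cents // 100}.{cents % 100:02d}"


if __name__ == "__main__":
    for count in (40, 100, 250):
        print(count, "books:", format_dollars(total_fee_cents(count)))
```

Under that reading, a 250-book project would come to 100 x $5.00 plus 150 x $4.50.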


2. Does MPublishing offer volume discounts for larger projects?

No, not at this time.


3. What is the turnaround time?

Normally, books should appear in HathiTrust within 4-6 weeks of initial delivery, provided that they meet the image specifications. [link]


4. Where can I see examples?

[Link to Utah State]


5. What are the steps in the process?

1. If needed, convert the PDF to bitonal and contone TIFFs.

2. Send bitonal TIFFs to OCR.

3. Add the needed preservation headers to the TIFFs and convert the contone TIFFs to JP2s.

4. Manually add necessary structural metadata for Pageturner, page by page.

5. Integrate the OCR, images, and Pagetag data into a package and pass it to Core Services for ingest.
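The order of the steps above can be sketched as a simple planning function. This is only an outline of the sequence, not MPublishing's actual software: the Volume class, its fields, and plan_steps are hypothetical names, and the sketch does not call any real OCR or image tool. The one branch it models is the conditional first step, which applies only when the source arrives as a PDF.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Volume:
    """Hypothetical description of one incoming book."""
    volume_id: str
    source_is_pdf: bool  # True if the partner delivered a PDF rather than TIFFs


def plan_steps(volume: Volume) -> List[str]:
    """Return the ordered processing steps for a volume, following the list above."""
    steps: List[str] = []
    if volume.source_is_pdf:
        # Step 1 is only needed when page images must be derived from a PDF.
        steps.append("convert PDF to bitonal and contone TIFFs")
    steps.append("send bitonal TIFFs to OCR")
    steps.append("add preservation headers to TIFFs")
    steps.append("convert contone TIFFs to JP2s")
    steps.append("add structural metadata page by page (manual)")
    steps.append("package OCR, images, and page tags for Core Services ingest")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(plan_steps(Volume("example-001", True)), start=1):
        print(number, step)
```

Step 3 of the list is split into two entries here because adding headers and converting to JP2 are separate operations on different files.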
