Choosing a Publishing Model
From MPublishing
Selecting the best publishing model for your materials can be challenging and will probably require some fine-tuning beyond the scope of this document. This page is meant to give you a sense of what the Scholarly Publishing Office (SPO) can do with the materials you are able to supply and which of those options will be most suitable for your finished product. Note: the models described below apply to how articles are delivered, not to whether articles are published in issues or one at a time.
- To help you choose a publishing model, review all of the formats in which your materials currently exist. Be as specific as possible about the media (e.g., paper or electronic files, specific electronic file formats) and the editing stage each format represents (e.g., if you have files in Quark, PDF, and InDesign formats, determine which ones contain final proofs):
Volumes, Issues, Etc. | Available format(s) that match final proofs |
_____________________ | ___________________________________________ |
_____________________ | ___________________________________________ |
_____________________ | ___________________________________________ |
_____________________ | ___________________________________________ |
_____________________ | ___________________________________________ |
- Review whether you need SPO to put your project online as a first-time publication or whether it already exists and you need SPO to serve as an online co-publisher. (Examples of the latter include journals that already exist in print but seek a parallel online edition, and materials published elsewhere online for which SPO will serve as a stable archive.)
The following “models” are intended to offer a glimpse of the approaches SPO takes to the more common publishing scenarios. Each model includes a link to an SPO publication that uses it, for reference.
Page images scanned from paper
SPO uses the page image model for publications with a large print backfile, the content of which is not available in electronic form. Users can "turn" pages in a sequence or jump directly to a given page number.
We process these images with OCR software to allow users to search the full text of a document. This software can operate on most languages written in the Latin script; however, it works best on unilingual texts. The word recognition rate will be much lower for texts in multiple languages, but the predominant language of the text will have a higher level of accuracy than the other languages.
Model A: Journal issue as unit
ex: http://quod.lib.umich.edu/b/basp/
Best for: publications that have a large print backfile and inconsistent article pagination.
When scanning a journal from paper, it's most efficient to scan an entire issue without attempting to determine article boundaries at the time of scanning. We can still provide links to individual articles, and the user will be able to turn pages ‘across’ articles. Because this model prevents SPO from implementing article-level access restrictions, we generally choose Model B instead when scanning from paper.
Model B: Journal article as unit
ex: http://quod.lib.umich.edu/m/mjcsl/ (before volume 8)
Best for: publications where new documents sent to SPO will also be split at the article level or where article-level access restrictions in the delivery system are critical.
In these cases it can be more worthwhile to scan articles separately so that each article becomes its own unit in the delivery system.
Model C: True Electronic Text
ex: http://quod.lib.umich.edu/w/wsfh/
Best for: instances where electronic source documents are available.
SPO generally prefers to publish true electronic text since it allows for hyperlinks, multimedia, and accurate searching of the full text based on the structure of the text. In addition, true electronic text allows the documents to be disseminated in various ways not tied to the print page. If the publishing partner provides PDF files, SPO can put these online as an alternative format for readers.
Model D: Page images from PDF files
ex: http://quod.lib.umich.edu/m/mjcsl/ (volume 8 to present)
Best for: cases where a publication includes many diagrams and figures that would be difficult to render in electronic text and/or cases where a publishing partner values precise page layout that cannot be consistently replicated online.
For some publications, we display page images but also have electronic text underneath that allows for more accurate searching. For this model, we need PDF files with selectable text, that is, text you can highlight when you open the PDF.
A note of caution: Using page images in this way comes with limitations.
- Text in more than one column can present problems for extraction of text.
- Our current software only allows extraction of text written in the Latin script, so non-Latin text will not be searchable by users.
- Extracting text from PDF files leads to a number of problems that decrease the accuracy of searching:
  - Words hyphenated across line breaks can't be automatically reconstructed into whole words.
  - Other words at line-breaks are often not followed by a space character, causing them to run together with the word on the next line after extraction.
  - One cannot search for phrases spanning pages, columns, and sometimes even lines.
We only use this model for publications where the journal article is the unit.
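The line-break problems listed above can be illustrated with a small Python sketch. This is not SPO's extraction software; the function names and the sample lines are invented for illustration, and the repair rule (treat every line-final hyphen as a split word) is an assumption chosen to show why automatic reconstruction is unreliable.

```python
def join_lines_naively(lines):
    """Concatenate extracted lines exactly as given, adding nothing at line breaks."""
    return "".join(lines)


def join_lines_with_repairs(lines):
    """Insert a space at each line break, and glue words split by a line-final hyphen.

    Assumption for illustration: every hyphen at the end of a line marks a
    word that was split for layout. That guess is wrong for real compounds.
    """
    text = ""
    for line in lines:
        line = line.strip()
        if text.endswith("-"):
            # Drop the hyphen and join the two halves into one word.
            text = text[:-1] + line
        elif text:
            # Restore the space that was missing at the line break.
            text += " " + line
        else:
            text = line
    return text


# Hypothetical lines as they might come out of a PDF text layer.
sample_lines = ["Our well-", "known schol-", "arly journals", "are online"]

naive = join_lines_naively(sample_lines)
repaired = join_lines_with_repairs(sample_lines)

# Naive joining breaks both single-word and phrase searches.
print("scholarly" in naive)       # False: stored as "schol-arly"
print("journals are" in naive)    # False: stored as "journalsare"

# The repair fixes those two searches but damages the real compound.
print("scholarly" in repaired)    # True
print("well-known" in repaired)   # False: stored as "wellknown"
```

The last line is the point of the example: a rule that removes line-final hyphens rejoins "scholarly" but also destroys the genuine hyphen in "well-known", so neither the raw nor the repaired text supports fully accurate searching.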
Combination of models
We often use one model for backfiles and another for new documents sent to SPO. Possible combinations include:
- Models A & C: ex: http://quod.lib.umich.edu/m/mdiag/
- Models B & C: ex: http://quod.lib.umich.edu/f/fs/ (access restricted to subscribing institutions) – This publication currently uses only Model C, but Model B content (for back issues) is forthcoming.
- Models B & D: ex: http://quod.lib.umich.edu/m/mjcsl/
For more information on these models, see the Scholarly Publishing Office whitepaper "Choice of DocEncodingType and encoding level for SPO publications."
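As a rough summary of the "Best for" notes on this page, the choice among Models A through D can be sketched as a small Python function. This is one reading of the descriptions above, not an SPO tool or policy; the function name, parameter names, and source labels are hypothetical, and cases the page does not address return "unclear".

```python
def suggest_model(
    source,
    layout_critical=False,
    article_access_control=False,
    articles_split_going_forward=False,
    consistent_pagination=True,
):
    """Suggest a model letter from the kind of source material available.

    source is one of the hypothetical labels "paper", "electronic_text",
    or "text_pdf" (PDF files in which the text can be highlighted).
    """
    if source == "electronic_text":
        # Model C: electronic source documents are available.
        return "C"
    if source == "text_pdf":
        # Model D: many figures, or precise page layout matters.
        # The page does not say what to do with PDFs otherwise.
        return "D" if layout_critical else "unclear"
    if source == "paper":
        if article_access_control or articles_split_going_forward:
            # Model B: article-level units or access restrictions are needed.
            return "B"
        if not consistent_pagination:
            # Model A: issue as unit when article pagination is inconsistent.
            return "A"
        # The page notes that Model B is generally chosen when scanning from paper.
        return "B"
    raise ValueError(f"unknown source type: {source!r}")


# A print backfile with inconsistent pagination and no access restrictions.
print(suggest_model("paper", consistent_pagination=False))       # A
# The same backfile when article-level access restrictions are critical.
print(suggest_model("paper", consistent_pagination=False,
                    article_access_control=True))                # B
# New content supplied as layout-sensitive PDFs with selectable text.
print(suggest_model("text_pdf", layout_critical=True))           # D
```

For a combination of models, the function would be called once for the backfile and once for new content, for example "paper" for back issues and "electronic_text" for new documents, giving a B and C pairing.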