The Peaceful Coexistence of Print and Digital Media, with Special Consideration of Slavic, East European, and Eurasian Studies
Kevin S. Hawkins
I was asked to discuss the future of digital applications. I can offer a few predictions, but my opinions can be illuminated by examining the range of digital resources hosted at the University of Michigan University Library (focusing, for this presentation, on resources in the Slavic, East European, and Eurasian area). To understand these, it's helpful to understand Michigan's approach to digitization and print, which is informed by the mixed messages that librarians are receiving about print versus digital resources.
- Mixed messages about the future and value of print
- Policy toward print and digital resources
- Sample Slavic, East European, and Eurasian resources demonstrating this policy
- Predictions for the future
Mixed messages about the future and value of print
With all the talk of digitization, people often wonder about the fate of printed material. We’ve all been through these discussions with our colleagues and curious people not in the profession, and it seems everyone has an opinion on the question. A few common views:
- Some librarians will claim that print is a more stable preservation format than digital media, which are subject to changes in technology and degrade faster than paper.
- Besides professing an aesthetic love for the printed page, many of us refuse to read long texts off a computer screen (or microform reader), instead choosing to print out non-print resources.
- On the other hand, users will proudly admit that they can’t remember the last time they went to the library to find something. For these users, “if it’s not online, it doesn’t exist.” They might have visited the library a handful of times for a resource, and they will probably tell you how annoying it was that they needed to go there. These same people often print out things they find online (for easier reading), but they're used to locating resources online.
In academic libraries, there is a clear trend toward moving print items to remote storage to free up valuable space for individual and collaborative use of technology for research, teaching, and studying; however, there have been a number of false starts in initiatives to stop collecting print materials altogether.
Taking these mixed messages from users and librarians into account, I will assume that there is no one-size-fits-all solution for the switch to electronic resources. It’s obvious but worth restating. The usefulness of electronic resources and the necessity of print vary with the type of resource (reference, serial, or monograph), the type of user (college students, the general public, children, the elderly), and the type of institution (research library or general collection). I would say this is the basis for the approach taken at Michigan.
Policy toward print and digital resources
Michigan has long been a leader in digitization, making huge volumes of reformatted material available online for free, hosting subscription-based collections of reformatted and born-digital material, and publishing born-digital serials, monographs, and other digital projects. Even before Michigan’s collaboration with Google, digital was the default preservation format at Michigan. Digitized works in the public domain are made available online for free.
But Michigan also believes in giving people a choice in how they access texts. Users can purchase print-on-demand reprints of items from the Making of America, Historical Math, and ACLS History E-Book collections (for now). As Marcus Levitt said earlier at this conference, print will become a mode for a particular use. Michigan sees three audiences for print-on-demand:
- Users who don’t want to read an entire work online
- Users who want a print copy of the item even if they are willing to read it online
- Librarians who want a print copy of the item for their library (instead of directing users to the resource available online through Michigan)
I should point out that Michigan is not looking to make a profit from its sale of print-on-demand copies. We're still recovering start-up costs at this point, and once we break even on those, any surplus from print-on-demand sales will be reinvested in digitization.
Sample Slavic, East European, and Eurasian resources demonstrating this policy
Let me demonstrate some different types of resources available digitally and in print media. These were all created in digital form by Michigan or by Google through the Michigan Digitization Program (the Google partnership). Together they show various digitization methods and help illuminate my vision of the bibliographic future:
- Digitized items (page images, plus OCR for searching) available for purchase through print-on-demand:
- Scanned page images with bad OCR
- Cesty po Bulharsku (from the Travels in Southeastern Europe collection) — OCR software did not recognize Czech. (See OCR of the title page.) The text was either OCR'd using old software that doesn’t recognize Czech or without setting the software to recognize Czech. The software used to deliver digital content at Michigan automatically substitutes versions of a character with and without diacritical marks, allowing users to search by typing only base characters, so this isn't such a problem here. I don't know why we don't offer print-on-demand copies of this item.
- “The Reise” (from Cross Currents) — OCR software saw Polish “ł” as “t”. The OCR software wasn’t expecting non-English text here, or it just assumed it was a “t” anyway. Here the character mapping functionality (substituting characters with diacritics) will not help.
- «Новый міръ», available through Google and soon through Michigan’s delivery system [screenshot available now]. Both systems use the catalog record from Michigan, which has a typo in it: “l” instead of “i” in “Novyl mir”. [Since this presentation was given, this has been fixed.] Google originally used OCR software that recognizes only Latin characters. The resulting OCR text will soon be viewable through Michigan [screenshot available now], and it is what Google’s delivery system currently uses: see the results of a search for “pantaie” within this work. [Since this presentation was given, Google has reprocessed to recognize the Cyrillic.] Google is currently reprocessing scanned images with better OCR software that recognizes Cyrillic, so we can expect the OCR on this work and others to be replaced in the near future. Note that access to this title differs inside and outside the US due to the interpretation of copyright law by Google and Michigan. [Since this presentation was given, Google, like HathiTrust, has made this available from within the US.]
- Electronic text converted from electronic source
- “Encountering the Past: History at the Yugoslav War Crimes Tribunal” (from the Journal of the International Institute) — Real electronic text is more usable than page images or even OCR'd text because it’s usually more accurate and because paragraphs, headings, and other sections of text are tagged. But here you see “Meötrovic” should be “Meštrović”. The character mapping functionality will only help for the second incorrect character.
- Our conversion methods now better handle characters, but precise conversion depends on how they were input by the creator of the document.
- Electronic text keyboarded
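The character mapping mentioned in the examples above can be sketched roughly as follows. This is only an illustration in Python, not the software Michigan actually uses; the function names `fold` and `matches` are hypothetical. It assumes that "substituting versions of a character with and without diacritical marks" amounts to stripping combining marks after Unicode decomposition, which is one common way to implement such a feature.

```python
import unicodedata


def fold(text: str) -> str:
    """Reduce text to base characters (hypothetical helper).

    Decomposes each character (NFD), drops the combining marks,
    and case-folds the result.
    """
    decomposed = unicodedata.normalize("NFD", text)
    base_only = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base_only.casefold()


def matches(query: str, text: str) -> bool:
    """True if the folded query occurs in the folded text."""
    return fold(query) in fold(text)


# A user typing only base characters still finds the correct form.
print(matches("Mestrovic", "Meštrović"))   # True

# The mis-converted form is not found: "ö" folds to "o", not "s",
# so only the final "c" lines up with the correct spelling.
print(matches("Meštrović", "Meötrovic"))   # False
```

Folding of this kind cannot repair a wrong base letter, which is why it does not help when OCR reads Polish “ł” as “t”. Note also that “ł” has no Unicode decomposition, so this particular sketch would leave it unchanged; a real system would need an explicit mapping table for such letters.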
Predictions for the future
These are my personal predictions for the future, informed by my personal opinions and the thoughts of other talking heads like me:
- Younger users will be accustomed to reading off a computer screen and have no complaints about doing so, thereby lowering demand for print resources even further.
- Print-on-demand technology will become less expensive, so more publishers will use it to avoid warehousing large quantities of material.
- Librarians will feel more confident in the long-term viability of digital resources because:
- Preservation methods for locally created digital content will continue to mature.
- Collaboration among libraries, new contractual agreements, changes in copyright law, or some combination of these will guarantee access to previously acquired content, including content from cancelled subscriptions, and to previous versions of material put online.
- For new content, the electronic version will be seen as the archival, canonical version, with various print versions produced at different points in time using the technology available.
- Problems with non-English characters will soon be overcome as market pressures drive the adoption of newer software. Most significantly, I hope LC will eventually support detransliteration of catalog records for the Slavic and East European languages. Transliterated records are a barrier to use by non-specialists and impede discovery of cataloged resources made available through search engines. At the Slavic Librarians' Workshop on Thursday, Victor Gorodinsky mentioned that OCLC is interested in loading the detransliterated records Wisconsin has generated into WorldCat.
- Electronic paper, available in prototype as a sort of paper-thin digital version of an Etch-A-Sketch, will finally become commercially viable at high resolutions, making digital content as convenient as reading a print book for those who do still wish for the print-like experience.
- Print books will exist for a long time because they’re cheap and durable.
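To make the detransliteration prediction above concrete, here is a minimal sketch in Python of turning a romanized title back into Cyrillic. It is not the method used by LC, OCLC, or Wisconsin. The mapping table is a partial, simplified assumption loosely based on ALA-LC romanization for modern Russian, and the name `detransliterate` is hypothetical. It lowercases its input and ignores the ambiguities that real records contain, such as dropped ligatures and pre-reform orthography.

```python
import unicodedata

# Partial, illustrative table (an assumption, not a full romanization scheme).
# Multi-letter sequences must be matched before single letters.
ROMAN_TO_CYRILLIC = {
    "shch": "щ", "zh": "ж", "kh": "х", "ch": "ч", "sh": "ш",
    "\u012d": "й",  # i with breve
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е",
    "z": "з", "i": "и", "k": "к", "l": "л", "m": "м", "n": "н",
    "o": "о", "p": "п", "r": "р", "s": "с", "t": "т", "u": "у",
    "f": "ф", "y": "ы",
}

# Longest keys first, so "shch" wins over "sh" and "s".
KEYS = sorted(ROMAN_TO_CYRILLIC, key=len, reverse=True)


def detransliterate(romanized: str) -> str:
    """Greedy longest-match conversion of romanized text to Cyrillic."""
    text = unicodedata.normalize("NFC", romanized).lower()
    out = []
    i = 0
    while i < len(text):
        for key in KEYS:
            if text.startswith(key, i):
                out.append(ROMAN_TO_CYRILLIC[key])
                i += len(key)
                break
        else:
            # Unknown characters (spaces, punctuation) pass through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(detransliterate("Novy\u012d mir"))  # новый мир
```

Even this toy version shows why clean source records matter: a typo such as “Novyl mir” would be faithfully converted into the wrong Cyrillic word.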