Getting Found: Search Engine Optimization for Digital Repositories
Libraries and archives have been building digital repositories for over a decade and, taken together, have amassed collections of considerable size. Use of the scholarly and lay content in these databases depends on visibility in Internet search engines, but initial surveys conducted by the University of Utah across numerous libraries and archives have revealed a disturbing reality: the number of digital objects successfully harvested and indexed by search engines from our digital repositories is abysmally low. The reasons for the poor showing in Internet search engines are complex, and are both technical and administrative. Web servers may be configured incorrectly or may respond too slowly. Repository software may be designed or configured in a way that is difficult for crawlers to navigate. Metadata are often not unique or not structured as recognizable taxonomies, and in some cases search engines prefer other schemas. Search engine policies change, and some standards commonly accepted in the library community are not supported by some search engines. Google Scholar, for instance, has recently recommended against Dublin Core as a metadata schema in institutional repositories in favor of publishing industry schemas, a recommendation that comes as a shock to most librarians who learn of it. The problem lies less with search engines than with the content that search engines have to work with. The work proposed here will improve the way that content is presented so that search engines can parse, organize, and serve more relevant results to researchers and other users.
The search engine market is fluid and intensely competitive. While Google retains the majority of direct search engine traffic, Bing is gaining ground quickly, and social media engines are changing the face of search itself, putting more emphasis on content that is popular and frequently refreshed. These changes will further affect the visibility of the content in our digital repositories, and must be investigated.
With our formal partner, OCLC, Inc., and with help from informal partners the Digital Library Federation and the Mountain West Digital Library, we plan to expand our research, and then develop and publish a toolkit that will help libraries and archives make their database content more accessible and useful to search engines. The toolkit will include recommendations for web server administrators, repository software developers, and repository managers. It will also include reporting tools that help measure and monitor how effectively a repository achieves visibility in search engines; these metrics will in turn help administrators demonstrate the value proposition of their repositories. The sea of information available on the Internet is constantly growing, and library and archival content risks invisibility. We believe search engine optimization for digital repositories is a real and crucial issue that must be addressed, not only to improve our return on investment, but also to help us remain relevant in the age of electronic publishing.
This is the narrative of a funded proposal.