A recent research project led by the Book Industry Study Group in collaboration with KU Research, the Educopia Institute, and researchers from the University of Michigan and University of North Texas Libraries identified the challenges in understanding the usage of open-access (OA) scholarly ebooks, suggested some opportunities for resolving them, and created a framework for future action through community consultation. The project proposed the potential development of a “data trust” as a vehicle to manage the multiple data sets that are key to understanding OA ebook usage while respecting commercial and individual user concerns.
A data trust operates as an independent intermediary among industry stakeholders, compiling and analyzing data on behalf of trust members.1 Members of a data trust for OA monograph usage data would agree to make their data available to others who are members of the trust. Members would access normalized data through a user-specific dashboard or interface, while the trust would provide benchmarking data in a manner that respects contributor confidentiality and privacy. The data trust could also allow certain anonymized data to be extracted, typically through an agreed-upon API, for independent analysis.
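The benchmarking role described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a design proposed by the project: the record layout, the member names, the `benchmark_by_month` function, and the `MIN_CONTRIBUTORS` threshold are all hypothetical. It shows one simple way a trust might publish an aggregate figure only when enough members have contributed that no single contributor's numbers can be inferred.

```python
from collections import defaultdict
from statistics import median

# Hypothetical normalized usage records: (member, title_id, month, downloads).
RECORDS = [
    ("press_a", "book-1", "2019-01", 120),
    ("press_b", "book-2", "2019-01", 80),
    ("press_c", "book-3", "2019-01", 200),
    ("press_a", "book-4", "2019-02", 50),
    ("press_b", "book-5", "2019-02", 70),
]

# Assumed confidentiality rule: suppress a benchmark when fewer than this
# many members contributed data for the period.
MIN_CONTRIBUTORS = 3


def benchmark_by_month(records, min_contributors=MIN_CONTRIBUTORS):
    """Return the median of per-member download totals for each month.

    A month maps to None when too few members contributed, so that no
    individual member's figures can be inferred from the benchmark.
    """
    totals = defaultdict(lambda: defaultdict(int))  # month -> member -> downloads
    for member, _title_id, month, downloads in records:
        totals[month][member] += downloads

    benchmarks = {}
    for month, per_member in totals.items():
        if len(per_member) >= min_contributors:
            benchmarks[month] = median(per_member.values())
        else:
            benchmarks[month] = None  # suppressed to protect contributors
    return benchmarks


if __name__ == "__main__":
    print(benchmark_by_month(RECORDS))
```

In this sketch a member's dashboard could show its own totals next to the benchmark for each month, while months with too few contributors show no benchmark at all. A real trust would need governance decisions about the threshold and about which statistics are safe to release.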
Comprehensive access to usage data for OA monographs has the potential to provide all stakeholders in scholarly communication—from scholars and their institutions to publishers, content aggregators and platforms, and research funders—with valuable strategic insight into how and where OA books are being used. The ability to benchmark and understand usage data in the context of wider patterns and trends depends on access to aggregate data from multiple stakeholders; individual parties are unlikely to have this kind of access. Furthermore, a data trust helps lower the cost in staff expertise and resources for individual stakeholder organizations to engage in data analytics.
Successful collaboration around data sharing requires thoughtful engagement with issues of trust between stakeholders, the development of shared technical standards, and the development of requirements for the validation of data and information. This is a classic collective-action problem. Its solution, therefore, requires the development of a trusted framework for coordination between all the relevant stakeholders. Our recommendations address these aspects of successful collaboration.
Relevant research and initiatives around OA ebook usage are currently conducted separately in the United States and Europe, by both for-profit and nonprofit entities. The HIRMEOS project in Europe (part of the broader OPERAS framework) has been particularly influential. Coordinating or connecting those efforts, as well as improving our understanding of needs in other regional markets, is a priority for future efforts.
The key recommendations for future work are the following:
1. Define the governance and architecture for the data trust and articulate priorities.
2. Create a pilot service that implements the defined governance and architecture.
3. Implement and extend relevant open-source technologies across a base of stakeholders in the US.
4. Develop personas and use cases that demonstrate who benefits from OA monograph usage information and how a data trust can better serve their needs.
5. Build engagement across multiple markets.
6. Better document the supply chain for OA monographs.
This white paper provides detail on work to date and these recommendations.
This white paper was prepared by the Book Industry Study Group (BISG) as part of the Andrew W. Mellon Foundation-funded project Understanding OA Ebook Usage: Toward a Common Framework. The primary authors are Brian O’Leary (BISG) and Kevin Hawkins (University of North Texas). The project team, which contributed editing and improvements, includes Charles Watkinson (University of Michigan), Lucy Montgomery (Curtin University/KU Research), Cameron Neylon (Curtin University/KU Research), and Katherine Skinner (Educopia Institute). Copyright for this white paper is held by BISG, and the paper is licensed to the general public under a Creative Commons Attribution 4.0 International license.