Spanning Today’s Chasms: Seven Steps to Building Trusted Data Intermediaries

Photograph of the Brooklyn Bridge's cables and tower. Brooklyn Bridge, 2015. Credit: LEEROY/Pexels.

Senior Fellow James Shulman on philanthropy’s role as “social venture capitalist.”

In 2001, when hundreds of individual colleges and universities were scrambling to scan their slide libraries, The Andrew W. Mellon Foundation created a new organization, Artstor, to assemble a massive library of digital images from disparate sources to support teaching and research in the arts and humanities. 

Rather than encouraging (or paying for) each school to scan its own slide of the Mona Lisa, the Mellon Foundation created an intermediary organization that would balance the interests of those who created, photographed, and cared for artworks, such as artists and museums, and those who wanted to use such images for the admirable calling of teaching and studying history and culture. This organization would reach across the gap that separated these two communities and would respect the interests of both sides, while helping each accomplish its mission. At the same time that Napster was using technology to facilitate the unbalanced transfer of digital content from creators to users, the Mellon Foundation set up a new institution aimed at respecting the interests of one side of the market and supporting the socially desirable work of the other.

As the internet has enabled the sharing of data across the world, new intermediaries have emerged as entire platforms. A networked world needs such bridges: think Etsy or eBay sitting between sellers and buyers, or Facebook sitting between advertisers and users. While intermediaries that match sellers and buyers of things provide a marketplace to bridge from one side to the other, aggregators of data work in admittedly more shadowy territories.

In the many realms that market forces won’t support, however, a great deal of public good can be done by aggregating and managing access to datasets that might otherwise continue to live in isolation. Whether due to institutional sociology that favors local solutions, the technical challenges associated with merging heterogeneous databases built with different data models, intellectual property limitations, or privacy concerns, many datasets are built and maintained by independent groups and stay separate, even though, if networked, they could be used to further each other’s work.

Think of those studying coral reefs, or those studying labor practices in developing markets, or child welfare offices seeking to call upon court records in different states, or medical researchers working in different sub-disciplines but on essentially the same disease.  What intermediary invests in joining these datasets?  Many people assume that computers can simply “talk” to each other and share data intuitively, but without targeted investment in connecting them, they can’t.  Unlike modern databases that are now often designed with the cloud in mind, decades of locally created databases churn away in isolation, at great opportunity cost to us all. 

Art history research is an unusually vivid example. Most people can understand that if you want to study Caravaggio, you don’t want to hunt and peck across hundreds of museums, books, photo archives, libraries, churches, and private collections.  You want all that content in one place—exactly what Mellon sought to achieve by creating Artstor.
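To make the point about computers not simply “talking” to each other a little more concrete, here is a minimal sketch of the kind of mapping work an intermediary has to invest in. Everything in it is hypothetical and invented for illustration: the two source records, their field names, the `ImageRecord` schema, and the `normalize` and `search_by_artist` helpers. It is not a description of how Artstor was actually built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical shared schema an intermediary might settle on."""
    artist: str
    title: str
    holder: str


# Two invented source rows: same kind of information, different field names
# and different capitalization conventions.
museum_row = {
    "creator": "Caravaggio",
    "work_title": "The Calling of Saint Matthew",
    "institution": "Museum A",
}
archive_row = {
    "ARTIST_NAME": "CARAVAGGIO",
    "TTL": "Supper at Emmaus",
    "SRC": "Photo Archive B",
}

# Per-source field maps (assumed): someone has to write and maintain these.
FIELD_MAPS = {
    "museum": {"artist": "creator", "title": "work_title", "holder": "institution"},
    "archive": {"artist": "ARTIST_NAME", "title": "TTL", "holder": "SRC"},
}


def normalize(source: str, row: dict) -> ImageRecord:
    """Translate one source row into the shared schema."""
    field_map = FIELD_MAPS[source]
    return ImageRecord(
        artist=row[field_map["artist"]].strip().title(),
        title=row[field_map["title"]].strip(),
        holder=row[field_map["holder"]].strip(),
    )


def search_by_artist(records: list, artist: str) -> list:
    """Search the combined collection by a normalized artist name."""
    wanted = artist.strip().title()
    return [r for r in records if r.artist == wanted]


combined = [normalize("museum", museum_row), normalize("archive", archive_row)]
caravaggio = search_by_artist(combined, "caravaggio")
```

Even in this toy version, one search reaches both sources only because someone wrote a field map for each of them. Multiply that by hundreds of contributors with their own conventions and the need for a dedicated intermediary becomes easier to see.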

What did we learn in creating Artstor that might be distilled as lessons for others taking on an aggregation project to serve the public good?

The work of assembling the Artstor collections—described in general terms—included:

  1. Identifying the various constituencies within content-owning institutions who could make or block the decision to share institutional content, understanding and respecting their needs, and determining the right approaches to building and sustaining trusted relationships; 
  2. Making and strengthening value propositions encouraging those decision-makers and their institutions to place their content with the intermediary organization;
  3. Devising pragmatic legal and intellectual property regimes for doing this in ways that respected the concerns of the creators and cared for the digital content;
  4. Working out practical technology solutions for carrying out the work, depending upon institutionally based partners who had their own work to do, their own priorities, and their own standards;
  5. Ensuring that users got enough of what they needed to satisfice—maybe not everything they wanted, but enough to do their work;
  6. Creating a business model to support the operating costs of the maintenance of, and controlled access to, the aggregated content;
  7. Extending the use and outputs of the data held within the Trusted Data Intermediary in accordance with changing exogenous opportunities and evolving norms, while sustaining the trust of the content providers.

In creating Artstor, the Mellon Foundation acted as a social venture capitalist, investing significant funds in the creation of an organization that would eventually be self-supporting but that needed to build infrastructure and respectful relationships long before it would have sufficient subscribers. As a missing bridge in an emerging network, an intermediary organization tends to require philanthropic seed capital in lieu of traditional gifts from donors. Providing the critical, gap-filling infrastructure needed to launch these intermediary organizations isn’t generally of interest to traditional donors, who are more inclined to give to existing entities to which they have a sentimental attachment. Artstor, for example, served the interests of colleges, universities, schools, and museums, but the donors who were attached to those end-node institutions were not likely to support the shared plumbing that would connect them.

The Mellon Foundation made a “big bet” on Artstor, based on the belief that spending over $40 million to create a solution that had the potential to serve thousands of institutions was a better investment than the sector itself spending $1 million dozens or hundreds of times to create multiple, partially redundant solutions. The Trusted Data Intermediary (TDI) would provide better data services for various uses at far less collective cost, and would respect the needs and concerns of those who created, maintained, and provided the content.

What other examples do you know of similar aggregation agents?  What realm of research or public good do you believe needs such an aggregating intermediary?  Let me know.

In a future post, I’ll write about what happened after the bridge was built. We were trying to solve one very specific need, but by creating a bridge where there hadn’t been one (and not messing it up), we were eventually asked to play other roles for the community. Stay tuned for how such an intermediary, once established and trusted, could play new roles as the ecosystem around it evolved.


James Shulman was the founding President of Artstor (2001-2016). He recently co-led a workshop at the Digital Civil Society Lab (at the Stanford University Center for Philanthropy and Civil Society) on Trusted Data Intermediaries, and is currently serving as an Affiliate of the Berkman Klein Center for Internet & Society at Harvard University.