From HLS, Harvard Law Today, December 12:
The new program aims to make public domain materials housed at Harvard Law School Library and other knowledge institutions available to train AI
In less than a decade, artificial intelligence has evolved from a promising idea to a fully functioning engine driving changes in how people live and work across the globe. Engines, of course, need fuel, and the vast quantities of data used to train AI are powering these online innovations.
At the Institutional Data Initiative (IDI), a new program hosted within the Harvard Law School Library, efforts are already underway to expand and enhance the data resources available for AI training. At the initiative’s public launch on Dec. 12, Library Innovation Lab faculty director Jonathan Zittrain ’95 and IDI executive director Greg Leppert announced plans to expand the availability of public domain data from knowledge institutions, including the text of nearly one million books scanned at Harvard Library, to train AI models.
“Libraries and other stewards of humanity’s aggregated knowledge can think in terms of centuries — preserving it and providing access both for known uses and for aims completely unanticipated,” said Zittrain, the George Bemis Professor of International Law at Harvard Law School and Vice Dean of the Harvard Law School Library.
“IDI’s aim is to address newly energized interest from those quarters in otherwise-obscure texts in ways that preserve institutions’ values. That means working towards access for all for public domain works that have remained fenced — access both for the human eye and for imaginative machine processing. The latter will require forging examples if not outright standards to facilitate the easiest and best range of uses, from the current frontier model to students and scholars who wish to explore and tinker.”
Leppert spoke with Harvard Law Today to discuss IDI’s mission and explain why the data stewarded by institutions like Harvard is the key to building a better AI future....