When Google and a number of major universities announced Google Print late last year, their ambitious plan to digitize huge portions of the universities' libraries and make them available online, I was pretty excited. The potential in expanding access to books well beyond their physical, restricted homes is phenomenal. The press releases at the time stressed that the project would obey copyright regulations: only out-of-copyright books would be posted in full. For books still in copyright, only snippets would be made available. (Something akin to Amazon's "Search Inside the Book" feature, I guess.) Of course, grump that I am, my mind immediately went to the sorts of things that are in copyright but almost totally inaccessible, out of print, and no longer making money for anyone, yet would nevertheless stay locked behind copyright restrictions. In a recent op-ed piece in the L.A. Times, Lawrence Lessig brings further attention to this issue, pointing to the importance of requiring active rather than automatic copyright renewal as a potential solution to this problem:
The vast majority of creative work published in 1930, therefore, is in the public domain. But it is extremely costly to know which works in particular are in that category. And for those works that remain under copyright, unless new editions containing the latest copyright information become available — a reprint of an old book, say, or a DVD of an old movie — tracking down the current owners can require hours of detective work that may come up empty.
The solution is obvious enough: Clean up the copyright system. As with every other federal intellectual property regime, all copyrights should be registered.
Another issue, not discussed in Lessig's op-ed, is the question of what it means that this is being done by Google, a for-profit corporation. Now, I still love Google. I can't stop raving about how much Google Desktop has simplified finding things on my computer. [If only they'd start indexing Firefox pages too.] And I'm as excited about Google Scholar as I am about Google Print. (Beyond the access it could open up, there's just the ability to use Googly-smart searching to look for academic texts; the search capabilities on most of the databases the UCSB library uses, at least, are frustratingly limited.) And of course just plain old Google still manages to make me happy on a daily basis. And it's not that I think their "don't be evil" motto is disingenuous. I do think that there may be some genuinely noble motives behind some of this plan. And I certainly respect how many millions of dollars it will take to bring digitization on this massive scale to fruition. But does any of this mean that we should be satisfied with leaving this important project to a for-profit corporation that has no direct responsibility to the public? This project could contribute so much to the public good that a publicly funded, large-scale effort is very much called for (but incredibly unlikely, sadly). In the absence of this, it at least points to the need to keep supporting nonprofit endeavors like the Internet Archive, which, in addition to the quite helpful caches of the internet of days and years past in its Wayback Machine, is embarking on just such a project:
Today, a number of International libraries have committed to putting their digitized books in open-access archives, starting with one at the Internet Archive. This approach will ensure permanent and public access to our published heritage. Anyone with an Internet connection will have access to these collections and the growing set of tools to make use of them. In this way we are getting closer to the goal of Universal Access to All Knowledge.
By working with libraries from 5 countries, and working to expand this number, we are bringing a broad range of materials to every interested individual. This growing commitment to open access through public archives marks a significant commitment to broad, public, and free access. While still early in its evolution, works in dozens of languages are already stored in the Internet Archive's Open-Access Text Archive offering a breadth of materials to everyone.
Over one million books have been committed to the Text Archive. Currently over twenty-seven thousand are available and an additional fifty thousand are expected in the first quarter of 2005. Advanced processing of these multilingual books will offer unprecedented access.
This doesn't mean that I don't still find the Google Print project exciting. And none of this negates the points Lessig makes about the need for copyright reform. Check out the piece before it gets trapped behind the L.A. Times copyright wall . . .
Lessig: Let a Thousand Googles Bloom