Disquiet in Google's online library

Worries surround pending deal for access to millions of books.

by John Timpane, Inquirer Staff Writer

Updated March 14, 2010, 3:01 a.m. ET

Published March 14, 2010, 3:01 a.m. ET

Google has been busy.

The Internet giant has been copying and storing millions of the world's out-of-print and out-of-copyright books in a vast online archive. It could all be just a mouse click away from your computer screen if the effort, known as the Google Books Library Project, survives a legal challenge.

At stake is access to millions of texts, billions of pages, trillions of words that constitute nothing less than human memory and identity.

At stake, too, is who gets paid - a decision that could affect the future of the publishing industry.

Suppose you want to read, say, Jane Austen's Pride and Prejudice in the original edition. Or the first printing of The Adventures of Huckleberry Finn. Or you don't have a Shakespeare First Folio lying around but want to browse it. How about a look at some printed shipping or immigration registers?

Someday you'll be able to access all of them. A world online library!

Just dying to grab the 1840 edition of Jacob Bigelow's Florula bostoniensis (a catalog of flowers in the Boston area)? Already on Google Books, and, quick, you can access it right now!

But in the digital future, you often will pay for access, meaning you often will pay Google, which is way ahead in the race to upload and store all, or nearly all, of the world's precious old texts.

But will Google continue building its digital library?

That may depend on a class-action lawsuit that the Authors Guild and several published authors, including Swarthmore poet Daniel Hoffman, filed against Google in 2005 in U.S. District Court in New York. The suit accuses Google of "massive copyright infringement."

Authors and publishers objected to Google's offering books online without consulting copyright holders, as well as publishing title pages, snippets, and previews for public access.

Hoffman declined to comment for this article, but in a 2005 commentary for The Inquirer he wrote that if Google becomes "the repository of the accumulated knowledge and literature of all civilization, won't the firm attract many more advertisers? We authors, whose work can be read and, in many cases, reproduced by the touch of a key, won't see five cents of this income. And to the extent that that income is based on illegal appropriation of our writings, neither should Google."

Even some Google partners, such as Harvard University, threatened to split unless the copyright issues were addressed.

Acknowledging the problems, in 2008 Google reached a $125 million settlement with the Authors Guild and the Association of American Publishers. The main aim of the much-debated deal is to compensate authors and publishers when Google scans in books to which they hold copyrights.

U.S. District Judge Denny Chin will decide the fairness of the settlement. Included in the agreement are Google payouts of $125 million to publishers, authors, and lawyers; the creation of a registry for copyright holders; and a controversial provision that would put the onus on copyright holders to opt out of the settlement.

Chin was to rule Feb. 18, but, citing the complexity of the deal, he put off a decision, not saying when he'd be ready.

When he is, it'll be big.

Googling the printed past

The online library project grew out of Google Book Search (born October 2004), which scans printed texts and stores them in an online database, available for perusal (free, partly, sometimes) and purchase.

Google promptly teamed with Harvard and Stanford Universities, the University of Michigan, the New York Public Library, and the University of Oxford in England. Partner librarians have been scanning in old tomes at a rate that theoretically can reach 1,000 pages an hour. Google says its holdings hit 10 million books in October.

Google Books is not the only project creating a huge virtual library. The first was the all-volunteer Project Gutenberg, founded by Michael S. Hart in 1971 when he digitized the Declaration of Independence. Gutenberg offers more than 30,000 books, most in the public domain. The Internet Archive is another free project. Others, such as Books on Demand, are for-profit publishers that create printed books from digital databases. Amazon.com, through its subsidiary BookSurge, teamed with Kirtas Technologies in 2007 to digitize and print copies of rare books on demand.

University and public libraries all over the world are digitizing their out-of-print holdings. Last year, the Penn Libraries joined with Kirtas to digitize 200,000 out-of-print Penn volumes to be sold at Kirtas' commercial Web site.

But in size and resources, Google overshadows all.

Some hail Google Books as a step toward a global future of shared knowledge. In the court case, Sony Electronics filed a friendly brief arguing that the settlement would "dramatically enlarge and diversify the universe of available e-books" and "increase demand for e-book readers, intensify competition among e-reader manufacturers, and spur innovation."

Jason Epstein, publisher and founder of Books on Demand, wrote by e-mail that "Google has the head start and the means to create a universal, multilingual digital directory from which public-domain files can be downloaded by readers in a radically decentralized digital marketplace."

But questions have dogged this project like a bad conscience. Some objections are technical: Digital storage is great, but it's fragile. As Epstein wrote in the New York Review of Books this month, all the world's old books may soon be available at the click of a mouse, but "another click might obliterate these same contents and bring civilization to an end."

Sorest of all is the issue of copyrights. A project this big is bound to infringe somebody's. Google says it tries to err on the side of caution. But many want firmer guarantees than that.

Many publishers and authors worry that Google will trample their copyrights, have too much power to determine prices, and become a monopoly, a competition-killer. If Google had a text of Austen, could it block me from selling my own text of her? Or lowball prices so I couldn't profit from it? Could it block people from reading a text if it wanted to? Aren't we, objectors ask, making Google the gatekeeper of our past?

When is big too big?

The suit against Google was on its way to being settled when the Justice Department called a halt in September, citing antitrust violations. Google revised the agreement, and the case went before Chin.

One of the firms involved in the original suit was Kohn Swift & Graf of Philadelphia, and Michael J. Boni, lead counsel in the suit, now practices at Boni & Zack L.L.C. in Bala Cynwyd.

Boni is cautiously optimistic that Chin will approve the settlement: "The judge has a lot of paper before him. A large number of objectors filed a mountain of paper. . . . The judge has a lot to consider and a lot to review. But he said he has an open mind, and we hope he'll see that the settlement is fair and adequate."

Google spokeswoman Jennie Johnson stressed that the agreement would create a legal precedent that protected copyright holders and the market. She wrote by e-mail that the agreement "is good for competition. It makes it easier for others - including our competitors - to find rights-holders and digitize books." She pointed out that public-domain books make up only about 3 percent of sales in the publishing market.

Epstein e-mailed that Google would never monopolize all publishing because that literally couldn't be done: "There will be many other ways to access digital content, for example, directly from publishers' Web sites, from Web sites of special interest, from Amazon and others."

Ken Auletta, author of Googled: The End of the World as We Know It, e-mailed that "there is legitimate cause for concern when only one company in the world digitizes all 20 or so million books ever published. It provokes legitimate public anxiety, just as Comcast's control over broadband and cable wires do."

James Grimmelmann, an associate professor at New York Law School, said the crux of Chin's pending decision was "whether giving all these rights to Google precludes competition."

Central, he said, is the "opt-out" provision in the Google Books settlement. Its simplified form is: If Google wants to print a text to which you hold the copyright, it can unless you tell it no first.

"This reverses the default of prior law," Grimmelmann said. Usually, the publisher must seek out copyright holders and secure permission before publishing. "Some people are afraid that under such an agreement, no copyright is safe."

To e- or not to e-?

One big thing in Google's favor: Almost everyone thinks digital is the future of publishing, and people are trying to get there first and best. It makes sense to digitize the fading, falling-apart books of yore - that is their future.

Gutenberg's baby, the printed book, is still king: An estimated two billion were sold in the United States last year. But that number is falling, and although not yet a huge market, digital publishing is growing rapidly.

CNN's John D. Sutter wrote that last year's third-quarter e-book sales spiked more than 235 percent over a year before. E-book retail sales, about $150 million in 2009, should rise to $201 million in 2010, according to professor Albert Greco of Fordham University's Graduate School of Business. They could hit perhaps $1 billion by 2012.

The age of the e-book has begun, with Amazon's Kindle (more than 1.5 million sold), Barnes & Noble's Nook, Sony's Reader, and Apple's iPad. E-books, about 1 percent of the market now, are projected to make up 13 percent by 2013. Google is moving aggressively to stake a position in this age being born.

So much is changing, and fast. Does all this mean we may have to rethink publishing law itself?

"Oh, yes," Grimmelmann said. "Copyright is a creation of the law to go along with the new technology of . . . printing. We are undergoing the biggest shift in publishing since Gutenberg, and we may end up having to rethink the nature of the legal rights and duties involved."

Google Books Settlement

The pending revised Google Books settlement is a complex agreement among Google, the Authors Guild, and the Association of American Publishers. Under the agreement:

Google could digitize books and parts of books (tables, charts, chapters, etc.), sell Google Books database subscriptions to libraries and other institutions, sell online access to books, and display previews and snippets of books.

Google would pay $34.5 million to create an independent, not-for-profit Book Rights Registry. This would seek out copyright holders, hold Google revenue for them, and pay them.

Google would dedicate $45 million to pay copyright holders. It would pay holders at least $60 per copyrighted book it adds to its database, plus 63 percent of Google revenue from the book.

Rights holders could opt out - tell Google they don't want to be part of the settlement. They could then take legal action if their works were added to the database.

Rights holders who opt in, though they could no longer sue, could tell Google to remove a work or not add it to the database.

Google would pay $15.5 million for the publishers' legal fees and $30 million to the authors' lawyers.