RootsTech 2011: Day 3

Internet Archive

Brew­ster Kahle, founder of the Inter­net Archive, gave an incred­i­ble keynote address this morn­ing.

His non-profit has been digitizing all kinds of media and providing them on the Internet. As he said, “We are in the business of giving information away.” He briefly mentioned “born digital” data, but focused his discussion on the data we all have in shoeboxes, what he called the “canonical box o’ stuff.”

The Internet Archive has 23 scanning centers in 6 countries. For example, they have digitized documents from the Leo Baeck Institute, and did so while removing private information via remote curation over the web.

Mr. Kahle also discussed their digitization of video content (8mm, Super8, 16mm, video tape). He pointed out that some of this kind of conversion is available in the consumer market, for about $200 per hour. Higher-grade (HD-quality) transfers are also available, but are much more expensive.

Specifically in the genealogical field, Mr. Kahle said that the Internet Archive is involved in creating a free genealogical library, partnering with FamilySearch and the Allen County Library. Recently, the Internet Archive finished digitizing the 1790–1930 Census and made it available for free. They are now working on digitizing passenger records. Soon, they will be announcing a partnership with libraries that will allow 80,000 e-books to be “loaned” from the library to patrons who are in the library.

For me, this was all pow­er­ful, trans­for­ma­tive infor­ma­tion. But I was most inter­est­ed in Mr. Kahle’s dis­cus­sion of print-on-demand dig­i­tal book­mo­biles, which can pro­vide books as peo­ple need them, at a very low cost. (One exam­ple was that Alice in Won­der­land costs about $1 to print and bind.) Accord­ing to Mr. Kahle, a Har­vard study has shown that it takes a library $3 to loan a book, so $1 to give a book away should be a rea­son­able price. This is being used to pro­vide print­ed books free in India, Egypt, and Ugan­da.

One of the most mov­ing por­tions of the dis­cus­sion was the fact that the Inter­net Archive has dou­bled, to more than 1 mil­lion, the num­ber of books avail­able to the blind and text-dis­abled in the DAISY for­mat for auto­mat­ed read­ers.

A key issue for any archive, Mr. Kahle pointed out, is institutional responsibility: How long, and at what level, can a company or any institution be trusted to store information? He told us not to trust that Flickr, Google, or even his non-profit would be around, or would make the right decisions when it counted. So his recommendation is never to keep only one copy in one institution. He said that the Library in Alexandria burned, yes, but it had already lost many of the important texts that it had gathered because of institutional neglect: “the new guys didn’t like the old stuff around.”

In 2002, the Internet Archive handed 200 TB of their data to the Library of Alexandria, which reciprocated with their collection of digitized Arabic materials. These kinds of large-scale swap agreements are critical to the redundancy needed to ensure that we do not have another loss similar to what we lost at Alexandria: books by Aristotle, the other plays of Euripides … At this point, the whole Internet Archive is stored in three locations: San Francisco, Alexandria, and Amsterdam. Mr. Kahle acknowledged that an earthquake zone, the Middle East, and a flood plain were perhaps not the best choices, but they were not planning on stopping there.

For us, as genealogists, Mr. Kahle posed the following questions, which should make us think hard about the responsibility we have to take care of our data and documents:

  • Can we learn the sto­ries of our ances­tors?
  • Will our descen­dants know our sto­ry?

The RootsTech conference was a great success. More than 3,000 attendees were there, making it one of the biggest, if not the biggest, genealogy gatherings in the US. Next year, the second RootsTech conference will be held at the Salt Palace in Salt Lake City, Utah, on 2–4 February. I plan to be there.