Ancestry Increases Subscriber Base by 31%

Now that Ancestry.com Inc. is a pub­lic com­pany (ACOM: Google Finance), they are required to divulge more infor­ma­tion about their per­for­mance than they did as a pri­vate venture.

For them, the news is good. Earlier this week, they announced their 2010 figures, which notably included subscriber growth of 31% year-over-year and a 34% increase in revenue year-over-year. (At the end of December there were 1,395,000 paid subscribers of Ancestry.com.) Total revenue for the year was $300.9 million. EBITDA (earnings before interest, taxes, depreciation, and amortization) was $101 million.

Monthly churn (mem­ber­ship turnover) is 3.9%, which is basi­cally equiv­a­lent to the 3.6% in the fourth quar­ter of 2009, and the 4.0% in the third quar­ter of 2010.

For 2011, Ancestry expects to have 1,700,000 to 1,725,000 subscribers and bring in revenues of $370 to $375 million, leading to an EBITDA of $125 to $130 million.

These are very healthy num­bers and bode well for the geneal­ogy indus­try. While a lot of us have some qualms about the size of Ances­try, as well as some of its busi­ness prac­tices, it’s still impor­tant that this major player is healthy and con­tin­u­ing to invest in dig­i­ti­za­tion and technology.

 

Black History Month at FamilySearch

Fam­il­y­Search made the fol­low­ing announce­ment on Monday:

Salt Lake City—This month, mil­lions of indi­vid­u­als of African descent are cel­e­brat­ing Black His­tory Month by explor­ing their fam­ily his­tory roots. In the U.S., Fam­il­y­Search vol­un­teers have been busy help­ing dig­i­tize his­toric doc­u­ments and cre­ate free, search­able indexes to them online. Through­out Africa, from Accra to Zim­babwe, where irre­place­able fam­ily infor­ma­tion and tra­di­tions are at risk of being lost due to neglect, war, and dete­ri­o­ra­tion, Fam­il­y­Search vol­un­teers are also help­ing pre­serve this valu­able his­tory so Africans can con­nect with their roots. Researchers can search the mil­lions of African-related records as they are pub­lished online at FamilySearch.org.

They con­clude their announce­ment with the following:

Many of the records col­lected by Fam­il­y­Search are now avail­able for free on FamilySearch.org. More African records will be posted on the site in the com­ing months. Fol­low­ing are a few sam­ples of some types of records at FamilySearch.org that may be of inter­est to those doing African or African-American research. Many of them are works in progress.

    • Vir­ginia, Freedmen’s Bureau Let­ters, 1865–1872
    • U.S. Arkansas Con­fed­er­ate Pen­sions, 1901 to 1929
    • Ghana 1982–1984 Census
    • South Africa, Orange Free State, Estate Files, 1951–1973
    • U.S. South­ern States Births, Mar­riages, and Deaths
    • U.S. Nat­u­ral­iza­tion Petitions

This is a tremendous amount of material being made available. Their blog entry about this release says that the Virginia Freedmen’s Bureau records total more than 1 million records. It’s an important delivery of documents, and will provide a great deal of help for African-American researchers.

Keeping Your Computer Up-to-Date

AppFresh

The com­puter is one of our most impor­tant genealog­i­cal tools.

Many of us remem­ber when this was not the case. I have my fair share of mimeo­graphed fam­ily group sheets filled out in fad­ing pen­cil wait­ing in a stack to be scanned. But today, with your research find­ings stored in a dig­i­tal data­base and your research con­sist­ing of a blend of pay and free web­sites, with the local and state repos­i­to­ries you want to visit tagged in a Google Map, and with your lat­est pho­tos of grave­stones shared on Flickr and Find­A­Grave, you need a com­puter and you need it to work.

Whether you have a Mac or a Win­dows machine, the key to keep­ing your sys­tem work­ing is main­te­nance. Just like with a car, you should have a sched­ule for main­tain­ing your com­puter. With a car, every 3,000 or 5,000 miles, you need to change the oil; peri­od­i­cally, you need to rotate the tires. It helps to check the air pres­sure, air fil­ters, and oil level from time to time. There is a sim­i­lar reg­i­men you should fol­low to keep your com­puter run­ning smoothly, so you can focus on your research and not on recov­er­ing from a cat­a­strophic com­puter issue.

Virus Check­ing

Those of us who use Macs often come off as smug about the lack of a need for virus checking software. The implication is that the superior design of the Macintosh wards off all threats. (We can be such pains!) Of course, the Macintosh is just as vulnerable as any other operating system. Since OS X was released, not as many viruses have been written for the Mac, but it takes only one virus to endanger your data or your privacy. So, while Macs are less likely to get viruses, the Mac OS is not without its vulnerabilities. Additionally, cross-platform files (such as Microsoft Word files) can arrive with a virus and be sent on with that same virus, whether or not the virus infects your machine.

In addition to viruses, it is important to understand that there are spyware applications that are designed to gather data about you and your online identity. These often run based on your browser, and are therefore often platform independent. So, no matter what kind of computer you have, you should have anti-virus and anti-spyware software, and keep the virus and spyware definitions up-to-date.

For both the Mac and the PC, the two mainstays of the security market, Norton (us.norton.com) and McAfee (www.mcafee.com), offer suites of products that provide protection against viruses, adware, spyware, and a variety of other online threats. The biggest hurdle for me in using virus protection like the programs sold by McAfee and Norton is that they sometimes take over your computer when you are not expecting them to do so. For the Mac, there is also ClamXav (www.clamxav.com), a free open-source virus protection software package. While ClamXav is free, it does not proactively scan new or changed files; you have to remember to run it. Therefore, you get less protection, but also more control over what your computer is doing at any given moment.

Virus and mal­ware pro­tec­tion fall in the cat­e­gory of adap­tive main­te­nance. They are ways of adapt­ing to changes in the environment.

Sys­tem Secu­rity Updates

Both the PC in Windows Vista and Windows 7 and the Mac in OS X provide periodic updates to the system software. Some of these are optional. They might be updating a component of the operating system that you do not use, for example. But often the updates will be issued to close up security holes in the operating system. This is known as “adaptive maintenance.” Whenever you receive a security-related upgrade for your operating system, you should allow it to install. The software vendors will usually not announce security issues with their software until a fix is available, so you will probably not even know there is a problem. However, those who would like to exploit security issues with the operating system are constantly on the lookout for these issues, so you should let the experts at Microsoft and Apple give you the benefit of their attempts to keep you and your genealogical data safe.

Security issues are often also discovered with desktop applications, especially Adobe Acrobat and the various browsers: Internet Explorer, Firefox, Chrome, and Safari. Be aware of how your software vendor will make updates available. Some updates, such as system updates for Windows or the Mac OS and updates for many applications, will be delivered to your system automatically whenever it is connected to the Internet and a patch has been released.

In gen­eral, you should install these sys­tem and appli­ca­tion updates as soon as it is fea­si­ble to do so. If you have any con­cern with whether the updates you are receiv­ing are autho­rized by and deliv­ered from the ven­dor, go to the sup­port or down­loads area of their web­site to ver­ify that the change is valid, and learn what defect or vul­ner­a­bil­ity the change is intended to address.

Sim­ply Stay­ing Current

You have invested money in the soft­ware you use every day. More impor­tantly, you have invested time in it. You have spent time learn­ing how to use it, fig­ur­ing out its fea­tures and foibles. Any soft­ware that you use a lot for your geneal­ogy research, whether as a data­base for your records, or as a way to write or share your find­ings, should be pro­tected in another way. It should be kept rea­son­ably cur­rent. This does not mean that you need to be as assid­u­ous as you should be with installing OS secu­rity patches. How­ever, you should not be more than two major releases behind the released prod­uct. In other words, if the prod­uct is on ver­sion 7, you should be run­ning at least ver­sion 5. This is a gen­eral rule of thumb, and may vary depend­ing on how much the ven­dor has changed its product.
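The rule of thumb above can be written down as a small check. This is only a sketch in Python: the function name and the `max_lag` parameter are made up for illustration, and the default of two major releases comes from the paragraph above, not from any vendor's policy.

```python
def is_reasonably_current(installed_major: int, latest_major: int, max_lag: int = 2) -> bool:
    """Return True if the installed major version is at most max_lag releases behind."""
    return latest_major - installed_major <= max_lag


# Example from the text: the product is on version 7.
print(is_reasonably_current(5, 7))  # version 5 is two major releases behind
print(is_reasonably_current(4, 7))  # version 4 is three major releases behind
```

As the text says, the threshold may need adjusting depending on how much the vendor changes the product between releases.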

There are a cou­ple of pow­er­ful web­sites and desk­top appli­ca­tions that can help you keep on top of keep­ing your appli­ca­tions cur­rent. For both the Win­dows OS and the Mac OS, there is CNet’s Tech­Tracker (for­merly Ver­sion­Tracker), with both free and sub­scrip­tion ser­vices (www.cnet.com/techtracker-free). For the Mac OS, there is a handy desk­top soft­ware pack­age, AppFresh (metaquark.de/appfresh/) which uses the osx.iusethis.com web­site to keep track of changes to appli­ca­tions, wid­gets, pref­er­ence panes and appli­ca­tion plug-ins. In addi­tion to check­ing for new ver­sions of all the appli­ca­tions sub­mit­ted to osx.iusethis.com, AppFresh also keeps track of Apple and Microsoft Updates (and soon, Adobe updates), to help you keep your sys­tem cur­rent with the lat­est releases of the soft­ware you use on a reg­u­lar basis. The tool also allows for Sparkle updates, which are built into many Mac OS prod­ucts to auto­mat­i­cally keep an installed prod­uct aware of updates.

Reg­u­lar Maintenance

With your com­puter oper­at­ing sys­tem and the appli­ca­tions you run on it safe, you can focus the bulk of your energy on the search for and analy­sis of genealog­i­cal data. After all, your com­puter is sim­ply a tool for your research, for find­ing, gath­er­ing, arrang­ing, and stor­ing your genealog­i­cal find­ings. You are doing the key intel­lec­tual work of assess­ing sources, think­ing through unique ways to find your way past “brick­wall” prob­lems. It would be a shame if this work were lost because of a virus or a secu­rity hole. More com­monly, sim­ply by neglect of a stan­dard process, your sys­tem may degrade in its per­for­mance, and you will lose the ben­e­fit it can pro­vide you and get drawn into many hours of main­te­nance and repairs, of try­ing to reassem­ble the con­tent you have brought together. We all know, and I have talked about in this col­umn, the need for back­ups. In addi­tion to back­ing up your sys­tem, you should also main­tain what you have.

An earlier version of this article appeared in the National Genealogical Society Magazine. Used by permission.

Pam Slaton: “Searching for …”

Pam Slaton, host of "Searching for ..."
Pam Sla­ton

Update: 9 March 2011

I am not Pam Sla­ton, and do not even know her. A lot of folks are post­ing here think­ing they are con­tact­ing Pam, but, unfor­tu­nately, they are not. I wish I could pass infor­ma­tion on to her, but I am not in touch with her.

This was news to me: Oprah Winfrey’s OWN television network has a show that follows a professional genealogist. The show, entitled “Searching for …”, runs Monday nights at 9/8 Central. Pam Slaton, the genealogist the show focuses on, helps reunite the adopted with their birth families, and other family members with one another after they have been separated for some time and lost touch.

On the OWN site, they write:

“Search­ing For… is a doc­u­men­tary series that fol­lows the real-life work of Pam Sla­ton, a pro­fes­sional inves­tiga­tive geneal­o­gist, stay-at-home mom and New Jer­sey housewife.

View­ers can expect an intensely per­sonal ride when cam­eras fol­low Pam and her clients through each step as they track down lost loved ones. Each searcher’s story is dif­fer­ent, and the results are unpre­dictable and emo­tion­ally charged. Whether Pam’s clients find a joy­ous reunion, painful rejec­tion or tragic loss, they all walk away with the clo­sure they were des­per­ate to find.

Pam Slaton’s career as a professional investigative genealogist began nearly 20 years ago. Wanting to find her own birth mother, Pam hired a professional searcher. The experience was the most devastating of her life, and Pam vowed that no one else should have to go through what she did. She keeps her own pain in mind when helping clients on their journeys. And her results are astounding! Pam has an 85 percent success rate, follows a strict “no find, no pay” policy, and is one of the most sought-after professional searchers in the country.”

I will have to take a look.

One of the key aspects of geneal­ogy shows, which this one looks to have in spades, is an emo­tional com­po­nent that most non-genealogists seem to not expect. With a focus on re-uniting liv­ing peo­ple, Pam Slaton’s niche in geneal­ogy seems to be focused directly on emo­tional con­tent which should drive the show. Unfor­tu­nately, I don’t know how many peo­ple know about this show.

Google Docs Goes Native

Google Docs was once an application that was “like Microsoft Word” or “like PowerPoint”, and could read and write files from those programs as well as Excel. But mainly, you understood that you were editing your file and storing it in Google’s proprietary format.

Then, in January 2010, Google announced that they would allow users to store any file format in their Google Docs environment. That started to look like another cloud storage offering. Frankly, it didn’t make a lot of sense to upload files you could not even open in that environment. Google took a big step toward addressing that this week, making some key formats natively viewable within Google Docs.

On their blog, they say:

The Google Docs Viewer is used by mil­lions of peo­ple every day to quickly view PDFs, Microsoft Word doc­u­ments and Pow­er­Point pre­sen­ta­tions online. Not only is view­ing files in your browser far more secure than down­load­ing and open­ing them locally, but it also saves time and doesn’t clut­ter up your hard-drive with unwanted files.

Today we’re excited to launch sup­port for 12 new file types:

  • Microsoft Excel (.XLS and .XLSX)
  • Microsoft Pow­er­Point 2007 / 2010 (.PPTX)
  • Apple Pages (.PAGES)
  • Adobe Illus­tra­tor (.AI)
  • Adobe Pho­to­shop (.PSD)
  • Autodesk Auto­Cad (.DXF)
  • Scal­able Vec­tor Graph­ics (.SVG)
  • Post­Script (.EPS, .PS)
  • True­Type (.TTF)
  • XML Paper Spec­i­fi­ca­tion (.XPS)

Not only does this round out sup­port for the major Microsoft Office file types (we now sup­port DOC, DOCX, PPT, PPTX, XLS and XLSX), but it also adds quick view­ing capa­bil­i­ties for many of the most pop­u­lar and highly-requested doc­u­ment and image types.

In Gmail, these types of attach­ments will now show a “View” link, and click­ing on this link will bring up the Google Docs Viewer.

For me, one of the few annoy­ing aspects of how Gmail and Google Docs work together has been that, in the early days, sim­ply open­ing up a Word doc­u­ment in my Gmail would auto­mat­i­cally cre­ate a doc­u­ment in Google Docs, or that it wouldn’t allow me to pre­view it, and would force me to down­load the file. Now, I will sim­ply be able to View these doc­u­ments, and have them dis­ap­pear into the browser cache at the end of the session.

More Technology News for Genealogists

Google

Ear­lier this week, Apple announced a new sub­scrip­tion pay­ment model for the iPad.

Google responded yes­ter­day with a much more flex­i­ble sub­scrip­tion model using Google Check­out (a Pay­Pal com­peti­tor), and pro­vid­ing 10% in rev­enue for Google (in com­par­i­son with Apple’s 30%). Google does not require that the in-app pur­chase price be at least as inex­pen­sive as any other web offer­ing of the prod­uct. It’s a more open pro­gram, and hope­fully will gain trac­tion and help fos­ter a more sus­tain­able sales model for con­tent providers.

Until and unless other models come along, expect genealogical content providers, as they move into the tablet space, to opt for the Google pricing model, which will better align with their operating profit margins.

SlideShare

SlideShare is a site that allows you to upload PowerPoint-style slides to share with others. (I post all my slides at SlideShare: http://www.slideshare.net/genealogymedia.) This week they announced a free 1-click conferencing product, Zipcast. I have not tried it, but it looks interesting, as most conferencing systems that share slides require that the slides be uploaded in real time, as images from the person sharing the slides. Zipcast might be faster, because the slides will not need to be uploaded during the meeting, and will already be optimized for web viewing at SlideShare.

Don’t be sur­prised if your next geneal­ogy meet­ing does not hap­pen in per­son, but instead over SlideShare’s Zipcast.

Subscriptions on the Apple App Store

Magazines on the iPad
Apple’s iPad

Apple announced today that they will be sup­port­ing sub­scrip­tions on the App­Store. A lot of us have been think­ing that would make for a good day, as it never made sense for own­ers of the iPad to only be able to buy some­thing like a mag­a­zine for the iPad one issue at a time (often for more than a print sin­gle copy).

How­ever, the way that Apple is doing this is caus­ing a great deal of con­ster­na­tion out­side of Cupertino.

First, they are demanding 30% of every subscription sale. This is similar to the rate paid on magazines at the newsstand, but not having to provide that discount to magazine stands is part of what allows magazine subscriptions to be so inexpensive. Apple does allow people who sell subscriptions to do so “outside the app.” But, again, the bargain they are asking people to make is draconian. In their press release, they write:

“However, Apple does require that if a publisher chooses to sell a digital subscription separately outside of the app, that same subscription offer must be made available, at the same price or less, to customers who wish to subscribe from within the app.” In other words, the time-honored tradition of the “cut-out-the-middleman” buy-direct discount is not going to be allowed.

This means that Ama­zon can­not sell books in the iOS ver­sion of the Kin­dle reader, even though that reader only has a link to Amazon’s web­site to make that pur­chase. (For titles sold through Amazon’s Dig­i­tal Text Pro­gram, authors and pub­lish­ers get a 70% roy­alty. Sim­ple math shows that if Ama­zon gives Apple the remain­ing 30%, they will be spend­ing money to sup­port pub­lish­ers, authors, and Apple, with­out a penny going to pay for Amazon’s server farms, let alone its employ­ees or shareholders.)
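The “simple math” in the parenthetical above can be worked through in a few lines of Python. This is only a sketch: the $9.99 sale price is a hypothetical figure chosen for illustration, the function name is made up, and the 70% and 30% shares are the ones quoted in the text.

```python
from decimal import Decimal


def split_sale(price: Decimal, royalty_share: Decimal, platform_share: Decimal):
    """Split one sale into the publisher royalty, the platform fee, and what is left."""
    royalty = price * royalty_share        # paid to the author or publisher
    platform_fee = price * platform_share  # paid to the platform owner (Apple, in the text)
    retailer_keeps = price - royalty - platform_fee
    return royalty, platform_fee, retailer_keeps


price = Decimal("9.99")  # hypothetical sale price
royalty, platform_fee, retailer_keeps = split_sale(price, Decimal("0.70"), Decimal("0.30"))
print(royalty, platform_fee, retailer_keeps)
```

With those two shares, the royalty and the platform fee add up to the full sale price, so nothing is left over for the retailer.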

Ama­zon does not have a sim­i­lar pol­icy. If you sell a book on Ama­zon, you can set the price, or let Ama­zon set guide­lines on the price ($2.99 — $9.99 and 20% less than the cheap­est print ver­sion of the title), and get a bet­ter per­cent­age of the sales price. But there’s noth­ing to stop some­one from sell­ing a Kindle-formatted book for $9.99 through Ama­zon and $7.99 directly from them. This is called the agency model, and it means that when Ama­zon acts as the pub­lisher or author’s agent, they get income, when they don’t … they don’t get income, and fur­ther­more, they make no stip­u­la­tions about how much the author or pub­lisher can sell the Kin­dle book for out­side of the Ama­zon store.

At best, this announce­ment by Apple will make legit­i­mate ven­dors of books, mag­a­zines, and audio and video think twice before offer­ing their ser­vices at cur­rent prices through the App Store, since doing so would incur a steep fee that they did not have before. At worst, some com­pa­nies will play, but oth­ers will be left out. It seems like a sure way for Apple to make good rev­enue from those who remain, and to sti­fle com­pe­ti­tion from the likes of Hulu and Net­flix (video rentals), Ama­zon (books and mag­a­zines), and Rhap­sody (music).

A com­pre­hen­sive arti­cle on the reac­tions appears on Read­WriteWeb: “A Round-Up of Reac­tions: Apple’s Greedy, Anti-Competitive, Evil, Bril­liant Announce­ment.” This arti­cle points out that the Wall Street Jour­nal muses about the legal­ity of the announcement:

“Apple Inc.’s new subscription service could draw antitrust scrutiny, according to law professors,” writes the Journal’s Nathan Koppel. According to the article, the antitrust argument hinges on two primary points: whether or not Apple is exerting “anticompetitive pressures on price” and whether Apple is a “dominant player in the market.”

But what does this mean for geneal­o­gists? We may never know for sure. If Apple’s strat­egy goes for­ward, but actu­ally does have a chill­ing and anti­com­pet­i­tive impact, a lot of con­tent and ser­vices, some not yet con­ceived of, may not come to a dom­i­nant plat­form. Geneal­o­gists are rav­en­ous con­sumers of books, includ­ing e-books and audio books. This may delay or stop the deliv­ery of a lot of titles that might oth­er­wise have been avail­able. Hope­fully, Apple will re-think their announce­ment, at least as it con­cerns how ven­dors price and sell their con­tent off the iPad.

Using the Wayback Machine for Genealogy

Geocities Has Closed

The Wayback Machine, a project of the Internet Archive (current version: http://web.archive.org/; new beta version at http://waybackmachine.org/), is an attempt to archive the complete content of the Internet. Brewster Kahle, the co-founder of the Internet Archive, spoke about the project in the Saturday keynote address at RootsTech 2011.

The key pur­pose of the Inter­net Archive is to make the Inter­net avail­able for future his­to­ri­ans and other researchers, in order that they might know what we were say­ing and doing in this often ephemeral envi­ron­ment called the Internet.

But it can also help us in the here and now. If you ever encounter a pub­licly avail­able site that has dis­ap­peared, you may find it else­where on Google, but, fail­ing that, you may find it in the Inter­net Archive.

For exam­ple, on an old Rootsweb page that I am in the process of migrat­ing to this site, I have a link that is no longer work­ing. (As the lingo goes, I have “link rot”.)

I try to link to:

http://www.geocities.com/Heartland/Hollow/1936/index.html

When I try to nav­i­gate to this site, I get a mes­sage saying:

“Sorry, the GeoCities website you were trying to visit is no longer available.
GeoCities has closed, but there’s a lot more to explore on Yahoo!”

This does not offer much solace. How­ever, when I go to the Way­back Machine and enter the URL I was search­ing for, I receive the fol­low­ing link:

http://web.archive.org/web/*/http://www.geocities.com/Heartland/Hollow/1936/index.html

Alter­nately, if I go to the beta ver­sion of the new Way­back Machine and enter this search I get to:

http://waybackmachine.org/*/http://www.geocities.com/Heartland/Hollow/1936/index.html

This page shows me the var­i­ous snap­shots the Inter­net Archive got around to mak­ing of this page. When I click on the most recent, I see that it has a link to a new location:

http://freepages.genealogy.rootsweb.ancestry.com/~pre1800vias/

I can also look at other snap­shots to see what the site looked like at that time.

The Internet Archive cannot instantaneously capture the whole Internet, but every couple of months, it traverses most of the public web, captures what has changed, and moves on. You should not rely on it, either as a web user or as a webmaster; however, it can prove very handy at times. Try it the next time you run across a link that you are sure used to work, but no longer does.
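The lookup address shown earlier in this post is simply the original URL appended to a fixed Wayback Machine prefix, so it can be put together mechanically when you have a batch of dead links to check. Here is a minimal Python sketch, assuming the `web.archive.org/web/*/` pattern quoted above still holds; the function name is made up for illustration.

```python
WAYBACK_PREFIX = "http://web.archive.org/web/"


def snapshot_index_url(original_url: str) -> str:
    """Build the address of the page that lists the captures of original_url."""
    # The "*" asks for the list of all snapshots rather than one capture.
    return f"{WAYBACK_PREFIX}*/{original_url}"


dead_link = "http://www.geocities.com/Heartland/Hollow/1936/index.html"
print(snapshot_index_url(dead_link))
```

Pasting the printed address into a browser should bring up the same list of snapshots described above.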

RootsTech 2011: Towards a New Genealogical Data Model

On Saturday at the RootsTech conference in Salt Lake City, there was an open discussion session on genealogical data standards. There has been a heated discussion, literally going on for years, about a new data model that could replace GEDCOM. A new standard would address GEDCOM’s gaps (for example, being able to store evidentiary analysis within the data model) and be a living, dynamic standard, unlike GEDCOM, which has been static since 1996.

In the first hour, the dis­cus­sion iden­ti­fied sev­eral issues with the data model:

  • Data in Pro­pri­etary For­mats — Because of gaps in GEDCOM, and the lack of a stan­dards body to address this issue, most soft­ware ven­dors devel­oped their own pro­pri­etary exten­sions, which lim­ited the abil­ity to share data.
  • Lack of Per­sis­tent URLs (PURLs)
  • Unstruc­tured Text
  • Tag & Link Issues
  • Incon­sis­tent Search Experience
  • Data Ver­sion­ing (Diff/Merge)
  • Inabil­ity to Trans­fer Rich Data (rich media)
  • Inabil­ity to do Cross-Repository Search
  • Doc­u­men­ta­tion (in other words, cap­tur­ing the source of a genealog­i­cal state­ment, the abil­ity to provide
  • Key as seen (Rep­re­sen­ta­tion) — In other words, how do we nor­mal­ize data while pre­serv­ing the orig­i­nal “as-keyed” version?
  • Sta­tic data interchange
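The “key as seen” item in the list above is easier to picture with a small example. The Python sketch below is purely hypothetical: the class and function names are made up, and it is not taken from GEDCOM, the GenTech model, or anything FamilySearch has proposed. It only shows the general idea of storing a normalized value alongside the untouched as-keyed text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyedValue:
    as_keyed: str    # exactly what the transcriber typed, never altered
    normalized: str  # cleaned-up form used for searching and matching


def normalize_name(raw: str) -> KeyedValue:
    """Collapse extra whitespace and apply title case, keeping the original text."""
    cleaned = " ".join(raw.split()).title()
    return KeyedValue(as_keyed=raw, normalized=cleaned)


entry = normalize_name("  jno.   SMITH ")
print(repr(entry.as_keyed))
print(entry.normalized)
```

A real model would need far more careful normalization rules than this; the point is only that the two forms travel together, so nothing is lost when the data is cleaned up.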

After the first hour, devoted to creating this list, we were to vote on buckets of technological or feature issues to come up with one or two we could discuss. For me, the biggest issue was not any of these technical issues; it was the lack of a governance model. Since no one was signed up to maintain GEDCOM, it did not change with the times, and died as a standard; in other words, people saw gaps and addressed them in a proprietary way, since there was no way to get issues addressed within the standard.

I got up and sug­gested we talk about how we build a work­ing gov­er­nance model instead of the issues that the gov­er­nance model would help us solve. For more than a decade, peo­ple have been lament­ing the lack of a stan­dards body to adju­di­cate issues, develop a com­mon stan­dard, and sub­mit it for pub­lic review. At the same time, peo­ple have pointed out the fea­ture gaps, and pro­posed ways to address them. For the fea­ture gap dis­cus­sion to have an effect, how­ever, we need to have a place to have these dis­cus­sions that is actu­ally designed to main­tain a work­ing stan­dard. Lack of gov­er­nance, not lack of tech­nol­ogy, is the issue. We voted, and changed the direc­tion of the meet­ing to dis­cuss governance.

It was at about this time that Tom Creighton, the CTO of Fam­il­y­Search, got up and announced that Fam­il­y­Search is nearly ready to announce a new pro­posed data model. This changed the meet­ing imme­di­ately. Instead of an open dis­cus­sion, it became more like a press con­fer­ence, with Tom field­ing ques­tions about what they have done, when the work will be shared, and so on. There was not a lot that he was able to divulge at this point.

Key por­tions of the new pro­posed stan­dard are based on the Gen­Tech genealog­i­cal data model owned by the National Genealog­i­cal Soci­ety (full dis­clo­sure, I am on the Board of the NGS). The deci­sion to make the new pro­posed data model pub­lic and free has not yet been made by the man­age­ment at Fam­il­y­Search, but is being dis­cussed. This means that there can­not be a date set for the launch of the new stan­dard, as it could remain the intel­lec­tual prop­erty of Fam­il­y­Search, and unavail­able out­side of Fam­il­y­Search. (Mr. Creighton said that they had dis­cussed the fact that they were devel­op­ing a new stan­dard with sev­eral soft­ware ven­dors, but had not pro­vided any of them any more detail than that they were work­ing on something.)

This is an excit­ing devel­op­ment in the inter­sec­tion of geneal­ogy and tech­nol­ogy. If Fam­il­y­Search decides to share their work, and if a gov­er­nance body can be iden­ti­fied or set up, and finally if that gov­er­nance body has the trust of the genealog­i­cal com­mu­nity, including:

  • the major desk­top and mobile appli­ca­tion developers
  • the major web databases
  • the NGS
  • NEHGS (New Eng­land His­toric Genealog­i­cal Society)
  • FGS (the Fed­er­a­tion of Genealog­i­cal Societies)
  • BCG (the Board for Cer­ti­fi­ca­tion of Genealogists)
  • APG (the Asso­ci­a­tion of Pro­fes­sional Genealogists)

we could be near the start of a much richer technology environment. A new data model, addressing issues with GEDCOM and upgraded and changed through a community governance model, could lead to an integrated set of independently developed software tools that would allow people to represent their research better than they can with GEDCOM, and better share their data or move it from one vendor’s product to another.

It sounds a lit­tle like Shangri-la as I write it here, but we are talk­ing about the incred­i­ble poten­tial that would be unleashed if most soft­ware ven­dors did not have to fix inde­pen­dently (or ignore) issues with the cur­rent data model, and could instead focus on the next new way to access and work with genealog­i­cal data.

Update, 17 Feb­ru­ary 2011: A sum­mary of the meet­ing dis­cussed here has been posted on the Fam­il­y­Search wiki: https://wiki.familysearch.org/en/Genealogical_Data_Standards_(RootsTech_Session)

RootsTech 2011: Day 3

Internet Archive

Brew­ster Kahle, founder of the Inter­net Archive, gave an incred­i­ble keynote address this morning.

His non-profit has been dig­i­tiz­ing and pro­vid­ing on the Inter­net all kinds of media. As he said, “We are in the busi­ness of giv­ing infor­ma­tion away.” He briefly men­tioned “born dig­i­tal” data, but focused his dis­cus­sion on the data we all have in shoe­boxes, what he called the “canon­i­cal box ‘o stuff.”

The Internet Archive has 23 scanning centers in 6 countries. For example, they have digitized documents from the Leo Baeck Institute, and did so while removing private information via remote curation over the web.

Mr. Kahle also discussed their digitization of video content (8mm, Super8, 16mm, video tape). He pointed out that some of this kind of conversion is available in the consumer market, for about $200 / hour. Higher grade (HD-quality) transfers are also available, but are much more expensive.

Specif­i­cally in the genealog­i­cal field, Mr. Kahle said that the Inter­net Archive is involved in cre­at­ing a free genealog­i­cal library — part­ner­ing with Fam­il­y­Search and the Allen County Library. Recently, the Inter­net Archive com­pleted dig­i­tiz­ing the 1790–1930 Cen­sus and mak­ing it avail­able for free. They are now work­ing on dig­i­tiz­ing pas­sen­ger records. Soon, they will be announc­ing a part­ner­ship with libraries that will allow for 80,000 e-books to be “loaned” from the library to patrons who are in the library.

For me, this was all pow­er­ful, trans­for­ma­tive infor­ma­tion. But I was most inter­ested in Mr. Kahle’s dis­cus­sion of print-on-demand dig­i­tal book­mo­biles, which can pro­vide books as peo­ple need them, at a very low cost. (One exam­ple was that Alice in Won­der­land costs about $1 to print and bind.) Accord­ing to Mr. Kahle, a Har­vard study has shown that it takes a library $3 to loan a book, so $1 to give a book away should be a rea­son­able price. This is being used to pro­vide printed books free in India, Egypt, and Uganda.

One of the most mov­ing por­tions of the dis­cus­sion was the fact that the Inter­net Archive has dou­bled, to more than 1 mil­lion, the num­ber of books avail­able to the blind and text-disabled in the DAISY for­mat for auto­mated readers.

A key issue for any archive, Mr. Kahle pointed out, is institutional responsibility: How long, and at what level, can a company, or any institution, be trusted to store information? He told us not to trust that Flickr, Google, or even his non-profit would be around, or make the right decisions when it counted. So, his recommendation is not to have only one copy in one institution. He said that the Library in Alexandria burned, yes, but it already had lost many of the important texts that it had gathered because of institutional neglect: “the new guys didn’t like the old stuff around.”

In 2002, the Inter­net Archive handed 200 TB of their data to the Library of Alexan­dria, which rec­i­p­ro­cated with their col­lec­tion of dig­i­tized Ara­bic mate­ri­als. These kinds of large scale swap agree­ments are crit­i­cal to the redun­dancy needed to ensure that we do not have another loss sim­i­lar to what we lost at Alexan­dria, books by Aris­to­tle, the other plays of Euripi­des … At this point, the whole Inter­net Archive is stored in three loca­tions: San Fran­cisco, Alexan­dria, and Ams­ter­dam. Mr. Kahle acknowl­edged that an earth­quake zone, the Mid­dle East, and a flood plain were per­haps not the best choices, but they were not plan­ning on stop­ping there.

For us, as geneal­o­gists, Mr. Kahle poses the fol­low­ing ques­tions, which should make us think hard about the respon­si­bil­ity we have to take care of our data and documents:

  • Can we learn the sto­ries of our ancestors?
  • Will our descen­dants know our story?

The Root­sTech con­fer­ence was a great suc­cess. More than 3,000 atten­dees were there, mak­ing it one of the biggest, if not the biggest geneal­ogy gath­er­ing in the US. Next year, the sec­ond Root­sTech con­fer­ence will be held at the Salt Palace in Salt Lake City, Utah from 2–4 Feb­ru­ary. I plan to be there.