Google Docs Goes Native

Google Docs was once an application that was “like Microsoft Word” or “like PowerPoint”, and could read and write files from those programs as well as Excel. But mainly, you understood that you were editing your file and storing it in Google’s proprietary format.

Then, in January 2010, Google announced that they would allow users to store any file format in their Google Docs environment. That started to look like another cloud storage offering. Frankly, it didn’t make a lot of sense to upload files you cannot even open in that environment. Google took a big step toward addressing that this week, making some key formats natively viewable within Google Docs.

On their blog, they say:

The Google Docs View­er is used by mil­lions of peo­ple every day to quick­ly view PDFs, Microsoft Word doc­u­ments and Pow­er­Point pre­sen­ta­tions online. Not only is view­ing files in your brows­er far more secure than down­load­ing and open­ing them local­ly, but it also saves time and doesn’t clut­ter up your hard-dri­ve with unwant­ed files.

Today we’re excit­ed to launch sup­port for 12 new file types:

  • Microsoft Excel (.XLS and .XLSX)
  • Microsoft PowerPoint 2007/2010 (.PPTX)
  • Apple Pages (.PAGES)
  • Adobe Illus­tra­tor (.AI)
  • Adobe Pho­to­shop (.PSD)
  • Autodesk Auto­Cad (.DXF)
  • Scal­able Vec­tor Graph­ics (.SVG)
  • Post­Script (.EPS, .PS)
  • True­Type (.TTF)
  • XML Paper Spec­i­fi­ca­tion (.XPS)

Not only does this round out sup­port for the major Microsoft Office file types (we now sup­port DOC, DOCX, PPT, PPTX, XLS and XLSX), but it also adds quick view­ing capa­bil­i­ties for many of the most pop­u­lar and high­ly-request­ed doc­u­ment and image types.

In Gmail, these types of attach­ments will now show a “View” link, and click­ing on this link will bring up the Google Docs View­er.

For me, one of the few annoy­ing aspects of how Gmail and Google Docs work togeth­er has been that, in the ear­ly days, sim­ply open­ing up a Word doc­u­ment in my Gmail would auto­mat­i­cal­ly cre­ate a doc­u­ment in Google Docs, or that it wouldn’t allow me to pre­view it, and would force me to down­load the file. Now, I will sim­ply be able to View these doc­u­ments, and have them dis­ap­pear into the brows­er cache at the end of the ses­sion.

More Technology News for Genealogists


Ear­li­er this week, Apple announced a new sub­scrip­tion pay­ment mod­el for the iPad.

Google respond­ed yes­ter­day with a much more flex­i­ble sub­scrip­tion mod­el using Google Check­out (a Pay­Pal com­peti­tor), and pro­vid­ing 10% in rev­enue for Google (in com­par­i­son with Apple’s 30%). Google does not require that the in-app pur­chase price be at least as inex­pen­sive as any oth­er web offer­ing of the prod­uct. It’s a more open pro­gram, and hope­ful­ly will gain trac­tion and help fos­ter a more sus­tain­able sales mod­el for con­tent providers.

Until and unless other models come along, expect genealogical content providers, as they move into the tablet space, to opt for the Google pricing model, which will better align with their operating profit margins.


SlideShare is a site that allows you to upload PowerPoint-style slides to share with others. (I post all my slides at SlideShare.) This week they announced a free 1-click conferencing product, Zipcast. I have not tried it, but it looks interesting, as most conferencing systems that share slides require that the slides be uploaded in real time, as images from the person sharing the slides. Zipcast might be faster, because the slides will not need to be uploaded during the meeting, and will already be optimized for web viewing at SlideShare.

Don’t be sur­prised if your next geneal­o­gy meet­ing does not hap­pen in per­son, but instead over SlideShare’s Zip­cast.


Subscriptions on the Apple App Store

Apple’s iPad

Apple announced today that they will be sup­port­ing sub­scrip­tions on the App­Store. A lot of us have been think­ing that would make for a good day, as it nev­er made sense for own­ers of the iPad to only be able to buy some­thing like a mag­a­zine for the iPad one issue at a time (often for more than a print sin­gle copy).

How­ev­er, the way that Apple is doing this is caus­ing a great deal of con­ster­na­tion out­side of Cuper­ti­no.

First, they are demanding 30% of every subscription sale. This is similar to the rate paid on magazines at the newsstand, but not having to provide that discount to magazine stands is part of what allows magazine subscriptions to be so inexpensive. Apple does allow people who sell subscriptions to do so “outside the app.” But, again, the bargain they are asking people to make is draconian. In their press release, they write:

“However, Apple does require that if a publisher chooses to sell a digital subscription separately outside of the app, that same subscription offer must be made available, at the same price or less, to customers who wish to subscribe from within the app.” In other words, the time-honored tradition of the “cut-out-the-middleman” buy-direct discount is not going to be allowed.

This means that Ama­zon can­not sell books in the iOS ver­sion of the Kin­dle read­er, even though that read­er only has a link to Amazon’s web­site to make that pur­chase. (For titles sold through Amazon’s Dig­i­tal Text Pro­gram, authors and pub­lish­ers get a 70% roy­al­ty. Sim­ple math shows that if Ama­zon gives Apple the remain­ing 30%, they will be spend­ing mon­ey to sup­port pub­lish­ers, authors, and Apple, with­out a pen­ny going to pay for Amazon’s serv­er farms, let alone its employ­ees or share­hold­ers.)
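The "simple math" in the parenthetical can be written out as a small sketch. The $9.99 price is only an illustrative figure, and the function name is hypothetical; the 70% and 30% rates are the ones quoted above.

```python
from decimal import Decimal


def platform_remainder(price: Decimal, royalty_rate: Decimal, store_rate: Decimal) -> Decimal:
    """Return what the bookseller keeps after paying the royalty and the store's cut."""
    royalty = price * royalty_rate    # paid to the author or publisher
    store_fee = price * store_rate    # paid to the app store operator
    return price - royalty - store_fee


# Illustrative price only; the rates come from the paragraph above.
left_over = platform_remainder(Decimal("9.99"), Decimal("0.70"), Decimal("0.30"))
print(f"Left for the bookseller: ${left_over:.2f}")
```

With a 70% royalty and a 30% store fee, nothing remains to cover the seller's own costs, which is the point the parenthetical makes.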

Ama­zon does not have a sim­i­lar pol­i­cy. If you sell a book on Ama­zon, you can set the price, or let Ama­zon set guide­lines on the price ($2.99 — $9.99 and 20% less than the cheap­est print ver­sion of the title), and get a bet­ter per­cent­age of the sales price. But there’s noth­ing to stop some­one from sell­ing a Kin­dle-for­mat­ted book for $9.99 through Ama­zon and $7.99 direct­ly from them. This is called the agency mod­el, and it means that when Ama­zon acts as the pub­lish­er or author’s agent, they get income, when they don’t … they don’t get income, and fur­ther­more, they make no stip­u­la­tions about how much the author or pub­lish­er can sell the Kin­dle book for out­side of the Ama­zon store.

At best, this announce­ment by Apple will make legit­i­mate ven­dors of books, mag­a­zines, and audio and video think twice before offer­ing their ser­vices at cur­rent prices through the App Store, since doing so would incur a steep fee that they did not have before. At worst, some com­pa­nies will play, but oth­ers will be left out. It seems like a sure way for Apple to make good rev­enue from those who remain, and to sti­fle com­pe­ti­tion from the likes of Hulu and Net­flix (video rentals), Ama­zon (books and mag­a­zines), and Rhap­sody (music).

A com­pre­hen­sive arti­cle on the reac­tions appears on Read­WriteWeb: “A Round-Up of Reac­tions: Apple’s Greedy, Anti-Com­pet­i­tive, Evil, Bril­liant Announce­ment.” This arti­cle points out that the Wall Street Jour­nal mus­es about the legal­i­ty of the announce­ment:

“Apple Inc.’s new subscription service could draw antitrust scrutiny, according to law professors,” writes the Journal’s Nathan Koppel. According to the article, the antitrust argument hinges on two primary points: whether or not Apple is exerting “anticompetitive pressures on price” and whether Apple is a “dominant player in the market.”

But what does this mean for geneal­o­gists? We may nev­er know for sure. If Apple’s strat­e­gy goes for­ward, but actu­al­ly does have a chill­ing and anti­com­pet­i­tive impact, a lot of con­tent and ser­vices, some not yet con­ceived of, may not come to a dom­i­nant plat­form. Geneal­o­gists are rav­en­ous con­sumers of books, includ­ing e-books and audio books. This may delay or stop the deliv­ery of a lot of titles that might oth­er­wise have been avail­able. Hope­ful­ly, Apple will re-think their announce­ment, at least as it con­cerns how ven­dors price and sell their con­tent off the iPad.

Using the Wayback Machine for Genealogy

Geoc­i­ties Has Closed

The Wayback Machine, a project of the Internet Archive (available in a current version and a new beta version), is an attempt to archive the complete content of the Internet. Brewster Kahle, the co-founder of the Internet Archive, spoke about the project at the Saturday keynote address at RootsTech 2011.

The key pur­pose of the Inter­net Archive is to make the Inter­net avail­able for future his­to­ri­ans and oth­er researchers, in order that they might know what we were say­ing and doing in this often ephemer­al envi­ron­ment called the Inter­net.

But it can also help us in the here and now. If you ever encounter a pub­licly avail­able site that has dis­ap­peared, you may find it else­where on Google, but, fail­ing that, you may find it in the Inter­net Archive.

For exam­ple, on an old Rootsweb page that I am in the process of migrat­ing to this site, I have a link that is no longer work­ing. (As the lin­go goes, I have “link rot”.)

The link points to a page that was hosted on GeoCities.

When I try to nav­i­gate to this site, I get a mes­sage say­ing:

“Sorry, the GeoCities website you were trying to visit is no longer available.
GeoCities has closed, but there’s a lot more to explore on Yahoo!”

This does not offer much solace. However, when I go to the Wayback Machine and enter the URL I was searching for, I receive a link to a page listing the archived copies of that URL.

Alternately, if I go to the beta version of the new Wayback Machine and enter this search, I get to the beta site’s listing of archived copies of that URL.

This page shows me the various snapshots the Internet Archive got around to making of this page. When I click on the most recent, I see that it has a link to the site’s new location.

I can also look at oth­er snap­shots to see what the site looked like at that time.

The Internet Archive cannot instantaneously capture the whole Internet, but every couple of months, it traverses most of the public web, captures what has changed, and moves on. You should not rely on it, either as a web user or as a webmaster; however, it can prove very handy at times. Try it the next time you run across a link that you are sure used to work, but no longer does.
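If you check dead links often, the lookup can be scripted. The sketch below only builds the addresses; it assumes the Wayback Machine's present-day URL layout (a `*` for the list of all captures, or a numeric timestamp for the capture nearest that date), which may differ from the versions described in this post. The GeoCities address and the function names are hypothetical placeholders.

```python
WAYBACK_PREFIX = "https://web.archive.org/web"  # assumed current address of the service


def wayback_listing_url(original_url: str) -> str:
    """Build the address of the page that lists every capture of original_url."""
    return f"{WAYBACK_PREFIX}/*/{original_url}"


def wayback_snapshot_url(original_url: str, timestamp: str) -> str:
    """Build the address of the capture closest to timestamp (YYYY up to YYYYMMDDhhmmss)."""
    if not (timestamp.isdigit() and 4 <= len(timestamp) <= 14):
        raise ValueError("timestamp must be 4 to 14 digits, e.g. '2009' or '20090615'")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


# Hypothetical dead link, used purely for illustration.
dead_link = "http://www.geocities.com/example/page.html"
print(wayback_listing_url(dead_link))
print(wayback_snapshot_url(dead_link, "2009"))
```

Pasting either printed address into a browser should show what, if anything, was archived for that page.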


RootsTech 2011: Towards a New Genealogical Data Model

On Sat­ur­day at the Root­sTech con­fer­ence in Salt Lake City, there was an open dis­cus­sion ses­sion on genealog­i­cal data stan­dards. There has been a heat­ed dis­cus­sion, lit­er­al­ly going on for years, about a new data mod­el that could replace GEDCOM. A new GEDCOM stan­dard would address GEDCOM’s gaps — for exam­ple, being able to store evi­den­tiary analy­sis with­in the data mod­el — and be a liv­ing dynam­ic stan­dard, unlike GEDCOM, which has been sta­t­ic since 1996.

In the first hour, the dis­cus­sion iden­ti­fied sev­er­al issues with the data mod­el:

  • Data in Pro­pri­etary For­mats — Because of gaps in GEDCOM, and the lack of a stan­dards body to address this issue, most soft­ware ven­dors devel­oped their own pro­pri­etary exten­sions, which lim­it­ed the abil­i­ty to share data.
  • Lack of Per­sis­tent URLs (PURLs)
  • Unstruc­tured Text
  • Tag & Link Issues
  • Incon­sis­tent Search Expe­ri­ence
  • Data Ver­sion­ing (Diff/Merge)
  • Inabil­i­ty to Trans­fer Rich Data (rich media)
  • Inabil­i­ty to do Cross-Repos­i­to­ry Search
  • Doc­u­men­ta­tion (in oth­er words, cap­tur­ing the source of a genealog­i­cal state­ment, the abil­i­ty to pro­vide
  • Key as seen (Rep­re­sen­ta­tion) — In oth­er words, how do we nor­mal­ize data while pre­serv­ing the orig­i­nal “as-keyed” ver­sion?
  • Sta­t­ic data inter­change
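To make the “key as seen” item concrete, here is a minimal sketch of one way to keep the original entry next to a normalized form for searching. It is not part of GEDCOM or of any proposed standard; the class name, field names, and normalization rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


def normalize(text: str) -> str:
    """Collapse runs of whitespace and ignore letter case (an assumed, simple rule)."""
    return " ".join(text.split()).casefold()


@dataclass(frozen=True)
class KeyedValue:
    as_keyed: str     # exactly what the transcriber typed, never altered
    normalized: str   # derived form used only for matching and searching

    @classmethod
    def from_input(cls, raw: str) -> "KeyedValue":
        return cls(as_keyed=raw, normalized=normalize(raw))


place = KeyedValue.from_input("  Nance  County,  NEBRASKA ")
print(repr(place.as_keyed))
print(repr(place.normalized))
```

Because the normalized value is always derived from the as-keyed value, a better normalization rule can be applied later without losing what was originally entered.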

After the first hour, devot­ed to cre­at­ing this list, we were to vote on buck­ets of tech­no­log­i­cal or fea­ture issues to come up with one or two we could dis­cuss. For me, the biggest issue was not any of these tech­ni­cal issues, it was the lack of a gov­er­nance mod­el. Since no one was signed up to main­tain GEDCOM, it did not change with the times, and died as a stan­dard; in oth­er words, peo­ple saw gaps and addressed them in a pro­pri­etary way, since there was no way to get issues addressed with­in the stan­dard.

I got up and sug­gest­ed we talk about how we build a work­ing gov­er­nance mod­el instead of the issues that the gov­er­nance mod­el would help us solve. For more than a decade, peo­ple have been lament­ing the lack of a stan­dards body to adju­di­cate issues, devel­op a com­mon stan­dard, and sub­mit it for pub­lic review. At the same time, peo­ple have point­ed out the fea­ture gaps, and pro­posed ways to address them. For the fea­ture gap dis­cus­sion to have an effect, how­ev­er, we need to have a place to have these dis­cus­sions that is actu­al­ly designed to main­tain a work­ing stan­dard. Lack of gov­er­nance, not lack of tech­nol­o­gy, is the issue. We vot­ed, and changed the direc­tion of the meet­ing to dis­cuss gov­er­nance.

It was at about this time that Tom Creighton, the CTO of Fam­il­y­Search, got up and announced that Fam­il­y­Search is near­ly ready to announce a new pro­posed data mod­el. This changed the meet­ing imme­di­ate­ly. Instead of an open dis­cus­sion, it became more like a press con­fer­ence, with Tom field­ing ques­tions about what they have done, when the work will be shared, and so on. There was not a lot that he was able to divulge at this point.

Key por­tions of the new pro­posed stan­dard are based on the Gen­Tech genealog­i­cal data mod­el owned by the Nation­al Genealog­i­cal Soci­ety (full dis­clo­sure, I am on the Board of the NGS). The deci­sion to make the new pro­posed data mod­el pub­lic and free has not yet been made by the man­age­ment at Fam­il­y­Search, but is being dis­cussed. This means that there can­not be a date set for the launch of the new stan­dard, as it could remain the intel­lec­tu­al prop­er­ty of Fam­il­y­Search, and unavail­able out­side of Fam­il­y­Search. (Mr. Creighton said that they had dis­cussed the fact that they were devel­op­ing a new stan­dard with sev­er­al soft­ware ven­dors, but had not pro­vid­ed any of them any more detail than that they were work­ing on some­thing.)

This is an excit­ing devel­op­ment in the inter­sec­tion of geneal­o­gy and tech­nol­o­gy. If Fam­il­y­Search decides to share their work, and if a gov­er­nance body can be iden­ti­fied or set up, and final­ly if that gov­er­nance body has the trust of the genealog­i­cal com­mu­ni­ty, includ­ing:

  • the major desk­top and mobile appli­ca­tion devel­op­ers
  • the major web data­bas­es
  • the NGS
  • NEHGS (New Eng­land His­toric Genealog­i­cal Soci­ety)
  • FGS (the Fed­er­a­tion of Genealog­i­cal Soci­eties)
  • BCG (the Board for Cer­ti­fi­ca­tion of Geneal­o­gists)
  • APG (the Asso­ci­a­tion of Pro­fes­sion­al Geneal­o­gists)

we could be near the start of a much richer technology environment. A new data model, addressing issues with GEDCOM and upgraded and changed through a community governance model, could lead to an integrated set of independently developed software tools that would allow people to represent their research better than they can with GEDCOM, and better share their data or move it from one vended product to another.

It sounds a lit­tle like Shangri-la as I write it here, but we are talk­ing about the incred­i­ble poten­tial that would be unleashed if most soft­ware ven­dors did not have to fix inde­pen­dent­ly (or ignore) issues with the cur­rent data mod­el, and could instead focus on the next new way to access and work with genealog­i­cal data.

Update, 17 February 2011: A summary of the meeting discussed here has been posted on the FamilySearch wiki.


RootsTech 2011: Day 3

Inter­net Archive

Brew­ster Kahle, founder of the Inter­net Archive, gave an incred­i­ble keynote address this morn­ing.

His non-prof­it has been dig­i­tiz­ing and pro­vid­ing on the Inter­net all kinds of media. As he said, “We are in the busi­ness of giv­ing infor­ma­tion away.” He briefly men­tioned “born dig­i­tal” data, but focused his dis­cus­sion on the data we all have in shoe­box­es, what he called the “canon­i­cal box ‘o stuff.”

The Internet Archive has 23 scanning centers in 6 countries. For example, they have digitized documents from the Leo Baeck Institute, and did so while removing private information via remote curation over the web.

Mr. Kahle also discussed their digitization of video content (8mm, Super8, 16mm, video tape). He pointed out that some of this kind of conversion is available in the consumer market, for about $200 / hour. Higher grade (HD-quality) transfers are also available, but are much more expensive.

Specifically in the genealogical field, Mr. Kahle said that the Internet Archive is involved in creating a free genealogical library, partnering with FamilySearch and the Allen County Public Library. Recently, the Internet Archive completed digitizing the 1790–1930 Census and making it available for free. They are now working on digitizing passenger records. Soon, they will be announcing a partnership with libraries that will allow for 80,000 e-books to be “loaned” from the library to patrons who are in the library.

For me, this was all pow­er­ful, trans­for­ma­tive infor­ma­tion. But I was most inter­est­ed in Mr. Kahle’s dis­cus­sion of print-on-demand dig­i­tal book­mo­biles, which can pro­vide books as peo­ple need them, at a very low cost. (One exam­ple was that Alice in Won­der­land costs about $1 to print and bind.) Accord­ing to Mr. Kahle, a Har­vard study has shown that it takes a library $3 to loan a book, so $1 to give a book away should be a rea­son­able price. This is being used to pro­vide print­ed books free in India, Egypt, and Ugan­da.

One of the most mov­ing por­tions of the dis­cus­sion was the fact that the Inter­net Archive has dou­bled, to more than 1 mil­lion, the num­ber of books avail­able to the blind and text-dis­abled in the DAISY for­mat for auto­mat­ed read­ers.

A key issue for any archive, Mr. Kahle pointed out, is institutional responsibility: how long, and at what level, can a company or any institution be trusted to store information? He told us not to trust that Flickr, Google, or even his non-profit would be around, or make the right decisions when it counted. So, his recommendation is not to keep only one copy in one institution. He said that the Library in Alexandria burned, yes, but it already had lost many of the important texts that it had gathered because of institutional neglect: “the new guys didn’t like the old stuff around.”

In 2002, the Internet Archive handed 200 TB of their data to the Library of Alexandria, which reciprocated with their collection of digitized Arabic materials. These kinds of large-scale swap agreements are critical to the redundancy needed to ensure that we do not have another loss similar to what we lost at Alexandria: books by Aristotle, the other plays of Euripides … At this point, the whole Internet Archive is stored in three locations: San Francisco, Alexandria, and Amsterdam. Mr. Kahle acknowledged that an earthquake zone, the Middle East, and a flood plain were perhaps not the best choices, but they were not planning on stopping there.

For us, as geneal­o­gists, Mr. Kahle pos­es the fol­low­ing ques­tions, which should make us think hard about the respon­si­bil­i­ty we have to take care of our data and doc­u­ments:

  • Can we learn the sto­ries of our ances­tors?
  • Will our descen­dants know our sto­ry?

The RootsTech conference was a great success. More than 3,000 attendees were there, making it one of the biggest, if not the biggest, genealogy gatherings in the US. Next year, the second RootsTech conference will be held at the Salt Palace in Salt Lake City, Utah from 2–4 February. I plan to be there.


RootsTech 2011: Day 2

Day 2 of Root­sTech start­ed with a spir­it­ed keynote address by Curt Witch­er of the Allen Coun­ty Pub­lic Library on “The Chang­ing Face of Geneal­o­gy.” His point was: The world is going dig­i­tal and going there quick­ly. Get on board, or be left behind.

Brian Pugh of FamilySearch presented a powerful talk on how the new FamilySearch website has utilized cloud services (primarily from Amazon Web Services) to provide a world-class website in a cost-efficient manner. The strategy has allowed them to auto-scale their services up and down as needed. Additionally, they are able to create data snapshots to quickly build new prototypes of their site for development and testing. They use Amazon S3 as a shared filesystem for dynamic content, though S3 is not designed for the performance needed to serve up images and the like, so they cache the data stored on S3 for actual delivery to web browsers.

One thing they are doing on the Fam­il­y­Search web­site is uti­liz­ing Ama­zon Elas­tic IPs to allow for “hot” deploy­ment of new ver­sions of the site. They can build the new ver­sion of the site, test it, and then in a mat­ter of sec­onds, have Ama­zon redi­rect the IP address of the web­site to the new site, while keep­ing the old site in reserve. If they need to fall back to the old site, it’s again only a mat­ter of sec­onds.
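The cut-over-and-fall-back pattern described here can be sketched as a toy model. This is not FamilySearch's code and it does not call Amazon's API; it only tracks, in memory, which copy of the site a single public address points to. The class, the site names, and the example IP (taken from a range reserved for documentation) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PublicAddress:
    ip: str
    live: str                       # name of the site copy currently receiving traffic
    reserve: Optional[str] = None   # previous copy, kept running in case of trouble

    def cut_over(self, new_site: str) -> None:
        """Point the address at the new copy and keep the old one in reserve."""
        self.reserve, self.live = self.live, new_site

    def fall_back(self) -> None:
        """Swap back to the copy held in reserve."""
        if self.reserve is None:
            raise RuntimeError("no previous site is being held in reserve")
        self.live, self.reserve = self.reserve, self.live


address = PublicAddress(ip="203.0.113.10", live="site-v1")
address.cut_over("site-v2")   # visitors now reach the new version
address.fall_back()           # a problem was found, so traffic returns to the old one
print(address.live, address.reserve)
```

Because only the pointer changes, both the release and the rollback are as quick as the reassignment itself, which matches the "matter of seconds" in the talk.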

They also use Ama­zon MapRe­duce to per­form com­plex com­pu­ta­tions.

FamilySearch engineers have made available a programming language for creating cloud-based systems. This allows managers of cloud environments to quickly issue “verbs” such as:

  • Deploy
  • Con­fig­ure
  • Shut­down
  • Snap­shot

One key thing that Mr. Pugh said about Amazon’s offering in this space is that it is being widely used. Among others, he mentioned the New York Times, Major League Baseball, Netflix, 3M, Activision, ESPN, NASDAQ, The Guardian, and Razorfish. (I can add the New England Historic Genealogical Society, based on the Friday luncheon.)

Lat­er in the day, I was able to attend a view­ing of “Who Do You Think You Are?” at the Fam­i­ly His­to­ry Library. They gave out raf­fle items, and I won a copy of Ances­try for the Mac. I then took advan­tage of the Library being open until mid­night, research­ing my Hills, John­sons, and Crows in Howard Coun­ty and Nance Coun­ty, Nebras­ka.


RootsTech 2011: Day 1

Yes­ter­day was the first day of Root­sTech, a new con­fer­ence on geneal­o­gy and tech­nol­o­gy held in Salt Lake City and spon­sored by Fam­il­y­Search Inter­na­tion­al, the geneal­o­gy infor­ma­tion arm of the Church of Jesus Christ of Lat­ter-day Saints.

The con­fer­ence start­ed with a lit­tle bit of con­fu­sion: It seemed that there was a rush to the reg­is­tra­tion table just pri­or to the keynote address. This kind of thing can be min­i­mized, of course, by open­ing reg­is­tra­tion the day before, or by send­ing all the light­weight items (tick­ets to lunch­es and events, lan­yard and badge) ahead of time, and then sim­ply exchang­ing one of those tick­ets for a stan­dard back­pack or lap­top case and any oth­er schwag and late-break­ing news.

In any case, the organizers offered to let people register later; they were not going to check badges for the first event. This was something I definitely took advantage of, since I didn’t want to miss the talks by Shane R. Robison (Executive Vice President and Chief Strategy and Technology Officer, Hewlett Packard), “A World of Information,” and Jay Verkler (CEO, FamilySearch International), “Turning Roots, Branches, Trees into Nodes, Links, Graphs.”

I am not sure what the more genealog­i­cal­ly and less tech­no­log­i­cal­ly mind­ed atten­dees thought of Shane’s speech. It was a well-deliv­ered dis­cus­sion of the future of cloud com­put­ing and glob­al­iza­tion. I found it fas­ci­nat­ing. Of course, with so much of the world so pop­u­lat­ed, and with these oth­er pop­u­la­tion cen­ters (Chi­na, India, Brazil) poised to dra­mat­i­cal­ly move into more of a mid­dle-class exis­tence, there are seri­ous chal­lenges for glob­al sus­tain­abil­i­ty. I was glad to see that Mr. Robi­son had sus­tain­abil­i­ty in the cen­ter of his group of pri­or­i­ties for Hewlett Packard.

Mr. Verkler got up and tied this all back into geneal­o­gy, point­ing out that cloud com­put­ing is hap­pen­ing in a big way already in the geneal­o­gy space: All of the new Fam­il­y­Search web­site is host­ed on Ama­zon EC2 servers in the cloud, not on servers Fam­il­y­Search owns itself.

Lat­er in the day, I spent some time man­ning the NGS booth, looked around at the exhib­it hall, and attend­ed some talks. IBM has a space in the exhib­it hall with games: non-vir­tu­al (pool, air hock­ey, chess) and vir­tu­al (Microsoft Kinect). They were also giv­ing away mas­sages. I also attend­ed jQuery and Web Ser­vices, a talk by Logan Allred. He was cogent and clear. Over lunch, I heard Chris van der Kuyl of bright­sol­id dis­cuss Fam­i­ly His­to­ry in the Age of the Cloud. He didn’t real­ly talk about the cloud much, but it was an inter­est­ing romp through the inter­sec­tion of tech­nol­o­gy and geneal­o­gy, and a good intro­duc­tion to bright­sol­id as a com­pa­ny.

Jim­my Zimmerman’s Ruby Library for Fam­il­y­Search API was also a great talk, so full of details, it was prac­ti­cal­ly a code review. I regret to say that Bar­ry Ewell’s talk, Dig­i­tal­ly Pre­serv­ing Your Fam­i­ly Her­itage, did not impress me. He’s very knowl­edge­able about the top­ic, but his speak­ing style grat­ed on me. He would start a sen­tence, stop in the mid­dle, say a cou­ple of sen­tences that were rel­e­vant to him, then fin­ish the orig­i­nal sen­tence. Maybe he was hav­ing an off day, or was a lit­tle ner­vous in the lights, but it didn’t make for a good pre­sen­ta­tion in my opin­ion. Michael Buck’s Top Ten Web Appli­ca­tions Secu­ri­ty Risks (based on OWASP rec­om­men­da­tion) was clear, well thought out, and easy to fol­low.

At the end of the day, bright­sol­id spon­sored a Night at the Plan­e­tar­i­um. There were nachos, sand­wich­es, and pop­corn, but also IMAX films, as well as all the plan­e­tar­i­um exhibits. A great end to the day … except that I also head­ed to the Fam­i­ly His­to­ry Library, which was open until 11.


Multiracial and Multiethnic Trees

New York Times: Mixed America’s Fam­i­ly Trees

The New York Times has an arti­cle and an accom­pa­ny­ing inter­ac­tive fea­ture that allows users to explore the Amer­i­can phe­nom­e­non of mul­tira­cial and mul­ti­eth­nic fam­i­lies.

The story points out that the government uses statistics on race and ethnicity to address race- and ethnic-based inequities; however, the increasingly complex nature of family backgrounds is causing a shift from traditional “select one” to more accurate “select all that are appropriate” measures. The story features a young woman, “Michelle López-Mullins — a university student who is of Peruvian, Chinese, Irish, Shawnee and Cherokee descent” and notes that the Education Department would classify her as “Hispanic.” This obviously oversimplifies her background, and thus, from my point of view, makes the data and conclusions drawn from it questionable. The US Census tracks 63 combinations of racial and ethnic categories, and allows people to select as many as apply to them.

According to the article, things have changed dramatically, to the extent that currently 1 in 7 marriages in the US are multiracial or multiethnic. The article attributes this to the current wave of immigration, falling barriers between ethnic and racial groups, and the diminishing of stigmas regarding multiracial and multiethnic families.

If you click on the image to the right, you will see some of the different ways one individual is categorized. (In addition to government categories, the Times gives us an idea of what Ms. López-Mullins, her father, and one of her friends think about her background.)

This is of crit­i­cal impor­tance to geneal­o­gists. In the future, some­one min­ing gov­ern­ment doc­u­ments of their ances­tors will be enabled, if the infor­ma­tion is accu­rate and detailed enough, to get new clues. If the infor­ma­tion is watered down or con­fus­ing, with mul­ti­ple stan­dards with­in Fed­er­al agen­cies, not to men­tion across the states, the work of the future geneal­o­gist will be more dif­fi­cult.

The inter­ac­tive fea­ture I men­tioned allows you to share a small fam­i­ly tree along with the eth­nic and racial back­grounds that make it up, with pic­tures, if you have some handy. You may also add an audio file of up to 10 MB of audio explain­ing the tree.

Cyber Security

How to Protect Yourself in a Connected World

As geneal­o­gists, we are often online — whether using scanned records from a sub­scrip­tion site, search­ing through tran­scrip­tions on Gen­Web, vol­un­teer­ing for a local soci­ety, or send­ing e-mail to a recent­ly found cousin. Being online as much as we are, we assume some risks. While these risks are man­age­able, and do not exceed the val­ue of com­put­ing and Inter­net use for geneal­o­gists, it is impor­tant to assess your risk lev­el, and take steps to lim­it poten­tial attacks. Let me walk you through some of the things you should con­sid­er.

Create Secure Passwords

With all of the passwords we need to create and remember, it is tempting to have a single, memorable password for e-mail, subscription sites, and financial institutions. Doing so puts you at risk. If your password is memorable for you, it can probably be guessed by someone else, or by a computer program. And if you only have one password and someone guesses it, that person has access to any and all of your accounts. The best password security will include passwords that cannot be guessed. They should not be a date, a name, or a commonly known word found in any dictionary. Computer programs exist that can try numerous possibilities to hack your password. Instead, your passwords should have a combination of upper- and lower-case characters, numerals, and symbols. There are websites that can produce random, secure passwords; for example, PC Tools offers one. Of course, having dozens of passwords, all of them difficult to remember, presents its own problems: human memory has its limits.
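If you would rather not depend on a website, a random password of this kind can be produced on your own computer. The following is a minimal sketch using Python's standard `secrets` module; the function name, the default length of 16, and the particular set of symbols are assumptions, since some sites restrict which symbols they accept.

```python
import secrets
import string

SYMBOLS = "!@#$%^&*()-_=+"  # assumed symbol set; adjust to what a given site allows


def generate_password(length: int = 16) -> str:
    """Return a random password with at least one upper, lower, digit, and symbol."""
    classes = [string.ascii_uppercase, string.ascii_lowercase, string.digits, SYMBOLS]
    if length < len(classes):
        raise ValueError("length must be at least 4 to include every character class")
    # One guaranteed character from each class, then fill the rest from all of them.
    chars = [secrets.choice(group) for group in classes]
    alphabet = "".join(classes)
    chars += [secrets.choice(alphabet) for _ in range(length - len(classes))]
    # Shuffle so the guaranteed characters do not always sit at the front.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)


print(generate_password())
```

Each run prints a different password, so generate one per account and store it rather than trying to memorize it.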

There is the tried-and-true method of writ­ing things down, but you cer­tain­ly do not want to lose a note­book of your pass­words. Since you might not want to take your pass­word list out of the house, you will not be able to log in to your sub­scrip­tion research sites from Star­bucks. Anoth­er method, which I rec­om­mend, is stor­ing your pass­words in a pass­word man­ag­er, either online or offline. This may seem counter-intu­itive, but it works. Pro­grams such as Robo­Form and web­sites such as Last­Pass allow you to encrypt pass­words and then store them on your computer’s hard disk, or in the cloud.

Robo­Form runs on Win­dows and stores all the pass­word data on your hard dri­ve in one of a num­ber of encryp­tion for­mats. You can also pur­chase a ver­sion that runs on a USB key, so you can take it with you. Last­Pass stores your pass­words in an encrypt­ed form in the cloud, in oth­er words, poten­tial­ly on a num­ber of servers across the Inter­net. For added secu­ri­ty, you can get a USB key to pro­vide anoth­er lev­el of val­i­da­tion. Access to the pass­words requires that the key, which is spe­cial­ly configured for your account, be plugged into your com­put­er, and that you know the e-mail address and pass­word of the account. If you lose the key, you can reset the account by a request on the web­site that you then must respond to from your pre­vi­ous­ly asso­ci­at­ed e-mail account.

Avoid E-mail Scams

Bulk e-mail can be a very financially efficient way for people to steal data. Spammers can send out millions of messages for almost nothing, and if only a few people respond in ways they can exploit, their campaign has been financially successful. The main method of e-mail scam these days is called “phishing.” In a phishing attack, the scammer sends an e-mail that pretends to be for a legitimate purpose, requesting that you log in to its site, send your password by return e-mail, or in some other way provide the scammer with some of the credentials (user name/password combinations) that would allow access to one or more of your accounts or your private data. The e-mail can look very official, but often has some telltale signs: words are misspelled and URLs are slightly different, either in a way you can readily see or in the underlying HTML code, which you can observe by hovering your mouse over the links.
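The mismatch between the address a link displays and the address it actually points to can also be checked mechanically. The following is a rough sketch in Python of that one check, not a description of how any real mail filter works. The names `LinkCollector`, `host_of`, and `suspicious_links` are hypothetical, and the domains in the usage note are made up. It only flags links whose visible text itself looks like a Web address, so a scam link labeled “Click here” would pass unnoticed.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (visible text, href) pairs for every link in an HTML message."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside a link that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def host_of(url):
    """Return the lower-cased host name of a URL, ignoring a leading 'www.'."""
    if "://" not in url:
        url = "http://" + url  # add a scheme so urlparse can find the host
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def suspicious_links(html):
    """Return links whose visible text names one host but whose href names another."""
    parser = LinkCollector()
    parser.feed(html)
    flagged = []
    for text, href in parser.links:
        # Crude assumption: link text with no spaces and an inner dot is an address.
        looks_like_url = " " not in text and "." in text.strip(".")
        if looks_like_url and host_of(text) != host_of(href):
            flagged.append((text, href))
    return flagged
```

Given a message containing `<a href="http://mybank.example.evil.test/login">www.mybank.example</a>`, `suspicious_links` would return that link, because the text shown to the reader and the real destination name different hosts.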

To protect yourself, the best first step is to have good spam filtering. Gmail from Google includes some of the best spam filtering available. Gmail is also free and is easy to set up. Very rarely do I see a phishing attack in my Gmail inbox, but the spam folder on Gmail is full of phishing attacks. In addition to e-mail filtering, you can set up lists of e-mail addresses and domains so as always to allow (white list) or disallow (black list) mail from those sources. For example, if you want to make sure that mail from your cousin Sheila gets through, you would white list her e-mail address. On the other hand, if you had received malicious e-mail from a particular address, you might black list any mail coming from that address’s domain. Many service providers provide this service, building a black list of known or suspected sources of spam and malware.
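The logic behind a white list and a black list is simple enough to sketch in a few lines of Python. This is an illustration under my own assumptions, not the way Gmail or any other provider implements it. The function name `classify_sender` and the example addresses are hypothetical, and the choice to let the white list win over the black list is a design decision I made for the sketch.

```python
from email.utils import parseaddr


def classify_sender(from_header, white_list, black_list):
    """Return 'allow', 'block', or 'filter' for a message's From header.

    Each list may contain full addresses or bare domains.
    """
    _, address = parseaddr(from_header)
    address = address.lower()
    domain = address.rpartition("@")[2]
    # Check the white list first, so a trusted person at a blocked domain
    # still gets through (an assumption of this sketch).
    if address in white_list or domain in white_list:
        return "allow"
    if address in black_list or domain in black_list:
        return "block"
    # Everything else goes through the ordinary spam filter.
    return "filter"


if __name__ == "__main__":
    white = {"sheila@family.example"}
    black = {"badsender.example"}
    print(classify_sender("Cousin Sheila <sheila@family.example>", white, black))
    print(classify_sender("prizes@badsender.example", white, black))
```

Note that this relies on the address in the From header, which a scammer can forge, so lists like these complement spam filtering rather than replace it.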

Once you have spam filtering, and even if you have a black list and white list set up, some phishing attacks will get through. To keep your data safe, use caution when responding to e-mail. The e-mail address the mail comes from may differ from what appears in your e-mail software. If you believe that your bank may actually be contacting you via e-mail, do not simply click on the e-mail link, hit the reply button, or call a phone number in the e-mail. Contact the bank directly, either by typing its Web address in your browser yourself, by sending e-mail to an address you enter yourself, or by calling the bank at a phone number you already have on file for them. If this was a legitimate e-mail from your bank, a copy of it will be in your online account, and it should also be available to the bank’s customer service personnel when you call.
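To see why the name shown in your e-mail software is not a reliable guide, it can help to look at the raw headers of a message. The sketch below, again in Python and again only an illustration, pulls apart the friendly display name and the actual address, and notes whether a reply would go to a different domain than the one the message claims to come from. The function name `sender_report` and the sample message are made up, and a message can pass this check and still be fraudulent, since headers can be forged.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_report(raw_message):
    """Summarize who a raw e-mail message claims to be from and where replies go."""
    msg = message_from_string(raw_message)
    display_name, from_addr = parseaddr(msg.get("From", ""))
    _, reply_addr = parseaddr(msg.get("Reply-To", ""))
    from_domain = from_addr.rpartition("@")[2].lower()
    reply_domain = reply_addr.rpartition("@")[2].lower()
    return {
        "display_name": display_name,  # what most mail software shows
        "from_address": from_addr,     # the address actually in the header
        "reply_to": reply_addr,        # empty if there is no Reply-To header
        "reply_goes_elsewhere": bool(reply_addr) and reply_domain != from_domain,
    }


if __name__ == "__main__":
    sample = (
        'From: "MyBank Customer Service" <alerts@mybank.example.evil.test>\n'
        "Reply-To: collect@other.test\n"
        "Subject: Verify your account\n"
        "\n"
        "Please log in.\n"
    )
    for key, value in sender_report(sample).items():
        print(key, "=", value)
```

In the sample, the display name mentions the bank, but the address belongs to an unrelated domain and replies are routed somewhere else again, which are the kinds of discrepancies worth pausing over.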

Thwart Viruses and Malware

Mal­ware is soft­ware that is designed to do harm. This soft­ware can be embed­ded into soft­ware pro­grams or files, and can be hid­den in what look like harm­less web­sites. This is a risk whether you are on a Win­dows or a Mac com­put­er.

Over the years, Macintosh enthusiasts like me have boasted that its operating system is immune to these kinds of attacks. Despite the fact that we can be annoying, even PC devotees have to admit that the number of malware programs directly aimed at the Mac OS has remained low. There have been no major virus outbreaks on Mac OS X, but this may be on the verge of changing. Even Mac OS X has to use browsers to navigate the Web, and any software designed to request files from the Internet will have vulnerabilities. At the CanSecWest digital security conference in Vancouver this spring, computer security engineers demonstrated the ability to exploit Internet Explorer on Windows, Firefox on the Macintosh, and Safari on the Macintosh and on iPhones. (Google Chrome was the only browser on which no one was able to demonstrate security holes.) Another anti-virus consideration is that users who run Windows through Boot Camp or a third-party Windows virtual machine have Macintoshes that are vulnerable to both Macintosh and PC viruses.

What can you do about this? First of all, you should install virus pro­tec­tion soft­ware. On Win­dows, the best known pro­grams are McAfee VirusS­can and Nor­ton AntiVirus; on the Mac OS, choic­es include Nor­ton AntiVirus, McAfee VirusS­can, and Intego Virus­Bar­ri­er. Next, you should keep your oper­at­ing sys­tem and browsers up to date. Oper­at­ing sys­tem and brows­er devel­op­ers reg­u­lar­ly release patch­es (small fixes) to their soft­ware when they are able to thwart a known secu­ri­ty threat. If you set your pref­er­ences to allow down­load and instal­la­tion of these secu­ri­ty patch­es, you will be less vul­ner­a­ble to mal­ware than you would oth­er­wise be.

Geneal­o­gists pre­fer to focus their time on research and on eval­u­at­ing sources, but the abil­i­ty these days to do research depends on access to the Inter­net and to the files that have been scanned, down­loaded, and cre­at­ed. If you invest a min­i­mal amount of time in learn­ing how to address pass­word secu­ri­ty, phish­ing attacks, and mal­ware, you will like­ly avoid much more time-con­sum­ing and frus­trat­ing sit­u­a­tions in the future, where you might lose some of your genealog­i­cal data or have your com­put­er raid­ed.

This arti­cle, which orig­i­nal­ly appeared in a slight­ly dif­fer­ent form in the Nation­al Genealog­i­cal Soci­ety’s NGS Mag­a­zine, is repub­lished here by per­mis­sion.