Using the Wayback Machine for Genealogy

Geocities Has Closed

The Wayback Machine, a project of the Internet Archive (current version: http://web.archive.org/; new beta version at http://waybackmachine.org/), is an attempt to archive the complete content of the Internet. Brewster Kahle, the co-founder of the Internet Archive, spoke about the project in the Saturday keynote address at RootsTech 2011.

The key pur­pose of the Inter­net Archive is to make the Inter­net avail­able for future his­to­ri­ans and other researchers, in order that they might know what we were say­ing and doing in this often ephemeral envi­ron­ment called the Internet.

But it can also help us in the here and now. If you ever encounter a pub­licly avail­able site that has dis­ap­peared, you may find it else­where on Google, but, fail­ing that, you may find it in the Inter­net Archive.

For exam­ple, on an old Rootsweb page that I am in the process of migrat­ing to this site, I have a link that is no longer work­ing. (As the lingo goes, I have “link rot”.)

I try to link to:

http://www.geocities.com/Heartland/Hollow/1936/index.html

When I try to nav­i­gate to this site, I get a mes­sage saying:

“Sorry, the GeoCities website you were trying to visit is no longer available.
GeoCities has closed, but there’s a lot more to explore on Yahoo!”

This does not offer much solace. How­ever, when I go to the Way­back Machine and enter the URL I was search­ing for, I receive the fol­low­ing link:

http://web.archive.org/web/*/http://www.geocities.com/Heartland/Hollow/1936/index.html

Alternatively, if I go to the beta version of the new Wayback Machine and enter this search, I get to:

http://waybackmachine.org/*/http://www.geocities.com/Heartland/Hollow/1936/index.html

This page shows me the var­i­ous snap­shots the Inter­net Archive got around to mak­ing of this page. When I click on the most recent, I see that it has a link to a new location:

http://freepages.genealogy.rootsweb.ancestry.com/~pre1800vias/

I can also look at other snap­shots to see what the site looked like at that time.
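
The lookup address shown above follows a simple pattern: the archive's prefix, an asterisk, and then the original URL. Here is a minimal sketch of that pattern in Python. The function names are my own, and the timestamped form (a 14-digit YYYYMMDDhhmmss stamp in place of the asterisk) is how individual snapshot addresses generally appear; the timestamp in the usage note is purely hypothetical.

```python
WAYBACK_PREFIX = "http://web.archive.org/web/"


def wayback_calendar_url(original_url: str) -> str:
    """Build the address that lists every snapshot of original_url."""
    # The asterisk asks for all captures rather than one specific date.
    return f"{WAYBACK_PREFIX}*/{original_url}"


def wayback_snapshot_url(original_url: str, timestamp: str) -> str:
    """Build the address of one capture, given a YYYYMMDDhhmmss timestamp."""
    if not (timestamp.isdigit() and len(timestamp) == 14):
        raise ValueError("timestamp must be 14 digits: YYYYMMDDhhmmss")
    return f"{WAYBACK_PREFIX}{timestamp}/{original_url}"


if __name__ == "__main__":
    dead_link = "http://www.geocities.com/Heartland/Hollow/1936/index.html"
    print(wayback_calendar_url(dead_link))
```

Calling `wayback_snapshot_url(dead_link, "20090101000000")` would build a dated address in the same way; whether a capture exists for that made-up date is something only the archive can tell you.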

The Internet Archive cannot instantaneously capture the whole Internet, but every couple of months, it traverses most of the public web, captures what has changed, and moves on. You should not rely on it, either as a web user or as a webmaster; however, it can prove very handy at times. Try it the next time you run across a link that you are sure used to work, but no longer does.

RootsTech 2011: Towards a New Genealogical Data Model

On Sat­ur­day at the Root­sTech con­fer­ence in Salt Lake City, there was an open dis­cus­sion ses­sion on genealog­i­cal data stan­dards. There has been a heated dis­cus­sion, lit­er­ally going on for years, about a new data model that could replace GEDCOM. A new GEDCOM stan­dard would address GEDCOM’s gaps — for exam­ple, being able to store evi­den­tiary analy­sis within the data model — and be a liv­ing dynamic stan­dard, unlike GEDCOM, which has been sta­tic since 1996.

In the first hour, the dis­cus­sion iden­ti­fied sev­eral issues with the data model:

  • Data in Pro­pri­etary For­mats — Because of gaps in GEDCOM, and the lack of a stan­dards body to address this issue, most soft­ware ven­dors devel­oped their own pro­pri­etary exten­sions, which lim­ited the abil­ity to share data.
  • Lack of Per­sis­tent URLs (PURLs)
  • Unstruc­tured Text
  • Tag & Link Issues
  • Incon­sis­tent Search Experience
  • Data Ver­sion­ing (Diff/Merge)
  • Inabil­ity to Trans­fer Rich Data (rich media)
  • Inabil­ity to do Cross-Repository Search
  • Doc­u­men­ta­tion (in other words, cap­tur­ing the source of a genealog­i­cal state­ment, the abil­ity to provide
  • Key as seen (Rep­re­sen­ta­tion) — In other words, how do we nor­mal­ize data while pre­serv­ing the orig­i­nal “as-keyed” version?
  • Sta­tic data interchange
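
To make the "key as seen" item a little more concrete, here is one possible sketch in Python of keeping the original text next to a normalized form. This is my own illustration, not anything proposed at the session: the class name, the field names, and the normalization rule (collapse whitespace, ignore case) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyedValue:
    as_keyed: str    # exactly what the record or transcriber supplied
    normalized: str  # derived form, used only for searching and matching


def normalize_name(raw: str) -> str:
    """Illustrative rule: collapse runs of whitespace and ignore case."""
    return " ".join(raw.split()).casefold()


def key_value(raw: str) -> KeyedValue:
    """Store the raw text untouched alongside its normalized form."""
    return KeyedValue(as_keyed=raw, normalized=normalize_name(raw))


if __name__ == "__main__":
    first = key_value("  Micajah   VIA, Sr. ")
    second = key_value("micajah via, sr.")
    # The two entries match on the normalized form,
    # while each keeps its own original text.
    print(first.normalized == second.normalized)
    print(repr(first.as_keyed))
```

The point of the design is that normalization never overwrites anything: a later, better rule can always be re-run against the as-keyed text.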

After the first hour, devoted to creating this list, we were to vote on buckets of technological or feature issues to come up with one or two we could discuss. For me, the biggest issue was not any of these technical issues; it was the lack of a governance model. Since no one was signed up to maintain GEDCOM, it did not change with the times, and died as a standard; in other words, people saw gaps and addressed them in a proprietary way, since there was no way to get issues addressed within the standard.

I got up and sug­gested we talk about how we build a work­ing gov­er­nance model instead of the issues that the gov­er­nance model would help us solve. For more than a decade, peo­ple have been lament­ing the lack of a stan­dards body to adju­di­cate issues, develop a com­mon stan­dard, and sub­mit it for pub­lic review. At the same time, peo­ple have pointed out the fea­ture gaps, and pro­posed ways to address them. For the fea­ture gap dis­cus­sion to have an effect, how­ever, we need to have a place to have these dis­cus­sions that is actu­ally designed to main­tain a work­ing stan­dard. Lack of gov­er­nance, not lack of tech­nol­ogy, is the issue. We voted, and changed the direc­tion of the meet­ing to dis­cuss governance.

It was at about this time that Tom Creighton, the CTO of Fam­il­y­Search, got up and announced that Fam­il­y­Search is nearly ready to announce a new pro­posed data model. This changed the meet­ing imme­di­ately. Instead of an open dis­cus­sion, it became more like a press con­fer­ence, with Tom field­ing ques­tions about what they have done, when the work will be shared, and so on. There was not a lot that he was able to divulge at this point.

Key por­tions of the new pro­posed stan­dard are based on the Gen­Tech genealog­i­cal data model owned by the National Genealog­i­cal Soci­ety (full dis­clo­sure, I am on the Board of the NGS). The deci­sion to make the new pro­posed data model pub­lic and free has not yet been made by the man­age­ment at Fam­il­y­Search, but is being dis­cussed. This means that there can­not be a date set for the launch of the new stan­dard, as it could remain the intel­lec­tual prop­erty of Fam­il­y­Search, and unavail­able out­side of Fam­il­y­Search. (Mr. Creighton said that they had dis­cussed the fact that they were devel­op­ing a new stan­dard with sev­eral soft­ware ven­dors, but had not pro­vided any of them any more detail than that they were work­ing on something.)

This is an excit­ing devel­op­ment in the inter­sec­tion of geneal­ogy and tech­nol­ogy. If Fam­il­y­Search decides to share their work, and if a gov­er­nance body can be iden­ti­fied or set up, and finally if that gov­er­nance body has the trust of the genealog­i­cal com­mu­nity, including:

  • the major desk­top and mobile appli­ca­tion developers
  • the major web databases
  • the NGS
  • NEHGS (New Eng­land His­toric Genealog­i­cal Society)
  • FGS (the Fed­er­a­tion of Genealog­i­cal Societies)
  • BCG (the Board for Cer­ti­fi­ca­tion of Genealogists)
  • APG (the Asso­ci­a­tion of Pro­fes­sional Genealogists)

we could be near the start of a much richer technology environment. A new data model, addressing issues with GEDCOM and upgraded and changed through a community governance model, could lead to an integrated set of independently developed software tools that would allow people to represent their research better than they can with GEDCOM, and better share their data or move it from one vended product to another.

It sounds a lit­tle like Shangri-la as I write it here, but we are talk­ing about the incred­i­ble poten­tial that would be unleashed if most soft­ware ven­dors did not have to fix inde­pen­dently (or ignore) issues with the cur­rent data model, and could instead focus on the next new way to access and work with genealog­i­cal data.

Update, 17 Feb­ru­ary 2011: A sum­mary of the meet­ing dis­cussed here has been posted on the Fam­il­y­Search wiki: https://wiki.familysearch.org/en/Genealogical_Data_Standards_(RootsTech_Session)

RootsTech 2011: Day 3

Internet Archive

Brew­ster Kahle, founder of the Inter­net Archive, gave an incred­i­ble keynote address this morning.

His non-profit has been dig­i­tiz­ing and pro­vid­ing on the Inter­net all kinds of media. As he said, “We are in the busi­ness of giv­ing infor­ma­tion away.” He briefly men­tioned “born dig­i­tal” data, but focused his dis­cus­sion on the data we all have in shoe­boxes, what he called the “canon­i­cal box ‘o stuff.”

The Internet Archive has 23 scanning centers in 6 countries. For example, they have digitized documents from the Leo Baeck Institute, and did so while removing private information via remote curation over the web.

Mr. Kahle also discussed their digitization of video content (8mm, Super8, 16mm, video tape). He pointed out that some of this kind of conversion is available in the consumer market, for about $200 / hour. Higher grade (HD-quality) transfers are also available, but are much more expensive.

Specif­i­cally in the genealog­i­cal field, Mr. Kahle said that the Inter­net Archive is involved in cre­at­ing a free genealog­i­cal library — part­ner­ing with Fam­il­y­Search and the Allen County Library. Recently, the Inter­net Archive com­pleted dig­i­tiz­ing the 1790–1930 Cen­sus and mak­ing it avail­able for free. They are now work­ing on dig­i­tiz­ing pas­sen­ger records. Soon, they will be announc­ing a part­ner­ship with libraries that will allow for 80,000 e-books to be “loaned” from the library to patrons who are in the library.

For me, this was all pow­er­ful, trans­for­ma­tive infor­ma­tion. But I was most inter­ested in Mr. Kahle’s dis­cus­sion of print-on-demand dig­i­tal book­mo­biles, which can pro­vide books as peo­ple need them, at a very low cost. (One exam­ple was that Alice in Won­der­land costs about $1 to print and bind.) Accord­ing to Mr. Kahle, a Har­vard study has shown that it takes a library $3 to loan a book, so $1 to give a book away should be a rea­son­able price. This is being used to pro­vide printed books free in India, Egypt, and Uganda.

One of the most mov­ing por­tions of the dis­cus­sion was the fact that the Inter­net Archive has dou­bled, to more than 1 mil­lion, the num­ber of books avail­able to the blind and text-disabled in the DAISY for­mat for auto­mated readers.

A key issue for any archive, Mr. Kahle pointed out, is institutional responsibility: How long, and at what level, can a company, or any institution, be trusted to store information? He told us not to trust that Flickr, Google, or even his non-profit would be around, or make the right decisions when it counted. So, his recommendation is not to have only one copy in one institution. He said that the Library in Alexandria burned, yes, but it already had lost many of the important texts that it had gathered because of institutional neglect: “the new guys didn’t like the old stuff around.”

In 2002, the Internet Archive handed 200 TB of their data to the Library of Alexandria, which reciprocated with their collection of digitized Arabic materials. These kinds of large scale swap agreements are critical to the redundancy needed to ensure that we do not have another loss similar to what we lost at Alexandria: books by Aristotle, the other plays of Euripides … At this point, the whole Internet Archive is stored in three locations: San Francisco, Alexandria, and Amsterdam. Mr. Kahle acknowledged that an earthquake zone, the Middle East, and a flood plain were perhaps not the best choices, but they were not planning on stopping there.

For us, as geneal­o­gists, Mr. Kahle poses the fol­low­ing ques­tions, which should make us think hard about the respon­si­bil­ity we have to take care of our data and documents:

  • Can we learn the sto­ries of our ancestors?
  • Will our descen­dants know our story?

The RootsTech conference was a great success. More than 3,000 attendees were there, making it one of the biggest, if not the biggest, genealogy gatherings in the US. Next year, the second RootsTech conference will be held at the Salt Palace in Salt Lake City, Utah from 2–4 February. I plan to be there.

RootsTech 2011: Day 2

Day 2 of Root­sTech started with a spir­ited keynote address by Curt Witcher of the Allen County Pub­lic Library on “The Chang­ing Face of Geneal­ogy.” His point was: The world is going dig­i­tal and going there quickly. Get on board, or be left behind.

Brian Pugh of FamilySearch presented a powerful talk on how the new FamilySearch website has utilized cloud services (primarily from Amazon Web Services: http://aws.amazon.com) to provide a world-class website in a cost-efficient manner. The strategy has allowed them to auto-scale their services up and down as needed. Additionally, they are able to create data snapshots to quickly build new prototypes of their site for development and testing. They use Amazon S3 as a shared filesystem for dynamic content. Because S3 is not designed for the performance needed to serve images and similar content directly, they cache the data stored on S3 for actual delivery to web browsers.

One thing they are doing on the Fam­il­y­Search web­site is uti­liz­ing Ama­zon Elas­tic IPs to allow for “hot” deploy­ment of new ver­sions of the site. They can build the new ver­sion of the site, test it, and then in a mat­ter of sec­onds, have Ama­zon redi­rect the IP address of the web­site to the new site, while keep­ing the old site in reserve. If they need to fall back to the old site, it’s again only a mat­ter of seconds.

They also use Ama­zon MapRe­duce to per­form com­plex computations.

FamilySearch engineers have made available a programming language for creating cloud-based systems, at: code.google.com/p/lasic. This allows managers of cloud environments to quickly issue “verbs” such as

  • Deploy
  • Con­fig­ure
  • Shut­down
  • Snap­shot

One key thing that Mr. Pugh said about Amazon’s offering in this space is that it is being widely used. Among others, he mentioned the New York Times, Major League Baseball, Netflix, 3M, Activision, ESPN, NASDAQ, The Guardian, and Razorfish (and I can add the New England Historic Genealogical Society, based on the Friday luncheon).

Later in the day, I was able to attend a view­ing of “Who Do You Think You Are?” at the Fam­ily His­tory Library. They gave out raf­fle items, and I won a copy of Ances­try for the Mac. I then took advan­tage of the Library being open until mid­night, research­ing my Hills, John­sons, and Crows in Howard County and Nance County, Nebraska.

RootsTech 2011: Day 1

Yes­ter­day was the first day of Root­sTech, a new con­fer­ence on geneal­ogy and tech­nol­ogy held in Salt Lake City and spon­sored by Fam­il­y­Search Inter­na­tional, the geneal­ogy infor­ma­tion arm of the Church of Jesus Christ of Latter-day Saints.

The con­fer­ence started with a lit­tle bit of con­fu­sion: It seemed that there was a rush to the reg­is­tra­tion table just prior to the keynote address. This kind of thing can be min­i­mized, of course, by open­ing reg­is­tra­tion the day before, or by send­ing all the light­weight items (tick­ets to lunches and events, lan­yard and badge) ahead of time, and then sim­ply exchang­ing one of those tick­ets for a stan­dard back­pack or lap­top case and any other schwag and late-breaking news.

In any case, the organizers offered to let people register later; they were not going to check badges for the first event. This was something I definitely took advantage of, since I didn’t want to miss the talks by Shane R. Robison (Executive Vice President and Chief Strategy and Technology Officer, Hewlett Packard), “A World of Information,” and Jay Verkler (CEO, FamilySearch International), “Turning Roots, Branches, Trees into Nodes, Links, Graphs.”

I am not sure what the more genealog­i­cally and less tech­no­log­i­cally minded atten­dees thought of Shane’s speech. It was a well-delivered dis­cus­sion of the future of cloud com­put­ing and glob­al­iza­tion. I found it fas­ci­nat­ing. Of course, with so much of the world so pop­u­lated, and with these other pop­u­la­tion cen­ters (China, India, Brazil) poised to dra­mat­i­cally move into more of a middle-class exis­tence, there are seri­ous chal­lenges for global sus­tain­abil­ity. I was glad to see that Mr. Robi­son had sus­tain­abil­ity in the cen­ter of his group of pri­or­i­ties for Hewlett Packard.

Mr. Verkler got up and tied this all back into geneal­ogy, point­ing out that cloud com­put­ing is hap­pen­ing in a big way already in the geneal­ogy space: All of the new Fam­il­y­Search web­site is hosted on Ama­zon EC2 servers in the cloud, not on servers Fam­il­y­Search owns itself.

Later in the day, I spent some time man­ning the NGS booth, looked around at the exhibit hall, and attended some talks. IBM has a space in the exhibit hall with games: non-virtual (pool, air hockey, chess) and vir­tual (Microsoft Kinect). They were also giv­ing away mas­sages. I also attended jQuery and Web Ser­vices, a talk by Logan Allred. He was cogent and clear. Over lunch, I heard Chris van der Kuyl of bright­solid dis­cuss Fam­ily His­tory in the Age of the Cloud. He didn’t really talk about the cloud much, but it was an inter­est­ing romp through the inter­sec­tion of tech­nol­ogy and geneal­ogy, and a good intro­duc­tion to bright­solid as a company.

Jimmy Zimmerman’s Ruby Library for FamilySearch API was also a great talk, so full of details, it was practically a code review. I regret to say that Barry Ewell’s talk, Digitally Preserving Your Family Heritage, did not impress me. He’s very knowledgeable about the topic, but his speaking style grated on me. He would start a sentence, stop in the middle, say a couple of sentences that were relevant to him, then finish the original sentence. Maybe he was having an off day, or was a little nervous in the lights, but it didn’t make for a good presentation in my opinion. Michael Buck’s Top Ten Web Applications Security Risks (based on OWASP recommendations) was clear, well thought out, and easy to follow.

At the end of the day, bright­solid spon­sored a Night at the Plan­e­tar­ium. There were nachos, sand­wiches, and pop­corn, but also IMAX films, as well as all the plan­e­tar­ium exhibits. A great end to the day … except that I also headed to the Fam­ily His­tory Library, which was open until 11.

Multiracial and Multiethnic Trees

New York Times: Mixed America's Family Trees

The New York Times has an arti­cle and an accom­pa­ny­ing inter­ac­tive fea­ture that allows users to explore the Amer­i­can phe­nom­e­non of mul­tira­cial and mul­ti­eth­nic families.

The story points out that the government uses statistics on race and ethnicity to address race- and ethnicity-based inequities; however, the increasingly complex nature of family backgrounds is causing a shift from traditional “select one” to more accurate “select all that are appropriate” measures. The story features a young woman, Michelle López-Mullins, “a university student who is of Peruvian, Chinese, Irish, Shawnee and Cherokee descent,” and notes that the Education Department would classify her as “Hispanic.” This obviously oversimplifies her background, and thus, from my point of view, makes the data and conclusions drawn from it questionable. The US Census tracks 63 combinations of racial and ethnic categories, and allows people to select as many as apply to them.

According to the article, things have changed dramatically, to the extent that currently 1 in 7 marriages in the US are multiracial or multiethnic. Contributing factors include the current wave of immigration, falling barriers between ethnic and racial groups, and diminishing stigmas regarding multiracial and multiethnic families.

If you click on the image to the right, you will see some of the different ways one individual is categorized. (In addition to government categories, the Times gives us an idea of what Ms. López-Mullins, her father, and one of her friends think about her background.)

This is of crit­i­cal impor­tance to geneal­o­gists. In the future, some­one min­ing gov­ern­ment doc­u­ments of their ances­tors will be enabled, if the infor­ma­tion is accu­rate and detailed enough, to get new clues. If the infor­ma­tion is watered down or con­fus­ing, with mul­ti­ple stan­dards within Fed­eral agen­cies, not to men­tion across the states, the work of the future geneal­o­gist will be more difficult.

The interactive feature I mentioned allows you to share a small family tree along with the ethnic and racial backgrounds that make it up, with pictures, if you have some handy. You may also add an audio file of up to 10 MB explaining the tree.

Cyber Security

How to Pro­tect Your­self in a Con­nected World

As geneal­o­gists, we are often online — whether using scanned records from a sub­scrip­tion site, search­ing through tran­scrip­tions on Gen­Web, vol­un­teer­ing for a local soci­ety, or send­ing e-mail to a recently found cousin. Being online as much as we are, we assume some risks. While these risks are man­age­able, and do not exceed the value of com­put­ing and Inter­net use for geneal­o­gists, it is impor­tant to assess your risk level, and take steps to limit poten­tial attacks. Let me walk you through some of the things you should consider.

Cre­ate Secure Passwords

With all of the passwords we need to create and remember, it is tempting to have a single, memorable password for e-mail, subscription sites, and financial institutions. Doing so puts you at risk. If your password is memorable for you, it can probably be guessed by someone else, or by a computer program. And if you have only one password and someone guesses it, that person has access to any and all of your accounts. The best password security will include passwords that cannot be guessed. They should not be a date, a name, or a commonly known word found in any dictionary. Computer programs exist that can try numerous possibilities to hack your password. Instead, your passwords should have a combination of upper- and lower-case characters, numerals, and symbols. There are websites that can produce random, secure passwords; for example, PC Tools offers one at www.pctools.com/guides/password/. Of course, having dozens of passwords, all of them difficult to remember, presents its own problems: human memory has its limits.
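
If you would rather not depend on a website, a few lines of Python can do the same job. This is only a sketch under my own assumptions: the function name and the default length of 16 are arbitrary choices, and it relies on Python's standard `secrets` module, which is intended for security-sensitive randomness.

```python
import secrets
import string

# The four character classes recommended above.
POOLS = [
    string.ascii_uppercase,
    string.ascii_lowercase,
    string.digits,
    string.punctuation,
]


def generate_password(length: int = 16) -> str:
    """Return a random password with at least one character from each pool."""
    if length < len(POOLS):
        raise ValueError(f"length must be at least {len(POOLS)}")
    # Guarantee one character from every class...
    chars = [secrets.choice(pool) for pool in POOLS]
    # ...then fill the rest from the combined alphabet.
    alphabet = "".join(POOLS)
    chars += [secrets.choice(alphabet) for _ in range(length - len(POOLS))]
    # Shuffle so the guaranteed characters are not always at the front.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)


if __name__ == "__main__":
    print(generate_password())
```

Each run prints a different password. Some sites reject certain symbols, so you may need to trim the punctuation pool for those.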

There is the tried-and-true method of writ­ing things down, but you cer­tainly do not want to lose a note­book of your pass­words. Since you might not want to take your pass­word list out of the house, you will not be able to log in to your sub­scrip­tion research sites from Star­bucks. Another method, which I rec­om­mend, is stor­ing your pass­words in a pass­word man­ager, either online or offline. This may seem counter-intuitive, but it works. Pro­grams such as Robo­Form and web­sites such as Last­Pass allow you to encrypt pass­words and then store them on your computer’s hard disk, or in the cloud.

Robo­Form runs on Win­dows and stores all the pass­word data on your hard drive in one of a num­ber of encryp­tion for­mats. You can also pur­chase a ver­sion that runs on a USB key, so you can take it with you. Last­Pass stores your pass­words in an encrypted form in the cloud, in other words, poten­tially on a num­ber of servers across the Inter­net. For added secu­rity, you can get a USB key to pro­vide another level of val­i­da­tion. Access to the pass­words requires that the key, which is spe­cially configured for your account, be plugged into your com­puter, and that you know the e-mail address and pass­word of the account. If you lose the key, you can reset the account by a request on the web­site that you then must respond to from your pre­vi­ously asso­ci­ated e-mail account.

Avoid E-mail Scams

Bulk e-mail can be a very financially efficient way for people to steal data. Spammers can send out millions of messages for almost nothing, and if only a few people respond in ways they can exploit, their campaign has been financially successful. The main method of e-mail scam these days has been called “phishing.” In a phishing attack, the scammer sends an e-mail that pretends to be for a legitimate purpose, requesting that you log in to its site, send your password by return e-mail, or in some other way provide the scammer with some of the credentials (user name/password combinations) that would allow access to one or more of your accounts or your private data. The e-mail can look very official, but often has some telltale signs: words are misspelled and URLs are slightly different, either in a way you can readily see or underneath the HTML code, which you can observe by hovering your mouse over them.
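
The "hover" check can also be automated. The following Python sketch, which is my own illustration rather than a feature of any mail program, scans the HTML of a message and flags links whose visible text looks like one web address while the underlying link points to a different host. The class and function names are hypothetical, and the rule for "looks like a web address" (contains a dot, no spaces) is a deliberately crude assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> tag with an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(url: str) -> str:
    """Return the lower-cased host name of a URL, adding a scheme if missing."""
    parsed = urlparse(url if "://" in url else "http://" + url)
    return (parsed.hostname or "").lower()


def suspicious_links(html: str):
    """Return (visible text, real href) pairs where the two hosts differ."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href, text in collector.links:
        # Only compare when the visible text itself looks like a web address.
        if "." in text and " " not in text and host_of(text) != host_of(href):
            flagged.append((text, href))
    return flagged


if __name__ == "__main__":
    message = '<a href="http://www.paypal.net/login">www.paypal.com</a>'
    print(suspicious_links(message))
```

A link that displays ordinary words, such as "Click here", is not flagged by this rule, so hovering is still worth doing by hand.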

To protect yourself, the best first step is to have good spam filtering. Gmail from Google includes some of the best spam filtering available. Gmail is also free and is easy to set up. Very rarely do I see a phishing attack in my Gmail inbox, but the spam folder on Gmail is full of phishing attacks. In addition to e-mail filtering, you can set up lists of e-mail addresses and domains so as always to allow (white list) or disallow (black list) mail from those sources. For example, if you want to make sure that mail from your cousin Sheila gets through, you would white list her e-mail address. On the other hand, if you had received malicious e-mail from paypal.net (not PayPal.com), you might black list any mail coming from the domain paypal.net. Many service providers provide this service, building a black list of known or suspected sources of spam and malware.
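
The logic of these lists is simple enough to sketch in a few lines of Python. This is an illustration only, not how any particular mail service works: the function name is my own, Sheila's address is made up, and the choice to let the allow list win over the block list is an assumption.

```python
# Hypothetical lists: always-allowed addresses and always-blocked domains.
ALLOW_ADDRESSES = {"sheila@example.com"}
BLOCK_DOMAINS = {"paypal.net"}


def classify_sender(address: str) -> str:
    """Return 'allow', 'block', or 'filter' for a sender's e-mail address."""
    address = address.strip().lower()
    # Everything after the last "@" is the sending domain.
    domain = address.rpartition("@")[2]
    if address in ALLOW_ADDRESSES:
        return "allow"   # always deliver
    if domain in BLOCK_DOMAINS:
        return "block"   # always reject
    return "filter"      # fall through to ordinary spam filtering


if __name__ == "__main__":
    for sender in ("Sheila@Example.com", "service@paypal.net", "service@paypal.com"):
        print(sender, classify_sender(sender))
```

Note that the comparison is on the exact domain, so paypal.net is blocked while paypal.com still goes through the normal filter.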

Once you have spam filter­ing, and even if you have a black list and white list set up, some phish­ing attacks will get through. To keep your data safe, use cau­tion when respond­ing to e-mail. The e-mail address the mail comes from might be other than what appears in your e-mail soft­ware. If you believe that your bank may actu­ally be con­tact­ing you via e-mail, do not sim­ply click on the e-mail link, hit the reply but­ton, or call a phone num­ber in the e-mail. Con­tact the bank directly, either by typ­ing its Web address in your browser your­self, send­ing e-mail where you enter the address your­self, or by call­ing the bank with a phone num­ber you already have on file for them. If this was a legit­i­mate e-mail from your bank, a copy of it will be in your online account, and it should also be avail­able to the bank’s cus­tomer ser­vice per­son­nel when you call.

Thwart Viruses and Malware

Mal­ware is soft­ware that is designed to do harm. This soft­ware can be embed­ded into soft­ware pro­grams or files, and can be hid­den in what look like harm­less web­sites. This is a risk whether you are on a Win­dows or a Mac computer.

Over the years, Macintosh enthusiasts like me have boasted that its operating system is immune to these kinds of attacks. Despite the fact that we can be annoying, even PC devotees have to admit that the number of malware programs directly aimed at the Mac OS has remained low. There have been no major virus outbreaks on Mac OS X, but this may be on the verge of changing. Even Mac OS X has to use browsers to navigate the Web, and any software designed to request files from the Internet will have vulnerabilities. At the CanSecWest digital security conference in Vancouver this spring, computer security engineers demonstrated the ability to exploit Internet Explorer on Windows, Firefox on the Macintosh, and Safari on the Macintosh and on iPhones. (Google Chrome was the only browser on which no one was able to demonstrate security holes.) Another aspect of anti-virus considerations is that users who run Windows through BootCamp or a third-party Windows virtual machine have Macintoshes that are vulnerable to both Macintosh and PC viruses.

What can you do about this? First of all, you should install virus pro­tec­tion soft­ware. On Win­dows, the best known pro­grams are McAfee VirusS­can and Nor­ton AntiVirus; on the Mac OS, choices include Nor­ton AntiVirus, McAfee VirusS­can, and Intego Virus­Bar­rier. Next, you should keep your oper­at­ing sys­tem and browsers up to date. Oper­at­ing sys­tem and browser devel­op­ers reg­u­larly release patches (small fixes) to their soft­ware when they are able to thwart a known secu­rity threat. If you set your pref­er­ences to allow down­load and instal­la­tion of these secu­rity patches, you will be less vul­ner­a­ble to mal­ware than you would oth­er­wise be.

Geneal­o­gists pre­fer to focus their time on research and on eval­u­at­ing sources, but the abil­ity these days to do research depends on access to the Inter­net and to the files that have been scanned, down­loaded, and cre­ated. If you invest a min­i­mal amount of time in learn­ing how to address pass­word secu­rity, phish­ing attacks, and mal­ware, you will likely avoid much more time-consuming and frus­trat­ing sit­u­a­tions in the future, where you might lose some of your genealog­i­cal data or have your com­puter raided.

This arti­cle, which orig­i­nally appeared in a slightly dif­fer­ent form in the National Genealog­i­cal Soci­ety’s NGS Mag­a­zine, is repub­lished here by permission.

Kindle Update, 3.1

Kindle 3 with 3.1 Software (New York Times Capture)

Ama­zon announced today an update for the lat­est gen­er­a­tion of the Kin­dle e-book reader, com­monly called Kin­dle 3. Users can wait for their Kin­dle 3 or Kin­dle DX devices to auto­mat­i­cally be updated, or they can down­load the soft­ware and install it them­selves at:

Kin­dle Soft­ware Update Ver­sion 3.1 — Early Pre­view Release

Accord­ing to Ama­zon, the update pro­vides the fol­low­ing benefits:

  • Pub­lic Notes — In the inter­est of help­ing peo­ple become “social” about their read­ing, Ama­zon will allow peo­ple to share their notes and high­lights with other read­ers. In addi­tion to pub­lic shar­ing, there will also be pri­vate shar­ing, allow­ing book clubs or stu­dents to share their notes only with spe­cific people.
  • Real Page Num­bers — This has been requested from the begin­ning. In order to allow for what is called “re-flowable” con­tent, Ama­zon, and most other man­u­fac­tur­ers of e-readers, pro­vide loca­tions in an inter­nal scheme that doesn’t mean any­thing to users, and makes it dif­fi­cult for peo­ple using a Kin­dle to have close-text dis­cus­sions with oth­ers, say class­mates, read­ing the same book on paper. The Kin­dle page num­bers will be based on one spe­cific printed edition.
  • Before You Go — At the com­ple­tion of  a book, read­ers will be invited to rate it or to com­ment in a more detailed way.
  • New News­pa­per and Mag­a­zine Lay­out — Designed to give users a quicker overview of the con­tent, and eas­ier nav­i­ga­tion to it.

These changes will be ported to other Kin­dle software-based read­ers, such as Kin­dle for Mac and Kin­dle for Android. I could not get any of the books I have pur­chased before to dis­play pages, even after down­load­ing them again. The new for­mat for news­pa­pers and mag­a­zines is a big improvement.

Geneal­o­gists should know that the newer ver­sions of the Kin­dle (the Kin­dle 3 and the Kin­dle DX) can read any PDF natively. I find it handy to bring along dozens of PDF books with me every­where I go, in a portable, quick start­ing, low power device that can go weeks with­out a charge. I am also enjoy­ing Shelby Foote’s The Civil War: A Nar­ra­tive as an audio book from Audi­ble (a sub­sidiary of Ama­zon), which I also have on my Kindle.

Civil War Sesquicentennial Blogs

Geneal­o­gists need to be his­to­ri­ans. There is no way to under­stand a fam­ily his­tory out­side of the larger con­text of the his­tory the fam­ily lived through. Geneal­o­gists have a rare oppor­tu­nity over the next few years to wit­ness the sesqui­cen­ten­nial (150th anniver­sary) of the Civil War. This will be com­mem­o­rated with exhibits, books, web­sites, records releases, and re-enactments of battles.

In com­mem­o­ra­tion of an impor­tant and painful period in Amer­i­can his­tory, sev­eral groups have set up blogs about the sesqui­cen­ten­nial of the Civil War:

Many states also have blogs or other websites commemorating the events and detailing museum exhibits or online collections. This is not intended to be a comprehensive list. For example, I have not included sites that seem to be mainly about a historical commission, but not about the Civil War history itself.

Surname Saturday: Via

George A. Via, My 4th Great Grandfather

The Via sur­name is one of my more unique sur­names. Since I reg­u­larly research Smith, Jones, John­son, Miller, Hill, and Gra­ham, it’s good to have the occa­sional rare sur­name. My Vias descend through Mica­jah Via, Sr. (circa 1742 — circa 1810) and Phillipi Bur­nett, and their son Jonathan Via, Sr. and his wife Cather­ine O’Buck (or O’Bock).

George Allen Via, the son of Jonathan Sr. and Catherine O’Buck Via, was my 4th great grandfather. He and his wife Mary Elizabeth Lane (married 2 Dec 1839) had a dozen or so children (I’m still working out that generation). Two of their sons, Thomas David Via and John Robert Via, served in the Confederate army. John was in Company A (2nd Company), 12th Battalion, Virginia Light Artillery Regiment (along with 6 other Vias).

Thomas (who is my 3rd great grandfather) first served in Company ‘I,’ 7th Regiment Virginia Volunteers (along with two other Vias), but after being captured at Gettysburg and imprisoned at Point Lookout Prison, he joined the 1st US Volunteer Infantry, likely as a matter of survival. (This means he was a Galvanized Yankee, which I will write more about in another post.)

Most of the Vias that I have come across in the US descend from Amer Via, a Huguenot immigrant to Virginia in about 1680. Alternate spellings of Via include Vier/Viers, Viar/Viars and Viet.

A cou­ple of forums for the Via sur­name exist:

Here are some items I will be posting on this site as I get a chance. As I post them, I will come back to this page and link to them.

Pho­tographs
Mary E. Lane Via (1820−1893)
George A. Via (1814−1894)

Mag­a­zine and News­pa­per Arti­cles
Micajah Via, petition on paper money, 1788
Betty Via on her stu­dents as crit­ics, 1953
Dan O. Via, Sr., 50th anniver­sary, 1968
Daniel Via, 20th anniver­sary, 1974
Dan O. Via, Jr. play about Jesus, 1982
Mar­garet B. Via ordi­na­tion, 1982
Via fam­ily saved by their dog arti­cle 1, 1986
Via fam­ily saved by their dog arti­cle 2, 1986

Birth Records
Floyd County, VA, 1853–1896
Franklin County, VA, 1852–1870
Hanover County, VA, 1853–1893

Land Records
Deed: James and Rosina Ingrum to Ander­son Via, 1848
Deed: James and Mary Via to David McAlexan­der, 1848
Deed: Robert and Eliz­a­beth Via to James Dod­son, 1849

Tax Records
Vias in the Albe­marle County, Vir­ginia Land Tax Records, 1782–1813
Vias in the Albe­marle County, Vir­ginia Land Tax Records, 1814–1825
Vias in the Floyd County, Vir­ginia Land Tax Records, 1831–1850
Vias in the Hanover County, Vir­ginia Land Tax Records, 1782–1857
Vias in the Nel­son County Per­sonal Prop­erty Tax, 1809–1850

Mil­i­tary Records
Thomas David Via’s pension papers (HTML, PDF)

Mar­riage Bonds & Cer­tifi­cates
Jonathan Via and Cather­ine O’Buck, 1801

Death Records
Augusta County, VA, 1853–1896
Floyd County, VA, 1853–1896
Flu­vanna County, VA, 1853–1896
Franklin County, VA, 1853–1896

Wills
Jonathan Via, Sr., 1858
Jonathan Via, Jr., 1888

Tomb­stones
Thomas D. and Sal­lie E. Via
Mary Via

Obit­u­ar­ies
Obit­u­ary of Sal­lie Thomas Via, 1911
Obit­u­ary of Thomas David Via, 1913
Obit­u­ary of William Mar­tin Via, 1937
Obit­u­ary of Willie Cather­ine Eliz­a­beth Via, 1937
Obit­u­ary of Betty Via, 1993

Ged­coms and Other Com­piled Research
Jor­dan Jones’s Ged­com of Vias
Jor­dan Jones’s Geneal­ogy Data­base List of Vias