Structured data on genealogy websites

In my previous post, I listed several problems with publishing and using genealogy data on the Web. In short, the issues are:

  • it should be easier to copy data from Web resources to a researcher’s database
  • data on the Web may be ambiguous
  • good links to sources should be provided where possible
  • software should be available to help with all of the above

Before we can work out which solutions are viable, let’s look at the current state of genealogy websites. I wanted to take several websites that contain information of interest to genealogists and assess how easy it is to collect data from them automatically and unambiguously, and to link back to each site as the source of the information.

Genealogy and the Semantic Web

Thousands of hours have already gone into existing genealogical research. A lot of this time was spent by amateurs, volunteers and people passionate about genealogy. People are transcribing and indexing documents, collecting and aggregating information from different sources and publishing their work in various forms.

There are many centralized databases collecting the work of genealogists. The biggest ones contain millions of records. For example, WikiTree has over 27 million profiles, Geneteka has over 38 million records, WeRelate has around 3 million profiles, DBpedia has 1.6 million person records, and there are many more. There are also countless very small sites publishing information about tens or hundreds of ancestors and relatives, often built with one of the popular genealogy site-building tools like webtrees or TNG, or generated as reports by genealogy applications.

As an amateur genealogist, I search the Web for mentions of my relatives. I use the search feature on individual sites or just search using Google. When I find an interesting piece of information, such as a birth record from a local archive or burial information from a cemetery’s website, I copy it to my personal genealogy database. Usually, I add a note with a reference to the source, but sometimes I forget :-/
