New Indexing Capabilities
Submitted by Dale E. Lee
FamilySearch recently announced an exciting new capability that promises to greatly speed up the indexing of family history records. They call it HRAI: Handwriting Recognition Artificial Intelligence.
For many years the LDS Church has been collecting, digitizing, transcribing and storing records. Millions of records were originally captured on microfilm, but accessing the data was difficult, as users had to scroll through rolls of microfilm to find the information they needed.
As technology advanced, the microfilmed records were digitized and stored on computer media. This was a large improvement, but it required people to manually review the images and transcribe portions of them so that searchable indexes could be built. Digitization and indexing were a huge step forward, as they allowed people to research their genealogy from the comfort of their own homes, using records from many different parts of the world.
Now the industry is taking the next huge step forward by using Artificial Intelligence to speed up the process and reduce errors. AI now helps transcribe handwritten documents into clear text that can be stored in the computer and indexed as before. Where the work once took years or even decades, AI can now process millions of records per day, so what used to take decades can be done in a single year.
This does not mean that all of the indexing effort can be done by computers. We still need humans to review the results of the AI process. But instead of typing the data in manually, we now only need to review it, which is far faster.
According to RootsTech, FamilySearch’s Handwriting Recognition Artificial Intelligence (HRAI) process uses the following components:
- Named Entity Recognition,
- Relation Extraction,
- And a process to output data users can understand.
This means that not only will the AI process attempt to extract handwritten data, it will also attempt to extract relationships between the people described in the document. You can review the RootsTech announcement at https://www.familysearch.org/rootstech/session/familysearch-get-involved?lang=eng.
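For readers curious how these components fit together, here is a minimal Python sketch. It is not FamilySearch's implementation: the sample transcript, the patterns, and all function names are invented for illustration, and simple text patterns stand in for the trained AI models a real system would use. It only shows the general idea of finding names in transcribed text, linking them by relationship, and producing output a reviewer can read.

```python
import re
from dataclasses import dataclass

# Hypothetical transcription of a handwritten record (invented for illustration).
TRANSCRIPT = "John Smith, son of William Smith and Mary Jones, was baptized in 1852."

# Toy stand-in for Named Entity Recognition: a capitalized first name and surname.
NAME = r"[A-Z][a-z]+ [A-Z][a-z]+"
NAME_PATTERN = re.compile(rf"\b{NAME}\b")

# Toy stand-in for Relation Extraction: "<child>, son/daughter of <parent> and <parent>".
PARENT_PATTERN = re.compile(
    rf"(?P<child>{NAME}), (?:son|daughter) of (?P<p1>{NAME}) and (?P<p2>{NAME})"
)


@dataclass(frozen=True)
class Relation:
    subject: str
    relation: str
    obj: str


def recognize_names(text):
    """Return every substring that looks like a person's name."""
    return NAME_PATTERN.findall(text)


def extract_relations(text):
    """Return child-of relations for each parent phrase found in the text."""
    relations = []
    for match in PARENT_PATTERN.finditer(text):
        relations.append(Relation(match["child"], "child of", match["p1"]))
        relations.append(Relation(match["child"], "child of", match["p2"]))
    return relations


def format_for_review(names, relations):
    """Build plain-text output that a human reviewer can check."""
    lines = ["Names found: " + ", ".join(names)]
    lines += [f"{r.subject} is a {r.relation} {r.obj}" for r in relations]
    return "\n".join(lines)


if __name__ == "__main__":
    names = recognize_names(TRANSCRIPT)
    relations = extract_relations(TRANSCRIPT)
    print(format_for_review(names, relations))
```

A real handwriting recognition system has to cope with spelling variations, abbreviations, and many record layouts, which is part of why human review of the results is still needed.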
If you wish to become involved:
- Sign up for or sign in to FamilySearch.org
- Click on the Get Involved tab
- Read the Overview
- Click on the My Opportunities tab
- Click on Get Started under Review Names
You can also download the Get Involved app. (Get Involved is currently processing only US and Latin American documents, but will expand to other countries in the future.)
FamilySearch has 300 scanning stations, and 10,000 archives and partners in 100 countries throughout the world. Its database currently boasts 1.2 billion ancestors and is free to use.
Seekerz LLC, © 2022