Blogs from the "past" lane: Extraction and Indexing

Extraction is the process of putting the information in a document into a more useful form. For as long as cameras have been available for the purpose, paper records have been filmed and stored, and those films could then be read on film readers. This was a great improvement in record preservation, and it still goes on today. Filming also made it possible to store vast amounts of information in a relatively small space. The downside is that, unless an index was included on the film, finding the desired document meant looking through endless images. There was also no guarantee that the desired record was even on the film, so much time was, and still is, lost in unproductive searches. The next step was to index the film by adding another film that held the new index and referenced the first. This was a big improvement, but it still required going to the place where the film was kept, putting it on the reader, and finding the right page. That process is still used extensively where nothing better is available.

Where are we now in this process of improving access to records? As just stated, filmed records with a filmed index are still the best that is available in many areas, mainly because of cost. Paper copies are still being lost to disintegration from improper storage, fire, water, etc. Even more sadly, paper is discarded for lack of storage space. Once a paper document is lost, whatever the cause, it often cannot be replaced, and the information is lost forever!

With the advent of computer technology, there are now options that were not even imagined just a few years ago. When I started "doing genealogy" it was an entirely different process than it is now. My first records were handwritten or typed. The sheets were too big for a standard typewriter, so typing group sheets and pedigree charts required a special typewriter, which I never had. Copies were made on a mimeograph because copiers did not exist yet. This was only 50 years ago, and how things have changed!

The process of making indexes is clearly not new; it has been done for as long as records have been kept. Sometimes, because of the chronological nature of the record, as with vital records, the record itself served as an index if you knew the date when the event occurred. Even those records were usually indexed on a yearly basis by the person keeping them. The books were then labeled accordingly. Records are still kept in that fashion where digital records are not yet used.

There are computer programs now that can take a paper or film record and make an editable copy of that record. What does that mean in terms of access? When it is possible to scan an image and make such a copy, it can be done at incredible speed and without even having to look at the image except to make corrections. That technology is now being used to convert images to a digital format that can be stored, and thus accessed, in a very small space. That is where e-books come from.

Millions of rolls of film are being processed, as we speak, to make these records searchable by computer and by the other means being offered. Computers as we have known and loved them are getting smaller and more mobile, and who knows what is next. We can now share information wherever we may be, with anyone we wish, anywhere in this big world! The filming of records being done now is done with digital cameras, which eliminates the film copy of the record. All of these improvements add speed and accuracy to the process. Much of the film digitization and indexing is expected to be completed, at the present rate, in about 8 years.

Enter the 1940 US census, the wild excitement over that record, and the process of making it available for public use. Census records are private for 72 years from the date they are recorded, which is why this one is just now being released. The estimate of the time needed to provide an index was six months, and the work is exceeding expectations at this point. The original excitement may wear off and the process may slow, but it appears to be heading in the other direction. There are tens of thousands of people working on this record. The process is threefold: extraction onto a computer, a second extraction of the same record by another person, and arbitration by a third person if there are discrepancies between the two extractions. That record is then entered into the database for that area. This becomes part of what will be the final product: an index of the 132 million persons enumerated in the census, with very valuable information on who they are and their relationships.
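For readers who like to see such things spelled out, the threefold process can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions, not the actual indexing software: the field names (`surname`, `given_name`, `age`), the rule that spacing and capitalization do not count as a discrepancy, and the `arbitrate` callback that stands in for the third person are all made up for the example.

```python
from typing import Callable, Dict, List

# One extracted record: field name -> value as typed by the indexer.
# The field names used below are hypothetical, chosen only for illustration.
Record = Dict[str, str]


def _normalize(value: str) -> str:
    """Ignore differences in spacing and capitalization (an assumed rule)."""
    return " ".join(value.split()).casefold()


def find_discrepancies(first: Record, second: Record) -> List[str]:
    """Return the fields where the two independent extractions disagree."""
    fields = sorted(set(first) | set(second))
    return [
        f for f in fields
        if _normalize(first.get(f, "")) != _normalize(second.get(f, ""))
    ]


def reconcile(
    first: Record,
    second: Record,
    arbitrate: Callable[[str, str, str], str],
) -> Record:
    """Build the final index entry from two extractions.

    Fields that agree are taken from the first extraction. Fields that
    disagree are passed to `arbitrate`, which plays the third person.
    """
    disputed = set(find_discrepancies(first, second))
    final: Record = {}
    for field in sorted(set(first) | set(second)):
        a = first.get(field, "")
        b = second.get(field, "")
        final[field] = arbitrate(field, a, b) if field in disputed else a
    return final


# Two people extract the same census line independently.
first_pass = {"surname": "Smith", "given_name": "John", "age": "34"}
second_pass = {"surname": "Smyth", "given_name": "john ", "age": "34"}


def prefer_first(field: str, a: str, b: str) -> str:
    """A stand-in arbitrator that simply sides with the first extraction."""
    return a


index_entry = reconcile(first_pass, second_pass, prefer_first)
```

In this sketch only the surname goes to arbitration, since the two given names differ only in spacing and capitalization. A real arbitrator would look at the image again rather than always siding with one indexer.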

Where are we in this process? As of this writing we are 14.19% complete, with two states finished and many more nearing completion. This link will take you to the page where you can see the results for every state.

https://www.familysearch.org/1940census/?cid=fsHomeT1940Text_v2

Want it faster? Sign up and do it!

Friday, April 20, 2012
