System Processes Help

This page describes how the system processes work within FreeBMD. System processes are those processes that provide the means to handle the more complex aspects of data handling with the FreeBMD database. Some aspects of these processes are only available to users with enhanced privilege.

System Entries

What are System Entries?

System entries are entries in the database that do not correspond to an entry in the GRO index but which have been specifically inserted into the database to correct an error that is presumed to exist in the GRO index. For example, where information from a certificate indicates that the entry in the index is incorrect, a system entry can be inserted to provide the correct information. This can sometimes be handled by a postem, but where the error would make the entry unsearchable (e.g. the spelling of a name is wrong) a system entry needs to be created.

How do System Entries work?

A system entry in the database has a flag set. In the results listing it has a special marker to indicate that it has a special status. If the Information button for such an entry is clicked, an explanation is given of the special status of the entry. If a system entry has a link associated with it, pointing to the entry that is corrected, then the information will include a reference to the corrected entry.

Where a link exists, then the entry referred to will have a special marker in the search results and the information displayed will contain information about the system entry, including a clickable link that will perform a search to locate the system entry.

How are System Entries created?

System entries are those entries that exist under special user accounts that have a type of SystemEntry. Note that the type of a user account is different from the role of a user account. The type defines how entries under that user account are handled, whereas the role indicates the privileges that the account has.

All entries in all files under a user account of type SystemEntry are system entries. There is no facility to opt out. In order to create the link mentioned above a special #THEORY line is used. Should these special #THEORY lines occur in ordinary user accounts they are treated as ordinary #THEORY lines. The format of these special #THEORY lines is as follows:

#THEORY,LINK comment,year,quarter,event,record

Where

`comment`	is an optional comment that will be included in the information about the system entry and the entry the link refers to
`year,quarter,event`	defines the year, quarter and event of the entry being referred to - quarter and event can be alphabetic or numeric
`record`	is the entry being referred to - this must contain all the fields, in the correct order, for the year, quarter and event for the entry being referred to (which may be different from the system entry)
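As a rough illustration of this format, the following Python sketch splits a #THEORY,LINK line into its parts. It is not FreeBMD code: the function name `parse_theory_link` is hypothetical, the sample line and its values are made up, and the sketch assumes that the comment contains no commas and is separated from LINK by a space.

```python
def parse_theory_link(line):
    """Split a '#THEORY,LINK comment,year,quarter,event,record' line.

    Returns None when the line is not a #THEORY,LINK line.
    Assumption: the optional comment contains no commas.
    """
    prefix = "#THEORY,LINK"
    if not line.startswith(prefix):
        return None
    rest = line[len(prefix):]
    # After LINK we expect a comma (no comment) or a space and a comment.
    if not rest.startswith((" ", ",")):
        return None
    head, _, tail = rest.partition(",")
    fields = tail.split(",")
    if len(fields) < 4:
        raise ValueError("expected year, quarter, event and a record")
    return {
        "comment": head.strip(),
        "year": fields[0],
        "quarter": fields[1],
        "event": fields[2],
        "record": fields[3:],  # remaining fields form the linked entry
    }


# Hypothetical line: the values are for illustration only.
link = parse_theory_link(
    "#THEORY,LINK surname misprinted,1852,2,B,HOOF,John,Tenterden,5,277"
)
print(link["comment"], link["year"], link["record"])
```

The year, quarter and event are kept as text because the page allows quarter and event to be either alphabetic or numeric.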

It is expected that files containing System Entries will be ONENAME files, although they could be RANDOM. They must not be SEQUENCED. The use of ONENAME files means that where there is a block of System Entries the order will be preserved. +BREAK should be used as appropriate, i.e. if there are contiguous entries in the file for the same surname, but these are not contiguous in the index.

It is possible to have multiple links for a single System Entry. Just list the #THEORY,LINK lines after the System Entry.

More than one link may refer to the same entry and thus there would be more than one System Entry that is associated with the entry. This could occur if the correction was not certain and there were two or more possible corrections that should be shown.

System Entry special facilities

When editing or loading entries using FileManagement under a user account with the type SystemEntry a special facility is available to help with the creation of the #THEORY,LINK lines as described below.

In FileManagement:

Open the file to contain the system entries using View/Edit
Insert a +U line to define the year and quarter of the entries (the +INFO line of the file determines the type)
Insert the entries that are to become the system entries (e.g. paste them from another file)
If the surname of the system entries is different from the entries inserted (a common case) enter the revised surname in the Surname box
Select the entries
Click on the Insert button

This will cause #THEORY,LINK lines to be added after each entry. If a surname was entered in the Surname box the surname of the original entry (not the link) will be changed to this surname.

The #THEORY,LINK lines can now be edited, for example to add a comment or to change the year and quarter.

The following example might help to clarify:

First the +U line is inserted

System entry example

Then the entries are pasted in. These entries are in the index as HOOF but they should be HOOK so we are creating System Entries for HOOK which will be linked to these erroneous (we assert) HOOF entries.

System entry example

To create the System Entries put HOOK in the Surname box, a comment in the Comment box and select the entries.

System entry example

Clicking on the Insert button has caused System Entries to be created, consisting of the original entries with the surname changed to HOOK and links from these to the original entries.

System entry example
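The effect of the Insert button in the HOOF/HOOK example can be sketched in Python as below. This is only an approximation of the behaviour described on this page, not the FileManagement implementation: the function name `insert_links` and the sample entry are hypothetical, and the sketch assumes that the surname is the first comma-separated field of an entry.

```python
def insert_links(entries, year, quarter, event, surname="", comment=""):
    """Return the selected entries, each followed by a #THEORY,LINK line.

    Assumption: the surname is the first comma-separated field.
    """
    keyword = "LINK " + comment if comment else "LINK"
    lines = []
    for entry in entries:
        fields = entry.split(",")
        if surname:
            fields[0] = surname  # the system entry carries the revised surname
        lines.append(",".join(fields))
        # The link keeps the original entry, unchanged, as the record.
        lines.append(",".join(["#THEORY", keyword, year, quarter, event, entry]))
    return lines


# Hypothetical entry pasted from another file.
for line in insert_links(["HOOF,John,Tenterden,5,277"], "1852", "2", "B",
                         surname="HOOK", comment="surname misprinted"):
    print(line)
```

The generated link lines can still be edited afterwards, for example to change the comment or the year and quarter.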

System Entry errors

If the link (in #THEORY,LINK) does not resolve to another entry in the database during the update, an entry is placed in a report that is produced. This report is available here.

Assisted Alignments

What is Alignment?

Alignment is the process that takes place during an update to match entries from different transcriptions to determine that they represent the same entry in the index. Logically the process puts two transcriptions side by side and attempts to get the best correspondence. The two transcriptions are then merged to give a composite, with equal entries shown as one double keyed entry and differing entries shown separately as single keyed. So in this example
Alignment example

most of the entries have been aligned except for the one where the given name is spelt differently.
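The idea of putting two transcriptions side by side and merging them can be sketched with Python's standard `difflib` module. This swaps in a generic sequence-matching technique to illustrate the concept; it is not the algorithm the update uses. The function name `align` and the two neighbouring entries around the Austin entry are hypothetical.

```python
from difflib import SequenceMatcher


def align(first, second):
    """Merge two transcriptions into one list of (entry, keying) pairs."""
    merged = []
    matcher = SequenceMatcher(a=first, b=second, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Identical in both transcriptions: one double keyed entry.
            merged += [(entry, "double") for entry in first[i1:i2]]
        else:
            # Differing entries are kept separately as single keyed.
            merged += [(entry, "single") for entry in first[i1:i2]]
            merged += [(entry, "single") for entry in second[j1:j2]]
    return merged


# Hypothetical transcriptions that differ in one given name.
file_a = ["Austin,William,Thanet,5,410",
          "Austin,Zechariah,Tenterden,5,277",
          "Avery,Ann,Dover,5,120"]
file_b = ["Austin,William,Thanet,5,410",
          "Austin,Zachariah,Tenterden,5,277",
          "Avery,Ann,Dover,5,120"]
for entry, keying in align(file_a, file_b):
    print(keying, entry)
```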

What is Misalignment?

Misalignment is where two entries look like they should be representing the same entry but they are not the same. In the above example Austin, Zechariah and Austin, Zachariah are misaligned. The update produces a report that gives for each quarter the misalignments the system has found. Some of these are accurate and some are not. Where there has been a simple typing error then the report has good results but where a transcriber has re-ordered entries it gives poor results.

What is Assisted Alignment?

So, the process of Assisted Alignment is where the system is told to align two (or more) entries that are not the same. In the above example it would be told to treat Austin, Zechariah and Austin, Zachariah as if they were the same; it is also told which one is to be displayed in the search results. Clicking on the Information button will show the original transcriptions with a note concerning the alignment of the entries.

How is Assisted Alignment done?

There are two methods for creating Assisted Alignments:

match two (or more) existing database entries
specify in a transcription file that an entry is to be aligned with another entry

Either method will cause the entries referenced to be aligned at the next update of the database.

In order to match existing database entries go to the alignments page and follow the instructions there. Entries can be selected either from the misalignments page or from the search results.

In order to align with an entry in a file add the following after the entry

#THEORY,ALIGN,entry

where entry is the entry that the entry in the file is to be aligned with. Both entries must be in the same quarter. The format for entry must be the same as in the rest of the file, including the year, quarter and event if present. The content of entry must be the same as the original transcription; for example, if the original has Roman numerals for the page then so must entry. There are two exceptions to this: letter case does not have to be the same and any number of spaces will match (hence Jack Jones will match with jack jones).

For example

Austin,Zachariah,Tenterden,5,277
#THEORY,ALIGN,Austin,Zechariah,Tenterden,5,277

would cause an entry for Zachariah to be put in the database and it would be aligned with the Zechariah entry. The entry being inserted (the Zachariah one in this case) is considered to be the correct one and is the one that will appear in the search results.
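The two exceptions to exact matching (letter case and the number of spaces) can be sketched in Python as follows. This is an interpretation rather than the system's actual comparison: the function name `align_target_matches` is hypothetical, and the sketch assumes that "any number of spaces" means a run of one or more spaces is treated as a single space.

```python
import re


def align_target_matches(align_entry, original):
    """Compare a #THEORY,ALIGN entry with an original transcription.

    Assumption: case is ignored and runs of spaces count as one space;
    everything else must be identical.
    """
    def normalise(text):
        return re.sub(r" +", " ", text).lower()

    return normalise(align_entry) == normalise(original)


print(align_target_matches("Jack   Jones", "jack jones"))
print(align_target_matches("Austin,Zechariah,Tenterden,V,277",
                           "Austin,Zechariah,Tenterden,5,277"))
```

The second comparison fails because a page or volume written in Roman numerals in the original must also be written that way in the #THEORY,ALIGN line.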

UCF Alignments

We said under Misalignments above that entries will not be aligned by the system unless they are the same. This is not exactly true. If two entries would be reported as misaligned (Austin,Zechariah and Austin,Zachariah in the above example) but they match according to UCF rules then they are considered to be the same. So in our example if Austin,Zechariah had been Austin,Z_chariah the entries would have been automatically aligned.

The system produces a report of alignments achieved in this way.

Note that UCF Alignments will not align entries unless those around them are aligned in the normal way. So if Austin,Zechariah and Austin,Z_chariah had been the only entries in each of two files they would not have been UCF aligned.
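The character comparison behind the Z_chariah example can be sketched in Python as below. The sketch covers only the underscore, on the assumption that it stands for exactly one uncertain character; other UCF rules and the requirement that neighbouring entries are already aligned are not modelled. The function name `ucf_match` is hypothetical.

```python
def ucf_match(first, second):
    """Return True if two entries agree, treating '_' as any one character.

    Assumption: '_' stands for exactly one uncertain character, and it
    may appear in either entry.
    """
    if len(first) != len(second):
        return False
    return all(a == b or a == "_" or b == "_"
               for a, b in zip(first, second))


print(ucf_match("Austin,Z_chariah", "Austin,Zachariah"))
print(ucf_match("Austin,Zechariah", "Austin,Zachariah"))
```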

Superchunks

What is a chunk, let alone a superchunk?

Once the entries from files have been aligned the result is a chunk. In a simple case the chunk would be slightly bigger than the biggest file (because some entries that refer to the same entry in the index are not identical). However, if there are several files involved the chunk could be larger, for example here we show a chunk that is larger than any of the three constituent files:
Superchunk example

When search results are presented there may be a change of colour which is described in the legend as "a possible discontinuity" in the data. This discontinuity is where data comes from different chunks because there is no guarantee that between two adjacent chunks there is not a missing set of entries (e.g. a page of the index).

So, what's a superchunk?

Once the chunks have been produced the update tries to stitch them together into superchunks consisting of a number of chunks. The rules for doing this are relatively pragmatic, using such things as

No gaps between page numbers (taking into account suffixes, e.g. 44a)
No gaps between filenames (e.g. 1852B20025 and 1852B20026)
Same surname at the end of one file and start of another (provided it is not a surname that has entries over more than a file)
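As an illustration of one of these rules, the following Python sketch tests whether two filenames leave no gap between them, as 1852B20025 and 1852B20026 do. It is a guess at how such a check might look, not the update's actual logic: the function name `filenames_contiguous` is hypothetical, and the sketch assumes a filename is a fixed prefix followed by a trailing number that increases by one. The page-number and surname rules are not covered.

```python
import re


def filenames_contiguous(first, second):
    """Return True if second directly follows first with no gap.

    Assumption: a filename is a prefix followed by a trailing number,
    e.g. '1852B' + '20025', and the prefix must be identical.
    """
    m1 = re.fullmatch(r"(.*?)(\d+)", first)
    m2 = re.fullmatch(r"(.*?)(\d+)", second)
    if not m1 or not m2:
        return False
    same_prefix = m1.group(1) == m2.group(1)
    return same_prefix and int(m2.group(2)) == int(m1.group(2)) + 1


print(filenames_contiguous("1852B20025", "1852B20026"))
print(filenames_contiguous("1852B20025", "1852B20027"))
```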

Why is a superchunk useful?

The objective is to get every quarter to have just a single superchunk which would mean that there were no pages missing and no extraneous pages (e.g. pages that belong to another quarter). The update produces a report that shows what superchunks have been produced so we can investigate the places where superchunks have "broken" and thus find gaps and errors in the transcriptions. The two figures after each quarter are the number of superchunks and the number of chunks - the objective is to get the first to be 1!

How is the number of superchunks reduced?

Basically by looking at the report, working out why the chunks have not been stitched together and correcting the issue. Typical reasons are:

Transcription from the wrong quarter
Random transcriptions coded as Onename or Sequenced
Wrong page number


Search engine, layout and database Copyright © 1998-2022 Free UK Genealogy CIO, a charity registered in England and Wales, Number 1167484.
We make no warranty whatsoever as to the accuracy or completeness of the FreeBMD data.
Use of the FreeBMD website is conditional upon acceptance of the Terms and Conditions