TL03: Computer Science and Statistical Theory in Record Linkage: Will One Continue to Help the Other?

Yves Thibaudeau, Presenting Author
U.S. Census Bureau
 
Tuesday, Aug 6: 7:00 AM - 8:15 AM
2392 
Roundtables – Breakfast 
Oregon Convention Center 
On the surface, record linkage and pattern matching are basic exercises in
computation that can be automated and need not have anything to do with
probability or statistics. It quickly became evident, through the works of
authors like Tepping and Newcombe (1967), that managing uncertainty is an
integral part of record linkage and that correctly implemented probabilistic
concepts are central to it. In 1969, Fellegi and Sunter made a breakthrough by
presenting a cohesive probabilistic framework for record linkage (RL) that
remains at the heart of many modern RL systems. Nevertheless, the algorithmic
nature of RL means computer scientists excel at independently improving RL
systems and developing new ones. The advent of Bayesian record-linkage methods
has improved the state of the art and given a second wind to probabilistic
approaches. We will debate the importance of new probabilistic and statistical
theories going forward. Are statistical theories likely to continue spawning
new RL techniques? Or are new developments in computer science going to replace
the need for them? Where are the statistical innovations likely to come from?
Frequentist or Bayesian theory? Partitioning?
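The Fellegi-Sunter framework mentioned above can be sketched in a few lines: each field comparison contributes a log-likelihood ratio of agreement under the match versus non-match hypotheses, and the summed weight is compared against two thresholds to decide link, non-link, or clerical review. The field names, m- and u-probabilities, and thresholds below are purely hypothetical, chosen for illustration rather than estimated from data.

```python
# Minimal sketch of the Fellegi-Sunter decision rule.
# All probabilities and thresholds here are hypothetical illustrations.
from math import log2

# Per-field conditional agreement probabilities (assumed values):
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are a non-match)
M_PROBS = {"surname": 0.95, "first_name": 0.90, "birth_year": 0.85}
U_PROBS = {"surname": 0.01, "first_name": 0.05, "birth_year": 0.10}

def match_weight(agreements):
    """Sum log2 likelihood ratios over a comparison vector of field agreements."""
    total = 0.0
    for field, agrees in agreements.items():
        m, u = M_PROBS[field], U_PROBS[field]
        # Agreement contributes log2(m/u); disagreement log2((1-m)/(1-u)).
        total += log2(m / u) if agrees else log2((1 - m) / (1 - u))
    return total

def decide(weight, upper=8.0, lower=-4.0):
    """Classify a record pair as link, non-link, or possible link (review)."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "possible link"

# Example pair: names agree, birth year disagrees.
pair = {"surname": True, "first_name": True, "birth_year": False}
print(decide(match_weight(pair)))
```

The two thresholds are what give the framework its statistical character: they are chosen to bound the false-link and false-non-link rates, with the middle region routed to clerical review.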

Keywords

Computer Science Algorithms

Probability and Statistics

Bayesian Record Linkage

Partitioning

Main Sponsor

Record Linkage Interest Group
Section on Statistical Computing