Data Fusion: Calculation of Feasible Correlations

Chris Moriarity First Author
 
Chris Moriarity Presenting Author
 
Sunday, Aug 3: 2:50 PM - 3:05 PM
1729 
Contributed Papers 
Music City Center 
I have described a method (2001, 2003, 2004, 2009, 2010) for merging two independent samples using data fusion (also known as statistical matching). One sample contains (X,Z) and the other contains (X,Y), both drawn from a common nonsingular normal (X,Y,Z) distribution. Following Kadane (1978) and Rubin (1986), I employ regression in my approach. Because the (Y,Z) relationship is never observed, the merge introduces uncertainty, which I assess by repeating the procedure over a range of values for the (Y,Z) relationship that are consistent with the observed data. An essential part of my algorithm is to add random residuals to the regression estimates. My initial approach for estimating the residual variance could fail by producing a negative estimate, because it subtracted estimates obtained from the two files. An innovation due to Raessler and Kiesl (2009) gives improved results for estimating the residual variance, solving one of the two open problems in the paradigm. The remaining open problem was determining the area of feasible correlations between Y and Z when both Y and Z are multivariate. Building on the foundation described by Kiesl and Raessler (2006), the solution to this problem is now known.
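As a point of orientation only, the following minimal sketch shows the simplest special case, in which X, Y, and Z are each univariate. It is not the multivariate solution the abstract refers to. It relies on the standard fact that a 3x3 correlation matrix is positive semidefinite exactly when corr(Y,Z) lies within rho_xy * rho_xz plus or minus sqrt((1 - rho_xy^2)(1 - rho_xz^2)). The function names `feasible_yz_range` and `feasible_yz_grid` are hypothetical and chosen for illustration.

```python
import math


def feasible_yz_range(rho_xy, rho_xz):
    """Return (low, high) bounds on corr(Y, Z) for univariate X, Y, Z.

    Assumption: all three variables are scalar, so the bounds follow from
    requiring a nonnegative determinant of the 3x3 correlation matrix.
    """
    center = rho_xy * rho_xz
    half_width = math.sqrt((1.0 - rho_xy ** 2) * (1.0 - rho_xz ** 2))
    return center - half_width, center + half_width


def feasible_yz_grid(rho_xy, rho_xz, steps=5):
    """Evenly spaced corr(Y, Z) values spanning the feasible interval.

    A grid like this is one way to repeat a matching procedure over
    values consistent with the observed (X, Y) and (X, Z) correlations.
    """
    low, high = feasible_yz_range(rho_xy, rho_xz)
    return [low + (high - low) * i / (steps - 1) for i in range(steps)]


if __name__ == "__main__":
    # Illustrative inputs, not taken from any data set.
    print(feasible_yz_range(0.6, 0.8))
    print(feasible_yz_grid(0.6, 0.8, steps=5))
```

The interval narrows as X explains more of Y and Z, which is why strong common variables reduce the uncertainty of the merge. When Y and Z are multivariate, the feasible set is a region rather than an interval, and this sketch does not cover it.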

Keywords

statistical matching 

Main Sponsor

Survey Research Methods Section