An Optimal Two-step Estimation Approach for Two-phase Studies

Kin Yau Wong Co-Author
The Hong Kong Polytechnic University
 
Qingning Zhou Speaker
 
Monday, Aug 5: 9:00 AM - 9:25 AM
Invited Paper Session 
Oregon Convention Center 

Description

Two-phase sampling is commonly adopted to reduce cost and improve estimation efficiency. We consider the two-phase design in which the outcome and some cheap covariates are observed for the whole cohort at Phase I, and expensive covariates are obtained for a selected subset of the cohort at Phase II. Because the expensive covariates are unobserved outside the Phase II sample, analyzing the association between the outcome and covariates is a missing data problem. The complete-case analysis, which uses only the Phase II sample, is generally inefficient. In this work, we develop a two-step estimation approach, which first obtains an estimator based on the complete data (the Phase II sample) and then updates it using an asymptotically mean-zero estimator obtained from a working model between the outcome and cheap covariates based on the full data (the Phase I cohort). The two-step estimator is asymptotically at least as efficient as the complete-data estimator and is robust to misspecification of the working model. We propose a kernel-based method to construct a two-step estimator that achieves optimal efficiency, and also develop a simple joint update approach based on multiple working models to approximate the optimal estimator. We apply the proposed method to various outcome models for illustration.
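
The two-step idea described above can be illustrated with a minimal sketch. It is not the authors' kernel-based or joint update method. It assumes a linear outcome model, a single linear working model of the outcome on the cheap covariate, and a Phase II sample drawn by simple random sampling. All names (ols_with_influence, two_step_estimate, the simulated data and coefficients) are hypothetical and chosen only for illustration. The update subtracts from the complete-case estimate the part that is predictable from the difference between the working-model fit on the Phase II sample and on the full cohort, a difference that is asymptotically mean-zero under these assumptions.

```python
import numpy as np

def ols_with_influence(design, y):
    """OLS coefficients and per-observation influence functions."""
    n = design.shape[0]
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    bread = np.linalg.inv(design.T @ design / n)
    infl = (design * resid[:, None]) @ bread  # row i: bread @ x_i * e_i
    return coef, infl

def two_step_estimate(y, z, x, in_phase2):
    """Complete-case OLS estimate and a one-working-model two-step update.

    Assumes Phase II is a simple random sample of the Phase I cohort.
    """
    s = in_phase2
    N, n = y.shape[0], int(s.sum())
    ones = np.ones(N)
    work_design = np.column_stack([ones, z])  # working model: y ~ z
    target_design = np.column_stack([ones[s], z[s], x[s]])  # y ~ z + x

    # Step 1: complete-case estimator from the Phase II sample.
    beta_cc, psi = ols_with_influence(target_design, y[s])

    # Working model fitted on Phase II and on the full cohort;
    # their difference is asymptotically mean-zero.
    gamma_cc, phi = ols_with_influence(work_design[s], y[s])
    gamma_full, _ = ols_with_influence(work_design, y)
    d = gamma_cc - gamma_full

    # Step 2: remove the component of beta_cc that is predictable from d.
    C = psi.T @ phi / n  # estimated Cov(psi, phi)
    V = phi.T @ phi / n  # estimated Var(phi)
    beta_ts = beta_cc - C @ np.linalg.solve(V, d)

    var_cc = psi.T @ psi / n**2
    var_ts = var_cc - (1.0 / n - 1.0 / N) * C @ np.linalg.solve(V, C.T)
    return beta_cc, beta_ts, var_cc, var_ts

# Simulated cohort (illustrative values only).
rng = np.random.default_rng(0)
N, n = 5000, 1000
z = rng.normal(size=N)                         # cheap covariate
x = 0.8 * z + 0.6 * rng.normal(size=N)         # expensive covariate
y = 1.0 + 0.5 * z + 2.0 * x + rng.normal(size=N)

s = np.zeros(N, dtype=bool)
s[rng.choice(N, size=n, replace=False)] = True
x_obs = np.where(s, x, np.nan)                 # x is missing outside Phase II

beta_cc, beta_ts, var_cc, var_ts = two_step_estimate(y, z, x_obs, s)
print("complete-case:", beta_cc, "SE:", np.sqrt(np.diag(var_cc)))
print("two-step:     ", beta_ts, "SE:", np.sqrt(np.diag(var_ts)))
```

In this sketch the estimated variance of the updated estimator is never larger than that of the complete-case estimator, because the correction term is a positive semidefinite matrix. Under a different Phase II selection scheme the influence functions would need inverse-probability weights, which this sketch omits.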