Combining Probability and Non-Probability Data: Considerations, Methods, and Applications

Abstract Number:

1359 

Submission Type:

Invited Paper Session 

Participants:

Morgan Earp (1), Katherine Irimata (1), Wendy Van de Kerckhove (2), Matt Williams (3), Michael Yang (4), Jon Krosnick (5)

Institutions:

(1) National Center for Health Statistics, (2) Westat, (3) RTI, (4) NORC at The University of Chicago, (5) Stanford University

Chair:

Morgan Earp  
National Center for Health Statistics

Co-Organizer:

Katherine Irimata  
National Center for Health Statistics

Session Organizer:

Morgan Earp  
National Center for Health Statistics

Speaker(s):

Wendy Van de Kerckhove  
Westat
Matt Williams  
RTI
Michael Yang  
NORC at The University of Chicago
Jon Krosnick  
Stanford University
Katherine Irimata  
National Center for Health Statistics

Session Description:

Probability surveys have long been the gold standard for collecting data for population statistics and inference, particularly at federal statistical agencies. However, rising costs and increasing nonresponse have pushed survey statisticians to consider supplemental or alternative data sources. Non-probability data, including administrative records, commercial data, and non-probability surveys, have limitations such as lack of representativeness and sampling bias, but they are often faster and less expensive to obtain and are increasingly prevalent. In addition, non-probability data can be used to target specific subpopulations of interest for which information may be difficult to obtain through probability surveys. As a result, survey statisticians have increasingly considered combining the two types of data to draw on the strengths of both. Statisticians across federal statistical agencies and external research organizations have identified a variety of approaches for combining probability surveys with non-probability data for many purposes, including improving precision and improving the representativeness of target subpopulations. In this session, speakers introduce considerations and methods for combining probability and non-probability data and discuss a range of applications where this approach has been used.

Titles/Authors

Utilizing Data from an Incomplete Sample to Supplement the Probability-Based U.S. PIAAC Cycle II - Wendy Van de Kerckhove (Westat), Tom Krenzke (Westat)

A look at propensity-based methods for combining probability and non-probability sample data - Matt Williams (RTI), Jill Dever (RTI)

Comparing alternative estimation methods using combined probability and nonprobability samples - Michael Yang (NORC), Stas Kolenikov (NORC), David Dutwin (NORC)

A new evaluation of the impact of combining probability and non-probability sample data - Jon Krosnick (Stanford University)

Leveraging Non-Probability Data at the National Center for Health Statistics - Katherine Irimata (National Center for Health Statistics)

Sponsors:

Government Statistics Section 2
Social Statistics Section 3
Survey Research Methods Section 1

Theme: Statistics and Data Science: Informing Policy and Countering Misinformation

Yes

Applied

Yes

Estimated Audience Size

Medium (80-150)

I have read and understand that JSM participants must abide by the Participant Guidelines.

Yes

I understand and have communicated to my proposed speakers that JSM participants must register and pay the appropriate registration fee by June 1, 2024. The registration fee is nonrefundable.

I understand