The utility of big data for evaluating public opinion

Michael Robbins First Author
RAND Corporation
 
Michael Robbins Presenting Author
RAND Corporation
 
Tuesday, Aug 6: 10:05 AM - 10:20 AM
2588 
Contributed Papers 
Oregon Convention Center 
Social media data sources like Twitter (now X) provide a wealth of information that could be used to evaluate public opinion in real time. However, Twitter users (in particular the most vocal ones) are not representative of the general population. Characteristics that would traditionally be used for weighting to generalize such non-representative data (e.g., demographics) are unknown for Twitter users. By combining the results of two surveys, we show how proxies for such characteristics can be developed for any Twitter user, and we illustrate the use of these proxies in developing weights that generalize a large universe of Twitter users. Large language models are used to evaluate the sentiment of posts about Donald Trump made by our universe of Twitter users. The sentiment analysis, in combination with the statistical weighting, is used to track Trump's approval rating from 9/20/2020 to 1/20/2021.
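
To make the idea of combining weights with sentiment scores concrete, the following is a minimal sketch, not the authors' procedure. The abstract does not say which weighting technique is used, so simple post-stratification on a single proxy characteristic is assumed here. All names (`poststratification_weights`, `weighted_approval`, the cell labels) and all numbers are hypothetical, toy values chosen only for illustration.

```python
from collections import Counter


def poststratification_weights(user_cells, population_shares):
    """Weight each user by (population share of cell) / (sample share of cell).

    user_cells: maps user id -> proxy demographic cell (hypothetical labels).
    population_shares: maps cell -> share of the target population (assumed known).
    """
    counts = Counter(user_cells.values())
    n = len(user_cells)
    return {
        user: population_shares[cell] / (counts[cell] / n)
        for user, cell in user_cells.items()
    }


def weighted_approval(user_sentiment, weights):
    """Weighted mean of per-user sentiment (1 = favorable, 0 = unfavorable)."""
    total = sum(weights[u] for u in user_sentiment)
    return sum(weights[u] * s for u, s in user_sentiment.items()) / total


# Toy data (invented): three younger users and one older user,
# while the assumed target population is split evenly.
user_cells = {"u1": "under_40", "u2": "under_40", "u3": "under_40", "u4": "40_plus"}
population_shares = {"under_40": 0.5, "40_plus": 0.5}
user_sentiment = {"u1": 1, "u2": 0, "u3": 0, "u4": 1}

weights = poststratification_weights(user_cells, population_shares)
unweighted = sum(user_sentiment.values()) / len(user_sentiment)
weighted = weighted_approval(user_sentiment, weights)

print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

In this toy case the over-represented group is down-weighted, so the weighted estimate differs from the raw share of favorable users. Repeating the calculation for each day or week in a window would give a tracked series.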

Keywords

Weighting

Big data

Sentiment analysis 

Main Sponsor

Survey Research Methods Section