Text Analysis for Statisticians: Introduction to Advanced Language Modeling
Robin Cosbey
Instructor
Pacific Northwest National Laboratory
Saturday, Aug 3: 8:30 AM - 5:00 PM
CE_03C
Professional Development Course/CE
Oregon Convention Center
Room: A106
This course will provide a broad overview of text analysis and natural language processing (NLP), including a significant amount of introductory material with extensions to state-of-the-art methods. All aspects of the text analysis pipeline will be covered, including data preprocessing, converting text to numeric representations (from simple aggregation methods to more complex embeddings), and training supervised and unsupervised learning methods for standard text-based tasks such as named entity recognition (NER), sentiment analysis, topic modeling, and text generation using large language models (LLMs). The course will alternate between presentations and hands-on exercises in Python. Translations from Python to R will be provided for students more comfortable in that language, and support will be given for both Mac and Windows users. Attendees should be familiar with Python (preferred), R, or both, and have a basic understanding of statistics and/or machine learning. Attendees will gain the practical skills necessary to begin using text analysis tools for their own tasks, an understanding of the strengths and weaknesses of these tools, and an appreciation for the ethical considerations of using them in practice.
Main Sponsor
Section on Text Analysis
Co Sponsors
Section on Statistical Learning and Data Science
Section on Statistics in Defense and National Security