06. AI-Driven Patent Data Extraction and Analysis for Agricultural Patents
Conference: Women in Statistics and Data Science 2025
11/13/2025: 2:30 PM - 4:00 PM EST
Speed
The AI-Driven Patent Data Extraction and Analysis System for Corteva Agriscience was a research project under The Data Mine at Purdue University. A team of nine undergraduate and graduate students designed the system to retrieve, extract, and analyze agricultural patents related to crop protection. The project combined large language models (LLMs) with tools for data extraction and structured search. These capabilities allow scientists and researchers to access, extract, and analyze patent data efficiently, supporting faster and better-informed decision-making.
Project Objectives:
1. Patent Retrieval Development – Developed a system for retrieving patents directly from Google Patents.
2. Automated Data Extraction – Developed a tool that extracts patent metadata and relevant content from the examples section of each patent, converts it into a structured table, and makes that table downloadable for further analysis.
3. Interactive Chat Module – Implemented an LLM-based chatbot that helps scientists answer intellectual property (IP) queries.
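As an illustration of the second objective, the following is a minimal sketch of how extracted patent records might be flattened into a downloadable table. It is not the project's implementation: the column names, the record layout, the function names `flatten_patent` and `patents_to_csv`, and the sample record are all hypothetical placeholders, since the abstract does not describe the actual schema.

```python
import csv
import io

# Hypothetical column names; the abstract does not specify the real schema.
COLUMNS = ["patent_number", "title", "example_id", "example_text"]


def flatten_patent(patent):
    """Return one table row per example found in a single patent record."""
    rows = []
    for i, text in enumerate(patent.get("examples", []), start=1):
        rows.append({
            "patent_number": patent["patent_number"],
            "title": patent["title"],
            "example_id": i,
            # Collapse line breaks and repeated spaces left by text extraction.
            "example_text": " ".join(text.split()),
        })
    return rows


def patents_to_csv(patents):
    """Serialize a list of patent records to CSV text ready for download."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for patent in patents:
        writer.writerows(flatten_patent(patent))
    return buffer.getvalue()


# Placeholder record for illustration only; not real patent data.
sample = [{
    "patent_number": "US0000000A1",
    "title": "Placeholder herbicidal composition",
    "examples": [
        "Example 1:  compound A was applied\n at 50 g/ha.",
        "Example 2: control plot.",
    ],
}]

csv_text = patents_to_csv(sample)
print(csv_text)
```

Emitting one row per example, rather than one row per patent, keeps each experimental example individually filterable in a spreadsheet while the repeated metadata columns tie it back to its source patent.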
Patent Data Extraction
Large Language Models (LLMs)
Intellectual Property Analytics
Structured Data Retrieval
Student-Led Research
Data-Driven Decision Making
Presenting Author
Srishti Maurya
First Author
Srishti Maurya
CoAuthor(s)
Anna Bajszczak
Lina Im, Student
Target Audience
Beginner
Tracks
Knowledge