Accessing United States Bulk Patent Data with patentpy and patentr

James Yu, Hayley Beltz, Milind Y. Desai, Péter Érdi, Jacob G. Scott, Raoul R. Wadhwa

The United States Patent and Trademark Office (USPTO) provides publicly accessible bulk data files containing information for all patents from 1976 onward. However, the format of these files changes over time and is memory-inefficient, which can pose issues for individual researchers. Here, we introduce the patentpy and patentr packages for the Python and R programming languages, respectively. They allow users to programmatically fetch bulk data from the USPTO website and access it locally in a cleaned, rectangular format. Research depending on United States patent data would benefit from the use of patentpy and patentr. We describe the packages' implementation and quality control mechanisms, and present use cases highlighting simple yet effective applications of this software.
