GitHub - vishvapalsinh/Patent-classification: Classifying a patent using an AI at subclass level (600+ labels)

Patent classification

This project implements patent classification at the subclass level of the
IPC and CPC systems. There are more than 600 classes in total.

The project pipeline has three steps:

Extract dataset
EDA of the dataset
Train a model

A Jupyter notebook is shared for each of the above tasks.

The dataset for the classification task was generated with Google BigQuery and stored as CSV files, one file per year from 2009 to 2019. The dataset is publicly available for experimentation. The columns of the CSV files are shown in the table below:

ID	Date	Title	Claim	cpc_subclass
8844051	2014-09-23	Lithium-ion secondary battery	A lithium-ion secondary battery comprising ...	H01M,Y02E,Y02T
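
Because a patent can carry several subclasses in the cpc_subclass column, the task is multi-label. The following is a minimal sketch of how one row could be parsed and turned into a multi-hot target vector. It uses only the Python standard library, and it assumes the files are comma-delimited with a header row matching the column names above and with the multi-value field quoted. The helper names (read_patents, build_label_index, multi_hot) are hypothetical and not taken from the notebooks.

```python
import csv
import io

# Hypothetical sample mirroring the table above. Assumption: the real files
# are comma-delimited, have a header row, and quote the multi-value field.
SAMPLE_CSV = '''ID,Date,Title,Claim,cpc_subclass
8844051,2014-09-23,Lithium-ion secondary battery,A lithium-ion secondary battery comprising ...,"H01M,Y02E,Y02T"
'''


def read_patents(handle):
    """Yield one dict per patent with cpc_subclass split into a list of labels."""
    for row in csv.DictReader(handle):
        row["cpc_subclass"] = [
            label.strip() for label in row["cpc_subclass"].split(",") if label.strip()
        ]
        yield row


def build_label_index(rows):
    """Map every subclass seen in the rows to a column position (sorted order)."""
    labels = sorted({label for row in rows for label in row["cpc_subclass"]})
    return {label: position for position, label in enumerate(labels)}


def multi_hot(labels, label_index):
    """Return a 0/1 list with a 1 at the position of each label of one patent."""
    vector = [0] * len(label_index)
    for label in labels:
        vector[label_index[label]] = 1
    return vector


rows = list(read_patents(io.StringIO(SAMPLE_CSV)))
label_index = build_label_index(rows)
vector = multi_hot(rows[0]["cpc_subclass"], label_index)
```

On the full dataset the label index would have one column per subclass, so each target vector would have more than 600 entries, most of them zero.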

The download links for each year are listed below.

2009 CSV Link
2010 CSV Link
2011 CSV Link
2012 CSV Link
2013 CSV Link
2014 CSV Link
2015 CSV Link
2016 CSV Link
2017 CSV Link
2018 CSV Link
2019 CSV Link
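
Once downloaded, the yearly files can be read into one collection. The sketch below shows one possible way to do that with the standard library. The file naming pattern ("2009.csv", "2010.csv", ...) and the load_years helper are assumptions made for illustration; adjust the pattern to whatever names the downloaded files actually have.

```python
import csv
from pathlib import Path


def load_years(data_dir, years=range(2009, 2020), pattern="{year}.csv"):
    """Read the yearly CSV files found in data_dir.

    Returns (rows, missing_years). The file name pattern is a hypothetical
    default, not the naming used by the download links.
    """
    rows, missing = [], []
    for year in years:
        path = Path(data_dir) / pattern.format(year=year)
        if not path.exists():
            # Record the gap instead of failing, so partial downloads still load.
            missing.append(year)
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["year"] = year  # remember which file the row came from
                rows.append(row)
    return rows, missing
```

Keeping the source year on each row makes it straightforward to hold out specific years later, for example when exploring the data or evaluating a model.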

DL_model
Data_extraction
EDA
ML_model
TL_model
README.md
