Multi-view Multi-class Datasets

This repo collects benchmark datasets for evaluating multi-view, multi-class machine learning algorithms.

📄 Statistics of Datasets

📢 More information about the datasets can be found in [Google Sheets | Tencent Docs].

No.	Datasets	#Samples	#Classes	#Views	Tag	Reference
1	100Leaves	1,600	100	3		Plant leaf classification using probabilistic integration of shape, texture and margin features
2	Caltech101-7	1,474	7	6	`imbalance`	Large-scale multi-view spectral clustering via bipartite graph
3	Caltech101-20	2,386	20	6	`imbalance`	Deep Incomplete Multi-View Learning Network with Insufficient Label Information
4	Caltech101	9,144	102	6	`imbalance`	Binary Multi-View Clustering
5	Deep Caltech101	8,677	101	2	`imbalance`	Trusted Multi-View Classification
6	Caltech256	30,607	257	3	`imbalance`	Auto-weighted Multi-view Clustering for Large-scale Data
7	Deep AWA_2views	10,158	50	2	`imbalance`	Deep Partial Multi-View Learning
8	Reuters_2views	18,758	6	2	`imbalance`	Multi-view Spectral Clustering Network
9	NoisyMNIST	70,000	10	2		Robust Multi-View Clustering With Incomplete Information
10	NoisyMNIST	30,000	10	2		Robust Multi-View Clustering With Incomplete Information
11	MNIST-USPS	5,000	10	2		Robust Multi-View Clustering With Incomplete Information
12	Scene15	4,485	15	3		Ensemble projection for semi-supervised image classification
13	Out-Scene	2,688	8	4		Deep Incomplete Multi-View Learning Network with Insufficient Label Information
14	NUS-WIDE	30,000	31	5	`imbalance`	Fast Multi-view Clustering via Ensembles: Towards Scalability, Superiority, and Simplicity

✨ We have collated some publicly available datasets, which you can download from Baidu Netdisk. The data format is as follows:

xxx.mat
|---gnd: matrix, double, class labels starting from 1, (sample_number, 1).
|---X: cell, (1, view_num)
|---|---X{i}: matrix, double, (sample_number, feature_dimension).
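
A minimal sketch of reading a file in this layout from Python, assuming SciPy is installed and the file is saved in a MATLAB format that `scipy.io.loadmat` can read (it does not read v7.3/HDF5 files, which would need an HDF5 reader instead). The helper names `write_toy_mat` and `load_multiview_mat` and the toy file are hypothetical and only serve the illustration; the shift to 0-based labels is a convenience for Python libraries, not something the datasets require.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat


def write_toy_mat(path, n_samples=6, dims=(4, 3)):
    """Write a small synthetic file with the same layout (hypothetical helper)."""
    rng = np.random.default_rng(0)
    # A NumPy object array is stored by savemat as a MATLAB cell array.
    X = np.empty((1, len(dims)), dtype=object)
    for i, d in enumerate(dims):
        X[0, i] = rng.standard_normal((n_samples, d))
    # Labels are doubles, start from 1, and have shape (sample_number, 1).
    gnd = (np.arange(n_samples) % 3 + 1).astype(np.float64).reshape(-1, 1)
    savemat(path, {"X": X, "gnd": gnd})


def load_multiview_mat(path):
    """Return (views, labels) from a file with fields `X` and `gnd`."""
    data = loadmat(path)
    # X is a (1, view_num) cell; each entry is (sample_number, feature_dimension).
    views = [np.asarray(v, dtype=np.float64) for v in data["X"].ravel()]
    # Flatten gnd and shift the 1-based labels to 0-based integers.
    labels = data["gnd"].ravel().astype(np.int64) - 1
    return views, labels


with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, "toy.mat")
    write_toy_mat(toy_path)
    views, labels = load_multiview_mat(toy_path)

print([v.shape for v in views], labels)
```

For a real dataset, replace `toy_path` with the path to the downloaded `.mat` file.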

🔥 Update

[2024/08/12] Uploaded the script for plotting label distributions: label_distribution/plot_label_distribution.ipynb.
[2024/08/08] Created a share link to the datasets we have collected from the Internet for public research. [Baidu Netdisk]

🌋 Modality Evaluation

We adopt an SVM as a simple baseline to evaluate how much each modality (view) contributes to the classification task.
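
One way such a per-view baseline could look is sketched below with scikit-learn. This is not the evaluation script behind this repo: the RBF kernel, the feature standardization, the 80/20 stratified split, and the function name `per_view_svm_accuracy` are all assumptions made for illustration, and the data is synthetic.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def per_view_svm_accuracy(views, labels, test_size=0.2, seed=0):
    """Train one SVM per view on a shared split and return test accuracies."""
    idx = np.arange(len(labels))
    # Use the same stratified split for every view so the scores are comparable.
    train_idx, test_idx = train_test_split(
        idx, test_size=test_size, stratify=labels, random_state=seed
    )
    scores = []
    for X in views:
        # Assumed setup: standardize features, then fit an RBF-kernel SVM.
        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
        clf.fit(X[train_idx], labels[train_idx])
        scores.append(clf.score(X[test_idx], labels[test_idx]))
    return scores


# Synthetic example: view 0 carries the class signal, view 1 is pure noise.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 40)
informative = rng.standard_normal((120, 5)) + 4.0 * labels[:, None]
noise = rng.standard_normal((120, 5))

scores = per_view_svm_accuracy([informative, noise], labels)
for i, acc in enumerate(scores):
    print(f"view {i}: accuracy = {acc:.3f}")
```

Comparing the per-view accuracies gives a rough picture of which views are most useful on their own; it says nothing about how views complement each other when combined.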

📊 Label Distribution

📢 More figures can be found in the folder label_distribution!
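
For a quick numerical look at a label distribution without plotting, the per-class counts can be computed directly from `gnd`. The sketch below is not the notebook in label_distribution; the function names are hypothetical, and the max/min ratio is only an illustrative measure, not necessarily the criterion behind the `imbalance` tag in the table.

```python
import numpy as np


def label_distribution(gnd):
    """Map each class label to its number of samples."""
    labels = np.asarray(gnd).ravel().astype(np.int64)
    classes, counts = np.unique(labels, return_counts=True)
    return dict(zip(classes.tolist(), counts.tolist()))


def imbalance_ratio(gnd):
    """Largest class size divided by smallest class size (illustrative measure)."""
    counts = label_distribution(gnd).values()
    return max(counts) / min(counts)


# Toy labels in the (sample_number, 1) layout, starting from 1.
gnd = np.array([[1.0], [1.0], [1.0], [2.0], [3.0], [3.0]])
dist = label_distribution(gnd)
ratio = imbalance_ratio(gnd)
print(dist, ratio)
```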

Acknowledgements

Some datasets were downloaded from these sites, for which we are very grateful:

[1] https://github.com/liujiyuan13/mvdata

[2] https://github.com/wangsiwei2010/large_scale_multi-view_clustering_datasets
