This repo collects benchmark datasets for evaluating multi-view, multi-class machine learning algorithms.
📢 More information about the datasets can be found in [Google Sheets | Tencent Docs].
No. | Datasets | #Samples | #Classes | #Views | Tag | Reference |
---|---|---|---|---|---|---|
1 | 100Leaves | 1,600 | 100 | 3 | | Plant leaf classification using probabilistic integration of shape, texture and margin features |
2 | Caltech101-7 | 1,474 | 7 | 6 | imbalance | Large-scale multi-view spectral clustering via bipartite graph |
3 | Caltech101-20 | 2,386 | 20 | 6 | imbalance | Deep Incomplete Multi-View Learning Network with Insufficient Label Information |
4 | Caltech101 | 9,144 | 102 | 6 | imbalance | Binary Multi-View Clustering |
5 | Deep Caltech101 | 8,677 | 101 | 2 | imbalance | Trusted Multi-View Classification |
6 | Caltech256 | 30,607 | 257 | 3 | imbalance | Auto-weighted Multi-view Clustering for Large-scale Data |
7 | Deep AWA_2views | 10,158 | 50 | 2 | imbalance | Deep Partial Multi-View Learning |
8 | Reuters_2views | 18,758 | 6 | 2 | imbalance | Multi-view Spectral Clustering Network |
9 | NoisyMNIST | 70,000 | 10 | 2 | | Robust Multi-View Clustering With Incomplete Information |
10 | NoisyMNIST | 30,000 | 10 | 2 | | Robust Multi-View Clustering With Incomplete Information |
11 | MNIST-USPS | 5,000 | 10 | 2 | | Robust Multi-View Clustering With Incomplete Information |
12 | Scene15 | 4,485 | 15 | 3 | | Ensemble projection for semi-supervised image classification |
13 | Out-Scene | 2,688 | 8 | 4 | | Deep Incomplete Multi-View Learning Network with Insufficient Label Information |
14 | NUS-WIDE | 30,000 | 31 | 5 | imbalance | Fast Multi-view Clustering via Ensembles: Towards Scalability, Superiority, and Simplicity |
✨ We have collated some publicly available datasets and you can download them from Baidu Netdisk. The data format is as follows:
xxx.mat
|---gnd: matrix, double, start from 1, (sample_number, 1).
|---X: cell, (1, view_num)
|---|---X{i}: matrix, double, (sample_number, feature_dimension).
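As a rough illustration of the layout above, the following sketch reads one such file in Python with `scipy.io.loadmat`. The helper name `load_multiview` is hypothetical and not part of this repo. It assumes every view is stored as a dense double matrix, and it shifts the 1-based `gnd` labels to 0-based integers, which most Python libraries expect. Files saved in MATLAB v7.3 format are HDF5-based and are not read by `loadmat`, so they would need a different reader.

```python
import numpy as np
from scipy.io import loadmat


def load_multiview(path):
    """Load a multi-view .mat file laid out as described above.

    Returns (views, labels): a list of (sample_number, feature_dimension)
    arrays, one per view, and a 0-based integer label vector.
    """
    mat = loadmat(path)
    # X is a MATLAB cell of shape (1, view_num); scipy exposes it as an
    # object array whose entries are the per-view matrices.
    views = [np.asarray(v, dtype=np.float64) for v in mat["X"].ravel()]
    # gnd is (sample_number, 1), double, starting from 1.
    labels = mat["gnd"].ravel().astype(np.int64) - 1
    # Every view must describe the same samples, in the same order.
    for i, view in enumerate(views):
        if view.shape[0] != labels.shape[0]:
            raise ValueError(
                f"view {i} has {view.shape[0]} rows, expected {labels.shape[0]}"
            )
    return views, labels
```

Calling `views, labels = load_multiview("xxx.mat")` would then give one array per view, so `len(views)` should match the #Views column and `views[0].shape[0]` the #Samples column for that dataset.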
- [2024/08/12] Uploaded the script for plotting label distributions: label_distribution/plot_label_distribution.ipynb
- [2024/08/08] Created a share link to the datasets we have collected from the Internet for public research. [Baidu Netdisk]
We use an SVM as a simple baseline to evaluate how much each view (modality) contributes to the classification task.
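A per-view SVM baseline of this kind could look roughly like the sketch below, written with scikit-learn. It is only an illustration under stated assumptions: the function name `per_view_svm_accuracy`, the 70/30 stratified split, the feature standardization, and the RBF kernel with default hyperparameters are all choices made here, not the settings used for the results in this repo.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def per_view_svm_accuracy(views, labels, test_size=0.3, seed=0):
    """Train one SVM per view and return the held-out accuracy of each.

    The split ratio, scaling, and RBF kernel are illustrative assumptions.
    """
    labels = np.asarray(labels).ravel()
    # Use one shared split so every view is scored on the same samples.
    idx_train, idx_test = train_test_split(
        np.arange(labels.shape[0]),
        test_size=test_size,
        random_state=seed,
        stratify=labels,
    )
    accuracies = []
    for view in views:
        view = np.asarray(view, dtype=np.float64)
        # Standardize features, then fit an RBF-kernel SVM on this view only.
        model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
        model.fit(view[idx_train], labels[idx_train])
        accuracies.append(model.score(view[idx_test], labels[idx_test]))
    return accuracies
```

Comparing the returned accuracies gives a rough sense of which views carry the most class information on their own. Because the split is stratified, every class needs at least two samples.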
📢 More figures can be found in the label_distribution folder.
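For a quick look at class balance without plotting, per-class sample counts can be computed from a flat label sequence as in the sketch below. This is not the notebook's code. The helper names are hypothetical, and the max/min ratio is just one simple way to quantify imbalance, not necessarily the criterion behind the "imbalance" tag in the table.

```python
from collections import Counter


def label_distribution(labels):
    """Return {class_label: sample_count}, sorted by class label."""
    counts = Counter(int(label) for label in labels)
    return dict(sorted(counts.items()))


def imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest (1.0 means balanced)."""
    counts = label_distribution(labels).values()
    return max(counts) / min(counts)


if __name__ == "__main__":
    # Toy labels starting from 1, as in the gnd field of the .mat files.
    toy_labels = [1, 1, 1, 1, 2, 2, 3]
    for cls, n in label_distribution(toy_labels).items():
        print(f"class {cls}: {'#' * n} ({n})")
    print("imbalance ratio:", imbalance_ratio(toy_labels))
```

If the labels come from a `(sample_number, 1)` array, flatten them to a one-dimensional sequence before passing them in.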
Some datasets were downloaded from these sites, for which we are very grateful:
[1] https://github.com/liujiyuan13/mvdata
[2] https://github.com/wangsiwei2010/large_scale_multi-view_clustering_datasets