This repository contains the official Jittor implementation of the paper "Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation".
The paper is available here.
Note: the pre-trained CLIP-ViT-B-16 models can be found there.
Dataset | Setting | pAcc (%) | mIoU(S) (%) | mIoU(U) (%) | hIoU (%) | Model Zoo |
---|---|---|---|---|---|---|
PASCAL VOC 2012 | Inductive | 95.8 | 92.8 | 84.4 | 88.4 | [Drive] |
PASCAL VOC 2012 | Transductive | 97.0 | 93.9 | 92.2 | 93.0 | [Drive] |
PASCAL VOC 2012 | Fully | 97.1 | 94.1 | 93.4 | 93.7 | [Drive] |
COCO Stuff 164K | Inductive | 63.1 | 40.9 | 41.6 | 41.2 | [Drive] |
COCO Stuff 164K | Transductive | 69.9 | 42.0 | 60.8 | 49.7 | [Drive] |
COCO Stuff 164K | Fully | 70.8 | 42.9 | 64.1 | 51.4 | [Drive] |
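
The hIoU column is consistent with the harmonic mean of mIoU(S) (seen classes) and mIoU(U) (unseen classes), which is the usual definition in zero-shot segmentation. The snippet below is a minimal sketch for checking a row of the table under that assumption; the function name `harmonic_iou` is illustrative and is not part of this repository.

```python
def harmonic_iou(miou_seen: float, miou_unseen: float) -> float:
    """Harmonic mean of seen and unseen mIoU (assumed definition of hIoU)."""
    total = miou_seen + miou_unseen
    if total == 0:
        # Avoid division by zero when both scores are zero.
        return 0.0
    return 2.0 * miou_seen * miou_unseen / total


if __name__ == "__main__":
    # PASCAL VOC 2012, inductive setting: mIoU(S) = 92.8, mIoU(U) = 84.4
    print(f"{harmonic_iou(92.8, 84.4):.1f}")  # prints 88.4
```

Because the table reports rounded values, a recomputed hIoU can differ from the listed one in the last digit.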