[Feature] Support TIENet, UOD-AIR, and EDFFNet #31

Merged: 5 commits, Oct 28, 2023

60 changes: 11 additions & 49 deletions configs/detection/duo_dataset/README.md
@@ -12,57 +12,19 @@ Underwater object detection for robot picking has attracted a lot of interest. H
<img src="https://user-images.githubusercontent.com/48282753/233964524-73b49b46-03c2-48ba-9786-697c9d2c081a.png" height="400"/>
</div>

**Note:** DUO contains URPC2020, and the two datasets share the same categories. The DUO paper introduces URPC2020 alongside other underwater object detection datasets.

**TODO:**

- [ ] Support DUO Dataset and release models.
- [ ] Unify dataset names in `LQIT`

## Results and Models

### URPC2020

| Architecture | Backbone | Style | Lr schd | box AP | Config | Download |
| :-----------: | :---------: | :-----: | :-----: | :----: | :----------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| Faster R-CNN | R-50 | pytorch | 1x | 43.5 | [config](./faster-rcnn_r50_fpn_1x_urpc-coco.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/faster-rcnn_r50_fpn_1x_urpc-coco_20220226_105840-09ef8403.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/faster-rcnn_r50_fpn_1x_urpc-coco_20220226_105840.log.json) |
| Faster R-CNN | R-101 | pytorch | 1x | 44.8 | [config](./faster-rcnn_r101_fpn_1x_urpc-coco.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/faster-rcnn_r101_fpn_1x_urpc-coco_20220227_182523-de4a666c.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/faster-rcnn_r101_fpn_1x_urpc-coco_20220227_182523.log.json) |
| Faster R-CNN | X-101-32x4d | pytorch | 1x | 44.6 | [config](./faster-rcnn_x101-32x4d_fpn_1x_urpc-coco.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/faster-rcnn_x101-32x4d_fpn_1x_urpc-coco_20230511_190905-7074a9f7.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/faster-rcnn_x101-32x4d_fpn_1x_urpc-coco_20230511_190905.log.json) |
| Faster R-CNN | X-101-64x4d | pytorch | 1x | 45.3 | [config](./faster-rcnn_x101-64x4d_fpn_1x_urpc-coco.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/faster-rcnn_x101-64x4d_fpn_1x_urpc-coco_20220405_193758-5d2a37e4.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/faster-rcnn_x101-64x4d_fpn_1x_urpc-coco_20220405_193758.log.json) |
| Cascade R-CNN | R-50 | pytorch | 1x | 44.3 | [config](./cascade-rcnn_r50_fpn_1x_urpc-coco.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/cascade-rcnn_r50_fpn_1x_urpc-coco_20220405_160342-044e6858.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/cascade-rcnn_r50_fpn_1x_urpc-coco_20220405_160342.log.json) |
| RetinaNet | R-50 | pytorch | 1x | 40.7 | [config](./retinanet_r50_fpn_1x_urpc-coco.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/retinanet_r50_fpn_1x_urpc-coco_20220405_214951-a39f054e.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/retinanet_r50_fpn_1x_urpc-coco_20220405_214951.log.json) |
| FCOS | R-50 | caffe | 1x | 41.4 | [config](./fcos_r50-caffe_fpn_gn-head_1x_urpc-coco.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/fcos_r50-caffe_fpn_gn-head_1x_urpc-coco_20220227_204555-305ab6aa.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/fcos_r50-caffe_fpn_gn-head_1x_urpc-coco_20220227_204555.log.json) |
| ATSS | R-50 | pytorch | 1x | 44.8 | [config](./atss_r50_fpn_1x_urpc-coco.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/atss_r50_fpn_1x_urpc-coco_20220405_160345-cf776917.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/atss_r50_fpn_1x_urpc-coco_20220405_160345.log.json) |
| TOOD | R-50 | pytorch | 1x | 45.4 | [config](./tood_r50_fpn_1x_urpc-coco.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/tood_r50_fpn_1x_urpc-coco_20220405_164450-1fbf815b.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/tood_r50_fpn_1x_urpc-coco_20220405_164450.log.json) |
| SSD300 | VGG16 | - | 120e | 35.1 | [config](./ssd300_120e_urpc-coco.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/ssd300_120e_urpc-coco_20230426_122625-b6f0b01e.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/ssd512_120e_urpc-coco_20220405_185511.log.json) |
| SSD512        | VGG16       | -       | 120e    | 38.6   | [config](./ssd512_120e_urpc-coco.py)                    | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/ssd512_120e_urpc-coco_20220405_185511-88c18764.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc1/ssd512_120e_urpc-coco_20220405_185511.log.json)                                         |
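
For reference, any of the released checkpoints can be loaded through MMDetection's Python API. The snippet below is a minimal inference sketch, not part of the repository: it assumes an MMDetection 3.x environment, that importing `lqit` registers the custom modules, and that the config and checkpoint paths point at downloaded files.

```python
# Minimal inference sketch (assumptions: importing lqit registers its custom
# modules with the MMEngine registries; file and image paths are illustrative).
import lqit  # noqa: F401  (assumed import for registry side effects)
from mmdet.apis import inference_detector, init_detector

config = 'configs/detection/duo_dataset/faster-rcnn_r50_fpn_1x_urpc-coco.py'
checkpoint = 'faster-rcnn_r50_fpn_1x_urpc-coco_20220226_105840-09ef8403.pth'

model = init_detector(config, checkpoint, device='cpu')
result = inference_detector(model, 'demo.jpg')  # placeholder image path
print(result.pred_instances.scores)  # detection confidences (mmdet 3.x API)
```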

### DUO
## Results

Coming soon

## Citation

- If you use `URPC2020` or another `URPC`-series dataset in your research, please cite it as below:

**Note:** The URL may no longer be valid, but it is the link cited by many papers.

```latex
@online{urpc,
  title = {Underwater Robot Professional Contest},
  url = {http://uodac.pcl.ac.cn/},
}
```

- If you use `DUO` dataset in your research, please cite it as below:

```latex
@inproceedings{liu2021dataset,
  title={A dataset and benchmark of underwater object detection for robot picking},
  author={Liu, Chongwei and Li, Haojie and Wang, Shuchang and Zhu, Ming and Wang, Dong and Fan, Xin and Wang, Zhihui},
  booktitle={2021 IEEE International Conference on Multimedia \& Expo Workshops (ICMEW)},
  pages={1--6},
  year={2021},
  organization={IEEE}
}
```
39 changes: 39 additions & 0 deletions configs/detection/edffnet/README.md
@@ -0,0 +1,39 @@
# Edge-Guided Dynamic Feature Fusion Network for Object Detection under Foggy Conditions

> [Edge-Guided Dynamic Feature Fusion Network for Object Detection under Foggy Conditions](https://link.springer.com/article/10.1007/s11760-022-02410-0)

<!-- [ALGORITHM] -->

## Abstract

Hazy images are often subject to blurring, low contrast and other visible quality degradation, making it challenging to solve object detection tasks. Most methods solve the domain shift problem by deep domain adaptive technology, ignoring the inaccurate object classification and localization caused by quality degradation. Different from common methods, we present an edge-guided dynamic feature fusion network (EDFFNet), which formulates the edge head as a guide to the localization task. Despite the edge head being straightforward, we demonstrate that it makes the model pay attention to the edge of object instances and improves the generalization and localization ability of the network. Considering the fuzzy details and the multi-scale problem of hazy images, we propose a dynamic fusion feature pyramid network (DF-FPN) to enhance the feature representation ability of the whole model. A unique advantage of DF-FPN is that the contribution to the fused feature map will dynamically adjust with the learning of the network. Extensive experiments verify that EDFFNet achieves 2.4% AP and 3.6% AP gains over the ATSS baseline on RTTS and Foggy Cityscapes, respectively.

<!-- [IMAGE] -->

<div align=center>
<img src="https://github.com/BIGWangYuDong/lqit/assets/48282753/82087e24-4ef6-40b4-ae95-a5893e293c1e"/>
</div>
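
The PR implements this as the `lqit.DFFPN` neck (see the configs below); the `shape_level` field there plausibly selects the reference scale. Below is a minimal, illustrative sketch of the dynamic-fusion idea only, with assumed class names and an assumed softmax weighting scheme, not the authors' implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicFusion(nn.Module):
    """Illustrative input-conditioned fusion of multi-scale features.

    Every level is resized to a reference resolution, per-pixel fusion
    weights are predicted from the concatenated features, and the levels
    are combined as a weighted sum. The weights change with the input,
    which is the 'dynamic' adjustment the abstract describes.
    """

    def __init__(self, channels: int, num_levels: int):
        super().__init__()
        # Predict one weight map per level from the concatenated features.
        self.weight_pred = nn.Conv2d(channels * num_levels, num_levels, 1)

    def forward(self, feats, ref_level=2):
        ref_shape = feats[ref_level].shape[-2:]
        # Bring all levels to the reference spatial size.
        aligned = [F.interpolate(f, size=ref_shape, mode='nearest') for f in feats]
        # Normalize the predicted weights across levels.
        weights = self.weight_pred(torch.cat(aligned, dim=1)).softmax(dim=1)
        return sum(a * weights[:, i:i + 1] for i, a in enumerate(aligned))


feats = [torch.randn(2, 256, s, s) for s in (64, 32, 16, 8)]
fused = DynamicFusion(channels=256, num_levels=4)(feats)
print(fused.shape)  # torch.Size([2, 256, 16, 16])
```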

## Results on RTTS

| Architecture | Neck | Lr schd | Edge Head | lr | box AP | Config | Download |
| :----------: | :---: | :-----: | :-------: | :--: | :----: | :------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| ATSS | FPN | 1x | - | 0.01 | 48.2 | [config](../rtts_dataset/atss_r50_fpn_1x_rtts-coco.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc2/atss_r50_fpn_1x_rtts-coco_20231023_210916-98b5356b.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc2/atss_r50_fpn_1x_rtts-coco_20231023_210916.log.json) |
| ATSS | FPN | 1x | - | 0.02 | 49.6 | [config](./atss_r50_fpn_1x_rtts-coco_lr002.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc2/atss_r50_fpn_1x_rtts-coco_lr002_20231028_104029-114517ae.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc2/atss_r50_fpn_1x_rtts-coco_lr002_20231028_104029.log.json) |
| ATSS | DFFPN | 1x | - | 0.02 | 50.3 | [config](./atss_r50_dffpn_1x_rtts-coco_lr002.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc2/atss_r50_dffpn_1x_rtts-coco_lr002_20231028_104638-8f22abd9.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc2/atss_r50_dffpn_1x_rtts-coco_lr002_20231028_104638.log.json) |
| ATSS | DFFPN | 1x | Y | 0.02 | 50.8 | [config](./edffnet_atss_r50_dffpn_1x_rtts-coco_lr002.py) | [model](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc2/edffnet_atss_r50_dffpn_1x_rtts-coco_lr002_20231028_111154-89311078.pth) \| [log](https://github.com/BIGWangYuDong/lqit/releases/download/v0.0.1rc2/edffnet_atss_r50_dffpn_1x_rtts-coco_lr002_20231028_111154.log.json) |
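
The rows form an ablation: raising the learning rate to 0.02 is worth +1.4 AP over the ATSS baseline, swapping FPN for DF-FPN adds +0.7, and the edge head a further +0.5. To see how the final EDFFNet config resolves after inheritance, MMEngine's `Config` API can be used; a small sketch, assuming it is run from the repository root:

```python
from mmengine.config import Config

# Load the top-level config; _base_ files are merged automatically.
cfg = Config.fromfile(
    'configs/detection/edffnet/edffnet_atss_r50_dffpn_1x_rtts-coco_lr002.py')
print(cfg.model.type)                  # 'lqit.EDFFNet'
print(cfg.model.edge_head.type)        # 'EdgeHead' (lqit scope)
print(cfg.optim_wrapper.optimizer.lr)  # 0.02, inherited from the lr002 base
```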

## Citation

```latex
@article{he2023edge,
  title={Edge-guided dynamic feature fusion network for object detection under foggy conditions},
  author={He, Wanru and Guo, Jichang and Wang, Yudong and Zheng, Sida},
  journal={Signal, Image and Video Processing},
  volume={17},
  number={5},
  pages={1975--1983},
  year={2023},
  publisher={Springer}
}
```
12 changes: 12 additions & 0 deletions configs/detection/edffnet/atss_r50_dffpn_1x_rtts-coco_lr002.py
@@ -0,0 +1,12 @@
_base_ = ['./atss_r50_fpn_1x_rtts-coco_lr002.py']

# model settings: swap the stock FPN for LQIT's dynamic fusion FPN (DF-FPN)
model = dict(
    neck=dict(
        type='lqit.DFFPN',
        # channel widths of the ResNet-50 C2-C5 outputs
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        add_extra_convs='on_input',
        shape_level=2,
        num_outs=5))
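
The `lqit.` prefix in `type='lqit.DFFPN'` tells MMEngine's registry to resolve the module in LQIT's own scope rather than MMDetection's, which is what lets DFFPN drop into an otherwise stock ATSS detector.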
@@ -67,5 +67,19 @@
        nms=dict(type='nms', iou_threshold=0.6),
        max_per_img=100))

# optimizer: base LR doubled to 0.02 (the stock 1x baseline uses 0.01)
optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001))

param_scheduler = [
    # linear warmup over the first 1000 iterations
    dict(
        type='LinearLR', start_factor=0.001, by_epoch=False, begin=0,
        end=1000),
    # standard 1x schedule: 10x LR decay after epochs 8 and 11
    dict(
        type='MultiStepLR',
        begin=0,
        end=12,
        by_epoch=True,
        milestones=[8, 11],
        gamma=0.1)
]
@@ -1,17 +1,10 @@
-_base_ = '../edffnet/atss_r50_fpn_1x_2xb8_rtts.py'
+_base_ = ['./atss_r50_dffpn_1x_rtts-coco_lr002.py']
 
 model = dict(
+    _delete_=True,
     type='lqit.EDFFNet',
-    backbone=dict(norm_eval=True),
-    neck=dict(
-        type='lqit.DFFPN',
-        in_channels=[256, 512, 1024, 2048],
-        out_channels=256,
-        start_level=1,
-        add_extra_convs='on_input',
-        shape_level=2,
-        num_outs=5),
-    enhance_head=dict(
+    detector={{_base_.model}},
+    edge_head=dict(
         _scope_='lqit',
         type='EdgeHead',
         in_channels=256,
@@ -23,7 +16,8 @@
             mean=[128],
             std=[57.12],
             pad_size_divisor=32,
-            element_name='edge')))
+            element_name='edge')),
+    vis_enhance=False)
 
 # dataset settings
 train_pipeline = [
@@ -41,19 +35,3 @@
     dict(type='lqit.PackInputs', )
 ]
 train_dataloader = dict(dataset=dict(pipeline=train_pipeline))
-
-optim_wrapper = dict(
-    optimizer=dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001))
-
-param_scheduler = [
-    dict(
-        type='LinearLR', start_factor=0.001, by_epoch=False, begin=0,
-        end=1000),
-    dict(
-        type='MultiStepLR',
-        begin=0,
-        end=12,
-        by_epoch=True,
-        milestones=[8, 11],
-        gamma=0.1)
-]