GitHub - zuixiaosanlang/Yolact_minimal: Minimal PyTorch implementation of YOLACT.

Yolact_minimal

Minimal PyTorch implementation of Yolact:《YOLACT: Real-time Instance Segmentation》.
The original project is here.

This implementation simplified the original code, preserved the main function and made the network easy to understand.
This implementation has not been updated to Yolact++.

The network structure.

Environments

PyTorch >= 1.1
Python >= 3.6
onnx
onnxruntime-gpu ==1.6.0 for CUDA 10.2
tensooardX
Other common packages.

Prepare

Build cython-nms

python setup.py build_ext --inplace

Download COCO 2017 datasets, modify self.data_root in 'res101_coco' in config.py.
Download weights.

Yolact trained weights.

Backbone	box mAP	mask mAP	Google Drive	Baidu Cloud
Resnet50	31.5	29.3	best_29.3_res50_coco_400001.pth	password: 2v8s
Resnet101	32.9	30.5	best_30.5_res101_coco_392000.pth	password: 0jq2

ImageNet pre-trained weights.

Backbone	Google Drive	Baidu Cloud
Resnet50	resnet50-19c8e357.pth	password: a6ee
Resnet101	resnet101_reducedfc.pth	password: kdht

Improvement log

2021.1.7. Focal loss did not help, tried conf_alpha 4, 6, 7, 8.
2021.1.7. Less training iterations, 800k --> 680k with batch size 8.
2020.11.2. Improved data augmentation, use rectangle anchors, training is stable, infinite loss no longer appears.
2020.11.2. DDP training, train batch size increased to 16, +0.4 box mAP, +0.7 mask mAP (resnet101).

Train

# Train with resnet101 backbone on one GPU with a batch size of 8 (default).
python -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --train_bs=8
# Train on multiple GPUs (i.e. two GPUs, 8 images per GPU).
export CUDA_VISIBLE_DEVICES=0,1  # Select the GPU to use.
python -m torch.distributed.launch --nproc_per_node=2 --master_port=$((RANDOM)) train.py --train_bs=16
# Train with other configurations (res101_coco, res50_coco, res50_pascal, res101_custom, res50_custom, in total).
python -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --cfg=res50_coco
# Train with different batch_size (batch size should not be smaller than 4).
python -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --train_bs=4
# Train with different image size (anchor settings related to image size will be adjusted automatically).
python -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --img_size=400
# Resume training with a specified model.
python -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --resume=weights/latest_res101_coco_35000.pth
# Set evalution interval during training, set -1 to disable it.  
python -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --val_interval 8000
# Train on CPU.
python train.py --train_bs=4

Use tensorboard

tensorboard --logdir=tensorboard_log/res101_coco

Evalution

# Select the GPU to use.
export CUDA_VISIBLE_DEVICES=0

# Evaluate on COCO val2017 (configuration will be parsed according to the model name).
# The metric API in this project can not get the exact COCO mAP, but the evaluation speed is fast. 
python eval.py --weight=weights/best_30.5_res101_coco_392000.pth
# To get the exact COCO mAP:
python eval.py --weight=weights/best_30.5_res101_coco_392000.pth --coco_api
# Evaluate with a specified number of images.
python eval.py --weight=weights/best_30.5_res101_coco_392000.pth --val_num=1000
# Evaluate with traditional nms.
python eval.py --weight=weights/best_30.5_res101_coco_392000.pth --traditional_nms

Detect

detect result

# Select the GPU to use.
export CUDA_VISIBLE_DEVICES=0

# To detect images, pass the path of the image folder, detected images will be saved in `results/images`.
python detect.py --weight=weights/best_30.5_res101_coco_392000.pth --image=images

cutout object

# Use --cutout to cut out detected objects.
python detect.py --weight=weights/best_30.5_res101_coco_392000.pth --image=images --cutout

# To detect videos, pass the path of video, detected video will be saved in `results/videos`:
python detect.py --weight=weights/best_30.5_res101_coco_392000.pth --video=videos/1.mp4
# Use --real_time to detect real-timely.
python detect.py --weight=weights/best_30.5_res101_coco_392000.pth --video=videos/1.mp4 --real_time

linear combination result

# Use --hide_mask, --hide_score, --save_lincomb, --no_crop and so on to get different results.
python detect.py --weight=weights/best_30.5_res101_coco_392000.pth --image=images --save_lincomb

Transport to ONNX.

python export2onnx.py --weight='weights/best_30.5_res101_coco_392000.pth' --opset=12
# Detect with ONNX file, all the options are the same as those in `detect.py`.
python detect_with_onnx.py --weight='onnx_files/res101_coco.onnx' --image=images.

Train on PASCAL_SBD datasets

Download PASCAL_SBD datasets from here, modify the path of the img folder in data/config.py.
Then, generate a coco-style json.

python utils/pascal2coco.py --folder_path=/home/feiyu/Data/pascal_sbd

Download the Yolact trained weights. Google dirve, Baidu Cloud: eg7b

# Training.
python -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --cfg=res50_pascal
# Evalution.
python eval.py --weight=weights/res50_pascal_120000.pth

Train custom datasets

Install labelme

pip install labelme

Use labelme to label your images, only ploygons are needed. The created json files are in the same folder with the images.
Prepare a 'labels.txt' like this, this first line: 'background' is always needed.
Prepare coco-style json, pass the paths of your image folder and the labels.txt. The 'custom_dataset' folder is a prepared example.

python utils/labelme2coco.py --img_dir=custom_dataset --label_name=cuatom_dataset/labels.txt

Edit CUSTOM_CLASSES in config.py.

Note that if there's only one class, the CUSTOM_CLASSES should be like ('dog', ). The final comma is necessary to make it as a tuple, or the number of classes would be len('dog').
Choose a configuration ('res101_custom' or 'res50_custom') in config.py, modify the corresponding self.train_imgs and self.train_ann. If you need to validate, prepare the validation dataset by the same way.
Then train.

python -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM)) train.py --cfg=res101_custom

Some parameters need to be taken care of by yourself:

Training batch size, try not to use batch size smaller than 4.
Anchor size, the anchor size should match with the object scale of your dataset.
Total training steps, learning rate decay steps and the warm up step, these should be decided according to the dataset size, overwrite self.lr_steps, self.warmup_until in your configuration.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Yolact_minimal

The network structure.

Environments

Prepare

Improvement log

Train

Use tensorboard

Evalution

Detect

Transport to ONNX.

Train on PASCAL_SBD datasets

Train custom datasets

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 99 Commits
custom_dataset		custom_dataset
modules		modules
readme_imgs		readme_imgs
utils		utils
.gitignore		.gitignore
README.md		README.md
config.py		config.py
cython_nms.pyx		cython_nms.pyx
detect.py		detect.py
detect_with_onnx.py		detect_with_onnx.py
eval.py		eval.py
export2onnx.py		export2onnx.py
setup.py		setup.py
train.py		train.py

zuixiaosanlang/Yolact_minimal

Folders and files

Latest commit

History

Repository files navigation

Yolact_minimal

The network structure.

Environments

Prepare

Improvement log

Train

Use tensorboard

Evalution

Detect

Transport to ONNX.

Train on PASCAL_SBD datasets

Train custom datasets

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages