
Add fuse layers for conv+affine+relu and conv+relu #2842

Merged

Conversation

facug91 (Contributor) commented Jul 28, 2023

This PR extends the fuse_layers method to also fuse convolutional layers followed by relu, and convolutional layers followed by affine followed by relu.
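The conv+affine part of this fusion amounts to a pure weight rewrite: for a per-channel affine y = g·x + b applied after a convolution with weights W and biases c, the fused convolution has weights g·W and biases g·c + b. The relu part cannot be folded into weights, so it has to be handled by the convolution implementation itself (e.g. via cuDNN's fused conv+bias+activation path, cudnnConvolutionBiasActivationForward). A minimal sketch of the affine folding, with hypothetical names rather than the actual dlib code:

```cpp
#include <vector>
#include <cstddef>

// Hypothetical flattened conv parameters, for illustration only.
struct conv_params
{
    std::vector<float> weights;   // num_filters * weights_per_filter
    std::vector<float> biases;    // one bias per filter
    std::size_t weights_per_filter;
};

// Fold a per-channel affine layer (y = gamma*x + beta) into the conv,
// so conv followed by affine becomes a single conv with new parameters.
void fold_affine_into_conv(conv_params& conv,
                           const std::vector<float>& gamma,
                           const std::vector<float>& beta)
{
    for (std::size_t k = 0; k < conv.biases.size(); ++k)
    {
        for (std::size_t i = 0; i < conv.weights_per_filter; ++i)
            conv.weights[k * conv.weights_per_filter + i] *= gamma[k];
        conv.biases[k] = gamma[k] * conv.biases[k] + beta[k];
    }
}
```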

I tested it with examples/dnn_mmod_face_detection_ex.cpp and got exactly the same results on the images shown.
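For anyone who wants to try this, the fusion is applied with a single visitor call. A minimal usage sketch (the network type and model file here are illustrative, not from the PR):

```cpp
#include <dlib/dnn.h>

int main()
{
    using namespace dlib;
    // A toy inference network containing a con -> affine -> relu chain,
    // which the extended fuse_layers can collapse into one convolution.
    using net_type = loss_multiclass_log<fc<10,
        relu<affine<con<16,3,3,1,1, input<matrix<float>>>>>>>;

    net_type net;
    deserialize("model.dat") >> net;  // hypothetical pre-trained model

    // Rewrites the network in place; fused chains then run as one layer.
    fuse_layers(net);
}
```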

Here are also some tests for different neural networks from https://github.com/dlibml/dnn:

| Name | master (w/o fusion) | master (w/ fusion) | this PR (w/o fusion) | this PR (w/ fusion) | comparative1 | comparative2 |
|------|--------------------:|-------------------:|---------------------:|--------------------:|-------------:|-------------:|
| alexnet | 9.53 | 9.63 | 9.55 | 9.45 | 1.84% | 1.10% |
| sqznet1.0 | 6.53 | 6.00 | 6.40 | 4.47 | 25.60% | 30.25% |
| sqznet1.1 | 4.41 | 3.90 | 4.41 | 3.01 | 22.86% | 31.76% |
| vggnet11 | 37.00 | 34.46 | 36.76 | 32.77 | 4.91% | 10.87% |
| vggnet13 | 46.94 | 43.90 | 47.31 | 41.06 | 6.47% | 13.21% |
| vggnet16 | 58.80 | 54.87 | 58.01 | 52.29 | 4.71% | 9.86% |
| vggnet19 | 68.32 | 65.11 | 69.29 | 61.89 | 4.95% | 10.69% |
| googlenet | 9.95 | 9.54 | 10.09 | 8.53 | 10.61% | 15.50% |
| resnet18 | 7.66 | 7.39 | 7.68 | 6.86 | 7.23% | 10.64% |
| resnet34 | 13.90 | 13.67 | 13.90 | 12.77 | 6.57% | 8.10% |
| resnet50 | 22.41 | 20.11 | 22.41 | 18.89 | 6.07% | 15.71% |
| resnet101 | 39.15 | 35.72 | 38.39 | 33.73 | 5.56% | 12.13% |
| resnet152 | 54.98 | 50.76 | 55.03 | 48.44 | 4.57% | 11.98% |
| darknet19 | 13.55 | 12.43 | 13.54 | 12.53 | -0.79% | 7.51% |
| darknet53 | 32.70 | 30.87 | 32.36 | 30.63 | 0.80% | 5.36% |
| darknet53csp | 26.47 | 23.36 | 26.60 | 23.48 | -0.53% | 11.73% |
| densenet121 | 21.57 | 20.33 | 21.57 | 20.12 | 1.03% | 6.71% |
| densenet169 | 28.50 | 27.13 | 28.43 | 25.66 | 5.43% | 9.76% |
| densenet201 | 38.60 | 37.17 | 38.52 | 35.65 | 4.10% | 7.45% |
| densenet265 | 55.85 | 54.43 | 55.81 | 52.72 | 3.13% | 5.53% |
| densenet161 | 49.80 | 47.90 | 49.71 | 45.74 | 4.52% | 7.99% |
| vovnet19s | 6.84 | 6.39 | 6.85 | 5.05 | 20.88% | 26.23% |
| vovnet19 | 14.25 | 13.85 | 14.18 | 11.57 | 16.47% | 18.41% |
| vovnet27s | 8.31 | 7.84 | 8.31 | 6.25 | 20.31% | 24.81% |
| vovnet27 | 18.32 | 17.67 | 18.32 | 15.28 | 13.54% | 16.61% |
| vovnet39 | 23.69 | 23.30 | 23.53 | 20.21 | 13.23% | 14.11% |
| vovnet57 | 31.00 | 30.81 | 31.02 | 27.41 | 11.03% | 11.63% |
| vovnet99 | 56.88 | 56.44 | 56.31 | 51.45 | 8.84% | 8.64% |
| repvgg_a0 | 6.47 | 5.67 | 6.38 | 4.96 | 12.52% | 22.22% |
| repvgg_a1 | 8.01 | 8.06 | 8.04 | 7.23 | 10.30% | 10.14% |
| repvgg_a2 | 19.50 | 19.46 | 19.35 | 18.01 | 7.49% | 6.95% |
| repvgg_b0 | 10.16 | 10.17 | 10.04 | 8.88 | 12.61% | 11.54% |
| repvgg_b1 | 42.13 | 42.18 | 42.12 | 39.34 | 6.72% | 6.60% |
| repvgg_b2 | 59.09 | 59.01 | 58.67 | 55.87 | 5.32% | 4.78% |
| repvgg_b3 | 82.28 | 82.40 | 82.66 | 79.12 | 3.97% | 4.28% |

- comparative1: measures how much faster the model runs with the new fuse_layers compared to the current fuse_layers.
- comparative2: measures how much faster the model runs with the new fuse_layers compared to not using fuse_layers.
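Reconstructing the formulas from the numbers above (they are not stated explicitly):

```
comparative1 = (master w/ fusion  - this PR w/ fusion) / (master w/ fusion)
comparative2 = (this PR w/o fusion - this PR w/ fusion) / (this PR w/o fusion)
```

For example, vggnet11 gives (34.46 − 32.77) / 34.46 ≈ 4.9% and (36.76 − 32.77) / 36.76 ≈ 10.9%, matching the table up to rounding of the displayed timings.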

I left in the measurements taken before fusion to be sure that the different runs used the same convolutional algorithms in cuDNN. Whenever they differed by more than 2% in either direction, I ran those tests again.

facug91 marked this pull request as draft July 28, 2023 19:37
facug91 (Contributor, Author) commented Jul 28, 2023

test_fuse_layers is failing, so I must have done something wrong. I'll review this before taking it out of draft. Maybe @arrufat can help me out here, please? This is the PR I was talking about before 😅

arrufat (Contributor) commented Jul 28, 2023

I'm traveling these days, so it might take some time to have a look. I'll try during this week, though.

I also thought about fusing the relu layer, so it's nice to see it here.

facug91 (Contributor, Author) commented Jul 28, 2023

I thought test_fuse_layers was the test that was failing, but no, it was something else. When I looked up the line in the test file, it was a test of a normal convolution. I had forgotten to update the copy constructor and the assignment operator, and that's why it was failing. The tests on the CPU pass correctly now.
Anyway, I noticed that the tests on the GPU are not passing. But that is not a problem with this PR; they are not passing on master either.
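To illustrate the kind of bug described above (names are hypothetical, not the actual PR code): when a layer gains new state, hand-written copy operations must be updated too, or copies silently drop it.

```cpp
#include <vector>

// Hypothetical layer object that gained a fusion flag.
class con_
{
public:
    con_() = default;

    con_(const con_& item)
        : weights(item.weights),
          relu_fused(item.relu_fused)  // easy to miss when the member is new
    {}

    con_& operator=(const con_& item)
    {
        weights = item.weights;
        relu_fused = item.relu_fused;  // likewise here
        return *this;
    }

private:
    std::vector<float> weights;
    bool relu_fused = false;  // state added by the fusion change
};
```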

facug91 marked this pull request as ready for review July 28, 2023 23:51
facug91 (Contributor, Author) commented Jul 28, 2023

One other thing: while reviewing this problem, I was looking for the documentation of the disable_duplicative_biases function and couldn't find it. It turns out it had been left in the layers_abstract.h file, so I took the opportunity to make a commit moving it to visitors_abstract.h right here.
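For reference, a minimal sketch of how that visitor is used (network type illustrative): disable_duplicative_biases removes the bias from layers whose output feeds a normalization layer, since the normalization's learned shift makes it redundant.

```cpp
#include <dlib/dnn.h>

int main()
{
    using namespace dlib;
    // The conv's bias would be redundant with the shift bn_con learns.
    using net_type = loss_multiclass_log<fc<10,
        relu<bn_con<con<16,3,3,1,1, input<matrix<float>>>>>>>;

    net_type net;
    // Turn off the redundant bias terms before training.
    disable_duplicative_biases(net);
}
```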

davisking (Owner) commented:

> disable_duplicative_biases

Oh yeah, thanks. That fixes the links in the main docs too :)

davisking (Owner) commented:
Huh yeah, the GPU version of layer_norm_ is failing since it's giving out bad derivatives on master. We need a CI test that runs with a GPU :|

Anyway, yeah, looks like your PR is good :)

davisking merged commit be2fa7f into davisking:master Aug 5, 2023
9 of 10 checks passed
arrufat (Contributor) commented Aug 5, 2023

> Huh yeah, the GPU version of layer_norm_ is failing since it's giving out bad derivatives on master. We need a CI test that runs with a GPU :|
>
> Anyway, yeah, looks like your PR is good :)

Oh, I need to check what's going on there

facug91 deleted the fuse-conv-relu-and-conv-affine-relu branch August 5, 2023 20:10