Loss almost does not change #7

Open
Oguzhanercan opened this issue Sep 13, 2024 · 8 comments

Comments

@Oguzhanercan

Oguzhanercan commented Sep 13, 2024

Hi, I am trying out different losses. I have implemented a face similarity loss and disabled all of the other losses, but the loss almost does not change (it decreased by about 1%). In my method, I look at the cosine similarity between the reference image and the generated image, like RectifID (https://github.com/feifeiobama/RectifID). Do you have any experiments with that, or do you have any suggestions about it?

Face recognition model: ArcFace ResNet200
Loss: (1 - cosine similarity)
Lr: 1 to 30 (I tried different settings for this)
SD model: PixArt
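A minimal sketch of this kind of loss, assuming a differentiable PyTorch `face_encoder` (an ArcFace-style embedding network; all names here are placeholders, not code from this repo):

```python
import torch.nn.functional as F

def face_similarity_loss(generated, reference, face_encoder):
    # generated, reference: image tensors of shape (B, 3, H, W)
    # face_encoder must be a differentiable torch module so that gradients
    # can flow from the loss back into the generated image.
    emb_gen = face_encoder(generated)
    emb_ref = face_encoder(reference).detach()  # the reference embedding is a fixed target
    cos = F.cosine_similarity(emb_gen, emb_ref, dim=-1)
    return (1.0 - cos).mean()
```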

@sgk98
Collaborator

sgk98 commented Sep 13, 2024

Hi, while we had discussed similar experiments, I don't think we have ever tried this out, so I am quite curious to see what you get with this.

Coming to the specific details, I'd suggest you use either SD-Turbo or SDXL-Turbo, since it's reasonably fast at 512 resolution (vs HyperSDXL, which does generation at 1024) and generates images of much better quality than PixArt.
I am not sure if you removed the norm regularization loss, but you could first try to remove that to allow for more aggressive optimization of the objective (even if there is some amount of artifacts that could be introduced). You could also remove the gradient clipping to increase the effect of the optimization even further.
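For illustration, a hedged sketch of what that looks like in a generic noise-optimization loop; `one_step_generator`, `reference`, `face_encoder`, and the loss above are assumed placeholders, not the repo's actual code:

```python
import torch

noise = torch.randn(1, 4, 64, 64, device="cuda", requires_grad=True)
optimizer = torch.optim.Adam([noise], lr=5.0)

for step in range(50):
    image = one_step_generator(noise)  # single denoising step + VAE decode
    loss = face_similarity_loss(image, reference, face_encoder)
    # loss = loss + reg_weight * noise.pow(2).mean()         # norm regularization, removed here
    optimizer.zero_grad()
    loss.backward()
    # torch.nn.utils.clip_grad_norm_([noise], max_norm=0.1)  # gradient clipping, removed here
    optimizer.step()
```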

But apart from this, I would be very surprised if the loss doesn't go down at all (especially if you increase the number of iterations even further). From our experience with optimizing different objectives, the loss inevitably goes down quite easily, although in some cases this may not correspond to the changes/improvements that you desire.

@Oguzhanercan
Author

Then I will try your suggestions and see if they work, and I will share new experimental results on this. The reason I used PixArt is that I could not run the code with the other models, but I will upgrade diffusers from 0.30 to a newer version and see if that works, following your suggestion in the other issue.

@sgk98
Collaborator

sgk98 commented Sep 14, 2024

Sure, I think most recent versions of diffusers (even 0.24 worked a few months ago) should work smoothly. If you are interested, I can provide the pip freeze output so you can check whether any other library is causing the issue with the U-Nets.

@Oguzhanercan
Author

I have tried many things to decrease the loss, but it is not changing, not even by a fraction. Is it possible that this is caused by the memsave argument?

@sgk98
Collaborator

sgk98 commented Sep 17, 2024

Sure, you can remove the memsave argument (and its application in the model directly, if you prefer). There could be issues it's causing (especially if you're also applying it to the feature extractor for the reference image).

I would also suggest switching to SD/SDXL-Turbo for much better optimization (even the results in the paper show that PixArt-alpha-DMD is a much worse one-step model). With SD-Turbo, we were even able to optimize a segmentation loss; although the resulting images were not perfectly aligned, the segmentation loss did go down a lot.
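For reference, loading SD-Turbo for one-step generation with diffusers looks roughly like this (standard diffusers usage, not this repo's pipeline):

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sd-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# One-step 512x512 generation: a single inference step and no classifier-free guidance.
image = pipe("a portrait photo of a person",
             num_inference_steps=1, guidance_scale=0.0).images[0]
```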

Also, if you can share your code (maybe in a fork of the repo), I can also have a look at it and see if there's any other issue/if I'm able to make changes to fix this issue.

@Oguzhanercan
Author

I could not remove memsave while using PixArt and SDXL-Turbo because I only have 24 GB of GPU memory. When I tried SD-Turbo, both with and without memsave, the loss just wiggles, so memsave is not the cause of the problem. I checked the code again but could not find any problem; the code is on a remote machine, and when I get copy permission I will share it with you.

@Oguzhanercan
Author

I solved the problem: at some point I was converting the output image of the diffusion model to a NumPy array to get the face embedding (an ONNX model), and at that stage the gradient chain was broken.
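In other words, the fix is to keep the whole path in torch tensors. A hedged before/after sketch (names are placeholders):

```python
# Broken: converting the generated image to NumPy for an ONNX model detaches it
# from the autograd graph, so loss.backward() never reaches the latents/noise.
emb = onnx_session.run(None, {"input": image.detach().cpu().numpy()})[0]

# Working: keep everything as torch tensors and use a differentiable PyTorch encoder,
# so gradients flow from the cosine-similarity loss back to the generator input.
emb = face_encoder(image)
```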

@Oguzhanercan
Author

Another problem I faced: I was using a JIT model to get the face embeddings used for the loss calculation (cosine similarity), and this was causing the gradients to disappear.
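A quick sanity check for this kind of silent gradient break, assuming `face_encoder` is whichever embedding model is plugged in:

```python
img = generated_image.detach().clone().requires_grad_(True)
emb = face_encoder(img)
# If the embedding model internally detaches (e.g. it was traced/scripted under
# no_grad or returns NumPy), emb.grad_fn will be None and the loss cannot
# optimize the generated image at all.
print(emb.requires_grad, emb.grad_fn)
```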

@Oguzhanercan reopened this Sep 27, 2024