[Bug]: EfficientAD training on a custom dataset reports an error #1200
-
Describe the bug: I encountered an error while training my own dataset with EfficientAD. I only modified the dataset section of the configuration file provided by the official EfficientAD repository. With the same modifications I was able to train models like CFA and PatchCore successfully, but I hit an error specifically when training EfficientAD. The YAML file for my EfficientAD run is as follows (I only changed the
I have tentatively determined that the cause of my error is the parameter, when I set
Then I referred to the related answer in #1148, where @alexriedel1 explains what it should be set to; when I set
Dataset: Folder
Model: Other (please specify in the field below)

Steps to reproduce the behavior: EfficientAD training on my own dataset reports an error.

OS information: Anomalib 0.6.0

Expected behavior:
Hello @alexriedel1, @nelson1425, as the people most familiar with EfficientAD, can you answer the following questions?
1. What is the reason for this error in EfficientAD, and how should I fix it?
2. Is the performance of EfficientAD really as good as in the paper? In fact, I am more concerned about its speed. The paper mentions that EfficientAD-M reaches 269 FPS and EfficientAD-S reaches 614 FPS. Is that really achievable in real tests? If not, what FPS does your implementation reach for different image sizes? (Although I realize this may be affected and limited by the specific hardware.)
3. What are the advantages of EfficientAD over other models in Anomalib, and in which situations is it more suitable?
Looking forward to your answers, thanks!

Screenshots: No response
Pip/GitHub: pip
What version/branch did you use?: No response
Configuration YAML: -
Logs: -
Code of Conduct
Replies: 21 comments
-
Hello. I can't answer all the questions, but regarding the normalization: as the config says, only
-
Hello @blaz-r, thank you very much for your reply; I wasn't able to follow some of what you said. As I said at the beginning of the question, for other models such as PatchCore and CFA, when I make exactly the same changes to the dataset section of their YAML files (with exactly the same dataset I described in this question), all the other models train correctly and produce results. Why is there still a download problem when my dataset is already local? Also, the fact that the other models' config files train fine with the same dataset should be enough to show that there is no error in my file. Looking forward to another reply from you, thanks a lot!
-
It's not about your dataset, it's the imagenet(te) dataset downloaded by the EfficientAD model, as it uses ImageNet as part of its functionality. It should be located inside datasets/imagenette. So the first thing I would recommend is to just delete that directory and rerun, which will download the entire dataset again, potentially fixing the issue.
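If you prefer doing the reset from code, here is a minimal sketch, assuming training is launched from the repository root so the default download location is `datasets/imagenette`:

```python
from pathlib import Path
import shutil

# Default location of the Imagenette download used by EfficientAD
# (assumption: the script runs from the same directory as training).
imagenette_dir = Path("datasets") / "imagenette"

if imagenette_dir.exists():
    # Remove the possibly corrupted download; the next training run
    # will fetch the whole dataset again.
    shutil.rmtree(imagenette_dir)
    print(f"Removed {imagenette_dir}; rerun training to re-download.")
else:
    print("Nothing to delete; the dataset will be downloaded on the next run.")
```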
-
Hello @blaz-r, thank you very much, I'll give it a try and get back to you!
-
Hello @blaz-r, this does work, thank you very much for your help. I would also like to ask two more questions: 2. If I want to skip ImageNet, i.e. not use it at all, will my EfficientAD still work? Looking forward to another reply from you, thanks a lot!
-
If you want to know exactly how EfficientAD works, I recommend reading the paper. To cite the authors:
So ImageNet is a key component of training, which can't really be skipped.
-
Thank you very much @blaz-r
-
Glad to help 😄. Regarding the other questions you had, I can't answer all of that properly, but I'm sure the contributors of EfficientAD can help.
-
Looking forward to hearing from them, and thank you very much for your patience again! @blaz-r
-
Hello @blaz-r, I'm sorry to bother you again, but I have a new error. Once ImageNet was downloaded, training started, but not long after that it reported an error.
Why am I prompted to
-
This does indeed seem like a bug that was already addressed in one PR, but it seems it was only fixed in the Lightning model. I think this will need to be fixed the same way as it was done in the Lightning model. If you are able to fix this, a PR would be very welcome.
-
You could start by using a train and test batch size of 1, as recommended for training EfficientAD.
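In an Anomalib 0.x config this lives under the `dataset` section; a sketch of the relevant fragment (field names follow the 0.6 config layout and may differ slightly between versions):

```yaml
dataset:
  # EfficientAD is trained with batch size 1, as in the paper.
  train_batch_size: 1
  eval_batch_size: 1
```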
-
@alexriedel1, would it be an idea to hardcode the batch size and remove it from the config file for now?
-
Hello @alexriedel1, @blaz-r, thank you very much for your help and patience in answering; I trained successfully once I set the batch size to 1. But I have a question: if the maximum value is 2**24, then when I start training with batch_size set to 32 and image_size set to 500, logically
-
The best I can think of right now is raising an error if the batch size is different from one. Otherwise it would need to be implemented in the datamodule generator too, I guess.
-
The quantile calculation is not based on the input image but on feature maps from the teacher model. The tensor shape for 500x500 images and batch size 32 is [32, 384, 117, 117] -> 168,210,432 > 2**24.
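The arithmetic is easy to verify; `torch.quantile` refuses inputs larger than 2**24 elements (a known size limit), and the teacher feature maps cross that limit at batch size 32 but not at batch size 1:

```python
# Teacher feature-map shape quoted above: [batch, channels, h, w].
feature_elems = 32 * 384 * 117 * 117   # batch 32, 500x500 input images
quantile_limit = 2 ** 24               # torch.quantile's input-size cap

print(feature_elems)                    # 168210432, well above 2**24
print(feature_elems > quantile_limit)   # True -> quantile() errors out

# With batch size 1 the same feature map stays under the cap:
single = 1 * 384 * 117 * 117
print(single, single > quantile_limit)  # 5256576 False
```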
-
Hello @alexriedel1, thank you very much for your help and patience in answering. I see, but there doesn't seem to be an early-stopping mechanism in EfficientAD. If I set max_epochs to a very large value, EfficientAD may get good results, but will this cause overfitting? How should I set max_epochs reasonably when there is no early-stopping mechanism?
-
You can use early stopping just like in other models: anomalib/src/anomalib/models/cflow/config.yaml, lines 37 to 40 (at 27876c8).
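The referenced cflow fragment enables the Lightning early-stopping callback from the model section of the config; a comparable sketch for an EfficientAD config (the metric name and patience value here are illustrative assumptions, check your config schema for the exact fields):

```yaml
model:
  early_stopping:
    patience: 3          # epochs without improvement before stopping
    metric: pixel_AUROC  # illustrative; pick a metric you actually log
    mode: max
```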
-
@alexriedel1 OK! Thank you very much. At the very beginning of this question I presented my three points of confusion about EfficientAD:
I now have a clearer picture of the first question. Regarding my second and third questions, can you give the appropriate answers? I think you are one of the most knowledgeable people about EfficientAD. Very much looking forward to your answers, and thanks again for your patience and help!
-
I'm getting around 30 FPS on a GTX 1650, but that GPU is nowhere near the one used in the paper.
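For anyone wanting to reproduce such a number, a rough way to measure throughput; the `infer` callable below is a hypothetical stand-in for a single-image forward pass (on GPU you would additionally need to synchronize before reading the clock):

```python
import time

def measure_fps(infer, n_warmup=10, n_runs=100):
    """Estimate frames per second for a single-image inference callable."""
    for _ in range(n_warmup):        # warm-up: caches, lazy allocations
        infer()
    start = time.perf_counter()
    for _ in range(n_runs):
        infer()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed

# Dummy stand-in for a model forward pass:
fps = measure_fps(lambda: sum(range(1000)))
print(f"~{fps:.0f} FPS")
```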
-
@alexriedel1 OK, thank you very much.