However, while training SuperGlue with SIFT, I encounter an error.
Epoch 0: 3%|▉ | 313/10851 [05:52<3:18:00, 1.13s/it, loss=7.73, v_num=0s2i, Train NLL loss=8.400, Train Metric loss=0.000]
../aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [1,0,0], thread: [41,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [1,0,0], thread: [62,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Running the debugger, I get the following error:
CUDA error: device-side assert triggered
File "/media/ssd4TB/software/OpenGlue/models/matching_module.py", line 81, in training_step
lafs1, responses1, desc1 = self.local_features_extractor(batch['image1'])
File "/media/ssd4TB/software/OpenGlue/models/features/base.py", line 79, in forward
lafs, scores = self.run_nms(lafs, scores, image.size())
File "/media/ssd4TB/software/OpenGlue/models/features/base.py", line 49, in run_nms
mask[0, 0, kpts_[:, 1], kpts_[:, 0]] = scores_
RuntimeError: CUDA error: device-side assert triggered
During handling of the above exception, another exception occurred:
File "/media/ssd4TB/software/OpenGlue/train.py", line 86, in main
trainer.fit(model, datamodule=dm, ckpt_path=config.get('checkpoint'))
File "/media/ssd4TB/software/OpenGlue/train.py", line 90, in <module>
main()
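The failing line in the traceback writes `scores_` into `mask` at the keypoint coordinates, so one possible cause is a keypoint that falls outside the image. Below is a minimal sketch of a bounds check that could be run before that assignment. It assumes `kpts_` holds integer `(x, y)` pixel coordinates of shape `(N, 2)` and that `mask` has shape `(1, 1, H, W)`; the helper name `filter_in_bounds` and the toy values are hypothetical and not part of OpenGlue.

```python
import torch


def filter_in_bounds(kpts, scores, height, width):
    """Keep only keypoints whose integer (x, y) lies inside the image.

    Hypothetical helper for debugging; not part of OpenGlue.
    """
    x, y = kpts[:, 0], kpts[:, 1]
    # Negative indices are rejected too: PyTorch would wrap them around
    # silently instead of raising, writing the score to the wrong pixel.
    valid = (x >= 0) & (x < width) & (y >= 0) & (y < height)
    return kpts[valid], scores[valid]


# Toy data: a 4x6 image and five keypoints, three of them out of bounds.
height, width = 4, 6
kpts = torch.tensor([[0, 0], [5, 3], [6, 1], [2, 4], [-1, 2]])  # (x, y)
scores = torch.tensor([0.9, 0.8, 0.7, 0.6, 0.5])

kpts_ok, scores_ok = filter_in_bounds(kpts, scores, height, width)

# Same indexing pattern as the failing line, now with safe indices.
mask = torch.zeros(1, 1, height, width)
mask[0, 0, kpts_ok[:, 1], kpts_ok[:, 0]] = scores_ok
```

Because CUDA kernels run asynchronously, the Python line reported for a device-side assert is not always the one that caused it. Running the same batch on the CPU, or setting the environment variable `CUDA_LAUNCH_BLOCKING=1`, should give a traceback that points at the offending index operation.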
I use PyTorch 1.11.0 with CUDA 11.3 and PyTorch Lightning 1.6.
The GPU is a 3090 and the operating system is Ubuntu.
I also tried different Lightning and torch versions. The issue still occurs, but at different iterations during training.
Are you aware of this issue? Are there any fixes for it?
Could you please help? Thanks!
Hi,
I am trying to get OpenGlue running. I start training like this: