Training with ImageNet 64x64 #86

gulperii · 2021-05-21T17:58:10Z

Hello,

I am training on ImageNet 64x64 and running the code with the following command:
python train.py --dataset I64_hdf5 --shuffle --batch_size 128 --num_G_accumulations 1 --num_D_accumulations 1 --num_D_steps 1 --G_lr 1e-4 --D_lr 4e-4 --D_B2 0.999 --G_B2 0.999 --G_attn 32 --D_attn 32 --G_nl relu --D_nl relu --SN_eps 1e-8 --BN_eps 1e-5 --adam_eps 1e-8 --G_ortho 0.0 --G_init xavier --D_init xavier --G_eval_mode --G_ch 32 --D_ch 32 --ema --use_ema --ema_start 2000 --test_every 5000 --save_every 1000 --num_best_copies 5 --num_save_copies 2 --seed 0 --which_best FID --num_epochs 1000 --num_workers 8 --parallel

and getting this error:

  File "train.py", line 229, in <module>
    main()
  File "train.py", line 226, in main
    run(config)
  File "train.py", line 184, in run
    metrics = train(x, y)
  File "/BigGAN-PyTorch/train_fns.py", line 42, in train
    split_D=config['split_D'])
  File "/miniconda3/envs/biggan2-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/miniconda3/envs/biggan2-env/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 140, in forward
    return self.module(*inputs, **kwargs)
  File "/miniconda3/envs/biggan2-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/BigGAN-PyTorch/BigGAN.py", line 443, in forward
    D_out = self.D(D_input, D_class)
  File "/miniconda3/envs/biggan2-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/BigGAN-PyTorch/BigGAN.py", line 403, in forward
    out = out + torch.sum(self.embed(y) * h, 1, keepdim=True)
RuntimeError: CUDA error: device-side assert triggered

I have used the prepare_data script in the repository as follows:

python make_hdf5.py --dataset I64 --batch_size 256 --data_root data
python calculate_inception_moments.py --dataset I64_hdf5 --data_root data

Interestingly, when I create a "mini dataset" by randomly selecting 500 images per label from the original ImageNet dataset, the code runs fine. What could be the problem, and how can I solve it?

a28293971 · 2023-05-29T07:40:36Z

For an error like "CUDA error: device-side assert triggered", it is best to move the model to the CPU and run it again so that you can see the detailed error message.
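
To illustrate the suggestion, here is a minimal sketch of what the CPU run can reveal. It assumes (this is not confirmed anywhere in the thread) that the assert comes from a class label outside the range the embedding layer was built for, which is a common cause when the failing line is an embedding lookup such as self.embed(y). The names check_labels, n_classes, and the 16-dimensional embedding are hypothetical stand-ins, not code from the repository.

```python
import torch
import torch.nn as nn

def check_labels(y, n_classes):
    """Return (min, max, ok) for a tensor of integer class labels."""
    lo, hi = int(y.min()), int(y.max())
    return lo, hi, (0 <= lo and hi < n_classes)

# Hypothetical stand-in for a discriminator's class embedding (assumption).
n_classes = 1000
embed = nn.Embedding(n_classes, 16)

good = torch.tensor([0, 5, 999])
bad = torch.tensor([0, 5, 1000])  # one label past the valid range

good_report = check_labels(good, n_classes)
bad_report = check_labels(bad, n_classes)
print(good_report, bad_report)

embed(good)  # valid labels: the lookup succeeds

# On the CPU an out-of-range label raises a readable Python exception
# instead of an opaque device-side assert.
cpu_error = None
try:
    embed(bad)
except (IndexError, RuntimeError) as e:
    cpu_error = str(e)
print("CPU error:", cpu_error)
```

Running a check like check_labels on each batch of y before it reaches the discriminator would show whether the full dataset contains labels that the 500-images-per-class subset does not. If moving the model to the CPU is impractical, setting the environment variable CUDA_LAUNCH_BLOCKING=1 before launching train.py makes CUDA kernel launches synchronous, so the traceback points at the operation that actually failed.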
