RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasSgemmStridedBatched( handle, opa, opb, m, n, k, &alpha, a, lda, stridea, b, ldb, strideb, &beta, c, ldc, stridec, num_batches)` #47

liu83 · 2024-07-30T08:46:04Z

Hello,
I followed the README: I created the conda environment, activated it, and ran the demo with hero_model on the vdr dataset, as described in the "Setup" and "Running out of the box!" sections.
However, the run failed at the end with this error: RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling cublasSgemmStridedBatched( handle, opa, opb, m, n, k, &alpha, a, lda, stridea, b, ldb, strideb, &beta, c, ldc, stridec, num_batches).
Could you please help me figure out what I did wrong?
I have checked my setup: PyTorch version 1.10.0, CUDA version 11.3. My GPU is an NVIDIA GeForce RTX 3080 Ti (sm_86).
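
For reference, here is a minimal sketch of how the versions above can be read from inside the activated conda environment. The helper name `describe_environment` is made up for illustration and is not part of SimpleRecon; it only uses standard `torch` and `torch.cuda` queries, and it skips the GPU fields when no CUDA device is visible.

```python
import torch


def describe_environment():
    """Collect the version details that are relevant to a cuBLAS error."""
    info = {
        "torch": torch.__version__,
        # CUDA version PyTorch was built against (None for CPU-only builds).
        "cuda_build": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        major, minor = torch.cuda.get_device_capability(0)
        info["device"] = torch.cuda.get_device_name(0)
        info["capability"] = f"sm_{major}{minor}"
        # Architectures this PyTorch binary was compiled for.
        info["arch_list"] = torch.cuda.get_arch_list()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the reported capability does not appear in the architecture list, the installed PyTorch binary may not match the GPU, which would be worth ruling out first.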

Thank you!

Below is the log output from my terminal.

/src/simplerecon(main)$ CUDA_VISIBLE_DEVICES=0 python test.py --name HERO_MODEL \
        --output_base_path OUTPUT_PATH \
        --config_file configs/models/hero_model.yaml \
        --load_weights_from_checkpoint weights/hero_model.ckpt \
        --data_config configs/data/vdr_dense.yaml \
        --num_workers 8 \
        --batch_size 2 \
        --fast_cost_volume \
        --run_fusion \
        --depth_fuser open3d \
        --fuse_color \
        --dump_depth_visualization;

########################### Options ###########################

 random_seed: 0
 name: HERO_MODEL
 log_dir: /home/aime/tmp/tensorboard
 notes: 
 log_interval: 100
 val_interval: 1000
 val_batches: 100
 dataset: vdr
 dataset_path: /home/aime/bliu_workspace/src/simplerecon/datasets/vdr
 num_workers: 8
 tuple_info_file_location: data_splits/vdr/
 mv_tuple_file_suffix: _eight_view_deepvmvs_dense.txt
 frame_tuple_type: dense
 model_num_views: 8
 num_images_in_tuple: 8
 dataset_scan_split_file: data_splits/vdr/scans.txt
 split: test
 image_width: 512
 image_height: 384
 shuffle_tuple: False
 test_keyframe_buffer_size: 30
 lr: 0.0001
 wd: 0.0001
 num_sanity_val_steps: 0
 max_steps: 110000
 batch_size: 2
 val_batch_size: 16
 gpus: 2
 precision: 16
 lr_steps: [70000, 80000]
 resume: None
 load_weights_from_checkpoint: weights/hero_model.ckpt
 image_encoder_name: efficientnet
 depth_decoder_name: unet_pp
 loss_type: log_l1
 matching_encoder_type: resnet
 matching_feature_dims: 16
 matching_scale: 1
 matching_num_depth_bins: 64
 min_matching_depth: 0.25
 max_matching_depth: 5.0
 cv_encoder_type: multi_scale_encoder
 feature_volume_type: mlp_feature_volume
 output_base_path: OUTPUT_PATH
 run_fusion: True
 fuse_color: True
 fusion_max_depth: 3.0
 fusion_resolution: 0.04
 depth_fuser: open3d
 single_debug_scan_id: None
 skip_frames: None
 skip_to_frame: None
 pc_fusion_z_thresh: 0.04
 n_consistent_thresh: 3
 voxel_downsample: 0.02
 mask_pred_depth: False
 cache_depths: False
 fusion_use_raw_lowest_cost: False
 high_res_validation: False
 fast_cost_volume: True
 standard_fps: 30
 dump_depth_visualization: True
 use_precomputed_partial_meshes: False
 viz_render_width: 640
 viz_render_height: 480
 cam_marker_size: 0.7
 back_face_alpha: 0.5

###############################################################

################################################################################
####################### VDR Dataset, number of scans: 2 ########################
################################################################################

################################################################################
######################### Running fusion! Using open3d #########################
Output directory:
OUTPUT_PATH/HERO_MODEL/vdr/dense/meshes/0.04_3.0_open3d_color
################################################################################

################################################################################
############################### Saving quick viz.###############################
#######Output directory:
OUTPUT_PATH/HERO_MODEL/vdr/dense/viz/quick_viz ########
################################################################################

WARNING - 2024-07-30 10:26:22,099 - warnings - /home/aime/miniconda3/envs/simplerecon/lib/python3.9/site-packages/timm/models/_factory.py:117: UserWarning: Mapping deprecated model name tf_efficientnetv2_s_in21ft1k to current tf_efficientnetv2_s.in21k_ft_in1k.
model = create_fn(

INFO - 2024-07-30 10:26:22,268 - _builder - Loading pretrained weights from Hugging Face hub (timm/tf_efficientnetv2_s.in21k_ft_in1k)
INFO - 2024-07-30 10:26:22,488 - _hub - [timm/tf_efficientnetv2_s.in21k_ft_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
WARNING - 2024-07-30 10:26:22,574 - _builder - Unexpected keys (bn2.bias, bn2.num_batches_tracked, bn2.running_mean, bn2.running_var, bn2.weight, classifier.bias, classifier.weight, conv_head.weight) found while loading pretrained weights. This may be expected if model is being adapted.
################################################################################
########################## Using FeatureVolumeManager ##########################
Number of source views: 7
Using all metadata.
Number of channels: [202, 128, 128, 1]
################################################################################

################################################################################
########################## Using FeatureVolumeManager ##########################
Number of source views: 7
Using all metadata.
Number of channels: [202, 128, 128, 1]
################################################################################

################################################################################
######################## Using FastFeatureVolumeManager ########################
Number of source views: 7
Using all metadata.
Number of channels: [202, 128, 128, 1]
################################################################################

0%| | 0/562 [00:04<?, ?it/s]
0%| | 0/2 [00:04<?, ?it/s]
Traceback (most recent call last):
File "/home/aime/bliu_workspace/src/simplerecon/test.py", line 473, in <module>
main(opts)
File "/home/aime/bliu_workspace/src/simplerecon/test.py", line 270, in main
outputs = model(
File "/home/aime/miniconda3/envs/simplerecon/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/aime/bliu_workspace/src/simplerecon/experiment_modules/depth_model.py", line 328, in forward
src_cam_T_cur_cam = src_cam_T_world @ cur_world_T_cam.unsqueeze(1)
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling cublasSgemmStridedBatched( handle, opa, opb, m, n, k, &alpha, a, lda, stridea, b, ldb, strideb, &beta, c, ldc, stridec, num_batches)
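
The failing line is a broadcast batched matrix multiply, so it can be tried outside the model. The following is only a sketch under assumptions: the function name `batched_pose_matmul` is hypothetical, the poses are assumed to be 4x4 matrices, and identity matrices stand in for the real camera poses. The batch size of 2 and the 7 source views are taken from the log above.

```python
import torch


def batched_pose_matmul(batch_size=2, num_src_views=7, device=None):
    """Run the same kind of matmul as depth_model.py line 328, with dummy data."""
    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"
    # Assumed shapes: (B, V, 4, 4) for source poses, (B, 4, 4) for the current pose.
    src_cam_T_world = torch.eye(4, device=device).repeat(batch_size, num_src_views, 1, 1)
    cur_world_T_cam = torch.eye(4, device=device).repeat(batch_size, 1, 1)
    # unsqueeze(1) makes the current pose broadcast over the source-view axis.
    return src_cam_T_world @ cur_world_T_cam.unsqueeze(1)


if __name__ == "__main__":
    out = batched_pose_matmul()
    print(out.shape, out.device)
```

If this small script raises the same CUBLAS_STATUS_NOT_SUPPORTED error on the GPU, the problem is more likely in the PyTorch/CUDA installation than in the SimpleRecon code itself.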


liu83 · 2024-07-30T08:46:56Z

My PC runs Ubuntu 20.04.6 LTS.
