All fast5 files failed when perform call_mods #35

Open
yxlong-science opened this issue Aug 23, 2023 · 6 comments

@yxlong-science

Hi, very good software! multi_to_single_fast5, guppy_basecaller and tombo all run normally, but call_mods reports that every file fails. This happens with both the test data you provided and my own data. Below are the commands I ran on my own data and the corresponding logs. I also set HDF5_PLUGIN_PATH as you suggested, but it didn't help. For the test data, I also tried unzipping the files as described in issue #8, but the same problem still occurs. Thank you for your response.

data splitting (multi-read to single-read fast5)

multi_to_single_fast5 -i multi_read_fast5_dir -s fast5s/ -t 10 --recursive

guppy_basecaller

singularity exec /public/home/yxlong/Singularity/guppy-gpu.sif guppy_basecaller -i /public/home/yxlong/Modifications/HW04/fast5s/ -r -s fast5s_guppy --config dna_r9.4.1_450bps_hac_prom.cfg --device CUDA:0

config file: /opt/ont/guppy/data/dna_r9.4.1_450bps_hac_prom.cfg
model file: /opt/ont/guppy/data/template_r9.4.1_450bps_hac_prom.jsn
input path: /public/home/yxlong/Modifications/HW04/fast5s/
save path: fast5s_guppy
chunk size: 2000
chunks per runner: 1024
minimum qscore: 9
records per file: 4000
num basecallers: 4
gpu device: CUDA:0
kernel path:
runners per device: 20

Use of this software is permitted solely under the terms of the end user license agreement (EULA).
By running, copying or accessing this software, you are demonstrating your acceptance of the EULA.
The EULA may be found in /opt/ont/guppy/bin
Found 4000 input read files to process.
Init time: 1776 ms

0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|


Caller time: 20067 ms, Samples called: 87639365, samples/s: 4.36734e+06
Finishing up any open output files.
Basecalling completed successfully.

tombo

cat fast5s_guppy/*/*fastq > fast5s_guppy.fastq

micromamba run -n ont-tombo tombo preprocess annotate_raw_with_fastqs --fast5-basedir /public/home/yxlong/Modifications/HW04/fast5s/ --fastq-filenames fast5s_guppy.fastq --sequencing-summary-filenames /public/home/yxlong/Modifications/HW04/fast5s_guppy/sequencing_summary.txt --basecall-group Basecall_1D_000 --basecall-subgroup BaseCalled_template --overwrite --processes 10

[10:19:15] Getting read filenames.
[10:19:15] Parsing sequencing summary files.
[10:19:15] Annotating FAST5s with sequence from FASTQs.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4000/4000 [00:00<00:00, 7867.41it/s]
[10:19:15] Added sequences to a total of 4000 reads.

micromamba run -n ont-tombo tombo resquiggle /public/home/yxlong/Modifications/HW04/fast5s/ /public/home/jyli/HiFi_Genomes/03.AD1_Updated/HC04_V2/HC04_chr_adjust.fa --processes 10 --corrected-group RawGenomeCorrected_000 --basecall-group Basecall_1D_000 --overwrite

[10:21:21] Loading minimap2 reference.
[10:22:10] Getting file list.
[10:22:10] Loading default canonical ***** DNA ***** model.
[10:22:13] Re-squiggling reads (raw signal to genomic sequence alignment).
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4000/4000 [01:49<00:00, 36.43it/s]
[10:24:03] Final unsuccessful reads summary (2.6% reads unsuccessfully processed; 104 total reads):
1.4% ( 56 reads) : Poor raw to expected signal matching (revert with tombo filter clear_filters)
0.8% ( 31 reads) : Alignment not produced
0.4% ( 16 reads) : Read event to sequence alignment extends beyond bandwidth
0.0% ( 1 reads) : Fewer changepoints found than requested
[10:24:03] Saving Tombo reads index to file.

deepsignal_plant

micromamba run -n deepsignal deepsignal_plant call_mods --input_path /public/home/yxlong/Modifications/HW04/fast5s/ --model_path /public/home/yxlong/Modifications/example/model.dp2.CNN.arabnrice2-1_120m_R9.4plus_tem.bn13_sn16.both_bilstm.epoch6.ckpt --result_file fast5s.C.call_mods.tsv --corrected_group RawGenomeCorrected_000 --motifs C --nproc 10 --nproc_gpu 10

===============================================

parameters:

input_path:
/public/home/yxlong/Modifications/HW04/fast5s/
f5_batch_size:
30
model_path:
/public/home/yxlong/Modifications/example/model.dp2.CNN.arabnrice2-1_120m_R9.4plus_tem.bn13_sn16.both_bilstm.epoch6.ckpt
model_type:
both_bilstm
seq_len:
13
signal_len:
16
layernum1:
3
layernum2:
1
class_num:
2
dropout_rate:
0
n_vocab:
16
n_embed:
4
is_base:
yes
is_signallen:
yes
batch_size:
512
hid_rnn:
256
result_file:
fast5s.C.call_mods.tsv
gzip:
False
recursively:
yes
corrected_group:
RawGenomeCorrected_000
basecall_subgroup:
BaseCalled_template
is_dna:
yes
normalize_method:
mad
motifs:
C
mod_loc:
0
region:
None
positions:
None
reference_path:
None
nproc:
10
nproc_gpu:
10

===============================================

[main] call_mods starts..
cuda availability: False
4000 fast5 files in total..
parse the motifs string..
read_fast5 process-185516 starts
read_fast5 process-185518 starts
read_fast5 process-185517 starts
read_fast5 process-185519 starts
read_fast5 process-185520 starts
read_fast5 process-185521 starts
read_fast5 process-185522 starts
call_mods process-185523 starts
call_mods process-185524 starts
write_process-185525 starts
read_fast5 process-185516 ending, proceed 600 fast5s
read_fast5 process-185518 ending, proceed 570 fast5s
read_fast5 process-185522 ending, proceed 540 fast5s
read_fast5 process-185519 ending, proceed 510 fast5s
read_fast5 process-185521 ending, proceed 580 fast5s
read_fast5 process-185517 ending, proceed 600 fast5s
read_fast5 process-185520 ending, proceed 600 fast5s
call_mods process-185524 ending, proceed 0 feature-batches(512)
call_mods process-185523 ending, proceed 0 feature-batches(512)
write_process-185525 finished
4000 of 4000 fast5 files failed..
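
One way to read the log above is to total the per-process counters. The sketch below is only an illustration: the regular expressions are written against the wording of the lines in this log (not against any documented log format), and `summarize_call_mods_log` is a hypothetical helper name. On this log it shows that the reader processes handled all 4000 files while the call_mods workers received zero feature batches, which suggests the failure happens before model inference, during feature extraction.

```python
import re

# Patterns are derived from the log lines in this post; treat them as assumptions.
READ_RE = re.compile(r"read_fast5 process-\d+ ending, proceed (\d+) fast5s")
CALL_RE = re.compile(r"call_mods process-\d+ ending, proceed (\d+) feature-batches")
FAIL_RE = re.compile(r"(\d+) of (\d+) fast5 files failed")


def summarize_call_mods_log(log_text):
    """Total the files read, feature batches called, and failed files in a log."""
    files_read = sum(int(n) for n in READ_RE.findall(log_text))
    feature_batches = sum(int(n) for n in CALL_RE.findall(log_text))
    match = FAIL_RE.search(log_text)
    failed, total = (int(match.group(1)), int(match.group(2))) if match else (None, None)
    return {
        "files_read": files_read,
        "feature_batches": feature_batches,
        "failed": failed,
        "total": total,
    }


# The "ending" lines copied from the log above.
example_log = """\
read_fast5 process-185516 ending, proceed 600 fast5s
read_fast5 process-185518 ending, proceed 570 fast5s
read_fast5 process-185522 ending, proceed 540 fast5s
read_fast5 process-185519 ending, proceed 510 fast5s
read_fast5 process-185521 ending, proceed 580 fast5s
read_fast5 process-185517 ending, proceed 600 fast5s
read_fast5 process-185520 ending, proceed 600 fast5s
call_mods process-185524 ending, proceed 0 feature-batches(512)
call_mods process-185523 ending, proceed 0 feature-batches(512)
4000 of 4000 fast5 files failed..
"""

summary = summarize_call_mods_log(example_log)
print(summary)
```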

@PengNi
Owner

PengNi commented Aug 23, 2023

Hi @yxlong-science, thank you very much for using our tool. I'm not sure yet what the problem is, but I suggest setting the VBZ plugin path and trying again. See the known issues of deepsignal-plant.

Best,
Peng
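
A quick way to confirm that the plugin path is actually visible to the Python process is sketched below. It relies only on the standard library. The file-name fragment `vbz_hdf_plugin` is an assumption about how the VBZ plugin library is named on disk, and `find_vbz_plugin` is a hypothetical helper, not part of deepsignal-plant. It only checks that a matching file exists, not that HDF5 can load it.

```python
import os
from pathlib import Path


def find_vbz_plugin(plugin_path=None):
    """List files in HDF5_PLUGIN_PATH whose names look like the VBZ plugin.

    plugin_path defaults to the HDF5_PLUGIN_PATH environment variable and may
    hold several directories separated by os.pathsep.
    """
    if plugin_path is None:
        plugin_path = os.environ.get("HDF5_PLUGIN_PATH", "")
    hits = []
    for entry in plugin_path.split(os.pathsep):
        if not entry:
            continue
        directory = Path(entry)
        if directory.is_dir():
            # Assumed naming: the library file name contains "vbz_hdf_plugin".
            hits.extend(sorted(p for p in directory.iterdir() if "vbz_hdf_plugin" in p.name))
    return hits


found = find_vbz_plugin()
print(found if found else "no VBZ plugin found via HDF5_PLUGIN_PATH")
```

Run it inside the same environment as call_mods (for example via `micromamba run -n deepsignal python ...`), since the variable must be set for that process.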

@yxlong-science
Author

Thanks for your quick reply. I've tried setting HDF5_PLUGIN_PATH, but it doesn't help. guppy_basecaller and tombo both seem to run normally. After unzipping the test data, I still get the same error. Is there any way for me to test what is wrong?
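
One sanity check that can be run on a single read is to confirm that the resquiggle output is present under the group passed as --corrected_group. The sketch below is an illustration under stated assumptions: the layout `Analyses/<corrected_group>/<subgroup>` with `Events` and `Alignment` children reflects how Tombo is understood to store resquiggle results, and the helper names are hypothetical. The path walk works on any nested mapping, so it can be tried on a plain dict; `check_fast5` needs the third-party h5py package.

```python
def has_path(root, path):
    """Walk a slash-separated path through nested mappings (dicts or h5py groups)."""
    node = root
    for key in path.strip("/").split("/"):
        # Only mapping-like nodes (dict, h5py.File, h5py.Group) expose keys().
        if not hasattr(node, "keys") or key not in node:
            return False
        node = node[key]
    return True


def missing_resquiggle_parts(root, corrected_group="RawGenomeCorrected_000",
                             subgroup="BaseCalled_template"):
    """Return the expected resquiggle paths that are absent from one read."""
    base = f"Analyses/{corrected_group}/{subgroup}"
    # Assumed Tombo layout: an Events dataset and an Alignment group under base.
    wanted = [base, f"{base}/Events", f"{base}/Alignment"]
    return [p for p in wanted if not has_path(root, p)]


def check_fast5(path, **kwargs):
    """Open a single-read fast5 file and report missing resquiggle paths."""
    import h5py  # third-party; imported lazily so the helpers above stay stdlib-only

    with h5py.File(path, "r") as f5:
        return missing_resquiggle_parts(f5, **kwargs)
```

For example, `check_fast5("some_read.fast5")` (a placeholder file name) returning an empty list would mean the expected groups are there, which points away from the Tombo step and toward call_mods itself.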


@PengNi
Owner

PengNi commented Aug 23, 2023

@yxlong-science, I am not sure. If you send me 10 to 100 resquiggled reads, I will test them.

@yxlong-science
Author

Thanks a lot! Here are 100 resquiggled reads. https://drive.google.com/file/d/1HvP6V7xSehnArod7Bxa6P9gNPYyH-9R8/view?usp=sharing

@PengNi
Owner

PengNi commented Aug 24, 2023

Hi @yxlong-science, it was a bug related to newer versions of numpy. I have fixed it and updated the code. You can reinstall deepsignal-plant in your environment as follows and try again:

pip uninstall deepsignal_plant
git clone https://github.com/PengNi/deepsignal-plant.git
cd deepsignal-plant
python setup.py install

Best,
Peng
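
After reinstalling, it can help to confirm which versions the environment actually picks up. This is a minimal sketch using only the standard library (Python 3.8+). The distribution name `deepsignal-plant` is an assumption about how the package registers itself, and `installed_version` is a hypothetical helper.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names below are assumptions; adjust them to match `pip list`.
for name in ("numpy", "deepsignal-plant"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```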

@yxlong-science
Author

Thanks for your patience, it works very smoothly!
