easy-search didn't return the alnResult.m8 in tmp folder #890

Open
Huilin-Li opened this issue Sep 20, 2024 · 0 comments

Expected Behavior

I was testing the example command mmseqs easy-search examples/QUERY.fasta examples/DB.fasta alnRes.m8 tmp (with alnResult.m8 as the output name in my run). I expected to find the result file in the tmp folder I specified, but it is not there.

Current Behavior

[lihuilin@login01 MMseqs2]$ ls
azure-pipelines.yml  build  cmake  CMakeLists.txt  data  Dockerfile  examples  lib  LICENSE.md  README.md  src  util
[lihuilin@login01 MMseqs2]$ mmseqs easy-search examples/QUERY.fasta examples/DB.fasta alnResult.m8 tmp
Create directory tmp
easy-search examples/QUERY.fasta examples/DB.fasta alnResult.m8 tmp

MMseqs Version:                         87e7103d289029dc3345f85ea9a4c4c6d6416e46
Substitution matrix                     aa:blosum62.out,nucl:nucleotide.out
Add backtrace                           false
Alignment mode                          3
Alignment mode                          0
Allow wrapped scoring                   false
E-value threshold                       0.001
Seq. id. threshold                      0
Min alignment length                    0
Seq. id. mode                           0
Alternative alignments                  0
Coverage threshold                      0
Coverage mode                           0
Max sequence length                     65535
Compositional bias                      1
Compositional bias                      1
Max reject                              2147483647
Max accept                              2147483647
Include identical seq. id.              false
Preload mode                            0
Pseudo count a                          substitution:1.100,context:1.400
Pseudo count b                          substitution:4.100,context:5.800
Score bias                              0
Realign hits                            false
Realign score bias                      -0.2
Realign max seqs                        2147483647
Correlation score weight                0
Gap open cost                           aa:11,nucl:5
Gap extension cost                      aa:1,nucl:2
Zdrop                                   40
Threads                                 40
Compressed                              0
Verbosity                               3
Seed substitution matrix                aa:VTML80.out,nucl:nucleotide.out
Sensitivity                             5.7
k-mer length                            0
Target search mode                      0
k-score                                 seq:2147483647,prof:2147483647
Alphabet size                           aa:21,nucl:5
Max results per query                   300
Split database                          0
Split mode                              2
Split memory limit                      0
Diagonal scoring                        true
Exact k-mer matching                    0
Mask residues                           1
Mask residues probability               0.9
Mask lower case residues                0
Minimum diagonal score                  15
Selected taxa
Spaced k-mers                           1
Spaced k-mer pattern
Local temporary path
Rescore mode                            0
Remove hits by seq. id. and coverage    false
Sort results                            0
Mask profile                            1
Profile E-value threshold               0.001
Global sequence weighting               false
Allow deletions                         false
Filter MSA                              1
Use filter only at N seqs               0
Maximum seq. id. threshold              0.9
Minimum seq. id.                        0.0
Minimum score per column                -20
Minimum coverage                        0
Select N most diverse seqs              1000
Pseudo count mode                       0
Min codons in orf                       30
Max codons in length                    32734
Max orf gaps                            2147483647
Contig start mode                       2
Contig end mode                         2
Orf start mode                          1
Forward frames                          1,2,3
Reverse frames                          1,2,3
Translation table                       1
Translate orf                           0
Use all table starts                    false
Offset of numeric ids                   0
Create lookup                           0
Add orf stop                            false
Overlap between sequences               0
Sequence split mode                     1
Header split mode                       0
Chain overlapping alignments            0
Merge query                             1
Search type                             0
Search iterations                       1
Start sensitivity                       4
Search steps                            1
Prefilter mode                          0
Exhaustive search mode                  false
Filter results during exhaustive search 0
Strand selection                        1
LCA search mode                         false
Disk space limit                        0
MPI runner
Force restart with latest tmp           false
Remove temporary files                  true
Alignment format                        0
Format alignment output                 query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits
Database output                         false
Overlap threshold                       0
Database type                           0
Shuffle input database                  true
Createdb mode                           0
Write lookup file                       0
Greedy best hits                        false

createdb examples/QUERY.fasta tmp/18110643841744502873/query --dbtype 0 --shuffle 1 --createdb-mode 0 --write-lookup 0 --id-offset 0 --compressed 0 -v 3

Converting sequences
[405] 0s 192ms
Time for merging to query_h: 0h 0m 0s 84ms
Time for merging to query: 0h 0m 0s 43ms
Database type: Aminoacid
Time for processing: 0h 0m 0s 373ms
createdb examples/DB.fasta tmp/18110643841744502873/target --dbtype 0 --shuffle 1 --createdb-mode 0 --write-lookup 0 --id-offset 0 --compressed 0 -v 3

Converting sequences
[19999] 0s 336ms
Time for merging to target_h: 0h 0m 0s 62ms
Time for merging to target: 0h 0m 0s 67ms
Database type: Aminoacid
Time for processing: 0h 0m 5s 544ms
Create directory tmp/18110643841744502873/search_tmp
search tmp/18110643841744502873/query tmp/18110643841744502873/target tmp/18110643841744502873/result tmp/18110643841744502873/search_tmp --alignment-mode 3 -s 5.7 --remove-tmp-files 1

prefilter tmp/18110643841744502873/query tmp/18110643841744502873/target tmp/18110643841744502873/search_tmp/5440497380282616509/pref_0 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --seed-sub-mat 'aa:VTML80.out,nucl:nucleotide.out' -k 0 --target-search-mode 0 --k-score seq:2147483647,prof:2147483647 --alph-size aa:21,nucl:5 --max-seq-len 65535 --max-seqs 300 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --comp-bias-corr-scale 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --threads 40 --compressed 0 -v 3 -s 5.7

Query database size: 500 type: Aminoacid
Estimated memory consumption: 1G
Target database size: 20000 type: Aminoacid
Index table k-mer threshold: 112 at k-mer size 6
Index table: counting k-mers
[=================================================================] 100.00% 20.00K 2s 504ms
Index table: Masked residues: 210586
Index table: fill
[=================================================================] 100.00% 20.00K 1s 855ms
Index statistics
Entries:          8552346
DB size:          537 MB
Avg k-mer size:   0.133630
Top 10 k-mers
    GQQVAR      190
    QLGQRV      110
    IHDKNI      105
    ALGSGK      105
    LLPGKT      102
    SGGTLR      84
    SGLGRV      75
    VGSSST      61
    VMHAGS      59
    ATADTT      59
Time for index table init: 0h 0m 5s 872ms
Process prefiltering step 1 of 1

k-mer similarity threshold: 112
Starting prefiltering scores calculation (step 1 of 1)
Query db start 1 to 500
Target db start 1 to 20000
[=================================================================] 100.00% 500 2s 36ms

296.967038 k-mers per position
19293 DB matches per sequence
0 overflows
137 sequences passed prefiltering per query sequence
113 median result list length
1 sequences with 0 size result lists
Time for merging to pref_0: 0h 0m 0s 162ms
Time for processing: 0h 0m 11s 426ms
align tmp/18110643841744502873/query tmp/18110643841744502873/target tmp/18110643841744502873/search_tmp/5440497380282616509/pref_0 tmp/18110643841744502873/result --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -a 0 --alignment-mode 3 --alignment-output-mode 0 --wrapped-scoring 0 -e 0.001 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --zdrop 40 --threads 40 --compressed 0 -v 3

Compute score, coverage and sequence identity
Query database size: 500 type: Aminoacid
Target database size: 20000 type: Aminoacid
Calculation of alignments
[=================================================================] 100.00% 500 23s 390ms
Time for merging to result: 0h 0m 0s 44ms
68875 alignments calculated
12897 sequence pairs passed the thresholds (0.187252 of overall calculated)
25.794001 hits per query sequence
Time for processing: 0h 0m 24s 118ms
rmdb tmp/18110643841744502873/search_tmp/5440497380282616509/pref_0 -v 3

Time for processing: 0h 0m 0s 12ms
rmdb tmp/18110643841744502873/search_tmp/5440497380282616509/aln_0 -v 3

Time for processing: 0h 0m 0s 0ms
rmdb tmp/18110643841744502873/search_tmp/5440497380282616509/input_0 -v 3

Time for processing: 0h 0m 0s 0ms
rmdb tmp/18110643841744502873/search_tmp/5440497380282616509/aln_merge -v 3

Time for processing: 0h 0m 0s 0ms
convertalis tmp/18110643841744502873/query tmp/18110643841744502873/target tmp/18110643841744502873/result alnResult.m8 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --format-mode 0 --format-output query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits --translation-table 1 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --db-output 0 --db-load-mode 0 --search-type 0 --threads 40 --compressed 0 -v 3

[=================================================================] 100.00% 500 0s 376ms
Time for merging to alnResult.m8: 0h 0m 0s 57ms
Time for processing: 0h 0m 1s 129ms
rmdb tmp/18110643841744502873/result -v 3

Time for processing: 0h 0m 0s 10ms
rmdb tmp/18110643841744502873/target -v 3

Time for processing: 0h 0m 0s 4ms
rmdb tmp/18110643841744502873/target_h -v 3

Time for processing: 0h 0m 0s 2ms
rmdb tmp/18110643841744502873/query -v 3

Time for processing: 0h 0m 0s 1ms
rmdb tmp/18110643841744502873/query_h -v 3

Time for processing: 0h 0m 0s 1ms
[lihuilin@login01 MMseqs2]$ ls
alnResult.m8  azure-pipelines.yml  build  cmake  CMakeLists.txt  data  Dockerfile  examples  lib  LICENSE.md  README.md  src  tmp  util
[lihuilin@login01 MMseqs2]$
[lihuilin@login01 MMseqs2]$ ls
alnResult.m8  azure-pipelines.yml  build  cmake  CMakeLists.txt  data  Dockerfile  examples  lib  LICENSE.md  README.md  src  tmp  util
[lihuilin@login01 MMseqs2]$ cd tmp
[lihuilin@login01 tmp]$ ls
18110643841744502873  latest
[lihuilin@login01 tmp]$ cat latest
cat: latest: Is a directory
[lihuilin@login01 tmp]$ cd latest
[lihuilin@login01 latest]$ ls
[lihuilin@login01 latest]$ cd ..
[lihuilin@login01 tmp]$ ls
18110643841744502873  latest
[lihuilin@login01 tmp]$ cd 18110643841744502873
[lihuilin@login01 18110643841744502873]$ ls
[lihuilin@login01 18110643841744502873]$
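
In the log above, the convertalis step receives alnResult.m8 as a plain relative path, and the ls output after the run lists alnResult.m8 in the MMseqs2 working directory next to tmp, while "Remove temporary files" is set to true. A minimal shell sketch for checking both locations is shown here; the function name locate_result is hypothetical and is not part of MMseqs2.

```shell
#!/bin/sh
# Hypothetical helper (not part of MMseqs2): report whether a result file
# sits at the path given on the command line or inside the tmp folder.
locate_result() {
    result_path="$1"   # output path as passed to easy-search, e.g. alnResult.m8
    tmp_dir="$2"       # tmp folder as passed to easy-search, e.g. tmp
    result_name=$(basename "$result_path")

    if [ -f "$result_path" ]; then
        echo "found at given path: $result_path"
    elif [ -f "$tmp_dir/$result_name" ]; then
        echo "found in tmp folder: $tmp_dir/$result_name"
    else
        echo "missing: $result_path"
    fi
}

# Example call matching the run in this report.
locate_result alnResult.m8 tmp
```

If the file is wanted inside tmp, passing tmp/alnResult.m8 as the third argument may work. This assumes easy-search hands the output path to convertalis unchanged, which the log suggests but this report does not confirm.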

Steps to Reproduce (for bugs)

Please make sure to execute the reproduction steps with newly recreated and empty tmp folders.

MMseqs Output (for bugs)

Please make sure to also post the complete output of MMseqs. You can use gist.github.com for large output.

Context

Providing context helps us come up with a solution and improve our documentation for the future.

Your Environment

Include as many relevant details about the environment you experienced the bug in.

  • Git commit used (The string after "MMseqs Version:" when you execute MMseqs without any parameters):
    87e7103d289029dc3345f85ea9a4c4c6d6416e46 (from the log above). I installed MMseqs2 from a git clone with the following commands:
git clone https://github.com/soedinglab/MMseqs2.git
cd MMseqs2
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_INSTALL_PREFIX=. ..
make
make install 
export PATH=$(pwd)/bin/:$PATH
  • Which MMseqs version was used (Statically-compiled, self-compiled, Homebrew, etc.):
  • For self-compiled and Homebrew: Compiler and Cmake versions used and their invocation:
  • Server specifications (especially CPU support for AVX2/SSE and amount of system memory):
  • Operating system and version:
[lihuilin@login01 MMseqs2]$ [ $(uname -m) = "x86_64" ] && echo "64bit: Yes" || echo "64bit: No"
64bit: Yes
[lihuilin@login01 MMseqs2]$ grep -q avx2 /proc/cpuinfo && echo "AVX2: Yes" || echo "AVX2: No"
AVX2: Yes
[lihuilin@login01 MMseqs2]$ grep -q sse4_1 /proc/cpuinfo && echo "SSE4.1: Yes" || echo "SSE4.1: No"
SSE4.1: Yes
[lihuilin@login01 MMseqs2]$ # for very old systems which support neither SSE4.1 or AVX2
[lihuilin@login01 MMseqs2]$ grep -q sse2 /proc/cpuinfo && echo "SSE2: Yes" || echo "SSE2: No"
SSE2: Yes
[lihuilin@login01 MMseqs2]$
[lihuilin@login01 MMseqs2]$