Originally posted in the old forum by Malte Petersen on 13.01.2014 - 16:57
I am using the new Augustus version 3.0. It compiled fine and runs well, but when I try to retrain it, etraining complains that the input file is not in GenBank format:
./bin/etraining --species=leptopilina /tmp/sequence.gb
./bin/etraining: ERROR
Input file not in genbank format.
However, this is a sequence file downloaded from GenBank [1], so I doubt that it is misformatted. Could this be a bug in Augustus? Or am I doing something wrong?
Thanks for your help!
[1] http://www.ncbi.nlm.nih.gov/nucleotide? ... ds=1293613
by Mario on 21.01.2014 - 10:51
I am looking into this. The problem is likely a side effect of a new feature in 3.0: input files (both GenBank and FASTA) may now be gzipped. Another user has reported the same error with augustus itself. However, we could not reproduce it on our machines. It runs fine here if it compiles.
Malte, can you please check whether you get the problem also with
augustus --species=human examples/HS08198.fa
? Have you installed the zlib and boost iostreams libraries on your system? They are now required with 3.0.
zlib and boost-iostreams are both installed (otherwise I could not have compiled the package).
I should note that the leptopilina species is a custom config that was generated using the new_species.pl script. That went without any problems, and I can use it on examples/HS08198.fa as well.
I also tried to use the example genbank file examples/hsackI10.gb, with the same error.
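To rule out the input files themselves, it can help to look at their first bytes. The sketch below is only an illustration and is not how Augustus detects formats: `sniff_format` is a hypothetical helper that relies on three general facts, namely that gzip streams start with the bytes 0x1f 0x8b, that GenBank flat-file records start with a `LOCUS` line, and that FASTA records start with `>`.

```python
# Hypothetical diagnostic helper; NOT part of Augustus and not its detection logic.
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def sniff_format(path):
    """Return 'gzip', 'genbank', 'fasta' or 'unknown' for the file at path."""
    with open(path, "rb") as fh:
        if fh.read(2) == GZIP_MAGIC:
            return "gzip"
        fh.seek(0)
        for raw in fh:
            line = raw.strip()
            if not line:
                continue  # skip leading blank lines
            if line.startswith(b"LOCUS"):
                return "genbank"
            if line.startswith(b">"):
                return "fasta"
            return "unknown"  # first non-blank line matches neither format
    return "unknown"  # empty file
```

Calling `sniff_format("/tmp/sequence.gb")` and `sniff_format("examples/hsackI10.gb")` should give `genbank` for both if the files are plain, uncompressed GenBank files. In that case the files are probably not the problem, and the new input-reading code is the more likely place to look.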