Error in etraining

Discussions about training AUGUSTUS from various sources of evidence. Not discussed here: BRAKER1 and WebAUGUSTUS!

Moderator: bioinf

ckuanglim
Posts: 1
Joined: Mon Mar 27, 2017 2:12 am

Post by ckuanglim »

I get some error messages from etraining (AUGUSTUS 3.1). Can you explain them to me? The command was:
etraining --species=generic --stopCodonExcludedFromCDS=true gene.raw.gb 1>generic_train.out 2>generic_train.err

And the errors are:
GBProcessor::getGeneList(): Stop codon out of sequence bounds. Ignoring sequence.
Encountered error after reading 1401 annotations.
GBProcessor::getGeneList(): Stop codon out of sequence bounds. Ignoring sequence.
Encountered error after reading 4795 annotations.
Error: In sequence p9_sc00477_896517-901421: One CDS exon does not begin properly after the previous CDS exon.3359 >= 3360
GBProcessor::getGeneList(): Intron has non-positive length.
Encountered error after reading 5728 annotations.
GBProcessor::getGeneList(): Stop codon out of sequence bounds. Ignoring sequence.
Encountered error after reading 6580 annotations.
GBProcessor::getGeneList(): Stop codon out of sequence bounds. Ignoring sequence.
Encountered error after reading 10454 annotations.
GBProcessor::getGeneList(): Stop codon out of sequence bounds. Ignoring sequence.
Encountered error after reading 11045 annotations.
GBProcessor::getGeneList(): Stop codon out of sequence bounds. Ignoring sequence.
Encountered error after reading 11695 annotations.
Error: In sequence p9_sc02138_221160-230971: One CDS exon does not begin properly after the previous CDS exon.1967 >= 1968
GBProcessor::getGeneList(): Intron has non-positive length.
Encountered error after reading 11711 annotations.
Error: In sequence p9_sc02293_5716-12857: One CDS exon does not begin properly after the previous CDS exon.1667 >= 1668
GBProcessor::getGeneList(): Intron has non-positive length.
Encountered error after reading 11983 annotations.
Error: In sequence p9_sc04279_39522-44156: One CDS exon does not begin properly after the previous CDS exon.2501 >= 2502
GBProcessor::getGeneList(): Intron has non-positive length.
Encountered error after reading 13786 annotations.
Error: In sequence p9_sc04869_20574-31091: One CDS exon does not begin properly after the previous CDS exon.1595 >= 1596
GBProcessor::getGeneList(): Intron has non-positive length.
Encountered error after reading 14009 annotations.
katharina
Site Admin
Posts: 531
Joined: Wed Nov 18, 2015 6:14 pm
Location: Greifswald

Re: Error in etraining

Post by katharina »

Dear Ckuanglim,

I assume that this is the complete error list, i.e. that errors occurred for only some of your training gene structures and most of them were ok.
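One way to check this assumption is to tally the messages in the stderr file and compare the total with the number of records in the training file. The following is only a minimal sketch: the regular expressions are copied from the messages quoted in this thread and may not match other AUGUSTUS versions, and the function names `summarize_errors` and `count_records` are made up for illustration.

```python
import re
from collections import Counter

# Patterns taken from the messages quoted in this thread (assumption:
# other AUGUSTUS versions may word them differently).
GB_ERROR = re.compile(r"GBProcessor::getGeneList\(\): (.+)")
NAMED_SEQ = re.compile(r"Error: In sequence (\S+):")


def summarize_errors(err_lines):
    """Return (Counter of error kinds, list of sequence names mentioned)."""
    kinds = Counter()
    named = []
    for line in err_lines:
        line = line.strip()
        m = GB_ERROR.match(line)
        if m:
            kinds[m.group(1)] += 1
            continue
        m = NAMED_SEQ.match(line)
        if m:
            named.append(m.group(1))
    return kinds, named


def count_records(gb_lines):
    """Count GenBank records; each record starts with a LOCUS line."""
    return sum(1 for line in gb_lines if line.startswith("LOCUS"))
```

With the files from the command above, one could call `summarize_errors(open("generic_train.err"))` and `count_records(open("gene.raw.gb"))` and compare `sum(kinds.values())` with the record count. The log quoted above contains 11 such messages (6 stop codon, 5 intron), while at least 14009 annotations were read.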

It happens that gene annotations are inconsistent. For example, the stop codon is sometimes annotated outside the CDS for a few genes, even though it is included at the end of the CDS for most other genes. AUGUSTUS (etraining) then discards such inconsistent gene structures from training.

If the majority of gene structures produced such an error, please check the parameters.cfg file of your species and set the stopCodonExcludedFromCDS parameter to the opposite value (it is a boolean flag, true or false).

If only a minority of gene structures is affected, remove those from the input file, and you are fine.
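Removing the affected records could be sketched as follows. This assumes that the names printed after "In sequence" are the LOCUS names of the records in the GenBank file and that every record ends with a `//` line; the function names `locus_name` and `drop_records` are made up for illustration. Note that the "Stop codon out of sequence bounds" messages quoted above do not name a sequence, so those records would have to be identified some other way.

```python
def locus_name(record_lines):
    """Return the name on the record's LOCUS line, or None if there is none."""
    for line in record_lines:
        if line.startswith("LOCUS"):
            fields = line.split()
            return fields[1] if len(fields) > 1 else None
    return None


def drop_records(gb_lines, bad_names):
    """Yield the lines of all records whose LOCUS name is not in bad_names.

    Assumption: each record ends with a line starting with '//'.
    """
    bad = set(bad_names)
    record = []
    for line in gb_lines:
        record.append(line)
        if line.startswith("//"):
            if locus_name(record) not in bad:
                yield from record
            record = []
    # Keep any trailing lines that are not terminated by '//'.
    if record and locus_name(record) not in bad:
        yield from record
```

The lines yielded by `drop_records(open("gene.raw.gb"), names)` can be written to a new file (for example a hypothetical `gene.filtered.gb`), and etraining can then be rerun on that file.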

Best,

Katharina