UTR annotation

Discussions about training AUGUSTUS from various sources of evidence. Not discussed here: BRAKER1 and WebAUGUSTUS!

Moderator: bioinf

katharina
Site Admin
Posts: 531
Joined: Wed Nov 18, 2015 6:14 pm
Location: Greifswald
Contact:

UTR annotation

Post by katharina »

Originally posted in the old forum by Sammy on 20.08.2013 - 10:12
Hello all,
I am using the script gff2gbSmallDNA.pl to convert trainingSetComplete.gff into trainingSetComplete.gb. Here is the screen output:
Have mRNA and UTR for gene asmbl_9705 in sequence gnl|cnr_1.3|scaffold24. Ignoring UTR annotation and using mRNA annotation only.
Have mRNA and UTR for gene asmbl_630 in sequence gnl|cnr_1.3|scaffold1. Ignoring UTR annotation and using mRNA annotation only.
Have mRNA and UTR for gene asmbl_4749 in sequence gnl|cnr_1.3|scaffold15. Ignoring UTR annotation and using mRNA annotation only.
Have mRNA and UTR for gene asmbl_4752 in sequence gnl|cnr_1.3|scaffold15. Ignoring UTR annotation and using mRNA annotation only.
Have mRNA and UTR for gene asmbl_4751 in sequence gnl|cnr_1.3|scaffold15. Ignoring UTR annotation and using mRNA annotation only.
Have mRNA and UTR for gene asmbl_4748 in sequence gnl|cnr_1.3|scaffold15. Ignoring UTR annotation and using mRNA annotation only.
Have mRNA and UTR for gene asmbl_4750 in sequence gnl|cnr_1.3|scaffold15. Ignoring UTR annotation and using mRNA annotation only.
Have mRNA and UTR for gene asmbl_11224 in sequence gnl|cnr_1.3|scaffold28. Ignoring UTR annotation and using mRNA annotation only.
Have mRNA and UTR for gene asmbl_11233 in sequence gnl|cnr_1.3|scaffold28. Ignoring UTR annotation and using mRNA annotation only.
Have mRNA and UTR for gene asmbl_11248 in sequence gnl|cnr_1.3|scaffold28. Ignoring UTR annotation and using mRNA annotation only.
Suppressing this error message from now on.
Warning: Had redundant UTR exon information for 11626 genes.
Please tell me how I can add the UTR annotation, since the script is ignoring it.
Thanks!
Sammy

Re: UTR annotation

Post by katharina »

by katharina on 21.08.2013 - 15:40
This is not a problem; it is only a warning message. The input file apparently contained exon, CDS, and UTR feature annotations, but gff2gbSmallDNA.pl automatically derives the UTR features from the exon/CDS annotation, so the explicit UTR lines are redundant.
Katharina
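
To illustrate why explicit UTR lines are redundant when exon and CDS features are both present, here is a minimal Python sketch. It is not the code of gff2gbSmallDNA.pl; the function name `utr_from_exons_and_cds` and the coordinates are hypothetical, and it only assumes 1-based inclusive intervals on a single transcript.

```python
def utr_from_exons_and_cds(exons, cds):
    """Return the exon parts not covered by CDS (1-based, inclusive).

    exons, cds: lists of (start, end) tuples for one transcript.
    Hypothetical helper for illustration, not part of AUGUSTUS.
    """
    utr = []
    cds_sorted = sorted(cds)
    for e_start, e_end in sorted(exons):
        pos = e_start  # first exon position not yet accounted for
        for c_start, c_end in cds_sorted:
            if c_end < pos or c_start > e_end:
                continue  # this CDS segment does not touch the rest of the exon
            if c_start > pos:
                utr.append((pos, c_start - 1))  # untranslated stretch before the CDS
            pos = max(pos, c_end + 1)
        if pos <= e_end:
            utr.append((pos, e_end))  # untranslated stretch after the last CDS
    return utr


# Made-up transcript: the first and last exon are only partly coding.
exons = [(100, 200), (300, 400), (500, 600)]
cds = [(150, 200), (300, 400), (500, 550)]
print(utr_from_exons_and_cds(exons, cds))
```

Whether a resulting interval is a 5' or 3' UTR depends on the strand, which this sketch leaves out.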

Re: UTR annotation

Post by katharina »

by Sammy on 06.09.2013 - 17:22
Hi Katharina,
I am running optimize_augustus.pl with --UTR=on --trainOnlyUtr=1 and other flags. I am getting a long list of errors such as:
Encountered error after reading 505 annotations.
Error: In sequence gnl|cnr_1.3|scaffold1852_175705-196609: One CDS exon begins before the previous CDS exon ends.11073 >= 7711
GBProcessor::getGeneList(): Intron has negative length.
Encountered error after reading 538 annotations.

PS: The training set for optimization was obtained from PASA on RNA-Seq data.
-
Thanks
Sammy

Re: UTR annotation

Post by katharina »

by katharina on 06.09.2013 - 17:34
I would simply remove the error-causing GenBank entry (or have a close look at how it differs from the entries that do not cause errors).
Katharina
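
One way to do this removal for many entries at once is sketched below in Python. It assumes that the error lines have the form shown above ("Error: In sequence NAME: ...") and that NAME is the second field of the corresponding LOCUS line in the GenBank file; both function names are hypothetical, so check the assumptions against your own files first.

```python
import re


def bad_sequence_names(log_text):
    """Collect NAME from lines like 'Error: In sequence NAME: ...'."""
    return set(re.findall(r"In sequence (\S+):", log_text))


def filter_genbank(lines, bad_names):
    """Yield all lines except those of records whose LOCUS name is in bad_names.

    A record is taken to run from its LOCUS line up to the next LOCUS line,
    so the closing '//' line is dropped together with its record.
    """
    keep = True
    for line in lines:
        if line.startswith("LOCUS"):
            fields = line.split()
            # Assumption: the second field of the LOCUS line is the sequence name.
            keep = len(fields) < 2 or fields[1] not in bad_names
        if keep:
            yield line


# Tiny made-up example with two records, one of which is reported as faulty.
log = "Error: In sequence seqA: One CDS exon begins before the previous CDS exon ends.\n"
records = [
    "LOCUS       seqA   100 bp  DNA\n", "ORIGIN\n", "//\n",
    "LOCUS       seqB   50 bp  DNA\n", "ORIGIN\n", "//\n",
]
cleaned = list(filter_genbank(records, bad_sequence_names(log)))
print("".join(cleaned), end="")
```

For real data, read the captured error output and the .gb file from disk and write the filtered lines to a new file, keeping the original training set untouched.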

Re: UTR annotation

Post by katharina »

by Sammy on 11.09.2013 - 13:07

Hi Katharina, I removed all the error-causing GenBank entries, but my job terminated again after some time with this error:
bucket 1 2 3 4 5 6 7 8
starting accuracy: 0.9536, 0.7489, 0.8116, 0.6613, 0.2532, 0.1739, 537.75, 546.12, starting target: 0.5995
improving parameter /Constant/dss_end curently set to 4
1-4
/Constant/dss_end: checking values 1 2 3 4
bucket
augustus: ERROR
Invalid nucleotide '' encountered.
Could not read the accuracy values out of predictions.txt when processing bucket 1. at /home/shg29ny/opt/augustus/scripts/optimize_augustus.pl line 717.
One more question: train.err contains the message "terminal exon doesn't end in stop codon. Variable stopCodonExcludedFromCDS set right?".
Please tell me whether I should set --stopCodonExcludedFromCDS=True or not. I also noticed that setting --stopCodonExcludedFromCDS=false gives better gene-level accuracy, but at the cost of exon- and nucleotide-level accuracy. What do you suggest? My main objective is to get better accuracy at the gene level.
-
Thanks
Sammy

Re: UTR annotation

Post by katharina »

by Mario on 19.11.2013 - 15:18
The error
augustus: ERROR
Invalid nucleotide '' encountered.
can be caused by separate processes of etraining or optimize_augustus.pl writing into the same set of files. If you accidentally forget that optimize_augustus.pl is still running and start it again, augustus may read an incomplete file that another process is still writing.
Mario
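
A simple guard against accidentally starting a second run on the same files is a lock file. The Python sketch below is only one possible approach and not something AUGUSTUS provides; the name `single_run_lock` and the lock file name are hypothetical, and it assumes all runs are started through the same wrapper on the same file system.

```python
import os
from contextlib import contextmanager


@contextmanager
def single_run_lock(lock_path):
    """Hold an exclusive lock file for the duration of the with-block.

    O_CREAT | O_EXCL makes creation fail if the file already exists,
    so a second run refuses to start instead of sharing output files.
    """
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError(
            f"{lock_path} exists: another run may still be active"
        ) from None
    try:
        os.write(fd, str(os.getpid()).encode())  # record the owner for debugging
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


if __name__ == "__main__":
    with single_run_lock("optimize_run.lock"):
        # Start the long-running job here, e.g. via subprocess.run(...).
        print("lock held, safe to start the optimization")
```

If a run is killed hard, the lock file stays behind and has to be deleted by hand after checking that no process is still running.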