Reducing the number of introns

Discussions about predicting genes with AUGUSTUS. Not covered here: WebAUGUSTUS and BRAKER1

Moderator: bioinf

katharina
Site Admin
Posts: 531
Joined: Wed Nov 18, 2015 6:14 pm
Location: Greifswald
Contact:

Reducing the number of introns

Post by katharina »

Originally posted in the old forum by Lisa on 30.04.2013 - 10:37
I am annotating a yeast species and find that Augustus predicts too many introns.
How can I reduce the number of predicted introns?

Re: Reducing the number of introns

Post by katharina »

by Mario on 30.04.2013 - 10:41

There is an easy trick to reduce the number of predicted introns:

If you do not already use hints, create an empty hints file with

    touch hints.gff

Then copy the default extrinsic configuration file and edit the hints parameters in the copy:

    cp config/extrinsic/extrinsic.cfg extrinsic.punishintrons.cfg

The intron line should then look like this:

    intron 1 0.6 M 1 1e+100

The second number on this line (0.6 here) is the malus, a factor applied to every intron that is not supported by evidence from hints. With an empty hints file, that is all introns. Lower this nonnegative number while you still get too many introns, and raise it if you get too few. The default is 1.
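
If you would rather not edit the file by hand, something like the following sketch could set the malus for you. The function name `set_intron_malus` is made up for illustration and is not part of AUGUSTUS. It assumes the intron line has the layout shown above, with the malus as the third whitespace-separated field.

```shell
#!/bin/sh
# Hypothetical helper (not part of AUGUSTUS): set the malus on the "intron"
# line of an extrinsic config file. Assumes the layout shown above:
#   intron <bonus> <malus> <per-source entries...>
set_intron_malus() {
    cfg=$1
    malus=$2
    # Rewrite only the third field of lines whose first field is "intron";
    # every other line is passed through unchanged.
    awk -v m="$malus" '$1 == "intron" { $3 = m } { print }' "$cfg" > "$cfg.tmp" &&
        mv "$cfg.tmp" "$cfg"
}

# Example: punish unsupported introns harder than the 0.6 used above.
# set_intron_malus extrinsic.punishintrons.cfg 0.3
```

Note that awk rejoins the fields of the rewritten line with single spaces, so any column alignment on that line is lost. The values themselves are untouched.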

Run augustus like this:

    augustus --species=saccharomyces --hintsfile=hints.gff --extrinsicCfgFile=extrinsic.punishintrons.cfg genome.fa
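
To see how the malus affects the result, one could run the same command for several values and count the predicted introns each time. The following is only a rough sketch. The function names, the output file names and the list of malus values are made up for illustration. It assumes that `augustus` is on your PATH and that `hints.gff`, `genome.fa` and `extrinsic.punishintrons.cfg` exist as created above. `--introns=on` is passed explicitly so that intron lines appear in the output.

```shell
#!/bin/sh
# Count intron features (third tab-separated column) in a prediction file.
count_introns() {
    awk -F '\t' '$3 == "intron" { n++ } END { print n + 0 }' "$1"
}

# Sketch of a sweep over malus values; all file names are illustrative.
sweep_malus() {
    for malus in 1 0.6 0.3 0.1; do
        cfg="extrinsic.malus_$malus.cfg"
        # Derive a config for this malus from the prepared copy.
        awk -v m="$malus" '$1 == "intron" { $3 = m } { print }' \
            extrinsic.punishintrons.cfg > "$cfg"
        augustus --species=saccharomyces --hintsfile=hints.gff \
            --extrinsicCfgFile="$cfg" --introns=on genome.fa \
            > "pred.malus_$malus.gff"
        # Print: malus <TAB> number of predicted introns
        printf '%s\t%s\n' "$malus" "$(count_introns "pred.malus_$malus.gff")"
    done
}

# Run the sweep only when you call it explicitly:
# sweep_malus
```

The resulting table gives a quick way to pick a malus whose intron count looks plausible for your species.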

Re: Reducing the number of introns

Post by katharina »

by Karina H on 20.03.2014 - 07:49
Comparison of output vs. already known structures
Is there a dataset in which the number of introns has been verified experimentally and is well established, so that the malus value can be adjusted until the predictions match the known results?
Also, are there scripts in the latest AUGUSTUS release that help make this comparison (new results vs. old/expected results), ideally in batch mode for gene predictions at whole-chromosome scale?