scipio output as hints?

Discussions about predicting genes with AUGUSTUS. Not covered here: WebAUGUSTUS and BRAKER1

Moderator: bioinf

katharina
Site Admin
Posts: 531
Joined: Wed Nov 18, 2015 6:14 pm
Location: Greifswald

Post by katharina »

Originally posted in the old forum by Jason on 17.01.2013 - 04:24
Hi,
I was wondering whether anyone has used Scipio output, from aligning the proteins of a closely related, well-curated species, as hints. What are your experiences with or feelings about this approach?
Also, which takes precedence, pri=1 or pri=5? What happens if hints with the same priority contradict each other?
One more question, about grouping hints together. Say I have 3 hints in one group and the prediction obeys 2 of the 3. Would the whole hint group get discarded?
Thanks. We have been using AUGUSTUS for at least 30 species over the past 5 years, but only now have the time to revisit our approach with the advent of RNA-Seq.
Cheers,
Jason

Re: scipio output as hints?

Post by katharina »

by katharina on 17.01.2013 - 12:20
Hi Jason,
I will ask Mario to review my answers to your questions, so please wait until he confirms that I am correct:
I personally have not used Scipio output as hints for AUGUSTUS, although converting the file format is rather easy. I usually create hints from proteins using exonerate (see http://bioinf.uni-greifswald.de/bioinf/ ... teProteins). The reason is that the Scipio output we use for generating training gene structures is a rather strictly filtered GFF file: it contains hints only for the proteins for which we obtained complete gene structures, not for all proteins, so we lose some information that exonerate would keep. However, I personally think that it is possible and reasonable to use hints from Scipio. Maybe Mario has tried this.
pri=5 will be prioritised over pri=1. AUGUSTUS initially discards impossible hints. If two possible hints with the same priority contradict each other, the hint that supports the gene structure with the higher probability will be used. If you enabled prediction of alternative transcripts, two different gene structures, one supported by each contradicting hint, might be predicted.
Regarding your question about grouping: the hint group does not get discarded altogether. Support is taken from the 2 of the 3 hints in the group that can be incorporated into a gene structure.
Best,
Katharina
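
To make the `pri` and group settings discussed above concrete, here is a minimal sketch of reading them from a hints file. It assumes the usual AUGUSTUS hints layout, a tab-separated GFF whose ninth column holds `key=value` pairs such as `grp`, `pri` and `src`. The sample lines (sequence name, coordinates, group names) and the helper names `parse_attributes` and `group_hints` are made up for illustration and are not part of AUGUSTUS or Scipio.

```python
from collections import defaultdict

# Hypothetical hint lines: names, coordinates and groups are invented.
SAMPLE_HINTS = "\n".join([
    "\t".join(["chr1", "scipio", "CDSpart", "1000", "1200", ".", "+", ".",
               "grp=protA;pri=5;src=P"]),
    "\t".join(["chr1", "scipio", "intron", "1201", "1500", ".", "+", ".",
               "grp=protA;pri=5;src=P"]),
    "\t".join(["chr1", "exonerate", "CDSpart", "1100", "1300", ".", "+", ".",
               "grp=protB;pri=1;src=P"]),
])


def parse_attributes(column9):
    """Split 'grp=protA;pri=5;src=P' into a dict of strings."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def group_hints(text):
    """Return {group name: [hint, ...]}; hints without grp= go under None."""
    groups = defaultdict(list)
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a complete GFF line
        attrs = parse_attributes(cols[8])
        groups[attrs.get("grp")].append({
            "seq": cols[0],
            "type": cols[2],
            "start": int(cols[3]),
            "end": int(cols[4]),
            "strand": cols[6],
            # None when pri= is absent; no default value is assumed here.
            "pri": int(attrs["pri"]) if "pri" in attrs else None,
            "src": attrs.get("src", attrs.get("source")),
        })
    return dict(groups)


if __name__ == "__main__":
    for name, hints in group_hints(SAMPLE_HINTS).items():
        print(name, [(h["type"], h["pri"]) for h in hints])
```

Grouping by `grp` first makes it easy to check that all hints derived from one protein alignment carry the same priority before passing the file to AUGUSTUS.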

Re: scipio output as hints?

Post by katharina »

by mario on 05.02.2013 - 14:08
Correction
I have to clarify two points:
If at least one hint in a group is unsatisfiable or contradicts a hint from a group with higher priority, then the whole hint group is not used. Groups with the same priority do not cause any discarding, even when they contradict each other. Once a hint group is used, all of its hints are used individually to influence the prediction. As Katharina said, they do not all have to be satisfied.
If two hints (or groups) with the same priority contradict each other, then both are used. A single transcript can only be supported by one of them; which one (if any at all) depends not only on the hints themselves but also on the whole sequence and on the other hints.
Mario
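
The group rule in Mario's correction can be written down as a toy model. This is only a sketch of the rule as stated in the post, not AUGUSTUS's actual implementation. The function `select_groups`, the data layout and both predicates are hypothetical. In particular, what counts as "unsatisfiable" or "contradicting" is decided inside AUGUSTUS, so the toy predicates here are stand-ins. A further assumption: a group is compared against every higher-priority group, whether or not that group is itself used.

```python
def select_groups(groups, is_satisfiable, contradicts):
    """Return the names of hint groups that would be used under the stated rule.

    groups: {name: {"pri": int, "hints": [hint, ...]}}
    A group is dropped if any of its hints is unsatisfiable or contradicts
    a hint from a group with strictly higher priority. Groups with equal
    priority never cause each other to be dropped.
    """
    kept = []
    for name, group in groups.items():
        usable = all(is_satisfiable(h) for h in group["hints"])
        if usable:
            for other in groups.values():
                if other["pri"] <= group["pri"]:
                    continue  # same or lower priority: never discards
                if any(contradicts(h, o)
                       for h in group["hints"] for o in other["hints"]):
                    usable = False
                    break
        if usable:
            kept.append(name)
    return kept


# Toy stand-ins (assumptions): hints are (type, start, end) tuples.
def toy_satisfiable(hint):
    return hint[1] <= hint[2]


def toy_contradicts(a, b):
    # An intron hint overlapping a CDSpart hint is treated as a conflict.
    overlap = a[1] <= b[2] and b[1] <= a[2]
    return {a[0], b[0]} == {"intron", "CDSpart"} and overlap


TOY_GROUPS = {
    "A": {"pri": 5, "hints": [("intron", 100, 200)]},
    "B": {"pri": 1, "hints": [("CDSpart", 150, 180), ("CDSpart", 300, 400)]},
    "C": {"pri": 5, "hints": [("CDSpart", 120, 160)]},
    "D": {"pri": 1, "hints": [("CDSpart", 500, 450)]},
}

if __name__ == "__main__":
    print(select_groups(TOY_GROUPS, toy_satisfiable, toy_contradicts))
    # prints ['A', 'C']
```

Group B is dropped as a whole because one of its hints conflicts with the higher-priority group A, even though its second hint is unproblematic. A and C contradict each other at the same priority, so both stay in use. D is dropped because its only hint is unsatisfiable in this toy definition.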