Originally posted in the old forum by EJ Blom on 25.03.2013 - 16:10
Dear developers,
I was wondering whether conserved regions such as branch sites are used in the de novo gene prediction.
I stumbled across this database: http://lemur.amu.edu.pl/share/ERISdb/weblogos_U2.html
and there seem to be some highly conserved regions in introns, which could be used for gene prediction.
Best
EJ
Branch sites in introns (and other conserved regions) used for gene prediction
Moderator: bioinf
Re: Branch sites in introns (and other conserved regions) used for gene prediction
by Mario on 21.05.2013 - 17:03
Yes, Augustus uses a branch point model, and it contributes a lot to accuracy. It is an inhomogeneous higher-order Markov chain model that is trained using the fact that the distance to the 3' splice site is variable. This model is more general than the profile underlying the web logos you pointed to. I find these particular pictures suspicious because the T and the A are evenly conserved throughout the species. Maybe the authors looked for the pattern T.A in the first place.
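To make the idea of an inhomogeneous higher-order Markov chain with a variable distance to the 3' splice site more concrete, here is a minimal Python sketch. It is not the Augustus implementation. The function names (`train_inhomogeneous_markov`, `log_odds`, `best_branch_point`), the uniform background, the pseudocount and the toy training windows are all assumptions made up for illustration. "Inhomogeneous" here means that every position in the window has its own conditional distribution, and "higher-order" means that each base is conditioned on the preceding `order` bases. The variable distance is handled in the simplest possible way, by scoring every window position upstream of the 3' splice site and keeping the best one.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_inhomogeneous_markov(windows, order=1, pseudocount=1.0):
    """Estimate P(base at position i | previous `order` bases), separately
    for every position i, from equally long example windows."""
    length = len(windows[0])
    # counts[i][context][base]: one count table per window position
    counts = [defaultdict(lambda: defaultdict(float)) for _ in range(length)]
    for w in windows:
        for i, base in enumerate(w):
            context = w[max(0, i - order):i]  # shorter context near the start
            counts[i][context][base] += 1.0

    model = []
    for i in range(length):
        position_table = {}
        for context, base_counts in counts[i].items():
            total = sum(base_counts.values()) + pseudocount * len(ALPHABET)
            position_table[context] = {
                b: (base_counts.get(b, 0.0) + pseudocount) / total
                for b in ALPHABET
            }
        model.append(position_table)
    return model


def log_odds(model, window, order=1, background=0.25):
    """Log-odds score of one window against a uniform background.
    Contexts never seen in training fall back to the uniform distribution."""
    score = 0.0
    for i, base in enumerate(window):
        context = window[max(0, i - order):i]
        p = model[i].get(context, {}).get(base, 1.0 / len(ALPHABET))
        score += math.log(p / background)
    return score


def best_branch_point(model, intron_end, order=1):
    """Scan the 3' end of an intron and return (score, start, distance),
    where distance is the number of bases between the end of the best
    window and the end of `intron_end` (taken as the 3' splice site)."""
    length = len(model)
    best = None
    for start in range(len(intron_end) - length + 1):
        window = intron_end[start:start + length]
        score = log_odds(model, window, order)
        if best is None or score > best[0]:
            distance = len(intron_end) - (start + length)
            best = (score, start, distance)
    return best


# Toy training windows (made up for illustration, not real data).
toy_windows = ["CTAAC", "CTGAC", "TTAAC", "CTAAT", "CTCAC"]
toy_model = train_inhomogeneous_markov(toy_windows, order=1)

# Hypothetical intron 3' end with one motif-like window embedded in it.
toy_intron_end = "GGGGG" + "CTAAC" + "G" * 10
print(best_branch_point(toy_model, toy_intron_end, order=1))
```

Unlike a plain position profile such as a web logo, this kind of model can capture dependencies between neighbouring positions, and it does not assume the motif sits at a fixed offset from the splice site.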