Training with proteins - problems of ATG necessity
Posted: Thu Nov 19, 2015 3:45 pm
Originally posted in the old forum by ebioman on 07.03.2014 - 13:52
Hello
I ran into a problem and wonder whether it is actually a "feature" and, if so, whether there is a hack around it.
For the training I took manually curated proteins from a closely related species. When mapping the proteins onto the genome, I found that the methionine at the start of the protein often does not align exactly to an ATG in the genome, even though the rest of the mapping looks useful.
I thought such genes might still hold useful information for the training, but AUGUSTUS removes every gene that does not start with an ATG.
So I wondered: if another ATG is present in the genome sequence, e.g. a few bases upstream, could I force AUGUSTUS to use that one as the start? And if I do, might that actually spoil the prediction?
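To make the idea concrete, here is a minimal Python sketch of the kind of pre-processing I have in mind. It is not an AUGUSTUS feature: the function name `find_upstream_atg` and the `max_upstream` parameter are made up for illustration. It assumes a forward-strand gene whose start region contains no intron, and that `cds_start` is the 0-based genome position where the first protein residue was mapped.

```python
# Hypothetical helper, not part of AUGUSTUS: look for an in-frame ATG
# at or upstream of the position where the protein's first residue mapped.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_upstream_atg(genome, cds_start, max_upstream=30):
    """Return the 0-based position of the nearest in-frame ATG at or
    upstream of cds_start, or None if a stop codon, the sequence start,
    or the max_upstream limit (in bases) is reached first.

    Assumes forward strand and no intron in the scanned region.
    """
    seq = genome.upper()
    pos = cds_start
    while pos >= 0 and cds_start - pos <= max_upstream:
        codon = seq[pos:pos + 3]
        if codon == "ATG":
            return pos
        if codon in STOP_CODONS:
            # Extending past an in-frame stop would break the reading frame.
            return None
        pos -= 3  # step one codon upstream, staying in frame
    return None


if __name__ == "__main__":
    # The protein start mapped to the CTG at position 8;
    # an in-frame ATG sits two codons upstream at position 2.
    print(find_upstream_atg("CCATGAAACTGGGG", 8))
```

Genes for which this returns a position could have their start coordinate moved before training. Whether extending the start like this biases the trained start codon model is exactly what I am unsure about.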
I hope it is clear what I am asking.
Thanks