integrating new evidence

Discussions about predicting genes with AUGUSTUS. Not covered here: WebAUGUSTUS and BRAKER1

Moderator: bioinf

katharina
Site Admin
Posts: 531
Joined: Wed Nov 18, 2015 6:14 pm
Location: Greifswald

integrating new evidence

Post by katharina »

Originally posted by RS in the old forum on 05.06.2014 - 01:33

Hi,
I have new RNA-Seq and peptide data for maize, available as evidence in the form of intron hints and peptide locations in the genome. Could you tell me the best way to incorporate this and get improved predictions on existing gene models? I expect that a significant number of genes will require correction. I would appreciate it if you could point me to workflows you have in place for such an exercise.
Thanks,
Raj

Re: integrating new evidence

Post by katharina »

Originally posted by katharina in the old forum on 05.06.2014 - 18:22

I understand that you already prepared intron hints from both data sources in the correct format.

When preparing your peptide hints, make sure that they contain strand information. In contrast to coverage information from EST data (= exonpart hints), coverage information from peptides should be treated as CDSpart hints. Peptide hints also carry reading frame information, so set the frame to 0 for all coverage information from peptides.
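A minimal sketch of how peptide locations could be written out as CDSpart hints in GFF format, assuming the peptide matches are already available as genomic coordinates with a strand. The `PeptideHit` type, the `pep2hints` program name in column 2 and the `src=P` tag are illustrative assumptions, not part of AUGUSTUS.

```python
from typing import NamedTuple


class PeptideHit(NamedTuple):
    """Hypothetical record for one peptide-to-genome match."""
    seqname: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def peptide_hint(hit: PeptideHit, src: str = "P") -> str:
    """Format one peptide match as a tab-separated CDSpart hint line."""
    if hit.strand not in ("+", "-"):
        raise ValueError("peptide hints need an explicit strand (+ or -)")
    if hit.start > hit.end:
        raise ValueError("start must not be larger than end")
    fields = [
        hit.seqname,
        "pep2hints",      # hypothetical program name
        "CDSpart",        # peptide coverage is coding, not just exonic
        str(hit.start),
        str(hit.end),
        ".",              # score not used here
        hit.strand,
        "0",              # frame set to 0, as suggested in the post
        f"src={src}",     # source tag, assumed to be P for peptides
    ]
    return "\t".join(fields)


print(peptide_hint(PeptideHit("chr1", 1000, 1029, "-")))
```

Writing one such line per peptide match gives a hints file that can be concatenated with the intron hints.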

You should define different src tags for the different hint sources (e.g. E for hints from RNA-Seq data and P for hints from peptide data). This allows you to treat the hint sources differently in the extrinsic.cfg file. There, you should give peptide hints more influence than RNA-Seq hints, since peptides are a more reliable source of information.
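One way to make sure each hint carries the intended src tag before the files are merged is sketched below. It assumes 9-column, tab-separated GFF hint lines with `key=value` attributes separated by semicolons; the function names `tag_hint` and `merge_hints` are made up for this example. Whatever source letters are used also have to be declared in the extrinsic.cfg file that is passed to augustus.

```python
import re


def tag_hint(line: str, src: str) -> str:
    """Return the hint line with its src attribute set to `src`."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated GFF columns")
    attrs = [a.strip() for a in cols[8].split(";") if a.strip()]
    # Drop any existing source tag so that exactly one remains.
    attrs = [a for a in attrs if not re.match(r"(src|source)=", a)]
    attrs.append(f"src={src}")
    cols[8] = ";".join(attrs)
    return "\t".join(cols)


def merge_hints(rnaseq_lines, peptide_lines):
    """Tag RNA-Seq hints with E and peptide hints with P, then combine them."""
    return ([tag_hint(l, "E") for l in rnaseq_lines]
            + [tag_hint(l, "P") for l in peptide_lines])


rnaseq = ["chr1\tb2h\tintron\t100\t200\t0\t+\t.\tmult=5;pri=4;src=X"]
peptides = ["chr1\tpep2hints\tCDSpart\t300\t329\t.\t+\t0\tgrp=pep1"]
for hint in merge_hints(rnaseq, peptides):
    print(hint)
```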

When running augustus, it will most likely be useful to set the argument --alternatives-from-evidence=true. Depending on the species, it might also make sense to allow unusual splice sites with --allow_hinted_splicesites (the splice site patterns to allow have to be specified).
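A sketch of how such a call could be assembled from a script. The file names and the splice site pattern `atac` are placeholders. The option spellings `--hintsfile`, `--extrinsicCfgFile` and `--allow_hinted_splicesites` are written as I know them from the AUGUSTUS command-line help, so please check them against `augustus --help` for your installed version before relying on this.

```python
import shlex


def augustus_command(genome, hints, extrinsic_cfg,
                     species="maize", splice_sites=None):
    """Build an augustus command line as a list of arguments."""
    cmd = [
        "augustus",
        f"--species={species}",
        f"--hintsfile={hints}",
        f"--extrinsicCfgFile={extrinsic_cfg}",
        "--alternatives-from-evidence=true",
    ]
    if splice_sites:
        # Only add this when non-canonical splice sites are expected.
        cmd.append("--allow_hinted_splicesites=" + ",".join(splice_sites))
    cmd.append(genome)  # the genome FASTA file goes last
    return cmd


# Placeholder file names, for illustration only.
cmd = augustus_command("genome.fa", "hints.gff", "extrinsic.E.P.cfg",
                       splice_sites=["atac"])
print(shlex.join(cmd))
```

The printed command can be inspected first and then run, for example with `subprocess.run(cmd, check=True)`, redirecting standard output to the prediction file.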

After running the predictions, load them into a browser (e.g. use the UCSC assembly hub), also load the RNA-Seq data and the peptides into the browser and visually inspect a couple of examples. (It might also make sense to display the current annotation for comparison.) You will then see whether you need to adjust the parameters in the extrinsic.cfg file.

Katharina

Re: integrating new evidence

Post by katharina »

Originally posted by Sam in the old forum on 19.06.2014 - 21:44

Thanks Katharina!
To follow-up, I am working with the maize genome and the gene models provided by Ensembl.
Is there a way to supply existing gene models and the new hints and ask Augustus to make corrections (if any) to those gene models?

Re: integrating new evidence

Post by katharina »

Originally posted by katharina in the old forum on 21.06.2014 - 15:34

It is possible to provide features of an existing annotation as "manual hints". You might need to tune the parameters in the extrinsic config file, though.
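As a rough sketch, the CDS features of an existing annotation could be converted into manual hints like this. It assumes a 9-column GFF/GTF annotation, keeps only the CDS lines, and uses M as the source tag for manual hints; the function name `manual_hints` and the `anno2hints` program name are made up for this example.

```python
def manual_hints(gff_lines, src="M"):
    """Turn the CDS features of an annotation into CDS hint lines."""
    hints = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue  # only CDS features are converted in this sketch
        seqname, _, _, start, end, _, strand, frame, _ = cols
        hints.append("\t".join([
            seqname, "anno2hints", "CDS", start, end,
            ".", strand, frame, f"src={src}",
        ]))
    return hints


annotation = [
    "##gff-version 3",
    "chr1\tensembl\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chr1\tensembl\tCDS\t150\t300\t.\t+\t0\tParent=tr1",
]
for hint in manual_hints(annotation):
    print(hint)
```

The resulting lines can be appended to the hints file that already holds the RNA-Seq and peptide hints, provided the M source is configured in the extrinsic config file.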
Katharina