On the wiki pages listed below, we provide ideas on how to run certain gene structure prediction tasks with AUGUSTUS. Please be aware that these are only ideas: we cannot guarantee that everything will work on your machine with your data set, or that the instructions will produce optimal results for your specific data set.

AUGUSTUS tutorials

The following tutorials were created for the Workshop on Comparative Genomics on Jan. 14th 2011 by Mario Stanke:

Other tutorials:

AUGUSTUS-CGP tutorials

RNAseq integration (raw reads)

RNAseq data can be integrated into gene predictions with AUGUSTUS using different aligners. In the past, we have developed protocols for BLAT, GSNAP and Tophat/Bowtie. Please note that these protocols are alternatives, i.e. you should choose one aligner rather than working with all three.

In general, the RNA-Seq integration pipelines with BLAT, GSNAP and Tophat2/Bowtie2 yield very similar gene prediction accuracy in species with many introns per gene. Note that BLAT may be more complicated to run, because the pre-compiled binary has a memory limitation that will probably require you to split the RNA-Seq libraries and the genome into many small files. GSNAP and Tophat2/Bowtie2 do not have this problem. We found Tophat2/Bowtie2 to run faster than GSNAP when executed on a single CPU.
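The splitting step mentioned above for BLAT can be sketched as follows. This is only an illustration, not the protocol from the tutorials: it assumes the input is in FASTA format, and the function names, the output naming scheme and the `records_per_chunk` parameter are hypothetical. A chunk size that keeps BLAT within its memory limit has to be found by trial on your own data.

```python
from typing import Iterable, Iterator, List


def split_fasta(lines: Iterable[str], records_per_chunk: int) -> Iterator[List[str]]:
    """Yield lists of FASTA lines, each holding at most records_per_chunk records."""
    if records_per_chunk < 1:
        raise ValueError("records_per_chunk must be at least 1")
    chunk: List[str] = []
    n_records = 0
    for line in lines:
        if line.startswith(">"):
            # A new record starts here; close the current chunk if it is full.
            if n_records == records_per_chunk:
                yield chunk
                chunk, n_records = [], 0
            n_records += 1
        chunk.append(line)
    if chunk:
        yield chunk


def write_chunks(fasta_path: str, out_prefix: str, records_per_chunk: int) -> List[str]:
    """Write each chunk to <out_prefix>.<index>.fa (hypothetical naming) and return the paths."""
    out_paths = []
    with open(fasta_path) as handle:
        for index, chunk in enumerate(split_fasta(handle, records_per_chunk), start=1):
            out_path = f"{out_prefix}.{index}.fa"
            with open(out_path, "w") as out:
                out.writelines(chunk)
            out_paths.append(out_path)
    return out_paths
```

For example, `write_chunks("genome.fa", "genome.part", 10)` would write files named `genome.part.1.fa`, `genome.part.2.fa`, and so on, each with at most ten sequences. Splitting by record count does not bound the size of an individual sequence, so very long sequences may need to be handled separately.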

You should consider integrating RNAseq data as "raw reads" instead of assembled transcripts because you lose information if you rely only on assembled transcripts.

Other evidence integration

Simple procedures for parallelization

External Resources

  • Converting gff3 to gtf format: Go to GFF tools, and then to GFF3 to GTF converter. It is important that the gff3 file that you upload to Galaxy contains the header line ##gff-version 3
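
Before uploading, the header requirement can be checked locally. The following is a minimal sketch, assuming the header must be the first line of the file; the function names are hypothetical and not part of AUGUSTUS or Galaxy.

```python
GFF3_HEADER = "##gff-version 3"


def has_gff3_header(text: str) -> bool:
    """Return True if the first line of the GFF3 text is the version header."""
    lines = text.splitlines()
    # Compare whitespace-separated tokens so extra spaces do not matter.
    return bool(lines) and lines[0].split() == GFF3_HEADER.split()


def ensure_gff3_header(text: str) -> str:
    """Return the text unchanged if the header is present, else prepend it."""
    if has_gff3_header(text):
        return text
    return GFF3_HEADER + "\n" + text
```

A file could then be fixed with, for example, `open("fixed.gff3", "w").write(ensure_gff3_header(open("input.gff3").read()))`, where both file names are placeholders.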

Page last modified on September 27, 2016, at 12:02 PM