Bioinformatics Greifswald | Augustus / Augustus

On the wiki pages listed below, we provide ideas on how to run certain tasks for predicting gene structures with AUGUSTUS. Please be aware that these are only ideas: we cannot guarantee that everything will work on your machine with your data set, or that the instructions will produce optimal results for your specific data set.

AUGUSTUS tutorials

The following tutorials were created for the Workshop on Comparative Genomics on Jan. 14th 2011 by Mario Stanke:

Other tutorials:

AUGUSTUS-CGP tutorials

RNAseq integration (raw reads)

RNAseq data can be integrated into gene predictions with AUGUSTUS using different aligners. In the past, we developed protocols for BLAT, GSNAP and Tophat/Bowtie. Please note that these protocols are alternatives, i.e. you should choose one aligner instead of working with all three! These protocols are also outdated! Today, you should use a state-of-the-art aligner, such as Hisat2 or STAR. Note that BRAKER3 can handle the alignment with Hisat2 automatically.

In general, the RNA-Seq integration pipelines with BLAT, GSNAP and Tophat2/Bowtie2 led to very similar gene prediction accuracy in species with many introns per gene. BLAT was a bit more complicated to run because the pre-compiled binary has a memory limitation that may require the user to split the RNA-Seq libraries and the genome into many small files. With GSNAP and Tophat2/Bowtie2, that problem was not observed. We found Tophat2/Bowtie2 to run faster than GSNAP when executed on a single CPU.

The protocols below are only kept for historic reasons. Where it seemed too dangerous in light of modern alternatives, we removed content.

Other evidence integration

Simple procedures for parallelization

Run AUGUSTUS predictions in parallel
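
As an illustration of the general idea, the following is a minimal Python sketch and not necessarily the procedure described on the linked page. It assumes that `augustus` is on your `PATH`, that the genome is a multi-FASTA file, and that splitting by sequence is acceptable for your data. The function names (`split_fasta`, `augustus_command`, `run_augustus`, `predict_in_parallel`) and the output file naming are hypothetical choices made for this example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def split_fasta(genome_path, out_dir):
    """Write each sequence of a multi-FASTA file to its own file; return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    handle = None
    with open(genome_path) as genome:
        for line in genome:
            if line.startswith(">"):
                if handle is not None:
                    handle.close()
                # Files are numbered by order of appearance, so sequence
                # names with unusual characters never end up in a file name.
                path = out_dir / f"seq_{len(paths) + 1}.fa"
                paths.append(path)
                handle = open(path, "w")
            if handle is not None:
                handle.write(line)
    if handle is not None:
        handle.close()
    return paths


def augustus_command(fasta_path, species):
    """Build the AUGUSTUS command line for one sequence file."""
    return ["augustus", f"--species={species}", str(fasta_path)]


def run_augustus(fasta_path, species):
    """Run AUGUSTUS on one file and store its standard output next to it."""
    out_path = Path(fasta_path).with_suffix(".gff")
    with open(out_path, "w") as out:
        subprocess.run(augustus_command(fasta_path, species), stdout=out, check=True)
    return out_path


def predict_in_parallel(genome_path, out_dir, species, jobs=4):
    """Split the genome and run up to `jobs` AUGUSTUS processes at a time."""
    chunks = split_fasta(genome_path, out_dir)
    # Threads are sufficient here: each one only waits for an external process.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(lambda path: run_augustus(path, species), chunks))
```

A call such as `predict_in_parallel("genome.fa", "chunks", "human", jobs=8)` would then leave one prediction file per sequence in `chunks/`. Gene identifiers restart in every output file, so the files should not simply be concatenated; the AUGUSTUS scripts directory includes `join_aug_pred.pl` for merging such outputs.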

External Resources

Converting gff3 to gtf format: Go to GFF tools, and then to GFF3 to GTF converter. It is important that the gff3 file that you upload to Galaxy contains the header line ##gff-version 3
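
The header requirement can be checked before uploading. The following is a minimal Python sketch, assuming the gff3 content fits in memory. The function name `ensure_gff3_header` is hypothetical and not part of AUGUSTUS or Galaxy.

```python
GFF3_HEADER = "##gff-version 3"


def ensure_gff3_header(text):
    """Return gff3 text whose first line is the '##gff-version 3' header.

    Text that already starts with the header is returned unchanged;
    otherwise the header is prepended. Other directives, such as a
    version 2 header, are not inspected or removed.
    """
    lines = text.splitlines()
    if lines and lines[0].startswith(GFF3_HEADER):
        return text
    return GFF3_HEADER + "\n" + text
```

To fix a file in place, read it with `pathlib.Path("annotation.gff3").read_text()`, pass the content through `ensure_gff3_header`, and write the result back with `write_text`. The file name here is only an example.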