BRAKER


Availability
BRAKER is available from GitHub at https://github.com/Gaius-Augustus/BRAKER.
TSEBRA, the Transcript Selector for BRAKER, is available from GitHub at https://github.com/Gaius-Augustus/TSEBRA.

BRAKER is a tool for fully automated genome annotation with GeneMark-ES/ET/EP and AUGUSTUS. BRAKER is a joint project of Georgia Institute of Technology, USA and Institute for Mathematics and Computer Science, University of Greifswald, Germany.

In its initial version, BRAKER1, the pipeline processed genome and RNA-Seq data only.

BRAKER1 required two input files:

  • genome file in fasta format,
  • corresponding RNA-Seq alignment file (must contain spliced alignments) in bam format.
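The requirement that the alignment file contain spliced alignments can be checked before a run. In the SAM/BAM format, an alignment that spans an intron carries an N (skipped region) operation in its CIGAR string. The following is a minimal sketch and not part of BRAKER: it assumes the bam file has already been converted to SAM text (for example with samtools view), and the function names is_spliced and count_spliced are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def is_spliced(cigar):
    """Return True if the CIGAR string contains a skipped region (N), i.e. an intron."""
    return any(op == "N" for _, op in CIGAR_OP.findall(cigar))


def count_spliced(sam_lines):
    """Return (spliced, total) over alignment records in an iterable of SAM text lines.

    Header lines and records without a CIGAR string (unmapped reads) are skipped.
    """
    spliced = total = 0
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[5] == "*":  # malformed record or no CIGAR
            continue
        total += 1
        if is_spliced(fields[5]):
            spliced += 1
    return spliced, total
```

If count_spliced reports zero spliced alignments, the reads were probably mapped with an aligner that is not splice-aware, and the file is unlikely to be useful as RNA-Seq evidence.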

First, GeneMark-ET performs RNA-Seq supported iterative training and generates initial gene structures. Second, AUGUSTUS uses the predicted genes for training and then integrates RNA-Seq read information as extrinsic evidence into the final gene predictions.

With the most recent release, BRAKER2, GeneMark-EP can also be trained from alignments of proteins from species at a larger evolutionary distance from the target genome. This makes it possible to achieve good gene prediction accuracy with GeneMark-EP and AUGUSTUS in the absence of RNA-Seq data. BRAKER2 retains the full functionality of BRAKER1. More information about BRAKER2 is available in our preprint at bioRxiv.
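The two kinds of evidence described above correspond to two ways of calling the pipeline. The following is a minimal sketch that only assembles the argument list; it assumes the braker.pl options --genome, --bam and --prot_seq as we understand them from the BRAKER documentation (check braker.pl --help for your installed version). The helper name braker_command and the file names are hypothetical.

```python
def braker_command(genome, bam=None, prot_seq=None):
    """Build a braker.pl argument list for exactly one kind of evidence.

    genome   -- genome file in fasta format
    bam      -- RNA-Seq alignment file in bam format (BRAKER1-style run)
    prot_seq -- protein sequences in fasta format (BRAKER2-style run)
    """
    # This sketch covers only the two single-evidence modes described in the text.
    if (bam is None) == (prot_seq is None):
        raise ValueError("provide either an RNA-Seq bam file or a protein fasta file")
    cmd = ["braker.pl", f"--genome={genome}"]
    if bam is not None:
        cmd.append(f"--bam={bam}")
    else:
        cmd.append(f"--prot_seq={prot_seq}")
    return cmd


# Hypothetical file names, for illustration only.
rnaseq_run = braker_command("genome.fa", bam="rnaseq.bam")
protein_run = braker_command("genome.fa", prot_seq="proteins.fa")
```

The returned list can be passed to subprocess.run once BRAKER and its dependencies are installed and on the PATH.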

Accuracy of BRAKER1 with RNA-Seq

We compare the prediction accuracy of BRAKER1 on the genomes of four model species to that of MAKER2 and CodingQuarry (applicable only to fungi). The following table is an excerpt from our publication "BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS":

Level       Arabidopsis thaliana    Caenorhabditis elegans  Drosophila melanogaster S. pombe*
            BRAKER1     MAKER2      BRAKER1     MAKER2      BRAKER1     MAKER2      BRAKER1     MAKER2      CodingQuarry
Gene Sens.  64.4        51.3        55.0        41.0        64.9        55.2        77.4        42.8        79.7
Gene Spec.  52.0        52.2        55.2        30.8        59.4        46.3        80.5        68.7        72.6
Exon Sens.  82.9        76.1        80.2        69.4        75.0        66.4        83.2        50.1        79.6
Exon Spec.  79.0        76.1        85.3        62.3        81.7        66.9        83.2        71.4        81.7

*) Schizosaccharomyces pombe
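
For reading the table: as these measures are commonly defined in gene prediction evaluation, sensitivity is the percentage of annotated features (genes or exons) that are predicted exactly, and specificity is the percentage of predicted features that exactly match the annotation (elsewhere called precision). The sketch below illustrates this at the exon level. It assumes exons are compared as exact (sequence, start, end, strand) tuples; the function name and the coordinates are made up for illustration and are not taken from the publication.

```python
def exon_accuracy(annotated, predicted):
    """Return (sensitivity, specificity) in percent for two sets of exons.

    Each exon is a hashable tuple such as (sequence, start, end, strand);
    a predicted exon counts as correct only if it matches an annotated exon exactly.
    """
    true_pos = len(annotated & predicted)
    sens = 100.0 * true_pos / len(annotated) if annotated else 0.0
    spec = 100.0 * true_pos / len(predicted) if predicted else 0.0
    return sens, spec


# Made-up coordinates, for illustration only.
annotated = {
    ("chr1", 100, 200, "+"),
    ("chr1", 300, 400, "+"),
    ("chr1", 500, 600, "+"),
    ("chr1", 700, 800, "+"),
}
predicted = {
    ("chr1", 100, 200, "+"),
    ("chr1", 300, 400, "+"),
    ("chr1", 510, 600, "+"),  # wrong start coordinate: not an exact match
}
sens, spec = exon_accuracy(annotated, predicted)
```

Here two of the four annotated exons are found, and two of the three predicted exons are correct.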

Note:

We have to correct one important reference in the BRAKER1 publication. In computations of the gene prediction accuracy for the D. melanogaster genome we used the r6.07 version of the fly genome and annotation. However, the Supplementary materials to the paper (available at the "Bioinformatics" journal website) incorrectly cite the earlier r5.55 version of the D. melanogaster genome.

Accuracy of BRAKER2 with OrthoDB proteins

We compare the prediction accuracy of BRAKER2 on the genomes of three model species to that of MAKER2. The following table is an excerpt from our publication "BRAKER2: Automatic Eukaryotic Genome Annotation with GeneMark-EP+ and AUGUSTUS Supported by a Protein Database":

Level       Arabidopsis thaliana    Caenorhabditis elegans  Drosophila melanogaster
            BRAKER2     MAKER2      BRAKER2     MAKER2      BRAKER2     MAKER2
Gene Sens.  70.6        53.9        43.7        30.4        60.0        48.0
Gene Spec.  65.8        55.6        51.3        38.9        59.5        50.3
Exon Sens.  80.6        74.7        71.9        62.6        71.3        63.7
Exon Spec.  85.8        83.0        87.1        81.4        83.2        76.0

Publications

Please cite the following publications when using BRAKER for your project: