Data Input for Training AUGUSTUS


Use this form to submit data for training AUGUSTUS parameters for novel species/new genomic data.

Before submitting a training job for your species of interest, please check our species overview table to see whether parameters have already been trained and made publicly available for your species.

Please read the training tutorial before submitting a job for the first time. Example data for this form is available here. You may also use the button below to insert sample data. Please note that you always need to enter the verification string at the bottom of the page yourself in order to submit a job!

We strongly recommend that you specify an e-mail address! Please read the Help page before submitting a job without an e-mail address! You must provide a species name and a genome file!


There are two options for transferring sequence files (FASTA format): you may either upload data files from your computer or specify web links.

Please read our instructions about FASTA headers before using this web service! Most problems with this web service are caused by an incorrect FASTA header format!
Genome file * (max. 250000 scaffolds): upload a file or specify a web link
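The scaffold limit stated for the genome file can be checked locally before uploading. The following is only a minimal sketch: it assumes that each scaffold corresponds to one FASTA record (one header line starting with ">"), and the names `MAX_SCAFFOLDS`, `count_fasta_records`, and `within_scaffold_limit` are hypothetical helpers, not part of AUGUSTUS or this web service. It does not check the FASTA header format itself.

```python
from typing import TextIO

# Limit taken from the form text ("max. 250000 scaffolds").
MAX_SCAFFOLDS = 250_000


def count_fasta_records(handle: TextIO) -> int:
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in handle if line.startswith(">"))


def within_scaffold_limit(handle: TextIO, limit: int = MAX_SCAFFOLDS) -> bool:
    """Return True if the number of FASTA records does not exceed the limit.

    Assumption: one FASTA record equals one scaffold.
    """
    return count_fasta_records(handle) <= limit
```

A typical use would be `with open("genome.fa") as fh: ok = within_scaffold_limit(fh)`, where `genome.fa` is a placeholder file name.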

You need to specify at least one of the following files: *

cDNA file (non-commercial users only): upload a file or specify a web link

Protein file (non-commercial users only): upload a file or specify a web link

Training gene structure file (gff or gb format, no gzip!)

Possible file combinations

  • {genome file, cDNA file}
    In this case, the cDNA file is used to create a training gene set. If the cDNA quality is sufficient, a UTR training set will also be created.
  • {genome file, protein file}
    In this case, the protein file is used to create a training gene set. No UTR training set can be created.
  • {genome file, gene structure file}
    In this case, the gene structure file is used as the training gene set. If the gene structure file contains UTR elements, a UTR training set will also be created.
  • {genome file, cDNA file, protein file}
    In this case, the protein file is used to create a training gene set. No UTR training set will be created. The cDNA sequences are used only as evidence for prediction.
  • {genome file, cDNA file, gene structure file}
    In this case, the gene structure file is used as the training gene set. If the gene structure file contains UTR elements, a UTR training set will also be created. The cDNA sequences are used only as evidence for prediction.

File combinations that are currently not supported

  • {genome file, cDNA file, protein file, gene structure file}
  • {genome file, protein file, gene structure file}
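
The supported and unsupported combinations listed above can be summarized as a small decision function. This is only an illustrative sketch of the rules as stated on this page, not the web service's actual implementation; the names `Plan` and `plan_for` are hypothetical, and the genome file is assumed to be present in every case because it is mandatory.

```python
from typing import NamedTuple


class Plan(NamedTuple):
    training_source: str   # file used to obtain the training gene set
    utr_training: str      # condition under which a UTR training set is created
    cdna_as_evidence: bool  # True if cDNA is used only as evidence for prediction


def plan_for(cdna: bool, protein: bool, gene_structure: bool) -> Plan:
    """Map the optional files (in addition to the genome file) to a training plan."""
    if not (cdna or protein or gene_structure):
        raise ValueError("at least one of cDNA, protein, or gene structure file is required")
    if protein and gene_structure:
        # Both unsupported combinations contain a protein and a gene structure file.
        raise ValueError("protein file together with gene structure file is not supported")
    if gene_structure:
        return Plan("gene structure file", "if the file contains UTR elements", cdna)
    if protein:
        return Plan("protein file", "never", cdna)
    return Plan("cDNA file", "if cDNA quality is sufficient", False)
```

For example, `plan_for(cdna=True, protein=True, gene_structure=False)` reports the protein file as the training source, no UTR training, and cDNA used as evidence only.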




We use a verification string to check whether you are human. Please type the text shown in the image below into the text field next to the image. *

*) mandatory input arguments