iBeetle Bioinformatics Resources
This page summarizes some web resources for the iBeetle project.
It is a work in progress and mostly for internal use.
The data below is based on different assemblies available from the BeetleBase ftp server.
Genome annotation on Tcas 5.2
The annotation is based on assembly 5.2.
Tcas_5.2.genbank.gff.gz : GFF annotation file submitted to GenBank (md5sum e23b9fb3a8e2d55c415c7fcdae1fd4c7).
README : contains explanation on the generation of the gene set
Batch 6 files
Link list of published NCBI genes to view in GBrowse with publication information: ncbi.genes.html
Batch 5 files
- batch5.fa : The FASTA formatted sequences.
- batch5.html : A link list together with a summary of applied filter criteria.
- batch5.s.tbl : A map of the fragments and their parental transcripts.
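The batch FASTA files can be read with a few lines of code. The sketch below is a generic FASTA reader, not a tool from the project; the record names in the example are made up, and the real headers in batch5.fa may look different.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical records for illustration only.
example = [">frag_1", "CACCACAGC", "ACGACAAA", ">frag_2", "ATGCGGTCC"]
records = dict(read_fasta(example))
```

To read an actual file, pass an open file handle, e.g. `with open("batch5.fa") as fh: records = dict(read_fasta(fh))`.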
Batch 4 files
Note that the files have changed: fragments that were filtered out by Mirko have been removed (leaving 942 fragments), and 442 additional fragments have been added (generated in January 2014).
Batch 4 is based on an assembly 4.0 draft.
The prediction was done on Assembly 4.0.
au3 is a combination of 11729 genes predicted by AUGUSTUS and 2563 genes from the OGS.
Big correspondence table
up to batch 3 (plates 1-62).
The columns are:
- iBeetle number, e.g. iB_00002
- dsRNA sequence, e.g. CACCACAGCACGACAAA...
- batch number ("safe fragment"), e.g. b1.ds2
- gene id from the official gene set (OGS), e.g. TC000021
- coding sequence (CDS) of the OGS gene, e.g. ATGCGGTCCCATAAAAAAA...
- Drosophila ortholog gene name, e.g. JIL
- Drosophila ortholog protein isoform id, e.g. JIL-1-PA
- CDS of the Drosophila protein isoform, e.g. ATGAGTCGCTTGCAAAA
The assignment to OGS genes is based on at least 98% sequence identity over at least 50% of the dsRNA sequence. Note that there may be more than one such hit per dsRNA. Also note that there does not need to be an OGS gene for every dsRNA.
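The assignment rule for OGS genes (at least 98% identity over at least 50% of the dsRNA) can be written as a simple filter. This is a sketch under assumptions: "over at least 50% of the dsRNA" is read here as aligned length divided by dsRNA length, the hit tuples are a hypothetical layout, and all gene ids in the example other than TC000021 are invented.

```python
def is_ogs_assignment(identity_percent, aligned_length, dsrna_length,
                      min_identity=98.0, min_fraction=0.5):
    """True if a hit meets both the identity and the coverage threshold."""
    if dsrna_length <= 0:
        raise ValueError("dsrna_length must be positive")
    covered_fraction = aligned_length / dsrna_length
    return identity_percent >= min_identity and covered_fraction >= min_fraction


def assigned_genes(hits, dsrna_length):
    """Return gene ids of all hits that pass; there may be zero or several.

    `hits` is a list of (gene_id, identity_percent, aligned_length) tuples.
    """
    return [gene for gene, identity, length in hits
            if is_ogs_assignment(identity, length, dsrna_length)]


# Hypothetical hits for one dsRNA of length 500.
hits = [
    ("TC000021", 99.2, 300),   # passes: 99.2% identity over 60%
    ("TC000022", 98.0, 250),   # passes: exactly at both thresholds
    ("TC000023", 97.9, 500),   # fails: identity below 98%
    ("TC000024", 100.0, 200),  # fails: only 40% of the dsRNA aligned
]
genes = assigned_genes(hits, dsrna_length=500)
```

An empty result corresponds to a dsRNA without an OGS gene, and a result with several ids to a dsRNA with more than one hit.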
au2 OGS mapping
Tab-separated list of OGS transcripts (TC number) and their corresponding au2 transcripts in one-to-one relation: au2-ogs-mapping
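A tab-separated one-to-one mapping like this can be loaded into a dictionary while checking that no identifier occurs twice. The sketch assumes that the OGS transcript is in the first column and the au2 transcript in the second; the identifiers in the example are made up and may not match the naming used in the real file.

```python
import csv
import io


def load_one_to_one(handle):
    """Read a two-column TSV into a dict and enforce a one-to-one relation."""
    mapping = {}
    seen_au2 = set()
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if len(row) < 2:
            raise ValueError(f"expected two columns, got: {row!r}")
        ogs_id, au2_id = row[0], row[1]
        if ogs_id in mapping or au2_id in seen_au2:
            raise ValueError(f"not one-to-one: {ogs_id!r} / {au2_id!r}")
        mapping[ogs_id] = au2_id
        seen_au2.add(au2_id)
    return mapping


# Hypothetical content for illustration only.
sample = io.StringIO("TC000021\tau2.g1.t1\nTC000022\tau2.g2.t1\n")
ogs_to_au2 = load_one_to_one(sample)
```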
sample of au2 genes for testing
mRNA sequences of au2 genes
The sequences are mRNA sequences of au2 genes that have not yet been covered by iBeetle templates. They are grouped into the following categories:
- supported by neither RNA-Seq nor protein homology
- supported by protein homology but not by RNA-Seq
- supported by RNA-Seq but not by protein homology
A gene is called supported by RNA-Seq if at least one of its mRNAs is supported by RNA-Seq. The same applies to protein homology.
An mRNA sequence is called supported by RNA-Seq if at least half of it is covered by an interval of reads in which each covered position is covered by at least 2 reads and gaps of at most 20 bp are allowed.
If several mRNAs of one gene satisfy the above condition, the most probable one was taken.
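The RNA-Seq support criterion above can be sketched as a scan over per-position read depth. This is one possible reading, not the project's actual implementation: it assumes that a position counts as covered when its depth is at least 2, that covered runs separated by at most 20 uncovered positions are merged into one interval, and that the interval length includes those tolerated gaps.

```python
def longest_supported_interval(depth, min_depth=2, max_gap=20):
    """Length of the longest interval of covered positions.

    Covered means depth >= min_depth. Runs of covered positions are merged
    when the gap between them is at most max_gap positions.
    """
    best = 0
    start = None         # first position of the current interval
    last_covered = None  # most recent covered position
    for i, d in enumerate(depth):
        if d >= min_depth:
            if start is None or i - last_covered - 1 > max_gap:
                start = i  # gap too large: begin a new interval
            last_covered = i
            best = max(best, last_covered - start + 1)
    return best


def is_rnaseq_supported(depth, min_depth=2, max_gap=20):
    """True if the longest interval spans at least half of the mRNA."""
    if not depth:
        return False
    longest = longest_supported_interval(depth, min_depth, max_gap)
    return 2 * longest >= len(depth)


# Hypothetical depth profiles: a 20 bp gap is tolerated, a 21 bp gap is not.
with_small_gap = [2] * 40 + [0] * 20 + [3] * 40
with_large_gap = [2] * 40 + [0] * 21 + [3] * 40
```

With these profiles, `is_rnaseq_supported(with_small_gap)` is true because the two runs merge into one interval spanning the whole sequence, while `is_rnaseq_supported(with_large_gap)` is false because neither 40 bp run reaches half of the 101 bp sequence.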
We provide a list with the number of genes for each category.
Batch 3 files
- batch3.fa : The FASTA formatted sequences.
- batch3.html : A link list together with a summary of applied filter criteria.
- batch3.s.tbl : A map of the fragments and their parental transcripts.
Batch 2 files
- batch2.fa : The FASTA formatted sequences.
- batch2.html : A link list together with a summary of applied filter criteria.
- batch2.s.tbl : A map of the fragments and their parental transcripts.
Batch 1 files
- batch1.fa : The FASTA formatted sequences.
- batch1.html : A link list leading to the GBrowse2 sites.
- batch1.s.tbl : A map of the fragments and their parental transcripts.
BLAST server (Official Gene Set, AUGUSTUS prediction (revised) and genome)
GBrowse: developmental genome browser
This browser holds tracks that may be useful for iBeetle. Some tracks are experimental. Currently, there are
- OGS: The official gene set (OGS 2) from BeetleBase ftp
- "safe" mRNA fragments, first, second and third batch: mRNA fragments for the above batches of dsRNA construction
- polyA hints: 3' termini identified by nontemplated polyA stretches in the raw ESTs
- AUGUSTUS ab initio with Tribolium parameters (CDS only): first AUGUSTUS prediction for T.cas.
- AUGUSTUS (UTR and hints from cDNA, revised): latest AUGUSTUS prediction, including UTR prediction and incorporating evidence from RNA-Seq data
(Former resources at gobics.de)
Mario Stanke, Universität Greifswald
Last modified: Tue Oct 11 15:35:42 CEST 2011