I am trying to use Augustus 2.7 to model genes in a novel fungal organism, and I'm getting a segmentation violation when the first hint is being read in.
We have a well-assembled genome sequence, gene models produced with Cegma and Snap, and a nice dataset of Illumina RNA-seq reads. Using Tophat and Cufflinks, we get a number of putative transcripts consistent with the Cegma and Snap models.
I was able to make an initial Augustus species configuration using the Cegma results.
I'm following the tutorial in the documentation:
http://bioinf.uni-greifswald.de/bioinf/ ... seq.Tophat
but step 6 is failing. Here's the command I'm running:
augustus --species=fungusamongus --extrinsicCfgFile=extrinsic.cfg --alternatives-from-evidence=true --hintsfile=hints.gff --allow_hinted_splicesites=atac genome_seq.fa
Here's a hint line from hints.gff:
scaffold00001 b2h intron 102395 102452 0 . . pri=4;src=E
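As a sanity check on the hint format, here is a minimal sketch that pulls the `src=` value out of a hint line and tests it against a list of source keys. It assumes the source keys from extrinsic.cfg have already been collected into a list by hand. `extractSrc` and `isDeclared` are hypothetical helpers written for this post, not part of Augustus.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of Augustus): return the value of the src=
// attribute in the 9th column of a hint line, or "" if it cannot be found.
std::string extractSrc(const std::string& line) {
    std::istringstream in(line);
    std::string field;
    // Skip the first eight whitespace-separated columns.
    for (int i = 0; i < 8; ++i) {
        if (!(in >> field)) return "";
    }
    // Whatever remains on the line is treated as the attribute column.
    std::string attrs;
    std::getline(in, attrs);
    std::istringstream attrStream(attrs);
    std::string attr;
    while (std::getline(attrStream, attr, ';')) {
        // Trim leading whitespace before comparing the key.
        std::size_t begin = attr.find_first_not_of(" \t");
        if (begin == std::string::npos) continue;
        attr = attr.substr(begin);
        if (attr.compare(0, 4, "src=") == 0) {
            std::string value = attr.substr(4);
            // Trim trailing whitespace and stray line endings.
            std::size_t end = value.find_last_not_of(" \t\r\n");
            return end == std::string::npos ? "" : value.substr(0, end + 1);
        }
    }
    return "";
}

// Hypothetical helper: true if src appears in the list of declared source keys.
bool isDeclared(const std::string& src, const std::vector<std::string>& declared) {
    return std::find(declared.begin(), declared.end(), src) != declared.end();
}
```

For the hint line above, `extractSrc` yields `E`, so the remaining question is whether `E` is among the source keys that extrinsic.cfg declares.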
Here's the gdb session:
(gdb) where
#0 0x0000000000485c0d in FeatureCollection::readGFFFile (
this=0x7fff9a76a040, filename=0xabaab88 "hints.gff")
at extrinsicinfo.cc:2201
#1 0x000000000040305d in main (argc=7, argv=0x7fff9a76aed8)
at augustus.cc:154
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000485c0d in FeatureCollection::readGFFFile (
this=0x7fff9a76a040, filename=0xabaab88 "hints.gff")
at extrinsicinfo.cc:2201
2201 double gradequot = fti.gradequots[sourcenum][f.gradeclass];
(gdb) p f.gradeclass
$1 = 0
(gdb) p sourcenum
$2 = 2
(gdb) p fti.gradequots[sourcenum][f.gradeclass]
One of the arguments you tried to pass to operator[] could not be converted to what the function wants.
(gdb) l 2150,2210
2160 void FeatureCollection::readGFFFile(const char *filename){
2161 /*
2162 * Read in the configuration file for extrinsic features.
2163 */
2164 readExtrinsicCFGFile();
2165 int predictionStart, predictionEnd;
2166 try {
2167 predictionStart = Properties::getIntProperty( "predictionStart" ) - 1;
2168 } catch (...) {
2169 predictionStart = 0;
2170 }
2171 if (predictionStart < 0)
2172 predictionStart = 0;
2173 try {
2174 predictionEnd = Properties::getIntProperty( "predictionEnd" ) - 1;
2175 } catch (...) {
2176 predictionEnd = INT_MAX;
2177 }
2178
2179 try {
2180 datei.open(filename);
2181 if( !datei ) {
2182 cerr << "FeatureCollection::readGFFFile( " << filename << " ) : Could not open the file!!!" << endl;
2183 throw ProjectError();
2184 }
2185
2186 /*
2187 * read in line by line
2188 *
2189 */
2190 Feature f;
2191 string seqname;
2192 datei >> comment >> ws;
2193 while (datei) {
2194 datei >> f >> comment >> ws;
2195 if (f.end >= predictionStart && f.start <= predictionEnd && f.type != -1) {
2196 f.start -= predictionStart;
2197 f.end -= predictionStart;
2198 FeatureTypeInfo& fti = typeInfo[f.type];
2199 int sourcenum = esource(f.esource);
2200 f.gradeclass = fti.gradeclass(sourcenum, f.score);
2201 double gradequot = fti.gradequots[sourcenum][f.gradeclass];
2202 // set the general values if applicable
2203 if (!(fti.bonus < 0)) { // general bonus/malus
2204 f.bonus = fti.bonus * gradequot * BONUS_FACTOR;
2205 f.malus = fti.malus;
2206 // Let the intron bonus depend on the length
2207 if (f.type == intronF) {
2208 int laenge = f.end - f.start + 1;
2209 double intronGeoProb = IntronModel::getGeoProb(); // 1-1/1730 (mal)
2210 double igenicGeoProb = IGenicModel::getGeoProb(); // 0.9999;
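To make the failure at line 2201 easier to diagnose, the lookup could be wrapped in a bounds check. Below is a minimal sketch. It assumes `gradequots` behaves like a nested `std::vector<std::vector<double>>`, which I have not confirmed against the Augustus headers, and `checkedGradeQuot` is a hypothetical name, not an Augustus function.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical bounds-checked stand-in for
//   fti.gradequots[sourcenum][f.gradeclass]
// Assumption: the table is a vector of rows, one row per source.
double checkedGradeQuot(const std::vector<std::vector<double>>& gradequots,
                        int sourcenum, int gradeclass) {
    // Reject a source index that falls outside the table.
    if (sourcenum < 0 ||
        static_cast<std::size_t>(sourcenum) >= gradequots.size()) {
        throw std::out_of_range(
            "sourcenum " + std::to_string(sourcenum) +
            " is outside a table with " +
            std::to_string(gradequots.size()) + " source rows");
    }
    const std::vector<double>& row = gradequots[sourcenum];
    // Reject a grade class that falls outside this source's row.
    if (gradeclass < 0 ||
        static_cast<std::size_t>(gradeclass) >= row.size()) {
        throw std::out_of_range(
            "gradeclass " + std::to_string(gradeclass) +
            " is outside a row with " +
            std::to_string(row.size()) + " entries");
    }
    return row[gradeclass];
}
```

With a check like this, an out-of-range `sourcenum` or `gradeclass` would raise an exception that names the offending index instead of ending in a segmentation fault.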
Can anyone provide advice on what I'm doing wrong or what to try next?
Thanks.
Robert Bruccoleri
bruc@acm.org