AutoAugustus starting PASA at pipeline 21

Discussions about training AUGUSTUS from various sources of evidence. Not discussed here: BRAKER1 and WebAUGUSTUS!

Moderator: bioinf

katharina
Site Admin
Posts: 531
Joined: Wed Nov 18, 2015 6:14 pm
Location: Greifswald

AutoAugustus starting PASA at pipeline 21

Post by katharina »

Originally posted in the old forum by Vickie on 31.01.2013 - 15:45

Hello!
I have a problem using autoAug.pl with PASA. Because of a bug in PASA, I need to skip step 20 and start PASA at step 21. To do that, I changed line 426 of autoAug.pl to ($perlCmdString="perl $PASAHOME/scripts/Launch_PASA_pipeline.pl -s 21 -c alignAssembly.config -C -R -g genome.fasta "), where -s 21 means starting the pipeline at index 21. But this does not work, because autoAug.pl still drops the existing database in MySQL, and this causes an error. How should I change the script to run PASA from the needed pipeline index without dropping the existing MySQL database?
Best,
Vickie

Re: AutoAugustus starting PASA at pipeline 21

Post by katharina »

by mario on 05.02.2013 - 14:11
I can't tell you. I suggest that in this case you run PASA by itself, locally, and then use the resulting set of full-length ORF gene structures as the GFF training set in a run of autoAugTrain.
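
The second half of this suggestion could look roughly like the sketch below. It only builds and prints the training command so it can be reviewed first. The file names and the species label are placeholders, and the three options are written from memory of the autoAugTrain.pl usage text, so please check them against the usage message of your AUGUSTUS version before running anything.

```shell
# Sketch only: all three values below are placeholders, not names from this thread.
GENOME=genome.fasta             # the same genome that was given to PASA
TRAINING_GFF=pasa_training.gff  # hypothetical name for the full-length ORF gene structures
SPECIES=my_species              # hypothetical AUGUSTUS species name

# Build the autoAugTrain.pl command line as text (assumed options:
# --genome, --trainingset, --species).
train_cmd() {
    printf 'autoAugTrain.pl --genome=%s --trainingset=%s --species=%s\n' "$1" "$2" "$3"
}

# Print the command for review; run it by hand once it looks right.
train_cmd "$GENOME" "$TRAINING_GFF" "$SPECIES"
```

Printing instead of executing keeps the sketch harmless on a machine without AUGUSTUS installed.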

Re: AutoAugustus starting PASA at pipeline 21

Post by katharina »

by Vickie on 05.02.2013 - 15:06
I already found a solution: I simply commented out lines 422-424, where autoAug.pl drops the existing database. As far as I can see, it works: the DB exists and there are no errors from PASA. I started PASA from pipeline index 14 because I use different servers; it ran for about 3 h and is now at index 15. I expect it to break at pipeline index 20 because of the PASA bug, and then I will restart it from index 21. I'll write here if some error appears or if I successfully get my results.
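
For anyone repeating this workaround, a minimal sketch of commenting out a line range while keeping a backup copy is given below. The helper name is made up for illustration, and the line numbers 422-424 are the ones reported in this post; they may differ in other versions of autoAug.pl, so print the lines and read them before changing anything.

```shell
# Hypothetical helper: prefix lines START..END of FILE with '#',
# keeping the untouched original as FILE.bak.
comment_out_lines() {
    file=$1
    start=$2
    end=$3
    sed -i.bak "${start},${end}s/^/#/" "$file"
}

# Example calls (not executed here; line numbers as reported in the post):
#   sed -n '422,424p' autoAug.pl        # inspect the lines first
#   comment_out_lines autoAug.pl 422 424
```

Because the original is saved as autoAug.pl.bak, the change can be undone by copying the backup over the edited script.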

Re: AutoAugustus starting PASA at pipeline 21

Post by katharina »

by Vickie on 07.02.2013 - 11:00
Next error
I ran into another error:
grep complete ../pasa/trainingSetCandidates.fasta | perl -pe 's/>(\S+).*/$1\$/' 1> pasa.complete.lst
grep: ../pasa/trainingSetCandidates.fasta: No such file or directory
PASA has not constructed any complete training gene. Training aborted because of insufficient data.
And indeed there is no trainingSetCandidates.fasta file. When I looked into the pasa_asmbls_to_training_set.dbi script, I understood that it does not generate any .fasta file at all; it generates .cds, .pep, .gff3, .gff3.inx, .top_500_longest, .cds.scores and .cds.scores.selected files. The .cds, .top_500_longest and .pep files are in FASTA format: the .pep file contains peptide sequences, the other two DNA sequences. So the question is: which one, .cds or .top_500_longest, should I rename to .fasta?
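
Since the failing command selects header lines containing the word "complete", one way to compare the candidate files is to count how many of their FASTA headers carry that word. The sketch below does only that and does not settle which file autoAug.pl expects. The helper name is hypothetical, and the glob patterns assume the PASA output sits in the current directory.

```shell
# Hypothetical helper: for one FASTA-like file, report the number of records
# and how many header lines contain the word "complete".
count_complete() {
    file=$1
    if [ ! -f "$file" ]; then
        echo "$file: missing"
        return 1
    fi
    # grep -c exits non-zero on a count of 0, hence the '|| true'.
    total=$(grep -c '^>' "$file" || true)
    complete=$(grep '^>' "$file" | grep -c 'complete' || true)
    echo "$file: $total records, $complete with 'complete' in the header"
}

# Assumption: run from the directory holding the PASA training-set output.
for f in ./*.cds ./*.top_500_longest ./*.pep; do
    if [ -f "$f" ]; then
        count_complete "$f"
    fi
done
```

A file that reports zero "complete" headers would lead to the same "has not constructed any complete training gene" message after renaming, so this check is worth doing first.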