filterBam and tophat2 question
Posted: Fri Nov 20, 2015 1:17 pm
Originally posted in the old forum by Jason on 14.01.2013 - 05:15
Hi,
We have been using AUGUSTUS for many of our species and have used custom scripts to retrieve intron hints from tophat2 BAM files. I saw that AUGUSTUS has new instructions on using RNA-Seq evidence (yes!) and would like to try them. However, I have run into some problems and would like to clarify a few things:
We have paired-end fastq files, xxx_1.fastq and xxx_2.fastq. I converted the read names to end in -1 and -2 instead of /1 and /2 and mapped them with tophat:
tophat reference.fa xxx_1.fastq xxx_2.fastq
and then sorted the alignments by read name:
samtools sort -n output_directory/accepted_hits.bam > output_directory/accepted_hits.s.bam
and ran filterBam:
filterBam --uniq --paired --in output_directory/accepted_hits.s.bam --out output_directory/accepted_hits.sf.bam
but it produced lots and lots of lines like these:
processed line 1------------------------------------------------
Letting pass all mated-paired alignments= 0, listed below:
Size of matepairs=0
Letting pass all mated-paired alignments= 0, listed below:
Size of matepairs=0
Letting pass all mated-paired alignments= 0, listed below:
Size of matepairs=0
Letting pass all mated-paired alignments= 0, listed below:
Size of matepairs=0
Reading the manual again, it looks like I either have to align the paired-end reads in single-end mode, or the BAM file has to contain only paired mappings (i.e., singletons, pairs with only one mate mapped, must be excluded). Which is it?
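One way to test the singleton guess would be to count how many read pairs in the name-sorted file have both mates mapped. Below is a minimal Python sketch of that check, not anything taken from AUGUSTUS or samtools. It assumes the input is plain SAM text (one alignment per line), and that mates are told apart either by a trailing -1/-2 or /1//2 on the read name or, failing that, by the SAM FLAG bits 0x40/0x80. The function name mate_summary and the suffix pattern are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: mates may be marked by a trailing -1/-2 or /1 and /2 on the read name.
SUFFIX = re.compile(r"[-/][12]$")


def mate_summary(sam_lines):
    """Count templates with both mates mapped vs. only one mate mapped."""
    mates = defaultdict(set)
    for line in sam_lines:
        # Skip blank lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & 0x4:  # 0x4 = this read is unmapped
            continue
        match = SUFFIX.search(name)
        if match:
            # Mate number comes from the name suffix.
            mate = match.group()[1]
            template = name[:match.start()]
        elif flag & 0x40:  # 0x40 = first mate in the pair
            mate, template = "1", name
        elif flag & 0x80:  # 0x80 = second mate in the pair
            mate, template = "2", name
        else:
            mate, template = "?", name
        mates[template].add(mate)
    both = sum(1 for seen in mates.values() if {"1", "2"} <= seen)
    return {"both_mapped": both, "single_only": len(mates) - both}
```

The lines could come from the text output of samtools view on accepted_hits.s.bam. If both_mapped is zero or close to it, pairing by read name is probably failing; if single_only is large, singletons are present in the file.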
Help would be much appreciated. An example of a filtered BAM file would also be very useful.
Cheers,
Jason