filterBam execution time

Discussions about predicting genes with AUGUSTUS. Not covered here: WebAUGUSTUS and BRAKER1

Moderator: bioinf

katharina
Site Admin
Posts: 531
Joined: Wed Nov 18, 2015 6:14 pm
Location: Greifswald
Contact:

filterBam execution time

Post by katharina »

Originally posted in the old forum by Cecilia on 23.05.2012 - 21:52

I tried to run 'filterBam --uniq --paired --in input.bam --out filtered.bam'. The process has been running for more than a day and has still not finished. The output file is empty.
How long does filterBam usually take to run?
The input.bam file was generated by Tophat. 'samtools flagstat input.bam' showed:
1675091 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
1675091 + 0 mapped (100.00%:-nan%)
1675091 + 0 paired in sequencing
828443 + 0 read1
846648 + 0 read2
1478028 + 0 properly paired (88.24%:-nan%)
....

by katharina on 23.05.2012 - 22:57
Did you name the read pairs with a dash (-1, -2) instead of a slash (/1, /2) to indicate pairedness? Tophat/Bowtie cleave off the slash numbers, which means that filterBam will try for a very long time to identify impossible pairs. This should eventually result in a segmentation fault.
Use "samtools view" to inspect the Tophat output and make sure that the -1, -2 suffixes are present.
I'll post some example runtimes of filterBam on different file sizes next week.
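
The read-name check described above can also be scripted. The following is only a minimal sketch, not part of filterBam: it assumes the alignments are available as SAM text lines (for example the output of "samtools view"), and the function names are hypothetical. It counts how many read names end in -1/-2, in /1 or /2, or carry no mate suffix at all.

```python
import re
from collections import Counter

# Suffix conventions discussed in this thread; anything else counts as "none".
DASH_SUFFIX = re.compile(r"-[12]$")
SLASH_SUFFIX = re.compile(r"/[12]$")


def classify_read_name(qname):
    """Return 'dash', 'slash' or 'none' depending on the read-name suffix."""
    if DASH_SUFFIX.search(qname):
        return "dash"
    if SLASH_SUFFIX.search(qname):
        return "slash"
    return "none"


def count_suffixes(sam_lines):
    """Count suffix styles over SAM text lines, skipping '@' header lines."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        # QNAME is the first tab-separated column of a SAM record.
        qname = line.split("\t", 1)[0]
        counts[classify_read_name(qname)] += 1
    return counts
```

If nearly all names fall into "none", as with the read ID quoted later in this thread, the mate suffix was lost before or during alignment.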

by Cecilia on 24.05.2012 - 02:29
You're right. Tophat chopped off the /1 and /2 part so the read ID became something like 'HWIST945:92:d059facxx:5:1205:7877:128128'. How does 'samtools flagstat' tell whether the alignment is for read1 or read2? Since the stats show no duplicates, can I just omit the filterBam step?
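
On the read1/read2 question: in the SAM/BAM format, mate information is stored in the FLAG field of each record rather than in the read name, which is why flagstat can still count read1 and read2 after the /1 and /2 suffixes are gone. A minimal sketch of decoding those bits follows; the bit values come from the SAM specification, while the function name is hypothetical.

```python
# FLAG bits as defined in the SAM specification.
FLAG_PAIRED = 0x1        # template has multiple segments (paired in sequencing)
FLAG_PROPER_PAIR = 0x2   # each segment properly aligned according to the aligner
FLAG_READ1 = 0x40        # first segment in the template
FLAG_READ2 = 0x80        # last segment in the template


def mate_from_flag(flag):
    """Return 'read1', 'read2', 'unpaired' or 'ambiguous' for a SAM FLAG value."""
    if not flag & FLAG_PAIRED:
        return "unpaired"
    is_read1 = bool(flag & FLAG_READ1)
    is_read2 = bool(flag & FLAG_READ2)
    if is_read1 and not is_read2:
        return "read1"
    if is_read2 and not is_read1:
        return "read2"
    return "ambiguous"
```

The FLAG is the second column printed by "samtools view", so typical values such as 99 or 147 can be checked by hand with this function.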

by katharina on 29.05.2012 - 10:14
Tophat (or rather Bowtie) can directly produce "paired alignments". Our filterBam and bam2hints tools currently do not work with the paired BAM format. Therefore, you'll need to run Tophat/Bowtie in 'single mode' and use the -1/-2 read pair name convention.
You can omit filterBam but this will produce hints of lower quality. (And you cannot use native paired bam output format from Tophat/Bowtie, as mentioned above.)
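
One possible way to apply the -1/-2 naming convention is to rewrite the FASTQ headers before alignment. The sketch below is an illustration only, not an AUGUSTUS tool: it assumes plain four-line FASTQ records, the function name add_mate_suffix is hypothetical, and any description after the first whitespace in the header is dropped.

```python
import re


def add_mate_suffix(fastq_lines, mate):
    """Yield FASTQ lines whose read names end in '-<mate>' (mate is 1 or 2).

    Assumes four-line records: header, sequence, '+' line, qualities.
    """
    if mate not in (1, 2):
        raise ValueError("mate must be 1 or 2")
    suffix = f"-{mate}"
    for index, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        if index % 4 != 0:
            # Sequence, separator and quality lines pass through unchanged.
            yield line
            continue
        if not line.startswith("@"):
            raise ValueError(f"line {index + 1} is not a FASTQ header: {line!r}")
        fields = line[1:].split()
        name = fields[0] if fields else ""
        # Replace a trailing /1 or /2, which the aligner would cleave off.
        name = re.sub(r"/[12]$", "", name)
        if not name.endswith(suffix):
            name += suffix
        yield "@" + name
```

The function would be run once per mate file, with mate=1 for the first file and mate=2 for the second, and the renamed reads then aligned in single mode as described above.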

by katharina on 29.05.2012 - 10:24
filterBam with the --uniq and --paired flags took about 40 minutes for an 8.8 GB BAM file on an Intel(R) Xeon(R) CPU at 3.07 GHz.
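
For a rough expectation on other file sizes, the single measurement above can be scaled. This is only a back-of-the-envelope sketch: it assumes that runtime grows linearly with BAM file size on comparable hardware, which the thread does not confirm, and the function name is hypothetical.

```python
# Reference point reported in this thread: 8.8 GB took about 40 minutes.
REFERENCE_SIZE_GB = 8.8
REFERENCE_MINUTES = 40.0


def estimate_minutes(size_gb):
    """Estimate filterBam runtime in minutes, ASSUMING linear scaling with size."""
    if size_gb < 0:
        raise ValueError("size_gb must be non-negative")
    return size_gb * REFERENCE_MINUTES / REFERENCE_SIZE_GB
```

Under that assumption, a run that lasts more than a day on a much smaller file points to a problem such as the read-naming issue discussed above rather than to normal runtime.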

by -C on 15.12.2012 - 01:11
filterBam background jobs processing
When running filterBam the output is displayed in the terminal and once you run the job in the background you cannot disown it. Is there a command I can add to fix this?

by katharina on 18.12.2012 - 12:31
Can you please post your exact program call?
Katharina

by -C on 21.12.2012 - 22:50
filterBam background jobs processing
I figured it out. Thanks for the response.