Find the depth at specific positions in a list of BAM files. My colleague Estelle asked: in all the BAM files we sequenced, can you give me the depth at a given position?
This program is now part of the main jvarkit tool. See jvarkit for compiling.
Usage: java -jar dist/jvarkit.jar findallcoverageatposition [options] Files
Usage: findallcoverageatposition [options] Files
Options:
-clip, --clip
use clipped bases.
Default: false
-x, --extend
[20190218] Extend by 'x' bases to try to catch nearby clipped reads. A
distance specified as a positive integer. Commas are removed. The
following suffixes are interpreted: b,bp,k,kb,m,mb,g,gb
Default: 500
-filter, --filter
[20171201](moved to jexl). A JEXL Expression that will be used to filter
out some sam-records (see
https://software.broadinstitute.org/gatk/documentation/article.php?id=1255).
An expression should return a boolean value (true=exclude, false=keep
the read). An empty expression keeps everything. The variable 'record'
is the current observed read, an instance of SAMRecord (https://samtools.github.io/htsjdk/javadoc/htsjdk/htsjdk/samtools/SAMRecord.html).
Default: record.getMappingQuality()<1 || record.getDuplicateReadFlag() || record.getReadFailsVendorQualityCheckFlag() || record.isSecondaryOrSupplementary()
-h, --help
print help and exit
--helpFormat
What kind of help. One of [usage,markdown,xml].
-Q, --mapq
Min mapping quality. Discard reads having MAPQ < 'x'
Default: 1
-o, --out
Output file. Optional. Default: stdout
--groupby, --partition
Group reads by. Data partitioning using the SAM Read Group (see
https://gatkforums.broadinstitute.org/gatk/discussion/6472/). It can
be any combination of sample, library, ...
Default: sample
Possible Values: [readgroup, sample, library, platform, center, sample_by_platform, sample_by_center, sample_by_platform_by_center, any]
-f, --posfile
File containing positions. If the file suffix is '.bed', all positions in
the range will be scanned.
-p, --position
-p chrom:pos. Multiple positions are separated by spaces. Add this chrom/position.
Required
Default: []
-r, -R, --reference
[20171201]Indexed fasta Reference file. This file must be indexed with
samtools faidx and with picard/gatk CreateSequenceDictionary or samtools
dict
--version
print version and exit
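Two of the option descriptions above may be easier to follow with a concrete sketch: the distance syntax accepted by --extend and the default --filter expression. The Python below is only an illustration, not the tool's code (the tool itself evaluates a JEXL expression on an htsjdk SAMRecord). The helper names parse_distance and default_filter_excludes are hypothetical, and the decimal multipliers (k = 1,000 rather than 1,024) are an assumption. The flag bits are the standard ones from the SAM specification.

```python
import re

# ASSUMPTION: suffixes are decimal multipliers (k = 1,000), not binary (1,024).
_SUFFIXES = {
    "": 1, "b": 1, "bp": 1,
    "k": 10**3, "kb": 10**3,
    "m": 10**6, "mb": 10**6,
    "g": 10**9, "gb": 10**9,
}


def parse_distance(text):
    """Hypothetical helper: turn '1,500', '2kb' or '3mb' into a number of bases."""
    cleaned = text.replace(",", "").strip().lower()  # commas are removed
    match = re.fullmatch(r"(\d+)\s*([a-z]*)", cleaned)
    if match is None or match.group(2) not in _SUFFIXES:
        raise ValueError(f"not a valid distance: {text!r}")
    return int(match.group(1)) * _SUFFIXES[match.group(2)]


# Standard SAM FLAG bits.
SECONDARY = 0x100
FAILS_QC = 0x200
DUPLICATE = 0x400
SUPPLEMENTARY = 0x800


def default_filter_excludes(flag, mapq):
    """Mirror of the default --filter expression: True means the read is excluded."""
    return (
        mapq < 1
        or bool(flag & DUPLICATE)
        or bool(flag & FAILS_QC)
        or bool(flag & (SECONDARY | SUPPLEMENTARY))
    )


if __name__ == "__main__":
    print(parse_distance("1,500"))
    print(parse_distance("2kb"))
    # A properly paired primary read with MAPQ 60 is kept (False = keep).
    print(default_filter_excludes(flag=99, mapq=60))
    # The same read marked as a duplicate is excluded.
    print(default_filter_excludes(flag=99 | DUPLICATE, mapq=60))
```

Note the polarity: as in the help text, a result of true excludes the read and false keeps it.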
20141128
The project is licensed under the MIT license.
Should you cite findallcoverageatposition? https://github.com/mr-c/shouldacite/blob/master/should-I-cite-this-software.md
The current reference is:
http://dx.doi.org/10.6084/m9.figshare.1425030
Lindenbaum, Pierre (2015): JVarkit: java-based utilities for Bioinformatics. figshare. http://dx.doi.org/10.6084/m9.figshare.1425030
The input is a file containing a list of paths to the BAM files; in the example below the list is read from standard input.
$ find ./testdata/ -type f -name "*.bam" | \
java -jar dist/jvarkit.jar findallcoverageatposition -p rotavirus:100
#File CHROM POS SAMPLE DEPTH M I D N S H P EQ X Base(A) Base(C) Base(G) Base(T) Base(N) Base(^) Base(-)
./testdata/S4.bam rotavirus 100 S4 126 126 0 0 0 29 0 0 0 0 5 0 0 121 0 0 0
./testdata/S1.bam rotavirus 100 S1 317 317 1 0 0 50 0 0 0 0 27 0 1 289 0 1 0
./testdata/S2.bam rotavirus 100 S2 311 311 0 1 0 60 0 0 0 0 29 1 0 281 0 0 1
./testdata/S3.bam rotavirus 100 S3 446 446 1 0 0 86 0 0 0 0 39 0 1 406 0 1 0
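The report above can be post-processed with a few lines of code. The following is a minimal sketch, assuming the columns are separated by whitespace (tabs or spaces) and that no path or sample name contains a space. The function names parse_report and major_base are hypothetical and not part of jvarkit. It reads the header line starting with '#File' and reports, for each BAM, the most frequent base and its fraction among the Base(A), Base(C), Base(G), Base(T) and Base(N) counts.

```python
# Two rows copied from the example output above.
REPORT = """\
#File CHROM POS SAMPLE DEPTH M I D N S H P EQ X Base(A) Base(C) Base(G) Base(T) Base(N) Base(^) Base(-)
./testdata/S4.bam rotavirus 100 S4 126 126 0 0 0 29 0 0 0 0 5 0 0 121 0 0 0
./testdata/S1.bam rotavirus 100 S1 317 317 1 0 0 50 0 0 0 0 27 0 1 289 0 1 0
"""

BASE_COLUMNS = ["Base(A)", "Base(C)", "Base(G)", "Base(T)", "Base(N)"]


def parse_report(text):
    """Yield one dict per data row, keyed by the names found in the header line."""
    header = None
    for line in text.splitlines():
        # ASSUMPTION: whitespace never occurs inside a path or a sample name.
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("#"):
            header = [fields[0].lstrip("#")] + fields[1:]
            continue
        if header is None:
            raise ValueError("data row found before the header line")
        yield dict(zip(header, fields))


def major_base(row):
    """Return (base, fraction) for the most frequent base in one row."""
    # 'Base(A)'[5] is the single letter between the parentheses.
    counts = {name[5]: int(row[name]) for name in BASE_COLUMNS}
    total = sum(counts.values())
    base = max(counts, key=counts.get)
    return base, (counts[base] / total if total else 0.0)


if __name__ == "__main__":
    for row in parse_report(REPORT):
        base, fraction = major_base(row)
        print(row["SAMPLE"], row["DEPTH"], base, f"{fraction:.3f}")
```

Keying each row by the header names rather than by column index keeps the CIGAR column N and the base column Base(N) apart.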