Extract the coverage of BAM files as a VCF file.
This program is now part of the main jvarkit
tool. See jvarkit for compiling.
Usage: java -jar dist/jvarkit.jar biostar78285 [options] Files
Usage: biostar78285 [options] Files
Options:
-B, --bed, --capture
Limit analysis to this bed file
-f, --filter
A JEXL Expression that will be used to filter out some sam-records (see
https://software.broadinstitute.org/gatk/documentation/article.php?id=1255).
An expression should return a boolean value (true=exclude, false=keep
the read). An empty expression keeps everything. The variable 'record'
is the current observed read, an instance of SAMRecord (https://samtools.github.io/htsjdk/javadoc/htsjdk/htsjdk/samtools/SAMRecord.html).
Default: record.getMappingQuality()<1 || record.getDuplicateReadFlag() || record.getReadFailsVendorQualityCheckFlag() || record.isSecondaryOrSupplementary()
-gcw, --gc-percent-window, --gcw
GC% window size (if REF is defined).
Default: 20
-h, --help
print help and exit
--helpFormat
What kind of help. One of [usage,markdown,xml].
-m, --min-depth
Min depth thresholds.
Default: []
-o, --output
Output file. Optional. Default: stdout
--partition
When using display READ_GROUPS, how should the read groups be partitioned?
Data partitioning uses the SAM Read Group (see
https://gatkforums.broadinstitute.org/gatk/discussion/6472/). It can
be any combination of sample, library, etc.
Default: sample
Possible Values: [readgroup, sample, library, platform, center, sample_by_platform, sample_by_center, sample_by_platform_by_center, any]
-R, --reference
Optional. Indexed fasta Reference file. This file must be indexed with
samtools faidx and with picard/gatk CreateSequenceDictionary or samtools
dict
--version
print version and exit
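To make the default -f expression easier to read, here is a minimal Python sketch of the same test written against the raw SAM fields. It is not the tool's code: the function name default_filter_excludes is hypothetical, and the flag constants are the bit values from the SAM specification that correspond to the SAMRecord getters used in the default expression. As with -f, a result of True means the read is excluded.

```python
# SAM FLAG bits (values from the SAM specification).
FLAG_SECONDARY = 0x100      # secondary alignment
FLAG_QC_FAIL = 0x200        # read fails vendor/platform quality checks
FLAG_DUPLICATE = 0x400      # PCR or optical duplicate
FLAG_SUPPLEMENTARY = 0x800  # supplementary alignment


def default_filter_excludes(mapq: int, flag: int) -> bool:
    """Hypothetical helper mirroring the default -f expression.

    Returns True when the read would be excluded, False when it is kept.
    """
    return (
        mapq < 1
        or bool(flag & FLAG_DUPLICATE)
        or bool(flag & FLAG_QC_FAIL)
        or bool(flag & (FLAG_SECONDARY | FLAG_SUPPLEMENTARY))
    )


if __name__ == "__main__":
    # A primary, non-duplicate read with MAPQ 60 is kept.
    print(default_filter_excludes(60, 0))
    # A read with MAPQ 0 is excluded.
    print(default_filter_excludes(0, 0))
    # A duplicate read is excluded whatever its MAPQ.
    print(default_filter_excludes(60, FLAG_DUPLICATE))
```

A custom expression passed with -f follows the same convention: it should evaluate to true for reads to discard.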
The project is licensed under the MIT license.
Should you cite biostar78285? https://github.com/mr-c/shouldacite/blob/master/should-I-cite-this-software.md
The current reference is:
http://dx.doi.org/10.6084/m9.figshare.1425030
Lindenbaum, Pierre (2015): JVarkit: java-based utilities for Bioinformatics. figshare. http://dx.doi.org/10.6084/m9.figshare.1425030
$ java -jar dist/jvarkit.jar biostar78285 -m 300 -R ref.fa S*.bam
##fileformat=VCFv4.2
##Biostar78285.SamFilter=record.getMappingQuality()<1 || record.getDuplicateReadFlag() || record.getReadFailsVendorQualityCheckFlag() || record.isSecondaryOrSupplementary()
##FILTER=<ID=DP_LT_300,Description="All genotypes have DP< 300">
##FORMAT=<ID=DF,Number=1,Type=Integer,Description="Number of Reads on plus strand">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
##FORMAT=<ID=DR,Number=1,Type=Integer,Description="Number of Reads on minus strand">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##INFO=<ID=AVG_DP,Number=1,Type=Float,Description="Mean depth">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
##INFO=<ID=FRACT_DP_LT_300,Number=1,Type=Float,Description="Fraction of genotypes having DP< 300">
##INFO=<ID=GC_PERCENT,Number=1,Type=Integer,Description="GC% window_size:20">
##INFO=<ID=MAX_DP,Number=1,Type=Integer,Description="Max depth">
##INFO=<ID=MEDIAN_DP,Number=1,Type=Float,Description="Median depth">
##INFO=<ID=MIN_DP,Number=1,Type=Integer,Description="Min depth">
##INFO=<ID=NUM_DP_LT_300,Number=1,Type=Integer,Description="Number of genotypes having DP< 300">
##contig=<ID=rotavirus,length=1074>
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S1 S2 S3 S4
rotavirus 1 . G . . DP_LT_300 AVG_DP=4.25;DP=17;FRACT_DP_LT_300=1.0;GC_PERCENT=38;MAX_DP=5;MEDIAN_DP=4.50;MIN_DP=3;NUM_DP_LT_300=4 GT:DF:DP:DR ./.:5:5:0 ./.:5:5:0 ./.:3:3:0 ./.:4:4:0
rotavirus 2 . G . . DP_LT_300 AVG_DP=9.50;DP=38;FRACT_DP_LT_300=1.0;GC_PERCENT=40;MAX_DP=14;MEDIAN_DP=8.50;MIN_DP=7;NUM_DP_LT_300=4 GT:DF:DP:DR ./.:14:14:0 ./.:9:9:0 ./.:8:8:0 ./.:7:7:0
rotavirus 3 . C . . DP_LT_300 AVG_DP=12.25;DP=49;FRACT_DP_LT_300=1.0;GC_PERCENT=39;MAX_DP=18;MEDIAN_DP=11.50;MIN_DP=8;NUM_DP_LT_300=4 GT:DF:DP:DR ./.:18:18:0 ./.:11:11:0 ./.:12:12:0 ./.:8:8:0
rotavirus 4 . T . . DP_LT_300 AVG_DP=16.25;DP=65;FRACT_DP_LT_300=1.0;GC_PERCENT=37;MAX_DP=22;MEDIAN_DP=17.00;MIN_DP=9;NUM_DP_LT_300=4 GT:DF:DP:DR ./.:22:22:0 ./.:16:16:0 ./.:18:18:0 ./.:9:9:0
rotavirus 5 . T . . DP_LT_300 AVG_DP=20.75;DP=83;FRACT_DP_LT_300=1.0;GC_PERCENT=40;MAX_DP=27;MEDIAN_DP=21.50;MIN_DP=13;NUM_DP_LT_300=4 GT:DF:DP:DR ./.:27:27:0 ./.:18:18:0 ./.:25:25:0 ./.:13:13:0
rotavirus 6 . T . . DP_LT_300 AVG_DP=24.25;DP=97;FRACT_DP_LT_300=1.0;GC_PERCENT=42;MAX_DP=33;MEDIAN_DP=25.50;MIN_DP=13;NUM_DP_LT_300=4 GT:DF:DP:DR ./.:30:30:0 ./.:21:21:0 ./.:33:33:0 ./.:13:13:0
rotavirus 7 . T . . DP_LT_300 AVG_DP=28.00;DP=112;FRACT_DP_LT_300=1.0;GC_PERCENT=40;MAX_DP=38;MEDIAN_DP=30.00;MIN_DP=14;NUM_DP_LT_300=4 GT:DF:DP:DR ./.:38:38:0 ./.:23:23:0 ./.:37:37:0 ./.:14:14:0
rotavirus 8 . A . . DP_LT_300 AVG_DP=30.50;DP=122;FRACT_DP_LT_300=1.0;GC_PERCENT=42;MAX_DP=41;MEDIAN_DP=33.00;MIN_DP=15;NUM_DP_LT_300=4 GT:DF:DP:DR ./.:41:41:0 ./.:26:26:0 ./.:40:40:0 ./.:15:15:0
rotavirus 9 . A . . DP_LT_300 AVG_DP=34.75;DP=139;FRACT_DP_LT_300=1.0;GC_PERCENT=44;MAX_DP=48;MEDIAN_DP=37.50;MIN_DP=16;NUM_DP_LT_300=4 GT:DF:DP:DR ./.:46:46:0 ./.:29:29:0 ./.:48:48:0 ./.:16:16:0
rotavirus 10 . T . . DP_LT_300 AVG_DP=40.75;DP=163;FRACT_DP_LT_300=1.0;GC_PERCENT=43;MAX_DP=56;MEDIAN_DP=43.00;MIN_DP=21;NUM_DP_LT_300=4 GT:DF:DP:DR ./.:56:56:0 ./.:35:35:0 ./.:51:51:0 ./.:21:21:0
rotavirus 11 . G . . DP_LT_300 AVG_DP=44.75;DP=179;FRACT_DP_LT_300=1.0;GC_PERCENT=45;MAX_DP=58;MEDIAN_DP=49.50;MIN_DP=22;NUM_DP_LT_300=4 GT:DF:DP:DR ./.:58:58:0 ./.:42:42:0 ./.:57:57:0 ./.:22:22:0
rotavirus 12 . C . . DP_LT_300 AVG_DP=48.75;DP=195;FRACT_DP_LT_300=1.0;GC_PERCENT=43;MAX_DP=66;MEDIAN_DP=53.00;MIN_DP=23;NUM_DP_LT_300=4 GT:DF:DP:DR ./.:66:66:0 ./.:46:46:0 ./.:60:60:0 ./.:23:23:0
rotavirus 13 . T . . DP_LT_300 AVG_DP=53.50;DP=214;FRACT_DP_LT_300=1.0;GC_PERCENT=42;MAX_DP=73;MEDIAN_DP=58.50;MIN_DP=24;NUM_DP_LT_300=4 GT:DF:DP:DR ./.:73:73:0 ./.:51:51:0 ./.:66:66:0 ./.:24:24:0
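In the output above, the INFO statistics at each position are consistent with simple summaries of the per-sample FORMAT/DP values. The following Python sketch recomputes them for the first record as a sanity check. It assumes the real output is tab-separated, as in any VCF file, and the function name depth_summary and the key names of the returned dictionary are hypothetical.

```python
from statistics import mean, median


def depth_summary(vcf_line: str, min_depth: int) -> dict:
    """Recompute depth statistics from the per-sample DP of one VCF record."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Column 9 is FORMAT; sample columns start at column 10.
    dp_index = fields[8].split(":").index("DP")
    depths = [int(sample.split(":")[dp_index]) for sample in fields[9:]]
    below = sum(1 for d in depths if d < min_depth)
    return {
        "DP": sum(depths),
        "AVG_DP": mean(depths),
        "MEDIAN_DP": median(depths),
        "MIN_DP": min(depths),
        "MAX_DP": max(depths),
        "NUM_DP_LT": below,
        "FRACT_DP_LT": below / len(depths),
    }


# First record of the example output, rebuilt with tab separators.
FIRST_RECORD = "\t".join([
    "rotavirus", "1", ".", "G", ".", ".", "DP_LT_300",
    "AVG_DP=4.25;DP=17;FRACT_DP_LT_300=1.0;GC_PERCENT=38;MAX_DP=5;"
    "MEDIAN_DP=4.50;MIN_DP=3;NUM_DP_LT_300=4",
    "GT:DF:DP:DR", "./.:5:5:0", "./.:5:5:0", "./.:3:3:0", "./.:4:4:0",
])

if __name__ == "__main__":
    # 300 is the threshold passed with -m in the example command.
    print(depth_summary(FIRST_RECORD, min_depth=300))
```

For this record the recomputed values match the INFO column: a total depth of 17, a mean of 4.25, a median of 4.5, and all four samples below the threshold of 300.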