
Set the FILTER column of a VCF record if it intersects a BED file.
This program is now part of the main jvarkit tool. See jvarkit for compiling.
Usage: java -jar dist/jvarkit.jar vcfbedsetfilter [options] Files
Usage: vcfbedsetfilter [options] Files
Options:
--bcf-output
If this program writes a VCF to a file, the format is first guessed from
the file suffix. Otherwise, force BCF output. The currently supported BCF
version is 2.1, which is not compatible with bcftools/htslib (last
checked 2019-11-15).
Default: false
-e, --exclude, --blacklist
Tribble or Tabix bed file containing the regions to be FILTERED. Must be
indexed with tribble or tabix, or use '--fast' to load in memory.
-x, --extend
Extend the variant coordinates by 'x' bases. A distance specified as a
positive integer. Commas are removed. The following suffixes are
interpreted: b,bp,k,kb,m,mb,g,gb
Default: 0
-f, --filter
FILTER name. If `--filter` is empty, FILTERED variants will be discarded.
Default: VCFBED
--generate-vcf-md5
Generate MD5 checksum for VCF output.
Default: false
-h, --help
print help and exit
--helpFormat
What kind of help. One of [usage,markdown,xml].
--fast, --memory
Load the bed in memory (faster than tribble/tabix but memory consuming).
Default: false
--min-bed-fraction
Min BED fraction overlap after extension. Only consider a BED record if
the variant overlaps at least a fraction 'x' of the BED record's length.
A decimal number between 0.0 and 1.0. If the value ends with '%' it is
interpreted as a percentage, e.g. '1%' => '0.01'. A slash '/' is
interpreted as a ratio, e.g. '1/100' => '0.01'.
--min-vc-fraction
Min Variant fraction overlap after extension. Only consider a BED record
if it overlaps at least a fraction 'x' of the variant's length. A decimal
number between 0.0 and 1.0. If the value ends with '%' it is interpreted
as a percentage, e.g. '1%' => '0.01'. A slash '/' is interpreted as a
ratio, e.g. '1/100' => '0.01'.
-M, --mutual-fraction
Mutual comparator. Two SVs are the same if they share a fraction 'x'
of their bases. For very small SVs the fraction can be quite small, while
for large SVs the fraction should be close to 1. The syntax is the
following: (<MAX_SIZE_INCLUSIVE>:<FRACTION as double or percent>;)+ .
For example, if the SV has a size of 99bp, the fraction used will be 0.6
for '10:0.1;100:0.6;1000:0.9'. For the smallest size, a simple overlap
is a positive match. Example: "10:0.5;100:75%;1000:80%;10000:90%".
-o, --out
Output file. Optional. Default: stdout
--version
print version and exit
-i, --include, --whitelist
Tribble or Tabix bed file containing the regions to be accepted. Regions
NOT overlapping those regions will be FILTERED. Must be indexed with
tribble or tabix, or use '--fast' to load in memory.
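The value syntaxes accepted by `--extend`, `--min-bed-fraction`, `--min-vc-fraction` and `--mutual-fraction` can be illustrated with a short Python sketch. This is not jvarkit's code: the helper names `parse_distance`, `parse_fraction` and `mutual_fraction_for` are hypothetical. The sketch assumes decimal multipliers (k = 1,000), case-insensitive suffixes, and that an SV larger than the largest threshold reuses the last fraction. It does not model the note that the smallest size class matches on any overlap.

```python
# Assumption: decimal multipliers; two-letter suffixes are listed first so
# that 'kb' is matched before 'b'.
_SUFFIXES = {"gb": 10**9, "mb": 10**6, "kb": 10**3, "bp": 1,
             "g": 10**9, "m": 10**6, "k": 10**3, "b": 1}


def parse_distance(text):
    """Parse a distance such as '1,000', '10kb' or '2mb' into bases."""
    text = text.replace(",", "").strip().lower()
    for suffix, factor in _SUFFIXES.items():
        if text.endswith(suffix):
            return int(text[:-len(suffix)]) * factor
    return int(text)


def parse_fraction(text):
    """Parse '0.01', '1%' or '1/100' into a float between 0.0 and 1.0."""
    text = text.strip()
    if text.endswith("%"):
        value = float(text[:-1]) / 100.0
    elif "/" in text:
        numerator, denominator = text.split("/", 1)
        value = float(numerator) / float(denominator)
    else:
        value = float(text)
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"fraction out of range: {text!r}")
    return value


def mutual_fraction_for(spec, sv_size):
    """Pick the fraction for an SV of length sv_size from a spec made of
    MAX_SIZE_INCLUSIVE:FRACTION pairs, e.g. '10:0.1;100:0.6;1000:0.9'."""
    pairs = []
    for item in spec.split(";"):
        if not item.strip():
            continue  # tolerate a trailing ';'
        max_size, fraction = item.split(":", 1)
        pairs.append((int(max_size), parse_fraction(fraction)))
    pairs.sort()
    for max_size, fraction in pairs:
        if sv_size <= max_size:
            return fraction
    # Assumption: sizes above the largest threshold reuse the last fraction.
    return pairs[-1][1]


print(parse_distance("10kb"))                                  # 10000
print(parse_fraction("1%"))                                    # 0.01
print(mutual_fraction_for("10:0.1;100:0.6;1000:0.9", 99))      # 0.6
```

The last call reproduces the 99bp example given for `--mutual-fraction`: 99 exceeds the first threshold (10) and falls under the second (100), so the fraction is 0.6.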
20150415
The project is licensed under the MIT license.
Should you cite vcfbedsetfilter? https://github.com/mr-c/shouldacite/blob/master/should-I-cite-this-software.md
The current reference is:
http://dx.doi.org/10.6084/m9.figshare.1425030
Lindenbaum, Pierre (2015): JVarkit: java-based utilities for Bioinformatics. figshare. http://dx.doi.org/10.6084/m9.figshare.1425030
$ java -jar jvarkit.jar vcfbedsetfilter -f MYFILTER --include database.bed in.vcf
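With `--include`, the command above tags variants that do not overlap any region of database.bed with MYFILTER. With `--exclude`, the overlapping variants are tagged instead. The following Python sketch shows that decision under simplifying assumptions and is not jvarkit's implementation: the function names and the dict-based variant are hypothetical, regions are held in memory (comparable in spirit to `--fast`), coordinates are treated as 1-based inclusive, and the fraction options are ignored.

```python
def overlaps(start1, end1, start2, end2):
    """True if two closed intervals share at least one base."""
    return start1 <= end2 and start2 <= end1


def apply_bed_filter(variant, regions, mode, filter_name="VCFBED", extend=0):
    """Return the variant with FILTER updated, or None if it is discarded.

    variant: dict with 'chrom', 'start', 'end' (1-based, inclusive)
             and 'filters' (a set of FILTER names)
    regions: list of (chrom, start, end) tuples, 1-based inclusive
    mode:    'exclude' (overlapping variants are FILTERED) or
             'include' (non-overlapping variants are FILTERED)
    """
    if mode not in ("exclude", "include"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Extend the variant on both sides, as described for --extend.
    start = max(1, variant["start"] - extend)
    end = variant["end"] + extend
    hit = any(
        chrom == variant["chrom"] and overlaps(start, end, r_start, r_end)
        for chrom, r_start, r_end in regions
    )
    filtered = hit if mode == "exclude" else not hit
    if not filtered:
        return variant
    if not filter_name:
        return None  # empty --filter: the FILTERED variant is dropped
    return {**variant, "filters": variant["filters"] | {filter_name}}


regions = [("chr1", 100, 200)]
inside = {"chrom": "chr1", "start": 150, "end": 150, "filters": set()}
outside = {"chrom": "chr1", "start": 500, "end": 500, "filters": set()}
print(apply_bed_filter(inside, regions, "include", "MYFILTER")["filters"])   # set()
print(apply_bed_filter(outside, regions, "include", "MYFILTER")["filters"])  # {'MYFILTER'}
```

Passing an empty `filter_name` makes the function return `None` for a FILTERED variant, mirroring the documented behaviour of an empty `--filter`.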