introduce artificial mutation SNV in bam
This program is now part of the main jvarkit
tool. See jvarkit for compiling.
Usage: java -jar dist/jvarkit.jar biostar404363 [options] Files
Usage: biostar404363 [options] Files
Options:
--bamcompression
Compression Level. 0: no compression. 9: max compression;
Default: 5
--disable-nm
Disable NM change. By default the value of NM is increasing to each
change.
Default: false
-h, --help
print help and exit
--helpFormat
What kind of help. One of [usage,markdown,xml].
--ignore-clip
Ignore clipped section of the reads.
Default: false
-o, --output
Output file. Optional . Default: stdout
* -p, --position, --vcf
VCF File containing the positions to change. if INFO/AF(allele
frequency) field is present, variant is inserted if rand()<= AF.
-R, --reference
For reading/writing CRAM files. Indexed fasta Reference file. This file
must be indexed with samtools faidx and with picard/gatk
CreateSequenceDictionary or samtools dict
--samoutputformat
Sam output format.
Default: SAM
Possible Values: [BAM, SAM, CRAM]
--version
print version and exit
20191023
The project is licensed under the MIT license.
Should you cite biostar404363 ? https://github.com/mr-c/shouldacite/blob/master/should-I-cite-this-software.md
The current reference is:
http://dx.doi.org/10.6084/m9.figshare.1425030
Lindenbaum, Pierre (2015): JVarkit: java-based utilities for Bioinformatics. figshare. http://dx.doi.org/10.6084/m9.figshare.1425030
major bug detected after https://github.com/lindenb/jvarkit/issues/178 . Some reads with INDEL might have been ignored.
##Example
# look at the original bam at position 14:79838836
$ samtools tview -d T -p 14:79838836 src/test/resources/HG02260.transloc.chr9.14.bam | cut -c 1-20
79838841 79838
NNNNNNNNNNNNNNNNNNNN
CATAGGAAAACTAAAGGCAA
cataggaaaactaaaggcaa
cataggaaaaCTAAAGGCAA
cataggaaaactaaaggcaa
CATAGGAAAACTAAAGGCAA
cataggaaaactaaaggcaa
CATAGGAAAACTAAAGGCAA
CATAGGAAAACTAAAGGCAA
CATAGGAAAACTAAAGGCAA
CATAGGAAAACTAAAGGCAA
CATAGGAAAACTAAAGGCAA
taaaggcaa
# look at the VCF , we want to change 14:79838836 with 'A' with probability = 0.5
$ cat jeter.vcf
##fileformat=VCFv4.2
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency among genotypes, for each ALT allele, in the same order as listed">
#CHROM POS ID REF ALT QUAL FILTER INFO
14 79838836 . N A . . AF=0.5
# apply biostar404363
$ java -jar dist/biostar404363.jar -o jeter.bam -p jeter.vcf src/test/resources/HG02260.transloc.chr9.14.bam
# index the sequence
$ samtools index jeter.bam
# view the result (half of the bases were replaced because AF=0.5 in the VCF)
$ samtools tview -d T -p 14:79838836 jeter.bam | cut -c 1-20
79838841 79838
NNNNNNNNNNNNNNNNNNNN
MATAGGAAAACTAAAGGCAA
cataggaaaactaaaggcaa
aataggaaaaCTAAAGGCAA
cataggaaaactaaaggcaa
CATAGGAAAACTAAAGGCAA
aataggaaaactaaaggcaa
AATAGGAAAACTAAAGGCAA
AATAGGAAAACTAAAGGCAA
CATAGGAAAACTAAAGGCAA
CATAGGAAAACTAAAGGCAA
AATAGGAAAACTAAAGGCAA
taaaggcaa