remove duplicated names in sorted BAM
This program is now part of the main jvarkit
tool. See jvarkit for compiling.
Usage: java -jar dist/jvarkit.jar samrmdupnames [options] Files
Usage: samrmdupnames [options] Files
Options:
--bamcompression
Compression Level. 0: no compression. 9: max compression;
Default: 5
-h, --help
print help and exit
--helpFormat
What kind of help. One of [usage,markdown,xml].
-o, --out
Output file. Optional . Default: stdout
-R, --reference
Indexed fasta Reference file. This file must be indexed with samtools
faidx and with picard/gatk CreateSequenceDictionary or samtools dict
--report
Only report duplicate names in the output bam file
Default: false
--samoutputformat
Sam output format.
Default: SAM
Possible Values: [BAM, SAM, CRAM]
--validation-stringency
SAM Reader Validation Stringency
Default: LENIENT
Possible Values: [STRICT, LENIENT, SILENT]
--version
print version and exit
20221207
The project is licensed under the MIT license.
Should you cite samrmdupnames ? https://github.com/mr-c/shouldacite/blob/master/should-I-cite-this-software.md
The current reference is:
http://dx.doi.org/10.6084/m9.figshare.1425030
Lindenbaum, Pierre (2015): JVarkit: java-based utilities for Bioinformatics. figshare. http://dx.doi.org/10.6084/m9.figshare.1425030
java -jar dist/samgrep.jar -R r001 -- samtools-0.1.18/examples/toy.sam
@HD VN:1.4 SO:unsorted
@SQ SN:ref LN:45
@SQ SN:ref2 LN:40
@PG ID:0 PN:com.github.lindenb.jvarkit.tools.samgrep.SamGrep VN:dac03b80e9fd88a15648b22550e57d10c9bed725 CL:-R r001 samtools-0.1.18/examples/toy.sam
r001 163 ref 7 30 8M4I4M1D3M = 37 39 TTAGATAAAGAGGATACTG * XX:B:S,12561,2,20,112
r001 83 ref 37 30 9M = 7 -39 CAGCGCCAT *
java -jar dist/samgrep.jar -R r001 -- -n 1 samtools-0.1.18/examples/toy.sam
@HD VN:1.4 SO:unsorted
@SQ SN:ref LN:45
@SQ SN:ref2 LN:40
@PG ID:0 PN:com.github.lindenb.jvarkit.tools.samgrep.SamGrep VN:dac03b80e9fd88a15648b22550e57d10c9bed725 CL:-R r001 -n 1 samtools-0.1.18/examples/toy.sam
r001 163 ref 7 30 8M4I4M1D3M = 37 39 TTAGATAAAGAGGATACTG * XX:B:S,12561,2,20,112
I got a BAM file with the same read duplicated. They have the same position, the same flags