Make a BAM file smaller by removing unwanted information. See also https://www.biostars.org/p/173114/
This program is now part of the main jvarkit tool. See jvarkit for compiling.
Usage: java -jar dist/jvarkit.jar biostar173114 [options] Files
Usage: biostar173114 [options] Files
Options:
  --bamcompression
    Compression Level. 0: no compression. 9: max compression;
    Default: 9
  -h, --help
    print help and exit
  --helpFormat
    What kind of help. One of [usage,markdown,xml].
  -keepAtt, --keepAttributes
    keep Attributes
    Default: false
  -keepCigar, --keepCigar
    keep cigar : don't remove hard clip
    Default: false
  -keepName, --keepName, --name
    keep Read Name, do not try to create a shorter name
    Default: false
  -keepQuals, --keepQualities
    keep base qualities
    Default: false
  -keepRG, --keepReadGroup
    if attributes are removed, keep the RG
    Default: false
  -keepSeq, --keepSequence
    keep read sequence
    Default: false
  -mate, --mate
    keep Mate/Paired Information
    Default: false
  -o, --output
    Output file. Optional. Default: stdout
  -R, --reference
    For reading/writing CRAM files. Indexed fasta Reference file. This file
    must be indexed with samtools faidx and with picard/gatk
    CreateSequenceDictionary or samtools dict
  --samoutputformat
    Sam output format.
    Default: SAM
    Possible Values: [BAM, SAM, CRAM]
  --version
    print version and exit
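
As a usage illustration, the options above can be combined on one command line. The following Python sketch only assembles the argument list and does not run anything. The helper name `build_biostar173114_command`, its parameters, and the file names are hypothetical; the jar path `dist/jvarkit.jar` is taken from the usage line, and the CRAM check simply mirrors the help text for `-R`.

```python
from typing import List, Optional


def build_biostar173114_command(
    input_path: str,
    output_path: Optional[str] = None,
    output_format: str = "SAM",
    keep_sequence: bool = False,
    keep_qualities: bool = False,
    keep_read_group: bool = False,
    reference: Optional[str] = None,
    jar: str = "dist/jvarkit.jar",  # assumed location, as in the usage line
) -> List[str]:
    """Assemble (but do not run) a biostar173114 command line.

    Hypothetical helper: only flags listed in the help text are used.
    """
    if output_format not in ("BAM", "SAM", "CRAM"):
        raise ValueError("output_format must be BAM, SAM or CRAM")
    if output_format == "CRAM" and reference is None:
        # The help text says an indexed FASTA is used for reading/writing CRAM.
        raise ValueError("CRAM output needs an indexed reference (-R)")

    cmd = ["java", "-jar", jar, "biostar173114"]
    if keep_sequence:
        cmd.append("--keepSequence")
    if keep_qualities:
        cmd.append("--keepQualities")
    if keep_read_group:
        cmd.append("--keepReadGroup")
    cmd += ["--samoutputformat", output_format]
    if reference is not None:
        cmd += ["--reference", reference]
    if output_path is not None:
        cmd += ["--output", output_path]
    cmd.append(input_path)
    return cmd


if __name__ == "__main__":
    # Illustrative file names only.
    print(" ".join(build_biostar173114_command(
        "my.bam", output_path="small.bam", output_format="BAM",
        keep_sequence=True)))
```

The returned list can be passed to `subprocess.run` once the jar has been compiled; leaving `output_path` unset keeps the default of writing to stdout.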
The project is licensed under the MIT license.
Should you cite biostar173114? See https://github.com/mr-c/shouldacite/blob/master/should-I-cite-this-software.md
The current reference is:
http://dx.doi.org/10.6084/m9.figshare.1425030
Lindenbaum, Pierre (2015): JVarkit: java-based utilities for Bioinformatics. figshare. http://dx.doi.org/10.6084/m9.figshare.1425030
$ java -jar dist/jvarkit.jar biostar173114 --keepSequence my.bam
@HD VN:1.5 GO:none SO:coordinate
@SQ SN:rotavirus LN:1074
R0 0 rotavirus 1 60 70M * 0 0 GGCTTTTAATGCTTTTCAGTGGTTGCTGCTCAATATGGCGTCAACTCAGCAGATGGTCAGCTCTAATATT *
R1 0 rotavirus 1 60 70M * 0 0 GGCTTTTACTGCTTTTCAGTGGTTGCTTCTCAAGATGGAGTGTACTCATCAGATGGTAAGCTCTATTATT *
R2 0 rotavirus 1 60 70M * 0 0 GGCTTTTAATGCTTTTCATTTGATGCTGCTCAAGATGGAGTCTACACAGCAGATGGTCAGCTCTATTATT *
R3 0 rotavirus 1 60 70M * 0 0 GGCTTTTAATGCTTTTCAGTGGTTGCTGCTCAAGATGGAGTCTCCTGAGCAGCTGGTAAGCTCTATTATT *
R4 0 rotavirus 1 60 70M * 0 0 GGCATTTAATGCTTAACAGTGGTTGCTGCTCAAGATGGAGTCTACTCAGCAGATGGTAAGCTCTCTTATT *
R5 0 rotavirus 2 60 70M * 0 0 GCTTTTAAAGCTTTTCAGTTGTTGCTGCTCAAGATGGAGTCTACTCAGCAGATGTTACGCTCTATTATTA *
R6 0 rotavirus 2 60 51M19S * 0 0 GCTTTTAATGCTTTTCAGTTGTTGCTGCACAAGATGGAGTCTACACAGCAGCTGTTCATCTCTCTTCATC *
R7 0 rotavirus 2 60 70M * 0 0 GCTTTTAATGCTTTTCAGTGGTTTCTTCTCACGATGGAGTCTACTCAGCAGAAGGTAAGCACTATTATTA *
R8 0 rotavirus 2 60 70M * 0 0 GCTTTTAAAGCATTACAGTTGTTGCAGCTCAAGAAGGAGACTACTCAGCAGATGGTAAGCTCTATAATTA *
R9 0 rotavirus 2 60 70M * 0 0 GCTTTTAATTCTATTCAGTGGTTGCTGCTCCAGAAGGAGTCTACTCAGGAGATGGTACGCTCTCTTATTA *
Ra 0 rotavirus 2 60 70M * 0 0 GCTTTTAATGCTTTTCAGTGGTTGCTGCTCAAGATGGAGTCTACTCAGCAGATGGTAAGCTCAATTATTA *
Rb 0 rotavirus 2 60 70M * 0 0 GCTTTTAATGCTTTTCAGTTGTAGCTGCTCAAGATGGAGTCTACTCATCAGATGGTAAGCTCTCTTCTTA *
Rc 0 rotavirus 2 60 63M7S * 0 0 TCTTTAAATGCTTTTCAGTGTTTGCTGCTCAAGATGGAGTCTACTCAGCAGAAGGTAAGCTCTCTTAAAC *
Rd 0 rotavirus 2 60 66M4S * 0 0 GCATTTAATGCTTTTCAGTGGTTGCTGCACAAGATGGAGTCTACTCAGCAGATTGTAAGCTCTATTCTAA *
Re 0 rotavirus 3 60 70M * 0 0 CTTTTAATGCTTTTCAGTGGTTGCTGCTCAAGAAGGCGTCTCCTGATGAGATGGTAAGCTCTATTATTAA *
Rf 0 rotavirus 3 60 70M * 0 0 CTTTTAATGGTTATGAGTGGTTGGTGCACAAGATGGAGTCTACTCAGCAGATGGTACTCTCTATAATTAA *
R10 0 rotavirus 3 60 70M * 0 0 CTTTTAAAGCTTTTCAGTGGTTGCTGCTCAAGATGGAGTCTACTCAGCAGATGGTACTCTCTATTCTTAA *
R11 0 rotavirus 3 60 70M * 0 0 CTTTTAAAGCTTTTCAGAGGTTGCTGCTCAAGATGTAGTCTACTCAGGAGATTGTAAGCTCTATTATTAA
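
The records above suggest what the default stripping looks like: read names become `R` followed by a hexadecimal counter, mate fields are reset to `* 0 0`, qualities become `*`, and optional attributes are dropped. A minimal Python sketch of that transformation on one tab-delimited SAM line follows. It is not jvarkit's implementation: the function name `strip_sam_line`, its parameters, and the handling of the FLAG field (clearing the mate-related bits) are assumptions made for illustration.

```python
import re

# SAM FLAG bits that describe pairing/mate state (per the SAM specification).
MATE_FLAG_BITS = 0x1 | 0x2 | 0x8 | 0x20 | 0x40 | 0x80


def strip_sam_line(line, index, keep_seq=False, keep_quals=False,
                   keep_name=False, keep_mate=False, keep_cigar=False,
                   keep_attributes=False, keep_read_group=False):
    """Return a reduced copy of one tab-delimited SAM alignment line.

    Hypothetical sketch modeled on the example output, not jvarkit's code.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has at least 11 columns")
    (qname, flag, rname, pos, mapq, cigar,
     rnext, pnext, tlen, seq, qual) = fields[:11]
    attributes = fields[11:]

    if not keep_name:
        # Short name: 'R' + hexadecimal counter (R0 ... R9, Ra ... Rf, R10).
        qname = "R%x" % index
    if not keep_mate:
        # Assumption: mate-related FLAG bits are cleared with the mate fields.
        flag = str(int(flag) & ~MATE_FLAG_BITS)
        rnext, pnext, tlen = "*", "0", "0"
    if not keep_cigar:
        # Drop hard-clip operators; soft clips stay, as in the example output.
        cigar = re.sub(r"\d+H", "", cigar) or "*"
    if not keep_seq:
        seq = "*"
    if not keep_quals or seq == "*":
        # QUAL cannot be stored without SEQ.
        qual = "*"
    if not keep_attributes:
        # Optionally keep only the read group tag.
        attributes = [a for a in attributes
                      if keep_read_group and a.startswith("RG:")]

    core = [qname, flag, rname, pos, mapq, cigar,
            rnext, pnext, tlen, seq, qual]
    return "\t".join(core + attributes)


if __name__ == "__main__":
    # Made-up input record, for illustration only.
    example = "read1\t99\trotavirus\t1\t60\t2H4M\t=\t100\t170\tACGT\tIIII\tNM:i:0"
    print(strip_sam_line(example, 10, keep_seq=True))
```

Renaming by counter gives each record its own name, so this sketch would break the shared name of a read pair; that only matters when mate information is kept.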