Creates an archive of small bams with only a few regions.


This program is now part of the main jvarkit tool. See jvarkit for compiling.

Usage: java -jar dist/jvarkit.jar mkminibam  [options] Files

Usage: mkminibam [options] Files
      [20190427] When reading a VCF file, don't get the mate position for the 
      structural BND variants.
      Default: false
    -b, --bounds, --edge
      [20190427] If `b` is greater than 0 and the user interval has a length 
      greater than `b`, then consider the edges of the object as two positions. 
      The idea is to save just the boundaries of a large deletion. A distance 
      specified as a positive integer. Commas are removed. The following 
      suffixes are interpreted: b,bp,k,kb,m,mb,g,gb
      Default: -1
    -C, --comment
      [20190427] Add a file '*.md' with this comment.
      Default: <empty string>
    -x, --extend
      Extend the positions by 'x' bases. A distance specified as a positive 
      integer. Commas are removed. The following suffixes are interpreted: 
      b,bp,k,kb,m,mb,g,gb
      Default: 5000
      A filter expression. Reads matching the expression will be filtered-out. 
      Empty String means 'filter out nothing/Accept all'. See https://github.com/lindenb/jvarkit/blob/master/src/main/resources/javacc/com/github/lindenb/jvarkit/util/bio/samfilter/SamFilterParser.jj 
      for a complete syntax. 'default' is 'mapqlt(1) || Duplicate() || 
      FailsVendorQuality() || NotPrimaryAlignment() || 
      Default: Accept All/ Filter out nothing
    -h, --help
      print help and exit
    --helpFormat
      What kind of help. One of [usage,markdown,xml].
      [20191129] Allow no sample / no read group: use the filename instead
      Default: false
  * -o, --output
      An existing directory or a filename ending with the '.zip' or '.tar' or 
      '.tar.gz' suffix.
      File prefix in the archive. Special value 'now' or empty string will be 
      replaced by the current date
      Default: miniBam.
    -R, --reference
      Optional Reference file for CRAM files. Multiple allowed. Indexed fasta 
      Reference file. This file must be indexed with samtools faidx and with 
      picard/gatk CreateSequenceDictionary or samtools dict
      Default: []
    -T, --tmp
      Tmp working directory
      Default: /tmp
  * -B, --bed, -p, --pos, -V, --variant, --vcf
      A source of intervals. The following suffixes are recognized: vcf, 
      vcf.gz, bed, bed.gz, gtf, gff, gff.gz, gtf.gz. Otherwise it could be an 
      empty string (no interval) or a list of plain interval separated by '[ 
      Default: (unspecified)
    --version
      print version and exit
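
The `-b` and `-x` descriptions above can be illustrated with a small bash sketch. It is not jvarkit's code: `parse_distance`, `edge_positions` and `extend_interval` are hypothetical helpers, the multipliers (k = 1,000, m = 1,000,000, g = 1,000,000,000) are an assumption, and so is clamping the extended start at position 1.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the option text above, not jvarkit's actual parser.

# parse_distance STRING -> prints the distance in bases.
# ASSUMPTION: decimal multipliers (k/kb = 1000, m/mb = 10^6, g/gb = 10^9).
parse_distance() {
  local s=${1//,/}   # commas are removed, e.g. "1,500" -> "1500"
  local factor=1
  case "$s" in
    *gb) factor=1000000000; s=${s%gb} ;;
    *mb) factor=1000000;    s=${s%mb} ;;
    *kb) factor=1000;       s=${s%kb} ;;
    *bp) s=${s%bp} ;;
    *g)  factor=1000000000; s=${s%g} ;;
    *m)  factor=1000000;    s=${s%m} ;;
    *k)  factor=1000;       s=${s%k} ;;
    *b)  s=${s%b} ;;
  esac
  echo $(( s * factor ))
}

# edge_positions START END BOUNDS -> prints one region per line.
# If BOUNDS > 0 and the interval is longer than BOUNDS, only the two
# boundaries are kept; otherwise the whole interval is kept.
edge_positions() {
  local start=$1 end=$2 bounds
  bounds=$(parse_distance "$3")
  if (( bounds > 0 && end - start + 1 > bounds )); then
    echo "$start-$start"
    echo "$end-$end"
  else
    echo "$start-$end"
  fi
}

# extend_interval CONTIG START END X -> prints CONTIG:START-END widened by X.
# ASSUMPTION: the start is clamped at 1; the end is not clamped because the
# contig length is unknown in this sketch.
extend_interval() {
  local contig=$1 start=$2 end=$3 x
  x=$(parse_distance "$4")
  local new_start=$(( start - x ))
  if (( new_start < 1 )); then
    new_start=1
  fi
  echo "${contig}:${new_start}-$(( end + x ))"
}
```

Under these assumptions, `edge_positions 1000 900000 10kb` prints the two boundary positions on separate lines, while `edge_positions 1000 900000 -1` (the default) keeps the whole interval.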


The project is licensed under the MIT license.


Should you cite mkminibam? See https://github.com/mr-c/shouldacite/blob/master/should-I-cite-this-software.md

The current reference is:


Lindenbaum, Pierre (2015): JVarkit: java-based utilities for Bioinformatics. figshare. http://dx.doi.org/10.6084/m9.figshare.1425030


BAM files are too big, and my users often ask to visualize a small region of a set of BAM files.


The input is a set of BAM files, or a file with the suffix '.list' containing one path to a BAM file per line.
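
Because a '.list' file is just one path per line, it can be checked before running the tool. The bash sketch below is an optional pre-flight check, not part of mkminibam; `check_bam_list` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: verifies that every non-empty line of a '.list' file
# names an existing regular file. Not part of jvarkit.

# check_bam_list FILE -> returns 0 if all paths exist, 1 otherwise.
check_bam_list() {
  local list=$1 path status=0
  # "|| [ -n "$path" ]" also handles a last line without a trailing newline
  while IFS= read -r path || [ -n "$path" ]; do
    [ -z "$path" ] && continue          # skip blank lines
    if [ ! -f "$path" ]; then
      echo "missing: $path" >&2         # report each missing path on stderr
      status=1
    fi
  done < "$list"
  return "$status"
}
```

For example, `check_bam_list bams.list && echo "list looks fine"` prints the message only when every listed path exists.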


$ find src/test/resources/ -name "S*.bam" > bams.list
$ java -jar dist/jvarkit.jar mkminibam -p "RF01:100" -o out.zip bams.list

$ unzip -t out.zip 

Archive:  out.zip
    testing: miniBam.S5.bam           OK
    testing: miniBam.S5.bai           OK
    testing: miniBam.S2.bam           OK
    testing: miniBam.S2.bai           OK
    testing: miniBam.S4.bam           OK
    testing: miniBam.S4.bai           OK
    testing: miniBam.S3.bam           OK
    testing: miniBam.S3.bai           OK
    testing: miniBam.S1.bam           OK
    testing: miniBam.S1.bai           OK
No errors detected in compressed data of out.zip.