Pad empty fastq sequence/qual with N/#
Deprecated: use awk instead.
Usage: pademptyfastq [options] Files
Options:
  -h, --help
      print help and exit
  --helpFormat
      What kind of help. One of [usage,markdown,xml].
  -o, --out
      Output file. Optional. Default: stdout
  --version
      print version and exit
  -N
      number of bases/qual to be added. -1=length of the first read
      Default: -1
${PATH}. Setting JAVA_HOME is not enough (e.g. https://github.com/lindenb/jvarkit/issues/23).
$ git clone "https://github.com/lindenb/jvarkit.git"
$ cd jvarkit
$ ./gradlew pademptyfastq
The java jar file will be installed in the dist directory.
The project is licensed under the MIT license.
Should you cite pademptyfastq? See https://github.com/mr-c/shouldacite/blob/master/should-I-cite-this-software.md
The current reference is:
http://dx.doi.org/10.6084/m9.figshare.1425030
Lindenbaum, Pierre (2015): JVarkit: java-based utilities for Bioinformatics. figshare. http://dx.doi.org/10.6084/m9.figshare.1425030
compiled version of awk:
{if(NR%4==2 && length($0)==0) { printf("NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN\n");} else if(NR%4==0 && length($0)==0) { printf("##################################################\n");} else {print;}}
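The awk program above hard-codes its pad strings. As a minimal sketch, the same idea can take the pad length as a parameter, roughly like the -N option. The function name pad_empty_fastq and the default of 50 are assumptions made for illustration and are not part of jvarkit. The sketch also assumes strict 4-line FASTQ records and does not implement the "-1=length of the first read" behaviour.

```shell
# Hypothetical helper (not part of jvarkit): pad empty FASTQ sequence and
# quality lines read from stdin with N/# repeated "n" times (default 50).
pad_empty_fastq() {
  awk -v n="${1:-50}" '
    BEGIN {
      # Build the two pad strings once, before any input is read.
      for (i = 0; i < n; i++) { seq = seq "N"; qual = qual "#" }
    }
    NR % 4 == 2 && length($0) == 0 { print seq;  next }  # empty sequence line
    NR % 4 == 0 && length($0) == 0 { print qual; next }  # empty quality line
    { print }                                            # all other lines unchanged
  '
}
```

For instance, gunzip -c in.fastq.gz | pad_empty_fastq 30 would write the padded records to stdout, leaving non-empty reads untouched.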
$ cutadapt -a AGATCGGAAGAGCGTCGT 2> /dev/null in.fastq.gz |\
java -jar dist/pademptyfastq.jar -o pad.fastq.gz
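Adapter trimming, as in the cutadapt command above, can leave reads with an empty sequence, which is the case this tool pads. One way to check whether any such reads exist, before or after padding, is to count them. This is a sketch only: count_empty_reads is a hypothetical helper, not part of jvarkit, and it assumes strict 4-line FASTQ records on stdin.

```shell
# Hypothetical helper (not part of jvarkit): count FASTQ records on stdin
# whose sequence line (2nd line of each 4-line record) is empty.
count_empty_reads() {
  # "n + 0" forces a numeric 0 when no empty read was seen.
  awk 'NR % 4 == 2 && length($0) == 0 { n++ } END { print n + 0 }'
}
```

Running gunzip -c pad.fastq.gz | count_empty_reads on the padded output should then print 0 if every empty read was filled.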