Mapping a mutation on a protein back to the genome.
This program is now part of the main jvarkit tool.
See jvarkit for compiling.
Usage: java -jar dist/jvarkit.jar backlocate [options] Files
Usage: backlocate [options] Files
Options:
* -g, --gtf
A GTF (General Transfer Format) file. See
https://www.ensembl.org/info/website/upload/gff.html . Please note that
CDS are only detected if both start and stop codons are defined.
-h, --help
print help and exit
--helpFormat
What kind of help. One of [usage,markdown,xml].
-o, --out
Output file. Optional. Default: stdout
-p, --printSeq
print mRNA & protein sequences
Default: false
* -R, --reference
Indexed fasta Reference file. This file must be indexed with samtools
faidx and with picard/gatk CreateSequenceDictionary or samtools dict
--version
print version and exit
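
The -R/--reference option above expects a FASTA file that already has a faidx index and a sequence dictionary. A minimal sketch of preparing both with samtools follows; the helper name `index_reference` and the file name `hg19.fa` are only illustrative, and the dictionary name is assumed to follow the usual convention of replacing the FASTA extension with `.dict`.

```shell
# Hypothetical helper: build the two indexes mentioned under -R/--reference.
index_reference() {
  ref="$1"
  # FASTA index, written next to the reference as "$ref.fai"
  samtools faidx "$ref"
  # Sequence dictionary, e.g. hg19.fa -> hg19.dict (assumed naming convention)
  samtools dict -o "${ref%.*}.dict" "$ref"
}

# Usage (assumes samtools is on the PATH):
#   index_reference hg19.fa
```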
20140619
The project is licensed under the MIT license.
Should you cite backlocate? See https://github.com/mr-c/shouldacite/blob/master/should-I-cite-this-software.md
The current reference is:
http://dx.doi.org/10.6084/m9.figshare.1425030
Lindenbaum, Pierre (2015): JVarkit: java-based utilities for Bioinformatics. figshare. http://dx.doi.org/10.6084/m9.figshare.1425030
Example: mutation P->M at position 1090 in NOTCH2
$ echo -e "NOTCH2\tP1090M" | java -jar dist/backlocate.jar -R hg19.fa --gtf ucsc.gtf
(...)
#User.Gene AA1 petide.pos.1 AA2 knownGene.name knownGene.strand knownGene.AA index0.in.rna codon base.in.rna chromosome index0.in.genomic exon
##uc001eik.3
NOTCH2 P 1090 M uc001eik.3 NEGATIVE P 3267 CCA C chr1 120480548 Exon 20
NOTCH2 P 1090 M uc001eik.3 NEGATIVE P 3268 CCA C chr1 120480547 Exon 20
NOTCH2 P 1090 M uc001eik.3 NEGATIVE P 3269 CCA A chr1 120480546 Exon 20
##uc001eil.3
NOTCH2 P 1090 M uc001eil.3 NEGATIVE P 3267 CCA C chr1 120480548 Exon 20
NOTCH2 P 1090 M uc001eil.3 NEGATIVE P 3268 CCA C chr1 120480547 Exon 20
NOTCH2 P 1090 M uc001eil.3 NEGATIVE P 3269 CCA A chr1 120480546 Exon 20
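
The index0.in.rna values in the output above are consistent with plain codon arithmetic: a 1-based peptide position p covers the 0-based offsets 3*(p-1), 3*(p-1)+1 and 3*(p-1)+2 in the coding sequence. The sketch below only reproduces that arithmetic; the function name `codon_offsets` is hypothetical and this is not the tool's own code.

```shell
# Hypothetical helper: print the three 0-based coding-sequence offsets
# of the codon at a given 1-based peptide position.
codon_offsets() {
  pos="$1"
  first=$(( (pos - 1) * 3 ))
  echo "$first"
  echo "$(( first + 1 ))"
  echo "$(( first + 2 ))"
}

codon_offsets 1090   # prints 3267, 3268 and 3269, one per line
```

Because the transcript is on the negative strand, the matching genomic positions in the output decrease (120480548, 120480547, 120480546) as the offset increases.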
$ echo -e "NOTCH2\tPro1090M\tInteresting" | java -jar dist/backlocate.jar --gtf ucsc.gtf -R /path/to/human_g1k_v37.fasta | grep -v "##" | java -jar dist/prettytable.jar
+------------+-----+--------------+-----+----------------+------------------+--------------+---------------+------------+----------------------+-------------+------------+-------------------+---------+-----------------+
| #User.Gene | AA1 | petide.pos.1 | AA2 | knownGene.name | knownGene.strand | knownGene.AA | index0.in.rna | wild.codon | potential.var.codons | base.in.rna | chromosome | index0.in.genomic | exon | extra.user.data |
+------------+-----+--------------+-----+----------------+------------------+--------------+---------------+------------+----------------------+-------------+------------+-------------------+---------+-----------------+
| NOTCH2 | Pro | 1090 | Met | uc001eik.3 | - | P | 3267 | CCA | . | C | 1 | 120480548 | Exon 20 | Interesting |
| NOTCH2 | Pro | 1090 | Met | uc001eik.3 | - | P | 3268 | CCA | . | C | 1 | 120480547 | Exon 20 | Interesting |
| NOTCH2 | Pro | 1090 | Met | uc001eik.3 | - | P | 3269 | CCA | . | A | 1 | 120480546 | Exon 20 | Interesting |
| NOTCH2 | Pro | 1090 | Met | uc001eil.3 | - | P | 3267 | CCA | . | C | 1 | 120480548 | Exon 20 | Interesting |
| NOTCH2 | Pro | 1090 | Met | uc001eil.3 | - | P | 3268 | CCA | . | C | 1 | 120480547 | Exon 20 | Interesting |
| NOTCH2 | Pro | 1090 | Met | uc001eil.3 | - | P | 3269 | CCA | . | A | 1 | 120480546 | Exon 20 | Interesting |
+------------+-----+--------------+-----+----------------+------------------+--------------+---------------+------------+----------------------+-------------+------------+-------------------+---------+-----------------+
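
For downstream tools it can be handy to turn the located bases into BED intervals. The following is a sketch under assumptions: the raw output is taken to be tab-delimited, `index0.in.genomic` is taken to be a 0-based position (as the name suggests), and the helper name `backlocate_to_bed` is hypothetical. Columns are looked up by header name because their order differs between the two example outputs above.

```shell
# Hypothetical helper: read backlocate output on stdin, write BED lines.
backlocate_to_bed() {
  awk -F '\t' '
    /^##/ { next }                       # skip per-transcript comment lines
    /^#/  {                              # header line: remember column indexes
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    {
      chrom = $(col["chromosome"])
      pos0  = $(col["index0.in.genomic"])
      # BED is 0-based and half-open: [pos0, pos0 + 1)
      printf "%s\t%d\t%d\t%s\n", chrom, pos0, pos0 + 1, $(col["#User.Gene"])
    }'
}

# Usage (output file name is illustrative):
#   ... backlocate ... | backlocate_to_bed > located.bed
```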
backlocate was cited in: