2023-07-11 CrossMap: converting genomic coordinates between genome assembly versions

CrossMap

Installation

pip3 install CrossMap

Download chain files

A chain file describes a pairwise alignment between two reference assemblies. UCSC and Ensembl chain files are available:

UCSC chain files

  • Chain files from hs1 (T2T-CHM13) to hg38/hg19/mm10/mm9 (or vice versa): https://hgdownload.soe.ucsc.edu/goldenPath/hs1/liftOver/
  • Chain files from hg38 (GRCh38) to hg19 and all other organisms: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/liftOver/
  • Chain files from hg19 (GRCh37) to hg17/hg18/hg38 and all other organisms: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/liftOver/
  • Chain files from mm10 (GRCm38) to mm9 and all other organisms: http://hgdownload.soe.ucsc.edu/goldenPath/mm10/liftOver/

Ensembl chain files

  • Human to Human: ftp://ftp.ensembl.org/pub/assembly_mapping/homo_sapiens/
  • Mouse to Mouse: ftp://ftp.ensembl.org/pub/assembly_mapping/mus_musculus/
  • Other organisms: ftp://ftp.ensembl.org/pub/assembly_mapping/
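
To make the idea of a chain file more concrete, here is a minimal sketch that parses the header line of one chain. It assumes the UCSC chain format, in which each chain starts with a line of the form "chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id". The names ChainHeader and parse_chain_header are hypothetical helpers made up for this illustration; they are not part of CrossMap.

```python
from typing import NamedTuple


class ChainHeader(NamedTuple):
    """Fields of a UCSC chain header line (hypothetical helper type)."""
    score: int
    t_name: str    # chromosome in the target (reference) assembly
    t_size: int
    t_strand: str
    t_start: int
    t_end: int
    q_name: str    # chromosome in the query assembly
    q_size: int
    q_strand: str
    q_start: int
    q_end: int
    chain_id: str


def parse_chain_header(line: str) -> ChainHeader:
    """Parse one 'chain ...' header line into a ChainHeader."""
    fields = line.split()
    # The keyword 'chain' plus 12 values gives 13 whitespace-separated tokens.
    if len(fields) != 13 or fields[0] != "chain":
        raise ValueError(f"not a chain header line: {line!r}")
    (_, score, t_name, t_size, t_strand, t_start, t_end,
     q_name, q_size, q_strand, q_start, q_end, chain_id) = fields
    return ChainHeader(
        int(score), t_name, int(t_size), t_strand, int(t_start), int(t_end),
        q_name, int(q_size), q_strand, int(q_start), int(q_end), chain_id,
    )


# Example header line in the UCSC chain format.
example = ("chain 4900 chrY 58368225 + 25985403 25985638 "
           "chr5 151006098 - 43257292 43257528 1")
header = parse_chain_header(example)
print(header.t_name, header.t_start, header.t_end, "->",
      header.q_name, header.q_strand)
```

The lines that follow a header in a real chain file describe the aligned blocks and the gaps between them; this sketch does not parse them.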

User input file

CrossMap supports the following input file formats:

  1. BAM, CRAM, or SAM

  2. BED or BED-like (a BED file must have at least the ‘chrom’, ‘start’, and ‘end’ columns)

  3. Wiggle (“variableStep”, “fixedStep” and “bedGraph” formats are supported)

  4. BigWig

  5. GFF or GTF

  6. VCF

  7. GVCF

  8. MAF
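
The minimum BED requirement above (at least ‘chrom’, ‘start’, ‘end’) can be checked before running a conversion. The following is a minimal sketch, not CrossMap's own validation: parse_bed_line is a hypothetical helper, and the rules it applies (skip comment, "track" and "browser" lines; require 0 <= start <= end) are assumptions made for this example.

```python
from typing import List, Optional, Tuple


def parse_bed_line(line: str) -> Optional[Tuple[str, int, int, List[str]]]:
    """Return (chrom, start, end, extra_columns), or None for non-data lines.

    Raises ValueError if the line has fewer than three columns or if the
    coordinates are not valid integers with 0 <= start <= end.
    """
    stripped = line.strip()
    if not stripped:
        return None
    first_token = stripped.split(None, 1)[0]
    if stripped.startswith("#") or first_token in ("track", "browser"):
        return None
    # Prefer tab-separated columns; fall back to any whitespace.
    fields = stripped.split("\t") if "\t" in stripped else stripped.split()
    if len(fields) < 3:
        raise ValueError(f"BED line needs chrom, start, end: {line!r}")
    chrom = fields[0]
    start, end = int(fields[1]), int(fields[2])  # int() raises ValueError on bad input
    if start < 0 or end < start:
        raise ValueError(f"invalid BED interval: {line!r}")
    return chrom, start, end, fields[3:]


print(parse_bed_line("chr1\t100\t200\tfeature1"))
print(parse_bed_line("# a comment line"))
```

Columns after the first three are returned unchanged, which mirrors the fact that only the coordinate columns are rewritten during a BED conversion.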

Usage

CrossMap.py bed hg18ToHg19.over.chain.gz test.hg18.bed3
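
When the same conversion has to be repeated for many files, the command above can be assembled and launched from Python. This is a rough sketch under several assumptions: build_crossmap_bed_command and run_if_available are hypothetical helpers, the executable is assumed to be installed on PATH as CrossMap.py, and the optional trailing output-file argument is assumed to be accepted by the bed sub-command (check "CrossMap.py bed -h" for your installed version).

```python
import shutil
import subprocess
from typing import List, Optional


def build_crossmap_bed_command(
    chain_file: str,
    input_bed: str,
    output_file: Optional[str] = None,
    executable: str = "CrossMap.py",  # assumed name of the installed script
) -> List[str]:
    """Build the argument list for a 'bed' conversion (hypothetical helper)."""
    cmd = [executable, "bed", chain_file, input_bed]
    if output_file is not None:
        # Assumption: an output path may follow the input file.
        cmd.append(output_file)
    return cmd


def run_if_available(cmd: List[str]) -> Optional[subprocess.CompletedProcess]:
    """Run the command only if its executable is found on PATH."""
    if shutil.which(cmd[0]) is None:
        return None  # CrossMap is not installed in this environment
    return subprocess.run(cmd, check=True)


cmd = build_crossmap_bed_command("hg18ToHg19.over.chain.gz", "test.hg18.bed3")
print(" ".join(cmd))
```

Calling run_if_available(cmd) would then execute the conversion when CrossMap is installed, and return None otherwise.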

$ CrossMap.py -h

usage: CrossMap.py [-h] [-v] {bed,bam,gff,wig,bigwig,vcf,gvcf,maf,region,viewchain} ...

CrossMap (v0.6.0) is a program to convert (liftover) genome coordinates between different reference
assemblies (e.g., from human GRCh37/hg19 to GRCh38/hg38 or vice versa). Supported file formats: BAM,
BED, BigWig, CRAM, GFF, GTF, GVCF, MAF (mutation annotation format), SAM, Wiggle, and VCF.

positional arguments:
  {bed,bam,gff,wig,bigwig,vcf,gvcf,maf,region,viewchain}
                        sub-command help
    bed                 converts BED, bedGraph or other BED-like files. Only genome coordinates
                        (i.e., the first 3 columns) will be updated. Regions mapped to multiple
                        locations to the new assembly will be split. Use the "region" command to
                        liftover large genomic regions. Use the "wig" command if you need
                        bedGraph/bigWig output.
    bam                 converts BAM, CRAM, or SAM format file. Genome coordinates, header section,
                        all SAM flags, insert size will be updated.
    gff                 converts GFF or GTF format file. Genome coordinates will be updated.
    wig                 converts Wiggle or bedGraph format file. Genome coordinates will be updated.
    bigwig              converts BigWig file. Genome coordinates will be updated.
    vcf                 converts VCF file. Genome coordinates, header section, reference alleles will
                        be updated.
    gvcf                converts GVCF file. Genome coordinates, header section, reference alleles
                        will be updated.
    maf                 converts MAF (mutation annotation format) file. Genome coordinates and
                        reference alleles will be updated.
    region              converts big genomic regions (in BED format) such as CNV blocks. Genome
                        coordinates will be updated.
    viewchain           prints out the content of a chain file into a human readable, block-to-block
                        format.

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit

Documentation: https://crossmap.readthedocs.io/en/latest/
