Parsing KEGG files

https://www.genome.jp/kegg-bin/get_htext?hsa00001+3101

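Lines beginning with C carry the KEGG pathway ID, and the lines beginning with D that follow each C line list all the genes belonging to that pathway. The one-liner below writes one tab-separated pathway ID and gene ID pair per line: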

perl -alne '{if(/^C/){/PATH:hsa(\d+)/;$kegg=$1}else{print "$kegg\t$F[1]" if /^D/ and $kegg;}}' hsa00001.keg >kegg2gene.txt


++++++++++++++++++++++++++++++++

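#!/usr/bin/perl
use warnings;
use strict;

# $pathway holds the number on the current C line; $gene holds the ID on each D line.
my ($pathway, $gene);
open my $in,  '<', 'hsa00001.keg' or die "Cannot open hsa00001.keg: $!";
open my $out, '>', 'kegg_sorting' or die "Cannot open kegg_sorting: $!";

while (<$in>) {
    chomp;
    if (/^C/) {
        # Remember the pathway number that follows the leading C.
        ($pathway) = $_ =~ /C\s*(\d+).*/;
    }
    elsif (/^D/) {
        # Pair the gene ID on this D line with the current pathway.
        ($gene) = $_ =~ /D\s+(\d+).*/;
        print $out "$pathway\t$gene\n" if defined $pathway and defined $gene;
    }
}

close $in;
close $out;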

