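Perl's outlook isn't great these days; I'm learning it because it's a required course. You may prefer to learn Python, although Perl is still very capable at text processing.
If you spot a mistake, feel free to message me privately.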
1. Read a file and output its sequence as a single line.
#!/usr/bin/perl -w
use strict;
use 5.026;
# The input file is the first command-line argument.
open IN,"<",$ARGV[0] or die $!;
open OUT,">","out.txt" or die $!;
while(<IN>){
chomp;
print OUT $_;
}
print OUT "\n";
close IN;
close OUT;
2. Read a file and print its first 5 lines to the screen.
#!/usr/bin/perl -w
use strict;
use 5.026;
open IN,"<",$ARGV[0] or die $!;
for(0..4){
my $Txt=<IN>;
# Stop early if the file has fewer than 5 lines.
last unless defined $Txt;
chomp $Txt;
print "$Txt\n";
}
close IN;
3. Store the letters c a e w i m j q z s n in an array, sort them, and print the result.
#!/usr/bin/perl -w
use strict;
use 5.026;
my @array=qw(c a e w i m j q z s n);
my @sortarray=sort @array;
print @sortarray,"\n";
4. Store some people's names, then write a program that prints a person's given name when their family name is entered.
#!/usr/bin/perl -w
use strict;
use 5.026;
my %haxi=qw(Gang fei He Qiang Lie Wenzhe Tang Jingsha);
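chomp(my $input=<STDIN>);
# Avoid printing an undefined value when the family name is not in the hash.
print(exists $haxi{$input} ? "$haxi{$input}\n" : "No such family name\n");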
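5. A text file has two columns: the first is the sequence ID and the second is the sequence. Filter and process the sequences according to the four requirements below, then write the IDs and sequences that qualify to a new file.
1) Each sequence must start with ATG or contain ATATAT;
2) Each sequence must be longer than 25;
3) If a sequence that meets the two requirements above contains characters other than A, T, C, G, replace them with N;
4) Append to the end of each line the poly-A/poly-T sequence and its position in the original sequence.
Note: I made up a txt file for this exercise, and you can write your own as well. I didn't really understand item 4, so I simply appended 10 A's to every qualifying sequence and recorded the position of the first appended A (positions are counted from 1).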
#!/usr/bin/perl -w
use strict;
use 5.026;
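open IN,"<",$ARGV[0] or die $!;
open OUT,">","testout.txt" or die $!;
while(my $txt=<IN>){
# Split the line into ID and sequence; skip lines without two columns.
my ($id,$seq)=split ' ',$txt;
next unless defined $seq;
# Requirement 1: starts with ATG, or contains ATATAT anywhere.
if($seq=~/^ATG/ or $seq=~/ATATAT/){
# Requirement 2: longer than 25.
if(length($seq)>25){
# Requirement 3: replace anything other than A, T, C, G with N.
$seq=~s/[^ATCG]/N/g;
my $polyA="A"x10;
my $polyA_i=length($seq)+1;
print OUT $id,"\t",$seq,$polyA,"\t",$polyA_i,"\n";
}
}
}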
close IN;
close OUT;
My original file (image not included):
After processing (image not included):
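One possible reading of item 4 is that the poly-A/poly-T stretch should be located inside each sequence rather than appended to it. The sketch below follows that reading; it is an assumption, not the official answer. The helper name `find_poly_runs` and the minimum run length of 5 are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every run of at least $min_len A's or T's
# as a pair [run, 1-based start position].
sub find_poly_runs {
    my ($seq, $min_len) = @_;
    my @runs;
    while ($seq =~ /(A{$min_len,}|T{$min_len,})/g) {
        # $-[1] is the 0-based offset where capture group 1 starts.
        push @runs, [$1, $-[1] + 1];
    }
    return @runs;
}

# Example sequence built from pieces so the run lengths are easy to see.
my $example = "ATGCC" . "A" x 6 . "GG" . "T" x 7 . "C";
for my $run (find_poly_runs($example, 5)) {
    print "$run->[0]\t$run->[1]\n";
}
```

Each pair could then be written after the sequence in place of the fixed 10 A's used in the script above.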
6. A file (you can make one up) contains several numbers separated by ","; compute and print their mean (to two decimal places) and their median.
#!/usr/bin/perl -w
use strict;
use 5.026;
open IN,"<",$ARGV[0] or die $!;
chomp(my $txt=<IN>);
close IN;
my @number=split /,/,$txt;
die "No numbers found\n" unless @number;
my ($avg,$midnumber)=(0,0);
for(@number){
$avg+=$_/@number;
}
printf "%0.2f\n",$avg;
my @sortnumber=sort {$a <=> $b} @number;
my $mid=int(@sortnumber/2);
if(@sortnumber%2==0){
# Even count: average the two middle values.
$midnumber=($sortnumber[$mid-1]+$sortnumber[$mid])/2;
}
else{
# Odd count: take the single middle value.
$midnumber=$sortnumber[$mid];
}
print "$midnumber\n";
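7. A file (you can make one up) records people's heights: the first column is the name and the second is the height. Sort the entries by height from shortest to tallest and write them to a new file.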
#!/usr/bin/perl -w
use strict;
use 5.026;
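open IN,"<",$ARGV[0] or die "Can't open file\n";
open OUT,">","out.txt" or die $!;
my %hash;
while(my $txt=<IN>){
# Skip lines that do not look like "name height"; heights may have a decimal part.
next unless $txt=~/^(\w+)\s+(\d+(?:\.\d+)?)/;
$hash{$1}=$2;
}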
close IN;
my @sort=sort {$hash{$a} <=> $hash{$b}} keys %hash;
for(@sort){
print OUT $_,"\t",$hash{$_},"\n";
}
close OUT;
8. Given a file in fq (FASTQ) format (this is bioinformatics background; look it up if you're unfamiliar with it):
#!/usr/bin/perl -w
use strict;
use 5.026;
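open FQ,"<",$ARGV[0] or die $!;
open OUTFQ,">","out.fq" or die $!;
open FA,">","out.fa" or die $!;
my $totallength=0;
my $GC=0;
# A FASTQ record is four lines: ID, sequence, "+" comment, quality.
while(my $id=<FQ>){
chomp($id);
chomp(my $seq=<FQ>);
chomp(my $comment=<FQ>);
chomp(my $quality=<FQ>);
# Drop 3 bases from each end; trim the quality string the same way so both keep the same length.
my $sub_seq=substr($seq,3,length($seq)-6);
my $sub_quality=substr($quality,3,length($quality)-6);
print OUTFQ "$id\n$sub_seq\n$comment\n$sub_quality\n";
$totallength+=length $seq;
# tr with an empty replacement list only counts the matching characters.
$GC+=($seq=~tr/GCgc//);
# Turn the FASTQ header into a FASTA header.
$id=~s/^@/>/;
print FA "$id\n$seq\n";
}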
close FQ;
close OUTFQ;
close FA;
# Guard against an empty input file to avoid dividing by zero.
my $GCp=$totallength ? $GC/$totallength*100 : 0;
printf "GC=%0.2f\n",$GCp;
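For readers unfamiliar with the format, the script above relies on each FASTQ record being exactly four lines: an ID line starting with `@`, the sequence, a separator line starting with `+`, and a quality string as long as the sequence. The sketch below shows that layout by parsing records from an in-memory string. The helper name `parse_fastq` and the sample read are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: parse FASTQ text into a list of record hashes.
sub parse_fastq {
    my ($text) = @_;
    open my $fh, '<', \$text or die $!;   # read the string like a file
    my @records;
    while (defined(my $id = <$fh>)) {
        my $seq  = <$fh>;
        my $plus = <$fh>;
        my $qual = <$fh>;
        die "Truncated FASTQ record\n" unless defined $qual;
        chomp($id, $seq, $plus, $qual);
        die "Bad header: $id\n"        unless $id   =~ /^@/;
        die "Bad separator: $plus\n"   unless $plus =~ /^\+/;
        die "Length mismatch in $id\n" unless length($seq) == length($qual);
        push @records, { id => $id, seq => $seq, qual => $qual };
    }
    close $fh;
    return @records;
}

my @recs = parse_fastq("\@read1\nACGTACGT\n+\nIIIIHHHH\n");
print "$recs[0]{id} has ", length($recs[0]{seq}), " bases\n";
```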
9. Given a fa (FASTA) file, process it as follows:
#!/usr/bin/perl -w
use strict;
use 5.026;
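open FAIN,"<",$ARGV[0] or die $!;
open FAOUT1,">","Reverseout.fa" or die $!;
open FAOUT2,">","Complementaryout.fa" or die $!;
# Assumes each record is two lines: a header line followed by one sequence line.
while(my $id=<FAIN>){
chomp(my $seq=<FAIN>);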
my $rseq=reverse $seq;
print FAOUT1 "$id$rseq\n";
$seq=~tr/ATCGatcg/TAGCtagc/;
print FAOUT2 "$id$seq\n";
}
close FAIN;
close FAOUT1;
close FAOUT2;
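The script above writes the reversed and the complemented sequences to separate files. The two steps can also be chained to get the reverse complement. As far as the text shows, the exercise does not ask for this, so treat it as an optional sketch; the helper name `reverse_complement` is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reverse a DNA sequence, then complement each base.
sub reverse_complement {
    my ($seq) = @_;
    my $rc = reverse $seq;           # scalar assignment, so the string is reversed
    $rc =~ tr/ATCGatcg/TAGCtagc/;    # other characters (e.g. N) are left unchanged
    return $rc;
}

print reverse_complement("ATGC"), "\n";
```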
Code for parts 3 and 4:
#!/usr/bin/perl -w
use strict;
use 5.026;
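open FAIN,"<",$ARGV[0] or die $!;
open FAOUT,">","rnaout.fa" or die $!;
open FALOWOUT,">","lowout.fa" or die $!;
# Assumes each record is two lines: a header line followed by one sequence line.
while(my $id=<FAIN>){
my $seq=<FAIN>;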
my $lowseq=$seq;
$lowseq=~tr/ATCG/atcg/;
$seq=~tr/Tt/Uu/;
print FAOUT "$id$seq";
print FALOWOUT "$id$lowseq";
}
close FAIN;
close FAOUT;
close FALOWOUT;
Code for part 5:
#!/usr/bin/perl -w
use strict;
use 5.026;
open FAIN,"<",$ARGV[0] or die "The file cannot be opened\n";
open FAOUT,">","mlineout.fa" or die $!;
while(my $id=<FAIN>){
print FAOUT $id;
chomp(my $seq=<FAIN>);
for(my $i=0;$i<length $seq;$i+=20){
my $line=substr($seq,$i,20);
print FAOUT "$line\n";
}
}
close FAIN;
close FAOUT;
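All the scripts for exercise 9 assume that every record is exactly two lines. Real FASTA files often wrap a sequence over several lines, and the two-line loops would then misread the file. The sketch below shows one way to collect multi-line records first and then re-wrap them. The helper names `read_fasta` and `wrap_seq` and the sample data are made up for illustration, and records with duplicate headers would be merged.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read FASTA text whose sequences may span several lines.
# Returns a list of [header, sequence] pairs in input order.
sub read_fasta {
    my ($text) = @_;
    my (@order, %seq_of);
    my $current;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*$/;          # skip blank lines
        if ($line =~ /^>(.*)/) {           # header line starts a new record
            $current = $1;
            push @order, $current;
            $seq_of{$current} = '';
        }
        elsif (defined $current) {         # sequence line: append to current record
            $line =~ s/\s+//g;
            $seq_of{$current} .= $line;
        }
    }
    return map { [$_, $seq_of{$_}] } @order;
}

# Hypothetical helper: split a sequence into chunks of at most $width characters.
# Call it in list context.
sub wrap_seq {
    my ($seq, $width) = @_;
    return $seq =~ /(.{1,$width})/g;
}

my @records = read_fasta(">seq1\nATGC\nGGTA\n>seq2\nTTAA\n");
for my $rec (@records) {
    my ($id, $seq) = @$rec;
    print ">$id\n", join("\n", wrap_seq($seq, 5)), "\n";
}
```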