Reading a string line by line in Python: modifying the first line of a multi-line string

I fetched the text I need from the KEGG API using the code shown further down. The raw response looks like this:

[String 1]:

b">hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain [EC:6.1.1.20] | (RefSeq) FARSB, FARSLB, FRSB, HSPC173, NEDBLLA, PheHB, PheRS; phenylalanyl-tRNA synthetase subunit beta (A) MPTVSVKRDLLFQALGRTYTDEEFDELCFEFGLELDEITSEKEIISKEQGNVKAAGASDV VLYKIDVPANRYDLLCLEGLVRGLQVFKERIKAPVYKRVMPDGKIQKLIITEETAKIRPF AVAAVLRNIKFTKDRYDSFIELQEKLHQNICRKRALVAIGTHDLDTLSGPFTYTAKRPSD IKFKPLNKTKEYTACELMNIYKTDNHLKHYLHIIENKPLYPVIYDSNGVVLSMPPIINGD HSRITVNTRNIFIECTGTDFTKAKIVLDIIVTMFSEYCENQFTVEAAEVVFPNGKSHTFP ELAYRKEMVRADLINKKVGIRETPENLAKLLTRMYLKSEVIGDGNQIEIEIPPTRADIIH ACDIVEDAAIAYGYNNIQMTLPKTYTIANQFPLNKLTELLRHDMAAAGFTEALTFALCSQ EDIADKLGVDISATKAVHISNPKTAEFQVARTTLLPGLLKTIAANRKMPLPLKLFEISDI VIKDSNTDVGAKNYRHLCAVYYNKNPGFEIIHGLLDRIMQLLDVPPGEDKGGYVIKASEG PAFFPGRCAEIFARGQSVGKLGVLHPDVITKFELTMPCSSLEINVGPFL "

After decoding it as UTF-8:

[String 2]:

>hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain [EC:6.1.1.20] | (RefSeq) FARSB, FARSLB, FRSB, HSPC173, NEDBLLA, PheHB, PheRS; phenylalanyl-tRNA synthetase subunit beta (A)

MPTVSVKRDLLFQALGRTYTDEEFDELCFEFGLELDEITSEKEIISKEQGNVKAAGASDV

VLYKIDVPANRYDLLCLEGLVRGLQVFKERIKAPVYKRVMPDGKIQKLIITEETAKIRPF

AVAAVLRNIKFTKDRYDSFIELQEKLHQNICRKRALVAIGTHDLDTLSGPFTYTAKRPSD

IKFKPLNKTKEYTACELMNIYKTDNHLKHYLHIIENKPLYPVIYDSNGVVLSMPPIINGD

HSRITVNTRNIFIECTGTDFTKAKIVLDIIVTMFSEYCENQFTVEAAEVVFPNGKSHTFP

ELAYRKEMVRADLINKKVGIRETPENLAKLLTRMYLKSEVIGDGNQIEIEIPPTRADIIH

ACDIVEDAAIAYGYNNIQMTLPKTYTIANQFPLNKLTELLRHDMAAAGFTEALTFALCSQ

EDIADKLGVDISATKAVHISNPKTAEFQVARTTLLPGLLKTIAANRKMPLPLKLFEISDI

VIKDSNTDVGAKNYRHLCAVYYNKNPGFEIIHGLLDRIMQLLDVPPGEDKGGYVIKASEG

PAFFPGRCAEIFARGQSVGKLGVLHPDVITKFELTMPCSSLEINVGPFL

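My goal is to delete everything after the first space on the first line and leave the remaining lines unchanged, so that the result is:

[String 3]: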

>hsa:10056

MPTVSVKRDLLFQALGRTYTDEEFDELCFEFGLELDEITSEKEIISKEQGNVKAAGASDV

VLYKIDVPANRYDLLCLEGLVRGLQVFKERIKAPVYKRVMPDGKIQKLIITEETAKIRPF

AVAAVLRNIKFTKDRYDSFIELQEKLHQNICRKRALVAIGTHDLDTLSGPFTYTAKRPSD

IKFKPLNKTKEYTACELMNIYKTDNHLKHYLHIIENKPLYPVIYDSNGVVLSMPPIINGD

HSRITVNTRNIFIECTGTDFTKAKIVLDIIVTMFSEYCENQFTVEAAEVVFPNGKSHTFP

ELAYRKEMVRADLINKKVGIRETPENLAKLLTRMYLKSEVIGDGNQIEIEIPPTRADIIH

ACDIVEDAAIAYGYNNIQMTLPKTYTIANQFPLNKLTELLRHDMAAAGFTEALTFALCSQ

EDIADKLGVDISATKAVHISNPKTAEFQVARTTLLPGLLKTIAANRKMPLPLKLFEISDI

VIKDSNTDVGAKNYRHLCAVYYNKNPGFEIIHGLLDRIMQLLDVPPGEDKGGYVIKASEG

PAFFPGRCAEIFARGQSVGKLGVLHPDVITKFELTMPCSSLEINVGPFL

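My question is:

After fetching the text, how can I process the multi-line string directly, without first saving it to a file, and only then save the result to a file?

Writing the fetched text to a file and then processing that file seems like an unnecessary extra step.

Here is the code that fetches the text: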

import urllib.request

def getHtml(url):  # fetch the response body and decode it to str
    request = urllib.request.Request(url)
    response = urllib.request.urlopen(request)
    return response.read().decode("utf-8")

url1 = "http://rest.kegg.jp/get/hsa:10056/aaseq"
text = getHtml(url1)

The resulting "text" holds the content shown in [String 2] above.
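One possible way to skip the intermediate file is to wrap the string in io.StringIO, which can be iterated line by line like an open file, and to write each line out as soon as it has been processed. This is only a sketch: the function name save_trimmed and the output path are hypothetical, and it assumes the first line is the only one that needs changing.

```python
import io


def save_trimmed(text: str, path: str) -> None:
    """Write text to path, cutting the first line at its first space."""
    # StringIO exposes the in-memory string through the file interface,
    # so it can be read line by line without touching the disk.
    with io.StringIO(text) as source, open(path, "w", encoding="utf-8") as out:
        for number, line in enumerate(source):
            if number == 0:
                # Separate the line ending so it can be put back afterwards.
                body = line.rstrip("\r\n")
                ending = line[len(body):]
                # maxsplit=1: only the first space matters.
                line = body.split(" ", 1)[0] + ending
            out.write(line)
```

With the code above, something like save_trimmed(text, "hsa_10056.fasta") would write the result in a single pass; the file name here is just a placeholder.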

I know I can use split to cut a single line down to its first field:

>>> str1 = "hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain [EC:6.1.1.20] | (RefSeq) FARSB, FARSLB, FRSB, HSPC173, NEDBLLA, PheHB, PheRS; phenylalanyl-tRNA synthetase subunit beta (A)"
>>> str2 = str1.split(" ")[:1]
>>> print(str2)
['hsa:10056']
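A small side note on that snippet: slicing with [:1] returns a one-element list, while indexing with [0] returns the string itself, which is usually what is wanted when rebuilding a line. Passing a maxsplit of 1 also avoids splitting the whole description. The shortened sample string below is made up for illustration.

```python
str1 = "hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain"

as_list = str1.split(" ")[:1]    # slice: a list holding the first field
as_text = str1.split(" ", 1)[0]  # index: the first field as a plain string

print(as_list)  # ['hsa:10056']
print(as_text)  # hsa:10056
```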

The problem is that "text" is a multi-line string and I only want to process its first line. I don't know how to do that.
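One way this could be approached, as a minimal sketch rather than a definitive answer: str.partition splits a string at the first newline only, so the first line can be trimmed and the rest reattached untouched. The helper name trim_header and the shortened sample data are assumptions made for the example.

```python
def trim_header(text: str) -> str:
    """Keep only the first space-delimited token of the first line."""
    # partition cuts at the first newline only; sep is "" when there is none.
    header, sep, rest = text.partition("\n")
    # maxsplit=1 stops splitting after the first space.
    token = header.split(" ", 1)[0]
    return token + sep + rest


sample = ">hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain\nMPTVSVKRDL\nVLYKIDVPAN\n"
print(trim_header(sample))
```

Because everything happens on the string in memory, the result can then be written out once with an ordinary open(..., "w") call.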
