PTA-MOOC "Python Programming, Zhejiang University": PTA Problem Set, Chapter 7 Problems and Code Answers

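7-1 Word Frequency Count (30 points)

Write a program that, for a passage of English text, counts the number of distinct words in it and finds the top 10% of words by frequency.

A "word" is a contiguous string of no more than 80 word characters, but only the first 15 characters of any word longer than 15 are kept. Legal "word characters" are uppercase and lowercase letters, digits, and underscores; every other character is treated as a word separator.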
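To make the word definition concrete, here is a minimal sketch of the tokenizing step. The function name `extract_words` and the use of a regular expression are choices made for this illustration, not part of the problem statement, and the sketch assumes the text has already been cut at the `#` terminator.

```python
import re

# ASCII letters, digits and underscore are word characters;
# every other character separates words.
WORD_RE = re.compile(r'[A-Za-z0-9_]+')

def extract_words(text, max_len=15):
    """Return the lowercased words of text, each truncated to max_len characters."""
    return [w[:max_len].lower() for w in WORD_RE.findall(text)]

print(extract_words('Longlonglonglongword, this_8 and "This"'))
```

Truncating before counting is what makes two long words with the same first 15 characters collapse into one entry.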

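Input format:
The input is a non-empty text that ends with the symbol #. It is guaranteed that the text contains at least 10 different words.

Output format:
Print the number of distinct words in the text on the first line. Note that words are case-insensitive; for example, "PAT" and "pat" are considered the same word.

Then, in decreasing order of frequency, print the top 10% most frequent words in the format frequency:word. Ties are printed in increasing lexicographic order.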
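The ordering rule can be expressed as a single sort key: negate the frequency so that larger counts come first, then compare the words themselves to break ties. The sketch below assumes the words are already lowercased and truncated. `top_ten_percent` is a hypothetical helper name, and rounding the 10% cutoff down is an assumption that agrees with the sample in this problem (23 distinct words give 2 output lines).

```python
from collections import Counter

def top_ten_percent(words):
    """Return (distinct_count, lines), where lines are 'freq:word' for the top 10%."""
    counts = Counter(words)
    # Sort by frequency descending, then by word ascending for ties.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    keep = len(counts) // 10  # assumed: the 10% cutoff is rounded down
    return len(counts), [f'{freq}:{word}' for word, freq in ranked[:keep]]

# 21 distinct words: 'a' and 'b' tie at 3, 'c' has 2, the rest appear once.
words = ['b'] * 3 + ['a'] * 3 + ['c'] * 2 + list('defghijklmnopqrstu')
distinct, lines = top_ten_percent(words)
print(distinct)
print('\n'.join(lines))
```

With 21 distinct words the cutoff is 2, so only the two tied words are printed, 'a' before 'b'.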

Sample input:

This is a test.

The word "this" is the word with the highest frequency.

Longlonglonglongword should be cut off, so is considered as the same as longlonglonglonee.  But this_8 is different than this, and this, and this...#
this line should be ignored.

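Sample output: (Note: although the word "the" also appears 4 times, only the top 10% of words are printed, that is, the first 2 of the 23 words. Since "the" ranks 3rd in alphabetical order, it is not printed.)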

23
5:this
4:is

Code

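import sys

# Read all input and keep only the part before the terminating '#'.
s=sys.stdin.read()
strs=s[:s.find('#')]
# Replace every character that is not a letter, digit or underscore with a space.
for k in set([i for i in strs if (not i.isalnum()) and i !='_']):
    strs=strs.replace(k,' ')
# Lowercase and split into words.
strs=strs.lower().split()
count={}
for i in strs:
    i=i[:15]    # only the first 15 characters of a word are kept
    count[i]=count.get(i,0)+1
# Number of lines to print: the top 10%, rounded down.
a=int(len(count)*0.1)
print(len(count))
# Sort by frequency descending; ties are broken by ascending dictionary order.
ans=sorted(count.items(),key=lambda x:(-x[1],x[0]))
for i in range(a):
    print(str(ans[i][1])+":"+ans[i][0])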

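7-2. Swap the case of every letter in the text file "example.txt" and save the result to "result.txt". (This statement is inferred from the code; the post repeats the heading of the letter-counting problem here.)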

Code

with open('example.txt','r') as f:
    s=f.readlines()
for i in range(len(s)):
    s[i]=s[i].replace('\n','')
    # str.swapcase() flips each letter exactly once. Calling replace() per
    # character would flip every occurrence of that letter in the line,
    # so a line such as "aA" would end up as "aa" instead of "Aa".
    s[i]=s[i].swapcase()
with open('result.txt','w') as f:
    for i in s:
        f.write(i)
        f.write('\n')

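7-2. Count the characters of each kind in the text file "letter.txt": count letters (upper and lower case are not distinguished), digits, and all other characters separately. Compress the program (zip) and upload it as a file!

Code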

# count[0]: letters, count[1]: digits, count[2]: everything else
count=[0,0,0]
with open('letter.txt','r') as f:
    for line in f:
        for ch in line:
            if ch.isalpha():      # upper and lower case are counted together
                count[0]+=1
            elif ch.isdigit():
                count[1]+=1
            else:                 # spaces, punctuation and newline characters
                count[2]+=1
print(count[0],count[1],count[2])
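
For checking the counting logic without a file on disk, the classification can be pulled into a function that takes a string. This is a sketch rather than the submitted answer; `count_chars` is a name introduced here, and, like the loop above, it counts newline characters as "other".

```python
def count_chars(text):
    """Return (letters, digits, others) for the given string."""
    letters = sum(ch.isalpha() for ch in text)
    digits = sum(ch.isdigit() for ch in text)
    # Whatever is neither a letter nor a digit falls into the third group.
    return letters, digits, len(text) - letters - digits

print(count_chars('Room 101, Floor 2\n'))
```

For the sample string this prints (9, 4, 5): nine letters, four digits, and five other characters (three spaces, a comma and the newline).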
                

7-3. An excerpt from Martin Luther King's "I Have a Dream" is stored in "freedom.txt":

I have a dream that one day this nation will rise up, live up to the true meaning of its creed: “We hold these truths to be self-evident; that all men are created equal.”

I have a dream that one day on the red hills of Georgia the sons of former slaves and the sons of former slave-owners will be able to sit down together at the table of brotherhood.

I have a dream that one day even the state of Mississippi, a state sweltering with the heat of injustice, sweltering with the heat of oppression, will be transformed into an oasis of freedom and justice.

I have a dream that my four children will one day live in a nation where they will not be judged by the color of their skin but by the content of their character. I have a dream today.

I have a dream that one day down in Alabama with its governor having his lips dripping with the words of interposition and nullification, one day right down in Alabama little black boys and black girls will be able to join hands with little white boys and white girls as sisters and brothers.

I have a dream today.


I have a dream that one day every valley shall be exalted, every hill and mountain shall be made low, the rough places will be made plain, and the crooked places will be made straight, and the glory of the Lord shall be revealed, and all flesh shall see it together.

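Write a program that builds a vocabulary table: count how many times each word appears, without distinguishing upper and lower case, and save the output to the file "dic.txt".
Compress the program (zip) and upload it as a file!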

Code

count={}
with open('freedom.txt','r') as f:
    for line in f:
        # Lowercase so that counting is case-insensitive, then split on whitespace.
        for word in line.lower().split():
            count[word]=count.get(word,0)+1
# Write one "word:count" pair per line.
with open('dic.txt','w') as f:
    for i in count:
        f.write(i+":"+str(count[i]))
        f.write('\n')
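
Because the code above splits on whitespace only, punctuation stays attached to words, so "today." is stored as an entry of its own. One possible refinement, offered as a sketch rather than the expected answer, is to strip punctuation from both ends of each token before counting. The helper name `count_words` and the exact set of stripped characters (ASCII punctuation plus curly quotes such as those in the excerpt) are assumptions.

```python
import string
from collections import Counter

# ASCII punctuation plus curly quotes (an assumed set, chosen for this excerpt).
STRIP_CHARS = string.punctuation + '“”‘’'

def count_words(text):
    """Count case-insensitive words, ignoring punctuation at either end of a token."""
    tokens = (t.strip(STRIP_CHARS) for t in text.lower().split())
    # Tokens that consisted only of punctuation become empty and are skipped.
    return Counter(t for t in tokens if t)

counts = count_words('I have a dream today. “We hold these truths to be self-evident;”')
print(counts['today'], counts['self-evident'], counts['we'])
```

Stripping only at the ends keeps hyphenated words such as "self-evident" and "slave-owners" intact.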
        
