Contest100000599 - 《算法笔记》6.4小节——C++标准模板库(STL)介绍->map的常见用法详解

问题 A: Speech Patterns (25)

题目描述

People often have a preference among synonyms of the same word. For example, some may prefer “the police”, while others may prefer “the cops”. Analyzing such patterns can help to narrow down a speaker’s identity, which is useful when validating, for example, whether it’s still the same person behind an online avatar.

Now given a paragraph of text sampled from someone’s speech, can you find the person’s most commonly used word?

输入

Each input file contains one test case. For each case, there is one line of text no more than 1048576 characters in length, terminated by a carriage return ‘\n’. The input contains at least one alphanumerical character, i.e., one character from the set [0-9 A-Z a-z].

输出

For each test case, print in one line the most commonly occurring word in the input text, followed by a space and the number of times it has occurred in the input. If there are more than one such words, print the lexicographically smallest one. The word should be printed in all lower case. Here a “word” is defined as a continuous sequence of alphanumerical characters separated by non-alphanumerical characters or the line beginning/end.

Note that words are case insensitive.

样例输入

Can1: “Can a can can a can? It can!”

样例输出

can 5

Note

题意就是统计一行文本中每个“单词”出现的次数,然后输出出现次数最高的单词和它出现的次数,次数最高的单词不唯一时按字典序最小输出。常规做法就是逐个读取字符串中的每个字符,遇到非数字字母字符就将单词存入map里。最后用一个for循环来找出最大次数和对应的单词。代码如下所示,因为map容器自带按字典序排序的功能,所以我们每次找到次数更大的单词时,其字典序一定是最小的,所以不需要另外用一个字符串来保存次数最大的单词。

#include 
#include 
#include 
using namespace std;

bool NOA(char c) // umber or alpha,判断字符是否为数字或字母
{
    if (c >= '0' && c <= '9' || c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z')
        return true;
    return false;
}

int main()
{
    string speech, word, ans;
    while (getline(cin, speech))
    {
        map<string, int> mp;
        for (int i = 0; i < speech.length(); ++i)
        {
            while (i < speech.length() && NOA(speech[i])) // 如果是字母数字字符
            {
                if (speech[i] >= 'A' && speech[i] <= 'Z') // 先转成小写字母
                    speech[i] += 32;
                word += speech[i++]; // 则加入单词
            }
            if (word != "") // 如果word不是空串,即读出了一个单词
            {
                mp[word] += 1; // 该单词出现次数加1,默认初值是0,所以不用初始化
                word = ""; // word字符串清零,重新记录新单词
                --i;
            }
        }

        int max = 0;
        for (map<string, int>::iterator it = mp.begin(); it != mp.end(); ++it)
        {
            if (it->second > max)
            {
                max = it->second;
                ans = it->first;
            }
        }
        cout << ans << ' ' << max << endl;
    }

    return 0;
}

然而更通用的做法就是,利用sort函数对map按value进行排序。因为map不像vector容器是一对一线性存储,因此sort不能直接对map排序。我们知道,map中存储的是一个个pair对象,如果我们将pair对象提取出来,存入vector中,不就相当于线性存储了吗?具体的看代码就能明白。由于pair是结构体,换种方式理解就是定义了一个结构体数组,只不过结构体存储在vector中。因此对它的成员引用需要“ . ”而不是“ -> ”。

#include 
#include 
#include 
#include 
#include 
using namespace std;

bool NOA(char c) // umber or alpha,判断字符是否为数字或字母
{
    if (c >= '0' && c <= '9' || c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z')
        return true;
    return false;
}

bool cmp(pair<string, int> i, pair<string, int> j)
{
    return i.second > j.second; // 按出现次数从大到小排序
}

int main()
{
    string speech, word;
    while (getline(cin, speech))
    {
        map<string, int> mp;
        for (int i = 0; i < speech.length(); ++i)
        {
            while (i < speech.length() && NOA(speech[i])) // 如果是字母数字字符
            {
                if (speech[i] >= 'A' && speech[i] <= 'Z') // 先转成小写字母
                    speech[i] += 32;
                word += speech[i++]; // 则加入单词
            }
            if (word != "") // 如果word不是空串,即读出了一个单词
            {
                mp[word] += 1; // 该单词出现次数加1,默认初值是0,所以不用初始化
                word = ""; // word字符串清零,重新记录新单词
                --i;
            }
        }

        vector<pair<string, int> > mpvec(mp.begin(), mp.end());
        sort(mpvec.begin(), mpvec.end(), cmp);
        //for (int i = 0; i != mpvec.size(); ++i)
            //cout << mpvec[i].first << ' ' << mpvec[i].second << endl;
        cout << mpvec[0].first << ' ' << mpvec[0].second << endl;
    }

    return 0;
}

一定要自己写一遍哦~~~

你可能感兴趣的:(#,第6章)