Speech Patterns (string)

People often have a preference among synonyms of the same word. For example, some may prefer "the police", while others may prefer "the cops". Analyzing such patterns can help to narrow down a speaker's identity, which is useful when validating, for example, whether it's still the same person behind an online avatar.

 

 Now given a paragraph of text sampled from someone's speech, can you find the person's most commonly used word?

 

 Input Specification:

 

Each input file contains one test case. For each case, there is one line of text no more than 1048576 characters in length, terminated by a carriage return '\n'. The input contains at least one alphanumerical character, i.e., one character from the set [0-9 A-Z a-z].

 

 Output Specification:

 

For each test case, print in one line the most commonly occurring word in the input text, followed by a space and the number of times it has occurred in the input. If there are more than one such words, print the lexicographically smallest one. The word should be printed in all lower case. Here a "word" is defined as a continuous sequence of alphanumerical characters separated by non-alphanumerical characters or the line beginning/end.

 

Note that words are case insensitive.

Sample Input:

Can1: "Can a can can a can?  It can!"

 Sample Output:

can 5

 

例子是个坑,一开始认为 格式是can1:”_______”   ______为要检查的内容。

其实“ne character from the set [0-9 A-Z a-z]”,所以“can1”是一种字符,有别于“can”

 

还有就是容易超时,后来想了一个O(n)的算法,如下:

 

  1 #include <iostream>

  2 

  3 #include <map>

  4 

  5 #include <string>

  6 

  7 using namespace std;

  8 

  9  

 10 

 11 struct word

 12 

 13 {

 14 

 15    int time;

 16 

 17    int len;

 18 

 19 };

 20 

 21  

 22 

 23  map<string,word>  mm;

 24 

 25  

 26 

 27 void fun1(string ss)

 28 

 29 {

 30 

 31    int i=0;string tem="";

 32 

 33    while(i<ss.length())

 34 

 35    {

 36 

 37         while(i<ss.length())

 38 

 39     {

 40 

 41       if((ss[i]>='A'&&ss[i]<='Z')||(ss[i]>='a'&&ss[i]<='z')||(ss[i]>='0'&&ss[i]<='9'))

 42 

 43         {

 44 

 45               if(ss[i]>='A'&&ss[i]<='Z')

 46 

 47                     ss[i]=ss[i]-'A'+'a';

 48 

 49               tem+=ss[i];

 50 

 51               ++i;

 52 

 53         }

 54 

 55         else break;

 56 

 57     }

 58 

 59      ++mm[tem].time;

 60 

 61       mm[tem].len=i;

 62 

 63       tem="";

 64 

 65       while(i<ss.length())

 66 

 67      {

 68 

 69       if((ss[i]>='A'&&ss[i]<='Z')||(ss[i]>='a'&&ss[i]<='z')||(ss[i]>='0'&&ss[i]<='9'))

 70 

 71         {

 72 

 73               break;

 74 

 75         }

 76 

 77         else i++;

 78 

 79      }

 80 

 81    }

 82 

 83  

 84 

 85  

 86 

 87 }

 88 

 89  

 90 

 91 void fun2(string ss)

 92 

 93 {

 94 

 95      int i=0;string tem="";

 96 

 97                 while(i<ss.length())

 98 

 99        {

100 

101                 while(i<ss.length())

102 

103             {

104 

105                if((ss[i]>='A'&&ss[i]<='Z')||(ss[i]>='a'&&ss[i]<='z')||(ss[i]>='0'&&ss[i]<='9'))

106 

107                      break;

108 

109                   else i++;

110 

111            }

112 

113              while(i<ss.length())

114 

115            {

116 

117               if((ss[i]>='A'&&ss[i]<='Z')||(ss[i]>='a'&&ss[i]<='z')||(ss[i]>='0'&&ss[i]<='9'))

118 

119                {

120 

121                 if(ss[i]>='A'&&ss[i]<='Z')

122 

123                       ss[i]=ss[i]-'A'+'a';

124 

125                    tem+=ss[i];

126 

127                     ++i;

128 

129                 }

130 

131                else break;

132 

133             }

134 

135            ++mm[tem].time;

136 

137                mm[tem].len=i;

138 

139             tem="";  

140 

141          }

142 

143    }

144 

145  

146 

147 int main()

148 

149 {

150 

151       string ss;

152 

153       int i,j;int c1,c2;

154 

155      while(getline(cin,ss))

156 

157       {

158 

159             mm.clear();

160 

161             

162 

163            

164 

165             if((ss[0]>='A'&&ss[0]<='Z')||(ss[0]>='a'&&ss[0]<='z')||(ss[0]>='0'&&ss[0]<='9'))

166 

167                   fun1(ss);

168 

169             else fun2(ss);

170 

171        

172 

173         int Max=0;int Min=1048576;

174 

175                   string most;

176 

177               map<string,word>::iterator it;

178 

179               for(it=mm.begin();it!=mm.end();it++)

180 

181               {                    if((it->second).time>Max||((it->second).time==Max&&(it->second).len<Min))

182 

183                   {

184 

185                      Max=(it->second).time;

186 

187                      Min=(it->second).len;

188 

189                most=it->first;

190 

191                   }

192 

193               }

194 

195         cout<<most<<" "<<Max<<endl;

196 

197       }

198 

199  

200 

201    return 0;

202 

203 }
View Code

 

你可能感兴趣的:(Pattern)