4.6 Sentence Rank
The good results of the previous section suggest that linguistic information plays a key role in finding LS queries. Rada Mihalcea's studies on keyword extraction from academic papers and on paragraph summarization use graph-based ranking algorithms. Three of these, undirected-graph sentence rank, forward-graph sentence rank and backward-graph sentence rank, have performed well in plain-text retrieval, and all three are implemented and tested in this project on all 225 pages.
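The three graph variants can be illustrated with a minimal sketch. It assumes the commonly described formulation of graph-based sentence ranking: sentences are nodes, edge weights come from word overlap normalised by sentence length, and scores are obtained with a weighted PageRank-style iteration. The function names (`tokenize`, `similarity`, `sentence_rank`), the damping factor of 0.85 and the fixed iteration count are illustrative assumptions, not the project's actual implementation.

```python
import math
import re


def tokenize(sentence):
    """Lowercase word tokens; a simple stand-in for real preprocessing."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity(a, b):
    """Word overlap normalised by sentence lengths (assumed measure)."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom == 0:  # both sentences contain a single token
        return 0.0
    return len(set(a) & set(b)) / denom


def sentence_rank(sentences, mode="undirected", damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank-style iteration.

    mode="undirected": every similar pair is linked in both directions
    mode="forward":    a sentence links only to sentences that follow it
    mode="backward":   a sentence links only to sentences that precede it
    """
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # weight[i][j] is the weight of the edge i -> j
    weight = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if mode == "forward" and j < i:
                continue
            if mode == "backward" and j > i:
                continue
            weight[i][j] = similarity(tokens[i], tokens[j])

    out_sum = [sum(row) for row in weight]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to w(j -> i)
            incoming = sum(
                weight[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) + damping * incoming)
        scores = new_scores
    return scores


def top_sentence(sentences, mode="undirected"):
    """Return the highest-scoring sentence, e.g. as a candidate query."""
    scores = sentence_rank(sentences, mode)
    best = max(range(len(sentences)), key=lambda i: scores[i])
    return sentences[best]
```

In this sketch a sentence with no incoming edges keeps the baseline score `1 - damping`, so in forward mode the first sentence of a page can never be ranked above that baseline, and in backward mode the same holds for the last sentence.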
SentenceRank | Google (pages) | Google (%) | Yahoo (pages) | Yahoo (%) |
3 | 87.00 | 38.67% | 93.00 | 41.33% |
4 | 113.00 | 50.22% | 122.00 | 54.22% |
5 | 133.00 | 59.11% | 136.00 | 60.44% |
6 | 146.00 | 64.89% | 151.00 | 67.11% |
7 | 152.00 | 67.56% | 155.00 | 68.89% |
8 | 157.00 | 69.78% | 157.00 | 69.78% |
9 | 159.00 | 70.67% | 161.00 | 71.56% |
10 | 162.00 | 72.00% | 157.00 | 69.78% |
11 | 166.00 | 73.78% | 158.00 | 70.22% |
12 | 165.00 | 73.33% | 160.00 | 71.11% |
13 | 165.00 | 73.33% | 161.00 | 71.56% |
14 | 164.00 | 72.89% | 166.00 | 73.78% |
15 | 167.00 | 74.22% | 167.00 | 74.22% |
Average | 148.92 | 66.19% | 149.54 | 66.46% |
Table 4.25
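The percentage and average entries of Table 4.25 follow directly from the page counts. As a small worked check, the sketch below recomputes them for the first result column; the variable and function names are illustrative only.

```python
TOTAL_PAGES = 225

# Page counts from the first result column of Table 4.25 (rows 3 to 15)
first_column_counts = [87, 113, 133, 146, 152, 157, 159, 162, 166, 165, 165, 164, 167]


def success_rate(count, total=TOTAL_PAGES):
    """Share of the test pages that were retrieved, in percent."""
    return 100.0 * count / total


# Per-row percentages, rounded to two decimals as in the table
rates = [round(success_rate(c), 2) for c in first_column_counts]

# The "Average" row is the mean count, converted to a percentage the same way
average_count = sum(first_column_counts) / len(first_column_counts)
average_rate = success_rate(average_count)

print(rates[0], round(average_count, 2), round(average_rate, 2))
```

The same calculation applies to the other result columns and to Tables 4.26 and 4.27.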
Figure 4.27 (a), (b): Number of successfully retrieved pages out of 225 and the corresponding percentage for undirected-graph sentence rank.
Forward | Google (pages) | Google (%) | Yahoo (pages) | Yahoo (%) |
3 | 85.00 | 37.78% | 93.00 | 41.33% |
4 | 105.00 | 46.67% | 127.00 | 56.44% |
5 | 139.00 | 61.78% | 144.00 | 64.00% |
6 | 155.00 | 68.89% | 155.00 | 68.89% |
7 | 161.00 | 71.56% | 160.00 | 71.11% |
8 | 165.00 | 73.33% | 159.00 | 70.67% |
9 | 167.00 | 74.22% | 163.00 | 72.44% |
10 | 169.00 | 75.11% | 168.00 | 74.67% |
11 | 172.00 | 76.44% | 167.00 | 74.22% |
12 | 172.00 | 76.44% | 168.00 | 74.67% |
13 | 170.00 | 75.56% | 169.00 | 75.11% |
14 | 169.00 | 75.11% | 171.00 | 76.00% |
15 | 173.00 | 76.89% | 174.00 | 77.33% |
Average | 154.00 | 68.44% | 155.23 | 68.99% |
Table 4.26
Figure 4.28 (a), (b): Number of successfully retrieved pages out of 225 and the corresponding percentage for forward-graph sentence rank.
Backward | Google (pages) | Google (%) | Yahoo (pages) | Yahoo (%) |
3 | 79.00 | 35.11% | 87.00 | 38.67% |
4 | 109.00 | 48.44% | 105.00 | 46.67% |
5 | 128.00 | 56.89% | 117.00 | 52.00% |
6 | 147.00 | 65.33% | 136.00 | 60.44% |
7 | 154.00 | 68.44% | 135.00 | 60.00% |
8 | 158.00 | 70.22% | 153.00 | 68.00% |
9 | 159.00 | 70.67% | 151.00 | 67.11% |
10 | 164.00 | 72.89% | 152.00 | 67.56% |
11 | 165.00 | 73.33% | 162.00 | 72.00% |
12 | 164.00 | 72.89% | 166.00 | 73.78% |
13 | 164.00 | 72.89% | 155.00 | 68.89% |
14 | 168.00 | 74.67% | 153.00 | 68.00% |
15 | 170.00 | 75.56% | 160.00 | 71.11% |
Average | 148.38 | 65.95% | 140.92 | 62.63% |
Table 4.27
Figure 4.29 (a), (b): Number of successfully retrieved pages out of 225 and the corresponding percentage for backward-graph sentence rank.
Sentence rank achieves a higher successful retrieval rate, exceeding 75% in the best cases (up to 77.33% for forward sentence rank on Yahoo). It is also worth mentioning that the results from Google and Yahoo are very close to each other, unlike the results in the previous sections. The average results are shown in Figure 4.30, which also includes the Title method for easy comparison. The benefit of sentence rank cannot be disregarded.
Figure 4.30 Comparison of all sentence-rank methods with the Title method.
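How close the two search engines are can be quantified with a short calculation on the average rates reported in Table 4.28. The sketch below takes a few representative rows and computes the absolute gap between the two result columns; the dictionary layout and the name `gap` are illustrative choices, not part of the project.

```python
# Average success rates in percent, copied from Table 4.28:
# (first result column, second result column)
average_rates = {
    "Title": (50.05, 44.00),
    "TF": (60.24, 51.52),
    "DF": (71.35, 58.36),
    "Sentence Rank": (66.19, 66.46),
    "Forward Sentence Rank": (68.44, 68.99),
    "Backward Sentence Rank": (65.95, 62.63),
}


def gap(method):
    """Absolute difference between the two columns, in percentage points."""
    first, second = average_rates[method]
    return round(abs(first - second), 2)


for method in average_rates:
    print(f"{method}: {gap(method)} percentage points")
```

For these rows the gap is well below one percentage point for undirected and forward sentence rank, a little over three points for backward sentence rank, and between six and thirteen points for Title, TF and DF.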
Finally, a comprehensive chart of the average successful retrieval rates of all methods is shown in Figure 4.31.
Figure 4.31 Comparison of all methods.
x-axis | Method | Google | Yahoo |
1 | Title | 50.05% | 44.00% |
2 | TF | 60.24% | 51.52% |
3 | DF | 71.35% | 58.36% |
4 | | 66.36% | 55.62% |
5 | PW | 60.55% | 49.94% |
6 | TF3DF2 | 71.38% | 61.09% |
7 | TF4DF1 | 66.84% | 55.90% |
8 | TF5DF5 | 71.25% | 61.23% |
9 | | 71.62% | 63.08% |
10 | | 69.16% | 58.63% |
11 | | 71.69% | 62.12% |
12 | Word Rank | 55.79% | 49.57% |
13 | Nouns & Verbs Rank | 47.32% | 44.55% |
14 | WordRank3DF2 | 71.69% | 59.90% |
15 | WordRank4DF1 | 65.95% | 56.79% |
16 | WordRank5DF5 | 71.52% | 60.72% |
17 | WordRank3TFIDF2 | 53.64% | 44.79% |
18 | WordRank4TFIDF1 | 54.87% | 47.90% |
19 | WordRank5TFIDF5 | 51.38% | 43.01% |
20 | Random Sentence Pick | 67.97% | 64.14% |
21 | Sentence Rank | 66.19% | 66.46% |
22 | Forward Sentence Rank | 68.44% | 68.99% |
23 | Backward Sentence Rank | 65.95% | 62.63% |
Table 4.28