Academic search engines are useful tools for finding and accessing articles published in academic journals and conference proceedings. We select three popular players in this field and compare them on several measures below.
The three players are Microsoft Academic Search, Google Scholar and Arnetminer.
Our measures for comparison are:
(1) Quality of returned results
(including the relevance of results to queries, coverage of areas, and the ability to do approximate matching);
(2) Update rate;
(3) User satisfaction;
(4) Extended services.
Measure 1: Quality of returned results
(1) Quality of results for papers
When we search for a particular paper (that is, enter all or most of its title), Microsoft Academic Search returns only that paper without any related results, while the other two search engines, Google Scholar and Arnetminer, return many related results.
Microsoft | Arnetminer |
The three pictures shown above demonstrate the results of searching for the paper "SPEED: Precise and Efficient Static Estimation of Program Computational Complexity".
(2) Quality of results for an academic field
When we search for an academic field, all three search engines return good results. The example this time is "Static Analysis".
Microsoft | Arnetminer |
(3) Approximate matching
Approximate matching is indispensable for any search engine, so we test it with an experiment.
a) Search for "jiawei hen" instead of "jiawei han"
Microsoft | Arnetminer |
This experiment shows that Microsoft Academic Search lacks approximate matching, while Google Scholar and Arnetminer are fault-tolerant and also suggest an alternative query.
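As a rough illustration of the fault tolerance tested above, the following minimal sketch uses Python's standard `difflib` module to suggest an alternative query from a small name index. The index contents, the `suggest_query` helper and the 0.8 cutoff are our own assumptions for illustration; nothing here reflects how Google Scholar or Arnetminer actually generate their suggestions.

```python
from difflib import get_close_matches

# Hypothetical index of known author names (illustration only).
KNOWN_AUTHORS = ["jiawei han", "jie tang", "sumit gulwani"]


def suggest_query(query, candidates=KNOWN_AUTHORS, cutoff=0.8):
    """Return the known name closest to `query`, or None if none is similar enough."""
    matches = get_close_matches(query.lower(), candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(suggest_query("jiawei hen"))  # prints: jiawei han
```

A real engine would combine such string similarity with query logs and popularity, so this only shows the basic idea of tolerating a one-letter typo.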
(4) Other considerations
Here we have to point out one detail about Google Scholar.
Google Scholar pays close attention to abbreviated names. For example, when we type "S. Gulwani" instead of "Sumit Gulwani", the first several results from Google Scholar contain the keywords "Sumit Gulwani". Neither Microsoft Academic Search nor Arnetminer achieves this.
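One simple way to relate an abbreviated name to a full name is to treat every single-letter token as an initial. The sketch below is a minimal version of that idea; the `matches_abbreviated` helper is hypothetical and is not a description of Google Scholar's actual matching.

```python
def matches_abbreviated(query, full_name):
    """Return True if `query` (e.g. "S. Gulwani") could abbreviate `full_name`."""
    query_parts = query.replace(".", " ").lower().split()
    name_parts = full_name.lower().split()
    if len(query_parts) != len(name_parts):
        return False
    for q, n in zip(query_parts, name_parts):
        if len(q) == 1:
            # A single letter is an initial: it matches any part starting with it.
            if not n.startswith(q):
                return False
        elif q != n:
            # Longer tokens must match the name part exactly.
            return False
    return True


print(matches_abbreviated("S. Gulwani", "Sumit Gulwani"))  # prints: True
```

This sketch ignores middle names and reordered name parts, which a production system would have to handle.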
Arnetminer employs the H-Index, and Microsoft uses both the H-Index and the G-Index. As Google has no such function, it is hard to use Google for ranking researchers.
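For readers unfamiliar with these two metrics, the sketch below computes them from a list of per-paper citation counts using the standard definitions. The citation counts are made up for illustration, and this variant of the G-Index is capped at the number of papers; we do not know which exact variant either engine uses.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers together have at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts for one researcher's five papers.
papers = [10, 8, 5, 4, 3]
print(h_index(papers), g_index(papers))  # prints: 4 5
```

The G-Index rewards a few highly cited papers more than the H-Index does, which is why the two values can differ for the same researcher.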
Measure 2: Update rate
To compare how quickly the three search engines pick up new publications, we present two cases. (The results were double-checked as of 2011.3.8.)
(1) Conference Paper: The first case is "WSDM 2011". WSDM (Web Search and Data Mining) is the premier international ACM conference covering research on search and data mining on the Web; it took place on February 9-12, 2011 in Hong Kong. When we try "WSDM 2011" on Google Scholar, to our surprise we find more than 100 papers published at WSDM 2011, while the other two players return no suitable results.
For example, the accepted paper "On the Selection of Tags for Tag Cloud" can be found by Google Scholar but is present in neither Microsoft Academic Search nor Arnetminer. The paper is also available on the author's website, and Google already links it to the correct ACM portal page such a short time after the WSDM 2011 proceedings.
(2) Journal Paper: The second case is "Topology-Adaptive Mesh Deformation for Surface Evolution, Morphing, and Multiview Reconstruction", a recent paper published via the IEEE homepage (http://www.computer.org/portal/web/tpami/). The result is similar to the first case: Google Scholar outperforms its competitors thanks to its fast update rate.
Measure 3: User satisfaction
(1) Loading Time
Loading time was measured with http://tools.pingdom.com/. The results are shown in the order Microsoft, Google, Arnetminer.
Microsoft | Arnetminer |
As we can see, Google takes the least time to load its pages while Arnetminer takes the most. Google's pages carry less content, while Microsoft's carry much more. Arnetminer's slowness may be due to network conditions.
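A comparable measurement can be approximated locally. The sketch below times repeated downloads of a page and reports the median; the `fetch` and `median_load_time` helpers are hypothetical names of our own. Note that it only measures the download of the HTML document, whereas the Pingdom tool loads the full page, so the numbers are not directly comparable.

```python
import time
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download the raw HTML of `url` (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def median_load_time(url, loader=fetch, repeats=5):
    """Median wall-clock seconds taken by `loader(url)` over `repeats` runs.

    For an even number of runs the upper of the two middle samples is returned.
    """
    samples = []
    for _ in range(repeats):
        started = time.perf_counter()
        loader(url)
        samples.append(time.perf_counter() - started)
    samples.sort()
    return samples[len(samples) // 2]
```

Calling `median_load_time(url)` for each engine's search page from the same machine and network gives a rough ranking; using the median rather than the mean reduces the effect of a single slow request.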
(2) User experience
a) Advanced search (filtering the results):
Google offers many advanced search options, such as keyword specification, Author, Publication, Year, Specific Topics, Legal Opinions and Patent Search.
Microsoft offers a smaller set of such options: Author, Conference, Journal and Year.
Arnetminer offers no such advanced search.
b) Download Experience
Viewing and downloading papers is a key function that users expect from such a search engine.
Google offers the best download experience, providing three ways to view and download papers.
Microsoft offers a relatively good download experience, as it can handle some of the websites related to a given paper.
Arnetminer offers almost no download support.
c) User subscription:
The subscription experience is quite important, as it increases user stickiness.
Google: Google can extend its subscriptions through Google Reader and Google Calendar. I have subscribed to several IEEE transactions. It can send emails with new search content.
Microsoft: Very good subscription support for a standalone project; it offers subscriptions to authors, journals and search keywords.
Arnetminer: No subscription offered.
d) Citation:
It is a common scenario that researchers follow the references of papers to do a literature review before conducting research.
Google: Offers the different versions of a single publication and the papers that cite it.
Microsoft: Only the papers from the same conference.
Arnetminer: Offers a list of citing papers.
e) Miscellaneous:
Microsoft requires Silverlight to be installed, which makes viewing the co-author graph and the conference calendar impossible under Linux/Unix; users who do not install it are offered no alternative such as Flash.
BibTeX (or another citation format) is best supported by Arnetminer.
Measure 4: Extended services
Google serves as a solid and powerful search engine for publications. However, it fails to extend its services as broadly as Microsoft Academic and Arnetminer do.
Both Microsoft Academic and Arnetminer provide name disambiguation, that is, they distinguish different authors who share the same name.
Both Microsoft Academic and Arnetminer direct you to a profile page dedicated to each paper, author, organization, conference or journal. Specifically, the page for each author covers detailed academic information, such as affiliation, research interests, homepage and list of publications. Similarly, pages for an organization, conference or journal provide detailed introductions, which are absent on Google.
Moreover, the Visual Explorer of Microsoft Academic explores scholars' collaboration networks through the co-author graph for a given author and the co-author path between two authors. Arnetminer goes a step further by mining advisor-advisee relationships between two authors who have collaborated, so users can easily find the advisors or students of an author. Detailed examples are shown in the following.
Microsoft Academic: co-author graph for "Jiawei Han" | Microsoft Academic: co-author path between "Jiawei Han" and "Jie Tang" |
Arnetminer: social graph for "Jie Tang" (Red lines stand for "advisor" and yellow lines stand for "advisee") |
Arnetminer: social graph between "Jiawei Han" and "Jie Tang" |
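To make the notion of a co-author path concrete, the sketch below finds a shortest chain of co-authors between two people with a breadth-first search over a co-authorship graph. The graph data and the `coauthor_path` helper are hypothetical, and breadth-first search is our assumption; we do not know how the Visual Explorer computes its paths.

```python
from collections import deque


def coauthor_path(graph, start, goal):
    """Shortest chain of co-authors from `start` to `goal`, or None if unconnected."""
    if start == goal:
        return [start]
    previous = {start: None}  # maps each visited author to the one we came from
    queue = deque([start])
    while queue:
        author = queue.popleft()
        for coauthor in graph.get(author, ()):
            if coauthor in previous:
                continue
            previous[coauthor] = author
            if coauthor == goal:
                # Walk back from the goal to the start, then reverse.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(coauthor)
    return None


# Hypothetical co-authorship data (not real publication records).
GRAPH = {
    "Author A": ["Author B", "Author C"],
    "Author B": ["Author A", "Author D"],
    "Author C": ["Author A"],
    "Author D": ["Author B", "Author E"],
    "Author E": ["Author D"],
    "Author F": [],
}

print(coauthor_path(GRAPH, "Author A", "Author E"))
# prints: ['Author A', 'Author B', 'Author D', 'Author E']
```

Advisor-advisee mining as done by Arnetminer needs additional signals beyond this graph, such as publication dates, so it is not covered by this sketch.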
To sum up, each of Microsoft Academic, Google Scholar and Arnetminer plays a vital role in academic search and meets the needs of different users. For simplicity, we list the strengths (marked in green) and weaknesses (marked in red) in the following table.
Measures | Google Scholar | Microsoft Academic | Arnetminer |
M1: Content | Satisfactory; good approximate matching; good with abbreviated spelling | Satisfactory; bad approximate matching; bad with abbreviated spelling | Satisfactory; good approximate matching; bad with abbreviated spelling |
M2: Update Rate | Fastest | Slow | Slow |
M3: User Satisfaction | Fastest; good download experience; good subscription experience; citation finding is fine | Relatively slow; satisfactory download experience; good subscription experience; citation finding is relatively poor | Slow; bad download experience; no subscription experience; citation finding is fine |
M4: Extended Services | No | Name disambiguation; co-author graph & path; H-Index, G-Index; call-for-paper calendar; fails to mine deeper relationships | Name disambiguation; advisor-advisee social graph; H-Index |
Suggestions for Microsoft Academic Search:
1. Make it faster and improve its update rate (actually, I believe that Microsoft will have the latest proceedings at hand as soon as possible).
2. Extend the range of fields to areas such as mathematics, biology and physics once computer science is mature.
3. Offer a more convenient downloading experience, as it may be the most basic expectation of a large range of potential users.
4. Try more extended services, such as advisor-advisee relationship mining and expert finding. Explore more practical scenarios for such features rather than toy examples.
5. There are existing SNS examples such as Mendeley, which may use a different strategy, but Microsoft can integrate such features with its LIVE account as well as with its information in cmt.research.microsoft.com.
6. Subscriptions to new papers, trends, calls for papers, special issues of journals and the like are a good idea. Making Microsoft Academic such a resource center would greatly improve user stickiness.
7. Microsoft actually has a clearer structure for academic search, namely its object-level vertical search, though part of the functionality is not mature. For a vertical search product that targets a more specific audience with more specific needs, user needs sometimes grow with what we can offer. Hence, once the basic needs are satisfied, meeting more sophisticated needs is also crucial to satisfying its potential users.