How to Integrate Google Searches into Your Application
By Klaus Salchner
The first thing that comes to mind when we hear "Google" is "search engine". Google has turned the search business upside down within the last five years. Its founders started with an idea in 1995, which became widely known and used in 1998/99. Today Google is the number one search engine. You can find out more about Google's history here. Like other organizations, Google is trying to establish itself as a platform rather than a solution. This means it provides the necessary tools and infrastructure so that other people can build their own solutions on top of it. Google provides a web service interface which allows you to integrate Google searches right into your application. You can find out more about the Google web service API at http://www.google.ca/apis .
From the URL above you can download the developer's kit, which comes with a number of sample applications for different languages like .NET or Java. You also need a valid license key, which you must pass along with every web service call. To obtain a Google license key visit http://www.google.ca/apis and select "Create Account" on the left side navigation bar. Create an account by entering your email address and a password. Google then sends an email to that address to verify its existence. The email contains a link which completes the account creation by activating the account. When done, click on the continue link, which brings you back to the account creation page. At the bottom of the page you see a link "sign in here". Follow the link and sign into your account with your email address and password. You then see a page confirming that a license key has been generated and sent to your email address. Should you lose your license key, sign in again and Google will resend it to your email address. The license key is free but limits you to 1,000 calls per day. This will be more than enough to get started. If you need to make more than 1,000 calls per day, contact Google.
Create your project in Visual Studio .NET and right click on the project in the "Solution Explorer" pane. In the popup menu select "Add Web Reference" and enter the following WSDL URL: http://api.google.com/GoogleSearch.wsdl . Visual Studio checks that the WSDL exists, downloads it and shows you in the dialog the web methods this web service makes available. Under "Web reference name" enter the name of the web service reference, for example Google, which is the name the code snippets in this article use. When done, click "Add Reference" and you are ready to use the Google web service API. The reference is shown in the "Solution Explorer" under "Web References". You can right click on the web service reference and update it through the "Update Web Reference" menu item, or view it in the Object Browser through the "View in Object Browser" menu item. The Object Browser shows you that there are four different types available. The type GoogleSearchService exposes the actual web service calls you can make. It has three different web methods (plus the usual Begin/End methods if you want to call a web method asynchronously).
When you open Google in your browser and search for a word or phrase, you sometimes see the phrase "Did you mean: [suggested search term]" at the top of the search results page. Google performs a spell check of the search term you entered and then shows you alternative spellings of it. This helps the user to search for properly spelled words and phrases, and the user can simply click on the suggestion to search for the corrected search term. The Google web service also provides a web method to check for alternate spellings of a search term. Here is a code snippet:
    public static string SpellingSuggestion(string Phrase)
    {
        // create an instance of the Google web service
        Google.GoogleSearchService GoogleService = new Google.GoogleSearchService();

        // get the new spelling suggestion
        string SpellingSuggestion = GoogleService.doSpellingSuggestion(Key, Phrase);

        // null means we have no spelling suggestion
        if (SpellingSuggestion == null)
            SpellingSuggestion = Phrase;

        // release the web service object
        GoogleService.Dispose();

        return SpellingSuggestion;
    }
First we create an instance of the GoogleSearchService class and then we call the web method doSpellingSuggestion(). The first argument is your Google license key and the second one is the search term. The web method returns the alternate spelling of the search term, or null if there is no alternate spelling. The code snippet above returns the alternate spelling if there is one and the original search term otherwise. At the end it calls Dispose() to free up the underlying unmanaged resources.
Google is constantly crawling the Internet to keep its search index and directory up to date. Google's crawler also caches the content on its servers and allows you to obtain the cached page, which is the content as it was when the crawler last visited that resource. URLs can point to many different resources, most typically to HTML pages, but also to Word documents, PDF files, PowerPoint slides, etc. The cached page is always in HTML format, so any resource other than HTML is converted to HTML. Here is a code snippet:
    // requires: using System.IO;
    public static void GetCachedPageAndSaveToFile(string PageUrl, string FileName)
    {
        // create an instance of the Google web service
        Google.GoogleSearchService GoogleService = new Google.GoogleSearchService();

        // get the cached page content
        byte[] CachedPage = GoogleService.doGetCachedPage(Key, PageUrl);

        // file stream to write to the file & a binary writer to write data to the stream
        FileStream FileWriter = new FileStream(FileName, FileMode.Create);
        BinaryWriter Writer = new BinaryWriter(FileWriter);

        // write the page content to the file and close the streams
        Writer.Write(CachedPage);
        Writer.Close();
        FileWriter.Close();

        // release the web service object
        GoogleService.Dispose();
    }
First we again create an instance of the GoogleSearchService class and then we call the web method doGetCachedPage(). We pass along the Google license key plus the URL of the page we are looking for. The web service transmits the HTML content of the cached page base64 encoded, and the generated proxy class hands it to us as a byte array. Next we create a FileStream which we use to write the obtained page to a local file. With FileMode.Create we tell it to create the file, which overwrites any existing file. Then we create a BinaryWriter which uses the FileStream as its output. We write the returned byte array to the BinaryWriter, which writes it to the FileStream, which in turn writes it to the local file. Then we close the BinaryWriter and the FileStream. At the end we again call Dispose() to free up the underlying unmanaged resources.
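The file-writing step can also be looked at in isolation. The following is a minimal sketch, not part of the sample application: the class name CachedPageFile and the method name Save are made up for illustration, and the byte array stands in for whatever doGetCachedPage() returned. It uses C# using blocks, so the writer and the stream are closed even if the write fails.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the article's sample application.
public static class CachedPageFile
{
    // Writes the bytes of a cached page to a local file.
    public static void Save(byte[] cachedPage, string fileName)
    {
        if (cachedPage == null)
            throw new ArgumentNullException("cachedPage");

        // FileMode.Create creates the file or overwrites an existing one.
        // The using blocks close the writer and the stream on exit,
        // including when Write() throws.
        using (FileStream fileStream = new FileStream(fileName, FileMode.Create))
        using (BinaryWriter writer = new BinaryWriter(fileStream))
        {
            // Write(byte[]) writes the raw bytes, without a length prefix
            writer.Write(cachedPage);
        }
    }
}
```

Calling CachedPageFile.Save(CachedPage, FileName) would then replace the FileStream and BinaryWriter lines in the snippet above.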
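The web method doGoogleSearch() allows you to perform searches. You pass along the search term and a set of filter criteria which restrict the results, for example to a specific country, language or topic. Here are the arguments you pass along to the web method, named as in the code snippet below:
Key - your Google license key.
QueryTerm - the search term or query.
Start - the zero-based index of the first result to return.
MaxResults - the number of results to return per call (the API allows at most 10).
Filter - whether to filter out very similar results and multiple results from the same host.
Restricts - restricts the search to a subset of the index, such as a country or a topic.
SafeSearch - whether to filter out adult content.
LanguageRestrict - restricts the search to documents in one or more languages.
InputEncoding - the encoding of the query.
OutputEncoding - the encoding of the result.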
This web method allows you to perform simple or complex search queries against Google. It also allows you to filter the search result as well as page through the search result. Here is a code snippet:
    // requires: using System.Xml;
    public static XmlNode Search(string QueryTerm, int Start, int MaxResults, bool Filter,
                                 string Restricts, bool SafeSearch, string LanguageRestrict,
                                 string InputEncoding, string OutputEncoding)
    {
        // create an instance of the Google web service
        Google.GoogleSearchService GoogleService = new Google.GoogleSearchService();

        // perform search
        Google.GoogleSearchResult SearchResult = GoogleService.doGoogleSearch(Key, QueryTerm,
            Start, MaxResults, Filter, Restricts, SafeSearch, LanguageRestrict,
            InputEncoding, OutputEncoding);

        // we return the result back as an XML document
        XmlDocument ResultXml = CreateXmlDocument(SearchResultXmlNode);

        // add the search result
        StringValueOfObject(ResultXml.DocumentElement, SearchResult);

        // add the result elements and directory categories root nodes
        XmlElement ResultElementsParentNode = AddChildElement(ResultXml.DocumentElement, "ResultElements");
        XmlElement CategoriesParentNode = AddChildElement(ResultXml.DocumentElement, "DirectoryCategories");

        // now add all result elements
        foreach (Google.ResultElement ResultElement in SearchResult.resultElements)
            StringValueOfObject(ResultElementsParentNode, ResultElement);

        // now add all directory categories
        foreach (Google.DirectoryCategory DirectoryCategory in SearchResult.directoryCategories)
            StringValueOfObject(CategoriesParentNode, DirectoryCategory);

        // release the web service object
        GoogleService.Dispose();

        return ResultXml;
    }
First we create an instance of the GoogleSearchService class and then we call the web method doGoogleSearch(). We pass along all the arguments as described above. This performs the search and returns its result as an instance of the GoogleSearchResult class. The code snippet then takes all values of the GoogleSearchResult object and puts them into an XML document. Please refer to the attached sample application for the complete code. First it creates an XML document with the method CreateXmlDocument(). It then calls the method StringValueOfObject(), which creates an XML element for the object in the XML document, using the name of the object as the name of the XML element. The method then uses reflection to walk the returned GoogleSearchResult object, and for each field it finds in the object it adds an attribute holding the value of that field to the created XML element.
The returned GoogleSearchResult object has two fields which hold an array of ResultElement and DirectoryCategory objects. The method StringValueOfObject() is not able to walk each object in those arrays. Therefore we create two root XML elements in the XML document using the method AddChildElement(). We then loop through both arrays and call StringValueOfObject() for each object, which converts the object to an XML element with all its fields as attributes. Finally we again call Dispose() to free up the underlying unmanaged resources and then return the XML document, which contains all the search information of the GoogleSearchResult object. This enables you to run XPath queries against the search result XML document to find the required search result information.
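To make the reflection step and the XPath idea more concrete, here is a minimal sketch of how a method like StringValueOfObject() could work. It is not the code from the sample application; it is an illustration under assumptions. The ResultElement class below is a hypothetical stand-in for the generated Google.ResultElement proxy type, reduced to two fields, and the root element name SearchResult and the Demo() method are made up for the example.

```csharp
using System.Reflection;
using System.Xml;

// Hypothetical stand-in for the generated Google.ResultElement proxy type.
public class ResultElement
{
    public string URL;
    public string title;
}

public static class ObjectToXml
{
    // Creates a child element named after the object's type and adds one
    // attribute per public instance field. Null values and array fields
    // are skipped, which is why arrays need their own loop.
    // parent must be a node inside a document (not the XmlDocument itself).
    public static XmlElement StringValueOfObject(XmlNode parent, object value)
    {
        XmlElement element = parent.OwnerDocument.CreateElement(value.GetType().Name);
        FieldInfo[] fields = value.GetType().GetFields(BindingFlags.Public | BindingFlags.Instance);
        foreach (FieldInfo field in fields)
        {
            object fieldValue = field.GetValue(value);
            if (fieldValue == null || field.FieldType.IsArray)
                continue;
            element.SetAttribute(field.Name, fieldValue.ToString());
        }
        parent.AppendChild(element);
        return element;
    }

    // Builds a tiny result document and reads a URL back with XPath.
    public static string Demo()
    {
        XmlDocument document = new XmlDocument();
        document.LoadXml("<SearchResult><ResultElements/></SearchResult>");
        XmlNode parent = document.SelectSingleNode("/SearchResult/ResultElements");

        ResultElement result = new ResultElement();
        result.URL = "http://example.com/";
        result.title = "Example";
        StringValueOfObject(parent, result);

        // XPath query: the URL attribute of the first result element
        XmlNode url = document.SelectSingleNode("/SearchResult/ResultElements/ResultElement/@URL");
        return url.Value;
    }
}
```

Storing the fields as attributes keeps the XPath expressions short: a query such as /SearchResult/ResultElements/ResultElement/@title would return the titles of all results in one call to SelectNodes().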
The attached sample application provides a wrapper class for all Google web methods. It also provides a simple user interface demonstrating the use of each web method: you can enter a search term and get alternate spelling suggestions, download the cached HTML page of a URL and display it, and perform a search by entering all the search arguments. Please make sure to obtain your own Google license key and enter it in the app.config file.
Download Source
The Google web service API is very easy to use. It enables you to search the Internet from within your application. Complex query terms and filtering capabilities help keep the search results relevant to your application's needs. The Google web service is one of many emerging ones, like Amazon's web service or eBay's web service. By introducing a web service interface these companies have moved to a platform, enabling third parties to build solutions on top of them. For these companies an ever increasing number of requests and business transactions are coming through these web service interfaces. If you have comments on this article or this topic, please contact me at [email protected] . I want to hear if you learned something new. Contact me if you have questions about this topic or article.
Klaus Salchner has worked for 14 years in the industry, nine years in Europe and another five years in North America. As a Senior Enterprise Architect with solid experience in enterprise software development, Klaus spends considerable time on performance, scalability, availability, maintainability, globalization/localization and security. The projects he has been involved in are used by more than a million users in 50 countries on three continents.
Klaus calls Vancouver, British Columbia his home at the moment. His next big goal is doing the New York marathon in 2005. Klaus is interested in guest speaking opportunities or as an author for .NET magazines or Web sites. He can be contacted at [email protected] or http://www.enterprise-minds.com/ .
Enterprise application architecture and design consulting services are available. If you want to hear more about it contact me! Involve me in your projects and I will make a difference for you. Contact me if you have an idea for an article or research project. Also contact me if you want to co-author an article or join future research projects!