Alternate Access Mappings (AAMs) may be one of the least understood aspects of SharePoint, and they can have a substantial impact on Search (both Crawl and Query). In this post, I'll document some of my experiences and insights on AAMs.
Note: "Plan alternate access mappings (Office SharePoint Server)" is the best resource I've found for documenting AAMs; it covers several configuration scenarios as well as troubleshooting for common mistakes. Although listed as SharePoint 2007 content, its information largely applies to both SharePoint 2010 and SharePoint 2013. I won't duplicate that TechNet page, but I will reference it (calling it the "Plan for AAMs" document) to reiterate some aspects that have particular impact on Search.
The most common mistakes I see regarding AAMs tend to fall into the following groups (more detail on each of these throughout this post):
AAMs: An Overview and Base Example
The Plan for AAMs document (referenced above) provides the following description (again, this also applies to SP2010 and SP2013):
"Alternate access mappings enable a Web application that receives a request for an internal URL, in one of the five authentication zones, to return pages that contain links to the public URL for the zone. You can associate a Web application with a collection of mappings between internal and public URLs. Internal refers to the URL of a Web request as it is received by Office SharePoint Server 2007. Public refers to the URL of an externally accessible Web site. The public URL is the base URL that Office SharePoint Server 2007 uses in the pages that it returns. If the internal URL has been modified by a reverse proxy device, it can differ from the public URL."
That accurately describes it, but if you've never seen it defined before, it's easy to get hung up on the difference between the Internal URL and the Public URL, or on some other nuance. Or, if you're like me, you made it about fifteen words in before your eyes glazed over and you started skimming. So, let me try to describe this with pictures.
First, assume you just created a Web Application with a load balanced URL http://initech and haven't configured any other modifications to the AAMs. For this, your AAMs would resemble the following:
And the communication flow for a page request from the user would look something like this...
Now, let's look at some definitions to help you get your mind wrapped around Internal and Public URLs:
Typically, the Internal URL is the same as the Public URL, so it's not always obvious that there is a difference between the two. But what if you added a second Internal URL, such as:
Adding a second internal address is common when you want to target a specific server rather than a load balanced address. In both cases, the user is targeting the same SharePoint Web Application (and more specifically, the same Web Application Zone), but via a different "incoming" URL.
In this scenario of browsing to http://someServer, the picture would look exactly like the communication flow depicted above, except the initial request from the client would contain "HTTP GET (http://someServer)". Even so, the rendered page would still be shown relative to its Public URL, http://initech. In other words, browsing to http://someServer pulls the same content as browsing to http://initech. However, when browsing to http://someServer, the URL in my address bar would change back to http://initech (the Public URL), such as:
Note: Technically, this isn't redirecting to http://initech, but the page will display relative to http://initech. However, the page then has to pull back JavaScript, CSS files, images, and other resources to fully load the page content. The links to these resources are defined server-side as relative URLs... using the Public URL for this request. Thus, the page loads with references to the load balanced address http://initech and will pull these resources from http://initech.
This is a common strategy for troubleshooting or targeting a specific web server to process a request. (Think http://your.typical.load.balanced.url, which would route through an NLB, versus the same request targeted to a specific server such as http://server1 or http://server2.) It is also required when setting the Web Application's SiteDataServers property to implement dedicated crawl targets.
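As a rough sketch of the scenario above, the following PowerShell lists the current AAMs and then adds http://someServer as a second Internal URL in the Default zone. It assumes it is run on a farm server where the SharePoint snap-in is available, and it reuses the example URLs from this post, so adjust both for your own farm.

```powershell
# Load the SharePoint cmdlets if this is a plain PowerShell session
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Show the existing mappings for the Web Application
Get-SPAlternateURL -WebApplication http://initech

# Add a second Internal URL to the Default zone.
# -Internal means http://initech stays the Public URL for this zone.
New-SPAlternateURL -WebApplication http://initech -Url http://someServer -Zone Default -Internal

# List the mappings again to confirm both incoming URLs map to the same Public URL
Get-SPAlternateURL -WebApplication http://initech
```

This only changes the AAMs. The IIS binding and name resolution for http://someServer still have to be in place for the request to reach SharePoint at all.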
Extending the Web Application and Why It's Important
In a second example, the Web Application gets extended into another zone, such as:
Which would have a corresponding communication flow like:
In this, the same content is rendered via different URLs, where each URL represents a different zone (and in this case, each zone also has a different Authentication Provider, but it is not required to be different). A common problem occurs when creating a new AAM zone without extending the Web Application (or similarly, manually altering the Public URL of any zone from Central Admin). To better describe why this causes a problem, I need to first provide some background.
When creating a Web Application, SharePoint also creates a corresponding IIS Site (an IIS Site is different than a SharePoint Site [which is really a Site Collection] or a SharePoint Web [which is often called the root or sub-site]... unfortunately, the term "site" is often used interchangeably, causing a ton of confusion). When a request reaches a server, the underlying server subsystems (such as HTTP.SYS) route these to IIS. Using the IIS Sites and their bindings, these requests are then routed to the appropriate w3wp.exe for processing (in other words, an IIS Site is bound to a particular App Pool, which has a w3wp.exe process).
Note: Check out "SharePoint as an ASP.NET-IIS Application" for much more information on this topic, which includes the following:
"A web application in SharePoint terminology is closely related to what is called a website in IIS terminology... It can be helpful, especially when you are trying to see the relation between SharePoint and IIS from a high and broad perspective, to think of the SharePoint web application and its corresponding IIS website as a single entity... [A]lthough there is usually a one-to-one relation between SharePoint web applications and IIS websites, this is not always the case."
When extending a Web Application into another zone, SharePoint:
When simply modifying the AAMs without extending the Web Application, the IIS Site does not get created for the new zone. In fact, from PowerShell, we can verify that the IisSettings object is not created for that zone of the Web Application. For example, in my Farm, I manually created an Intranet zone by simply adding a new AAM rather than extending the Web Application, then ran the following:
$webApp = Get-SPWebApplication http://initech
$webApp.IisSettings[[Microsoft.SharePoint.Administration.SPUrlZone]::("Default")]
ServerComment  : SharePoint Initech 80
Path           : C:\inetpub\wwwroot\wss\VirtualDirectories\initech80
ServerBindings : {Microsoft.SharePoint.Administration.SPServerBinding}
...etc...
$webApp.IisSettings[[Microsoft.SharePoint.Administration.SPUrlZone]::("Intranet")]
#...this is empty because the IisSettings entry for this zone is $null
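The same check can be generalized to every zone that has an AAM. The sketch below assumes the example http://initech Web Application and a farm server with the SharePoint snap-in. It compares the zones found in the Web Application's AlternateUrls collection against the keys of its IisSettings dictionary and reports any zone that has mappings but no backing IIS Site.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$webApp = Get-SPWebApplication http://initech

# Every zone that has at least one AAM entry for this Web Application
$aamZones = $webApp.AlternateUrls | ForEach-Object { $_.UrlZone } | Sort-Object -Unique

foreach ($zone in $aamZones) {
    # IisSettings is keyed by SPUrlZone; a zone that was never extended has no entry
    $hasIisSite = $webApp.IisSettings.ContainsKey($zone)
    "{0,-10} backing IIS Site: {1}" -f $zone, $hasIisSite
}
```

Any zone reported as False here was most likely created by editing the AAMs directly rather than by extending the Web Application.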
The Plan for AAMs document also notes the following. I wish this were also explicitly documented for SP2010 and SP2013, but the article was never updated for these newer versions. In any case, this functionality hasn't really changed across versions, so (in my mind) it applies to all versions of SharePoint:
"We recommend extending a Web application to a new IIS Web site for each zone you want to use. This provides a backing IIS Web site. We do not recommend reusing the same IIS Web site for multiple zones, unless you are specifically told to do so by Microsoft."
Depending on the bindings in the other IIS Sites (particularly with "*" [wildcard] host headers and/or multiple sites using the same port), it is quite possible to create ambiguous scenarios where requests may reach the incorrect w3wp.exe for processing (here is a Microsoft KB describing this behavior). In this scenario, SharePoint can in some cases still accept the request and process it. It is quite possible that things will appear to work, but this might also create some weird idiosyncrasies that can't be fully explained.
A similar problem occurs when manually altering the AAMs from Central Admin (e.g. changing the AAMs without extending). This, noted as "Mistake 4" in the Plan for AAMs TechNet document, seems to stem from a common misconception that "updates made in alternate access mappings automatically update IIS bindings". However, changes to AAMs are not reflected in IIS (whereas extending the Web Application creates a new IIS Site with the appropriate IIS settings/bindings in place). Thus, manually changing the AAMs could induce these problems even if the zones were originally extended.
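For contrast, here is a minimal sketch of extending the Web Application into the Intranet zone instead of only adding an AAM. The site name, URL, and host header are hypothetical values chosen for illustration, and the authentication parameters are left out because they depend on your environment.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Hypothetical values for illustration only
$intranetUrl = "http://intranet.initech"

# Extending creates the AAM for the zone AND a backing IIS Site with matching bindings.
# Authentication parameters are omitted here; add the ones your farm requires.
Get-SPWebApplication http://initech |
    New-SPWebApplicationExtension -Name "SharePoint Initech Intranet 80" `
        -Zone Intranet -URL $intranetUrl -HostHeader "intranet.initech" -Port 80

# After extending, the zone should have its own IisSettings entry
$webApp = Get-SPWebApplication http://initech
$webApp.IisSettings[[Microsoft.SharePoint.Administration.SPUrlZone]::Intranet]
```

If the Public URL of an extended zone has to change later, the IIS bindings need to be reviewed as well, since editing the AAM alone will not touch them.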
Some Impacts to Crawling the non-Default URL
This isn't documented anywhere, but I'm convinced that there is an inherent assumption built into SharePoint that a Web Application's Default zone will be crawled (specifically, the Public URL).
For example, in SharePoint 2010, contextual scopes and popular social tags both break if crawling the non-Default zone. Contextual scopes break because the query processor attempts to translate the URL to the Default zone's Public URL before processing the query. For social tags, SharePoint normalizes the tagged target URL to its Default zone's Public URL before storing it in the Social DB.
In SharePoint 2013, URL-related managed properties including Path, ParentURL and SPSiteUrl all store values relative to the URL that was crawled. However, these clearly get negatively impacted at query time by crawling the non-Default zone (again because the query processor attempts to translate the URL to the Default zone's Public URL before processing the query).
Also, although not directly caused by AAM configuration, AAMs do play a contributing role in the configuration of IIS Bindings, DNS (or HOSTS files), load balancers, proxy servers, and the DisableLoopbackCheck (or BackConnectionHostNames) registry settings.
Mapping External Resources
As noted above, some query-time scenarios require a translation between zones using the AAMs. However, AAMs are an aspect of Web Applications, so what happens if the Web Application is in a remote farm (e.g. as in an Enterprise Services Publishing/Consuming scenario where the Search Farm is different than the content Farm)? For this, we have the ability to create an "External Resource Mapping" in the Search Farm. These allow you to create AAMs for an external Farm within the local Search Farm. The author describes this scenario and shows how to build the External Resource Mappings.
Essentially, you're duplicating the alternate access mappings in the Search Farm so the query processor can properly map the requests to the correct Default zone. For example, if the original Web App in the content farm has a Default, Intranet, and Extranet mapping, you'd want to create an External Resource Mapping in the Search farm and provide the URLs for the target's Default, Intranet, and Extranet zones.
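A hedged sketch of that Default/Intranet/Extranet example, run in the Search Farm, might look like the following. The resource name and the Intranet and Extranet URLs are hypothetical. The sketch also assumes that the -ResourceName parameter set of New-SPAlternateURL creates the external resource on first use, so verify the result in Central Admin before relying on it.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Hypothetical resource name and URLs; mirror the real AAMs from the content farm
$resource = "InitechContentFarm"

# One Public URL per zone, matching the zones of the remote Web Application
New-SPAlternateURL -ResourceName $resource -Url http://initech -Zone Default
New-SPAlternateURL -ResourceName $resource -Url http://intranet.initech -Zone Intranet
New-SPAlternateURL -ResourceName $resource -Url https://extranet.initech.com -Zone Extranet

# Review the mappings now defined for the external resource
Get-SPAlternateURL -ResourceName $resource
```

The zones and URLs here need to stay in sync with the content farm's AAMs; if those change, the External Resource Mapping has to be updated by hand.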