Microsoft URL Rewrite Module 2.0 for IIS is an incremental release that includes all the features of version 1.1 and adds support for rewriting response headers and content. The module applies regular expression or wildcard patterns to the HTTP response to locate and replace parts of the content, based on the rewriting logic expressed by outbound rewrite rules. More specifically, the module can be used to:
WARNING: When response headers or response content are modified by an outbound rewrite rule, extra caution should be taken to ensure that the text inserted into the response does not contain any client-side executable code, which can result in cross-site scripting vulnerabilities. This is especially important when a rewrite rule uses untrusted data, such as HTTP headers or the query string, to build the string that will be inserted into the HTTP response. In such cases the replacement string should be HTML-encoded by using the HtmlEncode function, e.g.:
<action type="Rewrite" value="{HtmlEncode:{HTTP_REFERER}}" />
The main configuration concept used for response rewriting is the concept of an outbound rule. An outbound rule is used to express the logic of what to compare or match the response content with and what to do if the comparison was successful.
Conceptually, an outbound rule consists of the following parts:
The process of executing outbound rules is different from the one used for inbound rules. The inbound ruleset is evaluated only once per request because its input is just a single request URL string. The outbound ruleset may be evaluated many times per response because it is applied in multiple places within the HTTP response content. For example, if there is a ruleset as below:
Rule 1: applies to <a> tag and <img> tag
Rule 2: applies to <a> tag
and the HTML response contains this markup:
<a href="/default.aspx"><img src="/logo.jpg" />Home Page</a>
Then URL Rewrite Module 2.0 will evaluate Rule 1 against the "/default.aspx" string. If the rule executes successfully, the output of Rule 1 is given to Rule 2. If Rule 2 executes successfully, the output of Rule 2 is used to replace the content of the href attribute in the <a> tag in the response.
After that, URL Rewrite Module 2.0 will evaluate Rule 1 against the "/logo.jpg" string. If the rule executes successfully, its output is used to replace the content of the src attribute in the <img> tag in the response.
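The ruleset described above might be written roughly as follows. This is a sketch only: the rule names, patterns, and replacement values are hypothetical and were chosen purely to make the evaluation order visible.

```xml
<outboundRules>
  <!-- Rule 1: applies to <a> and <img> tags (pattern and value are illustrative) -->
  <rule name="Rule 1">
    <match filterByTags="A, Img" pattern="^/(.*)" />
    <action type="Rewrite" value="/app/{R:1}" />
  </rule>
  <!-- Rule 2: applies to <a> tags only; its input is the output of Rule 1 -->
  <rule name="Rule 2">
    <match filterByTags="A" pattern="^/app/(.*)\.aspx$" />
    <action type="Rewrite" value="/app/{R:1}" />
  </rule>
</outboundRules>
```

Under the evaluation order described above, the href value "/default.aspx" would pass through both rules in turn, while the src value "/logo.jpg" would be seen by Rule 1 only.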
If rules are defined on multiple configuration levels, the URL Rewrite Module evaluates a rule set that includes the distributed rules from parent configuration levels as well as the rules from the current configuration level. The evaluation is performed in parent-to-child order, which means that parent rules are evaluated first and the rules defined on the last child level are evaluated last.
Pre-conditions are used to check whether a rule should be evaluated against the response content. A pre-conditions collection is defined as a named collection within the <preConditions> section, and it may contain one or more pre-condition checks. The outbound rule references the pre-conditions collection by name.
A pre-conditions collection has an attribute called logicalGrouping that controls how conditions are evaluated. A pre-conditions collection evaluates to true if:
A pre-condition is defined by specifying the following properties:
In addition, the result of the pre-condition evaluation can be negated by using the negate attribute.
An example of a pre-condition that checks if the response content type is text/html:
<preConditions>
<preCondition name="IsHTML">
<add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
</preCondition>
</preConditions>
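An outbound rule could reference such a collection through the preCondition attribute of the <rule> element. The following is a minimal sketch: the collection name IsHTMLNotRedirect, the rule name, the patterns, and the replacement value are placeholders, and the second check is included only to show the negate attribute.

```xml
<outboundRules>
  <rule name="Rewrite article links" preCondition="IsHTMLNotRedirect">
    <match filterByTags="A" pattern="^/article\.aspx\?id=([0-9]+)$" />
    <action type="Rewrite" value="/article/{R:1}" />
  </rule>
  <preConditions>
    <!-- MatchAll: every check in the collection must succeed -->
    <preCondition name="IsHTMLNotRedirect" logicalGrouping="MatchAll">
      <add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
      <!-- negate="true" inverts the result: succeed only if the status is NOT 30x -->
      <add input="{RESPONSE_STATUS}" pattern="^30[0-9]" negate="true" />
    </preCondition>
  </preConditions>
</outboundRules>
```

Setting logicalGrouping to MatchAny instead would let the collection succeed when at least one of its checks succeeds.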
Tag filters are used to narrow down the search within the response content to a set of well-known or custom-defined HTML tags. When a rewrite rule uses tag filters, then instead of matching the rule pattern against the entire response, URL Rewrite Module 2.0 looks for the HTML tags that are listed in the rule's tag filter, takes the content of the URL attribute of each such tag, and evaluates it against the rule's pattern. Tag filters are specified within the filterByTags attribute of the <match> element of an outbound rule. For example:
<match filterByTags="A" pattern="^/(article\.aspx.*)" />
If an HTTP response contains an anchor tag such as:
<a href="/article.aspx?id=1">link</a>
Then the rewrite rule pattern will be evaluated against the string: "/article.aspx?id=1".
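For context, a complete rule built around this match element might look like the sketch below. The rule name and the "/site/" prefix in the action are assumptions made for illustration.

```xml
<rule name="Prefix article links">
  <match filterByTags="A" pattern="^/(article\.aspx.*)" />
  <!-- For the anchor tag above, {R:1} captures "article.aspx?id=1" -->
  <action type="Rewrite" value="/site/{R:1}" />
</rule>
```

With this rule, the href attribute of the anchor tag above would be rewritten to "/site/article.aspx?id=1", while attributes of other tags would be left alone.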
URL Rewrite Module 2.0 includes a set of pre-defined tags that can be used with outbound rules. The table below lists all the pre-defined tags and the attributes whose values are used as input for the outbound rule pattern:
Tag | Attributes |
---|---|
A | href |
Area | href |
Base | href |
Form | action |
Frame | src, longdesc |
Head | profile |
IFrame | src, longdesc |
Img | src, longdesc, usemap |
Input | src, usemap |
Link | href |
Script | src |
If rewriting needs to be performed within an attribute of a tag that is not included in the pre-defined tags collection, a custom tags collection can be used to specify the tag name and the corresponding attribute that needs to be rewritten. A custom tags collection is defined as a named collection within the <customTags> section. An outbound rule references a custom tags collection by name.
The following example shows a definition of a custom tags collection:
<customTags>
<tags name="My Tags">
<tag name="item" attribute="src" />
<tag name="element" attribute="src" />
</tags>
</customTags>
This custom tags collection can be referenced from an outbound rule as shown in the example below:
<match filterByTags="A, CustomTags" customTags="My Tags" pattern="^/(article\.aspx.*)" />
A rule pattern is used to specify what the rule input string should be matched to. Rule input differs based on the rule configuration:
The pattern is specified within the <match> element of a rewrite rule.
When the filterByTags attribute is not specified in the <match> element of the rule, the pattern is applied to the entire response content. Evaluating regular expression patterns against the entire response content is a CPU-intensive operation and may affect the performance of the web application. There are several options to reduce the performance overhead introduced by full-response pattern matching:
<outboundRules rewriteBeforeCache="true">
Note that this setting should not be used if chunked transfer encoding is used for responses.
<match pattern="&lt;/head&gt;" occurrences="1" />
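Put together, the two settings shown above could appear in a configuration like the following sketch. The rule name and the inserted <meta> element are hypothetical. Because the markup lives inside XML attribute values, the angle brackets and quotes are written as entities.

```xml
<outboundRules rewriteBeforeCache="true">
  <rule name="Insert into head">
    <!-- No filterByTags attribute: the pattern runs against the whole response.
         occurrences="1" is intended to limit the search to the first match. -->
    <match pattern="&lt;/head&gt;" occurrences="1" />
    <!-- Re-emit the closing tag after the inserted fragment (fragment is illustrative) -->
    <action type="Rewrite" value="&lt;meta name=&quot;generator&quot; content=&quot;example&quot; /&gt;&lt;/head&gt;" />
  </rule>
</outboundRules>
```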
Rule pattern syntax can be specified by using the patternSyntax attribute of a rule. This attribute can be set to one of the following options:
ECMAScript - Perl-compatible (ECMAScript standard compliant) regular expression syntax. This is the default option for any rule. This is an example of the pattern format: "^([_0-9a-zA-Z-]+/)?(wp-.*)"
Wildcard - Wildcard syntax used in the IIS HTTP redirection module. This is an example of a pattern in this format: "/Scripts/*.js", where the asterisk ("*") means "match any number of any characters and capture them in a back-reference". Note that the Wildcard pattern type cannot be used when the rule does not have any tag filters.
ExactMatch - An exact string search is performed within the input string.
The scope of the patternSyntax attribute is per rule, meaning that it applies to the current rule’s pattern and to all patterns used within conditions of that rule.
A pattern can be negated by using the negate attribute of the <match> element. When this attribute is used, the rule action is performed only if the input string does NOT match the specified pattern.
By default, pattern matching is case-insensitive. To enable case sensitivity, set the ignoreCase attribute of the rule's <match> element to false.
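The pattern options above can be combined in a single rule. The sketch below assumes a hypothetical "/static/js/" target path and an invented rule name. It uses Wildcard syntax, which requires a tag filter, together with a case-sensitive match.

```xml
<rule name="Rewrite script paths" patternSyntax="Wildcard">
  <!-- The asterisk is captured as {R:1}; ignoreCase="false" makes the match case-sensitive -->
  <match filterByTags="Script" pattern="/Scripts/*.js" ignoreCase="false" />
  <action type="Rewrite" value="/static/js/{R:1}.js" />
</rule>
```

Because patternSyntax is set on the rule, it would also govern any condition patterns added to this rule later.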
Rule conditions allow defining additional logic for rule evaluation, which can be based on inputs other than just a current input string. Any rule can have zero or more conditions. Rule conditions are evaluated after the rule pattern match is successful.
Conditions are defined within a <conditions> collection of a rewrite rule. This collection has an attribute called logicalGrouping that controls how conditions are evaluated. If a rule has conditions, then the rule action will be performed only if rule pattern is matched and:
A condition is defined by specifying the following properties:
A rewrite rule action is performed when the input string matches the rule pattern and the condition evaluation has succeeded (depending on rule configuration, either all conditions matched or any one or more of the conditions matched). There are two types of actions available, and the type attribute of the <action> configuration element specifies which action the rule performs. The following sections describe the action types and the configuration options related to each of them.
The Rewrite action replaces the current rule input string with a substitution string. The substitution string is specified within the value attribute of the <action> element of the rule. The substitution string is a free-form string that can include the following:
The None action is used to specify that no action should be performed.
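As an illustration of how conditions and the two action types fit together, consider the sketch below. The rule names, patterns, and the "staging." host prefix are hypothetical.

```xml
<rule name="Rewrite old links">
  <match filterByTags="A" pattern="^/old/(.*)" />
  <!-- MatchAny: the action runs if the pattern matched and at least one condition matches -->
  <conditions logicalGrouping="MatchAny">
    <add input="{RESPONSE_STATUS}" pattern="^200" />
    <add input="{HTTP_HOST}" pattern="^staging\." />
  </conditions>
  <!-- Rewrite: replace the matched attribute value with the substitution string -->
  <action type="Rewrite" value="/new/{R:1}" />
</rule>
<rule name="Leave images untouched">
  <match filterByTags="Img" pattern=".*" />
  <!-- None: the rule matches but performs no rewriting -->
  <action type="None" />
</rule>
```

Changing logicalGrouping to MatchAll would require both conditions to match before the Rewrite action is performed.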
The content of any response HTTP header can be obtained from within a rewrite rule by using the same syntax as for accessing server variables, but with a special naming convention. If a server variable starts with "RESPONSE_", then it stores the content of an HTTP response header whose name is determined by using the following naming convention:
For example, the following pre-condition is used to evaluate the content of the Content-Type header:
<preCondition name="IsHTML">
<add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
</preCondition>
Inbound rewrite rules in URL Rewrite Module 2.0 can be used to set request headers and server variables.
Global rewrite rules can be used to set any request headers and server variables, as well as overwrite any existing ones. Distributed rewrite rules can only set/overwrite the request headers and server variables that are defined in the allowed list for server variables <allowedServerVariables>. If a distributed rewrite rule attempts to set any server variable or an HTTP header that is not listed in the <allowedServerVariables> collection a runtime error will be generated by URL Rewrite Module. The <allowedServerVariables> collection by default is stored in applicationHost.config file and can be modified only by an IIS server administrator.
The <serverVariables> element of a rule is used to define a collection of server variables and HTTP headers to set. They are set only if the rule pattern has matched and the condition evaluation has succeeded (depending on rule configuration, either all conditions matched or any one or more of the conditions matched). Each item in the <serverVariables> collection consists of the following:
The following example rule rewrites the requested URL and also sets the server variable with name X_REQUESTED_URL_PATH:
<rule name="Rewrite to index.php" stopProcessing="true">
<match url="(.*)\.htm$" />
<serverVariables>
<set name="X_REQUESTED_URL_PATH" value="{R:1}" />
</serverVariables>
<action type="Rewrite" url="index.php?path={R:1}" />
</rule>
Note: for the above example to work it is required to add X_REQUESTED_URL_PATH to the <allowedServerVariables> collection:
<rewrite>
<allowedServerVariables>
<add name="X_REQUESTED_URL_PATH" />
</allowedServerVariables>
</rewrite>
Request headers are set by using the same mechanism as for server variables, but with a special naming convention. If a server variable name in the <serverVariables> collection starts with "HTTP_", this results in an HTTP request header being set in accordance with the following naming convention:
For example, the following configuration is used to set the custom x-original-host header on the request:
<set name="HTTP_X_ORIGINAL_HOST" value="{HTTP_HOST}" />
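In a distributed rule, this <set> element would need to sit inside a rule, and the variable would need to be listed in <allowedServerVariables>. The sketch below shows both fragments. The rule name and the catch-all pattern are assumptions, and the file placement noted in the comments reflects the default described earlier.

```xml
<!-- applicationHost.config (server level): allow the header to be set -->
<rewrite>
  <allowedServerVariables>
    <add name="HTTP_X_ORIGINAL_HOST" />
  </allowedServerVariables>
</rewrite>

<!-- Distributed rule (for example in web.config): set the header on every request -->
<rule name="Set original host header">
  <match url=".*" />
  <serverVariables>
    <set name="HTTP_X_ORIGINAL_HOST" value="{HTTP_HOST}" />
  </serverVariables>
  <!-- The URL itself is left unchanged; only the header is set -->
  <action type="None" />
</rule>
```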
Outbound rewrite rules in URL Rewrite Module 2.0 can be used to set new or modify existing response HTTP headers. The response HTTP headers are accessed within the outbound rules by using the same syntax as for server variables and by using the naming convention as described in Accessing Response Headers from Rewrite Rules.
If the serverVariable attribute of the <match> element of an outbound rewrite rule has a value then it indicates that this rewrite rule will operate on the content of the corresponding response header. For example, the following rule sets the response header "x-custom-header":
<outboundRules>
<rule name="Set Custom Header">
<match serverVariable="RESPONSE_X_Custom_Header" pattern="^$" />
<action type="Rewrite" value="Something" />
</rule>
</outboundRules>
The pattern of the rewrite rule is applied to the content of the specified response header, and if the rule's pattern and optional conditions evaluate successfully, the value of that response header is rewritten.
Regular expression patterns and easy access to existing request and response headers within a rewrite rule provide a lot of flexibility when defining the logic for rewriting response HTTP headers. For example, the following rewrite rule can be used to modify the content of the Location header in redirection responses:
<outboundRules>
<!-- This rule changes the domain in the HTTP location header for redirection responses -->
<rule name="Change Location Header">
<match serverVariable="RESPONSE_LOCATION" pattern="^http://[^/]+/(.*)" />
<conditions>
<add input="{RESPONSE_STATUS}" pattern="^301" />
</conditions>
<action type="Rewrite" value="http://{HTTP_HOST}/{R:1}"/>
</rule>
</outboundRules>
Parts of rule or condition inputs can be captured in back-references. These can then be used to construct substitution URLs within rule actions or to construct input strings for rule conditions.
Back-references are generated in different ways, depending on which kind of pattern syntax is used for the rule. When ECMAScript pattern syntax is used, a back-reference can be created by putting parentheses around the part of the pattern that must capture the back-reference. For example, the pattern ([0-9]+)/([a-z]+)\.html will capture 07 and article in back-references from this string: 07/article.html. When Wildcard pattern syntax is used, a back-reference is always created wherever an asterisk symbol (*) is used in the pattern.
Usage of back-references is the same regardless of which pattern syntax was used to capture them. Back-references can be used in the following locations within rewrite rules:
Back-references to condition patterns are identified by {C:N}, where N is from 0 to 9; back-references to the rule pattern are identified by {R:N}, where N is from 0 to 9. Note that for both types of back-references, {R:0} and {C:0} contain the entire matched string.
For example, given this condition pattern:
^(www\.)(.*)$
and the input string www.foo.com, the back-references will be indexed as follows:
{C:0} - www.foo.com
{C:1} - www.
{C:2} - foo.com
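To show these back-references in use, the sketch below places the pattern in a condition on {HTTP_HOST} and uses {C:2} in the action. It is an illustrative inbound rule, not part of the original example: the rule name is invented, and it uses the inbound Redirect action, which is not covered in this section.

```xml
<rule name="Remove www prefix" stopProcessing="true">
  <!-- {R:1} captures the requested URL path -->
  <match url="(.*)" />
  <conditions>
    <!-- For the host www.foo.com: {C:1} is "www." and {C:2} is "foo.com" -->
    <add input="{HTTP_HOST}" pattern="^(www\.)(.*)$" />
  </conditions>
  <action type="Redirect" url="http://{C:2}/{R:1}" redirectType="Permanent" />
</rule>
```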
By default, within a rule action, you can use the back-references to the rule pattern and to the last matched condition of that rule. For example, in this rule:
<rule name="Back-references with trackAllCaptures set to false">
<match url="^article\.aspx" />
<conditions>
<add input="{QUERY_STRING}" pattern="p1=([0-9]+)" />
<add input="{QUERY_STRING}" pattern="p2=([a-z]+)" />
</conditions>
<action type="Rewrite" url="article.aspx/{C:1}" /> <!-- rewrite action uses back-references to the last matched condition -->
</rule>
The back-reference {C:1} will always contain the value of the capture group from the second condition, which will be the value of query string parameter p2. The value of p1 will not be available as a back-reference.
In URL Rewrite Module 2.0, it is possible to change how capture groups are indexed. Setting the trackAllCaptures attribute to true on the <conditions> collection makes the capture groups from all matched conditions available through back-references. For example, in this rule:
<rule name="Back-references with trackAllCaptures set to true">
<match url="^article\.aspx" />
<conditions trackAllCaptures="true">
<add input="{QUERY_STRING}" pattern="p1=([0-9]+)" />
<add input="{QUERY_STRING}" pattern="p2=([a-z]+)" />
</conditions>
<action type="Rewrite" url="article.aspx/{C:1}/{C:2}" /> <!-- rewrite action uses back-references to both conditions -->
</rule>
The back-reference {C:1} will contain the value of the capture group from the first condition, and the back-reference {C:2} will contain the value of the capture group from the second condition.
When trackAllCaptures is set to true, the condition capture back-references are identified by {C:N}, where N is from 0 to the total number of capture groups across all the rule's conditions. {C:0} contains the entire matched string from the first matched condition. For example, for these two conditions:
<conditions trackAllCaptures="true">
<add input="{REQUEST_URI}" pattern="^/([a-zA-Z]+)/([0-9]+)/$" />
<add input="{QUERY_STRING}" pattern="p2=([a-z]+)" />
</conditions>
If {REQUEST_URI} contains "/article/23/" and {QUERY_STRING} contains "p1=123&p2=abc", then the condition back-references will be indexed as follows:
{C:0} - "/article/23/"
{C:1} - "article"
{C:2} - "23"
{C:3} - "abc"
A distributed inbound rewrite rule can be configured to log rewritten URLs into the IIS log files, instead of logging the original URLs requested by the HTTP client. To enable logging of rewritten URLs, use the logRewrittenUrl attribute of the rule's <action> element, e.g.:
<rule name="set server variables">
<match url="^article/(\d+)$" />
<action type="Rewrite" url="article.aspx?id={R:1}" logRewrittenUrl="true" />
</rule>
Source:
http://www.iis.net/learn/extensions/url-rewrite-module/url-rewrite-module-20-configuration-reference