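When writing a script that simulates a login, you usually run into the problem of setting cookies. The common approach is to take the Set-Cookie value from the response and put it into the Cookie header of the next request.
When the response carries multiple Set-Cookie headers, that approach breaks. As the spec below describes, the correct handling is to order the cookies so that the one with the longer (more specific) path comes first.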
http://curl.haxx.se/rfc/cookie_spec.html
This is the original spec once found on netscape.com but since they decided to not keep the original URL alive and working, we host it here and point to this URL instead. Go to curl.haxx.se for more curl info
PERSISTENT CLIENT STATE
HTTP COOKIES
Preliminary Specification - Use with caution
INTRODUCTION
Cookies are a general mechanism which server side connections (such as CGI scripts) can use to both store and retrieve information on the client side of the connection. The addition of a simple, persistent, client-side state significantly extends the capabilities of Web-based client/server applications.
OVERVIEW
A server, when returning an HTTP object to a client, may also send a piece of state information which the client will store. Included in that state object is a description of the range of URLs for which that state is valid. Any future HTTP requests made by the client which fall in that range will include a transmittal of the current value of the state object from the client back to the server. The state object is called a cookie, for no compelling reason.
This simple mechanism provides a powerful new tool which enables a host of new types of applications to be written for web-based environments. Shopping applications can now store information about the currently selected items, for fee services can send back registration information and free the client from retyping a user-id on next connection, sites can store per-user preferences on the client, and have the client supply those preferences every time that site is connected to.
SPECIFICATION
A cookie is introduced to the client by including a Set-Cookie header as part of an HTTP response, typically this will be generated by a CGI script.
Syntax of the Set-Cookie HTTP Response Header
This is the format a CGI script would use to add to the HTTP headers a new piece of data which is to be stored by the client for later retrieval.
Set-Cookie: NAME=VALUE; expires=DATE;
path=PATH; domain=DOMAIN_NAME; secure
NAME=VALUE
This string is a sequence of characters excluding semi-colon, comma and whitespace. If there is a need to place such data in the name or value, some encoding method such as URL style %XX encoding is recommended, though no encoding is defined or required.
This is the only required attribute on the Set-Cookie header.
expires=DATE
The expires attribute specifies a date string that defines the valid life time of that cookie. Once the expiration date has been reached, the cookie will no longer be stored or given out.
The date string is formatted as:
Wdy, DD-Mon-YYYY HH:MM:SS GMT
This is based on RFC 822, RFC 850, RFC 1036, and RFC 1123, with the variations that the only legal time zone is GMT and the separators between the elements of the date must be dashes.
expires is an optional attribute. If not specified, the cookie will expire when the user's session ends.
Note: There is a bug in Netscape Navigator version 1.1 and earlier. Only cookies whose path attribute is set explicitly to "/" will be properly saved between sessions if they have an expires attribute.
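The date format above can be checked with a few lines of Python. This is only a sketch: the function names `parse_expires` and `is_expired` are made up for illustration, and it accepts only the exact `Wdy, DD-Mon-YYYY HH:MM:SS GMT` form (the sample headers later in the spec use a longer weekday name and a two-digit year, which this sketch would reject).

```python
from datetime import datetime, timezone

# strptime pattern for "Wdy, DD-Mon-YYYY HH:MM:SS GMT".
# %a and %b depend on the process locale; an English/C locale is assumed.
COOKIE_DATE_FORMAT = "%a, %d-%b-%Y %H:%M:%S GMT"


def parse_expires(value):
    """Parse an expires string into a timezone-aware UTC datetime."""
    parsed = datetime.strptime(value, COOKIE_DATE_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def is_expired(value, now=None):
    """Return True once the expiration date has been reached."""
    if now is None:
        now = datetime.now(timezone.utc)
    return parse_expires(value) <= now
```

A script could call `is_expired(...)` before sending a stored cookie and drop it when the result is true.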
domain=DOMAIN_NAME
When searching the cookie list for valid cookies, a comparison of the domain attributes of the cookie is made with the Internet domain name of the host from which the URL will be fetched. If there is a tail match, then the cookie will go through path matching to see if it should be sent. "Tail matching" means that the domain attribute is matched against the tail of the fully qualified domain name of the host. A domain attribute of "acme.com" would match host names "anvil.acme.com" as well as "shipping.crate.acme.com".
Only hosts within the specified domain can set a cookie for a domain and domains must have at least two (2) or three (3) periods in them to prevent domains of the form: ".com", ".edu", and "va.us". Any domain that falls within one of the seven special top level domains listed below only requires two periods. Any other domain requires at least three. The seven special top level domains are: "COM", "EDU", "NET", "ORG", "GOV", "MIL", and "INT".
The default value of domain is the host name of the server which generated the cookie response.
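Tail matching and the period-count rule can be sketched as follows. The helper names `tail_match` and `has_enough_periods` are hypothetical. The tail match is implemented literally as a suffix comparison, which is how the text reads; note that a plain suffix test also accepts a host such as "badacme.com" for the domain "acme.com", so a real client would likely want a stricter check.

```python
# The seven special top level domains named in the spec.
SPECIAL_TLDS = {"com", "edu", "net", "org", "gov", "mil", "int"}


def tail_match(cookie_domain, host):
    """Literal tail match: the host name must end with the domain attribute."""
    return host.lower().endswith(cookie_domain.lower())


def has_enough_periods(cookie_domain):
    """Apply the two-or-three periods rule to a domain attribute."""
    tld = cookie_domain.rsplit(".", 1)[-1].lower()
    required = 2 if tld in SPECIAL_TLDS else 3
    return cookie_domain.count(".") >= required
```

For example, `tail_match("acme.com", "shipping.crate.acme.com")` is true, while `has_enough_periods(".com")` and `has_enough_periods("va.us")` are both false.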
path=PATH
The path attribute is used to specify the subset of URLs in a domain for which the cookie is valid. If a cookie has already passed domain matching, then the pathname component of the URL is compared with the path attribute, and if there is a match, the cookie is considered valid and is sent along with the URL request. The path "/foo" would match "/foobar" and "/foo/bar.html". The path "/" is the most general path.
If the path is not specified, it is assumed to be the same path as the document being described by the header which contains the cookie.
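Judging by the "/foo" examples, path matching behaves like a plain prefix comparison. A minimal sketch, with `path_match` as an illustrative name rather than anything defined by the spec:

```python
def path_match(cookie_path, request_path):
    """Prefix match: the request path must start with the cookie's path."""
    return request_path.startswith(cookie_path)
```

With this definition "/foo" matches both "/foobar" and "/foo/bar.html", and "/" matches every request path.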
secure
If a cookie is marked secure, it will only be transmitted if the communications channel with the host is a secure one. Currently this means that secure cookies will only be sent to HTTPS (HTTP over SSL) servers.
If secure is not specified, a cookie is considered safe to be sent in the clear over unsecured channels.
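For a login script, the attributes described above have to be pulled out of each Set-Cookie value first. The following is a rough sketch of such a parser. The name `parse_set_cookie` and the dictionary layout are assumptions for illustration, and it expects only the header value, without the leading "Set-Cookie:".

```python
def parse_set_cookie(header_value):
    """Split one Set-Cookie value into its NAME=VALUE pair and attributes."""
    parts = [p.strip() for p in header_value.split(";") if p.strip()]
    # The first element is the required NAME=VALUE pair.
    name, _, value = parts[0].partition("=")
    cookie = {
        "name": name,
        "value": value,
        "expires": None,
        "path": None,
        "domain": None,
        "secure": False,
    }
    for attribute in parts[1:]:
        key, _, attr_value = attribute.partition("=")
        key = key.strip().lower()
        if key == "secure":
            cookie["secure"] = True
        elif key in ("expires", "path", "domain"):
            cookie[key] = attr_value.strip()
        # Unknown attributes are ignored in this sketch.
    return cookie
```

The expires date contains a comma but no semicolon, so splitting on ";" keeps it in one piece.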
Syntax of the Cookie HTTP Request Header
When requesting a URL from an HTTP server, the browser will match the URL against all cookies and if any of them match, a line containing the name/value pairs of all matching cookies will be included in the HTTP request. Here is the format of that line:
Cookie: NAME1=OPAQUE_STRING1; NAME2=OPAQUE_STRING2 ...
Additional Notes
- Multiple Set-Cookie headers can be issued in a single server response.
- Instances of the same path and name will overwrite each other, with the latest instance taking precedence. Instances of the same path but different names will add additional mappings.
- Setting the path to a higher-level value does not override other more specific path mappings. If there are multiple matches for a given cookie name, but with separate paths, all the matching cookies will be sent. (See examples below.)
- The expires header lets the client know when it is safe to purge the mapping but the client is not required to do so. A client may also delete a cookie before its expiration date arrives if the number of cookies exceeds its internal limits.
- When sending cookies to a server, all cookies with a more specific path mapping should be sent before cookies with less specific path mappings. For example, a cookie "name1=foo" with a path mapping of "/" should be sent after a cookie "name1=foo2" with a path mapping of "/bar" if they are both to be sent.
- There are limitations on the number of cookies that a client can store at any one time. This is a specification of the minimum number of cookies that a client should be prepared to receive and store.
- 300 total cookies
- 4 kilobytes per cookie, where the name and the OPAQUE_STRING combine to form the 4 kilobyte limit.
- 20 cookies per server or domain. (note that completely specified hosts and domains are treated as separate entities and have a 20 cookie limitation for each, not combined)
Servers should not expect clients to be able to exceed these limits. When the 300 cookie limit or the 20 cookie per server limit is exceeded, clients should delete the least recently used cookie. When a cookie larger than 4 kilobytes is encountered the cookie should be trimmed to fit, but the name should remain intact as long as it is less than 4 kilobytes.
- If a CGI script wishes to delete a cookie, it can do so by returning a cookie with the same name, and an expires time which is in the past. The path and name must match exactly in order for the expiring cookie to replace the valid cookie. This requirement makes it difficult for anyone but the originator of a cookie to delete a cookie.
- When caching HTTP, as a proxy server might do, the Set-cookie response header should never be cached.
- If a proxy server receives a response which contains a Set-cookie header, it should propagate the Set-cookie header to the client, regardless of whether the response was 304 (Not Modified) or 200 (OK).
Similarly, if a client request contains a Cookie: header, it should be forwarded through a proxy, even if the conditional If-modified-since request is being made.
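The ordering note is the part that matters most for a simulated login: when several stored cookies match a request, the ones with longer paths go first in the Cookie header. Below is a minimal sketch of that step. The function name `build_cookie_header` and the `(name, value, path)` tuple layout are assumptions for illustration, path length stands in for "more specific", and the ordering is applied to all matching cookies regardless of name.

```python
def build_cookie_header(cookies, request_path):
    """Build a Cookie header value from (name, value, path) tuples.

    Only cookies whose path is a prefix of request_path are included,
    and cookies with longer paths are placed first.
    """
    matching = [c for c in cookies if request_path.startswith(c[2])]
    # sorted() is stable, so cookies with equal path length keep their
    # original (insertion) order.
    ordered = sorted(matching, key=lambda c: len(c[2]), reverse=True)
    return "; ".join(f"{name}={value}" for name, value, _path in ordered)


stored = [
    ("PART_NUMBER", "ROCKET_LAUNCHER_0001", "/"),
    ("PART_NUMBER", "RIDING_ROCKET_0023", "/ammo"),
]
print("Cookie: " + build_cookie_header(stored, "/ammo"))
```

With the two PART_NUMBER cookies shown here, a request for "/ammo" sends the "/ammo" cookie ahead of the "/" cookie, and a request for "/" sends only the "/" cookie.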
EXAMPLES
Here are some sample exchanges which are designed to illustrate the use of cookies.
First Example transaction sequence:
Client requests a document, and receives in the response:
Set-Cookie: CUSTOMER=WILE_E_COYOTE; path=/; expires=Wednesday, 09-Nov-99 23:12:40 GMT
When client requests a URL in path "/" on this server, it sends:
Cookie: CUSTOMER=WILE_E_COYOTE
Client requests a document, and receives in the response:
Set-Cookie: PART_NUMBER=ROCKET_LAUNCHER_0001; path=/
When client requests a URL in path "/" on this server, it sends:
Cookie: CUSTOMER=WILE_E_COYOTE; PART_NUMBER=ROCKET_LAUNCHER_0001
Client receives:
Set-Cookie: SHIPPING=FEDEX; path=/foo
When client requests a URL in path "/" on this server, it sends:
Cookie: CUSTOMER=WILE_E_COYOTE; PART_NUMBER=ROCKET_LAUNCHER_0001
When client requests a URL in path "/foo" on this server, it sends:
Cookie: CUSTOMER=WILE_E_COYOTE; PART_NUMBER=ROCKET_LAUNCHER_0001; SHIPPING=FEDEX
Second Example transaction sequence:
Assume all mappings from above have been cleared.
Client receives:
Set-Cookie: PART_NUMBER=ROCKET_LAUNCHER_0001; path=/
When client requests a URL in path "/" on this server, it sends:
Cookie: PART_NUMBER=ROCKET_LAUNCHER_0001
Client receives:
Set-Cookie: PART_NUMBER=RIDING_ROCKET_0023; path=/ammo
When client requests a URL in path "/ammo" on this server, it sends:
Cookie: PART_NUMBER=RIDING_ROCKET_0023; PART_NUMBER=ROCKET_LAUNCHER_0001
NOTE: There are two name/value pairs named "PART_NUMBER" due to the inheritance of the "/" mapping in addition to the "/ammo" mapping.