HttpClient supports automatic management of cookies, including allowing the server to set cookies and automatically return them to the server when required. It is also possible to manually set cookies to be sent to the server.
Unfortunately, there are several, at times conflicting, standards for handling cookies: the Netscape cookie draft, RFC2109, RFC2965, and a large number of vendor-specific implementations that comply with none of these specifications. To deal with this, HttpClient provides policy-driven cookie management. This guide explains how to use the different cookie specifications and identifies some of the common problems people have when using cookies with HttpClient.
The following cookie specifications are supported by HttpClient 3.1.
RFC2109 is the first official cookie specification, published by the IETF. Theoretically, all servers that handle version 1 cookies should use this specification, and as such it is used by default within HttpClient.
Unfortunately, many servers either implement this standard incorrectly or still use the Netscape draft, so this specification is occasionally too strict. If this is the case, switch to the compatibility specification as described below.
RFC2109 is available at http://www.w3.org/Protocols/rfc2109/rfc2109.txt
RFC2109 is the default cookie policy used by HttpClient.
RFC2965 defines cookie version 2 and attempts to address the shortcomings of RFC2109 regarding cookie version 1. RFC2965 is intended to eventually supersede RFC2109.
Servers that send RFC2965 cookies will use the Set-Cookie2 header in addition to the Set-Cookie header. RFC2965 cookies are port sensitive.
RFC2965 is available at http://www.w3.org/Protocols/rfc2965/rfc2965.txt
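To make "port sensitive" concrete, here is a minimal sketch of the port rule as RFC2965 describes it: a cookie without a Port attribute may be returned to any port, a Port attribute with no value restricts the cookie to the port it was received from, and a port list restricts it to the listed ports. The class and method names are hypothetical and are not part of the HttpClient API; the sketch also assumes the attribute value has already been unquoted.

```java
/** Hypothetical illustration of the RFC2965 port rule; not HttpClient code. */
final class PortMatchSketch {

    private PortMatchSketch() {}

    /**
     * @param portAttribute value of the cookie's Port attribute: null if the
     *                      attribute was absent, "" if present without a value,
     *                      otherwise a comma-separated port list such as "80,8080"
     * @param originPort    port of the request that received the cookie
     * @param requestPort   port of the request about to be sent
     */
    static boolean mayReturnCookie(String portAttribute, int originPort, int requestPort) {
        if (portAttribute == null) {
            return true; // no Port attribute: any port is acceptable
        }
        String list = portAttribute.trim();
        if (list.isEmpty()) {
            return requestPort == originPort; // bare Port: origin port only
        }
        for (String entry : list.split(",")) {
            // A malformed entry raises NumberFormatException in this sketch.
            if (Integer.parseInt(entry.trim()) == requestPort) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(mayReturnCookie(null, 80, 8080));
        System.out.println(mayReturnCookie("", 80, 8080));
        System.out.println(mayReturnCookie("80,8080", 80, 8080));
    }
}
```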
The Netscape draft is the original cookie specification, which formed the basis for RFC2109. Despite this, it differs significantly from RFC2109 and thus may be required for compatibility with some servers.
The Netscape cookie draft is available at http://wp.netscape.com/newsref/std/cookie_spec.html
The compatibility specification is designed to be compatible with as many different servers as possible even if they are not completely standards compliant. If you are encountering problems with parsing cookies, you should probably try using this specification.
There are many web sites with badly written CGI scripts that only work when all cookies are put into one request header. It is advisable to set the http.protocol.single-cookie-header parameter to true for maximum compatibility.
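The difference this setting makes can be pictured on the wire. The following sketch is a simplified illustration, not HttpClient's actual formatting code: the class and method names are hypothetical, and real cookie specifications may add attributes such as $Version or $Path that are omitted here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of one Cookie header versus one per cookie. */
final class CookieHeaderSketch {

    private CookieHeaderSketch() {}

    /** Returns the request header lines that would carry the given cookies. */
    static List<String> cookieHeaders(Map<String, String> cookies, boolean singleHeader) {
        List<String> headers = new ArrayList<String>();
        if (cookies.isEmpty()) {
            return headers; // nothing to send
        }
        if (singleHeader) {
            // All cookies share one header, separated by "; ".
            StringBuilder line = new StringBuilder("Cookie: ");
            boolean first = true;
            for (Map.Entry<String, String> cookie : cookies.entrySet()) {
                if (!first) {
                    line.append("; ");
                }
                line.append(cookie.getKey()).append('=').append(cookie.getValue());
                first = false;
            }
            headers.add(line.toString());
        } else {
            // Each cookie gets a header of its own.
            for (Map.Entry<String, String> cookie : cookies.entrySet()) {
                headers.add("Cookie: " + cookie.getKey() + "=" + cookie.getValue());
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        Map<String, String> cookies = new LinkedHashMap<String, String>();
        cookies.put("session", "abc");
        cookies.put("lang", "en");
        System.out.println(cookieHeaders(cookies, true));
        System.out.println(cookieHeaders(cookies, false));
    }
}
```

A script that reads only the first Cookie header would see both cookies in the single-header form but only the first one otherwise.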
There are two ways to specify which cookie specification should be used: either for each HttpMethod instance using HttpMethodParams, or by setting the default value on CookiePolicy.
In most cases, the best way to specify the cookie spec to use is the setCookiePolicy(String policy) method on HttpMethodParams. The value of policy must be one of the values registered with CookiePolicy.registerCookieSpec().
HttpMethod method = new GetMethod();
method.getParams().setCookiePolicy(CookiePolicy.RFC_2109);
The cookie management API of HttpClient can co-exist with manual cookie handling. One can manually set request Cookie headers or process response Set-Cookie headers in addition to, or instead of, the automatic cookie management.
HttpMethod method = new GetMethod();
method.getParams().setCookiePolicy(CookiePolicy.IGNORE_COOKIES);
method.setRequestHeader("Cookie", "special-cookie=value");
The most common problems encountered when parsing cookies are caused by non-compliant servers. In these cases, switching to the compatibility cookie specification usually solves the problem.
Since cookies are transferred as HTTP headers, they are confined to the US-ASCII character set. Other characters will be lost or mangled. Cookies are typically set and read by the same server, so a custom scheme for escaping non-ASCII characters can be used, for instance the well-established URL encoding scheme. If cookies are used to transfer data between server and client, both parties must agree on the custom escaping scheme. The HttpClient cookie implementation provides no special means of handling non-ASCII characters, nor does it issue warnings.
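One way to apply the URL encoding scheme mentioned above is sketched below using only the JDK's java.net.URLEncoder and java.net.URLDecoder. The helper class is hypothetical and not part of HttpClient, and the choice of UTF-8 is an assumption that the server must share.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/** Hypothetical helper for escaping cookie values; not part of HttpClient. */
final class CookieValueCodec {

    private CookieValueCodec() {}

    /** Percent-encodes the value as UTF-8 so the result is pure US-ASCII. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encode(); both sides must use the same charset. */
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00e9 au lait"; // contains a non-ASCII character
        String wire = encode(original);
        System.out.println(wire);
        System.out.println(decode(wire).equals(original));
    }
}
```

The encoded string can then be used wherever a cookie value is supplied, for example when building a manual Cookie request header, provided the receiving side decodes it the same way.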