Developers new to web services are often intimidated by parade of technologies and concepts required to understand it: REST, SOAP, WSDL, XML Schema, Relax NG, UDDI, MTOM, XOP, WS-I, WS-Security, WS-Addressing, WS-Policy, and a host of other WS-* specifications that seem to multiply like rabbits. Add to that the Java specifications, such as JAX-WS, JAX-RPC, SAAJ, etc. and the conceptual weight begins to become heavy indeed. In this series of articles I hope to shed some light on the dark corners of web services and help navigate the sea of alphabet soup (1). Along the way I'll also cover some tools for developing web services, and create a simple Web Service as an example. In this article I will give a high-level overview of both SOAP and REST.
Introduction
There are currently two schools of thought in developing web services: the traditional, standards-based approach (SOAP) and conceptually simpler and the trendier new kid on the block (REST). The decision between the two will be your first choice in designing a web service, so it is important to understand the pros and cons of the two. It is also important, in the sometimes heated debate between the two philosophies, to separate reality from rhetoric.
SOAP
In the beginning there was...SOAP. Developed at Microsoft in 1998, the inappropriately-named "Simple Object Access Protocol" was designed to be a platform and language-neutral alternative to previous middleware techologies like CORBA and DCOM. Its first public appearance was an Internet public draft (submitted to the IETF) in 1999; shortly thereafter, in December of 1999, SOAP 1.0 was released. In May of 2000 the 1.1 version was submitted to the W3C where it formed the heart of the emerging Web Services technologies. The current version is 1.2, finalized in 2005. The examples given in this article will all be SOAP 1.2.
Together with WSDL and XML Schema, SOAP has become the standard for exchanging XML-based messages. SOAP was also designed from the ground up to be extensible, so that other standards could be integrated into it--and there have been many, often collectively referred to as WS-*: WS-Addressing, WS-Policy, WS-Security, WS-Federation, WS-ReliableMessaging, WS-Coordination, WS-AtomicTransaction, WS-RemotePortlets, and the list goes on. Hence much of the perceived complexity of SOAP, as in Java, comes from the multitude of standards which have evolved around it. This should not be reason to be too concerned: as with other things, you only have to use what you actually need.
The basic structure of SOAP is like any other message format (including HTML itself): header and body. In SOAP 1.2 this would look something like
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope"> <env:Header> <!-- Header information here --> </env:Header> <env:Body> <!-- Body or "Payload" here, a Fault if error happened --> </env:Body> </env:Envelope>
Note that the <Header>
element is optional here, but the <Body>
is mandatory.
The SOAP <Header>
SOAP uses special attributes in the standard "soap-envelope" namespace to handle the extensibility elements that can be defined in the header. The most important of these is the mustUnderstand
attribute. By default, any element in the header can be safely ignored by the SOAP message recipient unless the the mustUnderstand
attribute on the element is set to "true" (or "1", which is the only value recognized in SOAP 1.1). A good example of this would be a security token element that authenticates the sender/requestor of the message. If for some reason the recipient is not able to process these elements, a fault should be delivered back to the sender with a fault code of MustUnderstand
.
Because SOAP is designed to be used in a network environment with multiple intermediaries (SOAP "nodes" as identified by the <Node>
element), it also defines the special XML attributes role
to manage which intermediary should process a given header element and relay
, which is used to indicate that this element should be passed to the next node if not processed in the current one.
The SOAP <Body>
The SOAP body contains the "payload" of the message, which is defined by the WSDL's <Message>
part. If there is an error that needs to be transmitted back to the sender, a single <Fault>
element is used as a child of the <Body>
.
The SOAP <Fault>
The <Fault>
is the standard element for error handling. When present, it is the only child element of the SOAP <Body>
. The structure of a fault looks like:
<env:Fault xmlns:m="http://www.example.org/timeouts"> <env:Code> <env:Value>env:Sender</env:Value> <env:Subcode> <env:Value>m:MessageTimeout</env:Value> </env:Subcode> </env:Code> <env:Reason> <env:Text xml:lang="en">Sender Timeout</env:Text> </env:Reason> <env:Detail> <m:MaxTime>P5M</m:MaxTime> </env:Detail> </env:Fault>
Here, only the <Code>
and <Reason>
child elements are required, and the <Subcode>
child of <Code>
is also optional. The body of the Code/Value element is a fixed enumeration with the values:
VersionMismatch
: this indicates that the node that "threw" the fault found an invalid element in the SOAP envelope, either an incorrect namespace, incorrect local name, or both.MustUnderstand
: as discussed above, this code indicates that a header element with the attributemustUnderstand="true"
could not be processed by the node throwing the fault. ANotUnderstood
header block should be provided to detail all of the elements in the original message which were not understood.DataEncodingUnknown
: the data encoding specified in the envelope'sencodingSytle
attribute is not supported by the node throwing the fault.Sender
: This is a "catch-all" code indicating that the message sent was not correctly formed or did not have the appropriate information to succeed.Receiver
: Another "catch-all" code indicating that the message could not be processed for reasons attributable to the processing of the message rather than to the contents of the message itself.
Subcodes, however, are not restricted and are application-defined; these will commonly be defined when the fault code is Sender
or Receiver
. The <Reason>
element is there to provide a human-readable explanation of the fault. The optional <Detail>
element is there to provide additional information about the fault, such as (in the example above) the timeout value. <Fault>
also has optional children <Node>
and <Role>
, indicating which node threw the fault and the role that the node was operating in (see role
attribute above) respectively.
SOAP Encoding
Section 5 of the SOAP 1.1 specification describes SOAP encoding, which was originally developed as a convenience for serializing and de-serializing data types to and from other sources, such as databases and programming languages. Problems, however, soon arose with complications in reconciling SOAP encoding and XML Schema, as well as with performance. The WS-I organization finally put the nail in the coffin of SOAP encoding in 2004 when it released the first version of the WS-I Basic Profile, declaring that only literal XML messages should be used (R2706). With the wide acceptance of WS-I, some of the more recent web service toolkits do not provide any support for (the previously ubiquitous) SOAP encoding at all.
A Simple SOAP Example
Putting it all together, below is an example of a simple request-response in SOAP for a stock quote. Here the transport binding is HTTP.
The request:
GET /StockPrice HTTP/1.1 Host: example.org Content-Type: application/soap+xml; charset=utf-8 Content-Length: nnn <?xml version="1.0"?> <env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope" xmlns:s="http://www.example.org/stock-service"> <env:Body> <s:GetStockQuote> <s:TickerSymbol>IBM</s:TickerSymbol> </s:GetStockQuote> </env:Body> </env:Envelope>
The response:
HTTP/1.1 200 OK Content-Type: application/soap+xml; charset=utf-8 Content-Length: nnn <?xml version="1.0"?> <env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope" xmlns:s="http://www.example.org/stock-service"> <env:Body> <s:GetStockQuoteResponse> <s:StockPrice>45.25</s:StockPrice> </s:GetStockQuoteResponse> </env:Body> </env:Envelope>
If you play your cards right, you may never have to actually see a SOAP message in action; every SOAP engine out there will do its best to hide it from you unless you really want to see it. If something goes wrong in your web service, however, it may be useful to know what one looks like for debugging purposes.
REST
Much in the way that Ruby on Rails was a reaction to more complex web application architectures, the emergence of the RESTful style of web services was a reaction to the more heavy-weight SOAP-based standards. In RESTful web services, the emphasis is on simple point-to-point communication over HTTP using plain old XML (POX).
The origin of the term "REST" comes from the famous thesis from Roy Fielding describing the concept of Representative State Transfer (REST). REST is an architectural style that can be summed up as four verbs (GET, POST, PUT, and DELETE from HTTP 1.1) and the nouns, which are the resources available on the network (referenced in the URI). The verbs have the following operational equivalents:
HTTP CRUD Equivalent ============================== GET read POST create,update,delete PUT create,update DELETE delete
A service to get the details of a user called 'dsmith', for example, would be handled using an HTTP GET to http://example.org/users/dsmith
. Deleting the user would use an HTTP DELETE, and creating a new one would mostly likely be done with a POST. The need to reference other resources would be handled using hyperlinks (the XML equivalent of HTTP's href, which is XLinks' xlink:href
) and separate HTTP request-responses.
A Simple RESTful Service
Re-writing the stock quote service above as a RESTful web service provides a nice illustration of the differences between SOAP and REST web services.
The request:
GET /StockPrice/IBM HTTP/1.1 Host: example.org Accept: text/xml Accept-Charset: utf-8
The response:
HTTP/1.1 200 OK Content-Type: text/xml; charset=utf-8 Content-Length: nnn <?xml version="1.0"?> <s:Quote xmlns:s="http://example.org/stock-service"> <s:TickerSymbol>IBM</s:TickerSymbol> <s:StockPrice>45.25</s:StockPrice> </s:Quote>
Though slightly modified (to include the ticker symbol in the response), the RESTful version is still simpler and more concise than the RPC-style SOAP version. In a sense, as well, RESTful web services are much closer in design and philosophy to the Web itself.
Defining the Contract
Traditionally, the big drawback of REST vis-a-vis SOAP was the lack of any way of specifying a description/contract for the web service. This, however, has changed since WSDL 2.0 defines a full compliment of non-SOAP bindings (all the HTTP methods, not just GET and POST) and the emergence of WADL as an alternative to WSDL. This will be discussed in more detail in coming articles.
Summary and Pros/Cons
SOAP and RESTful web services have a very different philosophy from each other. SOAP is really a protocol for XML-based distributed computing, whereas REST adheres much more closely to a bare metal, web-based design. SOAP by itself is not that complex; it can get complex, however, when it is used with its numerous extensions (guilt by association).
To summarize their strengths and weaknesses:
*** SOAP ***
Pros:
- Langauge, platform, and transport agnostic
- Designed to handle distributed computing environments
- Is the prevailing standard for web services, and hence has better support from other standards (WSDL, WS-*) and tooling from vendors
- Built-in error handling (faults)
- Extensibility
Cons:
- Conceptually more difficult, more "heavy-weight" than REST
- More verbose
- Harder to develop, requires tools
*** REST ***
Pros:
- Language and platform agnostic
- Much simpler to develop than SOAP
- Small learning curve, less reliance on tools
- Concise, no need for additional messaging layer
- Closer in design and philosophy to the Web
Cons:
- Assumes a point-to-point communication model--not usable for distributed computing environment where message may go through one or more intermediaries
- Lack of standards support for security, policy, reliable messaging, etc., so services that have more sophisticated requirements are harder to develop ("roll your own")
- Tied to the HTTP transport model