
REST is the new SOAP


Written by Pascal Chambon, reviewed by Raphaël Gomès


Update: this article mostly deals with the RESTish ecosystem, which now constitutes a major part of webservices. For more in-depth analysis of the original REST, and of HATEOAS, see my follow-up article.




Some years ago, I developed a new information system in a big telecom company. We had to communicate with an increasing number of web services, exposed by older systems or by business partners.


Needless to say, we had our fair share of SOAP Hell. Abstruse WSDLs, incompatible libraries, weird bugs… So whenever we could, we advocated (and used) simple Remote Procedure Call protocols: XMLRPC or JSONRPC.


Our first servers and clients for these protocols were pretty basic, limited, fragile. But gradually, we improved them; and with a few hundred lines of additional code, we achieved the dream: support for different dialects (such as Apache-specific XMLRPC extensions), built-in conversion between Python exceptions and hierarchical error codes, separate handling of functional and technical errors, auto-retries for the latter, relevant logging and stats before/after requests, thorough validation of input data…


Now we were able to robustly connect to any such API, with just a few lines of code.


Now we were able to expose any set of functions to a wide audience, to servers and to web browsers, with a few decorators and doc updates.
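
To give an idea of what "a few decorators" can look like, here is a minimal sketch of a JSON-RPC 2.0 style dispatcher. It is not the code described above: the names `REGISTRY`, `rpc_method` and `dispatch` are hypothetical, and a real server would add input validation, logging and a proper mapping of exception classes to error codes.

```python
import json

REGISTRY = {}  # hypothetical registry: public method name -> callable


def rpc_method(func):
    """Decorator exposing a plain function as a remote procedure."""
    REGISTRY[func.__name__] = func
    return func


@rpc_method
def add(a, b):
    return a + b


def dispatch(raw_request):
    """Handle one JSON-RPC 2.0 request string and return the response string."""
    request = json.loads(raw_request)
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    func = REGISTRY.get(request.get("method"))
    if func is None:
        # -32601 is the "Method not found" code reserved by JSON-RPC 2.0
        response["error"] = {"code": -32601, "message": "Method not found"}
        return json.dumps(response)
    params = request.get("params", [])
    try:
        # JSON-RPC allows positional (list) or named (object) parameters
        result = func(**params) if isinstance(params, dict) else func(*params)
    except Exception as exc:
        # Sketch only: -32000 sits in the range left to implementations
        response["error"] = {"code": -32000, "message": str(exc)}
    else:
        response["result"] = result
    return json.dumps(response)
```

Exposing one more function then costs one decorator line; the transport and the error envelope stay untouched.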


And when it came to communicating between our different applications (microservice-style), it was a job for our system administrator; software-side, it was almost transparent.


Then came REST.


REpresentational State Transfer.


A wave of renewal shook the foundations of inter-service communication. RPC was dead, the future was RESTful: resources each living on their own URL, and manipulated exclusively through the HTTP protocol.


From then on, every API we had to expose or consume became a new challenge; not to say a testimony to insanity.


What’s the problem with REST?


A short example is worth a long talk. Here is a small API, with data types removed for readability.


createAccount(username, contact_email, password) -> account_id
addSubscription(account_id, subscription_type) -> subscription_id
sendActivationReminderEmail(account_id) -> null
cancelSubscription(subscription_id, reason, immediate=True) -> null
getAccountDetails(account_id) -> {full data tree}

Just add a properly documented hierarchy of exceptions (InvalidParameterError, MissingParameterError, WorkflowError…), with subclasses to identify important cases (eg. AlreadyExistingUsernameError), and you’re good to go.
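
Such a hierarchy could be sketched as follows in Python. The root class `ApiError`, the dotted `code` slugs, the choice of parent for `AlreadyExistingUsernameError` and the toy `create_account()` function are assumptions made for illustration only.

```python
class ApiError(Exception):
    """Hypothetical root of the hierarchy; `code` is a dotted, hierarchical slug."""
    code = "error"


class InvalidParameterError(ApiError):
    code = "error.invalid_parameter"


class MissingParameterError(ApiError):
    code = "error.missing_parameter"


class WorkflowError(ApiError):
    code = "error.workflow"


class AlreadyExistingUsernameError(InvalidParameterError):
    # Assumed parent: a taken username is treated as an invalid parameter
    code = "error.invalid_parameter.already_existing_username"


def create_account(username, existing_usernames):
    """Toy stand-in for createAccount(), only here to raise the errors."""
    if not username:
        raise MissingParameterError("username")
    if username in existing_usernames:
        raise AlreadyExistingUsernameError(username)
    return len(existing_usernames) + 1  # fake account_id
```

A caller can then catch `AlreadyExistingUsernameError` to suggest another username, `InvalidParameterError` to reject the form, or `ApiError` to handle every functional error in one place.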


This API is easy to understand, easy to use, and robust. It is backed by a precise state machine, but the restricted set of available operations keeps users away from nonsensical interactions (like changing the creation date of an Account).


Estimated time to expose this API as a simple RPC service: a few hours.


Ok, now time to go the RESTful way.


No more standards, no more precise specifications. Just a vague “RESTful philosophy”, prone to endless metaphysical debates, and as many ugly workarounds.


How do you map the precise functions above to a handful of CRUD operations? Is sending the activation reminder email an update on a “must_send_activation_reminder_email” attribute? Or the creation of an “activation_reminder_email” resource? Is it sensible to use DELETE for cancelSubscription() if the subscription remains alive during a grace period, and may be resurrected during that time? How do you split the data tree of getAccountDetails() between endpoints, to respect the data model of REST?


What URL endpoint do you assign to each of your “resources”? Yeah it’s easy, but it has to be done anyway.


How do you express the diversity of error conditions, using the very limited bunch of HTTP codes?


What serialization formats, which specific dialects do you use for input and output payloads?


How exactly do you scatter these simple signatures between HTTP method, URL, query string, payload, headers, and status code?


And you’re gone for hours, reinventing the wheel. Not even a tailored, smart wheel. A broken and fragile wheel, requiring tons of documentation to be understood, and violating specifications without even knowing it.


How come REST means so much WORK?


This is both a paradox, and a shameless pun.


Let’s dive further into the artificial problems born from this design philosophy.


The joy of REST verbs


REST is not CRUD; its advocates will make sure that you don’t mix up these two. Yet minutes later they will rejoice that HTTP methods have well-defined semantics to create (POST), retrieve (GET), update (PUT/PATCH) and delete (DELETE) resources.


They’ll delight in professing that these few “verbs” are enough to express any operation. Well, of course they are; the same way that a handful of verbs would be enough to express any concept in English: “Today I updated my CarDriverSeat with my body, and created an EngineIgnition, but the FuelTank deleted itself”; being possible doesn’t make it any less awkward. Unless you’re an admirer of the Toki Pona language.


If the point is to be minimalist, at least let it be done right. Do you know why PUT, PATCH, and DELETE have never been implemented in web browser forms? Because they are useless and harmful. We can just use GET for read and POST for write. Or POST exclusively, when HTTP-level caching is unwanted. Other methods will at best get in your way, at worst ruin your day.


You want to use PUT to update your resource? OK, but some Holy Specifications state that the data input has to be equivalent to the representation received via a GET. So what do you do with the numerous read-only parameters returned by GET (creation time, last update time, server-generated token…)? You omit them and violate the PUT principles? You include them anyway, and expect an “HTTP 409 Conflict” if they don’t match server-side values (forcing you to then issue a GET…)? You give them random values and expect servers to ignore them (the joy of silent errors)? Pick your poison; REST clearly has no clue what a read-only attribute is, and this won’t be fixed anytime soon. Meanwhile, a GET is dangerously supposed to return the password (or credit card number) which was sent in a previous POST/PUT; good luck dealing with such write-only parameters too.


Did I forget to mention that PUT also brings dangerous race conditions, where several clients will override each other’s changes, whereas they just wanted to update different fields?
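
The race can be reproduced without any network, as in the following sketch. The in-memory `store` and the `http_get()` / `http_put()` helpers are hypothetical stand-ins for a server and its endpoints, assuming PUT replaces the whole resource.

```python
import copy

# In-memory stand-in for a server-side resource (hypothetical fields)
store = {"account": {"email": "old@example.com", "phone": "000"}}


def http_get(key):
    return copy.deepcopy(store[key])


def http_put(key, representation):
    # PUT semantics: the payload replaces the whole resource
    store[key] = copy.deepcopy(representation)


# Two clients read the same state...
seen_by_a = http_get("account")
seen_by_b = http_get("account")

# ...each changes a different field, then PUTs its full copy back
seen_by_a["email"] = "new@example.com"
http_put("account", seen_by_a)

seen_by_b["phone"] = "555"
http_put("account", seen_by_b)

# Client A's change has been silently overwritten by client B's stale copy
final_state = http_get("account")
```

At the end, `final_state` holds the new phone number but the old email, and neither client received an error.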


You want to use PATCH to update your resource? Nice, but like 99% of people using this verb, you’ll just send a subset of resource fields in your request payload, hoping that the server properly understands the operation intended (and all its possible side effects); lots of resource parameters are deeply linked or mutually exclusive (e.g. it’s either credit card OR PayPal token, in a user’s billing info), but RESTful design hides this important information too. Anyway, you’d violate specs once more: PATCH is not supposed to just send a bunch of fields to be overridden. Instead, you’re supposed to provide a “set of instructions” to be applied on the resources. So here you go again, take your paperboard and your coffee mug, you’ll have to decide how to express these instructions. Often with handcrafted specifications, since Not-Invented-Here Syndrome is a de facto standard in the REST world. (Edit: REST advocates have backpedaled on this subject, with JSON Merge Patch, an alternative to formats like JSON Patch)
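
For reference, the "subset of fields" behaviour that JSON Merge Patch standardizes (RFC 7396) fits in a few lines. The sketch below follows the algorithm given in that RFC; the billing payload is a made-up example, and nothing in it tells the server that card and PayPal token are mutually exclusive.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) to `target` and return the result."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target entirely
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null means "remove this member"
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# Hypothetical billing resource and patch
billing = {"method": "card", "card": {"number": "4111", "expiry": "12/30"}}
patch = {"method": "paypal", "card": None, "paypal_token": "tok_123"}
patched = merge_patch(billing, patch)

# The same change as JSON Patch (RFC 6902) "instructions", shown as data only
json_patch = [
    {"op": "replace", "path": "/method", "value": "paypal"},
    {"op": "remove", "path": "/card"},
    {"op": "add", "path": "/paypal_token", "value": "tok_123"},
]
```

Two known limits of the merge format follow directly from the code: a member can never be set to null (null means removal), and arrays are replaced wholesale rather than edited element by element.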


You want to DELETE resources? OK, but I hope you don’t need to provide substantial context data; like a PDF scan of the termination request from the user. DELETE prohibits having a payload. A constraint that REST architects often dismiss, since most webservers don’t enforce this rule on the requests they receive. How compatible, anyway, would be a DELETE request with 2 MBs of base64 query string attached? (Edit: the RFC 2616, indicating that payloads without semantics should be ignored, is now obsolete)


REST aficionados easily profess that “people are doing it wrong” and their APIs are “actually not RESTful”. For example, lots of developers use PUT to create a resource directly on its final URL (/myresourcebase/myresourceid), whereas the “good way” (edit: according to many) of doing it is to POST on a parent URL (/myresourcebase), and let the server indicate, with an HTTP “Location” header, the new resource’s URL (edit: it’s not an HTTP redirection though). The good news is: it doesn’t matter. These rigorous principles are like Big Endian vs Little Endian, they occupy philosophers for hours, but have very little impact on real life problems, i.e. “getting stuff done”.


By the way… handcrafting URLs is always great fun. Do you know how many implementations properly urlencode() identifiers while building REST urls? Not that many. Get ready for nasty breakages and SSRF/CSRF attacks.
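
A minimal sketch of the difference, using `urllib.parse.quote` from the Python standard library. The base URL, the `/accounts/` path and the hostile identifier are made up for the example.

```python
from urllib.parse import quote

BASE_URL = "https://api.example.com"  # hypothetical service


def account_url_naive(account_id):
    # Vulnerable: the identifier is pasted into the path as-is
    return f"{BASE_URL}/accounts/{account_id}"


def account_url_safe(account_id):
    # safe="" also escapes "/", which quote() leaves untouched by default
    return f"{BASE_URL}/accounts/{quote(str(account_id), safe='')}"


hostile_id = "42/../../admin?force=1"
naive = account_url_naive(hostile_id)
safe = account_url_safe(hostile_id)
```

With the naive builder, the identifier injects extra path segments and a query string; with the escaped one, it stays a single opaque path segment.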


The joy of REST error handling


Just about every coder is able to make a “nominal case” work. Error handling is one of those features which will decide if your code is robust software, or a huge pile of matchsticks.


HTTP provides a list of error codes out-of-the-box. Great, let’s see that.


Using “HTTP 404 Not Found” to notify about a nonexistent resource sounds RESTful as heck, doesn’t it? Too bad: your nginx was misconfigured for 1 hour, so your API consumers got only 404 errors and purged hundreds of accounts, thinking they were deleted…


Using “HTTP 401 Unauthorized” when a user doesn’t have access credentials to a third-party service sounds acceptable, doesn’t it? However, if an ajax call in your Safari browser gets this error code, it might startle your end customer with a very unexpected password prompt [it did, years ago, YMMV].


HTTP existed long before “RESTful webservices”, and the web ecosystem is filled with assumptions about the meaning of its error codes. Using them to transport application errors is like using milk bottles to dispose of toxic waste: inevitably, one day, there will be trouble.


Some standard HTTP error codes are specific to Webdav, others to Microsoft, and the few remaining have definitions so fuzzy that they are of no help. In the end, like most REST users, you’ll probably use random HTTP codes, like “HTTP 418 I’m a teapot” or unassigned numbers, to express your application-specific exceptions. Or you’ll shamelessly return “HTTP 400 Bad Request” for all functional errors, and then invent your own clunky error format, with booleans, integer codes, slugs, and translated messages stuffed into an arbitrary payload. Or you’ll give up altogether on proper error handling; you’ll just return a plain message, in natural language, and hope that the caller will be a human able to analyze the problem, and take action. Good luck interacting with such APIs from an autonomous program.


The joy of REST concepts


REST has made a career out of boasting about concepts that any service architect in his right mind already respects, or about principles that it doesn’t even follow. Here are some excerpts, grabbed from top-ranked webpages.


REST is a client-server architecture. The client and the server both have a different set of concerns. What a scoop in the software world.


REST provides a uniform interface between components. Well, like any other protocol does, when it’s enforced as the lingua franca of a whole ecosystem of services.


REST is a layered system. Individual components cannot see beyond the immediate layer with which they are interacting. It sounds like a natural consequence of any well designed, loosely coupled architecture; amazing.

REST is awesome, because it is STATELESS. Yes there is probably a huge database behind the webservice, but it doesn’t remember the state of the client. Or, well, yes, actually it remembers its authentication session, its access permissions… but it’s stateless, nonetheless. Or more precisely, just as stateless as any HTTP-based protocol, like the simple RPC mentioned previously.


With REST, you can leverage the power of HTTP CACHING! Well here is at last one convincing point: a GET request and its cache-control headers are indeed friendly with web caches. That being said, aren’t local caches (Memcached etc.) enough for 99% of web services? Out-of-control caches are dangerous beasts; how many people want to expose their APIs in clear text, so that a Varnish or a proxy on the road may keep delivering outdated content, long after a resource has been updated or deleted? Maybe even delivering it “forever”, if a configuration mistake once occurred? A system must be secure by default. I perfectly admit that some heavily loaded systems want to benefit from HTTP caching, but it costs much less to expose a few GET endpoints for heavy read-only interactions, than to switch all operations to REST and its dubious error handling.


Thanks to all this, REST has HIGH PERFORMANCE! Are we sure of that? Any API designer knows it: locally, we want fine-grained APIs, to be able to do whatever we want; and remotely, we want coarse-grained APIs, to limit the impact of network round-trips. Here is again a domain in which REST fails miserably. The split of data between “resources”, each instance on its own endpoint, naturally leads to the N+1 Query problem. To get a user’s full data (account, subscriptions, billing information…), you have to issue as many HTTP requests; and you can’t parallelize them, since you don’t know in advance the unique IDs of related resources. This, plus the inability to fetch only part of resource objects, naturally creates nasty bottlenecks.
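
The round-trip count can be illustrated with a fake backend. In this sketch, `RESOURCES`, its URLs and the `http_get()` helper are hypothetical; each call to `http_get()` stands for one network request.

```python
# Fake backend: each "resource" lives behind its own endpoint (hypothetical URLs)
RESOURCES = {
    "/accounts/1": {
        "username": "alice",
        "subscriptions": ["/subscriptions/10", "/subscriptions/11"],
    },
    "/subscriptions/10": {"type": "basic"},
    "/subscriptions/11": {"type": "premium"},
}
request_count = 0


def http_get(url):
    """Stand-in for one network round-trip."""
    global request_count
    request_count += 1
    return RESOURCES[url]


def get_account_details_restish(account_url):
    account = dict(http_get(account_url))  # 1 request
    # The subscription URLs are only known once the account has arrived,
    # so these requests cannot start before the first one has completed
    account["subscriptions"] = [
        http_get(url) for url in account["subscriptions"]  # N requests
    ]
    return account


details = get_account_details_restish("/accounts/1")
```

Here one account with two subscriptions costs three requests, where a single coarse-grained getAccountDetails() call would cost one.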


REST offers better compatibility. How so? Why do so many REST webservices have “/v2/” or “/v3/” in their base URLs then? Backwards and forward compatible APIs are not hard to achieve, with high level languages, as long as simple rules are followed when adding/deprecating parameters. As far as I know, REST doesn’t bring anything new on the subject.


REST is SIMPLE, everyone knows HTTP! Well, everyone knows pebbles too, yet people are happy to have better blocks when building their house. The same way XML is a meta-language, HTTP is a meta-protocol. To have a real application protocol (like “dialects” are to XML), you’ll need to specify lots of things; and you’ll end up with Yet Another RPC Protocol, as if there were not enough already.


REST is so easy, it can be queried from any shell, with CURL! OK, actually, every HTTP-based protocol can be queried with CURL. Even SOAP. Issuing a GET is particularly straightforward, for sure, but good luck writing json or xml POST payloads by hand; people usually use fixture files, or, much more handy, full-fledged API clients instantiated directly in the command line interface of their favorite language.


“The client does not need any prior knowledge of the service in order to use it”. This is by far my favourite quote. I’ve found it numerous times, under different forms, especially when the buzzword HATEOAS lurked around; sometimes with some careful (but insufficient) “except” phrases following. Still, I don’t know in which fantasy world these people live, but in this one, a client program is not a colony of ants; it doesn’t browse remote APIs randomly, and then decide how to best handle them, based on pattern recognition or black magic. Quite the opposite; the client has strong expectations on what it means to PUT this one field to this one URL with this one value, and the server had better respect the semantics which were agreed upon during integration, else all hell might break loose.


How to do REST right and quick?


Forget about the “right” part. REST is like a religion, no mere mortal will ever grasp the extent of its genius, nor “do it right”.


So the real question is: if you’re forced to expose or consume webservices in a kinda-RESTful way, how to rush through this job, and switch to more constructive tasks asap?


Update: it turns out that there are actually lots of “standards” and industrialization efforts for REST, although I had never encountered them personally (maybe because few people use them?). More information in my follow-up article.


How to industrialize server-side exposure?


Each web framework has its own way of defining URL endpoints. So expect some big dependencies, or a good layer of handwritten boilerplate, to plug your existing API onto your favorite server as a set of REST endpoints.


Libraries like Django-Rest-Framework automate the creation of REST APIs, by acting as data-centric wrappers above SQL/noSQL schemas. If you just want to make “CRUD over HTTP”, you could be fine with them. But if you want to expose common “do-this-for-me” APIs, with workflows, constraints, complex data impacts and such, you’ll have a hard time bending any REST framework to fit your needs.


Be prepared to connect, one by one, each HTTP method of each endpoint, to the corresponding method call; with a fair share of handmade exception handling, to translate passing-through exceptions into corresponding error codes and payloads.
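
The kind of glue this implies might look like the sketch below. The mapping `STATUS_BY_EXCEPTION`, the helper `call_as_http()` and the chosen status codes are assumptions for illustration; every team ends up picking its own.

```python
import json


class WorkflowError(Exception):
    pass


class MissingParameterError(Exception):
    pass


# Hypothetical mapping from application exceptions to HTTP status codes
STATUS_BY_EXCEPTION = [
    (MissingParameterError, 400),
    (WorkflowError, 409),
]


def call_as_http(func, **params):
    """Call `func` and translate its outcome into (status_code, json_body)."""
    try:
        result = func(**params)
    except Exception as exc:
        for exc_class, status in STATUS_BY_EXCEPTION:
            if isinstance(exc, exc_class):
                break
        else:
            status = 500  # anything unexpected is treated as a technical error
        body = {"error": type(exc).__name__, "message": str(exc)}
        return status, json.dumps(body)
    return 200, json.dumps(result)


def cancel_subscription(subscription_id, already_cancelled=False):
    """Toy stand-in for cancelSubscription()."""
    if already_cancelled:
        raise WorkflowError("subscription is already cancelled")
    return None
```

Each endpoint and HTTP method still has to be wired to a call like `call_as_http(cancel_subscription, subscription_id=...)`, and the error payload format remains a private convention that clients must learn from the documentation.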


How to industrialize client-side integration?


From experience, my guess is: you don’t.


For each API integration, you’ll have to browse lengthy docs, and follow detailed recipes on how each of the N possible operations has to be performed.


You’ll have to craft URLs by hand, write serializers and deserializers, and learn how to workaround the ambiguities of the API. Expect quite some trial-and-error before you tame the beast.


Do you know how webservices providers make up for this, and ease adoption?


Simple, they write their own official client implementations.




I’ve recently dealt with a subscription management system. They provide clients for PHP, Ruby, Python, .NET, iOS, Android, Java… plus some external contributions for Go and NodeJS.


Each client lives in its own GitHub repository. Each with its own big list of commits, bug tracking tickets, and pull requests. Each with its own usage examples. Each with its own awkward architecture, somewhere between ActiveRecord and RPC proxy.


This is astounding. How much time is spent developing such weird wrappers, instead of improving the real, the valuable, the getting-stuff-done, webservice?


Sisyphus developing Yet Another Client for his API.




For decades, just about every programming language has functioned with the same workflow: sending inputs to a callable, and getting results or errors as output. This worked well. Quite well.


With REST, this has turned into an insane work of mapping apples to oranges, and praising HTTP specifications to better violate them minutes later.


In an era where MICROSERVICES are more and more common, how come such an easy task (linking libraries over networks) remains so artificially crafty and cumbersome?


I don’t doubt that some smart people out there will provide cases where REST shines; they’ll showcase their homemade REST-based protocol, allowing clients to discover and perform CRUD operations on arbitrary object trees, thanks to hyperlinks; they’ll explain how the REST design is so brilliant, that I’ve just not read enough articles and dissertations about its concepts.


I don’t care. Trees are recognized by their own fruits. What took me a few hours of coding and worked very robustly, with simple RPC, now takes weeks and can’t stop inventing new ways of failing or breaking expectations. Development has been replaced by tinkering.


Almost-transparent remote procedure call was what 99% of people really needed, and existing protocols, as imperfect as they were, did the job just fine. This mass monomania for the lowest common denominator of the web, HTTP, has mainly resulted in a huge waste of time and grey matter.


REST promised simplicity and delivered complexity.


REST promised robustness and delivered fragility.


REST promised interoperability and delivered heterogeneity.


REST is the new SOAP.




The future could be bright. There are still tons of excellent protocols available, in binary or text format, with or without schema, some leveraging the new abilities of HTTP2… so let’s move on, people. We can’t forever remain in the Stone Age of Webservices.


Edit: many people asked for these alternative protocols; the subject would deserve its own story, but one could have a look at XMLRPC and JSONRPC (simple but quite relevant), or JSONWSP (includes schemas), or language-specific layers like Pyro or RMI for internal use, or new kids on the block like GraphQL and gRPC for public APIs…


