In his latest finely crafted post, REST and WS, Joe Gregorio gives a quick, definitive overview of web services and modern distributed architecture, while clearing up much confusion.
First of all, what REST really is:
"REST is not a specific piece of technology but an Architectural Style that was abstracted from HTTP during the transition from HTTP 1.0 to HTTP 1.1."
OK.. I get it. From a network perspective, going up the OSI model/TCP/IP stack... starting from Layer 4, TCP is the transport layer protocol. HTTP is the [Layer 7] application layer protocol that rides on top of it. However, one of the most attractive things about the Web is the ability to use HTTP as a simple transport protocol abstraction, rather than interfacing with TCP directly. So with this additional transport abstraction in place, you can build another application layer protocol on top of it and use that as your API for distributed operations. That is where the rubber meets the road in modern large-scale systems, and that is where the action is taking place in the current debate about SOA, REST, Web Services, and distributed architectures. Furthermore, the foundation for the REST style is built directly into HTTP 1.1.
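To make the "HTTP as a simple transport" idea concrete, here is a minimal Python sketch of my own (not from Joe's post). It assumes a made-up JSON envelope, a hypothetical operation name `getOrderStatus`, and a placeholder endpoint `/service` on `example.org`. The point is only that the real protocol lives in the body, and HTTP just carries it:

```python
import json


def tunnel_request(operation, params, host="example.org", path="/service"):
    """Wrap a custom operation in an HTTP POST, using HTTP purely as transport.

    The envelope format, host, and path are illustrative assumptions.
    """
    # The "real" application protocol is this JSON envelope in the body.
    body = json.dumps({"operation": operation, "params": params}).encode("utf-8")
    # HTTP contributes only framing: every call is a POST to one endpoint.
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def read_operation(raw_request):
    """Receiver side: skip the HTTP framing and recover the inner message."""
    _head, _sep, body = raw_request.partition(b"\r\n\r\n")
    return json.loads(body.decode("utf-8"))


request = tunnel_request("getOrderStatus", {"orderId": 42})
message = read_operation(request)
```

Notice that nothing in the HTTP part of the message says what the call does: the method is always POST and the URL is always the same, so anything sitting between client and server sees an opaque blob.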
The problem with the whole debate is that we are comparing apples and oranges. Different architectural styles offer different advantages, and these become apparent as your system grows in scale:
"REST and WS-* are two different tools whose strengths shine at different scales. The easiest way to think about this is an example from nature: at the scale of the atom the forces responsible for most of the action are different from the forces at the scale of a cell. Quantum effects and the strong nuclear force determine the structure and operation of an atom, while the operation of a cell is dominated by molecular reactions and Van der Waals' forces.
Another example closer to home; when programming and making calls into other functions and libraries, you pass along classes and types in the function call parameters. You expect those classes and types to be perfectly understood on the other side of that function call. Those are the rules at that scale; that type information can be counted on to survive and be useful over the function call boundary. As your scale grows, as you move outside the single executable, the same machine, or the same platform, that assumption begins to weaken, to the point that when you get to Internet scale services that assumption is actually harmful.
When working at the smaller scale the assumption that types can move across a boundary is powerful and allows many optimizations. Working in a homogeneous environment such as Java, WS-* has real advantages; you can very quickly create interfaces in your target programming language and expose those interfaces via WSDL and have them consumed just as easily on the calling side using the same WSDL.
As you move to larger systems, either many more clients connecting, or a non-homogeneous pool of clients, this paradigm starts to break down. If there are many clients then the demands for caching semantics will begin to dominate. In that case you need to abandon HTTP as just a simple transport and start using the application level semantics of HTTP to start leveraging the caching architecture already built into the Internet."
Well.. that pretty much cements the whole idea in my head. When you move towards larger distributed systems and/or less-homogeneous environments, scalability and interoperability become real concerns. There have been some clever approaches to solving these issues. Systems continue to become larger, more loosely coupled, and more interoperable... this is good... but as you approach this space, there are some tradeoffs you must make.
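For a feel of what "using the application level semantics of HTTP" buys you, here is a rough in-process sketch of a conditional GET. Everything in it is an assumption for illustration: the `origin` function stands in for a server, `TinyCache` stands in for an intermediary, and the cache always revalidates rather than honoring freshness lifetimes. What it borrows from HTTP 1.1 is the `ETag` / `If-None-Match` / `304 Not Modified` handshake:

```python
import hashlib

# Hypothetical resource held by the origin server.
RESOURCE = b"order 42: shipped"


def origin(headers):
    """Stand-in origin server answering a GET for one resource."""
    # A strong validator derived from the current representation.
    etag = '"' + hashlib.sha256(RESOURCE).hexdigest()[:16] + '"'
    if headers.get("If-None-Match") == etag:
        # The client's copy is still current: send no body at all.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, RESOURCE


class TinyCache:
    """Stand-in intermediary that revalidates its stored copy on every GET."""

    def __init__(self):
        self.stored = None        # (etag, body) once something is cached
        self.full_responses = 0   # how many times the origin sent a body

    def get(self):
        headers = {}
        if self.stored is not None:
            headers["If-None-Match"] = self.stored[0]
        status, response_headers, body = origin(headers)
        if status == 304:
            # Serve the stored body; the origin only confirmed it is current.
            return self.stored[1]
        self.full_responses += 1
        self.stored = (response_headers["ETag"], body)
        return body


cache = TinyCache()
first = cache.get()
second = cache.get()
```

After the two calls, `first` and `second` hold the same bytes, but the origin only shipped the body once. A tunnelled POST gives an intermediary nothing comparable to work with, which is the tradeoff in a nutshell.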
The real question is: should you think in those larger, better-organized terms right from the start, or do you want to quickly exploit some of the advantages and optimizations available in another approach? And of course the answer is context... "It depends on the system."