Wednesday, 27 February 2019

A brief look at the evolution of interface protocols leading to modern APIs

Application interfaces are as old as distributed computing itself and can be traced back to the late 1960s, when the first request-response style protocols were conceived. However, according to this research, it wasn't until the late 1980s, when the first popular release of RPC (described below) was introduced by Sun Microsystems (later acquired by Oracle), that internet-based interface protocols gained wide popularity and adoption.

This is perhaps why, even today, the term Application Programming Interface (API) can be ambiguous depending on who you ask and in what context. Historically the term has been used (and to a degree continues to be used) to describe all sorts of interfaces well beyond just web APIs (e.g. REST).

This article therefore attempts to demystify (to an extent) the origins of modern web-based APIs. It does so by listing and describing, in chronological order (as illustrated below), the different interface protocols and standards that in my view have had a major influence on modern web-based APIs as we know them today (e.g. SOAP/WSDL based web services, REST, GraphQL and gRPC, to name a few).

This article is part of the research done for my upcoming book Enterprise API Management, where I deep-dive into the 3 most trendy API architectural styles according to the following Google trends.

Note that some of the texts in the following section are not mine but extracts from the referenced articles. Please do let me know if I missed any reference! Thanks.


Open Network Computing (ONC) Remote Procedure Call (RPC) is a remote procedure call system originally developed by Sun Microsystems in the 1980s as part of their Network File System project. It is sometimes referred to as Sun RPC.

A remote procedure call is when a computer program causes a procedure (subroutine) to execute in a different address space (commonly on another computer on a shared network), which is coded as if it were a normal (local) procedure call, without the programmer explicitly coding the details for the remote interaction. That is, the programmer writes essentially the same code whether the subroutine is local to the executing program, or remote. This is a form of client–server interaction (caller is client, executor is server), typically implemented via a request–response message-passing system.
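
As a rough illustration of the idea (not of Sun RPC itself), the following minimal Python sketch shows a client stub that makes a remote call look like a local one. The names `ClientStub`, `server_dispatch` and `PROCEDURES` are hypothetical, JSON stands in for the wire format, and the "network" is simulated by a direct function call.

```python
import json

# Server side: the real procedure lives here.
def add(a, b):
    return a + b

PROCEDURES = {"add": add}  # hypothetical registry of callable procedures

def server_dispatch(request_bytes):
    """Server stub: decode the request, run the procedure, encode the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["method"]](*request["params"])
    return json.dumps({"result": result}).encode("utf-8")

class ClientStub:
    """Client stub: turns a method call into a request/response message pair."""

    def __init__(self, transport):
        self._transport = transport  # any callable taking bytes and returning bytes

    def __getattr__(self, method_name):
        def remote_call(*params):
            request = json.dumps({"method": method_name, "params": list(params)})
            reply = self._transport(request.encode("utf-8"))
            return json.loads(reply.decode("utf-8"))["result"]
        return remote_call

# Assumption: the network is simulated by calling the server stub directly.
remote = ClientStub(transport=server_dispatch)

local_result = add(2, 3)          # ordinary local call
remote_result = remote.add(2, 3)  # same call shape, but sent as a message
print(local_result, remote_result)
```

In a real system the transport would be a socket, and the call could fail or time out in ways a local call cannot.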

In the object-oriented programming paradigm, RPC calls are represented by remote method invocation (RMI). 

The RPC model implies a level of location transparency, namely that calling procedures is largely the same whether it is local or remote, but usually they are not identical, so local calls can be distinguished from remote calls. Remote calls are usually orders of magnitude slower and less reliable than local calls, so distinguishing them is important.

Note that theoretical proposals of remote procedure calls as the model of network operations date to the 1970s, and practical implementations date to the early 1980s. Bruce Jay Nelson is generally credited with coining the term "remote procedure call" in 1981, though the idea of treating network operations as remote procedure calls goes back at least to the 1970s in early ARPANET documents. The first popular implementation of RPC on Unix was Sun's RPC.


Interface definition
External Data Representation (XDR) / RPC language.

Serialised data based on XDR.
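
To give a feel for what XDR-serialised data looks like, here is a minimal sketch that hand-encodes two XDR types using only Python's `struct` module. It assumes the standard XDR rules (big-endian 32-bit integers, and strings as a 4-byte length followed by the bytes, zero-padded to a multiple of four); the helper names `xdr_int` and `xdr_string` are made up for this example.

```python
import struct

def xdr_int(value):
    """Encode a signed 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">i", value)

def xdr_string(text):
    """Encode a string as a 4-byte length, the bytes, then zero padding to a 4-byte boundary."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding

# A payload carrying the integer 42 followed by the string "hello".
payload = xdr_int(42) + xdr_string("hello")
print(payload.hex())  # 0000002a0000000568656c6c6f000000
```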

Transport protocol
ONC then delivers the XDR payload using either UDP or TCP.

First released
April 1988.


The Common Object Request Broker Architecture (CORBA) is a standard defined by the Object Management Group (OMG) and was designed to facilitate the communication of systems that are deployed on diverse platforms.

CORBA enables collaboration between systems on different operating systems, programming languages, and computing hardware. CORBA uses an object-oriented model although the systems that use the CORBA do not have to be object-oriented. CORBA is an example of the distributed object paradigm.

Interface definition
CORBA uses an interface definition language (IDL) to specify the interfaces that objects present to the outer world. CORBA then specifies a mapping from IDL to a specific implementation language like C++ or Java. 

Internet Inter-ORB Protocol (IIOP).

Transport protocol
TCP/IP and later HTTP (since 2007 apparently with HTIOP).

First released
Version 1.0 was released in October 1991.



The Distributed Computing Environment (DCE) RPC was an RPC system commissioned by the Open Software Foundation (OSF), a non-profit organisation originally consisting of Apollo Computer, Groupe Bull, Digital Equipment Corporation, Hewlett-Packard, IBM, Nixdorf Computer, and Siemens AG, sometimes referred to as the "Gang of Seven". In February 1996 the Open Software Foundation merged with X/Open to become The Open Group. The OSF was intended to be a joint development effort, mostly in response to a perceived threat of "merged UNIX system" efforts by AT&T Corporation and Sun Microsystems.

In DCE RPC, the client and server stubs are created by compiling a description of the remote interface (interface definition file) with the DCE Interface Definition Language (IDL) compiler. The client application, the client stub, and one instance of the RPC runtime library all execute in the caller machine; the server application, the server stub, and another instance of the RPC runtime library execute in the called (server) machine.


Interface definition
By making use of interface definition files (IDF) based on the Interface Definition Language (IDL).

The DCE RPC protocol specifies that inputs and outputs be passed in octet streams, whereas the IDL provides the syntax for describing these structured data types and values. The Network Data Representation (NDR) specification of the protocol is responsible for mapping IDL data types onto octet streams. NDR defines primitive data types, constructed data types and representations for these types in an octet stream.

Transport protocol
DCE/RPC can run atop a number of protocols, including:

- TCP: Typically, connection oriented DCE/RPC uses TCP as its transport protocol. The well known TCP port for DCE/RPC EPMAP is 135. This transport is called ncacn_ip_tcp.
- UDP: Typically, connectionless DCE/RPC uses UDP as its transport protocol. The well known UDP port for DCE/RPC EPMAP is 135. This transport is called ncadg_ip_udp.
- SMB: Connection oriented DCE/RPC can also use authenticated named pipes on top of SMB as its transport protocol. This transport is called ncacn_np.
- SMB2: Connection oriented DCE/RPC can also use authenticated named pipes on top of SMB2 as its transport protocol. This transport is called ncacn_np.

First released
The first release ("P312 DCE: Remote Procedure Call") dates to 1993.



The Distributed Component Object Model (DCOM) is a proprietary Microsoft technology for communication between software components on networked computers. The addition of the "D" to COM was due to extensive use of DCE/RPC (Distributed Computing Environment/Remote Procedure Calls) – more specifically Microsoft's enhanced version, known as MSRPC.

DCOM was considered a major competitor to CORBA.

Interface definition
Characteristics of an interface are defined in an interface definition (IDL) file and an optional application configuration file (ACF):

- The IDL file specifies the characteristics of the application's interfaces on the wire, that is, how data is to be transmitted between client and server, or between COM objects.
- The ACF file specifies interface characteristics, such as binding handles, that pertain only to the local operating environment. The ACF file can also specify how to marshal and transmit a complex data structure in a machine-independent form.
- The IDL and ACF files are scripts written in Microsoft Interface Definition Language (MIDL), which is the Microsoft implementation and extension of the OSF-DCE interface definition language (IDL).

DCOM objects (Microsoft proprietary).

Transport protocol
TCP/IP and later HTTP (since 2003).

First released
OLE 1.0, released in 1990, was an evolution of the original Dynamic Data Exchange (DDE) concept that Microsoft developed for earlier versions of Windows. OLE 1.0 later evolved to become an architecture for software components known as the Component Object Model (COM), which later in early 1996 became DCOM. 



The Extensible Markup Language (XML) Remote Procedure Call (RPC) is a protocol which uses XML to encode its calls and HTTP as a transport mechanism. In XML-RPC, a client performs an RPC by sending an HTTP request to a server that implements XML-RPC and receives the HTTP response. A call can have multiple parameters and one result. The protocol defines a few data types for the parameters and result. Some of these data types are complex, i.e. nested. For example, you can have a parameter that is an array of five integers.
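
To make the encoding concrete, the sketch below uses Python's standard `xmlrpc.client` module to build the XML body of a call whose single parameter is an array of five integers, and then decodes it again as a server would. The method name `stats.sum` is a made-up example, and no HTTP request is actually sent.

```python
import xmlrpc.client

# Encode a call with one parameter: an array of five integers.
request_xml = xmlrpc.client.dumps(([1, 2, 3, 4, 5],), methodname="stats.sum")
print(request_xml)  # the <methodCall> document that would travel in the HTTP POST body

# Decode the same document, as the receiving side would.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

In normal use `xmlrpc.client.ServerProxy` performs the encoding, the HTTP POST and the decoding of the response in one step.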

Interface definition
No explicit interface definition language. The protocol defines a set of header and payload requirements which the implementation (e.g. in Java) must comply with.


Transport protocol

First released
The XML-RPC protocol was created in 1998 by Dave Winer of UserLand Software and Microsoft, with Microsoft seeing the protocol as an essential part of scaling up its efforts in business-to-business e-commerce. As new functionality was introduced, the standard evolved into what is now SOAP.



Enterprise JavaBeans (EJB) is a server-side software component that encapsulates the business logic of an application. An EJB web container provides a runtime environment for web related software components, including computer security, Java servlet lifecycle management, transaction processing, and other web services. The EJB specification is a subset of the Java EE specification. EJBs are/were typically used when building highly scalable and robust enterprise level applications that can be deployed on a Jakarta EE (formerly J2EE) compliant application server such as JBoss, WebLogic, etc.

Interface definition
Java remote interface (extending javax.ejb.EJBObject) declaring the methods that a client can invoke.

Originally as serialised Java objects (e.g. DTO), but later releases also support XML and JSON over HTTP.

Transport protocol
EJB originally specified Java Remote Method Invocation (RMI) as the transport protocol, but later releases also support HTTP.

First release
The EJB specification was originally developed in 1997 by IBM and later adopted by Sun Microsystems (EJB 1.0 and 1.1) in 1999 and enhanced under the Java Community Process as JSR 19 (EJB 2.0), JSR 153 (EJB 2.1), JSR 220 (EJB 3.0), JSR 318 (EJB 3.1) and JSR 345 (EJB 3.2).



Representational State Transfer (REST) is a software architectural style that defines a set of constraints to be used for creating Web services. Such constraints restrict the ways that the server can process and respond to client requests so that, by operating within these constraints, the system gains desirable non-functional properties, such as performance, scalability, simplicity, modifiability, visibility, portability, and reliability. If a system violates any of the required constraints, it cannot be considered RESTful.

These constraints are (from Roy Fielding's dissertation):

- Client-server: separation of concerns is the principle behind the client-server constraints. By separating the user interface concerns from the data storage concerns, we improve the portability of the user interface across multiple platforms and improve scalability by simplifying the server components. Perhaps most significant to the Web, however, is that the separation allows the components to evolve independently, thus supporting the Internet-scale requirement of multiple organisational domains.
- Stateless: communication must be stateless in nature such that each request from client to server must contain all of the information necessary to understand the request, and cannot take advantage of any stored context on the server. Session state is therefore kept entirely on the client.
- Cache: in order to improve network efficiency, we add cache constraints to form the client-cache-stateless-server style. Cache constraints require that the data within a response to a request be implicitly or explicitly labeled as cacheable or non-cacheable. If a response is cacheable, then a client cache is given the right to reuse that response data for later, equivalent requests.
- Uniform interface: the central feature that distinguishes the REST architectural style from other network-based styles is its emphasis on a uniform interface between components. By applying the software engineering principle of generality to the component interface, the overall system architecture is simplified and the visibility of interactions is improved. Implementations are decoupled from the services they provide, which encourages independent evolvability. REST is defined by four interface constraints: identification of resources; manipulation of resources through representations; self-descriptive messages; and hypermedia as the engine of application state (HATEOAS).
- Layered system: in order to further improve behaviour for Internet-scale requirements, the layered system style allows an architecture to be composed of hierarchical layers by constraining component behaviour such that each component cannot "see" beyond the immediate layer with which they are interacting. By restricting knowledge of the system to a single layer, we place a bound on the overall system complexity and promote substrate independence. Layers can be used to encapsulate legacy services and to protect new services from legacy clients, simplifying components by moving infrequently used functionality to a shared intermediary. Intermediaries can also be used to improve system scalability by enabling load balancing of services across multiple networks and processors.
- Code-on-demand: REST allows client functionality to be extended by downloading and executing code in the form of applets or scripts. This simplifies clients by reducing the number of features required to be pre-implemented. Allowing features to be downloaded after deployment improves system extensibility. However, it also reduces visibility, and thus is only an optional constraint within REST.
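
A few of these constraints are easier to picture with a small example. The following sketch, which is only one possible reading and not taken from the dissertation, models a request handler as a plain Python function. The data store `ORDERS`, the token `demo-token` and the `_links` layout are assumptions made up for this illustration.

```python
import json

ORDERS = {"42": {"id": "42", "status": "shipped"}}  # hypothetical data store

def handle_request(method, path, headers):
    """Handle one request using only the information the request itself carries."""
    # Stateless: the credential travels with every request; there is no server-side session.
    if headers.get("Authorization") != "Bearer demo-token":
        return 401, {"Cache-Control": "no-store"}, ""
    if method == "GET" and path.startswith("/orders/"):
        order = ORDERS.get(path.rsplit("/", 1)[-1])
        if order is None:
            return 404, {"Cache-Control": "no-store"}, ""
        # Uniform interface: return a representation of the resource plus a hypermedia link.
        body = dict(order, _links={"self": {"href": path}})
        # Cache: the response is explicitly labelled as cacheable for 60 seconds.
        return 200, {"Cache-Control": "max-age=60"}, json.dumps(body)
    return 405, {"Cache-Control": "no-store"}, ""

status, response_headers, body = handle_request(
    "GET", "/orders/42", {"Authorization": "Bearer demo-token"}
)
print(status, response_headers, body)
```

Because the handler depends only on its arguments, any number of identical server instances could answer the same request, which is where the scalability benefit comes from.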

REST was first introduced in the year 2000 as part of Roy Fielding's PhD dissertation titled "Architectural Styles and the Design of Network-based Software Architectures". Although the first (or at least the first publicly known) REST API was launched by eBay the same year, the adoption of REST over alternatives (such as SOAP/WSDL Web Services) only really gained traction towards the end of 2004, when Flickr launched its first publicly available REST API. Facebook and Twitter followed shortly after with public REST APIs of their own.


Interface definition
Many open-source interface definition languages exist for REST APIs, although none is part of Roy Fielding's original PhD dissertation.

Some of the most popular ones are:

Swagger, later renamed the OpenAPI Specification (OAS), whose latest version is 3.0 (with Swagger now being the name for a set of tools for adopting OAS).
API blueprint created by the founders of (and now part of Oracle) which is a platform that offers robust REST API design and testing capabilities.
RESTful API Modeling Language (RAML) created by MuleSoft.

REST does not specify any particular payload format; however, the majority of REST APIs make use of JavaScript Object Notation (JSON), a lightweight data-interchange format that is easy for humans to read and write. XML payloads are also not uncommon in REST APIs.

Transport protocol:

First release:
REST was defined by Roy Fielding in his 2000 PhD dissertation "Architectural Styles and the Design of Network-based Software Architectures" at UC Irvine. He developed the REST architectural style in parallel with HTTP 1.1 of 1996–1999, based on the existing design of HTTP 1.0 of 1996.

In terms of interface definition languages:

- Swagger was first released in 2011. Its name switched to OAS in 2016.
- In 2017, OAS 3.0 was released.
- API Blueprint and RAML were both released in 2013.


SOAP/WSDL & Web Services

The Simple Object Access Protocol (SOAP) is an XML based protocol for exchange of information in a decentralised, distributed environment.  SOAP was designed as an object-access protocol in 1998 for Microsoft.

It consists of three parts:

- An envelope that defines a framework for describing what is in a message and how to process it
- A set of encoding rules for expressing instances of application-defined datatypes
- A convention for representing remote procedure calls and responses.
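
To show what the envelope part looks like in practice, here is a minimal sketch that builds a SOAP 1.1 message with Python's standard `xml.etree.ElementTree`. The application namespace `http://example.com/stock`, the operation `GetLastTradePrice` and the helper `build_envelope` are assumptions chosen for illustration; only the envelope namespace comes from SOAP 1.1.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/stock"                    # hypothetical application namespace

ET.register_namespace("soap", SOAP_NS)

def build_envelope(symbol):
    """Build an envelope whose body carries an RPC-style call with one parameter."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")  # optional; left empty here
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}GetLastTradePrice")
    ET.SubElement(call, f"{{{APP_NS}}}symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")

message = build_envelope("ACME")
print(message)
```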

The Web Services Description Language (WSDL), as its name suggests, is an interface description language also based on XML. The main purpose of WSDL is to describe the functionality offered by a SOAP interface. WSDL 1.0 (Sept. 2000) was developed by IBM, Microsoft, and Ariba to describe Web Services for their SOAP toolkit.

The combination of SOAP and WSDL as the means to define and implement open-standard based interfaces eventually became known as Web Services. The term also became official in 2004 as part of W3C's Web Services Architecture.

It's worth mentioning that Web Services became one of the core building blocks of Service Oriented Architectures (SOA).


Interface definition

XML SOAP message containing an envelope and, within it, a header and a body.

Transport protocol:
SOAP can potentially be used in combination with a variety of other protocols; however, the only bindings defined in the SOAP 1.1 specification describe how to use SOAP in combination with HTTP and the HTTP Extension Framework.

First release:
SOAP version 1.2 became an official W3C Recommendation in June 2003. WSDL 1.2 was later renamed WSDL 2.0, which became a W3C Recommendation in June 2007.
In 2004 the term Web Services became official as part of W3C's Web Service Architecture recommendation.



The Open Data Protocol (OData) is a REST-based protocol originally designed by Microsoft but later becoming ISO/IEC approved and an OASIS standard.  The main objective of OData is to standardise the way in which basic data access operations can be made available via REST.

OData is built on top of HTTP and uses URIs to address and access data feed resources (in JSON format). The protocol is based on AtomPub and extends it by adding metadata to describe the data source and a standard means of querying the underlying data.
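
As a small illustration of the "standard means of querying", the sketch below assembles an OData-style URL using the system query options `$filter`, `$select`, `$top` and `$format`. The service root `https://example.com/odata`, the `Products` entity set and the helper `odata_url` are hypothetical.

```python
from urllib.parse import quote, urlencode

SERVICE_ROOT = "https://example.com/odata"  # hypothetical service root

def odata_url(entity_set, **options):
    """Build a query URL, prefixing each option name with '$'."""
    query = urlencode(
        {f"${name}": value for name, value in options.items()},
        quote_via=quote,  # encode spaces as %20 rather than '+'
        safe="$,",        # keep '$' and ',' readable in the URL
    )
    return f"{SERVICE_ROOT}/{entity_set}?{query}"

url = odata_url("Products", filter="Price lt 20", select="Name,Price", top=5, format="json")
print(url)
```

The same four options would work against any entity set the service exposes, which is the standardisation OData is after.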

OData was subject to criticism in 2013 when Netflix abandoned the use of the protocol.

"A more technical concern with OData is that it encourages poor development and API practices by providing a black-box framework to enforce a generic repository pattern. This can be regarded as an anti-pattern as it provides both a weak contract and leaky abstraction. An API should be designed with a specific intent in mind rather than providing a generic set of methods that are automatically generated. OData tends to give rise to very noisy method outputs with a metadata approach that feels more like a WSDL than REST. This doesn’t exactly foster simplicity and usability".

In spite of this, large software companies like Microsoft and SAP still back the protocol, although industry wide OData seems to have declined in popularity.


Interface definition
OData services are described in terms of an Entity Model. The Common Schema Definition Language (CSDL) defines a representation of the entity model exposed by an OData service using (in version 4) JSON.


Transport protocol:

First release:
In May, 2012, companies including Citrix, IBM, Microsoft, Progress Software, SAP AG, and WSO2 submitted a proposal to OASIS to begin the formal standardization process for OData. Many Microsoft products and services support OData, including Microsoft SharePoint, Microsoft SQL Server Reporting Services, and Microsoft Dynamics CRM. OData V4.0 was officially approved as a new OASIS standard in March, 2014.



The Graph Query Language (GraphQL) is a query language for APIs and a runtime for fulfilling those queries with your existing data. It was created by Facebook in 2012 to get around common constraints of the REST approach when fetching data.

GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools.

GraphQL isn't tied to any specific database or storage engine and is instead backed by your existing code and data. A GraphQL service is created by defining types and fields on those types, then providing functions for each field on each type.

GraphQL is not a programming language capable of arbitrary computation, but is instead a language used to query application servers that have capabilities defined in this specification. GraphQL does not mandate a particular programming language or storage system for application servers that implement it. Instead, application servers take their capabilities and map them to a uniform language, type system, and philosophy that GraphQL encodes. This provides a unified interface friendly to product development and a powerful platform for tool-building.

GraphQL has a number of design principles:

- Hierarchical: Most product development today involves the creation and manipulation of view hierarchies. To achieve congruence with the structure of these applications, a GraphQL query itself is structured hierarchically. The query is shaped just like the data it returns. It is a natural way for clients to describe data requirements.
- Product‐centric: GraphQL is unapologetically driven by the requirements of views and the front‐end engineers that write them. GraphQL starts with their way of thinking and requirements and builds the language and runtime necessary to enable that.
- Strong‐typing: Every GraphQL server defines an application‐specific type system. Queries are executed within the context of that type system. Given a query, tools can ensure that the query is both syntactically correct and valid within the GraphQL type system before execution, i.e. at development time, and the server can make certain guarantees about the shape and nature of the response.
- Client‐specified queries: Through its type system, a GraphQL server publishes the capabilities that its clients are allowed to consume. It is the client that is responsible for specifying exactly how it will consume those published capabilities. These queries are specified at field‐level granularity. In the majority of client‐server applications written without GraphQL, the server determines the data returned in its various scripted endpoints. A GraphQL query, on the other hand, returns exactly what a client asks for and no more.
- Introspective: GraphQL is introspective. A GraphQL server’s type system must be queryable by the GraphQL language itself, as will be described in this specification. GraphQL introspection serves as a powerful platform for building common tools and client software libraries.
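
The "hierarchical" and "client-specified queries" principles can be illustrated without a GraphQL library. The sketch below is not a GraphQL implementation: it models a query as a nested Python dict and returns only the requested fields, so that the response has the same shape as the query. The sample data and the `select` helper are made up for this example.

```python
DATA = {  # hypothetical backing data
    "hero": {
        "name": "R2-D2",
        "height": 1.09,
        "friends": [
            {"name": "Luke", "height": 1.72},
            {"name": "Leia", "height": 1.5},
        ],
    }
}

def select(value, selection):
    """Return only the fields named in `selection`, recursing into nested selections."""
    if isinstance(value, list):
        return [select(item, selection) for item in value]
    result = {}
    for field, sub_selection in selection.items():
        field_value = value[field]
        result[field] = select(field_value, sub_selection) if sub_selection else field_value
    return result

# Equivalent in spirit to the GraphQL query: { hero { name friends { name } } }
query = {"hero": {"name": None, "friends": {"name": None}}}
response = select(DATA, query)
print(response)
```

Note that `height` never appears in the response because the client did not ask for it.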

Interface definition
The GraphQL Schema Definition Language (SDL).


Transport protocol:

First release:
GraphQL was publicly released in 2015. The GraphQL Schema Definition Language (SDL) was added to the specification in February 2018.



gRPC is defined by Google as a modern Remote Procedure Call (RPC) framework that can run in any environment.  gRPC in principle enables client and server applications to communicate transparently, and makes it easier to build connected systems.

gRPC was designed based on the following principles:

- Services not objects, messages not references: promote the microservices design philosophy of coarse-grained message exchange between systems while avoiding the pitfalls of distributed objects and the fallacies of ignoring the network.
- Coverage & simplicity: the stack should be available on every popular development platform and easy for someone to build for their platform of choice. It should be viable on CPU & memory limited devices.
- Free & open: make the fundamental features free for all to use. Release all artefacts as open-source efforts with licensing that should facilitate and not impede adoption.
- Interoperability & reach: the wire-protocol must be capable of surviving traversal over common internet infrastructure.
- General purpose & performant: the stack should be applicable to a broad class of use-cases while sacrificing little in performance when compared to a use-case specific stack.
- Layered: key facets of the stack must be able to evolve independently. A revision to the wire-format should not disrupt application layer bindings.
- Payload agnostic: different services need to use different message types and encodings such as protocol buffers, JSON, XML, and Thrift; the protocol and implementations must allow for this. Similarly the need for payload compression varies by use-case and payload type: the protocol should allow for pluggable compression mechanisms.
- Streaming: storage systems rely on streaming and flow-control to express large data-sets. Other services, like voice-to-text or stock-tickers, rely on streaming to represent temporally related message sequences.
- Blocking & non-blocking: support both asynchronous and synchronous processing of the sequence of messages exchanged by a client and server. This is critical for scaling and handling streams on certain platforms.
- Cancellation & timeout: operations can be expensive and long-lived - cancellation allows servers to reclaim resources when clients are well-behaved. When a causal-chain of work is tracked, cancellation can cascade. A client may indicate a timeout for a call, which allows services to tune their behaviour to the needs of the client.
- Lameducking: servers must be allowed to gracefully shut-down by rejecting new requests while continuing to process in-flight ones.
- Flow-control: computing power and network capacity are often unbalanced between client & server. Flow control allows for better buffer management as well as providing protection from DOS by an overly active peer.
- Pluggable: A wire protocol is only part of a functioning API infrastructure. Large distributed systems need security, health-checking, load-balancing and failover, monitoring, tracing, logging, and so on. Implementations should provide extensions points to allow for plugging in these features and, where useful, default implementations.
- Extensions as APIs: extensions that require collaboration among services should favour using APIs rather than protocol extensions where possible. Extensions of this type could include health-checking, service introspection, load monitoring, and load-balancing assignment.
- Metadata exchange: common cross-cutting concerns like authentication or tracing rely on the exchange of data that is not part of the declared interface of a service. Deployments rely on their ability to evolve these features at a different rate to the individual APIs exposed by services.
- Standardised status codes: clients typically respond to errors returned by API calls in a limited number of ways. The status code namespace should be constrained to make these error handling decisions clearer. If richer domain-specific status is needed the metadata exchange mechanism can be used to provide that.

Interface definition
gRPC can use protocol buffers as both its Interface Definition Language (IDL) and as its underlying message interchange format.  Protocol buffers is Google’s mature open source mechanism for serialising structured data.

By default gRPC uses protocol buffers (although it can be used with other data formats such as JSON).
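
To hint at why protocol buffers are compact on the wire, here is a minimal hand-rolled sketch of the protobuf encoding for two field types, written without the protobuf library. It assumes the published wire format (a varint tag made of the field number and wire type, followed by the value) and handles non-negative integers only; the function names are invented for this example.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint, least significant group first."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_int_field(field_number, value):
    """Encode an integer field: tag with wire type 0 (varint), then the value."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)

def encode_string_field(field_number, text):
    """Encode a string field: tag with wire type 2 (length-delimited), length, then bytes."""
    data = text.encode("utf-8")
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(data)) + data

print(encode_int_field(1, 150).hex())           # 089601
print(encode_string_field(2, "testing").hex())  # 120774657374696e67
```

Field names never travel on the wire, only field numbers, which is one reason the encoding is smaller than the equivalent JSON.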

Transport protocol:

First release:
Google released gRPC as open source in 2015.



RSocket is a binary application (layer 7) protocol, originally developed by Netflix (and later by engineers from Facebook, Netifi and Pivotal amongst others), for use on byte stream transports such as TCP, WebSockets, and Aeron. The motivation behind its development was to replace HTTP, which was considered inefficient for many tasks such as microservices communication, with a protocol that has less overhead.

RSocket is a bi-directional, multiplexed, message-based, binary protocol based on reactive streams back pressure. It enables the following symmetric interaction models via async message passing over a single connection:

- request/response (stream of 1).
- request/stream (finite stream of many).
- fire-and-forget (no response).
- channel (bi-directional streams).

It also supports session resumption, to allow resuming long-lived streams across different transport connections. This is particularly useful for mobile <> server communication when network connections drop, switch, and reconnect frequently.

Some of the key motivations behind RSocket include:

- support for interaction models beyond request/response such as streaming responses and push.
- application-level flow control semantics (async pull/push of bounded batch sizes) across network boundaries.
- binary, multiplexed use of a single connection.
- support resumption of long-lived subscriptions across transport connections.
- need of an application protocol in order to use transport protocols such as WebSockets and Aeron.

The protocol is specifically designed to work well with Reactive-style applications, which are fundamentally non-blocking and often (but not always) paired with asynchronous behaviour. The use of Reactive back pressure, the idea that a publisher cannot send data to a subscriber until that subscriber has indicated that it is ready, is a key differentiator from "async".
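
The back pressure idea can be sketched in a few lines of plain Python. This is not RSocket code and uses no RSocket library: the `Publisher` and `BatchSubscriber` classes below are hypothetical, loosely modelled on the reactive streams pattern in which the subscriber signals demand with `request(n)`.

```python
class Publisher:
    """Emits items only when the subscriber has requested them."""

    def __init__(self, items):
        self._items = iter(items)
        self._subscriber = None

    def subscribe(self, subscriber):
        self._subscriber = subscriber
        subscriber.on_subscribe(self)

    def request(self, n):
        # Send at most n items: never more than the subscriber asked for.
        for _ in range(n):
            try:
                item = next(self._items)
            except StopIteration:
                self._subscriber.on_complete()
                return
            self._subscriber.on_next(item)


class BatchSubscriber:
    """Pulls data in bounded batches when it is ready for more."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.received = []
        self.completed = False
        self._subscription = None

    def on_subscribe(self, subscription):
        self._subscription = subscription  # nothing flows until pull() is called

    def on_next(self, item):
        self.received.append(item)

    def on_complete(self):
        self.completed = True

    def pull(self):
        self._subscription.request(self.batch_size)


publisher = Publisher(range(5))
subscriber = BatchSubscriber(batch_size=2)
publisher.subscribe(subscriber)

subscriber.pull()
print(subscriber.received)  # [0, 1]
```

Subscribing alone delivers nothing; data only moves when the subscriber pulls, which is the difference from a publisher that pushes as fast as it can.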

Interface definition
Depends on the implementation. For example RSocket RPC uses Google's protocol buffer v3 as its interface definition language.

RSocket provides mechanisms for applications to distinguish payload into two types. Data and Metadata. The distinction between the types in an application is left to the application.

The following are features of Data and Metadata:

- Metadata can be encoded differently than Data.
- Metadata can be "attached" (i.e. correlated) with the following entities:
  - Connection via Metadata Push and Stream ID of 0
  - Individual Request or Payload (upstream or downstream)

Transport protocol:
The RSocket protocol uses a lower level transport protocol to carry RSocket frames. A transport protocol MUST provide the following:

1- Unicast reliable delivery.
2- Connection-oriented and preservation of frame ordering. Frame A sent before Frame B MUST arrive in source order. i.e. if Frame A is sent by the same source as Frame B, then Frame A will always arrive before Frame B. No assumptions about ordering across sources is assumed.
3- FCS is assumed to be in use either at the transport protocol or at each MAC layer hop. But no protection against malicious corruption is assumed.

RSocket as specified here has been designed for and tested with TCP, WebSockets, Aeron and HTTP/2 streams as transport protocols.

First release:
Although originally released by Netflix in October 2015, the protocol is currently a draft for the final specifications. The current version of the protocol is 0.2 (Major Version: 0, Minor Version: 2). This is currently considered a 1.0 Release Candidate. Final testing is being done in Java and C++ implementations with the goal of releasing 1.0 in the near future.


Tuesday, 30 October 2018

The Se7en Deadly Sins of API Design

During Oracle Code One 2018 (formerly JavaOne) I was lucky enough to deliver a funny yet insightful presentation titled "The Seven Deadly Sins of API Design", focused on API design anti-patterns and how to overcome them.

The presentation was partly inspired by Daniel Bryant's presentation titled 7 Deadly Sins of Microservices, but focused on API design and API-led architectures rather than on Microservices (though the two are related, so some coverage was inevitable). My main motivation, though, was the fact that we're all sinners when it comes to making mistakes. When I first started designing REST APIs (or before that SOAP/WSDL based services), I made many mistakes myself. The main thing is to learn from them, and not just from our own mistakes but from those of others. So my presentation is about this: shortlisting the seven most common pitfalls in API design and architectures and then using the deadly sins as a vehicle to tell a story on how to "deliver us from evil".

The 7 deadly sins, also known as capital sins, represent corrupt and/or perverse versions of love. In this case, corrupt or perverse APIs.

Following is a description of each deadly sin, including the anti-pattern I chose for each:
  1. Lust: unrestrained desire for something. In this sin I talk about why we sometimes focus so much on the implementation aspects of an API, and especially on what tools to use, and not so much on the usability of the API itself. Usability also means getting feedback from the audience of the API to ensure the interface is fit for purpose and intuitive enough, something I refer to as API-design first.
  2. Gluttony: over-indulgence, especially by overeating. I use this sin to articulate the fact that many API implementations end up with several layers of middleware (e.g. mainly load balancers and multiple API Gateways) before an actual service endpoint is reached. This is bad for many reasons (e.g. added complexity, additional costs, etc) and my conclusion is that we should not add layers on top of layers without a strong reason. In some scenarios it might be inevitable, but as a rule of thumb we should question any additional layer added on top of the service. For example, I think one API Gateway should be enough and is justified. Adding another one? Hmm...
  3. Greed: intense and selfish desire for something. Here I talk about how often a frontend delivers a poor user experience as a consequence of chatty APIs that require several API calls in order to construct e.g. a single UI page. I then talk about how to prevent this sin by implementing different patterns such as web-hooks and/or API composition (e.g. with GraphQL).
  4. Sloth: laziness, lack of effort. An obvious one in my view. I highlight the fact that even though REST as an architectural style is well documented on the web and established industry wide, we still go about implementing "corrupt and perverse" versions of REST. I then describe some (obvious?) REST best practices, though in my view there is no excuse for a poor RESTful design given how well documented the style is.
  5. Wrath: uncontrolled feelings of hatred and anger. In this sin I use as an example the fact that many APIs are poorly documented or not documented at all, which can result in angry developers. To present this sin, I talk about some documentation best practices and show a diagram depicting what good API documentation should look like.
  6. Envy: jealousy towards another's happiness. My favourite one (and I think the audience's too). I talk about the fact that sometimes we get hung up on a specific architectural style without considering alternatives that might be better suited to the problem at hand. Look at the slides, you'll laugh.
  7. Pride: inflated sense of one's accomplishments. Lastly, I talk about bottom-up API design, which I refer to as the result of auto-generating API specs either from code or from a relational database model without first designing/mocking the interface and establishing feedback loops. I am not in favour of this approach for many reasons, but most notably because of the lack of abstraction and separation of concerns. For example, if an API is derived from a relational model and an API consumer binds to the interface, then a change in the database might end up propagating all the way to the user interface. BAD IDEA. But this also means, as in the case of Lust, that usability and feedback loops are not really considered as there is no API design; instead a database or backend system design is forced upon API consumers.
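
To make the abstraction argument under Pride a little more concrete, here is a minimal sketch of a mapping layer between a storage model and a designed API representation. The column names, the `COLUMN_TO_FIELD` mapping and the `to_api_customer` function are hypothetical and exist only for this illustration.

```python
from typing import Dict, Mapping

# Hypothetical mapping from database column names to the API fields that
# were agreed with consumers at design time.
COLUMN_TO_FIELD: Dict[str, str] = {"CUST_ID": "id", "CUST_NM": "name"}


def to_api_customer(
    row: Mapping[str, object],
    column_to_field: Mapping[str, str] = COLUMN_TO_FIELD,
) -> Dict[str, object]:
    """Expose only the designed fields; extra columns in the row are ignored."""
    return {field: row[column] for column, field in column_to_field.items()}


# Before a schema change.
api_before = to_api_customer({"CUST_ID": 42, "CUST_NM": "Ada", "CRT_TS": "2018-10-30"})

# After the columns are renamed, only the mapping changes; the API shape does not.
api_after = to_api_customer(
    {"CUSTOMER_ID": 42, "FULL_NAME": "Ada"},
    {"CUSTOMER_ID": "id", "FULL_NAME": "name"},
)
print(api_before == api_after)
```

With an API generated straight from the table, the rename would instead surface in every consumer.
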
Here's the presentation:

As you can probably tell, the theme of the presentation is inspired by the movie Se7en. If you haven't seen it, please do, as it's brilliant.

I am also glad to say that the session was very well attended (standing room only), which I am very happy about given the effort that went into preparing it. Anyone who has worked hard on a presentation knows exactly what I mean by this!

Last but not least, I want to thank my team in Capgemini (Phil, Sander, Yury, Ben, Amy and James) for giving me feedback, as it helped a lot in maturing the story line.

Wednesday, 25 July 2018

The Spotify's Engineering Culture. My interpretation and summary.

I've come across the so-called "Spotify model" several times. Pretty much every organisation I am working with is using it one way or another, either as inspiration for their target organisation or as an example of what they would like their IT culture to be like.

Thanks to the brilliant 2-part video posted by Henrik Kniberg, I was able to listen, visualise and truly digest what this engineering culture actually means from an organisational, technological and people/culture point of view.

To this end, I've created the following presentation with the intention to also share my interpretation of their engineering approach.

The Spotify engineering culture empowers its people at many different levels as it provides a very good balance of freedom and structure. Its open approach towards collaboration, respect and trust ensures that Squads are aligned and share knowledge and experiences, thus avoiding common pitfalls whilst not reducing the amount of innovation.

Their experimental and "fail fast-learn fast-improve fast" culture is an engine for innovation, as teams are encouraged to try new ideas out without worrying about being punished if some of the ideas fail.

Spotify's decoupled architecture (probably based on Microservices although not explicitly mentioned) is most likely a result of their engineering culture, as opposed to being purely driven by technology and/or architectural preferences. I can't help but say it's Conway's law in action.

This model, however, is not for all organisations and many will find it very difficult to adopt. Especially for large traditional corporations, where the level of politics and bureaucracy is so high that change takes ages to occur, shifting to the Spotify way of doing things will be a huge undertaking. For such [traditional] organisations, keeping pace with more innovative companies (those that do succeed in adopting a Spotify-like model) will be a struggle. On the flip side, large organisations that do manage to shift will be able to benefit from their size and market reach plus the agility, speed and innovation enjoyed by the likes of Spotify. Only time will tell!

Tuesday, 15 May 2018

A comparison of push vs phone-home communication approaches between API Gateways and Management Services

API Gateways deliver critical runtime capabilities in enterprise-wide API management infrastructures. However, such runtime capabilities must also be complemented with design-time and governance capabilities in support of activities such as API lifecycle management, API design, policy definition and implementation, deployment, retirement, monitoring, and so on.

These design-time/governance capabilities are often offered by API management vendors as a separate Management Service infrastructure that complements the runtime infrastructure (API Gateways). Needless to say, for the runtime and design-time/governance infrastructure to work together as a cohesive whole, there must be effective and reliable communication between these two main components.

Whereas some products, for example the Oracle API Platform Cloud Service, deliver a phone-home approach for API Gateways to communicate with the management infrastructure, other vendors implement a push approach whereby the Management Service is responsible for establishing and handling the connection to the API Gateways.

The two approaches are fundamentally different, and understanding how such differences can impact/influence a solution becomes even more critical as the need for API Gateways increases, e.g. as a result of adopting cloud or Microservices Architectures.

Furthermore, as cloud adoption continues to rocket, vendors also offer Management Service capabilities as a PaaS cloud service. This is important and not trivial, as it means that communication between the PaaS-based management infrastructure and the API Gateways must be in place prior to implementing the solution.

This article compares these two main communication strategies and highlights key differences including pros and cons (from the point of view of the author).

Installation / Configuration:
When installing/configuring an API Management solution, both the Management Service and the API Gateways should be provisioned and configured so they can talk to each other. The communication style can have a considerable impact on the steps required to implement the solution (e.g. opening firewalls, etc).

Push:
  • Pros:
    • No obvious benefit.
  • Cons:
    • May require several network / firewall changes in order to allow inbound connections to the API Gateways if they reside in a DMZ, e.g. on-premises. This may be even more complicated if the ports used are non-standard and random.
Phone-home (pull):
  • Pros:
    • Installation / configuration of the API Gateway "might" not require any major changes in networks/firewalls as connections are initiated by the API gateways to the management service (in principle) using standard ports (e.g. HTTPS/443).
  • Cons:
    • Outbound connectivity to the Management Service must be in place. If the management infrastructure resides in the cloud, this means outbound internet access is required. Not necessarily a con but an important consideration.
Deployment of APIs:
Once APIs are defined along with their relevant policies (e.g. OAuth, API key validation, throttling, rate-limiting, API plans, etc) they are deployed to the relevant API Gateways. The more API Gateways an API has to be deployed to, the more complicated and error-prone the process can be.

Push:
  • Pros:
    • The main benefit is that deployment of APIs occurs immediately, as soon as a deployment task is initiated. This is because the Management Service initiates the connection to the API Gateway.
  • Cons:
    • As direct connection to the API Gateway is required, some vendor offerings might require the Management Service and API Gateways to reside in the same network segment. This is an important constraint and it also means that as more API Gateways are required (e.g. in the cloud or different locations), additional Management Services have to be implemented, which is not ideal and introduces additional management overheads, complexity and costs.
    • Depending on the size of the solution and how many Management Services are in place, deployment of a single API to multiple API Gateways in one go can be a non-trivial task.
    • Issues in communication between Management Service and API Gateways may only become evident during the deployment of APIs, which isn't ideal.
Phone-home (pull):
  • Pros:
    • It is the API Gateway's responsibility to phone home and download/configure APIs, typically at a pre-defined interval (e.g. every minute, every hour).
    • The intervals also act as a heartbeat, so issues in communication between the API Gateway and the Management Service should become evident rather quickly and not during a deployment.
  • Cons:
    • Deployments may take a while to complete if the pre-defined time intervals are long (basically it won't be immediate).
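
The phone-home behaviour described above can be sketched as follows. This is a minimal in-memory illustration, not any vendor's API: the `ManagementService` and `Gateway` classes, the version counter and the "two missed intervals" liveness rule are all assumptions made for the example. Time is passed in explicitly so the sketch runs without sleeping or any network access.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


class ManagementService:
    """Hypothetical in-memory stand-in for a management service."""

    def __init__(self) -> None:
        self.version = 0
        self.apis: Dict[str, str] = {}
        self.last_seen: Dict[str, float] = {}  # gateway id -> time of last poll

    def deploy(self, name: str, definition: str) -> None:
        # A deployment only changes the desired state; nothing is pushed out.
        self.apis[name] = definition
        self.version += 1

    def fetch(
        self, gateway_id: str, known_version: int, now: float
    ) -> Optional[Tuple[int, Dict[str, str]]]:
        # Every poll doubles as a heartbeat from that gateway.
        self.last_seen[gateway_id] = now
        if known_version == self.version:
            return None  # nothing new since the last poll
        return self.version, dict(self.apis)

    def is_alive(self, gateway_id: str, now: float, poll_interval: float) -> bool:
        # Assumed rule: a gateway that misses two intervals is considered suspect.
        seen = self.last_seen.get(gateway_id)
        return seen is not None and now - seen <= 2 * poll_interval


@dataclass
class Gateway:
    gateway_id: str
    version: int = 0
    apis: Dict[str, str] = field(default_factory=dict)

    def poll_once(self, service: ManagementService, now: float) -> bool:
        """Phone home once; return True if new configuration was applied."""
        update = service.fetch(self.gateway_id, self.version, now)
        if update is None:
            return False
        self.version, self.apis = update
        return True


service = ManagementService()
gateway = Gateway("gw-1")
service.deploy("orders", "v1")                  # deployed, but not yet on the gateway
applied = gateway.poll_once(service, now=60.0)  # picked up at the next poll
print(applied, gateway.apis)
print(service.is_alive("gw-1", now=90.0, poll_interval=60.0))
print(service.is_alive("gw-1", now=300.0, poll_interval=60.0))
```

The sketch shows both sides of the trade-off: the deployment is not applied until the next poll, but a gateway that stops polling is noticed without waiting for a deployment to fail.
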
Infrastructure Topology:
As briefly mentioned previously, how the management infrastructure and the API Gateways communicate imposes important considerations with regard to the overall API management infrastructure topology and the options available.

Push:
  • Pros:
    • Typically the Management Service can be installed and configured by the client in any infrastructure of choice. This can be beneficial if a full API Management infrastructure is required in a closed-loop network with several constraints (e.g. a cruise ship).
  • Cons:
    • Potential proliferation of management infrastructure as more than one Management Service might be required.
    • Increased complexity in the management of the infrastructure and most likely also an increase in costs.
Phone-home (pull):
  • Pros:
    • Typically it means that API Gateways can share a single management infrastructure, which simplifies the solution and reduces the management overhead and costs.
  • Cons:
    • If the management infrastructure is only available as a PaaS capability, for solutions with special network constraints or requirements this might not be an option.
Solution at Scale:
Some very large organisations may have the need to implement API Gateways at very large scale (e.g. hundreds to even thousands). In such large-scale implementations, every small factor becomes an important consideration, and how Management Services and API Gateways communicate is certainly one of them.

Push:
  • Pros:
    • If the solution requires multiple, separate management infrastructures (e.g. for organisational reasons), a push model could work well.
  • Cons:
    • A proliferation of management infrastructure results in overall higher TCO and more complexity.
    • Additional tooling may be required to provide some sort of centralised monitoring capabilities so the overall API management infrastructure can be monitored.

Phone-home (pull):
  • Pros:
    • This solution is more easily scalable, as more API Gateways can be added without necessarily having to also increase the number of Management Services.
    • Simpler solution to operate given reduced number of Management Services.
    • Deployment of APIs becomes a much easier task. A single Management Service can deploy to several API Gateways deployed in many locations easily.
  • Cons:
    • A single management infrastructure could also introduce a single point of failure. Therefore adequate high-availability infrastructure must be in place.

In the majority of cases the business or IT management won't be interested in understanding how API Gateways and management infrastructure interact. However, a different story emerges if the implications of such communication are presented in terms of TCO impact, scalability of the solution and business agility. This article summarised some of these implications in terms of installation/configuration, deployment, infrastructure topology and scalability. But ultimately, which solution (push vs phone-home) works best for your organisation should be determined by the business requirements driving the need for APIs and their management, and by how well each approach helps meet them.

Friday, 2 February 2018

Is BPM Dead, Long Live Microservices?

With the massive industry-wide uptake of Microservices Architecture and, with it, the adoption of patterns such as Event Sourcing, CQRS and Saga as the means for Microservices to asynchronously communicate with each other and effectively "choreograph" business processes, it might seem as if the days of process orchestration using BPM engines (e.g. Oracle Process Cloud, now also part of Oracle Integration Cloud, Pega, Appian, etc) or BPEL (or BPEL-like) engines are over.

Although the use of choreography and associated patterns (such as the aforementioned) makes tons of sense in many use cases, I've come across a number of them where choreography can be impractical.

Some examples:
  • Data needs to be collected and aggregated from multiple services, e.g. check the Composition pattern. Note that this pattern doesn't necessarily imply that an orchestration is required; it could be that data is simply collected and aggregated (not transformed) into a single response. But if data collected from multiple sources also needs to be transformed into a common response payload, then it feels pretty close to one of the typical use cases for orchestration.
  • The process is human-centric and can't be fully automated. Basically, at some point a human has to take an action in order for the process to complete (e.g. approval of a credit card application, or a credit check). BPM/Orchestration tools tend to be quite good at this.
  • There is a need to have very clear visibility of the end-to-end business processes. In traditional BPM tools this is fairly straightforward. With Choreography / Events, although it is possible to monitor individual events, a form of correlation would be required to build an end-to-end view of the status of a business process.
It was "perhaps" for some of these reasons that Netflix developed their own process orchestration engine called Netflix Conductor (now also open sourced). Their reason for developing this tool, in their own words:

"With peer to peer task choreography, we found it was harder to scale with growing business needs and complexities" --read this link for the complete article.

And it's not just Netflix: the likes of Camunda, Zeebe and Baker seem to have spotted the need for such microservices-oriented process engines, and their solutions fit well with this architectural style.

This is also one of the reasons behind the concept of semi-decoupled services, in other words, services that are not entirely independent, either because they run on a shared runtime or because they conduct an orchestration and therefore have runtime coupling to other services. This is in contrast to a fully-decoupled service that only uses choreography to interact with other services (aka Microservice).

Sample Use Case
In order to better illustrate what's being said, take the following sample use case:
  • A simple Credit Check process that determines whether a customer's credit score is adequate for a given transaction.
  • In certain scenarios (e.g. just above threshold credit score), manual human intervention is required to accept or reject an application.
  • The process can be implemented in a number of ways:
    1. As an orchestrated (synchronous) business process
    2. As a choreographed (asynchronous) business process - no process engine
    3. Choreographed but with process engine
Let's have a deeper look:

1) Orchestrated (Synchronous) Credit Check Process
An orchestrated business process implemented with a traditional process engine of choice, and synchronous because both request and reply would be handled within the same HTTP thread. As can be seen in the diagram, this is not dramatically different from a traditional SOA architecture. The process engine could be a BPMN 2.0 engine or a BPEL orchestration tool (as many support human workflows).

The main advantage of this approach is that process metrics are clearly visible end to end, plus it would be fairly straightforward to implement, including important capabilities such as exception and compensation handling. However, in terms of performance and scalability, as all HTTP requests are synchronous, if threads can't be served rapidly they could accumulate and become a bottleneck (e.g. hung threads).

It's also worth noting that these advantages hold only so long as a process engine is already available for use and it supports REST as an entry point to trigger a process. Should this not be the case, the effort involved in standing up new infrastructure would probably counter the benefits, and other options might be more viable.
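
To illustrate the orchestrated, synchronous style in code rather than in a process engine, here is a minimal sketch in which a single function drives every step and the caller blocks until it returns. The score thresholds, the sample scores and the function names are assumptions made for this illustration only.

```python
from typing import Callable

APPROVE_ABOVE = 700  # assumed thresholds, purely illustrative
REJECT_BELOW = 600


def get_credit_score(customer_id: str) -> int:
    # Hypothetical stand-in for a synchronous call to a scoring service.
    return {"alice": 750, "bob": 640, "carol": 480}.get(customer_id, 0)


def credit_check(customer_id: str, manual_review: Callable[[str, int], bool]) -> str:
    """Orchestrator: calls each step in order and returns the outcome to the caller."""
    score = get_credit_score(customer_id)
    if score >= APPROVE_ABOVE:
        return "approved"
    if score < REJECT_BELOW:
        return "rejected"
    # Borderline score: the whole process blocks until a human decides.
    return "approved" if manual_review(customer_id, score) else "rejected"


def reviewer_accepts(customer_id: str, score: int) -> bool:
    # Stand-in for the human workflow step.
    return True


for customer in ("alice", "bob", "carol"):
    print(customer, credit_check(customer, reviewer_accepts))
```

Because the orchestrator owns the whole sequence, the state of each request is visible in one place, but every waiting caller also occupies a thread for the duration of the manual step.
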

2) Choreographed (Asynchronous) Credit Check Process –no process engine
This alternative would be a fully choreographed, and thus asynchronous, business process based purely on a Microservices Architecture. No process engine is used in this option. Services are completely independent and only communicate with each other using events via an event hub.

The main benefit of this option is the flexibility, extensibility and adaptability it delivers. Because all interactions would be via events, services are decoupled from one another and can therefore be developed, deployed, tested and scaled independently. Furthermore, this architecture, if done well, could scale to handle very large throughput as each service can scale independently.
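
The choreographed style can be sketched with a tiny in-memory event hub. Everything here is hypothetical (the `EventHub` class, the event names and the scoring rule), and a real event hub would be a message broker delivering events asynchronously, whereas this sketch dispatches them in-process to stay runnable.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Event = Dict[str, Any]
Handler = Callable[[Event], None]


class EventHub:
    """Hypothetical in-memory event hub; a real one would be a message broker."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> None:
        for handler in list(self._subscribers[topic]):
            handler(event)


hub = EventHub()
outcomes: List[Event] = []


def scoring_service(event: Event) -> None:
    # Reacts to a request and emits a new event; it knows nothing about who listens.
    score = {"alice": 750, "carol": 480}.get(event["customer"], 0)
    hub.publish("CreditScored", {**event, "score": score})


def decision_service(event: Event) -> None:
    status = "approved" if event["score"] >= 700 else "rejected"
    hub.publish("CreditDecided", {**event, "status": status})


# No component calls another directly: the process emerges from subscriptions.
hub.subscribe("CreditCheckRequested", scoring_service)
hub.subscribe("CreditScored", decision_service)
hub.subscribe("CreditDecided", outcomes.append)

hub.publish("CreditCheckRequested", {"customer": "alice"})
hub.publish("CreditCheckRequested", {"customer": "carol"})
print([(e["customer"], e["status"]) for e in outcomes])
```

Adding a new step means adding a subscriber, without touching the existing services, which is the decoupling benefit described above.
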

However, this approach would be complex to implement given the increased number of services and events. Coordinating the sequence of actions and events requires careful modelling. Getting end-to-end visibility of the business process also wouldn't be straightforward unless additional tooling or custom solutions are adopted. In addition, for the human workflow bits, a custom web application would have to be developed, which is no simple task if the approval workflows are complex (at this point the question would be: why reinvent the wheel when process orchestration engines do this out of the box and quite well?).
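
As a hint of what such custom visibility tooling involves, the sketch below rebuilds a per-process view from a flat stream of events by grouping them on a correlation ID. The event names, the log format and the `process_view` function are assumptions made for this illustration.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical event log as (correlation_id, event_name) pairs, in arrival order.
event_log: List[Tuple[str, str]] = [
    ("app-1", "CreditCheckRequested"),
    ("app-2", "CreditCheckRequested"),
    ("app-1", "CreditScored"),
    ("app-1", "CreditDecided"),
    ("app-2", "CreditScored"),
]

FINAL_EVENT = "CreditDecided"


def process_view(log: List[Tuple[str, str]]) -> Dict[str, Dict[str, Any]]:
    """Group events by correlation ID to rebuild the state of each process instance."""
    view: Dict[str, Dict[str, Any]] = {}
    for correlation_id, name in log:
        instance = view.setdefault(correlation_id, {"steps": [], "complete": False})
        instance["steps"].append(name)
        if name == FINAL_EVENT:
            instance["complete"] = True
    return view


view = process_view(event_log)
for correlation_id, instance in view.items():
    print(correlation_id, instance["steps"][-1], instance["complete"])
```

This only works if every service copies the correlation ID onto the events it emits, which is one of the conventions that has to be modelled carefully up front.
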

Lastly, it's also worth noting that in order to effectively "call back" the consumer application, some sort of callback handler, implemented for example using WebSockets or Server-Sent Events, would be required. This isn't necessarily simple and would therefore add to the complexity.

3) Choreographed (Asynchronous) Credit Check Process –with process engine
This option can be seen as the best of both worlds. Event Sourcing is still adopted as a pattern. However, instead of having only services react to events in order to accomplish all desired process steps, a business process implemented in a process engine is adopted, so that it can execute all desired steps by publishing or subscribing to the relevant events. Using modern process engines such as the aforementioned, the process itself could either be compact enough to be deployed to its own runtime, meaning it would effectively be a Microservice in its own right, or, as is the case in Netflix Conductor, multiple (worker) microservices could independently interact (via events) with Conductor's worklist and queue services.

Alternatively, only a portion of the process could be implemented in the process engine, e.g. the Human Workflow bit, given that this feature alone would save considerable effort if the tool does it out of the box. Key to getting this right is to ensure that the process itself can be mapped to a single bounded context and doesn't span multiple ones, as that would break one of the main principles of domain-driven design, which is fundamental in Microservices Architectures. As was well stated by Bernd Rücker from Camunda (another great BPM engine that aligns well to Microservices Architectures), a much better way to define business processes is to "cut the end-to-end process into appropriate pieces which fit into the bounded contexts".

There are many benefits to this approach. For starters, it would be simpler (though not simple) to implement than the previous option, although not as scalable and flexible. However, because better visibility of the business process analytics would be available, this would perhaps compensate for the drawbacks. Add to this the fact that building a bespoke web app for approvals wouldn't be required, and it is certainly an option to consider.

Lastly, as per the previous approach, a callback handler would also be required.

Comparing the 3 approaches
The following comparison makes it simpler to visualise the pros/cons of each option. The three approaches compared are:
  1. Orchestrated (Synchronous) Credit Check Process
  2. Choreographed (Asynchronous) Credit Check Process, no process engine
  3. Choreographed (Asynchronous) Credit Check Process, with process engine

Complexity of implementation
  • Orchestrated: (++) Less complex. Fairly straightforward to implement. Known pattern.
  • Choreographed, no process engine: (--) Complex to implement (many moving pieces) with additional technologies and considerations. Increased number of services and events to handle, plus a custom web application for human workflow.
  • Choreographed, with process engine: (+-) Simpler, though not simple, especially when compared to a purely choreographed process.

Scalability (ability to scale and handle high throughput)
  • Orchestrated: (--) Process can become the bottleneck if many parallel threads need to be handled.
  • Choreographed, no process engine: (++) Very scalable and can handle large throughputs as each service can scale fully independently. Fully decoupled architecture.
  • Choreographed, with process engine: (+-) Even though services can scale independently, the process "could" become a bottleneck (depending heavily on what process engine is used). However, because the process is asynchronous it could handle more parallel threads than a synchronous option.

Visibility of end to end process
  • Orchestrated: (++) Process can be monitored end to end, which can be very useful from a business standpoint.
  • Choreographed, no process engine: (+-) Visibility of an end to end process is not as straightforward but still possible if the right tooling is used.
  • Choreographed, with process engine: (+) Process can be monitored almost end to end, which can be very useful from a business standpoint.

Flexibility (ability to independently change/deploy components without affecting others)
  • Orchestrated: (+-) Any change to an API consumed by the process will directly impact the process itself. This could be avoided by adding a virtualisation layer in between, but would result in additional complexity.
  • Choreographed, no process engine: (++) Very flexible. Runtime decoupling via events. Almost all components can be evolved without impacting others (provided events are kept consistent).
  • Choreographed, with process engine: (+) Good flexibility. Runtime decoupling via events. Almost all components can be evolved without impacting others (provided events are kept consistent), however the majority of the process is still dependent on a central process engine.

There are no silver bullets. No exceptions in this case. However once again, I think Netflix with its Conductor "microservice" orchestrator is changing the ball game on what we thought would be acceptable in a Microservices Architecture.

That said, and answering the question in the title of this article, I don't think BPM / process orchestration is dead per se. What I do think, though, is that the way process orchestrations are implemented and processes are modelled should (and probably will) change to be more microservices / event oriented, and therefore be able to take part in a Choreography.

However, it is equally important that the anatomy / underlying architecture of more traditional process engines also changes (evolves), not just to cope with much higher throughputs but also to support a distributed deployment model, e.g. each process, within its bounded context, could be packaged and deployed to its own runtime. In the meantime, I hope that some of the approaches described in this article provide some inspiration on how to try and combine two paradigms that until recently (at least to me) seemed completely incompatible.

Lastly, I would like to thank Lucas Jellema, Lonneke Dickmans and especially Guido Schmutz and Sven Bernhardt for their valuable contributions to this article.