There are a lot of integration approaches and standards out there. To make sense of the variety, I distinguish three successive generations of such standards. As far as I'm aware this classification is not commonly accepted, but then I'm not aware of any other sensible, commonly accepted classification either, so I see mine as credible enough for casual use.

The first generation is, unsurprisingly, the oldest one. Standards belonging to this generation started to emerge in the dark ages of distributed computing, when merely sending chunks of bytes around in a reliable way was considered an achievement. Thanks to this byte-centric perspective, those standards were concerned mostly with structuring those chunks: what the layout of the data is, what the fields are, what delimiters are used, and so on. Hence, I collectively refer to standards of the first generation as syntax-based integration standards.

Perhaps the most widely used first-generation standard is IP. The structure of IP packet headers is defined in a bit-precise way, and the well-formedness rules are again specified down to the bit level. However, the fact that the first generation is the oldest one doesn't mean it is outdated. In fact, quite the opposite: most integration standards in use today belong to the first generation. Even the relatively modern concept of a web service, at least as far as the WS-I interoperability standard is concerned, still describes a first-generation standard, because what the XSD specifications embedded in a WSDL define is a concrete syntax.
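To see just how bit-precise a first-generation standard gets, here is a minimal sketch of parsing the fixed 20-byte portion of an IPv4 header as defined in RFC 791. It is deliberately simplified: options, checksum validation and fragmentation handling are all ignored.

```python
import struct

def parse_ipv4_header(data: bytes) -> dict:
    """Parse the fixed 20-byte portion of an IPv4 header (RFC 791)."""
    if len(data) < 20:
        raise ValueError("IPv4 header is at least 20 bytes")
    # The layout is specified down to the bit: version and IHL even
    # share a single byte, so they must be split with shifts and masks.
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,       # high nibble
        "ihl": ver_ihl & 0x0F,         # header length in 32-bit words
        "total_length": total_length,
        "ttl": ttl,
        "protocol": proto,             # e.g. 6 = TCP, 17 = UDP
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
    }

# A hand-crafted header: version 4, IHL 5, TTL 64, TCP, 10.0.0.1 -> 10.0.0.2
sample = bytes([0x45, 0, 0, 20, 0, 0, 0, 0, 64, 6, 0, 0,
                10, 0, 0, 1, 10, 0, 0, 2])
hdr = parse_ipv4_header(sample)
```

Every field offset and width here comes straight from the specification; two implementations that disagree on a single bit simply cannot interoperate.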

There’s a joke about first-generation integration standards: the good thing about standards is that there are so many to choose from. The second generation of integration standards started to emerge when people began to take this joke seriously. There are so many (almost) overlapping and (nearly) interchangeable standards that true ad hoc interoperability is possible only by pure coincidence, which in fact undermines the point of having integration standards at all.

Authors of second-generation standards tried to make sense of the wide variety of syntaxes by standardizing the underlying model of the information that those syntaxes represent. A model, in turn, can be rendered in a variety of concrete syntaxes. Consequently, standards of the second generation are called model-based integration standards.

Nowadays, model-based standards are rather common, though of course not as common as syntax-based ones. Arguably, one of the most widely known, albeit somewhat dated, second-generation standards is CORBA IDL: a declarative language used to define remotely accessible object interfaces while remaining agnostic to the underlying “on-the-wire” encoding of the data passed as parameters and return values. Another example is ISO 20022, a second-generation standard commonly accepted within the financial industry, which describes models for messages representing financial transactions currently covered by such first-generation standards as ISO 15022 (so-called “MT messages”), FIX, FpML and others.
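The model/syntax split is easy to demonstrate in a few lines: one standardized model, two interchangeable concrete renderings. The `PaymentInstruction` class and its field names below are invented for illustration; they are not taken from ISO 20022 or any real standard.

```python
import json
from dataclasses import dataclass, asdict

# A hypothetical standardized *model* of a payment instruction.
# The field names are made up, not drawn from any real standard.
@dataclass
class PaymentInstruction:
    debtor: str
    creditor: str
    amount: str
    currency: str

def to_json(p: PaymentInstruction) -> str:
    """One concrete syntax for the model: JSON."""
    return json.dumps(asdict(p), sort_keys=True)

def to_xml(p: PaymentInstruction) -> str:
    """Another concrete syntax for the very same model: XML."""
    fields = "".join(f"<{k}>{v}</{k}>" for k, v in asdict(p).items())
    return f"<PaymentInstruction>{fields}</PaymentInstruction>"

p = PaymentInstruction("ACME Corp", "Widgets Ltd", "100.00", "EUR")
```

Two parties that agree on the model can pick whichever rendering suits them; a first-generation standard, by contrast, fixes the rendering itself.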

A wave of interest in the third generation of integration standards coincides with relatively recent theoretical developments in holistic approaches to the IT architecture of an enterprise. Even if a model for every integration concern were standardized, there would still be too many disparate yet overlapping concerns that an enterprise has to deal with.

To illustrate the point, consider the notion of a purchase. The physical product being procured may have manifestations in a multitude of different aspects of an enterprise while still maintaining its identity. First, a purchase order for the product is received by the vendor through some sort of supply chain integration, possibly via ebXML, and an invoice is sent back. Second, an inbound payment is acknowledged by a bank and reported to the company via a SWIFT MT message. Third, an inventory tracking system is told to deduct the number of items from stock. Finally, shipping is agreed upon with a courier service. And there might be some automated regulatory reporting involved if, say, the product is subject to export restrictions or a special export VAT rate. Data related to the same physical item is exchanged through all of these integration points, but they have widely varying and even mutually inconsistent understandings of what the item is.

Therefore, someone somehow still has to bridge all those differences between disparate domains by implementing and maintaining fairly sophisticated EAI logic that has to cater for such obscure things as partial payments, returns, bulk and repeat orders, lost shipments, claims, reconciliation, and the like.
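The essence of that bridging logic can be sketched in a few lines. Every system name, record key and mapping here is invented for illustration; the point is that the correspondence between the systems' views of the same item exists nowhere except in hand-maintained glue.

```python
# The same physical product, as seen by three hypothetical systems.
# Note that no field is shared between them.
order_system   = {"sku": "W-100", "desc": "Widget, blue", "qty": 10}
inventory      = {"item_no": "0000100", "on_hand": 250}
payment_report = {"ref": "INV-2024-17", "amount": "100.00 EUR"}

# The bridging EAI logic must carry the correspondences explicitly,
# and keep them up to date by hand.
invoice_to_sku = {"INV-2024-17": "W-100"}
sku_to_item_no = {"W-100": "0000100"}

def items_paid_for(payment: dict) -> str:
    """Resolve a payment back to an inventory item via the glue mappings."""
    sku = invoice_to_sku[payment["ref"]]
    return sku_to_item_no[sku]
```

Every edge case the text mentions (partial payments, returns, lost shipments) adds another mapping or special case to this glue, which is exactly why such EAI logic grows so sophisticated.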

Third-generation integration standards offer an answer to this challenge: the standardization of semantics, i.e. converging on a common meaning of elements belonging to the disparate underlying models of all the integration standards involved. Such standards usually start by agreeing on a particular upper ontology that describes general categories of being such as time, space, mathematics and physics, and then produce unambiguous specifications of all relevant models in a formal notation, quite commonly based on first-order predicate calculus. In other words, the natural-language specifications of first- and second-generation integration standards are rewritten in a logically rigorous way, for example as RDF triples. Considering all of the above, it is probably best to refer to the third generation as semantics-based integration standards.
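What triple-based reconciliation might look like can be sketched with plain Python tuples instead of a real RDF toolkit. The `erp:`, `wms:` and `onto:` vocabulary below is invented, though `rdf:type` and `owl:sameAs` mimic real RDF/OWL terms: once two system-specific identifiers are formally asserted to denote the same thing, facts recorded against either of them apply to both.

```python
# Facts as subject-predicate-object triples. The erp:/wms:/onto:
# identifiers are invented; rdf:type and owl:sameAs echo real RDF/OWL.
triples = {
    ("erp:Item100",  "rdf:type",    "onto:PhysicalObject"),
    ("wms:SKU-W100", "rdf:type",    "onto:PhysicalObject"),
    ("erp:Item100",  "owl:sameAs",  "wms:SKU-W100"),
    ("erp:Item100",  "onto:hasMass", "1.2kg"),
}

def facts_about(subject: str) -> set:
    """Collect facts about a subject, following owl:sameAs links one hop."""
    same = {subject}
    for s, p, o in triples:
        if p == "owl:sameAs" and (s == subject or o == subject):
            same |= {s, o}
    return {(p, o) for s, p, o in triples if s in same and p != "owl:sameAs"}
```

Here the mass recorded by the ERP system is visible when asking about the warehouse's SKU, with no hand-written glue between the two: the identity of the underlying object is itself a formal, machine-checkable fact.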

Considering the amount of effort required to rewrite and reconcile the differences between disparate integration domains, it is not surprising that third-generation standards are still in their infancy. However, there are a few prominent attempts that are slowly gaining traction, at least in certain industries. As far as I know, the most well-developed one is ISO 15926: Industrial automation systems and integration — Integration of life-cycle data for process plants including oil and gas production facilities. Despite its confusing name, it actually contains a full-blown upper ontology of 201 elements describing categories that are arguably sufficient to construct any concept required in practice. As of now, the ISO 15926 standard is primarily used by manufacturing organizations to share and hand over specifications for engineering artifacts and components, but there are a few interest groups working toward applying the standard to other domains.

Needless to say, third-generation standards do look promising. However, having recently been exposed to post-structuralist thinking, I've gained a new perspective that makes me feel incredulity toward the possibility of an overarching third-generation integration standard. But that is something I'm going to address in a separate post.