Development drivers for API-based integration
Six major drivers are pushing application and data integration to converge via APIs.
Cheap bandwidth: Advances in fiber optics enable cables to carry terabytes of data at the speed of light, while 4G (and the imminent arrival of 5G) sends gigabytes over the air to mobile devices. This abundant supply is driving the effective cost of bandwidth down to the point of insignificance.
Sensors, systems and data explosion: Increasingly, we live in a connected world filled with sensors and systems that generate enormous amounts of machine data. The maturity of the global IT manufacturing supply chain has lowered the cost of IT hardware significantly, making it feasible to collect, collate, process and make sense of this huge data load with artificial intelligence, largely removing the human bottlenecks in the process.
Privacy and security: The deployment of cheap biometric identification devices by government organizations worldwide has necessitated robust legal frameworks for data protection. The advent of social networks has sensitized the masses to the dangers of lacking or losing privacy.
Private, public and hybrid cloud: While on-premise systems are not going away anytime soon, every organization worth its name is investing significant effort in a hybrid cloud strategy to cut operational costs. Beyond reducing idle capacity, the cloud's ‘pay as you go’ model is inherently better suited to the business, which drives its adoption.
Handheld devices everywhere: Cheap bandwidth combined with the significant processing power of handheld devices has increased data consumers' appetite for real-time data. The information latency of traditional batch processes is no longer acceptable. Enriched data must be available for user consumption as soon as the event occurs, and the event must be tracked until its logical closure across all four dimensions of space and time.
Everything has an app for it: The data consumer is ready to pay for on-demand service, giving rise to the ‘subscription’ model of instant revenue generation from data. Taking advantage of this, an ecosystem of connected apps has emerged, providing every kind of service in every corner of the world. The data these apps use are not always their own, and both their interaction patterns and the data they generate change rapidly. This calls for a different kind of agile tool: one that is lightweight, fault tolerant and horizontally scalable.
IPaaS: The wholesome integration
In the last few years, Integration Platform as a Service (IPaaS) has gained significant traction. Taking advantage of increased security over public networks, cheaper connectivity and the larger on-demand capacity of the cloud, IPaaS tools are steadily gaining ground over traditional on-premise ETL tools.
The current global revenue has already crossed a billion [1] and is likely to keep rising, since all the traditional ETL vendors are increasingly focusing on this space. Drawing on their existing customer base, established vendors have either retained a strong presence or regained the ground they initially lost. At the same time, new players have emerged, each with its own niche. The market is still in flux and will go through a consolidation phase in the future, but today’s business can ill afford to wait it out.
A. The first advantage of IPaaS tools is their dual capability of API connectivity and data processing. Today’s SaaS offerings are not simple. Take, for example, ServiceNow, Salesforce, Ariba or Office 365: each provides significant capability at the cost of a very complex API. They also need to transfer huge amounts of data back and forth. Given the deep inroads these offerings have made in every domain, an organization is at a serious business disadvantage if it is not ready to consume the relevant ones in its own business processes.
However, designing and developing an API or its gateway, or even building a bespoke client for a public or B2B API, is not simple and can quickly turn into an expensive and time-consuming proposition that completely outweighs its benefit. Another consideration is the adoption of an organization's own API by the outside world. With little or no adoption, the entire expense is lost without any return.
Given the diverse nature of public and B2B APIs, these tools already have significant data transformation capability. Traditionally, there have been three patterns of data transformation at which every ETL tool excels:
a. Movement with enrichment from one system to another
b. Synchronization of data in multiple systems, either unidirectionally or bidirectionally, and
c. Aggregation from many systems into one
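The three patterns above can be illustrated with a minimal sketch. It is not taken from any particular ETL or IPaaS product: the in-memory `Store` dictionaries stand in for real systems, and the function names and the "higher `version` wins" conflict rule are assumptions made purely for illustration.

```python
from typing import Any, Callable, Dict, Iterable

Record = Dict[str, Any]
Store = Dict[str, Record]  # hypothetical stand-in for a system: key -> record


def move_with_enrichment(source: Store, target: Store,
                         enrich: Callable[[Record], Record]) -> None:
    """Pattern a: copy every record to the target, enriching it on the way."""
    for key, record in source.items():
        target[key] = enrich(dict(record))


def synchronize(left: Store, right: Store, bidirectional: bool = False) -> None:
    """Pattern b: push newer records from left to right, and optionally back.

    Assumed conflict rule: the record with the higher 'version' wins.
    """
    for key, record in left.items():
        if key not in right or record["version"] > right[key]["version"]:
            right[key] = dict(record)
    if bidirectional:
        for key, record in right.items():
            if key not in left or record["version"] > left[key]["version"]:
                left[key] = dict(record)


def aggregate(sources: Iterable[Store]) -> Store:
    """Pattern c: merge many systems into one, field by field per key.

    Later sources overwrite fields set by earlier ones.
    """
    combined: Store = {}
    for source in sources:
        for key, record in source.items():
            combined.setdefault(key, {}).update(record)
    return combined
```

In practice each `Store` would be replaced by an API client, a message queue or a database connection, and the conflict rule would depend on which system is treated as the master for a given record.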
B. The second advantage of IPaaS tools is their tight integration with high-performing messaging platforms. While traditional ETL tools have developed connectors for these platforms, cloud-native IPaaS tools have a distinct advantage in streaming mode, giving better near-real-time asynchronous performance. The combination of high throughput and low latency allows new architectures, such as hybrid transactional/analytical platforms, to be created and deployed.
C. The third advantage is the licensing model. Typically, on-premise ETL tools use a perpetual license with upfront payments tied to the number of CPUs. IPaaS, on the other hand, typically uses a subscription model tied to actual usage. This aligns cost with the capability actually used, and businesses tend to prefer it.
The days of ETL are far from over. The current ETL customer base is still an order of magnitude larger than that of the nearest IPaaS offering. Nevertheless, there is no denying that IPaaS is steadily gaining ground and is the way of the future. Given such a scenario, the prudent approach is to retain the existing ETL solutions while preparing a ‘to-be’ state that involves an IPaaS tool. For any new development, due diligence should be carried out based on the following key considerations:
- Data volume to move: both one-time and ongoing
- Complexity of the data formats and transformations needed
- Number of real-time operations between two or more systems
- Nature of systems/data sources to integrate: Are all participating systems API based? Or are some legacy and only allow data access via files and SQL queries?
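
One way to keep these considerations comparable across projects is to record them in a small structure. The sketch below is only an illustration: the class names, the field names, the gigabyte units and the `access` values ("api", "file", "sql") are all assumptions, and it deliberately makes no recommendation; it only answers the last question in the list, namely whether every participating system is API based.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceSystem:
    name: str
    access: str  # assumed values: "api", "file" or "sql"


@dataclass
class IntegrationAssessment:
    one_time_volume_gb: float          # data volume to move once (assumed unit: GB)
    ongoing_volume_gb_per_day: float   # data volume to move on an ongoing basis
    transformation_complexity: str     # free-text rating, e.g. "low" or "high"
    realtime_operations: int           # real-time operations between systems
    systems: List[SourceSystem] = field(default_factory=list)

    def non_api_systems(self) -> List[str]:
        """Names of systems reachable only via files or SQL queries."""
        return [s.name for s in self.systems if s.access != "api"]

    def all_api_based(self) -> bool:
        """True when every participating system exposes an API."""
        return not self.non_api_systems()
```

For example, an assessment that lists a SaaS CRM with `access="api"` and a legacy ERP with `access="sql"` would report the ERP from `non_api_systems()`, flagging that file or SQL connectivity is part of the requirement.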