The emergence of cloud computing and new integration patterns is not only disrupting existing integration nomenclature but also making us rethink the way we do integration. While the rest of the enterprise architecture is being taken by storm by microservice-based architecture, one area that remains largely undisturbed is the data warehouse architecture. Recently, there has been a lot of talk about microservice-based architecture for data, now that data platforms such as the Data Lake are being explored as a critical component of the enterprise data platform.
Roads designed with minimal exits
Many organizations have consciously built and embraced data services around their data warehouse, especially for read-only data and master data. Beyond that, microservice-driven architecture has not seen much light in the data integration, data quality or validation, and metadata management areas. The notion of “Separation of Concerns” was not considered in conventional data warehouse architecture because of the heavy lifting of data and the dependency on boxed tools for integration. Solutions were designed and built with heavy dependencies on each other, like a straight line that starts with reading the data from the source and continues through validating, transforming and loading. Introducing changes and managing job failures were laborious, which offsets the benefit of using integration tools.
Why is a microservice-based design beneficial?
The basic principle of microservices is to decompose a complex application into multiple self-contained services that connect to each other to deliver complex functionality. Given that the usual data warehouse architecture is complex and takes a huge effort to design and build, microservices could be a great way to design our future data warehouses.
The most significant advantages organizations gain from a microservice-based architecture are reusability, scalability and ease of adopting change. In a microservice-based architecture, the whole data warehouse solution is designed from loosely coupled components. Each component is aware of its own functionality, is self-sufficient, and is not concerned with what the other services or components are doing.
There can be microservices for the variety of sources from which the data warehouse receives data. With an expanded data platform, microservices add value because each source is different and needs special treatment.
Think of one microservice each to pull data from a source, read a file from a server, read from the enterprise bus, and listen to a streaming source. In the same way, there can be microservices to perform sanity checks on the source files and validate them. Another set of microservices can generate the necessary keys, look up reference data, encapsulate data transformations, persist data by data domain, and much more. All of these microservices are connected by a cohesive API contract to complete the integration.
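To make the idea of small services joined by one contract concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `Batch` contract, the service names (`validate`, `generate_keys`, `lookup_reference`), the field names and the reference data are all hypothetical, and plain in-process functions stand in for what would be separately deployed services talking over HTTP or a message bus.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical shared contract: every service accepts a Batch and returns a Batch.
@dataclass
class Batch:
    source: str
    records: List[Dict[str, object]]
    errors: List[str] = field(default_factory=list)

# A "service" here is any callable that honours the contract.
Service = Callable[[Batch], Batch]

def validate(batch: Batch) -> Batch:
    """Sanity check: keep only records that carry a non-empty customer_id."""
    good = []
    errors = list(batch.errors)
    for i, rec in enumerate(batch.records):
        if rec.get("customer_id"):
            good.append(rec)
        else:
            errors.append(f"{batch.source}: record {i} has no customer_id")
    return Batch(batch.source, good, errors)

def generate_keys(batch: Batch) -> Batch:
    """Assign a surrogate key to each record (a simple counter in this sketch)."""
    keyed = [{**rec, "sk": n} for n, rec in enumerate(batch.records, start=1)]
    return Batch(batch.source, keyed, batch.errors)

# Illustrative reference data; a real service would own its own store.
REFERENCE = {"IN": "India", "US": "United States"}

def lookup_reference(batch: Batch) -> Batch:
    """Enrich each record with a country name looked up from reference data."""
    enriched = [
        {**rec, "country_name": REFERENCE.get(rec.get("country"), "Unknown")}
        for rec in batch.records
    ]
    return Batch(batch.source, enriched, batch.errors)

def run_pipeline(batch: Batch, services: List[Service]) -> Batch:
    """Chain services; each one only knows the contract, not its neighbours."""
    for service in services:
        batch = service(batch)
    return batch

if __name__ == "__main__":
    raw = Batch("crm_file", [
        {"customer_id": "C1", "country": "IN"},
        {"customer_id": "", "country": "US"},
    ])
    result = run_pipeline(raw, [validate, generate_keys, lookup_reference])
    print(result.records)
    print(result.errors)
```

Because every step depends only on the `Batch` contract, a step can be replaced, reordered or reused for another source without touching the others, which is the loose coupling described above.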
Managing Microservices: An arduous task?
One of the common complaints against microservice architecture is the effort required to manage the various microservices in the organization. Another is the network bandwidth consumed by the continuous exchange of messages between the services.
With the availability of cheaper commodity hardware, the microservices can be clustered to run on the same machine to avoid network overhead. The biggest benefit could be code separation: small, manageable chunks of code that are easier to maintain and change, and faster to deploy.