We live in a data-driven world in which almost any data can be analyzed to yield information that serves a range of use cases. In an IT system, data resides in several layers, each of which can be mined for purposes such as reporting, understanding customer behavior, and monitoring key performance indicators.
Blockchain technology is a prime candidate for data analytics and data mining because it is the backbone on which P2P (peer-to-peer), B2C (business-to-customer) and B2B (business-to-business) transactions take place across a business network.
Does blockchain support analytics?
One of the challenges that the first blockchain platforms faced was their inability to support analytics. The underlying transactional data is stored as key-value pairs, so the current or historical state can be queried only by its “key”. For example, an asset may be stored under its asset identifier, and that identifier is the only handle by which its blockchain data can be retrieved. Querying by the associated metadata, or performing analytics by slicing and dicing the state of the data, cannot be done directly. Hence, the ledger data held in each blockchain node is not conducive to rich querying and analytics.
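The limitation can be illustrated with a minimal Python sketch. It is not taken from any particular blockchain platform: the world_state dictionary, the asset fields, and the function names are all hypothetical, chosen only to show why key-based storage makes metadata queries and aggregations expensive.

```python
from typing import Dict, List, Optional

# Hypothetical ledger state: asset identifier (the key) -> asset metadata.
# All names and values here are illustrative assumptions.
world_state: Dict[str, dict] = {
    "asset-001": {"owner": "alice", "type": "vehicle", "value": 12000},
    "asset-002": {"owner": "bob", "type": "vehicle", "value": 8500},
    "asset-003": {"owner": "alice", "type": "property", "value": 250000},
}


def get_by_key(asset_id: str) -> Optional[dict]:
    """Lookup by key: the one access path a key-value store offers natively."""
    return world_state.get(asset_id)


def find_by_owner(owner: str) -> List[str]:
    """Query by metadata: with no secondary index, every entry is scanned."""
    return sorted(k for k, v in world_state.items() if v["owner"] == owner)


def total_value_by_type() -> Dict[str, int]:
    """Simple 'slice and dice' aggregation: again a full scan of the state."""
    totals: Dict[str, int] = {}
    for asset in world_state.values():
        totals[asset["type"]] = totals.get(asset["type"], 0) + asset["value"]
    return totals
```

In this sketch only get_by_key maps onto what the store does well. The other two functions have to read every entry, which is roughly what an application is forced to do when the ledger exposes nothing but key-based access.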
Therefore, solution implementers have relied on non-standard, peripheral techniques to overcome this shortcoming, typically by storing the same business transaction data in off-chain repositories. This means that the same data must be maintained both on the blockchain and in off-chain components, which can lead to data inconsistency and integrity issues. In the absence of a standardized synchronization mechanism, the off-chain replica can quickly fall out of sync with the ledger as the transaction volume increases. Moreover, the contents of the off-chain database can be altered, and such tampering cannot be prevented proactively.
How can we apply analytics to blockchain?
Blockchain is designed to establish trust through a consensus-based approach to transaction validation. It is not advisable to change its core or replace the way it stores transactional data, as doing so can lead to performance and security issues. Therefore, our approach to making this transactional data available for analytics is to create a read-only, immutable twin of the main blockchain node.