Data security and privacy in cloud based analytics platform
While any cloud-based solution has its own set of security and privacy challenges, data analytics adds stringent data privacy requirements. Because data privacy concerns personal and customer data, handling it well enhances customer experience and is expected to improve the top line and bottom line.
To address cloud security, security and privacy controls are implemented at different layers: infrastructure, network, application and data (at rest and in motion). A cloud-based analytics platform, however, must also adhere to data privacy regulations while still delivering the outcomes the organization needs to make informed decisions.
Data privacy applies to the processing of personal data, namely any information relating to an identified or identifiable natural person. In the context of Big Data, the focus is more on the indirect identification of individuals, which is addressed by following data privacy regulations such as GDPR and principles such as notice, choice, consent, purpose of processing and privacy by design.
Risk based proactive approach to data privacy
Organizations must take a risk-based, proactive approach to data privacy by creating data privacy standards and privacy control frameworks that can be applied consistently across all geographies and solutions. This minimizes complexity and maximizes data protection. Such a framework must provide guidance on what constitutes personal data, the requirements for collecting personal data, the process for managing consent, the rules for accessing and using personal data, how to classify and protect personal data, and how to implement the right set of processes and controls based on risk. Guidelines must also integrate privacy as a key component from design to delivery of products and services. In addition, privacy requirements need to be incorporated into the initial phase of the software development life cycle, which decreases compliance risk and improves customer confidence in the product or service.
Key design principles for data privacy in the Data Analytics platform
Collects only the data that is necessary to provide a feature or service. Conducts a Privacy Impact Assessment to define the exact data processing needs, thereby limiting data to what is essential.
Hides personal data and its interrelationship from plain view. Leverages data anonymization solutions to anonymize the data at source.
Processes personal data in a distributed fashion, in separate compartments, whenever possible. Data separation controls are implemented at the data center level to meet the data privacy regulations of different countries.
Processes personal data at the highest level of aggregation and with the least possible detail at which it is still useful.
Consent and Notice
Informs data subjects about the data being collected and obtains consent from individuals at the time of data collection. Only data with customer consent should be used for data processing in Data Analytics.
Implements proper security controls such as firewalls, anti-virus, access controls, authorizations, audit logs, data encryption in motion and at rest, masking, anonymization, etc.
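Several of the principles above (collecting only essential data, hiding identifiers at source, honoring consent, and reporting at an aggregated level) can be sketched as a small ingestion step. This is a minimal illustration, not a reference implementation: the field names, the `ESSENTIAL_FIELDS` set, the key and all function names are assumptions made for the example. Note that replacing an identifier with a keyed hash is pseudonymization rather than full anonymization, so the output would generally still need to be treated as personal data.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key for keyed pseudonymization; a real deployment would
# fetch it from a key management service rather than hard-code it.
PSEUDONYM_KEY = b"demo-key-not-for-production"

# Hypothetical outcome of a Privacy Impact Assessment: the only fields
# the analytics use case actually needs.
ESSENTIAL_FIELDS = {"customer_id", "age", "country", "consent"}


def minimize(record):
    """Keep only the fields identified as essential."""
    return {k: v for k, v in record.items() if k in ESSENTIAL_FIELDS}


def pseudonymize(record):
    """Replace the direct identifier with a truncated HMAC-SHA256 value."""
    out = dict(record)
    raw_id = str(record["customer_id"]).encode("utf-8")
    digest = hmac.new(PSEUDONYM_KEY, raw_id, hashlib.sha256).hexdigest()
    out["customer_id"] = digest[:16]
    return out


def age_band(age):
    """Generalize an exact age into a 10-year band, e.g. 34 -> '30-39'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def prepare(records):
    """Drop records without consent, then minimize and pseudonymize."""
    prepared = []
    for rec in records:
        if not rec.get("consent"):
            continue  # no consent: the record never reaches analytics
        prepared.append(pseudonymize(minimize(rec)))
    return prepared


def aggregate(prepared):
    """Count customers per (country, age band) instead of exposing rows."""
    return Counter((r["country"], age_band(r["age"])) for r in prepared)


# Made-up sample records for illustration only.
raw = [
    {"customer_id": 1, "name": "A", "email": "a@example.com",
     "age": 34, "country": "DE", "consent": True},
    {"customer_id": 2, "name": "B", "email": "b@example.com",
     "age": 37, "country": "DE", "consent": True},
    {"customer_id": 3, "name": "C", "email": "c@example.com",
     "age": 52, "country": "FR", "consent": False},
]

prepared = prepare(raw)
summary = aggregate(prepared)
```

In this sketch the third record is dropped because consent is missing, names and e-mail addresses never leave the source, and the report works on counts per country and age band rather than on individual rows.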
Ways to approach data privacy in Cloud Analytics platform
To secure sensitive data within the analytics ecosystem on the cloud, data protection is applied to sensitive data close to the source, i.e. in the upstream application, by integrating the application/job with Format Preserving Encryption. The data flows in an encrypted format to the cloud and is made available for processing and analytical purposes without the risk of exposure. Applications/tools running on the cloud work on the encrypted data for reporting or analytical purposes.
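The idea of encrypting a field at the source while keeping its shape can be illustrated with a toy cipher for digit strings. The sketch below is an assumption-laden teaching example: it uses a Feistel network with HMAC-SHA256 as the round function, it is not the standardized AES-based FF1 or FF3-1 mode from NIST SP 800-38G, and it should not be used to protect real data. The function names, the key and the round count are all hypothetical.

```python
import hashlib
import hmac

ROUNDS = 10  # even, so both halves return to their original lengths


def _round_value(key, round_no, half, modulus):
    """Round function: HMAC-SHA256 over (round, half), reduced mod 10**m."""
    msg = bytes([round_no]) + half.encode("ascii")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def _check(digits):
    if len(digits) < 2 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a string of at least two ASCII digits")


def toy_fpe_encrypt(key, digits):
    """Encrypt a digit string into another digit string of equal length."""
    _check(digits)
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in range(ROUNDS):
        m = u if i % 2 == 0 else v  # width of the half being replaced
        c = (int(a) + _round_value(key, i, b, 10 ** m)) % 10 ** m
        a, b = b, str(c).zfill(m)
    return a + b


def toy_fpe_decrypt(key, digits):
    """Invert toy_fpe_encrypt by running the rounds backwards."""
    _check(digits)
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in reversed(range(ROUNDS)):
        m = u if i % 2 == 0 else v
        c = (int(b) - _round_value(key, i, a, 10 ** m)) % 10 ** m
        a, b = str(c).zfill(m), a
    return a + b


# Made-up key and value for illustration only.
key = b"demo-key-not-for-production"
card_number = "4111111111111111"
protected = toy_fpe_encrypt(key, card_number)
restored = toy_fpe_decrypt(key, protected)
```

Because `protected` is still a 16-digit string, downstream schemas, validations and joins that expect that format keep working, while only a holder of the key can recover `card_number`.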
Format Preserving Encryption (FPE) preserves the format of the sensitive data fields while providing AES-level encryption strength. Here is a typical example of FPE: