Defending organizations against cyber threats is becoming an increasingly sophisticated science. Detailed analysis of high-profile data breaches, such as the one at the second-largest retailer in the US or the more recent bank heist, confirms that determined attackers breach organizations months before they carry out their objective, and evade detection in the meantime. In addition, as a growing number of systems and devices across the world are connected to the Internet of Things, attack surfaces have expanded, leading organizations to examine new ways to protect their assets. Information Security Officers (ISOs) increasingly accept that it is just a matter of time before their organizations experience a significant IT security breach. This change in the drivers of information security strategy has also shifted the focus from protection against threats to timely detection of, and effective response to, those threats.
Challenges in identifying ‘low and slow’ breaches
Protection and detection security controls from the past decade have primarily used signature-based approaches to detect threats. However, this approach has a few downsides:
- Controls could only detect previously known threat vectors and actors;
- Controls, including security information and event management (SIEM) solutions, could only review and assimilate information over a short period of time;
- Mature SIEM threat detection use-cases reflected the organization’s collective knowledge of previously known (or well researched) threat scenarios.
These solutions are ineffective when indicators of compromise of a ‘low and slow’ breach have to be accumulated over a long period.
Leveraging machine learning
Detection of these ‘low and slow’ threats primarily requires:
- The ability to analyze data from a variety of sources and over a long period of time;
- The ability to separate anomalous behavior from legitimate business transactions.
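The first requirement can be illustrated with a minimal sketch. It assumes each weak indicator of compromise is recorded as a (date, host, score) tuple; the host name, scores, threshold and window lengths are hypothetical and chosen only to show why a short review window misses a breach that a long one catches.

```python
from collections import defaultdict
from datetime import date, timedelta

def hosts_over_threshold(events, window_days, threshold, today):
    """Sum weak indicator scores per host inside the look-back window."""
    cutoff = today - timedelta(days=window_days)
    totals = defaultdict(float)
    for day, host, score in events:
        if cutoff < day <= today:
            totals[host] += score
    return {host for host, total in totals.items() if total >= threshold}

# Hypothetical data: one weak indicator (score 1.0) every 10 days on "srv-07".
today = date(2015, 6, 30)
events = [(today - timedelta(days=10 * i), "srv-07", 1.0) for i in range(9)]

# A 7-day review sees a single weak indicator; a 90-day review sees all nine.
short_view = hosts_over_threshold(events, window_days=7, threshold=5.0, today=today)
long_view = hosts_over_threshold(events, window_days=90, threshold=5.0, today=today)
```

With these assumed values, `short_view` is empty while `long_view` contains the host, because the individual indicators only add up to something notable when accumulated over months.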
Over the past 18 months, the security controls domain has been abuzz with commercial solutions that use machine learning to detect these previously unknown threats. Machine learning is a type of artificial intelligence (AI) that gives computers the ability to learn without being explicitly programmed. Such algorithms are often categorized as supervised or unsupervised [1].
Supervised algorithms apply what has been learned in the past to new data. In the context of threat detection, this takes the form of:
- Cybersecurity experts and threat researchers, through continuous analysis of new malware, attack models, techniques and procedures, selecting the characteristics and behaviors of an attack that the data science models will be trained to detect;
- Analysis of huge volumes of malicious and attack traffic to distill the key characteristics that make malicious traffic unique.
For example, by analyzing large numbers of remote access tools (RATs), a supervised machine learning model can learn how traffic from these tools differs from normal traffic.
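As a rough illustration of that idea, the sketch below trains a nearest-centroid classifier on labeled traffic samples. The two features (timing jitter between connections and average payload size in KB), the sample values and the labels are all hypothetical; commercial products use far richer features and models, so this only shows the train-on-labels, apply-to-new-data pattern.

```python
import math

def centroid(rows):
    """Average each feature column across the given samples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def train(labeled):
    """Build one centroid per label from (features, label) pairs."""
    by_label = {}
    for features, label in labeled:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, features):
    """Return the label whose centroid is closest to the sample."""
    return min(model, key=lambda label: math.dist(model[label], features))

# Hypothetical labeled samples: [interval_jitter, avg_payload_kb].
# Features are left unscaled for brevity; a real model would normalize them.
training = [
    ([0.05, 0.8], "rat"), ([0.08, 1.1], "rat"), ([0.04, 0.6], "rat"),
    ([0.70, 40.0], "normal"), ([0.90, 25.0], "normal"), ([0.60, 60.0], "normal"),
]
model = train(training)

verdict_beacon = predict(model, [0.06, 0.9])   # regular, small transfers
verdict_browse = predict(model, [0.80, 35.0])  # irregular, larger transfers
```

Because the labels come from prior research rather than from the organization's own network, a model like this can be applied as soon as it is deployed.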
Supervised learning algorithms support detection of threats on ‘day one’ of their use within the organization’s network and do not have an organization-specific learning phase. Unsupervised learning models, by contrast, build up intelligence that is specific to the customer’s environment. While supervised learning models provide day-one benefits, some threat scenarios can only be detected by learning what is specific to each customer’s environment.
One example is determining anomalous employee access behavior. Unsupervised learning algorithms focus on understanding what makes the customer’s IT usage pattern unique and identify abnormalities when they occur. The risk with unsupervised learning algorithms is that they may learn bad behavior as the baseline if they are exposed to bad usage during learning. In general, this category of algorithms tends to produce more false positives.
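The baseline idea, and the risk of learning bad behavior as normal, can be sketched with a simple z-score check on how many records an employee accesses per day. The access counts and the threshold of 3 standard deviations are assumptions made for illustration, not a description of how any particular product works.

```python
from statistics import mean, stdev

def fit_baseline(history):
    """Learn 'normal' as the mean and standard deviation of past activity."""
    return mean(history), stdev(history)

def is_anomalous(baseline, value, z_threshold=3.0):
    """Flag values more than z_threshold standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold

# Hypothetical daily record-access counts for one employee.
clean_history = [20, 22, 19, 21, 23, 20, 18, 22, 21, 20]
flag_clean = is_anomalous(fit_baseline(clean_history), 400)

# Same employee, but the learning period already includes bulk access.
poisoned_history = clean_history + [380, 410, 395, 420, 405]
flag_poisoned = is_anomalous(fit_baseline(poisoned_history), 400)
```

With the clean history, an access count of 400 is flagged. Once the bulk-access days are part of the learning period, the same count falls inside the learned baseline and is no longer flagged.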
Adoption of machine learning capabilities
In my view, the selection of machine learning capabilities is underpinned by the following belief: