Sponsored content: Thursday, 24th September 2020 – Singapore, Malaysia, Philippines
Focus Network, in partnership with SentinelOne, brought together leading IT security executives to discuss how they are dealing with the challenges of digital transformation and technology sprawl, and how they view the opportunities around security automation, such as the ability to:
- Drastically reduce the amount of uninvestigated and unresolved alerts
- Automate time-consuming investigations and remediate well-known threats
- Act as a force multiplier for resource-constrained security teams
- Reduce your organisation’s security risk exposure, including the time to containment and remediation
The session was coordinated by Blake Tolmie, Director – Operations, Focus Network, and expertly moderated by Andrew Milroy, Principal Adviser, Eco-System, with experienced strategist and thought leader Jan Tietze providing great insights throughout.
Brief introduction of the speakers
Jan Tietze, Director Security Strategy EMEA, SentinelOne – Before joining SentinelOne in 2020, Jan Tietze served in senior technical and management roles ranging from engineering to CIO and CTO positions for global IT and consultancy organisations. With a strong background in enterprise IT and an early career in senior field engineering roles at Microsoft and other security and consulting organisations, Jan understands real-world risk, challenges and solutions and has been a trusted advisor to his clients for many years.
Andrew Milroy, Principal Adviser at Eco-System, an analyst firm based in Singapore, moderated the session and welcomed the delegates to the interactive discussion. The roundtable was also joined by the SentinelOne team: Jan Tietze, Director of Security Strategy; Evan Davidson, Vice President and Head of the Region; Lawrence Chan, Head of Regional Sales; and Kelvin Wee, Technical Director for APAC.
The SentinelOne event, in partnership with Focus Network, presented a theme of minimising risk from cyber threats, with a focus on reducing ‘time to containment’. Security teams today are working hard on the front lines, identifying, analysing and mitigating threats. Yet despite all of their efforts, visibility into malicious activity remains challenging: according to the Ponemon Institute, the mean time to identify a security breach is still 197 days, which is quite astonishing, and the mean time to containment is another 69 days after initial detection. The reality is that with current reactive approaches to cyber defence, there simply aren’t enough skilled professionals to analyse the volume of incidents that most organisations face. With limited resources, an ever-growing skills gap and an escalating volume of security alerts, organisations are left vulnerable to what is often perceived to be unavoidable risk. This environment demands more from already resource-constrained CISOs and other cybersecurity professionals. The focus today is on how automation can help: specifically, how it can drastically reduce the amount of uninvestigated and unresolved alerts, automate time-consuming investigations and remediate well-known threats, act as a force multiplier for resource-constrained security teams, and reduce an organisation’s security risk exposure, including time to containment and remediation.
Time to Contain
Defining incidents. A computer security incident is any adverse event that negatively impacts one of the three goals of security. Traditionally, information security has confidentiality, integrity and availability as its goals, and if any of these is disturbed during processing, storage or transmission, then we have a computer security incident. Cybersecurity, or information security, is the practice of ensuring that we have fewer of these incidents and that their impact is lower.
“It may be different in different industries, but we’re all part of a long global supply chain in one way or another, or of a local critical service. And safety is one of the most critical outcomes of our profession”, says Jan Tietze, Director Security Strategy EMEA, SentinelOne. “Incidents are the result of risk, and risk, as a theoretical concept, is the quantifiable expected loss of operating information systems. You can look at it in an abstract way as the cost that an asset causes in an incident. In cybersecurity, we reduce either the cost that occurs during an incident (the impact), or the attack surface on account of the assets in that class, or we optimise the frequency per year through configuration, best practices and additional security controls. So, what we do is optimise critical metrics in the risk management process. And I think it’s widely accepted that infosec is actually a risk management discipline.”
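To make the risk arithmetic concrete, the expected loss described here can be sketched as the product of how often incidents occur and what each one costs; reducing either factor reduces the risk. The following is a minimal illustrative sketch in Python, with made-up figures and a hypothetical function name rather than anything presented in the session:

```python
def annualised_loss_expectancy(incidents_per_year: int, cost_per_incident: int) -> int:
    """Expected yearly loss: incident frequency multiplied by the cost (impact) of each incident."""
    return incidents_per_year * cost_per_incident

# Illustrative figures only: a control that halves the incident frequency,
# or one that halves the per-incident impact, halves the expected loss.
baseline = annualised_loss_expectancy(incidents_per_year=4, cost_per_incident=250_000)
with_control = annualised_loss_expectancy(incidents_per_year=2, cost_per_incident=250_000)
print(baseline, with_control)  # 1000000 500000
```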
There are three phases in any incident handling or incident response methodology: a pre-incident phase, before an incident occurs; a post-incident phase, after the incident has occurred; and the phase during the incident, the peri-incident phase, where you actually deal with and handle the incident. There are different methodologies, like SANS and GARC and others, that define what happens during each of those phases. This session was oriented around the GARC methodology published in a SANS paper, but the concept always applies: whether you consider the lessons learned to be part of the incident or part of the post-incident phase, they all include lessons learned at some point after the incident has been captured. The repeatable process, the one that occurs with every incident, is the phase during the incident. And the hypothesis to consider today is that there is one metric we can use to influence the cost to the organisation, the risk to safety and the risk to availability of compute systems: the end-to-end ‘time to contain’. That is the time from the start of the incident, when the compromise happens or the attack starts disrupting your business, until you have restored a trustworthy state of the compute environment.
Two weeks ago, attackers struck the University Hospital in Düsseldorf, having presumably gained access last year through a Citrix security vulnerability that they exploited. They essentially put that particular hospital on the list of places they were going to work on compromising, performing the actual ransomware action when their project team had time to deal with it. The actual compromise happened a long time ago, and there was a long window in which the attackers went undetected while already having access.
Time to contain, to recap, is the time to regain full control of all affected assets and then restore the trustworthiness of the environment, and that is the key metric being looked at today. It can be decomposed into the individual phases of an incident: it starts with the initial compromise or disruption, and then there are phases in which you deal with it. A detection needs to occur, and sometimes an automatic response can occur. Alerts need to be raised, and someone then needs to identify whether the incident is real and whether it requires any manual follow-up. To optimise the end-to-end time between T0, when the incident occurs, and the point when you are done, you can optimise every phase, and it starts with detection and the efficacy of performing that detection.
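As a rough illustration of that decomposition, the end-to-end ‘time to contain’ can be treated as the elapsed time from initial compromise to a restored, trustworthy state, with dwell time and post-detection containment as intermediate phases. The sketch below is hypothetical; the timestamps are chosen only to echo the 197-day and 69-day figures cited above:

```python
from datetime import datetime

# Hypothetical timestamps for one incident, from initial compromise (T0)
# to the point where the environment is trustworthy again.
incident = {
    "compromise": datetime(2020, 1, 1, 9, 0),
    "detection": datetime(2020, 7, 16, 9, 0),    # dwell time before first detection
    "containment": datetime(2020, 9, 23, 9, 0),  # attacker access removed
    "recovery": datetime(2020, 10, 7, 9, 0),     # trustworthy state restored
}

time_to_detect = incident["detection"] - incident["compromise"]
time_to_contain = incident["recovery"] - incident["compromise"]  # the end-to-end metric
print(time_to_detect.days, time_to_contain.days)  # 197 280 with these illustrative dates
```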
The ideal approach
Stronger approaches to containment do not rely as much on prior knowledge, and they can use programmatic detection that is autonomous and does not need a human in the process in order to take telemetry, or information about what has occurred on an endpoint, and raise a detection. The 2018 Ponemon study, commissioned by IBM, examined this ‘dwell time’ until the first detection in a system and found that the mean was 197 days. Even though the study looked at large-scale incidents, 200 days is still a long time for an attacker to act and time that goes largely unused by defenders. In those instances, enacting automatic responses is more effective than relying on manual responses, because of the criticality of speed. However, a lot of organisations struggle to implement that because of high false positive counts. The reason is that endpoint detection and response systems based on endpoint telemetry struggle to distinguish benign behaviour from malicious behaviour, resulting in high false positive counts, which then means you cannot really use automation.
Automated controls require a high signal-to-noise ratio, and not all systems are equipped to provide that. Alerts typically flow in as a nearly endless stream of inputs to respond to for the people in security operations or in the cyber defence centre, who then tend to ignore some or all of them. In fact, there are systems that specialise in suppressing alerts by correlating logs from multiple sources and trying to prioritise them for you. Better systems provide prioritised alerts so that you know what to do and can focus on a small number of events rather than spending your day sifting through hundreds of alerts. Identification is also better the more of it is automated, because if there is an intense human element to the initial analysis, such as the need to follow process IDs and look at different systems, it is difficult to have a workflow where people collaborate on resolving an incident, which makes it less effective.
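One way to picture what a high signal-to-noise feed enables is a simple triage pass in which only high-confidence alerts are acted on automatically and the remainder are prioritised for an analyst by how much correlated evidence supports them. This is an illustrative sketch only; the fields and thresholds are invented, not drawn from any particular product:

```python
# Illustrative alert triage: thresholds and fields are hypothetical.
alerts = [
    {"id": 1, "confidence": 0.98, "severity": "high", "correlated_events": 14},
    {"id": 2, "confidence": 0.40, "severity": "low", "correlated_events": 1},
    {"id": 3, "confidence": 0.91, "severity": "medium", "correlated_events": 6},
]

AUTO_RESPOND_THRESHOLD = 0.95   # confident enough to act without a human
ANALYST_QUEUE_THRESHOLD = 0.80  # worth a prioritised manual look

auto_contained = [a for a in alerts if a["confidence"] >= AUTO_RESPOND_THRESHOLD]
analyst_queue = sorted(
    (a for a in alerts if ANALYST_QUEUE_THRESHOLD <= a["confidence"] < AUTO_RESPOND_THRESHOLD),
    key=lambda a: a["correlated_events"],
    reverse=True,
)
print([a["id"] for a in auto_contained], [a["id"] for a in analyst_queue])  # [1] [3]
```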
It is imperative today that we correlate individual actions and make that correlation available. One manifestation of not correlating well is that people complain about being understaffed and about the skills shortage in our industry. Very often, however, that is a symptom of not having well-integrated systems: systems that are disparate, that require switching context between one tool and another, and that prevent us from operating all the things at our disposal. Whenever the boundaries between systems are crossed, there is typically no correlation across those sources. This whole process from detection to containment takes, on average, between 56 and 69 days, depending on which report you look at; the Mandiant security report 2020 gives the lower end of those figures. Somewhere in that range is the mean ‘time to contain’ after detection that the industry can work on reducing.
Six principles for the best business outcome
Ultimately the business outcome is to be able to recover quickly when an incident occurs, and to know that it has occurred. There are different and competing approaches; some have merit while others, such as those that rely on prior knowledge, are outdated. But in general, the technology is less relevant than the desired outcome. To elaborate further, six important principles can be distilled from this line of thinking:
- Automation
- Autonomy
- Correlation
- End to End integrated process
- One Platform
- SOC Empowerment
Automation should take precedence over human work. If you can automate a response and can afford to do that from a risk management or false positive perspective, then you should, because that stops many incidents cold, early in their tracks. While there is no glory in prevention, automation is what will eventually stop a small-scale incident from becoming a large-scale incident.
Autonomy is another important concept: not depending on having to send data somewhere else in order to respond and work on it, but being able to perform the detection and the response without needing to consume outside knowledge.
Correlation is really important in making humans more effective. Making the human responder understand what has happened is a function of correlating the information and the telemetry and bringing to their attention what really occurred during the incident. Correlation is everything when it comes to making sense of what has technically been observed. And when you need to respond, there is the aspect of having visibility across the environment and across different kinds of assets that may have different tooling. Hence, having end-to-end visibility and the ability to respond quickly from one place is of paramount importance.
A good example is the BlueKeep vulnerability in RDP, which was there for 19 years until the NSA discovered that it had been actively exploited for a significant portion of that time. People were closing RDP as a preventative measure. Having the ability to perform these response actions in one place is very powerful, because it gives the SOC the tools they need to respond effectively, implement lessons learned and prevent other incidents from occurring.
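As a toy illustration of what correlation buys the responder, telemetry events can be grouped by a shared chain-of-events identifier and presented as one ordered storyline rather than as isolated alerts. The event data and field names below are hypothetical:

```python
from collections import defaultdict
from operator import itemgetter

# Hypothetical endpoint telemetry; "chain_id" ties related events together.
events = [
    {"chain_id": "A", "ts": 3, "action": "credential dump"},
    {"chain_id": "B", "ts": 1, "action": "scheduled task created"},
    {"chain_id": "A", "ts": 1, "action": "phishing attachment opened"},
    {"chain_id": "A", "ts": 2, "action": "powershell download cradle"},
]

chains = defaultdict(list)
for event in events:
    chains[event["chain_id"]].append(event)

# Present each chain as one ordered storyline for the responder.
for chain_id, chain in chains.items():
    steps = " -> ".join(e["action"] for e in sorted(chain, key=itemgetter("ts")))
    print(f"{chain_id}: {steps}")
```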
Roundtable discussion with delegates
Azril Rahim, Senior Manager, IT Security, Tenaga Nasional Berhad, a Malaysian utility, says, “If we follow the concept in a court of law, there is a term called TKO (Total Knock Out). In this case, the most important issue is detection, and without good detection the rest of the process is TKO. Hence we really need to address endpoint detection to a greater extent.”
“The lowest hanging fruit in terms of where we can save the most time, and probably have the biggest chance of reducing the impact, is reducing the time to detection, because allowing attackers to compromise an environment and establish multiple persistence points basically means that we’re kind of blind when we’re responding to the first detection we see for the incident”, noted Jan Tietze, Director Security Strategy EMEA, SentinelOne. “Since 2012 there has been the emerging market of endpoint detection and response, and when you look at the numbers, for instance in Mandiant’s reports, what you see is that the mean dwell time has reduced since then. I think that is largely due to the introduction of endpoint detection and response technologies, which aim to use telemetry to look at behaviours and describe not a concrete threat but the behaviours that make up an attack. However, many of those require concrete knowledge of what a particular attack looks like. In other words, they would not detect BlueKeep, which was unknown for 19 years until the day it became a known vulnerability and proof-of-concept exploits appeared in the wild that could be described so that you knew what the behaviour looked like. They were looking for known bad. I think the more effective technologies are the ones that don’t look for known bad, but look for ‘post-exploitation attacker behaviours’: the things that any attacker needs to do after they compromise an environment. If you assume that you’re getting hit with a brand-new vulnerability that wasn’t known before, the attackers still need to do something to act on their objective. They still need to perform reconnaissance, they need to move from one device to another in your organisation, and they need to exfiltrate information or credentials, depending on what they’re after. To get to those objectives, they will need to perform these actions, and detecting those gives you a very short time span between them compromising an environment with a completely new attack that’s currently undetectable and you knowing that they’re there, being able to respond, and then responding automatically.”
“I think automation is a very hot topic, and the one where we definitely have most of the challenges,” says Thomas Robert, Global Head of Infrastructure Operations, CACIB. “We managed to implement some of the measures that were mentioned, with the relevant correlation and some automation, but primarily based on scenarios. And that’s where it has been challenging, as the scenarios you build are usually based on past experience and not necessarily forward-looking into what could happen. We have had this strategy of Cybersecurity Correlation and Automation (CSCA) for years now, but it’s not necessarily that well integrated, and I’m definitely looking forward to solutions that interact more with each other in terms of sharing information dynamically, so that the analysis of the signals and of the behaviour is more relevant than what you can get from individual systems. That is also complicated when you use multiple systems, usually coming from different vendors, and it’s not always easy to get a good level of interaction. On the other side, if you go with only one vendor, you can get a good level of integration. But it’s better to have different vendors so we have different perspectives on a scenario, and that can provide better protection than a single vendor, which might have a failure in one domain.”
Steve Ng, VP, Digital Platform Operations, Mediacorp Pte Ltd notes that, “The current situation gives us more time to really explore some learning experiences, either on our own, like learning new skills, new tactics and new approaches, as well as a lot more online engagement with the vendors. We learn a lot from each other. Currently we are also testing and prototyping some new approaches on cloud to help us do continuous threat hunting. That gives us a better scan of our perimeter by knowing what we have inside and what is on the perimeter, including what’s coming. So that is one of the areas where we are developing the capability and competency now. We should have this platform up and running soon. So, although we are working from home, we can actually deliver substantial project improvements to our security posture.”
“We are looking at how to improve our security posture,” says Soon Tein Lim, Vice President, Corporate Development, ST Engineering Electronics. “For example, for remote work from home, we have a VPN for most of the users. In the process, we realised that users at home can turn on the Internet on the office computer but don’t go onto the VPN immediately. Hence, we have installed GlobalProtect, software from Palo Alto, to force the whole connection back to the office whenever the user is on Wi-Fi at home. At the endpoint we have introduced EDR, endpoint detection and response, and we also have SOC monitoring.”
“There should be an agent on servers and an agent on clients, as well as one that protects Kubernetes, Docker environments and Linux workloads. There should literally not be a compute workload of any value that does not have one. That’s an extra protective system that looks at flows coming in and filters them out,” says Jan Tietze, Director Security Strategy EMEA, SentinelOne. “The scope differs among those solutions, and when you have response options, very often they are manual response options. Manual means you use the EDR agent to enforce or respond, but it needs to be initiated by a human, and I think that’s just too slow. The types of responses are very often things like writing a script or killing a process, being very pinpointed in your response. Whereas what we do, for instance, is give you the ability to roll back all of the actions associated with a chain of events: we correlate what happens in real time on the endpoint and we can roll back all of the changes performed by that chain of events. Also, if a user logs in with compromised credentials and performs a series of actions, and one of those is then identified as malicious, we can go back to the beginning of the actions in that session and remove what they have performed.”
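To illustrate the rollback idea described above in the simplest possible terms, each recorded change can carry the identifier of the chain of events it belongs to, so that once any event in that chain is judged malicious, every recorded change is undone in reverse order. This is a deliberately simplified, hypothetical sketch, not a description of SentinelOne’s actual implementation:

```python
# Hypothetical change journal: each entry records how to undo one action
# and the chain of events it belongs to.
journal = [
    {"chain_id": "A", "seq": 1, "undo": "restore registry run key"},
    {"chain_id": "A", "seq": 2, "undo": "delete dropped binary"},
    {"chain_id": "A", "seq": 3, "undo": "restore encrypted files from backup copy"},
    {"chain_id": "B", "seq": 1, "undo": "nothing (benign session)"},
]

def rollback(chain_id: str) -> None:
    """Undo every recorded change in the chain, newest first."""
    for entry in sorted(
        (e for e in journal if e["chain_id"] == chain_id),
        key=lambda e: e["seq"],
        reverse=True,
    ):
        print(f"undoing: {entry['undo']}")

rollback("A")  # one malicious event in chain A triggers rollback of the whole chain
```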
A few years from now, we are not going to talk about on-prem in this particular context, although it depends on the region you are in. There are regions in the world that are very likely to stick with on-premise solutions for regulatory reasons, for data sovereignty, or out of distrust and similar concerns; Germany is one of those markets, as is the Middle East, where you see the same kinds of issues. However, the scale at which you need to process telemetry data does not lend itself well to doing it on-prem. You literally need hundreds of servers in order to satisfy relatively simple queries. Otherwise you end up with a telemetry system in EDR that only allows you to hunt over a very limited set of data for a limited period of time, or you end up building gargantuan infrastructures that you then have to maintain and ship the data to. As users become more mobile, work remotely and live in a more connected world, the natural flow of that data is to a place that has a lot of data and a lot of compute.
Mac Esmilla, Global CISO, World Vision International, acknowledged the skills gap in the market, saying, “We have a big team of IT folks in the Philippines, approximately 400. There’s no shortage of IT technical people with experience working with tools, but there’s a great shortage of people who understand cybersecurity. We can easily find people who are good with the technical skills, but not with the security skills. Process is very important, and there’s a lack of people with a good understanding of process, especially how to interface with legal processes, data protection requirements and so on. Hence we do a lot of training and enablement, and we also pick partners who have good knowledge transfer and enablement programs. It’s good to have partners who actually know what they’re doing, not just selling you a ‘buy a tool and you look cool’ proposition. You’re really entering into a partnership, not just subscribing to a tool or a technology kit. So, we’re very conscious about partnering with the right people, with the right mentality, with the right experience and with the right attitude as well.”
Conclusion
“In terms of the interactions that we have with various customers, the need for containment and isolation has been a very painful point for them”, shared Kelvin Wee, Technical Director – APAC, SentinelOne. “Lack of automation is also one of the key points. When it comes to going even deeper, having the ability to do the forensics and so on, automation becomes vital. In many cases, because of the lack of insights, customers lose visibility in those aspects and become somewhat lost in the whole situation. That’s where they look for other technologies and for partners like us who are able to provide them with guidance in a trusted advisor role. So that’s how we actually help them.”
“SentinelOne has a readiness package”, explained Lawrence Chan, Head of Regional Sales. “The main intention of this package is to coach our customers in onboarding the tool itself. We go through the process, perform the actions and do our best with the end-to-end remediation, containment, annotation and rollbacks. The main intention of this exercise is to hand over a brand-new clean dashboard to the customer, who then continues to monitor it. SentinelOne Readiness uses a structured methodology and personalised assistance for deployment planning. Readiness includes environment discovery, SaaS configuration assistance, and staged agent pushes to get you endpoint coverage as fast as possible.”
“Mac Esmilla mentioned choosing the right partner, and that really resonated with me. I think that cybersecurity is a very fast-moving space, and whoever has a good solution today does not necessarily have it a year down the road”, noted Jan Tietze. “I’ve worked for companies that have solved one hard problem very well and unfortunately failed to address other problems that have developed in the space over time. Very often SOCs have very little visibility into those newer modes of operating and deployment, such as pipelines for continuous integration and deployment. I think the right partner is one that has demonstrated the ability both to listen to customers, addressing these new and emerging landscapes as well as new and emerging threats, and to work with customers as part of their innovation process. I think that’s really key. It’s not a static purchase, but a longer-term partnership.”
That brought to a conclusion a very engaging conversation that touched on many of the most pertinent and relevant issues in cybersecurity generally, and more specifically around automation and some of the issues and challenges associated with it. The highly interactive session, put together by SentinelOne, featured participation from delegates across the APAC region and great discussions.
Focus Network facilitates a data-driven information hub for senior-level executives to leverage for their learning, while at the same time assisting businesses in connecting with the most relevant partners to form new relationships. With a cohort of knowledge-hungry and growth-minded delegates, these sessions have delivered great value to participants. With the advent of new ways of working remotely, Focus Network continues to collaborate with the best thought leaders from the industry so they can still come together to share and navigate the ever-changing landscape barrelling towards the neo-industrial revolution.