Security information and event management (SIEM) refers to a set of tools that assist detection and response efforts by centralizing security data. Because it is impractical for security operators to manually collect the vast amount of information needed to understand the security posture of their network, SIEMs aggregate relevant data and present it to operators within a single, coherent user interface. Within these interfaces, SIEMs serve the twin functions of collecting and categorizing information.
Typical information collected by SIEMs might include event logs, user activity, permission changes, known vulnerabilities, and network traffic. As this information is collected, it is categorized in a variety of ways, depending on the needs of the security operations center. Categorization might distinguish between high and low network traffic, known and unknown IP addresses, or routine and unexpected permission changes.
The real benefit of a SIEM system lies in combining the collection and categorization functions to generate tailored security alerts, using filters to guide the attention of security operators. For example, suppose a device within a network regularly communicates with addresses outside of the local network. Rather than reviewing all such traffic, operators may only need to review the portion that is also abnormally high. While simple, this example of collecting, categorizing, and filtering demonstrates the core functionality of SIEM tools.
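The external-traffic example can be sketched in a few lines of Python. This is a minimal illustration and not how any particular SIEM product works: the `FlowRecord` type, the `10.0.0.0/8` local range, the device name, and the one-megabyte threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Hypothetical values, chosen only for illustration.
LOCAL_NETWORK = ip_network("10.0.0.0/8")
HIGH_TRAFFIC_BYTES = 1_000_000


@dataclass
class FlowRecord:
    """One collected network flow (assumed shape, not a real SIEM schema)."""
    device: str
    destination: str
    bytes_sent: int


def categorize(record):
    """Categorize a collected record by destination and by volume."""
    external = ip_address(record.destination) not in LOCAL_NETWORK
    high_volume = record.bytes_sent > HIGH_TRAFFIC_BYTES
    return {"external": external, "high_volume": high_volume}


def needs_review(record):
    """Filter: only external traffic that is also abnormally high."""
    labels = categorize(record)
    return labels["external"] and labels["high_volume"]


# Collect: three flows from the same device.
flows = [
    FlowRecord("hmi-01", "10.1.2.3", 5_000_000),     # high volume, but local
    FlowRecord("hmi-01", "203.0.113.7", 20_000),     # external, but low volume
    FlowRecord("hmi-01", "203.0.113.7", 4_200_000),  # external and high volume
]

alerts = [flow for flow in flows if needs_review(flow)]
```

Only the third flow reaches the operator. In practice the threshold would likely come from a per-device baseline, not a fixed constant.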
A SIEM also maximizes personnel effectiveness by minimizing redundant activity. Without a centralized solution, security operators often lose situational awareness within the noise of security alerts. Carefully tailored system filters limit redundancies and reduce the time security personnel must spend sifting through system noise. Acknowledging this, the April 2022 draft of NIST’s OT Security Guide recommends using SIEMs to “help filter the types of events and reduce alert fatigue.”
Aside from enhancing security, SIEM tools have the added benefit of increasing system efficiency when deployed for industrial control systems. OT efficiency is heavily reliant on assets that are not easily replaced. Closely monitoring system activity can therefore also inform predictive maintenance efforts, aid in coordinating asset updates, and shorten response time to non-malicious outages. In IT networks such problems are more easily resolved without centralized management, but effective SIEM functionality is critical to mitigating both security and operational threats within OT architectures.
Potential problems are most likely to arise during the collection portion of the SIEM workflow. Using traditional monitoring methods to feed a SIEM may not be effective in OT environments, since those methods were not built for these unique systems, which prioritize safety and availability over confidentiality and integrity. Many of the devices supporting a SCADA system are not designed to handle the scanning procedures that some SIEM systems might employ, which could result in safety hazards and loss of availability. Consequently, it is important to feed any SIEM system using tools that are designed specifically for OT.
Context is critical to identifying system threats and vulnerabilities. Since malicious activity is not always immediately identifiable through individual device monitoring or discrete network packets, contextual information is required to detect system anomalies. Context can be collected from many different data points, such as user behaviors, network activity, event timelines, and system configurations. Manual or siloed collection efforts are simply impractical for capturing this data. Instead, the collection, categorization, and filtering functions of a SIEM are well suited to the task. Many threats and vulnerabilities are uncovered only through the interaction of different systems. SIEMs, therefore, do not simply draw operators’ attention to problems, but can illuminate vulnerabilities that were not previously visible.
Context is equally important when responding to system threats and vulnerabilities. Once a threat or vulnerability is detected, contextual knowledge informs the method of response. Security operators need to understand the nature of a given exploit, the affected systems, and the operational dependencies when responding to a security incident. A lack of information may cause operators to remove more devices from operation than necessary. Conversely, the true extent of a system penetration may be obscured without enough contextual information. Accurate data reveals the true depth of a threat and decreases the mean time to respond (MTTR). This is an important benchmark for security operators, as it translates directly to safety and business continuity.
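As a rough illustration of the benchmark, MTTR can be computed as the average gap between when each incident was detected and when it was responded to. The sketch below assumes incidents are available as (detected, responded) timestamp pairs. The function name and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_respond(incidents):
    """Average of (responded - detected) over a list of timestamp pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (responded - detected for detected, responded in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical incident log: (detected, responded).
incidents = [
    (datetime(2022, 4, 1, 8, 0), datetime(2022, 4, 1, 8, 30)),     # 30 minutes
    (datetime(2022, 4, 2, 14, 0), datetime(2022, 4, 2, 15, 30)),   # 90 minutes
]

mttr = mean_time_to_respond(incidents)  # timedelta of one hour
```

Tracking this value before and after a change to filters or playbooks is one simple way to check whether added context is actually shortening response.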
Contextual data can also help predict threats and vulnerabilities. As threats continue to grow and diversify, contextual data can feed machine learning (ML) security solutions. The fundamental approach of machine learning solutions is to train software with a model of a specific environment. Within OT environments, however, it is important that the model is accurate, complete, consistent, timely, unique, and valid. Unlike in IT systems, there is significant variation in equipment logic and system activity, which prevents ML tools from being integrated into a network they were not trained within. Accurate contextual data will therefore be increasingly necessary for OT infrastructure to take advantage of emerging security solutions.
Security orchestration, automation and response, or SOAR, technologies enable organizations to efficiently observe, understand, decide upon and act on security incidents from a single interface. As the name suggests, SOARs automate responses to common types of security events. Since many events require predictable responses, having those actions immediately triggered when an event is first detected by a monitoring system provides the advantage of further decreasing MTTR and maximizing the efficiency of security workflows. The extent to which these benefits are realized, however, is heavily determined by the amount of contextual information integrated into a SOAR program.
SOAR functionality has been difficult to establish within OT environments. Given the uniqueness of network architectures and the sensitivities of industrial control systems, automation can result in unanticipated effects, including safety issues. In IT environments, security responses are easily reversible. Within the OT landscape, however, network dependencies and legacy equipment make automated actions riskier and harder to reverse. For example, quarantining a personal computer from the network may be inconvenient for one user within the office. Quarantining a single PLC, on the other hand, may cause an entire factory to go offline. For these reasons, SOAR adoption has been slow and difficult in OT infrastructure. Building a data repository of operational technology asset information, including location, criticality, and function, is the only way that companies can take the first step towards implementing SOAR in their OT infrastructure.
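The role of such an asset repository can be sketched as a gate in front of automated actions. The example below is a simplified assumption, not a description of any SOAR product: the `Asset` fields mirror the location, criticality, and function attributes named above, while the inventory entries and action names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    location: str
    criticality: str  # assumed scale: "low", "medium", "high"
    function: str


# Hypothetical asset repository keyed by asset name.
INVENTORY = {
    "office-pc-17": Asset("office-pc-17", "HQ office", "low", "workstation"),
    "plc-line-3": Asset("plc-line-3", "Plant A", "high", "conveyor control"),
}


def choose_response(asset_name):
    """Automate quarantine only for known, low-criticality assets."""
    asset = INVENTORY.get(asset_name)
    if asset is None:
        # Without context, automation is unsafe: hand off to a person.
        return "escalate_to_operator"
    if asset.criticality == "low":
        return "auto_quarantine"
    # Isolating a critical asset could halt production or affect safety.
    return "escalate_to_operator"
```

Here `choose_response("office-pc-17")` selects automatic quarantine, while the PLC and any asset missing from the repository are routed to an operator. The conservative default for unknown assets is a design choice that reflects how hard OT actions are to reverse.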
Two integral components of SOAR functionality are Endpoint Detection and Response (EDR) and Managed Detection and Response (MDR). EDR systems monitor endpoints directly to identify and respond to security incidents. With the rise of the Internet of Things (IoT) and the subsequent proliferation of network endpoints, EDR has become a more popular method for securing digital systems. MDR solutions remotely monitor network architectures. An MDR solution might centrally manage detection for an enterprise’s cloud services and network activity across multiple physical sites.
Both EDR and MDR have limitations with respect to the needs of OT environments. Most importantly, EDR can threaten safety and availability. Within industrial control systems, endpoint traffic must be carefully managed, and adding monitoring or scanning traffic can easily disrupt established processes. Remotely managed activity under MDR raises the same concerns. Setting up remote access for a manufacturing or factory location may itself create additional security concerns, particularly if that system was previously protected by an air gap.
An EDR/MDR alternative for OT systems must therefore be tailor-made around how industrial environments operate. This alternative should be based upon an intentional integration of SIEM and SOAR capabilities. SOAR protocols are only as effective as the information provided to them, so they must be integrated with an established SIEM to operate well. This will limit the potential for false positives, which might otherwise trigger unnecessary actions.
Given the cybersecurity advantages gained through contextual information, OT security operators must establish, customize, and integrate SIEM solutions within their security practices. When establishing a SIEM solution, it’s important to determine whether the methods for collecting, categorizing, and filtering work seamlessly for an OT environment.
Once established, the SIEM must be customized to the security needs of the specific network in order to be effective. This can be done by refining filters iteratively, verifying at each step that the workflow is being streamlined and that relevant information is not being lost.
Finally, the SIEM should be integrated with a SOAR system as an alternative to EDR/MDR approaches. This design will improve the response time of security operators while maintaining maximum system safety and availability.
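One way to picture this iterative tuning is to replay historical events against progressively stricter filters and stop as soon as an event that analysts marked as relevant would be dropped. The sketch below assumes each event carries a numeric `score` and an analyst-assigned `relevant` flag. Both fields, the sample data, and the candidate thresholds are hypothetical.

```python
def tune_threshold(events, candidates):
    """Return (threshold, events_kept) for the strictest candidate that
    still keeps every event marked relevant, or None if none qualifies."""
    best = None
    for threshold in sorted(candidates):
        missed = [e for e in events if e["relevant"] and e["score"] < threshold]
        if missed:
            break  # a stricter filter would lose relevant information
        kept = [e for e in events if e["score"] >= threshold]
        best = (threshold, len(kept))
    return best


# Hypothetical historical events reviewed by analysts.
events = [
    {"score": 10, "relevant": False},
    {"score": 20, "relevant": False},
    {"score": 35, "relevant": False},
    {"score": 40, "relevant": True},
    {"score": 80, "relevant": True},
]

result = tune_threshold(events, candidates=[0, 25, 50, 75])
```

With this data, a threshold of 25 cuts the review queue from five events to three without losing either relevant event, while 50 would drop one. Rerunning the check as new labeled events accumulate keeps the filter aligned with the network.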
The Industrial Defender for Splunk application equips security operators with the right OT data to integrate SIEM and SOAR functionalities into their security architectures. Designed with OT environments in mind, the application can detect changes in network activity, provide contextual data, and help facilitate IT/OT collaboration.
Additionally, Industrial Defender’s OT Machine Learning Engine enhances security by incorporating information from OT environments into existing data models for detecting, investigating and responding to cyberthreats such as ransomware. This functionality is particularly useful for larger systems needing additional mechanisms to limit alert fatigue and mitigate false positives.