SIEM and Data Lakes in OT: New Challenges Bring New Choices

December 10, 2022

When we listen to IT folks talk about big data and analytics as a powerful possibility, we should remember OT organizations have been information-driven from the start. Industrial control systems and environments have always been managed and optimized through applied information—even if nobody called it analytics at the time.

But digital transformation and the expansion of technology across the enterprise now mean organizations are collecting data at an exponentially increasing rate, with IT organizations looking to turn those mountains of systems data into actionable insights on how they can build, run, and secure environments with greater visibility and control.

This means that real, meaningful IT/OT convergence is fundamentally about bringing data together so teams on both sides of the perimeter are empowered to collaborate and cooperate from a single shared source of truth. This is when many of the big-picture benefits of unified operations (visibility, efficiency, faster risk mitigation) really start to emerge.

The more information that can be integrated into this single source of truth, the better—thus the need for more advanced methods of storing and analyzing large amounts of information. It’s critical to understand the downstream implications of these choices on daily operations and long-term strategy.

SIEM vs. Data Lake: What’s the Difference?

The relative ease and low cost of data collection means organizations are accumulating more information than ever in pursuit of that shared objective truth, OT and cybersecurity teams included. They know that more data can help create richer context, which then enables more accurate decision-making.

But how organizations achieve this aggregated view matters, and as security and IT teams begin to rethink how data gets stored and used, it’s important to understand two components of a modernized, data-led security strategy for both IT and OT: SIEM analysis and data lake storage.

SIEM Platforms – Simplifying Analysis

You’ve probably already heard about security organizations deploying Security Information and Event Management (SIEM) solutions, essentially high-powered log ingestion platforms that enable teams to quickly contextualize incidents by correlating information about expected system and user behavior against what’s known about unexpected anomalies.

  • Ingests a subset of collected data, both system and event info
  • Designed to be engines of fast, accurate analysis
  • Helps teams quickly sift through/contextualize large amounts of information
  • Relies mostly on structured log data

SIEM solutions can ingest data from a diverse set of sources. Teams can import log and/or telemetry data directly from devices or databases. SIEM platforms can also be integrated with newer storage strategies, like the data lakes enterprise organizations are currently building.
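To make the correlation idea concrete, here is a minimal Python sketch of comparing incoming log events against a baseline of expected behavior. Everything in it is an assumption for illustration: the `LogEvent` fields, the `EXPECTED` baseline, and the host and user names are hypothetical, and a real SIEM applies far richer rules than a simple set lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    host: str
    user: str
    action: str


# Hypothetical baseline: the (user, action) pairs considered normal per host.
EXPECTED = {
    "plc-01": {("operator", "read_tag"), ("engineer", "write_setpoint")},
}


def flag_anomalies(events, expected=EXPECTED):
    """Return events whose (user, action) pair is not in the host's baseline.

    Hosts with no baseline at all are treated as fully unexpected.
    """
    return [
        e for e in events
        if (e.user, e.action) not in expected.get(e.host, set())
    ]


events = [
    LogEvent("plc-01", "operator", "read_tag"),        # matches the baseline
    LogEvent("plc-01", "operator", "write_setpoint"),  # wrong role for this action
    LogEvent("hmi-07", "guest", "login"),              # host has no baseline
]
anomalies = flag_anomalies(events)
```

The point of the sketch is the shape of the work: structured events go in, and the platform surfaces the ones that deviate from what is expected so analysts can focus on them.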

Data Lakes – Increasing Storage Flexibility

Both IT and OT teams are leveraging the power of information to improve how they secure their environments. Getting more data into one place, and making it accessible to common tools, will always yield better results. But all that accumulation can be difficult, and traditional databases—even consolidated data warehouses—can’t keep pace.

  • Data has different formats and features, and they’re not all easy to work with. Traditional structured data (think XLS) is critical input to data-led decision-making, but unstructured data (think video files) can be too. Data lakes let both be stored side by side.
  • Data has different temperatures driven by different stakeholder needs. Some must be available for real-time security decision-making, while other information can be maintained and retained according to archival schedules. Data lakes can output to both use cases.
  • Data has different owners and obligations. Data unification starts with a conversation about shared governance and understanding what compliance rules mean for both the information itself and the systems used to store and protect it. A data lake gives a common platform for making these decisions.
  • Data applications have different hygiene needs. Much of the hard work of making data usable occurs well before the query. Getting the information into a single flexible data lake repository makes maintenance and hygiene faster and more efficient.

Data lakes enable organizations to serve distributed data needs with a centralized repository. This enables OT teams to begin to maximize their data collection efforts even as assessment and analysis strategies are still in flight. Teams can also collect the information once and use it multiple times for asset management, CCM, and other critical security workflows, including SIEM analysis.

SIEM or Data Lake: Competing or Complementary Solutions?

As decision-makers look to enable stronger, smarter, data-driven security analytics across the entire enterprise, they can combine SIEM and data lake repositories for maximum impact. There are a couple of different strategies for achieving both completeness of storage and accuracy of structure.

  • Sending information to SIEM and data lake simultaneously is the path of least resistance but also eliminates many of the efficiencies gained through consolidation. Teams will still be duplicating maintenance and hygiene tasks, even though any formal schema is only applied by the SIEM database.
  • Sending information to the SIEM first, data lake second also doesn’t take full advantage of either platform, duplicating efforts and complicating the governance discussions mentioned earlier.
  • Finally, the smartest play is probably data lake first, SIEM second. This lets teams maximize the collective benefits of data aggregation while still carefully optimizing monitoring and detection systems for the very particular needs of OT. Orgs can start collecting even before structure gets 100% solved at the SIEM. As the structure matures, more and more information from the data lake can be ingested.
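The "data lake first, SIEM second" flow can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a reference architecture: the in-memory `DataLake` class, the `to_siem_event` schema check, and the sample sources are all hypothetical stand-ins for real storage and SIEM ingestion.

```python
import json


class DataLake:
    """Toy in-memory stand-in for a data lake: keeps every raw record as-is."""

    def __init__(self):
        self.records = []

    def store(self, source, raw):
        # No schema is enforced at collection time.
        self.records.append({"source": source, "raw": raw})


def to_siem_event(record):
    """Apply a (hypothetical) SIEM schema; return None if the record does not fit yet."""
    raw = record["raw"]
    if not isinstance(raw, str):
        return None  # e.g. binary or other unstructured payloads
    try:
        parsed = json.loads(raw)
    except ValueError:
        return None  # text that is not valid JSON
    if not isinstance(parsed, dict) or not {"host", "event"} <= set(parsed):
        return None  # JSON, but missing the fields the schema requires
    return {"source": record["source"], "host": parsed["host"], "event": parsed["event"]}


def forward_to_siem(lake):
    """Send only the records that already match the schema on to the SIEM."""
    candidates = (to_siem_event(r) for r in lake.records)
    return [e for e in candidates if e is not None]


lake = DataLake()
lake.store("firewall", '{"host": "fw-1", "event": "deny"}')  # structured log
lake.store("historian", "temp=71.3;unit=F")                  # no SIEM parser yet
lake.store("camera", b"\x00\x01binary-frame")                # unstructured data
siem_events = forward_to_siem(lake)
```

All three records land in the lake, but only the firewall log reaches the SIEM. As parsers for the other sources mature, `to_siem_event` can be extended and the already-collected records re-forwarded without collecting anything twice.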

Your Turn: Making Big Data Better for OT

As the demand for data-led decision-making grows, enterprise data strategies are changing. The move from traditional databases towards data aggregation enables organizations to take full advantage of new technologies like SIEM log aggregation and data lakes. It’s up to OT leadership to help organizations understand the impact these decisions have on achieving the data convergence needed to secure the modern enterprise.

If you’re attending S4x23, be sure to catch our thought leadership talk on OT Data Lakes on Thursday, February 16th at 1:30pm Eastern.