Why Passive Network Monitoring Isn’t Truly “Passive”

February 15, 2021

Since around 2016 there has been a term thrown around in the OT cybersecurity world that is a classic case of marketing naming a segment well.

I am, of course, talking about passive monitoring. This started out with the admirable goal of adding visibility into the cybersecurity of OT networks. It evolved from the early days of OT protocol firewalls and OT-aware, signature-based network intrusion detection. This technology has been a driving factor behind the explosion of the OT security market, which is set to surpass $12 billion by 2026. In full disclosure, it is available in our solution and is, in the opinion of this author, an invaluable piece of the modern cybersecurity toolkit.

But what is passive monitoring really? Because that’s a non-descriptive marketing term. It has all the feel of cookies, cozy blankets and a roaring fire in a remote log cabin. It is soothing wellness yoga in a world full of Fear, Uncertainty and Doubt (FUD). What it really is, and how I will refer to it for the rest of the article, is network protocol analysis. This is the term MITRE uses when describing it as a data source for discovering techniques. More Cylon or Dalek sounding, and less stuffed teddy bear. It’s also way harder to say three times fast, which is a requirement for any word or phrase a sales engineer says.

Why Do We Say Passive, and Why Does It Even Matter?

Network protocol analysis was rebranded because OT engineers for years suffered outages from well-meaning, if ill-informed, IT security staff performing “ACTIVE network scans”. I did it, way back in the early days of Network Security Scanner and free Nessus. I ran my first wide-open Nessus scan, and promptly sent the audit department’s printer offline. It ended up requiring a full power cycle to recover and left more than a few of my co-workers irritated at having to re-run print jobs. Now do that at a power plant or chemical factory, and you see why OT engineers are a bit nervous about anything cybersecurity-related coming through their doors. So, to make everyone feel a lot better about what was happening, the early purveyors of this technology called it “passive” in one of the most brilliant marketing moves since Leonardo DiCaprio was cast in Titanic.

So why am I so bent out of shape about this? Why does this require a blog? Because “Passive Monitoring” is not so passive in a lot of cases, especially as it has evolved from its original intent and morphed into “ASSET VISIBILITY” (deep voice in an echoing canyon implied).

To make this point we need to understand exactly how network protocol analysis even works. Essentially you make a copy of all the network traffic flowing in your OT environment, and then you tear it apart into bits and pieces and try to decode information from it. A lot of that data is really useful to know.

In the early OT network protocol analysis days, everyone focused on some giant industry standards, like Modbus, DNP3, EtherNet/IP, etc. These have open, published specs, which allow developers to quickly build some relatively insightful analysis. Sometimes it is about events and actions, like blocks of code on a PLC being updated, or someone telling the device to reset itself. Sometimes it is about the device itself. Some protocols are really good about telling everyone everything about themselves, like a 10-year-old on social media. This is great data, and it has significant value to operators who often view the control system delivered to them as a black box with some levers and dials they use to perform a limited set of actions. Now they could see what was really happening when they pushed that button on the HMI screen.
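To make the "tear it apart and decode it" idea concrete, here is a minimal sketch of what an analyzer might do with one copied Modbus/TCP payload. It is an illustration under stated assumptions, not how any particular product works: the function name `decode_modbus_tcp`, the returned dictionary keys and the sample frame are all made up for this example, and only a handful of the public Modbus function codes are listed.

```python
import struct

# A small subset of the public Modbus function codes (not exhaustive).
FUNCTION_NAMES = {
    0x01: "Read Coils",
    0x03: "Read Holding Registers",
    0x05: "Write Single Coil",
    0x06: "Write Single Register",
    0x0F: "Write Multiple Coils",
    0x10: "Write Multiple Registers",
    0x2B: "Encapsulated Interface Transport",
}
# Function codes that change state on the device.
WRITE_CODES = {0x05, 0x06, 0x0F, 0x10}


def decode_modbus_tcp(payload: bytes) -> dict:
    """Decode the MBAP header and function code of one Modbus/TCP message.

    Hypothetical helper for illustration; it only reads bytes that were
    already copied off the wire and never talks to the device.
    """
    if len(payload) < 8:
        raise ValueError("payload too short for MBAP header plus function code")
    # MBAP header: transaction id, protocol id, length (all 16-bit), unit id (8-bit).
    transaction_id, protocol_id, length, unit_id = struct.unpack(">HHHB", payload[:7])
    if protocol_id != 0:
        raise ValueError("protocol id is not 0, so this is not Modbus/TCP")
    function_code = payload[7]
    base_code = function_code & 0x7F  # exception responses set the high bit
    return {
        "transaction_id": transaction_id,
        "unit_id": unit_id,
        "function_code": base_code,
        "function_name": FUNCTION_NAMES.get(base_code, "Unknown"),
        "is_exception": bool(function_code & 0x80),
        "is_write": base_code in WRITE_CODES,
        "data": payload[8:],
    }


# Made-up sample frame: unit 0x11 is told to write value 3 to register 1.
sample = bytes.fromhex("000100000006110600010003")
result = decode_modbus_tcp(sample)
print(result["function_name"], "to unit", result["unit_id"])
```

The point of the sketch is the limitation as much as the capability: the analyzer learns that someone wrote to a register because that message crossed the wire, and it learns nothing about anything that was never sent.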

These early implementations were simple, and cast light into some dark, shadowy areas. Like a candle in a pitch-black room, you could get a lot from a little. These early solutions used one or two sensors, placed wherever it was convenient, and whatever they discovered was usually pretty insightful. There was no expectation or promise of 100% coverage.

The problem was, it was too good at its job, and a lot of little companies jumped on this technology and started to press it for more and more use cases, to the point where people are now saying you can run an entire security program or solve all your compliance problems with it.

Now this is where our nice, helpful, insightful little tool, network protocol analysis, starts to evolve and becomes more intrusive.

Intrusive, You Say?

I do say. Why? Because people were told that only passive was safe in an OT environment, but now it is being asked to push the boundaries of what makes sense. Let us take a look at a few scenarios related to the number one purchase driver for these solutions.

Asset Visibility

This use case is described as the ability to identify an asset as it appears on the network. There are four major reasons why asset visibility alone sheds light on some of the issues with pushing network protocol analysis too far.

  1. OT Network Equipment – Remember how network protocol analysis takes a copy of the network traffic and rips it apart to pick out useful data? Well, that relies on being able to conveniently access that network data, and 100% asset identification needs a whole lot of network traffic to be forwarded. This requires one of two things: either a really nice set of advanced switches capable of copying data for every single port on the switch out through one or two empty ports, or a separate data collection network made up of devices called taps.

    Guess what OT networks do not have? That’s right, big fancy switches with a lot of fancy features. Oftentimes, it’s a bunch of unmanaged switches that have no ability to perform this packet copying technique. Other times, those switches can do it, but only for one port at a time, or each has to use one whole port to copy, meaning you cut your port count in half. This is not “passive.” Ripping and replacing every switch on your OT network is a very major and expensive endeavor. There are other options, but they require time, money and engineering, as well. You are sacrificing budget for other resources in order to make passive fit.

  2. Protocols – In order to get the 100% visibility promise from network protocol analysis, such as make, model and firmware, that data must be placed on the network. In IT networks a lot of devices do this very naturally. Protocols like Cisco Discovery Protocol (CDP) or Link Layer Discovery Protocol (LLDP) are routinely enabled. They provide operational benefits to networks that are constantly being reconfigured to meet changing business demands.

    These are terms you don’t use in OT. You do not constantly add and remove switches and need to dynamically create new routes, so these protocols, if even present, are turned off. Why? Because they cause their own issues. I once saw a major plant taken offline by a network storm caused by a Spanning Tree Protocol incident. These protocols also have their own vulnerabilities, like CVE-2020-3119, to name just one. But if you are only getting asset visibility data from network analytics, you must enable these or something similar, which means you are sacrificing potential uptime.

  3. Fake Network Traffic – In some cases, devices are never asked the right questions in the normal routine of the OT system. A lot of protection relays I know do not routinely get asked for, nor do they volunteer, their make, model and serial number. Why? Because it’s not important to the process. They take in signals in the form of hardwired control voltage and send out commands in the same form. Not a whole lot of reason to share asset data in that case. In order for network analytics to “see” these assets and get that information, another system has to log in and ask them a question.

    This querying system doesn’t need the data; it is just asking for the sake of generating a response for the “passive” observer because, well, you paid $1,800 for a managed switch at each of your 30 substations to enable the traffic to be copied to a network analytics tool. Or you could have just used a solution that knows how to ask that device the right question directly and interpret the result itself. You could have done that with your existing unmanaged switches, and saved yourself about $54,000 (30 × $1,800). It is also a compromise that might strain a system like an RTU, which would otherwise have been fine, by tasking it with extra work it was not designed to do. This is a lot of compromise to stay passive.

  4. Insecure Protocols – In order for network protocol analysis to work, the analytics engine needs to be able to read and understand the data. This is very hard to do when everything but the IP headers is encrypted. There are ways to deal with this, but again, this is not equipment just lying around in your everyday OT deployment. The other, cheaper option is to ask the question in a clear text protocol, which is not the right answer. So now you’re sacrificing valuable budget and your security posture to stay passive, when a simple SSH login query would have safely, efficiently and securely answered the same question.
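The dependency described in item 2 can be sketched in a few lines. The example below parses the TLV (type-length-value) fields of an LLDP data unit and pulls out the system name and system description, which is where make, model and firmware strings often show up. Treat it as a rough sketch under stated assumptions: the helper names `make_tlv` and `parse_lldp_tlvs` are invented for this post, the sample device strings are made up, and only two of the standard TLV types are handled.

```python
import struct

# Standard LLDP TLV type numbers for the two fields this sketch cares about.
TLV_NAMES = {5: "system_name", 6: "system_description"}


def make_tlv(tlv_type: int, value: bytes) -> bytes:
    """Build one LLDP TLV: a 7-bit type and 9-bit length, then the value.

    Hypothetical helper, used here only to fabricate sample data.
    """
    return struct.pack(">H", (tlv_type << 9) | len(value)) + value


def parse_lldp_tlvs(lldpdu: bytes) -> dict:
    """Collect system name and description from an LLDP data unit, if present."""
    found = {}
    offset = 0
    while offset + 2 <= len(lldpdu):
        (header,) = struct.unpack_from(">H", lldpdu, offset)
        tlv_type = header >> 9       # top 7 bits
        tlv_len = header & 0x01FF    # bottom 9 bits
        offset += 2
        if tlv_type == 0:            # End of LLDPDU
            break
        if offset + tlv_len > len(lldpdu):
            break                    # truncated capture, stop parsing
        value = lldpdu[offset:offset + tlv_len]
        offset += tlv_len
        if tlv_type in TLV_NAMES:
            found[TLV_NAMES[tlv_type]] = value.decode("utf-8", errors="replace")
    return found


# Made-up advertisement from a device that has LLDP enabled.
sample = (
    make_tlv(5, b"relay-07")
    + make_tlv(6, b"ExampleVendor X100 fw 1.2")
    + make_tlv(0, b"")
)
print(parse_lldp_tlvs(sample))
# A device with LLDP turned off sends nothing, so there is nothing to parse.
print(parse_lldp_tlvs(b""))
```

The second call is the whole argument in miniature: with the protocol disabled, a purely passive observer gets an empty result, and the only ways to fill it are to turn the protocol on or to ask the device directly.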

As you can see, when you push any great tool beyond its design limitations you start to make compromises, and if you’re not willing to compromise on which tool to use, then you will make other, less favorable compromises to your budgets and overall security.

It’s time to change our language. Passive is not precise language. Network protocol analysis is a must-have tool in any OT security and compliance toolbox, but it cannot be your only one. “Active” methods are not the evil they are portrayed as. They can bring deep, rich insights that are simply not achievable with network protocol analysis. You cannot do file integrity monitoring or removable media detection with network protocol analysis, but the MITRE ATT&CK for ICS Matrix lists both of those critical capabilities, early on in the kill chain. Heck, you cannot even do simple software vulnerability analysis on anything with an operating system instead of firmware.

To be very clear, you will never see me advocate for nmap or a broad-spectrum vulnerability scan in a production network, but that doesn’t mean we should throw the cheese sauce out with the broccoli. Select the right tool for the right job, and you will go from compromising to thriving.

To understand what a more comprehensive approach to detecting anomalous network and device activity might look like, check out our MITRE ATT&CK for ICS infographic.