Talkin’ SMAC: Alert Labeling and Why It Matters

If you’ve ever worked in a Security Operations Center (SOC), you know that it’s a special place. Among other things, the SOC is a massive data-labeling machine, and it generates some of the most valuable data in the cybersecurity industry. Unfortunately, much of that data ends up useless because of how it is labeled. The way we label data in the SOC matters greatly, and decisions made with good intentions to save time or effort can inadvertently lose or corrupt it.


Thoughtful measures must be taken ahead of time to ensure that the hard work SOC analysts apply to alerts results in meaningful, usable datasets. If properly recorded and handled, this data can be used to dramatically improve SOC operations. This blog post will demonstrate some common pitfalls of alert labeling, and will offer a new framework for SOCs to use—one that creates better insight into operations and enables future automation initiatives.


First, let’s define the scope of “SOC operations” for this discussion. All SOCs are different, and many do much more than alert triage, but for the purposes of this blog, let’s assume that a “typical SOC” ingests cybersecurity data in the form of alerts or logs (or both), analyzes them, and outputs reports and action items to stakeholders. Most importantly, the SOC decides which alerts represent malicious activity, which do not, and what to do about the ones that do. In this way, the SOC can be thought of as applying “labels” to the cybersecurity data that it analyzes.
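To make the “labeling machine” idea concrete, here is a minimal Python sketch of the decision described above: each alert gets a verdict, and only malicious alerts carry a follow-up action. The names (`Verdict`, `LabeledAlert`, `triage`), the alert IDs, and the example action are all hypothetical, chosen for illustration. This is not the framework this post goes on to propose, just the simplest possible version of the labeling a “typical SOC” already does.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    # Hypothetical two-way label, taken from the description above.
    NOT_MALICIOUS = "not_malicious"
    MALICIOUS = "malicious"


@dataclass
class LabeledAlert:
    alert_id: str
    verdict: Verdict
    action: Optional[str] = None  # only meaningful for malicious alerts

    def __post_init__(self):
        # Reject records that would be ambiguous to anyone reading them later.
        if self.verdict is Verdict.NOT_MALICIOUS and self.action is not None:
            raise ValueError("an action only applies to malicious alerts")


def triage(alert_id: str, is_malicious: bool,
           action: Optional[str] = None) -> LabeledAlert:
    """Turn an analyst's decision into a structured, reusable label."""
    verdict = Verdict.MALICIOUS if is_malicious else Verdict.NOT_MALICIOUS
    return LabeledAlert(alert_id, verdict, action)


# Illustrative data only: two made-up alerts and the labels applied to them.
labeled = [
    triage("alert-001", is_malicious=False),
    triage("alert-002", is_malicious=True, action="isolate host"),
]
malicious_ids = [a.alert_id for a in labeled if a.verdict is Verdict.MALICIOUS]
```

Even in this toy version, the validation step hints at why labeling choices matter: a record that pairs a “not malicious” verdict with an action would be hard to interpret after the fact, so the sketch refuses to create one.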


There are at least three main groups that care about what the SOC is doing:


SOC leadership/management
Customers/stakeholders
Intel/detection/research

These groups have different and sometimes ..
