Explainability in Cybersecurity Data Science

Explainability in Cybersecurity Data Science

Cybersecurity is data-rich and therefore a natural setting for machine learning (ML). However, many challenges hamper ML deployment into cybersecurity systems and organizations. One major challenge is that the human-machine relationship is rooted in a lack of explainability. Generally, there are two directions of explainability in cybersecurity data science:

  • Model-to-Human: predictive models inform the human cyber experts

  • Human-to-Model: human cyber experts inform the predictive models

  • When we build systems that combine both directions, we encourage a bidirectional, continuous relationship between the human and machine. We consider the absence of this two-way relationship a barrier to adopting ML systems at the cybersecurity-operations level. On a very basic level, explainable cybersecurity ML can be achieved now, but there are opportunities for significant improvement.

    In this blog, we first provide an overview of explainability in ML. Next, we illustrate (1) model-to-human explainability with the ML model form of cybersecurity decision trees. We then illustrate (2) human-to-model explainability with the feature engineering step of a cybersecurity ML pipeline. Finally, motivated by the progress made toward physics-informed ML, we recommend research needed to advance cybersecurity ML to achieve the level of two-way explainability necessary to encourage use of ML-based systems at the cybersecurity operations level.

    The Demand for Explainability

    As ML-based systems increasingly integrate into the fabric of daily life, the public is demanding increased transparency. The European Union encoded into law the individual’s right to an explanation when decisions made by au ..

    Support the originator by clicking the read the rest link below.