New Research: Enhancing Botnet Detection with AI using LLMs and Similarity Search

As botnets continue to evolve, so do the techniques required to detect them. While Transport Layer Security (TLS) encryption is widely adopted for secure communications, botnets leverage TLS to obscure command-and-control (C2) traffic. The TLS certificates these malicious actors use often carry identifiable characteristics, opening a potential pathway for advanced detection techniques.

In first-of-its-kind research, Rapid7’s Dr. Stuart Millar, in collaboration with Kumar Shashwat, Francis Hahn and Prof. Xinming Ou of the University of South Florida, studied how AI large language models (LLMs) can detect botnets that hide behind TLS encryption: by analyzing embedding similarities, the approach weeds out botnet certificates within a sea of benign TLS certificates. The work was presented toward the end of last year at AISec 2024 in Salt Lake City, part of the leading ACM CCS conference, where Rapid7 has previously won the best paper award.

Botnets — networks of hacked devices that attackers control remotely — often use TLS encryption to hide their activity. This encryption keeps the traffic secure, making it challenging for traditional security tools to detect whether a device is part of a botnet. Millar and company found they could detect botnets by analyzing the unique characteristics in the TLS certificates that each server uses to identify itself, dramatically reducing the time and human effort required.

Large language models can represent text as embeddings: numerical vectors that capture the meaning and structure of the text. The researchers used these embeddings to create vector representations of the text in TLS certificates, such as the organization names and country codes listed on them. By projecting these representations into a vector space and then using a similarity search, any new certificate can first be compared to a known set of botnet and benign certificates, and then a decision made as to w ..
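
To make the compare-then-decide idea concrete, here is a minimal Python sketch. It is not the researchers' implementation. The research used LLM embeddings, whereas this sketch swaps in a toy embedding built from hashed character trigrams so that it runs with no external dependencies. The names embed, cosine and classify, the vector size, the k-nearest-neighbour majority vote and the sample certificate strings are all assumptions made for illustration.

```python
import hashlib
import math
from collections import Counter

DIM = 256  # size of the toy embedding vector (arbitrary, assumed)


def embed(text: str) -> list[float]:
    """Toy stand-in for an LLM embedding: hash character trigrams into a unit vector."""
    vec = [0.0] * DIM
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        digest = hashlib.sha256(trigram.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; inputs are already unit length, so a dot product suffices."""
    return sum(x * y for x, y in zip(a, b))


def classify(cert_text: str, labeled: list[tuple[str, str]], k: int = 3) -> str:
    """Label a certificate by majority vote among its k most similar known certificates."""
    query = embed(cert_text)
    scored = sorted(
        ((cosine(query, embed(text)), label) for text, label in labeled),
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Made-up reference certificates (hypothetical subject strings, not real data).
known = [
    ("O=Example Corp, C=US, CN=www.example.com", "benign"),
    ("O=Example Ltd, C=GB, CN=mail.example.org", "benign"),
    ("O=Placeholder Org A, C=XX, CN=placeholder-a.invalid", "botnet"),
    ("O=Placeholder Org B, C=XX, CN=placeholder-b.invalid", "botnet"),
]

if __name__ == "__main__":
    print(classify("O=Example Inc, C=US, CN=shop.example.com", known))
```

In a real system the embed function would call an actual embedding model, and the linear scan over known certificates would typically give way to a vector index once the reference set grows large.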
