CatBERT: Context-aware tiny BERT for detecting targeted social engineering emails

Targeted phishing emails are a major cyber threat on the Internet today and are insufficiently addressed by current defences. In this paper, we leverage industrial-scale datasets from the Sophos cloud email security service, which defends tens of millions of customer mailboxes, to propose a novel Transformer-based architecture for detecting targeted phishing emails. Using real-world targeted phishing data as well as millions of benign customer emails for training and evaluation, we show that our proposed CatBERT (Context-Aware Tiny BERT) model achieves an 87% detection rate at a false positive rate of 1%. By comparison, the DistilBERT [20], LSTM (Long Short-Term Memory) [13], and logistic regression baselines achieve detection rates of 83%, 79%, and 54%, respectively. Our model leverages both natural language and email header inputs and is more computationally efficient than competing Transformer approaches. We also show that it is less prone to adversarial attacks that deliberately replace keywords with typos or synonyms.
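
The headline metric, detection rate at a fixed false positive rate, can be illustrated with a minimal sketch. It is not the paper's evaluation code. The function name `detection_rate_at_fpr`, the convention that higher scores mean "more likely phishing", and the strict-threshold rule are all assumptions made for illustration.

```python
import numpy as np

def detection_rate_at_fpr(labels, scores, target_fpr=0.01):
    """Return (detection_rate, threshold) at a false positive rate <= target_fpr.

    Hypothetical helper for illustration only.
    labels: 1/True for phishing, 0/False for benign.
    scores: classifier outputs, assumed higher = more suspicious.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    if labels.all() or not labels.any():
        raise ValueError("need at least one benign and one phishing example")

    # Benign scores sorted from most to least suspicious.
    benign = np.sort(scores[~labels])[::-1]

    # Number of benign emails we may flag without exceeding the target FPR.
    allowed = int(np.floor(target_fpr * benign.size))

    # Flag an email when its score is strictly above the threshold, so at
    # most `allowed` benign emails are flagged even when scores are tied.
    threshold = benign[allowed] if allowed < benign.size else -np.inf

    # Detection rate = fraction of phishing emails flagged at that threshold.
    detection_rate = float(np.mean(scores[labels] > threshold))
    return detection_rate, float(threshold)
```

Under these assumptions, the 87% figure corresponds to a detection rate of 0.87 when the threshold is chosen so that roughly 1 in 100 benign emails is flagged.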