From: Use of subword tokenization for domain generation algorithm classification
Problem | DL model | Dataset | F1 score |
---|---|---|---|
Detection (Berman 2019) | CNN: embedding + 1D CNN + fully connected layers | 1 million benign; 852,116 DGA from 50 classes | 0.9933 |
Detection (Selvi et al. 2021) | LSTM: embedding + LSTM + fully connected layers | 32,000 benign; 32,000 DGA | 0.9762 |
Classification (Qiao et al. 2019) | LSTM with attention: embedding + LSTM + attention + fully connected layers | 910,313 benign; 765,091 DGA from 15 classes (759,091 from 14 random-looking classes, 6,000 from 1 word-looking class) | 0.9458 |
Detection and classification (Vij et al. 2020) | LSTM: embedding + LSTM + fully connected layers | 109,935 benign; 109,935 DGA from 11 classes (all random-looking) | Detection: 0.9804; classification: 0.7192 |
Detection and classification (Ren et al. 2020) | CNN-BiLSTM with attention: embedding + CNN + BiLSTM + attention + fully connected layer | 1 million benign; 308,230 DGA from 24 classes (19 arithmetic-based, 2 wordlist-based, 3 part-wordlist-based) | Detection: 0.9879; classification: 0.8300 |
Detection (Yang et al. 2022) | Subword tokenization + transformer | 10,000 benign; 10,000 DGA from 9 classes (one wordlist-based) | 0.9697 |
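
Several rows in the table share the same embedding + LSTM + fully connected pipeline (Selvi et al. 2021; Vij et al. 2020). The sketch below shows what that pipeline looks like in PyTorch; the character vocabulary, sequence length, and layer sizes are illustrative assumptions, not the settings used in the cited papers.

```python
# Minimal sketch of an embedding + LSTM + fully connected DGA detector.
# Assumptions: PyTorch; character-level vocabulary of lowercase letters,
# digits, '-', '.'; all hyperparameters are placeholders.
import torch
import torch.nn as nn

CHARS = "abcdefghijklmnopqrstuvwxyz0123456789-."
CHAR2IDX = {c: i + 1 for i, c in enumerate(CHARS)}  # 0 is reserved for padding


def encode(domain: str, max_len: int = 64) -> torch.Tensor:
    """Map a domain string to a fixed-length tensor of character indices."""
    idx = [CHAR2IDX.get(c, 0) for c in domain.lower()[:max_len]]  # unknown -> 0
    idx += [0] * (max_len - len(idx))  # right-pad with the padding index
    return torch.tensor(idx)


class LSTMDGADetector(nn.Module):
    """Embedding + LSTM + fully connected layers, as in the table rows above."""

    def __init__(self, vocab_size: int = len(CHARS) + 1,
                 embed_dim: int = 128, hidden_dim: int = 128,
                 num_classes: int = 2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        emb = self.embedding(x)       # (batch, seq, embed_dim)
        _, (h_n, _) = self.lstm(emb)  # final hidden state: (1, batch, hidden)
        return self.fc(h_n[-1])       # logits over {benign, DGA}


model = LSTMDGADetector()
batch = torch.stack([encode("example.com"), encode("xkqzjvplw.net")])
logits = model(batch)  # shape: (2, 2)
```

For multi-class classification (Vij et al. 2020; Qiao et al. 2019), the same architecture applies with `num_classes` set to the number of DGA families plus one benign class.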
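
The last row replaces character-level input with subword tokens (Yang et al. 2022), so that dictionary fragments used by wordlist-based DGAs surface as single tokens. Below is a minimal sketch of that idea, assuming HuggingFace `tokenizers` for BPE training and a small PyTorch transformer encoder; the toy corpus, vocabulary size, and model dimensions are placeholders, not the paper's configuration.

```python
# Minimal sketch: BPE subword tokenization + transformer encoder for
# DGA detection. Assumes the `tokenizers` and `torch` packages.
import torch
import torch.nn as nn
from tokenizers import Tokenizer, models, trainers

# Train a BPE tokenizer so recurring domain fragments (e.g. dictionary
# words emitted by wordlist-based DGAs) become single subword tokens.
domains = ["example.com", "mail.example.org", "xkqzjvplw.net"]  # toy corpus
tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
trainer = trainers.BpeTrainer(vocab_size=500,
                              special_tokens=["[PAD]", "[UNK]"])  # [PAD] gets id 0
tokenizer.train_from_iterator(domains, trainer=trainer)


class TransformerDGADetector(nn.Module):
    """Subword embeddings + transformer encoder + linear classifier."""

    def __init__(self, vocab_size: int, embed_dim: int = 64,
                 num_heads: int = 4, num_layers: int = 2,
                 max_len: int = 32, num_classes: int = 2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.pos = nn.Parameter(torch.zeros(1, max_len, embed_dim))  # learned positions
        layer = nn.TransformerEncoderLayer(embed_dim, num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.embedding(x) + self.pos[:, : x.size(1)]
        x = self.encoder(x)
        # Mean-pool over all tokens (including padding, for simplicity).
        return self.fc(x.mean(dim=1))


def encode_batch(names, max_len: int = 32) -> torch.Tensor:
    """Tokenize domains into padded tensors of subword IDs."""
    pad = tokenizer.token_to_id("[PAD]")
    ids = [tokenizer.encode(n).ids[:max_len] for n in names]
    ids = [seq + [pad] * (max_len - len(seq)) for seq in ids]
    return torch.tensor(ids)


model = TransformerDGADetector(vocab_size=tokenizer.get_vocab_size())
logits = model(encode_batch(["sunshineflower.com", "qzjxkwv.net"]))  # (2, 2)
```

In practice the tokenizer would be trained on a large corpus of benign and DGA domains rather than a toy list, so that its merges reflect real subword statistics.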