Table 15 The properties of the malicious domain detection methods

From: PUMD: a PU learning-based malicious domain detection framework

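| No. | Work | Object | Technique | Dataset / Data anonymized | Dataset / Ground truth source | Feature construction* / Handcraft / HC | Feature construction* / Handcraft / T | Feature construction* / Handcraft / W | Feature construction* / Handcraft / E | Feature construction* / Implicit / IC | Feature construction* / Implicit / A | Model training / Trainset size / Mali | Model training / Trainset size / Benign | Model training / Trainset size / Unlabel | Model training / Imbalance ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | PUMD (ours) | Malicious activity DN (C&C) | PU learning: iForest + RF | DN, IP Addr | manual label | ✓ | ✓ | ✓ | ✓ |  | ✓ | 100–861 |  | 19651–20412 | 22.8–204.1 |
| 2 | Phoenix | DGA | Mahalanobis distance + DBSCAN |  | generate DN, blacklist, Alexa | ✓ |  |  |  |  | ✓ | ~100k |  |  |  |
| 3 | AULD | Malicious activity DN | filter-rule + canopy + k-means | IP Addr | simulate DN, Alexa | ✓ | ✓ | ✓ |  |  |  |  |  |  |  |
| 4 | HinDom | Malicious activity DN | HIN + transductive classify | IP Addr | blacklist, Alexa, whitelist |  |  |  |  | ✓ | ✓ | 0.02M–0.22M | 0.07M–0.63M |  | 2.8 |
| 5 | ELM | Malicious activity DN | ELM |  | blacklist, Alexa | ✓ | ✓ | ✓ |  |  |  | ~20k | ~6k |  | 0.3 |
| 6 | LSTM.MI | DGA | cost-sensitive LSTM |  | blacklist, Alexa |  |  |  |  | ✓ |  | ~41k (total) | ~44k |  | 2–3534 |
| 7 | KSDom | Malicious activity DN | CatBoost + kmSmote |  | blacklist, Alexa, whitelist | ✓ | ✓ | ✓ |  |  |  | 5.4k, 3.6k, 1.8k, 0.9k | 9k |  | 1.6, 2.5, 5, 10 |
| 8 | HAC_Easy Ensemble | Malicious activity DN | undersample + ensemble learning |  | blacklist, Alexa | ✓ | ✓ |  |  |  |  | 2.7k | 5.76k |  | 2.1 |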

| No. | Work | Object | Technique | Model testing / Testset size / Mali | Model testing / Testset size / Benign | Model testing / Imbalance ratio | Model output / Process | Model output / Class |
|---|---|---|---|---|---|---|---|---|
| 1 | PUMD (ours) | Malicious activity DN (C&C) | PU learning: iForest + RF | 91–852 | 19560 | 22.9–214.9 | Auto | 2 |
| 2 | Phoenix | DGA | Mahalanobis distance + DBSCAN | ~1.3M |  |  | Threshold, rule-match | 5 |
| 3 | AULD | Malicious activity DN | filter-rule + canopy + k-means | 1462 | 9068 | 6.2 | Threshold, manual analysis | 2 |
| 4 | HinDom | Malicious activity DN | HIN + transductive classify | ~0.25M | ~0.7M | 2.8 | Auto | 2 |
| 5 | ELM | Malicious activity DN | ELM | ~20k | ~6k | 0.3 | Auto | 2 |
| 6 | LSTM.MI | DGA | cost-sensitive LSTM | ~41k (total) | ~44k | 2–3534 | Auto | 38 |
| 7 | KSDom | Malicious activity DN | CatBoost + kmSmote | 0.6k, 0.4k, 0.2k, 0.1k | 1k | 1.6, 2.5, 5, 10 | Auto | 2 |
| 8 | HAC_Easy Ensemble | Malicious activity DN | undersample + ensemble learning | 0.3k | 0.64k | 2.1 | Auto | 2 |

*HC: Handcraft Character, T: Traffic, W: WHOIS, E: Evidence, IC: Implicit Character, A: Associate
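
The table does not define the imbalance ratio. The listed values appear consistent with the benign (or unlabeled) sample count divided by the malicious sample count, and the short Python sketch below checks that reading against three test-set rows. The function name `imbalance_ratio` and the choice of rows are illustrative assumptions, not part of the source.

```python
def imbalance_ratio(majority_count: int, malicious_count: int) -> float:
    """Assumed definition: majority (benign or unlabeled) count / malicious count."""
    if malicious_count <= 0:
        raise ValueError("malicious_count must be positive")
    return majority_count / malicious_count


# (benign count, malicious count) taken from the test-set columns of Table 15.
test_rows = {
    "AULD": (9068, 1462),
    "PUMD, fewest malicious samples": (19560, 91),
    "HAC_Easy Ensemble": (640, 300),  # 0.64k benign, 0.3k malicious
}

for work, (benign, malicious) in test_rows.items():
    print(f"{work}: {imbalance_ratio(benign, malicious):.1f}")
```

Under this assumed definition, a ratio below 1 (as in the ELM row) means the malicious class is the larger one.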