Table 1 Comparison of related works

From: LSTM RNN: detecting exploit kits using redirection chain sequences

Ref

Dataset

Approach

Results

Nikolaev et al. (2016)

HTTP logs from 200+ networks over 6 months and PCAPs (Duncan 2020) over 3 months

Compares EK detection using 5 indicators (MIME type, structure, duration, repetition, browser agent) against RegEx-only detection

Average precision of 0.95 and recall of 0.92-0.95 using all 5 indicators

Harnmetta and Ngamsuriyaroj (2018)

820 PCAPs (Duncan 2020) (2014-2016)

Applies Decision Tree classifier to content-based, interaction and connection-specific features extracted from the HTTP, DNS and Files logs produced by Zeek

Classified EK traffic with 0.99 accuracy and 0.92 precision, and EK families with 0.82-0.99 accuracy and 0.8-0.99 precision

Singh and Goyal (2019)

Dataset extracted from 3496 malicious and 2907 benign websites (MalCrawler)

Determines importance of 25 different features for detecting malicious websites, according to accuracy and computational costs. Applies 10-fold cross-validation (CV) in WEKA using Naive Bayes and C4.5 classifiers

Identifies top 5 attributes of malicious sites: cloaking, use of iFrame, redirection, size of obfuscated code and pop-ups using Window.open() function

Süren et al. (2019)

240 PCAPs (Duncan 2020) (2016)

Extracts 20 URL-based features from each domain in EK attack chain and compares ML algorithms

KNN, SVM and GBC achieved 0.958, 0.916 and 1.0 accuracy, respectively

Stringhini et al. (2013)

5000 redirect chains from a large AV vendor (2012)

Builds redirection graphs by aggregating redirect chains from a collection of different users and extracts 28 features from 5 categories for SVM

Achieved F1 score of up to 0.881, depending on the range of features considered

Li et al. (2014)

Crawled Alexa top 1m domains and Microsoft’s feed of malicious URLs over 4-6 weeks (2012)

Detects mass redirect-script injections by comparing suspicious JS files to their original versions. Based on the observation that redirection scripts are often quietly injected into legitimate JS libraries, whose unaltered code is publicly available

Produced detailed analysis of malicious JS/redirects and quantified the use of obfuscation/evasion techniques

Matsunaka et al. (2014)

D3M 2013 dataset of 108 malicious websites (Marionette)

Uses monitoring sensors on the client side (browser, web proxy and DNS) and an analysis centre on the server side to detect EK attacks. EXE downloads are classified as malicious if the URL is not present in previous HTTP headers or web content

Achieved 0% FPR with 24.2% FNR when tested against a dataset of 108 URLs (33 malicious)

Mekky et al. (2014)

15,000 malicious paths and 225,000 benign paths, provided by a large ISP (2011-2012)

Reconstructs user browsing activity into trees, representing time-based sessions, and extracts 8 redirection-based features for use with a Decision Tree classifier

Extracted redirection trees with 0.965 accuracy and classified them with precision and recall values of 0.9-0.98

Takata et al. (2015)

Crawled 19,899 EK landing pages over 3 years (Marionette)

Applies program slicing to JS; executes each code segment and extracts URLs, even when cloaking prevents the execution of malicious JS branches

Extracted 30,000 new URLs compared to existing techniques

Nelms et al. (2015)

Dataset of 683 manually labelled, malicious download paths (164 EK instances)

Investigates browsing paths followed by users before an attack. WebWitness identifies a malicious download and traces back through HTTP requests, building a tree of redirects that led to the malware

Identified EKs with 0.9919 accuracy when tested against 48 EK samples using 10-fold CV

Taylor et al. (2016)

688 million redirection trees, extracted from 3800 hours of traffic (2013-2014)

Builds web session trees (WST) and extracts URL-based features. Subtree similarity searches are performed against the WSTs to identify node-level and structural similarities with known malicious trees

Achieved 95% TPR against a dataset of 85 EK samples and identified 28 new EK instances during analysis

Nagai et al. (2019)

D3M 2015 dataset of 256 malicious websites (Marionette)

Builds WSTs similar to Taylor et al. (2016) but aims to handle incomplete redirection data using time-based clustering. Focus is on WST construction rather than feature extraction

Average accuracy of 0.862 using 2-fold CV. Scored higher on EK families represented in both train and test sets

Takata et al. (2018)

8467 JS samples from 20,272 malicious websites (2012-2016)

Compares redirection graphs from browsers running different JS implementations to identify structural differences resulting from evasive code

Discovered several new evasion techniques that abuse JS implementation differences

Shibahara et al. (2019)

Crawled 455,860 websites, 1.3% labelled as malicious or evasive (2016)

Graph mining approach to detect malicious sites, even if the full chain of redirects cannot be extracted. 22 redirect-, HTML- and JS-based features are obtained from each graph and evaluated with an RF classifier

Achieved F1 score of 0.766 for sites hosting EK URLs and identified 143 more malicious sites than conventional systems
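
The Results column mixes several metrics (accuracy, precision, recall, F1, FPR, FNR), so the rows are not directly comparable. As a reading aid, the standard confusion-matrix definitions can be sketched as below. The names `Confusion` and `metrics` are illustrative, and the example counts are an assumption: they are one set of counts consistent with the 0% FPR and 24.2% FNR listed for Matsunaka et al. (2014) on 108 URLs (33 malicious), not figures reported in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of true/false positives and negatives for a binary detector."""
    tp: int
    fp: int
    tn: int
    fn: int


def _ratio(num: float, den: float) -> float:
    # Return 0.0 instead of raising when a class is absent from the data.
    return num / den if den else 0.0


def metrics(c: Confusion) -> dict:
    """Standard definitions of the metrics named in the Results column."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)  # recall is the same as TPR
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
        "fpr": _ratio(c.fp, c.fp + c.tn),  # benign items flagged as malicious
        "fnr": _ratio(c.fn, c.fn + c.tp),  # malicious items that were missed
    }


# Assumed counts: 33 malicious URLs with 8 missed, 75 benign with none flagged.
example = Confusion(tp=25, fp=0, tn=75, fn=8)
m = metrics(example)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Note that FPR is computed over benign samples only and FNR over malicious samples only, which is why a detector can show 0% FPR and a substantial FNR at the same time.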