Table 2 Code clone detectors and their manifest including extracted feature, used dataset, and supported clone types

From: Are our clone detectors good enough? An empirical study of code effects by obfuscation

Clone Type columns: T1, T2, ST3, MT3, T4

| Tools | Venue | Method | Feature | Dataset | T1 | T2 | ST3 | MT3 | T4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CCFinder (Kamiya et al. 2002) | TSE 2002 | token normalization + token-wise comparison | Token | JDK 1.3.0, FreeBSD 4.0 | ✓ | ✓ | | | |
| SDD (Lee and Jeong 2005) | OOPSLA 2005 | inverted index + N-neighbor | Text | JDK 1.5, httpd-2.0.54 | ✓ | ✓ | ✓ | | |
| Deckard (Jiang et al. 2007) | ICSE 2007 | Locality Sensitive Hash | AST | JDK 1.4.2, Linux kernel 2.6.16 | ✓ | ✓ | ✓ | | |
| SourcererCC (Sajnani et al. 2016) | ICSE 2016 | Filtering Heuristics | Token | BigCloneBench, Mutation/Injection | ✓ | ✓ | ✓ | | |
| Oreo (Saini et al. 2018) | ESEC/FSE 2018 | action token + metric comparison | Token | BigCloneBench | ✓ | ✓ | ✓ | | |
| CCAligner (Wang et al. 2018) | ICSE 2018 | code window + edit distance | Token | JDK 1.2.2, OpenNLP 1.8.1 | ✓ | ✓ | ✓ | | |
| DeepSim (Zhao and Huang 2018) | FSE 2018 | Multilayer Perceptron | CFG | BigCloneBench, GCJ | ✓ | ✓ | ✓ | ✓ | ✓ |
| ASTNN (Zhang et al. 2019) | ICSE 2019 | Bidirectional RNN | AST | BigCloneBench | ✓ | ✓ | ✓ | ✓ | ✓ |
| CCLearner (Li et al. 2017) | ICSME 2017 | DNNs | Token | BigCloneBench | ✓ | ✓ | ✓ | ✓ | |