Table 3 Evaluation results under threshold 0.7

From: Automated identification of sensitive data from implicit user specification

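| Category | HI | MI | TP | FP | TN | FN | P(%) | R(%) | FS(%) | Acc(%) |
|---|---|---|---|---|---|---|---|---|---|---|
| Account | 66 | 66 | 63 | 3 | 5083 | 3 | 95.5 | 95.5 | 95.5 | 99.9 |
| Calendar | 18 | 19 | 17 | 2 | 5132 | 1 | 89.5 | 94.4 | 91.9 | 99.9 |
| Credential | 77 | 87 | 75 | 12 | 5063 | 2 | 86.2 | 97.4 | 91.5 | 99.7 |
| Finance | 83 | 103 | 81 | 22 | 5047 | 2 | 78.6 | 97.6 | 87.1 | 99.5 |
| Profile | 200 | 232 | 193 | 39 | 4913 | 7 | 83.2 | 96.5 | 89.4 | 99.1 |
| Search & History | 42 | 41 | 39 | 2 | 5108 | 3 | 95.1 | 92.9 | 94.0 | 99.9 |
| Setting | 78 | 78 | 75 | 3 | 5071 | 3 | 96.2 | 96.2 | 96.2 | 99.9 |
| Average | - | - | - | - | - | - | 89.2 | 95.8 | 92.2 | 99.7 |
| One Category | 564 | 626 | 545 | 81 | 4507 | 19 | 87.1 | 96.6 | 91.6 | 98.1 |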

HI: number of texts identified by humans as sensitive (category); MI: number of texts identified by S3 as sensitive (category); TP: number of true positives; FP: number of false positives; TN: number of true negatives; FN: number of false negatives; P: precision; R: recall; FS: F-score; Acc: accuracy. The last row (One Category) is computed by treating all sensitive texts as one category.
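The percentage columns can be recomputed from the four count columns. The sketch below assumes the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F-score as the harmonic mean of the two, accuracy = (TP + TN) / total); the source does not spell these formulas out, but they reproduce the reported values. The names `Confusion` and `as_percentages` are hypothetical helpers introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion counts for one category, named after the table columns."""
    tp: int
    fp: int
    tn: int
    fn: int

    def precision(self) -> float:
        # Assumed definition: TP / (TP + FP). TP + FP equals the MI column.
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    def recall(self) -> float:
        # Assumed definition: TP / (TP + FN). TP + FN equals the HI column.
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    def f_score(self) -> float:
        # Assumed to be the balanced F-score (harmonic mean of P and R).
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0

    def accuracy(self) -> float:
        # Assumed definition: (TP + TN) / all evaluated texts.
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total if total else 0.0


def as_percentages(c: Confusion) -> tuple:
    """Return (P, R, FS, Acc) as percentages rounded to one decimal."""
    values = (c.precision(), c.recall(), c.f_score(), c.accuracy())
    return tuple(round(100 * v, 1) for v in values)


# Counts copied from the Account row of the table.
account = Confusion(tp=63, fp=3, tn=5083, fn=3)
print(as_percentages(account))  # (95.5, 95.5, 95.5, 99.9)
```

Every row's four counts sum to 5152 texts, so TN dominates each total. That is why accuracy stays above 98% even for Finance, where precision drops to 78.6%.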