Table 3 Evaluation results under threshold 0.7

From: Automated identification of sensitive data from implicit user specification

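| Category | HI | MI | TP | FP | TN | FN | P(%) | R(%) | FS(%) | Acc(%) |
|---|---|---|---|---|---|---|---|---|---|---|
| Account | 66 | 66 | 63 | 3 | 5083 | 3 | 95.5 | 95.5 | 95.5 | 99.9 |
| Calendar | 18 | 19 | 17 | 2 | 5132 | 1 | 89.5 | 94.4 | 91.9 | 99.9 |
| Credential | 77 | 87 | 75 | 12 | 5063 | 2 | 86.2 | 97.4 | 91.5 | 99.7 |
| Finance | 83 | 103 | 81 | 22 | 5047 | 2 | 78.6 | 97.6 | 87.1 | 99.5 |
| Profile | 200 | 232 | 193 | 39 | 4913 | 7 | 83.2 | 96.5 | 89.4 | 99.1 |
| Search & History | 42 | 41 | 39 | 2 | 5108 | 3 | 95.1 | 92.9 | 94.0 | 99.9 |
| Setting | 78 | 78 | 75 | 3 | 5071 | 3 | 96.2 | 96.2 | 96.2 | 99.9 |
| Average | - | - | - | - | - | - | 89.2 | 95.8 | 92.2 | 99.7 |
| One Category* | 564 | 626 | 545 | 81 | 4507 | 19 | 87.1 | 96.6 | 91.6 | 98.1 |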
  1. HI: Number of texts identified by humans as sensitive (per category); MI: Number of texts identified by S3 as sensitive (per category); TP: Number of true positives; FP: Number of false positives; TN: Number of true negatives; FN: Number of false negatives; P: Precision; R: Recall; FS: F-score; Acc: Accuracy; *: The last row is computed by treating all sensitive texts as one category.
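
The percentage columns can be recomputed from the four counts in each row. The following is a minimal sketch, not the authors' code: the `Confusion` class and `as_percentages` helper are hypothetical names introduced here, and the F-score is assumed to be the balanced F1 (harmonic mean of precision and recall), which matches the values reported in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts for one table row (hypothetical helper, not from the paper)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives

    def precision(self) -> float:
        denom = self.tp + self.fp  # equals the MI column
        return self.tp / denom if denom else 0.0

    def recall(self) -> float:
        denom = self.tp + self.fn  # equals the HI column
        return self.tp / denom if denom else 0.0

    def f_score(self) -> float:
        # Assumption: balanced F1, i.e. 2PR / (P + R).
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total if total else 0.0


def as_percentages(c: Confusion) -> tuple:
    """Return (P, R, FS, Acc) as percentages rounded to one decimal."""
    metrics = (c.precision(), c.recall(), c.f_score(), c.accuracy())
    return tuple(round(100 * m, 1) for m in metrics)


# Finance row: TP=81, FP=22, TN=5047, FN=2
print(as_percentages(Confusion(tp=81, fp=22, tn=5047, fn=2)))
# (78.6, 97.6, 87.1, 99.5)
```

In every row, TP + FP equals MI, TP + FN equals HI, and the four counts sum to 5152 texts. The "Average" row is the arithmetic mean of the seven per-category percentages, whereas the last row pools the counts before computing the metrics.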