
Table 1 Non-adaptive attack evaluation. SRoA denotes the success rate of the attack.

From: Towards the universal defense for query-based audio adversarial attacks on speech recognition system

| Attack | Dataset | SRoA (%) | Avg. queries (n) | Detections | DSR (%) | FSNR (dB) |
| --- | --- | --- | --- | --- | --- | --- |
| CS | Music-sets | 100.00 | \(\sim\)300 | \(\sim\)3.92 | 98.00 | 7.38 |
| DW | Music-sets | 98.00 | \(\sim\)150 | \(\sim\)1.7 | 84.74 | 18.41 |
| Average | | 100.00 | \(\sim\)225 | \(\sim\)2.81 | 91.37 | 12.90 |
| IRTA | Mini-librispeech | 100.00 | \(\sim\)5000 | \(\sim\)56.00 | 84.00 | 40.97 |
| DS | Mini-librispeech | 100.00 | \(\sim\)1000 | \(\sim\)11.00 | 82.50 | 13.02 |
| Average | | 100.00 | \(\sim\)3000 | \(\sim\)34.00 | 83.25 | 27.00 |

  1. Higher DSR and FSNR values are better. Normally, one detection is performed for every k queries (k = 75); if an attack uses fewer than k queries, at least one detection is still performed over all n queries. The number of detections is the ratio n/k.
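
The detection count described in the footnote can be sketched in a few lines of Python. This is only an illustration of the stated n/k rule, not the authors' implementation: the function name `detections`, the handling of zero queries, and the reading of "at least one detection" as a floor of 1 are assumptions made here. The table reports approximate averages, so its rounded query counts need not reproduce the Detections column exactly (for example, 300 queries give 4.0 against the reported \(\sim\)3.92 for CS).

```python
def detections(n_queries: int, k: int = 75) -> float:
    """Estimate how many detections are performed for n_queries queries.

    Assumed reading of the footnote: one detection per k queries (the ratio
    n/k), with at least one detection when fewer than k queries are issued.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if n_queries <= 0:
        # Assumption: no queries means nothing to detect.
        return 0.0
    # Ratio n/k, floored at one detection for short attacks.
    return max(1.0, n_queries / k)


if __name__ == "__main__":
    # Rounded average query counts taken from the table (CS and DW rows),
    # plus a hypothetical attack that needs fewer than k queries.
    for n in (300, 150, 50):
        print(n, detections(n))
```

With k = 75, an attack that finishes in 50 queries still triggers one detection, while 150 and 300 queries correspond to 2.0 and 4.0 detections under this rule.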