Audio Replay Attack Detection for Speaker Verification System Using Convolutional Neural Networks

Kemanth, P.J.; Supanekar, S.; Koolagudi, S.G.

Please use this identifier to cite or link to this item: http://idr.nitk.ac.in/jspui/handle/123456789/7413

Title:	Audio Replay Attack Detection for Speaker Verification System Using Convolutional Neural Networks
Authors:	Kemanth, P.J.; Supanekar, S.; Koolagudi, S.G.
Issue Date:	2019
Citation:	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2019, Vol. 11942 LNCS, pp. 445-453
Abstract:	An audio replay attack is one of the most popular spoofing attacks on speaker verification systems because it is very economical and does not require much knowledge of signal processing. In this paper, we investigate the significance of non-voiced audio segments and deep learning models such as Convolutional Neural Networks (CNNs) for audio replay attack detection. The non-voiced segments of the audio can be used to detect reverberation and channel noise. FFT spectrograms are generated and given as input to a CNN to classify the audio as genuine or replay. The advantage of the proposed approach is that removing the voiced speech reduces the feature vector size without compromising the necessary features, which leads to a significant reduction in the training time of the networks. The ASVspoof 2017 dataset is used to train and evaluate the model. The Equal Error Rate (EER) is computed and used as the metric to evaluate model performance. The proposed system achieves an EER of 5.62% on the development dataset and 12.47% on the evaluation dataset. © 2019, Springer Nature Switzerland AG.
URI:	http://idr.nitk.ac.in/jspui/handle/123456789/7413
Appears in Collections:	2. Conference Papers
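
The abstract reports results as an Equal Error Rate (EER), the operating point at which the false acceptance rate (replayed audio accepted as genuine) equals the false rejection rate (genuine audio rejected). The following is a minimal sketch of how such a value could be computed from classifier scores. It is not the authors' code: the function name, the convention that higher scores mean "more likely genuine", and the sample scores are all assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(genuine_scores, replay_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption (not from the paper): a higher score means "more likely genuine".
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    replay = np.asarray(replay_scores, dtype=float)

    # Candidate thresholds: every distinct score, in ascending order.
    thresholds = np.unique(np.concatenate([genuine, replay]))

    # FRR: fraction of genuine trials scoring below the threshold (rejected).
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # FAR: fraction of replay trials scoring at or above the threshold (accepted).
    far = np.array([(replay >= t).mean() for t in thresholds])

    # Pick the threshold where the two error rates are closest and average them.
    idx = int(np.argmin(np.abs(far - frr)))
    return float((far[idx] + frr[idx]) / 2.0), float(thresholds[idx])


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    genuine = [0.9, 0.8, 0.7, 0.4]
    replay = [0.1, 0.2, 0.3, 0.6]
    eer, threshold = equal_error_rate(genuine, replay)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

With the hypothetical scores above, one genuine trial and one replay trial fall on the wrong side of the chosen threshold, so both error rates are 25%. On a finite set of trials the two rates rarely cross exactly, which is why the sketch averages them at the closest point; evaluation toolkits may interpolate instead.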
