Voice identification Using a Composite Haar Wavelets and Proper Orthogonal Decomposition

Anwer, Mohammed; Khan, Rezwan-Al-Islam

Volume 4, Issue 2, October 2013, Pages 353–358

@article{IJIAS-13-204-21,
author = {Mohammed Anwer and Rezwan-Al-Islam Khan},
title = {{Voice identification Using a Composite Haar Wavelets and Proper Orthogonal Decomposition}},
journal = {International Journal of Innovation and Applied Studies},
volume = {4},
year = {2013},
pages = {353--358},
issue = {2},
number = {2},
issn = {2028-9324},
url = {http://www.ijias.issr-journals.org/abstract.php?article=IJIAS-13-204-21},
abstract_html_url = {http://www.ijias.issr-journals.org/abstract.php?article=IJIAS-13-204-21},
pdf_url = {http://www.issr-journals.org/links/papers.php?journal=ijias&application=pdf&article=IJIAS-13-204-21},
document_type={Article},
source={www.issr-journals.org}
}

TY  - JOUR
ID  - 
TI  - Voice identification Using a Composite Haar Wavelets and Proper Orthogonal Decomposition
AU  - Mohammed Anwer
AU  - Rezwan-Al-Islam Khan
PY  - 2013
VL  - 4
IS  - 2
SP  - 353
EP  - 358
JO  - International Journal of Innovation and Applied Studies
T2  - International Journal of Innovation and Applied Studies
SN  - 20289324
UR  - http://www.ijias.issr-journals.org/abstract.php?article=IJIAS-13-204-21
AB  - In present-day business and consumer environments, a robust voice identification system is needed to reduce false positives and false negatives. This work describes a modified voice identification system that uses oversampled Haar wavelets followed by proper orthogonal decomposition. The audio signal is first decomposed with oversampled Haar wavelets into several uncorrelated frequency bands, from which linear predictive cepstral coefficients are calculated to capture the characteristics of individual speakers. An adaptive threshold is applied to reduce noise interference, followed by a multi-layered vector quantization technique to eliminate interference between the multi-band coefficients. Finally, proper orthogonal decomposition is used to extract unique characteristics that capture more detail of the phoneme characteristics. The proposed algorithm was evaluated on the KING and MAT-400 databases, which were chosen because previous extraction results were available for them. In the present study, the system was trained on three sentences from the KING database and tested on two; for the MAT-400 database, it was trained on two seconds of random voice signal and tested on another two seconds. Results were compared with vector quantization and Gaussian mixture models. The present model performed consistently better on speech collected through mouthpieces but comparatively poorly on audio collected over telephones. The improved performance comes at the cost of higher computation time.
ER  -

RT Journal Article
ID IJIAS-13-204-21
A1 Mohammed Anwer
A1 Rezwan-Al-Islam Khan
YR 2013
T1 Voice identification Using a Composite Haar Wavelets and Proper Orthogonal Decomposition
JF International Journal of Innovation and Applied Studies

Mohammed Anwer¹ and Rezwan-Al-Islam Khan²

¹ School of Engineering and Computer Science, Independent University, Dhaka, Bangladesh
² School of Engineering and Computer Science, Independent University, Dhaka, Bangladesh

Original language: English

Copyright © 2013 ISSR Journals. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

In present-day business and consumer environments, a robust voice identification system is needed to reduce false positives and false negatives. This work describes a modified voice identification system that uses oversampled Haar wavelets followed by proper orthogonal decomposition. The audio signal is first decomposed with oversampled Haar wavelets into several uncorrelated frequency bands, from which linear predictive cepstral coefficients are calculated to capture the characteristics of individual speakers. An adaptive threshold is applied to reduce noise interference, followed by a multi-layered vector quantization technique to eliminate interference between the multi-band coefficients. Finally, proper orthogonal decomposition is used to extract unique characteristics that capture more detail of the phoneme characteristics. The proposed algorithm was evaluated on the KING and MAT-400 databases, which were chosen because previous extraction results were available for them. In the present study, the system was trained on three sentences from the KING database and tested on two; for the MAT-400 database, it was trained on two seconds of random voice signal and tested on another two seconds. Results were compared with vector quantization and Gaussian mixture models. The present model performed consistently better on speech collected through mouthpieces but comparatively poorly on audio collected over telephones. The improved performance comes at the cost of higher computation time.

Author Keywords: Voice identification, Haar wavelet, Proper Orthogonal Decomposition, Signal Processing, Modeling.
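
The first and last stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the circular boundary handling, the frame length, and the 95% energy cutoff are all assumptions made here, and per-band log energies stand in for the linear predictive cepstral coefficients, adaptive threshold, and vector quantization stages, which are omitted. The oversampled Haar transform is written as the undecimated (à trous) scheme, and proper orthogonal decomposition is computed with a singular value decomposition.

```python
import numpy as np


def undecimated_haar(signal, levels):
    """Oversampled (undecimated) Haar decomposition, a-trous style.

    Returns (details, approx). Every band has the same length as the input.
    Circular boundary handling is an assumption of this sketch.
    """
    approx = np.asarray(signal, dtype=float)
    details = []
    for j in range(levels):
        shifted = np.roll(approx, 2 ** j)   # sample 2**j positions earlier
        smooth = 0.5 * (approx + shifted)   # Haar low-pass: pairwise average
        details.append(approx - smooth)     # Haar high-pass: 0.5 * (approx - shifted)
        approx = smooth
    return details, approx


def band_log_energies(signal, levels=3, frame_len=64):
    """Placeholder features (NOT LPCC): log energy per band and per frame.

    Returns an array of shape (n_frames, levels + 1).
    """
    details, approx = undecimated_haar(signal, levels)
    bands = np.stack(details + [approx])            # (levels + 1, n_samples)
    n_frames = bands.shape[1] // frame_len
    frames = bands[:, :n_frames * frame_len].reshape(len(bands), n_frames, frame_len)
    return np.log(np.sum(frames ** 2, axis=2) + 1e-12).T


def pod_modes(features, energy=0.95):
    """Proper orthogonal decomposition of a (frames x coefficients) matrix.

    Keeps the fewest modes whose squared singular values reach the requested
    energy fraction. Returns (mean, modes, coords).
    """
    X = np.asarray(features, dtype=float)
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    total = np.sum(s ** 2)
    if total == 0.0:
        k = 1                                       # degenerate case: constant features
    else:
        frac = np.cumsum(s ** 2) / total
        k = min(int(np.searchsorted(frac, energy)) + 1, len(s))
    modes = vt[:k]                                  # orthonormal rows
    return mean, modes, (X - mean) @ modes.T


# Usage on a synthetic signal (no speech data is bundled with this sketch).
rng = np.random.default_rng(0)
x = np.sin(0.05 * np.arange(512)) + 0.1 * rng.standard_normal(512)
feats = band_log_energies(x, levels=3, frame_len=64)
mean, modes, coords = pod_modes(feats)
```

With the 0.5 normalization used here, the detail bands and the final approximation sum back to the input signal, which gives a quick sanity check on the decomposition.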

How to Cite this Article

Mohammed Anwer and Rezwan-Al-Islam Khan, “Voice identification Using a Composite Haar Wavelets and Proper Orthogonal Decomposition,” International Journal of Innovation and Applied Studies, vol. 4, no. 2, pp. 353–358, October 2013.
