Xiao-Bing Zhang, Ting-Ting Sun, Yan-Ping Li. An Algorithm of Voice Activity Detection Based on EMD and Wavelet Entropy Ratio[J]. Journal of Electronic Science and Technology, 2017, 15(1): 64-68. DOI: 10.11989/JEST.1674-862X.6010715

An Algorithm of Voice Activity Detection Based on EMD and Wavelet Entropy Ratio

doi: 10.11989/JEST.1674-862X.6010715
Funds: This work was supported by the Significant Projects of Anhui University of Technology under Grant No. 14206003.

  • Author Bio:

    Xiao-Bing Zhang. His current research is measurement and control technology; e-mail: zxb13013113592@126.com
    Ting-Ting Sun. Her current interest is speech recognition; e-mail: 609655209@qq.com
    Yan-Ping Li. Currently pursuing the M.S. degree with Anhui University of Technology, with a current interest in speech recognition; e-mail: 1021754704@qq.com

  • Authors’ information: Xiao-Bing Zhang, e-mail: zxb13013113592@126.com
  • Received Date: 2016-01-06
  • Revision Received Date: 2016-06-15
  • Publish Date: 2017-03-24
  • Abstract: A new method is proposed to identify speech-segment endpoints based on empirical mode decomposition (EMD) and a new wavelet entropy ratio, with the aim of improving the accuracy of voice activity detection. With EMD, the noisy signal is decomposed into several intrinsic mode functions (IMFs). The proposed wavelet energy entropy ratio is then used to extract the desired feature from each IMF component. Because voice endpoint detection based on the original wavelet entropy ratio cannot adapt to low signal-to-noise ratio (SNR) conditions, an appropriate positive constant is introduced into the basic wavelet energy entropy ratio, which effectively improves the discriminability between speech and noise. A comparison of the traditional and the proposed wavelet energy entropy ratios shows that the proposed method is simple and fast, and that speech endpoints can be accurately detected in low-SNR environments.
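The feature described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything below is an assumption made for illustration: the Haar wavelet, the number of decomposition levels, the log-energy numerator, the placement of the positive constant `c` in the denominator, and all function names. The EMD step is omitted; in the described method the feature would be computed on each IMF rather than on the raw frame.

```python
import math

def haar_subband_energies(frame, levels=3):
    """Energies [D1, ..., D_levels, A_levels] of a Haar wavelet decomposition.

    Assumption: the Haar wavelet and 3 levels are illustrative choices only.
    """
    approx = list(frame)
    energies = []
    for _ in range(levels):
        if len(approx) % 2:  # drop an odd trailing sample
            approx = approx[:-1]
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies

def wavelet_energy_entropy(energies):
    """Shannon entropy of the relative sub-band energies."""
    total = sum(energies)
    if total == 0.0:
        return 0.0
    probs = [e / total for e in energies if e > 0.0]
    return -sum(p * math.log(p) for p in probs)

def energy_entropy_ratio(frame, levels=3, c=1.0):
    """Hypothetical energy-to-entropy ratio with a positive constant c.

    Assumption: c is added to the entropy in the denominator; the
    source only states that a positive constant is introduced.
    """
    if c <= 0.0:
        raise ValueError("c must be positive")
    energies = haar_subband_energies(frame, levels)
    energy = sum(energies)
    entropy = wavelet_energy_entropy(energies)
    return math.log(1.0 + energy) / (entropy + c)
```

In this sketch a frame whose energy is concentrated in few sub-bands has low entropy and therefore a higher ratio than a frame of equal energy spread evenly across sub-bands, which is the kind of separation between speech and broadband noise that the abstract refers to. The constant `c` also keeps the ratio finite when the entropy is zero.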