A Study on English Audio Segmentation Methods Based on Threshold Value and Energy Sequence

Open Access

Issue: MATEC Web of Conferences, Volume 22, 2015, International Conference on Engineering Technology and Application (ICETA 2015)
Article Number: 02017
Number of page(s): 5
Section: Electric and Electronic Engineering
DOI: https://doi.org/10.1051/matecconf/20152202017
Published online: 09 July 2015

