On Enabling Techniques for Personal Audio Content Management

Lahti, Tommi; Helén, Marko; Vuorinen, Olli; Väyrynen, Eero; Partala, Juha; Peltola, Johannes; Mäkelä, Satu-Marja

This paper discusses state-of-the-art automatic analysis tools for personal audio content management. Our main goal is to create a system in which several cooperating management tools for an audio database improve each other's results. A Bayesian-network-based audio classification algorithm classifies audio into four main classes (silence, speech, music, and noise) and serves as the first step for the subsequent analysis tools. For speech analysis, we propose an improved speaker segmentation and clustering algorithm based on the Bayesian information criterion, which also applies a combined gender and emotion detection algorithm that uses prosodic features. For the other main classes, it is often hard to devise a general, well-functioning pre-categorization that would fit the unforeseeable types of user-recorded data. To compensate for the absence of analysis tools for these classes, we propose an efficient audio similarity measure and a query-by-example algorithm with database clustering capabilities. The experimental results show that the combined use of the algorithms is feasible in practice.
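The abstract names Bayesian information criterion (BIC) based speaker segmentation without giving its details. As a rough illustration only, the sketch below implements the widely used textbook delta-BIC test for a single change point in a window of feature frames. It is not the improved variant proposed in the paper. The diagonal-covariance Gaussian model, the function names `delta_bic` and `best_change_point`, and the parameters `margin` and `penalty_weight` are all assumptions made for this example.

```python
import math


def _log_det_diag(frames):
    """Sum of log per-dimension ML variances (diagonal-covariance Gaussian)."""
    n = len(frames)
    dims = len(frames[0])
    total = 0.0
    for k in range(dims):
        column = [frame[k] for frame in frames]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        total += math.log(var + 1e-6)  # small floor avoids log(0)
    return total


def delta_bic(frames, split, penalty_weight=1.0):
    """Textbook delta-BIC for splitting `frames` at index `split`.

    A positive value favours two separate Gaussians (a change point)
    over one Gaussian for the whole window.
    """
    n = len(frames)
    dims = len(frames[0])
    left, right = frames[:split], frames[split:]
    # Assumption: diagonal covariance, so each model has `dims` means
    # plus `dims` variances; splitting adds that many extra parameters.
    n_params = 2 * dims
    penalty = 0.5 * penalty_weight * n_params * math.log(n)
    gain = 0.5 * (n * _log_det_diag(frames)
                  - len(left) * _log_det_diag(left)
                  - len(right) * _log_det_diag(right))
    return gain - penalty


def best_change_point(frames, margin=10, penalty_weight=1.0):
    """Return (index, score) of the best split, or (None, score) if none is positive."""
    candidates = range(margin, len(frames) - margin)
    if not candidates:  # window too short to test any split
        return None, float("-inf")
    best = max(candidates, key=lambda i: delta_bic(frames, i, penalty_weight))
    score = delta_bic(frames, best, penalty_weight)
    return (best, score) if score > 0 else (None, score)
```

In this sketch, `frames` is a list of equal-length feature vectors, for example one vector of cepstral coefficients per analysis frame. Raising `penalty_weight` makes the test more conservative, so fewer change points are reported.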


audio content management

Book title:
ACM International Conference on Multimedia Information Retrieval (MIR 2008)
Vancouver, Canada