Archetypal analysis for audio dictionary learning

Diment, Aleksandr; Virtanen, Tuomas

This paper proposes dictionary learning with archetypes for audio processing. Archetypes are so-called pure types: each is a combination of a few data points, and the archetypes can in turn be combined to reconstruct a data point. The concept has proved useful in various problems, but it has not yet been applied to audio analysis. The proposed algorithm performs archetypal analysis by minimising the generalised Kullback-Leibler divergence, which has been shown to be suitable for audio, between an observation and the model. The methodology is evaluated in a source separation scenario (mixtures of speech) and yields results comparable to the state of the art, with perceptual measures indicating its superiority over all of the competing methods for medium-size dictionaries.
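
As a rough illustration of the model described in the abstract, the sketch below builds archetypes as combinations of data points, reconstructs the data from those archetypes, and measures the fit with the generalised Kullback-Leibler divergence. It is a minimal sketch under assumptions, not the authors' algorithm: the names `generalised_kl`, `random_simplex_columns`, `B` and `A` are hypothetical, the sum-to-one constraint on the columns follows the classical formulation of archetypal analysis and may differ from the paper's, the data are random stand-ins for a magnitude spectrogram, and no optimisation is performed.

```python
import numpy as np

def generalised_kl(X, Y, eps=1e-12):
    """Generalised KL divergence D(X || Y), summed over all entries."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # eps guards against log(0) and division by zero.
    return float(np.sum(X * np.log((X + eps) / (Y + eps)) - X + Y))

def random_simplex_columns(rows, cols, rng):
    """Random non-negative matrix whose columns each sum to one."""
    M = rng.random((rows, cols))
    return M / M.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
n_bins, n_frames, n_archetypes = 6, 20, 3

# Stand-in for a non-negative magnitude spectrogram (bins x frames).
X = rng.random((n_bins, n_frames))

# B mixes data points into archetypes; A mixes archetypes back into data.
# Both are random here; a learning algorithm would optimise them (assumption).
B = random_simplex_columns(n_frames, n_archetypes, rng)
A = random_simplex_columns(n_archetypes, n_frames, rng)

archetypes = X @ B        # each column is a convex combination of frames
X_hat = archetypes @ A    # each frame is a convex combination of archetypes

divergence = generalised_kl(X, X_hat)
print(f"Generalised KL divergence of the random model: {divergence:.4f}")
```

A learning procedure would adjust `B` and `A` to reduce `divergence`; the dictionary used for tasks such as source separation would then be the matrix `archetypes`.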


Keywords:
archetypes; audio analysis; non-negative matrix factorisation; sparse representation

Book title:
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)