Audio research group - Tampere University

Microphone arrays link the physical locations of sound sources to the computer software that processes them, and they allow the sound field to be captured. Their applications include determining the physical location of a source (speaker localization), signal enhancement, and source separation. A distant speech interface, for example, can benefit from enhancing the captured signal with a microphone array.

Sound localization

Sound source localization combines spatial capture of the signals with activity detection. During active segments, the spatial information carried by the observed wave can be extracted with microphone array processing. A standard source localization method is based on steered response power (SRP). The placement of the microphones affects the localization capability of the array.
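
As an illustration of the steered response power idea, the following minimal Python sketch scans candidate directions and picks the one where the phase-aligned sum of the microphone spectra has the most power. It assumes a far-field source, a 2D array, PHAT weighting, and a simulated white-noise source. The function names `srp_phat` and `simulate_plane_wave`, the array geometry, and all parameter values are illustrative assumptions, not the group's implementation.

```python
import numpy as np

def srp_phat(signals, mic_pos, fs, angles, c=343.0):
    """Steered response power with PHAT weighting (far field, 2D).

    signals: (M, N) time-domain frames, mic_pos: (M, 2) positions in metres,
    angles: candidate azimuths in radians. Returns one power value per angle.
    """
    n = signals.shape[1]
    spectra = np.fft.rfft(signals, axis=1)
    spectra = spectra / (np.abs(spectra) + 1e-12)    # PHAT: keep phase only
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    power = np.empty(len(angles))
    for i, theta in enumerate(angles):
        u = np.array([np.cos(theta), np.sin(theta)])  # unit vector towards the source
        delays = -(mic_pos @ u) / c                   # relative arrival delay per microphone
        # Undo the hypothesised delays, sum over microphones, measure the power.
        steer = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
        power[i] = np.sum(np.abs(np.sum(spectra * steer, axis=0)) ** 2)
    return power

def simulate_plane_wave(source, mic_pos, fs, theta, c=343.0):
    """Delay a source signal to each microphone (circular, fractional delays)."""
    u = np.array([np.cos(theta), np.sin(theta)])
    delays = -(mic_pos @ u) / c
    freqs = np.fft.rfftfreq(len(source), d=1.0 / fs)
    spec = np.fft.rfft(source)
    shifted = spec[None, :] * np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft(shifted, n=len(source), axis=1)

# Hypothetical setup: four microphones on a 10 cm square, source at 60 degrees.
rng = np.random.default_rng(0)
fs = 16000
mic_pos = 0.05 * np.array([[1, 1], [-1, 1], [-1, -1], [1, -1]], dtype=float)
source = rng.standard_normal(4096)
signals = simulate_plane_wave(source, mic_pos, fs, np.deg2rad(60.0))

angles = np.deg2rad(np.arange(0, 360, 2))
estimate_deg = np.rad2deg(angles[np.argmax(srp_phat(signals, mic_pos, fs, angles))])
print(f"estimated azimuth: {estimate_deg:.0f} degrees")
```

In practice the scan would run only on frames that the activity detector marks as active, and the grid would usually cover 3D positions or directions instead of a single azimuth.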

Bibliography

conference

Acoustic Scene Classification Using Higher-Order Ambisonic Features
Marc C. Green, Sharath Adavanne, Damian Murphy, Tuomas Virtanen, 2019

conference

Time Difference of Arrival Estimation of Speech Signals Using Deep Neural Networks with Integrated Time-frequency Masking
Pasi Pertilä, Mikko Parviainen, 2019

article

Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks
Sharath Adavanne, Archontis Politis, Joonas Nikunen, Tuomas Virtanen, 2018

conference

Direction of Arrival Estimation for Multiple Sound Sources Using Convolutional Recurrent Neural Network
Sharath Adavanne, Archontis Politis, Tuomas Virtanen, 2018

article

Multichannel Blind Sound Source Separation using Spatial Covariance Model with Level and Time Differences and Non-Negative Matrix Factorization
Julio Jose Carabias Orti, Joonas Nikunen, Tuomas Virtanen, Pedro Vera-Candeas, 2018

article

Separation of Moving Sound Sources Using Multichannel NMF and Acoustic Tracking
Joonas Nikunen, Aleksandr Diment, Tuomas Virtanen, 2017

article

Self-localization of dynamic user-worn microphones from observed speech
Mikko Parviainen, Pasi Pertilä, 2017

conference

Sound event detection using spatial features and convolutional recurrent neural network
Sharath Adavanne, Pasi Pertilä, Tuomas Virtanen, 2017

conference

Increasing the environment-awareness of rake beamforming for directive acoustic sources
Pasi Pertilä, Alessio Brutti, 2016

article

Distant speech separation using predicted time-frequency masks from spatial features
Pasi Pertilä, Joonas Nikunen, 2015

conference

Microphone Array Post-Filtering Using Supervised Machine Learning for Speech Enhancement
Pasi Pertilä, Joonas Nikunen, 2014

conference

Self-localization of Wireless Acoustic Sensors in Meeting Rooms
Mikko Parviainen, Pasi Pertilä, Matti S. Hämäläinen, 2014

conference

Multichannel audio separation by Direction of Arrival Based Spatial Covariance Model and Non-negative Matrix Factorization
Joonas Nikunen, Tuomas Virtanen, 2014

article

Passive temporal offset estimation of multichannel recordings of an ad-hoc microphone array
Pasi Pertilä, Matti S. Hämäläinen, Mikael Mieskolainen, 2013

article

Online Blind Speech Separation using Multiple Acoustic Speaker Tracking and Time-Frequency Masking
Pasi Pertilä, 2013

conference

Passive Self-Localization of Microphones Using Ambient Sounds
Pasi Pertilä, Mikael Mieskolainen, Matti Hämäläinen, 2012

conference

Closed-Form Self-Localization of Asynchronous Microphone Arrays
Pasi Pertilä, Mikael Mieskolainen, Matti S. Hämäläinen, 2011

conference

Low-complexity angle of arrival estimation of wideband signals using small arrays
Jari Yli-Hietanen, Kari Kalliojärvi, Jaakko Astola, 1996

inbook

Time–Frequency Domain Spatial Audio Enhancement

inbook

Microphone-Array-Based Speech Enhancement Using Neural Networks
Pasi Pertilä

Self-localization

Multichannel signal processing methods that rely on geometry, such as beamforming and speaker localization, require knowledge of the microphone locations. Devices used as distant talking interfaces, such as smartphones and laptops, are ubiquitous, inherently asynchronous, and have a known microphone and loudspeaker layout. Using such devices jointly in geometrical multichannel signal processing applications depends on accurate knowledge of the microphone placements, i.e., the rotation and translation of the devices. This information is too cumbersome to measure by hand. This research focuses on the automatic localization and synchronization of the device microphones. Self-localization provides useful spatial information and enables the use of array signal processing methods developed originally for fixed arrays with known geometry.
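
To make the "rotation and translation" formulation concrete, here is a small Python sketch that recovers a device's pose from its known microphone layout and from estimated microphone positions in a shared coordinate frame. It uses the standard SVD-based least-squares rigid alignment (Kabsch/Procrustes), which is an assumption chosen for illustration and not necessarily the method used in the group's publications. The function name `fit_rigid_transform` and the example layout are hypothetical.

```python
import numpy as np

def fit_rigid_transform(local_pts, global_pts):
    """Least-squares rotation R and translation t with global ≈ R @ local + t.

    local_pts:  (M, D) microphone layout in the device's own frame (known).
    global_pts: (M, D) estimated microphone positions in the shared frame.
    """
    mu_l = local_pts.mean(axis=0)
    mu_g = global_pts.mean(axis=0)
    # Cross-covariance between the centred point sets.
    H = (local_pts - mu_l).T @ (global_pts - mu_g)
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (determinant +1), not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0] * (H.shape[0] - 1) + [d])
    R = Vt.T @ D @ U.T
    t = mu_g - R @ mu_l
    return R, t

# Hypothetical device with three microphones (metres, device frame).
layout = np.array([[0.0, 0.0], [0.10, 0.0], [0.0, 0.05]])

# Ground-truth pose used only to generate example data.
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
t_true = np.array([2.0, 1.0])
observed = layout @ R_true.T + t_true

R_est, t_est = fit_rigid_transform(layout, observed)
print("rotation (deg):", np.rad2deg(np.arctan2(R_est[1, 0], R_est[0, 0])))
print("translation (m):", t_est)
```

This sketch covers only the geometric alignment step. Obtaining the position estimates from audio, and synchronizing the asynchronous devices, are separate problems.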

Bibliography

article

Self-localization of dynamic user-worn microphones from observed speech
Mikko Parviainen, Pasi Pertilä, 2017

conference

Self-localization of Wireless Acoustic Sensors in Meeting Rooms
Mikko Parviainen, Pasi Pertilä, Matti S. Hämäläinen, 2014

article

Passive temporal offset estimation of multichannel recordings of an ad-hoc microphone array
Pasi Pertilä, Matti S. Hämäläinen, Mikael Mieskolainen, 2013

conference

Passive Self-Localization of Microphones Using Ambient Sounds
Pasi Pertilä, Mikael Mieskolainen, Matti Hämäläinen, 2012

conference

Closed-Form Self-Localization of Asynchronous Microphone Arrays
Pasi Pertilä, Mikael Mieskolainen, Matti S. Hämäläinen, 2011

Speaker position tracking

Real-life sound sources are typically non-stationary and do not emit sound continuously. To keep track of their physical location while they move and during periods of inactivity, tracking methods such as Kalman filtering, extended Kalman filtering, and particle filtering have been employed. When multiple speakers are present, each observation or measurement must also be associated with the correct speaker, and tracks must be created and deleted as speakers become active or inactive.
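
The sketch below shows, under simplifying assumptions, how a Kalman filter can bridge inactivity periods: it keeps predicting with a constant-velocity motion model and only corrects the state when a position measurement is available. It handles a single source in 2D; data association and track creation or deletion for multiple speakers are not covered. The function name `kalman_track`, the motion model, and the noise parameters are illustrative assumptions.

```python
import numpy as np

def kalman_track(measurements, dt=0.1, accel_std=0.5, meas_std=0.05):
    """Constant-velocity Kalman filter for 2D positions.

    measurements: (T, 2) array of position estimates; rows containing NaN
    mark frames where the source was inactive (prediction only).
    Returns the (T, 2) filtered position track.
    """
    F = np.eye(4)                       # state: [x, y, vx, vy]
    F[0, 2] = F[1, 3] = dt
    H = np.zeros((2, 4))                # only the position is observed
    H[0, 0] = H[1, 1] = 1.0
    G = np.array([[0.5 * dt**2, 0.0], [0.0, 0.5 * dt**2], [dt, 0.0], [0.0, dt]])
    Q = accel_std**2 * G @ G.T          # process noise from random acceleration
    R = meas_std**2 * np.eye(2)         # measurement noise

    x = np.zeros(4)
    P = 10.0 * np.eye(4)                # vague prior on the initial state
    track = np.empty((len(measurements), 2))
    for k, z in enumerate(measurements):
        # Predict one step ahead with the motion model.
        x = F @ x
        P = F @ P @ F.T + Q
        # Correct only when the source was active in this frame.
        if np.all(np.isfinite(z)):
            S = H @ P @ H.T + R
            K = P @ H.T @ np.linalg.inv(S)
            x = x + K @ (z - H @ x)
            P = (np.eye(4) - K @ H) @ P
        track[k] = x[:2]
    return track

# Hypothetical example: a talker walking in a straight line, silent for 2 s.
rng = np.random.default_rng(0)
t = 0.1 * np.arange(100)
truth = np.stack([1.0 + 0.5 * t, 2.0 - 0.2 * t], axis=1)
noisy = truth + 0.05 * rng.standard_normal(truth.shape)
noisy[40:60] = np.nan                   # inactivity period
track = kalman_track(noisy)
print("final position error (m):", np.linalg.norm(track[-1] - truth[-1]))
```

The extended Kalman filter and the particle filter follow the same predict-and-correct pattern but allow nonlinear measurement models, such as time differences of arrival, in place of direct position measurements.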

Bibliography

conference

A Track Before Detect Approach for Sequential Bayesian Tracking of Multiple Speech Sources
Pasi Pertilä, Matti S. Hämäläinen, 2010

conference

A real-time talker localization implementation using multi-PHAT and particle filter
Antti Löytynoja, Pasi Pertilä, 2009

phdthesis

Acoustic Source Localization in a Room Environment and at Moderate Distances
Pasi Pertilä, 2009

article

Measurement combination for acoustic source localization in a room environment
Pasi Pertilä, Teemu Korhonen, Ari Visa, 2008

conference

TUT acoustic source tracking system 2007
Teemu Korhonen, Pasi Pertilä, 2007

conference

TUT acoustic source tracking system 2006
Pasi Pertilä, Teemu Korhonen, Tuomo Pirinen, Mikko Parviainen, 2007

conference

A speaker localization system for lecture room environment
Mikko Parviainen, Tuomo Pirinen, Pasi Pertilä, 2006

Beamforming and enhancement

Beamforming is the linear combination of multiple input signals using a set of complex weights, with the aim of enhancing the target signal. The literature offers many methods for estimating the beamforming weights under various criteria, and recent developments in deep learning have further improved the estimation of these parameters. Our research contribution in this field is a new type of spatial feature that has been shown to be important for learning to predict a post-filter for the beamformer. The post-filter further enhances the signal by removing unwanted interference and noise components.
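
The "linear combination with complex weights" can be illustrated with the simplest choice of weights, the delay-and-sum beamformer, which computes y(f) = w(f)^H x(f) per frequency bin. The Python sketch below assumes a far-field target, a 2D array, and frequency-domain processing. The function names, geometry, and test signals are hypothetical, and the learned post-filter described above is not included.

```python
import numpy as np

def steering_vector(mic_pos, theta, freqs, c=343.0):
    """Far-field steering vector d(f), shape (M, F), for azimuth theta (radians)."""
    u = np.array([np.cos(theta), np.sin(theta)])
    delays = -(mic_pos @ u) / c
    return np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])

def delay_and_sum_weights(mic_pos, theta, freqs, c=343.0):
    """Weights w(f) = d(f) / M, giving unit gain towards theta."""
    return steering_vector(mic_pos, theta, freqs, c) / mic_pos.shape[0]

def beamform(weights, spectra):
    """Apply y(f) = w(f)^H x(f) to microphone spectra of shape (M, F)."""
    return np.sum(np.conj(weights) * spectra, axis=0)

# Hypothetical setup: four microphones on a 10 cm square.
rng = np.random.default_rng(0)
mic_pos = 0.05 * np.array([[1, 1], [-1, 1], [-1, -1], [1, -1]], dtype=float)
freqs = np.linspace(0.0, 8000.0, 257)

# Target from 60 degrees, interferer from 200 degrees (random spectra).
s_target = rng.standard_normal(257) + 1j * rng.standard_normal(257)
s_interf = rng.standard_normal(257) + 1j * rng.standard_normal(257)
x_target = s_target * steering_vector(mic_pos, np.deg2rad(60.0), freqs)
x_interf = s_interf * steering_vector(mic_pos, np.deg2rad(200.0), freqs)

w = delay_and_sum_weights(mic_pos, np.deg2rad(60.0), freqs)
y_target = beamform(w, x_target)    # target passes with unit gain
y_interf = beamform(w, x_interf)    # interferer is attenuated on average

print("target distortion:", np.max(np.abs(y_target - s_target)))
print("interferer power ratio:",
      np.mean(np.abs(y_interf) ** 2) / np.mean(np.abs(s_interf) ** 2))
```

With such a small array the attenuation at low frequencies is limited, which is one reason a post-filter applied to the beamformer output can bring further improvement.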
