Publications

2016

book
T. Virtanen et al. Proceedings of the Detection and Classification of Acoustic Scenes and Events 2016 Workshop (DCASE2016), Tampere University of Technology. Department of Signal Processing, 2016.
inbook
A. Diment, T. Virtanen, M. Parviainen, R. Zelov and A. Glasman. "Noise-Robust Detection of Whispering in Telephone Calls Using Deep Neural Networks". IEEE. 2016.
inbook
S. I. Mimilakis, K. Drossos, T. Virtanen and G. Schuller. "Deep Neural Networks for Dynamic Range Compression in Mastering Applications". AES Audio Engineering Society. 2016.
conference
G. Parascandolo, H. Huttunen and T. Virtanen. "Recurrent Neural Networks for Polyphonic Sound Event Detection in Real Life Recordings". 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2016. pp. 6440-6444.
conference
M. Valenti, A. Diment, G. Parascandolo, S. Squartini and T. Virtanen. "DCASE 2016 Acoustic Scene Classification Using Convolutional Neural Networks". Proceedings of the Detection and Classification of Acoustic Scenes and Events 2016 Workshop (DCASE2016). 2016.
article
T. Barker and T. Virtanen. "Blind Separation of Audio Mixtures Through Nonnegative Tensor Factorization of Modulation Spectrograms", IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 24, 12, 2016, pp. 2377-2389.
article
J. Nikunen, A. Diment, T. Virtanen and M. Vilermo. "Binaural rendering of microphone array captures based on source separation", Speech Communication, Vol. 76. 2016, pp. 157-169.
conference
P. Pertilä and A. Brutti. "Increasing the environment-awareness of rake beamforming for directive acoustic sources". 15th International Workshop on Acoustic Signal Enhancement (IWAENC). 2016.
conference
G. Naithani, G. Parascandolo, T. Barker, N. H. Pontoppidan and T. Virtanen. "Low-Latency Sound Source Separation Using Deep Neural Networks". IEEE Global Conference on Signal and Information Processing. 2016.

2015

conference
A. Hurmalainen, R. Saeidi and T. Virtanen. "Noise Robust Speaker Recognition with Convolutive Sparse Coding". Proceedings of 16th Interspeech. 2015.
article
T. Virtanen, J. Gemmeke, B. Raj and P. Smaragdis. "Compositional Models for Audio Processing", IEEE Signal Processing Magazine, March, 2015.
article
K. Drossos, A. Floros and K. L. Kermanidis. "Evaluating the Impact of Sound Events’ Rhythm Characteristics to Listener’s Valence", Journal of the Audio Engineering Society, Vol. 63. 2015, pp. 139-153.
article
T. Virtanen, J. Gemmeke, B. Raj and P. Smaragdis. "Compositional Models for Audio Processing: Uncovering the structure of sound mixtures", IEEE Signal Processing Magazine, Vol. 32. 2015, pp. 125 - 144.
conference
D. Battaglino, A. Mesaros, L. Lepauloux, L. Pilati and N. Evans. "Acoustic context recognition for mobile devices using a reduced complexity SVM". European Signal Processing Conference (EUSIPCO-2015). 2015. pp. 534-538.
article
P. Pertilä and J. Nikunen. "Distant speech separation using predicted time–frequency masks from spatial features", Speech Communication, Vol. 68. 2015, pp. 97-106.
article
U. Simsekli, T. Virtanen and A. T. Cemgil. "Non-negative Tensor Factorization Models for Bayesian Audio Processing", Digital Signal Processing. 2015.
conference
E. Cakir, T. Heittola, H. Huttunen and T. Virtanen. "Multi-label vs. combined single-label sound event detection with deep neural networks". 23rd European Signal Processing Conference 2015 (EUSIPCO 2015). 2015.
conference
E. Cakir, T. Heittola, H. Huttunen and T. Virtanen. "Polyphonic sound event detection using multi label deep neural networks". International Joint Conference on Neural Networks 2015 (IJCNN 2015). 2015.
conference
A. Diment, E. Cakir, T. Heittola and T. Virtanen. "Automatic recognition of environmental sound events using all-pole group delay features". European Signal Processing Conference (EUSIPCO 2015). 2015.
conference
A. Mesaros, T. Heittola, O. Dikmen and T. Virtanen. "Sound event detection in real life recordings using coupled matrix factorization of spectral representations and class activity annotations". Proceedings of 40th IEEE International Conference on Audio, Speech and Signal Processing (ICASSP). 2015. pp. 151-155.
article
P. Pertilä and J. Nikunen. "Distant speech separation using predicted time-frequency masks from spatial features", Speech Communication, Vol. 68. 2015, pp. 97-106.
article
K. Drossos, A. Floros, A. Giannakoulopoulos and N. Kanellopoulos. "Investigating the Impact of Sound Angular Position on the Listener Affective State", IEEE Transactions on Affective Computing, Vol. 6, 1, 2015, pp. 27-42.
conference
D. Baby, J. Gemmeke, T. Virtanen and H. V. Hamme. "Exemplar-based speech enhancement for deep neural network based automatic speech recognition". ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. 2015. pp. 4485-4489.
article
D. Baby, T. Virtanen, J. Gemmeke and H. V. Hamme. "Coupled dictionaries for exemplar-based speech enhancement and automatic speech recognition", IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 23, 11, 2015, pp. 1788-1799.
article
E. Räsänen, O. Pulkkinen, T. Virtanen, M. Zollner and H. Hennig. "Fluctuations of Hi-Hat Timing and Dynamics in a Virtuoso Drum Track of a Popular Music Recording", PLoS ONE, Vol. 10. 2015.
conference
D. Baby, J. Gemmeke, T. Virtanen and H. V. Hamme. "Exemplar-based speech enhancement for deep neural network based automatic speech recognition". IEEE International Conference on Acoustics, Speech and Signal Processing. 2015.
conference
S. Drgas and T. Virtanen. "Speaker verification using adaptive dictionaries in non-negative spectrogram deconvolution". 12th International Conference on Latent Variable Analysis and Signal Separation. 2015.
conference
T. Barker, T. Virtanen and N. H. Pontoppidan. "Low-Latency Sound-Source-Separation using Non-Negative Matrix Factorisation with Coupled Analysis and Synthesis Dictionaries". ICASSP 2015. 2015.
conference
A. Diment and T. Virtanen. "Archetypal analysis for audio dictionary learning". IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2015.
conference
A. Hurmalainen, R. Saeidi and T. Virtanen. "Similarity Induced Group Sparsity for Non-negative Matrix Factorisation". Proceedings of 40th IEEE International Conference on Audio, Speech and Signal Processing (ICASSP). 2015. pp. 4425-4429.

2014

article
J. Nikunen and T. Virtanen. "Direction of Arrival Based Spatial Covariance Model for Blind Sound Source Separation", IEEE/ACM Transactions on Audio, Speech & Language Processing, Vol. 22, March, 2014, pp. 727-739.
conference
T. Barker, H. V. Hamme and T. Virtanen. "Modelling Primitive Streaming of Simple Tone Sequences Through Factorisation of Modulation Pattern Tensors". INTERSPEECH2014, 15th Annual Conference of the International Speech Communication Association, 14-18 September 2014, Singapore. 2014. pp. 1371-1375.
conference
T. Barker, T. Virtanen and O. Delhomme. "Ultrasound-Coupled Semi-Supervised Nonnegative Matrix Factorisation for Speech Enhancement". 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, May 4-9, 2014. 2014. pp. 2148-2152.
conference
D. Baby, T. Virtanen, T. Barker and H. V. Hamme. "Coupled Dictionary Training for Exemplar-Based Speech Enhancement". 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4-9 May 2014, Florence. 2014. pp. 2883 - 2887.
conference
G. Sanchez, H. Silén, J. Nurminen and M. Gabbouj. "Hierarchical modeling of F0 contours for voice conversion". INTERSPEECH 2014, Proceedings of the 15th Annual Conference of the International Speech Communication Association, 14-18 September 2014, Singapore. 2014. pp. 2318-2321.
conference
O. Gencoglu, T. Virtanen and H. Huttunen. "Recognition of Acoustic Events Using Deep Neural Networks". 2014.
conference
T. Barker and T. Virtanen. "Semi-supervised non-negative tensor factorisation of modulation spectrograms for monaural speech separation". Neural Networks (IJCNN), 2014 International Joint Conference on. 2014. pp. 3556-3561.
article
T. Heittola, A. Mesaros, D. Korpi, A. Eronen and T. Virtanen. "Method for creating location-specific audio textures", EURASIP Journal on Audio, Speech and Music Processing, Vol. 2014. 2014.
conference
T. Virtanen, B. Raj, J. Gemmeke and H. V. Hamme. "Active-set Newton algorithm for non-negative sparse coding of audio". In Proc. International Conference on Acoustics, Speech, and Signal Processing. 2014.
article
Z. Wu, T. Virtanen, E. S. Chng and H. Li. "Exemplar-based sparse representation with residual compensation for voice conversion", IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 22. 2014.
conference
D. Baby, T. Virtanen, T. Barker and H. V. Hamme. "Coupled Dictionary Training for Exemplar-based Speech Enhancement". International Conference on Acoustics, Speech, and Signal Processing. 2014.
conference
D. Baby, T. Virtanen, J. Gemmeke, T. Barker and H. V. Hamme. "Exemplar-based noise robust automatic speech recognition using modulation spectrogram features". IEEE Spoken Language Technology Workshop. 2014.
conference
T. Barker, H. V. Hamme and T. Virtanen. "Modelling Primitive Streaming of Simple Tone Sequences Through Factorisation of Modulation Pattern Tensors". INTERSPEECH 2014. 2014.
conference
J. Nikunen and T. Virtanen. "Multichannel audio separation by Direction of Arrival Based Spatial Covariance Model and Non-negative Matrix Factorization". Proceedings of 39th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2014. pp. 6727-6731.
conference
P. Pertilä and J. Nikunen. "Microphone Array Post-Filtering Using Supervised Machine Learning for Speech Enhancement". INTERSPEECH 2014 - 15th Annual Conference of the International Speech Communication Association. 2014.
conference
M. Parviainen, P. Pertilä and M. S. Hämäläinen. "Self-localization of Wireless Acoustic Sensors in Meeting Rooms". 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA). 2014.
incollection
A. Diment, P. Rajan, T. Heittola and T. Virtanen. "Group Delay Function from All-Pole Models for Musical Instrument Recognition". Aramaki et al eds. Springer International Publishing. 2014. pp. 606-618.

2013

conference
A. Hurmalainen and T. Virtanen. "Learning State Labels for Sparse Classification of Speech with Matrix Deconvolution". Proceedings of the Automatic Speech Recognition and Understanding Workshop (ASRU). 2013.
article
P. Pertilä, M. S. Hämäläinen and M. Mieskolainen. "Passive temporal offset estimation of multichannel recordings of an ad-hoc microphone array", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 21, Nov., 2013, pp. 2393-2402.
conference
A. Diment, T. Heittola and T. Virtanen. "Semi-supervised Learning for Musical Instrument Recognition". 21st European Signal Processing Conference 2013 (EUSIPCO 2013). 2013.
conference
A. Hurmalainen and T. Virtanen. "Acquiring Variable Length Speech Bases for Factorisation-Based Noise Robust Speech Recognition". Proceedings of the 21st European Signal Processing Conference (EUSIPCO). 2013.
conference
J. Geiger et al. "The TUM+TUT+KUL Approach to the CHiME Challenge 2013: Multi-Stream ASR Exploiting BLSTM Networks and Sparse NMF". Proceedings of the 2nd CHiME workshop. 2013. pp. 25-30.
conference
J. Gemmeke, T. Virtanen and A. Hurmalainen. "HMM-Regularization for NMF-Based Noise Robust ASR". Proceedings of the 2nd CHiME workshop. 2013. pp. 47-52.
conference
A. Hurmalainen, J. Gemmeke and T. Virtanen. "Compact Long Context Spectral Factorisation Models for Noise Robust Recognition of Medium Vocabulary Speech". Proceedings of the 2nd CHiME workshop. 2013. pp. 13-18.
article
P. Pertilä. "Online Blind Speech Separation using Multiple Acoustic Speaker Tracking and Time-Frequency Masking", Computer Speech & Language, Vol. 27, May, 2013, pp. 683–702.
article
A. Hurmalainen, J. Gemmeke and T. Virtanen. "Modelling Non-stationary Noise with Spectral Factorisation in Automatic Speech Recognition", Computer Speech & Language, Vol. 27, May, 2013, pp. 763-779.
conference
J. Geiger et al. "The TUM+TUT+KUL Approach to the 2nd CHiME Challenge: Multi-Stream ASR Exploiting BLSTM Networks and Sparse NMF". The 2nd International Workshop on Machine Listening in Multisource Environments CHiME Workshop, 1st June 2013, Vancouver, Canada (in conjunction with ICASSP). 2013. pp. 25-30.
conference
J. Gemmeke, A. Hurmalainen and T. Virtanen. "HMM-regularization for NMF-based noise robust ASR". The 2nd International Workshop on Machine Listening in Multisource Environments CHiME Workshop, 1st June 2013, Vancouver, Canada (in conjunction with ICASSP). 2013. pp. 47-52.
conference
J. Nurminen, H. Silen, E. Helander and M. Gabbouj. "Evaluation of detailed modeling of the LP residual in statistical speech synthesis". 2013 IEEE International Symposium on Circuits and Systems, May 19-23, 2013, Beijing, China. 2013. pp. 313-316.
conference
J. Nurminen, H. Silen and M. Gabbouj. "Speaker-specific retraining for enhanced compression of unit selection text-to-speech databases". Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), 25-29 August, Lyon, France. 2013. pp. 388-391.
conference
H. Silen, J. Nurminen, E. Helander and M. Gabbouj. "Voice Conversion for Non-Parallel Datasets Using Dynamic Kernel Partial Least Squares Regression". Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), 25-29 August, Lyon, France. 2013. pp. 373-377.
conference
K. Mahkonen et al. "Music Dereverberation by Spectral Linear Prediction in Live Recordings". 16th International Conference on Digital Audio Effects, Ireland, 2-5 September 2013. 2013.
conference
T. Barker and T. Virtanen. "Non-negative Tensor Factorisation of Modulation Spectrograms for Monaural Sound Source Separation". Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), 25-29 August, Lyon, France. 2013. pp. 827 - 831.
conference
A. Mesaros, T. Heittola and K. Palomäki. "Query-by-example retrieval of sound events using an integrated similarity measure of content and label". 14th International Workshop on Image and Audio Analysis for Multimedia Interactive Services (WIA2MIS). 2013. pp. 1-4.
article
T. Heittola, A. Mesaros, A. Eronen and T. Virtanen. "Context-Dependent Sound Event Detection", EURASIP Journal on Audio, Speech and Music Processing. 2013.
conference
A. Mesaros, T. Heittola and K. Palomäki. "Analysis of acoustic-semantic relationship for diversely annotated real-world audio data". Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. 2013. pp. 813-817.
conference
T. Heittola, A. Mesaros, T. Virtanen and M. Gabbouj. "Supervised Model Training for Overlapping Sound Events Based on Unsupervised Source Separation". Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. 2013.
inbook
F. Briggs et al. "The 9th Annual MLSP Competition: New Methods For Acoustic Classification Of Multiple Simultaneous Bird Species In a Noisy Environment". Institute of Electrical and Electronics Engineers IEEE. 2013.
conference
P. Pertilä and A. Tinakari. "Time-of-Arrival Estimation for Blind Beamforming". 2013.
conference
A. Diment, R. Padmanabhan, T. Heittola and T. Virtanen. "Modified Group Delay Feature for Musical Instrument Recognition". 10th International Symposium on Computer Music Multidisciplinary Research (CMMR). 2013.
conference
J. Gemmeke, T. Virtanen and K. Demuynck. "Exemplar-based joint channel and noise compensation". In Proc. International Conference on Acoustics, Speech, and Signal Processing. 2013.
conference
Z. Wu, T. Virtanen, T. Kinnunen, E. S. Chng and H. Li. "Exemplar-based Voice Conversion using Non-negative Spectrogram Deconvolution". In Proc. 8th ISCA Speech Synthesis Workshop. 2013.
conference
Z. Wu, T. Virtanen, T. Kinnunen, E. S. Chng and H. Li. "Exemplar-based unit selection for voice conversion utilizing temporal information". In Proc. Interspeech. 2013.
conference
J. Kauppinen, A. Klapuri and T. Virtanen. "Music Self-Similarity Modeling Using Augmented Nonnegative Matrix Factorization of Block and Stripe Patterns". In Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2013.
article
T. Virtanen, J. Gemmeke and B. Raj. "Active-Set Newton Algorithm for Overcomplete Non-Negative Representations of Audio", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 21. 2013.
article
A. Diment, R. Padmanabhan, T. Heittola and T. Virtanen. "Modified Group Delay Feature for Musical Instrument Recognition", 10th International Symposium on Computer Music Multidisciplinary Research (CMMR). 2013.

2012

article
J. Nikunen, T. Virtanen and M. Vilermo. "Multichannel Audio Upmixing by Time-Frequency Filtering Using Non-Negative Tensor Factorization", Journal of the Audio Engineering Society, Vol. 60, October, 2012, pp. 794-806.
conference
A. Hurmalainen, J. Gemmeke and T. Virtanen. "Detection, Separation and Recognition of Speech From Continuous Signals Using Spectral Factorisation". 20th European Signal Processing Conference (EUSIPCO). 2012. pp. 2649-2653.
article
E. Helander, H. Silén, T. Virtanen and M. Gabbouj. "Voice Conversion Using Dynamic Kernel Partial Least Squares Regression", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 3, March, 2012, pp. 806 - 817.
conference
V. Popa, H. Silén, J. Nurminen and M. Gabbouj. "Local Linear Transformation for Voice Conversion". ICASSP. 2012.
article
S. Kiranyaz, T. Mäkinen and M. Gabbouj. "Dynamic and scalable audio classification by collective network of binary classifiers framework: An evolutionary approach", Neural Networks, Vol. 34. 2012, pp. 80-95.
conference
H. Silen, E. Helander, J. Nurminen and M. Gabbouj. "Ways to Implement Global Variance in Statistical Speech Synthesis". Proceedings of 13th Annual Conference of the International Speech Communication Association, Interspeech 2012, September 9 - 13, Portland, Oregon, USA. 2012. pp. 1-4.
conference
J. Nurminen, H. Silen, V. Popa, E. Helander and M. Gabbouj. "Voice Conversion". Speech Enhancement, Modeling and Recognition: Algorithms and Applications. S. Ramakrishnan ed. 2012. pp. 1-27.
article
T. Mäkinen, S. Kiranyaz, J. Raitoharju and M. Gabbouj. "An evolutionary feature synthesis approach for content-based audio retrieval", EURASIP Journal on Audio, Speech, and Music Processing. 2012.
book
T. Virtanen, R. Singh and B. Raj. Techniques for Noise Robustness in Automatic Speech Recognition, John Wiley & Sons, 2012.
conference
J. Nurminen, H. Silén, V. Popa, E. Helander and M. Gabbouj. "Ch. Voice Conversion in Speech Enhancement, Modeling and Recognition - Algorithms and Applications". S. Ramakrishnan ed. 2012.
conference
F. Weninger et al. "Non-Negative Matrix Factorization for Highly Noise-Robust ASR: to Enhance or to Recognize?". In Proc. 37th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2012.
conference
A. Hurmalainen and T. Virtanen. "Modelling spectro-temporal dynamics in factorisation-based noise-robust automatic speech recognition". Proc. 37th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2012.
conference
F. Mazhar, T. Heittola, T. Virtanen and J. Holm. "Automatic Scoring of Guitar Chords". Proc. AES 45th International Conference. 2012.
conference
T. Mäkinen, S. Kiranyaz, J. Pulkkinen and M. Gabbouj. "Evolutionary Feature Generation for Content-based Audio Classification and Retrieval". 20th European Signal Processing Conference (EUSIPCO). 2012.
article
J. Nikunen, T. Virtanen and M. Vilermo. "Multichannel Audio Upmixing by Time-Frequency Filtering Using Non-Negative Tensor Factorization", Journal of the Audio Engineering Society, Vol. 60. 2012, pp. 794-806.
article
D. Korpi, T. Heittola, T. Partala, A. Eronen, A. Mesaros and T. Virtanen. "On the human ability to discriminate audio ambiances from similar locations of an urban environment", Personal and Ubiquitous Computing, Vol. November 2012. 2012.
conference
P. Pertilä, M. Mieskolainen and M. Hämäläinen. "Passive Self-Localization of Microphones Using Ambient Sounds". Proc. 20th European Signal Processing Conference (EUSIPCO-2012). 2012.
conference
R. Saeidi, A. Hurmalainen, T. Virtanen and D. van Leeuwen. "Exemplar-based Sparse Representation and Sparse Discrimination for Noise Robust Speaker Identification". Proc. Odyssey 2012: The Speaker and Language Recognition Workshop. 2012.
conference
A. B. Rad and T. Virtanen. "Phase spectrum prediction of audio signals". 5th International Symposium on Communications, Control and Signal Processing. 2012.
conference
J. Nikunen, T. Virtanen, P. Pertilä and M. Vilermo. "Permutation Alignment of Frequency-Domain ICA by the Maximization of Intra-Source Envelope Correlations". European Signal Processing Conference (EUSIPCO). 2012.
conference
A. Hurmalainen, R. Saeidi and T. Virtanen. "Group Sparsity for Speaker Identity Discrimination in Factorisation-based Speech Recognition". 13th Interspeech. 2012.
conference
F. Rodriguez-Serrano, J. J. Orti, P. Vera-Candeas, T. Virtanen and N. Ruiz-Reyes. "Multiple Instrument Mixtures Source Separation Evaluation Using Instrument-Dependent NMF Models". The 10th International Conference on Latent Variable Analysis and Source Separation. 2012.

2011

article
J. Gemmeke, T. Virtanen and A. Hurmalainen. "Exemplar-based Sparse Representations for Noise Robust Automatic Speech Recognition", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, September, 2011, pp. 2067-2080.
conference
A. Hurmalainen, K. Mahkonen, J. Gemmeke and T. Virtanen. "Exemplar-based Recognition of Speech in Highly Variable Noise". Proc. International Workshop on Machine Listening in Multisource Environments (CHiME). 2011. pp. 1-5.