Publications
2020
conference
K. Drossos, S. I. Mimilakis, S. Gharib, Y. Li and T. Virtanen. "Sound Event Detection with Depthwise Separable and Dilated Convolutions". IEEE World Congress on Computational Intelligence (WCCI) 2020. 2020.
conference
Y. Li, M. Liu, K. Drossos and T. Virtanen. "Sound Event Detection Via Dilated Convolutional Recurrent Neural Networks". ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2020. pp. 286-290.
conference
A.-J. Muñoz-Montoro, J. J. Carabias-Orti, A. Politis and K. Drossos. "Multichannel Singing Voice Separation by Deep Neural Network Informed DOA Constrained CNMF". IEEE International Workshop on Multimedia Signal Processing (MMSP). 2020.
conference
S. I. Mimilakis, K. Drossos and G. Schuller. "Unsupervised Interpretable Representation Learning for Singing Voice Separation". 28th European Signal Processing Conference. 2020.
conference
X. Favory, K. Drossos, T. Virtanen and X. Serra. "COALA: Co-Aligned Autoencoders for Learning Semantically Enriched Audio Representations". International Conference on Machine Learning (ICML). 2020.
conference
P. Pyykkönen, S. I. Mimilakis, K. Drossos and T. Virtanen. "Depthwise Separable Convolutions Versus Recurrent Neural Networks for Monaural Singing Voice Separation". IEEE International Workshop on Multimedia Signal Processing (MMSP). 2020.
conference
N. Nicodemo, G. Naithani, K. Drossos, T. Virtanen and R. Saletti. "Memory Requirement Reduction of Deep Neural Networks Using Low-bit Quantization of Parameters". 28th European Signal Processing Conference. 2020.
article
P. Magron and T. Virtanen. "Online Spectrogram Inversion for Low-Latency Audio Source Separation", IEEE Signal Processing Letters, Vol. 27. 2020, pp. 306-310.
conference
K. Drossos, S. Lipping and T. Virtanen. "Clotho: an Audio Captioning Dataset". IEEE 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2020. pp. 736-740.
2019
conference
A. Mesaros, T. Heittola and T. Virtanen. "Acoustic scene classification in DCASE 2019 Challenge: closed and open set classification and data mismatch setups". Proceedings of Workshop on Detection and Classification of Acoustic Scenes and Events, 2019. 2019.
article
S. I. Mimilakis, K. Drossos, E. Cano and G. Schuller. "Examining the Mapping Functions of Denoising Autoencoders in Singing Voice Separation", IEEE-ACM Transactions on Audio Speech and Language Processing, Vol. 28. 2019, pp. 266-278.
conference
S. Lipping, K. Drossos and T. Virtanen. "Crowdsourcing a Dataset of Audio Captions". Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019). 2019.
conference
K. Drossos, S. Gharib, P. Magron and T. Virtanen. "Language Modelling for Sound Event Detection with Teacher Forcing and Scheduled Sampling". Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019). 2019.
conference
K. Drossos, P. Magron and T. Virtanen. "Unsupervised Adversarial Domain Adaptation Based On The Wasserstein Distance For Acoustic Scene Classification". 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2019.
conference
I. Martín-Morató, A. Mesaros, T. Heittola, T. Virtanen, M. Cobos and F. J. Ferri. "Sound Event Envelope Estimation in Polyphonic Mixtures". ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2019. pp. 935-939.
conference
P. Pertilä and M. Parviainen. "Time Difference of Arrival Estimation of Speech Signals Using Deep Neural Networks with Integrated Time-frequency Masking". 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings. 2019. pp. 436-440.
conference
A. Diment, E. Fagerlund, A. Benfield and T. Virtanen. "Detection of Typical Pronunciation Errors in Non-native English Speech Using Convolutional Recurrent Neural Networks". 2019 International Joint Conference on Neural Networks, IJCNN 2019. 2019.
article
H. Purwins, B. Li, T. Virtanen, J. Schlüter, S.-Y. Chang and T. Sainath. "Deep Learning for Audio Signal Processing", IEEE Journal of Selected Topics in Signal Processing, Vol. 13, 5, 2019, pp. 206-219.
article
V. M. Garcia-Molla, P. S. Juan, T. Virtanen, A. M. Vidal and P. Alonso. "Generalization of the K-SVD algorithm for minimization of β-divergence", Digital Signal Processing, Vol. 92, 9, 2019, pp. 47-53.
conference
S. Wang, G. Naithani and T. Virtanen. "Low-latency Deep Clustering for Speech Separation". 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings. 2019. pp. 76-80.
article
A. Mesaros et al. "Sound Event Detection in the DCASE 2017 Challenge", IEEE-ACM Transactions on Audio Speech and Language Processing, Vol. 27, 6, 2019, pp. 992-1006.
conference
P. Pertilä. "Data-Dependent Ensemble of Magnitude Spectrum Predictions for Single Channel Speech Enhancement". 2019 IEEE 21st International Workshop on Multimedia Signal Processing (MMSP). 2019.
conference
H. Xie and T. Virtanen. "Zero-Shot Audio Classification Based On Class Label Embeddings". 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2019. pp. 264-267.
conference
S. Adavanne, A. Politis and T. Virtanen. "Localization, Detection and Tracking of Multiple Moving Sound Sources with a Convolutional Recurrent Neural Network". Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019). 2019. pp. 20-24.
conference
M. C. Green, S. Adavanne, D. Murphy and T. Virtanen. "Acoustic Scene Classification Using Higher-Order Ambisonic Features". 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2019. pp. 328-332.
conference
S. Adavanne, A. Politis and T. Virtanen. "A Multi-room Reverberant Dataset for Sound Event Localization and Detection". Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019). 2019. pp. 10-14.
conference
M. N. Ahsan, C. Kertesz, A. Mesaros, T. Heittola, A. Knight and T. Virtanen. "Audio-Based Epileptic Seizure Detection". 2019 27th European Signal Processing Conference (EUSIPCO). 2019.
conference
A. Mesaros, S. Adavanne, A. Politis, T. Heittola and T. Virtanen. "Joint Measurement of Localization and Detection of Sound Events". 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2019. pp. 333-337.
conference
H. L. Bear, T. Heittola, A. Mesaros, E. Benetos and T. Virtanen. "City Classification from Multiple Real-World Sound Scenes". 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2019. pp. 11-15.
2018
conference
A. Mesaros, T. Heittola and T. Virtanen. "A multi-device dataset for urban acoustic scene classification". Proceedings of the Detection and Classification of Acoustic Scenes and Events 2018 Workshop. 2018. pp. 9-13.
conference
P. Magron and T. Virtanen. "Expectation-maximization algorithms for Itakura-Saito nonnegative matrix factorization". Interspeech 2018. 2018.
conference
E. Cakir and T. Virtanen. "Musical Instrument Synthesis and Morphing in Multidimensional Latent Space Using Variational, Convolutional Recurrent Autoencoders". Proceedings of the Audio Engineering Society 145th Convention. 2018.
conference
P. Magron, K. Drossos, S. I. Mimilakis and T. Virtanen. "Reducing Interference with Phase Recovery in DNN-based Monaural Singing Voice Separation". Interspeech. 2018.
conference
S. Gharib, K. Drossos, E. Cakir, D. Serdyuk and T. Virtanen. "Unsupervised Adversarial Domain Adaptation for Acoustic Scene Classification". Detection and Classification of Acoustic Scenes and Events. 2018.
conference
S. I. Mimilakis, E. Cano, D. FitzGerald, K. Drossos and G. Schuller. "Examining The Perceptual Effect of Alternative Objective Functions for Deep Learning Based Music Source Separation". IEEE Asilomar Conference on Signals, Systems, and Computers. 2018.
conference
S. I. Mimilakis, K. Drossos, J. F. Santos, G. Schuller, T. Virtanen and Y. Bengio. "Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask". 2018.
article
G. Naithani, J. Kivinummi, T. Virtanen, O. Tammela, M. J. Peltola and J. M. Leppänen. "Automatic segmentation of infant cry signals using hidden Markov models", EURASIP Journal on Audio, Speech, and Music Processing, Vol. 2018. 2018.
article
J. J. Carabias-Orti, J. Nikunen, T. Virtanen and P. Vera-Candeas. "Multichannel Blind Sound Source Separation using Spatial Covariance Model with Level and Time Differences and Non-Negative Matrix Factorization", IEEE-ACM Transactions on Audio Speech and Language Processing, 4, 2018.
article
P. Magron and T. Virtanen. "Complex ISNMF: a phase-aware model for monaural audio source separation", IEEE-ACM Transactions on Audio Speech and Language Processing, Vol. 27, 10, 2018, pp. 20-31.
conference
J. Nikunen and T. Virtanen. "Estimation of time-varying room impulse responses of multiple sound sources from observed mixture and isolated source signals". 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings. 2018. pp. 421-425.
conference
E. Cakir and T. Virtanen. "End-to-End Polyphonic Sound Event Detection Using Convolutional Recurrent Neural Networks with Learned Time-Frequency Representation Input". 2018 International Joint Conference on Neural Networks, IJCNN 2018 - Proceedings. 2018.
conference
K. Drossos, S. I. Mimilakis, D. Serdyuk, G. Schuller, T. Virtanen and Y. Bengio. "MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation". Proceedings of the IEEE World Congress on Computational Intelligence (WCCI)/International Joint Conference on Neural Networks (IJCNN). 2018.
article
S. Adavanne, A. Politis, J. Nikunen and T. Virtanen. "Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks", IEEE Journal of Selected Topics in Signal Processing, 12, 2018.
conference
G. Naithani, J. Nikunen, L. Bramsløw and T. Virtanen. "Deep neural network based speech separation optimizing an objective estimator of intelligibility for low latency applications". 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018. 2018. pp. 386-390.
conference
P. Magron and T. Virtanen. "On modeling the STFT phase of audio signals with the von Mises distribution". 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018. 2018.
conference
M. Parviainen, P. Pertilä, T. Virtanen and P. Grosche. "Time-frequency masking strategies for single-channel low-latency speech enhancement using neural networks". 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018. 2018. pp. 51-55.
conference
A. Mesaros, T. Heittola and T. Virtanen. "Acoustic scene classification: An overview of DCASE 2017 challenge entries". 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018. 2018. pp. 411-415.
conference
G. Huang, T. Heittola and T. Virtanen. "Using sequential information in polyphonic sound event detection". 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018. 2018. pp. 291-295.
conference
P. Magron and T. Virtanen. "Bayesian anisotropic Gaussian model for audio source separation". 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2018.
conference
P. Magron and T. Virtanen. "Towards Complex Nonnegative Matrix Factorization with the Beta-Divergence". 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC). 2018. pp. 156-160.
conference
S. Gharib et al. "Acoustic Scene Classification: A Competition Review". 2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018. 2018.
conference
S. Adavanne, A. Politis and T. Virtanen. "Direction of Arrival Estimation for Multiple Sound Sources Using Convolutional Recurrent Neural Network". 2018 26th European Signal Processing Conference (EUSIPCO). 2018. pp. 1462-1466.
conference
K. Drossos, P. Magron, S. I. Mimilakis and T. Virtanen. "Harmonic-Percussive Source Separation with Deep Neural Networks and Phase Recovery". 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018. 2018. pp. 421-425.
article
A. Mesaros et al. "Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge", IEEE-ACM Transactions on Audio Speech and Language Processing, 11, 2018.
inbook
A. Mesaros, T. Heittola and D. Ellis. "Datasets and Evaluation". T. Virtanen, M. D. Plumbley and D. Ellis eds. Springer. 2018. pp. 147-179.
inbook
T. Heittola, E. Cakir and T. Virtanen. "The machine learning approach for analysis of sound scenes and events". T. Virtanen, M. D. Plumbley and D. Ellis eds. Springer. 2018. pp. 13-40.
2017
article
M. Parviainen and P. Pertilä. "Self-localization of dynamic user-worn microphones from observed speech", Applied Acoustics, Vol. 117, Part A, 2017, pp. 76-85.
conference
K. Drossos, S. Adavanne and T. Virtanen. "Automated Audio Captioning with Recurrent Neural Networks". IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2017.
conference
D. Caballero et al. "ASR in classroom today: Automatic visualization of conceptual network in science classrooms". Data Driven Approaches in Digital Education - 12th European Conference on Technology Enhanced Learning, EC-TEL 2017, Proceedings. 2017. pp. 541-544.
conference
K. Drossos, S. I. Mimilakis, A. Floros, T. Virtanen and G. Schuller. "Close Miking Empirical Practice Verification: A Source Separation Approach". Audio Engineering Society Convention 142. 2017.
conference
P. Magron, J. L. Roux and T. Virtanen. "Consistent Anisotropic Wiener Filtering for Audio Source Separation". IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2017. pp. 269-273.
conference
E. Cakir and T. Virtanen. "Convolutional Recurrent Neural Networks for Rare Sound Event Detection". Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017). 2017. pp. 27-31.
conference
J. Nikunen and T. Virtanen. "Time-difference of arrival model for spherical microphone arrays and application to direction of arrival estimation". Proceedings of 25th European Signal Processing Conference. 2017. pp. 1255-1259.
conference
A. Diment and T. Virtanen. "Transfer Learning of Weakly Labelled Audio". IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2017. pp. 6-10.
conference
S. I. Mimilakis, K. Drossos, T. Virtanen and G. Schuller. "A Recurrent Encoder-Decoder Approach With Skip-Filtering Connections for Monaural Singing Voice Separation". 27th IEEE International Workshop on Machine Learning for Signal Processing (MLSP). 2017.
conference
E. Cakir, S. Adavanne, G. Parascandolo, K. Drossos and T. Virtanen. "Convolutional recurrent neural networks for bird audio detection". European Signal Processing Conference. 2017. pp. 1744-1748.
conference
S. Adavanne and T. Virtanen. "Sound event detection using weakly labeled dataset with stacked convolutional and recurrent neural network". Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017). 2017. pp. 12-16.
conference
S. Adavanne, K. Drossos, E. Cakir and T. Virtanen. "Stacked convolutional and recurrent neural networks for bird audio detection". 2017 25th European Signal Processing Conference (EUSIPCO). 2017. pp. 1729-1733.
conference
Z. Shuyang, T. Heittola and T. Virtanen. "Learning vocal mode classifiers from heterogeneous data sources". 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2017. pp. 16-20.
conference
A. Mesaros et al. "DCASE 2017 challenge setup: tasks, datasets and baseline system". Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017). 2017. pp. 85-92.
conference
A. Mesaros, T. Heittola and T. Virtanen. "Assessment of human and machine performance in acoustic scene classification: DCASE 2016 case study". 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2017. pp. 319-323.
conference
S. Adavanne, P. Pertilä and T. Virtanen. "Sound event detection using spatial features and convolutional recurrent neural network". IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017). 2017.
conference
M. Malik, S. Adavanne, K. Drossos, T. Virtanen, D. Ticha and R. Jarina. "Stacked convolutional and recurrent neural networks for music emotion recognition". Sound and Music Computing Conference. 2017.
conference
Z. Shuyang, T. Heittola and T. Virtanen. "Active Learning for Sound Event Classification by Clustering Unlabeled Data". 2017.
conference
M. Parviainen and P. Pertilä. "Obtaining an optimal set of head-related transfer functions with a small amount of measurements". 2017 IEEE International Workshop on Signal Processing Systems (SiPS). 2017.
conference
P. Magron, R. Badeau and A. Liutkus. "Lévy NMF : un modèle robuste de séparation de sources non-négatives". Actes du XXVIème Colloque GRETSI. 2017.
conference
P. Magron, R. Badeau and A. Liutkus. "Lévy NMF for robust nonnegative source separation". 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2017. pp. 259-263.
conference
P. Magron, R. Badeau and B. David. "Phase-dependent anisotropic Gaussian model for audio source separation". 42nd International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2017. pp. 531-535.
inbook
D. Ellis, T. Virtanen, M. D. Plumbley and B. Raj. "Future Perspective". T. Virtanen, M. D. Plumbley and D. Ellis eds. Springer. 2017. pp. 401-415.
book
T. Virtanen, M. D. Plumbley and D. Ellis. Computational analysis of sound scenes and events, Springer, 2017.
inbook
T. Virtanen, M. D. Plumbley and D. Ellis. "Introduction to sound scene and event analysis". T. Virtanen, M. D. Plumbley and D. Ellis eds. Springer. 2017. pp. 3-12.
article
G. Richard, T. Virtanen, J. P. Bello, N. Ono and H. Glotin. "Introduction to the Special Section on Sound Scene and Event Analysis", IEEE-ACM Transactions on Audio Speech and Language Processing, Vol. 25, 6, 2017, pp. 1169-1171.
article
J. Nikunen, A. Diment and T. Virtanen. "Separation of Moving Sound Sources Using Multichannel NMF and Acoustic Tracking", IEEE-ACM Transactions on Audio Speech and Language Processing, 11, 2017.
inbook
J. Nikunen and T. Virtanen. "Source Separation and Reconstruction of Spatial Audio Using Spectrogram Factorization". V. Pulkki, S. Delikaris-Manias and A. Politis eds. John Wiley & Sons. 2017.
conference
J. M. Perez-Macias, S. Adavanne, J. Viik, A. Värri, S.-L. Himanen and M. Tenhunen. "Assessment of support vector machines and convolutional neural networks to detect snoring using Emfit mattress". 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). 2017. pp. 2883-2886.
article
S. Drgas, T. Virtanen, J. Lücke and A. Hurmalainen. "Binary Non-Negative Matrix Deconvolution for Audio Dictionary Learning", IEEE-ACM Transactions on Audio Speech and Language Processing, Vol. 25, 8, 2017, pp. 1644-1656.
book
T. Virtanen et al. Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), Tampere University of Technology. Laboratory of Signal Processing, 2017.
conference
M. Valenti, S. Squartini, A. Diment, G. Parascandolo and T. Virtanen. "A convolutional neural network approach for acoustic scene classification". 2017 International Joint Conference on Neural Networks, IJCNN 2017. 2017. pp. 1547-1554.
article
E. Cakir, G. Parascandolo, T. Heittola, H. Huttunen and T. Virtanen. "Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection", IEEE-ACM Transactions on Audio Speech and Language Processing, Vol. 25, 6, 2017, pp. 1291-1303.
article
P. Maijala, Z. Shuyang, T. Heittola and T. Virtanen. "Environmental noise monitoring using source classification in sensors", Applied Acoustics, Vol. 129, 8, 2017, pp. 258-267.
conference
G. Naithani, T. Barker, G. Parascandolo, L. Bramsløw, N. H. Pontoppidan and T. Virtanen. "Low Latency Sound Source Separation using Convolutional Recurrent Neural Networks". IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2017.
conference
P. Pertilä and E. Cakir. "Robust Direction Estimation with Convolutional Neural Networks-based Steered Response Power". ICASSP. 2017.
2016
conference
K. Mahkonen, A. Hurmalainen, T. Virtanen and J.-K. Kämäräinen. "Cascade processing for speeding up sliding window sparse classification". European Signal Processing Conference (EUSIPCO), 2016. 2016.
article
A. Mesaros, T. Heittola and T. Virtanen. "Metrics for polyphonic sound event detection", Applied Sciences, Vol. 6. 2016, pp. 162.
conference
S. Adavanne, G. Parascandolo, P. Pertilä, T. Heittola and T. Virtanen. "Sound event detection in multichannel audio using spatial and harmonic features". Detection and Classification of Acoustic Scenes and Events. 2016.
conference
A. Mesaros, T. Heittola and T. Virtanen. "TUT Database for Acoustic Scene Classification and Sound Event Detection". 2016.
techreport
G. Parascandolo, P. Pertilä, T. Heittola and T. Virtanen. "Sound event detection in real life audio". 2016.
article
K. Drossos, M. Kaliakatsos-Papakostas, A. Floros and T. Virtanen. "On the Impact of The Semantic Content of Sound Events in Emotion Elicitation", Journal of the Audio Engineering Society, Vol. 64, 8, 2016, pp. 525-532.