Speech and Cognition research group

Unit of Computing Sciences,
Faculty of Information Technology and Communication Sciences,
Tampere University,
Finland


Journal articles and book chapters

Lahtinen, K., Mustanoja, L. & Räsänen, O. (in press). FinnAffect: An affective speech corpus for spontaneous Finnish. Speech Communication.

Räsänen, O. & Kocharov, D. (2025). A pipeline for stochastic and controlled generation of realistic language input for simulating infant language acquisition. Behavior Research Methods, 57, article no. 275, https://doi.org/10.3758/s13428-025-02772-6.

Vaaras, E., Airaksinen, M. & Räsänen, O. (2025). PFML: Self-supervised learning of time-series data without representation collapse. IEEE Access, 13, 60233–60244, https://doi.org/10.1109/ACCESS.2025.3556957.

Airaksinen, M., Räsänen, O. & Vanhatalo, S. (2025). Trade-offs between simplifying IMU-based movement recordings and the attainability of different levels of analyses: Systematic assessment of method variations. JMIR mHealth and uHealth, 13, 58078, https://doi.org/10.2196/58078.

Liu, S., Reddy, M.K., Yagnavajjula, M.K., Räsänen, O., Alku, P., Ikävalko, T., Hakanpää, T., Öyry, A., & Laukkanen, A-M. (in press). Automatic classification of strain in the singing voice using machine learning. Journal of Voice, https://doi.org/10.1016/j.jvoice.2025.03.040.

Khorrami, K. & Räsänen, O. (2025). A model of early word acquisition based on realistic-scale audiovisual naming events. Speech Communication, 167, 103169, https://doi.org/10.1016/j.specom.2024.103169.

Vaaras, E., Airaksinen, M. & Räsänen, O. (2025). IAR 2.0: An algorithm for refining inconsistent annotations for time-series data using discriminative classifiers. IEEE Access, 13, 19979–19995, https://doi.org/10.1109/ACCESS.2025.3534637.

Cruz Blandón, M. A., Gonzalez-Gomez, N., Lavechin, M., & Räsänen, O. (2025). Simulating prenatal language exposure in computational models: an exploration study. Cognition, 256, 106044, https://doi.org/10.1016/j.cognition.2024.106044.

Räsänen, O., Airaksinen, M., Marchi, V., Chorna, O., Guzzetta, A., & Festante, F. (2025). Motherese directed at prelinguistic infants at risk for neurological disorders: an exploratory study. Journal of Child Language, 52(6), 1249–1279, https://doi.org/10.1017/S0305000924000217.

Xie, H., Khorrami, K., Räsänen, O., & Virtanen, T. (2025). Text-based audio retrieval by learning from similarities between audio captions. IEEE Signal Processing Letters, 32, 221–225, https://doi.org/10.1109/LSP.2024.3511414.

Airaksinen, M., Vaaras, E., Haataja, L., Räsänen, O., & Vanhatalo, S. (2024). Automatic assessment of infant carrying and holding using at-home wearable recordings. Scientific Reports, 14, 4852, https://doi.org/10.1038/s41598-024-54536-5.

Cristia, A., Gautheron, L., Zhang, Z., Shuller, B., Scaff, C., Rowland, C., Räsänen, O., Peurey, L., Lavechin, M., Havard, W., Fausey, C., Cychosz, C., Bergelson, E., Anderson, H., Al Futaisi, N., & Soderstrom, M. (2024). Establishing the reliability of metrics extracted from long-form recordings using LENA and the ACLEW pipeline. Behavior Research Methods, 56, 8588–8607, https://doi.org/10.3758/s13428-024-02493-2.

Cruz Blandón, M. A., Cristia, A., & Räsänen, O. (2023). Introducing meta-analysis in the evaluation of computational models of infant language development. Cognitive Science, 47, e13307, https://doi.org/10.1111/cogs.13307.

Convey, R., Ihalainen, T., Liu, Y., Räsänen, O., Ylinen, S., & Penttilä, N. (2023). A comparative study of automatic vowel articulation index and auditory-perceptual assessments of speech intelligibility in Parkinson's disease. Int. J. Speech and Language Pathology, https://doi.org/10.1080/17549507.2023.2251725.

Airaksinen, M., Taylor, E., Gallen, A., Ilen, E., Saari, A., Sankilampi, U., Räsänen, O., Haataja, L., & Vanhatalo, S. (2023). Charting infants’ motor development at home using a wearable system: validation and comparison to physical growth charts. eBioMedicine, 92, 104591, https://doi.org/10.1016/j.ebiom.2023.104591.

Airaksinen, M., Vanhatalo, S. & Räsänen, O. (2023). Comparison of end-to-end neural network architectures and data augmentation methods for automatic infant motility assessment using wearable sensors. Sensors, 23, 3773, https://doi.org/10.3390/s23073773.

Vaaras, E., Ahlqvist-Björkroth, S., Drossos, K., Lehtonen, K., & Räsänen, O. (2023). Development of a speech emotion recognizer for large-scale child-centered audio recordings from a hospital environment. Speech Communication, 148, 9–22, (ScienceDirect).

Liu, Y., Mittapalle, K. R., Penttilä, N., Ihalainen, T., Alku, P. & Räsänen, O. (2023). Automatic assessment of Parkinson’s disease using speech representations of phonation and articulation. IEEE Transactions on Audio, Speech, and Language Processing, 31, 242–255, (IEEExplore).

Airaksinen, M., Gallen, A., Kivi, A., Vijayakrishnan, P., Häyrinen, T., Ilen, E., Räsänen, O., Haataja, L. & Vanhatalo, S. (2022). Intelligent wearable allows out-of-the-lab tracking of developing motor abilities in infants. Communications Medicine, 2, 69, (Nature.com).

Khorrami, K. & Räsänen, O. (2021). Can phones, syllables, and words emerge as side-products of cross-situational audiovisual learning? – A computational investigation. Language Development Research, 1, 123–193, https://doi.org/10.34842/w3vw-s845.

Liu, Y., Penttilä, N., Ihalainen, T., Lintula, J., Convey, R., & Räsänen, O. (2021). Language-independent approach for automatic computation of vowel articulation features in dysarthric speech assessment. IEEE Transactions on Audio, Speech, and Language Processing, 29, 2228–2243, (IEEExplore).

Räsänen, O., Seshadri, S., Lavechin, M., Cristia, A., & Casillas, M. (2021). ALICE: An open-source tool for automatic measurement of phoneme, syllable, and word counts from child-centered daylong recordings. Behavior Research Methods, 53, 818–835, (Springer) (PsyArXiv) (code).

Cristia, A., Lavechin, M., Scaff, C., Soderstrom, M., Rowland, C., Räsänen, O., Bunce, J., & Bergelson, E. (2021). A thorough evaluation of the Language Environment Analysis (LENA) system. Behavior Research Methods, 53, 467–486, (OSF) (Springer).

Vanhatalo, S., Airaksinen, M., Ilen, E., Häyrinen, T., Ranta, J., Räsänen, O., & Haataja, L. (2021). Vauvan älyvaatteet: hypeä ja lupausta paremmasta terveydenhoidosta [Smart clothing for infants: hype and promise of better healthcare]. Duodecim, 137, 596–604, (Duodecim online).

Airaksinen, M., Räsänen, O., Ilén, E., Häyrinen, T., Kivi, A., Marchi, V., Gallen, A., Blom, S., Varhe, A., Kaartinen, N., Haataja, L., & Vanhatalo, S. (2020). Automatic posture and movement tracking of infants with wearable movement sensors. Scientific Reports, 10, 169, (Nature.com) (arXiv).

Räsänen, O., Seshadri, S., Karadayi, J., Riebling, E., Bunce, J., Cristia, A., Metze, F., Casillas, M., Rosemberg, C., Bergelson, E. & Soderstrom, M. (2019). Automatic word count estimation from daylong child-centered recordings in various language environments using language-independent syllabification of speech. Speech Communication, 113, 63–80, (ScienceDirect).

Seshadri, S. & Räsänen, O. (2019). SylNet: an adaptable end-to-end syllable count estimator for speech. IEEE Signal Processing Letters, 26, 1359–1363, (IEEExplore) (arXiv) (code).

Seshadri, S., Juvela, L., Räsänen, O. & Alku, P. (2019). Vocal effort based speaking style conversion using vocoder features and parallel learning. IEEE Access, 7, 17230–17246, (IEEE) (code).

Kakouros, S., Räsänen, O. & Alku, P. (2018). Comparison of spectral tilt measures for sentence prominence in speech — effects of dimensionality and adverse noise conditions. Speech Communication, 103, 11–26, (ScienceDirect) (.pdf).

Räsänen, O., Kakouros, S. & Soderstrom, M. (2018). Is infant-directed speech interesting because it is surprising? — Linking properties of IDS to statistical learning and attention at the prosodic level. Cognition, 178, 193–206, (ScienceDirect) (PsyArXiv) (code).

Kakouros, S., Salminen, N. & Räsänen, O. (2018). Making predictable unpredictable with style — Behavioral and electrophysiological evidence for the critical role of prosodic expectations in the perception of prominence in speech. Neuropsychologia, 109, 181–199, (PsyArXiv).

Räsänen, O., Doyle, G., & Frank, M. C. (2018). Pre-linguistic segmentation of speech into syllable-like units. Cognition, 171, 130–150, (PsyArXiv) (code).

Rasilo H. & Räsänen O. (2017). An online model of vowel imitation learning. Speech Communication, 86, 1–23, (.pdf) (ScienceDirect).

Kakouros S. & Räsänen O. (2016). Perception of sentence stress in speech correlates with the temporal unpredictability of prosodic features. Cognitive Science, 40, 1739–1774, (.pdf).

Räsänen O. & Saarinen J. P. (2016). Sequence prediction with sparse distributed hyperdimensional coding applied to the analysis of mobile phone use patterns. IEEE Transactions on Neural Networks and Learning Systems, 27, 1878–1889, (.pdf) (IEEExplore).

Kakouros S. & Räsänen O. (2016). 3PRO - An unsupervised method for the automatic detection of sentence prominence in speech. Speech Communication, 82, 67–84, (.pdf) (ScienceDirect).

Koolen N., Dereymaeker A., Räsänen O., Jansen K., Vervisch J., Matic V., Naulaers G., De Vos M., Van Huffel S., & Vanhatalo S. (2016). Early development of synchrony in cortical activations in the human. Neuroscience, 322, 298–307, (ScienceDirect).

Räsänen O. & Rasilo H. (2015). A joint model of word segmentation and meaning acquisition through cross-situational learning. Psychological Review, 122(4), 792–829, (.pdf) (APA).

Pohjalainen J., Räsänen O. & Kadioglu S. (2015). Feature selection methods and their combinations in high-dimensional classification of speaker likability, intelligibility and personality traits. Computer Speech and Language, 29, 145–171, (ScienceDirect) (code).

Koolen N., Dereymaeker A., Räsänen O., Jansen K., Vervisch J., Matic V., De Vos M., Van Huffel S., Naulaers G. & Vanhatalo S. (2014). Interhemispheric synchrony in neonatal EEG revisited: Activation Synchrony Index as a promising classifier. Frontiers in Human Neuroscience, 8:1030, doi: 10.3389/fnhum.2014.01030 (Frontiers).

Räsänen O. & Kakouros S. (2014). Modeling dependencies in multiple parallel data streams with hyperdimensional computing. IEEE Signal Processing Letters, 21, 899–903, (IEEExplore) (.pdf).

Räsänen O. & Laine U. K. (2013). Time-frequency integration characteristics of hearing are optimized for perception of speech-like acoustic patterns. The Journal of the Acoustical Society of America, 134, 407–419, (ASA).

Rasilo H., Räsänen O. & Laine U. K. (2013). Feedback and imitation by a caregiver guides a virtual infant to learn native phonemes and the skill of speech inversion. Speech Communication, 55, 909–931, (ScienceDirect).

Räsänen O., Metsäranta M. & Vanhatalo S. (2013). Development of a novel robust measure for interhemispheric synchrony in the neonatal EEG: Activation Synchrony Index (ASI). NeuroImage, 69, 256–266, (ScienceDirect).

Räsänen O. (2012). Computational modeling of phonetic and lexical learning in early language acquisition: existing models and future directions. Speech Communication, 54, 975–997, (.pdf) (ScienceDirect).

Räsänen O. & Laine U. K. (2012). A method for noise-robust context-aware pattern discovery and recognition from categorical sequences. Pattern Recognition, 45, 606–616, (ScienceDirect) (.pdf).

Räsänen O. (2011). A computational model of word segmentation from continuous speech using transitional probabilities of atomic acoustic events. Cognition, 120, 149–176, (ScienceDirect) (.pdf).

Räsänen O., Laine U. K. & Altosaar T. (2011). Blind segmentation of speech using non-linear filtering methods. In Ipsic I. (Ed.): Speech Technologies, InTech Publishing (.pdf).

Papers in peer-reviewed conference proceedings

Vaaras, E., & Airaksinen, M. (2025). Feature space topology control via Hopkins loss. ICTAI-2025, Athens, Greece, pp. 427–432 (arXiv).

Lahtinen, K., Vaaras, E., Mustanoja, L. & Räsänen, O. (2025). Investigating affect mining techniques for annotation sample selection in the creation of Finnish affective speech corpus. Proc. Interspeech-2025, Rotterdam, Netherlands, pp. 3958–3962 (ISCA Archive).

Räsänen, O. & Kocharov, D. (2024). Age-dependent analysis and stochastic generation of child-directed speech. Proc. CogSci-2024, Rotterdam, Netherlands, pp. 5102–5108 (arXiv).

Kocharov, D. & Räsänen, O. (2024). Age-dependent intonational changes in child-directed speech. Proc. Speech Prosody, Leiden, Netherlands, pp. 225–229 (ISCA Archive).

Coffey, J., Räsänen, O., Scaff, C., & Cristia, A. (2024). The difficulty and importance of estimating the lower and upper bounds of infant speech exposure. Proc. Interspeech-2024, Kos, Greece, pp. 3615–3619 (ISCA Archive).

Khorrami, K., Cruz Blandón, M. A., Virtanen, T., & Räsänen, O. (2023). Simultaneous or sequential training? How speech representations cooperate in a multi-task self-supervised learning system. Proc. EUSIPCO-2023, Helsinki, Finland, pp. 431–435 (EURASIP).

Peng, P., Li, S.-W., Räsänen, O., Mohamed, A., & Harwath, D. (2023). Syllable discovery and cross-lingual generalization in a visually grounded, self-supervised speech model. Proc. Interspeech, Dublin, Ireland, pp. 391–395 (ISCA Archive).

Lavechin, M., Sy, Y., Titeux, H., Cruz Blandón, M. A., Räsänen, O., Bredin, H., Dupoux, E., & Cristia, A. (2023). BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models. Proc. Interspeech, Dublin, Ireland, pp. 4588–4592 (ISCA Archive).

Khorrami, K., Cruz Blandón, M. A., & Räsänen, O. (2023). Computational insights to acquisition of phonemes, words, and word meanings in early language: sequential or parallel acquisition? Proc. CogSci-2023, Sydney, Australia (PsyArXiv).

Cruz Blandón, M. A., Cristia, A., & Räsänen, O. (2023). Analysing the impact of audio quality on the use of naturalistic long-form recordings for infant-directed speech research. Proc. CogSci-2023, Sydney, Australia (arXiv).

Räsänen, O., Cruz Blandón, M. A., & Leppänen, J. (2023). Is reliability of cognitive measures in children dependent on participant age? A case study with two large-scale datasets. Proc. CogSci-2023, Sydney, Australia (PsyArXiv).

Vaaras, E., Airaksinen, M., Vanhatalo, S., & Räsänen, O. (2023). Evaluation of self-supervised pre-training for automatic infant movement classification using wearable movement sensors. Proc. EMBC-2023, Sydney, Australia (arXiv).

Xie, H., Räsänen, O., & Virtanen, T. (2022). On negative sampling for contrastive audio-text retrieval. Proc. ICASSP-2023, Rhodes Island, Greece (arXiv).

Vaaras, E., Airaksinen, M. & Räsänen, O. (2022). Analysis of self-supervised learning and dimensionality reduction methods in clustering-based active learning for speech emotion recognition. Proc. Interspeech-2022, Incheon, South Korea (arXiv).

Xie, H., Räsänen, O., Drossos, K. & Virtanen, T. (2022). Unsupervised audio-caption aligning learns correspondences between individual sound events and textual phrases. Proc. ICASSP-2022, Singapore (arXiv).

Vaaras, E., Ahlqvist-Björkroth, S., Drossos, K. & Räsänen, O. (2021). Automatic analysis of the emotional content of speech in daylong child-centered recordings from a neonatal intensive care unit. Proc. Interspeech-2021, Brno, Czech Republic (arXiv).

Khorrami, K. & Räsänen, O. (2021). Evaluation of audio-visual alignments in visually grounded speech models. Proc. Interspeech-2021, Brno, Czech Republic (arXiv).

Xie, H., Räsänen, O., & Virtanen, T. (2021). Zero-shot audio classification with factored linear and nonlinear acoustic-semantic projections. Proc. ICASSP-2021, Toronto, Canada, pp. 326–330 (arXiv).

Räsänen, O. & Cruz Blandón, M. A. (2020). Unsupervised discovery of recurring speech patterns using probabilistic adaptive metrics. Proc. Interspeech-2020, Shanghai, China, pp. 4871–4875 (arXiv).

Cruz Blandón, M. A. & Räsänen, O. (2020). Analysis of predictive coding models for phonemic representation learning in small datasets. Proc. ICML-2020 Workshop on Self-Supervision in Audio and Speech, held as a virtual conference (web).

MacDonald, K., Räsänen, O., Casillas, M., & Warlaumont, A. (2020). Measuring prosodic predictability in children’s home language environments. Proc. CogSci-2020, held as a virtual conference, pp. 695–701 (.pdf).

Räsänen, O. & Khorrami, K. (2019). A computational model of early language acquisition from audiovisual experiences of young infants. Proc. Interspeech-2019, Graz, Austria, pp. 3594–3598 (arXiv).

Seshadri, S., Juvela, L., Alku, P., & Räsänen, O. (2019). Augmented CycleGANs for continuous scale normal-to-Lombard speaking style conversion. Proc. Interspeech-2019, Graz, Austria, pp. 2838–2842 (.pdf).

Airaksinen, M., Juvela, L., Alku, P., & Räsänen, O. (2019). Data augmentation strategies for neural network F0 estimation. Proc. ICASSP-2019, Brighton, UK, pp. 6485–6489 (.pdf) (code).

Seshadri, S., Juvela, L., Räsänen, O. & Alku, P. (2019). Cycle-consistent adversarial networks for non-parallel vocal effort based speaking style conversion. Proc. ICASSP-2019, Brighton, UK, pp. 6835–6839 (.pdf) (code).

Räsänen, O., Seshadri, S. & Casillas, M. (2018). Comparison of syllabification algorithms and training strategies for robust word count estimation across different languages and recording conditions. Proc. Interspeech-2018, Hyderabad, India, pp. 701–705 (.pdf).

Airaksinen, M., Juvela, L., Räsänen, O. & Alku, P. (2018). Time-regularized linear prediction for noise-robust extraction of the spectral envelope of speech. Proc. Interspeech-2018, Hyderabad, India, pp. 1200–2014 (.pdf).

Räsänen, O., Kakouros, S. & Soderstrom, M. (2017). Connecting stimulus-driven attention to the properties of infant-directed speech – Is exaggerated intonation also more surprising? Proceedings of the 39th Annual Conference of the Cognitive Science Society, London, UK, pp. 998–1003 (.pdf) (code).

Seshadri, S., Remes, U., & Räsänen, O. (2017). Comparison of non-parametric Bayesian mixture models for syllable clustering and zero-resource speech processing. Proc. Interspeech-2017, Stockholm, Sweden, pp. 2744–2748 (.pdf) (code).

Kakouros, S., Räsänen, O. & Alku, P. (2017). Evaluation of spectral tilt measures for sentence prominence under different noise conditions. Proc. Interspeech-2017, Stockholm, Sweden, pp. 3211–3215 (.pdf).

Ramirez Lopez, A., Seshadri, S., Juvela, L., Räsänen, O. & Alku, P. (2017). Speaking style conversion from normal to Lombard speech using a glottal vocoder and Bayesian GMMs. Proc. Interspeech-2017, Stockholm, Sweden, pp. 1363–1367 (.pdf).

Räsänen, O. (2017). Language is not about language: towards formalizing the role of extra-linguistic factors in human and machine language acquisition and communication. Proc. Workshop on Grounding Language Acquisition (GLU-2017), Stockholm, Sweden, pp. 37–41 (.pdf).

Michel, P., Räsänen, O., Thiolliere, R., & Dupoux, E. (2017). Blind phoneme segmentation with temporal prediction errors. Proc. ACL SRW-2017, Vancouver, Canada (arXiv).

Seshadri, S., Remes, U. & Räsänen, O. (2017). Dirichlet process mixture models for clustering i-vector data. Proc. ICASSP-2017, New Orleans, LA, pp. 5470–5474 (.pdf) (code).

Räsänen O., Nagamine T. & Mesgarani N. (2016). Analyzing distributional learning of phonemic categories in unsupervised deep neural networks. Proceedings of the 38th Annual Conference of the Cognitive Science Society, Philadelphia, PA, pp. 1757–1762 (.pdf).

Kakouros S. & Räsänen O. (2016). Statistical learning of prosodic patterns and reversal of perceptual cues for sentence prominence. Proceedings of the 38th Annual Conference of the Cognitive Science Society, Philadelphia, PA, pp. 2489–2494 (.pdf).

Kakouros S., Pelemans J., Verwimp L., Wambacq P. & Räsänen O. (2016). Analyzing the contribution of top-down lexical and bottom-up acoustic cues in the detection of sentence prominence. Proc. Interspeech-2016, San Francisco, CA, pp. 1074–1078 (.pdf).

Räsänen O., Doyle G. & Frank M. C. (2015). Unsupervised word discovery from speech using automatic segmentation into syllable-like units. Proc. Interspeech-2015, Dresden, Germany, pp. 3204–3208 (.pdf).

Rasilo H. & Räsänen O. (2015). Weakly-supervised word learning is improved by an active online algorithm. Proc. Interspeech-2015, Dresden, Germany, pp. 1561–1565 (.pdf).

Kakouros S. & Räsänen O. (2015). Automatic detection of sentence prominence in speech using predictability of word-level acoustic features. Proc. Interspeech-2015, Dresden, Germany, pp. 568–572 (.pdf).

Koolen N., Dereymaeker, A., Räsänen O., Jansen K., Vervisch J., Matic V., De Vos M., Naulaers G., Van Huffel S. & Vanhatalo S. (2015). Data-driven metric representing the maturation of preterm EEG. Proc. 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Milan, Italy, pp. 1492–1495 (.pdf).

Räsänen O. & Rasilo H. (2015). Cross-situational cues are relevant for early word segmentation. Proc. 37th Annual Conference of the Cognitive Science Society, Pasadena, California, pp. 1949–1954 (.pdf).

Räsänen O. (2015). Generating hyperdimensional distributed representations from continuous-valued multivariate sensory input. Proc. 37th Annual Conference of the Cognitive Science Society, Pasadena, California, pp. 1943–1948 (.pdf).

Rasilo H. & Räsänen O. (2015). Computational evidence for effects of memory decay, familiarity preference and mutual exclusivity in cross-situational learning. Proc. 37th Annual Conference of the Cognitive Science Society, Pasadena, California, pp. 1955–1960 (.pdf).

Kakouros S. & Räsänen O. (2015). Analyzing the predictability of lexeme-specific prosodic features as a cue to sentence prominence. Proc. 37th Annual Conference of the Cognitive Science Society, Pasadena, California, pp. 1039–1044 (.pdf).

Kakouros S. & Räsänen O. (2014). Perception of sentence stress in English infant directed speech. Proc. Interspeech-2014, Singapore, pp. 1821–1825 (.pdf).

Räsänen O. (2014). Basic cuts revisited: Temporal segmentation of speech into phone-like units with statistical learning at a pre-linguistic level. Proc. 36th Annual Conference of the Cognitive Science Society, Quebec, Canada, pp. 2817–2822 (.pdf).

Kakouros S. & Räsänen O. (2014). Statistical unpredictability of f0 trajectories as a cue to sentence stress. Proc. 36th Annual Conference of the Cognitive Science Society, Quebec, Canada, pp. 1246–1251 (.pdf).

Räsänen O. & Pohjalainen J. (2013). Random subset feature selection in automatic recognition of developmental disorders, affective states, and level of conflict from speech. Proc. Interspeech-2013, Lyon, France, pp. 210–214 (.pdf).

Knuuttila J., Räsänen O. & Laine U. K. (2013). Automatic self-supervised learning of associations between speech and text. Proc. Interspeech-2013, Lyon, France, pp. 465–469 (.pdf).

Rasilo H., Räsänen O. & de Boer B. (2013). Virtual infant's online acquisition of vowel categories and their mapping between dissimilar bodies. Proc. Workshop on Speech Production in Automatic Speech Recognition, Lyon, France (.pdf).

Kakouros S., Räsänen O. & Laine U. (2013). Attention based temporal filtering of sensory signals for data redundancy reduction. Proc. ICASSP-2013, Vancouver, Canada, pp. 3188–3192 (.pdf).

Räsänen O. (2012). Average spectrotemporal structure of continuous speech matches with the frequency resolution of human hearing. Proc. Interspeech-2012, Portland, Oregon (.pdf).

Räsänen O. (2012). Non-auditory cognitive capabilities in computational modeling of early language acquisition. Proc. Interspeech-2012, Portland, Oregon (.pdf).

Räsänen O., Rasilo H. & Laine, U. K. (2012). Modeling spoken language acquisition with a generic cognitive architecture for associative learning. Proc. Interspeech-2012, Portland, Oregon (.pdf).

Pohjalainen J., Kadioglu S. & Räsänen O. (2012). Feature selection for speaker traits. Proc. Interspeech-2012, Portland, Oregon (.pdf).

Räsänen O. & Rasilo H. (2012). Acoustic analysis supports the existence of a single distributional learning mechanism in structural rule learning from an artificial language. Proc. 34th Annual Conference of the Cognitive Science Society (CogSci2012), Sapporo, Japan, pp. 887–892 (.pdf).

Räsänen O. (2012). Context induced merging of synonymous word models in computational modeling of early language acquisition. Proc. ICASSP-2012, Kyoto, Japan, pp. 5037–5040 (.pdf).

Räsänen O. (2012). Hierarchical unsupervised discovery of user context from multivariate sensory data. Proc. ICASSP-2012, Kyoto, Japan, pp. 2105–2108 (.pdf).

Räsänen O., Leppänen J., Laine U. K. & Saarinen J. (2011). Comparison of classifiers in audio and acceleration based context classification in mobile phones. Proc. EUSIPCO-11, Barcelona, Spain, pp. 946–950 (.pdf).

Rasilo H., Laine U. K., Räsänen O. & Altosaar T. (2011). Method for speech inversion with large scale statistical evaluation. Proc. Interspeech-11, Florence, Italy, pp. 2693–2696 (.pdf).

Rasilo H., Laine U. K. & Räsänen O. (2010). Estimation studies of vocal tract shape trajectory using a variable length and lossy Kelly-Lochbaum model. Proc. Interspeech-10, Chiba, Japan, pp. 2414–2417 (.pdf).

Räsänen O. (2010). Fully unsupervised word learning from continuous speech using transitional probabilities of atomic acoustic events. Proc. Interspeech-10, Chiba, Japan, pp. 2922–2925 (.pdf).

ten Bosch L., Räsänen O., Driesen J., Aimetti G., Altosaar T. & Boves L. (2009). Do multiple caregivers speed up language acquisition? Proc. Interspeech-09, Brighton, England, pp. 704–707 (.pdf).

Aimetti G., Moore R., ten Bosch L., Räsänen O. & Laine U. K. (2009). Discovering keywords from cross-modal input: ecological vs. engineering methods for enhancing acoustic repetitions. Proc. Interspeech-09, Brighton, England, pp. 1171–1174 (.pdf).

Räsänen O., Laine U. K. & Altosaar T. (2009). A noise robust method for pattern discovery in quantized time series: the concept matrix approach. Proc. Interspeech-09, Brighton, England, pp. 3035–3038 (.pdf).

Räsänen O., Laine U. K. & Altosaar T. (2009). An improved speech segmentation quality measure: the R-value. Proc. Interspeech-09, Brighton, England, pp. 1851–1854 (.pdf).

Räsänen O., Laine U. K. & Altosaar T. (2009). Self-learning Vector Quantization for Pattern Discovery from Speech. Proc. Interspeech-09, Brighton, England, pp. 852–855 (.pdf).

Räsänen O. & Driesen J. (2009). A comparison and combination of segmental and fixed-frame signal representations in NMF-based word recognition. Proc. 17th Nordic Conference on Computational Linguistics (NODALIDA-09), Odense, Denmark (.pdf).

Räsänen O., Laine U. K. & Altosaar T. (2008). Computational language acquisition by statistical bottom-up processing. Proc. Interspeech-08, Brisbane, Australia, pp. 1980–1983 (.pdf).

Other scientific publications

Räsänen O., "Studies on unsupervised and weakly supervised methods in computational modeling of early language acquisition", Doctoral thesis, Aalto University, School of Electrical Engineering, 2013 (.pdf).

Räsänen O., Laine U. K. & Saarinen J., "Automatic learning of a topology of associations from multiple data streams", A technical white paper, 2012 (.pdf).

Laine U. K. & Räsänen O., "Indirect estimation of formant frequencies through mean spectral variance with application to automatic gender recognition", In Proc. 6th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA), Firenze, Italy, 2009 (.pdf).

ten Bosch L., Boves L. & Räsänen O., "Learning meaningful units from multimodal input - the effect of interaction strategies", Proc. Workshop on Child, Computer and Interaction 2009 (WOCCI), Boston, MA, United States, 2009 (.pdf).

Räsänen O., "A review of missing-feature methods in automatic speech recognition", in Palomäki K. J., Remes U. & Kurimo M. (Eds.), Studies on noise robust automatic speech recognition. Technical Report TKK-ICS-R19, Helsinki University of Technology, Dept. ICS, Finland, 2009.

Räsänen O., Altosaar T. & Laine U. K., "Comparison of prosodic features in Swedish and Finnish IDS/ADS speech", Proc. Nordic Prosody X, Helsinki, Finland, 2008 (.pdf).

Räsänen O., "Speech segmentation and clustering methods for a new speech recognition architecture", M.Sc. Thesis, Helsinki University of Technology, 2007 (.pdf).