How do human children learn to understand and produce speech without explicit teaching? Which aspects of language development are built into our brains and bodies, and how much can actually be learned from the environment using generic cognitive skills? How can we make machines use and understand language the way humans do, not necessarily through textual representations, but by truly understanding and communicating meanings in the signal?

These are some of the key questions that we work on in the Speech and Cognition research group. Our primary research method is computational modeling, which applies signal processing and machine learning to (potentially large-scale) language and multimodal data in order to address these questions. In addition, we work on various other topics related to speech technology and signal processing, such as methods for automatic detection of neurophysiological problems in infants and technological tools for large-scale audio and language data analysis.


Selected publications

Khorrami, K. & Räsänen, O. (in press). A model of early word acquisition based on realistic-scale audiovisual naming events. Speech Communication, accepted for publication. Preprint: https://arxiv.org/abs/2406.05259. Slides of the associated ICIS-2024 presentation here.

Cruz Blandón, M. A., Cristia, A., & Räsänen, O. (2023). Introducing meta-analysis in the evaluation of computational models of infant language development. Cognitive Science, 47, e13307, https://doi.org/10.1111/cogs.13307.

Airaksinen, M., Gallen, A., Kivi, A., Vijayakrishnan, P., Häyrinen, T., Ilen, E., Räsänen, O., Haataja, L. & Vanhatalo, S. (2022). Intelligent wearable allows out-of-the-lab tracking of developing motor abilities in infants. Communications Medicine, 2, 69, https://doi.org/10.1038/s43856-022-00131-6.

Khorrami, K. & Räsänen, O. (2021). Can phones, syllables, and words emerge as side-products of cross-situational audiovisual learning? – A computational investigation. Language Development Research, https://doi.org/10.34842/w3vw-s845.

Räsänen, O., Seshadri, S., Lavechin, M., Cristia, A., & Casillas, M. (2021). ALICE: An open-source tool for automatic measurement of phoneme, syllable, and word counts from child-centered daylong recordings. Behavior Research Methods, 53, 818–835, https://doi.org/10.3758/s13428-020-01460-x (source code).

Räsänen, O., Doyle, G., & Frank, M. C. (2018). Pre-linguistic segmentation of speech into syllable-like units. Cognition, 171, 130–150, https://doi.org/10.1016/j.cognition.2017.11.003 (.pdf).

Kakouros, S., Salminen, N. & Räsänen, O. (2018). Making predictable unpredictable with style — Behavioral and electrophysiological evidence for the critical role of prosodic expectations in the perception of prominence in speech. Neuropsychologia, 109, 181–199 (.pdf).

Räsänen, O., Kakouros, S. & Soderstrom, M. (2018). Is infant-directed speech interesting because it is surprising? — Linking properties of IDS to statistical learning and attention at the prosodic level. Cognition, 178, 193–206 (.pdf).

Rasilo, H. & Räsänen, O. (2017). An online model of vowel imitation learning. Speech Communication, 86, 1–23 (.pdf) (web).

Räsänen, O. & Rasilo, H. (2015). A joint model of word segmentation and meaning acquisition through cross-situational learning. Psychological Review, 122(4), 792–829 (.pdf).

Räsänen, O. & Laine, U. K. (2013). Time-frequency integration characteristics of hearing are optimized for perception of speech-like acoustic patterns. The Journal of the Acoustical Society of America, 134, 407–419 (web).

Contact: firstname.surname@tuni.fi