Speaker

Dr Giovanni Di Liberto

School of Computer Science and Statistics, Trinity College Dublin
Title:
Investigating auditory cognition with natural speech and music
Abstract:
That cortical activity tracks the dynamics of auditory stimuli is reasonably well established. In speech and music perception, this phenomenon produces reliable coupling between the acoustic envelope of the sound input and the corresponding cortical responses. However, it remains unclear to what extent this neural tracking reflects low-level acoustic properties as opposed to more abstract linguistic structures. In this talk, I will discuss a series of studies aimed at assessing the impact of higher-order linguistic information on the cortical processing of speech and music. I will demonstrate methodologies for disentangling neural responses to stimulus properties at different levels of abstraction, deriving multiple objective indices for probing auditory perception in experiments involving natural speech and music listening. I will then describe recent developments of these measures in the context of developmental research.
Bio:
Giovanni received his Bachelor's degree in Information Engineering in 2011 and his Master's degree in Computer Engineering in 2013, both from the University of Padova, Italy. After a period working on his thesis at University College Cork (UCC, Ireland), he joined Edmund Lalor's research lab at Trinity College Dublin, where he pursued a PhD in auditory neuroscience in the School of Electronic and Electrical Engineering. He received his PhD in 2017 and immediately afterwards joined the Laboratoire des Systèmes Perceptifs at the École Normale Supérieure (Paris), under the supervision of Alain de Cheveigné and Shihab Shamma. He then briefly continued his work on speech communication with Richard Reilly as a postdoctoral researcher at TCD, while also working with Simon Kelly at UCD, expanding his expertise into the decision-making domain. He is currently Assistant Professor in Intelligent Systems in the School of Computer Science and Statistics at Trinity College Dublin.

Giovanni's scientific interests centre on understanding the brain mechanisms underlying speech comprehension. In his work, he develops data analysis methods and applies them to brain data to identify the neural processes responsible for the transformation of a sensory stimulus into its abstract meaning. Brain electrical data are measured with either non-invasive (e.g., electroencephalography, EEG) or invasive (e.g., electrocorticography, ECoG) technologies. The first aspect of his research is methodological and has produced novel experimental and analysis frameworks for investigating cortical auditory processing. The second is to use these novel methods to test theories of auditory perception, such as the hierarchical processing of speech and predictive processing theories (e.g., predictive coding). Finally, the third part of his work is translational and involves identifying ways to apply these methods in practical settings, for example as tools for developing brain-computer interfaces or as objective measures for monitoring language development and healthy ageing.
Speaker

Dr Catherine Lai

Lecturer in Speech and Language Technology, Centre for Speech Technology Research, University of Edinburgh
Title:
Into the prosodic dimension: Finding meaning in the non-lexical aspects of speech.
Abstract:
With recent advances in machine learning, automated dialogue systems have become more able to produce coherent language-based interactions. However, most work on automated spoken language understanding still uses only text transcriptions, i.e., just the lexical content of speech. This ignores the fact that the way we speak can change how our words are interpreted. In particular, speech prosody (e.g., the pitch, energy, and timing characteristics of speech) can be used to signal speaker intent in spoken dialogues. In fact, prosodic features can help with the automatic detection of both dialogue structure and speaker affect/states. In this talk, I will discuss our recent work on combining non-lexical and lexical aspects of speech to improve speech understanding tasks, such as emotion recognition, and how new approaches to self-supervised learning from speech might help us make the most of the true richness of speech.
Bio:
Dr Catherine Lai is a Lecturer in Speech and Language Technology at the Centre for Speech Technology Research at the University of Edinburgh. She specialises in spoken language understanding, affective computing, and multimodal speech processing. Her main focus of interest is how non-lexical aspects of speech contribute to discourse and dialogue understanding. She is particularly interested in speech prosody, e.g., pitch, energy, and timing, and in how varying the way we speak can change our understanding of speaker intent in a dialogue, from both a recognition and a generation perspective. She has also worked on evaluating speech and language technologies for engaging older people in shared decision making and on the feasibility of robot companions.