SMM23

Program

Event	Time Slot (IST, Irish time)
Workshop Introduction	09:50 to 10:10
Keynote 1: Into the prosodic dimension: Finding meaning in the non-lexical aspects of speech. Dr. Catherine Lai (Lecturer in Speech and Language Technology, working at the Centre for Speech Technology Research at the University of Edinburgh). With recent advances in machine learning, automated dialogue systems have become more able to produce coherent language-based interactions. However, most work on automated spoken language understanding still uses only text transcriptions, i.e., just the lexical content of speech. This ignores the fact that the way we speak can change how our words are interpreted. In particular, speech prosody (e.g., pitch, energy, and timing characteristics of speech) can be used to signal speaker intent in spoken dialogues. In fact, prosodic features can help automatic detection of both dialogue structure and speaker affect/states. In this talk, I will discuss our recent work on how we can combine non-lexical and lexical aspects of speech to improve speech understanding tasks, such as emotion recognition, and how new approaches to self-supervised learning from speech might be able to help us make the most of the true richness of speech.	10:10 to 11:00
Coffee Break	11:00 to 11:20
Oral session: Influencing and assessing human productivity through audio signals. Presentations: Team work quality prediction using speech-based features; Coding music for no stress learning	11:20 to 12:00
Oral session: Speech Production. Presentations: Voice source correlates of acted male speech emotions; Comparison of acoustic-to-articulatory and brain-to-articulatory mapping during speech production using ultrasound tongue imaging and EEG	12:00 to 12:40
Lunch	12:40 to 14:10
Keynote 2: Investigating auditory cognition with natural speech and music. Dr. Giovanni Di Liberto (Dept. of Computer Science and Statistics at Trinity College Dublin). That cortical activity tracks the dynamics of auditory stimuli is reasonably well established. In speech and music perception, this phenomenon produces reliable coupling between the acoustic envelope of the sound input and the corresponding cortical responses. However, it remains unclear to what extent that neural tracking reflects low-level acoustic properties, as opposed to more abstract linguistic structures. In this talk, I will discuss a series of studies aimed at assessing the impact of higher-order linguistic information on the cortical processing of speech and music sounds. I will demonstrate methodologies for disentangling neural responses to stimulus properties at different abstraction levels, deriving multiple objective indices for probing auditory perception with experiments involving natural speech and music listening. I will then describe recent developments of these measures in the context of developmental research.	14:10 to 15:00
Oral session: Assessment of personality traits and mental states through speech. Presentations: Apparent personality prediction from speech using expert features and wav2vec2; Harnessing the power of speech technology for mental health assessment; Voice technology to identify fatigue from Japanese speech	15:00 to 16:00
Invited Talk: Dynamics in Interactions and Affects Considering Personality Characteristics. Dr. Ronald Böck (Genie Enterprise)	16:00 to 16:30
Roundtable session with invited guests: Ethics and assessment of mental states. João Cabral, Francesca Bonin, Nicolas Obin, Christian Saam	16:30 to 17:30
Workshop Conclusion	17:30 to 17:40
Social at Pavilion Bar, on the Trinity College Dublin campus	18:00 onwards