SPEAKER-TURN AWARE DIARIZATION FOR SPEECH-BASED COGNITIVE ASSESSMENTS

Introduction

Speaker diarization is an essential preprocessing step for diagnosing cognitive impairments from speech-based Montreal Cognitive Assessments (MoCA).

Methods

This paper proposes three enhancements to conventional speaker diarization methods for such assessments. The enhancements tackle the challenges of diarizing MoCA recordings on two fronts. First, a multi-scale channel-interdependence speaker embedding is used as the front-end speaker representation to overcome the acoustic mismatch caused by far-field microphones. Specifically, a squeeze-and-excitation (SE) unit and channel-dependent attention are added to Res2Net blocks for multi-scale feature aggregation.
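To make the squeeze-and-excitation idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function name `se_block`, the helper `random_matrix`, the bottleneck size, and the random weights are all assumptions chosen for illustration, and the Res2Net blocks and channel-dependent attention mentioned above are not shown. It only illustrates the general SE pattern of summarizing each channel over time, passing the summary through a small bottleneck, and rescaling the channels with the resulting gates.

```python
import math
import random


def se_block(features, w1, b1, w2, b2):
    """Apply a squeeze-and-excitation step to features shaped [channels][frames].

    w1/b1 and w2/b2 are the (hypothetical) weights of the two small fully
    connected layers; in a real model they would be learned.
    """
    # Squeeze: average each channel over time, giving one value per channel.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation, step 1: bottleneck layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)) + b)
        for row, b in zip(w1, b1)
    ]
    # Excitation, step 2: expand back to one gate per channel; sigmoid keeps
    # each gate strictly between 0 and 1.
    gates = [
        1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + b)))
        for row, b in zip(w2, b2)
    ]
    # Scale: multiply every frame of a channel by that channel's gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return scaled, gates


def random_matrix(rows, cols, rng):
    """Illustrative helper: small random values standing in for real data."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy sizes (assumptions): 8 channels, 5 frames, bottleneck of 2 units.
rng = random.Random(0)
channels, frames, bottleneck = 8, 5, 2
features = random_matrix(channels, frames, rng)
w1 = random_matrix(bottleneck, channels, rng)
b1 = [0.0] * bottleneck
w2 = random_matrix(channels, bottleneck, rng)
b2 = [0.0] * channels

scaled, gates = se_block(features, w1, b1, w2, b2)
```

The gates depend on all channels at once, which is what lets the unit model interdependence between channels rather than treating each one in isolation.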

Second, a sequence comparison approach with a holistic view of the whole conversation is applied to measure the similarity of short speech segments in the conversation, which results in a speaker-turn aware scoring matrix for the subsequent clustering step. Third, to further enhance the diarization performance, we propose incorporating a pairwise similarity measure so that the speaker-turn aware scoring matrix contains both local and global information across the segments.

Results

Evaluations on an interactive MoCA dataset show that the proposed enhancements lead to a diarization system that outperforms the conventional x-vector/PLDA systems under language-, age-, and microphone-mismatch scenarios.

Discussion

The results also show that the proposed enhancements can help hypothesize the speaker-turn timestamps, making the diarization method amenable to datasets without timestamp information.
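As a rough illustration of the third enhancement, the sketch below builds a pairwise similarity matrix from segment embeddings and blends it with a second, conversation-level score matrix. Everything specific here is an assumption: the text does not say which pairwise measure is used or how the two sources of information are combined, so cosine similarity, the weighted average in `fuse_scores`, the `weight` parameter, and the toy `global_scores` values are all hypothetical stand-ins.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pairwise_scores(embeddings):
    """Local information: score every pair of segments independently."""
    return [[cosine(u, v) for v in embeddings] for u in embeddings]


def fuse_scores(local, global_scores, weight=0.5):
    """Blend two score matrices with a weighted average (an assumed rule)."""
    return [
        [weight * l + (1.0 - weight) * g for l, g in zip(l_row, g_row)]
        for l_row, g_row in zip(local, global_scores)
    ]


# Toy segment embeddings: the first two point in similar directions.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
local = pairwise_scores(embeddings)

# Placeholder for scores from a conversation-level comparison; these numbers
# are invented purely to exercise the fusion step.
global_scores = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]

fused = fuse_scores(local, global_scores, weight=0.5)
```

A matrix like `fused` would then be handed to the clustering step, which groups segments into speakers.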
