This two-volume set, LNAI 16187 and 16188, constitutes the refereed proceedings of the 27th International Conference on Speech and Computer, SPECOM 2025, held in Szeged, Hungary, during October 13–15, 2025.
The 47 full papers and 1 invited paper included in this book were carefully reviewed and selected from 77 submissions. The papers are organized in the following topical sections:
Part I: Invited Paper; Speech Perception and Synthesis; Computational Paralinguistics; Speech Processing for Healthcare; Speech and Language Resources; Speaker Recognition.
Part II: Automatic Speech Recognition; Speech Processing for Under-Resourced Languages; Digital Speech Processing; Natural Language Processing; Multimodal Systems.
Publisher's item no.: 89581793, 978-3-032-07955-8
Number of pages: 343
Publication date: November 20, 2025
Language: English
Dimensions: 235 mm x 155 mm
ISBN-13: 9783032079558
ISBN-10: 3032079551
Item no.: 75331427
Manufacturer information
Springer-Verlag GmbH
Tiergartenstr. 17
69121 Heidelberg
ProductSafety@springernature.com
Table of contents
Invited Paper
- Towards Responsible Multimodal Modeling for Mental Healthcare

Speech Perception and Synthesis
- When Voice Matters: Evidence of Gender Disparity in Positional Bias of SpeechLLMs
- WhiSQA: Non-Intrusive Speech Quality Prediction using Whisper Encoder Features
- Prompting the Mind: EEG-to-Text Translation with Multimodal LLMs and Semantic Contro
- Effectiveness of Tacotron2 for Intonation Model Synthesis in Russian
- Enhancing Sinhala Text-to-Speech with End-to-End VITS Architecture

Computational Paralinguistics
- Spoken Emotion Recognition using Soft Labels
- NAMTalk: From Muscle Vibrations to Emotional Speech
- What Do LLMs Know about Human Emotions? The Russian Case Study
- Emotions Manifestation by Adolescents with Intellectual Disabilities
- Retention-Augmented Voice Assistant: A Lightweight Architecture for Stateful Interaction with Comprehensive Evaluation and Privacy-Preserving Design

Speech Processing for Healthcare
- Investigation of Explainable Multimodal Methods for Detecting Mental Disorders
- Attention Deficit Hyperactivity Disorder: Identifying Approaches for Early Diagnosis, a Pilot Study
- Text-to-Dysarthric-Speech Generation for Dysarthric Automatic Speech Recognition: Is Purely Synthetic Data Enough?
- Colour Preferences in Schizophrenic Speech
- Automated Assessment of Phrase Intelligibility for Russian Speech Based on Esophageal Voice

Speech and Language Resources
- Subtle Changes in L1 Stops of Late Salento Italian-French Bilinguals: An Acoustic Study using AutoVOT Adapted for Italian and French
- Sound and Colour in Phonosemantics: Perceptual and Acoustic Correlates of Mongolian Vowels
- Rhythmic Diglossia Based on Discourse Types and Dialects of English: Australian and New Zealand Corpora
- Automatic Annotation of Discourse and Speech Formulas in Internet Communication: A Telegram Comment Corpus

Speaker Recognition
- Effect of Spoof Speech on Forensic Voice Comparison using Deep Speaker Embeddings
- Source Vendor Tracing of Audio Deepfakes
- Language-Specific Adaptation Strategies for Speaker Recognition using MobileNet
- Enhancing Audio Replay Attack Detection with Silence-based Blind Channel Impulse Response Estimation