O-COCOSDA 2025

12–14 November 2025, Universitas Kristen Duta Wacana (UKDW), Yogyakarta, Indonesia


Latest Program Book O-COCOSDA 2025 (14 Nov 2025)

Conference Program

Note: All times are in UTC+7 (WIB, Western Indonesia Time)

Detailed Sessions

Oral Session #1

Wednesday, November 12, 2025 (Day 1)
Time: 10:40 – 12:10
Session Chair: Prof. Norihide Kitaoka
36
Variations in Nasalance Values of Oral Vowels: Nasometric Evidence from Shanghainese
Feifan Wang and Yaohua Jin
Speech Science
52
To Hesitate is to be Proficient: Acoustics and Speech Perception of Filled Pauses in L2 Spontaneous Hindi Speech by Native L1 Assamese Speakers
Sridevi Ravi, Joyshree Chakraborty and Priyankoo Sarmah
Speech Science
23
Exploring Dialects with Speech Embeddings: Insights from Two Speech Databases in Assamese and Finnish
Tuukka Törö, Antti Suni, Leena Dihingia, Juraj Šimko and Priyankoo Sarmah
Speech Science
28
Rhythm – Syntax Interaction Across Modalities: Evidence From Chinese Learners Of English
Tian Peng and Jue Yu
Speech Science
32
Acoustic Differences Between Coronal Nasal /n/ And Lateral /l/ In Standard Chinese
Yijing He and Wai-Sum Lee
Speech Science

Oral Session #2

Wednesday, November 12, 2025 (Day 1)
Time: 13:10 – 14:10
Session Chair: Prof. Priyankoo Sarmah
20
A GOP-Based Automatic Pronunciation Scoring System for Taiwanese Hakka Using Transformer Regression Models
Yi-Chin Huang, Yu-Heng Chen, Chih-Chung Kuo, Chao-Shih Huang and Yuan-Fu Liao
Technology
39
HifiDiff: Two Stream Diffusion Models for High Fidelity Speech Generation of Unseen Languages
Lhuqita Fazry, Kurniawati Azizah, Dipta Tanaya, Ayu Purwarianti, Dessi Lestari and Sakriani Sakti
Technology
13
Bridging Disfluent to Fluent in Speech Translation: Effective Tagging and Fine-tuning Strategies
Yuka Ko, Katsuhito Sudoh, Satoshi Nakamura and Sakriani Sakti
Technology
15
FarmSaathi: A Retrieval-Reranking RAG Framework for Multilingual Conversational AI in Indian Agriculture
Devansh Kapoor, Saanya Setia, Shivam Arora and Komal Bharti
Technology

Poster Session #1

Wednesday, November 12, 2025 (Day 1)
Time: 14:10 – 15:10
10
A Study on Chinese Tone and Intonation Errors among Bangladeshi Learners: An Investigation Based on Monosyllabic Words and Sentences
Jia Jing and Wen Cao
Speech Science
17
Joining Diarization and Multi-Speaker Automatic Speech Recognition with Overlap Handling for Long Conversations
Myat Aye Aye Aung, Win Pa Pa and Sakriani Sakti
Technology
21
Rapid Model Adaptation of Code-Switching ASR in Low-Resource, High-Noise Industrial Domains
Tian-Yi Chen, Chih-Chung Kuo, Yu-Siang Lan, Yuan-Fu Liao, Bo-Wei Chen, Yi Liu and Ming-Hsuan Wu
Technology
80
Quality Improvement of Low-Resourced Bahasa Indonesia Expressive Speech Synthesis Using Cross-Lingual Transfer Learning and Tacotron2
Elok Anggrayni, Aprianto Dwi Prasetyo and Dhany Arifianto
Technology
29
Voiceless laterals in Hmar
Ruubino Peseyie and Priyankoo Sarmah
Speech Science
64
Optimization of Large-Scale Speaker Identification System Based On I-Vector with LDA, PCA, and LSH
Muhammad Zaydan Athallah and Dessi Puji Lestari
Technology
53
Comparative Evaluation of N-Gram and Transformer Based Language Models for ECoG-based Speech Neuroprosthesis
Bagas Aryo Seto, Nur Ahmadi and Dessi Puji Lestari
Technology
57
Tokyo-type Accent Production among the North East Indian Students
Nozomi Tokuma, Gulab Jha and Ruubino Peseyie
Speech Science
58
Exploring rhythm formant analysis for Indic language classification
Parismita Gogoi, Sishir Kalita, Priyankoo Sarmah and S. R. Mahadeva Prasanna
Technology
70
LIMVO: A Less Is More Approach for Visual Reasoning in Knowledge Based Visual Question Answering
Ade Rohmat Maulana, Arie Ardiyanti Suryani and Ema Rachmawati
Technology

Oral Session #3

Wednesday, November 12, 2025 (Day 1)
Time: 15:40 – 17:10
Session Chair: Prof. Yuan-Fu Liao
9
Myanmar-English Code-Switching Speech Dataset: MEASR
Theingi Aye, Win Pa Pa and Hay Mar Soe Naing
Data Development
25
A Corpus-Based Investigation of Acoustic Features Influencing Intelligibility of Super-elderly Japanese Speech
Meiko Fukuda, Ryota Nishimura and Norihide Kitaoka
Data Development
50
A Prosodically Annotated Bengali and Assamese Audiobook Corpus for Sentence Boundary Detection
Priyanjana Chowdhury, Sanghamitra Nath and Utpal Sharma
Data Development
55
Vietnamese Speech Database for No-Reference Telecommunication Quality Assessment
Hong Nhat Tran, Bao Thang Ta and Van Hai Do
Data Development
85
BK3AT: An Automated Assessment Tool for K-3 Bangsamoro Education
Crisron Rudolf Lucas, Michael Gringo Bayona, Kiel Gonzales, Edsel Jedd Renovalles, Francis Paolo Santelices, Nissan Macale, Jose Marie Mendoza, Jazzmin Maranan and Nicole Anne Palafox
Data Development
5
The Effect of Question Intonation on Focus: A Comparative Study of Tianjin Mandarin and American English
Binbin Sun, Shuang Yuan, Hui Feng and Aijun Li
Speech Science

Oral Session #4

Thursday, November 13, 2025 (Day 2)
Time: 10:10 – 12:00
Session Chair: Dr. Kurniawati Azizah
34
Toward Natural Emotional Text-to-Speech System with Fine-grained Non-Verbal Expression Control
Wangzixi Zhou, Bagus Tris Atmaja and Sakriani Sakti
Technology
37
Stage-Wise Acoustic-Linguistic Fine-Tuning for Overlapped Speech Recognition: Does Ordering Matter?
Saddam Annais Shaquille, Dessi Puji Lestari and Sakriani Sakti
Technology
4
Exploring the Impact of Data Quantity on ASR in Extremely Low-resource Languages
Yao-Fei Cheng, Li-Wei Chen, Hung-Shin Lee and Hsin-Min Wang
Technology
79
LAFAEK-CORPUS-1M+: A Large-Scale Tetun Corpus to Build A Low-Resourced LLM For Speech and Text Processing
Yuichi Nishida, Yuto Kuroda and Satoshi Tamura
Data Development
24
Japanese Articulatory Speech Dataset Acquired with 3D Electromagnetic Articulography
Eri Ikeda, Yukiyasu Yoshinaga, Kouichi Katsurada and Kohei Wakamiya
Data Development
18
Production Patterns and Prosodic features of Chinese Tones by Learners from Five Central Asian Countries
Yuan Jia and Mingshuai Yin
Speech Science
19
Is Beijing Mandarin Stress-Timed? Examining Rhythmic Patterns in Spontaneous and Read Speech
Zhiwei Wang and Aijun Li
Speech Science

Oral Session #5

Thursday, November 13, 2025 (Day 2)
Time: 13:00 – 14:30
Session Chair: Dr.Phil. Lucia Dwi Krisnawati
51
Accent Conversion: Preserving Speaker Identity in Native English Synthesis
Sabyasachi Chandra, Puja Bharati, Debolina Pramanik, Shyamal Kumar Das Mandal and Riya Sil
Technology
54
ThaiMRC: A Comprehensive Corpus for Advancing Machine Reading Comprehension in Thai
Chaianun Damrongrat, Santipong Thaiprayoon, Pornpimon Palingoon, Sumonmas Thatphithakkul and Vataya Chunwijitra
Data Development
61
Application of Data Augmentation to Reduce Session Variability in An I-Vector-Based Speaker Identification System
Muhammad Hanan and Dessi Puji Lestari
Technology
42
Automatic classification of disyllabic tone sandhi in Wuhan dialect based on functional principal component analysis
Wanping Xu and Aijun Li
Technology
78
Development of Chatbot Module in an Intelligent Tutoring System for English Language Learning Using Large Language Model
Ziyad Dhia Rafi, Ayu Purwarianti and Samsu Sempena
Technology

Poster Session #2

Thursday, November 13, 2025 (Day 2)
Time: 14:30 – 15:30
2
Taiwanese POS Tagging Without Training Data: An LLM Model Merging-Based Approach with Chinese Resources
Chun Hsuan Chen, Hsiao-Wen Chu and Yuan-Fu Liao
Technology
14
Chinese Learners’ Processing of English Prosodic Boundaries: An ERP Study
Xiaoli Ji, Feier Cai, Pixiang Sun, Yanqin Yang and Aijun Li
Speech Science
27
Native Language Identification in Multilingual Indian English Speech: A Hybrid Deep Neural Approach with Feature Space Visualization
Debolina Pramanik, Puja Bharati, Sabyasachi Chandra, Satya Prasad Gaddamedi, Shyamal Kumar Das Mandal and Tarun Kanti Bhattacharya
Technology
30
Speech input interface for electronic medical record supporting automatic SOAP generation using large language models
Rikuto Yamanaka, Tsubasa Saito, Yukoh Wakabayashi and Norihide Kitaoka
Technology
33
Advancements in Speaker Diarization: A Comprehensive Study Integrating Audio-Visual, Neural, and Language Model-Based Approaches
Riya Sil, Sabyasachi Chandra and Pubali Maiti
Technology
48
Effects of Speech Rate and Syllable Position on The Temporal and Spectral Characteristics of Cantonese Vowels
Chu Yan Ho and Wai-Sum Lee
Speech Science
49
Oriental COCOSDA in the Philippine and Global Academic Landscape: Policy and Bibliometric Perspectives
Nathaniel Oco
Technology
59
Case Studies on Error Checking for Tagalog and Bikol Language
Zhean Robby Ganituen, Stephen Borja, Justin Ethan Ching and Nathaniel Oco
Technology
66
Literature Review: Fusion and Attention Mechanisms in Text and Image Based Multimodal Sentiment Analysis
Revano Fabiansyah Priadi and Arie Ardiyanti Suryani
Technology
84
Integrating Semantic and Orthographic Features for Drug Name Similarity Analysis
Zhean Robby Ganituen, Stephen Borja, Erin Gabrielle Chua, Gideon Chua and Nathaniel Oco
Technology
3
The influence of emotional prosody on preschoolers’ perception of Mandarin tones under noise: benefits from visual-articulatory cues
Wenyu Xiang, Yindan Weng, Shuimei Wang and Ping Tang
Speech Science

Oral Session #6

Thursday, November 13, 2025 (Day 2)
Time: 16:00 – 17:30
Session Chair: Prof. Nathaniel Oco
7
Measuring Emotion Preservation in Expressive Speech-to-Speech Translation
Bagus Tris Atmaja, Toru Shirai and Sakriani Sakti
Technology
16
Generation and Automatic Evaluation of SOAP Notes from Medical Dialogue Using Large Language Models
Tsubasa Saito, Rikuto Yamanaka, Yukoh Wakabayashi and Norihide Kitaoka
Technology
35
Tuning Tone with Age: Adapting Dialogue Response Generation Based On LLMs and Self-Supervised Speaker Age Estimation
Riichi Yagi, Wangzixi Zhou, Hongwei Hu, Yuta Hirano and Sakriani Sakti
Technology
44
Enhancing Indonesian Deepfake Speech Localization with Pathological Features
Edia Zaki Naufal Ilman, Dessi Puji Lestari and Candy Olivia Mawalim
Technology
11
A Deep Learning Approach to Low-Resource Sanskrit Speech Recognition Using CTC Loss
Suhani Suhani, Amita Dev and Poonam Bansal
Technology

Oral Session #7

Friday, November 14, 2025 (Day 3)
Time: 08:00 – 09:30
Session Chair: Prof. Satoshi Tamura
56
Tonal coarticulation in Jotsoma Angami compounds vs non-compounds
Zhonei I Gwirie, Priyankoo Sarmah and Sanasam Ranbir Singh
Speech Science
93
Improving Multi-Speaker Transcription for Live News Broadcasts with Canary 1B and Pyannote Diarization
Muhammad Rifqi Adli Gumay, Rahmat Bryan Naufal, Alvin Xavier Rakha Wardhana and Kurniawati Azizah
Technology
94
Neural Network-Based Speech Emotion Recognition for The Indonesian Language
Muhammad Iqbal Asrif, Muhammad Alif Ismady and Kurniawati Azizah
Technology
73
Multilingual Multi-task Learning with Gradient Manipulation Method for Local Languages in Indonesia
Wilbert Fangderson and Ayu Purwarianti
Technology
26
Development of AcehX for Sentiment Analysis Using a BERT-Based Model
Doni Sumito Sukiswo, Hammam Riza, Muhammad Subianto, Taufik Fuadi Abidin and Afnan Afnan
Technology