O-COCOSDA 2025

12–14 November 2025, Universitas Kristen Duta Wacana (UKDW), Yogyakarta, Indonesia


Speaker: Prof. Dr. Suyanto, S.T., M.Sc.

Rector of Telkom University, Indonesia

Session Chair: Prof. Hammam Riza


Biography

Prof. Dr. Suyanto is a Professor of Artificial Intelligence and the current Rector of Telkom University. He holds degrees in Informatics Engineering (Telkom University), Complex Adaptive Systems (Chalmers University, Sweden), and Computer Science (Universitas Gadjah Mada).
With over 100 international publications and an h-index of 23, he is recognized globally for his contributions to AI, machine learning, swarm intelligence, and computational linguistics. He is the creator of the Komodo Mlipir Algorithm (KMA), a novel optimization method inspired by natural behavior and cultural philosophy.
Listed among the world’s top 2% scientists by Stanford University, Prof. Suyanto is also an inventor, educator, and academic leader. His current focus is on advancing research, innovation, and entrepreneurial transformation in higher education.

Abstract

According to the Southeast Asian Ministers of Education Organization (SEAMEO), Southeast Asia is home to one of the richest linguistic heritages in the world, yet hundreds of local languages are disappearing due to limited documentation, lack of formal education, and the dominance of national or global languages. In this talk, I discuss the connection between language as a human communication system, linguistics as the field that studies language, and technology, from the point of view of a computer scientist with a strong interest in social studies. The talk draws on several studies of the Indonesian language and Indonesian local languages.

In Indonesia, 522 languages are endangered and 15 are extinct, according to Ethnologue. The emergence of AI-driven speech and language technologies provides a transformative opportunity to preserve and revitalize these languages, but it also exposes the gaps that remain. For instance, research by Bang et al., published in 2023, reveals that large language models such as ChatGPT fail to translate even simple English–Sundanese sentences correctly, highlighting persistent biases and data scarcity in multilingual modeling.

Recent progress in speech technology, NLP, and large language models (LLMs) offers a foundation for inclusive language revitalization. Several research efforts in Indonesia have contributed to this direction through grapheme-to-phoneme modeling, syllabification, and automatic speech recognition (ASR). For example, Suyanto et al. showed that phonotactic and syllabic rules improve Indonesian phonemicization and, potentially, end-to-end ASR performance; these techniques can be extended to low-resource local languages.
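
To illustrate the kind of phonotactic and syllabic rules referred to above, the sketch below shows a minimal rule-based Indonesian syllabifier in Python. The digraph list, the set of legal onset clusters, and the splitting rules are simplified assumptions chosen for illustration only; they are not the rules proposed by Suyanto et al.

    # Minimal sketch: rule-based Indonesian syllabification with a few
    # phonotactic checks. Digraphs, onset clusters, and splitting rules are
    # simplified assumptions for illustration, not the published method.
    VOWELS = set("aeiou")
    DIGRAPHS = ("ng", "ny", "sy", "kh")                  # treated as single consonants
    LEGAL_ONSETS = {"pr", "tr", "kr", "br", "dr", "gr",  # clusters assumed to be
                    "pl", "kl", "bl", "gl", "sl", "st"}  # valid syllable onsets

    def to_units(word):
        """Split a word into phone-like units, merging consonant digraphs."""
        units, i = [], 0
        while i < len(word):
            if word[i:i + 2] in DIGRAPHS:
                units.append(word[i:i + 2])
                i += 2
            else:
                units.append(word[i])
                i += 1
        return units

    def syllabify(word):
        """Greedy (C)(C)V(C) syllabification: V.CV, VC.CV, and V.CCV for legal onsets."""
        units = to_units(word.lower())
        syllables, current, i = [], "", 0
        while i < len(units):
            current += units[i]
            if units[i] not in VOWELS:
                i += 1
                continue
            # Collect the consonant run between this vowel and the next vowel.
            j, cons = i + 1, []
            while j < len(units) and units[j] not in VOWELS:
                cons.append(units[j])
                j += 1
            if j >= len(units):                          # word-final coda
                current += "".join(cons)
                i = j
            elif len(cons) <= 1 or (len(cons) == 2 and "".join(cons) in LEGAL_ONSETS):
                syllables.append(current)                # V.CV or V.CCV split
                current = ""
                i += 1
            else:
                current += cons[0]                       # VC.CV: first consonant closes the syllable
                syllables.append(current)
                current = ""
                i += 2
        if current:
            syllables.append(current)
        return syllables

    print(syllabify("bangsa"))   # -> ['bang', 'sa']
    print(syllabify("sastra"))   # -> ['sas', 'tra']

Rules of this kind can feed a grapheme-to-phoneme or end-to-end ASR pipeline as lightweight linguistic constraints, which is one reason they remain attractive for low-resource local languages.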

Building upon these foundations, current research highlights several emerging trends in AI-enabled language revitalization:

  1. Community-in-the-loop dataset creation, where native speakers act as data contributors and cultural validators, ensuring ethical and authentic representation.
  2. Human-AI collaborative curation, utilizing LLMs for data cleaning, transliteration, and tagging while maintaining expert oversight and explainability.
  3. Socio-technical integration, combining linguistics, anthropology, education, and AI ethics to form a holistic revitalization ecosystem rather than isolated language models.

Looking ahead, strengthening the collaboration between communities, linguists, anthropologists, and technologists offers a pathway toward sustainable language revitalization—laying the groundwork for culturally grounded innovation. By aligning AI development with cultural preservation, Southeast Asia can pioneer a model of human-centered linguistic AI—where technology not only understands language but also keeps it alive.


1. https://www.seameo.org/category/7/500
2. https://aclanthology.org/2022.acl-long.500.pdf
3. https://aclanthology.org/2023.ijcnlp-main.45/
4. https://www.sciencedirect.com/science/article/pii/S1319157821000069


Speaker: Dr. Nancy F. Chen

Institute for Infocomm Research (I²R) Singapore; 2025 ISCA Fellow

Session Chair: Prof. Satoshi Nakamura


Biography

Nancy F. Chen received her Ph.D. from MIT and Harvard in 2011, conducting research at MIT Lincoln Laboratory in multilingual speech processing. She currently leads research in conversational AI and natural language generation at the Institute for Infocomm Research (I²R), A*STAR, Singapore, with applications in education, healthcare, journalism, and defense. Her team’s speech evaluation technology was deployed by Singapore’s Ministry of Education to support home-based learning during the COVID-19 pandemic.
She led a cross-continental team on low-resource spoken language processing that ranked among the top performers in the NIST Open Keyword Search Evaluations (2013–2016). Dr. Chen has received numerous honors, including being named a 2025 ISCA Fellow, Singapore 100 Women in Tech (2021), Best Paper Awards at SIGDIAL and APSIPA, and the L'Oréal-UNESCO For Women in Science National Fellowship. She serves as an ISCA Board Member (2021–2025), served as Program Chair of ICLR 2023, and has held editorial roles in several top journals in speech and language processing.

Abstract

Unlike sight, which we can shut off with a blink, sound is inescapable. We are always listening, even when we wish not to. Hearing comes naturally, but understanding what we hear requires learning, knowledge, focus, and interpretation. Yet it is sound — be it the quiet drone of an air conditioner, a gentle whisper, or the distant rush of a waterfall — that anchors us to our physical surroundings, social connections, and the present moment.

In this talk, I will share our experience in modelling the audio signal in multimodal generative AI to drive translational impact across domain applications. In particular, we exploit the audio modality to strengthen contextualization, reasoning, and grounding. Cultural nuances and multilingual peculiarities add another layer of complexity to understanding verbal interactions. For example, our generative AI efforts in Singapore's National Multimodal Large Language Model Programme have led to MERaLiON (Multimodal Empathetic Reasoning and Learning In One Network), the first multimodal large language model developed for the Southeast Asian context. Such endeavors complement North American-centric models, making generative AI more widely deployable for localized needs. Another case in point is SingaKids AI Tutor, which enables young children to learn ethnic languages such as Malay, Mandarin, and Tamil. We are currently expanding these applications to embodied agentic AI, aviation, and healthcare.



© 2025 O-COCOSDA. All rights reserved.

Oriental Chapter of the International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques