Announcement - Departament de Llenguatges i Sistemes Informàtics

Title:	From Detection to Understanding - A Multimodal Approach to Machine Translationese Using Text and Speech and Interpretability Analysis
Type:	talk
By:	Yongjian Chen (PhD student, University of Groningen)
Location:	room I2/POLIVALENT, Instituts Universitaris II
Date/time:	11.00 27/05/2025
More information:	https://ieeexplore.ieee.org/abstract/document/10887578
Contact person:	Toral Ruiz, Antonio (antonio.toralrua.es)
Abstract:	This talk presents two interconnected studies that advance machine translationese detection through multimodal approaches. The first work (presented at ICASSP 2025) introduces a novel method that combines speech features with text-based features to distinguish original from machine-translated texts. Critically, this multimodal approach proves especially effective when named entities are masked during training, compensating for the performance drop that occurs when models must rely on genuine translationese signals rather than spurious correlations like named entities. The second study (currently under review) provides crucial insights into why this compensation works through systematic attribution analysis using Integrated Gradients. The analysis reveals that bimodal integration leads to more balanced feature utilization: it reduces the text modality's over-reliance on named entities while moderating extreme attribution patterns in speech features. Together, these studies establish both the practical effectiveness of multimodal translationese detection and a theoretical understanding of how cross-modal feature balancing enables more robust classification that captures genuine linguistic deviations rather than superficial topical cues.
