Lecture - Department of Software and Computing Systems

Title:	From Detection to Understanding - A Multimodal Approach to Machine Translationese Using Text and Speech and Interpretability Analysis
Presenter:	Yongjian Chen (PhD student, University of Groningen)
Venue:	Room I2/POLIVALENTE, Institutos Universitarios II
Date&time:	11:00 27/05/2025
More information:	https://ieeexplore.ieee.org/abstract/document/10887578
Contact person:	Toral Ruiz, Antonio (antonio.toralrua.es)
Abstract:	This talk presents two interconnected studies that advance machine translationese detection through multimodal approaches. The first work (presented at ICASSP 2025) introduces a novel method that combines speech features with text-based features to distinguish original from machine-translated texts. Critically, this multimodal approach proves especially effective when named entities are masked during training, compensating for the performance drop that occurs when models must rely on genuine translationese signals rather than spurious correlations like named entities. The second study (currently under review) provides crucial insights into why this compensation works through systematic attribution analysis using Integrated Gradients. The analysis reveals that bimodal integration leads to more balanced feature utilization: reducing the text modality's over-reliance on named entities while moderating extreme attribution patterns in speech features. Together, these studies establish both the practical effectiveness of multimodal translationese detection and the theoretical understanding of how cross-modal feature balancing enables more robust classification that captures genuine linguistic deviations rather than superficial topical cues.
