Department of Software and Computing Systems

Lecture

Title: From Detection to Understanding - A Multimodal Approach to Machine Translationese Using Text and Speech and Interpretability Analysis
Talk
Presenter: Yongjian Chen (PhD student, University of Groningen)
Venue: Room I2/POLIVALENTE, Institutos Universitarios II
Date & time: 11:00, 27/05/2025
More information: https://ieeexplore.ieee.org/abstract/document/10887578
Contact person: Toral Ruiz, Antonio (antonio.toralr[at]ua.es)
Abstract:
This talk presents two interconnected studies that advance machine
translationese detection through multimodal approaches. The first work
(presented at ICASSP 2025) introduces a novel method that combines
speech features with text-based features to distinguish original from
machine-translated texts. Critically, this multimodal approach proves
especially effective when named entities are masked during training,
compensating for the performance drop that occurs when models must rely on
genuine translationese signals rather than spurious correlations like named
entities. The second study (currently under review) provides crucial insights
into why this compensation works through systematic attribution analysis using
Integrated Gradients. The analysis reveals that bimodal integration leads to
more balanced feature utilization: reducing the text modality's over-reliance
on named entities while moderating extreme attribution patterns in speech
features. Together, these studies establish both the practical effectiveness
of multimodal translationese detection and the theoretical understanding of
how cross-modal feature balancing enables more robust classification that
captures genuine linguistic deviations rather than superficial topical cues.
