Direct Neural Machine Translation with Task-level Mixture-of-Experts models
Isidora Chara Tourni,
Subhajit Naskar
Under Submission, 2023
|
Analyzing Factual Knowledge in Multilingual pretrained Language Models
Jay Gala*,
Zeina Saadeddin*,
Isidora Chara Tourni
Under Submission, 2023
|
Relevance-guided Low-Resource Neural Machine Translation
Isidora Chara Tourni,
Derry Tanti Wijaya
Under Submission, 2023
|
Low-Resource Machine Translation for Low-Resource Languages: Leveraging Comparable Data, Code-Switching and Compute Resources
Isidora Chara Tourni*,
Garry Kuwanto*,
Afra Feyza Akyürek*,
Siyang Li*,
Alex Jones,
Derry Tanti Wijaya
arXiv preprint, 2021
abstract
We conduct an empirical study of unsupervised neural machine translation (NMT) for truly low-resource languages, exploring the case when both parallel training data and compute resources are lacking, reflecting the reality of most of the world's languages and the researchers working on them. We propose a simple and scalable method to improve unsupervised NMT, showing how adding comparable data mined using a bilingual dictionary, along with modest additional compute resources to train the model, can significantly improve its performance. We also demonstrate how using the dictionary to code-switch monolingual data to create more comparable data can further improve performance. With this weak supervision, our best method achieves BLEU scores that improve over supervised results for English→Gujarati (+18.88), English→Kazakh (+5.84), and English→Somali (+1.16), showing the promise of weakly supervised NMT for the many of the world's low-resource languages where compute resources are modest. To the best of our knowledge, our work is the first to quantitatively showcase the impact of different modest compute resources in low-resource NMT.
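The dictionary-based code-switching step described in this abstract lends itself to a short illustration. Below is a minimal sketch assuming a toy English-Gujarati dictionary and a fixed replacement probability; none of these names, entries, or settings are the paper's actual resources, and the real pipeline mines comparable data at scale:

```python
import random

# Toy English-Gujarati dictionary (illustrative only; the paper uses
# a real bilingual dictionary to mine comparable data).
EN_GU_DICT = {
    "water": "પાણી",
    "house": "ઘર",
    "school": "શાળા",
}

def code_switch(sentence: str, dictionary: dict, p: float = 0.3) -> str:
    """Replace each word found in the dictionary with its target-language
    translation with probability p, producing a code-switched sentence
    that can serve as additional weakly comparable training data."""
    out = []
    for token in sentence.split():
        if token.lower() in dictionary and random.random() < p:
            out.append(dictionary[token.lower()])
        else:
            out.append(token)
    return " ".join(out)

print(code_switch("the water near the school", EN_GU_DICT, p=0.5))
```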
|
Detecting Frames in News Headlines and Lead Images in U.S. Gun Violence Coverage
Isidora Chara Tourni,
Taufiq Daryanto,
Fabian Zhafransyah,
Hengchang Hu,
Edward Edberg Halim,
Boqi Chen,
Sha Lai,
Mona Jalal,
Margrit Betke,
Lei Guo,
Prakash Ishwar,
Derry Tanti Wijaya
Findings of the Association for Computational Linguistics: EMNLP 2021.
Presented at FEVER: The Fourth Workshop on Fact Extraction and VERification
video/dataset/abstract
News media structure their reporting of events or issues using certain perspectives. When describing an incident involving gun violence, for example, some journalists may focus on mental health or gun regulation, while others may emphasize the discussion of gun rights. Such perspectives are called “frames” in communication research. We study, for the first time, the value of combining lead images and their contextual information with text to identify the frame of a given news article. We observe that using multiple modes of information (article- and image-derived features) improves prediction of news frames over any single mode of information when the images are relevant to the frames of the headlines. We also observe that frame image relevance is related to the ease of conveying frames via images, which we call frame concreteness. Additionally, we release the first multimodal news framing dataset related to gun violence in the U.S., curated and annotated by communication researchers. The dataset will allow researchers to further examine the use of multiple information modalities for studying media framing.
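As a rough sketch of the multimodal idea described above, one simple way to combine article- and image-derived features is late fusion by concatenation into a linear classifier. The encoders, feature dimensions, and number of frames below are illustrative assumptions, not the paper's actual model:

```python
import numpy as np

def frame_probabilities(text_feat: np.ndarray, image_feat: np.ndarray,
                        W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Late fusion by concatenation: a linear classifier over combined
    headline and lead-image features, returning a distribution over frames."""
    fused = np.concatenate([text_feat, image_feat])
    logits = W @ fused + b
    z = logits - logits.max()              # numerically stable softmax
    return np.exp(z) / np.exp(z).sum()

# Illustrative dimensions: a 768-d text feature, a 512-d image feature,
# and 9 candidate frames (all assumptions, not the paper's setup).
rng = np.random.default_rng(0)
text_feat = rng.normal(size=768)
image_feat = rng.normal(size=512)
W = rng.normal(size=(9, 768 + 512)) * 0.01
b = np.zeros(9)
print(frame_probabilities(text_feat, image_feat, W, b).round(3))
```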
|
Ani-GIFs: A Benchmark Dataset for Domain Generalization of Action Recognition from GIFs
Isidora Chara Tourni*,
Shoumik Majumdar*,
Shubhangi Jain*,
Arsenii Mustafin*,
Diala Lteif,
Stan Sclaroff,
Kate Saenko,
Sarah Adel Bargal
Frontiers in Computer Science, 2022
abstract
Deep learning models perform remarkably well on a task under the assumption that test data comes from the same distribution as the training data. However, this assumption is generally violated in practice, mainly due to differences in data acquisition techniques and the lack of information about the underlying source of new data. Domain generalization targets the ability to generalize to test data from an unseen domain; while this problem is well studied for images, such studies are significantly lacking for spatiotemporal visual content such as videos and GIFs. This is due to (1) the challenging nature of misalignment of temporal features and the varying appearance/motion of actors and actions across domains, and (2) spatiotemporal datasets being laborious to collect and annotate for multiple domains. We collect and present Ani-GIFs, the first synthetic video dataset of animated GIFs for domain generalization, which we use to study the domain gap of videos vs. GIFs, and animated vs. real GIFs, for the task of action recognition. We provide a training and testing setting for Ani-GIFs, and extend two domain generalization baseline approaches, based on data augmentation and explainability, to the spatiotemporal domain to catalyze research in this direction.
|
Cultural and Geographical Influences on Image Translatability of Words across Languages
Nikzad Khani,
Isidora Chara Tourni,
Mohammad Sadegh Rasooli,
Chris Callison-Burch,
Derry Tanti Wijaya
NAACL, 2021
video/code/abstract
Neural Machine Translation (NMT) models have been observed to produce poor translations when there are few or no parallel sentences to train on. In the absence of parallel data, several approaches have turned to images to learn translations. Since images of words, e.g., horse, may be unchanged across languages, translations can be identified via images associated with words in different languages that have a high degree of visual similarity. However, translating via images has been shown to improve upon text-only models only marginally. To better understand when images are useful for translation, we study the image translatability of words, i.e., how well words can be translated via images, by measuring intra- and inter-cluster similarities of image representations of words that are translations of each other. We find that images of words are not always invariant across languages, and that language pairs with a shared culture, meaning a common language family, ethnicity, or religion, have improved image translatability (i.e., more similar images for similar words) compared to pairs without, regardless of their geographic proximity. In addition, in line with previous work showing that images help more in translating concrete words, we find that concrete words have improved image translatability compared to abstract ones.
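The intra- and inter-cluster measurement described in this abstract can be sketched in a few lines of NumPy. The random vectors below stand in for real image embeddings, and the specific metric (mean pairwise cosine similarity) is an assumption for illustration rather than the paper's exact formulation:

```python
import numpy as np

def mean_pairwise_cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Mean cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float((a @ b.T).mean())

# Random stand-ins for image embeddings of a word and its translation
# (in practice these would come from a pretrained image encoder).
rng = np.random.default_rng(0)
en_horse = rng.normal(size=(10, 512))    # images retrieved for "horse"
fr_cheval = rng.normal(size=(10, 512))   # images retrieved for "cheval"

intra = mean_pairwise_cosine(en_horse, en_horse)    # within-language coherence
inter = mean_pairwise_cosine(en_horse, fr_cheval)   # cross-language similarity
# Inter-cluster similarity approaching intra-cluster similarity suggests
# the word pair is highly image-translatable.
print(f"intra={intra:.3f}, inter={inter:.3f}")
```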
|