Reuben Tan

I am currently a PhD student at Boston University. I am fortunate to be advised by Professors Kate Saenko and Bryan Plummer.



My primary research interests lie in applications of machine learning in vision-language tasks and video understanding as well as representation learning.

Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News
Reuben Tan, Bryan A. Plummer, Kate Saenko
EMNLP, 2020
arXiv / code / bibtex

In this paper, we introduce the more realistic and challenging task of defending against machine-generated news that also includes images and captions. To identify the weaknesses that adversaries can exploit, we create the NeuralNews dataset, composed of four types of generated articles, and conduct a series of human user studies based on this dataset. In addition to the insights gleaned from these user studies, we provide a relatively effective approach based on detecting visual-semantic inconsistencies, which will serve as a first line of defense and a useful reference for future work in defending against machine-generated disinformation.

LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval
Reuben Tan, Huijuan Xu, Kate Saenko, Bryan A. Plummer
WACV, 2021
arXiv / code / bibtex

We propose an efficient Latent Graph Co-Attention Network (LoGAN) that exploits fine-grained frame-by-word interactions to jointly reason about the correspondences between all possible pairs of frames, providing context cues absent in prior work.

Learning Similarity Conditions Without Explicit Supervision
Reuben Tan, Mariya I. Vasileva, Kate Saenko, Bryan A. Plummer
ICCV, 2019
arXiv / code / bibtex

Learning image representations for different similarity conditions and their contributions as a latent variable allows the neural network to generalize to unseen attributes.

Language Features Matter: Effective Language Representations for Vision-Language Tasks
Andrea Burns, Reuben Tan, Kate Saenko, Stan Sclaroff, Bryan A. Plummer
ICCV, 2019
arXiv / code / bibtex

Based on extensive experiments that compare various word embeddings and language models, we present a set of best practices for incorporating the language component in vision-language tasks.