Computer Science, Boston University
I am a PhD student in the computer science department at Boston University, advised by Professor Kate Saenko.
My research focuses on deep learning, computer vision and natural language processing, particularly in the areas of visual question answering, video language description and activity detection.
Before coming to the US, I received a Master’s degree from the Graduate University of Chinese Academy of Sciences in 2012, advised by Professor Hua Yu, and a Bachelor’s degree in computer science from Hefei University of Technology in 2009. I conducted my PhD studies at the CS Department of UMass Lowell for three years before transferring to the CS Department of Boston University to continue my PhD studies with Professor Kate Saenko.
Text-to-Clip Video Retrieval with Early Fusion and Re-Captioning.
Huijuan Xu, Kun He, Leonid Sigal, Stan Sclaroff, Kate Saenko.
Joint Event Detection and Description in Continuous Video Streams.
Huijuan Xu, Boyang Li, Vasili Ramanishka, Leonid Sigal, Kate Saenko.
R-C3D: Region Convolutional 3D Network for Temporal Activity Detection.
Huijuan Xu, Abir Das, Kate Saenko.
In International Conference on Computer Vision (ICCV), 2017.
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering.
Huijuan Xu, Kate Saenko.
In European Conference on Computer Vision (ECCV), 2016.
Translating Videos to Natural Language Using Deep Recurrent Neural Networks.
Subhashini Venugopalan, Huijuan Xu, Jeff Donahue, Marcus Rohrbach, Raymond Mooney, Kate Saenko.
In North American Chapter of the Association for Computational Linguistics (NAACL), 2015.
Huijuan Xu and Kate Saenko. Dual Attention Network for Visual Question Answering. ECCV 2016 2nd Workshop on Storytelling with Images and Videos (VisStory), 2016.
Huijuan Xu and Kate Saenko. Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering. VQA Challenge Workshop at CVPR 2016.
Huijuan Xu, Subhashini Venugopalan, Vasili Ramanishka, Marcus Rohrbach and Kate Saenko. A Multi-scale Multiple Instance Video Description Network. ICCV 2015 Workshop on Closing the Loop Between Vision and Language (CLVL), 2015. (Abstract)
Teaching Assistant, UMass Lowell (Spring 2015)
91.422/545: Machine Learning
Lab Instructor, UMass Lowell (Fall 2013, Spring 2014)
91.103: Computing I Lab
Disney Research, Pittsburgh, Summer 2017
Reviewer for International Journal of Computer Vision (IJCV), ICCV 2017,