E-mail: sunxm at bu dot edu
Hi, I am a Ph.D. student in Computer Science at Boston University (since Spring 2019), advised by Prof. Kate Saenko. In the summers of 2019 and 2020, I was honored to work with Rogerio Feris and Rameswar Panda at IBM Watson Health. Previously, I received my M.S. in ECE from the University of Michigan, Ann Arbor, and my B.Eng. in Communication Engineering from Beijing University of Posts and Telecommunications.
I am interested in deep learning and computer vision. My recent research focuses on multi-task learning and deep generative models.
CV / GitHub / Google Scholar
Preprints
- Ping Hu, Ximeng Sun, Kate Saenko, Stan Sclaroff. "Weakly-supervised Compositional Feature Aggregation for Few-shot Recognition", arXiv preprint arXiv:1906.04833, 2019.
- Huijuan Xu, Bingyi Kang, Ximeng Sun, Jiashi Feng, Kate Saenko, Trevor Darrell. "Similarity R-C3D for Few-shot Temporal Activity Detection", arXiv preprint arXiv:1812.10000, 2018.
Conferences and Journals
- Ximeng Sun, Rameswar Panda, Rogerio Feris, Kate Saenko. "AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning". Neural Information Processing Systems (NeurIPS), 2020.
pdf / project page / code
Overview: AdaShare is a novel and differentiable approach for efficient multi-task learning that learns the feature sharing pattern to achieve the best recognition accuracy, while restricting the memory footprint as much as possible. Our main idea is to learn the sharing pattern through a task-specific policy that selectively chooses which layers to execute for a given task in the multi-task network. In other words, we aim to obtain a single network for multi-task learning that supports separate execution paths for different tasks.
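The select-or-skip idea behind the task-specific policy can be sketched in a few lines. This is a toy illustration only, with made-up layer shapes and hand-fixed policies (in AdaShare the policies are learned jointly with the network, e.g. via Gumbel-Softmax sampling), not the paper's implementation:

```python
# Toy sketch of AdaShare-style task-specific layer selection.
# All names, shapes, and the fixed policies below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# A shared backbone: a list of simple "layers" (random linear maps here).
layers = [rng.standard_normal((8, 8)) * 0.1 for _ in range(4)]

# One binary select/skip policy per task over the shared layers.
# AdaShare learns these end-to-end; here they are fixed by hand.
policies = {
    "segmentation": [1, 1, 0, 1],  # this task skips layer 2
    "depth":        [1, 0, 1, 1],  # this task skips layer 1
}

def forward(x, task):
    """Run only the layers the task's policy selects; skipped layers
    pass the feature through unchanged (residual-style skip)."""
    for layer, use in zip(layers, policies[task]):
        if use:
            x = np.tanh(layer @ x)
    return x

x = rng.standard_normal(8)
out_seg = forward(x, "segmentation")
out_depth = forward(x, "depth")
# One set of shared weights, but each task takes its own execution path,
# so the two tasks end up with different features.
```

The point of the sketch is the execution model: a single parameter set with per-task paths, which is what keeps the memory footprint close to a single network.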
- Ximeng Sun, Huijuan Xu, Kate Saenko. "TwoStreamVAN: Improving Motion Modeling in Video Generation". IEEE Winter Conference on Applications of Computer Vision (WACV), 2020.
arXiv / demo / code / dataset
Overview: We propose TwoStreamVAN to output a realistic video given an input action label by progressively generating and fusing motion and content features at multiple scales using adaptive motion kernels. In addition, to better evaluate video generation models, we design a new synthetic human action dataset SynAction to bridge the difficulty gap between overcomplicated human action datasets and simple toy datasets.
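The motion/content fusion step can be illustrated with a toy example. This is heavily simplified (one 3x3 kernel for the whole feature map rather than per-location adaptive kernels, and plain NumPy instead of the model's learned generators); all names are illustrative, not the paper's code:

```python
# Toy sketch of fusing a content feature map with a motion-generated kernel.
import numpy as np

def apply_motion_kernel(content, kernel):
    """Convolve a 2-D content feature map with a 3x3 kernel produced by the
    motion stream, using zero ('same') padding."""
    h, w = content.shape
    padded = np.pad(content, 1)
    out = np.empty_like(content)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out

rng = np.random.default_rng(0)
content = rng.standard_normal((5, 5))

# With an identity kernel (1 at the center), fusion leaves the content
# unchanged; a non-trivial kernel shifts/warps the content features,
# which is how motion can be injected into the content stream.
identity = np.zeros((3, 3))
identity[1, 1] = 1.0
fused = apply_motion_kernel(content, identity)
```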
- Ximeng Sun, Ryan Szeto, Jason Corso. "A Temporally-Aware Interpolation Network for Video Frame Inpainting". Asian Conference on Computer Vision (ACCV), 2018.
paper / demo / code
- Ryan Szeto, Ximeng Sun, Kunyi Lu, Jason Corso. "A Temporally-Aware Interpolation Network for Video Frame Inpainting". IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019.
paper / code
Overview: We propose the first deep learning solution to video frame inpainting. We devise a pipeline composed of two modules: a bidirectional video prediction module and a temporally-aware frame interpolation module. Our experiments demonstrate that our approach produces more accurate and qualitatively satisfying results than a state-of-the-art video prediction method and many strong frame inpainting baselines.
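The two-module pipeline can be sketched with a toy stand-in: predict the missing middle frames forward from the past and backward from the future, then blend the two predictions with time-dependent weights. The "predictors" below just repeat the nearest known frame; the real system uses learned video prediction and interpolation networks, and all names here are illustrative:

```python
# Toy sketch of bidirectional prediction + temporally-aware interpolation.
import numpy as np

def predict_forward(past_frames, n):
    # Stand-in video predictor: naively repeat the last past frame.
    return [past_frames[-1].copy() for _ in range(n)]

def predict_backward(future_frames, n):
    # Same idea, running backward from the first known future frame.
    return [future_frames[0].copy() for _ in range(n)]

def inpaint(past_frames, future_frames, n_missing):
    fwd = predict_forward(past_frames, n_missing)
    bwd = predict_backward(future_frames, n_missing)
    filled = []
    for t in range(n_missing):
        # Temporally-aware blend: trust the forward prediction near the
        # past frames and the backward prediction near the future frames.
        w = (t + 1) / (n_missing + 1)
        filled.append((1 - w) * fwd[t] + w * bwd[t])
    return filled

past = [np.zeros(4), np.zeros(4)]
future = [np.ones(4), np.ones(4)]
middle = inpaint(past, future, 3)
# Blend weights 0.25, 0.5, 0.75 yield frames that interpolate from 0 to 1.
```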
- Xingchao Peng, Zijun Huang, Ximeng Sun, Kate Saenko. "Domain Agnostic Learning with Disentangled Representations". International Conference on Machine Learning (ICML), 2019.