Jongin Kim
I am a first-year PhD student in Computer Science at Boston University, where I am advised by Prof. Derry Wijaya.
My research interest is Multilingual/Crosslingual NLP.
My goal is to develop NLP systems that work well across different languages, so that language technologies are easily accessible to the diverse and disadvantaged users who may need them most.
I have done some research in this direction (see publications below), and I plan to broaden my research topics further as I pursue my Ph.D.
Publications
2021
-
Analysis of Zero-Shot Crosslingual Learning between English and Korean for Named Entity Recognition
Jongin Kim, Nayoung Choi, Seunghyun S. Lim, Jungwhan Kim, Soojin Chung, Hyunsoo Woo, Min Song, and Jinho D. Choi
In Proceedings of the EMNLP Workshop on Multilingual Representation Learning (MRL), 2021
Anthology
Paper
Presentation
-
FantasyCoref: Coreference Resolution on Fantasy Literature Through Omniscient Writer’s Point of View
Sooyoun Han, Sumin Seo, Minji Kang, Jongin Kim, Nayoung Choi, Min Song, and Jinho D. Choi
In Proceedings of the EMNLP Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC), 2021
Anthology
Paper
Presentation
Future Research Directions
- Efficient ways of building datasets for low-resource languages
  - Ways to efficiently build monolingual labeled datasets for low-resource languages
    (particularly data containing knowledge that cannot be transferred from high-resource languages)
    - Active learning, human-in-the-loop machine learning
  - Ways to efficiently crowdsource/collect parallel texts
    (for multilingual neural machine translation)
    - Annotation using images or GIFs as pivots
    - Collecting parallel (comparable) texts from the web and filtering out noise
- Multilingual benchmark datasets that enable more comprehensive evaluation of multilingual models
  - Inclusion of typologically diverse languages
  - Inclusion of more challenging NLU/NLG tasks
- Novel methods for pre-training multilingual language models with more accurate alignment across languages (for better multilingual representations)
  - Multilingual subword tokenizers
  - Encouraging explicit attention between languages
  - Augmenting the model with linguistic knowledge (i.e., incorporating linguistic knowledge into the model)
- Exploring ways to improve cross-lingual transfer learning
  - Improving zero-shot cross-lingual transfer (direct model transfer; a minimal sketch follows this list)
    - Intermediate-task training
    - Overcoming word-order differences
  - Annotation projection
  - Applying cross-lingual transfer learning to other, more challenging tasks
    - Cross-lingual IR, including QA and text summarization
    - Multilingual neural machine translation
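As a rough illustration of the zero-shot cross-lingual transfer (direct model transfer) setting mentioned above, the sketch below fine-tunes a multilingual encoder on English NER only and then evaluates it directly on Korean, mirroring the English-Korean setup studied in the MRL 2021 paper. The WikiANN dataset, the xlm-roberta-base model, and all hyperparameters here are illustrative assumptions of mine, not choices taken from that work.

# Minimal sketch of zero-shot cross-lingual (direct model) transfer for NER:
# fine-tune a multilingual encoder on English data only, then evaluate on Korean.
# Dataset, model name, and hyperparameters are illustrative, not from the paper.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                          TrainingArguments, Trainer,
                          DataCollatorForTokenClassification)

model_name = "xlm-roberta-base"          # any multilingual encoder could be used
en = load_dataset("wikiann", "en")       # English NER data for training
ko = load_dataset("wikiann", "ko")       # Korean NER data for zero-shot evaluation
label_names = en["train"].features["ner_tags"].feature.names

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(
    model_name, num_labels=len(label_names))

def encode(batch):
    # Align word-level NER tags with subword tokens; non-first subwords get -100
    # so they are ignored by the loss.
    enc = tokenizer(batch["tokens"], is_split_into_words=True, truncation=True)
    enc["labels"] = []
    for i, tags in enumerate(batch["ner_tags"]):
        prev, aligned = None, []
        for wid in enc.word_ids(batch_index=i):
            aligned.append(-100 if wid is None or wid == prev else tags[wid])
            prev = wid
        enc["labels"].append(aligned)
    return enc

en_enc = en.map(encode, batched=True, remove_columns=en["train"].column_names)
ko_enc = ko.map(encode, batched=True, remove_columns=ko["test"].column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments("en_ner", num_train_epochs=3,
                           per_device_train_batch_size=32),
    train_dataset=en_enc["train"],
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()                           # trained on English only
print(trainer.evaluate(ko_enc["test"]))   # zero-shot evaluation on Korean

The point of the sketch is only the overall recipe: no Korean examples are seen during training, so any Korean performance comes entirely from the multilingual representations learned during pre-training.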