Welcome, my friend! 😊

My English name is Qitong (Andrew) Wang, and my Chinese name is 王琦童. I am from Shandong Province, China.

I am a second-year master's student and a summer intern in the Image and Video Computing (IVC) group in the Department of Computer Science at Boston University, advised by Professor Margrit Betke. I am also a member of Artificial Intelligence Research (AIR) at Boston University. Prior to that, I received my BEng from the Department of Computer Science and Engineering at Wuhan University of Technology.

My research currently focuses on computer vision and pattern recognition. Specifically, I am interested in pixel-based scene text detection and semantic-based sentence recognition using deep neural networks.

SA-Text: Simple but Accurate Detector for Text of Arbitrary Shapes

Description: We introduce SA-Text, a simple new framework for text detection that uses only a text region heatmap to detect scene text.

➢ Encoded the probability of text regions as a Gaussian heatmap in the pre-processing stage of the text detection branch.

➢ Trained a VGG-16-based text detection network, SA-Net, which outputs a single text region heatmap channel. The model addresses challenging problems in scene text detection, such as accurately detecting and separating multiple text instances that stick together, and it also detects small text in natural scene images effectively.

➢ Proposed a novel post-processing algorithm, Textfill, in the text detection branch to extract the bounding polygon of each text instance in the original image.

➢ Obtained F-scores of 85.6% on the Total-Text test set, 85.2% on SCUT-CTW1500, and 80.0% on MSRA-TD500 with single-scale inference, which are higher than those of most state-of-the-art text detection methods.

➢ Performed well on two challenging datasets, Total-Text (F-measure: 70.7%) and SCUT-CTW1500 (F-measure: 74.6%), even without fine-tuning on them, outperforming all state-of-the-art baselines that ran the same generalization experiments.

➢ Built a complete pipeline-based text spotting system by appending ASTER, a state-of-the-art text recognition framework, to SA-Text. The pipeline (SA-Text + ASTER) obtained an F-score of 75.7% on the Total-Text test set, outperforming all state-of-the-art baselines that ran the same text spotting experiments.
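
To illustrate the idea of encoding text regions as a Gaussian heatmap, here is a minimal NumPy sketch. It is not the SA-Text implementation: it assumes axis-aligned boxes rather than arbitrary-shaped polygons, and the function name `gaussian_heatmap`, the `sigma_scale` parameter, and the choice to merge overlapping regions with a pixel-wise maximum are all illustrative assumptions.

```python
import numpy as np

def gaussian_heatmap(height, width, boxes, sigma_scale=0.25):
    """Render axis-aligned boxes (x0, y0, x1, y1) as 2D Gaussians in [0, 1].

    Assumption: each box gets one Gaussian centered on the box, with a
    standard deviation proportional to the box width and height.
    """
    # Pixel coordinate grids: ys[r, c] == r and xs[r, c] == c.
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float32)
    heatmap = np.zeros((height, width), dtype=np.float32)
    for x0, y0, x1, y1 in boxes:
        cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        sx = max((x1 - x0) * sigma_scale, 1e-3)  # guard against zero-width boxes
        sy = max((y1 - y0) * sigma_scale, 1e-3)
        g = np.exp(-0.5 * (((xs - cx) / sx) ** 2 + ((ys - cy) / sy) ** 2))
        # Hypothetical merge rule: keep the strongest response per pixel.
        heatmap = np.maximum(heatmap, g)
    return heatmap

# Two made-up text boxes on a 64 x 128 image.
heatmap = gaussian_heatmap(64, 128, [(10, 20, 50, 40), (70, 10, 120, 30)])
```

The heatmap peaks at 1 at each box center and decays toward the box edges, so a network trained to regress it is pushed to respond weakly in the gaps between neighboring text instances.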

Semantic-based Sentence Recognition in Images with Multimodal Deep Learning

Description: Assuming that context from document images can improve the accuracy of text recognition, we proposed a novel “semantic-based text recognition” (SSR) deep learning model that reads text in images with the help of context understanding.

➢ Adopted WOGA to arrange isolated words in an image into the correct logical order and then group them into phrases or sentences that belong together.

➢ Applied a seq2seq model to correct misspellings using semantic information from the sentences in text images, effectively improving text recognition accuracy.

➢ Created two new labeled datasets: the Interior Design Dataset (IDD) and the Text-containing Protest Image Dataset (TPID).

➢ Built an end-to-end text spotting system by adding a state-of-the-art text detector before SSR.

Development of an Information Retrieval System for Law Documents (Undergraduate Graduation Project in the IDEA Group at Wuhan University of Technology)

Description: Developed an information retrieval system that returns the law documents most relevant to given keywords, using IR and machine learning techniques.

➢ Successfully scraped thousands of law documents from the Internet using the Selenium package for Python.

➢ Preprocessed thousands of law documents using document vectorization techniques.

➢ Developed the information retrieval system using a BP (backpropagation) neural network trained on law document vectors, so that the system ranks the most relevant documents for different keywords.
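
The vectorize-then-rank idea behind this project can be sketched in a few lines of standard-library Python. Note that this sketch swaps in TF-IDF vectors with cosine similarity as the ranking function, not the BP neural network used in the project. The function names, the whitespace tokenizer, and the three toy documents are all hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for the demo.
    return text.lower().split()

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF so that no term gets a zero or negative weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(toks).items()}
               for toks in tokenized]
    return vectors, idf

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query, docs):
    """Return document indices sorted from most to least similar to the query."""
    vectors, idf = tfidf_vectors(docs)
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scores = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    return [i for _, i in sorted(scores, key=lambda p: (-p[0], p[1]))]

# Toy corpus, made up for illustration.
docs = [
    "contract law governs agreements between parties",
    "criminal law defines offences and penalties",
    "labor contract disputes and employee rights",
]
ranking = rank("contract disputes", docs)
```

Here the third document matches both query terms, the first matches one, and the second matches none, so `ranking` lists them in that order. A learned ranker such as a neural network would replace `cosine` with a trained scoring function over the same document vectors.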


Qitong Wang, Yi Zheng, and Margrit Betke. "SA-Text: Simple but Accurate Detector for Text of Arbitrary Shapes", arXiv:1911.07046, 10 pages, November 2019. [paper]

Yi Zheng, Qitong Wang, and Margrit Betke. "Deep Neural Network for Semantic-based Text Recognition in Images", arXiv:1908.01403, 10 pages, August 2019. [paper]