ASAR 2018 Layout Analysis Competition Challenge: Physical layout analysis of scanned Arabic books

Overview

Physical layout analysis (PLA) of a scanned document is the task of segmenting the layout of the document image and identifying the class to which each image region belongs, without using text recognizers or human supervision. Successful physical layout analysis improves the performance of text recognizers and of other applications such as search and retrieval, word spotting, and PDF-to-Word conversion.

Scientific solutions devoted to physical layout analysis of Arabic documents are few and difficult to assess because of differences in methods, data, and evaluation metrics. Researchers cannot compare their work to the work of others due to the absence of publicly available research datasets that are suitably annotated for evaluating solutions to physical layout analysis tasks.

This competition will provide: (1) a benchmarking dataset for testing physical layout analysis solutions, which contains an annotated test set of scanned Arabic book page samples with a wide variety of content and appearance, and (2) a full evaluation scheme, in the form of code that computes a set of evaluation metrics for both analysis tasks (segmentation and classification) for quantitative evaluation, and that visualizes the analysis result for qualitative evaluation.

Tasks and Procedures

The task is to provide zone-level segmentation results for the benchmarking data, i.e., the bounding-box (BB) coordinates of each zone, together with the type of each zone: “text” or “non-text”.
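As an illustration of what a zone-level result could look like, here is a minimal sketch of one way to represent a zone in Python. The class name `Zone`, the field names, the corner-based coordinate convention, and the comma-separated output line are all assumptions made for this example; the submission format actually required by the competition is not specified in this text.

```python
from dataclasses import dataclass

ZONE_TYPES = ("text", "non-text")  # the two zone types named in the task


@dataclass(frozen=True)
class Zone:
    """One rectangular zone of a page (hypothetical representation)."""
    x_min: int
    y_min: int
    x_max: int  # assumed exclusive right edge
    y_max: int  # assumed exclusive bottom edge
    label: str

    def __post_init__(self):
        # Reject labels outside the two zone types and degenerate boxes.
        if self.label not in ZONE_TYPES:
            raise ValueError(f"unknown zone type: {self.label!r}")
        if self.x_max <= self.x_min or self.y_max <= self.y_min:
            raise ValueError("bounding box must have positive width and height")

    @property
    def area(self):
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


def format_zone(zone):
    """Serialize a zone as one comma-separated line (assumed format)."""
    return f"{zone.x_min},{zone.y_min},{zone.x_max},{zone.y_max},{zone.label}"


# Example page with one text zone and one non-text zone.
page = [
    Zone(50, 40, 950, 300, "text"),
    Zone(100, 350, 900, 800, "non-text"),
]
lines = [format_zone(zone) for zone in page]
```

Validating the label and the box at construction time keeps malformed zones out of a result file before it is submitted.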

Zone segmentation is evaluated using the metrics defined by Shafait et al. [1]: average black pixel rate, over-segmentation error, under-segmentation error, correct segmentation, missed-segmentation error, false-alarm error, and overall block error rate.
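To give an intuition for the error types listed above, the following is a simplified sketch that counts them from bounding boxes alone. It is not the definition from [1] and not the evaluation code offered by the competition: it measures overlap by box area rather than by black pixels, and the function names and the `threshold` value of 0.1 are assumptions chosen for this example. A ground-truth zone and a predicted zone are linked when their overlap is a significant fraction of either box.

```python
def box_area(box):
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min) * (y_max - y_min)


def overlap_area(a, b):
    """Area of the intersection of two (x_min, y_min, x_max, y_max) boxes."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def is_significant(truth_box, predicted_box, threshold):
    """True if the overlap covers at least `threshold` of either box."""
    overlap = overlap_area(truth_box, predicted_box)
    if overlap == 0:
        return False
    return (overlap >= threshold * box_area(truth_box)
            or overlap >= threshold * box_area(predicted_box))


def segmentation_counts(truth, predicted, threshold=0.1):
    """Count simplified segmentation outcomes (assumed, area-based variant)."""
    edges = [[is_significant(t, p, threshold) for p in predicted] for t in truth]
    truth_degree = [sum(row) for row in edges]
    predicted_degree = [sum(edges[i][j] for i in range(len(truth)))
                        for j in range(len(predicted))]
    # A one-to-one link between a truth zone and a predicted zone is correct.
    correct = sum(
        1
        for i, row in enumerate(edges)
        for j, edge in enumerate(row)
        if edge and truth_degree[i] == 1 and predicted_degree[j] == 1
    )
    return {
        "correct": correct,
        "over_segmentation": sum(1 for d in truth_degree if d > 1),
        "under_segmentation": sum(1 for d in predicted_degree if d > 1),
        "missed": sum(1 for d in truth_degree if d == 0),
        "false_alarm": sum(1 for d in predicted_degree if d == 0),
    }


# Three ground-truth zones; the second is split in two, the third is missed,
# and one predicted zone matches nothing.
truth = [(0, 0, 100, 50), (0, 60, 100, 110), (0, 120, 100, 170)]
predicted = [(0, 0, 100, 50), (0, 60, 50, 110), (50, 60, 100, 110),
             (200, 200, 250, 250)]
counts = segmentation_counts(truth, predicted)
```

In this example the first zone is segmented correctly, the second is over-segmented, the third is missed, and the isolated predicted box is a false alarm.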

Zone classification is evaluated using precision (Pr), recall (Rec), F1-measure (F1), and average class accuracy (Acc), at both the pixel and block levels.
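The classification measures can be sketched as follows for the pixel level, where every pixel carries a ground-truth label and a predicted label. This is a minimal sketch under stated assumptions, not the competition's evaluation code: the function names are hypothetical, and because the exact definition of average class accuracy is not given in this text, the example computes the mean of the per-class recall values as one plausible reading. The same functions apply at the block level if each list entry is a zone instead of a pixel.

```python
def class_scores(truth, predicted, positive):
    """Precision, recall and F1 for one class over paired label lists."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


def mean_class_recall(truth, predicted, classes=("text", "non-text")):
    """Mean of per-class recall (assumed reading of average class accuracy)."""
    recalls = [class_scores(truth, predicted, c)[1] for c in classes]
    return sum(recalls) / len(recalls)


# Ten pixels: six are truly text, four are truly non-text.
truth = ["text"] * 6 + ["non-text"] * 4
predicted = ["text"] * 5 + ["non-text"] + ["text"] + ["non-text"] * 3

text_pr, text_rec, text_f1 = class_scores(truth, predicted, "text")
non_text_pr, non_text_rec, non_text_f1 = class_scores(truth, predicted, "non-text")
acc = mean_class_recall(truth, predicted)
```

Reporting the scores per class, rather than a single overall accuracy, keeps a dominant class such as text from hiding poor performance on the rarer one.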

Resources and Submissions

BCE-Arabic benchmarking dataset [2]

Contact Info

References

Register for the ASAR 2018 Competition

I commit not to modify my system during the test stage, and I confirm that the reported final results are not manipulated.