Identification of Naturally Occurring Numerical Expressions in Arabic

Title: Identification of Naturally Occurring Numerical Expressions in Arabic
Publication Type: Conference Paper
Year of Publication: 2008
Authors: Habash, Nizar, and Ryan Roth
Conference Name: Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC-08)
Date Published: 05/2008
Publisher: European Language Resources Association (ELRA)
Conference Location: Marrakech, Morocco
ISBN Number: 2-9517408-4-0
Abstract

In this paper, we define the task of Number Identification in natural context. We present and validate a language-independent semi-automatic approach to quickly building a gold standard for evaluating number identification systems by exploiting hand-aligned parallel data. We also present and extensively evaluate a robust rule-based system for number identification in natural context for Arabic for a variety of number formats and types. The system is shown to have strong performance, achieving, on a blind test, a 94.8% F-score for the task of correctly identifying number expression spans in natural text, and a 92.1% F-score for the task of correctly determining the core numerical value.
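
For readers unfamiliar with the metric, the following is a minimal sketch of how a span-level F-score of the kind reported above can be computed. It assumes exact-match scoring over (start, end) token spans; the abstract does not state the paper's matching criterion, and the function name and example spans are hypothetical.

```python
def span_f_score(gold, predicted):
    """Return (precision, recall, F1) for exact-match span scoring.

    Assumption: a predicted span counts as correct only if its
    (start, end) pair is identical to a gold span.
    """
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical token spans, not data from the paper.
gold = [(0, 2), (5, 6), (9, 12)]
predicted = [(0, 2), (5, 7), (9, 12)]
precision, recall, f1 = span_f_score(gold, predicted)
```

Here two of the three predicted spans match a gold span exactly, so precision, recall, and F1 all come out to 2/3.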

URL: http://www.lrec-conf.org/proceedings/lrec2008/