The First Parallel Multilingual Corpus of Persian: Toward a Persian BLARK

Behrang Qasemizadeh, Saeed Rahimi, Behrooz Mahmoodi Bakhtiari

In this article, we introduce the first parallel corpus that aligns Persian with more than 10 European languages. The article describes the primary steps toward preparing a Basic Language Resources Kit (BLARK) for Persian. So far, we have proposed a morphosyntactic specification of Persian based on the EAGLES/MULTEXT guidelines and the specific resources of MULTEXT-East. The article first introduces the Persian language, with emphasis on its orthography and morphosyntactic features, and then proposes a new part-of-speech categorization and an orthography for Persian in digital environments. Finally, the corpus and its related statistics are analyzed.
