2002
DOI: 10.7202/002564ar

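Pour une lexicomatique de l'arabe : l'unité lexicale simple et l'inventaire fini des spécificateurs du domaine du mot [Toward a lexicomatics of Arabic: the simple lexical unit and the finite inventory of word-domain specifiers]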

Abstract: This article outlines: 1) the general structure of the lexical database that is the sine qua non of any operational automatic processing of Arabic, and whose design is based on a declarative approach to morphology; 2) the methodology for building this database, namely the design and implementation of interfaces for entering lexical data. These interfaces must …

Cited by 11 publications
(12 citation statements)
references
References 2 publications
“…According to [18] a substantial subset of Arabic wordforms consist of a stem of the <root + pattern> type, to which a finite set of compatible proclitics and prefixes are agglutinated to the left and a finite set of compatible suffixes and enclitics are agglutinated to the right. This renders the isolation of the stem very difficult.…”
Section: B the Complex Structure Of The Arabic Word-form
mentioning
confidence: 99%
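The word-form structure described in this statement (proclitics and prefixes to the left of a stem, suffixes and enclitics to the right) can be illustrated with a minimal sketch. The affix inventories below are tiny, hypothetical lists in Latin transliteration chosen only for the example; they are not the inventories of the cited article or of DIINAR.1, and the function `candidate_segmentations` is an assumed name. The sketch enumerates every split of a word that is compatible with the toy inventories, which suggests why isolating the stem is hard: one surface form usually admits several candidates.

```python
from itertools import product

# Toy, hypothetical inventories (Latin transliteration); "" means the slot is empty.
PROCLITICS = ["", "w", "s", "ws"]
PREFIXES = ["", "y", "t"]
SUFFIXES = ["", "wn", "At"]
ENCLITICS = ["", "hA", "h"]


def candidate_segmentations(word, min_stem=2):
    """Return every (proclitic, prefix, stem, suffix, enclitic) split of `word`
    that is compatible with the toy inventories above."""
    results = []
    for pcl, pre, suf, ecl in product(PROCLITICS, PREFIXES, SUFFIXES, ENCLITICS):
        left = pcl + pre    # material agglutinated to the left of the stem
        right = suf + ecl   # material agglutinated to the right of the stem
        # Skip combinations that would leave a stem shorter than min_stem.
        if len(left) + len(right) + min_stem > len(word):
            continue
        if word.startswith(left) and word.endswith(right):
            stem = word[len(left):len(word) - len(right)]
            results.append((pcl, pre, stem, suf, ecl))
    return results


if __name__ == "__main__":
    for candidate in candidate_segmentations("wsyktbwnhA"):
        print(candidate)
```

The intended split `("ws", "y", "ktb", "wn", "hA")` appears among the candidates, alongside others in which affix material is left inside the stem. A real analyzer would also need compatibility rules between the slots and a lexicon of stems to filter these candidates.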
“…It presents serious challenges and obstacles to the task of automatic processing and classification that should be indispensably overcome. In what follows we present a brief description of the main real obstacles encountered while working with Arabic documents and not faced while working documents written in languages using Latin characters and that are overcome by the automatic segmentation of each word using the analyzer and segmenter of our computerized dictionary DIINAR.1 [16][17] [18][19] [20]. The resulting subsegments of the words are then used as features in the dataset.…”
Section: The Dataset
mentioning
confidence: 99%
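As a rough illustration of the last sentence of this statement, the sketch below turns already-segmented words into per-document feature counts. It assumes the segmentation has been done upstream. The input layout (each document is a list of words, each word a tuple of subsegments) and the function name `segment_features` are hypothetical, not taken from the citing paper or from DIINAR.1.

```python
from collections import Counter


def segment_features(segmented_docs):
    """Count subsegments per document; empty slots ("") are ignored.

    segmented_docs: list of documents, each a list of words,
    each word a tuple of subsegment strings (hypothetical layout).
    """
    return [
        Counter(seg for word in doc for seg in word if seg)
        for doc in segmented_docs
    ]


if __name__ == "__main__":
    docs = [[("w", "ktb", ""), ("ktb", "hA")]]
    print(segment_features(docs))
```

Counting subsegments rather than whole word-forms lets inflected or cliticized variants of the same stem contribute to a shared feature.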
“…; - the enclitic (ECL), understood essentially as a suffixed pronoun (ABBÈS 2004). The idea behind DIINAR clearly draws on the thinking of Joseph Dichy (1997), who simplified David Cohen's concept of mot maximal - mot minimal to the form Fe -Fn -Fe and established a five-element conception of the Arabic word-form, which Zrigui later appears to have used. This conception has its drawbacks, since a good stemmer analyzing Arabic text, especially unvocalized text, generates a large number of potential roots that either do not apply in the specific text or do not occur in Arabic at all.…”
Section: Podstawowe Schematy Analizy [Basic Analysis Schemes]
unclassified
“…Interlinear glosses follow the standard set of parsing conventions and grammatical abbreviations explained in: "The Leipzig Glossing Rules: Conventions for interlinear morpheme-by-morpheme glosses" February 2008. Hyphen marks segmentable morphemes and an equal sign marks clitic boundaries, both in transliterations and in the interlinear gloss 4 Dichy J. (1997)…”
mentioning
confidence: 99%