2006
DOI: 10.1002/asi.20323

On the development of name search techniques for Arabic

Abstract: The need for effective identity matching systems has led to extensive research in the area of name search. For the most part, such work has been limited to English and other Latin-based languages. Consequently, algorithms such as Soundex and n-gram matching are of limited utility for languages such as Arabic, which has vastly different morphologic features that rely heavily on phonetic information. The dearth of work in this field is partly caused by the lack of standardized test data. Consequently, we have bu…

Cited by 18 publications
(6 citation statements)
References 19 publications
“…Extra characters {ؤ, ئ, ة} were converted ة to ه, ؤ to و, and ئ to ي in our manual mapping for the PIM project [11]. Reference [10] mentioned an exact Arabic Soundex table, and another Arabic Soundex table is called ASoundex, which has one character (ظ) in a different group [12]. Table II represents the Arabic table generated by our automated tool.…”
Section: B Mapping Results (mentioning)
confidence: 99%
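
The character conversion described in the statement above can be sketched in a few lines of Python. This is only an illustration of the three substitutions the quote names (ة to ه, ؤ to و, ئ to ي); the names `NORMALIZATION_MAP` and `normalize_extra_characters` are hypothetical and do not come from the citing paper or its tool.

```python
# Hypothetical sketch: apply only the three substitutions named in the quote.
# Code points are written as escapes so the mapping is unambiguous.
NORMALIZATION_MAP = str.maketrans({
    "\u0629": "\u0647",  # teh marbuta (ة) -> heh (ه)
    "\u0624": "\u0648",  # waw with hamza above (ؤ) -> waw (و)
    "\u0626": "\u064A",  # yeh with hamza above (ئ) -> yeh (ي)
})


def normalize_extra_characters(text: str) -> str:
    """Return text with the three 'extra' characters replaced.

    All other characters, Arabic or not, are left unchanged.
    """
    return text.translate(NORMALIZATION_MAP)
```

For example, a name ending in teh marbuta would come out ending in heh, so both spellings map to the same string before any phonetic code is computed.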
“…That is why, we decided to improve the last method by indexing the words by their pronunciation or more exactly by their representative sounds. For that, we used Soundex [14], a phonetic algorithm for indexing by sound. Words are encoded by taking advantage of their phonetic form.…”
Section: Indexing Words By Their Sounds (mentioning)
confidence: 99%
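
As an illustration of indexing words by sound, here is a minimal sketch of the classic American Soundex code for Latin-alphabet names. It is not the Arabic variant discussed in the cited article, nor the citing authors' implementation; the function name `soundex` and the table name `CODES` are chosen here for illustration.

```python
# Classic American Soundex: keep the first letter, encode the following
# consonants as digits, drop vowels, and pad or truncate to four characters.
CODES = {}
for letters, digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code, or "" if name has no A-Z letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    encoded = []
    prev = CODES.get(first, "")  # the first letter's code also suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate consonants with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            encoded.append(code)
        prev = code  # a vowel resets prev, so a repeated code after it is kept
    return (first + "".join(encoded) + "000")[:4]
```

With this sketch, "Robert" and "Rupert" both encode to R163, which is the behaviour that makes the code usable as an index key for similar-sounding names.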
“…There are more than four hundred million Arabic language speakers who live in 22 countries from the Arabian Peninsula across North Africa to the Atlantic Ocean [15]. Arabic is the main language of newspapers, TV news, books, and official works, even though there are different accents across countries and even within countries.…”
Section: Arabic-to-English Mapping (mentioning)
confidence: 99%