2010
DOI: 10.1145/1838751.1838754

An Information-Extraction System for Urdu---A Resource-Poor Language

Abstract: There has been an increase in the amount of multilingual text on the Internet due to the proliferation of news sources and blogs. The Urdu language, in particular, has experienced explosive growth on the Web. Text mining for information discovery, which includes tasks such as identifying topics, relationships and events, and sentiment analysis, requires sophisticated natural language processing (NLP). NLP systems begin with modules such as word segmentation, part-of-speech tagging, and morphological analysis a…

Cited by 50 publications (46 citation statements)
References 43 publications
“…Figure 1 illustrates the numbers of percentage for each language beside English used in evaluated studies. For instance, studies of [5] [17], Arabic [11], Vietnamese [33], and Akkadian [35]. However, other studies [13] [29][22] applied sentence segmentation to analyse Japanese, Greek and Malay languages accordingly.…”
Section: Review Methods | mentioning
confidence: 99%
“…However, POS tagger is a basic tool for various applications in NLP field such as information retrieval (IR), information extraction (IE), etc. Moreover, POS tagger is necessary as a tool to build up any language corpus [19,20].…”
Section: Part Of Speech (POS) | mentioning
confidence: 99%
“…Besides the use of English language in the research studies of subjectivity classification, there are several research works in the Arabic language [23] and the Urdu language [24]. [23] used support vector machine (SVM) as supervised machine learning for the subjectivity and sentiment analysis.…”
Section: Subjectivity Classification | mentioning
confidence: 99%
“…[23] used support vector machine (SVM) as supervised machine learning for the subjectivity and sentiment analysis. As well, [24] used techniques such as bootstrap learning and resource sharing from a syntactically similar language.…”
Section: Subjectivity Classification | mentioning
confidence: 99%