2012
DOI: 10.1007/978-3-642-28604-9_6

Building a Hierarchical Annotated Corpus of Urdu: The URDU.KON-TB Treebank

Cited by 11 publications (28 citation statements)
References 12 publications
“…Unfortunately, the document of the Urdu 5000 most frequently used words contained only counts of words. So, the probability value for each word listed in the document was calculated using (1). A sample of unigram and probability values is given in Fig.…”
Section: A N-gram Acquisition
Citation type: mentioning
confidence: 99%
“…As Urdu is an under-resourced language and no such precise or sufficient support available on today's machines for this language. This work presented here is a positive contribution for Urdu word prediction (UWP) in general and a helpful tool to boost up the typing needs of related handicapped persons.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…A comparative study made is detailed in Section 5. The URDU.KON-TB treebank having phrase structure (PS) and the hyper dependency structure (HDS) annotation with rich encoded information (Abbas, 2012; Abbas, 2014) is used for the training of the Urdu parser discussed in Section 3. The treebank has a semi-semantic POS (SSP) tag set, a semi-semantic syntactic (SSS) tag set and a functional (F) tag set.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Despite remarkable work on text categorization of English documents, the work on Urdu language text is still in infancy [32] though online Urdu text data is rapidly growing and necessitating the need to develop methods to organize and handle data. An important reason for this lack of interest is unavailability of publically accessible Urdu data collections [23,30].…”
Section: Introduction
Citation type: mentioning
confidence: 99%