Shafiq Ur Rehman Khan scite author profile

Shafiq Ur Rehman Khan

4 Publications

20 Citation Statements Received

21 Citation Statements Given

Affiliations

Capital University of Science and Technology

Publications

Temporal specificity-based text classification for information retrieval

Khan¹, Islam², Aleem³ et al. 2018

Turk J Elec Eng & Comp Sci

Time is an important aspect of temporal information retrieval (TIR), a subfield of information retrieval (IR). Web search engines such as Google or Bing are common examples of IR systems. An important component of a search engine is news retrieval, where users express their information needs as temporal queries and are usually interested in news documents that focus on a particular time period. Existing search engines rarely fulfill these temporal information requirements because they ignore the temporal information available in the content of news documents, also known as the document focus time. Furthermore, a news document often contains information related to multiple time periods, which makes identifying its focus time a challenging task. It is therefore necessary to classify news documents by temporal specificity before the temporal information can be used in the retrieval process. In this study, we formulate the temporal specificity problem as a time-based classification task that assigns news documents to three temporal classes: high, medium, and low temporal specificity. Two classification approaches are proposed: a rule-based approach and a temporal specificity score (TSS)-based approach. In the former, news documents are classified using a defined set of rules based on temporal features. The latter classifies news documents based on a TSS computed from the temporal features. The results of the proposed techniques are compared with four machine learning classification algorithms: Bayes net, support vector machine, random forest, and decision tree. The results show that the proposed rule-based classifier outperforms the four algorithms, achieving 82% accuracy, whereas TSS-based classification achieves 77% accuracy.

Comparative Analysis of Information Retrieval Models on Quran Dataset in Cross-Language Information Retrieval Systems

Taan, Khan², Raza³ et al. 2021

IEEE Access

Section-Based Focus Time Estimation of News Articles

et al. 2018

Event-Dataset: Temporal information retrieval and text classification dataset

Khan, Islam 2019

Data in Brief

Recently, Temporal Information Retrieval (TIR) has attracted considerable attention from the information retrieval community. TIR exploits temporal dynamics in the information retrieval process and harnesses both textual relevance and temporal relevance to fulfill the temporal information requirements of a user (Ur Rehman Khan et al., 2018). The focus time of a document is an important temporal aspect, defined as the time to which the content of the document refers (Jatowt et al., 2015; Jatowt et al., 2013; Morbidoni et al., 2018; Khan et al., 2018). To the best of our knowledge, there is no publicly available standard benchmark dataset that can comprehensively evaluate the performance of focus time assessment strategies. Considering this, we have produced the Event-dataset, which comprises 35 queries and a set of news articles for each query. Formally, the dataset C consists of a query set and, for each query, a set of news articles that is divided into a set of relevant documents and a set of non-relevant documents. Each query in the dataset represents a popular event. To annotate the articles as relevant or non-relevant, we employed a user-study-based evaluation method in which a group of postgraduate students manually assigned the articles to these categories. We believe that such a dataset gives information retrieval researchers a benchmark for evaluating focus time assessment methods specifically and information retrieval methods generally.
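The structure described in this abstract (a query set in which each query has relevant and non-relevant news articles) could be represented roughly as in the sketch below. This is only an illustration under stated assumptions: the class name `QueryRecord`, the helper `precision`, the article ids, and the query text are all hypothetical and do not reflect the dataset's actual file format or the authors' evaluation code.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class QueryRecord:
    """One event query with its annotated articles (hypothetical layout)."""

    query: str
    relevant: set[str] = field(default_factory=set)      # ids annotated relevant
    non_relevant: set[str] = field(default_factory=set)  # ids annotated non-relevant

    def articles(self) -> set[str]:
        """All article ids attached to this query."""
        return self.relevant | self.non_relevant


def precision(record: QueryRecord, retrieved: list[str]) -> float:
    """Fraction of retrieved article ids that are annotated as relevant."""
    if not retrieved:
        return 0.0
    hits = sum(1 for doc_id in retrieved if doc_id in record.relevant)
    return hits / len(retrieved)


# Toy data, invented for illustration only.
record = QueryRecord(
    query="example event",
    relevant={"a1", "a2"},
    non_relevant={"a3"},
)
score = precision(record, ["a1", "a3"])
```

With the relevance annotations stored per query, a retrieval or focus time method can be scored by comparing its ranked output against the `relevant` set, as `precision` does here for a single query.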
