Maharshi R. Pandya scite author profile

Maharshi R. Pandya

3 Publications

3 Citation Statements Received

4 Citation Statements Given


Affiliations

St. Luke's University Health Network, Illinois College, University of North Carolina at Chapel Hill

Publications


Method for Customizable Automated Tagging: Addressing the problem of over-tagging and under-tagging text documents

Pandya

Reyes

Vanderheyden

2020


Using author-provided tags to predict tags for a new document often results in the overgeneration of tags. When the author provides no tags, documents are severely under-tagged. In this paper, we present a method to generate a universal set of tags that can be applied widely to a large document corpus. Using IBM Watson's NLU service, we first collect keywords and phrases, which we call "complex document tags", from 8,854 popular reports in the corpus. We apply an LDA model to these complex document tags to generate a set of 765 unique "simple tags". To apply the tags to a corpus of documents, we run each document through IBM Watson NLU and assign the appropriate simple tags. Using only 765 simple tags, our method allows us to tag 87,397 of the 88,583 documents in the corpus with at least one tag. About 92.1% of the 87,397 tagged documents are also determined to be sufficiently tagged. Finally, we discuss the performance of our method and its limitations.
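The coverage figures quoted in the abstract can be checked with a little arithmetic. The following is a minimal sketch, not the authors' code: the constant and function names are hypothetical, and the count of sufficiently tagged documents is only an approximation because the abstract reports that share to one decimal place.

```python
# Figures quoted in the abstract; all names below are illustrative assumptions.
TOTAL_DOCS = 88_583        # documents in the corpus
TAGGED_DOCS = 87_397       # documents that received at least one simple tag
SUFFICIENT_SHARE = 0.921   # "about 92.1%" of the tagged documents


def coverage(tagged: int, total: int) -> float:
    """Return the fraction of the corpus that received at least one tag."""
    if total <= 0:
        raise ValueError("total must be positive")
    return tagged / total


untagged = TOTAL_DOCS - TAGGED_DOCS
tagged_share = coverage(TAGGED_DOCS, TOTAL_DOCS)
# Approximate: the abstract gives this share rounded to one decimal place.
sufficient_docs = round(TAGGED_DOCS * SUFFICIENT_SHARE)

print(f"untagged documents: {untagged}")
print(f"share of corpus tagged: {tagged_share:.1%}")
print(f"approx. sufficiently tagged documents: {sufficient_docs}")
```

Under these figures, 1,186 documents receive no tag, roughly 98.7% of the corpus receives at least one, and on the order of 80,500 documents would count as sufficiently tagged.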


Racial Disparities in Outcomes of Cardiac Device Related Infections

Shah

Ferraro

Modi

et al. 2023

Journal of the American College of Cardiology


Method for Customizable Automated Tagging: Addressing the Problem of Over-tagging and Under-tagging Text Documents

Pandya

Reyes

Vanderheyden

2020

Preprint


