Bidhan Sarkar scite author profile

Bidhan Sarkar

2 Publications

2 Citation Statements Received

18 Citation Statements Given

Affiliations

National Institute of Technology Durgapur

Publications

Mining multilingual and multiscript Twitter data: unleashing the language and script barrier

Sarkar, Sinhababu², Roy et al. 2020

IJBIDM

Micro-blogging sites like Twitter have become opinion hubs where views on diverse topics are expressed. Interpreting, comprehending and analysing this emotion-rich information can unearth many valuable insights. The job is trivial if the tweets are in English. Lately, however, the growing use of native languages for communication has posed a great challenge for social media mining. Matters become more complicated when people write non-English languages in Roman script. India, a country with a diverse collection of scripts and languages, faces this problem acutely. We have developed a system that automatically identifies and classifies native-language tweets, irrespective of the script used. By converting all tweets to English, we eliminate the 'script vs language' problem. Our approach consists of script identification, language analysis, and clustered mining. Considering English and the top two Indian languages, we found that the proposed framework gives better precision than the prevailing approaches.
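
As an illustration of the script identification step only, the following is a minimal sketch and not the authors' implementation. It guesses the dominant script of a tweet from Unicode character names. The function name `identify_script` is hypothetical, and the sample strings in Devanagari and Bengali script are assumptions, since the abstract does not name the two Indian languages studied.

```python
import unicodedata
from collections import Counter


def identify_script(text: str) -> str:
    """Return the dominant script of `text`, e.g. "LATIN" or "DEVANAGARI".

    Hypothetical helper: the script is taken from the first word of each
    letter's Unicode character name, and the most frequent one wins.
    """
    counts = Counter()
    for ch in text:
        # Skip digits, punctuation, whitespace and combining marks.
        if not ch.isalpha():
            continue
        try:
            name = unicodedata.name(ch)
        except ValueError:
            # Character has no Unicode name; ignore it.
            continue
        counts[name.split()[0]] += 1
    if not counts:
        return "UNKNOWN"
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Sample inputs are assumptions chosen for illustration.
    for sample in ["aap kaise ho", "नमस्ते", "নমস্কার"]:
        print(sample, "->", identify_script(sample))
```

A Romanized non-English tweet such as "aap kaise ho" is reported as LATIN by this sketch, which is why script identification alone cannot settle the language and a separate language analysis step is needed.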

Mining Multilingual and Multiscript Twitter Data: Unleashing the Language and Script Barrier

Sarkar, Sinhababu², Roy et al. 2018

IJBIDM
