Mohammad Golam Sohrab scite author profile

Mohammad Golam Sohrab

5Publications

196Citation Statements Received

53Citation Statements Given

How they've been cited

296

191

How they cite others

Affiliations

National Institute of Advanced Industrial Science and Technology, Toyota Technological Institute, Tokushima University

Publications

Order By: Most citations

Deep Exhaustive Model for Nested Named Entity Recognition

Sohrab¹,

Misawa²

2018

168

View full text Add to dashboard Cite

We propose a simple deep neural model for nested named entity recognition (NER). Most NER models focused on flat entities and ignored nested entities, which failed to fully capture underlying semantic information in texts. The key idea of our model is to enumerate all possible regions or spans as potential entity mentions and classify them with deep neural networks. To reduce the computational costs and capture the information of the contexts around the regions, the model represents the regions using the outputs of shared underlying bidirectional long short-term memory. We evaluate our exhaustive model on the GENIA and JNLPBA corpora in biomedical domain, and the results show that our model outperforms state-of-the-art models on nested and flat NER, achieving 77.1% and 78.4% respectively in terms of F-score, without any external knowledge resources.

show abstract

Class-indexing-based term weighting for automatic text classification

Ren

Sohrab

2013

Information Sciences

114

View full text Add to dashboard Cite

BENNERD: A Neural Named Entity Linking System for COVID-19

Sohrab¹,

Duong²,

Misawa³

et al. 2020

View full text Add to dashboard Cite

We present a biomedical entity linking (EL) system BENNERD that detects named entities in text and links them to the unified medical language system (UMLS) knowledge base (KB) entries to facilitate the corona virus disease 2019 (COVID-19) research. BEN-NERD mainly covers biomedical domain, especially new entity types (e.g., coronavirus, viral proteins, immune responses) by addressing CORD-NER dataset. It includes several NLP tools to process biomedical texts including tokenization, flat and nested entity recognition, and candidate generation and ranking for EL that have been pre-trained using the CORD-NER corpus. To the best of our knowledge, this is the first attempt that addresses NER and EL on COVID-19-related entities, such as COVID-19 virus, potential vaccines, and spreading mechanism, that may benefit research on COVID-19. We release an online system to enable real-time entity annotation with linking for end users. We also release the manually annotated test set and CORD-NERD dataset for leveraging EL task. The BENNERD system is available at https://aistairc.github.io/BENNERD/.

show abstract

Centroid-Means-Embedding: An Approach to Infusing Word Embeddings into Features for Text Classification

Sohrab

Misawa

Sasaki

2015

View full text Add to dashboard Cite

EDGE2VEC: Edge Representations for Large-Scale Scalable Hierarchical Learning

Sohrab¹,

Nakata²,

Misawa³

et al. 2018

CyS

View full text Add to dashboard Cite

In present front-line of Big Data, prediction tasks over the nodes and edges in complex deep architecture needs a careful representation of features by assigning hundreds of thousands, or even millions of labels and samples for information access system, especially for hierarchical extreme multi-label classification. We introduce edge2vec, an edge representations framework for learning discrete and continuous features of edges in deep architecture. In edge2vec, we learn a mapping of edges associated with nodes where random samples are augmented by statistical and semantic representations of words and documents. We argue that infusing semantic representations of features for edges by exploiting word2vec and para2vec is the key to learning richer representations for exploring target nodes or labels in the hierarchy. Moreover, we design and implement a balanced stochastic dual coordinate ascent (DCA)-based support vector machine for speeding up training. We introduce a global decision-based top-down walks instead of random walks to predict the most likelihood labels in the deep architecture. We judge the efficiency of edge2vec over the existing state-of-the-art techniques on extreme multi-label hierarchical as well as flat classification tasks. The empirical results show that edge2vec is very promising and computationally very efficient in fast learning and predicting tasks. In deep learning workbench, edge2vec represents a new direction for statistical and semantic representations of features in task-independent networks.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.