Prantik Goswami scite author profile

Prantik Goswami

5 Publications

1 Citation Statement Received

166 Citation Statements Given


Affiliations

University of Koblenz and Landau

Publications


MexPub: Deep Transfer Learning for Metadata Extraction from German Publications

Boukhers, Beili, Hartmann et al. 2021


Extracting metadata from scientific papers can be considered a solved problem in NLP due to the high accuracy of state-of-the-art methods. However, this does not apply to German scientific publications, which have a variety of styles and layouts. In contrast to most English scientific publications, which follow standard and simple layouts, the order, content, position and size of metadata in German publications vary greatly among publications. This variety causes traditional NLP methods to fail to accurately extract metadata from these publications. In this paper, we present a method that extracts metadata from PDF documents with different layouts and styles by viewing the document as an image. We used Mask R-CNN trained on the COCO dataset and fine-tuned on the PubLayNet dataset, which consists of 200K PDF snapshots with five basic classes (e.g. text, figure, etc.). We then fine-tuned the model further on our proposed synthetic dataset consisting of 30K article snapshots to extract nine patterns (e.g. author, title, etc.). Our synthetic dataset is generated using content in both German and English and a finite set of challenging templates obtained from German publications. Our method achieved an average accuracy of around 90%, which validates its capability to accurately extract metadata from a variety of PDF documents with challenging templates.


Knowledge Guided Multi-filter Residual Convolutional Neural Network for ICD Coding from Clinical Text

Boukhers, Goswami, Jürjens 2023

Preprint


One challenge often encountered when using Deep Neural Network models for automatic ICD coding is their potential inability to effectively handle unseen clinical texts, especially when these models are only trained on a limited number of examples. This is because these models rely solely on the patterns and relationships present in the training data, and may not be able to effectively incorporate additional knowledge about the relationships between medical entities. To address this issue, we introduce KG-MultiResCNN (Knowledge Guided Multi-filter Residual Convolutional Neural Network), a model that combines training examples with external knowledge from the Wikidata Knowledge Graph (KG) in order to better capture the relationships between medical entities. The KG is a structured database that contains a wealth of information about various entities, including medical concepts and their relationships with one another. By incorporating this external knowledge into our model, we are able to improve its ability to predict ICD codes for new clinical texts. In our experiments with the MIMIC-III dataset, we found that the KG-MultiResCNN model significantly outperformed the baseline approaches. This demonstrates the effectiveness of using external knowledge, in addition to training examples, to improve the performance of deep learning models for automatic ICD coding.


MexPub: Deep Transfer Learning for Metadata Extraction from German Publications

Boukhers, Beili, Hartmann et al. 2021

Preprint


Knowledge guided multi-filter residual convolutional neural network for ICD coding from clinical text

Boukhers, Goswami, Jürjens 2023

Neural Comput & Applic


A common challenge encountered when using Deep Neural Network models for automatic ICD coding is their potential inability to effectively handle unseen clinical texts, especially when these models are only trained on a limited number of examples. This is because these models rely solely on the patterns and relationships present in the training data, and may not be able to effectively incorporate additional knowledge about the relationships between medical entities. To address this issue, we introduce KG-MultiResCNN (Knowledge Guided Multi-filter Residual Convolutional Neural Network), a model which combines training examples with external knowledge from the Wikidata Knowledge Graph (KG) in order to better capture the relationships between medical entities. The KG is a structured database that contains a wealth of information about various entities, including medical concepts and their relationships with one another. By incorporating this external knowledge into our model, we are able to improve its ability to predict ICD codes for new clinical texts. In our experiments with the MIMIC-III dataset, we found that the KG-MultiResCNN model significantly outperformed the baseline approaches. This demonstrates the effectiveness of using external knowledge, in addition to training examples, to improve the performance of deep learning models for automatic ICD coding.


Knowledge Guided Multi-filter Residual Convolutional Neural Network for ICD Coding from Clinical Text

Boukhers, Goswami, Jürjens 2022

Preprint


Recent research on automatic ICD coding has demonstrated that Deep Neural Network models are efficient in capturing essential features from textual clinical data to predict the ICD code of the disease. However, because these models rely only on training examples, they cannot grasp most of the relationships between medical entities present in new clinical texts. To overcome this, this paper introduces KG-MultiResCNN (Knowledge Guided Multi-filter Residual Convolutional Neural Network), a DNN model that relies, in addition to training examples, on external knowledge that embeds the relationships between the medical entities found in the clinical text. Specifically, we use the Wikidata Knowledge Graph (KG) to extract the relationship embeddings of medical entities. With this combination, our guided model is better at predicting the correct ICD codes. Extensive experiments with the MIMIC-III dataset showed that KG-MultiResCNN outperformed the current state-of-the-art model and other baseline approaches.

