Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these "knowledge graphs" (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open-access model for standardization across biomedical KGs have left the task of reconciling data sources to downstream consumers. Biolink Model is an open-source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object-oriented classification and graph-oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and the relationships between them (or predicates), representing biomedical entities such as gene, disease, chemical, anatomic structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.
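To make the category/predicate structure concrete, the following is a minimal sketch of a KG edge organized along Biolink Model lines: nodes carry a hierarchical category, edges carry a predicate plus attributes. The specific identifiers and the provenance source shown are illustrative assumptions, not drawn from the paper.

```python
# A minimal sketch of a knowledge-graph edge organized along Biolink Model
# lines: nodes carry a hierarchical category, edges carry a predicate.
# The identifiers and knowledge source below are illustrative, not from
# the paper.

nodes = {
    "NCBIGene:1017": {"name": "CDK2", "category": "biolink:Gene"},
    "MONDO:0005148": {"name": "type 2 diabetes mellitus",
                      "category": "biolink:Disease"},
}

edges = [
    {
        "subject": "NCBIGene:1017",
        "predicate": "biolink:gene_associated_with_condition",
        "object": "MONDO:0005148",
        # Edge attributes can record provenance, as Biolink associations allow.
        "knowledge_source": "infores:example-kg",  # hypothetical source
    }
]

def neighbors(node_id, edges):
    """Return (predicate, object) pairs for edges leaving node_id."""
    return [(e["predicate"], e["object"])
            for e in edges if e["subject"] == node_id]

print(neighbors("NCBIGene:1017", edges))
```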
Image acquisition is a common task in every image processing operation. Noise is introduced at the source during image acquisition, and once introduced it degrades the image and is difficult to remove. For noise cancellation in an image, non-linear filters work better than linear ones. This paper presents a joint scheme of the Wavelet Transform, using iterative noise density estimation, and Median Filtering to remove Salt and Pepper Noise in digital images. The first part of the paper derives the wavelet coefficients under a slight increase in noise density, and in the second part these coefficients are further modified by a median filter. The algorithm shows remarkable improvement over the Gaussian noise model, removes most of the noisy content from the image, and maintains visual quality. The level of wavelet decomposition is restricted to three. The well-known indexes Peak Signal to Noise Ratio (PSNR) and Root Mean Square Error (RMSE) demonstrate marked improvement of image denoising over the Gaussian method.
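The following is a simplified sketch of such a combined wavelet/median-filter scheme, assuming PyWavelets and SciPy are available. The paper's exact iterative noise-density step is not reproduced here; median filtering of the detail sub-bands stands in for the stage where the coefficients are modified by a median filter.

```python
# Hedged sketch: three-level wavelet decomposition with median-filtered
# detail bands, plus the PSNR/RMSE metrics mentioned in the abstract.
import numpy as np
import pywt
from scipy.ndimage import median_filter

def denoise(img, wavelet="db1", level=3, win=3):
    """Decompose to 3 levels, median-filter each detail sub-band, reconstruct."""
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=level)
    approx, details = coeffs[0], coeffs[1:]
    # Each level has (horizontal, vertical, diagonal) detail bands.
    filtered = [tuple(median_filter(band, size=win) for band in trio)
                for trio in details]
    rec = pywt.waverec2([approx] + filtered, wavelet)
    # Reconstruction can be slightly larger if dimensions aren't multiples
    # of 2**level; crop back to the input shape.
    return rec[:img.shape[0], :img.shape[1]]

def rmse(ref, est):
    return float(np.sqrt(np.mean((ref.astype(float) - est) ** 2)))

def psnr(ref, est, peak=255.0):
    return 20.0 * np.log10(peak / rmse(ref, est))
```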
Representation of structurally significant data is indispensable to modern research. The need for dimensionality reduction arises in varied fields, including Structural Bioinformatics, Machine Learning, Robotics, and Artificial Intelligence, to name a few. The number of points required to effectively capture the essence of a structure is often an intuitive decision. Feature reduction methods like Principal Component Analysis (PCA) have already been explored and proven to aid classification and regression. In this work we present a novel approach that first performs PCA on a data set to reduce its features and then attempts to reduce the number of points themselves, discarding points that offer little or no new information. The algorithm was tested on various kinds of data (points representing a spiral, protein coordinates, the Iris dataset prevalent in Machine Learning, and a face image), and the results agree with the quantitative tests applied. In each case, it turns out that many data instances need not be stored to make any kind of decision. Matlab and R simulations were used to assess the structures with reduced data points. The time complexity of the algorithm is linear in the degrees of freedom of the data if the data is in a natural order.
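A hedged sketch of this two-stage reduction follows: PCA for feature reduction, then a single linear pass that discards points contributing little new information. The novelty criterion used here (distance from the last retained point) is an assumption, since the abstract does not spell out the exact rule; it is consistent with the stated linear-time behavior on naturally ordered data.

```python
# Two-stage reduction sketch: PCA on features, then point thinning.
import numpy as np
from sklearn.decomposition import PCA

def reduce_points(X, n_components=2, tol=0.1):
    """Project X onto principal components, then thin near-redundant points.

    Assumes rows of X are in a 'natural order' (e.g., along a curve), so a
    single pass suffices, giving time linear in the number of points.
    """
    Z = PCA(n_components=n_components).fit_transform(X)
    kept = [0]
    for i in range(1, len(Z)):
        # Keep a point only if it adds at least `tol` of novelty.
        if np.linalg.norm(Z[i] - Z[kept[-1]]) >= tol:
            kept.append(i)
    return Z[kept], np.array(kept)

# Example: a noisy spiral, one of the test cases mentioned above.
t = np.linspace(0, 4 * np.pi, 500)
spiral = np.column_stack([t * np.cos(t), t * np.sin(t), 0.01 * t])
Z, idx = reduce_points(spiral, n_components=2, tol=0.5)
print(f"kept {len(idx)} of {len(spiral)} points")
```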
Datasets representing the conformational landscapes of protein structures are high-dimensional and hence present computational challenges. Efficient and effective dimensionality reduction of these datasets is therefore paramount to our ability to analyze the conformational landscapes of proteins and extract important information regarding protein folding, conformational changes, and binding. Representing the structures with fewer attributes that capture the most variance in the data makes for a quicker and more precise analysis of these structures. In this study, we make use of dimensionality reduction methods for reducing the number of instances and for feature reduction. The reduced dataset that is obtained is then subjected to topological and quantitative analysis. In this step, we perform hierarchical clustering to obtain different sets of conformation clusters that may correspond to intermediate structures. The structures represented by these conformations are then analyzed by studying their high-dimensional topological properties to identify truly distinct conformations and holes in the conformational space that may represent high energy barriers. Our results show that the clusters closely follow known experimental results about intermediate structures as well as binding and folding events. DOI: 10.28991/HEF-SP2022-01-01
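The following is a minimal sketch of the pipeline described above, assuming conformations arrive as an (n_conformations x n_coordinates) array: PCA handles feature reduction, and SciPy's hierarchical clustering groups conformations that may correspond to intermediate structures. The topological step (e.g., persistent homology to find holes in conformational space) would require a dedicated library and is only indicated in comments; the array shapes and parameters are illustrative.

```python
# Sketch: feature reduction followed by hierarchical clustering of
# conformations. The persistent-homology step is noted but not implemented.
import numpy as np
from sklearn.decomposition import PCA
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_conformations(X, n_components=10, n_clusters=5):
    """Reduce features with PCA, then hierarchically cluster conformations."""
    Z = PCA(n_components=n_components).fit_transform(X)
    tree = linkage(Z, method="ward")          # agglomerative, Ward criterion
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    # A topological analysis (e.g., persistent homology on Z) would follow
    # here to detect holes that may represent high energy barriers.
    return Z, labels

# Example with synthetic data standing in for conformational snapshots.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 60))                # hypothetical landscape data
Z, labels = cluster_conformations(X)
print(np.bincount(labels)[1:])                # cluster sizes
```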