Adrian Silvescu scite author profile

Glycosylation site prediction using ensembles of Support Vector Machine classifiers

Background: Glycosylation is one of the most complex post-translational modifications (PTMs) of proteins in eukaryotic cells. Glycosylation plays an important role in biological processes ranging from protein folding and subcellular localization to ligand recognition and cell-cell interactions. Experimental identification of glycosylation sites is expensive and laborious. Hence, there is significant interest in the development of computational methods for reliable prediction of glycosylation sites from amino acid sequences.

Learning accurate and concise naïve Bayes classifiers from attribute value taxonomies and data

Zhang et al., 2005

In many application domains, there is a need for learning algorithms that can effectively exploit attribute value taxonomies (AVT), that is, hierarchical groupings of attribute values, to learn compact, comprehensible and accurate classifiers from data, including data that are partially specified. This paper describes AVT-NBL, a natural generalization of the naïve Bayes learner (NBL), for learning classifiers from AVT and data. Our experimental results show that AVT-NBL is able to generate classifiers that are substantially more compact and more accurate than those produced by NBL on a broad range of data sets with different percentages of partially specified values. We also show that AVT-NBL is more efficient in its use of training data: AVT-NBL produces classifiers that outperform those produced by NBL using substantially fewer training examples.

A Framework for Learning from Distributed Data Using Sufficient Statistics and Its Application to Learning Decision Trees

Caragea, Silvescu, Honavar, 2004, HIS

This paper motivates and precisely formulates the problem of learning from distributed data; describes a general strategy for transforming traditional machine learning algorithms into algorithms for learning from distributed data; demonstrates the application of this strategy to devise algorithms for decision tree induction from distributed data; and identifies the conditions under which the algorithms in the distributed setting are superior to their centralized counterparts in terms of time and communication complexity. The resulting algorithms are provably exact in that the decision tree constructed from distributed data is identical to that obtained in the centralized setting. Some natural extensions leading to algorithms for learning from heterogeneous distributed data and learning under privacy constraints are outlined.
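
The exactness claim in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes horizontally partitioned data (each site holds a subset of the rows), assumes that the sufficient statistics for choosing a split are joint counts of attribute value and class label, and uses information gain as the splitting criterion. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def local_counts(rows, attr, label):
    """Joint counts of (attribute value, class label) held at one site."""
    return Counter((row[attr], row[label]) for row in rows)


def entropy(class_counts):
    """Shannon entropy (in bits) of a class-count table."""
    total = sum(class_counts.values())
    return -sum((n / total) * log2(n / total) for n in class_counts.values() if n)


def information_gain(counts):
    """Information gain of splitting on an attribute, from joint counts alone."""
    by_class = Counter()
    by_value = {}
    for (value, cls), n in counts.items():
        by_class[cls] += n
        by_value.setdefault(value, Counter())[cls] += n
    total = sum(by_class.values())
    remainder = sum(sum(c.values()) / total * entropy(c) for c in by_value.values())
    return entropy(by_class) - remainder


# Hypothetical toy data, split across two sites (assumption for illustration).
site_a = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rain", "play": "yes"},
    {"outlook": "sunny", "play": "no"},
]
site_b = [
    {"outlook": "rain", "play": "yes"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "overcast", "play": "yes"},
]

# Each site ships only its counts; the learner adds them up.
merged = local_counts(site_a, "outlook", "play") + local_counts(site_b, "outlook", "play")

# Centralized baseline: pool the raw rows first, then count.
central = local_counts(site_a + site_b, "outlook", "play")

gain_distributed = information_gain(merged)
gain_central = information_gain(central)
```

Because the summed per-site counts equal the counts over the pooled rows, any splitting criterion that depends only on those counts selects the same split in both settings, while only the count tables (not the raw rows) need to be communicated.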

Analysis and Synthesis of Agents That Learn from Distributed Dynamic Data Sources

Caragea, Silvescu, Honavar, 2001

Abstract. We propose a theoretical framework for specification and analysis of a class of learning problems that arise in open-ended environments that contain multiple, distributed, dynamic data and knowledge sources. We introduce a family of learning operators for precise specification of some existing solutions and to facilitate the design and analysis of new algorithms for this class of problems. We state some properties of instance and hypothesis representations, and learning operators that make exact learning possible in some settings. We also explore some relationships between models of learning using different subsets of the proposed operators under certain assumptions.

Decision Tree Induction from Distributed Heterogeneous Autonomous Data Sources

Caragea, Silvescu, Honavar, 2003

Summary. With the growing use of distributed information networks, there is an increasing need for algorithmic and system solutions for data-driven knowledge acquisition using distributed, heterogeneous and autonomous data repositories. In many applications, practical constraints require such systems to provide support for data analysis where the data and the computational resources are available. This presents us with distributed learning problems. We precisely formulate a class of distributed learning problems; present a general strategy for transforming traditional machine learning algorithms into distributed learning algorithms; and demonstrate the application of this strategy to devise algorithms for decision tree induction (using a variety of splitting criteria) from distributed data. The resulting algorithms are provably exact in that the decision tree constructed from distributed data is identical to that obtained by the corresponding algorithm in the batch setting. The distributed decision tree induction algorithms have been implemented as part of INDUS, an agent-based system for data-driven knowledge acquisition from heterogeneous, distributed, autonomous data sources.
