Riccardo Cardin scite author profile

In many application contexts in which textual documents are labelled with thematic categories, a distinction is made between the primary categories of a document, which represent the topics that are central to it, and its secondary categories, which represent topics that the document only touches upon. We contend that this distinction, so far neglected in text categorization research, is important and deserves to be explicitly tackled. The contribution of this paper is threefold. First, we propose an evaluation measure for this preferential text categorization task, whereby different kinds of misclassifications involving either primary or secondary categories have a different impact on effectiveness. Second, we establish several baseline results for this task on a well-known benchmark for patent classification in which the distinction between primary and secondary categories is present; these results are obtained by reformulating the preferential text categorization task in terms of well-established classification problems, such as single-label and/or multi-label multiclass classification; state-of-the-art learning technology, such as SVMs and kernel-based methods, is used. Third, we improve on these results by using a recently proposed class of algorithms explicitly devised for learning from training data expressed in preferential form, i.e., in the form "for document d_i, category c′ is preferred to category c″"; this allows us to distinguish between primary and secondary categories not only in the

PCA-Based Representations of Graphs for Prediction in QSAR Studies

Cardin, Michielan, Moro et al. 2009

In recent years, more and more attention has been paid to learning in structured domains, e.g. chemistry. Both neural networks and kernel methods for structured data have been proposed. Here, we show that a recently developed technique for structured domains, i.e. PCA for structures, makes it possible to generate representations of graphs (specifically, molecular graphs) that are quite effective when used for prediction tasks (QSAR studies). The advantage of these representations is that they can be generated automatically and exploited by any traditional predictor (e.g., Support Vector Regression with a linear kernel).

Riccardo Cardin

Combining selectivity and affinity predictions using an integrated Support Vector Machine (SVM) approach: An alternative tool to discriminate between the human adenosine A2A and A3 receptor pyrazolo-triazolo-pyrimidine antagonists binding sites

Preferential text classification: learning algorithms and evaluation measures

PCA-Based Representations of Graphs for Prediction in QSAR Studies
