Riccardo Ortale scite author profile

Abstract. We propose a novel methodology for clustering XML documents on the basis of their structural similarities. The idea is to equip each cluster with an XML cluster representative, i.e., an XML document subsuming the most typical structural specifics of a set of XML documents. Clustering is essentially accomplished by comparing cluster representatives and updating the representatives as soon as new clusters are detected. We present an algorithm for the computation of an XML representative based on suitable techniques for identifying significant node matchings and for reliably merging and pruning XML trees. Experimental evaluation performed on both synthetic and real data shows the effectiveness of our approach.
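The idea of a cluster representative built by merging and pruning can be illustrated with a deliberately simplified sketch. It is not the algorithm of the paper: node matching is replaced here by exact equality of root-to-node tag paths, and pruning by a support threshold. The names `tag_paths`, `representative`, and `min_support` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_paths(xml_text):
    """Return the set of root-to-node tag paths of an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def representative(docs, min_support=0.5):
    """Merge the path sets of all documents, then prune infrequent paths.

    Assumption: a path is kept if it occurs in at least `min_support`
    of the documents. A path never occurs more often than its prefix,
    so the surviving paths still form a tree.
    """
    counts = Counter()
    for doc in docs:
        counts.update(tag_paths(doc))
    return {p for p, c in counts.items() if c / len(docs) >= min_support}


docs = [
    "<book><title/><author/></book>",
    "<book><title/><year/></book>",
    "<book><title/><author/><isbn/></book>",
]
print(sorted(representative(docs)))
# ['/book', '/book/author', '/book/title']
```

In this toy version, `year` and `isbn` each appear in only one of three documents and are pruned, while `author` appears in two of three and is kept.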


An incremental clustering scheme for data de-duplication

Costa

Manco

Ortale

2009

Data Min Knowl Disc


We propose an incremental technique for discovering duplicates in large databases of textual sequences, i.e., syntactically different tuples that refer to the same real-world entity. The problem is approached from a clustering perspective: given a set of tuples, the objective is to partition them into groups of duplicate tuples. Each newly arrived tuple is assigned to an appropriate cluster via nearest-neighbor classification. This is achieved by means of a suitable hash-based index that maps any tuple to a set of indexing keys and assigns tuples with high syntactic similarity to the same buckets. Hence, the neighbors of a query tuple can be efficiently identified by simply retrieving those tuples that appear in the buckets associated with the query tuple itself, without completely scanning the original database. Two alternative schemes for computing indexing keys are discussed and compared. An extensive experimental evaluation on both synthetic and real data shows the effectiveness of our approach.
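The bucket-based neighbor lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the indexing scheme of the paper: the indexing keys are plain character 3-grams, similarity is the Jaccard coefficient on key sets, and a tuple joins its nearest neighbor's cluster only above a fixed threshold. All names (`qgram_keys`, `IncrementalDeduplicator`, `threshold`) are hypothetical.

```python
from collections import defaultdict


def qgram_keys(text, q=3):
    """Assumed key scheme: character q-grams of the normalized tuple."""
    s = " ".join(text.lower().split())
    if len(s) < q:
        return {s}
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 1.0


class IncrementalDeduplicator:
    def __init__(self, threshold=0.5, q=3):
        self.threshold = threshold
        self.q = q
        self.buckets = defaultdict(set)  # indexing key -> ids of tuples
        self.keys = []                   # tuple id -> its key set
        self.cluster = []                # tuple id -> cluster id
        self.n_clusters = 0

    def add(self, text):
        """Assign a newly arrived tuple to a cluster and index it."""
        keys = qgram_keys(text, self.q)
        # Candidates are only the tuples sharing a bucket with the query;
        # the rest of the database is never scanned.
        candidates = set()
        for k in keys:
            candidates |= self.buckets.get(k, set())
        best_id, best_sim = None, 0.0
        for c in candidates:
            sim = jaccard(keys, self.keys[c])
            if sim > best_sim:
                best_id, best_sim = c, sim
        if best_id is not None and best_sim >= self.threshold:
            cid = self.cluster[best_id]   # join the nearest neighbor's cluster
        else:
            cid = self.n_clusters         # no close neighbor: open a new cluster
            self.n_clusters += 1
        new_id = len(self.keys)
        self.keys.append(keys)
        self.cluster.append(cid)
        for k in keys:
            self.buckets[k].add(new_id)
        return cid


dedup = IncrementalDeduplicator()
labels = [dedup.add(t) for t in [
    "John Smith, 12 Main Street, Springfield",
    "Jon Smith, 12 Main Street, Springfield",
    "Acme Corporation, 99 Elm Road, Denver",
]]
print(labels)  # [0, 0, 1]
```

The two spellings of the same person share most of their 3-grams and land in one cluster, while the third tuple shares no bucket with them and opens a new one.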


Model-Based Collaborative Personalized Recommendation on Signed Social Rating Networks

Costa

Ortale

2016

ACM Trans. Internet Technol.


Recommendation on signed social rating networks is studied through an innovative approach. Bayesian probabilistic modeling is used to postulate a realistic generative process, wherein user and item interactions are explained by latent factors, whose relevance varies within the underlying network organization into user communities and item groups. Approximate posterior inference captures distrust propagation and drives Gibbs sampling to allow rating and (dis)trust prediction for recommendation along with the unsupervised exploratory analysis of network organization. Comparative experiments reveal the superiority of our approach in rating and link prediction on Epinions and Ciao, besides community quality and recommendation sensitivity to network organization.


Modeling item selection and relevance for accurate recommendations

Barbieri

Costa

Manco

et al. 2011




Riccardo Ortale

Top-Down Parameter-Free Clustering of High-Dimensional Categorical Data

A Tree-Based Approach to Clustering XML Documents by Structure
