SMILE: A Similarity-Based Approach for Multiple Instance Learning

Xiao, Yanshan; Liu, Bo; Cao, Longbing; Yin, Jie; Wu, Xindong

doi:10.1109/icdm.2010.126

Cited by 13 publications

(13 citation statements)

References 14 publications


“…MI-SVM (support vector machine) [13] is based on maximizing the margin of the most positive instances and the least negative instances. There are several algorithms combining the instance selection for MI classification, such as MILES [14], SMILE [15], MILD [16], and MILIS [17]. Sparse-kernel classifiers [18] are learned for MI classification.…”

Section: Related Work (mentioning)

confidence: 99%

A Boosting Approach to Exploit Instance Correlations for Multi-Instance Classification

Wang

Ding

2016

IEEE Trans. Neural Netw. Learning Syst.


We propose a Boosting approach for multi-instance (MI) classification. L -norm is integrated to localize the witness instances and formulate the bag scores from classifier outputs. The contributions are twofold. First, a flexible and concise model for Boosting is proposed by the L -norm localization and exponential loss optimization. The scores for bag-level classification are directly fused from the instance feature space without probabilistic assumptions. Second, gradient and Newton descent optimizations are applied to derive the weak learners for Boosting. In particular, the instance correlations are exploited by fitting the weights and Newton updates for the weak learner construction. The final Boosted classifiers are the sums of iteratively chosen weak learners. Experiments demonstrate that the proposed L -norm-localized Boosting approach significantly improves the MI classification performance. Compared with the state of the art, the approach achieves the highest MI classification accuracy on 7/10 benchmark data sets.


“…Other approaches dealing with PU classifiers in the context of text classification have been presented in more recent years [8,9,10]. Elkan et al [8] introduce a method to assign weights to the examples belonging to the unlabeled set.…”

Section: Related Work (mentioning)

confidence: 99%

“…The whole set of weighted unlabeled examples is then used to build the final SVM-based classifier. Also Xiao et al [9] present an approach based on SVMs. The authors combine two techniques borrowed from information retrieval (Rocchio and Spy-EM) to extract a set of reliable negative examples.…”

Section: Related Work (mentioning)

confidence: 99%

Positive and unlabeled learning in categorical data

Ienco

Pensa

2016

Neurocomputing


In common binary classification scenarios, the presence of both positive and negative examples in training data is needed to build an efficient classifier. Unfortunately, in many domains, this requirement is not satisfied and only one class of examples is available. To cope with this setting, classification algorithms have been introduced that learn from Positive and Unlabeled (PU) data. Originally, these approaches were exploited in the context of document classification. Only few works address the PU problem for categorical datasets. Nevertheless, the available algorithms are mainly based on Naive Bayes classifiers. In this work we present a new distance based PU learning approach for categorical data: Pulce. Our framework takes advantage of the intrinsic relationships between attribute values and exceeds the independence assumption made by Naive Bayes. Pulce, in fact, leverages on the statistical properties of the data to learn a distance metric employed during the classification task. We extensively validate our approach over real world datasets and demonstrate that our strategy obtains statistically significant improvements w.r.t. state-of-the-art competitors.


“…The original work by Dietterich et al [9] attempted to recover an optimal axis-parallel hyper-rectangle in the instance feature space to separate instances in positive bags from those in negative bags. Departing from this model, several researchers have extended the framework, such as MI-SVM [1], DD-SVM [5], SMILE [24], MILES [4] and MILIS [14].…”

Section: Related Work (mentioning)

confidence: 99%

Multiple Instance Learning for Group Record Linkage

Zhou

Christen

et al. 2012

Advances in Knowledge Discovery and Data Mining


Record linkage is the process of identifying records that refer to the same entities from different data sources. While most research efforts are concerned with linking individual records, new approaches have recently been proposed to link groups of records across databases. Group record linkage aims to determine if two groups of records in two databases refer to the same entity or not. One application where group record linkage is of high importance is the linking of census data that contain household information across time. In this paper we propose a novel method to group record linkage based on multiple instance learning. Our method treats group links as bags and individual record links as instances. We extend multiple instance learning from bag to instance classification to reconstruct bags from candidate instances. The classified bag and instance samples lead to a significant reduction in multiple group links, thereby improving the overall quality of linked data. We evaluate our method with both synthetic data and real historical census data.
