Proceedings of the Nineteenth Conference on Computational Natural Language Learning 2015
DOI: 10.18653/v1/k15-1006
An Iterative Similarity based Adaptation Technique for Cross-domain Text Classification

Abstract: Supervised machine-learning classification algorithms assume that both the training and test data are sampled from the same domain or distribution. However, the performance of these algorithms degrades on test data drawn from a different domain. Such cross-domain classification is arduous because features in the test domain may differ, and the absence of labeled data can further exacerbate the problem. This paper proposes an algorithm to adapt a classification model by iteratively learning domain-specific features from the unlabeled test d…
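The iterative adaptation the abstract describes can be sketched as a self-training loop: train on source-domain data, pseudo-label the confident target-domain documents, absorb them into the training set, and repeat. The code below is a minimal illustration of that general scheme, not the paper's actual method; the nearest-centroid classifier, the toy data, and the confidence threshold are all illustrative assumptions.

```python
# Minimal self-training sketch for cross-domain text classification.
# The classifier (cosine similarity to per-class centroids), the
# threshold, and all names are illustrative, not from the paper.
from collections import Counter
import math

def vectorize(text):
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u if w in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(vectors):
    total = Counter()
    for v in vectors:
        total.update(v)
    return total

def classify(vec, centroids):
    """Return (label, confidence), confidence being the best cosine score."""
    scores = {label: cosine(vec, c) for label, c in centroids.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

def iterative_adapt(source, target_texts, threshold=0.3, max_iter=5):
    """Pseudo-label confident target documents, retrain, repeat."""
    labeled = {lbl: [vectorize(t) for t in texts]
               for lbl, texts in source.items()}
    pool = [vectorize(t) for t in target_texts]
    for _ in range(max_iter):
        cents = {lbl: centroid(vs) for lbl, vs in labeled.items()}
        confident, rest = [], []
        for vec in pool:
            lbl, conf = classify(vec, cents)
            (confident if conf >= threshold else rest).append((lbl, vec))
        if not confident:
            break
        for lbl, vec in confident:
            labeled[lbl].append(vec)  # absorb domain-specific features
        pool = [vec for _, vec in rest]
    return {lbl: centroid(vs) for lbl, vs in labeled.items()}
```

In this sketch, words seen only in confidently pseudo-labeled target documents (e.g. product-domain vocabulary absent from the source domain) end up in the class centroids, which is the intuition behind learning domain-specific features from unlabeled test data.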

Cited by 15 publications (8 citation statements) | References 15 publications
“…should be classified as the topic 'Apple' although it does not contain the keyword 'Apple' and the keyword 'Tim Cook' is not contained in the training samples. In other words, the reliable classifier should learn decision rules that generalize across domains (Fei and Liu 2015; Bhatt, Semwal, and Roy 2015; Bhatt, Sinha, and Roy 2016). This problematic phenomenon frequently happens in real-world datasets.…”
Section: Class: Apple (mentioning)
confidence: 99%
“…Zhuang et al [42] presented a probabilistic model, by which both the shared and distinct concepts in different domains can be learned by the Expectation-Maximization process which optimizes the data likelihood. In [1], an algorithm to adapt a classification model by iteratively learning domain-specific features from the unlabeled test data is described. Cross-domain polarity classification.…”
Section: Cross-domain Classification (mentioning)
confidence: 99%
“…In recent years, cross-domain sentiment (polarity) classification has gained popularity due to the advances in domain adaptation on one side, and to the abundance of documents from various domains available on the Web, expressing positive or negative opinion, on the other side. Some of the general domain adaptation frameworks have been applied to polarity classification [1,6,42], but there are some approaches that have been specifically designed for the cross-domain sentiment classification task [2, 11-13, 26, 30-33]. To the best of our knowledge, Blitzer et al [2] were the first to report results on cross-domain classification proposing the structural correspondence learning (SCL) method, and its variant based on mutual information (SCL-MI).…”
Section: Cross-domain Classification (mentioning)
confidence: 99%
“…For example, Lui and Baldwin (2011) chose to select cross-domain features to help domain adaptation; Axelrod, He, and Gao (2011) chose to select pseudo in-domain data; Wen (2016) chose to train on multi-domain data. Sun, Kashima, and Ueda (2013) used Gaussian RBF and polynomial kernels to compute task similarity; Bhatt, Semwal, and Roy (2015) used the cosine similarity measure to compute similarity for domain adaptation.…”
Section: Cross-domain Learning (mentioning)
confidence: 99%
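The cosine similarity measure mentioned in the citation above compares vector representations of whole domains rather than single documents. A minimal sketch, assuming domains are represented as aggregate term-frequency vectors (the toy corpora are illustrative):

```python
# Cosine similarity between two domains, each represented as the
# aggregate term-frequency vector of its documents. Toy data only.
from collections import Counter
import math

def domain_vector(docs):
    """Aggregate term frequencies over all documents in a domain."""
    vec = Counter()
    for doc in docs:
        vec.update(doc.lower().split())
    return vec

def cosine_similarity(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(x * x for x in a.values())) * \
           math.sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0

books = domain_vector(["a gripping read", "a dull read"])
films = domain_vector(["a gripping film", "a dull film"])
print(round(cosine_similarity(books, films), 3))  # → 0.6
```

A high score suggests the two domains share vocabulary, which is one simple signal for deciding how transferable a source-domain classifier is.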
“…For cross-domain learning, a lot of previous work focuses on domain similarity (Sun, Kashima, and Ueda 2013;Bhatt, Semwal, and Roy 2015;Bhatt, Sinha, and Roy 2016). Sun, Kashima, and Ueda (2013) propose a multitask learning method to automatically discover task relationships from real-world data.…”
Section: Introduction (mentioning)
confidence: 99%