Extreme learning machine (ELM) has been applied in a wide range of classification and regression problems due to its high accuracy and efficiency. However, ELM can only deal with cases where training and testing data are from identical distribution, while in real world situations, this assumption is often violated. As a result, ELM performs poorly in domain adaptation problems, in which the training data (source domain) and testing data (target domain) are differently distributed but somehow related. In this paper, an ELM-based space learning algorithm, domain space transfer ELM (DST-ELM), is developed to deal with unsupervised domain adaptation problems. To be specific, through DST-ELM, the source and target data are reconstructed in a domain invariant space with target data labels unavailable. Two goals are achieved simultaneously. One is that, the target data are input into an ELM-based feature space learning network, and the output is supposed to approximate the input such that the target domain structural knowledge and the intrinsic discriminative information can be preserved as much as possible. The other one is that, the source data are projected into the same space as the target data and the distribution distance between the two domains is minimized in the space. This unsupervised feature transformation network is followed by an adaptive ELM classifier which is trained from the transferred labeled source samples, and is used for target data label prediction. Moreover, the ELMs in the proposed method, including both the space learning ELM and the classifier, require just a small number of hidden nodes, thus maintaining low computation complexity. Extensive experiments on real-world image and text datasets are conducted and verify that our approach outperforms several existing domain adaptation methods in terms of accuracy while maintaining high efficiency.
Background: High-throughput microarray technologies have generated and accumulated massive amounts of gene expression datasets that contain expression levels of thousands of genes under hundreds of different experimental conditions. The microarray datasets are usually presented in 2D matrices, where rows represent genes and columns represent experimental conditions. The analysis of such datasets can discover local structures composed by sets of genes that show coherent expression patterns under subsets of experimental conditions. It leads to the development of sophisticated algorithms capable of extracting novel and useful knowledge from a biomedical point of view. In the medical domain, these patterns are useful for understanding various diseases, and aid in more accurate diagnosis, prognosis, treatment planning, as well as drug discovery.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.