Dimensionality reduction, which includes both feature extraction and feature selection, is a key step in text classification. In this paper, we propose a hybrid dimensionality reduction method that combines principal component analysis (PCA) with a selection of components. PCA is a feature extraction method. Not all principal components contribute to classification, because the PCA objective is not a form of discriminant analysis. We therefore present a component selection function that returns the components useful for classification, judged by classification performance on different subsets of the components. Compared to traditional feature selection methods, SVM classifiers trained on the selected components show improved classification performance and reduced computational overhead.
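The idea of extracting components with PCA and then keeping only those that help classification can be illustrated with a minimal sketch. It is not the authors' selection function: the greedy forward search, the nearest-centroid scorer (a lightweight stand-in for the SVM named in the abstract), the synthetic data, and all function names are assumptions made for illustration.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal axes (as rows) via SVD."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def centroid_accuracy(Z_train, y_train, Z_val, y_val):
    """Stand-in scorer: nearest-class-centroid accuracy on validation data."""
    classes = np.unique(y_train)
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((Z_val[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y_val).mean())

def select_components(Z_train, y_train, Z_val, y_val):
    """Greedy forward selection: keep adding the component that most improves
    validation accuracy, and stop when no remaining component helps."""
    selected, best = [], 0.0
    remaining = list(range(Z_train.shape[1]))
    while remaining:
        scores = [centroid_accuracy(Z_train[:, selected + [j]], y_train,
                                    Z_val[:, selected + [j]], y_val)
                  for j in remaining]
        k = int(np.argmax(scores))
        if scores[k] <= best:
            break
        best = scores[k]
        selected.append(remaining.pop(k))
    return selected, best

# Synthetic data (assumed): the class signal lives in a low-variance feature,
# so the leading principal components carry no label information.
rng = np.random.default_rng(0)
y = np.arange(200) % 2
X = rng.normal(size=(200, 5)) * np.array([5.0, 4.0, 3.0, 1.0, 0.5])
X[:, 3] += 3.0 * y

X_train, y_train, X_val, y_val = X[:100], y[:100], X[100:], y[100:]
mean, axes = pca_fit(X_train, n_components=5)
Z_train = (X_train - mean) @ axes.T
Z_val = (X_val - mean) @ axes.T

selected, best = select_components(Z_train, y_train, Z_val, y_val)
print("selected components:", selected, "validation accuracy:", best)
```

In this toy setting the informative direction ranks fourth by variance, so keeping only the leading components would discard it, while scoring subsets by validation performance recovers it.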
Lithology recognition is an essential part of reservoir parameter prediction. Compared to conventional algorithms, deep learning can extract features automatically, but it requires a large amount of training data. In real data acquisition, labeled data account for only a small portion of the total because drilling is expensive, so it is difficult to reach the data size required for deep learning training, and the resulting recognition model has high variance. To address this shortage, this paper proposes a semi-supervised algorithm based on a generative adversarial network (GAN) with Gini regularization, called SGAN_G, which takes borehole-side data as labeled data and seismic data as unlabeled data. First, SGAN_G is trained with the Adam stochastic optimization algorithm and uses its discriminator for lithology recognition. Second, we add the entropy regularization to the initial loss function, which improves the convergence speed and accuracy of the model. Finally, we propose a novel sampling approach that uses multiple sampling points of seismic data as inputs in order to exploit stratum information implicitly. Experimental comparison with a variety of supervised approaches shows that SGAN_G achieves higher prediction accuracy by using unlabeled data effectively.
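The abstract does not give the exact loss, and it names both Gini and entropy regularization. As a hedged illustration only, the sketch below computes both quantities on a discriminator's class probabilities for unlabeled samples, and builds multi-point inputs from a seismic trace with a sliding window. The function names, the window construction, and the example values are assumptions, not the SGAN_G implementation.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def entropy_penalty(probs, eps=1e-12):
    """Mean Shannon entropy of the predicted class distributions."""
    return float(-(probs * np.log(probs + eps)).sum(axis=1).mean())

def gini_penalty(probs):
    """Mean Gini impurity, 1 - sum_k p_k^2, of the predicted distributions."""
    return float((1.0 - (probs ** 2).sum(axis=1)).mean())

def sliding_windows(trace, width):
    """Stack `width` consecutive sample points of a 1-D trace into one row each."""
    return np.stack([trace[i:i + width] for i in range(len(trace) - width + 1)])

# Hypothetical discriminator logits for unlabeled samples over 3 lithology classes.
confident = softmax(np.array([[8.0, 0.0, 0.0], [0.0, 9.0, 0.0]]))
uncertain = softmax(np.zeros((2, 3)))

# Either penalty is small for confident predictions and largest for uniform ones,
# so adding it to the loss pushes the discriminator toward decisive outputs.
print(entropy_penalty(confident), entropy_penalty(uncertain))
print(gini_penalty(confident), gini_penalty(uncertain))

# Each row groups 3 neighbouring sample points of a toy trace into one input.
windows = sliding_windows(np.arange(6.0), width=3)
print(windows.shape)
```

Feeding neighbouring sample points together is one plausible reading of how stratum information could be used implicitly; the window width here is arbitrary.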