Modern biomedical research into early cancer diagnosis uses microarrays that carry biological information about patients. Based on these data, patients are assigned to one of two classes, corresponding to the presence or absence of a diagnosis. One of the steps with a decisive influence on classification quality is the selection of significant features. This paper proposes a criterion for selecting significant features based on the ledge-coefficient of correlation, which was previously used to estimate the degree of interrelation between numerical and binary features. For two sets of microarray data, comparative examples of binary classification are presented using three feature selection algorithms, three dimensionality reduction methods, and six classification models. Feature selection with the ledge-criterion gave a classification quality comparable to that obtained with common feature selection methods such as the t-test and the U-test. For the peptide microarray data set considered in the paper, the projection to latent structures method had previously been found effective. Combining this method with ledge-criterion feature selection yielded a higher classification quality measure.
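The abstract does not define the ledge-coefficient, so the sketch below illustrates only the t-test baseline it is compared against: each feature is scored by the absolute value of Welch's t-statistic between the two classes, and the highest-scoring features are kept. The function names, the toy data, and the `top_k` parameter are assumptions made for illustration and do not come from the paper.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples.

    Assumes each sample has at least two values and that the two
    sample variances are not both zero.
    """
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_features(rows, labels, top_k):
    """Return indices of the top_k features with the largest |t| between classes 1 and 0."""
    scores = []
    for j in range(len(rows[0])):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        scores.append((abs(welch_t(pos, neg)), j))
    scores.sort(reverse=True)  # largest |t| first
    return [j for _, j in scores[:top_k]]


# Hypothetical toy data: only feature 0 separates the two classes.
rows = [
    [5.1, 0.9, 2.0],
    [4.9, 1.1, 2.1],
    [5.0, 1.0, 1.9],
    [1.0, 1.0, 2.0],
    [1.2, 0.8, 2.2],
    [0.9, 1.2, 1.8],
]
labels = [1, 1, 1, 0, 0, 0]
selected = rank_features(rows, labels, top_k=1)
```

A ledge-criterion selector would presumably slot into the same place as `welch_t`, producing one score per feature, but its formula would have to be taken from the paper itself.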
One promising application of Prolog systems is solving logical tasks. This study outlines a solution approach based on a state generation procedure and a verification procedure. A solution to a logical task is presented that demonstrates the proposed approach in practice, together with a method of specifying the state generation procedure. In the example, a bit chain is generated that defines the code of a particular letter in the solution of the applied problem. Building the solution by code generation with verification avoids storing a binary tree of all possible codes in the knowledge base. The process of generating new states can be associated with training the program, that is, with the dynamic formation of the knowledge base. The approach relies on the ability of software environments to add facts and rules, obtained as results of the program or of its stages, to the existing ones. In this case, the entire program is the generating rule. An analysis of the constructed and tested procedures for dynamic generation of states and facts suggests that such a solution is applicable to certain applied problems.
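The generate-and-verify idea can be sketched outside Prolog as well. The Python example below is an illustration under stated assumptions, not the authors' program: candidate bit chains are produced lazily, each is checked by a verification step, and the first accepted chain is recorded as a new fact. The prefix-free rule used for verification and all names (`generate_codes`, `is_admissible`, `assign_code`) are hypothetical, since the abstract does not give the actual task or its constraints.

```python
from itertools import product


def generate_codes(max_len):
    """Lazily yield bit chains of length 1..max_len, shortest first.

    Nothing is stored: the tree of all possible codes is never materialised.
    """
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def is_admissible(code, known_codes):
    """Verification step (assumed rule): no code may be a prefix of another."""
    return all(
        not code.startswith(k) and not k.startswith(code) for k in known_codes
    )


def assign_code(letter, facts, max_len=8):
    """Take the first generated chain that passes verification and record it as a fact."""
    for code in generate_codes(max_len):
        if is_admissible(code, list(facts.values())):
            facts[letter] = code  # plays the role of adding a fact to the knowledge base
            return code
    raise ValueError("no admissible code within max_len")


# Hypothetical starting knowledge base.
facts = {"a": "0", "b": "10"}
new_code = assign_code("c", facts)
```

Recording the result in `facts` mirrors, loosely, the dynamic extension of a knowledge base that the abstract describes; in Prolog this role would typically be played by predicates such as `assertz/1`.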
The article addresses the problem of reconciling observation results, which arises in interval analysis of a database. The values of the input variables and the output variable are found to be consistent if, in each observation, the graph of the sought dependence passes through interior points of the interval hyper-rectangle. In this case, it is proposed to use special solutions of interval systems of linear algebraic equations (ISLAE) to analyze data from linear processes. In real and model conditions, however, this property of the database does not always hold a priori. In such cases, it is proposed to apply the principle of robust estimation: inconsistent observations should be either excluded from the sample or adjusted. This paper presents the results of a study of these methods of reconciling the experimental database on model linear processes under conditions in which the basic assumptions of interval estimation of dependencies hold. In addition, a range of computational experiments was carried out to determine whether preliminary correction of observations can increase the accuracy of interval analysis, including whether guaranteed estimation of the sought dependences is possible.
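The consistency condition can be illustrated with a small check. The sketch below assumes a linear model without intercept, a candidate coefficient vector `beta`, and observations given as interval bounds on inputs and output; these names and the toy numbers are assumptions, not the paper's data. An observation is treated as consistent when the range of the model output over the input box overlaps the open output interval, and inconsistent observations are excluded, which is one of the two options the abstract mentions.

```python
def output_range(beta, x_lo, x_hi):
    """Range of sum(beta[j] * x[j]) over the box x_lo <= x <= x_hi."""
    lo = hi = 0.0
    for b, l, h in zip(beta, x_lo, x_hi):
        lo += min(b * l, b * h)
        hi += max(b * l, b * h)
    return lo, hi


def is_consistent(beta, obs):
    """True if the model surface passes through the interior of the observation box.

    Assumes non-degenerate intervals and a non-zero beta.
    """
    lo, hi = output_range(beta, obs["x_lo"], obs["x_hi"])
    return lo < obs["y_hi"] and hi > obs["y_lo"]


# Hypothetical observations for the model y = 2 * x; the last one is an outlier.
beta = [2.0]
observations = [
    {"x_lo": [0.9], "x_hi": [1.1], "y_lo": 1.8, "y_hi": 2.2},
    {"x_lo": [1.9], "x_hi": [2.1], "y_lo": 3.8, "y_hi": 4.2},
    {"x_lo": [2.9], "x_hi": [3.1], "y_lo": 8.8, "y_hi": 9.2},
]

# Exclusion strategy: keep only the observations consistent with beta.
kept = [obs for obs in observations if is_consistent(beta, obs)]
```

In the paper's setting the dependence is unknown and is estimated from the interval system itself, so a fixed `beta` is a simplification used here only to make the geometric condition concrete.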
The article presents the results of approximating the solution set of interval systems of linear algebraic equations. Such systems are used in modeling linear deterministic processes. The modeled process is assumed to be described by an output variable and a set of input variables whose measurement errors are bounded by known intervals symmetric about zero. Traditionally, the solution sets of interval systems of linear algebraic equations in applied problems are approximated by a hyper-rectangle whose sides are parallel to the axes of the selected coordinate system. This paper proposes an ellipsoidal approximation of these sets, which is more efficient. The main results of the work include the substantiation of assumptions about the properties of the modeled process, the choice of a mathematical method for constructing the approximating ellipsoid, the proposed method for forming boundary points, and a numerical method for solving the problem. A computer simulation of the problem of estimating the parameters of a linear process is performed in Excel and used for a comparative study of approximations of the solution sets of interval systems of linear algebraic equations by a hyper-rectangle and an ellipsoid.
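Before a solution set can be enclosed by a hyper-rectangle or an ellipsoid, one needs a way to decide whether a given parameter vector belongs to it. The sketch below assumes that the set in question is the united solution set and uses the Oettli-Prager condition, |A_c x - b_c| <= A_r |x| + b_r row by row, where `A_c`, `b_c` are interval midpoints and `A_r`, `b_r` are interval radii. The abstract does not state which type of solution set the authors use, so this choice, the function name, and the toy data are all assumptions.

```python
def in_united_set(x, A_c, A_r, b_c, b_r):
    """Oettli-Prager test for membership in the united solution set.

    A_c, A_r: midpoints and radii of the interval matrix (lists of rows).
    b_c, b_r: midpoints and radii of the interval right-hand side.
    """
    for row_c, row_r, bc, br in zip(A_c, A_r, b_c, b_r):
        residual = abs(sum(a * xi for a, xi in zip(row_c, x)) - bc)
        slack = sum(r * abs(xi) for r, xi in zip(row_r, x)) + br
        if residual > slack:
            return False
    return True


# Hypothetical one-parameter process y = beta * x with two observations:
# x = 1 +/- 0.1, y = 2 +/- 0.2 and x = 2 +/- 0.1, y = 4 +/- 0.2.
A_c = [[1.0], [2.0]]
A_r = [[0.1], [0.1]]
b_c = [2.0, 4.0]
b_r = [0.2, 0.2]

inside = in_united_set([2.0], A_c, A_r, b_c, b_r)
outside = in_united_set([3.0], A_c, A_r, b_c, b_r)
```

A membership test of this kind could be used to locate boundary points of the set, from which an enclosing box or ellipsoid is then built; the paper's own boundary-point and ellipsoid construction methods are not described in the abstract and are not reproduced here.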