The choice of the type of description used in the analysis of quantitative structure-activity relationships (QSAR) for a given chemical structure is determined by the character of the particular problem and by the time and effort required to obtain the experimental and calculated data representing the structure. For the QSAR analysis of a group of compounds characterized by the same type of activity, whose structures comprise a main fragment (nucleus) and variable substituents, the expenditure is minimized by reducing the description to an indication of the type of each substituent and its position in the molecule. In this approach, the structural signs of the molecule are essentially measured against a dichotomous (binary) scale reflecting the presence or absence of a specified quality in the given object [1].

Such a binary scale for measuring the structural signs is used, in particular, in the Free-Wilson model [2], whose predictive ability is based on the assumption that the biological activity ($A$ or $\log A$) of a molecule is determined by the sum of the contributions of its substituents, that is,

$$A = a_0 + \sum_{j=1}^{m} a_j X_j \tag{1}$$

Here $a_j$ is the contribution of the $j$th substituent ($j = 1, \ldots, m$) to the total activity $A$, and $X_j$ is a binary variable equal to unity or zero, depending on whether the $j$th substituent is present or absent in the structure, respectively. The regression coefficients $a_j$ are obtained by solving a system of $n$ equations of type (1) for a sample set of $n$ molecules ($n \gg m$).

The Free-Wilson method is still an attractive tool for QSAR analysis [3-6] despite a number of disadvantages inherent in this approach. For example, the method requires a considerable amount of statistical data: the number of molecules $n$ constituting the training set must significantly exceed $m$, the number of substructures.
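As an illustration only, a system of equations of type (1) can be solved in the least-squares sense as sketched below. The indicator matrix and activities are hypothetical values invented for this example, and the function names (`fit_free_wilson`, `predict_activity`, `solve_linear_system`) are our own. The sketch assumes ordinary least squares via the normal equations, which is one common way of obtaining the regression coefficients and is not necessarily the procedure used in the cited works.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    # Augmented matrix [matrix | rhs], copied so the inputs stay untouched.
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            # Happens when indicator columns are linearly dependent.
            raise ValueError("singular system: cannot separate contributions")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, size):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, size + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    solution = [0.0] * size
    for row in range(size - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, size))
        solution[row] = (aug[row][size] - tail) / aug[row][row]
    return solution


def fit_free_wilson(indicators, activities):
    """Return [a0, a1, ..., am] minimizing the squared error of Eq. (1)."""
    # Prepend a column of ones so that a0 is estimated as the intercept.
    design = [[1.0] + [float(x) for x in row] for row in indicators]
    n_coef = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(n_coef)]
           for i in range(n_coef)]
    xty = [sum(r[i] * a for r, a in zip(design, activities))
           for i in range(n_coef)]
    return solve_linear_system(xtx, xty)


def predict_activity(coefficients, indicator_row):
    """Evaluate A = a0 + sum_j a_j * X_j for one molecule."""
    return coefficients[0] + sum(
        a * x for a, x in zip(coefficients[1:], indicator_row)
    )


# Hypothetical training set: n = 6 molecules, m = 3 substituents.
# Each row lists X_1..X_3 (1 = substituent present, 0 = absent).
indicators = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
]
# Hypothetical activities, constructed to be exactly additive.
activities = [1.0, 1.5, 0.7, 2.0, 1.2, 2.5]

coefficients = fit_free_wilson(indicators, activities)
print("a0, a1, a2, a3 =", [round(c, 6) for c in coefficients])
print("predicted A for X = (0, 1, 1):",
      round(predict_activity(coefficients, [0, 1, 1]), 6))
```

Because the toy activities are exactly additive, the fit reproduces the contributions used to construct them. With real data the residuals are nonzero, and the quality of the estimates depends on how strongly $n$ exceeds $m$.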
Moreover, the structure of the covariance matrix employed in the Free-Wilson method also exhibits some special features [3,4] that require certain precautions in the application of direct methods of regression analysis.

1 Odessa State Marine University, Odessa, Ukraine.

As is known, the use of regression techniques imposes certain limitations on the initial numerical data set [3]. Therefore, it is of interest to implement the heuristic Free-Wilson paradigm on the basis of an alternative, nonregression approach capable of processing multidimensional bodies of experimental data for relatively small sample sets. For example, we employed the trend-vector method [5] for determining the contributions of various substructures to the activity of a molecule. In this work, we suggest employing the method of barycentric coordinates (BC) to describe the multidimensional space of characteristics of the molecular substructures [6,7].

In the BC system, each point in the $m$-dimensional space is represented as the center of gravity of an $m$-gon whose $j$th vertex has a mass proportional to Qy, the contribution of the corresponding $j$th structural sign $S_j$ to the total ac...