Rapidly developing viral resistance to licensed human immunodeficiency virus type 1 (HIV-1) protease inhibitors is an increasing problem in the treatment of HIV-infected individuals and AIDS patients. A rational design of more effective protease inhibitors and discovery of potential biological substrates for the HIV-1 protease require accurate models for protease cleavage specificity. In this study, several popular bioinformatic machine learning methods, including support vector machines and artificial neural networks, were used to analyze the specificity of the HIV-1 protease. A new, extensive data set (746 peptides that have been experimentally tested for cleavage by the HIV-1 protease) was compiled, and the data were used to construct different classifiers that predicted whether the protease would cleave a given peptide substrate or not. The best predictor was a nonlinear predictor using two physicochemical parameters (hydrophobicity, or alternatively polarity, and size) for the amino acids, indicating that these properties are the key features recognized by the HIV-1 protease. The present in silico study provides new and important insights into the workings of the HIV-1 protease at the molecular level, supporting the recent hypothesis that the protease primarily recognizes a conformation rather than a specific amino acid sequence. Furthermore, we demonstrate that the presence of 1 to 2 lysine residues near the cleavage site of octameric peptide substrates seems to prevent cleavage efficiently, suggesting that this positively charged amino acid plays an important role in hindering the activity of the HIV-1 protease.

In less than a quarter of a century, over 20 million people have succumbed to AIDS, and at the end of 2003, an estimated 38 million people were living with a human immunodeficiency virus (HIV) infection.
With an increase of almost 5 million new cases per year, more than 40 million people are likely to be infected with HIV today, with over 2 million of those afflicted being children under the age of 15 years (see the UNAIDS Report on the Global AIDS Epidemic and the AIDS Epidemic Update December 2004 from UNAIDS/WHO [48a, 48b]).

Drugs that inhibit the HIV-1 protease, so-called protease inhibitors, are an important part of AIDS therapy today (20), since the HIV-1 protease cleaves viral Gag and Gag-Pol polyproteins into structural and replication proteins that are necessary for the virus to become infectious (28). Currently licensed protease inhibitors are all peptidomimetic; they mimic a peptide that the HIV-1 protease normally cleaves but are chemically modified such that the scissile bond cannot be cleaved (21, 37). Hence, rational design of an efficient inhibitor requires a good understanding of the HIV-1 protease specificity, i.e., knowing which amino acid sequences are cleaved by the protease and which are not. This is, however, difficult since it cleaves at several different sites that have little or no sequence similarity.

A problem with the clinical use of protease inhibitors is the fact that th...