A series of papers has analyzed a simplified model of an automated cytology prescreening configuration consisting of a two-class cell classifier followed by a two-class specimen classifier. This work has shown, among other things, that the proportion (p) of abnormal cells on an abnormal specimen dictates the number (N) of cells that must be classified before the specimen can be classified with specified accuracy (Anal Quant Cytol, 2:117-122, 1980). It has also shown that if a system designed assuming one fixed value, po, encounters a specimen with a different fixed value, p, then the specimen classifier false negative rate will deviate significantly from the design value, increasing for p < po and decreasing for p > po (Cytometry, 2:155-158, 1981).

Using a Gaussian approximation, Timmers and Gelsema (Cytometry, 6:22-25, 1985) extended this model to the case where p is a Beta-distributed random variable. They showed that N increases dramatically with the width (coefficient of variation) of the distribution of p. They also concluded that the randomness of p imposes a fundamental lower limit on the specimen false negative rate, below which it is impossible to go even with an error-free cell classifier.

In this paper we also extend the basic model to cover the case of random p, but we use an asymptotic expansion, rather than the Gaussian approximation, to develop an expression for N. We show that the limit cited by Timmers and Gelsema is not real, but is an artifact of the breakdown of the Gaussian approximation. This result means that any desired performance is, in fact, achievable even with random p, although potentially at the cost of examining very many cells. We also present a simple iterative design procedure using the basic model, and show that the cell classifier figure of merit from the earlier analysis remains valid for the case of random p. Using the Beta distribution, we examine several situations numerically to validate the extended model and to illustrate the effects of the parameters. These results apply to all cell- and specimen-classifier cascades, whether they are implemented by statistical techniques, neural networks, or other means.

Key terms: Classification, cell classification, specimen classification, cervical cytology

A prescreening system for automated cytology would be called upon to recognize many types of cells and identify many types and degrees of abnormalities. Several multi-class models have been presented (2,5,13).
However, considerable basic understanding of the behavior of such systems can be gained from the analysis of simplified models (4,6-9,11,12).

The doubly two-class model applies to prescreening instruments that operate by first classifying some number of cells and then, from those results, the specimen. A series of papers studying that model has elucidated several important principles (4,6-9,11).
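
To make the doubly two-class cascade concrete, the following is a minimal numerical sketch, not the formulation used in the cited papers. It assumes that cells are classified independently, that the specimen classifier calls a specimen abnormal when at least `threshold` of the N classified cells are flagged, and that the cell classifier has fixed false positive and false negative rates. The function names, the count-threshold rule, and the example parameter values are all illustrative assumptions. The sketch computes the specimen false negative rate for a fixed p and, by simple midpoint integration, for a Beta-distributed p with a given mean and coefficient of variation.

```python
import math


def alarm_prob(p, fp_cell, fn_cell):
    """Probability that one cell from an abnormal specimen is flagged.

    p is the proportion of abnormal cells; fp_cell and fn_cell are the
    assumed cell classifier false positive and false negative rates.
    """
    return p * (1.0 - fn_cell) + (1.0 - p) * fp_cell


def binom_cdf_below(n, threshold, q):
    """P(X < threshold) for X ~ Binomial(n, q)."""
    return sum(
        math.comb(n, k) * q**k * (1.0 - q) ** (n - k) for k in range(threshold)
    )


def specimen_fn_fixed(p, n, threshold, fp_cell, fn_cell):
    """Specimen false negative rate when p is a fixed value."""
    return binom_cdf_below(n, threshold, alarm_prob(p, fp_cell, fn_cell))


def beta_params(mean, cv):
    """Beta shape parameters from a mean and a coefficient of variation."""
    var = (cv * mean) ** 2
    s = mean * (1.0 - mean) / var - 1.0
    return mean * s, (1.0 - mean) * s


def specimen_fn_beta(mean, cv, n, threshold, fp_cell, fn_cell, grid=2000):
    """Specimen false negative rate averaged over Beta-distributed p.

    Uses a midpoint rule on (0, 1) and normalizes by the summed weights.
    """
    a, b = beta_params(mean, cv)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    total = 0.0
    weight = 0.0
    for i in range(grid):
        p = (i + 0.5) / grid
        w = math.exp(
            log_norm + (a - 1.0) * math.log(p) + (b - 1.0) * math.log(1.0 - p)
        )
        total += w * specimen_fn_fixed(p, n, threshold, fp_cell, fn_cell)
        weight += w
    return total / weight


if __name__ == "__main__":
    # Hypothetical design point: all values below are illustrative only.
    n, threshold, fp_cell, fn_cell = 500, 13, 0.01, 0.10
    print("fixed p = 0.05:", specimen_fn_fixed(0.05, n, threshold, fp_cell, fn_cell))
    print("fixed p = 0.03:", specimen_fn_fixed(0.03, n, threshold, fp_cell, fn_cell))
    print("Beta p, mean 0.05, CV 0.5:",
          specimen_fn_beta(0.05, 0.5, n, threshold, fp_cell, fn_cell))
```

Under these assumptions, lowering p below the design value raises the specimen false negative rate, and spreading p around the same mean raises it as well, because specimens in the low-p tail of the distribution are much harder to detect with the same N and threshold.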