Abstract. The potential of Boltzmann machines to cope with difficult combinatorial optimization problems is investigated. A discussion of various (parallel) models of Boltzmann machines is given, based on the theory of Markov chains. A general strategy is presented for solving (approximately) combinatorial optimization problems with a Boltzmann machine. The strategy is illustrated by discussing the details for two different problems, namely MATCHING and GRAPH PARTITIONING.

Key Words. Boltzmann machines, Combinatorial optimization, Connectionist models, Neural networks, Simulated annealing.

1. Introduction. Simulated annealing is a generally applicable solution method in the field of combinatorial optimization, based on an analogy with the physical annealing process [Kirkpatrick et al., 1983]. Numerous performance evaluations have led to the general opinion that simulated annealing can obtain high-quality solutions, but often at the cost of large running times [Aarts and Korst, 1989a; Van Laarhoven and Aarts, 1987]. As problems will inevitably increase in size, the situation with respect to computational effort will only worsen.

Recent efforts have concentrated on investigating possibilities for speeding up simulated annealing algorithms by executing them on parallel machines, unfortunately with only moderate success [Aarts and Korst, 1989a]. Most parallel algorithms presented in the literature follow rather strictly the basic concept of the annealing algorithm, i.e., the generation of a sequence of solutions, which is intrinsically sequential and therefore leads to low efficiencies on parallel machines. Our general feeling is that the design of efficient parallel annealing algorithms requires a computational model that differs substantially from the traditional sequential annealing model. Recent interest focuses on computational models, such as connectionist models, that support in a natural way the exploitation of massive parallelism, as is done in the human brain.

Connectionist models [Feldman and Ballard, 1982; Fahlman and Hinton, 1987] can be viewed as generalizations of neural network models. They are based on the assumption that information can be processed by massively parallel networks consisting of cooperating, simple, neuron-like computing elements that are highly interconnected by weighted links. Furthermore, a typical feature of these models is that the response of an individual computing element to other elements in the network is given by a nonlinear scalar function of the weighted