Abstract. This paper discusses similarities between reliability and security with the intention of finding probabilistic measures of operational security similar to those that we have for the reliability of systems. Ideally, a measure of the security of a system should capture quantitatively the intuitive notion of 'the ability of the system to resist attack', described by the parameter effort. That is, it should reflect the degree to which the system can be expected to remain free of security breaches under particular conditions of operation, including those of attack. Current security levels, e.g., those of the Orange Book, at best reflect the extensiveness of safeguards introduced during the design and development of a system. Even though we might expect a system developed to a higher level than another system to exhibit 'more secure behaviour' in operation, this cannot be guaranteed. In particular, we cannot assess the actual operational security from knowledge of such a level. We have carried out two realistic intrusion experiments intended to investigate the empirical issues that arise from this probabilistic view of security assessment. More specifically, they investigated the problems of measuring the effort and reward associated with security attacks and breaches. In the first, a pilot experiment, the intention was to see whether experiments of this type, in which a number of undergraduate students were allowed to attack a system under controlled circumstances, were at all feasible and, if so, to obtain information on how they should be carried out. In the second, full-scale experiment, we aimed to collect enough data to begin developing a methodology by which operational security measures could be derived. During this latter experiment 181 activity reports were submitted, resulting in 63 successful breaches and reflecting a total expenditure of 594 man-hours.
The breaches were classified into six different categories, based on which kind of security flaw was exploited, and the underlying functionality and nature of these flaws are discussed. In a short concluding discussion on quantitative assessment, it is recognized that, even though effort is meant to be composed of many different parameters, various time parameters, such as working time, on-line time and CPU time, form an important base for the measure.
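
As a rough illustration of how the reported totals relate to a time-based effort measure, the sketch below computes two crude aggregates from the figures quoted for the full-scale experiment: the mean number of man-hours per successful breach (loosely analogous to a mean time to failure in reliability) and the fraction of activity reports that resulted in a breach. The function names are hypothetical, and the calculation assumes that all 594 man-hours may be attributed evenly to the 63 breaches. That is a simplification for illustration only, not the measure derived in the paper, and it ignores the other parameters that effort is meant to include.

```python
# Figures quoted in the abstract for the second, full-scale experiment.
ACTIVITY_REPORTS = 181
SUCCESSFUL_BREACHES = 63
TOTAL_MAN_HOURS = 594


def mean_effort_per_breach(total_hours: float, breaches: int) -> float:
    """Average working time per successful breach (hypothetical helper).

    Assumes every reported hour counts toward some breach, which is a
    simplification: unsuccessful attempts also consumed time.
    """
    if breaches <= 0:
        raise ValueError("at least one breach is needed to form an average")
    return total_hours / breaches


def breach_fraction(breaches: int, reports: int) -> float:
    """Share of activity reports that resulted in a breach (hypothetical helper)."""
    if reports <= 0:
        raise ValueError("at least one report is needed to form a fraction")
    return breaches / reports


if __name__ == "__main__":
    effort = mean_effort_per_breach(TOTAL_MAN_HOURS, SUCCESSFUL_BREACHES)
    fraction = breach_fraction(SUCCESSFUL_BREACHES, ACTIVITY_REPORTS)
    print(f"Mean effort per breach: {effort:.2f} man-hours")
    print(f"Fraction of reports resulting in a breach: {fraction:.1%}")
```

Such averages hide the variation between attackers and between breach categories, so they are best read as a starting point for the kind of quantitative assessment the paper discusses rather than as an operational security measure in themselves.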