The penal system and treatment programs, such as social therapy, need to be evaluated. This is required by every relevant law that has come into force since the reform of the German federal system in 2006. Treatment and penal research would benefit if conventional designs comparing treated and untreated people were supplemented by comparisons between organizations or organizational units with respect to success variables. This strategy, known as benchmarking in other sectors of society, has rarely been used in German penal research and is applied to social therapeutic treatment in the present article. Using data from a case documentation system, social therapy units in Lower Saxony were compared with respect to the granting of privileges (temporary leave), treatment duration, and employment status at the time of release. Multilevel models were applied. For all three dependent variables, distinct differences were found between the units that cannot be accounted for solely by the individual characteristics of the treated prisoners.

Keywords Prisons · Evaluation · Social therapy · Benchmarking · Multilevel models

Evaluation has many facets. In the now almost unmanageable literature on the systematic assessment of programs and measures using scientific methods, one finds a multitude of distinguishing criteria, categorization schemes, and classification approaches