The reliability of marking for the final cohort of students to graduate from the psychology degree scheme in place at the University of Northumbria at Newcastle between 1985 and 1993 was investigated. Inter-marker correlations for some course components were low, but the correlation between students' overall first marks and their overall second marks was .93, a value in keeping with those typically reported for national school examinations. The reliability of a student's overall agreed mark was estimated to be .96 and the standard error of measurement to be about 1 per cent. Further analyses went on to consider the influence of question and option choice on reliability, the representativeness of the cohort studied and the effects of agreeing marks rather than simply averaging first and second marks. Cronbach's alpha was proposed as a means of estimating reliability in the absence of second marking and was used to compare the reliability of first and second markers. The possibility of second marking the work only of those students who were classified as borderline on the basis of their first marks was discussed. The paper concludes with a reminder that reliability does not guarantee validity.
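The figures quoted in this abstract can be related to one another with standard psychometric formulas. The sketch below is illustrative only: the abstract does not say which formulas the authors used, so the use of the Spearman-Brown formula for the combined mark is an assumption, and the standard deviation of 5 percentage points is a hypothetical value chosen because it is consistent with the reported standard error of about 1 per cent. The function names are likewise hypothetical.

```python
from statistics import pvariance


def spearman_brown(r, k=2):
    """Reliability of the mean of k parallel markings, given the
    correlation r between two single markings (Spearman-Brown formula)."""
    return k * r / (1 + (k - 1) * r)


def standard_error_of_measurement(sd, reliability):
    """Standard error of measurement: sd * sqrt(1 - reliability)."""
    return sd * (1 - reliability) ** 0.5


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of marks.

    scores: one row per student, one column per assessed component.
    Needs only a single set of marks, so no second marking is required.
    """
    k = len(scores[0])
    component_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(component_vars) / total_var)


# Inter-marker correlation of .93, as reported in the abstract.
combined = spearman_brown(0.93)

# ASSUMPTION: an overall-mark standard deviation of 5 percentage points
# (hypothetical; not given in the abstract).
sem = standard_error_of_measurement(5.0, 0.96)
```

Under these assumptions, a correlation of .93 between first and second marks gives a combined reliability that rounds to .96, in line with the value reported for the agreed mark.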
Health visitors from two health authorities in the North East of England were asked to indicate when they would first propose to visit and how often they would expect to visit in the next 6 months a (fictitious) new client who was breast feeding a 14-day-old baby. The study provided no evidence that the health visitors modified their visiting behaviour in response to a mother's age or age at leaving school, factors that Wright & Walker (1983) had identified as predictors of early termination of breast feeding. However, significant differences were found in proposed visiting behaviour to a number of other new clients who were included in the study to provide an appropriate context and to help disguise the nature of the manipulations. These differences were attributed to the variation in the ages of the babies and to the existence of specific problems. Significant differences were also found between the responses of the health visitors from the two health authorities. These were explained in terms of different caseloads in a mainly rural versus a mainly urban area. The implications of these results for the assumption that the implementation of research findings can be left to individual health professionals are discussed.
If null hypotheses are known to be false before any data are collected, what is the point of conducting hypothesis tests on them? An argument is made that such tests can be useful in establishing the direction of differences between populations but that the practice of drawing directional decisions from two-tailed tests has serious unforeseen consequences. Three-alternative hypothesis testing puts the practice on a sounder theoretical basis and recognises the possibility (thankfully small) of committing Type 3 errors. The issues are illustrated using an idealised example of a developmental psychologist comparing the performance of random samples of children of two different ages on a ‘balancing’ task.
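A three-alternative decision rule of the kind described in this abstract might be sketched as follows. This is a minimal illustration, not the author's procedure: it assumes a large-sample normal approximation in place of a t test, and the function name, outcome labels and scores are hypothetical. The test can conclude that group A exceeds group B, that group B exceeds group A, or that the direction is undetermined. A Type 3 error would be a directional conclusion that is opposite to the true direction.

```python
from statistics import NormalDist, mean, variance


def three_alternative_decision(sample_a, sample_b, alpha=0.05):
    """Return one of three outcomes for the difference in population means.

    ASSUMPTION: a normal approximation to the test statistic, with the
    standard error computed from unpooled sample variances.
    """
    se = (variance(sample_a) / len(sample_a)
          + variance(sample_b) / len(sample_b)) ** 0.5
    z = (mean(sample_a) - mean(sample_b)) / se
    # Two-tailed critical value: alpha is split equally between the tails.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    if z > critical:
        return "A > B"
    if z < -critical:
        return "A < B"
    return "direction undetermined"


# Hypothetical balancing-task scores for older and younger children.
older = [12, 14, 15, 13, 16, 15, 14, 13]
younger = [8, 9, 10, 9, 11, 10, 9, 8]
decision = three_alternative_decision(older, younger)
```

Making the three outcomes explicit separates the claim about direction from a bare rejection of the null hypothesis.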