Theme identification is one of the most fundamental tasks in qualitative research. It also is one of the most mysterious. Explicit descriptions of theme discovery are rarely found in articles and reports, and when they are, they are often relegated to appendices or footnotes. Techniques are shared among small groups of social scientists, but sharing is impeded by disciplinary or epistemological boundaries. The techniques described here are drawn from across epistemological and disciplinary boundaries. They include both observational and manipulative techniques and range from quick word counts to laborious, in-depth, line-by-line scrutiny. Techniques are compared on six dimensions: (1) appropriateness for data types, (2) required labor, (3) required expertise, (4) stage of analysis, (5) number and types of themes to be generated, and (6) issues of reliability and validity.
Sample size determination for open-ended questions or qualitative interviews relies primarily on custom and finding the point where little new information is obtained (thematic saturation). Here, we propose and test a refined definition of saturation as obtaining the most salient items in a set of qualitative interviews (where items can be material things or concepts, depending on the topic of study) rather than attempting to obtain all the items. Salient items have higher prevalence and are more culturally important. To do this, we explore saturation, salience, sample size, and domain size in 28 sets of interviews in which respondents were asked to list all the things they could think of in one of 18 topical domains. The domains—like kinds of fruits (highly bounded) and things that mothers do (unbounded)—varied greatly in size. The datasets comprise 20–99 interviews each (1,147 total interviews). When saturation was defined as the point where less than one new item per person would be expected, the median sample size for reaching saturation was 75 (range = 15–194). Thematic saturation was, as expected, related to domain size. It was also related to the amount of information contributed by each respondent but, unexpectedly, was reached more quickly when respondents contributed less information. In contrast, a greater amount of information per person increased the retrieval of salient items. Even small samples (n = 10) produced 95% of the most salient ideas with exhaustive listing, but only 53% of those items were captured with limited responses per person (three). For most domains, item salience appeared to be a more useful concept for thinking about sample size adequacy than finding the point of thematic saturation. Thus, we advance the concept of saturation in salience and emphasize probing to increase the amount of information collected per respondent to increase sample efficiency.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.