Machine learning is aiding the interpretation of biological complexity by enabling the inference and classification of cellular, organismal and ecological phenotypes from large datasets, e.g. from genomic, transcriptomic and metagenomic analyses. A number of available algorithms can search these datasets to uncover patterns associated with specific traits, including disease-related attributes. While in many instances it is sufficient to treat an algorithm as a black box, understanding how system variables contribute to a specific output offers an avenue towards new mechanistic insight. Here we address this challenge through a suite of algorithms, named BowSaw, which takes advantage of the structure of a trained random forest to identify combinations of variables ("rules") frequently used for classification. We first apply BowSaw to a simulated dataset and show that it can accurately recover the sets of variables used to generate the phenotypes through complex Boolean rules, even under challenging noise levels. We next apply our method to data from the integrative Human Microbiome Project and find previously unreported high-order combinations of microbial taxa putatively associated with Crohn's disease. By leveraging the structure of trees within a random forest, BowSaw provides a new way of using decision trees to generate testable biological hypotheses.
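As a rough illustration of the general idea of mining a trained forest for frequently co-used variables, the following minimal sketch counts which pairs of variables appear together on the decision paths that assign samples to a given class. It is not the BowSaw implementation: the nested-dictionary tree representation, the function names `path_features` and `count_combinations`, and the hand-built three-tree forest are all hypothetical and chosen only to keep the example self-contained.

```python
from collections import Counter
from itertools import combinations

def path_features(tree, sample):
    """Follow one sample down one tree; return (variables tested, leaf label)."""
    used = set()
    node = tree
    while "label" not in node:
        used.add(node["feature"])
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return used, node["label"]

def count_combinations(forest, samples, target_label, size=2):
    """Count variable combinations on paths that end in target_label."""
    counts = Counter()
    for sample in samples:
        for tree in forest:
            used, label = path_features(tree, sample)
            if label != target_label:
                continue
            for combo in combinations(sorted(used), size):
                counts[combo] += 1
    return counts

def leaf(label):
    return {"label": label}

def split(feature, left, right):
    # Binary variables, so a threshold of 0.5 separates 0 from 1.
    return {"feature": feature, "threshold": 0.5, "left": left, "right": right}

# Hypothetical toy forest: two trees encode "A AND B", one tree is noisier.
forest = [
    split("A", leaf(0), split("B", leaf(0), leaf(1))),
    split("B", leaf(0), split("A", leaf(0), leaf(1))),
    split("C", split("A", leaf(0), leaf(1)), split("B", leaf(0), leaf(1))),
]

# Hypothetical samples with three binary variables.
samples = [
    {"A": 1, "B": 1, "C": 0},
    {"A": 1, "B": 1, "C": 1},
    {"A": 1, "B": 0, "C": 0},
    {"A": 0, "B": 1, "C": 1},
]

counts = count_combinations(forest, samples, target_label=1)
for combo, n in counts.most_common():
    print(combo, n)
```

In this toy setting the pair (A, B) is counted most often, reflecting the Boolean rule built into two of the three trees. A real analysis would extract paths from a fitted random forest rather than from hand-written dictionaries, and would need a principled way to rank and prune candidate rules.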