Abstract: It is 50 years since Sieveking et al. published their pioneering research in Nature on the geochemical analysis of artefacts from Neolithic flint mines in southern Britain. In the decades since, geochemical techniques to source stone artefacts have flourished globally, with a renaissance in recent years from new instrumentation, data analysis, and machine learning techniques. Despite the interest over these latter approaches, there has been variation in the quality with which these methods have been applied. U…
“…A support vector machine is a supervised machine learning model regularly used in classifying archaeological materials [14,49,29,37,80,28], which has utility in comparing and classifying datasets aggregated from digital repositories, comparative collections, open access reports, as well as other digital assets. For this effort, linear data were imported and modeled using the scikit-learn Fig.…”
Recent research into Caddo bottle and biface morphology yielded evidence for two distinct behavioral regions, across which material culture from Caddo burials expresses significant morphological differences. This study asks whether Perdiz arrow points from Caddo burials differ across the same geography, which would extend the pattern of morphological differences to a third category of Caddo material culture. Perdiz arrow points collected from the geographies of the northern and southern Caddo behavioral regions were employed to test the hypothesis that morphological attributes differ, and are predictable, between the two communities. The analysis of linear metrics indicated a significant difference in morphology by behavioral region. Using the linear metrics combined with the tools of machine learning, a predictive model (a support vector machine) was designed to assess the degree to which community differences could be predicted, achieving a receiver operator curve score of 97 percent, and an accuracy score of 94 percent. The subsequent landmark geometric morphometric analysis identified significant differences in Perdiz arrow point shape and size between the behavioral regions, one characterized … Components of the analytical workflow were developed and …
“…To date, ML has been applied in a number of lithic studies addressing a wide variety of anthropological questions: identifying heat-treated raw material nodules, a practice employed to improve the ease of working raw nodules into stone artifacts [15]; identifying the materials worked by a stone tool according to the classification of the use-wear created on its edge [16], [17]; predicting the original flake mass from variables on the striking platform in order to quantify the degree of resharpening (and thus the length of its use-life as a tool) [18]; predicting site formation conditions from the surface alteration of the site's lithic artifacts [19]; creating more quantitatively rigorous approaches to the creation of typologies for studying artifact shape through time and space [20], [21]; predicting the raw material of the stone tool from the cut marks produced by the edge [22]; identifying the geochemical signatures of geological sources of lithic raw materials as a means of studying prehistoric mobility and material selection criteria [23], [24]; distinguishing the flake products from different reduction strategies for exploiting the volume of a core [25]; distinguishing chronological manifestations of lithic behavior between the Middle and Late Stone Age in Africa through the presence vs. absence of types within assemblages [26]; developing virtual knapping software [27]; and quantifying lithic knapping skill acquisition for studying the evolution of human cognition [28].…”
Section: B Lithic Technology; citation type: mentioning (confidence: 99%)
“…The studies [3], [4] apply standardization of the data before a train-test split, while the studies [8], [11] (and other works by these authors) apply PCA for dimension reduction before the train-test split. In [24] the authors use the t-SNE embedding to visualize the dataset in two dimensions and manually remove "outliers" before the train-test split. Due to the difficulty interpreting the t-SNE embedding, the "outliers" removed could in fact be valid datapoints that are simply difficult to classify, thereby artificially increasing accuracy scores.…”
Machine learning (ML), being now widely accessible to the research community at large, has fostered a proliferation of new and striking applications of these emergent mathematical techniques across a wide range of disciplines. In this paper, we will focus on a particular case study: the field of paleoanthropology, which seeks to understand the evolution of the human species based on biological (e.g. bones, genetics) and cultural (e.g. stone tools) evidence. As we will show, the easy availability of ML algorithms and lack of expertise on their proper use among the anthropological research community has led to foundational misapplications that have appeared throughout the literature. The resulting unreliable results not only undermine efforts to legitimately incorporate ML into anthropological research, but produce potentially faulty understandings about our human evolutionary and behavioral past. The aim of this paper is to provide a brief introduction to some of the ways in which ML has been applied within paleoanthropology; we also include a survey of some basic ML algorithms for those who are not fully conversant with the field, which remains under active development. We discuss a series of missteps, errors, and violations of correct protocols of ML methods that appear disconcertingly often within the accumulating body of anthropological literature. These mistakes include use of outdated algorithms and practices; inappropriate testing/training splits, sample composition, and textual explanations; as well as an absence of transparency due to the lack of data/code sharing, and the subsequent limitations imposed on independent replication. We assert that expanding samples, sharing data and code, re-evaluating approaches to peer review, and, most importantly, developing interdisciplinary teams that include experts in ML are all necessary for progress in future research incorporating ML within anthropology and beyond.
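The train-test split problem raised in the excerpt above (standardizing the data before splitting it) can be illustrated with a minimal sketch. It uses only the Python standard library; the measurements, the 25 percent test fraction, and the helper names `split_indices`, `fit_scaler`, and `apply_scaler` are hypothetical and are not taken from any of the cited studies. The point is only the ordering: the scaling parameters are estimated from the training subset alone and then reused on the test subset, so no information from the test data reaches the preprocessing step.

```python
import random
import statistics


def split_indices(n, test_fraction=0.25, seed=0):
    """Shuffle the indices 0..n-1 and return (train_indices, test_indices)."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def fit_scaler(values):
    """Estimate the mean and population standard deviation from the given values."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        std = 1.0  # avoid division by zero for a constant feature
    return mean, std


def apply_scaler(values, mean, std):
    """Standardize values with previously estimated parameters."""
    return [(v - mean) / std for v in values]


# Hypothetical data: 40 simulated length measurements (not real artifact data).
data_rng = random.Random(42)
lengths = [data_rng.gauss(30.0, 5.0) for _ in range(40)]

# 1. Split first.
train_idx, test_idx = split_indices(len(lengths))
train = [lengths[i] for i in train_idx]
test = [lengths[i] for i in test_idx]

# 2. Estimate the scaling parameters from the training subset only.
mean, std = fit_scaler(train)

# 3. Apply the same parameters to both subsets.
train_scaled = apply_scaler(train, mean, std)
test_scaled = apply_scaler(test, mean, std)
```

Fitting the scaler on all 40 values before splitting would reverse steps 1 and 2, which is the ordering the excerpt criticizes. The same reasoning applies to PCA or any other preprocessing step whose parameters are estimated from data.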
“…A support vector machine is a supervised machine learning model regularly used in classifying archaeological materials [36,37,38,39,40,41], which has utility in comparing and classifying datasets aggregated from digital repositories, comparative collections, open access reports, as well as other digital assets. For this effort, linear data were imported and modeled using the scikit-learn package in Python [42,43] (supplementary materials), and subsequently split into training (75 percent) and testing (25 percent) subsets.…”
Recent research in the ancestral Caddo area yielded evidence for distinct _behavioral regions_, across which material culture from Caddo burials—bottles and Gahagan bifaces—has been found to express significant morphological differences. This inquiry assesses whether Perdiz arrow points from Caddo burials, assumed to reflect design intent, may differ across the same geography, and extend the pattern of shape differences to a third category of Caddo material culture. Perdiz arrow points collected from the geographies of the northern and southern Caddo _behavioral regions_ defined in a recent social network analysis were employed to test the hypothesis that morphological attributes differ, and are predictable, between the two communities. Results indicate significant between-community differences in maximum length, width, stem length, and stem width, but not thickness. Using the same traditional metrics combined with the tools of machine learning, a predictive model---support vector machine---was designed to assess the degree to which community differences could be predicted, achieving a receiver operator curve score of 97 percent, and an accuracy score of 94 percent. The subsequent geometric morphometric analysis identified significant differences in Perdiz arrow point shape, size, and allometry, coupled with significant results for modularity and morphological integration. These findings bolster recent arguments that established two discrete _behavioral regions_ in the ancestral Caddo area defined on the basis of discernible morphological differences across three categories of Caddo material culture.
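The workflow described in the excerpt above (linear metrics modeled with a support vector machine in scikit-learn, split into 75 percent training and 25 percent testing subsets) might look roughly like the following sketch. This is not the authors' code: the data are synthetic, generated with `make_classification` as a stand-in for five linear metrics and two community labels, and the linear kernel, the stratified split, the scaling step, and the random seeds are all assumptions made for illustration. The scores it produces say nothing about the published 97 and 94 percent figures.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for five linear metrics and a binary region label
# (assumption: not the Perdiz arrow point data).
X, y = make_classification(
    n_samples=200,
    n_features=5,
    n_informative=4,
    n_redundant=1,
    random_state=0,
)

# 75 percent training, 25 percent testing, as stated in the excerpt.
# Stratifying on the label is an assumption, not something the excerpt states.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The pipeline fits the scaler on the training subset only, then the SVM.
# The linear kernel is an assumption; the excerpt does not name a kernel.
model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
model.fit(X_train, y_train)

# Accuracy uses hard predictions; ROC AUC uses the signed distance
# to the decision boundary as a continuous score.
accuracy = accuracy_score(y_test, model.predict(X_test))
roc_auc = roc_auc_score(y_test, model.decision_function(X_test))

print(f"accuracy: {accuracy:.2f}, ROC AUC: {roc_auc:.2f}")
```

Wrapping the scaler and classifier in one pipeline keeps preprocessing inside the training step, so the held-out 25 percent is touched only when the two scores are computed.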