This paper assesses topic coherence and human topic ranking of uncovered latent topics from scientific publications when utilizing the topic model latent Dirichlet allocation (LDA) on abstract and full-text data. The coherence of a topic, used as a proxy for topic quality, is based on the distributional hypothesis that states that words with similar meaning tend to co-occur within a similar context. Although LDA has gained much attention from machine-learning researchers, most notably with its adaptations and extensions, little is known about the effects of different types of textual data on generated topics. Our research is the first to explore these practical effects and shows that document frequency, document word length, and vocabulary size have mixed practical effects on topic coherence and human topic ranking of LDA topics. We furthermore show that large document collections are less affected by incorrect or noise terms being part of the topic-word distributions, causing topics to be more coherent and ranked higher. Differences between abstract and full-text data are more apparent within small document collections, with differences as large as 90% high-quality topics for full-text data, compared to 50% high-quality topics for abstract data.
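The distributional idea behind topic coherence can be illustrated with a small sketch. The abstract does not say which coherence measure the paper uses, so the example below assumes normalized pointwise mutual information (NPMI) over document-level co-occurrence, one commonly used choice. The function name `npmi_coherence` and the toy corpus are hypothetical and serve only as illustration.

```python
import math
from itertools import combinations


def npmi_coherence(top_words, documents):
    """Mean NPMI over all pairs of a topic's top words.

    Co-occurrence is counted at the document level: two words co-occur
    when both appear in the same document. This is an assumed measure,
    not necessarily the one used in the paper.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def count(*words):
        # Number of documents containing all the given words.
        return sum(all(w in d for w in words) for d in doc_sets)

    scores = []
    for w1, w2 in combinations(top_words, 2):
        c1, c2, c12 = count(w1), count(w2), count(w1, w2)
        if c12 == 0:
            scores.append(-1.0)  # words never co-occur: lowest NPMI
        elif c12 == n_docs:
            scores.append(0.0)   # both words in every document: no information
        else:
            p1, p2, p12 = c1 / n_docs, c2 / n_docs, c12 / n_docs
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Hypothetical toy corpus of tokenized documents.
docs = [
    ["fish", "stock", "catch"],
    ["fish", "catch", "effort"],
    ["fish", "stock", "model"],
    ["network", "author", "citation"],
    ["network", "author", "journal"],
]

coherent_topic = npmi_coherence(["fish", "catch", "stock"], docs)
mixed_topic = npmi_coherence(["fish", "network", "journal"], docs)
```

In this toy setting the topic whose top words share documents scores higher than the topic that mixes words from unrelated documents, which is the behaviour a coherence measure is meant to capture.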
Accurate detection of accelerometer non-wear time is crucial for calculating physical activity summary statistics. In this study, we evaluated three epoch-based non-wear algorithms (Hecht, Troiano, and Choi) and one raw-based algorithm (Hees). In addition, we performed a sensitivity analysis to provide insight into the relationship between the algorithms' hyperparameters and classification performance, as well as to generate tuned hyperparameter values to better detect episodes of wear and non-wear time. We used machine learning to construct a gold-standard dataset by combining two accelerometers and electrocardiogram recordings. The Hecht and Troiano algorithms achieved poor classification performance, while Choi exhibited moderate performance. Meanwhile, Hees outperformed all epoch-based algorithms. The sensitivity analysis and hyperparameter tuning revealed that all algorithms were able to achieve increased classification performance by employing larger intervals and windows, while more stringently defining artificial movement. These classification gains were associated with the ability to lower the false positives (type I error) and do not necessarily indicate a more accurate detection of the total non-wear time. Moreover, our results indicate that with tuned hyperparameters, epoch-based non-wear algorithms are able to perform just as well as raw-based non-wear algorithms with respect to their ability to correctly detect true wear and non-wear episodes.
Accelerometers are increasingly used as an objective tool to study daily physical activity (PA) 1-4. Currently, accelerometers are capable of measuring the body's acceleration in all three spatial axes and are used as a proxy for PA intensity and duration 5. Accelerometers offer versatility, minimal participation burden, and relative cost efficiency 6,7, and they also limit the information bias commonly found in PA self-report measures, such as questionnaires, activity logs and diaries 8-10.
Besides collecting data on PA intensity and duration, accelerometers have also been used successfully for activity type recognition 11,12, body posture and movement classification 13, energy expenditure prediction 14, and sleep pattern estimation 15. Presently, there exists an overwhelming amount of accelerometry data collection and processing criteria addressing a variety of research needs 16,17. Examples include minimum daily wear time 18, body placement of the accelerometer 19, the use of raw recordings (i.e., gravity units) compared with count-based recordings 20, cut-points for intensity classification 21, and determination of accelerometer non-wear time 16. However, the latter, determination of the time during which the accelerometer is not worn (non-wear time), has received little attention in the literature 22, and numerous studies have failed to address it entirely 16,23. Determining the non-wear time of the accelerometer is important in assessing study compliance and accurately calculating summary statistics 5, such as minutes spent sedentary, or in light...
Despite increased fisheries science output and publication outlets, the global crisis in fisheries management is as present as ever. Since a narrow research focus may be a contributing factor to this failure, this study uncovers topics in fisheries research and their trends over time. This interdisciplinary research evaluates whether science is diversifying fisheries research topics in an attempt to capture the complexity of the fisheries system, or whether it is multiplying research on similar topics, attempting to achieve an in-depth, but possibly marginal, understanding of a few selected compo- This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
Modeling has become the most commonly used method in fisheries science, with numerous types of models and approaches available today. The large variety of models and the overwhelming amount of scientific literature published yearly can make it difficult to effectively access and use the output of fisheries modeling publications. In particular, the underlying topic of an article cannot always be detected using keyword searches. As a consequence, identifying the developments and trends within fisheries modeling research can be challenging and time-consuming. This paper utilizes a machine learning algorithm to uncover hidden topics and subtopics from peer-reviewed fisheries modeling publications and identifies temporal trends using 22,236 full-text articles extracted from 13 top-tier fisheries journals from 1990 to 2016. Two modeling topics were discovered: estimation models (a topic that contains the idea of catch, effort, and abundance estimation) and stock assessment models (a topic on the assessment of the current state of a fishery and future projections of fish stock responses and management effects). The underlying modeling subtopics show a change in the research focus of modeling publications over the last 26 years.
As socio‐environmental problems have proliferated over the past decades, one narrative which has captured the attention of policymakers and scientists has been the need for collaborative research that spans traditional boundaries. Collaboration, it is argued, is imperative for solving these problems. Understanding how collaboration is occurring in practice is important, however, and may help explain the idea space across a field. In an effort to make sense of the shape of fisheries science, here we construct a co‐authorship network of the field, from a data set comprising 73,240 scientific articles, drawn from 50 journals and published between 2000 and 2017. Using a combination of social network analysis and machine learning, the work first maps the global structure of scientific collaboration amongst fisheries scientists at the author, country and institutional levels. Second, it uncovers the hidden subgroups—here country clusters and communities of authors—within the network, detailing also the topical focus, publication outlets and relative impact of the largest fisheries science communities. We find that whilst the fisheries science network is becoming more geographically extensive, it is simultaneously becoming more intensive. The uncovered network exhibits characteristics suggestive of a thin style of collaboration, and groupings that are more regional than they are global. Although likely shaped by an array of overlapping micro‐ and macro‐level factors, the analysis reveals a number of political–economic patterns that merit reflection by both fisheries scientists and policymakers.
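As a rough illustration of how a co-authorship network can be derived from article metadata, the sketch below builds weighted author-to-author edges from per-paper author lists and counts each author's distinct collaborators. It uses only the Python standard library. The function names and the toy author lists are hypothetical, and the abstract does not describe the tooling or the network measures actually used in the study.

```python
from collections import Counter
from itertools import combinations


def coauthorship_edges(papers):
    """Count how often each pair of authors appears on the same paper.

    `papers` is a list of author lists; the result maps an alphabetically
    ordered (author_a, author_b) pair to the number of shared papers.
    """
    edges = Counter()
    for authors in papers:
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges


def collaborator_counts(edges):
    """Number of distinct co-authors (node degree) for every author."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {author: len(ns) for author, ns in neighbours.items()}


# Hypothetical author lists for three papers.
papers = [["A", "B", "C"], ["A", "B"], ["C", "D"]]

edges = coauthorship_edges(papers)
degrees = collaborator_counts(edges)
```

The same edge list could be aggregated by country or institution instead of by author, by replacing each author name with the corresponding affiliation before counting pairs.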
Background: Pleomorphic adenoma (PA) accounts for 45–74% of all the salivary gland neoplasms, of which 40–70% are present in minor salivary glands. Studies have depicted variations in histological typing and classification of these tumors. Its pleomorphism is attributed to the cytological differentiations of the epithelial components and the diverse stromal components. Biochemical investigations of saliva have revealed “mucins” to be its main component. Mucins reflect in their composition, the functional state of the mucosa, both in health and disease. Many reviews on histochemical classification and identification have been put forward to explain the intricacies of mucins; however, no attempts have been made to classify salivary gland tumors based on their mucin profiles and assess its prognostic significance. Thus, this study was executed to analyze the clinical, histopathological and histochemical behavior of PA of minor salivary glands and decipher a correlation. Materials and Methods: Twenty-six diagnosed cases of PA of minor salivary glands and five controls of normal minor salivary glands of the hard palate were included in the study. Blocks were retrieved, sectioned and stained with hematoxylin and eosin (H and E) stain as well as combined Alcian blue (AB)-periodic acid-Schiff (PAS) stains. Results: The stained slides revealed an array of epithelial and stromal patterns and varying heterogeneity of mucin expression of normal and neoplastic minor salivary glands. Conclusion: The study elucidated the role of mucins in tumorigenesis and its prognostic implications.
To date, non-wear detection algorithms commonly employ a 30-, 60-, or even 90-minute interval or window in which acceleration values need to be below a threshold value. A major drawback of such intervals is that they need to be long enough to prevent false positives (type I errors), while short enough to prevent false negatives (type II errors), which limits their ability to detect both short and long episodes of non-wear time. In this paper, we propose a novel non-wear detection algorithm that eliminates the need for an interval. Rather than inspecting acceleration within intervals, we explore acceleration right before and right after an episode of non-wear time. We trained a deep convolutional neural network that was able to infer non-wear time by detecting when the accelerometer was removed and when it was placed back on again. We evaluate our algorithm against several baseline and existing non-wear algorithms, and our algorithm achieves perfect precision, a recall of 0.9962, and an F1 score of 0.9981, outperforming all evaluated algorithms. Although our algorithm was developed using patterns learned from a hip-worn accelerometer, we propose algorithmic steps that can easily be applied to a wrist-worn accelerometer and a retrained classification model.
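The interval trade-off described above can be made concrete with a minimal sketch of a generic interval-based detector. It is not the Hecht, Troiano, Choi, or Hees algorithm, and it is not the proposed neural network approach. The function name `interval_non_wear`, the per-minute toy signal, and the parameter values are assumptions chosen for illustration.

```python
def interval_non_wear(values, threshold=0.0, min_length=60):
    """Flag epochs that lie in a quiet run of at least `min_length` epochs.

    An epoch is "quiet" when its value is at or below `threshold`.
    Returns a list of booleans, True meaning "classified as non-wear".
    """
    flags = [False] * len(values)
    run_start = None
    # A trailing None acts as a sentinel that closes a run ending at the
    # last epoch.
    for i, v in enumerate(list(values) + [None]):
        quiet = v is not None and v <= threshold
        if quiet and run_start is None:
            run_start = i
        elif not quiet and run_start is not None:
            if i - run_start >= min_length:
                flags[run_start:i] = [True] * (i - run_start)
            run_start = None
    return flags


# Hypothetical per-minute activity values: a 20-minute quiet episode and a
# 90-minute quiet episode, separated by periods of movement.
signal = [5] * 10 + [0] * 20 + [5] * 10 + [0] * 90 + [5] * 10

long_window = interval_non_wear(signal, min_length=60)
short_window = interval_non_wear(signal, min_length=15)
```

With a 60-minute window only the 90-minute episode is flagged, so a genuine 20-minute non-wear episode would be missed (a false negative). With a 15-minute window both episodes are flagged, so a 20-minute sedentary period while wearing the device would be misclassified as non-wear (a false positive).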