Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicates the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data drawn from official statistics shows its usefulness.
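One common way to write a mixture model with per-feature relevance indicators is the feature-saliency form sketched below. It is offered only as an illustration of the idea in the abstract, not as the authors' exact formulation. All symbols are assumptions introduced here: K components with weights \alpha_k, L categorical features, \rho_l the probability that feature l is relevant, \theta_{kl} the component-specific multinomial parameters, and q(\cdot \mid \theta_l) a common multinomial shared by all components for an irrelevant feature.

```latex
p(\mathbf{y} \mid \theta)
  = \sum_{k=1}^{K} \alpha_k \prod_{l=1}^{L}
    \Big[ \rho_l \, p(y_l \mid \theta_{kl})
        + (1 - \rho_l) \, q(y_l \mid \theta_l) \Big]
```

Under this reading, a feature whose estimated \rho_l is driven towards zero carries no information about cluster membership and can be discarded.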
As the population is becoming increasingly culturally diverse, there is a growing need for nurses to provide culturally competent care. It has been suggested that health disparities exist within ethnic minorities and that providing culturally competent care could reduce these disparities. The aim of the "TransCoCon: Developing Multimedia Learning for Transcultural Collaboration and Competence in Nursing" (GA No 2017-1-UK01-KA203-036612) ERASMUS+ project has been to create innovative, accessible multimedia learning resources that will enable undergraduate nursing students and registered nurses in five countries to develop knowledge and skills that enable self-efficacy to influence direct patient care. A participatory approach based on the ASPIRE framework was followed in order to develop the multimedia learning resources, namely Reusable Learning Objects (RLOs). The ASPIRE framework stands for Aims, Storyboarding, Population, Implementation, Release and Evaluation, and it includes the following steps: i) Participatory Workshop, ii) Specification writing, iii) Peer Review of Specification, followed by amendments, iv) Development of the RLO, v) Review of the RLO, followed by amendments, vi) Evaluation with stakeholders, followed by amendments, vii) Publish the RLO online. The creation of an RLO is a time-consuming process, but in order to ensure its quality, peer review of the content (Specifications) before development, content and technical review once developed, and evaluation by the stakeholders after amendments are considered vital steps in the development process.
This paper will present the evaluation of RLOs with stakeholders at two different stages of their development process, aiming to discuss the value of stakeholder evaluation at different stages of the process and not only once complete. Four RLOs developed by four partners of the TransCoCon project are evaluated: two after the initial Specification, developed using a bespoke Specification tool from the University of Nottingham HELM team that allows preview of the specification as an RLO without the interactivity element, and two at the stage of development. Each RLO was evaluated by 23 nursing students or registered nurses. The results of the evaluation were generally consistent across the four RLOs. Most participants agreed that the RLOs were clear about their objectives, were easy to navigate and introduced new concepts. In addition, a majority of the participants agreed that the content was appropriate for the topic and that they enjoyed learning on their own. In terms of attributes that contributed, or might contribute once developed, to their learning, interactivity, self-test exercises, working at their own pace, and the ability to access the RLO anytime and from anywhere were found important by most participants. Since the RLOs were at a prototype stage or a specification preview, technical problems were identified by the participants in all four RLOs. Technical problems seem to influence the perception of the participants around the reuse of other RLOs. Over 60% of the participants would like...
In data clustering, the problem of selecting the subset of most relevant features from the data has been an active research topic. Feature selection for clustering is a challenging task due to the absence of class labels for guiding the search for relevant features. Most methods proposed for this goal are focused on numerical data. In this work, we propose an approach for clustering and selecting categorical features simultaneously. We assume that the data originate from a finite mixture of multinomial distributions and implement an integrated expectation-maximization (EM) algorithm that estimates all the parameters of the model and selects the subset of relevant features simultaneously. The results obtained on synthetic data illustrate the performance of the proposed approach. An application to real data drawn from official statistics shows its usefulness.
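As a rough illustration of the underlying model, the following is a minimal sketch of plain EM for a finite mixture of categorical (single-trial multinomial) distributions. It deliberately omits the feature-selection step described in the abstract, and it is not the authors' implementation. The function name `em_categorical_mixture`, its parameters, and the smoothing constant `eps` are all hypothetical choices made for this example.

```python
import numpy as np

def em_categorical_mixture(X, n_clusters, n_categories, n_iter=100, seed=0, eps=1e-9):
    """Plain EM for a mixture of independent categorical features.

    X: (n_samples, n_features) integer array, X[i, l] in {0, ..., n_categories[l]-1}.
    Returns mixing weights, per-feature category probabilities, responsibilities.
    Illustrative only: no feature selection, no model-order selection.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    weights = np.full(n_clusters, 1.0 / n_clusters)
    # theta[l] has shape (n_clusters, n_categories[l]); each row sums to 1.
    theta = [rng.dirichlet(np.ones(c), size=n_clusters) for c in n_categories]

    for _ in range(n_iter):
        # E-step: log of (weight * product over features of category probability).
        log_r = np.tile(np.log(weights), (n, 1))
        for l in range(d):
            log_r += np.log(theta[l][:, X[:, l]]).T
        # Normalize in a numerically stable way to get responsibilities.
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: responsibility-weighted frequencies (eps avoids exact zeros).
        weights = resp.mean(axis=0)
        for l in range(d):
            onehot = np.eye(n_categories[l])[X[:, l]]   # (n, n_categories[l])
            counts = resp.T @ onehot + eps              # (n_clusters, n_categories[l])
            theta[l] = counts / counts.sum(axis=1, keepdims=True)

    return weights, theta, resp
```

A call such as `em_categorical_mixture(X, n_clusters=2, n_categories=[3, 3, 3, 3])` returns soft cluster assignments in `resp`; hard labels can be taken as `resp.argmax(axis=1)`. An integrated approach like the one in the abstract would additionally estimate feature relevance inside the same loop.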
Communication professionals increasingly need to be able to read and critically comment on statistical data in order to communicate statistical information appropriately. Consequently, statistics plays an important role in the education of these professionals. Unfortunately, in Portugal most of the students who intend to become communication professionals do not feel at ease with numbers, mathematics or statistics. To engage our students and motivate them to learn, we have selected working-group themes related to their courses and used statistical information available in the real world. With this approach, we expect to get students more involved, make them familiar with using statistical information and let them increase their statistical skills. As a result, everybody in the classroom, both students and teachers, is likely to become more motivated.