Purpose
Text mining is growing in importance in proportion to the growth of unstructured data, and its applications are expanding from knowledge management to social media analysis. Mapping a candidate's skill set to the requirements of a job profile is crucial both for new recruitment and for internal task allocation in an organization. Automating candidate selection is essential to avoid the bias or subjectivity that may arise while sifting through thousands of resumes and other informative documents. The system takes skill sets in the form of documents to build the semantic space, then takes appraisals or resumes as input and suggests the persons suited to a task or job position, as well as employees needing additional training. The purpose of this study is to extend the term-document matrix and achieve refined clusters to produce an improved recommendation. The study also focuses on keeping cluster quality consistent as the data set grows, to address scalability issues.

Design/methodology/approach
In this study, a synset-based document matrix construction method is proposed in which semantically similar terms are grouped to mitigate the curse of dimensionality. An automated Task Recommendation System is proposed, comprising synset-based feature extraction, iterative semantic clustering and mapping based on semantic similarity.

Findings
The first step in knowledge extraction from unstructured textual data is converting it into a structured form, either as a term frequency–inverse document frequency (TF-IDF) matrix or as a synset-based TF-IDF matrix. Once the data is in structured form, a range of mining algorithms from classification to clustering can be applied. The algorithm gives a better feature vector representation and improved cluster quality. The synset-based grouping and feature extraction for resume data optimizes the candidate selection process by reducing entropy and error and by improving precision and scalability.

Research limitations/implications
The productivity of any organization is enhanced by assigning tasks to employees with the right set of skills. Efficient recruitment and task allocation can not only improve productivity but also help satisfy employee aspirations and identify training requirements.

Practical implications
Industries can use the approach to support human resource management processes such as promotions, recruitment and training and, thus, manage the talent pool.

Social implications
The task recommender system creates knowledge by following the steps of the knowledge management cycle, and this methodology can be adopted in other similar knowledge management applications.

Originality/value
The efficacy of the proposed approach and its enhancement is validated by experiments on a benchmark dataset of resumes. The results are compared with existing techniques and show refined clusters. Specifically, absolute error is reduced by 30 per cent, precision is increased by 20 per cent and the number of dimensions is lowered by 60 per cent relative to the existing technique. The proposed approach also addresses scalability by producing improved recommendations for 1,000 resumes with reduced entropy.
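The abstract reports cluster quality in terms of entropy but does not give the formula used. A minimal sketch, assuming the common definition (size-weighted Shannon entropy of the true class labels inside each cluster), could look as follows. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def cluster_entropy(labels):
    """Shannon entropy (base 2) of the class labels inside one cluster."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def weighted_entropy(clusters):
    """Size-weighted average entropy over all clusters (lower is purer)."""
    n = sum(len(c) for c in clusters)
    return sum(len(c) / n * cluster_entropy(c) for c in clusters)


# Hypothetical example: each inner list holds the true job category
# of the resumes that ended up in one cluster.
clusters = [
    ["java", "java", "java", "python"],  # mixed cluster
    ["hr", "hr", "hr", "hr"],            # pure cluster, entropy 0
]
print(round(weighted_entropy(clusters), 4))
```

A perfectly pure clustering scores 0, so a reduction in this value after synset grouping would correspond to the "reduced entropy" the abstract describes.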
The authors propose a clustering-based, multistep iterative algorithm. The key step groups terms by synonyms, taking advantage of a semantic relativity measure between the terms. The term frequency of a synonym group is computed by considering the relativity measure of each term appearing in the document with respect to the parent term of the group. This increases the importance of terms that individually appear infrequently but together show a strong presence. The authors ran experiments on different real and artificial datasets such as NEWS 20, Reuters, emails and research papers on different topics. The resulting entropy shows that their algorithm gives improved results on well-articulated documents, such as research papers. The gains are marginal on documents where the message is emphasized by repetition of terms, specifically rapidly generated documents such as emails. The authors also observed that newly arrived documents get appropriately mapped based on proximity to the semantic group.
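The synonym-grouping step described above can be illustrated with a small sketch. It assumes that each synonym group has a parent term and that every member carries a relatedness score in [0, 1] relative to that parent. The groups, the scores and the function name are hypothetical placeholders, and the weighting rule (count multiplied by relatedness, summed over the group) is one plausible reading of the description rather than the authors' exact formula.

```python
from collections import Counter

# Hypothetical synonym groups: parent term -> {member term: relatedness to parent}.
# In practice these could come from a lexical resource such as WordNet.
SYNSET_GROUPS = {
    "develop": {"develop": 1.0, "build": 0.8, "implement": 0.7},
    "manage": {"manage": 1.0, "lead": 0.6},
}


def grouped_term_frequency(tokens, groups):
    """Collapse synonyms onto their parent term, weighting each
    occurrence by the member's relatedness to the parent."""
    counts = Counter(tokens)
    tf = {}
    for parent, members in groups.items():
        score = sum(counts[term] * weight for term, weight in members.items())
        if score > 0:
            tf[parent] = score
    return tf


# "build" and "implement" never reach a high count alone,
# but together they give the "develop" group a strong presence.
doc = "build implement build lead test".split()
tf = grouped_term_frequency(doc, SYNSET_GROUPS)
print(tf)
```

In this sketch, tokens that belong to no group (here "test") are simply dropped. A fuller version would keep them as their own dimensions. The grouping also shrinks the feature space, since three surface terms map to the single dimension "develop".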
Purpose
This paper utilizes data mining to study the effect of Problem Based Learning (PBL), an innovative pedagogical approach that has been implemented in undergraduate education at a private university in India for teaching Statistics and Operations Research (OR) to techno-management students.

Design/methodology/approach
The study follows the assumptions of an in-situ experiment. It employs BBA (IT) and BCA students as subjects and their end-of-semester GPA as a performance indicator. The pedagogical approach in this study integrates PBL with classroom teaching. The paper uses a combination of statistics and data mining to analyze the impact of PBL and establish research conclusions.

Findings
The study concludes that the introduction of PBL results in an improved GPA for students with a math background. PBL is more effective for BBA (IT) male students. Female students seem to perform equally well irrespective of the inclusion of PBL. Pattern analysis of shape parameters evidences the impact of PBL, and the results are established through a decision tree and a test of proportions.

Research limitations/implications
The study is limited to students from a single institute.

Practical implications
The pattern analysis applied in this paper can be scaled to evaluate the impact of any innovative pedagogical approach, agnostic of the field of study. Facilitators can use the process defined in the paper to implement PBL for teaching Statistics and Operations Research. Shape parameters of the batch in the previous semester can be used by facilitators to plan remedial action for the next semester by classifying students as desirable/non-desirable. Techno-management institutes can alleviate the dread and fear of mathematical subjects by integrating PBL with classroom teaching.

Originality/value
The study utilizes an innovative analytical approach of combining shape parameters with classification. It is further unique in arriving at a classification of batch performance as desirable/non-desirable, and it utilizes data mining to emphasize the delineating impact of PBL across critical parameters of both the batch and the student. The study also defines a framework for the implementation of PBL for a techno-management program in Statistics and Operations Research.
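The abstract does not say which shape parameters or which form of the test of proportions were used. As a rough sketch, assuming the shape parameters are the skewness and excess kurtosis of a batch's GPA distribution and the test is a pooled two-proportion z-test, the calculations could be written as below. All function names and numbers are hypothetical.

```python
import math


def _central_moment(xs, k):
    """k-th central moment of a sample (population form, divides by n)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** k for x in xs) / len(xs)


def skewness(xs):
    """Moment-based skewness: m3 / m2^1.5."""
    return _central_moment(xs, 3) / _central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (0 for a normal curve)."""
    return _central_moment(xs, 4) / _central_moment(xs, 2) ** 2 - 3


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical counts: students above a GPA threshold with and without PBL.
z, p = two_proportion_z(30, 50, 20, 50)
print(round(z, 3), round(p, 4))
```

Under these assumptions, a batch could be labelled desirable or non-desirable from the sign and size of its skewness and kurtosis, and the proportion test would then check whether the share of well-performing students differs between the PBL and non-PBL groups.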