Abstract. In recent years, process mining has become one of the most important and promising areas of research in the field of business process management, as it helps businesses understand, analyze, and improve their processes. In particular, several techniques and algorithms have been proposed to discover and construct process models from workflow execution logs (i.e., event logs). Existing techniques build mined models by analyzing the relationship between any two events seen in the event logs. Restricted by this, they can handle only special cases of routing constructs and often produce unsound models that do not cover all of the traces seen in the log. In this paper, we propose a novel technique for process discovery using Maximal Pattern Mining (MPM), where we construct patterns based on the whole sequence of events seen in the traces, ensuring the soundness of the mined models. Our MPM technique can handle loops (of any length), duplicate tasks, non-free-choice constructs, and long-distance dependencies. Our evaluation shows that it consistently achieves better recall, precision, F-measure, and efficiency than existing techniques. Furthermore, models discovered with MPM are generally much easier to understand.
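The abstract does not give the MPM algorithm itself, but the core idea of mining maximal patterns from whole traces can be illustrated with a toy sketch: treat each trace as a sequence of events, count contiguous sub-sequences across the log, and keep only the frequent patterns that are not contained in a longer frequent pattern. The function names, the minimum-support threshold, and the restriction to contiguous patterns are assumptions for illustration, not the paper's actual method.

```python
from collections import Counter

def contiguous_patterns(trace, min_len=2):
    """Yield every contiguous subsequence of a trace (as a tuple)."""
    n = len(trace)
    for i in range(n):
        for j in range(i + min_len, n + 1):
            yield tuple(trace[i:j])

def maximal_frequent_patterns(log, min_support=2):
    """Frequent contiguous patterns not contained in a longer frequent one."""
    counts = Counter()
    for trace in log:
        # count each pattern at most once per trace
        for p in set(contiguous_patterns(trace)):
            counts[p] += 1
    frequent = {p for p, c in counts.items() if c >= min_support}

    def contained(small, big):
        return any(big[i:i + len(small)] == small
                   for i in range(len(big) - len(small) + 1))

    # a pattern is maximal if no other frequent pattern contains it
    return {p for p in frequent
            if not any(p != q and contained(p, q) for q in frequent)}

# Toy event log: three traces over activities a-e
log = [["a", "b", "c", "d"], ["a", "b", "c", "e"], ["a", "b", "c", "d"]]
print(sorted(maximal_frequent_patterns(log)))  # [('a', 'b', 'c', 'd')]
```

In this tiny log, every shorter frequent pattern (`ab`, `abc`, `bc`, ...) is subsumed by the maximal pattern `abcd`, which is the kind of whole-trace coverage the abstract credits with keeping mined models sound.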
There are opposing views on whether readers gain any advantage from using a computer model of a 3D physical book. There is enough evidence, both anecdotal and from formal user studies, to suggest that the usual HTML or PDF presentation of documents is not always the most convenient, or the most comfortable, for the reader. On the other hand, it is quite clear that while 3D book models have been prototyped and demonstrated, none are in routine use in today's digital libraries. And how do 3D book models compare with actual books? This paper reports on a user study designed to compare the performance of a practical Realistic Book implementation with conventional formats (HTML and PDF) and with physical books. It also evaluates the annotation features that the implementation provides.
Abstract: One of the major challenges in the era of big data is how to 'clean' the vast amount of data, particularly from micro-blog websites like Twitter. Twitter messages, called tweets, are commonly written in ill-formed ways, including abbreviations, repeated characters, and misspelled words. These 'noisy tweets' require text normalisation techniques to detect such forms and convert them into more accurate English sentences. Several existing techniques have been proposed to solve these issues; however, each technique possesses some limitations and therefore cannot achieve good overall results. This paper aims to evaluate individual existing statistical normalisation methods and their possible combinations, in order to find the best combination that can efficiently clean noisy tweets at the character level, where they contain abbreviations, repeated letters, and misspelled words. Tested on our Twitter sample dataset, the best combination achieves a Bilingual Evaluation Understudy (BLEU) score of 88% and a Word Error Rate (WER) of 7%, both of which are better than the baseline model.
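The three character-level noise types the abstract names (abbreviations, repeated letters, misspellings) can each be handled by a simple statistical or rule-based step, and a normalisation pipeline chains them. The sketch below is a toy illustration under stated assumptions: the abbreviation table and dictionary are tiny hypothetical stand-ins for the large learned resources a real system would use, and the combination shown is not necessarily the paper's best-performing one.

```python
import re

# Hypothetical mini-resources; real pipelines use large lexicons
# and abbreviation tables learned from data.
ABBREVIATIONS = {"u": "you", "gr8": "great", "pls": "please", "c": "see"}
DICTIONARY = {"you", "great", "please", "see", "soon", "cool", "this", "is"}

def squeeze_repeats(word):
    """Collapse runs of 3+ identical letters, preferring a dictionary hit."""
    two = re.sub(r"(.)\1{2,}", r"\1\1", word)  # 'soooon' -> 'soon'
    one = re.sub(r"(.)\1{2,}", r"\1", word)    # 'soooon' -> 'son'
    if two in DICTIONARY:
        return two
    if one in DICTIONARY:
        return one
    return two

def edit_distance(a, b):
    """Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[-1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def normalise(tweet):
    """Expand abbreviations, squeeze repeats, then spell-correct."""
    out = []
    for token in tweet.lower().split():
        token = ABBREVIATIONS.get(token, token)
        token = squeeze_repeats(token)
        if token not in DICTIONARY:
            # fall back to the closest dictionary word within distance 1
            best = min(DICTIONARY, key=lambda w: edit_distance(token, w))
            if edit_distance(token, best) <= 1:
                token = best
        out.append(token)
    return " ".join(out)

print(normalise("c u soooon, this is gr8"))  # see you soon this is great
```

Ordering matters in such a pipeline: expanding abbreviations before spell-correction prevents the corrector from mangling tokens like `gr8` that are not near any dictionary word, which is one reason the paper evaluates combinations of methods rather than each method alone.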