Process mining employs event logs to provide insights into the actual processes. Event logs are recorded by information systems and contain valuable information helping organizations to improve their processes. However, these data also include highly sensitive private information which is a major concern when applying process mining. Therefore, privacy preservation in process mining is growing in importance, and new techniques are being introduced. The effectiveness of the proposed privacy preservation techniques needs to be evaluated. It is important to measure both sensitive data protection and data utility preservation. In this paper, we propose an approach to quantify the effectiveness of privacy preservation techniques. We introduce two measures for quantifying disclosure risks to evaluate the sensitive data protection aspect. Moreover, a measure is proposed to quantify data utility preservation for the main process mining activities. The proposed measures have been tested using various real-life event logs.
An online community is a virtual community where people can freely express their opinions and share their knowledge. Online communities contain a great deal of information; however, there is no way to determine its authenticity. Thus, the knowledge shared in online communities is not reliable. By determining the expertise level of users and finding experts in online communities, the accuracy of posted comments can be evaluated. In this study, a hybrid method for expert finding in online communities is presented, which is based on content analysis and social network analysis. The content analysis is based on a concept map and the social network analysis is based on the PageRank algorithm. To evaluate the proposed method, the Java online community was selected, and the correlation between our results and the scores provided by the Java online community was calculated. Based on the obtained results, the Spearman correlation for 11 subcategories of the Java online community using this method is 0.904, which is a highly acceptable value.
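The social network part of such a method, and the rank-correlation evaluation, can be illustrated with a minimal sketch. This is not the authors' implementation: the edge direction (asker points to answerer), the damping factor of 0.85, the user names, and the function names are all assumptions made for illustration, and the concept-map content analysis is omitted because the abstract does not specify how it is computed or combined with PageRank.

```python
from typing import Dict, List, Sequence, Tuple


def pagerank(edges: List[Tuple[str, str]], damping: float = 0.85,
             iterations: int = 100) -> Dict[str, float]:
    """Power-iteration PageRank on a directed graph given as (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links: Dict[str, List[str]] = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out_links[u])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n
                    for node in nodes}
        for u in nodes:
            if out_links[u]:
                share = damping * rank[u] / len(out_links[u])
                for v in out_links[u]:
                    new_rank[v] += share
        rank = new_rank
    return rank


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return covariance / (spread_x * spread_y)


# Hypothetical reply graph: an edge (asker, answerer) means the answerer replied to the asker.
reply_edges = [("ann", "carl"), ("bob", "carl"), ("ann", "bob")]
scores = pagerank(reply_edges)
```

In this toy graph `carl` answers two users and therefore receives the highest score. To mimic the evaluation described above, the computed scores for a set of users would be passed to `spearman` together with the scores published by the community.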
Process mining aims to bridge the gap between data science and process science by providing a variety of powerful data-driven analysis techniques on the basis of event data. These techniques encompass automatically discovering process models, detecting and predicting bottlenecks, and finding process deviations. In process mining, event data containing the full breadth of resource information allows for performance analysis and discovering social networks. On the other hand, event data are often highly sensitive, and when the data contain private information, privacy issues arise. Surprisingly, there has so far been little research on security methods and encryption techniques for process mining. Therefore, in this paper, using abstraction, we propose an approach that allows us to hide confidential information in a controlled manner while ensuring that the desired process mining results can still be obtained. We show how our approach can support confidentiality while discovering control-flow and social networks. A connector method is applied as a technique for storing associations between events securely. We evaluate our approach by applying it to real-life event logs.
The extraction, transformation, and loading of event logs from information systems is the first and the most expensive step in process mining. In particular, extracting event logs from popular ERP systems such as SAP poses major challenges, given the size and the structure of the data. Open-source support for ETL is scarce, while commercial process mining vendors maintain connectors to ERP systems supporting ETL of a limited number of business processes in an ad-hoc manner. In this paper, we propose an approach to facilitate event data extraction from SAP ERP systems. In the proposed approach, we store event data in the format of object-centric event logs that efficiently describe executions of business processes supported by ERP systems. To evaluate the feasibility of the proposed approach, we have developed a tool implementing it and conducted case studies with a real-life SAP ERP system.
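The idea behind an object-centric event log can be shown with a small sketch: each event may refer to several objects of different types, and a classical single-case event log is obtained by flattening on one object type. The class name, field names, object types, and sample events below are hypothetical and chosen only for illustration; they do not reflect the tool described above or any SAP table structure.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class Event:
    """One event that can relate to many objects, grouped by object type."""
    event_id: str
    activity: str
    timestamp: datetime
    objects: Dict[str, List[str]] = field(default_factory=dict)


def flatten(events: List[Event], object_type: str) -> Dict[str, List[str]]:
    """Build one activity trace per object of the chosen type, ordered by time."""
    traces: Dict[str, List[str]] = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        for object_id in event.objects.get(object_type, []):
            traces[object_id].append(event.activity)
    return dict(traces)


# Hypothetical order-handling events (not taken from a real system).
events = [
    Event("e1", "Create Order", datetime(2024, 1, 1, 9, 0),
          {"order": ["o1"], "item": ["i1", "i2"]}),
    Event("e2", "Pick Item", datetime(2024, 1, 1, 10, 0), {"item": ["i1"]}),
    Event("e3", "Pick Item", datetime(2024, 1, 1, 11, 0), {"item": ["i2"]}),
    Event("e4", "Ship Order", datetime(2024, 1, 2, 8, 0),
          {"order": ["o1"], "item": ["i1", "i2"]}),
]

order_traces = flatten(events, "order")
item_traces = flatten(events, "item")
```

Flattening on `order` yields a single trace without the picking steps, while flattening on `item` repeats the order-level events once per item. Storing the events once with all their object references avoids committing to either view at extraction time.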
Privacy and confidentiality are very important prerequisites for applying process mining in a way that complies with regulations and keeps company secrets. This article provides a foundation for future research on privacy-preserving and confidential process mining techniques. The main threats are identified and related to a motivating application scenario in a hospital context as well as to the current body of work on privacy and confidentiality in process mining. A newly developed conceptual model structures the discussion, which shows that existing techniques leave room for improvement. This results in a number of important research challenges that should be addressed by future process mining research.