Credit card fraud is a criminal offense. It causes severe damage to financial institutions and individuals. Therefore, the detection and prevention of fraudulent activities are critically important to financial institutions. Fraud detection and prevention are costly, time-consuming, and labor-intensive tasks. A number of significant research works have been dedicated to developing innovative solutions to detect different types of fraud. However, these solutions have been proved ineffective. According to Cifa, 33 305 cases of credit card identity fraud were reported between January and June in 2018. 1 Various weaknesses of existing solutions have been reported in the literature. Among them all, the imbalance classification is the most critical and well-known problem. Imbalance classification consists of having a small number of observations of the minority class compared with the majority in the data set. In this problem, the ratio fraud: legitimate is very small, which makes it extremely difficult for the classification algorithm to detect fraud cases. In this paper, we will conduct a rigorous experimental study with the solutions that tackle the imbalance classification problem. We explored these solutions along with the machine learning algorithms used for fraud detection. We identified their weaknesses and summarized the results that we obtained using a credit card fraud labeled dataset. According to this paper, imbalanced classification approaches are ineffective, especially when the data are highly imbalanced. This paper reveals that the existing approaches result in a large number of false alarms, which are costly to financial institutions. This may lead to inaccurate detection as well as increasing the occurrence of fraud cases.INDEX TERMS Fraud analysis and detection, fraud cybercrimes, imbalanced classification, secure society.
In recent years, the radical advancement of technologies has given rise to an abundance of software applications, social media, and smart devices such as smartphone, sensors, and so on. More extensive use of these applications and tools in various industrial domains has led to data deluge, which has fostered enormous challenges and opportunities. However, it is not only the volume of the data but also the speed, variety, and uncertainty, which are promoting a massive challenge for traditional technologies such as data warehouse. These diverse and unprecedented characteristics have engendered the notion of ''Big Data.'' The data-intensive industries have been experiencing a wide variety of challenges in terms of processing, managing, and analysis of data. For instance, the healthcare sector is confronting difficulties in respect of integration or fusion of diverse medical data stemming from multiple heterogeneous sources. Data integration is critically important within the healthcare sector because it enriches data, enhances its value, and more importantly paves a solid foundation for highly efficient and effective healthcare analytics such as predicting diseases or an outbreak. Several data integration technologies and tools have been developed over the last two decades. This paper aims at studying data integration technologies, tools, and applications within the healthcare domain. Furthermore, this paper discusses future research directions in the integration of Big healthcare data. INDEX TERMS Big data, data integration, healthcare data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.