Abstract: The special issue on "Machine Learning for Science and Society" showcases machine learning work with influence on our current and future society. These papers address several key problems such as how we perform repairs on critical infrastructure, how we predict severe weather and aviation turbulence, how we conduct tax audits, whether we can detect privacy breaches in access to healthcare data, and how we link individuals across census data sets for new insights into population changes. In this introduction, w…
“…It is a highly interdisciplinary field building upon ideas from many different kinds of fields such as artificial intelligence, optimization theory, information theory, statistics, cognitive science, optimal control, and many other disciplines of science, engineering, and mathematics [15][16][17][18]. Because of its implementation in a wide range of applications, machine learning has covered almost every scientific domain, which has brought great impact on the science and society [19]. It has been used on a variety of problems, including recommendation engines, recognition systems, informatics and data mining, and autonomous control systems [20].…”
Section: Definition and Classification Of Machine Learning
There is no doubt that big data are now rapidly expanding in all science and engineering domains. While the potential of these massive data is undoubtedly significant, fully making sense of them requires new ways of thinking and novel learning techniques to address the various challenges. In this paper, we present a literature survey of the latest advances in research on machine learning for big data processing. First, we review the machine learning techniques and highlight some promising learning methods in recent studies, such as representation learning, deep learning, distributed and parallel learning, transfer learning, active learning, and kernel-based learning. Next, we focus on the analysis and discussion of the challenges and possible solutions of machine learning for big data. Following that, we investigate the close connections of machine learning with signal processing techniques for big data processing. Finally, we outline several open issues and research trends.
“…To test those methods, churn analysis and prediction experiments were used. The ubiquitous data mining methodology CRISP-DM was adopted to investigate customer churn in the telecommunications sector (Chapman et al, 2000; Rudin & Wagstaff, 2014). The CRISP-DM methodology gives comprehensive instructions and procedures for applying data mining algorithms to solve real-world problems.…”
Section: Modelling Experiments and Results
The sharp increase in the number of companies competing in mature markets makes customer retention an important factor in any company's survival. Thus, many methodologies (e.g., data mining and statistics) have been proposed to analyse and study customer retention. The validity of such methods has not yet been proven, though. This paper tries to fill this gap by empirically comparing two customer churn techniques: decision tree and logistic regression models. The paper proves the superiority of the decision tree technique and stresses the need for more advanced methods for churn modelling.
“…Work on knowledge discovery systems in different domains have highlighted some of the important challenges that we also face in this work [see for instance Fayyad et al, 1996, Frawley et al, 1992, Hand, 1994, Langley and Simon, 1995, Provost and Kohavi, 1998, Brodley and Smyth, 1997, Saitta and Neri, 1998, Rudin and Wagstaff, 2013.…”
We quantify the effects of learning and decision making on each other in three parts. In the first part, we look at how knowledge about decision making can influence learning. Let the decision cost be the amount spent by the practitioner in executing a policy. If we have prior knowledge about this cost, for instance that it should be low, then this knowledge can help restrict the hypothesis space for learning, which can help with its generalization. We derive a suite of theoretical generalization bounds and an algorithm for this setting. In the second part, we look at how knowledge about learning can influence decision making. We study this in the context of robust optimization. Taking the uncertainty of learning the right model into account, we derive multiple probabilistic guarantees on the robustness of the resulting policy. In the last part, we explore the interactions between learning and decision making in depth for two applications. The first application is in the area of power grid maintenance and the second is in the area of professional racing. We provide tailored solutions for modeling, predicting and making decisions in each context.