Stefan Rüping scite author profile

227

Support Vector Machines (SVMs) have become a popular tool for learning with large amounts of high-dimensional data. However, it may sometimes be preferable to learn incrementally from previous SVM results, either because computing an SVM is very costly in terms of time and memory consumption or because the SVM may be used in an online learning setting. In this paper, an approach for incremental learning with Support Vector Machines is presented that improves on existing approaches. Empirical evidence is given to show that this approach can effectively deal with changes in the target concept that result from the incremental learning setting.

Int J Adv Manuf Technol

A review of machine learning for the optimization of production processes

Weichert

Link

Stoll

et al. 2019

201

Tight Optimistic Estimates for Fast Subgroup Discovery

Großkreutz

Wrobel

Abstract. Subgroup discovery is the task of finding subgroups of a population which exhibit both distributional unusualness and high generality. Due to the non-monotonicity of the corresponding evaluation functions, standard pruning techniques cannot be used for subgroup discovery, requiring the use of optimistic estimate techniques instead. So far, however, optimistic estimate pruning has only been considered for the extremely simple case of a binary target attribute, and no attempt has been made to move beyond suboptimal heuristic optimistic estimates. In this paper, we show that optimistic estimate pruning can be developed into a sound and highly effective pruning approach for subgroup discovery. Based on a precise definition of optimality, we show that previous estimates have been tight only in special cases. Thereafter, we present tight optimistic estimates for the most popular binary and multi-class quality functions, and present a family of increasingly efficient approximations to these optimal functions. As we show in empirical experiments, the use of our newly proposed optimistic estimates can lead to a speed-up of an order of magnitude compared to previous approaches.
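The pruning idea in this abstract can be illustrated with a minimal Python sketch for the binary-target case. It is not the authors' implementation: the quality function (weighted relative accuracy), the toy data, and the names `wracc`, `optimistic_estimate` and `search` are all assumptions chosen for illustration. The bound used here comes from picturing the best conceivable refinement of a subgroup, one that keeps all of its positive examples and drops all of its negatives. No refinement can score higher, so a branch whose bound does not exceed the best quality found so far can be skipped.

```python
def wracc(n, tp, N, P):
    """Weighted relative accuracy of a subgroup with n rows, tp of them positive.
    N is the population size and P the number of positives in the population."""
    if n == 0:
        return 0.0
    return (n / N) * (tp / n - P / N)


def optimistic_estimate(tp, N, P):
    """Upper bound on the WRAcc of any refinement of a subgroup with tp positives:
    the value reached by a hypothetical refinement holding exactly those positives."""
    return (tp / N) * (1 - P / N)


def search(rows, labels, selectors, use_pruning=True):
    """Depth-first search over conjunctions of (attribute, value) selectors."""
    N, P = len(rows), sum(labels)
    best = {"quality": 0.0, "description": (), "visited": 0}

    def covered(description):
        return [i for i, row in enumerate(rows)
                if all(row[a] == v for a, v in description)]

    def expand(description, start):
        for k in range(start, len(selectors)):
            candidate = description + (selectors[k],)
            idx = covered(candidate)
            tp = sum(labels[i] for i in idx)
            best["visited"] += 1
            quality = wracc(len(idx), tp, N, P)
            if quality > best["quality"]:
                best["quality"], best["description"] = quality, candidate
            # Skip the whole branch if no refinement can beat the current best.
            if use_pruning and optimistic_estimate(tp, N, P) <= best["quality"]:
                continue
            expand(candidate, k + 1)

    expand((), 0)
    return best


# Hypothetical toy data: the label is 1 for the "unusual" target class.
ROWS = [
    {"smoker": "yes", "age": "old", "sport": "no"},
    {"smoker": "yes", "age": "old", "sport": "yes"},
    {"smoker": "yes", "age": "old", "sport": "no"},
    {"smoker": "yes", "age": "young", "sport": "no"},
    {"smoker": "no", "age": "old", "sport": "yes"},
    {"smoker": "no", "age": "young", "sport": "yes"},
    {"smoker": "no", "age": "young", "sport": "no"},
    {"smoker": "no", "age": "old", "sport": "no"},
]
LABELS = [1, 1, 1, 0, 0, 0, 0, 1]
SELECTORS = sorted({(a, v) for row in ROWS for a, v in row.items()})

pruned = search(ROWS, LABELS, SELECTORS, use_pruning=True)
full = search(ROWS, LABELS, SELECTORS, use_pruning=False)
print(pruned["description"], pruned["quality"])
print("candidates visited:", pruned["visited"], "with pruning,", full["visited"], "without")
```

Both runs return a subgroup of the same quality; the pruned run simply evaluates no more candidates than the exhaustive one. How much is saved depends on the data and on how tight the bound is, which is the question the abstract addresses.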

Learning with Local Models

2005

E2mC: Improving Emergency Management Service Practice through Social Media and Crowdsourcing Analysis in Near Real Time

Havas

Resch

Francalanci

et al. 2017

Sensors

In the first hours of a disaster, up-to-date information about the area of interest is crucial for effective disaster management. However, due to the delay induced by collecting and analysing satellite imagery, disaster management systems like the Copernicus Emergency Management Service (EMS) are currently not able to provide information products until up to 48–72 h after a disaster event has occurred. While satellite imagery is still a valuable source for disaster management, information products can be improved by complementing them with user-generated data like social media posts or crowdsourced data. The advantage of these new kinds of data is that they are continuously produced in a timely fashion because users actively participate throughout an event and share related information. The research project Evolution of Emergency Copernicus services (E2mC) aims to integrate these novel data into a new EMS service component called Witness, which is presented in this paper. In this way, the timeliness and accuracy of geospatial information products provided to civil protection authorities can be improved by leveraging user-generated data. This paper sketches the developed system architecture, describes applicable scenarios and presents several preliminary case studies, providing evidence that the scientific and operational goals have been achieved.

On subgroup discovery in numerical domains

Großkreutz

2009

Data Min Knowl Disc

Subgroup discovery is a Knowledge Discovery task that aims at finding subgroups of a population with high generality and distributional unusualness. While several subgroup discovery algorithms have been presented in the past, they focus on databases with nominal attributes or make use of discretization to get rid of the numerical attributes. In this paper, we illustrate why the replacement of numerical attributes by nominal attributes can result in suboptimal results. Thereafter, we present a new subgroup discovery algorithm that prunes large parts of the search space by exploiting bounds between related numerical subgroup descriptions. The same algorithm can also be applied to ordinal attributes. In an experimental section, we show that the use of our new pruning scheme results in a huge performance gain when more than just a few split-points are considered for the numerical attributes. This is an extended abstract of an article published in the Data Mining and Knowledge Discovery journal [1].
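A toy example can show how discretization may lose the best subgroup. The Python sketch below uses made-up data and assumed names (`best_interval`, `best_equal_width_bin`) and scores subgroups with weighted relative accuracy, which is one common choice rather than necessarily the one used in the article. It compares the best interval over all observed split points with the best bin of a fixed equal-width discretization. The interval search is deliberately brute force; it does not reproduce the pruning algorithm the abstract describes.

```python
def wracc(n, tp, N, P):
    """Weighted relative accuracy of a subgroup with n rows, tp of them positive."""
    return 0.0 if n == 0 else (n / N) * (tp / n - P / N)


def best_interval(values, labels):
    """Try every closed interval [lo, hi] whose ends are observed values."""
    N, P = len(values), sum(labels)
    points = sorted(set(values))
    best_quality, best_bounds = 0.0, None
    for i, lo in enumerate(points):
        for hi in points[i:]:
            inside = [y for x, y in zip(values, labels) if lo <= x <= hi]
            quality = wracc(len(inside), sum(inside), N, P)
            if quality > best_quality:
                best_quality, best_bounds = quality, (lo, hi)
    return best_quality, best_bounds


def best_equal_width_bin(values, labels, bins):
    """Replace the numeric attribute by equal-width bins and score each bin."""
    N, P = len(values), sum(labels)
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [[0, 0] for _ in range(bins)]  # [rows, positives] per bin
    for x, y in zip(values, labels):
        b = min(int((x - lo) / width), bins - 1)  # the maximum goes into the last bin
        counts[b][0] += 1
        counts[b][1] += y
    return max(wracc(n, tp, N, P) for n, tp in counts)


# Hypothetical data: the positives sit in the middle of the value range.
VALUES = [1, 2, 3, 4, 5, 6, 7, 8]
LABELS = [0, 0, 1, 1, 1, 1, 0, 0]

interval_quality, interval_bounds = best_interval(VALUES, LABELS)
binned_quality = best_equal_width_bin(VALUES, LABELS, bins=2)
print("best interval:", interval_bounds, "quality:", interval_quality)
print("best of 2 equal-width bins, quality:", binned_quality)
```

On this data the two bins each mix positives and negatives evenly, so neither stands out, while the interval covering the middle values isolates the positives. More split points narrow the gap, at the cost of a larger search space, which is where pruning becomes relevant.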

On Subgroup Discovery in Numerical Domains

Großkreutz

2009

Abstract. Subgroup discovery is a Knowledge Discovery task that aims at finding subgroups of a population with high generality and distributional unusualness. While several subgroup discovery algorithms have been presented in the past, they focus on databases with nominal attributes or make use of discretization to get rid of the numerical attributes. In this paper, we illustrate why the replacement of numerical attributes by nominal attributes can result in suboptimal results. Thereafter, we present a new subgroup discovery algorithm that prunes large parts of the search space by exploiting bounds between related numerical subgroup descriptions. The same algorithm can also be applied to ordinal attributes. In an experimental section, we show that the use of our new pruning scheme results in a huge performance gain when more than just a few split-points are considered for the numerical attributes.

Advanced Sensing and Human Activity Recognition in Early Intervention and Rehabilitation of Elderly People

et al. 2020

Ageing is associated with a decline in physical activity and a decrease in the ability to perform activities of daily living, affecting physical and mental health. Elderly people or patients could be supported by a human activity recognition (HAR) system that monitors their activity patterns and intervenes when their behavior changes or a critical event occurs. A HAR system could enable these people to have a more independent life. In our approach, we apply machine learning methods from the field of HAR to detect human activities. These algorithmic methods need a large database with structured datasets that contain human activities. Compared to existing data recording procedures for creating HAR datasets, we present a novel approach, since our target group comprises elderly and diseased people, who do not possess the same physical condition as young and healthy persons. Since our targeted HAR system aims at supporting elderly and diseased people, we focus on daily activities, especially those to which clinical relevance is attributed, like hygiene activities, nutritional activities or lying positions. Therefore, we propose a methodology for capturing data with elderly and diseased people within a hospital under realistic conditions using wearable and ambient sensors. We describe how this approach is first tested with healthy people in a laboratory environment and then transferred to elderly people and patients in a hospital environment. We also describe the implementation of an activity recognition chain (ARC) that is commonly used to analyse human activity data by means of machine learning methods and aims to detect activity patterns. Finally, the results obtained so far are presented and discussed, as well as remaining problems that should be addressed in future research.
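The stages of an activity recognition chain can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system described in the abstract: the signals are invented one-dimensional acceleration magnitudes, the window size, the two features (mean and standard deviation) and the nearest-centroid classifier are arbitrary simple choices, and all function names are hypothetical.

```python
from statistics import mean, pstdev


def sliding_windows(signal, size, step):
    """Segmentation stage: cut the sensor stream into fixed-size windows."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


def features(window):
    """Feature extraction stage: summarise a window by its mean and spread."""
    return (mean(window), pstdev(window))


def train_centroids(labelled_signals, size, step):
    """Training stage: average the feature vectors of each activity."""
    centroids = {}
    for activity, signal in labelled_signals.items():
        feats = [features(w) for w in sliding_windows(signal, size, step)]
        centroids[activity] = (mean(f[0] for f in feats), mean(f[1] for f in feats))
    return centroids


def classify(window, centroids):
    """Classification stage: pick the activity with the nearest centroid."""
    f = features(window)
    return min(
        centroids,
        key=lambda a: (f[0] - centroids[a][0]) ** 2 + (f[1] - centroids[a][1]) ** 2,
    )


# Invented training signals (acceleration magnitude in g), one per activity.
TRAINING = {
    "lying": [1.0, 1.01, 0.99, 1.0, 1.02, 0.98, 1.0, 1.01],
    "walking": [0.6, 1.4, 0.7, 1.3, 0.5, 1.5, 0.6, 1.4],
}
CENTROIDS = train_centroids(TRAINING, size=4, step=2)

print(classify([1.0, 1.02, 0.98, 1.0], CENTROIDS))
print(classify([0.55, 1.45, 0.65, 1.35], CENTROIDS))
```

A real chain would add preprocessing such as filtering and resampling, use several sensor channels, and be evaluated on recordings from the target group, which is the data collection problem the abstract focuses on.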