Abstract. The Support Vector Machine algorithm is sensitive to the choice of parameter settings. If these are not set correctly, the algorithm may perform poorly. Suggesting a good setting is thus an important problem. We propose a meta-learning methodology for this purpose that exploits information about the past performance of different settings. The methodology is applied to set the width of the Gaussian kernel. We carry out an extensive empirical evaluation, including comparisons with other methods (a fixed default ranking, selection based on cross-validation, and a heuristic method commonly used to set the width of the SVM kernel). We show that our methodology can select settings with low error while providing significant savings in time. Further work should be carried out to see how the methodology could be adapted to different parameter setting tasks.
Abstract. Recent advances in meta-learning are providing the foundations to construct meta-learning assistants and task-adaptive learners. The goal of this special issue is to foster an interest in meta-learning by compiling representative work in the field. The contributions to this special issue provide strong insights into the construction of future meta-learning tools. In this introduction we present a common frame of reference to address work in meta-learning through the concept of meta-knowledge. We show how meta-learning can be simply defined as the process of exploiting knowledge about learning that enables us to understand and improve the performance of learning algorithms.
This paper is concerned with a comparative study of different machine learning, statistical and neural algorithms and an automatic analysis of test results. It is shown that machine learning methods themselves can be used in organizing this knowledge. Various datasets can be characterized using different statistical and information theoretic measures. These, together with the test results, can be used by a ML system to generate a set of rules which could also be altered or edited by the user. The system can be applied to a new dataset to provide the user with a set of recommendations concerning the suitability of different algorithms, and these are graded by an appropriate information score. The experiments with the implemented system indicate that the method is viable and useful.
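As a rough illustration of what characterizing a dataset with statistical and information theoretic measures could look like, the sketch below computes a few simple descriptors. The function names and the particular measures chosen (example count, attribute count, class count, class entropy) are assumptions made for this example; the abstract does not say which measures the system actually uses.

```python
import math
from collections import Counter


def class_entropy(labels):
    """Shannon entropy (in bits) of the class label distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def characterize(rows, labels):
    """Return a small dictionary of dataset descriptors (hypothetical selection)."""
    return {
        "n_examples": len(rows),
        "n_attributes": len(rows[0]) if rows else 0,
        "n_classes": len(set(labels)),
        "class_entropy": class_entropy(labels),
    }
```

Descriptors of this kind, paired with the measured performance of each algorithm, would form the training examples from which a rule learner could derive recommendations.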
Background: The study of length of stay (LOS) outliers is important for the management and financing of hospitals. Our aim was to study variables associated with high LOS outliers and their evolution over time. Methods: We used hospital administrative data from inpatient episodes in public acute care hospitals in the Portuguese National Health Service (NHS), with discharges between years 2000 and 2009, together with some hospital characteristics. The dependent variable, LOS outliers, was calculated for each diagnosis related group (DRG) using a trim point defined for each year by the geometric mean plus two standard deviations. Hospitals were classified on the basis of administrative, economic and teaching characteristics. We also studied the influence of comorbidities and readmissions. Logistic regression models, including a multivariable logistic regression, were used in the analysis. All the logistic regressions were fitted using generalized estimating equations (GEE). Results: In nearly nine million inpatient episodes analysed, we found a proportion of 3.9% high LOS outliers, accounting for 19.2% of total inpatient days. The number of hospital patient discharges increased between years 2000 and 2005 and slightly decreased after that. The proportion of outliers ranged between the lowest value of 3.6% (in years 2001 and 2002) and the highest value of 4.3% in 2009. Teaching hospitals with over 1,000 beds have significantly more outliers than other hospitals, even after adjustment for readmissions and several patient characteristics. Conclusions: In recent years, both average LOS and high LOS outliers have been increasing in Portuguese NHS hospitals. As high LOS outliers represent an important proportion of the total inpatient days, this should be seen as an important alert for the management of hospitals and for national health policies. As expected, age, type of admission, and hospital type were significantly associated with high LOS outliers.
The proportion of high outliers does not seem to be related to their financial coverage; they should be studied in order to highlight areas for further investigation. The increasing complexity of both hospitals and patients may be the single most important determinant of high LOS outliers and must therefore be taken into account by health managers when considering hospital costs.
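The trim point described in the Methods (geometric mean plus two standard deviations, computed per DRG and year) can be sketched as follows. This is a minimal illustration under a stated assumption: the standard deviation is taken as the sample standard deviation of the raw LOS values, which is the most literal reading of the abstract. The study may have used a different convention, for example one computed on the log scale. Function names are hypothetical.

```python
import math
from statistics import stdev


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def trim_point(los_days):
    """Assumed definition: geometric mean + 2 * sample standard deviation."""
    return geometric_mean(los_days) + 2 * stdev(los_days)


def flag_high_outliers(los_days):
    """Mark each episode whose LOS exceeds the trim point for its group."""
    cutoff = trim_point(los_days)
    return [days > cutoff for days in los_days]
```

In practice the function would be applied separately to the episodes of each DRG within each year, and LOS values must be positive for the logarithm to be defined.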
We investigate the problem of using past performance information to select an algorithm for a given classification problem. We present three ranking methods for that purpose: average ranks, success rate ratios and significant wins. We also analyze the problem of evaluating and comparing these methods. The evaluation technique used is based on a leave-one-out procedure. On each iteration, the method generates a ranking using the results obtained by the algorithms on the training datasets. This ranking is then evaluated by calculating its distance from the ideal ranking built using the performance information on the test dataset. The distance measure adopted here, average correlation, is based on Spearman's rank correlation coefficient. To compare ranking methods, a combination of Friedman's test and Dunn's multiple comparison procedure is adopted. When applied to the methods presented here, these tests indicate that the success rate ratios and average ranks methods perform better than significant wins.
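A minimal sketch of the average ranks method and the leave-one-out evaluation described above is given below. It assumes performance is recorded as an error rate per algorithm per dataset (lower is better), and it uses the simple no-ties form of Spearman's coefficient, which is only approximate when ranks are tied. All function names and the data layout are assumptions made for illustration, not the authors' implementation.

```python
def ranks_from_errors(errors):
    """Rank algorithms by error (1 = best); tied values share the average rank."""
    ordered = sorted(errors, key=errors.get)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and errors[ordered[j + 1]] == errors[ordered[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for name in ordered[i:j + 1]:
            ranks[name] = shared
        i = j + 1
    return ranks


def average_ranks(datasets):
    """Average each algorithm's rank over a list of per-dataset error dicts."""
    per_dataset = [ranks_from_errors(d) for d in datasets]
    names = per_dataset[0].keys()
    return {n: sum(r[n] for r in per_dataset) / len(per_dataset) for n in names}


def spearman(rank_a, rank_b):
    """Spearman's coefficient via 1 - 6 * sum(d^2) / (n^3 - n)."""
    n = len(rank_a)
    d2 = sum((rank_a[k] - rank_b[k]) ** 2 for k in rank_a)
    return 1 - 6 * d2 / (n ** 3 - n)


def leave_one_out(datasets):
    """Mean correlation between recommended and ideal rankings."""
    scores = []
    for i, test in enumerate(datasets):
        train = datasets[:i] + datasets[i + 1:]
        # Lower average rank is better, so ranking the averages gives the recommendation.
        recommended = ranks_from_errors(average_ranks(train))
        ideal = ranks_from_errors(test)
        scores.append(spearman(recommended, ideal))
    return sum(scores) / len(scores)
```

The other two methods (success rate ratios and significant wins) would replace `average_ranks` while the evaluation loop stays the same.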