Abstract: In medical diagnosis, doctors often have to order sets of medical tests in sequence in order to make an accurate diagnosis of patient diseases. While doing so, they must trade off the cost of the tests against possible misdiagnosis. In this paper, we use cost-sensitive learning to model this process. We assume that test examples (new patients) may contain missing values, whose actual values can be acquired at a cost (similar to ordering medical tests) in order to reduce misclassification errors (mi…
“…Another possibility is to also include the other types of cost (e.g., delay cost (Sheng and Ling, 2006) and computational cost (Demir and Alpaydin, 2005)) into the problem formulation.…”
Section: Results
“…Sheng and Ling (2006) and Yang et al (2006) compute the posterior probability of a feature taking a particular value by using the Bayes' rule where likelihoods and priors are estimated by maximum likelihood estimation. Zhang and Ji (2006) compute posteriors using dynamic Bayesian networks.…”
Section: Posterior Estimation
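The maximum-likelihood posterior estimation described in the snippet above can be sketched as follows (a minimal illustration, not the cited authors' implementation; the function name and data layout are assumptions): likelihoods and priors are estimated by frequency counts over the training data, then combined by Bayes' rule.

```python
from collections import Counter

def posterior(train_X, train_y, feature_idx, value):
    """P(class | feature = value) via Bayes' rule, with likelihoods and
    priors estimated by maximum likelihood (frequency counts)."""
    class_counts = Counter(train_y)
    n = len(train_y)
    post = {}
    for c, count in class_counts.items():
        prior = count / n
        # ML estimate of the likelihood P(feature = value | class = c)
        matches = sum(1 for x, y in zip(train_X, train_y)
                      if y == c and x[feature_idx] == value)
        post[c] = prior * (matches / count)
    z = sum(post.values())
    return {c: p / z for c, p in post.items()} if z else post
```

In practice the counts would typically be smoothed (e.g., Laplace smoothing) to avoid zero likelihoods for unseen feature values.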
“…These studies define their splitting criterion as a function of both the information gain of a feature and its extraction cost (Nunez, 1991;Tan, 1993). Alternatively, they use the sum of misclassification and test costs as a splitting criterion (Sheng and Ling, 2006;Yang et al, 2006). These studies use a greedy approach to construct their decision trees.…”
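A cost-sensitive splitting criterion of the kind described above can be sketched as follows (a minimal illustration in the style of Nunez's information cost function; the exact formulas in the cited works differ, and the function names here are assumptions): the score rewards a feature's information gain while penalizing its extraction cost.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(X, y, j):
    """Information gain of splitting on feature j."""
    base, n, remainder = entropy(y), len(y), 0.0
    for v in set(x[j] for x in X):
        subset = [yi for x, yi in zip(X, y) if x[j] == v]
        remainder += len(subset) / n * entropy(subset)
    return base - remainder

def cost_sensitive_score(X, y, j, cost, w=1.0):
    # Nunez-style combination: higher gain raises the score,
    # higher extraction cost lowers it; w in [0, 1] weights the
    # cost penalty.
    g = info_gain(X, y, j)
    return (2 ** g - 1) / ((cost + 1) ** w)
```

A greedy tree builder would evaluate this score for every candidate feature at each node and split on the highest-scoring one, so that a slightly less informative but much cheaper test can be preferred.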
Abstract: This paper reports a new framework for test-cost-sensitive classification. It introduces a new loss function definition in which misclassification cost and feature extraction cost are combined qualitatively, and the loss is conditioned on current and estimated decisions as well as their consistency. This loss function definition is motivated by the following issues. First, for many applications, the relation between different types of costs can be expressed only roughly, usually in terms of ordinal relations rather than as a precise quantitative number. Second, the redundancy between features can be used to decrease cost; it is possible not to consider a new feature if it is consistent with the existing ones. In this paper, we show the feasibility of the proposed framework for medical diagnosis problems. Our experiments demonstrate that this framework significantly decreases feature extraction cost without decreasing accuracy.
“…Certain cost-sensitive learning algorithms consider the waiting time for test results as a type of cost [40]. Sheng and Ling [37] propose a sequential batch test for disease prediction, minimizing attribute acquisition, delay, and misclassification costs. Zhang [51] introduces test time as a waiting cost in cost-time sensitive classification, addressing missing values with sequential and batch test strategies.…”
Section: Related Work
“…Table 2 specifies the misclassification costs. We adopt the settings of the test cost and test time used in Turney [39], Sheng and Ling [37], and Chen et al [5]. The test cost and test time for the variables in the Heart Disease data appear in Table 3.…”
Missing values are common, but handling them with an inappropriate method may lead to large classification errors. Empirical evidence shows that tree-based classification algorithms such as classification and regression trees (CART) can benefit from imputation, especially multiple imputation. Nevertheless, less attention has been paid to incorporating multiple imputation into cost-sensitive decision tree induction. This study focuses on the treatment of missing data based on a time-constrained minimal-cost tree algorithm. We introduce various approaches for handling incomplete data within the algorithm, including complete-case analysis, missing-value branches, single imputation, feature acquisition, and multiple imputation. A simulation study under different scenarios examines the predictive performance of the proposed strategies. The simulation results show that combining the algorithm with multiple imputation maintains classification accuracy within the budget. A real medical data example provides insight into the problem of missing values in cost-sensitive learning and the advantages of the proposed methods.
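The multiple-imputation idea in the abstract above can be sketched at prediction time as follows (a minimal sketch, not the paper's algorithm; `classify` stands in for an already-trained cost-sensitive tree, and the function name and data layout are assumptions): each missing field is filled with a random draw from that feature's observed training values, the completed example is classified, and the m completions are aggregated by majority vote.

```python
import random
from collections import Counter

def multiple_imputation_predict(classify, example, observed_values, m=5, seed=0):
    """Predict a label for an example with missing fields (None) by
    multiple imputation: draw each missing value from its observed
    training distribution, classify each completion, majority-vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(m):
        completed = [x if x is not None else rng.choice(observed_values[j])
                     for j, x in enumerate(example)]
        votes[classify(completed)] += 1
    return votes.most_common(1)[0][0]
```

In a full implementation the imputations would typically come from a model of the joint feature distribution (as in standard multiple imputation) rather than from independent marginal draws, but the vote-aggregation structure is the same.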