Training Data Selection for Support Vector Machines
2005
DOI: 10.1007/11539087_71

Cited by 68 publications (47 citation statements, published 2009–2024); references 8 publications.
“…We present the results for the RBF kernel in Table III; for RSSVM on both data sets, the samples are again 0.5%, 1%, and 5% of the data, respectively. The outcome is similar to that of our second experiment, except that RSSVM obtains lower precision on the NWI data set, which indicates that RSSVM is not as reliable as [3] suggests and that its performance depends on the distribution of the data set. A similar problem exists for CBSVM, since it has no tendency to keep important data near the boundary uncompressed while building the CF trees.…”
Section: Experiments and Results (supporting)
confidence: 74%
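For context, a minimal sketch of the random-subsample baseline (RSSVM-style) that the excerpt benchmarks, assuming scikit-learn; the synthetic data, seed, and evaluation are illustrative stand-ins, not the NWI experiments:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import precision_score

# Illustrative stand-in data; the cited experiments use real data sets (e.g. NWI).
X, y = make_classification(n_samples=20000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

rng = np.random.default_rng(0)
for rate in (0.005, 0.01, 0.05):  # 0.5%, 1%, and 5% samples, as in the excerpt
    idx = rng.choice(len(X_tr), size=max(2, int(rate * len(X_tr))), replace=False)
    clf = SVC(kernel="rbf").fit(X_tr[idx], y_tr[idx])  # RBF kernel, as in Table III
    print(f"{rate:.1%}: precision={precision_score(y_te, clf.predict(X_te)):.3f}")
```

The instability the excerpt reports follows from this design: whatever the uniform draw happens to miss, including points near the decision boundary, is simply gone from the training set.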
“…It works by sampling a small proportion of the data to approximately reflect the distribution of the entire data set. Experiments have shown that this scheme is efficient and often works well, but it sometimes performs poorly because important information may be lost during sampling [3], [4]. Active learning [9] was developed to reduce costly labeling work by selecting “important” data instances from the data set and requiring the user to label only those instances.…”
Section: Related Work (mentioning)
confidence: 99%
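A hedged sketch of the pool-based active-learning loop the excerpt alludes to, using margin-based uncertainty sampling with an SVM; the seed size and query budget are illustrative assumptions, not values from the cited work:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=20, random_state=1)
labeled = list(range(20))                 # small seed of labeled instances
pool = [i for i in range(len(X)) if i not in labeled]

for _ in range(10):                       # query budget (illustrative)
    clf = SVC(kernel="rbf").fit(X[labeled], y[labeled])
    # "Important" instances: those closest to the decision boundary.
    margins = np.abs(clf.decision_function(X[pool]))
    query = pool.pop(int(np.argmin(margins)))
    labeled.append(query)                 # the user would label X[query] here
```

Margin-based querying is one common notion of "important" in this setting; the active-learning work cited as [9] may define importance differently.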
“…A simple yet effective neighborhood analysis of each training vector was proposed by Wang et al. (2005). For each training vector, the largest sphere that contains only vectors of the same class is determined, and the number of vectors encompassed by this sphere is counted (N_a for each vector a).…”
Section: Neighborhood Analysis Methods (mentioning)
confidence: 99%
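A minimal sketch of that neighborhood analysis under one plausible reading: the radius of the largest single-class sphere around a vector is its distance to the nearest opposite-class vector, and N_a counts the same-class vectors strictly inside it. The helper name and brute-force pairwise distances are ours, not from the cited paper:

```python
import numpy as np

def neighborhood_counts(X, y):
    """For each vector a, count same-class vectors inside the largest
    sphere around a that contains no opposite-class vector (N_a)."""
    # Brute-force pairwise distances; fine for a sketch, O(n^2) memory.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    counts = np.empty(len(X), dtype=int)
    for a in range(len(X)):
        radius = D[a][y != y[a]].min()   # nearest opposite-class vector
        same = (y == y[a])
        same[a] = False                  # exclude the point itself
        counts[a] = int(np.sum(D[a][same] < radius))
    return counts
```

Under this reading, a large N_a marks a vector deep inside a homogeneous region, while a small N_a marks one close to the class boundary.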
“…to select the most important samples) plays an important role. A great deal of work on sample selection has been done, based, for example, on clustering methods [4,5], the Mahalanobis distance [6], the β-skeleton and the Hausdorff distance [7,8], and information theory [9,10]. Although much research progress has been made, problems still remain.…”
Section: Introduction (mentioning)
confidence: 99%
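As one concrete instance of the distance-based selection this excerpt lists, a hedged sketch that ranks the samples of each class by Mahalanobis distance from the class mean and keeps the outermost ones (those nearest the class boundary); the function name, keep-fraction, and selection rule are illustrative assumptions, not the method of [6]:

```python
import numpy as np

def select_by_mahalanobis(X, y, keep=0.2):
    """Keep, per class, the fraction of samples farthest (in Mahalanobis
    distance) from the class mean -- a crude boundary-oriented selection."""
    kept = []
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        cov_inv = np.linalg.pinv(np.cov(Xc, rowvar=False))  # pinv for stability
        diff = Xc - mu
        # Squared Mahalanobis distance per row: diff_i^T * cov_inv * diff_i
        d = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))
        idx = np.where(y == c)[0]
        kept.extend(idx[np.argsort(d)[-max(1, int(keep * len(Xc))):]])
    return np.sort(np.array(kept))
```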