2016
DOI: 10.1016/j.knosys.2016.05.056
Instance selection of linear complexity for big data

Cited by 65 publications (17 citation statements). References 44 publications.
“…Ideally, data reduction would lead to a dataset of smaller size and dimension that can be handled more efficiently. Various types of data reduction methods have been developed, which focus on feature extraction, dimensionality reduction, instance selection, noise removal, and outlier detection to reduce, refine and clean spatial big data [20][21][22][23][24]. With the aim of selecting the most important flow instances pertaining to the purpose of analysis, the head/tail break falls among the instance-based approaches.…”
Section: Head/Tail Break as a Method of Data Reduction
confidence: 99%
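The head/tail break mentioned in the statement above iteratively splits a heavy-tailed distribution at its mean and recurses into the "head" (values above the mean) while the head remains a minority. A minimal sketch, assuming a conventional 40% head-fraction cutoff; the function name and parameters (`head_ratio`, `max_iters`) are illustrative, not from the cited work:

```python
def head_tail_breaks(values, head_ratio=0.4, max_iters=10):
    """Partition heavy-tailed data into hierarchy levels by
    repeatedly splitting at the mean and recursing into the head
    (values above the mean) while the head stays a minority."""
    breaks = []
    data = list(values)
    for _ in range(max_iters):
        if len(data) < 2:
            break
        mean = sum(data) / len(data)
        head = [v for v in data if v > mean]
        # Stop when the head is empty or no longer a clear minority.
        if not head or len(head) / len(data) > head_ratio:
            break
        breaks.append(mean)
        data = head
    return breaks

# Heavy-tailed toy data: the first break lands at the mean, 5.0,
# isolating the few large "flow" instances from the many small ones.
print(head_tail_breaks([1, 1, 1, 1, 1, 1, 2, 2, 8, 32]))
```

Used as an instance-selection device, the instances above the final break (the most important flows) would be retained while the long tail is discarded.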
“…While removing valid observations is considered a sin by many analysts working with discrete choice models, in fields like machine learning, down-scaling of data sets is more common practice (e.g. Arnaiz-González et al. 2016; Loyola et al. 2016). As we will argue in this paper, using a carefully sampled subset of choice observations can give nearly identical estimation results as compared to using the complete dataset.…”
Section: Introduction
confidence: 93%
“…We are planning to extend the use of kNN-IS to instance selection techniques for big data [45], where it reports good results. Another direction for future work is to extend the application of the presented kNN-IS approach to a big data semi-supervised learning [46] context.…”
Section: Conclusion and Further Work
confidence: 99%
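The statement above pairs kNN-based classification with instance selection. As an illustration of what instance selection does, here is a sketch of Hart's classic Condensed Nearest Neighbour rule, which keeps an instance only if the current subset misclassifies it with 1-NN. This is a well-known baseline, not the linear-complexity method of the indexed paper; the function names are ours:

```python
import math

def condensed_nn(points, labels):
    """Instance selection via Condensed Nearest Neighbour: grow a
    subset of indices that correctly 1-NN-classifies every training
    instance, and return that reduced training set."""
    def nn_label(subset, p):
        # Label of the nearest kept instance to point p.
        best = min(subset, key=lambda i: math.dist(points[i], p))
        return labels[best]

    keep = [0]  # seed the subset with the first instance
    changed = True
    while changed:
        changed = False
        for i in range(len(points)):
            if i not in keep and nn_label(keep, points[i]) != labels[i]:
                keep.append(i)  # misclassified: must be kept
                changed = True
    return sorted(keep)

# Two well-separated classes condense to one prototype each.
pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(condensed_nn(pts, [0, 0, 1, 1]))
```

The quadratic cost of rules like this is exactly what motivates linear-complexity alternatives for big data, such as the hashing-based approach of the indexed paper.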