Performance evaluation of clustering algorithms for varying cardinality and dimensionality of data sets

Renjith, Shini; Sreekumar, A.; Jathavedan, M.

doi:10.1016/j.matpr.2020.01.110

Cited by 18 publications

(13 citation statements)

References 26 publications

“…We used the density-based spatial clustering of applications with noise (DBSCAN) algorithm to group high-density closely related data points (or geo-locations), forming spatial clusters of data points that represented significant events (such as stopping and movement) during normal minibus taxi operations. We chose the DBSCAN algorithm because of its robustness to outlier detection, its ability to discover clusters with uneven densities and arbitrary shapes, and the fact that it does not need prior knowledge of the number of clusters (Liu et al, 2012; Renjith et al, 2020). For cluster analysis we used a Python implementation of the DBSCAN algorithm from the Scikit-Learn package (Pedregosa et al, 2011).…”

Section: Spatial Clustering and Analysis (mentioning)

confidence: 99%

Ray of hope for sub-Saharan Africa's paratransit: solar charging of urban electric minibus taxis in South Africa

Abraham, Rix, Ndibatya et al. 2021

Preprint

Minibus taxi public transport is a seemingly chaotic phenomenon in the developing cities of the Global South with unique mobility and operational characteristics. Eventually this ubiquitous fleet of minibus taxis will have to transition to electric vehicles. This paper examines the impact of this inevitable evolution. We present a generic simulation environment to assess the grid impact and charging opportunities, given the unique paratransit mobility patterns. We used floating car data to assess the energy requirements of electric minibus taxis, which will have a knock-on effect on Africa's already fragile electrical grids. We used spatio-temporal and solar photovoltaic analyses to assess the informal and formal stops that would be needed for the taxis to recharge from solar PV in the region's abundant sunshine. The results showed energy demand from a median of 215 kWh/day to a maximum of 490 kWh/day, with a median charging potential (stationary time) across taxis of 7.7 h/day to 10.6 h/day. The potential for charging from solar PV was 0.38 kWh/m² to 0.90 kWh/m². Our simulator and results will allow traffic planners and grid operators to assess and plan for looming electric vehicle roll-outs, and could lead to a new funding model for transport in Africa.
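
The citation statement for this preprint describes grouping geo-locations into stop events with DBSCAN. The following is a minimal, dependency-free sketch of that idea, not the authors' code: the cited work used the Scikit-Learn implementation (`sklearn.cluster.DBSCAN` with its `eps` and `min_samples` parameters), whereas this version spells out the mechanics in plain Python. The coordinates, the 50 m radius and the `min_samples` value are hypothetical and chosen only for illustration.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6_371_000 * math.asin(math.sqrt(a))


def dbscan(points, eps_m, min_samples):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    def neighbours(i):
        # Brute-force radius query; the point itself counts as a neighbour.
        return [j for j, q in enumerate(points)
                if haversine_m(points[i], q) <= eps_m]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_samples:  # core point: keep growing the cluster
                queue.extend(nbrs)
    return labels


# Hypothetical GPS fixes: two dense stops and one isolated fix.
points = [
    (-33.9321, 18.8602), (-33.9322, 18.8603), (-33.9321, 18.8604),
    (-33.9400, 18.8700), (-33.9401, 18.8701), (-33.9400, 18.8702),
    (-33.9600, 18.9000),
]
labels = dbscan(points, eps_m=50, min_samples=3)
print(labels)
```

Note that the number of clusters is never passed in: it falls out of the density parameters, which is the property the citing authors give as a reason for choosing DBSCAN. The isolated fix is labelled as noise instead of being forced into a cluster.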

“…The immense generation of data has spurred a rapid development in the utilisation of several data science techniques to obtain relevant socio-economic value from the data being produced in fields such as medicine, biology, transportation and business enterprises [18]. It is obvious that the interdisciplinary sub-field of data mining, which involves the design and implementation of scalable descriptive and predictive machine learning algorithms, has made commendable efforts to discover useful patterns in datasets that will prove very useful [14,16,18].…”

Section: Related Work (mentioning)

confidence: 99%

“…Furthermore, there is a surge in the cardinality of the datasets with increasing deposits of observations due to frequently used services such as trading platforms, social networking and app data usage. Due to this growing challenge of data complexity, many pre-processing techniques have been proposed to reduce the dimensions and cardinality of the data entries [18]. This helps to reduce the computational costs involved in the clustering operations and detection of outliers.…”

Section: Related Work (mentioning)

confidence: 99%

“…Over the years, there has been an increase in the application of computational models in several fields such as in medical sciences, transportation, marketing, governance and law enforcement due to the recent advancement in computing technologies [18]. The goal of the usage of these data science techniques is to redefine data through the implementation of supervised and unsupervised learning models, thereby extracting useful information to facilitate decision making and the allocation of scarce resources [14,16].…”

Section: Introduction (mentioning)

confidence: 99%

Comparative Analysis of K-Means and Traversal Optimisation Algorithms

Adama, Olatunji, Yahaya et al. 2021

Advances in Intelligent Systems and Computing

This research aims to present a technical analysis of the Traversal Optimisation Algorithm (TOA) for clustering and the K-means clustering algorithm. The goal is to rigorously test this algorithm against different data specifications beyond what has previously been used with K-means without artificially and subjectively setting the initial number of clusters. The experimental evaluation involves the use of diverse cluster optimisation techniques for K-means while applying a wider range of internal validation methods, such as the Davies-Bouldin Index, Dunn Index and Silhouette Method, for appraising the cluster quality of the Traversal Optimisation Algorithm, while at the same time not compromising the configuration of the default algorithm. The findings in this work show that the optimisation algorithm's clustering quality, as calculated by multiple internal validity indices, can be very poor when operating on datasets with varying characteristics. This is owing to the algorithm's lack of any add-on mechanism for computing a priori the optimal number of clusters that a dataset needs. The results reveal that in data processing contexts where the number of clusters is specified, the TOA yields a favourable cost-benefit in terms of run-time complexity and clustering quality.
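
The abstract above names the Silhouette Method as one of its internal validation measures. As a rough illustration of what such an index computes, here is a minimal sketch of the standard silhouette coefficient, s(i) = (b - a) / max(a, b), where a is the mean distance from a point to the other members of its own cluster and b is the lowest mean distance to any other cluster. It assumes Euclidean distance and is not the paper's implementation; the sample points and labelings are hypothetical.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette over all points; points in singleton clusters score 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        # (the point's zero distance to itself is excluded by dividing by n - 1).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical data: two tight pairs far apart, labelled well and badly.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])
bad = silhouette_score(points, [0, 1, 0, 1])
print(good, bad)
```

Values near 1 indicate compact, well-separated clusters, while values near or below 0 indicate points that sit closer to another cluster than to their own. Because the index depends only on the data and the labels, it can be used to compare candidate cluster counts when the number of clusters is not known in advance.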

“…We used the DBSCAN (density-based spatial clustering of applications with noise) algorithm to group high-density closely related data points (or geo-locations), forming spatial clusters of data points that represented significant events (such as stopping and movement) during normal minibus taxi operations. We chose this algorithm because it is robust to outlier detection, can discover clusters with uneven densities and arbitrary shapes, and does not need prior knowledge of the number of clusters [53,54]. For cluster analysis in this chapter, we used a Python implementation of the DBSCAN algorithm from the Scikit-Learn package [55].…”

Section: Spatial Clustering and Analysis (mentioning)

confidence: 99%

e-Quantum leap on a data highway: Planning for electric minibus taxis in sub-Saharan Africa's paratransit system

Booysen, Abraham, Ndibatya et al. 2021

Preprint

Minibus taxis are ubiquitous in the developing cities of the Global South. This versatile and somewhat chaotic public transport system is now faced with the need to move to renewable energy. But the looming roll-out of electric vehicles poses a threat to the already fragile electrical grids of African cities. This chapter evaluates the energy requirements of decarbonisation and evaluates two types of data, passenger-based and vehicle-based, from research in South Africa that has modelled these taxis. Using these two data capture methods, we assess the energy requirements and charging opportunities for electric minibus paratransit in three African cities and compare the results of the two methods to assess their suitability for planning minibus taxi electrification.
