2020 IEEE 32nd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) 2020
DOI: 10.1109/sbac-pad49847.2020.00037
XPySom: High-Performance Self-Organizing Maps

Abstract: In this paper, we introduce XPySom, a new open-source Python implementation of the well-known Self-Organizing Maps (SOM) technique. It is designed to achieve high performance on a single node, exploiting widely available Python libraries for vector processing on multi-core CPUs and GP-GPUs. We present results from an extensive experimental evaluation of XPySom in comparison to widely used open-source SOM implementations, showing that it outperforms the other available alternatives. Indeed, our experimentation c…
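To make the technique concrete, below is a minimal NumPy sketch of the kind of vectorized batch-SOM training loop the abstract refers to. The function name and parameters are illustrative assumptions, not XPySom's actual API; the point is only to show how the per-sample loops of a naive SOM collapse into array operations that vector libraries can accelerate.

```python
import numpy as np

def train_batch_som(data, rows, cols, epochs=10, sigma0=1.0):
    """Illustrative batch-SOM sketch (not XPySom's API): every sample
    updates the map each epoch, weighted by a Gaussian neighborhood
    around its Best Matching Unit that shrinks over time."""
    rng = np.random.default_rng(0)
    n_nodes = rows * cols
    weights = rng.random((n_nodes, data.shape[1]))
    # Grid coordinates of each node, used for the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Pairwise squared distances between nodes on the map grid.
    grid_d2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(-1)
    for t in range(epochs):
        sigma = sigma0 * (1 - t / epochs) + 1e-3  # shrinking neighborhood radius
        # Best Matching Unit (closest node) for every sample, fully vectorized.
        d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(-1)
        bmu = d2.argmin(axis=1)
        # Gaussian neighborhood weight of every node w.r.t. each sample's BMU.
        h = np.exp(-grid_d2[bmu] / (2 * sigma ** 2))  # shape (n_samples, n_nodes)
        # Batch update: neighborhood-weighted mean of the samples per node.
        num = h.T @ data                  # (n_nodes, dim)
        den = h.sum(axis=0)[:, None]      # (n_nodes, 1)
        weights = num / np.maximum(den, 1e-12)
    return weights
```

Because the inner computation is expressed entirely as array products, the same structure maps naturally onto multi-core BLAS or a GPU array library, which is the performance angle the paper pursues.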

Cited by 5 publications (5 citation statements); References 25 publications.
“…Three data sets and grid sizes are used to test the efficiency of the SOM models. The quantisation error is the metric evaluation recommended by [11,[22][23][24]. The error can prove the accuracy because it measures the fitting of the neural map towards the data.…”
Section: Analysis and Discussion
confidence: 99%
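The quantisation error the statement above refers to has a simple definition: the mean distance between each input sample and the weight vector of its Best Matching Unit. A short sketch (the function name is my own, not from any of the cited packages):

```python
import numpy as np

def quantization_error(data, weights):
    """Quantisation error: mean Euclidean distance between each sample
    and its Best Matching Unit (the closest node's weight vector)."""
    d = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=-1)
    return d.min(axis=1).mean()
```

A lower value means the map's codebook fits the data more tightly, which is why the cited works use it as an accuracy proxy for SOMs.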
“…The size of the map is chosen as a trade‐off between the level of information reduction, and the accuracy of the clustering. We use the XPySom package (Mancini et al., 2020) run in a batch mode (using all the vectors in data sequentially, in opposition to a random training), in order to obtain a reproducible result. Each node being associated with a subset of the input data, we can then use the time‐steps associated with a node (called Best Matching Units BMUs) to compute composites of other variables (“conditional average”) and investigate the link between the nodes and those variables.…”
Section: Data Sets and Methods
confidence: 99%
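The "conditional average" workflow described in the statement above — grouping time-steps by their BMU and averaging another variable over each group — can be sketched in a few lines. The function name is illustrative, not part of XPySom:

```python
import numpy as np

def node_composites(bmu, variable):
    """Composite ('conditional average') per SOM node: group the
    time-steps assigned to each BMU and average another variable
    over each group."""
    nodes = np.unique(bmu)
    return {int(n): variable[bmu == n].mean(axis=0) for n in nodes}
```

Given the per-time-step BMU indices produced by a trained map, this yields one averaged field per node, which is how the cited study links map nodes to other physical variables.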
“…Such accelerations have been proved to be necessary in order to reach a satisfactory performance when tackling the massive data set provided by Vodafone. In the future, we plan to switch to a new implementation we recently realized performing even better [23].…”
Section: SOM Implementation
confidence: 99%