2014
DOI: 10.1007/s11634-014-0176-4

Basic statistics for distributional symbolic variables: a new metric-based approach

Abstract: In data mining it is usual to describe a group of measurements using summary statistics or through their empirical distribution functions. Each summary of a group of measurements is the representation of a typology of individuals (sub-populations) or of the evolution of the observed variable for each individual. Therefore, typologies or individuals are expressible through multi-valued descriptions (intervals, frequency distributions). Symbolic Data Analysis, a relatively new statistical approach, aims at the t…

Cited by 34 publications (34 citation statements)
References 22 publications (48 reference statements)
“…Finally, it is noteworthy that, according to [23] and the references therein, a closed form of the distance in Eq. (1) can be obtained under some alternative conditions: there exist closed forms for the integrals of $Q_{kj}$, $Q_{kj}^2$ and $Q_{kj} \cdot Q_{k'j}$; if $Q_{kj}$ is a quantile function associated with a histogram [22]; or if the probability density function associated with $Q_{kj}$ depends only on its mean and its standard deviation (for example, if $Q_{kj}$ and $Q_{k'j}$ are the quantile functions of two Gaussian or two Uniform distributional data).…”
Section: Distribution-valued Data and Wasserstein Distance
Citation type: mentioning
confidence: 94%
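
The closed-form case mentioned in this statement can be illustrated with a small sketch. It assumes the standard result that, for two distributions from the same location-scale family (two Gaussians, or two Uniforms), the squared L2 Wasserstein distance reduces to the squared difference of the means plus the squared difference of the standard deviations. The function names below are hypothetical and are not taken from the cited paper; the numerical integration is only there to cross-check the closed form.

```python
import math
from statistics import NormalDist


def w2_closed_form(mu1, sigma1, mu2, sigma2):
    """L2 Wasserstein distance between two members of one location-scale family.

    Assumption: both distributions share the same standardized shape
    (e.g. both Gaussian or both Uniform).
    """
    return math.sqrt((mu1 - mu2) ** 2 + (sigma1 - sigma2) ** 2)


def w2_numeric(q1, q2, n=20000):
    """Midpoint-rule approximation of sqrt(int_0^1 (q1(t) - q2(t))^2 dt)."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n  # midpoints stay strictly inside (0, 1)
        total += (q1(t) - q2(t)) ** 2
    return math.sqrt(total / n)


def uniform_quantile(a, b):
    """Quantile function of the Uniform distribution on [a, b]."""
    return lambda t: a + (b - a) * t


def uniform_mean_sd(a, b):
    """Mean and standard deviation of the Uniform distribution on [a, b]."""
    return (a + b) / 2.0, (b - a) / math.sqrt(12.0)


# Two Gaussians: N(0, 1) and N(2, 3), parameterized by mean and std. dev.
gauss_closed = w2_closed_form(0.0, 1.0, 2.0, 3.0)
gauss_numeric = w2_numeric(NormalDist(0.0, 1.0).inv_cdf,
                           NormalDist(2.0, 3.0).inv_cdf)

# Two Uniforms: U(0, 2) and U(1, 5).
m1, s1 = uniform_mean_sd(0.0, 2.0)
m2, s2 = uniform_mean_sd(1.0, 5.0)
unif_closed = w2_closed_form(m1, s1, m2, s2)
unif_numeric = w2_numeric(uniform_quantile(0.0, 2.0),
                          uniform_quantile(1.0, 5.0))
```

In both cases the closed form and the numerical integral should agree up to discretization error, which is larger for the Gaussian pair because its quantile function is unbounded near 0 and 1.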
“…. , P ) element of the matrix of prototypes G is obtained as follows, for each cluster, setting the partial derivatives w.r.t. $\bar{g}_{ij}$ and $Q^{c}_{g_{ij}}$ equal to zero, and according to [23], the quantile function associated with the corresponding probability density function (pdf) $g_{kj}$ is:…”
Section: Computation Of the Prototypes
Citation type: mentioning
confidence: 99%
“…). Irpino and Verde also noticed that the two distances that give a single center in the form of histogram are the $L_2$ Euclidean distance and the $L_2$ Wasserstein distance. If $\phi_{i_1 j_1}$ and $\phi_{i_2 j_2}$ are the two density functions associated with the histogram $Y_{i_1 j_1}$ and $Y_{i_2 j_2}$, respectively, and $F^{-1}_{i_1 j_1}(t)$ and $F^{-1}_{i_2 j_2}(t)$ ($t \in [0;1]$) their corresponding quantile functions, the $L_2$ Wasserstein distance (for a continuous distribution) is defined as follows: $d_W := \sqrt{\int_0^1 \left[ F^{-1}_{i_1 j_1}(t) - F^{-1}_{i_2 j_2}(t) \right]^2 dt}$. The mean quantile function is $M_W(Y_j) = \overline{F^{-1}_{j_1}(t)} = \frac{1}{n} \sum_{i=1}^{n} F_{i1}$…”
Section: Descriptive Statistics Of Histogram Variables
Citation type: mentioning
confidence: 98%
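
The quantities named in this statement can be sketched numerically. The example below assumes that each histogram has uniform density within each bin (so its quantile function is piecewise linear) and that every bin has positive weight. It approximates the L2 Wasserstein distance by midpoint-rule integration of the squared difference of quantile functions, and builds a mean quantile function as the pointwise average of the individual ones. All function names are hypothetical; this is an illustration, not the implementation used in the cited or citing papers.

```python
import bisect
import math


def histogram_quantile(edges, weights):
    """Return the quantile function Q(t) of a histogram.

    Assumptions: `edges` has len(weights) + 1 increasing values, all
    weights are positive, and density is uniform within each bin.
    """
    total = float(sum(weights))
    cum = [0.0]
    for w in weights:
        cum.append(cum[-1] + w / total)
    cum[-1] = 1.0  # guard against floating-point drift

    def q(t):
        # Locate the bin k with cum[k] <= t < cum[k + 1] (last bin if t == 1).
        k = min(bisect.bisect_right(cum, t) - 1, len(weights) - 1)
        frac = (t - cum[k]) / (cum[k + 1] - cum[k])
        return edges[k] + frac * (edges[k + 1] - edges[k])

    return q


def wasserstein_l2(q1, q2, n=10000):
    """Midpoint-rule approximation of sqrt(int_0^1 (q1(t) - q2(t))^2 dt)."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        total += (q1(t) - q2(t)) ** 2
    return math.sqrt(total / n)


def mean_quantile(quantile_functions):
    """Pointwise average of several quantile functions."""
    def q_bar(t):
        return sum(q(t) for q in quantile_functions) / len(quantile_functions)
    return q_bar


# Two toy histograms: the second is the first shifted right by 1.
q_a = histogram_quantile([0.0, 1.0, 2.0], [0.5, 0.5])
q_b = histogram_quantile([1.0, 2.0, 3.0], [0.5, 0.5])

distance = wasserstein_l2(q_a, q_b)    # a pure shift by 1 gives distance 1
q_mean = mean_quantile([q_a, q_b])
median_of_mean = q_mean(0.5)           # average of the two medians
```

Because the mean is taken on quantile functions rather than on densities, the average of two histograms that differ only by a shift is again a histogram of the same shape, located halfway between them.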