2008
DOI: 10.3844/jcssp.2008.421.426

Mining Functional Dependency from Relational Databases Using Equivalent Classes and Minimal Cover

Abstract: Data Mining (DM) is the process of extracting interesting and previously unknown knowledge from data. This study proposes a new algorithm, FD_Discover, for discovering Functional Dependencies (FDs) from databases. The algorithm employs concepts from relational database design theory, specifically equivalences and the minimal cover. It yields a large improvement in performance over a recent, similar algorithm called FD_MINE. Key words: Data mining, funct…
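
To make the equivalence-class idea in the abstract concrete, here is a minimal Python sketch of partition-based FD checking. It is not the paper's FD_Discover algorithm: the function names (`partition`, `holds`, `minimal_fds`) and the row-as-dict representation are assumptions made for illustration only. The sketch relies on a standard fact: X -> A holds in a table exactly when grouping rows by X yields as many equivalence classes as grouping by X plus A.

```python
from collections import defaultdict
from itertools import combinations


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(i)
    return list(classes.values())


def holds(rows, lhs, rhs):
    """X -> A holds iff adding A to X does not split any equivalence class."""
    lhs = list(lhs)
    return len(partition(rows, lhs)) == len(partition(rows, lhs + [rhs]))


def minimal_fds(rows, attributes):
    """Enumerate FDs X -> A with a minimal, non-empty left-hand side X.

    Brute force over attribute subsets, smallest first (exponential in the
    number of attributes). Hypothetical helper, not the paper's method.
    """
    found = []
    for rhs in attributes:
        others = [a for a in attributes if a != rhs]
        minimal_lhs = []
        for size in range(1, len(others) + 1):
            for lhs in combinations(others, size):
                # Skip supersets of a left-hand side already known to work.
                if any(set(m) <= set(lhs) for m in minimal_lhs):
                    continue
                if holds(rows, lhs, rhs):
                    minimal_lhs.append(lhs)
                    found.append((lhs, rhs))
    return found
```

The sketch only minimises each left-hand side; it does not remove dependencies implied by others through transitivity, so its output is not yet a minimal cover in the sense the abstract mentions.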

Cited by 8 publications (4 citation statements)
References 9 publications
“…(2) Rely on information theoretic measures [30] by considering the ratio Both above approaches have a fundamental flaw: given a finite sample of tuples, as the number of attributes in X increases, it [is] more likely that the empirical ratio PR (x|y) = |(x, y)|/|x| is 1.0, leading both aforementioned approaches to determine that Equation 1 is satisfied. This behavior leads to overfitting to spurious dependencies and the discovery of complex (dense) structures across attributes.…”
Section: Functional Dependencies (mentioning)
confidence: 99%
“…We build upon recent works that observe that in the presence of strong structured dependencies automated data cleaning can be effective [17,40] and perform the following experiment: For each data set in Table 3, we separate its attributes into two groups (1) attributes that participate in an FD based on FDX's output, and (2) attributes that are independent according to FDX. We measure the median imputation accuracy for each group for AimNet and XGBoost and examine if the constraints discovered by FDX can be used as a proxy to identify if automated cleaning will be accurate.…”
Section: Using FDX in Data Preparation (mentioning)
confidence: 99%
“…In general, there is an effort to recover as much of the semantics as possible from many different source genres. For example, researchers have investigated semantic recovery from relational databases [5,6], XML [1,38], human-readable tables [24,25,34], forms [22,31], and free-running text [10].…”
Section: Theorem 3 Let S Be a Relational Database With Its Schema Re (mentioning)
confidence: 99%