2019
DOI: 10.1587/transinf.2018edl8237
TFIDF-FL: Localizing Faults Using Term Frequency-Inverse Document Frequency and Deep Learning

Abstract: Existing fault localization techniques based on neural networks utilize information about whether a statement is executed or not to identify suspicious statements potentially responsible for a failure. However, this information captures only the binary execution state of a statement and cannot show how important a statement is across executions. Consequently, it may degrade fault localization effectiveness. To address this issue, this paper proposes TFIDF-FL, which uses term frequency-inverse document frequency to ident…
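The abstract's core idea is to replace binary executed/not-executed coverage with TF-IDF-style weights, so that statements executed by few tests stand out. The following is a minimal sketch of that weighting, not the paper's actual formulation: the coverage matrix, the smoothing in the IDF term, and the treatment of each test case as a "document" and each statement as a "term" are all illustrative assumptions.

```python
import math

# Hypothetical coverage matrix: rows are test cases ("documents"),
# columns are statements ("terms"); each entry counts how often the
# statement was executed in that test (0/1 in the simplest case).
coverage = [
    [1, 1, 0, 1],  # test 1
    [1, 0, 0, 1],  # test 2
    [1, 1, 1, 0],  # test 3
]

def tfidf_weights(coverage):
    n_tests = len(coverage)
    n_stmts = len(coverage[0])
    # Document frequency: number of tests executing each statement.
    df = [sum(1 for row in coverage if row[s] > 0) for s in range(n_stmts)]
    weighted = []
    for row in coverage:
        total = sum(row) or 1  # guard against an empty test
        weighted.append([
            # TF = relative execution frequency within the test;
            # IDF (smoothed) penalizes statements executed by many tests.
            (row[s] / total) * math.log((n_tests + 1) / (df[s] + 1))
            for s in range(n_stmts)
        ])
    return weighted

weights = tfidf_weights(coverage)
```

Under this scheme, statement 0 (executed by every test) receives weight 0 in each test, while statement 2 (executed only by test 3) receives the largest weight in that test, which is exactly the "how important is this statement in this execution" signal the abstract contrasts with binary coverage.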

Cited by 5 publications (3 citation statements)
References 14 publications
“…The algorithm is easily affected by skew in the data set, such as a large number of documents in a certain category, which leads to underestimation of the IDF. IDF improvement algorithms such as TFIDF-FL (Zhang et al., 2019) have been proposed, and some scholars have also suggested combining TF-IDF with Word2Vec to address the shortcomings of TF-IDF (Naeem et al., 2022); in short, simply using the TF-IDF algorithm to calculate semantic similarity leads to low accuracy.…”
Section: Results
confidence: 99%
“…The algorithm is easily affected by skew in the data set, such as a large number of documents in a certain category, which leads to underestimation of the IDF. IDF improvement algorithms such as TFIDF-FL (Zhang et al., 2019) have been proposed, and some scholars have also suggested combining TF-IDF with Word2Vec to address the shortcomings of TF-IDF (Naeem et al., 2022). In short, while the SimHash algorithm has the features mentioned above, its text similarity calculation is suited to low-precision, high-speed scenarios. Our calculation has lower requirements for speed but higher requirements for accuracy, which shows that SimHash is unsuitable for studying long texts or for high-precision similarity calculations.…”
Section: Analysis of Calculation
confidence: 99%
“…TF-IDF computes the weight or value of each word (token) in a corpus document. This method is frequently utilized in information retrieval and text mining to evaluate the relevance of each word to a document [21]. This normalization process determines the weight of terms that appear frequently in a document.…”
Section: Feature Extraction
confidence: 99%