In industrial data analytics, a fundamental problem is exploiting the temporal correlation of industrial data to make timely predictions during the production process, such as fault prediction and yield prediction. However, traditional prediction models are fixed while the conditions of the machines change over time, so prediction errors grow as time passes. In this paper, we propose a general data renewal model to address this problem. Combining a similarity function with a loss function, the model estimates when the existing prediction model should be updated, and then updates it iteratively and adaptively according to an evaluation function. We have applied the data renewal model to two prediction algorithms. Experiments demonstrate that the data renewal model can effectively identify changes in the data and update and optimize the prediction model, thereby improving prediction accuracy.
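The renewal trigger described above can be illustrated with a minimal sketch. The abstract does not specify the similarity or loss functions, so the cosine similarity, the threshold values, and all function names below are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def cosine_similarity(a, b):
    # Similarity between the training-time data window and a recent window.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def needs_renewal(train_window, recent_window, recent_loss,
                  sim_threshold=0.9, loss_threshold=0.5):
    """Decide whether the existing prediction model should be updated.

    Renewal is triggered when the recent data no longer resembles the
    training data (low similarity) or the prediction loss has grown
    beyond an acceptable level. Thresholds are illustrative placeholders.
    """
    sim = cosine_similarity(train_window, recent_window)
    return sim < sim_threshold or recent_loss > loss_threshold
```

In an adaptive loop, `needs_renewal` would be evaluated after each batch of production data, and a `True` result would trigger retraining on the newest window.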
Subsequence matching is an important and fundamental problem on time series data. This paper studies the inherent time complexity of the subsequence matching problem and designs a more efficient algorithm for solving it. First, it is proved that, if the SETH hypothesis is true, the subsequence matching problem cannot be solved in O(n^(1-δ)) time even when polynomial-time preprocessing is allowed, where n is the size of the input time series and 0 ≤ δ < 1; that is, the inherent complexity of the subsequence matching problem is ω(n^(1-δ)). Second, an efficient algorithm for the subsequence matching problem is proposed. To improve its efficiency, we design a new summarization method as well as a novel index for series data. The proposed algorithm supports both Euclidean distance and DTW distance, with or without z-normalization. Experimental results show that the proposed algorithm is up to about 3~10 times faster than the state-of-the-art algorithm on the constrained z-normalized Euclidean distance and DTW distance, and up to 7~12 times faster on Euclidean distance.
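To make the matching problem concrete, here is a brute-force baseline for z-normalized Euclidean subsequence matching, the O(n·m) scan that the paper's summarization and index are designed to beat. This is a generic sketch, not the proposed algorithm:

```python
import numpy as np

def znorm(x):
    # Z-normalization: shift to zero mean, scale to unit std deviation.
    mu, sigma = x.mean(), x.std()
    return (x - mu) / sigma if sigma > 0 else x - mu

def best_match(series, query):
    """Return (start index, distance) of the closest z-normalized
    Euclidean match of `query` among all length-m subsequences of
    `series`. Brute force: every subsequence is normalized and compared.
    """
    s = np.asarray(series, dtype=float)
    q = znorm(np.asarray(query, dtype=float))
    m = len(q)
    best_i, best_d = None, float("inf")
    for i in range(len(s) - m + 1):
        d = float(np.linalg.norm(znorm(s[i:i + m]) - q))
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d
```

Because z-normalization is scale- and offset-invariant, the query `[0, 2, 6]` matches the subsequence `[0, 1, 3]` exactly (one is a scaled copy of the other), which is precisely the kind of match a raw Euclidean scan would miss.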
Knowledge bases (KBs) are an important component of artificial intelligence. One significant challenge in KB construction is that KBs contain much noise, which prevents their effective use. Although some KB cleansing algorithms have been proposed, they focus on the structure of the knowledge graph and neglect the relations between concepts, which could help discover wrong relations in a KB. Motivated by this, we measure the relation between two concepts by the distance between their corresponding instances and detect errors within the intersection of the conflicting concept sets. For efficient and effective knowledge base cleansing, we first apply a distance-based model to determine the conflicting concept sets using two different methods. We then propose and analyze several algorithms for detecting and repairing the errors based on our model, using a hash method to calculate distance efficiently. Experimental results demonstrate that the proposed approaches can cleanse knowledge bases efficiently and effectively.
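The hash-based distance idea can be sketched as follows. The abstract does not say which hash method is used, so this sketch assumes MinHash signatures to estimate Jaccard similarity between the instance sets of two concepts; the threshold and all names are illustrative:

```python
import hashlib

def minhash_signature(instances, num_hashes=64):
    """MinHash signature of a concept's instance set (illustrative).
    Each seeded hash keeps the minimum value over all instances."""
    return [
        min(int(hashlib.md5(f"{seed}:{it}".encode()).hexdigest(), 16)
            for it in instances)
        for seed in range(num_hashes)
    ]

def estimated_jaccard(sig_a, sig_b):
    # Fraction of agreeing signature positions estimates Jaccard similarity.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

def conflicting(concept_a, concept_b, threshold=0.5):
    """Flag two concepts whose instance sets are close (high estimated
    Jaccard similarity); instances in their intersection are candidate
    errors to inspect. Threshold is an illustrative placeholder."""
    sim = estimated_jaccard(minhash_signature(concept_a),
                            minhash_signature(concept_b))
    return sim >= threshold, set(concept_a) & set(concept_b)
```

Signatures are computed once per concept, so comparing all concept pairs costs only signature comparisons rather than full set intersections, which is the efficiency gain hashing provides here.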