Shiming He scite author profile

Log anomaly detection is an efficient way to manage modern large-scale Internet of Things (IoT) systems. A growing number of works apply natural language processing (NLP) methods, in particular word2vec, to log feature extraction. Word2vec can extract the relevance between words and vectorize them. However, the computational cost of training word2vec is high. Anomalies in logs depend not only on an individual log message but also on the log message sequence. Therefore, the word vectors from word2vec cannot be used directly: they must be transformed into log event vectors and then into log sequence vectors. To reduce computational cost and avoid multiple transformations, we propose an offline feature extraction model named LogEvent2vec, which takes log events as the input of word2vec to extract the relevance between log events and vectorize them directly. LogEvent2vec can work with any coordinate transformation method and any anomaly detection model. After obtaining the log event vectors, we transform them into log sequence vectors by bary or tf-idf, and train three kinds of supervised models (Random Forests, Naive Bayes, and Neural Networks) to detect anomalies. We have conducted extensive experiments on a real public log dataset from BlueGene/L (BGL). The experimental results demonstrate that LogEvent2vec can significantly reduce computational time, by a factor of 30, and improve accuracy compared with word2vec. LogEvent2vec with bary and Random Forest achieves the best F1-score, and LogEvent2vec with tf-idf and Naive Bayes needs the least computational time.
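The step from log event vectors to log sequence vectors can be illustrated with a minimal sketch. It assumes that "bary" means the barycenter (arithmetic mean) of the event vectors in a sequence, and it uses a common tf-idf weighting (term frequency times log of inverse document frequency); the abstract does not give the exact formulas. The event IDs, the two-dimensional vectors, and the function names are hypothetical, and the event vectors are taken as given rather than trained with word2vec.

```python
import math
from collections import Counter

def bary_vector(sequence, event_vectors):
    """Mean of the event vectors in one non-empty log sequence (assumed 'bary')."""
    dim = len(next(iter(event_vectors.values())))
    total = [0.0] * dim
    for event in sequence:
        for i, x in enumerate(event_vectors[event]):
            total[i] += x
    return [x / len(sequence) for x in total]

def idf_weights(sequences):
    """Inverse document frequency of each event, treating a sequence as a document."""
    n = len(sequences)
    doc_freq = Counter(event for seq in sequences for event in set(seq))
    return {event: math.log(n / df) for event, df in doc_freq.items()}

def tfidf_vector(sequence, event_vectors, idf):
    """Sum of event vectors, each weighted by tf * idf (an assumed weighting)."""
    dim = len(next(iter(event_vectors.values())))
    total = [0.0] * dim
    for event, count in Counter(sequence).items():
        weight = (count / len(sequence)) * idf[event]
        for i, x in enumerate(event_vectors[event]):
            total[i] += weight * x
    return total

# Hypothetical event vectors; in the paper these would come from word2vec
# trained on sequences of log events.
event_vectors = {"E1": [1.0, 0.0], "E2": [0.0, 1.0], "E3": [1.0, 1.0]}
sequences = [["E1", "E2", "E1"], ["E2", "E3"]]

idf = idf_weights(sequences)
bary = bary_vector(sequences[0], event_vectors)
weighted = tfidf_vector(sequences[0], event_vectors, idf)
```

Either sequence vector could then be passed to a supervised classifier. Note that under this weighting an event that appears in every sequence (E2 here) gets an idf of zero and contributes nothing to the tf-idf vector.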


Deep Learning-Based Data Storage for Low Latency in Data Center Networks

Liao

Zhang

et al. 2019

IEEE Access


Low-latency data access is becoming an increasingly important challenge. Proper placement of data blocks can reduce data travel among distributed storage systems, which contributes significantly to latency reduction. However, the dominant data placement optimization approaches have relied primarily on data requests known in advance or on a static initial data distribution, which ignores the dynamics of clients' data access requests and of the network. Learning technology can help data center networks (DCNs) learn from historical access information and make optimal data storage decisions. Considering a more practical DCN with a fat-tree topology, we utilize the deep-learning technology k-means to help store data blocks and thereby improve the read and write latency of the DCN, where k is the number of cores in the fat-tree. The evaluation results demonstrate that the average write and read latency of the whole system can be lowered by 33% and 45%, respectively. The best setting of the parameter k is analyzed and recommended as guidance for real applications: it equals the number of cores in the DCN.

Index terms: Data center networks, data storage, deep learning, k-means.
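As a rough illustration of the clustering step, the sketch below runs plain k-means (Lloyd's algorithm) over hypothetical per-block access feature vectors and assigns each data block to one of k groups, with k standing in for the number of cores in the fat-tree. The abstract does not say which features are clustered or how clusters map to storage locations, so the feature values, the deterministic initialization, and the function names here are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members; keep it if it has none.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Hypothetical access features for four data blocks (for example, how often
# clients under two different core switches requested each block).
block_features = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
num_cores = 2  # k is set to the number of cores, as the abstract recommends

assignment, centroids = kmeans(block_features, num_cores)
```

Blocks that land in the same cluster have similar access patterns under these assumed features, which is the property a placement policy would then exploit.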


LogUAD: Log Unsupervised Anomaly Detection Based on Word2Vec

Wang¹,

Zhao²,

He³

et al. 2022


Interference-Aware Routing for Difficult Wireless Sensor Network Environment with SWIPT

Tang

et al. 2019

Sensors


The main challenges of sensing in harsh industrial and biological environments are the limited energy of sensor nodes and the difficulty of charging them. Simultaneous wireless information and power transfer (SWIPT) is a non-invasive option for replenishing energy. SWIPT harvests energy and decodes information from the same RF signal, which influences the design of a wireless sensor network. In multi-hop multi-flow wireless sensor networks, interference generally exists, and it affects SWIPT in a different way than it affects conventional transmission. Routing, interference, and SWIPT are interdependent. However, existing works either consider SWIPT link resource allocation with a given route or select a path for only one flow without interference. Therefore, this paper first analyzes the influence of interference on SWIPT and then selects SWIPT routes in the presence of interference. We design an interference-based information and energy allocation model to maximize the link capacity with SWIPT. Then, we design an interference-aware route metric, formulate the SWIPT routing problem, and design an interference-aware SWIPT routing algorithm. The simulation results show that as the number of flows increases, performance gains from interference and SWIPT become more likely.



Shiming He

Energy-Aware Routing for SWIPT in Multi-Hop Energy-Constrained Wireless Network

Parameters Compressing in Deep Learning

Interference-Aware Multisource Transmission in Multiradio and Multichannel Wireless Network

An efficient privacy-preserving compressive data gathering scheme in WSNs

LogEvent2vec: LogEvent-to-Vector Based Anomaly Detection for Large-Scale Logs in Internet of Things

Deep Learning-Based Data Storage for Low Latency in Data Center Networks

LogUAD: Log Unsupervised Anomaly Detection Based on Word2Vec

Interference-Aware Routing for Difficult Wireless Sensor Network Environment with SWIPT
