For security defense in current Intelligent Transportation Systems (ITS), malware is often used as the data source for security analysis, but this approach can detect only known attack types. This paper proposes a general anomaly detection framework that uses log data as the analysis data source instead. By modeling the log template sequence as a natural language sequence and applying a stacked Long Short-Term Memory (LSTM) network with a self-attention mechanism, the framework extracts the hidden patterns of the log template sequence and captures the dependencies within it. Experimental results show that the framework achieves higher overall accuracy in log sequence anomaly detection than existing methods, at a lower time cost.
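To make "modeling the log template sequence as a natural language sequence" concrete, the following is a minimal sketch of the preprocessing such a framework might need. The abstract does not say how templates are extracted or how training examples are formed, so everything here is an assumption: the digit-masking regex stands in for a real log parser, the names `to_template`, `encode_sequence`, and `windows` are hypothetical, and pairing each window with the next template ID is one common setup for sequence models, not necessarily the one used in the paper.

```python
import re

def to_template(line):
    # Assumption: masking digit runs is enough to recover a template.
    # A real system would use a dedicated log parser instead.
    return re.sub(r"\d+", "<*>", line)

def encode_sequence(lines, vocab=None):
    # Map each log line to an integer template ID, like word IDs in a sentence.
    vocab = {} if vocab is None else vocab
    ids = [vocab.setdefault(to_template(line), len(vocab)) for line in lines]
    return ids, vocab

def windows(ids, size):
    # Assumption: the model sees `size` template IDs and predicts the next one.
    return [(ids[i:i + size], ids[i + size]) for i in range(len(ids) - size)]

logs = ["open file 12", "read 100 bytes", "open file 13", "close file 12"]
ids, vocab = encode_sequence(logs)
pairs = windows(ids, 2)
```

Each `(window, next_id)` pair could then be fed to a sequence model such as the stacked LSTM with self-attention described above; the model itself is outside the scope of this sketch.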
Data compression and decompression are widely used in modern communication and data transmission, but decompressing corrupted losslessly compressed files remains a challenge. Targeting Lempel-Ziv-Storer-Szymanski (LZSS), a lossless data compression algorithm widely used in general-purpose coding, this paper proposes an effective method for repairing errors in corrupted LZSS files so that they can be decompressed and restored, and provides the theoretical basis for the method. By using the residual redundancy left by the LZSS encoder to carry check information, the method repairs errors in LZSS-compressed data without any loss of compression performance. It requires no additional bits and changes neither the coding rules nor the data format, so it is fully compatible with the standard algorithm: data compressed by LZSS with error-repair capability can still be decompressed by a standard LZSS decoder. Experimental results verify the validity and practicality of the proposed method.
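The abstract does not spell out where the residual redundancy comes from. One well-known source in LZ-family encoders is that several window positions often yield an equally long match, and the decoder produces the same output whichever one is chosen. The sketch below assumes that mechanism and shows only that extra bits can ride along in the choice of offset while a standard decoder is unaffected; it performs no error repair and should not be read as the paper's method. The token representation, the parameters `WINDOW`, `MIN_LEN`, and `MAX_LEN`, and all function names are illustrative assumptions.

```python
WINDOW, MIN_LEN, MAX_LEN = 4096, 3, 18  # assumed parameters, not from the paper

def longest_matches(data, pos):
    # Return the longest match length at `pos` and every offset achieving it.
    best_len, offsets = 0, []
    for cand in range(max(0, pos - WINDOW), pos):
        length = 0
        while (length < MAX_LEN and pos + length < len(data)
               and data[cand + length] == data[pos + length]):
            length += 1
        if length > best_len:
            best_len, offsets = length, [pos - cand]
        elif length == best_len:
            offsets.append(pos - cand)
    if best_len < MIN_LEN:
        return 0, []
    return best_len, sorted(offsets)

def encode(data, bits=()):
    # Greedy LZSS-style parse; the choice among tied offsets carries `bits`.
    bits = list(bits)
    tokens, pos = [], 0
    while pos < len(data):
        length, offsets = longest_matches(data, pos)
        if not offsets:
            tokens.append(("lit", data[pos]))
            pos += 1
            continue
        k = len(offsets).bit_length() - 1  # whole bits this choice can carry
        index = 0
        for _ in range(k):
            index = (index << 1) | (bits.pop(0) if bits else 0)  # pad with 0
        tokens.append(("ref", offsets[index], length))
        pos += length
    return tokens

def decode(tokens):
    # Standard decoder: it never needs to know that bits were embedded.
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, offset, length = tok
            for _ in range(length):
                out.append(out[-offset])
    return bytes(out)

def extract(tokens):
    # Recover embedded bits by recomputing the tied offsets at each reference.
    data, pos, bits = decode(tokens), 0, []
    for tok in tokens:
        if tok[0] == "lit":
            pos += 1
            continue
        _, offset, length = tok
        _, offsets = longest_matches(data, pos)
        k = len(offsets).bit_length() - 1
        index = offsets.index(offset)
        bits.extend((index >> (k - 1 - i)) & 1 for i in range(k))
        pos += length
    return bits

sample = b"abcd" * 8
tokens = encode(sample, [1, 0])
```

Here `decode(tokens)` returns the original bytes regardless of the embedded bits, and `extract(tokens)` reads them back. The token count is the same with or without a payload, which is consistent with the abstract's claim that no additional bits are needed.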
Named entity classification of Wikipedia articles is a fundamental research area; it can be used to automatically build large-scale corpora for named entity recognition or to support other entity processing tasks, such as entity linking. This paper describes a method for classifying named entities in Chinese Wikipedia into fine-grained types. We drew on multi-faceted information in Chinese Wikipedia to construct four feature sets, designed a different feature selection method for each feature set, and fused the features in a vector space using different strategies. Experimental results show that the explored feature sets and their combinations effectively improve the performance of named entity classification.