With the development of Internet cloud technology, the scale of data is expanding. Traditional processing methods find it difficult to deal with the problem of information extraction of big data. Therefore, it is necessary to use machine-learning-assisted intelligent processing to extract information from data in order to solve the optimization problem in complex systems. There are many forms of data storage. Among them, text data is an important data type that directly reflects semantic information. Text vectorization is an important concept in natural language processing tasks. Because text data can not be directly used for model parameter training, it is necessary to vectorize the original text data and make it numerical, and then the feature extraction operation can be carried out. The traditional text digitization method is often realized by constructing a bag of words, but the vector generated by this method can not reflect the semantic relationship between words, and it also easily causes the problems of data sparsity and dimension explosion. Therefore, this paper proposes a text vectorization method combining a topic model and transfer learning. Firstly, the topic model is selected to model the text data and extract its keywords, to grasp the main information of the text data. Then, with the help of the bidirectional encoder representations from transformers (BERT) model, which belongs to the pretrained model, model transfer learning is carried out to generate vectors, which are applied to the calculation of similarity between texts. By setting up a comparative experiment, this method is compared with the traditional vectorization method. The experimental results show that the vector generated by the topic-modeling- and transfer-learning-based text vectorization (TTTV) proposed in this paper can obtain better results when calculating the similarity between texts with the same topic, which means that it can more accurately judge whether the contents of the given two texts belong to the same topic.
Robotic Mobile Fulfillment System (RMFS) is a new type of parts-to-picker order picking system and has become the development trend of e-commerce logistics distribution centers. There are usually a large number of tasks need to be allocated to many robots and the picking time for e-commerce orders is usually very tight, which puts forward higher requirements for the efficiency of multirobot task allocation (MRTA) in e-commerce RMFS. Current researches on MRTA in RMFS seldom consider task correlation and the balance among picking stations. In this paper, a task time cost model considering task correlation is built according to the characteristics of the picking process. Then, a multirobot task allocation model minimizing the overall picking time is established considering both the picking time balance of picking stations and the load balance of robots. Finally, a four-stage balanced heuristic auction algorithm is designed to solve the task allocation model and the tasks with execution sequence for each robot are obtained. By comparing with the traditional task time cost model and the algorithm without considering the balance among picking stations, it is found that the proposed model and algorithm can significantly shorten the overall picking time.
Abstract-It is a critical task to infer discriminative and coherent topics from short texts. Furthermore, people not only want to know what kinds of topics can be extract from these short texts, but also desire to obtain the temporal dynamic evolution of these topics. In this paper, we present a novel model for short texts, referred as topic trend detection (TTD) model. Based on an optimized topic model we proposed, TTD model derives more typical terms and itemsets to represent topics of short texts and improves the coherence of topic representations. Ultimately, we extend the topic itemsets obtained from the optimized topic model by vector space representations of words to detect topic trends. Through extensive experiments on several real-world short text collections in Sina Microblog, the results show our method achieves comparable topic representations than state-of-the-art models, measured by topic coherence, and then show its application in identifying topic trends in Sina Microblog.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.