Cross-modal retrieval aims to elucidate information fusion, imitate human learning, and advance the field. Although previous reviews have primarily focused on binary and real-value coding methods, there is a scarcity of techniques grounded in deep representation learning. In this paper, we concentrated on harmonizing cross-modal representation learning and the full-cycle modeling of high-level semantic associations between vision and language, diverging from traditional statistical methods. We systematically categorized and summarized the challenges and open issues in implementing current technologies and investigated the pipeline of cross-modal retrieval, including pre-processing, feature engineering, pre-training tasks, encoding, cross-modal interaction, decoding, model optimization, and a unified architecture. Furthermore, we propose benchmark datasets and evaluation metrics to assist researchers in keeping pace with cross-modal retrieval advancements. By incorporating recent innovative works, we offer a perspective on potential advancements in cross-modal retrieval.
To accelerate the inference of machine-learning (ML) model serving, clusters of machines require the use of expensive hardware accelerators (e.g., GPUs) to reduce execution time. Advanced inference serving systems are needed to satisfy latency service-level objectives (SLOs) in a cost-effective manner. Novel autoscaling mechanisms that greedily minimize the number of service instances while ensuring SLO compliance are helpful. However, we find that it is not adequate to guarantee cost effectiveness across heterogeneous GPU hardware, and this does not maximize resource utilization. In this paper, we propose HetSev to address these challenges by incorporating heterogeneity-aware autoscaling and resource-efficient scheduling to achieve cost effectiveness. We develop an autoscaling mechanism which accounts for SLO compliance and GPU heterogeneity, thus provisioning the appropriate type and number of instances to guarantee cost effectiveness. We leverage multi-tenant inference to improve GPU resource utilization, while alleviating inter-tenant interference by avoiding the co-location of identical ML instances on the same GPU during placement decisions. HetSev is integrated into Kubernetes and deployed onto a heterogeneous GPU cluster. We evaluated the performance of HetSev using several representative ML models. Compared with default Kubernetes, HetSev reduces resource cost by up to 2.15× while meeting SLO requirements.
With the development of information technology, the speed and scope of dissemination have been greatly improved, the content of information directly affects the social order, and the mining of communication laws has become a hot issue. At present, there are three common analytical models, which are based on the popularity curve fitting, interpersonal networks simulation, and machine learning models. Although these models play a major role in diffusion effects analysis, there are also problems such as the data analysis results are too dependent on the data and the versatility is weak. Based on the classical Schramm's eight elements theory and previous research results, in this study, we identified evolution curve usually has multiple peaks, and assumed a composite decomposition function to establish a mathematical abstraction. Draws on the denoising autoencoder neural network model, a multi-scale sliding-window model is proposed for fitting the energy variety curve with the number of microblogs as the influence index. The theoretical basis of the model makes it interpretable, and because of the use of a large number of known news dissemination data for training, so the versatility is better, and the denoising characteristics of the autoencoder model weaken the intensity of data cleaning. Finally, compared with the SIR function and SEIR function, the experiment shows that the fitting error of the function obtained from the model is significantly reduced. However, the model’s algorithm needs to be improved in parallel processing, and the model itself needs to be further optimized.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.