We examine the problem of joint top-down active search of multiple objects under interaction, e.g., a person riding a bicycle or cups held by a table. Such interacting objects can often provide contextual cues to each other that facilitate more efficient search. By treating each detector as an agent, we present the first collaborative multi-agent deep reinforcement learning algorithm to learn the optimal policy for joint active object localization, which effectively exploits such beneficial contextual information. We learn inter-agent communication through gated cross connections between the Q-networks, which is facilitated by a novel multi-agent deep Q-learning algorithm with joint exploitation sampling. We verify our proposed method on multiple object detection benchmarks. Not only does our model help to improve the performance of state-of-the-art active localization models, it also reveals interesting codetection patterns that are intuitively interpretable.
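The gated cross connection mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' architecture: the function names (`gated_message`, `fuse`), the scalar sigmoid gate, and the additive fusion are all assumptions chosen to show the general idea of one agent passing a gated copy of its hidden features to another.

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_message(sender_hidden, gate_weights, gate_bias):
    """Scale the sender's hidden features by a scalar gate in (0, 1).

    Hypothetical helper: the gate is computed from the sender's own
    features; a real model would learn gate_weights and gate_bias.
    """
    score = sum(w * h for w, h in zip(gate_weights, sender_hidden)) + gate_bias
    gate = sigmoid(score)
    return [gate * h for h in sender_hidden]


def fuse(own_hidden, message):
    """Add the incoming gated message to the receiver's features (assumed fusion rule)."""
    return [a + m for a, m in zip(own_hidden, message)]


# Agent 2 sends a half-open message (zero weights and bias give gate = 0.5) to agent 1.
message = gated_message([2.0, 4.0], [0.0, 0.0], 0.0)
fused = fuse([1.0, 1.0], message)
```

With a strongly negative gate bias the message shrinks toward zero, so the receiving agent effectively ignores its partner; this is one way a learned gate could switch context sharing on or off.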
Deep learning techniques have made significant progress in medical image analysis. However, obtaining ground-truth labels is challenging, and unlabeled medical images often outnumber labeled ones. Thus, training a high-performance model with limited labeled data has become a crucial challenge. Methods: This study introduces an underlying knowledge-based semi-supervised framework called UKSSL, consisting of two components: MedCLR extracts feature representations from the unlabeled dataset; UKMLP takes these representations and fine-tunes on the limited labeled dataset to classify the medical images. Results: UKSSL is evaluated on the LC25000 and BCCD datasets, using only 50% labeled data. It achieves precision, recall, F1-score, and accuracy of 98.9% on LC25000 and 94.3%, 94.5%, 94.3%, and 94.1% on BCCD, respectively. These results outperform other supervised-learning methods using 100% labeled data. Conclusions: UKSSL can efficiently extract underlying knowledge from the unlabeled dataset and perform better using limited labeled medical images.
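For readers unfamiliar with the metrics reported above, the following sketch shows the standard definitions of precision, recall, and F1-score for a single class. The function name and the example counts are illustrative assumptions; the abstract does not say how the per-class values were averaged, so this is not a reproduction of the reported numbers.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion counts for one class.

    tp, fp, fn are true-positive, false-positive, and false-negative counts.
    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical counts: 8 correct detections, 2 false alarms, 2 misses.
p, r, f1 = precision_recall_f1(8, 2, 2)
```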
Many tasks in artificial intelligence require the collaboration of multiple agents. We examine deep reinforcement learning for multi-agent domains. Recent research efforts often take one of two seemingly conflicting perspectives: the decentralized perspective, where each agent is supposed to have its own controller, and the centralized perspective, where one assumes there is a larger model controlling all agents. In this regard, we revisit the idea of the master-slave architecture by incorporating both perspectives within one framework. Such a hierarchical structure naturally lets each perspective leverage the advantages of the other. The idea of combining both perspectives is intuitive and can be well motivated by many real-world systems; however, out of a variety of possible realizations, we highlight three key ingredients, i.e., composed action representation, learnable communication, and independent reasoning. With network designs to facilitate these explicitly, our proposal consistently outperforms the latest competing methods both in synthetic experiments and when applied to challenging StarCraft 1 micromanagement tasks.
Microblog Sentiment Classification (MSC) is a challenging task in microblog mining, arising in many applications such as stock price prediction and crisis management. Currently, most existing approaches learn the user sentiment model from the tweets users post in microblogs, and so suffer from insufficiently discriminative tweet representations. In this paper, we consider the problem of microblog sentiment classification from the viewpoint of heterogeneous MSC network embedding. We propose a novel recurrent random walk network learning framework for the problem by exploiting both users' posted tweets and their social relations in microblogs. We then introduce deep recurrent neural networks with a random-walk layer for heterogeneous MSC network embedding, which can be trained end-to-end from scratch. We employ the back-propagation method for training the proposed recurrent random walk network model. Extensive experiments on large-scale public datasets from Twitter show that our method achieves better performance than other state-of-the-art solutions to the problem.
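To give a rough sense of what a random-walk step over a user network does, here is a minimal sketch under stated assumptions. It is not the paper's layer: the function names, the toy adjacency matrix, and the plain one-step propagation (each node's new feature is the transition-weighted average of its neighbours' features) are all illustrative choices.

```python
def row_normalize(adjacency):
    """Turn an adjacency matrix into a row-stochastic transition matrix.

    Rows with no outgoing edges are left as all zeros.
    """
    transition = []
    for row in adjacency:
        total = sum(row)
        transition.append([v / total if total else 0.0 for v in row])
    return transition


def random_walk_step(transition, features):
    """Propagate node features one step: new[i] = sum_j P[i][j] * features[j]."""
    n = len(features)
    dim = len(features[0])
    return [
        [sum(transition[i][j] * features[j][k] for j in range(n)) for k in range(dim)]
        for i in range(n)
    ]


# Toy network: user 0 follows users 1 and 2; users 1 and 2 follow user 0.
adjacency = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
features = [[0.0], [2.0], [4.0]]  # hypothetical 1-d sentiment features
propagated = random_walk_step(row_normalize(adjacency), features)
```

In this toy case user 0's propagated feature is the average of its two neighbours' features, which is the sense in which social relations can inform a user's representation.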
Active Object Tracking (AOT) is crucial to many vision-based applications, e.g., mobile robots and intelligent surveillance. However, there are a number of challenges when deploying active tracking in complex scenarios, e.g., the target is frequently occluded by obstacles. In this paper, we extend single-camera AOT to a multi-camera setting, where cameras track a target in a collaborative fashion. To achieve effective collaboration among cameras, we propose a novel Pose-Assisted Multi-Camera Collaboration System, which enables a camera to cooperate with the others by sharing camera poses for active object tracking. In the system, each camera is equipped with two controllers and a switcher: the vision-based controller tracks targets based on observed images; the pose-based controller moves the camera in accordance with the poses of the other cameras. At each step, the switcher decides which of the two controllers' actions to take according to the visibility of the target. The experimental results demonstrate that our system outperforms all the baselines and is capable of generalizing to unseen environments. The code and demo videos are available on our website https://sites.google.com/view/pose-assisted-collaboration.
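The two-controller-plus-switcher design described above can be sketched as follows. This is a simplified, rule-based stand-in rather than the authors' system: the `Camera` class, the string-valued actions, and the assumption that a boolean visibility flag is already available are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence

Action = str  # hypothetical discrete camera action, e.g. "turn_left"


@dataclass
class Camera:
    # Maps the camera's own image to an action (vision-based controller).
    vision_controller: Callable[[Any], Action]
    # Maps the other cameras' poses to an action (pose-based controller).
    pose_controller: Callable[[Sequence[Any]], Action]

    def step(self, image: Any, other_poses: Sequence[Any], target_visible: bool) -> Action:
        """Switcher: use vision when the target is visible, otherwise fall back to poses."""
        if target_visible:
            return self.vision_controller(image)
        return self.pose_controller(other_poses)


# Stub controllers that only report which branch was taken.
camera = Camera(
    vision_controller=lambda image: "follow_target",
    pose_controller=lambda poses: "turn_toward_peers",
)
visible_action = camera.step(image=None, other_poses=[], target_visible=True)
occluded_action = camera.step(image=None, other_poses=[], target_visible=False)
```

The point of the fallback branch is that an occluded camera can still move usefully by relying on where the other cameras are pointing, instead of acting on an image that no longer contains the target.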