A growing number of applications, e.g., video surveillance and medical image analysis, require training recognition systems from large amounts of weakly annotated data, while some targeted interactions with a domain expert are allowed to improve the training process. In such cases, active learning (AL) can reduce labeling costs for training a classifier by querying the expert for the labels of the most informative instances. This paper focuses on AL methods for instance classification problems in multiple instance learning (MIL), where data is arranged into sets, called bags, that are weakly labeled. Most AL methods focus on single instance learning problems. These methods are not suitable for MIL problems because they cannot account for the bag structure of the data. In this paper, new methods for bag-level aggregation of instance informativeness are proposed for multiple instance active learning (MIAL). The first, the aggregated informativeness method, identifies the most informative instances based on classifier uncertainty and queries the bags that carry the most information. The second, called cluster-based aggregative sampling, clusters data hierarchically in the instance space. The informativeness of instances is assessed by considering bag labels, inferred instance labels, and the proportion of labels that remain to be discovered in clusters. Both proposed methods significantly outperform reference methods in extensive experiments using benchmark data from several application domains. Results indicate that using an appropriate strategy to address MIAL problems yields a significant reduction in the number of queries needed to achieve the same level of performance as single instance AL methods.

Recent years have witnessed substantial advances in machine learning techniques that promise to address many complex large-scale problems that were previously thought intractable.
However, in many applications, annotating enough representative training data to train a recognition system is costly, and in such cases, one can resort to AL to reduce the annotation burden [1,2]. Moreover, several applications allow targeted interactions with human experts to be leveraged, as needed, to label informative data and drive the training process. AL has been used in various applications to reduce the cost of annotations, e.g., in medical image segmentation [3], text classification [4,5] and visual object detection [6].
Alternatively, the cost of annotations can be reduced through weakly supervised learning. Weakly supervised learning generalizes many learning paradigms, including semi-supervised learning and MIL, in partially observable environments or when learning from uncertain labels. With MIL, training instances are grouped in sets (commonly referred to as bags), and a label is provided only for an entire set, not for each individual instance. MIL has also been shown to efficiently reduce annotation costs in several