Most active learning approaches select either informative or representative unlabeled instances to query their labels. Although several active learning algorithms have been proposed to combine the two criteria for query selection, they are usually ad hoc in finding unlabeled instances that are both informative and representative. We address this challenge by a principled approach, termed QUIRE, based on the min-max view of active learning. The proposed approach provides a systematic way for measuring and combining the informativeness and representativeness of an instance. Extensive experimental results show that the proposed QUIRE approach outperforms several state-of-the-art active learning approaches.
In this paper, we propose the MIML (Multi-Instance Multi-Label learning) framework where an example is described by multiple instances and associated with multiple class labels. Compared to traditional learning frameworks, the MIML framework is more convenient and natural for representing complicated objects which have multiple semantic meanings. To learn from MIML examples, we propose the MimlBoost and MimlSvm algorithms based on a simple degeneration strategy, and experiments show that solving problems involving complicated objects with multiple semantic meanings in the MIML framework can lead to good performance. Considering that the degeneration process may lose information, we propose the D-MimlSvm algorithm which tackles MIML problems directly in a regularization framework. Moreover, we show that even when we do not have access to the real objects and thus cannot capture more information from real objects by using the MIML representation, MIML is still useful. We propose the InsDif and SubCod algorithms. InsDif works by transforming single-instances into the MIML representation for learning, while SubCod works by transforming single-label examples into the MIML representation for learning. Experiments show that in some tasks they are able to achieve better performance than learning the single-instances or single-label examples directly.
In real-world recognition/classification tasks, limited by various objective factors, it is usually difficult to collect training samples that cover all classes when training a recognizer or classifier. A more realistic scenario is open set recognition (OSR), where incomplete knowledge of the world exists at training time, and unknown classes can be submitted to an algorithm during testing, requiring the classifiers to not only accurately classify the seen classes, but also effectively deal with the unseen ones. This paper provides a comprehensive survey of existing open set recognition techniques covering various aspects ranging from related definitions, representations of models, datasets, evaluation criteria, and algorithm comparisons. Furthermore, we briefly analyze the relationships between OSR and its related tasks including zero-shot, one-shot (few-shot) recognition/learning techniques, classification with reject option, and so forth. Additionally, we give an overview of open world recognition, which can be seen as a natural extension of OSR. Importantly, we highlight the limitations of existing approaches and point out some promising subsequent research directions in this field.
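As a rough illustration of the "classification with reject option" idea mentioned above, the following minimal sketch thresholds the top softmax probability and returns a reject label when the classifier is not confident. This is a generic baseline written for illustration, not a method taken from the survey; the function names, the threshold value of 0.8, and the use of -1 as the "unknown" label are all assumptions.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_open_set(scores, threshold=0.8):
    """Return the index of the most probable known class, or -1 (unknown)
    when the top probability falls below the threshold.

    The threshold and the -1 convention are illustrative assumptions.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else -1

# One clearly dominant score: accepted as a known class.
confident = predict_open_set([5.0, 0.1, 0.2])
# Nearly uniform scores: rejected as unknown.
ambiguous = predict_open_set([1.0, 1.1, 0.9])
```

A fixed threshold on softmax confidence is only the simplest possible rejection rule; it is shown here to make the seen-versus-unseen decision concrete.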
High‐silica zeolite Y (FAU) plays a vital role in (petro)chemical industries. However, the slow nucleation and growth kinetics of the high‐silica FAU framework limit its direct synthesis and the improvement of framework SiO2/Al2O3 ratio (SAR). Here, a facile strategy is developed to realize the fast crystallization of high‐silica zeolite Y, which involves the combination of high crystallization temperature, ultra‐stable Y (USY) seeds and efficient organic‐structure directing agent (OSDA). The synthesis can be finished in 5–16 h at 160 °C and with tunable SAR up to 18.2, and the key factors affecting crystallization kinetics and phase purity are elucidated. Moreover, the crystallization process was monitored to reveal the fast crystal growth mechanism. The high‐silica products possess high (hydro)thermal stability and abundant strong acid sites, which endow them with excellent catalytic cracking performance, clearly superior to that of commercial USY.
One-step transformation of isobutyl alcohol to aromatics (benzene, toluene, and xylene) has been studied in a gas-phase, fixed-bed reactor system over several purely acidic zeolites and zeolite-supported metal catalysts. ZSM-5 zeolites give higher aromatics yields (∼42 wt %) than the other evaluated zeolites, and the Si/Al ratios (Si/Al = 13−43) of ZSM-5 slightly influence their catalytic performances. During the transformation of isobutyl alcohol, large amounts of short alkanes (mainly propane and butane isomers) are also generated on the acidic ZSM-5. To improve the conversion to aromatics, several metal species (Zn, Ga, Mo, La, Ni, Ag, and Pt) are supported on the ZSM-5. Enhancements in aromatics yields (∼60 wt %) are observed only on the Zn/ZSM-5 catalysts. The incorporation of Zn species preferentially decreases the strong Brønsted acidity and, thus, suppresses the cracking to C3 fragments. Moreover, mainly the Zn species at the exchange sites facilitate the recombinative desorption of H2 and, hence, enhance the reactions toward aromatics. Through these effects, Zn/ZSM-5 catalysts markedly promote the formation of toluene and xylene and inhibit the generation of undesired alkane products.
In many real-world tasks, particularly those involving data objects with complicated semantics such as images and texts, one object can be represented by multiple instances and simultaneously be associated with multiple labels. Such tasks can be formulated as multi-instance multi-label learning (MIML) problems, and have been extensively studied during the past few years. Existing MIML approaches have been found useful in many applications; however, most of them can only handle moderate-sized data. To efficiently handle large data sets, in this paper we propose the MIMLfast approach, which first constructs a low-dimensional subspace shared by all labels, and then trains label-specific linear models to optimize approximated ranking loss via stochastic gradient descent. Although the MIML problem is complicated, MIMLfast is able to achieve excellent performance by exploiting label relations with shared space and discovering sub-concepts for complicated labels. Experiments show that the performance of MIMLfast is highly competitive with state-of-the-art techniques, whereas its time cost is much less. Moreover, our approach is able to identify the most representative instance for each label, thus providing a chance to understand the relation between input patterns and output label semantics.
In traditional active learning, there is only one labeler that always returns the ground truth of queried labels. However, in many applications, multiple labelers are available to offer diverse qualities of labeling with different costs. In this paper, we perform active selection on both instances and labelers, aiming to improve the classification model most with the lowest cost. While the cost of a labeler is proportional to its overall labeling quality, we also observe that different labelers usually have diverse expertise, and thus it is likely that labelers with a low overall quality can provide accurate labels on some specific instances. Based on this fact, we propose a novel active selection criterion to evaluate the cost-effectiveness of instance-labeler pairs, which ensures that the selected instance is helpful for improving the classification model, and meanwhile the selected labeler can provide an accurate label for the instance with a relatively low cost. Experiments on both UCI and real crowdsourcing data sets demonstrate the superiority of our proposed approach in selecting cost-effective queries.
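To make the notion of a cost-effective instance-labeler pair concrete, here is a minimal sketch that scores every pair and picks the best one. The scoring rule (usefulness times estimated accuracy, divided by cost) is an assumption chosen for illustration and is not the criterion proposed in the paper; the function name `select_pair` and all numbers are hypothetical.

```python
def select_pair(usefulness, accuracy, cost):
    """Pick the (instance, labeler) pair with the highest illustrative score.

    usefulness[i]  : how much labeling instance i is expected to help the model
    accuracy[i][j] : estimated chance that labeler j labels instance i correctly
    cost[j]        : price charged by labeler j per query
    The score u * a / c is an assumed stand-in, not the paper's criterion.
    """
    best_pair, best_score = None, float("-inf")
    for i, u in enumerate(usefulness):
        for j, c in enumerate(cost):
            score = u * accuracy[i][j] / c
            if score > best_score:
                best_pair, best_score = (i, j), score
    return best_pair, best_score

# Hypothetical data: labeler 0 is an expensive expert, labeler 1 is cheap
# but happens to be accurate on instance 2.
usefulness = [0.9, 0.2, 0.6]
accuracy = [[0.95, 0.60],
            [0.90, 0.85],
            [0.92, 0.95]]
cost = [5.0, 1.0]

pair, score = select_pair(usefulness, accuracy, cost)
```

With these numbers the cheap labeler is chosen for an instance on which it is accurate, which mirrors the observation that a labeler with low overall quality can still be the most cost-effective choice for specific instances.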