Optimizing a deep neural network is a fundamental task in computer vision, yet direct training methods often suffer from over-fitting. Teacher-student optimization aims to provide complementary cues from a previously trained model, but these approaches are often considerably slower because several generations must be trained in sequence, i.e., training time increases several-fold. This paper presents snapshot distillation (SD), the first framework which enables teacher-student optimization in one generation. The idea of SD is very simple: instead of borrowing supervision signals from previous generations, we extract such information from earlier epochs in the same generation, while making sure that the difference between teacher and student is sufficiently large so as to prevent under-fitting. To achieve this goal, we implement SD with a cyclic learning rate policy, in which the last snapshot of each cycle is used as the teacher for all iterations in the next cycle, and the teacher signal is smoothed to provide richer information. On standard image classification benchmarks such as CIFAR100 and ILSVRC2012, SD achieves consistent accuracy gains without heavy computational overhead. We also verify that models pre-trained with SD transfer well to object detection and semantic segmentation on the PascalVOC dataset.
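The cyclic scheme described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the schedule or the smoothing function, so everything here is an assumption made for illustration: a cosine-shaped cycle, temperature-scaled softmax as the smoothing, and a KL term as the teacher signal. The names `cosine_cyclic_lr`, `teacher_cycle`, and `sd_loss` and all parameter values are hypothetical.

```python
import math


def cosine_cyclic_lr(iteration, cycle_len, base_lr=0.1):
    """Assumed schedule: restart at base_lr each cycle, decay along a cosine."""
    pos = iteration % cycle_len
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * pos / cycle_len))


def teacher_cycle(iteration, cycle_len):
    """Cycle whose last snapshot teaches this iteration (None in the first cycle)."""
    cycle = iteration // cycle_len
    return cycle - 1 if cycle > 0 else None


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; temperature > 1 flattens the distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sd_loss(student_logits, label, teacher_logits=None, temperature=2.0, weight=1.0):
    """Cross-entropy on the label plus an assumed KL term toward the smoothed teacher."""
    p = softmax(student_logits)
    loss = -math.log(p[label])
    if teacher_logits is not None:
        q = softmax(teacher_logits, temperature)  # smoothed teacher signal
        loss += weight * sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)
    return loss
```

In this sketch the first cycle has no teacher and trains on labels alone; from the second cycle onward every iteration adds the teacher term computed from the previous cycle's final snapshot.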
We focus on the problem of training a deep neural network in generations. The pipeline is as follows: to optimize the target network (student), another network (teacher) with the same architecture is trained first and then used to provide part of the supervision signal in the next stage. While this strategy leads to higher accuracy, many aspects (e.g., why teacher-student optimization helps) still need further exploration. This paper studies the problem from the perspective of controlling the strictness in training the teacher network. Existing approaches mostly used a hard distribution (e.g., one-hot vectors) in training, leading to a strict teacher which itself has a high accuracy, but we argue that the teacher needs to be more tolerant, although this often implies a lower accuracy. The implementation is very easy, with merely an extra loss term added to the teacher network, encouraging a few secondary classes to emerge and complement the primary class. Consequently, the teacher provides a milder supervision signal (a less peaked distribution), which makes it possible for the student to learn from inter-class similarity and potentially lowers the risk of over-fitting. Experiments are performed on standard image classification tasks (CIFAR100 and ILSVRC2012). Although the teacher network is less powerful, the students show persistent growth in ability and eventually achieve higher classification accuracies than other competitors. Model ensemble and transfer feature extraction also verify the effectiveness of our approach.
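To make the idea of a tolerant teacher concrete, the following is a rough sketch of what an extra loss term of this kind could look like. The abstract does not define the term, so the form used here is an assumption: a penalty on the gap between the highest predicted confidence and the mean of the next `k - 1` confidences, which shrinks when secondary classes receive more probability mass. The names `confidence_gap` and `tolerant_teacher_loss` and the values of `k` and `lam` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_gap(probs, k=3):
    """Top confidence minus the mean of the next k - 1 confidences (assumed form)."""
    top = sorted(probs, reverse=True)[:k]
    return top[0] - sum(top[1:]) / (k - 1)


def tolerant_teacher_loss(logits, label, k=3, lam=0.5):
    """Cross-entropy plus lam times the confidence gap; lam = 0 gives a strict teacher."""
    probs = softmax(logits)
    cross_entropy = -math.log(probs[label])
    return cross_entropy + lam * confidence_gap(probs, k)
```

Under this assumed form, a sharply peaked output such as `softmax([10, 0, 0, 0])` pays a larger penalty than a milder one such as `softmax([3, 2, 2, 0])`, so minimizing the loss pushes the teacher toward a less peaked distribution.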
Endodontic treatment is performed to treat the inflamed or infected root canal system of an involved tooth. It is estimated that 22.3 million endodontic procedures are performed annually in the USA. Preparing a proper access cavity before cleaning/shaping (instrumentation) of the root canal system is among the most important steps to achieve a successful treatment outcome. However, accidents such as perforation, gouging, ledge formation and canal transportation may occur during the procedure because of an improper or incomplete access cavity design. To reduce or prevent these errors in root canal treatments, this Letter introduces an assistive augmented reality (AR) technology on a head-mounted display (HMD). The proposed system provides audiovisual warning and correction in situ on the optical see-through HMD to assist dentists in preparing the access cavity. The clinician interacts with the system via voice commands, which allows bi-manual operation. In addition, the dentist is able to review tooth radiographs during the procedure without needing to divert attention away from the patient to look at a separate monitor. Experiments are performed to evaluate the accuracy of the measurements. To the best of the authors' knowledge, this is the first time that an HMD-based AR prototype is introduced for an endodontic procedure.
Lithium-ion batteries (LiBs) are the most important part of electric vehicle (EV) systems. Because LiBs degrade at two different rates over their lifetime, many two-phase models have been proposed for them. However, most of these methods do not consider the randomness of the change point in the two-phase model and cannot update the change time in real time. Therefore, this paper proposes a method based on the combination of the two-phase Wiener model and an extreme learning machine (ELM). The two-phase Wiener model is used to derive the mathematical expression of the remaining useful life (RUL), and the ELM is implemented to adaptively detect the change point. Based on the Poisson distribution, the distribution of the change time is derived as a gamma distribution. To evaluate the theoretical results and practicality of the proposed method, we perform both numerical and practical simulations. The results of the simulations show that, owing to the precise and adaptive detection of change points, the proposed method produces a more accurate RUL prediction than existing methods. The error of our method for detecting the change point is about 4%, and the mean prediction error of RUL in the second phase is improved from 4.39 cycles to 1.61 cycles.
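As a rough illustration of the two-phase Wiener model mentioned above, the sketch below simulates a degradation path whose drift and diffusion switch at a change time, and computes the mean RUL once the process is in the second phase. It relies on a standard result: for a Wiener process with positive drift, the first time it reaches a fixed threshold follows an inverse Gaussian distribution whose mean is the remaining distance divided by the drift. The abstract's own RUL expression and the ELM-based change point detector are not reproduced here; all function names and parameter values are hypothetical, and drawing the change time with `random.gammavariate` only mirrors the statement that the change time is gamma distributed.

```python
import math
import random


def simulate_two_phase_wiener(n_steps, dt, change_time, mu1, mu2,
                              sigma1, sigma2, x0=0.0, rng=None):
    """Euler simulation of a Wiener degradation path whose parameters switch at change_time."""
    rng = rng or random.Random(0)
    path = [x0]
    for i in range(n_steps):
        t = i * dt
        # Phase 1 parameters before the change time, phase 2 parameters afterwards.
        mu, sigma = (mu1, sigma1) if t < change_time else (mu2, sigma2)
        increment = mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(path[-1] + increment)
    return path


def mean_rul_second_phase(x_now, threshold, mu2):
    """Mean first-passage time to threshold for a Wiener process with drift mu2 > 0."""
    if mu2 <= 0:
        raise ValueError("mean RUL is finite only for positive drift")
    return max(threshold - x_now, 0.0) / mu2


def sample_change_time(shape, scale, rng=None):
    """Draw a random change time from a gamma distribution (shape and scale are hypothetical)."""
    rng = rng or random.Random(0)
    return rng.gammavariate(shape, scale)
```

With both diffusion coefficients set to zero the path is deterministic, which makes the switch easy to check: ten unit steps with a change time of 5, `mu1 = 1` and `mu2 = 2` end at a degradation level of 15, and a failure threshold of 25 then gives a mean RUL of 5 time units.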