Figure 1: Left: the framework of knowledge distillation (KD). KD introduces an extra distillation loss that transfers knowledge from the teacher model. Middle: a conceptual sketch of flat and sharp local minima [22, 13]. The Y-axis denotes the loss value, and the X-axis the network parameters. The high sensitivity of the training loss at sharp minima harms generalization on test data. In this paper, we find that KD gives the student baseline (CE) flatter minima but unexpectedly limits convergence. See Figures 2 and 3. Right: the task and distillation loss dynamics, which suggest that introducing KD brings about a trade-off between the task and distillation losses. See Figure 3. To address this trade-off and achieve better performance, we propose Distillation-Oriented Trainer (DOT). DOT breaks the trade-off and leads the student to ideal minima with both good flatness and good convergence.
Can Dual-Sequence Frequency Hopping (DSFH), a military emergency communication mode, work under strong color noise? And does stochastic resonance (SR) processing improve detection of the DSFH signal under color noise? To address these questions, we analyze the physical features of the DSFH signal. First, the signal models for transmission, reception and the intermediate frequency (IF) are constructed, and a scale transformation is used to adjust the IF signal to fit the SR system. Second, the non-Markovian Langevin equation (LE) is transformed into a Markovian one by expanding the 1-D LE to a 2-D one. Third, the non-autonomous Fokker-Planck equation (FPE) is transformed into an autonomous one by assuming that the SR transition of magnetic particles is instantaneous and by introducing the decision time. This yields the analytical periodic steady-state solution of the probability density function (PDF), parameterized by the correlation time of the color noise. Finally, the detection probability, false alarm probability and Receiver Operating Characteristic (ROC) curve are obtained under the maximum a posteriori (MAP) criterion. Theoretical and simulation results show that: 1) whether DSFH can work under strong color noise is decided by the correlation time of the color noise; 2) when the power intensity of the color noise is constant, the smaller the correlation time and the larger the local SNR, the greater the difference between the PDFs of the SR output under the two hypotheses, leading to better detection performance.

... by the periodic steady-state solution of the FPE via introducing the decision time. In Sec. 4, the numerical simulations verify the theory. The last section draws some conclusions.

The System Model of the DSFH

The Transmitted Signal of the DSFH

In the DSFH mode, the transmitted symbol selects the communication channel from dual carriers controlled by PN sequences, as shown in Fig. 1. Channels 0 and 1 use the carriers f_{0,n} and f_{1,n}, controlled by the PN sequences FS_0 and FS_1, respectively. At time t, the sine carrier s_0(t) with frequency f_{0,n} is transmitted if the transmitted symbol is 0. Otherwise, the sine carrier s_1(t), representing symbol 1, is transmitted. The final transmitted DSFH signal s(t) is the combination of s_0(t) and s_1(t) after the channel switch.
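The symbol-controlled carrier selection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `dsfh_signal`, the one-symbol-per-hop timing, the unit amplitude, the zero initial phase, and the example frequency lists are all assumptions made for illustration. The lists `fs0` and `fs1` stand in for the carrier frequencies f_{0,n} and f_{1,n} that the PN sequences FS_0 and FS_1 would produce.

```python
import math

def dsfh_signal(symbols, fs0, fs1, hop_duration, sample_rate):
    """Sketch of a DSFH transmitter (assumed: one symbol per hop, unit amplitude).

    symbols      -- iterable of 0/1 data symbols
    fs0, fs1     -- hypothetical hop-frequency lists for channels 0 and 1 (Hz)
    hop_duration -- duration of one hop in seconds
    sample_rate  -- sampling rate in Hz
    """
    samples_per_hop = int(round(hop_duration * sample_rate))
    out = []
    for n, sym in enumerate(symbols):
        # Channel switch: symbol 0 selects f_{0,n}, symbol 1 selects f_{1,n}.
        freq = fs0[n % len(fs0)] if sym == 0 else fs1[n % len(fs1)]
        for k in range(samples_per_hop):
            t = (n * samples_per_hop + k) / sample_rate
            out.append(math.sin(2.0 * math.pi * freq * t))
    return out

# Example with made-up frequencies: two hops, symbols 0 then 1.
signal = dsfh_signal([0, 1], fs0=[1000.0, 3000.0], fs1=[2000.0, 4000.0],
                     hop_duration=0.001, sample_rate=16000.0)
```

Only one of the two carriers is active in each hop, so a receiver that knows both hopping sequences can decide the symbol by checking which channel carries energy. The sketch does not enforce phase continuity between hops.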
Aiming at the reception of sine-wave intermediate-frequency signals in radio and communication systems at extremely low signal-to-noise ratio (SNR), a quadratic polynomial receiving scheme for sine signals enhanced by stochastic resonance (SR) is proposed. By analyzing the mechanism by which SR enhances sine signals and introducing the decision time, the non-autonomous Fokker-Planck equation (FPE) is converted into an autonomous equation, and its analytic periodic steady-state solution with time parameters is obtained. Based on the probability density function of the particle at the SR output, a quadratic polynomial receiving scheme is proposed by analyzing the features of the energy detector and the matched filter receiver. By maximizing the deflection coefficient, the binomial coefficients and the test statistic are obtained. To further reduce the bit error rate, the scheme is combined with the idea of "the average of N samples", using a Gaussian approximation of the test statistic for large N. The conclusion is as follows: when N is 500 and the SNR is greater than –17 dB, the bit error rate is less than 2.2 × 10^–2, under the constraint of the parameters of the optimally matched SR.
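The deflection coefficient and the N-sample averaging mentioned above can be sketched as follows. This is a hedged illustration, not the paper's derivation: it uses the standard definition d^2 = (E[T|H1] - E[T|H0])^2 / Var[T|H0], assumes a test statistic of the form T(x) = a·x^2 + b·x, and estimates moments from sample lists rather than from the SR output PDF. The function names and the coefficients `a` and `b` are hypothetical.

```python
import statistics

def quadratic_statistic(x, a, b):
    """Assumed quadratic polynomial statistic T(x) = a*x^2 + b*x."""
    return a * x * x + b * x

def averaged_statistic(samples, a, b):
    """Average of T(x) over N samples, following the 'average of N samples' idea."""
    return statistics.fmean([quadratic_statistic(x, a, b) for x in samples])

def deflection(t_h0, t_h1):
    """Empirical deflection d^2 = (E[T|H1] - E[T|H0])^2 / Var[T|H0].

    t_h0, t_h1 -- lists of statistic values observed under H0 and H1.
    Requires a nonzero variance under H0.
    """
    m0 = statistics.fmean(t_h0)
    m1 = statistics.fmean(t_h1)
    return (m1 - m0) ** 2 / statistics.pvariance(t_h0)
```

With independent samples, averaging N values of T leaves the means unchanged and divides the variance by N, so the deflection grows in proportion to N. This is consistent with the lower bit error rate reported for large N.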