We devise a cascade GAN approach to generate talking face video, which is robust to different face shapes, view angles, facial characteristics, and noisy audio conditions. Instead of learning a direct mapping from audio to video frames, we propose first to transfer audio to high-level structure, i.e., the facial landmarks, and then to generate video frames conditioned on the landmarks. Compared to a direct audio-to-image approach, our cascade approach avoids fitting spurious correlations between audiovisual signals that are irrelevant to the speech content. Humans are sensitive to temporal discontinuities and subtle artifacts in video. To avoid those pixel jittering problems and to force the network to focus on audiovisual-correlated regions, we propose a novel dynamically adjustable pixel-wise loss with an attention mechanism. Furthermore, to generate a sharper image with well-synchronized facial movements, we propose a novel regression-based discriminator structure, which considers sequence-level information along with frame-level information. Thoughtful experiments on several datasets and real-world samples demonstrate significantly better results obtained by our method than the state-of-the-art methods in both quantitative and qualitative comparisons.
We have performed first-principles density functional theory calculations to investigate how a subsurface transition metal M (M = Ni, Co, or Fe) affects the energetics and mechanisms of the oxygen reduction reaction (ORR) on the outermost Pt mono-surface layer of Pt/M(111) surfaces. In this work, we found that the subsurface Ni, Co, and Fe could down-shift the d-band center of the Pt surface layer and thus weaken the binding of chemical species to the Pt/M(111) surface. Moreover, the subsurface Ni, Co, and Fe could modify the heat of reaction and activation energy of various elementary reactions of ORR on these Pt/M(111) surfaces. Our DFT results revealed that, due to the influence of the subsurface Ni, Co, and Fe, ORR would adopt a hydrogen peroxide dissociation mechanism with an activation energy of 0.15 eV on Pt/Ni(111), 0.17 eV on Pt/Co(111), and 0.16 eV on Pt/Fe(111) surfaces, respectively, for their rate-determining O2 protonation reaction. In contrast, ORR would follow a peroxyl dissociation mechanism on a pure Pt(111) surface with an activation energy of 0.79 eV for its rate-determining O protonation reaction. Thus, our theoretical study explained why the subsurface Ni, Co, and Fe could lead to multi-fold enhancement in catalytic activity for ORR on the Pt mono-surface layer of Pt/M(111) surfaces.
Improving the efficiency of electrocatalytic reduction of oxygen represents one of the main challenges for the development of renewable energy technologies. Here, we report the systematic evaluation of Pt-ternary alloys (Pt3(MN)1 with M, N = Fe, Co, or Ni) as electrocatalysts for the oxygen reduction reaction (ORR). We first studied the ternary systems on extended surfaces of polycrystalline thin films to establish the trend of electrocatalytic activities and then applied this knowledge to synthesize ternary alloy nanocatalysts by a solvothermal approach. This study demonstrates that the ternary alloy catalysts can be compelling systems for further advancement of ORR electrocatalysis, reaching higher catalytic activities than bimetallic Pt alloys and improvement factors of up to 4 versus monometallic Pt.
Cross-modal audio-visual perception has been a long-lasting topic in psychology and neurology, and various studies have discovered strong correlations in human perception of auditory and visual stimuli. Despite works in computational multimodal modeling, the problem of cross-modal audio-visual generation has not been systematically studied in the literature. In this paper, we make the first attempt to solve this cross-modal generation problem leveraging the power of deep generative adversarial training. Specifically, we use conditional generative adversarial networks to achieve cross-modal audio-visual generation of musical performances. We explore different encoding methods for audio and visual signals, and work on two scenarios: instrument-oriented generation and pose-oriented generation. Being the first to explore this new problem, we compose two new datasets with pairs of images and sounds of musical performances of different instruments. Our experiments using both classification and human evaluations demonstrate that our model has the ability to generate one modality, i.e., audio/visual, from the other modality, i.e., visual/audio, to a good extent. Our experiments on various design choices along with the datasets will facilitate future research in this new problem space.
In this study, we calculated the reaction energetics (including surface adsorption energy, heat of reaction, and activation energy) of oxygen reduction reactions (ORR) on Pt(100) and Pt/Ni(100) surfaces using first-principles density functional theory methods. Our calculation results suggest that, on the Pt(100) and Pt/Ni(100) surfaces, the ORR would proceed following a direct oxygen dissociation mechanism in which the rate-determining step is the OH hydrogenation reaction. Furthermore, we compared the calculated reaction energies of the ORR on the Pt(100), Pt(111), Pt/Ni(100), and Pt/Ni(111) surfaces. Our results indicated that the subsurface Ni atoms would weaken the strength of various ORR chemical intermediates binding to the outermost Pt monolayers and further cause an increase in the heats of reaction for all the O−O bond dissociation reactions but a decrease in the heats of reaction for all the hydrogenation reactions of the ORR on the Pt/Ni surfaces as compared to the pure Pt surfaces. However, we found that the extent of such ligand effect was more pronounced on the (111) surfaces than the (100) surfaces. Moreover, we determined the activation energy for the rate-determining step of the ORR on the Pt(100) to be 0.80 eV, on the Pt/Ni(100) to be 0.79 eV, on the Pt(111) to be 0.79 eV, and on the Pt/Ni(111) to be 0.15 eV. Consequently, our study predicted that the catalytic activity for the ORR should be higher on the (111) surfaces than the (100) surfaces and would be much higher on the Pt/Ni(111) than all the other three surfaces. These theoretical predictions agree well with the trend of the ORR catalytic activity observed in previous experimental measurements.
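To make the ordering implied by these four barriers concrete, the sketch below compares Arrhenius-type Boltzmann factors relative to Pt(100). It is an illustration only, not the authors' method: it assumes identical pre-exponential factors on all surfaces, a temperature of 298.15 K, and that the rate-determining barrier alone controls the rate. The function name `boltzmann_factor_ratio` is hypothetical. Because of these simplifications, the resulting numbers indicate a ranking and should not be read as predicted activity enhancements.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def boltzmann_factor_ratio(ea_ref, ea, temperature=298.15):
    """Ratio exp(-ea/kT) / exp(-ea_ref/kT), assuming equal prefactors (an assumption)."""
    return math.exp((ea_ref - ea) / (K_B_EV * temperature))


# Activation energies (eV) of the rate-determining step, as quoted in the abstract.
barriers = {
    "Pt(100)": 0.80,
    "Pt/Ni(100)": 0.79,
    "Pt(111)": 0.79,
    "Pt/Ni(111)": 0.15,
}

# Relative Boltzmann factors with Pt(100) as the reference surface.
ratios = {
    surface: boltzmann_factor_ratio(barriers["Pt(100)"], ea)
    for surface, ea in barriers.items()
}

for surface, ratio in ratios.items():
    print(f"{surface:>11s}: {ratio:.3e}")
```

Under these assumptions a 0.01 eV difference changes the factor by less than a factor of two, while the 0.65 eV drop on Pt/Ni(111) changes it by many orders of magnitude, which is consistent with the abstract singling out Pt/Ni(111).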
Density functional theory is used to determine the reaction mechanisms of CO oxidation and the active oxygen species on a Au/TiO2 model catalyst. The model consists of a Au rod supported along the TiO2 [11̅0] direction of the TiO2(110) surface. An interfacial Au/Ti5c site at the interface boundary is identified to be particularly active toward O2 adsorption and dissociation. At this site, O2 dissociation has an energy barrier of 0.5 eV, which is facile at room temperature. The resulting adsorbed Au/O/Ti5c oxygen species are shown to be stable and active for CO oxidation under relevant reaction conditions with an activation energy of 0.24 eV. Furthermore, the adsorbed Au/O/Ti5c oxygen species functions as an electron reservoir, and it lowers the oxygen vacancy formation energy of a surface lattice oxygen (Obri), as well as the Ti interstitial formation energy, due to electron transfer from high-energy defect states to low-energy p-states of the adsorbed Au/O/Ti5c oxygen species. Hence, the Obri species is activated at the oxidized Au/TiO2 interface boundary and the energy barrier of CO oxidation with Obri is calculated to be 0.55 eV. Thus, the CO oxidation reaction can proceed at room temperature either via a Langmuir–Hinshelwood mechanism with an adsorbed Au/O/Ti5c oxygen species or via a Au-assisted Mars–van Krevelen mechanism with Obri.
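The statement that a 0.5 eV barrier is facile at room temperature can be checked with a rough transition-state-theory estimate. The sketch below is a back-of-the-envelope illustration, not the authors' analysis: it assumes the Eyring form k = (k_B T / h) exp(-E_a / k_B T), treats each quoted DFT barrier as the effective free-energy barrier, uses a transmission coefficient of 1, and takes T = 298.15 K. The function name `eyring_rate` is hypothetical.

```python
import math

K_B_EV = 8.617333262e-5     # Boltzmann constant in eV/K
H_EV_S = 4.135667696e-15    # Planck constant in eV*s


def eyring_rate(ea, temperature=298.15):
    """Rate constant in 1/s from k = (kT/h) * exp(-ea/kT); entropy effects ignored (assumption)."""
    kt = K_B_EV * temperature
    return (kt / H_EV_S) * math.exp(-ea / kt)


# Barriers (eV) quoted in the abstract; the dictionary keys are descriptive labels only.
steps = {
    "O2 dissociation at Au/Ti5c": 0.50,
    "CO oxidation with adsorbed O": 0.24,
    "CO oxidation with Obri": 0.55,
}

rates = {label: eyring_rate(ea) for label, ea in steps.items()}

for label, k in rates.items():
    print(f"{label:>30s}: {k:.2e} 1/s")
```

With these assumptions all three steps come out at thousands of events per second or faster per site at room temperature, which is in line with the abstract's description of the barriers as accessible.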
The relationship between experiment and theory in electrocatalysis is one of profound importance. Until fairly recently, the principal role of theory in this field was interpreting experimental results. Over the course of the past decade (roughly the period covered by this review), however, that has begun to change, with theory now frequently leading the design of electrocatalytic materials. Though rewarding, this has not been a particularly easy union. For one thing, experimentalists and theorists have to come to grips with the fact that they rely on different models. Theorists make predictions based on individual, perfect structural models, while experimentalists work with more complex and heterogeneous ensembles of electrocatalysts. As discussed in this review, computational capabilities have improved in recent years, so that theory is better represented by the structures that experimentalists are able to prepare. Likewise, synthetic chemists are able to make ever more complex electrocatalysts with high levels of control, which provide a more extensive palette of materials for testing theory. The goal of this review is to highlight research from the last ∼10 years that focuses on carefully controlled electrocatalytic experiments which, in combination with theoretical predictions, bring us closer to bridging the gap between real catalysts and computational models.