We present an approach that exploits hierarchical Recurrent Neural Networks (RNNs) to tackle the video captioning problem, i.e., generating one or multiple sentences to describe a realistic video. Our hierarchical framework contains a sentence generator and a paragraph generator. The sentence generator produces one simple short sentence that describes a specific short video interval. It exploits both temporal- and spatial-attention mechanisms to selectively focus on visual elements during generation. The paragraph generator captures the inter-sentence dependency by taking as input the sentential embedding produced by the sentence generator, combining it with the paragraph history, and outputting the new initial state for the sentence generator. We evaluate our approach on two large-scale benchmark datasets: YouTubeClips and TACoS-MultiLevel. The experiments demonstrate that our approach significantly outperforms the current state-of-the-art methods, with BLEU@4 scores of 0.499 and 0.305, respectively.
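To make the temporal-attention idea concrete, the following is a minimal sketch of how a sentence generator could weight video frames given its current hidden state. The abstract does not specify the scoring function, so the additive (tanh) scoring form, the function name `temporal_attention`, and the parameters `W_f`, `W_h`, and `w` are all assumptions chosen for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

def temporal_attention(frame_feats, hidden, W_f, W_h, w):
    """Hypothetical additive attention over video frames.

    frame_feats: (T, D) one feature vector per frame
    hidden:      (H,)   current recurrent hidden state
    W_f: (D, A), W_h: (H, A), w: (A,)  assumed learnable parameters
    Returns the attention weights (T,) and the attended context vector (D,).
    """
    # Score each frame against the hidden state (assumed scoring form).
    scores = np.tanh(frame_feats @ W_f + hidden @ W_h) @ w
    weights = softmax(scores)
    # Context is the attention-weighted average of the frame features.
    context = weights @ frame_feats
    return weights, context

# Usage with random placeholder values (not real video features).
rng = np.random.default_rng(0)
T, D, H, A = 5, 8, 4, 6
weights, context = temporal_attention(
    rng.normal(size=(T, D)), rng.normal(size=H),
    rng.normal(size=(D, A)), rng.normal(size=(H, A)), rng.normal(size=A),
)
```

The weights are non-negative and sum to one, so the context vector is a convex combination of frame features. A spatial-attention variant would apply the same pattern over image regions instead of frames.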
Recognizing human activities in partially observed videos is a challenging problem and has many practical applications. When the unobserved subsequence is at the end of the video, the problem is reduced to activity prediction from unfinished activity streaming, which has been studied by many researchers. However, in the general case, an unobserved subsequence may occur at any time by yielding a temporal gap in the video. In this paper, we propose a new method that can recognize human activities from partially observed videos in the general case. Specifically, we formulate the problem into a probabilistic framework: 1) dividing each activity into multiple ordered temporal segments, 2) using spatiotemporal features of the training video samples in each segment as bases and applying sparse coding (SC) to derive the activity likelihood of the test video sample at each segment, and 3) finally combining the likelihood at each segment to achieve a global posterior for the activities. We further extend the proposed method to include more bases that correspond to a mixture of segments with different temporal lengths (MSSC), which can better represent the activities with large intra-class variations. We evaluate the proposed methods (SC and MSSC) on various real videos. We also evaluate the proposed methods on two special cases: 1) activity prediction where the unobserved subsequence is at the end of the video, and 2) human activity recognition on fully observed videos. Experimental results show that the proposed methods outperform existing state-of-the-art comparison methods.
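Step 3 of the framework above, combining per-segment likelihoods into a global posterior, can be sketched as follows. The abstract does not state the combination rule, so this sketch assumes the observed segments are conditionally independent given the activity class (a product of likelihoods) and simply skips unobserved segments. The function name `activity_posterior` and the example numbers are hypothetical.

```python
import numpy as np

def activity_posterior(seg_likelihood, observed, prior=None):
    """Combine per-segment likelihoods into a class posterior.

    seg_likelihood: (n_classes, n_segments) likelihood of the test video
                    under each activity class at each temporal segment
    observed:       (n_segments,) boolean mask, False where the video has a gap
    prior:          (n_classes,) class prior; uniform if omitted
    Assumption: segments are independent given the class, so likelihoods multiply.
    """
    seg_likelihood = np.asarray(seg_likelihood, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    n_classes = seg_likelihood.shape[0]
    if prior is None:
        prior = np.full(n_classes, 1.0 / n_classes)
    # Work in log space to avoid underflow when many segments are multiplied.
    log_lik = np.log(seg_likelihood[:, observed] + 1e-12).sum(axis=1)
    log_post = np.log(prior) + log_lik
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()

# Two classes, three segments, with the middle segment unobserved (made-up values).
likelihoods = [[0.9, 0.1, 0.8],
               [0.2, 0.9, 0.3]]
posterior = activity_posterior(likelihoods, observed=[True, False, True])
```

Setting the mask to `False` only on the trailing segments corresponds to the activity-prediction special case, and an all-`True` mask corresponds to recognition on fully observed videos.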
Flexible power sources and efficient energy storage devices with high energy density are highly desired to power a future sustainable community. Theoretically, rechargeable metal−air batteries are promising candidates for the next-generation power sources. The rational design of oxygen reduction reaction (ORR) and oxygen evolution reaction (OER) catalysts with high catalytic activity is critical to the development of efficient and durable metal−air batteries. Herein, we propose a novel strategy to mass synthesize nonprecious transition-metal-based nitrogen/oxygen codoped carbon nanotubes (CNTs) grown on carbon-nanofiber films (MNO-CNT-CNFFs, M = Fe, Co, Ni) via a facile free-surface electrospinning technique followed by in situ growth carbonization. With a combination of the high catalytic activity of Fe-catalyzed CNTs and the efficient mass-transport characteristics of 3D carbon fiber films, the resultant flexible and robust FeNO-CNT-CNFFs exhibit the highest bifunctional oxygen catalytic activities in terms of a positive half-wave potential (0.87 V) for ORR and low overpotential (430 mV at 10 mA cm⁻²) for OER. As a proof of concept, newly designed hybrid Li−air batteries fabricated with FeNO-CNT-CNFFs as the air electrode present high voltage (∼3.4 V), low overpotential (0.15 V), and long cycle life (over 120 h) in practical open-air tests, demonstrating the superiority of the freestanding catalysts and their promising potential for applications in fuel cells and flexible energy storage devices.
Background: Hepatic encephalopathy is associated with altered gut microbiota. Proton pump inhibitors increase the risk of small bowel bacterial overgrowth. Objectives: This was a case-control study aimed at exploring the relationship of proton pump inhibitor use with the risk of hepatic encephalopathy during hospitalization in liver cirrhosis. Methods: Case and control groups were defined as cirrhotic patients who developed hepatic encephalopathy during hospitalization and those without hepatic encephalopathy at admission or during hospitalization, respectively. Age, gender, and Child-Pugh score were matched between the groups. Odds ratios with 95% confidence intervals were calculated to express the association of proton pump inhibitors with the risk of hepatic encephalopathy. Four subgroup analyses were performed after excluding patients with acute upper gastrointestinal bleeding, infections, and in-hospital death, and after matching model for end-stage liver disease score. Results: In the overall analysis, 128 patients were included in each group of cases and controls. The proportion of proton pump inhibitor use was significantly higher in the case group than the control group (79.7% vs 43%, p < 0.001). Proton pump inhibitor use (odds ratio = 3.481, 95% confidence interval: 1.651-7.340, p = 0.001) was independently associated with the development of hepatic encephalopathy in the multivariate analysis. In the four subgroup analyses, proton pump inhibitor use remained independently associated with the risk of hepatic encephalopathy. Conclusion: Proton pump inhibitor use might increase the risk of hepatic encephalopathy during hospitalization.
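As an illustration of how an odds ratio and its 95% confidence interval are computed from a 2×2 table, the sketch below uses the standard unadjusted formula with a Woolf (log-scale) interval. The cell counts are not reported in the abstract; they are back-calculated here from the stated percentages (79.7% and 43% of 128 patients per group) and should be treated as an assumption. The result is a crude, unmatched estimate and is not expected to reproduce the adjusted multivariate odds ratio of 3.481 reported above.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-scale) confidence interval.

    a: exposed cases,    b: unexposed cases
    c: exposed controls, d: unexposed controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Assumed counts inferred from the reported proportions, not taken from the paper:
# 102 of 128 cases (about 79.7%) and 55 of 128 controls (about 43%) used PPIs.
crude_or, ci_low, ci_high = odds_ratio_ci(a=102, b=26, c=55, d=73)
```

The crude estimate ignores the matching on age, gender, and Child-Pugh score and the covariates in the multivariate model, which is why it can differ from the adjusted figure.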
Herein, fifteen new compounds containing coumarin, 1,2,3-triazole and benzoyl-substituted arylamine moieties were designed, synthesized and tested in vitro for their anticancer activity. The results showed that all tested compounds had moderate antiproliferative activity against MDA-MB-231, a human breast cancer cell line, under both normoxic and hypoxic conditions. Furthermore, the 4-substituted coumarin linked with benzoyl 3,4-dimethoxyaniline through 1,2,3-triazole (compound 5e) displayed the most prominent antiproliferative activity, with an IC50 value of 0.03 μM, about 5000 times stronger than 4-hydroxycoumarin (IC50 > 100 μM) and 20 times stronger than doxorubicin (IC50 = 0.60 μM). Meanwhile, almost all compounds showed generally enhanced proliferation-inhibiting activity under hypoxia compared with normoxia. A docking analysis showed that compound 5e had the potential to inhibit carbonic anhydrase IX (CA IX).
We present an approach to simultaneously reasoning about a video clip and an entire natural-language sentence. The compositional nature of language is exploited to construct models which represent the meanings of entire sentences composed out of the meanings of the words in those sentences mediated by a grammar that encodes the predicate-argument relations. We demonstrate that these models faithfully represent the meanings of sentences and are sensitive to how the roles played by participants (nouns), their characteristics (adjectives), the actions performed (verbs), the manner of such actions (adverbs), and changing spatial relations between participants (prepositions) affect the meaning of a sentence and how it is grounded in video. We exploit this methodology in three ways. In the first, a video clip along with a sentence are taken as input and the participants in the event described by the sentence are highlighted, even when the clip depicts multiple similar simultaneous events. In the second, a video clip is taken as input without a sentence and a sentence is generated that describes an event in that clip. In the third, a corpus of video clips is paired with sentences which describe some of the events in those clips and the meanings of the words in those sentences are learned. We learn these meanings without needing to specify which attribute of the video clips each word in a given sentence refers to. The learned meaning representations are shown to be intelligible to humans.