Detecting text and captions in videos is important and in great demand for video retrieval, annotation, indexing, and content analysis. In this paper, we present a corner-based approach to detecting text and captions in videos. This approach is inspired by the observation that corner points occur densely and in an orderly pattern in characters, especially in text and captions. We use several discriminative features to describe the text regions formed by the corner points. These features can be used flexibly and can therefore be adapted to different applications. Language independence is an important advantage of the proposed method. Moreover, based on the text features, we further develop a novel algorithm to detect moving captions in videos. In this algorithm, motion features extracted by optical flow are combined with text features to detect moving caption patterns. A decision tree is adopted to learn the classification criteria. Experiments conducted on a large volume of real video shots demonstrate the efficiency and robustness of our proposed approaches and the real-world system. Our text and caption detection system was recently highlighted in a worldwide multimedia retrieval competition, Star Challenge, where it achieved superior performance and the top ranking.
Advances in wireless communication and microelectronic device technologies have enabled the development of low-power micro-sensors and the deployment of large-scale sensor networks. With their capability for pervasive surveillance, sensor networks can be very useful in many commercial and military applications for collecting and processing environmental data. One interesting research issue is energy saving in object tracking sensor networks (OTSNs). However, most past studies focused only on movement behavior analysis or location tracking and did not consider temporal characteristics, which are critical in OTSNs. In this paper, we propose a novel data mining method named TMP-Mine, with a special data structure named TMP-Tree, for discovering temporal moving patterns efficiently. To the best of our knowledge, this is the first study that explores the issue of discovering temporal moving patterns that contain both movement and time interval information simultaneously. Through empirical evaluation under various simulation conditions, TMP-Mine is shown to deliver excellent performance in terms of accuracy, execution efficiency, and scalability.
Based on the standardized IEEE 802.11 Distributed Coordination Function (DCF) protocol, this paper proposes a new backoff mechanism, called the Smart Exponential-Threshold-Linear (SETL) Backoff Mechanism, to enhance the system performance of contention-based wireless networks. In the IEEE 802.11 DCF scheme, a smaller contention window (CW) increases the collision probability, whereas a larger CW delays transmission. Hence, in the proposed SETL scheme, a threshold is set to determine the behavior of the CW after each transmission. When the CW is smaller than the threshold, the CW of a competing station is adjusted exponentially to lower the collision probability. Conversely, if the CW is larger than the threshold, the CW size is tuned linearly to prevent a large transmission delay. Extensive simulations show that the proposed SETL scheme provides better system throughput and a lower collision rate under both light and heavy network loads than related backoff schemes, including Binary Exponential Backoff (BEB), Exponential Increase Exponential Decrease (EIED), and Linear Increase Linear Decrease (LILD).
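The threshold rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give parameter values or the exact update rules, so the class name `SetlBackoff`, the defaults (`cw_min`, `cw_max`, `threshold`, `linear_step`), the choice to grow the CW on a collision and shrink it on a success, and the treatment of a CW exactly equal to the threshold as "linear" are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class SetlBackoff:
    """Illustrative threshold-based backoff; all defaults are assumed values."""
    cw_min: int = 16        # assumed minimum contention window
    cw_max: int = 1024      # assumed maximum contention window
    threshold: int = 128    # assumed switch point between the two regimes
    linear_step: int = 32   # assumed step size in the linear regime
    cw: int = 16            # current contention window

    def _exponential_regime(self) -> bool:
        # Below the threshold the CW changes exponentially; at or above it,
        # linearly (the boundary case is an assumption).
        return self.cw < self.threshold

    def on_collision(self) -> int:
        """Grow the CW after a failed transmission (assumed direction)."""
        if self._exponential_regime():
            self.cw *= 2
        else:
            self.cw += self.linear_step
        self.cw = min(self.cw, self.cw_max)
        return self.cw

    def on_success(self) -> int:
        """Shrink the CW after a successful transmission (assumed direction)."""
        if self._exponential_regime():
            self.cw //= 2
        else:
            self.cw -= self.linear_step
        self.cw = max(self.cw, self.cw_min)
        return self.cw

    def draw_backoff_slots(self) -> int:
        """Pick a uniform random backoff counter in [0, cw - 1]."""
        return random.randrange(self.cw)
```

With these assumed defaults, repeated collisions starting from a CW of 16 double the window until it reaches the threshold and then add a fixed step, which is the exponential-then-linear behavior the abstract describes.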
This work details the authors' efforts to push the baseline of expression recognition performance on a realistic database. Both subject-dependent and subject-independent emotion recognition scenarios are addressed, as both occur frequently in real-life settings. The approach involves face detection, followed by key point identification, feature generation, and finally classification. An ensemble of features comprising Hierarchical Gaussianization (HG), Scale Invariant Feature Transform (SIFT), and Optical Flow has been incorporated. In the classification stage, we used SVMs. The classification task has been divided into person-specific and person-independent emotion recognition. Both manual labels and automatic algorithms for person verification have been attempted, and the two give similar performance.