Lingxiang Wu scite author profile

Artificial creativity has attracted increasing research attention in the fields of multimedia and artificial intelligence. Despite promising work on poetry, painting, and music generation, creating modern Chinese poetry from images, which could significantly enrich the functionality of photo-sharing platforms, has rarely been explored. Moreover, existing generation models cannot tackle three challenges in this task: (1) maintaining semantic consistency between images and poems; (2) preventing topic drift during generation; (3) avoiding the frequent repetition of certain words. These three challenges are also common in other sequence generation tasks. In this article, we propose a Constrained Topic-aware Model (CTAM) that creates modern Chinese poems from images while addressing the challenges above. Without a paired image-poetry dataset, we construct a visual semantic vector that embeds visual content via image captions. For the topic-drift problem, we propose a topic-aware poetry generation model. Additionally, we design an Anti-frequency Decoding (AFD) scheme to constrain high-frequency characters during generation. Experimental results show that our model achieves promising performance and is effective in terms of poem readability and semantic consistency.
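The abstract does not spell out how Anti-frequency Decoding works, so the following is only a minimal sketch of the general idea of constraining high-frequency characters at decoding time. The penalty form (log-probability minus `alpha` times corpus frequency), the function names, and the toy corpus are all assumptions made for illustration, not the authors' AFD scheme.

```python
import math
from collections import Counter


def char_frequencies(corpus):
    """Relative frequency of each character in a list of strings."""
    counts = Counter(ch for line in corpus for ch in line)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def anti_frequency_pick(log_probs, freqs, alpha=2.0):
    """Pick the next character after penalizing frequent ones.

    log_probs: dict mapping candidate character -> model log-probability.
    freqs:     dict mapping character -> relative corpus frequency.
    alpha:     hypothetical penalty strength; 0 disables the penalty.
    """
    def score(ch):
        return log_probs[ch] - alpha * freqs.get(ch, 0.0)

    return max(log_probs, key=score)


# Toy corpus in which one character dominates (assumed data).
corpus = ["的的的的的的的的", "月", "风"]
freqs = char_frequencies(corpus)

# One decoding step with made-up model probabilities.
step = {"的": math.log(0.5), "月": math.log(0.3), "风": math.log(0.2)}

plain_choice = anti_frequency_pick(step, freqs, alpha=0.0)
penalized_choice = anti_frequency_pick(step, freqs, alpha=2.0)
```

With the penalty disabled the dominant character wins on raw probability; with `alpha=2.0` the choice shifts to a rarer character, which is the behavior a frequency constraint aims for.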

Noise Augmented Double-Stream Graph Convolutional Networks for Image Captioning

Sang

et al. 2021

IEEE Trans. Circuits Syst. Video Technol.

Person re-identification via rich color-gradient feature

Wang

Zhu

et al. 2016

Person re-identification refers to matching the same pedestrian across disjoint views in non-overlapping camera networks. Many local and global features have been proposed in the literature to solve the matching problem: color features are robust to viewpoint variance, and gradient features provide a rich representation that is robust to illumination change. However, how to effectively combine color and gradient features remains an open problem. In this paper, to effectively leverage the color-gradient property in multiple color spaces, we propose a novel Second Order Histogram (SOH) feature for person re-identification in large surveillance datasets. First, we use discrete encoding to transform commonly used color spaces into an Encoding Color Space (ECS), and calculate statistical gradient features on each color channel. Then, a second order statistical distribution is calculated on each cell map with a spatial partition. In this way, the proposed SOH feature effectively leverages the statistical properties of gradient and color while reducing redundant information. Finally, a metric learned by KISSME [1] with Mahalanobis distance is used for person matching. Experimental results on three public datasets, VIPeR, CAVIAR and CUHK01, show the promise of the proposed approach.

Index Terms: person re-identification, encoding color space, second order histogram
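The final matching step can be sketched as follows. This is a minimal illustration of KISSME-style metric learning as commonly described (the difference of the inverse covariances of similar-pair and dissimilar-pair feature differences, projected onto the positive semidefinite cone), applied to random synthetic vectors rather than SOH features. The function names and the data are assumptions, and any preprocessing the paper applies (for example, dimensionality reduction) is omitted.

```python
import numpy as np


def kissme_metric(feats_a, feats_b, same):
    """Estimate a Mahalanobis matrix from labeled feature pairs.

    feats_a, feats_b: (n, d) arrays; row i of each forms one pair.
    same:             boolean (n,) array, True if the pair is one person.
    """
    diffs = feats_a - feats_b

    def second_moment(d):
        # Pairwise differences are treated as zero-mean.
        return d.T @ d / len(d)

    cov_sim = second_moment(diffs[same])
    cov_dis = second_moment(diffs[~same])
    M = np.linalg.inv(cov_sim) - np.linalg.inv(cov_dis)

    # Clip negative eigenvalues so that M defines a valid (pseudo-)metric.
    w, V = np.linalg.eigh(M)
    return (V * np.clip(w, 0.0, None)) @ V.T


def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y)."""
    d = x - y
    return float(d @ M @ d)


# Synthetic data (assumed): same-person pairs differ by small noise,
# different-person pairs by large noise.
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(400, 4))
same = np.arange(400) < 200
scale = np.where(same[:, None], 0.1, 1.0)
feats_b = feats_a + scale * rng.normal(size=(400, 4))

M = kissme_metric(feats_a, feats_b, same)
dists = np.array([mahalanobis_sq(a, b, M) for a, b in zip(feats_a, feats_b)])
mean_same, mean_diff = dists[same].mean(), dists[~same].mean()
```

On this toy data the learned metric assigns much smaller distances to same-person pairs than to different-person pairs, so ranking gallery candidates by `mahalanobis_sq` would place true matches first.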

Appearance features in Encoding Color Space for visual surveillance

Zhu

et al. 2018

Neurocomputing

Person re-identification and visual tracking are two important tasks in video surveillance. Much work has been done on appearance modeling for these two tasks. However, existing feature descriptors are mainly constructed on three-channel color spaces, such as RGB, HSV and XYZ. These color spaces enable a meaningful representation of color, yet may lack distinctiveness for real-world tasks. In this paper, we propose a multi-channel Encoding Color Space (ECS) and take color distinctiveness into account in the design of the image feature descriptor. To overcome illumination variation and shape deformation, we design features on the basis of the Encoding Color Space and the Histogram of Oriented Gradients (HOG), which enables rich color-gradient characteristics. Additionally, we extract a Second Order Histogram (SOH) from the constructed descriptor to capture abstract information with layout constraints. Extensive experiments are performed on the VIPeR, CAVIAR and CUHK01 datasets and the Visual Tracking Benchmark. Experimental results on these datasets show that our feature descriptors achieve promising performance.

Lingxiang Wu

Multi-camera multi-player tracking with deep player identification in sports video

Recall What You See Continually Using GridLSTM in Image Captioning

Image to Modern Chinese Poetry Creation via a Constrained Topic-aware Model

Noise Augmented Double-Stream Graph Convolutional Networks for Image Captioning

Person re-identification via rich color-gradient feature

Appearance features in Encoding Color Space for visual surveillance
