The field of NLP has seen unprecedented achievements in recent years. Most notably, with the advent of large-scale pre-trained Transformer-based language models, such as BERT, there has been a noticeable improvement in text representation. It is, however, unclear whether these improvements translate to noisy user-generated text, such as tweets. In this paper, we present an experimental survey of a wide range of well-known text representation techniques for the task of text clustering on noisy Twitter data. Our results indicate that the more advanced models do not necessarily work best on tweets and that more exploration in this area is needed.
When humans describe an image in a long paragraph, we usually first implicitly compose a mental "script" and then follow it to generate the paragraph. Inspired by this, we endow the modern encoder-decoder based image paragraph captioning model with this ability by proposing the Hierarchical Scene Graph Encoder-Decoder (HSGED) for generating coherent and distinctive paragraphs. In particular, we use the image scene graph as the "script" to incorporate rich semantic knowledge and, more importantly, hierarchical constraints into the model. Specifically, we design a sentence scene graph RNN (SSG-RNN) to generate sub-graph level topics, which constrain the word scene graph RNN (WSG-RNN) to generate the corresponding sentences. We propose irredundant attention in SSG-RNN to increase the likelihood of abstracting topics from rarely described sub-graphs, and inheriting attention in WSG-RNN to generate sentences that are better grounded in the abstracted topics; both give rise to more distinctive paragraphs. We also propose an efficient sentence-level loss that encourages the sequence of generated sentences to be similar to that of the ground-truth paragraphs. We validate HSGED on the Stanford image paragraph dataset and show that it not only achieves a new state-of-the-art CIDEr-D score of 36.02, but also generates more coherent and distinctive paragraphs under various metrics.
CCS CONCEPTS • Computing methodologies → Computer vision tasks.